Asklemmy

Is there a collection of all human knowledge ever created? (lemm.ee)

submitted 1 year ago by saltynuts420@lemm.ee to c/asklemmy@lemmy.ml

71 comments

Recently I was wondering whether some person or group is preserving, collecting, organizing, and publishing all the knowledge mankind has created throughout its existence, so that if mankind ever faces a sixth mass extinction, we don't have to reinvent the wheel and can kick-start our new post-apocalyptic civilization.

JohnDClay@sh.itjust.works 2 points 1 year ago (last edited 1 year ago)

It's never going to be all knowledge, since a lot of stuff is just lost or was never recorded. A ton of stuff (like this thread) is probably low on the priority list for recording as well. But the closest you'd probably get to a full catalog of human knowledge (at least text-based) is the huge data sets of nearly all text on the internet that are used for training LLMs. I wouldn't be surprised if there are soon ones that include video and pictures as well, since newer AI models are starting to be able to interpret those too.

I believe this is one of those data sets: https://github.com/yaodongC/awesome-instruction-dataset

Edit: here's a big data set that was used for a lot of GPT-3's training: https://commoncrawl.org/