this post was submitted on 07 Sep 2023

Asklemmy

Recently I was wondering whether some person or group is preserving, collecting, organizing, and publishing all the knowledge mankind has created throughout its existence, so that if mankind ever faces a sixth mass extinction we don't have to reinvent the wheel and can kick-start our new post-apocalyptic civilization.

JohnDClay@sh.itjust.works 2 points 1 year ago* (last edited 1 year ago)

It's never going to be all knowledge, since a lot of stuff is just lost or was never recorded. A ton of stuff (like this thread) is probably low on the priority list for recording as well. But the closest you'd probably get to a full catalog of human knowledge (at least text-based) is the huge data sets of nearly all text on the internet that are used for training LLMs. I wouldn't be surprised if there are soon ones that include video and pictures as well, since newer AI models are starting to be able to interpret those too.

I believe this is one of those data sets: https://github.com/yaodongC/awesome-instruction-dataset

Edit: here's a big data set that made up a large share of GPT-3's training data: https://commoncrawl.org/