164

Suing Writers Seethe at OpenAI's Excuses in Court (futurism.com)

submitted 1 year ago by floofloof@lemmy.ca to c/technology@lemmy.ml


[–] mindbleach@sh.itjust.works 23 points 1 year ago (2 children)

I don't care what works a neural network gets trained on. How else are we supposed to make one?

Should I care more about modern eternal copyright bullshit? I'd feel more nuance if everything a few decades old was public-domain, like it's fucking supposed to be. Then there'd be plenty of slightly-outdated content to shovel into these statistical analysis engines. But there's not. So fuck it: show the model absolutely everything, and the impact of each work becomes vanishingly small.

Models don't get bigger as you add more stuff. Training only twiddles the numbers in each layer. There are two-gigabyte networks that have been trained on hundreds of millions of images. If you tried to store those images verbatim, they would each weigh barely a dozen bytes. And the network gets better as that number goes down.
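The per-image figure is just a division, sketched below. The image counts are assumptions picked to span "hundreds of millions" (the comment names no exact dataset size), and "two-gigabyte" is read as 2 × 10^9 bytes; `bytes_per_image` is a hypothetical helper, not anything from a real library.

```python
def bytes_per_image(model_bytes: int, image_count: int) -> float:
    """Average share of the model's storage per training image."""
    if image_count <= 0:
        raise ValueError("image_count must be positive")
    return model_bytes / image_count


# "Two-gigabyte network", assuming decimal gigabytes.
MODEL_BYTES = 2 * 10**9

# Assumed training-set sizes; the comment only says "hundreds of millions".
for image_count in (100_000_000, 200_000_000, 400_000_000):
    share = bytes_per_image(MODEL_BYTES, image_count)
    print(f"{image_count:>11,} images -> {share:.1f} bytes per image")
```

Under these assumptions the share runs from 20 bytes down to 5 bytes per image, which brackets the "barely a dozen bytes" figure and shrinks as the training set grows while the model size stays fixed.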

The entire point is to force the distillation of high-level concepts from raw data. We've tried doing it the smart way and we suck at it. "AI winter" and "good old-fashioned AI" were half a century of fumbling toward the acceptance that we don't understand how intelligence works. This brute-force approach isn't chosen for cost or ease or simplicity. This is the only approach that works.

[–] anachronist@midwest.social 3 points 1 year ago (1 children)

Models don’t get bigger as you add more stuff.

They will get less coherent and/or "forget" the earlier data if you don't increase the parameters with the training set.

There are two-gigabyte networks that have been trained on hundreds of millions of images

You can take a huge TIFF of an image, put it through JPEG with the quality cranked all the way down, and get a tiny file out the other side that is still a recognizable derivative of the original. LLMs are an extremely lossy compression of their training set.

[–] mindbleach@sh.itjust.works 4 points 1 year ago

which is still a recognizable derivative of the original

Not in twelve bytes.

Deep models are a statistical distillation of a metric shitload of data. Smaller models with more training on more data don't get worse, they get more abstract - and in adversarial uses they often kick big networks' asses.

[–] DeathsEmbrace@lemmy.ml -5 points 1 year ago (1 children)

Which is why we shouldn't be using something we don't and can't use properly.

[–] mindbleach@sh.itjust.works 13 points 1 year ago (1 children)

Right, copyright law.

[–] DeathsEmbrace@lemmy.ml 1 points 1 year ago (1 children)

No, this will benefit capitalism and the wealthiest people the most. The rest of us will suffer because of this. People only think of the positives of AI and never the negatives. This is weed all over again.

[–] mindbleach@sh.itjust.works 6 points 1 year ago (1 children)

Motivation to discuss anything with you goes flying out the window, if you think ending marijuana prohibition is anything but positive for the common people. And you're going to drop that turd in a completely unrelated punchbowl.

[–] DeathsEmbrace@lemmy.ml -3 points 1 year ago* (last edited 1 year ago) (1 children)

Marijuana is always characterized by its positives, and people always forget the negatives in every conversation. This is the exact same shit. Weed shouldn't even be illegal, but those dumb racist white men in the 60s-80s with their paranoia decided to outlaw it. Fuck, the very doctors and psychologists who "analyzed" it said everything was bullshit, so they had professionals too, you dumbass. I'm not getting into racist history with you, but take my first sentence as the argument.

[–] mindbleach@sh.itjust.works 2 points 1 year ago (1 children)

Talk less.

[–] DeathsEmbrace@lemmy.ml 0 points 1 year ago

I should; the average human is stupid.