Technology

This is a most excellent place for technology news and articles.

OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling's Harry Potter series (www.businessinsider.com)

submitted 2 years ago by L4s@lemmy.world to c/technology@lemmy.world

297 comments

A new research paper laid out ways in which AI developers should try to avoid showing that LLMs have been trained on copyrighted material.

[–] Jat620DH27@lemmy.world 10 points 2 years ago (1 children)

I thought everyone knew that OpenAI has the same access to books and knowledge that human beings have.

[–] Redditiscancer789@lemmy.world 13 points 2 years ago (1 children)

Yes, but what it's doing with them is the murky grey area. Anyone can read a book, but you can't use those books for your own commercial stuff. Rowling and other writers are making the case that their works are being used commercially in an inappropriate way. Whether they have a case, I don't know (IANAL), but I can at least see the argument.

[–] Touching_Grass@lemmy.world 3 points 2 years ago (1 children)

Harry Potter uses so many tropes and so much inspiration from works that came before. How is that different? Wizards of the Coast should sue her into the ground.

[–] Redditiscancer789@lemmy.world 3 points 2 years ago* (last edited 2 years ago) (1 children)

Because it's not literally using the same stuff. You can be inspired by something, à la StarCraft from Warhammer 40k, but you can't use literally the same things. Also, as far as I understand it, you can't copyright broad subject matter. So no one can just copyright "wizard", but they can copyright "Harry Potter the Wizard". You can also tell that OpenAI knows it may be doing something wrong, because their latest leak includes passages on how to hide the fact that the LLMs were trained on copyrighted materials.

[–] Touching_Grass@lemmy.world 0 points 2 years ago (1 children)

I would hide stuff too. Copyright laws are out of control. That doesn't mean they did something wrong. It's CYA.

Copyright covers reproducing and selling others' work, not ingesting it. If they found it online, it should be legal to ingest it. If they bought the works, they should also be legally able to train on them.

[–] Redditiscancer789@lemmy.world 2 points 2 years ago (1 children)

No, it does matter where they got the materials. If they illegally downloaded a copy off a website "just 'cause it's on the internet", it's still against the law.

[–] Touching_Grass@lemmy.world 0 points 2 years ago (1 children)

It shouldn't be illegal. Give them a letter saying how angry they are and call it a day.

[–] Redditiscancer789@lemmy.world 2 points 2 years ago (1 children)

Yeah...about that...

https://www.google.com/search?q=woman+sued+for+13+songs+on+napster&sca_esv=559711199&source=hp&ei=hFXnZPutG-Hg0PEPndKmwAI&oq=woman+sued+for+13+songs+on+napster&gs_lp=EhFtb2JpbGUtZ3dzLXdpei1ocCIid29tYW4gc3VlZCBmb3IgMTMgc29uZ3Mgb24gbmFwc3RlcjIFECEYoAEyBRAhGKABMgUQIRigATIFECEYoAFIpUxQnghY3EtwA3gAkAEAmAG0AaABhBqqAQUyNC4xMbgBA8gBAPgBAagCD8ICEBAAGAMYjwEY6gIYjAMY5QLCAgsQABiABBixAxiDAcICCxAuGIAEGLEDGIMBwgIREC4YgAQYsQMYgwEYxwEY0QPCAggQABiABBixA8ICCxAuGIAEGMcBGK8BwgILEAAYigUYsQMYgwHCAgUQABiABMICBRAuGIAEwgIIEC4YgAQYsQPCAggQLhixAxiABMICBBAAGAPCAgcQABiABBgKwgIIEAAYgAQYyQPCAgYQABgWGB7CAgUQIRirAsICCBAhGBYYHhgd&sclient=mobile-gws-wiz-hp

There's clearly legal precedent.

[–] Touching_Grass@lemmy.world 1 points 2 years ago (1 children)

A couple of things. That was wrong then, just as it is wrong today. Training data isn't file sharing. Too many of you are ushering in a new era of spying and erosion of the internet on behalf of corporations under the guise of "protecting artists", like they did in the Napster days.

[–] Redditiscancer789@lemmy.world 1 points 2 years ago* (last edited 2 years ago) (1 children)

Not at all. As I said, I simply recognize that the argument may have merit. I never said which side of the aisle I personally fall on. Also, they are a company, so in theory the scrutiny of the methods they use to acquire data is deserved. Data has a price, whether you think it should or shouldn't.

[–] Touching_Grass@lemmy.world 1 points 2 years ago (1 children)

And my opinion is that if it has a price, don't give it away free online where anyone or anything can ingest it. Should web crawlers be paying websites for indexing them?

I also believe in private property. If I buy a book I can do what I want with it. Like use it to train AI. It is my property.

[–] Redditiscancer789@lemmy.world 1 points 2 years ago

Those are two contradictory philosophies. How can one supposedly not care if someone's private property is stolen and yet believe in private property rights? The question would indeed be whether they stole the book off the internet or bought a copy themselves.