Thousands of authors demand payment from AI companies for use of copyrighted works: Thousands of published authors are requesting payment from tech companies for the use of their copyrighted works in training artificial intelligence tools, marking the latest intellectual property critique to target AI development.
This is demonstrably wrong. You cannot buy a book, and then go use it to print your own copies for sale. You cannot use it as a script for a commercial movie. You cannot go publish a sequel to it.
Now please try to tell me that AI training is specifically covered by fair use and satire case law. Spoiler: you can't.
This is a novel (pun intended) problem space and deserves to be discussed and decided, like everything else. So yeah, your cavalier dismissal is cavalierly dismissed.
No, you misunderstand. Yes, they can control how the content in the book is used - that’s what copyright is. But they can’t control what I do with the book - I can read it, I can burn it, I can memorize it, I can throw it up on my roof.
My argument is that there is nothing wrong with training an AI with a book - that's input for the AI, and that is indistinguishable from a human reading it.
Now what the AI does with the content - if it plagiarizes or violates fair use - that's a problem, but those problems are already covered by copyright laws. They have no more business saying what can or cannot be input into an AI than they can restrict what I can read (and learn from). They can absolutely enforce their copyright on the output of the AI just like they can if I print copies of their book.
My objection is strictly on the input side, and the output is already restricted.
Makes sense. I would love to hear how anyone can disagree with this. Just because an AI learned or trained from a book doesn’t automatically mean it violated any copyrights.
The base assumption of those with that argument is that an AI is incapable of being original, so it is “stealing” anything it is trained on. The problem with that logic is that’s exactly how humans work - everything they say or do is derivative from their experiences. We combine pieces of information from different sources, and connect them in a way that is original - at least from our perspective. And not surprisingly, that’s what we’ve programmed AI to do.
Yes, AI can produce copyright violations. They should be programmed not to. They should cite their sources when appropriate. AI needs to “learn” the same lessons we learned about not copy-pasting Wikipedia into a term paper.
It’s specifically distribution of the work or derivatives that copyright prevents.
So you could make an argument that an LLM that’s memorized the book and can reproduce (parts of) it upon request is infringing. But one that’s merely trained on the book, but hasn’t memorized it, should be fine.
But by their very nature the LLMs simply redistribute the material they've been trained on. They may disguise it assiduously, but there is no person at the center of the thing adding creative strokes. It's copyrighted material in, copyrighted material out, so the plaintiffs allege.
They don't redistribute. They learn information about the material they've been trained on - not the material itself - and can use it to generate material they've never seen.