Authors Are Furious After Finding Their Works on List of Books Used To Train AI

stopthatgirl7@kbin.social to Technology@lemmy.world – 430 points
themarysue.com

Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.



AI isn't either. It's selling statistical data about the books.

It literally shares passages verbatim.

So does any site that quotes the book. Just being trained on a work doesn't give the model the ability to cite it word for word. For most of the books in this set you wouldn't even be able to get a single accurate quote out of most models. The models gain the ability to cite passages from training on other sources citing these same passages.

That's maybe an issue. I mirror speech a lot, though. How large are the passages?

That claim is disingenuous at best, and misinformed otherwise.
