Japan Decides That Copyright Doesn't Apply to AI Training

Michael@lemmy.perthchat.org to Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ@lemmy.dbzer0.com – 271 points –
Japan Goes All In: Copyright Doesn't Apply To AI Training
technomancers.ai

To me it’s essentially the same as someone reading a book or watching a movie when the AI learns from those examples.

The problem is that the AI can print the book word for word if you ask the right questions, and at that point it's breaking copyright again. But that's not a problem with the learning part; it's a problem with how AI has no concept of understanding context at all.

You can easily get Photoshop to reproduce one of Mondrian's paintings. That's on the user, not the tool. I fail to see why the same doesn't apply to the tool of generative AI.

You can't easily tell it to replicate any painting for you - with current AI you can do that with almost any book it was trained on.

I was skeptical of this, but it checks out: I easily got ChatGPT to print out the full text of The Tell-Tale Heart, without any errors at all in the various spots I accuracy-checked.

Granted I chose it because it's a very short public domain work - I was more skeptical of its technical ability to recall the exact text without errors than of the ability to trick it into violating copyright law.

I still suspect it's much easier to (accidentally) trick it into writing a fanfiction of a copyrighted work that it claims is the original than it is to get it to produce the true original, though.

Your argument that it is useful as a copyright-infringing machine is that it can reproduce a public domain work? That’s… not the argument you think it is.

My message was pretty clear about which part of their claim I was skeptical about and what I was testing for. It's not what you described here.

Just like a person with a really good memory can. So what? Nobody is actually printing 300-page books that way when we can use libgen or any other source instead.

AI has no personal agency, lived experiences, or independent creative input.
Humans don't have the ability to synthesize thousands of pages of text in a matter of minutes.

Any analogy toward human learning or behavior is shallow and flawed.

This is why humans are involved in the process. Your counterargument is shallow and flawed

Yes, exactly. When someone is creating art using Stable Diffusion, it is clearly a manifestation of that artist's intent. That is what copyright is designed to protect and should protect.

Wrong. Copyright protects works, not ideas.

The part that you AI bots always forget is that the machine doesn't do shit without a dataset. No data input, no output. And if you don't own the inputs, what the hell makes you think you can claim ownership over the outputs?

If you ask an AI art program to paint you a "pretty kitty cat", it can only do so because it has been fed enough pictures and paintings (plus metadata) to synthesize an acceptable output. Your human intent is an insignificant filter over their data, and if they haven't trained on any pictures of cats, you will never achieve anything even close to your intent. Your prompt has the value of a Google search.

Finally, there is a key thing called the "artistic process", in which a human artist's imagined vision of their finished work takes shape as they work. This is nothing like what happens in a neural network, and it is why you are never going to be an artist simply by filling in a web form. You have no vision, and even if you did, the AI will never achieve it on your behalf.

Sorry, but if AI art sounds too good to be true, it's because it is simply exploiting and distorting other people's copyrighted artwork. It gives you the illusion of having created something, like the kid mashing buttons at the arcade machine without putting any money in. But the good news is that it's not too late to learn how to draw.

You're fundamentally wrong and presenting a bad-faith argument in an insulting manner. Please shut up

I was wrong to use the dismissive term "AI bots". I'm genuinely sorry about that and I let my feelings as an artist get the best of me, but other than that my point still stands. To be fair, "you're wrong" and "shut up" aren't exactly the strongest counter arguments either. No hard feelings.

The objective truth is that "AI" neural networks synthesize an output based on an input dataset. There is no creativity, personality, artistry, or other x-factor there, and until there is real "general artificial intelligence" there never will be. Human beings feed inputs into the machine, and it generates an output based on some subset of those inputs. If those inputs are "fair use" or otherwise licensed, then that's perfectly fine. But if those inputs are unlicensed copyrighted works, then you would be insane to believe that you own the output that the algorithm produces--that's like thinking you own the music that comes out of your speakers because you hit the play button. Just because you're in control of the playback does not mean that you created the music, and nobody would seriously think that.

I've worked as an artist and a programmer, and a simple analogy is the concept of a software license. Just because you can see or download some source code on GitLab does not mean that you own it or can use it freely for any purpose; most code repositories are open sourced under some kind of license, which legitimate users of that code must comply with. We've already seen Microsoft make this mistake and then instantly backtrack with GitHub Copilot, because they understand that they simply do not have the IP rights to use GPL code (for one example) to train their AI. Similarly, if a musician samples a portion of a song to use in their own song, depending on various factors they may have to share credit with the original creator, and sometimes that makes sense, in my opinion.

No matter how you or I feel about it, copyright law has always been there with the basic intent to protect people who create unique works. There are some circumstances which are currently considered "fair use" of unlicensed copyrighted works (for example, for educational purposes), and I think that's great. But I think there is zero argument that unlimited automated content generation via AI ought to be considered genuine fair use. No matter how much AI fans want to try to personify the technology, it is not engaging in a creative or artistic process; it is merely synthesizing an output based on mixed inputs, just like how an AI chat bot is not truly thinking but merely stringing words together.

You do realize individuals can train neural networks on their own hardware, right?

Good luck training something that rivals big tech, especially now that they're all putting "moats" around their data...

We, the little people, don't have the data, the storage, the processing power, the RAM, and last but not least, the cash, to compete with them.

At any rate, if you train your NN using appropriately licensed or public domain data, more power to you. But if you feed a machine a bunch of other people's writing, artwork, music, etc., please understand that you will never truly own the output.

You seem to be imagining a future in which AI is the great equalizer that ushers us into some kind of utopia, but right now I'm only seeing even more money, power and control being clawed away from the people in favor of the biggest, richest tech conglomerates. It's fucking dystopian, and I hope people like you will recognize that before it's really too late.

Direct your ire where it belongs - at capitalism, not technology.

I am.

It is only the profit-maximizing hyper-capitalists who intend to use AI to exploit workers and rip off artists. I have no problem with the technology behind AI; I just don't think people should be using it as a tool for continual, industrialized mass exploitation of the little people (like you and me) who actually own the data that they put online.

Can it tell you what it learned, or does it copy billions of conversations online of what other people learned?

If it can't interpret, it's not learning.

All you get is the most basic form of data retention, if it retained millions of examples.

@brimnac it's not a 'someone' though. The AI isn't an actual consciousness. It's a software company illegally using other artists' work to develop their own commercial product. BIG DIFFERENCE.

You learned from those same things and make a profit.