Reddit has reportedly signed over its content to train AI models

return2ozma@lemmy.world to Technology@lemmy.world – 1044 points –
mashable.com

This is why I don't blame anyone for editing/deleting their post history on reddit.

I do. It's frankly selfish. Having an AI get training on my old comments costs me nothing and it results in the development of useful AI tools. Trying to sabotage that is petty and pointless. It's not like you could somehow collect the fraction of a pittance that you think you're owed retroactively. I never commented on Reddit thinking "awesome, I'm going to make bank on the content I'm generating here."

People complain about the capitalist mindset of the world and then they do this. Sigh.

Defending giant corporations profiting off of uncompensated individuals, while criticizing anyone who doesn't want to provide free labor to said corporations, is a disgusting take. Are you a CEO?

The more accessible training data there is, the easier it is for new AI projects to enter the field and the less dominant those "giant corporations" become.

The free labour was already freely given. If someone doesn't want to have shitposted on Reddit for free then maybe they shouldn't have shitposted on Reddit for free.

"if you didn't want me to steal your intellectual property, you shouldn't have thought of it in the first place"

No, you shouldn’t have posted it to Reddit, where you were required to give them a perpetual license to use your IP in any way they see fit.

For the record, I’m here because Reddit pissed me off when they axed the free API, and I’m pissed at myself for not expecting it. That’s what I get for accepting their terms and conditions, I guess.

Edit: I also don’t accept the idea that using my content for training data is “fair use” when it is used to train proprietary models, especially ones in which the end user is allowed to prompt it to plagiarize or otherwise imitate my content.

So, for an example of what the other user was talking about: I'm just some guy, and for my first foray into programming / machine learning (I kind of just threw myself into the deep end) I modified StyleGAN3 and trained it on about 500 GB of porn that I scraped off Reddit.

Now, I stopped the training after about a week (it was going to take about a solid month on my RTX 2080 Ti) when I found out Stable Diffusion existed, but I learned a LOT from that experience.

I couldn't do that now. Arguably none of that was how any of that should be done but whatever.

I'm not sure what you mean here. Nothing's being stolen. Even if you think there needs to be permission for training an AI off of data, Reddit has that permission.

I assume you're more of a moron than a troll, which is disappointing. Regardless, you're not worth my time, as I don't think any argument could convince you to have an open mind and be willing to change. Good luck out there!

I had an 11-year-old account, and I deleted all my old comments and posts from it because of the API debacle. Does it make me selfish that I felt like Reddit wasn’t holding up its end of the unwritten agreement?

Reddit doesn’t deserve my content any more than I deserve access through the third-party API.

If you did it over the API debacle then you're not one of the people I'm talking about here. This is about people deleting their content to prevent it from being used to train AIs.

Do you not remember that the real reason the API debacle happened in the first place was to prepare for this moment? It was always about easy access to training data; third-party apps just got caught in the crossfire.

That's ignoring an awful lot of other considerations. Obviously Reddit hasn't explained itself in a trustworthy way, but a common belief at the time was that it was to force people to use the official Reddit mobile app so they could be subject to advertising.

Selfish? Perhaps you forget why people deleted their content in the first place.

It's their comment to do with as they see fit. I can't get mad at them for wanting to erase their presence on a site they don't use anymore.

And I'm free to judge them however I wish for their actions and intent.

How is not wanting capitalist companies to profit off of your content not aligned with complaining about the capitalist mindset of the world? Wtf lol.

It's the insistence that everything that people do must be compensated with money. People have spent years posting on Reddit for fun, without any thought to being paid for it, and now all of a sudden someone else is making some money so they're demanding that they should get their slice. And doing what they can to wreck their earlier efforts when they don't.

How does Reddit making some money licensing this stuff harm those of us who contributed to it? Is there any problem aside from "I wanna get paid!"?

Why do you think it's about wanting a slice? They posted on Reddit with no expectation of profit. But they don't want others to profit off it either. It's not that complicated.

But they don’t want others to profit off it either.

And that's why I call them selfish. It doesn't harm them in the slightest if someone else profits off of it.

They wouldn't have posted if they knew this was going to happen. They posted because it was fun, not for this.

They may be morally opposed to AI (as there are many valid reasons to be opposed to it), or they may just have wanted to have been able to make an informed decision before posting, but by retroactively training the AI on their posts they've robbed them of the agency to make that decision.

That's why they're upset.

They posted content on a website whose user agreement says "we can do whatever we like with the content you post here" and then go surprised-pikachu when the website goes ahead and does whatever they like with the content they posted. Frankly, I'm not tremendously sympathetic. This should have been easy to predict.

Oh yeah I'm sure you predicted LLMs, and that they would need ridiculous amounts of training data wayyyy back in 2005 when Reddit started lol. Super easy to predict. Good job bud.

And that's why I'm calling you either a moron or a tool. Probably both.

For me it's a privacy matter. Going through old posts (whether by a human or by machine learning) cannot be used for anything good.

What about people who just think “A.I.” is dog shit and chat bots are a dumb obsession steering the industry in the wrong direction due to hype and money?

What about them? I don't see why they'd care what AI companies are doing in that case. They'd assume they were just wasting money on this stuff.
