Reddit's licensing deal means Google's AI can soon be trained on the best humanity has to offer — completely unhinged posts

Lee Duna@lemmy.nz to Technology@lemmy.world – 1003 points –
businessinsider.com


Glad I deleted everything on there. fucking hell.

This keeps coming up and I keep replying, not to break anyone down but to point out the reality of the situation that a lot of people don't seem to get.

Reddit administrators, developers, and even the leadership have gone on the record saying that they retain all copies of comments and that comments cannot be deleted (the delete action only marks a comment as "deleted"). Furthermore, they have said they will undelete or unedit any comment or account at their own discretion.

Have you ever searched for something, landed on a Reddit post, and noticed that the OP is [deleted]? That is what I described above playing out in front of you.

You cannot retract your past participation in Reddit; what is done is done. The only meaningful action you can take is to not participate there.

As I mentioned before, I use scripts to replace my comments with random excerpts from text in the public domain. I do this multiple times before finally deleting them. The result is that it becomes very difficult for the AI or anyone else to figure out what is a legitimate comment and what is a line from Lady Chatterley's Lover or a scientific paper on the ecological impact of the Japanese whaling industry. It's easier to just filter out my username from their data sets.

They have almost definitely archived the data and, around the time of the API bullshit, made sure they didn't delete those archives. They have that content if they want to use it.

I've done the "switch, switch, switch, delete" at least twice a year for most of the twelve years I was there. The idea was to pollute the data, not delete it. Even if you started during the API bullshit, you still would have had plenty of time to corrupt your data enough. Remember, the idea is to make it so that it is difficult to tell what is a legitimate comment and what are excerpts from random text.
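For anyone curious what "switch, switch, switch, delete" looks like in practice, here is a rough sketch, not the actual script. Everything in it is made up for illustration: `scramble_then_delete`, the comment dictionaries, and the small excerpt list are hypothetical, and it works on an in-memory structure instead of calling any real Reddit API. The `history` list models the assumption raised in this thread that the server keeps every version it has seen.

```python
import random

# Hypothetical stand-in for a larger pool of public-domain text.
PUBLIC_DOMAIN_EXCERPTS = [
    "It was the best of times, it was the worst of times.",
    "Call me Ishmael.",
    "It is a truth universally acknowledged, that a single man in "
    "possession of a good fortune, must be in want of a wife.",
]


def scramble_then_delete(comments, passes=3, rng=None):
    """Overwrite each comment body `passes` times, then mark it deleted.

    `comments` maps a comment id to a dict with "body", "history" and
    "deleted" keys. "history" starts with the original text and gains one
    entry per overwrite, modelling a server that retains every version.
    """
    rng = rng or random.Random()
    for comment in comments.values():
        for _ in range(passes):
            # "switch": replace the visible text with a random excerpt
            comment["body"] = rng.choice(PUBLIC_DOMAIN_EXCERPTS)
            comment["history"].append(comment["body"])
        # "delete": only a flag is set; the stored history is untouched
        comment["deleted"] = True
    return comments


comments = {
    "c1": {"body": "my real opinion", "history": ["my real opinion"], "deleted": False},
}
scramble_then_delete(comments, passes=3, rng=random.Random(0))
```

After the run, `comments["c1"]["history"]` holds the original plus three excerpts, which is the polluted record the approach is aiming for.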

Most people don't reread their past comments and edit them. Reddit could simply ignore any edits made after the typical window in which a person would notice a typo or something needing clarification, say anywhere between 5 minutes and 24 hours, or just ignore all edits. So your effort is wasted and you're still training the AI.
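To make that concrete, here is a minimal sketch of such a filter, assuming the site stores every version of a comment with a timestamp. The function name `version_to_keep`, the `(timestamp, text)` layout, and the 24-hour default are all hypothetical; nothing here reflects how Reddit or Google actually process the data.

```python
from datetime import datetime, timedelta


def version_to_keep(versions, grace=timedelta(hours=24)):
    """Return the text of the latest version saved within `grace` of posting.

    `versions` is a list of (timestamp, text) tuples, oldest first, with the
    original post as the first entry. Later edits are treated as noise.
    """
    posted_at = versions[0][0]
    kept_text = versions[0][1]
    for saved_at, text in versions[1:]:
        if saved_at - posted_at <= grace:
            kept_text = text  # plausible typo fix or clarification
    return kept_text


posted = datetime(2020, 1, 1, 12, 0)
versions = [
    (posted, "teh original comment"),
    (posted + timedelta(minutes=10), "the original comment"),
    (posted + timedelta(days=200), "Call me Ishmael."),
]
print(version_to_keep(versions))
```

Passing `grace=timedelta(0)` corresponds to the "just ignore all edits" option and returns the text as first posted.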

Yeah, I assume in general that nothing on the internet ever goes away. At least it makes it somewhat more annoying, though.

It's archived forever. Sorry.

I did the thing that means it's probably less archived (by editing all the replies before deleting), but I assume some of it probably remains out there. Nothing I can do about that.
