Now that ChatGPT is being trained using Reddit posts

public_image_ltd@lemmy.world to Showerthoughts@lemmy.world – 347 points – 8 months ago

it will loose its ability to differentiate between there and their and its and it’s.

loose

Irony?

must of made a mistake their

your so dumb lmao

thank you kind stranger

Should of proof red it

And my axe!

This guy fucks

I also choose this guy's dead wife.

this fly fucks

I need to of a word with you

Knead*

This one must be the worst. "Could care less" being a close second

I mean for all intensive porpoises I could care less

Clothes second*

Duh

Nah, there is some scents there. As in, I barely care, but I could care less.

OP hasn't payed enough attention in English class.

Me fail English? That's unpossible!

"must have" not "must of"

That bot gets downvoted so much by idiots. It's not pedantry but fucking 1st grade English. Amazing how many people take offense instead of fixing it and moving on.

I prefer the must of white grapes for wine making than the must of red grapes.

Muphry's Law at work

Woosh

Now when you submit text to ChatGPT, it responds with “this.”

Unironically this

Criminaly underated post

As a language model, I laughed at this way harder than I should have

NTA, that was funny.

And it will get LOSE and LOOSE mixed up like you did

it will UNLEASH its ability to differentiate between there and their and its and it’s.

I'm waiting for it to start using units of banana for all quantities of things

ChatGPT trained using Reddit posts -> ChatGPT goes temporarily “insane”

Coincidence? I don't think so.

This is exactly what I was thinking.

And maybe some more people did what I did: not deleting my accounts, but replacing all my posts with content created by a bullshit generator. It made the texts look normal, but everything was completely senseless.

Back in June-July, I used a screen-tapping tool + Boost to go through and change every comment I could edit with generic type fill, then waited something like 2 weeks in hopes that all of their servers would update to the new text, and then used the same app to delete each comment and post, and then the account itself. It's about all I could think to do.

I hope so. I generally liked that idea back then, but couldn’t do that to my historical collection of words. My words remain in the cloud, as they always will

They have always trained on Reddit data. GPT-2 was, for example; I'm unsure about GPT-1.

I downloaded my content before changing the posts to nonsense.

ChatGPT 9 to be trained on R9K posts. Won't be able to distinguish fake free text from real.

It also won't be able to differentiate between a jackdaw and a crow.

Wild to think that was 7 years ago.

They both look like ravens

ChatGPT also chooses that guy's dead wife

The Narwhal Bacons at Midnight.

On the contrary, it'll become excessively perfectionist about it. Can't even say "could have" without someone coming in and saying "THANK YOU FOR NOT SAYING OF"

It already was, the only difference is that now reddit is getting paid for it.

Its going to be a poop knife wielding guy with 2 broken arms out to get those jackdaws.

Would have and would of

Would've

Wood're

This one is the worst for me

From now on, when you say something like "I think I can give my hoodie to my girlfriend", it will answer "and my axe"

GROND

"I also choose this guy's dead wife"

Not always.

Sometimes it will say "and my bow". :-P

ChatGPT was already trained on Reddit data. Check this video to see how one Reddit username caused bugs in it: https://youtu.be/WO2X3oZEJOA?si=maWhUpJRf0ZSF_1T

I'm not gonna watch, but I assume little Bobby Tables strikes again.

It's about the counting subreddit. It was used for the token generation database, but then removed from the training data. This user posted so much on that subreddit that a token for their username was created, but then nothing was associated with it in training, and the model doesn't know how to act when the token is present.
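
To make that concrete, here's a toy sketch of the mismatch. Everything in it is made up for illustration: the username `busy_counter`, the two tiny corpora, and the whitespace "tokenizer" (real models use subword tokenizers, not whole words). It only shows how a token can land in the vocabulary and then never be seen during training.

```python
from collections import Counter

# Hypothetical toy data: the tokenizer corpus still contains the counting
# posts, while the training corpus has them filtered out.
tokenizer_corpus = [
    "busy_counter 1001", "busy_counter 1002", "busy_counter 1003",
    "the cat sat", "the dog sat",
]
training_corpus = ["the cat sat", "the dog sat"]


def build_vocab(corpus, min_count=2):
    """Give a token to every whitespace-separated word seen at least min_count times."""
    counts = Counter(word for line in corpus for word in line.split())
    return {word for word, n in counts.items() if n >= min_count}


def count_training_updates(vocab, corpus):
    """Count how often each vocabulary token shows up in training.

    This is a stand-in for "how many times its embedding got updated".
    """
    updates = dict.fromkeys(vocab, 0)
    for line in corpus:
        for word in line.split():
            if word in updates:
                updates[word] += 1
    return updates


vocab = build_vocab(tokenizer_corpus)
updates = count_training_updates(vocab, training_corpus)

# Tokens that exist in the vocabulary but were never seen in training.
untrained = sorted(tok for tok, n in updates.items() if n == 0)
print(untrained)  # ['busy_counter']
```

The username gets its own token because it is frequent in the tokenizer data, but since it never appears in the training data, whatever the model has stored for it is effectively untouched.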

Here is an alternative Piped link(s):

https://piped.video/WO2X3oZEJOA?si=maWhUpJRf0ZSF_1T

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I'm open-source; check me out at GitHub.

And between were, we’re and where.

Insure and ensure.

It will also reply "Yes." to questions like "Is it A or B?"

Perfectly acceptable answer

Not if it's neither A nor B ;)
Would you trust ChatGPT to know?

Don't forget the bullshit that is "would of"

Who could of

Your right.

"What is a giraffe?"

ChatGPT: "geraffes are so dumb."

“I have not been trained to answer questions about stupid long horses.”

Would you rather fight a long horse or a short giraffe?

"Can't even breath"

And then and than.

Is it a showerthought if it's actually just incorrect?

Sure, it might have some effect, but a big part of ChatGPT besides "raw" training data is RLHF (reinforcement learning from human feedback). Realistically, the bigger problem is training on AI-generated content that might have correct spelling but hardly makes sense.

Then I did the right thing by replacing my texts with correctly spelled nonsense.

And when it learns something new, the response will be "Holy Hell".

TIL

The same goes for Gemini: Google bought its API.

I don't know what to think about this, because on one hand I don't like not being asked permission, but on the other hand I'm glad that my opinions will be somewhat reflected in ChatGPT.

I’m a bit annoyed by someone else profiting off my words, though. I freely gave them to the world I guess, and never objected to search engines using them, but my words were not “monetized”, even if they were later used to sell advertising. But it just doesn’t seem right for Reddit and others to be cashing in.

My iphone lost that ability year’s ago