Pixelologist

@Pixelologist@lemmy.world
1 Post – 5 Comments
Joined 1 year ago

Please correct me if I'm mistaken, but isn't the Reddit dataset used to train LLMs from before ChatGPT became widely known? I was under the impression that data from that point onwards was poisoned and not useful for training purposes.

I can't seem to find it now, but I remember a ~90 GB .zip mega-database upload that got passed around a lot on machine learning subreddits. It was a snapshot of Reddit before x date.

I deleted all my comments except 2 explaining how to delete all your data, to hopefully inspire some others on their way out.

No, baby Jesus will come down and bitch slap you. Better not press your luck.

Is that what's happening when I try to reply and it just loads forever?

And then there's me with uBlock Origin and SponsorBlock always on.

I was raised to hate ads, and it worked. I'd rather support via Patreon.