Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT

misk@sopuli.xyz to Technology@lemmy.world – 1698 points – 5 months ago

tomshardware.com

Everything you submit to StackOverflow is licensed under either MIT or CC depending on when you submitted it.

Regardless of the license (apart perhaps from public domain), it is legally still your copyright, since you produced the content. Pretty sure in the EU they cannot prevent you from deleting your content.

But those two licenses give everyone an irrevocable right to do certain things with your content forever, and displaying it on a website is one of those things (assuming they follow the other requirements of the license).

If StackOverflow taught me anything, it's that legal jargon about copyright isn't very effective against Ctrl+C/Ctrl+V.

it is legally still your copyright, since you produced the content. Pretty sure in the EU they cannot prevent you from deleting your content.

They absolutely can, you gave them an explicit (under most circumstances irrevocable) permission to do so. That’s how contracts work.

I cannot speak for all of the EU, but at least in Finland, unlike in the US, a contract cannot take away your legal rights.

You can when it comes to copyright. That’s EU law, and anything else would be such a horrible idea that no country would ever set up a law saying otherwise.

If you could simply revoke copyright licenses, you would completely kill any practicality of selling your copyrighted works, and it would fully undermine any purpose copyright served in the first place.

So does that mean anyone is allowed to use said content for whatever purposes they'd like? That'd include AI stuff too, I think? Interesting twist there, hadn't thought about it like this yet. Essentially posters would be agreeing to share that data/info publicly. No different than someone learning how to code from looking at examples made by their professors or someone else doing the teaching/talking, I suppose. Hmm.

CC (not sure about MIT) virtually always requires attribution, but as GitHub Copilot showed, right now open-"media" authors have basically no way of enforcing their rights.

Probably cuz they gave them away when they open-licensed... you know... how it's supposed to work.

In most jurisdictions you can't give away copyright. That's why CC0 exists. And again, most open-source and CC licences require attribution; if you use those licences, you have a right to be attributed.

For super permissive licenses like MIT, it's probably fine. Maybe folks would need to list the training data and all the licenses (since a common requirement of even the most permissive licenses is to include a copy of the license).

As far as I know, a court hasn't ruled on whether clauses like "share alike" or "copyleft" (think CC BY-SA or GPL) would require anything special or not allow models. Anyone saying otherwise is just making a best guess. My best guess is (pessimistically) that it won't do any good because things produced by a machine cannot be copyrighted. But I haven't done much of a deep dive. I got really interested in the differences between many software licenses a few years back and did some reading, but I'm far from an expert.

So they have to carefully only source the MIT data?

It hasn't been tested in court, so any answer anyone gives is only a best guess.