Maybe It Should Be Illegal To Instantly Delete A Website's Archives - Aftermath

Technology@lemmy.world – 105 points – 2 months ago


That being said, if a third party, like the Internet Archive, wants to archive it they should have every right.

Maybe for sites from corporations or similar sources. But people should always have the right to be forgotten. And in fact in some countries they do have this right.

Right to be forgotten is about personally identifiable information. Other work is covered under copyright, which means if someone has legally obtained a copy of it, then as long as they're not distributing it, it's their right to do whatever the fuck they want with it. They can even hold it until the copyright expires, at which point they can publish it as much as they want.

A "Library of Congress" for published web content maybe. Some sort of standard that allows / requires websites that publish content on public-facing sites to also share a permanent copy with an archive, without the archive having to scrape it.

Sort of like how book publishers send a copy to the LoC.

I don't think requiring is a great idea, but definitely making the standard that you can do if you want would be very cool.

I'm not sure if i can agree with that. A third party cannot simply override the rights of the owner. If i want my website gone, i want it gone from everywhere. no exception.

That kinda also goes in the whole "Right to be forgotten" direction. I have absolute sovereignty over my data. This includes websites created by me.

Yes they can, otherwise Disney can decide that that DVD you bought 10 years ago, you're no longer allowed to have and you must destroy it.

Right to be forgotten is bullshit. Not from an ideological standpoint, but purely from a practicality standpoint: the old rule of "once it's on the internet, it's on the internet forever" holds true. That's not even getting started on the fact that right to be forgotten is about your personal information, not any material you may publish that is outside of that.

Disney can decide to terminate that license but the disc is another story. The license is for the media on the disc but the physical disc itself is owned by the person who bought it. This is literally why a company can remove a show or movie or song from your digital library. The license holder can always revoke the license. It was harder to enforce with physical media (and cost prohibitive in a lot of cases), but still possible.

No, they can't. Google first sale doctrine.

They can remove shit from your digital library because on page 76 of the terms and conditions that you didn't read, they redefined the word purchase to mean temporarily rent.

It's the same licensing agreement. I phrased what I said to specifically adhere to what they say in their own terms of use in accordance with FCC regulation.

https://disneytermsofuse.com/english/

If you were to, say in 1990, get caught broadcasting your copy of a Disney movie without the legal ability to do so, they could absolutely use the court system to revoke your right to the licensed copy of that media and have it confiscated.

No. When you purchase the dvd you become the owner of that specific disc... you never gained ownership of my website just because you visited and copied my content.

Yes, and when I archived your website, I became the owner of that specific copy of your website.

I'd better never see you bitching about AI scraping your content. I'll remind you of this very comment.

For what it's worth, I agree with the other commenter and, as much as I dislike AI as it currently is, I have never and probably never will bitch about the scraping. If I put things out there online, I am aware that they may be used in ways that I never intended. That's how it has always been, after all.

I would argue that AI is a derivative work and that is protected by copyright. Archiving a copy of something and keeping it for personal use is not derivative work and not distribution and that's not protected by copyright.

No, I never granted you any ownership of my content. Period. You didn't pay me, you didn't engage in any contract with me.

Simply archiving my stuff and running away then publishing it as your own is theft.

You've put it out there for free, though, and the data literally ends up on my machine because you made it do that, so what's the problem with me saving the data on my machine for later, and potentially sharing it elsewhere for free again?

then publishing it as your own is theft

This scenario (misattribution of content) has nothing to do with the previous discussion. The other commenter is making an analogy to CDs, owning a CD and lending it to others doesn't mean you're claiming its content is your own creation.
Theft implies deprivation of ownership. Calling this theft is like calling piracy theft. It may be illegal by this or that metric, but it's not normal theft.

You’ve put it out there for free

Irrelevant. It's still my content that I have sole rights to. If I want to share it to individuals I can do that if I please. You don't have any rights to do anything else with it.

and the data literally ends up on my machine because you made it do that

Incorrect. Your browser made it do that. How that data is accessed and displayed is not controlled by me. Case in point: you can have extensions on your browser that change how my websites are rendered.

That doesn't give you a right to replicate my content elsewhere.

and potentially sharing it elsewhere for free again?

Because it's not yours? And publishing it again elsewhere is effectively you claiming it is yours. Especially if published without attribution.

You guys can't have this both ways. If an artist makes a painting... and posts a picture of it. They have no rights to the painting anymore? They deserve no ownership/pay for what they've done? If a news story is published... They have no rights to sell that story to another publisher just because you can copy and paste the text? This is absurd logic. My website has/had a cost. I bore it. I have sole rights to that content.

This scenario (misattribution of content) has nothing to do with the previous discussion. The other commenter is making an analogy to CDs, owning a CD and lending it to others doesn’t mean you’re claiming its content is your own creation.

No, this has to do with rights of the content. Owning the CD grants you a license to the content on that CD. That's about as good as ownership gets there. They own the CD/license. As long as that CD exists/works. You don't gain that same right by simply visiting a website.

Theft implies deprivation of ownership. Calling this theft is like calling piracy theft. It may be illegal by this or that metric, but it’s not normal theft.

No it doesn't. Taking content and using it in an unauthorized way while gaining money or some other consideration is also theft. Wayback Machine and other archives are paid for somehow. If some content being on a site swayed someone to make a donation to that archive site, then that value should have gone to the original creator. That is theft. This is the core of most of the current lawsuits. Although they often equate this to "potential and future earnings", which is bullshit because oftentimes that content would never have been viewed at whatever cost they ascribed.

You don’t have any rights to do anything else with it.

That's patently false. At a minimum, I can quote parts of your content, just as you can quote smaller portions of any published text anywhere, you don't have to ask the publisher or author for permission. It's also ridiculous and impossible to control, the content is on my private machine already, how can any law be relevant or exerted upon what I do there? I doubt you're writing this comment on the basis of your knowledge of copyright law.

Incorrect. Your browser made it do that. How that data is accessed and displayed is not controlled by me.

You're arguing semantics that really don't make any difference. The display is irrelevant, because the data by itself is stored on my computer before it is displayed. That data is what you've put up online to be accessed.

Owning the CD grants you a license to the content on that CD. That’s about as good as ownership gets there. They own the CD/license. As long as that CD exists/works. You don’t gain that same right by simply visiting a website.

I fail to see the difference between getting a CD with some data (buying it or being given for free, as e.g. a gift) and being sent some data online for free. More importantly - says who? Does copyright law say this about websites?

If an artist makes a painting… and posts a picture of it. They have no rights to the painting anymore? They deserve no ownership/pay for what they’ve done?

This simply doesn't follow from what I've written. They certainly retain the rights to the painting. Besides, "deserving pay" depends on completely different factors than the ones we're discussing; usually artists sell the actual object, the painting. A digital reproduction is, as far as most people care (I think), merely an informative reproduction, and not the real thing. Stuff that's posted online for free is... free. It wasn't intended to make money directly.

Your final paragraph is really confusing me, you seem to be saying that Wayback Machine is also committing theft, which I'm pretty sure is not true (I've followed the lawsuits against IA for a while and don't remember anyone invoking that term). And at this point I don't know what "theft" is even supposed to mean to you or to anyone else, and what was the point of the discussion anyway. Maybe I should reread the whole discussion carefully all over again, but I'm on my phone and it's all giving me a headache.

the content is on my private machine already, how can any law be relevant or exerted upon what I do there?

So child porn is okay then? You would already have it on your system and got it for free on your private machine!

I doubt you’re writing this comment on the basis of your knowledge of copyright law.

I doubt you are either. Yet we're both here.

you seem to be saying that Wayback Machine is also committing theft

It does... on paper... A lot. https://time.com/6266147/internet-archive-copyright-infringement-books-lawsuit/ To the point it's losing lawsuits over exactly that.

So child porn is okay then? You would already have it on your system

You'd have to look for it, knowing fully well that it is illegal to produce in the first place and distribute to others, access it online, and then deliberately retain it. It's not really the same as something that's legal to produce and distribute (it is certainly legal for me to view your site). You wouldn't "already" have it.

I doubt you are either.

Well I've read some copyright laws, had to solve some issues regarding usage of copyrighted works, etc. Nothing that makes me an expert, but I'm not talking wholly out of my ass either.

It does… on paper… A lot. https://time.com/6266147/internet-archive-copyright-infringement-books-lawsuit/ To the point it’s losing lawsuits over exactly that.

That's not Wayback Machine per se, that's Internet Archive's book scanning and "digital lending" system, which was most definitely doing legally questionable (and stupid) things even to an amateur eye. However, Wayback Machine making read-only copies of websites has so far never been successfully disputed.

You wouldn’t “already” have it.

You've missed the point. Simply having something on your hard drive is already something the law does care about. It simply depends on the something.

Well I’ve read some copyright laws

So have I. Because I had access to an exception under it in my prior job. Seems like we're still on the same page here. Not sure why you'd feel the need to call out someone else's knowledge on a topic that you have no idea about.

However, Wayback Machine making read-only copies of websites has for now never been disputed successfully.

Except it has. That's why administrators can exclude domains from it. DMCA notices also can yield complete removals.

Well the whole premise of their argument is flawed because they're basing it on the fact of redistribution. If I'm not redistributing it, then the whole argument of that falls away entirely. Under fair use, I believe you're also allowed to make copies of things for research purposes, so I'd argue that's what an archive is.

Copyright only protects distribution and derivative works. I can keep a copy of it on my local machine for as long as I want. Theoretically I can keep it until the copyright expires and then I can do whatever the fuck I want with it.

I can keep it until the copyright expires and then I can do whatever the fuck I want with it.

General copyright is 70 years. So no. You couldn't do whatever you wanted with it, as the computer you're using would be long dead... and possibly you'd even be long dead. Replicating the content to another device without the owner's consent could and likely would be a violation of that same copyright.

Replicating a personal backup to another device is covered by fair use. Only distribution and derivative works are covered by copyright.

And yes, the length of copyright is way too long. I reckon it should be the same as patents, 20 years. Or let it be as long as the warranty and let the big companies duke it out with each other.

You compare entirely different things here. I'm talking about a website I own, not a product I sell. And no, this "on the internet forever" is complete and utter nonsense that was never true to begin with. The amount of stuff lost to time easily dwarfs what's still around.

You chose to distribute said website to everyone on the internet. I chose to exercise my rights of fair use to make a local convenience copy of said website. I can then theoretically hold said local convenience copy for as long as I want, until your copyright expires, at which point I can publish it.

It's a bold assumption that that data is not just sitting on someone's hard drive somewhere.

You are moving the goalposts. Again. The talk was about the Internet Archive providing a copy of my website to the public. Not you storing it somewhere on your drive for personal use. Although that's also a rather tricky legal matter.

But nice of you to agree with the rest. Yes, you could at one point publish a copy. 70 years after my death, and not a second before that. And only if it's not specifically protected because it contains personal information. I think the protection is not limited in that case.

Information doesn't have "owners." It only has -- at most -- "copyright holders," who are being allowed to temporarily borrow control of it from the Public Domain.

Imagine the absolute historical clusterfuck if terrible politicians and bad actors could just delete entire portions of their history.

This is just like AI scraping

Edit: if you allow a third party to "archive" your content, the ship has sailed. I'm not advocating for or against anything but once your stuff is scraped (by anyone) it's gone.

Not really. If the archive decides to publish your work, that's copyright infringement. If an AI company decides to scrape your content and develop an AI with your content, I would argue that that's a derivative work, which is also protected by copyright.

I'm not discussing what they do with it, I'm discussing the raw act of ingesting your page.

Cats and bags

To venture into opinion, I think there shouldn't be "every right" to archive your page, for any purpose, such as archiving or AI or whatever.

Edit: but I acknowledge how the open internet works and the futility of trying to control that.

It seems like a very dangerous, very slippery slope. The first people to abuse this would be the big corporations who want to hide and cover up as much as they possibly can. I think the copyright law framework is a useful lens to view this with, which I outlined in my response above.

Totally get what you're saying, but I'm highlighting that the mechanical step of a third party having "every right" to scrape or persist your content is in complete contrast to the other points in this thread about rights to be forgotten and so on.

Right to be forgotten is specifically for personally identifiable information. And I'm pretty sure it's sound on copyright grounds as long as you don't distribute. And honestly, I don't really see a problem with it.

And if you've made a personal website, say, with a blog of your valuable ideas/art (valuable to you, or anyone, arbitrarily), the ability to erase your site represents forgetting. The whole site may contain your PII throughout.

Any scraping or archiving techniques degrade that right.

You have a right to be forgotten. Your ideas and the work you create do not.

Yes except AI companies are making mad cheddar.