Mastodon thinks Lemmy’s privacy stinks. What say you?

elbowmacaroni@beehaw.org to Technology@beehaw.org – 160 points –
Warning: Lemmy doesn't care about your privacy, everything is tracked and stored forever, even if you delete it
raddle.me

Federated services have always had privacy issues. I expected Lemmy to have the fewest, but it's visibly worse for privacy than even Reddit.

  • Deleted comments remain on the server but hidden to non-admins, the username remains visible
  • Deleted account usernames remain visible too
  • Anything remains visible on federated servers!
  • When you delete your account, media does not get deleted on any server
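For anyone unsure what "remains on the server but hidden" means in practice, here is a minimal sketch of a soft delete versus a hard delete. The table and column names are made up for illustration and this is not Lemmy's actual schema or code; it only shows the general pattern being criticized.

```python
import sqlite3

# Hypothetical schema for illustration only; not Lemmy's real tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comment ("
    "id INTEGER PRIMARY KEY, author TEXT, body TEXT, deleted INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO comment (author, body) VALUES ('alice', 'something I regret')"
)


def soft_delete(conn, comment_id):
    # The row stays in the database; only a flag changes.
    conn.execute("UPDATE comment SET deleted = 1 WHERE id = ?", (comment_id,))


def hard_delete(conn, comment_id):
    # The row is actually removed from the table.
    conn.execute("DELETE FROM comment WHERE id = ?", (comment_id,))


def public_view(conn):
    # What a regular visitor would be shown.
    return conn.execute(
        "SELECT author, body FROM comment WHERE deleted = 0"
    ).fetchall()


def admin_view(conn):
    # What someone with database access can still read.
    return conn.execute("SELECT author, body, deleted FROM comment").fetchall()


soft_delete(conn, 1)
after_soft_public = public_view(conn)
after_soft_admin = admin_view(conn)

hard_delete(conn, 1)
after_hard_admin = admin_view(conn)
```

After the soft delete the public view is empty while the admin view still holds the full text; only the hard delete removes the row, and even then any backup taken earlier would still contain it.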

In my opinion it's unreasonable to think anything can truly be deleted in a federated system. Even if the official codebase is updated to do complete deletion & overwrite, it's impossible to prevent some bad actor from federating in a fork that just ignores deletion requests.

Seems sensible to just not post anything that you don't want to be available for the lifetime of the internet.

Just as it's impossible to stop scrapers from archiving data on traditional websites. "Deleted" data is probably in a database somewhere, being sold by someone. As you said, you lose some degree of control over your data as soon as you post it. Data is valuable, and if there is a will there is a way.

In my opinion it’s unreasonable to think anything can truly be deleted in a federated system.

yeah like. this is just a byproduct of how federation works currently. i don't even know how you'd begin to design a federated system where some of these critiques can't be levied

Anything that is visible to another party can be hijacked - even a 1:1 communication does not guarantee that the other party doesn't capture the data and then spread it. The only things that are private are thoughts that you have which are not shared with others in any fashion. As soon as information is shared in any fashion, it is not private.

Past this point it's a matter of how private you think is reasonably private. You could design a system where users are in control of their own data through a series of public and private keys, ensuring that keys must be active to view content, but as stated above, even in such a case the user revoking keys does not stop other people from making copies of said data. This is akin to screenshotting an NFT. For all intents and purposes, a copy of the data as it existed at the time of copying is now publicly available.

Quibbling over the fact that you're the one who "truly owns" the data when it comes to something like social media feels like a mostly pointless endeavor because the outcome (data is available for others to view/consume/read/etc) is the same regardless of who "owns" it. Copyright law will apply to anything you produce if it comes to legal problems (someone copies your artwork and sells it, for example), and having a system to prove you own it is primarily a formality to make it easier to prove ownership. Generally people aren't arguing through this lens, however, and are instead arguing through the privacy/security lens - that they don't want people stealing/selling their data, which lol, good luck. AI models are proof that no one in the world actually cares about this ownership if they reasonably think they can get away with using your data without any real incentive to not do so - interestingly, copyright law and models being trained on corporate data such as movies are a vector by which the legality of this might actually stop or slow AI development and protect the end users' data.

I don't expect my data to be fully deleted in a centralized system either. even if it was deleted from the central server someone might have made an archive of it

and reddit is definitely guilty of this since they were bringing back people's deleted comments and accounts

This is how I treated Reddit too. And Twitter. And everything else. I have two modes; public and private. And private is private; strong encryption and local storage. Having some middle ground is a recipe for disaster.

Exactly. Even if a server were to just go down one day, theoretically it still has a snapshot in time.

You don't even have to modify the code in a fork, just take regular database backups

First - we're all using alpha/beta software (Lemmy is 0.17.4, Kbin is 0.10.). None of these services are "production quality" software yet, so let's keep that in our minds - we're all early adopters.

The points mentioned in the OP are a bad look. Naturally. Users should have an expectation of their data being deleted on request, especially since this request might be a regulatory privacy request (GDPR related). It's a clear failure of the software and should be improved and iterated upon.

The expectation shouldn't be "oh well it's on the Internet, live with it". While Facebook might keep mining your data after deletion request, our software shouldn't behave like that, we should strive to be better with this stuff.

And finally, ensuring privacy in a federated system is hard. Mastodon suffers from the same problems. We shouldn't give up on the idea though.

It is early-stage software and such things can be worked out, you're right. But on the other hand, such basic elements should be based on a thorough concept before a single line is coded, and implementing something like a delete button with "Let's just make it delete the most visible stuff for now, we can always improve that later when there is time" is a recipe for disaster.

Agree, it's a little late to change core architecture. But this is the philosophy the devs ran with, and it has the advantage of longevity when an instance goes offline, then it's still visible to everyone else.

The more important part for privacy: Mail address is optional, and IP addresses are not stored in the database. A correctly configured instance (at least for EU legislation) also will not log IP addresses in the web server - with that you can have profiles that can't be tied to an actual human, and you don't have location and movement data.

The data deletion is pretty much a nice to have - it's on the level of the Exchange feature to recall Emails: Sure, you can ask nicely, but outside of your own server pretty much nobody will care. Lemmy is federated over multiple jurisdictions, so even with full deletion implemented there'll almost certainly be instances which will ignore the deletion request - and it will be completely legal for them to do so. More important is education about what you publish, and a basic understanding of the technical and legal realities you'll have to deal with if you later decide you want that information gone.

I already had that discussion with my 6 year old when she wanted to publish some videos - and she understood the problems quite well.

but outside of your own server pretty much nobody will care. Lemmy is federated over multiple jurisdictions, so even with full deletion implemented there’ll almost certainly be instances which will ignore the deletion request - and it will be completely legal for them to do so

Lemmy also seems to federate your matrix_user_id, that is clear personal data. It does not matter how the data gets to the federated server, this is still user data within the scope of the GDPR. It does not matter that that server does not have an agreement with the user, the instance that would ignore a GDPR related deletion request would be in direct violation of the GDPR. Maybe it can do that without consequences, though.

I completely understand that making Lemmy fully GDPR compliant will probably be impossible, however I don't like the approach of "we will not succeed, so we don't make any attempt". Instances should actually delete data when that is requested, or instance hosts can get fined. For now, Lemmy has bigger issues to solve, but eventually they should do at least a best effort attempt to respect user data.

I had a look into the wording of the GDPR (more specifically the Data Protection Act as it is implemented in the UK), and it seems to refer to organisations. I think most, if not all, instances are not hosted by organisations (just some group or individual hosting it on personal or rented hardware). Laws such as this are designed with centralization in mind, and kind of don't make sense in the context of decentralisation.

Lemmy also seems to federate your matrix_user_id, that is clear personal data.

Just like specifying an email address when signing up adding a matrix identifier is your personal choice. Lemmy is perfectly usable without either.

It does not matter how the data gets to the federated server, this is still user data within the scope of the GDPR. It does not matter that that server does not have an agreement with the user, the instance that would ignore a GDPR related deletion request would be in direct violation of the GDPR.

Not a lawyer, but I'd say an instance outside of the EU that is not targeting EU users would not be in violation - though EU instances transmitting data there might.

Instances should actually delete data when that is requested, or instance hosts can get fined.

With that part I agree - but it should be made clear when deleting something that this is a local deletion, which may or may not propagate to other instances, and will almost certainly not remove the data from the internet.

EU instances transmitting data there might.

This is an interesting thought, as data transfer between the US and EU has been an issue with other social networks. Federation between an EU instance and a US instance could be seen as the same thing - data for EU users is being transferred to non-EU servers.

It's very possible that an EU instance that comes under regulatory scrutiny for whatever reason will have to start requiring Data Processing Agreements (DPAs) from every instance it federates with.

Ultimately that would likely result in a few paid, professionally run instances, which only federate with each other and maybe a few similar instances in other regions with the capacity to provide DPAs.

And next to that, a forest of independent, non-conforming instances flying under the regulatory radar; an entirely separate fediverse from the centralized one where instances disappearing is a regular occurrence.

But is it solvable at all in principle? The only enforcement policy available is defederation, but that just means future posts won't go to that instance, the older posts will still be there. Plus an instance could just lie when confirming delete requests and you'd never know unless the non-deleted posts leaked.

Not really, same as email. Once you send it out and it's on somebody else's server, you can request they delete it but that's about it. They have a copy of your message and can do whatever they want with that.

This is not a principle that needs solving imo, it's the nature of Internet. If you post it online then you should know that there's a chance it'll be there permanently.

Hmm, it's an interesting problem. I'm afraid you are right and there's really nothing left but defederation - on the other hand, then it's the same as with stuff like the parsers that could show deleted reddit messages, or things like waybackmachine, which basically do the same, so the core logic of base lemmy source should be as privacy-respecting as possible.

I remember reading about Signal a few years ago that there is some way to verify that their server is running the same code as the one published (and audited heavily), so you can be 100% sure that there were no modifications. Wouldn't something like that be a solution? That would prevent servers from modifying the code that deletes data. I don't know how it works, and I couldn't find it when I tried looking for it again, but assuming such a thing is possible, each Lemmy instance could just have a verify widget on their VCS and you could be sure that this instance really does delete your data, since they didn't modify the deletion code.

But this is just theorycrafting. I don't really have enough experience to create something like that, and I can imagine that it's not an easy thing. But if anyone knows more details about the way Signal verification works, assuming I didn't just misunderstand something (since it's literally a memory I have of a single sentence from one random article when I was researching the best private messaging app), I would love to read more about the way it works!

But yeah, outside of that, I'm afraid that the following set of features is mutually exclusive:

  • A user is able to delete their data, and it's guaranteed that it is deleted from everywhere.
  • If a lemmy instance dies, its data is not lost.
  • There is not a single centralized authority for anything.

Another option would be to create some kind of reputation system, where self-hosted bots could check for servers that still provide posts and comments that should be deleted, and flag offenders. But that's overengineering anyway, and as I've already said, there's still no way to stop a scraper or anyone else from simply copying your data when they see it.
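To make the reputation-bot idea a bit more concrete, here is a rough sketch of what such a check could look like. Everything in it is hypothetical: `find_offenders`, the `fetch` callback and the example hostnames are invented for illustration, and a real bot would need actual HTTP requests, rate limiting, and some way to learn which posts were deleted in the first place.

```python
from typing import Callable, Dict, Iterable, List, Optional


def find_offenders(
    deleted_ids: Iterable[str],
    instances: Iterable[str],
    fetch: Callable[[str, str], Optional[str]],
) -> Dict[str, List[str]]:
    """Return, per instance, the deleted post IDs it still serves.

    `fetch(instance, post_id)` is a stand-in for whatever lookup a real
    bot would do; it returns the content, or None if the post is gone.
    """
    deleted = list(deleted_ids)  # allow any iterable, reuse per instance
    offenders: Dict[str, List[str]] = {}
    for instance in instances:
        still_served = [
            post_id for post_id in deleted if fetch(instance, post_id) is not None
        ]
        if still_served:
            offenders[instance] = still_served
    return offenders


# A fake "network" so the sketch runs without touching any real server.
fake_network = {
    "good.example": {},
    "stale.example": {"post/1": "oops"},
}


def fake_fetch(instance: str, post_id: str) -> Optional[str]:
    return fake_network[instance].get(post_id)


report = find_offenders(["post/1", "post/2"], fake_network.keys(), fake_fetch)
print(report)
```

Note the limitation already mentioned above: a bot like this only sees what an instance chooses to serve publicly, so an instance that hides the post but keeps the row would pass the check.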

So, I was born in the late 90's - I don't know if they still have "computer literacy" as a core course in schools these days, but they did when I was going through K-12 (or, well, K-9... once you were in high school they assumed you knew the basics of how to use a computer, and had more advanced courses).

One of the very first things we learned about the internet is that once you put something on the internet, there is no way to take it back. At the time, uploading pictures to the "cloud" and such wasn't really a thing so we learnt this by using email: Once you've sent an email to someone, you cannot "unsend" it. You can kindly ask the other party to delete the copy of the email without opening it, but you cannot guarantee that the email wasn't saved on another computer, or saved somewhere else along the route between your computer and the receiver's computer. Clicking the send button was taught to us as "etching your letter into stone".

Because of this, I've always (or at least, as far as I can remember) made sure that anything I put on the internet, or even "put into digital form" (such as even writing something in a file on your computer - you can recover deleted files from a hard drive unless you really put in the effort to actually erase it... there is a huge difference between erasing a file, and marking it as "deleted") is something that I'm okay with being tied to me forever. I'm sure if you looked hard enough, you could find me participating on message boards as a young teenager - and to that I just say "Oh well". Is some of it probably very cringe-inducing and embarrassing? I have no doubt.

(This is also why you should take extreme caution when talking about say, your friend, on the internet - if you post something about them on the internet, you're condemning them to this same exact thing)

Now funnily enough, as far as I understand the ActivityPub protocol, it is for all intents and purposes the exact same as email in this regard. Once you've sent something, there are no "take backs". All you can do is kindly ask others to delete their copy, and that comes with zero guarantees. If I had a mastodon server, and someone deletes their toot - I could take down my server and my server would never receive that delete request. Or, just simply change the source code of the Mastodon instance on my server to straight up ignore deletion requests.
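Here is a toy simulation of that point. `Create` and `Delete` are real ActivityPub activity types, but the `Instance` class, the simplified dict shape of the activities and the hostnames are all invented for this sketch; it is not how Mastodon or Lemmy are actually implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instance:
    """A pretend federated server that stores whatever it receives."""

    name: str
    honors_delete: bool = True
    posts: Dict[str, str] = field(default_factory=dict)

    def receive(self, activity: dict) -> None:
        if activity["type"] == "Create":
            self.posts[activity["id"]] = activity["content"]
        elif activity["type"] == "Delete" and self.honors_delete:
            # A cooperative server drops its copy; nothing forces it to.
            self.posts.pop(activity["id"], None)


def broadcast(instances: List[Instance], activity: dict) -> None:
    # Stand-in for federation: deliver the activity to every known peer.
    for instance in instances:
        instance.receive(activity)


home = Instance("home.example")
friendly = Instance("friendly.example")
fork = Instance("fork.example", honors_delete=False)  # patched to ignore deletes
peers = [home, friendly, fork]

broadcast(peers, {"type": "Create", "id": "post/1", "content": "oops"})
broadcast(peers, {"type": "Delete", "id": "post/1"})

remaining = [instance.name for instance in peers if "post/1" in instance.posts]
print(remaining)
```

The sender has no way to tell from its side whether `fork.example` complied, which is the "zero guarantees" part.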

Would it be nice for Lemmy to have a way to actually delete your content? Sure. But that's not technically feasible, and personally (as controversial as it may seem) I would rather Lemmy not try to give you the false sense that everything was completely gone forever. I'm not saying that you shouldn't be able to delete your account off a Lemmy instance, but it shouldn't come with an option that says "Check here to remove your data/media from all federated instances" because Lemmy/no one can promise that, and I really hate it when software (or really anyone/anything) attempts to make a promise in bad-faith knowing that they can't possibly ever uphold it.

Anyone who thinks Reddit is "better" than Lemmy in this regard probably doesn't realize that Reddit is making a claim they can't keep. The most obvious example of this is all of these subreddits that have gone dark? You can bring up most of their posts on the Wayback Machine or Google Cache. That would be the case regardless of whether they were set to private, or even if they were just straight up "deleted".

We really should not be setting the belief for people that there exists a way to completely nuke a piece of data off the internet, because you cannot make a guarantee of that being the case.

Not a guarantee, but a reasonable effort would be good.

Consider doxxing. It would be better if instances propagated delete requests to the fullest extent possible so that that information would be as hard as possible to find.

Moderation is a separate matter entirely.

Not if deletes don't propagate well.

Propagating deletes is a request from one moderator to another moderator. If the 2nd moderator doesn't cooperate with the delete, then you have a moderation policy issue.

I don't really agree with this. The core behavior of Lemmy should be to make a reasonable effort to delete it, which as I've understood it doesn't really.

And you don't have to give people a false belief - the button shouldn't only say "Request removal of data from all Federated instances", but also add that "But keep in mind that it's not possible to enforce deletion from all instances in a Federated environment, and some instances may refuse to comply".

I think we should strive for privacy as much as possible, and by default the instances should comply. Sure, there's nothing stopping anyone from not complying, but that doesn't mean that we shouldn't at least attempt to do it.

It is reasonable that people should be able to delete their posts / comments. However I don't see how is this related to "privacy". How can something you post on a public forum be private?

its the principle behind the 'right to be forgotten'

if you posted something to a public forum and changed your mind, deciding it shouldnt be public after all, you should have that option

While this makes sense for corporations - it doesn't really make sense on the internet. People will archive, take screenshots, etc. Anything that is public on the internet will likely stay on someone's computer for years no matter how much we try to delete things.

It is kind of naive to think that the right to be forgotten will be respected by anyone other than the service provider.

You can’t delete a mail you sent me, nor put your hand written letter to me in the bin. I can keep both and I can keep your name and addresses in my little black book. So there isn’t even that level of privacy in the real old fashioned communication.

And communication over the Internet was always the subject of storage. Your mail may be on the backup tape of a mail server. Your usenet posting is on archive.

So the assumption that the fediverse can forget….

There are long-dead people's very private letters and diaries in museums and public archives. Readily available on the internet now. So that's not even a failing of the internet: if you write something people find interesting, they'll find a way to preserve it.

I'm not sure how they think the fediverse will be the one to solve that.

That is generally true, with exceptions like leaking someone else's private information.

But it implicates the adjacent "right to be forgotten" rather than narrowly defined "privacy". This could be a real legal issue in the EU.

It is. GDPR in the EU dictates that every user which requests their information has to get it in 30 days, and every user who removes their information has to be able to get it removed (I think the time span for that is even shorter, so more pressure for the server admins)

It almost definitely isn't and that's clear looking into GDPR at all.

The right to be forgotten is not all powerful, and the lemmy instance your data originates on has an obligation to delete your data, that is true. However other servers may or may not have any of that obligation for a variety of reasons.

Now if you go to those other servers and make the request to have your information deleted, they may have an obligation to depending on whether that data is seen as currently usable.

The right to be forgotten is far weaker than you think it is, especially on public forums, under GDPR.

The problem here is that your data is not only collected by your server and accessible to your server admins; the servers of the communities/magazines or people you interact with also collect any activity you have in relation to any community/magazine or user hosted on their server.

So, while the admin of your server has the obligation of deleting your data if you ask for it, the other servers admins don't necessarily have that obligation.

Also, I'm reading the GDPR and the "right to be forgotten" that many are quoting seems to refer to personal information only.

I'm also not sure how it's enforceable in a distributed system.

Blockchains have the property of being append-only, so a blockchain is precisely what makes it impossible to delete transactions. That being said, in a distributed system, once the message leaves trusted servers, it is obviously also impossible to delete it.

Nothing about how lemmy or the fediverse platforms work has anything to do with blockchains. Don't conflate "decentralization" to include blockchain. Torrents are also decentralized and have nothing to do with blockchains.

Why are you bringing up blockchain?

Lovely, the parent comment mentioned blockchain but was since edited... Trust me I would not have brought it up otherwise.

Probably in the sense that if it's not me that posted it, then I don't have any way of truly removing it (which I think is against the EU's laws).

What I can think of right off the top of my head is revenge porn and doxxing. Furthermore there's also the right to be forgotten.

Did anyone use reddit thinking it was private? With stuff like push shift and way back machine people shouldn't be posting stuff they aren't comfortable sharing anyways on a wide open message board.

Always weirded me out the people who'd treat their reddit accounts like Facebook.

With stuff like push shift and way back machine

So much this. I don't get why people don't remember this first thing when it comes to data storage.

But my manufactured outrage. How will I get my emotional highs now?

What does this have to do with Mastodon?

The same privacy issues also exist with Mastodon and all distributed systems.

BTW, the OP on Raddle was spamming that message around Reddit last week and directing people to Raddle. I think he has a bone to pick with the developers' politics more than anything.

That’s probably because the Lemmy dev’s “politics” are beliefs that have no place in a civilized society. Luckily, Lemmy itself and the fediverse writ large don’t have any relationship to those beliefs.

True, and I agree. Thing is, I don't think the raddle base is willing to look past that, or consider how federation makes the dev's beliefs largely meaningless outside of the server he moderates.

100% agree

It's funny, the commenters in that chain are calling out Lemmy users for being blind to dev-related issues, but they demonstrate a strong myopic slant in their own assessment of the landscape.

Anything put on the internet is forever. No one should be publicly posting anything with the expectation that they have any control of it after it goes out. If it’s not held by the server, there’s the way back machine or even just folks taking screenshots.

I completely agree. I just don't see how there can be any realistic expectation of privacy when publishing something publicly.

I appreciate the idea of laws establishing a right to be forgotten and I think there's still some value in being able to take your data away from certain companies, but there's no guarantee it wasn't copied many times before the original location is taken down.

The Fediverse works like email. Once somebody hits send, there's no real way to claw that back.

There's a difference between "there's no way to guarantee total privacy" and "the system is designed to guarantee no privacy", though. Even the best of us fuck up and say something they shouldn't on occasion, and plenty of people online were never given proper lessons or are too young to understand how serious revealing information is.

Whether it's Lemmy, federated, corporate owned, or even your own private site - nothing you put on the internet is ever truly private. If you have a public profile someone can access it and copy it.

The only things I'll say that I have an expectation of privacy is health related, everything else I fully expect someone else to read, copy, and multiply.

I think there should be, but I never expect there to be. Did people's parents not teach them about putting things on the internet they didn't want shared?

Did people's parents not teach them about putting things on the internet they didn't want shared?

They used to, then social media became a thing and they stopped. Suddenly, it was normal to put your entire life up online for other people to see, and if you didn't feel comfortable doing that you were the weird one.

My rule is, never post anything you wouldn't mind the media tracing back to you IRL and then making the top story of the day in your country. Because, while rare, that does occasionally happen!

My rule is, never post anything you wouldn't mind the media tracing back to you IRL and then making the top story of the day in your country.

So don't live, basically.
Or you can just maintain anonymity as best as you reasonably can and hope no one goes out of their way to identify you or the account(s). Making a new account after awhile is a safe practice. The goal is to decrease the likelihood of undesirable things, not make them impossible.

Odd response, you can still “live” without documenting your activities. Were people not living pre-Facebook/Instagram?

Probably because it became very profitable to let everyone do that 😔

Exactly, when you put it out there it's out there on every single platform there is. It doesn't matter if you "delete it", the moment you share it you have lost control over it entirely.

For the same reasons I never understood why people post on Facebook with their own full name and life story out there in the open either.

True, but you should still be able to delete your account and have your comments and username leave the service. Online privacy isn't about completely disappearing, but about making yourself so hard to track that the average person won't bother digging.

I mean yes but it's still bad practice to keep deleted content. It'll be a bad look to people interested in switching to lemmy and more people is really what it needs right now

This is generally true, but at the same time, the Internet archive doesn't archive every single page ever.


I understand the impulse but the way some people get so hung up on trying to make a way to permanently and universally delete posts made on public facing social media and framing it as a "privacy" issue feels kinda like saying something you regret on mic at a town hall and being mad that you can't permanently delete the memory of it from the minds of everyone present, and claiming that they violated your privacy by remembering it

it's an interesting idea, but it doesn't vibe with the reality of the laws in the EU which has "right to be forgotten" rules

The "right to be forgotten" rules are, with all due respect to the EU regulators, pretty shortsighted.

I think the initial "right to be forgotten" lawsuit was the one Google faced from that Spanish guy, who had claimed bankruptcy years prior. People (potential lenders?) kept finding that information online through Google searches. He sued to have Google remove those sites from the index. He won, and the Spanish judge told Google they had to remove those results from searches.

But it didn't change that the information was still on each site. Those sites, the ones that actually held the information didn't get sued, just Google.

It also opened the door for oppressive governments to cover up human rights abuses or hide other information they don't want widely available.

Google appealed and won: https://www.bbc.com/news/technology-49808208

I also want to point out that this Spanish guy's situation is very different from "posting publicly on social media". He was getting written about by others and the courts eventually said "no, this can stand. This information should remain available". So I imagine, public statements made by an individual certainly wouldn't qualify to be forgotten.

At the end of the day, to me, this is a technical decision not a privacy one.

I think this is a great point. I would say it's much less of a privacy issue and more of a technical issue.

I think deletions should propagate across all instances and there should be a level of trust between federated servers that they will make those deletions as requested. If only because we'd have a mismatch and orphan comments lingering in perpetuity and we could end up with wildly inconsistent data across the fediverse.

That's a strawman. No one demands mind-altering powers. Records to be deleted: that's another story.

Being able to delete tweets doesn't stop people from screengrabbing them. It's still good that the option exists.

The illusion of privacy in Mastodon (or social media in general)

There's a reason why when you go to "private mentions" on Mastodon, this appears:

Private mentions. Posts on Mastodon are not end-to-end encrypted. Do not share any sensitive information over Mastodon.

While yes, we should be able to delete our content if we want, but it's a bit naive to think there could be true privacy in any decentralised social media platform.

There's a reason why one of the things people tell you when you come to the fediverse is not to share personal and sensitive information.

The only decentralised social media that has some level of privacy is Matrix, and that's why it has its own protocol and only federates within/between its own servers.

In general I think we should go back to separating personal identities from internet identities on discussion forums like these. There are already platforms for promoting your personal identity that are way better than these types of forums

I completely agree. I'd add that, in general, I wouldn't put any type of personal information on the internet; no social media site is really private.

The line gets a little blurry if you start posting into a geographical community though. Sometimes it’s hard to stay 100% anonymous

I was rather peeved I had to give an email to create an account on Lemmy. It shouldn't be needed.

I have an email that I specifically use for the fediverse. I wasn't asked to give an email here, but without one it would have been hard to know when and whether my join request was approved or not.

Unfortunately there has been a wave of fake accounts being created on lemmy. Requiring email on signup is one way to try to prevent this from happening.

There’s a reason why when you go to “private mentions” on Mastodon, this appears:

Lemmy carries the same warning:

While yes, we should be able to delete our content if we want, but it’s a bit naive to think there could be true privacy in any decentralised social media platform.

Especially an email or "reddit" threaded conversation systems where quoting of messages is routine. Here I am, quoting you.

You are putting a billboard up in public, on a bulletin board in the center of the Internet, the assumption should be that anyone can photograph it.

Exactly.

On top of that, the function of thread-like social media is to be a place to discuss topics and share information/knowledge. So content needs to be kept even if the account that posted it no longer exists. The content remaining when the account gets deleted is a feature, because otherwise important information could be lost.

Content deletion should be an option, but the content remaining if you delete your account is a needed feature for this type of platform.

This demonstrates a fundamental misunderstanding of digital privacy. You can never be guaranteed that data is deleted, just like you can never be guaranteed that someone has "forgotten" something. It doesn't matter what any entity claims they are doing under the hood, you have to assume they can't be trusted. That's not an expectation you can have, and not something privacy advocates are asking for.

I'm posting this comment publicly, and there's nothing stopping any random user (or non-user) from scraping this lemmy instance and archiving the data themselves. I know that when I post it. Same for reddit, raddle, any mastodon instance, etc. I can copy the text and usernames of everyone involved in that raddle thread and do whatever I want with it, there's nothing anyone can do to stop me.

To think otherwise reminds me of that first day on the internet kid meme. "I deleted my comments off of their servers, hah, they'll never get them now!"

What I can demand is: if I send a message directly to another party, I want to be able to verify that that party and ONLY that party can read the message (end-to-end encryption). I can also demand that they not require me to dox myself to them, that they not run weird js-based fingerprinting/port scanning processes on my system/network, and that I am allowed to connect to their services through a VPN should I so choose.

You're talking about real privacy, the critiques above are all about exposure reduction (incorrectly framed as privacy). Good retention policies are still important for situations like trying to delete something that you regret posting.

An example I could think of from the other site is the very common occurrence of posting some relationship questions and then deleting them later so that the person they're about can't stumble onto them. In that case you want finding the thing you deleted to be nontrivial enough that it can't accidentally be found. Someone with both the skills and knowledge about what they're looking for may still find it, because it was once public, but that's a different threat.

Knowing that any information you share publicly can be stolen, I think the way Lemmy's instances have the original comment after you deleted it could help counteract people manipulating what you said after you deleted it, such as making a quote and editing "your" original post after it was deleted. But this could give a lot of power to the admins as well, as they could be the ones manipulating.

The same is true for raddle. They kid themselves if they think anyone can't record anything in there forever.

Anyway, it's also inaccurate. Deleted accounts are purged from the DB, so they're definitely not visible anymore.

Likewise, if you edit your comment, it's edited in the DB.

So what you're saying is that it’s just like Reddit in that respect.

Yeah, I can live with that, as long as everyone knows that if they really want something deleted, edit over it first.

For a humbling experience, just search for your Reddit and Lemmy IDs on a search engine. You will get a list of everything you have posted, plus some account info. It is all public. What happens when something is deleted depends on who has scraped the data and their retention. This is just how public forums are, and that goes all the way back to Usenet and listservs.

Deleting your account should also delete all your comments properly

This is assuming your local is still federated. If your local gets defederated you currently have no control over any previously federated copies of your posts / comments / votes.

And it also assumes, no one made a screenshot or used the web archive, crawled it and stored it in their own DB or any other way of copying stuff. Of course!

If you post anything publicly on the internet, there is no way to be 100% sure it can ever be deleted again.

That isn't what I am speaking to, and the fact someone could make a copy or it is archived somewhere doesn't make the statement that you can always remove your data from the platform true. And there is a difference between a potential copy and an original federated, distributed, and indexed version. There are also reasons someone might want to remove their data other than simply being worried about the actual content of it.

People need to be aware of the persistence of data, but people also have to understand the technology they are using to make their own informed decisions on how they engage.

People need to be aware of the persistence of data, but people also have to understand the technology they are using to make their own informed decisions on how they engage.

Exactly. Federation, like the internet in general, limits whether you can delete your data. This should be known. Non-federated data has the same problem, but the other way around. Someone running the site wants your stuff gone? It is now.

I know, what you are talking about, but there are things one has to accept, this being one of them.

the fact someone could make a copy or it is archived somewhere doesn't make the statement that you can always remove your data from the platform true.

Why would someone think that?

And there is a difference between a potential copy and an original federated, distributed, and indexed version.

What is this difference? What do you think happens more often, screenshotting weird/compromising stuff someone said or defederation?

But there could be a way around all that: deleting all content from defederated sources. Maybe someone could open an issue or implement it themselves...

Why would someone think that?

Because the comment I replied to, the actual thing I am addressing, makes an assertion that isn't entirely true and could lead someone uninformed into believing they can have their information removed platform wide.

What is the difference?

Not everyone is concerned with someone digging up dirt or wildly compromising material. Most people aren't special enough to be worried about that.

Most archives won't be globally search indexed. An archive won't show up on a federated search. There is more legitimacy to a federated version over someone reposting a screenshot (at least in perception, how federated could be altered or forged is another topic).

I also mention there are other reasons one might want to remove content. Just look at reddit right now, some may simply want to revoke support for a platform sometime in the future.

Sure, there could be a future where this is addressed. It isn't right now.

I don't disagree with you in the larger discussion on persistence of data. I am adding context to a scoped subtopic of it.

I'm behind Lemmy, but I've made an informed decision on what that means for my data.

You are also kidding yourself if you think that defederation will not become more common. The community we are commenting on has already defederated 2 very large instances.

Given the beta status of Lemmy, I don't even think it's a great idea to give the appearance of privacy. I think the core purpose of a webapp like Lemmy is public messages.

I think it's a can of worms for server operators to get into the business of thinking they can safely hold private messages between users/strangers. None of the Lemmy instances I've joined have had a "terms of service" or anything like that at sign-up. I really think the message should be sent far and wide that Lemmy is about posting IN PUBLIC and that messages are being FEDERATED to peers; even people that you don't know could be collecting the data for a search engine.

With small-time server operators opening up hundreds of Lemmy instances, without giving away their experience or human identity, how can you have any confidence that someone is properly securing a server they only update and operate part-time? Major corporations are having their databases stolen: Valve, Sony, Nintendo, health care companies, mobile network companies (AT&T)... you think a low-budget shoestring server run by a hobbyist should be held to the same standards as a corporation that has an entire team and services to defend its data?

Exactly my thoughts. People looking for privacy on these public forums/platforms with no real audit or checks in place is really ironic in my opinion.

The fediverse is the real internet, it's not a company providing a service. On the real internet, once something gets out there, there can never be a guarantee that it's taken back. Even on Reddit, once you post something, Reddit might fully delete it but someone out there may have copied it.

Multiple people reported Reddit undeleted stuff they had deleted from their accounts recently ...

That's why you rewrite your old comments to actively steer people away from the site. ASCII rocket ships, Lemmy links, etc

That's what I was thinking. Does anyone know if Reddit keeps logs or something?

I had years worth of posts and comments that I deleted via the interface a while ago. Then as part of the reddit exodus I decided to run a removal tool that used the API, and it turns out 11 years worth of "deleted posts" were all still sitting out there, they were just hidden from me.

I did find it strange when I received a reply to a years old comment that my profile page said was deleted, but I just thought it was a caching issue. Turns out all of that content was still out there with my name attached, I was the only one who couldn't see it.

What about editing the comments? Do they keep any log of the original message and the subsequent edits or something? Maybe this would be a workaround to effectively delete them.

There's no way to know without proper testing, and I'm gone for good. I did use redact.dev, which overwrote all of my comments before deleting them, so fingers crossed that the account is nuked.

Unlike Instagram or Facebook, on Lemmy or Mastodon you can create an anonymous account. Yes it will be logged (normal public internet), but you won't be traceable. The UI doesn't have any tracking scripts, and many instances don't even require an email to sign up. Use the Tor browser to hide your IP.

There are certainly ways to manage your privacy in how you use this service, and it's different in a lot of ways from other services out there. Users should be educated on the risks against different types of threat models:

  • In what ways can my comments be linked to my real world identity, through correlation to my username, registered email address/phone number/Matrix ID/other identifier, by other users of this service?
  • In what ways can my comments and activity be linked to my real world identity by site administrators or other privileged users of the service (through access to things like server logs, trackers, etc.)?
  • How can I control what activity I consider to be public or private on this service, and who can view that activity I prefer to be considered private?

Even with end to end encryption (which Lemmy does not have for DMs), the most secure protocol is only as secure as the other end you don't control. People can and will screenshot, save, log, or simply remember what you've sent them before.

Lemmy and ActivityPub are new services and protocols to a lot of people. The shortcuts they have internalized on what is or isn't true about privacy of other services (Facebook, Instagram, TikTok, Snapchat, Reddit, plain old email, cell phones, WhatsApp, iMessage/Facetime, etc.) need to be re-learned for these specific services.

New users should understand that the Lemmy/ActivityPub protocols on deletion or privacy of DMs don't necessarily work like other services they're used to. And we should encourage robust discussion around these things until they become common knowledge.

i mean raddle is a site that has an anti doctor post pinned in the mental health community ... like c'mon I and many others need medicine to survive and you are encouraging anti-psychiatrist posting, Church of Scientology levels of anti-medicalist posting

That's fucking ghoulish.

— someone who has to do that shit in order to have a stable life where I don't want to end it all on a daily basis

I didn't know anything about Raddle besides the name until now. But gosh, is that a needlessly toxic pit. There's a poor guy there getting completely beaten up by an admin and some others which seem to be enjoying their time-wasting public bullying. Oh well...

It’s no different than me sending an email to someone and then sending a request to delete it. There likely is still a copy on the email provider’s server and the recipient could have potentially backed up their emails to something outside of the email ecosystem.

Unfortunately the only way to be absolutely sure that there isn’t information you don’t want on the internet is to not share it at all. There will always be an issue of making sure every system actually deletes content when you request it. Like I said, that doesn’t stop anyone from backing up the data to another system. (E.g. Reddit archives from 2005 to now are available to download, even content that has already been deleted)

Honestly, I kinda question how good of a time investment it is to try and allow deletion from the public facing parts of the internet, given the numerous places where your content will be cached or otherwise stored.

There is certainly some value in simply making it as hard as possible to find things you want to delete. Why let perfect be the enemy of good, after all. There's plenty of types of content we certainly want to do our best at deleting even if we can't be perfect. Eg, do you wanna be the one to tell a revenge porn victim, "sorry, we can't make it harder to find the content that harms you because we can't delete all of it anyway"?

But at the same time, development time is limited. Everything is a trade off. We do have to decide what is most important, because we can't do it all immediately. The fact we can't actually delete everything does have to be a factor in this prioritization, too.

There is something to be said about ensuring people know and understand that nothing can truly be 100% deleted once it's posted on the internet. Not that Lemmy is doing good about that, either (especially since deleted comments apparently lie about being deleted).

All this said, I do think federated, reliable deletion is critical for illegal content. Such content needs to be removed quickly and easily from as many places as possible. Without this, instance owners are put at considerable legal risk. This risk poses a threat to the scalability of the Fediverse.

Anyone who has open discussions on the Internet and thinks they're somehow private is a fool. Short of end to end encrypted chat I'm not sure what they expect.

It is all public just as most forums on Reddit. No real difference. No difference with Usenet either. Relax.

Damn, Raddle seems worse than Reddit when it comes to toxic attitudes. I never looked much into it since it's just another centralized platform like Reddit with different management, but boy oh boy are those comments just awful. Great community you folks got over there 😬

Mastodon's privacy issues are just the same as the rest of the fediverse/threadiverse.

With federation there is more openness, transparency and accountability. Take care of your privacy, use alts.

I assume anything I post online will remain there forever anyway. That's why I regularly make a new account, so at least everything isn't behind one username.

Use a pseudonym that you don’t use anywhere else and don’t dox yourself in your posts or comments

“Average user.” Think Reddit, Facebook, having communities. I’m old enough that I was a first gen internet user. Like slow-ass 56k, and bbs in terminal and Apple with floppy floppies and point/click before Gates did his hoodoo.

a good habit is also regularly abandoning/deleting an account and starting from scratch. I went thru 6 reddit accounts over my 13 years there

Same here. I had used reddit since 2010 and must have had close to a dozen accounts. I didn’t like too much info piling up under any one account. And I used a local city subreddit a lot.

same. it also helped to separate interests. each hobby/interest would get a different account, local stuff another account, maybe an "engage in politics" account or three (so I can log off and not get hateful replies at random hours of the day)

If I stick around I figure I'll do the same with lemmy. So far local content, angry debate, and niche hobbies haven't been a 'problem'.

I find all the "privacy isn't possible on the clearnet, lol" comments quite troubling. Yes, the internet doesn't forget and we should always behave on the internet as if our moms could read it.

But that kind of "privacy realism" fosters an attitude that doesn't care about privacy at all, no matter how it could be improved (even if it's never perfect). Just because anyone on the street can follow me home and therefore find my home address doesn't mean I carry a sign with my address when going to a protest.

According to this comment, privacy is worse than with mastodon. And while data always can be scraped, it still isn't too much to ask to properly federate deletions.

Yes, the internet is a public place and reddit is bad and you might not like raddle, but come on, people. Have you all given up on improving things already? And do only tech-savvy people with the knowledge and resources to run their own servers have a right to privacy on the internet?

I think you are conflating some things.

The analogy you used doesn't quite work, because you are not telling everyone at the protest where you live. A more accurate analogy might be you going to a protest, loudly saying something which you later regret, and then ask everyone to just forget about it and delete any footage you might be on. Some might comply, but many won't, and you won't have any idea who didn't.

Furthermore, "people with the knowledge and resources to run their own servers" would be no more safe than you are, because other servers (instances) will still record whatever they post out there. If I make my own federated server and send out a comment, other instances that federate with mine will receive a copy of it. At that point I can ask them to delete it; however, even if they do comply, there is no guarantee that another user hasn't made a local backup of the comment or just screenshotted it.

At the end of the day, tech isn't magic. Everything has limitations, and you can't do everything at once. You can't have a system that allows you to make public comments that go out to several servers where they are shown to thousands or millions of people, and at the same time expect to be able to delete all of it when you feel like it. Tech can't do everything, and at some point we need to take agency and accept responsibility for what we put out there.

Finally, I'll add on what another user said:

Unlike Instagram or Facebook, on Lemmy or Mastodon you can create an anonymous account. Yes it will be logged (normal public internet), but you won’t be traceable. The UI doesn’t have any tracking scripts, and many instances don’t even require an email to sign up. Use the Tor browser to hide your IP.

One thing that Mastodon does is proxy all the media from the federated servers; Lemmy does not do this (yet).

For example on this comment page there are 9 domains trying to connect directly to me according to ublock origin. I suggest blocking all third party requests on your instance using uBlock Origin's advanced mode, because the website works fine without them. It might be mostly avatars?

For example on this comment page there are 9 domains trying to connect directly to me according to ublock origin.

ublock origin isn't a firewall. They aren't connecting inbound to your system, you are loading content from those servers.

The privacy stinks you say? Did you know that Likes and Dislikes are public too? That was the most shocking to me. Because it is very much not like Reddit or others.

It's still a fantastic piece of software, with all its flaws, though.

It's impossible to federate these without making them public in this way.

The up-votes are also mapped to favourites in Mastodon etc, so that was always public anyway.

You could argue that this should not be hidden in the Lemmy UI, but there are also good reasons to not highlight that much who voted on a post.

The up-votes are also mapped to favourites in Mastodon

Explains why this obvious issue is not brought up by Mastodon lol

I thought votes didn't federate yet anyways... but, yes, it is possible, and i can come up off the top of my head with three or four potential implementations.

Good luck with finding an anonymous system that can not be easily abused.

FHE solves that through and through, as has been documented widely, but that's overengineering when you could just use plain ZKP.
Zero-knowledge voting is here and has been for a while now.

Hey 👋 I know you. Hehe.

And yes, it should not be hidden. It is very much unexpected, because Reddit doesn't do it, and it's not visible to normal users.

I would encourage you to stay as far away from Raddle as possible. It has an incredibly toxic site-wide culture, and some serious security problems.

Do they really advocate you use tor to post memes?

I think an option for full data deletion would be nice for those who want it, otherwise people should also expect others recording their data, which can be published later on.

Parts of it may actually be required under EU law. GDPR requires that anyone holding data on EU citizens comply with certain things, including a request to delete certain kinds of data. The EU has shown themselves willing to go after sizeable corporations for violations; most Lemmy instance operators are much smaller. This should probably be addressed before people find themselves on the wrong end of lawsuits.

Thing is, Lemmy is easily compliant with the EU's laws on this, because the laws state that the EU citizen merely needs to request the data be deleted. It says nothing about them having direct access to the lever to do it.

A basic Python script can be used to purge the database after a written request and everything's kosher.
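To give a rough idea of what such a script might look like, here is a minimal sketch. Big assumptions: Lemmy actually runs on PostgreSQL and its real schema is different; the `person`, `post` and `comment` tables, the `creator_id` column and the `purge_user` helper are all made up for illustration, and SQLite is used only so the example runs on its own.

```python
import sqlite3

def purge_user(conn, username):
    """Hard-delete one user's rows (hypothetical schema, not Lemmy's).

    Returns the number of rows removed per table.
    """
    row = conn.execute("SELECT id FROM person WHERE name = ?", (username,)).fetchone()
    if row is None:
        return {"comment": 0, "post": 0, "person": 0}
    user_id = row[0]
    removed = {}
    with conn:  # run all deletes in a single transaction
        for table, column in (("comment", "creator_id"),
                              ("post", "creator_id"),
                              ("person", "id")):
            cur = conn.execute(f"DELETE FROM {table} WHERE {column} = ?", (user_id,))
            removed[table] = cur.rowcount
    return removed

def demo():
    """Build a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE post (id INTEGER PRIMARY KEY, creator_id INTEGER, body TEXT);
        CREATE TABLE comment (id INTEGER PRIMARY KEY, creator_id INTEGER, body TEXT);
        INSERT INTO person VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO post VALUES (1, 1, 'hello'), (2, 2, 'hi');
        INSERT INTO comment VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
    """)
    return conn
```

Note that something like this only clears the local instance. Copies already federated to other servers would need their own requests.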

I don't understand why posts are held in reserve, rather than outright deleted. That's a design decision that doesn't totally make sense to me. I can see holding on to it for a period of time - 24 hours, 7 days, 30 days, what have you - so that users can undelete things, but just hiding it from end users and calling it deleted seems pointless to me.
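The "hold it for a while, then really delete it" idea could be sketched like this. This is not how Lemmy works today as far as this thread says; the `Post` class, the helper names and the 30-day window are all assumptions picked for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

GRACE_PERIOD = timedelta(days=30)  # assumed undelete window

@dataclass
class Post:
    id: int
    body: str
    deleted_at: Optional[datetime] = None  # None means the post is live

def soft_delete(post: Post, now: datetime) -> None:
    """Hide the post but keep it so the user can still change their mind."""
    post.deleted_at = now

def undelete(post: Post, now: datetime) -> bool:
    """Restore a soft-deleted post; fails once the grace period has passed."""
    if post.deleted_at is None or now - post.deleted_at > GRACE_PERIOD:
        return False
    post.deleted_at = None
    return True

def purge_expired(posts: List[Post], now: datetime) -> List[Post]:
    """Return only the posts that survive; expired soft-deletes are dropped for good."""
    return [p for p in posts
            if p.deleted_at is None or now - p.deleted_at <= GRACE_PERIOD]
```

A periodic job would call `purge_expired` and write the result back, so "deleted" eventually means gone from that server rather than just hidden.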

It's not like anyone is trying to sell it to 3rd parties for model training. And while I could see a use case in academic research, the delete button seems like an implied revocation of a license to show or distribute the content, at least in the absence of a proper ToS.

And it just makes more noise for admins and mods.

GDPR likely doesn't apply to public-facing forums in the way you're thinking. If you post actual personal data (which has a strict definition), it's murkier, but in general just posting on a public-facing forum is extremely unlikely to qualify for the right to be forgotten under GDPR.

Notably, GDPR is extremely unclear about this specific circumstance, and it will likely come down to practicality. The user can make requests for their data to be deleted, and those should in general be followed no matter whose server it's on, but they have to be given to each server by the user. Following the deletion requests is generally advisable, but again, it's highly unlikely GDPR applies here. Feel free to get a GDPR lawyer to actually weigh in though.

Part of it will depend on what data you're holding, and part will depend on who's running the instance. A lot of people won't be covered, but I'd wager there's some here and there who need to consider it.

I don't think GDPR necessarily applies here, but I am not a lawyer. Quoting https://gdpr.eu/companies-outside-of-europe/:

Article 3.1 states that the GDPR applies to organizations that are based in the EU even if the data are being stored or used outside of the EU. Article 3.2 goes even further and applies the law to organizations that are not in the EU if two conditions are met: the organization offers goods or services to people in the EU, or the organization monitors their online behavior. (Article 3.3 refers to more unusual scenarios, such as in EU embassies.)

I'm not sure just what the definition of an organization is, so perhaps any server hosted within the EU is covered by the GDPR, but for servers outside of the EU that don't have ads (which seems like all servers currently), I don't think this would count. The example on the linked site about "goods and services" includes stuff like looking for ads tailored at European countries, so I suspect that simply serving traffic from Europe isn't enough.

The website also mentions the GDPR applies to "professional or commercial activity". There's also apparently an exception for under 250 employees. I don't even know how that works when something is entirely managed by volunteers like this currently is.

At any rate, I suspect we're a long way off from having to worry about the GDPR.

The GDPR itself doesn't use the term organisation, it refers to data controllers and data processors.

A “data controller” refers to a person, company, or other body which decides the purposes and methods of processing personal data.

A “data processor” refers to a person, company, or other body which processes personal data on behalf of a data controller.

As someone from within the EU working in data the fediverse is absolutely not a long way off having to consider this, GDPR impacts even the smallest businesses or voluntary groups - it's just how we handle data.

To make it easier to grasp: GDPR is about your rights over your data. Those don't change depending on who is processing it, nor does the processor's obligation. However, what would be considered appropriate safeguards would scale with the size and intent of your organisation; it would be silly for my local shop to have a data protection officer.

I suppose the question would become who is the controller, is it the person who provides the software or the person who provides the servers? Typically it's the servers.

GDPR applies to servers within the EU, or to servers with EU clients. You can demand that they delete and stop transmitting data.

But you accept to transmit data all over the world, in the end that data could end up somewhere outside of the EU without any direct EU customers. Then all bounds are gone.

--
Do worry about GDPR in conforming to deletion requests, but only your own data, not anything you transmitted.

If you think anything on the Internet can ever be forgotten... you're going to have a bad time. Passwords, one of the most protected data types, are compiled from breaches into huge databases so that hackers can use them to try to log into websites. There are literally dozens if not hundreds of those password databases on the public Internet to be downloaded, not to mention private or dark web collections. If passwords are not safe, what makes you think publicly available social media would be any different?

Even if somehow the whole federation agreed to purge all posts every year, things like the Internet Archive and Google's cache of pages would retain the data.

Personally when I want to share what I'm saying with the world I write a letter, burn it, and snort the ashes. This is the only truly private way to do this.

In order for me to be offended, I'd first have to care about that opinion. I don't.

Not sure what the point of "Mastodon's" opinion is? Firstly, Mastodon is pretty big and decentralised, and it has no-one who really speaks on behalf of all its users. Lemmy is not a privacy-centric network like a direct messenger service. It never claimed to be privacy-centric as far as I know. The point is to share posts in communities, and the more that see them, the better.

But it is federated which means posts do get shared to other servers everywhere, and deleting those is not as easy as for a centralised server. Whatever I post on any sharing type service, I consider to be public.

I don't even understand why the OP calls this "Mastodon's" opinion. The link doesn't go to Mastodon. I think the parent post is being a bit of a troll honestly :( The criticisms at the link don't make sense, the person posting the link doesn't seem to think the criticisms are good, and they attribute the criticism to Mastodon while posting "Raddle". It's like they're only doing this to get everybody riled up

i think OP may have mistaken Raddle for a mastodon instance of some kind, idk

Here is the title of the Raddle post that was linked: "Warning: Lemmy doesn't care about your privacy, everything is tracked and stored forever, even if you delete it".

But wouldn't Mastodon instances be able to automatically backup posts, comments, edits, and deletions? Hell, users would be able to do it too yeah?

The whole idea of this being a privacy issue kind of goes against the whole internet archival movement and is really a moot point.

I can see this maybe being a problem with privacy regulations though.

Mastodon is where the link to the raddle article appeared. The post on Mastodon basically said they wouldn't use Lemmy because of what the article stated.

I'm not sure what this has to do with Mastodon; all I see are some salty idiots on raddle moaning.

That's a non issue. You just cannot expect to be able to delete anything you post on the internet. Even the great reddit with the awesome deletion feature cannot help you. You might be able to delete your comment there, but there is https://www.unddit.com/ https://archive.is/ https://web.archive.org/ and many others, where your comment will still be available.

Eh. Often times I want to delete it particularly on reddit or some other place. Just so that it doesn't hang on my profile

Well, reddit doesn't actually allow you to delete things anymore, so tough luck.

Are you thinking of Reddit "undeleting" posts? The reason for this is that when a sub goes private, your posts in it disappear from your profile. So when the sub goes public again, they are back.

After reading some more comments, I think I came up with a good analogy to explain this issue, and I wanted to share.

Think of websites like a bar, that also has an open mic.

Now, when I go to a bar, I don't want to have to give the bouncers and staff my full name as well as my address. I also wouldn't want them to know that I just came, for example, from a store where I was looking for a vacuum, and then have them warn a vacuum seller about it. A vacuum seller who is then going to sit next to me, while I'm trying to have a drink, and show me a pamphlet regarding the amazing vacuum he is selling.

Ideally, I can also look for a bar that will allow me to come in costumed and not show my face. Or I could ask the bar to delete footage of me at some point, and to not store my ID if I do have to show it to a bouncer at the entrance.

All of that is relatively feasible and within the realm of reason; and all of that are things that privacy advocates might advocate for.

However, what is not feasible, within the realm of reason, or what privacy advocates tend to advocate for, is the ability for me to willingly go up on stage, say something on the mic which I immediately regret, and then ask everyone present to forget it ever happened, and delete any footage they might have of it. No reasonable person would ask for something like that, because it is not a reasonable request.

That is how regular websites work. With federated websites, that becomes enhanced; it's like if the bar you're in has a camera pointed at the microphone, and transmits both video and audio directly into several other bars. So when you go up to that mic, you better make sure you're okay with what you are saying being made public and available to anyone.

Allow me to pick your example apart a bit.

However, what is not feasible, or within the realm of reason, or what privacy advocates tend to advocate for, is the ability for me to willingly go up on stage, say something on the mic which I immediately regret, and then ask everyone present to forget it ever happened and delete any footage they might have of it. No reasonable person would ask for something like that, because it is not a reasonable request.

That's not what is demanded. No one demands that the audience (users) forget what I said (the comment), much less: immediately. No one is asking for mind-erasing power or the ability to remove screenshots from other people's client devices.

With federated websites, that becomes enhanced; it's like if the bar you're in has a camera pointed at the microphone, and transmits both video and audio directly into several other bars.

Now, that is where the actual demands come into play: As you pointed out, it is reasonable to demand that the bar delete any recording of what I said on stage. But the way the footage is shared with the other bars can be regulated via a protocol. In your analogy, it's like the other bars copy tapes from the original bar and show them at their place. Now, implementing a procedure of "delete that tape, please" is not impossible. In fact, it already works on Mastodon. If a bar doesn't comply, it simply won't get any tapes from the other bars (it gets defederated).

AFAIK, there is already such a feature planned on github. Which is great. But that is exactly the reason why these things need to be brought up and "privacy realism" is counterproductive.

That’s not what is demanded. No one demands that the audience (users) forget what I said (the comment), much less: immediately. No one is asking for mind-erasing power or the ability to remove screenshots from other people’s client devices.

Well, that's why it is an analogy; the forgetting is equivalent to erasing from someone else's storage. You have no real control over it. Other people can say they deleted it, but you can't know that. And that is what is being demanded - right now I can already "delete" my comments and Beehaw will indicate to other instances that it was deleted, but it can't control whether they do it, and it has no way to know if they really deleted something or just hid it from public view.

Differentiating between a client and a provider becomes extra tricky when you remember everyone can start up their own instance and still be essentially just a client - and, I think this is also worth mentioning, people can create their own backends that also federate using ActivityPub, but which are not open-source, and you'll have no idea what goes on in their servers. In the bar analogy, this would be people watching a stream of the mic at home; or another place, other than a bar with the same set-up, streaming and recording what goes on in that bar.

Also, if no one is demanding that things be deleted from client devices, then logically nothing should stop someone from sharing it with other people/clients. And if you believe otherwise, then as example: what if someone posts a comment, I reply, and then they edit it to put me in a bad light? Is it an invasion of privacy for me to show what it said previously?

This is not a privacy issue; you cannot demand privacy for something you shared willingly and publicly.

Respectfully, I find it more counterproductive, and even harmful, to encourage and spread the idea that people should have any expectation of privacy regarding things they have shared publicly.

With all due respect: I think your analogy made a strawman of what was originally demanded.

Originally, several less-than-ideal "privacy" (or whatever you call it) issues were pointed out.

No one demanded perfect privacy like with E2EE messengers, but rather: sensible protocol implementation of deletions.

No one is demanding that people shouldn't be able to scrape stuff from the internet.

Still: There is a possibility of doing everything in your power to delete stuff that's supposed to be deleted when you're a developer.

And they actually do implement this stuff. That is why it is important to point these things out! The squeaky wheel gets the grease, as they say. Or is this issue counterproductive too, because it gives people the illusion that you can delete things on the internet?

If you think that "privacy" is the wrong term: granted. But sensible deletion protocols are not too much to ask for.

If you think that “privacy” is the wrong term: granted. But sensible deletion protocols are not too much to ask for.

Well, that is in a nutshell what I am arguing. I'm not inherently against the ability to delete things, as it can be quite useful as a quick means to say "I take this back", or "this information I shared is wrong, so I'm removing it" (although in that case I would opt to use an edit). Even "I'm embarrassed about this, so I don't want more people to look at it" is a good enough reason that I would respect, and for which I would delete the thing if it was in my possession. Essentially, I just don't think it should be treated as a privacy issue, because that might give a lot of people the wrong idea.

Ok, so I guess it's a semantics issue then.

Thank you for a more productive conversation than any of the ones I've had on twitter. Take care.

I don't think much of Mastodon as it is, so they're free to rag on Lemmy all they want.

This is a big issue because in the EU you have the right to remove your data. It could make Lemmy illegal in the EU

Yeah. GDPR Article 17 is very clear on that. And the removal must be facilitated by the original data collection point - so e.g. Beehaw is liable for making sure that all other servers delete the personal data if Beehaw willingly distributed the data there. (It gets more interesting because data transfer from an EU to a non-EU server is also basically impossible, and of course the initial server would need a data transfer agreement with all following nodes.)

So just to clarify this point:

Anything remains visible on federated servers!

If I delete a comment on beehaw.org, it doesn't get deleted when accessed from another Lemmy instance that federates with Beehaw?

When you delete it your instance tells others that it was deleted, but it cannot force them to follow through.
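
To make that concrete, here's a rough sketch of what "telling others" amounts to. ActivityPub defines a `Delete` activity, and a deleted object can be represented as a `Tombstone`; everything else below (the `RemoteInstance` class, the `honors_deletes` flag, the example.social URLs) is made up for illustration and is not Lemmy's actual code.

```python
import json

PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_delete(actor: str, object_id: str) -> dict:
    """Build an ActivityStreams Delete activity for a previously shared object."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Delete",
        "actor": actor,
        "object": {"id": object_id, "type": "Tombstone"},
        "to": [PUBLIC],
    }


class RemoteInstance:
    """Hypothetical stand-in for another server holding a cached copy."""

    def __init__(self, honors_deletes: bool):
        self.honors_deletes = honors_deletes
        self.cache = {}

    def receive_post(self, object_id: str, text: str) -> None:
        self.cache[object_id] = text

    def receive_activity(self, activity: dict) -> None:
        # The sender cannot see or enforce what happens in here.
        if activity["type"] == "Delete" and self.honors_deletes:
            self.cache.pop(activity["object"]["id"], None)


if __name__ == "__main__":
    post_id = "https://example.social/comment/1"  # hypothetical URL
    delete = build_delete("https://example.social/u/alice", post_id)
    print(json.dumps(delete, indent=2))
```

The home instance only ever gets to send the message. Whether the copy is actually removed is decided entirely inside `receive_activity` on the other side.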

Which is indeed a problem, as in theory it makes it impossible for any admin to host in the EU or for EU citizens. GDPR Article 17 makes it very clear that complete deletion of all personal data (and yes, a Lemmy comment is personal data) must be facilitated by the original data collection point.

From what I understand instance 1 has to delete data if requested, but instance 2 has no obligation to unless requested. Just like data remains archived in sites like internet archive or other private archives. Just like it works on reddit or any other site currently.

it can't make it impossible. If facebook sold data to amazon, so now amazon has a copy, and then facebook's user asks their data to be deleted, facebook can't just march into amazon's servers and delete the data themselves. The best they can do is send a formal notice to amazon requesting it be deleted, which sounds like what lemmy does. At this point it's up to the federated server if they comply with the law...

Actually that is exactly what the GDPR stipulates. In your example Facebook needs a data processing agreement that ensures that all rights of the data owners are secured and the GDPR is followed. Facebook is liable here, not Amazon - the user must explicitly NOT ask Amazon to delete as the user may not even know where the data went to/should not be bothered to write requests to a huge amount of different data processing locations.

But, @hikaru755@feddit.de added another interesting point: The instance may or may not be seen as a single data processing entity that does not voluntarily hand over data to other instances. That could indeed be a reasonable argument, as e.g. data scrapers are not within the sphere of influence of a service publicly displaying data. But as the whole network is built on interconnected nodes, I wouldn't count on that reasoning flying in front of a court. It may. Or it may not.

In this case though, would it not be that then if Facebook did have a processing agreement with Amazon with which they communicate information, and this agreement stipulates that (in order to comply with GDPR) data they sell to amazon must be deleted upon request, and Amazon does NOT do so, this would make amazon liable for breach of contract instead of facebook being liable for breach of GDPR?

If so, all fediverse instances would need is a copy-paste agreement when two instances federate that data from one must be deleted on the other upon request.

Partially right - Amazon would be liable, but not towards the data owner but Facebook. The data owner sues Facebook, Facebook then sues Amazon.

A copy&paste agreement is the first (and from my point of view the most important) step. Personally I would also integrate an automatic mechanism that deletes data (e.g. the delete request gets automatically federated) and globally defederates instances that do not follow it. Sadly this is still not enough - data handling in the US and other jurisdictions with similarly bad privacy laws is also a problem, see the recent Facebook case and Schrems 2. But tbh I have no idea how to solve that.

Lemmy cannot, by definition, obtain full GDPR compliance. We should make sure that best effort is ensured, especially with the right of deletion and the right to "know" (where data is stored), but also consider lobbying towards a reformed law for the federated use cases.

The originating instance definitely cannot be held responsible for failing to force a separate instance in another country to delete its cached copy of user data imo. I think what is more likely is that EU courts could force European Lemmy instances to only federate with GDPR-compliant instances.

This is incorrect if the data transfer was done voluntarily/planned. This also applies to EU data outside the EU - Meta was fined 1.2 billion euros for that.

And no, the exact definition of the extent of the data transfer is a key point of the GDPR. Every data owner has the right to know exactly where their data is stored. So an "EU only" rule would not be enough - it is basically already mandatory anyway, as transfer to other countries is a major problem after Schrems 2.

Ah yeah if the originating instance sends data to a secondary one then that is somewhat different.

I don't think it's quite that bad/simple. Viewing your main instance as the Controller and other instances as Processors in GDPR terms won't work, because instances don't have the necessary control over each other for that, as you say.

However, you could circumvent that issue by making the case that each instance actually acts as an independent Controller. By participating on a federated service, you are explicitly agreeing to the data you provide (your profile, posts, comments, etc.) being made public and shared with other compatible services. That should be enough as the basis for other instances to reasonably assume you want your data to be processed by them, which (I think, not a lawyer) is sufficient justification for processing the data independently, as long as it's in line with how you generally expect the fediverse to work.

This would mean that each federated instance is its own, independent entity that processes your data, and to make use of your rights under GDPR, you need to do that with each of them individually. They effectively become their own "original data collection point", in your words, even if that data collection was not explicitly triggered by you.

The only thing missing for that to be legal (again, in my layman's view) is transparency about who's processing your data and how, which is necessary under GDPR. Every instance that receives your data via federation would need to let you know about that, and make available to you information on how exactly your data is processed and how you can make use of your rights under GDPR with them. That, in turn, would probably be easiest if the protocol spoken between fediverse servers were extended with automated and standardized ways to propagate GDPR requests from your home instance to any other instance that is processing your data, so that you don't have to actually deal with every single server yourself to get your rights enacted. Defederation in the meantime might be a problem, but there are ways around that, too.

The first point is indeed the only one I see atm that might work. If one can reasonably argue that the node/instance is not voluntarily giving away the data and has no way to prevent that without massively hampering operation of the platform, it might be acceptable in front of a court.

Again: With a lot of mights/coulds/ifs.

Because the simple fact that the nodes themselves are built for connecting to each other and very much do so (and that you can effectively block other nodes from federating your content to an extent) speaks against that reasoning. But it worked for e.g. data scrapers, etc.

However, you could circumvent that issue by making the case that each instance actually acts as an independent Controller. By participating on a federated service, you are explicitly agreeing to the data you provide (your profile, posts, comments, etc.) being made public and shared with other compatible services. That should be enough as the basis for other instances to reasonably assume you want your data to be processed by them, which (I think, not a lawyer) is sufficient justification for processing the data independently, as long as it’s in line with how you generally expect the fediverse to work.

That sadly explicitly does not work. Any consent given must be under definitive circumstances - a 'carte blanche' consent is not possible under the GDPR. You must know exactly where, by whom and what for your data is processed or transferred. And the initial data processor still has the obligation to have a data processing agreement.

It could defederate any non-compliant instances.

It could, but actually policing it would be difficult. I don’t think there is any “yeah I’ll do that” response, and even if there is, an instance could say it will delete it and still do nothing.

You could defederate from instances running versions that don't delete federated posts. Removing compatibility with older protocol implementations is not unheard of.

while this is certainly feasible, it is just a compliance checkmark of "doing your best". It wouldn't actually prevent someone attempting to persist that data. For example, I just need to maintain an insert-only copy of my deletion-compliant lemmy instance DB, and none of the deletions would be reflected on that.

I could then host that copy publicly on some unrelated lemmy instance, and without systematically de-federating from all other instances, you wouldn't know which one was retaining the data.
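
A toy version of that, just to show how little it takes. Both classes are hypothetical stand-ins (not Lemmy's database layer): one store applies deletions, the other receives the same events and quietly drops the deletes.

```python
class CompliantStore:
    """Applies both inserts and deletes, like a well-behaved instance."""

    def __init__(self):
        self.posts = {}

    def insert(self, post_id, body):
        self.posts[post_id] = body

    def delete(self, post_id):
        self.posts.pop(post_id, None)


class InsertOnlyMirror:
    """Receives the same event stream but never removes anything."""

    def __init__(self):
        self.posts = {}

    def insert(self, post_id, body):
        self.posts[post_id] = body

    def delete(self, post_id):
        pass  # deliberately ignored


def replay(events, *stores):
    """Feed every (operation, post_id, body) event to every store."""
    for op, post_id, body in events:
        for store in stores:
            if op == "insert":
                store.insert(post_id, body)
            elif op == "delete":
                store.delete(post_id)


if __name__ == "__main__":
    public, mirror = CompliantStore(), InsertOnlyMirror()
    replay([("insert", 1, "oops"), ("delete", 1, None)], public, mirror)
    print(public.posts, mirror.posts)
```

From the outside the compliant store looks perfectly clean, which is the point: nothing in the federation traffic reveals that the mirror exists.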

If I wanted privacy, I wouldn't be browsing online.

That's a poor answer to be honest. Total privacy is an illusion, but having the tools to delete some of the traces if wanted should be there. I would argue that the EU law about the right to be forgotten might want a word with someone.

I escaped Reddit, but i hold anyone else to a standard too.

Lemmy, do better or it won't end well. https://gdpr.eu/right-to-be-forgotten/

This is a link to Raddle.me, what does this have to do with Mastodon?

Other people have already commented on how federated social media often requires certain data just for implementations to work and make sense, and there's not much more to add to that.

If you want private, end-to-end-encrypted, decentralized communication, the best modern solution to that is #matrix.

The same "rumors" exist about Matrix. According to some, "a lot of metadata is unencrypted". While somewhat true, there's literally no way to be able to deliver a message from person A to person B without knowing who the message is from and who it's going to, especially on a decentralized platform. Most of the (not E2EE) metadata sent with an event in Matrix needs to be read by the homeserver, and thus can't be E2EE.

In both services you are basically shouting into a giant megaphone. What’s so private about it? If you don’t want say it in public, don’t say it there.

If you need privacy there are much better tools available such as pgp encrypted email or encrypted Matrix DMs (a nonfederated Matrix server would be even more secure but rather overkill).

Edit: specified encrypting Matrix DMs. I forgot for a moment that you can send unencrypted DMs over Matrix.

Eww. Well, there is a reason why I try and be extremely careful about what I post nowadays. Don't want to regret dumb shit I said in the future.

As a life long anarchist, I personally find raddle to be a fucking embarrassment. The elitist bullshit is right up there with other political anarchist sites like anarchist news; they're all a fucking shit show and shows why anarchists will never accomplish anything.

Isn't the fediverse an anarchist project?

It seems to be the most flat peer structure of any social media.

Pretty much yeah, either the fediverse or Usenet. Somebody pointed that out to them in the comments of the linked post but they dismissed the point as nonsense.

Very performative anarchists over there lol

I'd like to see a more completely decentralized implementation, but federation does seem like it's practical in that it's easier to implement and use while still having a lot of the benefits of decentralization.

Ideally I picture something like a lemmy application that runs its own internal, personal instance, but I'm not sure how the protocol would deal with that many isolated instances.

Keeping an eye on things like holochain and locutus to see if one of them will end up being a viable protocol to build a fully decentralized forum app on.

In the meantime I mostly like lemmy because it's written in rust. Postmill looks cool, feature-wise, but I can't see myself contributing to it when it's written in PHP. I already have to use too much PHP in my day job. When I come home I just want to use an enjoyable language.

The stuff listed in OP doesn't really seem like much concern. "What you put on the internet is there forever!" is completely true, and things like this should only make it more concrete that you can't rely on your service provider to delete information somebody else already archived.
With that being said, default privacy settings - at least on Kbin - seem pretty bad.

i use kbin because I don't like lemmy's devs 🙃
bonus points that it actually deletes things

Do you think kbin is just reaching into other servers and pulling the bytes off the disk? You can't guarantee anything is deleted in a federated system, other servers can just ignore your delete request. So this makes no difference.

And it breaks easily. I still can see several posts on my private instance that have been deleted. The delete command never made it to my server for any number of reasons. As some posts never make it to my instance either. I guess in the long term some kind of delivery queue and guarantee would be nice.
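
Something like the following is what I mean by a delivery queue, as a minimal sketch. `FlakyInbox` is a hypothetical stand-in for a remote server that is sometimes unreachable, and `max_attempts` is an arbitrary choice; a real implementation would also wait between retries (e.g. exponential backoff) and persist the queue across restarts.

```python
from collections import deque


class FlakyInbox:
    """Hypothetical remote inbox that rejects the first few deliveries."""

    def __init__(self, failures_before_success: int):
        self.remaining_failures = failures_before_success
        self.received = []

    def deliver(self, activity) -> bool:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            return False  # e.g. timeout or 5xx from the remote server
        self.received.append(activity)
        return True  # acknowledged


def drain(queue: deque, inbox: FlakyInbox, max_attempts: int = 5) -> list:
    """Retry each queued activity until acknowledged or out of attempts.

    Queue items are (activity, attempts_so_far) pairs. Returns the
    activities that were never acknowledged.
    """
    dead_letters = []
    while queue:
        activity, attempts = queue.popleft()
        if inbox.deliver(activity):
            continue
        if attempts + 1 < max_attempts:
            queue.append((activity, attempts + 1))
        else:
            dead_letters.append(activity)
    return dead_letters
```

Note that this only guarantees the delete message eventually arrives, or that the sender at least knows it didn't. It still can't guarantee the receiver acts on it.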

I think this is a feature, well the media aspect anyway. Immutable media. The rest can be developed on.

It isn't truly immutable though, and it could be dangerous to propagate the idea that it is 100% immutable.

Kinda unsurprising as rumors have it that lemmy's developed by pro-China Tankies.

The developers will freely admit that they are Marxist-Leninists who support China. I don't get why people frame it as a rumor.

That said, that has nothing to do with this. These are just implementation details, and they are on the docket to be worked on once the mission-critical stuff is out of the way.

That said, anarkiddies rallying against federation and preferring to use a centralized service like raddle is very funny.

I was thinking that. I can understand disliking lemmy for its developer, but then making it a call against federated media seems strange, as someone who also considers themselves an anarchist.

Am I missing something, or isn't it the case that no matter what Lemmy does, all those same problems would still exist, just from the internet archival sites instead? Sure, the privacy could be better to deter some of it, but none of those issues are fully solvable so long as those archival sites run. I guess the media not being deleted is likely the biggest thing you could affect, since archives would be less likely to store it in the first place.

I’m at a loss. You’re saying that things that you said publicly are private? Or you’re saying that they become private because you delete your account? Assume you dox someone. I need to find out if that happened. As an admin I’d be able to see that

  1. you
  2. publicly posted
  3. their data

I would need to be able to provide this to authorities if they provided needed legal documentation. Why do you think that privacy dictates you should be able to commit a crime, and get away with it by deleting your account?

Wouldn't Mastodon have the same legal requirements?

That’s a hard question to answer. My position is based on where I live and what legal counsel I have worked with has said in situations I’ve dealt with. My recommendation is, check with an attorney.

I don't think there is a legal requirement that you store that data, just that you make the data you store available, or in some situations, you add logging for valid law enforcement requests.

To my knowledge, Apple, for example, does not have access to iCloud data that is end-to-end encrypted. They wouldn't necessarily be able to provide the contents of my notes application to law enforcement - and that is currently legal.

I’m basing what I have said off of work I have done with attorneys in similar situations. I don’t know evidentiary law, but I wouldn’t want to be accused of destroying evidence of something. But my question stands. Why should someone who has doxed someone get away with it by deleting their account? How is that ethical?

Why should someone who has doxed someone get away with it by deleting their account?

Doxxing is not illegal in many places - the US included. Cyberstalking and harassment may be illegal, depending on location. That's beside the point, but this is an extremely specific example.

Ultimately users should, in my opinion, be in control of their data. Tildes, for example, preserves deleted comments for (I think) 30 days and then permanently removes them. It seems like that approach is a compromise that would work for your situation while still respecting privacy long term.
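
Roughly what I have in mind, as a sketch: deleted comments sit hidden for a retention window and are then purged for good. The 30 days and the `deleted_at` field are assumptions for illustration; I'm not claiming this is how Tildes actually implements it.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # assumed window, not a confirmed Tildes setting


def purge_expired(comments: list, now: datetime) -> list:
    """Keep live comments and those deleted less than RETENTION ago.

    Each comment is a dict with a "deleted_at" key that is either None
    (not deleted) or a timezone-aware datetime.
    """
    return [
        c for c in comments
        if c["deleted_at"] is None or now - c["deleted_at"] < RETENTION
    ]


if __name__ == "__main__":
    now = datetime(2023, 7, 1, tzinfo=timezone.utc)
    comments = [
        {"id": 1, "deleted_at": None},
        {"id": 2, "deleted_at": now - timedelta(days=3)},
        {"id": 3, "deleted_at": now - timedelta(days=45)},
    ]
    print([c["id"] for c in purge_expired(comments, now)])
```

Admins still get a window to respond to abuse reports or legal requests, and after that the data is really gone from that server.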

So the key thing here is, "are you aware that the data is part of a legal proceeding or crime?"

If no, deleting it as part of normal operations is perfectly legal. There are plenty of VPNs which do not log user information, and will produce for the authorities all of the logs they retain (i.e. an empty log file).

From an ethical standpoint, keeping peoples' data which they want removed, against their wishes, based on the hypothetical that at some point someone might do something wrong, is by far the less ethical route.

"You might do something bad, so I'm going to keep all your data whether you like it or not!" <- the bad thing

It's cute how you think I'm going to take legal advice from you. You do you, have a nice evening.

Apple (and Google, Microsoft, etc) are checking signatures of all files on their services to detect illegal stuff. They do it for copyrighted content and they do it for CSAM.

Checking against a known-malicious hash is very different from claiming to have access to the plain data. In fact, even for the known-malicious hashes, the companies doing the checks usually don't have access to the source data (i.e. they don't even necessarily know what it contains).
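
A simplified sketch of the idea: the checker only holds a list of digests and compares uploads against it. I'm using plain SHA-256 here as an assumption to keep it short; real scanning systems generally use perceptual hashes that survive re-encoding, which this does not. The blocklist entry below is a placeholder.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# In practice the list would be distributed as digests only; the placeholder
# bytes appear here solely so the example can run on its own.
KNOWN_BAD_DIGESTS = {sha256_hex(b"placeholder for a known-bad file")}


def is_flagged(upload: bytes) -> bool:
    """True if the upload's digest appears in the blocklist."""
    return sha256_hex(upload) in KNOWN_BAD_DIGESTS
```

A match says "this is byte-for-byte a file on the list" and nothing more. It doesn't give the checker the ability to read or classify arbitrary content.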

Deleted comments remain on the server but hidden to non-admins, the username remains visible

This is a negative behavior by Lemmy, in my opinion. Deleted comments should be purged after some time. Tildes does the same thing - I think with 30 days?

Deleted account usernames remain visible too

These should be replaced with some random string of characters or something like DeleteUser<numberhere> or something.

Anything remains visible on federated servers!

This is just a concession of federation.

When you delete your account, media does not get deleted on any server

This is an issue, too, in my opinion.

Honestly, this is definitely something that can be added - and in fact it might even be beneficial to server costs. Alongside optional deletion of cached data from other instances maybe a year or two after the data arrived.

People need to remember that Lemmy is an alpha software - we haven't even reached the big 1.0 release

can't anyone who runs a lemmy instance script all that in the db? alternately, can't anyone who claims to do so just not do it in the db? it's not like you would ever know.

A sketchy instance operator isn’t really a solid defense against implementation of better privacy features in the source code.

I switch accounts after some time and use other ones. It's quite okay this way.