iPhone owners say the latest iOS update is resurfacing deleted nudes

misk@sopuli.xyz to Technology@lemmy.world – 510 points –
theverge.com

cross-posted from: https://sopuli.xyz/post/12670977


Nothing sinister, we just don't delete what we say we delete. Instead we keep it in your profile to feed the algorithms and set the "deleted" flag to make you think it's gone.

I mean, to be completely fair, that's how data storage works.

We cannot really just make data disappear, so we let it get overwritten instead

But clearly the data is not being overwritten, and this was intentional. How do I know? Because it would add up to a massive amount of data. If it was due to a bug in Apple software or the underlying filesystems, it would be detected by monitoring systems: "Hey, we're using 10x the data we should be, maybe we should look into it".

The mistake was in the flag code that was supposed to fool us.

no when I say "overwritten" I mean that the area is set as deleted in the filesystem and the next time something writes to that area the data that was there before is disregarded.
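A rough sketch of what "set as deleted and overwritten later" means. `ToyDisk` and everything in it is a made-up toy model for illustration, not how any real filesystem (Apple's or otherwise) is implemented:

```python
class ToyDisk:
    """A toy block store: deleting a file only frees its block."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks      # raw block contents
        self.free = list(range(num_blocks))   # block numbers available for writing
        self.files = {}                       # file name -> block number

    def write(self, name, data):
        block = self.free.pop(0)              # take the first free block
        self.blocks[block] = data             # whatever was there before is replaced
        self.files[name] = block
        return block

    def delete(self, name):
        block = self.files.pop(name)          # drop the directory entry
        self.free.insert(0, block)            # mark the block reusable; contents untouched
        return block


disk = ToyDisk(num_blocks=4)
b = disk.write("photo.jpg", b"secret pixels")
disk.delete("photo.jpg")
after_delete = disk.blocks[b]    # the "deleted" bytes are still sitting in the block
disk.write("notes.txt", b"groceries")
after_reuse = disk.blocks[b]     # only now is the old data actually gone
```

The file disappears from the directory as soon as `delete` runs, but the bytes only go away once something else happens to land on the same block.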

and the next time something writes to that area the data that was there before is disregarded.

A single overwrite might not be enough to defeat physical forensics because shadows of the old data persist in how the new data is stored. Also when it comes to SSDs you might be waiting a long time for the data to get overwritten as the drive will wear-level its erm sectors (what are those things called with SSDs?).

So are you saying that they suffered from a filesystem bug that caused deletion failure? I'd imagine they use standard filesystems on their backend, I haven't heard about any bugs like this.

If you ask me, what's more likely, that a company known for shitty behavior lies about deleting files so they can continue to use that information to profit, -- OR -- that they are experiencing a filesystem bug on their backend, I'll choose the former.

No, I don't believe a damn word of what Apple's gonna say on this. I just wanted to get the message out there that file deletion generally works by allowing data to be overwritten. So if the images are local, this could very well be either the phone showing data that hasn't been overwritten yet, or it accidentally bringing things back out of "recently deleted", depending on how long ago they were deleted.

Undeleting nudes

That’s iPhone

Seriously: I don’t think the cost benefit is there to intentionally make a maneuver like this. Any crap they pull needs to have a perfectly proper explanation, with our agreement to a specific term buried somewhere in their policies. Can only imagine how much money they blew throwing these billboards up all over the San Francisco Bay area. We have to buy Apple over Google for ostensible privacy gains, and Apple has to lock us in to their walled gardens to make up for their comparatively smaller ad/data business.

This post assumes Apple is aethical (that’s like amoral but for ethics right?) but still a self-interested economic actor. They can’t let short-term greed get in the way of long-term greed!

Seriously: I don’t think the cost benefit is there to intentionally make a maneuver like this.

You might be right

They can’t let short-term greed get in the way of long-term greed!

lol


the shred command in Linux tries to do this, but it may not work if the hardware moves rewritten data blocks around to mitigate wear.

shred doesn't even necessarily work at the OS level. If you use something like ext3 and I assume ext4, normally when you overwrite data in a file, you're not overwriting data even at the logical level in the block device. Journalling entails that you commit data to somewhere else on the disk, then update the metadata atomically to reference the new data.
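A toy illustration of the out-of-place write behaviour described above. `OutOfPlaceStore` is hypothetical and does not model ext3/ext4's actual on-disk format; whether a given filesystem really writes data this way depends on its design and mount options:

```python
class OutOfPlaceStore:
    """Toy store where every write lands in a fresh block."""

    def __init__(self):
        self.blocks = []     # append-only list of physical blocks
        self.pointer = {}    # file name -> index of the block currently in use

    def write(self, name, data):
        self.blocks.append(data)                    # new data goes somewhere else
        self.pointer[name] = len(self.blocks) - 1   # then the metadata is switched over

    def read(self, name):
        return self.blocks[self.pointer[name]]


store = OutOfPlaceStore()
store.write("diary.txt", b"my secret")
store.write("diary.txt", b"\x00" * 9)   # a shred-style "overwrite" of the same file
current = store.read("diary.txt")       # the file now reads as zeroes
leftover = store.blocks[0]              # but the old block was never touched
```

In a store like this, overwriting the file through the normal file API never reaches the block that held the original bytes.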

It was more practical in an era of older filesystems.

Proper deletion should include writing all ones or all zeroes to the block but y'all be lazy as fuck.

Only necessary on the ol spinning rust, with SSDs not only is it completely unnecessary, but it also burns extra writes.

Spinnies store data magnetically on the platter as 1s and 0s; SSDs store data on the NAND as a held charge. In a simple single-level cell, a charged (programmed) cell conventionally reads as 0 and an uncharged (erased) cell as 1.

With spinnies, a file gets marked as "deleted", but the residual magnetic 1s and 0s remain on the platter until eventually overwritten.

With SSDs a file gets marked "deleted" and within no more than a few minutes TRIM comes along and ensures the charge on the NAND is released for that data. There are no residuals to worry about like with spinnies, and TRIM is in fact necessary to ensure decent lifespans.

Wow, the SSD can hold the charges perfectly while unplugged for ages? Amazing.

In a post apocalyptic world where I am in charge of building a storage drive and I’m given all the instructions and fabs, the world is going without storage.

Wow, the SSD can hold the charges perfectly while unplugged for ages? Amazing.

Yup. Before flash memory, devices like video game cartridges which had game saves actually needed a battery to power the memory holding the saves.

But wouldn't TRIM be the deleting he is requesting? Removing the charges would be setting all the bits in that block to the same value.

That just makes no sense to do, modern storage is write limited. As long as you used encryption the old bits mean nothing to anyone but you.
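A minimal sketch of why encrypted leftovers are useless without the key. The cipher here is a toy keystream built from SHA-256 purely so the example runs with the standard library; it is an assumption for illustration and NOT something to use for real encryption:

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream (toy cipher, not secure for real use)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        # One 32-byte pad per 32-byte chunk, derived from the key and the offset.
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


key = os.urandom(32)
plaintext = b"private photo bytes"
stored = keystream_xor(key, plaintext)       # what actually lands on the drive

recovered = keystream_xor(key, stored)       # with the key: the original data
wrong_key = os.urandom(32)
garbage = keystream_xor(wrong_key, stored)   # without the key: noise
```

If the key is thrown away, whatever stale blocks remain on the drive only ever contain something like `stored`, never `plaintext`.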

SSDs are. Big storage is not using SSDs.

I’m not an expert, but wouldn’t proper deletion be writing random ones and zeroes to the block? Multiple times?
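For what it's worth, a multi-pass random overwrite looks roughly like this. `overwrite_and_delete` is a hypothetical helper, and as noted elsewhere in the thread it only helps if the filesystem and the drive really write in place:

```python
import os
import tempfile


def overwrite_and_delete(path: str, passes: int = 3) -> None:
    """Overwrite a file in place with random bytes several times, then unlink it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))   # random 0s and 1s over the whole file (held in memory)
            f.flush()
            os.fsync(f.fileno())        # ask the OS to push this pass out to the device
    os.remove(path)


# Small demo on a throwaway temp file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"secret")
os.close(fd)
overwrite_and_delete(demo_path)
```

Every pass rewrites the whole file, so the cost grows with both the file size and the number of passes.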

yeah cuz for normal, day-to-day use that's exponentially slower the more you're deleting

You can do that when you wipe something.

Nitpick: it should be fuzzed with random 0s and 1s.

That's skipping over the fact that recovering deleted data, even if it isn't overwritten, is not an "oops". It takes extra effort, and if that data isn't being protected it would be overwritten incidentally as drives are used.

There is a big difference in a database between "flagging" data and actually removing the association of the data to the database.


That's how a lot of people handle deleted data in a database, it's literally just a flag. That's why there's a recommendation to edit Reddit posts before deleting them, to ensure they're actually overwritten so they can't just be restored.
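A quick sketch of the "deleted flag" pattern using SQLite. The `comments` table and its columns are made up for illustration; nothing here is Reddit's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT, deleted INTEGER DEFAULT 0)"
)
conn.execute("INSERT INTO comments (id, body) VALUES (1, 'original text')")

# Soft delete: the row is hidden from normal queries, but the text is still stored.
conn.execute("UPDATE comments SET deleted = 1 WHERE id = 1")
visible = conn.execute("SELECT body FROM comments WHERE deleted = 0").fetchall()
stored = conn.execute("SELECT body FROM comments WHERE id = 1").fetchone()[0]

# Edit first, then delete: un-flagging the row later only brings back the edited text.
conn.execute("INSERT INTO comments (id, body) VALUES (2, 'another comment')")
conn.execute("UPDATE comments SET body = 'asdf' WHERE id = 2")
conn.execute("UPDATE comments SET deleted = 1 WHERE id = 2")
restored = conn.execute("SELECT body FROM comments WHERE id = 2").fetchone()[0]
```

This only covers the live table; it says nothing about backups or other copies of the data.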

Every time someone says something like this I have to explain CDC and regular old backups. There’s no way in hell Reddit doesn’t keep cold and hot backups of their shit. And while Reddit is unlikely to be doing CDC for SOC 2 or other compliance reasons, it’s the easiest method to capture data for analytics purposes.

CDC stands for change data capture. It’s generally done with databases by streaming the change log or ref log to a bucket or a service like Kafka where you can fast forward and rewind the log queue to see the state of the DB at any point in time. Even if you edit your comments, the originals are likely sitting in a Kafka topic or a Snowflake bucket outside of the DB or cache used for the presentation layer.
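A bare-bones sketch of the idea, with a plain Python list standing in for the Kafka topic. `record` and `state_at` are hypothetical names and this is not any real CDC tool's API:

```python
change_log = []   # append-only stream of (sequence number, comment id, operation, body)


def record(comment_id, op, body=None):
    """Append one change event to the log."""
    change_log.append((len(change_log), comment_id, op, body))


def state_at(seq):
    """Replay the log up to and including sequence number `seq`."""
    state = {}
    for n, comment_id, op, body in change_log:
        if n > seq:
            break
        if op == "delete":
            state.pop(comment_id, None)
        else:                          # "insert" or "update"
            state[comment_id] = body
    return state


record(1, "insert", "my original comment")   # seq 0
record(1, "update", "asdf")                  # seq 1: the edit-before-delete trick
record(1, "delete")                          # seq 2

now = state_at(2)       # the comment is gone from the current state
rewound = state_at(0)   # but rewinding the log brings the original back
```

The edit and the delete are just more events appended to the log, so neither removes the first one.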

Zero large-scale websites operate with a truly single data store. There is always another layer that your user operations don’t impact.

Yes, that's certainly possible, but it's also out of my control. I have basically three options:

  1. Delete account - we know this doesn't delete comments
  2. Delete comment - "seems" to delete comments, but we've seen comments get restored - so probably using a "deleted" flag
  3. Edit comment with nonsense and then delete - should poison comment if they're just using the deleted flag

That's it. There's no guarantee it works, but it has a much higher chance of working than the other two.

And there's a good chance they delete old backups. Hosting every edit is expensive, so there's a decent chance they clean up old data after some months.

In 2019 the total size of the text stored by Reddit was only 50TB. A petabyte of data in cold storage is only 12k a year, so even if they have grown 500x in size since 2019 (very unlikely) it’s a drop in their ARR. Given they sell the data for advertising and for AI, they are not deleting it. Reddit also self-hosts a lot of their infra (they used to present their architecture at KubeCon), so the storage costs would be even lower.
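Spelling out that arithmetic, using only the figures quoted above. The decimal conversion (1 PB = 1000 TB) is an assumption, and the currency is whatever the 12k figure was quoted in:

```python
text_2019_tb = 50           # text stored in 2019, in TB (figure from the comment)
growth_factor = 500         # the "very unlikely" worst case from the comment
cost_per_pb_year = 12_000   # cold storage cost per PB per year (figure from the comment)
tb_per_pb = 1_000           # assumption: decimal units

total_pb = text_2019_tb * growth_factor / tb_per_pb   # 25 PB
yearly_cost = total_pb * cost_per_pb_year             # 300,000 per year
```

So even the 500x worst case comes out to about 25 PB and roughly 300k a year at the quoted price.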

They don't care about your security or privacy, they care about being the exclusive vendor of your personal information.
