We just lost 3TB of data on a SanDisk Extreme SSD

floofloof@lemmy.ca to Technology@beehaw.org – 161 points –
theverge.com

NOTHING I have that is irreplaceable is on fewer than 2 drives, nor are those drives ever connected at the same time. You're just asking to lose files if you only save them on one drive.

If you have your data in one location, you have your data in zero locations.

The 3-2-1 rule of data retention is important

3 copies of your data

2 local

1 off-site

The 2 stands for on 2 different mediums. So HDD and tape for instance. Or HDD and SSD. Or SSD and DVDs. Whatever combo you choose that fits your needs. This (minimizes) the chance of loss of both.
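
For anyone who wants to sanity-check their own setup against the rule as described above (3 copies, 2 different media, 1 off-site), here is a minimal sketch. The `BackupCopy` class, its fields, and the `check_321` function are made up for illustration and not taken from any backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of a data set. All fields are illustrative assumptions."""
    label: str
    medium: str    # e.g. "hdd", "ssd", "tape", "cloud"
    offsite: bool  # True if the copy lives away from the primary location


def check_321(copies):
    """Return a list of 3-2-1 violations; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    media = {c.medium for c in copies}
    if len(media) < 2:
        problems.append(f"only {len(media)} kind(s) of media, need at least 2")
    if not any(c.offsite for c in copies):
        problems.append("no off-site copy")
    return problems


if __name__ == "__main__":
    setup = [
        BackupCopy("NAS", "hdd", offsite=False),
        BackupCopy("portable drive", "ssd", offsite=False),
        BackupCopy("bucket", "cloud", offsite=True),
    ]
    print(check_321(setup) or "3-2-1 satisfied")
```

A single USB drive fails all three checks at once, which is the point the parent comments are making.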

I’d love to use tape, but so far I couldn’t bring myself to make the jump because of the upfront cost of the drive. Other than that, it would be great to have tapes of my digitized Blu-ray collection, so that if my NAS should fail unrecoverably, I could simply set up a new one and copy the data back instead of having to digitize everything again.

I know a lot of people who put their single copy of files on USB drives "for safety"

But in the case of the article, it looks like the video was shot and saved directly from the camera (professional cameras like the Blackmagic save directly to USB SSDs), so there wasn't time to back it up.

Looking at Blackmagic's pro-level cameras, they support external USB storage and dual SD Cards and dual CFast cards.

So there's certainly no requirement to use external USB storage.

But, they also say:

When shooting is complete you can simply move the external disk to your computer and start editing from the same disk, eliminating file copying!

Rather unfortunate advice.

Anything I have that is super important is just uploaded to a server with backups turned on. Becomes 100%, not my problem anymore.

Until the backups don’t work.

Untested backups can hold all sorts of surprises.

Sadly, testing backups is a lot of work and is rarely done.

Deja Dup has a nice feature in that every once in a while it spawns and verifies that the backup is retrievable.

Retrievable is a start, at least.

Sorry, I should have said verified. It reads through the backup and checks data integrity/checksums etc., so you know it can be retrieved properly.
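
To make the "checks data integrity/checksums" idea concrete, here is a rough sketch of a checksum comparison between a source folder and a plain file-for-file copy of it. This is not how Deja Dup does it internally (its backups are not plain directory copies); the function names and the directory layout are assumptions for illustration only.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large video files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from or differ in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    bad = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        # A file counts as bad if it is absent or its content hash differs.
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            bad.append(str(rel))
    return bad
```

Note that this only proves the backup matches the source right now. If the source is already corrupted, both sides will happily agree, so keeping a list of known-good hashes from the time of writing is the stronger variant.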

Testing a backup isn't that hard. I typically test backups through DigitalOcean. They worked great.

Not your problem... until the hosting provider publishes a press release about some recent fire or flooding in the data center that "only impacted less than 1% of our customers"... and you turn out to be among them.

For "super important" stuff, I keep closer to 10 copies spread around in different places. Normal stuff is 321, and everything else is temporary.

eh, I've never hit that issue but I also have a copy of everything locally.

I have. Both server and backup lost, and all I got was a complimentary 1 free month. Not a fun time uploading everything again from the single local copy.

Now something similar is going on with Google for Business, where they've switched from "unlimited storage" to "actually, $300/10TB/month". Like that's going to happen (there are $100/100TB/month bare metal out there), but now I have to decide what to delete, what to keep, and what to downgrade from 321, to "temporary" single copy.

The article alludes to this problem, but Amazon has basically forfeited the consumer goodwill they used to have. It used to be that their reviews were trustworthy (and relatively hard to game), and ordering products "sold by Amazon" was a guarantee that there wouldn't be counterfeits intermingled in. Plus they had a great return policy, even without physical presence in most places.

Now they don't police fake reviews, and do a bad job of the "SEO" of which reviews are actually the most helpful, they're susceptible to commingling of counterfeit goods (especially electronics and storage media), and their return policy has gotten worse.

It basically makes it so that they're no longer a good retailer for electronics, and it's worth going into a physical store to avoid doing business with them.

Or there's the proper online tech stores as an alternative. With a smaller product base reviews and checks would work a lot better.

Enshittification. Applies to Amazon too.

First they attracted consumers. Then they attracted sellers. Now they're exploiting both.

There is a reason why they got brick and mortar shops to close, while sellers with too good of a return policy are going under, and the search feature returns random numbers of items in a random order that have little to do with what you asked it for (the most egregious is "sort by price", which suddenly makes the product count go down... but you go to camelcamelcamel, and for the same search it stays the same with actual sorting by price).

I get a lot of folks are correctly pointing out the need to back up data but isn’t that a little bit of victim blaming? This isn’t a situation where the guy had a 10 year old drive with all his photos and videos sitting around unbacked up. He had a new drive and it failed. Can we agree that brand new drives aren’t supposed to fail?

Can we agree that brand new drives aren’t supposed to fail?

No.

The typical failure rates for pretty much all electronics, and even mechanical stuff, form a "bathtub curve": relatively many early failures, very few failures for a long time, and finally an increasing number of failures tending to 100%.

That's why you're supposed to have a "burn in" period for everything before you can trust it within some probability (still make backups), and beware of it reaching end of life (make sure the backups actually work).
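
If the bathtub shape is hard to picture, one common way to model it is as the sum of three failure rates: a decreasing "infant mortality" term, a constant random-failure term, and an increasing wear-out term. The sketch below uses Weibull hazard functions for the first and last. Every number in it (shapes, scales, weights, the time unit of years) is invented purely to produce the shape and is not measured from any real drive.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (k / s) * (t / s) ** (k - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years, random_rate=0.01):
    """Illustrative failure rate per year at age t_years (must be > 0)."""
    # shape < 1 gives a falling rate: early failures that burn-in weeds out.
    early = 0.05 * weibull_hazard(t_years, shape=0.5, scale=1.0)
    # shape > 1 gives a rising rate: wear-out near end of life.
    wear_out = weibull_hazard(t_years, shape=5.0, scale=8.0)
    return early + random_rate + wear_out


if __name__ == "__main__":
    for age in (0.1, 0.5, 1, 3, 5, 8, 10):
        print(f"age {age:>4} y: relative failure rate {bathtub_hazard(age):.4f}")
```

With these made-up parameters the rate is higher at a few weeks old than at three years old, and much higher again at ten years, which is the "burn in, then trust, then retire" pattern described above.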

That's absolutely true in the physical sense, but in the "commercial"/practical sense, most respectable companies' QA process would shave off a large part of that first bathtub slope through testing and good quality practices. Not everything off of the assembly line is meant to make it into a boxed up product.

Apparently even respectable companies are finding out that it's cheaper to skimp on QA and just ship a replacement item when a customer complains. Particularly when it's small items that aren't too expensive to ship, but some are doing it even with full blown HDDs.

In this case, I think we can remove what's left of the benefit of the doubt from Western Digital (who owns SanDisk). They are as scammy/shady as I know a company to be.

Personally I've been boycotting them since 2016 after I couldn't recover the data from an external drive, which WD encrypted without warning nor consent. A faulty component on the PCB (unrelated to the drive itself), combined with WD's non standard practices (non SATA pins + mandated proprietary encryption) meant that I had to lose this drive and the data it contained so they could make a quick buck. I can't trust a company with such ethics to store anything for me.

In 2020, they got themselves into another scandal. WD Reds, which were advertised as pro/NAS storage and sold at a premium, were found to behave like shingled drives (shingling is a technique that trades away some reliability and availability in exchange for extra storage density), exposing many users to heightened risk of critical failure (esp. during disk swaps). WD of course denied it, and then denied it again when confronted with evidence, up until the internet burst into flames. Again, consumer-hostile practices.

Here we have SSDs which have been reported for months, and by several reputable sources, to be having problems, which SanDisk even attempted to patch without success. And now, wouldn't you think that they are trying to recall them all in order to protect consumers from likely data loss (like any responsible data storage provider would do)? Nope. They are currently trying to sell those at a significant discount, as quickly as they can. Hurting plenty of consumers in the process is apparently less important than their short-term financials.

As far as I care, they can go to hell. Bankruptcy is all they deserve, for the greater good.

Indeed. An old EE mentor told me once that most component aging takes place the first two weeks of operation. If it operates for two weeks, it will probably operate for a long, long time after that. When you're burning in a piece of gear, it helps the testing process if you put it in a high temperature environment as well (within reason) to place more stress on the components.

The high temperature part is kind of a trap with SSDs: flash memory is easier to write (less likely to error out) at temperatures above 50C, so if you run a write heavy application at higher temperature, it's less likely to fail than if it was kept colder.

Properly stress testing an SSD would be writing to it while cold (below 20C) and checking read errors while hot (above 60C).

For normal use you'd want the opposite: write hot, read cold.

They should at least try to recover the data. Maybe a data recovery program like SpinRite would do it: https://www.grc.com/sr/spinrite.htm

Not running RAID, not backing up, and not even trying the simplest recovery approaches is just sloppy and lazy. Do at least one of the three.

Like someone else said, expect the biggest risk of failure right when you buy it, then rising failure rates maybe 5 years out. Refreshing the disk pattern as it gets older can help too.

Just pay triple! Don't be a poor!

Such great advice.

You can be mad at it but what they said is largely true. Not having the data backed up somewhere and expecting everything to be perfectly fine forever is like not having old photos backed up somewhere and expecting everything to be perfectly fine forever.

It's even more egregious here because if OP can afford a 3TB SSD, they should be able to afford a 3+TB HDD as a backup no problem. The money isn't an issue for OP, just improper knowledge of how to handle data storage. It isn't necessarily their fault this happened, since the average person isn't given this info, but at its core, "pay more money" because you need backups is the only true answer.

🤖 I'm a bot that provides automatic summaries for articles:

This isn’t a drive he purchased many months or years ago; it’s the supposedly safe replacement that Western Digital recently sent after his original wiped his data all by itself.

SanDisk issued a firmware fix for a variety of drives in late May, shortly after our story.

But data recovery services can be expensive, and Western Digital never offered Vjeran any the first time it left him out to dry.

Honestly, it feels like WD has been trying to sweep this under the rug while it tries to offload its remaining inventory at a deep discount; they’re still 66 percent off at Amazon, for example.

Unfortunately, the broken state of the internet means Western Digital doesn’t have to work very hard to keep selling these drives.

I’d also like to say shame on CNET, Cult of Mac and G/O Media’s The Inventory for writing deal posts about this drive that don’t warn their readers at all.

Wow so the first one failed, then they relied on its replacement completely and blindly. It’s dumb shit like this that made me stop feeling bad for those who experience data loss.

WD writing fake reviews?

There's no way an actual human wrote such an extensive and detailed, but overall content-free, review, unless they got the drive for free in exchange for an enthusiastic review.

Edit: the article shows screenshots of clearly fake reviews on Amazon from "verified" buyers. Those are the fake reviews I'm referring to.

What the hell are you talking about? Consider reading the actual article before commenting something snarky. WD owns SanDisk, and this article is shitting all over them.

Here's a short version if you can't be bothered: It's a follow-up to this article from May where they reported on a bug in SanDisk firmware that erased your data. WD claims to have fixed it with an update, but that appears to be false. The fact that these drives with a high failure rate are also being sold with a deep discount makes it seem like WD/SanDisk is just trying to get rid of defective hardware as quickly as possible while minimizing dollars lost, at the expense of your data.

They were talking about the reviews featured in the article. Did you read it?

Ah, got it. Very unclear from your comment.

I wasn't the one writing the comment.

They clearly didn’t bother to read anything before replying.

And that's why RAID is a good idea.

For availability, yes, but RAID is not a substitute for proper backup procedures. E.g. - offsite, cloud, or automated scheduled local backups, or even regular data integrity checks.

True, but it will protect you from a single drive failing like this.

I don’t think the drive actually failed. The article said that the files disappeared from the drive one-by-one, which sounds like a firmware bug to me.

You could theoretically have the same problem due to a buggy RAID controller or driver.

I don’t think the drive actually failed. The article said that the files disappeared from the drive one-by-one

It didn't fail in the sense of reporting an I/O error, but it did fail in the sense that the bytes previously written to it can't be read any more.

which sounds like a firmware bug to me.

Could be. SSD firmware is pretty notorious for data loss.

You could theoretically have the same problem due to a buggy RAID controller or driver.

Which is why I don't trust hardware RAID controllers, only software RAID, preferably with per-block checksums so that the software RAID controller knows which copy is the good copy.

The author is using macOS, whose APFS file system has those features. Linux's btrfs does too.
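
For the curious, the "per-block checksums so the software knows which copy is good" idea can be sketched in a few lines. This is a toy model, not how any real filesystem or RAID layer is implemented: the "disks" are plain dicts, CRC32 is used only for brevity where real systems use stronger checksums, and all function names are made up.

```python
import zlib


def write_block(mirrors, checksums, index, data):
    """Write the same block to every mirror and record its checksum."""
    for disk in mirrors:
        disk[index] = data
    checksums[index] = zlib.crc32(data)


def read_block(mirrors, checksums, index):
    """Return a copy whose checksum matches, repairing any bad mirrors."""
    expected = checksums[index]
    good = None
    for disk in mirrors:
        data = disk.get(index)
        if data is not None and zlib.crc32(data) == expected:
            good = data
            break
    if good is None:
        raise IOError(f"block {index}: no mirror holds a valid copy")
    for disk in mirrors:
        if disk.get(index) != good:
            disk[index] = good  # overwrite the corrupted or missing copy
    return good


if __name__ == "__main__":
    mirrors, checksums = [{}, {}], {}
    write_block(mirrors, checksums, 0, b"footage")
    mirrors[0][0] = b"f00tage"  # simulate silent corruption on one disk
    print(read_block(mirrors, checksums, 0))
```

Without the checksum, a plain mirror that sees two different copies has no way of telling which one is the corrupted one, which is the objection to controllers that only mirror.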

Raid 0 right?

RAID 1. RAID 0 stripes data between disks, meaning you get much faster I/O speeds, but if one disk fails, you lose it all. RAID 1 is when you have 2 (or more) disks and the data is mirrored between them. So if one dies, you’ve got a perfect copy of it on the other disk. RAID 0 = “striped”, RAID 1 = “mirrored”
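
Here is a toy illustration of the striped vs. mirrored difference, with "disks" modelled as Python lists and a failed disk modelled as `None`. It ignores everything a real RAID implementation has to deal with (stripe sizes, rebuilds, controllers); the function names are invented for this example.

```python
def raid0_write(blocks, n_disks):
    """Stripe: deal blocks out round-robin, so each disk holds only a part."""
    disks = [[] for _ in range(n_disks)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)
    return disks


def raid1_write(blocks, n_disks):
    """Mirror: every disk holds a full copy of all blocks."""
    return [list(blocks) for _ in range(n_disks)]


def raid0_read(disks):
    """Reassemble the stripes; any failed disk (None) loses everything."""
    if any(d is None for d in disks):
        return None
    n = len(disks)
    total = sum(len(d) for d in disks)
    return [disks[i % n][i // n] for i in range(total)]


def raid1_read(disks):
    """Read from the first surviving mirror; None only if all have failed."""
    for d in disks:
        if d is not None:
            return list(d)
    return None


if __name__ == "__main__":
    blocks = ["a", "b", "c", "d", "e"]
    striped, mirrored = raid0_write(blocks, 2), raid1_write(blocks, 2)
    striped[1] = None   # one disk dies in each array
    mirrored[1] = None
    print("RAID 0 after one failure:", raid0_read(striped))
    print("RAID 1 after one failure:", raid1_read(mirrored))
```

Mirroring only helps with a disk that dies. If files get deleted or corrupted on write, the mirror faithfully copies the damage, which is why the replies above insist RAID is not a backup.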

In case anyone is in a similar situation, I can't say enough good things about PhotoRec. It saved my ass more than once from hard drive recovery down to SD cards.

https://www.cgsecurity.org/wiki/PhotoRec

Yeah, yeah, it's command line only, but once you get your stuff back it's worth learning!

And it is not command line only. PhotoRec has a GUI; only TestDisk doesn't.

I second this. I luckily never needed it myself but I saved the stuff of a few people over the years and it’s not one of those annoying „free“ apps.

Did they really abbreviate "paragraph" to "graf"?

Journalistic jargon: hed, dek, lede, nut graf/nutgraf

Yes, but it’s standard journalist speak and predates this article by a long time.

"Lede" I've heard because of the common expression "burying the lede." You're telling me "graf" is standard language for published articles?

You're telling me "graf" is standard language for published articles?

And its unabbreviated form, "nut graf." Like, it's a legit thing. I mean...wow. Nut graf.

SSDs are nice and fast, but if the data table goes bad, you have lost everything. At least with an HDD you can still pull files off if the filesystem table goes bad. Also, an unplugged SSD in a hot location will lose data quite readily. Always keep them powered to keep the bits.

We had the same problem here in our company. Don’t use their drives.

That's a bit extreme. Don't use 'Extreme PRO Portable SSD' units, but WD has some pretty reliable SSDs, and if we boycott WD, only Samsung is left.

Data being lost on a drive isn't a reason not to purchase. If it were then we would never buy any drives.

Data loss is a reason not to purchase if it happens more often than with competing products, and that may be the case with these.

But a sample size of 1 or 2 does not prove it happens more than other products.

Pretty sure the sample size is hundreds or thousands. SanDisk would not bother with a firmware “fix” for something that only affected 2 drives. I had a SanDisk I bought recently have this exact same issue and when I went searching for the problem it was reported in a lot of places.

This is one of the reasons why I prefer having a few smaller drives rather than one big one. Having a zillion terabytes of storage on one drive is great and all, but that's a lot of stuff to potentially lose when something craps out. I'd sooner have a couple of smaller ones, so that if one HDD does shit the bed... err, case? at least not everything's gone.

That's the neat thing: if you're shooting 4K or above, 3TB IS a small drive.

If you have a proper backup system in place that shouldn't be a problem. Speaking of which I should do another round of backups...

My only big drive is just for games. I have two internal 1TB drives, one HDD and one SSD, for storage on my computer, and four external 512GB drives as backup to those. It's not the best solution, and it's kind of clunky, but I'd rather have something than lose everything to a bad drive.

That's good to know. I almost thought of buying a couple (I always back up with pairs) to replace a couple of aging spinning disk portables.

Guess I will wait.

(Extremely drunk)

I don't remember any more.

(Fishes out a SanDisk Extreme SD 128GB card from the box)

Was this the one? I think this was the one.
The one that fucking failed.

...fuck. I had more money than sense a few years ago.

And the truism holds. If you only have one backup, you don't have backups.