Discussion on the original source: (20 points, 3 days ago, 5 comments) https://news.ycombinator.com/item?id=43702193
Related: SSD as Long Term Storage Testing (132 points, 2023, 101 comments) https://news.ycombinator.com/item?id=35382252
I would never buy a no-name SSD. I did it once long ago and got bitten: I wrote a program to sequentially write a pseudorandom sequence across the whole volume, then read it back and verify, and it proved that all 8 Pacer SSDs I had were suffering corruption.
That’s also fairly common for cheap ‘thumb drives’, as I understand it. I’ve been bitten by that before.
(Edit: Allegedly if you use low-numbered storage blocks you'll be okay, but the advertised capacity (both on the packaging and what it reports to the OS) is a straight-up lie.)
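As a rough illustration of the write-then-verify test described above (the same kind of test also exposes fake-capacity thumb drives), here is a minimal Python sketch. The device path is hypothetical, it usually needs root, and it destroys everything on the drive:

    #!/usr/bin/env python3
    # Fill a block device with a seeded pseudorandom stream, then read it back
    # and verify. DESTROYS everything on the device -- only point it at a scratch drive.
    import os
    import random

    DEV = "/dev/sdX"   # hypothetical device path: the drive under test
    CHUNK = 1 << 20    # 1 MiB per I/O
    SEED = 0xC0FFEE

    def stream(total):
        # Yields the same pseudorandom bytes every time it is called.
        rng = random.Random(SEED)
        remaining = total
        while remaining:
            n = min(CHUNK, remaining)
            yield rng.randbytes(n)   # Python 3.9+
            remaining -= n

    # Size of the device: seek to its end.
    fd = os.open(DEV, os.O_RDONLY)
    size = os.lseek(fd, 0, os.SEEK_END)
    os.close(fd)

    # Pass 1: write the pseudorandom stream across the whole volume.
    with os.fdopen(os.open(DEV, os.O_WRONLY), "wb") as dev:
        for block in stream(size):
            dev.write(block)
        dev.flush()
        os.fsync(dev.fileno())

    # Pass 2: regenerate the same stream and compare it with what comes back.
    bad = 0
    with open(DEV, "rb") as dev:
        for offset, expected in zip(range(0, size, CHUNK), stream(size)):
            if dev.read(len(expected)) != expected:
                bad += 1
                print(f"mismatch in the block at offset {offset:#x}")

    print("clean" if bad == 0 else f"{bad} corrupted block(s)")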
I didn't think it was controversial that SSDs are terrible at long term storage?
I wouldn't say it's controversial but I suspect most people don't know about it. There's been a lot of discussion about SSD write endurance but almost none about retention.
This is a known issue. You have to power up your SSDs (and flash cards, which are based on an even more flimsy, cost-optimized version of the same tech) every now and then for them to keep data. SSDs are not suitable for long term cold storage or archiving. Corollary: don't lose that recovery passphrase you've printed out for your hardware crypto key; the flash memory in it is also not eternal.
A not-so-fun fact is that this even applies to modern read-only media, most notably Nintendo game carts. Back in the day they used mask ROMs which ought to last more or less forever, but with the DS they started using cheaper NOR or NAND flash for larger games, and then for all games with the 3DS onwards. Those carts will bit-rot eventually if left unpowered for a long time.
I've noticed a number of GBA carts I've picked up used (and probably not played in a long while) fail to load on the first read. Sometimes no logo, sometimes corrupted logo. Turning it off and on a couple of times solved the issue, and once it boots OK it'll boot OK pretty much every time after. Probably until it sits on the shelf for a long while.
I think GBA games were all MaskROMs, so with those it's probably just due to the contacts oxidizing or something.
> You have to power up your SSDs every now and then for them to keep data.
What is the protocol you should use with SSDs that you’re storing? Should you:
- power up the SSD for an instant (or for some minutes?) without needing to read anything?
- or power up the cells where your data resides by reading the files you had created on the SSD?
- or rewrite the cells by reading your files, deleting them, and writing them back to the SSD?
> don't lose that recovery passphrase you've printed out for your hardware crypto key, the flash memory in it is also not eternal
Yeah. Paper is the best long term storage medium, known to last for centuries.
https://wiki.archlinux.org/title/Paperkey
It's also a good idea to have a backup copy of the encryption keys. Losing signing keys is not a big deal but losing encryption keys can lead to severe data loss.
Please explain to me how that is supposed to work. For all I know the floating gate is, well, isolated, and only writes (which SSDs don't like if they're repeated on the same spot) touch it, through mechanisms not unlike MOSFET aging, i.e. carrier injection. Reading, on the other hand, depends on the charge in the floating gate altering the Vt of the transistor below, and that action can't drain any charge from the floating gate.
According to a local expert (ahem), leakage can occur through mechanisms like Fowler-Nordheim tunneling or Poole-Frenkel emission, often facilitated by defects in the oxide layers.
If you at least read the data from the drive from time to time, the controller will "refresh" the charge by effectively re-writing data that can't be read without errors. Controllers will also tolerate and correct _some_ bit flips on the fly, topping up cells, or re-mapping bad pages. Think of it as ZFS scrub, basically, except you never see most of the errors.
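A minimal sketch of the "just read everything from time to time" approach, for anyone who wants to script it. The device path is hypothetical, the pass is read-only, and any sector the drive can no longer recover surfaces as an I/O error:

    #!/usr/bin/env python3
    # Read every byte of a drive so the controller gets a chance to spot weak cells
    # and refresh or remap them internally. Read-only: this script writes nothing.
    import os

    DEV = "/dev/sdX"    # hypothetical path: the SSD you want to exercise
    CHUNK = 4 << 20     # 4 MiB per read

    total = 0
    with open(DEV, "rb") as dev:
        while True:
            try:
                block = dev.read(CHUNK)
            except OSError as e:
                # A sector the drive can no longer recover shows up here as an I/O error.
                print(f"read error near offset {total:#x}: {e}")
                break
            if not block:
                break
            total += len(block)

    print(f"read {total / 2**30:.1f} GiB")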
Endurance is proportional to programming temperature. In the video, when all four SSDs are installed at once, the composite device temperature spans a range of more than 12 °C. This should be expected to influence the outcomes.
For long term storage, prefer hard drives (careful about CMR vs SMR)
If you have specific high-performance random-I/O needs, you can either
- get an SLC drive like https://news.solidigm.com/en-WW/230095-introducing-the-solid...
- make one yourself by hacking the firmware: https://news.ycombinator.com/item?id=40405578
Be careful when you use something "exotic", and do not trust drives that are too recent to be fully tested: I learned my lesson with M.2 2230 drives https://www.reddit.com/r/zfs/comments/17pztue/warning_you_ma... which seems validated by the large number of similar experiences like https://github.com/openzfs/zfs/discussions/14793
> (careful about CMR vs SMR)
Given the context of long term storage... why?
After I was bamboozled by an SMR drive myself, it's always great to see the callout made for those who might be unaware. What a piece of garbage tech to let vendors upsell higher capacity numbers.
(Yes, I know some applications can be agnostic to SMR, but it should never be used in a general purpose drive).
Untested hypothesis, but I would expect the wider spacing between tracks in CMR to make it more resilient against random bit flips. I'm not aware of any experiments to prove this, and it may be worth doing. If the HDD manufacturers can convince us that SMR is just as reliable for archival storage, it would help them sell those drives, since right now lots of people are avoiding SMR due to poor performance and the infamy of the bait-and-switch that happened a few years back.
> - make one yourself by hacking the firmware: https://news.ycombinator.com/item?id=40405578
> Be careful when you use something "exotic", and do not trust drives that are too recent to be fully tested
Do you realize the irony of cautioning about buying off-the-shelf hardware while recommending hacking the firmware yourself?
If you care about long term storage, make a NAS and run ZFS scrub (or equivalent) every 6 months. That will check for errors and fix them as they come up.
All error correction has a limit: if too many errors build up, they become unrecoverable. But as long as you reread the data and fix errors while they're still within the correctable range, it's fine.
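For the scheduled-scrub idea, something along these lines can be dropped into cron or a systemd timer. It's only a sketch: the pool name "tank" is a placeholder and it assumes the stock zpool CLI is available:

    #!/usr/bin/env python3
    # Kick off a scrub on one pool and report overall pool health.
    # Meant to be run from cron or a systemd timer every few months.
    import subprocess

    POOL = "tank"   # placeholder pool name

    # Start the scrub; ZFS runs it in the background.
    subprocess.run(["zpool", "scrub", POOL], check=True)

    # 'zpool status -x' prints "all pools are healthy" unless something needs attention.
    out = subprocess.run(["zpool", "status", "-x"],
                         capture_output=True, text=True, check=True)
    print(out.stdout.strip())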
What hardware, though? I want to build a NAS / attached storage array but after accidentally purchasing an SMR drive[0] I’m a little hesitant to even confront the project.
A few tens of TBs. Local, not cloud.
[0] Maybe 7 years ago. I don’t know if anything has changed since, e.g. honest, up-front labeling.
[0*] For those unfamiliar, SMR is Shingled Magnetic Recording. https://en.m.wikipedia.org/wiki/Shingled_magnetic_recording
Nothing can really save you from accidentally buying the wrong model other than research. For tens of TBs you can use either 4-8 HDDs of 20 TB or more, or 6-12 8 TB SSDs (e.g. Asustor). The difference really comes down to how much you're willing to pay.
Toshiba Nx00/MG/MN are good picks. The company has never failed us, and I don't believe they've had the same kinds of controversies as the US competition.
Please don't tell everyone so we can still keep buying them? ;)
I use TrueNAS and it does a weekly scrub IIRC.
> run ZFS scrub (or equivalent) every 6 months
ZFS in mirror mode offers redundancy at the block level, but scrubbing requires plugging the device in.
> All error correction has a limit. If too many errors build up, it becomes unrecoverable errors
There are software solutions. You can specify the redundancy you want.
For long term storage, if using a single medium that you can't plug in and scrub, I recommend par2 (https://en.wikipedia.org/wiki/Parchive?useskin=vector) over NTFS: there are many NTFS file recovery tools, and it shouldn't be too hard to roll your own solution to use the redundancy when a given sector can't be read.
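For the par2 route, the workflow looks roughly like this, assuming the par2cmdline tool is installed; the file names and the 15% redundancy figure are just examples:

    #!/usr/bin/env python3
    # Create par2 recovery data for a set of files, then verify (and repair) later.
    # Assumes the par2cmdline tool is on PATH; file names and the 15% figure are examples.
    import glob
    import subprocess

    files = glob.glob("*.iso")   # example set of files to protect

    # ~15% redundancy: roughly that fraction of the data can be lost and still rebuilt.
    subprocess.run(["par2", "create", "-r15", "archive.par2", *files], check=True)

    # Later, on the cold copy:
    subprocess.run(["par2", "verify", "archive.par2"], check=True)
    # subprocess.run(["par2", "repair", "archive.par2"], check=True)  # if verify reports damage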
Tape is extremely cheap now. I booted up a couple of laptops that had been sitting unpowered for over 7 years, and the SATA SSD in one of them has missing sectors. It had zero issues when it was shut down.
Is tape actually cheap? Tape drives seem quite expensive to me, unless I don't have the right references.
If you don't have a massive amount of data to back up, used LTO5/6 drives are quite cheap; software and drivers are another issue, however, as with a lot of enterprise kit.
The problem, of course, is that with tape you also need to have a backup tape drive on hand.
Overall, if you get a good deal, you can have a reliable backup setup for less than $1000 with 2 drives and a bunch of tapes.
But this is only good if you have single-digit or low double-digit TBs to back up, since it's slow and, with a single tape drive, you'll have to swap tapes manually.
LTO5 is 1.5TB and LTO6 is 2.5TB (more with compression); that should be enough for most people.
Tapes are cheap, tape drives are expensive. Using tape for backups only starts making economic sense when you have enough data to fill dozens or hundreds of tapes. For smaller data sets, hard drives are cheaper.
HDDs are a pragmatic choice for “backup” or offline storage. You’ll still need to power them up occasionally, if only for testing, and also so the “grease” liquefies and they don’t stick.
Up through 2019 or so, I was relying on BD-XL discs, sized at 100GB each. The drives that created them could also write out M-DISC archival media, which was fearsomely expensive as a home user, but could make sense to a small business.
100GB, spread over one or more discs, was plenty of capacity to save the critical data, if I were judiciously excluding disposable stuff, such as ripped CD audio.