Storage vendors, including but reportedly not limited to Western Digital, have quietly begun shipping SMR (Shingled Magnetic Recording) disks in place of earlier CMR (Conventional Magnetic Recording) disks.
SMR is a technology that allows vendors to eke out higher storage densities, netting more capacity from the same number of platters, or the same capacity from fewer platters.
Until recently, the technology has only been seen in very large disks, which were typically clearly marked as “archival”. In addition to higher capacities, SMR is associated with much lower random I/O performance than CMR disks offer.
This is going to be another one of those stupid things we technology buyers have to look out for when buying storage, isn’t it?
I nearly bought one before seeing the performance issues come up in a review.
The main problem is that manufacturers and stores are not clearly labeling these, and it’s not until you experience bad performance that you realize something is “wrong”. Even armed with the exact model number, the manufacturer spec sheets can’t be relied on to tell us the facts.
The statements WD made in the article are deceptive as hell!
Now WD’s got a lawsuit.
https://arstechnica.com/gadgets/2020/05/western-digital-gets-sued-for-sneaking-smr-disks-into-its-nas-channel/
I hope all manufacturers take note. Selling SMR drives isn’t the problem, but not clearly and openly disclosing the fact is.
SMR has a place, but that place is not NAS disks, which are supposed to provide high performance and low latency. WD knew this, and so did the other manufacturers (I don’t have a list, but I think virtually all HDD manufacturers ran a similar scheme).
And if you think about it, 10+TB disks are excluded from this. They would actually benefit from SMR, but their customers are more savvy. (If you fill your NAS with 4TB disks, you don’t actually need a NAS at this point in time.) So I think they assumed those customers would not notice the difference.
Those buying the higher capacity ones would obviously notice the difference. One of the first things I do when I get a new disk is actually testing it for performance and defects (and let’s not go into having bad blocks before filling a single pass).
It is nice to see that they could not get away with it. Hope this becomes a lesson.
sukru,
I don’t have an issue if they make a good-faith attempt to provide the information. However, it’s downright unethical to hide it and omit it from the specs. It’s got nothing to do with whether it’s a “NAS drive” or an “x TB capacity” drive; a lot of applications are affected by the bad random access performance regardless of capacity and regardless of what kind of system the disk is plugged into.
Ultimately the consumer needs to have the facts in order to make an informed choice.
That’s a good practice, although a lot of people test by copying large files without testing random access, which can be a more significant bottleneck (YMMV).
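For what it’s worth, here’s roughly the kind of sanity check I mean. This is a minimal Python sketch under a few assumptions, not a real benchmark (a dedicated tool like fio is better for that), and /dev/sdX is a placeholder path; the point is just that a handful of scattered 4K reads tells you far more about a drive’s latency than copying one big file does.

```
#!/usr/bin/env python3
"""Rough sequential-vs-random read comparison for a disk (illustrative only, Linux)."""
import os, random, time

PATH = "/dev/sdX"        # placeholder device or big file to test; change this
SEQ_BLOCK = 1024 * 1024  # 1 MiB sequential reads
RND_BLOCK = 4096         # 4 KiB random reads
RND_COUNT = 2000         # number of random reads to sample

fd = os.open(PATH, os.O_RDONLY)
size = os.lseek(fd, 0, os.SEEK_END)

# Ask the kernel to drop cached pages so we actually hit the disk.
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)

# Sequential pass over the first ~1 GiB.
t0 = time.monotonic()
done = 0
while done < min(size, 1 << 30):
    done += len(os.pread(fd, SEQ_BLOCK, done))
seq_mbps = done / (1 << 20) / (time.monotonic() - t0)

# Random 4 KiB reads scattered across the whole device.
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
t0 = time.monotonic()
for _ in range(RND_COUNT):
    offset = random.randrange(0, size - RND_BLOCK) & ~4095  # 4K-aligned offset
    os.pread(fd, RND_BLOCK, offset)
iops = RND_COUNT / (time.monotonic() - t0)

os.close(fd)
print(f"sequential: ~{seq_mbps:.0f} MB/s, random 4K: ~{iops:.0f} IOPS")
```

Note that drive-managed SMR mostly reads like a normal disk; it’s sustained and random writes (and RAID rebuilds) where it falls over, so a read-only test like this mainly catches generally weak random I/O rather than SMR specifically. Write tests are destructive, so only run those against a scratch disk.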
I hope so. We deserve to know what we’re buying! If consumers are willing to buy the drives, say at a discount, then that is OK. But it’s totally unfair to pawn these off on unsuspecting consumers. I seriously hope the lawsuit forces manufacturers to truthfully label the disks.
NO, for high performance and low latency you use (caching) SSDs. Most NAS disks are only used by one user at a time for storing big files (audio/video/photo/compressed-backups). This would actually be a good match for SMR: budget disks in a network-attached device handling large file transfers.
If you are doing web development (many small files): use (caching) SSDs.
If you are running a (small) business: a proper file server with SSDs might be better than a NAS.
If you are running a large business: a SAN, not a NAS.
However, what a NAS needs is going to depend on what people put on it; generalizing isn’t helpful. You can’t just say “these are good for NAS” without knowing more about the file access patterns on the NAS server. Realistically, some users (myself included) use a NAS as regular storage and not just for archives/backups, so one cannot automatically say SMR is always suited for “NAS”. Sure, we understand that, but unfortunately that’s not the way these drives are being marketed.
A separate issue that comes up when you read more complaints about SMR drives is that they’re being auto-marked as defective in some RAID arrays because they’re timing out.
I’d argue that if there’s any application these are well suited for, it’s probably media PCs / security DVRs, which process long continuous streams all day.
It depends. Hard drives are significantly cheaper at large capacities, and they don’t suffer from bad write endurance, which has been getting worse with SSDs as they trade off lifetime for more capacity. I’d advise everyone to look at the write endurance on SSDs when making purchasing decisions. You can buy enterprise SSDs that have more endurance, but these are extremely pricey, especially if you need to buy many for RAID. SSDs give a nice performance boost, but until SSDs evolve to the point where there are no tradeoffs, I’d say hard disks will continue to have merit for some users and businesses. YMMV.
Funnily enough, the Purple line for video surveillance systems is the first consumer application of SMR drives from WD.
Yes, capacities above 1TB. Below that, there isn’t a reason to buy an HD over an SSD.
This isn’t as true as it used to be. Write endurance has stayed stable and gotten better in some cases.
Looking at a 1TB WD Blue with 3D NAND, which is a mainstream drive and not an ultra-expensive prosumer drive, it has an MTTF of 1.75M hours and a TBW of 400TB. People don’t write 1TB of data a day. That is a crazy high number for most people, and it is more than adequate for the vast majority of use cases.
https://www.newegg.com/western-digital-blue-1tb/p/N82E16820250088
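To put 400 TBW in perspective, here’s a quick back-of-the-envelope calculation in Python; the daily write volumes are just illustrative guesses, not measurements of anyone’s workload:

```
# Rough time-to-exhaust-endurance at a few assumed daily write volumes.
TBW_BYTES = 400e12                  # rated endurance: 400 TB written
for gb_per_day in (20, 50, 100):
    days = TBW_BYTES / (gb_per_day * 1e9)
    print(f"{gb_per_day:>3} GB/day -> ~{days / 365:.0f} years to reach 400 TBW")
# 20 GB/day -> ~55 years; 50 GB/day -> ~22 years; 100 GB/day -> ~11 years
```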
That will never happen because we have to deal with physics. HDs have the problem of being mechanical devices, and SSDs have the problem of write endurance. Both are wear items which need to be checked periodically. They’re like car tires.
Overall, SSDs have similar durability to HDs, better in some metrics, and much better performance. HDs are niche tech for home users and are becoming niche tech in the datacenter, relegated to storage arrays.
Flatland_Spider,
P/E cycles have only gotten worse as they pack cells more and more densely. Flash chips designed for TLC/QLC bit densities are spec’d down to as low as 1k cycles for raw storage. Of course there are mitigations that drive manufacturers can employ in the controllers to mask the low endurance of modern NAND chips:
– wear leveling algorithms to balance out the remaining life across all cells (see the sketch after this list)
– over-provisioning (a big difference between enterprise and consumer SSDs is substantial over-provisioning to increase the overall lifespan)
– more ECC bits
– using a more durable SLC/MLC cache in front of the TLC/QLC storage
– compressing data to occupy fewer cells
– battery-backed cache
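To make the first item concrete, here’s a toy wear-leveling sketch in Python. It’s purely illustrative and not modeled on any real controller’s flash translation layer; the names and block counts are made up:

```
# Toy wear leveling: logical blocks are remapped so each write lands on the
# least-erased free physical block, spreading P/E cycles evenly.
import heapq

class ToyFlash:
    def __init__(self, physical_blocks):
        self.erase_count = [0] * physical_blocks
        self.mapping = {}                    # logical block -> physical block
        self.data = {}                       # physical block -> contents
        # free blocks kept in a heap ordered by how worn they already are
        self.free = [(0, pb) for pb in range(physical_blocks)]
        heapq.heapify(self.free)

    def write(self, logical_block, payload):
        # Instead of rewriting in place (which would wear one block out),
        # retire the old physical block and pick the least-worn free one.
        if logical_block in self.mapping:
            old = self.mapping.pop(logical_block)
            self.erase_count[old] += 1       # old block must be erased for reuse
            heapq.heappush(self.free, (self.erase_count[old], old))
        _, pb = heapq.heappop(self.free)
        self.mapping[logical_block] = pb
        self.data[pb] = payload
        # (real controllers also migrate *static* data off barely-worn blocks)

flash = ToyFlash(physical_blocks=8)
for _ in range(1000):
    flash.write(logical_block=0, payload=b"x")   # hammer one logical block
print(flash.erase_count)   # wear ends up spread across all 8 physical blocks
```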
But it’s gotten harder to find the detailed specs in the last decade as manufacturers have become less forthcoming.
It does not say “it has an MTTF of 1.75M hours and a TBW of 400TB”; it says “MTTF Up to 1.75M hours” and “up to 400 terabytes written (TBW)”. I know we’re programmed to ignore that these days, but it does change the meaning. Also, the estimate is probably under ideal conditions (i.e., long sequential writes). For random access workloads, fragmentation/write amplification takes a larger toll on flash. As always, engineers can employ mitigations, but given that no detailed specs are published, it’s impossible for us to know whether they actually did. Unless you buy an enterprise drive, you genuinely don’t know how much the manufacturer compromised on endurance to save on manufacturing costs.
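To illustrate how much the workload can matter, here’s a rough sketch; the write amplification factors are assumptions for illustration, not figures for this or any particular drive:

```
# Hypothetical effect of write amplification (WAF) on usable endurance.
# Assumption: the 400 TBW rating roughly reflects a near-ideal workload
# with a WAF around 1.1; a workload with a higher WAF burns through the
# same NAND sooner.
RATED_TBW = 400
RATED_WAF = 1.1
for label, waf in (("mostly sequential", 1.1), ("mixed", 2.0), ("heavy random 4K", 4.0)):
    effective_tbw = RATED_TBW * RATED_WAF / waf
    print(f"{label:>17}: ~{effective_tbw:.0f} TB of host writes")
# mostly sequential: ~400 TB; mixed: ~220 TB; heavy random 4K: ~110 TB
```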
I’ve had hard disks running for decades in systems with no signs of failing. I’ve also had HDs fail, but by far most still work. I’m curious whether the hard drives that went on to fail could have been distinguished, when brand new, from those that did not. It would probably be feasible to make magnetic drives that last a lifetime with better manufacturing tolerances, but then again they’d probably become obsolete before they fail.
Flatland_Spider,
Not in the same sense, though. Given enough time, NAND hardware will eventually break, and hard disks will eventually break, in the sense that everything eventually “breaks” due to component failures, with the average time described by MTBF. But on top of that, NAND chips wear out (and become slower too) predictably over time as they’re written to. That type of wear is specific to SSDs.
In other words, for hard disks failure over time is a probability, whereas an SSD wearing out over time is a function of how many times its cells were erased and rewritten. An HD that’s written 400PB might manage to write another 400PB, whereas an SSD will have used up all of its write endurance. With this in mind, I have to object to characterizing hard drives as “wear items” in the same sense that SSDs are.
I still object to your classifying HDs as a “wear item”. Presumably the CD/radio in your car has an MTBF, but it is not a “wear item” in the same sense as the brakes on your car. One literally wears out, whereas the other only breaks if you are abusive and/or unfortunate. I know it’s nitpicking, but when you call every electronic or mechanical item a “wear item”, it truly loses all meaning. There is a distinction to be made between products that could fail and those that are expected to wear out during their lifetime.
I found your lightbulb example kind of ironic, given that more expensive LED lighting technology that lasts longer is replacing cheaper incandescent bulbs that need to be replaced more frequently, so I don’t think it was the best example to make that point.
Anyways, we’re kind of off track here, haha. I know what you’re saying and I don’t fundamentally disagree; longevity isn’t the end-all be-all, especially at lower price points. The main thing is that consumers know what they’re buying. Manufacturers need to improve in this area.
For high performance and low latency, you buy lots of servers with adequate RAM and SSDs from the get-go.
SSD caches aren’t as useful as people think. Performance drops off considerably when the cache gets full or data outside the cache is requested, and they are still slower than RAM. Moral of the story: buy more RAM first.
LOL No.
USB drives are only used by one user at a time. A NAS is meant to be used by multiple people concurrently. Otherwise, what is the point? Buy a USB drive and save some money.
For lots of small files, get an SSD. 1TB SSDs are cheap.
What do you think a NAS is? A NAS and a file server are basically the same thing, and the same goes for a SAN. The line between all of them is very blurry; the real difference is just the protocols they use.
> Most NAS disks are only used by one user at a time for storing big files (audio/video/photo/compressed-backups).
Even if you assume this, NAS systems pretty universally use storage layouts that suffer from the performance characteristics of SMR drives. Back when RAID1 was the norm, yes, SMR would have been fine. These days the norm is RAID5/6, ZFS, or BTRFS, all of which have serious problems with SMR drives even if you are storing mostly very big files.
However, you’re also basing your assumption on classical terminology, where a NAS is little more than a storage device with a dinky little CPU and next to no RAM, used just for providing storage on a conventional network. These days the term NAS refers just as easily to purpose-built file servers as to the poorly designed classical storage appliances, and the marketing gets very fuzzy between NAS and SAN hardware once you go beyond the network interface.
But none of that matters, because HDDs marketed for NAS usage have always been understood to be targeted at running in near-line and/or always-on RAID arrays. IOW, they’re not supposed to be super high performance like HDDs marketed for enterprise usage (and thus cost correspondingly less than the price gouging going on there), but they still provide the behavioral characteristics and features required to make always-on and near-line RAID arrays work correctly. The big distinction here has historically been that they just return write and read errors when they happen, instead of retrying in firmware for multiple minutes before giving up and returning errors. More recently, though, there has also been a general assumption that they are _not_ SMR disks, because SMR makes it impossible to run RAID5/6 arrays efficiently or to run ZFS/BTRFS safely.
ahferroin7,
It bugs me that the manufacturers are selling these for markets where they don’t belong. We can’t pay any attention to what the manufacturer thinks, for many of the reasons that you brought up, and because it just doesn’t make any sense to categorize all “NAS” systems as the same. The fact that these drives happen to be in a NAS is not an indication of how they’ll be configured or what kind of performance will be needed.
I see the WD40EFAX is “Specifically designed for use in NAS systems with up to 8 bays”. Well, what the hell does that mean? This is one of the drives with terrible RAID performance, BTW. Not only is labeling it a NAS drive ambiguous, it can be downright counterproductive and misleading.
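For what it’s worth, on Linux you can at least ask the block layer what it reports for each disk; here’s a minimal sketch that reads the standard `zoned` queue attribute in sysfs. The catch is that only host-aware and host-managed SMR drives identify themselves there; drive-managed SMR like these Reds shows up as “none”, which is exactly why buyers couldn’t tell what they were getting:

```
# Print what the Linux block layer reports for each SATA/SAS disk.
# Only host-aware/host-managed SMR drives identify themselves here;
# drive-managed SMR (e.g. the WD Red EFAX models) reports "none".
import glob

for path in sorted(glob.glob("/sys/block/sd*/queue/zoned")):
    dev = path.split("/")[3]            # e.g. "sda"
    with open(path) as f:
        zoned = f.read().strip()        # "none", "host-aware", or "host-managed"
    print(f"{dev}: {zoned}")
```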
As other sites have pointed out, it’s about money. High capacity disks are high margin, and lower capacity disks are not.
Switching the lower end Reds to SMR was a cost-cutting move by WD to squeeze out more margin. Nothing more, nothing less.
It’s about the access patterns and usage, not data size. “Network” and “Attached” are the key words here.
They assumed it would work, but it didn’t. If it had worked, no one would have cared. The lower end market is cost sensitive, and as long as the performance had been comparable, people would have been fine with it.
No, because 1TB SSDs have dropped below $149, with many dipping into the $109-$129 range at times.
1TB is still a huge amount of space, and more people are using SaaS software these days. SMR drives really only affect data hoarders with RAID arrays.
This affects me because running storage servers with RAID arrays is part of my day job, and at home I need space for VMs to prototype storage-heavy things. (I fully understand I am on the high side of the bell curve.) However, this doesn’t even register for normal people. I’d guess they don’t have more than 0.5 TB worth of local data that matters to them.
The SMR drives are good enough for backup drives and warm storage. It’s when they get used in RAID arrays that they start falling down, which is highly likely considering the market Reds are targeted at. It says so right on the box: “designed for RAID arrays and high disk systems.”
Performance numbers: https://www.servethehome.com/wd-red-smr-vs-cmr-tested-avoid-red-smr/
Flatland_Spider,
Thanks for posting that!
Some of those benchmarks are reasonable; I’m sure the 256MB cache helped a lot to smooth out bursts of I/O.
I was surprised at how badly the SMR disk did copying a large 125GB file; I expected better. Granted, average consumers may have lighter workloads, but it does highlight the performance deficits of the underlying media once the cache fills up.
And the RAID rebuild performance was just atrocious.
Yeah, the SMR drive performance isn’t atrocious until you do the one thing people buy Reds for. LOL