OpenBSD Journal

4k Sector Disks

Contributed by jcr on from the disk-reality dept.

I noticed a very important post by David Gwynne (dlg@) on the misc@ mailing list, but unfortunately his message went without any response at all.

ive recently made a start on better supporting disks in openbsd that present 512 byte logical sectors, but actually use 4096 byte physical sectors on the platter. the best examples of these are the western digital "advanced format" SATA drives which have been mentioned on misc@ before. it was noted at the time that performance on these disks is much better if you can align your partitions and filesystems onto the 4k boundaries the physical sectors are on.

the process of being able to better use 4k physical sectors relies on changes at many layers of the kernel and in the partitioning and filesystem utilities, beginning with fetching the details off the hardware, and then propagating it up the storage stack into the disk and block layers, and then out to userland to make smart decisions with.

the tragedy of this situation is that i cannot find a disk that implements the parts of ATA specification that describe logical vs physical sector layouts. i have bought a couple of the WD advanced format drives, and some other people have bought me different models in the same family of drives, but none of them include the bits of the spec required to be useful. i dont know of any other manufacturers claiming to have disks with different sized logical and physical sectors, so this work has kinda stalled before it really began.

however, as users we should know that the hardware has the 4k sector "feature", so we should be able to configure machines to take advantage of it. i have talked to a few people who have tried to use these drives, but have had trouble setting them up as bootable disks.

if you want to install onto one of these disks and line the / filesystem up on a 4k boundary, the trick is to modify the start of the openbsd partition (not slice) in fdisk (not disklabel) so it begins on sector 64, not sector 63. lining the rest of the partitions up in disklabel is then an easy exercise left up to the reader. if you line the partition up properly then things will Just Work(tm).

there are western digital drives that do implement the correct parts of the ATA spec, i just dont know how to get hold of them. it appears that drives with models beginning with WD??EARS-00Z have the spec implemented, but drives with -00Y or before in their model name dont. all the local sellers only have -00Y revisions of these drives :(

dlg

The above is essentially a status update regarding requested hardware for one of the project developers (listed on want.html). Though multiple people tried and failed to fill his request for needed hardware, few people really understand the importance of support for 4k sector disks, or better said, "non-512 byte sector" storage devices.

To pilfer a phrase from tedu@, "I assert without proof" that most modern storage devices pretend to have 512 byte sectors, but internally they use whatever secret recipe the vendor has defined. Disk vendors are horrifically secretive about how their devices actually work, and in most cases, even a strict Non-Disclosure Agreement (NDA) would never give you access to the real details.

Western Digital is simply the first disk vendor to provide the supposed details of how some of their new disks work internally (e.g. 4K sectors). All other vendors refuse to provide this information. Part of the reasoning for hiding internals while pretending to have 512 byte sectors is due to the usual nonsense, namely a competitive market and fanciful ideas about Imaginary Property, but there's a lot more to it. By the time Western Digital was founded 30 years ago in 1980, the use of 512 byte sectors was already the de facto standard. With more than three decades of development and use, the 512 byte concept is extremely well entrenched, as well as very antiquated.

As with all potential changes to entrenched technical ideas, there will be costs and growing pains. If you remember the days when users complained about their systems no longer having floppy diskette drives, then you understand how much bad press a major technical change can generate. Most companies prefer to avoid having some numpty user publicly ranting about how their products have awful performance and are only useful as a door stop (kudos to Western Digital for leaving the mindless rants on their site).

An interesting observation about these new WD??EARS disks is they have a compatibility mode jumper to support Microsoft Windows-XP (2001), while at the same time, support for these drives already exists in Windows-Vista (2006) and Windows-7 (2009). With any big technical change, back room deals were most likely arranged years in advance with the major players so support was available on launch. Since Microsoft already has support for the 4K sector disks, you can guess the name of at least one company in the secret meetings. Needless to say, open source projects seldom if ever get advance warning about industry-changing events. In fact, we're very lucky to have someone like dlg@ with the skill and dedication to add software support for such advances.

We're seeing the first storage devices exposing 4k sectors hit the market with pre-arranged support in widely used software, so it's a safe bet to expect other devices to be released with non-512b sectors (or even variable-sized sectors). These speculative (possibly imaginary) storage devices of the future can be typical hard disks (e.g. rotating magnetic media) but other types of storage devices with similar non-512b sectors are most certainly possible. If you learn a bit about how flash memory works and know a bit about the use of flash in "Solid State Disks" (SSD), then you can see how important open source software support for "non-512b sector" devices could eventually become.

Since fighting an entrenched de facto standard is difficult and costly, a significant change like variable sector sizes in new storage products may not happen quickly, or even happen at all.

ADDENDUM:
A pair of WD15EARS-00Z5B1 (Rev: 80.00A80, Jan 31 2010) disks were found here in the Silicon Valley and a patch from dlg@ is being tested to determine if they will meet his requirements (e.g. specific parts of the ATA spec are implemented).

Note that the suggestion above from dlg@ about starting the root filesystem (the 'a' partition) at sector 64 rather than the default sector 63 was not necessary with these very new disks. At present, the reason why they just work is unknown, but it is possibly due to recent commits which were made without access to the needed hardware.
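
For readers wondering why one sector makes a difference, here is a minimal sketch of the arithmetic behind the sector 63 versus sector 64 advice. It assumes 512 byte logical sectors, 4096 byte physical sectors, and that logical sector 0 sits on a physical sector boundary; none of this is queried from a real drive, and the function names are only illustrative.

```python
LOGICAL = 512    # bytes per logical sector, as presented to the OS
PHYSICAL = 4096  # bytes per physical sector (assumed, not read from a drive)


def misalignment_bytes(start_lba):
    """Byte offset of start_lba into the physical sector that contains it."""
    return (start_lba * LOGICAL) % PHYSICAL


def is_aligned(start_lba):
    """True if start_lba falls exactly on a physical sector boundary."""
    return misalignment_bytes(start_lba) == 0


# 63 is the traditional default start, 64 is the start dlg@ suggests.
for lba in (63, 64):
    print(lba, lba * LOGICAL, misalignment_bytes(lba), is_aligned(lba))
```

Under those assumptions, a partition starting at sector 63 begins 3584 bytes into a physical sector, while one starting at sector 64 begins exactly on a boundary.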

(Comments are closed)


Comments
  1. By Zachary (zmisc) zachary@sdf.lonestar.org on

    You make a very good point. I'm sure many people aren't aware of the issues involving hard disks because "they just work". You'd think companies would be more willing to release specs to support a greater number of platforms, but then again Windows is the bottom line for them, so they couldn't care less.

  2. By Steve Shockley (steve.shockley) steve.shockley@shockley.net on

    I don't know if Vista support for >512 byte sector alignment is necessarily a secret collaboration; sector alignment can be a big performance problem on SANs; Microsoft added support to detect and correct sector alignment problems in Server 2008 ("Vista Server") and that probably also works for these 4k cluster drives.

    Comments
    1. By J.C. Roberts (jcr) on http://www.designtools.org

      > I don't know if Vista support for >512 byte sector alignment is necessarily a secret collaboration; sector alignment can be a big performance problem on SANs; Microsoft added support to detect and correct sector alignment problems in Server 2008 ("Vista Server") and that probably also works for these 4k cluster drives.

      Steve, you're either kind of missing the obvious, or you know something that is not widely known. The issue is, word 106 of the ATA IDENTIFY reply is where the advertisement of physical sector size *should* happen according to the ATA spec, but none of these consumer level drives implement that part of the spec; they return 0.

      There is most certainly a way to query these drives and read the physical sector size, but the method is unpublished. Additionally, each disk vendor might do it in a slightly different (unpublished) way, so you're looking at discovering and implementing a whole lot of secret sauce to do it correctly.

      We know the way to do it as published in the ATA spec, but the method being used by Microsoft and others is unpublished and unknown. If you can point us towards some docs (to prove there was no proprietary collaboration), it would be appreciated.
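
As an illustration of the published method, here is a rough sketch of decoding IDENTIFY word 106. The bit layout follows my reading of ATA8-ACS (bits 15:14 must read 01 for the word to be valid, bit 13 flags multiple logical sectors per physical sector, bits 3:0 hold a power-of-two exponent) and should be checked against the spec itself. The function name and the sample values are made up for the example; no real drive is queried.

```python
def logical_per_physical(word106):
    """Decode ATA IDENTIFY DEVICE word 106.

    Returns None if the word carries no valid information (for example
    the 0 returned by the drives discussed above), otherwise the number
    of logical sectors per physical sector.
    """
    # Bits 15:14 must be 01b for the contents to be valid.
    if (word106 >> 14) & 0x3 != 0x1:
        return None
    # Bit 13 clear: one logical sector per physical sector.
    if not word106 & (1 << 13):
        return 1
    # Bits 3:0 are the exponent n, meaning 2**n logical per physical.
    return 1 << (word106 & 0xF)


# Illustrative values only: an all-zero word, a valid word with bit 13
# clear, and a valid word advertising 2**3 logical sectors per physical.
for sample in (0x0000, 0x4000, 0x6003):
    print(hex(sample), logical_per_physical(sample))
```

A drive that returns 0 in this word gives software nothing to work with, which is the situation described above.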

      Comments
      1. By Anonymous Coward (Hello) on

        Hello,

        a quick observation: maybe Windows Vista/7 and Mac OS X don't actually support 4k sector drives per se, but simply don't care which drive they get installed on and create partitions on boundaries which are nice to 4k sector layout drives (and thus 512b sector drives)?

        A standard Windows 7 Professional x64 installation in VMware yields the following:

        # dmesg | grep sd0
        sd0 at scsibus1 targ 0 lun 0:  SCSI2 0/direct fixed
        sd0: 16384MB, 512 bytes/sec, 33554432 sec total
        
        # fdisk sd0 
        Disk: sd0	geometry: 2088/255/63 [33554432 Sectors]
        Offset: 0	Signature: 0xAA55
                    Starting         Ending         LBA Info:
         #: id      C   H   S -      C   H   S [       start:        size ]
        -------------------------------------------------------------------------------
        *0: 07      0  32  33 -     12 223  19 [        2048:      204800 ] NTFS        
         1: 07     12 223  20 -   2088 137  33 [      206848:    33345536 ] NTFS        
         2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
         3: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
        

        The first partition is the boot partition, starting at MB 1 with a size of exactly 100 MB.
        The second is the system partition, starting at MB 101 with a size of exactly 16282 MB, leaving 1 MB at the end of the disk.

        Therefore everything is neatly aligned.

        Same principle with the "Vista Server" Windows Server 2008 x64:

        # dmesg | grep sd0
        sd0 at scsibus1 targ 0 lun 0:  SCSI2 0/direct fixed
        sd0: 40960MB, 512 bytes/sec, 83886080 sec total
        
        # fdisk sd0
        Disk: sd0       geometry: 5221/255/63 [83886080 Sectors]
        Offset: 0       Signature: 0xAA55
                    Starting         Ending         LBA Info:
         #: id      C   H   S -      C   H   S [       start:        size ]
        -------------------------------------------------------------------------------
        *0: 07      0  32  33 -   5221 137  36 [        2048:    83881984 ] NTFS        
         1: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
         2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
         3: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
        

        No separate boot partition; leaving 1 MB at the end.
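
The megabyte figures quoted for both fdisk listings can be checked with a few lines of arithmetic. This is only a sketch: the start and size values are copied from the output above, the 4096 byte physical sector size is an assumption, and the labels are invented for readability.

```python
SECTOR = 512          # bytes per logical sector, from the dmesg lines
PHYSICAL = 4096       # assumed physical sector size
MIB = 1024 * 1024

# (label, start LBA, size in sectors), copied from the fdisk output
PARTITIONS = [
    ("win7 boot", 2048, 204800),
    ("win7 system", 206848, 33345536),
    ("server 2008", 2048, 83881984),
]


def describe(start, size):
    """Return (start in MiB, size in MiB, start and end both on 4k boundaries)."""
    start_bytes = start * SECTOR
    size_bytes = size * SECTOR
    aligned = start_bytes % PHYSICAL == 0 and size_bytes % PHYSICAL == 0
    return start_bytes / MIB, size_bytes / MIB, aligned


for label, start, size in PARTITIONS:
    print(label, describe(start, size))
```

Every start offset and every size comes out as a whole number of MiB, so each partition both begins and ends on a 4k boundary regardless of what the drive reports.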

        Someone please post numbers for a standard Mac OS X installation. Thanks.

        Comments
        1. By Anonymous Coward (Hello) on

          > which are nice to 4k sector layout drives (and thus 512b sector drives)?

          Sorry, being nice to 512b sector drives is of course nonsensical

  3. By Kami Petersen (kokamomi) kokamomi@gueststars.net on

    What kind of performance are we talking about?

    While hoping the industry will grow up, is it merely a matter of benchmarking new disks fdisked on sector 64 and then on sector 63 to find out if they've got secret goodness?

    Comments
    1. By J.C. Roberts (jcr) on http://www.designtools.org

      > What kind of performance are we talking about?
      >
      > While hoping the industry will grow up, is it merely a matter of
      > benchmarking new disks fdisked on sector 64 and then on sector 63 to
      > find out if they've got secret goodness?

      If it's not aligned, the performance is little more than 50% of what it would be if aligned properly.

      As for figuring out the secret goodness, yes, it's a "test-it-and-see" issue until a better way can be found. Unfortunately, even if you start at sector 64, you still have to get all the calculations right for your disklabel partitions to be properly aligned.
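
One way to keep the disklabel calculations straight is to round every partition start up to a multiple of eight 512 byte sectors. The sketch below does that, and also counts how many physical sectors a single 4k transfer covers. It assumes 4096 byte physical sectors with logical sector 0 on a physical boundary. The helper names are hypothetical, and the idea that a transfer straddling two physical sectors costs the drive extra work is a plausible explanation for the slowdown, not something measured here.

```python
RATIO = 8  # 512 byte logical sectors per assumed 4096 byte physical sector


def align_up(lba, ratio=RATIO):
    """Round lba up to the next multiple of ratio (unchanged if already aligned)."""
    return (lba + ratio - 1) // ratio * ratio


def physical_sectors_touched(start_lba, nsectors, ratio=RATIO):
    """Count the physical sectors covered by nsectors logical sectors from start_lba."""
    first = start_lba // ratio
    last = (start_lba + nsectors - 1) // ratio
    return last - first + 1


# A 4k filesystem block is 8 logical sectors long.
for start in (63, 64):
    print(start, align_up(start), physical_sectors_touched(start, 8))
```

Starting at sector 63, a 4k block spans two physical sectors; starting at sector 64, it fits in exactly one.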
