Contributed by dlg on from the omg-scsi-in-your-laptop dept.
ahci(4) is a driver supporting hardware conforming to the Advanced Host Controller Interface for SATA. The hardware is becoming more and more available, and soon it may be necessary to have this driver to use newer machines.
I started the driver in the tree about 3 months ago, but I was stuck on getting the SATA ports initialised. Since then Chris Pascoe (pascoe@) got involved, and he has now got it to the point where it's doing IO. Since it's basically working now on the JMicron SATA controllers, it was enabled in GENERIC on i386 and amd64 machines.
Edit: The change is in the tree now, with code to make some of the SATA drivers attach to this driver instead, making some SATA devices appear as sdX instead of wdX. In case this really isn't what you want on your -current boxes, you can always use config(8)/boot -c into UKC> and from there "disable ahci" on the kernel to revert to the old driver. Remember to update your /etc/fstab if this does apply to your machine.
ahci(4) is important for a couple of reasons, the main one I've already covered: there is hardware coming out now which we'd like to support. That isn't a very surprising reason to work on something, and I'm sure it's not something I have to explain. The other reason I wanted to write a driver for ahci is less to do with AHCI hardware itself, and more to do with pciide(4), wdc(4), and wd(4).
pciide(4) is the driver that attaches to basically all the PCI IDE controllers we support, and wdc(4) supports all the other IDE controllers. Actually, pciide(4) and wdc(4) are mixed together a bit so it's really one and a half drivers. The drivers for devices on an ATA bus (wd(4), and atapiscsi(4)) attach directly to wdc(4) and pciide(4).
This is a bit different to the SCSI stuff, where there are maybe 40 to 50 drivers in the tree for all the different SCSI controllers out there (and we don't support all of them yet). We also have the scsibus(4) device sitting between the drivers for the 40ish controllers we have and the drivers for our SCSI devices. It simply provides a layer of abstraction and management between the devices that generate a SCSI command, and those that put the commands on the bus.
Because of this abstraction it is also extremely easy to emulate SCSI on top of devices that don't actually understand SCSI commands. Examples of this would be ami(4), gdt(4) and the sdmmc stuff. None of those devices actually understand SCSI, they just take a SCSI command via the scsibus(4) and translate it into something appropriate to the hardware they sit on top of.
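To make the layering concrete, here is a minimal sketch of the idea in C. It is not the real scsibus(4) interface: every name in it (`toy_cmd`, `toy_adapter`, `toy_bus_submit` and friends) is made up for illustration. The point is only that the device driver builds a command, the bus layer hands it to whichever controller is attached, and the controller is free to either pass it to real SCSI hardware or translate it into something else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified command; not OpenBSD's scsi_xfer. */
struct toy_cmd {
	uint8_t  opcode;	/* SCSI opcode, e.g. 0x28 for READ(10) */
	uint32_t lba;		/* starting block */
	uint16_t nblocks;	/* number of blocks */
	int      status;	/* set by the controller, 0 means ok */
};

/* What every controller driver hands to the bus layer (hypothetical). */
struct toy_adapter {
	const char *name;
	int (*submit)(struct toy_adapter *, struct toy_cmd *);
	unsigned native_cmds;	/* commands given to real SCSI hardware */
	unsigned emulated_cmds;	/* commands translated for non-SCSI hardware */
};

/* A controller that really speaks SCSI just puts the command on the bus. */
int
real_scsi_submit(struct toy_adapter *a, struct toy_cmd *c)
{
	a->native_cmds++;
	c->status = 0;
	return 0;
}

/* A controller that does not speak SCSI translates the command instead. */
int
emulating_submit(struct toy_adapter *a, struct toy_cmd *c)
{
	a->emulated_cmds++;	/* translation to the native format goes here */
	c->status = 0;
	return 0;
}

/* The bus layer: it knows nothing about the hardware behind the adapter. */
int
toy_bus_submit(struct toy_adapter *a, struct toy_cmd *c)
{
	assert(a != NULL && a->submit != NULL && c != NULL);
	return a->submit(a, c);
}

/* The disk driver: it generates SCSI commands and nothing else. */
int
toy_disk_read(struct toy_adapter *a, uint32_t lba, uint16_t nblocks)
{
	struct toy_cmd c = { .opcode = 0x28, .lba = lba,
	    .nblocks = nblocks, .status = -1 };

	if (toy_bus_submit(a, &c) != 0)
		return -1;
	return c.status;
}
```

The disk driver code is identical whichever `submit` function sits behind the adapter, which is the property that makes emulation cheap.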
Anyway, back to ATA stuff. The model for pciide/wdc/wd/atapiscsi sort of makes sense since all IDE controllers are fairly similar to each other, and this is reflected by how they work. There's a generic core for the standard IDE controller state machine, but with the option to provide exceptions to the core functionality for specific chipsets.
The problem with this is that the need for exceptions on more modern controllers is becoming the norm, rather than the exception. This causes the IDE drivers to become more and more complicated to cope with this new hardware. On top of this, ATA gear is starting to get some smarter characteristics that look more and more like how SCSI works. Things like Native Command Queuing in newer SATA disks look suspiciously like the tagged queuing stuff SCSI has had for years.
To deal with the extra complication that newer controllers are bringing to pciide, an obvious solution would be to restructure these drivers to provide some level of abstraction. An obvious model to follow would be the one we use for SCSI. There have been discussions for a long time about how to do this. The first idea was to slowly rework the IDE code and get it to a point where we could introduce an atabus(4) between pciide(4)/wdc(4) and wd(4)/atapiscsi(4). This idea sounds good, but in the real world no one was willing to step up and spend the time working with the behemoth that is the pciide/wdc code. It's some very twisty code, and whoever worked on it would have had to modify how it works without breaking it. A lot of people use this hardware (this is an understatement), and potentially breaking their disks is scary.
The next idea (and my favourite for a long time) was to implement atabus(4) as a separate codebase to the existing IDE drivers. As a start it would only support a couple of controllers, but again, that would be separate to the existing code. The reasoning behind this is you can work on something without breaking existing device support. Once it got to a point where it was usable, we could just switch the drivers for that controller from pciide to the new atabus based code.
I eventually realised that this approach would be simply copying how scsibus(4) works and changing the structures that represent the command from SCSI commands to ATA frames. marco@ had been arguing with me saying that scsibus(4) has all the semantics we wanted in a modern pciide replacement, such as the abstraction between the disks and the controllers, and things like tagged queuing and hotplug. We also already emulate SCSI on a lot of devices that actually have no knowledge of SCSI, so why not do the same for IDE?
Thus ahci(4) and the atascsi layer were born.
The language that scsibus(4) talks is encapsulated in a struct called "scsi_xfer". It contains the SCSI command, pointers to the buffers containing the data you want transferred between the computer and the SCSI device, pointers to which driver is responsible for the buffers, timeouts for handling device errors, flags for how the controller should operate on this command, and so on. For devices that don't understand those SCSI commands (e.g. ami and sdmmc) we translate them in the controller's driver.
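As a rough picture of what such a transfer descriptor carries, here is a sketch in C. It is deliberately not the real scsi_xfer definition: the struct `toy_scsi_xfer`, its field names, the `TOY_XFER_*` flags and the 10 second timeout are all assumptions chosen for illustration. Only the READ(10) command layout (opcode 0x28, big-endian LBA in bytes 2 to 5, big-endian block count in bytes 7 and 8) comes from the SCSI command set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flags describing how the controller should run the command. */
#define TOY_XFER_READ	0x01	/* data flows from the device to the host */
#define TOY_XFER_WRITE	0x02	/* data flows from the host to the device */

/* Hypothetical stand-in for a scsi_xfer; the field names are made up. */
struct toy_scsi_xfer {
	uint8_t   cdb[16];	/* the SCSI command itself */
	size_t    cdblen;	/* how many bytes of cdb are valid */
	void     *data;		/* buffer to transfer to or from */
	size_t    datalen;	/* size of that buffer in bytes */
	void     *owner;	/* driver responsible for the buffer */
	int       timeout_ms;	/* when to give up on the device */
	unsigned  flags;	/* TOY_XFER_* */
	int       error;	/* completion status, 0 means ok */
};

/* Fill in a transfer describing a READ(10) of nblocks blocks at lba. */
void
toy_xfer_read10(struct toy_scsi_xfer *xs, uint32_t lba, uint16_t nblocks,
    void *buf, size_t blksize)
{
	assert(xs != NULL && buf != NULL);

	memset(xs, 0, sizeof(*xs));
	xs->cdb[0] = 0x28;			/* READ(10) */
	xs->cdb[2] = (uint8_t)(lba >> 24);	/* LBA, most significant first */
	xs->cdb[3] = (uint8_t)(lba >> 16);
	xs->cdb[4] = (uint8_t)(lba >> 8);
	xs->cdb[5] = (uint8_t)lba;
	xs->cdb[7] = (uint8_t)(nblocks >> 8);	/* transfer length */
	xs->cdb[8] = (uint8_t)nblocks;
	xs->cdblen = 10;
	xs->data = buf;
	xs->datalen = (size_t)nblocks * blksize;
	xs->timeout_ms = 10000;			/* arbitrary example value */
	xs->flags = TOY_XFER_READ;
}
```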
If we were only going to have ahci(4) attach a scsibus and do the translation, then leaving the translation in ahci itself would have been ok. However, we intend to do the same thing to other controllers in the future, so it makes sense to split this SCSI to ATA translation layer out, so that's what I did. I implemented an equivalent to a scsi_xfer called an ata_xfer, which is basically the same thing except it carries a representation of an ATA command instead of a SCSI one, and a few more flags for ATA specific things.
This means that IDE drivers can be structured similarly to our SCSI drivers, except instead of attaching a scsibus directly, they hook an instance of the ATA to SCSI translation layer, and instead of chewing on scsi_xfers, they take ata_xfers. The code that is responsible for the attachment of scsibus to an ATA driver, and for translating the two types of xfers, is a couple of files in src/sys/dev/ata called atascsi.c and atascsi.h.
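To give a feel for what translating one type of xfer into the other involves, here is a small sketch in C that turns a SCSI READ(10) or WRITE(10) command block into a 48-bit ATA DMA command. This is not the code in atascsi.c; the struct `toy_ata_cmd`, the function `toy_scsi_to_ata` and its return values are hypothetical. The opcodes are the standard ones (READ(10) 0x28, WRITE(10) 0x2a, READ DMA EXT 0x25, WRITE DMA EXT 0x35).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCSI_READ_10		0x28
#define SCSI_WRITE_10		0x2a
#define ATA_READ_DMA_EXT	0x25
#define ATA_WRITE_DMA_EXT	0x35

/* Hypothetical, simplified ATA command; not OpenBSD's ata_xfer. */
struct toy_ata_cmd {
	uint8_t  command;	/* ATA command register value */
	uint64_t lba;		/* 48-bit logical block address */
	uint16_t count;		/* sector count */
};

/*
 * Translate a 10 byte SCSI CDB into an ATA command.
 * Returns 0 when ata was filled in, 1 when the SCSI command asks for
 * zero blocks (nothing to send to the disk), -1 for unsupported opcodes.
 */
int
toy_scsi_to_ata(const uint8_t *cdb, struct toy_ata_cmd *ata)
{
	uint32_t lba;
	uint16_t count;

	assert(cdb != NULL && ata != NULL);

	switch (cdb[0]) {
	case SCSI_READ_10:
		ata->command = ATA_READ_DMA_EXT;
		break;
	case SCSI_WRITE_10:
		ata->command = ATA_WRITE_DMA_EXT;
		break;
	default:
		return -1;
	}

	/* Both fields are big-endian in the CDB. */
	lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
	    ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	count = (uint16_t)((cdb[7] << 8) | cdb[8]);

	/*
	 * A transfer length of 0 in READ(10)/WRITE(10) means "no blocks",
	 * but a sector count of 0 in a 48-bit ATA command means 65536
	 * sectors, so the two must not be passed through unchanged.
	 */
	if (count == 0)
		return 1;

	ata->lba = lba;
	ata->count = count;
	return 0;
}
```

The zero-length case is a small example of why the translation is more than a table lookup: the two command sets disagree about what the same number means.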
ahci(4) itself is actually pretty boring if you can get over the complexity of dealing with the hardware. All the SCSI hardware I've worked on deals with the init of the bus and devices on the bus for me. I simply init the chip, give it some memory, and start pushing commands down. AHCI is different, the hardware itself is incredibly dumb, which means the driver has to be very smart. It is responsible for initialising the port, the phy, getting the device on the end to wake up, stopping the controller on error and asking the device what went wrong, and so on and so on. I wasn't coping with that too well, so I asked pascoe@ to look into it and he managed to figure it out and then basically finished the driver.
Fundamentally though, ahci(4) works much like our SCSI drivers now, which is the goal we were aiming for. It simply inits the hardware and hooks up with atascsi. That in turn attaches a scsibus(4), and then takes scsi_xfers from the midlayer and emulates a SCSI disk by talking ata_xfers to ahci(4). It also passes through ATAPI commands to devices that support them (like SATA cd/dvd drives). This means we've cut out wd, atapiscsi, pciide, and all the wdc code. This all happens in about 1000 lines of code in atascsi.
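One piece of emulating a SCSI disk is answering INQUIRY on behalf of a device that only knows ATA IDENTIFY. The sketch below shows one way that could be done, under the assumption that the vendor field is hardcoded to "ATA" and the product and revision come from the IDENTIFY model and firmware strings. It is not taken from atascsi.c, and the function names are hypothetical. The IDENTIFY word offsets (firmware in words 23 to 26, model in words 27 to 46, first character in the high byte of each word) and the INQUIRY field offsets (vendor at byte 8, product at byte 16, revision at byte 32) are the standard ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy an ATA IDENTIFY string into a fixed-width, space padded field.
 * Each 16-bit word holds two characters, the first one in the high byte.
 */
void
toy_ata_string(const uint16_t *words, size_t nwords, char *dst, size_t dstlen)
{
	size_t i, n = 0;

	memset(dst, ' ', dstlen);
	for (i = 0; i < nwords && n < dstlen; i++) {
		dst[n++] = (char)(words[i] >> 8);
		if (n < dstlen)
			dst[n++] = (char)(words[i] & 0xff);
	}
}

/* Build a 36 byte standard INQUIRY response from IDENTIFY data. */
void
toy_emulate_inquiry(const uint16_t identify[256], uint8_t inq[36])
{
	assert(identify != NULL && inq != NULL);

	memset(inq, 0, 36);
	inq[0] = 0x00;		/* peripheral device type: direct access */
	inq[2] = 0x02;		/* claim SCSI-2 */
	inq[3] = 0x02;		/* response data format */
	inq[4] = 36 - 5;	/* additional length */

	/* Assumption: the vendor field is a fixed string. */
	memcpy(&inq[8], "ATA     ", 8);
	/* Product: first 16 characters of the 40 character model. */
	toy_ata_string(&identify[27], 20, (char *)&inq[16], 16);
	/* Revision: first 4 characters of the 8 character firmware. */
	toy_ata_string(&identify[23], 4, (char *)&inq[32], 4);
}
```

Fed a disk whose model string is ST3320620AS and whose firmware is 3.AA, this produces the vendor, product and revision triple "ATA", "ST3320620AS", "3.AA".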
As a side note, Linux and Solaris do something similar to atascsi too. The Linux emul layer (libata) is about 13000 lines.
So that's the story of ahci(4). Well, it was more the story of atascsi and how it let us write ahci(4) like a SCSI driver. I spose you all want to see what it looks like now. This is the JMicron controller on my dev box:
ahci0 at pci4 dev 0 function 0 "JMicron JMB361 IDE/SATA" rev 0x02: AHCI 1.0: irq 11
scsibus0 at ahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0: <ATA, ST3320620AS, 3.AA> SCSI2 0/direct fixed
sd0: 305245MB, 305245 cyl, 64 head, 32 sec, 512 bytes/sec, 625142448 sec total
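The numbers on the sd0 line can be checked with a little arithmetic. With 64 heads, 32 sectors per track and 512 byte sectors, one cylinder is exactly 1MB, which is why the size in MB and the cylinder count come out the same. The sketch below redoes that calculation; the function and struct names are hypothetical, and it assumes the cylinder count is simply the total sector count divided by heads times sectors, rounded down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for a made-up CHS geometry. */
struct toy_geom {
	uint64_t cyls;
	uint32_t heads;
	uint32_t secs;
};

/* Derive a cylinder count from a sector total and a chosen heads/sectors. */
struct toy_geom
toy_fake_geometry(uint64_t total_sectors, uint32_t heads, uint32_t secs)
{
	struct toy_geom g;

	assert(heads > 0 && secs > 0);
	g.heads = heads;
	g.secs = secs;
	g.cyls = total_sectors / ((uint64_t)heads * secs);	/* rounds down */
	return g;
}

/* Size in MB (1024 * 1024 bytes), rounded down. */
uint64_t
toy_size_mb(uint64_t total_sectors, uint32_t bytes_per_sector)
{
	return total_sectors * bytes_per_sector / (1024 * 1024);
}
```

For 625142448 sectors this gives 305245 cylinders and 305245MB with 64 heads and 32 sectors, matching the dmesg line. The same disk described with 255 heads and 63 sectors would have 38913 cylinders.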
We are working on supporting other controllers with ahci(4), such
By Anonymous Coward (85.178.127.212) on
Will this get supported too.....?
Comments
By David Gwynne (dlg) on
> Will this get supported too.....?
It is actually related. Our SCSI layer can do hotplug, and that's another thing I didn't have to reimplement in an ATA layer. It's something we want to do, but maybe after we get some broader controller support.
By gsson (83.249.112.165) gsson@fnord.se on http://fnord.se
Also, power-management still has no proper support in libata [1]. Will this be included from the start in atascsi?
I have a box that can boot in AHCI mode (some Intel ICH thing), so if you've got something you want me to try you only have to ask.
[1] http://linux-ata.org/software-status.html#pm
Comments
By gsson (83.249.112.165) on http://fnord.se
Comments
By Johan Torin (jtorin) on
That's why you log in. :)
Comments
By Henrik Gustafsson (gsson) on http://fnord.se/
>
> That's why you log in. :)
Thanks for reminding me I had an account :)
By David Gwynne (dlg) on
uhm...
i have a diff that lets atactl(8) still talk to disks behind atascsi, and its only 150ish lines long. funnily enough supporting atactl(8) in atascsi actually requires less code than it is in the old school wd/wdc/pciide stuff. it's easier to read too. despite that im not sure this is something we should be doing.
is that the kind of functionality you're referring to though?
> Also, power-management still has no proper support in libata [1]. Will this be included from the start in atascsi?
i'm only interested in getting the disks to do the right thing over a suspend and resume of the system.
> I have a box that can boot in AHCI mode (some Intel ICH thing), so if you've got something you want me to try you only have to ask.
k, thanks.
Comments
By Henrik Gustafsson (gsson) on http://fnord.se/
>
> uhm...
>
> i have a diff that lets atactl(8) still talk to disks behind atascsi, and its only 150ish lines long. funnily enough supporting atactl(8) in atascsi actually requires less code than it is in the old school wd/wdc/pciide stuff. it's easier to read too. despite that im not sure this is something we should be doing.
>
> is that the kind of functionality you're referring to though?
>
As long as any command that would have been supported in wd-mode is still supported in sd-mode that's exactly what I am referring to :)
Great news!
By Anonymous Coward (87.78.89.11) on
Thanks, very much appreciated.
Now i just have to get a clue where softraid.c work is headed.
Comments
By Anonymous Coward (87.78.110.66) on
Looks clean and in combination with ahci very interesting.
By Anonymous Coward (74.60.252.128) on
Comments
By Renaud Allard (renaud) on
In this case, don't waste the forum space, just do "lynx http://www.whatismyip.org"
By Renaud Allard (renaud) renaud@llorien.org on
+sd0 at scsibus0 targ 0 lun 0: <ATA, HTS541010G9SA00, MBZI> SCSI2 0/direct fixed
Extremely nice, but I think this will confuse many people. I don't even mention upgrades...
The idea behind making all hard drives behave the same way at the API level is excellent. However, I think that what was called wd0 should stay wd0 and not migrate to sd0. Or you should create a new dev (ie: hd0), this would still confuse people on upgrades, but not to the point that they will have to ask themselves if their device is now a sd or still a wd.
Comments
By Jeff Quast (dingo) on
> +sd0 at scsibus0 targ 0 lun 0: <ATA, HTS541010G9SA00, MBZI> SCSI2 0/direct fixed
>
> Extremely nice, but I think this will confuse many people. I don't even mention upgrades...
This is not beyond the ability of openbsd users. www.openbsd.org/faq/upgradeXX.html will explain it fine.
This sort of attitude is what has sent many OSes into the dinosaur age...
By tedu (71.139.164.105) on
so don't use ahci
By David Gwynne (dlg) on
> +sd0 at scsibus0 targ 0 lun 0: <ATA, HTS541010G9SA00, MBZI> SCSI2 0/direct fixed
>
> Extremely nice, but I think this will confuse many people. I don't even mention upgrades...
if you're using -current then id expect you to be able to cope with this sort of change. for the next release it will be documented in the upgrade notes, which again i'd expect the user to be able to cope with.
> The idea behind making all hard drives behave the same way at the API level is excellent. However, I think that what was called wd0 should stay wd0 and not migrate to sd0. Or you should create a new dev (ie: hd0), this would still confuse people on upgrades, but not to the point that they will have to ask themselves if their device is now a sd or still a wd.
this isn't as simple as it sounds. providing a hd device would mean attaching another abstracted device on top of any storage devices we have, which in turn would mean that ALL disks change name. that sounds like more of an impact on our users than a similar change just for people who suddenly have ahci attach instead of pciide.
Comments
By Lennie (82.75.23.183) on
> > +sd0 at scsibus0 targ 0 lun 0: <ATA, HTS541010G9SA00, MBZI> SCSI2 0/direct fixed
> >
> > Extremely nice, but I think this will confuse many people. I don't even mention upgrades...
>
> if you're using -current then id expect you to be able to cope with this sort of change. for the next release it will be documented in the upgrade notes, which again i'd expect the user to be able to cope with.
>
> > The idea behind making all hard drives behave the same way at the API level is excellent. However, I think that what was called wd0 should stay wd0 and not migrate to sd0. Or you should create a new dev (ie: hd0), this would still confuse people on upgrades, but not to the point that they will have to ask themselves if their device is now a sd or still a wd.
>
> this isn't as simple as it sounds. providing a hd device would mean attaching another abstracted device on top of any storage devices we have, which in turn would mean that ALL disks change name. that sounds like more of an impact on our users than a similar change just for people who suddenly have ahci attach instead of pciide.
If you say Storage Device instead of SCSI device, it works just fine. ;-)
Although that's maybe a bit strange for a SCSI-scanner.
By Renaud Allard (renaud) on
> > +sd0 at scsibus0 targ 0 lun 0: <ATA, HTS541010G9SA00, MBZI> SCSI2 0/direct fixed
> >
> > Extremely nice, but I think this will confuse many people. I don't even mention upgrades...
>
> if you're using -current then id expect you to be able to cope with this sort of change. for the next release it will be documented in the upgrade notes, which again i'd expect the user to be able to cope with.
>
> > The idea behind making all hard drives behave the same way at the API level is excellent. However, I think that what was called wd0 should stay wd0 and not migrate to sd0. Or you should create a new dev (ie: hd0), this would still confuse people on upgrades, but not to the point that they will have to ask themselves if their device is now a sd or still a wd.
>
> this isn't as simple as it sounds. providing a hd device would mean attaching another abstracted device on top of any storage devices we have, which in turn would mean that ALL disks change name. that sounds like more of an impact on our users than a similar change just for people who suddenly have ahci attach instead of pciide.
My biggest problem with the change is remote upgrades, you will have to be really sure of how your device will be called after you install the new kernel and before the reboot. If every storage device was called the same, this wouldn't be a problem.
By frantisek holop (165.72.200.11) on
Comments
By Matthew R. Dempsky (15.227.137.69) on
atapiscsi0 at pciide0 channel 1 drive 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <ATAPI-CD, ROM-DRIVE-52MAX, 52PP> SCSI0 5/cdrom removable
I could look at cd(4) or scsibus(4) and try to intentionally confuse myself, or I can look at atapiscsi(4) and realize that my ATAPI CD-ROM is just being reported as a SCSI CD-ROM.
It would be no different if s/pciide/ahci/ and s/atapiscsi/atascsi/ (and maybe s/cd/sd/).
Comments
By frantisek holop (165.72.200.11) on
By Anonymous Coward (198.175.14.193) on
> disk is hardly an scsi disk (but i can't prove it to you,
> because atactl does not work)... for me, this was confusing
> it is a different thing. please call it differently.
>
Who cares what it's called, that's not going to make atactl work on it.
The problem is that you need to hack atactl, atascsi, and possibly sd and scsi(4) to pass the data properly. Try doing that, and then ask if you can rename other drivers.
Comments
By David Gwynne (dlg) on
i dont understand why people are so attached to atactl. what does it provide that you need?
Comments
By Henrik Gustafsson (gsson) on http://fnord.se/
>
> i dont understand why people are so attached to atactl. what does it provide that you need?
For me it's access to SMART and power management features.
By frantisek holop (165.72.200.11) on
By Miod Vallat (miod) on
No, it does not make it a SCSI device.
> i like openbsd because of its clarity and spartanism.
> my personal opinion is, that devices using the scsi
> layer are not scsi devices themselves and thus should
> be called differently: my seagate usb2 external ata hard
> disk is hardly an scsi disk (but i can't prove it to you,
> because atactl does not work)... for me, this was confusing
> from the beginning. (not that linux does this better,
> last time i used knoppix, the disk was mapped to sda)
This is because even though the disk itself is ata, the usb mass storage protocol uses scsi commands (which are translated back to ata commands by the usb logic in your device enclosure). For the kernel, it's yet-another-device-talking-SCSI, and it does not need to know more. Actually, the kernel CAN NOT know what actual technology the disk uses. It could very well be a real SCSI disk, or a non-ATA disk, this is completely hidden by the usb mass storage protocol.
> it is a different thing. please call it differently.
Even if the kernel had a magic way to know what kind of duck your disk is, it won't be named differently.
One of the reasons disk devices have different names (instead of all of them being, say, `dk0', `dk1', etc, regardless of their actual technology) is that their drivers, in addition to the common disk drive behaviour, implement specific ioctls.
When one of your disks shows up as `sd', this tells you that, among other things, you can issue the sd-specific ioctls documented in sd(4). And for this to work, your device nodes in /dev need to be different from the `wd' device nodes.
I agree that the documentation needs to change to explain that not all devices driven by sd(4) are genuine SCSI disks, though.
Comments
By frantisek holop (165.72.200.11) on
my only pain at this point (guess i can live without atactl) is the
drive geometry of the umass storage devices.
(http://marc.info/?l=openbsd-misc&m=117019844812327&w=2,
it gets a bit more to the point after some mickey stuff)
how will the ahci driver cope with this?
partitions created on umass disks with these made up geometries
make other systems cry (i386, CHS boundaries).
Kenneth argues (http://marc.info/?l=openbsd-misc&m=117137332715849&w=2),
that all systems use fake geometries, but windows is quite
consistent in this surprisingly.
all the wd disks use */255/63 fake geometry, but the umass
disks are given a hardcoded */64/32 fake geometry, and that
is maybe good for an mp3 player, but makes partitions unusable.
i think it might be worth a shot to see if "fixed" umass
devices can report their geometry correctly (as opposed to "removable")
sd0 at scsibus1 targ 1 lun 0: <WD, 1600BEVExternal, 1.02> SCSI0 0/direct fixed
or at least be able to tell fdisk the "real" fake geometry.
but it rejects emulated CHS values :-(
but this is getting a bit off topic.
By Miod Vallat (miod) on
Ah, that can't be, the name ``hd'' is already used for other kinds of disks on hp300 and vax (-:
Comments
By David Gwynne (dlg) on
heh, we can fix that. we'll just emulate scsi some more.
By Anonymous Coward (208.191.177.19) on
Comments
By Renaud Allard (renaud) on
>
>
Good point :) It was just an example. Something that could be funny would be sd(n) for Storage Device (or Secure Device since OpenBSD is there :)). Then every storage device would be an sd :)
By Anonymous Coward (204.9.40.20) on
I believe that SAS port multipliers are very different beasties, but the conceptual problem may be similar.
Comments
By David Gwynne (dlg) on
JMicron is keen to donate some so we can get SATA port multipliers working. Considering how long it took pascoe to get ATAPI working, it shouldnt take too long for PM support to appear.
> I believe that SAS port multipliers are very different beasties, but the conceptual problem may be similar.
Yes, they are different. They're totally transparent to the driver since the hardware is smart enough to figure things out. You don't even see them unless you go looking for them in the SAS world.
SATA is dumb, SAS is smart.
Comments
By Anonymous Coward (70.109.50.2) on
>
> JMicron is keen to donate some so we can get SATA port multipliers working. Considering how long it took pascoe to get ATAPI working, it shouldnt take too long for PM support to appear.
That's good to hear.
How have the JMicron chips and how has the vendor been in terms of bugs? Have they been good to work with and forthcoming with info? From what I'm reading here, it sounds like I need to go buy some JMicron SATA HBAs...
> Yes, they are different. They're totally transparent to the driver since the hardware is smart enough to figure things out. You don't even see them unless you go looking for them in the SAS world.
>
> SATA is dumb, SAS is smart.
Not at all surprising.
Comments
By David Gwynne (dlg) on
They've been very enthusiastic about helping us out. As far as I know they contacted us first, we didn't have to ask them for help. Now that we have ahci in place they are trying to help us go further with the port multipliers and maybe a way to understand their RAID format.
They've been great, and their cards work great too.
By cron (142.176.233.194) on
Comments
By David Gwynne (dlg) on
SiI chips are used on a lot of generic cards, so you generally "get lucky" when you buy a no name SATA 2 controller. Some retailers list which chips are used on the controller, which is how jsg@ found a controller for us. It was a Skymaster part I think, but I can't remember for sure.
As for docs: http://lkml.org/lkml/2006/1/25/134
By Anonymous Coward (84.168.156.191) on
Comments
By Jeff Quast (dingo) on
You're an idiot. Would you rather have a broken driver handling your sensitive data?? Your beef is not with openbsd, but with adaptec.
Comments
By sthen (85.158.44.149) on
Check if it's still broken in -current (there's a change since 4.0 that may or may not be relevant). If so, try disabling iopsp (see http://archives.neohapsis.com/archives/openbsd/2007-02/1094.html), and sendbug with all the usuals (dmesg, proper description and so on) from a valid email address (so if somebody's interested in fixing it they can get back to you to test things). If you aren't interested enough to do this, I'm not sure if you deserve a working driver (-:
> You're an idiot. Would you rather have a broken driver handling your sensitive data?? Your beef is not with openbsd, but with adaptec.
I2O is iop(4), not the undocumented aac(4) that was removed from GENERIC. I2O "should" be a device-independent interface (done to a specification - not Adaptec's), the translation to the actual device's internals being done on the card itself.
Comments
By Anonymous Coward (84.168.144.92) on
>Check if it's still broken in -current (there's a change since 4.0 that may or may not be relevant). If so, try disabling iopsp (see http://archives.neohapsis.com/archives/openbsd/2007-02/1094.html), and sendbug with all the usuals (dmesg, proper description and so on) from a valid email address (so if somebody's interested in fixing it they can get back to you to test things). If you aren't interested enough to do this, I'm not sure if you deserve a working driver (-:
Nice, thank you for that help! I will fight the sendbug challenge! ;-)