OpenBSD Journal

Developer blog: marco

Contributed by marco on from the IBM-cant-read-spec dept.

Jordan returned from traveling, and what better way to clear off the rust than to hack up some ACPI? He showed up at my house with lunch and started hacking on ACPI while I tried to figure out why the heck Bob and Theo's IBM 325 eServer doesn't do IPMI. We discovered a while ago what was crashing the box, so we worked around that; well, fixed it really, since it is a good idea to check the return value of bus_space_map(). The first real issue is simply that IBM's SMBIOS implementation violates the spec. The SMBIOS returns 01 15 20 ff 00000ca2 7f 04 in the IPMI structure. If we look at the structure it is returned in, we see:
typedef struct {
        u_int8_t        smipmi_if_type;         /* IPMI Interface Type */
        u_int8_t        smipmi_if_rev;          /* BCD IPMI Revision */
        u_int8_t        smipmi_i2c_address;     /* I2C address of BMC */
        u_int8_t        smipmi_nvram_address;   /* I2C address of NVRAM
                                                 * storage */
        u_int64_t       smipmi_base_address;    /* Base address of BMC (BAR
                                                 * format) */
        u_int8_t        smipmi_base_flags;      /* Flags field:
                                                 * bit 7:6 : register spacing
                                                 *   00 = byte
                                                 *   01 = dword
                                                 *   02 = word
                                                 * bit 4 : Lower bit BAR
                                                 * bit 3 : IRQ valid
                                                 * bit 2 : N/A
                                                 * bit 1 : Interrupt polarity
                                                 * bit 0 : Interrupt trigger
                                                 */
        u_int8_t        smipmi_irq;             /* IRQ if applicable */
} __packed smbios_ipmi_t;
So there are two things wrong here. First, look at the base address: per the spec, if it is IO and not memory mapped, the value shall be odd. The value returned here, 0xca2, is even, yet the IBM box uses IO and not memory mapped IO (this was determined after MANY reboots!). Also wrong is the register spacing. It is set to 0x01 but really should have been 0x00, or else the calculated offset of the other register will be wrong. In this case we would be poking at 0xca4 instead of 0xca3. Another nice thing is that bit 2 is set; the spec explicitly prohibits setting reserved bits.
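To make those complaints concrete, here is a minimal C sketch (not the actual ipmi(4) code) that runs the same three checks on the dumped values. The struct and function names are hypothetical, and the mapping of the dump to fields (base address 0x00000ca2, flags byte 0x7f) is an assumption based on the field order of the structure above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; not part of any OpenBSD header. */
struct smipmi_report {
    bool     claims_io;   /* base address is odd, i.e. marked as IO */
    unsigned spacing;     /* raw register spacing field, bits 7:6 */
    bool     bit2_set;    /* bit 2 is listed as N/A and should be 0 */
};

/*
 * Inspect the base address and flags byte of the SMBIOS IPMI record.
 * The bit positions follow the comments in the structure quoted above.
 */
void
smipmi_inspect(uint64_t base_address, uint8_t base_flags,
    struct smipmi_report *out)
{
    assert(out != NULL);

    /* Per the post: an IO (not memory mapped) base address shall be odd. */
    out->claims_io = (base_address & 1) != 0;

    /* Bits 7:6 hold the register spacing (00 = byte, 01 = dword, 02 = word). */
    out->spacing = (base_flags >> 6) & 0x3;

    /* Bit 2 is marked N/A in the structure comment. */
    out->bit2_set = (base_flags & (1u << 2)) != 0;
}
```

Calling smipmi_inspect(0x00000ca2, 0x7f, &r) on the values from this box reports a base address that does not claim IO, a spacing field of 1 rather than 0, and bit 2 set, which lines up with the problems described above.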

The MSI board (the BMC that talks IPMI) is some sort of Taiwanese board that IBM dropped into this server. I found that out while hunting for Linux or some other code to use as a reference. We did find an absolutely awfully written pile-of-poo driver. After reading through some of that code I understand how the spec could have been completely misinterpreted. Clearly there was some sort of cranium deficiency at work here.

So I am all happy, jumping up and down that we are no longer crashing and are getting values out of the BMC, only to see that it is not working reliably. Damn it, back to the drawing board. During this failure I see some familiar error mechanisms, so I go back to the timeout code that I wrote a few weeks ago and sure enough, there it was. The IO mechanism is several times slower than the memory mapped IO equivalent, so the timeout values were off. Ah, at least one easy fix :-)

I resume the jumping up and down activity, only to shortly run into the next snafu. The IPMI poll seems hung; no values are being updated. Argh!! This is getting old, now what?
By now Jordan is being summoned, so he leaves without committing any code (I'll blog about his activities later), and I receive an NMI of the SIGWIFE type. Dinner, movie, etc.

A very long movie later I resume hacking on this thing. I added a whole bunch of debug goo to the driver, basically to see that I had been overzealous in my previous timeout commit: I did fix the cold (during boot) timeout code but screwed up the normal timeout path. I fix this to get ready for the next disappointment; it is still not working. Now I start disabling random devices that poke into IO space and magically IPMI starts working. Many reboots later I figure out that it is the nsclpcsio and gpio drivers that are causing this. This is Alexander's stuff, and it was too late for me to look at it and too early for him to be awake.

In the morning I found Alexander on ICB and talked to him about it, and he went off and confirmed and fixed the bug. Now I have all the pieces to create a fix for i386 on this box. Some cleanup later the code goes in. There are still two things for me to look at on this box. First, this needs to be validated on amd64 as well; second, the fans are reporting 0 RPM, so there is still something broken. More on this later.



Comments
  1. By Anonymous Coward (84.92.159.114) on

    Filthy, filthy hacking. Makes me quite angry that this kind of dicking around is necessary to make stuff work. :(

    Massive kudos to you guys for sticking with this bullshit.

  2. By Daniel Melameth (208.139.201.73) daniel@melameth.com on

    Thanks for the regular dose of OpenBSD developer blogging! It gives us less C-skilled folks (last time I touched C was in college) a greater appreciation for the OS we admire!

    Comments
    1. By Anonymous Coward (80.202.46.38) on

      I second that! :-)

  3. By anon et. al. (80.213.132.8) on

    ...I found that out while hunting for Linux or some other code to use as a reference. We did find some absolutely awfully written pile of poo driver.

    code found in the Linux kernel then, I guess

    Comments
    1. By Marco Peereboom (143.166.226.19) marcp@peereboom.us on

      Nope, from the hardware manufacturer of the MSI board. Found the link through the IBM website.

  4. By Anonymous Coward (24.34.57.27) on

    Hey Marco, any chance adding a brief subject or topic category in the title of your blog posts? There's a lot of "Developer blog: marco"'s now. ;)
