OpenBSD Journal

Hardware accelerated AES/HMAC-SHA on octeons

Contributed by Janne Johansson from the new-mips-on-the-clock dept.

In this commit, visa@ added code (disabled for now) to use the built-in crypto acceleration on octeon CPUs, much like AES-NI on x86.

I decided to test tcpbench(1) and IPsec, before and after updating and enabling the octcrypto(4) driver.

I didn't capture detailed performance stats from before the update, but I had heard someone say that EdgeRouter Lite boxes would only do some 6 Mbit/s over IPsec. So I set up a really simple ipsec.conf with ike esp from A to B, leading to a policy of

esp tunnel from A to B spi 0xdeadbeef auth hmac-sha2-256 enc aes

going from one ERL to another (I collect octeons, so I have a bunch to test with), and let tcpbench run over it for a while. My numbers hovered around 7 Mbit/s, which matched what I had heard, and most of the CPU was used while doing it.

Then I edited /sys/arch/octeon/conf/GENERIC, removed the # from octcrypto0 at mainbus0 and recompiled. I booted into the new kernel, got an octcrypto0 line in dmesg, and it was time to rock the IPsec tunnel again. The cipher and HMAC used by default for IPsec coincide nicely with the list of accelerated functions provided by the driver.

Before we get to the tunnel traffic numbers, just one quick look at what systat pigs says while IPsec is running at full steam:

     PID USER        NAME                 CPU     20\    40\    60\    80\  100\
   58917 root        crypto             52.25 #################
   42636 root        softnet            42.48 ##############
                     (idle)             29.74 #########
    1059 root        tcpbench           24.22 #######
   67777 root        crynlk             19.58 ######
So this indicates that the load from doing IPsec and generating the traffic is spread fairly evenly over the two cores in the EdgeRouter, and there's even some CPU left unused, which means I can still ssh into it and have it usable. I have had it running for almost two days now, moving some 2.1TB over the tunnel.

Now for the new and improved performance numbers:

   204452123        4740752       37.402  100.00% 
Conn:   1 Mbps:       37.402 Peak Mbps:       58.870 Avg Mbps:       37.402
   204453149        4692968       36.628  100.00% 
Conn:   1 Mbps:       36.628 Peak Mbps:       58.870 Avg Mbps:       36.628
   204454167        5405552       42.480  100.00% 
Conn:   1 Mbps:       42.480 Peak Mbps:       58.870 Avg Mbps:       42.480
   204455188        5202496       40.804  100.00% 
Conn:   1 Mbps:       40.804 Peak Mbps:       58.870 Avg Mbps:       40.804
   204456194        5062208       40.256  100.00% 
Conn:   1 Mbps:       40.256 Peak Mbps:       58.870 Avg Mbps:       40.256

The tcpbench numbers fluctuate a bit, but the output is nice enough to keep tabs on the peak value: 58.87 Mbit/s! Of course, as you can see, the typical rate is lower, somewhere around 37 to 42 Mbit/s, but nice anyhow.
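For anyone who would rather crunch the tcpbench lines than eyeball them, here is a minimal Python sketch that pulls the rates out of the "Conn:" lines shown above. The helper name `summarize` and the regular expression are my own for illustration and only assume the line layout seen in this output. Since the Avg Mbps column mirrors the per-interval rate in every line shown, the sketch averages over the lines instead.

```python
import re

# The five "Conn:" lines copied from the tcpbench output above.
SAMPLE = """\
Conn:   1 Mbps:       37.402 Peak Mbps:       58.870 Avg Mbps:       37.402
Conn:   1 Mbps:       36.628 Peak Mbps:       58.870 Avg Mbps:       36.628
Conn:   1 Mbps:       42.480 Peak Mbps:       58.870 Avg Mbps:       42.480
Conn:   1 Mbps:       40.804 Peak Mbps:       58.870 Avg Mbps:       40.804
Conn:   1 Mbps:       40.256 Peak Mbps:       58.870 Avg Mbps:       40.256
"""

# Assumed layout: connection count, interval rate, running peak, average.
CONN_RE = re.compile(
    r"Conn:\s+(\d+)\s+Mbps:\s+([\d.]+)\s+Peak Mbps:\s+([\d.]+)\s+Avg Mbps:\s+([\d.]+)"
)


def summarize(text):
    """Return sample count, mean/min/max interval rate and reported peak."""
    rates = []
    peaks = []
    for match in CONN_RE.finditer(text):
        rates.append(float(match.group(2)))
        peaks.append(float(match.group(3)))
    if not rates:
        raise ValueError("no 'Conn:' lines found")
    return {
        "samples": len(rates),
        "mean_mbps": sum(rates) / len(rates),
        "min_mbps": min(rates),
        "max_mbps": max(rates),
        "peak_mbps": max(peaks),  # running peak as reported by tcpbench
    }


stats = summarize(SAMPLE)
print(f"{stats['samples']} samples, mean {stats['mean_mbps']:.3f} Mbps, "
      f"range {stats['min_mbps']:.3f} to {stats['max_mbps']:.3f} Mbps, "
      f"reported peak {stats['peak_mbps']:.3f} Mbps")
```

On these five lines the mean comes out at roughly 39.5 Mbps. The reported peak of 58.870 is higher than any of the five intervals, so it must stem from an earlier part of the run.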

That is a five- to sixfold increase in typical throughput (from about 7 Mbit/s to about 40 Mbit/s), which is good enough in itself, but it also moves the box from a speed that would make a poor but cheap gateway to something actually useful and decent for many home network speeds. The biggest problem after this gets enabled will be that my options to buy cheap used ERLs diminish.
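To put a number on the gain, here is a small back-of-the-envelope sketch in Python. The constants come from this article (about 7 Mbit/s before, the mean of the five tcpbench samples shown, and the reported peak). The function names and the 1000 MB example transfer are made up for illustration.

```python
# Figures taken from the article; everything else is illustrative.
BASELINE_MBPS = 7.0    # software crypto, "around 7Mbit/s"
ACCEL_MBPS = 39.514    # mean of the five tcpbench samples shown
PEAK_MBPS = 58.870     # peak reported by tcpbench


def speedup(new_mbps, old_mbps):
    """Ratio of the new rate to the old rate."""
    if old_mbps <= 0:
        raise ValueError("old rate must be positive")
    return new_mbps / old_mbps


def transfer_seconds(megabytes, mbps):
    """Seconds needed to move `megabytes` (10^6 bytes) at `mbps` Mbit/s."""
    if mbps <= 0:
        raise ValueError("rate must be positive")
    return megabytes * 8 / mbps


print(f"typical speedup: {speedup(ACCEL_MBPS, BASELINE_MBPS):.1f}x")
print(f"peak speedup:    {speedup(PEAK_MBPS, BASELINE_MBPS):.1f}x")
# Hypothetical 1000 MB transfer through the tunnel, before and after.
print(f"1000 MB before:  {transfer_seconds(1000, BASELINE_MBPS) / 60:.1f} min")
print(f"1000 MB after:   {transfer_seconds(1000, ACCEL_MBPS) / 60:.1f} min")
```

This ignores protocol overhead and treats the tcpbench rate as sustained, so take the transfer times as rough estimates only.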


  1. By Peter J. Philipp (pjp) nospam@solarscale.de on http://centroid.eu

    This is incredible! Thanks visa! The question I have is, why is this disabled? Is there more work to be done? Can't wait to put this on both my octeons...I'm aiming for some weeks down the line.

    -peter

    1. By Janne Johansson (jj) jj@stacken.kth.se on http://www.inet6.se

      Must be because it is super unstable. Send your octeons to me instead. 8^D
