OpenBSD Journal

Graphed and measured: running TCP input in parallel

Contributed by Peter N. M. Hansteen from the measured packets dept.

Over on tech@, Alexander Bluhm (bluhm@) is airing a patch to improve parallel TCP input, and is looking for testers:

List:       openbsd-tech
Subject:    running TCP input in parallel
From:       Alexander Bluhm <bluhm () openbsd ! org>
Date:       2025-04-17 16:53:19

Hi,

To run tcp_input() in parallel efficiently, we have to lock the
socket in a smart way.  I have measured multiple variants.

http://bluhm.genua.de/perform/results/2025-04-16T09:33:58Z/perform.html

The relevant TCP graph is here.

http://bluhm.genua.de/perform/results/2025-04-16T09:33:58Z/gnuplot/tcp.html
http://bluhm.genua.de/perform/results/2025-04-16T09:33:58Z/gnuplot/tcp6.html

First column (left) is no locking at all, just exclusive net lock.
Third column is socket lock in addition to exclusive net lock.  You
see 6% degradation due to locking overhead.  This has already been
committed.

Fourth column (right) is simple switch from exclusive net lock to
shared net lock and relying on socket lock.  Grabbing the socket
lock for each packet is expensive.  Especially tcp_input() and
soreceive() are fighting for the lock.  Single stream performance
goes down by 25%, but multi stream goes up by 140%.

The second column contains the diff below.  The idea is that
tcp_input() moves all TCP packets from softnet input queue into TCP
input queue.  This queue has storage per softnet thread and can be
accessed without lock.  After running all protocol input functions,
but in the same shared netlock context, process the TCP input queue.
tcp_input_mlist() keeps a pointer to the current socket.
tcp_input_solocked() tries to keep the lock on the socket.  If
consecutive TCP packets belong to the same socket, the lock is not
released.  Only when the TCP stream changes, we unlock the old and
lock a new socket.  This batch processing of locked sockets gives
5% increase in single stream and 160% for multi stream throughput.

I have several positive test reports.

ok?

bluhm

Index: net/if.c
===================================================================
[ … ]

The rest of the message contains the diff (replaced here by the [ … ] placeholder), which should apply to a recent -current checkout.
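
For readers who want a feel for why holding the socket lock across consecutive packets pays off, here is a minimal userland sketch of the idea described in the message. It is not taken from the diff: struct sock, struct pkt, input_per_packet() and input_batched() are hypothetical stand-ins invented for illustration, the "lock" is a plain counter rather than a real kernel lock, and the socket lookup is assumed to have happened already. It only compares how often the lock is taken when locking per packet versus keeping the lock until the stream changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct socket or struct mbuf. */
struct sock {
	int		locked;		/* 1 while the fake lock is held */
	unsigned long	lock_ops;	/* how often the lock was taken */
	unsigned long	packets;	/* packets delivered to this socket */
};

struct pkt {
	struct sock	*so;		/* socket this packet belongs to */
};

static void
sock_lock(struct sock *so)
{
	assert(!so->locked);
	so->locked = 1;
	so->lock_ops++;
}

static void
sock_unlock(struct sock *so)
{
	assert(so->locked);
	so->locked = 0;
}

/* Stand-in for the protocol work; the socket must be locked. */
static void
deliver(struct sock *so)
{
	assert(so->locked);
	so->packets++;
}

/* Variant A: take and release the socket lock for every packet. */
void
input_per_packet(struct pkt *q, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		sock_lock(q[i].so);
		deliver(q[i].so);
		sock_unlock(q[i].so);
	}
}

/*
 * Variant B: remember the current socket and keep it locked while
 * consecutive packets belong to it; switch locks only on a change.
 */
void
input_batched(struct pkt *q, size_t n)
{
	struct sock *cur = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (q[i].so != cur) {
			if (cur != NULL)
				sock_unlock(cur);
			cur = q[i].so;
			sock_lock(cur);
		}
		deliver(cur);
	}
	if (cur != NULL)
		sock_unlock(cur);
}
```

With a queue of six packets for sockets a, a, a, b, b, a, the per-packet variant takes a lock six times, while the batched variant takes it three times (once per run of packets for the same socket). The sketch says nothing about contention with soreceive() or about real throughput; the measured numbers are in the graphs linked from the message.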

It's nice to see these things properly measured and graphed, right?

If you are in a position to test, feedback is welcome as always.
