Contributed by Paul 'WEiRD' de Weerd from the Mister pushing packets speedily dept.
Hrvoje Popovski writes in with some results from his performance tests, as he did a few years ago:
I've tested Alexander Bluhm's (bluhm@) parallel IP forwarding diff and I've got some nice results. Readers should be aware that bluhm@'s diff sets NET_TASKQ=4, which means that forwarding will use 4 CPU threads, and that the diff only affects network cards with multiqueue support (at the time of writing, those cards are ix(4), ixl(4), and mcx(4)). In my tests I was sending 14 Mpps of UDP packets over ix(4) interfaces which have 16 queues:

ix0 at pci10 dev 0 function 0 "Intel 82599" rev 0x01, msix, 16 queues
ix1 at pci10 dev 0 function 1 "Intel 82599" rev 0x01, msix, 16 queues

The OpenBSD box is a Supermicro AS-1114S-WTRT with 24 x AMD EPYC 7413 24-Core Processor, 2650.37 MHz CPUs, so this box is nice for testing those 16 queues.
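For readers wondering what "forwarding will use 4 CPU threads" means in practice: the rough idea is that received packets are hashed per flow onto one of several worker task queues, each serviced by its own CPU, so different flows are forwarded in parallel while the packets of any single flow stay in order. Below is a tiny user-space C sketch of that dispatch pattern. It is only an illustration of the concept, not the actual OpenBSD kernel code; every struct, function, and constant name in it is invented for the example.

/*
 * Toy illustration of hashed dispatch to N worker queues, in the spirit
 * of NET_TASKQ parallel forwarding.  NOT the OpenBSD kernel code; all
 * names below are made up for this example.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define NWORKERS 4			/* analogous to NET_TASKQ = 4 */
#define QLEN	 1024			/* must be a power of two */

struct pkt {
	uint32_t src, dst;		/* IPv4 addresses */
	uint16_t sport, dport;		/* UDP ports */
};

struct workq {
	struct pkt	 ring[QLEN];
	unsigned int	 head, tail;
	pthread_mutex_t	 mtx;
	pthread_cond_t	 cv;
	unsigned long	 handled, dropped;
};

static struct workq queues[NWORKERS];

/*
 * Tiny flow hash: the same 4-tuple always maps to the same worker, so
 * packets belonging to one flow are never reordered across workers.
 */
static unsigned int
flow_hash(const struct pkt *p)
{
	uint32_t h = p->src ^ p->dst ^
	    ((uint32_t)p->sport << 16 | p->dport);

	h ^= h >> 16;
	return h % NWORKERS;
}

/* One worker per queue, standing in for one forwarding thread per CPU. */
static void *
worker(void *arg)
{
	struct workq *q = arg;
	struct pkt p;

	for (;;) {
		pthread_mutex_lock(&q->mtx);
		while (q->head == q->tail)
			pthread_cond_wait(&q->cv, &q->mtx);
		p = q->ring[q->tail++ % QLEN];
		pthread_mutex_unlock(&q->mtx);

		/* "forward" the packet: here we only count it */
		(void)p;
		q->handled++;
	}
	return NULL;
}

/* Receive path: pick a queue by flow hash and enqueue (or drop). */
static void
dispatch(const struct pkt *p)
{
	struct workq *q = &queues[flow_hash(p)];

	pthread_mutex_lock(&q->mtx);
	if (q->head - q->tail < QLEN) {
		q->ring[q->head++ % QLEN] = *p;
		pthread_cond_signal(&q->cv);
	} else
		q->dropped++;
	pthread_mutex_unlock(&q->mtx);
}

int
main(void)
{
	pthread_t tid;
	int i;

	for (i = 0; i < NWORKERS; i++) {
		pthread_mutex_init(&queues[i].mtx, NULL);
		pthread_cond_init(&queues[i].cv, NULL);
		pthread_create(&tid, NULL, worker, &queues[i]);
	}

	/* Fake a burst of packets belonging to 16 different flows. */
	for (i = 0; i < 100000; i++) {
		struct pkt p = {
			.src = 0x0a000001, .dst = 0x0a000002,
			.sport = 1000 + (i % 16), .dport = 53,
		};
		dispatch(&p);
	}

	sleep(1);
	for (i = 0; i < NWORKERS; i++)	/* racy read, fine for a demo */
		printf("worker %d: %lu handled, %lu dropped\n",
		    i, queues[i].handled, queues[i].dropped);
	return 0;
}

Compiled with -lpthread and run, it simply shows the 16 synthetic flows being spread across the 4 workers, which is the property the multiqueue NICs and the parallel forwarding diff exploit together.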
And here are the results:
plain forwarding
NET_TASKQ = 1 - 1.1 Mpps
NET_TASKQ = 4 - 3.4 Mpps
NET_TASKQ = 8 - 2.4 Mpps
NET_TASKQ = 12 - 1.5 Mpps
NET_TASKQ = 16 - 1.7 Mpps
NET_TASKQ = 24 - 1.4 Mpps

plain forwarding with pf - 1M states

NET_TASKQ = 1 - 550 Kpps
NET_TASKQ = 4 - 1.4 Mpps
NET_TASKQ = 8 - 1.9 Mpps
NET_TASKQ = 12 - 1.6 Mpps
NET_TASKQ = 16 - 1.6 Mpps
NET_TASKQ = 24 - 1.5 Mpps

NET_TASKQ = 1 - 1.25 Mpps
NET_TASKQ = 4 - 4.6 Mpps
NET_TASKQ = 8 - 4.7 Mpps
NET_TASKQ = 12 - 5 Mpps
NET_TASKQ = 16 - 4.2 Mpps
NET_TASKQ = 24 - 6.5 Mpps

NET_TASKQ = 1 - 1.5 Mpps
NET_TASKQ = 4 - 4.8 Mpps
NET_TASKQ = 8 - 4.1 Mpps
NET_TASKQ = 12 - 4.3 Mpps
NET_TASKQ = 16 - 3.7 Mpps
NET_TASKQ = 24 - 5.5 Mpps

NET_TASKQ = 1 - 600 Kpps <- sending 700 Kpps
NET_TASKQ = 4 - 800 Kpps <- sending 900 Kpps
NET_TASKQ = 8 - 600 Kpps <- sending 700 Kpps
NET_TASKQ = 12 - 480 Kpps <- sending 600 Kpps
NET_TASKQ = 16 - 480 Kpps <- sending 600 Kpps
NET_TASKQ = 24 - 400 Kpps <- sending 500 Kpps

bridge behaves differently: if I send 14 Mpps, bridge is dead, so I needed to pinpoint a sending rate around 100 Kpps above what bridge can forward to get the highest pps.
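The bridge numbers above were found by hand-tuning the offered load to sit just above what the box could forward. The post doesn't name the traffic generator used, so purely as an illustration of what "offering a fixed packet rate" looks like, here is a hypothetical little paced UDP sender in C; the destination address, port, and rate are placeholders, and a real small-packet test at these rates would use a dedicated generator rather than a single sendto() loop.

/*
 * Hypothetical paced UDP sender: offer a fixed packet rate slightly
 * above what the device under test can forward.  Address, port and
 * rate below are placeholders for illustration only.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

int
main(void)
{
	const unsigned long rate = 700000;	/* packets per second */
	char payload[64];			/* small-packet test */
	struct sockaddr_in sin;
	struct timespec next, now;
	unsigned long sent = 0;
	int s;

	memset(payload, 0, sizeof(payload));
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(9);		/* discard port */
	inet_pton(AF_INET, "10.0.0.2", &sin.sin_addr);

	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1) {
		perror("socket");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &next);
	for (;;) {
		sendto(s, payload, sizeof(payload), 0,
		    (struct sockaddr *)&sin, sizeof(sin));
		sent++;

		/* pace: the next packet is due 1/rate seconds later */
		next.tv_nsec += 1000000000UL / rate;
		if (next.tv_nsec >= 1000000000L) {
			next.tv_nsec -= 1000000000L;
			next.tv_sec++;
		}
		do
			clock_gettime(CLOCK_MONOTONIC, &now);
		while (now.tv_sec < next.tv_sec ||
		    (now.tv_sec == next.tv_sec && now.tv_nsec < next.tv_nsec));

		if (sent % rate == 0)
			printf("%lu packets sent\n", sent);
	}
}

Nudging the rate constant up or down in steps of roughly 100 Kpps is the kind of pinpointing described above: the forwarded rate on the far side tops out somewhat below the offered rate.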
Many thanks to Hrvoje for the write-up and for doing all these tests, and of course to Alexander Bluhm, Alexandr Nedvedicky, and other developers for working on parallelizing the network stack.
By n/a (Cabal)
Very cool! I know that the Intel I210 and I211 have 4/4 and 2/2 queues, respectively. Are those queues not yet supported by the em driver, or is this a different type of queue?
By sthen (sthen)
The em(4) driver doesn't support multiple queues yet.
By sthen (sthen)
AFAIK these are the drivers that already have some support for multiple queues: aq, bnxt, igc, ix, ixl, mcx, vmx.