OpenBSD Journal

Demise of Nagle's algorithm (RFC 896 - Congestion Control) predicted via sysctl

Contributed by Peter N. M. Hansteen from the hammering the Nagle dept.

Is the classical TCP congestion control mechanism known as Nagle's algorithm (RFC 896 - Congestion Control) headed for the scrap heap of history?

A recent post on tech@ from Job Snijders (job@), titled Add sysctl to disable Nagle's algorithm (RFC 896 - Congestion Control) and carrying a patch that implements the sysctl, indicates that at least some think deprecation is in order.

The message leads in,

List:       openbsd-tech
Subject:    Add sysctl to disable Nagle's algorithm (RFC 896 - Congestion Control)
From:       Job Snijders <job () openbsd ! org>
Date:       2024-05-13 18:41:55

Dear all,

Back in the early 1980s, a suggestion was put forward how to improve TCP
congestion control, also known as "Nagle's algorithm". See RFC 896.

Nagle's algorithm can cause consecutive small packets from userland
applications to be coalesced into a single TCP packet. This happens at
the cost of an increase in latency: the sender is locally queuing up
data until it either receives an acknowledgement from the remote side or
sufficient additional data piled up to send a full-sized segment.
This approach might have been advantageous 40 - 50 years ago, when
multiple users were concurrently working behind 1200 baud lines. Nagle's
algorithm discourages sending tiny segments when the data to be sent
increases in small increments.  The trade-off being "sacrificing a
degree of interactivity" in exchange for "increased throughput".

In recent days the applicability and usefulness of Nagle's algorithm in
our times came into question. Nagle's algorithm negatively interacts
with Delayed Acks (RFC 813), as per Nagle himself:
https://news.ycombinator.com/item?id=10608356 and a more complete
description: https://datatracker.ietf.org/doc/html/draft-minshall-nagle

But some argue "Given the vast amount of work a modern server can do in
even a few hundred microseconds, delaying sending data for even one RTT
isn't clearly a win." https://brooker.co.za/blog/2024/05/09/nagle.html

In base, various applications have taken it upon themselves to disable
Nagle's algorithm: ssh, httpd, iscsid, relayd, bgpd, and unwind. Bluhm
and I are not aware of applications that explicitly enable Nagle.

The standards say in RFC 9293 section 3.7.4: "A TCP implementation
SHOULD implement the Nagle algorithm to coalesce short segments.
However, there MUST be a way for an application to disable the Nagle
algorithm on an individual connection."

So, why not take it a step further and allow for the algorithm to be
disabled on the whole system? :-)

The below changeset introduces sysctl net.inet.tcp.nodelay, which if set
to 1 will simply cause TCP_NODELAY to be set on all TCP sockets.

Note that with net.inet.tcp.nodelay set to 1, applications still can
inspect and disable TCP_NODELAY using getsockopt() and setsockopt().

Perhaps in the future - after more study & contemplation - we'll
change this sysctl's default from 0 to 1?

Kind regards,

Job

-- and goes on to present the patch (against a recent -current) to introduce the code.
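
The send rule the message describes (hold small data back until either an acknowledgement arrives or a full-sized segment has piled up) can be illustrated with a small model. The following is a simplified sketch, not the OpenBSD kernel code: the function names, the tick-based timing and the `ack_every` parameter are assumptions made purely for illustration.

```python
def nagle_can_send(queued, mss, unacked, nodelay=False):
    """Decide whether queued data may be sent now (simplified Nagle rule)."""
    if nodelay:
        # With Nagle disabled, any queued data goes out immediately.
        return queued > 0
    if queued >= mss:
        # A full-sized segment is always sent.
        return True
    # Small data is sent only when nothing is waiting for an ACK.
    return queued > 0 and unacked == 0


def simulate(writes, mss=1460, ack_every=3, nodelay=False):
    """Return (segment sizes sent, bytes still queued) for a run of writes.

    Hypothetical model: one write per tick, and a single ACK covering all
    outstanding data arrives `ack_every` ticks after the first unacked send.
    """
    segments = []
    queued = 0
    unacked = 0
    ack_due = None
    for tick, size in enumerate(writes):
        if ack_due is not None and tick >= ack_due:
            unacked = 0          # the ACK has arrived
            ack_due = None
        queued += size
        while nagle_can_send(queued, mss, unacked, nodelay):
            seg = min(queued, mss)
            segments.append(seg)
            queued -= seg
            unacked += seg
            if ack_due is None:
                ack_due = tick + ack_every
    return segments, queued


print(simulate([10] * 6))                 # ([10, 30], 20)
print(simulate([10] * 6, nodelay=True))   # ([10, 10, 10, 10, 10, 10], 0)
```

In this toy run, six 10-byte writes leave the sender as two segments under the Nagle rule, with 20 bytes still waiting for the next ACK, while the no-delay variant sends six segments with nothing held back. That is the latency versus packet count trade-off the message refers to.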

The discussion on whether this is desirable is ongoing on tech@. If you're up to it, join in, test the code and report your experiences!
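
For readers who want to experiment from userland first, the per-connection switch mentioned in the message, TCP_NODELAY, can be inspected and toggled with getsockopt() and setsockopt(). Below is a minimal sketch using Python's standard socket module; the helper name `set_nodelay` is made up for this example, and the default value printed depends on the system it runs on.

```python
import socket


def set_nodelay(sock, enabled):
    """Set TCP_NODELAY on a socket and return the value read back."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if enabled else 0)
    # getsockopt() returns an integer; nonzero means Nagle is disabled.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    default = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
    print("TCP_NODELAY on a fresh socket:", default)
    print("after enabling: ", set_nodelay(s, True))
    print("after disabling:", set_nodelay(s, False))
```

Going by the description in the message, a system running the patch with net.inet.tcp.nodelay set to 1 would be expected to report the option as already set on a fresh socket, and an application could still turn it off again as in the last call.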

