Contributed by Peter N. M. Hansteen from the validate before you go-go dept.
We have a saying about hackathons - they are for starting something, or for finishing something. This time, for me, it was a "finishing something" - I landed the new x509 certificate chain validation in libcrypto.
So some background for the uninitiated as to why I might voluntarily choose to take that on. The x509 chain validator is what validates a certificate when you make a TLS connection. It looks at the certificate and any intermediates provided by the peer, and attempts to build a chain of "who signed what" back to a trusted root that you have provided. If all that works, and the chain follows all the rules (names, certificate usage, name constraints, etc.), the certificate is valid. If not, it isn't.
Back in May, when we were all locked down here in Canada, one of the common internet intermediate certificates expired (https://twitter.com/sleevi_/status/1266647545675210753). This had the effect of exposing the problems with the legacy OpenSSL x509 chain validator that we inherited in LibreSSL. The legacy chain validator doesn't find all possible chains from a leaf to a root; it just stops on the first one it finds. If that one happens to be expired or invalid, well, that's tough, you lose: the certificate won't be seen as valid, even though there is another path (via another intermediate) that is.
When this was brought to our attention, Joel Sing and I took a look. OpenSSL had worked around the issue by setting a knob to look for trusted certificates first before checking an untrusted intermediate, and Joel put a similar fix in place, which ended up being Errata 10 to OpenBSD 6.7.
The problem is that we both knew that this fix wasn't a real fix. It just mimics the OpenSSL change, which bypasses the specific problem in this case. More generally, it's very possible to fool the legacy OpenSSL validator with different paths to a valid root with cross signed certificates - and this is a situation that is becoming more and more common out there in the world as some of the original old school CAs have their signers expiring or are upgrading to new crypto. Certificates get cross signed by the old and new intermediates in order to transition to a new signer, or even a new CA.
So Joel put together a set of regression tests that demonstrated the problem with the legacy validator, so we could think about fixing it at some point. What was needed was a validator that would find all the possible chains, not just the first one. We knew that Go had one (and I had looked at it before), and that at some point we should get this fixed so that certificate validation actually worked in the way the standards have specified for years.
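To make the difference concrete, here is a toy Python sketch of the two strategies. It is not LibreSSL or Go code: the `Cert` class, the function names, and the certificate names are all made up for illustration, issuers are matched by name only, and signatures are not checked at all. It assumes the legacy behaviour described above, where untrusted intermediates from the peer are tried before the trust store and the first chain found is the only one considered.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cert:
    # Hypothetical toy certificate: real ones carry keys, signatures, dates...
    subject: str
    issuer: str
    expired: bool = False

def first_chain(leaf, untrusted, roots):
    """Legacy-style: take the first issuer found, never backtrack."""
    chain = [leaf]
    while True:
        current = chain[-1]
        # Prefer an untrusted intermediate supplied by the peer.
        nxt = next((c for c in untrusted
                    if c.subject == current.issuer and c not in chain), None)
        if nxt is not None:
            chain.append(nxt)
            continue
        # Only then look in the trust store.
        root = next((r for r in roots if r.subject == current.issuer), None)
        return chain + [root] if root is not None else None

def all_chains(leaf, untrusted, roots):
    """Depth-first search for every path from the leaf to a trusted root."""
    found = []
    def walk(chain):
        current = chain[-1]
        for root in roots:
            if root.subject == current.issuer:
                found.append(chain + [root])
        for cand in untrusted:
            if cand.subject == current.issuer and cand not in chain:
                walk(chain + [cand])
    walk([leaf])
    return found

def chain_ok(chain):
    # Stand-in for the real per-certificate checks: here, only expiry.
    return all(not c.expired for c in chain)

def legacy_verify(leaf, untrusted, roots):
    chain = first_chain(leaf, untrusted, roots)
    return chain is not None and chain_ok(chain)

def modern_verify(leaf, untrusted, roots):
    return any(chain_ok(c) for c in all_chains(leaf, untrusted, roots))

# "New Root" exists twice: as a trusted self-signed root, and as a
# cross-signed certificate issued by the expired "Old Root".
leaf = Cert("www.example.com", "Intermediate CA")
intermediate = Cert("Intermediate CA", "New Root")
cross = Cert("New Root", "Old Root", expired=True)
old_root = Cert("Old Root", "Old Root", expired=True)
new_root = Cert("New Root", "New Root")

untrusted = [intermediate, cross]   # sent by the peer
roots = [old_root, new_root]        # local trust store

print(legacy_verify(leaf, untrusted, roots))   # False: stuck on the expired path
print(modern_verify(leaf, untrusted, roots))   # True: finds the path via new_root
```

In this toy setup the first-chain walk ends up at the expired root and gives up, while the exhaustive search also finds the shorter chain that ends at the valid root.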
So I agreed to take it on. My public justification was that I had Covid, so my loss of smell allowed me to roll in what I had to roll in to do it without smelling it. Realistically though, I did it because I figured otherwise Joel would, and I wanted him to continue working on replacing the TLS 1.2 record layer.
Using the Go validator for inspiration, I concocted a new one within a couple of weeks that could validate and pass Joel's new regress tests without choking on different chain paths. What then followed was two months or so of digging through all the corner cases and how this could be integrated into the existing API for certificate validation, including how to support the horrendous OpenSSL certificate validation callback API without breaking things. After a couple more months and several false starts, I had something that could integrate into X509_verify_cert() and pass all our openssl, libcrypto, libssl, and libtls regression tests without breaking anything.
On the way I also had to bring name constraint validation up to snuff, since the legacy code we inherited did not have anything like a complete or functional implementation of name constraints.
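For readers who have not met name constraints: a CA certificate can carry permitted and excluded subtrees, and every name in a certificate below it must fall inside a permitted subtree (if any are given) and outside every excluded one. The Python below is a rough sketch of that rule for DNS names only, following the label-wise matching described in RFC 5280. The function names are invented for this example, it ignores IP addresses, email addresses and other name types, and it is in no way the libcrypto implementation.

```python
def dns_in_subtree(name, constraint):
    """True if `name` equals `constraint` or adds labels to its left."""
    n = name.lower().split(".")
    c = constraint.lower().split(".") if constraint else []
    # Compare whole labels, so "badexample.com" is not under "example.com".
    return len(n) >= len(c) and n[len(n) - len(c):] == c

def names_allowed(names, permitted, excluded):
    """Apply permitted/excluded DNS subtrees to every name in a certificate."""
    for name in names:
        # Exclusions always win.
        if any(dns_in_subtree(name, e) for e in excluded):
            return False
        # An empty permitted list means "no restriction of this kind".
        if permitted and not any(dns_in_subtree(name, p) for p in permitted):
            return False
    return True

print(names_allowed(["www.example.com"], ["example.com"], []))
print(names_allowed(["www.badexample.com"], ["example.com"], []))
```

The first call is accepted and the second is rejected, because matching is done on whole labels rather than on a plain string suffix.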
So prior to the hackathon, I had actually landed a *lot* of new regression tests for this stuff, including an adaptation of the "bettertls" suite of certificate tests from Netflix (using the certificates generated by bettertls, and then replacing a pile of java with a short C program and a bit of perl).
So the distilled answer: most of this was finished, reviewed, and landed at the hackathon, where I took a lot of jsing@'s time to review it. I then spent much of my time chasing any bugs it turned up - which included some nasty ways fetchmail deals with the callback, and some issues in bluhm@'s regress tests and perl's ssleay module (which exposed a bug in how I was handling the legacy callback).
So while it is not necessarily "done" (I am watching for fallout carefully), and I still have some pieces to land to expose the new api to the new validator, it is currently used internally by default in X509_verify_cert(). The result of this should be a validator that will correctly validate modern x509 chains and correctly deal with name constraints.
Many thanks to bcook@ for tests, reviews and portable integration, as well as bluhm@ for regress work that helped me out a lot.
And last but not least, a special thanks to Genua for sponsoring us all to come to Burg Liebenzell, which was a pretty nice place to have a hackathon. I had a lovely time walking in the Black Forest every day!
Thanks for the report and all the work, Bob!