Contributed by tbert on Wed Apr 25 13:50:53 2012 (GMT)
from the stitch in time dept.
In December of 2005, Ted Unangst (tedu@) committed the first iteration of
a new threading library to replace the existing implementation with
the message "add userland thread library. incomplete, but functional".
This new approach, labelled rthreads, took threading from being
strictly userland to an OS-wide implementation.
A bit over six years later, Philip Guenther (guenther@), who had been
doing much of the work to complete rthreads, flipped the switch and
took what had been an experimental option and made it the default.
And two weeks ago, 16 OpenBSD developers gathered in Paris ("Best croissants
I've had at any hackathon so far!" says Ken Westerback (krw@)) to perform
even more concentrated work on bringing rthreads to a complete state.
A few of the hackers have shared their experiences.
Paris local Marc Espie (espie@) was first out of the gate, sharing
his experiences at r2k12:
Ariane van der Steldt (ariane@), who has been mastering (mistressing?)
OpenBSD's virtual memory system, writes in with the following:
There's something special about doing a hackathon in your home-town, most specifically in the area where you went to school as a student.
A few weeks ago, Theo sprang that surprise `hey, we have a room in Paris in the middle of Le Quartier Latin, let's do a hackathon there'.
First, let's thank the people who made this possible, Anne and David, a big thank you. Also thanks to Cedric Villani, that great mathematician who's enlightened enough to understand that programming matters (hey, Marsu).
So I had a chance to share my old lairs with the OpenBSD crowd, and show them a small part of that city I love.
It was also very exciting to meet a lot of people face-to-face for the first time. They're at least as great in real-life as through email.
Unsurprisingly, I ended up working on a lot of stuff I hadn't planned at all.
This week was really busy for me:
- finished the posix_spawn documentation rewrite
- finally started committing the parts to make release -j clean; the md parts are coming together
- hunted an annoying build bug in gcc/java
- worked again on m4 -g gooey parts
- used ariane's new and shiny maxrss to squeeze more stats out of ports builds (the idea being to avoid building two moz at the same time).
- finally imported sqlite3 in base, though it's not activated yet, hopefully soon. This one was fairly annoying, I had all the parts from two years ago, when the source tree wasn't ready for it yet, but I had to remember the change and update to the new version... loads of fun, since quite a lot changed. The sqlite crowd is a tribe of busy beavers!
- a few make changes, some committed, some being tested.
All in all, a very productive week. I used the opportunity to close a lot of old entries in my todo list.
And in our final report of this installment, Christiano F. Haesbaert (haesbaert@)
tells about his attempt to give the process scheduler more cowbell:
Paris was awesome. I arrived by car Sunday evening. I searched for the hotel and, after driving past it a few times (it was hidden behind scaffolding) and a few times illegally parking, I managed to get myself checked in and learned where the nearest parking garage was. On my way to park my car, I ran into the other devs who'd already arrived and made good use of their time (i.e. found a place with beer and food). ;)
The next few days were largely spent on code, with the evenings going out for dinner and drinks afterwards. I cannot stress enough how important the 'going for drinks' bit is: not only is it highly enjoyable, but being away from the code is the best moment to discuss architectural changes and other big things. You often don't get around to that while buried in code. Architecture works best when away from the subject and talking face-to-face.
I did a few things during the hackathon. I committed the mmap0 diff, which is sure to trip up a few programs: it returns EINVAL when mmap is asked to allocate no memory.
The diff itself is pretty simple, but that doesn't mean it's not important: a 0-length allocation may cause all kinds of funny behaviour in our allocators, since they weren't designed around this. Of course this broke the tree: patch(1), install(1) and locate(1) needed diffs to cope.
Another thing I did was implement max-RSS tracking for programs. This is done from the fault path, by simply asking the pmap how much is resident and writing that out.
A diff I hugely overestimated in complexity, as I was initially unaware that someone had already done all the rusage bits. (Thank you, oh anonymous developer!) So the diff was simple and espie immediately brought up statistics about the insane amount of memory some programs require. To give an example: qt4 requires around 500 MB during compile, but that is nothing compared to mozilla (1 GB) or lang/rakudo (1.8 GB). Bad news if you want to compile on a 32-bit machine, but at least now we know. Espie has big plans with this: he wants to modify dpb (the distributed package build system) to plan its compiles so that two huge compiles do not run at the same time.
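For readers who want to look at the number themselves, the peak resident set size is exposed through the standard `ru_maxrss` field of `struct rusage`. The sketch below is an illustration only and is not how dpb collects its statistics; `child_maxrss_kb()` is a hypothetical helper, and it assumes `ru_maxrss` is reported in kilobytes, as on OpenBSD and Linux.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that touches `bytes` of heap, wait
 * for it, and return the peak RSS recorded for waited-for children
 * (ru_maxrss), or -1 on error.
 */
long
child_maxrss_kb(size_t bytes)
{
	struct rusage ru;
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		long pagesz = sysconf(_SC_PAGESIZE);
		volatile char *buf = malloc(bytes);
		size_t off;

		if (buf == NULL || pagesz <= 0)
			_exit(1);
		/* Write one byte per page so every page becomes resident. */
		for (off = 0; off < bytes; off += (size_t)pagesz)
			buf[off] = 1;
		_exit(0);
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	/* RUSAGE_CHILDREN covers children that have been waited for. */
	if (getrusage(RUSAGE_CHILDREN, &ru) == -1)
		return -1;
	return ru.ru_maxrss;
}
```

Calling `child_maxrss_kb(32UL * 1024 * 1024)` should report at least the touched 32 MB, plus whatever the child inherited from its parent.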
Where the max-RSS diff was pretty simple and straightforward, another diff I worked on I hugely underestimated. I started work on process-shared locking (POSIX_PROCESS_SHARED and POSIX_MUTEX_ROBUST). The work consists of two parts: the kernel needs to track a list of threads (which may or may not be in the same process) and the userspace needs to be adapted.
The first time this idea surfaced was in Canada, and the hard part I thought was going to be reducing an address in a process to a unique key that can be the same across processes (if the lock is shared, multiple processes may have different pointers to point at the same physical memory). I quite overestimated the effort in this, as it's pretty easy (says the uvm hacker). I estimated the userspace bits to be easy in comparison, but that turns out to be a lot harder than expected. First of all, the current locks are allocated, which means that each lock, condition, rwlock, etc. is actually a pointer, which of course would point at entirely wrong memory in a different process. So the data has to be flattened. And then there's the bit where the code paths diverge (especially for robust mutexes, which are quite different from the non-robust ones). All-in-all, pretty tricky to adapt to the new case, so it may be a while before I'm finished with that.
Another thing I'm currently working on had little to do with the hackathon. While in Paris, ratchov drew my attention to how nasty munmap(2) is. It turns out that big processes can make audio stutter badly. I'm currently working on a diff to inform uvm_unmap() and uvmspace_free() if they should preempt/yield during their operation. And then these functions have to have the flag set (the method to inform them). Originally I thought it was mainly the reaper, destroying the dead programs, that caused issues, but this turns out not to be the case. Unfortunately, this initial misinterpretation on my part means that I've sent out diffs called 'nicer reaper', but I've decided to change the name to 'vm yield', since 'vm yield' is the general idea of the diff: uvm must yield the cpu if it can on expensive operations.
Stay tuned for further reports from the hackathon as they come in!
I wanted to study and poke the scheduler, mainly to learn about it and see how we can improve it. I already had some diffs to build cpu topology and do a *smarter* migration of processes between cpus.
I had some ideas for improvements in mind:
1. Code a CFS scheduler.
2. Implement the posix realtime policies (SCHED_FIFO, SCHED_RR).
3. Measure and maybe diminish lock contention of sched_lock.
4. Study ways to detect cache thrashing of a specific process.
So I started with 1, and was able to hack something in about 2-3 days. It worked but was painfully slow. I invested some more time on it but got bored.
After that I went to measure the contention on 3, and I discovered it's somewhat big: we almost always spin on sched_lock. But I still don't trust my data, it seems too much. I tried hacking up a split of the sched_lock on a per-processor basis, and then I learned that sched_lock is actually recursive. I spent some time trying to figure out why; other hackers told me this was a problem art@ tried to solve for years: making the sched_lock not recursive.
At this point I had learned a lot about the scheduler and big lock/sched lock interaction and I was satisfied, but hadn't produced any real diffs. So I went reading up on sasyncd(8), which Theo pointed out to me as something we should make some changes to in the future, and I wrote some cleanup diffs.
I just wanna say this was a fantastic experience. It was my first hackathon and I learned a lot from it; next time I won't try monster projects! I finally get why hackathons are so important: "people share knowledge". Where else do you get to sit at a bar in Paris and listen to miod telling you about the difference between *processor revisions* from like 15-20 years ago? It was awesome.
I want to thank Theo and Guenther for organizing the event, it was great. I also wanna thank everyone who helped me with my questions and inquiries, especially: mikeb, ariane, guenther, theo, kettenis, miod...