Ingo Schwarze (schwarze@) writes in with our fourth report from the p2k15 ports hackathon:
Basically, there are three reasons for wanting
to get rid of USE_GROFF: Installing source manuals in principle
allows semantic searching with apropos(1) - though so far,
that mostly applies to mdoc(7) manuals and doesn't make much of a
difference for man(7) manuals; avoiding dependencies simplifies
optimization of bulk builds for speed; and getting rid of USE_GROFF
altogether would take one complication out of the ports build
One porter who has again and again removed USE_GROFF from
ports that no longer need it is Christian Weisgerber
(naddy@), and lately he has become even more active in that
respect. Recently, he inspected all 250 remaining ports still
having USE_GROFF and classified the reasons why it hasn't been removed
from each one yet, arriving at a list of about 45 different reasons;
of course, many ports are affected by multiple reasons, half of
the reasons occur in just one single port, and another quarter
affects but a handful of ports.
My plan for the hackathon was to take the list of reasons, sorted
by frequency, and try to remove USE_GROFF from as many ports as
feasible, of course without degrading formatting quality of any
ports manuals. It was obvious this might sometimes involve patching
invalid manual source code that doesn't render properly even with
groff, and even more importantly, it would involve fixing bugs in
mandoc(1) and adding missing features to it.
During the hackathon, i managed to work through two of the most
common classes of issues: Wrong indentation (46 affected ports)
and extra blank lines (29 affected ports). It turned out to be
harder than expected because tagging a bunch of ports with a common
label ("indent") doesn't imply there is just one problem to be fixed
for ticking off the whole class... It felt rather like most ports
in that class typically exhibited about two distinct mandoc(1)
indentation bugs, and the next port would demonstrate two new ones
rather than repeating the ones just fixed, and so on for the one
after that... Consequently, there was no way to handle anything
close to that still quite considerable number of 250 ports during
the four days of the hackathon. But at least, i managed to delete
USE_GROFF from the following 22 ports based on work done in Exeter,
which is about 10% of what remains: audio/mp3blaster devel/argtable
devel/ectags devel/libJudy devel/pcre games/gnushogi games/xmahjongg
games/xskat graphics/dcmtk graphics/mpeg_encode lang/classpath
lang/erlang lang/php mail/popclient misc/findutils multimedia/transcode
multimedia/xine-lib net/mutella net/rabbitmq plan9/sam
Three of the top reasons for USE_GROFF - the .ta request (define
tabulator stop positions, 61 affected ports), the .ti request
(temporary indent for the next output line, 50 affected ports), and
the braindead way the infamous DocBook formats bullet lists by
manually moving the cursor left and right with \h escape sequences
(30 affected ports) - are very hard to fix in the current mandoc(1)
parsing framework because mandoc(1) handles roff(7) as a pure
preprocessing language and is able to generate syntax tree nodes
only from high level mdoc(7) and man(7) macros. But these three
features, .ta, .ti, and \h, require generating syntax tree nodes
on the roff(7) level, to be interspersed among high level macro
nodes. Achieving this requires a reorganization of the mandoc(1)
parsers, unifying the data structures of the syntax trees and the
functions handling them across all the various languages.
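
To make "interspersed" concrete, here is a minimal sketch in C. It is not mandoc(1) code: every name in it (demo_kind, demo_node, demo_link, demo_count) is hypothetical and chosen for illustration only. The idea shown is that a single node type carries a tag saying whether it came from a high level macro, from plain text, or from a low-level roff(7) request, so that something like .ti can sit between two macro nodes in the same sibling list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds; not the real mandoc(1) enum. */
enum demo_kind {
	DEMO_MACRO,	/* high level mdoc(7) or man(7) macro */
	DEMO_TEXT,	/* plain text line */
	DEMO_REQUEST	/* low-level roff(7) request such as .ti */
};

struct demo_node {
	enum demo_kind	  kind;
	const char	 *name;	/* macro or request name, or the text */
	struct demo_node *next;	/* next sibling in input order */
};

/* Link b after a and return b, so that calls can be chained. */
static struct demo_node *
demo_link(struct demo_node *a, struct demo_node *b)
{
	assert(a != NULL && b != NULL);
	a->next = b;
	b->next = NULL;
	return b;
}

/* Count the nodes of one kind in a sibling list. */
static int
demo_count(const struct demo_node *n, enum demo_kind kind)
{
	int count = 0;

	for (; n != NULL; n = n->next)
		if (n->kind == kind)
			count++;
	return count;
}
```

With such a list, a formatter could walk the siblings in input order and handle a DEMO_REQUEST node at exactly the point where it occurred, instead of having it consumed in a separate preprocessing pass.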
I didn't work on that reorganization *at* the hackathon, but on the
train going there and returning home, replacing mdoc(7) and man(7)
specific data structures and functions with unified data structures
enum roff_type, struct roff_node, struct roff_meta, struct roff_man,
and generic functions: roff_man_alloc roff_man_free roff_man_reset
roff_node_alloc roff_node_append roff_block_alloc roff_body_alloc
roff_head_alloc roff_elem_alloc roff_node_delete roff_node_free
roff_node_unlink roff_word_alloc roff_word_append roff_addeqn
roff_addtbl. This unification so far shrank the code by more than
350 lines, and that trend will continue. But above all, these data
structures and functions will be used for future roff(7) syntax
tree nodes and ultimately for improved low-level roff handling.
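
As a rough illustration of why one node type with generic functions saves code, the following C sketch implements allocate, append, unlink, and delete once for all node types. It is an assumption-laden stand-in, not the real struct roff_node or the real roff_node_* functions: the struct layout, the sketch_ names, and the integer type field are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a unified node; not the real struct roff_node. */
struct sketch_node {
	int			 type;	/* language-independent node type */
	struct sketch_node	*parent;
	struct sketch_node	*child;	/* first child */
	struct sketch_node	*last;	/* last child */
	struct sketch_node	*prev;	/* previous sibling */
	struct sketch_node	*next;	/* next sibling */
};

/* Allocate a detached node; returns NULL if out of memory. */
static struct sketch_node *
sketch_alloc(int type)
{
	struct sketch_node *n;

	n = calloc(1, sizeof(*n));
	if (n != NULL)
		n->type = type;
	return n;
}

/* Append the detached node n as the last child of parent. */
static void
sketch_append(struct sketch_node *parent, struct sketch_node *n)
{
	assert(parent != NULL && n != NULL && n->parent == NULL);
	n->parent = parent;
	n->prev = parent->last;
	n->next = NULL;
	if (parent->last != NULL)
		parent->last->next = n;
	else
		parent->child = n;
	parent->last = n;
}

/* Detach n from its parent and siblings, keeping its own children. */
static void
sketch_unlink(struct sketch_node *n)
{
	if (n->prev != NULL)
		n->prev->next = n->next;
	else if (n->parent != NULL)
		n->parent->child = n->next;
	if (n->next != NULL)
		n->next->prev = n->prev;
	else if (n->parent != NULL)
		n->parent->last = n->prev;
	n->parent = n->prev = n->next = NULL;
}

/* Unlink n and free it together with all of its descendants. */
static void
sketch_delete(struct sketch_node *n)
{
	while (n->child != NULL)
		sketch_delete(n->child);
	sketch_unlink(n);
	free(n);
}
```

Because none of these functions looks at the type field, the same four routines would serve macro nodes and any future roff(7) level nodes alike, which is the kind of duplication the unification removes.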
So once again, and even though four out of the eight trains i took
were seriously delayed by twenty to forty minutes each, the train
Koeln-Wolfsburg proved quite productive and much less stressful
than flying - one other developer having booked with two different
air carriers got stranded between two London airports, missed his
connection and had to buy a new ticket to get home because the
second air carrier washed their hands of the delay caused by the
Changes to mandoc(1) at the hackathon included two other notable
refactorings - vastly simplified block unwinding for man(7), similar
to what i recently did for mdoc(7), and a common handling for the
breaking of explicit mdoc(7) blocks by implicit blocks. Looking
at many weird ports manuals resulted in a large number of mandoc(1) fixes:
- mdoc(7): Arguments to end macros of broken partial explicit blocks must go inside the breaking block.
- mdoc(7): If a partial explicit block extending to the next input line follows the end macro of a broken block, put all of it into the breaking block.
- man(7): Section headers have hanging indentation when overflowing the line.
- man(7): Use the default width for .RS without arguments.
- man(7): On a new .RS nesting level, the saved width starts from the default width, not from the saved width of the previous level.
- man(7): Fix a quirk with respect to empty .HP.
- man(7): Do not mistreat empty arguments to font alternating macros as vertical spacing requests.
- Don't allow breaking the output line after hyphens following escapes.
- roff(7): Fix rounding rules for horizontal scaling widths.
- man(1): Do not hardcode the path /usr/bin/ to more(1).
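
The two .RS items in the list above can be pictured with a small C sketch. It only models the rule as stated there and is not taken from mandoc(1): the rs_ names, the fixed nesting limit, and the default width of 7 are assumptions made for this illustration, and a width argument of 0 stands for "no argument given".

```c
#include <assert.h>

#define RS_DEFAULT_WIDTH 7	/* assumed default width, in units of n */
#define RS_MAX_LEVELS 16	/* arbitrary limit for this sketch */

/* Hypothetical state: one saved width per .RS nesting level. */
struct rs_widths {
	int saved[RS_MAX_LEVELS];
	int level;
};

static void
rs_init(struct rs_widths *w)
{
	w->level = 0;
	w->saved[0] = RS_DEFAULT_WIDTH;
}

/* A width > 0 is saved for the current level; 0 reuses the saved width. */
static int
rs_ip(struct rs_widths *w, int width)
{
	if (width > 0)
		w->saved[w->level] = width;
	return w->saved[w->level];
}

/* Enter .RS: the new level starts from the default, not the outer width. */
static void
rs_enter(struct rs_widths *w)
{
	assert(w->level + 1 < RS_MAX_LEVELS);
	w->saved[++w->level] = RS_DEFAULT_WIDTH;
}

/* Leave with .RE: fall back to the outer level and its saved width. */
static void
rs_leave(struct rs_widths *w)
{
	if (w->level > 0)
		w->level--;
}
```

In this model, a width of 12 saved before entering .RS does not leak into the nested level, where an argument-less request gets the default again, and it becomes visible once more after the matching .RE.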
I also got one groff bugfix committed upstream, preventing mdoc(7) .Bl
with trailing -width or -offset from picking up old args when formatted
Given that it was a ports hackathon, i also committed two new ports,
textproc/p5-XML-SemanticDiff-1.0004 and devel/p5-Test-XML-0.08, and
updated two others, devel/p5-TAP-Formatter-JUnit 0.09 -> 0.11 and
devel/p5-TAP-Harness-JUnit 0.41 -> 0.42 around the time of the hackathon.
Unfortunately, i didn't manage to make any time for hiking on
Dartmoor or on the South West Coast trail this time - for me, it
was a short but very focussed, productive, and pleasant trip. The
main side effect of coming to Devon was enjoying quite a bit of
culinary art - and who would have thought that even various French
developers would be full of praise for English cuisine, and the
praise wouldn't subside even when it came to cheeseboards!