OpenBSD Journal

Stefan Sperling: t2k13 Hackathon Report - locale progress

Contributed by phessler from the anything.encoding dept.

Hot on the heels of the last submission, we received a report from Stefan Sperling (stsp@)...

I spent most of this hackathon working on locale support.

The list of available locales in OpenBSD used to be defined by names of directories within the /usr/share/locale directory. If users tried to configure a locale that didn't have a name corresponding to a directory in /usr/share/locale, the setlocale(3) call would fail. This resulted in numerous requests from users to add their favourite languages and countries to this directory.

Adding new entries to the directory has a cost. It uses up some disk space on the filesystem, even if the locale isn't used. Also, submitted patches to add locale names have to be processed by developers and sometimes discussed at length.

I also wanted to tidy up the contents of /usr/share/locale. Many redundant copies of LC_CTYPE files were stored, and some locale names contained unsupported encodings. bluhm@ suggested a new naming scheme which puts data pertaining to encodings and data pertaining to languages into separate directories. We also considered keeping the old naming scheme and using symlinks to eliminate multiple copies of files, like FreeBSD does. However, that approach wouldn't really reduce the clutter.

I ended up making setlocale(3) accept effectively arbitrary locale names of the form "anything.encoding". Only the encoding is looked up in /usr/share/locale. In case the encoding cannot be found, setlocale(3) falls back to ASCII (the "C" locale). The "anything" part is interpreted by gettext(1) and catopen(3), which fall back to English if a supported language cannot be deduced from the locale name.
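As a rough illustration of the "anything.encoding" rule described above, the lookup can be sketched as a small function. This is not the actual libc code: the function name resolve_encoding and the known_encodings table are hypothetical, and the table merely stands in for whatever encodings libc finds in /usr/share/locale. The sketch also ignores modifiers such as "@euro".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the encodings available on the system.
 * In the real implementation this would come from /usr/share/locale.
 */
static const char *const known_encodings[] = {
	"UTF-8", "ISO8859-1", "ISO8859-15", "KOI8-R"
};

/*
 * Return the encoding selected by a locale name of the form
 * "anything.encoding". If the name has no encoding part, or the
 * encoding is not in the table, fall back to "ASCII" (the "C" locale).
 */
const char *
resolve_encoding(const char *locale_name)
{
	const char *dot;
	size_t i;

	assert(locale_name != NULL);

	/* Everything after the first '.' is treated as the encoding. */
	dot = strchr(locale_name, '.');
	if (dot == NULL)
		return "ASCII";

	for (i = 0; i < sizeof(known_encodings) / sizeof(known_encodings[0]); i++) {
		if (strcmp(dot + 1, known_encodings[i]) == 0)
			return known_encodings[i];
	}
	return "ASCII";
}
```

With this sketch, a made-up name such as "klingon.UTF-8" resolves to UTF-8 because only the part after the dot is looked up, while "en_US.NOPE" and plain "C" both resolve to ASCII.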

Nothing but libc is supposed to look into the /usr/share/locale directory. The layout is implementation-defined, so changing the layout should have no effect on applications, in theory. In practice, the ports tree contains software like GNOME which uses the list of locales supported by the system to filter its list of keyboard layouts and input methods on offer.

To obtain the list of supported locales, GNOME first checks for a Linux-specific binary file (a non-portable approach) and falls back to listing the contents of /usr/share/locale if that file doesn't exist (again, a non-portable approach).

So changing the directory layout ended up breaking some functionality in GNOME, and ajacoutot@ wasn't all too pleased, even though he didn't blame me for GNOME's non-portable assumptions. Luckily, POSIX specifies a locale(1) utility which provides a portable way of getting the list of supported locales. I added that utility so we can keep GNOME and similar applications working with the new /usr/share/locale layout. matthew@ helped me a lot with making the code for this utility as small as possible, and schwarze@ fixed my locale(1) man page draft.

The list of names locale(1) prints is pretty arbitrary, but should cover most needs. If you don't find your favourite locale in this list, don't worry -- you can now use your favourite locale name anyway. locale(1) also makes it a bit easier to spot problems in locale configuration since it displays what libc is making of the various locale-related environment variables.
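For applications that need the list of supported locales, a portable approach is to read the output of "locale -a" rather than listing /usr/share/locale. The following is a minimal sketch under that assumption; the function names locale_listed and system_has_locale are hypothetical and are not taken from GNOME or from OpenBSD.

```c
#define _POSIX_C_SOURCE 200809L	/* for popen(3) and pclose(3) */

#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if `name` appears as a complete line in the stream `fp`
 * (one locale name per line, as "locale -a" prints them), else 0.
 */
int
locale_listed(FILE *fp, const char *name)
{
	char line[256];

	assert(fp != NULL && name != NULL);

	while (fgets(line, sizeof(line), fp) != NULL) {
		line[strcspn(line, "\n")] = '\0';	/* strip the newline */
		if (strcmp(line, name) == 0)
			return 1;
	}
	return 0;
}

/*
 * Ask the system's locale(1) utility for its list and search it.
 * Returns 1 if found, 0 if not found, -1 if the utility could not be run.
 */
int
system_has_locale(const char *name)
{
	FILE *fp;
	int found;

	fp = popen("locale -a", "r");
	if (fp == NULL)
		return -1;
	found = locale_listed(fp, name);
	pclose(fp);
	return found;
}
```

Keeping the parsing separate from the popen(3) call makes it easy to exercise the matching logic on any stream. Note that on a system that accepts arbitrary locale names, a name missing from this list may still be usable with setlocale(3).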

I also fixed catopen(3) for UTF-8 locales. The catopen(3) function is used to translate error messages such as "No such file or directory" into the locale's language according to the LANG environment variable. Because it printed ISO8859-1 encoded strings regardless of the current locale, using catopen(3) in UTF-8 locales could result in garbled output. Again, matthew@ provided lots of help.
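For readers unfamiliar with message catalogs, here is a minimal sketch of how a program typically uses catopen(3) and catgets(3). The catalog name "myapp" and the set and message numbers are hypothetical placeholders; a real program would use whatever catalog and IDs it ships with.

```c
#include <assert.h>
#include <locale.h>
#include <nl_types.h>
#include <stdio.h>

/*
 * Look up message `msg_id` in set `set_id` of an open catalog.
 * Returns `fallback` if the catalog failed to open; catgets(3) itself
 * also returns `fallback` when the message is not in the catalog.
 */
const char *
translate(nl_catd catd, int set_id, int msg_id, const char *fallback)
{
	assert(fallback != NULL);

	if (catd == (nl_catd)-1)
		return fallback;
	return catgets(catd, set_id, msg_id, fallback);
}

/* Print one message in the user's language, or in English as a fallback. */
void
print_enoent_message(void)
{
	nl_catd catd;

	/* Pick up LANG and the LC_* variables from the environment. */
	setlocale(LC_ALL, "");

	/* "myapp" is a hypothetical catalog name; IDs 1 and 2 are made up. */
	catd = catopen("myapp", NL_CAT_LOCALE);
	puts(translate(catd, 1, 2, "No such file or directory"));

	if (catd != (nl_catd)-1)
		catclose(catd);
}
```

The string is printed before the catalog is closed because the pointer returned by catgets(3) may refer to catalog data that becomes invalid after catclose(3).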

I also spent a lot of time reviewing collation support diffs submitted by Vladimir Tamara Patino (I don't think undeadly supports Unicode, so I've transliterated his name to ASCII), and gave Vladimir some feedback on his proposed changes.

Towards the end of the hackathon, tedu@ imported support code for FUSE (file system in userspace). I took a look into adding support for a WebDAV filesystem FUSE module. It turns out that our FUSE API is still incomplete, and I started implementing some of the missing bits. I couldn't get this done during the hackathon but will keep working on it as time permits.

Many thanks to krw@ for organizing this hackathon, and also to my employer for sponsoring my flight.

Of course, I also enjoyed some off-time in the city of Toronto. I'd like to thank all the swing dancers of Toronto for being amazingly skillful, and for a great Saturday night with mpi@, pirofti@, and myself. Keep hopping!

 o  o    _o_  _o_     _o_  _o_     _o_  _o_     o_  o     \o_o/
/|\/ \    |            |            |           |  / \    _| |_ 
/|  |\   //    \\      \\  //       |    |     /|   |    //   \\

Many thanks to Stefan for making OpenBSD usable with languages and character sets other than English!


  1. By Renaud Allard (renaud)

    Well, this article is mostly about locale, but the thing that makes me the most happy here is the FUSE fs support. Congrats and keep up the good work.

