Contributed by deanna from the not-a-blog dept.
There's been a small increase in the number of files used to implement pkg_add(1). There is a pretty good reason for that: I'm trying to make it start up a bit faster, and the only way to do that is to have the pkg tools load stuff on demand, instead of all the time. This is always a compromise, since having more files means more pressure on the file system, so a lot of classes are bundled together in `big' files (e.g., PackingElement), and some code is not in the package it belongs to (e.g., PackingElement code, again) because it's only used by visitor methods, and it makes no sense to load pkg_create-related methods when running pkg_delete.
More of the code is moving towards an OO style. One big benefit is that a lot of use Package; statements can be replaced by require Package;, which defers loading to run time, when the code is strictly needed. Also, some objects are now responsible for loading other stuff, so some stuff is no longer needed all the time. For instance, looking up packages used to be a big mess. It still is, but it has gotten better. Most of the code uses `generic' search objects, which can be used to look up stuff, and the actual goo that performs searches by package name, package stem, or package path is isolated under the search object... so, for instance, the PackageName and PkgSpec modules are no longer loaded nearly as much for simple cases (price to pay: a smallish Search.pm module).
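As a rough illustration of the load-on-demand idea, here is a minimal sketch in Python (the pkg tools themselves are Perl, so this is an analogy to replacing use with require, not the actual code). The LazyModule class and its importer parameter are hypothetical names invented for this example.

```python
import importlib


class LazyModule:
    """Stand-in for a module that is only imported on first real use,
    roughly like swapping a compile-time `use` for a run-time `require`."""

    def __init__(self, name, importer=importlib.import_module):
        self._name = name
        self._importer = importer   # injectable, so loads can be observed
        self._module = None

    @property
    def loaded(self):
        return self._module is not None

    def __getattr__(self, attr):
        # Only reached for names not found on the wrapper itself,
        # i.e. for anything that lives in the wrapped module.
        if self._module is None:
            self._module = self._importer(self._name)
        return getattr(self._module, attr)


# Nothing is imported at this point; start-up pays no cost for it.
json_tools = LazyModule("json")


def dump_record(record):
    # The first call triggers the import; later calls reuse the module.
    return json_tools.dumps(record, sort_keys=True)
```

A command that never calls dump_record never pays for loading the module, which is the same trade the paragraph above describes for PackageName and PkgSpec.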
Some code is also getting more systematic. I am using more and more references to the OpenBSD::PackageRepository::Installed->new object (a singleton) instead of using installed_packages(), which makes code simpler, since it more or less always manipulates repositories, instead of having special cases.
But grabbing the full PackageRepository code is expensive, so the PackageRepository::Installed code lives in its own file now.
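To make the singleton-repository idea concrete, here is a small sketch in Python (an analogy only; the real class is the Perl OpenBSD::PackageRepository::Installed). Every name below, including the scan hook, StaticRepository, and find_stem, is an assumption made up for the example, and matching a stem by prefix is a deliberate simplification.

```python
class Repository:
    """Common interface: every repository can list the package names it holds."""

    def list(self):
        raise NotImplementedError


class StaticRepository(Repository):
    """Hypothetical stand-in for a remote or on-disk repository."""

    def __init__(self, names):
        self._names = sorted(names)

    def list(self):
        return list(self._names)


class InstalledRepository(Repository):
    """Singleton: constructing it twice hands back the same object."""

    _instance = None
    # Assumption: a hook that scans the package database.
    # The default finds nothing; callers replace it.
    scan = staticmethod(lambda: [])

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._cache = None
        return cls._instance

    def list(self):
        if self._cache is None:          # scan once, then reuse the result
            self._cache = sorted(self.scan())
        return list(self._cache)


def find_stem(repo, stem):
    """Works on any repository, installed or not: no special case needed."""
    return [name for name in repo.list() if name.startswith(stem + "-")]
```

Code such as find_stem only ever sees a repository, so the installed packages stop being a special case, and keeping InstalledRepository in its own file would mean callers that need nothing else never load the other repository classes.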
At some point, I may miss some requirement changes, and usually, you will end up with code that can't find a given Package... it doesn't last long most of the time, since I notice it quickly, or can fix it as soon as someone gives me feedback.
More changes are currently ongoing. Most of the work is done through the `evolutionary' approach to OpenBSD development, or refactoring, as it's usually called in OO circles. (In fact, I'm indebted to Martin Fowler's book about refactoring, for helping me turn my empirical processes into a more systematic approach.) So, it's mostly simplify, encapsulate, extend.
Creating new objects in the process, Naming stuff, so to speak, which is the true essence of magic ;-)
For instance, I've moved the Dependencies::Solve code into a solver object, which makes things a bit clearer. Amazingly enough, it's going to help me fix a problem we had in the past.... I wasn't even thinking of that problem when I added that object, I just wanted to make things clearer.
I've planned a bit ahead: eventually, I am going to try turning pkgnames into PackageLocations as soon as the search object figures out where they live. This would allow me to bypass the hugely expensive PackageLocator->find routine, and avoid aliasing issues. (Even though a package of the same name is supposed to always be the same, regardless of the repository, some timing quirks may happen where we find a package on a repository, then that repository goes offline, and we later find that package elsewhere... caching pkgname correspondences in PackageLocators has so far helped us cope, up to a point... this means we cannot have two identically named packages in two distinct repositories.) There is the issue of package architecture, which is not always set at the point where we should create a PackageLocation. There is also some minor code needed to print out PackageLocations, so that we have real url-like behavior across the board.
The other significant change will be the addition of UpdateSet objects. Instead of passing explicit parameters that say that such-and-such package must be added, or such-and-such package removed, it's way better to view each addition or removal in terms of a change, with possibly zero packages removed, or zero packages added. This is key to being able to achieve some complicated updates, where we would actually want to replace two packages at once... Heck, this could even handle circular dependencies (even though I don't like these at all).
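One way to picture an UpdateSet is as a pair of lists, either of which may be empty. The Python sketch below is only an illustration under that assumption (the real code is Perl and its design was still open at the time of writing); the field names older and newer and the kind and merge methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateSet:
    """One change: some packages go away, some packages come in."""

    older: list = field(default_factory=list)   # packages to remove
    newer: list = field(default_factory=list)   # packages to add

    def kind(self):
        # Plain installs and plain deletes are just degenerate updates.
        if self.newer and not self.older:
            return "install"
        if self.older and not self.newer:
            return "delete"
        if self.older and self.newer:
            return "update"
        return "empty"

    def merge(self, other):
        """Combine two changes into one that has to be applied together,
        e.g. when two packages must be replaced at once."""
        return UpdateSet(self.older + other.older, self.newer + other.newer)


# Hypothetical package names, for illustration only.
install = UpdateSet(newer=["foo-1.0"])
both = UpdateSet(older=["a-1.0"], newer=["a-2.0"]).merge(
    UpdateSet(older=["b-1.0"], newer=["b-2.0"])
)
```

With this shape, the code that applies a change never needs separate "add" and "remove" parameters, and a two-package replacement is simply an UpdateSet with two entries on each side.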
There are still some design issues, as to whether I should put PackageNames or PackageLocations in those objects (so the PackageRepository::Installed class strikes again... we need to be able to turn installed package names into locations for this to work), and also how to build and complete incomplete objects. For instance, pkg_add -r starts out as a series of UpdateSets where the new objects are known, and the old objects are not. It also allows us to avoid recomputing the same thing twice: pkg_add -u first turns old package names into new package names, then the new package names are used through pkg_add -r conflict handling to find out the old/new package association... in effect doing the same work twice, instead of having pkg_add -u build the UpdateSet directly.
Naming Search objects also allows them to have more complex behavior. For instance, I'm going to allow specs to specify hints to the corresponding pkgpath, and finally Src: repositories are going to make sense (yes, this means that pkg_add will be able to ask the ports tree to compile missing packages). Also, if it has access to distant repositories, it should be able to download quirks, and perform some magic on package names, like a complete rename of a package.
Even more stuff is planned. Yes Theo, simplifying the user interface and flags is also on the list... should happen soon, in fact.
As far as performance goes, I don't know whether I'm gaining performance, or losing some... I don't care all that much so far. The package tools are mostly fast enough, and a lot of the performance problems fall under:
- start-up time;
- abuse of network;
- recomputing the same thing twice.
If you've read this and don't understand half of what I'm talking about, that's to be expected... I can answer specific points, and discuss issues if wanted, but I'm not going to spend even more time explaining things in more detail...
Note: the text of this article originally appeared on the OpenBSD tech mailing list.