OpenBSD Journal

Developer blog: espie@: where pkg_add is going

Contributed by deanna on from the not-a-blog dept.

Marc Espie (espie@) writes:

There's been a small increase in the number of files used to implement pkg_add(1). There is a pretty good reason for that: I'm trying to make it start up a bit faster, and the only way to do that is to have the pkg tools load stuff on-demand, instead of all the time. This is always a compromise, since having more files means more pressure on the file system, so a lot of classes are bundled together in `big' files (e.g., PackingElement) and some code is not in the package it belongs to (e.g., PackingElement code, again), because it's only used by visitor methods, and it makes no sense to load pkg_create-related methods when running pkg_delete.

More of the code is moving towards an OO-style. One big benefit is that a lot of use Package; can be replaced by require Package;, which moves it down to strictly-needed runtime. Also, some objects are now responsible for loading other stuff, so some stuff is no longer needed all the time. For instance, looking up packages used to be a big mess. It still is, but it has gotten better. Most of the code uses `generic' search objects, which can be used to look up stuff, and the actual goo that performs package names, or package stems, or package paths searches is isolated under the search object... so, for instance, the PackageName and PkgSpec modules are no longer loaded nearly as much for simple cases (price to pay: a smallish Search.pm module).
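
As a rough illustration of the use-versus-require point (this is not pkg_add code; Text::Wrap merely stands in for any module that is only needed some of the time, and wrap_text is a made-up helper), `use` loads a module at compile time on every run, while `require` inside a sub defers the load until the sub is actually called:

```perl
use strict;
use warnings;

# "use Text::Wrap;" here would load the module at compile time, on every run.
# "require" defers the load until this sub is called for the first time.
sub wrap_text {
    my ($text) = @_;
    require Text::Wrap;    # no-op once the module is already in %INC
    return Text::Wrap::wrap('', '', $text);
}

# %INC records every module file that has been loaded so far.
print exists $INC{'Text/Wrap.pm'} ? "loaded\n" : "not loaded yet\n";
print wrap_text("rarely used code is only loaded on demand"), "\n";
print exists $INC{'Text/Wrap.pm'} ? "loaded\n" : "not loaded yet\n";
```

A run that never calls wrap_text never pays for loading the module, which is the start-up saving described above.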

Some code is also getting more systematic. I am using more and more references to the OpenBSD::PackageRepository::Installed->new object (a singleton) instead of using installed_packages(), which makes code simpler, since it more or less always manipulates repositories, instead of having special cases.

But grabbing the full PackageRepository code is expensive, so the PackageRepository::Installed code lives in its own file now.
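
The singleton idea can be sketched as follows. This is only an illustration under assumptions: the class name Sketch::Repository::Installed, the list method and the package names are all made up, and the real OpenBSD::PackageRepository::Installed class certainly does more.

```perl
use strict;
use warnings;

package Sketch::Repository::Installed;    # hypothetical stand-in class

my $singleton;    # the one shared instance

sub new {
    my $class = shift;
    # Build the object on first use, then keep handing out the same one.
    # The package list is hard-coded purely for illustration.
    $singleton ||= bless { packages => ['perl-5.8.8', 'wget-1.10.2'] }, $class;
    return $singleton;
}

# Same interface any other repository would offer: list the names it holds.
sub list {
    my $self = shift;
    return @{ $self->{packages} };
}

package main;

my $first  = Sketch::Repository::Installed->new;
my $second = Sketch::Repository::Installed->new;
print "same object\n" if $first == $second;
print "$_\n" for $first->list;
```

Because the installed packages look like just another repository, calling code does not need a special case for them.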

At some point, I may miss a change in requirements, and you will usually end up with code that can't find a given Package... it doesn't last long most of the time, since I notice it quickly, or can fix it as soon as someone gives me feedback.

More changes are currently ongoing. Most of the work is done through the `evolutionary' approach to OpenBSD development. Or refactoring, as it's usually called in OO circles. (In fact, I'm indebted to Martin Fowler's book about refactoring, for helping me turn my empirical processes into a more systematic approach). So, it's mostly simplify, encapsulate, extend.

Creating new objects in the process: naming stuff, so to speak, which is the true essence of magic ;-)

For instance, I've moved the Dependencies::Solve code into a solver object, which makes things a bit clearer. Amazingly enough, it's going to help me fix a problem we had in the past.... I wasn't even thinking of that problem when I added that object, I just wanted to make things clearer.

I've planned a bit ahead: eventually, I am going to try turning pkgnames into PackageLocations as soon as the search object figures out where they live. This would allow me to bypass the hugely expensive PackageLocator->find routine, and avoid aliasing issues (even though a package of the same name is supposed to always be the same, regardless of the repository, some timing quirks may happen where we find a package on a repository, then that repository goes offline, and we later find that package elsewhere... caching pkgname correspondences in PackageLocators has so far helped us cope, up to a point... this means we cannot have two identically named packages in two distinct repositories). There is the issue of package architecture, which is not always set at the point where we should create a PackageLocation. There is also some minor code needed to print out PackageLocations, so that we have real URL-like behavior across the board.
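
To make the caching idea concrete, here is a minimal sketch, assuming made-up repositories and a hypothetical find_location helper (none of this is the real PackageLocator code). The first lookup walks the repositories; later lookups for the same name get the remembered location, so the answer cannot drift to a different repository between two calls:

```perl
use strict;
use warnings;

# Made-up repositories, searched in order.
my @repositories = (
    { url => 'ftp://mirror.example.org/', names => [qw(wget-1.10.2)] },
    { url => 'file:/usr/ports/packages/', names => [qw(wget-1.10.2 zsh-4.3.2)] },
);

my %location_of;    # package name -> location found the first time
my $scans = 0;      # how many times the repositories were really walked

sub find_location {
    my ($name) = @_;
    return $location_of{$name} if exists $location_of{$name};
    $scans++;
    for my $repo (@repositories) {
        if (grep { $_ eq $name } @{ $repo->{names} }) {
            return $location_of{$name} = $repo->{url} . $name . '.tgz';
        }
    }
    return $location_of{$name} = undef;    # remember misses too
}

print find_location('wget-1.10.2'), "\n";
print find_location('wget-1.10.2'), "\n";    # answered from the cache
print "repository scans: $scans\n";
```

The price is the one noted above: with a cache keyed on the name alone, two identically named packages in two repositories cannot be told apart.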

The other significant change will be the addition of UpdateSet objects. Instead of passing explicit parameters that mention that such package must be added, or such package removed, it's way better to view each addition or removal in terms of a change, with possibly zero packages removed, or zero packages added. This is key to being able to achieve some complicated updates, where we would actually want to replace two packages at once... Heck, this could even handle circular dependencies (even though I don't like these at all).
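
One possible shape for such an object, as a sketch only: the class name Sketch::UpdateSet, its method names and the package names are assumptions made for illustration, not the design that ends up in pkg_add. Every operation is a set of old packages going away and new packages coming in, and either side may be empty:

```perl
use strict;
use warnings;

package Sketch::UpdateSet;    # hypothetical class, not the real pkg_add code

sub new {
    my $class = shift;
    return bless { older => [], newer => [] }, $class;
}

# Packages to remove; returns the set so calls can be chained.
sub add_older {
    my $self = shift;
    push @{ $self->{older} }, @_;
    return $self;
}

# Packages to install.
sub add_newer {
    my $self = shift;
    push @{ $self->{newer} }, @_;
    return $self;
}

sub describe {
    my $self = shift;
    my $old = join('+', @{ $self->{older} }) || '(nothing)';
    my $new = join('+', @{ $self->{newer} }) || '(nothing)';
    return "$old -> $new";
}

package main;

my @sets = (
    Sketch::UpdateSet->new->add_newer('wget-1.10.2'),    # plain install
    Sketch::UpdateSet->new->add_older('zsh-4.3.2'),      # plain removal
    Sketch::UpdateSet->new->add_older('foo-1.0', 'foo-data-1.0')
                           ->add_newer('foo-2.0', 'foo-data-2.0'),    # two at once
);
print $_->describe, "\n" for @sets;
```

Install, removal and multi-package replacement all go through the same code path, which is what removes the special cases.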

There are still some design issues, as to whether I should put PackageNames or PackageLocations in those objects (so the PackageRepository::Installed class strikes again... we need to be able to turn installed package names into locations for this to work), and also how to build and complete incomplete objects. For instance, pkg_add -r starts out as a series of UpdateSets where the new objects are known, and the old objects are not. It also allows us to avoid recomputing the same thing twice: pkg_add -u first turns old package names into new package names, then the new package names are used through pkg_add -r conflict handling to find out the old/new package association... in effect doing the same work twice, instead of having pkg_add -u build the UpdateSet directly.

Naming Search objects also allows them to have more complex behavior. For instance, I'm going to allow specs to specify hints to the corresponding pkgpath, and finally Src: repositories are going to make sense (yes, this means that pkg_add will be able to ask the ports tree to compile missing packages). Also, if it has access to distant repositories, it should be able to download quirks, and perform some magic on package names, like a complete rename of a package.
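
A minimal sketch of what a `generic' search object could look like, assuming two made-up classes with a shared filter method (the real Search.pm interface may well differ, and the package names are invented). The caller only ever calls filter; how names or stems are matched stays hidden inside each class:

```perl
use strict;
use warnings;

package Sketch::Search::Exact;    # hypothetical: match one full package name

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub filter {
    my ($self, @names) = @_;
    return grep { $_ eq $self->{name} } @names;
}

package Sketch::Search::Stem;     # hypothetical: match any version of a stem

sub new {
    my ($class, $stem) = @_;
    return bless { stem => $stem }, $class;
}

sub filter {
    my ($self, @names) = @_;
    # A stem is the name without its version: "wget" for "wget-1.10.2".
    return grep { /^\Q$self->{stem}\E-\d/ } @names;
}

package main;

my @available = qw(wget-1.10.2 wgetpaste-2.0 python-2.4.4 python-2.5.1);
for my $search (Sketch::Search::Exact->new('wget-1.10.2'),
                Sketch::Search::Stem->new('python')) {
    print join(' ', $search->filter(@available)), "\n";
}
```

New kinds of search (by pkgpath hint, for instance) would then be new classes with the same method, without touching the callers.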

Even more stuff is planned. Yes Theo, simplifying the user interface and flags is also on the list... should happen soon, in fact.

As far as performance goes, I don't know whether I'm gaining performance, or losing some... I don't care all that much so far. The package tools are mostly fast enough, and a lot of the performance problems fall under:

  • start-up time;
  • abuse of network;
  • recomputing the same thing twice.

Those are issues immune to micro-optimizations; they respond much better to better algorithms overall. Once this is solved, I'll run some profiling and perform some micro-optimizations *if needed*.

If you've read this and don't understand half of what I'm talking about, that's to be expected... I can answer specific points, and discuss issues if wanted, but I'm not going to spend even more time explaining things in more detail...

Note: the text of this article originally appeared on the OpenBSD tech mailing list.



Comments
  1. By Bret Lambert (tbert) bret.lambert@gmail.com on

    I appreciate the work being done; I love not having to curse my machine when I try to install something and some crazy-ass dependency refuses to compile.

  2. By jirib (195.212.29.179) on

    i like pkg_* very much... although i think that, as packages are the recommended way to install, we should have a way to check if there are new versions of packages available on ftp :)

    something like:

    pkg_info --diff

    (which would check versions of installed and available packages)...

    if you know some easy way, just tell... the same for available patches in errata.

    thank you

    Comments
    1. By Joachim Schipper (Joachim) on

      > i like pkg_* very much... although i think that, as packages are the recommended way to install, we should have a way to check if there are new versions of packages available on ftp :)
      >
      > something like:
      >
      > pkg_info --diff
      >
      > (which would check versions of installed and available packages)...
      >
      > if you know some easy way, just tell... the same for available patches in errata.

      How about pkg_add -u, or out-of-date?

      For errata, though, you'll want to check undeadly.org and the like.

      Joachim

  3. By Anonymous Coward (75.132.109.74) on

    And a huge thanks for updating the GraphViz port to 2.12, Marc!

  4. By Anonymous Coward (193.158.13.130) on

    Any chance of seeing python in base install some time ?

    Comments
    1. By Anonymous Coward (201.37.252.63) on

      > Any chance of seeing python in base install some time ?

      please no.

      Comments
      1. By Anonymous Coward (193.158.13.130) on

        > > Any chance of seeing python in base install some time ?
        >
        > please no.

        I left perl behind me some years ago and went to python.

        I NEVER LOOKED BACK!

        The pkg_* stuff is a pretty good example why OO design in Perl 5 always
        looks weird.

        I would really appreciate it.

        Comments
        1. By Matthew R. Dempsky (15.235.153.106) on

          > I would really appreciate it.

          Perl and Python are able to coexist on your filesystem. Your complaint is really quite nonsensical. Just install the Python package.

          Comments
          1. By Anonymous Coward (193.158.13.130) on

            > > I would really appreciate it.
            >
            > Perl and Python are able to coexist on your filesystem. Your complaint is really quite nonsensical. Just install the Python package.

            Which I always do.
            I think it would be beneficial to have it right after installing the base
            system.

            Comments
            1. By Joachim Schipper (Joachim) on

              > > > I would really appreciate it.
              > >
              > > Perl and Python are able to coexist on your filesystem. Your complaint is really quite nonsensical. Just install the Python package.
              >
              > Which I always do.
              > I think it would be beneficial to have it right after installing the base
              > system.

              Religious wars aside, that's what site.tgz is for.

              Joachim

        2. By Anonymous Coward (65.248.199.227) on

          > > > Any chance of seeing python in base install some time ?
          > >
          > > please no.
          >
          > I left perl behind me some years ago and went to python.
          >
          > I NEVER LOOKED BACK!
          >
          > The pkg_* stuff is a pretty good example why OO design in Perl 5 always
          > looks weird.
          >
          > I would really appreciate it.

          This is pretty disrespectful. It's really easy to spout a language preference, and a lot harder to actually use one to create and continually improve these tools that we all depend on. You should apologize.

        3. By Marc Espie (213.41.185.88) espie@openbsd.org on

          > > > Any chance of seeing python in base install some time ?
          > >
          > > please no.
          >
          > I left perl behind me some years ago and went to python.
          >
          > I NEVER LOOKED BACK!
          >
          > The pkg_* stuff is a pretty good example why OO design in Perl 5 always
          > looks weird.
          >
          > I would really appreciate it.
          >

          Well, my perl5 OO is getting more and more smalltalk-ish. Possibly far
          away from python, that's for sure.

          I looked at python a few years back, noted it was just perl with a weird syntax, and only half the fun, and went back to perl.

    2. By Frank DENIS (82.224.188.215) on http://forum.manucure.info

      > Any chance of seeing python in base install some time ?

      I don't use Python, but I love Ruby.

      Therefore, Ruby should be in base install.

      Oh, and a lot of people love PHP.

      So PHP should be in base install, with all modules and requirements like MySQL.

      Ah Erlang, Ocaml, Eiffel and Squeak users will be frustrated. Let's add that, too.

      Let's add everything and finally, pkg_add is useless :)

      Comments
      1. By Dunceorq (213.113.152.10) on

        > > Any chance of seeing python in base install some time ?
        >
        > I don't use Python, but I love Ruby.
        >
        > Therefore, Ruby should be in base install.
        >
        > Oh, and a lot of people love PHP.
        >
        > So PHP should be in base install, with all modules and requirements like MySQL.
        >
        > Ah Erlang, Ocaml, Eiffel and Squeak users will be frustrated. Let's add that, too.
        >
        > Let's add everything and finally, pkg_add is useless :)


        Oh yes Erlang in the base =D

        Comments
        1. By Dan Farrell (thedanno) on http://danno.appliedi.net/drupal/

          > > > Any chance of seeing python in base install some time ?
          > >
          > > I don't use Python, but I love Ruby.
          > >
          > > Therefore, Ruby should be in base install.
          > >
          > > Oh, and a lot of people love PHP.
          > >
          > > So PHP should be in base install, with all modules and requirements like MySQL.
          > >
          > > Ah Erlang, Ocaml, Eiffel and Squeak users will be frustrated. Let's add that, too.
          > >
          > > Let's add everything and finally, pkg_add is useless :)
          >
          >
          > Oh yes Erlang in the base =D
          >
          >

          any chance we might see basic in the base install then? and while we're at it, i think logo could approach this better than perl, too... so let's add that in.

          Comments
          1. By Anonymous Coward (213.136.49.100) on

            I made a comment a few years ago about splitting base into packages to be able to get rid of unneeded things (apache and a few more in my case). No one seemed to agree with me. Now people want more stuff in base??? Why not include everything then? You wouldn't need pkg_* at all then. Plus you can skip the packages.

            No thanks! What I'd like to see is this:

            1) Split base into packages
            2) Some easy way to do a binary upgrade

            On the other hand, I haven't got the skills to do it myself so I can just make a wish or shut up. I really like OpenBSD though.

            Just a thought...

    3. By Anonymous Coward (76.19.44.89) on

      > Any chance of seeing python in base install some time ?

      What need would that fulfill that the package does not?
      What value would that provide that the package does not?

  5. By Anonymous Coward (70.66.28.12) on

    This might get some flames but I would really like to see the base system in packages as well. That way if a bug is located in any part of the system it's just a simple pkg_add to fix it instead of a recompile. We can start moving towards being able to have scripts to check for updates automatically and keep our servers secure a lot easier as well... then again I am just lazy and looking for any way to make my life easier :)

    Comments
    1. By Anonymous Coward (76.10.132.180) on

      > This might get some flames but I would really like to see the base system in packages as well. That way if a bug is located in any part of the system it's just a simple pkg_add to fix it instead of a recompile. We can start moving towards being able to have scripts to check for updates automatically and keep our servers secure a lot easier as well... then again I am just lazy and looking for any way to make my life easier :)

      Why can't you script updates the way things are?

      Comments
      1. By Daniel Bolgheroni (201.43.91.248) on

        > > This might get some flames but I would really like to see the base system in packages as well. That way if a bug is located in any part of the system it's just a simple pkg_add to fix it instead of a recompile. We can start moving towards being able to have scripts to check for updates automatically and keep our servers secure a lot easier as well... then again I am just lazy and looking for any way to make my life easier :)
        >
        > Why can't you script updates the way things are?

        That was what NetBSD was planning for in one of their projects some time ago.

    2. By Anonymous Coward (213.41.185.88) on

      > This might get some flames but I would really like to see the base system in packages as well. That way if a bug is located in any part of the system it's just a simple pkg_add to fix it instead of a recompile. We can start moving towards being able to have scripts to check for updates automatically and keep our servers secure a lot easier as well... then again I am just lazy and looking for any way to make my life easier :)
      >

      There are reasons why the base system isn't in packages... The main one being that the current update process, painful though it might be, is a little bit sturdier... I don't want to have pkg_add -u fail half-way through (okay, the base install thingy might fail half-way through).

      You have to update the kernel and reboot anyways.

      I could get the base tar balls to be packages, with very little changes. I'm not too sure how to prevent people from removing stuff afterwards, though...

      Comments
      1. By Anonymous Coward (87.246.136.51) on

        > There are reasons why the base system isn't in packages...
        > ...I'm not too sure how to prevent people from removing stuff afterwards, though...

        Sorry, but some users need to pare down the system for limited resource environments. The removal of httpd from a router will have far more emotional consequences than security consequences. Please get over it. This is in no way a suggestion to remove the audited httpd from the OS, just a request to make it modular.

        As for updating, THIS would make updating easier should only one component need updating, as is often the case with security updates.

        AC

  6. By Anonymous Coward (68.123.252.77) on

    Espie, thanks; very insightful.

    Comments
    1. By Anonymous Coward (68.123.252.77) on

      > Espie, thanks; very insightful.

      Deanna, thanks for noticing and posting it. ;)

  7. By Andrei GUDIU (andreig) andreig@openbsd-box.org on http://www.openbsd-box.org

    Hello, I never asked myself what language pkg_add was written in. So to my surprise I noticed today that after reading this article I could recognise and understand some things around there. Although it is 8 am here and I don't happen to wake up at 8 am on a regular basis :).. It was a great feeling of comfort finding out pkg_* uses Perl. Since I have been a Perl developer for.. 6 years now I will gladly take a look around pkg_* and see if I can help in any way. This early morning is a success.

  8. By Tobias Weisserth (143.93.17.28) on

    Hi everybody,

    I enjoy using the pkg tools and it's great to hear Marc is focusing on improving the quality of the design beneath the surface. The only major drawback the pkg tools have is the lack of a mechanism to ensure the integrity of a package when installing packages via a cleartext protocol such as FTP. Ports ain't much better as they only use sha1 to verify the integrity of source downloads, but it's still better than having nothing. What's the use of a secure system after all when someone manages to infiltrate my system by performing a man-in-the-middle attack on the FTP connection pkg_add uses? These days it's not only mere criminals one has to care about; the German government, for example, reportedly hacks citizens' computers to bypass any filesystem and email encryption they can't cope with when grabbing the computer physically during a search. Manipulating download streams is easy for the government, the infrastructure is in place. If you want to defend against this, you need some way to verify the integrity of the download. I think OpenBSD should have this. Anyway, great work Marc, I'm going to support OpenBSD as strongly as I can, so keep buying releases, books and OpenBSD merchandise! :-)

    Cheers!

    Comments
    1. By Marc Espie (213.41.185.88) espie@openbsd.org on

      > Hi everybody,
      >
      > I enjoy using the pkg tools and it's great to hear Marc is focusing on improving the quality of the design beneath the surface. The only major drawback the pkg tools have is the lack of a mechanism to ensure the integrity of a package when installing packages via a cleartext protocol such as FTP. Ports ain't much better as they only use sha1 to verify the integrity of source downloads, but it's still better than having nothing.

      Nope, ports have switched to SHA256 a bit after 4.1 came out.

      Simon Bertrang and Todd Miller are working on integrating SHA256 support in
      base perl as well, and we will probably move the @md5 keyword to @sha256
      as well, which is a prerequisite for sensible package signing these days.

  9. By Pete (80.203.236.21) on

    Great stuff.

    Any chance of a pkg_* command based way to search pkg/ports ? something like an updated version of the (deprecated?) 'cd /usr/ports && make search key=whatever'

    /Pete

    Comments
    1. By Dean (71.42.114.218) on

      > Great stuff.
      >
      > Any chance of a pkg_* command based way to search pkg/ports ? something like an updated version of the (deprecated?) 'cd /usr/ports && make search key=whatever'
      >
      > /Pete

      http://ports.openbsd.nu works for me, I use it more than 'make search key=' now, and it can link you to CVSweb if you want to check on recent activity.

    2. By Marc Espie (213.41.185.88) espie@openbsd.org on

      > Great stuff.
      >
      > Any chance of a pkg_* command based way to search pkg/ports ? something like an updated version of the (deprecated?) 'cd /usr/ports && make search key=whatever'
      >
      > /Pete

      Hum... The only thing we ever deprecated is the syntax.

      cd /usr/ports && make key=whatever target
      will do what search used to do, except that you can decide which
      target you want to run...

    3. By Matt Van Mater (69.255.1.181) on

      > Any chance of a pkg_* command based way to search pkg/ports ? something like an updated version of the (deprecated?) 'cd /usr/ports && make search key=whatever'


      Here is my personal hack I have been using for a while now. Just add the following two lines to your .profile

      export PKG_PATH=ftp://ftp3.usa.openbsd.org/pub/OpenBSD/`uname -r`/packages/`uname -m`/
      alias pls="echo ls | ftp -a $PKG_PATH | sed 's/.*\ //g' | grep -i "

      That way you can execute a command like `pls wget` and it will list all packages that match the string wget. You can chain that together and do "pkg_add `pls wget`" and it will add the packages that match your search... of course if there are multiple results it doesn't work so hot.

      Think of it as "pls" = "package ls". It's a hack, but works nicely most of the time. A fancier version I made that creates a menu for you to decide when there are multiple results is pasted below. It has been a long time since I tried it, so take it with a grain of salt :)

      #!/usr/bin/perl

      # Utility to search a remote ftp mirror that contains OpenBSD packages
      # Can be used to simply list packages, or add them
      # Released under the BSD license
      # Written By: Matt Van Mater
      # Contact Info: matt.vanmater@gmail.com
      # Release Date: 10/02/05
      # Version: 0.1

      use strict;
      use warnings;

      # main section
      my $funct = shift;
      my $search_term = shift;

      if (!defined $funct) {
          usage();
      } elsif ($funct eq "ls") {
          conf() unless -e ".pkgmirror";
          my @pkg_list = ls($search_term);
          if (@pkg_list) {
              print "$_\n" foreach @pkg_list;
          } else {
              print "No matches found, try revising your search\n";
          }
      } elsif ($funct eq "add") {
          conf() unless -e ".pkgmirror";
          add($search_term);
      } elsif ($funct eq "conf") {
          conf();
      } else {
          usage();
      }

      # install the package matching the search term, asking when ambiguous
      sub add {
          my $search_term = shift;
          my @match_list = ls($search_term);
          my $match_count = @match_list;

          # numeric comparisons (==), not string ones (eq/gt), for counts
          if ($match_count == 0) {
              print "No matches found, try revising your search\n";
          } elsif ($match_count == 1) {
              print "One match found, installing $match_list[0] now\n";
              # list form of system: the package name never goes through a shell
              system("pkg_add", $match_list[0]);
          } else {
              my $count = 0;
              print "Choose which package to install:\n";
              foreach my $match (@match_list) {
                  print "[$count] $match\n";
                  $count++;
              }
              print "[X] Exit this menu\n";

              my $choice = <STDIN>;
              $choice = "" unless defined $choice;
              chomp($choice);

              if ($choice =~ /^\d+$/ && $choice < $match_count) {
                  print "You chose $match_list[$choice], installing now\n";
                  system("pkg_add", $match_list[$choice]);
              } else {
                  print "Exiting now\n";
              }
          }
      }

      # fetch the official mirror list and let the user pick one mirror
      sub conf {
          print "Fetching mirror list from ftp.openbsd.org\n";
          system("ftp", "-a", "ftp://ftp.openbsd.org/pub/OpenBSD/ftplist");

          die "Could not fetch ftp mirror list, exiting\n" unless -e "ftplist";

          open(my $in, "<", "ftplist") or die "Cannot read ftplist: $!\n";
          chomp(my @mirror_list = <$in>);
          close $in;
          unlink("ftplist");

          my @ftp_list;
          foreach my $site (@mirror_list) {
              if ($site =~ m{ftp://(.*?)\s+(.*)}) {
                  # the menu number is the index the mirror gets in @ftp_list
                  print "[", scalar(@ftp_list), "] $site\n";
                  push @ftp_list, $1;
              }
          }
          print "[X] Exit this menu\n";
          print "Type the number of the mirror you want to use\n";

          my $choice = <STDIN>;
          $choice = "" unless defined $choice;
          chomp($choice);

          if ($choice =~ /^\d+$/ && $choice < @ftp_list) {
              print "You chose $ftp_list[$choice], saving selection to .pkgmirror file\n";
              open(my $out, ">", ".pkgmirror") or die "Cannot write .pkgmirror: $!\n";
              print $out "$ftp_list[$choice]\n";
              close $out;
          } else {
              die "Exiting now\n";
          }
      }

      # return the package file names on the mirror that match the search term
      sub ls {
          my $search_term = shift;
          # if no search term provided, just list entire directory
          $search_term = ".*" unless defined $search_term;

          chomp(my $ver = `uname -r`);
          chomp(my $arch = `uname -m`);
          chomp(my $mir = `cat .pkgmirror`);

          my $pkg_path = "ftp://$mir/$ver/packages/$arch/";
          $ENV{PKG_PATH} = $pkg_path;    # inherited by pkg_add later on

          print "Getting directory listing from ftp mirror\n";
          my @match_list;
          foreach (`echo ls | ftp -a $pkg_path`) {
              # grab only the file name
              if (/.* (.*?)$/) {
                  my $filename = $1;
                  push @match_list, $filename if $filename =~ /$search_term/i;
              }
          }
          return @match_list;
      }

      sub usage {
          print qq(
      Usage:
      ./pkgg (ls|add) <pkg_name> #list or add packages that (almost) match pkg_name
      ./pkgg conf #configure tool to use official ftp mirror
      ./pkgg #this help screen
      \n);
      }

  10. By Steve Fairhead (195.112.48.103) on http://www.fivetrees.com

    Nice work, Marc. I'm chuffed that I understood most of it ;).

    Refactoring rocks. I'm a believer, and the more I do the better my new code gets. Yay me ;).

    Steve
