OpenBSD Journal

Porting Bacula: The good, the bad, the ugly

Contributed by merdely on from the essence-sucking-backup-daemon dept.

I've been a Bacula user for about 2 1/2 years. It's a pretty good backup system that comprises four main components: file daemon, storage daemon, director and bconsole. bacula-fd is the agent installed on the systems you want to back up. bacula-sd is installed on the system that hosts the backup device (tape drive). bacula-dir is the backup management system that controls and directs the backup processes. bconsole is the console program backup admins use to manage bacula.

I was disappointed to find that bacula was not supported on OpenBSD and set up a Fedora Core 2 box to host my backup server. I was able to create an "in-house" bacula-fd port so I could back up my OpenBSD boxes. Several months ago, when my bacula server's MySQL took a dump during backups, I had to either restore my bacula catalog (which previously took 7 days) or start from scratch. I decided to move to pgsql instead of mysql and wanted to try to port bacula to OpenBSD. I had a working port under 4.0-current but ran into a couple of problems: the speed at which bacula wrote to tape was about 7x slower than on the Linux server (same hardware), and bacula could not append to tapes (only overwrite).

Given that my backup window was quickly approaching, I had to ditch my OpenBSD porting work and install Linux. As soon as I told the Fedora installer to format my disk, I realized that I hadn't backed up my bacula port. <insert forehead slap here />

With the hackathon approaching, I wanted to get the port working so people could test it and hopefully improve its speed and ability to append to tapes. Since the hackathon, we've done a lot of testing and tweaking.

I started with my bacula-fd port (which used the --enable-client-only configure option) and worked from there. The biggest hurdle I faced was creating a single port with multi-packages (-client, -server) where -server supported the different database backends that bacula uses (postgresql, mysql, sqlite3 and sqlite) as flavors. Because the hackathon was approaching quickly and I couldn't work out the multi-package/flavor issues, I initially made two ports: bacula-client and bacula-server. Obviously this was not the desired solution, but it got me going.

With some hints from kili@ and steven@ and then some final guidance from robert@, I was able to create one port. Basically, I created the port with "MULTI_PACKAGES= -main -client -server" and "FLAVORS= pgsql mysql sqlite3 sqlite". Then "FULLPKGNAME-client= ${DISTNAME:S/-/-client-/}" pins the name of the -client subpackage so that no FLAVORs are ever applied to it. But when FLAVOR is set (and it always is for bacula; the default FLAVOR is sqlite), the -main and -server subpackages are built for that specific flavor.
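
Pulling those pieces together, a minimal sketch of the relevant part of the port Makefile would look something like this (the configure switches, the flavor tests and the default-flavor handling below are my own illustration, not copied from the committed port):

    DISTNAME=            bacula-2.0.3
    MULTI_PACKAGES=      -main -client -server
    FLAVORS=             pgsql mysql sqlite3 sqlite
    FLAVOR?=             sqlite

    # pin the -client package name so it never grows a flavor suffix
    FULLPKGNAME-client=  ${DISTNAME:S/-/-client-/}

    # -main and -server are configured for whichever flavor was chosen
    .if ${FLAVOR} == "pgsql"
    CONFIGURE_ARGS+=     --with-postgresql
    .elif ${FLAVOR} == "mysql"
    CONFIGURE_ARGS+=     --with-mysql
    .elif ${FLAVOR} == "sqlite3"
    CONFIGURE_ARGS+=     --with-sqlite3
    .else
    CONFIGURE_ARGS+=     --with-sqlite
    .endif

With something like that in place, "env FLAVOR=pgsql make package" builds the flavored -main and -server packages, while the -client package name stays plain (bacula-client-2.0.3).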

The bacula-client subpackage includes just the bacula-fd related files (man page, startup script and configuration file). The bacula-server subpackage includes bacula-sd, bacula-dir, bconsole and other utilities, man pages, scripts and configuration files. The bacula-2.0.3 package is a super-package containing both the -client and -server files.

There has been much testing by robert@ and jdixon@, along with access to machines provided by jj@ and ckuethe@, and the port seems to work as expected.

Finally, I'd like to thank espie@ (among others) for providing such a robust ports system. Over the past few years, I've seen the process of creating ports become more streamlined and flexible.

Comments
  1. By Anonymous Coward (193.63.217.208) on

    Two points you raised but didn't answer directly:

    Did you get the tape speed up to match Linux?

    Did you get round the overwrite/append problem?

    Comments
    1. By Mike Erdely (merdely) on http://erdelynet.com/

      > Two points you raised but didn't answer directly:
      > Did you get the tape speed up to match Linux?
      > Did you get round the overwrite/append problem?

      Those were both st(4) driver problems as far as I can tell. They're not related to bacula except that bacula was using an st(4) device.

      The hope was to get bacula in the tree and then use it to work on the st(4) driver.

      So the answer to both of your questions is: no.

  2. By sthen (85.158.44.149) on

    Thanks for porting Bacula, and thumbs up to the comments about the robust ports system. If you try and do anything with the superficially-similar system on some other OS, you'll *really* appreciate some of the decisions that have been made here. systrace/fake/update-plist? genius. And this is before you even look at pkg_*...

  3. By Anonymous Coward (204.80.187.5) on

    I've never had a problem writing new backups that append to tapes, using mt for positioning and tar+dd for writing.

    Maybe it's Bacula that doesn't set up appending properly, not st(4)?

    You should check out how Bacula does it and compare the code to mt.
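
    For what it's worth, a minimal sketch of that mt/tar/dd approach (the device name and the number of archives already on the tape are assumptions on my part):

      # skip forward past the two archives already on the no-rewind st(4) device
      mt -f /dev/nrst0 fsf 2
      # append a new archive after the last file mark, using 64k blocks
      tar cf - /home | dd of=/dev/nrst0 bs=64k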

    Anyways, all this stuff is made obsolete by boxbackup, which doesn't bother with tapes and is also developed on OpenBSD. Too bad there's no port for it!

    Comments
    1. By Michiel van Baak (mvanbaak) on http://michiel.vanbaak.info

      > Anyways, all this stuff is made obsolete by boxbackup, which doesn't bother with tapes and is also developed on OpenBSD. Too bad there's no port for it!

      I totally agree with you there.
      building it from source isn't that hard, but a port will be nice.

      Comments
      1. By Mike Erdely (merdely) on http://erdelynet.com/

        >> Anyways, all this stuff is made obsolete by boxbackup, which doesn't
        >> bother with tapes and is also developed on OpenBSD. Too bad there's
        >> no port for it!
        > I totally agree with you there.
        > building it from source isn't that hard, but a port will be nice.

        While I disagree that boxbackup obsoletes bacula and other traditional backup methods, I made a port of boxbackup.

        Please test:
        http://marc.info/?l=openbsd-ports&m=118505098528605&w=2

      2. By sthen (85.158.44.149) on

        > > Anyways, all this stuff is made obsolete by boxbackup, which doesn't bother with tapes and is also developed on OpenBSD. Too bad there's no port for it!
        >
        > I totally agree with you there.
        > building it from source isn't that hard, but a port will be nice.

        It's often just as quick to make a port as it is to build from source (assuming you want files kept in sensible locations).

        Comments
        1. By sthen (85.158.44.149) on

          > It's often just as quick to make a port as it is to build from source (assuming you want files kept in sensible locations).

          ...hmm, I should add: the bit that takes time is making sure everything works well, keeping it up to date, finding/fixing problems and reporting them upstream, but you need to do this with a port or with source. Especially relevant for something you want to trust your backups to...

          For simple disk-to-disk backups from Unix-like systems, I'm pretty fond of running dump from cron (generally transferred over ssh).
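
          As a rough sketch of that setup (the hostname, filesystem and schedule below are made up), a root crontab entry on the client might look like:

            # 01:30 nightly: level 0 dump of /home, streamed to the backup host over ssh
            30 1 * * * /sbin/dump -0au -f - /home | ssh backuphost "dd of=/backup/client-home.dump bs=64k"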

    2. By Edward Roper (69.17.54.112) on

      > Anyways, all this stuff is made obsolete by boxbackup, which doesn't bother with tapes and is also developed on OpenBSD. Too bad there's no port for it!

      While disk-to-disk backups are really handy, speedy and oftentimes more convenient than tape, I believe there are still many situations where tape is preferable.

      Off-site storage is difficult if you have to ship an array of drives back and forth.

      This is further compounded when you have a large quantity of data to back up, 30TB+ for example.

      Once you hit storage sizes like this, tape is cheaper than disk, particularly when you have long retention policies. Ultrium LTO-3, for example, is 400/800GB at ~$36.00; 500GB SATA drives seem to be the sweet spot right now at ~$88. That's $0.09/GB (tape) vs. $0.175/GB (disk).

      A quick example: If you're doing monthly full backups with a 6 month retention, you need 180,000GB of backup space.

      $31,499.00 for disk or $16,200.00 for tape.

      Disk hardly looks compelling in this situation.

      Just my $0.02.

      Comments
      1. By Shane J Pearson (59.167.252.29) on

        > While disk-to-disk backups are really handy, speedy and oftentimes more convenient than tape, I believe there are still many situations where tape is preferable.

        Yes. I think there is a lot to be said for off-line backups. I like a mix of these, to address different issues...

        1. Off-line: malicious software can't take out a backup if the backup cannot be addressed by that malicious software.
        2. Off-site: to cover for major disasters like fire, terrorism, etc.
        3. On-line: Quicker restores mean better uptime and thus higher availability of data.

        Looks like boxbackup shows promise for 2 (via networks) and 3, but I feel better having 1, so for me something like boxbackup looks like a nice supplement to traditional backups.

      2. By Anonymous Coward (74.115.21.120) on

        > This is further compounded when you have a large quantity of data to back up, 30TB+ for example.
        >
        > A quick example: If you're doing monthly full backups with a 6 month retention, you need 180,000GB of backup space.

        Why would you do monthly full backups of 30TB of data? If you have that much data you can afford a decent SAN to store the data and online backups together using snapshots/pitr. And then you can replicate it to another SAN in another city for disaster recovery purposes. Most of the data in each of your full backups is going to be redundant since it hasn't changed since the last full backup.

        Comments
        1. By Edward Roper (69.17.54.112) on

          Even with snapshots you're exposed to malicious attacks against online data. I personally view snapshots as a self-service restore for my users when they oops. I use them as a first-line defense against accidental data loss.

          As for most of the data being redundant, I'd agree that for most people, in most situations this is true. There are scenarios, and not just hypothetical ones, where even large data stores have a very high rate of change. I've personally witnessed 5TB+/_day_ of deltas. It would take an insanely expensive pipe to move that much data over the wire every day. This rate of change also severely limits your snapshot retention times.

          Again, I'm not knocking disk-to-disk, snapshot mirrors, etc. I'm simply saying that these have hardly made tape obsolete. If it were feasible to never touch another tape, I'd be happy to jump at it.

        2. By Nick Holland (68.43.113.17) nick@holland-consulting.net on http://www.openbsd.org/faq/

          > > This is further compounded when you have a large quantity of data to
          > > back up, 30TB+ for example.
          > >
          > > A quick example: If you're doing monthly full backups with a 6 month
          > > retention, you need 180,000GB of backup space.
          >
          > Why would you do monthly full backups of 30TB of data? If you have
          > that much data you can afford a decent SAN to store the data and online
          > backups together using snapshots/pitr. And then you can replicate it to
          > another SAN in another city for disaster recovery purposes. Most of the
          > data in each of your full backups is going to be redundant since it
          > hasn't changed since the last full backup.

          What happens when that SAN blows out?

          SANs are complicated things.
          Complicated things go boom once in a while. You better be ready for it.

          Worse, complicated things lead to administrator error. Administrator error is quickly and easily replicated to your Disaster Recovery site as well. And even if it isn't, the bandwidth from your tape drive is better than the bandwidth from your DR site... that may be useful in some cases.

          Tape is not dead.

          Nick.

      3. By guly (88.149.178.255) on

        > A quick example: If you're doing monthly full backups with a 6 month retention, you need 180,000GB of backup space.
        >
        > $31,499.00 for disk or $16,200.00 for tape.
        >
        > Disk hardly looks compelling in this situation.
        >
        > Just my $0.02.


        According to www.skydatacorp.com/prod_tape_subsystems_ultrium2.asp,
        an Ultrium-2 tape has 25% of the lifetime of a SATA II disc (1 million hours MTBF).
        Blame me if I'm wrong, but it's not cheaper in my mind, and not so safe either.

        Comments
        1. By Edward Roper (208.66.102.2) on

          I believe that you're looking at the MTBF of that particular tape drive, under a 100% duty cycle.

          Assuming you were running the drive under such a high duty cycle and did manage to run it into the ground, it's no big deal: tape is removable media, and no data is stored on the drive itself. You can take your media and stick it in any other compatible drive.

          Eventually media wears out, though this allegedly occurs less often with LTO than with other tape technologies.

          According to Wikipedia (http://en.wikipedia.org/wiki/Linear_Tape-Open) you can expect ~30 years of archival storage on LTO media. It's highly likely that you'll be able to recover your data long after every drive in your primary storage pool has either died or been obsoleted.

          In closing, I won't say that it's unlikely anyone actually runs a 100% duty cycle on their tape drives; I'm sure plenty of folks do, but I haven't encountered it under normal circumstances.

  4. By Anonymous Coward (204.80.187.5) on

    Also, for speed, is Bacula setting the right block sizes? Large block sizes may be much faster; at least that's what I remember from using tapes in Solaris.
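
    A quick way to check whether block size is the bottleneck on a given drive is to compare raw write speeds with dd. Note this is a destructive test (it overwrites the start of the tape), the device name is assumed, and real data is more representative than /dev/zero if the drive compresses:

      # ~100MB in 512-byte blocks
      dd if=/dev/zero of=/dev/nrst0 bs=512 count=200000
      mt -f /dev/nrst0 rewind
      # ~100MB in 64k blocks; compare the bytes/sec figures dd reports
      dd if=/dev/zero of=/dev/nrst0 bs=64k count=1600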
