Contributed by jason from the what-happened-to-self-healing-networks dept.
As I was sitting at my desk in mid-afternoon, I was surprised by an instant message from one of our media team members. This is out of the ordinary and generally means something is broken.
3:18:07 vince: http down?
3:18:37 jason: not that I'm aware of, but I noticed DNS was acting hokey. Why, is it down for you?
3:19:11 vince: yes. couldnt connect to dev
3:20:08 jason: I'll check it out
Years of networking experience have taught me to always start at the bottom [layer of the OSI model] and work upwards. This was no exception. A quick ping dev ruled out any problems with basic connectivity. I then opened a web browser and loaded a series of pages, both internal and external. For the most part, they timed out; however, a few tabs loaded slowly, while one loaded instantaneously (none of these sites were cached, so we can rule that out). Something was obviously amiss, and it was my job to track it down.
I was a DNS administrator in a previous life, so I'm wholly aware of the chaos that can be caused by a misbehaving nameserver. We use two OpenBSD 3.9 systems as our authoritative and resolving DNS servers, serving up both internal and external Bind views with the default chrooted named daemon. This configuration has served us well, and the servers nary skip a beat serving up hundreds of requests per second. I started out by running some queries against each server. The responses mirrored the activity we experienced during web browsing: often the queries would take 2-3 seconds to respond, occasionally they would respond immediately, and sometimes they would simply time out. This was consistent across both of the DNS servers, regardless of the query target (internal or external recursion).
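Dig reports the latency of each lookup in its statistics footer, which makes this kind of inconsistency easy to quantify. A minimal sketch, using a canned sample line rather than a live query:

```shell
# Sketch only: pull the latency out of dig's ";; Query time:" statistics
# line. The sample variable stands in for real dig output.
sample=';; Query time: 2873 msec'
printf '%s\n' "$sample" | awk -F': ' '/Query time/ {print $2}'
# prints: 2873 msec
```

Run in a loop against each nameserver (e.g. feeding it from `dig @ns1 somehost`, hostname hypothetical), this turns "sometimes slow" into numbers worth graphing.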
No changes had been made to any of the DNS zones or Bind configuration in weeks, so any sort of typo was ruled out as a cause. Given the inconsistent behavior, we were starting to consider a denial-of-service attack. I opened up an OpenSSH connection to the primary master nameserver and snooped around. Process lists (ps -ax), network status (netstat -i -I em0 1) and kernel activity (vmstat 1) reports all came back normal. Processor load was virtually nothing. And yet, the conditions worsened... and more users started calling.
Before we continue, I'd like to give a brief overview of our network design. When I took over the infrastructure two years ago, the networks were a mishmash of six isolated LAN segments, each with its own dedicated Cisco PIX connected to the ISP WAN. There was no traffic accounting or Quality-of-Service queuing to ensure each department received the bandwidth it needed to perform its tasks. The users are primarily developers and engineers, known to download large amounts of software (and streaming video) at their own discretion. It was also common practice to allow visiting clients and vendors to connect their laptops directly into the host network. On top of all this, the company was operating on a single T1 connection to the Internet. The frustration of daily user complaints over the network congestion soon turned to hope; hope that OpenBSD, PF, ALTQ and solid design fundamentals would ease my pain.
The CEO at our company is very accepting of open source software. It took very little convincing to get him to agree to a complete overhaul of our networks, starting with a pair of OpenBSD i386 firewalls. Each firewall contains a total of two external vlan(4) interfaces on em0 and 16 internal vlan interfaces on em1. The vlan interfaces also have a corresponding carp(4) interface which provides fail-over between the firewalls. Traffic states are synchronized thanks to pfsync(4), which is bound to sk0. All of the user networks are part of the "internal" interface group, allowing for easy policy-based filtering. Every network is allowed to reach the DMZ, but most of the networks are not allowed to route between themselves at all. This design effectively creates a number of developer "sandboxes" which we have much greater control over. While they continue to allow visitors inside their gated community, I can rest assured that any hazardous traffic will be isolated to their network, the DMZ, or the Internet. This encapsulation also simplifies any queuing structures that I wish to implement.
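The group-based policy described above condenses into only a few pf.conf rules. The following is a sketch only; the table contents, the DMZ network, and the rule details are assumptions for illustration, not the actual configuration:

```
# All internal vlans carry the "internal" interface group, so one rule
# covers every user network. Addresses below are hypothetical.
table <rfc1918> const { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 }
dmz_net = "10.0.50.0/24"

block in on internal all                       # no routing between sandboxes
pass in on internal from any to $dmz_net       # every network may reach the DMZ
pass in on internal from any to ! <rfc1918>    # and the Internet
```

Because the policy keys on the interface group rather than on individual vlans, adding a seventeenth sandbox requires no new filter rules at all.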
As we return to our hero, we remember that the situation was getting dire. The pattern was repeatable, but it made little sense that traffic would flow fine at times, while at other times it would respond slowly or not at all. All of the servers and switches appeared to be operating normally, well within capacity. Even the traffic graphs created by symon revealed we were running at 50% of our 3-Mbps connection (we have since upgraded to a bonded T1 pair). All switch ports and servers are set to auto-negotiate, which ruled out a duplex mismatch as the culprit. My patience was running thin.
Up to this point, I had neglected to analyze any traffic on the wire, as it had appeared to be an application-layer problem. I decided to take a quick look at a tcpdump capture while monitoring the debugging output of named via syslog. Purely by chance, I chose to initiate the DNS query from one of the firewalls. What happened next came as a complete surprise.
While performing a tcpdump -ni em0 udp and host 10.0.0.1 and port 53 on the target nameserver, I issued a dig command from the firewall. To my disbelief, I saw nothing. Wait! There it is, three seconds later: the initial query from the firewall, and an instantaneous response from the nameserver. For some reason, packets leaving the firewall for the nameserver were being delayed. Immediately, I knew what was wrong.
# pfctl -s state | wc -l
    10000
# pfctl -s memory | grep states
states        hard limit    10000
Sure enough, I had left the default state limit intact. The firewalls each have 256MB of memory, but rarely use more than 50MB of it. I edited pf.conf to add set limit states 20000, and issued a pfctl -O -f /etc/pf.conf to load the new options. Almost magically, network activity returned to normal. I sat back in my chair, breathed a deep sigh, and took in a healthy swig of caffeinated goodness.
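The limit had been reached silently; nothing in the logs pointed at pf. A small watchdog along these lines (my own sketch, not something from the incident; the 90% threshold and the pfctl parsing shown in the comments are assumptions) could surface the condition before users do:

```shell
#!/bin/sh
# Hypothetical sketch: warn when the pf state table nears its hard limit.
# The 90% threshold is an arbitrary choice.
warn_if_near_limit() {
    cur=$1; lim=$2
    # awk handles the comparison; plain sh has no floating-point arithmetic
    awk -v cur="$cur" -v lim="$lim" 'BEGIN {
        if (cur >= lim * 0.9) print "WARN: " cur "/" lim " states in use"
        else                  print "OK: " cur "/" lim
    }'
}

# On a live firewall the inputs would come from pfctl, e.g.:
#   cur=$(pfctl -si | awk '/current entries/ {print $3}')
#   lim=$(pfctl -sm | awk '/states/ {print $4}')
warn_if_near_limit 10000 10000    # prints: WARN: 10000/10000 states in use
```

Dropped into cron on each firewall, a check like this would have turned a mysterious afternoon of DNS timeouts into a one-line alert.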
Leaving for the day, I made sure to brief our CEO on the day's misadventure.
5:22:03 bill: everything looks good now
5:23:45 jason: yeah, one of the admins tripped on a cable. problem solved.
5:23:59 bill: ok, thanks
5:24:17 jason: :)