OpenBSD Journal

Tunnelling out of corporate networks (Part 4)

Contributed by mtu on from the the-end-is-near dept.

Tunnelling out of corporate networks - logs, collection and analysis

I haven't heard many people say good things about log analysis or monitoring, but in reality it can be simple and effective. More importantly, if you take the necessary steps to reduce your network's exposure to malware, log monitoring becomes easy and even fun.

Read on to find out more about how we collect and analyse logs:

Articles 1 2 3 4 5

If you have read Bejtlich's book on extrusion detection, then you will know to focus more of your attention on what is trying to leave your network than on the noise being blocked on the way in. With a default deny policy at both ingress and egress points, it is trivial to detect problems or bad traffic that we know is not supposed to be leaving the network.

Logging is generally messy because every system has its own log format and every environment is unique. Interpretation of filtered log results requires a relatively high degree of competence; if it didn't, we would automate it. Even a single do-it-all product will require considerable customization. Reviewing filtered log data and updating your filters is an ongoing process. Keep this in mind, as you need to budget time from your technical staff for it.

What should you log? We follow Marcus Ranum's Laws of Logging:

  • 1st law - Never collect more data than you can conceive of possibly using.
  • 2nd law - The number of times an uninteresting thing happens is an interesting thing.
  • 3rd law - Collect everything you can except when you come into conflict with the first law.

    When possible, we try to centralize logs from all sources: Unix system logs, managed switches, Windows logs, managed power bars, VMware ESX, IDS, DNS, firewalls, proxies and web server logs. We log everything but DEBUG.

    The basic principles that we use to log are as follows: centralize the logs for retention and analysis. This makes it easier to ensure logs are retained for the minimum period, and it simplifies log backup and storage capacity planning. Logs from multiple sources can be examined as a single stream, whereby events can be correlated. We also use two basic collection mechanisms: blow and suck. Blow: individual events are sent by the device in 'real time'. Suck: logs are periodically collected from the device in batches.
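Treating centralized logs as a single time-ordered stream is what makes correlation possible. As a minimal sketch (Python rather than our actual tooling, with made-up sample lines): because every line starts with an ISO 8601 timestamp in a fixed UTC offset, lexicographic order matches chronological order and pre-sorted sources can be merged lazily:

```python
import heapq

def merge_streams(*sources):
    # Each source must already be sorted; lines begin with an ISO 8601
    # timestamp in a fixed UTC offset, so string order == time order.
    return heapq.merge(*sources)

# Illustrative sample lines, not real log output
firewall = [
    "2009-11-12T15:22:40+0900 pf: block out on em0 ...",
    "2009-11-12T15:22:47+0900 pf: block out on em0 ...",
]
switch = [
    "2009-11-12T15:22:47+0900 SYST: Port 6 link down",
]

for line in merge_streams(firewall, switch):
    print(line)
```

The fixed-offset assumption matters: if sources log in different time zones or formats, timestamps must be normalized first.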

    The fundamental principles that we use for log analysis are as follows: filter logs by removing the things that you know are OK, not by trying to identify the things that are bad. To do this, you need to know what is OK, which requires a very clear view of the network environment. So we constrain the environment to keep it simple, and we gather information about it. Then we generate reports that count even the 'uninteresting' events.
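A sketch of that filtering principle (the 'known OK' patterns below are illustrative assumptions, not our real rules): anything not explicitly known to be OK survives for investigation, and the suppressed events are still counted, per Ranum's 2nd law:

```python
import re
from collections import Counter

# Hypothetical 'known OK' patterns; real rules are site-specific.
OK_PATTERNS = [
    re.compile(r"arp info overwritten for .* by 00:09:8a:03:b4:b[78]"),
    re.compile(r"named\[\d+\]: client .*: query \(cache\).*denied$"),
]

def filter_logs(lines):
    """Suppress known-OK events but count them; everything else
    falls through for investigation."""
    counts = Counter()
    survivors = []
    for line in lines:
        for pat in OK_PATTERNS:
            if pat.search(line):
                counts[pat.pattern] += 1
                break
        else:
            survivors.append(line)
    return survivors, counts
```

The counts can feed a scheduled report, so that a sudden change in the volume of an 'uninteresting' event stands out.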

    In the previous article, I explained our naming strategy. It is all in an effort to simplify our environment. We want to ensure predictability, so we fix IP addresses on workstations and map between arbitrary items: IP [], Ext. [5923], MAC [00:0a:e4:2a:d8:af], workstation name [000AE42AD8AF] with an alias to the user name [muemura]. We want to control the network and limit traffic between our heavily segmented networks to only what is explicitly allowed (needed). This is important for IDS placement.

    Understanding the environment

    The components that we use are syslog-ng, a Perl pre-processor, Swatch and Splunk. Syslog-ng shouldn't be new to sysadmins, so I'll explain the Perl pre-processor. It's a custom tool that looks for IP addresses, MAC addresses and switch ports in logs and expands them with information from other sources such as our asset inventory, DNS and our netdisco database.

    A regular log entry looks like this:

        2009-11-12T15:22:47+0900 SYST: Port 6 link down

    The pre-processor turns it into this:

        2009-11-12T15:22:47+0900 (switch-name) \
             SYST: Port 6 [00:0a:e4:2a:d8:af (muemura)] link down

    where (switch-name) identifies the source device and the MAC address
    is expanded with the workstation alias and user name.
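A rough sketch of what such a pre-processor does (Python here for brevity; ours is written in Perl, and the lookup tables below are made-up stand-ins for the asset inventory, DNS and netdisco data):

```python
import re

# Made-up lookup tables; in practice these are fed from the asset
# inventory, DNS and the netdisco database.
MAC_INFO = {"00:0a:e4:2a:d8:af": "muemura"}
HOST_BY_IP = {"192.168.16.31": "switch-name"}

MAC_RE = re.compile(r"\b([0-9a-f]{2}(?::[0-9a-f]{2}){5})\b", re.I)
IP_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b")

def enrich(line, port_macs=None):
    """Expand MACs and IPs in a log line with who/what they belong to."""
    line = MAC_RE.sub(
        lambda m: "%s (%s)" % (m.group(1), MAC_INFO.get(m.group(1).lower(), "?")),
        line)
    line = IP_RE.sub(
        lambda m: "%s (%s)" % (m.group(1), HOST_BY_IP.get(m.group(1), "?")),
        line)
    # A port-down message carries no MAC, so look up the MAC last seen
    # on that switch port (which netdisco tracks for us).
    m = re.search(r"Port (\d+)", line)
    if m and port_macs and int(m.group(1)) in port_macs:
        mac = port_macs[int(m.group(1))]
        alias = MAC_INFO.get(mac, "?")
        line = line.replace(m.group(0), "%s [%s (%s)]" % (m.group(0), mac, alias))
    return line
```

For example, `enrich("SYST: Port 6 link down", port_macs={6: "00:0a:e4:2a:d8:af"})` yields `"SYST: Port 6 [00:0a:e4:2a:d8:af (muemura)] link down"`.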

    We use Swatch for real-time monitoring with two basic filters: a light filter that removes 'noisy' events but presents a steady flow of network activity, with events to watch for highlighted in colour; and a heavy filter that removes all 'OK' activity, so that anything making it through this filter by definition needs to be investigated. There are also specific 'report' filters running as scheduled batch jobs to generate reports for review of administrative actions.

    Swatch Filter Fragment
    # DNS query denied [Request Ticket #342]
    ignore /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\+\d{4}\s\
         $ipv4_full\snamed\[\d+\]: client $ipv4_full#\d+: query \
         \(cache\).*(AAAA|A6)\/IN. denied$/
    # iSCSI array load balancing
    ignore /arp info overwritten for $ipv4_full by \
         (00:09:8a:03:b4:b7|00:09:8a:03:b4:b8) on vlan160$/
    # warn by default
    watchfor /.*/
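The same light/heavy split can be sketched outside of Swatch (the patterns below are illustrative stand-ins, not our real rules): the light filter keeps a steady, colour-highlighted view, while the heavy filter suppresses everything known to be OK so that whatever remains demands investigation:

```python
import re

# Illustrative stand-in patterns; the real rules live in the Swatch config.
OK_PATTERNS = [re.compile(r"query \(cache\).*denied$"),
               re.compile(r"arp info overwritten for .* on vlan160$")]
NOISY_PATTERNS = [re.compile(r"Port \d+ link (up|down)$")]

HIGHLIGHT, RESET = "\033[31m", "\033[0m"

def light_filter(line):
    """Steady real-time view: drop only the noisiest events and
    colour-highlight anything not known to be OK."""
    if any(p.search(line) for p in NOISY_PATTERNS):
        return None
    if any(p.search(line) for p in OK_PATTERNS):
        return line
    return HIGHLIGHT + line + RESET

def heavy_filter(line):
    """Anything that survives this filter must be investigated."""
    if any(p.search(line) for p in OK_PATTERNS + NOISY_PATTERNS):
        return None
    return line
```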

    Our approach has generally been to use Open Source for anything security or network related, mainly because we believe most commercial solutions do not match the Open Source offerings in these two areas, but that's another series of articles :-). The one exception is Splunk, a commercial log analysis tool. It is like Google for your logs. It can handle a variety of sources (syslog, files, pipes, etc.), it runs on almost any platform (Windows/Unix), and it has a powerful interactive search engine with scheduled reporting and alerting functionality. It can be taught log entry semantics (i.e., this is a username, this is a host name, this is an authentication failure). There are also 3rd-party modules for specific log sources (both commercial and Open Source).

    Netdisco is a nice Open Source web-based network management tool that extracts information from routers and switches via SNMP and DNS queries and stores it in a PostgreSQL database.

    For Windows workstations and servers, we use Snare for Windows, which is free software released under the GPL. It allows us to redirect Windows event logs to our syslog-ng server. Windows event logs sent to syslog-ng are not really formatted for easy reading: the log format shows its GUI heritage and needs to be massaged to be useful when reading logs in real time. Consolidating your logs is very important, so we think the end result is worth the effort.
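The massaging amounts to collapsing the event into one scannable line. A minimal sketch, assuming a tab-delimited Snare payload (the field names and their order below are an assumption for illustration; check what your Snare agent actually emits):

```python
# Assumed field layout for a Snare-style tab-delimited event;
# verify against your agent's documentation before relying on it.
SNARE_FIELDS = ("log", "criticality", "source_log", "count", "date",
                "event_id", "source", "user", "sid_type", "event_type",
                "host", "category", "strings")

def reformat_snare(payload):
    """Collapse a Snare-style event into one readable syslog line."""
    values = dict(zip(SNARE_FIELDS, payload.split("\t")))
    return "{date} {host} EventID={event_id} User={user}: {strings}".format(**values)
```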

    We do use an IDS even though we are not fans of IDS or IPS systems (a.k.a. "allow all, block some" Layer 7 firewalls). We are mandated to use at least an IDS in order to comply with our corporate security policy. We use Snort to satisfy this requirement but don't rely on it beyond that. We believe that if we have to rely on an IDS to help enforce our security policy by alerting us to potential security issues, then we have not done a good enough job. Hopefully, this will be evident in the next and final article.

    I'll briefly touch on some of the other Open Source programs that we use, but since they are not directly related to this series, they will only get a passing mention here. We track every source code and config change using Puppet, with Subversion as our version control system and Trac as a web interface to the repository. As an aside, it's a good idea to store the most important config files of all your machines on a secure and easily protected server, and I couldn't think of a better operating system for this purpose. We use Squid as our web proxy and Request Tracker (RT) as our ticketing system, mainly for change control management. We also use SmokePing for statistical network latency information and Nagios to monitor our systems.

    As mentioned before, we want to prevent infection from happening in the first place, but we must periodically verify this apart from any log monitoring that we do. We use a freeware auditing tool called AIDA32, which is no longer being developed but is just fine for our purposes. There is a commercial successor called Everest for those interested in this tool. We commit reports for each workstation into Subversion and can see at a glance what, if anything, has changed from any point in time. We also have baseline images that we compare systems against. This gives us a better idea of what has changed on each machine, and it is a good starting point for checking problem workstations after the usual PEBKAC check.
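Committing the audit reports to version control means change detection is just a diff. A sketch of the idea (we effectively get this from 'svn diff' on the committed reports; the sample report lines are made up):

```python
import difflib

def report_changes(old_report, new_report):
    # Keep only added/removed lines from a unified diff of two
    # audit-report snapshots; context and headers are dropped.
    diff = difflib.unified_diff(
        old_report.splitlines(), new_report.splitlines(),
        fromfile="baseline", tofile="current", lineterm="")
    return [line for line in diff
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]
```

Anything the report shows as added or removed since the baseline is a candidate for a closer look at that workstation.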

    Centralizing and massaging your logs with tools similar to the ones we use is an important step towards understanding the state, or health, of your network. Besides revealing signs of malware, this makes host or application misconfiguration easy to detect. When something does happen that is not OK, it sticks out like a sore thumb. Creating a ticket and closing it in a timely manner also puts you in good stead for audits, as it shows that you are on top of things and proactive at correcting problems.

    If the above is in place and you take the necessary steps to reduce your exposure to malware, then log analysis and monitoring will be a walk in the park. Keep in mind that the end goal is not to be effective at log analysis and monitoring but rather to be effective at preventing infection on your internal network. In the next and final article, I will explain how we went about eliminating this sort of problem. It's not magic or Draconian in measure; just really effective. I hope that you stay tuned.

    Mark T. Uemura

    (Comments are closed)

    1. By Anonymous Coward (bodie) on

      Another great article in a superb series. It would be nice to read more stories like this from the production trenches.

    2. By diw (diw) on

      I think this is the talk.

      Eminently sensible.

      Best wishes.

      1. By diw (diw) on

        > I think this is the talk.
        > Eminently sensible.
        > Best wishes.

        Pardon moi.

        Best wishes.


    Copyright © - Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to as well as images and HTML templates were copied from the fabulous original with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]