Linux network troubleshooting a la Dr. House

Intro

The following story is inspired by a recent case I had to troubleshoot at work. I think it is a nice example of troubleshooting Linux networking issues, so I’ve modified/simplified the setup a bit to be able to reproduce it on a VM. I’ll go through the troubleshooting steps in almost the same way we handled the actual case. Service names, IPs, ports, etc are all different that the real case as the focus should not be the example itself but the process.

It all started a few days ago when I was asked to help on an “unusual” case. Docker containers on every single host of an installation could not establish connections towards services that listen on the “main” IP of the host they run on, nor can they ping that IP, but the containers have full access to the internet and can connect to the service ports on other hosts in the LAN. As everyone who has done even a tiny bit of support, asking whether something changed recently in the setup is always replied back with a single global truth: “nothing has recently changed, it just stopped working”.
Challenge accepted!

Reproduction setup

For reproduction purposes I’ve used a VM with one ethernet interface, and a docker bridge. In this VM I have injected the same problem as with the real case. Even though the real case case was a bit more complicated, to make following the post somewhat easier, I’ve used only one service listening on the host, an Elasticsearch process, and only one Kibana docker container that needs to communicate with Elasticsearch on the host.

Troubleshooting process

Host interfaces:

2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 06:ce:3b:94:fe:ac brd ff:ff:ff:ff:ff:ff
    inet 172.31.45.100/20 brd 172.31.47.255 scope global dynamic ens5
       valid_lft 3572sec preferred_lft 3572sec
    inet6 fe80::4ce:3bff:fe94:feac/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:e9:38:3d:a8 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:e9ff:fe38:3da8/64 scope link 
       valid_lft forever preferred_lft forever

Kibana’s config has the following ENV variable set ELASTICSEARCH_HOSTS=http://172.31.45.100:9200, and for simplification purposes let’s assume that this IP could not be changed.

As originally described, curl from the container towards the service IP:port does not work

(container) bash-4.2$ curl -v 172.31.45.100:9200
* About to connect() to 172.31.45.100 port 9200 (#0)
*   Trying 172.31.45.100...

it just hangs there without error. There’s no DNS resolution involved here, straight curl towards the IP:port

Let’s check if the service is actually listening on the host

[root@ip-172-31-45-100 ~]# ss -ltnp | grep 9200
LISTEN      0      128                             [::]:9200                                  [::]:*                   users:(("java",pid=17892,fd=257))

The service listens on 9200. Since the service listens on all interfaces, let’s curl from the container towards the service IP:port on the docker0 interface.

bash-4.2$ curl -v 172.17.0.1:9200
* About to connect() to 172.17.0.1 port 9200 (#0)
*   Trying 172.17.0.1...
* Connected to 172.17.0.1 (172.17.0.1) port 9200 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 172.17.0.1:9200
> Accept: */*
> 
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 524
< 
{
  "name" : "node1",
  "cluster_name" : "centos7",
  "cluster_uuid" : "d6fBSua6Q9OvSu534roTpA",
  "version" : {
    "number" : "7.7.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "81a1e9eda8e6183f5237786246f6dced26a10eaf",
    "build_date" : "2020-05-12T02:01:37.602180Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

that works, so the service is running properly. Curl-ing the service from the host using the host’s IP also works

[root@ip-172-31-45-100 ~]# curl http://172.31.45.100:9200
{
  "name" : "node1",
  "cluster_name" : "centos7",
  "cluster_uuid" : "d6fBSua6Q9OvSu534roTpA",
  "version" : {
    "number" : "7.7.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "81a1e9eda8e6183f5237786246f6dced26a10eaf",
    "build_date" : "2020-05-12T02:01:37.602180Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Let’s check for internet connectivity from the container

bash-4.2$ curl -v 1.1.1.1
* About to connect() to 1.1.1.1 port 80 (#0)
*   Trying 1.1.1.1...
* Connected to 1.1.1.1 (1.1.1.1) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 1.1.1.1
> Accept: */*
> 
< HTTP/1.1 301 Moved Permanently
< Date: Sun, 31 May 2020 10:10:27 GMT
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: keep-alive
< Location: https://1.1.1.1/
< Served-In-Seconds: 0.000
< CF-Cache-Status: HIT
< Age: 5334
< Expires: Sun, 31 May 2020 14:10:27 GMT
< Cache-Control: public, max-age=14400
< cf-request-id: 030bcf27a3000018e57f0f5200000001
< Server: cloudflare
< CF-RAY: 59bfe7b90b7518e5-FRA
< 
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>cloudflare-lb</center>
</body>
</html>

internet connectivity for the container works just fine. Let’s curl to another host in the same LAN on the same service port.

bash-4.2$ curl -v 172.31.45.101:9200
* About to connect() to 172.31.45.101 port 9200 (#0)
*   Trying 172.31.45.101...
* Connected to 172.31.45.101 (172.31.45.101) port 9200 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 172.31.45.101:9200
> Accept: */*
> 
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 524
< 
{
  "name" : "node2",
  "cluster_name" : "centos7",
  "cluster_uuid" : "d6fBSua6Q9OvSu534roTpA",
  "version" : {
    "number" : "7.7.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "81a1e9eda8e6183f5237786246f6dced26a10eaf",
    "build_date" : "2020-05-12T02:01:37.602180Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

That also works. Time to use the swiss army knife of network troubleshooting, tcpdump. If you want to find which veth interface a container uses you can either use dockerveth or use the following commands to figure it out manually.
Get the iflink of container’s eth0:

[root@ip-172-31-45-100 ~]# docker exec -it <container-name> bash -c 'cat /sys/class/net/eth0/iflink'

In this case that would be:

# docker exec -it kibana bash -c 'cat /sys/class/net/eth0/iflink'
41

then find the file name of the ifindex that contains that link number in `/sys/class/net/veth*/ifindex` of the host

[root@ip-172-31-45-100 ~]# grep -lw 41 /sys/class/net/veth*/ifindex
/sys/class/net/veth0006ca6/ifindex

`veth0006ca6` is what we need to use. Let’s run tcpdump on it

[root@ip-172-31-45-100 ~]# tcpdump -nni veth0006ca6
10:06:16.745143 IP 172.17.0.2.47166 > 172.31.45.100.9200: Flags [S], seq 1062649548, win 29200, options [mss 1460,sackOK,TS val 1781316 ecr 0,nop,wscale 7], length 0
10:06:16.749126 IP 172.17.0.2.47168 > 172.31.45.100.9200: Flags [S], seq 4174345004, win 29200, options [mss 1460,sackOK,TS val 1781320 ecr 0,nop,wscale 7], length 0
10:06:16.749131 IP 172.17.0.2.47170 > 172.31.45.100.9200: Flags [S], seq 1386880792, win 29200, options [mss 1460,sackOK,TS val 1781320 ecr 0,nop,wscale 7], length 0

the syn packet is seen going out of the container’s veth interface. So let’s tcpdump on docker0

[root@ip-172-31-45-100 ~]# tcpdump -nni docker0
10:07:07.813153 IP 172.17.0.2.47148 > 172.31.45.100.9200: Flags [S], seq 4114480937, win 29200, options [mss 1460,sackOK,TS val 1396384 ecr 0,nop,wscale 7], length 0
10:07:07.845141 IP 172.17.0.2.47144 > 172.31.45.100.9200: Flags [S], seq 3273546229, win 29200, options [mss 1460,sackOK,TS val 1412416 ecr 0,nop,wscale 7], length 0
10:07:07.845147 IP 172.17.0.2.47146 > 172.31.45.100.9200: Flags [S], seq 2062214864, win 29200, options [mss 1460,sackOK,TS val 1412416 ecr 0,nop,wscale 7], length 0

the syn packet can also be seen on the docker0 bridge. The syn packet cannot be seen on the interface (ens5) that has the service IP (172.31.45.100) on it, since it doesn’t traverse that link to go outside the host.

[root@ip-172-31-45-100 ~]# tcpdump -nni ens5 port 9200 or icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens5, link-type EN10MB (Ethernet), capture size 262144 bytes

Let’s check routing entries.

[root@ip-172-31-45-100 ~]# ip route ls
default via 172.31.32.1 dev ens5 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 
172.31.32.0/20 dev ens5 proto kernel scope link src 172.31.45.100 

Nothing interesting here at all. Time to check iptables.

[root@ip-172-31-45-100 ~]# iptables -nxvL
Chain INPUT (policy ACCEPT 169 packets, 27524 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
     368    27870 DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
     184    14254 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
     184    14254 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
     184    13616 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
       0        0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 109 packets, 10788 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain DOCKER (1 references)
    pkts      bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION (1 references)
    pkts      bytes target     prot opt in     out     source               destination         
     368    27870 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0      

[root@ip-172-31-45-100 ~]# iptables -nxvL -t nat
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
     204    12200 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 6 packets, 456 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
       0        0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 6 packets, 456 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
      92     6808 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0           

Chain DOCKER (2 references)
    pkts      bytes target     prot opt in     out     source               destination         
     191    11460 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0  

[root@ip-172-31-45-100 ~]# iptables -nxvL -t mangle
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination  

There’s not even a DROP rule at all and all the policies are set to ACCEPT. iptables is definitely not dropping the connection. Even if there was a DROP rule, we would see the packet on tcpdump…so where’s the packet going ?

Let’s add an extra rule for both FORWARD and INPUT chains just to see if iptables can match these rules as the packets are passing by.

[root@ip-172-31-45-100 ~]# iptables -I INPUT -p tcp --dport 9200
[root@ip-172-31-45-100 ~]# iptables -I FORWARD -p tcp --dport 9200

wait for a while and then check the statistics of those 2 rules:

[root@ip-172-31-45-100 ~]# iptables -nxvL | grep 9200
       0        0            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:9200
       0        0            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:9200

no packets match these 2 rules at all! Time to inspect the container and the docker bridge network.

[root@ip-172-31-45-100 ~]# docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "a6290df54ea24d14faa8d003d17802b3f8a4967680bc0c82c1211ab75d1815e2",
        "Created": "2020-05-31T09:40:19.81958733Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {},
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]

pretty standard options for the bridge network, even `enable_icc` is set to `true`. What about the container though ?

[root@ip-172-31-45-100 ~]# docker inspect kibana
[
    {
        "Id": "2f08cc190b760361d9aa2951b4c9c407561fe35b8dbdc003f3f535719456f460",
        "Created": "2020-05-31T10:24:11.942315341Z",
        "Path": "/usr/local/bin/dumb-init",
        "Args": [
            "--",
            "/usr/local/bin/kibana-docker"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 3735,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2020-05-31T10:24:12.329214827Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:eadc7b3d59dd47b1b56f280732f38d16a4b31947cbc758516adbe1df5472b407",
        "ResolvConfPath": "/var/lib/docker/containers/2f08cc190b760361d9aa2951b4c9c407561fe35b8dbdc003f3f535719456f460/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/2f08cc190b760361d9aa2951b4c9c407561fe35b8dbdc003f3f535719456f460/hostname",
        "HostsPath": "/var/lib/docker/containers/2f08cc190b760361d9aa2951b4c9c407561fe35b8dbdc003f3f535719456f460/hosts",
        "LogPath": "",
        "Name": "/kibana",
        "RestartCount": 0,
        "Driver": "overlay2",
        "MountLabel": "system_u:object_r:svirt_sandbox_file_t:s0:c434,c792",
        "ProcessLabel": "system_u:system_r:svirt_lxc_net_t:s0:c434,c792",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "journald",
                "Config": {}
            },
            "NetworkMode": "bridge",
            "PortBindings": {
                "5601/tcp": [
                    {
                        "HostIp": "",
                        "HostPort": "5601"
                    }
                ]
            },
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": true,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "docker-runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Name": "overlay2",
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/84217deb518fa6b50fb38aab03aa6a819150e0a248cf233bda8091b136c4825a-init/diff:/var/lib/docker/overlay2/bfda0aa2ec51f7047b5694e5daf89735f3021691e6154bc370827c168c4572f0/diff:/var/lib/docker/overlay2/5d32d74a3bb95b8e3377b1c115622f12a817a591936c4ae2da4512bc2e281e4b/diff:/var/lib/docker/overlay2/6482e711a89a90bebd61834aa8bd3463f567684dd3cdbbf2698179b752fdad7b/diff:/var/lib/docker/overlay2/4ae81e6a07956c974d985674c35e113ad3fbd9f4fdde43f4752c0e36a1153e69/diff:/var/lib/docker/overlay2/8330cdd839ec316133d659805f2839d1e65b16fbf7035324f419c2aa8d097925/diff:/var/lib/docker/overlay2/cd377c8c6fb23d050771d55ca15253cc9fa5043c7e49f41a2f73acd25f8e7ca9/diff:/var/lib/docker/overlay2/408c72d7e496be76503bbb01d5248c25be98e2290d71cae83d8d5d09d714f81d/diff:/var/lib/docker/overlay2/20068d51c4dd214db7b2b9d30fe13feb2e8ab35de646c9b652fea255476d396b/diff:/var/lib/docker/overlay2/0fdedfd6dbb551d32a9e826188a74936d0cec56e97c1b917fdd04b0e49a59a70/diff:/var/lib/docker/overlay2/0238ff31fbd60fdeaee1e162c92a1aa46735ec7b17df3b11455c09f18657c30f/diff:/var/lib/docker/overlay2/99a7a64a569e5e524e4139f9cf95bd929744c85c1633bcd8173c9172756c3233/diff",
                "MergedDir": "/var/lib/docker/overlay2/84217deb518fa6b50fb38aab03aa6a819150e0a248cf233bda8091b136c4825a/merged",
                "UpperDir": "/var/lib/docker/overlay2/84217deb518fa6b50fb38aab03aa6a819150e0a248cf233bda8091b136c4825a/diff",
                "WorkDir": "/var/lib/docker/overlay2/84217deb518fa6b50fb38aab03aa6a819150e0a248cf233bda8091b136c4825a/work"
            }
        },
        "Mounts": [],
        "Config": {
            "Hostname": "2f08cc190b76",
            "Domainname": "",
            "User": "kibana",
            "AttachStdin": false,
            "AttachStdout": true,
            "AttachStderr": true,
            "ExposedPorts": {
                "5601/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "ELASTICSEARCH_HOSTS=http://172.31.45.100:9200",
                "PATH=/usr/share/kibana/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "ELASTIC_CONTAINER=true"
            ],
            "Cmd": [
                "/usr/local/bin/kibana-docker"
            ],
            "Image": "docker.elastic.co/kibana/kibana:7.7.0",
            "Volumes": null,
            "WorkingDir": "/usr/share/kibana",
            "Entrypoint": [
                "/usr/local/bin/dumb-init",
                "--"
            ],
            "OnBuild": null,
            "Labels": {
                "license": "Elastic License",
                "org.label-schema.build-date": "2020-05-12T03:25:49.654Z",
                "org.label-schema.license": "Elastic License",
                "org.label-schema.name": "kibana",
                "org.label-schema.schema-version": "1.0",
                "org.label-schema.url": "https://www.elastic.co/products/kibana",
                "org.label-schema.usage": "https://www.elastic.co/guide/en/kibana/index.html",
                "org.label-schema.vcs-url": "https://github.com/elastic/kibana",
                "org.label-schema.vendor": "Elastic",
                "org.label-schema.version": "7.7.0",
                "org.opencontainers.image.created": "2020-05-04 00:00:00+01:00",
                "org.opencontainers.image.licenses": "GPL-2.0-only",
                "org.opencontainers.image.title": "CentOS Base Image",
                "org.opencontainers.image.vendor": "CentOS"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "258a4a11f55f7425b837c7d5c0420dd344add081be79da7e33c146501dd8f0ec",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {
                "5601/tcp": [
                    {
                        "HostIp": "0.0.0.0",
                        "HostPort": "5601"
                    }
                ]
            },
            "SandboxKey": "/var/run/docker/netns/258a4a11f55f",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "7126885dddf9ff031f2ff8c3b2cbd14708391dae619020bdb40efe7a849a01c7",
            "Gateway": "172.17.0.1",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "172.17.0.2",
            "IPPrefixLen": 16,
            "IPv6Gateway": "",
            "MacAddress": "02:42:ac:11:00:02",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "a6290df54ea24d14faa8d003d17802b3f8a4967680bc0c82c1211ab75d1815e2",
                    "EndpointID": "7126885dddf9ff031f2ff8c3b2cbd14708391dae619020bdb40efe7a849a01c7",
                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.2",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:11:00:02"
                }
            }
        }
    }
]

all looks very normal regarding the docker container. Let’s check sysctl settings in /etc

[root@ip-172-31-45-100 ~]# ls -Fla /etc/sysctl.d/
total 12
drwxr-xr-x.  2 root root   28 May 31 09:34 ./
drwxr-xr-x. 84 root root 8192 May 31 10:15 ../
lrwxrwxrwx.  1 root root   14 May 31 09:34 99-sysctl.conf -> ../sysctl.conf

[root@ip-172-31-45-100 ~]# cat /etc/sysctl.d/99-sysctl.conf 

# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).

nothing interesting here as well. What if someone has messed up ip forwarding via other means though ?

[root@ip-172-31-45-100 ~]# sysctl -a 2>/dev/null| grep forward | grep -v ipv6
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.conf.docker0.forwarding = 1
net.ipv4.conf.docker0.mc_forwarding = 0
net.ipv4.conf.ens5.forwarding = 1
net.ipv4.conf.ens5.mc_forwarding = 0
net.ipv4.conf.lo.forwarding = 1
net.ipv4.conf.lo.mc_forwarding = 0
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_use_pmtu = 0

all looks fine here too. Let’s check some more sysctl settings regarding bridge + iptables

[root@ip-172-31-45-100 ~]# sysctl -a 2>/dev/null| grep bridge
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-filter-pppoe-tagged = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
net.bridge.bridge-nf-pass-vlan-input-dev = 0

everything still looks fine in these configuration settings, but the packets from the container still can’t reach the host.
Next step is to setup a netcat listening service on the host on a different port and try to connect to it via the container. That still doesn’t work, no packets to be seen on ens5.
Could it be ebtables ? No..no way..but what if…

[root@ip-172-31-45-100 ~]# ebtables -L
Bridge table: filter
Bridge chain: INPUT, entries: 0, policy: ACCEPT
Bridge chain: FORWARD, entries: 0, policy: ACCEPT
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

still nothing interesting. Could it be a kernel bug ? is this some custom kernel ?

[root@ip-172-31-45-100 ~]# uname -a
Linux ip-172-31-45-100.eu-central-1.compute.internal 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

nope…that’s a vanilla centos7 kernel. Could it be nftables ? On 3.10 kernel and centos7 ?

[root@ip-172-31-45-100 ~]# nft list tables
-bash: nft: command not found

Nobody uses nftables yet, right ? Another wild thought, are there any ip rules defined ?

[root@ip-172-31-45-100 ~]# ip rule ls
0:    from all lookup local 
100:    from 172.31.45.100 lookup 1 
32766:    from all lookup main 
32767:    from all lookup default 

bingo, there’s a rule with priority 100 that matches the host’s IP address! What is this ip rule doing there ? Let’s check routing table 1 that the lookup of rule 100 points to

[root@ip-172-31-45-100 ~]# ip route ls table 1
default via 172.31.32.1 dev ens5 
172.31.32.0/20 dev ens5 scope link 

at last, here’s the answer!

There’s an IP rule entry that says that packets with a source IP of the ens5 interface should lookup routing entries only in routing table 1, which is not the main routing table. That routing table knows nothing about the docker network (172.17.0.0/16). Let’s delete the rule from the host

[root@ip-172-31-45-100 ~]# ip rule del from 172.31.45.100/32 tab 1 priority 100

and check if the container can contact the service now

(container) bash-4.2$ curl 172.31.45.100:9200
{
  "name" : "node1",
  "cluster_name" : "centos7",
  "cluster_uuid" : "d6fBSua6Q9OvSu534roTpA",
  "version" : {
    "number" : "7.7.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "81a1e9eda8e6183f5237786246f6dced26a10eaf",
    "build_date" : "2020-05-12T02:01:37.602180Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Success!

Where’s the SYN+ACK ?

Does the SYN packet reach the listening service ? No…and the reason is rp_filter. Centos7 sets net.ipv4.conf.default.rp_filter=1, so when docker0 interface gets created it is set to net.ipv4.conf.docker0.rp_filter=1.

Here’s what rp_filter values mean according to kernel documentation:

  • 0 – No source validation.
  • 1 – Strict mode as defined in RFC3704 Strict Reverse Path Each incoming packet is tested against the FIB and if the interface is not the best reverse path the packet check will fail. By default failed packets are discarded.
  • 2 – Loose mode as defined in RFC3704 Loose Reverse Path Each incoming packet’s source address is also tested against the FIB and if the source address is not reachable via any interface the packet check will fail.

After reverting the deleted ip rule via ip rule add from 172.31.45.100/32 tab 1 priority 100 and setting sysctl -w net.ipv4.conf.docker0.rp_filter=0 we can see the SYN+ACK packet going out of ens5 interface towards the default gateway.

[root@ip-172-31-45-100 ~]# tcpdump -enni ens5 port 9200
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens5, link-type EN10MB (Ethernet), capture size 262144 bytes
15:44:19.028353 06:ce:3b:94:fe:ac > 06:1b:e5:19:30:12, ethertype IPv4 (0x0800), length 74: 172.31.45.100.9200 > 172.17.0.2.47460: Flags [S.], seq 1431574088, ack 1879007024, win 26847, options [mss 8961,sackOK,TS val 22063599 ecr 22063599,nop,wscale 7], length 0
15:44:20.029163 06:ce:3b:94:fe:ac > 06:1b:e5:19:30:12, ethertype IPv4 (0x0800), length 74: 172.31.45.100.9200 > 172.17.0.2.47460: Flags [S.], seq 1431574088, ack 1879007024, win 26847, options [mss 8961,sackOK,TS val 22064600 ecr 22063599,nop,wscale 7], length 0
15:44:21.229126 06:ce:3b:94:fe:ac > 06:1b:e5:19:30:12, ethertype IPv4 (0x0800), length 74: 172.31.45.100.9200 > 172.17.0.2.47460: Flags [S.], seq 1431574088, ack 1879007024, win 26847, options [mss 8961,sackOK,TS val 22065800 ecr 22063599,nop,wscale 7], length 0

Finding such discarded packets, called martians, in the logs can be done by enabling log_martians via sysctl -w net.ipv4.conf.all.log_martians=1. Example syslog message:

May 31 16:01:08 ip-172-31-45-100 kernel: IPv4: martian source 172.31.45.100 from 172.17.0.2, on dev docker018
May 31 16:01:08 ip-172-31-45-100 kernel: ll header: 00000000: 02 42 e9 38 3d a8 02 42 ac 11 00 02 08 00 .B.8=..B......

But why ?

Why was the rule there in the original case ? Multihoming was tried, it didn’t work as expected and not all the configs were removed. Grep-ing /etc for the host’s IP found the following file:

/etc/sysconfig/network-scripts/rule-ens5:from 172.31.45.100/32 tab 1 priority 100

In multihoming it’s common that packets reaching a host on interface X should also be replied back from interface X. Part of a method to achieve this is to assign each interface its own routing table.

So when asked to troubleshoot networking issues act like Dr. House would, assume the worst.

P.S. thanks to Markos for the comments on improving the blogpost

Tormap – World map of Tor nodes – 5 years later

5 years ago I forked Moritz’s tormap project, updated it a bit and wrote about it. Tormap kept running for years until some changes in googlemaps broke it, not all KMLs were loading as they should. I later on figured out that googlemaps didn’t like that some of the KML files were larger than 3Mb. I didn’t have much time to play with it until recently, so a few days ago I decided to make it work again. I used newer googlemaps v3 API calls and compressed KML (KMZ) files to make it work. Then @iainlearmonth and @nusenu_ suggested making even more changes…

Their first suggestion was to use onionoo instead of parsing consensus on my own and running geoip on it, onionoo already provides that in a nice json output. Their other suggestion was to switch tormap to use OpenStretMap instead of googlemaps mostly because googlemaps block some Tor exit nodes and the tiles didn’t appear on the map when visiting over Tor. Both of these issues are fixed now.

I used leaflet.js and a couple of plugins like leaflet-plugins (for KML parsing) and leaflet-color-markers for the switch to OpenStreetMap. I will admit that using googlemaps APIs was far more convenient for someone without any javascript knowledge like me.

Maybe in the next 5 years I will have time again to implement their other suggestion, creating maps of nodes based on custom searches for relay attributes. Unless someone else wants to implement that, feel free to fork it!

Firejail with Tor HOWTO

A few years ago I created a set of scripts to start applications inside a linux namespace and automatically “Tor-ify” their network traffic. The main reason behind this effort was to provide some isolation and Tor support for applications that don’t have socks5 support, for example claws-mail. While this worked it was hard to keep adding sandboxing features like the ones firejail already provided. So I decided to take a look at how I could automatically send/receive traffic from a firejail-ed application through Tor.

This blog post is NOT meant to be used as copy/paste commands but to explain why each step is needed and how to overcome the problems found in the path.
If you have reasons to proxy all your traffic through Tor as securely as possible use Tails on a different machine, this guide is NOT for you.

A dedicated bridge
First of all create a Linux bridge and assign an IP address to it. Use this bridge to attach the veth interfaces that firejail creates when using the ‘net’ option. This option creates a new network namespace for each sandboxed application.

# brctl addbr tornet
# ip link set dev tornet up
# ip addr add 10.100.100.1/24 dev tornet

NAT
Then enable NAT from/to your “external” interface (eno1 in my case) for tcp connections and udp port 53 (DNS) and enable IP(v4) forwarding, if you don’t already use it. Some rules about sane default policy for FORWARD chain are added here well, modify to your needs.

# sysctl -w net.ipv4.conf.all.forwarding=1
# iptables -P FORWARD DROP
# iptables -A INPUT -m state --state INVALID -j DROP
# iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
# iptables -A FORWARD -i tornet -o eno1 -p tcp -j ACCEPT
# iptables -A FORWARD -i tornet -o eno1 -p udp --dport=53 -j ACCEPT
# iptables -t nat -A POSTROUTING -s 10.100.100.0/24 -o eno1 -j MASQUERADE

This configuration is enough to start a sandboxed application that will have it’s traffic NAT-ed from the Linux host.

$ firejail --net=tornet /bin/bash
Parent pid 26730, child pid 26731

Interface        MAC                IP               Mask             Status
lo                                  127.0.0.1        255.0.0.0        UP    
eth0             72:cc:f6:d8:6a:09  10.100.100.29    255.255.255.0    UP    
Default gateway 10.100.100.1

$ host www.debian.org
www.debian.org has address 5.153.231.4
Host www.debian.org not found: 3(NXDOMAIN)
Host www.debian.org not found: 4(NOTIMP)
$ host whoami.akamai.net
whoami.akamai.net has address 83.235.72.202
$ curl wtfismyip.com/text
3.4.5.6

(where 3.4.5.6 is your real IP and 83.235.72.202 should be the IP address of the final DNS recursive resolver requesting information from whois.akamai.net)

So NAT works and the shell is sandboxed.

“Tor-ify” traffic
Edit /etc/tor/torrc and enable TransPort and VirtualAddrNetwork Tor features to transparently proxy to the Tor network connections landing on Tor daemon’s port 9040. DNSPort is used to resolve DNS queries through the Tor network. You don’t have to use IsolateDestAddr for your setup, but I like it.

TransPort 9040
VirtualAddrNetwork 172.30.0.0/16
DNSPort 5353 IsolateDestAddr

Then use iptables to redirect traffic from tornet bridge to TransPort and DNSPort specified in torrc. You also need to ACCEPT that traffic in your INPUT chain if your policy is DROP (it is right ?)

# iptables -t nat -A PREROUTING -i tornet -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.1:5353
# iptables -t nat -A PREROUTING -i tornet -p tcp -j DNAT --to-destination 127.0.0.1:9040
# iptables -A INPUT -i tornet -p tcp --dport 9040 -j ACCEPT
# iptables -A INPUT -i tornet -p udp --dport 5353 -j ACCEPT

Run your sandbox again and try to access the same website:

$ firejail --net=tornet /bin/bash
$ curl wtfismyip.com/text
curl: (7) Failed to connect to wtfismyip.com port 80: Connection timed out

aaaand nothing happens. The problem is that you have tried to route traffic from a “normal” interface to loopback which is considered a “martian” and is not allowed by default by the Linux kernel.

sysctl magic
To enable loopback to be used for routing the route_localnet sysctl setting must be set.
# sysctl -w net.ipv4.conf.tornet.route_localnet=1

Try again:

$ firejail --net=tornet /bin/bash
$ host whoami.akamai.net
whoami.akamai.net has address 74.125.181.10
Host whoami.akamai.net not found: 3(NXDOMAIN)
Host whoami.akamai.net not found: 4(NOTIMP)
$ curl wtfismyip.com/text
176.10.104.243
$ host 176.10.104.243
243.104.10.176.in-addr.arpa domain name pointer tor2e1.digitale-gesellschaft.ch.

it works!

You can actually run any program you want like that:
$ firejail --net=tornet google-chrome

Accessing onion services
There’s one problem left though, accessing onion services.
If you try and access www.debian.org onion service from your firejail+tor setup you will get an error.

$ firejail --net=tornet /bin/bash
$ curl http://sejnfjrq6szgca7v.onion/
curl: (6) Could not resolve host: sejnfjrq6szgca7v.onion

To fix that you need to modify /etc/tor/torrc again and add AutomapHostsOnResolve option.
AutomapHostsOnResolve 1

$ firejail --net=tornet /bin/bash
$ curl -I http://sejnfjrq6szgca7v.onion/
HTTP/1.1 200 OK
Date: Fri, 09 Dec 2016 12:05:56 GMT
Server: Apache
Content-Location: index.en.html
Vary: negotiate,accept-language,Accept-Encoding
TCN: choice
Last-Modified: Thu, 08 Dec 2016 15:42:34 GMT
ETag: "3a40-543277c74dd5b"
Accept-Ranges: bytes
Content-Length: 14912
Cache-Control: max-age=86400
Expires: Sat, 10 Dec 2016 12:05:56 GMT
X-Clacks-Overhead: GNU Terry Pratchett
Content-Type: text/html
Content-Language: en

Accessing onion services works as well now.

Applications supporting socks5
If you already have some of your applications proxying connections to tor using 127.0.0.1:9050 then you need to add another iptables rule to redirect the socks traffic from inside firejail’s namespace to Tor SocksPort.
# iptables -t nat -A PREROUTING -i tornet -p tcp -m tcp --dport 9050 -j DNAT --to-destination 127.0.0.1:9050

Update on the state of STARTTLS support of Greek email providers

2 months ago I wrote a blog post describing the really bad state of STARTTLS support of Greek email providers. Things have slightly gotten better since then.

Updates on STARTTLS support per provider
The following is current as of 2016/03/26 and are only the updates since the previous blog post.
FORTHNET: Supports TLS 1.2 (at least since 2016/02/03)
VODAFONE: Supports TLS 1.2 for vodafone.gr but NOT for vodafone.com.gr (at least since 2016/03/10)

Updates on Certificate status per provider (that have STARTTLS support)
FORTHNET: uses a valid certificate (a wildcard *.forthnet.gr)
VODAFONE: uses a valid certificate (a wildcard *.megamailservers.eu)
MAILBOX: uses a valid signed certificate (for spamexperts.eu) (at least since 2016/03/26)

No other changes have been observed.

These updates indicate that 3 out of 5 commercial Greek ISPs currently use STARTTLS on their mail servers, OTE/COSMOTE, Forthnet and Vodafone. Way better than 1 out 5 which was the case 2 months ago. That means that the only ones left behind are Wind and Cyta, Since HOL has merged with Vodafone.

P.S. Thanks fly to @stsimb for notifying me of Forthnet updates with a comment on my blog

The sorry state of STARTTLS support of Greek email providers

I started looking into the STARTTLS support of Greek email providers completely by accident when one email of mine wasn’t being delivered for some reason to a friend who has an email address at a traditional Greek ISP. I started looking into the delivery issues by running swaks against the email server of the ISP and I just couldn’t believe it that the ISP’s mail server response did not include STARTTLS support. That made me wonder about the rest of the ISPs, so I created a very simple script that takes domains, finds their MX addresses and performs very simple TLS lookups using openssl. Yeah I know that there are websites that track the STARTTLS support of mail servers, but they usually don’t save the previous results and you can’t grep and compare.

What I’ve looked into is how emails are sent between servers (SMTP), not if users can read emails from the mail servers (POP3/IMAP) using encrypted connections.

TL;DR
The situation is BAD, REALLY BAD. Only 1,5 (yes, this is one and a half) commercial ISPs supports STARTTLS. OTE/COSMOTE has “proper” STARTTLS support while Wind has STARTTLS support only for windtools.gr domain, but not for their wind.gr.

I couldn’t believe the situation was SO, SO BAD before looking at the results. It seems that I had a lot more faith in those providers than I should have. Yeah I was wrong once again.

wtf is STARTTLS?
(please don’t read the next sentence if you know what TLS is)
If you have no idea about TLS and STARTTLS, then consider STARTTLS a way for servers to communicate and exchange data in encrypted form instead of cleartext. If mail servers don’t support STARTTLS then other servers can’t send them emails in encrypted form and everyone between those 2 servers can read the emails. It’s the equivalent of “https://” for mail servers. (There, I said it…).

TLS support per provider
The following is current as of 2016/01/23

Commercial Providers
OTE/COSMOTE: Some servers support TLS version 1.0 and some others 1.2 (more on that later)
WIND: Supports TLS version 1.0 on windtools.gr but does NOT support TLS on wind.gr (different mail servers)
CYTA: Does NOT support TLS on their mail servers
FORTHNET: Does NOT support TLS on their mail servers
HOL: Does NOT support TLS on their mail servers
VODAFONE: Does NOT support TLS on their mail servers

non-Commercial Providers
GRNET: Supports TLS 1.2
SCH: Supports TLS 1.0
TEE: Does NOT support TLS on their mail servers
MIL: Supports TLS 1.0

Universities
AUTH: Supports TLS 1.2
NTUA: Some servers support TLS 1.0 and one supports TLS 1.2
UPATRAS: Supports TLS 1.0

Free Providers
IN: Does NOT support TLS on their mail servers
FREEMAIL: Does NOT support TLS on their mail servers
MAILBOX: Supports TLS 1.2

Radical Providers
ESPIV: Supports TLS 1.2

Certificate status per provider (that have STARTTLS support)
OTE/COSMOTE: *.otenet.gr mail servers, which are the ones that support TLS 1.0, use a certificate that is valid for mailgate.otenet.gr, *.ote.gr mail servers have their own certificates, but all mail*.dt-one.com mail servers, which are the ones that use TLS 1.2, use the same self-signed certificate.
WIND: mx2.windtools.gr uses a valid certificate
GRNET: uses a valid certificate
SCH: uses a self-signed certificate (which has expired 5 years ago) signed by their own CA (which has expired 4 years ago)
MIL: uses a self-signed certificate (which has expired 1 year ago) signed by their own CA
AUTH: uses a certificate signed by their own CA called HARICA, whose certificate is now included in modern OSes, so I will consider this a valid certificate.
NTUA: all mail servers use a certificate that is valid for mail.ntua.gr
UPATRAS: uses a valid certificate
MAILBOX: uses a self-signed certificate (by plesk)
ESPIV: uses a valid certificate (a wildcard *.espiv.net)

Why does it matter
It makes a huge difference for users’ privacy. If a mail server does not support STARTTLS then anyone with the ability to look into packets traveling on the net from a source mail server to the destination mail server can read the emails in pure plaintext, as you read them on your mail client. Support of STARTTLS for a mail server forces an adversary that previously just passively monitored traffic to have to start a MITM (Man in the middle) attack in order to read those same emails. This converts the adversary from a passive to an active attacker. And this is both expensive and dangerous for the adversary, it can get caught in the act.

Security and privacy-minded people might start bashing me on my next proposal, but considering the current situation I think it’s OK for most of the users of those providers that don’t support TLS at all.
Dear providers, please install a certificate, even a self-signed one, and add support for STARTTLS on your mail servers today.

Even a self-signed certificate improves this situation. And it costs absolutely nothing. There’s really no excuse to not even have a self-signed certificate for your email server.

Self-signed vs CA-Signed
Truth is that it 99.9999% of email servers on the Internet do not verify the remote end’s certificate upon communication. That means that it makes absolutely no difference in most cases whether the certificate is CA-signed or self-signed. Most modern email servers support fingerprint verification for remote servers’ certificates but this can’t obviously scale on the Internet. If a user fears that some entity could MITM their email provider just to read their email, they already have bigger problems and certificate verification would not be able to help them a lot anyway. They either need to protect the contents of their email (gpg?) or start using alternate means of messaging/communication (pond?)

script
The script I used is on github: gr-mx. Feel free to make changes and send pull requests.
I plan to run the script once a week just to keep an archive of the results and be able to track and compare. Let’s see if something changes…

Various weirdness
* windtools.gr has 2 MX records, mx1.windtools.gr and mx2.windtools.gr. mx1.windtools.gr has been unreachable since I started running the script on 2016/01/08.
* mail{5,6,7,8}.dt-one.com mailservers used by OTE/COSMOTE did not have the self-signed certificate on 2016/01/08 while mail{1,2,3,4}.dt-one.com had it. The certificate was added at some point between 2016/01/11 and 2016/01/17

keys.void.gr – A GPG Keyserver in Greece

After some months of entertaining the idea of setting up a public gpg keyserver I finally managed to find some time and do it this weekend.

Habemus keys.void.gr Keyserver!

Some history
The first time I set up a gpg keyserver was 3 years ago. Its purpose was to make it possible for a researcher to get more results than the default on a single query from a keyserver. Using that keyserver the Greek PGP Web of Trust 2012 edition was created. After the original import of the keys, I refreshed the keys just 2 or 3 times in the following years.

The setup
The keyserver is running on Debian Linux with SKS version 1.1.5. Port 80 and 443 are being handled by nginx which acts as a reverse proxy for SKS. I originally had port 11371, the default port that gpg client uses, behind nginx as well but I had to remove it due to the following issue. I like using HSTS header for the HTTPS port, but browsers trying to access http://keys.void.gr:11371, were switching to https://keys.void.gr:11371 (because of HSTS) which couldn’t work because port 11371 does not use TLS. So once a browser visited https://keys.void.gr and got the HSTS header, every future connection towards http://keys.void.gr:11371 would fail. The solution was to use a protocol multiplexer called sslh. What this does, is that it sniffs the connections coming towards port 11371 and if it finds a TLS connection, it sends it to port 443, if it finds an HTTP connection it sends it to port 80. That way you can either visit http://keys.void.gr:11371 or https://keys.void.gr:11371 and they both work.

For ports 80,443 the connection path looks like: client -> nginx -> sks
For port 11317 the connection path looks like: client -> sslh -> nginx -> sks

keys.void.gr is available in both IPv4 and IPv6.

I’ve also setup an onion/hidden service for the keyserver, so if you prefer visiting the onion address, here it is: wooprzddebtxfhnq.onion (available on port 11371 as well).

Difficulties
I’m not sure if it’s the Debian package’s fault or I did something stupid, but if you plan on running your own keyserver be very careful with permissions on the your filesystem. sks errors are not very friendly. Make sure that /var/spool/sks, /var/lib/sks and /var/log/sks are all owned by debian-sks:debian-sks.
# chown -R debian-sks:debian-sks /var/spool/sks /var/lib/sks /var/log/sks
Don’t run the DB building script as root, run it as debian-sks user:
# sudo -u debian-sks /usr/lib/sks/sks_build.sh
There are a quite some tunables referenced in the sks man page regarding pagesizes, I went with the default options for now.

The pool
To enter the pool of keyservers and start interacting with other keyservers you have to join the sks-devel mailing list and announce your server existence by sending your “membership line” which looks like this:
keys.void.gr 11370 # George K. <keyserver [don't spam me] void [a dot goes here] gr> #0x721006E470459C9C

If people place this line in their membership config file and you place theirs, then the keyservers start communicating, or “gossiping” as it is called in the sks language. It needs to be mutual.

Because of the minimal traffic I was seeing on the mailing list archives I thought that finding peers would take weeks, if not months. I was very very wrong. I got 6 replies to my email in less than 2 hours. Impressive. Thanks a lot people!

UI
I’ve taken the boostrap-ed HTML from https://github.com/mattrude/pgpkeyserver-lite.

TODO
hkps support will be added in the following days or weeks.

Stats
keys.void.gr Keyserver statistics
sks-keyservers.net pool Status for keys.void.gr

Enjoy!

SMTP over Hidden Services with postfix

More and more privacy experts are nowdays calling people to move away from the email service provider giants (gmail, yahoo!, microsoft, etc) and are urging people to set up their own email services, to “decentralize”. This brings up many many other issues though, and one of which is that if only a small group people use a certain email server, even if they use TLS, it’s relatively easy for someone passively monitoring (email) traffic to correlate who (from some server) is communicating with whom (from another server). Even if the connection and the content is protected by TLS and GPG respectively, some people might feel uncomfortable if a third party knew that they are actually communicating (well these people better not use email, but let’s not get carried away).

This post is about sending SMTP traffic between two servers on the Internet over Tor, that is without someone being able to easily see who is sending what to whom. IMHO, it can be helpful in some situations to certain groups of people.

There are numerous posts on the Internet about how you can Torify all the SMTP connections of a postfix server, the problem with this approach is that most exit nodes are blacklisted by RBLs so it’s very probable that the emails sent will either not reach their target or will get marked as spam. Another approach is to create hidden services and make users send emails to each other at their hidden service domains, eg username@a2i4gzo2bmv9as3avx.onion. This is quite uncomfortable for users and it can never get adopted.

There is yet another approach though, the communication could happen over Tor hidden services that real domains are mapped to.

HOWTO
Both sides need to run a Tor client:
aptitude install tor torsocks

The setup is the following, the postmaster on the receiving side sets up a Tor Hidden Service for their SMTP service (receiver). This is easily done in his server (server-A) with the following line in the torrc:
HiddenServicePort 25 25. Let’s call this HiddenService-A (abcdefghijklmn12.onion). He then needs to notify other postmasters of this hidden service.

The postmaster on the sending side (server-B) needs to create 2 things, a torified SMTP service (sender) for postfix and a transport map that will redirect emails sent to domains of server-A to HiddenService-A.

Steps needed to be executed on server-B:
1. Create /usr/lib/postfix/smtp_tor with the following content:

#!/bin/sh

torsocks /usr/lib/postfix/smtp $@

2. Make it executable
chmod +x /usr/lib/postfix/smtp_tor

3. Edit /etc/postfix/master.cf and add a new service entry
smtptor unix - - - - - smtp_tor
For Debian Stretch and/or for postfix 2.11+ this should be:

smtptor      unix  -       -       -       -       -       smtp_tor
  -o smtp_dns_support_level=disabled

4. If you don’t already have a transport map file, create /etc/postfix/transport with content (otherwise just add the following to your transport maps file):

domain-a.net        smtptor:[abcdefghijklmn12.onion]
domain-b.com        smtptor:[bbbcccdddeeeadas.onion]

5. if you don’t already have a transport map file edit /etc/postfix/main.cf and add the following:
transport_maps = hash:/etc/postfix/transport

6. run the following:
postmap /etc/postfix/transport && service postfix reload

7. If you’re running torsocks version 2 you need to set AllowInbound 1 in /etc/tor/torsocks.conf. If you’re using torsocks version 1,you shouldn’t, no changes are necessary.

Conclusion
Well that’s about it, now every email sent from a user of server-B to username@domain-a.net will actually get sent over Tor to server-A on its HiddenService. Since HiddenServices are usually mapped on 127.0.0.1, it will bypass the usual sender restrictions. Depending on the setup of the receiver it might even evade spam detection software, so beware…If both postmasters follow the above steps then all emails sent from users of server-A to users of server-B and vice versa will be sent anonymously over Tor.

There is nothing really new in this post, but I couldn’t find any other posts describing such a setup. Since it requires both sides to actually do something for things to work, I don’t think it can ever be used widely, but it’s still yet another way to take advantage of Tor and Hidden Services.

!Open Relaying
When you setup a tor hidden service to accept connections to your SMTP server, you need to be careful that you aren’t opening your mail server up to be an open relay on the tor network. You need to very carefully inspect your configuration to see if you are allowing 127.0.0.1 connections to relay mail, and if you are, there are a couple ways to stop it.

You can tell if you are allowing 127.0.0.1 to relay mail if you have something like this in your postfix configuration by looking at the smtpd_recipient_restrictions and seeing if you have permit_mynetworks, and your mynetworks variable includes 127.0.0.1/8 (default). The tor hidden service will connect via 127.0.0.1, so if you allow that to send without authentication, you are an open relay on the tor network, and you don’t want that…

Three ways of dealing with this.

1. Remove remove 127.0.0.1 from mynetworks and use port 25/587 as usual.

2. Create a new secondary transport that has a different set of restrictions. Copy the restrictions from main.cf and remove ‘permit_mynetworks’ from them
/etc/postfix/master.cf

2525      inet  n       -       -       -       -       smtpd
   -o smtpd_recipient_restrictions=XXXXXXX
   -o smtpd_sender_restrictions=YYYYYY
   -o smtpd_helo_restrictions=ZZZZ

2587 inet n - - - - smtpd
   -o smtpd_enforce_tls=yes
   -o smtpd_tls_security_level=encrypt
   -o smtpd_sasl_auth_enable=yes
   -o smtpd_client_restrictions=permit_sasl_authenticated,reject
   -o smtpd_sender_restrictions=
   -o smtpd_recipient_restrictions=XXXXXXX
   -o smtpd_sender_restrictions=YYYYYY
   -o smtpd_helo_restrictions=XXXXX

Then edit your /etc/tor/torrc

HiddenServiceDir /var/lib/tor/smtp_onion
HiddenServicePort 25 2525
HiddenServicePort 587 2587

3. If your server is not used by other servers to relay email, then you can use the newer postfix variable that was designed for restricting relays smtpd_relay_restrictions (remember NOT to use permit_mynetworks there) to allow emails to be “relayed” by the onion service:
/etc/postfix/main.cf

smtpd_relay_restrictions = permit_sasl_authenticated,
        reject_unauth_destination

smtpd_recipient_restrictions =
        reject_unknown_recipient_domain,
        check_recipient_access hash:$checks_dir/recipient_access,
        permit_sasl_authenticated,
        permit_mynetworks,
        permit

Concerns
Can hidden services scale to support hundreds or thousands of connections e.g. from a mailing list ? who knows…
This type of setup needs the help of big fishes (large independent email providers like Riseup) to protect the small fishes (your own email server). So a new problem arises, bootstrapping and I’m not really sure this problem has any elegant solution. The more servers use this setup though, the more useful it becomes against passive adversaries trying to correlate who communicates with whom.
The above setup works better when there are more than one hidden services running on the receiving side so a passive adversary won’t really know that the incoming traffic is SMTP, eg when you also run a (busy) HTTP server as a hidden service at the same machine.
Hey, where did MX record lookup go ?

Trying it
If anyone wants to try it, you can send me an email using voidgrz25evgseyc.onion as the Hidden SMTP Service (in the transport map).

Links:
http://www.postfix.org/master.5.html
http://www.groovy.net/ww/2012/01/torfixbis
ehloonion/onionmx github repository

*Update 01/02/2015 Added information about !Open Relaying and torsocks version 2 configuration*
*Update 11/10/2016 Updated information about !Open Relaying*
*Update 14/06/2018 Added link to ehloonion/onionmx*

Creating a new GPG key with subkeys

A few weeks ago I created my new GPG/PGP key with subkeys and a few people asked me why and how. The rationale for creating separate subkeys for signing and encryption is written very nicely in the subkeys page of the debian wiki. The short answer is that having separate subkeys makes key management a lot easier and protects you in certain occasions, for example you can create a new subkey when you need to travel or when your laptop gets stolen, without losing previous signatures. Obviously you need to keep your master key somewhere very very safe and certainly not online or attached to a computer.

You can find many other blog posts on the net on the subject, but most of them are missing a few parts. I’ll try to keep this post as complete as possible. If you are to use gpg subkeys you definitely need an encrypted usb to store the master key at the end. So if you don’t already have an encrypted USB go and make one first.

When this process is over you will have a gpg keypair on your laptop without the master key, you will be able to use that for everyday encryption and signing of documents but there’s a catch. You won’t be able to sign other people’s keys. To do that you will need the master key. But that is something that does not happen very often so it should not be a problem in your everyday gpg workflow. You can read about signing other people’s keys at the end of this post. AFAIK you can’t remove your master key using some of the gpg GUIs, so your only hope is the command line. Live with it…

First some basic information that will be needed later.
When listing secret keys with gpg -K keys are marked with either ‘sec’ or ‘ssb’. When listing (public) keys with gpg -k keys are marked with ‘pub’ or ‘sub’.

sec => 'SECret key'
ssb => 'Secret SuBkey'
pub => 'PUBlic key'
sub => 'public SUBkey'

When editing a key you will see a usage flag on the right. Each key has a role and that is represented by a character. These are the roles and their corresponding characters:

Constant           Character      Explanation
─────────────────────────────────────────────────────
PUBKEY_USAGE_SIG      S       key is good for signing
PUBKEY_USAGE_CERT     C       key is good for certifying other signatures
PUBKEY_USAGE_ENC      E       key is good for encryption
PUBKEY_USAGE_AUTH     A       key is good for authentication

Before doing anything make sure you have a backup of your .gnupg dir.
$ umask 077; tar -cf $HOME/gnupg-backup.tar -C $HOME .gnupg

Secure preferences
Now edit your .gnupg/gpg.conf and add or change the following settings (most are stolen from Riseup: OpenPGP Best Practices):

# when outputting certificates, view user IDs distinctly from keys:
fixed-list-mode
# long keyids are more collision-resistant than short keyids (it's trivial to make a key with any desired short keyid)
keyid-format 0xlong
# when multiple digests are supported by all recipients, choose the strongest one:
personal-digest-preferences SHA512 SHA384 SHA256 SHA224
# preferences chosen for new keys should prioritize stronger algorithms:
default-preference-list SHA512 SHA384 SHA256 SHA224 AES256 AES192 AES CAST5 BZIP2 ZLIB ZIP Uncompressed
# If you use a graphical environment (and even if you don't) you should be using an agent:
# (similar arguments as https://www.debian-administration.org/users/dkg/weblog/64)
use-agent
# You should always know at a glance which User IDs gpg thinks are legitimately bound to the keys in your keyring:
verify-options show-uid-validity
list-options show-uid-validity
# when making an OpenPGP certification, use a stronger digest than the default SHA1:
cert-digest-algo SHA256
# prevent version string from appearing in your signatures/public keys
no-emit-version 

Create new key
Time to create the new key. I’m marking user input with bold (↞) arrows

$ gpg --gen-key
gpg (GnuPG) 1.4.12; Copyright (C) 2012 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection?
Your selection?  1 ↞↞↞↞ 
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)  4096 ↞↞↞↞ 
Requested keysize is 4096 bits
Please specify how long the key should be valid.
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0)  0 ↞↞↞↞ 
Key does not expire at all
Is this correct? (y/N)  y ↞↞↞↞ 
You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"
Real name: foo bar ↞↞↞↞ 
Email address: foobar@void.gr ↞↞↞↞ 
Comment:
You selected this USER-ID:
"foo bar <foobar@void.gr>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit?  O ↞↞↞↞ 
You need a Passphrase to protect your secret key.

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
.............+++++
..+++++

gpg: key 0x6F87F32E2234961E marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   3  signed:  14  trust: 0-, 0q, 0n, 0m, 0f, 3u
gpg: depth: 1  valid:  14  signed:   9  trust: 13-, 1q, 0n, 0m, 0f, 0u
gpg: next trustdb check due at 2014-03-18
pub   4096R/0x6F87F32E2234961E 2013-12-01
      Key fingerprint = 407E 45F0 D914 8277 3D28  CDD8 6F87 F32E 2234 961E
uid                 [ultimate] foo bar <foobar@void.gr>
sub   4096R/0xD3DCB1F51C37970B 2013-12-01

Optionally, you can add another uid and add it as the default:

$ gpg --edit-key 0x6F87F32E2234961E                                      
gpg (GnuPG) 1.4.12; Copyright (C) 2012 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub  4096R/0x6F87F32E2234961E  created: 2013-12-01  expires: never       usage: SC  
                               trust: ultimate      validity: ultimate
sub  4096R/0xD3DCB1F51C37970B  created: 2013-12-01  expires: never       usage: E   
[ultimate] (1). foo bar <foobar@void.gr>
gpg> adduid ↞↞↞↞ 
Real name: foo bar ↞↞↞↞ 
Email address: foobar@riseup.net ↞↞↞↞ 
Comment: 
You selected this USER-ID:
    "foo bar <foobar@riseup.net>"
Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit?  O ↞↞↞↞ 
You need a passphrase to unlock the secret key for
user: "foo bar <foobar@void.gr>"
4096-bit RSA key, ID 0x6F87F32E2234961E, created 2013-12-01

pub  4096R/0x6F87F32E2234961E  created: 2013-12-01  expires: never       usage: SC  
                               trust: ultimate      validity: ultimate
sub  4096R/0xD3DCB1F51C37970B  created: 2013-12-01  expires: never       usage: E   
[ultimate] (1)  foo bar <foobar@void.gr>
[ unknown] (2). foo bar <foobar@riseup.net>
gpg> uid 2 ↞↞↞↞ 
gpg> primary ↞↞↞↞ 
gpg> save ↞↞↞↞ 

Let’s see what we’ve got until now, 0x6F87F32E2234961E is the master key (SC flags) and 0xD3DCB1F51C37970B (E flag)is a separate subkey for encryption.

Add new signing subkey
Since we already have a separate encryption subkey, it’s time for a new signing subkey. Expiration dates for keys is a very hot topic. IMHO there’s no point in having an encryption subkey with an expiration date, expired keys are working just fine for decryption anyways, so I’ll leave it without one, but I want the signing key that I’m regularly using to have an expiration date. You can read more about this topic on the gnupg manual (Selecting expiration dates and using subkeys).

$ gpg --edit-key 0x6F87F32E2234961E
gpg (GnuPG) 1.4.12; Copyright (C) 2012 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub  4096R/0x6F87F32E2234961E  created: 2013-12-01  expires: never       usage: SC  
                               trust: ultimate      validity: ultimate
sub  4096R/0xD3DCB1F51C37970B  created: 2013-12-01  expires: never       usage: E   
[ultimate] (1). foo bar <foobar@riseup.net>
[ultimate] (2)  foo bar <foobar@void.gr>
gpg> addkey ↞↞↞↞ 

Key is protected.

You need a passphrase to unlock the secret key for
user: “foo bar <foobar@riseup.net>”
4096-bit RSA key, ID 0x6F87F32E2234961E, created 2013-12-01

Please select what kind of key you want:
(3) DSA (sign only)
(4) RSA (sign only)
(5) Elgamal (encrypt only)
(6) RSA (encrypt only)

Your selection? 4 ↞↞↞↞ 
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096 ↞↞↞↞ 
Requested keysize is 4096 bits
Please specify how long the key should be valid.
         0 = key does not expire
        = key expires in n days
      w = key expires in n weeks
      m = key expires in n months
      y = key expires in n years
Key is valid for? (0) 5y ↞↞↞↞ 
Key expires at Fri 30 Nov 2018 03:36:47 PM EET
Is this correct? (y/N) y ↞↞↞↞ 
Really create? (y/N) y ↞↞↞↞ 
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
+++++
...............+++++

pub  4096R/0x6F87F32E2234961E  created: 2013-12-01  expires: never       usage: SC  
                               trust: ultimate      validity: ultimate
sub  4096R/0xD3DCB1F51C37970B  created: 2013-12-01  expires: never       usage: E   
sub  4096R/0x296B12D067F65B03  created: 2013-12-01  expires: 2018-11-30  usage: S   
[ultimate] (1). foo bar <foobar@riseup.net>
[ultimate] (2)  foo bar <foobar@void.gr>
gpg> save ↞↞↞↞ 

As you can see there’s a new subkey 0x296B12D067F65B03 with just the S flag, that the signing subkey.
Before moving forward it’s wise to create a revocation certificate:

$ gpg --output 0x6F87F32E2234961E.gpg-revocation-certificate --armor --gen-revoke 0x6F87F32E2234961E
sec  4096R/0x6F87F32E2234961E 2013-12-01 foo bar <foobar@riseup.net>

Create a revocation certificate for this key? (y/N) y
Please select the reason for the revocation:
  0 = No reason specified
  1 = Key has been compromised
  2 = Key is superseded
  3 = Key is no longer used
  Q = Cancel
(Probably you want to select 1 here)
Your decision?  1 ↞↞↞↞ 
Enter an optional description; end it with an empty line:
> This revocation certificate was generated when the key was created. ↞↞↞↞ 
> 
Reason for revocation: Key has been compromised
This revocation certificate was generated when the key was created.
Is this okay? (y/N) y ↞↞↞↞ 
You need a passphrase to unlock the secret key for
user: "foo bar <foobar@riseup.net>"
4096-bit RSA key, ID 0x6F87F32E2234961E, created 2013-12-01

Revocation certificate created.

Please move it to a medium which you can hide away; if Mallory gets
access to this certificate he can use it to make your key unusable.
It is smart to print this certificate and store it away, just in case
your media become unreadable.  But have some caution:  The print system of
your machine might store the data and make it available to others!

Encrypt this file and store it someplace safe, eg your encrypted USB. You should definitely not leave it at your laptop’s hard disk. You can even print it and keep it in this form, it’s small enough so one could type it if needed.

Remove Master key
And now the interesting part, it’s time to remove the master key from your laptops’s keychain and just leave the subkeys. You will store the master key in the encrypted usb so it stays safe.

First go and backup your .gnupg dir on your encrypted USB. Don’t move forward until you do that. DON’T!

$ rsync -avp $HOME/.gnupg /media/encrypted-usb
or
$ umask 077; tar -cf /media/encrypted-usb/gnupg-backup-new.tar -C $HOME .gnupg

Did you backup your key? Are you sure ?

Then it’s time to remove the master key!

$ gpg --export-secret-subkeys 0x6F87F32E2234961E > /media/encrypted-usb/subkeys
$ gpg --delete-secret-key 0x6F87F32E2234961E
$ gpg --import /media/encrypted-usb/subkeys
$ shred -u /media/encrypted-usb/subkeys

What you’ve accomplished with this process is export the subkeys to /media/encrypted-usb/subkeys then delete the master key and re-import just the subkeys. Master key resides only on the encrypted USB key now. Don’t lose that USB key. USB keys are extremely cheap, make multiple copies of the encrypted key and place them in safe places, you can give one such key to your parents or your closest friend in case of emergency. For safety, make sure there’s at least one copy outside of your residence.

You can see the difference of the deleted master key by comparing the listing of the secret keys in your .gnupg and your /media/encrypted-usb/.gnupg/ dir.

$ gpg -K 0x6F87F32E2234961E                                             
sec#   4096R/0x6F87F32E2234961E 2013-12-01
uid                            foo bar <foobar@riseup.net>
uid                            foo bar <foobar@void.gr>
ssb   4096R/0xD3DCB1F51C37970B 2013-12-01
ssb   4096R/0x296B12D067F65B03 2013-12-01 [expires: 2018-11-30] 
$ gpg --home=/media/encrypted-usb/.gnupg/ -K 0x6F87F32E2234961E                                             
sec   4096R/0x6F87F32E2234961E 2013-12-01
uid                            foo bar <foobar@riseup.net>
uid                            foo bar <foobar@void.gr>
ssb   4096R/0xD3DCB1F51C37970B 2013-12-01
ssb   4096R/0x296B12D067F65B03 2013-12-01 [expires: 2018-11-30] 

Notice the pound (#) in the ‘sec’ line from your ~/.gnupg/. That means that the master key is missing.

Upload your new key to the keyservers if you want to…

Key Migration
In case you’re migrating from an older key you need to sign your new key with the old one (not the other way around!)
$ gpg --default-key 0xOLD_KEY --sign-key 0x6F87F32E2234961E

Write a transition statement and sign it with both the old and the new key:

$ gpg --armor -b -u 0xOLD_KEY -o sig1.txt gpg-transition.txt
$ gpg --armor -b -u 0x6F87F32E2234961E -o sig2.txt gpg-transition.txt

That’s about it…upload the transition statement and your signatures to some public space (or mail it to your web of trust).

Signing other people’s keys
Because your laptop’s keypair does not have the master key anymore and the master key is the only one with the ‘C’ flag, when you want to sign someone else’s key, you will need to mount your encrypted USB and then issue a command that’s using that encrypted directory:
$ gpg --home=/media/encrypted-usb/.gnupg/ --sign-key 0xSomeones_keyid
Export your signature and send it back to people whose key you just signed..

Things to play with in the future
Next stop ? An OpenPGP Smartcard! (eshop) or a yubikey NEO, (related blogpost). Any Greeks want to join me for a mass (5+) order?

References
https://wiki.debian.org/subkeys
https://help.riseup.net/en/security/message-security/openpgp/best-practices
https://alexcabal.com/creating-the-perfect-gpg-keypair/

P.S. 0x6F87F32E2234961E is obviously just a demo key. You can find my real key here.
P.S.2 The above commands were executed on gpg 1.4.12 on Debian Wheezy. In the future the output of the commands will probably differ.

Anonymize headers in postfix

E-mail headers usually leak some information about the person sending the email. Most servers reveal the sender’s originating IP, but sometimes we might not want this behavior. Here’s a simple way to modify your postfix server to remove just the IP of the sender. The original idea is from https://we.riseup.net/debian/mail but with postfix 2.9 version (Debian Wheezy) using the way proposed in the riseup article you will also be anonymizing all intermediate ‘Received: from’ headers and not just the sender’s. The setup proposed by riseup article seems to work fine with postfix 2.7 (Debian Squeeze).

1. Install postfix-pcre if you haven’t already.
# apt-get install postfix-pcre


2.
Create a file /etc/postfix/smtp_header_checks with content:
/^\s*(Received: from)[^\n]*(.*)/ REPLACE $1 [127.0.0.1] (localhost [127.0.0.1])$2


3.
Edit /etc/postfix/master.cf
Find the section about submission and add at the end of it: -o cleanup_service_name=subcleanup
e.g.

submission inet n       -       -       -       -       smtpd
  -o smtpd_tls_security_level=encrypt
  -o smtpd_sasl_auth_enable=yes
  -o smtpd_client_restrictions=permit_sasl_authenticated,reject
  -o milter_macro_daemon_name=ORIGINATING

submission inet n       -       -       -       -       smtpd
  -o smtpd_tls_security_level=encrypt
  -o smtpd_sasl_auth_enable=yes
  -o smtpd_client_restrictions=permit_sasl_authenticated,reject
  -o milter_macro_daemon_name=ORIGINATING
  -o cleanup_service_name=subcleanup

Then at the end of /etc/postfix/master.cf file add the following:

subcleanup unix n       -       -       -       0       cleanup
    -o header_checks=pcre:/etc/postfix/smtp_header_checks

That’s it, reload your postfix and you’re done. When you’ll be sending emails over submission (you do use submission instead of smtp to send your emails, right?) then the first ‘Received’ header will be modified like the following example.
Instead of:

Received: from foo.bar (abcd.efgh.domain.tld [111.222.100.200])
        by mail.domain.tld (Postfix) with ESMTPA id BAB8A1A0224
        for <user@dst.domain2.tld>; Sun, 24 Nov 2013 15:47:50 +0100 (CET)

It will be:

Received: from [127.0.0.1] (localhost [127.0.0.1])
        by mail.domain.tld (Postfix) with ESMTPA id BAB8A1A0224
        for <user@dst.domain2.tld>; Sun, 24 Nov 2013 15:47:50 +0100 (CET)

Extra
If you want to anonymize even more headers, try adding the following to /etc/postfix/smtp_header_checks

/^\s*User-Agent/        IGNORE
/^\s*X-Enigmail/        IGNORE
/^\s*X-Mailer/          IGNORE
/^\s*X-Originating-IP/  IGNORE

Logging
As the riseup article says, be very careful of what is being logged at the server. If you don’t want to log the replacements done by pcre then add something like the following in your rsyslog.conf before any other rule:
:msg, contains, "replace: header Received:" ~

Another day another hacked website

Yesterday morning, phone rings to notify my of a new sms. Someone could not access his website on some server that I am root/administer.
I tried to ping the server and got 1 reply every 10-15 packets so my initial thought was that the hosting provider had fucked up. I pinged other machines in the “neighborhood”, they replied just fine. So the problem lied in my server. I got console access through IPMI, you know…the ones with the cipher zero bug, and I managed to login. An apache2 process was constantly using 100% of a core and the machine sent gazillion packets towards a certain destination.

Since I wanted to investigate what exactly this process did, I put an iptables entry in my OUTPUT chain to block packets towards that destination. The machine became responsive again, though the apache process still ran at 100%. Since I run my vhosts using apache2 mpm_itk module, I knew through the apache2 PIDs’ username which site had been hacked. I grepped the logs for any POST, but I couldn’t see anything. Unfortunately the logs only go back 2 days (NOT my policy! and a very bad one actually…but anyway).

strace -p PID did not yield anything interesting, just the process trying to create sockets to send packets towards the destination.

socket(PF_NETLINK, SOCK_RAW, 0) = 417
bind(417, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(417, {sa_family=AF_NETLINK, pid=11398, groups=00000000}, [12]) = 0
sendto(417, "\24\0\0\0\26\0\1\3\233\323\354Q\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
recvmsg(417, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"0\0\0\0\24\0\2\0\233\323\354Q\206,\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 588
recvmsg(417, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"@\0\0\0\24\0\2\0\233\323\354Q\206,\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 128
recvmsg(417, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\3\0\2\0\233\323\354Q\206,\0\0\0\0\0\0\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 20
close(417) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 417
fcntl(417, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(417, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(417, {sa_family=AF_INET, sin_port=htons(4883), sin_addr=inet_addr("X.Y.Z.W")}, 16) = 0
fcntl(417, F_SETFL, O_RDWR) = 0
sendto(417, "\207\25\312P\322t\0#\317}jf\2(W\374\375\232h\213\220\31\355\277)\320[\255\273\276\221\374"..., 8192, MSG_DONTWAIT, NULL, 0) = -1 EPERM (Operation not permitted)
close(417) = 0

lsof -n -p PID output had hundreds of open log files and a few connections. Grepping out the logs I noticed one that was quite interesting, it went towards another server at port 5555.
apache2 11398 XXXXXXX 416u IPv4 831501972 0t0 TCP A.B.C.D:59210->B.C.D.E:5555 (CLOSE_WAIT)

I run tcpdump there, and of course it was an irc connection. I started capturing everything.

lsof also revealed this:
apache2 11398 XXXXXXX cwd DIR 8,7 4096 2474373 /var/www/vhosts/XXXXXXX/httpdocs/libraries/phpgacl
which I could have have also seen it doing ls /proc/PID/cwd/ …but anyway.

Looking inside that dir I found a file named gacl_db.php. It was base64 encoded. Well actually it was multiple times base64 encoded and obfsuscated by using character substitutions, so I had to de-obfuscate it. It was quite easy using php and some bash scripting.

This is the original base64 encoded/obfuscated file: Original gacl_db.php
This is the final result: Deobfuscated gacl_db.php
(I have removed the irc server details from the deobfuscated file, it’s still there in the original file for whoever wants it though)

It’s just an IRC bot containing a perl reverse shell as well. It has commands to flood other servers, and that’s what my server was doing.

I joined the IRC server and at that time there were more than 90 bots inside. Right now that I’m writing this blog post there are less than 50. Every bot joining the channel outputs a text like this:

[uname!]: FreeBSD a.b.c.d 8.1-RELEASE-p5 FreeBSD 8.1-RELEASE-p5 #10: Fri Sep 30 14:45:56 MSK 2011 root@a.b.c.d:/path/to/to/to/sth pl#27 amd64 (safe: off)
[vuln!]: http://www.a-vhost-name.TLD/libraries/phpgacl/gacl_db.php
[uname!]: Linux x.y.z.w 3.2.0-43-generic #68-Ubuntu SMP Wed May 15 03:33:33 UTC 2013 x86_64 (safe: off)
[vuln!]: http://www.another-vhost-name.another-TLD/libraries/phpgacl/gacl_db.php

So if you run servers or websites, do a locate gacl_db.php.

Since all the bot/servers entering post a [vuln!] message about phpgacl, my guess is that the original vulnerability that allowed the attacker to gain access is right there. I haven’t had time to look into it yet, but I’ve warned my clients to remove this library from their websites as a precaution. You should probably do the same.

Greek PGP Web of Trust 2012 edition

I’ve very glad for hosting this guest post. Dorothea put some real effort into it. So…enjoy!

—————————————————————————————————————–
In 2008 Patroklos Argyroudis created the first visualization of the greek PGP web of trust, based on information supplied mostly by people who attended a keysigning party at Thessaloniki. You can read his related posts at sysc.tl/tag/web-of-trust/ [0]
In 2012, during the second cryptoparty [1] at hackerspace.gr [2], George Kargiotakis suggested if someone wanted to update the network. I decided to undertake the task and you can see some of the visualizations below.

Visualizations:
1. Venn of persons that have signed others and of persons that have been signed by others
2. Greek PGP network for 2012
3. Trust in the 2012 Greek PGP network
4. Highlighting the persons who have signed more people
5. Do people trust more persons than they are trusted by?
6. Geolocation of individuals (globally)
7. Geolocation of individuals (in Greece)
8. Gender percentages
9. Educational and research institutes in the PGP network
10. Animation: Formation through time of associations that were active in 2012
11. Communities and the ten most important positions in the 2012 Greek PGP network according to Eigen value centrality

Additional sections:
12. Outline of methodology
13. Keyserver & keys used
14. Notes on methodology
15. Software and visualization notes
16. Problems encountered and how you can help
17. Future plans
18. Web references
19. Synopsis
20. Communication
21. Thanks
(more…)

How Vodafone Greece degrades my Internet experience

The title may sound a bit pompous, but please read on and you’ll see how certain decisions can cripple, or totally disrupt modern Internet services and communications as these are offered(?) by Vodafone’s mobile Internet solutions.

== The situation ==
I’ve bought a mobile Internet package from Vodafone Greece in order to be able to have 3G access in places where I don’t have access to wifi or ethernet. I am also using a local caching resolver on my laptop (Debian Linux), running unbound software, to both speed up my connections and to have mandatory DNSSEC validation for all my queries. Many of you might ask why do I need DNSSEC validation of all my queries since only very few domains are currently using DNSSEC, well I don’t have a reply that applies to everyone, let’s just say for now that I like to experiment with new things. After all, this is the only way to learn new things, experiment with them. Let’s not forget though that many TLDs are now signed, so there are definitely a few records to play with. Mandatory DNSSEC validation has led me in the past to identify and investigate a couple other problems, mostly having to do with broken DNSSEC records of various domains and more importantly dig deeper into IPv6 and fragmentation issues of various networks. This last topic is so big that it needs a blog post, or even a series of posts, of it’s own. It’s my job after all to find and solve problems, that’s what system or network administrators do (or should do).

== My setup ==
When you connect your 3G dongle with Vodafone Greece, they sent you 2 DNS servers (two out of 213.249.17.10, 213.249.17.11, 213.249.39.29) through ipcp (ppp). In my setup though, I discard them and I just keep “nameserver 127.0.0.1” in my /etc/resolv.conf in order to use my local unbound. In unbound’s configuration I have set up 2 forwarders for my queries, actually when I know I am inside an IPv6 network I use 4 addresses, 2 IPv4 and 2 IPv6 for the same 2 forwarders. These forwarders are hosted where I work (GRNET NOC) and I have also set them up to do mandatory DNSSEC validation themselves.
So my local resolver, which does DNSSEC validation, is contacting 2 other servers who also do DNSSEC validation. My queries carry the DNS protocol flag that asks for DNSSEC validation and I expect them to validate every response possible.

As you can see in the following screenshot, here’s what happens when I want to visit a website. I ask my local caching resolver, and that resolver asks one of it’s forwarders adding the necessary DNSSEC flags in the query.
The response might have the “ad” (authenticated) DNSSEC flag, depending whether the domain I’m visiting is DNSSEC signed or not.

[Screenshot of DNS queries]
dnssec_query

== The problem ==
What I noticed was that using this setup, I couldn’t visit any sites at all when I connected with my 3G dongle on Vodafone’s network. When I changed my /etc/resolv.conf to use Vodafone’s DNS servers directly, everything seemed work as normal, at least for browsing. But then I tried to query for DNSSEC related information on various domains manually using dig, Vodafone’s resolvers never sent me back any DNSSEC related information. Well actually they never sent me back any packet at all when I asked them for DNSSEC data.

Here’s an example of what happens with and without asking for DNSSEC data. The first query is without requesting DNSSEC information and I get a normal reply, but upon asking for the extra DNSSEC data, I get nothing back.
[Screenshot of ripe.net +dnssec query through Vodafone’s servers]
no_dnssec_replies_by_vodafone

== Experimentation ==
Obviously changing my forwarders configuration in unbound to the Vodafone DNS servers did not work because Vodafone’s DNS servers never send me back any DNSSEC information at all. Since my unbound is trying to do DNSSEC validation of everything, obviously including the root (.) zone, I need to get back packets that contain these records. Else everything fails. I could get unbound working with my previous forwarders or with Vodafone’s servers as forwarders, only by disabling the DNSSEC validation, that is commenting out the auto-trust-anchor-file option.

Then I started doing tests on my original forwarders that I had in my configuration (and are managed by me). I could see that my query packets arrived at the server and the server always sent back the proper replies. But whenever the reply contained DNSSEC data, that packet was not forwarded to my computer through Vodafone’s 3G network.

More tests were to follow and obviously my first choice were Google’s public resolvers, 8.8.8.8 and 8.8.4.4. Surprise, surprise! I could get any DNSSEC related information I wanted. The exact same result I got upon testing with OpenDNS resolvers, 208.67.222.222 and 208.67.220.220. From a list of “fairly known” public DNS servers that I found here, only ScrubIT servers seems to be currently blocked by Vodafone Greece. Comodo DNS, Norton DNS, and public Verizon DNS all work flawlessly.

My last step was to try and get DNSSEC data over tcp instead of udp packets. Surprise, surprise again, well not at all any more… I could get back responses containing the DNSSEC information I wanted.

== Conclusion ==
Vodafone Greece for some strange reason (I have a few ideas, starting with…disabling skype) seems to “dislike” large UDP responses, among which are obviously DNS replies carrying DNSSEC information. These responses can sometimes be even bigger than 1500bytes. My guess is that in order to minimize hassle for their telephone support, they have whitelisted a bunch of “known” DNS servers. Obviously the thought of breaking DNSSEC and every DNSSEC signed domain for their customers hasn’t crossed their minds yet. What I don’t understand though is why their own DNS servers are not whitelisted. Since they trust other organizations’ servers to send big udp packets, why don’t they allow DNSSEC from their own servers? Misconfiguration? Ignorance? On purpose?

The same behavior can (sometimes -> further investigation needed here) be seen while trying to use OpenVPN over udp. Over tcp with the same servers, everything works fine. That reminds me I really need to test ocserv soon…

== Solution ==
I won’t even try to contact Vodafone’s support and try to convince their telephone helpdesk to connect me to one of their network/infrastructure engineers. I think that would be completely futile. If any of you readers though, know anyone working at Vodafone Greece in _any_ technical department, please send them a link to this blog post. You will do a huge favor to all Vodafone Greece mobile Internet users and to the Internet itself.

The Internet is not just for HTTP stuff, many of us use it in various other ways. It is unacceptable for any ISP to block, disrupt, interrupt or get in the middle of such communications.
Each one of us users should be able to use DNSSEC without having to send all our queries to Google, OpenDNS or any other information harvesting organization.

== Downloads ==
I’m uploading some pcaps here for anyone who wants to take a look. Use wireshark/tcpdump to read them.

A. tcpdump querying for a non-DNSSEC signed domain over 3G. One query without asking for DNSSEC and two queries asking for DNSSEC, all queries go to DNS server 194.177.210.10. All queries arrived back. The tcpdump was created on 194.177.210.10.
vf_non-dnssec_domain_query.pcap

B. tcpdump querying for a DNSSEC signed domain over 3G. One query without asking for DNSSEC and three queries asking for DNSSEC, all queries go to DNS server 194.177.210.10. The last three queries never arrived back at my computer. The tcpdump was created on 194.177.210.10.
vf_dnssec_domain_query.pcap

C. tcpdump querying for a DNSSEC signed domain over 3G. One query without asking for DNSSEC and another one asking for DNSSEC, all queries go to DNS Server 8.8.8.8. All queries arrived back. The tcpdump was created on my computer using the PPP interface.
vf_ripe_google_dns.pcap

Linux kernel handling of IPv6 temporary addresses – CVE-2013-0343

I reported this bug on November 2012 but as of February 2013 it still hasn’t been fixed.

My initial report on oss-security and kernel netdev mailing lists reported it as an ‘information disclosure’ problem but then I found out that the issue is more severe and it can lead to the complete corruption of Linux kernel’s IPv6 stack until reboot. My second report wasn’t public, I thought it would be better not to make any public disclosure until the kernel people had enough time to respond, and was only sent to a number of kernel developers but I’m making it public now since the CVE is already out.

If someone wants to read all the publicly exchanged emails the best resource is probably this: http://marc.info/?t=135291265200001&r=1&w=2

Here’s the initial description of the problem:

Due to the way the Linux kernel handles the creation of IPv6 temporary addresses a malicious LAN user can remotely disable them altogether which may lead to privacy violations and information disclosure.

By default the Linux kernel uses the ‘ipv6.max_addresses’ option to specify how many IPv6 addresses an interface may have. The ‘ipv6.regen_max_retry’ option specifies how many times the kernel will try to create a new address.

Currently, in net/ipv6/addrconf.c,lines 898-910, there is no distinction between the events of reaching max_addresses for an interface and failing to generate a new address. Upon reaching any of the above conditions the following error is emitted by the kernel times ‘regen_max_retry’ (default value 3):

[183.793393] ipv6_create_tempaddr(): retry temporary address regeneration
[183.793405] ipv6_create_tempaddr(): retry temporary address regeneration
[183.793411] ipv6_create_tempaddr(): retry temporary address regeneration

After ‘regen_max_retry’ is reached the kernel completely disables temporary address generation for that interface.

[183.793413] ipv6_create_tempaddr(): regeneration time exceeded - disabled temporary address support

RFC4941 3.3.7 specifies that disabling temp_addresses MUST happen upon failure to create non-unique addresses which is not the above case. Addresses would have been created if the kernel had a higher
‘ipv6.max_addresses’ limit.

A malicious LAN user can send a limited amount of RA prefixes and thus disable IPv6 temporary address creation for any Linux host. Recent distributions which enable the IPv6 Privacy extensions by default, like Ubuntu 12.04 and 12.10, are vulnerable to such attacks.

Due to the kernel’s default values for valid (604800) and preferred (86400) lifetimes, this scenario may even occur under normal usage when a Router sends both a public and a ULA prefix, which is not an uncommon
scenario for IPv6. 16 addresses are not enough with the current default timers when more than 1 prefix is advertised.

The kernel should at least differentiate between the two cases of reaching max_addresses and being unable to create new addresses, due to DAD conflicts for example.

And here’s the second, more severe report about the corruption of the IPv6 stack:

I had previously informed this list about the issue of the linux kernel losing IPv6 privacy extensions by a local LAN attacker. Recently I’ve found that there’s actually another, more serious in my
opinion, issue that follows the previous one. If the user tries to disconnect/reconnect the network device/connection for whatever reason (e.g. thinking he might gain back privacy extensions), then the device gets IPs from SLAAC that have the “tentative” flag and never loses that. That means that IPv6 functionality for that device is from then on completely lost. I haven’t been able to bring back the kernel to a working IPv6 state without a reboot.

This is definitely a DoS situation and it needs fixing.

Here are the steps to reproduce:


== Step 1. Boot Ubuntu 12.10 (kernel 3.5.0-17-generic) ==
ubuntu@ubuntu:~$ ip a ls dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:8b:99:5d brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.96/24 brd 192.168.1.255 scope global eth0
    inet6 2001:db8:f00:f00:ad1f:9166:93d4:fd6d/64 scope global temporary dynamic 
       valid_lft 86379sec preferred_lft 3579sec
    inet6 2001:db8:f00:f00:5054:ff:fe8b:995d/64 scope global dynamic 
       valid_lft 86379sec preferred_lft 3579sec
    inet6 fdbb:aaaa:bbbb:cccc:ad1f:9166:93d4:fd6d/64 scope global temporary dynamic 
       valid_lft 86379sec preferred_lft 3579sec
    inet6 fdbb:aaaa:bbbb:cccc:5054:ff:fe8b:995d/64 scope global dynamic 
       valid_lft 86379sec preferred_lft 3579sec
    inet6 fe80::5054:ff:fe8b:995d/64 scope link 
       valid_lft forever preferred_lft forever

ubuntu@ubuntu:~$ sysctl -a | grep use_tempaddr
net.ipv6.conf.all.use_tempaddr = 2
net.ipv6.conf.default.use_tempaddr = 2
net.ipv6.conf.eth0.use_tempaddr = 2
net.ipv6.conf.lo.use_tempaddr = 2

ubuntu@ubuntu:~$ nmcli con status
NAME                      UUID                                   DEVICES    DEFAULT  VPN   MASTER-PATH
Wired connection 1        923e6729-74a7-4389-9dbd-43ed7db3d1b8   eth0       yes      no    --
ubuntu@ubuntu:~$ nmcli dev status
DEVICE     TYPE              STATE
eth0       802-3-ethernet    connected

//ping6 2a00:1450:4002:800::100e  while in another terminal: tcpdump -ni eth0 ip6

ubuntu@ubuntu:~$ ping6 2a00:1450:4002:800::100e -c1
PING 2a00:1450:4002:800::100e(2a00:1450:4002:800::100e) 56 data bytes
64 bytes from 2a00:1450:4002:800::100e: icmp_seq=1 ttl=53 time=70.9 ms

--- 2a00:1450:4002:800::100e ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 70.994/70.994/70.994/0.000 ms

# tcpdump -ni eth0 host 2a00:1450:4002:800::100e
17:57:37.784658 IP6 2001:db8:f00:f00:ad1f:9166:93d4:fd6d > 2a00:1450:4002:800::100e: ICMP6, echo request, seq 1, length 64
17:57:37.855257 IP6 2a00:1450:4002:800::100e > 2001:db8:f00:f00:ad1f:9166:93d4:fd6d: ICMP6, echo reply, seq 1, length 64

== Step 2. flood RAs on the LAN ==

$ dmesg | tail
[ 1093.642053] IPv6: ipv6_create_tempaddr: retry temporary address regeneration
[ 1093.642062] IPv6: ipv6_create_tempaddr: retry temporary address regeneration
[ 1093.642065] IPv6: ipv6_create_tempaddr: retry temporary address regeneration
[ 1093.642067] IPv6: ipv6_create_tempaddr: regeneration time exceeded - disabled temporary address support

ubuntu@ubuntu:~$ sysctl -a | grep use_tempaddr
net.ipv6.conf.all.use_tempaddr = 2
net.ipv6.conf.default.use_tempaddr = 2
net.ipv6.conf.eth0.use_tempaddr = -1
net.ipv6.conf.lo.use_tempaddr = 2

//ping6 2a00:1450:4002:800::100e  while in another terminal: tcpdump -ni eth0 ip6

ubuntu@ubuntu:~$ ping6 2a00:1450:4002:800::100e -c1
PING 2a00:1450:4002:800::100e(2a00:1450:4002:800::100e) 56 data bytes
64 bytes from 2a00:1450:4002:800::100e: icmp_seq=1 ttl=53 time=77.5 ms

--- 2a00:1450:4002:800::100e ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 77.568/77.568/77.568/0.000 ms

# tcpdump -ni eth0 host 2a00:1450:4002:800::100e
17:59:38.204173 IP6 2001:db8:f00:f00:5054:ff:fe8b:995d > 2a00:1450:4002:800::100e: ICMP6, echo request, seq 1, length 64
17:59:38.281437 IP6 2a00:1450:4002:800::100e > 2001:db8:f00:f00:5054:ff:fe8b:995d: ICMP6, echo reply, seq 1, length 64

//notice the change of IPv6 address to the one not using privacy extensions even after the flooding has finished long ago.

== Step 3. Disconnect/Reconnect connection  ==
// restoring net.ipv6.conf.eth0.use_tempaddr to value '2' makes no difference at all for the rest of the process

# nmcli dev disconnect iface eth0
# nmcli con up uuid 923e6729-74a7-4389-9dbd-43ed7db3d1b8

ubuntu@ubuntu:~$ ip a ls dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:8b:99:5d brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.96/24 brd 192.168.1.255 scope global eth0
    inet6 2001:db8:f00:f00:5054:ff:fe8b:995d/64 scope global tentative dynamic 
       valid_lft 86400sec preferred_lft 3600sec
    inet6 fdbb:aaaa:bbbb:cccc:5054:ff:fe8b:995d/64 scope global tentative dynamic 
       valid_lft 86400sec preferred_lft 3600sec
    inet6 fe80::5054:ff:fe8b:995d/64 scope link tentative 
       valid_lft forever preferred_lft forever

//Notice the "tentative" flag of the IPs on the device

//ping6 2a00:1450:4002:800::100e  while in another terminal: tcpdump -ni eth0 ip6

ubuntu@ubuntu:~$ ping6 2a00:1450:4002:800::100e -c1
PING 2a00:1450:4002:800::100e(2a00:1450:4002:800::100e) 56 data bytes
^C
--- 2a00:1450:4002:800::100e ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

# tcpdump -ni eth0 host 2a00:1450:4002:800::100e
18:01:45.264194 IP6 ::1 > 2a00:1450:4002:800::100e: ICMP6, echo request, seq 1, length 64

Summary:
Before flooding it uses IP: 2001:db8:f00:f00:ad1f:9166:93d4:fd6d
After flooding it uses IP: 2001:db8:f00:f00:5054:ff:fe8b:995d –> it has lost privacy extensions
After disconnect/reconnect it tries to use IP: ::1 –> it has lost IPv6 connectivity

The problem currently affects all Linux kernels (including the latest 3.8), that have IPv6 Privacy Extensions enabled. The only distribution that has IPv6 Privacy Extensions enabled by default is Ubuntu starting from version 12.04. So Ubuntu 12.04 and 12.10 are currently vulnerable to this attack and can have their IPv6 stack corrupted/disabled by a remote attacker in an untrusted network.

Kernel developers and people from RedHat Security Team are trying to fix the issue which in my opinion involves changing parts of the logic of IPv6 addressing algorithms in the Linux kernel.

No mitigation currently exists apart from disabling IPv6 Privacy Extensions.

You can play with this bug using flood_router26 tool from THC-IPv6 toolkit v2.1.
Usage: # ./flood_router26 -A iface

P.S. I can’t tell if the stack corruption could also lead to other kernel problems, that would probably need some professional security researchers to look into it.

mitigating wordpress xmlrpc attack using ossec

I’ve created some local rules for ossec to mitigate some of the effects of the wordpress xmlrpc attack presented here: WordPress Pingback Portscanner – Metasploit Module.

They seem to work for me, use at your own risk of getting flooded with tons of alerts. Obviously you need to set the alert level of the second rule to something that will trigger your active-response. Feel free to tweak the alert level, frequency and timeframe. Before you change frequency to something else please read the following thread though: setting ossec frequency level


   <rule id="100167" level="1">
    <if_sid>31108</if_sid>
    <url>xmlrpc.php</url>
    <match>POST</match>
    <description>WordPress xmlrpc attempt.</description>
  </rule>

  <rule id="100168" level="10" frequency="2" timeframe="600">
    <if_matched_sid>100167</if_matched_sid>
    <same_source_ip />
    <description>WordPress xmlrpc attack.</description>
    <group>attack,</group>
   </rule>

btw, if you don’t want xmlrpc.php accessible at all, you can block it through a simple mod_rewrite rule:


<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/PATH/TO/xmlrpc.php [NC]
RewriteRule ^(.*)$ - [F,L]

or you can try any of the other numerous ways to accomplish the same thing as presented here.

You’ll lose pingback functionality though…oh well.

World city map of Tor nodes

Some months ago I started playing with the idea of creating a world map that would have every Tor node on it. Obviously I wan’t the first one…I soon discovered Moritz Bartl’s post on the same topic. Luckilly he had his code posted on Github so I could fork it and add features that I wanted. The original python script parsed the consensus and the misrodescriptors, put Tor nodes into some classes and created a KML file with some description on each node.

Some differences
I changed some parts of the python script to better suit my needs.
a. Create a separate kml files for each Tor node class.
b. Add new classes: Bad, Authority and Named.
c. Pay more attention on requesting every external URL over HTTPS.
d. Generate HTML code that displays those KMLs on a Google Maps overlay.
e. Add some small randomization to each nodes’s coordinates so that nodes in the same city don’t overlap.

You can find a complete changelog at kargig/tormap GitHub repo.

And here’s the outcome: World city map of Tor nodes at https://tormap.void.gr/
One of my main goals was to have selectable classes of nodes that will appear on the map.

To produce the map overlay, a cron script runs every hour, which is also the period it takes for Tor Authority nodes to produce a new consensus, and creates some static files which are then served by nginx.

I’m not a web developer/designer and I don’t really know any javascript. So please, feel free to fork my code and make it look better, run faster and add your own features. I’ll happily accept patches/pull requests!

Extras
On kargig/tormap repo you will also find a handy script, ‘runme.sh’, that downloads all necessary files that need to be parsed by the python script.

Missplaced nodes on the map
Well, blame MaxMind’s GeoIP City database for that. But I think it’s kinda funny to see Tor nodes in Siberia and in the middle of the sea though (look at the West coast of Africa), heh. For those wondering, these nodes are gathered there because their geoip Lat,Long is set to 0,0.
Really though, what’s “Ben’s Cat Shaque” diplayed there next to all those nodes in the west coast of Africa? Anyone has some clue ?

Conspriracy people
I’m sure that people who love conspriracy theories will start posting about those ‘Bad’ Tor nodes in Iran and Syria. Why do you think these are there ? What does it mean ? Let the flames begin!

Future TODO
a. OpenStreetMap
I have started working on an OpenStreetMap implementation of the above using OpenLayers. The biggest hurdle is that OSM does not provide a server that serves map tiles over HTTPS. Makes me wonder…is that actually so difficult ?
b. More stats
I would like to add small graphs on how the number of nodes in each class evolves.

Other Tor mapping efforts
https://b.kentbackman.com/2010/10/04/view-tor-exit-nodes-in-google-earth/
http://freehaven.net/~ioerror/maps/v3-tormap.html

Don’t forget, you can always help Tor by running a node/bridge or sending some money to Tor or EFF!