Continuously building our container image chain

Container images can be layered on top of each other, so you do not always need to rebuild every layer from scratch.
This helps you to a) standardize your container images and b) not repeat yourself all over the place.
As an example, we have a couple of Ruby applications, and they all require ruby, bundler and some common dependencies, since every Ruby app eventually bundles nokogiri and thus requires a bunch of libxml and libxslt libraries. Simplified, this means we have the following chain:

base -> ruby -> app

At immerda we build our own base images from scratch so we can fully control what is in them. E.g. we add our repository, our CA certificate and so on. Then we have a common ruby image, that can be used to develop and test ruby applications. And in the end we might package the application as an image, like our WKD service.

We update our systems on a regular and unattended basis to keep them up to date with the latest security releases and bug fixes. When it comes to container images, we also pull the images of running containers on a regular basis and restart the containers / pods whenever there was an update to the image tag the container refers to. This way we not only keep the applications packaged in containers up to date, but also make sure that we get security fixes for lower level libraries, like e.g. openssl (remember heartbleed?), in a timely manner.

Since container images are intended to be immutable (and we actually run nearly all of our containers with --read-only=true), you want to and must rebuild them to keep them updated.

Now when it comes to pulling and using images from random projects or registries (looking at you, docker-hub), we try to vet, at the time we start using them, that they are also continuously built. The official library images, like the official postgres image, are a good example of that.
For our own images, we are doing that through gitlab-ci.

However, since images are layered on each other, you need to rebuild your images in the right order, so that your ruby image builds on top of your latest base image and your app image uses the latest ruby updates.

Gitlab-ci has limited capabilities (especially in the free core version) when it comes to modelling pipelines across projects in the right order. However, with building blocks such as webhook events for pipeline runs, we can orchestrate that pretty well.

This orchestration is done using a project from our friends at Autistici/Inventati called gitlab-deps. This tool keeps track of the dependency chain of your images and triggers the pipelines to build these images in the right order.

Central to gitlab-deps is a small webhook receiver that receives events from our gitlab-ci pipelines and, based on the dependency list of your images, triggers the next pipelines. We run the webhook receiver in a container, and it scans the immerda group in gitlab for projects, deriving dependencies either from a Containerfile/Dockerfile FROM statement or from a list of dependencies in a file called .gitlab-deps. The latter is much more accurate and likely the better way to describe the dependencies of your project. As an example see the gitlab-deps container project itself.
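As a purely illustrative sketch of such a dependency list (check the gitlab-deps documentation for the exact format; the project path here is hypothetical), a Ruby application image might declare its upstream image like this:

# .gitlab-deps (hypothetical example)
immerda/container-images/ruby

With that in place, a successful pipeline of the ruby image would trigger a rebuild of the app image.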

gitlab-deps itself requires an API token for gitlab that is able to trigger pipeline runs, as well as to add webhooks to the projects. For now this seems to require a token from a user with the Maintainer role.

We then also run a systemd timer that executes a script in the container to update all the dependencies as well as to add webhooks to all the projects we know of.
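A hedged sketch of what such a timer/service pair could look like (unit names, container name and script name are placeholders, not our actual setup):

# /etc/systemd/system/gitlab-deps-update.service
[Unit]
Description=Refresh gitlab-deps dependencies and webhooks

[Service]
Type=oneshot
ExecStart=/usr/bin/podman exec gitlab-deps update-deps

# /etc/systemd/system/gitlab-deps-update.timer
[Unit]
Description=Daily gitlab-deps refresh

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target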

For our base images, we defined a scheduled pipeline run to kick off building the lowest layer of our images, which – if successful – will trigger the next layer, which triggers the one after it, and so on.

If one of the pipelines fails, gitlab-deps won’t trigger anything further down the chain. Gitlab will notify you (e.g. via email) about the failed pipeline run, you can investigate, and once you manage to re-trigger a successful pipeline run, the chain continues to be built.

While we are using this now for our image build chain, gitlab-deps can also be used to rebuild/update projects that depend on a library and so on. As we move more of our work into gitlab and hook it into CI, we might also start using these dependency triggers for other kinds of pipeline chains.

lvm cache create + resize

LVM allows you to have a caching layer, where your actual LV resides on spinning (slow) disks and a secondary LV on faster media caches some of your most frequent reads and the writes. From an end-user perspective the details are transparent: you see one single blockdevice. For a good overview and introduction see the following blog post: Using LVM cache for storage tiering

In our case mainly the writeback cache is interesting, and we add a raid 1 SSD/NVMe PV to the VG that holds the spinning disks (usually raid10).

Create

Once some fast PV has been added to the VG, we can start caching individual LVs within that VG:

lvcreate --type cache --cachemode writeback --name slowlv_cache --size 100G storage/slowlv /dev/md3

Where:

  • slowlv_cache: the name of the caching LV
  • 100G: the cache size
  • storage/slowlv: the VG & slow LV to cache
  • /dev/md3: the SSD/NVMe PV in the VG storage

Resize

You cannot directly resize a cached LV; rather you need to uncache, resize and then add the caching LV again. When uncaching, the data that has not yet been written through gets synced down to the slow disks. This might take a while depending on your cache size and the speed of the slow disks.
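Before uncaching you can get a rough idea of how much dirty data still needs to be flushed (a hedged sketch; the available reporting fields depend on your LVM version):

lvs -a -o name,cache_total_blocks,cache_dirty_blocks storage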


lvconvert --uncache /dev/storage/slowlv
lvresize -L +200G /dev/storage/slowlv
lvcreate --type cache --cachemode writeback --name slowlv_cache --size 100G storage/slowlv /dev/md3

The last command is exactly the same one as we initially used to create the cache.

How to update the PostgreSQL version on your puppetserver

Our puppetserver uses puppetdb, which uses PostgreSQL as the persistent datastore in the back.
So far everything is self-contained on the same VM and PostgreSQL is more or less managed by the puppetdb module.
The puppetdb module takes care of setting up the PostgreSQL server and uses the upstream PostgreSQL yum repository for the binaries. By default it uses PostgreSQL in version 9.6.

Lately, it was announced that puppetdb will start requiring PostgreSQL in at least version 11. Time to upgrade our PostgreSQL installation to be ready.

Since the upstream yum repository allows installing multiple versions in parallel, the migration can be done with little downtime and also supports easy roll-back.

In general we followed the following guide: PostgreSQL upgrade on CentOS.

These are the steps we took:

yum install -y \
https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

yum install -y postgresql13-server postgresql13-contrib
su - postgres -c "/usr/pgsql-13/bin/postgresql-13-setup initdb"
su - postgres -c "/usr/pgsql-13/bin/pg_upgrade \
--old-bindir=/usr/pgsql-9.6/bin/ \
--new-bindir=/usr/pgsql-13/bin/ \
--old-datadir=/var/lib/pgsql/9.6/data/ \
--new-datadir=/var/lib/pgsql/13/data/ \
--check"

# all looks good; keep a copy of the new default pg_hba.conf
cp /var/lib/pgsql/13/data/pg_hba.conf /tmp/
# carry over the existing pg_hba.conf from the old cluster
cp /var/lib/pgsql/9.6/data/pg_hba.conf /var/lib/pgsql/13/data/pg_hba.conf
systemctl stop puppetserver
systemctl stop puppetdb
systemctl stop postgresql-9.6
su - postgres -c "/usr/pgsql-13/bin/pg_upgrade \
--old-bindir=/usr/pgsql-9.6/bin/ \
--new-bindir=/usr/pgsql-13/bin/ \
--old-datadir=/var/lib/pgsql/9.6/data/ \
--new-datadir=/var/lib/pgsql/13/data/"

systemctl start postgresql-13
systemctl start puppetdb
# validate
systemctl status postgresql-13
systemctl status puppetdb
# start
systemctl start puppetserver
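pg_upgrade usually also suggests regenerating the optimizer statistics on the new cluster; a hedged sketch of that step (follow whatever hint your pg_upgrade run actually printed):

su - postgres -c "/usr/pgsql-13/bin/vacuumdb --all --analyze-in-stages"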

Now let’s commit the switch to our ibox modules that drive the puppetdb and postgresql configuration.

Done!

Now we can also remove the old installation:

yum remove -y postgresql96*
rm -rf '/var/lib/pgsql/9.6/data'

Intercept traffic sent over a socket

What is the easiest way to intercept traffic sent over a UNIX Socket?

In general a socket is just a file, so you can use strace on any program and capture what it writes there. BUT the data you capture that way won’t be easily processable by something like wireshark.
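A hedged example of such a capture (the PID is whatever process is talking to the socket):

strace -f -s 4096 -e trace=read,write,sendto,recvfrom,sendmsg,recvmsg -p <PID>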

The other way is to set up a socat chain which proxies the data through a local tcp port, and then you capture the data there. You do that by pointing either the clients or the server to another socket to write to or read from.

Let’s assume that clients usually connect to /tmp/socket and that the clients are the side we can more easily re-point.

(terminal 1)$ socat -t100 -d -d -x -v UNIX-LISTEN:/tmp/socket.proxy,mode=777,reuseaddr,fork \
TCP-Connect:127.0.0.1:9000
(terminal 2)$ socat -t100 -d -d -x -v TCP-LISTEN:9000,fork,reuseaddr  UNIX-CONNECT:/tmp/socket
(terminal 3)$ tcpdump -w /tmp/data.pcap -i lo -nn port 9000

Now reconnect the client to /tmp/socket.proxy and tcpdump will record the traffic flowing over the socket as normal tcp packets.

Puppet (and everything) over Tor

Mostly, when we talk about Tor, we just talk about websites. But what about other traffic and tools? What about Puppet or Icinga? If you have a Puppet server and you would like to hide where it is located and/or which nodes are connected to it, perhaps you would like to serve those services over an onion service.

This article is based on some research into how a Puppet server and an Icinga parent can be hidden. Only the corresponding traffic should be routed through Tor; the rest of the traffic shouldn’t go through Tor.

Normally Tor provides a local socks proxy, and each application with support for socks proxies can be configured to run over Tor. If your application supports socks proxies, use the socks proxy. If your application only supports http proxies, there are tools on the internet, like polipo, which can translate from http to socks proxy. Puppet can use http proxies and that’s fine. But first of all, the website of polipo states that polipo isn’t maintained anymore, and second, not every application supports proxies.

The goal is to implement a transparent proxy for every kind of (tcp) traffic. This example is built with CentOS 7 and/or 8, but it should be possible with any kind of Unix. For the firewall part – it’s essential to configure the firewall correctly – this example uses nftables. The same is also possible with iptables.

Server side

The server side is easy, it’s like every hidden service. Tor runs on the server and creates an onion service which forwards the traffic to a port.
Configure Tor (/etc/tor/torrc):

# Puppet hidden service
HiddenServiceDir /var/lib/tor/puppet
HiddenServicePort 8140

# Icinga hidden service
HiddenServiceDir /var/lib/tor/icinga2
HiddenServicePort 5665

The onion address can be found in the corresponding hostname file, e.g. /var/lib/tor/puppet/hostname. Puppet and Icinga will have different hostnames. As this article isn’t about how to create onion services, this setup is enough. If you would like to dig deeper into onion services, check out the Tor documentation.
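For example (using the address that also shows up in the tests further down):

$ cat /var/lib/tor/puppet/hostname
kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion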

Agent side

The interesting part is the agent side. For a transparent proxy we need multiple parts to work together. First of all, the client will establish a connection to an fqdn, so we need a dns response for that fqdn. As the fqdn is an onion address, we need a dns server which is aware of onion addresses and will return an ip address for it. Second of all, the connection will then be established directly to an ip address and port; the firewall needs to capture that and redirect it to the Tor service.

DNS

The Tor service can provide a dns server (DNSPort) and answer queries. As the Tor service is aware of onion addresses, it can return an ip address for an onion address (AutomapHostsOnResolve).

DNSPort 53
AutomapHostsOnResolve 1

After a restart of the Tor service, it will listen on 127.0.0.1:53 (udp only).
If not specified otherwise, the ip ranges used for onion addresses are 127.192.0.0/10 and [FE80::]/10. That’s fine if everything is on one node; if not, it’s possible to specify the address ranges with VirtualAddrNetworkIPv4 and VirtualAddrNetworkIPv6.
/etc/resolv.conf must be adapted to this, and the only nameserver should be 127.0.0.1, as it’s the only one which can also serve the onion top level domain.
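The resulting /etc/resolv.conf is then as minimal as it gets:

$ cat /etc/resolv.conf
nameserver 127.0.0.1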

Transparent proxy

The Tor service can create a port which serves as a transparent proxy (TransPort).

TransPort 9051

After a restart of the service it will listen on 127.0.0.1:9051 tcp. To redirect all our traffic to the Tor service, some firewall rules are required.

$ cat /etc/sysconfig/nftables.conf
table ip nat {
    chain output {
        type nat hook output priority -100; policy accept;
        meta l4proto tcp ip daddr 127.192.0.0/10 redirect to :9051
    }
}

This will redirect all traffic which has a destination within the VirtualAddrNetworkIPv4 network to the local Tor transparent proxy.
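As mentioned at the beginning, the same is possible with iptables; a hedged equivalent of the rule above would be:

iptables -t nat -A OUTPUT -p tcp -d 127.192.0.0/10 -j REDIRECT --to-ports 9051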

Testing

Let’s do some tests first:

$ dig +short @127.0.0.1 kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion
127.255.217.241
$ dig +short @127.0.0.1 AAAA kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion
fe96:9715:fe63:392b:bea0:67db:78f8:2434
$ dig +short @127.0.0.1 www.immerda.ch
199.58.80.177

We see that the dns server responds with an A and AAAA record inside the networks specified with VirtualAddrNetworkIPv?. Every other dns query will be answered too. That’s great, but what about our traffic which should be redirected to the transparent proxy?

$ nmap -sT -p8140 kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion
Starting Nmap 7.70 ( https://nmap.org ) at 2020-03-22 20:05 UTC
Nmap scan report for kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion (127.230.205.159)
Host is up (0.0034s latency).
Other addresses for kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion (not scanned): fea3:1af6:48c2:a7f5:f5ef:bd1d:accc:f623

PORT     STATE SERVICE
8140/tcp open  puppet

Applications

Now that Tor provides a transparent proxy, applications are really easy to set up. Let’s have a look at the two examples Puppet and Icinga.

Puppet

The Puppet server needs a certificate which also includes the onion address as an x509v3 dns alternative name. Have a look at the dns_alt_names configuration in the Puppet documentation.
For an already existing Puppet server, the host certificate has to be removed and regenerated. There is no need to replace the Puppet CA.
On the agent side, we have to specify the onion address as the server:

$ cat /etc/puppetlabs/puppet/puppet.conf
[main]
    certname = agent.example.com
    server = kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion
...

Puppet will now use a connection over Tor. On the server side, the certificate of the agent will still include the certname (agent.example.com).
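For reference, a hedged sketch of the server-side dns_alt_names setting mentioned above (the section name and the certificate regeneration steps depend on your Puppet version):

$ cat /etc/puppetlabs/puppet/puppet.conf
[master]
    dns_alt_names = puppet.example.com,kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion
...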

Icinga

This article will just explain some parts of the Icinga2 configuration. The whole Icinga configuration would be too much and is out of scope.
On the monitoring parent node, create the zones and endpoints:

$ cat /etc/icinga2/zones.conf
object Endpoint "monitoring.example.com" {
}
object Zone "monitoring.example.com" {
    endpoints = [ "monitoring.example.com" ]
}

object Endpoint "agent.example.com" {
}
object Zone "agent.example.com" {
    endpoints = [ "agent.example.com" ]
    parent = "monitoring.example.com"
}

On the agent node, create the same zones and endpoints:

$ cat /etc/icinga2/zones.conf
object Endpoint "monitoring.example.com" {
    host = "kqcbbtuyhuaagan3pbm22knapempzb5ia2wnmdqnhg7rrfwx775cqmad.onion"
    port = 5665
}
object Zone "monitoring.example.com" {
    endpoints = [ "monitoring.example.com" ]
}


object Endpoint "agent.example.com" {
}
object Zone "agent.example.com" {
    endpoints = [ "agent.example.com" ]
    parent = "monitoring.example.com"
}

All done. Icinga will now communicate over Tor.

Conclusion

It’s relatively simple to create a transparent Tor proxy, but it’s not really well documented on the internet. There were blog articles which described parts of it, and every configuration option is well documented in the man page. A negative point is that the dns server of the Tor service will not serve every dns record type. Perhaps it would be better to create an onion router which serves a whole network as a transparent proxy, and to configure your main dns server to stub-resolve onion addresses via this onion router.

GitLab CI with podman

We have known GitLab CI with docker runners for quite a while now, but what about GitLab CI with podman? Podman is the next generation container tool under Linux; it can start containers in user space, no root privileges required. With RHEL 8 there is no docker runtime available at the moment, but Red Hat supports podman. But how can we integrate that with GitLab CI? The GitLab runner has native support (called executors) for docker, shell, …, but there is no native support for podman. There are two possibilities: using the shell executor or using the custom executor. With the shell executor, you have to ensure that every project starts podman, and only podman. So let’s try the custom executor.

GitLab CI runner with custom executor

Let’s start building a GitLab CI custom executor with podman on a RHEL/CentOS 7 or 8 with a really basic container. First, install the gitlab-runner Go binary and create a user with a home directory under which gitlab-runner should run later.
For this example we assume there is a unix user called gitlab-runner with the home directory /home/gitlab-runner. This user is able to run podman. Let’s try that:

sudo -u gitlab-runner podman run -it --rm \
    registry.code.immerda.ch/immerda/container-images/base/fedora:30 \
    bash

Next, let’s make a systemd service for the GitLab runner (/etc/systemd/system/gitlab-runner.service):

[Unit]
Description=GitLab Runner
After=syslog.target network.target
ConditionFileIsExecutable=/usr/local/bin/gitlab-runner

[Service]
User=gitlab-runner
Group=gitlab-runner
StartLimitInterval=5
StartLimitBurst=10
ExecStart=/usr/local/bin/gitlab-runner run --working-directory /home/gitlab-runner
Restart=always
RestartSec=120

[Install]
WantedBy=multi-user.target

Now, let’s register a runner to a GitLab instance.

sudo -u gitlab-runner gitlab-runner register \
    --url https://code.immerda.ch/ \
    --registration-token $GITLAB_REGISTRATION_TOKEN \
    --name "Podman fedora runner" \
    --executor custom \
    --builds-dir /home/user \
    --cache-dir /home/user/cache \
    --custom-prepare-exec "/home/gitlab-runner/fedora/prepare.sh" \
    --custom-run-exec "/home/gitlab-runner/fedora/run.sh" \
    --custom-cleanup-exec "/home/gitlab-runner/fedora/cleanup.sh"

  • --builds-dir: The build directory within the container.
  • --cache-dir: The cache directory within the container.
  • --custom-prepare-exec: Prepare the container before each job.
  • --custom-run-exec: Pass the .gitlab-ci.yml script items to the container.
  • --custom-cleanup-exec: Cleanup all left-overs after each job.

There are three scripts referenced at this point. Those scripts will be executed for each job (a CI/CD pipeline can contain multiple jobs, e.g. build, test, deploy). The whole magic will happen within those scripts. The output of those scripts is always shown in the GitLab job, so for debugging purposes it’s possible to do a set -x.

Scripts

Every job will start all the referenced scripts. First let’s have a look at some variables we need in all scripts. Let’s create a file /home/gitlab-runner/fedora/base.sh:

CONTAINER_ID="runner-$CUSTOM_ENV_CI_RUNNER_ID-project-$CUSTOM_ENV_CI_PROJECT_ID-concurrent-$CUSTOM_ENV_CI_CONCURRENT_PROJECT_ID-$CUSTOM_ENV_CI_JOB_ID"
IMAGE="registry.code.immerda.ch/immerda/container-images/base/fedora:30"
CACHE_DIR="$(dirname "${BASH_SOURCE[0]}")/../_cache/runner-$CUSTOM_ENV_CI_RUNNER_ID-project-$CUSTOM_ENV_CI_PROJECT_ID-concurrent-$CUSTOM_ENV_CI_CONCURRENT_PROJECT_ID-pipeline-$CUSTOM_ENV_CI_PIPELINE_ID"
  • CONTAINER_ID: Name of the container.
  • IMAGE: Image to use for the container.
  • CACHE_DIR: The cache directory on the host system.

Prepare script

The prepare executable (/home/gitlab-runner/fedora/prepare.sh) will

  • pull the image from the registry
  • start a container
  • install the dependencies (curl, git, gitlab-runner)
#!/usr/bin/env bash

currentDir="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
source ${currentDir}/base.sh

set -eo pipefail

# trap any error, and mark it as a system failure.
trap "exit $SYSTEM_FAILURE_EXIT_CODE" ERR

start_container() {
    if podman inspect "$CONTAINER_ID" >/dev/null 2>&1; then
        echo 'Found old container, deleting'
        podman kill "$CONTAINER_ID"
        podman rm "$CONTAINER_ID"
    fi

    # Container image is hardcoded at the moment, since the custom executor
    # does not provide the value of `image`. See
    # https://gitlab.com/gitlab-org/gitlab-runner/issues/4357 for
    # details.
    mkdir -p "$CACHE_DIR"
    podman pull "$IMAGE"
    podman run \
        --detach \
        --interactive \
        --tty \
        --name "$CONTAINER_ID" \
        --volume "$CACHE_DIR":"/home/user/cache" \
        "$IMAGE"
}

install_dependencies() {
    podman exec -u 0 "$CONTAINER_ID" sh -c "dnf install -y git curl"

    # Install gitlab-runner binary since we need for cache/artifacts.
    podman exec -u 0 "$CONTAINER_ID" sh -c "curl -L --output /usr/local/bin/gitlab-runner https://gitlab-runner-downloads.s3.amazonaws.com/latest/binaries/gitlab-runner-linux-amd64"
    podman exec -u 0 "$CONTAINER_ID" sh -c "chmod +x /usr/local/bin/gitlab-runner"
}

echo "Running in $CONTAINER_ID"

start_container
install_dependencies

Run script

The run executable (/home/gitlab-runner/fedora/run.sh) will run the commands defined in the .gitlab-ci.yml within the container:

#!/usr/bin/env bash

currentDir="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
source ${currentDir}/base.sh

podman exec "$CONTAINER_ID" /bin/bash < "$1"
if [ $? -ne 0 ]; then
    # Exit using the variable, to mark the build as failed in GitLab
    # CI.
    exit $BUILD_FAILURE_EXIT_CODE
fi

Cleanup script

And finally the cleanup executable (/home/gitlab-runner/fedora/cleanup.sh) will cleanup after every job.

#!/usr/bin/env bash

currentDir="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
source ${currentDir}/base.sh

echo "Deleting container $CONTAINER_ID"

podman kill "$CONTAINER_ID"
podman rm "$CONTAINER_ID"
exit 0

The script above doesn’t clean up the cache. The reason is that we might need the cache during the next job or during the next pipeline. So an additional cleanup on the host system is needed to purge the cache after a while.
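A hedged example of such a host-side cleanup, run e.g. from cron or a systemd timer (the path matches the CACHE_DIR base from base.sh; the seven-day retention is arbitrary):

find /home/gitlab-runner/_cache -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +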

Last but not least

This is just a quick howto. If you want to implement this, there is a lot of room for improvement. It should just explain how the custom executor can be used and how to use podman for the GitLab CI runner. At the moment there is no support for the image keyword (see #4357).

Ehlo Onion

Email transport security in 2016, it’s still a thing! The last mile is fortified — no reasonable provider accepts plaintext smtp, pop, or imap from a client. But what about transport? It’s still opportunistic, downgradeable, interceptable, and correlateable. It’s time to put some more band-aids on this wound!

SMTP delivery to Tor Hidden Services

Moving forward our MX is reachable at ysp4gfuhnmj6b4mb.onion:25. We are happy to accept mail for all our domains (ie. all domains where mail.immerda.ch is the MX) there! Do you want to know how to do that? Or even how to make your system reachable through tor? There is a tutorial for Exim at the bottom of this post and there is another one for Postfix. But wait, there is more!

Ehlo

So how about everyone on the internet does this? How about using the following format to publish Onion Service MX records in dns:

_onion-mx._tcp.immerda.ch. 3600 IN SRV 0 5 25 ysp4gfuhnmj6b4mb.onion.

Fair enough, dns can be spoofed. But we still get all the other benefits when it works: we make correlation much harder and reduce the meta-data leakage. And we exclude the attacker who can only tamper with the SMTP session but not with the DNS query.

At the moment this is just a proposal but we are eager to collaborate on this if you reach out to us!
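Once such a record is published, a sending server could discover the onion MX with a simple lookup (hedged example, matching the record shown above):

$ dig +short SRV _onion-mx._tcp.immerda.ch
0 5 25 ysp4gfuhnmj6b4mb.onion.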

Howto? (Exim)

Now let’s see how we can configure Exim to send mail to a tor Onion Service for a manually curated set of domains.

There is previous work, but since the exim 4.87 release an easier approach is possible. Here is the high level overview of what we need:

  1. A static mapping between email domain and MX onion address
  2. A router to prepare the delivery using Tor’s AutomapHostsOnResolve feature: the router performs a programmatic DNS lookup against the tor daemon. The returned IP is mapped to the correct onion address by tor.
  3. A transport sending emails via the tor socks proxy using the above IP as destination.

Here’s how we adjusted our exim setup for outgoing mail (and you should be able to do it in a similar way):

  • First create the mapping of recipient domains to onion addresses:

/etc/exim/onionrelay.txt

immerda.ch ysp4gfuhnmj6b4mb.onion
lists.immerda.ch ysp4gfuhnmj6b4mb.onion
  • Then convert it to cdb for faster lookups:

    cdb -m -c -t /tmp/onionrelay.tmp /etc/exim/onionrelay.cdb /etc/exim/onionrelay.txt

  • Install and configure Tor for Onion Service DNS mapping and have the local daemon running:

/etc/torrc

AutomapHostsOnResolve 1
DNSPort 5300
DNSListenAddress 127.0.0.1
...
  • Configure Exim:

/etc/exim/conf.d/perl

perl_startup = do '/etc/exim/perl-routines.pl'
perl_at_start

/etc/exim/perl-routines.pl

use Net::DNS::Resolver;
sub onionLookup {
  my $hostname = shift;
  my $res = Net::DNS::Resolver->new(nameservers => [qw(127.0.0.1)],);
  $res->port(5300);
  my $query = $res->search($hostname);
  foreach my $rr ($query->answer) {
    next unless $rr->type eq "A";
    return $rr->address;
  }
  return 'no_such_host';
}

/etc/exim/conf.d/domainlists

ONION_RELAYDB=/etc/exim/onionrelay.cdb
domainlist onion_relays     = cdb;ONION_RELAYDB
...

/etc/exim/conf.d/router

# send things over tor where we have an entry for it
onionrelays:
  driver    = manualroute
  domains   = +onion_relays
  transport = onion_relay
  # get the automap IP for the onion address from the tor daemon
  route_data = ${perl{onionLookup}{${lookup{$domain}cdb{ONION_RELAYDB}}}}
  no_more
...

/etc/exim/conf.d/transports

onion_relay:
  driver = smtp
  socks_proxy = 127.0.0.1 port=9050
...

Running OnionService MX

To receive mail all you need to do is set up a Tor Onion Service (there are plenty of tutorials out there) which listens on port 25 and publish the address to the world.

We strongly advise to run this hidden service on a separate VM and internally forward to your MX to avoid running an open relay.

You could also configure the Onion Service directly on the MX but then you need to be extra careful since connections will appear to come from 127.0.0.1. Most mail servers treat localhost in a privileged way and you want to avoid that. Possible workarounds are to either locally map to a different port or bind tor daemon to another ip (eg. 127.0.0.2).
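A hedged torrc sketch for such a separate forwarding VM (the internal MX address is a placeholder):

HiddenServiceDir /var/lib/tor/mx
HiddenServicePort 25 192.0.2.25:25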

How we configure services: ibox types

As previously mentioned we are using the ibox project as a way to refactor, modernize and share our automation setup with other interested folks. As we look back at around 10 years of automating our services using puppet, there might be one or the other place where it’s time to do such a refactoring. So this whole project is a slow but steady process to make our plans happen: that we – internally, but also others – are able to replicate parts of our infrastructure in a local environment to easily renew, improve, debug or extend it.

Since our last posting we migrated quite a bunch of services to the ibox structure, so it might be a good time to share what you can do with it now. It’s also a good opportunity for us to get things at least somewhat into a shape we can share (read: document).

ibox types

ibox types are more or less our puppet roles, and within our infrastructure in general we try to isolate things where it makes sense. This means we might run different ibox types together on a certain node, while other ibox types might be purely isolated within their own VM. The same is possible with the ibox, allowing you quite nice setups as you will see later.

Though something should be mentioned: we never remove types from our nodes, so this is also not something we want to enable within the ibox. Removal logic that respects all kinds of different combinations can become very cumbersome, and given that the ibox’s idea is to easily replicate something within a local development environment, we don’t support that here either.

You should be able to easily get an overview of the different types by looking at the ibox::types:: space within the ibox module. Also we try to share certain examples within the vagrant.yaml.example file.

As you might see, there is a broad range of our services available and more will come in the future. To give you an idea how to start using that concept, here is an example of how to use a few of the available types.

webhosting & webservices

Besides what most people are using from us – email & other communication services – we also offer webhosting for your static or php-based websites. Most of the things that drive the setup of such a webhosting are part of the webhosting and apache modules. You can easily use them, for example, to replicate the configuration of a webhosting with us locally on your machine, using the webhosting type and the following hiera data example.

In short what this does is:

  1. Configure a few defaults for a webhosting host, especially with regards to SFTP access.
  2. Set up the databases required for the hostings.
  3. Configure a simple php hosting, with the php security checker from SektionEins installed automatically (browse to http://php56.ibox-one.local/phpconfigcheck.php to verify our configuration). We activated only one, as it should show what the others do as well.
  4. Automatically install a full WordPress installation that is set up and ready to serve your blog posts. You can retrieve wordpress’ admin password using trocla get wordpress_adminuser_wp.ibox-one.local plain
  5. Automatically install a full Mediawiki installation that is set up and ready to organize your content.
  6. And last but not least: a SimpleMachines Forum installation ready to go through the web installer. Something which we didn’t yet automate to the end, but we would take merge requests for that! 🙂

All these installations match what and how we install things on our webhostings. So if you have a problem debugging something, this might be a good opportunity to get a real shell for your webhosting 🙂

# a webhosting type with mysql
# to illustrate we also install a bunch of hostings
ibox::types: ['webhosting','dbserver']
# we don't need postgres here
ibox::types::dbserver::is_postgres_server: false
# let's get some space for the webhostings
ib_disks::datavgs::www::size_data: 12G
# let's make sure our hostings can login
# over ssh but only using sftp and are chrooted
sshd::sftp_subsystem: 'internal-sftp'
sshd::allowed_groups: 'root sftponly'
sshd::use_pam: 'yes'
sshd::hardened: '+sha1'
sshd::tail_additional_options: |
  Match Group root
         PasswordAuthentication no

  Match Group sftponly
         PasswordAuthentication yes
         ChrootDirectory %h
         ForceCommand internal-sftp
         AllowTcpForwarding no
# the databases we need
ib_mysql::server::default_databases:
  'wp_test': {}
  'wiki_test': {}
  'smf_test': {}
ib_webhosting::hostings::php:
#  'php.ibox-one.local':
#    git_repo: 'https://github.com/sektioneins/pcc'
#  'php54.ibox-one.local':
#    git_repo: 'https://github.com/sektioneins/pcc'
#    php_installation: "scl54"
#  'php55.ibox-one.local':
#    git_repo: 'https://github.com/sektioneins/pcc'
#    php_installation: "scl55"
  'php56.ibox-one.local':
    git_repo: 'https://github.com/sektioneins/pcc'
    php_installation: "scl56"
# setup a wordpress hosting fully automatic
# mind the database above
ib_webhosting::hostings::wordpress:
  'wp.ibox-one.local':
    blog_options:
      dbname: 'wp_test'
      adminemail: 'admin@ibox.local'
# setup a mediawiki fully automatic
# mind again the database above
ib_webhosting::hostings::mediawiki:
  'mw.ibox-one.local':
    db_name: 'wiki_test'
    contact: 'admin@ibox.local'
    sitename: 'mw'
    db_server: '127.0.0.1'
# install a smf hosting, ready to be clicked
# through the webinstaller
ib_webhosting::hostings::simplemachine:
  'smf.ibox-one.local': {}
# setup all different kinds of php hostings, either using
# the system php installation or scl installations

Webservices are managed services that we fully run and provide to our users. Such a service is for example coquelicot, powering our dl.immerda.ch service. And you can actually set up your own by adding the following hieradata:

# webservices
ibox::types: ['webservices',]
#ib_apache::services::webhtpasswd::htpasswd_name: 'ht.ibox-one.local'
# get a coquelicot up and running
ib_apache::services::coquelicot::instances:
  'dl.ibox-one.local': {}

And as an example we have added another service of ours, htpasswd.immerda.ch, there as well.

tor onion services

Since last week we also integrated the way we set up onion services into ibox. So you can actually make your ibox available as an onion service too:

ibox::types: ['onion_service']
ib_tor::onion::services:
  "%{hostname}":
    '22': {}

This will create an onion service called ibox-one (you can get the onion address @ /var/lib/tor/ibox-one/hostname) which makes ssh accessible.

You can combine that and make for example a wordpress that you installed available over an onion service:

ibox::types: ['webhosting','dbserver','onion_service']
# we don't need postgres here
ibox::types::dbserver::is_postgres_server: false
# let's get some space for the webhostings
ib_disks::datavgs::www::size_data: 12G
# the databases we need
ib_mysql::server::default_databases:
  'wp_test': {}
# setup a wordpress hosting fully automatic
# mind the database above
ib_webhosting::hostings::wordpress:
  'wp.ibox-one.local':
    domainalias: ['*.onion']
    blog_options:
      dbname: 'wp_test'
      adminemail: 'admin@ibox.local'
ib_tor::onion::services:
  "wp_os":
    '80': {}

For simplicity reasons we are serving any .onion address for the wordpress host, so we don’t need to read out the generated onion address and rerun puppet, but you get the idea. Please note that some components might need additional tuning to make them safe when being used through a local onion service, as lots of services assume localhost to be a safe place.

And so you get your wordpress served through an onion service:

$ torsocks curl -I "$(vagrant ssh -c 'cat /var/lib/tor/wp_os/hostname' | cut -c 1-22)/"
HTTP/1.1 200 OK
Date: Fri, 18 Nov 2016 15:12:01 GMT
Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips mod_fcgid/2.3.9
Link: <http://wp.ibox-one.local/?rest_route=/>; rel="https://api.w.org/"
Content-Type: text/html; charset=UTF-8

XMPP man-in-the-middle via tor

Update: It was pointed out to us that the word ‘wide-spread’ below is misleading since the cumulative exit probability of those nodes was probably below .5%. What we wanted to say instead is that the number of domains affected was large, when a bad exit was involved.

We saw some wide-spread XMPP man-in-the-middle via malicious tor exit nodes during the last 24h. The attacks were only targeting starttls connections on port 5222. The mitm served forged self-signed certificates for various Jabber domains, one of them being our imsg.ch. The attack was orchestrated between multiple exit nodes acting in sync. All of them served the same set of forged certificates, allegedly created around midnight March 2nd to 3rd, using common names tailored to the various XMPP servers.

We tried a small sample of XMPP servers, out of which we recorded the following domains being intercepted:

  • freifunk.im
  • jabber.ccc.de
  • jabber.systemli.org
  • jappix.org
  • jodo.im
  • pad7.de
  • swissjabber.ch
  • tigase.me

For a handful of other domains the connection attempts were dropped, and google xmpp was the only one we found to be unaffected.

The exit nodes involved in this attack were reported to the tor project and seem to be dysfunctional by now. The ones we know of are:

  • FAFE24D8CF973BC38B54500DA666EEE44F02C642
  • 6269E9B3549012C44F518D2D123E41A4F320157E
  • 04DDEEAB315956AD956AF046338FB8E5B52B2DAF
  • DB213B8BD383A955CC383CAB76F8DD71A7198F47
  • 0276C54A43ABF27AB0247AF377952A306605FB8A
  • A1B3C065339D11AC361D85B0A3F9B59A23BD818A
  • CAD72B06527F8534640E8E47AEE38E2A3321B2D4
  • 5CBDC2AB702784154E7EFE7E6F87645EB107E8FA
  • BAB1FA3492162DB8058F464AD99203144F2AAF87
  • 187C2945109982516002FE6620E46A94B9B81A6E
  • 6269E9B3549012C44F518D2D123E41A4F320157E
  • FAFE24D8CF973BC38B54500DA666EEE44F02C642
  • FB134ED5DEE131B707270D9E99D72201CC96D668
  • E46021F7921EBB836B318863BDD432982BAA7BD0

Here is the certificate which was presented when you tried to access imsg.ch:

Certificate:
    Data:
        Version: 1 (0x0)
        Serial Number: 12273528369637281981 (0xaa54550634e8d0bd)
    Signature Algorithm: sha1WithRSAEncryption
        Issuer: CN=imsg.ch
        Validity
            Not Before: Mar  3 12:08:43 2016 GMT
            Not After : Jan 10 12:08:43 2026 GMT
        Subject: CN=imsg.ch
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
        [...]
SHA1 Fingerprint=2C:F2:07:E8:19:ED:4E:CA:81:59:6E:3F:D8:59:52:B8:12:22:88:DB

What was curious is that the mitm SSL endpoint was sending a TLS session ticket. Does anybody have an idea if that could lead to an additional attack being carried out, or if it was merely an artifact of their SSL stack? One explanation we have is that the SSL terminator might have seen all packets originating from the same local tor daemon IP.

Here’s more log, for your convenience:

Certificate chain
 0 s:/CN=imsg.ch
   i:/CN=imsg.ch
-----BEGIN CERTIFICATE-----
MIICoDCCAYgCCQCqVFUGNOjQvTANBgkqhkiG9w0BAQUFADASMRAwDgYDVQQDEwdp
bXNnLmNoMB4XDTE2MDMwMzEyMDg0M1oXDTI2MDExMDEyMDg0M1owEjEQMA4GA1UE
AxMHaW1zZy5jaDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBALqSTvWG
ohTiP9DJztuSki2NTLRUfC9/1gbby2TK9kKPFKLyeIyZDtvgQvigT+ToNfvOl2EW
XX4WwlQ9sWGY+C2nZrX5EmttE7zV1/v3tYcPf9Vdir83WXX8/IRauewMxghHd2kH
LTQCTjYtOao+b2VEBDf6QsOV3DEAeM3tbitgFHazmxGMBI0LvadT2NWDVK36gVri
nD4FTV52RydamGMN9hwLP8Lp5RFcmoIOf8xJcYkMRpusvTNDCWERJYHrSnq0fppE
Y2i6EIMBLKpfbatO7IVyy4E1zAMgcVzvnDBQ2DsmcFpjLrzBVmW73YnH7GsGcuPk
Z9ha1eCGML2VncsCAwEAATANBgkqhkiG9w0BAQUFAAOCAQEAG98eWZQpv0V7e8Lt
i/ajKnZrFiv3kh6GwuJAXOH0dimhoNOaCMfwY2e30e3rxSNpXGr09mx6qozH2oh4
CbsZt2zd3jfeFeFe7bEWZijkigMcr+zI4IPpVeHfnZLidaMo/Pr5WMtaHgfO/kOC
UGboN3YbnVVQ7aFvrpvFhyfoemZD44O7ieFObKcmKHySXNyLrIYL1y9LNpWNZF5X
c6JsGHjFrefWI0smcd/aIyus0UTNJ/UaZwRxNbdlnSpXL0+nptnFadXqCo7ygZeE
ciqz/ckEIY0i4S39O1hO3LXEpT6gmlZnJt9Kffc3zLw5nS3IE+LpCbxnGNn53MCG
R977LA==
-----END CERTIFICATE-----
---
Server certificate
subject=/CN=imsg.ch
issuer=/CN=imsg.ch
---
No client certificate CA names sent
Peer signing digest: SHA512
Server Temp Key: ECDH, P-256, 256 bits
---
SSL handshake has read 1948 bytes and written 490 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES256-GCM-SHA384
    Session-ID: [... 64 bytes ...]
    Session-ID-ctx: 
    Master-Key: [... 96 bytes ...]
    Key-Arg   : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    TLS session ticket lifetime hint: 300 (seconds)
    TLS session ticket: [... 0x100 bytes ...]

    Start Time: [...]
    Timeout   : 300 (sec)
    Verify return code: 18 (self signed certificate)

Another pygrub adventure

So the bug from the article Get Centos 7 DomU guests booting on Xen 4.1 hit us again after a while, as we wanted to reboot a few guests.

Everything still seemed to be fine within the guest; nevertheless pygrub on the host wasn’t able to find a kernel to boot.

Revisiting one of the old xen bugs that we had already searched through during our previous experience showed us that we also need to patch pygrub on the host:

> -            title_match = re.match('^menuentry ["\'](.*)["\'] (.*){', l)
> +            title_match = re.match('^menuentry ["\'](.*?)["\'] (.*){', l)

in /usr/lib/xen-4.1/bin/pygrub and your guests boot again.