trocla – get (hashed) passwords out of puppet manifests

Background

At immerda.ch, we try to automate every aspect of our infrastructure, so we can work on more interesting things and let the repetitive and boring work be done by puppet. This means that we also manage a lot of users, required by the different services, with puppet, whether these are plain system-, mysql- or any other kind of users. For some of them, e.g. the SFTP-Users for the webhostings, we are also managing the passwords.

Up to now, we generated and hashed the passwords by hand and put them in our manifests, which means that the password hashes also ended up being version controlled by git. Managing the users and their passwords with puppet works very well and has proven to be a very stable solution. However, it has the disadvantage that a lot of (mainly hashed) passwords are lying around in different places in our infrastructure:

  1. The host on which they are used: In the actual backend (shadow, mysql, …) and the puppet catalog.
  2. On the puppet master: In the manifests that are checked out from git.
  3. In the git repository: This means a) on our internal git server, but b) also checked out on different systems of immerda admins.

Points 1 and 2 are obvious and can’t be changed, given that we want local authentication (no central ldap) for most parts of our infrastructure and given that we want to run puppet in master/agent mode as our source of truth for various reasons. Although we take the protection of the data we handle seriously, and no immerda admin should ever work with any content from our systems on disks without strong encryption, we think that it is better not to spread data more than necessary. So point 3 was always a bit annoying in our eyes.

Meet trocla

To address this issue and also to make password generation a bit more comfortable, we use trocla. Trocla has 2 main parts:

  1. A gem that provides all the logic and a little cli to work with the data
  2. A puppet function to query trocla while compiling the manifests, which fetches the passwords from trocla (and thus generates them if not yet existing).

So instead of generating and hashing the passwords ourselves and keeping them in our puppet manifests (read: in our git repository), we simply use a puppet function that does all of these steps for us and keeps our git repositories password free.

How trocla works

Trocla is a wrapper around a key/value storage. Actually, it was built so that you can use any kind of key/value storage that is supported by a newer, not yet released version of the moneta-gem. By default trocla uses a yaml backend, which should be sufficient for most use-cases. The keys are used in the manifests to look up the passwords from trocla, and the values are the stored passwords. That’s more or less the big picture.

However, with only that feature set we could also simply stick with something like extlookup or hiera (or hiera-gpg) and just put our values in a storage file that is not in our git repository. But let’s take a step back and look at the forms of a password that we need whenever we set a password somewhere:

  1. The plaintext password (which a user can later use to log in)
  2. The password in the format of the actual service. So for example for local users we use salted SHA-512 passwords, MySQL passwords are stored using a simple SHA1, etc.

Trocla extends this simple key/value lookup with a third argument named format. This argument refers to the format of the password that we are interested in, i.e. the one used by the service/user we are managing. The format option actually refers to the algorithm that has been used to hash the password. And to automate things a little bit further: trocla will generate a random password if it does not yet find a password for the key.

In short, we can describe trocla’s workflow as follows:

  1. Do I have the key/format tuple stored? Yes? -> Return the stored value.
  2. Do I have the value for the key stored in the plain format? Yes? -> Generate the requested format, store it (for later lookups) and go to 1.
  3. Otherwise: Generate a new random password, store it as plain format for that key and go to 2.
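The three steps above can be sketched in a few lines of Python. This is only an illustration of the lookup logic, not trocla's actual (Ruby) implementation: the in-memory `store` dict, the `password` and `random_password` functions and the `FORMATS` registry are names made up for this sketch. The `mysql` formatter uses the double-SHA1 scheme that MySQL uses for its native password hashes.

```python
import hashlib
import secrets
import string

# Hypothetical in-memory stand-in for trocla's key/value backend:
# maps key -> {format: value}.
store = {}

def mysql_hash(plain):
    """MySQL-style hash: '*' plus the upper-case hex SHA1 of the binary SHA1."""
    inner = hashlib.sha1(plain.encode("utf-8")).digest()
    return "*" + hashlib.sha1(inner).hexdigest().upper()

# Formatters turn a plaintext password into the requested format.
FORMATS = {"mysql": mysql_hash}

def random_password(length=16):
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

def password(key, fmt):
    formats = store.setdefault(key, {})
    # Step 1: the key/format tuple is already stored.
    if fmt in formats:
        return formats[fmt]
    # Step 3: no plaintext stored, so generate and keep a new random one.
    if "plain" not in formats:
        formats["plain"] = random_password()
    # Step 2: derive the requested format from the plaintext and remember it.
    if fmt != "plain":
        formats[fmt] = FORMATS[fmt](formats["plain"])
    return formats[fmt]

first = password("mysql_user1", "mysql")
second = password("mysql_user1", "mysql")
print(first == second)  # the stored hash is returned on the second call
```

Because the derived format is stored on the first call, every later call for the same key and format returns exactly the same value.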

We need to store the hashed passwords, as we always want to return the same password hash for a certain key across multiple runs. Otherwise a salted format would yield a different hash on every run and we would violate puppet’s requirement for idempotency. Also, as mentioned above, at some point (mainly in the beginning) you are usually also interested in the plain password, hence we store that one as well.

Workflow

Now, by using trocla, we are able to get rid of any passwords in our manifests and replace them with puppet-trocla-function calls, and puppet will retrieve the passwords during catalog compilation in the right format. This means that the passwords are now only stored in 2 places:

  1. On the host itself: In the compiled catalog and the backend.
  2. On the puppet master: As hashed version and as plaintext password.

So, the only place where the plaintext password is stored is on the puppetmaster, which is our source of truth and central point to manage all our systems anyway. However, if we don’t need the plaintext password on the target host itself, it is not really necessary to keep it on the puppetmaster either. Still, our users should get the plaintext password, so they can actually log in and use the service. Would be nice, wouldn’t it? ;-)

If we keep trocla’s lookup in mind: Once the hashed password is generated and the plaintext password is not used in any place in the puppet manifests, there is no need to keep the plaintext password on the puppetmaster. As mentioned in the beginning, the trocla gem also comes with a little cli tool to work with its storage. All the different actions of that cli are explained in the README file, and the one we are interested in is delete:

$ trocla delete user1 plain
# This will delete the plain password of the key user1 and return it.

The last part of how that command works is the most interesting: This action will not only delete the value of the supplied format, but will also return (read: display) it! So we can get the plaintext password while removing it from the puppetmaster. Two important things to remember at this point:

  1. In the manifests, we usually only query the hashed format.
  2. Once the hashed password is stored, trocla will directly return that stored format.

So, to wrap up, our workflow for generating passwords for our users now works as follows:

  1. Add the new user to the puppet manifests and use the trocla function to query the passwords.
  2. Let puppet run on the target host, so puppet manages the user, hence queries trocla for the password, which will generate the passwords in the first run, but subsequently directly return the hashed password.
  3. Log in to the puppet master and query the plaintext password by deleting it. This has the advantage that you not only get the password, but that it is also no longer stored on the puppetmaster.

Note: Make sure that you always delete only the plain format, and not a hashed format or no format at all. Supplying no format will delete and return all stored formats for that key, which is the same as a password reset. Deleting a hashed format is only interesting if the format uses a salt and you’d like to resalt the hashed password while keeping the plaintext. However, for both cases there are other actions provided by the trocla cli.
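The delete semantics described here can be illustrated with a small Python sketch. The `store` dict, its example values and the `delete` function are hypothetical stand-ins for trocla's storage and cli action, not its real code.

```python
# Hypothetical stand-in for trocla's storage: key -> {format: value}.
store = {
    "user1": {
        "plain": "s3cret-example",
        "sha512crypt": "$6$examplesalt$examplehash",
    }
}

def delete(key, fmt=None):
    """Remove and return one stored format, or every format if fmt is None."""
    if fmt is None:
        # No format given: everything for that key goes away (a password reset).
        return store.pop(key, None)
    return store.get(key, {}).pop(fmt, None)

plain = delete("user1", "plain")
print(plain)           # the plaintext is handed out exactly once
print(store["user1"])  # only the hashed format is left on the master
```

Later lookups of the hashed format still succeed, because that value stays stored under the key.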

Supported hashes and more

Trocla currently only supports a few hashes:

  • bcrypt: bcrypt-hashed passwords
  • md5crypt: salted MD5-shadow passwords
  • mysql: SHA1-Hashes for MySQL-Users
  • pgsql: MD5 hashed passwords for PostgreSQL, that are salted with the username, which you need to pass as an option
  • sha256crypt: salted SHA256-shadow passwords
  • sha512crypt: salted SHA512-shadow passwords

However, trocla was built with easy extension in mind, and if you look at the existing formats you should quickly get an idea of how to add further ones. Git pull requests are always welcome!
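To give an idea of what a format boils down to, here is a minimal Python sketch of the pgsql format, which needs the username as an option. It relies on PostgreSQL's MD5 scheme (the string md5 followed by the hex MD5 of password plus username). The `FORMATS` registry and the `render` function are invented for this illustration and are not trocla's actual API.

```python
import hashlib

def pgsql_hash(plain, username):
    """PostgreSQL MD5 format: 'md5' + hex MD5 of the password followed by the username."""
    return "md5" + hashlib.md5((plain + username).encode("utf-8")).hexdigest()

# Hypothetical registry: a further format is just one more entry.
FORMATS = {"pgsql": pgsql_hash}

def render(fmt, plain, **options):
    """Turn a plaintext password into the requested format."""
    return FORMATS[fmt](plain, **options)

print(render("pgsql", "example-password", username="some_user"))
```

Since the username acts as the salt, the same plaintext yields a different hash for every role.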

And to finish, a few examples of how trocla is used in our manifests:

# common usage:
webhosting::static{'www.immerda.ch':
  ...
  password => trocla('webhosting_www.immerda.ch','sha512crypt');
}
# format requires an option:
postgres::role{'some_user':
  ...
  password => trocla('postgres_some_user','pgsql','username: some_user');
}

But we took that part even a step further and integrated the usage of trocla into our manifests in a completely transparent manner. Examples can be found in the mysql module or the webhosting module.

Future

Trocla now gives us an automated integration of password storage and generation into puppet manifests. If we take these steps a little bit further, we see plenty more options to automate various things and probably also to integrate them with other interfaces (to users?), so that various configuration parts of webhostings could be done by the users themselves but would still be managed by puppet.

The state of Forward Secrecy in OpenSSL

It could be possible that your SSL services are not providing forward secrecy and you haven’t noticed yet!

Many SSL ciphers provide forward secrecy by using ephemeral Diffie-Hellman (EDH) keys. This means that for every SSL session a temporary encryption key is negotiated and the normal key is only used for verifying authenticity. As the OpenSSL documentation states:

“By generating a temporary DH key inside the server application that is lost when the application is left, it becomes impossible for an attacker to decrypt past sessions, even if he gets hold of the normal (certified) key, as this key was only used for signing.”

Although ciphers using EDH will most probably be available in your setup, they are often disabled because the application fails to provide DH params to OpenSSL. Since it is costly to generate those parameters – which are needed to negotiate a DH key exchange – OpenSSL suggests creating them when an application is installed.

Many applications will not do this, but rather let the user generate and include the parameters in the configuration manually. Since (i) most administrators are not aware of this problem, (ii) those applications do not yield any warnings if the parameters are missing and (iii) OpenSSL silently disables ciphers with unsatisfied requirements, forward secrecy is not available in many SSL connections.

Update: Also see Bernat’s blog for a nice roundup on the cryptographic background of perfect forward secrecy and the new, faster elliptic curve implementations.

Verify your Setup

Try to open an SSL session to your service (https, imap, smtp, jabber, irc, …) with

openssl s_client -port <port> -host <yourdomain.tld>

This will show you the details of the SSL session, and you can verify whether the cipher in use includes EDH:

New, TLSv1/SSLv3, Cipher is DHE-RSA-AES256-SHA

or not:

New, TLSv1/SSLv3, Cipher is AES256-SHA
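If you have many services to check, the same test can be scripted. The following Python sketch uses the standard ssl module. The function names are made up for this example, and the name check rests on the assumption that OpenSSL cipher names for TLS 1.2 and older start with the key exchange (DHE, its older spelling EDH, or the elliptic curve variant ECDHE). It only covers services that speak TLS directly on connect, not STARTTLS.

```python
import socket
import ssl

def uses_ephemeral_dh(cipher_name, protocol=""):
    """Return True if the negotiated cipher implies an ephemeral (EC)DH key exchange."""
    # TLS 1.3 suites always use an ephemeral key exchange.
    if protocol == "TLSv1.3":
        return True
    # Older protocols: the key exchange is the first part of the OpenSSL name,
    # e.g. DHE-RSA-AES256-SHA.
    return cipher_name.split("-")[0] in {"DHE", "EDH", "ECDHE"}

def negotiated_cipher(host, port, timeout=10):
    """Connect to host:port and return (cipher name, protocol version)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.cipher()[0], tls.version()

print(uses_ephemeral_dh("DHE-RSA-AES256-SHA"))  # cipher from the first output above
print(uses_ephemeral_dh("AES256-SHA"))          # cipher from the second output above
```

For a live check you would call something like `uses_ephemeral_dh(*negotiated_cipher("yourdomain.tld", 443))`, with host and port replaced by your own service.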

Fix your Setup

Applications which we found to work with EDH ciphers are Apache and Dovecot.

Update: Applications which we found to not support EDH out of the box are: squid, exim, courier

In most applications you can configure a dhparams variable somewhere. The dhparams can be generated with the following command:

openssl dhparam -out dhparams.pem 2048

We already fixed the problem in the following services:

Squid (reverse proxy)

In /etc/squid/include.d/https_port add dhparams=/path/dhparams.pem to every line

Exim

In /etc/exim.conf add the line tls_dhparam = /path/dhparams.pem

Fix the general Problem

This problem has two main reasons:

  1. Applications do not check whether the requirements of the user-selected ciphers are satisfied. The requirements are listed in the OpenSSL documentation. Alternatively, applications could just always generate dhparams when they are installed, since EDH ciphers should be preferred anyway.
  2. The OpenSSL API does not provide any means to verify the state of the configuration. There is no function to check if cipher requirements are met and the SSL_CTX setup is consistent. As long as at least a single cipher (even the least secure) in the acceptable ciphers list can be initialized, OpenSSL will not complain to the application.

If you find any application which exhibits this problem, please file a bug report and convince the maintainers to at least generate a warning to the user and state the consequences in the documentation.

If you are a developer of an application which uses OpenSSL please consider shipping install scripts that generate dhparams or generate them on the fly if they are missing. Please do not just let OpenSSL silently disable a key feature of SSL.
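Generating the parameters on the fly can be as simple as the following Python sketch. It shells out to the same openssl dhparam command shown earlier in this post. The function names and the injectable `generate` argument are choices made for this example only, and a real install script would also have to care about file permissions and about the time the generation takes.

```python
import os
import subprocess

def generate_with_openssl(path, bits=2048):
    # Runs "openssl dhparam -out <path> <bits>"; this can take a while.
    subprocess.run(["openssl", "dhparam", "-out", path, str(bits)], check=True)

def ensure_dhparams(path, generate=generate_with_openssl):
    """Create the DH parameter file if it is missing or empty.

    Returns True if the file had to be generated, False if it was already there.
    """
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return False
    generate(path)
    return True
```

An application would call `ensure_dhparams` once during installation or startup and log a warning if generation fails, instead of silently running without EDH ciphers. A Python server could then hand the file to `ssl.SSLContext.load_dh_params`.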

Storing mail credentials using bcrypt

We wanted to migrate the hashes of our mail user database for some time now. We couldn’t sleep at night anymore since there were still md5 passwords in there. This database is mainly used by exim and dovecot in our setup.

First our plan was to migrate to salted sha512 like it is described in the dovecot wiki. But this migration approach has a huge problem: all passwords are sent to the sql server in plaintext – just to be able to refer to them in a post login script. This is rather insane since there is really no technical reason why the sql server needs the plaintext passwords.

So we went off writing our own authentication script that implements the dovecot checkpassword specification. Now we can migrate our hashes (or everything else we want to do) while checking the password.

And most importantly: inspired by this little post we also decided that it would make sense to use a more sane hash function to store the passwords – namely bcrypt instead of sha.
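The idea of migrating a hash while checking the password can be sketched as follows. This is not our actual script (which is written in Ruby), just a minimal Python illustration under several assumptions: the legacy entries are taken to be unsalted hex MD5 digests, the `db` dict stands in for the SQL table, and PBKDF2 from the standard library is used as a stand-in for bcrypt, which would need an extra library. The input format is the one from the checkpassword interface: username and password, each terminated by a NUL byte, read from file descriptor 3.

```python
import hashlib
import hmac
import os

def parse_checkpassword(data):
    """Split the checkpassword input b'user\\0password\\0...' into (user, password)."""
    fields = data.split(b"\0")
    if len(fields) < 3:
        raise ValueError("malformed checkpassword input")
    return fields[0].decode("utf-8"), fields[1].decode("utf-8")

def new_hash(password, salt=None):
    # Stand-in for bcrypt in this sketch: salted PBKDF2-HMAC-SHA256.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return "pbkdf2$" + salt.hex() + "$" + digest.hex()

def verify_and_migrate(user, password, db):
    """Check the password against db[user]; upgrade a legacy MD5 hash on success."""
    stored = db.get(user)
    if stored is None:
        return False
    if stored.startswith("pbkdf2$"):
        salt = bytes.fromhex(stored.split("$")[1])
        return hmac.compare_digest(stored, new_hash(password, salt))
    # Legacy entry, assumed here to be an unsalted hex MD5 digest.
    legacy = hashlib.md5(password.encode("utf-8")).hexdigest()
    if not hmac.compare_digest(stored, legacy):
        return False
    # Re-hash while we still have the plaintext; it never leaves this process.
    db[user] = new_hash(password)
    return True
```

The point of the design is that the plaintext is only ever seen by the authentication script, so the sql server receives nothing but the new hash.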

If you want to check out our solution, skip down to the dovecot howto. Please just write to us if you need help with this. It is still quite alpha and not so well documented.

Other services

Our MTA (Exim) now uses imap to authenticate smtp users. The solution is originally from here.

For the users to change their passwords we use horde-passwd which does not support bcrypt. We fixed this by extending the sql backend with a custom driver that assumes passwords are in bcrypt. We agree that this is a hack…

Future Plans

The whole approach should be integrated into the tools we use. In the long run it would definitely pay off to directly extend dovecot, exim and horde to support this authentication. If we ever have time we should write patches for them.

Since we now have our custom authentication solution it would be cool to do even more stuff with it.

For example a long standing plan would be to use encfs to encrypt the maildirs. On login we would decrypt the maildir of the user and copy in all mails he got since the last login. When he’s not logged in we couldn’t access his mailbox anymore!

Dovecot Howto

If you’re interested in using our checkpassword script, it is available in our git repo. To get it running you need to:

  • set CONFIG_DIR in checkpasswd-bcrypt.incl.rb
  • adapt checkpasswd-bcrypt.conf.rb to your needs by providing a db access and the names of your db columns
  • adapt the sql queries in checkpasswd-bcrypt.sql.conf.rb to your setup
    (There is also a query to store the month of the last login. You can disable this in the config with KeepLastlogin = false)
  • set the dovecot config to use the checkpasswd-bcrypt.rb script

Arver – distributed LUKS key management

We recently developed Arver for LUKS – a distributed key manager with benefits!

If you want to use it too, check out arver at codecoop.

It is not just another tool to conveniently store passwords for LUKS. No, it is a shiny monkeywrench for all sorts of challenges you face when administrating more than one server with encrypted harddrives. Plus it even enhances LUKS security in several ways.

But let me prove this by giving some examples:

shared passwords no more

Shared passwords are arguably one of the worst threats to your environment. They are hard to change, hard to revoke, hard to keep safe and tend to be simpler than advised. With arver every admin has their own gpg-key that is used to grant them access to the LUKS disks. Moreover, access can be granted on a per-device basis!

Let’s assume I created a new LUKS partition and want to grant bob access to it: ‘arver --add-user bob a_server/a_disk’ will assign a new passphrase to a_disk on a_server and store it encrypted with bob’s public gpg-key as arver-key. Bob can then use this arver-key to open a_disk. No need to communicate any password in plaintext!

mind the rubberhose

Well, what would Alice do if Bob made her aware that he might be under pressure to release internal data? She would just execute ‘arver --del-user bob ALL’.

And even if she didn’t do this, Bob could always claim that he doesn’t have access to a particular disk, since his arver-key doesn’t reveal which disks it is for.

more uptime

Arver lets you automate all tasks surrounding LUKS management. It has script hooks for pre-/post open/close. Imagine you had a power outage in a_colo. With the right setup it should be enough to run: ‘arver --open a_colo’.

This will loop over all hosts at a_colo, e.g. first executing pre_open scripts on a per-disk basis that create a loop-device, then post-open scripts on a per-host basis to start all virtual servers that were waiting for a LUKS disk to be opened.

interested?

If you’d like to know more about Arver, we recommend reading the man page, looking at this confusing diagram or downloading arver directly as a gem.

securely modelling social graphs

Immerda has an invite system for its services. This means you have to be invited by an immerda user to get an account. The person who invited you is then your trust-relation to us. E.g. he can perform password resets on your behalf.

We have had this system for quite some time and are happy with it, since we see ourselves as a kind of peer-group ISP.

What is addressed

The problem remains how to store those trust-relations to be able to verify if password resets are legitimate. And most important: how to store this information in such a way that it is only visible to the people belonging to a certain trust relation.

The following ideas can be applied to any kind of social graph that has to be (partially) stored:

First of all, we want to ensure that this sensitive information cannot be used to do any kind of data mining. It should be relatively hard for people who don’t take part in a certain relation to decipher it.

Furthermore we want our users to be able to change their trust links. (Because they might not trust the inviting person anymore.)

And we also thought of providing the possibility to have multiple trust links. With this feature we could either require more than one person to agree to a password reset, or we could have multiple persons being able to request a password reset.

What makes it hard

The basic idea is to store this information in a database that is cryptographically designed to prevent that kind of mapping. But with this database it should nevertheless be possible to verify the authenticity of a request.

But what needs to be stored? In order for a user to be able to manage his trust-relations, it must be possible for him to see, which people he trusts. So we obviously need to store this information. But of course in such a way that only he or she can access this information.

Then we need the people that are trusted by others to be able to prove that trust in case they need to. This part of the relation has to be initiated by one person, provable by the second person that is trusted, and not guessable by an outside person.

Then we have the following issue: How do you cope with changes in the trust relations? E.g. if A wants to cancel/add a trust relation to B? Can this be achieved without B having to first accept this relation and therefore without the need of storing this relation temporarily in plaintext?

Our proposal

We propose a model that tries to address those issues and make reasonable tradeoffs between ease of use and privacy. To implement this scheme each user has to have some sort of public/private keypair. (And we intend to use the normal login password as passphrase for the private key.)

We’d like to explain this model by providing all the relevant data structures and the use cases:

terminology

  • trusted user : user entitled to be able to perform a password reset
  • trusting user: user who trusts the trusted user
  • encX(blob): means that blob is encrypted with the publickey of X

Our trust database would look like this:

relations database:

trusting-user   trusted-user   reset-token           hash
A               encA( B )      encB( randomData )    hash( randomData )
A               encA( C )      encC( randomData2 )   hash( randomData2 )

usecases:

1.  If A wants to add a trust-relation to B:

  • Just insert the corresponding entry into the database:
  • Store the trusted-user (B) encrypted with A’s public key.
  • Encrypt a random string R with the public key of B as reset-token.
  • Store a hashed version of R to be able to verify the reset-token.

2.  If B wants to reset the pw of A:

  • Decrypt all the possible reset tokens of A (where A is the trusting-user) with the private key of B.
  • If one of those decrypted plaintexts matches the stored hash, the request is legitimate.
  • We can now reset the password and generate a new keypair for A, since he lost access to his old private key too.
  • When A logs in with his new password: He has to reenter his trusted-users since the old entries are lost since his privatekey has changed.
  • A particular problem persists at this point: All the relations where A is the trusted-user are not valid anymore (since he lost the correct privatekey) and have to be rebuilt. Therefore:

3.  Each time a user logs in:

  • Go through all the trust-relations where he is the trusting-user and regenerate the reset-tokens and the hashes with new random strings. This is needed since the keypair of the trusted-user could have changed in the meantime.
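The three use cases can be sketched in Python as follows. Note the big assumption: `ToyKey` is an insecure placeholder for a real public/private keypair (a simple XOR keystream, where encrypt stands in for the public key operation and decrypt for the private key operation), since the standard library has no public-key encryption. All names are made up for this sketch. In a real system the server would only hold the public keys plus the private key of the user who is currently logged in.

```python
import hashlib
import os

class ToyKey:
    """INSECURE placeholder for a keypair, only to make the sketch runnable."""
    def __init__(self):
        self._secret = os.urandom(32)

    def _xor(self, data):
        stream = hashlib.shake_256(self._secret).digest(len(data))
        return bytes(a ^ b for a, b in zip(data, stream))

    def encrypt(self, data):  # stands in for the public key operation
        return self._xor(data)

    def decrypt(self, blob):  # stands in for the private key operation
        return self._xor(blob)

def token_hash(token):
    return hashlib.sha256(token).hexdigest()

relations = []  # one dict per row of the relations database

def add_relation(trusting, trusted, keys):
    """Use case 1: A adds a trust-relation to B."""
    token = os.urandom(32)
    relations.append({
        "trusting": trusting,
        "trusted": keys[trusting].encrypt(trusted.encode("utf-8")),
        "token": keys[trusted].encrypt(token),
        "hash": token_hash(token),
    })

def may_reset(requester_key, trusting):
    """Use case 2: try to decrypt every reset-token of the trusting user."""
    return any(
        token_hash(requester_key.decrypt(row["token"])) == row["hash"]
        for row in relations if row["trusting"] == trusting
    )

def refresh_relations(trusting, keys):
    """Use case 3: on login, regenerate reset-tokens and hashes."""
    for row in relations:
        if row["trusting"] == trusting:
            trusted = keys[trusting].decrypt(row["trusted"]).decode("utf-8")
            token = os.urandom(32)
            row["token"] = keys[trusted].encrypt(token)
            row["hash"] = token_hash(token)
```

Decrypting with the wrong key simply yields garbage that does not match the stored hash, so nobody except the trusted user can prove the relation, and the database never contains the trusted user's name in plaintext.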

Plans

The provided scheme is a general approach to storing this kind of sensitive information. With it you can put all the power of those trust links directly into the users’ hands.

We hope to implement such a system in our planned user account management interface.

And of course we’d love to hear other suggestions to this problem or also comments on our idea.

Not again! a blog?

Hi, this is the new immerda techblog.

We will use this place to write about our experiences on our journey @immerda. Our hope is to spread our solutions and ideas to a wider audience. Topics will probably focus on our technical setup, software projects, plans, recipes and so on.