Configuration management on the Raspberry Pi

Raspberry Pi computers running the Raspbian flavour of the venerable Debian Linux distribution make excellent headless home servers. They are inexpensive, small, quiet, reliable, power-efficient and powerful enough to be employed as NAS appliances, BitTorrent boxes, AirPlay receivers, blog hosts, beer fermentation controllers, etc. Setting up a Raspberry Pi for these applications is seldom as trivial as a:

$ apt-get something

More typical are laborious sessions at the console: cloning git repos, running DDL scripts and tweaking cron tables, as directed by long step-by-step tutorials in blog posts.

If one had to endure this once and then be set for the foreseeable future, things might not be too bad.

However, stable as the Pi hardware is, it does have one Achilles’ heel – persistent storage. The SDHC cards used for holding the root file system are infamous for not holding up very well in the long term (perhaps not too unexpected, considering the much less demanding camera applications SD cards are typically designed for). Eventual file system corruption appears to be almost unavoidable, as the plethora of online problem reports bears witness to [1], [2].

This problem leads to frequent (re)imaging of cards. If you weren’t foresighted enough to keep a full backup image around, you’ll have to start from scratch with an install image, toiling away at the step-by-step guides again.

Enter modern configuration automation tools. The preferred silver bullet of the fledgling devops movement promises big savings in the data center, automating all kinds of system administration tasks on hordes of virtual machines in private or public clouds.

Recently I’ve experimented with applying these tools to a different kind of platform: the humble Raspberry Pi.

The tools provide ways to specify the desired configuration of the nodes under control in various domain-specific languages. These languages typically allow for powerful encapsulation and modularization abstractions, thus managing complexity and promoting reusability. The tools offer several important benefits compared to manual or semi-manual configuration:

  • Large libraries of ready-to-use modules for tasks such as controlling databases, installing packages, etc.
  • Convenient mass application of specifications to multiple nodes
  • Idempotency. Specifications can be re-run repeatedly, ensuring that controlled nodes stay up to date with the latest additions.
  • More friendly syntax than bash.
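
To make the idempotency point a bit more concrete, here is a minimal Python sketch of the idea. It is only an illustration and not how any of the tools is implemented; the function name ensure_line and the cron entry are made up for the example. The function describes a desired state ("this line is present in this file") and only touches the file when that state has not been reached yet, so running it a second time changes nothing.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was
    already in the desired state.
    """
    text = ""
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
    if line in text.splitlines():
        return False  # desired state already reached, do nothing
    if text and not text.endswith("\n"):
        text += "\n"  # avoid gluing the new line onto the last one
    with open(path, "w") as f:
        f.write(text + line + "\n")
    return True


if __name__ == "__main__":
    # Hypothetical cron entry and file name, used only for illustration.
    entry = "*/30 * * * * /usr/local/bin/update-feeds"
    path = os.path.join(tempfile.mkdtemp(), "crontab.example")
    print(ensure_line(path, entry))  # first run changes the file
    print(ensure_line(path, entry))  # second run is a no-op
```

A plain shell script that simply appends the line would add a duplicate on every run, which is exactly what makes re-running hand-written setup scripts risky.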

I’ve taken a closer look at Puppet, Pallet and Ansible.

Puppet is a well-known Ruby-based tool that has been around for a while. The Puppet ecosystem is huge and there is a great assortment of third-party modules ready to use for almost any task. I’ve used Puppet to implement a manifest that installs the Tiny Tiny RSS reader on PostgreSQL with data backups, feed updates, etc.

Pallet is a newer Clojure-based option. Of the trio, it is the one that currently enjoys the least community traction, and the available documentation is sparse and disorganized. Even though I’m attracted to Clojure as a language, these factors caused me to abandon Pallet for now.

Both Puppet and Pallet are heavily geared toward operation in large data centers, offering features to control and organize hundreds of nodes. Puppet also generally relies on central master nodes from which agents on the controlled nodes periodically pull new configuration specs. This is a huge impedance mismatch when one only wants to control a couple of Raspberry Pi servers in a living room.

Ansible is a relatively new Python-based tool, hitting 1.0 as recently as February 2013. Compared to Puppet there is less documentation available and the selection of third-party modules is still a little behind. However, Ansible is under heavy development, with features and new modules added at a furious pace, the latest release being 1.3 as of August 2013.

Ansible is by design push-based and agentless, requiring no running daemon on controlled hosts. Since Ansible uses standard ssh as transport to control hosts, it is perfectly possible to control hosts without installing any additional software on them at all. The apt module, however, requires python-apt and aptitude to be installed (easily achieved remotely with the Ansible shell module).

This means that it’s possible to assume control of a freshly imaged Pi without ever manually logging in!

Ansible is remarkably quick and executes a typical playbook much faster than a corresponding Puppet manifest on the modest Pi hardware. Ansible also comes with a great deal of instant gratification, the learning curve being smoother than that of Puppet, much of which can probably be attributed to a simpler specification language. Ansible playbooks are YAML, while Puppet manifests are written in Puppet’s own DSL. YAML is definitely less ceremony, and more cool. Puppet’s manifests are declarative in nature: one defines the desired actions to be taken and the relationships between them (like before/after), leaving Puppet to figure out the correct execution sequence, while Ansible’s style is more imperative. Puppet’s style of specification may offer better optimization possibilities on the tool level and perhaps specs that are easier to maintain in the long run, but so far I’m very happy with Ansible’s simpler approach.
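
The difference between the two styles can be sketched in a few lines of Python. This is only an illustration of the idea, not how Puppet or Ansible work internally, and the resource names are made up. In the declarative style each resource lists what it requires and a topological sort works out the order; in the imperative style the author simply writes the tasks down in the order they should run.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Declarative style: each (hypothetical) resource names what it requires.
# The order of the entries themselves carries no meaning.
requires = {
    "cron: update feeds": {"app: tt-rss"},
    "app: tt-rss": {"database: ttrss"},
    "database: ttrss": {"service: postgresql"},
    "service: postgresql": {"package: postgresql"},
    "package: postgresql": set(),
}


def execution_order(requires):
    """Return an order in which every resource comes after its requirements."""
    return list(TopologicalSorter(requires).static_order())


# Imperative style: the order is fixed by hand, tasks run top to bottom.
tasks = [
    "package: postgresql",
    "service: postgresql",
    "database: ttrss",
    "app: tt-rss",
    "cron: update feeds",
]

if __name__ == "__main__":
    for step in execution_order(requires):
        print(step)
```

Because the example dependencies form a single chain, the computed order matches the hand-written task list. With a larger graph the declarative version leaves the tool free to pick any valid order, which is where the optimization possibilities come from.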

The conclusion is simple: to control a couple of Pis, go for Ansible!
