Building (Virtual) Appliances with Python

Here is a summary of my lightning talk, for reference… 🙂

Context: You want to build an appliance to do something. It’s beefy enough to run Linux (or FreeBSD or similar). It can be a virtual appliance, a physical appliance on an x86-based architecture, or a physical appliance on ARM or a similar architecture.

You will want to be able to split your code into many processes. These will need to communicate. Since processes in the wild tend to die randomly, you will want the communication to be loosely coupled. The best loosely-coupled communication is shared state. One way to share state is to use an external state store (AKA a database). Relational or not, and perhaps both, you will want some sort of database. But some state needs to be accessed without calling out to an external process. For that, you will want something like what I already blogged about.

Executive summary: Database, shared state bus.
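To make the shared state bus idea concrete, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: the names `StateBus`, `Replica`, `encode` and `decode` are hypothetical, and the newline-delimited JSON wire format is a stand-in chosen to keep the example dependency-free. The bus applies one update at a time and pushes each change to every subscriber, so each process can read its own local copy without calling out to anyone.

```python
import json
import threading


def encode(message):
    """Serialize one message as a line of JSON (an assumed wire format)."""
    return json.dumps(message, sort_keys=True).encode("utf-8") + b"\n"


def decode(line):
    return json.loads(line.decode("utf-8"))


class StateBus:
    """Server side: owns the state and serializes all updates."""

    def __init__(self):
        self._lock = threading.RLock()
        self._state = {}
        self._version = 0
        self._subscribers = []

    def subscribe(self, send):
        """Register send(bytes); the subscriber first gets a full snapshot."""
        with self._lock:
            send(encode({"version": self._version, "state": self._state}))
            self._subscribers.append(send)

    def update(self, key, value):
        """Apply one update, notify every subscriber, return the new version."""
        with self._lock:
            self._state[key] = value
            self._version += 1
            message = encode(
                {"version": self._version, "key": key, "value": value})
            for send in self._subscribers:
                send(message)
            return self._version


class Replica:
    """Client side: a local copy that can be read without any I/O."""

    def __init__(self):
        self.version = 0
        self.state = {}

    def apply(self, line):
        message = decode(line)
        if "state" in message:
            # Full snapshot: replace whatever we had.
            self.state = dict(message["state"])
        elif message["version"] <= self.version:
            return  # stale or duplicate update, ignore it
        else:
            self.state[message["key"]] = message["value"]
        self.version = message["version"]
```

In a real appliance, `send` would write to a socket and `Replica.apply` would be fed lines read from that socket; here they are wired together directly, as in `bus.subscribe(replica.apply)`. A process that dies and comes back simply subscribes again and receives a fresh snapshot, which is what keeps the coupling loose.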

It is possible you will need to use some other language. In that case, the best thing is to go “one process, one language” — instead of embedding another language in Python, consider just making sure your shared state infrastructure works with the other language, and use its native event loop (and native logging system).

Executive summary: separate processes for separate languages.

As mentioned above, processes die. Or get stuck. Either way, you need to make sure that you recover: you will need to write a watchdog. Ideally it will monitor not only whether each process is alive but also some kind of heartbeat (UDP packets, touching a file, whatever). The watchdog will need to be simple and stable, since it cannot die. If you want to perform further “stuck” diagnostics, the best way is with a watchdog helper script. Given a decent dependency management mechanism, the watchdog will be able to recover from the problems the helpers detect, without getting stuck itself if a helper script fails.

Executive summary: Watchdog, host of watchdog helpers.
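The watchdog described above can be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions, not a production design: the names `Watched`, `check_once` and `run_helper` are hypothetical, the heartbeat is assumed to be a file the worker touches periodically, and a helper script is assumed to signal "stuck" with a non-zero exit status.

```python
import os
import subprocess
import time


class Watched:
    """One supervised process plus the heartbeat file it should touch."""

    def __init__(self, name, argv, heartbeat_path, timeout=10.0):
        self.name = name
        self.argv = argv
        self.heartbeat_path = heartbeat_path
        self.timeout = timeout
        self.process = None
        self.started_at = 0.0
        self.restarts = 0

    def start(self):
        self.process = subprocess.Popen(self.argv)
        self.started_at = time.time()

    def stop(self):
        if self.process is not None:
            if self.process.poll() is None:
                self.process.kill()
            self.process.wait()  # reap the child so it does not linger

    def is_alive(self):
        return self.process is not None and self.process.poll() is None

    def is_stuck(self, now=None):
        now = time.time() if now is None else now
        try:
            last_beat = os.path.getmtime(self.heartbeat_path)
        except OSError:
            last_beat = 0.0  # no heartbeat file yet
        # A fresh start counts as a heartbeat, giving new processes
        # a grace period of one timeout.
        return now - max(last_beat, self.started_at) > self.timeout


def check_once(watched, now=None):
    """Restart every dead or stuck process; return the names restarted."""
    restarted = []
    for item in watched:
        if item.is_alive() and not item.is_stuck(now):
            continue
        item.stop()
        item.start()
        item.restarts += 1
        restarted.append(item.name)
    return restarted


def run_helper(argv, timeout=5.0):
    """Run a diagnostic helper script with a hard time limit.

    Returns True if the helper reports "stuck" (non-zero exit, an assumed
    convention), False if it reports healthy, and None if the helper
    itself failed to start or timed out.
    """
    try:
        result = subprocess.run(argv, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.returncode != 0
```

The main loop would call each item's `start()` once and then run `check_once(watched)` every second or so. Because `run_helper` puts a time limit on the helper and treats a broken helper as "no verdict", a misbehaving helper script cannot take the watchdog down with it.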

Now, if the appliance is at all successful, people will want to script against it. It will have to expose some kind of API: a network API. The only protocol that reliably works in corporate networks is HTTP, which is why it’s the best bet. JSON-RPC, XML-RPC or REST: choose whatever works for you, but make sure it is fun to work with this API. There is a good rule of thumb for knowing when it’s fun: in some of your own processes, despite having the DB and shared-bus options, you prefer working through the external API instead.

Executive summary: Web-service server.
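As one possible shape for such an API, here is a small JSON-RPC 2.0 endpoint over HTTP built on the standard library's `http.server`. It is a sketch only: the registry `METHODS`, the decorator `rpc_method`, the example methods `get_status` and `add`, and the port are all made up for illustration, and batch requests and notifications are left out.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

METHODS = {}  # hypothetical registry: method name -> callable


def rpc_method(func):
    """Expose a function through the API under its own name."""
    METHODS[func.__name__] = func
    return func


@rpc_method
def get_status():
    return {"state": "running"}


@rpc_method
def add(a, b):
    return a + b


def error(request_id, code, message):
    return {"jsonrpc": "2.0",
            "error": {"code": code, "message": message},
            "id": request_id}


def handle_jsonrpc(body):
    """Turn one JSON-RPC 2.0 request body (bytes) into a response dict."""
    try:
        request = json.loads(body)
    except ValueError:
        return error(None, -32700, "Parse error")
    if not isinstance(request, dict) or not isinstance(
            request.get("method"), str):
        return error(None, -32600, "Invalid Request")
    request_id = request.get("id")
    func = METHODS.get(request["method"])
    if func is None:
        return error(request_id, -32601, "Method not found")
    params = request.get("params", [])
    try:
        if isinstance(params, dict):
            result = func(**params)
        else:
            result = func(*params)
    except TypeError:
        # Simplification: a TypeError raised inside the method is also
        # reported as bad parameters.
        return error(request_id, -32602, "Invalid params")
    return {"jsonrpc": "2.0", "result": result, "id": request_id}


class RpcHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        response = handle_jsonrpc(self.rfile.read(length))
        payload = json.dumps(response).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Serve the API on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RpcHandler).serve_forever()
```

Calling `serve()` starts the endpoint; a client then POSTs a body such as `{"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1}`. Keeping `handle_jsonrpc` separate from the HTTP handler means the appliance's own processes can be pointed at the same functions, which is a cheap way to find out whether the API is actually pleasant to use.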

A word about selecting technologies: the DB is best used off-the-shelf (whether relational or not, there are many existing solutions). The shared state bus is a classic for Twisted — multiple clients, transparently serializing to “one update at a time” is the most natural thing to do with a Twisted server. Use protocol buffers (or something similar) to implement the actual transport. The watchdog is another classic for Twisted, with its process-management event-oriented APIs just ready to manage multiple processes at once. The web-service server is another classic for Twisted, with the caveat that long-lived connections will not work.

