Shipping Python Applications in Docker

March 20, 2017

Posted on my new blog as an experiment!

On Raising Exceptions in Python

December 10, 2016

There is a lot of Python code in the wild which does something like:

raise SomeException("Could not fraz the buzz:"
                    "{} is less than {}".format(foo, quux)

This is, in general, a bad idea. Exceptions are not program panics. While they sometimes do terminate the program, or the execution thread with a traceback, they are different in that they can be caught. The code that catches the exception will sometimes have a way to recover: for example, maybe it’s not that important for the application to fraz the buzz if foo is 0. In that case, the code would look like:

except SomeException as e:
    if ???

Oh, right. We do not have direct access to foo. If we formatted better, using repr, at least we could tell the difference between 0 and “0”: but we still would have to start parsing the representation out of the string.

Because of this, in general, it is better to raise exceptions like this:

raise SomeException("Could not fraz the buzz: foo is too small", foo, quux)

This way exception handling has a lot of power: it can introspect foo, introspect quux and introspect the string. If by some reason the exception class is raised and we want to verify the reason, checking string equality, while not ideal, is still better than trying to match string parts or regular expression matching.

When the exception is presented to the user interface, in that case, it will not look as nice. Exceptions, in general, should reach the UI only in extreme circumstances: and in those cases, having something that has as much information is useful for root cause analysis.

Moshe’z Messaging Preferences

December 8, 2016

The assumption here is that you have my phone number. If you don’t have my phone number, and you think that’s an oversight on my part, please send me an e-mail at and ask for it. If you don’t have my phone number because I don’t know you, I am usually pretty responsive on e-mail.

In order of preference:

Don’t Mock the UNIX Filesystem

December 2, 2016

When writing unit tests, it is good to call functions with “mocks” or “fakes” — objects with equivalent interface but a simple, “fake” implementation. For example, instead of a real socket object, something that has recv() but returns “hello” the first time, and an empty string the second time. This is great! Instead of testing the vagaries of the other side of a socket connection, you can focus on testing your code — and force your code to handle corner cases, like recv() returning partial messages, that happen rarely on the same host (but not so rarely in more complex network environments).

There is one OS interface which it is wise not to mock — the venerable UNIX file system. Mocking the file system is the classic case of low-ROI effort:

  • It is easy to isolate: if functions get a parameter of “which directory to work inside”, tests can use a per-suite temporary directory. Directories are cheap to create and destroy.
  • It is reliable: the file system rarely fails — and if it does, your code is likely to get weird crashes anyway.
  • The surface area is enormous: open(), but also, os.mkdir, os.rename, os.mknod, os.rename, shutil.copytree and others, plus modules calling out to C functions which call out to C’s fopen().

The first two items decrease the Return, since mocking the file system does not make the tests easier to write or the test run more reproducible, while the last one increases the Investment.

Do not mock the file system, or it will mock you back.

Belt & Suspenders: Why put a .pex file inside a Docker container?

November 19, 2016

Recently I have been talking about deploying Python, and some people had the reasonable question: if a .pex file is used for isolating dependencies, and a Docker container is used for isolating dependencies, why use both? Isn’t it redundant?

Why use containers?

I really like glyph’s explanation for containers: they isolate not just the filesystem stack but the processes and the network, giving a lot of the power that UNIX was supposed to give but missed out on. Containers isolate the file system, making it easier for code to write/read files from known locations. For example, its log files will be carefully segregated, and can be moved to arbitrary places by the operator without touching the code.

The other part is that none of the reasonable options packages Python and this means that a pex file would still have to be tested with multiple Pythons, and perhaps do some checking at start-up that it is using the right interpreter. If PyPy is the right choice, it is the choice the operator would have to make and implement.

Why use pex?

Containers are an easy sell. They are right on the hype train. But if we use containers, what use is pex?

In order to explain, it is worthwhile comparing a correctly built runtime container that is not using pex, with one that is: (parts that are not relevant have been removed)

ADD wheelhouse /wheelhouse
RUN . /appenv/bin/activate; \
    pip install --no-index -f wheelhouse DeployMe
COPY twist.pex /

Note that in the first option, we are left with extra gunk in the /wheelhouse directory. Note also that we still have to have pip and virtualenv installed in the runtime container. Pex files bring the double-dutch philosophy to its logical conclusion: do even more of the build on the builder side, do even less of it on the runtime side.

Deploying with Twisted: Saying Hello

November 13, 2016

Too Long: Didn’t Read

The build script builds a Docker container, moshez/sayhello:.

$ ./build MY_VERSION
$ docker run --rm -it --publish 8080:8080 \
  moshez/sayhello:MY_VERSION --port 8080

There will be a simple application running on port 8080.

If you own the domain name, you can point it at a machine that the domain resolves to and then run:

$ docker run --rm -it --publish 443:443 \
  moshez/sayhello:MY_VERSION --port le:/srv/www/certs:tcp:443 \
  --empty-file /srv/www/certs/

It will result in the same application running on a secure web site:


All source code is available on GitHub.


WSGI has been a successful standard. Very successful. It allows people to write Python applications using many frameworks (Django, Pyramid, Flask and Bottle, to name but a few) and deploy using many different servers (uwsgi, gunicorn and Apache).

Twisted makes a good WSGI container. Like Gunicorn, it is pure Python, simplifying deployment. Like Apache, it sports a production-grade web server that does not need a front end.

Modern web applications tend to be complex beasts. In order to be trusted by users, they need to have TLS support, signed by a trusted CA. They also need to transmit a lot of static resources — images, CSS and JavaScript files, even if all HTML is dynamically generated. Deploying them often requires complicated set-ups.


Container images allow us to package an application with all of its dependencies. They often cause a temptation to use those as the configuration management. However, Dockerfile is a challenging language to write big parts of the application in. People writing WSGI applications probably think Python is a good programming language. The more of the application logic is in Python, the easier it is for a WSGI-based team to master it.


Pex is a way to package several Python “distributions” (sometimes informally called “Packages”, the things that are hosted by PyPI) into one file, optionally with an entry-point so that running the file will call a pre-defined function. It can take an explicit list of wheels but can also, as in our example here, take arguments compatible with the ones pip takes. The best practice is to give it a list of wheels, and build the wheels with pip wheel.


The pkg_resources module allows access to files packaged in a distribution in a way that is agnostic to how the distribution was deployed. Specifically, it is possible to install a distribution as a zipped directory, instead of unpacking it into site-packages. The code:pex format relies on this feature of Python, so adherence to using pkg_resources to access data files is important in order to not break code:pex compatibility.

Let’s Encrypt

Let’s Encrypt is a free, automated, and open Certificate Authority. It has invented the ACME protocol in order to make getting secure certificates a simple operation. txacme is an implementation of an ACME client, i.e., something that asks for certificates, for Twisted applications. It uses the server endpoint plugin mechanism in order to allow any application that builds a listening endpoint to support ACME.


The twist command-line tools allows running any Twisted service plugin. Service plugins allow us to configure a service using Python, a pretty nifty language, while still allowing specific customizations at the point of use via command line parameters.

Putting it all together

Our files defines a distribution called sayhello. In it, we have three parts:

  • src/sayhello/ A simple Flask-based WSGI application
  • src/sayhello/data/index.html: an HTML file meant to serve as the root
  • src/twisted/plugins/ A Twist plugin

There is also some build infrastructure:

  • build is a Python script to run the build.
  • build.docker is a Dockerfile designed to build pex files, but not run as a production server.
  • run.docker is a Dockerfile designed for production container.

Note that build does not push the resulting container to DockerHub.


Glyph Lefkowitz has inspired me in his blog about how to build efficient containers. He has also spoken about how deploying applications should be no more than one file copy.

Tristan Seligmann has written txacme.

Amber “Hawkowl” Brown has written “twist”, which is much better at running Twisted-based services than the older “twistd”.

Of course, all mistakes and problems here are completely my responsibility.

Twisted as Your WSGI Container

October 29, 2016


WSGI is a great standard. It has been amazingly successful. In order to describe how successful it is, let me describe life before WSGI. In the beginning, CGI existed. CGI was just a standard for how a web server can run a process — what environment variables to pass, and so forth. In order to write a web-based application, people would write programs that complied with CGI. At that time, Apache’s only competition was commercial web servers, and CGI allowed you to write applications that ran on both. However, starting a process for each request was slow and wasteful.

For Python applications, people wrote mod_python for Apache. It allowed people to write Python programs that ran inside the Apache process, and directly used Apache’s API to access the HTTP request details. Since Apache was the only server that mattered, that was fine. However, as more servers arrived, a standard was needed. mod_wsgi was originally a way to run the same Django application on many servers. However, as a side effect, it also allowed the second wave of Python web application frameworks –Paste, Flask and more — to have something to run on. In order to make life easier, Python included wsgiref, a module that implemented a single-thread single-process blocking web server with the WSGI protocol.


Some web frameworks come with their own development web servers that will run their WSGI apps. Some use wsgiref. Almost always those options are carefully documented as “just for development use, do not use in production.” Wouldn’t it be nice to use the same WSGI container in both development and production, eliminating one potential source of reproduction bugs?

For ease of use, it should probably be written in Python. Luckily, “twist web –wsgi” is just such a server. In order to show-case how easy it is to use it, twist-wsgi shows commands to run Django, Flask, Pyramid and Bottle apps as easy as it is to run frameworks’ built-in web server.


In production, using the Twisted WSGI containers come with several advantages. Production-grade SSL support using PyOpenssl and cryptography allows elimination of “SSL terminators”, removing one moving piece from the equation. With third-party extensions like txsni and txacme, it allows modern support for “easy SSL”. The built-in HTTP/2 support, starting with Twisted 16.3, allows better support for parallel requests from modern browsers.

The Twisted web server also has a built-in static file server, allowing the elimination of a “front-end” web server that deals with static files by itself, and passing dynamic requests to the application server.

Twisted is also not limited to web serving. As a full-stack network application, it has support for scheduling repeated tasks, running processes and supporting other protocols (for example, a side-channel for online control). Last but not least, in order to integrate that, the language used is Python. As an example for an integrated solution, the Frankenstenian monster plugin show-cases a combo web application combining 4 frameworks, a static file server and a scheduled task updating a file.

While the goal is not to encourage using four web frameworks and a couple of side services in order to greet the user and tell them what time it is, it is nice that if the need strikes this can all be integrated into one process in one language, without the need to remember how to spell “every 4 seconds” in cron or how to quote a string in the nginx configuration file.