In a conversation recently with a friend, we agreed that “something the instructions tell you to do ‘sudo pip install’…which is good, because then you know to ignore them”.
There is never a need for “sudo pip install”, and doing it is an anti-pattern. Instead, all installation of packages should go into a virtualenv. The only exception is, of course, virtualenv (and arguably, pip and wheel). I got enough questions about this that I wanted to write up an explanation about the how, why and why the counter-arguments are wrong.
What is virtualenv?
The documentation says:
virtualenv is a tool to create isolated Python environments.
The basic problem being addressed is one of dependencies and versions, and indirectly permissions. Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications? If you install everything into
/usr/lib/python2.7/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.
Or more generally, what if you want to install an application and leave it be? If an application works, any change in its libraries or the versions of those libraries can break the application.
The tl:dr; is:
- virtualenv allows not needing administrator privileges
- virtualenv allows installing different versions of the same library
- virtualenv allows installing an application and never accidentally updating a dependency
The first problem is the one the “sudo” comment addresses — but the real issues stem from the second and third: not using a virtual environment leads to the potential of conflicts and dependency hell.
How to use virtualenv?
Creating a virtual environment is easy:
$ virtualenv dirname
will create the directory, if it does not exist, and then create a virtual environment in it. It is possible to use it either activated or unactivated. Activating a virtual environment is done by
$ . dirname/bin/activate
this will make
python, as well as any script installed using setuptools’ “console_scripts” option in the virtual environment, on the command-execution path. The most important of those is pip, and so using pip will install into the virtual environment.
It is also possible to use a virtual environment without activating it, by directly calling
dirname/bin/python or any other console script. Again, pip is an example of those, and used for installing into the virtual environment.
Installing tools for “general use”
I have seen a couple of times the argument that when installing tools for general use it makes sense to install them into the system install. I do not think that this is a reasonable exception for two reasons:
- It still forces to use root to install/upgrade those tools
- It still runs into the dependency/conflict hell problems
There are a few good alternatives for this:
- Create a (handful of) virtual environments, and add them to users’ path.
- Use “pex” to install Python tools in a way that isolates them even further from system dependencies.
People often use Python for exploratory programming. That’s great! Note that since pip 7, pip is building and caching wheels by default. This means that creating virtual environments is even cheaper: tearing down an environment and building a new one will not require recompilation. Because of that, it is easy to treat virtual environments as disposable except for configuration: activate a virtual environment, explore — and whenever needing to move things into production, ‘pip freeze’ will allow easy recreation of the environment.