Extension API: An exercise in a negative case study

I was idly contemplating implementing a new Jupyter kernel. Luckily, they try to provide facility to make it easier. Unfortunately, they made a number of suboptimal choices in their API. Fortunately, those mistakes are both common and easily avoidable.

Subclassing as API

They suggest subclassing IPython.kernel.zmq.kernelbase.Kernel. Errr…not “suggest”. It is a “required step”. The reason is probably that this class already implements 21 methods. When you subclass, make sure to not use any of these names, or things will break randomly. If you do not want to subclass, good luck figuring out what the assumption that the system makes about these 21 methods because there is no interface or even prose documentation.

The return statement in their example is particularly illuminating:

        return {'status': 'ok',
                # The base class increments the execution count
                'execution_count': self.execution_count,
                'payload': [],
                'user_expressions': {},
               }

Note the comment “base class increments the execution count”. This is a classic code smell: this seems like this would be needed in every single overrider, which means it really belongs in the helper class, not in every kernel.

None

The signature for the example do_execute is:

    def do_execute(self, code, silent, store_history=True, 
                   user_expressions=None,
                   allow_stdin=False):

Of course, this means that user_expressions will sometimes be a dictionary and sometimes None. It is likely that the code will be written to anticipate one or the other, and will fail in interesting ways if None is actually sent.

Optional Overrides

As described in this section there are also ways to make the kernel better with optional overrides. The convention used, which is nowhere explained, is that do_ methods mean you should override to make a better kernel. Nowhere it is explained why there is no default history implementation, or where to get one, or why a simple stupid implementation is wrong.

Dictionaries

 

All overrides return dictionaries, which get serialized directly into the underlying communication platform. This is a poor abstraction, especially when the documentation is direct links to the underlying protocol. When wrapping a protocol, it is much nicer to use an Interface as the documentation of what is assumed — and define an attr.s-based class to allow returning something which is automatically the correct type, and will fail in nice ways if a parameter is forgotten.

Summary

If you are providing an API, here are a few positive lessons based on the issues above:

  • You should expect interfaces, not subclasses. Use composition, not subclassing.If you want to provide a default implementation in composition, just check for a return of NotImplemeted(), and use the default.
  • Do the work of abstracting your customers from the need to use dictionaries and unwrap automatically. Use attr.s to avoid customer boilerplate.
  • Send all arguments. Isolate your customers from the need to come up with sane defaults.
  • As much as possible, try to have your interfaces be side-effect free. Instead of asking the customer to directly make a change, allow the customer to make the “needed change” be part of the return type. This will let the customers test their class much more easily.
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: