A Transport-agnostic Process-agnostic System in Python

Abstract

Different communication mechanisms have different trade-offs: XML-RPC is ubiquitous, JSON-RPC is almost as popular but more space- and time-efficient, AMQP allows efficient routing, and so on. It is therefore advantageous to write application code which is completely transport-agnostic. Likewise, deciding how to split components into processes is done more efficiently at the tail end of a project, when it is known which processes are more reliable or need to communicate more. Described here is the system we use to keep application code unaware of those distinctions, pushing those decisions to deployment or installation time.

Introduction

When starting to implement a complicated system in Python, there are many unknowns: the amount of communication between processes, the typical process bugs, and the deployment targets. It is highly useful to be able to write the business-logic code in a way that is agnostic to the answers to those questions. This also allows the communication layer to be written as a quick prototype and optimized later. Since infrastructure costs are frequently minimized during the demo phase, this approach allows quick demos during development without any cross-team code rewrite. The trade-off is that the feature set assumed from the transport has to be a least-common-denominator subset of all “possible” feature sets. This, in turn, pushes some of the work that could be done by a more sophisticated transport into the Python-level infrastructure and sometimes, in the worst cases, into application code. The most egregious example is that any system that supports an HTTP-based RPC mechanism (XML-RPC, JSON-RPC or SOAP) must not assume any kind of asynchronous communication mechanism.

Interface Definition Language

Many systems do not have an IDL of their own, or have one so weak (as in the case of XML-RPC introspection) as to be worthless. Interface definition languages give us two benefits: they reduce the amount of trust needed between processes, since processes can only send valid data to each other, and they let us declare explicitly which methods are supported by a given component. Our interface definition language takes advantage of Python metaclasses to look like::

 class ThingyData(Data):
     """
     Data which identifies a target
     """
     uuid=lambda:str
     metadata=lambda:dict

 class ThingiesOfType(Data):
     """
     Collection of targets of the same type
     """
     type=lambda:str
     thingies=lambda:array(ThingyData)

 class SearchInterface(Interface):
     """
     Interface to search functionality.
     """
     name='com.vmware.example.version1.Search'
     @public
     def RunQuery(query=str, maxResults=int):
         """run a query

         @param query: a query to run

         @param maxResults: maximum number of results to return

         @param results: query results
         """
         return dict(results=array(ThingiesOfType))
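
The article does not show the Interface metaclass itself; the following is a minimal sketch, in modern Python syntax with invented names, of how the ``@public`` decorator and the metaclass might harvest declared methods into the ``methods`` mapping that later code relies on:

```python
def public(func):
    # Mark a method as part of the public interface.
    func.isPublic = True
    return func

class InterfaceMeta(type):
    def __new__(mcs, clsname, bases, namespace):
        # Collect every @public method declared in the class body.
        methods = {}
        for key, value in list(namespace.items()):
            if callable(value) and getattr(value, 'isPublic', False):
                methods[key] = value
        namespace = dict(namespace)
        namespace['methods'] = methods
        return type.__new__(mcs, clsname, bases, namespace)

class Interface(metaclass=InterfaceMeta):
    pass

class SearchInterface(Interface):
    name = 'com.vmware.example.version1.Search'
    @public
    def RunQuery(query=str, maxResults=int):
        return dict(results=list)
```

With this in place, ``SearchInterface.methods`` maps ``'RunQuery'`` to its declaration, which is what the routing and forwarding code below iterates over.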

Since we cannot depend on the underlying transport to contain objects or references, we assume all communications happen in terms of “simple” (nested) data types: ints, strings, floats, uniformly typed lists, string to string dictionaries and string to vartype dictionaries. In order to do so in a dependable way, without application code having to translate manually, we have the Implementation metaclass::

 from twisted.internet import defer, reactor

 class SearchImplementation(object):
     __metaclass__ = Implementation
     interface = SearchInterface
     def RunQuery(self, query, maxResults):
         # Build the result structure and fire the deferred later,
         # simulating an asynchronous backend.
         d = defer.Deferred()
         myThingies = Structure('ThingiesOfType')
         myThingies.type = 'Something'
         myThingies.thingies = []
         reactor.callLater(5, d.callback, [myThingies])
         return d

It is possible to return either a deferred or an immediate value from a method in an implementation. Note that there is no need for type checking inside application code: bad types trigger an exception early, in the calling process.
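
The early type check described above could be implemented by having the Implementation metaclass wrap each method; this is a sketch under that assumption, with invented helper names, not the article's actual code:

```python
def typeChecked(func, argTypes):
    # Wrap func so a TypeError is raised in the *calling* process
    # when an argument does not match its declared type.
    def wrapper(self, **kwargs):
        for argName, expected in argTypes.items():
            if not isinstance(kwargs[argName], expected):
                raise TypeError('%s: expected %s, got %r'
                                % (argName, expected.__name__, kwargs[argName]))
        return func(self, **kwargs)
    return wrapper

class SearchImplementation(object):
    def _runQuery(self, query, maxResults):
        return dict(results=[])
    # The metaclass would derive argTypes from the interface declaration;
    # here they are spelled out by hand.
    RunQuery = typeChecked(_runQuery, dict(query=str, maxResults=int))
```

A call such as ``impl.RunQuery(query=123, maxResults=3)`` fails before any bytes cross the process boundary.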

Clients

Communication needs two sides. So far we have described how to write a server. We want client code to be equally transport-agnostic. In order to do that, we have a dynamic client object.

 def _simplifyObject(obj):
     if isinstance(obj, (int, float, str, bool)):
         return obj
     elif isinstance(obj, list):
         return [_simplifyObject(x) for x in obj]
     elif isinstance(obj, dict):
         # Simplify values recursively, so nested structures survive.
         return dict([(key, _simplifyObject(value))
                      for key, value in obj.iteritems()])
     elif isinstance(obj, Structure):
         return dict([(key, _simplifyObject(value))
                      for key, value in obj])
     else:
         raise ValueError("cannot simplify", obj)

 class MultiClient(object):
    def __init__(self):
        self.ifaces = set()
        self.routing = {}
    def add(self, iface, provider):
        self.ifaces.add(iface)
        for method in iface.methods:
            self.routing[iface.name+"."+method] = provider
    def callRemote(self, name, *args):
        args = map(_simplifyObject, args)
        provider = self.routing[name]
        return provider.callRemote(name, *args)

This means we can drop in any objects that support callRemote, declare which interfaces they provide, and the client code does all the multiplexing. From inside the application code, we need not know which provider implements which interface; all we need is a callRemote.
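
A usage sketch, with a stub interface and a stub provider standing in for a real XML-RPC proxy (MultiClient is repeated from above, minus argument simplification, so the example is self-contained):

```python
class MultiClient(object):
    def __init__(self):
        self.ifaces = set()
        self.routing = {}
    def add(self, iface, provider):
        self.ifaces.add(iface)
        for method in iface.methods:
            self.routing[iface.name + "." + method] = provider
    def callRemote(self, name, *args):
        return self.routing[name].callRemote(name, *args)

class FakeSearchInterface(object):
    # Stand-in for an Interface class; only the attributes
    # MultiClient.add uses are provided.
    name = 'com.vmware.example.version1.Search'
    methods = ['RunQuery']

class EchoProvider(object):
    # Stand-in for a transport proxy: echoes the call back.
    def callRemote(self, name, *args):
        return (name, args)

client = MultiClient()
client.add(FakeSearchInterface, EchoProvider())
result = client.callRemote('com.vmware.example.version1.Search.RunQuery', 'q', 5)
```

Swapping EchoProvider for an XML-RPC, JSON-RPC or AMQP proxy changes nothing on the calling side.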

Transport and Processes

The server code is the non-transport-agnostic part. It takes the fully-qualified name of a function, runs that function, and treats the result as a (possibly nested) list of implementations that should run in the given process. For the XML-RPC transport, it also expects a command-line list of which ports are running which interfaces. It creates XML-RPC proxies for those ports and drops them into the singleton client.

Running all servers manually means a lot of manual stitching-up and repetition. To save that repetition, we have a so-called multiserver, built on the Twisted process-monitoring module (which takes care of starting and restarting processes). The multiserver gets a list of processes, specified in a JSON file. It then calculates what the client list should be, and runs all processes with the same client list, using the same server. We plan to add support for JSON-RPC and HTTPS as command-line switches to the multiserver, which would then run all processes with the appropriate assumptions. Note, however, that a smarter multiserver could decide that some processes should use JSON-RPC and some XML-RPC, and run each appropriately. Also note that currently the multiserver supports only one component per process, even though its underlying server is capable of more.

We also plan to have the multiserver set a special environment variable, “SECRET_KEY”. This key would be a randomized string: all clients would send it, and all servers would require it before processing input. That way, a malicious process running with different privileges on the same machine would not be able to get into the system.
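
The article does not specify the schema of that JSON file; a purely hypothetical example, with invented field names, might look like:

```json
{
    "processes": [
        {"name": "search", "function": "example.search.makeComponents", "port": 8081},
        {"name": "indexer", "function": "example.indexer.makeComponents", "port": 8082}
    ]
}
```

Each entry names a factory function whose result is the list of implementations to run in that process, plus whatever transport details (here, a port) the chosen transport needs.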

Unified Front

We want to have a single port on which all public parts of the interface are exposed. We can do this in a way that is transport agnostic, even if it is not agnostic to some of the implementation details of Interface or the singleton client.

 def makeForwardingFunction(name, outargs):
     def retFunc(self, *args):
         remoteReturn = singletonClient.callRemote(name, *args)
         def cb(remoteReturnValue):
             return dict([(aname, argumentParse(tp, remoteReturnValue[aname]))
                              for aname, tp in outargs.iteritems()])
         remoteReturn.addCallback(cb)
         return remoteReturn
     return retFunc

 def makeForwarder(iface):
     d = dict(interface=iface)
     for name, func in iface.methods.iteritems():
         if func.isPublic:
             d[name] = makeForwardingFunction(iface.name+'.'+name, func.outargs)
     return Implementation('Forwarder', (), d)()

 def unifiedFront():
     for iface in singletonClient.ifaces:
         yield makeForwarder(iface)

Note that we could have set our multiserver to serve the unified front using one protocol (e.g., XML-RPC over HTTP) while running all clients using another one (e.g., AMQP). While this requires some complicated code in the multiserver, it does not require *any* kind of application code support.

Conclusion

With the right kind of separation between application code and transport code, it is possible to support complicated interaction topologies while completely hiding them from the application code. Python's metaclasses, duck typing and high level of dynamism make writing this kind of code fairly easy.
