Create your own Python

Model-driven development is becoming more popular, even in the automotive industry. Martin Karlsch, who is graduating with a Master of Science in Software Engineering from the Hasso-Plattner-Institute in Potsdam, has recently published his master's thesis on a model-driven framework for domain-specific languages (DSLs). He demonstrates it with a Python-based test automation language that he created using frodo, his prototype implementation of such a framework.

DSLs are designed for implementing special-purpose software. Unlike general-purpose languages such as Python, DSLs are strictly focused on particular and often very complex areas of application, such as pharmacy, genetics, aerospace, or automotive.

Let's have a look at the latter. The complexity of software systems in modern automobiles is increasing significantly, and not only in those of German brands such as BMW, Mercedes or Audi. Most innovations in cars are software-related. These software components often must fulfill many requirements (e.g. safety regulations, real-time processing) and will be integrated with plenty of other components in a broad variety of vehicle models. I doubt there is a driver who has not heard of at least one mass recall caused by severe software-related defects in the past few years. So manufacturers have started to address the challenges that come with this development and are looking for ways to strongly improve testing and software quality management in general. For example, for test scenarios that cover the integration of software components with other components in a prototype car, engineers need a domain-specific language in which they can easily specify the automated tests they want to run, and probably also analyze the mass of information in the test reports. It is in the nature of DSLs that their user bases are rather small. Moreover, it is not unlikely that a company facing such complex and cost-intensive challenges will find that it needs to develop its own DSL.

Wait! ... What?

Anyone who has ever developed a programming language (I have never programmed one myself, but I nevertheless felt free to make sassy design decisions) will know that this is anything but an easy task.

As I understand Martin Karlsch's master's thesis, he has managed to implement a prototype of this framework. It builds on the existing Python grammar and constructs, which domain experts and software engineers can extend, reduce or modify so that the language is tailored for use in the target domain. These specifications are defined in domain meta models, syntax views with textual representations, and semantic mappings. As a result of the process, the domain experts end up with a special-purpose Python language. Python gets children!

I like it. I like the idea of combining Python's clean elegance and dynamic OO strengths with domain-specific elements. Martin's framework is capable of bringing the delicious Python flavor to many more gourmets, very specialized gourmets indeed. And I wouldn't be surprised if, in the not-too-distant future, this model-driven approach became a popular alternative and add-on to the rich base of Python libraries and site-packages.



pyswarm is a Free Software tool for the model-driven development of Python applications with PostgreSQL databases. Future releases should support a wide range of UML tools. Since I think that UML is already well known in today's software industry, I want to provide an overview of the Model-Driven Architecture (MDA) and some related standards of the Object Management Group (OMG), and of how they may play a role with Python and with pyswarm in particular.

pyswarm SDK 0.7.1 is still a prototype. Currently the SDK supports UML 2.0 models stored in (MagicDraw) XMI 2.1 files, which are parsed and then directly transformed into custom target application code. For the final 1.0 release, several major improvements have been suggested. Some of them are tightly coupled with MDA requirements or recommendations, such as model-to-model transformation.

OMG's Four Model Layers

A good starting point is OMG's view of the modeling domain. This view covers four modeling layers. M3 is the layer of meta-meta modeling. It does not only sound very abstract, it is the most abstract layer here. OMG's key standard in the M3 layer is the Meta Object Facility (MOF). MOF is used to describe meta models such as the Unified Modeling Language (UML), and even MOF itself. UML resides in the M2 layer (meta modeling) and is also an OMG standard. A model that is described in UML resides in the M1 layer. You can see two examples of such M1 layer models in the pyswarm documentation.

The fourth layer is M0, the most concrete one. The OMG understands M0 as the data layer. M0 represents concrete instances of elements (e.g. objects of classes or records of database tables) that have been specified in a model in the M1 layer. For pyswarm SDK 0.7.1 this means that the generator reads an M1 layer model (such as the one stored in the PetStore.xml file), transforms it into another M1 layer model (this time specific to the pyswarm architecture) and generates Python and SQL code, which at run time will work with concrete M0 layer data.
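Python's own object model offers a loose analogy to these four layers. The following is only an illustrative sketch, not how OMG defines the layers: the class `Pet` is a made-up example, and equating Python's built-in `type` with M2 and M3 is my own simplification.

```python
class Pet:
    """M1: a model element, here a class that describes pets."""

    def __init__(self, name):
        self.name = name


# M0: a concrete instance (the data layer).
rex = Pet("Rex")

# Each layer is described by the layer above it.
layers = [
    ("M0", rex),
    ("M1", type(rex)),              # Pet
    ("M2", type(type(rex))),        # type, Python's built-in metaclass
    ("M3", type(type(type(rex)))),  # type again: it describes itself
]

for label, element in layers:
    print(label, element)

# Similar to MOF, the top of the stack is self-describing.
assert type(type) is type
```

The analogy breaks down in the details (UML and MOF are distinct specifications, whereas Python uses `type` for both roles), but it shows the idea that every element is an instance of something one layer up.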

Model-Driven Architecture

Furthermore there is OMG's Model-Driven Architecture (MDA), an approach to model-driven software development that relies heavily on other OMG standards. Essentially, MDA is about creating software by transforming one (UML) model into another.

To do this, MDA suggests that a Computation Independent Model (CIM) captures the domain-relevant knowledge of domain experts, in order to give software experts information about the expected purpose, terms and limitations of a software system. In short, you can consider a CIM as the visualized business model for which a software system is planned. It specifies purely business requirements, with no technical meaning.

MDA also suggests that a CIM is manually transformed into a Platform-Independent Model (PIM). A PIM identifies entities from the CIM and specifies them in more technical terms, e.g. by defining business classes, operations, attributes and packages, but a PIM is still not specific to a particular platform or language. At least theoretically, it should be possible to reuse a PIM for any other platform. Usually only some UML extensions have to be changed, e.g. the stereotypes that are applied to elements in the PIM.

The PIM is then transformed into a Platform-Specific Model (PSM), which should usually happen automatically. You can expect the PSM to be more complex than the PIM, since it represents how the PIM is adapted to a particular platform. You can check the examples in the pyswarm SDK introduction to see the differences between the PIM and the resulting PSM.

Some MDA tools provide the resulting PSM and may even allow the user to change it before it is transformed into the so-called Implementation-Specific Model (ISM). The ISM is the implementation resulting from the PSM, that is, the generated code (in the case of pyswarm, Python and SQL), documentation and other directories and files.

In order to improve tool usability MDA also suggests the use of Transformation Record Models (TRM) created during a model-to-model transformation. A TRM would show the mapping between elements of the source model and elements of the target model.
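The chain from PIM to PSM to generated code, together with a transformation record, can be sketched in a few lines of Python. This is a toy illustration under my own assumptions and not the pyswarm generator: the names `PimClass`, `PsmTable`, `TYPE_MAP`, `pim_to_psm` and `psm_to_sql` are all hypothetical, and the mapping of one class to one PostgreSQL table is deliberately simplistic.

```python
from dataclasses import dataclass, field


@dataclass
class PimClass:
    """Platform-independent business class (hypothetical structure)."""
    name: str
    attributes: dict  # attribute name -> abstract type, e.g. "String"


@dataclass
class PsmTable:
    """Platform-specific element: a PostgreSQL table (hypothetical)."""
    name: str
    columns: dict = field(default_factory=dict)  # column name -> SQL type


# Assumed mapping from abstract types to PostgreSQL column types.
TYPE_MAP = {"String": "text", "Integer": "integer", "Boolean": "boolean"}


def pim_to_psm(pim_classes):
    """Transform PIM classes into PSM tables and record the mapping."""
    psm, record = [], []
    for cls in pim_classes:
        table = PsmTable(name=cls.name.lower())
        # Platform-specific addition that has no counterpart in the PIM.
        table.columns["id"] = "serial PRIMARY KEY"
        record.append((cls.name, table.name))
        for attr, abstract_type in cls.attributes.items():
            table.columns[attr] = TYPE_MAP[abstract_type]
            record.append((f"{cls.name}.{attr}", f"{table.name}.{attr}"))
        psm.append(table)
    return psm, record


def psm_to_sql(table):
    """Generate the implementation (DDL text) from one PSM table."""
    cols = ",\n".join(f"    {n} {t}" for n, t in table.columns.items())
    return f"CREATE TABLE {table.name} (\n{cols}\n);"


pim = [PimClass("Pet", {"name": "String", "age": "Integer"})]
psm, trace = pim_to_psm(pim)

print(psm_to_sql(psm[0]))
for source, target in trace:
    print(f"{source} -> {target}")
```

Note how the PSM already contains more than the PIM (the `id` column), and how the list of source and target pairs plays the role of a very small transformation record.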


Software engineering is sometimes categorized according to the sequence in which specification and implementation are accomplished during a project. By this criterion, three categories are commonly identified: forward engineering (you have a specification and use it to create an implementation), reverse engineering (you are in the pitiful situation of using an implementation to create its specification), and round-trip engineering for a complete development cycle in both directions. In that sense you can consider MDA a mainly forward engineering technique. MDA doesn't require transformations from an ISM to a PSM or from a PSM to a PIM, nor does it encourage you to perform them. MDA also doesn't require that a PSM is made available to the user. Some MDA tools, though, are capable of showing users the PSM that results from their PIM, and some store the PSM as an XMI file.

More OMG Standards

OMG has been working hard in recent years and has specified several standards that are of interest with regard to MDA. Unfortunately many of them are pretty complex in their own right and have been revised multiple times. Although I was not able to keep track of all of these standards, I want to outline those that could be of interest for the re-engineering of the pyswarm SDK:

As mentioned before, there is the Meta Object Facility (MOF), especially Essential MOF (EMOF) as the core specification of MOF, the XML Metadata Interchange specification (XMI), and the most recent MOF 2.0/XMI 2.1 mapping specification. XMI can be used to serialize MOF-based meta models into an XML-based format, so that these meta models can be exchanged between tools, particularly UML tools.
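To give an impression of what reading such a file involves, here is a minimal sketch that extracts class and attribute names from an XMI-like document with Python's standard `xml.etree.ElementTree`. The XML fragment is hand-written and heavily simplified, and the namespace URIs and element names (`ownedMember`, `ownedAttribute`) are assumptions on my part; real exports differ between tools and versions, which is exactly what makes supporting several UML tools laborious. The function `read_classes` is a hypothetical helper, not part of pyswarm.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for XMI 2.1; check the header of your own export.
XMI_NS = "http://schema.omg.org/spec/XMI/2.1"

# Hand-written, simplified fragment; not the output of any particular tool.
XMI_DOCUMENT = """\
<xmi:XMI xmi:version="2.1"
         xmlns:xmi="http://schema.omg.org/spec/XMI/2.1"
         xmlns:uml="http://schema.omg.org/spec/UML/2.0">
  <uml:Model xmi:id="m1" name="PetStore">
    <ownedMember xmi:type="uml:Class" xmi:id="c1" name="Pet">
      <ownedAttribute xmi:id="a1" name="name"/>
      <ownedAttribute xmi:id="a2" name="age"/>
    </ownedMember>
    <ownedMember xmi:type="uml:Class" xmi:id="c2" name="Owner"/>
  </uml:Model>
</xmi:XMI>
"""

# ElementTree exposes namespaced attributes as "{namespace-uri}localname".
XMI_TYPE = f"{{{XMI_NS}}}type"


def read_classes(xmi_text):
    """Return a dict mapping each UML class name to its attribute names."""
    root = ET.fromstring(xmi_text)
    classes = {}
    for member in root.iter("ownedMember"):
        if member.get(XMI_TYPE) == "uml:Class":
            classes[member.get("name")] = [
                attr.get("name") for attr in member.findall("ownedAttribute")
            ]
    return classes


print(read_classes(XMI_DOCUMENT))
```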

The Object Constraint Language (OCL) is a declarative language that can be used to specify rules that apply to elements in a model. OCL was originally developed by IBM, and OMG adopted it as part of the UML standard. In the current version 2.0, OCL can be used to specify constraints in any MOF-based meta model.
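To make this more tangible: an OCL invariant is a named boolean expression that must hold for every instance of a model element. The sketch below expresses two such invariants as plain Python predicates. It is only an illustration of the idea, not an OCL interpreter; the class `Pet`, the invariant names and the helper `violated` are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Pet:
    name: str
    age: int


# Each entry is (invariant name, predicate over one instance).
# In OCL these would read roughly:
#   context Pet inv nonNegativeAge: self.age >= 0
#   context Pet inv hasName: self.name.size() > 0
PET_INVARIANTS = [
    ("nonNegativeAge", lambda self: self.age >= 0),
    ("hasName", lambda self: len(self.name) > 0),
]


def violated(instance, invariants):
    """Return the names of all invariants the instance does not satisfy."""
    return [name for name, holds in invariants if not holds(instance)]


print(violated(Pet("Rex", 3), PET_INVARIANTS))   # []
print(violated(Pet("", -1), PET_INVARIANTS))     # ['nonNegativeAge', 'hasName']
```

A code generator could emit checks like these from the constraints in a model, so that the rules are enforced at run time.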

After some years of experience with MDA in daily business, some people felt the need to define a standard for model transformations. Thus OMG specified the MOF Query/View/Transformation (QVT) standard. QVT includes OCL with imperative extensions and defines three domain-specific languages for specifying model transformations.


Indeed, partly because of their complexity, the standards outlined above already provide a thriving habitat for highly specialized modeling experts. I understand that there are scenarios in the software industry, in particular in large projects, where this degree of abstraction is crucial and justified. Still, I am not sure whether taking all of these standards into account would really be a viable approach for adding support for other UML tools and XMI formats to the pyswarm SDK, especially since it is not clear whether they would be worth the effort, not only to implement them once, but also to maintain them and keep up with future changes to the standards.

Specifications And White Papers