Slow-Moving Targets

Tuesday, March 01, 2005

Defensive Generalization (Part 3: The Essence of MDA)

A Kubrick Moment

Many things drive us to generalize when we build software systems. We experience that strong desire to solve a problem once and keep it solved. We want to make sure that the experience we won building yesterday's system shows itself in today's and tomorrow's systems. We look forward to new failures (cough) to overcome and new successes to build on top of.

In Part 2, we started a deliberate process of pulling apart the clumps of knowledge we had about how CRM was put together. Forgive me for the review, but I want to spend a bit more time on something I glossed over in the last post.

In setting up a homebrew model-centric code generation system, the first efforts often yield something that is coupled to the structure of the application in question. Just so I actually have something to write about, let's say that our first round of factoring resulted in the following configuration.
  • CRM domain model (as an XML file of some sort)
  • CRM domain-to-relational model transformation
  • CRM domain-to-DAO model transformation
  • CRM domain-to-web-presentation model transformation
  • CRM relational data model (another XML file)
  • CRM Java DAO model (XML)
  • CRM web presentation model (more XML)
  • Code generators for the three CRM technology-specific models
  • The finished CRM application
Sitting back with a mug of coffee in my hand, two thoughts strike me. One, this coffee needs a bit more sugar, and two, I am getting really tired of typing CRM over and over. Being one who is mildly allergic to unnecessary repetition, I find it far more satisfying to parameterize that list above with "CRM" rather than repeat it.
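
As a throwaway illustration of that parameterization, here is a minimal Python sketch. The names `ARTIFACT_TEMPLATES` and `artifacts_for` are my own inventions for this example, not part of any tool.

```python
# Hypothetical illustration: the artifact list from above with the
# application name factored out as a parameter.
ARTIFACT_TEMPLATES = [
    "{app} domain model",
    "{app} domain-to-relational model transformation",
    "{app} domain-to-DAO model transformation",
    "{app} domain-to-web-presentation model transformation",
    "{app} relational data model",
    "{app} Java DAO model",
    "{app} web presentation model",
    "Code generators for the three {app} technology-specific models",
    "The finished {app} application",
]


def artifacts_for(app):
    """Fill the application name into every entry of the list."""
    return [template.format(app=app) for template in ARTIFACT_TEMPLATES]


for line in artifacts_for("CRM"):
    print(line)
```

Calling `artifacts_for("Order Entry")` yields the same nine artifacts for a different application, which is the whole point.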

In the story I told last time, we eventually generalized the infrastructure so that the transformations were expressed at a metamodel level. The crafting of those metamodels is one of the things that I want to talk about in this post.

Metamodels, or modeling languages, are yet another dimension of knowledge factoring. In the same way we build up jargon around something when we know it well enough, we build metamodels to express a stock understanding of the way things in a certain category of experience relate to each other. When I began writing about Data Access Objects and JavaServer Pages using their acronyms, you understood what I was writing about because you knew the expansion of those acronyms, and how those pieces fit together to make a business application.

Just as we build acronyms by using shorthand representations, we can push knowledge into metamodels to create Domain-Specific Languages (DSLs, not to be mistaken for Digital Subscriber Lines). Using DSLs in system construction allows us to speak in shorthand about specific aspects of our system. Assumptions and assertions about the rules of how those aspects relate to each other are built into the DSL. A given DSL is particularly powerful when it allows us to make correct assumptions about the implementation of the concepts expressed with it.

We need to create four modeling languages to continue the generalization of our model-driven framework. Our toolbox now includes metamodels for relational databases, Java business tiers (including DAO), Java web presentation tiers, and a generic component-based application language (our domain model). These metamodels allow us to decouple the transformations and code generators from the information that must change from project to project: the target application (CRM in this case).

In our CRM domain model we can say that Customers have Calls. In the domain model for our order entry system we specify that Orders have OrderLines. In either project, we can use the same transformation to produce a relational model where the two entities in question are transformed into two tables. This transformation definition is now orthogonal to CRM and Order Entry, but specific to the two metamodels the transformation must bridge. In the same way, we write code generation templates to depend on metamodel elements, and hence we generate DDL for Customer and Call, or Order and OrderLine, with equal ease.
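
To make this concrete, here is a minimal Python sketch of one transformation and one code generation template serving both projects. Everything in it is an assumption made for illustration: the two toy metamodels (`Entity`, `Has`, `Table`, `Column`), the function names, and the column types are invented here and are far simpler than what a real tool would use.

```python
from dataclasses import dataclass, field


# --- Domain metamodel (hypothetical and heavily simplified) ---
@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)


@dataclass
class Has:
    """'owner has parts', e.g. Customers have Calls."""
    owner: str
    part: str


@dataclass
class DomainModel:
    entities: list
    associations: list


# --- Relational metamodel (equally simplified) ---
@dataclass
class Column:
    name: str
    sql_type: str = "VARCHAR(255)"


@dataclass
class Table:
    name: str
    columns: list = field(default_factory=list)


def domain_to_relational(model):
    """Model-to-model transformation, written only against the two metamodels."""
    tables = {}
    for entity in model.entities:
        columns = [Column("id", "INTEGER")] + [Column(a) for a in entity.attributes]
        tables[entity.name] = Table(entity.name, columns)
    for has in model.associations:
        # The 'part' side gets a foreign-key column pointing at its owner.
        tables[has.part].columns.append(Column(f"{has.owner.lower()}_id", "INTEGER"))
    return list(tables.values())


def generate_ddl(table):
    """Code generation template, written only against the relational metamodel."""
    # Real DDL would need to quote reserved words such as Order.
    body = ",\n".join(f"  {c.name} {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE {table.name} (\n{body}\n);"


# Two instance-level domain models that share the same machinery.
crm = DomainModel(
    [Entity("Customer", ["name"]), Entity("Call", ["notes"])],
    [Has("Customer", "Call")],
)
order_entry = DomainModel(
    [Entity("Order", ["placed_on"]), Entity("OrderLine", ["product"])],
    [Has("Order", "OrderLine")],
)

for model in (crm, order_entry):
    for table in domain_to_relational(model):
        print(generate_ddl(table))
```

Neither `domain_to_relational` nor `generate_ddl` mentions Customer or Order anywhere; they know only about entities, associations, tables, and columns.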

We've arrived at a new division of artifacts. On one side of this division we have what we can think of as class-level artifacts. Our metamodel of relational data says that columns cannot exist unless they are part of a table. Our templates for generating DAOs call for the creation of a value object. These metamodels and code generation templates are class-level artifacts. Just as Java classes have instances at run-time, our class-level artifacts have instances when we use them to build systems. You can think of our models of CRM and Order Entry as instance-level artifacts. The web presentation model of Order Entry, then, is an instance (or occurrence) of the web presentation metamodel. The CRM JSP code is an instance of the web presentation model-to-code templates. The "regions" separated by these divides are often referred to as metalevels.
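
The class/instance analogy can be taken quite literally. In the Python sketch below, which is only an illustration with invented class names, the two classes play the part of a tiny relational metamodel, including its rule that a column must belong to a table. The objects created from them play the part of instance-level models.

```python
# Class level: a toy relational metamodel (hypothetical names).
class Table:
    """Metamodel element: a table owns its columns."""

    def __init__(self, name):
        self.name = name
        self.columns = []


class Column:
    """Metamodel element: a column cannot exist without an owning table."""

    def __init__(self, table, name):
        if not isinstance(table, Table):
            raise TypeError("a Column must belong to a Table")
        self.table = table
        self.name = name
        table.columns.append(self)


# Instance level: fragments of the CRM and Order Entry relational models.
customer = Table("Customer")
Column(customer, "name")

order = Table("Order")
Column(order, "placed_on")

# The metamodel's rule is enforced for every model built from it.
try:
    Column(None, "orphan")
except TypeError as error:
    print("rejected by the metamodel:", error)
```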

The artifacts on the instance side of this divide can be modified without requiring modification to artifacts on the class side. The dependency only goes one way, from instance to class, from model to metamodel. If your modeling language changes, naturally you must change your model to fit. We expect this, just as we might expect to alter some code when going from one C compiler to another. When changes occur on the class side of this divide, sweeping changes occur on the instance side; and this is precisely the effect we were looking for. When we choose to change our style of persistence from DAOs to Hibernate, we expect the resulting source code to be very different. The same notions of Customer or Order now have a Hibernate application architecture applied to them. We hope to see an improvement in response time in both applications by making a change in one spot. This ability to reflect new architectural know-how in one location was the reason we pursued separation of concerns in the first place.
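
Here is a rough Python sketch of that "change in one spot" effect. The two templates are placeholders I made up for illustration; the Java they emit is a stub, nowhere near real DAO or Hibernate code. The point is only that swapping one class-level artifact changes the output for every application.

```python
from string import Template

# Class level: two interchangeable code generation templates (toy versions).
DAO_TEMPLATE = Template(
    "public class ${entity}DAO {\n"
    "    // JDBC-based load/store methods for ${entity} would be generated here\n"
    "}\n"
)
HIBERNATE_TEMPLATE = Template(
    "@Entity\n"
    "public class ${entity} {\n"
    "    // mapped fields for ${entity} would be generated here\n"
    "}\n"
)

# Instance level: the entities of each application's model.
MODELS = {
    "CRM": ["Customer", "Call"],
    "Order Entry": ["Order", "OrderLine"],
}


def generate_persistence(entities, template):
    """Apply one template to every entity and return the source text per entity."""
    return {entity: template.substitute(entity=entity) for entity in entities}


# The one spot to change when the persistence style changes.
PERSISTENCE_TEMPLATE = HIBERNATE_TEMPLATE  # previously DAO_TEMPLATE

for application, entities in MODELS.items():
    generated = generate_persistence(entities, PERSISTENCE_TEMPLATE)
    for entity, source in generated.items():
        print(f"// {application}: {entity}\n{source}")
```

The models in `MODELS` are untouched by the switch; only the template assignment moved.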

Factoring: The Final Frontier

We now need to look beyond our fictional department to that new contract down the street. This new customer will have their own architecture, frameworks, and libraries. In this new environment, we find a need to generalize and factor knowledge in the tools we use to build such model-driven development facilities. We need tools and facilities for defining metamodels, model-to-model transformations, and code generation templates. Armed with these tools, we can set up our environment for model-driven development again.

Looking a little further, we see an industry stuck repeating itself. If we continue the process of driving repetition from our work, the character of this work becomes the elimination of reinvention from implementation to implementation, and vendor to vendor. In other words, for our model-driven approach, we factor the knowledge we've gained about how to factor knowledge into industry standards.

At its heart, the Model-Driven Architecture is a pattern of tools and standards that allow you to build systems using metamodels, models, model transformations, and code generation templates. The "A" in MDA isn't about the application's architecture. The "A" stands for the facilities and techniques we, as developers, use to build applications. It is an Architecture for building systems, using models.

The essence of MDA lies in its goals of separation of concerns (adhering to the DRY principle), and generalization of its facilities, even to the extent of industry standardization.

The OMG's MDA uses the Meta Object Facility (MOF) to define metamodels or modeling languages. The forthcoming MOF-QVT (Query/View/Transformation) specification defines a set of languages (more on this in a future post) for writing model transformation specifications between two or more MOF metamodels. The MOF Model to Text specification (still in its infancy) will define a standard template language for generation of text from MOF-metamodel-based models.

You'll notice I didn't mention UML. The reason is that UML is not central to an MDA approach. Having spoken this heresy, let me soften it a bit by saying that most MDA implementations, including Compuware's OptimalJ, use pieces of UML (most especially the class metamodel and diagrammatic notation). This portion of UML is expressed as a MOF metamodel. In the case of our domain metamodel, UML is extended with features we need for a model-driven... umm... architecture. UML is a general purpose modeling language, not a special-purpose DSL.

To complete our survey of all things DRY, let's finish mapping our efforts to MDA jargon.

The code we don't write anew for each application is factored into frameworks and libraries we selected, like Hibernate, Struts, and the Xerces XML parser. This code also lives in application servers, databases and operating systems that we select. In MDA parlance, the combination of these things is the platform our system will run on. A model that contains information specific to a class of platforms, one that includes our particular platform, is said to be a Platform-Specific Model (PSM). We create PSMs using a PSM language or metamodel defined in MOF. In our example, the CRM relational model is a PSM.

In our example, we also have a domain model where Customer is defined. In the domain model we expressed the essence of Customer, absent any technological or implementation-specific decisions. This model is orthogonal to the class of platforms we choose to implement the system on; it is a Platform-Independent Model (PIM). As with PSMs, we express PIMs in a PIM language, which, in turn, is defined in MOF.

That, in a nutshell, is what MDA is about. Granted, there's a lot missing from what I've written in these three posts, but you've experienced the main avenue of thought on the matter. In building any MDA implementation, you must go through the process of parceling knowledge into the platform, metamodels, transformations, templates, models, and code. Making all of these choices may not be easy, but it will prove worthwhile.

Going forward I hope to start examining the different points of view within the MDA community. I have my own (colored, naturally, by my experience with OptimalJ), but I intend to call out where my opinion and someone else's opinion diverge. In these three posts, I've voiced my opinion, yet shown a lot of the "fact" that is MDA as we know it today.