Slow-Moving Targets

Sunday, June 06, 2004

Don't Repeat Yourself (Part 1: The Essence of MDA)

Keeping Your Code DRY

It's all about change. Software maintenance can become impossible if the software was written poorly. Well-designed, well-crafted software remains soft: it can be changed and extended to accommodate the changing needs of its users.

Perhaps the most important characteristic of software is how and where knowledge is represented. If you distribute knowledge throughout the software, then changing to a new "understanding" becomes difficult. For example, one classic maintenance problem in client/server application development is the "magic pushbutton."

All the code for a feature of the program lived in the event handler for that button push. When some of that functionality (knowledge) needed to be reused, the relevant bits of code were copied into the event handler for the menu item, or into the code for a dialog box. Any change to that feature--to the data structures or rules involved in their manipulation--would ripple through each place where that knowledge was represented. Miss just one of those representations and your software would behave unpredictably or, worse, cause the user to lose work.
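As a minimal sketch of the alternative, the rule can live in one class while every UI entry point delegates to it. All of the names here (CustomerArchiver, ArchiveButton, ArchiveMenuItem) and the "archive a customer" rule itself are hypothetical, invented only for illustration; the handlers are plain Java stand-ins, not any real UI toolkit's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical service: the single, authoritative home for the "archive customer" rule.
final class CustomerArchiver {
    private final List<String> archivedIds = new ArrayList<>();

    // Validation, normalization, and the state change are all expressed once, here.
    String archive(String customerId) {
        if (customerId == null || customerId.trim().isEmpty()) {
            throw new IllegalArgumentException("customerId is required");
        }
        String normalized = customerId.trim().toUpperCase(Locale.ROOT);
        if (!archivedIds.contains(normalized)) {
            archivedIds.add(normalized);
        }
        return "Archived " + normalized;
    }

    List<String> archivedIds() {
        return Collections.unmodifiableList(archivedIds);
    }
}

// Stand-in for the pushbutton's event handler: it delegates and holds no rules of its own.
final class ArchiveButton {
    private final CustomerArchiver archiver;

    ArchiveButton(CustomerArchiver archiver) {
        this.archiver = archiver;
    }

    String onClick(String selectedCustomerId) {
        return archiver.archive(selectedCustomerId);
    }
}

// Stand-in for the menu item's event handler: same delegation, no copied code.
final class ArchiveMenuItem {
    private final CustomerArchiver archiver;

    ArchiveMenuItem(CustomerArchiver archiver) {
        this.archiver = archiver;
    }

    String onSelect(String selectedCustomerId) {
        return archiver.archive(selectedCustomerId);
    }
}

final class PushbuttonDemo {
    public static void main(String[] args) {
        CustomerArchiver archiver = new CustomerArchiver();
        System.out.println(new ArchiveButton(archiver).onClick(" c-42 "));
        System.out.println(new ArchiveMenuItem(archiver).onSelect("c-42"));
        System.out.println(archiver.archivedIds());
    }
}
```

A change to the rule, such as a different normalization, now happens in CustomerArchiver alone, so there is no second copy to miss.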

While it is not impossible to maintain software built this way, it becomes significantly easier to get the changes wrong or to miss something. In their wonderful book, The Pragmatic Programmer: From Journeyman to Master, Dave Thomas and Andrew Hunt spelled out the DRY principle: Don't Repeat Yourself. Have a single, authoritative representation of knowledge in your system.

Working on features of Compuware's OptimalJ, and working at the OMG on Model Driven Architecture, had me thinking about the DRY principle in the context of MDA. Adrian Colyer's recent explanation of Aspect-Oriented Programming only made the ideas knocking around in my head clearer.

Adrian's blog discussed the relationship between different concepts a developer deals with when building a system. Specifically, he wrote about how Aspect-Oriented Programming addresses the problem of representing associated concepts when they relate to each other in a one-to-n fashion. By providing the capability of representing knowledge of the associations explicitly (through pointcuts), AOP allows representation of associated concerns once and only once. Usually this results in more concise and higher-quality code.

Model Driven Architecture also provides a solution to this problem. But it does so while providing something even more important: higher abstraction. We'll get to that as we work through this code factoring problem.

Zooming In
I've actually written about this before, but that was sort of shooting from the hip, so I'll take a more considered attempt at it here.

Starting at the end of a long trail of debugging sessions, code cranking fugues, design decisions, architectural expressions and requirements we see a working system, fully conceived. Examining it closely, we see that our notion of Customer, culled from the mental floss of our requirements documents, lives in many places in our working system. Customer, or rather the collaborating notions of Customer, must live in all these different places to allow users of the system to accomplish work.

Customer exists as RM_CUST_MAST in our relational database. It breathes as instances of the com.compuware.myapplication.Customer class, and attendant helper objects and interfaces. It is rendered in many formats across many UI screens (or pages, in this example). Its heartbeat plays out across the glue code and configuration files that hook these pieces together. The same thing is true for Call, and Product, and all of the myriad business notions that found their way into the system.

Since we have written all of this code by hand (an extreme case, but we'll go with it for the moment) we've violated the DRY principle on a number of levels.

  • Customer, as a notion, is repeated and distributed about our system's code, as are any of the other domain concepts we need our system to manipulate.
  • We repeatedly expressed the knowledge of transforming a domain concept into a relational database entity.
  • We repeatedly expressed the knowledge of how to render domain concepts as Java classes.
  • We repeatedly expressed the knowledge of how to handle associations between domain objects in code.
  • We repeatedly expressed the knowledge of how to expose domain objects to the user through the UI.
  • We repeatedly expressed the knowledge of how to map Java objects to database tables.
  • ...

How do we fix this? We start by finding alternative ways of representing knowledge that put information in the right spot.

Zooming Out With Models

Let's attack the DBMS representation first. If you've built large or complex databases you've probably already solved this particular problem. You represented the concept of an entity and its associations, and rendered the code that the DBMS operates on. Using a diagramming tool, you created a model of Customer, Call, and Product. This model could then be rendered as DDL specific to your database of choice. The model showed just what information was important to create implementations in a specific database. The model let you target Oracle 9i, SQL Server, or MySQL. More importantly, it let you start at Oracle 7.3 and upgrade to subsequent versions when you needed to.

The diagramming tool likely used two representations of Customer: one in the logical model and another in the physical model. The logical model captured the general idea of the Customer entity, and the physical model captured the actual definitions of the database structure for a specific database. Rules or mappings define how elements in the logical model should be interpreted in the context of a specific database.

Specific knowledge and design decisions are now parceled up into discrete packages. We defined the concept of a Customer data table in one place. Someone defined the mapping rules between this concept and how it will be implemented in another place. The model of the implementation and the rules for turning this model into code (DDL) exist in two more distinct places. If the tool is especially flexible, or if I built the tool myself, I can control the DDL generation and refine it as my expertise with the software platform (Oracle) increases. Even better, if the tool permits it, I can control the mapping rules between models as my expertise with the class of platforms (RDBMS) increases.
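The four places described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not how any particular modeling tool works: every class name, the upper-casing naming rule, the two logical types, and the column sizes are hypothetical. The only platform facts relied on are that Oracle offers VARCHAR2 and NUMBER column types and MySQL offers VARCHAR and INT.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Place 1: the logical model. It says what a Customer is, with no database detail.
enum LogicalType { TEXT, INTEGER }

final class LogicalEntity {
    final String name;
    final Map<String, LogicalType> attributes = new LinkedHashMap<>();

    LogicalEntity(String name) {
        this.name = name;
    }

    LogicalEntity attribute(String attributeName, LogicalType type) {
        attributes.put(attributeName, type);
        return this;
    }
}

// Place 2: the mapping rules that interpret logical types for one platform.
interface PlatformMapping {
    String columnType(LogicalType type);
}

final class Platforms {
    // Hypothetical choices of size and precision, for illustration only.
    static final PlatformMapping ORACLE =
        type -> type == LogicalType.TEXT ? "VARCHAR2(100)" : "NUMBER(10)";
    static final PlatformMapping MYSQL =
        type -> type == LogicalType.TEXT ? "VARCHAR(100)" : "INT";
}

// Place 3: the physical model, the implementation as seen by a specific database.
final class PhysicalTable {
    final String tableName;
    final List<String[]> columns = new ArrayList<>(); // each entry is {name, type}

    PhysicalTable(String tableName) {
        this.tableName = tableName;
    }
}

final class ModelTransformer {
    // Applies the mapping rules to produce a physical model from the logical one.
    static PhysicalTable toPhysical(LogicalEntity entity, PlatformMapping mapping) {
        PhysicalTable table = new PhysicalTable(entity.name.toUpperCase(Locale.ROOT));
        for (Map.Entry<String, LogicalType> attribute : entity.attributes.entrySet()) {
            table.columns.add(new String[] {
                attribute.getKey().toUpperCase(Locale.ROOT),
                mapping.columnType(attribute.getValue())
            });
        }
        return table;
    }
}

// Place 4: the rules for turning the physical model into code (DDL).
final class DdlGenerator {
    static String createTable(PhysicalTable table) {
        StringBuilder ddl = new StringBuilder("CREATE TABLE " + table.tableName + " (");
        for (int i = 0; i < table.columns.size(); i++) {
            if (i > 0) {
                ddl.append(", ");
            }
            String[] column = table.columns.get(i);
            ddl.append(column[0]).append(' ').append(column[1]);
        }
        return ddl.append(")").toString();
    }
}

final class ModelDemo {
    public static void main(String[] args) {
        LogicalEntity customer = new LogicalEntity("Customer")
            .attribute("id", LogicalType.INTEGER)
            .attribute("name", LogicalType.TEXT);
        // One definition of Customer, rendered for two platforms.
        System.out.println(DdlGenerator.createTable(
            ModelTransformer.toPhysical(customer, Platforms.ORACLE)));
        System.out.println(DdlGenerator.createTable(
            ModelTransformer.toPhysical(customer, Platforms.MYSQL)));
    }
}
```

Customer is defined once. Retargeting the database touches only a PlatformMapping, and refining the generated DDL touches only DdlGenerator.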

In creating these stepping stones to implementation (a metaphor I'm borrowing from MDA Distilled), we've achieved two things. First, through abstraction, we've achieved a more powerful form of expression that allows us to say more. Second, we've apportioned knowledge in our system in a more fine-grained manner. This manner of parceling up different kinds of knowledge in different spots is exactly the kind of technique we need to effectively practice the DRY principle.

We haven't yet solved the larger problem, though. More on this in my next entry; stay tuned.