Domain-Driven Design
Page 36
This formalization of communication implies some shared model vocabulary—the basis of the SERVICE interfaces. As a result, the other subsystems become coupled to the model of the OPEN HOST, and other teams are forced to learn the particular dialect used by the HOST team. In some situations, using a well-known PUBLISHED LANGUAGE as the interchange model can reduce coupling and ease understanding. . . .
Published Language
The translation between the models of two BOUNDED CONTEXTS requires a common language.
When two domain models must coexist and information must pass between them, the translation process itself can become complex and hard to document and understand. If we are building a new system, we will typically believe that our new model is the best available, and so we will think in terms of translating directly into it. But sometimes we are enhancing a set of older systems and trying to integrate them. Choosing one messy model over the other may be choosing the lesser of two evils.
Another situation: When businesses want to exchange information with one another, how do they do it? Not only is it unrealistic to expect one to adopt the domain model of the other, it may be undesirable for both parties. A domain model is developed to solve problems for its users; such a model may contain features that needlessly complicate communication with another system. Also, if the model underlying one of the applications is used as the communications medium, it cannot be changed freely to meet new needs, but must be very stable to support the ongoing communication role.
Direct translation to and from the existing domain models may not be a good solution. Those models may be overly complex or poorly factored. They are probably undocumented. If one is used as a data interchange language, it essentially becomes frozen and cannot respond to new development needs.
The OPEN HOST SERVICE uses a standardized protocol for multiparty integration. It employs a model of the domain for interchange between systems, even though that model may not be used internally by those systems. Here we go a step further and publish that language, or find one that is already published. By publish I simply mean that the language is readily available to the community that might be interested in using it, and is sufficiently documented to allow independent interpretations to be compatible.
Recently, the world of e-commerce has become very excited about a new technology: Extensible Markup Language (XML) promises to make interchange of data much easier. A very valuable feature of XML is that, through the document type definition (DTD) or through XML schemas, XML allows the formal definition of a specialized domain language into which data can be translated. Industry groups have begun to form for the purpose of defining a single standard DTD for their industry so that, say, chemical formula information or genetic coding can be communicated between many parties. Essentially these groups are creating a shared domain model in the form of a language definition.
Therefore:
Use a well-documented shared language that can express the necessary domain information as a common medium of communication, translating as necessary into and out of that language.
The language doesn’t have to be created from scratch. Many years ago, I was contracted by a company that had a software product written in Smalltalk that used DB2 to store its data. The company wanted the flexibility to distribute the software to users without a DB2 license and contracted me to build an interface to Btrieve, a lighter-weight database engine that had a free runtime distribution license. Btrieve is not fully relational, but my client was using only a small part of DB2’s power and was within the lowest common denominator of the two databases. The company’s developers had built on top of DB2 some abstractions that were in terms of the storage of objects. I decided to use this work as the interface for my Btrieve component.
This approach did work. The software smoothly integrated with my client’s system. However, the lack of a formal specification or documentation of the abstractions of persistent objects in the client’s design meant a lot of work for me to figure out the requirements of the new component. Also, there wasn’t much opportunity to reuse the component to migrate some other application from DB2 to Btrieve. And the new software more deeply entrenched the company’s model of persistence, so that refactoring that model of persistent objects would have been even more difficult.
A better way might have been to identify the subset of the DB2 interface that the company was using and then support that. The interface of DB2 is made up of SQL and a number of proprietary protocols. Although it is very complex, the interface is tightly specified and thoroughly documented. The complexity would have been mitigated because only a small subset of the interface was being used. If a component had been developed that emulated the necessary subset of the DB2 interface, it could have been very effectively documented for developers simply by identifying the subset. The application it was integrated into already knew how to talk to DB2, so little additional work would have been needed. Future redesign of the persistence layer would have been constrained only to the use of the DB2 subset, just as before the enhancement.
The DB2 interface is an example of a PUBLISHED LANGUAGE. In this case, the two models are not in the business domain, but all the principles apply just the same. Because one of the models in the collaboration is already a PUBLISHED LANGUAGE, there is no need to introduce a third language.
Example: A PUBLISHED LANGUAGE for Chemistry
Innumerable programs are used to catalog, analyze, and manipulate chemical formulas in industry and academia. Exchanging data has always been difficult, because almost every program uses a different domain model to represent chemical structures. And of course, most of them are written in languages, such as FORTRAN, that do not express the domain model very fully anyway. Whenever anyone wanted to share data, they had to unravel the details of the other system’s database and work out some sort of translation scheme.
Enter the Chemical Markup Language (CML), a dialect of XML intended as a common interchange language for this domain, developed and managed by a group representing academics and industry (Murray-Rust et al. 1995).
Chemical information is very complex and diverse, and it changes all the time with new discoveries. So they developed a language that could describe the basics, such as the chemical formulas of organic and inorganic molecules, protein sequences, spectra, or physical quantities.
Now that the language has been published, tools can be developed that would never have been worth the trouble to write before, when they would have only been usable for one database. For example, a Java application, called the JUMBO Browser, was developed that creates graphical views of chemical structures stored in CML. So if you put your data in the CML format, you’ll have access to such visualization tools.
In fact, CML gained a double advantage by using XML, a sort of “published meta-language.” The learning curve of CML is flattened by people’s familiarity with XML; the implementation is eased by various off-the-shelf tools, such as parsers; and documentation is helped by the many books written on all aspects of handling XML.
Here is a tiny sample of CML. It is only vaguely intelligible to nonspecialists like myself, but the principle is clear.
1.17947 0.95091 0.97175 1.00000 1.17947 0.95090 0.97174 1.00000
1.17946 0.98215 0.94049 1.00000 1.17946 0.95091 0.97174 1.00000
1.17946 0.95091 0.97174 1.00000 1.17946 0.98215 0.94049 1.00000
0.89789 0.89790 0.89789 0.89789 0.89790 0.89788
Unifying an Elephant
It was six men of Indostan
To learning much inclined,
Who went to see the Elephant
(Though all of them were blind),
That each by observation
Might satisfy his mind.
The First approached the Elephant,
And happening to fall
Against his broad and sturdy side,
/>
At once began to bawl:
"God bless me! but the Elephant
Is very like a wall!"
. . .
The Third approached the animal,
And happening to take
The squirming trunk within his hands,
Thus boldly up and spake:
"I see," quoth he, "the Elephant
Is very like a snake."
The Fourth reached out his eager hand,
And felt about the knee.
"What most this wondrous beast is like
Is mighty plain," quoth he;
"’Tis clear enough the Elephant
Is very like a tree!"
. . .
The Sixth no sooner had begun
About the beast to grope,
Than, seizing on the swinging tail
That fell within his scope,
"I see," quoth he, "the Elephant
Is very like a rope!"
And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong,
Though each was partly in the right,
And all were in the wrong!
. . .
—From “The Blind Men and the Elephant,” by John Godfrey Saxe (1816–1887), based on a story in the Udana, a Hindu text
Depending on their goals in interacting with the elephant, the various blind men may still be able to make progress, even if they don’t fully agree on the nature of the elephant. If no integration is required, then it doesn’t matter that the models are not unified. If they require some integration, they may not actually have to agree on what an elephant is, but they will get a lot of value from merely recognizing that they don’t agree. This way, at least they don’t unknowingly talk at cross-purposes.
The diagrams in Figure 14.9 are UML representations of the models the blind men have formed of the elephant. Having established separate BOUNDED CONTEXTS, the situation is clear enough for them to work out a way to communicate with each other about the few aspects they care about in common: the location of the elephant, perhaps.
Figure 14.9. Four contexts: no integration
Figure 14.10. Four contexts: minimal integration
As the blind men want to share more information about the elephant, the value of sharing a single BOUNDED CONTEXT goes up. But unifying the disparate models is a challenge. None of them is likely to give up his model and adopt one of the others. After all, the man who touched the tail knows the elephant is not like a tree, and that model would be meaningless and useless to him. Unifying multiple models almost always means creating a new model.
With some imagination and continued discussion (probably heated), the blind men could eventually recognize that they have been describing and modeling different parts of a larger whole. For many purposes, a part-whole unification may not require much additional work. At least the first stage of integration only requires figuring out how the parts are related. It may be adequate for some needs to view an elephant as a wall, held up by tree trunks, with a rope at one end and a snake at the other.
Figure 14.11. One context: crude integration
The unification of the various elephant models is easier than most such mergers. Unfortunately, it is the exception when two models purely describe different parts of the whole, although this is often one aspect of the difference. Matters are more difficult when two models are looking at the same part in a different way. If two men had touched the trunk and one described it as a snake and the other described it as a fire hose, they would have had more difficulty. Neither can accept the other’s model, because it contradicts his own experience. In fact, they need a new abstraction that incorporates the “aliveness” of a snake with the water-shooting functionality of a fire hose, but one that leaves out the inapt implications of the first models, such as the expectation of possibly venomous fangs, or the ability to be detached from the body and rolled up into a compartment in a fire truck.
Even though we have combined the parts into a whole, the resulting model is crude. It is incoherent, lacking any sense of following contours of an underlying domain. New insights could lead to a deeper model in a process of continuous refinement. New application requirements can also force the move to a deeper model. If the elephant starts moving, the “tree” theory is out, and our blind modelers may break through to the concept of “legs.”
Figure 14.12. One context: deeper model
This second pass of model integration tends to slough off incidental or incorrect aspects of the individual models and creates new concepts—in this case, “animal” with parts “trunk,” “leg,” “body,” and “tail”—each of which has its own properties and clear relationships to other parts. Successful model unification, to a large extent, hinges on minimalism. An elephant trunk is both more and less than a snake, but the “less” is probably more important than the “more.” Better to lack the water-spewing ability than to have an incorrect poison-fang feature.
If the goal is simply to find the elephant, then translating between each model’s expression of location will do. When more integration is needed, the unified model doesn’t have to reach full maturity in the first version. It may be adequate for some needs to view an elephant as a wall, held up by tree trunks, with a rope at one end and a snake at the other. Later, driven by new requirements and by improved understanding and communication, the model can be deepened and refined.
Recognizing multiple, clashing domain models is really just facing reality. By explicitly defining a context within which each model applies, you can maintain the integrity of each and clearly see the implications of any particular interface you want to create between the two. There is no way for the blind men to see the whole elephant, but their problem would be manageable if only they recognized the incompleteness of their perception.
Choosing Your Model Context Strategy
It is important always to draw the CONTEXT MAP to reflect the current situation at any given time. Once that’s done, though, you may very well want to change that reality. Now you can begin to consciously choose CONTEXT boundaries and relationships. Here are some guidelines.
Team Decision or Higher
First, teams have to make decisions about where to define BOUNDED CONTEXTS and what sort of relationships to have between them. Teams have to make these decisions, or at least the decisions have to be propagated to the entire team and understood by everyone. In fact, such decisions often involve agreements beyond your own team. On the merits, decisions about whether to expand or to partition BOUNDED CONTEXTS should be based on the cost-benefit trade-off between the value of independent team action and the value of direct and rich integration. In practice, political relationships between teams often determine how systems are integrated. A technically advantageous unification may be impossible because of reporting structure. Management may dictate an unwieldy merger. You won’t always get what you want, but at least you may be able to assess and communicate something of the cost incurred, and take steps to mitigate it. Start with a realistic CONTEXT MAP and be pragmatic in choosing transformations.
Putting Ourselves in Context
When we are working on a software project, we are interested primarily in the parts of the system our team is changing (the “system under design”) and secondarily in the systems it will communicate with. In a typical case, the system under design is going to get carved into one or two BOUNDED CONTEXTS that the main development teams will be working on, perhaps with another CONTEXT or two in a supporting role. In addition to that are the relationships between these CONTEXTS and the external systems. This is a simple, typical view, to give some rough expectation for what you are likely to encounter.
We really are part of that primary CONTEXT we are working in, and that is bound to be reflected in our CONTEXT MAP. This isn’t a problem if we are aware of the bias and are mindful of when we step outside the limits of that MAP’s applicability.
Trans
forming Boundaries
There are an unlimited variety of situations and an unlimited number of options for drawing the boundaries of BOUNDED CONTEXTS. But typically the struggle is to balance some subset of the following forces:
Favoring Larger BOUNDED CONTEXTS
• Flow between user tasks is smoother when more is handled with a unified model.
• It is easier to understand one coherent model than two distinct ones plus mappings.
• Translation between two models can be difficult (sometimes impossible).
• Shared language fosters clear team communication.
Favoring Smaller BOUNDED CONTEXTS
• Communication overhead between developers is reduced.
• CONTINUOUS INTEGRATION is easier with smaller teams and code bases.
• Larger contexts may call for more versatile abstract models, requiring skills that are in short supply.
• Different models can cater to special needs or encompass the jargon of specialized groups of users, along with specialized dialects of the UBIQUITOUS LANGUAGE.
Deep integration of functionality between different BOUNDED CONTEXTS is impractical. Integration is limited to those parts of one model that can be rigorously stated in terms of the other model, and even this level of integration may take considerable effort. This makes sense when there will be a small interface between two systems.
Accepting That Which We Cannot Change: Delineating the External Systems
It is best to start with the easiest decisions. Some subsystems will clearly not be in any BOUNDED CONTEXT of the system under development. Examples would be major legacy systems that you are not immediately replacing and external systems that provide services you’ll need. You can identify these immediately and prepare to segregate them from your design.