Posted by Ricky T. Lindeman on 30/06/2010 16:23, last updated on 30/06/2010 16:27 (Public)
Title: When and How to Develop Domain-Specific Languages
Authors: Mernik, Marjan and Heering, Jan and Sloane, Anthony M.
The goal of this paper is clear: Mernik et al. set out to answer the question “When and how to develop domain-specific languages”. The answers to these two questions are rather lengthy and complex. The authors identify DSL development methodologies and patterns in the decision, analysis, design and implementation phases of DSL development. The patterns they introduce are taken from earlier work on DSL design patterns and are improved or extended.
The first section discusses general DSL facts: the reason why DSLs were developed; a general comparison of GPLs versus DSLs; the different kinds of executability of a DSL, ranging from non-executable (i.e. a data specification) to a DSL with well-defined execution semantics; and how DSLs enable reuse, be it through a language specification or by reusing a GPL as host language. The authors mention several factors that can complicate the decision to develop a new DSL. DSL maintenance is seen as a serious and time-consuming issue, but unfortunately it is not mentioned in the following sections, whereas the other issues are addressed later.
In the second section the authors identify the following phases in DSL development: decision, analysis, design, implementation and deployment. The authors miss the fact that a DSL can also be subject to change; Visser suggests adding a maintenance phase. The decision phase corresponds to the “when” part of the question in the title of the paper, while the other phases correspond to the “how” part. For each phase several patterns exist that may depend on each other or overlap.
The decision whether or not to build a DSL is taken in the decision phase. The authors list several patterns that aid in making this decision, but they miss the fact that developing a DSL requires well-established knowledge of the domain. Ultimately the decision is a financial one: is it worth investing in the development of a new DSL? The authors could have provided financial figures about DSL development to allow readers to make a better estimate of the costs for their own project.
The second phase is the analysis phase. In this phase the problem domain is identified and relevant domain knowledge is gathered. Analysis can be done informally, or formally using domain analysis methodologies. A third way is to extract information from existing code. The authors do not mention strategies for extracting knowledge from code, which is a notable omission because the need to design a DSL normally arises only after many GPL programs have been written. There is no clear guideline on how to design a DSL based on the outcome of the analysis phase, although the design should incorporate the variability of the domain. The authors discuss two domain analysis methodologies, FODA and FAST, to give an idea of the scope of these methods.
According to the authors, approaches to DSL design can be characterized along two orthogonal dimensions: the relationship between the DSL and existing languages, and the formal nature of the design description. For the first dimension, a DSL can be designed from scratch with no commonality with existing languages or, at the other end of the spectrum, a DSL can use parts of an existing language. These DSLs are also called external DSLs and internal DSLs, respectively. The authors rightly quote the lesson: “Design only what is necessary. Learn to recognize your tendency to over-design”. For the second dimension, a DSL can be designed either informally, using natural language, or formally, with a semantic specification. A formal design can reduce the implementation effort, as noted by the authors. In a further section, design support tools are discussed that can also reduce the implementation effort.
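The difference between the two ends of the first dimension can be illustrated with a small sketch. It is not taken from the paper: the route notation, the `Route` class and the `parse_route` function are hypothetical names chosen for this illustration. The same three-step route is written once as an internal DSL (a fluent Python API) and once as an external DSL (a stand-alone notation with its own parser).

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    """Internal DSL: a fluent Python API; the host language supplies the syntax."""
    steps: list = field(default_factory=list)

    def forward(self, distance):
        self.steps.append(("forward", distance))
        return self  # returning self allows method chaining

    def left(self, degrees):
        self.steps.append(("left", degrees))
        return self


def parse_route(text):
    """External DSL: a stand-alone notation that needs its own parser."""
    route = Route()
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are allowed
        parts = line.split()
        if len(parts) != 2 or parts[0] not in ("forward", "left"):
            raise SyntaxError(f"line {line_no}: cannot parse {line!r}")
        # Dispatch to the method with the same name as the keyword.
        getattr(route, parts[0])(int(parts[1]))
    return route


internal = Route().forward(10).left(90).forward(5)
external = parse_route("forward 10\nleft 90\nforward 5")
assert internal.steps == external.steps
```

The internal variant gets parsing, error reporting and tooling from Python for free but is bound to Python syntax; the external variant is free to choose its notation but has to provide its own parser and error messages.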
The last phase discussed is the implementation phase. Several implementation patterns exist; some techniques come from the GPL design field, while others are DSL specific. For instance, an interpreter or compiler approach can be used to implement a GPL as well as a DSL. However, according to the authors this point of view differs from that of Spinellis, who argues that DSL development is radically different, since a DSL development project should be smaller than a GPL project in terms of costs. Furthermore, the authors mention that interpreters and compilers are nevertheless widely used, but they do not give any actual figures about usage. Mernik et al. also discuss a COTS-based implementation approach, introduced by Wile, in which a commercial off-the-shelf product is used as host for the DSL. Depending on the product used and the actual implementation strategy, this approach could also be seen as an embedded language approach using a language specialization or extension design approach. The authors also consider XML a COTS approach and give an example of how XML combined with an XSLT transformation can be used for code generation. This implementation technique, however, corresponds directly to an application generator implementation approach: the XML is used as the DSL to specify a program and the XSLT is the application generator.
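The XML-as-DSL idea can be sketched as follows. This is a minimal illustration and not the example from the paper: a Python function stands in for the XSLT stylesheet (the Python standard library has no XSLT processor), and the element names `record` and `field` as well as the function `generate_class` are invented for this sketch.

```python
import xml.etree.ElementTree as ET

# The "program" written in the XML-based DSL: a declarative record specification.
SPEC = """<record name="Point">
  <field name="x" type="int"/>
  <field name="y" type="int"/>
</record>"""


def generate_class(xml_text):
    """Application generator: translate the XML specification into Python source."""
    record = ET.fromstring(xml_text)
    fields = [(f.get("name"), f.get("type")) for f in record.findall("field")]
    params = ", ".join(f"{name}: {typ}" for name, typ in fields)
    lines = [
        f"class {record.get('name')}:",
        f"    def __init__(self, {params}):",
    ]
    lines += [f"        self.{name} = {name}" for name, _ in fields]
    return "\n".join(lines) + "\n"


source = generate_class(SPEC)
print(source)

# Load the generated code to show that it is an ordinary, usable class.
namespace = {}
exec(source, namespace)
point = namespace["Point"](1, 2)
```

The division of labour matches the observation above: the XML document only specifies what is wanted, and all knowledge about the target language lives in the generator.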
Furthermore, the authors discuss implementation trade-offs between the interpreter, compiler or application generator approaches on the one hand and the embedded approach on the other. This comparison can also be seen as a comparison between external and internal DSLs. The missing maintenance phase in the DSL development process is reflected here as well: maintainability is not used as a criterion to compare the approaches. Deursen et al. discuss the maintainability of little languages in general but do not provide a comparison of external and internal DSLs.
Finally, Mernik et al. present a decision diagram to help the DSL developer determine the correct implementation approach, and compare the approaches based on a cost-benefit analysis. However, the authors mention that in practice the decision is mostly influenced by the implementation experience of the development team.
The second section ends with a comparison of the patterns with other classifications. Spinellis groups his patterns into creational, structural and behavioral ones and restricts them to DSL-specific ones. According to the authors, all of them can be mapped to patterns discussed in this paper. Mernik et al. also mention the classification by Wile, which has three classes (full language design, language extension and COTS-based approaches), but fail to provide a comparison. Full language design can be seen as an external DSL and is thus comparable with an interpreter, compiler or application generator approach. Language extension, a class that also covers language specialization, can be compared with the preprocessor, embedding or extensible compiler/interpreter approach. The COTS-based approaches are of course the same. Wile also evaluates his approaches based on DSL-specific, GPL support, and pragmatic support issues. Although Mernik et al. do not compare these issues with their own process, each type can be used to determine the correct pattern to use in a phase of the process. DSL-specific issues influence the language design and should be considered in the design phase. GPL support issues apply to both GPLs and DSLs, such as the kind of tool support needed by the end user. These issues should be determined in the decision phase, as the development of such tools affects the development costs, but they should also influence the implementation phase, as some tool support can come for free depending on the implementation approach. Pragmatic support issues are general requirements; they should be considered in the decision phase, as they can have a large effect on the development costs.
Section three discusses DSL development support. The authors list several systems and toolkits that aid in the design and implementation of DSLs. It is unclear, however, which patterns in the design and implementation phases should be used with such a system or toolkit. For instance, creating an editor for a DSL that is implemented using an embedded approach is radically different from creating an editor for a compiled DSL. Furthermore, separate frameworks and tools exist for the analysis phase and can be used with a domain analysis methodology.
In the last section several open problems for each development phase are listed. Mernik et al. state that GPLs should provide more support for embedded DSLs. They take Java as an example but could instead have focused more on dynamic programming languages, as their dynamic nature often makes a language extension easy.
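As an illustration of the point about dynamic languages, the following sketch embeds a tiny query notation in Python using only operator overloading and dynamic attribute lookup. It is a hypothetical example, not something discussed in the paper; the classes `Table`, `Field` and `Condition` are made up for this purpose.

```python
class Condition:
    """A condition in the embedded query language, kept as plain text."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Condition(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Condition(f"({self.text} OR {other.text})")


class Field:
    """A column reference; comparisons build a Condition instead of a boolean."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Condition(f"{self.name} = {value!r}")

    def __gt__(self, value):
        return Condition(f"{self.name} > {value!r}")


class Table:
    """Any attribute access yields a Field, so columns need no declaration."""

    def __getattr__(self, name):
        if name.startswith("_"):
            # Leave private and dunder lookups to normal Python behaviour.
            raise AttributeError(name)
        return Field(name)


t = Table()
# Parentheses are required because & binds tighter than comparisons in Python.
query = (t.age > 18) & (t.country == "NL")
print(query.text)  # (age > 18 AND country = 'NL')
```

No parser, preprocessor or compiler extension is involved: the host language's own hooks (`__getattr__`, `__eq__`, `__and__`) are enough to give the embedded notation its meaning, which is the kind of support that is harder to obtain in a statically typed language without operator overloading.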
Mernik et al. provide an almost complete overview of the phases of DSL development (decision, analysis, design and implementation) but do not cover the last two phases (deployment and maintenance). The deployment phase, in which the DSL is actually used, could benefit from patterns in the design and implementation phases when DSL support tools can be generated automatically. The maintenance phase is also affected by the previous phases, as described by Deursen et al. in their paper about the maintenance of little languages. For the other phases, the authors give a wide range of patterns, together with DSLs that used each pattern, which gives the reader meaningful insight into the usage of a pattern and the chance of success when applying it to their own DSL development process.
Martin Fowler. Domain specific languages. http://martinfowler.com/dslwip/, December 2009. Accessed on 2 December 2009.
 Diomidis Spinellis. Notable design patterns for domain-specific languages. Journal of Systems and Software, 56(1):91–99, 2001.
Eelco Visser. WebDSL: A case study in domain-specific language engineering. In Ralf Lämmel, Joost Visser, and João Saraiva, editors, Generative and Transformational Techniques in Software Engineering II, International Summer School, GTTSE 2007, Braga, Portugal, July 2-7, 2007. Revised Papers, volume 5235 of Lecture Notes in Computer Science, pages 291–373. Springer, 2007.
David S. Wile. Supporting the DSL spectrum. CIT. Journal of Computing and Information Technology, 9(4):263–287, 2001.