Quotient lenses

J. Nathan Foster, Alexandre Pilkiewicz, Benjamin C. Pierce. Quotient lenses. In James Hook, Peter Thiemann, editors, Proceeding of the 13th ACM SIGPLAN international conference on Functional programming, ICFP 2008, Victoria, BC, Canada, September 20-28, 2008. pages 383-396, ACM, 2008. [doi]

Abstract

There are now a number of bidirectional programming languages, where every program can be read both as a forward transformation mapping one data structure to another and as a reverse transformation mapping an edited output back to a correspondingly edited input. Besides parsimony—the two related transformations are described by just one expression—such languages are attractive because they promise strong behavioral laws about how the two transformations fit together—e.g., their composition is the identity function. It has repeatedly been observed, however, that such laws are actually a bit too strong: in practice, we do not want them “on the nose,” but only up to some equivalence, allowing inessential details, such as whitespace, to be modified after a round trip. Some bidirectional languages loosen their laws in this way, but only for specific, baked-in equivalences. In this work, we propose a general theory of quotient lenses— bidirectional transformations that are well behaved modulo equivalence relations controlled by the programmer. Semantically, quotient lenses are a natural refinement of lenses, which we have studied in previous work. At the level of syntax, we present a rich set of constructs for programming with canonizers and for quotienting lenses by canonizers. We track equivalences explicitly, with the type of every quotient lens specifying the equivalences it respects. We have implemented quotient lenses as a refinement of the bidirectional string processing language Boomerang. We present a number of useful primitive canonizers for strings, and give a simple extension of Boomerang’s regular-expression-based type system to statically typecheck quotient lenses. The resulting language is an expressive tool for transforming real-world, ad-hoc data formats. We demonstrate the power of our notation by developing an extended example based on the UniProt genome database format and illustrate the generality of our approach by showing how uses of quotienting in other bidirectional languages can be translated into our notation.