The modular transformation challenge

I got some good feedback here on my last paper. I want to see if I can get some equally good advice about what my next paper should be. My past papers have been about making programming easier. I would like to move on to making programming more powerful, specifically by making transformations and views a fundamental language feature. I have thought of a simple challenge problem that I might use to motivate and evaluate this idea. Please let me know whether you agree that (a) this is actually an unsolved problem, and (b) it is a worthwhile challenge. If there is interest, I will post this onto some collaborative editing surface. Thanks!


It is often suggested that software designs can profitably be seen as the transformation of information from one structure into another. Such transformations are also called mappings or views. The appeal of this idea is the potential to construct complex systems by modularly combining simple transformations. But our programming languages do not express transformations very well, forfeiting their potential.

I want to present a simple challenge problem to motivate and evaluate techniques of programming with transformations. This challenge is the archetypal problem of presenting data fields on a screen form, and feeding the user’s changes back into that data. Conceptually, this is merely a matter of mapping the internal data structure to a screen-form structure, and then inversely mapping user inputs back again. But actual practice typically involves complex frameworks and multiple languages that are not structured in terms of transformations.

We start with a simple record of data fields. Expressed in Java, it is:

class Customer {
  int id;
  String name;
  String phone;
  String address;
}

For this challenge, the data can alternatively be expressed in any preferred format, such as XML:

<Customer>
  <id>1234</id>
  <name>John Smith</name>
  <phone>555-1212</phone>
  <address>32 Vassar St
  Cambridge, MA 02139
  </address>
</Customer>

The problem is to transform an instance of this data into a screen form, using any desired presentation language. In HTML, the result would be:

<form method="post" action="">
  id: <input name="id" type="text" value="1234"/><p>
  name: <input name="name" type="text" value="John Smith"/><p>
  phone: <input name="phone" type="text" value="555-1212"/><p>
  address: <textarea name="address" rows="3">32 Vassar St
  Cambridge, MA 02139</textarea><p>
</form>

The challenge comes from the fact that both the data definition and the form layout can change. The data definition will evolve due to changing requirements. The form layout will be customized to add features and improve the usability of the presentation. These changes pose a dilemma. If we generate the form from the data, it is easy to accommodate change to the data, but hard to allow change to the form. On the other hand, we can sever the relationship between the data and the form, maintaining them as independent artifacts, and so easily customize the form, but changing the data will then require manual adaptation of the form.

The latter option is the standard approach, treating the screen form as an independently maintained source artifact. To deal with online changes to data, some kind of “binding” occurs in the form to pull the data fields in and push user inputs back out again. Some UI frameworks offer bidirectional binding as a service between data and presentation classes. Template languages, in common use on the web, use “escapes” containing interpreted code that pulls data values into the right places at the right time.
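
Here is a minimal sketch, in Java, of the kind of hand-wired binding this entails (the class and method names are hypothetical, not from any particular framework). Note how the field names are baked into both directions of the mapping:

import java.util.Map;

// Hand-wired binding: pull the data fields into the form, and push
// posted user inputs back out again.
class CustomerFormBinding {

  // Data -> presentation: render the record as an HTML form.
  static String render(Customer c) {
    return "<form method=\"post\" action=\"\">"
        + "id: <input name=\"id\" type=\"text\" value=\"" + c.id + "\"/><p>"
        + "name: <input name=\"name\" type=\"text\" value=\"" + c.name + "\"/><p>"
        + "phone: <input name=\"phone\" type=\"text\" value=\"" + c.phone + "\"/><p>"
        + "address: <textarea name=\"address\" rows=\"3\">" + c.address + "</textarea><p>"
        + "</form>";
  }

  // Presentation -> data: apply the posted form values to the record.
  static void apply(Customer c, Map<String, String> post) {
    c.id = Integer.parseInt(post.get("id"));
    c.name = post.get("name");
    c.phone = post.get("phone");
    c.address = post.get("address");
  }
}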

The problem with the standard approach is that it leaves the forms internally hard-wired into the data definitions. If the data definitions evolve, the forms must be manually adjusted. To make the problem precise, we will specify that the following data evolutions are to be handled:

  1. Inserting new fields, and
  2. Renaming existing fields.
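
For concreteness, here is one hypothetical instance of both evolutions applied to the Customer record (the particular email field and the phoneNumber rename are illustrative choices, not part of the challenge):

class Customer {
  int id;
  String name;
  String email;       // evolution 1: a newly inserted field
  String phoneNumber; // evolution 2: the phone field, renamed
  String address;
}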

One approach to solving this problem would be to treat these evolutions as refactorings that automatically make the appropriate changes to the form. To be an acceptable solution to this challenge, such refactorings must be decidable and sound. Good luck with that, as bindings and escapes are typically expressed in general-purpose Turing-complete languages. The form language would have to be a limited special-purpose language of some sort. That is the approach taken in so-called Model-Driven Development (MDD), where domain-specific models are defined declaratively. MDD raises the “round-trip engineering” problem, which is the need to map changes in one model into the others to keep them consistent. MDD is based on transformations, but ones that map changes between independent artifacts. I challenge MDD proponents to show how they would solve this simple problem. To make the problem more precise, I will list the customizations that the form language must support:

  1. Introducing a hierarchy of form layout containers to control where fields are displayed. For example, splitting the form into two side-by-side panels.
  2. Distributing fields across different layout containers, and changing the order of fields within a layout container. For example, moving the address field into the right-hand panel created above, and moving the id field to be after the name field.
  3. Renaming field labels, for example “id” to “customer id”.
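
For example, applying all three customizations to the generated form might yield something like the following (the two-column table is just one illustrative way to realize side-by-side panels):

<form method="post" action="">
  <table><tr>
    <td>
      name: <input name="name" type="text" value="John Smith"/><p>
      customer id: <input name="id" type="text" value="1234"/><p>
      phone: <input name="phone" type="text" value="555-1212"/><p>
    </td>
    <td>
      address: <textarea name="address" rows="3">32 Vassar St
      Cambridge, MA 02139</textarea><p>
    </td>
  </tr></table>
</form>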

These customizations must be invariant in the presence of the data evolutions stipulated earlier: introducing new data fields (which should pop up within the form, preserving relative order with other fields that have not been explicitly moved), and renaming data fields (which should have no effect other than to replace displayed labels that haven’t been explicitly renamed).

The purpose of this challenge is to evaluate an alternative resolution of the dilemma of coordinating change in software designs. I want to treat the form as being directly generated from the data definition, not as an independent artifact. The generator will be specified in a special language designed to express transformations between structures. Customizations to the form will also be transformations, which are composed together to convert the generated generic form into the desired customized form. The key technical hurdle is that these customizing transformations must be invariant under the stipulated evolutions of the input data. I challenge existing transformation languages to handle this short list of customizations and evolutions. I think they will be hard-pressed, because those customizations and evolutions, while commonplace and natural, play havoc with the ways we typically express structure in languages.
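
As a baseline, the generic generator by itself is easy to approximate in a conventional language. Here is a minimal sketch using Java reflection (emphatically not the proposed transformation language, just an illustration of the uncustomized base case). The hard part, which this sketch does not address, is layering customizations on top that survive the data evolutions:

import java.lang.reflect.Field;

// Generate a default form directly from the data definition, so that
// newly inserted fields appear automatically.
// (Caveat: the JVM spec does not guarantee that reflection returns
// fields in declaration order, though common JVMs do.)
class GenericFormGenerator {
  static String generate(Object data) throws IllegalAccessException {
    StringBuilder html = new StringBuilder("<form method=\"post\" action=\"\">\n");
    for (Field f : data.getClass().getDeclaredFields()) {
      f.setAccessible(true);
      html.append("  ").append(f.getName())
          .append(": <input name=\"").append(f.getName())
          .append("\" type=\"text\" value=\"").append(f.get(data))
          .append("\"/><p>\n");
    }
    return html.append("</form>").toString();
  }
}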

Renaming is particularly nasty, for the naïve way to specify most of these transformations is by keying off of the names. Any reasonably expressive transformation language is unlikely to support decidable and sound refactoring of its matching semantics. Order is another stumbling block, for insertion is often specified in terms of integer ordinals, which are unstable in the presence of other insertions, violating the data evolution requirement. So for example moving the id field after the name field, if encoded in terms of ordinal positions, would be incorrect after inserting a new field before the id field.
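
A tiny sketch makes the ordinal instability concrete (the field list and the move encoding are hypothetical):

import java.util.ArrayList;
import java.util.List;

class OrdinalInstability {
  public static void main(String[] args) {
    List<String> fields = new ArrayList<>(List.of("id", "name", "phone", "address"));

    // Customization encoded by ordinals: "move field 0 to position 1",
    // i.e. move id after name.
    int from = 0, to = 1;

    // Data evolution: a new field is inserted before id.
    fields.add(0, "accountType");

    // Replaying the ordinal-encoded move now moves the wrong field:
    fields.add(to, fields.remove(from)); // moves accountType, not id
    System.out.println(fields); // [id, accountType, name, phone, address]
  }
}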

These technical problems reveal a surprising, even scandalous, situation: that in this day and age we still don’t understand sequential and hierarchical structures well enough to build modular transformations on them. It is not my purpose here to present a solution, but those familiar with my past work will not be surprised to hear me claim that the root cause of the problem is that we have adopted the wrong way to encode such structure: text strings. Strings (or the morally equivalent ASTs/DOMs produced by parsing them) are not a useful way to represent complex information artifacts. I claim that once we get over our textual fixation, the transformation challenge becomes tractable. By an odd coincidence, it turns out that the internal data model of Subtext is close to a solution.

Meeting the challenge of transformations would yield a pleasing result: that instead of building a system as a set of artifacts with subtle internal interdependencies that must be manually maintained, we can instead plug together a set of modular transformations. Note that the data definition artifact itself can be replaced by a history of evolutions, seen as transformations, starting from the empty definition. It’s transformations all the way down. We win because the transformations are modular, unlike the structures they generate. I call this approach Transformative Programming.
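
To illustrate the flavor of “transformations all the way down”, here is a minimal sketch (all names hypothetical) in which the data definition is nothing but a replayable history of evolutions applied to the empty definition, with insertion positions keyed by field identity rather than by ordinals:

import java.util.ArrayList;
import java.util.List;

// Each evolution is itself a transformation of the field list.
interface Evolution { void apply(List<String> fields); }

// Insert a new field after an existing one, located by identity, not ordinal.
record AddField(String name, String afterField) implements Evolution {
  public void apply(List<String> fields) {
    fields.add(fields.indexOf(afterField) + 1, name); // indexOf(null) is -1: front
  }
}

// Rename a field in place; its position and identity are unaffected.
record RenameField(String from, String to) implements Evolution {
  public void apply(List<String> fields) {
    fields.set(fields.indexOf(from), to);
  }
}

class EvolutionHistory {
  public static void main(String[] args) {
    List<Evolution> history = List.of(
        new AddField("id", null),
        new AddField("name", "id"),
        new AddField("phone", "name"),
        new AddField("address", "phone"),
        new RenameField("phone", "phoneNumber"));

    List<String> fields = new ArrayList<>(); // the empty definition
    for (Evolution e : history) e.apply(fields);
    System.out.println(fields); // [id, name, phoneNumber, address]
  }
}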

24 Replies to “The modular transformation challenge”

  1. This looks close to what I’ve been looking into. I must admit I hadn’t put a cool name to it like “transformative programming”.

    http://lambda-the-ultimate.org/node/2652#comment-39945

    I agree that dealing with change is a big problem, but I’m not sure it needs a solution. I think there are two separate problems here. One is how to build views and transforms; the second is how to deal with change.

    Personally, I won’t be trying to touch the second problem. It may be that this is like the fallacies of distributed computing:

    1) The network is reliable
    2) Latency is zero
    3) Bandwidth is infinite
    4) The network is secure
    5) Topology doesn’t change
    6) There is one administrator
    7) Transport cost is zero
    8) The network is homogeneous
    9) The system is monolithic
    10) The system is finished

    The only way to solve them is to work around them. My personal take is that dealing with change requires better refactoring tools: updates to anywhere that needs changing because of new fields, etc.

    I think the other parts of transformative programming are very interesting. Obviously there are hints of the requirements in different languages, as I found from responses at LtU. I’m not going to be writing any papers any time soon, so I look forward to seeing what you come up with.

  2. Exactly.

    I’ve been putting together a presentation for LIPHP (Long Island PHP Users Group) about exactly this topic — it’s my best vaporous talk yet — I’ve been delaying presenting it precisely because of the difficulty of proving the (a) and (b) concerns you listed above. It’s not that I can’t prove these concerns, it’s that it takes a lot of knowledge about problem application domains and actual implementations of code bases and their structures to explain to people how to change the way they look at their world. Problem domains like web application development have become entrenched with default assumptions that just don’t scale. Every developer with a passion for head-down coding wants to solve this problem, and so they hack at it with their incurable do-it-yourself attitude. If I can talk philosophically for a second… and apologize for my stream of consciousness…

    The term a few others and I use to describe what you are talking about is fungible. You can have a fungible presentation of the data iff you have data independence. This is why it upsets me when people blow off what I try to tell them about application design. They just don’t get it. It’s about “one fact, one place, one time”. Moreover, because the presentation is fungible, it’s the least valuable part of the application. Very few software engineering researchers even seem willing to embrace this.

    At LIPHP tonight, I asked my good friend Ken about the topic of drinking kool-aid. Coincidentally, he recently gave a guest talk to students at Nassau CC taking a web development course here in Long Island, and one of the students was so impressed with Ken’s “foreign” view on application development that the student asked him outright, “So how do you get people to do things your way?” Ken’s reply was that he was never able to do that, and he didn’t have an answer. However, afterward, Ken and the professor of the course were talking about it, and they agreed that the only way to get people to do things your way is if you save them labor.

    You have to eliminate labor. You can call it “Transformative Programming”, or you can invent something else and call it “Structure-Oriented Programming”; at the end of the day the only thing people are going to ask about your research results is, “Did you save anybody labor?” With Subtext 2.0, it’s clear to me that the stuff you came up with will save me labor. I implicitly asked that question throughout watching your demonstration of its features.

    “It is often suggested that software designs can profitably be seen as the transformation of information from one structure into another.”

    Not just structure, though! SPECIFICATION. Structure is such a weak, obtuse word here. You really want a way to specify interfaces without code – that is what “structure” is typically used to mean. The other half of the picture is dynamics; that’s what things like collaboration diagrams are for. The reason I’ve been thinking about transformations so much is that so many programmers I talk to have difficulty VISUALIZING what complex transformations do in not-quite-expressive languages like Java. They have a hell of a time debugging long call chains because they have to trace through 10 method calls to realize what the F the code is doing wrong. Much of the syntactic weight in Java is really just a complex transformation waiting to be simplified.

    End stream of consciousness. Sorry for the lackluster presentation, but I had a burst of energy while simultaneously needing to go to bed!

  3. Have you looked at Harmony? It’s a structure transformation language where all operations are bi-directional (but sometimes lossy). Unfortunately I haven’t had a chance to investigate its use as a program transformation tool.

    http://www.seas.upenn.edu/~harmony/

    [Thanks Justin – Yes I know about Harmony. My challenge is evilly crafted to stymie such transformation languages, which rely on fixed unique names and have no concept of order.]

  4. Nice challenge. Is this kind of task straightforward in Subtext?

    [I think the basic semantics are almost there, but they are not currently exposed. I need to refine the internal engine, define a set of primitive functions, and provide a GUI presentation. Bottom line, I hope to make it straightforward in the next version of Subtext. – Jonathan]

  5. Could a DWIM approach work? DWIM = Do What I Mean. Map the ‘phone’ data field to the presentation item closest in spelling to ‘phone’…

    I recently endured a project with just this mapping problem. Adding a single item to the presentation layer involved changes all the way down: business objects, web service methods, stored procedures, and database tables. It seemed that some semi-automated pattern matcher could figure out that “int LandUseCodeId” in the DB maps to the combo-box LandUse. Even if it were only 80% correct, as long as helping it with additional hints was easy, it would be a huge time saver.
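
    A minimal sketch of such a matcher, assuming plain edit distance as the similarity measure (all names hypothetical):

    // Match each data field to the widget whose name is closest in spelling.
    class DwimMatcher {

      // Classic Levenshtein edit distance between two names.
      static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
          for (int j = 1; j <= b.length(); j++)
            d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                d[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1));
        return d[a.length()][b.length()];
      }

      // Pick the closest widget name, e.g. "LandUseCodeId" -> "LandUse".
      // A real matcher would normalize case and ask for a hint whenever
      // the best distance is above some confidence threshold.
      static String closest(String field, String[] widgets) {
        String best = widgets[0];
        for (String w : widgets)
          if (editDistance(field, w) < editDistance(field, best)) best = w;
        return best;
      }
    }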

  6. Ian, all that maintenance you describe is an architectural shortcoming or trade-off. It also reflects a very specific mindset those of us at LIPHP discourage: the thought that the changes bubble downward, and the implication that the database is the “bottom” layer. Levels of layers is the wrong metaphor. It’s a pipeline. Modularity and layering exist to support change. That’s it. Your problem is not too much change, but simply not supporting change well. Strategically, the goal of all software engineering research should be to eliminate labor. Anything else and you are not living up to Fred Brooks’ essay No Silver Bullet. One thing eliminating labor means is eliminating these software architectures devised by Einstein that divvy work into tiny steps for Mort to do. It’s all tiny steps, but there’s an infinite amount of them. Software maintenance shouldn’t be postponed indefinitely.

    “Business objects” and so forth are what I consider the Ralph Johnson school of thought: http://st-www.cs.uiuc.edu/users/johnson/bus-obj.html This school of thought seems to believe that if you have enough patterns and human coordination you can have information dominance in very large-scale software systems. This school of thought is tactical, not strategic. It’s the hundred-year war metaphor in politics. (To be fair, tactics are useful in the sense that they improve process-oriented decisions and help move software engineering more toward engineering.)

  7. John, good point, it was a poor architecture. But even a good one would still have the two levels discussed in this blog post: data and presentation. So the question remains whether a decent pattern-matching-plus-hints approach might work.

  8. Poor architecture for change, maybe. Poor architecture overall? I don’t have enough information. All software architecture down to the tactical level is just a spring force model.

    I don’t have any resistance to your suggestion. Some of my favorite research projects try to help the programmer see the pipeline: linked editing, Relo, Subtext. However, I can think of a particular architectural constraint I’ve been exposed to where you might not get the 80/20 you are asking for. Still, modeling the problem the way you posed it is not a bad idea; it’s not right or wrong, it’s just a model. I admire the sort of thinking that probably brought you to the idea, too. If I can indulge myself, it’s the idea that most software should be anonymous. Also, as Jonathan put it in a wonderful metaphor, this is programming as experimental philosophy.

    The reason I emphasized specification earlier is that specification doesn’t define implementation. When you are talking about transforming structure, you are talking about transforming implementation details. The key word he uses here is “adaptation”; implementations are an adaptation of the spec. When speaking of transformations, the real design question is, “What shouldn’t be hard-wired?” The answer is that you can’t hard-wire one implementation to the spec. That’s what people do all the time. I call it the N+1 Schema Anti-Pattern, and it’s easily the most common anti-pattern without a widely recognized name, because so many are duped into thinking it is “the right approach” when really it is just the default assumption.

    It is a commentary on how some people at places like OOPSLA think about architecture that they talk about very large-scale software systems, when I am talking about very long-term software systems; I don’t want to manage labor, I want to eliminate it. All software architecture is just a spring force model.

    [John – You have obviously been thinking about these issues for a while. Why don’t you try to organize your thoughts into an essay? Perhaps one place to start is with your “N+1 Schema” anti-pattern. You could use the standard pattern template. I would be happy to comment on it. – Jonathan]

  9. Got it. The approach you’re talking about (“Not just structure, though! SPECIFICATION”) is better. It makes explicit the presentation relationships (order, relative location, etc.).

    My idea comes from a colleague who marvels at Google’s ability to do spell-check through little more than pattern-matching. Same with Flickr tagging producing good results. The librarian (or programmer) in us wants a human to do all classification (or mapping), but perhaps machine-generated results can be good enough. With software the programmer would have to review mappings, and be notified whenever they change.

  10. Right. I understand where your idea comes from. It’s not just Google’s ability to do spell-check through little more than pattern matching. It’s Google’s ability to make almost all of its software anonymous to the user. The front page of Google has barely changed since its inception. Why? Because they handle all the complexity for the user. Pattern matching is just one trick we can employ.

    Personally, I tried digging deep into the concept of “rubber-stamp pattern matching” but I wasn’t smart enough to understand how to use more than 1% of K.S. Fu’s stuff. Maybe some time down the road I’ll have a use for all those books and papers by him I binge purchased. 🙂

    The specification vs. structure thing is something you see in geometry. What is a geoboard but a tool for rearranging the structure of a geometry while keeping its specification intact? To paraphrase a challenge a teacher gave me in my early youth when I was first introduced to geometry: “Here are the rules for putting the rubber bands on the geoboard. See what you can do structurally.”

  11. “Strings (or the morally equivalent ASTs/DOMs parsing them) are not a useful way to represent complex information artifacts.”

    True enough, but it sure is a lot easier to diff them. Change control & review is an absolute requirement in practice.

    [Good point Tim, and I agree about the importance of change control. It is so important that it shouldn’t be based on the lossy medium of strings. It is easy to diff strings, but the results are unreliable and hard to interpret. Diff is just a heuristic, and one that assumes changes happen only in units of entire lines. Wouldn’t it be better to capture change information at the source, in the IDE, so you get information about WHY things changed, not just HOW? For example, a rename refactoring could be recorded as just that, rather than as a set of edits to 100 files. I see changes as just another kind of transformation, one that happens through the dimension of time instead of space. – Jonathan]

  12. To what extent does this article have to do with this [1] idea of AST-focused development? I could do some compare/contrast myself, but I think it would be more interesting if you (Jonathan Edwards) were to do it.

    [1] http://blogs.tedneward.com/2008/02/18/Modular+Toolchains.aspx

    [Luke – my challenge is to customize generated artifacts. Modular compiler back-ends help the compiler builder construct a fancy generator from smaller generators. There is a mention of aspect-weaving in the toolchain, but aspects are really a declarative way of customizing the source – the aspect writer knows nothing about the internal models generated in the back end. — Jonathan]

  13. Hate to jump in on the “this sounds like…” train, but interpretable/compilable transformations and alternate views of the underlying model are exactly what Simonyi’s “Intentional Programming,” which is now in private beta, is about. For a lecture with a good demonstration, see http://langnetsymposium.com/talks/Videos/3-09 (Ivarson).

    [Larry – they haven’t revealed much detail, but I assume that writing one of their views is extremely complicated, and requires intimate knowledge of a large API (like building a COM embeddable and linkable document type). I also believe IntentSoft is fundamentally monolithic: all information is stored in a single source artifact, requiring all customizations of generated views to be mapped back into the source. That raises all the issues of round-trip engineering. So I believe they would find it hard to solve my challenge. – Jonathan]

  14. This is all fine for the simplest situations, but in real life you have more subtle changes, like changes to data type, changes to the domain of values, changes to business rules, special cases… These impact all the transformations differently. Even if you have a nice transformation language – which would be a cool thing – it only rearranges the deck chairs.

    As for strings, the strength of strings is that we’ve been using them for thousands of years, and our semantics are tailored to them rather than to more precise digital forms. This is particularly relevant in domains with variable precision (dates, numbers, measurements). This influences our thinking (try looking at date precision in practice, and compare the Gregorian and Iranian calendars). In the end, to humans, all data is sequences of characters with assigned patterns of interpretation, and I think this is where software should head.

  15. Jonathan,

    have you had a look at the Magritte framework used in some Seaside web applications?
    http://www.lukas-renggli.ch/smalltalk/magritte

    Basically it works by attaching meta-descriptions to data objects, e.g. for phone it would say “I’m an alpha-numeric string of at most a dozen characters, that should be valid as a phone number as specified by this regexp or this parser, and I should appear near mobile, fax numbers or email addresses”. For the address field it would say “I’m a multi-line text string and I appear last…” and so on.

    When the data object must be displayed, the system gathers descriptions for all fields, and generates the form in the desired format (HTML or native widgets).
    Descriptions are objects, so they know how to instantiate the correct UI for the data they were gathered from.
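
    A rough Java analogue of that mechanism (hypothetical names; Magritte’s actual Smalltalk API differs) might look like:

    // A meta-description attached to a data field; the form is generated
    // by traversing the descriptions rather than by hand-writing markup.
    record FieldDescription(String name, boolean multiLine) {
      String render(String value) {
        return multiLine
            ? name + ": <textarea name=\"" + name + "\" rows=\"3\">" + value + "</textarea><p>"
            : name + ": <input name=\"" + name + "\" type=\"text\" value=\"" + value + "\"/><p>";
      }
    }

    class DescribedForm {
      // Iteration order matters, so pass an ordered map (e.g. a LinkedHashMap).
      static String generate(java.util.Map<FieldDescription, String> fields) {
        StringBuilder html = new StringBuilder("<form method=\"post\" action=\"\">");
        fields.forEach((desc, value) -> html.append(desc.render(value)));
        return html.append("</form>").toString();
      }
    }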

    Hope it helps…

    —————————
    Thanks, Damien. Magritte is the most dynamic monolithic-model approach I have seen, and I will probably use it as an exemplar of such in my paper. As I understand it, the entire system is encoded into a single model of entities with attributes (which can be methods). Everything is generated at runtime off of this model by traversal and method execution. This works quite well if the generated forms list all the fields in the same order as they are defined in the model. Doing the kind of form customizations I am calling for will be quite awkward in this approach. For example, moving a form field from one place in the form hierarchy to another would require a custom rendering method in the field that was hard-wired with knowledge of the form’s tree structure, and would need to be manually recoded on many structural changes to the form. Magritte makes big claims about end-users being able to customize the attributes to effect most needed design changes, but that does not appear to be the case for what is in my experience a typical customization.

  16. Are you familiar with Conal Elliott’s recent work? He is approaching some of the same problems you are interested in from an angle that bears directly on the modular transformation challenge. Googling should lead you directly to a presentation he made at Google and to relevant entries on the Haskell wiki.

    [Thanks for the tip. I read his Tangible Functional Programming paper. I’ll check up on his latest work. – Jonathan]

  17. There are different levels to an application, each of which adds information to the level below. For example, a database defines field types and maximum lengths, and a form includes that, and adds layout. In OO, somewhere in-between the database and the UI we have one or more classes that add validation and business rules.

    It is more important for the application to be able to re-use information than it is to be able to generate a particular layer of the application based on that information. (Generation is just one kind of re-use.) You could generate a default form with the right fields, but that default layout is not as valuable as reducing duplication of information – such as field types, maximum lengths, etc.

    Of course, even if you avoid all duplication of information, you still need to map between layers, and this is a form of duplication too. Hence the challenge of adding new fields and renaming fields.

    Adding fields is not really a practical concern. I say this because each new field needs to be touched at all levels of the application anyway. The UI adds layout, the OO class adds validation. As long as the default amount of code is kept to a minimum (possibly zero, but not necessarily), that is ok in my book. I have succeeded in doing this for forms, but I still have to manually touch the form to add and position the field. Zero user-code (using drag-and-drop), but not zero effort. In fact, occasionally I do not want to display a new field on the UI, so this default “opt-in” suits me fine.

    Renaming fields is easy, as long as you give every field a unique key. Usually, we just use the names of things as the key, which is why renaming is hard. In the absence of some abstraction technique (who wants to deal with keys!?), using keys in this way may add more artificial complexity than it is worth. The half-way-decent alternative I have used in the past is “aliases” – a field can have a “real name”, and also an “alias”. Thus, a rename consists of adding an alias with the old name, and then renaming the field.
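
    A minimal sketch of that alias scheme (names hypothetical):

    import java.util.HashMap;
    import java.util.Map;

    // A rename adds an alias from the old name to the new one, so existing
    // references keep working instead of breaking.
    class AliasTable {
      private final Map<String, String> aliasToReal = new HashMap<>();

      void rename(String oldName, String newName) {
        aliasToReal.put(oldName, newName); // the old name becomes an alias
      }

      // Follow the alias chain to the current real name. (A production
      // version would guard against cycles from renaming back and forth.)
      String resolve(String name) {
        while (aliasToReal.containsKey(name)) name = aliasToReal.get(name);
        return name;
      }
    }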

    You can see screenshots of my production-application attempts at UI re-usability near the bottom of http://blog.perfectapi.com/2008/02/introducing-egg-ui-pattern.html

  17. A very interesting post. I chanced across your blog when searching for posts about “beautiful code” (which I am reading now … ). I must say that this problem of dealing with change in layout generators is a pretty intriguing one.
    At work, we have a homegrown solution for layout generation that allows us to bind Java objects to specific UIs, and the change control we use is based on View classes per feature (a.k.a. single screens), which are singular points of change for any change in the backend DS that represents them.

    @Steve Campbell: I have to disagree with your point that “adding fields is not really a practical concern”. In my case, we support some 10/20 fields per feature per release (don’t ask why :)), and adding fields becomes a very critical aspect of managing change. Our solution is to offload layout to the UI framework, by making the UI layout generator aware of the type of data it is trying to render.
    For example, when displaying enums, it would choose a combo, and set its size to the max length + some buffer, but it will also look for any other fields in the vicinity that it can position next to the shortened combo.

    Another way to allow for control is to define layout XMLs that force specific layouts on the UI generator (cutting out the UI layout engine), which implies two points of change for a new field:
    1. the layout XML
    2. the ORB (Object Relation Binder) – which provides the DS schema + type info + relationships between objects.

    [Shivanand – You say “10/20 fields per feature per release”. That is the kind of evidence I need to make my case. Do you know of any published articles or papers that report data on the amount of such changes? – Jonathan]

  19. @Shivanand

    You say, “Don’t ask why [we support 10-20 fields per feature per release]”, yet I’m actually very curious. Be careful with adding too much complexity to your system, even if it has Enterprise buzz-words describing it. In my experience, enterprises will ask for more than they can legitimately handle. Part of the responsibility of the developer is not just to listen to what the users’ requests are, but to think about how those requests challenge integration. Sometimes the best solution is to take the user requesting the feature aside and make them think about the consequences their actions have on every other user of the system.

    Enterprises love complex reporting, and so they naturally assume “adding fields and recording more data == more complex reports”. When that guy or gal from the marketing department tells you that they want all re-sellers to have to fill out a 20-page form to become a re-seller, you have to take them by the hand and politely tell them they need to slash 17 pages from the data entry form. Finding out more about your re-sellers isn’t nearly as important as having re-sellers and access to more markets. In my experience, the big problem is nobody gets reporting right. That is why there are so many reporting solutions out there. Data mining researchers/statistical analysts/“quants” will also tell you that they don’t need more data, but instead need easier ways to query data and discover statistical relationships between two or more variables. To punch home the point: every so often someone from the Sloan School publishes a report saying enterprises are overrun with data and their biggest concern is finally making use of it.

    Also, I don’t recommend trying to “place every pixel” at the UI layout generator stage. Instead, use what I call Policy-Based Layouts at the GUI toolkit/framework level. The UI generator should not be responsible for placing every pixel; it should only be responsible for semantic groupings and relative positioning. It’s a trade-off, really. The downside to following my recommendation is that you need to re-implement (as in port) policy-based layouts for each GUI framework you support. The upside is that your UI generator is much simpler and can focus on declarative semantics. In this way, you get the best of what Steve Campbell is referring to about adding new fields being trivial, but you also avoid his problem of manually touching the forms to add and *position* the field.
