Schema Evolution in Interactive Programming Systems

New paper

Many improvements to programming have come from shortening feedback loops, for example with Integrated Development Environments, Unit Testing, Live Programming, and Distributed Version Control. A barrier to feedback that deserves greater attention is Schema Evolution. When requirements on the shape of data change, existing data must be migrated into the new shape, and existing code must be modified to suit. Currently these adaptations are often performed manually or with ad hoc scripts. Manual schema evolution not only delays feedback but, because it occurs outside the purview of version control tools, also interrupts collaboration.

Schema evolution has long been studied in databases. We observe that the problem also occurs in non-database contexts that have been less studied. We present a suite of challenge problems exemplifying this range of contexts, including traditional database programming as well as live front-end programming, model-driven development, and collaboration in computational documents. We systematize these various contexts by defining a set of layers and dimensions of schema evolution.

We offer these challenge problems to ground future research on the general problem of schema evolution in interactive programming systems and to serve as a basis for evaluating the results of that research. We hope that better support for schema evolution will make programming more live and collaboration more fluid.
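To make the core problem concrete, here is a minimal sketch of the kind of ad hoc migration script the abstract describes. It is not taken from the paper: the record shape, the field names (`name`, `first_name`, `last_name`), and the function names are all hypothetical, chosen only to show that a change in the shape of data forces both a data migration and a code change.

```python
# Hypothetical example: a schema change splits `name` into two fields.
# Nothing here comes from the paper; it only illustrates the problem.

def migrate_v1_to_v2(record: dict) -> dict:
    """Migrate one record from the old shape to the new shape."""
    # Split on the first space; a single-word name gets an empty last name.
    first, _, last = record["name"].partition(" ")
    migrated = {k: v for k, v in record.items() if k != "name"}
    migrated["first_name"] = first
    migrated["last_name"] = last
    return migrated


def greeting_v1(record: dict) -> str:
    """Code written against the old shape."""
    return f"Hello, {record['name']}"


def greeting_v2(record: dict) -> str:
    """The same code, modified by hand to suit the new shape."""
    return f"Hello, {record['first_name']}"


old_rows = [{"id": 1, "name": "Ada Lovelace"}, {"id": 2, "name": "Plato"}]
new_rows = [migrate_v1_to_v2(row) for row in old_rows]
```

Note that the script and the edit to `greeting_v2` are separate manual steps, and a version control tool tracking the code sees only the second one.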

Operational Version Control

Abstract of a talk I just gave:

It would be useful to have version control like git but for data structures, particularly the data structures we call spreadsheets, databases, and ASTs. Operational Version Control observes changes to typed data structures by recording high-level operations performed in a GUI or API. These operations can change both values and types, with type changes inducing corresponding value changes (so-called schema migration). Version control capabilities such as differencing, merging, and reverting are constructed out of transformations on operation histories. This theory requires as input a set of rules formalizing the intuitive sense that two operations “do the same thing” in different states. Our prototype implementation presents the user with a simplified conceptual model: branches with linear append-only histories, forking by copying branches, and cherry-picking as the fundamental merge operation.
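The ideas in this abstract can be illustrated with a rough sketch. This is not the prototype's code: the operation types (`SetValue`, `RenameField`), the single `transport` rule, and the `cherry_pick` helper are all assumptions made up for illustration, and a real system would need many more rules. Branches are plain lists of operations, forking is copying a list, and cherry-picking rewrites an operation so that it "does the same thing" in the target branch's state.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class SetValue:
    """Value change: set one field of one row (hypothetical operation)."""
    row: int
    field: str
    value: object


@dataclass(frozen=True)
class RenameField:
    """Type change: rename a field in every row (hypothetical operation)."""
    old: str
    new: str


Op = Union[SetValue, RenameField]


def apply(state: list, op: Op) -> list:
    """Apply one operation to a table represented as a list of dicts."""
    if isinstance(op, SetValue):
        return [{**r, op.field: op.value} if i == op.row else r
                for i, r in enumerate(state)]
    # RenameField: the type change induces a value change in every row.
    return [{(op.new if k == op.old else k): v for k, v in r.items()}
            for r in state]


def replay(initial: list, history: list) -> list:
    """Rebuild a branch's state from its linear, append-only history."""
    state = initial
    for op in history:
        state = apply(state, op)
    return state


def transport(op: Op, past: list) -> Op:
    """Assumed rule: a SetValue follows its field through later renames."""
    for p in past:
        if (isinstance(p, RenameField) and isinstance(op, SetValue)
                and op.field == p.old):
            op = replace(op, field=p.new)
    return op


def cherry_pick(target: list, op: Op, since_fork: list) -> list:
    """Append op to target, rewritten past the ops added since the fork."""
    return target + [transport(op, since_fork)]


initial = [{"name": "Ada"}]
base = []                                             # shared history
main = list(base) + [RenameField("name", "full_name")]
feature = list(base) + [SetValue(0, "name", "Ada Lovelace")]  # fork by copy
merged = cherry_pick(main, feature[0], since_fork=main[len(base):])
```

Replaying `merged` from `initial` yields a single row whose `full_name` is `"Ada Lovelace"`: the edit made on the fork lands on the renamed field rather than recreating the old one.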