Excerpted from: Future of end-user software engineering: beyond the silos [PDF].
For example, consider “Frieda”, an office manager in charge of her department’s budget tracking. (Frieda was a participant in a set of interviews with spreadsheet users that the first author conducted. Frieda is not her real name.) Every year, the company she works for produces an updated budget tracking spreadsheet with the newest reporting requirements embedded in its structure and formulas. But this spreadsheet is not a perfect fit to the kinds of projects and sub-budgets she manages, so every year Frieda needs to change it. She does this by working with four variants of the spreadsheet at once: the one the company sent out last year (we will call it Official-lastYear), the one she derived from that one to fit her department’s needs (Dept-lastYear), the one the company sent out this year (Official-thisYear), and the one she is trying to put together for this year (Dept-thisYear).
Using these four variants, Frieda exploratively mixes reverse engineering, reuse, programming, testing, and debugging, mostly by trial-and-error. She begins this process by reminding herself how she changed last year’s spreadsheet, reverse engineering a few of the differences between Official-lastYear and Dept-lastYear. She then looks at the same portions of Official-thisYear to see if those same changes can easily be made, given her department’s current needs.
She can reuse some of these same changes this year, but copying them into Dept-thisYear is troublesome, because some of the formulas automatically adjust their references so that they point back at Dept-lastYear. She patches these up (if she notices them), then tries out some new columns or sections of Dept-thisYear to reflect her new projects. She mixes in “testing” along the way by entering some of the budget values for this year and eyeballing the values that come out, then debugs if she notices something amiss. At some point, she moves on to another set of related columns, repeating the cycle for these. Frieda has learned over the years to save some of her spreadsheet variants along the way (using a different filename for each), because she might decide that the way she did some of her changes was a bad idea, and she wants to revert to try a different way she had started before.
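To make the copy-and-paste hazard concrete, here is a minimal sketch (my own illustration, not from the interview study) of the spreadsheet convention that trips Frieda up: relative A1-style references shift when a formula is copied to a new location, while `$`-prefixed absolute references stay put.

```python
import re

COLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def shift_formula(formula, d_col, d_row):
    """Shift every relative reference like B2 by (d_col, d_row) cells.
    References with an absolute column like $A1 are left untouched.
    (Simplified: single-letter columns, no absolute rows.)"""
    def shift(m):
        ref = m.group(0)
        if ref.startswith("$"):
            return ref  # absolute reference: do not adjust
        col, row = ref[0], int(ref[1:])
        return f"{COLS[COLS.index(col) + d_col]}{row + d_row}"
    return re.sub(r"\$?[A-Z]\d+", shift, formula)

# Copying a formula one column to the right shifts its relative refs:
print(shift_formula("B2+C2", 1, 0))   # C2+D2
# ...which is exactly the silent adjustment Frieda must notice and patch.
```

The same mechanism explains the cross-workbook problem: a pasted formula keeps resolving against whatever its (possibly adjusted) references name, so a copy from Dept-lastYear can quietly keep reading last year’s workbook.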
I have a new paper out with Tomas Petricek: Interaction vs. Abstraction: Managed Copy and Paste, to appear at PAINT’22. [Demo video] I have mixed feelings about this work.
I’ve been talking about the idea ever since my first Subtext paper, and tried to build it several times, but hit many difficulties. This new theory of structure editing I’ve been working on seemed to make it possible. So I had to try it.
The idea is philosophically tantalizing. There is a long-running intellectual debate between those who believe in Logic and Formal Methods as an account of language/cognition/programming and those who reject those accounts as shallow and inadequate. Wittgenstein famously took both sides. I believe programming offers us for the first time a way to substantiate the anti-logic position with a constructive theory that is more than just counter-examples and anecdotes. Managed Copy and Paste is my primary candidate for an Informal Method that takes on the Formal Methods.
Functional abstraction is considered to be the essence of programming languages, enshrined in the holy Lambda Calculus. A key benefit of functions is to centralize change. But if we can track copies and propagate changes between them then having a centralized abstraction is more of an ideal end-state than a necessary condition. The key change of perspective is to move from a program as a static crystalline abstraction to programming as an interactive process of continual adaptation. Managed Copy and Paste subverts functional abstraction and could actually be more ergonomic in practice.
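To show what “tracking copies and propagating changes” might mean mechanically, here is a toy sketch of the idea in Python. This is my own illustration under simplifying assumptions (blocks are lists of lines, same length, direct copies only), not the paper’s implementation: the editor records provenance at paste time, then later pushes source edits into copies wherever the copy itself has not diverged.

```python
class Doc:
    """Toy document whose editor remembers where pasted blocks came from."""

    def __init__(self):
        self.blocks = {}   # block id -> list of lines
        self.links = []    # (source id, copy id, snapshot of source at paste time)

    def paste(self, src, copy):
        """Copy block `src` to a new block `copy`, recording provenance."""
        self.blocks[copy] = list(self.blocks[src])
        self.links.append((src, copy, list(self.blocks[src])))

    def propagate(self):
        """Push source edits into copies, except on lines where the copy
        has itself diverged from the pasted snapshot (a crude 3-way merge)."""
        for src, copy, base in self.links:
            for i, base_line in enumerate(base):
                src_line = self.blocks[src][i]
                if src_line != base_line and self.blocks[copy][i] == base_line:
                    self.blocks[copy][i] = src_line

doc = Doc()
doc.blocks["f"] = ["def f(x):", "    return x + 1"]
doc.paste("f", "g")
doc.blocks["g"][0] = "def g(x):"        # the copy diverges on line 0
doc.blocks["f"][1] = "    return x + 2" # the source is edited on line 1
doc.propagate()                         # line 1 flows into g; line 0 is kept
```

The point of the sketch: nothing was abstracted into a shared function, yet the change still reached both places. Centralizing into one abstraction becomes an optional end-state rather than a prerequisite.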
The good news is that it works, and my editing theory handles tricky cases like copying copies. Unfortunately I’m not sure it turns out to actually be more ergonomic. At the very least it needs a lot more UI work. At the moment I’m afraid it isn’t a slam-dunk win, and it needs to be a slam dunk to get anyone to seriously consider such a radical change. Bottom line, I love the idea and its philosophical implications, but in practice it may be more of a luxury than a must-have. A vitamin, not a painkiller.
But what do I know? Tomas convinced me to write this paper to at least put the idea out there and see what happens. What do you think?
My research goal, in one tweet. I distilled a lot of thought into this one, so I’m lifting it up here.
I was recently asked to state my research goal, and this is what I came up with. Suggestions for clarification welcome.
Make simple things easy for amateur programmers.
Where simple things = small scale ad hoc internet applications:
< 100 users
< 100 MB data
< 1000 LOC (or equivalent)
Where easy = total documentation of the entire stack < 1000 pages, written to the comprehension level of a non-STEM undergraduate, while still providing general-purpose programming within some domain (like spreadsheets, but unlike many no-code systems).
Where amateur = someone who wants to invest the minimum effort in learning technical details to make something work. It is not a goal to appeal to professional programmers nor train them. But perhaps the extreme simplification required for amateur programming will suggest ways to simplify industrial-scale programming too. For example SQL started as an end user tool.
I believe these goals cannot be achieved by remixing our standard bag of tricks. If that were possible it would already have been done. We need fundamentally new ideas, or perhaps discarded old ones.
That’s the headline for my latest project (with Tomas Petricek), presented at HATRA. [paper] [recorded talk]
With this work I am finally confronting the demon cursing my work: version control. If we are to liberate programming from text we need to make structure editing work, including version control. After all, there will be limited benefit from structure editing if collaboration forces us to return to text editing. It feels like this chronic problem has been dragging me down forever. I’ve spent endless effort trying to handle differencing and merging with embedded unique IDs, but I never got it fully worked out, and it remained as technical “dark matter” clogging all my projects. I finally had to accept that the whole approach wasn’t working, and start over.
I’ve turned to the idea of Operational Transformation (OT), which is how Google Docs works. But OT does synchronization, not version control, so I’ve repurposed and generalized it to support long-lived variants with difference inspection and manual merging. Surprisingly, it seems no one has done this before, and I had to invent a new theory.
The result is “high-level” version control. By observing the history of edits we can do much more intelligent version control. For example a rename refactoring is a single atomic operation that does not conflict with any other, whereas in textual VC it conflicts with every change on the same line as an instance of the name. There is also a big opportunity to simplify the conceptual model of version control, which is frankly an utter disaster in git. Perhaps version control is actually the weak point of the textual edifice, where we might be able to win a battle.
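The rename example can be made concrete with a toy sketch (my own illustration, not the paper’s formalism). If edits are recorded as operations on structure-editor node ids rather than as text diffs, conflict detection can compare what the operations touch, and a rename commutes with any edit to a different aspect of the program, even one on the same source line.

```python
from dataclasses import dataclass

# Two hypothetical operation kinds in an operation-log version control:
@dataclass
class Rename:
    node: str       # structure-editor node id
    new_name: str

@dataclass
class SetBody:
    node: str
    new_body: str

def conflicts(a, b):
    """Toy rule: operations conflict only when they make the same kind
    of change to the same node. A line-based diff, by contrast, would
    flag any two edits touching the same line."""
    return type(a) is type(b) and a.node == b.node

rename = Rename("fn42", "total")        # one user renames a definition
edit = SetBody("fn42", "x + y + z")     # another edits its body
assert not conflicts(rename, edit)      # merges cleanly at the operation level
```

Under this view the rename is one atomic operation, applied everywhere the name occurs, so none of its textual occurrences can collide with unrelated same-line edits.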
The paper situates our work as reviving the “image-based” programming model of Smalltalk and LISP machines and other beloved systems like HyperCard. Images combined all code and data into harmonious and habitable programming worlds. But the difficulties of collaboration and distribution were major blockers for image-based systems. Because images combine code and data, we must also deal with schema change, which we discuss in the paper. We see schema change as another overlooked benefit of structure editing, one that could help incentivize the move from text.
We have a long way to go towards that vision, but at least this paper is a concrete step. I haven’t published a paper in ages. Thanks to Tomas for helping me reboot.