Version Control for Structure Editing

That’s the headline for my latest project (with Tomas Petricek), presented at HATRA. [paper] [recorded talk]

With this work I am finally confronting the demon cursing my work: version control. If we are to liberate programming from text, we need to make structure editing work, including version control. After all, there will be limited benefit from structure editing if collaboration forces us to return to text editing. It feels like this chronic problem has been dragging me down forever. I’ve spent endless effort trying to handle differencing and merging with embedded unique IDs, but I never got it fully worked out, and it remained technical “dark matter” clogging all my projects. I finally had to accept that the whole approach wasn’t working, and start over.

I’ve turned to the idea of Operational Transformation (OT), which is how Google Docs works. But OT does synchronization, not version control, so I’ve repurposed and generalized it to support long-lived variants with difference inspection and manual merging. Surprisingly, it seems no one has done this before, and I had to invent a new theory.
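For a flavor of the machinery being repurposed, here is a minimal sketch (in TypeScript, my own illustration rather than anything from the paper) of the classic OT transform for concurrent single-character edits:

```typescript
// A minimal sketch of classic Operational Transformation over plain text,
// NOT the generalized structure-editing version described above.
// Each operation inserts or deletes one character at a position.

type Op =
  | { kind: "insert"; pos: number; char: string }
  | { kind: "delete"; pos: number };

// transform(a, b): rewrite `a` so it can be applied after concurrent `b`
// while preserving its original intent. (Tie-breaking and duplicate
// deletes are glossed over in this sketch.)
function transform(a: Op, b: Op): Op {
  const shift =
    b.kind === "insert"
      ? a.pos >= b.pos ? 1 : 0   // b's insert pushes later positions right
      : a.pos > b.pos ? -1 : 0;  // b's delete pulls later positions left
  if (a.kind === "insert") {
    return { kind: "insert", pos: a.pos + shift, char: a.char };
  }
  return { kind: "delete", pos: a.pos + shift };
}

// Two users edit "cat" concurrently: A appends "s" (-> "cats"),
// B deletes the "c" (-> "at"). Transforming A's op against B's moves
// the insertion from position 3 to 2, so both replicas converge on "ats".
const a: Op = { kind: "insert", pos: 3, char: "s" };
const b: Op = { kind: "delete", pos: 0 };
console.log(transform(a, b)); // { kind: "insert", pos: 2, char: "s" }
```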

The result is “high-level” version control. By observing the history of edits we can do much more intelligent version control. For example, a rename refactoring is a single atomic operation that does not conflict with any other, whereas in textual version control it conflicts with every change to a line containing an occurrence of the name. There is also a big opportunity to simplify the conceptual model of version control, which is frankly an utter disaster in git. Perhaps version control is actually the weak point of the textual edifice, where we might be able to win a battle.
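To make the contrast concrete, here is a toy encoding (again TypeScript, my own illustration and not the paper's model) in which the history records operations on declarations rather than line diffs, so a rename never collides with other changes:

```typescript
// A toy model of "high-level" version control: the history records
// structural operations, not line diffs. Purely illustrative.

type Edit =
  | { kind: "rename"; from: string; to: string }          // rename a declaration everywhere
  | { kind: "editBody"; decl: string; newBody: string };  // replace one declaration's body

// Two concurrent edits conflict only when both rewrite the body of the
// same declaration; a rename is atomic and merges cleanly with anything.
function conflicts(a: Edit, b: Edit): boolean {
  return a.kind === "editBody" && b.kind === "editBody" && a.decl === b.decl;
}

// One branch renames fetchUser while another edits a caller that mentions it.
const rename: Edit = { kind: "rename", from: "fetchUser", to: "loadUser" };
const caller: Edit = { kind: "editBody", decl: "showProfile", newBody: "return fetchUser(id).name" };

console.log(conflicts(rename, caller)); // false: merges cleanly, whereas a
// line-based merge would flag every changed line containing "fetchUser".
```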

The paper situates our work as reviving the “image-based” programming model of Smalltalk and LISP machines and other beloved systems like HyperCard. Images combined all code and data into harmonious and habitable programming worlds. But the difficulty of collaboration and distribution was a major blocker for image-based systems. Because images combine code and data, we must also deal with schema change, which we discuss in the paper. We see schema change as another overlooked benefit of structure editing that could help incentivize the move away from text.

We have a long way to go towards that vision, but at least this paper is a concrete step. I haven’t published a paper in ages. Thanks to Tomas for helping me reboot.

I wrote a paper

My paper with Tomas Petricek was accepted at HATRA. I hope this collaboration with Tomas marks the end of my epic researcher’s block!

HATRA is a workshop encouraging early-stage work, and hence does not even publish proceedings. Yet the reviews were some of the most attentive and helpful I’ve ever gotten. There was plenty of criticism, but the reviewers all earnestly engaged with the ideas in the paper. My compliments to the organizers and the Program Committee.


Open source is stifling progress

My previous post lamented the Great Software Stagnation. We could blame technology lock-in effects (the QWERTY syndrome). We could also blame civilization-wide decadence: the Great Stagnation that the title alludes to. But a big part of the blame is something completely unique to software: the open source movement.

Open source is the ideology that all software should be free. This belief is unprecedented in the history of technology. It seems to be related to the fact that software is a form of abstract information. A lot of people seem to think music and movies should also be free, but not too many musicians or movie-makers agree. It is bizarre that open source has been promoted largely by software creators themselves.

Not much user-facing software is open source. But it has almost completely taken over software development tools. Building things for ourselves and other programmers tickles our nerd sensibilities. Nerd cred is a form of social status we have a shot at. Open source has certainly enabled a lot of software startups to get rich quickly building (closed source) software. But it also killed the market for development of better software tools. There used to be a cottage industry of small software tool vendors offering compilers, libraries, editors, even UI widgets. You can’t compete with free. You can’t even eat ramen on free. You get what you incent, and open source disincentivizes progress in software tools.

Open source strongly favors maintenance and incremental improvement on a stable base. It also encourages cloning and porting. So we get an endless supply of slightly-different programming languages encouraging people to port everything over. It’s a hobby programming club that reproduces the same old crap with a slightly different style. Inventing really new ideas and building really new things is *hard*, with many trials and errors, and requires a small dedicated cohesive team. Invention can’t be crowdsourced, and it can’t be done on nights and weekends. So the only progress we get is the table scraps of the MegaTechCorps.

Open source and Unix and the internet boom are all wrapped up together. They took over around the same time, 1996, and I blame them for the lack of much progress since.

There are signs that open source is changing. Nadia Eghbal has shone a spotlight on the dark side of open source. There are attempts to convert it to a more sustainable charity model. However I believe the more fundamental change is the imposition of Codes of Conduct, which are trying to change the social norms of open source. Will it still function if it is no longer a private club for autism spectrum guys? Open source as we know it is over, for better or worse.

[See https://faircode.io/]

The Great Software Stagnation

Software is eating the world. But progress in software technology itself largely stalled around 1996. Here’s what we had then, in chronological order:

LISP, Algol, BASIC, APL, Unix, C, SQL, Oracle, Smalltalk, Windows, C++, LabVIEW, HyperCard, Mathematica, Haskell, WWW, Python, Mosaic, Java, JavaScript, Ruby, Flash, Postgres.

Since 1996 we’ve gotten:

IntelliJ, Eclipse, ASP, Spring, Rails, Scala, AWS, Clojure, Heroku, V8, Go, Rust, React, Docker, Kubernetes, Wasm.

All of these latter technologies are useful incremental improvements on top of the foundational technologies that came before. For example, Rails was a great improvement in web application productivity, achieved by gluing together a bunch of existing technologies in a nicely structured way. But it didn’t invent anything fundamentally new. Likewise V8 made new applications possible by speeding up JavaScript, extending techniques invented in Smalltalk and Java. Yes, there is localized progress – for example ownership types were invented in 1998 and popularized in Rust. But since 1996 almost everything has been clever repackaging and re-engineering of prior inventions. Or adding leaky layers to partially paper over problems below. Nothing is obsoleted, and the teetering stack grows ever higher. Yes, there has been progress, but it is localized and tame. We seem to have lost the nerve to upset the status quo. (Except Machine Learning, which has made real progress, but is also arguably an entirely different kind of software. I am talking here about human programming.)

Those of us who worked in the ’70s through the ’90s surfed endless waves of revolutionary changes. It felt like that was the nature of software, to be continually disrupted by new platforms and paradigms. And then it stopped. It’s as if we hit a wall in 1996. What the hell happened in 1996? I think what happened was the internet boom. Suddenly, for the first time ever, programmers could get rich quick. The smart, ambitious people flooded into Silicon Valley. But you can’t do research at a startup (I have the scars from trying). New technology takes a long time and is very risky. The sound business plan is to lever up with VC money, throw it at elite programmers who can wrangle the crappy current tech, then cash out. There is no room for technology invention in startups.

Today only megacorps like Google/Facebook/Amazon/Microsoft have the money and time horizons to create new technology. But they only seem to be interested in solving their own problems in the least disruptive way possible.

Don’t look to Computer Science for help. First of all, most of our software technology was built in companies (or corporate labs) outside of academic Computer Science. Secondly, Computer Science strongly disincentivizes risky long-range research. That’s not how you get tenure.

The risk-aversion and hyper-professionalization of Computer Science is part of a larger worrisome trend throughout Science and indeed all of Western Civilization that is the subject of much recent discussion (see The Great Stagnation, Progress Studies, It’s Time to Build). Ironically, a number of highly successful software entrepreneurs are involved in this movement, and are quite proud of the progress wrought from commercialization of the internet, yet seem oblivious to the stagnation and rot within software itself.

But maybe I’m imagining things. Maybe the reason progress stopped in 1996 is that we invented everything. Maybe there are no more radical breakthroughs possible, and all that’s left is to tinker around the edges. This is as good as it gets: a 50 year old OS, 30 year old text editors, and 25 year old languages. Bullshit. No technology has ever been permanent. We’ve just lost the will to improve.

[The discussion continues at Open source is stifling progress.]

[Jan 5: Deleted then reinstated by popular demand. The argument could certainly be improved. I should at least note that there has been substantial technical progress in scaling web performance to meet the needs of the MegaTechCorps and Unicorns. The problem is that much else has languished, or actually worsened. For example small-scale business applications as were built with Visual Basic, and small-scale creative work as with Flash, and small-scale personal applications as with HyperCard. This trend is alarmingly similar to the increasing inequality throughout our society.]

Update, interrupted

My pandemic project has been to get down to solving the hard research problems needed to make Subtext real. I started with the Update Problem, which is at the heart of the imperative vs. functional programming dilemma. My last run at the problem was in Two-way Dataflow. I now have a new approach and have prototyped enough of it to believe it works. It offers restricted, “hygienic” forms of writing through aliased pointers and of triggering callback cascades, which avoid many of the usual pitfalls of imperative programming (and equally of their emulation in monads and effect handlers). The big win is updatable views, which are cleaner and more compositional than the zoo of state management frameworks engendered by reactive programming architectures.
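“Updatable views” is doing a lot of work in that sentence. One common way to make a derived view writable is a lens with a get/put pair; this TypeScript sketch illustrates only the general idea, not Subtext’s actual design:

```typescript
// A minimal lens: a view you can read from and write back through.
// Generic illustration of "updatable views", not Subtext's design.
interface Lens<S, V> {
  get(source: S): V;           // derive the view from the source
  put(source: S, view: V): S;  // push an edited view back into the source
}

interface User { name: string; email: string }

// A view exposing just the user's name.
const nameLens: Lens<User, string> = {
  get: (u) => u.name,
  put: (u, name) => ({ ...u, name }),
};

const alice: User = { name: "Alice", email: "alice@example.com" };
const updated = nameLens.put(alice, "Alicia");
console.log(nameLens.get(updated)); // "Alicia" (email preserved)
```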

Unfortunately I’ve concluded that I can’t publish these ideas in their current state. Neither practitioners nor academics will consider such radical ideas without proof that they work in practice on realistic cases. Small contrived examples don’t cut it. TodoMVC doesn’t cut it. They’re right: extraordinary claims require extraordinary proof. I need to build out a fully working programming system creating credible applications. Research, like everything else, requires proof of work. And I don’t have graduate students to do it.

I’m just not ready to do that work now, because there is another fundamental problem that needs to be solved first: the Edit Problem. This is really a cluster of problems related to making structured editing beneficial enough to displace text editing. Actually this is a more important problem: not everyone has to update data, but everyone has to edit code.

I guess the point of this post is to help me work through the research grieving process. I am very disappointed to find myself yet again believing that I have a new solution to an important problem yet unable to communicate it to anyone. Add it to the stack and move on.

[update] Here lies Subtext 10