Substrates vision statement

Submitted to Substrates-25 Workshop

Jonathan Edwards
May 12, 2025

What is a substrate?

I define a Substrate as:

  • A complete and self-sufficient programming system,
  • with a persistent code & data store,
  • providing a direct-manipulation UI on that state.
  • Supports live programming.
  • Programming & using are on a spectrum, not distinct.
  • Conceptually unified — not a “stack”.
  • Summarized as a slogan: a WYSIWYG document, DB, & PL in one.

The canonical examples of a substrate are Smalltalk and LISP systems. HyperCard and Flash were much-beloved beginner-friendly substrates. Spreadsheets are by far the most successful substrate, and an inspiring existence proof that alternative programming experiences are possible. Webstrates construct a substrate in the web browser, though with a different definition:

“We define shareable dynamic media as collections of information substrates (or substrates for short). Substrates are software artifacts that embody content, computation and interaction, effectively blurring the distinction between documents and applications.” 

What are the benefits?

  • Building applications as documents/images provides a more consistent user experience and a simpler developer experience.
  • There is a gentle progression from user to developer.
  • Beginners can quickly build non-trivial applications, and easily become competent. The parade of no-code/low-code tools attests that this need is still unmet.
  • Less code is required because inessential impedance mismatches are dissolved.
  • The programming experience is improved by having a small set of tools working throughout the substrate, rather than specialized tools for each specialized technology in a stack.
  • Live programming and ubiquitous observability make it easier to understand, debug, and modify code.
  • The confidence of working in a human-scale world that is coherent and knowable.

What are our major research problems?

  • Can a substrate be pluralistic? Related to work on integration domains, component models, and malleability.
  • Can the stack actually be unified or does it reflect essential specializations?
  • Substrates have been built upon dynamically typed PLs. Could a statically typed PL serve instead?
  • Substrates have been built upon PLs (Smalltalk/LISP) and UIs (the browser). How about building upon a DB instead?
  • Edit calculi (see my personal research statement).
  • UI for navigating the substrate: inspector windows, outline, zoomable canvas, etc.
  • Provenance and observability.
  • Programming by Example.
  • Modes of collaboration: multiplayer, git, and beyond.
  • Substrate as an ecosystem: libraries, packages, and DLC.
  • Lifting Unix or the browser into a substrate.
  • Interoperating with the mainstream: calling/serving HTTP APIs, reading/writing standard file formats.
  • What is the role of LLMs? Will they obsolete the need for substrates?
  • What is a “killer app” that justifies substrates?
  • How do we evaluate our results? See UIST Author Guide.
  • Where do we regularly meet, publish, and present?

What do we seem to disagree about?

  • What kind of user are we targeting?
  • Should a substrate include a full programming experience?
  • Should a substrate be self-contained or can it expose underlying standard tech?

Are we even a field?

To be a field of research there should be a productive exchange of ideas. We should be citing each other as related work, and extending or critiquing each other’s ideas. History indicates that progress accelerates when there is a healthy mix of competition and collaboration. To foment that interaction it helps to have a regular meeting place.

What I’d like to see result from this workshop.

  • Identifying research problems and disagreements.
  • Defining one or more canonical examples like TodoMVC to compare substrates.
  • Publishing a report of the meeting, like the 1968 NATO SE conference.
  • Creating a Wikipedia page for Software_Substrate citing our report.
  • Planning to meet again, perhaps a one-off like Dagstuhl, or an annually recurring event.
  • Inflaming rivalries and alliances!

Personal Research Statement

[These ideas have been the subject of many discussions with Tomas Petricek.]

Who is the user? My target users are the beginners and non-technical people that embraced HyperCard, and the “power-users” of spreadsheet fame, but who need more power and generality. I want to give them the power of Smalltalk without losing the user-friendliness and “conviviality” of HyperCard and spreadsheets.

Who isn’t the user? I am not targeting users like myself, nor current professional programmers. I do not want to build a “tool for thought” for intellectuals. I would rather build a “tool for getting stuff done” by ordinary people.

I make some opinionated design choices:

  • Focus on data first. Users care more about data than code.
  • The data model unifies key features of documents, relational DBs, and PLs.
  • Static typing to benefit the programming experience (PX) and the UX, with first-class schema change for flexibility.
  • Built-in user-friendly distributed version control.

The benefit of a unified data model is to avoid the infamous impedance mismatches that create so much complexity when shuttling information between the DB, PL, and UI. My experiments have converged on a model like that of statically typed FP languages: a tree of records, sums, (homogeneously typed) lists, and atomic values, except that:

  1. Data is mutable.
  2. Every edge of the tree (record fields, sum components, list elements) has an internally generated globally unique permanent ID.
  3. There are cross-links in the tree, defined as a path of IDs from the root, subject to certain static and dynamic constraints.
  4. Type/schema change is a first-class operation. 
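
As a rough illustration of points 1 to 3, here is a minimal Python sketch. It is not Baseline's implementation: the names `Node`, `Link`, `fresh_id`, and `resolve` are hypothetical, types and the static/dynamic link constraints are omitted, and a random UUID stands in for the internally generated permanent ID.

```python
import uuid
from dataclasses import dataclass, field


def fresh_id() -> str:
    """Assumption: a random UUID stands in for a permanent unique edge ID."""
    return uuid.uuid4().hex


@dataclass
class Node:
    """A mutable tree node: a record, sum, list, or atomic value."""
    kind: str                                    # "record" | "sum" | "list" | "atom"
    value: object = None                         # payload, used by atoms only
    edges: dict = field(default_factory=dict)    # edge ID -> child Node
    labels: dict = field(default_factory=dict)   # edge ID -> display name

    def add(self, child: "Node", label: str = None) -> str:
        """Attach a child under a freshly generated edge ID and return that ID."""
        edge_id = fresh_id()
        self.edges[edge_id] = child
        if label is not None:
            self.labels[edge_id] = label
        return edge_id


@dataclass(frozen=True)
class Link:
    """A cross-link: a path of edge IDs starting at the root."""
    path: tuple


def resolve(root: Node, link: Link) -> Node:
    """Follow the link's edge IDs down from the root."""
    node = root
    for edge_id in link.path:
        node = node.edges[edge_id]
    return node


# Build root.people[0].name = "Alice" and link to the name.
root = Node("record")
people = Node("list")
people_id = root.add(people, "people")
alice = Node("record")
alice_id = people.add(alice)
name_id = alice.add(Node("atom", "Alice"), "name")
link = Link((people_id, alice_id, name_id))

# Renaming the field changes only its label; the link is unaffected.
alice.labels[name_id] = "full_name"
resolve(root, link).value = "Alice B."   # data is mutable
```

Because the link addresses edges by ID rather than by name or position, renaming a field or adding list elements does not invalidate it.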

To establish a solid theoretical foundation I am exploring an Edit Calculus. Generally speaking, an edit calculus formalizes the interaction between a user and a stateful system, where edits are operations mutating the state. This particular edit calculus originated from asking how a substrate could be statically typed. Changing a data type must simultaneously adapt all instances of that type, which DBs call schema migration. The problem is that you can’t tell how to migrate the data just by comparing the types before and after. For example, was a field moved, or was it deleted and a new field inserted? It is necessary to capture the user’s intention as they interactively edit the type. I formalize this with a set of edit operations that are surfaced in the UI to capture the intention of a type change and migrate instances accordingly.
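
The moved-versus-deleted ambiguity can be made concrete with a toy Python sketch. The edit names (`Move`, `Delete`, `Insert`) and the flat record schema are assumptions for illustration only, not the actual operations of the edit calculus. Both edit sequences below produce the same final type, yet they migrate the data differently.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    fields: list  # ordered list of (field_id, name) pairs


@dataclass(frozen=True)
class Move:
    field_id: str
    to_index: int


@dataclass(frozen=True)
class Delete:
    field_id: str


@dataclass(frozen=True)
class Insert:
    field_id: str
    name: str
    index: int
    default: object


def apply(edit, schema, instances):
    """Apply one type edit and migrate every instance in place."""
    if isinstance(edit, Move):
        entry = next(f for f in schema.fields if f[0] == edit.field_id)
        schema.fields.remove(entry)
        schema.fields.insert(edit.to_index, entry)
        # Instances are keyed by field ID, so their values follow the move.
    elif isinstance(edit, Delete):
        schema.fields = [f for f in schema.fields if f[0] != edit.field_id]
        for inst in instances:
            inst.pop(edit.field_id, None)
    elif isinstance(edit, Insert):
        schema.fields.insert(edit.index, (edit.field_id, edit.name))
        for inst in instances:
            inst[edit.field_id] = edit.default
    else:
        raise TypeError(f"unknown edit: {edit!r}")


def names(schema):
    return [name for _, name in schema.fields]


def view(schema, inst):
    """Show an instance by field name, in schema order."""
    return {name: inst[fid] for fid, name in schema.fields}


def start():
    schema = Schema([("f1", "email"), ("f2", "name")])
    return schema, [{"f1": "a@example.org", "f2": "Ada"}]


# Intention A: the user moves the existing "email" field after "name".
schema_a, data_a = start()
apply(Move("f1", 1), schema_a, data_a)

# Intention B: the user deletes "email", then inserts a new "email" field.
schema_b, data_b = start()
apply(Delete("f1"), schema_b, data_b)
apply(Insert("f3", "email", 1, ""), schema_b, data_b)

print(names(schema_a) == names(schema_b))  # the types look identical
print(view(schema_a, data_a[0]))           # email value preserved
print(view(schema_b, data_b[0]))           # email value reset to the default
```

A before/after comparison of the two schemas sees the same field names in the same order; only the recorded edits distinguish keeping the email addresses from discarding them.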

Beyond schema migration, the edit calculus also turns out to enable new modes of collaboration. I generalize the theories of Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDTs) to support collaboration like that in distributed version control systems (DVCS). An edit calculus over a data model not only defines what the edits do, but also gives rules for how an edit migrates forwards or backwards through other edits so as to preserve the user’s original intention. From these rules we can generate analogs of DVCS capabilities: diffing, reverting, merging, and cherry-picking. Unlike those of a traditional DVCS like git, these capabilities:

  1. Integrate changes to code, data, and types/schema.
  2. Operate to preserve intentions rather than concrete differences.
  3. Span multiple modes of collaboration, from traditional transactions to multi-player collab to loosely coupled version control.
  4. Function in an open world of documents, not a bounded repository.
  5. Present a coherent conceptual model abstracting from the implementation.
  6. Provide a feature-complete GUI.
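
To give a flavor of migrating an edit through another edit, here is a small Python sketch in the spirit of OT. It is a toy under stated assumptions, not the actual migration rules: `SetValue`, `MoveNode`, `forward`, and `backward` are hypothetical names, the document is a nested dict, and readable keys stand in for permanent IDs.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class SetValue:
    path: tuple      # path of (stand-in) IDs from the root
    value: object


@dataclass(frozen=True)
class MoveNode:
    src: tuple       # path of the subtree before the move
    dst: tuple       # path of the subtree after the move


def _parent(doc, path):
    """Return the dict that directly contains the last key of the path."""
    for key in path[:-1]:
        doc = doc[key]
    return doc


def apply(doc, edit):
    """Return a new document with the edit applied."""
    doc = copy.deepcopy(doc)
    if isinstance(edit, SetValue):
        _parent(doc, edit.path)[edit.path[-1]] = edit.value
    else:
        subtree = _parent(doc, edit.src).pop(edit.src[-1])
        _parent(doc, edit.dst)[edit.dst[-1]] = subtree
    return doc


def _retarget(path, old, new):
    """Rewrite the path's prefix if it lies inside the moved subtree."""
    if path[:len(old)] == old:
        return new + path[len(old):]
    return path


def forward(edit: SetValue, through: MoveNode) -> SetValue:
    """Migrate an edit made before the move so that it applies after it."""
    return SetValue(_retarget(edit.path, through.src, through.dst), edit.value)


def backward(edit: SetValue, through: MoveNode) -> SetValue:
    """Migrate an edit made after the move so that it applies before it."""
    return SetValue(_retarget(edit.path, through.dst, through.src), edit.value)


base = {"person": {"city": "Oslo", "address": {}}}
move = MoveNode(("person", "city"), ("person", "address", "city"))  # one branch
edit = SetValue(("person", "city"), "Bergen")                       # another branch

# Merge in either order: the value edit still lands on the same field.
move_then_edit = apply(apply(base, move), forward(edit, move))
edit_then_move = apply(apply(base, edit), move)
print(move_then_edit == edit_then_move)
```

The value edit expressed the intention "change this person's city"; migrating it forward through the restructuring preserves that intention even though the field now lives at a different path. A textual merge of the two branches would instead report a conflict or resurrect the old field.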

Collaboration is a relatively new feature for substrates. The classic systems focused on (and in large part invented) the personal computing experience. I am betting that the edit calculus can provide collaboration capabilities that not only match but exceed those of mainstream programming tools. Winning that bet would reframe the narrative: instead of substrates being beginner versions of “real programming” they are a new technology with unique benefits for a different audience. One example is end-user merging of document variants. Could a substrate like Notion/Airtable with end-user programming and next-gen version control be a killer app?

My research prototype is called Baseline. Progress has been reported in: Version Control for Structure Editing; Managed Copy & Paste; Operational Version Control; DB usability: as if

There are still major unsolved research problems:

  • I haven’t worked out transactional and multiplayer modes.
  • I haven’t found a clean algebra of edits on the data model comparable to relational algebra. Maybe the model is not quite right yet.
  • The migration rules currently struggle with duplication and irreversible edits. These situations clash with my intuition that the migration rules should satisfy certain symmetry properties to be correct. Something has to give.
  • I need an executable specification language for the migration rules.
  • I need to prove or property-based-test some notion of correctness for the edit calculus.
  • Data comes first but ultimately there needs to be an embedded programming language that can at least do queries. My vision is to extend the edit calculus into a full-fledged PL with novel capabilities, including my previous experiments on Subtext. The fallback is to use a conventional PL design.
  • Much of the work so far has been iterating on the UX of version control on structured data. There is much work left to do.

4 Replies to “Substrates vision statement”

  1. I just got in a rabbit hole looking at old Bret Victor videos the other day and was wondering what you were up to

  2. FWIW I agree 95% – great summary! – and, the 5% is in two related parts:

    the network should be up there with DB, PL, UI! (my own arch is divided into four parts, not three: DB=“ONN”, PL=“ONR”, UI=“ONT”, and the fourth, network=“ONP”)
    I don’t believe all the edit and schema stuff is really that big an issue!

    … so, my take is to just relax on the whole schema thing and instead:

    Imagine a global data fabric that we each own a chunk of … and where more sophisticated collaborative editing protocols are a problem for another time.

    Let’s just get the simple substrate working first.

    So meanwhile just start with “I own my stuff and declare its R/W permissions” and be happy with “last edit wins” and work forwards from that.

    1. You may be right but I think people will only try something different if it has some compelling unique benefit. Simple and easy don’t sell. Pain killers not vitamins as they say.
