Evaluating programming systems design

I collaborated on this paper at PPIG19. They haven’t published the proceedings yet, so I’ve put the paper up here. Abstract:

Research on programming systems design needs to consider a wide range of aspects in their full complexity. This includes user interaction, implementation, interoperability but also the sustainability of its ecosystem and wider societal impact. Established methods of evaluation, such as formal proofs or user studies, impose a reductionist view that makes it difficult to see programming systems in their full complexity and, consequently, force researchers to adopt simplistic perspectives.

This paper asks whether we can create more amenable methods of evaluation derived from existing informal practices such as multimedia essays, demos, and interactive tutorials. These popular forms incorporate recorded or scaffolded interaction, often embedded in a text that guides the reader. Can we augment such forms with structure and guidelines to obtain methods of evaluation suitable for peer review? We do not answer this question, but merely seek to identify some of the problems and instigate a community discussion. In that spirit we propose to hold a panel session at the conference.