Offload Mental Simulation

In this write-up I want to talk about an idea that I believe is an absolutely essential feature, the bare minimum, for any future programming system. In fact this feature must be built into the design from the start.

The short version of this idea is "offload mental simulation onto the computer". This is not a novel idea. It has been said many times before, but it is one of my favorite topics and bears repeating. The long version is below.

Programming carries mental weight.

Imagine you are reading a piece of code. Typically the code is some text. When you look at this text, what do you see? You see variables that have names, but they don't have values. You see many lines of code, any of which may participate in the execution. When you try to understand this piece of code, what do you do? What do we do? How do we make sense of this homogeneous blob of text? Let's think about it for a moment.

A screenful of code

Often we play and replay different scenarios in our heads. In our mind we assign values to variables and think about what the computer will do. We extract relationships between variables. We extract relationships between blocks of code. For example, "if this section runs, then this other section always runs", or "if this runs then that doesn't run", or "this section up here must run before this other section runs".

Sometimes when doing this mental simulation, we think of a single value for a variable ("a corner case, how clever of me!") and then we proceed to pretend to be the computer. Line by line, expression by expression, we trace the execution flow. We trace the state. We fill in the blanks. "I am the CPU, the memory, the compiler and the libraries - I am the executor of all code I see" is the unspoken mantra of the programmer at work.

Other times we may think up a range of values, as in "What if X is an integer between 1 and 255?" and then try to get an idea of program behavior.

There are a lot of what-ifs we keep track of. Then there are what-ifs within those what-ifs. There are also "what-if this happens, does it always mean that cannot happen?"

All of this thinking carries incredible, incredible mental weight. We have many ways to deal with this. One is just practice, practice, practice – read more code, replay more scenarios and over time we learn how to make better guesses at those values, learn where in the code to focus our mental cycles. We may also learn how to write code structured in a way that is a bit easier to follow.

But what if we could lift a lot of this weight off of our minds and offload it to the computer?

A computer is often readily available in this situation – it's right there in front of us! The "offloading" must be easily accessible. This means it must be easy to just assign a value to a variable and see how it affects other variables... see how it affects the execution trace. When I say easy I mean really, really easy. Like "type a number into a spreadsheet cell" easy. In fact, it should be just as easy to do partial and somewhat abstract tracing. For example, I ask "what if X here is a value between 1 and 10?" and then the computer does its thing and gives me some sense of what would happen. If there are code paths that would definitely not be taken, they get grayed out. Value ranges appear on other variables. I then proceed to refine those ranges (explore the what-ifs within what-ifs) or click on the grayed out code and see the point where the code is side-stepped (explore the why-not-this?).
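To make the "value between 1 and 10" query concrete, here is a minimal sketch in Python of what the computer might do behind the scenes. Everything in it is an assumption made for illustration: the Interval class, the what_if function, and the two-line program being analyzed are hypothetical, not part of any existing tool. It propagates a range through one assignment and reports which branch can never run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical abstract value: every integer from lo to hi inclusive."""
    lo: int
    hi: int

    def scale(self, k: int) -> "Interval":
        # Multiply by a constant; sort the ends in case k is negative.
        a, b = self.lo * k, self.hi * k
        return Interval(min(a, b), max(a, b))

    def greater_than(self, k: int) -> str:
        # Three-valued answer: the comparison may hold always, never, or sometimes.
        if self.lo > k:
            return "always"
        if self.hi <= k:
            return "never"
        return "maybe"


def what_if(x: Interval) -> dict:
    """Abstractly run the toy program:  y = x * 2;  if y > 20: A  else: B"""
    y = x.scale(2)
    verdict = y.greater_than(20)
    return {
        "y": y,  # the value range that would appear next to y
        "then_branch": "grayed out" if verdict == "never" else "live",
        "else_branch": "grayed out" if verdict == "always" else "live",
    }


report = what_if(Interval(1, 10))
print(report["y"])            # Interval(lo=2, hi=20)
print(report["then_branch"])  # grayed out
```

With X between 1 and 10, y can be at most 20, so the branch guarded by y > 20 is reported as grayed out. Narrowing or widening the range and calling what_if again is the "what-if within a what-if" step.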

I can do more than assign speculative values though. I click on a line of code and immediately see the implication lines turn green (these always run when the selected line is run), the contradiction lines turn red (these never run when the selected line is run). I pin the selected line and refine the scenario by assigning more variable values, while I browse away to other modules, inspecting their possible state given my current conditions. I'm still asking the same questions, but the computer is doing the grunt work. At every step I tweak the scenario – the collection of all my what-ifs and conditions – and the computer shows me the map of remaining possibilities all through this back-and-forth.
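The green and red lines can be approximated crudely by brute force. The sketch below assumes a tiny hypothetical program whose lines are labeled L1 to L5, runs it over every input in a small scenario, and compares the traces. The names traced_program and classify are made up for this example, and a real system would reason symbolically rather than enumerate inputs.

```python
def traced_program(x: int) -> set:
    """Hypothetical program, instrumented to return the labels of the lines that ran."""
    ran = {"L1"}              # L1: entry
    if x > 5:
        ran.add("L2")         # L2: handle large x
        if x % 2 == 0:
            ran.add("L3")     # L3: handle large even x
    else:
        ran.add("L4")         # L4: handle small x
    ran.add("L5")             # L5: exit
    return ran


def classify(selected: str, scenario) -> tuple:
    """Return (green, red) line sets for the selected line within the scenario."""
    traces = [traced_program(x) for x in scenario]
    hits = [t for t in traces if selected in t]
    if not hits:
        return set(), set()   # the selected line never runs in this scenario
    all_lines = set().union(*traces)
    green = set.intersection(*hits) - {selected}  # run in every trace that hits
    red = all_lines - set.union(*hits)            # run in no trace that hits
    return green, red


green, red = classify("L3", range(1, 11))
print(sorted(green))  # ['L1', 'L2', 'L5']
print(sorted(red))    # ['L4']
```

Pinning a line and refining the scenario corresponds to calling classify again with a narrower set of inputs.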

I can also do interactive program slicing. I point to two variables that are far apart and ask the computer how they are related. The computer quickly inspects the possible traces and shows me a condensed summary of how one is derived from the other.
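As a rough illustration of the "how are these two variables related?" query, the sketch below assumes a straight-line program in which each variable is assigned exactly once, represented as a list of (variable, variables it is computed from) pairs. PROGRAM, ancestors, and relate are hypothetical names, and real slicing would also have to handle control flow, reassignment, and aliasing.

```python
# Hypothetical single-assignment program: (variable, variables it reads).
PROGRAM = [
    ("price", []),
    ("qty", []),
    ("tax_rate", []),
    ("subtotal", ["price", "qty"]),
    ("banner", ["qty"]),
    ("tax", ["subtotal", "tax_rate"]),
    ("total", ["subtotal", "tax"]),
]


def ancestors(deps: dict, var: str, seen: set) -> set:
    """Collect into seen every variable that var depends on, directly or not."""
    for src in deps[var]:
        if src not in seen:
            seen.add(src)
            ancestors(deps, src, seen)
    return seen


def relate(program: list, source: str, target: str) -> list:
    """Condensed summary: the assignments lying on a path from source to target."""
    deps = dict(program)
    feeds_target = ancestors(deps, target, {target})
    return [
        name
        for name, _ in program
        if name in feeds_target
        and (name == source or source in ancestors(deps, name, set()))
    ]


print(relate(PROGRAM, "price", "total"))  # ['price', 'subtotal', 'tax', 'total']
```

Variables such as qty, tax_rate, and banner are left out of the summary because they are not on any path from price to total.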

But what about...

Now there are a couple of poor approximations of this feature. One is the REPL. Another is tests.

REPLs are nice but they work well only for reasonably isolated code with few dependencies. It's hard to set up a complex object to pass into a function. It's harder still to set up an elaborate context of dependencies around that function.

Tests can set up the full context but some of it might be lies... er I mean mocks. Also, when you're reading some fresh code in your browser, do you really want to stop to configure that test harness? You might fall down a dependency hole. You might never emerge. No, it's just safer to keep reading and mentally interpreting the dead code on the screen. If we do it enough times over and over, it will work. After all, it works for everyone else. And so we sit in front of our calculators doing mental arithmetic, because the buttons don't exist and entering the numbers into the machine requires careful use of tweezers.

There is the family of Lisps, Smalltalks and friends, where the system is live and you can set variables to values whenever you wish. The problem there tends to be isolation of speculation. I don't want my what-ifs to turn into has-becomes. "What if X is 33?"... OK done and now the computer shall never forget. I want all speculation to fly in a transient layer above the existing stable codebase.
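One way to picture the transient layer is as an overlay on top of the stable state, where reads fall through to the stable values and speculative writes never reach them. The sketch below uses Python's standard collections.ChainMap for this. The stable dictionary and the what_if helper are hypothetical stand-ins for a real live system's state and query primitive.

```python
from collections import ChainMap

# Hypothetical stable state of the live system.
stable = {"x": 7, "y": 2}


def what_if(state: dict, **overrides) -> ChainMap:
    """Return a speculative view: overrides are consulted first, then the stable state."""
    return ChainMap(overrides, state)


speculative = what_if(stable, x=33)
speculative["z"] = 1          # writes land in the overlay, not in stable

print(speculative["x"] * speculative["y"])  # 66
print(stable)                               # {'x': 7, 'y': 2}
```

Dropping the speculative view discards the what-if entirely, and the stable state never learns that X was briefly 33.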

In any case, none of these systems do tracing with partial information as described above.

As mentioned above, this idea is not new. For a better list of related work, see Where's My Simulator.

How to build this

For this to work well, it must be designed from the start because it is very easy to design a system that precludes the interesting features here. If you start with a batch compiler you've already lost. You can retrofit some static analysis tools that do this work, but effectively you're duplicating language semantics outside your compiler.

If you look at the desired experience above, it looks less like running a command and more like an interaction where the programmer successively refines queries and the system immediately responds. In fact the back-and-forth user interaction would have to be designed in tandem with the language and the runtime. Primitives like what_if(x=3) and traces would need to be created. So I think we want to start with a live system that interprets the code. Further, I think such a system must be designed with strong support for abstract interpretation. The reason for this is that we'll often be evaluating the code without actual values present. The same interpreter should be able to evaluate f(x), whether x is the number 3, the type integer or the type integer between 1 and 10. In the latter two cases, the result of the evaluation will not be a value, but some kind of program trace. This is just a vague starting point and there are probably many ways to get there.
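To illustrate "the same interpreter for all three cases", here is a deliberately tiny sketch. It assumes an expression language with only variables, integer literals, and addition, and it returns a range of possible results rather than the richer program trace described above. Range, ANY_INT, lift, and evaluate are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical abstract integer: lo..hi inclusive; infinite ends mean 'any'."""
    lo: float
    hi: float


ANY_INT = Range(float("-inf"), float("inf"))  # stands for "the type integer"


def lift(v) -> Range:
    # A concrete number is treated as a range containing only itself.
    return v if isinstance(v, Range) else Range(v, v)


def evaluate(expr, env: dict) -> Range:
    """expr is a number, a variable name, or a tuple ("add", left, right)."""
    if isinstance(expr, str):
        return lift(env[expr])
    if isinstance(expr, tuple):
        op, left, right = expr
        if op != "add":
            raise ValueError(f"unsupported operation: {op}")
        a, b = evaluate(left, env), evaluate(right, env)
        return Range(a.lo + b.lo, a.hi + b.hi)
    return lift(expr)


# f(x) = x + x + 1, written as an expression tree.
F = ("add", ("add", "x", "x"), 1)

print(evaluate(F, {"x": 3}))             # Range(lo=7, hi=7)
print(evaluate(F, {"x": Range(1, 10)}))  # Range(lo=3, hi=21)
print(evaluate(F, {"x": ANY_INT}) == ANY_INT)  # True
```

The evaluator never branches on whether x is concrete. The number 3 is just the narrowest possible range, which is the property the paragraph above asks for.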

So that's it for this position piece! I leave you to ponder this quote from Turing about programming.

There need be no real danger of it ever becoming a drudge, for any processes that are quite mechanical may be turned over to the machine itself. – Alan Turing
