Continuing the theme of problems with existing systems, I want to enumerate various problems we typically tolerate when we use plain text files as a representation medium.
Plain text provides the notion of ordered lines and characters but no other structure. I could swap the order of two sections in a plain text file, modifying the inherent structure of the text, yet have zero effect on the semantic structure (parsed representation) of the information. In effect, plain text forces us to specify a linear order where none might exist—it has no notion of an unordered collection.
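As an illustrative sketch of this point: two text files encoding the same unordered collection differ as byte streams, even though the collection they represent is identical. (The fruit list here is a made-up example.)

```python
# Two plain-text encodings of the same unordered collection
doc_a = "apple\nbanana\ncherry\n"
doc_b = "cherry\napple\nbanana\n"

assert doc_a != doc_b                            # the byte streams differ: text imposes an order
assert set(doc_a.split()) == set(doc_b.split())  # the parsed collection is identical
```

Any diff tool comparing the two files would report every line as changed, despite nothing meaningful having changed.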
Data or Visualization?
A lot of the whitespace in source code is meaningless to the computer and promptly discarded by parsers. Eliminating superfluous and redundant data is generally considered good practice—we don’t want to store visual padding spaces in the name field for a User data type, for instance—but plain text gets a free pass here.
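To make the "discarded by parsers" point concrete, here is a small sketch using Python's ast module as a stand-in for any parser: two layouts of the same statement differ as text, but parse to the same tree.

```python
import ast

# The same assignment with two different visual layouts (a made-up example)
src_a = "x = [1, 2, 3]"
src_b = "x = [ 1,\n      2,\n      3 ]"

assert src_a != src_b                                            # the files differ as text
assert ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))  # the parser discards the padding
```

The padding exists only for human presentation, yet the file stores it as if it were data.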
In fact, plain text requires us to consider both—the visual presentation aspects (line lengths, text spacing and such, dictated by the “style guides”), and semantic aspects (the program you are trying to represent)—in the same medium.
Isn’t this a gross conflation of concerns? Whether I change the presentation or the semantics of my plain-text-encoded program, I change the same file. Plain text is a strange mixture of data and visualization.
When you open a plain text file, you might see a long list of text lines, but no indication of where to start reading, or which parts encode the high-level ideas and which the low-level details. While presentation is one supposed purpose of plain text files, there are no rich exploratory features available in the encoding.
Most information we want to encode includes various kinds of links and interconnections. These could be references to other files (such as path.to.other) or references to entities in the same file (such as a file-global name). Plain text has no way to encode rich structure, so each of these links has to be stored denormalized—repeating a text representation of the target at every point of use.
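A minimal sketch of this denormalization, using a hypothetical two-file project: the link to a helper function exists only as its name spelled out repeatedly, so relocating the target means editing every file that repeats it.

```python
# Hypothetical project: the "link" to `helper` is stored as repeated text in each file
files = {
    "main.py":  "from utils import helper\nhelper()\n",
    "other.py": "from utils import helper\n",
}

# Renaming the target requires a textual search-and-replace over every file
renamed = {path: text.replace("helper", "assist") for path, text in files.items()}

assert "helper" not in renamed["main.py"]
assert "assist" in renamed["other.py"]
```

Note that this blind substring replacement is itself fragile—it would also rewrite comments, strings, or unrelated identifiers that happen to contain the name—which is exactly the hazard of links with no first-class representation.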
Convenience and Familiarity
One thing going for text is that it is very easy to generate large amounts of it from a keyboard. It is also a familiar medium—we spend years in early childhood learning to read and write text. Another convenient aspect is that a very large number of existing tools work with it, and these tools can be composed, to the degree that text’s rudimentary line- and character-based structure allows. Still, I feel the best job these tools can do is a poor one, as I wrote previously in Stuck with Plain Text.
All the misfeatures listed above have real implications for complexity in systems.
Since semantic structure is subject to parsing and interpretation, multiple readers must re-implement the same logic for extracting useful meaning: editors and compilers, for instance, parse the same files into similar tree structures. Since visualization is intertwined with semantics, version control systems cannot differentiate between purely visual changes and semantic changes. Links are stored denormalized and interpreted outside the file itself, so any relocation of the link target (e.g. renaming a function) requires a system-wide text search-and-replace operation. This problem is compounded by the fact that the text representation of a link has no standard syntax, with each language format inventing its own. And with no embedded semantic structure (besides lines), editors can provide no useful affordances for viewing, creating, and manipulating text—unless, of course, they re-implement some parser and layer some semantics on top.
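The version-control point can be demonstrated in a few lines, again using Python's ast module as a stand-in parser: a purely visual edit produces a textual diff, even though the parsed program is unchanged.

```python
import ast
import difflib

before = "def area(w, h):\n    return w*h\n"
after  = "def area(w, h):\n    return w * h\n"   # a purely visual edit (made-up example)

# A line-based version control system records a change...
diff = list(difflib.unified_diff(before.splitlines(), after.splitlines()))
assert diff

# ...even though the semantic structure is identical
assert ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))
```

A system that stored the tree rather than the text would record no change at all for this edit.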
Admittedly, some of these issues apply not just to text files but to files in general; however, it’s still useful to identify these specific misfeatures of text files.
In general, I don’t think the convenience aspect outweighs these misfeatures. I've come to the conclusion that “plain text” should only be a transient representation at most, used to communicate to the system, but quickly parsed into something richer and not used as a long term canonical representation.
Why strip the system of the rich knowledge of the semantic interconnections and deeper structures, only to repeatedly reconstruct that knowledge in transient forms, within some special programs? Why not make the system capable of storing and working with the rich structures directly? That may let us reason about the various connections more easily (even across programs), make syntactic history (rather than textual history) the default, let us build better generic tools, encourage a diversity of views into the same programs, eliminate redundant busywork (pervasive parsing), and so on.
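As a rough sketch of the workflow being proposed (an assumed design, using Python's ast module as a stand-in for a richer structured store): the parsed tree is treated as the canonical artifact, and text becomes just one view regenerated on demand.

```python
import ast

# Parse once; store the tree, not the text (assumed workflow)
tree = ast.parse("def greet(name):\n    return 'hi ' + name\n")

# A textual view is regenerated from the structure on demand (Python 3.9+)
view = ast.unparse(tree)
assert "greet" in view
```

Under this arrangement, presentation concerns like indentation live entirely in the view layer, and tools operate on the tree rather than re-parsing text.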
There are plenty of related materials and initiatives that share the theme of this write-up. I’m listing a few below that focus on replacing text specifically for programming:
Excerpts from the Subtext 1 paper by Jonathan Edwards:
The affordances offered by text – inserting and deleting characters – are meaningless on their own... A major reason that programming is so hard is that text strings are a poor representation for programs.
The user is editing a syntax tree and not a block of text. This makes syntax errors impossible — code is always syntactically correct.
With Infra, arbitrarily-complex structured data can be encoded, viewed, edited, and processed, all while remaining in an efficient non-textual form. It is suitable for the full range of information modalities, from free-form input, to compact schema-conforming structures
Steven Krouse has compiled a nice list of structured editors.