Red Squiggle, Green Squiggle

Back when I was a kid, if you typed in the following at the BASIC interpreter prompt

10 PRINK "Put another dime in the jukebox, baby"
20 GOTO 10

the machine would come back with “Syntax error in line 10” because you had misspelled “PRINT”. For years the word “syntax” was to me an adjective that meant “computer-related”. It was only after I took my first linguistics class as an undergraduate that I began to think of it in its broader sense of–well, what exactly? If I say “The boy are tall”, “Does him want him’s dinner?”, or “What who did she about?” I have committed a syntactic foul, and in linguistics class one learns to put an asterisk next to a transcription of the offending sentence. This sense of syntax is distinct from meaning or “semantics” (“The boy are tall” may be mangled, but it’s still pretty clear we’re not discussing a short boy), which in turn grades over into the contextually and culturally determined aspects of language use which we call “pragmatics” in acknowledgement of the fact that this is what tends to matter most in daily life. The syntax/semantics/pragmatics trichotomy is a useful one for discussing natural language, even if the various boundaries are blurred.

The trichotomy works in computer programming too. In fact the criteria for distinguishing between these categories are so crisp that I’ve become convinced they make more sense in software than they do in speech. If something in the source code causes the compiler to fail, that’s a syntactic violation, just like my TRS-80 said it was. If the code compiles but does not produce the correct outputs for given inputs, that is a semantic error. The proofs of algorithmic correctness in computing theory give us semantic assurances, and we need them because in practice, semantic errors manifest as mystifying behavior and consume the bulk of a programmer’s working hours. There is still another failure mode, though. Your code may compile and map inputs to outputs in the intended manner but fail to be useful in some other way. It may be too slow, or be prohibitively expensive for large inputs, or the colors in the user interface may be ugly. This is a pragmatic failing of the software in both the linguistic and the everyday sense of the word. As in formal linguistics, the pragmatics category is a bit of a catchall, but here the boundary between semantic and pragmatic failings can often be expressed crisply. For instance, the theory of algorithmic complexity–the big-O notation that helps you figure out whether your program will finish running next week or next year–is a rigorous mathematical formalism (borrowed from number theorists, no less) that concerns itself exclusively with the pragmatic realm.

Modern word processors underline spelling errors with a red squiggle and grammatical errors with a green squiggle. Similarly, modern coding environments may highlight syntax errors in red and spelling errors in green. Though the different handling of misspellings makes these appear superficially different, really it’s the same convention. Red is for the purely mechanical, embarrassing, but easily detected and fixed error. Green is for the subtler problem that you don’t have to address but maybe you should. In the natural language domain of word processing both kinds of error are down in the purely mechanical realm. (One can imagine word processors of the future adding blue squiggles to indicate non sequiturs, bad jokes, and mawkish clichés.) In the all-around easier software domain, error classes waft upwards towards computational meaning, so the red line indicates a syntactic error, and the green line a semantic one. A misspelled variable name isn’t the only kind of semantic error, but it’s the simplest kind. And crucially, the anomaly indicates where the semantics lives. If I accidentally name a function that calculates a factorial fictorial(n), I have made a typo. If I purposely name it f(n), I am needlessly obscuring my intent.
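A toy version of the green squiggle can be sketched in a few lines of Python with the standard `ast`, `builtins`, and `difflib` modules. This is only an illustration under simplifying assumptions, not how any real editor does it: the function name `check_names` is hypothetical, and the checker ignores imports, scoping, and attribute access. It flags names that are used but never defined and suggests the closest defined name.

```python
import ast
import builtins
import difflib


def check_names(source):
    """Map each undefined name in the source to its closest defined name, if any."""
    tree = ast.parse(source)

    # Collect every name the source itself defines: functions, classes,
    # function parameters, and assignment targets.
    local = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            local.add(node.name)
        elif isinstance(node, ast.arg):
            local.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            local.add(node.id)

    known = local | set(dir(builtins))

    # Any name that is read but not known gets a "squiggle" and a suggestion.
    report = {}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name)
                and isinstance(node.ctx, ast.Load)
                and node.id not in known):
            report[node.id] = difflib.get_close_matches(node.id, sorted(local), n=1)
    return report


TYPO_SOURCE = """
def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

print(fictorial(5))
"""

OBSCURE_SOURCE = """
def f(n):
    return 1 if n <= 1 else n * f(n - 1)

print(f(5))
"""

if __name__ == "__main__":
    print(check_names(TYPO_SOURCE))
    print(check_names(OBSCURE_SOURCE))
```

The checker catches the accidental fictorial and has nothing at all to say about the deliberate f. The second program is perfectly consistent. It just doesn't tell a human reader what it computes.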

Computers are new. Language is old. Yet the recent formal accounts of the two practices developed both contemporaneously and independently. Russell gave a quantificational account of the word “the” in roughly the same war-shadowed onset of the twentieth century that saw Turing propose his hypothetical machine. Chomsky put forth his early syntactic formalism just as mainframes were coming into their own as commercial devices. (And if its usefulness for describing the way people actually talk has gotten murkier as the years have gone by, its utility in compiler theory remains unchallenged.) Montague, Church, early Wittgenstein–in retrospect they all seem to have been talking about bits and bytes, though none of them set out to do so. The goal was a formalism to describe natural language, but the clear success was in an account of the language-like formalisms we devised to control complicated machines. The cottages we lived in while we worked on the glorious mansion that never actually got built were our true achievement.


One Response to Red Squiggle, Green Squiggle

  1. W.P. McNeill says:

    Found Beth Andres-Beck’s post “Assumptions Make Programming Possible” after I wrote this and think it’s coming at similar ideas from a different angle. It’s worth a look.
