My career as an artificial intelligence engineer began in a master’s program in linguistics. There I memorized the International Phonetic Alphabet, played hunt-the-allophone, read the literature on control verbs and code switching, and generally made a good faith effort to locate myself in the proud tradition of Noam Chomsky, Ray Jackendoff, and Henry Higgins. The only role computers played in this endeavor was coaxing me to spend way too much time figuring out how to draw syntax trees in Microsoft Word. Exposure to the Minimalist Program (“It makes French literary theory look downright reasonable!”) quickly disabused me of any academic aspirations, and I made a pivot into NLP and later industry, but linguistics remained close to my heart. For years I was fascinated by parsing. My master’s thesis asked whether information about the grammatical structure of sentences could be used to improve speech recognition. (The answer: not really.) Using a computer to draw a tree structure above a string of words seemed on some fundamental level to just be what one did. Syntax was the queen of linguistics, the field I hoped would be the key that unlocked the AI kingdom. I wanted linguistics to matter, but it never did.
No, that’s unfair. Linguistics provides an invaluable intellectual framework. The Saussurean notion of the sign, the syntax/semantics/pragmatics trichotomy, an appreciation of the endless structural variety of human communication: centuries of work have gone into compiling that knowledge. A programmer who fails to grapple with it and instead plunges ahead, hacking on natural language like it’s just another data structure, will quickly disappear into the weeds, never to be seen again. But once you get beneath the level of worldview, the specific theoretical constructs of linguistics are largely irrelevant to natural language engineering. I will never sit in a meeting arguing the merits of LFG versus HPSG. No money will ever ride on my team’s ability to apply predicate calculus to the Zen koan that is “Every man loves a woman”. PRO-drop, ergativity, and the middle voice: all fascinating, but as far as the software industry is concerned, just so much irrelevant donkey abuse.
For a while natural language processing was a subfield of machine learning in which linguistic knowledge was required for the feature engineering, but deep learning has started to erode even that. Deep learning is, after all, an attempt to reduce the art of feature engineering itself to just another numerical optimization problem. A new steam engine to wear down the latest generation of John Henrys. Though the deep learning technique du jour of word vector embedding is clearly an implementation of Firth’s distributional hypothesis, in its particulars it bears less of a resemblance to anything I studied in grad school than it does to Jacques Derrida’s concept of différance, God save us all. Soon your performance won’t go up every time you fire a linguist, because you won’t have hired any to begin with.
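The distributional hypothesis mentioned above can be made concrete in a few lines. What follows is a toy sketch, not anyone's production system: it builds word vectors from raw co-occurrence counts in a four-sentence corpus I made up, whereas real embedding methods learn dense vectors by optimization. The point is only that "you shall know a word by the company it keeps" is directly computable.

```python
# Toy illustration of Firth's distributional hypothesis: words that occur
# in similar contexts end up with similar vectors. The corpus and the
# one-word context window are arbitrary choices for this sketch.
from collections import defaultdict
import math

corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the cat ate the fish",
    "the dog ate the bone",
]

# Count co-occurrences within a +/-1 word window.
cooc = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in (i - 1, i + 1):
            if 0 <= j < len(words):
                cooc[w][words[j]] += 1

vocab = sorted({w for s in corpus for w in s.split()})

def vector(word):
    """A word's distribution over its neighbors, as a count vector."""
    return [cooc[word][context] for context in vocab]

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# "cat" and "dog" keep the same company (chased, ate, the), so their
# vectors are more alike than "cat" and "chased" are.
print(cosine(vector("cat"), vector("dog")))      # high
print(cosine(vector("cat"), vector("chased")))   # lower
```

No linguist annotated anything here; the structure falls out of the counts. Scale the corpus up by a few billion words, swap counting for gradient descent, and you have something like the embeddings the industry actually uses.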
For a while this upset me. I didn’t want my work to be merely a language-shaped widget in the software machine. I wanted to do language. And how could I be doing language if I wasn’t using linguistics? NLP was an enjoyable enough challenge to build a career around, but it wasn’t truly deep. Soon the only thing I’d have in common with my former colleagues would be our shared envy of physicists. But then over the past few years it began to dawn on me that I hadn’t left the kingdom after all. Sure, I wasn’t doing language, but the machines I programmed were.
Cooking is chemistry. It’s all about how different substances interact when you combine them and subject them to heat. It clearly falls within a particular scientific purview, but being a brilliant research chemist does not make you a great chef. It doesn’t hurt, but it’s irrelevant. Likewise, being a great chef doesn’t give you even a crude insight into molecular chemistry. Though concerned with the same stuff, cooking and chemistry are entirely separate disciplines. And this isn’t just the difference between theory and practice. Cooking has a theory: you can read cookbooks, learn techniques, and memorize what flavors go with what, but knowing all that won’t make you a great chef either. To be a great chef you have to cook day in and day out for years until making good food is a part of who you are.
In artificial intelligence we say that we are making computers that “understand” language, but we mean this in a qualified and metaphorical way. The thing we are trying to instill into machines is what linguists call linguistic competence, and as any linguist will tell you, linguistic competence is understanding of a very particular sort. It is not an accumulation of facts, or a set of conscious techniques. You don’t learn French by buying a French dictionary and memorizing it. Linguistic competence is knowing-how, not knowing-that. Linguistics is the science that takes linguistic competence as its object of study. Because both are abstract cognitive phenomena it can be easy to get them confused, but they are entirely different things. That linguistics is largely irrelevant to computer language engineering is no mark against linguistics, but merely a reflection of how vast the phenomenon of language is. It rarely impacts my daily work because I’m not trying to teach computers how to be linguists. I’m trying to teach them how to speak.