I Dream of Digits

Boltz the robot can read digits. If you hold up a handwritten digit in front of him, Boltz will say it aloud. “Two. Five. Two. Seven. Zero.” He is as good at this as a person is. The only digits he gets wrong are ones that are so sloppily written that human beings also find them hard to decipher.

Blue and white NAO robot

Boltz’s “eyes” are cameras that transform handwritten digits into 28×28 grayscale bitmaps. These bitmaps are then sent to a neural network model that has previously been trained on a large set of bitmap/digit-name pairs. In the course of training, the model learns an array of numerical weights that defines a function mapping a 28×28 bitmap to one of the categories 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. This is explained in more detail in the video below by Geoffrey Hinton, the computer scientist who pioneered this particular model.
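
(For the technically curious, here is a minimal sketch of what such a model amounts to. It is not Hinton’s actual architecture, just the simplest possible array of weights mapping a flattened 28×28 bitmap to one of ten categories, trained by gradient descent; random noise stands in for the real bitmap/digit-name pairs.)

```python
# A minimal sketch, not Hinton's actual model: a single array of weights
# mapping a flattened 28x28 bitmap to one of ten digit categories, trained
# by gradient descent. Random noise stands in for the real bitmap/label pairs.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_pixels, n_digits = 1000, 28 * 28, 10

X = rng.random((n_samples, n_pixels))          # stand-ins for grayscale bitmaps
y = rng.integers(0, n_digits, size=n_samples)  # stand-ins for digit labels

W = np.zeros((n_pixels, n_digits))             # the "array of numerical weights"
b = np.zeros(n_digits)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for step in range(200):                        # plain gradient descent
    p = softmax(X @ W + b)                     # predicted probabilities per digit
    p[np.arange(n_samples), y] -= 1.0          # gradient of the cross-entropy loss
    W -= 0.1 * X.T @ p / n_samples
    b -= 0.1 * p.mean(axis=0)

def recognize(bitmap):
    """Map a 28x28 bitmap to the most probable digit category."""
    return int(np.argmax(bitmap.reshape(-1) @ W + b))
```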

Sometimes Boltz goes to sleep. His head droops and his camera eyes shut off. Occasionally you’ll hear him quietly name a digit while in this state—“Three…Five…Eight…Two…Three…”—even though he is clearly not looking at anything. In these cases we say that Boltz is dreaming about the digits he saw during his waking hours.

Now “dreaming” is just a fanciful turn of phrase. In reality we don’t know what Boltz is doing. Maybe he stores bitmaps that he has previously seen and later sends them to his digit recognition software, evoking a vocalization just as when the actual picture was in front of him. Or perhaps instead of memorizing exact bitmaps (verbatim, as it were), Boltz is generating an image of a digit. One of the features of Hinton’s model is that it can be run “backwards”: instead of taking a bitmap as input and producing the name of a digit as output, you can start with a digit and produce a bitmap. The fine details of the image are determined randomly, but the array of weights used in recognition guides the random selection so that the end product looks like the digit in question. So Boltz might feed 5 into the back-end of the model, and it would generate an image that—even though nothing exactly like it had ever been produced before—would nevertheless be recognizable to both human and robot as a “5”. Perhaps it is this novel image the sleeping Boltz is then recognizing and saying aloud.
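
(Again a sketch, not Hinton’s actual generative procedure: one crude way to “run a recognizer backwards” is to start with random pixels and repeatedly nudge them so that the weights assign ever higher probability to the chosen digit. The starting noise supplies the fine details; the weights guide them toward something recognizable. The weights below are random stand-ins for trained ones.)

```python
# A sketch of "running the model backwards": random fine details, guided by
# the recognizer's weights toward something that scores highly as the chosen
# digit. Not Hinton's actual generative procedure; W and b are random
# stand-ins for weights that would really come from training.
import numpy as np

rng = np.random.default_rng(1)
n_pixels, n_digits = 28 * 28, 10
W = rng.normal(scale=0.01, size=(n_pixels, n_digits))
b = np.zeros(n_digits)

def dream(digit, steps=500, lr=0.5):
    """Produce a 28x28 bitmap that the recognizer would label as `digit`."""
    x = rng.random(n_pixels)                  # the randomly chosen fine details
    for _ in range(steps):
        z = x @ W + b
        p = np.exp(z - z.max())
        p /= p.sum()                          # softmax probabilities over digits
        grad = W[:, digit] - W @ p            # gradient of log p(digit) w.r.t. pixels
        x = np.clip(x + lr * grad, 0.0, 1.0)  # keep pixels in the grayscale range
    return x.reshape(28, 28)

# With trained weights, this would be a novel image recognizable as a "5".
bitmap = dream(5)
```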

So far we’ve considered that when Boltz dreams his “mind’s eye” either resees an old image or generates a new one. But there is a third possibility. Perhaps, just as in the Hinton-style dream scenario, Boltz starts off by randomly selecting a digit, except then instead of generating an image, he simply says the digit aloud. The output of the random digit generator is sent directly to his speech unit. Boltz is just dreaming the idea of a digit. At no point is there a corresponding image.

If all we had to go on was Boltz talking in his sleep, we’d have no way to distinguish these three processes. A “five” sounds like a “five” sounds like a “five”. So maybe we teach Boltz to draw. We enable him to pick up a pencil and make shadings on a 28×28 grid that correspond exactly to the bitmaps. We then might ask Boltz to draw what he “saw” when he said a certain number while sleeping. We could use these drawings to distinguish between memorized and imagined digits because only the former would exactly match images we had originally presented. But if the drawn digits were novel that would only tell us that they were created by running the recognition model backwards—it wouldn’t tell us when this occurred. Maybe it happened when he was asleep, and Boltz is recalling the generated bitmap from memory. Maybe nothing was generated when Boltz was asleep and he ran his model backwards only after we asked him to draw what he saw. Or maybe Boltz generated a novel image when he was asleep, but only remembered the output of his recognition unit in response to it, so when we ask him to draw what he saw, he generates a different image.

Finally, in a last-ditch attempt to determine what is going on here, we endow Boltz with full powers of comprehension, introspection, and reason. We ask him to draw one of the digits he mentioned in his sleep and then ask him “The digit you just drew—is that what you actually saw, or is it something you’re inventing on the spot right now?” Boltz considers this question for a minute, and then we hear his servo motors whir as he arranges his shoulders and arms into a shrug. It’s adorable.

Posted in Mermaids, Those that have just broken the flower vase | Leave a comment

I Dream of Motorcycles

“Initially” we never hear noises and complexes of sound, but the creaking wagon, the motorcycle. We hear the column on the march, the north wind, the woodpecker tapping, the crackling fire.

It requires a very artificial and complicated attitude in order to “hear” a “pure noise”.

—Martin Heidegger, Being and Time

If you were building a machine to recognize different sounds you would start by setting up a microphone in a field. The audio signal would be digitized, deflections of a membrane translated into voltage levels, and those translated into numbers as a function of time. A computer could then process the numbers. The computer would of course have no concept of what the numbers meant. If a tree fell in the forest next to the field with no one to hear it, it would generate a string of numbers like anything else. This is a “pure noise”.
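
(A sketch of that first stage, in Python. A synthetic 440 Hz tone stands in for whatever the microphone happened to pick up; by the time it reaches the program it is just a string of numbers either way.)

```python
# A sketch of digitization: membrane deflections become voltages, voltages
# become numbers as a function of time. A synthetic tone stands in for the
# field recording; the program can't tell the difference.
import numpy as np

sample_rate = 16000                              # samples per second
t = np.arange(0, 1.0, 1.0 / sample_rate)         # one second of time points
signal = 0.5 * np.sin(2 * np.pi * 440 * t)       # the analog waveform, idealized

# Quantize to 16-bit integers, as a typical analog-to-digital converter would.
samples = np.round(signal * 32767).astype(np.int16)

print(samples[:10])   # the "pure noise": numbers with no concept attached
```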

A microphone framed against a sky with clouds.

For your computer to be useful, you’d have to train it to make distinctions, to categorize the various strings of numbers. Current state of the art would have you collect many thousands of audio samples labeled “north wind”, “woodpecker tapping”, “crackling fire”, and “motorcycle”. Given sufficient examples, you could build a statistical model that mapped a given sequence of numbers to one of these categories. The mapping would be inexact—the statistics would embody a fair amount of guesswork—but that’s fine. That’s life.
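
(And a sketch of the categorization step. Random noise stands in for the labeled recordings, and a crude nearest-centroid rule stands in for the statistical model, which in real systems is far more elaborate. The shape of the procedure is the point: numbers in, category out, guesswork in between.)

```python
# A sketch of the categorization step, assuming labeled clips already exist.
# Random noise stands in for the recordings, and a nearest-centroid rule
# stands in for the (far more elaborate) statistical model.
import numpy as np

rng = np.random.default_rng(0)
labels = ["north wind", "woodpecker tapping", "crackling fire", "motorcycle"]

def features(samples):
    """Crude spectral features: average magnitude in a handful of frequency bands."""
    spectrum = np.abs(np.fft.rfft(samples))
    return np.array([band.mean() for band in np.array_split(spectrum, 16)])

# Pretend training set: 20 labeled one-second clips per category.
clips = {name: [rng.normal(size=16000) for _ in range(20)] for name in labels}

# The "statistical model": the average feature vector of each category.
centroids = {name: np.mean([features(c) for c in cs], axis=0)
             for name, cs in clips.items()}

def categorize(samples):
    """Map a string of numbers to the closest category. Inexact, but that's life."""
    f = features(samples)
    return min(labels, key=lambda name: np.linalg.norm(f - centroids[name]))

# With random stand-in data this is pure guesswork; it prints one of the four labels.
print(categorize(rng.normal(size=16000)))
```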

There is a notion of sequence here. First there is the raw microphone signal, then there is the sorting into categories. Heidegger also proposes an ordering. (The tipoff: “‘Initially’” gets shudder-quoted.) But he orders things the other way: first there is the motorcycle etc., and then only later—at the cost of a “complicated attitude”—do we have access to the raw signal.

This squares with my experience. Where I sit right now, out in public, carrying on side conversations as I write this, I hear people I know talking first. Only with some effort could I mentally transform that into pure sound that happens to be meaningful. I look around the room and there are people I know, there is the TV, there is the chair, and so on. To see these familiar things as patches of color and spatial relationships from which I could infer daily life—that’s weird. It is an artificial and complicated attitude that I can’t pull off. But then I am a human being, not an instrumented microphone.

There are two options:

  1. Machines are like humans. Raw input precedes meaningful input in some temporal or logical fashion. If I don’t hear the pure noise, that is only because “hearing” is something that occurs after a fair amount of non-conscious processing has occurred.
  2. Machines are not like humans. Humans can start with a semantic whole—creaking wagon, woodpecker, motorcycle—and work backwards to raw input as needed. Or at least convince themselves that they have done so.

Option (1) appeals to me. My paid job is to make machines appear convincingly human-like, so any congruence works to my advantage. Option (2) seems like a delusion of philosophical idealism. The way to emulate human hearing is to start with a microphone in a field and work from there.

And yet, and yet. Here’s the counterexample. I am dreaming. I am dreaming that I hear a motorcycle. There is no raw external signal: it’s all in my head. Where did it come from? Did my subconscious create the appropriate set of air molecule vibrations to correspond to the sound of a motorcycle engine? If I dream that I see a motorcycle, do I first (“initially”) mentally compose patches of color that correspond to what would occur if a motorcycle were to roll into my visual field? I doubt my ability to do that. Instead, I imagine I dreamed “motorcycle” then filled in the details later, probably after I woke up.

Machines today are not like humans. We dream in concepts.

Posted in Fabulous ones | 4 Comments

Proposed New Vocabulary: Omnispiel

A recorded phrase that you hear frequently in your daily life. It can’t arrive through media channels, but rather must be part of your immediate physical environment. Most typically a recording attached to some piece of equipment. “Hello and welcome to Car2Go.” “Stand clear of the closing doors please.” You will often repeat the phrase to yourself as you settle into the car, step onto the subway, or what have you. At first you will be making fun of it, but gradually the words will recede, leaving only a familiar cadence, unconscious and physically comforting.

Posted in Mermaids | Leave a comment

Personal Philosopher

  • How do I know I’m not actually dreaming?…$150/hour
  • Is my dog intelligent?…$200/hour
  • Clones…$375/hour
  • Should Neo have taken the blue pill?…$650/hour
  • But is it art?…$850/hour
  • Did you know there’s a tribe in the Amazon that has no word for “regret”? I wonder if they…[Waves hand impatiently] $1500/hour
  • Does God exist?…Contact for rates

Posted in Fabulous ones | Leave a comment

The Meaning of Life

Right-clicking on a highlighted word in my web browser brings up a menu that gives me the option of either looking up that word or searching for it on Google.

The first option returns a dictionary and thesaurus definition.

The second opens a web search in a separate tab.

The former is what most people imagine when they think of the meaning of a word. The latter isn’t its meaning. Instead it is examples of its use, though, being the first page returned by a popular search engine, you can expect these examples to be good ones.

I almost never select the look up option. I frequently select the search option. The former will be highlighted first on the menu, but I’ll move away from it. “Don’t tell me what this word means,” I’ll think. “Show me.”

Posted in Mermaids | 2 Comments

Do You Believe in Electrons?

Two physicists are sitting in wooden chairs at a table in a cafe, drinking coffee from plain white porcelain cups. One says to the other, “Do you believe in electrons?”

The physicist is not asking “Do you believe electrons exist?” He believes that they exist. He believes that she believes they exist as well, although that existence is contingent. The modern idea of the electron wasn’t formulated until the 19th century, and it is possible that some future theory of physics will banish the concept. Still at the moment the physicist finds the case for electrons convincing. The mathematical formalism that describes them makes sense. (It hangs together.) He has personally manipulated laboratory equipment whose behavior becomes explicable if you hypothesize that it is measuring electrons. He has heard good-faith reports from many other scientists and engineers about how the notion of electrons has helped them structure their conception of the natural world and build useful devices. If that’s not the scientific definition of existing, he doesn’t know what is.

Still, when he tries to imagine what an electron is, he finds himself stuck. What comes to mind? Maybe a set of equations in a book. Or facts about an electron: its mass and charge. But surely these are just descriptions of electrons and not the electrons themselves. He might imagine relevant readings off laboratory equipment—deflected voltmeter needles, blurs on photographic plates. But these aren’t the electrons themselves either. They are merely evidence of their existence: convincing but still indirect. The physicist decides he’s overthinking it and tries free association: he says the word “electron” to himself and makes note of the first image that comes to mind. It is a drawing of a cluster of little spheres stuck together representing protons and neutrons. Other spheres representing electrons orbit this nucleus. You can tell that they are orbiting because there are elliptical lines tracing out the orbital paths.

solar-system-atom

This is an illustration of the “solar system” model of the atom proposed by the Danish physicist Niels Bohr. The Bohr model is wrong. In the course of orbiting, the electrons would radiate away energy, causing the atom to collapse. Bohr himself was aware of this; nevertheless, his model served as a stopgap that physicists made do with through the 1910s and early 1920s until quantum mechanics came up with a more convincing account of atomic structure. Even as a representation of the Bohr model of the atom, the picture is wrong: the size of the electron orbits is completely out of proportion to the size of the individual particles. And yet the physicist doesn’t feel foolish for having this image in his head. There are reasons to recommend it. It does, for instance, capture the idea that protons, neutrons and electrons combine to form atoms, with protons and neutrons bound tightly to one another while the electrons surround them in some way. A picture more in line with contemporary quantum physics would replace the image of a solar system with a series of blobby, dumbbell-shaped electron orbitals, but even those are just handy visualizations of equations that describe electrons. They are no less cartoons.

Still, these are cartoons of whole atoms. What about the electrons themselves? Physics admits the idea of a free electron. What image does the physicist free-associate for that? Probably, he reluctantly admits, just a sphere. Were he in charge of producing the artwork for a science textbook, he’d represent a lone electron as a little ball. It would be textureless, featureless, and a solid color. Its diameter would likely be no bigger than a short printed word. (He would not have a sphere representing an electron take up half a page, though in an astronomy textbook he might do so for a sphere representing a planet. Why? Because electrons are small and planets are big, though both are so out of proportion to the size of a book that the distinction doesn’t make much sense.) Finally, in order to convey a three-dimensional sense, he’d probably put a dot of light on one part of the sphere, as if the electron were being illuminated by a desk lamp just off to the right.

electron

This is all tremendously unsatisfying. The illustration of the Bohr model, inaccurate as it was, still had structure corresponding to features of actual atoms. Here there is no such correspondence. It is meaningless to talk about an electron’s texture, color, or shape. That dot of light makes absolutely no sense: the phenomenon of illumination is the result of collisions between vast numbers of individual photons and individual electrons. An electron can no more be illuminated than a single person can band together to form a mob. You can make a case for representing electrons as spheres rather than, say, cubes because some of their properties (mass and charge, for instance) do exhibit spherical symmetry, but that’s where the verisimilitude ends. The cartoon of a single electron says more about the nature of cartooning than the entity it is supposed to depict.

In asking “Do you believe in electrons?” what the physicist really means is “Isn’t it strange that you and I are both convinced of the existence of things that we are incapable of imagining?” His use of the phrase “believe in” is a little joke, intentionally evoking the question “Do you believe in God?” He’s not asking if the other physicist believes electrons exist by virtue of a leap of faith. (The goal of science is to obviate, or at least drastically minimize, such leaps.) But a religious person asked to free-associate an image with the word God might come up with something (a giant man with a long white beard and booming voice, perhaps) they would similarly dismiss as cartoonish and inadequate. In both cases people are willing to attribute existence to things their brains are literally not equipped to handle.

god

The other physicist considers the question for a moment, then replies, “I believe in electrons as much as I believe in these chairs or this coffee cup.” This is her way of affirming that, yes, indeed she does believe that electrons exist. But she is making a little joke as well, because no one talks like that, at least not typically. To say that you believe in something is to affirm its existence, appearances to the contrary. It is to admit that there is an extra burden on you, the believer, to establish its reality, whether by scientific theory, faith, or some other means. No such burden exists when discussing the chair you are sitting in, or a coffee cup you are holding. There the presumption is that these things exist. This existence is contingent as well. It may be the case that one of the physicists is actually dreaming, or that they are both just figments of a computer simulation. Maybe the physical reality of the coffee cup is so far removed from our subjective perception of it that the correspondence between them is unclear. There are many ways of casting doubt—of compelling us to say about an object in front of us that we believe in it rather than it simply is. But those ways all take a fair amount of mental gymnastics, about as much as it takes to believe in an electron that we cannot visualize. So the other physicist’s joke is this: “You have now ceased to talk about physics and begun to talk about philosophy.”

coffee-cup

Once you are willing to perform the mental gymnastics of philosophy, you find yourself casting all manner of fundamental beliefs and sensations into doubt. This can create a sense of vertigo that leads you to want to restore certainty—to get back to the point where simple things simply are. It would be nice if physics could help with this. It does, after all, tell us things about the fundamental nature of coffee cups and chairs. For instance, it tells us that they are made in large part out of electrons, and this has tremendous explanatory power. It illuminates much of how the world hangs together, but it is ultimately no antidote for philosophical vertigo, because the scientific entities that explain the daily phenomena can never appear more real to us than the daily phenomena themselves. You start with the coffee cup and you go from there.

Posted in Fabulous ones, Stray dogs | Leave a comment

Wittgenstein, Watson, and Language Games

Here the term “language-game” is meant to bring into prominence the fact that the speaking of language is part of an activity, or of a form of life.

Review the multiplicity of language-games in the following examples, and in others:
Giving orders, and obeying them—
Describing the appearance of an object, or giving its measurements—
Constructing an object from a description (a drawing)—
Reporting an event—
Speculating about an event—
Forming and testing a hypothesis—
Presenting the results of an experiment in tables and diagrams—
Making up a story; and reading it—
Play-acting—
Singing catches—
Guessing riddles—
Making a joke; telling it—
Solving a problem in practical arithmetic—
Translating from one language into another—
Asking, thanking, cursing, greeting, praying.

–Ludwig Wittgenstein, Philosophical Investigations

Jeopardy! and John Henry

Consider another kind of language game. A quizmaster reads a clue. For example, “William Wilkinson’s ‘An Account of the Principalities of Wallachia and Moldavia’ inspired this author’s most famous novel.” You then try to formulate a question that would evoke that clue as an answer. For example, “Who is Bram Stoker?” You are competing against two other players. Whoever formulates a correct question first wins a certain amount of money. The person with the most money after a set number of questions wins the game. The clues are drawn from general knowledge—history, geography, science, culture—and often involve wordplay: rhymes and puns and whatnot. To be good at this game it helps to have an encyclopedic knowledge of trivia, quick recall, and a corny sense of humor.

In 2011 a computer program built by IBM Research called Watson[1] appeared on the game show Jeopardy! and defeated its human opponents, among them Ken Jennings, the show’s record-setting champion. The contest was structured like a regular all-human competition: Watson played by the same rules as the people. The clue above about Bram Stoker is the final one with which Watson secured its victory.

The computer program Watson on the game show Jeopardy!

It is astounding that a computer defeated a person in this arena. Machines are good at certain things (storing and retrieving vast quantities of data, working without a pause for years at a time) and humans are good at other things (synthesizing, inferring, catching jokes). Jeopardy! would seem to favor the latter strengths. Furthermore, the standard assumption in the field of artificial intelligence is that the humans will always be smarter than the computers—our job as programmers is merely to make the computers smart enough to be useful. So Watson’s victory on Jeopardy! was a tremendous upset. It is not a case of the steam engine defeating John Henry. Rather it is as if by some miracle a mass-produced dining room set turned out to be of higher quality than one hand-built by a carpenter.

In his book Philosophical Investigations, written decades earlier, the philosopher Ludwig Wittgenstein claims that speech is a kind of game people play. He talks about games to emphasize the fact that speech exists not just for the purpose of transmitting information, but also as an end in itself. (What is the purpose of playing a game of chess, after all, other than to play a game of chess?) Later in the same book, Wittgenstein tries to come up with a clearly articulated definition of the concept of a game, but after considering the varied activities that might fall under that heading (chess, athletics, jumprope, political contests, war), gives up. So in using the game metaphor he is also emphasizing how varied the act of speaking is. It is foolish to try and reduce language to some formula. Talking is simply one of the many things people do.

Wittgenstein knew nothing about computers. His notion of language games was a metaphor, but it proved prescient in this instance because Watson literally played a language game and won. So we may want to keep him in the back of our minds in case there is other guidance he might offer.

A Russian Language Game

You can simulate what it’s like to be Watson right now. Type the phrase “William Wilkinson’s ‘An Account of the Principalities of Wallachia and Moldavia’ inspired this author’s most famous novel -IBM -Watson” into your favorite search engine.[2] This brings up multiple links to articles about Bram Stoker and Dracula. It is easy for you to skim them and infer that the correct answer is “Bram Stoker”. (Sorry–“Who is Bram Stoker?”) So easy that it almost seems like cheating. But there’s a crucial difference between you and Watson: you understand English. Watson does not.

So in the interest of realism let’s play a different language game. The rules of this game are that I give you a clue in Russian and you have to return the correct answer in Russian. (I’ll permit you to skip the whole phrase-it-as-a-question business.) You are allowed to use a search engine, but you are not allowed to understand Russian. So now when you blindly perform the search “‘Рассказ о княжествах Валахии и Молдавии’ Уильяма Уилкинсона вдохновил самый известный роман этого автора” it brings back links to documents that are incomprehensible to you. What do you do next? How do you win?

Google search in Russian

Let’s make the rules of the game a little more forgiving. You are not allowed to understand Russian, but you are allowed to recognize Cyrillic characters, distinguish words, and tell when two phrases look similar or different. (Basically you are allowed to understand Russian as well as someone who doesn’t actually understand Russian.) If you carefully combed through the top hits returned by the search engine, you might notice the phrases Дракула and Брэма Стокера showing up repeatedly. Perhaps one of these is the answer.

In order to choose one term over the other it would be helpful to know the relationship between them. Knowing this offhand would be considered understanding Russian; however, if you did web searches on these two terms, you might discover that they often occur near each other in the same documents. Furthermore you might notice that these documents often also contain the terms автора and роман, which appear in the clue as well. You wouldn’t know what these words meant either, but their presence still might pique your interest. If you had enough examples of similar clues and answers about authors and novels[3], you might be able to recognize typical word patterns that would lead you to guess the correct answer, “Брэма Стокера”.

Брэма Стокера → Bram Stoker
Дракула → Dracula
автора → author
роман → novel

In order to discern these patterns, you’d have to make careful tallies of Cyrillic word shapes across many thousands of documents and analyze them with sophisticated mathematical techniques capable of teasing out the subtle correlations between them. It would be far too tedious a job for a human being. You’d need a computer.
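
(Here is a sketch of that tallying. A handful of made-up “documents” stand in for the thousands of search results you would really need, and raw co-occurrence counts stand in for the sophisticated mathematics.)

```python
# A sketch of the tallying: without understanding a word, count how often each
# candidate phrase co-occurs with words from the clue across a pile of documents,
# then guess the candidate that patterns with them most. The three "documents"
# are made-up stand-ins for real search results.
from collections import Counter

clue_words = {"автора", "роман", "Валахии", "Молдавии"}   # words copied from the clue
candidates = ["Брэма Стокера", "Дракула"]                 # phrases that keep showing up

documents = [
    "автора Брэма Стокера вдохновил рассказ о княжествах Валахии и Молдавии",
    "самый известный роман автора Брэма Стокера это Дракула",
    "Дракула роман о вампире",
]

def cooccurrence(candidate):
    """Tally clue words appearing in documents that also contain the candidate."""
    total = 0
    for doc in documents:
        if candidate in doc:
            counts = Counter(doc.split())
            total += sum(counts[w] for w in clue_words)
    return total

best = max(candidates, key=cooccurrence)
print(best)   # "Брэма Стокера", at least on these toy documents
```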

At Play in the World

Watson plays something very similar to the Russian language game. It analyzes the clue it is given, extracting relevant terms that are then used to perform a query of a general knowledge database. Candidate entity names returned by the query are ranked by a machine learning model trained on a long history of Jeopardy! questions and answers, and the highest scoring one is proposed as the answer.[4] There is also knowledge about the grammar of English and basic ontology baked into the program, but for the most part Watson is just recognizing patterns of words.
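
(In outline, the loop looks something like the sketch below. This is a toy skeleton, not Watson’s DeepQA architecture; search_knowledge_base is a hypothetical stand-in for the query step, and the crude overlap score stands in for the trained ranker.)

```python
# A toy skeleton of the extract-terms / query / rank-candidates loop described
# above. Not Watson's actual DeepQA pipeline: search_knowledge_base is a
# hypothetical stand-in for the query step, and the crude overlap score stands
# in for a ranker that would really be trained on past clues and answers.
from collections import Counter

STOPWORDS = {"this", "the", "a", "an", "of", "and", "in", "is", "was", "most"}

def extract_terms(clue):
    """Keep the content words of the clue, dropping common function words."""
    words = [w.strip("'\".,") for w in clue.lower().split()]
    return [w for w in words if w and w not in STOPWORDS]

def search_knowledge_base(terms):
    """Hypothetical stand-in: return (candidate entity, supporting text) pairs."""
    return [
        ("Bram Stoker", "the author bram stoker was inspired by william "
                        "wilkinson's account to write his most famous novel"),
        ("Dracula", "dracula is a novel by the irish author bram stoker"),
    ]

def score(supporting_text, terms):
    """Crude ranker: how many clue terms show up in the candidate's support."""
    counts = Counter(supporting_text.split())
    return sum(counts[t] for t in terms)

def answer(clue):
    terms = extract_terms(clue)
    candidates = search_knowledge_base(terms)
    name, _ = max(candidates, key=lambda c: score(c[1], terms))
    return f"Who is {name}?"

print(answer("William Wilkinson's 'An Account of the Principalities of "
             "Wallachia and Moldavia' inspired this author's most famous novel"))
```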

This naturally raises the question, are we, the humans, playing the same game? Is our understanding of English, Russian or what-have-you ultimately just unconscious, statistically-driven pattern recognition? This is a contemporary rephrasing of the question of how to distinguish between appropriately conditioned behavior and true comprehension, which has a long philosophical history outside of software. Officially, engineers like myself are agnostic on this issue.[5] We are only concerned with getting the appropriate responses and don’t care what underlies them. As a practical matter though, there remains a vast gulf between the variety of games a human and a computer program—even a world-class program like Watson—are able to play.

Consider another world-class computer program—the Google web search engine. To find the answer to the Bram Stoker clue, I typed it verbatim into Google, which instantly provided relevant results. It turns out that this is often the case: Google is an excellent way to cheat at Jeopardy! However, the Google search engine could not have gone on TV and beat Ken Jennings because it returns links to documents and relies on a human being to make sense of them. That is not sufficient for Jeopardy! There you must return the name of a specific entity couched in the form of a question, and as the Russian language example above demonstrates, going that final mile is harder than it looks. Google can’t win at Jeopardy! because it’s not playing by the rules of that game.

In fact the rules of this particular game impose all manner of non-obvious constraints. Both clues and answers must be concise. “What effect has the character Dracula had on film and literature?” is a valid question to ask, but no good for Jeopardy! because you could fill books answering it. Answers must not be a matter of opinion (“This gothic tale about a bloodsucking count is the greatest novel of the 19th century”) and clues must not contain incorrect presuppositions (“This male English author wrote the novel Frankenstein”). The convention of answering in the form “What is–?” “Who is–?” restricts answers to being well-defined entities, while the quiz show format disallows all manner of discourse. The designers of Watson knew their system would never have to compose a poem, tell a joke, comfort a grieving widow, or maunder on about the weather. The set of utterances you don’t have to handle is as vast as the set of ones you do.

Navigating open but still constrained domains is where the field of artificial intelligence stands at the moment. We know how to play particular games. Given a task—find a webpage, recommend a movie, transcribe a spoken utterance, win at Jeopardy!—and a large set of exemplars of how human beings have successfully performed that task before, we can find a way to train a machine to imitate them. Usually not surpass[6], but at least emulate to some reasonable degree. It’s not easy—machine learning is still as much an art as a science—but for the foreseeable future there is a clear way forward that lies in making incremental progress by winning incrementally different games.

Computer programmers have an instinct towards generalization, and so naturally wonder if there is an approach that could subsume the current piecemeal state of the art. You’d want there to be a single human game—call it reason, rationality, intelligence—that we could learn to play just once, and have particular tasks fall out as special cases. This was the dream of both an earlier generation of artificial intelligence researchers and an earlier incarnation of Wittgenstein, who in the Tractatus Logico-Philosophicus proposed a sort of gnomic form of predicate calculus as the definitive solution to all outstanding philosophical problems. Wittgenstein later renounced this position, and his description of the multiplicity of language games in the quote above reflects his later view that a description of human experience will never be able to fully abstract beyond the particulars. Currently, engineers are okay with attacking sets of particular problems and undecided as to whether these may someday be unified. Wittgenstein, however, warns that any attempt at this unification is a fool’s errand. There is no master template from which all reasoning derives. Instead it’s just games, games, games all the way down.


[1] Usually I find these disclaimers superfluous, but since my job title is “Watson NLP Developer” I suppose I should state for the record that the opinions expressed on this blog are entirely my own and do not reflect those of my employer.

[2] The “-IBM” and “-Watson” are necessary for the sake of fairness, because all the web pages that contain this phrase verbatim discuss Watson’s victory on Jeopardy!

[3] Of course you wouldn’t know they were about authors and novels. You’d just know that certain Russian words tended to pattern with certain other Russian words in certain ways.

[4] The system is described in detail in the May-June 2012 edition of the IBM Journal of Research and Development.

[5] Though personally I have to say that it sure doesn’t feel that way to me.

[6] It is interesting that the other great publicity coup IBM has scored in the past twenty years was Deep Blue’s defeat of the chess grandmaster Garry Kasparov. I wonder whether, had Wittgenstein lived later in the 20th century, he would have defined a game as “something a computer could eventually defeat a human at”.

Posted in Fabulous ones, Mermaids, Those that have just broken the flower vase | 2 Comments