I Dream of Digits

Boltz the robot can read digits. If you hold up a handwritten digit in front of him, Boltz will say it aloud. “Two. Five. Two. Seven. Zero.” He is as good at this as a person is. The only digits he gets wrong are ones that are so sloppily written that human beings also find them hard to decipher.

[Image: blue and white NAO robot]

Boltz’s “eyes” are cameras that transform handwritten digits into 28×28 grayscale bitmaps. These bitmaps are then sent to a neural network model that has previously been trained on a large set of bitmap/digit-name pairs. In the course of training, the model sets an array of numerical weights that defines a function mapping a 28×28 bitmap to one of the categories 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. This is explained in more detail in a video by Geoffrey Hinton, the computer scientist who pioneered this particular model.
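To make "an array of weights defines a function" concrete, here is a minimal sketch. It is not Hinton's model: it is a toy linear scorer, and the names make_random_weights and classify are hypothetical, chosen only for illustration. The weights here are random rather than trained, so the answers are meaningless; the point is only the shape of the mapping from a 28×28 bitmap to one of ten categories.

```python
import random

SIDE = 28                  # bitmaps are 28x28, as in the post
N_PIXELS = SIDE * SIDE     # 784 grayscale values per bitmap
N_DIGITS = 10              # categories 0 through 9


def make_random_weights(seed=0):
    """Stand-in for training: one row of 784 weights per digit (assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.01) for _ in range(N_PIXELS)]
            for _ in range(N_DIGITS)]


def classify(bitmap, weights):
    """Map a 28x28 bitmap (rows of floats in [0, 1]) to a digit 0-9."""
    pixels = [p for row in bitmap for p in row]
    if len(pixels) != N_PIXELS:
        raise ValueError("expected a 28x28 bitmap")
    # Each digit gets a score: the weighted sum of the pixel values.
    scores = [sum(w * p for w, p in zip(row, pixels)) for row in weights]
    # The category with the highest score is the answer.
    return max(range(N_DIGITS), key=lambda d: scores[d])
```

Training, in this picture, is whatever procedure chooses the numbers in weights so that classify agrees with the labels in the bitmap/digit-name pairs.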

Sometimes Boltz goes to sleep. His head droops and his camera eyes shut off. Occasionally you’ll hear him quietly name a digit while in this state—“Three…Five…Eight…Two…Three…”—even though he is clearly not looking at anything. In these cases we say that Boltz is dreaming about the digits he saw during his waking hours.

Now “dreaming” is just a fanciful turn of phrase. In reality we don’t know what Boltz is doing. Maybe he stores bitmaps that he has previously seen and later sends them to his digit recognition software, evoking a vocalization just as when the actual picture was in front of him. Or perhaps instead of memorizing exact bitmaps (verbatim, as it were), Boltz is generating an image of a digit. One of the features of Hinton’s model is that it can be run “backwards”: instead of taking a bitmap as input and producing the name of a digit as output, you can start with a digit and produce a bitmap. The fine details of the image are determined randomly, but the array of weights used in recognition guides the random selection so that the end product looks like the digit in question. So Boltz might feed 5 into the back-end of the model, and it would generate an image that—even though nothing exactly like it had ever been produced before—would nevertheless be recognizable to both human and robot as a “5”. Perhaps it is this novel image the sleeping Boltz is then recognizing and saying aloud.
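The "backwards" direction can be sketched in the same toy spirit. This is an assumption-laden illustration, not Hinton's procedure: each digit is given a hypothetical grid of per-pixel weights, and each pixel is switched on at random with a probability that the weight controls. The helper names (toy_weights_for_vertical_bar, dream_bitmap) are invented for this example.

```python
import math
import random

SIDE = 28


def sigmoid(x):
    """Squash a weight into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def toy_weights_for_vertical_bar():
    """Hypothetical weights for one digit: favor a stroke in columns 13-14."""
    return [[4.0 if col in (13, 14) else -4.0 for col in range(SIDE)]
            for _ in range(SIDE)]


def dream_bitmap(digit_weights, rng):
    """Sample a binary 28x28 bitmap; the weights bias each random pixel."""
    return [[1 if rng.random() < sigmoid(w) else 0 for w in row]
            for row in digit_weights]


# Two "dreams" of the same digit: same weights, different random details.
weights = toy_weights_for_vertical_bar()
first = dream_bitmap(weights, random.Random(1))
second = dream_bitmap(weights, random.Random(2))
```

Both bitmaps show roughly the same vertical stroke, yet they differ pixel by pixel, which is the sense in which a generated image can be novel and still recognizable.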

So far we’ve considered that when Boltz dreams his “mind’s eye” either re-sees an old image or generates a new one. But there is a third possibility. Perhaps, just as in the Hinton-style dream scenario, Boltz starts off by randomly selecting a digit, but then, instead of generating an image, he simply says the digit aloud. The output of the random digit generator is sent directly to his speech unit. Boltz is just dreaming the idea of a digit. At no point is there a corresponding image.

If all we had to go on was Boltz talking in his sleep, we’d have no way to distinguish these three processes. A “five” sounds like a “five” sounds like a “five”. So maybe we teach Boltz to draw. We enable him to pick up a pencil and make shadings on a 28×28 grid that correspond exactly to the bitmaps. We might then ask Boltz to draw what he “saw” when he said a certain number while sleeping. We could use these drawings to distinguish between memorized and imagined digits, because only the former would exactly match images we had originally presented. But if the drawn digits were novel, that would only tell us that they were created by running the recognition model backwards; it wouldn’t tell us when this occurred. Maybe it happened when he was asleep, and Boltz is recalling the generated bitmap from memory. Maybe nothing was generated when Boltz was asleep and he ran his model backwards only after we asked him to draw what he saw. Or maybe Boltz generated a novel image when he was asleep, but only remembered the output of his recognition unit in response to it, so when we ask him to draw what he saw, he generates a different image.
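The memorized-versus-imagined test described above amounts to an exact-match lookup. A minimal sketch, assuming we kept every bitmap we ever showed Boltz (the names freeze and was_presented are hypothetical):

```python
def freeze(bitmap):
    """Turn a list-of-rows bitmap into a hashable tuple of tuples."""
    return tuple(tuple(row) for row in bitmap)


def was_presented(drawing, presented):
    """True only if the drawing matches a presented bitmap pixel for pixel."""
    return freeze(drawing) in {freeze(b) for b in presented}
```

A True result points to memory. A False result says only that the drawing is novel; as the paragraph notes, it cannot say whether the image was generated during sleep or at the moment we asked.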

Finally, in a last-ditch attempt to determine what is going on here, we endow Boltz with full powers of comprehension, introspection, and reason. We ask him to draw one of the digits he mentioned in his sleep and then ask him “The digit you just drew—is that what you actually saw, or is it something you’re inventing on the spot right now?” Boltz considers this question for a minute, and then we hear his servo motors whir as he arranges his shoulders and arms into a shrug. It’s adorable.


2 Responses to I Dream of Digits

  1. fatclown says:

    Can’t we check the traces in the logs?

    • W.P. McNeill says:

      This isn’t a very realistic robot. In any realistic robot we’d have logging mechanisms that made this uncertainty go away. For that matter, we also could not endow any realistic robot with full powers of comprehension, introspection, and reason.

      What I’m actually doing here is a thought experiment in which I adopt some of the language of contemporary machine learning to see if it provides a novel way to frame old questions about human cognition. In this sense, “Can’t we just check the logs?” becomes an interesting epistemological question. Currently the answer is “no”: we don’t know enough about how the brain works. But I can imagine neuroscience progressing to the point where, say, we either did or did not see visual areas of the brain lighting up during dreams. (Actually maybe we do have data about this. I don’t know enough about current brain science.) If neuroimaging gets good enough, we may at some point be able to “check the logs” of an actual human being.
