Boltz the robot can read digits. If you hold up a handwritten digit in front of him, Boltz will say it aloud. “Two. Five. Two. Seven. Zero.” He is as good at this as a person is. The only digits he gets wrong are ones that are so sloppily written that human beings also find them hard to decipher.
Boltz’s “eyes” are cameras that transform handwritten digits into 28×28 grayscale bitmaps. These bitmaps are then sent to a neural network model that has previously been trained on a large set of bitmap/digit-name pairs. In the course of training, the model sets an array of numerical weights which define a function that maps a 28×28 bitmap to one of the categories 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. This is explained in more detail in the video below by Geoffrey Hinton, the computer scientist who pioneered this particular model.
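To make "an array of weights which define a function" concrete, here is a minimal sketch. It is a deliberately simple stand-in (a single linear layer that picks the highest-scoring digit), not Hinton's actual model. The names `recognize` and `weights` are made up for illustration, and the weight values are set by hand rather than learned.

```python
# Toy stand-in for Boltz's recognizer: one linear layer, not Hinton's model.
SIDE = 28
NUM_PIXELS = SIDE * SIDE  # 784 grayscale values in [0.0, 1.0]
NUM_DIGITS = 10


def recognize(bitmap, weights):
    """Return the digit whose weight row gives the flattened bitmap the highest score."""
    if len(bitmap) != NUM_PIXELS:
        raise ValueError("expected a flattened 28x28 bitmap")
    scores = [
        sum(w * pixel for w, pixel in zip(row, bitmap))
        for row in weights
    ]
    return max(range(NUM_DIGITS), key=lambda digit: scores[digit])


# Hypothetical weights for illustration only; a real model learns them in training.
weights = [[0.0] * NUM_PIXELS for _ in range(NUM_DIGITS)]
weights[7][0] = 1.0  # pretend ink in the top-left pixel is evidence for "7"

bitmap = [0.0] * NUM_PIXELS
bitmap[0] = 1.0
print(recognize(bitmap, weights))  # prints 7
```

Training, in this picture, is just the process of finding numbers for `weights` that make the function agree with the labels on the bitmap/digit-name pairs.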
Sometimes Boltz goes to sleep. His head droops and his camera eyes shut off. Occasionally you’ll hear him quietly name a digit while in this state—“Three…Five…Eight…Two…Three…”—even though he is clearly not looking at anything. In these cases we say that Boltz is dreaming about the digits he saw during his waking hours.
Now “dreaming” is just a fanciful turn of phrase. In reality we don’t know what Boltz is doing. Maybe he stores bitmaps that he has previously seen and later sends them to his digit recognition software, evoking a vocalization just as when the actual picture was in front of him. Or perhaps instead of memorizing exact bitmaps (verbatim, as it were), Boltz is generating an image of a digit. One of the features of Hinton’s model is that it can be run “backwards”: instead of taking a bitmap as input and producing the name of a digit as output, you can start with a digit and produce a bitmap. The fine details of the image are determined randomly, but the array of weights used in recognition guides the random selection so that the end product looks like the digit in question. So Boltz might feed 5 into the back-end of the model, and it would generate an image that—even though nothing exactly like it had ever been produced before—would nevertheless be recognizable to both human and robot as a “5”. Perhaps it is this novel image the sleeping Boltz is then recognizing and saying aloud.
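The idea of running a model "backwards" can be sketched in a few lines. What follows is a toy under loud assumptions: the weights are invented rather than learned, the function names `recognize` and `dream` are hypothetical, and the sampling rule (ink each pixel with probability given by a sigmoid of its weight) is a simple substitute for what Hinton's model really does. It only shows how one array of weights can serve both directions.

```python
import math
import random

NUM_PIXELS = 28 * 28
NUM_DIGITS = 10

# Hypothetical toy weights, not learned: pixel i counts as evidence for digit d
# when i % 10 == d, and as evidence against it otherwise.
WEIGHTS = [
    [2.0 if i % NUM_DIGITS == d else -2.0 for i in range(NUM_PIXELS)]
    for d in range(NUM_DIGITS)
]


def recognize(bitmap):
    """Forward direction: bitmap in, best-scoring digit out."""
    scores = [sum(w * p for w, p in zip(row, bitmap)) for row in WEIGHTS]
    return max(range(NUM_DIGITS), key=lambda d: scores[d])


def dream(digit, rng):
    """Backward direction: digit in, randomly sampled bitmap out.

    Each pixel is inked with probability sigmoid(weight), so the weights
    guide the randomness without fixing the outcome.
    """
    bitmap = []
    for w in WEIGHTS[digit]:
        p_ink = 1.0 / (1.0 + math.exp(-w))
        bitmap.append(1 if rng.random() < p_ink else 0)
    return bitmap


rng = random.Random(0)
first = dream(5, rng)
second = dream(5, rng)
print(first == second)                      # two dreams of "5" almost surely differ
print(recognize(first), recognize(second))  # yet both should be read back as 5
```

The toy "digits" here look nothing like handwriting, but the structure matches the scenario: the image is new each time, and the same weights that produced it are what make it recognizable.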
So far we’ve considered that when Boltz dreams, his “mind’s eye” either re-sees an old image or generates a new one. But there is a third possibility. Perhaps, just as in the Hinton-style dream scenario, Boltz starts off by randomly selecting a digit, but then, instead of generating an image, he simply says the digit aloud. The output of the random digit generator is sent directly to his speech unit. Boltz is just dreaming the idea of a digit. At no point is there a corresponding image.
If all we had to go on was Boltz talking in his sleep, we’d have no way to distinguish these three processes. A “five” sounds like a “five” sounds like a “five”. So maybe we teach Boltz to draw. We enable him to pick up a pencil and make shadings on a 28×28 grid that correspond exactly to the bitmaps. We then might ask Boltz to draw what he “saw” when he said a certain number while sleeping. We could use these drawings to distinguish between memorized and imagined digits, because only the former would exactly match images we had originally presented. But if the drawn digits were novel, that would only tell us that they were created by running the recognition model backwards; it wouldn’t tell us when this occurred. Maybe it happened when he was asleep, and Boltz is recalling the generated bitmap from memory. Maybe nothing was generated when Boltz was asleep, and he ran his model backwards only after we asked him to draw what he saw. Or maybe Boltz generated a novel image when he was asleep, but only remembered the output of his recognition unit in response to it, so when we ask him to draw what he saw, he generates a different image.
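The exact-match test described above is simple enough to write down. This is only a sketch: `classify_drawing` and the two tiny example bitmaps are hypothetical, and the bitmaps are flattened into tuples purely so they can be stored in a set.

```python
def classify_drawing(drawing, presented):
    """Label a drawing 'memorized' if it exactly matches a presented bitmap, else 'novel'."""
    return "memorized" if tuple(drawing) in presented else "novel"


# Hypothetical data: flattened 28x28 bitmaps stored as tuples so they can go in a set.
blank = (0,) * 784
one_dot = (1,) + (0,) * 783
presented = {blank}  # everything we ever held up in front of Boltz

print(classify_drawing(blank, presented))    # memorized
print(classify_drawing(one_dot, presented))  # novel
```

Note what the function's inputs leave out. It sees only the finished drawing and the set of presented images, so a verdict of "novel" carries no information about when the image was generated.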
Finally, in a last-ditch attempt to determine what is going on here, we endow Boltz with full powers of comprehension, introspection, and reason. We ask him to draw one of the digits he mentioned in his sleep and then ask him “The digit you just drew—is that what you actually saw, or is it something you’re inventing on the spot right now?” Boltz considers this question for a minute, and then we hear his servo motors whir as he arranges his shoulders and arms into a shrug. It’s adorable.