“Initially” we never hear noises and complexes of sound, but the creaking wagon, the motorcycle. We hear the column on the march, the north wind, the woodpecker tapping, the crackling fire.
It requires a very artificial and complicated attitude in order to “hear” a “pure noise”.
—Martin Heidegger, Being and Time
If you were building a machine to recognize different sounds you would start by setting up a microphone in a field. The audio signal would be digitized, deflections of a membrane translated into voltage levels, and those translated into numbers as a function of time. A computer could then process the numbers. The computer would of course have no concept of what the numbers meant. If a tree fell in the forest next to the field with no one to hear it, it would generate a string of numbers like anything else. This is a “pure noise”.
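The pipeline in this paragraph—membrane deflection to voltage to numbers as a function of time—can be sketched in a few lines. Everything here is illustrative: the sample rate, the bit depth, and the stand-in signal are choices of mine, not anything a real recognizer requires.

```python
import math

SAMPLE_RATE = 8000  # samples per second, a common rate for speech audio


def digitize(signal, seconds, bits=16):
    """Sample a continuous signal into signed integers, like an ADC.

    `signal` maps a time in seconds to a voltage in [-1.0, 1.0];
    the result is just a string of numbers as a function of time.
    """
    full_scale = 2 ** (bits - 1) - 1
    n = int(seconds * SAMPLE_RATE)
    return [round(signal(i / SAMPLE_RATE) * full_scale) for i in range(n)]


# A 440 Hz tone standing in for whatever deflects the membrane.
samples = digitize(lambda t: math.sin(2 * math.pi * 440 * t), 0.01)
```

To the computer, `samples` is all there is: eighty integers with no concept attached, a "pure noise" whether the source was a falling tree or a motorcycle.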
For your computer to be useful, you’d have to train it to make distinctions, to categorize the various strings of numbers. Current state of the art would have you collect many thousands of audio samples labeled “north wind”, “woodpecker tapping”, “crackling fire”, and “motorcycle”. Given sufficient examples, you could build a statistical model that mapped a given sequence of numbers to one of these categories. The mapping would be inexact—the statistics would embody a fair amount of guesswork—but that’s fine. That’s life.
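Current practice would be a neural network trained on spectrograms of those thousands of labeled recordings. As a toy stand-in—every name, number, and synthetic "sound" below is invented for illustration—here is the shape of the idea: reduce each string of numbers to a couple of summary statistics, average those per label, and map a new sequence to whichever category's average is nearest.

```python
import math
import random

random.seed(0)


def features(samples):
    """Crude two-number summary: average energy and zero-crossing rate."""
    energy = sum(s * s for s in samples) / len(samples)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    return (energy, crossings / len(samples))


def train(labeled):
    """Average the features per label: a nearest-centroid model."""
    totals = {}
    for label, samples in labeled:
        e, z = features(samples)
        se, sz, n = totals.get(label, (0.0, 0.0, 0))
        totals[label] = (se + e, sz + z, n + 1)
    return {lab: (se / n, sz / n) for lab, (se, sz, n) in totals.items()}


def classify(model, samples):
    """Guess the label whose centroid is closest: statistics, not concepts."""
    e, z = features(samples)
    return min(model,
               key=lambda lab: (model[lab][0] - e) ** 2 + (model[lab][1] - z) ** 2)


def noisy_tone(freq, n=800, rate=8000):
    """Synthetic stand-in for a field recording: a tone plus a little noise."""
    return [math.sin(2 * math.pi * freq * i / rate) + random.gauss(0, 0.05)
            for i in range(n)]


# A low rumble standing in for the wind, a fast clatter for the woodpecker.
training_data = [("north wind", noisy_tone(100 + 20 * random.random()))
                 for _ in range(5)]
training_data += [("woodpecker tapping", noisy_tone(1800 + 200 * random.random()))
                  for _ in range(5)]

model = train(training_data)
guess = classify(model, noisy_tone(120))
```

Real systems differ in every particular, but the move is the same: the labels come from human listeners, and the machine learns only a statistical boundary between strings of numbers. The guesswork is built in.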
There is a notion of sequence here. First there is the raw microphone signal, then there is the sorting into categories. Heidegger also proposes an ordering. (The tipoff: “‘Initially’” gets shudder-quoted.) But he orders things the other way: first there is the motorcycle etc., and then only later—at the cost of a “complicated attitude”—do we have access to the raw signal.
This squares with my experience. Where I sit right now, out in public, carrying on side conversations as I write this, I hear people I know talking first. Only with some effort could I mentally transform that into pure sound that happens to be meaningful. I look around the room and there are people I know, there is the TV, there is the chair, and so on. To see these familiar things as patches of color and spatial relationships from which I could infer daily life—that’s weird. It is an artificial and complicated attitude that I can’t pull off. But then I am a human being, not an instrumented microphone.
There are two options:
1. Machines are like humans. Raw input precedes meaningful input in some temporal or logical fashion. If I don’t hear the pure noise, that is only because “hearing” is something that occurs after a fair amount of non-conscious processing has occurred.
2. Machines are not like humans. Humans can start with a semantic whole—creaking wagon, woodpecker, motorcycle—and work backwards to raw input as needed. Or at least convince themselves that they have done so.
Option (1) appeals to me. My paid job is to make machines appear convincingly human-like, so any congruence works to my advantage. Option (2) seems like a delusion of philosophical idealism. The way to emulate human hearing is to start with a microphone in a field and work from there.
And yet, and yet. Here’s the counterexample. I am dreaming. I am dreaming that I hear a motorcycle. There is no raw external signal: it’s all in my head. Where did it come from? Did my subconscious create the appropriate set of air molecule vibrations to correspond to the sound of a motorcycle engine? If I dream that I see a motorcycle, do I first (“initially”) mentally compose patches of color that correspond to what would occur if a motorcycle were to roll into my visual field? I doubt my ability to do that. Instead, I imagine I dreamed “motorcycle” and then filled in the details later, probably after I woke up.
Machines today are not like humans. We dream in concepts.