It is fascinating how much sound is all around us. The sonic landscapes of our lives are made from all manner of organized and unorganized sound. How do we organize these sounds for ourselves, telling which ones are dangerous, pleasing, or carrying information we need? How does our auditory system work? For anyone interested in the intersections of music and psychology, music therapy, music cognition, or music aesthetics, an understanding of the auditory nervous system is essential. Scholars and performers from Marilyn Nonken to Oliver Sacks have drawn on an understanding of the auditory nervous system in their writings, analyses, and even their performances.

This week we will look at how the auditory nervous system converts the pressure waves around us into information that we call sound. We will discuss how the sensory stimuli are translated into neural impulses, where in the brain that information is processed, how we perceive pitch, and how we know where sound is coming from.

Soundscapes: Sound and the Natural World

A soundscape can be defined as the human perception of physical sound resources. Like beauty, soundscapes are in the mind of the beholder. Any sound that is integrated into our life, from birdsong to traffic, can be considered part of our soundscape. Some of these sounds are ordered, and some are not. Music is the most highly ordered aspect of a soundscape (the sound with the highest level of organization around it). Yet even music can incorporate soundscapes, or indeed be a soundscape, and so the discussion of how much organization must go into sound to transform it into music is a fascinating one. Acoustic resources are physical sound sources, including natural sounds such as wind, water, wildlife, and vegetation, that can contribute to a soundscape.

The acoustic environment is the combination of all the acoustic resources within a given area. This includes natural sounds and cultural sounds, as well as non-natural, human-caused sounds. The sound vibrations made by our imaginary falling tree are part of the acoustic environment regardless of whether a human is there to perceive them. Bat echolocation calls, while outside the realm of the human soundscape, are also part of the acoustic environment. It is therefore critical to take the entire acoustic environment into account when working to protect natural sounds.

Consider this fascinating work by Steve Reich, Different Trains, which combines string quartet with recorded speech and train sounds:


This video sums it up:

Seriously though, many things need to happen before we actually ‘hear’ what the outside world is providing. And remember, it all happens as the various frequencies emitted by the things around us change the shifting pressure patterns of the air. Air is the medium through which all these waves of information travel to us.

Once sound reaches the ear, many more things have to happen for frequencies in the air to be translated into something we hear. These translations differ from person to person and change as we age, so no one hears exactly the same thing, and certainly no one understands the same thing from what they hear. A review of this systematic translation, from frequencies moving through air into what we hear as sound, is important for several reasons:

  1. It asks us to confront the issue of why people hear sound and music differently.
  2. It is the beginning of a study of music cognition, in which we can investigate theories about why music affects us the way it does.
  3. It is the beginning of a biological study on how we can hear sounds and music better, more accurately, or with the aid of technologies that enhance or recreate the auditory nervous system.

One important thing to consider about our system of hearing is that it is connected to our nervous system, and like many things connected to our nervous system, we cannot turn it off. We cannot choose not to hear things the way we can choose not to see, smell, or touch things. Our constructed world is built on this, in the form of smoke alarms, fire alarms, and other loud sounds that help keep us safe. The system that allows us to enjoy music is the same system that is built to process danger.

Try this:

Using your phone’s voice memo app or another simple recording device, take a sound walk through campus, a park, an urban area, or another setting. Record between 2 and 5 minutes. Listen back. What range of frequencies do you hear? What can be understood as music, noise, ambient sound, or some other sonic characteristic? Are there natural organized sounds? Human-made organized sounds? Are you surprised to hear sounds inexplicably working together or clashing? Are there elements you are not able to avoid hearing, such as sirens or machines? Does the sonic landscape change during your walk?

The ear can be separated into multiple sections. The outer ear includes the pinna, which is the visible part of the ear that protrudes from our heads, the auditory canal, and the tympanic membrane, or eardrum. The middle ear contains three tiny bones known as the ossicles, which are named the malleus (or hammer), incus (or anvil), and the stapes (or stirrup). The inner ear contains the semi-circular canals, which are involved in balance and movement (the vestibular sense), and the cochlea. The cochlea is a fluid-filled, snail-shaped structure that contains the sensory receptor cells (hair cells) of the auditory system.

Sound waves travel along the auditory canal and strike the tympanic membrane, causing it to vibrate. This vibration results in movement of the three ossicles. As the ossicles move, the stapes presses into a thin membrane of the cochlea known as the oval window. As the stapes presses into the oval window, the fluid inside the cochlea begins to move, which in turn stimulates hair cells, which are auditory receptor cells of the inner ear embedded in the basilar membrane. The basilar membrane is a thin strip of tissue within the cochlea.

The activation of hair cells is a mechanical process: the stimulation of the hair cell ultimately leads to activation of the cell. As hair cells become activated, they generate neural impulses that travel along the auditory nerve to the brain. Auditory information is shuttled to the inferior colliculus, the medial geniculate nucleus of the thalamus, and finally to the auditory cortex in the temporal lobe of the brain for processing. As in the visual system, there is evidence suggesting that information about auditory recognition and localization is processed in parallel streams.

As seen in the videos above, auditory information from the cochlea is sent through the auditory nerve to the auditory cortex in the temporal lobe of the brain for processing. The images below show how parts of the cochlea receptive to certain frequencies correspond to parts of the brain’s auditory cortex. This correspondence is called tonotopic mapping, or tonotopy: the spatial arrangement of where sounds of different frequencies are processed in the brain. In humans, six tonotopic maps have been identified in the primary auditory cortex.

The four essential areas of auditory perception

There are different types of kinocilia and stereocilia in the cochlea responsible for four essential areas of auditory perception:

  • Pitch
  • Volume
  • Location
  • Timbre


Pitch can originate from sounds in our natural soundscape or from tuned instruments. Even untuned instruments often have some pitch in their sound. Normal talking has pitch, as do cat meows and, of course, birdsong. Industrial sounds like car engines also have pitch. We are able to distinguish a great many pitch frequencies in the mix of information we receive from all these sound sources. Within the cochlea, different stereocilia are responsible for detecting different pitches. The stereocilia responsible for higher frequencies are located at the entrance to the cochlea, while the stereocilia for lower tones are located deeper inside the organ.
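This base-to-apex frequency map can be sketched numerically. A common approximation is the Greenwood function; the constants below are Greenwood’s published fit for the human cochlea, and the sketch assumes position is expressed as a fraction of basilar-membrane length measured from the apex (so the entrance to the cochlea is x = 1).

```python
def greenwood_frequency(x):
    """Approximate characteristic frequency (Hz) at position x along the
    human basilar membrane, where x = 0 is the apex (deep inside the
    cochlea) and x = 1 is the base (the entrance, near the oval window).
    Constants are Greenwood's fitted values for the human cochlea."""
    A, a, k = 165.4, 2.1, 0.88
    return A * (10 ** (a * x) - k)

# Low frequencies map to the apex, high frequencies to the base:
print(round(greenwood_frequency(0.0)))   # apex: about 20 Hz
print(round(greenwood_frequency(1.0)))   # base: about 20,700 Hz
```

Notice that the map is logarithmic: each step along the membrane multiplies frequency rather than adding to it, which fits how we hear octaves as equal pitch distances.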


As discussed earlier, a major characteristic of any sound is its volume (dynamics, sound level, etc.). The volume level we register for a sound depends on two aspects: sound pressure and frequency. The higher the sound pressure levels that reach the ear, the more the stereocilia are set in motion, and a correspondingly large number of nerve impulses is triggered. The human ear is most sensitive to frequencies in the midrange, around 4,000 Hertz. For very low and very high frequencies, our ears’ ability to perceive sound decreases, so we perceive the volume of these sounds as lower. Below is a commonly used frequency hearing test. Notice how your perception of volume changes as the frequency rises: the actual decibel level is not changing, but frequencies near the midrange engage more stereocilia.
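This frequency-dependent sensitivity is often approximated in audio engineering by the A-weighting curve (standardized in IEC 61672). As a rough sketch of the idea, the function below estimates how much quieter or louder a pure tone sounds relative to a 1 kHz tone at the same physical sound pressure; it is a simplification of loudness perception, not a full model of the ear.

```python
import math

def a_weighting_db(f):
    """A-weighting correction in dB for a frequency f in Hz: negative
    values mean a tone at that frequency sounds quieter than a 1 kHz
    tone of the same physical level; positive values, slightly louder."""
    f2 = f * f
    ra = (12194**2 * f2**2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194**2)
    )
    return 20 * math.log10(ra) + 2.0

# Tones of equal physical pressure do not sound equally loud:
print(round(a_weighting_db(100), 1))    # about -19.1 dB (sounds much quieter)
print(round(a_weighting_db(1000), 1))   # 0.0 dB (the reference point)
print(round(a_weighting_db(4000), 1))   # about +1.0 dB (near peak sensitivity)
```

This is why a 100 Hz hum has to be far more powerful than a 4,000 Hz whine to seem equally loud.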

In the video below, notice how the various kinocilia and stereocilia of the cochlea react to various frequency stimulations.


Our auditory perception allows us to determine the direction from which a car is approaching or where someone who calls our name might be standing. When it comes to determining where a sound is coming from, it’s necessary to differentiate between the horizontal plane (left, center, right) and median plane (up, down, back, front), as the ear employs different strategies for each.

Differences in the propagation time and volume between the two ears are essential for localizing sound along a horizontal plane. A sound coming from our left side will reach the left ear slightly sooner than the right. This time difference is minimal but is enough for our brain to process and localize the sound. Along the median plane, our ears primarily use frequency response in order to narrow down the source of a sound.
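The size of this arrival-time difference can be estimated with Woodworth’s classic spherical-head model. The sketch below assumes a typical head radius of about 8.75 cm and the speed of sound in air at room temperature; both numbers are illustrative assumptions, not measurements of any particular listener.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees C
HEAD_RADIUS = 0.0875     # m; a commonly assumed average head radius

def interaural_time_difference(azimuth_deg):
    """Woodworth's spherical-head estimate of the arrival-time difference
    (in seconds) between the two ears for a distant source at the given
    azimuth (0 = straight ahead, 90 = directly to one side)."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

# The delay grows from zero (straight ahead) to well under a millisecond:
print(interaural_time_difference(0))                 # 0.0 s
print(round(interaural_time_difference(90) * 1e6))   # about 656 microseconds
```

Even at its maximum the difference is under a millisecond, which gives a sense of how finely the brainstem must compare the two ears’ signals.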

When both of our ears are stimulated, we call it stereophonic hearing. The difference between the intensity and the frequency at each ear, over time, affects our perception of sound. Another term for this, binaural hearing, describes how we localize sound and identify its source.

An example of stereophonic hearing is when the same sound reaches the two ears at the same time and is therefore perceived as louder than if it were to reach only one ear first. The sound itself is not louder, but our orientation to it is. An example of binaural hearing is when we are able to identify where a sound is coming from by noticing which ear hears the sound more loudly and moving closer to the sound’s location.

With only one ear it would be very difficult to guess where a sound is coming from. Differences in loudness and arrival time between the two ears allow for quite good sound localization in the horizontal plane. It is harder for the ear to localize sound in the median plane.


Timbre is also known as tone color or tone quality. No piano sounds just like another, and the difference is more than just a matter of pitch. When it comes to music, we use words like “warm,” “edgy,” “lush” and “tinny” to describe a sound. Such auditory attributes fall under the sonic attribute of timbre. Timbre primarily describes the way different tones combine. Instruments, and even our own voices, produce fundamental tones and overtones that combine into complex harmonic spectra.

In Class Assignment:

Before class, find an object that produces sound of some kind, either pitched or unpitched. Think of a way in which the object can be ‘performed’ to produce pitch or some sonic effect. In class, keep the object hidden but perform a sound with it. Ask the class what the object could possibly be. Ask for ways in which the sound produced could be described. Are those descriptions related at all to the object itself or not? Can the listeners describe the frequency or timbre of the object’s sound?

One of the most prominent scientists of music and cognition in the 20th century was Oliver Sacks. Below, he discusses his research.


1. The Organ of Corti is:
A. The first synthesizer developed to aid in cochlear implants.
B. The membrane in the middle of the cochlea.
C. The canal that leads to the tympanic membrane.
D. The first treatise written on the auditory nervous system.

2. The middle ear consists of:
A. The pinna, the tympanic membrane and the auditory canal
B. The oval window, the cochlea and the circular window.
C. The malleus, incus and stapes
D. The tympanic membrane, malleus and incus

3. Kinocilia or stereocilia are:
A. Hairs in the cochlea that flex to turn fluid pressure into electric stimuli for the auditory nerve.
B. The linings of the tympanic membrane that respond to sonic wave pressure.
C. Microbes that help the general health of the inner ear by keeping bacteria in check.
D. Supplements that can help in developing a stronger auditory nerve system.

4. The four essential areas of auditory perception are:
A. Decibel, Frequency, Tone Quality, Expression
B. Pitch, Volume, Location, Timbre
C. Danger, Safety, Pleasing, Neutral
D. Low, High, Quiet, Loud

5. Stereophonic hearing is:
A. Using speakers
B. Using headphones
C. When a sound reaches both ears at the same time.
D. Another word for digital playback

6. Tonotopic mapping, or tonotopy, is:
A. The mapping of sounds through the outer and inner ear.
B. The mapping of auditory information sent from the cochlea to the brain’s auditory cortex.
C. The method for understanding differentiations between pitch and timbre.
D. The science behind the behavior of sound waves.

7. Acoustic Resources are:
A. The theoretical tools necessary to understand frequency
B. Instruments
C. Physical sound sources that contribute to a soundscape
D. Sounds that only humans produce