Ambiophonics, 2nd Edition: Replacing Stereophonics to Achieve Concert-Hall Realism
Chapter 1
Ralph Glasgal
August 1999
www.ambiophonics.org
Ambiophonics
is the logical
multi-speaker
replacement
for
stereophonics
and a
technical
methodology
which, if
adhered to
closely, makes
it possible to
immerse
oneself in an
exceedingly
real acoustic
space, sharing
it with the
music
performers on
the stage in
front of you.
Ambiophonics
does this, at
its best,
using ordinary
standard and
existing two
channel
recordings. We
will show in
the chapters
that follow
that, as hard
as this may be
to believe,
there is
nothing to be
gained as far
as realism in
acoustic music
reproduction
is concerned
by using more
than two
recorded
channels (as
opposed to
multi-speaker)
and that the
complex
microphone
arrangements
that
multichannel
recording
implies are
actually
deleterious
and wasteful
of bandwidth
that could be
put to better
use.
Ambiophonics
is like a
visit to a
concert hall
and is for
serious
listeners who
do not often
read, talk,
eat, knit, or
sleep in their
home concert
halls, any
more than they
would at a
live
performance.
Ambiophonics
is not
suitable for
movies, video,
or any sound
tracks where
direct or
moving sound
sources come
from the
extreme sides,
rear, or
overhead.
Ever
since 1881
when Clément
Ader ran
signals from
ten spaced
pairs of
telephone
carbon
microphones
clustered on
the stage of
the Paris
Opera via
phone lines to
single
telephone
receivers in
the Palace of
Industry that
were listened
to in pairs,
practitioners
of the
recording arts
have been
striving to
reproduce a
musical event
taking place
at one
location and
time at
another
location and
time with as
little loss in
realism as
possible.
While
judgments as
to what sounds
real and what
doesn't may
vary from
individual to
individual,
and there are
even some who
hold that
realism is not
the proper
concern of
audiophiles,
such views of
our hearing
life should
not be allowed
to slow
technical
advances in
the art of
realistic
auralization
that listeners
may then
embrace or
disdain as
they please.
What is Realism in Sound Reproduction?
Realism
in staged
music sound
reproduction
will usually
be understood
to mean the
generation of
a sound field
realistic
enough to
satisfy any
normal
ear-brain
system that it
is in the same
space as the
performers,
that this is a
space that
could
physically
exist, and
that the sound
sources in
this space are
as full bodied
and as easy to
locate as in
real life.
Realism does
not
necessarily
equate to
accuracy or
perfection.
Achieving
realism does
not mean that
one must
slavishly
recreate the
exact space of
a particular
recording
site. For
instance, a
recording made
in Avery
Fisher Hall
but reproduced
as if it were
in Carnegie
Hall is still
realistic,
even if
inaccurate.
While a home
reproduction
system may not
be able to
outperform a
live concert
in a hall the
caliber of
Boston's
Symphony Hall,
in many cases
the home
experience can
now exceed a
live event in
acoustic
quality. For
example, a
recording of
an opera made
in a smallish
studio can now
easily be made
to sound
better at home
than it did to
most listeners
at a crowded
recording
session. One
can also argue
that a home
version of
Symphony Hall,
where one is
apparently
sitting tenth
row center, is
more involving
than the live
experience
heard from a
rear side seat
in the balcony
with
obstructed
visual and
sonic
prospect. In a
similar vein,
realism does
not mean
perfection. If
a full
symphony
orchestra is
recorded in
Carnegie Hall
but played
back as if it
were in
Carnegie
Recital Hall,
one may have
achieved
realism but
certainly not
perfection.
Likewise, as
long as
localization
is as
effortless and
as precise as
in real life,
the reproduced
locations of
discrete sound
sources
usually don't
have to be
exactly in the
same positions
as at the
recording site
to meet the
standards of
realism
discussed
here. (Virtual
Reality
applications,
by contrast,
often require
extreme
accuracy but
realism is not
a
consideration.)
An example of
this occurs if
a recording
site viewed
from the
microphone has
a stage width
of 120° but
is played back
on a stage
that seems
only 90°
wide. What
this really
means in the
context of
realism is
that the
listener has
moved back in
the reproduced
auditorium
some fifteen
rows, but
either stage
perspective
can be
legitimately
real. Being
able to
localize a
stage sound
source in a
stereo or
surround
multichannel system
does not
guarantee that
such
localization
will sound
real. For
example, a
soloist's
microphone
panned by a
producer to
one
loudspeaker is
easy to
localize but
almost never
sounds real.
In
a similar
vein, one can
make a case
that one can
have glorious
realism, even
without any
detailed front
stage
localization,
as long as the
ambient field
is correct.
Anyone who has
sat in the
last row of
the family
circle in
Carnegie Hall
can attest to
this. This
kind of
realism makes
it possible to
work seeming
miracles even
with mono
recordings.
Reality is in the Ear of the Behearer
While
it is always
risky to make
comparisons
between
hearing and
seeing, I will
live
dangerously
for the
moment. If
from birth,
one were only
allowed to
view the world
via a small
black and
white TV
screen, one
could still
localize the
position of
objects on the
video screen
and could
probably
function quite
well. But
those of us
with normal
sight would
know how drab,
or I would say
unrealistic,
such a
restricted
view of the
world actually
was. If we now
added color to
our subject's
video screen,
the still
grossly
handicapped
(by our
standards)
viewer would
marvel at the
previously
unimaginable
improvement.
If we now
provided
stereoscopic
video, our now
much less
handicapped
viewer would
wonder how he
had ever
functioned in
the past
without depth
perception or
how he could
have regarded
the earlier
flat
monoscopic
color images
as being
realistic.
Finally, the
day would come
when we
removed the
small video
screens and
for the first
time our
optical guinea
pig would be
able to enjoy
peripheral
vision and the
full
resolution,
contrast and
brightness
that the human
eye is capable
of and fully
appreciate the
miracle of
unrestricted
vision. The
moral of all
this is that
only when all
the visual
sense
parameters are
provided for,
can one enjoy
true visual
reality and
the same is
true for sonic
reality.
Since
most of us are
quite familiar
with what live
music in an
auditorium
sounds like,
we can sense
unreality in
reproduction
quite readily.
But in the
context of
audio
reproduction,
the
progression
toward realism
is similar to
the visual
progression
above. To make
reproduced
music sound
fully
realistic, the
ears, like the
eyes, must be
stimulated in
all the ways
that the
ear-brain
system
expects. Like
the visual
example, when
we go from
mono to stereo
to matrix
surround to
multi-channel
discrete, etc.
we marvel at
each
improvement.
But since we
already know
what real
concert halls
sound like, we
soon realize
that something
is missing. In
general,
multi-channel
recording
methods or
matrix
surround
systems (Hafler,
SQ, QS, UHJ,
Dolby,
5.1, etc.) seem
like exciting
improvements
when first
heard by long
realism-deprived
stereo music
auditors, but
in the end
don't sound
real. What is
usually
missing is
completeness
and sonic
consistency.
One can only
achieve
realism if all
the ear's
expectations
are
simultaneously
satisfied. If
we assume that
we know
exactly how
all the
mechanisms of
the ear work,
then we could
conceivably
come up with a
sound
recording and
reproduction
system that
would be quite
realistic. But
if we take the
position that
we don't know
all the ear's
characteristics
or that we
don't know how
much they vary
from one
individual to
another or
that we don't
know the
relative
importance of
the hearing
mechanisms we
do know about,
then the only
thing we can
do, until a
greater
understanding
dawns, is what
Manfred
Schroeder
suggested over
a quarter of a
century ago,
and deliver to
the remote
ears a
realistic
replica of
what those
same ears
would have
heard when and
where the
sound was
originally
generated.
Four Methods Used to Generate Reality at a Distance
Audio
engineers have
grappled with
the problem of
recreating
sound fields
since the time
of Alexander
Graham Bell.
The classic
Bell Labs
theory
suggests that
a curtain, in
front of a
stage, with an
infinite
number of
ordinary
microphones
driving a like
curtain of
remote
loudspeakers
can produce
both an
accurate and a
realistic
replica of a
staged musical
event and
listeners
could sit
anywhere
behind this
curtain, move
their heads
and still hear
a realistic
sound field.
Unfortunately,
this method,
even if it
were
economically
feasible, does
not deliver
either
accuracy or
realism. Such
a curtain acts
like a lens
and changes
the direction
or focus of
the sound
waves that
impinge on it.
Like light
waves, sound
waves have a
directional
component that
is easily lost
in this
arrangement
either at the
microphone,
the speaker or
both places.
Thus each
radiating
loudspeaker,
in practice,
represents a
new discrete
source of
sound with
uncontrolled
directionality,
possibly
diverting
sound meant
for oblivion
in the ceiling
down to the
listener and
causing other
sounds to
impinge on the
head at odd
angles.
Finally
this curtain
of
loudspeakers
does not
radiate into a
concert-hall
size listening
room and so
one would
have, say, an
opera house
stage attached
to a listening
room not even
large enough
to hold the
elephants in
Act 2 of Aida.
This lack of
opera-house
ambience
wouldn't by
itself make
this
reproduction
system sound
unreal, even
if the rest of
the field were
somehow made
accurate, but
it certainly
wouldn't sound
perfect. The
use of speaker
arrays (walls
of hundreds of
speakers)
surrounding a
relatively
large
listening area
has been shown
to be able to
reproduce
ambient sound
fields with
remarkable
accuracy. But
while this
technique may
be useful in
sound
amplification
systems in
halls,
theaters or
labs,
application to
playback in
the home seems
doubtful.
The Binaural Approach
A
second more
practical and
often exciting
approach is
the binaural
one. The idea
is that, since
we only have
two ears, if
we record
exactly what a
listener would
hear at the
entrance to
each ear canal
at the
recording site
and deliver
these two
signals,
intact, to the
remote
listener's ear
canals then
both accuracy
and realism
should be
perfectly
captured. This
concept almost
works and
could
conceivably be
perfected, in
the very near
future, with
the help of
advanced
computer
programs,
particularly
for virtual
reality
applications
involving
headsets or
near field
speakers. The
problem is
that if a
dummy head,
complete with
modeled ear
pinnae and ear
canal embedded
microphones,
is used to
make the
recording,
then the
listener must
listen with
in-the-ear-canal
earphones
because
otherwise the
listener's own
pinnae would
also process
the sound and
spoil the
illusion.
The
real
conundrum,
however, is
that the dummy
head does not
match closely
enough any
particular
human
listener's head
shape or
external ear
to avoid the
internalization
of the sound
stage whereby
one seems to
have a full
symphony
orchestra (and
all of
Carnegie Hall)
from ear to
ear and from
nose to nape.
Internalization
is the
inevitable and
only logical
conclusion a
brain can come
to when
confronted
with a sound
field not at
all processed
by the head or
pinnae. For
how else could
a sound have
avoided these
structures
unless it
originated
inside the
skull? If one
uses a dummy
head without
pinnae, then,
to avoid
internalization,
one needs
earphones that
stand off from
the head, say,
to the front.
But now the
direction of
ambient sound
is incorrect.
IMAX is an
example of
this off the
ear method, as
supplemented
with
loudspeakers.
Unfortunately,
head-shape
differences
between the
dummy head and
the listener's
head remain
and usually
engender a
feeling of
unreality.
The
fact that
binaural sound
via earphones
runs into so
many
difficulties
is a powerful
indication
that
individual
head shapes
and outer ear
convolutions
are critically
important to
our ability to
sense sonic
reality, but as
we shall see,
loudspeaker
binaural is an
essential
element of the
Ambiophonic
paradigm.
Wavefront Synthesis
A
third
theoretical
method of
generating
both an
accurate and a
realistic
soundfield is
to actually
measure the
intensity and
the direction
of motion of
the
rarefactions
and
compressions
of all the
impinging
soundwaves at
the single
best listening
position
during a
concert and
then recreate
this exact
sound wave
pattern at the
home listening
position upon
playback. This
method is the
one expounded
by the late
Michael Gerzon
starting in
the early 1970s
and embodied
in the
paradigm known
as Ambisonics.
In Ambisonics,
(ignoring
height
components) a
coincident
microphone
assembly,
which is
equivalent to
three
microphones
occupying the
same point in
space,
captures the
complete
representation
of the
pressure and
directionality
of all the
sound rays at
a single point
at the
recording
site. In
reproduction,
speakers
surrounding
the listener
produce
soundwaves
that
collectively
converge at
one point (the
center of the
listener's
head) to form
the same
rarefactions
and
compressions,
including
their
directional
components,
that were
heard by the
microphone.
In
theory, if the
reconstructed
soundwave is
correct in all
respects at
the center of
the head (with
the listener's
head absent
for the
moment) then
it will also
be correct
three and one
half inches to
the right or
left of this
point at the
entrance to
the ear canals
with the head
in place. The
major
advantage of
this technique
is that it can
encompass
front stage
sounds, hall
ambience and
rear direct
sounds
equally, and
that since it
is recreating
the original
sound field
(at least at
this one
point) it does
not rely on
the quirky
phantom image
illusion of
traditional
Blumlein
stereo.
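The single-point idea can be sketched in a few lines of Python. The sketch below encodes a source into the three horizontal components (commonly labelled W, X and Y) and decodes them to a regular ring of speakers. The function names, the unit gain on W, and the simple projection decoder are assumptions chosen for illustration, not Gerzon's published decoder design. The model deliberately has no head or pinnae in it, which is exactly the simplification discussed in the paragraphs that follow.

```python
import math

def encode_wxy(sample, azimuth_deg):
    """Encode one sample arriving from azimuth_deg (0 = front, 90 = left)
    into pressure (W) and two horizontal direction components (X, Y).
    Unit gain on W is an assumption of this sketch."""
    az = math.radians(azimuth_deg)
    return sample, sample * math.cos(az), sample * math.sin(az)

def decode_to_ring(w, x, y, speaker_azimuths_deg):
    """Hypothetical projection decoder for a regular ring of 3 or more speakers."""
    n = len(speaker_azimuths_deg)
    feeds = []
    for sp_deg in speaker_azimuths_deg:
        sp = math.radians(sp_deg)
        feeds.append((w + 2.0 * (x * math.cos(sp) + y * math.sin(sp))) / n)
    return feeds

def sum_at_centre(feeds, speaker_azimuths_deg):
    """Add up what the speakers deliver at the centre point (no head present)."""
    w = sum(feeds)
    x = sum(f * math.cos(math.radians(a)) for f, a in zip(feeds, speaker_azimuths_deg))
    y = sum(f * math.sin(math.radians(a)) for f, a in zip(feeds, speaker_azimuths_deg))
    return w, x, y

ring = [45.0, 135.0, 225.0, 315.0]          # assumed square speaker layout
recorded = encode_wxy(1.0, 30.0)            # source 30 degrees left of centre
rebuilt = sum_at_centre(decode_to_ring(*recorded, ring), ring)
```

In this idealized case `rebuilt` matches `recorded` at the centre point, which is all the method promises; it says nothing about what happens a few inches away at each ear.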
The
Ambisonic
method is not
easy to keep
accurate at
frequencies
much over 1500
Hz and thus
must and does
rely on the
apparent
ability of the
brain to
ignore this
lack of
realistic high
frequency
localization
input and
localize on
the basis of
the easier to
reconstitute
lower
frequency
waveforms
alone. This
would be fine
if
localization,
by itself,
equated to
realism or we
were only
concerned with
movie surround
sound
applications.
Other
problems with
basic
Ambisonics
include the
fact that it
requires at
least three
recorded
channels and
therefore can
do nothing for
the vast
library of
existing
recordings.
Back on the
technical
problem side,
one needs to
have enough
speakers
around the
listener to
provide
sufficient
diversity in
sound
direction
vectors to
fabricate the
waveform with
exactitude and
all these
speaker
positions,
relative to
the listener,
must be
precisely
known to the
Ambisonic
decoder.
Likewise the
frequency,
delay and
directional
responses of
all the
speakers must
be known or
closely
controlled for
best results
and as in all
other
loudspeaker
systems the
effects of
listening room
reflections
must also be
taken into
account, or
better yet,
eliminated.
As
you might
imagine, it is
quite
difficult,
particularly
as the
frequency goes
up, to ensure
that the size
of the
Ambisonic
field at the
listening
position is
large enough
to accommodate
the head, all
the normal
motions of the
head, the
everyday
errors in the
listener's
position, and
more than one
listener.
Those readers
who have tried
to use the
Lexicon
panorama mode,
the Carver
sonic hologram
or the Polk
SDA speaker
system, all
designed to
correct the
higher
frequency
parts of a
simple stereo
soundfield at
the listener's
ear by
acoustic
cancellation
will
appreciate how
difficult this
sort of thing
is to do in
practice, even
when only two
speakers are
involved.
In
my opinion,
however, the
basic barrier
to reality,
via any single
point waveform
reconstruction
method, like
Ambisonics, is
its present
inability, as
in the
binaural case,
to accommodate
to the effects
of the outer
ear and the
head itself on
the shape of
the waveform
actually
reaching the
ear canal. For
instance, if a
wideband
soundwave from
a left front
speaker is
supposed to
combine with a
soundwave from
a rear right
speaker and a
rear center
speaker etc.
then for those
frequencies
over say 2500
Hz the left
ear pinna will
modify the
sound from
each such
speaker quite
differently
than expected
by the
equations of
the decoder,
with the
result that
the waveform
will be
altered in a
way that is
quite
individual and
essentially
impossible for
any practical
decoder to
control. The
result is good
low frequency
localization
but poor or
non-existent
pinna
localization.
Unfortunately,
as documented
below, mere
localization,
lacking
consistency,
as is
unfortunately
the case in
stereo,
surround sound
or Ambisonics
is no
guarantor of
realism.
Indeed, if a
system must
sacrifice a
localization
mechanism, let
it be the
lowest
frequency one.
Ambiophonics
The fourth approach
that I am aware of
is the one I have
called Ambiophonics.
Ambiophonics
assumes that
there are more
localization
mechanisms
than are
dreamed of in
the previous
philosophies
and strives to
satisfy them
all, even the
unknown ones.
It also takes
the position
that this
reproduction
technology
need only be
concerned with
reproducing
staged
acoustical
musical
events, not
movies or
virtual
reality. The
advantage of
focusing on
just one
aspect of
sonic reality
is that this
reality is
achievable
today, is
reasonable in
cost, and is
applicable to
existing LPs,
CDs, and
future DVDs.
One
basic element
in Ambiophonic
theory is that
it is best not
to record rear
and side
concert-hall
ambience or
try to extract
it later from
a difference
signal or
recreate it
via waveform
reconstruction,
but to
regenerate the
ambient part
of the field
using real,
stored
concert-hall
data to
generate early
reflections
and
reverberant
tail signals
using the new
generation of
digital signal
processors.
The variety
and accuracy
of such
synthesized
ambient fields
is limited
only by the
skill of
programmers
and data
gatherers, and
the speed and
size of the
computers
used. Thus, in
time, any
wanted degree
of concert
hall design
perfection
could be
achieved. A
library of the
world's great
halls may be
used to
fabricate the
ambient field
as has already
been done with
startling
success in the
JVC XP-A1010.
The number of
speakers
needed for
ambience
generation
does not need
to exceed six
or eight
(although
Tomlinson
Holman of THX
fame is now up
to ten and I
usually go
with 16) and
is comparable
to Ambisonics
or surround
sound in this
regard. But
even more
speakers could
be used as
this ambience
recovery
method, called
convolution,
is completely
scalable and
the quality
and location
of these
speakers is
not critical.
Ambiophonics
is less
limited as to
the number of
listeners who
can share the
best
experience at
the same time
than most
implementations
of other
methods using
a similar
number of
speakers but
Ambiophonics
is certainly
not suited to
group
listening.
However, as in
a non-ideal
seat in a
concert hall,
one has a
marked sense
of space
anywhere in
the room while
the orchestra
is playing
somewhere over
there.
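The convolution step mentioned above can be illustrated with a minimal Python sketch. It convolves a dry signal with a stored impulse response to produce the feed for one ambience speaker. The sample rate, the two reflection delays and their gains are made-up toy values, not measurements of any real hall, and a practical processor would use a much longer response and a faster FFT-based method.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: every input sample launches a scaled copy
    of the impulse response into the output."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

RATE_HZ = 8000                              # assumed sample rate for the toy example

def delay_samples(milliseconds):
    """Convert a delay in whole milliseconds to a sample index."""
    return RATE_HZ * milliseconds // 1000

# Hypothetical ambience-only impulse response: no direct sound, two early
# reflections at 20 ms and 45 ms with arbitrary illustrative gains.
hall_ir = [0.0] * delay_samples(100)
hall_ir[delay_samples(20)] = 0.5
hall_ir[delay_samples(45)] = 0.3

dry = [1.0] + [0.0] * 15                    # a single click from the recording
ambience_feed = convolve(dry, hall_ir)      # signal for one surround speaker
```

Each ambience speaker would get its own impulse response, which is why the approach scales to any number of speakers.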
The
other basic
tenet of
Ambiophonics
is similar to
Ambisonics and
that is to
recreate at
the listening
position an
exact replica
of the
original
pressure
soundwave.
However,
Ambiophonics
does this by
transporting
the sound
source, stage,
and hall to
the listening
room rather
than a point
wavefront to
the ears. In
other words,
Ambiophonics
externalizes
the binaural
effect, using,
as in the
binaural case,
just two
recorded
channels but
with two front
stage
reproducing
loudspeakers
and eight or
so ambience
loudspeakers
in place of
earphones.
Ambiophonics
generates
stage image
widths up to
about 150°
with an
accuracy and
realism that
far exceeds
that of any
other two-channel or
even multichannel
recording
scheme. I for
one have never
had a seat at
a live
performance
where the
music came
from anything
approaching a
full 180
degrees so
this
limitation in
stage width
seems of
little moment.
Psychoacoustic Fundamentals Related to Realism in Reproduced Sound
The
question is
how to achieve
realistic
sound with the
psychoacoustic
knowledge at
hand or
suspected. For
starters, the
fact that
separated
front
loudspeakers
can produce
centrally
located
phantom images
between
themselves is
a
psychoacoustic
fluke akin to
an optical
illusion that
has no purpose
or counterpart
in nature and
is a poor
substitute for
natural
frontal
localization.
Any
reproduction
method that
relies on
stimulating
phantom
images, and
this includes
not only
stereo but
most versions
of surround
sound, can
never achieve
realism even
if they
achieve
localization.
Realism cannot
be obtained
merely by
adding
surround
ambience to
frontal
phantom
localization.
Ambisonics,
Binaural, and
Ambiophonics
do not employ
the phantom
image
mechanism to
provide the
front stage
localization
and therefore,
in theory,
should all
sound more
realistic than
stereo and, in
fact, almost
always do.
The
optimized
Ambiophonic
microphone
arrangement
discussed
later could
make this
approach to
realism even
more
effective, but
I am happy to
report that
Ambiophonics
works quite
well with most
of the
microphone
setups used in
classical
music or
audiophile
caliber jazz
recordings.
Adding
home-generated
ambience
provides the
peripheral
sound vision
to perfect the
experience.
Since
our method is
to just give
the ears
everything
they need to
get real, it
is not
essential to
prove that the
pinnae (and I
usually mean
this word to
also include
the concha,
the head and
the torso) are
more important
than some
other part of
the hearing
mechanism, but
the plain fact
is that they
are. To me it
seems
inconceivable
that anyone
could assume
that the pinnae
are vestigial
or less
sensitive in
their
frequency
domain than
the other ear
structures are
in theirs. As
a
hunter-gatherer
animal, it
would be of
the utmost
importance to
sense the
direction of a
breaking twig,
a snake's
hiss, an
elephant's
trumpet, a
bird's call,
the rustle of
game etc. and
probably of
less
importance to
sense the
lower
frequency
direction of
thunder, the
sigh of the
wind, or the
direction of
drums. The
size of the
human head
clearly shows
the bias of
nature in
having humans
extra
sensitive to
sounds over
700 Hz. Look
at your ears.
The extreme
non-linear
complexity of
the outer ear
structures,
and their
small
dimensions
defy
mathematical
definition and
clearly
implies that
their exact
function is
too complex
and too
individual to
understand,
much less
fool, except
in half-baked
ways. The
convolutions
and cavities
of the ear are
so many and so
varied as
to make sure
that their
high frequency
response is as
jagged as
possible and
as distinctive
a function of
the direction
of sound
incidence as
possible. The
idea is that
no matter what
high
frequencies a
sound consists
of or from
what direction
a transient
sound comes
from, the
pinnae and
head together
or even a
single pinna
alone will
produce a
distinctive
pattern that
the brain can
learn to
recognize in
order to say
this sound
comes from
over there.
The
outer ear is
essentially a
mechanical
converter that
maps sound
arrival
directions to
preassigned
frequency
response
patterns.
There is also
no purpose in
having the
ability to
hear
frequencies
over 10 kHz,
say, if they
cannot aid in
localization.
The dimensions
of the pinna
structures and
the
measurements
by Møller
strongly
suggest, if
not yet prove,
that the pinnae
do function
for this
purpose even
in the highest
octave.
Møller's
curves of the
pinna and head
functions with
frequency and
direction are
so complex
that the
patterns are
largely
unresolvable
and very
difficult to
measure using
live subjects.
Again, it
doesn't matter
whether we
know exactly
how anyone's
ears work as
long as we
don't
introduce
psychoacoustic
anomalies or
compromise on
the delivery
of frequency
response,
dynamic range,
loudness, low
distortion,
and especially
source and
ambience
directionality,
during
reproduction.
Basics of Concert Hall Psychoacoustics
In
order to
produce a
concert-hall
sound field in
the home
without
actually
building a
concert hall,
we need to
know what the
ear requires
at the minimum
for accepting
a sound field
as real.
Knowing this,
it is then
possible to
look for ways
to accomplish
this feat in a
small space
and within a
budget,
without
compromising
the reality of
the aural
illusion.
While not
everything is
known about
how the ear
perceives
distance,
horizontal and
vertical
angular
position, hall
enclosure size
and type, and
maybe absolute
polarity,
enough is
known to allow
Ambiophonics
to create a
variety of
sound fields
suited to
different
types of music
that are real
enough to be
accepted as
such by the
ear-brain
system.
In
general the
only parts of
the hearing
mechanism that
concern us
specifically
are the ear
pinnae and the
existence of
two ears
separated by a
head. Even
without
consulting the
hundreds of
papers on this
subject, it is
clear that the
pinnae are
designed to
modify the
frequency
response of
sound waves as
a function of
the direction
from which the
sound comes.
It is also
clear that no
two
individuals
have ear
pinnae that
are
identically
shaped. But to
give a general
idea of what
one person's
pinna does in
the horizontal
plane: for a
sound coming
from directly
in front, the
frequency
response at
the ear canal
entrance,
measured with
a tiny
microphone
inserted into
the ear canal,
is essentially
flat up to
1000 Hertz. As
for most
people, the
response then
rises as the
rear of the
pinna
interdicts
sound and
reflects it
additively
into the ear
canal. A broad
11 dB peak in
the response
is reached at
about 3000 Hz
after which
the response
drops off to
minus 10 dB at
10 kHz and
then begins to
rise again. A
response
spread such as
this of 21 dB
in the treble
region is
quite
substantial,
and if a
loudspeaker
had this kind
of response it
would get very
poor reviews
indeed. It is
also easy to
see that
differences in
individual
pinnae are not
easy to
correct for
with tone
controls or
equalizers.
For a sound
coming from
the side to
the near ear,
a slow rise in
response
starts at 200
Hz, reaches 15
dB at 2500 Hz,
drops to 1 dB
at 5 kHz,
rises to 12 dB
at about 7 kHz
and then drops
to 4 dB at
about 10 kHz
(after Henrik
Møller et al.).
This side
response is
quite
different from
the dead ahead
response and
indicates that
we are very
sensitive to
the direction
from which
sounds
originate even
if we listen
with only one
ear. For
sounds
directly
rearward, the
pinnae cause a
dropoff of 23
dB between
2500 Hz and 10
kHz. Other
radically
different
frequency
responses
occur for
sounds coming
from above or
below. The
pinnae seem to
be entirely
responsible
for our sense
of
center-front
sound source
height.
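To make the figures quoted above easier to compare, here is a small Python sketch that stores the frontal and side breakpoints from the text and interpolates between them. The straight-line interpolation on a logarithmic frequency axis and the 0 dB starting value for the side curve are assumptions of the sketch; the measured curves are far more jagged than this.

```python
import math

# Breakpoints (frequency in Hz, response in dB) taken from the figures in the text.
FRONT = [(1000, 0.0), (3000, 11.0), (10000, -10.0)]
SIDE = [(200, 0.0), (2500, 15.0), (5000, 1.0), (7000, 12.0), (10000, 4.0)]

def response_db(points, freq_hz):
    """Interpolate a response in dB at freq_hz.
    Assumption: straight lines between breakpoints on a log-frequency axis,
    and a constant value outside the listed range."""
    if freq_hz <= points[0][0]:
        return points[0][1]
    if freq_hz >= points[-1][0]:
        return points[-1][1]
    for (f0, g0), (f1, g1) in zip(points, points[1:]):
        if f0 <= freq_hz <= f1:
            t = math.log(freq_hz / f0) / math.log(f1 / f0)
            return g0 + t * (g1 - g0)

def spread_db(points):
    """Difference between the highest and lowest listed response values."""
    gains = [g for _, g in points]
    return max(gains) - min(gains)

front_spread = spread_db(FRONT)             # 11 dB peak down to -10 dB
side_minus_front_at_5k = response_db(SIDE, 5000) - response_db(FRONT, 5000)
```

`front_spread` reproduces the 21 dB spread cited in the text, and `side_minus_front_at_5k` gives a rough feel for how differently one ear colors the same frequency depending on arrival direction.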
What
this means for
realistic
sound
reproduction
is that
whatever sound
we generate
must come to
the listening
position from
the proper
direction. In
theory, it
would be
possible to
modify the
pinna
frequency
response of
say ceiling
reflections to
mimic side
reflections,
but such an
equalizer
would have to
be readjusted
for each human
being. It is
much easier to
place the
ambient
loudspeakers
around the
listener and
feed the
appropriate
signals to
them, as
described in
later
chapters.
These pinnae
effects also
explain why
launching
deliberately
or
inadvertently
recorded rear
reverberant
hall sounds
from the main
front
loudspeakers
(or proscenium
stage ambience
from rear
speakers) in
stereo or 5.1
surround
systems, does
not and cannot
sound
realistic.
Although
a one-eared
music lover
can tell the
difference
between a live
performance
and a stereo
recording (and
Ambiophonics
works for such
an individual)
it is
two-eared
listeners that
Ambiophonics
can help the
most. Two ears
can enhance
the listening
experience in
a concert hall
(and life in
general) only
if there are
differences
between the
sounds
reaching each
ear, at least
most of the
time. The only
differences
the sound at
one ear
compared to
that of the
other ear can
have are
differences in
intensity,
arrival time,
and absolute
polarity. In
an acoustical
concert hall
or any real
physical
space, it is
not possible
for absolute
polarity to be
inverted at
just one ear
and certainly
not at just
one ear at all
frequencies
simultaneously.
Thus we need
only consider
what the
difference (or
lack of
difference)
between the
ears in sound
arrival time
and intensity
does for
listeners at a
concert.
It
is clear,
since the
distance
between the
ears is
relatively
small, that at
very low
frequencies
there can be
no significant
intensity
difference,
regardless of
where a
low-bass sound
originates. At
the other,
very high
frequency
extreme, the
head is an
effective
barrier to
sounds coming
from the side
and,
therefore,
intensity
differences
provide the
strongest
non-pinna-related
directional
cues. At the
higher bass
frequencies
the brain can
begin to use
arrival time
differences to
locate a
sound. At
higher
frequencies in
the 500 to
1500 Hz
region, both
time and
intensity
differences
play a role,
until as the
frequency
continues to
rise only
pinna pattern
intensity
differences
matter.
Finally, the
sensitivity of
the ear to the
arrival time
of sharp
transients is
often cited as
a hearing
parameter but
is probably
just a
particular
manifestation
of the
mechanisms
cited above.
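The size of the arrival-time differences involved can be estimated with a short Python sketch. It uses the classic rigid-sphere (Woodworth) approximation, which is a textbook formula and not something taken from this chapter, together with the three-and-one-half-inch ear offset mentioned earlier as the head radius and an assumed speed of sound of 343 m/s.

```python
import math

HEAD_RADIUS_M = 3.5 * 0.0254        # three and one half inches, as quoted earlier
SPEED_OF_SOUND_M_S = 343.0          # assumed: air at roughly room temperature

def itd_seconds(azimuth_deg):
    """Interaural time difference for a distant source, using the
    rigid-sphere approximation ITD = (a / c) * (theta + sin(theta)).
    Valid here for 0 (straight ahead) to 90 degrees (directly to one side)."""
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))

itd_front = itd_seconds(0.0)        # no difference for a centered source
itd_side = itd_seconds(90.0)        # largest difference, source fully to one side
```

Under these assumptions the largest difference comes out at roughly two thirds of a millisecond, which gives a sense of how fine the timing information is that the ear-brain system works with.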
There
is one more
relevant
psychoacoustic
characteristic
of the
binaural
hearing
mechanism
which does
relate to
intensity and
arrival time.
This is the
ability of the
ear-brain
system to
focus on one
particular
sound source
out of many.
Most of us
can, if we
wish, pick out
just one voice
or instrument
in a quartet,
or in the
classic
example,
overhear one
conversation
at a noisy
cocktail
party. This
focusing
ability is
strong in live
three-dimensional
concert
situations and
weak when
trying to
distinguish
one voice in a
monophonic
recording of
Gregorian
chant. The
relevance to
Ambiophonics
is that if you
can generate a
concert-hall
stage and
sound field
real enough to
fool the
brain, the
ability to
focus does
appear. At a
live concert,
distractions
such as
coughing,
subway rumble,
and program
rattling are
much less
obtrusive
because one
can focus on
the stage and
the music.
Likewise at
home, such
distractions
as needle
scratch, tape
hiss, hum,
cable
idiosyncrasies,
amplifier
defects, and
domestic
noises become
easier to
ignore if you
are immersed
in Ambiophonic
atmosphere.
This
concentration
effect is
particularly
startling when
playing CD
transfers of
noisy Caruso
acoustic-era
recordings.
The Ambiophonic Playback System
Ambiophonics
was developed
to provide
audiophiles,
record
collectors,
equipment
manufacturers,
and,
eventually,
recording
engineers with
a clear,
understandable
recipe for
generating
realistic
music sound
fields,
consistently
and
repeatedly,
either from
the vast
library of
existing two
channel
recordings or
from new two
channel LPs,
CDs or DVDs
made,
hopefully,
even more
realistic by
keeping
Ambiophonic
principles in
mind.
The
basic home
elements
required, if
the ultimate
in realism is
desired, are
as follows:
- A dedicated listening room. As
in home
video
theater, a
room
dedicated
to this
purpose
where
decor and
all its
other
attributes
are kept
subservient
to the
requirements
of the
Ambiophonic
method and
the laws
of
acoustics.
If the
growth of
home
video-theater
installations
is any
indication,
there are
thousands
of home
videophiles
who are
prepared
to invest
in video
projectors,
theater
seats,
large
screens,
and
built-in
surround
sound
systems in
order to
duplicate
the
movie-theater
experience
at home.
Perhaps
there are
similar
numbers of
music
lovers who
are
prepared
to invest
in a home
concert
hall or
opera
house.
Fortunately,
duplicating
the
concert
hall
experience
at home is
not nearly
as
expensive
or complex
as home
theater
and a much
smaller
room can
be used
but it is
best if
one has a
similar
dedicated
room and
as
determined
a mind
set.
- Listening room treatment:
While the
size and
shape of
the room
are not
critical,
proper
electronic
or
mechanical
(preferably
both)
absorbent
sound
treatment
is
essential.
The room
must be so
configured
that sound
reflections
from its
walls are
minimized
and do not
interfere
with the
illusion
that
Ambiophonics
creates.
Keeping
exterior
noise out
of the
room is
also a
function
of the
room
treatments
discussed
in the
chapters
that
follow.
- Loudspeaker crosstalk avoidance.
For
reasons
discussed
in a later
chapter,
the front
main left
and right
loudspeaker
sounds
must be
kept
acoustically
isolated
to their
respective
ears at
the
listening
position
or
positions.
This may
be done
using the
stereo
dipole
software
discussed
in later
chapters
or less
expensively
using a
permanent
or
portable
folding
panel on
edge,
extending
from the
listening
position
toward the
space
between
two very
closely
spaced
speakers.
The two
front
speakers
are moved
to a
position
almost
directly
in front
of the
listeners.
This is an
advantage
over
standard
60-degree
stereo
since the
speakers
are as
easy to
locate and
as
noncritical
in this
regard as
monophonic
sound
reproduction
was before
the coming
of stereo.
- Front proscenium reflections.
Left and
right
proscenium
early
reflection
signals
derived
from the
known
impulse
responses
of the
world's
best halls
or the
particular
site of
the
recording
must be
recreated
by
computer
or digital
signal
processor
and
reproduced
through
one or
more pairs
of
surround
loudspeakers.
The first
pair is
ideally
placed at
55 degrees
to the
right and
left of
the
listener
and the
second at
90 degrees,
but the
positions
of these
and any
other
surround
speakers
are not
critical:
changes in
placement
are no more
audible
than the
differences
between real
concert
halls of
different
shapes.
- Side hall reverberation.
Left and
right side
reverberant
signals
must be
recreated
and
reproduced
through
loudspeakers
placed
roughly to
the right
and left
of the
listening
area.
- Rear hall reverberation.
Left and
right rear
hall
reverberation
signals
must
similarly
emanate
from two
or more
speakers
behind or
elevated
behind the
listening
position.
- Amplifier power.
Enough
amplifier
power must
be
available
to achieve
concert-hall
volume in
the room.
This is
seldom a
problem,
especially
with the
larger
number of
speakers
sharing
the
acoustic
load.
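The proscenium and reverberation items above call for ambience signals derived from a hall impulse response by computer or digital signal processor. A minimal sketch of that idea follows, assuming plain convolution of one recorded channel with an impulse response. The function names, the sample rate, and the synthetic decaying-noise response are hypothetical stand-ins chosen for illustration; a real installation would use measured hall responses and whatever processing the later chapters specify.

```python
import numpy as np


def synth_impulse_response(sample_rate, rt60_s, predelay_s, seed):
    """Hypothetical stand-in for a measured hall impulse response:
    exponentially decaying noise that begins after a short pre-delay."""
    rng = np.random.default_rng(seed)
    n_delay = int(predelay_s * sample_rate)
    n_tail = int(rt60_s * sample_rate)
    t = np.arange(n_tail) / sample_rate
    # Amplitude falls by 60 dB (a factor of 1000) over rt60_s seconds.
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    tail = rng.standard_normal(n_tail) * envelope
    return np.concatenate([np.zeros(n_delay), tail])


def derive_ambience(dry, impulse_response):
    """Convolve one recorded channel with an impulse response to obtain
    the feed for one ambience loudspeaker."""
    return np.convolve(dry, impulse_response)


# Assumed values, for illustration only.
sample_rate = 8000
dry_left = np.zeros(sample_rate // 10)
dry_left[0] = 1.0  # a single click standing in for the recorded channel

ir = synth_impulse_response(sample_rate, rt60_s=0.5, predelay_s=0.02, seed=1)
feed = derive_ambience(dry_left, ir)
```

Because the test signal is a single click, the loudspeaker feed simply reproduces the impulse response, which makes the sketch easy to check. With real music each channel would be convolved with a different response for each ambience loudspeaker.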
The
technical
reasons for
these
requirements
are discussed
here and in
the chapters
that follow.
It is hoped
that once the
physics and
the
psychoacoustic
laws are
understood,
the
reader may be
able to think
of better ways
to achieve the
same end.
Ambiophonics
was not
developed in a
day and the
reader may not
want to
implement the
entire
Ambiophonic
system at one
time. But each
element in the
system, when
implemented,
does result in
an appreciable
audible
improvement.
What Ambiophonics Specifically Achieves
If
you employ the
techniques
described in
the chapters
below, you
will produce a
rock-solid
sound stage
that
consistently
extends far
beyond the
right and left
positions of
the closely
spaced front
loudspeakers.
You will find
that even with
the main left
and right
loudspeakers
directly in
front of you,
there is not only
no compromise
in the
perceived
stage width or
depth, but a
substantial
improvement
over 60-degree
stereo or 5.1
surround with
virtually any
recording. You
will also see
that recreated
hall ambience,
if propagated
in a properly
treated room,
launched from
the correct
direction by
well-situated
loudspeakers,
will yield the
sense that you
are in a hall
similar to
that in which
the recording
was made.
Since
two-eared
listening is
more vibrant
than one-eared
listening,
sound fields
that differ at
each ear in
intensity or
arrival time
are more
exciting, and
in concert
halls add
spatial
interest to
the event.
Thus when we
come to
consider
home-concert-hall
design, it is
not enough to
just maintain
the separation
of the front
left and right
channels; it
is also
necessary to
ensure the
diversity of
all the
signals
launched into
the home
listening
space.
Correlation is
the opposite
of diversity,
and in the
next chapter
we will
consider the
significance
of the
correlation
factors of
both music and
auditoriums so
that our home
concert hall
can sound as
realistic as
possible.
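To make the idea of correlation as the opposite of diversity concrete, here is a small sketch that measures how alike two signals are. It assumes the ordinary zero-lag normalized correlation coefficient, which is only one common measure and may differ from the correlation factors the next chapter uses. The function name and the noise test signals are hypothetical.

```python
import numpy as np


def correlation_coefficient(a, b):
    """Zero-lag normalized correlation of two equal-length signals.
    +1 means identical, -1 means polarity-inverted, near 0 means diverse."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    if denom == 0.0:
        return 0.0  # a silent signal carries no correlation information
    return float(np.sum(a * b) / denom)


# Assumed test signals: one second of noise at 48 kHz.
rng = np.random.default_rng(0)
left = rng.standard_normal(48000)
independent = rng.standard_normal(48000)

identical = correlation_coefficient(left, left)        # same signal at both ears
inverted = correlation_coefficient(left, -left)        # polarity-reversed copy
diverse = correlation_coefficient(left, independent)   # unrelated signals
```

A pair of speaker feeds whose coefficient is close to 1 presents nearly the same signal to both ears, while a value near 0 indicates the kind of diversity the paragraph above asks for.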
|