Ambiophonics, 2nd Edition Chapter 8
Ambiophonics, 2nd Edition: Replacing Stereophonics to Achieve Concert-Hall Realism
Ambiopoles and Ambiophones
Ambiophonics combines four technologies to produce realistic sound fields, and it does so optimally via two-channel recording media. The technologies are convolution for hall ambience, room/speaker treatment and correction, elimination of front loudspeaker crosstalk and pinna angle error, and an optional superior recording microphone design and placement. The basic tenet of Ambiophonics is to recreate at the listening position an exact replica of the original concert hall sound field. Ambiophonics does this by transporting the sound sources and the stage and hall ambience to the listening room. In other words, Ambiophonics delivers an externalized binaural effect, using, as in the binaural case, just two recorded channels but with two front stage reproducing loudspeakers and eight or so ambience loudspeakers in place of earphones. Ambiophonics generates stage image widths up to 140° with an accuracy and realism that far exceed those of any other two-channel or multi-channel recording/reproducing scheme. While there are Ambiophonic ways to get a direct sound stage extending to 180°, I, for one, have never experienced such a wide-angled stage at a live concert and so this aspect is not considered here.
We will now discuss how to reproduce the front stage of a two channel recording without exposing our ears to comb filtering, phantom imaging or major errors in the angle of sound incidence on the pinna and how best to make recordings that take advantage of Ambiophonic binaural technology. At this point you may want to review the material in the preface on the psychoacoustic deficiencies inherent in the stereo triangle.
Making Good on the Promise of Binaural Technology
Since we have only two ears, it seems reasonable that only two signals should need to be recorded. Indeed it was Blumlein’s original idea that he could externalize the earphone binaural effect using spaced loudspeakers and some novel microphone arrangements. But once you give up earphones for stereo loudspeakers, the interaural crosstalk and the arbitrary speaker angle destroy the almost perfect, but internalized (within the skull), binaural frontal stage image, and the hall ambience, now coming entirely from the front, sounds unnatural. Binaural theory says that if you sit in the concert hall with small microphones in your ear canals, record the concert, and then later play it back with in-the-ear-canal earphones, you will experience an almost perfect “you are there” recreation. The only flaw in this method would be that when you moved your head, while listening or recording, the reproduced stage would rotate unrealistically. But let us consider, briefly, why this recording method can otherwise produce an awesome reality.
First of all, the sound from the stage and the hall during such a personal binaural recording reaches your ear canals (and the embedded microphones) after being filtered by your pinnae and your head shape. Since the playback earphones we are using are an in-the-ear-canal type, the sound only passes through the pinnae or around the head once. Also the pinnae used to make the recording are your own, not those on some dummy head carved in wood or plastic. The two channels are kept separate throughout and the left ear playback earphone signal never leaks into the right ear or vice versa. Thus we can state one of the basic rules of realistic binaural recording technology. In any binaural recording or reproduction chain there should be one and only one pinna function and it must be your own. There must also be one and only one head shadowing entity, but in this case whose head it is is not critical. That the head shadowing function is not as individual as the pinna function can be understood when one realizes that sound passes around the head over the top, under the chin, and around the back, and that the path varies as the head is tilted or rotated. Thus the brain is not overly sensitive to the exact shape of a particular head or the exact frequency response of the head shadowing function, within reason.
So let us see how we can make use of this knowledge. Let us assume that we have a two-channel recording made using a dummy head that has no pinnae. This dual microphone is sitting tenth row center. Its signals are then recorded and played back over two loudspeakers directly in front of the home listener. Let us assume for the moment that these loudspeakers are like laser beams so that their sound is aimed precisely at the proper ear. In this case the listener hears what the corresponding microphone hears and the sound impacts his own pinnae with very little incident angle error for central stage sources. For stage sources that are more to the side, the listener hears the head-related transfer function of the microphone head, and for normal stage widths this is quite realistic. But now the home listener can rotate his head and the image is stable just as if he were in the concert hall. So this technique is not only equal to but also superior to the earphone method considered above. There is a pinna angle error for stage sources toward the extreme left and right. Fortunately, these are the angles where direct sound has a more or less clear shot at the ear canal without extreme pinna filtering, and where nature has compensated for the decrease in pinna sensitivity by making the interaural head shadowing most pronounced, providing strong and natural horizontal plane localization. In practice, both IMAX and Ambiophonics easily demonstrate that this binaural technology is exceptionally realistic and does produce wide front stages that even allow the cocktail party effect to be in evidence.
Now the question is how to make a pair of center front speakers behave like sound lasers. There are two possibilities. One is to put a physical wall or panel in front of the listener. This wall extends to within a foot or so of the listener’s head and keeps the left speaker from radiating to the right ear and vice-versa. This technique works perfectly and if you are an audiophile and want absolute fidelity without cables or extra processing this is a very inexpensive way to go. You can try it first with a mattress on end, if you want to experiment and have some fun.
While I appreciate that the use of a barrier will never find universal acceptance, an understanding of how it works is necessary to an appreciation of what a software version of such a crosstalk avoidance system must accomplish. You can make a barrier out of sound absorbing panels with a cutout at the listening end so that it is possible to sit comfortably there. The thickness of the barrier is not critical, but it should be about six to eight inches so that when a listener is seated their right eye cannot see the left speaker and vice versa. The wall extending back toward the space between the speakers is, preferably, made with sound absorbing material. This panel can be thought of as a collimator for most sound except the low bass. It eliminates all stray rays from the right that might be heading left and those from the left that might be heading right. A panel such as this is also very effective in damping higher frequency room reflections since it absorbs rays coming from both sides of the room.
The use of an outdoor reflective barrier to eliminate stereophonic crosstalk was described in 1986 by Timothy Bock and Don Keele Jr. at the 81st Audio Engineering Society Convention. While Ambiophonics uses an absorbent barrier, their results are still largely pertinent. They determined that a listener could be further back from the end of the barrier if the barrier was wider, the speakers are closer together, and the listener further from the speakers. Stated as an equation:
Where, in inches, L is the maximum distance a listener’s head can be from the barrier, X is the distance from the listening end of the barrier to the position of the speakers, D is the distance between the centers of the speakers, H is the distance between the ears, and T is the thickness of the barrier. For a worst-case scenario of a six-inch head, a six-inch-thick barrier, an eight-foot distance to the speakers, and a speaker separation of three feet (too much), a listener could be as much as 32 inches, almost three feet, from the end of the barrier. Thus the use of a barrier does not in any way make listening uncomfortable or claustrophobic.
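The worked figures above can be checked numerically. The sketch below assumes the relation L = X(H + T)/D. That form is an inference from the stated trends and from the 32-inch worst-case result, not a quotation of Bock and Keele's published equation, and the function name is hypothetical.

```python
def max_listener_distance(x_in, d_in, h_in, t_in):
    """Assumed relation L = X * (H + T) / D, with all lengths in inches.

    x_in: distance from the listening end of the barrier to the speakers
    d_in: distance between the speaker centers
    h_in: distance between the ears
    t_in: barrier thickness
    """
    if d_in <= 0:
        raise ValueError("speaker separation must be positive")
    return x_in * (h_in + t_in) / d_in


# Worst case from the text: six-inch head, six-inch barrier,
# speakers eight feet (96 inches) away and three feet (36 inches) apart.
worst_case = max_listener_distance(x_in=96.0, d_in=36.0, h_in=6.0, t_in=6.0)
print(f"{worst_case:.0f} inches")
```

Under this assumed relation, halving the speaker separation doubles the allowable distance, which agrees with the observation that closer speakers let the listener sit further back.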
Our own Ambiophonic barrier geometry allows one to be four feet from the end of the barrier, but at the far end of this range one’s head must be more precisely centered. With a four-foot space, two in-line listeners can enjoy the enhanced angular image separation at the same time and indeed the front listener acts as a continuation of the barrier for the second listener. If in doubt about the spacing, the eyeball method is very conservative. As long as no part of the opposite loudspeaker is visible from one eye, excellent separation is guaranteed. Sitting too close to the barrier is not only unpleasant but results in a loss of high-frequency response if the barrier is as wide as the head and absorptive.
However, the mainstream way to eliminate the crosstalk is to use software running on a computer or digital signal processing system. I call a pair of speakers designed for this purpose, and driven by the public domain software that we have developed, an Ambiopole.
First, although most speakers can be used to form an Ambiopole, it is best if the speakers chosen are very directional and well matched. A slightly concave electrostatic panel (called an Ambiostat) can actually focus sound well enough that it almost behaves like the laser we have hypothesized. Obviously, if the speakers are focused and time aligned, the software can do its job much better. What the software does is generate slightly delayed reversed polarity signals for the speakers to cancel the crosstalk acoustically before it reaches the ear canal. The cancellation is an infinite series process since the crosstalk caused by the cancellation signal also produces crosstalk, which must then be cancelled and so on.
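The infinite series of cancellation signals can be illustrated with a short recursion. This is a minimal sketch under strong assumptions, not the Ambiophonic Institute's software: the crosstalk path is modeled as nothing more than an attenuated, delayed copy of the opposite speaker's output, and the function names and the `gain` and `delay` parameters are hypothetical.

```python
import numpy as np


def cancel_crosstalk(left, right, gain=0.8, delay=3):
    """Compute speaker feeds whose acoustic crosstalk cancels at the ears.

    Each speaker feed is its own input minus the leak expected from the
    opposite speaker. Because that leak is itself a corrected signal,
    the recursion generates the whole series of corrections.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    spk_l = np.zeros_like(left)
    spk_r = np.zeros_like(right)
    for n in range(len(left)):
        leak_from_r = gain * spk_r[n - delay] if n >= delay else 0.0
        leak_from_l = gain * spk_l[n - delay] if n >= delay else 0.0
        spk_l[n] = left[n] - leak_from_r
        spk_r[n] = right[n] - leak_from_l
    return spk_l, spk_r


def at_ears(spk_l, spk_r, gain=0.8, delay=3):
    """Simulate the ears: direct sound plus delayed, attenuated crosstalk."""
    cross_into_l = gain * np.concatenate([np.zeros(delay), spk_r[:-delay]])
    cross_into_r = gain * np.concatenate([np.zeros(delay), spk_l[:-delay]])
    return spk_l + cross_into_l, spk_r + cross_into_r


# A single click in the left channel only.
left_in = np.zeros(32)
left_in[0] = 1.0
right_in = np.zeros(32)

spk_l, spk_r = cancel_crosstalk(left_in, right_in)
ear_l, ear_r = at_ears(spk_l, spk_r)
print(np.allclose(ear_l, left_in), np.allclose(ear_r, right_in))
```

In this toy model the right speaker emits an inverted, delayed copy of the click, the left speaker then corrects for that correction, and so on with decaying amplitude, while the simulated right ear receives silence.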
If the Ambiopoles were widely spaced, then the crosstalk would have to go around the head and the correction signals would be very difficult to calculate since they would be affected by head position and pinna shape. Thus the front speaker pair should be as close together as possible, with ten degrees or less between them, so that both main front speakers radiate directly to their onside ears.
Another way of looking at this process is to consider the mechanical barrier again. The barrier works perfectly every time. If you put a microphone at the ear position at the end of the barrier and measure the crossed impulse response of the system and then convolve the main front Ambiopole signals with this response you can create software that is useable with that speaker type and speaker angle after the barrier is removed. Just as it is obvious that a barrier will work better with close together speakers, since speaker proximity makes it easier for the barrier to shadow the appropriate ear, so crosstalk software works better if the speakers are closer together.
Ambiopoles do have a sweet spot limitation although in my experience the sweet spot is larger than that of most well focused stereo or 5.1 systems. But if the Ambiopoles are constructed using omni-directional speakers then it is possible to enlarge the sweet spot enough to accommodate two or even three listeners. Unfortunately there are few true omni-directional speakers available, and so it has been difficult so far to perfect this application and demonstrate that this variation works to audiophile standards. Of course, using omni-directional speakers requires that the room be really well sound treated to avoid the extra wall reflections that are generated by such a speaker.
Sometime during 2001 the Ambiophonic Institute expects to have crosstalk cancellation software available for downloading at no charge from its web site. Eventually it is hoped that manufacturers will use this or similar software in their products. It would also be possible to provide an alternate track on a DVD-A to allow crosstalk free playback of music recordings via an Ambiopole. As discussed below, Ambiopole software can be tweaked to compensate for the various main and spot microphone or panning techniques employed to make a particular stereo or three channel recording if the simpler, optimum, Ambiophone has not been employed.
The Stereo Dipole, AES Preprint 4463
Among the pioneers in the field of crosstalk cancellation are Ole Kirkeby and Philip A. Nelson of the University of Southampton and Hareo Hamada of Tokyo Denki University, who developed an electronic version of the panel in 1996. They have shown that the ideal speaker spacing for a crosstalk cancellation system, be it mechanical or electronic, is about 10 degrees. They refer to two speakers placed so close together as a “stereo dipole”. The electronic filters required to cancel crosstalk in this narrow speaker arrangement are somewhat easier to design and are more effective, since at the narrower angle there is little diffraction around the head for the correction signals and so HRTF correction is not necessary. Pinna angle distortion of the correction signals is also not a major factor, and so the crosstalk cancellation can be allowed to operate over the full upper frequency range without restricting the size of the listening area or generating the audible phasiness effects that afflict electronic crosstalk cancellation schemes for widely spaced loudspeakers.
They also show that, at narrow speaker angles, the path length difference from a speaker to each ear is so small that the infinite series of inverted crosstalk cancellation impulses is generated at a rate of over 10 kHz. This allows for very fine definition of the crosstalk cancellation signals at higher frequencies and makes this process quite accurate using the DSP power presently available.
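The order of magnitude of that rate can be estimated from geometry alone. The sketch below is a rough far-field approximation that ignores diffraction around the head, so it is an assumption-laden estimate and not the analysis in the preprint. The 0.15 m ear spacing, the 343 m/s speed of sound, and the function name are illustrative choices.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed room-temperature value


def crosstalk_repetition_rate_hz(total_speaker_angle_deg, ear_spacing_m=0.15):
    """Estimate how often successive cancellation impulses recur.

    Far-field approximation: the extra path to the far ear is
    ear_spacing * sin(half the angle between the speakers), and each
    term of the cancellation series is delayed by that extra travel time.
    """
    half_angle = math.radians(total_speaker_angle_deg / 2.0)
    path_difference_m = ear_spacing_m * math.sin(half_angle)
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    return 1.0 / delay_s


narrow = crosstalk_repetition_rate_hz(10.0)  # stereo dipole spacing
wide = crosstalk_repetition_rate_hz(60.0)    # conventional stereo triangle
print(f"10 degrees: {narrow / 1000:.1f} kHz, 60 degrees: {wide / 1000:.1f} kHz")
```

With these assumed values the 10-degree pair comes out at about 26 kHz, comfortably above 10 kHz, while a 60-degree stereo triangle comes out at about 4.6 kHz.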
University of Parma Ambiopole Software
The Ambiophonic Institute, in conjunction with the University of Parma, has developed an advanced version of the stereo dipole called the Ambiopole. In their implementation, the crosstalk cancellation operation is performed through the convolution of the left and right front input signals of the recording with a set of four inverse filters. Two of these filters can be selected by the listener based on knowledge of the microphone employed to make the recording. These inverse filters cancel out a great part of the microphone-dependent spatial effects. The goal is to make recordings made with something other than the ideal Ambiophone described below sound as if they were so recorded. In principle, any kind of two- or three-channel microphoning system (such as ORTF, M/S, spaced omnis, Soundfield, dummy head, sphere, etc.) can be compensated for, including even a “virtual” one, as happens when the stereo mix is obtained by the panning of monophonic sources. Thus this new software is designed so that almost all two-channel recordings can benefit from being reproduced Ambiophonically. In practice, an audiophile listener can select from a menu of filters the one that makes a particular recording sound most realistic.
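Convolving two input channels with a set of four filters amounts to a two-by-two filter matrix: each speaker feed is the sum of one filtered copy of each input. The sketch below shows only that structure, using plain FIR convolution and made-up placeholder coefficients. The function name and filter names are hypothetical, and nothing here reproduces the Parma filters themselves.

```python
import numpy as np


def apply_filter_matrix(left, right, h_ll, h_lr, h_rl, h_rr):
    """Apply a 2x2 matrix of FIR filters to a stereo pair.

    h_xy is the filter from input x to speaker y, so h_ll and h_rr are
    the same-side filters and h_lr and h_rl are the cross filters.
    Both input channels are assumed to have the same length.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    longest = max(len(h_ll), len(h_lr), len(h_rl), len(h_rr))
    n_out = len(left) + longest - 1

    def conv(signal, taps):
        out = np.convolve(signal, taps)
        return np.pad(out, (0, n_out - len(out)))  # zero-pad to common length

    out_l = conv(left, h_ll) + conv(right, h_rl)
    out_r = conv(left, h_lr) + conv(right, h_rr)
    return out_l, out_r


# Placeholder coefficients: pass the same-side signal unchanged and send a
# delayed, inverted, half-amplitude copy to the opposite speaker.
same_side = [1.0]
cross_side = [0.0, -0.5]

out_l, out_r = apply_filter_matrix(
    [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0],
    same_side, cross_side, cross_side, same_side,
)
print(out_l, out_r)
```

Selecting a different microphone compensation would, in this picture, correspond to swapping in a different set of filter coefficients while the matrix structure stays the same.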
The University of Parma Ambiopole is realized by means of a single DSP processor programmed with mathematical entities called “warped finite impulse response” functions or filters. The warping is essentially a mathematical weighting algorithm that makes it possible to compute the required crosstalk cancellation signals in real time, i.e., while the music is playing, without falling behind or making errors. It is hoped that those reading this book in the near future will be able to purchase Ambiophonic system processors that include this software as well as the software for hall convolution and room correction. Until then, Ambiophonics will remain a do-it-yourself technology for audiophile computer experts only.
Bass Response of Ambiopoles
Since Ambiophonics is a binaural-based system, it does not provide the Blumlein loudspeaker crosstalk signal that furnishes the lowest frequency phase shift localization cues for those few recordings made with a coincident microphone arrangement such as the Soundfield mic or crossed figure-eight mics in the M/S (mid-side) Blumlein configuration. (See Appendix A for a detailed analysis of the Blumlein patent and technology.) However, it should be understood that at very low bass frequencies, the barrier (depending on its size and absorbency) and its electronic cousins lose their effectiveness, allowing increasing crosstalk as the frequency declines and therefore amplifying low-frequency phase cues for coincident microphone recordings. This is basically a non-issue. Remember that the ear’s ability to localize bass frequencies at 80 Hz and below is virtually non-existent. The pinna certainly has no capability in this frequency range and the head is too small to attenuate signals with wavelengths measured in tens of feet. Thus the only localization method available to the brain at very low frequencies is the few degrees of phase shift between the ears. There is no evidence that the brain can detect such small phase shifts, and thus worrying about crosstalk elimination at very low frequencies to improve front stage imaging is not productive.
Indeed, impulse response measurements on the mechanical crosstalk barrier show that crosstalk cancellation begins to decline starting at 400 Hz. To be on the safe side the software can go somewhat lower in frequency before rolling off, but at very low frequencies the power required to produce crosstalk cancellation becomes excessive and is not necessary.
Once we know that playback will be Ambiophonic, the question arises as to whether there is an ideal recording method that can take advantage of the fact that surround ambience will be derived via convolution, that the Ambiopole will eliminate crosstalk and phantom imaging, and that the listening room is sound treated. I still want to emphasize that, although Ambiophone microphone arrangements can make the Ambiophonic approach to realism even more effective, Ambiophonics works quite well with most of the microphone setups used in classical music or audiophile-caliber jazz recordings. As indicated above, there are also software ways to correct existing recordings if one is really fanatical.
One can heighten the accuracy, if not gild the lily of realism, of an Ambiophonic reproduction system by taking advantage, in the microphone arrangement, of the knowledge that in playback the rear/side half of the hall ambience is convolved, that there is no crosstalk, that listening room reflections are minimized, and that the front loudspeakers are relatively close together. Earlier we considered the binaural model where microphones are inserted in the ear canals of an ideally situated listener. But now the situation is different. We are going to reproduce the hall ambience by convolution, so we do not want our binaural listener to pick up any hall ambience from the rear, the extreme sides, or the ceiling. So let us put sound absorbing material just behind his head and above him as well, so that he has a sonic view of only the stage in front of him.
Now we know that upon reproduction the Ambiopole speaker sound will pass by the home listener’s pinnae on the way to the eardrum. Thus we do not want any pinnae at the recording site. The human listener is therefore excused from the recording site and we are left with a pair of baffled, head-spaced omni or cardioid microphones sitting at the best seat in the house. But the rule stated earlier said there must be one and only one head shadow in the recording/reproduction chain, and so, since the home listener is directly in front of the Ambiopole, it is up to the Ambiophone to provide a head shadow. So let us put a head-shaped oval between the two microphones at this best seat in the house. Our Ambiophone thus boils down to an oval-shaped, two-capsule assembly, baffled to the rear and above, comfortably ensconced at the best seat in the house or studio.
Nothing New Under the Sun
After completing the above derivation of the ideal Ambiophone, I began to search for recordings that played back realistically Ambiophonically to see if they had anything consistent or unusual about them. Not being a recording engineer or a microphone aficionado, it took me a while to notice that many of the best CDs in my collection were made with something called a Schoeps KFM-6. A picture of this microphone in a PGM Recordings promotional flyer showed a head-sized but spherical ball with two omnidirectional microphones, one recessed on each side of the ball where ear canals would be if we had an exactly round head. The PGM flyer also included a reference to a paper by Günther Theile describing the microphone, entitled “On the Naturalness of Two-Channel Stereo Sound,” J. Audio Eng. Soc., Vol. 39, No. 10, October 1991.
Although Theile would probably object to my characterization of his microphone, his design is essentially a simplified dummy head without external ears. He states, “It is found that simulation of depth and space are lacking when coincident microphone and panpot techniques are applied. To obtain optimum simulation of spatial perspective it is important for two loudspeaker signals to have interaural correlation that is as natural as possible ... Music recordings confirm that the sphere microphone combines favorable imaging characteristics with regard to spatial perspective accuracy of localization and sound color ...” Later he states, “The coincident microphone signal, which does not provide any head-specific interaural signal differences, fails not only in generating a head-referred presentation of the authentic spatial impression and depth, but also in generating a loudspeaker-referred simulation of the spatial impression and depth ... it is important that, as far as possible, the two loudspeaker signals contain natural interaural attributes rather than the resultant listener’s ear signals in the playback room.”
What Theile did not appreciate is that, for signals coming from the side, the sphere acts as a sort of filter for the shorter wavelengths just as the head does. When this side sound comes from side stereo speakers, the listener’s head again acts as a filter, resulting in HRTF squared. The solution, of course, is to use the mechanical or software Ambiopole barrier and listen to the Theile sphere without the second head response function. Theile also “generates artificial reflections and reverberation from spot-microphone signals.” He uses the word artificial in the sense that the spot microphone signals will be coming from the front stereo loudspeakers instead of from the rear, the sides, or overhead. While Theile’s results rest as much on empirical subjective opinion as they do on psychoacoustic precepts, they certainly are consistent with the premises of Ambiophonics both in recording and reproduction. Making new recordings using the Schoeps KFM-6 version of the Theile sphere and evaluating existing recordings made with this microphone show that the theory is correct, since such recordings yield exceptionally realistic front stages with normal concert-hall perspectives and proscenium ambience.
Realistic Reproduction of Depth
It is axiomatic that a realistic music reproduction system should render depth as accurately as possible. Fortunately, front stage distance cues are easier to record and/or recreate realistically than most other parameters of the concert-hall sound field. Assuming that the recording microphones are placed at a reasonable distance from the front of the stage, then the high frequency roll-off due to distance and the general attenuation of sound with distance remain viable distance cues in the recording. Depth of discrete stage sound sources is, however, more strongly evidenced in concert-halls by the amplitude and delay of the early reflections and the ear finds it easier to sense this depth if there is a diversity of such reflections. In Ambiophonics, convolved early reflections from the surround speakers make the stage as a whole seem more interesting, but it is only the recorded early reflections coming from the front speakers that provide the reflections that allow depth differentiation between individual instruments. This is why anechoic recordings sound so flat when played back stereophonically or even Ambiophonically, despite the presence of an added ambient field. In ordinary stereo, depth perception will suffer if early side and rear hall reflections wrap around to the front speakers or in the anechoic case, are completely missing. Since it is easy to make Ambiophonic recordings that include just proscenium ambience, why not do so and save on convolver processing power and preserve, undistorted, the depth perception cues?
There remains the issue of perspective, however. When making a live performance recording of an opera or a symphony orchestra, the recording microphones are likely to be far enough away from the sound sources to produce an image at home that is not so close as to be claustrophobic. There are many recordings, however, that produce a sense of being at or just behind the conductor’s podium. This effect does not necessarily impact realism, but you must like to sit in the front row to be comfortable with this perspective. Turning down the volume and adding ambience can compensate for this, but with a loss in realism. This problem becomes more serious in the case of solo piano recordings or small jazz combos. For example, if a microphone pair is placed three feet from an eight-foot piano, then that piano is going to be an overwhelming close-up presence in the listening room and a “They Are Here” instead of a “You Are There” effect is unavoidable. This will be very realistic, especially with the Ambiopole, but adding real hall ambience doesn’t help much since the direct sound is so overwhelming. The major problem with this type of recording is that you have to like having these people so close in a small home listening room. You may notice that demonstrators of high resolution playback systems in show rooms or at shows overwhelmingly use close-mic’ed recordings of small ensembles, solo guitar, single vocalists, etc., to demonstrate the lifelike qualities of their products, and that these demonstrations are mostly of the “They Are Here” variety.
These depth and perspective problems are easily solved by simply placing an Ambiophone at a seat that has a reasonable view of the performers.