- BACCH™ 3D Sound: 20 Questions and Answers
- BACCH™ Filters: Optimized Crosstalk Cancellation for 3D Audio over Two Loudspeakers
- PHOnA: A Public Dataset of Measured Headphone Transfer Functions
- A New Approach to Impulse Response Measurements at High Sampling Rates
- Coloration Metrics for Headphone Equalization
- Comparison of Techniques for Binaural Navigation of Higher-Order Ambisonic Soundfields
- Capturing the elevation dependence of interaural time difference with an extension of the spherical-head model
- A Database of Loudspeaker Polar Radiation Measurements
- Metrics for Constant Directivity
- Soundfield Navigation using an Array of Higher-Order Ambisonics Microphones
BACCH™ 3D Sound (previously called "Pure Stereo 3D Audio™") is a recent breakthrough in audio technology, based on BACCH™ Filters and licensed by Princeton University, that yields unprecedented spatial realism in loudspeaker-based audio playback. It allows the listener to hear, through only two loudspeakers, a truly 3D reproduction of a recorded soundfield with uncanny accuracy and detail, and with a level of tonal and spatial fidelity unapproachable by even the most expensive and advanced existing high-end audio systems.
This paper is the PDF version of the 20 Questions and Answers posted online at this webpage. It summarizes the fundamental principles behind this technology as well as some technical aspects of its implementation.
BACCH™ Filters are optimized crosstalk cancellation (XTC) filters that allow 3D audio reproduction over a pair of loudspeakers. They yield maximum crosstalk cancellation level without introducing any spectral coloration to the input signal. An introduction to BACCH™ Filters can be found here.
This highly technical paper describes most of the basic aspects of BACCH™ Filters (some aspects are not published for proprietary reasons).
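The optimized BACCH™ filter design itself is proprietary, but the underlying crosstalk-cancellation principle can be sketched generically. The snippet below is a minimal illustration, not the paper's method: it inverts a simplified 2x2 free-field acoustic transfer matrix with Tikhonov regularization, so that the signal intended for each ear is cancelled at the opposite ear. All geometry and parameter values are assumptions for the example.

```python
import numpy as np

# Illustrative sketch only: generic frequency-domain crosstalk cancellation
# (XTC) via regularized inversion of a simplified 2x2 acoustic transfer
# matrix. The actual BACCH(TM) optimization is proprietary and not shown.

c = 343.0                        # speed of sound, m/s (assumed)
fs = 48000                       # sample rate, Hz (assumed)
d_ipsi, d_contra = 1.60, 1.63    # hypothetical ear-to-speaker path lengths, m

freqs = np.fft.rfftfreq(1024, 1.0 / fs)
k = 2.0 * np.pi * freqs / c      # angular wavenumber

# Free-field point-source model: attenuation ~ 1/d, delay ~ d/c.
g_ipsi = np.exp(-1j * k * d_ipsi) / d_ipsi
g_contra = np.exp(-1j * k * d_contra) / d_contra

beta = 0.005                     # Tikhonov regularization constant (assumed)
xtc_filters = []
for gi, gc in zip(g_ipsi, g_contra):
    C = np.array([[gi, gc], [gc, gi]])   # symmetric 2x2 acoustic plant
    # Regularized pseudo-inverse: H = (C^H C + beta I)^-1 C^H
    H = np.linalg.solve(C.conj().T @ C + beta * np.eye(2), C.conj().T)
    xtc_filters.append(H)

# At well-conditioned frequencies, C @ H approximates the identity matrix:
# crosstalk to the opposite ear is cancelled.
```

Regularization is what keeps the inversion well behaved near the ill-conditioned frequencies (where the two paths are nearly in or out of phase); an unregularized inverse there would boost some bands enormously, which is the spectral-coloration problem the paper's filters are designed to avoid.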
The Princeton Headphone Open Archive (PHOnA) is a dataset of measured headphone transfer functions (HpTFs) from many different research institutions around the world. Visit this webpage to access the dataset.
For some applications, such as binaural 3D audio, impulse responses (IRs) measured at sampling rates of 96 kHz or higher may be desirable, if not necessary. However, if not properly employed, conventional IR measurement techniques at these higher sampling rates may result in low signal-to-noise ratios (SNRs), undesirable pre-responses (or other processing artifacts), or even transducer damage. In this paper, these challenges are addressed and an iterative measurement procedure is proposed, which is experimentally shown to achieve superior results compared to conventional techniques in terms of measured SNR and peak pre-response amplitude.
This paper was presented by Joseph (Joe) G. Tylka at the 137th Convention of the Audio Engineering Society (AES 137). More information on this paper can be found on its AES page and the slides used in its presentation can be found here.
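For context, the conventional baseline that the paper improves upon can be sketched as a swept-sine IR measurement with FFT-based deconvolution. The snippet below is an illustrative stand-in, not the paper's iterative procedure, and all parameter values are assumptions; a simulated 100-sample delay plays the role of the system under test.

```python
import numpy as np

# Illustrative sketch (not the paper's iterative procedure): a conventional
# swept-sine impulse-response measurement via FFT-based deconvolution.

fs = 96000                     # high sampling rate, Hz (assumed)
T = 1.0                        # sweep duration, s (assumed)
f1, f2 = 20.0, 40000.0         # sweep band, Hz (assumed)
t = np.arange(int(T * fs)) / fs

# Exponential (logarithmic) sine sweep, Farina-style excitation.
L = T / np.log(f2 / f1)
sweep = np.sin(2.0 * np.pi * f1 * L * (np.exp(t / L) - 1.0))

# Simulate a "measured" response: a pure 100-sample delay with gain 0.5
# stands in for a real acoustic measurement chain.
delay, gain = 100, 0.5
recorded = np.zeros(2 * len(sweep))
recorded[delay:delay + len(sweep)] = gain * sweep

# Deconvolve by spectral division, with a small floor to avoid dividing
# by near-zero spectral values outside the sweep band.
n = len(recorded)
S = np.fft.rfft(sweep, n)
R = np.fft.rfft(recorded, n)
eps = 1e-8 * np.max(np.abs(S))
ir = np.fft.irfft(R * np.conj(S) / (np.abs(S) ** 2 + eps ** 2), n)
# The recovered IR peaks at the simulated delay with the simulated gain.
```

At 96 kHz, naive choices here (a sweep extending to Nyquist, no spectral floor, excessive drive level) are exactly what produces the low SNR, pre-responses, or transducer stress the paper discusses.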
Headphone equalization is essential to binaural listening. Equalization algorithms have previously been optimized by hand using heuristics and a small set of measurements from a single institution. The PHOnA Project allows us to compare different equalizations across data from many laboratories. This paper develops a set of objective metrics for the psychoacoustic phenomena that cause audible coloration during headphone listening. This will allow a better understanding of how to create a transparent reproduction channel for 3D audio listening via headphones.
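One simple, objective coloration measure in the spirit of this work is the spectral standard deviation of a headphone transfer function: a perfectly transparent channel is flat (0 dB spread), while peaks and dips raise the spread. The function below is an illustrative metric, not necessarily one of those developed in the paper, and the test signals are made up for the example.

```python
import numpy as np

# Illustrative spectral-deviation metric for headphone coloration: the
# standard deviation (in dB) of a magnitude response over an audible band.
# This is one plausible metric, not necessarily one proposed in the paper.

def spectral_sd_db(magnitude, freqs, band=(100.0, 10000.0)):
    """Std. dev. of the dB magnitude response within `band` (Hz)."""
    sel = (freqs >= band[0]) & (freqs <= band[1])
    level_db = 20.0 * np.log10(np.maximum(magnitude[sel], 1e-12))
    return float(np.std(level_db))

freqs = np.linspace(0.0, 24000.0, 2049)
flat = np.ones_like(freqs)                  # ideal, colorless response
peaky = 1.0 + 0.5 * np.sin(freqs / 500.0)   # response with broad ripples
```

A metric like this can be computed identically across every HpTF in a multi-laboratory dataset such as PHOnA, which is what makes objective comparison of equalization algorithms possible.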
Soundfields that have been decomposed into spherical harmonics (i.e., encoded into higher-order ambisonics, or HOA) can be rendered binaurally for off-center listening positions, but doing so requires additional processing to translate the listener and necessarily leads to increased reproduction errors as the listener navigates farther away from the original expansion center. In this paper, three techniques for performing this navigation (simulating HOA playback and listener movement within a virtual loudspeaker array, computing and translating along plane waves, and re-expanding the soundfield about the listener) are compared through numerical simulations of simple incident soundfields and evaluated in terms of both overall soundfield reconstruction accuracy and predicted localization.
This paper was presented by Joseph (Joe) G. Tylka at the 139th Convention of the Audio Engineering Society (AES 139). More information on this paper can be found on its AES page and the poster presented at the convention can be found here.
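The second technique above (translating along plane waves) rests on a simple property: once a soundfield is written as a sum of plane waves, moving the listener by a vector only multiplies each plane-wave coefficient by a phase factor. The sketch below illustrates that property for a single plane wave; the sign convention and parameter values are assumptions for the example.

```python
import numpy as np

# Illustrative sketch of plane-wave translation: a listener offset `r`
# changes a plane-wave coefficient only by a phase shift exp(j k n.r).
# Sign convention and parameters here are assumptions for the example.

c = 343.0                       # speed of sound, m/s
f = 1000.0                      # frequency, Hz (assumed)
k = 2.0 * np.pi * f / c         # angular wavenumber

def translate_plane_wave(amplitude, n, r, k):
    """Phase-shift a plane-wave coefficient for a listener offset `r` (m)."""
    return amplitude * np.exp(1j * k * np.dot(n, r))

n = np.array([1.0, 0.0, 0.0])   # plane-wave arrival direction (unit vector)
a0 = 1.0 + 0.0j                 # coefficient at the original expansion center
r = np.array([0.1, 0.0, 0.0])   # listener moves 10 cm toward the source

a_translated = translate_plane_wave(a0, n, r, k)
# The magnitude is unchanged; only the phase advances by k * (n . r).
```

The growing reproduction error the abstract mentions comes from the decomposition itself: a finite-order expansion is only accurate within a frequency-dependent radius of the expansion center, so each technique degrades as the listener leaves that region.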
An extension of the spherical-head model (SHM) is developed to incorporate the elevation dependence observed in measured interaural time differences (ITDs). The model aims to address the inability of the SHM to capture this elevation dependence, thereby improving ITD estimation accuracy while retaining the simplicity of the SHM. To do so, the proposed model uses an elevation-dependent head radius that is individualized from anthropometry. Calculations of ITD for 12 listeners show that the proposed model is able to capture this elevation dependence and, for high frequencies and at large azimuths, yields a reduction in mean ITD error (averaged over the 12 listeners) of up to 47 microseconds (9% of the measured ITD value), compared to the SHM. For low-frequency ITDs, this reduction is up to 192 microseconds (27%). The values quoted in the abstract of the paper correspond to mean ITD error averaged over the 12 listeners and all available elevations.
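The modeling idea can be sketched with the classic Woodworth-type spherical-head ITD formula, swapping the fixed head radius for an elevation-dependent one. The radius function below is a made-up placeholder: the paper individualizes it from anthropometry, which is not reproduced here.

```python
import numpy as np

# Illustrative sketch: high-frequency spherical-head (Woodworth-type) ITD
# with an elevation-dependent head radius. The placeholder radius function
# is NOT the paper's anthropometry-based individualization.

C_SOUND = 343.0  # speed of sound, m/s

def itd_woodworth(azimuth_rad, radius_m, c=C_SOUND):
    """High-frequency spherical-head ITD: (r / c) * (theta + sin(theta))."""
    return (radius_m / c) * (azimuth_rad + np.sin(azimuth_rad))

def head_radius(elevation_rad, r0=0.0875, slope=-0.005):
    """Hypothetical elevation-dependent radius (placeholder values)."""
    return r0 + slope * np.sin(elevation_rad)

az = np.deg2rad(90.0)   # source directly to one side (largest azimuth)
itd_horiz = itd_woodworth(az, head_radius(0.0))
itd_up = itd_woodworth(az, head_radius(np.deg2rad(45.0)))
# With this placeholder radius, the modeled ITD shrinks with elevation,
# which is the kind of elevation dependence a fixed-radius SHM cannot show.
```

Because the radius enters the formula linearly, letting it vary with elevation is the minimal change that adds elevation dependence while keeping the SHM's simplicity, which matches the paper's stated aim.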
Anechoic directivity data for a variety of loudspeakers have been measured and compiled into a freely available online database, which may be used to evaluate these loudspeakers based on their directivities. The measurements are illustrated through four types of plots (frequency response, polar, contour, and waterfall) and are also given as raw impulse responses. Two sets of directivity metrics are defined and are used to rank the loudspeakers. The first set consists of full and partial directivity indices that isolate sections of the loudspeaker's radiation pattern (e.g., forward radiation alone) and quantify its directivity over those sections. The second set quantifies the extent to which the loudspeaker exhibits constant directivity. Measurements are taken in an anechoic chamber along horizontal and vertical orbits with a nominal radius of 1.6 m and an angular resolution of five degrees.
This engineering brief was presented by Joseph (Joe) G. Tylka and Rahulram Sridhar at the 139th Convention of the Audio Engineering Society (AES 139). More information on this paper can be found on its AES page and the poster presented at the convention can be found here.
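A directivity index (DI) of the kind used to rank the loudspeakers can be illustrated on a single 5-degree-resolution orbit like those in the database. A full DI integrates over the whole sphere; the circular version below is a simplified stand-in for the example, and the two test patterns are synthetic.

```python
import numpy as np

# Illustrative directivity-index (DI) computation from polar data on one
# horizontal orbit at 5-degree resolution (the database's measurement grid).
# A full DI uses the whole sphere; this circular version is a simplified
# stand-in for the example.

angles = np.deg2rad(np.arange(0, 360, 5))   # 5-degree angular resolution

def directivity_index_db(pressure_mag, on_axis_index=0):
    """DI (dB): on-axis intensity relative to the average over the orbit."""
    intensity = np.abs(np.asarray(pressure_mag)) ** 2
    return 10.0 * np.log10(intensity[on_axis_index] / np.mean(intensity))

omni = np.ones_like(angles)                  # omnidirectional: DI = 0 dB
cardioid = 0.5 * (1.0 + np.cos(angles))      # first-order cardioid pattern
```

Restricting the average to a subset of angles (e.g., the forward hemisphere only) gives the "partial" directivity indices described above, which isolate one section of the radiation pattern.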
Many transducer manufacturers claim to achieve "constant directivity" (also called "controlled directivity"), but there is currently no way of quantifying the extent to which a transducer possesses this quality. To address this problem, commonly accepted criteria are used to propose two definitions of constant directivity, one more strict and one more lenient: 1) that the polar radiation pattern of a transducer must be invariant over a specified frequency range, or 2) that the directivity index (see this report for more information) must be invariant with frequency. Furthermore, to quantify each criterion, five metrics are derived and computed using measured polar radiation data for four loudspeakers. The loudspeakers are then ranked, from most constant-directive to least, according to each metric, and the rankings are used to evaluate each metric's ability to quantify constant directivity. Results show that all five metrics are able to quantify constant directivity according to the criterion on which each is based, and two of them are able to adequately quantify both proposed definitions of constant directivity.
This paper was presented by Prof. Edgar Y. Choueiri at the 140th Convention of the Audio Engineering Society (AES 140). More information on this paper can be found on its AES page and the poster presented at the convention can be found here.
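In the spirit of the second (more lenient) definition, a transducer is more constant-directive when its directivity index varies less with frequency. The metric below, the spread of the DI across frequency bands, is an illustrative stand-in, not necessarily one of the five metrics derived in the paper, and the two DI curves are hypothetical.

```python
import numpy as np

# Illustrative constant-directivity metric in the spirit of the paper's
# second definition: the spread (std. dev., dB) of the directivity index
# across frequency bands. Lower values indicate more constant directivity.
# This is a stand-in, not necessarily one of the paper's five metrics.

def di_flatness_db(di_per_band_db):
    """Spread of the DI (dB) across frequency bands; lower is better."""
    return float(np.std(np.asarray(di_per_band_db)))

# Hypothetical DI-vs-frequency curves (dB per octave band) for two speakers.
speaker_a = [5.1, 5.0, 5.2, 5.1, 4.9, 5.0]   # nearly constant directivity
speaker_b = [2.0, 3.5, 5.0, 7.5, 9.0, 11.0]  # directivity rises with frequency

scores = {"A": di_flatness_db(speaker_a), "B": di_flatness_db(speaker_b)}
ranking = sorted(scores, key=scores.get)
# Speaker A ranks as more constant-directive under this metric.
```

A metric based on the full polar pattern (the first, stricter definition) would instead compare the normalized patterns themselves across frequency, since two different patterns can share the same DI.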
A method is presented for soundfield navigation through estimation of the spherical harmonic coefficients (i.e., the higher-order ambisonics signals) of a soundfield at a position within an array of two or more ambisonics microphones. An existing method based on blind source separation is known to suffer from audible artifacts, while an alternative method, in which a weighted average of the ambisonics signals from each microphone is computed, is shown to necessarily introduce comb-filtering and degrade localization for off-center sources. The proposed method entails computing a regularized least-squares estimate of the soundfield at the listening position using the signals from the nearest microphones, excluding those that are nearer to a source than to the listening position. Simulated frequency responses and predicted localization errors suggest that, for interpolation between a pair of microphones, the proposed method achieves both accurate localization and minimal spectral coloration when the product of angular wavenumber and microphone spacing is less than twice the input expansion order. It is also demonstrated that failure to exclude from the calculation those microphones that are nearer to a source than to the listening position can significantly degrade localization accuracy.
This paper will be presented by Joseph (Joe) G. Tylka at the 2016 Audio Engineering Society International Conference on Audio for Virtual and Augmented Reality (AES 2016 AVAR). More information on this paper can be found on its AES page.
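The abstract's validity condition (product of angular wavenumber and microphone spacing less than twice the input expansion order, k * d < 2N) can be rearranged into a practical upper frequency limit for interpolation between a microphone pair. The example values below (order, spacing) are illustrative and not taken from the paper.

```python
import numpy as np

# The interpolation validity condition from the abstract, k * d < 2 * N,
# rearranged into an upper frequency limit: f < N * c / (pi * d), using
# k = 2 * pi * f / c. Example order and spacing are assumptions.

def max_valid_frequency(order_N, spacing_m, c=343.0):
    """Frequency (Hz) at which k * d reaches 2 * N for the given spacing."""
    return order_N * c / (np.pi * spacing_m)

# e.g., interpolating between two 4th-order ambisonics microphones 1 m apart:
f_max = max_valid_frequency(order_N=4, spacing_m=1.0)
```

This makes the trade-off concrete: accurate full-band interpolation requires either high-order microphones or closely spaced ones, since the usable bandwidth scales with N / d.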