Evolution of Automotive Audio Tuning and Testing

August 13 2025, 14:10

Offering an updated perspective on automotive audio, this article highlights the advantages of utilizing a head-and-torso simulator (HATS) and introduces the Multi-Dimensional Audio Quality Score (MDAQS) assessment of audio quality, both key technologies pioneered by HEAD acoustics. It also offers valuable insights into the methods used to benchmark automotive audio and advanced audio processing in modern vehicles equipped with high channel counts and headrest speakers.

This article discusses the evolution of methods used to benchmark automotive audio and advanced audio processing in modern vehicles equipped with high channel counts and headrest speakers. It highlights the advantages of utilizing a head-and-torso simulator (HATS) and introduces the multi-dimensional audio quality score (MDAQS) assessment for evaluating audio quality.

It is hard to believe it’s been more than 40 years since Earl Geddes and Henry Blind presented their paper “The Localized Sound Power Method” at the 76th Convention of the Audio Engineering Society (AES) in New York [1]. The test methodology introduced in that paper has been essential to tuning automotive audio systems and is still widely used today.

Imagine the audio systems available in automobiles from that era compared to a car produced today. Stereo was a common feature, but the standard would have been four loudspeakers (doors, A-pillars, rear shelf, dash), and in rare trim levels, a subwoofer.

In contrast, today’s production vehicles can have 20 channels with loudspeakers, tweeters, and subwoofers packaged almost anywhere. Headrest loudspeaker systems, once only an audio designer’s dream, are now in production and can offer a personalized experience for each passenger. Advanced features such as active noise cancellation (ANC) for correlated noise sources such as powertrain, and Road Noise Compensation (RNC) for uncorrelated noise sources such as tire/road/wind noise, are common in even base trims. Immersive experiences that utilize spatial audio are now becoming common, making the automobile not just a “stereo system” but a center of entertainment. With all the additional channels and features being added to improve the audio experience, is the Geddes-Blind method still the most appropriate to use?

Automotive Audio System Testing Today
The famous six-microphone array (Figure 1) implemented by Geddes-Blind accounted for the distance between the ears (15 cm) and the 99th percentile ear ellipsoid, which essentially covers all the possible ear positions. This method was ideal for quickly measuring frequency responses in the passenger’s location, adding a level of objectivity to the tuning process, and making the entire process more efficient.

Figure 1: A six-microphone array positioned in the driver’s seat.
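
To make the idea of a spatially averaged response concrete, here is a minimal Python sketch. It assumes an energy (mean-square) average across the six microphones, which is a common choice but is not spelled out in this article; the function name and the example levels are hypothetical.

```python
import math

def spatial_power_average_db(levels_db):
    """Energy-average per-microphone levels (dB) into one level per frequency bin.

    levels_db: one list per microphone, each holding a level in dB for every
    frequency bin (all lists must have the same length).
    """
    n_mics = len(levels_db)
    n_bins = len(levels_db[0])
    averaged = []
    for k in range(n_bins):
        # Convert dB to linear power, average across microphones, convert back.
        mean_power = sum(10 ** (mic[k] / 10) for mic in levels_db) / n_mics
        averaged.append(10 * math.log10(mean_power))
    return averaged

# Hypothetical levels at three frequency bins for six microphone positions.
six_mic_levels = [
    [80.0, 74.0, 70.0],
    [80.0, 76.0, 70.0],
    [80.0, 75.0, 64.0],
    [80.0, 73.0, 70.0],
    [80.0, 77.0, 70.0],
    [80.0, 75.0, 70.0],
]
average = spatial_power_average_db(six_mic_levels)
```

An energy average weights loud positions more than a plain average of dB values would, so the single quiet position in the last bin pulls the result down only slightly.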

However, the six-microphone array does not account for the head-related transfer function (HRTF) and its impact on the sound field. Nor does it accurately represent the cabin acoustic characteristics when “loaded” with a body [2]. With more advanced audio systems featuring spatial audio rendering, personalized audio zones, and noise cancellation, an HRTF is essential for effective tuning.

The industry has been reluctant to use a binaural head, called a HATS, to tune vehicle audio systems. Concerns about repeatability, additional test time, and even questions about the validity of including an HRTF in the tuning process have all been raised. All valid questions. But as vehicle audio systems have evolved, the tools used to tune them need to evolve as well.

The concern about repeatability is twofold: is the HATS a repeatable “transducer,” and are measurements made by a HATS repeatable when physically placing (and re-placing) it in a vehicle? The repeatability of the transducer was addressed 30 years ago by the ITU-T. The size, shape, and acoustic characteristics of a HATS are all well defined by ITU-T recommendation P.58 [3] (introduced in 1993, updated in 2023), as is the ear shape by ITU-T recommendation P.57 [4] (introduced in 1993, updated in 2021).

Contemporary measurement-grade HATS meet both ITU-T standards and are the primary transducer used in the voice communication market. The ITU-T HATS has also been the standard used by Noise, Vibration, and Harshness (NVH) labs in the automotive market since the 1990s. Using a HATS based on the ITU-T standard not only ensures repeatability of measurement by the transducer but is also in line with the rest of the acoustics industry. When the Geddes-Blind paper was written, HATS technology had just been introduced in test labs and was likely not considered by the authors.

Regarding repeatability of measurements, there have been recent studies [5, 6] that show HATS measurements in cars to be very repeatable, almost on par with the six-microphone array. But even the most careful engineer could introduce a measurement error with the six-microphone array when placing it in a vehicle seat. A HATS has the added benefit of simple fixturing for positioning with a laser level or simple measurement sticks. Even though the Geddes-Blind paper suggests that a six-microphone array is sufficient, that is at the discretion of the user.

Most labs use the six-microphone array and vehicle software tuning tools to capture measurements following the recommendations currently being summarized in the AES Automotive Audio committee’s Automotive Audio white paper [7]. This recommendation focuses on three primary metrics: max SPL, frequency response, and impulsive distortion. All are measured with no driving noise. These objective measurements with the six-microphone array establish the baseline system “performance” but are supplemented with many subjective “Golden Ear” listening sessions throughout the tuning phase, all the way to final product signoff.

This now raises the question: what measurement data best supplements what is being perceived, and how are advanced features such as spatial audio being accounted for? Are modern measurement metrics able to support the “Golden Ears” so they can be more efficient during the final hand tune?

Near Future of Automotive Audio Tuning and Testing
Three key concepts come to mind: First, there is no reason to replace or rid ourselves of the six-microphone array especially just as the AES is trying to standardize the measurement approach and data; rather, we need to find a way to better supplement that data. Second, we need to bring the ears and the subjective into the tuning phase earlier to help the “Golden Ears.” Third, we need better methods for evaluating the spatial components of audio in the car.

To clarify, there still needs to be a human (or “Golden Ear”) in the tuning phase, and certainly during validation. Some artistic expressions reflective of brand or product price point can be difficult to obtain objectively.

Expanding the Six-Mic Array Measurements
When car audio systems expand from four to say 20 (or more!) loudspeakers, it is all the more important to capture and document the individual delays that occur from source (loudspeaker) to sink (seat location). In connection with impulse response measurements for each loudspeaker, we can form a picture of how the loudspeakers behave in the cabin especially as we transition through the Schroeder frequency [8].

Not only would these two simple measurements help with loudspeaker characterization and interaction, they also can uncover potential cabin acoustics issues like reverberant or absorptive spots (that are critical to know when tuning noise cancelling systems).
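
As a rough illustration of both ideas, the sketch below estimates a loudspeaker-to-seat arrival delay from an impulse response and evaluates the common Schroeder-frequency approximation, f ≈ 2000·sqrt(RT60/V). The threshold-based onset detector, the function names, and the cabin values (reverberation time, volume) are illustrative assumptions, not measured data.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C

def arrival_delay_ms(impulse_response, sample_rate_hz, threshold_ratio=0.5):
    """Time (ms) of the first sample whose magnitude reaches
    threshold_ratio times the absolute peak of the impulse response."""
    peak = max(abs(x) for x in impulse_response)
    if peak == 0.0:
        raise ValueError("impulse response is all zeros")
    for index, sample in enumerate(impulse_response):
        if abs(sample) >= threshold_ratio * peak:
            return 1000.0 * index / sample_rate_hz

def schroeder_frequency_hz(rt60_s, volume_m3):
    """Common approximation of the Schroeder frequency: 2000 * sqrt(RT60 / V)."""
    return 2000.0 * math.sqrt(rt60_s / volume_m3)

# Synthetic impulse response: direct sound at sample 96, one weaker reflection later.
sample_rate_hz = 48000
impulse_response = [0.0] * 480
impulse_response[96] = 1.0
impulse_response[240] = 0.4

delay_ms = arrival_delay_ms(impulse_response, sample_rate_hz)
path_length_m = SPEED_OF_SOUND_M_S * delay_ms / 1000.0

# Placeholder cabin values, chosen only to show the order of magnitude.
transition_hz = schroeder_frequency_hz(rt60_s=0.06, volume_m3=3.0)
```

With these placeholder numbers the direct sound arrives after 2 ms (about 0.69 m of path) and the transition region sits near 283 Hz. Repeating the delay estimate for every loudspeaker and seat gives the delay table the text calls for.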

Introduce the Subjective
Let’s put a human proxy in the car (i.e., a HATS). First, the acoustic space is now loaded as it would be in real-world conditions by both the HRTF and the presence of a “body.” Second, we can now perform binaural measurements that correspond to what the average human would perceive.

We should start by applying the same measurements from the six-microphone array to the binaural setup, ideally at the 5th, 50th and 95th percentile seat location (as seen in Figure 2).

Figure 2: A HATS positioned in the driver’s seat on a seat mount, that can be adjusted to fit the 5th, 50th, and 95th percentile.

Additionally, one novel measurement that should be considered is Multi-Dimensional Audio Quality Score [9] (MDAQS), a Mean Opinion Score (MOS) metric designed to evaluate audio quality from an audio playback system. Applying a specially designed stimulus signal consisting of isolated sweeps and music from six different genres (see Figure 3), it provides the combined perceived audio quality score along three key dimensions: Timbre, Distortion, and Immersiveness.

This metric, like all MOS metrics, is a mathematical model based on subjective impressions from a wide-ranging jury study of several hundred naïve listeners. It’s like inviting 300 of your favorite friends to evaluate your audio system every time you run the measurement. The combined MOS provides valuable insight into the general acceptance of the audio system performance. With the addition of the individual dimensional scores, you can evaluate specific aspects of a vehicle system as well: Timbre, Distortion, and Immersiveness. As mentioned, this is based on naïve listeners and therefore should be considered “ears” and not “Golden Ears.” Like the six-microphone array, MDAQS can be a powerful tool in the toolbox of any audio engineer.

Sidenote: The research leading up to the creation of the objective metric, MDAQS, was not done in isolation, but is based on research already done and publicly available.

Last, it’s worth mentioning that binaural measurements open the door for subjective evaluations outside the vehicle with aurally accurate binaural playback. Not only can “Golden Ears” get a chance to do early-build evaluations from anywhere in the world, but this also enables A/B comparisons, competitive analysis, and consumer feedback through jury studies. There is no better audio analyzer than the human brain!

Better Spatial Evaluation
Apart from measuring multiple seat locations at once, we should consider measuring the Inter-aural Time Difference (ITD) and Inter-aural Level Difference (ILD) data to objectively determine the sound stage. This can really only be done with a HATS as we need the inter-aural, ear-to-ear delays and attenuations, as well as the reflections and diffractions that occur around the shoulders, head, pinna, and concha.

The theory is that for a center stage experience, where we present an isolated source at 0-degree incidence, the ITD and ILD for each seat ought to be zero (i.e., sound arrival at each ear happens at the same time, at the same amplitude). However, considering audio content might take advantage of the whole sound stage, from left to right, we need a technique to better understand the smoothness and realism of the sound stage.
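
A minimal sketch of how ITD and ILD could be estimated from a pair of ear signals follows. It assumes a broadband cross-correlation peak for ITD and an RMS level ratio for ILD (right ear minus left ear); real analyzers typically work per frequency band, and the function names, the ITD sign convention, and the synthetic test signal are hypothetical.

```python
import math

def rms(signal):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def ild_db(left, right):
    """Inter-aural level difference, right ear minus left ear, in dB."""
    return 20.0 * math.log10(rms(right) / rms(left))

def itd_samples(left, right, max_lag):
    """Lag (in samples) that maximizes the cross-correlation of the ear signals.

    Assumed convention: a positive result means the right-ear signal
    arrives later than the left-ear signal.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(left[i] * right[i + lag] for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Synthetic example: a decaying 1 kHz burst at the left ear; the right ear
# receives the same burst 12 samples later and at half the amplitude.
sample_rate_hz = 48000
left = [math.exp(-i / 50.0) * math.sin(2.0 * math.pi * 1000.0 * i / sample_rate_hz)
        for i in range(480)]
delay = 12
right = [0.0] * delay + [0.5 * x for x in left[:-delay]]

itd_us = 1e6 * itd_samples(left, right, max_lag=48) / sample_rate_hz
ild = ild_db(left, right)
```

For a well-centered source both values would be close to zero. In this synthetic case the right ear is 250 µs late and about 6 dB quieter, which would pull the perceived image toward the left.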

One proposed method is to use ILD vs. time. Assume an audio playback system wants to reproduce the soundstage equivalent to two loudspeakers set at ±45 degrees off axis. If we create a source file that pans from -45 degrees all the way to +45 degrees, and you were to turn your head in time with the panning, then an ideal setup should produce an ILD of 0 during the panning. Applying a 1500 Hz high-pass filter isolates the playback primarily to the tweeters and concentrates the acoustic energy in the frequencies at which humans are most sensitive to ILD.

Using a HATS with motorized head-above-torso rotation, we can perform this ILD (right ear minus left ear) measurement in a car (Figure 4). Referring to Figure 5, our “ideal” scenario is shown by “Car 1.” As we pan from left to right in complete synchronization with the source signal, the ILD stays at 0 and the sound stage is presented smoothly and realistically at all angles. “Car 2” shows a linear error that starts with a left ear bias of 2 dB and ends with a right ear bias of 2 dB. This would indicate the soundstage is expanded. “Car 3” shows a different type of behavior, which would be interpreted as an expanded and offset soundstage.

Figure 4: Example of a HATS with motorized head-above-shoulder rotation.

Figure 5: Example ILD vs. Time data to indicate spatial.
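
One way to reduce an ILD vs. time trace to a couple of numbers is sketched below: frame-wise ILD (right ear minus left ear) followed by a least-squares line, giving an average offset and an end-to-end span. This is an assumed post-processing step, not the procedure used to produce Figure 5. The function names and the synthetic “Car 2”-like signal are hypothetical, and the 1500 Hz high-pass filter is assumed to have been applied beforehand.

```python
import math

def frame_ild_db(left, right, frame_len):
    """ILD (right minus left, dB) for consecutive non-overlapping frames."""
    track = []
    usable = min(len(left), len(right))
    for start in range(0, usable - frame_len + 1, frame_len):
        l_energy = sum(x * x for x in left[start:start + frame_len])
        r_energy = sum(x * x for x in right[start:start + frame_len])
        track.append(10.0 * math.log10(r_energy / l_energy))
    return track

def summarize_track(track_db):
    """Fit a least-squares line to the ILD track.

    Returns (mean offset in dB, end-to-end span of the fitted line in dB).
    """
    n = len(track_db)
    mean_x = (n - 1) / 2.0
    mean_y = sum(track_db) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(track_db))
    return mean_y, (sxy / sxx) * (n - 1)

# Synthetic ear signals: a 2 kHz tone whose right-ear gain ramps so that the
# ILD moves linearly from -2 dB (left bias) to +2 dB (right bias).
sample_rate_hz = 48000
frame_len = 480  # 10 ms, a whole number of periods of the test tone
n_frames = 9
left, right = [], []
for frame in range(n_frames):
    target_ild_db = -2.0 + 4.0 * frame / (n_frames - 1)
    gain = 10.0 ** (target_ild_db / 20.0)
    for i in range(frame_len):
        t = (frame * frame_len + i) / sample_rate_hz
        sample = math.sin(2.0 * math.pi * 2000.0 * t)
        left.append(sample)
        right.append(gain * sample)

track = frame_ild_db(left, right, frame_len)
offset_db, span_db = summarize_track(track)
```

A flat track near 0 dB corresponds to the “Car 1” behavior described above. A span of about 4 dB with no offset matches the “Car 2” description, and a non-zero offset combined with a span would point toward the “Car 3” case.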

A second use case for a HATS with motorized head-above-torso rotation is to perform spectral analyses of the sound field in each seat location and compare to the 0-degree position (looking straight ahead) as well as the average spectral response for all angles (Figure 6).

Figure 6: Frequency response of rotated head, relative to 0 degree, relative to average of all angles.
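
The comparison behind this kind of plot can be sketched in a few lines. The example assumes per-angle responses are already available as dB levels per frequency bin and that the all-angle average is a plain arithmetic mean of those dB values; both the averaging choice and the data are hypothetical.

```python
def relative_responses_db(responses_db, reference_angle=0):
    """Express each head-angle response relative to two baselines.

    responses_db: dict mapping head angle (degrees) to a list of levels in dB,
    one per frequency bin.
    Returns (relative to the reference angle, relative to the all-angle average).
    """
    angles = sorted(responses_db)
    reference = responses_db[reference_angle]
    n_bins = len(reference)
    # Assumed averaging: arithmetic mean of dB values across all angles.
    average = [sum(responses_db[a][k] for a in angles) / len(angles)
               for k in range(n_bins)]
    rel_to_reference = {
        a: [responses_db[a][k] - reference[k] for k in range(n_bins)] for a in angles
    }
    rel_to_average = {
        a: [responses_db[a][k] - average[k] for k in range(n_bins)] for a in angles
    }
    return rel_to_reference, rel_to_average

# Hypothetical levels at three frequency bins for three head angles.
responses = {
    -30: [70.0, 68.0, 60.0],
    0: [70.0, 70.0, 66.0],
    30: [70.0, 66.0, 63.0],
}
rel_ref, rel_avg = relative_responses_db(responses)
```

Large deviations in either view at a given angle would suggest that the tonal balance or spatial image changes noticeably when the listener turns their head.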

This type of analysis would allow us to see how well the spatial elements would hold up for a person rotating their head. For a driver, head rotation is hopefully limited (eyes on the road!), but for passenger seat locations, and for “drivers” in autonomous vehicles, this is even more important. It would also be important for speakers built into the seat headrest, for evaluating noise cancellation at different head angles, and for evaluating listening effort between vehicle occupants through in-car communication systems.

Future Work
Most measurements and evaluations mentioned are designed for silent conditions. And while people do enjoy audio content while stopped at a traffic light, most playback occurs at vehicle speed, with other systems (e.g., ANC/RNC) active. While we currently aim to characterize the audio system in quiet conditions with no interfering systems, we are acutely aware of the interference and masking that occur under real-world use conditions. It’s a good time to be an automotive audio engineer.

Conclusion
Test methodologies should strive to be as sophisticated as the technology being tested. It’s fair to say that audio engineers have evolved along with the technology present in cars since 1984.

The first recommendation is to consider using a HATS along with the six-microphone array. While the six-microphone array gives a good spatial average of different seat positions, is reliable, and is industry accepted, it is missing the crucial HRTF and “loading” that is added with the HATS. The HRTF and “loading” represent the real-life use case of a human sitting in a seat. Binaural measurements also allow later subjective listening tests to be done when the tested vehicle is long gone. Finally, the HATS is more accurate for testing modern audio systems that use seat headrest speakers and spatial audio rendering.

The second recommendation is to consider using modern metrics such as MDAQS. The original six-microphone array relies heavily on a single spatially averaged frequency response function curve. While elegant for its time, this is insufficient for today’s modern audio systems.

The proposed metrics include MDAQS (a MOS-based quality score) and ILD and ITD measured with a HATS that is able to “turn its head,” which can better measure spatial audio played back from today’s vehicle audio systems. These modern metrics are not meant to replace the six-microphone array or the “Golden Ear,” but rather to augment both with more tools that can make tuning more efficient and repeatable.

Finally, new transducers and metrics will be necessary as we evaluate audio systems in realistic scenarios (e.g., in the presence of background noise) and working in tandem with other advanced DSP technologies (such as engine sound enhancement, noise cancellation, in-car communications, and more [10]).

In summary, make sure your audio testing program keeps up with the new audio systems you are developing!

References
[1] E. Geddes and H. Blind, “The Localized Sound Power Method,” Ford Motor Co., Dearborn, MI; Paper 2127; 1984, https://aes2.org/publications/elibrary-page/?id=11627
[2] H. Brücher, M. Wegerhoff, D. Beljan, and T. Kamper, “Investigations of the influence of an artificial head on acoustic characteristics of vehicle cabins based on FE simulation results,” HEAD acoustics GmbH, DAGA 2023; www.researchgate.net/publication/370214770_Investigations_of_the_influence_of_an_artificial_head_on_acoustic_characteristics_of_vehicle_cabins_based_on_FE_simulation_results
[3] ITU-T P.58; Head and Torso for telephonometry, www.itu.int/ITU-T/recommendations/rec.aspx?rec=11458
[4] ITU-T P.57; Artificial Ears, www.itu.int/ITU-T/recommendations/rec.aspx?id=14662&lang=en
[5] J. Soendergaard, “Determining the Repeatability of Automotive Audio Testing pt. 1,” LinkedIn, December 12, 2024, www.linkedin.com/pulse/determining-repeatability-automotive-audio-testing-jacob-soendergaard-xvume/
[6] J. Soendergaard, “Determining the Repeatability of Automotive Audio Testing pt. 2,” LinkedIn, December 19, 2024, www.linkedin.com/pulse/determining-repeatability-automotive-audio-testing-pt-soendergaard-mfzwe/
[7] Audio Engineering Society (AES) Automotive Audio Committee, www.aes.org/technical/aa
[8] “Room Acoustics,” Wikipedia, https://en.wikipedia.org/wiki/Room_acoustics
[9] “MDAQS — Measuring operation with ACQUA,” Application Note, HEAD acoustics GmbH 2023; https://cdn.head-acoustics.com/fileadmin/data/global/Application-Notes/Telecom/MDAQS-Measuring-operation-with-ACQUA-Application-Note.pdf
[10] J. Soendergaard and F. Kettler, “Enhancing In-Vehicle Communication and Analyzing Listening Effort and Acoustic Privacy,” audioXpress, January 2025.

About the Authors
Jacob Soendergaard is an Audio and Acoustics aficionado privileged to be employed in the industry. He is the Customer Success Manager at HEAD acoustics, working with consumer, business, and military customers on the goal of improving sound, voice and conversational quality in their products. Jacob has a B.Eng from Imperial College and MSc from University of Sussex and brings a wealth of experience from various technical and commercial roles from almost two decades in the business. Outside of work, he spends a lot of time on family and sports, and he is a big fan of tacos.

Jesse Gratke is Engineering Services Manager – Lead Consultant at HEAD acoustics. With a background in managing speech and audio engineering services, his expertise extends to optimizing speech communication systems for infotainment products. With a BS degree in Mechanical Engineering from Kettering University and a MS in Acoustics from Penn State, Jesse has a specialty in automotive telecom transducer research and development, packaging, and validation. He holds patents for inventing wideband microphones for automobiles and has earned another five patents for work in developing in-vehicle conference calling, in-vehicle communication systems, and active noise cancellation.

This article was originally published in audioXpress, June 2025
