Bernhard Grill, managing director of Fraunhofer IIS, recently showed me the institute’s latest audio innovations. As the German research body behind the MP3 and a key driver in AAC development, the institute is now merging that legacy with cutting-edge work on MPEG-H. This new standard aims to make listening more flexible, immersive, and adaptive.
What this means for makers and artists
MPEG-H moves audio beyond fixed stereo or surround mixes. By utilising audio objects and dynamic metadata, content can adapt to various playback setups, ranging from immersive speaker arrays to soundbars, TVs, and headphones. For creators, this shifts the focus from static mixes to dynamic experiences that respond to the listener’s environment.
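To illustrate what "adapting to the playback setup" can mean in practice, here is a minimal sketch of one common rendering idea: the object carries a position, and the gains for each loudspeaker are computed at playback time from whatever layout is present. This is not the MPEG-H renderer. The function name `pan_gains`, the simple pairwise constant-power panning law, and the speaker angles are all assumptions chosen for illustration.

```python
import math


def pan_gains(azimuth_deg, speaker_azimuths_deg):
    """Return one gain per speaker for an object at azimuth_deg.

    Hypothetical helper: pans the object between the two speakers that
    bracket it on a horizontal circle, using a constant-power law.
    Angles are in degrees, positive to the listener's left.
    """
    n = len(speaker_azimuths_deg)
    if n == 1:
        return [1.0]

    # Visit speakers in order around the circle.
    order = sorted(range(n), key=lambda i: speaker_azimuths_deg[i] % 360)
    az = azimuth_deg % 360
    gains = [0.0] * n

    for k in range(n):
        a_idx, b_idx = order[k], order[(k + 1) % n]
        a = speaker_azimuths_deg[a_idx] % 360
        b = speaker_azimuths_deg[b_idx] % 360
        span = (b - a) % 360 or 360.0      # arc from speaker a to speaker b
        offset = (az - a) % 360            # object position along that arc
        if offset <= span:
            frac = offset / span
            gains[a_idx] = math.cos(frac * math.pi / 2)
            gains[b_idx] = math.sin(frac * math.pi / 2)
            return gains
    return gains


# Illustrative layouts only; real installations vary.
stereo = [30, -30]
five_speakers = [30, -30, 0, 110, -110]

# The same object (straight ahead) yields different gains per layout.
stereo_gains = pan_gains(0, stereo)
five_gains = pan_gains(0, five_speakers)
```

On the stereo layout the object is shared equally between left and right; on the five-speaker layout it goes entirely to the centre speaker. The mix itself never changes, only the rendering step does.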
Grill, a member of the core group that developed the MP3, emphasises that research must be applicable in the real world. He contrasts the two eras: the original MP3 was built by a small team of roughly five people, whereas MPEG-H required nearly 100 specialists. This reflects a shift in innovation from small, focused groups to large, distributed efforts. As Grill notes, growing complexity makes it harder for the next generation to make a difference, even as hardware power continues to increase.
However, this growing complexity requires a supportive environment. Grill argues that Europe risks losing ground to China and the US due to bureaucracy. Delayed purchasing, regulatory hurdles, and strict restrictions on AI usage force researchers to spend time on administrative forms rather than advancing technology.
From underdogs to industry standards
Despite its standing today, Fraunhofer was once seen as an underdog. The industry initially distrusted its technology, viewing it as too complex and doubting its mass-market viability. The project survived only through small contracts that kept the team together while they pushed forward with a technology that few fully believed in.
The transition was not straightforward. Piracy became a visible side effect of the shift towards efficient compression, though the developers did not intend this outcome. Grill recalls the industry’s early hostility, noting that lawyers attempted to find legal grounds to attack the team but failed. The real issue was a broader cultural shift: sound could now be copied and shared with unprecedented ease.
MP3 addressed the specific problems of an era characterised by expensive storage and slow connections. Fraunhofer recognised early that the internet would transform sound circulation, necessitating efficient compression. The technology expanded rapidly once computers capable of playing MP3s without dedicated hardware became common and the internet spread.
Today, the challenge has evolved. It is no longer just about efficient compression but about redesigning the listening experience itself. The institute’s “Mozart room” exemplifies this. It is a validation and acoustic development environment built with extreme precision. The space has a room-in-room construction with a floating floor, double walls, and a slow-moving air-conditioning system to prevent noise, and it contains around 40 loudspeakers positioned at various heights. These include a middle ring adjustable to ear level and speakers placed above and below to create vertical spatiality.
The Mozart and Bach rooms
Ulli Scuda, Head of Group Soundlab, guided the tour and explained the room’s physical design. Even the large aluminium ring supporting the system was engineered to avoid vibrations that could interfere with reproduction. The central function of the room is to host listening tests where evaluators compare signals and technologies to determine if improvements are truly audible.
For domestic environments, Scuda highlights the 7.4 configuration, more commonly written as 7.1.4 when the subwoofer channel is counted, as particularly convincing. The four additional speakers placed above the listener reinforce height and immersion. While larger systems like the 22.2 format used by NHK in Japan exist, the 7.4 setup offers powerful spatial reproduction without requiring extreme infrastructure. Side positions are crucial for reproducing reverberation and spatial detail, such as in recordings of churches or cathedrals.
MPEG-H in the real world
MPEG-H transforms the listening experience through audio objects and dynamic metadata. It allows language switching without losing the ambient background sound, enables separate adjustment of dialogue and background levels, and permits further customisation within whatever options the content provider chooses to offer.
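The principle can be sketched in a few lines of code. The example below is not the MPEG-H decoder or its metadata format; the `AudioObject` class, the `role` and `language` fields, and the `render_mix` function are hypothetical names used only to show why keeping objects separate until playback makes language switching and dialogue adjustment possible.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioObject:
    """One element of the programme, kept separate until playback."""
    name: str
    role: str                        # "dialogue" or "background" (assumed labels)
    samples: List[float]
    language: Optional[str] = None   # only meaningful for dialogue objects


def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def render_mix(objects, language, dialogue_db=0.0, background_db=0.0):
    """Sum the objects into one signal according to listener preferences."""
    length = max((len(o.samples) for o in objects), default=0)
    out = [0.0] * length
    for obj in objects:
        if obj.role == "dialogue":
            if obj.language != language:
                continue             # skip dialogue in other languages
            gain = db_to_gain(dialogue_db)
        else:
            gain = db_to_gain(background_db)
        for i, sample in enumerate(obj.samples):
            out[i] += gain * sample
    return out


# Toy programme: one shared ambience bed and two commentary languages.
programme = [
    AudioObject("stadium", "background", [0.5, 0.5, 0.5]),
    AudioObject("commentary_en", "dialogue", [0.1, 0.2, 0.3], language="en"),
    AudioObject("commentary_pt", "dialogue", [0.3, 0.2, 0.1], language="pt"),
]

# Portuguese commentary, with the crowd turned down by 20 dB.
mix = render_mix(programme, language="pt", background_db=-20.0)
```

Because the ambience is its own object, switching from `"pt"` to `"en"` replaces only the commentary while the stadium sound carries on unchanged. In a fixed stereo mix, those elements would already be baked together.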
This technology is no longer confined to labs. In Brazil, MPEG-H has been adopted through trials and pilots, becoming part of the new TV 3.0 framework formalised by decree on August 27, 2025. Fraunhofer worked with broadcasters like TV Globo on tests and real productions. Local teams eventually absorbed the technology, learning to operate the system autonomously.
The system also adapts to the playback medium. Even through headphones, certain configurations can simulate rear spatiality via binaural processing, transferring three-dimensional complexity to everyday listening formats.
At Fraunhofer IIS, there is no strict divide between engineering and musical sensibility. Mandy Garcia, Head of Marketing and Communication, states that many staff members are musicians, connecting them through a shared passion. The institute even maintains a rehearsal room in the basement where colleagues form bands to play together.
Key takeaways
- MPEG-H shifts audio from fixed mixes to dynamic, object-based experiences that adapt to the playback environment.
- European research progress faces hurdles from bureaucracy and AI restrictions, creating a competitive gap with the US and China.
- Real-world adoption is accelerating, exemplified by Brazil’s official integration of MPEG-H into its TV 3.0 framework.
- Fraunhofer IIS bridges the gap between engineering and musical practice, ensuring technology serves the lived experience of listening.




