Learn audio decoding and rendering with Cavern

Speaker placement guidelines

Traditionally, for channel-based mixes, speaker placement was straightforward: put the fronts on the sides of the screen, the center just behind or below it, and the various surrounds at their corresponding angles. This worked because the average angle of a surround speaker in a commercial cinema was the same. Object-based audio works on drastically different principles, yet speaker placement practices were not updated for it. Angular placement is a good rule of thumb and very easy to describe, but it doesn't produce perfect spatial precision.

For present-day recommended speaker locations to make sense, we need to know how an object-based renderer works. The long story is on the Rendering tab, but the short story is as follows: point objects are rendered by finding a bounding box of speakers, and the closer the object is to a speaker, the louder it plays from that speaker. For precise placement, these boxes have to be kept as boxes, so each top channel has to be directly above a front or surround on the ground. Before getting into the exact locations of the speakers, another thing to note is that in mixing rooms and cinemas, screens traditionally go from wall to wall. This means the fronts shall stay next to the screen, regardless of how large it is, to perfectly convey side-to-side movements on it. While codecs allow for handling the front wall and the screen differently (this is called anchoring), it wasn't found to be used at all in practice, so Enhanced AC-3, and thus Cavern, assumes the front corners of the room are the sides of the screen.
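The bounding box idea can be illustrated with a minimal sketch. It is not Cavern's actual code: the function name `box_gains` is hypothetical, and the gain law (trilinear weights with a square root to keep total power constant) is an assumption chosen only to show "the closer the object, the louder the speaker".

```python
import math
from itertools import product

def box_gains(obj, box_min, box_max):
    """Gains for the 8 corners of an axis-aligned speaker box (hypothetical helper).

    obj, box_min, box_max are (x, y, z) tuples; the box must have a
    non-zero size on every axis and contain the object.
    """
    # Normalized position of the object inside the box on each axis (0..1).
    t = [(p - lo) / (hi - lo) for p, lo, hi in zip(obj, box_min, box_max)]
    gains = {}
    for picks in product((0, 1), repeat=3):
        corner = tuple(hi if k else lo for k, lo, hi in zip(picks, box_min, box_max))
        # The weight grows as the object approaches this corner on every axis.
        weight = 1.0
        for k, ti in zip(picks, t):
            weight *= ti if k else 1.0 - ti
        # Assumed gain law: square root of the weight keeps total power constant.
        gains[corner] = math.sqrt(weight)
    return gains

# An object on the front wall, halfway between Front Left (-1, 0, 1)
# and Front Center (0, 0, 1), inside the box spanning the left half of the room.
gains = box_gains((-0.5, 0.0, 1.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 1.0))
for corner, gain in gains.items():
    if gain > 0:
        print(corner, round(gain, 4))
```

In this example only the two front speakers on the ground receive signal, with equal gain. If a top speaker were not directly above its ground counterpart, the box would be skewed and weights like these would no longer match the position intended in the mix.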

Channel positions for object-based audio

Because the industry standard is allocentric mixing, which means movement is defined relative to the room and not measured in angles, channel positions in mixes and standards are placed on wall midpoints and corners. This is defined precisely in Table B.10 of the standard "ETSI TS 103 420 V1.2.1". Playing object-based mixes in Cavern also confirms that this is indeed the case for all content. Knowing which positions are used by objects as main channel locations, we can match them to common names and write a human-readable position for each of them. These are where channels shall be placed for object-based content:

| Channel | Euclidean coordinates (X, Y, Z) | Y angle | X angle | Human-readable position |
|---|---|---|---|---|
| Front Left | -1, 0, 1 | -45° | | Front wall, left side of the screen |
| Front Right | 1, 0, 1 | 45° | | Front wall, right side of the screen |
| Front Center | 0, 0, 1 | | | Front wall, middle of the screen |
| Rear Left | -1, 0, -1 | -135° | | Rear left corner |
| Rear Right | 1, 0, -1 | 135° | | Rear right corner |
| Side Left | -1, 0, 0 | -90° | | Middle of the left wall |
| Side Right | 1, 0, 0 | 90° | | Middle of the right wall |
| Wide Left | -1, 0, 0.677419 | -55.89° | | Left wall, about 16% from the front |
| Wide Right | 1, 0, 0.677419 | 55.89° | | Right wall, about 16% from the front |
| Top Front Left | -1, 1, 1 | -45° | -45° | Ceiling, perfectly above Front Left |
| Top Front Right | 1, 1, 1 | 45° | -45° | Ceiling, perfectly above Front Right |
| Top Front Center | 0, 1, 1 | | -45° | Center of the front wall/ceiling corner |
| Top Rear Left | -1, 1, -1 | -135° | -45° | Top rear left corner |
| Top Rear Right | 1, 1, -1 | 135° | -45° | Top rear right corner |
| Top Rear Center | 0, 1, -1 | 180° | -45° | Center of the rear wall/ceiling corner |
| Top Side Left | -1, 1, 0 | -90° | -45° | Midway between TFL and TRL positions |
| Top Side Right | 1, 1, 0 | 90° | -45° | Midway between TFR and TRR positions |
| God's Voice | 0, 1, 0 | | -90° | Middle of the ceiling |
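The horizontal (Y) angles in the table follow directly from the coordinates. The sketch below is only an illustration: the helper names `y_angle` and `distance_from_front` are hypothetical, and it assumes the first coordinate runs left to right and the third runs from the rear wall (-1) to the front wall (1), as the table's positions suggest.

```python
import math

def y_angle(x, z):
    """Horizontal angle in degrees: 0 is straight ahead, negative is to the left."""
    return math.degrees(math.atan2(x, z))

def distance_from_front(z):
    """Fraction of the room depth between the front wall (z = 1) and z."""
    return (1.0 - z) / 2.0

# A few ground-level channels from the table, as (x, y, z) coordinates.
channels = {
    "Front Left": (-1, 0, 1),
    "Rear Right": (1, 0, -1),
    "Side Left": (-1, 0, 0),
    "Wide Left": (-1, 0, 0.677419),
}
for name, (x, _, z) in channels.items():
    print(f"{name}: {y_angle(x, z):.2f} degrees")

print(f"Wide channels sit {distance_from_front(0.677419):.0%} of the depth from the front wall")
```

This is also where the "about 16% from the front" description of the wide channels comes from: 0.677419 on a scale from -1 to 1 is roughly 16% of the room's depth behind the front wall.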

Transforming the angles to specific rooms

The angles shown in the table on the Y (horizontal) and X (vertical) axes are calculated from the center of the room, for a perfectly cubical room. The ideal listening position is not the center, it's about 2/3 of the way to the back, but the angles should stay, because these are not individual channels, but channel arrays. In a commercial cinema, before object-based content appeared, all speakers on the left wall played the left surround channel's signal, giving all listeners the experience of left sounds coming exactly from the left. Because modern home content has only two options on the side walls, side and wide speakers, this directionality shall be preserved. Wide channels were actually created to help move sounds from the screen, whatever size it is, into the room, which is handled as a perfect cube. In commercial cinemas, they are also medium-sized, between the large fronts and the small surrounds, to help sounds fade better from huge speakers to small ones.

Knowing these, the mathematically optimal way to reproduce the soundstage as it was actually mixed is to point a laser pointer from the main listener's head in the directions described here, and put the speakers where the beam hits a wall or the ceiling, with the exception of the fronts. This conveys the exact object movements described in content rendered with Cavern; all other setups warp the space in some direction.
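The laser pointer method can also be simulated before drilling any holes. The sketch below is an illustration under stated assumptions, not part of Cavern: `speaker_spot` is a hypothetical helper, the room is a plain rectangular box measured in meters from the rear left floor corner, and negative X angles point upward, as in the table. The fronts are not covered, since they stay next to the screen.

```python
import math

def speaker_spot(listener, room, y_angle_deg, x_angle_deg):
    """Point where a beam from the listener's head hits a room boundary.

    listener: (x, y, z) position of the head in meters
    room: (width, height, depth) in meters
    x runs from the left wall, y from the floor, z from the rear wall.
    """
    yaw = math.radians(y_angle_deg)
    pitch = math.radians(-x_angle_deg)  # assumed: negative X angle means up
    direction = (
        math.sin(yaw) * math.cos(pitch),  # toward the right wall
        math.sin(pitch),                  # toward the ceiling
        math.cos(yaw) * math.cos(pitch),  # toward the front wall
    )
    # Distance along the beam to the nearest boundary it is heading for.
    nearest = math.inf
    for pos, size, d in zip(listener, room, direction):
        if d > 1e-12:
            nearest = min(nearest, (size - pos) / d)
        elif d < -1e-12:
            nearest = min(nearest, -pos / d)
    return tuple(pos + nearest * d for pos, d in zip(listener, direction))

# A 4 m wide, 2.5 m high, 6 m deep room with the listener's head
# centered left to right, 1.2 m high, about 2/3 of the way to the back.
head = (2.0, 1.2, 2.0)
room = (4.0, 2.5, 6.0)
for name, y_angle, x_angle in [("Side Left", -90, 0), ("Rear Left", -135, 0),
                               ("Top Side Left", -90, -45), ("God's Voice", 0, -90)]:
    x, y, z = speaker_spot(head, room, y_angle, x_angle)
    print(f"{name}: x={x:.2f} m, y={y:.2f} m, z={z:.2f} m")
```

With real measurements in place of the example numbers, the printed coordinates are the spots where the beam would land, so they can be checked against where mounting a speaker is physically possible.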