<p>This specification defines a model for representing <a data-link-type="dfn" href="#immersive-audio" id="ref-for-immersive-audio②">Immersive Audio</a> contents based on <a data-link-type="dfn" href="#audio-substream" id="ref-for-audio-substream">Audio Substream</a>s contributing to <a data-link-type="dfn" href="#audio-element" id="ref-for-audio-element④">Audio Element</a>s meant to be rendered and mixed to form one or more presentations as depicted in the figure below.</p>
<p>The term <dfn class="dfn-paneled" data-dfn-type="dfn" data-noexport id="rendered-mix-presentation">Rendered Mix Presentation</dfn> means a <a data-link-type="dfn" href="#3d-audio-signal" id="ref-for-3d-audio-signal⑦">3D audio signal</a> after the <a data-link-type="dfn" href="#audio-element" id="ref-for-audio-element①①">Audio Element</a>(s) defined in a <a data-link-type="dfn" href="#mix-presentation" id="ref-for-mix-presentation②">Mix Presentation</a> is(are) rendered and mixed together for playback through physical loudspeakers or headphones.</p>
<p>Based on the model, this specification defines the Immersive Audio Model and Formats (<dfn data-dfn-type="dfn" data-noexport id="iamf">IAMF<a class="self-link" href="#iamf"></a></dfn>) architecture as depicted in the figure below.</p>
<p>Within an <a data-link-type="dfn" href="#ia-sequence" id="ref-for-ia-sequence①①">IA Sequence</a>, all <a data-link-type="dfn" href="#mix-presentation" id="ref-for-mix-presentation⑧">Mix Presentation</a>s have the same duration, defining the duration of the <a data-link-type="dfn" href="#ia-sequence" id="ref-for-ia-sequence①②">IA Sequence</a>, and have the same presentation start time defining the presentation start time of the <a data-link-type="dfn" href="#ia-sequence" id="ref-for-ia-sequence①③">IA Sequence</a>.</p>
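The constraint above can be checked mechanically. The following is a minimal sketch, assuming each Mix Presentation has already been reduced to a `(presentation_start, duration)` pair in a common time base; the function name and the tuple representation are hypothetical and not part of the specification.

```python
def ia_sequence_timing(mix_presentations):
    """Return the (start, duration) shared by all Mix Presentations.

    `mix_presentations` is an iterable of (presentation_start, duration)
    pairs; this representation is an assumption made for illustration.
    """
    timings = {(start, duration) for start, duration in mix_presentations}
    if len(timings) != 1:
        # Either the list is empty or the presentations disagree.
        raise ValueError(f"inconsistent timing: {sorted(timings)}")
    # The single shared pair defines the timing of the whole IA Sequence.
    return timings.pop()
```

If every pair agrees, the returned pair is the presentation start time and duration of the IA Sequence itself; otherwise the input does not satisfy the rule.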
<p>The term <dfn class="dfn-paneled" data-dfn-type="dfn" data-noexport id="temporal-unit">Temporal Unit</dfn> conceptually means a set of all <a data-link-type="dfn" href="#audio-frame-obu" id="ref-for-audio-frame-obu①①">Audio Frame OBU</a>s with the same decode start time and the same duration from all coded <a data-link-type="dfn" href="#audio-substream" id="ref-for-audio-substream②⑤">Audio Substream</a>s and all non-redundant <a data-link-type="dfn" href="#parameter-block-obu" id="ref-for-parameter-block-obu①⓪">Parameter Block OBU</a>s with the decode start time within the duration.</p>
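As a minimal sketch of the definition above, the following Python groups OBUs into Temporal Units. The `AudioFrame` and `ParameterBlock` records and their fields are hypothetical stand-ins for parsed OBUs, not syntax defined by this specification. Treating "within the duration" as the half-open interval from the start time up to (but excluding) start plus duration is also an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFrame:            # hypothetical stand-in for an Audio Frame OBU
    substream_id: int
    start: int               # decode start time
    duration: int


@dataclass(frozen=True)
class ParameterBlock:        # hypothetical stand-in for a Parameter Block OBU
    parameter_id: int
    start: int               # decode start time
    redundant: bool = False


def group_temporal_units(frames, blocks):
    """Group OBUs by the (decode start time, duration) of their audio frames."""
    units = defaultdict(lambda: {"frames": [], "blocks": []})
    for f in frames:
        units[(f.start, f.duration)]["frames"].append(f)
    for b in blocks:
        if b.redundant:
            continue  # only non-redundant parameter blocks belong to a unit
        for (start, duration), unit in units.items():
            # Assumed half-open interval [start, start + duration).
            if start <= b.start < start + duration:
                unit["blocks"].append(b)
    # Return the units ordered by decode start time.
    return dict(sorted(units.items()))
```

Keying on the pair (start, duration) mirrors the wording of the definition: frames from different substreams land in the same unit only when both values match.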
<p>The figure below shows an example of the Timing Model in terms of the decode start times and durations of the coded <a data-link-type="dfn" href="#audio-substream" id="ref-for-audio-substream②⑥">Audio Substream</a> and <a data-link-type="dfn" href="#parameter-substream" id="ref-for-parameter-substream①⑥">Parameter Substream</a>.</p>
<figcaption>An example of the IAMF Timing Model. AFO: <a data-link-type="dfn" href="#audio-frame-obu" id="ref-for-audio-frame-obu①②">Audio Frame OBU</a>, PBO: <a data-link-type="dfn" href="#parameter-block-obu" id="ref-for-parameter-block-obu①①">Parameter Block OBU</a>, \(\text{PT}x\): time \(x\) (ms) on the presentation layer’s timeline, \(\text{DT}y\): time \(y\) (ms) on the decoding layer’s timeline.</figcaption>
<p><a data-link-type="dfn" href="#parameter-block-obu" id="ref-for-parameter-block-obu①⑥">Parameter Block OBU</a>s MAY be associated with Audio Frames.</p>
</ul>
-<center><img height="356" src="images/Immersive Audio Sequence with scalable channel audio (before OBU packing).png" style="width:100%; height:auto;" width="1622"></center>
+<center><img height="356" src="v1.1.0_images/Immersive Audio Sequence with scalable channel audio (before OBU packing).png" style="width:100%; height:auto;" width="1622"></center>
<center>
<figcaption>Immersive Audio Sequence with scalable channel audio (before OBU packing). See <a href="#standalone">§ 5 Standalone IAMF Representation</a> for related details on OBU ordering within an IA Sequence.</figcaption>
<p>\(CL \text{#}i\) is one of the <a data-link-type="dfn" href="#loudspeaker_layout" id="ref-for-loudspeaker_layout①⑤">loudspeaker_layout</a>s supported in this version of the specification.</p>
</ul>
<p>Scalable channel audio with <a data-link-type="dfn" href="#num_layers" id="ref-for-num_layers⑦">num_layers</a> \(> 1\) SHALL only allow down-mix paths that conform to the rules above, as depicted in the figure below.</p>
<figcaption>IA Down-mix Path for scalable channel audio</figcaption>
</center>
@@ -3550,7 +3550,7 @@ <h4 class="heading settled" data-level="3.8.2" id="syntax-demixing-info"><span c
<p>7: Reserved for future use</p>
</ul>
<p>\(\alpha\) and \(\beta\) are gain values used for the <a data-link-type="dfn" href="#s7to5-encoder" id="ref-for-s7to5-encoder">S7to5 encoder</a>, \(\gamma\) for the <a data-link-type="dfn" href="#t4to2-encoder" id="ref-for-t4to2-encoder">T4to2 encoder</a>, \(\delta\) for the <a data-link-type="dfn" href="#s5to3-encoder" id="ref-for-s5to3-encoder">S5to3 encoder</a> and <dfn class="dfn-paneled" data-dfn-type="dfn" data-noexport id="w_idx_offset">w_idx_offset</dfn> is the offset used to generate a gain value <a data-link-type="dfn" href="#w-k" id="ref-for-w-k③">\(w(k)\)</a> used for <a data-link-type="dfn" href="#t2totf2-encoder" id="ref-for-t2totf2-encoder">T2toTF2 encoder</a>.</p>
@@ -3831,7 +3831,7 @@ <h3 class="heading settled" data-level="5.1" id="standalone-ia-sequence"><span c
<p>An <dfn class="dfn-paneled" data-dfn-type="dfn" data-noexport id="ia-sequence">IA Sequence</dfn> is composed of a series of OBUs in the sequence of a set of <a data-link-type="dfn" href="#descriptors" id="ref-for-descriptors②⓪">Descriptors</a> followed by their associated <a data-link-type="dfn" href="#ia-data" id="ref-for-ia-data②">IA Data</a>.</p>
<p>The <a data-link-type="dfn" href="#descriptors" id="ref-for-descriptors②①">Descriptors</a> MAY additionally be repeated redundantly and as frequently as necessary. In this case, the <a data-link-type="dfn" href="#obu_redundant_copy" id="ref-for-obu_redundant_copy⑤">obu_redundant_copy</a> field in their <a data-link-type="dfn" href="#obu-header" id="ref-for-obu-header⑤">OBU Header</a>s SHALL be set to 1. Within an <a data-link-type="dfn" href="#ia-sequence" id="ref-for-ia-sequence⑤③">IA Sequence</a>, each OBU in the first <a data-link-type="dfn" href="#descriptors" id="ref-for-descriptors②②">Descriptors</a> SHALL be regarded as a non-redundant OBU regardless of the value of its <a data-link-type="dfn" href="#obu_redundant_copy" id="ref-for-obu_redundant_copy⑥">obu_redundant_copy</a>.</p>
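The redundancy rule above can be sketched as follows, assuming OBUs have already been parsed into simple records. The `Obu` type, its `is_descriptor` field, and the function name are hypothetical; only `obu_redundant_copy` comes from the specification. The sketch also assumes that the first Descriptors are the leading run of descriptor OBUs before any IA Data.

```python
from dataclasses import dataclass


@dataclass
class Obu:                       # hypothetical parsed-OBU record
    name: str
    is_descriptor: bool
    obu_redundant_copy: int      # flag read from the OBU Header


def effective_redundancy(sequence):
    """Return one bool per OBU: True if it should be treated as redundant."""
    result = []
    in_first_descriptors = True
    for obu in sequence:
        if not obu.is_descriptor:
            # Assumption: the first IA Data OBU ends the first Descriptors.
            in_first_descriptors = False
        if in_first_descriptors:
            # Non-redundant regardless of the header flag.
            result.append(False)
        else:
            result.append(obu.obu_redundant_copy == 1)
    return result
```

With this reading, a stream that is joined mid-way still yields a usable first set of Descriptors even when every copy carries the flag set to 1.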
<p>The figure below shows an example of an <a data-link-type="dfn" href="#ia-sequence" id="ref-for-ia-sequence⑤④">IA Sequence</a>.</p>
<figcaption>Recommendation for handling ISO-BMFF trimming information. PTS is the presentation start time. PTS1 is the presentation start time of the first audio sample before trimming. PTS2 is the presentation start time of the first audio sample after trimming.</figcaption>
<p class="note" role="note"><span>NOTE:</span> The IA decoder may choose to lazily parse OBUs to avoid unnecessarily parsing OBUs that are not used by the selected <a data-link-type="dfn" href="#mix-presentation" id="ref-for-mix-presentation③⑤">Mix Presentation</a>.</p>
<p>The figure below depicts an example of IA decoder architecture with modules that perform the steps above.</p>
<p>The reconstruction of an Ambisonics signal SHALL conform to <a data-link-type="biblio" href="#biblio-rfc-8486">[RFC-8486]</a>, with the exception that a codec other than Opus MAY be used.</p>
<p>The figure below shows the decoding and reconstruction flowchart.</p>
<p>This section describes the decoding and reconstruction of a Scalable Channel Audio representation.</p>
<p>The output of this process SHALL be the <a data-link-type="dfn" href="#3d-audio-signal" id="ref-for-3d-audio-signal①⑧">3D audio signal</a> (e.g., 3.1.2ch or 7.1.4ch) for the target channel layout.</p>
<p>The figure below shows the decoding and reconstruction flowchart.</p>
<p>A list of channel layouts to be supported for scalable channel audio, which conforms to <a data-link-type="dfn" href="#loudspeaker_layout" id="ref-for-loudspeaker_layout③③">loudspeaker_layout</a>.</p>
</ul>
<p>The figure below shows an example architecture for an IA encoder that generates an <a data-link-type="dfn" href="#ia-sequence" id="ref-for-ia-sequence⑧①">IA Sequence</a> with one <a data-link-type="dfn" href="#audio-element" id="ref-for-audio-element①①⑧">Audio Element</a>.</p>
<p>This section describes how down-mix parameters and loudness levels can be generated for a given channel audio and a given list of channel layouts for scalability (i.e., <a data-link-type="dfn" href="#num_layers" id="ref-for-num_layers②④">num_layers</a> > 1).</p>
<p>The figure below shows a block diagram for the Down-Mix Parameter Generator and Loudness Module, including the Down-Mixer.</p>
-<center><img height="651" src="images/Down-mix Parameter and Loudness.png" style="width:100%; height:auto;" width="1382"></center>
+<center><img height="651" src="v1.1.0_images/Down-mix Parameter and Loudness.png" style="width:100%; height:auto;" width="1382"></center>
<center>
<figcaption>IA Down-Mix Parameter and Loudness</figcaption>