Description
I have noticed a drawback of Opus: when audio is encoded with libopus at 128 kbps or below, the high-frequency stereo content is mixed down to dual mono (the same mono signal in both channels), and no real stereo (surround stereo) is preserved in those bands. The high frequencies of such a file still appear to be stereo, but that stereo is produced only by adjusting the relative volume of the left and right channels of a mono signal. As a result, the stereo that libopus produces in these bands is not real stereo.
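To make the claim concrete, here is a minimal Python sketch of what "mixing a band into mono and rebalancing it by volume" could look like. It is only an illustration of the effect described above (it resembles what audio codecs commonly call intensity stereo) and is not taken from the libopus source; the function name `collapse_band` and the energy-matching gains are assumptions.

```python
import math

def collapse_band(left, right):
    """Hypothetical illustration: replace one band's two channels with a
    single shared waveform that is scaled (and possibly inverted) per channel."""
    # The sign of the inter-channel correlation decides whether the right
    # channel is treated as phase-inverted.
    dot = sum(l * r for l, r in zip(left, right))
    sign = -1.0 if dot < 0 else 1.0

    # Shared mono waveform for the band.
    mono = [(l + sign * r) / 2 for l, r in zip(left, right)]

    e_left = sum(l * l for l in left)
    e_right = sum(r * r for r in right)
    e_mono = sum(m * m for m in mono)
    if e_mono == 0:
        return [0.0] * len(left), [0.0] * len(right)

    # Per-channel gains restore each channel's original energy, so the band
    # keeps its left/right balance but loses any independent content.
    g_left = math.sqrt(e_left / e_mono)
    g_right = sign * math.sqrt(e_right / e_mono)
    return [g_left * m for m in mono], [g_right * m for m in mono]
```

After this step both outputs are scaled copies of one waveform, which is why an XY plot of the band collapses to a straight line.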
When the audio is encoded with libopus at 128 kbps, the stereo in the 15.6–20 kHz band is mixed into mono and then rebalanced between the left and right channels by volume and phase inversion; this means the stereo above 15.6 kHz is pseudo-stereo.
When the audio is encoded at 96–104 kbps, the stereo in each of the 12–15.6 kHz and 15.6–20 kHz bands is mixed into mono and then rebalanced in the same way; this means the stereo above 12 kHz is pseudo-stereo.
When the audio is encoded at 80–88 kbps, the stereo in each of the 9.6–12 kHz, 12–15.6 kHz, and 15.6–20 kHz bands is mixed into mono and then rebalanced; this means the stereo above 9.6 kHz is pseudo-stereo.
When the audio is encoded at 64 kbps, the stereo in each of the 5.6–6.8 kHz, 6.8–8.0 kHz, 8.0–9.6 kHz, 9.6–12 kHz, 12–15.6 kHz, and 15.6–20 kHz bands is mixed into mono and then rebalanced; this means the stereo above 5.6 kHz is pseudo-stereo.
When the audio is encoded at 48 kbps, the stereo in each of the 3.2–4.0 kHz, 4.0–4.8 kHz, 4.8–5.6 kHz, 5.6–6.8 kHz, 6.8–8.0 kHz, 8.0–9.6 kHz, 9.6–12 kHz, 12–15.6 kHz, and 15.6–20 kHz bands is mixed into mono and then rebalanced; this means the stereo above 3.2 kHz is pseudo-stereo.
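The observations above can be summarized as a small lookup, sketched here in Python. The table holds only the bitrates reported in this issue; the names `OBSERVED_CUTOFFS` and `pseudo_stereo_cutoff_khz` are made up for illustration, and bitrates that were not measured return `None` rather than a guess.

```python
# (lowest kbps, highest kbps, frequency in kHz above which stereo was
# observed to be pseudo-stereo). Values are the observations in this issue.
OBSERVED_CUTOFFS = [
    (128, 128, 15.6),
    (96, 104, 12.0),
    (80, 88, 9.6),
    (64, 64, 5.6),
    (48, 48, 3.2),
]

def pseudo_stereo_cutoff_khz(bitrate_kbps):
    """Return the observed cutoff for a bitrate, or None if it was not tested."""
    for low, high, cutoff in OBSERVED_CUTOFFS:
        if low <= bitrate_kbps <= high:
            return cutoff
    return None
```

For example, `pseudo_stereo_cutoff_khz(96)` returns `12.0`, while `pseudo_stereo_cutoff_khz(112)` returns `None` because that bitrate is not covered by the tests described here.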
The following screen recordings show stereo XY plot analyses of several frequency bands of a piece of music encoded with libopus at different bitrates, as well as spectrogram analyses of a test audio file encoded with libopus at the same bitrates. In the test audio, each channel carries a single-frequency tone whose pitch gradually descends; at any given moment the tones in the two channels are about 50–500 Hz apart.
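A test signal of the kind described above could be generated roughly as follows. This is a sketch using only the Python standard library, not the script used to make the attached files; the sweep range, duration, 200 Hz offset, linear sweep shape, and the choice of which channel is lower are all assumptions.

```python
import math
import struct
import wave

def make_test_tone(duration_s=5.0, rate=48000,
                   f_start=20000.0, f_end=1000.0, offset_hz=200.0):
    """Return (left, right) sample lists: two descending tones, with the
    right channel offset_hz below the left at every moment (assumed layout)."""
    n = int(duration_s * rate)
    left, right = [], []
    phase_l = phase_r = 0.0
    for i in range(n):
        # Instantaneous frequency falls linearly from f_start to f_end.
        f = f_start + (f_end - f_start) * i / n
        phase_l += 2 * math.pi * f / rate
        phase_r += 2 * math.pi * (f - offset_hz) / rate
        left.append(0.5 * math.sin(phase_l))
        right.append(0.5 * math.sin(phase_r))
    return left, right

def write_wav(path, left, right, rate=48000):
    """Write the two channels as a 16-bit stereo WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"".join(
            struct.pack("<hh", int(l * 32767), int(r * 32767))
            for l, r in zip(left, right)))
```

Because the two channels never share a frequency, a spectrogram of the original shows two separate descending lines; in a band that has been collapsed to mono, both channels would be expected to show the same content at different levels.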
Also included is a screen recording explaining what the various shapes in the stereo XY plots represent: a straight 45-degree line running from lower left to upper right indicates dual mono with equal left and right channel volumes; a straight 45-degree line running from upper left to lower right indicates dual mono with equal volumes and the two channels in opposite phase; a straight line of the first kind at less than 45 degrees from horizontal indicates dual mono with the left channel quieter; a straight line of the second kind at less than 45 degrees from vertical indicates dual mono with the right channel quieter and the channels in opposite phase; scattered points or curves indicate real (surround) stereo.
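The same distinction can be checked numerically instead of by eye. The Python sketch below classifies one band of samples by its normalized inter-channel correlation: a straight line on the XY plot corresponds to a correlation near +1 or -1, and scattered points correspond to anything in between. The function name `classify_xy` and the 0.98 threshold are assumptions chosen for illustration.

```python
import math

def classify_xy(left, right, threshold=0.98):
    """Label a band as dual mono (in or out of phase) or real stereo."""
    e_left = sum(l * l for l in left)
    e_right = sum(r * r for r in right)
    if e_left == 0 or e_right == 0:
        return "one channel silent"

    # Normalized correlation: +1 or -1 means the two channels are scaled
    # copies of one waveform, so the XY plot is a straight line.
    dot = sum(l * r for l, r in zip(left, right))
    corr = dot / math.sqrt(e_left * e_right)

    if corr >= threshold:
        return "dual mono, in phase"
    if corr <= -threshold:
        return "dual mono, out of phase"
    return "real stereo"
```

The slope of the line (which channel is quieter) is not reported here; it could be derived from the ratio of `e_left` to `e_right` if needed.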
Spectrogram Analyses Of The Test Audio Encoded With Libopus At Different Bitrates
Video Explaining What The Various Images In The Stereo XY Plots Represent
Stereo XY Plot Analyses Of The Different Frequency Bands Of The Piece Of Music Encoded With Libopus At Different Bitrates
The Test Audio Encoded With Libopus At Different Bitrates
The Piece Of Music Encoded With Libopus At Different Bitrates