Wolle edited this page Aug 29, 2025 · 54 revisions

audioI2S FAQ

What is this?

The library decodes MP3 and AAC streams and plays 8-bit or 16-bit WAV files. The audio data can come from the Internet, an SD card or SPIFFS, and many radio stations can be received. Playlists are unpacked and a connection to the first URL is established; the supported playlist formats are *.pls, *.m3u and *.asx. SSL connections are possible.

Examples:

connecttohost("http://online.rockarsenal.ru:8000/rockarsenal_aacplus");
connecttoFS(SD, "/wave_test/Wav_868kb.wav");
connecttoFS(SPIFFS, "wobble.mp3");
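To illustrate what "unpacking a playlist and connecting to the first URL" means, here is a minimal sketch for the *.m3u case. It is not the library's parser; the function name firstUrlFromM3u is made up for this example, and only plain http/https entries are considered.

```cpp
#include <sstream>
#include <string>

// Returns the first stream URL found in an M3U playlist, or an empty string.
// Illustrative helper only (hypothetical name), not part of the library.
std::string firstUrlFromM3u(const std::string& playlist) {
    std::istringstream lines(playlist);
    std::string line;
    while (std::getline(lines, line)) {
        // Strip the carriage return left over from CRLF line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        // Skip blank lines and directives such as #EXTM3U or #EXTINF.
        if (line.empty() || line[0] == '#') continue;
        if (line.rfind("http://", 0) == 0 || line.rfind("https://", 0) == 0) return line;
    }
    return "";
}
```

The returned string could then be handed to connecttohost().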

Stations can be received at up to 320 kbit/s; a good connection is a prerequisite for this. Many, but not all, stations that run smoothly in the VLC player work on the ESP32 without dropouts. Shortly before the input buffer runs empty, the message "slow stream, dropouts are possible" appears in the serial monitor. If the connection is lost, the library tries to re-establish it. Tip: the AAC decoder supports SBR (Spectral Band Replication). To use it, enable 'AAC_ENABLE_SBR' in 'aac_decoder.h'. This requires roughly another 60 KB of RAM. In SBR mode, PSRAM cannot be used because of its longer access time.


Which external DACs can be used?

Basically all 16-bit DACs that have the pins DIN, BCLK and LRC. The PCM5102A delivers good results. Most GPIOs can be used. Some devices with I2S inputs only accept 48 kHz; in this case, #define SR_48K can be activated in audio.h. Examples for DACs that are controlled via I2C can be found here: https://github.com/schreibfaul1/ESP32-audioI2S/tree/master/examples

setPinout(uint8_t BCLK, uint8_t LRC, uint8_t DOUT);

For DACs such as the PT8211, you can switch from the I2S standard to the Japanese (least significant bit justified) format. The command setI2SCommFMT_LSB(true) does this; it has to be executed before the I2S interface is activated (i.e. before connectTo ...).

```c++
setI2SCommFMT_LSB(true);
```

Can the Arduino IDE be used?

Yes, the library can be downloaded as a zip file. Unzip the file and copy the src folder next to your *.ino file. (#include "src/audio.h")

Arduino IDE Library

Tip: Use the partition scheme 'Huge App' so that there is enough memory for your own extensions.

Arduino IDE Partition Scheme


What about PSRAM?

PSRAM is required to bridge short interruptions. Approximately 1 MB is required.
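To get a feeling for how long an interruption such a buffer can bridge, the arithmetic is simply buffer size in bits divided by the stream bitrate. The sketch below assumes, for illustration only, that the whole megabyte (taken as 1,000,000 bytes) is available as input buffer; the function name is made up.

```cpp
#include <cstdint>

// Seconds of audio held by a buffer of bufferBytes at a given stream bitrate (bit/s).
// Hypothetical helper for a back-of-the-envelope estimate.
double bufferedSeconds(uint32_t bufferBytes, uint32_t bitrateBitsPerSec) {
    if (bitrateBitsPerSec == 0) return 0.0;
    return (bufferBytes * 8.0) / bitrateBitsPerSec;
}
```

Under that assumption, a 128 kbit/s stream would be buffered for about 62 seconds and a 320 kbit/s stream for 25 seconds.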


How to adjust the balance and volume?

Internally, the volume is divided into 64 steps. By default, setVolume() accepts 22 steps (0...21), which are mapped to the 64 internal steps via a table. This creates a logarithmic curve, which is ideal for buttons or touchpads. Some external devices (e.g. AC101, ES8388 ...) require a larger range of values. The default maximum (21) can be overridden with setVolumeSteps(uint8_t steps). In this way, value ranges can be redefined, e.g. (0...63) or (0...99). The balance attenuates the left or right channel (values between -16 and 16).

setBalance(-16); // mutes the left channel
setVolume(21);  // max loudness
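How a balance value turns into per-channel attenuation can be pictured with the following sketch. The linear law used here is an assumption made for illustration; the library may use a different characteristic. ChannelGain and balanceToGain are hypothetical names.

```cpp
#include <cstdint>

struct ChannelGain {
    float left;
    float right;
};

// Assumed linear law: -16 silences the left channel, +16 the right one,
// 0 leaves both channels at full level. Not taken from the library source.
ChannelGain balanceToGain(int8_t balance) {
    if (balance < -16) balance = -16;
    if (balance > 16) balance = 16;
    ChannelGain g{1.0f, 1.0f};
    if (balance < 0) g.left = (16 + balance) / 16.0f;
    if (balance > 0) g.right = (16 - balance) / 16.0f;
    return g;
}
```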

The volume steps are not linear; they follow a nonlinear control characteristic so that a linear adjustment covers a large dynamic range. Two different curves are implemented for this: curve 0 is quadratic, curve 1 is logarithmic. Which curve to choose depends on personal preference and the hardware used.

Call: setVolume(uint8_t vol, uint8_t curve);

Volume Settings Dynamics
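As a rough picture of the quadratic case, the sketch below maps a user volume onto the 64 internal steps. The formula is an assumption chosen for illustration; the library uses its own table, and mapVolumeQuadratic is a made-up name.

```cpp
#include <cmath>
#include <cstdint>

// Maps a user volume (0 ... maxVol) onto an internal level (0 ... 63)
// along a quadratic curve. Assumed formula, not the library's table.
uint8_t mapVolumeQuadratic(uint8_t vol, uint8_t maxVol) {
    if (maxVol == 0) return 0;
    if (vol > maxVol) vol = maxVol;
    double x = static_cast<double>(vol) / maxVol;
    return static_cast<uint8_t>(std::lround(63.0 * x * x));
}
```

With the default maximum of 21, the middle of the range lands well below half of the internal scale, which is what makes small volume settings finely adjustable.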


How To Change Bass And Treble?

Bass and treble can be adjusted using the built-in IIR filters, which simulate a 3-band equalizer.

setTone(int8_t gainLowPass, int8_t gainBandPass, int8_t gainHighPass); // values can be between -40 ... +6 (dB)

setTone(0, 0, 0) is the default setting. If you want to go deeper into the rabbit hole, take a look at the routine IIR_calculateCoefficients(int8_t G0, int8_t G1, int8_t G2). The cutoff frequencies are specified there. The filter formulas I used can be found here: https://www.earlevel.com/main/2012/11/26/biquad-c-source-code/ The filter effect can be evaluated graphically here:

lowpass
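For readers who have not worked with biquads before, this is a minimal sketch of a single second-order low-pass section with the coefficient formulas from the linked article. It only shows the general technique; the equalizer in the library uses its own coefficients from IIR_calculateCoefficients, and the struct name here is made up.

```cpp
#include <cmath>

// Second-order low-pass section (transposed direct form II).
// Coefficient naming as in the linked article: a0..a2 feed-forward, b1..b2 feedback.
struct BiquadLowPass {
    double a0, a1, a2, b1, b2;
    double z1 = 0.0, z2 = 0.0; // filter state

    // fc is the cutoff as a fraction of the sample rate, e.g. 500.0 / 48000.0.
    BiquadLowPass(double fc, double q) {
        const double kPi = 3.14159265358979323846;
        double k = std::tan(kPi * fc);
        double norm = 1.0 / (1.0 + k / q + k * k);
        a0 = k * k * norm;
        a1 = 2.0 * a0;
        a2 = a0;
        b1 = 2.0 * (k * k - 1.0) * norm;
        b2 = (1.0 - k / q + k * k) * norm;
    }

    // Filters one sample and returns the output.
    double process(double in) {
        double out = in * a0 + z1;
        z1 = in * a1 + z2 - b1 * out;
        z2 = in * a2 - b2 * out;
        return out;
    }
};
```

A constant input passes through unchanged once the filter has settled, while a signal alternating at half the sample rate is suppressed.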


The server requires access data

audio_info: authentification failed, wrong credentials?

The user name and password can be passed when connecting to the host:

connecttohost("http://xxxx", "name", "password");

What audio events are there?

A callback function must be registered in order to log the events. The functional callback mechanism built into C++ is used for this purpose: audio_info_callback is assigned a user-defined function, for example: Audio::audio_info_callback = my_audio_info;

void my_audio_info(Audio::msg_t m) {
    ...
    ...
}

m.e contains the event type as event_t (uint8_t),
m.msg contains the message as const char*,
m.arg1 contains the value as int32_t, e.g. if msg is "SampleRate (Hz): 48000", then arg1 is 48000

  • evt_info General information
  • evt_name The name of a RadioStation from the HTTP response header
  • evt_streamtitle This is the content of the metadata
  • evt_eof Called at the end of a file
  • evt_lasthost Is the URL of the last successfully connected host
  • evt_icyurl Is the URL that was received in the HTTP response header
  • evt_icylogo Is the URL of a logo transmitted by a radio station
  • evt_id3data Data from audio file header
  • evt_image An embedded image was found
  • evt_icydescription Sometimes transmitted by a radio station
  • evt_bitrate Read from response header
  • evt_lyrics Lyrics text, is called up at the specified time
  • evt_log Logs within this library, log_v ... log_e
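The body of such a callback usually boils down to a switch over the event type. The sketch below shows that pattern with stand-in types so that it builds without the library; on the ESP32 you would use Audio::msg_t and the Audio::evt_... constants instead. The namespace sketch, the reduced enum and the function formatEvent are assumptions made for this example.

```cpp
#include <cstdint>
#include <string>

namespace sketch {

// Stand-ins for the library types; only the fields named above are modelled
// and the enumerator values are arbitrary.
enum event_t : uint8_t { evt_info, evt_name, evt_streamtitle, evt_eof, evt_bitrate };

struct msg_t {
    event_t     e;
    const char* msg;
    int32_t     arg1;
};

// Turns an event into one log line, the way a callback body might.
std::string formatEvent(const msg_t& m) {
    switch (m.e) {
        case evt_name:        return std::string("station: ") + m.msg;
        case evt_streamtitle: return std::string("title: ") + m.msg;
        case evt_bitrate:     return std::string("bitrate: ") + m.msg;
        case evt_eof:         return std::string("end of file: ") + m.msg;
        default:              return std::string("info: ") + m.msg;
    }
}

} // namespace sketch
```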

evt_image in OGG: OGG files can contain embedded images in the comment header. If an image is larger than an OggS frame, it is fragmented and continued in further OggS frames. The image data is always Base64 encoded.

OGG METADATABLOCKPICTURE

void my_audio_info(Audio::msg_t m) {
    switch(m.e){
        case Audio::evt_image:
            // m.vec holds pairs of values: position and length of each image segment
            for(int i = 0; i < m.vec.size(); i += 2){
                Serial.printf("CoverImage:  " ANSI_ESC_GREEN "segment %02i, pos %08lu, len %08lu\n", i / 2, m.vec[i], m.vec[i + 1]);
            }
            break;
    }
}
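Since the embedded image is Base64 encoded, it has to be decoded before it can be displayed. The following is a minimal standalone decoder, given as a sketch; it is not taken from the library, and base64Decode is a name chosen for this example. For a fragmented image, the segments would have to be joined in order before decoding.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Decodes standard Base64 text into raw bytes. Characters outside the
// alphabet (such as line breaks) are skipped; decoding stops at '='.
std::vector<uint8_t> base64Decode(const std::string& in) {
    std::vector<uint8_t> out;
    uint32_t acc = 0; // bit accumulator
    int bits = 0;     // number of not yet emitted bits in acc
    for (char c : in) {
        uint32_t v;
        if (c >= 'A' && c <= 'Z') v = c - 'A';
        else if (c >= 'a' && c <= 'z') v = c - 'a' + 26;
        else if (c >= '0' && c <= '9') v = c - '0' + 52;
        else if (c == '+') v = 62;
        else if (c == '/') v = 63;
        else if (c == '=') break;
        else continue;
        acc = (acc << 6) | v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<uint8_t>((acc >> bits) & 0xFF));
        }
    }
    return out;
}
```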

audio_process_i2s is used to tap the audio signal and pass it on to external devices. The output is always 48 kHz, stereo, 16 bits per sample, even if the source is, for example, 8 kHz, 8 bits per sample, mono. This means that a Bluetooth device can be connected without rescaling. If continueI2S is true, the signal is written to the I2S DMA. You can also manipulate the signal: the example shows how a sine tone is mixed into an audio stream.

void audio_process_i2s(int16_t* outBuff, int32_t validSamples, bool* continueI2S){

    // One period of a sine wave with 40 entries; at 48 kHz this gives a 1200 Hz tone
    static const int16_t sineWaveTable[] = {
         0,   3743,   7377,  10793,  14082,  17136,  19848,  22113,  23825,  24908,
      25311,  24908,  23825,  22113,  19848,  17136,  14082,  10793,   7377,   3743,
         0,  -3743,  -7377, -10793, -14082, -17136, -19848, -22113, -23825, -24908,
     -25311, -24908, -23825, -22113, -19848, -17136, -14082, -10793,  -7377,  -3743
    };
    static const uint8_t tableSize = sizeof(sineWaveTable) / sizeof(sineWaveTable[0]);

    static uint8_t tabPtr = 0; // position in the table, kept between calls
    for(int32_t i = 0; i < validSamples; i++){
        int16_t* left  = outBuff + i * 2;     // channel left (samples are interleaved)
        int16_t* right = outBuff + i * 2 + 1; // channel right
        int16_t  tone  = sineWaveTable[tabPtr] / 50; // attenuated sine tone

        *left  += tone; // note: no protection against clipping
        *right += tone;
        tabPtr++;
        if(tabPtr == tableSize) tabPtr = 0;
    }
    *continueI2S = true; // pass the modified samples on to the I2S DMA
}
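When signals are mixed by plain addition, loud passages can exceed the 16-bit range and wrap around audibly. One common remedy is to add in a wider type and clamp the result. The helper below is a sketch of that idea; mixSaturated is a hypothetical name and not a library function.

```cpp
#include <cstdint>

// Adds two 16-bit samples and clamps the sum to the int16_t range
// instead of letting it wrap around.
int16_t mixSaturated(int16_t a, int16_t b) {
    int32_t sum = static_cast<int32_t>(a) + b;
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return static_cast<int16_t>(sum);
}
```

Inside audio_process_i2s, each channel sample and the tone value could be combined with this helper.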

What else is there?

There are other useful functions for building MP3 players, for example

setConnectionTimeout() In some cases it can make sense to change the timeout for establishing a connection. The defaults are 250 ms for unencrypted connections and 2700 ms for SSL connections.

uint16_t timeout_ms = 300;
uint16_t timeout_ms_ssl = 3000;
audio.setConnectionTimeout(timeout_ms, timeout_ms_ssl);

bool setTimeOffset(int sec) jumps by sec seconds relative to the current position within the file
bool setAudioPlayTime(uint16_t sec) jumps to the absolute time sec
uint32_t stopSong() stops the song and returns the current audio time (in seconds)

getAudioFileDuration() Returns the expected length of an audio file in seconds. With a constant bit rate (CBR) the value is exact; with a variable bit rate (VBR) the duration is estimated from the first 100 MP3 frames and can therefore deviate slightly from the actual playback time.

uint32_t getAudioFileDuration()
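The idea behind such a VBR estimate can be sketched as follows: average the bitrate of the sampled frames and divide the size of the audio data by it. This is a simplified illustration under that assumption, not the library's code; the function name and parameters are made up.

```cpp
#include <cstdint>
#include <vector>

// Estimates the playback time in seconds from the size of the audio data and
// the bitrates (bit/s) of the frames sampled at the start of the file.
uint32_t estimateDurationSec(uint32_t audioBytes, const std::vector<uint32_t>& frameBitrates) {
    if (frameBitrates.empty()) return 0;
    uint64_t sum = 0;
    for (uint32_t b : frameBitrates) sum += b;
    uint64_t meanBitrate = sum / frameBitrates.size();
    if (meanBitrate == 0) return 0;
    return static_cast<uint32_t>((static_cast<uint64_t>(audioBytes) * 8) / meanBitrate);
}
```

If the sampled frames are not representative of the rest of the file, the estimate drifts accordingly, which is why VBR durations are only approximate.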

getAudioCurrentTime() returns the current playing time in seconds

uint32_t getAudioCurrentTime()

An example program could look like this:

Ticker ticker;
void setup()  {
	...
	ticker.attach(1, tcr1s);
	...
}

void tcr1s(){
    uint32_t act=audio.getAudioCurrentTime();
    uint32_t afd=audio.getAudioFileDuration();
    uint32_t pos =audio.getFilePos();
    log_i("pos =%i", pos);
    log_i("audioTime: %i:%02d - duration: %i:%02d", (act/60), (act%60) , (afd/60), (afd%60));
}

The output in the serial monitor

Audio Duration

This works with local files (SD, FFat, SD_MMC, SPIFFS) and with web files in wav or mp3 format. The current time for AAC-coded files (m4a) cannot be precisely determined and is therefore estimated using the mean value of the bit rate.
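If the time is to be shown on a display rather than logged, the minutes-and-seconds arithmetic from the log call above can be wrapped in a small helper. This is a plain C++ sketch; formatTime is a name chosen for the example.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a time in seconds as m:ss, e.g. for a display line.
std::string formatTime(uint32_t seconds) {
    char buf[16];
    std::snprintf(buf, sizeof(buf), "%u:%02u",
                  static_cast<unsigned>(seconds / 60),
                  static_cast<unsigned>(seconds % 60));
    return buf;
}
```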

In some projects there is only one audio amplifier or speaker. Then it makes sense to convert the stereo signal into a mono signal. With **forceMono(true);** the mean value is calculated from the signal of both channels and placed on the left and right channel.

```c++
audio.forceMono(true);  // change stereo to mono
audio.forceMono(false); // default, stereo will be played
```
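What the averaging amounts to can be seen in this standalone sketch, which works on an interleaved 16-bit stereo buffer. It illustrates the principle only; downmixToMono is a hypothetical helper and not the library's implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Replaces each interleaved stereo frame (left, right) with the mean of both
// channels, written to the left and the right channel.
void downmixToMono(int16_t* interleaved, std::size_t frames) {
    for (std::size_t i = 0; i < frames; ++i) {
        int32_t mean = (static_cast<int32_t>(interleaved[2 * i]) + interleaved[2 * i + 1]) / 2;
        interleaved[2 * i]     = static_cast<int16_t>(mean);
        interleaved[2 * i + 1] = static_cast<int16_t>(mean);
    }
}
```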

setAudioTaskCore(uint8_t coreID) The audio task takes the data from the buffer, decodes it and feeds the I2S. audio.loop(), on the other hand, fills the buffer, handles the overall control, processes all data that is not audio, such as the metadata, and generates the events. For good performance, the audio task should not run on the same core as the Arduino loop task. By default, the audio task runs on core 0; this can be changed here.


A simple project to receive a webstream

Here is a simple example program. You need an ESP32 developer board and an external DAC (e.g. PCM5102A).

#include "Arduino.h"
#include "WiFi.h"
#include "Audio.h"

#define I2S_DOUT      26  // connect to DAC pin DIN
#define I2S_BCLK      27  // connect to DAC pin BCK
#define I2S_LRC       25  // connect to DAC pin LCK

Audio audio;

const char* ssid =     "SSID";
const char* password = "password";

void setup() {
    Serial.begin(115200);
    WiFi.begin(ssid, password);
    while (WiFi.status() != WL_CONNECTED) delay(1500);
    audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
    audio.connecttohost("http://s1.knixx.fm/dein_webradio_64.aac"); // 64 kbit/s AAC+
}

void loop() {
    audio.loop();
}

void audio_info(const char *info){
    Serial.print("info        "); Serial.println(info);
}

The output in the serial monitor:

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1216
ho 0 tail 12 room 4
load:0x40078000,len:10944
load:0x40080400,len:6388
entry 0x400806b4
info        PSRAM not found, inputBufferSize = 6399 bytes
info        buffers freed, free Heap: 228148 bytes
info        Connect to new host: "http://s1.knixx.fm/dein_webradio_64.aac"
info        Connect to "s1.knixx.fm" on port 80, extension "/dein_webradio_64.aac"
info        Connected to server
info        Server: nginx/1.14.2
info        audio/aac seen.
info        format is aac
info        AACDecoder has been initialized, free Heap: 199916 bytes
info        chunked data transfer
info        Connection: close
info        ice-audio-info: channels=2;samplerate=44100;bitrate=64
info        icy-description: Wir spielen Musik von den 60ern bis Heute! Und immer um halb aktuelle Country-Music.
info        icy-genre: variety,pop,oldies,country
info        icy-name: knixx.fm - Dein Webradio. / 64 kbp/s aac+
info        icy-pub: 1
info        icy-url: https://knixx.fm
info        Cache-Control: no-cache, no-store
info        Access-Control-Allow-Origin: *
info        Access-Control-Allow-Headers: Origin, Accept, X-Requested-With, Content-Type
info        Access-Control-Allow-Methods: GET, OPTIONS, HEAD
info        Expires: Mon, 26 Jul 1997 05:00:00 GMT
info        X-Frame-Options: SAMEORIGIN
info        X-Content-Type-Options: nosniff
info        Switch to DATA, bitrate is 64000, metaint is 4096
info        inputbuffer is being filled
info        StreamTitle="Michael Bolton - Soul Provider -- 1989"
info        stream ready
info        buffer filled in 7 ms
info        syncword found at pos 0
info        AAC Channels=1
info        AAC SampleRate=22050
info        AAC BitsPerSample=16
info        AAC Bitrate=64000
info        StreamTitle="Symbol - The Most Beautiful Girl In The World -- 1994"

Building it on a breadboard:

Simple_Project

The schematic:

Simple_Project Schematic


Want to build a simple internet radio?

There are displays for the Raspberry Pi with a resolution of 480x320 pixels and an SPI bus. These are particularly suitable; see the radio folder.

Simple_WiFi_Radio