home
The library decodes MP3 and AAC and plays 8-bit or 16-bit WAV files. The audio data can come from the Internet, SD card or SPIFFS. Many radio stations can be heard. Playlists are unpacked and a connection to the (first) URL is established; the supported formats are *.pls, *.m3u and *.asx. SSL connections are possible.
Examples:
connecttohost("http://online.rockarsenal.ru:8000/rockarsenal_aacplus");
connecttoFS(SD, "/wave_test/Wav_868kb.wav");
connecttoFS(SPIFFS, "wobble.mp3");
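As a rough illustration of what "a connection to the (first) URL is established" can mean for an *.m3u playlist, here is a minimal, PC-runnable sketch. It is not the library's parser; the function name `firstUrlInM3u` is made up for this example, and *.pls and *.asx files would need their own handling.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of the library): returns the first http(s)
// URL found in the text of an .m3u playlist, or "" if there is none.
std::string firstUrlInM3u(const std::string& playlist) {
    std::istringstream lines(playlist);
    std::string line;
    while (std::getline(lines, line)) {
        // Strip surrounding blanks and the CR of Windows line endings.
        const auto first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;   // blank line
        const auto last = line.find_last_not_of(" \t\r");
        line = line.substr(first, last - first + 1);
        if (line[0] == '#') continue;               // #EXTM3U, #EXTINF, comments
        if (line.rfind("http://", 0) == 0 || line.rfind("https://", 0) == 0) {
            return line;                            // first stream URL wins
        }
    }
    return "";
}
```

The returned string is what would then be handed to connecttohost().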
Stations with up to 320 kbit/s can be received. A good connection is a prerequisite for this. Many, but not all, stations that run smoothly in the VLC player work on the ESP32 without dropouts. Shortly before the input buffer runs empty, this message appears in the serial monitor:
slow stream, dropouts are possible
If the connection is lost, the library tries to re-establish the connection.
Tip: the AAC decoder supports SBR (Spectral Band Replication). To enable it, activate 'AAC_ENABLE_SBR' in 'aac_decoder.h'. However, this requires about another 60 KB of RAM. In SBR mode, PSRAM cannot be used because of its longer access time.
Basically all 16-bit DACs that have the pins DIN, BCLK and LRC can be used. The PCM5102A delivers good results. Most GPIOs can be used.
There are devices with I2S inputs that only accept 48 kHz. In this case, #define SR_48K can be activated in audio.h.
Here are some examples for DACs that are controlled via I2C: https://github.com/schreibfaul1/ESP32-audioI2S/tree/master/examples
setPinout(uint8_t BCLK, uint8_t LRC, uint8_t DOUT);
For DACs such as the PT8211, you can switch from the I2S standard to the Japanese (Least Significant Bit Justified) format. For this there is the command setI2SCommFMT_LSB(true) which has to be executed before activating the I2S interface (i.e. before connectTo ...)
```c++
setI2SCommFMT_LSB(true);
```
The library can be downloaded as a zip file. Unzip the file and copy the src folder next to your *.ino file, then include it with #include "src/audio.h".
Tip: Use the partition scheme 'Huge App' so that there is enough memory for your own extensions.
PSRAM is required to bridge short interruptions. Approximately 1 MB is required.
Internally, the volume is divided into 64 steps. With setVolume() the volume is set in 22 steps (0...21) by default. Internally, the 22 steps are mapped to the 64 volume steps via a table, which creates a logarithmic curve. This is the ideal solution for buttons or touchpads. Some external devices (e.g. AC101, ES8388 ...) require a larger range of values. The default maximum (21) can be overwritten with setVolumeSteps(uint8_t steps). In this way, value ranges can be redefined, e.g. (0...63) or (0...99).
The balance attenuates the left or right channel (values between -16 ...16).
setBalance(-16); // mutes the left channel
setVolume(21); // max loudness
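To make the balance range more concrete, the following PC-runnable sketch turns a balance value into a gain per channel. It assumes a linear attenuation between 0 and ±16, which is only a guess for illustration; the name `balanceToGain` and the struct are hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types, for illustration only.
struct ChannelGain {
    float left;   // 0.0 = muted, 1.0 = unchanged
    float right;
};

// Maps a balance value (-16 ... 16) to per-channel gains.
// Negative values attenuate the left channel, positive values the right one.
// ASSUMPTION: the attenuation is linear; the library may use another curve.
ChannelGain balanceToGain(int8_t balance) {
    assert(balance >= -16 && balance <= 16);
    ChannelGain g{1.0f, 1.0f};
    if (balance < 0) g.left  = (16 + balance) / 16.0f;
    if (balance > 0) g.right = (16 - balance) / 16.0f;
    return g;
}
```

With this mapping, a balance of -16 gives a left gain of 0 (muted) while the right channel stays untouched, which matches the comment on setBalance(-16) above.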
The volume control stages are not linear, but follow a logarithmic control characteristic to cover a large dynamic range with linear adjustment.
To achieve this, two different curves are implemented. Curve 0 follows a quadratic curve, curve 1 a logarithmic curve. Which curve is chosen depends on personal preference and the hardware used.
Call: setVolume(uint8_t vol, uint8_t curve);
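The difference between a quadratic and a logarithmic characteristic can be tried out on a PC with the sketch below. The formulas, the 60 dB range and all function names are assumptions chosen for illustration; they are not the tables the library actually uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Stand-in for a quadratic curve: gain rises with the square of the position.
double quadraticGain(uint8_t vol, uint8_t maxVol) {
    assert(maxVol > 0 && vol <= maxVol);
    const double x = static_cast<double>(vol) / maxVol;
    return x * x;
}

// Stand-in for a logarithmic curve: every step changes the level by the same
// number of dB. rangeDb is the assumed total span of the curve in dB.
double logarithmicGain(uint8_t vol, uint8_t maxVol, double rangeDb = 60.0) {
    assert(maxVol > 0 && vol <= maxVol);
    if (vol == 0) return 0.0;  // step 0 is treated as mute
    const double x = static_cast<double>(vol) / maxVol;
    return std::pow(10.0, -rangeDb * (1.0 - x) / 20.0);
}

// Maps a gain in 0.0 ... 1.0 onto 64 internal steps (0 ... 63).
uint8_t toInternalStep(double gain) {
    return static_cast<uint8_t>(std::lround(gain * 63.0));
}
```

Both curves start at silence and end at full level; in between, the logarithmic stand-in stays quieter for longer than the quadratic one, which is why the choice is partly a matter of taste and of the connected hardware.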
An equalizer is possible: there are built-in IIR filters that simulate a 3-band equalizer.
setTone(int8_t gainLowPass, int8_t gainBandPass, int8_t gainHighPass); // values can be between -40 ... +6 (dB)
setTone(0, 0, 0) is the default setting. If you want to go deeper into the rabbit hole, take a look at the routine IIR_calculateCoefficients(int8_t G0, int8_t G1, int8_t G2). The limit frequencies are specified there. The filter formulas I used can be found here: https://www.earlevel.com/main/2012/11/26/biquad-c-source-code/ The filter effect can be evaluated graphically here:
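For readers who want a feeling for such a filter before opening the source, here is a small PC-runnable sketch of one biquad section: a low shelf, written after the low-shelf formulas of the earlevel.com page linked above (please check it against that page before relying on it). Which filter types and corner frequencies the library really uses is not stated here, so the 500 Hz corner in the usage note and all names are assumptions.

```cpp
#include <cassert>
#include <cmath>

// Coefficient naming as on earlevel.com: a* = feed-forward, b* = feedback.
struct Biquad {
    double a0, a1, a2, b1, b2;
};

// Low-shelf biquad: boosts or cuts everything below fc by gainDb.
Biquad lowShelf(double fc, double sampleRate, double gainDb) {
    assert(fc > 0.0 && fc < sampleRate / 2.0);
    const double kPi = 3.14159265358979323846;
    const double V   = std::pow(10.0, std::fabs(gainDb) / 20.0);
    const double K   = std::tan(kPi * fc / sampleRate);
    const double s2  = std::sqrt(2.0);
    const double s2V = std::sqrt(2.0 * V);
    Biquad c{};
    if (gainDb >= 0.0) {  // boost
        const double norm = 1.0 / (1.0 + s2 * K + K * K);
        c.a0 = (1.0 + s2V * K + V * K * K) * norm;
        c.a1 = 2.0 * (V * K * K - 1.0) * norm;
        c.a2 = (1.0 - s2V * K + V * K * K) * norm;
        c.b1 = 2.0 * (K * K - 1.0) * norm;
        c.b2 = (1.0 - s2 * K + K * K) * norm;
    } else {              // cut
        const double norm = 1.0 / (1.0 + s2V * K + V * K * K);
        c.a0 = (1.0 + s2 * K + K * K) * norm;
        c.a1 = 2.0 * (K * K - 1.0) * norm;
        c.a2 = (1.0 - s2 * K + K * K) * norm;
        c.b1 = 2.0 * (V * K * K - 1.0) * norm;
        c.b2 = (1.0 - s2V * K + V * K * K) * norm;
    }
    return c;
}

// Gain of the filter at 0 Hz, handy for a quick sanity check.
double dcGain(const Biquad& c) {
    return (c.a0 + c.a1 + c.a2) / (1.0 + c.b1 + c.b2);
}
```

For example, `dcGain(lowShelf(500.0, 48000.0, 6.0))` comes out at about 2.0 (+6 dB), and a gain of 0 dB yields a filter that leaves the signal unchanged, in line with setTone(0, 0, 0) being the neutral setting.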
audio_info: authentification failed, wrong credentials?
The user name and password can be passed when connecting to the host:
connecttohost("http://xxxx", "name", "password");
To log the events, the callback function must be activated. The "functional callback" mechanism built into C++ is used for this purpose: audio_info_callback is assigned a user-defined function, for example: Audio::audio_info_callback = my_audio_info;
void my_audio_info(Audio::msg_t m) {
...
...
}
m.e contains the event type as event_t (uint8_t), m.msg contains the message as const char*, and m.arg1 contains the value as int32_t, e.g. if msg is "SampleRate (Hz): 48000" then arg1 is 48000.
- evt_info General information
- evt_name The name of a RadioStation from the HTTP response header
- evt_streamtitle This is the content of the metadata
- evt_eof Called at the end of a file
- evt_lasthost Is the URL of the last successfully connected host
- evt_icyurl Is the URL that was received in the HTTP response header
- evt_icylogo Is the URL of a logo transmitted by a radio station
- evt_id3data Data from audio file header
- evt_image An embedded image was found
- evt_icydescription Sometimes transmitted by a radio station
- evt_bitrate Read from response header
- evt_lyrics Lyrics text, is called up at the specified time
- evt_log Logs within this library, log_v ... log_e
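The callback mechanism can be tried out on a PC with the sketch below. `MockAudio` is only a stand-in written for this example; it copies the member names mentioned above (e, msg, arg1, audio_info_callback and a few of the event names) but it is not the library's header, and the real types may differ in detail.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Stand-in for the library's Audio class (NOT the real header). Only the
// members named in the text are mirrored so that the example builds on a PC.
struct MockAudio {
    enum event_t : uint8_t { evt_info, evt_name, evt_streamtitle, evt_eof };
    struct msg_t {
        event_t     e;     // event type
        const char* msg;   // message text
        int32_t     arg1;  // numeric value, if any
    };
    static std::function<void(msg_t)> audio_info_callback;
};
std::function<void(MockAudio::msg_t)> MockAudio::audio_info_callback;

// State updated by the handler; a real sketch might update a display instead.
std::string g_streamTitle;
int32_t     g_lastValue = 0;
bool        g_fileEnded = false;

// User-defined handler: picks out the events it is interested in.
void my_audio_info(MockAudio::msg_t m) {
    switch (m.e) {
        case MockAudio::evt_streamtitle: g_streamTitle = m.msg; break;
        case MockAudio::evt_info:        g_lastValue = m.arg1;  break;
        case MockAudio::evt_eof:         g_fileEnded = true;    break;
        default: break;  // ignore everything else
    }
}
```

After `MockAudio::audio_info_callback = my_audio_info;` every event passed to the callback ends up in the switch statement, so a handler can react to a new stream title or to the end of a file.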
evt_image in Ogg: Ogg files can contain embedded images in the comment header. If an image is larger than an OggS frame, it is fragmented and continued in further OggS frames. The image data is always Base64 encoded.
void my_audio_info(Audio::msg_t m) {
    switch(m.e){
        case Audio::evt_image:
            for(int i = 0; i < m.vec.size(); i += 2){
                Serial.printf("CoverImage: " ANSI_ESC_GREEN "segment %02i, pos %08lu, len %08lu\n", i / 2, m.vec[i], m.vec[i + 1]);
            }
            break;
    }
}
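Because the image data is Base64 encoded, it has to be decoded before it can be shown. The following is a generic, PC-runnable Base64 decoder as a minimal sketch; it is not taken from the library, and the name `base64Decode` is chosen for this example. How the segments reported in m.vec are read from the file and joined is left out here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Returns the 6-bit value of a Base64 character, or -1 for anything else.
static int b64Value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decodes standard Base64 text into raw bytes.
std::vector<uint8_t> base64Decode(const std::string& in) {
    std::vector<uint8_t> out;
    uint32_t acc  = 0;  // bit accumulator
    int      bits = 0;  // number of not yet emitted bits in acc
    for (char c : in) {
        if (c == '=') break;          // padding marks the end of the data
        const int v = b64Value(c);
        if (v < 0) continue;          // skip line breaks and other characters
        acc = (acc << 6) | static_cast<uint32_t>(v);
        bits += 6;
        if (bits >= 8) {              // a full byte is available
            bits -= 8;
            out.push_back(static_cast<uint8_t>((acc >> bits) & 0xFF));
        }
    }
    return out;
}
```

If an image is spread over several segments, it is probably safest to concatenate the Base64 text of all segments first and decode it in one go, since a segment boundary does not have to fall on a 4-character group.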
audio_process_i2s is used to tap the audio signal and pass it on to external devices. The output is always 48 kHz, stereo, 16 bits per sample, even if the source is, for example, 8 kHz, 8 bits per sample, mono. This means that a Bluetooth device can be connected without rescaling. If continueI2S is true, the signal is written to the I2S DMA. You can also manipulate the signal: the example shows how a sine tone is mixed into an audio stream.
void audio_process_i2s(int16_t* outBuff, int32_t validSamples, bool *continueI2S){
    // one period of the tone; the length is derived from the table itself
    static const int16_t sineWaveTable[] = {
        0, 3743, 7377, 10793, 14082, 17136, 19848, 22113, 23825, 24908,
        25311, 24908, 23825, 22113, 19848, 17136, 14082, 10793, 7377, 3743,
        0, -3743, -7377, -10793, -14082, -17136, -19848, -22113, -23825, -24908,
        -25311, -24908, -23825, -22113, -19848, -17136, -14082, -10793, -7377, -3743
    };
    const uint8_t tabLen = sizeof(sineWaveTable) / sizeof(sineWaveTable[0]);
    static uint8_t tabPtr = 0; // position in the table, kept between calls
    for(int i = 0; i < validSamples; i++){ // assume 2 channels, 16bit
        int16_t tone = sineWaveTable[tabPtr] / 50; // attenuated tone
        outBuff[i * 2]     += tone; // channel left
        outBuff[i * 2 + 1] += tone; // channel right
        tabPtr++;
        if(tabPtr == tabLen) tabPtr = 0;
    }
    *continueI2S = true; // pass the modified signal on to the I2S DMA
}
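The example above adds the tone to each sample without checking for overflow, so a very loud passage could wrap around. A possible refinement, shown here as a PC-runnable sketch with made-up helper names (`saturatingAdd16`, `mixIntoStereo`), is to clamp the sum to the 16-bit range:

```cpp
#include <cassert>
#include <cstdint>

// Adds two 16-bit samples and clamps the result instead of wrapping around.
inline int16_t saturatingAdd16(int16_t a, int16_t b) {
    const int32_t sum = static_cast<int32_t>(a) + static_cast<int32_t>(b);
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(sum);
}

// Mixes the same value into both channels of an interleaved stereo buffer
// (left, right, left, right, ...). frames = number of stereo sample pairs.
void mixIntoStereo(int16_t* buf, int32_t frames, int16_t value) {
    assert(buf != nullptr || frames == 0);
    for (int32_t i = 0; i < frames; i++) {
        buf[i * 2]     = saturatingAdd16(buf[i * 2], value);      // left
        buf[i * 2 + 1] = saturatingAdd16(buf[i * 2 + 1], value);  // right
    }
}
```

Clamping distorts the signal too, but far less audibly than a wrap-around from the positive to the negative end of the range.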
There are other useful functions for building MP3 players, for example:
setConnectionTimeout() In some cases it can make sense to change the timeout for establishing a connection. By default, it is 250 ms for unencrypted connections and 2700 ms for SSL connections.
uint16_t timeout_ms = 300;
uint16_t timeout_ms_ssl = 3000;
audio.setConnectionTimeout(timeout_ms, timeout_ms_ssl);
bool setTimeOffset(int sec) jumps by sec seconds relative to the current position within the file
bool setAudioPlayTime(uint16_t sec) jumps to the absolute time sec (in seconds)
uint32_t stopSong(); stops the song and returns the current audio time (in seconds)
getAudioFileDuration() returns the expected length of an audio file in seconds. With a constant bit rate (CBR) the value is exact; with a variable bit rate (VBR) the duration is estimated from the first 100 MP3 frames and can therefore deviate slightly from the actual playback time.
uint32_t getAudioFileDuration()
getAudioCurrentTime() returns the current playing time in seconds
uint32_t getAudioCurrentTime()
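Why the duration is exact for CBR but only an estimate for VBR becomes clear from the arithmetic. The sketch below runs on a PC and uses hypothetical helper names; it only illustrates the idea of dividing the size of the audio data by an average bit rate and is not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Average bit rate (bit/s) over the frames inspected so far, e.g. the first
// 100 frames of a VBR file. For CBR every frame has the same value.
uint32_t averageBitrate(const uint32_t* frameBitrates, size_t count) {
    assert(frameBitrates != nullptr && count > 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < count; ++i) sum += frameBitrates[i];
    return static_cast<uint32_t>(sum / count);
}

// Estimated playing time in seconds: size of the audio data in bits divided
// by the (average) bit rate. 64-bit arithmetic avoids overflow on large files.
uint32_t estimateDurationSec(uint32_t audioBytes, uint32_t bitsPerSecond) {
    assert(bitsPerSecond > 0);
    return static_cast<uint32_t>(static_cast<uint64_t>(audioBytes) * 8 / bitsPerSecond);
}
```

A 4,000,000 byte file at a constant 128 kbit/s lasts exactly 250 seconds. With VBR the result depends on how well the inspected frames represent the rest of the file.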
An example program could look like this:
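#include <Ticker.h>
Ticker ticker;
void tcr1s(); // called once per second by the ticker
void setup() {
    ...
    ticker.attach(1, tcr1s);
    ...
}
void tcr1s(){
    uint32_t act = audio.getAudioCurrentTime();
    uint32_t afd = audio.getAudioFileDuration();
    uint32_t pos = audio.getFilePos();
    log_i("pos =%lu", (unsigned long)pos);
    log_i("audioTime: %lu:%02lu - duration: %lu:%02lu", (unsigned long)(act / 60), (unsigned long)(act % 60), (unsigned long)(afd / 60), (unsigned long)(afd % 60));
}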
The output in the serial monitor
This works with local files (SD, FFat, SD_MMC, SPIFFS) and with web files in wav or mp3 format. The current time for AAC-coded files (m4a) cannot be precisely determined and is therefore estimated using the mean value of the bit rate.
In some projects there is only one audio amplifier or speaker. Then it makes sense to convert the stereo signal into a mono signal. With **forceMono(true);** the mean value is calculated from the signal of both channels and placed on the left and right channel.
```c++
audio.forceMono(true);  // change stereo to mono
audio.forceMono(false); // default, stereo will be played
```
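What "the mean value is calculated from the signal of both channels" amounts to can be shown with a few lines that run on a PC. This is a sketch of the idea with a hypothetical function name, not the library's code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Replaces left and right of every stereo frame with their mean value.
// buf is interleaved (left, right, left, right, ...), frames = sample pairs.
void downmixToMono(int16_t* buf, size_t frames) {
    assert(buf != nullptr || frames == 0);
    for (size_t i = 0; i < frames; ++i) {
        // Sum in 32 bit so that two loud samples cannot overflow.
        const int32_t mean = (static_cast<int32_t>(buf[2 * i]) + buf[2 * i + 1]) / 2;
        buf[2 * i]     = static_cast<int16_t>(mean);  // left
        buf[2 * i + 1] = static_cast<int16_t>(mean);  // right
    }
}
```

Since the mean of two 16-bit values always fits into 16 bits again, no clamping is needed here.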
setAudioTaskCore(uint8_t coreID) The audio task takes the data from the buffer, decodes it and feeds the I2S. audio.loop(), on the other hand, fills the buffer, handles the entire control, processes all data that is not audio-relevant, such as the metadata, and generates the events. For good performance, the audio task should not run on the same core as the Arduino loop task. By default, the audio task runs on core 0, but this can be changed here.
Here is a simple example program. You need an ESP32 developer board and an external DAC (e.g. PCM5102A).
#include "Arduino.h"
#include "WiFi.h"
#include "Audio.h"
#define I2S_DOUT 26 // connect to DAC pin DIN
#define I2S_BCLK 27 // connect to DAC pin BCK
#define I2S_LRC 25 // connect to DAC pin LCK
Audio audio;
const char* ssid = "SSID";
const char* password = "password";
void setup() {
Serial.begin(115200);
WiFi.begin(ssid, password);
while (WiFi.status() != WL_CONNECTED) delay(1500);
audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
audio.connecttohost("http://s1.knixx.fm/dein_webradio_64.aac"); // 64 kbp/s aac+
}
void loop() {
audio.loop();
}
void audio_info(const char *info){
Serial.print("info "); Serial.println(info);
}
The output in the serial monitor:
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1216
ho 0 tail 12 room 4
load:0x40078000,len:10944
load:0x40080400,len:6388
entry 0x400806b4
info PSRAM not found, inputBufferSize = 6399 bytes
info buffers freed, free Heap: 228148 bytes
info Connect to new host: "http://s1.knixx.fm/dein_webradio_64.aac"
info Connect to "s1.knixx.fm" on port 80, extension "/dein_webradio_64.aac"
info Connected to server
info Server: nginx/1.14.2
info audio/aac seen.
info format is aac
info AACDecoder has been initialized, free Heap: 199916 bytes
info chunked data transfer
info Connection: close
info ice-audio-info: channels=2;samplerate=44100;bitrate=64
info icy-description: Wir spielen Musik von den 60ern bis Heute! Und immer um halb aktuelle Country-Music.
info icy-genre: variety,pop,oldies,country
info icy-name: knixx.fm - Dein Webradio. / 64 kbp/s aac+
info icy-pub: 1
info icy-url: https://knixx.fm
info Cache-Control: no-cache, no-store
info Access-Control-Allow-Origin: *
info Access-Control-Allow-Headers: Origin, Accept, X-Requested-With, Content-Type
info Access-Control-Allow-Methods: GET, OPTIONS, HEAD
info Expires: Mon, 26 Jul 1997 05:00:00 GMT
info X-Frame-Options: SAMEORIGIN
info X-Content-Type-Options: nosniff
info Switch to DATA, bitrate is 64000, metaint is 4096
info inputbuffer is being filled
info StreamTitle="Michael Bolton - Soul Provider -- 1989"
info stream ready
info buffer filled in 7 ms
info syncword found at pos 0
info AAC Channels=1
info AAC SampleRate=22050
info AAC BitsPerSample=16
info AAC Bitrate=64000
info StreamTitle="Symbol - The Most Beautiful Girl In The World -- 1994"
building it on a breadboard:
the schematic:
There are displays for the Raspberry Pi with a resolution of 480x320 pixels and an SPI bus. These are particularly suitable; see the radio folder.