Example: I2S Audio

Julian Kemmerer edited this page Jun 28, 2021 · 60 revisions

WORK IN PROGRESS

This page describes using an Arty board and a PMOD I2S DAC+ADC adapter with standard 1/8 in (3.5mm) stereo audio jacks to do basic audio signal processing for distortion and digital delay effects.

This example is from a series of examples designed for the Arty Board. See that page for instructions on using the Arty board with PipelineC generated files.

Setup

Digilent provides reference files, including the .xdc file describing the PMOD ports for the I2S adapter. The PMOD port is mostly unchanging in its top level definition, pmod.c. From those top level ports, i2s_pmod.c is used to map PMOD ports to I2S signals.

Following Digilent's instructions, a basic I2S passthrough example was the first step confirmed working. The code for the I2S passthrough, including top level ports etc., can be seen here, and a snippet is below:

#define SCLK_PERIOD_MCLKS 8
#define LR_PERIOD_SCLKS 64
#pragma MAIN_MHZ app 22.579
void app()
{
  // Registers
  static uint3_t sclk_counter;
  static uint1_t sclk;
  static uint6_t lr_counter;
  static uint1_t lr;
  
  // Read the incoming I2S signals
  i2s_to_app_t from_i2s = read_i2s_pmod();
  
  // Outgoing I2S signals
  app_to_i2s_t to_i2s;
  
  // Basic loopback:
  // Only input is data, connected to output data
  to_i2s.tx_data = from_i2s.rx_data;
  // Outputs clks from registers
  to_i2s.tx_sclk = sclk;
  to_i2s.rx_sclk = sclk;
  to_i2s.tx_lrck = lr;
  to_i2s.rx_lrck = lr;
  
  // Drive I2S clocking derived from current MCLK domain
  
  // SCLK toggling at half period count
  uint1_t sclk_half_toggle = sclk_counter==((SCLK_PERIOD_MCLKS/2)-1);
  // 0->1 SCLK once per period rising edge
  uint1_t sclk_period_toggle = sclk_half_toggle & (sclk==0); 
  if(sclk_half_toggle)
  {
    // Do toggle and reset counter
    sclk = !sclk;
    sclk_counter = 0;
  }
  else
  {
    // No toggle yet, keep counting
    sclk_counter += 1;
  }
  
  // LR toggling happens per SCLK period 
  if(sclk_period_toggle)
  {
    // LR toggling at half period count
    if(lr_counter==((LR_PERIOD_SCLKS/2)-1))
    {
      // Do toggle and reset counter
      lr = !lr;
      lr_counter = 0;
    }
    else
    {
      // No toggle yet, keep counting
      lr_counter += 1;
    }
  }
  
  // Drive the outgoing I2S signals
  write_i2s_pmod(to_i2s);
}
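
To see what rates the two dividers above produce, here is a minimal plain C model of the same counter logic (ordinary software C, not PipelineC). The names `sketch_clk_t`, `sketch_clk_run` and `sketch_lrck_hz` are made up for this illustration. It counts SCLK rising edges and LR toggles over a number of MCLK cycles, and computes the LR (sample) rate as MCLK / (SCLK_PERIOD_MCLKS * LR_PERIOD_SCLKS).

```c
#include <stdint.h>

#define SCLK_PERIOD_MCLKS 8
#define LR_PERIOD_SCLKS 64

// Software stand-ins for the static registers in app() (hypothetical names)
typedef struct sketch_clk_t
{
  uint8_t sclk_counter;
  uint8_t sclk;
  uint8_t lr_counter;
  uint8_t lr;
  unsigned sclk_rising_edges; // Bookkeeping only, not in the hardware
  unsigned lr_toggles;        // Bookkeeping only, not in the hardware
}sketch_clk_t;

// One MCLK cycle of the divider logic, same structure as the snippet above
static void sketch_clk_step(sketch_clk_t* s)
{
  int sclk_half_toggle = s->sclk_counter==((SCLK_PERIOD_MCLKS/2)-1);
  int sclk_period_toggle = sclk_half_toggle && (s->sclk==0);
  if(sclk_half_toggle)
  {
    s->sclk = !s->sclk;
    s->sclk_counter = 0;
  }
  else
  {
    s->sclk_counter += 1;
  }
  if(sclk_period_toggle)
  {
    s->sclk_rising_edges += 1;
    if(s->lr_counter==((LR_PERIOD_SCLKS/2)-1))
    {
      s->lr = !s->lr;
      s->lr_counter = 0;
      s->lr_toggles += 1;
    }
    else
    {
      s->lr_counter += 1;
    }
  }
}

// Run the divider from an all-zero state for some number of MCLK cycles
sketch_clk_t sketch_clk_run(unsigned mclk_cycles)
{
  sketch_clk_t s = {0, 0, 0, 0, 0, 0};
  unsigned i;
  for(i = 0; i < mclk_cycles; i++)
  {
    sketch_clk_step(&s);
  }
  return s;
}

// LR clock (stereo sample) rate for a given MCLK frequency in Hz
double sketch_lrck_hz(double mclk_hz)
{
  return mclk_hz / (SCLK_PERIOD_MCLKS * LR_PERIOD_SCLKS);
}
```

With the 22.579 MHz MCLK from the pragma above, `sketch_lrck_hz(22.579e6)` comes out just under 44.1 kHz, and one full LR period spans 8 * 64 = 512 MCLK cycles.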

I2S Stereo Samples Stream

Using the above basic clocking logic from the I2S passthrough, the i2s_mac.c module exposes an AXIS-like streaming interface for RX and TX.

#define sample_t q0_23_t

// I2S stereo sample types
typedef struct i2s_samples_t
{
  sample_t l_data;
  sample_t r_data;
}i2s_samples_t;
// _s 'stream' of the above data w/ valid flag
typedef struct i2s_samples_s
{
  i2s_samples_t samples;
  uint1_t valid;
}i2s_samples_s;

// RX function def w/ flow control
typedef struct i2s_rx_t
{
  i2s_samples_s samples;
  uint1_t overflow;
}i2s_rx_t;
i2s_rx_t i2s_rx(uint1_t data, uint1_t lr, uint1_t sclk_rising_edge, uint1_t samples_ready, uint1_t reset_n);

// TX function def w/ flow control
typedef struct i2s_tx_t
{
  uint1_t samples_ready;
  uint1_t data;
}i2s_tx_t;
i2s_tx_t i2s_tx(i2s_samples_s samples, uint1_t lr, uint1_t sclk_falling_edge, uint1_t reset_n);

// Single MAC module with both RX and TX
typedef struct i2s_mac_t
{
  i2s_rx_t rx;
  i2s_tx_t tx;
}i2s_mac_t;
i2s_mac_t i2s_mac(uint1_t reset_n, uint1_t rx_samples_ready, i2s_samples_s tx_samples);

i2s_mac_passthrough_app.c shows implementing an audio passthrough using this i2s_mac module.
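
As a rough illustration of what "AXIS-like" flow control means for the valid and ready signals above, here is a small plain C sketch. The type and function names are hypothetical. It assumes that a transfer happens when valid and ready are both set, and that overflow is flagged when a valid sample arrives while the consumer is not ready. The second point is an assumption about i2s_rx, not something taken from i2s_mac.c.

```c
#include <stdint.h>

// Hypothetical software stand-ins for i2s_samples_t / i2s_samples_s
typedef struct sketch_samples_t
{
  int32_t l_data;
  int32_t r_data;
}sketch_samples_t;
typedef struct sketch_samples_s
{
  sketch_samples_t samples;
  uint8_t valid;
}sketch_samples_s;

// Result of one cycle of the valid/ready handshake
typedef struct sketch_handshake_t
{
  uint8_t transfer; // Sample was accepted this cycle
  uint8_t overflow; // ASSUMPTION: valid sample had nowhere to go
}sketch_handshake_t;

sketch_handshake_t sketch_handshake(sketch_samples_s stream, uint8_t ready)
{
  sketch_handshake_t h;
  h.transfer = (uint8_t)(stream.valid && ready);
  h.overflow = (uint8_t)(stream.valid && !ready);
  return h;
}
```

An application that always asserts ready, as the effects chain later on this page does, would never see overflow under this model.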

Additionally, to aid in making user code easier to write + autopipeline, i2s_mac.c exposes an i2s_mac instance wired to globally visible wires as defined below:

// RX
typedef struct i2s_mac_rx_to_app_t
{
  i2s_samples_s samples;
  uint1_t overflow; 
}i2s_mac_rx_to_app_t;
typedef struct app_to_i2s_mac_rx_t
{
  uint1_t samples_ready;
  uint1_t reset_n;
}app_to_i2s_mac_rx_t;
i2s_mac_rx_to_app_t i2s_mac_rx_to_app;
app_to_i2s_mac_rx_t app_to_i2s_mac_rx;
// TX
typedef struct i2s_mac_tx_to_app_t
{
  uint1_t samples_ready;
}i2s_mac_tx_to_app_t;
typedef struct app_to_i2s_mac_tx_t
{
  i2s_samples_s samples;
  uint1_t reset_n;
}app_to_i2s_mac_tx_t;

// Globally visible ports/wires
i2s_mac_tx_to_app_t i2s_mac_tx_to_app;
app_to_i2s_mac_tx_t app_to_i2s_mac_tx;

The final i2s_app.c file reads+writes these globally visible wires.

Digital Delay Effect

The delay.c file contains the logic to implement a half second digital delay. A half second deep FIFO of samples is filled first, which delays the samples. From then on the delayed stream is read continuously, and the FIFO output is combined with the current passthrough samples, creating a single slapback delayed echo.
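
The idea can be sketched in plain C for a single channel. This is not the code from delay.c: the names are hypothetical, the depth of 22050 samples assumes a sample rate of roughly 44.1 kHz, and combining by saturating addition is an assumption, since the page only says the two streams are "combined".

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_DELAY_SAMPLES 22050 // ~0.5 s at ~44.1 kHz (assumed rate)

typedef struct sketch_delay_t
{
  int32_t buf[SKETCH_DELAY_SAMPLES]; // Circular buffer acting as the FIFO
  size_t pos;                        // Next slot to write (oldest sample once full)
  size_t count;                      // How many samples have been stored so far
}sketch_delay_t;

// Clamp to the Q0.23 integer range [-2^23, 2^23-1]
static int32_t sketch_sat_q0_23(int64_t v)
{
  if(v > 8388607) return 8388607;
  if(v < -8388608) return -8388608;
  return (int32_t)v;
}

// Process one sample: pass through while filling, then add the delayed sample
int32_t sketch_delay_step(sketch_delay_t* d, int32_t x)
{
  int32_t y = x;
  if(d->count == SKETCH_DELAY_SAMPLES)
  {
    // FIFO is full, the slot about to be overwritten holds the oldest sample
    y = sketch_sat_q0_23((int64_t)x + d->buf[d->pos]);
  }
  else
  {
    d->count += 1;
  }
  d->buf[d->pos] = x;
  d->pos = (d->pos + 1) % SKETCH_DELAY_SAMPLES;
  return y;
}

// Feed one impulse followed by silence, return the index of the first echo
// (or max_steps if no echo was seen)
size_t sketch_first_echo_index(sketch_delay_t* d, int32_t impulse, size_t max_steps)
{
  size_t n;
  d->pos = 0;
  d->count = 0;
  sketch_delay_step(d, impulse);
  for(n = 1; n < max_steps; n++)
  {
    if(sketch_delay_step(d, 0) != 0) return n;
  }
  return max_steps;
}
```

Feeding a single impulse followed by silence produces exactly one echo, SKETCH_DELAY_SAMPLES samples later, which matches the single slapback behavior described above.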

Distortion Effect

Per kind internet folk of the past, sgn(x)*(1-e^(-G*|x|)) was suggested as a distortion function, with gain G=15.0. A lookup table of 256 function points, followed by linear interpolation between those LUT points, was judged enough to approximate the function. The file interp_lut_gen.py is used to generate most of distortion.c, which uses Q0.23 fixed point numbers. A snippet:

q0_23_t distortion_mono(q0_23_t x)
{
  // Get lookup addr from top bits of value
  uint8_t lut_addr = int24_23_16(x.qmn);
  // And interpolation bits from lsbs
  uint16_t interp_point = int24_15_0(x.qmn);

  // Generated lookup values:
  q0_23_t Y_VALUES[256];
  Y_VALUES[0].qmn = 0x0;
  Y_VALUES[1].qmn = 0xe2789;
  ...
  // M Scaled down by 2^4
  q0_23_t M_VALUES[256];
  M_VALUES[0].qmn = 0x713c4c;
  M_VALUES[1].qmn = 0x64b6ba;
  ...

  // Do lookup
  q0_23_t y = Y_VALUES[lut_addr];
  q0_23_t m = M_VALUES[lut_addr];

  // Do linear interp, dy = dx * m
  // Not using fixed point mult funcs since
  // need intermediates to do different scaling than normal
  q0_23_t dxi; // Fractional bits of input x
  dxi.qmn = interp_point;
  int48_t temp = dxi.qmn * m.qmn;
  // Rounding term for a plain >>23 (not used by the truncating >>19 shift below)
  int48_t temp_rounded = temp + (1 << (23 - 1));
  // Shift right by 23 for normal Q mult, then shift left by 4 to account for slope scaling
  q0_23_t dy;
  dy.qmn = temp >> 19;
  // Interpolate
  q0_23_t yi = q0_23_add(y, dy);
  return yi;
}
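
The interpolation arithmetic can be checked in plain C using only the table entries visible in the snippet: Y_VALUES[0], Y_VALUES[1] and M_VALUES[0]. The function name below is hypothetical and the sketch covers only the first LUT segment (lut_addr 0). If the slope entry is right, interpolating to the end of segment 0 should land very close to Y_VALUES[1].

```c
#include <stdint.h>

// Values copied from the generated snippet above
#define SKETCH_Y0_QMN 0x0
#define SKETCH_Y1_QMN 0xe2789
#define SKETCH_M0_QMN 0x713c4c // Slope for segment 0, scaled down by 2^4

// Interpolate within LUT segment 0 given the 16 low bits of the input
int32_t sketch_interp_segment0(uint16_t interp_point)
{
  // dy = dx * m, using a 64-bit intermediate in place of int48_t
  int64_t temp = (int64_t)interp_point * (int64_t)SKETCH_M0_QMN;
  // >>23 for the Q0.23 multiply, <<4 to undo the slope scaling: net >>19
  int32_t dy = (int32_t)(temp >> 19);
  return SKETCH_Y0_QMN + dy;
}
```

Calling `sketch_interp_segment0(0xFFFF)` gives a value a few counts below `SKETCH_Y1_QMN`, which is the kind of small mismatch expected from a piecewise linear fit with truncation.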

Automatically Pipelined Audio Effects

By putting the stateful overflow flag in a separate/isolated app_status function (not shown below), the remaining app and effects_chain can be arbitrarily pipelined to meet timing.

// Autopipelineable stateless audio stream effects processing pipeline
i2s_samples_s effects_chain(uint1_t reset_n, i2s_samples_s in_samples)
{
  // Delay effect
  i2s_samples_s samples_w_delay = delay(reset_n, in_samples);
  
  // Distortion effect
  i2s_samples_s samples_w_distortion = distortion(samples_w_delay); //in_samples);
  
  // "Volume effect", cut effects volume in half w/ switches
  i2s_samples_s samples_w_effects = samples_w_distortion;
  uint4_t sw;
  WIRE_READ(uint4_t, sw, switches)
  if(uint4_1_1(sw))
  {
    samples_w_effects.samples.l_data.qmn = samples_w_effects.samples.l_data.qmn >> 1;
    samples_w_effects.samples.r_data.qmn = samples_w_effects.samples.r_data.qmn >> 1;
  }
  if(uint4_2_2(sw))
  {
    samples_w_effects.samples.l_data.qmn = samples_w_effects.samples.l_data.qmn >> 1;
    samples_w_effects.samples.r_data.qmn = samples_w_effects.samples.r_data.qmn >> 1;
  }
  if(uint4_3_3(sw))
  {
    samples_w_effects.samples.l_data.qmn = samples_w_effects.samples.l_data.qmn >> 1;
    samples_w_effects.samples.r_data.qmn = samples_w_effects.samples.r_data.qmn >> 1;
  }
  
  // Use switch0 to control, 1=effects on
  // Connect output
  i2s_samples_s out_samples;
  if(uint4_0_0(sw))
  {
    out_samples = samples_w_effects;
  }
  else
  {
    out_samples = in_samples;
  }
  
  return out_samples;
}

// Send audio through an effects chain
#pragma MAIN app
void app(uint1_t reset_n)
{
  // Read wires from I2S mac
  i2s_mac_rx_to_app_t from_rx_mac;
  WIRE_READ(i2s_mac_rx_to_app_t, from_rx_mac, i2s_mac_rx_to_app)
  i2s_mac_tx_to_app_t from_tx_mac;
  WIRE_READ(i2s_mac_tx_to_app_t, from_tx_mac, i2s_mac_tx_to_app)
  
  // Received samples
  i2s_samples_s rx_samples = from_rx_mac.samples;
  // Signal always ready draining through effects chain, checking overflow in status
  uint1_t rx_samples_ready = 1;
  
  // Send through effects chain
  i2s_samples_s samples_w_effects = effects_chain(reset_n, rx_samples);
  
  // Samples to transmit
  i2s_samples_s tx_samples = samples_w_effects;
  
  // Write wires to I2S mac
  app_to_i2s_mac_tx_t to_tx_mac;
  to_tx_mac.samples = tx_samples;
  to_tx_mac.reset_n = reset_n;
  WIRE_WRITE(app_to_i2s_mac_tx_t, app_to_i2s_mac_tx, to_tx_mac)
  app_to_i2s_mac_rx_t to_rx_mac;
  to_rx_mac.samples_ready = rx_samples_ready;
  to_rx_mac.reset_n = reset_n;
  WIRE_WRITE(app_to_i2s_mac_rx_t, app_to_i2s_mac_rx, to_rx_mac)

  // Control+status logic is stateful (ex. overflow bit)
  // and is kept separate from this stateless autopipelineable function
  app_status(reset_n, to_tx_mac.samples.valid, from_tx_mac.samples_ready);
}
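
The switch handling in effects_chain boils down to the following, shown here as a plain C sketch for one channel. The function name and the dry/wet parameter names are hypothetical. Switches 1 to 3 each halve the effects volume, and switch 0 selects between the processed and unprocessed sample.

```c
#include <stdint.h>

// dry: unprocessed input sample, wet: sample after delay+distortion
// sw:  4-bit switch value, bit 0 = effects on, bits 1..3 = halve volume
int32_t sketch_switch_volume(int32_t dry, int32_t wet, uint8_t sw)
{
  int bit;
  for(bit = 1; bit <= 3; bit++)
  {
    if((sw >> bit) & 1)
    {
      // Right shift of a negative value is implementation-defined in C,
      // this sketch assumes the usual arithmetic shift
      wet = wet >> 1;
    }
  }
  // Switch 0 selects the effects output, otherwise plain passthrough
  return (sw & 1) ? wet : dry;
}
```

With all three volume switches set, the effects output is attenuated by a factor of 8, about 18 dB.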

As written, targeting the Artix 7 device on the Arty board, the PipelineC tool reports the following:

██████╗ ██╗██████╗ ███████╗██╗     ██╗███╗   ██╗███████╗ ██████╗
██╔══██╗██║██╔══██╗██╔════╝██║     ██║████╗  ██║██╔════╝██╔════╝
██████╔╝██║██████╔╝█████╗  ██║     ██║██╔██╗ ██║█████╗  ██║     
██╔═══╝ ██║██╔═══╝ ██╔══╝  ██║     ██║██║╚██╗██║██╔══╝  ██║     
██║     ██║██║     ███████╗███████╗██║██║ ╚████║███████╗╚██████╗
╚═╝     ╚═╝╚═╝     ╚══════╝╚══════╝╚═╝╚═╝  ╚═══╝╚══════╝ ╚═════╝

Output directory: /home/julian/pipelinec_syn_output
================== Parsing C Code to Logical Hierarchy ================================
Parsing: /media/1TB/Dropbox/PipelineC/git/PipelineC/main.c
================== Writing Resulting Logic to File ================================
Building map of combinatorial logic...
Using VIVADO synthesizing for part: xc7a35ticsg324-1l
Writing VHDL files for all functions (as combinatorial logic)...
Writing multi main top level files...
Writing the constant struct+enum definitions as defined from C code...
Writing clock cross definitions as parsed from C code...
Writing finalized comb. logic synthesis tool files...
Output VHDL files: /home/julian/pipelinec_syn_output/read_vhdl.tcl
================== Adding Timing Information from Synthesis Tool ================================
Synthesizing as combinatorial logic to get total logic delay...
...
i2s_rx Path delay (ns): 3.019 = 331.23550844650543 MHz
i2s_tx Path delay (ns): 2.947 = 339.328130302002 MHz
app_status Path delay (ns): 1.558 = 641.8485237483953 MHz
i2s_mac_ports Path delay (ns): 3.581 = 279.2516056967328 MHz
delay Path delay (ns): 8.248000000000001 = 121.2415130940834 MHz
distortion_mono Path delay (ns): 9.81 = 101.9367991845056 MHz
distortion Path delay (ns): 9.811 = 101.92640913260625 MHz
effects_chain Path delay (ns): 14.335 = 69.75933031042902 MHz
app Path delay (ns): 14.335 = 69.75933031042902 MHz
================== Beginning Throughput Sweep ================================
Function: led0_module Target MHz: 22.579
Function: led1_module Target MHz: 22.579
Function: led2_module Target MHz: 22.579
Function: led3_module Target MHz: 22.579
Function: leds_module Target MHz: 22.579
Function: switches_module Target MHz: 22.579
Function: pmod_ja Target MHz: 22.579
Function: i2s_mac_ports Target MHz: 22.579
Function: app Target MHz: 22.579
Setting all instances to comb. logic to start...
Starting with blank sweep state...
Starting middle out sweep...
Starting from zero clk timing params...
Collecting modules to pipeline...
Pipelining modules...
Updating output files...
Running syn w timing params...
led0_module : 0 clocks latency...
led1_module : 0 clocks latency...
led2_module : 0 clocks latency...
led3_module : 0 clocks latency...
leds_module : 0 clocks latency...
switches_module : 0 clocks latency...
pmod_ja : 0 clocks latency...
i2s_mac_ports : 0 clocks latency...
app : 0 clocks latency...
Running: /media/1TB/Programs/Linux/Xilinx/Vivado/2019.2/bin/vivado -log /home/julian/pipelinec_syn_output/top/vivado_1208.log -source "/home/julian/pipelinec_syn_output/top/top_1208.tcl" -journal /home/julian/pipelinec_syn_output/top/vivado.jou -mode batch
switches_module Clock Goal: 22.58 (MHz) Current: 69.15 (MHz)(14.46 ns) 0 clks
app Clock Goal: 22.58 (MHz) Current: 69.15 (MHz)(14.46 ns) 0 clks
i2s_mac_ports Clock Goal: 22.58 (MHz) Current: 69.15 (MHz)(14.46 ns) 0 clks
Met timing...
================== Writing Results of Throughput Sweep ================================
Output VHDL files: /home/julian/pipelinec_syn_output/read_vhdl.tcl
Done.

Note that no modules are reported to have any additional pipelining since the clock goal of ~22.58MHz is an easy target to meet.
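
The arithmetic behind that statement is simple enough to sketch in plain C (the helper names are hypothetical). A path delay in ns converts to a maximum frequency of 1000/delay MHz, and timing is met when the path delay fits inside one period of the target clock.

```c
// Maximum clock frequency in MHz for a given combinatorial path delay in ns
double sketch_delay_ns_to_mhz(double path_delay_ns)
{
  return 1000.0 / path_delay_ns;
}

// Clock period in ns for a given frequency in MHz
double sketch_mhz_to_period_ns(double mhz)
{
  return 1000.0 / mhz;
}

// Nonzero if the path fits within one period of the target clock
int sketch_meets_timing(double path_delay_ns, double target_mhz)
{
  return path_delay_ns <= sketch_mhz_to_period_ns(target_mhz);
}
```

The 22.579 MHz target gives a period of roughly 44.3 ns, so the reported 14.46 ns path fits with room to spare and zero clocks of added latency are needed.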

[Image: resource utilization]

Note the use of exactly two DSPs in the distortion function: a multiplier for each of the stereo left+right channels. Also note the use of block rams for the delay effect (FIFO of samples).

Once loaded onto the board, switches control if the effects chain is active and at what volume. Give it a try!
