|
8 | 8 | TCDM Interconnect |
9 | 9 | ================= |
10 | 10 |
|
| 11 | +The Tightly Coupled Data Memory (TCDM) Interconnect is a high-performance, low-latency memory bus designed for efficient data transfers. |
11 | 12 |
|
| 13 | +Features |
| 14 | +~~~~~~~~ |
| 15 | +- The processor utilizes the TCDM Interconnect for both instruction fetching and data load/store operations. |
| 16 | +- The uDMA Subsystem uses the TCDM Interconnect to access interleaved (L2) memory. |
| 17 | +- Acts as a master to the APB peripheral interconnect. |
| 18 | +- Four TCDM interfaces for the eFPGA provide high-speed access to the CORE-V-MCU memory. |
| 19 | +- Provides a JTAG debug interface. |
| 20 | +- Supports a 32-bit address width, 32-bit data width, and 4-bit byte enable (BE) width. |
| 21 | +- Supports the following network topologies: |
12 | 22 |
|
| 23 | + - Full Crossbar |
| 24 | + - Clos network |
| 25 | + - Butterfly |
| 26 | + |
| 27 | + **NOTE**: The network topology is fixed and not configurable. |
| 28 | + |
| 29 | + |
| 30 | +For more details about the TCDM interconnect, refer to the `TCDM interconnect README <https://github.com/openhwgroup/core-v-mcu/blob/master/rtl/tcdm_interconnect/README.md>`_. |
| 31 | + |
| 32 | +Block Architecture |
| 33 | +~~~~~~~~~~~~~~~~~~ |
| 34 | +The TCDM interconnect supports 9 master ports and 9 slave ports. The figure below shows a high-level block diagram of the interconnect, highlighting its main components: |
| 35 | + |
| 36 | +- L2 Interconnect Demux |
| 37 | +- Contiguous Crossbar |
| 38 | +- Interleaved Crossbar |
| 39 | +- AXI Bridge |
| 40 | + |
| 41 | +The L2 Interconnect Demux identifies the target slave region and routes the request to the appropriate destination: either one of the crossbars or the AXI Bridge. Internally, both the crossbars and the AXI Bridge use |
| 42 | +address decoders and arbiters to direct requests to the correct slave. |
| 43 | + |
| 44 | +.. figure:: ../images/TCDM_Interconnect_block_diagram.png |
| 45 | + :name: TCDM_Interconnect_block_diagram |
| 46 | + :align: center |
| 47 | + :alt: |
| 48 | + |
| 49 | + **TCDM Interconnect block diagram** |
| 50 | + |
| 51 | + |
| 52 | + |
| 53 | +**Masters:** |
| 54 | + |
| 55 | +- uDMA Subsystem (2 ports) |
| 56 | +- eFPGA (4 ports) |
| 57 | +- Core Complex (2 ports) |
| 58 | +- Debug Module (1 port) |
| 59 | + |
| 60 | +**Slaves:** |
| 61 | + |
| 62 | +- Boot ROM |
| 63 | +- Non-interleaved memory (2 private memory banks) |
| 64 | +- Interleaved memory (4 banks) |
| 65 | +- APB peripheral interconnect |
| 66 | +- eFPGA APB Target |
| 67 | + |
| 68 | +TCDM (L2 Interface) Demux |
| 69 | +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 70 | +The uDMA SS, eFPGA, and Core Complex masters connect to the TCDM Demux, which is responsible for routing requests to the correct slave. The slaves fall into three categories based on address regions: |
| 71 | + |
| 72 | +- AXI Region : Connects to APB peripheral interconnect to access APB Peripherals |
| 73 | +- Contiguous Slaves : Includes Non-interleaved memory regions such as L2 private memory banks (SRAM Bank0 - 32KB, SRAM Bank1 - 32KB), Boot ROM and eFPGA APB Target |
| 74 | +- Interleaved Slaves : Contains the interleaved memory banks (four 112 KB SRAM blocks) |
| 75 | + |
| 76 | +Refer to the `Memory Map <https://github.com/openhwgroup/core-v-mcu/blob/master/docs/doc-src/mmap.rst>`_ for the address ranges of each slave. |
| 77 | + |
| 78 | +The TCDM Demux integrates an address decoder that inspects each incoming request address and matches it against the configured address ranges for all slave regions. Upon identifying a match, the address decoder determines the appropriate target region |
| 79 | +and internally routes the request to the corresponding slave region (AXI, contiguous, or interleaved). |
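
The region decoding described above can be sketched as a small behavioural model in Python. This is only an illustration, not the RTL: the ``Region`` type, the ``decode_region`` helper and, in particular, the address ranges are hypothetical placeholders (the real ranges are listed in the Memory Map). The text does not say what happens when no region matches, so the sketch simply returns ``None`` in that case.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    start: int  # first address of the region (inclusive)
    end: int    # one past the last address of the region (exclusive)


# Hypothetical placeholder ranges, NOT the real CORE-V-MCU memory map.
REGIONS = [
    Region("contiguous", 0x1000_0000, 0x1001_0000),
    Region("interleaved", 0x1001_0000, 0x1008_0000),
    Region("axi", 0x2000_0000, 0x2010_0000),
]


def decode_region(addr: int) -> Optional[str]:
    """Return the name of the region containing addr, or None if none matches."""
    for region in REGIONS:
        if region.start <= addr < region.end:
            return region.name
    return None
```

A request is then handed to the contiguous crossbar, the interleaved crossbar or the AXI Bridge depending on the returned name.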
| 80 | + |
| 81 | +Interaction with Contiguous Crossbar |
| 82 | +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 83 | + |
| 84 | +.. figure:: ../images/TCDM_Contiguous_Crossbar.png |
| 85 | + :name: TCDM_Contiguous_Crossbar |
| 86 | + :align: center |
| 87 | + :alt: |
| 88 | + |
| 89 | + **Contiguous Crossbar** |
| 90 | + |
| 91 | +The contiguous crossbar consists of two primary components: |
| 92 | + |
| 93 | +1. Address Decoders - One per master (Total of 9) |
| 94 | +2. Single Xbar Module |
| 95 | + |
| 96 | +Each address decoder receives the ADDR from the TCDM Demux and checks it against the address ranges of the contiguous slaves. If a match is found, port_sel is generated and sent to the Xbar module's ADDR input. |
| 97 | +This port_sel signal carries the slave index that the Xbar uses to route the request to the appropriate slave arbiter. |
| 98 | +Meanwhile, the actual request (ADDR, WEN, WDATA and BE) is aggregated into a single bundle and forwarded to the Xbar's WDATA input. |
| 99 | +Here, the ADDR bundled with WDATA contains the full original address for the read/write operation and is used by the selected slave to determine the exact memory offset for the access. |
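
To make the bundling step concrete, the following Python sketch packs a request into one integer and unpacks it again. The field order inside the bundle and the helper names ``pack_request`` and ``unpack_request`` are assumptions made for illustration; the RTL may order the fields differently. The BE width of 4 follows the ``BE_WIDTH = 4`` value used in the interleaved crossbar note.

```python
ADDR_W = 32   # address width
WEN_W = 1     # write enable
WDATA_W = 32  # write data width
BE_W = 4      # byte enable width (one bit per byte of WDATA)


def pack_request(addr: int, wen: int, wdata: int, be: int) -> int:
    """Aggregate a request into a single bundle (assumed order: ADDR, WEN, WDATA, BE)."""
    bundle = addr & ((1 << ADDR_W) - 1)
    bundle = (bundle << WEN_W) | (wen & 1)
    bundle = (bundle << WDATA_W) | (wdata & ((1 << WDATA_W) - 1))
    bundle = (bundle << BE_W) | (be & ((1 << BE_W) - 1))
    return bundle


def unpack_request(bundle: int):
    """Disaggregate a bundle back into (addr, wen, wdata, be)."""
    be = bundle & ((1 << BE_W) - 1)
    bundle >>= BE_W
    wdata = bundle & ((1 << WDATA_W) - 1)
    bundle >>= WDATA_W
    wen = bundle & 1
    bundle >>= WEN_W
    addr = bundle & ((1 << ADDR_W) - 1)
    return addr, wen, wdata, be
```

Because the full address travels inside the bundle, the slave side can recover it unchanged after the crossbar has used port_sel for routing.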
| 100 | + |
| 101 | +The Xbar is a multi-master and multi-slave module that includes: |
| 102 | + |
| 103 | +1. A dedicated local address decoder and response multiplexer for each master to interpret port_sel. |
| 104 | +2. A dedicated RR arbiter for each slave to handle requests from multiple masters. |
| 105 | + |
| 106 | +The address decoder decodes the index received over port_sel port and selects the corresponding slave-specific arbiter. |
| 107 | +Each arbiter manages contention among multiple masters and grants access to one master per cycle using a round-robin (RR) arbitration policy. |
| 108 | +Once access is granted, the aggregated request is disaggregated into its original signals (ADDR, WEN, WDATA, BE) and forwarded to the slave. |
| 109 | + |
| 110 | +When a slave detects the REQ signal, it immediately asserts the GNT signal in the same clock cycle to acknowledge the request. |
| 111 | + |
| 112 | +For read operations, the r_data and valid signals are updated in the next clock cycle. |
| 113 | +The response multiplexer collects the response data from all the slaves and selects the valid response corresponding to the previously decoded target. |
| 114 | +This selection ensures that only the appropriate response is forwarded back to the master. |
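
The per-slave round-robin arbitration can be illustrated with a short Python model. It is a generic round-robin scheme under the assumption that priority rotates to the master after the most recently granted one; the class name ``RoundRobinArbiter`` is hypothetical and the actual arbiter in the RTL may track priority differently.

```python
class RoundRobinArbiter:
    """Grants one requesting master per cycle, rotating priority after each grant."""

    def __init__(self, num_masters: int):
        self.num_masters = num_masters
        # Start so that master 0 has the highest priority in the first cycle.
        self.last_grant = num_masters - 1

    def arbitrate(self, requests):
        """requests[i] is True if master i requests this slave in the current cycle.

        Returns the index of the granted master, or None if nobody requests.
        """
        for offset in range(1, self.num_masters + 1):
            idx = (self.last_grant + offset) % self.num_masters
            if requests[idx]:
                self.last_grant = idx
                return idx
        return None
```

With three masters that all request continuously, the grants cycle through 0, 1, 2, 0, and so on, so no master is starved.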
| 115 | + |
| 116 | + |
| 117 | +Interaction with Interleaved Crossbar |
| 118 | +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 119 | + |
| 120 | +.. figure:: ../images/TCDM_Interleaved_Crossbar.png |
| 121 | + :name: TCDM_Interleaved_Crossbar |
| 122 | + :align: center |
| 123 | + :alt: |
| 124 | + |
| 125 | + **Interleaved Crossbar** |
| 126 | + |
| 127 | +The interleaved crossbar follows a different mechanism for selecting the target slave. Unlike the contiguous crossbar, it does not use address decoders based on full address ranges. |
| 128 | +Instead, it uses specific address bits (often referred to as bank bits) to determine the destination memory bank. These bits are extracted from the request address and forwarded to the Xbar's ADDR input. |
| 129 | + |
| 130 | +``port_sel = ADDR[$clog2(BE_WIDTH)+PORT_SEL_WIDTH-1:$clog2(BE_WIDTH)]`` |
| 131 | + |
| 132 | +NOTE: |
| 133 | + - BE_WIDTH = 4 |
| 134 | + - PORT_SEL_WIDTH = $clog2(NR_SLAVE_PORTS) = $clog2(4) = 2 |
| 135 | + - port_sel = ADDR[2+2-1:2] = ADDR[3:2] |
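
The bit selection above can be checked with a few lines of Python. The ``clog2`` helper mimics SystemVerilog's ``$clog2`` and the function name ``port_sel`` is only used here for illustration; the parameter values are the ones from the note.

```python
def clog2(n: int) -> int:
    """Ceiling of log2(n) for n >= 1, mirroring SystemVerilog's $clog2."""
    return (n - 1).bit_length()


BE_WIDTH = 4
NR_SLAVE_PORTS = 4
PORT_SEL_WIDTH = clog2(NR_SLAVE_PORTS)  # 2 bits select one of 4 banks


def port_sel(addr: int) -> int:
    """Extract ADDR[clog2(BE_WIDTH)+PORT_SEL_WIDTH-1 : clog2(BE_WIDTH)], i.e. ADDR[3:2]."""
    shift = clog2(BE_WIDTH)  # skip the byte-offset bits within a 32-bit word
    return (addr >> shift) & ((1 << PORT_SEL_WIDTH) - 1)
```

Consecutive 32-bit words therefore map to consecutive banks: byte offsets 0x0, 0x4, 0x8 and 0xC select banks 0, 1, 2 and 3, and offset 0x10 wraps back to bank 0.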
| 136 | + |
| 137 | +These bits represent the slave index that the Xbar uses to route the request to the appropriate slave arbiter. |
| 138 | +Each master aggregates its request (ADDR, WEN, WDATA, and BE) into a bundled format and sends it to the crossbar's DATA input. |
| 139 | +Here, the ADDR bundled with WDATA contains the full original address for the read/write operation and is used by the selected slave to determine the exact memory offset for the access. |
| 140 | + |
| 141 | +Internally, the interleaved crossbar also contains a Xbar module that includes: |
| 142 | + |
| 143 | +1. A dedicated local address decoder and response multiplexer for each master to interpret port_sel. |
| 144 | +2. A dedicated RR arbiter for each slave to handle requests from multiple masters. |
| 145 | + |
| 146 | +As in the contiguous crossbar, the address decoder decodes the index received over the port_sel port and selects the corresponding slave-specific arbiter. |
| 147 | +The arbitration occurs every clock cycle, ensuring fair access. |
| 148 | +Once access is granted, the aggregated request is disaggregated into its original signals (ADDR, WEN, WDATA, BE) and forwarded to the slave. |
| 149 | + |
| 150 | +When a slave detects the REQ signal, it immediately asserts the GNT signal in the same clock cycle to acknowledge the request. |
| 151 | + |
| 152 | +For read operations, the r_data and valid signals are updated in the next clock cycle. |
| 153 | +The response mux collects the response data from all the slaves and selects the valid response corresponding to the previously decoded target. |
| 154 | +This selection ensures that only the appropriate response is forwarded back to the master. |
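
The request/grant/response timing described above can be modelled at cycle level as follows. This is a simplified behavioural sketch, not the RTL: the class ``TcdmSlaveModel``, its ``step`` method and the word-indexing scheme are assumptions, and the write-enable polarity (1 = write, 0 = read) is taken from the pin description in this document.

```python
class TcdmSlaveModel:
    """Toy memory slave: grants in the same cycle, responds in the next one."""

    def __init__(self, num_words: int):
        self.mem = [0] * num_words
        self.valid = False  # response valid, as seen in the cycle after the request
        self.r_data = 0     # read data, as seen in the cycle after the request

    def step(self, req, addr=0, wen=0, wdata=0, be=0xF):
        """Advance one clock cycle and return gnt for this cycle."""
        if not req:
            self.valid = False
            return False
        index = (addr >> 2) % len(self.mem)  # assumed word indexing
        if wen:
            # Build a 32-bit mask from the 4 byte-enable bits.
            mask = 0
            for byte in range(4):
                if (be >> byte) & 1:
                    mask |= 0xFF << (8 * byte)
            self.mem[index] = (self.mem[index] & ~mask) | (wdata & mask)
        else:
            self.r_data = self.mem[index]
        self.valid = True  # also acknowledges writes
        return True        # GNT is asserted in the same cycle as REQ
```

After each call to ``step``, the ``valid`` and ``r_data`` attributes hold the values the master would observe in the following clock cycle.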
| 155 | + |
| 156 | +Interaction with AXI Bridge |
| 157 | +^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 158 | + |
| 159 | +.. figure:: ../images/TCDM_AXI_Bridge.png |
| 160 | + :name: TCDM_AXI_Bridge |
| 161 | + :align: center |
| 162 | + :alt: |
| 163 | + |
| 164 | + **TCDM AXI Bridge** |
| 165 | + |
| 166 | +The AXI bridge receives incoming requests, which are passed through a TCDM-to-AXI converter. This converter translates 32-bit TCDM protocol transactions into 32-bit AXI transactions. |
| 167 | +The translated AXI transactions are then forwarded to an AXI crossbar for further decoding and routing. |
| 168 | + |
| 169 | +The AXI crossbar efficiently routes transactions from multiple masters to multiple slaves. The crossbar includes the following components: |
| 170 | + |
| 171 | +- **Write Address Decoder**: Each master has a dedicated write address decoder that compares the write transaction address (AWADDR) against the address ranges of all connected slaves. Upon finding a match, it generates a selection signal for the corresponding slave and forwards the transaction to the AXI Demux; otherwise, the request is redirected to the error slave. |
| 172 | +- **Read Address Decoder**: Similarly, each master has a dedicated read address decoder that compares the ARADDR (read address) against slave address ranges. If a valid slave match is found, the selection signal is generated and the request is passed to the AXI Demux; otherwise, the request is redirected to the error slave. |
| 173 | +- **AXI Demultiplexer**: There is one AXI Demux per master. It receives read/write transactions and routes them to one of several slaves based on the selection signals provided by the address decoders. It ensures that transactions are correctly distributed across the slaves. |
| 174 | +- **AXI Error Slave**: A dedicated error slave for each master. It handles unmatched or invalid addresses. If no slave address matches the decoded address, the transaction is routed to the error slave, which generates an appropriate error response. |
| 175 | +- **AXI Multiplexer**: There is one AXI MUX per slave. It merges the response channels (write response and read) coming from multiple masters targeting that slave. The mux includes RR arbitration logic to forward one valid response at a time to the master. |
| 176 | + |
| 177 | +The AXI Demux handles the actual routing of transactions to the correct slave based on the decoder's selection signals received from Write/Read Address decoder. |
| 178 | +Once the slave completes processing the requests, the read and write responses are sent back to the crossbar. Since multiple masters may target the same slave, their responses are funneled through a shared interface. The axi_mux, instantiated per slave, merges these responses and uses RR arbitration to decide which master's response to forward at any given time. |
| 179 | + |
| 180 | +System Architecture |
| 181 | +~~~~~~~~~~~~~~~~~~~ |
| 182 | +.. figure:: ../images/TCDM_Interconnect_block_diagram_system_level.png |
| 183 | + :name: TCDM_Interconnect_connection_diagram |
| 184 | + :align: center |
| 185 | + :alt: |
| 186 | + |
| 187 | + TCDM Interconnect connection diagram |
| 188 | + |
| 189 | +Programming Model |
| 190 | +~~~~~~~~~~~~~~~~~ |
| 191 | + |
| 192 | +The TCDM Interconnect handles address decoding and transaction routing internally, making its functionality completely transparent to the user. |
| 193 | + |
| 194 | +TCDM interconnect CSRs |
| 195 | +~~~~~~~~~~~~~~~~~~~~~~ |
| 196 | + |
| 197 | +There are no CSRs available, as this IP is transparent to users. |
| 198 | + |
| 199 | +Pin Diagram |
| 200 | +~~~~~~~~~~~~~~ |
| 201 | + |
| 202 | +.. figure:: ../images/TCDM_Interconnect_pin_diagram.png |
| 203 | + :name: TCDM_Interconnect_pin_diagram |
| 204 | + :align: center |
| 205 | + :alt: |
| 206 | + |
| 207 | + TCDM Interconnect pin diagram |
| 208 | + |
| 209 | +Below is the categorization of these pins: |
| 210 | + |
| 211 | +Clock Interface |
| 212 | +^^^^^^^^^^^^^^^ |
| 213 | + |
| 214 | +- ``clk_i`` : system clock |
| 215 | + |
| 216 | +Reset Interface |
| 217 | +^^^^^^^^^^^^^^^ |
| 218 | + |
| 219 | +- ``rst_ni`` : Active low reset signal |
| 220 | + |
| 221 | +Master Interface |
| 222 | +^^^^^^^^^^^^^^^^ |
| 223 | + |
| 224 | +- ``req_i`` : Request signal from master ports. |
| 225 | +- ``add_i`` : Address of the TCDM request. |
| 226 | +- ``wen_i`` : Write enable signal; 1 = write, 0 = read. |
| 227 | +- ``wdata_i`` : Data to be written to memory. |
| 228 | +- ``be_i`` : Byte enable signals. |
| 229 | +- ``gnt_o`` : Grant signal indicating the request has been accepted. |
| 230 | +- ``vld_o`` : Response valid signal, also used for write acknowledgments. |
| 231 | +- ``rdata_o`` : Data read from memory for load operations. |
| 232 | + |
| 233 | +Slave Interface |
| 234 | +^^^^^^^^^^^^^^^ |
| 235 | + |
| 236 | +- ``req_o`` : Request signal sent to slave memory banks. |
| 237 | +- ``gnt_i`` : Grant signal from memory banks. |
| 238 | +- ``add_o`` : Address within each memory bank. |
| 239 | +- ``wen_o`` : Write enable signal to memory banks. |
| 240 | +- ``wdata_o`` : Data to be written to memory. |
| 241 | +- ``be_o`` : Byte enable signals for each memory bank. |
| 242 | +- ``rdata_i`` : Data returned from the memory banks for read operations. |