doc(spi_master): add descriptions about the SPI master speed

ginkgm · espressif-bot · commit e0b05c29f1e6 · 2018-04-02T12:44:19.000Z
Closes #1542, Closes #1008
diff --git a/docs/en/api-reference/peripherals/spi_master.rst b/docs/en/api-reference/peripherals/spi_master.rst
@@ -57,22 +57,13 @@ A transaction on the SPI bus consists of five phases, any of which may be skippe
 * The dummy phase. The phase is configurable, used to meet the timing requirements.
 * The read phase. The slave sends data to the master.
 
-In full duplex, the read and write phases are combined, causing the SPI host to read and
-write data simultaneously. The total transaction length is decided by 
+In full duplex mode, the read and write phases are combined, and the SPI host reads and
+writes data simultaneously. The total transaction length is decided by 
 ``command_bits + address_bits + trans_conf.length``, while the ``trans_conf.rx_length``
 only determines the length of data received into the buffer.
 
-In half duplex, the length of write phase and read phase are decided by ``trans_conf.length`` and 
-``trans_conf.rx_length`` respectively. ** Note that a half duplex transaction with both a read and 
-write phase is not supported when using DMA. ** If such transaction is needed, you have to use one 
-of the alternative solutions:
-
-  1. use full-duplex mode instead.
-  2. disable the DMA by set the last parameter to 0 in bus initialization function just as belows:
-     ``ret=spi_bus_initialize(VSPI_HOST, &buscfg, 0);``  
-
-     this may prohibit you from transmitting and receiving data longer than 32 bytes.
-  3. try to use command and address field to replace the write phase.
+In half duplex mode, the host has independent write and read phases. The lengths of the write phase and the read phase are
+decided by ``trans_conf.length`` and ``trans_conf.rx_length`` respectively. 
 
 The command and address phase are optional in that not every SPI device will need to be sent a command
 and/or address. This is reflected in the device configuration: when the ``command_bits`` or ``address_bits``
@@ -82,6 +73,39 @@ Something similar is true for the read and write phase: not every transaction ne
 as well as data to be read. When ``rx_buffer`` is NULL (and SPI_USE_RXDATA is not set) the read phase 
 is skipped. When ``tx_buffer`` is NULL (and SPI_USE_TXDATA is not set) the write phase is skipped.
 
+GPIO matrix and native pins
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Most peripheral pins in ESP32 can connect directly to a specific GPIO, which is called the *native pin*. When a peripheral is
+required to work with pins other than the native pins, ESP32 uses a *GPIO matrix* to route the signals. If one of the pins is
+not native, the driver automatically routes all the signals through the GPIO matrix, which is clocked at 80MHz. The signals are
+sampled and sent to the peripherals or the GPIOs. 
+
+When the GPIO matrix is used, signals faster than 40MHz cannot propagate to the peripherals, and the setup time of MISO is very
+likely to be violated. Hence the clock frequency limit is somewhat lower than when the GPIO matrix is not used.
+
+Native pins for SPI controllers are as below:
+
++----------+------+------+
+| Pin Name | HSPI | VSPI |
++          +------+------+
+|          | GPIO Number |
++==========+======+======+
+| CS0*     | 15   | 5    |
++----------+------+------+
+| SCLK     | 14   | 18   |
++----------+------+------+
+| MISO     | 12   | 19   |
++----------+------+------+
+| MOSI     | 13   | 23   |
++----------+------+------+
+| QUADWP   | 2    | 22   |
++----------+------+------+
+| QUADHD   | 4    | 21   |
++----------+------+------+
+
+Note * Only the first device attached to the bus can use the CS0 pin.
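
The routing rule above (if any pin is not native, all signals go through the GPIO matrix) can be sketched in plain C. This is only an illustrative sketch, not the driver's code: the names ``spi_bus_t``, ``spi_pins_t`` and ``spi_uses_native_pins`` are hypothetical, the pin numbers are copied from the table above, and treating ``-1`` as "pin not used" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not part of the spi_master API. */
typedef enum { BUS_HSPI = 0, BUS_VSPI = 1 } spi_bus_t;

typedef struct {
    int cs0, sclk, miso, mosi, quadwp, quadhd;
} spi_pins_t;

/* Native pin numbers, taken from the table above. */
static const spi_pins_t native_pins[] = {
    [BUS_HSPI] = { 15, 14, 12, 13, 2, 4 },
    [BUS_VSPI] = { 5, 18, 19, 23, 22, 21 },
};

/* Assumption: a negative pin number means the signal is not used,
 * so it cannot force the bus onto the GPIO matrix. */
static bool pin_is_native(int requested, int native)
{
    return requested < 0 || requested == native;
}

/* Returns true only if every used signal sits on its native pin.
 * A single non-native pin means all signals go through the GPIO matrix. */
static bool spi_uses_native_pins(spi_bus_t bus, const spi_pins_t *p)
{
    assert(p != NULL);
    const spi_pins_t *n = &native_pins[bus];
    return pin_is_native(p->cs0, n->cs0) &&
           pin_is_native(p->sclk, n->sclk) &&
           pin_is_native(p->miso, n->miso) &&
           pin_is_native(p->mosi, n->mosi) &&
           pin_is_native(p->quadwp, n->quadwp) &&
           pin_is_native(p->quadhd, n->quadhd);
}
```

For example, HSPI with MOSI moved from GPIO 13 to GPIO 25 makes the check return false, so the lower GPIO matrix frequency limits would apply to the whole bus.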
+
 Using the spi_master driver
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -125,21 +149,141 @@ Write and read phases
 
 Normally, data to be transferred to or from a device will be read from or written to a chunk of memory
 indicated by the ``rx_buffer`` and ``tx_buffer`` members of the transaction structure. 
-When DMA is enabled for transfers, these buffers are highly recommended to meet the requirements as belows:
+When DMA is enabled for transfers, these buffers are highly recommended to meet the requirements as below:
 
   1. allocated in DMA-capable memory using ``pvPortMallocCaps(size, MALLOC_CAP_DMA)``;
   2. 32-bit aligned (start from the boundary and have length of multiples of 4 bytes).
 
 If these requirements are not satisfied, efficiency of the transaction will suffer due to the allocation and 
 memcpy of temporary buffers.
 
+.. note::  Half duplex transactions with both read and write phases are not supported when using DMA. See
+  :ref:`spi_known_issues` for details and workarounds.
+
 Sometimes, the amount of data is very small making it less than optimal allocating a separate buffer
 for it. If the data to be transferred is 32 bits or less, it can be stored in the transaction struct
 itself. For transmitted data, use the ``tx_data`` member for this and set the ``SPI_USE_TXDATA`` flag
 on the transmission. For received data, use ``rx_data`` and set ``SPI_USE_RXDATA``. In both cases, do
 not touch the ``tx_buffer`` or ``rx_buffer`` members, because they use the same memory locations
 as ``tx_data`` and ``rx_data``.
 
+Speed and Timing Considerations
+-------------------------------
+
+Transferring speed
+^^^^^^^^^^^^^^^^^^
+
+Two factors limit the transferring speed: (1) the transaction interval, and (2) the SPI clock frequency used.
+When large transactions are used, the clock frequency determines the transferring speed, while the interval affects the
+speed significantly when small transactions are used.
+
+    1. Transaction interval: The interval mainly comes from the cost of FreeRTOS queues and the time spent switching between
+       tasks and the ISR. It also takes time for the software to set up the SPI peripheral registers and to copy data to the
+       FIFOs or set up the DMA links. Depending on whether DMA is used, the interval of a one-byte transaction is typically 
+       around 25us. 
+            
+            1.  The calling task is blocked and the CPU switches to other tasks while the
+                transaction is in flight. This saves CPU time but increases the interval. 
+            2.  When DMA is enabled, it takes about 2us per transaction to set up the linked list. While the master is
+                transferring, it automatically reads data from the linked list. If DMA is not enabled, the
+                CPU has to write/read each byte to/from the FIFO by itself. Usually this is faster than 2us, but the
+                transaction length is limited to 32 bytes for both write and read.
+       
+       The typical transaction interval with one byte of data is as below:
+
+       +-----------------------+---------+
+       | Transaction Time (us) | Typical |
+       +=======================+=========+
+       | DMA                   | 24      | 
+       +-----------------------+---------+
+       | No DMA                | 22      |
+       +-----------------------+---------+
+
+    2. SPI clock frequency: Each byte transferred takes 8 clock periods, i.e. *8/fspi*. If the clock frequency is
+       too high, some functions may not be usable. See :ref:`timing_considerations`.
+
+For a normal transaction, the overall cost is *20+8n/Fspi[MHz]* [us] for n bytes transferred
+in one transaction. Hence the transferring speed is *n/(20+8n/Fspi)* [bytes/us]. The table below gives example
+transferring speeds with an 8MHz clock, calculated with a transaction interval of 25us:
+
++-----------+----------------------+--------------------+------------+-------------+
+| Frequency | Transaction Interval | Transaction Length | Total Time | Total Speed |
+|           |                      |                    |            |             |
+| [MHz]     | [us]                 | [bytes]            | [us]       | [kBps]      |
++===========+======================+====================+============+=============+
+| 8         | 25                   | 1                  | 26         | 38.5        |
++-----------+----------------------+--------------------+------------+-------------+
+| 8         | 25                   | 8                  | 33         | 242.4       |
++-----------+----------------------+--------------------+------------+-------------+
+| 8         | 25                   | 16                 | 41         | 390.2       |
++-----------+----------------------+--------------------+------------+-------------+
+| 8         | 25                   | 64                 | 89         | 719.1       |
++-----------+----------------------+--------------------+------------+-------------+
+| 8         | 25                   | 128                | 153        | 836.6       |
++-----------+----------------------+--------------------+------------+-------------+
+
+When the transaction is short, the cost of the transaction interval is relatively high. If possible, combine data
+into one transaction to get a higher transfer speed.
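
The arithmetic behind the table above can be reproduced with a few lines of plain C. This is a minimal sketch under stated assumptions: the function names are hypothetical (they are not part of the driver), the transaction interval is passed in as a parameter rather than measured, and 1 kB is taken as 1000 bytes.

```c
#include <assert.h>

/* Time of one transaction in microseconds: a fixed interval plus
 * 8 clock periods per byte. With the frequency given in MHz, one
 * clock period is 1/freq_mhz microseconds. */
static double spi_transaction_time_us(double interval_us, double freq_mhz,
                                      unsigned n_bytes)
{
    assert(freq_mhz > 0.0);
    return interval_us + 8.0 * n_bytes / freq_mhz;
}

/* Effective speed in kB/s. Bytes per microsecond equals MB/s,
 * so multiply by 1000 to get kB/s. */
static double spi_throughput_kbps(double interval_us, double freq_mhz,
                                  unsigned n_bytes)
{
    return 1000.0 * n_bytes /
           spi_transaction_time_us(interval_us, freq_mhz, n_bytes);
}
```

With a 25us interval and an 8MHz clock, a 16-byte transaction takes 41us, which corresponds to about 390 kB/s; a 128-byte transaction takes 153us, or about 837 kB/s.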
+
+.. _timing_considerations:
+
+Timing considerations
+^^^^^^^^^^^^^^^^^^^^^
+Due to the input delay of the MISO pin, the ESP32 SPI master cannot read data at very high speed. The frequency allowed is
+rather low when the GPIO matrix is used. Currently only frequencies not greater than 8.8MHz are fully supported. For
+higher frequencies, you have to use the native pins or the *dummy bit workaround*.
+
+.. _dummy_bit_workaround:
+
+**Dummy bit workaround:** We can insert dummy clocks (during which the host does not read data) before the read phase
+actually begins. The slave still sees the dummy clocks and gives out data, but the host does not read until the read
+phase. This compensates for the lack of MISO setup time required by the host, allowing the host to read at a higher
+frequency.
+
+The maximum frequency (in MHz) at which the host can read (or read and write) under different conditions is as below:
+
++-------------+-------------+-----------+-----------------------------+
+| Frequency Limit           | Dummy Bits| Comments                    | 
++-------------+-------------+ Used      +                             +
+| GPIO matrix | Native pins | By Driver |                             |
++=============+=============+===========+=============================+
+| 8.8         | N.M.        | 0         |                             |
++-------------+-------------+-----------+-----------------------------+
+| N.M.        | N.M.        | 1         | Half Duplex, no DMA allowed |
++-------------+-------------+-----------+                             +
+| N.M.        | N.M.        | 2         |                             |
++-------------+-------------+-----------+-----------------------------+
+
+N.M.: Not Measured Yet.
+
+If the host only writes, the *dummy bit workaround* is not used and the frequency limit is as below:
+
++-------------+----------------------+
+| GPIO matrix | Native pins          |
++=============+======================+
+| 40          | 80                   |
++-------------+----------------------+
+
+.. _spi_known_issues:
+
+Known Issues
+------------
+
+1. Half duplex mode is not compatible with DMA when both writing and reading phases exist.
+
+   If such transactions are required, you have to use one of the alternative solutions:
+
+   1. use full-duplex mode instead.
+   2. disable the DMA by setting the last parameter to 0 in the bus initialization function, as below:
+      ``ret=spi_bus_initialize(VSPI_HOST, &buscfg, 0);``  
+
+      this may prevent you from transmitting and receiving data longer than 32 bytes.
+   3. try to use command and address field to replace the write phase.
+
+2. Full duplex mode is not compatible with the *dummy bit workaround*, hence the frequency is limited. See :ref:`dummy
+   bit speed-up workaround <dummy_bit_workaround>`.
+
+
 Application Example
 -------------------