Fix unreliable writes to cyw43. #2209
| So if I understand the code correctly, these lines:
        dma_channel_configure(bus_data->dma_out, &out_config, &bus_data->pio->txf[bus_data->pio_sm], tx, tx_length / 4, true);
        uint32_t fdebug_tx_stall = 1u << (PIO_FDEBUG_TXSTALL_LSB + bus_data->pio_sm);
        bus_data->pio->fdebug = fdebug_tx_stall;
        pio_sm_set_enabled(bus_data->pio, bus_data->pio_sm, true);
        while (!(bus_data->pio->fdebug & fdebug_tx_stall)) {
            tight_loop_contents(); // todo timeout
        }
...are attempting to block until the state machine has consumed all of the TX data, before deconfiguring the state machine. The problem is that this code also detects the case where the TX FIFO has bottomed out for reasons other than all of the data having been transferred. In particular, it immediately falls through if no DMA write has yet taken place between the  If we think that is the failure mode, then the most direct fix would be this one:
diff --git a/src/rp2_common/pico_cyw43_driver/cyw43_bus_pio_spi.c b/src/rp2_common/pico_cyw43_driver/cyw43_bus_pio_spi.c
index bcc7284..a9f5970 100644
--- a/src/rp2_common/pico_cyw43_driver/cyw43_bus_pio_spi.c
+++ b/src/rp2_common/pico_cyw43_driver/cyw43_bus_pio_spi.c
@@ -309,8 +309,10 @@ int cyw43_spi_transfer(cyw43_int_t *self, const uint8_t *tx, size_t tx_length, u
         dma_channel_configure(bus_data->dma_out, &out_config, &bus_data->pio->txf[bus_data->pio_sm], tx, tx_length / 4, true);
 
         uint32_t fdebug_tx_stall = 1u << (PIO_FDEBUG_TXSTALL_LSB + bus_data->pio_sm);
-        bus_data->pio->fdebug = fdebug_tx_stall;
         pio_sm_set_enabled(bus_data->pio, bus_data->pio_sm, true);
+        dma_channel_wait_for_finish_blocking(bus_data->dma_out);
+        bus_data->pio->fdebug = fdebug_tx_stall;
         while (!(bus_data->pio->fdebug & fdebug_tx_stall)) {
             tight_loop_contents(); // todo timeout
         }
So the steps are: 
 For me this also fixes the reproducer from #2123. This also raises the question of why we are messing about with the DMA so much given this code is actually 100% blocking. | 
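A toy model of the race described above may help. This is only a sketch, not the SDK code: `bus_model_t`, `bus_tick`, `write_original` and `write_patched` are hypothetical names, and the one-word-per-tick DMA and state machine are simplifying assumptions. It also assumes the stall flag is raised on every tick in which the state machine finds the TX FIFO empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the TX path (hypothetical, not the real hardware). */
typedef struct {
    size_t fifo_level;    /* words currently in the TX FIFO */
    size_t dma_remaining; /* words the DMA still has to write */
    size_t dma_delay;     /* ticks before the first DMA write lands */
    size_t consumed;      /* words the state machine has shifted out */
    bool   tx_stall;      /* sticky flag, modelled on FDEBUG TXSTALL */
} bus_model_t;

/* Advance the model by one tick: DMA first, then the state machine. */
static void bus_tick(bus_model_t *b) {
    if (b->dma_delay > 0) {
        b->dma_delay--;              /* DMA held off, e.g. by bus contention */
    } else if (b->dma_remaining > 0) {
        b->dma_remaining--;
        b->fifo_level++;
    }
    assert(b->fifo_level <= 1);      /* invariant of this simplified model */
    if (b->fifo_level > 0) {
        b->fifo_level--;
        b->consumed++;
    } else {
        b->tx_stall = true;          /* empty FIFO: stall and latch the flag */
    }
}

/* Original ordering: clear the flag, then spin on it straight away.
 * Returns how many words were consumed when the wait loop exits. */
size_t write_original(size_t words, size_t dma_delay) {
    bus_model_t b = { .dma_remaining = words, .dma_delay = dma_delay };
    b.tx_stall = false;
    while (!b.tx_stall) {
        bus_tick(&b);
    }
    return b.consumed;
}

/* Patched ordering: wait for the DMA, clear the flag, then spin on it. */
size_t write_patched(size_t words, size_t dma_delay) {
    bus_model_t b = { .dma_remaining = words, .dma_delay = dma_delay };
    while (b.dma_remaining > 0) {    /* stands in for the blocking DMA wait */
        bus_tick(&b);
    }
    b.tx_stall = false;
    while (!b.tx_stall) {
        bus_tick(&b);
    }
    return b.consumed;
}
```

In this model `write_original(4, 2)` returns before any word has gone out, because the first stall is seen while the DMA is still held off, whereas `write_patched(4, 2)` only returns once all four words have been consumed.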
| Yes, I agree. dma_channel_wait_for_finish_blocking in the right place works. Still something funny going on with the bus data but I think that's a separate problem. | 
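The wait loop in the snippet above still carries a `// todo timeout` comment. One possible shape for a bounded wait is sketched below. The names `poll_fn_t`, `poll_with_limit` and `countdown_done` are hypothetical, and bounding by poll count rather than elapsed time is an assumption made only to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate type: returns true once the awaited condition holds. */
typedef bool (*poll_fn_t)(void *ctx);

/* Poll cond at most max_polls times.
 * Returns true if the condition became true, false if the limit was hit. */
bool poll_with_limit(poll_fn_t cond, void *ctx, uint32_t max_polls) {
    for (uint32_t i = 0; i < max_polls; i++) {
        if (cond(ctx)) {
            return true;
        }
    }
    return false;
}

/* Example predicate: reports true once the counter behind ctx has reached zero. */
bool countdown_done(void *ctx) {
    uint32_t *remaining = ctx;
    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

On real hardware the predicate would test the stall flag, and a time-based bound would probably be preferable to a raw poll count. The caller would also need to decide how to report the timeout.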
| 
 I guess that also explains why the bug appears / disappears with different compiler optimisation settings 👍 | 
| 
 The fields you are accessing here are read-only after initialisation, so IMO this is more likely to be a timing issue than anything to do with those particular accesses being volatile. | 
214f7ec to 53bd61c
    | 
 False alarm I think. The code could do with a refactor. But I should probably avoid doing that here? | 
53bd61c to 69453bb
    | Minor nitpick, but given Luke's diagnosis above, should the "The theory is that this flag will also get set if the bus is busy. So we mistakenly think a write to cyw43 has completed." part be removed from the commit message? | 
| 
 Nope, Peter's description there seems to match mine above. There's a race (or at least one), and it manifests as a bug when the bus timing is just right, e.g. if the DMA is held off because the processor is keeping FASTPERI busy. | 
| it seems to me like this code could still suffer races if the PIO is divided significantly. Why not just make the RX loop PIO label symbol public from the . | 
| 
 You mean this flag could be set by another state machine? | 
| i meant the clock, but either way, i think checking the address is better but i haven't dived deeply into this | 
| It's not immediately obvious that this could be done by reading the instruction address. For a write the wrap point is set to lp1_end. So when the write is complete the pio is stalled at "lp" which it will pass through normally. I'll update the description to reflect the actual behaviour. I'll test if things work at different clock speeds, but I suggest the current fix has the least risk at this point. This works with the current fix.  | 
We use a pio and dma to write to the cyw43 chip using spi. Normally you write an address and then read the data from that address, so the pio program does a write then read. If you just want to write data, as when uploading firmware, we use the fdebug_tx_stall flag to work out if the pio has stalled waiting to write more data. The theory is that this flag will also get set if the bus is busy. So we mistakenly think a write to cyw43 has completed. Wait for the dma write to complete before waiting for the pio to stall. Fixes raspberrypi#2206
69453bb to ea32e6d
    | I'm just a simple man, don't know much of that stuff, but will the last else if statement be reachable here: pico-sdk/src/rp2_common/pico_cyw43_driver/cyw43_bus_pio_spi.c Lines 236 to 328 in 3d746b3 
 given the fact that: 
 and: 
 Just asking, maybe it is supposed to be that way. | 
| No it won't. | 
This seems to work nicely with WiFi as of the upstream Pico SDK fix: raspberrypi/pico-sdk#2209
We use a pio and dma to write to the cyw43 chip using spi. Normally you write an address and then read the data from that address, so the pio program does a write then read.
If you just want to write data, as when uploading firmware, we use the fdebug_tx_stall flag to work out if the pio has stalled waiting to read data which will never arrive.
The theory is that this flag will also get set if the bus is busy. So we mistakenly think a write to cyw43 has completed.
Add a check for the dma irq as well.
Fixes #2206