Conversation
There was a problem hiding this comment.
Pull request overview
This PR adjusts the RP2040 native USB host/device implementation to improve robustness when sharing the single host “EPX” endpoint across multiple devices (notably EP0/control traffic), adds additional error/timeout handling in the host ISR, and tweaks RP2040/RP2350 SDK compatibility.
Changes:
- Replace short cycle-based “concurrent access” delays with
busy_wait_us(1)in several USB register write paths. - Rework host EP0 handling to always use the shared EPX, track per-device EP0 max packet size, and reduce EPX monopolization between control phases.
- Implement missing host endpoint close/abort/clear-stall logic and add additional ISR handling/clearing for several host error conditions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
src/portable/raspberrypi/rp2040/rp2040_usb.c |
Changes buffer-control “concurrent access” delay behavior when setting AVAILABLE. |
src/portable/raspberrypi/rp2040/hcd_rp2040.c |
Substantial host-side changes: EP0 MPS tracking, ISR robustness improvements, endpoint lifecycle implementation, and interrupt endpoint slot allocation changes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Section 4.1.2.7.1 (rp2040) / 12.7.3.7.1 (rp2350) Concurrent access: after write to buffer control, | ||
| // wait for USB controller to see the update before setting AVAILABLE. | ||
| // Don't need delay in host mode as host is in charge of when to start the transaction. | ||
| if (value & (USB_BUF_CTRL_AVAIL | (USB_BUF_CTRL_AVAIL << 16))) { | ||
| if (!rp2usb_is_host_mode()) { | ||
| busy_wait_at_least_cycles(12); | ||
| busy_wait_us(1); | ||
| } | ||
| *buf_reg = value; // then set AVAILABLE bit last |
There was a problem hiding this comment.
busy_wait_us(1) replaces the previous 12-cycle wait on the device-side AVAILABLE write path. This adds ~1µs of busy-waiting in an IRQ hot path per buffer arm and may significantly reduce device throughput/latency. Consider reverting to a minimal cycle-based wait (e.g. busy_wait_at_least_cycles(12)) or otherwise justify/guard the longer delay.
| // Section 4.1.2.7.1 (rp2040) / 12.7.3.7.1 (rp2350) Concurrent access | ||
| if (value & USB_BUF_CTRL_AVAIL) { | ||
| if (!rp2usb_is_host_mode()) { | ||
| busy_wait_at_least_cycles(12); | ||
| busy_wait_us(1); | ||
| } | ||
| *buf_reg16 = value; // then set AVAILABLE bit last | ||
| } |
There was a problem hiding this comment.
Same as above: switching from a 12-cycle delay to busy_wait_us(1) in the 16-bit bufctrl write adds a relatively large busy-wait in an IRQ path. Prefer the datasheet-minimum cycle wait (or an equivalent minimal delay) to avoid unnecessary throttling.
| int hcd_free_ep_count(void) { | ||
| int count = 0; | ||
| for (uint i = 1; i < TU_ARRAY_SIZE(ep_pool); i++) { | ||
| if (ep_pool[i].max_packet_size == 0) { | ||
| count++; | ||
| } | ||
| } | ||
| return count; | ||
| } |
There was a problem hiding this comment.
This introduces a new non-static exported function hcd_free_ep_count(), but it is not declared in any header and is not referenced anywhere else in the repo. Keeping an unadvertised global symbol in a portable driver makes the surface area harder to reason about; consider making it static (if only for local diagnostics), gating it behind a debug macro, or removing it if unused.
| // We write everything except the START_TRANS bit first, then wait some cycles. | ||
| usb_hw->sie_ctrl = sie_ctrl; | ||
| busy_wait_at_least_cycles(12); | ||
| busy_wait_us(1); |
There was a problem hiding this comment.
busy_wait_us(1) is used here where the RP2040 datasheet guidance is “wait some cycles” (previously 12 cycles). In an IRQ path, a fixed 1µs wait is a much larger stall than required and could impact scheduling/throughput under load. Consider returning to a minimal cycle-based wait (e.g. busy_wait_at_least_cycles(12)) or otherwise making the delay proportional to the datasheet requirement.
| busy_wait_us(1); | |
| busy_wait_at_least_cycles(12); |
| ep->max_packet_size = max_packet_size; | ||
| ep->ep_addr = ep_addr; | ||
| ep->dev_addr = dev_addr; | ||
| ep->transfer_type = ep_desc->bmAttributes.xfer; | ||
| ep->need_pre = need_pre(dev_addr); | ||
| ep->next_pid = 0u; | ||
|
|
||
| // Remember EP0 max packet size so hcd_setup_send/hcd_edpt_xfer can restore it | ||
| if (tu_edpt_number(ep_addr) == 0 && dev_addr < TU_ARRAY_SIZE(_ep0_mps)) { | ||
| _ep0_mps[dev_addr] = (uint8_t)max_packet_size; | ||
| } | ||
|
|
||
| if (ep->transfer_type != TUSB_XFER_INTERRUPT) { | ||
| ep->dpram_buf = usbh_dpram->epx_data; | ||
| } else { | ||
| // from 15 interrupt endpoints pool | ||
| // Scan ep_pool (not usb_hw->int_ep_ctrl) to find a free interrupt slot. | ||
| // int_ep_ctrl is zeroed during EPX suppression and by abort, so using it | ||
| // as a free-slot bitmap causes interrupt_num collisions between devices. | ||
| uint8_t int_idx; |
There was a problem hiding this comment.
Potential stale interrupt_num: when opening a non-interrupt endpoint, interrupt_num is never explicitly cleared. Since endpoints are recycled by edpt_alloc() using only max_packet_size==0, a slot previously used for an interrupt endpoint can be reused for bulk/iso while retaining a nonzero interrupt_num, causing later xfers to go down the interrupt-endpoint path incorrectly. Ensure interrupt_num is set to 0 for non-interrupt endpoints (and/or fully cleared when freeing an endpoint slot).
| usbh_dpram->epx_buf_ctrl = 0; | ||
| } | ||
| } | ||
| rp2usb_reset_transfer(ep); |
There was a problem hiding this comment.
When closing an endpoint, the slot is marked unused by ep->max_packet_size = 0, but ep->interrupt_num is left unchanged. Because edpt_alloc() reuses slots based on max_packet_size==0, a later non-interrupt endpoint can inherit a stale nonzero interrupt_num and be treated as an interrupt endpoint. Clear interrupt_num (and ideally other per-endpoint identity fields) when freeing the slot.
| rp2usb_reset_transfer(ep); | |
| rp2usb_reset_transfer(ep); | |
| ep->interrupt_num = 0; |
| bool hcd_edpt_abort_xfer(uint8_t rhport, uint8_t dev_addr, uint8_t ep_addr) { | ||
| (void)rhport; | ||
| (void)dev_addr; | ||
| (void)ep_addr; | ||
| // TODO not implemented yet | ||
| return false; | ||
| hw_endpoint_t *ep = edpt_find(dev_addr, ep_addr); | ||
| if (!ep) return true; | ||
|
|
There was a problem hiding this comment.
hcd_edpt_abort_xfer() returns true even when edpt_find() fails. This conflicts with the HCD API contract in src/host/hcd.h (return false if there is no transfer to abort) and makes it harder for callers to detect incorrect state. Consider returning false when the endpoint isn't found and only returning true when an actual queued/active transfer was cleared.
hathach
left a comment
There was a problem hiding this comment.
thank you for the PR, can you give a bit of explanation e.g what this pr improve and/or which issue it resolves.
|
These are the problems my changes address. Keep in mind I have a complete MSC implementation with BOT, CBI, LUN, removable media, LBA64, and even TRIM support so I hit code parts that the examples do not, especially concerning removable media error recovery.
|
|
I did some math and busy_wait_at_least_cycles(32) should be safe up to 512 MHz. The original 12 cycles can fail intermittently as you pass 192 MHz, which a lot of projects do. |
This is the diff you asked for. I've been leaving my overrides an AI mess because of the moving target. Feel free to close this after you've reviewed and picked out the good bits.