Skip to content

rp2040 changes for RP6502#3582

Open
rumbledethumps wants to merge 6 commits intohathach:masterfrom
rumbledethumps:rp2040-hcd-epx
Open

rp2040 changes for RP6502#3582
rumbledethumps wants to merge 6 commits intohathach:masterfrom
rumbledethumps:rp2040-hcd-epx

Conversation

@rumbledethumps
Copy link
Copy Markdown

This is the diff you asked for. I've been leaving my overrides an AI mess because of the moving target. Feel free to close this after you've reviewed and picked out the good bits.

Copilot AI review requested due to automatic review settings April 2, 2026 04:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts the RP2040 native USB host/device implementation to improve robustness when sharing the single host “EPX” endpoint across multiple devices (notably EP0/control traffic), adds additional error/timeout handling in the host ISR, and tweaks RP2040/RP2350 SDK compatibility.

Changes:

  • Replace short cycle-based “concurrent access” delays with busy_wait_us(1) in several USB register write paths.
  • Rework host EP0 handling to always use the shared EPX, track per-device EP0 max packet size, and reduce EPX monopolization between control phases.
  • Implement missing host endpoint close/abort/clear-stall logic and add additional ISR handling/clearing for several host error conditions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File Description
src/portable/raspberrypi/rp2040/rp2040_usb.c Changes buffer-control “concurrent access” delay behavior when setting AVAILABLE.
src/portable/raspberrypi/rp2040/hcd_rp2040.c Substantial host-side changes: EP0 MPS tracking, ISR robustness improvements, endpoint lifecycle implementation, and interrupt endpoint slot allocation changes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 121 to 128
// Section 4.1.2.7.1 (rp2040) / 12.7.3.7.1 (rp2350) Concurrent access: after write to buffer control,
// wait for USB controller to see the update before setting AVAILABLE.
// Don't need delay in host mode as host is in charge of when to start the transaction.
if (value & (USB_BUF_CTRL_AVAIL | (USB_BUF_CTRL_AVAIL << 16))) {
if (!rp2usb_is_host_mode()) {
busy_wait_at_least_cycles(12);
busy_wait_us(1);
}
*buf_reg = value; // then set AVAILABLE bit last
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

busy_wait_us(1) replaces the previous 12-cycle wait on the device-side AVAILABLE write path. This adds ~1µs of busy-waiting in an IRQ hot path per buffer arm and may significantly reduce device throughput/latency. Consider reverting to a minimal cycle-based wait (e.g. busy_wait_at_least_cycles(12)) or otherwise justify/guard the longer delay.

Copilot uses AI. Check for mistakes.
Comment on lines 139 to 145
// Section 4.1.2.7.1 (rp2040) / 12.7.3.7.1 (rp2350) Concurrent access
if (value & USB_BUF_CTRL_AVAIL) {
if (!rp2usb_is_host_mode()) {
busy_wait_at_least_cycles(12);
busy_wait_us(1);
}
*buf_reg16 = value; // then set AVAILABLE bit last
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: switching from a 12-cycle delay to busy_wait_us(1) in the 16-bit bufctrl write adds a relatively large busy-wait in an IRQ path. Prefer the datasheet-minimum cycle wait (or an equivalent minimal delay) to avoid unnecessary throttling.

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +109
int hcd_free_ep_count(void) {
int count = 0;
for (uint i = 1; i < TU_ARRAY_SIZE(ep_pool); i++) {
if (ep_pool[i].max_packet_size == 0) {
count++;
}
}
return count;
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a new non-static exported function hcd_free_ep_count(), but it is not declared in any header and is not referenced anywhere else in the repo. Keeping an unadvertised global symbol in a portable driver makes the surface area harder to reason about; consider making it static (if only for local diagnostics), gating it behind a debug macro, or removing it if unused.

Copilot uses AI. Check for mistakes.
// We write everything except the START_TRANS bit first, then wait some cycles.
usb_hw->sie_ctrl = sie_ctrl;
busy_wait_at_least_cycles(12);
busy_wait_us(1);
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

busy_wait_us(1) is used here where the RP2040 datasheet guidance is “wait some cycles” (previously 12 cycles). In an IRQ path, a fixed 1µs wait is a much larger stall than required and could impact scheduling/throughput under load. Consider returning to a minimal cycle-based wait (e.g. busy_wait_at_least_cycles(12)) or otherwise making the delay proportional to the datasheet requirement.

Suggested change
busy_wait_us(1);
busy_wait_at_least_cycles(12);

Copilot uses AI. Check for mistakes.
Comment on lines 650 to 668
ep->max_packet_size = max_packet_size;
ep->ep_addr = ep_addr;
ep->dev_addr = dev_addr;
ep->transfer_type = ep_desc->bmAttributes.xfer;
ep->need_pre = need_pre(dev_addr);
ep->next_pid = 0u;

// Remember EP0 max packet size so hcd_setup_send/hcd_edpt_xfer can restore it
if (tu_edpt_number(ep_addr) == 0 && dev_addr < TU_ARRAY_SIZE(_ep0_mps)) {
_ep0_mps[dev_addr] = (uint8_t)max_packet_size;
}

if (ep->transfer_type != TUSB_XFER_INTERRUPT) {
ep->dpram_buf = usbh_dpram->epx_data;
} else {
// from 15 interrupt endpoints pool
// Scan ep_pool (not usb_hw->int_ep_ctrl) to find a free interrupt slot.
// int_ep_ctrl is zeroed during EPX suppression and by abort, so using it
// as a free-slot bitmap causes interrupt_num collisions between devices.
uint8_t int_idx;
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential stale interrupt_num: when opening a non-interrupt endpoint, interrupt_num is never explicitly cleared. Since endpoints are recycled by edpt_alloc() using only max_packet_size==0, a slot previously used for an interrupt endpoint can be reused for bulk/iso while retaining a nonzero interrupt_num, causing later xfers to go down the interrupt-endpoint path incorrectly. Ensure interrupt_num is set to 0 for non-interrupt endpoints (and/or fully cleared when freeing an endpoint slot).

Copilot uses AI. Check for mistakes.
usbh_dpram->epx_buf_ctrl = 0;
}
}
rp2usb_reset_transfer(ep);
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When closing an endpoint, the slot is marked unused by ep->max_packet_size = 0, but ep->interrupt_num is left unchanged. Because edpt_alloc() reuses slots based on max_packet_size==0, a later non-interrupt endpoint can inherit a stale nonzero interrupt_num and be treated as an interrupt endpoint. Clear interrupt_num (and ideally other per-endpoint identity fields) when freeing the slot.

Suggested change
rp2usb_reset_transfer(ep);
rp2usb_reset_transfer(ep);
ep->interrupt_num = 0;

Copilot uses AI. Check for mistakes.
Comment on lines 737 to +741
bool hcd_edpt_abort_xfer(uint8_t rhport, uint8_t dev_addr, uint8_t ep_addr) {
(void)rhport;
(void)dev_addr;
(void)ep_addr;
// TODO not implemented yet
return false;
hw_endpoint_t *ep = edpt_find(dev_addr, ep_addr);
if (!ep) return true;

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hcd_edpt_abort_xfer() returns true even when edpt_find() fails. This conflicts with the HCD API contract in src/host/hcd.h (return false if there is no transfer to abort) and makes it harder for callers to detect incorrect state. Consider returning false when the endpoint isn't found and only returning true when an actual queued/active transfer was cleared.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner

@hathach hathach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the PR, can you give a bit of explanation e.g what this pr improve and/or which issue it resolves.

@rumbledethumps
Copy link
Copy Markdown
Author

These are the problems my changes address. Keep in mind I have a complete MSC implementation with BOT, CBI, LUN, removable media, LBA64, and even TRIM support so I hit code parts that the examples do not, especially concerning removable media error recovery.

  • busy_wait_at_least_cycles uses the CPU clock instead of the SIE clock
  • Devices refuse to mount, e.g. a hub with MSC-BOT, CDC, and HID won't mount the HID.
  • Removing devices from hub causes panic hcd_clear_stall.
  • Too many endpoints allocated, one extra EP0 per device is allocated.
  • Entire close and abort path is missing!!!
  • hcd_free_ep_count() added because pool is so tiny and user needs to know.

@rumbledethumps
Copy link
Copy Markdown
Author

I did some math and busy_wait_at_least_cycles(32) should be safe up to 512 MHz. The original 12 cycles can fail intermittently as you pass 192 MHz, which a lot of projects do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants