@samblenny samblenny commented Sep 6, 2025

This is a second attempt at

with a smaller set of changes to usb.core.Device error handling:

  1. Instances where 0xff (not a valid enum value) was assigned to the static xfer_result_t _xfer_result; variable (an enum type) are converted to using XFER_RESULT_INVALID (a valid enum value). Both values were used to indicate the condition of waiting for a callback to finish. This change makes usage in usb.core.Device consistent with usage in the TinyUSB implementation.
  2. Callback result code checks are converted from partial coverage using if statements to a switch statement that fully covers all the possible enum values. In particular, this new code will raise a USBError exception in the case of result == XFER_RESULT_FAILED (old behavior was to return the buffer's previous contents -- all zeros for a freshly allocated array or undefined garbage for a previously used array -- without indicating any error condition).
  3. Factor out timeout and result checking code that was previously duplicated in _xfer() and common_hal_usb_core_device_ctrl_transfer().
  4. Fix control transfer return value to give actual length (was returning requested length)
  5. Fix missing error handling for usb.core.Device.idVendor and usb.core.Device.idProduct
  6. [edit 9/6] Fix missing error handling for usb.core.Device.product, usb.core.Device.manufacturer, and usb.core.Device.serial_number
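The behavior change in item 2 can be modeled in a few lines of Python. This is only a sketch of the idea, not the C code from shared-module/usb/core/Device.c: the numeric constant values, the helper name check_result, the stand-in exception classes, and the handling shown for the STALLED, TIMEOUT, and INVALID cases are all assumptions made for illustration. What the list above does state is that a successful transfer now returns the actual length and that XFER_RESULT_FAILED now raises USBError.

```python
# Illustrative stand-ins for TinyUSB's xfer_result_t values (numbers are arbitrary)
XFER_RESULT_SUCCESS = 0
XFER_RESULT_FAILED = 1
XFER_RESULT_STALLED = 2
XFER_RESULT_TIMEOUT = 3
XFER_RESULT_INVALID = 4  # placeholder meaning "callback has not finished yet"


class USBError(OSError):
    """Stand-in for usb.core.USBError so the sketch runs off-device."""


class USBTimeoutError(USBError):
    """Stand-in for usb.core.USBTimeoutError."""


def check_result(result, actual_len):
    """Map every result code to exactly one outcome (hypothetical helper)."""
    if result == XFER_RESULT_SUCCESS:
        # Report the bytes actually transferred, not the requested length
        return actual_len
    if result == XFER_RESULT_FAILED:
        # Stated in the PR: a failed transfer raises instead of returning stale data
        raise USBError()
    if result == XFER_RESULT_STALLED:
        raise USBError("stalled")  # assumption, not taken from the PR text
    if result in (XFER_RESULT_TIMEOUT, XFER_RESULT_INVALID):
        raise USBTimeoutError()  # assumption: transfer timed out or is still pending
    raise ValueError("unknown result code")
```

The point of covering every value explicitly is that no result code can fall through and leave the caller holding a buffer that was never filled.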

Checks:

  • pre-commit
  • build & run for Fruit Jam

Testing

Test Code:

import displayio
import gc
import usb
import time
from usb.core import USBError, USBTimeoutError

def test_unplug_during_find():
    # This tests how repeated calls to usb.core.find() behave in the
    # context of unplugging a USB device.
    cache = {}
    print("Finding USB devices...")
    while True:
        try:
            for device in usb.core.find(find_all=True):
                # Read 18 byte device descriptor
                desc = bytearray(18)
                device.ctrl_transfer(0x80, 6, 0x01 << 8, 0, desc, 300)
                # Validate descriptor
                key_ = tuple(desc)
                all_zero = all([b==0 for b in desc])
                if all_zero:
                    # Got bad data
                    print("bad data:", key_)
                    continue
                elif key_ in cache:
                    # Descriptor is valid but cached, skip device
                    continue
                # Otherwise, print properties and add to cache
                cache[key_] = True
                print_descriptor_properties(device)
                print(cache)
        except USBTimeoutError as e:
            print("USBTimeoutError: '%s'" % str(e))
        except USBError as e:
            print("USBError: '%s'; clear cache, sleep 20ms" % str(e))
            cache.clear()
            # This delay allows TinyUSB to recover from failures
            time.sleep(0.02)

def print_descriptor_properties(device):
    print()
    print(f"idVendor      {device.idVendor:04x}")
    print(f"idProduct     {device.idProduct:04x}")
    print(f"product       {device.product}")
    print(f"manufacturer  {device.manufacturer}")
    print(f"serial_number {device.serial_number}")

displayio.release_displays()
gc.collect()
test_unplug_during_find()

Test Hardware:

  • Adafruit Fruit Jam rev D
  • 8BitDo SN30 pro gamepad (as shown in example below)
  • Various other USB devices including the Adafruit generic SNES style gamepad

Test Results Before Changes (10.0.0-beta.3):

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
Finding USB devices...

idVendor      045e
idProduct     028e
product       Controller
manufacturer  Controller
serial_number Controller
{(18, 1, 0, 2, 255, 255, 255, 64, 94, 4, 142, 2, 20, 1, 1, 2, 3, 1): True}
bad data: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
bad data: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
bad data: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
bad data: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
bad data: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
bad data: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
...

Note how usb.core.Device.ctrl_transfer() begins returning all zero result buffers. This continues for as long as I let it run.

Test Results After Changes:

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
Finding USB devices...

idVendor      045e
idProduct     028e
product       Controller
manufacturer  Controller
serial_number Controller
{(18, 1, 0, 2, 255, 255, 255, 64, 94, 4, 142, 2, 20, 1, 1, 2, 3, 1): True}
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms

idVendor      057e
idProduct     2009
product       Pro Controller
manufacturer  Nintendo Co., Ltd.
serial_number 000000000001
{(18, 1, 0, 2, 0, 0, 0, 64, 126, 5, 9, 32, 0, 2, 1, 2, 3, 1): True}
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms

idVendor      045e
idProduct     028e
product       Controller
manufacturer  Controller
serial_number Controller
{(18, 1, 0, 2, 255, 255, 255, 64, 94, 4, 142, 2, 20, 1, 1, 2, 3, 1): True}
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms

idVendor      081f
idProduct     e401
product       USB gamepad           
manufacturer  None
serial_number None
{(18, 1, 0, 1, 0, 0, 0, 8, 31, 8, 1, 228, 6, 1, 0, 2, 0, 1): True}
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms
USBError: ''; clear cache, sleep 20ms

In this case, I plugged in an SN30 pro which has a weird startup sequence where it disconnects itself and swaps device descriptors a couple times. The first 3 descriptor info prints are from plugging in the SN30 pro. After that, I unplugged it, producing a series of 3 USBErrors. After that, I plugged in an Adafruit generic SNES style gamepad.

Note how there isn't any bad data, and the code can let TinyUSB auto-recover from unplugged (or self-disconnected) devices by sleeping for 20ms after it gets a USBError.
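The 18-byte tuples printed in the test output are raw USB device descriptors. As a reading aid, the sketch below decodes one of them using the standard device descriptor field layout. The function name decode_device_descriptor is hypothetical and is not part of usb.core.

```python
import struct


def decode_device_descriptor(desc):
    """Unpack an 18-byte USB device descriptor into a dict of its fields."""
    names = (
        "bLength", "bDescriptorType", "bcdUSB", "bDeviceClass",
        "bDeviceSubClass", "bDeviceProtocol", "bMaxPacketSize0",
        "idVendor", "idProduct", "bcdDevice", "iManufacturer",
        "iProduct", "iSerialNumber", "bNumConfigurations",
    )
    # Little-endian: B = 1-byte field, H = 2-byte field (18 bytes total)
    values = struct.unpack("<BBHBBBBHHHBBBB", bytes(desc))
    return dict(zip(names, values))


# First descriptor from the test output above
fields = decode_device_descriptor(
    (18, 1, 0, 2, 255, 255, 255, 64, 94, 4, 142, 2, 20, 1, 1, 2, 3, 1)
)
print("%04x %04x" % (fields["idVendor"], fields["idProduct"]))  # 045e 028e
```

The byte pairs (94, 4) and (142, 2) are little-endian, so they decode to 0x045e and 0x028e, matching the idVendor and idProduct lines printed for that device.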

[edit 9/6: added serial_number to test code and test results]

shared-module/usb/core/Device.c was using its own 0xff value for
the xfer_result_t enum defined by tinyusb/src/common/tusb_types.h.
The 0xff value served the same purpose as the already existing
XFER_RESULT_INVALID enum value (a placeholder to mark in-progress
transactions). This commit standardizes on XFER_RESULT_INVALID in
usb.core.Device consistent with the usage in tinyusb.

Making this change allows implementing `switch(result){...}` style
result code checks without compiler errors about 0xff not being a
valid value for the enum.

This directly translates the Device.ctrl_transfer() result check
logic from its old if-statements to an equivalent switch-statement.
The point is to make it clear how each possible result code is
handled. Note that XFER_RESULT_FAILED and XFER_RESULT_TIMEOUT both
return 0 without generating any exception. (But also, tinyusb may
not actually use XFER_RESULT_TIMEOUT if its comments are still
accurate.)

Previously this returned the requested transfer length argument,
ignoring the actual length of transferred bytes. This changes to
returning the actual length.

Previously, usb.core.Device.ctrl_transfer() did not raise an
exception when TinyUSB returned an XFER_RESULT_FAILED result code.
This change raises an exception which prevents a failed transfer
from returning an all zero result buffer.

The code for endpoint transfers and control transfers previously
had a bunch of duplicated logic for setup, timeout checking, and
result code error handling.

This change factors that stuff out into functions, using my new
transfer result error handling code from the last commit. Now the
endpoint and control transfers can share the setup, timeout, and
error check logic.

For me, this refactor reduced the firmware image size by 176 bytes.

Previously the idVendor and idProduct getters weren't checking to
see if TinyUSB reported a failure. Now they check.

Improve the error handling for usb.core.Device string properties:
.serial_number, .product, .manufacturer

Previously, the property getters didn't check the device descriptor
to see if the device actually had the requested string. Instead,
they relied on TinyUSB to return a failure result if the string was
not available. That made it impossible to distinguish missing
strings from other more serious USB errors (e.g. unplugged device).

These changes make it possible to return None for a missing string
or raise a USBError exception in case of a more serious problem.
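
The distinction described in this commit message can be illustrated in Python. This is a sketch of the idea rather than the C implementation: get_string_property and fetch_string are hypothetical names, and the byte offsets are those of iManufacturer, iProduct, and iSerialNumber in the standard 18-byte device descriptor. A string index of 0 means the device does not provide that string, so the getter can return None without starting a transfer, and any error from an actual fetch is then a real USB problem.

```python
# Offsets of the string index fields in a standard USB device descriptor
I_MANUFACTURER = 14
I_PRODUCT = 15
I_SERIAL_NUMBER = 16


def get_string_property(desc, offset, fetch_string):
    """Return None for a missing string, else fetch it (errors propagate)."""
    index = desc[offset]
    if index == 0:
        # Descriptor says the device does not provide this string
        return None
    # fetch_string is a hypothetical callable that performs the transfer
    # and raises on failure (e.g. the device was unplugged)
    return fetch_string(index)


# Descriptor of the generic gamepad from the test output
desc = (18, 1, 0, 1, 0, 0, 0, 8, 31, 8, 1, 228, 6, 1, 0, 2, 0, 1)
strings = {2: "USB gamepad"}
print(get_string_property(desc, I_MANUFACTURER, strings.__getitem__))  # None
print(get_string_property(desc, I_PRODUCT, strings.__getitem__))  # USB gamepad
```

This matches the test results for that gamepad, where manufacturer and serial_number print as None while product prints a string.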
@samblenny
Author

Just added a commit to fix missing error handling for product, manufacturer, and serial_number string getter properties. Details of missing error checks at:

I'm reasonably confident these changes should be enough to resolve issue #10553

@RetiredWizard

With this version of Device.c I get the error below when starting the Fruit Jam OS launcher with an unlabeled generic USB mouse. The "absolute newest" 10.x version doesn't have a problem starting up with this mouse (although I think it may not always see the mouse initially, once you start moving it everything seems to start working).

Note that you have to add the "use_mouse" parameter to the launcher.conf.json file to test mouse use in the launcher.

Traceback (most recent call last):
  File "code.py", line 120, in <module>
usb.core.USBError: 

The Fruit Jam OS launcher starts fine with the firmware from this PR using the "3D Optical Mouse" I believe I purchased from the Adafruit Shop.

Removing any mouse while the launcher is running causes:

Traceback (most recent call last):
  File "code.py", line 522, in <module>
usb.core.USBError: 
inside atexit callback

This exact same message occurs using the "absolute newest 10.x" firmware from Circuitpython.org as well though.

I tried hot swapping in the editor and PyPaint apps, and both the "absolute newest 10.x" and the new Device.c version seem to work fine when unplugging a mouse and plugging in a different one (even the generic one) while the app is running.

I think both of these crashes could be better handled in the applications and are not necessarily core issues, as the core is now properly raising USBErrors. From the application standpoint the errors aren't necessarily fatal and should be caught and handled. The reason the launcher wasn't crashing with my generic mouse before this PR is that the core wasn't raising an error when it couldn't retrieve the PID or VID; it just returned None as the values. When this happened the applications would either retry or simply ignore the issue.

@samblenny
Author

With this version of Device.c I get this error starting Fruit Jam OS launcher...

If I understand your description correctly, that sounds like what I would expect from CircuitPython code that is using the usb.core API without anticipating the possibility of USBError and USBTimeoutError. In my experience, it works best to:

  1. Have an outer loop that will retry usb.core.find(), usb.core.Device.set_configuration(), etc. when any of the code in either of the inner or outer loop gets interrupted by a USBError
  2. Have an inner loop that will retry usb.core.Device.read() and so on if it gets a USBTimeoutError

@samblenny
Author

samblenny commented Sep 7, 2025

Also, I suppose it might be a good idea to include exception information in the documentation at:
https://docs.circuitpython.org/en/latest/shared-bindings/usb/core/index.html#usb.core.Device

The current docs mention that USBError and USBTimeoutError exist, but there's no indication of which methods and properties can raise them under which circumstances. Basically, most of the methods and properties touch the TinyUSB tuh_* API. That means they can get a failure result back from TinyUSB when attempting to access a device that hasn't been fully enumerated and addressed yet, and that seems to be a relatively common occurrence.

@samblenny
Author

samblenny commented Sep 7, 2025

There may be a race condition in usb.core.find() where it's relying on TinyUSB's tuh_mount_cb() and tuh_umount_cb() callbacks to track device addresses. Those callbacks can get delayed by the mysterious thing that goes away if you wait for 20ms after getting a USBError.

Currently, _next_device() from shared-bindings/usb/core/__init__.c relies on common_hal_usb_core_device_construct() from shared-module/usb/core/Device.c to decide which addresses should be included in the find() iterator. The filter in common_hal_usb_core_device_construct() relies on what it knows from the mount/umount callbacks. Instead of relying on callbacks, it could call tuh_connected() and tuh_mounted() from the TinyUSB API.

[edit: actually, it's more complicated. See next comment.]

@samblenny
Author

samblenny commented Sep 7, 2025

Actually, it's way more complicated than I thought. In TinyUSB, there's a device state progression from connected to addressed to configured.

  • The USBError for vid / pid failure results originate from tuh_vid_pid_get() using the TU_VERIFY() macro to check for the addressed state, which gets set in an early stage of the device enumeration process.
  • tuh_mount_cb() gets invoked when TinyUSB's class driver feature has finished configuring a driver for a device, and that happens at the final stage of the enumeration process. According to the TinyUSB comments, class driver setup is asynchronous so it could plausibly span many calls to tuh_task(). This might be why the 20ms delay resolves the CircuitPython problem with tuh_mount_cb() and tuh_umount_cb() not getting called.
  • tuh_mounted() checks for the configured state which is approximately equivalent to tuh_mount_cb() in that it indicates a class driver has been configured.

@samblenny
Author

samblenny commented Sep 7, 2025

By my reading of the code, it seems CircuitPython's usb.core.find() is usually racing with TinyUSB's automatic interface configuration and asynchronous class driver binding mechanism. Also, it seems that TinyUSB's class drivers may be operating in parallel with CircuitPython USB drivers?!?

  1. tuh_mount_cb() and tuh_umount_cb() have the wrong semantics for the way usb.core.find() is using them. What find() actually needs is a way to check if the device has made it to the end of TinyUSB's ENUM_GET_FULL_CONFIG_DESC state of the device enumeration process.
  2. By my reading of TinyUSB's process_enumeration() function, there may not be any way to stop TinyUSB from automatically attempting to configure interfaces and bind class drivers in the ENUM_SET_CONFIG and ENUM_CONFIG_DRIVER device enumeration states.
  3. CircuitPython doesn't make use of the TinyUSB class drivers, so having them activate automatically seems like, at minimum, a waste of CPU. But, there could also be lots of subtle interactions between TinyUSB class drivers and CircuitPython drivers. I wonder if this is why some MIDI messages disappear.

@samblenny
Author

When I used ripgrep to check the circuitpython repo, including lib/tinyusb submodule, for TinyUSB's MIDI driver configuration option (rg CFG_TUH_MIDI), it looks like the MIDI driver is not enabled by default. Could be wrong about that.

@samblenny
Author

An alternate way of looking at the automatic class driver binding thing is that it's similar to kernel device drivers with desktop PyUSB. From that perspective, usb.core.Device.detach_kernel_driver() should be able to unbind TinyUSB drivers (which it is not currently able to do). See:

@samblenny
Author

samblenny commented Sep 8, 2025

To bring this back into sharper focus on error handling... I'm concerned about the automatic TinyUSB class driver binding stuff because it's redundant (drivers are active but CircuitPython doesn't use them), it's asynchronous (relies on RUN_BACKGROUND_TASKS), and the usb.core.find() implementation currently depends on class driver callbacks (tuh_mount_cb() and tuh_umount_cb()).

The current find() implementation sometimes fails to find devices or finds devices that give bad data (before this PR's changes) or USBError (after this PR's changes) if you try to use them. This could probably be improved by modifying find(), and probably also TinyUSB, so that CircuitPython can directly observe the TinyUSB enumeration state machine to better understand when devices are not yet fully enumerated. It's also possible that TinyUSB class drivers are interacting badly with CircuitPython usb.core.Device implementations and causing some of the USBError stuff.

To summarize, the changes in this PR make find() work somewhat better, but it still has some problems. Addressing those problems effectively will probably require coordinated changes in TinyUSB and CircuitPython, but that will require API design choices. There are at least a couple of possible approaches (TinyUSB config option to disable class drivers, implement detach_kernel_driver() to keep the possibility of using MIDI class driver open, etc). I could work on that, but I'd need core dev feedback on what would be suitable. It might also be the case that Thach should be the one to work on that stuff.

@samblenny
Author

Thinking about the TinyUSB enumeration state machine some more, and considering what RetiredWizard mentioned on Discord over the weekend about additional control transfer in progress mutex logic... It seems plausible the TinyUSB enumeration state machine might not be adequately keeping track of when a control transfer is in progress. The enumeration process makes a bunch of async control transfers, and it also has some provisions for debouncing as a solid physical connection is established at the USB jack. It's possible there's a bug in that stuff somewhere that doesn't cancel a control transfer when it should.

@samblenny
Author

Turns out it's even more complicated... I was wrong about CircuitPython not using TinyUSB class drivers. Reading more carefully, supervisor/shared/usb/tusb_config.h defines non-zero values for CFG_TUH_HID (hid keyboard class driver option) and CFG_TUH_HUB (hub class driver option). Those options control the creation of TinyUSB's static array of class driver structs in tinyusb/src/host/usbh.c.

So, that eliminates my idea of modifying TinyUSB to entirely disable class drivers. Seems like we might need to build a way to detach the HID driver in cases where we don't want the supervisor to manage the device. For example, how does the HID class driver handle HID gamepads?

@samblenny
Author

Looks like TinyUSB actually has a relatively elaborate HID class driver that covers keyboards, mice, and HID gamepads: tinyusb/src/class/hid/hid.h

Maybe it would be better to implement a way of accessing the HID class driver rather than detaching it?

@samblenny
Author

samblenny commented Sep 9, 2025

To make use of the HID class driver for gamepad input, seems like a possibility would be to modify tuh_hid_report_received_cb() in circuitpython/supervisor/shared/usb/host_keyboard.c to check for gamepad reports and divert them to a buffer somewhere that could be read with some kind of stream interface by a new HID gamepad API. Same approach would probably also work for mice.

Might be a terrible idea. Dunno. Seems plausibly doable though.

In any case, my current revised opinion is that usb.core.Device.detach_kernel_driver() should actually work fine to detach the HID class driver for any HID device, not just keyboards.

So, with the idea of leaving that path open for a possible future PR, that considerably narrows down what changes might be suitable for find(). Basically, seems like the thing to do is find and fix any bugs in TinyUSB's enumeration state machine for the async class driver binding stuff.

Member

@tannewt tannewt left a comment

This looks like an improvement to me. Thanks!

I think the related issue is a better place for discussion about other aspects of the USB debugging you are doing (like the mystery delay.)

@tannewt tannewt merged commit 77bd2db into adafruit:main Sep 9, 2025
397 checks passed