long-locked xSemaphoreCreateMutex() causes crash when used with wifi #10561

@dhalbert

Description

This is a generalization of #10520, and supersedes it.

This program will hang and then hard crash on ESP32-S2 and ESP32-S3, in somewhat different ways (see below).

Creating either an I2C or SPI object and then locking it causes the issue. Both use an xSemaphoreCreateMutex(). The try_lock() is necessary: it does an xSemaphoreTake() on the mutex, and the lock is then held for the rest of the program.

(I suggest you import this program manually, instead of using code.py, so you can control the failure.)

import adafruit_requests
import board
import socketpool
import ssl
import time
import wifi

# print("create SPI")
# spi = board.SPI()
# print("lock SPI")
# spi.try_lock()

print("create I2C")
i2c = board.I2C()
# print("lock I2C")
i2c.try_lock()

pool = socketpool.SocketPool(wifi.radio)
requests = adafruit_requests.Session(pool, ssl.create_default_context())

# An HTTPS request more reliably hangs, probably due to duration, but using HTTPS is not necessary.
#print(requests.get("http://wifitest.adafruit.com/testwifi/index.html").text)
print("Make network request")
print(requests.get("https://www.adafruit.com/api/quotes.php").text)

while True:
    print(time.time())
    time.sleep(1)

Either the I2C or the SPI case causes the problem. With a plain HTTP fetch instead of HTTPS, the problem occurs less reliably, but it can still occur.

On a Metro ESP32-S2, the program usually hangs during the fetch, and never gets to the time-printing loop. USB will disconnect and the board will reset.
On a Metro ESP32-S3, the network fetch usually succeeds, but ctrl-C'ing the time-printing loop doesn't work, and the board will crash eventually.

Originally this was noticed on a FunHouse, which is an ESP32-S2 with DotStars. The adafruit_funhouse library instantiated a DotStar object, which itself creates a busio.SPI. The code there grabbed the lock for the busio.SPI it created, and didn't give it up, since no one else was going to use it.
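
For comparison with the hold-forever pattern described above, here is a minimal sketch of holding a bus lock only for the duration of a transaction. `BusLock` and `FakeBus` are hypothetical names made up for illustration; the only thing assumed about the real bus is the `try_lock()` / `unlock()` pair that busio.I2C and busio.SPI provide. `FakeBus` exists only so the sketch runs off-device. Whether releasing the lock this way avoids the crash has not been verified; it may simply be a useful way to narrow down whether the long-held mutex is the trigger.

```python
class BusLock:
    """Hypothetical helper: hold a bus lock only inside a `with` block."""

    def __init__(self, bus):
        self._bus = bus

    def __enter__(self):
        # try_lock() returns False while someone else holds the lock,
        # so spin until it is acquired.
        while not self._bus.try_lock():
            pass
        return self._bus

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release, even if the block raised.
        self._bus.unlock()
        return False  # do not swallow exceptions


class FakeBus:
    """Hypothetical stand-in for busio.I2C / busio.SPI (not a real driver)."""

    def __init__(self):
        self.locked = False

    def try_lock(self):
        if self.locked:
            return False
        self.locked = True
        return True

    def unlock(self):
        self.locked = False


bus = FakeBus()  # on a board this would be board.I2C() or board.SPI()

with BusLock(bus) as b:
    held_inside = b.locked  # lock is held here; do the bus transaction

held_after = bus.locked     # lock has been released again
print(held_inside, held_after)
```

With this shape, a network request made outside the `with` block runs while the mutex is not held, which is the opposite of what the failing program does.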

Doing a network fetch is necessary. I poked around a bit there: the socketpool code, which is invoked by doing a fetch, does some stuff with task notifications. That may be a cause and a clue.
