Characterising ESPNOW #11757
Replies: 10 comments 23 replies
-
Thanks Peter. This is a very useful summary and generally accords with my own experience, although I haven't experienced the extended periods of failed messages that you have seen. Over the last few years I have run high-throughput tests over weeks and not seen that behaviour. However, I haven't done those long-term tests with S3 devices and could redo some of those tests with the current version of the module to check. When the extended periods of failed transmissions occurred, were you able to test if both incoming and outgoing messages were failing? I assume your tests were all with
-
My echo server runs this single task:

async def echo_server(e):
    async for mac, msg in e:
        print("Echo:", msg)
        try:
            await e.asend(mac, msg)
            asyncio.create_task(flash(GREEN))  # 1 second LED flash for range testing
        except OSError as err:
            if len(err.args) > 1 and err.args[1] == 'ESP_ERR_ESPNOW_NOT_FOUND':
                e.add_peer(mac)
                asyncio.create_task(flash(BLUE))  # Add peer
                await e.asend(mac, msg)
            else:
                asyncio.create_task(flash(RED))  # ???

The other peer runs two tasks, a transmitter that sends a JSON encoded message number and timestamp every 3 seconds and a receiver which runs an

The long sequences of failed messages are concerning. I changed channel from my WiFi's 3 to a quiet 10 but it had no effect (almost all of my WiFi traffic is on 5GHz). The long sequences typically occur for a few minutes after start; it then settles down with only occasional dropped messages. I live in a semi-rural area with reasonably low levels of RFI; other projects using 2.4GHz have not been affected in this way.

[EDIT]
The simple echo server didn't provide a way to do this. I'll do a test to establish this.
-
I've run a test where the echo server decodes the message, looks for missing message IDs, and appends the number missed to the message which it returns. The great majority of missing messages are outgoing, from the test script to the server. In 800 attempts, 170 were missed as seen by the test script and 167 were missed as seen by the server. So only three messages from server to script failed to be received.
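The gap counting described above could be sketched roughly as follows. This is not the actual test code: the message layout (a JSON list of message number and timestamp) and the names GapCounter and annotate are assumptions made for illustration.

```python
import json


class GapCounter:
    """Counts message IDs that never arrived, assuming IDs increase by 1."""

    def __init__(self):
        self.expected = None  # Next ID we expect; unknown until first message
        self.missed = 0  # Running total of IDs skipped so far

    def update(self, msg_id):
        if self.expected is not None and msg_id > self.expected:
            self.missed += msg_id - self.expected  # IDs skipped in the gap
        # A lower-than-expected ID (e.g. sender restart) just resyncs
        self.expected = msg_id + 1
        return self.missed


def annotate(counter, raw):
    """Decode an incoming message, append the server's miss count, re-encode."""
    msg = json.loads(raw)  # Assumed layout: [message_number, timestamp]
    msg.append(counter.update(msg[0]))
    return json.dumps(msg).encode()
```

The test script can then compare its own count of missing echoes with the count the server reports; the difference is the number lost on the return leg (170 - 167 = 3 in the run above).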
-
@glenn20 I have now replaced the S3 with the original ESP32 reference board. The echo server still runs on the S3. The problem has gone away. Now running >500 messages with zero loss. Round-trip latency max: 31ms min: 11ms. The problem seems to be with the S3. Incidentally using [EDIT]
-
Hi Peter. Maybe a silly guess, but are there microwave ovens nearby? It once happened to me in the office that a colleague warmed up his lunch... with ridiculous effects on the radio links.
-
@glenn20 Re ensuring that the channel number is fixed, have you seen #11819? I am using sta.config(channel=chan) but according to the PR this does not work. Issuing sta.config(reconnects=0) causes the application to fail to reconnect after a WiFi outage (with a WiFi timeout error). This is hard for me to test as my AP uses a fixed channel which I specify, so I'm reliant on you ESP32 gurus for advice here :) I'd appreciate any comments.
-
Hi Glenn. Please can you explain why it is not that useful for WiFi client devices? My scenario: in my area, channel 6 is congested, so on the micro I set channel 1. As far as I know, on 2.4 GHz the best channels are the outermost ones, 1 and 14, due to less overlap with the nearby ones. Isn't it a good solution? BTW, I am using ESP8266, so this is the sequence of operations:
-
I verified. In AP_IF, after setting the channel (step 2 of the above sequence), calling
-
Another observation that others might want to check. I have three Feather S3 boards. These have 8MB of PSRAM. I have found them to suffer bouts of failed ESPNow messages. It is hard to replicate this behaviour as they can work for hours. At other times, after power up, I get repeated errors before things settle down. Initially I thought I had a duff board, but it isn't as simple as that. The ESP32 reference board works very well - I've never seen this behaviour. The question in my mind is whether it is the S3 or a latency effect from SPIRAM.
-
@glenn20 I wondered this too. Two thoughts. GC on machines with SPIRAM takes 100ms in my measurements but others have measured 200ms. That is even if you manually trigger a GC immediately after a prior one. However the typical behaviour I've seen occurs immediately after boot, when there can't be enough garbage in 8MiB of RAM to trigger a GC. This occurs even with trivial scripts like echo servers. It's very sporadic: just when you're confident you've isolated it to one unit it rears its head again. I'll be very interested to see how you fare.
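For anyone wanting to compare GC figures on their own boards, a minimal timing sketch is below. It assumes MicroPython's time.ticks_ms and time.ticks_diff are available, and falls back to time.monotonic() so the same code can be tried under desktop Python; the function name time_gc is made up here.

```python
import gc
import time


def time_gc(runs=3):
    """Return a list of durations in ms for successive gc.collect() calls."""
    durations = []
    for _ in range(runs):
        if hasattr(time, "ticks_ms"):  # MicroPython
            start = time.ticks_ms()
            gc.collect()
            durations.append(time.ticks_diff(time.ticks_ms(), start))
        else:  # CPython fallback, for desktop testing only
            start = time.monotonic()
            gc.collect()
            durations.append((time.monotonic() - start) * 1000)
    return durations
```

Back-to-back calls show the cost of a collection when there is little or no garbage to reclaim, which is the case described above.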
-
I've started testing using aioespnow on ESP32-S3. Testing was with an echo server and a single peer. These are my initial observations.

The asend method is pessimistic. It sometimes returns False when the peer has received the message.

Reliability might be an issue. I chose a channel which is a long way from any used in my house, one where a scan revealed low levels of remote WiFi. Testing was over a room-to-room distance. Testing can run for long periods with only a rare missed message, but has bouts of failing repeatedly. This can even occur when the units are a metre apart.
There are two communications scenarios that often arise:
In principle it seems good for low latency work and the message integrity means that devising a protocol for reliability would be easy. But the long sequences of failures over many minutes are rather troubling and I haven't seen this behaviour over short distances with other wireless technologies. Any comments?
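As a rough illustration of the kind of reliability protocol meant here: because asend can return False for a message that did arrive, a sender that retries needs a receiver that discards repeats. The sketch below is only one possible approach, not tested on hardware. The names send_with_retry and Deduplicator are hypothetical, send stands in for a bound method such as e.asend, and it assumes each message carries an ID that the receiver can extract.

```python
import asyncio


async def send_with_retry(send, mac, msg, retries=3, delay=0.1):
    """Await send(mac, msg) until it returns True or retries run out."""
    for attempt in range(retries):
        if await send(mac, msg):
            return True
        if attempt < retries - 1:
            await asyncio.sleep(delay)  # Brief pause before the next attempt
    return False


class Deduplicator:
    """Receiver side: drop repeats of a message ID already seen from a peer."""

    def __init__(self):
        self.last_id = {}  # Maps peer MAC to the last message ID accepted

    def is_new(self, mac, msg_id):
        if self.last_id.get(mac) == msg_id:
            return False  # Retransmission of a message we already have
        self.last_id[mac] = msg_id
        return True
```

Remembering only the last ID per peer is enough when the sender does not move on to the next message until the current one has succeeded or been abandoned.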