Mesmerizer flashed from platformIO. No pixels lighting up. #423
Replies: 3 comments 1 reply
-
I'm away from my Mesmerizer for a few days. This would be a lovely chance
for one of our new members with a fresh Mesmerizer to help pick through the
logs of their devices and help compare notes.
On Fri, Sep 8, 2023 at 9:18 PM Keiran Hines ***@***.***> wrote:
I think I have soft bricked my board and wanted to know how to recover it.
The
The good news is that WAY too much is working for your board to be
considered a brick. It's clearly running Mesmerizer code, connecting with
the host just fine and so on.
It's POSSIBLE that you may have blown/lost an LED panel, a level shifter,
an I/O interface pin, or something else. But for now, let's think happy
thoughts and work through the logs.
After doing that the serial output listed my wifi, but then that was
quickly replaced with an 'IoTWifi' SSID. (Logs below)
That's odd. The word "IoTWiFi" doesn't really appear in the code anywhere
that I can find. Do you have any theories where that name came from?
(W) (ReadWiFiConfig)(C1) Retrieved SSID and Password from NVS: IoTWiFi,
********
So that name seems to be coming from your NonVolatile storage. Are you
really saying you never gave it that name?
Starting SmartMatrix Mallocs
Some of that may seem scary sounding, but from memory, it all seems about
right.
(I) (LoadJSONFile)(C1) Attempting to read JSON file /effects.cfg
(W) (LoadJSONFile)(C1) Out of memory reading JSON from file /effects.cfg - increasing buffer to 4931 bytes
(W) (LoadJSONFile)(C1) Out of memory reading JSON from file /effects.cfg - increasing buffer to 6979 bytes
This is actually normal.
>> Launching Network Thread. Mem: 157828, LargestBlk: 110580, PSRAM Free: 1010099/4174347, [ 2127][E][ESPmDNS.cpp:65] begin(): Failed starting MDNS
Error starting mDNS
Now this, I don't remember seeing on every boot. Can anyone else comment on
whether they're seeing this one in a similar position?
(W) (UpdateSubscribers)(C1) Skipping Subscriber update, waiting for WiFi...
>> Launching ColorData Thread. Mem: 148972, LargestBlk: 110580, PSRAM Free: 1010291/4174347, >> Launching Socket Thread. Mem: 146036, LargestBlk: 110580, PSRAM Free: 1010291/4174347, (W) (NotifyJSONWriterThread)(C1) >> Notifying JSON Writer Thread
(I) (ConnectToWiFi)(C1) Setting host name to NightDriverStrip...WL_NO_SHIELD
Now THIS is concerning. It isn't logged as an error, but it shows that
ConnectToWiFi() in src/network.cpp is having trouble talking to the
board. That would be super strange if the name actually came from DHCP,
which would mean the name came from the network - that we now seemingly
can't find. It would be less strange if you had another theory of where
that name came from.
This means that
debugI("Setting host name to %s...%s",
cszHostname,WLtoString(WiFi.status()));
is being hit with a WiFi.status of WL_NO_SHIELD, which sure looks like an
error condition and not a happy state.
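For anyone reading along, the text after the "..." is just the name of an Arduino `wl_status_t` value. Here is a minimal Python sketch of that lookup plus a parser for the log line. The numeric values are the ones I recall from the Arduino WiFi headers (worth checking against the core you build with), and `wl_to_string` and `parse_hostname_line` are my own illustrative names, not the project's `WLtoString`.

```python
import re

# wl_status_t values as recalled from the Arduino WiFi headers (assumption:
# verify against the core version in use; newer cores may add more values).
WL_STATUS = {
    255: "WL_NO_SHIELD",
    0: "WL_IDLE_STATUS",
    1: "WL_NO_SSID_AVAIL",
    2: "WL_SCAN_COMPLETED",
    3: "WL_CONNECTED",
    4: "WL_CONNECT_FAILED",
    5: "WL_CONNECTION_LOST",
    6: "WL_DISCONNECTED",
}

def wl_to_string(code):
    """Hypothetical stand-in for the firmware's status-to-name helper."""
    return WL_STATUS.get(code, f"UNKNOWN({code})")

# Matches "Setting host name to <host>...<STATUS>" at the end of a log line.
HOSTNAME_LINE = re.compile(
    r"Setting host name to (?P<host>.+?)\.\.\.(?P<status>\w+)\s*$"
)

def parse_hostname_line(line):
    """Return (hostname, status_name), or None if the line does not match."""
    m = HOSTNAME_LINE.search(line)
    return (m.group("host"), m.group("status")) if m else None
```

Running `parse_hostname_line` over a whole capture makes it easy to see which status each boot reported at this step and whether it ever changes between flashing attempts.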
(W) (ConnectToWiFi)(C1) Pass 1 of 5: Connecting to Wifi SSID: "IoTWiFi" - ESP32 Free Memory: 108184, PSRAM:4174091, PSRAM Free: 997147
In an attempt to fix that I attempted to upload a file system image
and erase the flash. I assume that is where I went wrong. Now I just get
the below output when monitoring. I can see it appears to be attempting to
draw, but nothing is output on the LED matrix.
Describe how you erased the flash and how you recovered from that. Did you
upload both the app and user partitions or just the app partition?
(W) (ConnectToWiFi)(C1) Pass 1 of 5: Connecting to Wifi SSID: "Shelbyville" - ESP32 Free Memory: 108124, PSRAM:4174091, PSRAM Free: 997675
I know it's not the point, but are you also in Middle TN? It would be cool
to have three of us....
(W) (ProcessIncomingConnectionsLoop)(C1) Error accepting data!
(W) (SockettToWiFi)(C1) Socket server started.
sed. Retrying...
(I)
I)
tToWiFi)(C1) Socket server started.
(I) (ConnectToWiFi)(C1) Publishing OTA...
This section is a bit funky. Here, it looks like you - or something else -
tried to start an over-the-air flash update while you were monitoring and
trying to load from COM6 (?) above.
(I) (ConnectToWiFi)(C1) Setting Clock...
[ 7848][E][WiFiUdp.cpp:221] parsePacket(): could not receive data: 9
This is the WiFi over the air updater getting data that's corrupt.
It's my experience that mixing over the air and direct serial updates works
poorly. Until you get your board back on track, I'd recommend using only
direct connection.
(I) (UpdateClockFromWeb)(C1) NTP clock: response received, updated time to: 1694222180.521925, DELTA: 1694222172.567352
That's interesting. You were getting OTA packets just a moment ago, but now
you're getting that first NTP response where the board yells out "Does
anybody really know what time it is?" and gets the first answer
after boot. Normally, the board would have already been fully booted
before getting OTAs.
You don't really say what communication methods are in play at each step (and
you should), but please restrict yourself to a single USB/serial connection
and don't mix network and serial.
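The numbers on that NTP line back up the "first answer after boot" reading. A quick check in Python, assuming DELTA means "new time minus the clock value the board held before" (my interpretation of the log, not something confirmed from the source):

```python
from datetime import datetime, timezone
from decimal import Decimal

# Values copied from the UpdateClockFromWeb log line above.
new_time = Decimal("1694222180.521925")
delta = Decimal("1694222172.567352")

# Assumption: DELTA = new time - previous clock reading, so subtracting it
# recovers what the board believed the time was just before the NTP reply.
previous_clock = new_time - delta

# Whole-second UTC timestamp of the NTP answer.
ntp_utc = datetime.fromtimestamp(int(new_time), tz=timezone.utc)

print(f"clock before NTP: {previous_clock} s")      # 7.954573
print(f"clock after NTP:  {ntp_utc.isoformat()}")   # 2023-09-09T01:16:20+00:00
```

Under that assumption the old clock read about 7.95 seconds, which is roughly the time since power-up and lines up with the `[ 7848]` millisecond stamp on the parsePacket error just before it. In other words, the clock had never been set on this boot.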
(I)
(I) (ConnectToWiFi)(C1) Starting Web Server...
(I) (begin)(C1) Connecting Web Endpoints
(I) (begin)(C1) Embedded html file size: 1350
(I) (begin)(C1) Embedded jsx file size: 41450
(I) (begin)(C1) Embedded ico file size: 15406
(I) (begin)(C1) Embedded timezones file size: 17834
(I) (begin)(C1) HTTP server started
(I) (ConnectToWiFi)(C1) Web Server begin called!
You earlier said you'd cleared flash. Here, the flash of the user data is
present. It seems that someone somewhere has successfully completed a flash
of the user partition. Any theories when or how that happened?
(I) (WriteCurrentEffectIndexFile)(C0) Number of bytes written to file /current.cfg: 1
[ 11220][E][WiFiClient.cpp:517] flush(): fail on fd 54, errno: 11, "No more processes"
This is troublesome. WriteCurrentEffectIndexFile should be writing just a
byte or two (one here). The write isn't failing, but the sync is. This
points to filesystem corruption in that second ("user", "secondary", or
"non-system") partition. We'd normally have basically reformatted that
partition if we'd tried to mount it and read it when we first booted.
So it's like the system has had a head injury and that second partition
both exists and doesn't exist. Fortunately, relatively little irretrievable
information relies on that (which effects were enabled/disabled and which
one is playing at this moment) so the heavy hand of erasing the partitions
and letting the system recreate that is not usually a terribly distasteful
order from the doctors.
Personally, I'd attach it directly to serial and attempt to reinstall BOTH
the system software and "user data" partitions. Don't get OTA involved and
certainly not while this is running.
If you have to get microscopic, verify that you're getting the partition
table dedicated to Mesmerizer. Dump the partition table from flash (0xc00
bytes, at offset 0x8000 unless the build moves it) and confirm that the
decoded entries match those in config/partitions_custom_8M.csv
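One wrinkle with that check: the table in flash is binary while the .csv is text, so the two need decoding before they can be compared. Below is a rough Python sketch of that comparison. It assumes the 32-byte entry layout from the ESP-IDF partition table documentation as I understand it, compares only label, offset and size, and expects every CSV row to spell out its offset. The function names are mine.

```python
import struct

# One partition entry as I understand the ESP-IDF layout (assumption):
# 2-byte magic, type, subtype, offset, size, 16-byte label, flags = 32 bytes.
ENTRY = struct.Struct("<2sBBLL16sL")
MAGIC = b"\xaa\x50"

def table_from_dump(blob):
    """Decode (label, offset, size) tuples from a raw partition-table dump."""
    rows = []
    for pos in range(0, len(blob) - ENTRY.size + 1, ENTRY.size):
        magic, _type, _subtype, offset, size, label, _flags = ENTRY.unpack_from(blob, pos)
        if magic != MAGIC:  # checksum record or erased 0xFF filler ends the table
            break
        rows.append((label.rstrip(b"\x00").decode("ascii"), offset, size))
    return rows

def to_int(field):
    """Accept 0x9000, 20480, 64K or 1M style numbers."""
    field = field.strip()
    scale = {"K": 1024, "M": 1024 * 1024}.get(field[-1:].upper(), 1)
    return int(field if scale == 1 else field[:-1], 0) * scale

def table_from_csv(text):
    """Decode (label, offset, size) tuples from partition CSV text.

    Assumes every row gives an explicit offset; blank offsets are not handled.
    """
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            name, _type, _subtype, offset, size = [f.strip() for f in line.split(",")[:5]]
            rows.append((name, to_int(offset), to_int(size)))
    return rows
```

With a dump saved to a file (say `table.bin`, a name made up here), comparing `table_from_dump(open("table.bin", "rb").read())` against `table_from_csv(open("config/partitions_custom_8M.csv").read())` should show quickly whether the board is carrying the expected layout.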
Beyond that, making a friend in the group, comparing log files to see
where your device FIRST leaves the rails, and comparing notes while going
through a couple of COMPLETE flashing cycles is my best advice from here.
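If two people do trade boot logs, something like the following Python sketch can point at the first line where they part ways. It masks timestamps and numbers (memory sizes, FPS and so on) so that normal boot-to-boot jitter doesn't count as a difference. It is a simple line-by-line comparison, not a real diff, and the function names are my own.

```python
import re
from itertools import zip_longest

def normalize(line):
    """Mask the parts of a log line that legitimately differ between boots."""
    line = re.sub(r"^\[\s*\d+\]", "[T]", line.strip())  # [ 2127] millisecond stamps
    return re.sub(r"\d+(\.\d+)?", "N", line)             # every remaining number

def first_divergence(good_log, bad_log):
    """Return (line_number, good_line, bad_line) at the first mismatch, or None."""
    pairs = zip_longest(good_log.splitlines(), bad_log.splitlines(), fillvalue="")
    for number, (good, bad) in enumerate(pairs, start=1):
        if normalize(good) != normalize(bad):
            return number, good, bad
    return None
```

Because several threads write to the serial port at once, line order can shift a little between boots, so treat the first reported mismatch as a place to start reading rather than as proof of a fault.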
I don't think your board is physically broken.
Whether by cause (human or system) or by effect, I DO think the layout of
the "disk" that comprises the Mesmerizer image (the system partition, the
partition table, the OTA entries, the user partition, the respective file
systems and files, etc.) is a bit scrambled. I think a bit of therapy can
probably get your system back to working.
Good luck!
RJL
…
(I) (UpdateSubscribers)(C1) Got YouTube subscriber count for channel Daves Garage (GUID 9558daa1-eae8-482f-8066-17fa787bc0e4)
(I) (loop)(C1) WiFi: WL_CONNECTED, IP: 10.1.5.1, Mem: 75636, LargestBlk: 42996, PSRAM Free: 985135/4173915, LED FPS: 53 Refresh: 60 Hz, Power: 2017 mW, Brite: 100%, Audio FPS: 41, MinVU: 1035.4, PeakVU: 1062.3, VURatio: 0.1 Buffer: 0/500, CPU: 093%, 073%, FreeDraw: 0.004
(I) (loop)(C1) WiFi: WL_CONNECTED, IP: 10.1.5.1, Mem: 75636, LargestBlk: 42996, PSRAM Free: 985663/4173931, LED FPS: 56 Refresh: 60 Hz, Power: 1977 mW, Brite: 100%, Audio FPS: 45, MinVU: 639.8, PeakVU: 688.0, VURatio: 0.5 Buffer: 0/500, CPU: 075%, 065%, FreeDraw: 0.008
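Those two status lines are worth a closer look, since they report what the firmware believes it is doing. A small Python sketch for pulling a few fields out of a `(loop)` line follows. The field names in the dict are mine, and reading "Power" as the firmware's own estimate is an assumption.

```python
import re

# Regexes for a handful of fields in the "(loop)" status line.
PATTERNS = {
    "wifi": r"WiFi: (\w+)",
    "free_mem": r"Mem: (\d+)",
    "led_fps": r"LED FPS: (\d+)",
    "power_mw": r"Power: (\d+) mW",
    "brightness_pct": r"Brite: (\d+)%",
}

def parse_status(line):
    """Return a dict of the fields found in one status line."""
    fields = {}
    for key, pattern in PATTERNS.items():
        m = re.search(pattern, line)
        if m:
            # The WiFi state is a name; everything else is an integer.
            fields[key] = m.group(1) if key == "wifi" else int(m.group(1))
    return fields
```

On the first line above this gives an LED frame rate of 53 at 100% brightness with WiFi connected, which fits the observation that the board appears to be drawing even though nothing shows up on the matrix.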
Any help would be greatly appreciated.
Thanks,
Keiran.
-
Thanks for all the info, to answer your questions:
After erasing the flash and uploading again I got this.
Similar results occurred on every reboot. For completeness, I went back to commit 12e941e, did a full clean, rebuild and upload. After that it worked again. I went back to main, cleaned and rebuilt again and am now fully up to date and working. No idea why that fixed it, but all is well again. I can enjoy my Pong Clock having the correct time while I decide where I want to start developing new things. Thank you for all your help.
-
To get back to “new”, you can erase the flash from the command line:
pio run -t erase
(Or something like that, but it’s the pio erase flash target.)
Then reflash the board. You MIGHT have to hold down the BOOT button, but probably not.
Pretty sure the board itself is working, and I’ve never seen a matrix OR a board actually die. Double-check your power connection and that you are perfectly centered up down with the PCB so as not to be off by one pin.
- Dave
-
Hi All,
I think I have soft bricked my board and wanted to know how to recover it. The current state of the board is that I can boot the board and it connects to my WiFi. I can connect to the frontend and the IR remote triggers, but no pixels are lit up.
I was attempting to reflash from pio because the web installer was not working. After some messing around I was able to flash the board.
After doing that the serial output listed my wifi, but then that was quickly replaced with an 'IoTWifi' SSID. (Logs below)
In an attempt to fix that I attempted to upload a file system image and erase the flash. I assume that is where I went wrong. Now I just get the below output when monitoring. I can see it appears to be attempting to draw, but nothing is output on the LED matrix.
Any help would be greatly appreciated.
Thanks,
Keiran.