ERROR: Unable to write file : DISK_ERR, leads to Out of Memory errors. #236
Replies: 39 comments
-
Posted at 2014-03-06 by @gfwilliams Hmm - it could be that the disk errors are causing Espruino to leak memory (however, looking at the code, I can't see how), or it could be that Espruino is running low on memory and that's what is causing the errors. Could print the value of Any chance you could strip your code down (removing the DHT read and LCD write) and see if it still happens?
-
Posted at 2014-03-06 by DrAzzy I'll try to strip it down into a minimal example that exhibits the behavior. While it was running, I checked process.memory() and it was steady at 900 used. |
-
Posted at 2014-03-06 by @gfwilliams Hmm, that should be ok then... I wrote some simple code to append to a file, and then pulled the SD card out (producing
-
Posted at 2014-03-07 by DrAzzy The DISK_ERR issue itself is trivial to reproduce:
This will run for a while, until eventually, every attempt to write results in DISK_ERR. This itself is a Bad Thing. However, it does not run out of memory! So why does it run out of memory when running the full setup?
All I can think of is that this DISK_ERR disrupts other things happening at the same time, like DHT read or LCD update. Unfortunately, I've never been able to catch it between the first time it gives the DISK_ERR and when it runs out of memory, in order to see if it's leaking memory every time, or whether it survives until the error happens at a critical moment. |
-
Posted at 2014-03-07 by Frida
change to: then it works on my Espruino. |
-
Posted at 2014-03-07 by Frida Now I have it running again, and right now it reached to 400 logs to the file without error. |
-
Posted at 2014-03-07 by Frida With 1 sec. log it failed at |
-
Posted at 2014-03-07 by DrAzzy Hm. On my Espruino, onInit and doInit are DEFINITELY running - if they weren't, it wouldn't work at all. On my Espruino, I've always been able to put strings into setTimeout/setInterval. The IDE warns me that it's "evil", but it works fine*. What I didn't realize is that you could just pass it the name of a function, unquoted, and have it call that function with no arguments! I thought you had to do: So in your case, it sorts itself out after enough disk errors happen? I don't think it was coming back to life for me, will check logs tonight. *Note that I have not tested whether this "evil" evaluation has resulted in my eternal damnation; testing this seems like a real chore. |
-
Posted at 2014-03-09 by DrAzzy I've determined that the out of memory issue can happen without logging. So I think there is some condition in which my DHT module explodes. Investigations are ongoing. However, there is still the issue with the SD card getting unhappy and producing DISK_ERRs, which happens when just logging gibberish. Do you have any thoughts on how to debug this further? |
-
Posted at 2014-03-10 by DrAzzy And yet, here's a log clearly showing that once the disk errors start, the memory problems begin - I am back to being convinced this is an Espruino bug, as the system was stable for approximately one hour, reporting stable memory usage, then the instant it started disk erroring, it started leaking memory every time it measured the DHT's.
The code that's generating this:
Does anyone have any thoughts on what might be going on? Is there any way to get some information about where all that memory is going when it starts to fail? Using Espruino v55. |
-
Posted at 2014-03-10 by @gfwilliams That's pretty strange... You get the memory leak with the disk error:
But then the disk errors stop and you're still losing memory! You could try typing If you could try and get the simplest possible piece of code that still got the memory leaks then that'd really help matters. With the DISK_ERR, I'm unsure what to do as it's the filesystem lib that's failing and when I have tried it, it does recover eventually - the development version (or 1v56 when it is released) now returns false if appendFile fails, so you could try again if needed. By the way, using strings in setInterval/etc isn't really evil in Espruino - it's actually pretty efficient. It's just that jslint (which is what does the warnings) thinks it is and I haven't got around to convincing it otherwise :)
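A minimal sketch of the "try again if needed" idea, assuming the append call returns false on failure as described for the development build. The helper name `appendWithRetry`, its `maxTries` parameter, and the `flakyAppend` stand-in are hypothetical and only for illustration; on an Espruino board the function passed in would presumably be `require("fs").appendFile`.

```javascript
// Hypothetical helper: retry an append that signals failure by returning false.
// appendFn is assumed to behave like appendFile(path, data) -> boolean.
function appendWithRetry(appendFn, path, data, maxTries) {
  for (var attempt = 1; attempt <= maxTries; attempt++) {
    if (appendFn(path, data)) {
      return attempt; // number of tries it took
    }
  }
  return 0; // every attempt failed; the caller decides what to do next
}

// Stand-in for a flaky SD card: fails twice, then succeeds.
var failuresLeft = 2;
var written = [];
function flakyAppend(path, data) {
  if (failuresLeft > 0) {
    failuresLeft--;
    return false;
  }
  written.push(data);
  return true;
}

var tries = appendWithRetry(flakyAppend, "log.txt", "21.5,40\n", 5);
console.log(tries); // 3: two failures, then one success
```

Capping the number of tries matters here: retrying forever from inside a timer callback could pile work up while the card is unhappy.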
-
Posted at 2014-03-11 by DrAzzy Got a trace of it just before it exploded (down to 61 free at start). Clearly a repetitive structure - but uh... I can't make sense of the trace... What are these multiplied things? Another one, where the trace was taken right after the problem first manifested. |
-
Posted at 2014-03-11 by @gfwilliams Thanks - that's really helpful... So it looks like there are a whole bunch of setTimeouts coming from the DHT11 module, but I have no idea why! The only thing I can think is that somehow the RTC is getting really confused and the value from Please can you also log the value from How long does this happen after the last hard reset? From the logs it looks like both times have been after around 4 hours? Do you think it's been around that amount of time in each case? |
-
Posted at 2014-03-11 by @gfwilliams I've discovered a pretty nasty problem I think that could explain your issues. After 65535 seconds, the RTC goes into the second 16 bit word, and I wasn't combining the two 16 bit words properly - so time would suddenly jump back at that point. It should be fixed in 1v56 now |
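To illustrate the kind of bug described here, below is a sketch of combining two 16-bit words into one 32-bit count. It is not the actual firmware code (which is not shown in this thread); `combineWords` and `lowWordOnly` are hypothetical names, and dropping the high word is just one assumed way the combination could go wrong and make time jump back.

```javascript
// Combine a high and a low 16-bit word into one unsigned 32-bit value.
function combineWords(high, low) {
  // ">>> 0" keeps the result unsigned even when bit 31 is set.
  return (((high & 0xFFFF) << 16) | (low & 0xFFFF)) >>> 0;
}

// One assumed failure mode: ignoring the high word entirely.
function lowWordOnly(high, low) {
  return low & 0xFFFF;
}

console.log(combineWords(0, 65535)); // 65535: last value that fits in the low word
console.log(combineWords(1, 0));     // 65536: count keeps increasing
console.log(lowWordOnly(1, 0));      // 0: count appears to jump back
```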
-
Posted at 2014-03-12 by @gfwilliams I'm still trying to track this down - something happens after around 8000 seconds, and the RTC reports the correct time, but |
-
Posted at 2014-03-18 by DrAzzy Happened really fast the first time; now it seems to be running stable for a while. All of these errors were appearing in rapid succession, meaning it was trying to log much more frequently than it should have been - the onslaught was such that the trace and everything else got pushed out. Is there a way to make the IDE log to a file, so I wouldn't miss the results of trace() and the events surrounding the initial event? In any event, there is no way that the Espruino was running for 44494 seconds, nor had it been running for ~25k seconds when the failure occurred. !!?!?! http://drazzy.com/espruino/v58devfailure.txt I'm rather confused by the early failure, followed by a very stable run....
-
Posted at 2014-03-18 by @gfwilliams Yes... I had that once - but then it went away. I wonder whether it was something strange to do with the firmware update.... Let me know whether it happens again though (and if you find a way to reproduce it). Glad it's looking more stable now! |
-
Posted at 2014-03-18 by DrAzzy Failed at some point during the night with the same symptom. Again, by the time I'd found it, the original trace() was long since pushed off the console, and disk errors coming rapidly while the getTime increased far too fast. Last log entry was at 28802. |
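One way to catch the moment the clock starts misbehaving, rather than finding it hours later, is to compare successive clock readings against the interval they were taken at. The sketch below is an assumption-laden illustration, not code from this thread: `classifyTick` and its tolerance parameter are hypothetical, and on an Espruino board the readings would presumably come from `getTime()` inside the logging interval.

```javascript
// Hypothetical check: given two successive clock readings (in seconds) that
// should be about expectedGap seconds apart, classify the clock's behaviour.
function classifyTick(previous, current, expectedGap, tolerance) {
  var delta = current - previous;
  if (delta < 0) return "backwards";
  if (delta > expectedGap + tolerance) return "too fast";
  if (delta < expectedGap - tolerance) return "too slow";
  return "ok";
}

// Example readings for a 10-second logging interval (made-up numbers).
console.log(classifyTick(100, 110.2, 10, 1)); // "ok"
console.log(classifyTick(110, 400, 10, 1));   // "too fast"
console.log(classifyTick(65535, 3, 10, 1));   // "backwards"
```

Recording the first non-"ok" result (for example in a variable that survives the console scrolling) would preserve the initial event even when later errors flood the output.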
-
Posted at 2014-03-18 by @gfwilliams Strange. I've been running it overnight here (not with filesystem writes, but with a clock using setInterval) and it's spot on :( |
-
Posted at 2014-03-18 by DrAzzy Yeah, I've only seen it when I'm doing fs writes. It's as if the SD card failure breaks the timer somehow... Seems to work fine just sitting there reading dht's. Edit: Just got another failure, this one much faster - and I got the whole trace this time because it froze entirely at the end http://drazzy.com/espruino/v58devfailure2.txt (note that I'd created a demo array to show my friend the WS2812's, that's why there's a graphics object in the trace) |
-
Posted at 2014-03-19 by @gfwilliams Thanks - just wanted to check - are you using setDeepSleep at all? |
-
Posted at 2014-03-19 by DrAzzy Nope, no deep sleep. |
-
Posted at 2014-03-21 by DrAzzy The timer bug can be reproduced with this code:
Ran it overnight, and woke up to it broken. Manually cleared the interval, and typed getTime() a few times. These were typed a couple of seconds apart at most:
(in above, I had to remove the > prompt, otherwise the forum mangled it) Last entries in the log:
-
Posted at 2014-03-21 by @gfwilliams You know you can just wrap code in three backtick marks? I just edited your post to add it. Thanks for cutting it down - I'll try that code out and see what I can get to happen with it. |
-
Posted at 2014-03-21 by DrAzzy Yeah - I knew about the backticks (and used them for the js code - or I thought I did), but didn't think to use them on the logs - I was in a hurry since I was posting it right before I left for work. What's strange is how it gets 11 writes off (in two bursts) shortly after things have started going off the rails.
-
Posted at 2014-03-21 by @gfwilliams Right - just reproduced. Mine did it early too:
I'll see what I can do about fixing this next week. |
-
Posted at 2014-03-22 by @gfwilliams Just a thought - this will be some issue with the code that keeps the system tick timer in sync with the RTC. As a workaround, try setDeepSleep(1) and power it from a charger/battery so it enters sleep. That will almost certainly solve your problems. It also explains why one of mine has been running all week and is still keeping good time.
-
Posted at 2014-03-24 by @gfwilliams Could you give this a go please? http://www.espruino.com/binaries/git/commits/e85fa5a013c3a1469635b92e12a35591eb2589d8 I've been running it for the past 6 hours or so with your logger and it appears to be fine. |
-
Posted at 2014-03-25 by DrAzzy So far it's been running over night without issue. Will keep letting it run. |
-
Posted at 2014-03-25 by @gfwilliams Wohoo! 1v59 (just released) has the fix in too. |
-
Posted at 2014-03-06 by DrAzzy
I have a program that reads a pair of DHT's (11 and 22, to compare their readings), shows them on a Nokia 5110 screen, and every ten seconds, records the latest data on the SD card. Every time I go away and leave it, I come back to find something like this on the console:
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Out of Memory!
ERROR: Out of Memory!
WARNING: Unable to create string as not enough memory
ERROR: Out of Memory!
ERROR:
(with cursor at the end of the word ERROR:)
Using v55 firmware, connecting via bluetooth. After reset, SD card is found to be working normally.
All of the DISK_ERRs come in fairly rapid succession (either all at once, or once per attempted append once the problem manifests). I don't know what the exact timing is, since I've never been watching the console when it happens.