TG1WDT_SYS_RESET Randomly, No Guru Meditation Error #1033
Comments
@Vincrl You definitely have a watchdog timeout issue. From Read the Docs:
I think the last sentence is what you are seeing: "If this watchdog for some reason cannot execute the interrupt handler that prints the task data (e.g. because IRAM is overwritten by garbage or interrupts are disabled entirely) it will hard-reset the SOC." Chuck.
Thanks for the quick response, Chuck. I will make sure the watchdog is fed, but the weird part is that if I replace my UART core code (this core has one task that constantly reads, writes, and fills the vectors of values) with a for(;;){} (a CPU-hogging loop that does nothing, not even feeding the watchdog) or a for(;;){delay(1000)}, then the single task on the WiFi core runs perfectly without rebooting. I realised that I am writing to a vector from the UART core and reading from that vector on the WiFi core, as follows: WiFi Core <--- [Data Vector] <--- Uart Core. The data vector wraps around on itself (a kind of circular buffer). But I used no semaphore or mutex to protect access to that data, considering it can only be written by one task and read by one other task. Could that cause the problem? Thank you. I hope my sentences make sense.
@Vincrl In your vector code (interrupt vector?), are there any while() loops that could hang? An ISR always needs to be deterministic. It must not have any condition under which it waits for some other task or event to complete. It should be short and succinct. Are you processing your UART data inside the interrupt code? You should just move the data to a buffer and mark the buffer as ready to process; the foreground loop should do the actual processing. If you try to send data out the UART inside your receive UART interrupt code, you can create a stall waiting for the bits to trickle out. The input ISR stalls waiting for the output UART to process, then another input interrupt occurs, interrupting the stalled ISR that is waiting on the UART (the Serial() object), which uses spinlocks to single-thread data blocks. Explosion! Have you elevated the priority of your code? If your code has a higher priority and is always ready, it will never allow a lower-priority (idle) loop to execute. The WiFi code is piggy: it has elevated priority and loops on conditions, waiting for hardware events to occur. Chuck.
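The "move the data to a buffer and let the foreground loop process it" advice above can be sketched as a single-producer, single-consumer ring buffer. This is only an illustration in portable C++: the class name `SpscRing` and its `push`/`pop` methods are hypothetical, not part of the Arduino or ESP-IDF APIs, and it assumes exactly one writer (the receive handler) and exactly one reader (the foreground loop).

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical single-producer / single-consumer byte queue.
// One slot is kept free to tell "full" from "empty", so it holds N - 1 bytes.
template <std::size_t N>
class SpscRing {
public:
    // Producer side only (e.g. the receive handler). Never waits.
    bool push(std::uint8_t byte) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;  // full: drop the byte instead of stalling
        }
        data_[head] = byte;
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side only (the foreground loop). Never waits.
    bool pop(std::uint8_t &out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return false;  // empty: nothing to process yet
        }
        out = data_[tail];
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }

private:
    std::array<std::uint8_t, N> data_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

The receive side would call `push()` once per byte and return immediately, while the main loop calls `pop()` until it returns `false` and does the parsing there. Neither side ever blocks on the other, which is the property Chuck is asking for.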
@stickbreaker Thanks again for the quick answer. Sorry, by vector I mean the C++ std::vector, so an array, in short. Also, my UART isn't driven by an ISR; it runs in its own task, alone on its core, always polling the UART RX and sending whenever it needs to. I understand your answer, but no task on my UART core blocks anything, considering I can run it without any problem when I remove all the mbedtls (HTTPS requests) code. So if my ESP is communicating over UART on one core and answering clients in the WiFi task on the other core (local requests at its local address, so acting like a server), everything is all right. As soon as I add the HTTPS requests to a server, the reboots appear. Also, if the UART core is deactivated or in a for(;;){} loop, and the WiFi core answers local clients and also makes HTTPS requests to a server (with mbedtls), the device runs fine too (it just ran for 20h). So, by your logic, the mbedtls and UART tasks would share a resource on which they deadlock, thus causing the WDT to trigger. Considering the only shared resource they use from my code is the std::vector, this resource should be the source of my WDT resets, but I do not protect it with any spinlock, mutex, or semaphore... Also, shouldn't the WDT trigger a message like this?
I will try running the system with the std::vector accessed from only one side at a time instead of both, and keep you up to date. If this still triggers the WDT, that would mean a resource not created by me is the cause of this deadlock, interlock, or eternal wait. Vince. P.S. I really appreciate the help, by the way.
@Vincrl Are you accessing the same std::vector object from multiple tasks simultaneously? I do not know how thread-safe standard objects are. I would assume they are not thread-safe. There are dedicated inter-task communication mechanisms supported by the underlying operating system, FreeRTOS. I would recommend you read through the FreeRTOS documentation (8.2 is the current version included in the ESP32 environment). Chuck.
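For reference, unsynchronized access to one `std::vector` from two threads, where at least one of them writes, is a data race under the C++ standard. A minimal sketch of one way to guard such a buffer follows. The class name `SharedSamples` and its methods are hypothetical, and the sketch assumes `std::mutex` is usable in the toolchain; a FreeRTOS mutex (`xSemaphoreCreateMutex`, `xSemaphoreTake`, `xSemaphoreGive`) or a FreeRTOS queue could play the same role.

```cpp
#include <mutex>
#include <vector>

// Hypothetical wrapper: every access to the vector goes through one mutex,
// so the writer can never resize it while the reader is looking at it.
class SharedSamples {
public:
    // Writer side (the UART task in this thread's description).
    void append(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        samples_.push_back(value);
    }

    // Reader side (the WiFi task): take everything collected so far
    // and leave the shared buffer empty.
    std::vector<int> drain() {
        std::vector<int> out;
        std::lock_guard<std::mutex> lock(mutex_);
        out.swap(samples_);  // O(1) swap keeps the critical section short
        return out;
    }

private:
    std::mutex mutex_;
    std::vector<int> samples_;
};
```

The reader works on its own copy after `drain()` returns, so slow work such as building the JSON string and the TLS transfer happens outside the lock and does not hold up the writer.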
@stickbreaker It seems that's exactly what is happening. I removed the access to the vector from the HTTPS task and it seems to have fixed the issue. I still need to redo that part with a safe procedure, though, but I'm pretty sure that was the reason for my reboots. I don't know why the system reboots without a Guru Meditation error, though, or without saying the reset was due to the WDT. I still have to understand that part, but I believe inter-core variable access protections and protocols are needed here and will lead me to a solution. I will confirm whether it fixes the issue. Thanks, Chuck.
@Vincrl Sounds like you are on the path to success. Good luck. Chuck.
In the interest of more information, I'll add that I was seeing this until I increased the task stack size. That seemed to fix the problem for me.
Hardware:
Board: ESP32 Adafruit Feather Huzzah
Core Installation/update date: 25/jan/2018
IDE name: Arduino IDE
Flash Frequency: 80MHz
Upload Speed: 921600
Description:
I have an embedded system that needs to analyze a continual flow of UART information and transmit it at a fixed frequency to a database over TCP using mbedTLS. To achieve that goal, I divided the work between the cores: Core 0 takes care of the WiFi transmission of the data, sending a JSON string using mbedTLS. Core 1 simply reads packets on the UART line and writes them into a buffer. The WiFi task then reads these values every 10 seconds and transmits them if the server requires it; it also transmits a smaller packet if the server requires no information.
So here is the problem: after booting, everything sets up just fine. The ESP connects to the defined WiFi network and starts its routine by transmitting. Then, randomly (sometimes after 3 transmissions to the server, other times after 50), it reboots, without giving any panic reason or Guru Meditation error (verbose option active). It looks like this:
It does not always crash at seeding the random number; sometimes it crashes here:
I used to have a much more stable build, but some time ago it started acting up and I can't find the solution.
Sketch:
Sadly, I am not permitted to disclose the code at the moment, but I'll try to post a working example with the same crash.
What I tried:
I am open to any solutions if you have suggestions. Thank you.