REPL Access is Lost After Awhile #3588
Comments
Could you give more details? What version of CircuitPython are you using? |
I have spoken with this user on Discord a bit as well. I can fill in some of that information. They are using a CPX with I did reproduce this issue once this morning though, my device was running simpletest script for a color sensor (that was not connected, was testing for someone else). I haven't been able to nail down the exact steps to replicate it consistently though. I've switched to no In my case: I will try to fish up the color sensor example I had running, though I don't think there is anything "special" about it, to the best of my knowledge it would have been sitting on whatever exception is thrown when the sensor isn't found Edit: this was the script running (or more likely crashed) on my device at the time I saw this occur: https://github.com/adafruit/Adafruit_CircuitPython_AS7341/blob/main/examples/as7341_simpletest.py |
The problem exists on CircuitPython v6.0.0-rc0. Did not exist on v5.3.1. It happens with this very simple program:
It's occurred using Screen in iTerm on MacOS 10.15.7. Also occurred using the Atom language-circuitpython package (that's where I first saw it). I'm testing on an itsybitsy M4 right now to see if it's the same. |
I had this happen again with this code running on the CPX:

import time
while True:
    #print(time.monotonic())
    time.sleep(1)

Interestingly, it ran for a long time with the print statement uncommented before this and never got into this state during that time. I'll try to see if I can narrow down whether the printing is really making a difference or if it was just coincidence. |
So, the problem arises for me on the CPX between 1h 00m and 2h 37m. Will try 1h 45m next. The missing REPL occurs with the original program:
And, I just confirmed the behavior is the same at 2h 37m with this code:
The last part of the output from the latter, when I pressed CTRL+C, was:
|
After 3 hours running, my ItsyBitsy M4 has the same problem, the REPL is unresponsive. Code is:
After
Also, on the ItsyBitsy, after When I eject the ItsyBitsy, the dotstar returns to solid green. |
Looking over recent PRs, I am wondering if this has anything to do with #3546, maybe in the overflow handling. Maybe we could use a bisect to check. |
@robertgallup all of the builds are archived on S3. Here is the link for CPX English US builds: https://adafruit-circuit-python.s3.amazonaws.com/index.html?prefix=bin/circuitplayground_express/en_US/ If you go to circuitpython.org/downloads and find your board then look for the "Browse S3" button it will lead you to the page for that device. |
https://adafruit-circuit-python.s3.amazonaws.com/bin/circuitplayground_express/en_US/adafruit-circuitpython-circuitplayground_express-en_US-20201013-3d21eec.uf2 is the merge commit for #3546. The one just before that is https://adafruit-circuit-python.s3.amazonaws.com/bin/circuitplayground_express/en_US/adafruit-circuitpython-circuitplayground_express-en_US-20201013-a010dc3.uf2. If you are willing to do a little testing on these, that would be great! Thanks. |
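For anyone lining up several S3 builds to test, here is a rough sketch of pulling the date and commit hash out of a build file name. The pattern is only inferred from the two URLs above, not from any official naming spec, and `parse_build_name` is a hypothetical helper name. It also assumes a two-part locale such as `en_US`.

```python
import re
from datetime import date

# Pattern inferred from the two example URLs in this thread (an assumption,
# not a documented format): ...-<board>-<locale>-<YYYYMMDD>-<commit>.uf2
BUILD_RE = re.compile(
    r"adafruit-circuitpython-(?P<board>.+)-(?P<locale>[a-z]{2}_[A-Z]{2})"
    r"-(?P<date>\d{8})-(?P<commit>[0-9a-f]{7,40})\.uf2$"
)


def parse_build_name(url_or_name):
    """Return board, locale, build date and commit for a build file, or None."""
    name = url_or_name.rsplit("/", 1)[-1]  # drop any URL prefix
    m = BUILD_RE.match(name)
    if m is None:
        return None
    d = m.group("date")
    return {
        "board": m.group("board"),
        "locale": m.group("locale"),
        "date": date(int(d[:4]), int(d[4:6]), int(d[6:])),
        "commit": m.group("commit"),
    }


info = parse_build_name(
    "https://adafruit-circuit-python.s3.amazonaws.com/bin/"
    "circuitplayground_express/en_US/"
    "adafruit-circuitpython-circuitplayground_express-en_US-20201013-3d21eec.uf2"
)
print(info["date"], info["commit"])
```

Note that both builds linked above carry the same date (20201013), so the date alone does not order same-day builds; the commit history is still needed for that.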
The way we'd normally test this is to do a "bisect". We'd pick a known bad and a known good build. Then we pick a build halfway in between, and test that. If it's bad, we then pick another build halfway between that bad one and the good one; if it's good, we pick one halfway between it and the known bad one, and repeat. Each time we cut the remaining possibilities in half. Git has a way of doing this automatically. Unfortunately, since it takes several hours to determine good/bad, this might get pretty tedious. So I took a stab at a possible place to start. |
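The bisect idea can be sketched in a few lines of Python. This is only an illustration, not how `git bisect` is implemented: `first_bad` and `is_bad` are hypothetical names, and in this issue `is_bad` would really mean "flash the build, wait, and see whether ctrl-C still reaches the REPL".

```python
def first_bad(builds, is_bad):
    """Find the first bad build by repeated halving.

    Assumes `builds` is ordered oldest to newest, builds[0] is known good,
    builds[-1] is known bad, and every build after the first bad one is bad.
    """
    good, bad = 0, len(builds) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid    # the breakage is at mid or earlier
        else:
            good = mid   # the breakage is after mid
    return builds[bad]


# Simulated run with placeholder build labels; build-5 is where things "broke".
builds = [f"build-{i}" for i in range(8)]
tested = []


def simulated_test(build):
    tested.append(build)
    return int(build.split("-")[1]) >= 5


print(first_bad(builds, simulated_test), "after testing", tested)
```

With eight builds this needs three tests rather than six, which matters when each test takes an hour or more.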
Excellent, thanks! I was just going looking for that build. I'll see what I can find out. I'm using my own 'bisect' to find out the minimum test time (currently between 1h and 2h 30m :-) ) |
Is this a duplicate of #2686 perhaps with a change that increases the frequency it occurs at? |
Perhaps. Was there ever a resolution to that issue? FWIW, my issue doesn't seem to be impacted by amount of serial output. |
@hathach You'll want to read through this too. |
I've just had a CLUE running This is on windows 8.1, btw |
@dhalbert It looks like your hunch might be correct! TL;DR My testing suggests the #3546 commit did introduce the problem. The previous commit passes and the #3546 commit fails. The Full Version: I spent yesterday confirming (unintentionally) that this issue doesn't exist in v5.3.1. :-) This morning I made additional progress and found that with the following program, the issue is reproducible after 50m (maybe less):
The results are below. I first confirmed that the test failed on v6.0.0-rc.0. Then, I confirmed that the commit prior to #3546 passed. Then confirmed the #3546 commit failed. And, finally, I confirmed that the prior commit succeeded again. Not an iron-clad case, but it is suspicious.
|
@robertgallup Any chance you have a debugger you can connect to an M0? That will tell us where it is when it is hung. |
I'm not experienced with debuggers, but I have one for Cypress (miniprog3) that looks like it supports jtag. Do you happen to know if this would work for this with generic software? Otherwise, I think I can get ahold of a j-link (mini edu?) by Monday-ish. And, I'm always happy to learn something new. |
It's not hanging, though, right? It's just that REPL output is no longer being sent back on USB (but is showing up on the display), is that right? |
It's not a hang, per se. The REPL is not responsive after I stop the program with |
@dhalbert from your comment, "...but is showing up on the display", is there a way for me to see the REPL output on a display (OLED, etc.) rather than over serial? |
I mistakenly thought you were running this on a PyPortal or CLUE or similar board with an integral display (which shows the REPL). |
Regarding debuggers, if you happen to have a Raspberry Pi, I’m currently using one (3B+) to debug CircuitPython on a SAMD51, no extra hardware needed. Can expand if you are interested. Regarding displays, if you have an external one you can connect, you should be able to get terminalio on the ItsyBitsy M4. Maybe even on the CPX, it looks like there’s a circuitplayground_express_displayio firmware. |
@cwalther please expand! Just a link to instructions would be great. |
I was gathering information from a bunch of places, as far as I remember these were the most helpful ones: |
A few final data points for now. I connected the TFT Gizmo to the CPX and found the following:
|
I ran this on Ubuntu 20.04 on a Metro M0 Express with 6.0.0-rc.0 for 6500 seconds, and was able to ctrl-C it at that point:

>>> while True:
...     print(int(time.monotonic()))
...     time.sleep(10)
...

I could try again with a CPX, but I'm wondering if I should try this on another OS. @robertgallup This failed for you on MacOS after 50 minutes. During that time did the Mac go to sleep, or were you keeping it awake? |
@dhalbert Is there a possibility that the USB physical socket and/or other devices in use have an effect through their presence or from traffic? USB gives the appearance of a point-to-point topology, with a desktop/laptop making a star with nothing shared, but I'm wondering if there's more to it? |
@dhalbert I've just run a couple of tests with all timeouts turned off and they both passed. I'm running one final test with my normal timeouts back on to see if I can get it to fail again. |
@dhalbert With sleep back on, both the CPX and a Trinket M0 failed. This seemed to be the case even if I was sitting there working during the time which should have prevented any sleep. On the other hand, I turned the screen saver and sleep off and left a Trinket M0 on all night (5-1/2 hours). This morning I could break out and access the REPL with no problem... Now, this just in, I started the Trinket again without resetting it, restored sleep/screen saver and the Trinket was unresponsive after 30 minutes. It was happily sending serial output, but would not accept any input (i.e. |
@robertgallup So this only happens when the code has a |
@tannewt I haven't tested recently, but the original test case didn't include time.sleep():
EDIT: I've tested that this simple case eventually locks on my Trinket M0. |
I am able to reproduce this problem, without waiting a long time, by putting the host computer into sleep mode. I reproduced this on both Windows and Linux. Simple test program:

# Prints elapsed number of seconds, every 10 seconds.
import time
while True:
    print(int(time.monotonic()))
    time.sleep(10)

Example with 6.0.0-rc.0:
By comparison, here's 5.3.1:
@tannewt suggested a possible lead on fixing this to me yesterday, and I'll investigate that. |
@dhalbert I don't know if this is useful, but I just did a manual bisect on S3 builds using your protocol (with a 30 second sleep time). The build that seems to have broken is: The build just before is: |
I think I have a fix: see PR #3624. Build artifacts are available here https://github.com/adafruit/circuitpython/runs/1334933429?check_suite_focus=true if you want to test. |
Fixed by #3624 |
Is this a possible or likely fix for #2686 too? |
I take that back. #2686 talks about loss of output. In the test above, output is not lost. When the sleeping computer wakes up, output continues to the terminal program. The issue was that CircuitPython didn't realize that the computer reconnected, so it did not go back to the REPL after a ctrl-C. |
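A host-side check for this symptom could look roughly like the sketch below. It is an assumption-laden illustration, not something used in this thread: `repl_responds` is a hypothetical helper, `port` is assumed to be a pyserial-like object (with `write(bytes)` and a `read(n)` that returns quickly), and the extra key press after ctrl-C is there in case the board waits for one before showing the prompt.

```python
import time

CTRL_C = b"\x03"
PROMPT = b">>> "


def repl_responds(port, timeout_s=2.0):
    """Send ctrl-C and report whether a REPL prompt shows up in time.

    `port` is assumed to behave like a serial port opened with a short read
    timeout: write(bytes) sends data, read(n) returns up to n bytes or b"".
    """
    port.write(CTRL_C)    # interrupt the running program
    port.write(b"\r\n")   # a key press, in case the board waits for one
    deadline = time.monotonic() + timeout_s
    seen = b""
    while time.monotonic() < deadline:
        chunk = port.read(64)
        if chunk:
            seen += chunk
            if PROMPT in seen:
                return True
    return False
```

In the failure described here, a check like this would return False after the host wakes up: serial output keeps arriving, but no prompt ever follows the ctrl-C.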
On the Circuit Playground Express, after running a CircuitPython program for a while, access to the REPL is lost. Ctrl+C stops execution and the serial connection remains, but there's no way to regain the REPL prompt. Sometimes restarting the serial console (on the PC side) or resetting the CPX board unlocks the REPL.