PYPORTAL: wiped itself (several times) #1577

Closed
TG-Techie opened this issue Feb 20, 2019 · 62 comments
Comments

@TG-Techie

I was running code that worked perfectly before, but after updating to the most recent master it started hard crashing and then wiped itself.

@TG-Techie
Author

working on reconstructing the lost code as an example

@TG-Techie
Author

It's done it two more times.
I've been working on:
https://github.com/TG-Techie/pyportal_gui
I wish I could be more specific, but this is mostly reconstructed code.

@TG-Techie
Author

ah, i just tried it with 4.0.0-beta.2-36-g66b0c67f5-dirty on 2019-02-12 and it happened again hrmmmm (i saved an old compile)

@dhalbert
Collaborator

dhalbert commented Feb 21, 2019

I think I know what this might be. I might have messed up declaring MICROPY_PORT_ROOT_POINTERS in the big refactor. @ladyada
Will investigate further

@dhalbert
Collaborator

No, that's not it.

@tannewt
Member

tannewt commented Feb 21, 2019

Since you have a JLink, it'd be awesome if you could have a debug build running through the JLink as you work with it. Set a breakpoint on reset_into_safe_mode. It will likely be triggered when the USB dies for you. If it doesn't, you can still control-c it to get a backtrace. You may also want to power the pyportal separately through a stemma connector so that you can unplug usb when it misbehaves. I use gnd and 5v from a grand central or metro to power the pyportal.

@TG-Techie
Author

I'll be free at 3 to do so until 4. I do not know what a debug build is nor am I super experienced with a Jlink. Is there a guide you know of ?

@siddacious

This is the one I used; I hear the author went on to do great things
https://learn.adafruit.com/debugging-the-samd21-with-gdb/overview

@siddacious

JLinkGDBServer -speed 4000 -if SWD -device ATSAMD51P20 is the gdb server command I use for the GC which I think is the same for the pyportal?

@TG-Techie
Author

try these two firmwares, I find the one without init does not have the problem
wout_init_firmware.uf2.zip
firmware.uf2.zip

@ladyada
Member

ladyada commented Feb 21, 2019

TG, when commenting on issues - please add more information so we know what you're doing and what the firmware you are posting does: what 'init code'? how did you verify 'it does not have the problem' during test? we cannot debug, read, or analyze uf2 files :)

More detail, information, and code is better; keeping the information relevant and on topic will help us help you!

@TG-Techie
Author

TG-Techie commented Feb 21, 2019

I duplicated the pyportal board profile and changed its board.c to match the board.c from the metro m4, so board.DISPLAY is None in code. With that build I have not been having the (occasional) file/folder corruption problem while saving or reloading, and I get far fewer safe mode alerts. The board I am running it on is in SPI mode, if relevant.

ladyada, thank you for the feedback you have been giving I have been finding it very useful and I hope I can someday use it in a job. :-)

@ladyada
Member

ladyada commented Feb 21, 2019

ok thanks - next up, when you get safemode alerts what are they saying? do you have a screenshot? they have the reason for safemode

@TG-Techie
Author

TG-Techie commented Feb 22, 2019

to get the pyportal to output this error I just saved the same main.py over and over, using the program in between saves, until it went to safe mode (took three tries). I was using the most recent master (see below for the build number).

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.

You are running in safe mode which means something unanticipated happened.
Looks like our core CircuitPython code crashed hard. Whoops!
Please file an issue at https://github.com/adafruit/circuitpython/issues
 with the contents of your CIRCUITPY drive and this message:
Crash into the HardFault_Handler.

Press any key to enter the REPL. Use CTRL-D to reload.
Adafruit CircuitPython 4.0.0-beta.2-99-gf3e50b9df on 2019-02-21; Adafruit PyPortal with samd51j20

@TG-Techie
Author

after i power cycled the drive was wiped again

@dhalbert
Collaborator

@tannewt I think one thing to check here is whether a displayio-related object (or maybe a filesystem-related object) should be part of the root pointers list, but is not. I had a crash (not a filesystem issue) due to forgetting this for an internal BLE linked list.

@TG-Techie What editor and what operating system are you using? (MacOS for the latter, right?)

@uhrheber

I don't know whether it's related or not, but I built the latest version from the repository, and tried adafruit_ble with the echo example.
It ran perfectly at first, but then I added some code to show the bluetooth status with the onboard LEDs, and after several restarts, both by saving the file and CTRL-D, the virtual drive was wiped.

Version:
Adafruit CircuitPython 4.0.0-beta.2-111-gd218069f0 on 2019-02-22; PCA10059 nRF52840 Dongle with nRF52840
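
Development builds in this thread report a version string in `git describe` form: the nearest tag, the number of commits since that tag, and `g` plus an abbreviated commit hash, optionally followed by `-dirty`. A minimal sketch for pulling those fields out of a pasted banner, assuming the banner layout seen in this thread; `parse_banner` and `BANNER_RE` are hypothetical names, not part of CircuitPython:

```python
import re

# Assumed layout, based only on the banners quoted in this thread:
# "... CircuitPython <tag>-<commits>-g<hash>[-dirty] on <date>; <board> with <mcu>"
BANNER_RE = re.compile(
    r"CircuitPython (?P<tag>\S+?)-(?P<commits>\d+)-g(?P<commit>[0-9a-f]+)"
    r"(?P<dirty>-dirty)? on (?P<date>\d{4}-\d{2}-\d{2}); "
    r"(?P<board>.+) with (?P<mcu>\S+)"
)


def parse_banner(banner):
    """Return the git-describe fields of a development-build banner, or None."""
    match = BANNER_RE.search(banner)
    if match is None:
        # Tagged releases (no "-N-g<hash>" suffix) are not handled by this sketch.
        return None
    info = match.groupdict()
    info["commits"] = int(info["commits"])
    info["dirty"] = info["dirty"] is not None
    return info


info = parse_banner(
    "Adafruit CircuitPython 4.0.0-beta.2-111-gd218069f0 on 2019-02-22; "
    "PCA10059 nRF52840 Dongle with nRF52840"
)
print(info["tag"], info["commits"], info["commit"])
```

The hash can then be compared against `git log` in the checkout that produced the build.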

@TG-Techie
Author

@dhalbert. Yes, I'm a Unix/Macos person through and through. I'm using mu 1.0.2.

@uhrheber

There seems to be something wrong with the filesystem drivers, after reformatting the virtual drive, and reflashing, I got all sorts of weird errors, like:

main.py output:
Traceback (most recent call last):
  File "main.py", line 1, in <module>
  File "/lib/adafruit_ble/__init__.py", line 1, in <module>
NameError: name 'ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ' is not defined

But when I open the file __init__.py in an editor, everything looks OK.

@TG-Techie
Author

@uhrheber what do you mean by main.py output? Via REPL? Is init separate from main?

@uhrheber

*** This code works:
#######################################
from adafruit_ble.uart import UARTServer
import board, digitalio

ledg = digitalio.DigitalInOut(board.LED2_G)
ledg.direction = digitalio.Direction.OUTPUT
ledb = digitalio.DigitalInOut(board.LED2_B)
ledb.direction = digitalio.Direction.OUTPUT

uart = UARTServer()

while True:
    uart.start_advertising()
    ledg.value = False
    ledb.value = True
    # Wait for a connection
    while not uart.connected:
        pass

    while uart.connected:
        # Returns b'' if nothing was read.
        ledb.value = False
        one_byte = uart.read(1)
        if one_byte:
            uart.write(one_byte)

########################################

*** This code produces an error:
############################################
from adafruit_ble.uart import UARTServer
import board, digitalio

ledg = digitalio.DigitalInOut(board.LED2_G)
ledg.direction = digitalio.Direction.OUTPUT
ledb = digitalio.DigitalInOut(board.LED2_B)
ledb.direction = digitalio.Direction.OUTPUT

uart = UARTServer()

while True:
    uart.start_advertising()
    ledg.value = False
    ledb.value = True
    # Wait for a connection
    while not uart.connected:
        pass

    while uart.connected:
        # Returns b'' if nothing was read.
        ledb.value = False
        ledg.value = True
        one_byte = uart.read(1)
        if one_byte:
            uart.write(one_byte)

##############################################

The error is:
Traceback (most recent call last):
  File "main.py", line 1, in <module>
NameError: name 'ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ' is not defined

This is reversible!
When I remove the line 'ledg.value = True', the error is gone, when I add it (regardless whether I paste it, or type it in letter by letter), the error appears.

The error is independent from the used editor (Mu, gedit, Notepad++) and the OS (Windows 10, Linux Mint 19.1).
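
The ÿ characters in the NameError are what the byte 0xFF looks like when decoded as Latin-1, and 0xFF is what erased flash typically reads as. That suggests (as an assumption, not a confirmed diagnosis) that the interpreter was handed blank sectors instead of the file contents. A minimal sketch for scanning raw bytes read back from the drive for such runs; `find_erased_runs` and `min_len` are hypothetical names chosen for illustration:

```python
def find_erased_runs(data, min_len=16):
    """Return (offset, length) for every run of 0xFF bytes of at least min_len."""
    runs = []
    start = None
    for i, byte in enumerate(data):
        if byte == 0xFF:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs


sample = b"import board\n" + b"\xff" * 32 + b"x"
print(find_erased_runs(sample))
```

Note that the host operating system may serve a cached copy of the file, so the bytes an editor shows are not necessarily the bytes the board reads from flash.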

@dhalbert
Collaborator

@uhrheber Could you say exactly which commit you built, or which UF2 you used? The commit listed in the startup prompt is apparently not accurate.

@jerryneedell
Collaborator

I just tried the error-producing code both manually as test.py and as main.py. It does produce an error as main.py, but it runs normally as test.py.


Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
main.py output:
Traceback (most recent call last):
  File "main.py", line 1, in <module>
NameError: name '�����������������������������������������������������������������������������������������������������������������������' is not defined



Press any key to enter the REPL. Use CTRL-D to reload.
Adafruit CircuitPython 4.0.0-beta.2-111-gd218069f0 on 2019-02-22; PCA10059 nRF52840 Dongle with nRF52840
>>> import test

test.py


from adafruit_ble.uart import UARTServer
import board, digitalio

ledg = digitalio.DigitalInOut(board.LED2_G)
ledg.direction = digitalio.Direction.OUTPUT
ledb = digitalio.DigitalInOut(board.LED2_B)
ledb.direction = digitalio.Direction.OUTPUT

uart = UARTServer()

while True:
    uart.start_advertising()
    ledg.value = False
    ledb.value = True
    # Wait for a connection
    while not uart.connected:
        pass

    while uart.connected:
        # Returns b'' if nothing was read.
        ledb.value = False
        ledg.value = True
        one_byte = uart.read(1)
        if one_byte:
            uart.write(one_byte)

@jerryneedell
Collaborator

jerryneedell commented Feb 22, 2019

ha! but it runs normally if named code.py ??

Press any key to enter the REPL. Use CTRL-D to reload.
Adafruit CircuitPython 4.0.0-beta.2-111-gd218069f0 on 2019-02-22; PCA10059 nRF52840 Dongle with nRF52840
>>> 
>>> 
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:

@jerryneedell
Collaborator

jerryneedell commented Feb 22, 2019

my tests were on a pca10059 dongle with a build from the current master as of a few hours ago as you can see in the output. Looks like the same commit referenced with the error report above.
I have not had any issue where the File system has been wiped by a control-D

@TG-Techie
Author

Hrmmm I find mine crashes and wipes regardless of file content.

@uhrheber

@dhalbert Git log says:
commit d218069 (HEAD -> master, origin/master, origin/HEAD)
Merge: 3e877e0 ed1ace0
Author: Dan Halbert [email protected]
Date: Thu Feb 21 17:15:50 2019 -0500

Merge pull request #1584 from tannewt/disable_concurrent_write_protection

Add option to disable the concurrent write protection

commit 3e877e0
Merge: 0261c57 1532863
Author: Scott Shawcroft [email protected]
Date: Thu Feb 21 13:36:26 2019 -0800

Merge pull request #1580 from cpforbes/cpf-1572

Set __file__ for the main source file (e.g. code.py, main.py)

commit 0261c57
Merge: b8678c9 01e5a82
Author: Scott Shawcroft [email protected]
Date: Thu Feb 21 13:24:07 2019 -0800

Is that what you wanted to know?

Bootloader is:
UF2 Bootloader 1.00
Model: Adafruit PCA10059
Board-ID: NRF52-Bluefruit-v0
Bootloader: s140 6.1.0 r0
Date: Oct 2 2018

@jerryneedell
Collaborator

jerryneedell commented Feb 22, 2019

Now I reloaded it as main.py and it works fine ...

It may be very important to make sure the file system has been updated before resetting. On Linux I do "sync" after any write to the board.

I had some odd behavior when trying to load another file and execute it from the REPL: it resulted in the same "NameError: name= junk " error as above until I reloaded.

After that, even the previously failing main.py started working. So it may all be related to the FS not being stable when resetting.
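
For scripted copies from the host, the same idea as running "sync" can be sketched in Python by flushing and calling `os.fsync` before the board is reset or unplugged. This is an illustration only: `write_and_sync` is a hypothetical helper, and whether it fully avoids the corruption discussed here is not established in this thread.

```python
import os


def write_and_sync(path, data):
    """Write bytes to path and ask the OS to push them to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # empty Python's userspace buffer
        os.fsync(f.fileno())  # ask the OS to write its cached data out
```

A call might look like `write_and_sync("/media/user/CIRCUITPY/main.py", source_bytes)`, where the mount point is an assumption that depends on the host system.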

@uhrheber

uhrheber commented Feb 22, 2019

@jerryneedell I don't think that this is the root cause. I had the code that produces the error in main.py, and it was synced, because I could read it with another editor. I then ejected the stick (in Linux), unplugged and replugged it. After that, the file system was wiped clean.

Is there something I can check?
I have a pca10056 and several pca10059 with debug connector, and a J-Link.

@uhrheber

I don't use special mount options, I just disabled the write cache.
Also, as I already said, it's independent from the OS.
Text editors:
Windows: Notepad++, Mu
Linux: Gedit, Mu

I said nothing about resetting the board.
It doesn't matter whether I use the OS' drive eject function before unplugging it, the file system gets wiped anyway.

When I don't use external libraries, I can do with the board what I want, nothing breaks the drive, not even unplugging it while it runs.

@uhrheber

PCA10056 works without a problem, even when I unplug it while it's running, there's no drive corruption.
So your assumption about a bug in the internal flash code is most likely right.

@dhalbert
Collaborator

@uhrheber Which board(s) are you seeing these problems on?

@dhalbert
Collaborator

Our comments crossed. Thanks!

@uhrheber

Only pca10059 so far, I have some nRF52840 based modules from Fanstel here, but didn't test them so far. They don't have external memory, so they should behave exactly as the pca10059.

@TG-Techie
Author

TG-Techie commented Feb 22, 2019

are we proceeding under the assumption that the two problems are linked ?

@uhrheber

But why is TG-Techie having these problems with the PyPortal? It HAS external flash.

@dhalbert
Collaborator

There may be several problems here. @uhrheber’s issue definitely seems related to internal flash. Yours seems to be related to displayio, which they’re not using.

@TG-Techie
Author

TG-Techie commented Feb 22, 2019

I think it is worth pointing out that I do have displayio compiled into the firmware, but I only call release_displays and there is no pre-initialised display. Also, my hardware config does not match the software when init_display is enabled.
Is how I put that clear?

@TG-Techie
Author

TG-Techie commented Feb 22, 2019

i have also noticed my tricorder, basically a stock m4, does not have the file loss issue.

@uhrheber

BTW, the problem with ble.uart still exists also on the pca10056. When I send more than a few characters at once, the program crashes:

 File "main.py", line 28, in <module>
  File "adafruit_ble/uart.py", line 128, in write
OSError: Failed to notify or indicate attribute value, err 1304x

@TG-Techie
Author

@uhrheber should that go in a separate issue?

TG-Techie changed the title from "PYPORTAL: wiped itself" to "PYPORTAL: wiped itself (several times)" on Feb 22, 2019

@uhrheber

Definitely. It's a different problem.

@TG-Techie
Author

TG-Techie commented Feb 22, 2019

I wonder if this is happening on the hallowing?

@ladyada
Member

ladyada commented Feb 22, 2019

TG - what if you have the backlight very dim, but using the displayio interface - please try that!

@TG-Techie
Author

Ladyada, I'm glad to try, and not entirely sure what you want me to try. I use it in spi and even when in repl and not main I don't see anything on the screen, just white.

@ladyada
Member

ladyada commented Feb 22, 2019

To see whether the backlight current is causing a brownout while writing to the FAT, it's worth a try. You would have displayio active, but just not use the high-current ALL ON backlight.

@uhrheber

OK, guys, the problem seems to be gone after I erased the memory of the pca10059 completely, flashed the newest version of the UF2 bootloader (pca10059_bootloader-0.2.8_s140_6.1.1.hex), and then copied the freshly compiled CircuitPython UF2 file over.

Now I feel dumb for not trying this sooner.
Sorry for wasting your time, obviously a completely unrelated problem.

@dhalbert
Collaborator

@uhrheber No problem - it can take several tries to divide and conquer an issue. If you are still seeing BLE problems, please open a new issue.

@uhrheber

Unfortunately, I spoke too soon.
After copying more libraries to the pca10059, and adding more code, the problem reappeared.

This made me think that it might be related to how full the disk is.
So I started a test program called F3, which is meant for testing USB sticks.
It fills the drive to the brim with test files and then reads them back, checking for corrupted and overwritten parts.
I ran it on both pca10056 and pca10059 while the program was running, and it showed corrupted data afterwards.
Also, while F3 was writing data, the program restarted erratically, showed errors in the code here and there, and the serial connection dropped several times.

It seems that the flash driver has a problem with concurrent access.
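
The write-then-read-back idea can be sketched in a few lines of Python. This is not F3's actual algorithm: it writes a fixed number of files instead of filling the drive, and all function names and parameters here are hypothetical. Each block is derived deterministically from a seed so it can be regenerated and compared on read-back.

```python
import hashlib
import os


def _block(seed, index, size):
    """Deterministic pseudo-random block, reproducible from (seed, index)."""
    out = b""
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{index}:{counter}".encode()).digest()
        counter += 1
    return out[:size]


def write_test_files(directory, count, blocks_per_file, block_size=512, seed=0):
    """Write `count` test files and return their paths."""
    paths = []
    for n in range(count):
        path = os.path.join(directory, f"{n}.tst")
        with open(path, "wb") as f:
            for i in range(blocks_per_file):
                f.write(_block(seed, n * blocks_per_file + i, block_size))
            f.flush()
            os.fsync(f.fileno())
        paths.append(path)
    return paths


def verify_test_files(paths, blocks_per_file, block_size=512, seed=0):
    """Return a list of (path, block_index) for every block that reads back wrong."""
    bad = []
    for n, path in enumerate(paths):
        with open(path, "rb") as f:
            for i in range(blocks_per_file):
                expected = _block(seed, n * blocks_per_file + i, block_size)
                if f.read(block_size) != expected:
                    bad.append((path, i))
    return bad
```

For a meaningful result the drive would need to be ejected and remounted between writing and verifying, because otherwise the host may answer reads from its own cache.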

I then reflashed all boards, and added a boot.py, that disables autoreloading, then added the libraries and the code.
Everything runs perfectly now, I can unplug the boards in every situation, without corrupting the drive.

But: When I copy more files to the drive, so that it is nearly full, the program crashes again, and after unplugging, the drive is wiped clean.

@TG-Techie Could you please try to add a boot.py to your board?
It should contain:

from supervisor import disable_autoreload
disable_autoreload()

@uhrheber

I just repeated the write/read test with F3 on a Pyboard v1.1 (MicroPython v1.9.4), and again on a pca10059 (CircuitPython 4.0.0-beta.2-116-gaf863a378).
Pyboard: no data corruptions, runs without an error even when the disk is full, reset doesn't do any damage
pca10059: corrupted data, sometimes behaves erratically when the disk is nearly full, disk wiped after reset

It would be interesting to see whether this also happens on other ports. Unfortunately, compiling CircuitPython for STM32 fails, and I don't have any SAMD boards.

@dhalbert
Collaborator

CircuitPython is only supported on atmel-samd and nrf (and esp8266 on 3.x). We don't remove the other ports/* so we can still merge from upstream easily.

@TG-Techie
Author

TG-Techie commented Feb 24, 2019

the no auto reload seems to have fixed the file corruption. however, the pyp still freezes while saving occasionally.

@TG-Techie
Author

Yup, just hard crashed!
But I forgot to put a boot.py back in when I started this new version.

@dhalbert
Collaborator

Should be fixed by #1604 and/or #1649.

@TG-Techie
Author

TG-Techie commented Mar 20, 2019

@dhalbert the wiping problem wasn't fixed. I just tried beta5 on my pyp and it crashed and reappeared as NO NAME.
the bootout.txt reads:

Adafruit CircuitPython 4.0.0-beta.5 on 2019-03-17; Adafruit PyPortal with samd51j20

I updated the firmware and then connected to the repl, using mu, and ctrl-c ed then ctrl-d ed.

@dhalbert
Collaborator

@TG-Techie is it reproducible? If you can do it again, what was the main.py you were running? Thanks.
