Skip to content

samd: REPL disconnect problems after tinyusb PR #1342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
notro opened this issue Nov 15, 2018 · 15 comments
Closed

samd: REPL disconnect problems after tinyusb PR #1342

notro opened this issue Nov 15, 2018 · 15 comments
Milestone

Comments

@notro
Copy link

notro commented Nov 15, 2018

PR #1321 has caused problems with uploading files over the serial REPL.

Using latest master:

pi@agl:~/circuitpython/workdirs/test/circuitpython$ git log -1 --oneline
0ea31ec15 (HEAD -> test, origin/master, origin/HEAD, master) Merge pull request #1339 from dhalbert/3.x-frozen-swap

I'm unable to put a 460k file using ampy:

pi@cp:~/work/circuitpython/ampy $ python3 ampy/cli.py -p /dev/ttyACM1 put ../workdirs/test/circuitpython/ports/atmel-samd/build-metro_m4_express/firmware.uf2
Traceback (most recent call last):
  File "/home/pi/.local/lib/python3.5/site-packages/serial/serialposix.py", line 537, in write
    n = os.write(self.fd, d)
OSError: [Errno 5] Input/output error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ampy/cli.py", line 419, in <module>
    cli()
<snip>
  File "/home/pi/.local/lib/python3.5/site-packages/serial/serialposix.py", line 571, in write
    raise SerialException('write failed: {}'.format(e))
serial.serialutil.SerialException: write failed: [Errno 5] Input/output error

Kernel log:

pi@cp:~/work/circuitpython/ampy $ dmesg
[98691.837544] cdc_acm 1-1.4:1.0: failed to set dtr/rts
[98691.937608] usb 1-1.4: reset full-speed USB device number 100 using dwc_otg
[98692.070296] cdc_acm 1-1.4:1.0: ttyACM1: USB ACM device
[98692.074569] sd 0:0:0:0: [sda] 4089 512-byte logical blocks: (2.09 MB/2.00 MiB)
[98692.091618]  sda: sda1

Nothing special on the LEDs. With another tool I use for upload that checks the file timestamp before uploading, I sometimes get a new device name ttyACM2 after the usb reset:

[80063.522775] usb 1-1.4: reset full-speed USB device number 81 using dwc_otg
[80063.655651] cdc_acm 1-1.4:1.0: ttyACM2: USB ACM device
[80063.664710] sd 1:0:0:0: [sdc] 4089 512-byte logical blocks: (2.09 MB/2.00 MiB)
[80063.682308]  sdc: sdc1

Zero size file (since it's not closed I suppose):

Adafruit CircuitPython 4.0.0-alpha.2-172-g0ea31ec15 on 2018-11-15; Adafruit Metro M4 Express with samd51j19
>>> os.stat('firmware.uf2')
(32768, 0, 0, 0, 0, 0, 0, 1542319926, 1542319926, 1542319926)

Working commit

Checkout commit beofre tinyusb PR:

git checkout d08747d374057b24803b429bbf6474a2e4dcea5c
<snip>
HEAD is now at d08747d37... Merge pull request #1319 from dhalbert/origin/arturo182_bleio_pr_1289

Did this 3 times without failure:

pi@cp:~/work/circuitpython/ampy $ python3 ampy/cli.py -p /dev/ttyACM1 put ../workdirs/test/circuitpython/ports/atmel-samd/build-metro_m4_express/firmware.uf2

ampy get (doesn't support binary files?) didn't work so I couldn't check that the files are equal, but at least the size matches:

pi@cp:~/work/circuitpython/ampy $ ls -l ../workdirs/test/circuitpython/ports/atmel-samd/build-metro_m4_express/firmware.uf2
-rwxrwxrwx 1 root root 473600 Nov 15 22:06 ../workdirs/test/circuitpython/ports/atmel-samd/build-metro_m4_express/firmware.uf2

Adafruit CircuitPython 4.0.0-alpha.2-135-gd08747d37-dirty on 2018-11-15; Adafruit Metro M4 Express with samd51j19
>>> os.stat('firmware.uf2')
(32768, 0, 0, 0, 0, 0, 473600, 1542316836, 1542316836, 1542316836)
@tannewt tannewt added this to the 4.0 Alpha milestone Nov 15, 2018
@jerryneedell
Copy link
Collaborator

jerryneedell commented Nov 17, 2018

FYI - I am seeing a similar (I think) disconnect on a SAMD21 (CPX) with 4.0.0-alpha.3
This occurs while running a script or idle at REPL
I am able to reconnect to REPL afterwards.

dmesg log on Ubuntu 18.04

[132560.492281] usb 3-3.4: reset full-speed USB device number 10 using xhci_hcd
[132560.613665] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device
[132560.615905] sd 7:0:0:0: [sdc] 4089 512-byte logical blocks: (2.09 MB/2.00 MiB)
[132812.389018] usb 3-3.4: reset full-speed USB device number 10 using xhci_hcd
[132812.510998] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device
[132812.607334] sd 7:0:0:0: [sdc] 4089 512-byte logical blocks: (2.09 MB/2.00 MiB)

@jerryneedell
Copy link
Collaborator

same behavior seen on feather_m0_express -- also with 4.0.0-alpha.3

[141211.399907] usb 3-3.4: reset full-speed USB device number 17 using xhci_hcd
[141211.522081] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device
[141211.526148] sd 7:0:0:0: [sdc] 4089 512-byte logical blocks: (2.09 MB/2.00 MiB)
[141590.294132] usb 3-3.4: reset full-speed USB device number 17 using xhci_hcd
[141590.416326] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device
[141590.431553] sd 7:0:0:0: [sdc] 4089 512-byte logical blocks: (2.09 MB/2.00 MiB)

@notro notro changed the title samd51: REPL disconnect problems after tinyusb PR samd: REPL disconnect problems after tinyusb PR Nov 18, 2018
@ladyada
Copy link
Member

ladyada commented Nov 18, 2018

ditto seeing this too - will try to get some beagle traces - im on win 10

@ladyada
Copy link
Member

ladyada commented Nov 19, 2018

USB trace attached, see packets 43978, thats where it 'hangs', and then 44064 is when it resets and reappears - MSD is back but CDC connection is gone

image

beagle trace.zip

@tannewt
Copy link
Member

tannewt commented Nov 20, 2018

FYI, I'm waiting for this PR: hathach/tinyusb#11

Once its in, I'll update CircuitPython to it.

@tannewt
Copy link
Member

tannewt commented Nov 23, 2018

Please try with #1353. I believe it will fix it.

@notro
Copy link
Author

notro commented Nov 25, 2018

Running many unittest tests through the repl is working again.
Uploading files is working.

The script I use to upload, use os.stat() to check if files need uploading. When I rerun the script after it has uploaded it all files, I get into trouble.

It consistently fails when stat'ing these files:
/lib/unittest/test/__init__.mpy
/lib/unittest/test/dummy.mpy
/lib/test/imghdrdata/python.pbm

(I've had 2-3 times out of over a hundred times that one of these succeeded)

Fails as in raw repl not responding with b'\x04' (delimiter) after it said b'OK':

b'>\x01raw REPL; CTRL-B to exit\r\n>print(__import__(\'_os\').stat(\'/lib/unittest/test/dummy.mpy\'), end="")\x04OK('

It looks like it's starting to write the tuple judging by the opening parentheses.

This is how a successful one looks:

b'>\x01raw REPL; CTRL-B to exit\r\n>print(__import__(\'_os\').stat(\'/lib/unittest/test/test_assertions.mpy\'), end="")\x04OK(32768, 0, 0, 0, 0, 0, 10790, 1543174556, 1543174556, 1543174556)\x04\x04'

However if I catch the exception and retry the command through the regular repl mode, it succeeds and the script can continue.

I have tried to retry the command through raw repl, and I even tried microcontroller.reset() and serial reconnect without success.

I tried changing the contents of the failing files (copy content of another one) and it succeeded.

If I try raw repl directly using screen, the failing command succeeds.

I've spent hours trying to make something that can easily reproduce this, but I haven't been successful at that.

Reverting to before tinyusb makes it work.
Done on a M4 Metro Express.

@tannewt
Copy link
Member

tannewt commented Nov 27, 2018

There is a ring buffer on the CDC output that could be filling up. Is that all allocated as one string and printed?

@notro
Copy link
Author

notro commented Nov 27, 2018

That string was a combination of read and write (ie. the session):

read:  b'>'  # previous raw repl prompt
write: b'\x01'  # Ctrl-A
read:  b'raw REPL; CTRL-B to exit\r\n>'
write: b'print(__import__(\'_os\').stat(\'/lib/unittest/test/test_assertions.mpy\'), end="")'
write: b'\x04'  # Ctrl-D
read:  b'OK'
read:  b'(32768, 0, 0, 0, 0, 0, 10790, 1543174556, 1543174556, 1543174556)\x04'
read:  b'\x04'

@tannewt tannewt modified the milestones: 4.0 Alpha, 4.0.0 - Bluetooth Dec 7, 2018
@tannewt
Copy link
Member

tannewt commented Feb 5, 2019

@notro Is this still an issue?

@notro
Copy link
Author

notro commented Feb 11, 2019

I still have the above mentioned problem.

Additionally I have this:

Test script:

import serial
import sys
import time

port = sys.argv[1]
print('port:', port)

version = sys.argv[2]

cmd = b'__import__("os").uname()\r\n'
print('cmd:', repr(cmd))

print()

CHAR_CTRL_A = b'\x01'
CHAR_CTRL_B = b'\x02'
CHAR_CTRL_C = b'\x03'
CHAR_CTRL_D = b'\x04'

with serial.Serial(port, baudrate=115200) as ser:
    # Reset REPL
    ser.write(CHAR_CTRL_C)
    ser.write(CHAR_CTRL_C)
    time.sleep(0.5)
    ser.write(CHAR_CTRL_B)
    time.sleep(0.5)
    discard = ser.read(ser.inWaiting())
    #print('discard', discard)

    # Enter Raw REPL
    ser.write(CHAR_CTRL_A)

    ser.write(b'res = ' + cmd + b'\r\n')
    if int(version) == 1:
        ser.write(b'print(res)\r\n')
    else:
        ser.write(b"print('BEGINMARKER>' + repr(res) + '<ENDMARKER')\r\n")

    # Execute the code that has been entered
    ser.write(CHAR_CTRL_D)

    time.sleep(2)  # The real code spins looking for the 0x04 byte separator
    out = ser.read(ser.inWaiting())
    print('out:', out)

    _, res = out.split(b'OK')
    print('res:', res)

    output, error, prompt = res.split(b'\x04')
    print('output:', output)

I use markers to know what result I should return from a remote function call. The reason for doing this is so I can use print statements in the function.

This is the output without the markers:

$ python3 -u simple_serial_test.py /dev/ttyACM1 1
port: /dev/ttyACM1
cmd: b'__import__("os").uname()\r\n'

out: b"\r\nraw REPL; CTRL-B to exit\r\n>OK(sysname='samd51', nodename='samd51', release='cd0aacccb-dirty', version='patchbase-dirty on 2019-02-09', machine='Adafruit Grand Central M4 Express with samd51p20')\r\n\x04\x04>"
res: b"(sysname='samd51', nodename='samd51', release='cd0aacccb-dirty', version='patchbase-dirty on 2019-02-09', machine='Adafruit Grand Central M4 Express with samd51p20')\r\n\x04\x04>"
output: b"(sysname='samd51', nodename='samd51', release='cd0aacccb-dirty', version='patchbase-dirty on 2019-02-09', machine='Adafruit Grand Central M4 Express with samd51p20')\r\n"

This is the failing output with the markers:

$ python3 -u simple_serial_test.py /dev/ttyACM1 2
port: /dev/ttyACM1
cmd: b'__import__("os").uname()\r\n'

out: b"\r\nraw REPL; CTRL-B to exit\r\n>OKBEGINMARKER>(sysname='samd51', nodename='samd51', release='cd0aacccb-dirty', version='patchbase-dirty on 2019-02-09', machine='A"
res: b"BEGINMARKER>(sysname='samd51', nodename='samd51', release='cd0aacccb-dirty', version='patchbase-dirty on 2019-02-09', machine='A"
Traceback (most recent call last):
  File "simple_serial_test.py", line 48, in <module>
    output, error, prompt = res.split(b'\x04')
ValueError: not enough values to unpack (expected 3, got 1)

I don't get the full output. Maybe there's a buffer that's not flushed...

It's fine in the ordinary REPL:

Adafruit CircuitPython patchbase-dirty on 2019-02-09; Adafruit Grand Central M4 Express with samd51p20
>>> res = __import__("os").uname()
>>> print('BEGINMARKER>' + repr(res) + '<ENDMARKER')
BEGINMARKER>(sysname='samd51', nodename='samd51', release='cd0aacccb-dirty', version='patchbase-dirty on 2019-02-09', machine='Adafruit Grand Central M4 Express with samd51p20')<ENDMARKER
>>>

@notro
Copy link
Author

notro commented Feb 12, 2019

That uname example is working now, turned out I had a 14 days old checkout.

I still have this problem: #1342 (comment)

@tannewt
Copy link
Member

tannewt commented Feb 13, 2019

@notro Thank you for the update! Because its hard to reproduce and likely timing related I'd like to punt this to future versions. I think it'll be a large task now. Any related bugs/behavior make it easier to debug.

@kevinjwalters
Copy link

I'm sure you've seen the notifications but this is worth a re-test after the fix hathach/tinyusb#260 for #1639 is picked up by CircuitPython.

@tannewt
Copy link
Member

tannewt commented Aug 11, 2021

I haven't seen any issues recently. I suspect it's been fixed.

@tannewt tannewt closed this as completed Aug 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants