Skip to content

Add supervisor.set_next_code_file() #3454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 25, 2021
Merged

Conversation

cwalther
Copy link

This is a prototype implementation of #1084. Feature set and implementation details to be discussed.

Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!!! This approach looks really good overall. I have a question about reload_on_success but it should be easy to sort out.

If this is used in an OTA case maybe we want something like "errors_ok=False" where a failure in the next code reverts what is run back to code.py. When on USB, I think an error would be ok.

Wild idea, what if you use the same approach to save the traceback string and provide it to the code.py? That way an OTA could capture the error.

@cwalther
Copy link
Author

Thanks for the comments! Together with the implementation experience, they enable me to identify more higher-level requirements and I decided to go back to #1084 to discuss that, or should we continue everything here?

Wild idea, what if you use the same approach to save the traceback string and provide it to the code.py? That way an OTA could capture the error.

Sounds good, but maybe that would best be tackled in a separate PR? I’m not sure offhand how easy that is to obtain – is it still stored anywhere after pyexec_file() returns, would it need to be recorded off the serial output, …?

@tannewt
Copy link
Member

tannewt commented Sep 25, 2020

Sounds good, but maybe that would best be tackled in a separate PR? I’m not sure offhand how easy that is to obtain – is it still stored anywhere after pyexec_file() returns, would it need to be recorded off the serial output, …?

Yup, agreed a separate PR is best. I don't know how to get it either.

@cwalther
Copy link
Author

This force push

  • squashes and rebases the previous commits
  • fixes the inconsistency in next-code-state preservation between a reload during execution and a reload during the wait state
  • adds lots of options (keyword arguments) for prototype testing that can be removed again when considered useless.

By the way, are the artifacts from the automated build available anywhere? The logs say Artifact xxx has been successfully uploaded! but don’t say where. That would make it easier for people to test without having to build it themselves.

Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. It's a bit complex but it's an advanced API and I'm not actually sure what we'll want. I'm fine if you want to undraft it and then we can merge.

@cwalther
Copy link
Author

That’s good to hear, my thinking was to reduce the complexity a bit and only keep what we need before merging. I’m not sure yet what that is either, but we’re not done yet figuring it out. It partially depends on how the “retain previous traceback“ feature comes along (expect a PR on that later, maybe tonight) and on resolving divergent opinions.

(By the way, I found the artifacts – top right of the build output. A text search would have found it… Will probably rebase so we get the latest state built too.)

@cwalther
Copy link
Author

cwalther commented Oct 2, 2020

Rebased to fix trivial merge conflict and get a new build. No content changes (so don’t bother re-reviewing). More changes forthcoming, but they’re not ready yet.

@cls22
Copy link

cls22 commented Nov 11, 2020

Hi @cwalther

Thanks for the new functionality. I tested "set_next_code_file()" this afternoon and it is just what I needed.

Do you have a time line on when you intend to merge the functionality into the main branch?

Thanks for your time!

Cheers,

Lee

@cwalther
Copy link
Author

I have an implementation on a more robust base than this prototype almost done. I hope to have pull requests ready for review this week. A few more days to weeks of review and discussion will then probably be required before the final merge.

Thanks for testing! Would you mind telling which of the optional keyword arguments to set_next_code_file you are using and with what values? Because we will probably not keep all of them for the final version, to keep things simple. Getting feedback on which of them people find useful is helpful in deciding that. Preferably in #1084, where the rest of the requirements discussion is.

@cls22
Copy link

cls22 commented Nov 11, 2020

Hi @cwalther,

I have a pretty simple use case, my supervisor calls are as follows:
supervisor.set_next_code_file(scriptName, reload_on_success=True)
supervisor.reload()

For me, I prefer the VM not running my menu system (code.py) when the secondary script raises an unhandled exception. I am developing a factory test fixture, so leaving error information on the screen is preferential. That said, for commercial products it may be preferential to reload the original code.py script if the secondary script crashes.

One nice to have would be to run scripts from an SD card. I am guessing this would require running a configuration script to mount the SD card prior to running the secondary script. This is just a nice to have, but something to think about.

Thanks for the work you are doing on this!

Cheers,

Lee

@cwalther
Copy link
Author

This is now implemented on top of the movable allocation system. Functionality should be unchanged from the previous revisions. Only the last commit is relevant, the rest are from #3695.

@cwalther
Copy link
Author

Will fix the mimxrt10xx and xtensa failures soon, I missed something there.

That it would overflow the flash on some of the smaller boards I kinda feared. Pruning back the optional arguments will reduce the size a little, but probably not enough for all of them. Open for discussion on what to do about that.

@tannewt tannewt mentioned this pull request Dec 1, 2020
2 tasks
@deshipu
Copy link

deshipu commented Dec 5, 2020

I wonder if it will fit better now, with the complex arithmetic removed.

@cwalther
Copy link
Author

cwalther commented Dec 5, 2020

Working on it. I plan to make it optional anyway, and then we’ll see on how many boards it can be enabled.

@cwalther
Copy link
Author

As a sign of life, here is the previous state rebased on current main. I have not checked yet whether the way I merged it with the sleep changes makes sense. Haven’t mustered the motivation to learn how sleep works yet, it looks complicated. I also haven’t made any of the other promised/planned changes yet.

Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing your latest code! Sorry it overlapped with the deep sleep changes. I think we'll want to treat deep sleep like a successful run.

@cwalther
Copy link
Author

Thanks for taking up work on this! I’m embarrassed about dropping the ball. I’m busy right now but I’ll try to at least take a look at your changes.

@hierophect
Copy link
Collaborator

I've brought this up to date with the Main branch and removed the use of goto. Tested on a Feather M4 Express, works as expected. There are a couple of remaining questions I have before it gets merged:

  1. Are we trying to reduce this API before merging, to avoid having to reduce it later and cut functionality? If so, what elements are in question?
  2. Related to the above, scanning the discussion, I don't see a clear consensus on why someone would want RELOAD_ON_ERROR to be true. This struck me as one of the most unusual settings, because it skips the LED error indication, will not reload something like an e-paper screen, and generally seems like it masks the fact that there was an error at all. Is this something we want to keep? Or is it related to Add supervisor.get_previous_traceback() #3696, maybe?
  3. Integrating this with deep sleep is complicated, because a true deep sleep acts a lot more like a full power reset than a soft reboot, so storing things in the heap isn't going to work. If a "sub-program" is going to restart into itself after a deep sleep (which I think it should, sub-programs shouldn't diverge from their usual behavior just because they're sub-programs) it will need to store the filename in either NVM, the sleep memory, or the scratch registers. These are unfortunately all different per port, and have their own issues:
    • NVM is dangerous, noted by @dhalbert, because some programs may wake fairly frequently, causing significant flash wearing over time.
    • Sleep RAM is made accessible to users via the SleepMemory API, and thus using it competes with those accesses.
    • Many chips have special register banks that are not reset during sleep, but the number of them is inconsistent between vendors and usually small (STM32 has 20 32 bit registers, RP2040 has only 7, etc), and the deep sleep API may need them for other purposes.

Currently, true deep sleep will just "not work" with the next_file system and will always default to startup behavior (ie return to the menu) which we should make a note of in the docs if we push the sleep solution to a later PR.

I think the best solution for deep sleep is probably to add a new API to the Sleep Memory system that keeps track of how much sleep RAM is in use by the user, and adds system reservations to the end of the block, reducing the available "size" at runtime. If it conflicts with user-reserved memory, or if the user tries to encroach on system memory, it'll throw a runtime error.

@hierophect hierophect marked this pull request as ready for review June 20, 2021 21:51
@tannewt
Copy link
Member

tannewt commented Jun 22, 2021

Thanks for taking up work on this! I’m embarrassed about dropping the ball. I’m busy right now but I’ll try to at least take a look at your changes.

No problem! We know folks don't always have time to contribute. We're happy to pick it up. You got it very far!

1. Are we trying to reduce this API before merging, to avoid having to reduce it later and cut functionality? If so, what elements are in question?

No, I don't think so. It's an advanced API so it can have all of the options.

2. Related to the above, scanning the discussion, I don't see a clear consensus on why someone would want `RELOAD_ON_ERROR` to be true. This struck me as one of the most unusual settings, because it skips the LED error indication, will not reload something like an e-paper screen, and generally seems like it masks the fact that there was an error at all. Is this something we want to keep? Or is it related to #3696, maybe?

I think it's ok to have as an option. It's meant for advanced use and presumably the user intends to hide errors.

3. Integrating this with deep sleep is complicated, because a true deep sleep acts a lot more like a full power reset than a soft reboot, so storing things in the heap isn't going to work. If a "sub-program" is going to restart into itself after a deep sleep (which I think it should, sub-programs shouldn't diverge from their usual behavior just because they're sub-programs) it will need to store the filename in either NVM, the sleep memory, or the scratch registers. These are unfortunately all different per port, and have their own issues:
   
   * NVM is dangerous, noted by @dhalbert, because some programs may wake fairly frequently, causing significant flash wearing over time.
   * Sleep RAM is made accessible to users via the SleepMemory API, and thus using it competes with those accesses.
   * Many chips have special register banks that are not reset during sleep, but the number of them is inconsistent between vendors and usually small (STM32 has 20 32 bit registers, RP2040 has only 7, etc), and the deep sleep API may need them for other purposes.

Currently, true deep sleep will just "not work" with the next_file system and will always default to startup behavior (ie return to the menu) which we should make a note of in the docs if we push the sleep solution to a later PR.

I think the best solution for deep sleep is probably to add a new API to the Sleep Memory system that keeps track of how much sleep RAM is in use by the user, and adds system reservations to the end of the block, reducing the available "size" at runtime. If it conflicts with user-reserved memory, or if the user tries to encroach on system memory, it'll throw a runtime error.

After today's meeting we decided to set aside part of sleep memory to store the next filename. It'll be part of a fixed area so that the user accessible portion of sleep memory is consistent.

@cwalther
Copy link
Author

My 2 cents, without having examined the code yet:

  1. Are we trying to reduce this API before merging, to avoid having to reduce it later and cut functionality? If so, what elements are in question?

I did originally intend to pare down the options, but if you are OK with keeping them, that’s fine with me too. I think we were approaching a consensus around #1084 (comment).

  1. Related to the above, scanning the discussion, I don't see a clear consensus on why someone would want RELOAD_ON_ERROR to be true. This struck me as one of the most unusual settings, because it skips the LED error indication, will not reload something like an e-paper screen, and generally seems like it masks the fact that there was an error at all. Is this something we want to keep? Or is it related to Add supervisor.get_previous_traceback() #3696, maybe?

RELOAD_ON_ERROR is useful for an application that wants to display the error in its own GUI (or whatever) instead of having it on the serial output and/or terminalio. Yes, that requires #3696. The scenario I’m coming from is that of a game console where the application consists of two parts: a main menu that was written by experts, is finished and more or less bug-free, and a game that is under development by a beginner and may often end with an error. The main menu runs the game by setting it as the next code file and reloading, and when the game exits the main menu takes over again, retrieving and displaying a potential error backtrace from the game. I think the idea came up around here: #1084 (comment).

  1. Integrating this with deep sleep is complicated

As far as I’m concerned, dealing with deep sleep is outside of the scope of this proposal. I’m fine with having it “not work” (work the same way as a hard power-off). I agree it should be mentioned in the function docs though, where currently only “powering off or resetting” is mentioned. But if you can make it work, all the better.

Finally: “All checks have passed” – does that mean it now also fits on the small SAMD21s and doesn’t need to be made optional? Nice!

@hierophect
Copy link
Collaborator

We've discussed adding all of the deep sleep integration settings in a new PR, so if this is passing CI and other people can confirm that it works for their purposes under test, it should be good to go, right?

Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me! Thanks for picking this up @hierophect!

@tannewt tannewt merged commit d67fb85 into adafruit:main Jun 25, 2021
@cwalther
Copy link
Author

I have tested the new version and it appears to work fine in my game console menu use case. 👍

Some code review notes:

Your goto replacement brings some changes in behavior, but I believe they are inconsequential:

  • In the case of a reload-on-success, your version prints “Code done running.”, mine didn’t. Arguable which is better.
  • If result.return_code & PYEXEC_FORCED_EXIT, the new behavior is wrong. However, as far as I can see, that never happens (PYEXEC_FORCED_EXIT is only used in the context of the REPL). So I’m not sure why these lines had been there to begin with.

I’m not totally sure about the interaction with deep sleep since I still haven’t put in the effort to understand how that works in detail, but it seems to me like in the event of entering a fake deep sleep (not a real deep sleep because then the settings are lost anyway), the sticky_on_error setting takes effect. I think I would rather expect sticky_on_success to take effect. Or actually none at all, not even NEWLY_SET, for consistency with real deep sleep – for now, until persistency over deep sleep is implemented.

@hierophect
Copy link
Collaborator

  • In the case of a reload-on-success, your version prints “Code done running.”, mine didn’t. Arguable which is better.

This was intentional, I didn't want the code to restart without any text indicator that it did so.

  • If result.return_code & PYEXEC_FORCED_EXIT, the new behavior is wrong. However, as far as I can see, that never happens (PYEXEC_FORCED_EXIT is only used in the context of the REPL). So I’m not sure why these lines had been there to begin with.

I will double check this, not sure if I understand you totally.

I’m not totally sure about the interaction with deep sleep since I still haven’t put in the effort to understand how that works in detail, but it seems to me like in the event of entering a fake deep sleep (not a real deep sleep because then the settings are lost anyway), the sticky_on_error setting takes effect.

Currently the system masks out deep sleep exceptions and will never skip when they are present. The interaction issue is that true deep sleep, across all ports, acts more like a power reset than a soft-reboot - it clears all ram and register contents except for those in the vendor's sleep management peripheral. To store any information across deep sleep at all requires use of special ram banks that we refer to as "sleep memory", which are handled differently based on the port since they all have different hardware implementations.

Without a special API integration using this sleep memory, a chip waking from deep sleep has no way of knowing what the next file allocation was, and will return to the main menu as if it was waking from a reset. This is not intuitive behavior for a sub-program: users will expect that their software examples would work the same whether they are main/sub program, but in the deep sleep case they would act very differently. It's a bad "philosophical" problem, which is why I'm putting in the extra effort to make sure deep sleep programs can wake up to the same file they slept in.

@cwalther
Copy link
Author

I will double check this, not sure if I understand you totally.

Specifically, if reload_requested == false and next_code_options == SUPERVISOR_NEXT_CODE_OPT_RELOAD_ON_ERROR and (hypothetically) result.return_code == PYEXEC_FORCED_EXIT, then the result is skip_repl = false. Correct (and achieved in the goto solution) would be skip_repl = true.

Currently the system masks out deep sleep exceptions and will never skip when they are present.

That much is fine. That is the reload_on_… part of the behavior, and I agree with that. What I’m questioning is the sticky_on_… part: In the case of entering deep sleep, I think we should rather have next_code_stickiness_situation |= SUPERVISOR_NEXT_CODE_OPT_STICKY_ON_SUCCESS than next_code_stickiness_situation |= SUPERVISOR_NEXT_CODE_OPT_STICKY_ON_ERROR. Going to sleep seems like a success to me more than an error.

Or perhaps it’s a separate category from “success”, “error”, and “reload” altogether? There probably doesn’t need to be an additional argument sticky_on_deepsleep, it could always be effectively True.

@hierophect
Copy link
Collaborator

@cwalther if you are entering a deep sleep exception for true deep sleep, the code stickiness situation is irrelevant - all of that information will be lost when the RAM shuts down. The program will never reach a decision point where it would use it. This is slightly different for a fake deep sleep, but we try to make fake sleep as similar to true deep sleep as possible, so the stickyness should not be retained in that case either.

@hierophect
Copy link
Collaborator

Think of deep sleep as basically being an exception that causes the entire chip to lose power for a set duration of time, and start up essentially from scratch, with some very limited data that we can fetch out of a special section of memory.

@cwalther
Copy link
Author

cwalther commented Jul 4, 2021

It appears we are talking past each other. 🙂 Sorry, I should have been clearer which of my suggestions apply to the current situation where storage in sleep memory for persistence across deep sleep is not implemented yet, and which to the future situation where it is.

we try to make fake sleep as similar to true deep sleep as possible, so the stickyness should not be retained in that case either.

I agree with that goal. But I claim it is not what your code does.

In order to stop speculating, I just tried it on a Feather S2:

code.py:

import time
import alarm
import supervisor
print('waking in code.py')
supervisor.set_next_code_file('other.py', sticky_on_success=False, sticky_on_error=True)
alarm.exit_and_deep_sleep_until_alarms(alarm.time.TimeAlarm(monotonic_time=time.monotonic() + 5))

other.py:

import time
import alarm
print('waking in other.py')
alarm.exit_and_deep_sleep_until_alarms(alarm.time.TimeAlarm(monotonic_time=time.monotonic() + 5))

Actual result:

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
waking in code.py

Code done running.

Press any key to enter the REPL. Use CTRL-D to reload.
Pretending to deep sleep until alarm, CTRL-C or file write.
Woken up by alarm.
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
other.py output:
waking in other.py

Code done running.

Press any key to enter the REPL. Use CTRL-D to reload.
Pretending to deep sleep until alarm, CTRL-C or file write.
Woken up by alarm.
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
other.py output:
waking in other.py

Code done running.

i.e. code, other, other, other, …

Expected result in the current situation (no sleep memory):
Deep sleep doesn’t retain next file info, so fake deep sleep shouldn’t either.

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
waking in code.py

Code done running.

Press any key to enter the REPL. Use CTRL-D to reload.
Pretending to deep sleep until alarm, CTRL-C or file write.
Woken up by alarm.
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
waking in code.py

Code done running.

Press any key to enter the REPL. Use CTRL-D to reload.
Pretending to deep sleep until alarm, CTRL-C or file write.
Woken up by alarm.
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
waking in code.py

Code done running.

i.e. code, code, code, …

Expected result in the future situation with sleep memory:
sticky_on_error should be ignored since going to sleep is not an error.

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
waking in code.py

Code done running.

Press any key to enter the REPL. Use CTRL-D to reload.
Pretending to deep sleep until alarm, CTRL-C or file write.
Woken up by alarm.
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
other.py output:
waking in other.py

Code done running.

Press any key to enter the REPL. Use CTRL-D to reload.
Pretending to deep sleep until alarm, CTRL-C or file write.
Woken up by alarm.
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
waking in code.py

Code done running.

i.e. code, other, code, other, code, …

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants