Skip to content

[RFC] Should we add methods to clear errors? #85

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
therealprof opened this issue Jun 2, 2018 · 15 comments
Open

[RFC] Should we add methods to clear errors? #85

therealprof opened this issue Jun 2, 2018 · 15 comments

Comments

@therealprof
Copy link
Contributor

I've just discovered that the nice errors we may create in HAL implementations have no standard way to get cleared which often means that the peripheral is stuck in that error.

E.g. If we use a hal::serial:Read implementation we may get an Error::Overrun reported by the implementation but without clearing the error flag in the MCU peripheral, any consecutive calls will return the same error over and over again.

There're three ways to address this:

  1. Have each HAL implementation create their own custom method to do this
  2. Automatically clear the error in hardware before returning it
  3. Add a method to the HAL traits allowing the user to clear errors

2.) and 3.) have a set of pros and cons attached to them so I'd like to feel public opinion on this.

CC @japaric @hannobraun @thejpster @astro @ilya-epifanov

@bergus
Copy link

bergus commented Jun 2, 2018

I like #​2. What's the con?

@therealprof
Copy link
Contributor Author

@bergus Lack of application control and accidentally ignored problems due to developer laziness. Possibly there're more.

@thejpster
Copy link
Contributor

thejpster commented Jun 2, 2018 via email

@TheZoq2
Copy link

TheZoq2 commented Jun 3, 2018

I also like 2, but 3 would also work well. I made a PR to the stm32f103 repo a while ago implementing 2 japaric/stm32f103xx-hal#53

@hannobraun
Copy link
Member

I always implicitely assumed that option 2 would be the one to go with. Option 3 would be second-best, in my opinion, but I haven't stumbled upon any use cases that require that level of control.

I really hope that option 2 is good enough for the general case, as that would be more convenient to the user, and seems like the more Rustic API design to me.

@therealprof
Copy link
Contributor Author

I always implicitely assumed that option 2 would be the one to go with.

That's exactly the problem with that approach: It isn't done that way at the moment and it's also not documented that it should be done that way. And even if it was, there's no way to verify it other than inspect the code or test it on the hardware.

@hannobraun
Copy link
Member

I've thought some more about this. I'm changing my vote to option 3:

  1. @therealprof is making a good point.
  2. An implementation can decide to reset the error on return, and make the reset method a no-op. The approach gives us the benefits of option 2 (user doesn't have to reset manually), while adding a fallback for HAL implementers that otherwise would have forgotten to reset the errors.

@bergus
Copy link

bergus commented Jun 6, 2018

@therealprof yes, setting and documenting a standard is the problem, regardless of which approach is chosen. I don't however get the point you're trying to make with "there's no way to verify it other than inspect the code or test it on the hardware." - isn't that always the case?

@hannobraun

An implementation can decide to reset the error on return, and make the reset method a no-op.

I would strictly recommend not do do that. Creating such inconsistencies would only cause confusion, and telling the user "we have a method that you don't need to use" kind of makes the trait useless.

@hannobraun
Copy link
Member

@bergus

An implementation can decide to reset the error on return, and make the reset method a no-op.

I would strictly recommend not do do that. Creating such inconsistencies would only cause confusion, and telling the user "we have a method that you don't need to use" kind of makes the trait useless.

I didn't mean that we (embedded-hal) should say "you, the user, don't really need to call that, sometimes". I meant that a HAL implementer can say "my users won't suffer, if they forget to call that method". Of course such a user's code wouldn't be portable over multiple HAL implementations, so there's limited use to that.

As for recommending not to do that, there might not be a choice. I reviewed some of my own error handling code before posting my previous comment, and I noticed that the flags auto-reset when I read the register, so my error reset method wouldn't have anything to do.

(Granted, there's another set of registers that doesn't seem to auto-reset the flags, but I'm not sure if using them would be practical for me. In any case, we can't assume that hardware in the wild always requires or allows manual reset of error flags.)

@TheZoq2
Copy link

TheZoq2 commented Jun 6, 2018

While I prefer 2, I wouldn't mind 3 either. However, I think it's important to be consitent. If 3 is chosen, but resets are also done automatically by some crates, I see some potential issues.

  • I write a crate but forget to call reset on one of my errors. However, I only test on hardware with an implementation that auto-reset them. Then my code will appear bug free, but will fail if I run it with another hal implementation.
  • I write a crate that for some reason relies on flags not being explicitly reset, but the HAL resets them anyway. That might be a tricky bug to track down.

Additionally, as @hannobraun said, sometimes the hardware resets the flags automatically after being read, that would make 3 impossible to implement if it is enforced that resets should be done manually. I can't think of a case where 2 would be impossible to implement, though I guess it would add some overhead if flags are reset even if the user doesn't want them to be.

Finally, I think forgetting to reset error flags might be a common user mistake which would also be fairly tricky to track down. Having reset being done implicitly by the HAL, would prevent that issue as long as the implementation is correct.

@therealprof
Copy link
Contributor Author

Finally, I think forgetting to reset error flags might be a common user mistake which would also be fairly tricky to track down.

Not sure what you mean by that. Today the calls always return with a result that requires handling, if it returns with an error the user needs to handle it, however there's no way to deal with the error so one can only ignore it (or do something drastic like reinitialise the serial device, if that is possible).

If we add a clear_error() (or whatever) method the user has something to call instead. If the device automatically clears errors when reading the status, then the clear_error() would be an empty implementation but that is fine because the serial device would continue to work. For devices where automatic clearing is not done in hardware but needs to be done with software the clear_error() would contain an implementation to clear the error and the same user code would work in both cases.

@therealprof
Copy link
Contributor Author

@bergus

I don't however get the point you're trying to make with "there's no way to verify it other than inspect the code or test it on the hardware." - isn't that always the case?

Not quite. Having an implicit error clearing means relying on the fact that the HAL author recognised that this needs to be done and implemented. Having an explicit method to do that means that there's no way around implementing it or it or the HAL driver can't be used.

@TheZoq2
Copy link

TheZoq2 commented Jun 8, 2018

Not sure what you mean by that

What I mean is that with method 2, the responsibility to clear errors is on the HAL implementation author. This means that only a few well tested crates would need to clear the errors and as a user of those crates you have nothing to worry about.

With method 3, every time there is a potential to return an error, the user would need to remember that simply passing the error along isn't enough and it needs to be cleared which is fairly uncommon in other parts of the language and might therefore be easy to forget.

Having an explicit method to do that means that there's no way around implementing it or it or the HAL driver can't be used.

That's a good point, I didn't think of that.

@therealprof
Copy link
Contributor Author

What I mean is that with method 2, the responsibility to clear errors is on the HAL implementation author. This means that only a few well tested crates would need to clear the errors and as a user of those crates you have nothing to worry about.

The responsibility for the correct implementation is always on the HAL implementation author anyways.

With method 3, every time there is a potential to return an error, the user would need to remember that simply passing the error along isn't enough

Simply passing an error along is never enough, at some point you'll need to handle it.

and it needs to be cleared which is fairly uncommon in other parts of the language and might therefore be easy to forget.

I disagree, clearing errors is pretty much always a necessity when dealing with I/O (and this happens to be I/O, too): it really doesn't matter whether you're trying to write to a file and the disk quota is exceeded or to a network socket that was just closed on the other end, the user has to do something meaningful to get rid of the error at runtime or print an error and give up...

@austinglaser
Copy link
Contributor

I've got a related PR open for stm32f30x-hal (japaric/stm32f30x-hal#14), which has been hanging fire since I've had no free time to test it on hardware. It implements method 2.

With that said, I think the arguments here have convinced me that method 3 is the best way to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants