Skip to content

Clarify SPI timing behaviour (currently completely broken!) #264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mgottschlag opened this issue Jan 15, 2021 · 6 comments · Fixed by #365
Closed

Clarify SPI timing behaviour (currently completely broken!) #264

mgottschlag opened this issue Jan 15, 2021 · 6 comments · Fixed by #365
Milestone

Comments

@mgottschlag
Copy link

mgottschlag commented Jan 15, 2021

TLDR

Currently, it is unclear when exactly the SPI transfer functions return. Some drivers assume that the functions return after the clock line has returned to idle state. This is not how some (most?) HALs implement SPI. The documentation should clarify the expected behaviour.

The Problem

I am currently building an application with https://github.com/astro/embedded-nrf24l01 on a system with an STM32F429 clocked at 168 MHz. I clocked the SPI bus at 500 KHz -- much slower than the system clock, which to a certain degree caused the following problems -- and subsequently saw very weird behaviour. embedded-nrf24l01 would pull the CS pin high and then low again before the SCK signal reached idle, so the device only ever registered one single long transfer instead of multiple small transfers as required to send individual commands. All but the first command was therefore ignored or incorrectly processed by the device.

The corresponding code in embedded-nrf24l01 is as follows:

    self.csn.set_low().unwrap();
    let transfer_result = self.spi.transfer(buf).map(|_| {});
    self.csn.set_high().unwrap();

Changing the code as follows solves all problems, because then all CS changes occur while the SPI bus is idle:

    cortex_m::asm::delay(200);
    self.csn.set_low().unwrap();
    cortex_m::asm::delay(200);
    let transfer_result = self.spi.transfer(buf).map(|_| {});
    cortex_m::asm::delay(200); // <- this line is what fixes the problem, the other delays are just to make the signal look more like in the datasheet
    self.csn.set_high().unwrap();

The Solution(s)

  1. The documentation could state that the SPI functions return as soon as the last bit was read from the bus. Device drivers would then have to add delays where necessary, similar to the code above. I do not know how many devices would require this type of delay -- if many devices need it, this solution is a huge footgun and substantially increases complexity. I hardcoded the delay cycles in the example above, but that's only possible because I know the clock speed of the device.
  2. The documentation could state that the SPI functions return as soon as the SCK signal is idle again. Probably, all HALs would have to be modified.

Other Problems

If I understand it correctly, this bug is not completely identical to stm32-rs/stm32f0xx-hal#130, although that ticket shows a similar problem with SPI documentation: The documentation should (or should not?) state that transfer() finishes after the last bit has been transferred, not after the bits have been shoved into a FIFO.

@mgottschlag
Copy link
Author

A third possible solution would be to add a function "poll_idle()" which waits until the bus is idle again (potentially in a separate trait?) and which could be used by drivers requiring specific timings.

I just have the gut feeling that there might be a lot of drivers which are broken, where nobody noticed because people are using slower microcontrollers and higher SPI speeds.

@ryankurte
Copy link
Contributor

hmm, good work catching this, and thanks for opening an issue! timing bugs can be a real... pain.

The documentation could state that the SPI functions return as soon as the SCK signal is idle again. Probably, all HALs would have to be modified.

my expectation would be that transfer and other blocking SPI functions should not return until the particular transaction is complete, given the need to manually manage CS i don't think there are really other viable approaches (for the blocking traits). unless i'm overlooking something (@rust-embedded/hal ?) we could stand to improve the documentation to state that the transaction must be complete prior to returning.

looking at the STM32F4 reference manual (P 893, Section 28.3.7) i can see exactly how this situation would happen wrt. TXE vs. BUSY flags, probably we need to be checking TXE when transmitting then BUSY once the TX buffer is empty.

ST RM0080 28.3.7 SPI Status flags Screen Shot 2021-01-16 at 12 12 00 PM

A third possible solution would be to add a function "poll_idle()" which waits until the bus is idle again (potentially in a separate trait?) and which could be used by drivers requiring specific timings.

the stm32f4xx-hal seems to use our Default impl for blocking methods over spi::FullDuplex which doesn't have the ability to signal done-ness, so it looks to me like we may also need to add an .idle() or whatever method to this trait to pass this through.

as you say none of these problems are likely show up with fast SPI or on platforms with syscalls to manage transactions so, nice catch!

@ryankurte ryankurte added this to the v1.0.0 milestone Jan 15, 2021
@ryankurte
Copy link
Contributor

cc. @therealprof since it looks like you're already across this in stm32-rs/stm32f0xx-hal#130

@mgottschlag
Copy link
Author

The result of a discussion in @stm32-rs:matrix.org was that an abstraction which manages the CS pin as well solves all these problems - implementations would either use software-managed CS and add delays as required (needs to be implemented in the HAL) or would use hardware-managed CS where the hardware "just works". I did not have a close look at the corresponding pull requests for embedded-hal yet.

Performance-wise it may make sense to support transfers where the HAL does not wait until the bus is idle, as multiple transfers to the same device without changes to CS could be chained with higher performance.

Note that on STM32F4, according to the timing diagrams in the manual (and my own tests) the BSY bit does unfortunately not indicate that the bus is idle - it indicates that the last bit was transmitted (i.e., it is often set half a cycle too early). The HAL therefore would need to add a manual delay.

@ryankurte
Copy link
Contributor

on one hand i agree that managing CS can solve this, on the other, how do we avoid this situation occuring for other hal users?

... an abstraction which manages the CS pin as well solves all these problems - implementations would either use software-managed CS and add delays as required (needs to be implemented in the HAL) or would use hardware-managed CS where the hardware "just works"

can i interest you in a ManagedCS trait perhaps?

Performance-wise it may make sense to support transfers where the HAL does not wait until the bus is idle, as multiple transfers to the same device without changes to CS could be chained with higher performance.

we have the Transactional API for chaining multiple transfers (ideally within a single CS assertion), my feeling is that the least surprising approach is for the spi::blocking::* calls to always leave the bus in an inactive state?

Note that on STM32F4, according to the timing diagrams in the manual (and my own tests) the BSY bit does unfortunately not indicate that the bus is idle

how frustrating!

@ryankurte
Copy link
Contributor

ryankurte commented Jan 19, 2021

a friend pointed out that the correct process for determining transaction completion is documented in the procedure for disabling the SPI, using FTLVL and FRLVL, which might be useful for stm32-rs side of this (thanks @TravisScott!).

Procedure for disabling the SPI - STM32L451 RM p. 1315

image

edit: this may not apply to all chip families, but hopefully the RM for the appropriate family can resolve this in some cases.

and the documentation update here should be a reasonably straightforward PR if y'all are interested in opening it ^_^

Dirbaio added a commit to embassy-rs/embedded-hal that referenced this issue Feb 16, 2022
Dirbaio added a commit to embassy-rs/embedded-hal that referenced this issue Feb 16, 2022
Dirbaio added a commit to embassy-rs/embedded-hal that referenced this issue Feb 17, 2022
Dirbaio added a commit to embassy-rs/embedded-hal that referenced this issue Feb 17, 2022
Dirbaio added a commit to embassy-rs/embedded-hal that referenced this issue Feb 17, 2022
Dirbaio added a commit to embassy-rs/embedded-hal that referenced this issue Feb 18, 2022
bors bot added a commit that referenced this issue Feb 24, 2022
365: spi: allow impls returning early, add flush.  r=therealprof a=Dirbaio

Fixes #264 
Depends on #351 

- Document that all methods may return early, before bus is idle.
- Add a `flush` method to wait for bus to be idle.

The alternative is to document that methods MUST wait for bus idle before returning, but that would be bad for performance IMO. Allowing early return means multiple successive writes can "overlap" so that bytes come out back-to-back without gaps, which is great for workloads with tons of small writes, like SPI displays with `embedded-graphics`.

Drivers using `SpiDevice` won't have to worry about this, it'll Just Work since Device impls must flush before deasserting CS. Drivers using SpiBus will have to care if they coordinate SPI activity with other GPIO activity.

Co-authored-by: Dario Nieuwenhuis <[email protected]>
@bors bors bot closed this as completed in 9c00620 Feb 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants