Skip to content

Conversation

@zagor
Copy link
Contributor

@zagor zagor commented Jan 30, 2023

PR #30403 implemented nocache regions for ethernet DMA buffers in sram3 to fix issue #29915. Unfortunately, chip variants STM32H742xx do not have any sram3, so they still suffer from #29915.

All H7 variants have sram2 though, so lets use that instead.

(Additionally, I removed the ifdef DT_NODE_HAS_STATUS(DT_NODELABEL(sramX), okay) because it hid the problem. The H7 ethernet driver requires nocache dma buffers, we should not let the build succeed without them.)

Fixes #29915

@rlezuo1
Copy link
Contributor

rlezuo1 commented Feb 6, 2023

Have tried to cherry-pick this one on top of 4256cd4 (v3.2.0), resulted in crash and reboot during early zephyr init.
Cherry-picking this one: 8ed12ae though made my Ethernet very stable (3.838.506 flood pings, 1 lost).

This is on a H735ig (has no sram3).

@zagor
Copy link
Contributor Author

zagor commented Feb 6, 2023

@rlezuo1 That's odd. I'd like to know more about what goes wrong. Is your build using sram2 for anything else?

As I understand the issue, the 8ed12ae patch works by force-flushing the AXI SRAM (D1 domain), where your dma buffers are currently positioned. Moving the buffers to an AHB SRAM area (D2 domain) removes the need for flushing. This is the solution ST recommends, and it is already merged into Zephyr. Now we just need to make it work for targets that lack sram3.

@rlezuo1
Copy link
Contributor

rlezuo1 commented Feb 6, 2023

@zagor sram2 is not in use, the bootloader leaves some bytes in sram4 though. dma_tx_buffer_ends up @ 0x24000000. Perhaps this change depends on some other commits which have been merged after v3.2.0 ? It applied cleanly and looked plausible though.

@zagor
Copy link
Contributor Author

zagor commented Feb 7, 2023

@rlezuo1 I just tried this on v3.2.0 and it works fine on my H742 board. Are you able to debug your target and see more details about what goes wrong?

@zagor
Copy link
Contributor Author

zagor commented Feb 7, 2023

dma_tx_buffer_ends up @ 0x24000000.

Wait a minute. 0x24000000 is the AXI SRAM, SRAM2 is located at 0x30004000.
Your map file should have a section looking like this:

eth_stm32       0x0000000030020000     0x4000
                0x0000000030020000                . = ABSOLUTE (0x30020000)
 *(SORT_BY_ALIGNMENT(.eth_stm32_desc))
 .eth_stm32_desc
                0x0000000030020000       0xc0 zephyr/drivers/ethernet/libdrivers__ethernet.a(eth_stm32_hal.c.obj)
                0x0000000030020100                . = (ABSOLUTE (0x30020000) + 0x100)
 *fill*         0x00000000300200c0       0x40 
 *(SORT_BY_ALIGNMENT(.eth_stm32_buf))
 .eth_stm32_buf
                0x0000000030020100     0x2fc0 zephyr/drivers/ethernet/libdrivers__ethernet.a(eth_stm32_hal.c.obj)
                0x0000000030024000                . = (ABSOLUTE (0x30020000) + 0x4000)
 *fill*         0x00000000300230c0      0xf40 

@rlezuo1
Copy link
Contributor

rlezuo1 commented Feb 7, 2023

You've pushed me the right direction with that address comment, I've found the reason it does not work, you'll like that one, there is a very subtle difference in sram2 addresses across h7 product lines it seems, screenshot from h723/733, h725/735, h730 RM:

image

---
 dts/arm/st/h7/stm32h723.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/dts/arm/st/h7/stm32h723.dtsi b/dts/arm/st/h7/stm32h723.dtsi
index 0291c66163..6f56053c63 100644
--- a/dts/arm/st/h7/stm32h723.dtsi
+++ b/dts/arm/st/h7/stm32h723.dtsi
@@ -93,8 +93,8 @@
 	};
 
 	/* D2 domain, AHB SRAM */
-	sram2: memory@30040000 {
-		reg = <0x30040000 DT_SIZE_K(16)>;
+	sram2: memory@30004000 {
+		reg = <0x30004000 DT_SIZE_K(16)>;
 		compatible = "zephyr,memory-region", "mmio-sram";
 		zephyr,memory-region = "SRAM2";
 	};
-- 

with above patch applied sram2 DMA buffers work as expected

@zagor
Copy link
Contributor Author

zagor commented Feb 8, 2023

 -	sram2: memory@30040000 {
 -		reg = <0x30040000 DT_SIZE_K(16)>;
 +	sram2: memory@30004000 {
 +		reg = <0x30004000 DT_SIZE_K(16)>;

Ouch, that's pretty severe and needs a separate PR. Do you want to do it @rlezuo1 or shall I?

@rlezuo1
Copy link
Contributor

rlezuo1 commented Feb 8, 2023

@zagor: have done: #54593

@zagor
Copy link
Contributor Author

zagor commented Feb 8, 2023

@erwango Do we really want to delay this fix to 3.4.0?

Ethernet is broken (i.e exhibits #29915) for at least STM32H723, 725, 733, 735 and 742.

I know 3.3 is frozen, but this feels pretty serious to me.

@erwango
Copy link
Member

erwango commented Feb 8, 2023

lemented nocache regions for ethernet DMA buffers in sram3 to fix issue #29915. Unfortunately, chip variants STM32H742xx do not have any sram3, so they still suffer from #29915.

All H7 variants have sram2 though, so lets use that instead.

(Additionally, I removed the ifdef DT_NODE_HAS_STATUS(DT_NODELABEL(sramX), okay) because it hid the problem. The H7 ethernet driver requires nocache dma buffers, we should not let the build succeed without them.)

Actually we've done rather intensive testing on STM32 ethernet recently. We didn't notified specific issue, at least the one you mention.
Do I understand correctly that issues requires CONFIG_DEBUG=y ?

@rlezuo1
Copy link
Contributor

rlezuo1 commented Feb 8, 2023

@erwango No it does not require DEBUG_CONFIG=y,

bug triggers when:

CONFIG_ETH_STM32_HAL_USE_DTCM_FOR_DMA_BUFFER=n
CONFIG_SOC_SERIES_STM32H7X=y

results in crash before main starts

@erwango
Copy link
Member

erwango commented Feb 8, 2023

bug triggers when: CONFIG_ETH_STM32_HAL_USE_DTCM_FOR_DMA_BUFFER=n
CONFIG_SOC_SERIES_STM32H7X=y

ETH_STM32_HAL_USE_DTCM_FOR_DMA_BUFFER is F7 specific, so you basically mean ethernet is broken for H7.
And this is what I'm finding strange given the tests we've performed (several boards and nucleo_h723zg)
And this is also because of this status that I'm relunctant to merge late fixes.
This being said, I'm taking this seriously. We'll do more tests and try to reproduce your issue.
We'll also perform tests with this fix. If we don't find issues related, I'm ok to merge it.

@zagor
Copy link
Contributor Author

zagor commented Feb 8, 2023

To clarify, the bug that this PR is about is #29915. It manifests as eth_stm32_hal: HAL_ETH_TransmitIT tx_int_sem take timeout and eth_stm32_hal: eth packet timeout. This bug was fixed for the larger STM32H7 variants in PR #30403.

Unfortunately, the #30403 fix assumes the existence of sram3. If sram3 does not exist, it silently reverts to putting ethernet DMA buffers in AXI SRAM, which is what causes bug #29915.

@erwango erwango modified the milestones: v3.4.0, v3.3.0 Feb 8, 2023
Copy link
Member

@erwango erwango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look to sram2 definition on other H7 devices and I'm not 100% confident they are correctly done (for instance stm32h7a3 defines AXI SRAM Address, some sram2 regions declare 384K) and that change is 100% harmless on existing configurations: what is the impact on users already using sram2 in cacheable configuration?

I agree this is a required change but given we're a bit late in the party to merge risky change, can you re-arrange this PR in order not to impact SoC that currently use SRAM3 ?

In next DV we'll take time to revisit this, review existing SRAM descriptions, ensure this is harmless on existing configuration or notify users otherwise.

@zagor zagor force-pushed the move-stm32h7-eth-dma branch from 7e9a0e8 to 71dd619 Compare February 8, 2023 16:14
@zagor
Copy link
Contributor Author

zagor commented Feb 8, 2023

Patch updated, now uses sram3 if available and sram2 otherwise.

I don't have both kinds of targets to test on but I built a sample for 723 and 753 and could see that the buffers end up in sram2 on the 723 and sram3 on the 753:

STM32H723:
Memory region         Used Size  Region Size  %age Used
           FLASH:      132936 B         1 MB     12.68%
             RAM:       47336 B       320 KB     14.45%
     BACKUP_SRAM:          0 GB         4 KB      0.00%
           SRAM1:          0 GB        16 KB      0.00%
           SRAM2:         16 KB        16 KB    100.00%
           SRAM4:          0 GB        16 KB      0.00%
            DTCM:          0 GB       128 KB      0.00%
            ITCM:          0 GB        64 KB      0.00%
        IDT_LIST:          0 GB         2 KB      0.00%

STM32H753:
Memory region         Used Size  Region Size  %age Used
           FLASH:      132200 B         2 MB      6.30%
             RAM:       47336 B       512 KB      9.03%
           SRAM1:          0 GB       128 KB      0.00%
           SRAM2:          0 GB       128 KB      0.00%
           SRAM3:         16 KB        32 KB     50.00%
           SRAM4:          0 GB        64 KB      0.00%
            DTCM:          0 GB       128 KB      0.00%
        IDT_LIST:          0 GB         2 KB      0.00%

PR zephyrproject-rtos#30403 implemented nocache regions for ethernet DMA buffers in sram3 to
fix issue zephyrproject-rtos#29915. Unfortunately, some STM32H7 variants do not have any
sram3 so they still suffer from zephyrproject-rtos#29915.

All H7 variants have sram2 though, so use that for targets without sram3.

Signed-off-by: Björn Stenberg <[email protected]>
erwango
erwango previously approved these changes Feb 8, 2023
Copy link
Member

@erwango erwango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick change.
This is fine for me.

@stephanosio stephanosio merged commit dcbc56c into zephyrproject-rtos:main Feb 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

eth: stm32h747i_disco: sem timeout and hang on debug build

7 participants