Firmware transaction timeout when setting backlight #5397

Closed
HakanL opened this issue Mar 24, 2023 · 61 comments

@HakanL

HakanL commented Mar 24, 2023

Describe the bug

Intermittent errors when setting backlight brightness. It works sometimes, but once I get this error it stops working (and the system becomes unstable). To make things more complicated, this is a CM4 on a custom carrier board, so I can't rule out a hardware problem (which I doubt, since this board has worked before without issues). I'm posting this issue to see if there's any indication of what the root cause may be.

Steps to reproduce the behaviour

Set backlight brightness

Device (s)

Raspberry Pi CM4

System

uname -a
Linux a487250 5.15.34-v8 #1 SMP PREEMPT Tue Apr 19 19:21:26 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

Logs

[10147.406933] i2c-bcm2835 fe205000.i2c: Got unexpected interrupt (from firmware?)
[10158.470804] ------------[ cut here ]------------
[10158.470826] Firmware transaction timeout
[10158.470877] WARNING: CPU: 1 PID: 10222 at drivers/firmware/raspberrypi.c:67 rpi_firmware_property_list+0x138/0x20c
[10158.470913] Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6table_filter ipt_REJECT nf_reject_ipv4 ip6_tables xt_MASQUERADE nf_conntrack_netlink nfnetlink br_netfilter rfkill xt_owner i2c_dev joydev rpi_panel_attiny_regulator edt_ft5x06 v3d gpu_sched raspberrypi_ts bcm2835_v4l2(C) tc358762 bcm2835_isp(C) i2c_mux_pinctrl bcm2835_codec(C) gpio_keys i2c_mux videobuf2_vmalloc bcm2835_mmal_vchiq(C) v4l2_mem2mem rtc_pcf85063 videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 dwc2 raspberrypi_hwmon i2c_brcmstb roles videobuf2_common vc_sm_cma(C) videodev snd_bcm2835(C) mc rpi_backlight backlight rpivid_mem panel_simple drm_dp_aux_bus uio_pdrv_genirq nvmem_rmem uio sch_fq_codel fuse
[10158.471194] CPU: 1 PID: 10222 Comm: Event Loop 1 Tainted: G C 5.15.34-v8 #1
[10158.471209] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[10158.471217] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[10158.471232] pc : rpi_firmware_property_list+0x138/0x20c
[10158.471244] lr : rpi_firmware_property_list+0x138/0x20c
[10158.471256] sp : ffffffc00a5e3ae0
[10158.471262] x29: ffffffc00a5e3ae0 x28: ffffff8100358000 x27: ffffff8100a1fa40
[10158.471286] x26: ffffffd473166b70 x25: ffffffc008475008 x24: ffffff8101e68500
[10158.471308] x23: 0000000000001000 x22: 0000000000000010 x21: ffffff8100a1fa00
[10158.471329] x20: ffffffc008475000 x19: 00000000ffffff92 x18: 0000000000000000
[10158.471351] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[10158.471372] x14: 0000000000000000 x13: 74756f656d697420 x12: 6e6f69746361736e
[10158.471392] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : ffffffd4710dbbac
[10158.471413] x8 : ffffffd4730438a0 x7 : 6361736e61727420 x6 : ffffffd4731d4ab9
[10158.471434] x5 : c0000000ffffefff x4 : 0000000000000000 x3 : 0000000000000027
[10158.471455] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff8100358000
[10158.471476] Call trace:
[10158.471483] rpi_firmware_property_list+0x138/0x20c
[10158.471495] rpi_firmware_property+0x7c/0xfc
[10158.471507] rpi_backlight_update_status+0x68/0xd0 [rpi_backlight]
[10158.471528] backlight_update_status+0x38/0x60 [backlight]
[10158.471552] backlight_device_set_brightness+0x54/0x94 [backlight]
[10158.471571] brightness_store+0x74/0x94 [backlight]
[10158.471589] dev_attr_store+0x24/0x38
[10158.471606] sysfs_kf_write+0x48/0x5c
[10158.471620] kernfs_fop_write_iter+0xc4/0x17c
[10158.471631] new_sync_write+0x80/0xd8
[10158.471646] vfs_write+0x118/0x13c
[10158.471660] ksys_pwrite64+0x58/0x98
[10158.471673] __arm64_sys_pwrite64+0x28/0x34
[10158.471687] invoke_syscall+0x84/0x11c
[10158.471699] el0_svc_common.constprop.0+0xcc/0x100
[10158.471711] do_el0_svc+0x54/0x84
[10158.471721] el0_svc+0x24/0x54
[10158.471737] el0t_64_sync_handler+0xbc/0x158
[10158.471751] el0t_64_sync+0x1a0/0x1a4
[10158.471763] ---[ end trace c634a32393bac947 ]---
[10158.471865] rpi-backlight rpi_backlight: Failed to set brightness
[10159.494846] raspberrypi-clk soc:firmware:clocks: Failed to change fw-clk-arm frequency: -110
[10161.542850] hwmon hwmon1: Failed to get throttled (-110)
[10162.566843] rpi-backlight rpi_backlight: Failed to set brightness
[10165.638847] rpi-backlight rpi_backlight: Failed to set brightness
[10391.686865] INFO: task kworker/3:2:7875 blocked for more than 120 seconds.
[10391.686908] Tainted: G WC 5.15.34-v8 #1
[10391.686927] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10391.686943] task:kworker/3:2 state:D stack: 0 pid: 7875 ppid: 2 flags:0x00000008
[10391.686977] Workqueue: events_freezable mmc_rescan

Additional context

config.txt:

framebuffer_depth=32
dtdebug=1
disable_splash=1
dtoverlay=vc4-kms-v3d
dtoverlay=vc4-kms-dsi-7inch
dtoverlay=i2c-rtc,pcf85063a
dtoverlay=dwc2,dr_mode=host
dtoverlay=gpio-shutdown,gpio_pin=6,active_low=1
dtoverlay=sd0,overclock_50=50
dtoverlay=rpi-ft5406
dtoverlay=rpi-backlight
dtparam=spi=off
dtparam=audio=on
dtparam=i2c_arm=on
enable_uart=1
gpu_mem=16

This is using a DSI clone (Waveshare 4.3" DSI), but has worked fine before.

@pelwell
Contributor

pelwell commented Mar 24, 2023

It's likely that the backlight control is not the thing that killed the VPU, just the first to discover the body, unless the additional load of powering the backlight was the last straw on an underpowered system.

  1. What additional hardware is attached to the Pi?
  2. How is everything powered?
  3. What software would have been active at the time of the crash?

@6by9
Contributor

6by9 commented Mar 24, 2023

dtoverlay=rpi-ft5406
dtoverlay=rpi-backlight

These are both wrong for using the 800x480 DSI screen drivers with vc4-kms-v3d.

vc4-kms-dsi-7inch already includes instantiating the edt-ft5x06 driver for the touch controller, and rpi-panel-attiny-regulator includes the backlight driver.
Having both sets of drivers loaded will result in conflicts over whether GPU or ARM are using i2c-10.

As pelwell says, something appears to have killed the firmware. sudo vcdbg log msg (or sudo vclog -m if on 64bit userspace) may give some clue. If not, then adding start_debug=1 to config.txt, and sudo vcdbg log assert / sudo vclog -a should show the firmware issue.
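
The overlay conflict described above can be checked mechanically. The following is only a minimal Python sketch, assuming the contents of config.txt are available as text. The names `find_conflicting_overlays`, `LEGACY_OVERLAYS` and `KMS_PANEL_OVERLAY` are made up for illustration, the check covers only the two overlays named in this comment, and it ignores conditional sections such as `[pi4]`.

```python
# Overlays named in this thread as wrong alongside vc4-kms-dsi-7inch.
# (Hypothetical helper names; not part of any Raspberry Pi tool.)
LEGACY_OVERLAYS = {"rpi-ft5406", "rpi-backlight"}
KMS_PANEL_OVERLAY = "vc4-kms-dsi-7inch"


def overlay_names(config_text):
    """Return the overlay names from dtoverlay= lines, ignoring parameters."""
    names = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.startswith("dtoverlay="):
            value = line[len("dtoverlay="):]
            name = value.split(",", 1)[0].strip()  # "dwc2,dr_mode=host" -> "dwc2"
            if name:
                names.append(name)
    return names


def find_conflicting_overlays(config_text):
    """List legacy overlays that are loaded together with the KMS panel overlay."""
    names = overlay_names(config_text)
    if KMS_PANEL_OVERLAY not in names:
        return []
    return [name for name in names if name in LEGACY_OVERLAYS]
```

Run against the config.txt from the original report, this would flag both `rpi-ft5406` and `rpi-backlight`.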

@pelwell
Contributor

pelwell commented Mar 24, 2023

Yes - that first log line looks like the smoking gun:

[10147.406933] i2c-bcm2835 fe205000.i2c: Got unexpected interrupt (from firmware?)

That means the ARM has detected an I2C interrupt condition that it wasn't expecting, and clearing it (the only sane thing to do, otherwise the ARM could end up spinning) will probably leave the firmware waiting indefinitely.
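
To see how closely the I2C warning precedes the firmware timeout in a captured log, something like the following could be used. It is a rough sketch, assuming dmesg output with the default `[seconds.microseconds]` prefix; the function name `seconds_from_interrupt_to_timeout` is hypothetical.

```python
import re

# Matches "[10147.406933] message" and "[   81.353694] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def seconds_from_interrupt_to_timeout(dmesg_text):
    """Return seconds between the first I2C warning and the next firmware timeout, or None."""
    interrupt_at = None
    for raw in dmesg_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if interrupt_at is None and "Got unexpected interrupt" in message:
            interrupt_at = stamp
        elif interrupt_at is not None and "Firmware transaction timeout" in message:
            return stamp - interrupt_at
    return None
```

For the log in the original report the gap is roughly 11 seconds.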

@HakanL
Author

HakanL commented Mar 24, 2023

It's likely that the backlight control is not the thing that killed the VPU, just the first to discover the body, unless the additional load of powering the backlight was the last straw on an underpowered system.

  1. What additional hardware is attached to the Pi?
    A PIC microcontroller for input/output, but it's on a separate I2C bus. On that bus I also have a LM75 for temperature, and a pcf85063a RTC clock. The display I2C bus is dedicated to the display, nothing else on that.
  2. How is everything powered?
    Power Over Ethernet with a SIP module, can provide 12W of power.
  3. What software would have been active at the time of the crash?
    So this runs a hardened distribution called BalenaOS, which essentially runs a docker optimized for embedded systems. At the time of crash I only run my own application, which writes to the /sys/class/backlight/rpi_backlight/brightness file.

@pelwell
Contributor

pelwell commented Mar 24, 2023

Power Over Ethernet with a SIP module, can provide 12W of power.

But only a fraction of this power can be delivered if it is all being routed through the Pi - the topology matters. However, my questions were rendered moot by @6by9's observations.

@HakanL
Author

HakanL commented Mar 24, 2023

dtoverlay=rpi-ft5406
dtoverlay=rpi-backlight

These are both wrong for using the 800x480 DSI screen drivers with vc4-kms-v3d.

vc4-kms-dsi-7inch already includes instantiating the edt-ft5x06 driver for the touch controller, and rpi-panel-attiny-regulator includes the backlight driver. Having both sets of drivers loaded will result in conflicts over whether GPU or ARM are using i2c-10.

For some reason, if I only have the vc4-kms-dsi-7inch overlay then touch and backlight aren't working/loaded. I'm sure it's something related to this particular distribution, or how the overlays are pulled in when the image is built, but that's the part I'm trying to solve. I get this in dmesg when I only have vc4-kms-dsi-7inch in config.txt:

[   12.459240] rpi_touchscreen_attiny 10-0045: Failed to read REG_ID reg: -5
[   12.472380] rpi_touchscreen_attiny: probe of 10-0045 failed with error -5

I've been trying to decipher the overlay source file, but I'm coming up short. How can I specify overlays to do the same thing as vc4-kms-dsi-7inch does? Ultimately I will use a different display, so if I can just manually do the necessary overlays then I'll have a better understanding of what components are used.

As pelwell says, something appears to have killed the firmware. sudo vcdbg log msg (or sudo vclog -m if on 64bit userspace) may give some clue. If not, then adding start_debug=1 to config.txt, and sudo vcdbg log assert / sudo vclog -a should show the firmware issue.

Unfortunately this distribution doesn't have those tools. I've tried in the past to manually install them, but was unsuccessful. And I'm positive that if I use a standard distribution then I won't have the problems (but that doesn't help my end goal). Is there a way to "copy" this vclog from this machine to another RPi where I can read it, or is vclog/vcdbg doing direct firmware reads?

@pelwell
Contributor

pelwell commented Mar 24, 2023

vclog is open source, and you can easily build it yourself: https://github.com/raspberrypi/utils/tree/master/vclog

@HakanL
Author

HakanL commented Mar 24, 2023

vclog is open source, and you can easily build it yourself: https://github.com/raspberrypi/utils/tree/master/vclog

Just found it, great, building it now. I can copy files onto the OS and run so hopefully that will bring some clarity. Thanks for your help!

@HakanL
Author

HakanL commented Mar 24, 2023

I think this is what I ran into last time:

root@a487250:/tmp# ./vclog -m
Could not read from Device Tree log starts and log size

Updated config.txt:

root@a487250:/tmp# cat /mnt/boot/config.txt
framebuffer_depth=32
dtdebug=1
start_debug=1
disable_splash=1
dtoverlay=vc4-kms-v3d
dtoverlay=vc4-kms-dsi-7inch
dtoverlay=i2c-rtc,pcf85063a
dtoverlay=dwc2,dr_mode=host
dtoverlay=gpio-shutdown,gpio_pin=6,active_low=1
dtoverlay=sd0,overclock_50=50
dtparam=spi=off
dtparam=audio=on
dtparam=i2c_arm=on
enable_uart=1
gpu_mem=16
root@a487250:/tmp# 

@HakanL
Author

HakanL commented Mar 24, 2023

It seems that I have a firmware from 1/20/2022, I don't know if that is good, bad or expected:

[    0.120554] raspberrypi-firmware soc:firmware: Attached to firmware from 2022-01-20T13:57:04, variant start_cd
[    0.124582] raspberrypi-firmware soc:firmware: Firmware hash is bd88f66f8952d34e4e0613a85c7a6d3da49e13e2

@pelwell
Contributor

pelwell commented Mar 24, 2023

Unfortunately that firmware is too old to support vclog - it needs to contain a patch from Jul 13 2022.
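
A quick way to compare a firmware build date from dmesg against that patch date is sketched below. This assumes the "Attached to firmware from ..." line format shown in this thread and treats the build date as a proxy for whether the patch is included, which is a heuristic rather than a guarantee. As the later comments show, a newer build date alone did not make vclog work on this system. The names `firmware_build_date` and `firmware_may_support_vclog` are hypothetical.

```python
import re
from datetime import date, datetime

# Date of the firmware patch mentioned above, used here as a minimum build date (assumption).
VCLOG_PATCH_DATE = date(2022, 7, 13)

FIRMWARE_RE = re.compile(
    r"Attached to firmware from (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"
)


def firmware_build_date(dmesg_text):
    """Extract the firmware build date from dmesg text, or None if not found."""
    match = FIRMWARE_RE.search(dmesg_text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S").date()


def firmware_may_support_vclog(dmesg_text):
    """True if the firmware was built on or after the patch date."""
    built = firmware_build_date(dmesg_text)
    return built is not None and built >= VCLOG_PATCH_DATE
```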

@pelwell
Contributor

pelwell commented Mar 24, 2023

There is a static build of vcdbg that might work for you: https://drive.google.com/file/d/1HS9E5vnxxNqrizB4mEYrnFoQQ1axSRKm/view?usp=sharing

@HakanL
Author

HakanL commented Mar 24, 2023

There is a static build of vcdbg that might work for you: https://drive.google.com/file/d/1HS9E5vnxxNqrizB4mEYrnFoQQ1axSRKm/view?usp=sharing

Thanks, I can run that, but still no luck :(

root@a487250:/tmp# ./vcdbg log msg
Unable to determine the value of __LOG_START
Unable to read logging_header from 0x00000000
root@a487250:/tmp# 

@HakanL
Author

HakanL commented Mar 24, 2023

Unfortunately that firmware is too old to support vclog - it needs to contain a patch from Jul 13 2022.

Stupid question, but how can I upgrade the firmware? Is that part of the distribution image, or can/should I go through rpi-update or something similar (realizing I don't have that tool in this distribution).

@pelwell
Contributor

pelwell commented Mar 24, 2023

sudo curl -L --output /usr/bin/rpi-update https://raw.githubusercontent.com/raspberrypi/rpi-update/master/rpi-update && sudo chmod +x /usr/bin/rpi-update

To just update the firmware, skipping the kernel:

sudo SKIP_KERNEL=1 rpi-update

However, I don't think it would tell us much. If there were any DT-related errors you would be able to read them from /proc/device-tree/chosen/user-warnings.

@6by9
Contributor

6by9 commented Mar 24, 2023

What is this distribution? 32 or 64bit?

Firmware can be found in https://github.com/raspberrypi/firmware/tree/master/boot. *.elf and *.dat.

The ATTiny on the official panel controls the backlight, DSI->DPI bridge chip, and touch, bridge, and panel reset lines.
What does i2cdetect -y 10 return? Address 0x45 is the microcontroller, and 0x38 is the touch controller (when enabled).
You can manually retrieve the ID register with sudo i2ctransfer -y 10 w1@0x45 0x80 r1@0x45.

There is a question as to whether we are now debugging this 3rd party device rather than the kernel drivers. The drivers are intended for our display, so clones play second fiddle.

@HakanL
Author

HakanL commented Mar 24, 2023

The distribution is 64-bit. I tried to upgrade the firmware using rpi-update (no errors), but after reboot the hash/version is the same in dmesg. Is it possible the boot process downgrades the firmware? Or a better question may be, is the firmware loaded during boot (vs stored on the chip)?

I understand that this is a clone display, but it has worked perfectly fine before (with the dsi-7inch overlay), so I believe there's something wrong with the distribution. It's a commercial distribution (https://www.balena.io/os#download-os), and if there's anything wrong I'll work with them to get it corrected, but the first step is to determine where the issue is, which you're absolutely helping me with, and I greatly appreciate it. I also have an official display, so I can try that as well to eliminate that potential issue. But I'm still curious whether I can manually enter the dtoverlay commands that dsi-7inch includes?

@pelwell
Contributor

pelwell commented Mar 24, 2023

Let's focus on understanding the symptoms, as @6by9 is asking you to do.

@HakanL
Author

HakanL commented Mar 24, 2023

root@a487250:~# i2cdetect -l
i2c-20  i2c             Broadcom STB :                          I2C adapter
i2c-10  i2c             i2c-22-mux (chan_id 1)                  I2C adapter
i2c-1   i2c             bcm2835 (i2c@7e804000)                  I2C adapter
i2c-21  i2c             Broadcom STB :                          I2C adapter
i2c-0   i2c             i2c-22-mux (chan_id 0)                  I2C adapter
i2c-22  i2c             bcm2835 (i2c@7e205000)                  I2C adapter
root@a487250:~# i2cdetect -y 0
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- 38 -- -- -- -- -- -- --
40: -- -- -- -- -- 45 -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --
root@a487250:~# i2ctransfer -y 0 w1@0x45 0x80 r1@0x45
0xc3
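
The scan above can also be checked in a script. This is a minimal sketch, assuming the standard `i2cdetect -y` grid layout shown here; the function name `detected_addresses` is hypothetical. Cells printed as `UU` (address claimed by a kernel driver) are skipped, because they do not print the address itself.

```python
import re

# Grid rows start with "00:", "10:", ... "70:".
ROW_RE = re.compile(r"^([0-7])0:(.*)$")


def detected_addresses(i2cdetect_output):
    """Return the set of addresses that answered in an `i2cdetect -y` grid."""
    found = set()
    for raw in i2cdetect_output.splitlines():
        row = ROW_RE.match(raw.strip())
        if not row:
            continue  # header line, shell prompt, or anything else
        for cell in row.group(2).split():
            # Responding devices are printed as their two-digit hex address.
            # "--" (no answer) and "UU" (claimed by a driver) are skipped.
            if re.fullmatch(r"[0-9a-f]{2}", cell):
                found.add(int(cell, 16))
    return found
```

For the panel discussed here, the expected set would be 0x45 (microcontroller) and 0x38 (touch controller, when enabled).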

@HakanL
Author

HakanL commented Mar 24, 2023

/proc/device-tree/chosen/user-warnings

This file doesn't exist (the chosen folder does though).

@6by9
Contributor

6by9 commented Mar 24, 2023

root@a487250:~# i2ctransfer -y 0 w1@0x45 0x80 r1@0x45
0xc3

That would be the correct value.

If the driver has failed to probe, you could try sudo rmmod rpi_panel_attiny_regulator followed by sudo modprobe rpi_panel_attiny_regulator to force it to unload and reload the regulator driver.

Error -5 is EIO. You've provided only a tiny snippet of the log around that. Did you also have an i2c transfer failed: 0x400 message when it failed reading the ID?
I think I saw something recently along a similar line to that. If the firmware had last done an I2C read, then it left the I2C block in a weird state. If the kernel next issued a write&read transaction, it aborted due to there being bytes in the receive FIFO after the write phase.
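
For reference, the negative numbers in these kernel messages are Linux errno values. The sketch below covers only the two codes that appear in this thread, with values taken from the kernel's generic errno headers; the helper names are hypothetical. It also shows that 0xffffff92, the value in x19 in the register dump of the original report, reads as -110 when treated as a signed 32-bit integer. Whether that register actually holds the return code is an inference, not something the log states.

```python
# Linux errno values that appear in this thread (generic kernel errno numbering).
LINUX_ERRNO_NAMES = {5: "EIO", 110: "ETIMEDOUT"}


def errno_name(code):
    """Map a negative kernel return value such as -5 to its symbolic name."""
    return LINUX_ERRNO_NAMES.get(abs(code), "unknown")


def as_signed32(value):
    """Interpret the low 32 bits of a register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value
```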

@pelwell
Contributor

pelwell commented Mar 24, 2023

As far as user-warnings goes, no news is good news - if the file is absent then there is nothing to report.
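
In script form, "no news is good news" might look like the sketch below. It assumes the property is a NUL-terminated string, as device-tree string properties normally are; the function name `read_user_warnings` is hypothetical.

```python
from pathlib import Path

USER_WARNINGS = Path("/proc/device-tree/chosen/user-warnings")


def read_user_warnings(path=USER_WARNINGS):
    """Return firmware device-tree warnings, or an empty string if the file is absent."""
    try:
        data = Path(path).read_bytes()
    except FileNotFoundError:
        return ""  # absent file: nothing to report
    # Strip the trailing NUL terminator before decoding.
    return data.rstrip(b"\x00").decode("utf-8", errors="replace")
```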

@HakanL
Author

HakanL commented Mar 24, 2023

I tried rmmod/modprobe and I'm getting the same error. I don't see the i2c transfer error in dmesg. Here's the full pastebin: https://pastebin.com/xUdbTvtj
Note that I have two builds: the official BalenaOS build, and one I built myself (using their build process). On both I'm getting the same issue with loading the display drivers, but I'm not getting the interrupt exception issue on my build. The kernel seems to be the same. I opened this issue to try to get to the bottom of it, so I can help Balena update their official builds with whatever it may be that's causing it. I did not check which firmware the official build has. Am I correct in assuming that the firmware is loaded on each boot (vs written to flash storage or something), so the firmware would go into the image that's on the SD/eMMC storage?

@HakanL
Author

HakanL commented Mar 24, 2023

For what it's worth, I tried to load edt_ft5x06 both via config.txt and modprobe. It loads, but nothing in dmesg, and the touchscreen isn't working. If I switch back to rpi-ft5406 then my touchscreen works again. Could there be an issue with the display I2C bus being on id 0 instead of 10 (no clue why that is)?

@pelwell
Contributor

pelwell commented Mar 24, 2023

Could there be an issue with the display I2C bus being on id 0 instead of 10 (no clue why that is)?

Yes, there would be, if that's the case, and the i2cdetect output suggests that it is. Are you using a dt-blob.bin on the CM4? If so, can you upload it somewhere so I can examine it? Alternatively, if you know for a fact the values of DISPLAY_I2C_PORT, DISPLAY_SCL and DISPLAY_SDA, you can just post them here. Or even a datasheet or software resources for the display.

@pelwell
Contributor

pelwell commented Mar 24, 2023

You can confirm which pins rpi-ft5406 is using with raspi-gpio get 0-1,44-45. SDA0 and SCL0 will appear on one of the two pairs of pins.

You may also find you can break the touchscreen by running i2cdetect -y 10 and fix it again with i2cdetect -y 0. The raspi-gpio command above will show the effect on the pins as the i2c-mux switches between buses.

@HakanL
Author

HakanL commented Mar 24, 2023

Yes, I'm using a dt-blob, which I built from source files for the Waveshare PoE board that I used as a template for my own baseboard (https://www.waveshare.com/wiki/Compute_Module_4_PoE_Board). I then just changed the pins to match my design (or they were already correct, I can't remember now). I have the source file somewhere on a different computer, but attached here is the compiled file if that helps. Looking at my schematic, the I2C bus for the display is on pins 35 & 36 on the CM4 module (labeled ID_SD and ID_SC). If I remember correctly, before I switched to Bullseye, the bus was on a different bus id, but I'm pretty sure I'm using the same dt-blob (I'll be the first to admit I'm still learning how all the parts work together). However, the display, touch and backlight all work fine (using the rpi* drivers), but I want to make sure I have it using the correct overlays and obviously eliminate the i2c exception issue.

dt-blob.zip

@HakanL
Author

HakanL commented Mar 24, 2023

You can confirm which pins rpi-ft5406 is using with raspi-gpio get 0-1,44-45. SDA0 and SCL0 will appear on one of the two pairs of pins.

root@a487250:~# raspi-gpio get 0-1,44-45
GPIO 0: level=1 fsel=4 alt=0 func=SDA0 pull=UP
GPIO 1: level=1 fsel=4 alt=0 func=SCL0 pull=UP
GPIO 44: level=1 fsel=0 func=INPUT pull=NONE
GPIO 45: level=1 fsel=0 func=INPUT pull=NONE

You may also find you can break the touchscreen by running i2cdetect -y 10 and fix it again with i2cdetect -y 0. The raspi-gpio command above will show the effect on the pins as the i2c-mux switches between buses.

It didn't break, but it shows the same output, detecting the devices on 0x38 and 0x45.
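
The raspi-gpio output above can be reduced to "which pins currently carry SDA0/SCL0" with a few lines of Python. This is a sketch that assumes the `GPIO n: ... func=NAME ...` line format shown in this comment; `pins_with_function` is a hypothetical name.

```python
import re

# Captures the GPIO number and the value of the func= field.
GPIO_RE = re.compile(r"^GPIO (\d+): .*\bfunc=(\S+)")


def pins_with_function(raspi_gpio_output, wanted=("SDA0", "SCL0")):
    """Map each wanted function name to the GPIO numbers currently muxed to it."""
    pins = {name: [] for name in wanted}
    for raw in raspi_gpio_output.splitlines():
        match = GPIO_RE.match(raw.strip())
        if match and match.group(2) in pins:
            pins[match.group(2)].append(int(match.group(1)))
    return pins
```

Running it before and after an `i2cdetect -y 10` / `i2cdetect -y 0` pair would show whether the i2c-mux moved the functions between the 0/1 and 44/45 pin pairs.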

@HakanL
Author

HakanL commented Mar 24, 2023

Just noticed something, raspi-gpio reports pull-up on the SDA0/SCL0 gpio, but I also have external pull-ups (2.2k) on the board, could that cause problems?

@pelwell
Contributor

pelwell commented Mar 24, 2023

Thanks, that's all consistent with what you said about the display being on 0 & 1. In order to get edt_ft5x06 working with your display it would need to be hacked or extended to target i2c0 instead of i2c10 (or i2c_csi_dsi as it is labelled). The source for the overlay is https://github.com/raspberrypi/linux/blob/rpi-6.1.y/arch/arm/boot/dts/overlays/edt-ft5406-overlay.dts and https://github.com/raspberrypi/linux/blob/rpi-6.1.y/arch/arm/boot/dts/overlays/edt-ft5406.dtsi.

Which brings us back to your original question:

how can I specify overlays to do the same thing as vc4-kms-dsi-7inch does?

Explain precisely what you want it to do and I can tell you how to do it. It might accelerate the process if you were to make a start on a new overlay, copying the bits of vc4-kms-dsi-7inch that you need and perhaps starting to change them how you want.

@HakanL
Author

HakanL commented Mar 24, 2023

Thanks, that's all consistent with what you said about the display being on 0 & 1. In order to get edt_ft5x06 working with your display it would need to be hacked or extended to target i2c0 instead of i2c10 (or i2c_csi_dsi as it is labelled). The source for the overlay is https://github.com/raspberrypi/linux/blob/rpi-6.1.y/arch/arm/boot/dts/overlays/edt-ft5406-overlay.dts and https://github.com/raspberrypi/linux/blob/rpi-6.1.y/arch/arm/boot/dts/overlays/edt-ft5406.dtsi.

Which brings us back to your original question:

how can I specify overlays to do the same thing as vc4-kms-dsi-7inch does?

Explain precisely what you want it to do and I can tell you how to do it. It might accelerate the process if you were to make a start on a new overlay, copying the bits of vc4-kms-dsi-7inch that you need and perhaps starting to change them how you want.

Ok, I'm glad things are consistent at least. So my end goal is to build my own display board for my CM4 baseboard. Which I've learned is quite hard, so I'm looking into HDMI for the display part, but I still need backlight and touch control. So in my process of doing that I'm working on a prototype display board that will have I2C with FT5406 and a PWM I2C controller for backlight. I was thinking if I can configure the software to use the separate components (overlays) then I can test out those two parts (besides the HDMI part) and verify that design is correct, while in parallel work on the HDMI->parallel part. And the end goal is to have it run on the BalenaOS standard/public distribution with just some boot file updates/dtblob/config.txt changes (so it'll be easier to upgrade in the future).
The reason I opened this issue was the exception, whose root cause I couldn't determine. It seemed serious enough (since it locks up the bus, etc.) that it may be of value to others, and/or point to an issue with the code (or more likely with the combination of files used in this distribution). The question about which overlays dsi-7inch includes came out of why I'm using the rpi-drivers. Hopefully that background helps.
Question: Would I be better off trying to change my I2C bus to be on id 10 instead of 0 so the drivers would automatically work, vs patching the overlays to use i2c0? It just happened to be bus 0, it wasn't a design decision I made.

@6by9
Contributor

6by9 commented Mar 25, 2023

I totally missed that you'd routed it to i2c-0 instead of 10 as you hadn't made any comment about it when I'd asked for i2cdetect -y 10.

For a simple swap, you can use dtoverlay=cm-swap-i2c0,i2c10-gpio0,i2c0-gpio44 to swap over the GPIO assignments.

You can't really split vc4-kms-dsi-7inch apart as the different devices link to one another, eg you need to tell the display where the appropriate backlight controller is, and DSI peripherals have to sit under the DSI host controller node that therefore has to be enabled.

There is an underlying expectation that if you're doing a custom carrier, then you are likely to need to create your own device tree and that may as well include all the relevant configuration rather than relying on the standard overlays. Alternatively you could have one overlay that applied all the relevant changes for your base board.

@pelwell
Contributor

pelwell commented Mar 27, 2023

Would I be better off trying to change my I2C bus to be on id 10 instead of 0 so the drivers would automatically work

Yes.

@HakanL
Author

HakanL commented Mar 30, 2023

I upgraded the firmware to this version (by manually copying the elf and dat files to the boot partition; since this OS has a read-only root file system, rpi-update didn't work):

[    0.095429] raspberrypi-firmware soc:firmware: Attached to firmware from 2023-03-21T17:18:33, variant start_cd
[    0.099448] raspberrypi-firmware soc:firmware: Firmware hash is 3cc1c2dfc5460da9e1a0a4f48b48ab508c48bfe5

And now the I2C bus seems to be on 10 (I didn't change anything else) and the edt_ft5x06 and rpi_panel_attiny_regulator drivers are loaded automatically from vc4-kms-dsi-7inch, so that's good. The bad news is that I'm still getting the interrupt error on the official Balena image. I tried vclog and vcdbg but neither is able to open the log:

./vcdbg log msg
Unable to determine the value of __LOG_START
Unable to read logging_header from 0x00000000

I also still get this error (but the driver is loaded and I can control the backlight, until I get the interrupt error):

[   10.452085] rpi_touchscreen_attiny 10-0045: Failed to read REG_ID reg: -5
[   10.452129] rpi_touchscreen_attiny: probe of 10-0045 failed with error -5

Noteworthy is that my other i2c bus continues to work, it's just the display i2c bus that gets locked up (but that may be expected).
Startup dmesg: https://pastebin.com/WHQwNFXJ

So now I have the correct drivers, but I'm still getting the random unexpected interrupt.

@HakanL
Author

HakanL commented Mar 30, 2023

What I don't understand though is that I2C bus 0 and 10 seem to be the same:

root@9a540a0:~# i2cdetect -l
i2c-10  i2c             i2c-22-mux (chan_id 1)                  I2C adapter
i2c-1   i2c             bcm2835 (i2c@7e804000)                  I2C adapter
i2c-0   i2c             i2c-22-mux (chan_id 0)                  I2C adapter
i2c-22  i2c             bcm2835 (i2c@7e205000)                  I2C adapter
root@9a540a0:~# i2cdetect -y 0
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- 38 -- -- -- -- -- -- --
40: -- -- -- -- -- 45 -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --
root@9a540a0:~# i2cdetect -y 10
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- 38 -- -- -- -- -- -- --
40: -- -- -- -- -- 45 -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --
root@9a540a0:~#
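
One thing the `i2cdetect -l` listing does show is that i2c-0 and i2c-10 are two channels of the same mux on i2c-22, so there is a single controller (i2c@7e205000) behind both. The sketch below extracts that relationship from the listing. It assumes the adapter-name format shown above, `mux_children` is a hypothetical name, and it does not by itself explain why both channels see the same devices.

```python
import re

# Matches lines such as "i2c-10  i2c  i2c-22-mux (chan_id 1)  I2C adapter".
MUX_RE = re.compile(r"^i2c-(\d+)\s+i2c\s+i2c-(\d+)-mux \(chan_id (\d+)\)")


def mux_children(i2cdetect_list_output):
    """Map parent bus number -> {channel id: child bus number} for i2c-mux adapters."""
    tree = {}
    for raw in i2cdetect_list_output.splitlines():
        match = MUX_RE.match(raw.strip())
        if match:
            child, parent, chan = (int(group) for group in match.groups())
            tree.setdefault(parent, {})[chan] = child
    return tree
```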

@HakanL
Author

HakanL commented Mar 30, 2023

Here's the dmesg for when I get the error. Sometimes I can run for minutes, other times it happens after a few seconds:

[   81.353694] i2c-bcm2835 fe205000.i2c: Got unexpected interrupt (from firmware?)
[   82.376882] ------------[ cut here ]------------
[   82.376902] Firmware transaction timeout
[   82.376943] WARNING: CPU: 2 PID: 1175 at drivers/firmware/raspberrypi.c:67 rpi_firmware_property_list+0x138/0x20c
[   82.376975] Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip6table_filter ip6_tables xt_MASQUERADE nf_conntrack_netlink nfnetlink br_netfilter rfkill xt_owner i2c_dev edt_ft5x06 joydev rpi_panel_attiny_regulator rtc_pcf85063 gpio_keys bcm2835_codec(C) bcm2835_isp(C) rpivid_hevc(C) bcm2835_v4l2(C) v4l2_mem2mem raspberrypi_ts bcm2835_mmal_vchiq(C) videobuf2_vmalloc videobuf2_dma_contig raspberrypi_hwmon dwc2 i2c_mux_pinctrl videobuf2_memops videobuf2_v4l2 roles i2c_mux tc358762 snd_bcm2835(C) videobuf2_common videodev vc_sm_cma(C) mc rpi_backlight panel_simple backlight nvmem_rmem uio_pdrv_genirq uio drm_dp_aux_bus sch_fq_codel fuse
[   82.377186] CPU: 2 PID: 1175 Comm: kworker/2:3 Tainted: G         C        5.15.34-v8 #1
[   82.377197] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[   82.377205] Workqueue: events dbs_work_handler
[   82.377221] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   82.377232] pc : rpi_firmware_property_list+0x138/0x20c
[   82.377242] lr : rpi_firmware_property_list+0x138/0x20c
[   82.377251] sp : ffffffc0087439b0
[   82.377256] x29: ffffffc0087439b0 x28: 0000000000000000 x27: ffffff8100a10640
[   82.377275] x26: ffffffea48b66b70 x25: ffffffc008475008 x24: ffffff8113e1a280
[   82.377291] x23: 0000000000001000 x22: 0000000000000018 x21: ffffff8100a10600
[   82.377308] x20: ffffffc008475000 x19: 00000000ffffff92 x18: 0000000000000000
[   82.377324] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[   82.377341] x14: 0000000000000000 x13: 74756f656d697420 x12: 6e6f69746361736e
[   82.377357] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : ffffffea46adbbac
[   82.377373] x8 : ffffffea48a40da8 x7 : 6361736e61727420 x6 : ffffffea48bd2989
[   82.377389] x5 : c0000000ffffefff x4 : 0000000000000000 x3 : 0000000000000027
[   82.377405] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff810356bc80
[   82.377422] Call trace:
[   82.377427]  rpi_firmware_property_list+0x138/0x20c
[   82.377437]  rpi_firmware_property+0x7c/0xfc
[   82.377446]  raspberrypi_clock_property.isra.0+0x4c/0x84
[   82.377459]  raspberrypi_fw_set_rate+0x54/0xd4
[   82.377469]  clk_change_rate+0x17c/0x2a0
[   82.377481]  clk_core_set_rate_nolock+0x15c/0x18c
[   82.377493]  clk_set_rate+0x4c/0x9c
[   82.377504]  _generic_set_opp_clk_only+0x28/0x68
[   82.377515]  _set_opp+0x2ac/0x38c
[   82.377525]  dev_pm_opp_set_rate+0x12c/0x15c
[   82.377536]  set_target+0x3c/0x48
[   82.377546]  __cpufreq_driver_target+0x188/0x230
[   82.377555]  od_dbs_update+0xf4/0x178
[   82.377565]  dbs_work_handler+0x4c/0x84
[   82.377575]  process_one_work+0x1d4/0x298
[   82.377588]  worker_thread+0x1e4/0x27c
[   82.377598]  kthread+0xfc/0x10c
[   82.377610]  ret_from_fork+0x10/0x20
[   82.377622] ---[ end trace 36a3f453885ce6dc ]---
[   82.377652] raspberrypi-clk soc:firmware:clocks: Failed to change fw-clk-arm frequency: -110
[   83.400881] graphics fb0: Set display number call failed. Old GPU firmware?
[   86.472891] hwmon hwmon1: Failed to get throttled (-110)
[   87.496880] bcm2708_fb soc:fb: ioctl 0x40044620 failed (-110)
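The -110 in the last four messages is a negative errno value. On Linux, 110 is ETIMEDOUT, which is consistent with the "Firmware transaction timeout" warning. Register x19 in the dump above (00000000ffffff92) also reads as -110 when its low 32 bits are taken as a signed integer. A minimal sketch of that arithmetic, assuming the Linux errno numbering (hard-coded here because other platforms number ETIMEDOUT differently); the helper name is hypothetical:

```python
# Linux errno number for ETIMEDOUT (assumption: this value is Linux-specific).
LINUX_ETIMEDOUT = 110


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        return value - 0x100000000
    return value


# x19 as printed in the register dump above.
x19 = 0x00000000FFFFFF92
code = to_signed32(x19)

print(code)                      # -110
print(-code == LINUX_ETIMEDOUT)  # True
```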

@pelwell
Contributor

pelwell commented Mar 30, 2023

What is the content of your config.txt now?

@HakanL
Author

HakanL commented Mar 30, 2023

config.txt:

root@9a540a0:~# cat /mnt/boot/config.txt
framebuffer_depth=32
start_debug=1
disable_splash=1
dtoverlay=vc4-kms-v3d
dtoverlay=vc4-kms-dsi-7inch
dtoverlay=i2c-rtc,pcf85063a
dtoverlay=dwc2,dr_mode=host
dtoverlay=gpio-shutdown,gpio_pin=6,active_low=1
dtoverlay=sd0,overclock_50=50
dtparam=spi=off
dtparam=audio=on
dtparam=i2c_arm=on
avoid_warnings=1
enable_uart=0
gpu_mem=16

@HakanL
Author

HakanL commented Mar 30, 2023

Is there any way to determine where this interrupt is coming from? I'm also curious as to why there is an interrupt in the i2c driver at all, since i2c is polled from the master. Or maybe it's unrelated (or I'm lacking understanding).

[   81.353694] i2c-bcm2835 fe205000.i2c: Got unexpected interrupt (from firmware?)

@pelwell
Contributor

pelwell commented Mar 31, 2023

The interrupt can only be because both the firmware and kernel are being asked to drive I2C0.

root@9a540a0:~# cat /mnt/boot/config.txt

I find this a bit suspicious. What does df /mnt/boot or lsblk report when the SD card is mounted as it was when you ran this command?
The fastest way to an explanation is probably for you to upload /sys/firmware/fdt somewhere I can download it, or email it to me - [email protected].

@HakanL
Author

HakanL commented Mar 31, 2023

Note that this is how BalenaOS does it, by mounting most things read-only. I assume it helps if there's a power outage, etc. (it's not something I can control, it's just the way they have it; it basically uses Yocto to generate the distribution).

root@9a540a0:~# df /mnt/boot
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/mmcblk0p1     40314  8509     31806  22% /mnt/boot
root@9a540a0:~# lsblk
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
mmcblk0      179:0    0 14.6G  0 disk
|-mmcblk0p1  179:1    0   40M  0 part /mnt/boot
|-mmcblk0p2  179:2    0  320M  0 part /mnt/sysroot/inactive
|-mmcblk0p3  179:3    0  320M  0 part /mnt/sysroot/active
|-mmcblk0p4  179:4    0    1K  0 part
|-mmcblk0p5  179:5    0   20M  0 part /var/volatile/lib/systemd
|                                     /var/lib/systemd
|                                     /var/volatile/lib/chrony
|                                     /var/lib/chrony
|                                     /var/volatile/lib/bluetooth
|                                     /var/lib/bluetooth
|                                     /var/volatile/lib/NetworkManager
|                                     /var/lib/NetworkManager
|                                     /usr/share/ca-certificates/balena
|                                     /home/root/.ssh
|                                     /home/root/.docker
|                                     /home/root/.rnd
|                                     /etc/udev/rules.d
|                                     /etc/ssh/hostkeys
|                                     /etc/openvpn
|                                     /etc/hostname
|                                     /etc/docker
|                                     /etc/balena-supervisor
|                                     /etc/NetworkManager/system-connections
|                                     /etc/NetworkManager/conf.d
|                                     /etc/fake-hwclock
|                                     /etc/machine-id
|                                     /mnt/state
`-mmcblk0p6  179:6    0 13.9G  0 part /var/volatile/lib/docker
                                      /var/lib/docker
                                      /resin-data
                                      /mnt/data
mmcblk0boot0 179:32   0    4M  1 disk
mmcblk0boot1 179:64   0    4M  1 disk
zram0        253:0    0  3.8G  0 disk [SWAP]
root@9a540a0:~#

Here's the FDT file: https://transfer.sh/ym78ka/fdt-hakan

Note that when I upgraded the firmware I didn't change kernel8.img, as I assumed it wasn't part of the firmware (and because of the read-only filesystem I wasn't able to run rpi-update, so I had to manually replace files in the boot partition). Maybe that was wrong?

@pelwell
Contributor

pelwell commented Mar 31, 2023

The fdt file shows that the rpi-ft5406 overlay is still being applied, causing the firmware to also access i2c0, hence the conflict.

@HakanL
Author

HakanL commented Mar 31, 2023

That's odd, I don't see rpi-ft5406 in lsmod or anywhere in dmesg.

@6by9
Contributor

6by9 commented Mar 31, 2023

I've just noticed in #5397 (comment)

[   87.496880] bcm2708_fb soc:fb: ioctl 0x40044620 failed (-110)

bcm2708_fb shouldn't be running either if vc4-kms-v3d is enabled. You should get a simple_fb initially instead with soc/fb being disabled (cat /proc/device-tree/soc/fb/status).

@HakanL
Author

HakanL commented Mar 31, 2023

root@9a540a0:~# cat /proc/device-tree/soc/fb/status
okayroot@9a540a0:~#

I also noticed this now:

root@9a540a0:~# cat /proc/device-tree/chosen/user-warnings
dterror: can't find symbol 'audio'
Failed to resolve overlay 'vc4-kms-v3d'
root@9a540a0:~#

I don't see any reference to kms in dmesg or lsmod, but I don't know if that's expected.
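Both checks above read device-tree string properties, which are NUL-terminated and carry no trailing newline (which is why the shell prompt follows "okay" directly). A minimal sketch of reading them from a script, assuming the standard /proc/device-tree mount point; the function names are hypothetical:

```python
from __future__ import annotations

from pathlib import Path


def decode_dt_string(raw: bytes) -> list[str]:
    """Split a raw device-tree string property into its NUL-separated strings."""
    return [part.decode("utf-8", errors="replace")
            for part in raw.split(b"\x00") if part]


def read_dt_property(name: str, root: str = "/proc/device-tree") -> list[str]:
    """Read a string property below root; return [] if it does not exist."""
    try:
        return decode_dt_string((Path(root) / name).read_bytes())
    except OSError:
        return []


# The raw bytes behind the "okay" output shown above.
print(decode_dt_string(b"okay\x00"))
```

On the affected system, read_dt_property("soc/fb/status") and read_dt_property("chosen/user-warnings") would return the same text as the two cat commands, without the missing-newline quirk.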

@pelwell
Contributor

pelwell commented Mar 31, 2023

The failure to enable the KMS driver is prompting the firmware to automatically apply the touchscreen overlay. Strangely, I can't find any reference to an audio label in any of the overlays or base device tree files.

@HakanL
Author

HakanL commented Mar 31, 2023

To make sure I didn't mess up the firmware upgrade, I got rpi-update to work by running this command (I had to manually copy in readelf and its library since they weren't in the distribution, which is why the paths are updated):

SKIP_BACKUP=1 SKIP_KERNEL=1 ROOT_PATH=/tmp BOOT_PATH=/mnt/boot LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH PATH=.:$PATH  ./rpi-update f5c4fc199c8d8423cb427e509563737d1ac21f3c

So now it's running this firmware (f5c4fc199c8d8423cb427e509563737d1ac21f3c was the FW that matched kernel 5.15.34, the latest 5.15, which I assumed would be the correct one to use; I don't know if that is the way to do it):

[    0.096434] raspberrypi-firmware soc:firmware: Attached to firmware from 2023-01-18T12:28:17, variant start_cd
[    0.100460] raspberrypi-firmware soc:firmware: Firmware hash is 658f02cc8edcb68a568273f05d2b6ceede181e15

But it's still using bcm2708_fb and I still get a user-warning about the vc4-kms-v3d overlay.

@pelwell
Contributor

pelwell commented Mar 31, 2023

The audio label used to exist in the rpi-5.15.y tree, but was removed as part of a reorganisation in rpi-6.1.y. I suspect your overlays are out of date with respect to the kernel.

@HakanL
Author

HakanL commented Mar 31, 2023

Ok, so two steps forward and one back. I copied the overlays from the 5.15.92 firmware and now I don't have any references to 2708 and no user-warnings. One step back because I don't have backlight control any more (same error: rpi_touchscreen_attiny 10-0045: Failed to read REG_ID reg: -5), but I assume it worked before because the old drivers got loaded instead. I tried to copy kernel8.img as well, and that bricked my system, so I'm going to put that back and see if I still get interrupt errors. However, I need to figure out how to get backlight control back (my app expects the sys/class node to exist).

@pelwell
Contributor

pelwell commented Mar 31, 2023

Can you run i2cdump on addresses 0x45 and 0x38 of i2c-10?

@HakanL
Author

HakanL commented Mar 31, 2023

Ah, so maybe two steps forward and 1.5 back... now the display i2c is back on bus id 0.

root@9a540a0:~# i2cdetect -l
i2c-20  i2c             Broadcom STB :                          I2C adapter
i2c-10  i2c             i2c-22-mux (chan_id 1)                  I2C adapter
i2c-1   i2c             bcm2835 (i2c@7e804000)                  I2C adapter
i2c-21  i2c             Broadcom STB :                          I2C adapter
i2c-0   i2c             i2c-22-mux (chan_id 0)                  I2C adapter
i2c-22  i2c             bcm2835 (i2c@7e205000)                  I2C adapter
root@9a540a0:~#

I can try the swap, but is the correct fix to update my dt-blob? I'm looking at the examples and they all seem to have 0 in there. What is controlling the i2c bus ids? For reference, my carrier board is modeled after the official RPi IO Board, same pins for the display/i2c, and I'm using the dtblob from the main repo (I just removed the non-CM4 stuff). I'm not saying I'm confident all that is correct, but it's not a conscious decision to put the display i2c bus on id 0.
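Since the bus numbers differ between setups, a script can look buses up by adapter name instead of by a fixed number. A minimal sketch that parses the i2cdetect -l listing above, assuming the columns are separated by tabs or runs of spaces as shown; the function names are hypothetical:

```python
from __future__ import annotations

import re

SAMPLE = """\
i2c-20  i2c             Broadcom STB :                          I2C adapter
i2c-10  i2c             i2c-22-mux (chan_id 1)                  I2C adapter
i2c-1   i2c             bcm2835 (i2c@7e804000)                  I2C adapter
i2c-21  i2c             Broadcom STB :                          I2C adapter
i2c-0   i2c             i2c-22-mux (chan_id 0)                  I2C adapter
i2c-22  i2c             bcm2835 (i2c@7e205000)                  I2C adapter
"""


def parse_i2cdetect_list(text: str) -> dict[int, str]:
    """Map bus number to adapter name from `i2cdetect -l` style output."""
    buses = {}
    for line in text.splitlines():
        # Columns are separated by tabs or by two or more whitespace characters.
        fields = re.split(r"\t+|\s{2,}", line.strip())
        if len(fields) >= 3 and fields[0].startswith("i2c-"):
            buses[int(fields[0][len("i2c-"):])] = fields[2]
    return buses


def buses_matching(buses: dict[int, str], needle: str) -> list[int]:
    """Return the sorted bus numbers whose adapter name contains needle."""
    return sorted(num for num, name in buses.items() if needle in name)


buses = parse_i2cdetect_list(SAMPLE)
print(buses[10])
print(buses_matching(buses, "mux"))
```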

@HakanL
Author

HakanL commented Mar 31, 2023

FWIW I tried the cm-swap overlay (dtoverlay=cm-swap-i2c0,i2c10-gpio0,i2c0-gpio44) and it didn't seem to change anything. The i2cdetect output is identical.

@6by9
Contributor

6by9 commented Mar 31, 2023

So what does cat /proc/device-tree/chosen/user-warnings report? Does /boot/cm-swap-i2c0 exist on your SD card?

dt-blob.bin only affects the firmware, not the kernel. If using vc4-kms-v3d and vc4-kms-dsi-7inch, then it is largely irrelevant other than for roughly the first 6 seconds of boot (the rainbow screen and early console).

@pelwell
Contributor

pelwell commented Mar 31, 2023

For the avoidance of doubt, that's overlays/cm-swap-i2c0.dtbo you're looking for.
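A missing .dtbo file like this can be caught by comparing the dtoverlay= lines in config.txt against the contents of the overlays folder. A minimal sketch, assuming every overlay named NAME is expected as NAME.dtbo; it ignores conditional section filters such as [pi4], and the function names are hypothetical:

```python
from __future__ import annotations

from typing import Iterable


def overlay_names(config_text: str) -> list[str]:
    """Return the overlay names referenced by dtoverlay= lines, in order."""
    names = []
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or not line.startswith("dtoverlay="):
            continue
        # Drop any parameters after the first comma.
        name = line[len("dtoverlay="):].split(",", 1)[0].strip()
        if name:
            names.append(name)
    return names


def missing_overlays(config_text: str, available_files: Iterable[str]) -> list[str]:
    """Return overlays named in config_text that have no NAME.dtbo file."""
    available = set(available_files)
    return [name for name in overlay_names(config_text)
            if f"{name}.dtbo" not in available]


CONFIG = """\
dtoverlay=cm-swap-i2c0,i2c10-gpio0,i2c0-gpio44
dtoverlay=vc4-kms-v3d
dtoverlay=vc4-kms-dsi-7inch
dtparam=i2c_arm=on
"""

print(missing_overlays(CONFIG, ["vc4-kms-v3d.dtbo", "vc4-kms-dsi-7inch.dtbo"]))
```

On a real system the second argument could come from os.listdir() on the overlays directory of the boot partition (on the system in this thread that would presumably be /mnt/boot/overlays).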

@HakanL
Author

HakanL commented Mar 31, 2023

Of course, my bad, the cm-swap-i2c0 wasn't in the overlays folder. That's fixed now, and the drivers for touch and backlight are loaded. I'm still back 1 step (I lost track), because now the backlight is turned off, and I don't seem to have control of it (I now have /sys/class/backlight/10-0045, but nothing happens when I write to it, and the backlight is off). The rpi_panel_attiny_regulator overlay is loaded though.
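For reference, writing to the backlight node from an application usually means scaling the requested level against max_brightness and writing the result to brightness, both standard attributes of the Linux backlight class. A minimal sketch, assuming the /sys/class/backlight/10-0045 path mentioned above; the function names are hypothetical, and this does not explain why the panel stays dark here:

```python
from pathlib import Path


def scale_brightness(percent: int, max_brightness: int) -> int:
    """Convert a 0-100 percentage into a raw value in 0..max_brightness."""
    percent = max(0, min(100, percent))
    return max_brightness * percent // 100


def set_brightness(percent: int,
                   device_dir: str = "/sys/class/backlight/10-0045") -> int:
    """Write a scaled brightness to the sysfs node and return the raw value."""
    base = Path(device_dir)
    max_brightness = int((base / "max_brightness").read_text().strip())
    value = scale_brightness(percent, max_brightness)
    (base / "brightness").write_text(f"{value}\n")
    return value


# Pure scaling step only; set_brightness() needs root and the real device.
print(scale_brightness(50, 255))
```

Reading actual_brightness back after the write is one way to see whether the driver accepted the value.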

@HakanL
Author

HakanL commented Apr 4, 2023

To try to eliminate potential issues I set up a different CM4, same carrier board, but with the official 7" RPi display. Now I can control the backlight, it seems to load the touch controller, but after the initial boot sequence (where I see log lines) it flashes/fades to grey and then the display is blank. It seems to be similar to this issue: #4686, however I have tried to add the i2c_vc_baudrate=50000 workaround, but it's still not working. I know this issue here has taken on a life of its own, but I still haven't been able to get my system to work reliably with the KMS driver. It's of course possible it's related to some combination of kernel and firmware, but I don't know if that's the case, or how I can know which versions will work together.
Config.txt on this system:

framebuffer_depth=32
dtdebug=1
start_debug=1
disable_splash=1
dtoverlay=cm-swap-i2c0,i2c10-gpio0,i2c0-gpio44
dtoverlay=vc4-kms-v3d
dtoverlay=vc4-kms-dsi-7inch
dtoverlay=i2c-rtc,pcf85063a
dtoverlay=dwc2,dr_mode=host
dtoverlay=gpio-shutdown,gpio_pin=6,active_low=1
dtoverlay=sd0,overclock_50=50
dtparam=i2c_arm=on
dtparam=spi=off
dtparam=audio=off
dtparam=i2c_vc_baudrate=50000
enable_uart=0
gpu_mem=16

@HakanL
Author

HakanL commented Apr 7, 2023

To further troubleshoot this I have now set up both a standard Pi 4 and a CM4 with IO Board (so not using my custom carrier). I've loaded the latest 64-bit Raspberry Pi OS Lite on both. I have tested the official 7" DSI display and my Waveshare 4.3" clone. I've updated everything with apt-get full-upgrade and the firmware is identical on the two setups. On the standard Pi 4, everything works fine (expected, of course).
But what I've found out is that the issue seems to be related to DSI0 and KMS. If I run the CM4 without KMS then DSI0 works fine, but if I enable KMS in config.txt then I don't get any output on the display (it loads the touch and backlight drivers, but I can't control the backlight). The framebuffer is enabled, but the display is black (granted, without the backlight on I don't know if it's actually displaying anything). I'm getting a similar issue on the official 7" display; I think I can at least control the backlight there, but nothing is displayed.
If I move the display over to DSI1 then everything works. The unfortunate thing is that my carrier board is designed for DSI0, which I now realize was a mistake; I was thinking DSI0 is primary since it's the lowest number. Could I redesign it? Sure, but is there any chance that the driver(?) could be fixed to work with DSI0 as well? It would seem like a reasonable issue to address for the greater community. I'm happy to assist in any way possible, and I'm glad that I was able to determine what works and what doesn't. I haven't been able to test for the original interrupt issue, but I assume we have determined that it was caused by the wrong drivers being loaded and conflicting with each other. Would it make sense to keep this issue open to address DSI0, or would you prefer that I open a new issue (maybe there's already one, but I did a brief search and couldn't find it)?
I'm happy to provide full config.txt, dmesg, vclog and anything else now that I can reproduce this on a standard distribution as well.

@6by9
Contributor

6by9 commented Apr 7, 2023

#4946
It's on the list, but proving awkward to solve.
KMS sets up and tears down the pipeline every time the screen saver kicks in or similar, whilst the firmware only ever powers up the pipeline once. Some piece of state appears not to be cleared when resuming.

@HakanL
Author

HakanL commented Apr 7, 2023

Thanks @6by9, it sounds like I need to re-design my board as DSI0 isn't usable (with KMS) yet. Of course I could use legacy/fkms, but I prefer to build upon something that will be the best supported in the long run. I'll close this issue as I think the root cause has been determined thanks to your and @pelwell's help. I'm now a little wiser and learned a lot, so I appreciate that.

@HakanL HakanL closed this as completed Apr 7, 2023