GPU is reset on Raspberry Pi 3 #3221

Closed
carlonluca opened this issue Sep 11, 2019 · 24 comments
Labels
Close within 30 days Issue will be closed within 30 days unless requested to stay open

Comments

@carlonluca

Is this the right place for my bug report?
It is very difficult to say what component is causing the issue, but maybe someone here can help.

Describe the bug
I ported a system that includes Qt apps based on eglfs/dispmanx to fkms in order to support the rpi4. The apps also run on Windows, rpi3, macOS, etc., with small variations. I have issues with animations that I cannot explain: some scaling animations seem to distort the scene. The distortion looks a bit like a vsync issue (just trying to describe it; I am not sure it is related to vsync). The distortion often persists after the animation completes. If I set the animation duration to 0, no issue appears.
The distortion seems to appear only on fkms and only when the resolution is set to values "near" 1366x768 (I could also reproduce it at 1360x768). 1920x1080 and 1280x720 seem to be OK with both drivers. 1366x768 was OK with the old dispmanx drivers but is not with fkms.

In some cases, when the animation starts on 1366x768, I get this from the kernel:

[  262.072366] [drm] Resetting GPU.
[  270.952406] [drm] Resetting GPU.
[  276.952461] [drm] Resetting GPU.
[  278.952417] [drm] Resetting GPU.
[  281.112422] [drm] Resetting GPU.
[  283.032194] [drm] Resetting GPU.

To reproduce
I tried to write a minimal example showing the problem, but I failed. Even within the same software, not every scaling animation shows the problem.
What I do see is that I can reproduce it by running some scaling animations on fkms with the resolution set to 1366x768.

Expected behaviour
No distortion, like with the legacy drivers.

Actual behaviour
Distortion appears.

System

  • Which model of Raspberry Pi?

Pi3B+

  • Which OS and version (cat /etc/rpi-issue)?
Raspberry Pi reference 2019-06-20
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 150e25c4f8123a4c9c63e8dca1b4737fa6c1135c, stage2
  • Which firmware version (vcgencmd version)?
Aug 15 2019 12:08:48
Copyright (c) 2012 Broadcom
version 0e6daa5106dd4164474616408e0dc24f997ffcf3 (clean) (release) (start_x)
  • Which kernel version (uname -a)?

Linux raspberrypi 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux

Logs
Dmesg output:

[  262.072366] [drm] Resetting GPU.
[  270.952406] [drm] Resetting GPU.
[  276.952461] [drm] Resetting GPU.
[  278.952417] [drm] Resetting GPU.
[  281.112422] [drm] Resetting GPU.
[  283.032194] [drm] Resetting GPU.

Not sure if other logs exist somewhere. I tried to get logs with vcdbg but I got an error:

sudo vcdbg log msg
Failed to allocate -505734863 bytes for message buffer

Additional context
My application runs without X11. Difficult to say if this is a problem in Qt, my app, drivers etc... Any advice on how to investigate further?
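
One way to quantify the resets in the dmesg output above is to extract the kernel timestamps and look at the gaps between them. The following is a minimal sketch, not part of the original report; the function names `reset_timestamps` and `reset_intervals` are hypothetical, and the sample text is copied from the log in this issue.

```python
import re

# Matches lines such as "[  262.072366] [drm] Resetting GPU."
RESET_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\[drm\] Resetting GPU\.")

# Sample copied from the dmesg output in this report.
SAMPLE = """\
[  262.072366] [drm] Resetting GPU.
[  270.952406] [drm] Resetting GPU.
[  276.952461] [drm] Resetting GPU.
[  278.952417] [drm] Resetting GPU.
[  281.112422] [drm] Resetting GPU.
[  283.032194] [drm] Resetting GPU.
"""


def reset_timestamps(dmesg_text):
    """Return the timestamps (seconds since boot) of every GPU reset line."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = RESET_RE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def reset_intervals(stamps):
    """Return the gaps in seconds between consecutive resets."""
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    stamps = reset_timestamps(SAMPLE)
    print("resets:", len(stamps))
    print("intervals (s):", reset_intervals(stamps))
```

The same functions could be fed the live kernel log, for example the text captured from running `dmesg`, to check whether resets cluster around the moment an animation starts.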

@carlonluca
Author

carlonluca commented Sep 23, 2019

I'm not getting any answer here. Might I have more luck in the firmware repo? Thanks.

@JamesH65
Contributor

JamesH65 commented Sep 23, 2019

I doubt anyone has had time to look at this. Note that 1366x768 is a very strange resolution that does cause problems in the HDMI system due to having odd numbered timings. Might be related.

EDIT: Sorry, I was thinking of the Pi4. Probably not relevant.

@carlonluca
Author

Thank you for answering. That is the resolution of the screen I'm targeting. As a temporary workaround, I simply forced a resolution of 720p when auto detection sets fb to 1366x768 or 1360x768, but I'm afraid people may hit the same issue with other resolutions.

Do you think someone will have time to look into this (I perfectly understand you have priorities to respect)? Do you suggest leaving the message here or moving it to the firmware repo? I'm testing on an rpi3b+ at the moment.
Thanks again.
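
The workaround described above can be sketched as a small decision function. This is only an illustration of the logic, assuming the detected framebuffer size is available as a `(width, height)` pair; `choose_mode`, `PROBLEM_MODES` and `FALLBACK_MODE` are hypothetical names, and how the chosen mode is actually applied (config, Qt, or otherwise) is not shown.

```python
# Modes reported as problematic in this thread (fkms on a Pi 3B+).
PROBLEM_MODES = {(1366, 768), (1360, 768)}

# 720p, the mode forced as a temporary workaround.
FALLBACK_MODE = (1280, 720)


def choose_mode(detected):
    """Return the mode to use for an auto-detected (width, height) pair."""
    mode = tuple(detected)
    return FALLBACK_MODE if mode in PROBLEM_MODES else mode


if __name__ == "__main__":
    for detected in [(1366, 768), (1360, 768), (1920, 1080)]:
        print(detected, "->", choose_mode(detected))
```

Keeping the affected modes in one set makes it easy to extend the list if other resolutions turn out to trigger the same behaviour.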

@6by9
Contributor

6by9 commented Sep 23, 2019

This would appear to be related to KMS and/or Mesa, so this is the right place.

Is the behaviour the same with both vc4-kms-v3d(*) and vc4-fkms-v3d on your Pi3? That would imply that it is more in the 3D rendering side than the composition and output side.
[drm] Resetting GPU. would seem to say that it is 3D related anyway.

The behaviour may be slightly different on a Pi4 anyway, as the 3D block is 2 generations further on than Pi3.

(*) NB That vc4-kms-v3d is currently NOT available on Pi4.

@carlonluca
Author

To test I simply switched from dtoverlay=vc4-fkms-v3d to dtoverlay=vc4-kms-v3d, is this sufficient to switch from fkms to kms?

I ran the same test and the issue is there in both cases. However, there is a slight difference: with fkms the scale animation stutters and then, for some time, I get a corruption similar to horizontal tearing, which is very noticeable as there is a scrollable content in my UI.
On kms, the same scale animation stutters but then I can see no noticeable "corruption" when scrolling the content. So it seems the only issue is the scaling animation on kms.
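
When switching between dtoverlay=vc4-fkms-v3d and dtoverlay=vc4-kms-v3d, it can help to double-check which overlay line is actually active in the config file, since a commented-out or duplicated line is easy to miss. Below is a minimal sketch, assuming the file contents have already been read into a string; `active_vc4_overlays` is a hypothetical helper, and it deliberately ignores conditional sections such as `[pi4]`.

```python
VC4_OVERLAYS = ("vc4-kms-v3d", "vc4-fkms-v3d")


def active_vc4_overlays(config_text):
    """Return the vc4 overlays enabled by uncommented dtoverlay= lines, in order."""
    overlays = []
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line.startswith("dtoverlay="):
            continue
        # "dtoverlay=name,param=value" -> keep only the overlay name
        name = line[len("dtoverlay="):].split(",", 1)[0].strip()
        if name in VC4_OVERLAYS:
            overlays.append(name)
    return overlays


if __name__ == "__main__":
    sample = "#dtoverlay=vc4-fkms-v3d\ndtoverlay=vc4-kms-v3d\n"
    print(active_vc4_overlays(sample))
```

If the function returns more than one entry, both overlays are enabled at once, which would make a kms versus fkms comparison unreliable.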

@carlonluca carlonluca changed the title from "Distortion on new gles drivers on 1366x768" to "Distortion on fkms" Feb 7, 2020
@carlonluca
Author

I could also reproduce the issue at other resolutions, such as 1080p and 720p, but only on the rpi3. The rpi4 seems to work properly.

@carlonluca carlonluca changed the title from "Distortion on fkms" to "GPU is reset on Raspberry Pi 3" Feb 10, 2020
@timemaster5

I am having the same issue and have not been able to find a solution for a long time. I also tried going back to the userland drivers instead of fkms, and the problem persists.

@JamesH65
Contributor

The code here has been undergoing a lot of changes. Is your system completely up to date?

@timemaster5

Thank you for your response. After today's deep dive, I suspect some low-level issue is going on, such as the device tree configuration. I tried multiple kernels, from 4.19.127 to 5.10, and then kms, fkms and userland. I was also fiddling with gpu_mem and cma. Nothing helped. Then I thought it might be a mesa or libdrm problem, so I updated them to the 21.x version too, without any luck :(

The issue is nicely summarized here: https://forums.raspberrypi.com/viewtopic.php?t=274293. I am also on a CM3 and see exactly the same behaviour. I will try an RPi 3B+ tomorrow to be sure that there is nothing wrong on my side.

What leads me to think about the device tree problem is that when this happens ([drm] Resetting GPU..), and I reboot the machine by issuing a reboot command, it won't boot again. I can see kernel messages on the serial interface, but no picture, only display blinking from time to time when it is supposed to change resolution. Then the whole system is stuck, and I need to do a complete power cycle to boot again.

Any suggestions on what I should try? I am on Yocto, so I can precisely build a specific commit of some component if needed. But so far, my list of things I wanted to try is shrinking fast :(

@timemaster5

> This would appear to be related to KMS and/or Mesa, so this is the right place.
>
> Is the behaviour the same with both vc4-kms-v3d(*) and vc4-fkms-v3d on your Pi3? That would imply that it is more in the 3D rendering side than the composition and output side. [drm] Resetting GPU. would seem to say that it is 3D related anyway.
>
> The behaviour may be slightly different on a Pi4 anyway, as the 3D block is 2 generations further on than Pi3.
>
> (*) NB That vc4-kms-v3d is currently NOT available on Pi4.

It is related to 3D only, yes. It happens a few seconds after some rendering starts. With 2D, such as the Weston desktop, I see no problems at all, but if I open, for example, some WebGL content, it goes bad after a few seconds of nice rendering.

I would like to gather some debug logs, but I wasn't successful with these environment variables:

export G_MESSAGES_DEBUG=all
export MESA_DEBUG=1
export EGL_LOG_LEVEL=debug
export LIBGL_DEBUG=verbose
export WAYLAND_DEBUG=1
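
To make sure the variables above really reach the process under test, and that its output is kept, the application can be launched from a small wrapper that sets them explicitly and redirects stdout and stderr to a file. This is a minimal sketch with hypothetical helper names (`debug_env`, `run_with_debug`); it only passes the variables listed above, and whether each one produces any output depends on how the libraries involved were built.

```python
import os
import subprocess

# The debug variables listed above, unchanged.
DEBUG_VARS = {
    "G_MESSAGES_DEBUG": "all",
    "MESA_DEBUG": "1",
    "EGL_LOG_LEVEL": "debug",
    "LIBGL_DEBUG": "verbose",
    "WAYLAND_DEBUG": "1",
}


def debug_env(base=None):
    """Return a copy of the environment with the debug variables added."""
    env = dict(os.environ if base is None else base)
    env.update(DEBUG_VARS)
    return env


def run_with_debug(cmd, log_path):
    """Run cmd with the debug environment, writing stdout and stderr to log_path."""
    with open(log_path, "wb") as log:
        result = subprocess.run(cmd, env=debug_env(), stdout=log, stderr=subprocess.STDOUT)
    return result.returncode
```

For example, `run_with_debug(["my-3d-app"], "/tmp/gfx-debug.log")` would run a placeholder command named `my-3d-app` and collect everything it prints in one file.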

@timemaster5

OK, I can confirm it is the same on the RPi 3B+. But I am moving forward slowly. Is it possible that this is caused by the code of the 3D app that uses DRM? I tried some WebGL content: one demo loads fine and renders without any issue, and the other causes a crash every time. So I am wondering what is wrong: not enough memory, some unsupported instruction, or some bug?

Even if it is an application issue, I would expect the application to segfault rather than cause a full system lockup that needs a power cycle.

@mbtronics

I got this exact problem when overclocking from 1.2 GHz to 1.3 GHz (in batocera).

@popcornmix
Collaborator

Is this still an issue?

@popcornmix popcornmix added the "Close within 30 days" label (Issue will be closed within 30 days unless requested to stay open) Jan 24, 2023
@mbtronics

I have not tried a recent version in half a year, so I have no idea, actually.

@jcea

jcea commented Jul 21, 2023

@pelwell
Contributor

pelwell commented Jul 21, 2023

This issue was marked for closure in January, and that's what I'm going to do. I suggest you disable your screensaver (or try a different one), and wait for another OSMC update - ideally to a 6.1 kernel.

@pelwell pelwell closed this as completed Jul 21, 2023
@jcea

jcea commented Jul 21, 2023

If you read the OSMC thread, you will see that this is not related to a screensaver or to me specifically. It is a pretty widespread issue with the current OSMC release, which uses kernel 5.15.83-3.

If you are closing this issue, where should I report the problem?

Thanks.

@jcea

jcea commented Jul 21, 2023

I guess this issue was marked as "to be closed" because no more reports were coming in. Reports are coming in now :-).

Thanks.

@popcornmix
Collaborator

The 5.15 kernel is no longer updated. If any fixes are required they will appear on the current 6.1 kernel tree.

So we'd need to know if the issue still affects the 6.1 kernel.

You could test an RPiOS image (which will use the 6.1 kernel). Install kodi and the screensaver from apt, and test if the issue occurs. If it does, we can investigate.

@samnazarko
Contributor

We (OSMC) will move to 6.1 in the near future.

@pizdjuk

pizdjuk commented Nov 10, 2023

> The 5.15 kernel is no longer updated. If any fixes are required they will appear on the current 6.1 kernel tree.
>
> So we'd need to know if the issue still affects the 6.1 kernel.
>
> You could test an RPiOS image (which will use the 6.1 kernel). Install kodi and the screensaver from apt, and test if the issue occurs. If it does, we can investigate.

I hit the issue on an RPi 3B+ with LibreELEC (official) 11.0.3, kernel 6.1.38. The resolution was the standard 1920x1080. No 3D apps were running.

@pizdjuk

pizdjuk commented Nov 12, 2023

@pelwell please, reopen.

@arren-ru

Got the same issue on RPi3B+ with Linux 6.1.38 armv7l GNU/Linux, stuck with [drm] Resetting GPU.

@pizdjuk

pizdjuk commented Nov 24, 2023

I got the same issue again. This time, though, the line [drm] Resetting GPU appeared only once, after a kernel stack dump:
kodi_screen_shifted.log

The screen was then shifted horizontally to the left by ~0.1 of the screen width. The part that should have been on the left appeared on the right side. Restarting Kodi didn't help; rebooting the device did.
