Zombie processes "git-upload-pack" makes Gitea unusable after short period of time #21133


Closed
blackandred opened this issue Sep 10, 2022 · 9 comments


blackandred commented Sep 10, 2022

Description

Hi, thanks for this fantastic project. I recently deployed Gitea rootless with Podman from this playbook: https://github.com/riotkit-org/core-services

The instance quickly becomes unusable as the number of zombie processes grows. Roughly three minutes after restarting the instance I get:

git         1           0           0.173       9m38.459241369s  ?           1s          /usr/local/bin/gitea -c /etc/gitea/app.ini web 
git         49          1           0.000       9m29.45930959s   ?           0s          [git-upload-pack]
git         59          1           0.000       9m27.45934445s   ?           0s          [git-upload-pack]
git         78          1           0.000       9m22.459388431s  ?           0s          [git-upload-pack]
git         97          1           0.000       9m13.459420231s  ?           0s          [git-upload-pack]
git         120         1           0.000       7m55.459449531s  ?           0s          [git-upload-pack]
git         121         1           0.000       7m55.459481392s  ?           0s          [git-upload-pack]
git         124         1           0.000       7m55.459510752s  ?           0s          [git-upload-pack]
git         134         1           0.000       6m14.459542972s  ?           0s          [git-upload-pack]
git         143         1           0.000       6m11.459578793s  ?           0s          [git-upload-pack]
git         162         1           0.000       3m14.459607973s  ?           0s          [git-upload-pack]
git         241         1           0.000       1m54.459642093s  ?           0s          [git-upload-pack]
git         246         1           0.000       1m54.459674664s  ?           0s          [git-upload-pack]
git         248         1           0.000       1m54.459703774s  ?           0s          [git-upload-pack]
git         253         1           0.000       1m54.459735045s  ?           0s          [git-upload-pack]
git         254         1           0.000       1m54.459763314s  ?           0s          [git-upload-pack]
git         263         1           0.000       1m54.459794566s  ?           0s          [git-upload-pack]
git         266         1           0.000       1m54.459823266s  ?           0s          [git-upload-pack]
git         268         1           0.000       1m54.459858046s  ?           0s          [git-upload-pack]
git         279         1           0.000       1m54.459888195s  ?           0s          [git-upload-pack]
git         280         1           0.000       1m54.459917196s  ?           0s          [git-upload-pack]
git         294         1           0.000       14.459946256s    ?           0s          [git-upload-pack]

The processes appear when doing a git push from my local computer.

I see EOFs in the log; this may be caused by my network, which sometimes has higher latency.

Gitea Version

1.17.2-rootless

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

https://gist.github.com/blackandred/3593a9b0a73dd913a39860b81f372e20

Screenshots

No response

Git Version

git version 2.36.2

Operating System

Linux 5.4.0-124-generic, Podman 3.4.2

How are you running Gitea?

Database

PostgreSQL

@blackandred (Author)

As a very dirty workaround I created this cron job:

*/5 * * * * /bin/bash -c '[[ "$(podman top gitea | wc -l)" -gt 128 ]] && podman restart gitea'

@lunny lunny added this to the 1.17.3 milestone Sep 19, 2022
@simbelmas

simbelmas commented Sep 21, 2022

Same issue with Gitea 1.17.2 built with GNU Make 4.3 and go1.18.6, running non-root on top of Kubernetes.
Since I plugged in ArgoCD, I see a lot of git-upload-pack processes and the application is not reachable over SSH. The UI works.

Gitea is built from source in an Alpine image.
Used image: quay.io/simbelmas/gitea-alpine:latest
Dockerfile: https://github.com/simbelmas/dockerfiles/blob/latest/gitea-alpine/Dockerfile

@cshazi

cshazi commented Oct 14, 2022

Same issue with the gitea/gitea:1.17.2-rootless image on top of AWS EKS.

A few minutes after starting, a bunch of zombie processes appear:

ec2-user 27152  0.0  0.0      0     0 ?        Z    05:06   0:00 [git-upload-pack] <defunct>
ec2-user 32365  0.0  0.0      0     0 ?        Z    05:12   0:00 [git-upload-pack] <defunct>
ec2-user 32372  0.0  0.0      0     0 ?        Z    05:12   0:00 [git-upload-pack] <defunct>

Each time a new zombie process is created, the following log entries appear:

2022/10/14 04:57:52 ...s/asymkey/ssh_key.go:159:SearchPublicKeyByContent() [I] [6348e9ac-19] [SQL] SELECT "id", "owner_id", "name", "fingerprint", "content", "mode", "type", "login_source_id", "created_unix", "updated_unix", "verified" FROM "public"."public_key" WHERE (content like $1) LIMIT 1 [ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIK0wmN/Cr3JXqmLW7u+g9pTh+wyqDHpSQEIQczXkVx9q%] - 2.016575ms
2022/10/14 04:57:52 models/user/user.go:1011:GetUserByName() [I] [6348ec50] [SQL] SELECT "id", "lower_name", "name", "full_name", "email", "keep_email_private", "email_notifications_preference", "passwd", "passwd_hash_algo", "must_change_password", "login_type", "login_source", "login_name", "type", "location", "website", "rands", "salt", "language", "description", "created_unix", "updated_unix", "last_login_unix", "last_repo_visibility", "max_repo_creation", "is_active", "is_admin", "is_restricted", "allow_git_hook", "allow_import_local", "allow_create_organization", "prohibit_login", "avatar", "avatar_email", "use_custom_avatar", "num_followers", "num_following", "num_stars", "num_repos", "num_teams", "num_members", "visibility", "repo_admin_change_team_access", "diff_view_style", "theme", "keep_activity_private" FROM "public"."user" WHERE "lower_name"=$1 LIMIT 1 [myapp] - 2.012835ms
2022/10/14 04:57:52 ...bce556200f/engine.go:1244:Get() [I] [6348ec50] [SQL] SELECT "id", "owner_id", "owner_name", "lower_name", "name", "description", "website", "original_service_type", "original_url", "default_branch", "num_watches", "num_stars", "num_forks", "num_issues", "num_closed_issues", "num_pulls", "num_closed_pulls", "num_milestones", "num_closed_milestones", "num_projects", "num_closed_projects", "is_private", "is_empty", "is_archived", "is_mirror", "status", "is_fork", "fork_id", "is_template", "template_id", "size", "is_fsck_enabled", "close_issues_via_commit_in_any_branch", "topics", "trust_model", "avatar", "created_unix", "updated_unix" FROM "public"."repository" WHERE "owner_id"=$1 AND "lower_name"=$2 LIMIT 1 [29 myapp-env] - 5.840552ms
2022/10/14 04:57:52 ...s/asymkey/ssh_key.go:144:GetPublicKeyByID() [I] [6348ec50] [SQL] SELECT "id", "owner_id", "name", "fingerprint", "content", "mode", "type", "login_source_id", "created_unix", "updated_unix", "verified" FROM "public"."public_key" WHERE "id"=$1 LIMIT 1 [4] - 1.690692ms
2022/10/14 04:57:52 models/user/user.go:996:GetUserByIDCtx() [I] [6348ec50] [SQL] SELECT "id", "lower_name", "name", "full_name", "email", "keep_email_private", "email_notifications_preference", "passwd", "passwd_hash_algo", "must_change_password", "login_type", "login_source", "login_name", "type", "location", "website", "rands", "salt", "language", "description", "created_unix", "updated_unix", "last_login_unix", "last_repo_visibility", "max_repo_creation", "is_active", "is_admin", "is_restricted", "allow_git_hook", "allow_import_local", "allow_create_organization", "prohibit_login", "avatar", "avatar_email", "use_custom_avatar", "num_followers", "num_following", "num_stars", "num_repos", "num_teams", "num_members", "visibility", "repo_admin_change_team_access", "diff_view_style", "theme", "keep_activity_private" FROM "public"."user" WHERE "id"=$1 LIMIT 1 [7] - 2.115505ms
2022/10/14 04:57:52 ...epo/collaboration.go:85:IsCollaborator() [I] [6348ec50] [SQL] SELECT "id", "repo_id", "user_id", "mode", "created_unix", "updated_unix" FROM "public"."collaboration" WHERE "repo_id"=$1 AND "user_id"=$2 LIMIT 1 [39 7] - 1.715573ms
2022/10/14 04:57:52 ...ls/repo/repo_unit.go:218:getUnitsByRepoID() [I] [6348ec50] [SQL] SELECT "id", "repo_id", "type", "config", "created_unix" FROM "public"."repo_unit" WHERE (repo_id = $1) [39] - 1.797653ms
2022/10/14 04:57:52 [6348ec50] router: completed GET /api/internal/serv/command/4/myapp/myapp-env?mode=1&verb=git-upload-pack for 127.0.0.1:58648, 200 OK in 17.4ms @ private/serv.go:81(private.ServCommand)

If I stop the ArgoCD repo server, no new zombie processes are created.

ArgoCD handles the upload-pack operation in a special way:
https://github.com/argoproj/argo-cd/blob/master/util/git/workaround.go

@bendem

bendem commented Oct 25, 2022

I started seeing this behavior right after switching from the root to the rootless Docker image. Not sure what's up yet.

@cshazi

cshazi commented Oct 31, 2022

An example of how a zombie process gets stuck (a minimal reproduction sketch follows the list):

  • Gitea's SSH session handler executes the git operation via the gitea serv command (e.g. gitea serv key-2 --config=/data/gitea/conf/app.ini).
  • gitea serv starts the git command based on the SSH_ORIGINAL_COMMAND variable (e.g. git-upload-pack '/admin1/test.git').
  • The SSH context ends and terminates the child contexts.
  • The Go runtime sends SIGKILL to the OS process it started (the gitea serv process).
  • gitea serv is killed without ever executing exec.Wait() on its child.
  • The orphaned git-upload-pack process ends up with status 'defunct'.
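
To make this concrete, here is a minimal, self-contained sketch of the failure mode (not Gitea's code: sh stands in for gitea serv, and the backgrounded sleep for git-upload-pack). Cancelling an exec.CommandContext SIGKILLs only the direct child, orphaning the grandchild; in a container whose PID 1 does not reap children, the orphan ends up <defunct>:

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())

	// Stand-in for "gitea serv": a shell that itself spawns a child
	// (the stand-in for git-upload-pack).
	cmd := exec.CommandContext(ctx, "sh", "-c", "sleep 2 & sleep 60")
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// Stand-in for the SSH session ending: cancelling the context makes
	// the Go runtime send SIGKILL to the direct child (sh) only.
	time.Sleep(500 * time.Millisecond)
	cancel()

	// Wait reaps sh, but the backgrounded sleep has been orphaned and
	// reparented to PID 1. If PID 1 never calls wait() for it (as with a
	// minimal container entrypoint), it shows up as <defunct> once it
	// exits — run `ps` during the final sleep to observe it.
	fmt.Println("wait:", cmd.Wait())
	time.Sleep(5 * time.Second)
}
```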

Debug log:

2022/10/28 04:20:45 modules/ssh/ssh.go:71:sessionHandler() [T] [635b586a-40] SSH: Payload: git-upload-pack '/admin1/test.git'
2022/10/28 04:20:45 modules/ssh/ssh.go:74:sessionHandler() [T] [635b586a-40] SSH: Arguments: [serv key-2 --config=/data/gitea/conf/app.ini]
2022/10/28 04:20:45 [635b589d] router: started   GET /api/internal/serv/command/2/admin1/test?mode=1&verb=git-upload-pack for [::1]:58292
2022/10/28 04:20:45 ...ters/private/serv.go:412:ServCommand() [D] [635b589d] Serv Results:
	IsWiki: false
	DeployKeyID: 0
	KeyID: 2	KeyName: cshazi
	UserName: admin1
	UserID: 2
	OwnerName: admin1
	RepoName: test
	RepoID: 1
2022/10/28 04:20:45 [635b589d] router: completed GET /api/internal/serv/command/2/admin1/test?mode=1&verb=git-upload-pack for [::1]:58292, 200 OK in 2.0ms @ private/serv.go:81(private.ServCommand)

I couldn't find a way to get control before the context's cancel function runs. It would also be good to be able to choose which signal is sent when the context is done, but that is not possible:
golang/go#21135
golang/go#22757

In the workaround I found, the context is derived from context.Background() in modules/ssh/ssh.go, and a separate goroutine watches the state of the parent context.

Source code: 84714c3
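
In outline, the idea looks roughly like this (a hedged sketch only, not the code in the commit above; runServ is a placeholder name and the SIGTERM choice is an assumption; imports: context, os/exec, syscall):

```go
// runServ sketches the workaround: run the command under a context derived
// from context.Background() so the Go runtime never SIGKILLs it on session
// cancellation, and watch the parent (SSH session) context from a goroutine
// that terminates the child gently, so Wait() is always reached and the
// direct child is always reaped.
func runServ(parentCtx context.Context, gitea string, args ...string) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	cmd := exec.CommandContext(ctx, gitea, args...)
	if err := cmd.Start(); err != nil {
		return err
	}

	go func() {
		select {
		case <-parentCtx.Done():
			// SIGTERM (assumption) lets "gitea serv" wait for its own
			// git child instead of being killed outright.
			_ = cmd.Process.Signal(syscall.SIGTERM)
		case <-ctx.Done(): // the command finished on its own
		}
	}()

	return cmd.Wait() // always reached, so the child is always reaped
}
```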

Can you suggest a better solution to the problem?

@Exagone313 (Contributor)

golang/go#21135
golang/go#22757

Following up on those discussions, it seems a solution has been implemented and is planned for release in Go 1.20: golang/go#50436 (see commit golang/go@55eaae4).
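
For reference, that change adds Cancel and WaitDelay fields to exec.Cmd. A minimal sketch of how they could apply here (uploadPack and repoPath are placeholders; the concrete signal and delay are assumptions):

```go
func uploadPack(ctx context.Context, repoPath string) error {
	cmd := exec.CommandContext(ctx, "git-upload-pack", repoPath)
	// Cancel (Go 1.20+) replaces the default SIGKILL sent on context
	// cancellation with a gentler SIGTERM.
	cmd.Cancel = func() error {
		return cmd.Process.Signal(syscall.SIGTERM)
	}
	// WaitDelay (Go 1.20+) escalates to SIGKILL and unblocks Wait if the
	// process still hasn't exited this long after cancellation.
	cmd.WaitDelay = 10 * time.Second
	return cmd.Run()
}
```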

@lunny lunny modified the milestones: 1.17.4, 1.17.5 Dec 21, 2022
@bendem

bendem commented Feb 13, 2023

We haven't seen this issue for a while. Our last update (to 1.18.2) three weeks ago seems to have fixed it.

@lunny lunny removed this from the 1.17.5 milestone Feb 13, 2023
@lunny (Member)

lunny commented Feb 13, 2023

Closing, as it looks like this has been resolved. Please feel free to reopen if it's still a problem.

@lunny lunny closed this as completed Feb 13, 2023
@bendem

bendem commented Feb 13, 2023

For reference, I'm gonna go out on a limb and guess this was fixed by #20695

@go-gitea go-gitea locked and limited conversation to collaborators May 3, 2023