Code Search is very slow for non-admin users #17998

Closed
Joannazyx opened this issue Dec 16, 2021 · 2 comments
Labels
issue/duplicate The issue has already been reported.

Comments

@Joannazyx

Gitea Version

1.15.5

Git Version

git version 2.34.1

Operating System

k8s

How are you running Gitea?

docker

Database

MySQL

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Description

We recently enabled the repository indexer (the code search feature) and chose Elasticsearch as the search engine. It worked fine at first, but after we created more than 20,000 repos, code search became very slow, especially for non-admin users.
My test results with 23,551 repos, most of them public, are as follows:

  1. If the user is not logged in, or is an admin, the code search takes no more than 2s.
    log:2021/11/25 17:51:49 Completed GET /explore/code 200 OK in 230.275728ms
  2. If the user is logged in but is not an admin, the code search takes more than 40s.
    log:2021/11/25 17:51:57 Completed GET /explore/code 200 OK in 46.054051649s

After checking the code, I think the bottleneck is the loop in routers/web/explore/code.go.
For a logged-in non-admin user, Gitea first checks UnitTypeCode by looping through all the repos, and the CheckUnitUser() function called for each repo can issue several database queries. If the user is not logged in, or is an admin, Gitea does not check UnitTypeCode at all.
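To make the cost concrete, here is a minimal, self-contained Go sketch of the pattern described above. It is not Gitea's code: `Repo`, `Viewer`, `queryCounter`, `checkUnitUser`, and `filterByCodeUnit` are hypothetical stand-ins, and each simulated check is assumed to cost exactly one database query (the real CheckUnitUser() may issue several).

```go
package main

import "fmt"

// Repo and Viewer are hypothetical stand-ins, not Gitea's real models.
type Repo struct{ ID int64 }

type Viewer struct {
	ID      int64
	IsAdmin bool
}

// queryCounter counts simulated database round trips.
type queryCounter struct{ queries int }

// checkUnitUser stands in for the per-repo permission check.
// Assumption: every call costs one database query.
func (q *queryCounter) checkUnitUser(repo Repo, v *Viewer) bool {
	q.queries++
	return true
}

// filterByCodeUnit mirrors the shape of the loop: anonymous users (nil viewer)
// and admins skip the check, everyone else pays one check per repo.
func filterByCodeUnit(q *queryCounter, repos []Repo, v *Viewer) []int64 {
	ids := make([]int64, 0, len(repos))
	needCheck := v != nil && !v.IsAdmin
	for _, r := range repos {
		if needCheck && !q.checkUnitUser(r, v) {
			continue
		}
		ids = append(ids, r.ID)
	}
	return ids
}

func main() {
	// 23551 is the repo count from the report above.
	repos := make([]Repo, 23551)
	for i := range repos {
		repos[i] = Repo{ID: int64(i + 1)}
	}

	cases := []struct {
		name   string
		viewer *Viewer
	}{
		{"anonymous", nil},
		{"admin", &Viewer{ID: 1, IsAdmin: true}},
		{"non-admin", &Viewer{ID: 2}},
	}
	for _, c := range cases {
		q := &queryCounter{}
		ids := filterByCodeUnit(q, repos, c.viewer)
		fmt.Printf("%-10s repos=%d simulated queries=%d\n", c.name, len(ids), q.queries)
	}
}
```

In this model only the non-admin case touches the database, and its query count grows linearly with the number of repos, which matches the gap between the two log lines.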

My attempt to improve the performance of code search includes:

  1. When a user first opens the code search page and has not entered a keyword yet (like opening Google for the first time), do not run a search and do not show "No source code matching your search term found.".
  2. For logged-in non-admin users, only check UnitTypeCode for private repos. I added this condition because the earlier call to FindUserAccessibleRepoIDs() already determines which repos the user can access, yet CheckUnitUser() repeats that process before it finally checks UnitTypeCode. Checking repo.IsPrivate before calling CheckUnitUser() therefore skips some database queries.
    In my case, it reduces code search time for non-admin users from 40+s to 1+s.
    Here is my change against release 1.15:
    Joannazyx@972b758
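
The two changes can be sketched roughly as follows. This is an illustration, not the code from the linked commit: `Repo`, `permissionChecker`, `canReadCode`, `shouldSearch`, and `filterForUser` are hypothetical names, the whitespace trimming is my own assumption, and the 1000-repo data set with 10 private repos is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Repo is a hypothetical stand-in, not Gitea's real model.
type Repo struct {
	ID        int64
	IsPrivate bool
}

// permissionChecker counts simulated database-backed permission checks.
type permissionChecker struct{ queries int }

// canReadCode stands in for CheckUnitUser() with UnitTypeCode.
// Assumption: every call costs one database query.
func (p *permissionChecker) canReadCode(repoID, userID int64) bool {
	p.queries++
	return true
}

// shouldSearch implements change 1: no keyword, no search.
// Trimming whitespace is an assumption made for this sketch.
func shouldSearch(keyword string) bool {
	return strings.TrimSpace(keyword) != ""
}

// filterForUser implements change 2: only private repos go through the
// expensive check; public repos are accepted without a query.
func filterForUser(p *permissionChecker, repos []Repo, userID int64) []int64 {
	ids := make([]int64, 0, len(repos))
	for _, r := range repos {
		if r.IsPrivate && !p.canReadCode(r.ID, userID) {
			continue
		}
		ids = append(ids, r.ID)
	}
	return ids
}

func main() {
	// Made-up data: 1000 repos, every 100th one private.
	repos := make([]Repo, 0, 1000)
	for i := int64(1); i <= 1000; i++ {
		repos = append(repos, Repo{ID: i, IsPrivate: i%100 == 0})
	}

	p := &permissionChecker{}
	ids := filterForUser(p, repos, 42)
	fmt.Println(shouldSearch(""), len(ids), p.queries) // prints: false 1000 10
}
```

Note that this shortcut assumes a public repo the user can access always has a readable code unit. If the code unit can be disabled or restricted on a public repo, skipping the check would return results it should not.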

Thanks for reading my question and my naive code. I look forward to your reply. I know my attempt may not be the best solution, so please let me know if I have misunderstood something.

Screenshots

No response

@lunny lunny added the performance/speed performance issues with slow downs label Dec 16, 2021
@lunny
Member

lunny commented Dec 16, 2021

The design is totally wrong for Gitea instances with many repositories. But the check should not be removed. I will try to make a small optimization.

@a1012112796
Member

Maybe a duplicate of #12849.

@lunny lunny closed this as completed Dec 16, 2021
@lunny lunny added issue/duplicate The issue has already been reported. and removed performance/speed performance issues with slow downs labels Dec 16, 2021
@go-gitea go-gitea locked and limited conversation to collaborators Apr 28, 2022
3 participants