Code Search is very slow for non-admin users  #17998

Closed
@Joannazyx

Gitea Version

1.15.5

Git Version

git version 2.34.1

Operating System

k8s

How are you running Gitea?

docker

Database

MySQL

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Description

We recently enabled the repository indexer (the code search feature) and chose Elasticsearch as the search engine. It worked fine at first, but after we created more than 20,000 repos, code search became very slow, especially for non-admin users.
My test results with 23,551 repos, most of them public, are as follows:

  1. If the user is not logged in, or is the admin, the code search takes no more than 2s.
    log: 2021/11/25 17:51:49 Completed GET /explore/code 200 OK in 230.275728ms
  2. If the user is logged in but is not an admin, the code search takes more than 40s.
    log: 2021/11/25 17:51:57 Completed GET /explore/code 200 OK in 46.054051649s

After checking the code, I think the bottleneck is the loop in routers/web/explore/code.go.
For a logged-in non-admin user, Gitea first checks UnitTypeCode by looping through all the repos, and the CheckUnitUser() function called in that loop can issue several database queries. If the user is not logged in, or is the admin, Gitea does not check UnitTypeCode.
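To illustrate why such a loop scales badly, here is a minimal self-contained sketch. It is not the Gitea source: `Repo`, `fakeDB`, `checkUnitUser`, and `filterPerRepo` are hypothetical stand-ins, and the sketch assumes each per-repo check costs exactly one query (the real CheckUnitUser() may cost several).

```go
package main

import "fmt"

// Repo is a hypothetical stand-in for a repository row.
type Repo struct {
	ID        int64
	IsPrivate bool
}

// fakeDB only counts how many queries were issued against it.
type fakeDB struct{ queries int }

// checkUnitUser is a hypothetical stand-in for the per-repo permission
// check. Assumption: one query per call, and access is always granted.
func (db *fakeDB) checkUnitUser(r Repo, userID int64) bool {
	db.queries++
	return true
}

// filterPerRepo mirrors the shape of the loop described above:
// one permission check (and so at least one query) per repository.
func filterPerRepo(db *fakeDB, repos []Repo, userID int64) []int64 {
	ids := make([]int64, 0, len(repos))
	for _, r := range repos {
		if db.checkUnitUser(r, userID) {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

func main() {
	// Same repository count as in the report.
	repos := make([]Repo, 23551)
	for i := range repos {
		repos[i] = Repo{ID: int64(i + 1)}
	}
	db := &fakeDB{}
	ids := filterPerRepo(db, repos, 42)
	fmt.Printf("repos kept: %d, queries issued: %d\n", len(ids), db.queries)
}
```

In this sketch the query count grows linearly with the number of repositories, which is consistent with the request time growing as more repos are created.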
My attempt to improve the performance of code search includes:

  1. When someone first opens the code search page and has not entered a keyword, do not run a search and do not show "No source code matching your search term found.", just as Google shows no results before a query is typed.
  2. For logged-in non-admin users, only check UnitTypeCode for private repos. I added this condition because the earlier call to FindUserAccessibleRepoIDs() already determines which repos the user can access, and CheckUnitUser() repeats that process before finally checking UnitTypeCode. Checking repo.IsPrivate outside CheckUnitUser() therefore skips some database queries.
    In my case, it reduced code search time for non-admin users from 40+s to 1+s.
    Here is my change against release 1.15:
    Joannazyx@972b758
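The two changes above can be sketched roughly as follows. This is an illustration of the idea, not the linked commit: `Repo`, `fakeDB`, `checkUnitUser`, `shouldSearch`, and `filterPrivateOnly` are hypothetical names, each check is assumed to cost one query, and trimming whitespace from the keyword is an extra assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// Repo is a hypothetical stand-in for a repository row.
type Repo struct {
	ID        int64
	IsPrivate bool
}

// fakeDB only counts how many queries were issued against it.
type fakeDB struct{ queries int }

// checkUnitUser is a hypothetical stand-in for the per-repo permission
// check. Assumption: one query per call, and access is always granted.
func (db *fakeDB) checkUnitUser(r Repo, userID int64) bool {
	db.queries++
	return true
}

// shouldSearch implements point 1: no search is run for an empty keyword.
// Treating whitespace-only input as empty is an assumption of this sketch.
func shouldSearch(keyword string) bool {
	return strings.TrimSpace(keyword) != ""
}

// filterPrivateOnly implements point 2: public repos are kept without a
// permission check, so queries are issued only for private repos.
func filterPrivateOnly(db *fakeDB, repos []Repo, userID int64) []int64 {
	ids := make([]int64, 0, len(repos))
	for _, r := range repos {
		if !r.IsPrivate || db.checkUnitUser(r, userID) {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

func main() {
	// Mostly public repos, as in the report; the 1-in-100 private ratio
	// is made up for illustration.
	repos := make([]Repo, 23551)
	for i := range repos {
		repos[i] = Repo{ID: int64(i + 1), IsPrivate: i%100 == 0}
	}
	db := &fakeDB{}
	if shouldSearch("main.go") {
		ids := filterPrivateOnly(db, repos, 42)
		fmt.Printf("repos kept: %d, queries issued: %d\n", len(ids), db.queries)
	}
}
```

The saving comes from the short-circuit in `!r.IsPrivate || ...`: when most repos are public, most iterations never touch the database. Whether it is always safe to skip the check for a public repo (for example, one with the code unit disabled) is not established here and would need to be confirmed against Gitea's permission model.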

Thanks for reading my question and naive code. I look forward to your reply. I know my attempt may not be the best solution, so please let me know if I have misunderstood something.

Screenshots

No response

Labels

issue/duplicate: The issue has already been reported.