adds: rel='nofollow' to all links generated by the application #17345

Closed
raygervais wants to merge 1 commit

raygervais

Closes #17341

This adds rel="nofollow" to all links, which (I believe) complies with the ask itself. Though I had considered using a regex to identify the links, I opted instead to run a find command to locate every <a in the repository and went in with a surgeon's hands from there.

Looking forward to contributing further and improving this PR in any way that's required.

@wxiaoguang
Contributor

wxiaoguang commented Oct 18, 2021

I think we should do this more carefully. Replacing every <a> is not a good idea, and it makes the situation more complex for new code.

My thoughts:

  1. We can just add <meta name="robots" content="noindex,nofollow"> to the <head> instead of replacing every <a>
  2. I believe some pages should be indexed; just as with GitHub, people should be able to find repositories through Google. We can hide just the old commit pages to reduce server load.

@GiteaBot added the lgtm/need 2 label (This PR needs two approvals by maintainers to be considered for merging.) Oct 18, 2021
@noerw
Member

noerw commented Oct 18, 2021

@wxiaoguang

I believe there are pages that should be indexed, just like GitHub, you can search GitHub repositories from Google.

Unlike GitHub, Gitea runs on resource-constrained devices, where the extra load of search bots may be significant.
This is about reducing load on the instance from bots, not avoiding indexing at all.
You can index a file, but do you need to index content from all old commits? Does the blame page being indexed have any value?
This is also about actions that are not links, but interaction elements, sort-by buttons, star/fork/watch buttons etc.

<meta name="robots" content="noindex"> to the <head>

This won't help: once the page is rendered, most of the work on Gitea's side has already been done.

I agree though that this approach here will be hard to maintain. Hm.

@wxiaoguang
Contributor

wxiaoguang commented Oct 18, 2021

@noerw

Yep, we all agree that some pages need to be indexed while others needn't. I think making the indexable flag page-level is better.

For your second concern, I think it won't be a problem. We can mark all old commit pages as noindex,nofollow. Then spiders stop at the page level and won't follow the links inside the old pages.

Anyway, page-level flags are easier to maintain than link-level.
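
As a rough sketch of how such a page-level decision could be made in one place, the function below maps a request path to a robots directive. The function name `robotsContent`, the `/{owner}/{repo}/{section}/...` path layout, and the set of sections treated as low value are all assumptions for illustration, not Gitea's actual routing.

```go
package main

import (
	"fmt"
	"strings"
)

// noIndexSections is an assumed set of repository sub-pages that are not
// worth indexing (old commits, blame views).
var noIndexSections = map[string]bool{
	"commit": true,
	"blame":  true,
}

// robotsContent returns the robots directive for a path, assuming the
// hypothetical layout /{owner}/{repo}/{section}/... . Only the third
// path segment is inspected, so an owner or repository that happens to
// be named "commit" or "blame" is not affected.
func robotsContent(path string) string {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) >= 3 && noIndexSections[parts[2]] {
		return "noindex,nofollow"
	}
	return "index,follow"
}

func main() {
	for _, p := range []string{
		"/go-gitea/gitea",
		"/go-gitea/gitea/commit/abc123",
		"/go-gitea/gitea/blame/main/README.md",
	} {
		fmt.Printf("%s -> %s\n", p, robotsContent(p))
	}
}
```

Keeping the rule in a single function means the list of non-indexable sections can change without touching individual links or templates.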

reference:

@raygervais
Author

Will review the feedback and update accordingly, thanks for the advice

@raygervais closed this Jul 10, 2022
@go-gitea locked and limited conversation to collaborators May 3, 2023
Successfully merging this pull request may close these issues.

Mark most UI links & buttons as rel="nofollow" to avoid constant bot traffic
4 participants