RFE: Run linkchecker on documentation on the CI #84947

Open
hroncok mannequin opened this issue May 25, 2020 · 7 comments
Labels
3.10 (only security fixes)
docs (Documentation in the Doc dir)

Comments

@hroncok
Mannequin

hroncok mannequin commented May 25, 2020

BPO 40770
Nosy @terryjreedy, @vstinner, @ned-deily, @JulienPalard, @hroncok, @amaajemyfren, @petdance

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2020-05-25.16:09:02.564>
labels = ['3.10', 'docs']
title = 'RFE: Run linkchecker on documentation on the CI'
updated_at = <Date 2020-05-30.02:14:20.599>
user = 'https://github.com/hroncok'

bugs.python.org fields:

activity = <Date 2020-05-30.02:14:20.599>
actor = 'terry.reedy'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2020-05-25.16:09:02.564>
creator = 'hroncok'
dependencies = []
files = []
hgrepos = []
issue_num = 40770
keywords = []
message_count = 7.0
messages = ['369892', '369893', '370196', '370202', '370226', '370270', '370353']
nosy_count = 8.0
nosy_names = ['terry.reedy', 'vstinner', 'ned.deily', 'docs@python', 'mdk', 'hroncok', 'amaajemyfren', 'petdance']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue40770'
versions = ['Python 3.10']

@hroncok
Mannequin Author

hroncok mannequin commented May 25, 2020

In Fedora, we run the following check when we build Python documentation:

# Verify that all of the local links work
#
# (we can't check network links, as we shouldn't be making network connections
# within a build.  Also, don't bother checking the .txt source files; some
# contain example URLs, which don't work)
linkchecker \
  --ignore-url=^mailto: --ignore-url=^http --ignore-url=^ftp \
  --ignore-url=.txt\$ --no-warnings \
  Doc/build/html/index.html

From time to time, it discovers broken links:

#15700
#20383
#20388

It would be really nice if this check ran as part of the CI that builds the documentation.
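
To illustrate what a local-only check of this kind verifies, here is a minimal Python sketch. It is not linkchecker and not how Fedora runs the check; the names `find_broken_local_links` and `_HrefCollector` are hypothetical. It assumes the built HTML lives in one directory tree, skips the same kinds of URLs as the command above (mailto, http, ftp, and .txt targets), and only tests that the target file exists. Unlike linkchecker it does not crawl from index.html or verify that `#fragment` anchors exist.

```python
import re
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urldefrag

# Modelled on the --ignore-url options in the command above (an approximation,
# not linkchecker's own matching logic).
IGNORED = [re.compile(p) for p in (r"^mailto:", r"^http", r"^ftp", r"\.txt$")]


class _HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_broken_local_links(html_root):
    """Return (page, href) pairs whose local target file does not exist."""
    root = Path(html_root)
    broken = []
    for page in sorted(root.rglob("*.html")):
        collector = _HrefCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for href in collector.hrefs:
            if any(pattern.search(href) for pattern in IGNORED):
                continue
            path_part, _fragment = urldefrag(href)
            if not path_part:
                continue  # same-page anchor such as "#section"
            # Relative links are resolved against the directory of the page.
            target = page.parent / unquote(path_part)
            if not target.exists():
                broken.append((str(page.relative_to(root)), href))
    return broken
```

A CI step could call `find_broken_local_links("Doc/build/html")` and exit non-zero when the returned list is not empty.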

@hroncok hroncok mannequin added the 3.10 only security fixes label May 25, 2020
@hroncok hroncok mannequin assigned docspython May 25, 2020
@hroncok hroncok mannequin added the docs Documentation in the Doc dir label May 25, 2020
@hroncok
Mannequin Author

hroncok mannequin commented May 25, 2020

Side note: linkchecker can be installed via pip, but the released version is not Python 3 compatible. In Fedora, we package it from git.

@hroncok
Mannequin Author

hroncok mannequin commented May 28, 2020

Note: I would gladly contribute this check, but I have no idea where I should do that.

@amaajemyfren
Mannequin

amaajemyfren mannequin commented May 28, 2020

On Thu, May 28, 2020 at 3:13 PM Miro Hrončok <[email protected]> wrote:

> Note: I would gladly contribute this check, but I have no idea where should I do that.

I don't know either. I suspect it will have to be with one of the
CI/CD providers that cpython uses.

I _think_ it uses three:
a. Travis cpython/.travis.yml
b. GitHub Actions .github/workflows/doc.yml
c. Azure Pipelines .azure-pipelines/docs-steps.yml

Beyond that, no idea. I fear I am also blind here. Still, Google is my friend.

@petdance
Mannequin

petdance mannequin commented May 28, 2020

Some high-level questions to consider:

  • Is it run only when a build of the docs is started? Or should it be done regularly (daily/weekly?) to keep an eye on links so that it's not a surprise when build time comes along?

  • Does a broken link stop the build, or is it just advisory?

  • Who sees the results? Are they emailed to someone? A mailing list? Posted somewhere publicly?

  • Is someone assigned responsibility for acting on the failures?

  • What counts as a failure? Is a 301 redirect OK? It seems that a 301 might be OK to pass, but someone should know about it to update to the new URL.

I am not familiar with the current documentation build process, so forgive me if these are already answered somehow. I'm not looking for answers myself, only offering suggestions.
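
One way to make the "what counts as a failure" and "stop the build or advisory" questions concrete is a small policy function. The sketch below is only one possible policy, not something decided in this issue: the names `Outcome`, `classify_status`, and `should_fail_build` are hypothetical, and treating permanent redirects (301, 308) as a warning rather than a failure is an assumption taken from the suggestion above that a 301 might pass but should still be reported.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    WARN = "warn"  # passes, but someone should update the URL
    FAIL = "fail"


def classify_status(status, redirects_fail=False):
    """Map an HTTP status code to an outcome under the assumed policy."""
    if 200 <= status < 300:
        return Outcome.OK
    if status in (301, 308):
        # Permanent redirect: the link still works, but the URL is stale.
        return Outcome.FAIL if redirects_fail else Outcome.WARN
    if 300 <= status < 400:
        return Outcome.OK  # other redirects are accepted as-is here
    return Outcome.FAIL


def should_fail_build(statuses, advisory=False):
    """Return True when any link failed and the check is not advisory."""
    if advisory:
        return False
    return any(classify_status(s) is Outcome.FAIL for s in statuses)
```

With this policy, `should_fail_build([200, 301])` is false while `should_fail_build([200, 404])` is true; passing `advisory=True` reports results without ever stopping the build.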

@ned-deily
Member

I think our CI checks already take too long to run and possibly use more than our fair share of global open source resources (provided by GitHub, Travis, and MS Azure), especially considering how infrequently you would expect to find a problem and the low severity of missing one immediately. I think a more appropriate choice would be to set up a buildbot to do such a check; weekly is perhaps often enough, and it should run no more than daily.

Julien, what do you think?

@terryjreedy
Member

Something rebuilds the online docs once a day. That same something might be appropriate for running a link checker (including external links) once a week, say.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022