Broken links #12808

@spageektti

Description

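I wrote a Python script that checks the "More information" link of every page under tldr/pages/ and reports the pages whose link looks broken.
The results may include some false positives.
Each listed page should be checked by hand and its link corrected where needed.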
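One likely source of false positives is that the script below reports a link as broken both when the server answers 404 and when the request fails for any other reason (timeout, DNS or TLS error). A minimal sketch of keeping those cases apart follows; the function name `classify` and the verdict strings are hypothetical and not part of the original script.

```python
from typing import Optional


def classify(status: Optional[int]) -> str:
    """Map an HTTP status code (or None for a failed request) to a verdict.

    The verdict names are illustrative assumptions, not part of the script.
    """
    if status is None:
        # Timeout, DNS failure, TLS error: possibly transient, not proof of a dead link.
        return "unreachable"
    if status in (404, 410):
        # The server explicitly says the resource is missing or gone.
        return "broken"
    if status < 400:
        return "ok"
    # Other 4xx/5xx codes (403, 405, 429, 500, ...): the server refused or failed,
    # so the page may still exist and needs a manual look.
    return "uncertain"
```

Only the "broken" verdict would then be reported with confidence; "unreachable" and "uncertain" results could be listed separately for manual review.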

Script
import os
import re
import requests

def find_all_files(root_dir):
    all_files = []
    for subdir, _, files in os.walk(root_dir):
        for file in files:
            all_files.append(os.path.join(subdir, file))
    return all_files

def extract_link(line):
    match = re.search(r'> More information: <(https?://[^>]+)>', line)
    if match:
        return match.group(1)
    return None

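def check_link(url):
    # A link counts as broken on a 404 response or when the request itself
    # fails (timeout, DNS or TLS error); the latter can cause false positives.
    try:
        response = requests.head(url, allow_redirects=True, timeout=10)
        return response.status_code != 404
    except requests.RequestException:
        return False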

def process_files(root_dir):
    all_files = find_all_files(root_dir)
    for file_path in all_files:
        with open(file_path, 'r', encoding='utf-8') as file:
            for line in file:
                link = extract_link(line)
                if link and not check_link(link):
                    rel_path = os.path.relpath(file_path, root_dir)
                    print(f'Broken link in file: {rel_path}')

if __name__ == '__main__':
    root_directory = 'tldr/pages/'
    process_files(root_directory)
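Some servers answer a HEAD request differently from a GET request, which is another plausible cause of false positives in the script above. The sketch below retries with GET whenever HEAD returns an error status. The names `status_with_fallback` and `requests_status` are hypothetical; the fetch functions are passed in as parameters so the logic can be exercised without network access.

```python
from typing import Callable


def status_with_fallback(
    url: str,
    head: Callable[[str], int],
    get: Callable[[str], int],
) -> int:
    """Return the HEAD status, or the GET status if HEAD reported an error."""
    status = head(url)
    if status >= 400:
        # HEAD may be unsupported or handled differently; confirm with GET.
        status = get(url)
    return status


def requests_status(method: str, url: str) -> int:
    """Fetch only the status code of `url` using the requests library."""
    import requests  # imported lazily so status_with_fallback works without it

    # stream=True avoids downloading the response body for GET requests.
    response = requests.request(
        method, url, allow_redirects=True, timeout=10, stream=True
    )
    response.close()
    return response.status_code
```

With requests installed, a call could look like `status_with_fallback(url, lambda u: requests_status("HEAD", u), lambda u: requests_status("GET", u))`; request exceptions would still need to be caught by the caller, as `check_link` does.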
Output
Broken link in file: common/bru.md
Broken link in file: common/cabal.md
Broken link in file: common/clash.md
Broken link in file: common/deemix.md
Broken link in file: common/docker-machine.md
Broken link in file: common/gcloud-info.md
Broken link in file: common/golangci-lint.md
Broken link in file: common/hub-browse.md
Broken link in file: common/idnits.md
Broken link in file: common/jdupes.md
Broken link in file: common/magento.md
Broken link in file: common/mutagen.md
Broken link in file: common/nf-core.md
Broken link in file: common/ouch.md
Broken link in file: common/pnpx.md
Broken link in file: common/qemu-img.md
Broken link in file: common/runsv.md
Broken link in file: common/runsvchdir.md
Broken link in file: common/runsvdir.md
Broken link in file: common/sam2p.md
Broken link in file: common/secrethub.md
Broken link in file: common/slimrb.md
Broken link in file: common/spatial.md
Broken link in file: common/spfquery.md
Broken link in file: common/sv.md
Broken link in file: common/texdoc.md
Broken link in file: common/tree.md
Broken link in file: common/unison.md
Broken link in file: common/virsh.md
Broken link in file: common/wireplumber.md
Broken link in file: common/wpexec.md
Broken link in file: common/xdelta.md
Broken link in file: linux/asterisk.md
Broken link in file: linux/burpsuite.md
Broken link in file: linux/check-language-support.md
Broken link in file: linux/eopkg.md
Broken link in file: linux/feedreader.md
Broken link in file: linux/genid.md
Broken link in file: linux/guix-package.md
Broken link in file: linux/gummy.md
Broken link in file: linux/kdialog.md
Broken link in file: linux/lxterminal.md
Broken link in file: linux/ntpdate.md
Broken link in file: linux/obabel.md
Broken link in file: linux/pro.md
Broken link in file: linux/rpmbuild.md
Broken link in file: linux/swupd.md
Broken link in file: linux/virt-manager.md
Broken link in file: linux/vrms.md
Broken link in file: linux/warpd.md
Broken link in file: osx/airport.md
Broken link in file: osx/bnepd.md
Broken link in file: osx/emond.md
Broken link in file: osx/safeejectgpu.md
Broken link in file: osx/shuf.md
Broken link in file: osx/tail.md
Broken link in file: osx/webinspectord.md
Broken link in file: osx/whence.md
Broken link in file: osx/yaa.md
Broken link in file: windows/reg-flags.md
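To make the list above easier to work through, the output lines can be grouped by platform and turned into a checklist. This is a minimal sketch that assumes the exact "Broken link in file: <platform>/<page>" line format shown above; the function names `group_by_platform` and `to_checklist` are hypothetical.

```python
from collections import defaultdict

PREFIX = "Broken link in file: "


def group_by_platform(output_lines):
    """Group reported page paths by their top-level directory (platform)."""
    groups = defaultdict(list)
    for line in output_lines:
        line = line.strip()
        if not line.startswith(PREFIX):
            continue  # ignore anything that is not a report line
        path = line[len(PREFIX):]
        platform, _, page = path.partition("/")
        groups[platform].append(page)
    return dict(groups)


def to_checklist(groups):
    """Render the groups as plain text with one '- [ ]' item per page."""
    lines = []
    for platform in sorted(groups):
        lines.append(f"{platform}:")
        for page in sorted(groups[platform]):
            lines.append(f"- [ ] {page}")
    return "\n".join(lines)
```

For example, `print(to_checklist(group_by_platform(open("output.txt"))))` would print the checklist for a saved copy of the output, where `output.txt` is an assumed file name.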

Pages with broken links:

Labels: help wanted, page edit
