Closed as not planned
Labels
help wanted, page edit
Description
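I created a Python script that checks the "More information" link of every page and reports the ones that look broken.
The results may contain some false positives, because any request error (a timeout, for example) is reported as a broken link.
The listed pages should be checked and corrected.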
Script
import os
import re
import requests


def find_all_files(root_dir):
    # Collect the path of every file below root_dir.
    all_files = []
    for subdir, _, files in os.walk(root_dir):
        for file in files:
            all_files.append(os.path.join(subdir, file))
    return all_files


def extract_link(line):
    # Return the URL from a "> More information: <...>" line, or None.
    match = re.search(r'> More information: <(https?://[^>]+)>', line)
    if match:
        return match.group(1)
    return None


def check_link(url):
    # Only a 404 response counts as broken. Any request error (timeout,
    # DNS failure, ...) is also reported as broken, which is one source
    # of false positives.
    try:
        response = requests.head(url, allow_redirects=True, timeout=10)
        if response.status_code == 404:
            return False
        return True
    except requests.RequestException:
        return False


def process_files(root_dir):
    # Print the relative path of every page whose link fails the check.
    all_files = find_all_files(root_dir)
    for file_path in all_files:
        with open(file_path, 'r', encoding='utf-8') as file:
            for line in file:
                link = extract_link(line)
                if link and not check_link(link):
                    rel_path = os.path.relpath(file_path, root_dir)
                    print(f'Broken link in file: {rel_path}')


if __name__ == '__main__':
    root_directory = 'tldr/pages/'
    process_files(root_directory)
Output
Broken link in file: common/bru.md
Broken link in file: common/cabal.md
Broken link in file: common/clash.md
Broken link in file: common/deemix.md
Broken link in file: common/docker-machine.md
Broken link in file: common/gcloud-info.md
Broken link in file: common/golangci-lint.md
Broken link in file: common/hub-browse.md
Broken link in file: common/idnits.md
Broken link in file: common/jdupes.md
Broken link in file: common/magento.md
Broken link in file: common/mutagen.md
Broken link in file: common/nf-core.md
Broken link in file: common/ouch.md
Broken link in file: common/pnpx.md
Broken link in file: common/qemu-img.md
Broken link in file: common/runsv.md
Broken link in file: common/runsvchdir.md
Broken link in file: common/runsvdir.md
Broken link in file: common/sam2p.md
Broken link in file: common/secrethub.md
Broken link in file: common/slimrb.md
Broken link in file: common/spatial.md
Broken link in file: common/spfquery.md
Broken link in file: common/sv.md
Broken link in file: common/texdoc.md
Broken link in file: common/tree.md
Broken link in file: common/unison.md
Broken link in file: common/virsh.md
Broken link in file: common/wireplumber.md
Broken link in file: common/wpexec.md
Broken link in file: common/xdelta.md
Broken link in file: linux/asterisk.md
Broken link in file: linux/burpsuite.md
Broken link in file: linux/check-language-support.md
Broken link in file: linux/eopkg.md
Broken link in file: linux/feedreader.md
Broken link in file: linux/genid.md
Broken link in file: linux/guix-package.md
Broken link in file: linux/gummy.md
Broken link in file: linux/kdialog.md
Broken link in file: linux/lxterminal.md
Broken link in file: linux/ntpdate.md
Broken link in file: linux/obabel.md
Broken link in file: linux/pro.md
Broken link in file: linux/rpmbuild.md
Broken link in file: linux/swupd.md
Broken link in file: linux/virt-manager.md
Broken link in file: linux/vrms.md
Broken link in file: linux/warpd.md
Broken link in file: osx/airport.md
Broken link in file: osx/bnepd.md
Broken link in file: osx/emond.md
Broken link in file: osx/safeejectgpu.md
Broken link in file: osx/shuf.md
Broken link in file: osx/tail.md
Broken link in file: osx/webinspectord.md
Broken link in file: osx/whence.md
Broken link in file: osx/yaa.md
Broken link in file: windows/reg-flags.md
Pages with broken links:
- common/bru → pages/common/*: fix broken links #12821
- common/cabal → pages/common/*: fix broken links #12821
- common/clash → pages/common/*: fix broken links #12821
- common/deemix → pages/common/*: fix broken links #12821
- common/docker-machine → pages/common/*: fix broken links #12821
- common/gcloud-info → pages/common/*: fix broken links #12821
- common/golangci-lint → pages/common/*: fix broken links #12821
- common/hub-browse → pages/common/*: fix broken links #12821
- common/idnits → pages/common/*: fix broken links #12821
- common/jdupes → pages/common/*: fix broken links #12821
- common/magento → false positive (page took too long to load)
- common/mutagen → false positive
- common/nf-core → pages/common/*: fix broken links #12821
- common/ouch → false positive
- common/pnpx → pages/common/*: fix broken links #12821
- common/qemu-img → pages/common/*: fix broken links #12821
- common/runsv →
- common/runsvchdir →
- common/runsvdir →
- common/sam2p → pages*/common/*: fix broken links #12826
- common/secrethub → pages*/common/*: fix broken links #12826
- common/slimrb → pages*/common/*: fix broken links #12826
- common/spatial → spatial: delete page #12853
- common/spfquery → pages*/common/*: fix broken links #12826
- common/sv →
- common/texdoc →
- common/tree → pages*/common/*: fix broken links #12826
- common/unison → pages*/common/*: fix broken links #12826
- common/virsh → pages*/common/*: fix broken links #12826
- common/wireplumber → pages*/common/*: fix broken links #12826
- common/wpexec → pages*/common/*: fix broken links #12826
- common/xdelta → pages*/common/*: fix broken links #12826
- linux/asterisk → pages/linux/*: replace broken links #12850
- linux/burpsuite →
- linux/check-language-support →
- linux/eopkg → pages/linux/*: replace broken links #12850
- linux/feedreader → pages/linux/*: replace broken links #12850
- linux/genid → pages/linux/*: replace broken links #12850
- linux/guix-package →
- linux/gummy → pages/linux/*: replace broken links #12850
- linux/kdialog → pages/linux/*: replace broken links #12850
- linux/lxterminal → pages/linux/*: replace broken links #12850
- linux/ntpdate → pages/linux/*: replace broken links #12850
- linux/obabel → pages/linux/*: replace broken links #12850
- linux/pro →
- linux/rpmbuild → pages/linux/*: replace broken links #12850
- linux/swupd → pages/linux/*: replace broken links #12850
- linux/virt-manager →
- linux/vrms → pages/linux/*: replace broken links #12850
- linux/warpd → pages/linux/*: replace broken links #12850
- osx/airport →
- osx/bnepd →
- osx/emond →
- osx/safeejectgpu →
- osx/shuf →
- osx/tail → tail: fix link #13813
- osx/webinspectord →
- osx/whence →
- osx/yaa →
- windows/reg-flags →
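Some of the entries above turned out to be false positives, for example a page that took too long to load. The following is a minimal sketch, not the script that produced this list, of how the check could keep "broken" and "could not be reached" apart. It assumes the standard library `urllib` in place of `requests`, and the names `http_status` and `classify_link` are hypothetical. Retrying with `GET` when a server answers `HEAD` with 405 or 501, and treating 410 like 404, are also assumptions about what would help here.

```python
import urllib.error
import urllib.request


def http_status(method, url, timeout=10):
    """Return the HTTP status code for one request.

    Network failures (DNS errors, timeouts, refused connections)
    propagate as OSError subclasses.
    """
    request = urllib.request.Request(
        url, method=method, headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as HTTPError; keep the code.
        return error.code


def classify_link(url, status_of=http_status):
    """Classify a URL as 'ok', 'broken', or 'unreachable'.

    status_of is injectable so the logic can be tested offline.
    """
    try:
        status = status_of("HEAD", url)
        if status in (405, 501):
            # Assumption: the server rejects HEAD, so retry with GET.
            status = status_of("GET", url)
    except OSError:
        # Timeout or network error: worth a manual look, not proof of a dead link.
        return "unreachable"
    if status in (404, 410):
        return "broken"
    return "ok"
```

With something like this, only the `broken` results would need a fix, while the `unreachable` ones could be listed separately for a manual check.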