linkextr.py - Extract links from Markdown files

Extracts links, and optionally images, from Markdown files. Outputs a sorted list of unique links, one per line.

Extracted links can be further processed by other programs.
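
As one illustration of such processing, the sketch below takes the one-link-per-line output and counts how many links point to each host. The function name count_hosts and the sample links are assumptions made for this example; they are not part of linkextr.py.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_hosts(lines):
    """Count links per host from an iterable of lines, one link per line."""
    hosts = Counter()
    for line in lines:
        link = line.strip()
        if not link:
            continue  # skip blank lines
        # netloc is empty for links without a host, e.g. mailto: or relative links
        host = urlsplit(link).netloc
        hosts[host or "(no host)"] += 1
    return hosts


# Sample input standing in for the tool's output.
sample = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.org/",
]
for host, count in count_hosts(sample).most_common():
    print(count, host)
```

The same function could be given an open file, such as one written with --output, since it accepts any iterable of lines.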

Tested on Windows and Linux.

Install

git clone https://github.com/wpdevelopment11/linkextr
cd linkextr
python3 -m venv .venv
source .venv/bin/activate
# On Windows, run .venv\Scripts\activate instead

# Install mistletoe, the package used to parse Markdown files
pip install mistletoe

Example

Extract links from all Markdown files in a directory (recursively) and print them to stdout:

./linkextr.py /path/to/dir

Same, but write the output to a file:

./linkextr.py --output links.txt /path/to/dir

Extract links from the specified Markdown files only:

./linkextr.py first_file.md second_file.md

Extract images too, but only if their src starts with http:// or https://:

./linkextr.py --images first_file.md second_file.md

Add a prefix to links that start with a forward slash to form a valid URL:

# For example:
# [see this post](/posts/hello-world) => https://example.com/posts/hello-world

./linkextr.py --prefix https://example.com file.md
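
The filtering and prefixing rules shown in the examples above can be sketched as a small function. This is a minimal illustration of the documented behavior, not the actual code of linkextr.py; the names normalize_link and unique_sorted are hypothetical, and plain string concatenation for the prefix is an assumption based on the example above.

```python
def normalize_link(link, prefix=None, all_uri=False):
    """Return the link to emit, or None if it should be dropped.

    Hypothetical helper that mirrors the documented --prefix and
    --alluri behavior; the real script may work differently.
    """
    # Links that start with a forward slash get the prefix, if one was given.
    if prefix and link.startswith("/"):
        link = prefix + link
    # Unless all_uri is set, keep only http:// and https:// links.
    if not all_uri and not link.startswith(("http://", "https://")):
        return None
    return link


def unique_sorted(links, **options):
    """Normalize links, drop rejected ones, and return a sorted list without duplicates."""
    kept = (normalize_link(link, **options) for link in links)
    return sorted({link for link in kept if link is not None})


links = ["/posts/hello-world", "https://example.com/about", "mailto:someone@example.com"]
print("\n".join(unique_sorted(links, prefix="https://example.com")))
```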

Usage

linkextr.py [-h] [-o OUTPUT] [-p PREFIX] [-a] [-i] [dir | file(s) ...]

Extract links from Markdown files

Positional arguments:
  dir | file(s)        Directory or zero or more Markdown files to extract the links from (default: stdin)

Options:
  -o, --output OUTPUT  File to write extracted links (default: stdout)
  -p, --prefix PREFIX  Add a prefix to the links that start with a forward slash
  -a, --alluri         Extract all links, even if they don't start with http:// or https://
  -i, --images         Extract URLs of images in addition to links
  -h, --help           Show this help message and exit
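
For readers who want to build a similar command-line interface, the help text above could be produced by an argparse setup roughly like the following. It is reconstructed from the help output only, so the real script may differ; the function name build_parser, the destination name paths, and the PATH metavar are assumptions.

```python
import argparse


def build_parser():
    """Build a parser resembling the documented options (a sketch, not the real source)."""
    parser = argparse.ArgumentParser(
        prog="linkextr.py",
        description="Extract links from Markdown files",
    )
    # Zero or more paths; an empty list would mean "read from stdin".
    parser.add_argument("paths", nargs="*", metavar="PATH",
                        help="Directory or Markdown files to extract the links from")
    parser.add_argument("-o", "--output",
                        help="File to write extracted links (default: stdout)")
    parser.add_argument("-p", "--prefix",
                        help="Add a prefix to the links that start with a forward slash")
    parser.add_argument("-a", "--alluri", action="store_true",
                        help="Extract all links, even if they don't start with http:// or https://")
    parser.add_argument("-i", "--images", action="store_true",
                        help="Extract URLs of images in addition to links")
    return parser


args = build_parser().parse_args(["--images", "--output", "links.txt", "first_file.md"])
print(args.paths, args.output, args.images)
```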

Run tests

python3 -m unittest discover test

Limitations

Only Markdown links are supported; links in <a> HTML tags are not extracted.
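
If links inside <a> tags are needed, one possible workaround is a separate pass with Python's standard html.parser module. The sketch below is independent of linkextr.py; the names AnchorCollector and extract_anchor_hrefs are hypothetical. Note that it scans the raw text, so it would also pick up <a> tags that appear inside Markdown code blocks.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper, not part of linkextr.py)."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this method.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.add(value)


def extract_anchor_hrefs(text):
    """Return a sorted list of unique href values found in <a> tags in the text."""
    collector = AnchorCollector()
    collector.feed(text)
    collector.close()
    return sorted(collector.hrefs)


text = 'A [Markdown link](https://example.com/md) and <a href="https://example.com/html">an HTML link</a>.'
print(extract_anchor_hrefs(text))
```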
