If your website is not Markdown-based, LinkChecker is a large Python program that has been an effective way to recursively check the HTML pages of a website, offline or online, from the command line. However, it is not frequently maintained, and it produces a growing number of false positives and false negatives.
The PyPI releases are out of date, so instead of the usual
pip install linkchecker
we recommend installing the development LinkChecker code:
git clone --depth 1 https://github.com/linkchecker/linkchecker/
cd linkchecker
python -m pip install -e .
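After installing, a quick sanity check is to confirm that the shell can actually find a linkchecker command before starting a long run. The sketch below is only an illustration: the function name find_linkchecker is made up for this example, and it reports whatever the shell resolves first, which may or may not be the development install.

```shell
# Sketch: report whether a linkchecker command is available to this shell.
# find_linkchecker is a hypothetical helper name, not part of LinkChecker.
find_linkchecker() {
    # "command -v" prints how the shell resolves the name and
    # returns non-zero if the name cannot be resolved at all.
    if lc_path=$(command -v linkchecker); then
        echo "linkchecker found: $lc_path"
    else
        echo "linkchecker not found on PATH" >&2
        return 1
    fi
}

# Example:
# find_linkchecker
```

If the reported location is a system directory rather than your user or virtual environment directory, the system package is probably the one being run.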
Internal/external links are tested recursively. This example is for a Jekyll website running on my laptop:
linkchecker --check-extern http://localhost:4000
Checking takes several minutes, perhaps even 20-30 minutes, depending on your website size (number of pages and links). Redirect the output to a file as below if you want to save the result (recommended).
- list options for recursion depth, output format and much more:
linkchecker -h
- save the output to a text file:
linkchecker --check-extern http://localhost:4000 &> check.log
Monitor progress from a second terminal with:
tail -f check.log
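The save-to-file step can be wrapped in a small function that also reports the outcome, which is handy in a script. This is a minimal sketch under a few assumptions: the name check_site and its arguments are illustrative, and it relies on LinkChecker returning a non-zero exit status when it finds invalid links or hits a program error, as its documentation describes. It uses the portable `> file 2>&1` redirection, which does the same job as the bash-specific `&>` above.

```shell
# Sketch: run LinkChecker on a site, save all output, and report the outcome.
# check_site is a hypothetical helper name; the default log name is arbitrary.
check_site() {
    site_url="$1"
    log_file="${2:-check.log}"

    # Send both stdout and stderr to the log file (POSIX form of "&>").
    if linkchecker --check-extern "$site_url" > "$log_file" 2>&1; then
        echo "no problems reported, see $log_file"
    else
        # Non-zero exit status: invalid links were found or the run failed.
        echo "problems reported, see $log_file"
        return 1
    fi
}

# Example (assumes a site is being served locally):
# check_site http://localhost:4000 check.log
```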
On Ubuntu 17.10 only, LinkChecker is broken when installed from the system package:
apt install linkchecker
With that package, --check-extern gives a lot of errors:
LinkChecker internal error, over and out
These errors seem to come from outdated references in Python 2.7. This is fixed in Ubuntu 18.04 (or by using the install method recommended at the top of this article).
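To see how badly a saved run was affected, you can count how often that message shows up in the log. This is a rough sketch: the function name count_internal_errors is made up here, and it assumes the message appears in the log with the wording quoted above.

```shell
# Sketch: count log lines containing the internal-error message.
# count_internal_errors is a hypothetical helper name.
count_internal_errors() {
    log_file="${1:-check.log}"

    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi

    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so "|| true" keeps the function status at success.
    grep -c "LinkChecker internal error" "$log_file" || true
}

# Example:
# count_internal_errors check.log
```

A count of 0 after switching to the development install suggests the problem was specific to the system package.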