The LinkChecker Python program has been an effective offline method to recursively check websites from the command line.
LinkChecker is available in Debian and Ubuntu 18.04/16.04 via:
apt install linkchecker
- get the LinkChecker master code (release 9.3 is broken with current
python-requests versions) and prerequisites
git clone https://github.com/wummel/linkchecker
- install; this needs Python 2.7, as Python 3 is not yet supported
python -m pip install -e .
Internal/external links are tested recursively. This example is for a Jekyll website running on my laptop:
linkchecker --check-extern http://localhost:4000
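When the check is scripted, it can help to wait until the local server answers before starting LinkChecker. This is a minimal sketch, assuming Bash and `curl` are available; the function name `wait_for_server` and the default of 30 tries are illustrative choices, not part of Jekyll or LinkChecker:

```shell
#!/usr/bin/env bash
# wait_for_server URL [TRIES]: poll URL once per second until it answers.
# Hypothetical helper; the name and defaults are illustrative.
wait_for_server() {
  local url="$1"
  local tries="${2:-30}"
  local i
  for ((i = 0; i < tries; i++)); do
    # --fail makes curl return non-zero on HTTP errors (4xx/5xx)
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    sleep 1
  done
  # The server never answered within the allowed number of tries
  return 1
}
```

It could be combined with the check as `wait_for_server http://localhost:4000 && linkchecker --check-extern http://localhost:4000`.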
The checking process takes roughly 5-10 minutes, depending on your website size (number of pages and links). Redirect the output to a file as below if you want to save the result.
- list options for recursion depth, output format and much more:
linkchecker --help
- save the output to a text file
linkchecker --check-extern http://localhost:4000 &> check.log
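For unattended runs, the same redirect can be wrapped in a small function that also reports whether the check passed. This is only a sketch: it assumes Bash (for `&>`) and relies on LinkChecker returning a non-zero exit status when it finds invalid links or hits an internal error, as its man page describes. The name `check_site` and the default log name are illustrative:

```shell
#!/usr/bin/env bash
# check_site URL [LOGFILE]: run LinkChecker and save all output to LOGFILE.
# Hypothetical wrapper; the name and default log file are illustrative.
check_site() {
  local url="$1"
  local log="${2:-check.log}"

  # &> sends both stdout and stderr to the log file (a Bash feature)
  if linkchecker --check-extern "$url" &> "$log"; then
    echo "OK: no problems reported for $url"
  else
    # Assumed meaning of non-zero: invalid links or an internal error
    echo "FAILED: see $log"
    return 1
  fi
}
```

A call such as `check_site http://localhost:4000` writes the full report to `check.log` and prints a one-line summary.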
LinkChecker is broken only on Ubuntu 17.10, where `--check-extern` gives a lot of errors:
LinkChecker internal error, over and out
which seem to be caused by outdated references in Python 2.7. This is fixed in Ubuntu 18.04.