Check a website for broken links with Python

LinkChecker is a Python command-line program that recursively checks a website for broken links. It works both on a live (online) site and on a copy served locally (offline).

Install

  • Linux: apt install linkchecker
  • Mac / Windows:
    1. Get the LinkChecker master branch (release 9.3 is broken with current python-requests versions):
      git clone https://github.com/linkcheck/linkchecker
      cd linkchecker
      
    2. Install LinkChecker and its prerequisites with Python 2.7 (Python 3 is not yet supported at the time of writing):
      python -m pip install -e .
      

Internal links are tested recursively; add --check-extern to test external links as well. This example checks a Jekyll website served locally on my laptop:

linkchecker --check-extern http://localhost:4000

Checking takes roughly 5-10 minutes, depending on the size of your website (number of pages and links). Redirect the output to a file, as in the examples below, if you want to save the result.

Examples

  • list the options for recursion depth, output format and much more:
    linkchecker -h
    
  • save the output (stdout and stderr) to a text file; &> is Bash syntax:
    linkchecker --check-extern http://localhost:4000 &> check.log
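
The same check can be scripted. The following is a minimal Python 3 sketch, not part of LinkChecker itself: the helper names build_command and run_check are hypothetical, and it assumes the linkchecker executable is on your PATH. It uses the --recursion-level and --output options that linkchecker -h lists, and sends stdout and stderr to one log file, much like &> does in Bash.

```python
import subprocess
from typing import List, Optional


def build_command(url: str, check_extern: bool = True,
                  recursion_level: Optional[int] = None,
                  output: Optional[str] = None) -> List[str]:
    """Assemble a linkchecker command line as an argument list (hypothetical helper)."""
    cmd = ["linkchecker"]
    if check_extern:
        cmd.append("--check-extern")  # also test links that leave the site
    if recursion_level is not None:
        cmd.append(f"--recursion-level={recursion_level}")
    if output is not None:
        cmd.append(f"--output={output}")  # for example "text", "html" or "csv"
    cmd.append(url)
    return cmd


def run_check(cmd: List[str], log_path: str = "check.log") -> int:
    """Run cmd, write stdout and stderr to log_path, and return the exit code."""
    with open(log_path, "w") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example (needs linkchecker installed and the site running locally):
# run_check(build_command("http://localhost:4000"))
```

Passing the command as a list avoids shell quoting problems, and the returned exit code lets a calling script decide what to do after the run.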
    

Notes

  • On Ubuntu 17.10 only, LinkChecker is broken: --check-extern produces many errors like:

    LinkChecker internal error, over and out

    which seem to be caused by outdated references in Python 2.7. This is fixed in Ubuntu 18.04.
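
To see quickly whether a saved log is affected by this problem, you can search it for the error message. This is a minimal Python 3 sketch, assuming the output was saved to check.log as in the Examples section; the function name count_internal_errors is hypothetical, and the marker string is the message quoted in the note above.

```python
INTERNAL_ERROR = "LinkChecker internal error, over and out"


def count_internal_errors(log_path: str = "check.log") -> int:
    """Count the lines in log_path that contain the internal error message."""
    # errors="replace" keeps reading even if the log holds undecodable bytes
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return sum(1 for line in log if INTERNAL_ERROR in line)
```

A result above zero suggests the run hit the internal error and that its link results may be incomplete.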
