Check a website for broken links with Python

The LinkChecker Python program has been an effective offline method to recursively check websites from the command line.

  1. get the LinkChecker master code and prerequisites (release 9.3 is broken with current python-requests versions), then change into the cloned directory
    git clone https://github.com/wummel/linkchecker
    cd linkchecker
    
  2. install. Note that you’ll need Python 2.7; Python 3 is not yet supported
    pip install -e .
    
  3. check internal/external links recursively. This example is for a Jekyll website running on my laptop
    python linkchecker --check-extern http://localhost:4000
    

The checking process takes about 5-10 minutes, depending on the size of your website (number of pages and links). By default the output goes only to your terminal. Redirect it to a file, as shown under Examples below, if you want to save the result.
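
To show why the run time grows with the number of pages and links, here is a minimal sketch of recursive link checking written with only the Python 3 standard library. It is not how LinkChecker itself is implemented, and the names LinkExtractor, extract_links, fetch and check_site are hypothetical ones chosen for this illustration. The sketch assumes the site is reachable over plain http/https and only follows links found on pages of the starting host, while still testing external links once.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collect the href/src attribute values found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs linked from html, without #fragments."""
    parser = LinkExtractor()
    parser.feed(html)
    urls = []
    for link in parser.links:
        absolute = urldefrag(urljoin(base_url, link)).url
        if urlparse(absolute).scheme in ("http", "https"):
            urls.append(absolute)  # skips mailto:, javascript:, etc.
    return urls


def fetch(url):
    """Return (status, body); status is None if the request itself failed."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        return err.code, ""
    except (OSError, ValueError):
        return None, ""


def check_site(start_url, fetcher=fetch):
    """Visit every reachable URL once; return (broken, number_checked)."""
    site = urlparse(start_url).netloc
    seen, queue, broken = set(), [start_url], []
    while queue:
        url = queue.pop()
        if url in seen:
            continue
        seen.add(url)
        status, html = fetcher(url)
        if status is None or status >= 400:
            broken.append((url, status))
            continue
        # Recurse only into pages on the starting host; external
        # links are tested but not crawled further.
        if urlparse(url).netloc == site:
            queue.extend(extract_links(html, url))
    return broken, len(seen)
```

Calling check_site("http://localhost:4000/") would make one request per unique URL, which is why a site with many pages and external links takes minutes to check. A real tool such as LinkChecker does far more (robots.txt, redirects, parallel requests, many output formats), so treat this only as an outline of the idea.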

Examples

  • list options for recursion depth, output format and much more:
    python linkchecker -h
    
  • save the output to a text file (in Bash, &> redirects both stdout and stderr)
    python linkchecker --check-extern http://localhost:4000 &> check.log
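
If your shell does not support &>, the same effect can be sketched from Python with the standard subprocess module. The helper name run_and_log is hypothetical, and the sketch assumes Python 3.5 or later for the wrapper script itself (LinkChecker would still be run by whatever "python" is on your PATH).

```python
import subprocess
import sys


def run_and_log(command, log_path):
    """Run command, writing stdout and stderr to log_path.

    Returns the exit code of the command.
    """
    with open(log_path, "w") as log:
        # stderr=STDOUT merges both streams, like "&>" in Bash.
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode
```

For example, run_and_log(["python", "linkchecker", "--check-extern", "http://localhost:4000"], "check.log") from inside the linkchecker directory should write the same check.log as the shell command above.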
    
