using wget for website broken link checking

lynx is broken for link checking on https websites. Use wget instead to check for broken internal links on your website.

wget --spider -r -nd -nv -w1 -o mysite.log https://www.yourwebsite.com

That command crawls the site recursively and writes its messages to mysite.log, which tells you which internal website links are broken.

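If every link resolves, this line appears near the bottom of mysite.log:

Found no broken links.

Otherwise wget reports how many broken links it found, followed by their URLs.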
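Rather than scrolling to the bottom of the log by hand, the summary can be pulled out with grep and awk. This is a minimal sketch: the function names spider_summary and spider_broken_urls are made up for this post, and it assumes the log layout that recent GNU wget versions print (a line starting with "Found", then one URL per line, then a "FINISHED" line).

```shell
#!/bin/sh
# Sketch: summarize a wget --spider log.
# Assumption: the log has a summary line starting with "Found ", followed
# (when links are broken) by one URL per line and then a "FINISHED" line.

# Print the summary line, e.g. "Found no broken links."
spider_summary() {
    grep -E '^Found (no|[0-9]+) broken links?' "$1"
}

# Print the URLs listed between the summary line and the FINISHED line.
spider_broken_urls() {
    awk '
        /^Found [0-9]+ broken links?/ { in_list = 1; next }
        /^FINISHED/                   { in_list = 0 }
        in_list && /^https?:\/\//     { print }
    ' "$1"
}
```

After a crawl, spider_summary mysite.log prints the one-line result and spider_broken_urls mysite.log prints only the broken URLs, if any.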

Assuming your site's source is .html or .md files on your computer, you can search them for the broken URL with Notepad++ or findtext:

findtext https://brokenlink.com "*.html" 
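On a system with grep, a similar search could look like the sketch below. The helper name find_link_in_files is hypothetical, and the --include option is assumed to be available (GNU grep and BSD grep both provide it).

```shell
#!/bin/sh
# Hypothetical helper: list the .html and .md files under a directory
# that contain a given URL.
find_link_in_files() {
    url="$1"
    dir="${2:-.}"   # default to the current directory
    # -r recurse, -l print only file names, -F match the URL as a fixed string
    grep -rlF --include='*.html' --include='*.md' -- "$url" "$dir"
}
```

For example, find_link_in_files https://brokenlink.com prints the name of every matching file below the current directory.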

wget spider options

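--spider
check that pages exist; don’t store the HTML files retrieved
-r
recursive. Follow the links found on each page
-nd
no directories. Don’t create a directory hierarchy for retrieved files
-nv
non-verbose. Minimal messages output
-w1
wait 1 second between requests (so your own server’s scraping detection doesn’t ban you)
-o mysite.log
write wget’s messages to mysite.log instead of the terminal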
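
The short options are easier to read when spelled out. As a sketch, the wget command from the top of this post can be wrapped in a small function using the long forms of the same options. The function name check_site_links and its default log name are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical wrapper: the same wget command, spelled with long options.
check_site_links() {
    site_url="$1"
    log_file="${2:-mysite.log}"   # default log name used in this post
    wget --spider --recursive --no-directories --no-verbose \
         --wait=1 --output-file="$log_file" "$site_url"
}
```

Calling check_site_links https://www.yourwebsite.com runs the crawl and leaves the result in mysite.log.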
