Some sites return 404 or 403 to bots. Add more headers so that requests don't look like they come from a bot, and make the headers configurable.
HEADERS = {
  "User-Agent" => "Mozilla/5.0 (Windows NT 6.1) " \
                  "AppleWebKit/537.36 (KHTML, like Gecko) " \
                  "Chrome/41.0.2228.0 Safari/537.36",
  "Accept" => "text/html," \
              "application/xhtml+xml," \
              "application/xml;" \
              "q=0.9,*/*;q=0.8",
  "Accept-Language" => "en-US,en;q=0.5",
  "DNT" => "1",
  "Upgrade-Insecure-Requests" => "1",
  "Pragma" => "no-cache",
  "Cache-Control" => "no-cache"
}.freeze
A good set seems to exist in https://gitlab.com/ZakCodes/jekyll-link-checker/-/blob/master/lib/link-checker.rb#L11
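
For the "configurable" part, one possible shape is to merge user-supplied overrides on top of the defaults. This is only a rough sketch: `DEFAULT_HEADERS` is a shortened stand-in for the `HEADERS` constant above, and `build_headers` is a hypothetical helper name, not something that exists in the project or in the linked plugin.

```ruby
# Hypothetical, shortened stand-in for the full HEADERS constant.
DEFAULT_HEADERS = {
  "User-Agent" => "Mozilla/5.0 (Windows NT 6.1) " \
                  "AppleWebKit/537.36 (KHTML, like Gecko) " \
                  "Chrome/41.0.2228.0 Safari/537.36",
  "Accept-Language" => "en-US,en;q=0.5",
  "DNT" => "1"
}.freeze

# Hypothetical helper: merge user overrides on top of the defaults.
# Keys are converted to strings so symbol keys from a config file work too.
# Header names are matched case-sensitively, so "dnt" does not replace "DNT".
# A nil value removes that header from the result.
def build_headers(overrides = {}, defaults = DEFAULT_HEADERS)
  merged = defaults.merge(overrides.transform_keys(&:to_s))
  merged.reject { |_name, value| value.nil? }
end

# Example: replace the User-Agent and drop DNT, keep everything else.
headers = build_headers("User-Agent" => "my-link-checker/1.0", "DNT" => nil)
```

The defaults stay frozen and untouched; each call returns a new hash, so per-site overrides would not leak into other requests.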