How to Extract All URLs From a Text File
While looking for a simple way to extract all URLs from an HTML file, I came across this gem.
grep -o -E "https?://[][[:alnum:]._~:/?#@!\$&'()*+,;=%-]+"
According to RFC 3986, only certain characters are valid in a URL, and the bracket expression in the pattern is meant to match exactly those. I haven’t observed any obvious issues with it thus far.
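To make the one-liner easier to reuse, here is a minimal sketch that wraps it in a shell function. The name `extract_urls` and the example file names are hypothetical, not part of the original command. The bracket expression here includes `=`, which RFC 3986 lists as a sub-delimiter and which query strings depend on. The `-o` and `-h` options are available in GNU and BSD grep but are not required by POSIX, so this assumes one of those implementations.

```shell
# extract_urls (hypothetical helper): print each distinct http(s) URL
# found in the files given as arguments, or on stdin if none are given.
extract_urls() {
  # -o prints only the matching part of each line,
  # -h suppresses file name prefixes when several files are searched,
  # -E enables extended regular expressions.
  grep -o -h -E "https?://[][[:alnum:]._~:/?#@!\$&'()*+,;=%-]+" "$@" | sort -u
}

# Example usage (file names are placeholders):
#   extract_urls page.html notes.txt
```

Because `.` and `,` are valid URL characters, a URL at the end of a sentence will be extracted with its trailing punctuation attached, so the output may need a quick manual check.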