Idiomdrottning’s homepage

Put w3m in your pipe and smoke it

Pretty much the only reason I have w3m installed (but it’s a banger of a reason) is so I can pipe web output to stuff. 99% of the time I use wget (or curl, when I can remember curl’s flags) or my own sxml-url, but sometimes I want to pipe the rendered page rather than the page’s source.

w3m "http://www.boksidan.net/bok.asp?bokid=20"|grep bräma|tail -1

I rely on this method in some scripts I use daily.

Unfortunately, in the modern-day SPA hellscape it’s no longer easy to do things from the command line.

On my “Someday/Maybe/Probably Never” pile I have “make a CLI app to grab the source code for a web page after JavaScript has loaded and pipe that source code to stdout”. It’s OK if the app is slow and resource hoggy and uses the headless mode of one of the various puppeting browser extensions or whatever. And it won’t solve pages that need some sorta interaction. But there’d still be plenty of web pages where something like this would re-enable old-fashioned scraping.

I could then pipe that source code to w3m -T text/html.
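
Here’s a rough sketch of what that could look like, not a finished tool. It assumes a Chromium-family browser is installed and leans on its headless --dump-dom flag, which prints the DOM after the page has loaded. The function name rendered_source and the BROWSER_BIN override variable are made up for this example, and the binary names it tries are guesses that vary between distros.

```shell
# Hypothetical helper: print a page's DOM after JavaScript has run.
# Assumption: a Chromium-family browser that supports --headless --dump-dom.
# BROWSER_BIN is an invented override for when the binary has another name.
rendered_source() {
    url=$1
    for browser in "${BROWSER_BIN:-}" chromium chromium-browser google-chrome; do
        # Skip the empty slot left when BROWSER_BIN is unset.
        [ -n "$browser" ] || continue
        if command -v "$browser" >/dev/null 2>&1; then
            # Browser chatter goes to stderr; keep stdout clean for the pipe.
            "$browser" --headless --dump-dom "$url" 2>/dev/null
            return
        fi
    done
    echo "rendered_source: no Chromium-family browser found" >&2
    return 127
}
```

With that in place, the old one-liner style would come back as something like rendered_source "https://example.com/" | w3m -T text/html -dump | grep whatever. Pages that keep fetching data well after the initial load might still come out incomplete, and anything that needs clicks or logins is out of scope, same as above.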