I'd like to download a lot of web pages (these in particular consist of lines of text with occasional images) as PDFs, but there are too many to do manually. The URLs are easy to iterate over, as they have the form "https://www.(site).com/(stuff)/(number)", where (site) and (stuff) are static while the number changes. Is there a way to download all the pages from number n to m, using Chrome's standard "Print as PDF" or any other method? I looked around on the internet, but I didn't find much that could help. I can code a bit in Python, C, CSS and HTML, but if I need another language I'm ready to learn it. P.S.: I'm sorry if the post is a bit dry, but it's my first and I'm not sure what to write. Thanks in advance!
This answer is based on the URL pattern you specified:
https://www.(site).com/(stuff)/(number)
where (site) and (stuff) are fixed, so only the number changes.
So it is as simple as 1, 2, 3: create a loop in your shell and call your browser on each iteration.
I am using Windows, so my Chrome (%chrome% in the command below) is an alias for MS Edge, but both are built on the same code base. I have left the page header in the output. The switch that turns it off differs between browsers, so you would need to check your browser's command-line options. (Search this site for https://stackoverflow.com/search?q=headless+no-header+print-to-pdf )
for /l %i in (1,1,3) do @%chrome% --headless --print-to-pdf="%cd%\%i.pdf" https://www.example.com/stuff/%i
In the command above, (n,1,m) is the range (start, step, end) for the integer counter %i, whilst %cd% is the current working directory. The save location should be fully qualified or you may get blank output, and on Windows, if the path contains spaces, it should be written with quotes: ="%cd%\%i.pdf".
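Since you mentioned Python, the same loop could be sketched there as well. This is only a minimal sketch under a few assumptions: the CHROME path is a guess at a typical install location and will need adjusting, URL_TEMPLATE is the placeholder pattern from the question, and the helper names (page_urls, build_command, download_range) are made up for illustration. It uses only the --headless and --print-to-pdf switches shown in the command above.

```python
import subprocess
from pathlib import Path

# Assumption: adjust to wherever Chrome or Edge is installed on your machine.
CHROME = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
# Placeholder pattern from the question; {} is replaced by the page number.
URL_TEMPLATE = "https://www.example.com/stuff/{}"


def page_urls(template, first, last):
    """Return (number, url) pairs for first..last inclusive."""
    return [(i, template.format(i)) for i in range(first, last + 1)]


def build_command(chrome, url, pdf_path):
    """Build the headless print-to-pdf call as an argument list."""
    # resolve() makes the save location fully qualified, as advised above.
    return [chrome, "--headless", f"--print-to-pdf={Path(pdf_path).resolve()}", url]


def download_range(first, last, out_dir="."):
    """Print every page in first..last to <number>.pdf inside out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for number, url in page_urls(URL_TEMPLATE, first, last):
        cmd = build_command(CHROME, url, out / f"{number}.pdf")
        # Passing a list means paths with spaces need no manual quoting.
        subprocess.run(cmd, check=False)


if __name__ == "__main__":
    download_range(1, 3)
```

Calling download_range(n, m) with your own range should then write n.pdf through m.pdf into the chosen folder, provided the browser path is correct.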