I know I can check the last modification time with
wget -S http://www.staticpage.com
as long as the page is static. But when I do the same for a dynamic page, I always get the current time.
So, what is the least intrusive way to ask a site whether a page has changed since some arbitrary time, or when it was last updated? I could obviously download the whole page and compare it with the copy I have saved on file, but I want to reduce overhead.
A dynamic page is generated anew on every page load, which is why you always get the current time. To find out when its content actually changed, you need to look at the page itself or at an RSS feed for the page. Your best bet is generally to download the page and parse out the date of the latest post.
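If parsing a date out of the page is not practical, a cruder variant of the "download and compare" idea is to store only a checksum of the last download instead of the whole page. The sketch below is one possible way to do that, not a definitive recipe: the function name page_changed and the state file path are made up for illustration, and it assumes the page body contains no per-request noise (timestamps, session tokens), which would make every download look changed.

```shell
# Compare the page body read from stdin with the checksum stored in file $1.
# Returns 0 and updates the file if the content differs (or no checksum is
# stored yet); returns 1 if the content is unchanged.
# page_changed and the state file are hypothetical names for this sketch.
page_changed() {
    state_file=$1
    new_sum=$(cksum)                           # CRC and byte count of stdin
    old_sum=$(cat "$state_file" 2>/dev/null)   # empty on the first run
    if [ "$new_sum" = "$old_sum" ]; then
        return 1
    fi
    printf '%s\n' "$new_sum" > "$state_file"
    return 0
}

# Usage (placeholder URL and path):
#   curl -s http://someurl.com | page_changed "$HOME/.someurl.cksum" && echo "page changed"
```

This still downloads the full page each time; it only saves you from keeping and diffing a full copy on disk.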
UPDATE: If you want to limit the amount of data you read when downloading a page, you can use the following:
curl -s http://someurl.com | head -c 512
head exits after reading 512 bytes, which closes the pipe; curl then fails on its next write and aborts the request. It is up to the server to notice the closed connection and stop transmitting, and it may already have sent more than 512 bytes by then, but at least you aren't downloading the whole page.
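Another thing worth trying before downloading anything is a conditional request: HTTP lets the client send an If-Modified-Since header, and a server that supports it answers 304 Not Modified with no body. The following is a minimal sketch under the assumption that the site honors that header; many dynamic pages ignore it and always answer 200, in which case you are back to downloading and parsing. The function names status_since and interpret_status are invented for this example.

```shell
# Print the HTTP status code the server returns for URL $1 when asked
# whether it changed since the HTTP date in $2.
# status_since is a hypothetical helper name.
status_since() {
    curl -s -o /dev/null -w '%{http_code}' \
         -H "If-Modified-Since: $2" "$1"
}

# Translate a status code into a short verdict.
interpret_status() {
    case $1 in
        304) echo "unchanged" ;;
        200) echo "changed, or server ignores If-Modified-Since" ;;
        *)   echo "no answer (status $1)" ;;
    esac
}

# Usage (placeholder URL and date):
#   interpret_status "$(status_since http://someurl.com 'Tue, 01 Jan 2019 00:00:00 GMT')"
```

Note that a 200 answer is ambiguous: it can mean the page really changed or that the server simply does not evaluate the header, so only a 304 tells you something definite.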