Hi, I have the following code to skip a particular URL if it is taking too long to read.
import time
from urllib.request import urlopen

timeout = 30  # seconds
for i in range(len(urlz)):  # urlz is my list of URLs
    timeout_start = time.time()
    webpage = urlopen(urlz[i]).read()
    if time.time() > timeout_start + timeout:
        continue
My question is: won't the program execute the line "webpage = urlopen(urlz[i]).read()" before moving down to check the if condition? In that case I think it won't detect that the page is taking too long (more than 30 seconds) to read. I basically want to skip this URL and move on to the next one if the program is stuck for 30 seconds (i.e. we have run into a problem reading this particular URL).
Yes, the if check only runs after read() has returned, so it cannot interrupt a slow read. The urlopen() function has a built-in timeout parameter:
urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)
So in your code:
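import socket
from urllib.error import URLError
from urllib.request import urlopen

timeout = 30  # seconds
for url in urlz:
    try:
        webpage = urlopen(url, timeout=timeout).read()
    except (URLError, socket.timeout):
        continue  # timed out or failed: skip this URL and move on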
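For reference, here is a slightly fuller sketch of the same idea wrapped in a function. The name fetch_all and its opener argument are my own inventions for illustration (opener only exists so the function can be tried without a network); they are not part of urllib. Note that, as far as I know, the timeout applies to each blocking socket operation (the connect, each read) rather than to the total time spent on one URL, so a server that keeps trickling data could still take longer than 30 seconds overall.

```python
import socket
from urllib.error import URLError
from urllib.request import urlopen


def fetch_all(urls, timeout=30, opener=urlopen):
    """Return {url: body} for the URLs that could be read; skip the rest.

    fetch_all and opener are hypothetical names used for this sketch.
    """
    pages = {}
    for url in urls:
        try:
            # The timeout bounds each blocking socket operation,
            # not the total time spent on this URL.
            with opener(url, timeout=timeout) as response:
                pages[url] = response.read()
        except (URLError, socket.timeout) as exc:
            # A connect timeout usually arrives wrapped in URLError,
            # a read timeout as socket.timeout; either way, skip the URL.
            print(f"skipping {url}: {exc}")
            continue
    return pages
```

You would call it as pages = fetch_all(urlz, timeout=30) and then work with whichever URLs made it into the returned dict.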