I'm using the code below to check server response codes. Instead of entering the URLs by hand, I'd like Python to read them from a CSV (data.csv) and then export the results to a new CSV (new_data.csv). Does anyone know how to write this?
Thanks for your time!
import urllib2

for url in ["http://stackoverflow.com/", "http://stackoverflow.com/questions/"]:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()

# Prints:
# 200 or 404
UPDATE:
import csv
out=open("urls.csv","rb")
data=csv.reader(out)
data=[row for row in data]
out.close()
print data
import urllib2

for url in ["http://stackoverflow.com/", "http://stackoverflow.com/questions/"]:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()
OUTPUT:
[['link'], ['link'], ['link'], ['link'], ['link'], ['link']]
200
200
UPDATE:
import csv

with open("urls.csv", 'r') as csvfile:
    urls = [row[0] for row in csv.reader(csvfile)]

import urllib2

for url in urls:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()
I think you have your clue in your print data output: [['link'], ['link'], ['link'], ['link'], ['link'], ['link']]. This tells me the problem is probably the line data=[row for row in data]: it gives you a list of lists, which is why you cannot simply write for url in data:.
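To see the difference concretely, here is a small self-contained sketch (Python 3; the 'link1'/'link2'/'link3' values are placeholders standing in for your URLs):

```python
import csv
import io

# csv.reader yields one list per row, even when each row has a single column
data = [row for row in csv.reader(io.StringIO("link1\nlink2\nlink3\n"))]
print(data)   # [['link1'], ['link2'], ['link3']] -- a list of lists

# Take the first column of each row to get plain strings you can loop over
urls = [row[0] for row in data]
print(urls)   # ['link1', 'link2', 'link3']
```

Looping over data hands urllib a one-element list instead of a URL string, which is why the flattening step matters.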
BTW, you will find the whole thing less confusing if you put some thought into naming: for example, you read your input from a file handle called 'out', and assign data = something based on data...
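Putting the pieces together, here is one way to sketch the full read-check-write flow the question asks for. Since urllib2 is Python 2 only, this sketch targets Python 3, where the same functions live in urllib.request; the helper names (read_urls, check_urls, write_results) and the injectable fetch parameter are my own, not from the question:

```python
import csv
from urllib.request import urlopen   # Python 3 home of the old urllib2.urlopen
from urllib.error import HTTPError

def read_urls(path):
    # One URL per row; take the first column so you get strings, not lists
    with open(path) as f:
        return [row[0] for row in csv.reader(f) if row]

def check_urls(urls, fetch=urlopen):
    # fetch is injectable so the loop can be exercised without network access
    results = []
    for url in urls:
        try:
            connection = fetch(url)
            code = connection.getcode()
            connection.close()
        except HTTPError as e:
            code = e.code  # HTTPError carries the status code itself
        results.append((url, code))
    return results

def write_results(path, results):
    # newline="" stops the csv module doubling line endings on Windows
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["url", "status"])
        writer.writerows(results)

# Usage, matching the filenames in the question:
# write_results("new_data.csv", check_urls(read_urls("data.csv")))
```

Note this only catches HTTPError, like the original loop; urlopen can also raise URLError for unreachable hosts, which you may want to handle separately.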