python csv http-headers urllib2 server-response

Check the server response code, then export to csv


I'm using the code below to check server response codes. Instead of entering the URLs manually, I'd like Python to read them from a CSV file (data.csv) and then export the results to a new CSV file (new_data.csv). Does anyone know how to write this?

Thanks for your time!

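import urllib2

for url in ["http://stackoverflow.com/", "http://stackoverflow.com/questions/"]:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        # urlopen raises HTTPError for error responses such as 404;
        # the status code is available on the exception itself
        print e.getcode()

# Prints the status code for each URL, e.g. 200 or 404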

UPDATE:

import csv

out=open("urls.csv","rb")
data=csv.reader(out)
data=[row for row in data]
out.close()

print data

import urllib2
for url in ["http://stackoverflow.com/", "http://stackoverflow.com/questions/"]:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()

OUTPUT:

[['link'], ['link'], ['link'], ['link'], ['link'], ['link']]

200

200

UPDATE:

import csv

with open("urls.csv", 'r') as csvfile:
    urls = [row[0] for row in csv.reader(csvfile)]

import urllib2
for url in urls:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()
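
The loop above still only prints each code. The remaining step from the question, exporting the results to new_data.csv, could be sketched roughly as follows. This is a sketch under assumptions, not a definitive implementation: it is written for Python 3 (where urllib2 was split into urllib.request and urllib.error), it assumes one URL in the first column of each row, and the names fetch_status, export_statuses and the get_status parameter are hypothetical helpers introduced only for illustration.

```python
import csv
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code for url, or the error code on HTTPError."""
    try:
        with urllib.request.urlopen(url) as connection:
            return connection.getcode()
    except urllib.error.HTTPError as e:
        # Error responses such as 404 arrive as exceptions carrying the code.
        return e.code


def export_statuses(in_path, out_path, get_status=fetch_status):
    """Read one URL per row from in_path; write (url, status) rows to out_path."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if not row:  # csv.reader yields [] for blank lines; skip them
                continue
            url = row[0]  # assumption: the URL is the first column
            writer.writerow([url, get_status(url)])
```

Calling export_statuses("urls.csv", "new_data.csv") would then write one "url,code" line per input row. Passing the status lookup as a parameter is just a convenience so the file handling can be tried without network access. Like the original loop, this only catches HTTPError, so a URL that cannot be reached at all would still raise.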

Solution

  • I think the clue is in the output of print data: [['link'], ['link'], ['link'], ['link'], ['link'], ['link']]. It shows that the line data=[row for row in data] gives you a list of lists, one single-element list per CSV row, which is why a plain for url in data: loop does not work. Each URL is the first element of its row.

    BTW, you will find the whole thing less confusing if you put some thought into naming: for example, you read input from a file handle called 'out', and you assign data = something based on data...
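
To make the list-of-lists point concrete, here is a small sketch. It assumes Python 3 and uses an in-memory io.StringIO as a stand-in for urls.csv, filled with the two URLs from the question; the variable names are only suggestions.

```python
import csv
import io

# Stand-in for urls.csv: one URL per row (sample data for illustration only).
url_file = io.StringIO(
    "http://stackoverflow.com/\n"
    "http://stackoverflow.com/questions/\n"
)

rows = list(csv.reader(url_file))        # every row is itself a list of fields
urls = [row[0] for row in rows if row]   # first field of each non-empty row

print(rows)
print(urls)
```

Here rows is the nested structure seen in the question's output, while urls is the flat list of strings that the for url in ... loop actually needs.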