I know the Content-Type can be obtained with urllib2:
import urllib2
response = urllib2.urlopen(url)
# A hyphen is not valid in a Python identifier, so use an underscore.
content_type = response.info().getheader('Content-Type')
Now I need to execute JavaScript on the page, so I chose Selenium with PhantomJS to fetch it:
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get(url)
source = driver.page_source
How can I get the Content-Type when fetching the page this way, without downloading it twice? I know I could save response.read() as an HTML file and have the driver render that local file instead of downloading it again, but that is too slow. Any suggestions?
Selenium does not expose the response headers, but you can send a HEAD request with requests, which fetches only the headers and not the page body:
import requests
print(requests.head(url).headers["Content-Type"])
You can also use httplib2, urllib2, etc.; there are numerous answers here showing how to send a HEAD request with various libraries.
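If you would rather not add a dependency, here is a minimal sketch of the same HEAD request using only the standard library. It assumes Python 3, where urllib2 became urllib.request; the function name head_content_type and the timeout default are made up for illustration.

```python
import urllib.request


def head_content_type(url, timeout=10):
    """Return the Content-Type header for url, or None if it is absent.

    Hypothetical helper: sends a HEAD request so that only the
    headers are transferred, not the page body.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Header lookup is case-insensitive.
        return response.headers.get("Content-Type")
```

Call head_content_type(url) before or after driver.get(url). This is still a second request to the server, but it is a cheap one because no body is downloaded. Some servers answer HEAD differently from GET, so treat the result as a hint rather than a guarantee.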