I want to scrape multiple Google Scholar user profiles - publications, journals, citations etc. I have already written the Python code for scraping a user profile given its URL. Now, suppose I have 100 names and the corresponding URLs in an Excel file like this.
name link
Autor https://scholar.google.com/citations?user=cp-8uaAAAAAJ&hl=en
Dorn https://scholar.google.com/citations?user=w3Dri00AAAAJ&hl=en
Hanson https://scholar.google.com/citations?user=nMtHiQsAAAAJ&hl=en
Borjas https://scholar.google.com/citations?user=Patm-BEAAAAJ&hl=en
....
My question is: can I read the 'link' column of this file and write a for loop over the URLs, so that I can scrape each of these profiles and append the results to the same file? It seems a bit far-fetched, but I hope there is a way to do so. Thanks in advance!
You can use pandas.read_csv() to read a CSV file into a DataFrame (for an Excel file, use pandas.read_excel() instead). For example:
import pandas as pd

df = pd.read_csv('data.csv')

# Collect every value from the 'link' column into a list
arr = []
for url in df['link']:
    arr.append(url)
print(arr)
This lets you extract only the link column and append each value to a list. If you'd like to learn more, you can refer to the pandas documentation.
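To answer the full question (scrape each profile and append the results to the same file), here is a minimal sketch. It assumes your existing scraper is wrapped in a function, called scrape_profile() below, that takes a URL and returns a dict of results per profile; that function and the column names are placeholders you would adapt to your code. The DataFrame is built inline here for illustration; in practice you would load it with pd.read_excel('profiles.xlsx').

```python
import pandas as pd

def scrape_profile(url):
    # Placeholder for your existing scraper; assumed to return a dict
    # of scraped fields, e.g. {'publications': ..., 'citations': ...}.
    return {"citations": len(url)}  # dummy value for illustration

# In practice: df = pd.read_excel("profiles.xlsx")
df = pd.DataFrame({
    "name": ["Autor", "Dorn"],
    "link": [
        "https://scholar.google.com/citations?user=cp-8uaAAAAAJ&hl=en",
        "https://scholar.google.com/citations?user=w3Dri00AAAAJ&hl=en",
    ],
})

# Loop over the link column and scrape each profile
results = [scrape_profile(url) for url in df["link"]]

# Append the scraped fields as new columns and write the table back
df = df.join(pd.DataFrame(results))
# df.to_excel("profiles.xlsx", index=False)  # requires openpyxl
```

Writing back to the very same Excel file works with DataFrame.to_excel(), though note that Google Scholar rate-limits scrapers, so you may want a delay between requests in the loop.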