Tags: python-3.x, web-crawler, citations, pubmed, rentrez

Is it possible to get the number of times an article has been cited?


I am using Entrez to search for articles on PubMed. Is it possible to use Entrez to also determine the number of citations for each article found by my search parameters? If not, is there an alternative method that I can use? My googling hasn't turned up much so far.

NOTE: by "number of citations" I mean the number of times that the specific article in question has been cited in OTHER articles.

One thing that I have found is https://gist.github.com/mcfrank/c1ec74df1427278cbe53, which may indicate that I can get the citation count from articles that are also in the PubMed DB, but it was unclear (to me) how to use it to determine the number of citations for each article.
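For reference, one way to try counting citations without leaving Entrez is ELink with the `pubmed_pubmed_citedin` link name, which lists PubMed records that cite a given PMID. The following is only a sketch, and not necessarily what the linked gist does: `count_citing_links` and `fetch_citation_count` are hypothetical helper names, and as far as I know NCBI derives these links from articles deposited in PubMed Central, so the result should be read as a lower bound rather than a complete citation total.

```python
def count_citing_links(linksets, linkname="pubmed_pubmed_citedin"):
    """Count linked IDs in a parsed ELink result (the list returned by Entrez.read)."""
    total = 0
    for linkset in linksets:
        # "LinkSetDb" is an empty list when NCBI knows of no citing articles.
        for linksetdb in linkset.get("LinkSetDb", []):
            if linksetdb.get("LinkName") == linkname:
                total += len(linksetdb.get("Link", []))
    return total


def fetch_citation_count(pmid, email):
    """Ask NCBI which PubMed records cite `pmid` (hypothetical helper, needs network access)."""
    # Imported here so that count_citing_links stays usable without Biopython.
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.elink(dbfrom="pubmed", db="pubmed", id=pmid,
                          linkname="pubmed_pubmed_citedin")
    linksets = Entrez.read(handle)
    handle.close()
    return count_citing_links(linksets)
```

Each record parsed by Medline.parse carries its PubMed ID under the "PMID" key, so a per-article loop could call fetch_citation_count(record["PMID"], Entrez.email) and print the result. That is one extra request per article, so NCBI's request rate limits apply to long ID lists.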

The following is the code that I am currently using (I'd like to include a 'print' line of the number of citations):

# Search PubMed

from Bio import Entrez
from Bio import Medline

search_string = r'("Blah Blah")'

Entrez.email = "hello_world@example.com"

# Ask EGQuery how many PubMed records match the search term.
handle = Entrez.egquery(term=search_string)
record = Entrez.read(handle)
handle.close()
count = 0
for row in record["eGQueryResult"]:
    if row["DbName"] == "pubmed":
        print("Number of articles found with requested search parameters:", row["Count"])
        count = int(row["Count"])

# Retrieve the PubMed IDs of the matching articles.
handle = Entrez.esearch(db="pubmed", term=search_string, retmax=count)
record = Entrez.read(handle)
handle.close()
idlist = record["IdList"]

# Download the matching records in MEDLINE format and parse them.
handle = Entrez.efetch(db="pubmed", id=idlist, rettype="medline", retmode="text")
records = list(Medline.parse(handle))
handle.close()

for x, record in enumerate(records, start=1):
    print("(" + str(x) + ")")
    print("Title:", record.get("TI", "?"))
    print("Authors: ", ", ".join(record.get("AU", ["?"])))
    print("Pub Date:", record.get("DP", "?"))
    print("Journal:", record.get("JT", "?"))
    # LID looks like "<doi> [doi]"; strip the suffix only when it is present.
    lid = record.get("LID", "")
    print("DOI:", lid[:-len(" [doi]")] if lid.endswith(" [doi]") else "?")
    #print("number of citations:", get.number_of_citations) #<--what I am requesting help about
    print("\n")

Solution

  • I solved this by writing a script that crawls the actual website where the publication is hosted (using the DOI to find the web address) and then parses the citation count out of the site's XML data. Unfortunately, this method only works for the specific journal I am interested in.

    An alternative is to use Web of Science, if anyone is interested. It reports the total citation count along with much more citation data, such as citations per year. The downside is that Web of Science is not a free service.
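The DOI-to-web-address step mentioned in the solution can be sketched roughly as follows. This assumes the public doi.org resolver; `doi_to_url` and `resolve_doi` are hypothetical names, the DOI in the usage note is a placeholder, and the journal-specific parsing of the citation count is left out because it depends entirely on how that one site lays out its pages.

```python
import urllib.request
from urllib.parse import quote


def doi_to_url(doi):
    """Build the doi.org resolver address for a DOI string."""
    doi = doi.strip()
    # Tolerate a leading "doi:" and the " [doi]" suffix used in MEDLINE LID fields.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if doi.endswith("[doi]"):
        doi = doi[:-len("[doi]")].strip()
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return "https://doi.org/" + quote(doi, safe="/")


def resolve_doi(doi, timeout=10):
    """Follow the resolver's redirects and return the landing page URL (needs network access)."""
    request = urllib.request.Request(
        doi_to_url(doi),
        headers={"User-Agent": "doi-resolver-sketch"},  # some hosts reject the default agent
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()
```

For example, doi_to_url("10.1000/xyz123 [doi]") builds the resolver address for the placeholder DOI 10.1000/xyz123, and resolve_doi would then return whatever publisher page that address redirects to. Some publishers block scripted requests, so the second step may fail even when the DOI is valid.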