python ncbi

Download all NCBI PubMed IDs based on a tag


I am able to read in a PubMed ID of a paper, and return a set of records about that paper using this code:

from Bio import Entrez
from Bio import Medline
Entrez.email = "Your.Name.Here@example.org"
pubmed_rec = Entrez.efetch(db='pubmed',id=19053980,retmode='text',rettype='medline')
records = Medline.parse(pubmed_rec)
for rec in records:
    print(rec)

The output is:

{'PMID': '19053980', 'OWN': 'NLM', 'STAT': 'MEDLINE', 'DCOM': '20090706', 'LR': '20091015', 'IS': '1365-2036 (Electronic) 0269-2813 (Linking)', 'VI': '29', 'IP': '5', 'DP': '2009 Mar 1', 'TI': 'Clinical trial: the effects of a trans-galactooligosaccharide prebiotic on faecal microbiota and symptoms in irritable bowel syndrome.', 'PG': '508-18', 'LID': '10.1111/j.1365-2036.2008.03911.x [doi]', 'AB': 'BACKGROUND: Gut microflora-mucosal interactions may be involved in the pathogenesis of irritable bowel syndrome (IBS). AIM: To investigate the efficacy of a novel prebiotic trans-galactooligosaccharide in changing the colonic microflora and improve the symptoms in IBS sufferers. METHODS: In all, 44 patients with Rome II positive IBS completed a 12-week single centre parallel crossover controlled clinical trial. Patients were randomized to receive either 3.5 g/d prebiotic, 7 g/d prebiotic or 7 g/d placebo. IBS symptoms were monitored weekly and scored according to a 7-point Likert scale. Changes in faecal microflora, stool frequency and form (Bristol stool scale) subjective global assessment (SGA), anxiety and depression and QOL scores were also monitored. RESULTS: The prebiotic significantly enhanced faecal bifidobacteria (3.5 g/d P < 0.005; 7 g/d P < 0.001). Placebo was without effect on the clinical parameters monitored, while the prebiotic at 3.5 g/d significantly changed stool consistency (P < 0.05), improved flatulence (P < 0.05) bloating (P < 0.05), composite score of symptoms (P < 0.05) and SGA (P < 0.05). The prebiotic at 7 g/d significantly improved SGA (P < 0.05) and anxiety scores (P < 0.05). CONCLUSION: The galactooligosaccharide acted as a prebiotic in specifically stimulating gut bifidobacteria in IBS patients and is effective in alleviating symptoms. 
These findings suggest that the prebiotic has potential as a therapeutic agent in IBS.', 'FAU': ['Silk, D B A', 'Davis, A', 'Vulevic, J', 'Tzortzis, G', 'Gibson, G R'], 'AU': ['Silk DB', 'Davis A', 'Vulevic J', 'Tzortzis G', 'Gibson GR'], 'AD': ['Department of Academic Surgery, Imperial College Healthcare NHS Trust, London, UK. David.Silk@nwlh.nhs.uk'], 'LA': ['eng'], 'PT': ['Journal Article', 'Randomized Controlled Trial', "Research Support, Non-U.S. Gov't"], 'DEP': '20081202', 'PL': 'England', 'TA': 'Aliment Pharmacol Ther', 'JT': 'Alimentary pharmacology & therapeutics', 'JID': '8707234', 'RN': ['0 (Oligosaccharides)'], 'SB': 'IM', 'CIN': ['Expert Rev Gastroenterol Hepatol. 2009 Oct;3(5):487-92. PMID: 19817670'], 'MH': ['Adult', 'Aged', 'Bifidobacterium/*drug effects/growth & development', 'Colony Count, Microbial', 'Feces/*microbiology', 'Female', 'Humans', 'Irritable Bowel Syndrome/*diet therapy', 'Male', 'Middle Aged', 'Oligosaccharides/*administration & dosage/metabolism', 'Probiotics/*therapeutic use', 'Quality of Life', 'Statistics as Topic', 'Treatment Outcome'], 'EDAT': '2008/12/05 09:00', 'MHDA': '2009/07/07 09:00', 'CRDT': ['2008/12/05 09:00'], 'PHST': ['2008/12/05 09:00 [pubmed]', '2009/07/07 09:00 [medline]', '2008/12/05 09:00 [entrez]'], 'AID': ['APT3911 [pii]', '10.1111/j.1365-2036.2008.03911.x [doi]'], 'PST': 'ppublish', 'SO': 'Aliment Pharmacol Ther. 2009 Mar 1;29(5):508-18. doi: 10.1111/j.1365-2036.2008.03911.x. Epub 2008 Dec 2.'}

If you look at the same paper on the NCBI website, you can see it is tagged as a Randomized Controlled Trial, which I think corresponds to this part of the output:

'PT': ['Journal Article', 'Randomized Controlled Trial', "Research Support, Non-U.S. Gov't"]

I now want to get a list of ALL randomized controlled trials in PubMed, without providing individual PubMed IDs.

In other words, I want to give the query something like 'PT = Randomized Controlled Trial' and get back all of the matching PubMed IDs.

I was trying to change id=19053980 to something like "pt = Randomized Controlled Trial", but that's clearly not the key.

How do I return a list of all PMIDs with the Randomized Controlled Trial tag, and how would I find this answer myself? I found a list of all the codes here, but it is not clear to me from that list how to use them.


Solution

  • You can search with the MeSH term "clinical trial", following a recipe in Biopython's tutorial. The code is below.

    from Bio import Entrez
    from Bio import Medline
    
    term = "clinical trial[MeSH Terms]"
    
    Entrez.email = "A.N.Other@example.com"  # Always tell NCBI who you are
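    # retmax=5 returns only the first five matching IDs; raise it to get more
    handle = Entrez.esearch(db="pubmed", term=term, retmax=5)
    esearch_record = Entrez.read(handle)
    handle.close()
    
    # 'Count' is the total number of matches; 'IdList' holds the returned PMIDs
    print(f"Records found: {esearch_record['Count']}")
    all_pmids = esearch_record["IdList"]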
    
    # Fetch the matching records in MEDLINE format and print each PMID
    handle = Entrez.efetch(
        db="pubmed",
        id=all_pmids,
        rettype="medline",
        retmode="text")
    records = Medline.parse(handle)
    
    for record in records:
        print(record["PMID"])
    handle.close()
    

    The output is:

    Records found: 349991
    33288887
    33275561
    33273736
    33149899
    33264530
    

    Here is the link to the same query on PubMed's website: https://pubmed.ncbi.nlm.nih.gov/?term=clinical+trial[MeSH+Terms]
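
The question asks specifically about the Randomized Controlled Trial publication type, and retmax=5 in the code above returns only five IDs. Here is a minimal sketch of how one might search by publication type and page through the results. The helper names (build_term, collect_pmids, entrez_search) and the page_size and limit parameters are made up for illustration. [Publication Type] (short form [pt]) is PubMed's search tag for the PT field. As far as I know, esearch serves only the first 10,000 PubMed matches, so the default limit is based on that assumption.

```python
def build_term(value, field):
    """Combine a search value with a PubMed field tag, e.g. 'x[Publication Type]'."""
    return f"{value}[{field}]"


def collect_pmids(search, term, page_size=1000, limit=10000):
    """Page through search results and return the collected PMIDs.

    `search(term, retstart, retmax)` must return a mapping with 'Count'
    and 'IdList', which is the shape Entrez.read() gives for esearch.
    `limit` caps the total; 10000 is an assumed esearch ceiling for PubMed.
    """
    pmids = []
    start = 0
    while start < limit:
        page = search(term, start, min(page_size, limit - start))
        ids = page["IdList"]
        pmids.extend(ids)
        start += len(ids)
        # Stop when a page comes back empty or every match has been read
        if not ids or start >= int(page["Count"]):
            break
    return pmids


def entrez_search(term, retstart, retmax):
    """One esearch call against PubMed (needs Biopython and network access)."""
    from Bio import Entrez  # imported here so the helpers above work offline

    Entrez.email = "A.N.Other@example.com"  # Always tell NCBI who you are
    handle = Entrez.esearch(
        db="pubmed", term=term, retstart=retstart, retmax=retmax
    )
    record = Entrez.read(handle)
    handle.close()
    return record
```

Calling collect_pmids(entrez_search, build_term("Randomized Controlled Trial", "Publication Type")) should then return up to `limit` PMIDs as strings. For result sets larger than the cap, one option is to split the query into publication-date ranges and run each range separately.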

    To find this answer, I did two things. First, I used PubMed's website to search for clinical trials and found the appropriate MeSH term. Second, I went to Biopython's tutorial and searched the page (Ctrl+F) for "pubmed".
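
Another way to discover which fields can be searched is to ask NCBI directly with einfo, which the Biopython tutorial also covers. The sketch below assumes each entry in the returned field list has "Name", "FullName" and "Description" keys. The function names find_fields and pubmed_field_list are hypothetical, chosen only for this example.

```python
def find_fields(field_list, keyword):
    """Return (Name, FullName) pairs whose full name or description mentions keyword."""
    keyword = keyword.lower()
    return [
        (field["Name"], field["FullName"])
        for field in field_list
        if keyword in field["FullName"].lower()
        or keyword in field["Description"].lower()
    ]


def pubmed_field_list():
    """Fetch PubMed's searchable fields via einfo (needs Biopython and network access)."""
    from Bio import Entrez  # imported here so find_fields works offline

    Entrez.email = "A.N.Other@example.com"  # Always tell NCBI who you are
    handle = Entrez.einfo(db="pubmed")
    record = Entrez.read(handle)
    handle.close()
    return record["DbInfo"]["FieldList"]
```

For example, find_fields(pubmed_field_list(), "publication type") should list the field that matches the PT key in a Medline record. You can then try its name in square brackets after a search term, as in the query above.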