python python-3.x beautifulsoup bs4dash

Beautiful Soup 4: fetching multiple elements returns None


Fetching a single element works fine, as in the first snippet below. But when I try to find all elements with the same markup (for example, {'class': 'doctor-name'}) on the live page, the output is None.

Single tag output

from bs4 import BeautifulSoup
s = """
<a class="doctor-name" itemprop="name" href="/doctors/gastroenterologists/dr-isaac-raijman-md-1689679557">Dr. Isaac Raijman, MD</a>
"""
soup = BeautifulSoup(s, 'html.parser')

print(soup.find('a', {'class': 'doctor-name'}).text)
print(soup.find('a', {'itemprop': 'name'}).text)

Output: Dr. Isaac Raijman, MD (printed twice, once per lookup)

Finding all elements with the same markup on the live page prints None:

import requests
from bs4 import BeautifulSoup

url = "https://soandso.org/doctors/gastroenterologists"
page = requests.get(url)
print(page.status_code)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())
lists = soup.find_all('section', attrs={'class': 'search-page find-a-doctor'})
for section in lists:
    # find() returns None when the section contains no matching <a> tag
    doctor = section.find('a', attrs={'class': 'doctor-name'})  # .text
    info = [doctor]
print(info)

Output: None

Please help me solve this issue. An answer as code with explanatory # comments would be fine too.


Solution

  • The doctor list is built in the browser by JavaScript, so it is not part of the HTML that requests.get() returns. An easier approach is to request the data from the site's JSON API, as follows:

    import requests
    
    headers = {'Authorization' : 'eyJhbGciOiJodHRwOi8vd3d3LnczLm9yZy8yMDAxLzA0L3htbGRzaWctbW9yZSNobWFjLXNoYTI1NiIsInR5cCI6IkpXVCJ9.eyJodHRwOi8vc2NoZW1hcy54bWxzb2FwLm9yZy93cy8yMDA1LzA1L2lkZW50aXR5L2NsYWltcy9uYW1lIjoiYWRtaW4iLCJleHAiOjIxMjcwNDQ1MTcsImlzcyI6Imh0dHBzOi8vZGV2ZWxvcGVyLmhlYWx0aHBvc3QuY29tIiwiYXVkIjoiaHR0cHM6Ly9kZXZlbG9wZXIuaGVhbHRocG9zdC5jb20ifQ.zNvR3WpI17CCMC7rIrHQCrnJg_6qGM21BvTP_ed_Hj8'}    
    # Search payload: first 10 results for the Gastroenterology specialty
    json_post = {
        "query": "", "start": 0, "rows": 10,
        "selectedFilters": {
            "availability": [], "clinicalInterest": [], "distance": [20], "gender": ["Both"],
            "hasOnlineScheduling": False, "insurance": [], "isMHMG": False, "language": [],
            "locationType": [], "lonlat": [-95.36, 29.76], "onlineScheduling": ["Any"],
            "specialty": ["Gastroenterology"],
        },
    }
    req = requests.post("https://api.memorialhermann.org/api/doctorsearch", json=json_post, headers=headers)
    req.raise_for_status()  # stop on an HTTP error instead of parsing an error body
    data = req.json()
    
    # Each entry in data['docs'] describes one doctor
    for doctor in data['docs']:
        print(f"{doctor['Name']:30}  {doctor['PrimarySpecialty']:20}  {doctor['PrimaryFacility']}")
    

    Giving you:

    Dr. Isaac Raijman, MD           Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Gabriel Lee, MD             Gastroenterology      Memorial Hermann Southeast Hospital
    Dr. Dang Nguyen, MD             Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Harshinie Amaratunge, MD    Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Tanima Jana, MD             Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Tugrul Purnak, MD           Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Dimpal Bhakta, MD           Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Dharmendra Verma, MD        Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Jennifer Shroff, MD         Gastroenterology      Memorial Hermann Texas Medical Center
    Dr. Brooks Cash, MD             Gastroenterology      Memorial Hermann Texas Medical Center
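
To see why the original find() call came back empty, it can help to run the same lookup on two pieces of markup: one that already contains the doctor links and one that only has the empty container. The sketch below is an illustration only. Both HTML strings and the helper name doctor_names are hypothetical and were not taken from the real site; the second name is borrowed from the output above.

```python
from bs4 import BeautifulSoup


def doctor_names(html):
    """Return the text of every <a class="doctor-name"> element in html."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns a (possibly empty) list, so there is no None to guard against
    return [a.get_text(strip=True) for a in soup.find_all("a", class_="doctor-name")]


# Hypothetical markup as the browser shows it after JavaScript has run
rendered = """
<section class="search-page find-a-doctor">
  <a class="doctor-name" href="/doctors/a">Dr. Isaac Raijman, MD</a>
  <a class="doctor-name" href="/doctors/b">Dr. Gabriel Lee, MD</a>
</section>
"""

# Hypothetical markup as requests receives it: the container exists but is empty
raw = '<section class="search-page find-a-doctor"></section>'

print(doctor_names(rendered))  # ['Dr. Isaac Raijman, MD', 'Dr. Gabriel Lee, MD']
print(doctor_names(raw))       # []
```

If the same helper returns an empty list for page.content, the links are most likely added after the page loads, which is when the JSON API route above becomes the simpler option.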