python web-scraping beautifulsoup rel

How do I extract the contents of the rel attribute in an <a> tag?


<a href="#" class="tip" rel="&nbsp;
    Principal Name - S. BALKAR SINGH
    Mobile No. - 8146611008
    Email ID - gsssdhapaiasr@gmail.com
    &nbsp;" style="user-select: text;">View Contact Details<span 
class="caret"></span></a>

The principal name, mobile number, and email ID are the contents I'm interested in. When I use soup.find('a', {'class':'tip'}), I only get "View Contact Details".

Is there a way to extract the contents of rel?


Solution

  • rel is an attribute, so you have to access it with ['rel'], i.e. soup.find('a', {'class':'tip'})['rel']

    Working example

    data = '''<a href="#" class="tip" rel="&nbsp;
        Principal Name - S. BALKAR SINGH
        Mobile No. - 8146611008
        Email ID - gsssdhapaiasr@gmail.com
        &nbsp;" style="user-select: text;">View Contact Details<span 
    class="caret"></span></a>'''
    
    from bs4 import BeautifulSoup
    
    soup = BeautifulSoup(data, 'html.parser')
    
    item = soup.find('a', {'class':'tip'})
    
    print('text:', item.text)
    print(' rel:', item['rel'])
    print(' rel:', ' '.join(item['rel']))
    

    Result:

    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'S.', 'BALKAR', 'SINGH', 'Mobile', 'No.', '-', '8146611008', 'Email', 'ID', '-', 'gsssdhapaiasr@gmail.com', '']
     rel:  Principal Name - S. BALKAR SINGH Mobile No. - 8146611008 Email ID - gsssdhapaiasr@gmail.com 
    

    BeautifulSoup returns a list for rel, not a single string, because it treats rel as a multi-valued attribute (like class) and splits the value on whitespace.
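
    If you would rather get rel back as one string, a minimal sketch follows. It assumes a reasonably recent Beautiful Soup 4 release (4.8 or later, I believe), where the constructor accepts multi_valued_attributes=None to turn the splitting off. The names raw_rel and clean_rel are only illustrative, and the HTML is a shortened copy of the snippet from the question.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <a> tag from the question.
data = '''<a href="#" class="tip" rel="&nbsp;
    Principal Name - S. BALKAR SINGH
    Mobile No. - 8146611008
    Email ID - gsssdhapaiasr@gmail.com
    &nbsp;">View Contact Details</a>'''

# multi_valued_attributes=None asks the tree builder not to split
# attributes such as rel and class into lists (assumes bs4 >= 4.8).
soup = BeautifulSoup(data, 'html.parser', multi_valued_attributes=None)
item = soup.find('a', {'class': 'tip'})

# rel is now a single string with the original line breaks and the
# non-breaking spaces that &nbsp; was decoded to.
raw_rel = item['rel']

# str.split() with no arguments splits on any Unicode whitespace,
# including non-breaking spaces, so this collapses everything.
clean_rel = ' '.join(raw_rel.split())

print(repr(clean_rel))
```

    Note that with this setting class also comes back as a plain string instead of a list.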


    EDIT: To get the table with the data, you have to send a POST request containing all the form data that the browser normally sends to the server. The fields can even be empty strings, but the server has to receive them.

    import requests
    from bs4 import BeautifulSoup
    
    headers = {'User-Agent': 'Mozilla/5.0'}
    
    # form fields sent to the server
    params = {
        'SchoolType': '',
        'Dist1': '',    
        'Sch1': '', 
        'SearchString': ''  
    }
    
    r = requests.post('http://www.registration.pseb.ac.in/School/Schoollist', headers=headers, data=params)
    
    soup = BeautifulSoup(r.text, 'html.parser')
    
    all_a = soup.find_all('a', {'class':'tip'})
    
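    # print the link text and rel data of every matching <a>
    for item in all_a:
        print('text:', item.text)
        print(' rel:', item['rel'])
        print(' rel:', ' '.join(item['rel']))
        print('-----')
    

    Result (captured from the original run; every block shows the same entry, most likely because the loop body read a leftover item variable instead of the loop variable, so with the loop above each block should show a different school):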

    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
    text: View Contact Details
     rel: ['', 'Principal', 'Name', '-', 'POONAM', 'POONI', 'Mobile', 'No.', '-', '8568940353', 'Email', 'ID', '-', 'GAURAVGUPTA806@YAHOO.COM', '']
     rel:  Principal Name - POONAM POONI Mobile No. - 8568940353 Email ID - GAURAVGUPTA806@YAHOO.COM 
    -----
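
    Since the question asks for the principal name, mobile number and email ID specifically, the joined rel strings shown in the results could be split into fields with a regular expression. This is only a sketch: parse_contact and CONTACT_RE are hypothetical names, and the pattern assumes the three labels always appear in this order with exactly this wording, which I have not checked against the whole page.

```python
import re

# Assumes the labels always appear in this order and with this wording.
CONTACT_RE = re.compile(
    r'Principal Name - (?P<name>.*?)\s*'
    r'Mobile No\. - (?P<mobile>.*?)\s*'
    r'Email ID - (?P<email>\S*)'
)

def parse_contact(rel_text):
    """Return a dict with name, mobile and email, or None if no match."""
    # Collapse all whitespace (including non-breaking spaces) first.
    text = ' '.join(rel_text.split())
    match = CONTACT_RE.search(text)
    return match.groupdict() if match else None

# One of the joined rel strings from the results above.
sample = (' Principal Name - POONAM POONI Mobile No. - 8568940353 '
          'Email ID - GAURAVGUPTA806@YAHOO.COM ')
contact = parse_contact(sample)
print(contact)
```

    Inside the loop you would call it as parse_contact(' '.join(item['rel'])) and check for None before using the result.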