<div class="comments-post-meta__profile-info-wrapper display-flex">
<a class="app-aware-link inline-flex overflow-hidden t-16 t-black t-bold tap-target" target="_self" href="https://www.linkedin.com/in/ACoAAAAg-vkBuoZD8xeJW57GlPMiPRWUe-jvvSM" data-test-app-aware-link="">
<h3 class="comments-post-meta__actor display-flex flex-column overflow-hidden t-12 t-normal t-black--light">
<span class="comments-post-meta__name text-body-small-open t-black">
<span class="comments-post-meta__name-text hoverable-link-text mr1">
<span dir="ltr"><span aria-hidden="true"><!---->Nathan Greenhut<!----></span>
<span class="visually-hidden"><!---->View Nathan Greenhut’s profile<!----></span>
</span>
</span>
</div>
I'm trying to scrape the names of the people who commented on a particular LinkedIn post.
I tried this code:
for i in soup.find_all("span", attrs={"class": "comments-post-meta__name-text hoverable-link-text mr1"}):
    print(i.find('span').get_text())
The output I got is:
Nathan GreenhutView Nathan Greenhut’s profile
But the output I want is:
Nathan Greenhut
You could select the element directly by its attribute:
soup.find('span', {'aria-hidden': 'true'}).get_text(strip=True)
or by CSS selector:
soup.select_one('[aria-hidden="true"]').get_text(strip=True)
or, if other elements carry the same attribute, be more specific with:
soup.select_one('.comments-post-meta__profile-info-wrapper [aria-hidden="true"]').get_text(strip=True)
from bs4 import BeautifulSoup
html = '''
<div class="comments-post-meta__profile-info-wrapper display-flex">
<a class="app-aware-link inline-flex overflow-hidden t-16 t-black t-bold tap-target" target="_self" href="https://www.linkedin.com/in/ACoAAAAg-vkBuoZD8xeJW57GlPMiPRWUe-jvvSM" data-test-app-aware-link="">
<h3 class="comments-post-meta__actor display-flex flex-column overflow-hidden t-12 t-normal t-black--light">
<span class="comments-post-meta__name text-body-small-open t-black">
<span class="comments-post-meta__name-text hoverable-link-text mr1">
<span dir="ltr"><span aria-hidden="true"><!---->Nathan Greenhut<!----></span>
<span class="visually-hidden"><!---->View Nathan Greenhut’s profile<!----></span>
</span>
</span>
</div>
'''
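# Name the parser explicitly; omitting it makes bs4 guess and emit a warning.
soup = BeautifulSoup(html, 'html.parser')
# The aria-hidden span holds only the visible name, not the screen-reader text.
print(soup.select_one('[aria-hidden="true"]').get_text(strip=True))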
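Since the question asks for the names of everyone who commented, the same selector can probably be applied to all matches with select() instead of select_one(). The following is only a sketch: the markup is trimmed down from the snippet in the question, and the second commenter ("Jane Doe") is a made-up placeholder added to show more than one match. The real page may nest things differently.

```python
from bs4 import BeautifulSoup

# Trimmed-down markup based on the question; "Jane Doe" is a hypothetical second commenter.
html = '''
<div class="comments-post-meta__profile-info-wrapper display-flex">
  <span class="comments-post-meta__name-text hoverable-link-text mr1">
    <span dir="ltr"><span aria-hidden="true"><!---->Nathan Greenhut<!----></span>
    <span class="visually-hidden"><!---->View Nathan Greenhut’s profile<!----></span></span>
  </span>
</div>
<div class="comments-post-meta__profile-info-wrapper display-flex">
  <span class="comments-post-meta__name-text hoverable-link-text mr1">
    <span dir="ltr"><span aria-hidden="true"><!---->Jane Doe<!----></span>
    <span class="visually-hidden"><!---->View Jane Doe’s profile<!----></span></span>
  </span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# select() returns every match, so each commenter's visible name is collected once.
names = [
    el.get_text(strip=True)
    for el in soup.select('.comments-post-meta__name-text [aria-hidden="true"]')
]
print(names)
```

Scoping the selector to .comments-post-meta__name-text keeps unrelated aria-hidden elements elsewhere on the page (icons, for instance) out of the result.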