I see a lot of info on finding tags and the data within tags, but I can't find anything similarly simple for my case. What I'm trying to do is access the text or title of the "a" element (in this case, "United States") in the 3rd "td" element. The problem I'm running into is that the "style" of the 3rd "td" element is the same as that of the 1st "td" element ("text-align: left"), so I can't filter on it, and nextSibling only gets me as far as the 2nd "td" element. The rest of my code works (though it makes me use what I believe is outdated BeautifulSoup syntax, even though I have Python 3.6.5 and BS 4.6 installed; that's another issue, I guess). I'm trying to get the "country" variable to work, and nothing I've tried works except chaining a bunch of .next.next.next at the end.
The HTML (https://en.wikipedia.org/wiki/Toronto_FC#Current_roster):
<td style="text-align: left">
<a href="/wiki/Goalkeeper_(association_football)"
title="Goalkeeper (association football)">Goalkeeper</a>
</td>
<td style="padding-right:15px;">
<span class="fn">...</span>
</td>
<td style="text-align: left">
<span class="flagicon">...</span>
<a href="/wiki/United_States" title="United States">United States</a>
</td>
My Code:
vcard = page_soup.findAll("tr", {"class": "vcard agent"})
cards = vcard[0]
for cards in vcard:
    league = page_soup.find("a", {"title": "Major League Soccer"})
    league_name = league.text
    team = page_soup.find("h1", {"class": "firstHeading"})
    team_name = team.text
    position = cards.a.text
    name = cards.findAll("span", {"class": "fn"})
    player_name = unidecode(name[0].text)
    # *** this variable is not working ***
    country = cards.find("td", {"style": "text-align: left"})
When in doubt, use CSS selectors. When not in doubt, use CSS selectors.
Obviously, though, this only works if you're sure the country will always be in the third td:
country = cards.select_one('td:nth-of-type(3) a')
print(country['title'], country.text)
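For anyone who wants to try this without fetching the page, here is a minimal, self-contained sketch. The row is modeled on the HTML in the question, but the enclosing table and tr, the "Example Player" text, and the empty flagicon span are assumptions filled in for illustration; the real page may differ. It also shows a second selector, `span.flagicon ~ a`, that looks for the link following the flag icon instead of counting cells, which may hold up better if the column order changes.

```python
from bs4 import BeautifulSoup

# Sample row modeled on the question's HTML. The <table>/<tr> wrapper, the
# player name, and the empty flagicon span are assumptions for illustration.
html = """
<table>
<tr class="vcard agent">
  <td style="text-align: left">
    <a href="/wiki/Goalkeeper_(association_football)"
       title="Goalkeeper (association football)">Goalkeeper</a>
  </td>
  <td style="padding-right:15px;">
    <span class="fn">Example Player</span>
  </td>
  <td style="text-align: left">
    <span class="flagicon"></span>
    <a href="/wiki/United_States" title="United States">United States</a>
  </td>
</tr>
</table>
"""

page_soup = BeautifulSoup(html, "html.parser")

for card in page_soup.select("tr.vcard"):
    # Positional: the link inside the third <td> of this row.
    by_position = card.select_one("td:nth-of-type(3) a")

    # Structural: the link that comes after the flag icon in the same cell,
    # regardless of which column that cell happens to be.
    by_flag = card.select_one("span.flagicon ~ a")

    print(by_position["title"], by_position.text)
    print(by_flag["title"], by_flag.text)
```

Both selectors pick the same link for this sample row. If a row has no matching cell, select_one returns None, so real scraping code would probably want to check for that before indexing.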