I am practicing web scraping, and yesterday I got a correct result from code that I did not think should work.
I used soup.find(id=i)
to find the element whose id attribute equals i. I thought i had to be a string, but when I passed a tuple whose first element is the string key, I was surprised that it still returned the correct result.
Say '01' is the value of the id attribute; the code below gave exactly the same result as id='01':
tup = ('01', 'Revenue')
acc = soup.find(id=tup).text.strip().split('\n')
Can anyone with experience in this explain why it works? Thank you very much.
What I tried:
tup = ('01', 'Revenue')
acc = soup.find(id=tup).text.strip().split('\n')
I expected a KeyError because I passed a tuple instead of a string to id.
I searched the BeautifulSoup source code and found three places where it checks whether something is a tuple.
I haven't gone through the whole chain of calls, but it seems to me that whenever you pass a tuple or list as an argument, BeautifulSoup turns it into a space-separated string and then checks each of the values.
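A quick experiment is one way to check the "checks each value" part without reading the whole call chain. The sketch below assumes bs4 is installed and uses made-up markup (the div elements and their text are hypothetical, not from the question). It suggests that a tuple passed to id is treated as a set of alternatives, so an element matches if its id equals any item:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented only for this illustration.
html = """
<div id="01">Revenue line</div>
<div id="02">Cost line</div>
<div id="03">Profit line</div>
"""
soup = BeautifulSoup(html, "html.parser")

by_string = soup.find(id="01")
by_tuple = soup.find(id=("01", "Revenue"))

# Both lookups return the same element; no element has id "Revenue",
# so that item of the tuple simply never matches anything.
same_element = by_tuple is by_string

# With two real ids in the tuple, find_all returns both elements.
matched_ids = [tag["id"] for tag in soup.find_all(id=("01", "03"))]

print(same_element, matched_ids)
```

If that reading is right, the second tuple item in the question did no harm only because no element on the page happened to have id="Revenue".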
I think this part of the source code does the actual unpacking / conversion:
def _attr_value_as_string(self, value, default=None):
    """Force an attribute value into a string representation.
    A multi-valued attribute will be converted into a
    space-separated string.
    """
    value = self.get(value, default)
    if isinstance(value, list) or isinstance(value, tuple):
        value = " ".join(value)
    return value
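To see what "multi-valued attribute" means from the outside, here is a minimal sketch that uses only the public API rather than the private method above. The markup is hypothetical. It shows that class is stored on the tag as a list while id is a plain string, and that joining the list with spaces gives the kind of string the docstring describes:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented only for this illustration.
html = '<p class="big red" id="01">Revenue</p>'
soup = BeautifulSoup(html, "html.parser")
tag = soup.find(id="01")

class_value = tag["class"]       # multi-valued attribute, stored as a list
id_value = tag["id"]             # single-valued attribute, stored as a str
joined = " ".join(class_value)   # the space-separated form

# Searching by the exact joined string is a documented way to match class.
exact = soup.find(class_=joined)

print(class_value, id_value, joined, exact is tag)
```

Note that this conversion is applied to the attribute value stored on the tag, which is a separate step from how the tuple passed as the search argument is handled.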