I have a URL for an HTML file that contains some links. I want to return the @href values that contain a given substring.
For example, given the HTML code
<html><body>
<a href="https://stackoverflow.com/">secure link</a>
<a href="http://google.com/">other link</a>
</body></html>
I want to get the links that match "stack".
I do it with //a[contains(@href, "stack")]/@href, which works.
But I wonder why //a/@href[contains(text(),"stack")] does not work. I would prefer that form and find it more logical.
I would rather select the @href attributes and keep the ones that match, since that seems simpler than finding all a elements whose @href contains the substring and then returning their @href.
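text() selects the text nodes that are children of the context node. In your second example the context node is an attribute node, not an element, and an attribute node has no child text nodes, so text() resolves to an empty node-set and contains() never matches.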
If you want to filter on the attribute itself, you need its string value. string(), called without arguments, returns the string value of the context node, so it should work in this case:
//a/@href[contains(string(),"stack")]
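If you want to check this yourself, here is a minimal sketch that runs the expressions against the HTML from the question. It assumes Python with the third-party lxml library installed (my choice for illustration; the question does not say which XPath engine is used), and the helper name hrefs is made up. It also tries contains(., "stack"), where . is the context node and is converted to its string value in the same way.

```python
from lxml import etree  # third-party library, assumed to be installed

HTML = """<html><body>
<a href="https://stackoverflow.com/">secure link</a>
<a href="http://google.com/">other link</a>
</body></html>"""

# Parse the sample document with lxml's HTML parser.
doc = etree.HTML(HTML)


def hrefs(expr):
    """Hypothetical helper: evaluate an XPath and return the results as plain strings."""
    return [str(value) for value in doc.xpath(expr)]


# Filter on the element, then step to the attribute (the working query from the question).
filter_on_element = hrefs('//a[contains(@href, "stack")]/@href')

# Filter on the attribute using text(): an attribute has no child text nodes,
# so text() is an empty node-set and the predicate is false for every @href.
filter_with_text = hrefs('//a/@href[contains(text(), "stack")]')

# Filter on the attribute using its string value.
filter_with_string = hrefs('//a/@href[contains(string(), "stack")]')

# Same idea, using "." for the context node.
filter_with_dot = hrefs('//a/@href[contains(., "stack")]')

print(filter_on_element)
print(filter_with_text)
print(filter_with_string)
print(filter_with_dot)
```

With lxml, only the text() variant should come back empty; the other three should return the stackoverflow.com link.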