xpath, web-scraping, xpath-2.0, descendant

XPath Select parent node based on child node


I need to extract the href of the a element whose descendant is: i[@class="icon-right-open rotate180"]

I tried the following, but it didn't work for me:

//a[@class="arrowDot "]/@href /descendant::i[@class="icon-right-open rotate180"]

Here is the HTML:

<div class="paginationDots sMargTop centered">
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:2:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="arrowDot ">
        <i class="icon-left-open rotate180"/>
    </a>
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="Dots ">
                        1
    </a>
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:2:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="Dots ">
                        2
    </a>
    <a class="Dots currentDot">
                        3
    </a>
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:4:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="Dots ">
                        4
    </a>
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:5:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="Dots ">
                        5
    </a>
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:6:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="Dots ">
                        6
    </a>
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:7:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="Dots ">
                        7
    </a>
    <a href="https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:4:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale" class="arrowDot ">
        <i class="icon-right-open rotate180"/>
    </a>
</div>

The expected result is the following URL: https://www.mubawab.tn/fr/cc/immobilier-a-vendre-all:p:4:sc:apartments-for-sale,commercial-property-for-sale,farms-for-sale,houses-for-sale,land-for-sale,villas-and-luxury-homes-for-sale, but the actual output is empty.


Solution

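  • You almost had it. In your expression the path has already stepped to the @href attribute before the descendant:: step, and an attribute node has no descendants, so the result is empty. Put the descendant test in a predicate on the a element and select @href last: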

    //a[@class="arrowDot "][descendant::i[@class="icon-right-open rotate180"]]/@href
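
For anyone who wants to check the logic without an XPath 2.0 engine, here is a minimal Python sketch of the same selection. It assumes the markup is well-formed XML (as the snippet in the question is) and uses only the standard library's xml.etree.ElementTree, which supports just a subset of XPath: no descendant:: axis and no /@href step. The predicate is therefore written as an explicit check in Python. The SAMPLE markup, its example.com URLs, and the next_page_href name are shortened stand-ins made up for illustration, not taken from the real page.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's markup; the example.com URLs are hypothetical.
SAMPLE = """
<div class="paginationDots sMargTop centered">
    <a href="https://example.com/list?page=2" class="arrowDot ">
        <i class="icon-left-open rotate180"/>
    </a>
    <a class="Dots currentDot">3</a>
    <a href="https://example.com/list?page=4" class="arrowDot ">
        <i class="icon-right-open rotate180"/>
    </a>
</div>
"""


def next_page_href(markup):
    """Return the href of the arrowDot link that contains the right-arrow icon."""
    root = ET.fromstring(markup)
    # Step 1: candidate <a> elements, like //a[@class="arrowDot "]
    # (note the trailing space in the class value, as in the question).
    for a in root.iterfind(".//a[@class='arrowDot ']"):
        # Step 2: keep the <a> only if it has a matching <i> descendant,
        # like the [descendant::i[@class="icon-right-open rotate180"]] predicate.
        if a.find(".//i[@class='icon-right-open rotate180']") is not None:
            # Step 3: read the attribute last, like the trailing /@href.
            return a.get("href")
    return None


print(next_page_href(SAMPLE))
```

The order of the steps is the point: filter the a element first, then read href. With a full XPath engine the single expression above does all three steps at once. Real-world HTML that is not well-formed would need an HTML parser rather than ET.fromstring.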