html xml web-scraping xpath xpath-1.0

Facing difficulties with the XPath text() function


My goal is to get the price text from the page. When I use the XPath expression

//*[@id="module_product_price_1"]/div/div/span

it matches just one element. However, when I append text()

//*[@id="module_product_price_1"]/div/div/span/text()

it returns two items. The first one is hidden, while the second one holds the visible text.

Here is the webpage.

Here is the element:

<div id="module_product_price_1" class="pdp-block module"><div class="pdp-mod-product-price"><img src="https://img.lazcdn.com/g/gcp/lazada/id00780-480-72.jpg_500x500q80.jpg_.webp" alt="promotion" class="pdp-mod-product-price-topbanner"><div class="pdp-product-price"><span class="notranslate pdp-price pdp-price_type_normal pdp-price_color_orange pdp-price_size_xl">$30.00</span></div></div></div>


Solution

  • Evaluating the expression in the browser's DevTools console with

    $x('//*[@id="module_product_price_1"]/div/div/span/text()')
    

    shows an extra, empty text node as the first child of the targeted span.

    It can easily be ignored by taking the string value of the targeted span (the concatenation of all its descendant text nodes) and normalizing the whitespace via normalize-space():

    normalize-space(//*[@id="module_product_price_1"]/div/div/span)
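To see why this works, here is a minimal Python sketch of the same idea. Python's standard library has no full XPath 1.0 engine, so the helpers `xpath_string_value` and `normalize_space` are hypothetical names that only emulate the XPath string-value rule and normalize-space(). The span with an empty first text node is built by hand to mimic what the console showed; it is an assumption about the live DOM, not a copy of the page.

```python
import re
from xml.dom import minidom


def xpath_string_value(node):
    """Emulate the XPath 1.0 string-value of a node:
    the concatenation of all descendant text nodes in document order."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(xpath_string_value(child) for child in node.childNodes)


def normalize_space(text):
    """Emulate XPath normalize-space(): strip leading/trailing whitespace
    and collapse each run of space, tab, CR, or LF into a single space."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" \t\r\n")


# Build a span that has two text-node children, the first one empty
# (assumed to mirror what the browser console reported).
doc = minidom.Document()
span = doc.createElement("span")
span.appendChild(doc.createTextNode(""))
span.appendChild(doc.createTextNode("$30.00"))

# Roughly what .../span/text() selects: every text-node child, one by one.
text_nodes = [c.data for c in span.childNodes if c.nodeType == c.TEXT_NODE]

# Roughly what normalize-space(.../span) computes: one cleaned-up string.
price = normalize_space(xpath_string_value(span))

print(text_nodes)  # ['', '$30.00']
print(price)       # $30.00
```

Selecting text() yields one result per text node, so the empty node shows up as its own item. Taking the element's string value instead merges all text nodes into a single string, and the empty one contributes nothing.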