
How can I get text content from p tag with nested span?


I'm using Scrapy to get some data from a site, but I'm having some trouble getting text content from a part of the HTML that has this structure:

<div class="price">
    <p>
        <span class="price-label">
            Some label
        </span>
        Price value
    </p>
</div>

My main goal is to get the string "Price value", but as you can see, it's placed inside the <p> tag and after the <span> tag is closed.

Because of this position, response.css('.price p ::text').get() returns a blank string: it picks up the text between the <p> and <span> tags. The only way I've reached my goal is by using string methods to strip the <span> tag out of the result of response.css('.price p').get(), but I think there must be a better way to get the content.


Solution

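  • "".join(response.css('.price p::text').getall()).strip() is one of many possible solutions. With no space before ::text, 'p::text' selects only the text nodes that sit directly inside <p>, so the label inside the <span> is skipped. getall() returns every such node, including the whitespace-only one before the <span>, and strip() trims the leftover whitespace around "Price value".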