javascript cheerio

How to get the text


I have some HTML to scrape:

<div class="content">
  <strong> This is first content </strong> This is second content
  <br />
  <small>
    <p>Something</p>
  </small>
</div>

How can I get the text "This is second content" with cheerio?


Solution

  • Filter the child nodes of the div by their nodeType property and keep only the text nodes (nodeType 3). This works even if there is also text before the <strong> tag:

    <div class="content">
      Before first content
      <strong> This is first content </strong> This is second content
      <br />
      <small>
        <p>Something</p>
      </small>
    </div>
    

    Then the code could be:

    const cheerio = require("cheerio");
    const $ = cheerio.load('<div class="content">Before first content<strong> This is first content </strong> This is second content<br /><small><p>Something</p></small></div>');
    
    // contents() returns every child node, including text nodes;
    // the filter keeps only the text nodes (nodeType 3)
    const $outer = $("div.content").contents().filter(function() {
        return this.nodeType === 3;
    });
    
    console.log($outer.text()); //"Before first content This is second content"
    
    $outer.each(function() {
        console.log($(this).text());
    });
    //"Before first content"
    //" This is second content"
    
