next.js, remarkjs

Extract specific tags using 'remark' in js


I am building a Markdown blog with Next.js.

After reading a Markdown file with fs, I convert it to HTML as follows.

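import { unified } from 'unified'
import remarkParse from 'remark-parse'
import remarkHtml from 'remark-html'

// Parse the Markdown source and serialize it to an HTML string
const markdownToHtml = async (markdownValue: string) => {
  const processedValue = await unified()
    .use(remarkParse)
    .use(remarkHtml)
    .process(markdownValue)

  const stringedValue = processedValue.toString()

  return stringedValue
}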

This lets me render Markdown as a blog post.

However, I would also like to show short 'previews' of several posts on other pages.


To do that, I want to extract only the text inside the <p> tags. For example, given this HTML:

<h2>sorry..</h2>
<p>Hi!</p>
<p><img src="/assets/cardTmp.jpg" alt="tmp"></p>
<p>hello world</p>
<p><strong>bye</strong></p>

All I need is 'Hi! hello world bye'.

Should I use a regular expression or a JavaScript function?

Do you have any recommended methods or libraries?

I tried to use a regular expression, but I'm sure there's a cleaner and clearer way.


Solution

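  • You can do this with DOMParser:

    // `html` is the HTML string, e.g. the value returned by markdownToHtml.
    // DOMParser is a browser API, so run this on the client.
    // Parse the HTML string into a DOM tree
    const doc = new DOMParser().parseFromString(html, 'text/html');
    
    // Get all the <p> tags from the document
    const paragraphs = doc.getElementsByTagName('p');
    
    // Collect the text of each <p>, skipping empty ones (e.g. image-only paragraphs)
    const texts = [];
    for (let i = 0; i < paragraphs.length; i++) {
      const text = paragraphs[i].textContent.trim();
      if (text) texts.push(text);
    }
    console.log(texts.join(' '));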
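
If the preview has to be built on the server (for example, in the same place where the Markdown is read with fs), note that Node.js has no DOMParser global. Below is a minimal, dependency-free sketch for that case. It is not part of the original answer: `extractParagraphText` is a hypothetical helper name, and it assumes the simple, well-formed, non-nested <p> tags shown in the question. It uses regular expressions rather than a real HTML parser and does not decode entities such as `&amp;`, so treat it as a starting point only.

```typescript
// Hypothetical helper (not from the original answer): collects the text of
// every <p> element in a simple HTML string without needing a DOM.
// Assumes well-formed, non-nested <p> tags like those in the question.
function extractParagraphText(html: string): string {
  // \b keeps the pattern from matching other tags that start with "p", e.g. <pre>
  const paragraphPattern = /<p\b[^>]*>([\s\S]*?)<\/p>/gi;
  const texts: string[] = [];
  let match: RegExpExecArray | null;

  while ((match = paragraphPattern.exec(html)) !== null) {
    const text = match[1]
      .replace(/<[^>]*>/g, "") // drop inline tags such as <strong> and <img>
      .replace(/\s+/g, " ") // collapse runs of whitespace
      .trim();
    if (text.length > 0) {
      texts.push(text); // paragraphs that only held an image are skipped
    }
  }
  return texts.join(" ");
}

// The HTML from the question
const sample = [
  "<h2>sorry..</h2>",
  "<p>Hi!</p>",
  '<p><img src="/assets/cardTmp.jpg" alt="tmp"></p>',
  "<p>hello world</p>",
  "<p><strong>bye</strong></p>",
].join("\n");

console.log(extractParagraphText(sample)); // prints "Hi! hello world bye"
```

For anything beyond simple generated HTML, a proper parser is the safer choice. Another option is to collect paragraph text from the Markdown syntax tree before it is turned into HTML.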