javascript html regex

Regex to match an HTML <p ...> tag


I would like to match an HTML <p ...> tag when the first character after the closing > is lowercase. I have seen answers on this site that match a p tag with attributes, but not a plain p tag (<p>). What do I need to change to match both? Example:

<p class="calibre1">All the while I was
<p>All the while I was
<p class="calibre1">all the while I was
<p>all the while I was

The regex should match the last two tags. The code that I have (/<\/?([^p](\s.+?)?|..+?)>[a-z]/) matches only the third, not the fourth.

PS: I'm trying to edit an EPUB file using Sigil, which has regex but no HTML parser.


Solution

  • The following regex looks like a good starting place:

    [image: the suggested regex]

    From there I'm not sure what you want to capture, but it'll work. And yes, of course you should use an HTML parser for this, but for something as simple as this I don't see why regex is an issue, provided you know the input (it won't work on generalized HTML input).
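
As a rough sketch of one pattern that handles both forms (my own guess, not necessarily the regex in the image above): the pattern in the question fails on a bare <p> because `[^p]` rejects the letter p and the other branch, `..+?`, needs at least two characters between < and >. Making the attribute part optional avoids that. The name `lowercaseParagraph` is made up for illustration, and the pattern assumes attribute values never contain a literal > character.

```javascript
// Hypothetical pattern: a literal "<p", then an optional attribute run that
// starts with whitespace and stops before the first ">", then ">" followed by
// one lowercase ASCII letter.
const lowercaseParagraph = /<p(?:\s[^>]*)?>[a-z]/;

// The four sample lines from the question.
const samples = [
  '<p class="calibre1">All the while I was',
  '<p>All the while I was',
  '<p class="calibre1">all the while I was',
  '<p>all the while I was',
];

// Keep only the lines whose <p> tag is followed by a lowercase letter.
const matched = samples.filter((line) => lowercaseParagraph.test(line));
console.log(matched);
```

Because the tag name must be followed by whitespace or >, tags such as <pre> are not picked up. The same pattern text, without the surrounding slashes, should also be usable in Sigil's regex search as long as the search is case-sensitive.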