regex replace numbers calibre

Regex and replacement with numbers and text


In the code of an EPUB, I have this text:

<span>Capitulo 1 - Apple is red</span>
<span>Capitulo 2 - Milk is white</span>
<span>Capitulo 3 - Weeds are green</span>

I need to replace the "span" tags with "h1" tags and every instance of "Capitulo" with "Chapter", keeping the rest of the text unchanged. I tried this in calibre, with no luck:

Find: <span>Capitulo (/d+) * </span>
Replace: <h1>Chapter /1 * </h1>

What can I do?

Second question: if I had this text:

<span>Capitulo 1 - apple is red, 5 chicas</span>
<span>Capitulo 2 - milk is white, 6 chicas</span>
<span>Capitulo 3 - weeds are green, 7 chicas</span>

and I want to obtain:

<h1>Chapter1 - apple is red, 5 girls</h1>
<h2>Chapter2 - milk is white, 6 boys</h2>
<h3>Chapter3 - weeds are green, 7 men</h3>

how should I proceed?


Solution

  • You may use

    Find: <span>Capitulo ([^<]*)</span>
    Replace: <h1>Chapter \1</h1>

    See the regex demo and the Regulex graph.

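    The ([^<]*) part matches zero or more characters other than <, because [^<] is a negated character class. The parentheses form a capturing group, whose contents are available in the replacement pattern through a backreference (\1 in the replacement above). The original attempt fails because /d+ and /1 use a forward slash where the regex syntax needs a backslash (\d+, \1), and because * is a quantifier that repeats the preceding token, not a wildcard.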
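
To check the pattern outside calibre, here is a minimal sketch in Python. It assumes calibre's search behaves like Python-style regex for this pattern. The names PATTERN, REPLACEMENT, convert_headings and sample are illustrative only and not part of calibre.

```python
import re

# Find and Replace expressions copied from the answer above.
PATTERN = r"<span>Capitulo ([^<]*)</span>"
REPLACEMENT = r"<h1>Chapter \1</h1>"


def convert_headings(html: str) -> str:
    """Turn each '<span>Capitulo ...</span>' into '<h1>Chapter ...</h1>'."""
    # ([^<]*) captures everything up to the next '<';
    # \1 in the replacement puts that captured text back.
    return re.sub(PATTERN, REPLACEMENT, html)


# Sample text from the question (hypothetical variable name).
sample = (
    "<span>Capitulo 1 - Apple is red</span>\n"
    "<span>Capitulo 2 - Milk is white</span>\n"
    "<span>Capitulo 3 - Weeds are green</span>"
)

converted = convert_headings(sample)
print(converted)
```

The negated class [^<] stops at the first <, so each match stays inside a single span. A greedy .* could run on to a later </span> if several spans shared one line.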