So I want to extract MathML from HTML. For example, I have this string:
<p>Task: </p><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi><mo>+</mo><mn>2</mn><mo>=</mo><mn>5</mn></mrow></math><p> find </p><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup></math><p>.</p>
I want to match
<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi><mo>+</mo><mn>2</mn><mo>=</mo><mn>5</mn></mrow></math>
and
<math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup></math>
How can I achieve this?
I've tried the expression /(<math)(.*)(math>)/g, but it matches everything between the first <math and the last math>.
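By default, quantifiers are greedy: .* consumes as much text as it can, so the match runs from the first <math to the last math>. You just need to make it lazy by placing ? after the *, so that each match stops at the first math> it reaches: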
const str = `<p>Task: </p><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi><mo>+</mo><mn>2</mn><mo>=</mo><mn>5</mn></mrow></math><p> find </p><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup></math><p>.</p>`;
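// The ? after * makes the quantifier lazy, so each match ends at the nearest "math>".
const regex = /(<math)(.*?)(math>)/g;
// With the g flag, match() returns an array of the full matches only (capture groups are not included).
const result = str.match(regex);
console.log(result.length); // 2
console.log(result); // the two <math>...</math> elements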
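One caveat with the pattern above: . does not match line breaks unless the s flag is set, so a <math> element that spans several lines would be missed, and match() returns null when nothing is found. The following is a minimal sketch of a slightly more defensive variant, not part of the original answer. The helper name extractMathML and the sample string are made up for illustration, and the pattern assumes that <math> elements are not nested.

```javascript
// Hypothetical helper (name chosen for illustration): collects every
// <math>...</math> element found in an HTML string.
function extractMathML(html) {
  // \b       stops "<math" from matching a longer tag name such as "<mathx"
  // [\s\S]   matches any character, including line breaks
  // *?       is lazy, so each match ends at the nearest closing </math>
  const pattern = /<math\b[\s\S]*?<\/math>/g;
  // match() returns null when there is no match, so fall back to an empty array.
  return html.match(pattern) || [];
}

// Assumed sample input: the first <math> element spans several lines.
const sample = `<p>Task: </p><math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow><mi>x</mi><mo>+</mo><mn>2</mn></mrow>
</math><p> find </p><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup></math>`;

const found = extractMathML(sample);
console.log(found.length); // 2
```

For arbitrary or untrusted HTML, a real parser (for example DOMParser in the browser) is generally more reliable than a regular expression; the regex approach works here because the input has a simple, known shape.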