javascript html regex

Regex replace with a small text change?


I need to convert some HTML content to UBB code, for instance by replacing the < > signs with square brackets [ ]. There may also be an ordered list <ol> tag with a start= attribute specifying the starting number.

const str = '<b>Something</b> is going on.<br><i>But what?</i><br><br><ol start="3"><li>First</li><li>Second</li><li>Third</li></ol>';
const regex = /<(\/?([bisu]|li|ul|ol|ol start="\d+"))>/gi;
let result = str.replace(regex, "[$1]");
console.log(result);

This works as expected, but I'd like to remove the quotation marks, so that <ol start="3"> becomes [ol start=3]. Is this possible in the same regex?


Solution

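  • This can be accomplished with minimal changes by passing a replacement function to replace() instead of a replacement string. The function receives the full match and the first capture group, strips the quotation marks from that group, and wraps it in square brackets. It uses a template literal for formatting, though that's optional.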

    Per comments, this answer assumes the input string is narrowly confined and that the output is a UBB variant where lists are defined as [ol] rather than [list].

    const str = '<b>Something</b> is going on.<br><i>But what?</i><br><br><ol start="3"><li>First</li><li>Second</li><li>Third</li></ol>';
    
    const regex = /<(\/?([bisu]|li|ul|ol|ol start="\d+"))>/gi;
    
    // The callback receives the full match, then capture group 1 (the tag without < >).
    let result = str.replace(regex, (match, tag) => `[${tag.replace(/"/g, '')}]`);
    
    console.log(result);
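
As for doing it "in the same regex": one possible sketch, under the same narrow-input assumption, is to restructure the pattern so the quotation marks fall outside the capture groups. A plain replacement string then suffices, because JavaScript substitutes an empty string for a group that did not take part in the match. The function name htmlToUbb is made up for this illustration and is not part of the original answer.

```javascript
// Hypothetical helper; the name and the restructured pattern are illustrative assumptions.
function htmlToUbb(html) {
  // Alternative 1: a simple tag, captured in group 1.
  // Alternative 2: `ol start=` in group 2 and the digits in group 3,
  // with the surrounding quotes matched but left outside every group.
  const regex = /<(?:(\/?(?:[bisu]|li|ul|ol))|(ol start=)"(\d+)")>/gi;
  // Groups that did not participate are replaced with an empty string.
  return html.replace(regex, "[$1$2$3]");
}

const str = '<b>Something</b> is going on.<br><i>But what?</i><br><br><ol start="3"><li>First</li><li>Second</li><li>Third</li></ol>';
console.log(htmlToUbb(str));
// [b]Something[/b] is going on.<br>[i]But what?[/i]<br><br>[ol start=3][li]First[/li][li]Second[/li][li]Third[/li][/ol]
```

Unlisted tags such as <br> are left untouched, as in the original pattern. The replacement function above remains the smaller change; this variant only trades the callback for a slightly longer pattern.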