How to disassemble, process, and reassemble a string in Perl (using regex)?


I am working on documenting some mathematical research in a Trac wiki. I have set up the Trac installation with a MathJax plugin and everything has been working great. As the documents have gotten longer, I wanted to use TextMate for syntax highlighting and to make generating previews easier. I found a Trac bundle and installed it. The bundle contains the following Perl script for generating an HTML preview:

# Preview command contributed by Tudor Marghidanu 
#
# Requires the Text::Trac perl module:
#
# sudo perl -MCPAN -e 'install Text::Trac'
#
#!/usr/bin/env perl

use strict;
use warnings;

use Text::Trac;

my $parser = Text::Trac->new();
$parser->parse( join( '', <STDIN> ) );

print $parser->html();

(For anyone else using this bundle: the #! line had to be moved to the top of the file for the script to work properly.)

This did a good job of converting the Trac wiki markup into HTML but, obviously, did nothing about the MathJax markup. I simply added a line

print '<script type="text/javascript"
  src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>';

to load the JavaScript from the CDN. Now the problem is that MathJax symbols such as ^ are being translated into HTML, because they are also part of the TracWiki syntax. I tried using TracWiki markup to essentially "comment out" the MathJax segments, but the Perl Trac library does not seem to honor it.

It seems like there should be a way to regex match all the MathJax segments, stash them in an array, replace them with placeholder tokens (like mj1, mj2, ...), process the resulting wiki text into HTML, and then replace the placeholders with the stashed values.

If this is the right way to do it, how is this done in Perl?

If this is not the right way to do it, what is?


Solution

  • Taking your solution as a starting point, the following is a much simpler version.

    It relies on the snippets being cut out in order of appearance, and uses s///eg to do each replacement in a single pass instead of two steps:

    #!/usr/bin/env perl
    
    #
    # Preview command contributed by Tudor Marghidanu
    #
    # Requires the Text::Trac perl module:
    #
    # sudo perl -MCPAN -e 'install Text::Trac'
    #
    
    use strict;
    use warnings;
    
    use Text::Trac;
    
    my $parser = Text::Trac->new();
    
    my $tractext = join '', <STDIN>;
    
    my @mathjax_snippets;
    
    # Non-greedy .*? with /s: each \( ... \) span is cut separately, even across lines.
    $tractext =~ s{(\\\(.*?\\\))}{
        push @mathjax_snippets, $1;
        "math_jax_snippet"
    }egs;
    
    $parser->parse($tractext);
    my $html = $parser->html();
    
    $html =~ s/math_jax_snippet/shift @mathjax_snippets/eg;
    
    print '<script type="text/javascript"
      src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
    </script>';
    
    print $html;
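
A variant with numbered placeholders, closer to the mj1, mj2, ... idea in the question, might look like the sketch below. It does not depend on the snippets coming back in the same order. Everything here is an assumption for illustration: the token format mjxsnippetNx, the helper names stash_math and restore_math, and especially fake_wiki_to_html, which is a hypothetical stand-in for Text::Trac so that the example runs without the module installed. Only \( ... \) delimiters are handled.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Replace each \( ... \) span with a numbered token and remember the original.
# Returns the protected text and a reference to the list of stashed snippets.
sub stash_math {
    my ($text) = @_;
    my @snippets;
    # .*? is non-greedy, and /s lets a snippet span several lines.
    $text =~ s{(\\\(.*?\\\))}{
        push @snippets, $1;
        'mjxsnippet' . $#snippets . 'x';   # hypothetical token format
    }egs;
    return ($text, \@snippets);
}

# Put each stashed snippet back wherever its token ended up.
sub restore_math {
    my ($html, $snippets) = @_;
    $html =~ s{mjxsnippet(\d+)x}{ $snippets->[$1] }eg;
    return $html;
}

# Hypothetical stand-in for the wiki parser: it only turns ^text^ into
# <sup>text</sup>, which is enough to show the clash with MathJax carets.
sub fake_wiki_to_html {
    my ($text) = @_;
    $text =~ s{\^(.+?)\^}{<sup>$1</sup>}g;
    return "<p>$text</p>";
}

my $wiki = 'Let \(x^2 + y^2\) and \(z^3\) be given, with a ^note^.';
my ($protected, $snippets) = stash_math($wiki);
my $html = restore_math(fake_wiki_to_html($protected), $snippets);

# prints: <p>Let \(x^2 + y^2\) and \(z^3\) be given, with a <sup>note</sup>.</p>
print "$html\n";
```

In the real preview script, the call to fake_wiki_to_html would be replaced by the $parser->parse(...) and $parser->html() calls shown above. Whether the all-lowercase token survives Text::Trac untouched is untested here, so check it against your own pages.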