I have the following problem. I refactored the jQuery Terminal unix_formatting extension (a formatter) that processes ANSI escape codes, based on paid work. The task I was hired for was only to output text, but the code should work with any output a terminal can process that you can record into a single file. I have permission to use this code in my Open Source project.
When my code finishes parsing the input string, I end up with a structure like this:
var data = [
{
text: " the knight ▄▄▄▄▄ fuel",
formatting: [
{start: 55, end: 65, format: '[[b;#555;#000]'},
{start: 65, end: 66, format: '[[;#000;]'},
{start: 66, end: 76, format: '[[b;#555;]'}
]
}
];
For the work I was hired for, I ignored the formatting and colors, returned only the text, and concatenated the results.
jQuery Terminal formatting looks like this: the opening part is the string in the format property, and the closing part is a single closing bracket. The formatting can overlap because I have a formatter that uses a simple stack to create a flat list.
Now I need to convert this data into the final colored output using jQuery Terminal formatting. The problem is that the start and end ranges can overlap. The library has a function called $.terminal.substring that returns a substring of a string while keeping the formatting in place, e.g.:
$.terminal.substring('[[;;]hello]', 1, 2);
// '[[;;]e]'
The problem is that this function is slow when the text is long, and I test with ANSI art that is sometimes one big line that needs to be split into lines depending on metadata.
So I was thinking of pre-processing the input string somehow and putting markers in the places where the formatting needs to go, and then combining the formatting and the input chunks at the end. But I'm not sure how to do this.
I asked ChatGPT, and it gave me some simple code (way too simple). At first I thought it was working:
function preprocessAndFormat(data) {
var text = data.text;
var formatting = data.formatting;
// Collect all formatting start and end points
var points = [];
formatting.forEach(function(item) {
points.push({ pos: item.start, type: 'start', format: item.format });
points.push({ pos: item.end, type: 'end', format: item.format });
});
// Sort points by position, with 'end' before 'start' if equal
points.sort(function(a, b) {
if (a.pos === b.pos) {
return a.type === 'end' ? -1 : 1;
}
return a.pos - b.pos;
});
// Process the text and insert formatting placeholders
var result = '';
var lastPos = 0;
var openFormats = [];
points.forEach(function(point) {
if (point.pos > lastPos) {
result += text.slice(lastPos, point.pos);
}
if (point.type === 'start') {
openFormats.push(point.format);
result += point.format;
} else if (point.type === 'end') {
var index = openFormats.lastIndexOf(point.format);
if (index !== -1) {
openFormats.splice(index, 1);
result += ']';
}
}
lastPos = point.pos;
});
// Add the remaining text after the last position
result += text.slice(lastPos);
return result;
}
But it failed with more complex ANSI art. One of my unit tests uses this Dennis Ritchie ANSI art.
This is the output of the above code:
As you can see, some of the formatting is visible, because the code takes substrings of the output string, which already contains the formatting, while the start and end markers refer to the original string without the formatting.
So to be complete, here is my question summary:
How can I write an algorithm that processes the input string and injects the formatting at specific positions, when the formatting ranges can overlap?
Here is a CodePen demo where you can experiment:
https://codepen.io/jcubic/pen/mybEYPV?editors=1010
This is the related code:
function format_lines(str, len) {
str = $.terminal.apply_formatters(str, {
unixFormatting: {
format_text: function(line) {
// return line.text;
return format_text(line);
},
ansiArt: true
}
});
...
}
The format_text function receives a data structure like the one I showed at the beginning. format_text is the invalid function based on the ChatGPT code.
When a ] is appended right after a backslash (\), that ] is output as a literal character and not interpreted as the closing delimiter of a formatted sequence, and as a consequence some escape codes are nested and appear in the output.
You should somehow avoid that \] sequence.
One way is to insert a NUL character between the two, which the jQuery terminal emulator does not render in the output. Change this line of code:
result += ']';
to this:
result += result.at(-1) == '\\' ? '\x00]' : ']';
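If the overlapping ranges are still a concern after that fix, another option is to cut the original text at every start and end position and wrap each chunk separately, so no formatting ever stays open across a boundary. The following is only a minimal sketch under some assumptions, not the library's method: formatSegments is a hypothetical name, the "last covering range wins" rule is my own guess at the desired behavior, and the text is assumed to contain no literal brackets (otherwise each chunk would need escaping first, for example with something like $.terminal.escape_brackets). It also applies the same NUL guard as above.

```javascript
// Hypothetical helper, not part of jQuery Terminal.
// Splits data.text at every formatting boundary and wraps each chunk
// with at most one format, so formatting never overlaps in the output.
function formatSegments(data) {
    var text = data.text;
    var formatting = data.formatting;
    // Every start/end position is a boundary; clamp to the text length.
    var set = new Set([0, text.length]);
    formatting.forEach(function(item) {
        set.add(Math.min(item.start, text.length));
        set.add(Math.min(item.end, text.length));
    });
    var bounds = Array.from(set).sort(function(a, b) {
        return a - b;
    });
    var result = '';
    for (var i = 0; i < bounds.length - 1; i++) {
        var from = bounds[i];
        var to = bounds[i + 1];
        // Always slice the ORIGINAL text, so the offsets never shift.
        var chunk = text.slice(from, to);
        // Ranges that cover this whole chunk, in input order.
        var active = formatting.filter(function(item) {
            return item.start <= from && item.end >= to;
        });
        if (!active.length) {
            result += chunk;
            continue;
        }
        // Assumption: when ranges overlap, the last one in the list wins.
        var format = active[active.length - 1].format;
        // Same NUL guard as above: never emit a backslash followed by ].
        var guard = chunk.slice(-1) === '\\' ? '\x00' : '';
        result += format + chunk + guard + ']';
    }
    return result;
}

var example = formatSegments({
    text: 'abcdef',
    formatting: [
        {start: 1, end: 4, format: '[[b;#fff;]'},
        {start: 3, end: 5, format: '[[;#f00;]'}
    ]
});
// example === 'a[[b;#fff;]bc][[;#f00;]d][[;#f00;]e]f'
```

The sketch does not merge neighboring chunks that end up with the same format, so the output can be longer than strictly necessary. Merging them would be a straightforward follow-up if the string size matters.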