I'm building a CLI app in PHP that has a method to output text:
$out->line('Morbi leo risus, porta ac consectetur ac, vestibulum at eros. Aenean lacinia bibendum nulla sed consectetur. Nullam id dolor id nibh ultricies vehicula ut id elit. Aenean lacinia bibendum nulla sed consectetur. Curabitur blandit tempus porttitor.');
I'm limiting the line output to 80 characters within line() via:
public function line(string $text): void
{
    $this->rawLine(wordwrap($text, 80, PHP_EOL));
}
This prints the output across multiple lines:
Morbi leo risus, porta ac consectetur ac, vestibulum at eros. Aenean lacinia
bibendum nulla sed consectetur. Nullam id dolor id nibh ultricies vehicula ut id
elit. Aenean lacinia bibendum nulla sed consectetur. Curabitur blandit tempus
porttitor.
Now, I can also style parts of the text using ANSI escape codes:
$out->line('Morbi leo risus, ' . Style::inline('porta ac consectetur', ['color' => 'blue', 'attribute' => 'bold']) . ' ac, vestibulum at eros. Aenean lacinia bibendum nulla sed consectetur. Nullam id dolor id nibh ultricies vehicula ut id elit. Aenean lacinia bibendum nulla sed consectetur. Curabitur blandit tempus porttitor.');
Which gets converted to this:
Morbi leo risus, \x1b[34;1mporta ac consectetur\x1b[39;22m ac, vestibulum at
eros. Aenean lacinia bibendum nulla sed consectetur. Nullam id dolor id nibh
ultricies vehicula ut id elit. Aenean lacinia bibendum nulla sed consectetur.
Curabitur blandit tempus porttitor.
When passed to line(), it prints like this:
Morbi leo risus, porta ac consectetur ac, vestibulum at eros.
Aenean lacinia bibendum nulla sed consectetur. Nullam id dolor id nibh ultricies
vehicula ut id elit. Aenean lacinia bibendum nulla sed consectetur. Curabitur
blandit tempus porttitor.
Here "porta ac consectetur" is blue and bold, but notice that the first line is shorter than before and the text no longer breaks in the same places.
Even though the escape codes are non-printing characters, wordwrap() (and strlen()) counts them as part of the length.
The first line is originally 76 characters without ANSI escape codes:
Morbi leo risus, porta ac consectetur ac, vestibulum at eros. Aenean lacinia
But after adding styles, it comes back as 97 characters when the escape codes are counted as written below (91 bytes when each \x1b is a single ESC byte):
Morbi leo risus, \x1b[34;1mporta ac consectetur\x1b[39;22m ac, vestibulum at eros. Aenean lacinia
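To make the mismatch concrete, here is a minimal sketch that compares the raw byte length with the visible length. It assumes the escape codes are stored with a real ESC byte (a double-quoted "\x1b") and that only color/attribute sequences of the form ESC [ ... m occur; visibleLength() is a hypothetical helper name, not something from my app.

```php
<?php
// Assumption: the styled string holds real ESC bytes, as produced by "\x1b" in double quotes.
$plain  = 'Morbi leo risus, porta ac consectetur ac, vestibulum at eros. Aenean lacinia';
$styled = "Morbi leo risus, \x1b[34;1mporta ac consectetur\x1b[39;22m ac, vestibulum at eros. Aenean lacinia";

// Hypothetical helper: byte length of the text once SGR sequences (ESC [ ... m) are removed.
function visibleLength(string $text): int
{
    return strlen((string) preg_replace('/\x1b\[[0-9;]*m/', '', $text));
}

echo strlen($plain), PHP_EOL;         // 76
echo strlen($styled), PHP_EOL;        // 91 = 76 + 7 + 8 bytes of escape codes
echo visibleLength($styled), PHP_EOL; // 76
```

strlen() counts bytes, so this only matches the on-screen width for single-byte text; multibyte text would need something like mb_strwidth() instead.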
In other parts of the app, like a table, I "solved" this by having a method to set the column value and then a separate method to style said column. That way, I can reliably get the length, but also output the text in the defined style.
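Roughly, the table approach looks like the sketch below. styledCell() and its parameters are hypothetical names for illustration, not the app's actual API; the point is only that padding happens on the plain value before any escape codes are added.

```php
<?php
// Hypothetical helper (not the real table API): pad first, style second.
function styledCell(string $value, int $width, string $open, string $close): string
{
    // Pad the plain value, so the width is computed on visible characters only.
    $padded = str_pad($value, $width);

    // Wrap the already-padded text in the opening and closing escape codes.
    return $open . $padded . $close;
}

$cell = styledCell('porta', 10, "\x1b[34;1m", "\x1b[39;22m");
echo '|', $cell, '|', PHP_EOL;
```

This works for a table because each cell is a single value with a single style, which is not the case for free text passed to line().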
I could pass both an unstyled version and a styled version of the text, but that doesn't feel right. Nor does it solve the problem of then splitting the styled version accurately.
To solve the issue with line(), I thought about stripping out the ANSI escape codes to get the actual length, adding the PHP_EOL break where needed, and then injecting the styles back in. That doesn't feel like the right solution, and it seems complicated. How would I even go about doing that?
So my question is: How can I reliably split text containing ANSI escape codes based on text length?
This is the input:
$styledText = "Morbi leo risus, \x1b[34;1mporta ac consectetur\x1b[39;22m ac, vestibulum at eros. Aenean lacinia bibendum nulla sed consectetur. Nullam id dolor id nibh ultricies vehicula ut id elit. Aenean lacinia bibendum nulla sed consectetur. Curabitur blandit tempus porttitor.";
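The following function strips the escape codes from the styled text and keeps the result as a clean copy.
The clean copy is wrapped with wordwrap() at the desired column width.
The function then loops over the words of the styled text and inserts a line break after every word that wordwrap() followed with a line break in the clean copy.
This works because the escape codes contain no spaces, so both versions split into the same number of words.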
function wrap(string $styledText): string {
    // Strip ANSI escape codes from $styledText
    // ("*" rather than "+" so a bare reset, ESC [ m, is stripped too)
    $cleanText = preg_replace('/\\x1b\[[0-9;]*m/', '', $styledText);
    // Add PHP_EOL to ensure $cleanText does not exceed line width;
    // the trailing space keeps the word count identical after explode()
    $cleanWrappedText = wordwrap($cleanText, 80, PHP_EOL . ' ');
    // Split $styledText and $cleanWrappedText on each space
    $styledTextArray = explode(' ', $styledText);
    $cleanTextArray = explode(' ', $cleanWrappedText);
    // $fusedText will comprise $styledText w/ line breaks from $cleanWrappedText
    $fusedText = '';
    // Loop over each segment (likely a word)
    foreach ($styledTextArray as $index => $segment) {
        // Append word (with ANSI escape codes)
        $fusedText .= $segment;
        // If word has line break in clean version then add line break
        if (str_ends_with($cleanTextArray[$index], PHP_EOL)) {
            $fusedText .= PHP_EOL;
            continue;
        }
        // If word does not have line break in clean version,
        // but there is another word coming, then add space between words
        if (isset($cleanTextArray[$index + 1])) {
            $fusedText .= ' ';
        }
    }
    return $fusedText;
}
Note that this can't easily be tested on the web, since the escape codes only style text appropriately when used via a CLI.
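The wrapping itself can still be checked without a terminal: strip the codes from the wrapped result and compare it with wordwrap() applied to the clean text. The sketch below does that with a greedy variant that measures each word's visible width directly, rather than the wrap() function above. stripAnsi() and wrapAnsi() are hypothetical names, and the code assumes single spaces between words, single-byte text, and only ESC [ ... m sequences.

```php
<?php
// Remove SGR escape sequences (ESC [ ... m) so only visible text remains.
function stripAnsi(string $text): string
{
    return (string) preg_replace('/\x1b\[[0-9;]*m/', '', $text);
}

// Greedy word wrap that ignores escape sequences when measuring width.
function wrapAnsi(string $styledText, int $width = 80): string
{
    $lines = [];
    $line = '';
    $lineWidth = 0;

    foreach (explode(' ', $styledText) as $word) {
        $wordWidth = strlen(stripAnsi($word));

        if ($line === '') {
            // First word on the line always goes in, even if it is too long.
            $line = $word;
            $lineWidth = $wordWidth;
        } elseif ($lineWidth + 1 + $wordWidth > $width) {
            // Word plus separating space would overflow: start a new line.
            $lines[] = $line;
            $line = $word;
            $lineWidth = $wordWidth;
        } else {
            $line .= ' ' . $word;
            $lineWidth += 1 + $wordWidth;
        }
    }

    if ($line !== '') {
        $lines[] = $line;
    }

    return implode(PHP_EOL, $lines);
}

$styledText = "Morbi leo risus, \x1b[34;1mporta ac consectetur\x1b[39;22m ac, vestibulum at eros. Aenean lacinia bibendum nulla sed consectetur. Nullam id dolor id nibh ultricies vehicula ut id elit. Aenean lacinia bibendum nulla sed consectetur. Curabitur blandit tempus porttitor.";

$wrapped = wrapAnsi($styledText);

// Terminal-free check: the visible text should wrap exactly like the clean text does.
var_dump(stripAnsi($wrapped) === wordwrap(stripAnsi($styledText), 80, PHP_EOL)); // bool(true)
```

A style that is still open at a line break simply continues on the next line, which terminals handle fine; if each line had to be self-contained, the active codes would need to be closed and reopened around the break.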