I have a lot of text in a CSV file in the following format:
ab(cd)(ef)ghi
Each line can contain such text any number of times, and the file has about 2,000 to 3,000 lines.
I want to replace each such piece of text with the following format:
abcdghi, abefghi
I am currently trying the manual method (editing in Brackets) on a Mac.
Any suggestions? A simple text-manipulation tool or PHP-based code would be most welcome.
Thank you in advance for your help.
Try this PHP script:
<?php
$inputFile = 'input.csv';
$outputFile = 'output.csv';
$lines = file($inputFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
if ($lines === false) {
    fwrite(STDERR, "Could not read $inputFile\n");
    exit(1);
}
// One token looks like ab(cd)(ef)ghi: prefix, one or more (...) groups, suffix.
// Assumption: prefix and suffix contain no whitespace, commas, or parentheses.
$pattern = '/([^\s,()]*)((?:\([^()]*\))+)([^\s,()]*)/';
$output = [];
foreach ($lines as $line) {
    // Expand every token on the line; a line with no match is kept as-is
    $output[] = preg_replace_callback($pattern, function ($m) {
        preg_match_all('/\(([^()]*)\)/', $m[2], $groups); // Grab what's inside ()
        $variations = [];
        foreach ($groups[1] as $insert) {
            $variations[] = $m[1] . $insert . $m[3];
        }
        return implode(', ', $variations);
    }, $line);
}
// Save the result
file_put_contents($outputFile, implode(PHP_EOL, $output));
echo "Done! Output saved to $outputFile\n";
?>
Save this PHP code in a file, e.g., transform.php.
Create an input.csv file in the same folder with lines like ab(cd)(ef)ghi.
Run the script from the Terminal (on your Mac):
php transform.php
You'll get an output.csv file with lines like:
abcdghi, abefghi
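Since a line can contain the pattern any number of times, it may help to try the expansion on a few sample strings before running it over the whole file. The following is a minimal sketch, assuming each token's prefix and suffix contain no whitespace, commas, or parentheses; `expandLine` is a hypothetical helper name used only for illustration, and the sample strings are made up:

```php
<?php
// Hypothetical helper: expands every prefix(a)(b)suffix token in one line.
// Assumption: prefix and suffix contain no whitespace, commas, or parentheses.
function expandLine(string $line): string
{
    $pattern = '/([^\s,()]*)((?:\([^()]*\))+)([^\s,()]*)/';
    $result = preg_replace_callback($pattern, function (array $m): string {
        // Collect the text inside each pair of parentheses
        preg_match_all('/\(([^()]*)\)/', $m[2], $groups);
        $variations = [];
        foreach ($groups[1] as $insert) {
            // Rebuild the token once per parenthesized alternative
            $variations[] = $m[1] . $insert . $m[3];
        }
        return implode(', ', $variations);
    }, $line);
    // preg_replace_callback returns null on a regex error; fall back to the input
    return $result ?? $line;
}

echo expandLine('ab(cd)(ef)ghi'), PHP_EOL;              // abcdghi, abefghi
echo expandLine('ab(cd)(ef)ghi xy(1)(2)(3)z'), PHP_EOL; // abcdghi, abefghi xy1z, xy2z, xy3z
echo expandLine('no parentheses here'), PHP_EOL;        // no parentheses here
```

If your real data uses a different separator between tokens, the character classes in the pattern would need adjusting to match.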