I need a preg_match expression to remove all the timings from a .srt subtitle file (imported as a string), but I have never quite got my head around regex patterns. For example, it would change:
5
00:05:50,141 --> 00:05:54,771
This is what was said
to
This is what was said
Not sure where you got stuck; the pattern is really just \d+ plus colons and commas.
$re = '/\d+.\d+:\d+:\d+,\d+\s-->\s\d+:\d+:\d+,\d+./s';
//$re = '/\d+.[0-9:,]+\s-->\s[\d:,]+./s'; // slightly more compact version of the regex
$str = '5
00:05:50,141 --> 00:05:54,771
This is what was said';
$subst = '';
$result = preg_replace($re, $subst, $str);
echo $result;
Working demo here.
With the slightly more compact pattern it looks like this: https://regex101.com/r/QY9QXG/2
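If the file has several subtitle groups or Windows line endings, a variation along these lines might be more forgiving. This is only a sketch, not the pattern from the demo above: the function name stripSrtTimings is hypothetical, and it swaps the `.` for `\R` (any newline sequence) and anchors on the start of a line with the `m` flag. It assumes timing lines look exactly like `hh:mm:ss,mmm --> hh:mm:ss,mmm`.

```php
<?php
// Hypothetical helper, not part of the original answer.
function stripSrtTimings(string $srt): string
{
    // ^\d+\R   the cue number alone on its line, followed by any newline
    // then     hh:mm:ss,mmm --> hh:mm:ss,mmm
    // \R       the newline that ends the timing line (\n or \r\n)
    $re = '/^\d+\R\d+:\d{2}:\d{2},\d{3}\s-->\s\d+:\d{2}:\d{2},\d{3}\R/m';

    // preg_replace returns null on a regex error; fall back to the input.
    return preg_replace($re, '', $srt) ?? $srt;
}
```

Calling stripSrtTimings($str) on the example string above should leave only the spoken text, and the newlines between the remaining lines are kept as they were in the input.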
Here is a non-regex version:
$str = "1
00:05:50,141 --> 00:05:54,771
This is what was said1

2
00:05:50,141 --> 00:05:54,771
This is what was said2

3
00:05:50,141 --> 00:05:54,771
This is what was said3

4
00:05:50,141 --> 00:05:54,771
This is what was said4
LLLL

5
00:05:50,141 --> 00:05:54,771
This is what was said5";
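// Split into subtitle groups on the blank line between them.
$count = explode(PHP_EOL.PHP_EOL, $str);
foreach($count as &$line){
    // Drop the first two lines (number and timing) and rejoin the rest.
    $line = implode(PHP_EOL, array_slice(explode(PHP_EOL, $line), 2));
}
unset($line); // break the reference left over from the foreach
echo implode(PHP_EOL.PHP_EOL, $count);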
The non-regex version first splits on double newlines, so each subtitle group becomes one item in the array.
Then it loops through the groups and explodes each one again on a single newline.
The first two lines (the number and the timing) are not wanted, so array_slice removes them.
A subtitle may span more than one line, so the remaining lines are merged back together with implode on a newline.
As the last step, the string is rebuilt with implode on a double newline.
As Casimir wrote in the comments below, I have used PHP_EOL as the newline, and that works in the example.
But in a real .srt file the newline may be different.
If the code does not work as expected, try replacing PHP_EOL with another newline sequence such as "\r\n" or "\n".
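One way to avoid guessing the newline is to normalize it before splitting. The following is a rough sketch under a few assumptions: the function name stripSrtHeaders is made up for illustration, the output always uses "\n" regardless of what the input used, and groups are separated by exactly one blank line.

```php
<?php
// Hypothetical helper, not part of the original answer.
function stripSrtHeaders(string $srt): string
{
    // Convert Windows (\r\n) and old Mac (\r) newlines to \n first.
    $srt = str_replace(["\r\n", "\r"], "\n", $srt);

    // Each subtitle group is separated from the next by one blank line.
    $groups = explode("\n\n", trim($srt));

    foreach ($groups as $i => $group) {
        // Drop the number and timing lines, keep every text line.
        $groups[$i] = implode("\n", array_slice(explode("\n", $group), 2));
    }

    return implode("\n\n", $groups);
}
```

Indexing into $groups instead of using a reference in the foreach means there is no leftover reference to unset afterwards.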