javascript jquery regex match

Regex that splits long text into separate sentences with match()


This is a textarea where the user writes some text. I've written an example in it.

<textarea id="text">First sentence. Second sentence? Third sentence!
Fourth sentence.

Fifth sentence
</textarea>

Requirements already considered in the regex

Missing requirement (I need help with this) <<

Each line break should be represented by an empty string in the array. Applying the regex to the text above should give this result:

["First sentence.", "Second sentence?", "Third sentence!", "", "Fourth sentence.", "", "", "Fifth sentence"]

Instead, I'm receiving this:

["First sentence.", "Second sentence?", "Third sentence!", "Fourth sentence.", "Fifth sentence"]

This is the regex and match call:

var tregex = /[^\r\n.!?]+(?:(?:\r\n|[\r\n]|[.!?])+|$)/gi;
var sentences = $('#text').val().match(tregex).map($.trim);

Any ideas? Thanks!


Solution

  • I simplified it a lot. Match either a line break on its own, or a sentence that ends in punctuation or at the end of a line:

    var tregex = /\n|([^\r\n.!?]+([.!?]+|$))/gim;
    

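    The m (multiline) flag is important here: it lets $ match at the end of each line, so a line with no closing punctuation is still matched even when more text follows it.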
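
To check the result without a page or jQuery, here is a minimal sketch that runs in Node or a browser console. The `text` constant and the `splitSentences` helper are hypothetical stand-ins for `$('#text').val()` and the `match(...).map($.trim)` call. The regex is a variant of the one above: the groups are non-capturing and the i flag is dropped, since neither changes what `match()` returns when the g flag is set.

```javascript
// Stand-in for $('#text').val(); note there is no newline after "Fifth sentence".
const text = "First sentence. Second sentence? Third sentence!\nFourth sentence.\n\nFifth sentence";

// Either a single line break, or a run of non-terminator characters that ends
// in punctuation or at the end of a line (the m flag lets $ match there).
const tregex = /\n|[^\r\n.!?]+(?:[.!?]+|$)/gm;

function splitSentences(input) {
  // match() returns null when nothing matches, so fall back to an empty array.
  const parts = input.match(tregex) || [];
  // Trimming turns each "\n" match into "" and strips the space before a sentence.
  return parts.map((part) => part.trim());
}

console.log(splitSentences(text));
// ["First sentence.", "Second sentence?", "Third sentence!", "", "Fourth sentence.", "", "", "Fifth sentence"]
```

If the textarea value ends with a line break, as the markup in the question likely does, the array gains one extra "" at the end. Removing a single trailing line break from the input before matching should avoid that.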