php html domparser ucfirst sentencecase

How can I make the first letter of every sentence in a paragraph uppercase?


I got this function from php.net. It converts a string to sentence case: everything is lowercased, then the first letter of each sentence is capitalized.

function sentence_case($string) {
    $sentences = preg_split('/([.?!]+)/', $string, -1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
    $new_string = '';
    foreach ($sentences as $key => $sentence) {
        $new_string .= ($key & 1) == 0
            ? ucfirst(strtolower(trim($sentence)))
            : $sentence . ' ';
    }
    return trim($new_string);
}

If the text is not wrapped in HTML, everything works well. But if it sits inside a paragraph, the first letter after an opening paragraph (<p>) or line break (<br>) tag becomes lowercase.

This is the sample:

Before:

<p>Lorem IPSUM is simply dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>

Output:

<p>lorem ipsum is simply dummy text. Lorem ipsum is simply dummy text! What is lorem ipsum? Hello lorem ipsum!</p>

Can someone help me make the first letter of the paragraph a capital letter?


Solution

  • Your problem is that you're treating the HTML as part of the sentence, so the first "word" of the sentence is <p>lorem, not lorem. ucfirst() only looks at the first character, which is <, so it changes nothing.

    You can change the regexp to /([>.?!]+)/, but then you'll see an extra space before "Lorem", because the function now sees two sentences (<p and the text after it) instead of one.

    Also, Hello <em>there</em> would now be split at every >, so "there" would be capitalized as if it started a new sentence.

    This looks disturbingly like a case of "How can I use regexp to interpret (X)HTML?".
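
One way around the problem described above is to keep the tags out of the sentence logic altogether: copy every tag through unchanged and apply the casing only to the text between tags, carrying the "start of sentence" state across tag boundaries. The following is a minimal sketch of that idea, not a robust HTML parser. The function name sentence_case_html is hypothetical (it does not come from php.net or the question), and the code assumes simple markup with no > inside attribute values, no HTML entities at the start of a sentence, and ASCII text. For real-world HTML, a proper DOM parser would be the safer route.

```php
<?php
// Hypothetical helper, not part of the original post: sentence-case the text
// of an HTML fragment while leaving the tags themselves untouched.
function sentence_case_html(string $html): string
{
    // Split on tags and keep them (DELIM_CAPTURE). Without NO_EMPTY the
    // result strictly alternates: text, tag, text, tag, ..., text.
    // Assumption: no '>' appears inside attribute values.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    $atStart = true; // true while the next letter or digit begins a sentence

    foreach ($parts as $index => $part) {
        if ($index % 2 === 1) {
            $out .= $part; // odd index: a tag, copied through unchanged
            continue;
        }

        // Even index: plain text (possibly empty). Lowercase it, then
        // re-capitalize the first letter of each sentence.
        $text = strtolower($part);
        $length = strlen($text);
        for ($i = 0; $i < $length; $i++) {
            $ch = $text[$i];
            if ($atStart && ctype_alnum($ch)) {
                $text[$i] = strtoupper($ch);
                $atStart = false;
            } elseif ($ch === '.' || $ch === '?' || $ch === '!') {
                $atStart = true; // sentence ended; state survives into later parts
            }
        }
        $out .= $text;
    }

    return $out;
}
```

Called on the sample from the question, this returns <p>Lorem ipsum is simply dummy text. Lorem ipsum is simply dummy text! What is lorem ipsum? Hello lorem ipsum!</p>, and Hello <em>there</em> comes back unchanged because the <em> tag no longer counts as a sentence boundary. Like the original function, it still treats every ., ? or ! as the end of a sentence, so abbreviations such as "e.g." will be capitalized incorrectly.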