In Perl, there's the ucfirst function.
Is it equivalent to this:
sub uppercase {
    my ($W) = @_;
    $$W = uc(substr($$W, 0, 1)) . substr($$W, 1);
}
Does it matter across Perl versions?
To put the question in context (https://github.com/moses-smt/mosesdecoder/pull/206/files#diff-876e51db2a1ab71c1ae736182d1e5e04R63), uppercase was previously used like this:
sub process {
    my $line = $_[0];
    chomp($line);
    $line =~ s/^\s+//;
    $line =~ s/\s+$//;
    my @WORD = split(/\s+/,$line);
    # uppercase at sentence start
    my $sentence_start = 1;
    for(my $i=0;$i<scalar(@WORD);$i++) {
        &uppercase(\$WORD[$i]) if $sentence_start;
        if (defined($SENTENCE_END{ $WORD[$i] })) { $sentence_start = 1; }
        elsif (!defined($DELAYED_SENTENCE_START{$WORD[$i] })) { $sentence_start = 0; }
    }
    # uppercase headlines {
    if (defined($SRC) && $HEADLINE[$sentence]) {
        foreach (@WORD) {
            &uppercase(\$_) unless $ALWAYS_LOWER{$_};
        }
    }
But it seems that replacing &uppercase(\$WORD[$i]) and &uppercase(\$_) with ucfirst(\$WORD[$i]) and ucfirst(\$_) does not behave the same way.
ucfirst is not equivalent to the following:
sub uppercase {
    my ($W) = @_;
    $$W = uc(substr($$W, 0, 1)) . substr($$W, 1);
}
ucfirst is mostly[1] equivalent to the following:
sub ucfirst {
    my ($W) = @_;
    return uc(substr($W, 0, 1)) . substr($W, 1);
}
If you wanted to rewrite uppercase in terms of ucfirst, it would look like this:
sub uppercase {
    my ($W) = @_;
    $$W = ucfirst($$W);
}
uppercase(\$string);
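As a quick check of the difference in calling convention, the sketch below runs the question's uppercase (which takes a reference and modifies the string in place) next to the built-in ucfirst (which takes a string and returns a new one). The variable names and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# The in-place version from the question: expects a reference to a scalar.
sub uppercase {
    my ($W) = @_;
    $$W = uc(substr($$W, 0, 1)) . substr($$W, 1);
}

# Hypothetical sample data.
my $in_place = "hello world";
uppercase(\$in_place);                # modifies $in_place through the reference

my $original = "hello world";
my $returned = ucfirst($original);    # returns a new string; $original is unchanged

print "$in_place\n";    # Hello world
print "$returned\n";    # Hello world
print "$original\n";    # hello world
```

For plain ASCII input both approaches yield the same text; only the way the result reaches the caller differs.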
That means that if you wanted to eliminate uppercase entirely, you'd replace
uppercase(\$string);
with
$string = ucfirst($string); # Correct
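Applied to the two call sites in the question, the same substitution might look like the sketch below. The contents of %SENTENCE_END and %ALWAYS_LOWER and the sample words are assumptions made for this example, and the $DELAYED_SENTENCE_START check from the original loop is left out for brevity.

```perl
use strict;
use warnings;

# Hypothetical lookup tables; the real script builds its own.
my %SENTENCE_END = map { $_ => 1 } ('.', '?', '!');
my %ALWAYS_LOWER = map { $_ => 1 } qw(a of the);

# Sentence starts: assign the result of ucfirst back to the array element.
my @WORD = qw(hello world . how are you ?);
my $sentence_start = 1;
for my $i (0 .. $#WORD) {
    $WORD[$i] = ucfirst($WORD[$i]) if $sentence_start;
    $sentence_start = defined $SENTENCE_END{ $WORD[$i] } ? 1 : 0;
}
print "@WORD\n";        # Hello world . How are you ?

# Headlines: $_ is an alias for each element, so assigning to it updates the array.
my @HEAD = qw(the history of the decoder);
foreach (@HEAD) {
    $_ = ucfirst($_) unless $ALWAYS_LOWER{$_};
}
print "@HEAD\n";        # the History of the Decoder
```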
You tried using
ucfirst(\$string); # Wrong
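The reason that call does nothing useful is that ucfirst never modifies its argument: it returns a new string, and a reference passed to it is stringified instead of being dereferenced. A minimal sketch, with a made-up variable, that captures the return value to show what actually happens:

```perl
use strict;
use warnings;

my $string = "hello";    # hypothetical input

# The reference is stringified to something like "SCALAR(0x...)", and ucfirst
# returns that text with its first character uppercased (it already is).
my $result = ucfirst(\$string);

print "$string\n";       # hello (unchanged)
print "looks like a stringified reference\n"
    if $result =~ /^SCALAR\(0x[0-9a-f]+\)$/;
```

When the return value is not captured at all, as in the call above, it is simply thrown away.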
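[1] ucfirst actually does a better job of handling more esoteric characters such as U+01F3 LATIN SMALL LETTER DZ ("ǳ"): it converts the first character to titlecase (U+01F2, "ǲ"), whereas uc converts it to uppercase (U+01F1, "Ǳ").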
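The difference can be observed by comparing code points, as in the sketch below. The test string is invented for illustration; it starts with U+01F3 followed by an ordinary letter.

```perl
use strict;
use warnings;

# Hypothetical test string: U+01F3 LATIN SMALL LETTER DZ followed by "a".
my $word = "\x{01F3}a";

my $via_uc      = uc(substr($word, 0, 1)) . substr($word, 1);   # uppercase mapping
my $via_ucfirst = ucfirst($word);                                # titlecase mapping

printf "uc + substr: U+%04X\n", ord($via_uc);        # U+01F1
printf "ucfirst:     U+%04X\n", ord($via_ucfirst);   # U+01F2
```

Using code points rather than printing the characters themselves avoids having to configure an output encoding for the example.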