I have a simple POD text file:
$ cat test.pod
=encoding UTF-8

Münster
It is encoded in UTF-8, as per this literal hex dump of the file:
00000000 3d 65 6e 63 6f 64 69 6e 67 20 55 54 46 2d 38 0a |=encoding UTF-8.|
00000010 0a 4d c3 bc 6e 73 74 65 72 0a |.M..nster.|
0000001a
The "ü" is being encoded as the two bytes C3 and BC.
But when I run perldoc on the file, it turns my lovely formatted UTF-8 characters into ASCII.
What's more, it is correctly handling the German language convention of representing "ü" as "ue".
$ perldoc test.pod | cat
TEST(1) User Contributed Perl Documentation TEST(1)
Muenster
perl v5.16.3 2014-06-10 TEST(1)
Why is it doing this?
Is there an additional declaration I can put into my file to stop it from happening?
After additional investigation with App::perlbrew, I've found that the difference comes down to which version of Pod::Perldoc is installed.
Perl version   Pod::Perldoc   Output
perl-5.10.1    3.14_04        Muenster
perl-5.12.5    3.15_02        Muenster
perl-5.14.4    3.15_04        Muenster
perl-5.16.2    3.17           Münster
perl-5.16.3    3.19           Muenster
perl-5.16.3    3.17           Münster
perl-5.17.3    3.17           Münster
perl-5.18.0    3.19           Muenster
perl-5.18.1    3.23           Münster
However I would still like, if possible, a way to make Pod::Perldoc 3.14, 3.15, and 3.19 behave "correctly".
I found this RT ticket: http://rt.cpan.org/Public/Bug/Display.html?id=39000
This "bug" seems to have been introduced with Perl 5.10 and was perhaps fixed in later versions.
Also see: "How can I use Unicode characters in Perl POD-derived man pages?" and "incorrect behaviour of perldoc with UTF-8 texts".
You should add the latest available version of Pod::Perldoc as a dependency.
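As a rough sketch of what declaring that dependency could look like, assuming the distribution is built with ExtUtils::MakeMaker: the distribution name and version below are hypothetical placeholders, and the 3.23 floor is only an assumption based on it being the newest version in the question's table that printed "Münster".

```perl
# Makefile.PL (sketch only; NAME, VERSION and the version floor are assumptions)
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME      => 'My::Module',   # hypothetical distribution name
    VERSION   => '0.01',         # hypothetical distribution version
    PREREQ_PM => {
        # Require at least the newest Pod::Perldoc from the question's table
        # that rendered "Münster" unchanged.
        'Pod::Perldoc' => '3.23',
    },
);
```

This only affects systems where the distribution is installed through a CPAN client that resolves prerequisites; it does not change how an already installed older perldoc behaves.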