java maven maven-javadoc-plugin

maven-javadoc-plugin error javadoc: error - cannot read Input length = 1 with non-ASCII characters in directory name


I'm using OpenJDK 11 on Windows 10. I have a very simple POM, for a single Java file, that generates Javadocs. Here is an extract:

<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <maven.compiler.source>11</maven.compiler.source>
  <maven.compiler.target>11</maven.compiler.target>
</properties>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <version>3.0.1</version>
      <executions>
        <execution>
          <goals>
            <goal>jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

Strangely, just running mvn clean package causes an error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.0.1:jar (default) on project foobar: MavenReportException: Error while generating Javadoc:
[ERROR] Exit code: 1 - javadoc: error - cannot read Input length = 1
[ERROR]
[ERROR] Command line was: C:\bin\jdk-11\bin\javadoc.exe @options @packages

In target/apidocs there are only three files: javadoc.bat, options, and packages. The options file is the most interesting. It explicitly says UTF-8 everywhere, as it should. But look at these lines:

-sourcepath
C:/projects/li��o 1/src/main/java

This project is in C:\projects\lição 1. It appears that somewhere along the chain Java or Maven or the Javadoc plugin didn't correctly convert the directory name to UTF-8.

Sure enough, when I renamed the directories in Windows to remove the non-ASCII characters, mvn clean package worked just fine.

This would seem like a blatant bug; once Maven starts, everything should be UTF-8 throughout. Is it a problem with the Javadoc plugin? Anyone have an idea where this originates? Where should I file a bug ticket? Or am I doing something wrong?


Solution

  • As you say, this looks like a problem with the encoding used to write the files in target/apidocs.

    Looking through the source for the maven-javadoc-plugin, it is just using the platform encoding when writing these files - e.g. this line.

    Setting the encoding explicitly when invoking Maven fixed the example above for me:

    mvn clean package -Dfile.encoding=UTF-8
    

    This feels more like a workaround than a proper fix, though: it assumes that nothing else in the Maven build depends on the platform encoding.

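    I think the cause is a change in the JDK between 8 and 9. The code (actually under javac) that parses argument files (e.g. @options in the javadoc command line) switched from using the platform encoding to calling Files.newBufferedReader(). The documentation for Files.newBufferedReader(Path) states that it uses UTF-8 when no charset is specified. This means argument files, for both javac and javadoc, must now be encoded in UTF-8. If the plugin writes them in a non-UTF-8 platform encoding, any non-ASCII path becomes a byte sequence that javadoc cannot decode, which would explain the "Input length = 1" message (that is the message text of a MalformedInputException).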
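
The mismatch can be reproduced outside Maven. The following is only a minimal sketch, not the plugin's or javadoc's actual code: the class and method names (ArgFileEncodingDemo, tryRead) are made up for illustration, and ISO-8859-1 stands in for a single-byte Windows platform encoding. It writes the -sourcepath line from the question to a temporary file and reads it back with Files.newBufferedReader(Path), which decodes as UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Maven, the plugin, or the JDK tools.
public class ArgFileEncodingDemo {

    // Reads the first line the same way the answer describes:
    // Files.newBufferedReader(Path) with no charset, i.e. UTF-8.
    static String readFirstLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            return reader.readLine();
        }
    }

    // Writes the given bytes to a temporary "options" file, then tries to read it back.
    static String tryRead(byte[] bytes) throws IOException {
        Path file = Files.createTempFile("options", ".txt");
        try {
            Files.write(file, bytes);
            return readFirstLine(file);
        } catch (MalformedInputException e) {
            // The bytes are not valid UTF-8, so decoding fails.
            return "error: " + e.getMessage();
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        // "lição" written with Unicode escapes so the source file encoding does not matter.
        String line = "C:/projects/li\u00e7\u00e3o 1/src/main/java";

        // Assumption: ISO-8859-1 models a non-UTF-8 platform encoding.
        System.out.println(tryRead(line.getBytes(StandardCharsets.ISO_8859_1)));
        // prints "error: Input length = 1"

        // Written as UTF-8, the same line round-trips unchanged.
        System.out.println(tryRead(line.getBytes(StandardCharsets.UTF_8)).equals(line));
        // prints "true"
    }
}
```

The first call fails with the same "Input length = 1" text that appears in the javadoc error, which is consistent with the explanation above; the second shows that the path itself is fine once the file is written as UTF-8.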