Apache commons-compress


I am using commons-compress to process tarball files and noticed that even files which are not tar archives seem to be processed. Why is this? Is there a better library for detecting valid tar files?

 <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-compress</artifactId>
      <version>1.20</version>
 </dependency>

bug689.csv is a CSV file. The test fails because te.isFile() apparently returns true, and te.getName() seems to return the contents of the CSV. Is this a bug, or am I using the package incorrectly? I'd expect the InputStream not to be successfully converted to a TarArchiveEntry.

    @Test
    public void testTarball() throws IOException{
        InputStream tarData = this.getClass().getResourceAsStream("/bug689.csv");
        TarArchiveInputStream tis = new TarArchiveInputStream(tarData);
        TarArchiveEntry te = tis.getNextTarEntry();
        assertFalse(te.isFile());
    }


Solution

  • If you are not dealing with a tar file, then tis.getNextTarEntry() may return null, so you have to check for that explicitly before calling any method on the entry.

    But if you do have a valid tar file, beware of relying on te.isFile(). The first entry in your tar may not be a regular file. It may be a directory or some other entry type.

    The tar file may even be empty - in which case tis.getNextTarEntry() will again be null.

    If you want to only test for a tar containing one regular file, then I see no issue with using te.isFile().
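
    If you want to reject non-tar input before handing it to TarArchiveInputStream, one option is to sanity-check the first 512-byte header yourself. The following is only a minimal sketch using the JDK alone, not part of commons-compress: the class name TarHeaderCheck and its methods are hypothetical. It assumes the standard tar header layout, in which the checksum is stored as octal text at offset 148 (8 bytes) and equals the sum of all header bytes with the checksum field counted as spaces, and the "ustar" magic sits at offset 257 for POSIX archives. An empty archive (all-zero first block) is reported as "not tar" here, so treat that case separately if you need it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: checks whether a stream starts with a plausible tar header. */
public class TarHeaderCheck {
    private static final int RECORD_SIZE = 512;
    private static final int CHKSUM_OFFSET = 148;
    private static final int CHKSUM_LENGTH = 8;
    private static final int MAGIC_OFFSET = 257;

    /** Reads the first 512 bytes and validates the header checksum. Consumes the stream. */
    public static boolean looksLikeTar(InputStream in) throws IOException {
        byte[] header = new byte[RECORD_SIZE];
        int total = 0;
        while (total < RECORD_SIZE) {
            int n = in.read(header, total, RECORD_SIZE - total);
            if (n < 0) {
                return false; // shorter than one tar record
            }
            total += n;
        }
        return hasValidChecksum(header);
    }

    /** True for POSIX (ustar) and GNU headers; old v7 archives have no magic. */
    public static boolean hasUstarMagic(byte[] header) {
        String magic = new String(header, MAGIC_OFFSET, 5, StandardCharsets.US_ASCII);
        return "ustar".equals(magic);
    }

    /** Sum of all header bytes (unsigned), counting the checksum field as spaces. */
    public static long computeChecksum(byte[] header) {
        long sum = 0;
        for (int i = 0; i < RECORD_SIZE; i++) {
            boolean inChecksumField = i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_LENGTH;
            sum += inChecksumField ? ' ' : (header[i] & 0xFF);
        }
        return sum;
    }

    /** Parses the stored octal checksum and compares it with the computed one. */
    public static boolean hasValidChecksum(byte[] header) {
        long stored = 0;
        boolean sawDigit = false;
        for (int i = CHKSUM_OFFSET; i < CHKSUM_OFFSET + CHKSUM_LENGTH; i++) {
            byte b = header[i];
            if (b >= '0' && b <= '7') {
                stored = stored * 8 + (b - '0');
                sawDigit = true;
            } else if (b == ' ' || b == 0) {
                if (sawDigit) {
                    break; // trailing NUL or space terminates the number
                }
            } else {
                return false; // arbitrary text such as CSV content
            }
        }
        return sawDigit && stored == computeChecksum(header);
    }
}
```

    Because looksLikeTar consumes the bytes it reads, you would run it on a separate stream (or on a BufferedInputStream with mark/reset) before opening the TarArchiveInputStream. The checksum test works for old-style archives too; hasUstarMagic can be added as a stricter check if you only expect POSIX or GNU tarballs.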