Tags: java, compression, apache-commons, apache-commons-compress, bzip

How to decompress BZIP (not BZIP2) with Apache Commons


I am working on a task to decompress archives in several formats: zip, tar, tbz and tgz. Everything works except tbz, because the Apache Commons Compress library only provides a BZIP2 compressor, and I need to decompress files in the old BZIP format, not BZIP2. Is there any way to do this in Java? Below is the code I have so far for extracting the different tar archives with the Apache Commons library.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

public List<ArchiveFile> processTarFiles(String compressedFilePath, String fileType) throws IOException {
    List<ArchiveFile> extractedFileList = new ArrayList<>();
    // try-with-resources closes every stream, even if extraction fails part-way
    try (InputStream fileIn = new BufferedInputStream(new FileInputStream(compressedFilePath))) {
        InputStream tarIn;
        if (fileType.equalsIgnoreCase("tar")) {
            tarIn = fileIn;
        } else if (fileType.equalsIgnoreCase("tbz") || fileType.equalsIgnoreCase("bz")) {
            // handles BZIP2 only; this is the branch that fails for old BZIP input
            tarIn = new BZip2CompressorInputStream(fileIn);
        } else if (fileType.equalsIgnoreCase("tgz") || fileType.equalsIgnoreCase("gz")) {
            tarIn = new GzipCompressorInputStream(fileIn);
        } else {
            throw new IllegalArgumentException("Unsupported file type: " + fileType);
        }
        try (TarArchiveInputStream is = new TarArchiveInputStream(tarIn)) {
            TarArchiveEntry entry;
            // grab the next tar entry until the archive is exhausted
            while ((entry = is.getNextTarEntry()) != null) {
                File destFile = new File(Constants.DEFAULT_ZIPOUTPUTPATH, entry.getName());
                if (entry.isDirectory()) {
                    destFile.mkdirs();
                    continue;
                }
                // create the parent directory structure if needed
                File destinationParent = destFile.getParentFile();
                if (destinationParent != null) {
                    destinationParent.mkdirs();
                }
                // fixed-size buffer: entry.getSize() can be 0 or larger than an int
                byte[] data = new byte[8192];
                // write the current entry to disk
                try (OutputStream dest = new BufferedOutputStream(new FileOutputStream(destFile))) {
                    int bytesRead;
                    // read() returns -1 once the current entry is fully consumed
                    while ((bytesRead = is.read(data)) != -1) {
                        dest.write(data, 0, bytesRead);
                    }
                }
                ArchiveFile archiveFile = new ArchiveFile();
                archiveFile.setExtractedFilePath(destFile.getAbsolutePath());
                // names without a dot get an empty format instead of throwing
                String[] nameParts = destFile.getName().split("\\.");
                archiveFile.setFormat(nameParts.length > 1 ? nameParts[1] : "");
                extractedFileList.add(archiveFile);
            }
        }
    }
    return extractedFileList;
}

Solution

  • The original bzip reportedly used a patent-encumbered algorithm (arithmetic coding), so bzip2 was created using algorithms and techniques that were not patented.

    That is probably why bzip is no longer in widespread use and why open-source libraries ignore it.

    There is some C code for decompressing bzip files shown here (gist.github.com mirror).

    You may want to read that code and port it to Java.
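
Until such a port exists, it may help to at least detect old bzip input up front, so the extraction code can fail with a clear message instead of an obscure error from the BZIP2 decoder. The following is only a minimal sketch: the class name BzipSniffer and its methods are hypothetical (they are not part of Apache Commons Compress), and it assumes the commonly documented signatures, "BZh" for bzip2 and "BZ0" for the original bzip.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Apache Commons Compress.
final class BzipSniffer {

    enum Kind { BZIP2, LEGACY_BZIP, UNKNOWN }

    // Classifies a stream by its first three bytes.
    // Assumption: bzip2 data starts with "BZh", original bzip data with "BZ0".
    static Kind classify(byte[] header, int length) {
        if (length < 3 || header[0] != 'B' || header[1] != 'Z') {
            return Kind.UNKNOWN;
        }
        if (header[2] == 'h') {
            return Kind.BZIP2;
        }
        if (header[2] == '0') {
            return Kind.LEGACY_BZIP;
        }
        return Kind.UNKNOWN;
    }

    // Peeks at the first three bytes and rewinds, so the caller can still
    // hand the same stream to a decompressor afterwards.
    static Kind sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset, e.g. BufferedInputStream");
        }
        in.mark(3);
        byte[] header = new byte[3];
        int total = 0;
        while (total < header.length) {
            int n = in.read(header, total, header.length - total);
            if (n == -1) {
                break; // stream is shorter than three bytes
            }
            total += n;
        }
        in.reset();
        return classify(header, total);
    }
}
```

In the tbz branch of the question's code, the BufferedInputStream could be passed to BzipSniffer.sniff first, wrapping it in BZip2CompressorInputStream only when the result is BZIP2 and reporting LEGACY_BZIP as unsupported otherwise.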