java php reverse-engineering randomaccessfile

Extracting an archive created via Java RandomAccessFile with PHP


I'm trying to recreate a long-lost PHP website. One of its pages allowed employees to upload archive files that were created by a local script they executed. The webserver would then extract the contents into separate files, to be stored in different folders for other purposes.

Thankfully I have the script that created the archives, but it is in Java. I imagine it can be reversed though? The script they used would basically just call the addFile method below on multiple file paths.

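import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

public class Archive {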
    static void create(File f) throws IOException {
        BufferedOutputStream w = new BufferedOutputStream(new FileOutputStream(f));
        w.write(new byte[]{1, 3, 3, 7});
        w.write(new byte[4]);
        w.close();
    }

    static int addFile(File archive, File add, String name) throws IOException {
        if (!add.exists()) {
            throw new IOException("File to be added does not exist!");
        }
        if (add.isDirectory()) {
            throw new IOException("Cannot add directories!");
        }
        if (!archive.exists()) {
            Archive.create(archive);
        }
        if (archive.isDirectory()) {
            throw new IOException("Archive is no valid archive!");
        }
        RandomAccessFile r = new RandomAccessFile(archive, "rw");
        int code = r.readInt();
        if (code != 16974599) { // 0x01030307: the magic bytes {1, 3, 3, 7} read as a big-endian int
            throw new IOException("Archive is no valid archive!");
        }
        int fileCount = r.readInt();
        r.seek(4);
        r.writeInt(fileCount + 1);
        r.seek(r.length());
        RandomAccessFile bi = new RandomAccessFile(add, "r");
        r.writeInt((int)bi.length());
        r.writeBytes(name);
        r.write(0);
        byte[] swap = new byte[(int)bi.length()];
        bi.readFully(swap);
        r.write(swap);
        bi.close();
        r.close();
        return fileCount + 1;
    }

    public static void main(String[] args) throws IOException {
    }
}

Update:

I have created a function using fread(), but it runs out of memory after the first file. That is with the memory limit temporarily set to 512 MB. Is there an alternative?


Solution

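  • According to the Java code, the file format is as follows:

    The file starts with an 8-byte header: the 4 magic bytes 01 03 03 07 (which readInt() sees as 16974599, i.e. 0x01030307), followed by the number of files as a 32-bit integer. After the header, each file is stored as a 32-bit content length, the file name written as one byte per character, a single 0 byte terminating the name, and then the raw file contents. All integers are big-endian, because that is how RandomAccessFile.writeInt() writes them.

    It is not a real archive format but a simple concatenation of files with some metadata.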
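
    The byte order matters when reading the header from PHP: the unpack() format code for an unsigned 32-bit big-endian integer is N, while V would read the same bytes as little-endian. A minimal sketch of decoding the 8-byte header follows; decodeArchiveHeader is a hypothetical helper name, not part of the original code:

    ```php
    <?php
    // Hypothetical helper (not from the original post): decodes the 8-byte archive header.
    function decodeArchiveHeader(string $headerBytes): array
    {
        if (strlen($headerBytes) !== 8) {
            throw new InvalidArgumentException('Header must be exactly 8 bytes');
        }
        // 'N' = unsigned 32-bit big-endian, matching Java's writeInt()/readInt().
        $header = unpack('Nmagic/NfileCount', $headerBytes);
        if ($header['magic'] !== 0x01030307) { // 16974599 in the Java code
            throw new UnexpectedValueException('Not a valid archive');
        }
        return $header;
    }

    // Header of an archive holding two files: magic bytes 1, 3, 3, 7 followed by the count.
    $sample = "\x01\x03\x03\x07" . pack('N', 2);
    var_dump(decodeArchiveHeader($sample)['fileCount']); // int(2)
    ```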

    To extract the files from such an 'archive', you can use PHP code like this:

    <?php
    class MyArchiveHeader {
        public function __construct(
            private int $typeCode,
            private int $fileCount
        ) {}
    
        public function getTypeCode(): int
        {
            return $this->typeCode;
        }
    
        public function getFileCount(): int
        {
            return $this->fileCount;
        }
    }
    
    class MyArchiveFile {
        public function __construct(
            private string $filename,
            private string $contents
        ) {}
    
        public function getFilename(): string
        {
            return $this->filename;
        }
    
        public function getContents(): string
        {
            return $this->contents;
        }
    }
    
    class MyArchive {
    
        public function __construct(private string $filename) {}
    
        public function extractFiles(string $outputDirectory): void
        {
            if (!is_dir($outputDirectory)) {
                throw new \InvalidArgumentException('Output directory does not exist');
            }
    
            $file = new \SplFileObject($this->filename, 'rb');
    
            $header = $this->parseHeader($file);
    
            $fileCount = $header->getFileCount();
            for ($i = 0; $i < $fileCount; $i++) {
                $parsedFile = $this->parseFile($file);
    
                // basename() keeps names such as "../../x" from escaping the output directory.
                $outputFilename = $outputDirectory . DIRECTORY_SEPARATOR . basename($parsedFile->getFilename());
                file_put_contents($outputFilename, $parsedFile->getContents());
            }
        }
    
        private function parseHeader(\SplFileObject $file): MyArchiveHeader
        {
            $typeCode = $this->readUint32($file, 'file type code');
            if ($typeCode !== 0x01030307) {
                throw new \RuntimeException('Invalid file type code');
            }
    
            $fileCount = $this->readUint32($file, 'file count');
    
            return new MyArchiveHeader($typeCode, $fileCount);
        }
    
        private function readUint32(\SplFileObject $file, string $what): int
        {
            $bytes = $file->fread(4);
            if ($bytes === false || strlen($bytes) !== 4) {
                throw new \RuntimeException("Could not read $what");
            }
    
            // Java's RandomAccessFile.writeInt() stores big-endian values, so use 'N' (not 'V').
            return unpack('N', $bytes)[1];
        }
    
        private function parseFile(\SplFileObject $file): MyArchiveFile
        {
            $fileLength = $this->readUint32($file, 'file length');
    
            // The name is stored as single bytes terminated by a 0 byte.
            $filename = "";
            while (true) {
                $char = $file->fread(1);
                if ($char === false || $char === '') {
                    throw new \RuntimeException('Unexpected end of archive while reading file name');
                }
                if ($char === "\0") {
                    break;
                }
                $filename .= $char;
            }
    
            // TODO Might need to convert $filename to UTF-8, for instance.
    
            // SplFileObject::fread() rejects a length of 0, so handle empty files separately.
            $contents = $fileLength > 0 ? $file->fread($fileLength) : '';
            if ($contents === false || strlen($contents) !== $fileLength) {
                throw new \RuntimeException('Could not read file contents');
            }
    
            return new MyArchiveFile($filename, $contents);
        }
    }
    

    I haven't tested the code, but it should give you a good starting point. You can use it like this:

    $archiveFilename = 'archive';
    $outputDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'extracted';
    if (!is_dir($outputDir)) {
        mkdir($outputDir);
    }
    
    echo "Extracting archive to $outputDir\n";
    $archive = new MyArchive($archiveFilename);
    $archive->extractFiles($outputDir);
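
    Regarding the memory problem from the update: loading each entry completely into a string needs at least as much memory as the largest file, and a length field decoded with the wrong byte order can make fread() request a huge buffer (the bytes 00 00 00 05 read as little-endian give 83886080 instead of 5). That may or may not be what happened in your function. Below is a sketch of a streaming variant that copies each entry straight to disk with stream_copy_to_stream(), so memory use stays small regardless of file size. The function names readExactly and extractArchiveStreaming are hypothetical, and the use of basename() assumes that the stored names are plain file names without subfolders:

    ```php
    <?php
    // Hypothetical streaming extractor (names are illustrative, not from the original post).
    function readExactly($handle, int $length): string
    {
        $data = $length > 0 ? stream_get_contents($handle, $length) : '';
        if ($data === false || strlen($data) !== $length) {
            throw new RuntimeException('Unexpected end of archive');
        }
        return $data;
    }

    function extractArchiveStreaming(string $archivePath, string $outputDirectory): int
    {
        $in = fopen($archivePath, 'rb');
        if ($in === false) {
            throw new RuntimeException("Cannot open $archivePath");
        }
        try {
            // Big-endian ('N') because the archive was written by Java's writeInt().
            $header = unpack('Nmagic/Ncount', readExactly($in, 8));
            if ($header['magic'] !== 0x01030307) {
                throw new RuntimeException('Invalid file type code');
            }
            for ($i = 0; $i < $header['count']; $i++) {
                $length = unpack('N', readExactly($in, 4))[1];
                // Read the 0-terminated name one byte at a time.
                $name = '';
                while (($char = readExactly($in, 1)) !== "\0") {
                    $name .= $char;
                }
                // Assumption: names are plain file names, so basename() loses nothing.
                $out = fopen($outputDirectory . DIRECTORY_SEPARATOR . basename($name), 'wb');
                if ($out === false) {
                    throw new RuntimeException("Cannot create output file for $name");
                }
                // Copies in internal chunks; the whole entry is never held in a PHP string.
                $copied = $length > 0 ? stream_copy_to_stream($in, $out, $length) : 0;
                fclose($out);
                if ($copied !== $length) {
                    throw new RuntimeException("Truncated contents for $name");
                }
            }
            return $header['count'];
        } finally {
            fclose($in);
        }
    }
    ```

    It takes the archive path and an existing output directory, for example extractArchiveStreaming('archive', sys_get_temp_dir()), and returns the number of files it extracted.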