I'm trying to recreate a long-lost PHP website. One of the pages of this website allowed employees to upload archive files that were created by a local script they executed. The webserver would then extract the contents into separate files to be stored in different folders for other purposes.
Thankfully I have the script that created the archives, but it is in Java. I imagine the format can be reversed, though? The script basically just ran the addFile method below on multiple file paths.
public class Archive {
    static void create(File f) throws IOException {
        BufferedOutputStream w = new BufferedOutputStream(new FileOutputStream(f));
        w.write(new byte[]{1, 3, 3, 7});
        w.write(new byte[4]);
        w.close();
    }

    static int addFile(File archive, File add, String name) throws IOException {
        if (!add.exists()) {
            throw new IOException("File to be added does not exist!");
        }
        if (add.isDirectory()) {
            throw new IOException("Cannot add directories!");
        }
        if (!archive.exists()) {
            Archive.create(archive);
        }
        if (archive.isDirectory()) {
            throw new IOException("Archive is no valid archive!");
        }
        RandomAccessFile r = new RandomAccessFile(archive, "rw");
        int code = r.readInt();
        if (code != 16974599) {
            throw new IOException("Archive is no valid archive!");
        }
        int fileCount = r.readInt();
        r.seek(4);
        r.writeInt(fileCount + 1);
        r.seek(r.length());
        RandomAccessFile bi = new RandomAccessFile(add, "r");
        r.writeInt((int)bi.length());
        r.writeBytes(name);
        r.write(0);
        byte[] swap = new byte[(int)bi.length()];
        bi.readFully(swap);
        r.write(swap);
        bi.close();
        r.close();
        return fileCount + 1;
    }

    public static void main(String[] args) throws IOException {
    }
}
Update:
I have created a function using fread(), but it runs out of memory after the first file, even with the memory limit temporarily set to 512 MB. Is there an alternative?
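According to the Java code, the file format is as follows:
- 4 bytes: the type code 01 03 03 07 (16974599, or 0x01030307, when read as a big-endian integer)
- 4 bytes: the number of files, as a big-endian 32-bit integer
- for each file: 4 bytes with the content length (big-endian), the file name, a single NUL byte, and then the raw file contents
It is not a standard archive format but a simple concatenation of files with some metadata.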
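As a quick check of the layout implied by the Java code, here is a minimal sketch that builds a one-file archive in memory and reads its header back. The helper name buildArchive is made up for illustration. The key assumption, taken from the behaviour of Java's readInt()/writeInt(), is that every integer is big-endian, which corresponds to the pack()/unpack() format code N rather than the little-endian V.

```php
<?php
// Hypothetical helper: mirrors what Archive.create() plus addFile() would write
// for a list of files given as [name => contents].
function buildArchive(array $files): string
{
    // Type code 01 03 03 07, then the file count as a big-endian 32-bit integer.
    $data = "\x01\x03\x03\x07" . pack('N', count($files));
    foreach ($files as $name => $contents) {
        // Per entry: big-endian length, name, NUL terminator, raw contents.
        $data .= pack('N', strlen($contents)) . $name . "\0" . $contents;
    }
    return $data;
}

$archive = buildArchive(['hello.txt' => 'Hello']);

$magicBig    = unpack('N', substr($archive, 0, 4))[1]; // big-endian read
$magicLittle = unpack('V', substr($archive, 0, 4))[1]; // little-endian read
$fileCount   = unpack('N', substr($archive, 4, 4))[1];

printf("N: %d, V: %d, files: %d\n", $magicBig, $magicLittle, $fileCount);
```

Only the N read matches the 16974599 that the Java code checks for. The same mix-up on a length field is one plausible way to run out of memory: a length of 1000 (bytes 00 00 03 E8) read with V comes out at roughly 3.9 billion.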
To extract the files from such an 'archive,' we can use PHP code like this:
<?php

class MyArchiveHeader {
    public function __construct(
        private int $typeCode,
        private int $fileCount
    ) {}

    public function getTypeCode(): int
    {
        return $this->typeCode;
    }

    public function getFileCount(): int
    {
        return $this->fileCount;
    }
}

class MyArchiveFile {
    public function __construct(
        private string $filename,
        private string $contents
    ) {}

    public function getFilename(): string
    {
        return $this->filename;
    }

    public function getContents(): string
    {
        return $this->contents;
    }
}

class MyArchive {
    public function __construct(private string $filename) {}

    public function extractFiles(string $outputDirectory): void
    {
        if (!is_dir($outputDirectory)) {
            throw new \InvalidArgumentException('Output directory does not exist');
        }
        $file = new \SplFileObject($this->filename, 'rb');
        $header = $this->parseHeader($file);
        $fileCount = $header->getFileCount();
        for ($i = 0; $i < $fileCount; $i++) {
            $parsedFile = $this->parseFile($file);
            $outputFilename = $outputDirectory . DIRECTORY_SEPARATOR . $parsedFile->getFilename();
            file_put_contents($outputFilename, $parsedFile->getContents());
        }
    }

    private function parseHeader(\SplFileObject $file): MyArchiveHeader
    {
        $typeCodeBytes = $file->fread(4);
        if ($typeCodeBytes === false || strlen($typeCodeBytes) !== 4) {
            throw new \RuntimeException('Could not read file type code');
        }
        // Java's readInt()/writeInt() are big-endian, so unpack with 'N' (not the little-endian 'V').
        $typeCode = unpack('N', $typeCodeBytes)[1];
        if ($typeCode !== 0x01030307) {
            throw new \RuntimeException('Invalid file type code');
        }
        $fileCountBytes = $file->fread(4);
        if ($fileCountBytes === false || strlen($fileCountBytes) !== 4) {
            throw new \RuntimeException('Could not read file count');
        }
        $fileCount = unpack('N', $fileCountBytes)[1]; // Big-endian unsigned 32-bit integer
        return new MyArchiveHeader($typeCode, $fileCount);
    }

    private function parseFile(\SplFileObject $file): MyArchiveFile
    {
        $fileLengthBytes = $file->fread(4);
        if ($fileLengthBytes === false || strlen($fileLengthBytes) !== 4) {
            throw new \RuntimeException('Could not read file length');
        }
        $fileLength = unpack('N', $fileLengthBytes)[1]; // Big-endian unsigned 32-bit integer
        // The name is terminated by a single NUL byte.
        $filename = "";
        while (!$file->eof()) {
            $char = $file->fread(1);
            if ($char === "\0") {
                break;
            }
            $filename .= $char;
        }
        // TODO Might need to convert $filename to UTF-8, for instance.
        // fread() rejects a length of 0, so empty files are handled separately.
        $contents = $fileLength > 0 ? $file->fread($fileLength) : '';
        if ($contents === false || strlen($contents) !== $fileLength) {
            throw new \RuntimeException('Could not read file contents');
        }
        return new MyArchiveFile($filename, $contents);
    }
}
I haven't tested the code, but it should give you a good starting point. You can use it like this:
$archiveFilename = 'archive';
$outputDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'extracted';
if (!is_dir($outputDir)) {
    mkdir($outputDir);
}
echo "Extracting archive to $outputDir\n";
$archive = new MyArchive($archiveFilename);
$archive->extractFiles($outputDir);
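Regarding the out-of-memory problem from the update: the class above keeps each file's contents in a string before writing it, which can get expensive for large entries. Below is a minimal, untested-in-production sketch of an alternative that copies each entry from the archive straight into its output file with stream_copy_to_stream(). The names readExact and extractStreaming are hypothetical, and the basename() call is my own assumption (it drops any directory part of the stored name so an uploaded archive cannot write outside the output directory); remove it if your names legitimately contain subfolders.

```php
<?php
// Hypothetical helper: reads exactly $length bytes from the stream or throws.
function readExact($in, int $length): string
{
    $data = '';
    while (strlen($data) < $length) {
        $chunk = fread($in, $length - strlen($data));
        if ($chunk === false || $chunk === '') {
            throw new RuntimeException('Unexpected end of archive');
        }
        $data .= $chunk;
    }
    return $data;
}

// Hypothetical function: extracts all entries and returns how many were written.
function extractStreaming(string $archivePath, string $outputDirectory): int
{
    $in = fopen($archivePath, 'rb');
    if ($in === false) {
        throw new RuntimeException('Could not open archive');
    }
    try {
        if (readExact($in, 4) !== "\x01\x03\x03\x07") {
            throw new RuntimeException('Invalid file type code');
        }
        // All integers are big-endian (Java writeInt), hence 'N'.
        $fileCount = unpack('N', readExact($in, 4))[1];
        for ($i = 0; $i < $fileCount; $i++) {
            $length = unpack('N', readExact($in, 4))[1];

            // Read the NUL-terminated name one byte at a time.
            $name = '';
            while (($char = fgetc($in)) !== false && $char !== "\0") {
                $name .= $char;
            }
            if ($char === false) {
                throw new RuntimeException('Unterminated file name');
            }

            // Assumption: keep only the last path component of the stored name.
            $target = $outputDirectory . DIRECTORY_SEPARATOR . basename($name);
            $out = fopen($target, 'wb');
            if ($out === false) {
                throw new RuntimeException("Could not create $target");
            }
            // Copy the entry in chunks instead of loading it into a string.
            $copied = $length > 0 ? stream_copy_to_stream($in, $out, $length) : 0;
            fclose($out);
            if ($copied !== $length) {
                throw new RuntimeException("Truncated entry: $name");
            }
        }
        return $fileCount;
    } finally {
        fclose($in);
    }
}
```

It would be called in the same way as the class, for example extractStreaming('archive', $outputDir), and peak memory should then no longer depend on the size of the largest entry.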