php pdf corrupt

Verify corrupted PDF using PHP


I would like to detect corrupted PDF files using PHP. I have determined that a PDF that is not corrupted has the tag "%%EOF" at the end of the file. I also checked corrupted files for this tag, and it does not appear.

My idea is to automatically check the validity of a PDF file before uploading it to my server.

<?php
$file = file('good.pdf');

$endfile= $file[count($file) - 1];

echo gettype($endfile),"\n";
echo $endfile,"\n";

?>

I get this result

string %%EOF 

For now, everything seems to be fine, but I have an issue when comparing the results.

I tested this code

<?php
$file = file('good.pdf');
$endfile= $file[count($file) - 1];
$n="%%EOF";

echo $endfile;
echo $n;

if ($endfile === $n) {
    echo "good";

} else {
    echo "corrupted";
}

?>

I get this result

%%EOF %%EOF corrupted

I know that $endfile and $n are both strings, but when I compare them I never get a match. I also tried ==, but the result is the same.

I also tried it like this:

<?php
$file = file('good.pdf');
$endfile= $file[count($file) - 1];
$var1val = $endfile;
$var2val = "%%EOF";
echo $var2val;
echo $var1val;
$n = strcmp($var1val,$var2val); // 0 means the two strings are equal
echo $n;
if ($n == 0) {
    echo "good";

} else {
    echo "corrupted";
}

?>

but I get this result:

%%EOF %%EOF 1 corrupted

It gave me the same result with ===.

I have only tested with a working, non-corrupted PDF. Do you know why this is not working? Is there another way in PHP to check that a PDF is not corrupted before I automatically upload it to my server?


Solution

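  • From the documentation at http://php.net/manual/en/function.file.php:

    Returns the file in an array. Each element of the array corresponds to a line in the file, with the newline still attached.

    The last element therefore still ends with a newline character, so it is not identical to "%%EOF", and both === and strcmp() report a mismatch. You need to remove the newline to compare properly.

    You need to do something like: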

    <?php
    $file = file('good.pdf');
    $endfile= trim($file[count($file) - 1]);
    $n="%%EOF";
    
    
    if ($endfile === $n) {
        echo "good";
    
    } else {
        echo "corrupted";
    }
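
As for other ways to check the file: the same marker test can be done without splitting a binary file into lines. The sketch below is only a heuristic, not a full validator. The function name looksLikeCompletePdf and its $tailBytes parameter are hypothetical and do not come from the answer above. It assumes that a PDF starts with "%PDF-" and that "%%EOF" appears somewhere near the end of the file, so it tolerates different line endings or a few trailing bytes after the marker.

```php
<?php
// Hypothetical helper (an illustration, not part of the original answer):
// checks the PDF header and looks for the %%EOF marker near the end of the file.
function looksLikeCompletePdf(string $path, int $tailBytes = 1024): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // missing or unreadable file
    }

    // Read the first five bytes; a PDF is expected to begin with "%PDF-".
    $header = fread($handle, 5);

    // Read only the last $tailBytes bytes instead of loading the whole file.
    $size = fstat($handle)['size'];
    fseek($handle, max(0, $size - $tailBytes));
    $tail = stream_get_contents($handle);
    fclose($handle);

    if ($header !== '%PDF-' || $tail === false) {
        return false;
    }

    // The marker may be followed by "\n", "\r\n", or other trailing bytes,
    // so search for it rather than comparing the last line exactly.
    return strpos($tail, '%%EOF') !== false;
}
```

Calling looksLikeCompletePdf('good.pdf') would then replace the manual comparison. Note that a file can pass this check and still be damaged internally (for example a truncated stream in the middle of the document), so it only catches the obvious case of an incomplete upload. If you prefer to keep the file() approach, passing the FILE_IGNORE_NEW_LINES flag as the second argument strips the line endings so that trim() is not needed.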