php json file scandir

Remove files which have no filename duplicates


For each document (.pdf, .txt, .docx, etc.) I also have a corresponding JSON file with the same filename.

Example: file1.json, file1.pdf, file2.json, file2.txt, filex.json, filex.pdf

But I also have some JSON files which are not accompanied by a corresponding document.

I want to delete all JSON files which have no corresponding document. I'm really stuck because I can't find a proper solution to my problem.

I know how to use scandir() and how to get the filename and extension from pathinfo(), etc. The issue is that for each JSON file I find in the directory, I have to run another foreach over that directory, excluding all JSON files, to see whether the same filename exists so that I can decide whether to delete it. (This is how I think I would solve it.)

The problem here is performance: there are millions of files, and for each JSON file I have to run a foreach over millions of files.

Can anyone guide me to a better solution?

Thank you!

Edit: Since no one will help unless I post a piece of code first (and this approach on Stack Overflow is definitely wrong), here is what I am trying:

<?php

$dir = "2000/";

$files = scandir($dir);

foreach ($files as $file) {

    $fullName = pathinfo($file);

    if (($fullName['extension'] ?? '') === 'json') { // files without an extension have no 'extension' key
        if (!in_array($fullName['filename'].'.pdf', $files)){
            unlink($dir.$file);
        }
    }
}

As you can see, this only checks for one type of document (.pdf in this case). I want to check for every extension except .json, and I don't want to run a foreach/in_array() for each JSON file; I want to achieve all of this in a single foreach.


Solution

  • Maybe approach it from the other side: iterate over the JSON files only and, for each one, look for any other file with the same base name. If the JSON file is the only match, no document accompanies it, so remove it.

    It would look like this:

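    $dir = "2000/";

    // glob() returns each match with the $dir prefix already included.
    foreach (glob($dir . "*.json") as $path) {
        $file = new \SplFileInfo($path);
        // Every file that shares the JSON file's base name, whatever its extension.
        $matches = glob($dir . $file->getBasename('.' . $file->getExtension()) . ".*");
        // A single match is the JSON file itself, so no document accompanies it.
        if (count($matches) === 1) {
            unlink($path);
        }
    }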
    

    Manual

    PHP: SplFileInfo

    PHP: glob
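
The glob() approach above still searches the directory once per JSON file, which may remain slow with millions of files. A possible single-pass alternative, as a minimal sketch: collect the base names of all non-JSON entries as array keys, then use isset() for a constant-time lookup per JSON file. The function name findOrphanJson() is hypothetical and not part of the original question or answer. The sketch assumes the directory listing fits in memory and that any non-JSON entry with the same base name (including an entry without an extension) counts as the document.

```php
<?php

/**
 * Hypothetical helper: given a list of file names (for example from
 * scandir()), return the JSON file names that have no other entry
 * with the same base name.
 */
function findOrphanJson(array $names): array
{
    $documents = []; // base name => true for every non-JSON entry
    $jsonFiles = []; // JSON file names in the order they were seen

    foreach ($names as $name) {
        if ($name === '.' || $name === '..') {
            continue; // skip the directory pseudo-entries from scandir()
        }

        $info = pathinfo($name);

        if (($info['extension'] ?? '') === 'json') {
            $jsonFiles[] = $name;
        } else {
            // Assumption: any non-JSON entry counts as a document.
            $documents[$info['filename']] = true;
        }
    }

    $orphans = [];
    foreach ($jsonFiles as $name) {
        // isset() on an array key is a hash lookup, not a scan.
        if (!isset($documents[pathinfo($name, PATHINFO_FILENAME)])) {
            $orphans[] = $name;
        }
    }

    return $orphans;
}

// Usage sketch (left commented out because it deletes files):
// $dir = "2000/";
// foreach (findOrphanJson(scandir($dir)) as $name) {
//     unlink($dir . $name);
// }
```

This uses two loops rather than one, but each file name is visited a fixed number of times, so the work grows linearly with the number of files instead of quadratically. It is worth running it first without unlink() and inspecting the returned list.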