Tags: php, io, windows-10, file-exists, unlink

What is the PHP manual talking about with clearstatcache()?


You should also note that PHP doesn't cache information about non-existent files. So, if you call file_exists() on a file that doesn't exist, it will return false until you create the file. If you create the file, it will return true even if you then delete the file. However unlink() clears the cache automatically.

Source: https://www.php.net/manual/en/function.clearstatcache.php

I've read this numerous times but simply cannot make any sense of it. What is the actual information it's trying to convey?

To me, it sounds as if it's contradicting itself. First it says that PHP doesn't cache information about non-existent files. Then it goes on to state that it will return true even if you delete the file. But also that unlink() clears the cache automatically.

Is it referring to the file being deleted outside of PHP? Isn't that the only thing it can mean? But the way it's put is so incredibly confusing, ambiguous and weird. Why even mention that file_exists() will return false until you create the file? It's like saying that water will remain wet even if you clap your hands.

Is it actually saying, in a very round-about way, that I have to always run clearstatcache() before file_exists() unless I want a potentially lying response because a non-PHP script/program has deleted the file in question after the script was launched?

I swear I've spent half my life just re-reading cryptic paragraphs like this because they just don't seem to be written by a human being. I've many, many times had to ask questions like this about small parts of various manuals, and even then, who knows if your interpretations are correct?


Solution

  • I'd like to first address your last paragraph:

    I swear I've spent half my life just re-reading cryptic paragraphs like this because they just don't seem to be written by a human being.

    Quite the opposite: like all human beings, the people who contribute to the PHP manual are not perfect, and make mistakes. It's worth stressing that in this case these people are not professional writers being paid to write the text; they are volunteers who have spent their free time working on it, and yet the result is better than many manuals I've seen for paid software. If there are parts you think could be improved, I encourage you to join that effort.

    Now, on to the actual question. Before getting to the part you quoted, let's look at the first sentence on that page:

    When you use stat(), lstat(), or any of the other functions listed in the affected functions list (below), PHP caches the information those functions return in order to provide faster performance.

    What this is saying is that when PHP asks the system about the status of a file (permissions, modification times, etc.), it stores the answer in a cache. The next time you ask about the same file, it looks in that cache rather than asking the system again.
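
    To make the caching idea concrete, here is a minimal sketch that uses filesize(), one of the affected functions. The scratch file and the variable names are made up for illustration. Whether a stale value is actually observed without clearstatcache() can depend on the PHP version and platform, so the sketch only relies on what happens after the cache has been cleared.

    ```php
    <?php
    // Hypothetical scratch file, used only for this illustration.
    $path = tempnam(sys_get_temp_dir(), 'statcache_');

    file_put_contents($path, 'abc');
    $before = filesize($path); // 3; PHP may remember this stat result

    // Append three more bytes; the file on disk is now 6 bytes long.
    file_put_contents($path, 'def', FILE_APPEND);

    // Discard any remembered stat data so the next call asks the system again.
    clearstatcache(true, $path);
    $after = filesize($path); // 6

    echo "before: $before, after: $after\n";

    unlink($path); // clean up
    ```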

    Now, on to the part you quoted:

    You should also note that PHP doesn't cache information about non-existent files.

    Straightforward enough: if PHP asks the system about the status of a file, and the answer is "it doesn't exist", PHP does not store that answer in its cache.

    So, if you call file_exists() on a file that doesn't exist, it will return false until you create the file.

    The first time you call file_exists() for a file, PHP will ask the system; if the system says it doesn't exist, and you call file_exists() again, PHP will ask the system again. As soon as the file starts existing, a call to file_exists() will return true.

    Put another way, file_exists() is guaranteed not to return false if the file exists at the time you call it.
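
    A short sketch of that guarantee, assuming a hypothetical temporary path that starts out missing:

    ```php
    <?php
    // tempnam() reserves a unique name; remove the placeholder so the
    // file really does not exist yet. The path is purely illustrative.
    $path = tempnam(sys_get_temp_dir(), 'exists_');
    unlink($path);

    $first  = file_exists($path); // false: nothing there, so nothing is cached
    $second = file_exists($path); // false again: PHP asked the system again

    file_put_contents($path, 'now I exist');

    $third = file_exists($path);  // true, without any clearstatcache() call

    var_dump($first, $second, $third);

    unlink($path); // clean up
    ```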

    If you create the file, it will return true even if you then delete the file.

    This is the point of the paragraph: as soon as the system says "yes, the file exists", PHP will store the information about it in its cache. If you then call file_exists() again, PHP will not ask the system; it will assume that it still exists, and return true.

    In other words, file_exists() is not guaranteed to return false if the file no longer exists, because the file might have existed previously and still have information stored in the cache.

    However unlink() clears the cache automatically.

    As you guessed, all of the above is about monitoring whether something else has created or deleted the file. This sentence just confirms that if you delete the file from within PHP itself, PHP knows that any information it had cached about that file is now irrelevant, and discards it.
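
    As a small sketch of the unlink() case (the temporary file is again just an illustrative stand-in), deleting through PHP needs no manual cache handling:

    ```php
    <?php
    // tempnam() creates an empty file with a unique, hypothetical name.
    $path = tempnam(sys_get_temp_dir(), 'unlink_');

    $existsBefore = file_exists($path); // true
    $size = filesize($path);            // 0; PHP may cache this stat result

    unlink($path); // deleting through PHP also discards cached stat data

    $existsAfter = file_exists($path);  // false, no clearstatcache() needed

    var_dump($existsBefore, $existsAfter);
    ```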

    Perhaps a different way to word this would be to give a scenario: Imagine you have a piece of software that creates a temporary file while it's running; you want to monitor when it is created, and when it is deleted. If you write a loop which repeatedly calls file_exists(), it will start returning true as soon as the software creates the file, without any delay or false negatives; however, it will then carry on returning true, even after the software deletes the file. In order to see when it is deleted, you need to additionally call clearstatcache() on each iteration of the loop, so that PHP asks the system every time.
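
    The monitoring loop from that scenario could be sketched roughly as follows. The helper name waitUntilDeleted(), its parameters, and the polling interval are all assumptions made for this example. In a real setup the file would be removed by the other program; here unlink() stands in for it so the snippet runs on its own.

    ```php
    <?php
    /**
     * Hypothetical helper: poll until $path disappears or the timeout elapses.
     * Returns true if the file was seen to be gone, false on timeout.
     */
    function waitUntilDeleted(string $path, float $timeoutSeconds, int $intervalMicros = 100000): bool
    {
        $deadline = microtime(true) + $timeoutSeconds;
        do {
            // Drop cached stat data so file_exists() reflects the current state.
            clearstatcache(true, $path);
            if (!file_exists($path)) {
                return true;
            }
            usleep($intervalMicros);
        } while (microtime(true) < $deadline);

        return false;
    }

    $path = tempnam(sys_get_temp_dir(), 'watch_');

    // The file exists for the whole wait, so the helper times out.
    $timedOut = !waitUntilDeleted($path, 0.3);

    unlink($path); // stand-in for the other program deleting its temp file

    // Now the helper notices on its first check that the file is gone.
    $gone = waitUntilDeleted($path, 0.3);

    var_dump($timedOut, $gone);
    ```

    Calling clearstatcache() on every iteration costs one extra system query per check, which is usually negligible next to the sleep between polls.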