html webserver

What are the legal / allowed characters for web server file names?


What characters are allowed in filenames for HTML files on ALL servers (*nix, Windows, etc.)? I'm looking for the "lowest common denominator" that will work on all servers. Use case: I'm naming a file to be served publicly (Mysite.com/My-Page.htm).

E.g., space? _ - , etc.

E.g., can I have File-Name.htm, File_Name.htm, or File Name.htm?

Obviously, this needs to work with all servers and browsers. (IIRC, the name is limited by the server not the browser, but I could be wrong).


Solution

  • What characters are allowed in filenames for HTML files on servers?

    That totally depends on the server. HTTP itself allows any character at all, including control characters and non-ASCII characters, as long as they are suitably %-encoded when requested in a URL.
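
    As a rough illustration of the %-encoding mentioned above, here is a minimal Python sketch using the standard library's urllib.parse.quote. The helper name encode_filename_for_url and the sample names are made up for this example; they are not part of the question.

```python
from urllib.parse import quote, unquote


def encode_filename_for_url(filename):
    """Percent-encode one file name for use as a URL path segment.

    Hypothetical helper for illustration only.
    """
    # safe="" escapes "/" as well, because this is a single name, not a path.
    return quote(filename, safe="")


if __name__ == "__main__":
    for name in ["File Name.htm", "50%off.htm", "a/b.htm", "nul\x00byte"]:
        encoded = encode_filename_for_url(name)
        # Decoding should always give back the original name.
        assert unquote(encoded) == name
        print(repr(name), "->", encoded)
```

    Note that whether a server can actually store a file under such a name is a separate question from whether the URL can express it.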

    On a Unix server you cannot use ‘/’ or the zero byte. (If you could use them, they'd appear in the URL as ‘%2F’ and ‘%00’ respectively.) You also can't have the specific filenames ‘.’ or ‘..’, or the empty string.

    On a Windows server you have all the limitations of a Unix server, plus you can't use any of \/:*?"<>| or control characters 1–31, a name can't end with a dot or a space (leading ones are best avoided too), and you'll have difficulty using any of the legacy device filenames (CON, PRN, COM1 and many more).

    This has nothing to do with HTTP; it is just how filenames work on Windows, which is complicated.
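
    The Unix and Windows rules above can be sketched as a small checker. This is only an approximation under stated assumptions: the function names are invented for the example, the device-name set is a partial list (the answer notes there are many more), and only trailing dots and spaces are checked, since those are the ones Windows strips.

```python
import re

# Characters Windows rejects in file names, plus control characters 0-31.
WINDOWS_FORBIDDEN = re.compile(r'[\\/:*?"<>|\x00-\x1f]')

# Partial list of legacy device names (an assumption, not exhaustive).
WINDOWS_DEVICE_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    prefix + str(n) for prefix in ("COM", "LPT") for n in range(1, 10)
}


def unix_problems(name):
    """Return a list of reasons the name is unusable on a Unix server."""
    problems = []
    if name in ("", ".", ".."):
        problems.append("reserved or empty name")
    if "/" in name or "\x00" in name:
        problems.append("contains '/' or a zero byte")
    return problems


def windows_problems(name):
    """Return Unix problems plus the extra Windows restrictions."""
    problems = unix_problems(name)
    if WINDOWS_FORBIDDEN.search(name):
        problems.append("contains a forbidden character")
    if name[-1:] in (" ", "."):
        problems.append("ends with a dot or space")
    # Device names are matched case-insensitively, with or without extension.
    if name.split(".", 1)[0].upper() in WINDOWS_DEVICE_NAMES:
        problems.append("legacy device name")
    return problems


if __name__ == "__main__":
    for name in ["File Name.htm", "File:Name.htm", "con.htm", "page."]:
        print(repr(name), unix_problems(name), windows_problems(name))
```

    A name that yields an empty list from windows_problems should also be fine on Unix, because the Windows check includes the Unix one.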

    can I have File-Name.htm, File_Name.htm, File Name.htm?

    Certainly. But in the last case you should link to it by URL-encoding the space:

    <a href="File%20Name.htm">thingy</a>
    

    Browsers will usually let you get away with leaving the space in, but it's not really valid. If you want to avoid having to think about URL-escaping, HTML-escaping and case-sensitivity issues, stick to a–z, 0–9 and underscore.
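
    One possible way to enforce that conservative alphabet is sketched below in Python. The function names are hypothetical, and the sketch assumes a single dot is still allowed to separate the extension, which the advice above does not spell out.

```python
import re

# a-z, 0-9 and underscore, optionally followed by one ".extension" (assumption).
CONSERVATIVE_NAME = re.compile(r"[a-z0-9_]+(\.[a-z0-9_]+)?")


def is_conservative_name(filename):
    """True if the name uses only the conservative alphabet."""
    return CONSERVATIVE_NAME.fullmatch(filename) is not None


def _clean(part):
    # Lowercase, then collapse every run of other characters into one "_".
    return re.sub(r"[^a-z0-9]+", "_", part.lower()).strip("_")


def to_conservative_name(filename):
    """Rewrite a name into the conservative alphabet (lossy)."""
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No dot at all: the whole string is the stem.
        stem, extension = filename, ""
    stem, extension = _clean(stem), _clean(extension)
    return stem + "." + extension if stem and extension else stem or extension


if __name__ == "__main__":
    for name in ["File-Name.htm", "File_Name.htm", "File Name.htm"]:
        print(repr(name), "->", to_conservative_name(name))
```

    With this sketch all three names from the question map to file_name.htm, so distinct source names can collide and would need to be checked for uniqueness separately.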