sql-serverjdbcmetadata

Why does getExtraNameCharacters() not include the "[]."?


In an effort to fight SQL-injection, we've implemented a validator for table-names. It checks, if the characters are all alphanumeric, or, if anything else is found, whether it is present in the string returned by the JDBC-driver's getExtraNameCharacters-method:

public static void assertNameValidity(String name, Connection conn)
    throws SQLException {
    String suspect = name.replaceAll("[a-zA-Z0-9_]", "");
    if (suspect.isEmpty()) {
        /*
         * XXX if the name begins with a number, it may
         * XXX still be invalid, but we only care for
         * XXX SQL-injection attempts anyway...
         */
        logger.debug("{} is alphanumeric, a valid name", name);
        return;
    }
    DatabaseMetaData md = conn.getMetaData();
    logger.debug("'{}' contains suspect character{}: '{}'. " +
        "Checking, if {} valid for a name in {}.",
        name, suspect.length() == 1 ? "" : 's', suspect,
        suspect.length() == 1 ? "it is" : "they are",
        md.getDatabaseProductName());
    /*
     * Deal with the non-alphanumeric characters remaining -- they
     * may still be suitable in names in certain databases.
     */
    String additional = md.getExtraNameCharacters();
    for (int i = 0; i < suspect.length(); i++) {
        char c = suspect.charAt(i);
        if (additional.indexOf(c) >= 0) {
            logger.debug("Character '{}' is in '{}', " +
                "thus valid part of name", c, additional);
            continue;
        }
        logger.error("Found invalid character '{}' in \"{}\". " +
            "Neither alphanumeric, nor otherwise valid",
            c, name);
        System.exit(3); /* XXX throw SecurityException instead? */
    }
}

However, as the experience has shown, for SQL Server databases the getExtraNameCharacters() returns only these three characters: "$#@", which causes strings like [dbo].[myTable] to be rejected. As a work-around, we added the following block to the above function:

if (md.getDatabaseProductName() == "Microsoft SQL Server" &&
    additional.indexOf('[') == -1) {
    logger.info("Adding square brackets and dot to the " +
        "list of acceptable characters, because Microsoft " +
        "are lying to us returning only: '{}'", additional);
    additional += "[].";
}

Things work, but why wouldn't the result of getExtraNameCharacters() contain the dot and the square brackets in the first place?


Solution

  • You're making the wrong assumption about the result of DatabaseMetaData.getExtraNameCharacters(). Its documentation states:

    Retrieves all the "extra" characters that can be used in unquoted identifier names (those beyond a-z, A-Z, 0-9 and _).

    ISO/IEC 9075-2:2023 (the SQL standard) only specifies a-z, A-Z, 0-9 and _ as allowed in an unquoted identifier, but most DBMSes also allow additional characters. The getExtraNameCharacters() is so you can find out what those extra characters are.

    However, [dbo].[myTable] is an identifier chain (it is not a simple single identifier!), consisting of two quoted identifiers, namely [dbo] and [myTable]. Those square brackets are the default quote symbols of SQL Server. As such, neither those square brackets, nor the period (.) separating the identifiers in the identifier chain are allowed in an unquoted identifier, and thus those characters are not included in the value of getExtraNameCharacters().

    As an aside, I'm really wondering what kind of dynamic system with user-provided object names you're maintaining that you need to validate this at all to prevent SQL injection.