In an effort to fight SQL-injection, we've implemented a validator for table-names. It checks, if the characters are all alphanumeric, or, if anything else is found, whether it is present in the string returned by the JDBC-driver's getExtraNameCharacters-method:
public static void assertNameValidity(String name, Connection conn)
throws SQLException {
String suspect = name.replaceAll("[a-zA-Z0-9_]", "");
if (suspect.isEmpty()) {
/*
* XXX if the name begins with a number, it may
* XXX still be invalid, but we only care for
* XXX SQL-injection attempts anyway...
*/
logger.debug("{} is alphanumeric, a valid name", name);
return;
}
DatabaseMetaData md = conn.getMetaData();
logger.debug("'{}' contains suspect character{}: '{}'. " +
"Checking, if {} valid for a name in {}.",
name, suspect.length() == 1 ? "" : 's', suspect,
suspect.length() == 1 ? "it is" : "they are",
md.getDatabaseProductName());
/*
* Deal with the non-alphanumeric characters remaining -- they
* may still be suitable in names in certain databases.
*/
String additional = md.getExtraNameCharacters();
for (int i = 0; i < suspect.length(); i++) {
char c = suspect.charAt(i);
if (additional.indexOf(c) >= 0) {
logger.debug("Character '{}' is in '{}', " +
"thus valid part of name", c, additional);
continue;
}
logger.error("Found invalid character '{}' in \"{}\". " +
"Neither alphanumeric, nor otherwise valid",
c, name);
System.exit(3); /* XXX throw SecurityException instead? */
}
}
However, as the experience has shown, for SQL Server databases the getExtraNameCharacters()
returns only these three characters: "$#@", which causes strings like [dbo].[myTable]
to be rejected. As a work-around, we added the following block to the above function:
if (md.getDatabaseProductName() == "Microsoft SQL Server" &&
additional.indexOf('[') == -1) {
logger.info("Adding square brackets and dot to the " +
"list of acceptable characters, because Microsoft " +
"are lying to us returning only: '{}'", additional);
additional += "[].";
}
Things work, but why wouldn't the result of getExtraNameCharacters()
contain the dot and the square brackets in the first place?
You're making the wrong assumption about the result of DatabaseMetaData.getExtraNameCharacters()
. Its documentation states:
Retrieves all the "extra" characters that can be used in unquoted identifier names (those beyond a-z, A-Z, 0-9 and _).
ISO/IEC 9075-2:2023 (the SQL standard) only specifies a-z, A-Z, 0-9 and _ as allowed in an unquoted identifier, but most DBMSes also allow additional characters. The getExtraNameCharacters()
is so you can find out what those extra characters are.
However, [dbo].[myTable]
is an identifier chain (it is not a simple single identifier!), consisting of two quoted identifiers, namely [dbo]
and [myTable]
. Those square brackets are the default quote symbols of SQL Server. As such, neither those square brackets, nor the period (.
) separating the identifiers in the identifier chain are allowed in an unquoted identifier, and thus those characters are not included in the value of getExtraNameCharacters()
.
As an aside, I'm really wondering what kind of dynamic system with user-provided object names you're maintaining that you need to validate this at all to prevent SQL injection.