javasqlsql-injection

Sql string concatenation is always bad?


At university, we learned, and I also read in many places, that you should NEVER concatenate SQL strings. Yes, I know it's because of SQL injection, and I am aware of that.

However, what if the string you want to concatenate is only accessible in the code? So it is NOT user input and has nothing to do with it.

For example, I have a Character table with a character_id field. This Character has many attributes, so the character_id appears in many other tables as a foreign key. When I create a row in the Character table, I need to create rows in other tables where it appears as a foreign key.

To do this, I would need to write SQL statements one by one, since "you should NEVER use concatenation." However, a valid function here would be:

public void insertAttribute(String attributeTable, int characterId) {
String sql = "INSERT INTO "+attributeTable+" (character_id) VALUES (?)";
// execute
}

This could be called with any attributeTable, as long as the attributeTable string is not requested from the user, but only used in concatenation to shorten the code.


Solution

  • The act of concatenating a string is not inherently bad. For example, while silly there is no security risk in doing this:

    String sql = "INSERT INTO " + "MyTable" + " (character_id) VALUES (?)";
    

    Expanding further, if you have a finite set of values to replace "MyTable" and you control those values, you can dynamically choose from them without presenting a security risk.

    The problem comes not when concatenating strings, or even when letting users select which strings to concatenate. The problem comes when you don't control those strings.

    Any value which is user-modifiable, be it from direct user input or indirect user editing of other database values, or any other way in which users can modify the values used... Those values should not be concatenated into the SQL code.

    Not because of the act of concatenation, but because of the act of treating user input as code and executing it.

    As your example even demonstrates, one very common reason to do this at all is to dynamically replace schema objects (such as table names), which generally can't be done with parameters.


    Note that there are other reasons to avoid dynamically concatenating strings into SQL code. You could be missing out on performance optimizations from the database if doing this for otherwise parameter-drivable values. You could be setting a precedent that later, less-careful developers then expand into something that is SQL-injectable. Etc.

    There can often be better approaches. Using ORMs, using stored procedures, or just allowing the lines-of-code to be larger than you would have wanted just to keep it consistent/clear/secure. Sometimes even just avoiding the precedent of allowing such concatenation into a large codebase is itself worth a little extra effort.