mysql string int row sql-types

Tradeoff between using a string or int for column value


I'm creating a database table in which one of the columns is named type. It records what kind of thing is stored in that row.

Since this software is open source, I have to consider other people using it. I could use an int, which should take less space in the database and be faster to look up, but then I would need to document the values, and it would be more confusing for my users. The other option is a string, which takes more space and is slower to look up.

Assuming this table will handle thousands of rows per day, it can reach the point of being unscalable quickly if I select the wrong data type.

Is using int always preferred in this case, when there are many millions of rows potentially in the database?


Solution

  • You are correct: INT is faster and therefore the better choice.

    If you are concerned about future developers, add comments to the column explaining each value. If there are a lot of values, consider using a lookup table, so you can ask for a string, get its numeric ID (a little bit like a constant) and then look for that.

    Like this:

    id |  id_name
    ---|------------
    1  | TYPE_ALPHA
    2  | TYPE_BETA
    3  | TYPE_DELTA
    

    Now you have a literal explanation of the IDs. Just fetch the ID (WHERE id_name = 'TYPE_ALPHA') and then use that to filter your table.
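
    As a rough sketch of the lookup-table approach in MySQL: the table and column names below (item_types, items, type_id) are made up for illustration and are not from the question. The COMMENT clause shows one way to document the values on the column itself.

    ```sql
    -- Hypothetical schema: item_types, items and type_id are illustrative names.
    CREATE TABLE item_types (
        id      TINYINT UNSIGNED NOT NULL PRIMARY KEY,
        id_name VARCHAR(32)      NOT NULL UNIQUE
    );

    INSERT INTO item_types (id, id_name) VALUES
        (1, 'TYPE_ALPHA'),
        (2, 'TYPE_BETA'),
        (3, 'TYPE_DELTA');

    CREATE TABLE items (
        id      BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT PRIMARY KEY,
        -- The column comment documents the meaning of each integer value.
        type_id TINYINT UNSIGNED NOT NULL
                COMMENT '1=TYPE_ALPHA, 2=TYPE_BETA, 3=TYPE_DELTA',
        KEY idx_items_type (type_id),
        CONSTRAINT fk_items_type
            FOREIGN KEY (type_id) REFERENCES item_types (id)
    );

    -- Filter by the readable name; the join resolves it to the integer ID.
    SELECT i.*
    FROM items AS i
    JOIN item_types AS t ON t.id = i.type_id
    WHERE t.id_name = 'TYPE_ALPHA';
    ```

    The large table stores and indexes only the small integer, while the readable names live once in the lookup table.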

    A happy medium between the two solutions, however, may be the ENUM data type, which is described in the MySQL documentation.

    If my understanding of ENUM is correct, it treats the field like a string during comparisons but stores the actual data as enumerated integers. When you filter on a string that is defined in the table schema, MySQL uses the integer equivalent without even showing it. Inserting a string that is not defined is rejected with an error in strict SQL mode (outside strict mode, MySQL stores an empty string and raises a warning), while filtering on an undefined string simply matches no rows. This provides both speed and readability.
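
    A minimal sketch of the ENUM variant, again with a hypothetical table name (items_enum) and the same three example values:

    ```sql
    -- Hypothetical table; the ENUM members mirror the example values above.
    CREATE TABLE items_enum (
        id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        `type` ENUM('TYPE_ALPHA', 'TYPE_BETA', 'TYPE_DELTA') NOT NULL,
        KEY idx_items_enum_type (`type`)
    );

    -- Values are written and compared as strings.
    INSERT INTO items_enum (`type`) VALUES ('TYPE_ALPHA'), ('TYPE_BETA');

    SELECT id, `type`
    FROM items_enum
    WHERE `type` = 'TYPE_BETA';

    -- Using the column in a numeric context exposes the internal 1-based index.
    SELECT id, `type`, `type` + 0 AS type_index
    FROM items_enum;
    ```

    One tradeoff to keep in mind: adding a new type later means changing the column definition with ALTER TABLE, whereas a lookup table only needs a new row.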