sql-serverdatabasemaster-data-management

How to store low cardinality master data in database


In my app database there are certain columns which can have only 2-4 possible values.

For example

  1. "GENDER" can have only 2 values('Male', 'Female')

  2. "MARITAL STATUS" can have 3 values('Single', 'Married', 'Divorced')

  3. "PROC_STATUS" can have 3 values ('Pending', 'In Progress', 'Finished')

  4. "PROC_SATISFACTION" can have values ('Disappointing', 'Not Satisfied', 'Satisfied', 'Highly Satisfied')

There are a few more master data values like this.

What is the best way to store this master data in the DB?

Making tables for each of them don't seem to be a good option as the data is static(will hardly change) and very less.

Another option would be to use Check Constraints.

Another option is to make enums in code.

I am looking for a way to store this master data in DB. Ia musing SQL Server 2012.

Any help will be highly appreciated.


Solution

  • Coming from a Systems Analyst...

    The major consideration is if you're going to do any internationalization. Gender seems really straightforward as 'M' and 'F' until you think that many languages aren't Latin based.

    Another consideration is if the database is truly intended to operate as a relational database fully independent of the application (for third party reporting, data import and export, etc.) or if the database is just application storage.

    IMX, if you have no need of internationalization, then for gender I would use char(1) fields for 'M' and 'F'. No need for anything beyond that because it's fairly obvious for categories like this (unless your system needs to worry about complex genders or the like). Similarly, if you can get away with 'Y' or 'N' for true/false fields and don't want to use bit then that's fine. Just be consistent throughout your application. Don't mix and match.

    For everything else, I would create a validation table that has the code, a description/expansion of the code, and (if at all possible) an Active column so users can designate that certain codes are no longer to be used (make sure your code respects that!). In a complex system, the system setup area of your application can allow users with SysAdmin access to create new codes, mark them active or inactive, or delete codes from the validation table once they are unused. They might want a PROC_STATUS of Cancelled, for example, or PROC_SATISFACTION of "No Response". Foreign key constraints are fine, but many applications that use this method don't use FKs in my experience.

    If your application needs to have i18n and the data needs to be portable between regions (i.e., Germany's database needs to be able to function cleanly attached to China's application server with just a few updates for language changes) then you can't really store the codes in the base tables. You'll probably need to use integers that map back to your validation tables where you have the lookup id integer, the code the users will use for their region, the long name for that code, and then the active/inactive option. Proper i18n will include pre-populating these tables with the correct values depending on the installation.