database-designentity-relationshipdatabase-normalizationthird-normal-form

Violation of 3NF if we use an auto-incremented identity column along with a PK


As it is said in the book of Database Solutions Second Edition written by Thomas Connolly and Carolyn Begg page 180:

Third normal form (3NF)
A table that is already in 1NF and 2NF, and in which the values in all non-primary-key columns can be worked out from only the primary key column(s) and no other columns.

I have seen many scenarios where people use an identity column though they already have a primary key column in their table. A record can also be worked out from the identity column, so isn't it a violation of 3NF if we use an auto-incremented identity column along with a primary key in a table?

UPDATE: If it is not so which column should be referenced as a foreign key in another table.The primary key column or the Identity column?


Solution

  • That book Database Solutions: A Step by Step Guide to Building Databases 2nd 2004 Edition is a mess. Unfortunately it says wrong things, it misleads, and a lot of their wording is extremely poor--like "work out"--which is informal & never defined.

    Third normal form (3NF)
    A table that is already in 1NF and 2NF, and in which the values in all non-primary-key columns can be worked out from only the primary key column(s) and no other columns.

    That wrong definition is actually intended to be informal and for when a table has just one CK (candidate key). But the book doesn't say that except indirectly and later when it gives another wrong definition involving PKs (primary keys):

    The formal definition for third normal form (3NF) is a table that is in first and second normal forms and in which no non-primary-key column is transitively dependent on the primary key.

    Later still it says that there can be multiple CKs but it still gives a wrong definition:

    Therefore, for tables with more than one candidate key you can use the generalized definition for 3NF, which is a table that is in 1NF and 2NF, and in which the values in all the non-primary-key columns can be worked out from only candidate key column(s) and no other columns.

    It is wrongly saying "primary-key columns" where prime columns ie CK columns would be correct.

    Their other book Database Systems 4th Edition 2005 also introduces special cases of definitions for when there is just one CK without saying so then later gives "general" definitions. At least those get "prime attribute" correct.

    General definition for Third Normal Form (3NF) is a relation that is in First and Second Normal Form in which no non-candidate-key attribute is transitively dependent on any candidate key.

    There is nothing unusual about a table having multiple CKs in any normal form.