mysqldatabasedatabase-designsurrogate-key

Should I create a surrogate key instead of a composite key?


Structure:

Actor <=== ActorMovie ===> Movie

ActorMovie: ActorID (fk), MovieId (fk)... ===> pk: (ActorID, MovieID)

Should do I create a surrogate key for ActorMovie table like this?

ActorMovie: ActorMovieID (pk), ActorID (fk), MovieId (fk)...

Solution

  • Conventions are good if they are helpful

    "SQL Antipatterns", Chapter 4, "ID Required"

    Intention of primary key

    Primary key - is something that you can use to identify your row with it's unique address in table. That means, not only some surrogate column can be primary key. In fact, primary key should be:

    Compound versus surrogate

    There are cases, when surrogate key has benefits. Most common problem - if you have table with people names. Can combination of first_name + last_name + taxpayer_id be unique? In most cases - yes. But in theory, there could be cases, when duplicated will occur. So, this is the case, when surrogate key will provide unique identifying of rows in any case.

    However, if we're talking about many-to-many link between tables, it's obvious, that linking table will always contain each pair once. In fact, you'll even need to check if duplicate does not exist before operating with that table (otherwise - it's redundant row, because it holds no additional information unless your design has a special intention to store that). Therefore, your combination of ActorID + MovieID satisfies all conditions for primary key, and there's no need to create surrogate key. You can do that, but that will have little sense (if not at all), because it will have no meaning rather than numbering rows. In other hand, with compound key, you'll have:

    As a conclusion - yes, there are cases, when surrogate key should (or even must) be applied, but in your particular case it will definitely be antipattern - use compound key.

    References: