database-designforeign-keyskeyrelational-databasesurrogate-key

Surrogate Keys complicate insertion?


I often see people using artifical / surrogate keys in relational databases. Thinking about it, it seems to me that while this simplifies join-Queries, it complicates the insertion of new tuples. Take the following example:

R1(a, b, c) R2(c, d, e) c is a surrogate primary key of R2, referenced by R1(c). If I want to insert data in R1 and R2, I first have to check whether the to-be-inserted R2 tuple already exists in R2, and if so, I have to get its corresponding artifical key so I can reference it in the tuple for R1.

Using natural keys: R1(a,b,d,e) R2(d,e) The attributes d and e are the natural primary key set for R2, referenced by R1(d,e). If I want to insert a new tuple in R1 and R2, I simply can insert them, because for the R1 tuple the foreign key (ie. value of the R2 primary key set) I reference is known.

Am I correct in my assumption or am I missing something?


Solution

  • You are correct that additional work is usually needed to process surrogate keys. You might be able to take advantage of some library code or a framework to help you with surrogate key processing but there is always a cost in terms of complexity and processing time. Similar considerations apply to queries as well as inserts. Extensive use of surrogate keys invariably means your queries will be doing more joins on average.

    You should evaluate these costs against any potential benefits every time you think you want a surrogate key. Don't overuse surrogates. In many cases, and probably most cases, there is no good reason for using surrogate keys in database tables.