mysqlsqldatabase-designprimary-keysurrogate-key

How to avoid mistakes at Primary Key


Hi I'm a beginner at Databases, for this reason I want to ask you which atributes should I use as primary key to avoid mistakes:

    CREATE TABLE customer(
    name
    first_lastname
    street
    ZIP_code
    mobile_phone
    telephone
    email
    gender
    birthdate
    nationality);

Optionally I was thinking to add idcustomer as auto_increment but I am not sure that will be a great idea.


Solution

  • I was thinking to add idcustomer as auto_increment but I am not sure that will be a great idea.

    It is indeed a good idea.

    Your other columns (attributes) do not necessarily have unique values. In other words, they are not suitable to use as natural primary keys. What sort of value might work as a natural primary key? An employee number, possibly. A product serial number might work. Taxpayer ID numbers (social security numbers) do not work: a surprising number of people use duplicate numbers by mistake. The uniqueness standard for choosing a real-world item as a primary key is so high that most database designers don't even try.

    So creating a guaranteed-unique primary key is generally good design. The jargon for that kind of key is surrogate primary key. Most DBMS systems, MySQL included, provide autoincrementing numbers for that purpose.

    You can choose one of two conventions for naming that id value. One is to call it id. The other is to call it customer_id (the table name with _id added). The second one will help you keep things straight when you start using those values in other tables to establish relationships.

    For example, you might have a sales table. That table might have these columns:

    sales_id      autoincrementing pk
    customer_id   the id of the customer to whom the sale was made. (foreign key)
    item_sold     description of the item
    list_price
    discount
    net_price
    

    You get the idea. Read about primary keys and foreign keys. In the jargon of "logical database design," you can read about entities (customer, sales) and relationships. Each table gets its own series of auto-incrementing values.

    You can then use a query like this to find out sales to each customer.

     SELECT customer.name, customer.first_lastname,
            COUNT(sales.sales_id) number_of_sales,
            SUM(sales.net_price) revenue
       FROM customer
       JOIN sales ON customer.customer_id = sales.customer_id
      GROUP BY customer.customer_id, customer.name, customer.first_lastname
    

    Here the sales entity has a many-to-one relationship to the customer entity. That's implemented by having a customer_id attribute in each sales row pointing back to the customer.

    It's also a convention to make the id the first column in each table.

    Conventions are good: they help the next person to look at your application. They also help your future self.

    Note: my sales table is just an example to show how autoincrementing id values might be useful. I don't claim it's a good layout for a real-world sales table: it is not.