sqldatabase-designforeign-keysrelational-databasedatabase-agnostic

Representing a plant with definite genus but unknown species


I want to represent plants in a database along with their genus/species. This was the schema that came to mind:

CREATE TABLE genuses (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL);

CREATE TABLE species (
    id INTEGER PRIMARY KEY,
    genus_id INTEGER NOT NULL REFERENCES genuses(id),
    name TEXT UNIQUE NOT NULL);

CREATE TABLE plants (
    id INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species(id),
    ...);

The problem with this approach is that it doesn't allow a plant to have a genus without a specific species. For instance, I own Phalaenopsis orchid. I know the genus is Phalaenopsis but I haven't been able to identify the exact species.

There are two solutions I can think of, but I'm not satisfied with either:

  1. For each genus, add an "unknown" row to the species table. (Redundant data, requires triggers or similar to keep species table up to date.)
  2. Keep a reference to species and genus in the plant table. Allow species to be null. (Data consistency issues, additional constraints necessary.)

Is there a better way to do this?


Solution

  • Composite keys are quite useful to enforce that transitive relationship between genuses and plants.

    You can do:

    CREATE TABLE genuses (
      id INTEGER PRIMARY KEY,
      name TEXT UNIQUE NOT NULL
    );
    
    CREATE TABLE species (
      id INTEGER not null UNIQUE,
      genus_id INTEGER NOT NULL REFERENCES genuses (id),
      PRIMARY KEY (id, genus_id),
      name TEXT UNIQUE NOT NULL
    );
    
    CREATE TABLE plants (
      id INTEGER PRIMARY KEY,
      genus_id INTEGER NOT NULL REFERENCES genuses (id),
      species_id INTEGER,
      FOREIGN KEY (species_id, genus_id) REFERENCES species (id, genus_id),
      name VARCHAR(50)
    );
    

    Then you can insert:

    insert into genuses (id, name) values (100, 'G1');
    insert into genuses (id, name) values (101, 'G2');
    
    insert into species (id, genus_id, name) values (1000, 100, 'S1');
    insert into species (id, genus_id, name) values (1001, 100, 'S2');
    
    insert into plants (id, genus_id, species_id, name) values (200, 100, 1000, 'P1');
    insert into plants (id, genus_id, species_id, name) values (201, 100, null, 'P2');
    

    See running example at db-Fiddle.

    Note: This is a database-agnostic solution... mostly. You need to make sure the foreign keys are defined with matching type MATCH SIMPLE, which is the default virtually all around relational databases. MATCH FULL won't allow your use case.