database

What is a relation in database terminology?


When someone refers to a relation in a database course, what does that mean?


Solution

  • Amazingly, "relation" in "relational" databases does not refer to the foreign key relationship of one table to another. "A relation is a data structure which consists of a heading and an unordered set of tuples which share the same type," according to Wikipedia on 'Relation (database)'.

    In SQL RDBMSes (such as MS SQL Server and Oracle] tables are permently stored relations, where the column names defined in the data dictionary form the "heading" and the rows are the "tuples" of the relation.

    Then from a table, a query can return a different relation:

    create table t (x number primary key, y number not null);
    
    Table created.
    
    SQL> insert into t values (1, 10);
    
    1 row created.
    
    SQL> insert into t values (2, 20);
    
    1 row created.
    
    SQL> select x from t;
    
             X
    ----------
             1
             2
    

    select x from t returned a relation with fewer columns, tuples of fewer elements, than the base table had. And select x, y from t where x = 1 will return a relation with fewer tuples than the base table:

    SQL> select x, y from t where x = 1;
    
             X          Y
    ---------- ----------
             1         10
    

    An example using inner join:

    SQL> create table s (x number primary key, words varchar2(100) not null);
    
    Table created.
    
    SQL> insert into s values (1, 'Hello World!');
    
    1 row created.
    
    SQL> insert into s values (3, 'Will not show');
    
    1 row created.
    
    SQL> select t.x, t.y, s.words
      2  from t
      3  inner join s
      4      on t.x = s.x;
    
             X          Y WORDS
    ---------- ---------- ---------------
             1         10 Hello World!
    

    Conceptually, t inner join s on t.x = s.x goes through the following steps:

    1. Take the cartesian product of s and t, which is to take each row of s and combine it with each row of t resulting in a tuple with size of s * size of t tuples or rows, each with all the columns from both s and t much like the results of:

      SQL> select * from s, t;

           X WORDS                    X          Y
      

           3 Will not show            1         10
           3 Will not show            2         20
           1 Hello World!             1         10
           1 Hello World!             2         20
      

    (Or select * from s cross join t in the SQL-92 syntax) From the cartesian product containing four tuples/rows with four columns on s.x = t.x trims the tuples down to one, still with four columns:

    SQL> select *
      2  from t
      3  inner join s
      4      on t.x = s.x;
    
             X          Y          X WORDS
    ---------- ---------- ---------- ---------------
             1         10          1 Hello World!
    

    And select t.x, t.y, s.words shaves one column off of the relation.

    Note that the above describes a conceptual or logical model of what is going on. Databases come with query optimizers that are designed to give the results as if all the logical steps had been followed, but manage to skip steps, in the physical implementation of the work and to use supporting physical structures, such as indexes, that are not part of the relational model.

    Views are relation definitions that do not store the relation, but define a relation based on other relations, eventually with tables at the bottom. (Except for materialized views, that precompute and store a relation based on other relations.)