I have some trouble defining appropriate database design for my java based web application. I have 3 tables(Tag, DT and Property) and my scenario is as follow. Each Tag can contain many Dts and each DT can be assigned to different Tags, each DT has many Properties and each Property can be used in many DTs. When a DT is assigned to a Tag, then the user can set values for all its properties.
My question is that how can I define a relation such that each Tag can have its properties' values base on the DT that is assigned to it. what are missing tables and relations in my design. I have to create relative Entities based on this db then.
Tag1==> DT1 ==> Initial values1, property values set 1 ;
Tag2==> DT1 ==> Initial values1, property values set 2 ;
Here is the db diagram. And thanks for your help in advance.
Here is some sample data
The central issue with your design, as I see it, is the use of the surrogate keys. It is not necessary to always create numeric single-column keys for tables. And even if you do so, this doesn't guarantees you'll be free of duplicates. This, in fact, forces system to keep more indexes on your tables, which is an extra job to be done.
Some concepts:
<table_name>_id
for PK columns. It's also possible due to singular
table names;You don't have to follow them, but it'd be very nice to stick to a consistent naming patterns for your design.
I would start off with property_type
dictionary:
CREATE TABLE property_type (
property_type varchar(20) NOT NULL,
CONSTRAINT p_property_type PRIMARY KEY (property_type)
);
It is a one-column table, that exists only to provide a domain of possible
values for the properties types. I used varchar(20)
, textual column is pretty
fine. A benefit of this — you don't have to join back to this table via numeric
keys to get what property_type_id=123
means.
Properties:
CREATE TABLE property (
property_id integer NOT NULL,
property_name varchar(50) NOT NULL,
property_type varchar(20) NOT NULL,
CONSTRAINT p_property PRIMARY KEY (property_id),
CONSTRAINT u_property_name UNIQUE (property_name),
CONSTRAINT f_property_type FOREIGN KEY (property_type)
REFERENCES property_type ON UPDATE CASCADE
);
I decided to go with a numeric PK here, for I suppose you might want to rename properties
at some point. And should you change property_type
, updates will be cascaded.
Despite the fact there's a PK here already, UNIQUE
constraint on names is a must here,
otherwise you risk to end up in a situation where you have equally named properties
with different IDs.
DT table:
CREATE TABLE dt (
dt_id integer NOT NULL,
dt_name varchar(50) NOT NULL,
CONSTRAINT p_dt PRIMARY KEY (dt_id),
CONSTRAINT u_dt_name UNIQUE (dt_name)
);
Again, having just PK is not enough, creating also UNIQUE
constraint.
Though I would very much got rid of dt_id
completely and would keep only
dt_name
and would have made it a PK.
Properties for the DT:
CREATE TABLE dt_property (
dt_id integer NOT NULL,
property_id integer NOT NULL,
initial_value varchar(50) NOT NULL,
CONSTRAINT p_dt_property PRIMARY KEY (dt_id, property_id),
CONSTRAINT f_dt_id FOREIGN KEY (dt_id) REFERENCES dt,
CONSTRAINT f_property_id FOREIGN KEY (property_id) REFERENCES property
);
This is the first big difference from your design — composite key is used.
Yes, it means you'll have to carry out 2 columns whenever you want to refer to the entry in this table. But this is not a big deal, really — you design table once, you write your queries also once, but your software might be used for quite some time if it's done properly and easily maintained. It's better to spend a bit more time writing queries and get easily maintained system in the long term.
Tags:
CREATE TABLE tag (
tag_id integer NOT NULL,
tag_name varchar(50) NOT NULL,
CONSTRAINT p_tag PRIMARY KEY (tag_id),
CONSTRAINT u_tag_name UNIQUE (tag_name)
);
This is just another dictionary. Again, like for dt
table, I would really want
to avoid using tag_id
column and keep just tag_name
, making it also a PK.
New table tag_dt
introduced:
CREATE TABLE tag_dt (
tag_id integer NOT NULL,
dt_id integer NOT NULL,
CONSTRAINT p_tag_dt PRIMARY KEY (tag_id, dt_id),
CONSTRAINT f_tag_id FOREIGN KEY (tag_id) REFERENCES tag,
CONSTRAINT f_dt_id FOREIGN KEY (dt_id) REFERENCES dt
);
This table is required to create dt
+ tag
relations. Without it, you have data
duplications — you can see it on your schema, you have 2 rows with Tag_name='Tag1'
.
Finally, tag properties:
CREATE TABLE tag_property (
tag_id integer NOT NULL,
dt_id integer NOT NULL,
property_id integer NOT NULL,
a_value varchar(50) NOT NULL,
CONSTRAINT p_tag_property PRIMARY KEY (tag_id, dt_id),
CONSTRAINT u_tag_property UNIQUE (tag_id, property_id),
CONSTRAINT f_tag_property_tag FOREIGN KEY (tag_id, dt_id) REFERENCES tag_dt,
CONSTRAINT f_tag_property_property FOREIGN KEY (dt_id, property_id)
REFERENCES dt_property
);
This table is full composite keys and it conforms to all your requirements.
Primary key is tag_id, dt_id
and it is also a Foreign key to the tag_dt
table,
so you want be able to introduce something that hadn't been defined previously.
Next, tag_id, property_id
is unique, meaning tag's properties cannot duplicate.
Finally, dt_id, property_id
references dt_property
table, which means that
only properties allowed for that dt
will be registered.
All Primary and Unique keys are implemented via indexes in most DBMSes. Also, some DBMSes can use composite indexes (multi-column ones) even if first column of the key is omitted. At least PostgreSQL can do it, this is what I use most.
Please, review your dt
and tag
tables, I highly recommend getting rid of the surrogate
keys in those, just like property_type
is done.
I haven't created any extra indexes, typically I take this exercise after Data Model is implemented and some real queries are done against it.
Also, don't use value
or name
for column names. These are reserved words and you might have unexpected effects in the future incarnations of your DBMS of choice.