Table Action has fields id, start_time, end_time and group_id.
Table ActionGroup has fields id, start_time and end_time of a group of multiple sequential actions.
ActionGroup has a 1-to-many relationship with Action.
Example:
If there are four Action records, the relevant ActionGroup record is
Action already includes the data that ActionGroup needs, and ActionGroup is repeating it.
How do I query the start time and end time of the ActionGroup in a simple and performant way, without separately logging the start_time and end_time fields in ActionGroup?
What are the problems with duplicating the data as above when designing a table schema?
To start with your last question: data duplication is bad because it introduces the risk of inconsistency, e.g. ActionGroup 1 being registered with an end_time of (say) 8 while containing an Action with end_time 13. If your data is inconsistent, you can't trust any query results, since an inconsistent logical system can produce any result at all.
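To make the inconsistency concrete, here is a minimal sketch using Python's sqlite3 module. The sample rows are made up to mirror the "end_time 8 versus 13" case above, and the column types are an assumption (plain integers). The query lists every group whose stored times disagree with its actions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ActionGroup (id INTEGER PRIMARY KEY, start_time INTEGER, end_time INTEGER);
CREATE TABLE Action (id INTEGER PRIMARY KEY, start_time INTEGER, end_time INTEGER,
                     group_id INTEGER REFERENCES ActionGroup(id));
-- Hypothetical data: the group claims to end at 8, but one action ends at 13.
INSERT INTO ActionGroup VALUES (1, 1, 8);
INSERT INTO Action VALUES (1, 1, 8, 1), (2, 8, 13, 1);
""")

# Compare the stored (redundant) values with the values derived from Action.
stale = conn.execute("""
SELECT g.id, g.end_time AS stored_end, MAX(a.end_time) AS actual_end
FROM ActionGroup g
JOIN Action a ON a.group_id = g.id
GROUP BY g.id, g.start_time, g.end_time
HAVING g.start_time <> MIN(a.start_time) OR g.end_time <> MAX(a.end_time)
""").fetchall()
print(stale)  # [(1, 8, 13)]
```

Nothing in this schema stops such a row from being written, which is exactly the risk described above.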
In your example, ActionGroup can be derived from Action using an aggregate query (relational algebra pseudocode):
Action group by { group_id } add {
Min(start_time) start_time,
Max(end_time) end_time
}
rename { group_id id }
So you don't need those attributes in the ActionGroup relvar (table). This is simple; whether it's fast enough depends on your requirements.
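In SQL terms, the aggregate above could be written as a view. The following is a sketch only, run through Python's sqlite3 module; the integer column types, the index name and the sample rows are assumptions for illustration, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Action (
    id         INTEGER PRIMARY KEY,
    start_time INTEGER NOT NULL,
    end_time   INTEGER NOT NULL,
    group_id   INTEGER NOT NULL
);
-- An index on group_id lets the engine find a group's actions without a full scan.
CREATE INDEX action_group_idx ON Action (group_id);

-- ActionGroup is derived from Action instead of being stored.
CREATE VIEW ActionGroup AS
SELECT group_id        AS id,
       MIN(start_time) AS start_time,
       MAX(end_time)   AS end_time
FROM Action
GROUP BY group_id;
""")

# Made-up sample rows: three sequential actions in group 1, one action in group 2.
conn.executemany(
    "INSERT INTO Action (id, start_time, end_time, group_id) VALUES (?, ?, ?, ?)",
    [(1, 1, 3, 1), (2, 3, 8, 1), (3, 8, 13, 1), (4, 20, 25, 2)],
)

groups = conn.execute(
    "SELECT id, start_time, end_time FROM ActionGroup ORDER BY id"
).fetchall()
print(groups)  # [(1, 1, 13), (2, 20, 25)]
```

Because the view is computed on every read, it cannot drift out of sync with Action; the cost is paid at query time.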
But note that if you do keep redundant start_time and end_time attributes in ActionGroup, you need to control the redundancy (preferably with a constraint, alternatively with triggered actions, or, in the worst case, application code) in order to avoid inconsistency. This will also have performance implications, but they will apply to writes instead of reads.
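As one possible illustration of the "triggered actions" option, the sketch below keeps the redundant columns in sync with SQLite triggers, again via Python's sqlite3 module. SQLite has no declarative cross-table constraint (no CREATE ASSERTION), so triggers stand in for the preferred constraint here. The trigger names, column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ActionGroup (id INTEGER PRIMARY KEY, start_time INTEGER, end_time INTEGER);
CREATE TABLE Action (id INTEGER PRIMARY KEY, start_time INTEGER, end_time INTEGER,
                     group_id INTEGER REFERENCES ActionGroup(id));
""")

# Recompute the stored times of the affected group(s) from their actions.
RECOMPUTE = """
UPDATE ActionGroup
SET start_time = (SELECT MIN(a.start_time) FROM Action a WHERE a.group_id = ActionGroup.id),
    end_time   = (SELECT MAX(a.end_time)   FROM Action a WHERE a.group_id = ActionGroup.id)
WHERE id IN ({ids});
"""

# One trigger per write event; an UPDATE may move an action between groups,
# so both the old and the new group are recomputed.
for event, ids in [
    ("INSERT", "NEW.group_id"),
    ("DELETE", "OLD.group_id"),
    ("UPDATE", "OLD.group_id, NEW.group_id"),
]:
    conn.execute(
        f"CREATE TRIGGER action_after_{event.lower()} AFTER {event} ON Action "
        f"BEGIN {RECOMPUTE.format(ids=ids)} END"
    )

def group_times(group_id):
    """Return (start_time, end_time) as stored in ActionGroup."""
    return conn.execute(
        "SELECT start_time, end_time FROM ActionGroup WHERE id = ?", (group_id,)
    ).fetchone()

# Made-up sample data.
conn.execute("INSERT INTO ActionGroup (id) VALUES (1)")
conn.executemany("INSERT INTO Action VALUES (?, ?, ?, ?)", [(1, 1, 8, 1), (2, 8, 13, 1)])
after_insert = group_times(1)   # (1, 13)

conn.execute("UPDATE Action SET end_time = 15 WHERE id = 2")
after_update = group_times(1)   # (1, 15)

conn.execute("DELETE FROM Action WHERE id = 1")
after_delete = group_times(1)   # (8, 15)
```

Every write to Action now runs two extra aggregate subqueries, which is the write-side cost mentioned above; reads of ActionGroup become plain row lookups.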