relational-database reporting historical-db

Reporting on historical data - when to use current or historic data (warn user about changing the meaning of a record?)


This is similar to this question

Basically I am writing the database and software for a building monitoring system. The system monitors things like temperature, humidity, pressure of units (such as fridges).

When a report is generated for a particular unit, it has a column for each sensor with the sensor name, with the readings below.

I'm trying to decide whether it should be the names at the time of the reading, or the name as it stands now.

I currently think it should be the latter, because I tend to believe that given the same (primary) key, the meaning of the data it uniquely identifies should not change even if its attributes do. So, any name changes should be for correction or clarification only, IMO, almost as if the name itself was the primary key (I'm all for natural keys but tend not to use the name of something due to length and possible need for spelling correction - in this case though I have a "SensorNo" field (not ID) which is unique to the unit it is monitoring).

Anyway - this means I can indeed use the current name, and if they change the name of the sensor from perhaps "Air" to "Food", then it should say "Food" on all report data even if it is showing data from before the change, the idea being this change should mean it was always meant to be "Food", not "Air" - it is a correction.

For non-"key" data such as "Upper Limit" and "Lower Limit" (the temperature range, for instance), the historical data should be used when reporting, because it shows the temperature range of the sensor at the time of the reading.
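A minimal sketch of this rule, for illustration only: the report row carries the sensor's current name, while the limits are looked up as they stood at the time of the reading. The names `LimitChange`, `limits_at` and `report_row`, and all dates and values, are hypothetical and not taken from the real system.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LimitChange:
    """One entry in a sensor's limit history (hypothetical structure)."""
    effective_from: datetime
    lower: float
    upper: float


def limits_at(history, when):
    """Return the limits in effect at `when`.

    Assumes `history` is sorted by `effective_from`, oldest first.
    """
    starts = [change.effective_from for change in history]
    idx = bisect_right(starts, when) - 1
    if idx < 0:
        raise ValueError("no limits recorded at or before this reading")
    return history[idx]


def report_row(current_name, history, reading_time, value):
    """Build one report row: current name, limits as of the reading."""
    limits = limits_at(history, reading_time)
    return {
        "sensor": current_name,  # always the name as it stands now
        "time": reading_time,
        "value": value,
        "lower": limits.lower,   # historic limits, not today's
        "upper": limits.upper,
        "in_range": limits.lower <= value <= limits.upper,
    }


# Made-up example: the range was tightened on 1 June 2012.
history = [
    LimitChange(datetime(2012, 1, 1), 2.0, 8.0),
    LimitChange(datetime(2012, 6, 1), 1.0, 5.0),
]
old_row = report_row("Food", history, datetime(2012, 3, 15), 6.5)
new_row = report_row("Food", history, datetime(2012, 7, 1), 6.5)
```

In this sketch the same value of 6.5 is judged against 2.0 to 8.0 for the March reading and against 1.0 to 5.0 for the July reading, while both rows are labelled with the current name.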

You could argue it is similar to changing the full name of a user: it is still the same person, but they may have changed their name, therefore the report history should always show the person's up-to-date name, otherwise confusion could ensue.

My plan is to allow users to change the name but warn them of the implications first, and state it should only be used for corrections.

What are people's thoughts on this? How do you handle this situation? I'm also interested in hearing from Catcall, who answered the question linked.

(Please note this is not a discussion about how historical data is stored, I'm already fine with that side of things).


Solution

  • I think you answered it yourself. If they change the name of a field, it's a correction (and then they do not need to see the old name in new reports). If they change the sensor name because its meaning has changed, then they should ADD a new sensor. I do not think you can really save your users from this kind of mistake. In some applications I have seen, they DROP the database when they build a new input set (but you can't, or do not want to, do this). It may be different if they want to use the vendor/model name as the tag name (sensor name). In that case it may change over time, and they probably want to see the true name for each interval. If the changes are limited to the tag name, you can get by with a small table to track them (id, name, timestamp) and update the hypothetical ResolveTagName(id, time) function to query that table. I have more doubts about the historical data for limits, because they may change over time (for example, the maximum temperature for a part can decrease over time because the part is getting older or because of measurable physical stress; in that case you cannot apply the current limit to an old measurement).
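A rough sketch of the name-tracking table and the ResolveTagName(id, time) lookup mentioned above, using Python's built-in sqlite3 module. The table name `tag_name_history`, the ISO-8601 text timestamps and the sample rows are assumptions made for illustration; only the (id, name, timestamp) columns come from the answer.

```python
import sqlite3

# In-memory database, just for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tag_name_history (
        id        INTEGER NOT NULL,  -- the tag (sensor) identifier
        name      TEXT    NOT NULL,  -- the name in effect from `timestamp` on
        timestamp TEXT    NOT NULL,  -- ISO-8601 text, so it sorts chronologically
        PRIMARY KEY (id, timestamp)
    )
    """
)
# Made-up history: tag 1 was renamed on 1 June 2012.
conn.executemany(
    "INSERT INTO tag_name_history (id, name, timestamp) VALUES (?, ?, ?)",
    [
        (1, "Air", "2012-01-01T00:00:00"),
        (1, "Food", "2012-06-01T00:00:00"),
    ],
)


def resolve_tag_name(conn, tag_id, time):
    """Return the name tag `tag_id` had at `time`, or None if none is recorded.

    Picks the most recent history row whose timestamp is not after `time`.
    """
    row = conn.execute(
        """
        SELECT name
        FROM tag_name_history
        WHERE id = ? AND timestamp <= ?
        ORDER BY timestamp DESC
        LIMIT 1
        """,
        (tag_id, time),
    ).fetchone()
    return row[0] if row else None


name_in_march = resolve_tag_name(conn, 1, "2012-03-15T12:00:00")
name_in_july = resolve_tag_name(conn, 1, "2012-07-01T00:00:00")
name_before_any = resolve_tag_name(conn, 1, "2011-01-01T00:00:00")
```

With this approach a report can label each interval with the name that was in effect then, while a report that only ever wants the current name can simply pass the current time.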