databasedatabase-designrelational-databasedatabase-schemaidentifying-relationship

Data Modeling: Logical Modeling Exercise


In trying to learn the art of data storage I have been trying to take in as much solid information as possible. PerformanceDBA posted some really helpful tutorials/examples in the following posts among others: is my data normalized? and Relational table naming convention. I already asked a subset question of this model here.

So to make sure I understood the concepts he presented and I have seen elsewhere I wanted to take things a step or two further and see if I am grasping the concepts. Hence the purpose of this post, which hopefully others can also learn from. Everything I present is conceptual to me and for learning rather than applying it in some production system. It would be cool to get some input from PerformanceDBA also since I used his models to get started, but I appreciate all input given from anyone.

As I am new to databases and especially modeling I will be the first to admit that I may not always ask the right questions, explain my thoughts clearly, or use the right verbage due to lack of expertise on the subject. So please keep that in mind and feel free to steer me in the right direction if I head off track.

If there is enough interest in this I would like to take this from the logical to physical phases to show the evolution of the process and share it here on Stack. I will keep this thread for the Logical Diagram though and start new one for the additional steps. For my understanding I will be building a MySQL DB in the end to run some tests and see if what I came up with actually works.

Here is the list of things that I want to capture in this conceptual model. Edit for V1.2

  1. The purpose of this is to list Bands, their members, and the Events that they will be appearing at, as well as offer music and other merchandise for sale
  2. Members will be able to match up with friends
  3. Members can write reviews on the Bands, their music, and their events.
    • There can only be one review per member on a given item, although they can edit their reviews and history will be maintained.
    • BandMembers will have the chance to write a single Comment on Reviews about the Band they are associated with. Collectively as a Band only one Comment is allowed per Review.
    • Members can then rate all Reviews and Comments but only once per given instance
  4. Members can select their favorite Bands, music, Merchandise, and Events
  5. Bands, Songs, and Events will be categorized into the type of Genre that they are and then further subcategorized into a SubGenre if necessary. It is ok for a Band or Event to fall into more then one Genre/SubGenre combination.
  6. Event date, time, and location will be posted for a given band and members can show that they will be attending the Event. An Event can be comprised of more than one Band, and multiple Events can take place at a single location on the same day
  7. Every party will be tied to at least one address and address history shall be maintained. Each party could also be tied to more then one address at a time (i.e. billing, shipping, physical)
  8. There will be stored profiles for Bands, BandMembers, and general members.

So there it is, maybe a bit involved but could be a great learning tool for many hopefully as the process evolves and input is given by the community. Any input?

alt text

EDIT v1.1 In response to PerformanceDBA

U.3) That means no merchandise other than Band merchandise in the database. Correct ? That was my original thought but you got me thinking. Maybe the site would want to sell its own merchandise or even other merchandise from the bands. Not sure a mod to make for that. Would it require an entire rework of the Catalog section or just the identifying relationship that exists with the Band? Attempted a mod to sell both complete albums or song. Either way they would both be in electronic format only available for download. That is why I listed an Album as being comprised of Songs rather then 2 separate entities.

U.5) I understand what you bring up about the circular relation with Favorite. I would like to get to this “It is either one Entity with some form of differentiation (FavoriteType) which identifies its treatment” but how to is not clear to me. What am I missing here?

u.6) “Business Rules This is probably the only area you are weak in.”
Thanks for the honest response. I will readdress these but I hope to clear up some confusion in my head first with the responses I have posted back to you.

Q.1) Yes I would like to have Accepted, Rejected, and Blocked. I am not sure what you are referring to as to how this would change the logical model?

Q.2) A person does not have to be a User. They can exist only as a BandMember. Is that what you are asking?

Minor Issue

Zero, One, or More…Oops I admit I forgot to give this attention when building the model. I am submitting this version as is and will address in a future version. I need to read up more on Constraint Checking to make sure I am understanding things.

M.4) Depends if you envision OrderPurchase in the future. Can you expand as to what you mean here?

alt text

EDIT V1.2 In response to PerformanceDBA input...

Lessons learned.

  1. I was mixing the concept of Identifying / Non-Identifying and Cardinality (i.e. Genre / SubGenre), and doing so inconsistently to make things worse.
  2. Associative Tables are not required in Logical Diagrams as their many-to-many relationships can be depicted and then expanded in the Physical Model.
  3. I was overlooking the Cardinality in a lot of the relationships
  4. The importance of reading through relationships using effective Verb Phrases to reassure I am modeling what I want to accomplish.

U.2) In the concept of this model it is only required to track a Venue as a location for an Event. No further data needs to be collected. With that being said Events will take place on a given EventDate and will be hosted at a Venue. Venues will host multiple events and possibly multiple events on a given date. In my new model my thinking was that EventDate is already tied to Event . Therefore, Venue will not need a relationship with EventDate. The 5th and 6th bullets you have listed under U.2) leave me questioning my thinking though. Am I missing something here?

U.3) Is it time to move the link between Item and Band up to Item and Party instead? With the current design I don't see a possibility to sell merchandise not tied to the band as you have brought up.

U.5) I left as per your input rather than making it a discrete Supertype/Subtype Relationship as I don’t see a benefit of having that type of roll up.

Additional Revisions

AR.1) After going through the exercise for FavoriteItem, I feel that Item to Review requires a many-to-many relationship so that is indicated. Necessary? enter image description here

Ok here we go for v1.3

I took a few days on this version, going back and forth with my design. Once the logical process is complete, as I want to see if I am on the right track, I will go through in depth what I had learned and the troubles I faced as a beginner going through this process. The big point for this version was it took throwing in some Keys to help see what I was missing in the past. Going through the process of doing a matrix proved to be of great help also. Regardless of anything, if it wasn't for the input given by PerformanceDBA I would still be a lost soul wondering in the dark. Who knows my current design might reaffirm that I still am, but I have learned a lot so I am know I at least have a flashlight in my hand.

At this point in time I admit that I am still confused about identifying and non-identifying relationships. In my model I had to use non-identifying relationships with non nulls just to join the relationships I wanted to model. In reading a lot on the subject there seems to be a lot of disagreement and indecisiveness on the subject so I did what I thought represented the right things in my model. When to force (identifying) and when to be free (non-identifying)? Anyone have inputs?

enter image description here

EDIT V1.4

Ok took the V1.3 inputs and cleaned things up for this V1.4

Currently working on a V1.5 to include attributes.

enter image description here

EDIT V1.6

Okay, it has been some time since I have posted on here but the work on this project is still ongoing. I am posting V1.6 now which includes a number of changes from the last posting of V1.4. This version shows the further evolution of the Keys. It still does not include the attributes or any AK's or IE's. I have started working on the physical model and used that to help work through the attributes and to try and shed some light on the problems I am having with defining the AK's and IE's. The next posting of the Logical Model will include these keys and the attributes.

enter image description here


Solution

  • Method

    I will cover specifics, but I will cover one or two Subject Areas completely, not all. You can pick that up and apply it to all subject Areas.

    I have not responded to the core Subject Area, because we are still dealing with Identifying Entities. When that is resolved the Reviews, etc will be easier; the Transaction Entities are Dependent on the Identifying Entities.

    Direction

    D.1) I know that I stated that I need to see the whole model. There is one exception. Historic or Temporal or Audit data (eg. the Edit and stored versions). At this early stage, they can be set aside; to be implemented just before completion of the Logical Model. This is in recognition that (a) they are simple Dependents of some parent (b) the parents need to be modelled in relation to all other tables first, and (c) to exclude unnecessary complications, and thus allow us to concentrate on the relevant field.

    Unresolved

    U.1) Optional Parent
    That is completely disallowed. Not just by IDEF1X, but by any notion of Integrity. If the FK Reference is defined, then there must be a Parent. To allow optional parents, the FK Reference must be removed (or not implemented). Such a condition would exclude the result from qualfying as a "Relational database", by definition. Eg. Address:Order.

    U.2) Event
    Party::PartyAddress is correct; Address::PartyAdress is correct. Event::Address needs work. Address is an Identifying Reference table; if used, it would be the parent, Event would be the child. I leave it to you to identify/model multiple Events to a location, and Events at one or multiple locations.

    U.3) Assuming Catalog is an entry in the traditional sense (JCPenney 2011), a list of items for sale or hire.

    U.4) Genre

    U.5) Favorite
    The Cardinality of Item::Favorite is reversed. When you correct that, the Favorite Subject Area will require further modelling.

    U.6) Business Rules This is probably the only area you are weak in. General response. You have done the tasks separately (all the modelling vs writing BRs). These do not match the model. When you go through the next cycle, take the Business Rules as directives, and modulate them at the same time, as with the Entities, the Relations, and the Verb Phrases.

    Question

    Q.1) User/Friend
    You have the essence of it perfectly. And the Cardinality of the Relations. (Full treatment on this one.) That is correct for Accepted Friend.

    Q.2) What is the basis on which a Person is zero-to-many Users ?

    Minor Issue

    M.1) Singular only.

    M.2) Party Has zero-to-many Addresses. I would think they must have one, in order to transact business (but perhaps not for all Users).

    M.3) Order May Have zero-to-many Payments. "Requires" means that first Payment has to be inserted at the same time as Order.

    M.4) OrderSaleItem shoulld be OrderItem xor Order should be OrderSale. Depends if you envision OrderPurchase in the future.

    Subject Area Example

    Readers who are unfamiliar with the Standard for Modelling Relational Databases may find IDEF1X Notation useful.

    As stated, I am not providing a finished Data Model, only guidance. This is just one progression of one selected Subject Area. It is not "right" or complete in any way.

    So go ahead and progress the model, then post again (just edit the question, leaving the header paras, and replacing the rest).

    V1.1 and Response

    That is certainly a progression.

    I have re-numbered the items in pseudo-legal format, including the section headings, so that we can keep the numbering throughout, and keep adding to it. Actually it really eases the SO editing problems as well.

    U.3) Would it require an entire rework of the Catalog section or just the identifying relationship that exists with the Band?

    Attempted a mod to sell both complete albums or song. Either way they would both be in electronic format only available for download. That is why I listed an Album as being comprised of Songs

    rather then 2 separate entities.

    U.5) ... but how to is not clear to me. What am I missing here?

    Q.1) Yes I would like to have Accepted, Rejected, and Blocked. I am not sure what you are referring to as to how this would change the logical model?

    With the former, we have additional statuses:

    Q.2) A person does not have to be a User. They can exist only as a BandMember. Is that what you are asking?

    M.3) I need to read up more on Constraint Checking to make sure I am understanding things.

    M.4) Depends if you envision OrderPurchase in the future. Can you expand as to what you mean here?

    Version 1.1

    U.2) Event Progressed

    M.5) SubGenre. Can you explain why SubGenre is (a) Independent and (b) the Relation is Non-Identifying.

    M.6) Item Is zero-to-many Favourites. Therefore: Item Is a Favourite of zero-to-many Users. Likewise, Each User Chooses zero-to-many Favourites. Therefore Each User Chooses zero-to-many Favourite Items.

    V1.2 and Response

    Great Progress.

    U.2) Event Further Progressed

    Going by your Edit as well as the new Requirements, some yes and some no. All the other Subject Areas of the Data Model are pretty much complete (for Logical), this one area is confused, not nearly as resolved. Partly because of the added Requirements (no complaint, that happens in real life; it is about how you handle it).

    The main point I will make here is that the Data Model should always model the real world, as opposed to only the business Requirement. That (a) insulates the DM from the effect of change and (b) provides a solid platform for added Requirements. That does not mean you have to model the whole real world, but the parts of it that you do model must reflect reality and not be squished up to fill just the Requirement.

    Second, there is lack of clarity about the distinctions between Event, Band-Event, Performance, etc. Right now an Event is a Party-Band-Item-Event. That's fine, but it does not work for the new style Event per Requirement.

    Third, you have a good handle on Address re Party and Order, but not re Venue.

    If you do not mind, I will avoid typing one thousand words, and give you a picture. Subject Area Example V1.2

    U.3) Is it time to move the link between Item and Band up to Item and Party instead? With the current design I don't see a possibility to sell merchandise not tied to the band as you have brought up.

    Version 1.2

    AR.1) After going through the exercise for FavoriteItem, I feel that Item to Review requires a many-to-many relationship so that is indicated. Necessary?

    U.7) That leaves us with a new issue to resolve. If a Review can be about a Band or Album or Song or Performance, how do we ensure that Referential Integrity. We do not need an AlbumReview to reference a SongReview, etc. Model it.

    R.5) The model currently provides Genre at the Item level, that means Album and Song (Merchandise can be disallowed via a CHECK Constraint). Not Band. That may be enough, given that (a) bands change over time, (b) that kind of classification at the Item level is more precise, and (c) Band Genre can be easily derived from their Albums or Songs.

    R.6) members can show that they will be attending the Event is not modelled. Also clarify interest vs booking vs attendance.

    R.8) Is not modelled.

    M.3) The issue is closed, but the Verb Phrase remains unchanged.

    M.7) Logical Model vis-a-vis Associative tables. Now that that issue is closed, remove any Associative tables for the Logical model; any remaining tables (between two parents) will contain data. That means, go through all the Dependent tables and remove any that do not have data. Thus V1.3 should be less cluttered.

    M.8) Item is OrderItem.

    M.9) Now that Party-Person-User is resolved. An Exclusive Subtype structure requires a Discriminator, and the Constrainst will be used to enforce Integrity. Where there are many, PartyType is the way to go. But for just two, a column IsBand or IsPerson is adequate.

    M.10) You have corrected the cardinality-reversed bug, but some Verb Phrases are still going the wrong way.

    27 Jan 11

    Actually, I think a lot of these issues would be clearer if we move into the Logical Key/Attribute level (rather than just Entity Relation level). And it is high time we did. For example:

    Q.3) Order:Address is suspicious. The constraint is not quite correct because that would allow the order to have any Address, not an Address that is specific to the Party executing the order.

    But since you are MySQL, which has no Referential Integrity, you may not be aware of how it is done in real SQL, so I will provide the FK Definitions, which happen to be RI Constraints as well. It is kind of unfair to expect you to understand my terse statements, which are based in the RM, Normalisation and supported by SQL, when you do not have SQL.

    Address Qualification Example

    Continued in Part II ...