databasedesign-patternsdatabase-designcontent-management-system

Best practice for draft/published states


I'm building a CMS and trying to figure out the best data architecture to handle a "save as draft" functionality.

Let's say I have a post table. I need to be able to save a post as a draft, and to publish it when needed. If a post has been saved as a draft, we will see the published version of the post. When editing that post, we will see the draft version, with the option to save it as a draft or to publish it.

After having done some research I came a across these options for the data architecture:

Option 1

post
  id
  title
  body
  status

post_draft
  id
  title
  body
  post_id

When creating a post, we insert a row in both tables. When viewing a post, we look up the post table. When editing a post, we look in the post_draft table to see if there is a draft. When saving a draft we update the post_draft table, and when publishing we copy the content of post_draft to the relevant post.

I've tried it and it works well, but it has the drawback of having to maintain two identical tables. So if in the future I need to add a column to a post, I need to add it to both tables.

However it neatly separates the published and draft posts.

Option 2

post
  id

post_content
  title
  body
  status
  post_id

Here we have a single table for a post, and we retrieve its content based on which status we want to retrieve. If we want the draft version, we join post and post_content with status="draft"

That also works well, but then it makes things like indexing and search a bit more complicated. It also introduces more complication for other things I need to do, like being able to return several tables in one request (I won't go into details - suffice to say that it complicates some other requirements).

Option 3

post
  id
  title
  body
  status
  published_id

Here we have only one table. To get the published version of a post we looked for the id with status="published". To look for the draft version we look for status="draft" and published_id=[the post id]

That seems to solve a number of problems. The only drawback I can see is that the id of published posts will jump all over the place.

Is there a best practice for implementing something like this? Any preferable options or something I haven't considered?


Solution

  • The third option is the best option and, from my point of view, the only correct one. The content of DRAFT, and LIVE is the same, and is only differentiated by one property - status. Imagine a situation where you will need to add a new status like "ACCEPTED", "DELETED", etc. you probably won't want to create new tables for this, because handling it will be a nightmare.

    Why do you see skipping IDs as a problem? It's not a problem for the database, and the post numbers if you want them to be in sequence then you can add a post_number column for example.