sqldatabasenon-relational-databaseobject-relational-model

Dynamic Database/Key - Value/Entity - Key Value Dillemma


I have been programming relational database for many years, but now have come across an unusual and tricky problem:

I am building an application that needs to have very quick and easily defined entities (by the user). Instances of these entities could then be created, updated, deleted etc.

There are two options I can think of.

Option 1 - Dynamically created tables

The first option is to write an engine to dynamically generate the tables, and insert the data into these. However, this would become very tricky, as every query would also need to be dynamic, or at least dynamically created stored procedures etc.

Option 2 - Entity - Key - Value Pattern

This is the only realistic option I can think of, where I have 5 table structure:

EntityTypes

EntityTypeID int

EntityTypeName nvarchar(50)

Entities

EntityID int

EntityTypeID int

FieldTypes

FieldTypeID int

FieldTypeName nvarchar(50)

SQLtype int

FieldValues

EntityID int

FIeldID int

Value nvarchar(MAX)

Fields

FieldID int

FieldName nvarchar(50)

FieldTypeID int

The "FieldValues" table would work a little like a datawarehouse fact table, and all my inserts/updates would work by filling a "Key/Value" table valued parameter and passing this to a SPROC (to avoid multiple inserts/updates).

All the tables would be heavily indexed, and I would end up doing many self joins to obtain the data.

I have read a lot about how bad Key/Value databases are, but for this problem it still seems to be the best.

Now my questions!

Any direction and advice much appreciated!


Solution

  • As others have suggested NoSQL, I'm going to say that, in my opinion, schemaless databases really is best suited for use-cases with no schema.

    From the description, and the schema you came up with, it looks like your case is not in fact "no schema", but rather it seems to be "user-defined schema".

    In fact, the schema you came up with looks very similar to the internal meta-schema of a relational database. (You're sort of building a relational database on top of a relational database, which in my experience is not a good idea, as this "meta-database" will have at least twice the overhead and complexity for any basic operation - tables will get very large, which doesn't scale well, and the data will be difficult to query and update, problems will be difficult to debug, and so on.)

    For use-cases like that, you probably want DDL: Data Definition Language.

    You didn't say which SQL database you're using, but most SQL databases (such as MySQL, PostgreSQL and MS-SQL) support some dialect of DDL extensions to SQL syntax, which let you manipulate the actual schema.

    I've done this successfully for use-cases like yours in the past. It works well for cases where the schema rarely changes, and the data volumes are relatively low for each user. (For high volumes or frequent schema updates, you might want schemaless or some other type of NoSQL database.)

    You might need some tables on the side for additional field information that doesn't fit in SQL schema - you may want to duplicate some schema information there as well, as this can be difficult or inefficient to read back from actual schema.

    Ensuring atomic updates to your field information tables and the schema probably requires transactions, which may not be supported by your database engine - PostgreSQL at least does support transactional schema updates.

    You have to be vigilant when it comes to security - you don't want to open yourself up to users creating, storing or deleting things they're not supposed to.

    If it suits your use-case, consider using not only separate tables, but separate databases, which can also by created and destroyed on demand using DDL. This could be applicable if each customer has ownership of data collections that can't, shouldn't, or don't need to be queried across customers. (Arguably, these are rare - typically, you want at least analytics or something across customers, but there are cases where each customer "owns" an isolated, hosted wiki, shop or CMS/DMS of some sort.)

    (I saw in your comment that you already decided on NoSQL, so just posting this option here for completeness.)