
What are the scalability limitations of relational databases?


Articles online claim that relational databases have scaling issues and are a poor fit for big data, especially in cloud computing, where data volumes are large.



Solution

  • Relational databases provide solid, mature services built around the ACID properties: transaction handling, efficient logging to enable recovery, and so on. These are the core services of relational databases, and the ones they are good at. They are hard to customize, however, and can become a bottleneck, especially if a given application does not need them (e.g. serving low-importance website content; for example, the widely used MySQL historically defaulted to the MyISAM storage engine, before version 5.5, which does not support transactions and therefore does not satisfy ACID). Many "big data" problems do not require these strict constraints, for example web analytics, web search or processing moving-object trajectories, as the data already includes uncertainty by nature.

    When a single machine reaches its limits (memory, CPU, disk: the data is too big, or processing it is too complex and costly), distributing the service is a good idea. Many relational and NoSQL databases offer distributed storage. In this case, however, ACID turns out to be difficult to satisfy. The CAP theorem states something similar: consistency, availability and partition tolerance cannot all be guaranteed at the same time. If we give up ACID (settling for BASE, for example), scalability can be increased. See this post, e.g., for a categorization of storage methods according to CAP.

    Another bottleneck might be the flexible, richly typed relational model itself, together with its SQL operations: in many cases a simpler model with simpler operations (like untyped key-value stores) would be sufficient and more efficient. The common row-wise physical storage model can also be limiting: for example, it isn't optimal for data compression.

    There are, however, fast and scalable ACID-compliant relational databases, including newer ones like VoltDB, since relational database technology is mature, well-researched and widespread. We just have to select an appropriate solution for the given problem.
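
To make the point about distribution and ACID more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: two in-memory SQLite databases stand in for two database nodes, the `accounts` table, the toy range partitioning in `shard_for`, and the helper names (`open_account`, `transfer_local`, `transfer_naive`) are all hypothetical and not taken from any particular product. On a single node, one transaction keeps a transfer atomic. Across two nodes, without extra coordination, a failure halfway leaves the data inconsistent.

```python
import sqlite3


def make_shard() -> sqlite3.Connection:
    """Create one in-memory SQLite database standing in for a database node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    return conn


# Two hypothetical nodes of a partitioned deployment.
SHARDS = [make_shard(), make_shard()]

CREDIT = "UPDATE accounts SET balance = balance + ? WHERE name = ?"
DEBIT = "UPDATE accounts SET balance = balance - ? WHERE name = ?"


def shard_for(name: str) -> sqlite3.Connection:
    """Toy range partitioning: names a-m live on shard 0, the rest on shard 1."""
    return SHARDS[0] if name[0].lower() <= "m" else SHARDS[1]


def open_account(name: str, balance: int) -> None:
    with shard_for(name) as conn:
        conn.execute("INSERT INTO accounts VALUES (?, ?)", (name, balance))


def balance(name: str) -> int:
    query = "SELECT balance FROM accounts WHERE name = ?"
    return shard_for(name).execute(query, (name,)).fetchone()[0]


def transfer_local(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Both accounts on one node: a single transaction makes the transfer atomic."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(CREDIT, (amount, dst))
        conn.execute(DEBIT, (amount, src))  # may violate the CHECK constraint


def transfer_naive(src: str, dst: str, amount: int) -> None:
    """Accounts on different nodes: two independent transactions, no coordination."""
    with shard_for(dst) as conn:
        conn.execute(CREDIT, (amount, dst))  # committed on the destination node
    with shard_for(src) as conn:
        conn.execute(DEBIT, (amount, src))  # a failure here cannot undo the credit


open_account("alice", 100)  # shard 0
open_account("bob", 50)     # shard 0
open_account("zoe", 20)     # shard 1

# Same node: the overdraft fails and the whole transfer is rolled back.
try:
    transfer_local(SHARDS[0], "bob", "alice", 500)
except sqlite3.IntegrityError:
    pass
local_result = (balance("alice"), balance("bob"))

# Different nodes: the debit fails, but the credit has already been committed.
try:
    transfer_naive("zoe", "alice", 500)
except sqlite3.IntegrityError:
    pass
naive_result = (balance("alice"), balance("zoe"))

print("single node (alice, bob):", local_result)
print("two nodes   (alice, zoe):", naive_result)
```

In the single-node case both balances are unchanged after the failed transfer. In the two-node case alice keeps the credited amount even though zoe was never debited. Restoring atomicity across nodes needs a coordination protocol such as two-phase commit, and that extra coordination is one reason strict ACID guarantees tend to cost scalability in distributed setups.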