Looking for the high-level differences among a database, data mart, data warehouse, data lake, and data lakehouse.
Please use relative comparison when specifics are not available.
Below is a high-level comparison of the data tiers mentioned. Feel free to drop a comment if any of these entries need correcting.
Feature | Database | Data Mart | Data Warehouse | Data Lake | Data Lakehouse |
---|---|---|---|---|---|
Source | Single | Single | Multiple | Multiple | Multiple |
Structure | Structured | Structured | Structured | Raw | Structured, semi-structured, and unstructured |
Purpose | Determined | Determined | Determined | Not yet determined | Determined |
Storage | Centralized | Decentralized | Centralized | Centralized | Centralized |
Data Granularity | Detailed | Summarized | Both detailed and summarized | All | All |
Flexibility | Low | Medium | Medium | High | High |
Primary Use | Transactional | Reporting | Analytics & Reporting | Analytics | Analytics |
Cost | Low | Medium | Medium | High | High |
Data Volume | Low | Low | Medium | High | High |
Development | Top-down | Bottom-up | Top-down | All | All |
Design Time | Medium | Medium | High | Low | Low |
Volatility | Medium | Low | None | None | None |
Data Operations | CRUD | CR | CRU | CR | CRUD |
Subject Area | Single | Single | Multiple | Multiple | Multiple |
Design Schema | Relational | Multi-dimensional | Relational | No schema | Hybrid |
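
To make a few of the rows above more concrete, here is a minimal Python sketch that contrasts a fixed relational schema enforced on write (database), a summarized single-subject view (data mart), and raw records interpreted only on read (data lake). The sample events, field names, and function names are all hypothetical and chosen only for illustration; real systems at each tier are far more involved.

```python
import json
import sqlite3

# Hypothetical sample events; the field names are illustrative only.
RAW_EVENTS = [
    '{"order_id": 1, "region": "EU", "amount": 20.0}',
    '{"order_id": 2, "region": "EU", "amount": 30.0, "coupon": "SPRING"}',
    '{"order_id": 3, "region": "US", "amount": 50.0}',
]


def load_database(events):
    """Database-style: a fixed relational schema enforced on write, detailed rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount REAL NOT NULL)"
    )
    for line in events:
        rec = json.loads(line)
        # Fields outside the schema (such as "coupon") are dropped at write time.
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (rec["order_id"], rec["region"], rec["amount"]),
        )
    return conn


def build_mart(conn):
    """Data-mart-style: a summarized view derived from the detailed rows."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


def read_lake(events, field):
    """Data-lake-style: raw records kept as-is, structure applied only on read."""
    return [json.loads(line).get(field) for line in events]


if __name__ == "__main__":
    conn = load_database(RAW_EVENTS)
    print(build_mart(conn))               # {'EU': 50.0, 'US': 50.0}
    print(read_lake(RAW_EVENTS, "coupon"))  # [None, 'SPRING', None]
```

The `coupon` field never reaches the relational table because the schema was fixed up front, but it is still recoverable from the raw records. That is the flexibility-versus-design-time trade-off the table describes in relative terms.
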
Notes:
Fit-gap