I was reading over the Citadel documentation and it mentioned that it uses BerkeleyDB to store its data. Since BerkeleyDB is a key/value store, I'm wondering how they manage all the data relations (since Citadel does a lot of things) using such a simple data model.
CREATE TABLE citadel (
  `key` LONGBLOB,
  data LONGBLOB,
  INDEX (`key`(255))  -- MySQL needs a prefix length to index a BLOB column
);
This presents a chance for me to finally see a full application modeled out using a NoSQL database. Yet, I couldn't find any documentation on how they do this.
So, how does Citadel structure its data using only the BerkeleyDB key/value store?
and the list goes on, and on...
Quite a few NoSQL databases are, in their bare form, comparable to file systems. Given a key (= path), you get a blob of data (= file contents). The rest roughly comes down to tuning and extra features.
Currently the most popular approach seems to be key scans (HBase, Cassandra, CouchDB, and, I believe, BerkeleyDB), where you request an interval of keys you are interested in, e.g. "from foo@bar:emails:folderName:000000000 to foo@bar:emails:folderName:999999999". This usually returns a list of the keys and/or values that fall in the ASCIIbetical interval between the two. Thus you can emulate a file-like hierarchy in a flat namespace.
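To make the key-scan idea concrete, here is a minimal sketch in Python. It is not Citadel's schema and not the BerkeleyDB API; `SortedKV`, `email_key`, and the key layout are hypothetical names made up for illustration. It only assumes a store that keeps keys in byte-wise sorted order and can return everything between two keys.

```python
import bisect


class SortedKV:
    """Toy in-memory ordered key/value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted (byte-wise) order
        self._data = {}   # key -> value blob

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key: bytes):
        return self._data.get(key)

    def scan(self, start: bytes, end: bytes):
        """Yield (key, value) pairs with start <= key <= end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._data[key]


def email_key(user: str, folder: str, msg_no: int) -> bytes:
    # Assumed key layout. Zero-padding the message number keeps
    # numeric order and byte-wise order the same.
    return f"{user}:emails:{folder}:{msg_no:09d}".encode("ascii")


store = SortedKV()
store.put(email_key("foo@bar", "inbox", 2), b"second message")
store.put(email_key("foo@bar", "inbox", 1), b"first message")
store.put(email_key("foo@bar", "sent", 1), b"a sent message")
store.put(email_key("baz@bar", "inbox", 1), b"someone else's mail")

# "List everything in foo@bar's inbox" becomes a single range scan.
inbox = list(store.scan(email_key("foo@bar", "inbox", 0),
                        email_key("foo@bar", "inbox", 999_999_999)))
```

The "relation" between a user, a folder, and its messages lives entirely in how the key is composed, so the design work moves from the table schema into the key layout.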
The next issue is concurrency. Very briefly, many NoSQL databases relax ACID guarantees in favor of scalability and/or availability. Look into the CAP theorem for more details.
All in all, it is very hard to do the subject justice in such a short space, so I would really recommend that you look into it yourself.
Pick some open-source project apart (OpenTSDB does things in an interesting, yet obvious manner). Or build something on NoSQL yourself.