I am using Apigee Usergrid, which uses Cassandra internally. While storing data I have the following predicament.
Do I store data like this:
[{
    "portlet": "personal_information",
    "fields": [{
        "id": "first_name",
        "text": {
            "en_us": "First Name",
            "de_de": "First Name"
        }
    }, {
        "id": "salutation",
        "text": {
            "en_us": "Salutation",
            "de_de": "Salutation"
        }
    }, {
        "id": "marital_status",
        "text": {
            "en_us": "Marital Status",
            "de_de": "Marital Status"
        }
    }, {
        "id": "native_preferred_lang",
        "text": {
            "en_us": "Preferred Lang",
            "de_de": "Preferred Lang"
        }
    }]
}]
or like this:
[{
    "id": "first_name",
    "portlet": "personal_information",
    "text": {
        "en_us": "First Name",
        "de_de": "First Name"
    }
}, {
    "id": "salutation",
    "portlet": "personal_information",
    "text": {
        "en_us": "Salutation",
        "de_de": "Salutation"
    }
}, {
    "id": "marital_status",
    "portlet": "personal_information",
    "text": {
        "en_us": "Marital Status",
        "de_de": "Marital Status"
    }
}, {
    "id": "native_preferred_lang",
    "portlet": "personal_information",
    "text": {
        "en_us": "Preferred Lang",
        "de_de": "Preferred Lang"
    }
}]
This is probably relevant to all NoSQL databases. Which data format is more efficient?
In Usergrid 2.x we use Elasticsearch for indexing, so there should be no functional difference between these two structures.
In Usergrid 1.x we denormalize the entities and use an indexing scheme on top of Cassandra, so the two structures are likely to perform equally well.
That said, you should test and verify this yourself, as with any software or solution.
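If it helps when comparing the two layouts, here is a minimal Python sketch that converts the nested form (one entity per portlet) into the flat form (one entity per field). It is plain Python and not part of Usergrid; the function name `flatten_portlets` is made up for illustration, and the sample data is a shortened copy of the question's example.

```python
import json


def flatten_portlets(nested):
    """Turn the nested layout (one entity per portlet, with a "fields" list)
    into the flat layout (one entity per field, each carrying its portlet)."""
    flat = []
    for portlet in nested:
        for field in portlet["fields"]:
            flat.append({
                "id": field["id"],
                "portlet": portlet["portlet"],
                # Copy the translations so the result does not share
                # dictionaries with the input.
                "text": dict(field["text"]),
            })
    return flat


# Shortened copy of the nested example from the question.
nested = [{
    "portlet": "personal_information",
    "fields": [
        {"id": "first_name",
         "text": {"en_us": "First Name", "de_de": "First Name"}},
        {"id": "salutation",
         "text": {"en_us": "Salutation", "de_de": "Salutation"}},
    ],
}]

flat = flatten_portlets(nested)
print(json.dumps(flat, indent=4))
```

Both layouts carry the same information. The practical difference is granularity: the flat layout gives you one entity per field, each of which can be filtered by its `portlet` property, while the nested layout gives you one entity per portlet that you read and unpack on the client.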