I'm having some performance issues with my MySQL database due to its normalization.
Most of my applications that use a database need to run some heavy nested queries, which in my case take a lot of time. Queries can take up to 2 seconds with indexes; without indexes, about 45 seconds.
A solution I came across a few months back was to use a faster, flatter, document-based database (in my case Solr) as the primary database. As soon as something changed in the MySQL database, Solr was notified.
This worked really well. All queries against the Solr database took only about 3 ms.
The numbers look good, but I'm having some problems.
The MySQL database is about 200 MB; the Solr index contains about 1.4 GB of data. Each time I need to change a table or column, the whole index needs to be rebuilt, which in this case took over 12 hours.
The view relies on a certain object. It doesn't care whether that object is an Active Record object or a Solr object, as long as it can call a set of attributes on it.
Like this:
# Controller
@song = Song.first
# View
@song.artist.urls.first.service.name
The problem in my case is that the data returned from Solr is flat, like this:
{
id: 123,
song: "Waterloo",
artist: "ABBA",
service_name: "Groveshark",
urls: ["url1", "url2", "url3"]
}
This forces me to build an Active Record-like object that can be passed to the view.
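One way to bridge that gap, as a minimal sketch: wrap the flat Solr hash in a small presenter that exposes the nested chain the view expects. The class name `SolrSongPresenter` and the use of `OpenStruct` are my assumptions; the field names come from the example document above.

```ruby
require "ostruct"

# Hypothetical presenter: wraps the flat hash returned by Solr and
# exposes the nested interface the view expects, e.g.
# @song.artist.urls.first.service.name
class SolrSongPresenter
  def initialize(doc)
    @doc = doc
  end

  # Song title from the flat document.
  def name
    @doc[:song]
  end

  # Builds a nested artist -> urls -> service structure on the fly
  # from the flat fields, so the view's method chain still works.
  def artist
    service = OpenStruct.new(name: @doc[:service_name])
    OpenStruct.new(
      name: @doc[:artist],
      urls: @doc[:urls].map { |u| OpenStruct.new(url: u, service: service) }
    )
  end
end

song = SolrSongPresenter.new(
  id: 123, song: "Waterloo", artist: "ABBA",
  service_name: "Groveshark", urls: ["url1", "url2", "url3"]
)
song.artist.urls.first.service.name # => "Groveshark"
```

The view never learns whether it is talking to Active Record or Solr; only the presenter knows the data was flat.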
My question
Is there a better way to solve the problem? Some kind of super duper fast, read-only primary database that can handle complex queries would be nice.
About reindexing everything on schema change: Solr does not yet support updating individual fields, though there is a JIRA issue about this that is still unresolved. However, how often do you actually change the schema?
If you can live without an RDBMS (without joins, a schema, transactions, foreign key constraints), a document-based DB like MongoDB or CouchDB would be a perfect fit. (Here is a good comparison between them.)
Why use MongoDB:
Why use SOLR:
Why use MySQL:
So, the solutions (combinations) would be:
Use MongoDB + Solr
Use only MongoDB
Use MySQL in a master-slave configuration, and balance reads across the slave(s) (using a plugin like Octopus) + Solr
Keep current setup, denormalize data in MySQL
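The last option (denormalizing in MySQL) can be sketched as a Rails migration that copies the joined attributes onto the songs table so the view's lookup needs no joins. The table and column names below are assumptions based on the flat Solr document shown in the question, not your actual schema:

```ruby
# Hypothetical migration: copies artist data onto songs so reads
# avoid the nested joins. Adjust names to your real schema.
class DenormalizeSongs < ActiveRecord::Migration
  def up
    add_column :songs, :artist_name,  :string
    add_column :songs, :service_name, :string
    add_index  :songs, :artist_name

    # Backfill the new column from the normalized tables in one pass.
    execute <<-SQL
      UPDATE songs s
      JOIN artists a ON a.id = s.artist_id
      SET s.artist_name = a.name
    SQL
  end

  def down
    remove_column :songs, :artist_name
    remove_column :songs, :service_name
  end
end
```

The trade-off is the usual one: reads get cheaper, but every write to `artists` now has to keep the copied columns in sync.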
> The MySQL database is about 200mb, the Solr db contains about 1.4Gb of data. Each time I need to change a table/column the database need to be reindexed, which in this example took over 12 hours.
Reindexing a 200 MB DB in Solr SHOULD NOT take 12 hours! Most probably you also have other issues, such as:
MySQL:
SOLR:
From http://outoftime.github.com/pivotal-sunspot-presentation.html:
- By default, Sunspot::Rails commits at the end of every request that updates the Solr index. Turn that off.
- Use Solr's autoCommit functionality. That's configured in solr/conf/solrconfig.xml
- Be glad for assumed inconsistency. Don't use search where results need to be up-to-the-second.
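The autoCommit setting mentioned above lives in solr/conf/solrconfig.xml (inside the update handler section). A sketch, with example thresholds rather than recommendations:

```xml
<!-- Batch commits instead of committing on every request.
     The values below are illustrative, not tuned. -->
<autoCommit>
  <maxDocs>10000</maxDocs> <!-- commit after this many pending docs -->
  <maxTime>60000</maxTime> <!-- or after this many milliseconds -->
</autoCommit>
```

With this in place, disable the per-request commit in Sunspot::Rails and let Solr decide when to flush.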
Look at the logs for more details.