sqlmongodbreportdata-capturenosql

Database approach to use for Dynamic Form Data Collection which is suitable for good Reports and Searching


I am working on a project which involves collecting dynamic form data. These forms are user-defined (think surveymonkey) and thus a fixed schema cannot be defined for them. Data in terms of questions/answers would be retrieved for these forms and then stored into the database. Reporting/Searching on this answers (filtering and aggregation) is of utmost importance. There are two approaches which are feasible.

  1. Use a SQL database and store the each field data as a separate row. Reporting/Searching is then done via SQL. My apprehension is that it would result in complicated joins for reporting.

  2. Use a NoSQL database like MongoDB. This seems to be a perfect fit for storing the dynamic data since it is schema-less. However, I am not sure how good its reporting capabilities are.

It seems easier for target users to learn sql than to define map/reduce queries. How easy would it be to build a UI for reporting/searching over mongoDB.

Simple things like - list of users who gave a particular set of answers. How many such users over a period of time etc?

Thanks, Pulkit


Solution

  • It's already been mentioned in the comments, but I'll re-iterate that you should look at Mongo's map/reduce functionality for reporting and the aggregation framework.

    Having done map/reduce in both Couch and Mongo I can say that they are very similar. It's definitely a barrier to entry for a developer that isn't familiar with it, but once you get a few working examples, it's not too bad.

    Consider that Mongo can output a map/reduce job to a collection, which I've found to be really useful. This means you can schedule the jobs and run them periodically and output to a place that you can then report on. It's not that hard to create a framework that lets developers write simple Javascript map and reduce functions and then plug them in to be run on a schedule.

    The aggregation framework is much easier to understand for a developer coming from SQL. Still a learning curve, but not as bad as map/reduce. It is much more well suited to ad-hoc reporting queries and there is nothing comparable in Couch.

    You could maybe make a reporting UI that maps to the aggregation framework, but I wouldn't try to do something similar for map/reduce queries.