mysqlsqlrdbmsrdbms-agnostic

SQL design for survey with answers of different data types


I am working on an online survey. Most questions have a scale of 1-5 for an answer. If we need to add a question to the survey, I use a simple web form, which does an INSERT into the appropriate table, and voila! surveys are asking the new question -- no new code or change to the database structure.

We are being asked to add survey questions that can have answers of different data types. The spec is to have the survey 'configurable', so that at any point in the future, when someone says, "We need a new survey that asks {text answer question}, {1-5 question}, {true false question}, {question with a date as an answer}", we can do that without changing the database structure .

I'm trying to think of the best way to store those answers, but every method I come up with seems somewhat hackish.

Some questions may have a yes/no or true/false answer, some may have an integer answer ( "How many times in the past month have you used tech support?"), another answer may have a date, a string, a multiple choice with a single value, a multiple choice with multiple values, etc. Or sometimes, a particular answer value may prompt a sub-question ( "What disappointed you about...?" )

The simple solution is to have each question as a column in the survey, its answer as a column in the survey, and whether to ask it as a column in the survey. This feels like a mess to me -- it's one big table; not very 'relational'.

Brainstorming, the 'best' method I could come up with is to have a different table for each answer type, but that feels susceptible to data integrity issues. In other words, I would have

CREATE TABLE `Questions` (...);
CREATE TABLE `TrueFalseAnswers` (...);
ALTER TABLE `TrueFalseAnswers`
  ADD CONSTRAINT `TrueFalseAnswers_ibfk_1` FOREIGN KEY (`question_id`)
REFERENCES `Questions` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
CREATE TABLE `TextAnswers` (...);
ALTER TABLE `TextAnswers`
  ADD CONSTRAINT `TextAnswers_ibfk_1` FOREIGN KEY (`question_id`)
REFERENCES `Questions` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;

etc.

One problem with the above is that I can't guarantee that at least one and only one answer exists for any question in the DDL alone.

Another solution might be to have a binary or string column for the answer in the Questions table, and encode all answers into some string or binary format, and store them in the answer column. That gives me at least one and only one answer for every question, but then I lose access to aggregate features in SQL queries. This strikes me as not a very 'relational' solution.

So, I see problems with the ideas outlined above. Is there a 'best' way to solve this problem?

Now that I've taken the time to verbalize the problem and my thoughts, it seems that the broad problem that I'm posing is "I'd like to store data of an arbitrary type without doing any coding..." Is this hopeless?

I am using MySQL so I don't have access to thing that other RDBMSes might.


Solution

  • Use a column that specifies the type of answer, but store the answer as text. Your application or front end can use the answer_type column to determine what to display to the end user (a test box, radio buttons, a date picker) and how to validate it before sending it back to the database.