pythonicr

any Python tools for reading Scantron-style data


I am interested in doing some snail mail based surveys but I am looking for quick ways to digitize the surveys they send back.

So if I had a question and 5 boxes beneath it where you would indicate your opinion by checking the appropriate box, does anything exist where I could scan it and run it through a piece of software that spit out the responses.

Edit clarification:

I am inquiring about what I need to do after the paper has been digitized. I want to write some code that looks at an image file and recognizes which box has been marked in and outputs a representation of the respondents answers.

I would be looking at a page scanned from a desktop scanner or something similar.


Solution

  • From what i see you don't really need ICR (intelligent character recognition, used for handwritten and handprinted texts), but what you need is OMR - optical mark recognition (capturing human-marked data from document forms such as surveys and tests).

    The bad news is you would hardly find an opensource library for python. But there's a solution - you can use a cloud SDK, it's a website that let you upload an image and send you back an OCR'ed data. Try www.ocrsdk.com, it is a cloud based OCR SDK recently launched by ABBYY. It's now in closed beta so it's completely free to use.

    It has both ICR and OMR api methods and a set of python code samples.