pythonsqlalchemypostgresql-9.2unique-constraintdatabase-integrity

What is the good approach to check an existance of unique values in database table by SqlAlchemy in python2.7


I use session.merge in SqlAlchemy to insert data to the table with UniqueConstraint, which is common for several columns, for example, my ORM class has the following:

UniqueConstraint("attr_1", "attr_2", "attr_3", name="attributes_uix")

Should I use filter or filter_by with _and to check, if data, which I'm going to insert will violate this unique constraint, or is it the proper way to catch IntegrityError by try - except construction?


Solution

  • No an sql-alchemy expert but have struggled with this and similar questions for a while. I am not sure if there is a single right way in this case. What I can describe for you is what the tradeoff between grabbing db exceptions and handling it in your code are.

    Personally I prefer to trap db exceptions. This has the advantage of not running extra queries (with extra latency, etc). It also has the advantage of reusing existing framework pieces as much as possible.

    However it has two real drawbacks. The first, which can be solved on the db design level, is that multi-column unique constraints are not violated by matches containing NULL values. In other words, if I have aa cnstraint on (id, value), then (1, null), (1, null) cannot be said to be non-unique and therefore no integrity exception is thrown. So first it is worth checking to see if you can avoid this problem. If not, you need to find a workaround, but PostgreSQL gives you pleny of tools there (for example, functional indexes).

    The second is that catching IntegrityError is a kind of a blunt instrument and you can't really parse any error message available for the index name because this may vary between systems and the locale can mean the message may be translated. So you are relying on SQLSTATE and this is not very precise. I.e. you don't have good machine readable information. about which constraint was voilated if there are multiple candidates.

    The question then is between these and duplicating code and queries. Usually I think catching the error is the lesser evil, but due to the tradeoffs it is not a universal recommendation.