sqlstringfirebirdtrailing

Compare strings with trailing spaces in Firebird SQL?


I have an existing database with a table with a string[16] key field. There are rows whose key ends with a space: "16 ". I need to allow user to change from "16 " to e.g. "16" but also do a unique key check (i.e. the table does not have already a record with key="16"). I run the following query:

select * from plu__ where store=100 and plu_num = '16'

It returns the row with key="16 "! How do I check for unique key so that keys with trailing spaces are not included?

enter image description here

EDIT: The DDL and the char_length

CREATE TABLE PLU__
(
  PLU_NUM Varchar(16),
  CAPTION Varchar(50),
...

enter image description here


Solution

    1. string[16] - there is no such datatype in Firebird. There are CHAR(16) and VARCHAR(16) (and BLOB SUBTYPE TEXT, but it is improbable here). So you omit some crucial points about your system. You do not work with Firebird, but with some undisclosed intermediate layer, that no one knows how opaque or transparent is.

    I suspect you or your system chose CHAR datatype instead of VARCHAR where all data is right-padded with space to the max. OR maybe the COLLATION of the column/table/database is so that trailing spaces do not matter.

    Additionally, you may just appear wrong. You claim that the row being Selected does contain the trailing blank, but I do not see it. For example, add CHAR_LENGTH(plu_num) to the columns in your SELECT and see what is there.

    Additionally, if plu_num is number - should it not be integer or int64 rather than text?

    1. Bottom of your screenshot shows "(NONE)". I suspect that is the "connection charset". This is allowed for backward compatibility with programs made 20 years ago, but it is quite dangerous today. You have to consult your system documentation, how to set the connection charset to UTF-8 or Windows-1250 or something meaningful.

    2. "How do I check for unique key so that keys with trailing spaces are not included?" you do not. You just can not do it reliably, because of different transactions and different programs making simultaneous connections. You would check it, decide you are clear, but right before you would insert your row - some other computer would insert it too. That gap can not be crossed that way, between your two commands of checking and inserting - anyone else can do it too. It is called race conditions.

    You have to ask the server to do the checks.

    For example, you have to introduce UNIQUE CONSTRAINT on the pair of columns (store, plu_num). That way the server would refuse to store two rows with the same values in those columns, visible in the same transaction.

    Additionally, is it even normal to have values with spaces? Convert the field to integer datatype and be safe. Or if you want to keep it textual and non-numeric you still can

    1. Introduce CHECK CONSTRAINT that trim(plu_num) is not distinct from plu_num (or if plu_num is declared as a NOT NULL column to the server, then trim(plu_num) = plu_num). That way the server would refuse storing any value with spaces before or after the text.

    In a case the datatype or the collation of the column makes no difference for comparing texts with and without trailing spaces (and in case you can not change that datatype or collation), you may try adding tokens, like ('+' || trim(plu_num) || '+') = ('+' || plu_num || '+')

    1. Or instead of that CHECK CONSTRAINT, you can have proactively remove those spaces: set new before update or insert TRIGGER on the table, that would do like NEW.plu_num = TRIM(NEW.plu_num)

    Documentation:

    Also, via http://www.translate.ru a bit more verbose:

    You may also check http://www.firebirdfaq.org/cat3/

    Additionally, if you add the constraints onto existing table with non-valid data entered earlier before you introduced those checks, you might trap yourself into "non-restorable backup" situation. You would have to check for it, and sanitize your old data to abide by newly introduced constraints.


    Option #4 explained in detail is below. Just this seems be a bad idea of database design! One should not just "let people edit number to remove trailing blanks", one should make the database design so that there would be no any numbers with trailing blank and would be no any way to insert them into the database.

    CREATE TABLE "_NEW_TABLE" (
        ID   INTEGER NOT NULL,
        TXT  VARCHAR(10) 
    );
    
    Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
    
    ID  TXT CONCATENATION   CHAR_LENGTH
    1   1   _1_ 1
    2   2   _2_ 1
    4   1   _1 _    2
    5   2   _2 _    2
    7    1  _ 1_    2
    8    2  _ 2_    2
    
    
    Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
    where txt = '2'
    
    ID  TXT CONCATENATION   CHAR_LENGTH
    2   2   _2_     1
    5   2   _2 _    2
    
    Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
    where txt || '+' = '2+' -- WARNING - this PROHIBITS index use on txt column, if there is any
    
    ID  TXT CONCATENATION   CHAR_LENGTH
    2   2   _2_     1
    
    Select id, txt, '_'||txt||'_', char_length(txt) from "_NEW_TABLE"
    where txt = '2' and char_length(txt) = char_length('2')