encodingfirebirddatabase-metadatafirebird-4.0

How retrieve CHAR and VARCHAR field definitions depending on Firebird database encoding


In Firebird (4.0), I get the field length with following query:

SELECT T.RDB$RELATION_NAME, RF.RDB$FIELD_NAME, RF.RDB$NULL_FLAG, F.RDB$FIELD_TYPE, F.RDB$FIELD_LENGTH
    FROM RDB$RELATIONS T, RDB$RELATION_FIELDS RF,  RDB$FIELDS F 
    WHERE T.RDB$VIEW_BLR IS NULL AND (T.RDB$SYSTEM_FLAG IS NULL OR T.RDB$SYSTEM_FLAG = 0)  
        AND (RF.RDB$SYSTEM_FLAG IS NULL OR RF.RDB$SYSTEM_FLAG = 0)  
        AND (T.RDB$RELATION_NAME = RF.RDB$RELATION_NAME) 
        AND (RF.RDB$FIELD_SOURCE = F.RDB$FIELD_NAME) 
    ORDER BY T.RDB$RELATION_NAME, RF.RDB$FIELD_NAME 

F.RDB$FIELD_LENGTH is the allocated size, i.e. if I have a VARCHAR(256) with UTF8 encoding, then the corresponding F.RDB$FIELD_LENGTH value is 1024, but if this is default NONE encoded, the allocated value would be 256.

Is there a way to find the actual value X of VARCHAR(X) directly within the query itself, depending on the database encoding?

In my case, I only have two possible encodings, NONE or UTF8, and I can check it with following query:

SELECT A.RDB$CHARACTER_SET_NAME containing 'UTF8' FROM RDB$DATABASE A;

will return true and field lengths should be divided by 4, otherwise simply return the value itself.

Should I build a query based on an IF statement on the above queries? If so, what would it be? Or is there a better solution?


Solution

  • There are two ways:

    1. Use RDB$FIELDS.RDB$CHARACTER_LENGTH, which contains the length in characters
    2. Join RDB$CHARACTER_SETS and use RDB$CHARACTER_SETS.RDB$BYTES_PER_CHARACTER to calculate based on RDB$FIELDS.RDB$FIELD_LENGTH

    An example including both:

    SELECT T.RDB$RELATION_NAME, RF.RDB$FIELD_NAME, RF.RDB$NULL_FLAG, F.RDB$FIELD_TYPE, 
        F.RDB$FIELD_LENGTH,
        F.RDB$CHARACTER_LENGTH,
        F.RDB$FIELD_LENGTH / RCS.RDB$BYTES_PER_CHARACTER as calculated
    FROM RDB$RELATIONS T
    inner join RDB$RELATION_FIELDS RF
      on RF.RDB$RELATION_NAME = T.RDB$RELATION_NAME
    inner join RDB$FIELDS F
      on F.RDB$FIELD_NAME = RF.RDB$FIELD_SOURCE
    left join RDB$CHARACTER_SETS RCS
      on RCS.RDB$CHARACTER_SET_ID = F.RDB$CHARACTER_SET_ID 
    WHERE T.RDB$VIEW_BLR IS NULL 
    AND (T.RDB$SYSTEM_FLAG IS NULL OR T.RDB$SYSTEM_FLAG = 0)  
    AND (RF.RDB$SYSTEM_FLAG IS NULL OR RF.RDB$SYSTEM_FLAG = 0)  
    ORDER BY T.RDB$RELATION_NAME, RF.RDB$FIELD_NAME 
    

    You should not rely on the default character set of the database for this, because that only tells you the character set of newly created columns without an explicit character set. Each column has a character set, which was either specified explicitly or derived from the default character set at the time the column was created.