databasepostgresqlstoragenumeric

Byte size of numeric(precision, scale) in Postgres


Given that the size is variable, I am trying to calculate the average cost of a numeric(p, s) attribute. In my case, numeric(10, 2). From what I understand of the Postgres documentation, this means I have a total of 10 digits where 2 of them are exclusive of the decimal part, hence, the maximum value I can store is 99999999,99.

What I do not understand is this part of the documentation:

The actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead.

My two questions here are:

I ran a query with an average of pg_column_size that gave me a total of 6.95 bytes. However, I want to know if there is any way I can calculate the approximate value before checking the actual one. (Same with other variable types in Postgres)


Solution

  • If I have two decimal digits, that means the storage requirement for them would be one byte?

    No, a numeric with two digits occupies 5 bytes on disk and 8 bytes in RAM.

    If I have three to eight bytes overhead, how can I know which value corresponds to my case?

    As the manual says:

    two bytes for each group of four decimal digits

    For each started group of four decimal digits, to be precise. So we can compute:

    ceil(digits / 4.0) * 2
    

    plus three to eight bytes overhead.

    The minimum overhead of 3 bytes only applies to '0' in "packed" format on disk. Else:

    Sign, decimal point, and white space of the number are insignificant for storage size.
    Type modifiers (precision and scale) of the numeric type are also insignificant.

    Here is a demo with formulas for the complete computation:

    SELECT id, digits
         , pg_column_size(n) AS disk_size
         , CASE
              WHEN n = 0         THEN 3                           -- efficient storage of 0
              WHEN digits <= 248 THEN 3 + ceil(digits / 4.0) * 2  -- "packed" format for size <= 128 bytes 
              WHEN digits <= 256 THEN 6 + ceil(digits / 4.0) * 2  -- effecitve size > 128 bytes
              ELSE                    8 + ceil(digits / 4.0) * 2  -- more than 256 digits
           END AS expected_disk_size
         , pg_column_size(n + 0) AS ram_size
            , CASE
              WHEN n = 0         THEN 6                           -- efficient storage of 0
              WHEN digits <= 256 THEN 6 + ceil(digits / 4.0) * 2  -- effecitve size > 128 bytes
              ELSE                    8 + ceil(digits / 4.0) * 2  -- more than 256 digits
           END AS expected_ram_size
    FROM   tbl
         , length(translate(n::text, '-. ', '')) AS digits;       -- only digits matter
    

    fiddle

    Note that the no-op n + 0 forces the stored value to be unpacked, so we get RAM size. Related:

    Also note that storage of null values is different. See: