Been using this formula as a std for TB size calculation , as part of Cap. Planning effort . We are on TD 14
( rc * ( rsz / ( blocksize -38) ) * blocksize )
+ ( SQL (sel Hashamp()+1 ; ) * 1024 )
rsz : row size , rc : count ( * )
Here actually
(blocksize-38)/rsz
is nothing but rows / block. It comes out a fraction < 1. I think that's bad because it'd mean several blocks span a row. My questions are
A row in Teradata never spans blocks.
You simply got your calculation wrong, you talk about (blocksize-38)/rsz
, but the actual calculation shows rsz / ( blocksize -38)
.
As the block overhead in newer releases increased to 74 this should be the correct calculation:
( rc / (( blocksize - 74)/rsz ) * blocksize )
+ ( (HASHAMP()+1 ) * 1024 )
It's found in Sizing Base Tables, Hash Indexes, and Join Indexes
But you will notice that this approaches rc * rsz
for larger tables. As nobody cares about small tables, I usually use this simplified calculation to size a table (you might add 1 or 2 percent to get a maximum possible size).
Edit:
Not the calculation is wrong, it's due to base data types used (probably a truncation of a DECIMAL
). Change to a FLOAT
or NUMBER
:
( rc * ( rsz / ( CAST(blocksize -74 AS FLOAT)) ) * blocksize )
+ ( (HASHAMP()+1 ) * 1024 )