I've been using this formula as a standard for table size calculation, as part of a capacity planning effort. We are on TD 14.
( rc * ( rsz / ( blocksize - 38 ) ) * blocksize )
+ ( SQL ( sel Hashamp()+1 ; ) * 1024 )
rsz : row size, rc : COUNT(*)
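To make the arithmetic concrete, here is that formula written out as a single query. The table name, the 130560-byte (127.5 KB) block size and the 65-byte row size are purely illustrative assumptions, not values from this post:

/* Sizing sketch - my_db.my_table, blocksize = 130560 and rsz = 65 are assumed values */
SELECT
  CAST(COUNT(*) AS FLOAT)                      /* rc */
  * ( CAST(65 AS FLOAT) / (130560 - 38) )      /* rsz / (blocksize - 38); FLOAT casts avoid decimal truncation */
  * 130560                                     /* blocksize */
  + (HASHAMP() + 1) * 1024                     /* table header: 1024 bytes per AMP */
    AS est_table_bytes
FROM my_db.my_table;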
Here, actually,
(blocksize-38)/rsz
is nothing but rows per block. It comes out as a fraction < 1, which I think is bad because it would mean a row spans several blocks. My questions are:
- Do the formulas need any further honing? The latter part, after the addition sign, provides for the table header. There are NO SIs on this table - just 2 DATE columns, 1 INTEGER and 1 VARCHAR(50), with a NUPI that is NPPI (see the DDL sketch after these questions). None of the columns are nullable, and obviously, without the data, nothing is compressible to begin with (there isn't enough data demographic information to include compression now, but we'd run compression scripts later).
- Since a row would be spanning several blocks, should I be upping the block size? By how much - what would be the ideal number of rows per block? The table data will get a full refresh every quarter, and nothing will happen to it in the interim.
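For context, a DDL sketch of the kind of table described above; the database, table and column names (and the MULTISET choice) are made up, only the column types, the NOT NULL attributes and the non-partitioned NUPI come from the description:

CREATE MULTISET TABLE my_db.quarterly_fact     /* names and MULTISET are assumptions */
( snap_dt   DATE        NOT NULL
, load_dt   DATE        NOT NULL
, acct_id   INTEGER     NOT NULL
, acct_nm   VARCHAR(50) NOT NULL
)
PRIMARY INDEX ( acct_id );                     /* NUPI; no PARTITION BY, i.e. NPPI */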
A row in Teradata never spans blocks.
You simply got your calculation wrong: you talk about (blocksize-38)/rsz, but the actual calculation shows rsz / (blocksize - 38). As the block overhead increased to 74 in newer releases, this should be the correct calculation:
( rc * ( rsz / ( blocksize - 74 ) ) * blocksize ) + ( SQL ( sel Hashamp()+1 ; ) * 1024 )
It's found in "Sizing Base Tables, Hash Indexes, and Join Indexes" in the Teradata documentation.
But you will notice that this approaches rc * rsz for larger tables. As nobody cares about small tables, I usually use this simplified rc * rsz calculation to size a table (you might add 1 or 2 percent to get a maximum possible size).
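For example, with an assumed 130560-byte (127.5 KB) block: blocksize / (blocksize - 74) = 130560 / 130486 ≈ 1.0006, so the block-overhead term inflates rc * rsz by well under 0.1 percent, and the per-AMP table header is only a small fixed cost on top of that.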
Edit:
It's not the calculation that is wrong; it's the base data types used (probably a truncation of a DECIMAL). Change to a FLOAT or NUMBER:
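A quick illustration of that truncation; the literals 65 and 130486 are stand-ins for rsz and (blocksize - 74), not figures from the post:

SELECT
  65 / 130486                  AS int_division     /* INTEGER division truncates the fraction to 0 */
, CAST(65 AS FLOAT) / 130486   AS float_division   /* FLOAT division keeps the small fraction */
;

Once the division is done in FLOAT, rc * ( rsz / ( blocksize - 74 ) ) * blocksize no longer collapses to 0 and the size estimate comes out as expected.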