Spool Space Error while using LEFT JOIN in TERADATA

32 views Asked by At

I am getting "No more spool space in user" error for the below query.

  SEL
  C.*,AD.*,P.*
  FROM mytestDB.TABLE_1 C
   INNER JOIN mytestDB.TABLE_2 D
   ON C.COL_A     = D.COL_A
   AND C.COL_B = D.COL_B

  LEFT OUTER JOIN mytestDB.TABLE_3 AD
   ON C.COL_A = AD.COL_A
   AND C.COL_B = AD.COL_B

  LEFT OUTER JOIN mytestDB.TABLE_4 P
   ON C.COL_A = P.COL_A
   AND C.COL_B = P.COL_B;

Observations

  1. If I remove any of the LEFT OUTER JOIN the query is returning the results without any issues.
  2. If I use count(*) instead of column list, then also it is working
  3. If I select one or two columns, then also it is returning the data in viewpoint
  4. Spool skew is showing around 70-80 during query execution

All 4 tables used in the above query have the same PI columns (COL_A, COL_B) and stats are also collected on the same PI columns. The Skew factor of all the tables is less than or equal to 2.

Here is the total record count and distinct record count based on PI columns from each table.

 --TOTAL Count
  sel count(1) from  mytestDB.TABLE_1; 89,758,653
  sel count(1) from  mytestDB.TABLE_2; 148,915,580
  sel count(1) from  mytestDB.TABLE_3; 12,446,171
  sel count(1) from  mytestDB.TABLE_4; 160,661


  -- DISTINCT PI count
  sel count( 1) from (sel distinct  COL_A,COL_B  from  mytestDB.TABLE_1)a; 89,758,653
  sel count( 1) from (sel distinct  COL_A,COL_B  from  mytestDB.TABLE_2)a; 64,616,959
  sel count( 1) from (sel distinct  COL_A,COL_B  from  mytestDB.TABLE_3)a; 11,032,454
  sel count( 1) from (sel distinct  COL_A,COL_B  from  mytestDB.TABLE_4)a;     121,860        


   Explain plan
     1) First, we lock mytestDB.AD in TD_MAP1 for read on a reserved
     RowHash in all partitions to prevent global deadlock.
     2) Next, we lock mytestDB.P in TD_MAP1 for read on a reserved
     RowHash in all partitions to prevent global deadlock.
     3) We lock mytestDB.D in TD_MAP1 for read on a reserved RowHash in
     all partitions to prevent global deadlock.
     4) We lock mytestDB.C in TD_MAP1 for read on a reserved RowHash in
     all partitions to prevent global deadlock.
     5) We lock mytestDB.AD in TD_MAP1 for read, we lock mytestDB.P in
     TD_MAP1 for read, we lock mytestDB.D in TD_MAP1 for read, and we
     lock mytestDB.C in TD_MAP1 for read.
     6) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step in TD_MAP1 from mytestDB.C by
          way of a RowHash match scan with no residual conditions,
          which is joined to mytestDB.AD by way of a RowHash match
          scan with no residual conditions.  mytestDB.C and
          mytestDB.AD are left outer joined using a rowkey-based
          merge join, with a join condition of (
          "(mytestDB.C.COL_A = mytestDB.AD.COL_A) AND
          (mytestDB.C.COL_B = mytestDB.AD.COL_B)").
          The result goes into Spool 2 (all_amps), which is built
          locally on the AMPs.  Then we do a SORT to partition Spool 2
          by rowkey.  The size of Spool 2 is estimated with low
          confidence to be 89,797,457 rows (12,212,454,152 bytes).  The
          estimated time for this step is 7.85 seconds.
       2) We do an all-AMPs RETRIEVE step in TD_MAP1 from mytestDB.P
          by way of an all-rows scan with no residual conditions into
          Spool 5 (all_amps), which is built locally on the AMPs.  The
          size of Spool 5 is estimated with high confidence to be
          12,476,097 rows (1,147,800,924 bytes).  The estimated time
          for this step is 0.39 seconds.
     7) We do an all-AMPs JOIN step in TD_Map1 from Spool 2 (Last Use) by
     way of a RowHash match scan, which is joined to mytestDB.D by
     way of a RowHash match scan with no residual conditions.  Spool 2
     and mytestDB.D are joined using a rowkey-based merge join, with
     a join condition of ("(COL_B = mytestDB.D.COL_B)
     AND (COL_A = mytestDB.D.COL_A)").  The result goes into
     Spool 6 (all_amps), which is redistributed by the hash code of (
     mytestDB.C.COL_A, mytestDB.C.COL_B) to all AMPs in
     TD_Map1.  The size of Spool 6 is estimated with low confidence to
     be 149,320,197 rows (20,307,546,792 bytes).  The estimated time
     for this step is 14.00 seconds.
     8) We do an all-AMPs JOIN step in TD_Map1 from Spool 5 (Last Use) by
     way of an all-rows scan, which is joined to Spool 6 (Last Use) by
     way of an all-rows scan.  Spool 5 and Spool 6 are right outer
     joined using a single partition hash join, with a join condition
     of ("(COL_A = COL_A) AND (COL_B = COL_B)").
     The result goes into Spool 1 (group_amps), which is built locally
     on the AMPs.  The size of Spool 1 is estimated with low confidence
     to be 151,674,776 rows (37,463,669,672 bytes).  The estimated time
     for this step is 9.80 seconds.
     9) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 31.65 seconds.

Could you please provide some inputs on this issue ?

Tried collecting stats on PI columns

0

There are 0 answers