Why does the Tez engine add a reduce phase to even the simplest insert statement, and how can it be removed through configuration?

Question

38 views · Asked by user21579832 on 06 April 2023 at 09:44

This is the Hive SQL:

insert into my_orc_table_25 select * from my_orc_table limit 5;

And these are the schemas:

CREATE TABLE my_orc_table (
    id INT,
    name STRING
)
STORED AS ORC;

CREATE TABLE my_orc_table_25 AS SELECT id, name FROM my_orc_table LIMIT 25;

My environment: Hive 3.1.0, Tez 0.10.2

I tried changing the following configuration items, but it didn't work:

set hive.compute.query.using.stats=false;
set hive.stats.fetch.column.stats=false;
set hive.stats.fetch.partition.stats=false;
set hive.groupby.skewindata=false;
set hive.exec.dynamic.partition=false;

There are 2 answers

**OneCricketeer** · Answer 1 · 2023-04-08T21:01:37+00:00

Reduce stages are required to write data to HDFS.

Map-only jobs read data as-is, and don't further process it.

**Raid** · Answer 2 · 2023-04-11T04:39:14+00:00

From the explain plan, I can see that the reducer is computing stats and merging files.

You can try disabling both of them with the settings below:

set hive.stats.autogather=false;
set hive.merge.tezfiles=false;
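
One way to check whether these settings take effect is to compare the explain plans, as sketched below with the tables from the question. Two points here are assumptions rather than something confirmed in this thread: that `hive.stats.column.autogather` (a separate switch from `hive.stats.autogather`) is what drives the column stats reducer on Hive 3, and that the `limit` clause by itself typically makes Hive add a final reducer so that the limit is applied across all mappers. If the second assumption holds, a reducer vertex may remain in the first plan even with all three settings disabled.

```sql
-- Settings from the answer above.
set hive.stats.autogather=false;
set hive.merge.tezfiles=false;
-- Assumption: column stats are gathered separately on Hive 3; verify on your build.
set hive.stats.column.autogather=false;

-- Plan for the original statement. Look for a Reducer vertex in the Tez section.
explain
insert into my_orc_table_25 select * from my_orc_table limit 5;

-- Same insert without the limit, for comparison only.
-- If this plan is map-only, the remaining reducer comes from the limit itself.
explain
insert into my_orc_table_25 select * from my_orc_table;
```

Both statements only print the plan; nothing is written to `my_orc_table_25` until the `explain` keyword is removed.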
