How to perform a table join on two DFS tables and update the left table based on its result?

26 views Asked by At

Two DFS tables, pt and pt1, with the same schema were created in the database dfs://db using the following scripts.

n=10000
ID=rand(100, n)
dates=2022.08.07..2022.08.11
date=rand(dates, n)
vol=rand(1..10 join int(), n)
t=table(ID, date, vol)

if(existsDatabase("dfs://db1")){
  dropDatabase("dfs://db1")
}
db=database(directory="dfs://db1", partitionType=RANGE, partitionScheme=0 50 100)
pt=db.createPartitionedTable(table=t, tableName=`pt, partitionColumns=`ID)
pt.append!(t)
pt1=db.createPartitionedTable(table=t, tableName=`pt1, partitionColumns=`ID)
pt1.append!(t)

I want to update the left table (table pt1) after performing a left semi-join on pt1 and pt.

I tried the following two methods:

update pt1 set date = pt_date from lsj(pt1, pt, `ID)

x = select * from lsj(pt1, pt, `ID)
update pt1 set date = x.pt_date from x

But both ways failed and raised the errors as follows:

enter image description here

1

There are 1 answers

0
dontyousee On

Currently, your update issue cannot be solved. You can get it cracked using the first method after the release of version 1.30.21/2.00.9.

In addition, the second error thrown did not exactly explain the problem. The error is reported when updating each partition. We will improve this error message in the subsequent versions.