I want to connect to both Vertica and HDFS in the same project. I created a dbt project with the dbt init command and connected it to Vertica, which works, but I don't know how to connect to HDFS to read data and load it into Vertica.
I need to read data from HDFS and load it into a Vertica database. Is this possible?
You could start by reading the documentation on external tables here: https://docs.vertica.com/23.4.x/en/data-load/working-with-external-data/creating-external-tables/
An example for Parquet on HDFS:
Step 1: Let Vertica infer the Parquet file's definition:
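A minimal sketch of this step, assuming Vertica's INFER_TABLE_DDL function is available in your version. The HDFS path is hypothetical, and the table name region is taken from the rest of this answer:

```sql
-- Hypothetical path: replace it with the actual HDFS location of your Parquet files.
-- The function returns the text of a CREATE EXTERNAL TABLE statement; it does not create anything.
SELECT INFER_TABLE_DDL(
    'hdfs:///data/region/*.parquet'
    USING PARAMETERS format = 'parquet',
                     table_name = 'region',
                     table_type = 'external'
);
```

As far as I know, the generated statement declares string columns as plain varchar with no length, so they fall back to Vertica's default length of 80 bytes.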
I would then suggest that you change the generated DDL statement to reflect the maximum lengths of r_name and r_comment (varchar(25) and varchar(152), for example) before running it.
Once the command is executed, you can select from region as if it were a normal table.
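For illustration, the adjusted statement and a follow-up query might look roughly like the sketch below. The HDFS path, the r_regionkey column, and the region_local table name are assumptions, not taken from your data, so use the column list that Vertica actually generated for your files:

```sql
-- Adjusted DDL with explicit varchar lengths (path and r_regionkey are assumed).
CREATE EXTERNAL TABLE region (
    r_regionkey INT,
    r_name      VARCHAR(25),
    r_comment   VARCHAR(152)
) AS COPY FROM 'hdfs:///data/region/*.parquet' PARQUET;

-- The external table is read from HDFS at query time.
SELECT r_name, r_comment FROM region;

-- Optional: copy the data into a regular Vertica table (hypothetical name).
CREATE TABLE region_local AS SELECT * FROM region;
```

The last statement is only needed if you want the data stored inside Vertica rather than read from HDFS on every query.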