No suitable driver found for jdbc:mysql://metastore.example.com/metastore in Google Dataproc Serverless


I am trying to run a Spark job using Google Cloud Dataproc Serverless. The job runs fine on a normal Dataproc Spark cluster. It uses a Hive metastore stored in a MySQL database. When I run the job using Dataproc batches, I get the error below:

Caused by: java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://metastore.example.com/metastore, username = test123. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: No suitable driver found for jdbc:mysql://metastore.example.com/metastore

I have tried including the MySQL connector jar in my pom and adding the config below:

spark.hadoop.javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver
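
For reference, the dependency in my pom looks roughly like this (the version shown is illustrative):

<!-- MySQL JDBC driver; version is illustrative -->
<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-java</artifactId>
  <version>8.0.28</version>
</dependency>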

Neither of these worked. Kindly help.


There are 2 answers

Igor Dvorzhak

Dataproc Serverless Spark does not come with a pre-installed MySQL driver; you need to include the MySQL driver as a dependency of the Dataproc Serverless Spark batch if your job needs it.
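
For example, one way to do that is to upload the connector jar to a Cloud Storage bucket and pass it with the --jars flag when submitting the batch. A minimal sketch (bucket name, jar version, and main class are placeholders):

gcloud dataproc batches submit spark \
    --region=us-central1 \
    --class=com.example.MySparkJob \
    --jars=gs://my-bucket/jars/mysql-connector-java-8.0.28.jar,gs://my-bucket/jars/my-spark-job.jar \
    --properties=spark.hadoop.javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver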

Piyush Shrivastava

It turned out that I needed to add this jar to the Docker image used to run the job. I created a custom image, uploaded it to Artifact Registry, and passed it in my spark-submit command to make it work.
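
Roughly, the setup looked like this. This is only a sketch: the project, repository, image name, and jar version are placeholders, and the env vars and the spark user follow the Dataproc Serverless custom-container sample Dockerfile, so check them against the current docs.

# Dockerfile: custom Dataproc Serverless image with the MySQL driver baked in
FROM debian:11-slim

# Directory Dataproc Serverless is expected to pick extra jars up from (per the custom-container sample)
ENV SPARK_EXTRA_JARS_DIR=/opt/spark/jars/
ENV SPARK_EXTRA_CLASSPATH='/opt/spark/jars/*'
RUN mkdir -p "${SPARK_EXTRA_JARS_DIR}"
COPY mysql-connector-java-8.0.28.jar "${SPARK_EXTRA_JARS_DIR}"

# Dataproc Serverless expects a 'spark' user with uid/gid 1099
RUN groupadd -g 1099 spark && useradd -u 1099 -g 1099 -d /home/spark -m spark
USER spark

Then build the image, push it to Artifact Registry, and reference it when submitting the batch:

docker build -t us-central1-docker.pkg.dev/my-project/my-repo/spark-mysql:1.0 .
docker push us-central1-docker.pkg.dev/my-project/my-repo/spark-mysql:1.0

gcloud dataproc batches submit spark \
    --region=us-central1 \
    --container-image=us-central1-docker.pkg.dev/my-project/my-repo/spark-mysql:1.0 \
    --class=com.example.MySparkJob \
    --jars=gs://my-bucket/jars/my-spark-job.jar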