AWS Glue Studio to AWS Athena tables

I have a DB in AWS Athena with a bunch of tables. I want to perform a join of these tables using AWS Glue Studio. I have subscribed to the CData AWS Glue Connector for Amazon Athena. When I try to create a connection using this connector and connect to one of the tables in AWS Athena, I get the following error:

Py4JJavaError: An error occurred while calling o61.getSource.
: java.lang.AssertionError: assertion failed: Glue ETL Marketplace: Either user/password or secretId should be provided for JDBC connector.
    at scala.Predef$.assert(Predef.scala:170)
    at com.amazonaws.services.glue.util.DataCatalogWrapper$$anonfun$22.apply(DataCatalogWrapper.scala:301)
    at com.amazonaws.services.glue.util.DataCatalogWrapper$$anonfun$22.apply(DataCatalogWrapper.scala:264)
    at scala.util.Try$.apply(Try.scala:192)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.getCustomSourceConf(DataCatalogWrapper.scala:264)
    at com.amazonaws.services.glue.GlueContext.getCustomSourceWithConnection(GlueContext.scala:437)
    at com.amazonaws.services.glue.GlueContext.getSourceInternal(GlueContext.scala:909)
    at com.amazonaws.services.glue.GlueContext.getSource(GlueContext.scala:751)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)

I have followed all the instructions at this link: https://www.cdata.com/kb/tech/athena-glue-studio.rst

Has anyone used AWS Glue Studio to connect to Athena tables and, if so, have you faced this issue? Any pointers would be appreciated.

There are 2 answers

Answer by Robert Kossendey:

Athena is not a database but a distributed query engine.

The underlying database and its table definitions sit in the AWS Glue Data Catalog. You don't need a connector to reach those tables; in Glue Studio you simply select the Data Catalog table from the data source menu.
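As a rough illustration of this approach in script form, a Glue job can read the tables straight from the Data Catalog and join them, with no connector involved. This is only a sketch: it assumes it runs inside a Glue job where a `GlueContext` is available, and the function name, database, table, and key names are hypothetical placeholders.

```python
def join_catalog_tables(glue_context, database, left_table, right_table,
                        left_keys, right_keys):
    """Read two Data Catalog tables as DynamicFrames and join them.

    glue_context is expected to be an awsglue.context.GlueContext, which is
    only available inside a Glue job or a Glue development environment.
    """
    left = glue_context.create_dynamic_frame.from_catalog(
        database=database, table_name=left_table)
    right = glue_context.create_dynamic_frame.from_catalog(
        database=database, table_name=right_table)
    # DynamicFrame.join(paths1, paths2, frame2) performs an equality join
    # on the listed key columns.
    return left.join(left_keys, right_keys, right)


# Inside a Glue job (all names below are hypothetical):
#   from pyspark.context import SparkContext
#   from awsglue.context import GlueContext
#   glue_context = GlueContext(SparkContext.getOrCreate())
#   joined = join_catalog_tables(glue_context, "my_athena_db",
#                                "orders", "customers",
#                                ["customer_id"], ["id"])
```

Glue Studio's visual editor generates comparable code when two Data Catalog sources feed a Join node, so the script form is mainly useful if you prefer to edit the job by hand.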

Answer by OP Tester:

To get past the "Either user/password or secretId should be provided for JDBC connector." error, you need to store the connection credentials in AWS Secrets Manager and reference that secret from the connection.

To establish a connection to your Athena instance, follow the instructions below:

  1. Store the Athena connection properties (credentials) in AWS Secrets Manager:

a. Sign in to the AWS Secrets Manager console.
b. Choose ‘Store a new secret’.
c. On the ‘Store a new secret’ page, choose ‘Other type of secret’. This option means you must supply the structure and details of your secret.
d. Add the Athena connection properties as key/value pairs.
e. Click ‘Next’; you can leave the default values for the remaining configuration steps.

You can find more information on how to create an AWS Secrets Manager secret in the AWS documentation below: https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html
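If you would rather script step 1 than click through the console, the sketch below shows one way to do it with boto3. The property names (`AWSAccessKey`, `AWSSecretKey`, `AWSRegion`, `Database`, `S3StagingDirectory`) are an assumption based on the CData Athena driver's connection properties, and both function names are hypothetical; check the connector's documentation for the keys your version expects.

```python
import json


def build_athena_secret(access_key, secret_key, region, database, s3_staging_dir):
    """Return the secret body as a JSON string of key/value pairs.

    The key names are assumed from the CData Athena driver's connection
    properties; adjust them to match your connector.
    """
    return json.dumps({
        "AWSAccessKey": access_key,
        "AWSSecretKey": secret_key,
        "AWSRegion": region,
        "Database": database,
        "S3StagingDirectory": s3_staging_dir,
    })


def store_secret(name, secret_string, region_name):
    """Create the secret in AWS Secrets Manager and return its ARN.

    Requires boto3 and AWS credentials with permission to create secrets.
    """
    import boto3  # imported lazily so build_athena_secret works without it

    client = boto3.client("secretsmanager", region_name=region_name)
    response = client.create_secret(Name=name, SecretString=secret_string)
    return response["ARN"]
```

Keeping the payload builder separate from the AWS call lets you inspect the JSON before anything is written to your account.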

  2. Create a Custom Connector and then save the changes.

  3. Then create a connection: enter a connection name, keep the default ‘Connection credentials type’, and choose the name of the AWS secret you created.

Unfortunately, you will have to enter the connection properties again; this appears to be a quirk of Glue Studio.

I would also recommend setting the AuthScheme property to ‘Password’, and it is always a good idea to add the Logfile and Verbosity properties to enable logging, just in case.

  4. Create a job.

You should be good to go.
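For reference, a job that reads through such a connection usually contains a source call roughly like the sketch below. It assumes the `marketplace.jdbc` connection type and the `connectionName` and `dbTable` options that Glue uses for Marketplace JDBC connectors; the function name, connection name, and table name are hypothetical.

```python
def read_via_marketplace_connection(glue_context, connection_name, table):
    """Load one table through a Glue Marketplace JDBC connection.

    glue_context is expected to be an awsglue.context.GlueContext, which is
    only available inside a Glue job.
    """
    return glue_context.create_dynamic_frame.from_options(
        connection_type="marketplace.jdbc",
        connection_options={
            # Name of the Glue connection that references the secret
            "connectionName": connection_name,
            # Table to read through the connector
            "dbTable": table,
        },
        transformation_ctx="athena_source",
    )


# Inside a Glue job (hypothetical names):
#   frame = read_via_marketplace_connection(glue_context,
#                                           "my-athena-connection",
#                                           "my_table")
```

Because the connection carries the secret reference, the script itself never needs to contain a user name or password.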