I followed this Microsoft documentation to connect to my ADLS Gen2 storage account: https://learn.microsoft.com/en-gb/azure/databricks/connect/storage/tutorial-azure-storage
and used the following to authenticate, as described in step 6:
service_credential = dbutils.secrets.get(scope="<scope>",key="<service-credential-key>")
spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net", "<application-id>")
spark.conf.set("fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net", service_credential)
spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net", "https://login.microsoftonline.com/<directory-id>/oauth2/token")
Now when I run this:
df = spark.read.csv("abfss://<filepath>")
I get this error: abfss://filepath has invalid authority.
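For reference, the docs read data with a fully qualified abfss URI of this form (the container, account and path names below are placeholders, not my real ones):

df = spark.read.csv("abfss://<container>@<storage-account>.dfs.core.windows.net/<path>/<file>.csv")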
I have double-checked:
- the tenant ID of the SP
- the client ID of the SP
- the secret scope name, created according to the above-mentioned documentation
- the role of the service principal on the container, which is "Storage Blob Data Contributor"
File service properties of my storage account:
- Large file share: Disabled
- Identity-based access: Not configured
- Default share-level permissions: Disabled
- Soft delete: Enabled (7 days)
- Share capacity: 5 TiB
The scope for the SP didn't work even though the SP had the "Storage Blob Data Contributor" role. So I tried creating a scope for my storage account's access key instead, and it worked without any issues. I'm still not sure what exactly the issue was with the SP.
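What I used was, roughly, the standard account-key configuration from the same docs page (a sketch; <scope>, <access-key-secret-name> and <storage-account> are placeholders for my own names):

# Read the storage account access key from the secret scope
access_key = dbutils.secrets.get(scope="<scope>", key="<access-key-secret-name>")

# Authenticate to ADLS Gen2 with the account key instead of OAuth / the service principal
spark.conf.set("fs.azure.account.key.<storage-account>.dfs.core.windows.net", access_key)

With this in place, the same spark.read.csv call on the abfss path worked for me.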