In Azure Databricks, I am applying a filter to show only the rows where the Region column has the value 'weu'.
display(df.where(col("Region") == 'weu'))
But the output DataFrame I am getting has Region values of eus and sea. Can anyone explain why this is happening?
I have used some sample data like below:
The reason you are seeing Region values such as eus and sea is that the column contains values with leading and trailing white spaces.
I have tried the below approach:
This filters the DataFrame on the Region column, applying the trim() function to remove any leading or trailing white spaces before the comparison.
You can also check:
From the dilip_df DataFrame, select the Region column, convert all of its values to lowercase using the lower() function, and then return only the distinct values using the distinct() function.