How to compute a Euclidean distance matrix in PySpark?

I have a dataset with 100,000 records and need to compute its Euclidean distance matrix, which would be a 100,000 x 100,000 matrix. In Python, SciPy provides squareform(pdist(x)) for this. Since I cannot apply that function to an RDD, how can I do the same thing on Spark using Python?
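
To make the target concrete, here is a minimal pure-Python sketch of the result that squareform(pdist(x)) returns, computed on a tiny in-memory sample. The function name distance_matrix and the sample points are hypothetical, chosen only for illustration. This is not a Spark solution; it only shows the output I am trying to reproduce at scale.

```python
import math


def distance_matrix(points):
    """Return the full symmetric Euclidean distance matrix as a list of lists.

    Hypothetical helper for illustration only: it mirrors the shape of
    squareform(pdist(x)) but runs on a single machine in plain Python.
    """
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the matrix is symmetric.
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])  # requires Python 3.8+
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Made-up sample data: three 2-D points.
sample = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
result = distance_matrix(sample)
for row in result:
    print(row)
```

At full size the dense matrix would hold 100,000 x 100,000 = 10^10 entries, about 80 GB at 8 bytes per value, which is why an in-memory approach like this does not scale to my dataset.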

There are 0 answers