I have a question about pdist and would be grateful for any advice. pdist(D) combines the differences over all dimensions into a single distance per pair of points, but I want the distance in each dimension separately. For example, I have a data set S, which is a 10-by-2 matrix, and I am calling pdist(S(:,1)) and pdist(S(:,2)) to get the distances separately, but this seems very inefficient when the data has many dimensions. Is there a more efficient way to do this? Thanks in advance!
How to compute the Euclidean distance separately in each dimension?
Asked by Zhida Deng. There are 2 answers.
Another option, since you're simply taking the absolute difference of the coordinates, is to use bsxfun:
>> D = randi(20, 10, 2) % generate sample data
D =
17 12
14 10
8 4
7 11
19 13
2 18
11 14
5 19
19 12
20 8
From here, we permute the data so that the coordinates (columns) extend into the 3rd dimension and the rows are in the 1st dimension for the 1st argument, and the 2nd dimension for the 2nd argument:
>> dist = bsxfun(@(x,y)abs(x-y), permute(D, [1 3 2]), permute(D, [3 1 2]))
dist =
ans(:,:,1) =
0 3 9 10 2 15 6 12 2 3
3 0 6 7 5 12 3 9 5 6
9 6 0 1 11 6 3 3 11 12
10 7 1 0 12 5 4 2 12 13
2 5 11 12 0 17 8 14 0 1
15 12 6 5 17 0 9 3 17 18
6 3 3 4 8 9 0 6 8 9
12 9 3 2 14 3 6 0 14 15
2 5 11 12 0 17 8 14 0 1
3 6 12 13 1 18 9 15 1 0
ans(:,:,2) =
0 2 8 1 1 6 2 7 0 4
2 0 6 1 3 8 4 9 2 2
8 6 0 7 9 14 10 15 8 4
1 1 7 0 2 7 3 8 1 3
1 3 9 2 0 5 1 6 1 5
6 8 14 7 5 0 4 1 6 10
2 4 10 3 1 4 0 5 2 6
7 9 15 8 6 1 5 0 7 11
0 2 8 1 1 6 2 7 0 4
4 2 4 3 5 10 6 11 4 0
This results in a 3-d symmetric matrix where
dist(p, q, d)
gives you the distance between points p and q in dimension d with
dist(p, q, d) == dist(q, p, d)
If you want the distances between p and q in all (or multiple) dimensions, you should use squeeze to put it in a vector:
>> squeeze(dist(3, 5, :))
ans =
11
9
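As a sanity check, the per-dimension distances can be recombined into the ordinary Euclidean distance by squaring, summing over the 3rd dimension, and taking the square root. The following is a minimal sketch using the sample data above; the variable name E is only illustrative, and bsxfun is used so it should also run on releases before R2016b:

```matlab
% Sample data from above (10 points, 2 dimensions)
D = [17 12; 14 10; 8 4; 7 11; 19 13; 2 18; 11 14; 5 19; 19 12; 20 8];

% Per-dimension absolute differences: dist(p, q, d)
dist = abs(bsxfun(@minus, permute(D, [1 3 2]), permute(D, [3 1 2])));

% Recombine across dimensions: E(p, q) is the Euclidean distance
% between points p and q
E = sqrt(sum(dist.^2, 3));

% Points 3 and 5 differ by 11 and 9, so E(3, 5) is sqrt(11^2 + 9^2)
disp(E(3, 5))
```

If the Statistics Toolbox is available, E should match squareform(pdist(D)) up to floating-point rounding.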
Note that if you're using MATLAB R2016b or later (or Octave), implicit expansion lets you create the same distance matrix without bsxfun:
dist = abs(permute(D, [1 3 2]) - permute(D, [3 1 2]))
The downside to this approach is that it creates the full symmetric matrix so you're generating each distance twice, which could potentially become a memory issue.
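If memory is a concern, one possible workaround is to compute each pair only once by indexing the rows directly. This is a sketch rather than a drop-in replacement: the names p, q, and pairDist are illustrative, and the pairs are ordered (1,2), (1,3), ..., (1,n), (2,3), ..., which is the same order pdist uses.

```matlab
% Sample data from above (10 points, 2 dimensions)
D = [17 12; 14 10; 8 4; 7 11; 19 13; 2 18; 11 14; 5 19; 19 12; 20 8];
n = size(D, 1);

% Row/column indices of the strictly lower triangle, in column-major
% order, so that p < q for every pair
[q, p] = find(tril(ones(n), -1));

% One row per pair, one column per dimension: n*(n-1)/2 rows
% instead of n^2 entries per dimension
pairDist = abs(D(p, :) - D(q, :));

% The pair (3, 5) is the 19th pair in this ordering
disp(pairDist(19, :))
```

Column d of pairDist should then hold the same values as pdist(D(:,d)), at roughly half the memory of the full symmetric array.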
Assuming you just want the absolute difference between the individual dimensions of the points, then pdist is overkill. You can use the following simple function, which returns the absolute pairwise difference between all pairs of rows in S.
In this case
gives the same result as