I need to generate a lot of random mean-invariant orthogonal matrices for my work. A mean-invariant matrix has the property A*1_n = 1_n, where 1_n is a vector of size n filled with the scalar 1, basically np.ones(n). I use Python, and specifically NumPy, to create my matrices, but I want to make sure that my method is both correct and as efficient as possible. I also want to present my findings on 3 separate orthogonalization methods I have tried, and hopefully get an explanation of why one method is faster than the others. I ask four questions at the end of the post about my findings.
In general, in order to create a mean-invariant random orthogonal matrix A, you need to create a random square matrix M1, replace its first column with a column of ones and orthogonalize the matrix. Then you do the same with another matrix M2, and the final mean-invariant random orthogonal matrix is A = M1 @ M2.T (using the two orthogonalized matrices). The bottleneck of this process is the orthogonalization. There are 3 main ways of orthogonalizing a matrix: the Gram–Schmidt process, which uses projections; the Householder transformation, which uses reflections; and the Givens rotation, which uses rotations.
The creation of an n×n random matrix is pretty straightforward with NumPy:
M1 = np.random.normal(size=(n, n))
Then, I replace the first column of M1 with 1_n.
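In code, those first two steps look roughly like this (a minimal sketch of my reading of the setup; n is set to the size used in the timings below):

import numpy as np

n = 10_000
M1 = np.random.normal(size=(n, n))
M2 = np.random.normal(size=(n, n))

# Replace the first column of each matrix with 1_n.
M1[:, 0] = np.ones(n)
M2[:, 0] = np.ones(n)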
As far as I know, the Gram–Schmidt process is not implemented in any very popular library, so I found this code, which works fine:
def gram_schmidt(M1):
    """Orthogonalize a set of vectors stored as the columns of matrix M1."""
    # Get the number of vectors.
    n = M1.shape[1]
    for j in range(n):
        # To orthogonalize the vector in column j with respect to the
        # previous vectors, subtract from it its projection onto
        # each of the previous vectors.
        for k in range(j):
            M1[:, j] -= np.dot(M1[:, k], M1[:, j]) * M1[:, k]
        # Normalize column j.
        M1[:, j] = M1[:, j] / np.linalg.norm(M1[:, j])
    return M1
Obviously, the above code has to be executed for both M1 and M2.
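For clarity, the full Gram–Schmidt pipeline then looks roughly like this (a sketch using the matrices set up above; note that gram_schmidt overwrites its argument in place):

Q1 = gram_schmidt(M1)
Q2 = gram_schmidt(M2)
A = Q1 @ Q2.T   # mean-invariant random orthogonal matrix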
For a 10,000x10,000 random mean-invariant orthogonal matrix, this process takes about 1 hour on my computer (8-core @3.7GHz, 16GB RAM, 512GB SSD).
I found that instead of the Gram-Schmidt process I could orthogonalize the matrix in Numpy with this:
q1, r1 = np.linalg.qr(M1)
where q1 is the orthogonalized matrix and r1 is an upper triangular matrix (we don't need to keep r1). I do the same to M2 and get q2. Then A = q1 @ q2.T. This process for the same 10,000x10,000 matrix takes about 70 seconds on the same computer. I believe np.linalg.qr() uses the Householder transformation, but I would like someone to confirm it.
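Continuing the sketch from above, the QR-based pipeline is roughly:

q1, r1 = np.linalg.qr(M1)   # q1 orthogonal, r1 upper triangular (discarded)
q2, r2 = np.linalg.qr(M2)
A = q1 @ q2.T               # matrix product, not element-wise multiplication

# Sanity check: A should map 1_n to 1_n (any sign flip in the first columns
# of q1 and q2 is the same for both, since they share the same first column,
# so it cancels in the product).
print(np.allclose(A @ np.ones(n), np.ones(n)))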
Finally, I tried to change the way that the initial random matrices M1 and M2 are produced. Instead of M1 = np.random.normal(size=(n,n)), I used the Dirichlet distribution:
M1 = np.random.default_rng().dirichlet(np.ones(n), size=n)
Then I used np.linalg.qr() as before, and I got the 10,000x10,000 matrix in about the same time as with M1 = np.random.normal(size=(n,n)).
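As a sketch of this variant (I'm assuming the first column is still replaced with 1_n before the QR step, as in the earlier methods):

rng = np.random.default_rng()
M1 = rng.dirichlet(np.ones(n), size=n)   # shape (n, n); each row sums to 1
M1[:, 0] = np.ones(n)
q1, r1 = np.linalg.qr(M1)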
My questions are:
- Does NumPy's np.linalg.qr() method actually use the Householder transformation? Or possibly the Givens rotation?
- Why is the Gram–Schmidt process so much slower than np.linalg.qr()?
- I know that the Dirichlet process produces an almost orthogonal matrix. Is it because we are creating 10,000 dimensions, so it is likely to randomly get a vector that is orthogonal to all the others? np.linalg.qr() does not care about how close the matrix is to orthogonality.
- Is there an even faster way to generate random orthogonal matrices? Are there any optimizations I can make to my code to make it faster/more efficient?
Edit: CuPy's cp.linalg.qr() on the same 10,000x10,000 random matrix takes only 16 sec on my 2080 Ti, instead of 70 sec for NumPy on the CPU (8-core @3.7GHz with multithreading, 16GB RAM and 512GB SSD).
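For reference, a rough sketch of the CuPy version (my assumption of the setup; the matrices could equally be built with NumPy and moved to the GPU with cp.asarray):

import cupy as cp

n = 10_000
M1 = cp.random.normal(size=(n, n))
M2 = cp.random.normal(size=(n, n))
M1[:, 0] = 1.0   # first column set to 1_n, as before
M2[:, 0] = 1.0

q1, r1 = cp.linalg.qr(M1)
q2, r2 = cp.linalg.qr(M2)
A = q1 @ q2.T    # still a GPU array; cp.asnumpy(A) copies it back to the host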
Here's another method for manufacturing such matrices. I have no idea what the distribution is like, but it might be faster than the methods you describe.
First off, here a householder reflector is a matrix of the form

H = I - 2*h*h.T/(h.T*h)

where h is a vector.
Note that:
- H is orthogonal and symmetric
- H can be applied to a vector in O(dim) operations (see the sketch after this list)
- if x and y are any two vectors with the same norm, then we can find such a matrix that maps x to y (h is x - y)
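To illustrate the O(dim) application (a minimal sketch; the helper name apply_householder is mine):

import numpy as np

def apply_householder(h, x):
    """Compute H @ x for H = I - 2*h*h.T/(h.T*h) without ever forming H."""
    return x - 2.0 * h * (h @ x) / (h @ h)

# Map x to y (same norm) using h = x - y.
x = np.array([3.0, 0.0, 4.0])
y = np.array([0.0, 5.0, 0.0])      # ||y|| == ||x|| == 5
h = x - y
print(apply_householder(h, x))     # approximately [0, 5, 0]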
One way to manufacture 'random' orthogonal matrices is by computing products of 'random' Householder matrices.
If Q is an orthogonal matrix and u is the vector of all 1's, and

q = Q*u

then q and u have the same norms, so if H is the householder reflector that maps q to u (h = q - u), then H*Q is orthogonal (a product of orthogonal matrices) and maps u to u, i.e. it is mean-invariant.
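A minimal sketch of this construction as I read it (the helper householder_apply_left and the choice of 10 random reflectors for building Q are my own; any other way of producing an orthogonal Q would work just as well):

import numpy as np

def householder_apply_left(h, M):
    """Compute H @ M for H = I - 2*h*h.T/(h.T*h) without forming H."""
    return M - np.outer(2.0 * h / (h @ h), h @ M)

n = 1_000
rng = np.random.default_rng()

# A 'random' orthogonal Q as a product of a few random reflectors applied to I.
Q = np.eye(n)
for _ in range(10):
    Q = householder_apply_left(rng.normal(size=n), Q)

u = np.ones(n)
q = Q @ u
h = q - u                           # reflector mapping q to u
A = householder_apply_left(h, Q)    # A = H*Q: orthogonal and mean-invariant

print(np.allclose(A @ u, u))        # True (up to floating-point error)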