Suppose one has two lists of bitsets A=(a1,a2,...,aN) and B=(b1,b2,...,bN). Each bitset holds K bits.
Now, for each pair of bitsets (an,bn), one scans all pairs of indices (i,j) such that an[i]==1 and bn[j]==1, and increments a counter for each such pair (i,j).
At the end, one gets the number of times each pair of indices (i,j) has been seen across the combination of bitsets A and B.
Example for N=4 and K=3:
     0 1 2         0 1 2
    -------       -------
a1   0 0 1    b1   0 0 1
a2   1 0 1    b2   0 0 1
a3   0 0 0    b3   1 1 0
a4   1 1 0    b4   1 0 1
One should get the following counts:
(0,0)=1 (0,1)=0 (0,2)=2
(1,0)=1 (1,1)=0 (1,2)=1
(2,0)=0 (2,1)=0 (2,2)=2
Obviously, one can use a vector C of size K*K to store the count for each pair (i,j). Then one can iterate over 1<=n<=N, take the "product" of bitsets an and bn to obtain the relevant pairs (i,j), and update the vector C accordingly.
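For concreteness, here is a minimal sketch of that naive approach, assuming std::bitset<K> for the bitsets and a flat row-major std::vector for C (the function name and exact types are just illustrative):

#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t K = 256;

// Naive counting: for each n, enumerate all (i,j) with A[n][i]==1 and
// B[n][j]==1 and increment the corresponding cell of C (row-major, size K*K).
std::vector<std::uint32_t> count_pairs(const std::vector<std::bitset<K>>& A,
                                       const std::vector<std::bitset<K>>& B)
{
    std::vector<std::uint32_t> C(K * K, 0);
    for (std::size_t n = 0; n < A.size(); ++n) {
        for (std::size_t i = 0; i < K; ++i) {
            if (!A[n][i]) continue;            // skip rows of C with no set bit in an
            for (std::size_t j = 0; j < K; ++j) {
                if (B[n][j]) ++C[i * K + j];
            }
        }
    }
    return C;
}

This needs K*K counters of memory and, in the worst case, on the order of N*K*K increments.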
Question: is there a better way to compute these counters than this naive approach? Better here means both less memory usage and fewer computing operations.
Note that in my real use case, one would have N=10000 and K=256 for instance.
An important point is that the bitsets are quite sparse, which suggests the computation could be reduced. One idea would be to start counting a given pair (i,j) only once it has been seen at least once before (this could be done, for instance, with the help of a Bloom filter).
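To illustrate the simplest way sparsity can be exploited (this is not the Bloom-filter idea above, just a sketch under the same assumptions as the code before), one can first extract the positions of the set bits of an and bn and then loop only over those positions, so the inner work per n becomes popcount(an)*popcount(bn) instead of up to K*K:

#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t K = 256;

// Sparse variant: collect the set-bit positions of an and bn once per n,
// then only iterate over those positions when updating C.
std::vector<std::uint32_t> count_pairs_sparse(const std::vector<std::bitset<K>>& A,
                                              const std::vector<std::bitset<K>>& B)
{
    std::vector<std::uint32_t> C(K * K, 0);
    std::vector<std::size_t> ai, bj;            // reused buffers of set-bit positions
    for (std::size_t n = 0; n < A.size(); ++n) {
        ai.clear();
        bj.clear();
        for (std::size_t i = 0; i < K; ++i) if (A[n][i]) ai.push_back(i);
        for (std::size_t j = 0; j < K; ++j) if (B[n][j]) bj.push_back(j);
        for (std::size_t i : ai)
            for (std::size_t j : bj)
                ++C[i * K + j];
    }
    return C;
}

This still uses the full K*K vector C, though, so it only addresses the computation side of the question, not the memory side.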