So, I was trying speed up the my workflow by applying parallelism to my algorithm. I have a list of data, and I am given a window size that is less than the count of that data list.
My task is to apply a window on my data, and then I need to do some comparison to determine whether that window has bad data or not by returning a boolean. Then, I slide that window down by just 1, repeat, and continue the process until the list is finish.
The problem is it starts to get very slow when my window size is big. It will be something like O(N*W^2), where N is the data size, and W is the window size. I need to copy W size of data into a sublist(GetRange), and then loop through that sublist to compare.
So I try to apply parallelism by putting that code in a Parallel.For. However, when I look at the CPU core usage it just shows ~10% utilisation, and I can see some of the cores are just being idle. How do I maximize all my CPU power in this case? I have got like 16 cores on my machine.
public static double CountBadData(List<double> data, int someWindowSize, double limit)
{
var badCount = 0;
var totalSlidings = data.Count - someWindowSize + 1;
Parallel.For(0, totalSlidings, i =>
{
if (!DetermineGoodOrBad(data.GetRange(i,someWindowSize), limit))
Interlocked.Increment(ref badCount);
});
return badCount;
}
private bool DetermineGoodOrBad(List<double> subData, double limit)
{
foreach(var data in subData)
{
if (data > limit) return false;
}
return true;
}
I'm not going to try to answer why the parallelism works poorly. There might be bandwith or caching issues, or that the CPUs spend time transferring ownership of the
badCountvariable. You will likely need to profile your program to get a real answer.I'm instead going to try answering the implied question, how to make the code faster.
One way to do this is to keep track of how many bad values exist within the sliding window, so when we move the window one step we only need to check the values that where added and removed from the window, and decriment/increment the running count accordingly:
Be aware that this code is not tested. I would assume that there are a few of-by-one errors.