I'm detecting faces with a Haar cascade and tracking them from a webcam using OpenCV. I need to save each face that is tracked. The problem is that when people are moving, the face becomes blurry.
I've tried to mitigate this problem with OpenCV's DNN face detector and the variance of the Laplacian, using the following code:
blob = cv2.dnn.blobFromImage(cropped_face, 1.0, (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()
confidence = detections[0, 0, 0, 2]
blur = cv2.Laplacian(cropped_face, cv2.CV_64F).var()
if confidence >= confidence_threshold and blur >= blur_threshold:
    # imwrite picks the encoder from the file extension, so the name needs one
    cv2.imwrite('less_blurry_image.png', cropped_face)
Here I tried to save a face only when it is not blurred by motion, by setting blur_threshold to 500 and confidence_threshold to 0.98 (i.e. 98%).
But the problem is that if I change the camera I have to tune the thresholds again manually. And in most cases, setting a threshold rejects most of the faces.
Plus, the blur is difficult to detect because the background always stays sharp compared to the blurred face.
So my question is: how can I detect this motion blur on a face? I know I can train an ML model for motion blur detection of a face, but that would require heavy processing resources for a small task.
Moreover, I would need a huge amount of annotated data for training if I go that route, which is not easy for a student like me.
Hence, I am trying to detect this with OpenCV which will be a lot less resource intensive compared to using an ML model for detection.
Is there any less resource intensive solution for this?
You can probably use a Fourier Transform (FFT) or a Discrete Cosine Transform (DCT) to figure out how blurred your faces are. Blur attenuates the high spatial frequencies in an image, leaving mostly the low frequencies.
So you'd take an image of your face, zero-pad it to a size that'll work well for FFT or DCT, and look at how much spectral power you have at higher frequencies.
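To make that concrete, here is a minimal sketch of such a measure using NumPy's FFT. The function name high_freq_ratio, the 128-pixel padded size and the 0.25 cycles/pixel cutoff are all my own assumptions for illustration, not tuned values, and it expects a 2-D grayscale crop that already fits in the padded size (resize it first if it doesn't). It returns the fraction of spectral power above the cutoff, which is a ratio rather than an absolute number like the Laplacian variance.

```python
import numpy as np

def high_freq_ratio(gray, size=128, cutoff=0.25):
    """Fraction of spectral power above `cutoff` cycles/pixel (Nyquist is 0.5).

    `size` and `cutoff` are illustrative defaults, not tuned values.
    """
    gray = np.asarray(gray, dtype=np.float64)
    if gray.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    # Remove the mean so overall brightness (the DC term) does not dominate.
    gray = gray - gray.mean()
    # Zero-pad into a size x size array (anything larger is cropped).
    h, w = min(gray.shape[0], size), min(gray.shape[1], size)
    padded = np.zeros((size, size))
    padded[:h, :w] = gray[:h, :w]
    # Power spectrum of the padded image.
    power = np.abs(np.fft.fft2(padded)) ** 2
    # Radial frequency of every FFT bin, in cycles per pixel.
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    total = power.sum()
    if total == 0:
        return 0.0  # perfectly flat image: no detail at all
    return float(power[radius > cutoff].sum() / total)
```

A blurred face should give a noticeably lower ratio than a sharp one. You'd still need some cutoff on the ratio, and I haven't checked how well it carries over between cameras, so treat it as a starting point.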
You probably don't need FFT - DCT will be enough. The advantage of DCT is that it produces a real-valued result (no imaginary part). Performance-wise, FFT and DCT are really fast for sizes that are powers of 2, as well as for sizes that have only factors 2, 3 and 5 in them (although if you also have 3's and 5's it'll be a bit slower).
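For picking the padded size, a small helper along these lines would do. The name next_fast_size and the even flag are hypothetical, just for illustration. OpenCV also ships cv2.getOptimalDFTSize for the same purpose. As far as I know cv2.dct only accepts even-sized arrays, which is why the sketch has the option to skip odd results.

```python
def next_fast_size(n, even=False):
    """Smallest integer >= n whose only prime factors are 2, 3 and 5.

    With even=True, odd candidates are skipped as well.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidate = int(n)
    if even and candidate % 2:
        candidate += 1
    step = 2 if even else 1
    while True:
        # Divide out every factor of 2, 3 and 5; a remainder of 1 means
        # the candidate has no other prime factors.
        m = candidate
        for p in (2, 3, 5):
            while m % p == 0:
                m //= p
        if m == 1:
            return candidate
        candidate += step
```

For example, a 97-pixel crop would be padded to 100 (2·2·5·5), and a 131-pixel crop to 135, or to 144 if you need an even size.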