I am attempting to use Python to count cells in an image. I am more or less following the tutorial here. After a thresholding step, I find the regional maxima and count them. This works very well for counting nuclei; however, there are some false positives, including dead cells and cell fragments, that I don't want to count. The code I used:
import mahotas as mh
import numpy as np
from matplotlib import pyplot as plt
dna = mh.imread('img.jpg')
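dna = dna[:,:,0]
T_mean = dna.mean()  # assumed: mean-intensity threshold; T_mean was not defined in the original snippet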
dnaf = mh.gaussian_filter(dna.astype(float), 4)
maxima = mh.regmax(mh.stretch(dnaf))
maxima = mh.dilate(maxima, np.ones((5,5)))
plt.imshow(mh.as_rgb(np.maximum(255*maxima, dnaf), dnaf, dna > T_mean))
plt.show()
And the image is below. The dead cells are in the bottom right and just left of center. The false positives are the large red blobs.
Is there any way I can filter out these false positives? I have tried getting the sizes of all the regions and filtering based on size, but the results look odd once I take the regional maxima.
dnaf = mh.gaussian_filter(dna.astype(float), 4)
sizes = mh.labeled.labeled_size(dnaf)
too_small = np.where(sizes < 800)
dnaf = mh.labeled.remove_regions(dnaf, too_small)
maxima = mh.regmax(mh.stretch(dnaf))
maxima = mh.dilate(maxima, np.ones((5,5)))
plt.imshow(mh.as_rgb(np.maximum(255*maxima, dnaf), dnaf, dna > T_mean))
plt.show()
This only got rid of one of the false positives and distorted the image at several other locations (see below), making me think I did something wrong.
Again, this image is from a different location than the first, but it looks distorted compared to the original, and the dead cell fragments still remain, so I'm certain I am not doing this right.
So my question is: what is the best way, using Python, to remove small debris and dead cells from an image in order to get a better cell count estimate?

I think you have to start from the original image to identify the false positives. Mark the locations of the dead cells and other false positives, and record the geometric properties (such as area) of the blobs they produce. If a property like size shows a clear difference between real nuclei and debris, you can separate the two with a simple threshold on that property.
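As a rough sketch of size-based filtering: the attempt in the question passes the smoothed float image to the size filter, whereas region sizes are normally measured on a labeled binary mask. The version below thresholds first, labels the mask, drops small components, and only then counts peaks inside what remains. It uses scipy.ndimage rather than mahotas (a swapped-in library, chosen for illustration), and the helper names `remove_small_regions` and `count_maxima_in_mask`, as well as the `min_size` and `footprint` parameters, are hypothetical.

```python
import numpy as np
from scipy import ndimage as ndi


def remove_small_regions(mask, min_size):
    """Return a boolean mask without connected components smaller than min_size pixels."""
    labeled, _ = ndi.label(mask)            # 0 = background, 1..n = components
    sizes = np.bincount(labeled.ravel())    # sizes[i] = pixel count of label i
    keep = sizes >= min_size
    keep[0] = False                         # never keep the background
    return keep[labeled]                    # per-pixel lookup: keep or drop


def count_maxima_in_mask(smoothed, mask, footprint=5):
    """Count local maxima of `smoothed` that fall inside `mask`."""
    # A pixel is a peak if it equals the maximum of its footprint x footprint neighborhood.
    is_peak = smoothed == ndi.maximum_filter(smoothed, size=footprint)
    _, n_peaks = ndi.label(is_peak & mask)  # touching peak pixels count once
    return n_peaks
```

With the variables from the question, this might be used as `mask = remove_small_regions(dnaf > T_mean, 800)` followed by `count_maxima_in_mask(dnaf, mask)`. The 800-pixel cutoff is taken from the question and would need to be checked against the measured blob sizes.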
Another option is to tune parameters such as the sigma of the Gaussian filter and the size of the dilation structuring element, and to check how the detected maxima change.
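One way to explore the smoothing parameter is to count peaks for several sigma values and see where the count stabilizes. The sketch below assumes scipy.ndimage instead of mahotas and a mean-intensity threshold; `count_peaks`, `sweep_sigma`, and the default sigma values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy import ndimage as ndi


def count_peaks(image, sigma, footprint=5):
    """Smooth the image, threshold at the mean, and count local maxima above the threshold."""
    smoothed = ndi.gaussian_filter(image.astype(float), sigma)
    mask = smoothed > smoothed.mean()
    is_peak = smoothed == ndi.maximum_filter(smoothed, size=footprint)
    _, n_peaks = ndi.label(is_peak & mask)  # touching peak pixels count once
    return n_peaks


def sweep_sigma(image, sigmas=(2, 4, 8, 12)):
    """Return a dict mapping each sigma to the number of peaks found with it."""
    return {sigma: count_peaks(image, sigma) for sigma in sigmas}
```

A larger sigma merges nearby bright spots into a single peak, so small fragments next to a nucleus tend to disappear, but closely packed nuclei can be merged as well. Comparing the counts against a hand count on a few images is a reasonable way to pick a value.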
If neither of these helps, a more robust option is to collect a large number of microscopy images and train a machine learning classifier on them.
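Whatever classifier is chosen, it needs one feature vector per blob, labeled by hand as nucleus or debris. The sketch below only covers that feature-extraction step, using scipy.ndimage; the function name `region_features` and the particular features (area, mean intensity, bounding-box height and width) are assumptions, and the classifier itself is left out.

```python
import numpy as np
from scipy import ndimage as ndi


def region_features(image, mask):
    """Return one row per connected component: [area, mean intensity, bbox height, bbox width]."""
    labeled, n = ndi.label(mask)
    if n == 0:
        return np.empty((0, 4))
    index = np.arange(1, n + 1)
    area = np.bincount(labeled.ravel(), minlength=n + 1)[1:]  # skip background label 0
    mean_intensity = ndi.mean(image, labeled, index)
    slices = ndi.find_objects(labeled)                        # bounding box per label
    height = [s[0].stop - s[0].start for s in slices]
    width = [s[1].stop - s[1].start for s in slices]
    return np.column_stack([area, mean_intensity, height, width])
```

Each row can then be paired with a manual label and passed to any standard classifier.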