Calculating Euclidean distance in SQL for DeepFace facial embeddings?


Is anyone here familiar with the deepface library? I was following this article to get started with facial recognition, but I ran into an issue that I can't resolve because I'm new to this area.

Here's what I'm trying to do: I have a photo of Angelina Jolie in a group. I've created face embeddings for everyone in the photo and stored them in a SQLite db. Now I have a single photo of Angelina Jolie by herself, and I want to match this single face and get back the photo and face embedding stored in the SQLite db.

The problem: the SQL query that calculates the Euclidean distance to find matches in the db returns nothing, as if there were no Angelina Jolie faces in the db when there is one. I think the SQL is incorrect, because loading all the db data first and then running the distance calculation in pure Python does return a result.

Here's the group photo that I've stored into the db: angie_group.jpg

This is the code used to do the distance calculation in SQL:

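import math

import pandas as pd
from deepface import DeepFace

# conn is an already open sqlite3 connection to the db that holds face_meta and face_embeddings
with conn:
  cur = conn.cursor()
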
  # compare
  target_img = "angie_single.jpg"
  target_represent = DeepFace.represent(img_path=target_img, model_name="Facenet", detector_backend="retinaface")[0]
  target_embedding = target_represent["embedding"]
  target_facial_area = target_represent["facial_area"]

  # Build an inline "target" table: one (dimension, value) row per embedding component
  target_statement = ' union all '.join(
      'select %d as dimension, %s as value' % (i, str(value))  # sqlite
      for i, value in enumerate(target_embedding)
  )

  select_statement = f'''
    select * 
    from (
        select img_name, sum(subtract_dims) as distance_squared
        from (
            select img_name, (source - target) * (source - target) as subtract_dims
            from (
                select meta.img_name, emb.value as source, target.value as target
                from face_meta meta left join face_embeddings emb
                on meta.id = emb.face_id
                left join (
                    {target_statement}  
                ) target
                on emb.dimension = target.dimension
            )
        )
        group by img_name
    )
    where distance_squared < 100
    order by distance_squared asc
'''

  results = cur.execute(select_statement)
  instances = []

  for result in results:
      print(result)
      img_name, distance_squared = result
      # The query returns squared distances; take the root to get the Euclidean distance
      instances.append([img_name, math.sqrt(distance_squared)])
  
  result_df = pd.DataFrame(instances, columns = ['img_name', 'distance'])

  print(result_df)

And here is the target photo I'm using to query: angie_single.jpg

Unfortunately, the above code finds nothing, even though there should be one hit (from the group photo):

Empty DataFrame
Columns: [img_name, distance]
Index: []

If I load all the db data into memory and run the calculation in Python instead of SQL, I do get a match:

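import numpy as np

def findEuclideanDistance(row):
    # Stored embedding for one face and the target embedding, as arrays
    source = np.array(row['embedding'])
    target = np.array(row['target'])
    # Square root of the summed squared component differences
    distance = (source - target)
    return np.sqrt(np.sum(np.multiply(distance, distance)))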

This finds one match:

          img_name                                          embedding                                             target  distance
0  angie_group.jpg  [0.10850527882575989, 0.5568691492080688, 0.81...  [-0.6434235572814941, 0.5883399248123169, 0.29...  8.263514
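
As a quick sanity check on the threshold: the SQL filters on the squared distance, so the Python result has to be squared before it is compared with 100. The short sketch below only redoes that arithmetic with the distance printed above. It assumes the SQL is meant to compute the same per-face distance as the Python function.

```python
import math

python_distance = 8.263514  # distance reported by the in-memory calculation above
distance_squared = python_distance ** 2

# The SQL keeps rows where distance_squared < 100, which is the same as distance < 10
threshold_squared = 100
print(round(distance_squared, 2), distance_squared < threshold_squared)  # 68.29 True
```

So the cutoff of 100 by itself would not exclude this face; whatever differs between the two versions is likely in how the SQL joins or groups the rows.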

What's missing in the SQL code? Why doesn't it match anything?
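
One thing worth checking, offered as a guess rather than a confirmed diagnosis: the query groups by img_name, while the Python version computes one distance per face. If every face detected in the group photo has its own row in face_meta with the same img_name, the SQL sum would add up the squared distances of all faces in that photo and could exceed 100 even when one face is close. The sketch below reproduces that effect in an in-memory SQLite db. The table and column names follow the query above; the two-dimensional embeddings, the file name group.jpg, and the column types are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table face_meta (id integer primary key, img_name text)")
cur.execute("create table face_embeddings (face_id integer, dimension integer, value real)")

# Hypothetical data: two faces from the same photo, with tiny 2-dimensional embeddings
faces = {1: [0.0, 6.0], 2: [8.0, 0.0]}
for face_id, embedding in faces.items():
    cur.execute("insert into face_meta values (?, ?)", (face_id, "group.jpg"))
    cur.executemany(
        "insert into face_embeddings values (?, ?, ?)",
        [(face_id, i, v) for i, v in enumerate(embedding)],
    )

# Inline target table, built the same way as in the question
target_embedding = [0.0, 0.0]
target_statement = " union all ".join(
    "select %d as dimension, %s as value" % (i, v) for i, v in enumerate(target_embedding)
)

query = """
    select {key}, sum((emb.value - target.value) * (emb.value - target.value)) as distance_squared
    from face_meta meta
    join face_embeddings emb on meta.id = emb.face_id
    join ({target}) target on emb.dimension = target.dimension
    group by {key}
    order by distance_squared
"""

# Grouping by image name merges both faces into a single sum
per_image = cur.execute(query.format(key="meta.img_name", target=target_statement)).fetchall()
# Grouping by face id keeps one distance per face
per_face = cur.execute(query.format(key="meta.id, meta.img_name", target=target_statement)).fetchall()

print(per_image)  # [('group.jpg', 100.0)]
print(per_face)   # [(1, 'group.jpg', 36.0), (2, 'group.jpg', 64.0)]
```

With a `distance_squared < 100` filter, the per-image row is dropped while both per-face rows pass. If this is what happens with the real data, grouping by the face id (and img_name) instead of img_name alone would be the change to try. A separate thing to rule out is a NULL sum, which a left join produces when a face has no embedding rows and which also fails the `< 100` comparison.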
