How to test if one dataframe contains string from another dataframe (loop)


I have two dataframes, addresses and cities, and I want to check which addresses correspond to a city in the matching region:

cities = pd.DataFrame({'region': {0: 'region_1', 1: 'region_1', 2: 'region_2'},
                      'city': {0: 'city_1', 1: 'city_2', 2: 'city_3'}},
                      columns=['city', 'region'])

addresses = pd.DataFrame({'region': {0: 'region_1', 1: 'region_2', 2: 'region_1'},
                       'address': {0: 'adress_1', 1: 'adress_2', 2: 'adress_3'}},
                       columns=['region', 'address'])

As a result, I want to add a True/False column to addresses: a row should be True when its region also appears in cities and one of that region's cities occurs in the address. I tried the following:

cities_list = cities.groupby('region')['name'].agg(list).reset_index(city='cities')
cities_list = cities_list.astype('string')
addresses = addresses.astype('string')

for row in ['region', 'address']:
       regions_list = cities_list ['region']
       names_list = cities_list ['city']
       row['check'] = addresses['region'].str.findall('|'.join(regions_list))&`addresses['address'].str.findall('|'.join(names_list)))

It doesn't work.


There is 1 answer

merruem On

You can loop over the values of one dataframe and check whether each one occurs as a substring of any value in the other dataframe. The function below prints the outcome for every value in df1['Column_Name']:

import pandas as pd

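def check_string(df1, df2):
    # Take each search string from df1 in turn
    for value in df1['Column_Name']:
        # Substring test against every entry of df2 (cast to str in case of non-string cells)
        if any(value in str(row) for row in df2['Column_Name']):
            print(f"String '{value}' found in df2.")
        else:
            print(f"String '{value}' not found in df2.")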

# Example dataframes
df1 = pd.DataFrame({'Column_Name': ['String1', 'String2', 'String3']})
df2 = pd.DataFrame({'Column_Name': ['This is String1', 'Some other text', 'String3 data']})

# Calling the function
check_string(df1, df2)
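
The loop above only prints its findings and ignores the region. For the setup in the question, a possible alternative is to merge addresses with cities on region and then test each address against the cities of its own region. This is only a sketch under some assumptions: the function name flag_known_city and the sample address strings are made up for illustration (the addresses in the question do not contain any city name), addresses is assumed to have a default unnamed index, and the match is a plain substring test.

```python
import pandas as pd

cities = pd.DataFrame({
    'region': ['region_1', 'region_1', 'region_2'],
    'city': ['city_1', 'city_2', 'city_3'],
})

# Hypothetical addresses that embed a city name, so the check has something to find
addresses = pd.DataFrame({
    'region': ['region_1', 'region_2', 'region_1'],
    'address': ['city_1, Main St 5', 'city_1, Oak Ave 2', 'city_3, Elm Rd 9'],
})

def flag_known_city(addresses, cities):
    # Pair every address with every city of the same region;
    # reset_index() keeps the original row label in a column named 'index'
    pairs = addresses.reset_index().merge(cities, on='region', how='left')
    # Plain substring test per pair; regions without any city yield NaN and count as False
    pairs['hit'] = [
        isinstance(city, str) and city in address
        for address, city in zip(pairs['address'], pairs['city'])
    ]
    # An address passes if at least one city of its region matched
    hit_per_row = pairs.groupby('index')['hit'].any()
    return addresses.assign(check=hit_per_row.reindex(addresses.index).to_numpy())

result = flag_known_city(addresses, cities)
print(result)
```

With this sample data only the first row is flagged: the second address mentions city_1 but sits in region_2, and the third mentions city_3 but sits in region_1. Note that a substring test would also treat 'city_1' as contained in 'city_10', so stricter matching (for example on word boundaries) may be needed for real data.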