How to call multiple str.contains on same column to take out data in pandas

Question

688 views · Asked by M-M on 13 December 2017 at 23:59

I created a working example like this:

from collections import OrderedDict
import pandas as pd

appart = OrderedDict([ ('Description', ['A LOUER F2 GRENOBLE Quartier Île Verte Rue Eugène Delacroix, place Dr Girard, proche tramway B et ligne de bus, 50,60 m² 4 ème étage avec ascenseur.', 'Actuellement libre.Transport : Ligne de bus C6 desservant le centre ville toutes les 10 mintram A arrêt Mc2Le stationnement.', ' Idéalement située: -à deux pas du Tram (Arrêt Gustave RIvet)-à 10 minutes du Centre Ville -supermarché à 2']),
      ('Loyer', [350, 267, 150]),
      ('Type',  ['Appartement', 'Maison', 'Parking']),
      ('Surface', [25, 18, 15]) ] )
df1 = pd.DataFrame.from_dict(appart)
df1

And this is my output:

    Description                                         Loyer   Type            Surface

0   A LOUER F2 GRENOBLE Quartier Île Verte Rue Eug...   350     Appartement     25
1   Actuellement libre.Transport : Ligne de bus C6...   267     Maison          18
2   Idéalement située: -à deux pas du Tram (Arrêt...    150     Parking         15

For this DataFrame, I want to extract the area from each description and put it in a new column called Quartier. For example, if a description matches 'victor hugo|centre ville|hyper-centre-ville', add 'Centre Ville' to the Quartier column; if it matches 'ile verte|Île-verte|ile-verte|la tronche', add 'Île-Verte'; and so on for each area.

Original Q&A

There is 1 answer

**furas** · Accepted Answer · 2017-12-14T18:57:42+00:00

I use df['Description'].apply(callback) to run a function on every value in the column. The values it returns form the new column.

import pandas as pd
import re

appart = {
    'Description': [
        'A LOUER F2 GRENOBLE Quartier Île Verte Rue Eugène Delacroix, place Dr Girard, proche tramway B et ligne de bus, 50,60 m² 4 ème étage avec ascenseur.',
        'Actuellement libre.Transport : Ligne de bus C6 desservant le centre ville toutes les 10 mintram A arrêt Mc2Le stationnement.',
        ' Idéalement située: -à deux pas du Tram (Arrêt Gustave RIvet)-à 10 minutes du Centre Ville -supermarché à 2'
    ],
    'Loyer': [350, 267, 150],
    'Type': ['Appartement', 'Maison', 'Parking'],
    'Surface': [25, 18, 15]
}

df = pd.DataFrame(appart)
print(df)

# ----

def callback(text):
    if re.search('Victor Hugo|victor hugo|Centre-ville|centre ville|hyper-centre-ville|gare|grenette|saint André', text, re.IGNORECASE):
        return 'Centre-ville'

    if re.search('ile verte|Île-verte|ile-verte|la tronche|trois tours|île verte', text, re.IGNORECASE):
        return 'Île-Verte'

    return ''

df['Quartier'] = df['Description'].apply(callback)
print(df)
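
The question asks for the same treatment "for each area", so the chain of if statements can grow long. A minimal sketch of one way to keep it manageable, assuming the patterns live in a dictionary: `QUARTIER_PATTERNS`, `label_quartier` and the `demo` frame are hypothetical names introduced here for illustration, and the patterns are condensed versions of the ones above. Dictionary order sets the priority, so the first matching label wins, as in the callback.

```python
import pandas as pd

# Hypothetical mapping: label -> regex alternatives (insertion order = priority)
QUARTIER_PATTERNS = {
    'Centre-ville': r'victor hugo|centre[- ]ville|hyper-centre-ville|gare|grenette|saint andré',
    'Île-Verte': r'[iî]le[- ]verte|la tronche|trois tours',
}

def label_quartier(descriptions: pd.Series, patterns: dict) -> pd.Series:
    """Return the first matching label for each description, '' if none matches."""
    result = pd.Series('', index=descriptions.index, dtype=object)
    for label, pattern in patterns.items():
        # case=False ignores case; na=False treats missing text as "no match"
        matches = descriptions.str.contains(pattern, case=False, na=False)
        # Only fill rows that an earlier pattern has not labelled yet
        result = result.mask(matches & (result == ''), label)
    return result

demo = pd.DataFrame({'Description': [
    'Quartier Île Verte, proche tramway B',
    'Ligne de bus C6 desservant le centre ville',
    'Parking couvert, rue calme',
]})
demo['Quartier'] = label_quartier(demo['Description'], QUARTIER_PATTERNS)
print(demo)
```

Adding another area then means adding one dictionary entry rather than another if block.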

EDIT: I think you could nest one np.where() inside another, as its third argument:

 np.where( ..., ..., np.where())

but I don't know if it gives the correct result.

import numpy as np

centre = 'Victor Hugo|victor hugo|Centre-ville|centre ville|hyper-centre-ville|gare|grenette|saint André'
ile_verte = 'ile verte|Île-verte|ile-verte|la tronche|trois tours|île verte'

# na=False: a missing description matches neither pattern (na=True would label it 'Centre-ville')
df['Quartier_2'] = np.where(df['Description'].str.contains(centre, case=False, na=False), 'Centre-ville',
    np.where(df['Description'].str.contains(ile_verte, case=False, na=False), 'Île-Verte', ''))

print(df)
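
As an alternative to nesting np.where() calls, np.select() takes a list of conditions and a matching list of labels, and for each row uses the label of the first condition that is True. That mirrors the order of the if statements in the callback. The sketch below is an illustration only: `df_demo`, `centre` and `ile_verte` are names made up here, the patterns are shortened, and one row deliberately matches both patterns to show which label wins.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({'Description': [
    'Quartier Île Verte, proche tramway B',
    'Bus C6 desservant le centre ville',
    'Proche gare, limite La Tronche',  # matches both patterns
    None,                              # missing description
]})

centre = r'victor hugo|centre-ville|centre ville|gare|grenette'
ile_verte = r'ile verte|île-verte|ile-verte|île verte|la tronche'

conditions = [
    df_demo['Description'].str.contains(centre, case=False, na=False),
    df_demo['Description'].str.contains(ile_verte, case=False, na=False),
]
labels = ['Centre-ville', 'Île-Verte']

# For each row, np.select picks the label of the first True condition,
# and falls back to the default when no condition is True
df_demo['Quartier'] = np.select(conditions, labels, default='')
print(df_demo)
```

With more than two areas, this stays flat: each new area adds one condition and one label instead of another level of nesting.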

Here I use apply() on a single column, but you can also call it on several columns or on the whole DataFrame. In that case, pass axis=1 so the function receives each row instead of each column. Inside the function you can then read values from different columns.

def callback(row):

    text = row['Description']

    if re.search('Victor Hugo|victor hugo|Centre-ville|centre ville|hyper-centre-ville|gare|grenette|saint André', text, re.IGNORECASE):
        return 'Centre-ville'

    if re.search('ile verte|Île-verte|ile-verte|la tronche|trois tours|île verte', text, re.IGNORECASE):
        return 'Île-Verte'

    return ''

df['Quartier'] = df.apply(callback, axis=1)
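
To make the point about reading several columns concrete, here is a small sketch. The rule it applies (joining the listing type with the detected area) and the names `df_demo`, `summarize` and `Resume` are invented for illustration and are not part of the question.

```python
import re
import pandas as pd

df_demo = pd.DataFrame({
    'Description': ['Quartier Île Verte, proche tramway B',
                    'Place de parking, centre ville'],
    'Type': ['Appartement', 'Parking'],
})

def summarize(row):
    # With axis=1, `row` is a Series holding every column of one row
    text = row['Description']
    if re.search('centre-ville|centre ville', text, re.IGNORECASE):
        quartier = 'Centre-ville'
    elif re.search('ile verte|île verte|île-verte', text, re.IGNORECASE):
        quartier = 'Île-Verte'
    else:
        quartier = ''
    # Hypothetical rule: combine a second column with the detected area
    return f"{row['Type']} / {quartier}" if quartier else row['Type']

df_demo['Resume'] = df_demo.apply(summarize, axis=1)
print(df_demo)
```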
