How to merge pandas dataframe passing a lambda as first parameter?


Within a pandas method chain, how can I call merge with a lambda that receives the current state of the dataframe, without using pipe?

The code below works, but it depends on the pipe method.

import pandas as pd

(pd.DataFrame(
    [{'YEAR':2013,'FK':1, 'v':1},
     {'YEAR':2013,'FK':2, 'v':2},
     {'YEAR':2014,'FK':1, 'v':3},
     {'YEAR':2014,'FK':2, 'v':4}
    ])
  .pipe(lambda w: w.merge(w.query('YEAR==2013')[['FK','v']],
        on='FK',
        how='left'
       ))
)

The code below doesn't work.

(pd.DataFrame(
    [{'YEAR':2013,'FK':1, 'v':1},
     {'YEAR':2013,'FK':2, 'v':2},
     {'YEAR':2014,'FK':1, 'v':3},
     {'YEAR':2014,'FK':2, 'v':4}
    ])
 .merge(lambda w: w.query('YEAR==2013'),
        on='FK',
        how='left'
       )
)

It raises: TypeError: Can only merge Series or DataFrame objects, a <class 'function'> was passed

Best answer, by mozway:

You can't. This is precisely why the pipe method exists.
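As a rough illustration of what pipe does here: df.pipe(f) simply calls f(df), so the merge logic can live in an ordinary function that receives the current frame. The helper name add_2013_values and the suffixes argument are assumptions added for this sketch; they are not part of the question.

```python
import pandas as pd


def add_2013_values(w):
    # Hypothetical helper: join every row with the 2013 value for the same FK.
    ref = w.query('YEAR == 2013')[['FK', 'v']]
    return w.merge(ref, on='FK', how='left', suffixes=('', '_2013'))


df = pd.DataFrame(
    [{'YEAR': 2013, 'FK': 1, 'v': 1},
     {'YEAR': 2013, 'FK': 2, 'v': 2},
     {'YEAR': 2014, 'FK': 1, 'v': 3},
     {'YEAR': 2014, 'FK': 2, 'v': 4}]
)

# Both calls do the same thing; pipe only makes the call chainable.
via_pipe = df.pipe(add_2013_values)
direct = add_2013_values(df)
```

Naming the function instead of writing an inline lambda also keeps long chains easier to read.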

For completeness, DataFrame methods/accessors that accept a callable (as primary parameter and as of pandas 2.0.3) are:

For other cases, you need to use pipe.
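A minimal sketch of the difference, using assign and loc as two examples of methods that do accept a callable (this is not a reconstruction of the full list above). The column names v2 and v_2013 are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [{'YEAR': 2013, 'FK': 1, 'v': 1},
     {'YEAR': 2013, 'FK': 2, 'v': 2},
     {'YEAR': 2014, 'FK': 1, 'v': 3},
     {'YEAR': 2014, 'FK': 2, 'v': 4}]
)

result = (
    df
    # assign accepts a callable that receives the current frame
    .assign(v2=lambda w: w['v'] * 2)
    # merge does not, so the lambda is routed through pipe;
    # loc inside the lambda also takes a callable as the row selector
    .pipe(lambda w: w.merge(
        w.loc[lambda x: x['YEAR'] == 2013, ['FK', 'v']],
        on='FK',
        how='left',
        suffixes=('', '_2013'),
    ))
)
```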