Sklearn Label Encoder - Not getting desired output based on prediction and inverse transform

Question

871 views Asked by ItsMeGokul At 18 January 2022 at 17:43

I'm new to machine learning in Python with scikit-learn. I'm working on a model built from a DataFrame with three columns: pets, owner and location.

import pandas
import joblib
from sklearn.tree import DecisionTreeClassifier
from collections import defaultdict
from sklearn import preprocessing 

df = pandas.DataFrame({
    'pets': ['cat', 'dog', 'cat', 'monkey', 'dog', 'dog'], 
    'owner': ['Champ', 'Ron', 'Brick', 'Champ', 'Veronica', 'Ron'], 
    'location': ['San_Diego', 'New_York', 'New_York', 'San_Diego', 'San_Diego', 
                 'New_York']
})

Next, I encode the entire DataFrame with a label encoder.

le = preprocessing.LabelEncoder()
df_encoded = df.apply(le.fit_transform)
df_array=df_encoded.values

Then I split the encoded array into an input set (pets and owner) and an output set (location).

IpSet = df_array[:,0:2]
Opset = df_array[:,2:3]

After that, I create a decision tree classifier and fit it on the input and output sets.

model = DecisionTreeClassifier()
model.fit(IpSet,Opset)

Finally, I try to predict the location for a new DataFrame, using the same label encoder as before.

df_Predict = pandas.DataFrame({
    'pets': ['cat'], 
    'owner': ['Champ']})
df_encoded_Predict = df_Predict.apply(le.fit_transform)
predictions_train = model.predict(df_encoded_Predict)
print(le.inverse_transform(predictions_train)[:1])

With this, I expected to see the value 'San_Diego', but I get 'Champ' as the output and I'm not sure why.

Could someone help me through this?

Original Q&A

There is 1 answer

**sulhi** · Answer 1 · 2022-01-19T17:13:50+00:00

The logic you are following is not correct.

    df_encoded = df.apply(le.fit_transform)

Here the same encoder (le) is fitted on every column in turn, so by the time this line has finished executing, le holds only the classes of the last column, location.
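
A minimal sketch of this effect, reusing the data from the question. The names classes_after_training and classes_after_predict are introduced here only for illustration. It inspects le.classes_ after each apply call to show which column the encoder actually remembers.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({
    'pets': ['cat', 'dog', 'cat', 'monkey', 'dog', 'dog'],
    'owner': ['Champ', 'Ron', 'Brick', 'Champ', 'Veronica', 'Ron'],
    'location': ['San_Diego', 'New_York', 'New_York', 'San_Diego',
                 'San_Diego', 'New_York'],
})

le = LabelEncoder()

# apply() calls le.fit_transform once per column, so each call
# overwrites what the previous one learned.
df_encoded = df.apply(le.fit_transform)
classes_after_training = list(le.classes_)
print(classes_after_training)  # ['New_York', 'San_Diego'] (location only)

# Refitting on the prediction frame overwrites the encoder again;
# the last column fitted is now 'owner'.
df_predict = pd.DataFrame({'pets': ['cat'], 'owner': ['Champ']})
df_predict.apply(le.fit_transform)
classes_after_predict = list(le.classes_)
print(classes_after_predict)  # ['Champ']
```

After the second apply call the encoder knows a single class, 'Champ', so inverse_transform can only ever return that value.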

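In the prediction step, the line below refits le on pets and then on owner, so le ends up knowing only the class 'Champ', which is why inverse_transform returns it. When you need to reuse an already fitted encoder, call .transform() instead of .fit_transform(), and keep a separate encoder for each column.

    df_encoded_Predict = df_Predict.apply(le.fit_transform)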
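
One possible way to restructure the code along these lines is sketched below. It is not the only fix: the encoders dictionary and the names X, y, X_new and predicted_location are assumptions made for this example, and random_state=0 is added only to make the run repeatable. Each column gets its own LabelEncoder, fitted once on the training data and then reused with transform() at prediction time.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    'pets': ['cat', 'dog', 'cat', 'monkey', 'dog', 'dog'],
    'owner': ['Champ', 'Ron', 'Brick', 'Champ', 'Veronica', 'Ron'],
    'location': ['San_Diego', 'New_York', 'New_York', 'San_Diego',
                 'San_Diego', 'New_York'],
})

# One encoder per column, each fitted exactly once on the training data.
encoders = {col: LabelEncoder().fit(df[col]) for col in df.columns}
df_encoded = pd.DataFrame(
    {col: encoders[col].transform(df[col]) for col in df.columns}
)

X = df_encoded[['pets', 'owner']]   # input features
y = df_encoded['location']          # target

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# At prediction time, only transform() with the already fitted encoders.
df_predict = pd.DataFrame({'pets': ['cat'], 'owner': ['Champ']})
X_new = pd.DataFrame(
    {col: encoders[col].transform(df_predict[col]) for col in df_predict.columns}
)

# Decode the prediction with the encoder that was fitted on 'location'.
pred = model.predict(X_new)
predicted_location = encoders['location'].inverse_transform(pred)[0]
print(predicted_location)  # San_Diego
```

Note that transform() raises a ValueError for a value the encoder did not see during fit, so new pets or owners need separate handling. scikit-learn also documents LabelEncoder as intended for target values; OrdinalEncoder is the counterpart meant for input features and may be a better fit for the pets and owner columns.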
