raise ValueError("Input contains NaN") ValueError: Input contains NaN when trying to build machine learning model

Question

840 views Asked by J. Doe At 14 December 2020 at 17:47

I am trying to build a prediction model but keep getting an error: raise ValueError("Input contains NaN") ValueError: Input contains NaN. I tried to check the data with np.all(np.isfinite(dataframe)) and np.any(np.isnan(dataframe)), but I just keep getting new errors, for example: TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''.

Here is the code so far:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
import numpy as np

dataframe = pd.read_csv('file.csv', delimiter=',')

le = LabelEncoder()
dfle = dataframe

dfle2 = dfle.apply(lambda col: le.fit_transform(col.astype(str)), axis=0, result_type='expand')

newdf = dfle2[['column1', 'column2', 'column3', 'column4', 'column5', 'column6', 'column7']]

X = dataframe[['column1', 'column2', 'column4', 'column5', 'column6', 'column7']].values

y = dfle.column3

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
ohe = OneHotEncoder()

ColumnTransformer([('encoder', OneHotEncoder(), [0])], remainder='passthrough')
# np.all(np.isfinite(dfle))
# np.any(np.isnan(dfle))
X = ohe.fit_transform(X).toarray()

There are 2 answers

Alex Newman On 14 December 2020 at 18:05

The error

TypeError: ufunc 'isfinite' not supported for the input types,
and the inputs could not be safely coerced to any supported types
according to the casting rule ''safe''

is probably because you're converting to str when doing col.astype(str). Use something like astype(float) instead.
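To illustrate the dtype point, here is a minimal sketch on a made-up frame (the column names "num" and "cat" and their values are hypothetical, not taken from the question's file). It shows that NumPy's element-wise checks raise a TypeError on an array that holds strings, while pandas' isna() works regardless of dtype, so it is probably the safer way to look for missing values here.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one numeric and one text column, each with a missing value.
df = pd.DataFrame({"num": [1.0, np.nan, 3.0], "cat": ["a", "b", None]})

# Mixed columns become an object-dtype array, which np.isnan cannot process.
try:
    np.isnan(df.to_numpy())
    numpy_check_failed = False
except TypeError:
    numpy_check_failed = True

# pandas' own check works on any dtype.
nan_counts = df.isna().sum()             # missing values per column
has_nan = bool(df.isna().any().any())    # True if anything is missing

print(numpy_check_failed)
print(nan_counts)
print(has_nan)
```

Running the per-column count on the real dataframe should show which columns actually contain the NaN values.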

As for the NaN error, you need to figure out whether it is enough to replace the missing values with zeros (fillna(0)), or whether you need something more complex, such as a Kalman filter.
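A rough sketch of the simple replacement route, assuming the data mixes numeric and text columns (the values below are invented, and the "missing" placeholder is just an illustrative choice): numeric columns get 0 and text columns get a placeholder label, so that no NaN is left before encoding.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV from the question.
df = pd.DataFrame({
    "column1": ["red", None, "blue"],
    "column2": [1.5, np.nan, 3.0],
})

# Split the columns by kind so each gets a sensible fill value.
numeric_cols = df.select_dtypes(include="number").columns
text_cols = df.columns.difference(numeric_cols)

filled = df.copy()
filled[numeric_cols] = filled[numeric_cols].fillna(0)
filled[text_cols] = filled[text_cols].fillna("missing")

# Count what is still missing after filling.
remaining_nan = int(filled.isna().sum().sum())
print(filled)
print(remaining_nan)
```

Whether 0 is a reasonable fill value depends on what the column means, so this is only a starting point.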

Atif Rizwan (Accepted Answer) On 14 December 2020 at 17:55

You can deal with this error in several ways. First, you can fill the NaN values with 0:

dataframe = pd.read_csv('file.csv', delimiter=',').fillna(0)

Or you can use scikit-learn's imputation techniques to fill the NaN values:

https://scikit-learn.org/stable/modules/classes.html#module-sklearn.impute

Several imputation techniques are available, but you should use KNNImputer.
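As a minimal sketch of the KNNImputer suggestion, assuming purely numeric features (the array below is made up, and n_neighbors=2 is an arbitrary choice for such a small example): each missing entry is replaced by the mean of that feature over the nearest rows. KNNImputer works on numeric data only, so text columns would need to be encoded or filled separately first.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical numeric feature matrix with one missing entry.
X = np.array([
    [1.0, 2.0],
    [3.0, np.nan],
    [5.0, 6.0],
    [7.0, 8.0],
])

# Fill each NaN with the average of that column over the 2 nearest rows,
# where distance is computed on the features that are present.
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)

print(X_filled)
print(np.isnan(X_filled).any())
```

The filled array can then be passed on to the encoder or model in place of the original values.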
