Hello, I tried to look up this error online but couldn't find much information about it. As I understand it, I have to combine X_train and y_train into a single DataFrame, but I don't know how to do that. Both variables are Series, so do I have to create a new DataFrame, or is there a different way?
The full error message: ValueError: Expected a 2-dimensional container but got <class 'pandas.core.series.Series'> instead. Pass a DataFrame containing a single row (i.e. single sample) or a single column (i.e. single feature) instead.
My code (the error is raised at the line model.fit(X_train, y_train)):
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
df = pd.read_excel('Month.xlsx', usecols='W, AJ')
data = pd.read_excel('Month.xlsx')
x = df.iloc[:,0]
y = df.iloc[:,1]
x = x.replace(r'^\s*$', 0, regex=True)
y = y.replace(r'^\s*$', 0, regex=True)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
# Create the linear regression model
model = LinearRegression()
# Train the model
model.fit(X_train, y_train)
# Predict with the trained model
y_pred = model.predict(X_test)
# Plot the regression line
plt.scatter(X_test, y_test, color='gray')
plt.plot(X_test, y_pred, color='red', linewidth=2)
plt.xlabel('Humidity')
plt.ylabel('Rainfall')
plt.show()
scikit-learn's fit expects the features X to be two-dimensional (in your case an n×1 array), while the target y may stay one-dimensional. So x has to be a DataFrame rather than a Series. Replace the assignment of x (x = df.iloc[:,0]) with:
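The replacement line is missing from the answer as posted, so here is a minimal sketch of what it likely looks like. It assumes a small made-up DataFrame in place of Month.xlsx (the column names and values are hypothetical); the point is that indexing with a list of positions, `df.iloc[:, [0]]`, returns a one-column DataFrame instead of a Series.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the two columns read from Month.xlsx
df = pd.DataFrame({
    "humidity": [60, 65, 70, 75, 80, 85, 90, 95, 72, 88],
    "rainfall": [1.0, 1.5, 2.1, 2.4, 3.0, 3.6, 4.1, 4.4, 2.2, 3.9],
})

# A list of column positions keeps the result 2-dimensional (a DataFrame)
x = df.iloc[:, [0]]   # shape (n, 1)
# The target can remain a 1-dimensional Series
y = df.iloc[:, 1]

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)      # no ValueError: X_train is a DataFrame
y_pred = model.predict(X_test)
```

If you would rather keep x as a Series up to that point, `x.to_frame()` or `x.to_numpy().reshape(-1, 1)` should give fit the same two-dimensional shape.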