SVC with linear kernel incorrect accuracy


The model's accuracy does not improve across training epochs when the values are sorted by a specific row key. The dataset is balanced and has 40,000 records with binary labels (0, 1).

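from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

Linear_SVC_classifier = SVC(kernel='linear', random_state=1)  # supervised learning
Linear_SVC_classifier.fit(x_train, y_train)
# Predict on the held-out split before scoring
SVC_Prediction = Linear_SVC_classifier.predict(x_test)
SVC_Accuracy = accuracy_score(y_test, SVC_Prediction)
print("\n\n\nLinear SVM Accuracy: ", SVC_Accuracy)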

1 answer

user11717481 (best answer)

Apply a count vectorizer to your training data and use a logistic regression model:

from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score 

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0) 

cv = CountVectorizer() 
ctmTr = cv.fit_transform(X_train) 
X_test_dtm = cv.transform(X_test)

model = LogisticRegression() 
model.fit(ctmTr, y_train)

y_pred_class = model.predict(X_test_dtm)

# accuracy_score needs both the true labels and the predictions
LR_Accuracy = accuracy_score(y_test, y_pred_class)
print("\n\n\nLogistic Regression Accuracy: ", LR_Accuracy)

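The same vectorized features can also be fed to the linear SVC from the question. The two models are comparable rather than strictly equivalent: both are linear classifiers, but logistic regression minimizes the logistic loss while the SVC minimizes the hinge loss.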

Linear_SVC_classifier = SVC(kernel='linear', random_state=1)  
Linear_SVC_classifier.fit(ctmTr, y_train)
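
For reference, here is a minimal end-to-end sketch that fits both models on the same count-vectorized features and scores each on the held-out split. The toy corpus, the `evaluate` helper and the `scores` dictionary are hypothetical stand-ins (the original dataset is not shown), and it assumes the inputs are raw text strings, which is what `CountVectorizer` expects.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical toy corpus standing in for the real 40,000-record dataset.
positive = ["good", "great", "excellent", "wonderful"]
negative = ["bad", "awful", "terrible", "poor"]
nouns = ["movie", "film", "plot", "acting", "ending"]

texts = [f"{adj} {noun}" for adj in positive + negative for noun in nouns]
labels = [1 if adj in positive else 0 for adj in positive + negative for _ in nouns]

# Stratify so both classes stay balanced in the train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0, stratify=labels
)

# Learn the vocabulary on the training split only, then reuse it for the test split.
cv = CountVectorizer()
X_train_dtm = cv.fit_transform(X_train)
X_test_dtm = cv.transform(X_test)


def evaluate(model):
    """Fit on the training matrix and return accuracy on the held-out matrix."""
    model.fit(X_train_dtm, y_train)
    predictions = model.predict(X_test_dtm)
    return accuracy_score(y_test, predictions)


scores = {
    "LogisticRegression": evaluate(LogisticRegression()),
    "SVC": evaluate(SVC(kernel="linear", random_state=1)),
}

for name, accuracy in scores.items():
    print(f"{name} accuracy: {accuracy:.3f}")
```

Scoring both models through the same helper makes it harder to repeat the mistake of calling `accuracy_score` without predictions, or scoring against predictions that were never computed.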