1;81.7474;1268.5;122.85;46.9;64.95;823.69;55.64;18.54;3;4;1.2;0.7;1045;915.68;890.54;12;35;36420
The first 17 columns are features, and column 18 is the target value.
When I test the model, I get this output:
Accuracy: 30535102968705.60%
Accuracy: 17282909.55%
Accuracy: 8681055.47%
Did I make a mistake somewhere in the program, or have I not fully understood how scikit-learn works?
Example program:
import pandas as pd
import xgboost as xgb
from sklearn.metrics import mean_absolute_error, mean_squared_error, median_absolute_error

# Training data: semicolon-separated, no header row
df = pd.read_csv('new_work_1.csv', sep=';', header=None)
X_train = df.drop(18, axis=1)  # every column except column 18
Y_train = df[18]               # column 18 holds the target
T_train_xgb = xgb.DMatrix(X_train, label=Y_train)

# Linear booster with a squared-error regression objective
# ("reg:squarederror" is the current name of the deprecated "reg:linear")
params = {"objective": "reg:squarederror", "booster": "gblinear"}
gbm = xgb.train(dtrain=T_train_xgb, params=params)

# Test data in the same format as the training data
test_data = pd.read_csv('new_work_2.csv', sep=';', header=None)
X_test = test_data.drop(18, axis=1)
Y_test = test_data[18]
Y_pred = gbm.predict(xgb.DMatrix(X_test))

# Each metric is multiplied by 100 and printed with a percent sign
test_error = mean_squared_error(Y_test, Y_pred)
print("Accuracy: %.2f%%" % (test_error * 100.0))
accuracy = mean_absolute_error(Y_test, Y_pred)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
accuracy2 = median_absolute_error(Y_test, Y_pred)
print("Accuracy: %.2f%%" % (accuracy2 * 100.0))
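
For context, the three printed values are the mean squared error, mean absolute error and median absolute error, each multiplied by 100. They are measured in the units of the target (or its square), so they are not percentages. Below is a minimal sketch of how a scale-free figure could be reported instead. The helper names (`mae_score`, `mape_score`, `r2`) and the sample numbers are made up for illustration, and the code is pure Python so that it runs without the CSV files. scikit-learn offers `r2_score` in `sklearn.metrics` for the last of these.

```python
# Hypothetical helpers, written in pure Python so the sketch has no dependencies.

def mae_score(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape_score(y_true, y_pred):
    """Mean absolute error relative to the true value, as a percentage.

    Assumes no true value is zero.
    """
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r2(y_true, y_pred):
    """Coefficient of determination: 1.0 is a perfect fit,
    0.0 is no better than always predicting the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values on roughly the same scale as the target in the sample row (36420).
y_true = [36420.0, 40000.0, 30000.0]
y_pred = [36000.0, 41000.0, 29000.0]

mae_value = mae_score(y_true, y_pred)
mape_value = mape_score(y_true, y_pred)
r2_value = r2(y_true, y_pred)

print("MAE: %.2f (target units)" % mae_value)
print("MAPE: %.2f%%" % mape_value)
print("R^2: %.4f" % r2_value)
```

With targets in the tens of thousands, even a small relative error gives an absolute error in the hundreds, which is why the raw metric values look enormous once a percent sign is attached.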