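**Implementing 12 regression models with sklearn (LinearRegression, KNN, SVR, Ridge, Lasso, MLP, DecisionTree, ExtraTree, RandomForest, AdaBoost, GradientBoost, Bagging)**

This post records my search for a suitable regression model for a project I worked on, together with the implementation steps. It is mainly here so I can look things up again later.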

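1. Data preparation

import numpy as np
import pandas as pd

# Load the spreadsheet and convert it to a NumPy array
data = pd.read_excel(r"data.xlsx")
data = np.array(data)
a = data[:, 3:8]  # features: columns 3 to 7 (0-based, end index excluded)
b = data[:, 2]    # target: column 2 (0-based)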
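
Since `data.xlsx` is not included with this post, the slicing above can be checked on a stand-in table. The sketch below uses a made-up DataFrame (the column names `c0`..`c8` and the values are hypothetical) to show which columns `data[:, 3:8]` and `data[:, 2]` select.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.xlsx: 6 rows, 9 columns named c0..c8
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(6, 9)), columns=[f"c{i}" for i in range(9)])

data = np.array(df)
a = data[:, 3:8]  # columns c3..c7 -> 5 features (the end index 8 is excluded)
b = data[:, 2]    # column c2 -> 1-D target

print(a.shape, b.shape)
```

Because `b` is already one-dimensional, the `.ravel()` calls in the sections below leave it unchanged; they only matter if the target were stored as a column vector.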

2. Trying out different regression methods

2.1 Linear regression model

# LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = LinearRegression()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("LinearRegression results:")
# score() returns the coefficient of determination R^2
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

LinearRegression results:
Training set score: 0.591856113161297
Validation set score: 0.6214511243968527
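
The numbers printed by `score()` are R^2 values (coefficient of determination). As a minimal sketch, on synthetic data from `make_regression` rather than the project data, the same value can be computed by hand and compared with `score()` and `sklearn.metrics.r2_score`:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: 5 features, as in a = data[:, 3:8])
X, y = make_regression(n_samples=200, n_features=5, noise=20.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(x_train, y_train)
y_pred = model.predict(x_test)

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(r2_manual, model.score(x_test, y_test), r2_score(y_test, y_pred))
```

An R^2 of 1 means perfect prediction, 0 means no better than always predicting the mean, and negative values are possible for models that do worse than that.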

2.2 KNN regression model

from sklearn.neighbors import KNeighborsRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = KNeighborsRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("KNeighborsRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

KNeighborsRegressor results:
Training set score: 0.7216991832348424
Validation set score: 0.601773245289923

2.3 SVM regression model

from sklearn.svm import SVR
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = SVR()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("SVR results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

SVR results:
Training set score: 0.37625861753449674
Validation set score: 0.4536826131027402
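
`SVR` with its default RBF kernel is sensitive to feature scale, which is one possible (unverified) reason for the low scores above. The following is a minimal sketch on synthetic data, assuming features whose magnitudes differ widely; it compares a plain `SVR` with a `StandardScaler` + `SVR` pipeline. Whether scaling helps on the project data would need to be checked separately.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data (assumption): 5 features whose scales differ by orders of magnitude
rng = np.random.default_rng(0)
scales = np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])
X = rng.normal(size=(300, 5)) * scales
# Every feature contributes equally to the target once its scale is removed
y = (X / scales).sum(axis=1) + rng.normal(scale=0.1, size=300)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

raw_svr = SVR().fit(x_train, y_train)
scaled_svr = make_pipeline(StandardScaler(), SVR()).fit(x_train, y_train)

raw_r2 = raw_svr.score(x_test, y_test)
scaled_r2 = scaled_svr.score(x_test, y_test)
print("raw:", raw_r2, "scaled:", scaled_r2)
```

Putting the scaler inside the pipeline means it is fitted on the training split only, so no information from the held-out rows leaks into the preprocessing.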

2.4 Ridge regression model

from sklearn.linear_model import Ridge
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = Ridge()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("Ridge results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

Ridge results:
Training set score: 0.5999728749276192
Validation set score: 0.5903386836435587

2.5 LASSO regression model

from sklearn.linear_model import Lasso
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = Lasso()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("Lasso results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

Lasso results:
Training set score: 0.5959277449910918
Validation set score: 0.6063097915792626
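
`Lasso()` above runs with its default regularization strength `alpha=1.0`. As a rough illustration of what that parameter does, the sketch below (synthetic data; the three `alpha` values are arbitrary choices for demonstration) counts how many coefficients stay non-zero as `alpha` grows.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data (assumption): 5 features, only 2 of which drive the target
X, y = make_regression(
    n_samples=200, n_features=5, n_informative=2, noise=5.0, random_state=0
)

nonzero_counts = {}
for alpha in [0.01, 1.0, 1000.0]:  # alpha=1.0 is the default used by Lasso()
    model = Lasso(alpha=alpha).fit(X, y)
    nonzero_counts[alpha] = int(np.sum(model.coef_ != 0))
    print(alpha, model.coef_)
```

A larger `alpha` pushes more coefficients to exactly zero, which is why Lasso is sometimes used for feature selection as well as prediction.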

2.6 Multi-layer perceptron regression model

from sklearn.neural_network import MLPRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = MLPRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("MLPRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

MLPRegressor results:
Training set score: 0.6260209012837945
Validation set score: 0.6234879650836542

2.7 Decision tree regression model

from sklearn.tree import DecisionTreeRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = DecisionTreeRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("DecisionTreeRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

DecisionTreeRegressor results:
Training set score: 0.9029579714124932
Validation set score: 0.5789140015732428
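
The gap between the training and validation scores above is typical of an unpruned tree that fits the training rows very closely. As a sketch only (synthetic data; `max_depth=3` is an arbitrary value chosen for illustration), this compares the default tree with a depth-limited one:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption), with noise so an unpruned tree can overfit
X, y = make_regression(n_samples=300, n_features=5, noise=30.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

scores = {}
for depth in [None, 3]:  # None = keep splitting until leaves are pure (the default)
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(x_train, y_train)
    # Store (training R^2, held-out R^2) for each depth setting
    scores[depth] = (tree.score(x_train, y_train), tree.score(x_test, y_test))
    print(depth, scores[depth])
```

Whether a depth limit actually improves the held-out score on the project data is something to test rather than assume; `max_depth`, `min_samples_leaf`, and `min_samples_split` are the usual parameters to try.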

2.8 Extra tree regression model

from sklearn.tree import ExtraTreeRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = ExtraTreeRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("ExtraTreeRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

ExtraTreeRegressor results:
Training set score: 0.9037680679611091
Validation set score: 0.46830247193588975

2.9 Random forest regression model

from sklearn.ensemble import RandomForestRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = RandomForestRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("RandomForestRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

RandomForestRegressor results:
Training set score: 0.8692011534234785
Validation set score: 0.6943344063242647
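
A fitted random forest also exposes `feature_importances_`, which can hint at which of the five input columns matter most. A minimal sketch on synthetic data (the real column meanings are not given in this post, so the features are simply indexed 0 to 4):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (assumption): 5 features, only 2 of which drive the target
X, y = make_regression(
    n_samples=300, n_features=5, n_informative=2, noise=10.0, random_state=0
)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_  # impurity-based, normalized to sum to 1

# Print features from most to least important
for idx in np.argsort(importances)[::-1]:
    print(f"feature {idx}: {importances[idx]:.3f}")
```

These impurity-based importances are computed on the training data, so they are best read as a rough ranking rather than a precise measurement.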

2.10 AdaBoost regression model

from sklearn.ensemble import AdaBoostRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = AdaBoostRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("AdaBoostRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

AdaBoostRegressor results:
Training set score: 0.6384420177311011
Validation set score: 0.607339934856168

2.11 Gradient boosting regression model

from sklearn.ensemble import GradientBoostingRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = GradientBoostingRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("GradientBoostingRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

GradientBoostingRegressor results:
Training set score: 0.7484658216805864
Validation set score: 0.7122203061071664

2.12 Bagging regression model

from sklearn.ensemble import BaggingRegressor
x_train, x_test, y_train, y_test = train_test_split(a, b, test_size=0.2)
clf = BaggingRegressor()
rf = clf.fit(x_train, y_train.ravel())
y_pred = rf.predict(x_test)
print("BaggingRegressor results:")
print("Training set score:", rf.score(x_train, y_train))
print("Validation set score:", rf.score(x_test, y_test))

BaggingRegressor results:
Training set score: 0.8641707121920719
Validation set score: 0.6610529256307627
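
Each section above calls `train_test_split` without a `random_state`, so every model is scored on a different random split and the numbers are not strictly comparable. One way to put all twelve models on the same footing is to score them on shared cross-validation folds. The sketch below assumes synthetic data from `make_regression` and 5-fold cross-validation (both are my choices for illustration, not part of the original experiment); with the real data, `X, y` would be replaced by `a, b`.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor, ExtraTreeRegressor

# Synthetic stand-in data (assumption: 5 features, as in a = data[:, 3:8])
X, y = make_regression(n_samples=200, n_features=5, noise=20.0, random_state=0)

# Same 12 estimators as above; random_state is fixed where the model is stochastic
models = {
    "LinearRegression": LinearRegression(),
    "KNeighborsRegressor": KNeighborsRegressor(),
    "SVR": SVR(),
    "Ridge": Ridge(),
    "Lasso": Lasso(),
    "MLPRegressor": MLPRegressor(max_iter=500, random_state=0),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    "ExtraTreeRegressor": ExtraTreeRegressor(random_state=0),
    "RandomForestRegressor": RandomForestRegressor(random_state=0),
    "AdaBoostRegressor": AdaBoostRegressor(random_state=0),
    "GradientBoostingRegressor": GradientBoostingRegressor(random_state=0),
    "BaggingRegressor": BaggingRegressor(random_state=0),
}

# One fixed set of folds, reused for every model
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Mean R^2 over the 5 held-out folds
    results[name] = cross_val_score(model, X, y, cv=cv, scoring="r2").mean()

# Print models from best to worst mean R^2
for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

`MLPRegressor` may emit a convergence warning at this iteration budget; that does not stop the comparison from running. The ranking printed here applies only to the synthetic data and says nothing about which model suits the project data.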