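Objectives

  1. Understand the principles and concepts behind neural networks;
  2. Be able to apply neural networks to practical problems;
  3. Become proficient with the neural network modules in Scikit-learn.

Experiment

Use a neural network to predict diabetes disease progression.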

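Dataset

Number of samples: 442
Number of features (dimensionality): 10
Features: age, sex, BMI (body mass index), average blood pressure, S1, S2, S3, S4, S5, S6
Feature value range: (-0.2, 0.2)
Target: a quantitative measure of disease progression one year after baseline
Target value range: [25, 346]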

Preparation

Loading the diabetes dataset

from sklearn.datasets import load_diabetes
diabetes = load_diabetes()
X = diabetes.data  # feature matrix
y = diabetes.target  # target values
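
The figures quoted in the dataset description can be checked directly on the loaded arrays. The sketch below is only a sanity check and not part of the original experiment; the column names are read from the `feature_names` attribute of the object returned by load_diabetes.

```python
from sklearn.datasets import load_diabetes

diabetes = load_diabetes()
X = diabetes.data
y = diabetes.target

print(X.shape)                 # (n_samples, n_features)
print(diabetes.feature_names)  # column names, in order
print(X.min(), X.max())        # overall feature value range
print(y.min(), y.max())        # target value range
```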

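Standardizing the data

from sklearn.preprocessing import StandardScaler

ss = StandardScaler()
# Fit the scaler and transform the data in one step
Xs1 = ss.fit_transform(X)
# Equivalent two-step form: fit first, then transform
ss.fit(X)
Xs2 = ss.transform(X)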


X
Out[4]: 
array([[ 0.03807591,  0.05068012,  0.06169621, ..., -0.00259226,
         0.01990842, -0.01764613],
       [-0.00188202, -0.04464164, -0.05147406, ..., -0.03949338,
        -0.06832974, -0.09220405],
       [ 0.08529891,  0.05068012,  0.04445121, ..., -0.00259226,
         0.00286377, -0.02593034],
       ...,
       [ 0.04170844,  0.05068012, -0.01590626, ..., -0.01107952,
        -0.04687948,  0.01549073],
       [-0.04547248, -0.04464164,  0.03906215, ...,  0.02655962,
         0.04452837, -0.02593034],
       [-0.04547248, -0.04464164, -0.0730303 , ..., -0.03949338,
        -0.00421986,  0.00306441]])

Xs1
Out[5]: 
array([[ 0.80050009,  1.06548848,  1.29708846, ..., -0.05449919,
         0.41855058, -0.37098854],
       [-0.03956713, -0.93853666, -1.08218016, ..., -0.83030083,
        -1.43655059, -1.93847913],
       [ 1.79330681,  1.06548848,  0.93453324, ..., -0.05449919,
         0.06020733, -0.54515416],
       ...,
       [ 0.87686984,  1.06548848, -0.33441002, ..., -0.23293356,
        -0.98558469,  0.32567395],
       [-0.9560041 , -0.93853666,  0.82123474, ...,  0.55838411,
         0.93615545, -0.54515416],
       [-0.9560041 , -0.93853666, -1.53537419, ..., -0.83030083,
        -0.08871747,  0.06442552]])

Xs2
Out[6]: 
array([[ 0.80050009,  1.06548848,  1.29708846, ..., -0.05449919,
         0.41855058, -0.37098854],
       [-0.03956713, -0.93853666, -1.08218016, ..., -0.83030083,
        -1.43655059, -1.93847913],
       [ 1.79330681,  1.06548848,  0.93453324, ..., -0.05449919,
         0.06020733, -0.54515416],
       ...,
       [ 0.87686984,  1.06548848, -0.33441002, ..., -0.23293356,
        -0.98558469,  0.32567395],
       [-0.9560041 , -0.93853666,  0.82123474, ...,  0.55838411,
         0.93615545, -0.54515416],
       [-0.9560041 , -0.93853666, -1.53537419, ..., -0.83030083,
        -0.08871747,  0.06442552]])
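
Xs1 and Xs2 are identical because fit_transform is simply fit followed by transform. A minimal sketch, assuming StandardScaler's default settings (centering each column and dividing by its population standard deviation), that confirms this and reproduces the result by hand with NumPy:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler

X = load_diabetes().data

# One-step and two-step standardization
Xs1 = StandardScaler().fit_transform(X)
ss = StandardScaler()
ss.fit(X)
Xs2 = ss.transform(X)

# Manual z-score per column: (x - mean) / std, with std computed using ddof=0
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(Xs1, Xs2))
print(np.allclose(Xs1, X_manual))
```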

Splitting the data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
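
With test_size=0.33, roughly one third of the 442 samples is held out for testing, and fixing random_state makes the split reproducible. A quick check of the resulting shapes (the variable names follow the snippet above):

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```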

Neural network classification module

from sklearn.neural_network import MLPClassifier

Neural network regression module

from sklearn.neural_network import MLPRegressor
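
Both classes follow the usual Scikit-learn estimator interface (fit, predict, score). As a rough illustration, the sketch below fits an MLPRegressor on synthetic data and inspects the learned weight matrices. The random data, the layer sizes (50, 50), and max_iter=50 are arbitrary choices for this example, not values taken from the experiment.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 10))   # 60 synthetic samples, 10 features
y_demo = X_demo.sum(axis=1)          # simple synthetic target

model = MLPRegressor(hidden_layer_sizes=(50, 50), max_iter=50, random_state=0)
with warnings.catch_warnings():
    # Such a short run is not expected to converge; silence the warning
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    model.fit(X_demo, y_demo)

# One weight matrix per pair of adjacent layers:
# input -> hidden 1, hidden 1 -> hidden 2, hidden 2 -> output
weight_shapes = [w.shape for w in model.coefs_]
print(weight_shapes)
print(model.predict(X_demo[:3]).shape)
```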

Neural network module parameters

- hidden_layer_sizes: number of neurons in each hidden layer

  hidden_layer_sizes=(50, 50) # two hidden layers, with 50 neurons in the first and 50 in the second

- activation: activation function for the hidden layers, default 'relu'

  identity: f(x) = x
  logistic: f(x) = 1 / (1 + exp(-x))
  tanh: f(x) = tanh(x)
  relu: f(x) = max(0, x)

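- solver: weight optimizer, default 'adam'
  lbfgs: an optimizer from the family of quasi-Newton methods
  sgd: stochastic gradient descent
  adam: a stochastic-gradient-based optimizer proposed by Diederik Kingma and Jimmy Ba
  Note: the default solver 'adam' works well on relatively large datasets (thousands of samples or more); on small datasets, lbfgs can converge faster and perform better.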

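- alpha: float, optional, default 0.0001; strength of the L2 regularization term.

- batch_size: int, optional, default 'auto'; size of the minibatches used by the stochastic optimizers. When set to 'auto', batch_size=min(200, n_samples). If the solver is 'lbfgs', the model does not use minibatches.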

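- learning_rate: learning rate schedule for weight updates, default 'constant'; only used when solver='sgd'
  'constant': a constant learning rate given by 'learning_rate_init'
  'invscaling': gradually decreases the learning rate at each time step t using the inverse scaling exponent 'power_t': effective_learning_rate = learning_rate_init / pow(t, power_t)
  'adaptive': keeps the learning rate at 'learning_rate_init' as long as the training loss keeps decreasing. Each time two consecutive epochs fail to decrease the training loss by at least tol (or, if early_stopping is on, fail to increase the validation score by at least tol), the current learning rate is divided by 5.

- power_t: double, optional, default 0.5; the exponent for the inverse scaling learning rate. It is used to update the effective learning rate when learning_rate='invscaling', and only when solver='sgd'.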

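- max_iter: int, optional, default 200; maximum number of iterations.

- random_state: int or RandomState, optional, default None; state or seed of the random number generator.

- shuffle: bool, optional, default True; whether to shuffle the samples in each iteration. Only used when solver='sgd' or 'adam'.

- tol: float, optional, default 1e-4; tolerance for the optimization.

- learning_rate_init: double, optional, default 0.001; the initial learning rate, which controls the step size used when updating the weights. Only used when solver='sgd' or 'adam'.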
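
The formulas in the list above are easy to check numerically. The following NumPy sketch writes out the four activation functions and the inverse scaling schedule exactly as given above. The function names are made up for this illustration and are not part of Scikit-learn's public API.

```python
import numpy as np

def identity(x):
    return x

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0, x)

def invscaling_lr(learning_rate_init, t, power_t=0.5):
    """effective_learning_rate = learning_rate_init / pow(t, power_t)"""
    return learning_rate_init / pow(t, power_t)

x = np.array([-2.0, 0.0, 2.0])
for f in (identity, logistic, tanh, relu):
    print(f.__name__, f(x))

# Effective learning rate after 1, 4 and 100 time steps, starting from 0.001
for t in (1, 4, 100):
    print(t, invscaling_lr(0.001, t))
```

With power_t=0.5 the rate shrinks with the square root of t, so it halves by step 4 and drops to a tenth of its initial value by step 100.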

Implementing diabetes prediction

Using classification

from sklearn.datasets import load_diabetes
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

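# Load the dataset
diabetes = load_diabetes()
X = diabetes.data  # feature matrix
y = diabetes.target  # target values

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=26)

# Standardize the features (the scaler is fitted on the training set only)
ss = StandardScaler()
ss.fit(X_train)
X_train = ss.transform(X_train)
X_test = ss.transform(X_test)

# Model 1: one hidden layer of 100 neurons, lbfgs solver
# (each distinct target value is treated as a class; score() returns accuracy)
mlpmodel = MLPClassifier(solver='lbfgs', hidden_layer_sizes=(100,))
mlpmodel.fit(X_train, y_train)
mlpmodel.score(X_test, y_test)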

Out[11]: 0.011235955056179775

# Model 2: one hidden layer of 1000 neurons, default adam solver
mlpmodel1 = MLPClassifier(hidden_layer_sizes=(1000,))
mlpmodel1.fit(X_train, y_train)
mlpmodel1.score(X_test, y_test)

Out[11]: 0.02247191011235955
    
# Model 3: one hidden layer of 10000 neurons, default adam solver
mlpmodel2 = MLPClassifier(hidden_layer_sizes=(10000,))
mlpmodel2.fit(X_train, y_train)
mlpmodel2.score(X_test, y_test)

Out[23]: 0.0224719101123595

# Model 4: two hidden layers of 100 neurons each, lbfgs solver
mlpmodel3 = MLPClassifier(solver='lbfgs', hidden_layer_sizes=(100, 100))
mlpmodel3.fit(X_train, y_train)
mlpmodel3.score(X_test, y_test)

Out[24]: 0.011235955056179775
    
# Model 5: three hidden layers of 100 neurons each, lbfgs solver
mlpmodel4 = MLPClassifier(solver='lbfgs', hidden_layer_sizes=(100, 100, 100))
mlpmodel4.fit(X_train, y_train)
mlpmodel4.score(X_test, y_test)

Out[25]: 0.011235955056179775
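
These very low scores are expected. For MLPClassifier, score returns accuracy, and the target here is a continuous measurement, so almost every value ends up as its own class. The arithmetic below is a sketch that assumes the test set is the 20% share of 442 samples rounded up; it shows that the reported scores correspond to just one or two correctly classified test samples.

```python
import math

n_samples = 442
test_size = 0.2

# Test set size for a fractional test_size, rounded up
n_test = math.ceil(test_size * n_samples)
print(n_test)

# Accuracy = correct predictions / number of test samples
for n_correct in (1, 2):
    print(n_correct, n_correct / n_test)
```

Predicting a continuous target is a regression problem, so a regressor is the more natural fit for this dataset.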

Using regression

from sklearn.datasets import load_diabetes
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

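# Load the dataset
diabetes = load_diabetes()
X = diabetes.data  # feature matrix
y = diabetes.target  # target values

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=26)

# Standardize the features (the scaler is fitted on the training set only)
ss = StandardScaler()
ss.fit(X_train)
X_train = ss.transform(X_train)
X_test = ss.transform(X_test)


# Model 1: one hidden layer of 100 neurons
# (for a regressor, score() returns the coefficient of determination R^2)
mlpmodel = MLPRegressor(hidden_layer_sizes=(100,))
mlpmodel.fit(X_train, y_train)
mlpmodel.score(X_test, y_test)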

Out[11]: -1.4322417210808136

# Model 2: one hidden layer of 1000 neurons
mlpmodel1 = MLPRegressor(hidden_layer_sizes=(1000,))
mlpmodel1.fit(X_train, y_train)
mlpmodel1.score(X_test, y_test)

Out[11]: 0.21921630994511387
    
# Model 3: one hidden layer of 10000 neurons
mlpmodel2 = MLPRegressor(hidden_layer_sizes=(10000,))
mlpmodel2.fit(X_train, y_train)
mlpmodel2.score(X_test, y_test)

Out[23]: 0.4405095276511962

# Model 4: two hidden layers of 100 neurons each
mlpmodel3 = MLPRegressor(hidden_layer_sizes=(100, 100))
mlpmodel3.fit(X_train, y_train)
mlpmodel3.score(X_test, y_test)

Out[24]: 0.3716811807893977
    
# Model 5: three hidden layers of 100 neurons each
mlpmodel4 = MLPRegressor(hidden_layer_sizes=(100, 100, 100))
mlpmodel4.fit(X_train, y_train)
mlpmodel4.score(X_test, y_test)

Out[25]:  0.40633014171562987
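
For MLPRegressor, score returns the coefficient of determination R^2. It is 1 for a perfect fit, 0 for a model that always predicts the mean of the true values, and negative for anything worse than that, which is how model 1 can score about -1.43. A small NumPy sketch of the definition, using made-up numbers rather than the experiment's data (the helper name r2 is chosen for this illustration):

```python
import numpy as np

def r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = [100.0, 150.0, 200.0]
print(r2(y_true, y_true))                  # perfect predictions
print(r2(y_true, [150.0, 150.0, 150.0]))   # always predicting the mean
print(r2(y_true, [0.0, 0.0, 0.0]))         # far from the true values
```

A strongly negative score with the default settings may indicate that training stopped at the default max_iter=200 before the network had converged; this was not checked in the runs above.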