Python 做拟合 python做数据拟合

转载

新新人类 2023-08-31 07:46:07

文章标签 Python 做拟合 python 开发语言机器学习拟合 文章分类 Python 后端开发

在之前的博客"使用python来完成数据的线性拟合"当中，介绍了基于python,使用三种方法完成线性拟合的理论和代码实现。同样经常会碰到样本分布呈现非线性关系的情况，那么如何拟合出来呢？本文侧重对数据已经有建模，但是准确的关系需要得以确定的情况。

如果想直接求出拟合系数，而不清楚原本模型的话，可以直接使用numpy的polyfit函数来解决，举例：theta = np.polyfit(X, Y, deg=3)表示的数学方程为y=a*x^3+b*x^2+c*x+d方程的各个权重系数，theta=[a,b,c,d]。这里deg=3表明多项式的最高次方为3.

非线性拟合，也可以利用线性拟合的方法，只不过需要将非线性模型进行整理，做成线性模型来处理。举例：比如我有一个模型，它的曲线方程形式为 y=a*x^4+b*x^2+c. 很明显，这是一个偶函数，关于y轴对称。输入为x，输出为y。权重参数a、b、c需要通过已知的大量样本点（x，y）来求得。

将问题转换为线性拟合的问题，即已知x，那么x0=x^4, x1=x^2, x2=x^0=1的x0,x1,x2就能够直接得到了。更近一步，模型 y=a*x^4+b*x^2+c其实就可以表示为 y=a*x0+b*x1+c*x2，而后者恰恰就是线性模型了，因为这里a,b,c为未知权重系数，而且都是一次幂，x0,x1,x2都是已知。

所以，就可以直接套用线性拟合的方法来求取此本质为非线性模型，被我们转换为线性模型来计算的最佳参数a,b,c了。

代码实现功能如下，

1. 完成了利用预设模型y= a * x ** 4 + b * x**2 + c (已假设a=10,b=6,c=20)来制造样本数据。

Python 做拟合 python做数据拟合_python

2. 然后利用不同方法来拟合得到a,b,c。

3.可视化拟合结果。

Python 做拟合 python做数据拟合_拟合_02

import numpy as np
from sklearn.linear_model import LinearRegression
from matplotlib import pyplot as plt

# 生成数据
SAMPLE_NUM = 1000
X = np.linspace(-2, 2, SAMPLE_NUM)
a = 10
b = 6
c = 20
Y = list(map(lambda x: a * x ** 4 + b * x ** 2 + c, X))
Y_noise = [np.random.randn() * 5 + y for y in Y]
plt.figure()
plt.legend()
plt.xlabel("x")
plt.ylabel("y")
plt.scatter(X, Y_noise, c='blue', s=10, linewidths=0.3)
plt.show()

# 直接求解,每个x阶次都要有对应的权重参数
# 使用最小二乘法做多项式拟合
theta = np.polyfit(X, Y_noise, deg=4)
print(theta)

# 如果曲线方程形式是已知的，要确定准确的方程参数。那么就可以用如下方式：
x0 = np.array(list(map(lambda x: x ** 4, X)))
x1 = np.array(list(map(lambda x: x ** 2, X)))
x2 = np.ones(len(X))

# shape=(SAMPLE_NUM,3)
A = np.stack((x0, x1, x2), axis=1)
b = np.array(Y_noise).reshape((SAMPLE_NUM, 1))

print("方法列表如下:"
              "1.最小二乘法 least square method "
              "2.常规方程法 Normal Equation "
              "3.线性回归法 Linear regression")
method = int(input("method="))

if method == 1:
    theta, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    theta = theta.flatten()
    a_ = theta[0]
    b_ = theta[1]
    c_ = theta[2]
    print("拟合结果为: y={:.4f}*x^4+{:.4f}*x^2+{:.4f}".format(a_, b_, c_))
    Y_predict = list(map(lambda x: a_ * x ** 4 + b_ * x ** 2 + c_, X))
    plt.scatter(X, Y_noise, s=10, linewidths=0.3)
    plt.plot(X, Y_predict, c='red')
    plt.title("method {}: y={:.4f}*x^4+{:.4f}*x^2+{:.4f}".format(method, a_, b_, c_))
    plt.show()
elif method == 2:
    AT = A.T
    A1 = np.matmul(AT, A)
    A2 = np.linalg.inv(A1)
    A3 = np.matmul(A2, AT)
    A4 = np.matmul(A3, b)
    A4 = A4.flatten()
    print("A4=", A4)
    a_ = A4[0]
    b_ = A4[1]
    c_ = A4[2]
    print("拟合结果为: y={:.4f}*x^4+{:.4f}*x^2+{:.4f}".format(a_, b_, c_))
    Y_predict = list(map(lambda x: a_ * x ** 4 + b_ * x ** 2 + c_, X))
    plt.scatter(X, Y_noise, s=10, linewidths=0.3)
    plt.plot(X, Y_predict, c='red')
    plt.title("method {}: y={:.4f}*x^4+{:.4f}*x^2+{:.4f}".format(method, a_, b_, c_))
    plt.show()
elif method == 3:
    # 利用线性回归构建模型,拟合数据
    model = LinearRegression()
    X_normalized = A
    Y_noise_normalized = np.array(Y_noise).reshape((SAMPLE_NUM, 1))  #
    model.fit(X_normalized, Y_noise_normalized)
    # 利用已经拟合到的模型进行预测
    Y_predict = model.predict(X_normalized)
    a_ = model.coef_.flatten()[0]
    b_ = model.coef_.flatten()[1]
    c_ = model.intercept_[0]
    print(model.coef_)
    print("拟合结果为: y={:.4f}*x^4+{:.4f}*x^2+{:.4f}".format(a_, b_, c_))
    plt.scatter(X, Y_noise, s=10, linewidths=0.3)
    plt.plot(X, Y_predict, c='red')
    plt.title("method {}: y={:.4f}*x^4+{:.4f}*x^2+{:.4f}".format(method, a_, b_, c_))
    plt.show()
else:
    print("请重新输入")

本文章为转载内容，我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题，欢迎原作者联系我们进行内容更正或删除文章。