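By fitting polynomials of degree 1 through 9 (the range np.arange(1, 10) used in the code excludes 10) and comparing the root mean squared error (RMSE) and the R² score of each fit, we can determine the best maximum degree.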
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

x = np.array([-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).reshape(-1, 1)
y = np.array(2 * (x ** 4) + x ** 2 + 9 * x + 2)
# y = np.array([300, 500, 0, -10, 0, 20, 200, 300, 1000, 800, 4000, 5000, 10000, 9000, 22000]).reshape(-1, 1)

# Random 70/30 split; no random_state is set, so every run draws a different split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

rmses = []
degrees = np.arange(1, 10)  # degrees 1 through 9 (the stop value is exclusive)
min_rmse, min_deg, score = 1e10, 0, 0

for deg in degrees:
    # Generate the polynomial features (e.g. degree=3 produces [[x, x**2, x**3]])
    poly = PolynomialFeatures(degree=deg, include_bias=False)
    x_train_poly = poly.fit_transform(x_train)

    # Fit the polynomial with ordinary least squares
    poly_reg = LinearRegression()
    poly_reg.fit(x_train_poly, y_train)
    # print(poly_reg.coef_, poly_reg.intercept_)  # coefficients and intercept

    # Evaluate on the test set (transform only; the transformer was fitted on the training set)
    x_test_poly = poly.transform(x_test)
    y_test_pred = poly_reg.predict(x_test_poly)

    # mean_squared_error(y_true, y_pred): mean squared error regression loss, lower is better
    poly_rmse = np.sqrt(mean_squared_error(y_test, y_test_pred))
    rmses.append(poly_rmse)

    # R² is at most 1, and closer to 1 means a better fit; on test data it can be
    # negative when the model predicts worse than the mean of y_test
    r2score = r2_score(y_test, y_test_pred)

    # Keep the degree with the lowest RMSE on the held-out test set
    if min_rmse > poly_rmse:
        min_rmse = poly_rmse
        min_deg = deg
        score = r2score

    print('degree = %s, rmse = %.2f ,r2_score = %.2f' % (deg, poly_rmse, r2score))

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(degrees, rmses)
ax.set_yscale('log')
ax.set_xlabel('degree')
ax.set_ylabel('rmse')
ax.set_title('best degree = %s, rmse = %.2f, r2_score = %.2f' % (min_deg, min_rmse, score))
plt.show()
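To make the feature-generation step concrete, here is a minimal NumPy-only sketch of what the polynomial expansion produces for a single input column when the bias column is excluded. The helper name poly_features is hypothetical and is not part of scikit-learn; it only mirrors the [[x, x**2, x**3]] layout described in the code comment.

```python
import numpy as np

def poly_features(x, degree):
    """Return the columns [x, x**2, ..., x**degree] for a single-column input.

    Hypothetical helper for illustration only; it covers just the
    one-feature, no-bias case used in this article.
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    # One column per power, stacked side by side
    return np.hstack([x ** p for p in range(1, degree + 1)])

features = poly_features([-2, 0, 3], degree=3)
print(features)
```

Each row holds the successive powers of one sample, so fitting a linear regression on these columns amounts to fitting a polynomial in x.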
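Because the dependent variable is generated exactly by y = 2*(x**4) + x**2 + 9*x + 2, the plot shows clearly that every tested degree >= 4 fits: a polynomial of degree 4 or higher can represent this function exactly, so the fitted function is correct. (The best model is the one with the smallest RMSE and an R² that is non-negative and close to 1.)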
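The two criteria can be written out by hand to show how they behave. The following is a small sketch in plain Python, assuming the usual definitions RMSE = sqrt(mean((y - y_hat)^2)) and R² = 1 - SS_res / SS_tot; the function names rmse and r2 and the toy data are made up for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]    # exact predictions
mean_only = [2.5, 2.5, 2.5, 2.5]  # always predict the mean of y_true
bad = [4.0, 3.0, 2.0, 1.0]        # worse than predicting the mean

print(rmse(y_true, perfect), r2(y_true, perfect))      # 0.0 1.0
print(rmse(y_true, mean_only), r2(y_true, mean_only))  # R² is 0.0
print(rmse(y_true, bad), r2(y_true, bad))              # R² is -3.0
```

A negative R² therefore means the predictions are worse than a constant prediction at the mean of the test targets, which is why such a result should not be trusted.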
If the y values are changed to the following:
y = np.array([300, 500, 0, -10, 0, 20, 200, 300, 1000, 800, 4000, 5000, 10000, 9000, 22000]).reshape(-1, 1)
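then degree=3 gives the best result, and its R² is also the closest to 1. (Note: a negative R² means the result is not reliable, so run the test again; train_test_split draws a new random split on each run. Because the sample is small, the selected degree may still be wrong.)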
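Since a single random split on 15 samples can mislead, one option is to repeat the split many times and average the hold-out RMSE per degree. The sketch below is an assumption-laden illustration, not the original author's method: it swaps in np.polyfit and np.polyval for the scikit-learn pipeline, limits the degrees to 1 through 6, and uses a hypothetical helper name mean_holdout_rmse with a fixed seed so the run is repeatable.

```python
import numpy as np

def mean_holdout_rmse(x, y, degrees, n_repeats=200, test_fraction=0.3, seed=0):
    """Average hold-out RMSE per degree over many random train/test splits.

    Hypothetical helper for illustration; uses np.polyfit in place of
    PolynomialFeatures + LinearRegression.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    n_test = int(np.ceil(n * test_fraction))
    totals = {int(d): 0.0 for d in degrees}
    for _ in range(n_repeats):
        idx = rng.permutation(n)          # new random split on every repeat
        test, train = idx[:n_test], idx[n_test:]
        for d in totals:
            coeffs = np.polyfit(x[train], y[train], d)
            pred = np.polyval(coeffs, x[test])
            totals[d] += np.sqrt(np.mean((y[test] - pred) ** 2))
    return {d: total / n_repeats for d, total in totals.items()}

x = np.arange(-4, 11, dtype=float)  # -4, -3, ..., 10
y = np.array([300, 500, 0, -10, 0, 20, 200, 300, 1000, 800,
              4000, 5000, 10000, 9000, 22000], dtype=float)

scores = mean_holdout_rmse(x, y, degrees=range(1, 7))
best_degree = min(scores, key=scores.get)
for d, s in scores.items():
    print('degree = %d, mean rmse = %.2f' % (d, s))
print('best degree by mean rmse:', best_degree)
```

Averaging over many splits reduces the influence of any one lucky or unlucky split, although with so few samples the choice of degree should still be treated with caution.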
Original article: https://blog.csdn.net/kk185800961/article/details/79215575