Logistic Regression in R: An In-Depth Walkthrough

Date: 2021-12-30 15:54     Source/Author: 瑶池里

Logistic Regression

> ############### Logistic regression
> setwd("/Users/yaozhilin/Downloads/R_edu/data")
> accepts<-read.csv("accepts.csv")
> names(accepts)
 [1] "application_id" "account_number" "bad_ind"    "vehicle_year"  "vehicle_make"
 [6] "bankruptcy_ind" "tot_derog"   "tot_tr"     "age_oldest_tr" "tot_open_tr" 
[11] "tot_rev_tr"   "tot_rev_debt"  "tot_rev_line"  "rev_util"    "fico_score" 
[16] "purch_price"  "msrp"      "down_pyt"    "loan_term"   "loan_amt"  
[21] "ltv"      "tot_income"   "veh_mileage"  "used_ind"  
> accepts<-accepts[complete.cases(accepts),]
> select<-sample(1:nrow(accepts),length(accepts$application_id)*0.7)
> train<-accepts[select,]### 70% of the rows are used to fit the model
> test<-accepts[-select,]### the remaining 30% are held out for testing
> attach(train)
> ### Fit with glm(y~x,family=binomial(link="logit"))
> gl<-glm(bad_ind~fico_score,family=binomial(link = "logit"))
> summary(gl)
 
Call:
glm(formula = bad_ind ~ fico_score, family = binomial(link = "logit"))
 
Deviance Residuals:
  Min    1Q  Median    3Q   Max
-2.0794 -0.6790 -0.4937 -0.3073  2.6028
 
Coefficients:
       Estimate Std. Error z value Pr(>|z|) 
(Intercept) 9.049667  0.629120  14.38  <2e-16 ***
fico_score -0.015407  0.000938 -16.43  <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
(Dispersion parameter for binomial family taken to be 1)
 
  Null deviance: 2989.2 on 3046 degrees of freedom
Residual deviance: 2665.9 on 3045 degrees of freedom
AIC: 2669.9
 
Number of Fisher Scoring iterations: 5
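
The fico_score coefficient above is on the log-odds scale, so exponentiating it gives an odds ratio. The sketch below does that arithmetic with the estimate and standard error copied from the summary output. The variable names (beta, se, or_1, or_10, ci_1) are illustrative and not part of the original session, and the Wald interval is a standard approximation that the original did not compute.

```r
# Estimate and standard error for fico_score, copied from the summary above
beta <- -0.015407
se   <- 0.000938

# Odds ratio for a 1-point and for a 10-point increase in fico_score
or_1  <- exp(beta)
or_10 <- exp(10 * beta)

# Approximate 95% Wald confidence interval for the 1-point odds ratio
ci_1 <- exp(beta + c(-1, 1) * qnorm(0.975) * se)

print(c(or_1 = or_1, or_10 = or_10))
print(ci_1)
```

Both odds ratios are below 1, which matches the negative sign of the coefficient: in this fit a higher FICO score goes with lower odds of bad_ind = 1.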

Multiple Logistic Regression

> ### Multiple logistic regression
> gls<-glm(bad_ind~fico_score+bankruptcy_ind+age_oldest_tr+
+      tot_derog+rev_util+veh_mileage,family = binomial(link = "logit"))
> summary(gls)
 
Call:
glm(formula = bad_ind ~ fico_score + bankruptcy_ind + age_oldest_tr +
  tot_derog + rev_util + veh_mileage, family = binomial(link = "logit"))
 
Deviance Residuals:
  Min    1Q  Median    3Q   Max
-2.2646 -0.6743 -0.4647 -0.2630  2.8177
 
Coefficients:
         Estimate Std. Error z value Pr(>|z|) 
(Intercept)   8.205e+00 7.433e-01 11.039 < 2e-16 ***
fico_score   -1.338e-02 1.092e-03 -12.260 < 2e-16 ***
bankruptcy_indY -3.771e-01 1.855e-01 -2.033  0.0421 *
age_oldest_tr  -4.458e-03 6.375e-04 -6.994 2.68e-12 ***
tot_derog    3.012e-02 1.552e-02  1.941  0.0523 .
rev_util     3.763e-04 5.252e-04  0.717  0.4737 
veh_mileage   2.466e-06 1.381e-06  1.786  0.0741 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
(Dispersion parameter for binomial family taken to be 1)
 
  Null deviance: 2989.2 on 3046 degrees of freedom
Residual deviance: 2601.4 on 3040 degrees of freedom
AIC: 2615.4
 
Number of Fisher Scoring iterations: 5
 
> glss<-step(gls,direction = "both")
Start: AIC=2615.35
bad_ind ~ fico_score + bankruptcy_ind + age_oldest_tr + tot_derog +
  rev_util + veh_mileage
 
         Df Deviance  AIC
- rev_util    1  2601.9 2613.9
<none>        2601.3 2615.3
- veh_mileage   1  2604.4 2616.4
- tot_derog    1  2605.1 2617.1
- bankruptcy_ind 1  2605.7 2617.7
- age_oldest_tr  1  2655.9 2667.9
- fico_score   1  2763.8 2775.8
 
Step: AIC=2613.88
bad_ind ~ fico_score + bankruptcy_ind + age_oldest_tr + tot_derog +
  veh_mileage
 
         Df Deviance  AIC
<none>        2601.9 2613.9
- veh_mileage   1  2604.9 2614.9
+ rev_util    1  2601.3 2615.3
- tot_derog    1  2605.7 2615.7
- bankruptcy_ind 1  2606.1 2616.1
- age_oldest_tr  1  2656.9 2666.9
- fico_score   1  2773.2 2783.2
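
The deviance and AIC figures printed above can be checked by hand. The sketch below uses only numbers copied from the two summary outputs and the step table; the variable names and the helper aic_from_deviance are hypothetical. It relies on two standard facts, stated here as assumptions about this setup: for an ungrouped 0/1 response, AIC equals the residual deviance plus twice the number of estimated coefficients, and the drop in deviance between nested models fitted to the same data is approximately chi-squared distributed.

```r
# Residual deviance and residual df copied from the two summaries above
dev_single <- 2665.9; df_single <- 3045   # bad_ind ~ fico_score
dev_multi  <- 2601.4; df_multi  <- 3040   # six-predictor model

# Likelihood-ratio (deviance) test of the single-predictor model
# against the six-predictor model
lr_stat <- dev_single - dev_multi
lr_df   <- df_single - df_multi
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(lr_stat = lr_stat, lr_df = lr_df, p_value = p_value))

# AIC for a 0/1 response: residual deviance + 2 * number of coefficients
aic_from_deviance <- function(deviance, n_coef) deviance + 2 * n_coef

aic_single <- aic_from_deviance(2665.9, 2)  # intercept + 1 slope
aic_multi  <- aic_from_deviance(2601.4, 7)  # intercept + 6 slopes
aic_step   <- aic_from_deviance(2601.9, 6)  # after step() drops rev_util
print(c(aic_single = aic_single, aic_multi = aic_multi, aic_step = aic_step))
```

The three AIC values reproduce the reported 2669.9, 2615.4 and 2613.9. In the second step table the `<none>` row has the lowest AIC, so no further addition or removal improves the criterion and the search stops there. In a live session, anova(gl, gls, test = "Chisq") should give the same deviance comparison directly from the fitted objects.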
> # predict() returns values on the logit (log-odds) scale here, so they need to be converted to probabilities
> train$pre<-predict(glss,train)
> summary(train$pre)
  Min. 1st Qu. Median  Mean 3rd Qu.  Max.
 -4.868 -2.421 -1.671 -1.713 -1.011  2.497
> train$pre_p<-1/(1+exp(-1*train$pre))
> summary(train$pre_p)
  Min. 1st Qu. Median  Mean 3rd Qu.  Max.
0.00763 0.08157 0.15823 0.19298 0.26677 0.92395
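
The manual conversion 1/(1+exp(-x)) is the inverse logit, which base R also provides as plogis(), and predict() can return probabilities directly with type = "response". Because accepts.csv is not available here, the sketch below uses simulated data. The data frame sim, its columns score and bad, the seed, and the coefficients used to generate it are all made up for illustration and are not the article's data.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Simulated stand-in for the credit data (hypothetical columns)
n   <- 500
sim <- data.frame(score = rnorm(n, mean = 680, sd = 50))
sim$bad <- rbinom(n, 1, plogis(9 - 0.015 * sim$score))

fit <- glm(bad ~ score, family = binomial(link = "logit"), data = sim)

# Default predict() output is on the link (logit) scale
logit_hat <- predict(fit, newdata = sim)

# Three equivalent routes to a predicted probability
p_manual <- 1 / (1 + exp(-logit_hat))
p_plogis <- plogis(logit_hat)
p_direct <- predict(fit, newdata = sim, type = "response")

print(all.equal(unname(p_manual), unname(p_direct)))
print(all.equal(unname(p_plogis), unname(p_direct)))
```

Passing data = sim to glm() instead of calling attach() keeps the model tied to one data frame, which makes it easier to call predict() later on a held-out set such as test.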
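> # Logistic regression does not need the error-term checks used in linear regression, but multicollinearity should still be checked
> library(car)
> vif(glss)
    fico_score bankruptcy_ind  age_oldest_tr      tot_derog    veh_mileage
      1.271283       1.144846       1.075603       1.423850       1.003616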
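
All five values above are close to 1, which suggests little collinearity among the retained predictors. As a reminder of what the number measures, the sketch below computes the textbook variance inflation factor, 1/(1 - R²), by regressing each predictor on the others using base R only. The data are simulated and the names x1, x2, x3, X and vif_manual are hypothetical. As far as I know, car::vif() on a glm works from the fitted model's coefficient covariance matrix instead, so its values need not match this calculation exactly.

```r
set.seed(2)  # arbitrary seed so the sketch is repeatable

# Simulated predictors: x2 is deliberately correlated with x1
n  <- 300
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# Textbook VIF: regress each predictor on the others, then 1 / (1 - R^2)
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

print(vif_manual)
```

The correlated pair x1 and x2 gets a larger VIF than the independent x3, and no value can fall below 1.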

Original article: https://www.cnblogs.com/ye20190812/p/13925635.html