A tutorial on using StepReg for stepwise regression analysis. The guide demonstrates StepReg on four well-known datasets (mtcars, remission, lung, and CreditCard), covering linear, logistic, Cox proportional hazards, and Poisson regression. The vignette walks through the stepwise process under different parameter settings so that users can see how to apply StepReg to exploratory data analysis and model building in each regression scenario.
StepReg 1.5.0
Stepwise regression is a widely used data-mining technique for identifying a useful subset of predictors for a multiple regression model. To support this process, we developed the R package StepReg. Depending on the nature of the response variable, StepReg performs linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event outcomes, and Poisson regression for count outcomes, using popular selection criteria. It supports forward selection, backward elimination, bidirectional selection, and best-subset selection, each with a choice of stopping rules.
Here, we applied the StepReg package to four well-established and diverse datasets—mtcars, remission, lung, and CreditCard—utilizing distinct parameters across various regression scenarios. These datasets provide robust test cases for showcasing the capabilities and versatility of the StepReg package in real-world applications. Through practical demonstrations, we illustrated the application of linear stepwise regression for continuous outcomes, logistic stepwise regression for binary outcomes, Cox stepwise regression for time-to-event outcomes, and Poisson stepwise regression for count outcomes. These examples offer users valuable insights into the effective utilization of StepReg for variable selection in different regression scenarios, providing a comprehensive guide for those seeking proficiency in incorporating StepReg into their analytical toolkit.
A brief introduction to the four datasets is given below.
mtcars: the mtcars dataset is a classic automotive dataset that provides information on various car models and their performance attributes. With 32 observations and 11 variables, it includes details such as miles per gallon (mpg), horsepower, and the number of cylinders.
remission: the remission dataset is relevant in the context of medical research, specifically in oncology. It captures data related to the remission status of leukemia patients. The dataset includes variables such as cellularity of the marrow clot section, the highest temperature before the start of treatment, and remission status (1 for remission and 0 for non-remission).
lung: the lung dataset comes from the survival analysis domain and contains the survival times of 228 patients with advanced lung cancer. It includes variables such as the patient's age, sex, performance scores, and survival status.
CreditCard: the CreditCard dataset is associated with credit risk analysis and financial research. It contains cross-sectional data on credit card applicants, including details such as the number of major derogatory reports, income, and monthly card expenditure.
A list containing multiple tables will be returned. Names and descriptions of each table are outlined as follows:
Table 1. Summary of Parameters
: This table presents the parameters utilized in stepwise regression along with their default or user-specified values.
Table 2. Type of Variables
: This table lists the variables in the dataset together with their role (dependent or independent) and class.
Tables with "Selection Process" in their names
: These tables give an overview of the variable selection process, one table per metric. For information criteria such as AIC, BIC, SBC, IC(1), and HQ, lower values indicate a better model fit. For significance levels, SLE is the threshold for a variable to enter in forward selection and SLS is the threshold for a variable to stay in backward elimination. For Rsq or adjusted R-squared, higher values indicate a better model fit.
Tables with "Parameter Estimates" in their names
: These tables provide the parameter estimates for the optimal model under each metric.
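The direction of each criterion described above can be illustrated with a minimal base R sketch. It does not use StepReg: it fits two nested linear models on mtcars and compares them with stats::AIC, stats::BIC, and the adjusted R-squared from summary(). The values are computed by base R and may be on a different scale from the ones StepReg reports; only the direction of the comparison is the point here.

```r
# Two nested linear models on the built-in mtcars data
small <- lm(mpg ~ wt, data = mtcars)
large <- lm(mpg ~ wt + hp, data = mtcars)

# Collect the criteria side by side
criteria <- data.frame(
  model  = c("mpg ~ wt", "mpg ~ wt + hp"),
  AIC    = c(AIC(small), AIC(large)),
  BIC    = c(BIC(small), BIC(large)),
  adjRsq = c(summary(small)$adj.r.squared, summary(large)$adj.r.squared)
)

# Lower AIC/BIC and higher adjusted R-squared both favor the larger model here
print(criteria)
```

In this comparison the information criteria and the adjusted R-squared agree; with other predictors they can disagree, which is one reason to inspect several metrics.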
This section provides 9 examples utilizing distinct parameters across various regression scenarios with the above 4 datasets.
mtcars
Example 1: In this analysis, we used mpg as the response variable and all other variables as predictors, with the "forward" strategy and AIC as the metric for linear stepwise regression. The variables disp and cyl were forced into all models.
library(StepReg)
data(mtcars)
formula <- mpg ~ .
exam1 <- stepwise(formula = formula,
                  data = mtcars,
                  type = "linear",
                  include = c("disp", "cyl"),
                  strategy = "forward",
                  metric = "AIC")
exam1
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ——————————————————————————————————————————
## included variable disp cyl
## strategy forward
## metric AIC
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## intercept 1
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ——————————————————————————————————————————————
## Dependent mpg numeric
## Independent cyl numeric
## Independent disp numeric
## Independent hp numeric
## Independent drat numeric
## Independent wt numeric
## Independent qsec numeric
## Independent vs numeric
## Independent am numeric
## Independent gear numeric
## Independent carb numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn AIC
## —————————————————————————————————————————————————————————————————————————————————————
## 0 1 0 1 149.943449990894
## 0 disp cyl -2 3 108.33357089067
## 1 wt 3 4 98.7462938182664
## 2 hp 4 5 97.5255371708581
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Parameter Estimates for mpg under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ——————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 40.8285367422432 2.75746792810596 14.8065318642844 1.76140221350856e-14
## disp 0.0115992393009777 0.0117268091002486 0.989121525030348 0.331385561864358
## cyl -1.29331972351378 0.655876754872712 -1.97189443581482 0.0589468066844992
## wt -3.85390352303833 1.01547364107822 -3.79517829625422 0.000758947039357617
## hp -0.020538376368824 0.0121467704321512 -1.69085078898512 0.102379131471602
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process under the information criterion AIC.
plot(exam1)
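As a cross-check on Example 1, the AIC column of the selection table appears to be consistent with the form n * log(SSE / n) + 2 * p + n + 2, where n is the number of observations, SSE the residual sum of squares, and p the number of estimated coefficients. This is an inference from the printed output, not a statement about StepReg internals, and the helper aic_sse below is a hypothetical name introduced only for this sketch. It uses base R lm() fits of the intercept-only model, the model with the forced variables, and the final model.

```r
# Hypothetical helper (not part of StepReg): AIC in the form
# n * log(SSE / n) + 2 * p + n + 2, where p counts all coefficients.
aic_sse <- function(fit) {
  n   <- length(residuals(fit))
  sse <- sum(residuals(fit)^2)
  p   <- length(coef(fit))
  n * log(sse / n) + 2 * p + n + 2
}

# Models corresponding to rows of the selection table in Example 1
null_fit   <- lm(mpg ~ 1, data = mtcars)                     # intercept only
forced_fit <- lm(mpg ~ disp + cyl, data = mtcars)            # forced variables
final_fit  <- lm(mpg ~ disp + cyl + wt + hp, data = mtcars)  # final model

aic_values <- sapply(
  list(null = null_fit, forced = forced_fit, final = final_fit),
  aic_sse
)
print(aic_values)
```

Because this form differs from the value used by stats::step() only by a constant for a fixed dataset, both would rank candidate models in the same order.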
Example 2: In this example, we kept mpg as the response variable and used all other variables as predictors. The strategy was "bidirection", with AIC, AICc, BIC, HQ, HQc, SBC, and SL each applied separately as the stopping criterion, and the significance levels for entry (sle) and stay (sls) were both set to 0.05. The intercept was removed from the model. Because the characteristics of the data and the goals of the analysis differ across subject areas, users may need different stepwise methods and selection criteria. All metrics can be compared through the output list or the plots.
formula <- mpg ~ . + 0
exam2 <- stepwise(formula = formula,
                  data = mtcars,
                  type = "linear",
                  strategy = "bidirection",
                  metric = c("AIC", "SBC", "SL", "AICc", "BIC", "HQ", "HQc"),
                  sle = 0.05,
                  sls = 0.05)
exam2
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ————————————————————————————————————————————————————————————————————————
## included variable NULL
## strategy bidirection
## metric AIC & SBC & SL & AICc & BIC & HQ & HQc
## entry significance level (sle) 0.05
## stay significance level (sls) 0.05
## test method F
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## intercept 0
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ——————————————————————————————————————————————
## Dependent mpg numeric
## Independent cyl numeric
## Independent disp numeric
## Independent hp numeric
## Independent drat numeric
## Independent wt numeric
## Independent qsec numeric
## Independent vs numeric
## Independent am numeric
## Independent gear numeric
## Independent carb numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn AIC
## —————————————————————————————————————————————————————————————————————————————————————
## 0 0 0 0 Inf
## 1 drat 1 1 131.940615397799
## 2 carb 2 2 112.874977003721
## 3 gear 3 3 105.767914676203
## 4 hp 4 4 105.399654906399
## 5 qsec 5 5 105.277861812131
## 6 wt 6 6 100.437613232709
## 7 hp 5 5 98.4440186423183
## 8 am 6 6 97.8009315992234
## 9 gear 5 5 96.5751826224886
## 10 carb 4 4 96.0485980614551
## 11 drat 3 3 95.418690850739
## 12 disp 4 4 95.3954043177414
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under SBC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn SBC
## —————————————————————————————————————————————————————————————————————————————————————
## 0 0 0 0 Inf
## 1 drat 1 1 99.4063513005992
## 2 carb 2 2 81.8064488093206
## 3 gear 3 3 76.1651223846019
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Selection Process under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn SL
## —————————————————————————————————————————————————————————————————————————————————————————
## 0 0 0 0 1
## 1 drat 1 1 2.44913223058495e-22
## 2 carb 2 2 1.03775546495333e-05
## 3 gear 3 3 0.00438861027007104
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Selection Process under AICc
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn AICc
## —————————————————————————————————————————————————————————————————————————————————————
## 0 0 0 0 Inf
## 1 drat 1 1 132.354408501248
## 2 carb 2 2 113.732119860864
## 3 gear 3 3 107.249396157684
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 7. Selection Process under BIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn BIC
## —————————————————————————————————————————————————————————————————————————————————————
## 0 0 0 0 Inf
## 1 drat 1 1 97.7553842586695
## 2 carb 2 2 79.2794817092699
## 3 gear 3 3 72.9946131788242
## 4 hp 4 4 72.9415244192211
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 8. Selection Process under HQ
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn HQ
## —————————————————————————————————————————————————————————————————————————————————————
## 0 0 0 0 Inf
## 1 drat 1 1 96.0182982097902
## 2 carb 2 2 75.0303426277027
## 3 gear 3 3 66.0009631121751
## 4 hp 4 4 63.7103861543619
## 5 qsec 5 5 61.6662758720848
## 6 wt 6 6 54.9037101046532
## 7 hp 5 5 54.8324327022722
## 8 am 6 6 52.2670284711681
## 9 disp 7 7 51.5596073775111
## 10 hp 8 8 50.5198981811353
## 11 carb 7 7 50.5007045958875
## 12 cyl 8 8 50.3437098642926
## 13 carb 9 9 50.2608086854785
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 9. Selection Process under HQc
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn HQc
## —————————————————————————————————————————————————————————————————————————————————————
## 0 0 0 0 Inf
## 1 drat 1 1 96.0263343627548
## 2 carb 2 2 75.052537716843
## 3 gear 3 3 66.0441202299477
## 4 hp 4 4 63.7820933654303
## 5 qsec 5 5 61.7750318088718
## 6 wt 6 6 55.0590757286348
## 7 hp 5 5 54.9411886390593
## 8 am 6 6 52.4223940951496
## 9 disp 7 7 51.7723907320945
## 10 hp 8 8 50.802381133829
## 11 carb 7 7 50.713487950471
## 12 cyl 8 8 50.6261928169863
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 10. Parameter Estimates for mpg under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ———————————————————————————————————————————————————————————————————————————————————————————
## qsec 1.70550996283541 0.127485704584404 13.3780486870687 1.09964868080962e-13
## wt -4.61279456246674 1.15817323630342 -3.98281916545536 0.000440008628764359
## am 4.18085430467977 1.01361607335742 4.12469219320039 0.000300527233592535
## disp 0.0120200576653963 0.0088914542638529 1.35186633240215 0.187238258962162
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 11. Parameter Estimates for mpg under SBC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## —————————————————————————————————————————————————————————————————————————————————————————
## drat 3.85142334610757 1.0678868653112 3.6065836852345 0.00115059878350687
## carb -2.36055514388328 0.350142761922923 -6.74169339077442 2.12967212881877e-07
## gear 3.4883542511301 1.12895206584177 3.08990466174409 0.00438861027007103
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 12. Parameter Estimates for mpg under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## —————————————————————————————————————————————————————————————————————————————————————————
## drat 3.85142334610757 1.0678868653112 3.6065836852345 0.00115059878350687
## carb -2.36055514388328 0.350142761922923 -6.74169339077442 2.12967212881877e-07
## gear 3.4883542511301 1.12895206584177 3.08990466174409 0.00438861027007103
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 13. Parameter Estimates for mpg under AICc
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## —————————————————————————————————————————————————————————————————————————————————————————
## drat 3.85142334610757 1.0678868653112 3.6065836852345 0.00115059878350687
## carb -2.36055514388328 0.350142761922923 -6.74169339077442 2.12967212881877e-07
## gear 3.4883542511301 1.12895206584177 3.08990466174409 0.00438861027007103
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 14. Parameter Estimates for mpg under BIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ————————————————————————————————————————————————————————————————————————————————————————————
## drat 4.30273288409265 1.09158280873171 3.94173749318379 0.000491150652824566
## carb -1.73804836219828 0.54597600366577 -3.18337866596471 0.00355127557199684
## gear 3.17367474781416 1.12779616721381 2.81404994987227 0.00885014913869611
## hp -0.0155479574747606 0.0106015597450293 -1.46657264107297 0.153634601937481
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 15. Parameter Estimates for mpg under HQ
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ————————————————————————————————————————————————————————————————————————————————————————————
## drat 1.24018628116636 1.43081088945576 0.866771626009983 0.395019989028695
## gear 1.05507041373944 1.31730971553748 0.800928135042981 0.431369721203034
## qsec 1.21092335916019 0.39825927287781 3.0405402752084 0.00580910211687029
## wt -3.8393173477773 1.81635802262831 -2.11374481239207 0.0455922405971267
## am 2.78900560523413 1.87631320367996 1.48642859825542 0.15074627783938
## disp 0.013377487347384 0.0171483102637304 0.780105278108835 0.443283496072846
## hp -0.0200206683398262 0.0202056393976394 -0.990845572655583 0.332071276534613
## cyl 0.314427335966453 0.637410972654929 0.493288238601867 0.626486125552571
## carb -0.269382366878572 0.791924966403595 -0.340161477800012 0.736822040562188
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 16. Parameter Estimates for mpg under HQc
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ————————————————————————————————————————————————————————————————————————————————————————————
## drat 1.15464081893128 1.38234393696422 0.835277522515128 0.411799846499336
## gear 0.86861885215473 1.17558240530996 0.738883848747045 0.467142793848725
## qsec 1.30385655642974 0.28438801586838 4.58478024275534 0.000119371208464951
## wt -4.28678423131556 1.22920122659441 -3.48745521772087 0.00190040875483783
## am 2.83676935913337 1.83625846165786 1.54486387312394 0.135464286600943
## disp 0.0173369535398226 0.0123585220697692 1.40283388595724 0.173471044902963
## hp -0.0236106906721794 0.0169099008757928 -1.39626428597101 0.175416642765162
## cyl 0.25167003723096 0.598781514963363 0.420303618167569 0.678003000841066
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the "bidirection" strategy under AIC, AICc, BIC, HQ, HQc, SBC, and SL, with sle = 0.05 and sls = 0.05.
plot(exam2)
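Example 2 removes the intercept by adding + 0 to the formula. The following base R sketch, independent of StepReg, shows how that formula term is interpreted: terms() records an intercept flag of 0, and a model fitted with lm() has no "(Intercept)" coefficient. The object names are illustrative only.

```r
# The same formula as in Example 2: all predictors, no intercept
f <- mpg ~ . + 0

# terms() expands the dot against the data and records the intercept flag
tt <- terms(f, data = mtcars)
intercept_flag <- attr(tt, "intercept")   # 0 means the intercept is dropped

# Fitting the full model confirms that no intercept is estimated
fit <- lm(f, data = mtcars)
has_intercept <- "(Intercept)" %in% names(coef(fit))

print(intercept_flag)
print(has_intercept)
print(names(coef(fit)))
```

Writing the formula as mpg ~ . - 1 is an equivalent way to drop the intercept in R's formula syntax.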
Example 3: In this multivariate multiple linear stepwise regression, we used mpg and drat as the responses and cyl, disp, hp, wt, vs, and am as predictors. The variable wt was forced into all models. Variable selection used the "subset" strategy, with AIC and AICc each applied separately as the criterion; with best_n = 3, at most three models are reported for each number of variables.
formula <- cbind(mpg,drat) ~ cyl + disp + hp + wt + vs + am
exam3 <- stepwise(formula = formula,
                  data = mtcars,
                  type = "linear",
                  include = "wt",
                  strategy = "subset",
                  metric = c("AIC", "AICc"),
                  best_n = 3)
exam3
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ————————————————————————————————————————————
## included variable wt
## strategy subset
## metric AIC & AICc
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## intercept 1
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## —————————————————————————————————————————————————
## Dependent cbind(mpg, drat) nmatrix.2
## Independent cyl numeric
## Independent disp numeric
## Independent hp numeric
## Independent wt numeric
## Independent vs numeric
## Independent am numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## NumberOfVariables AIC VariablesInModel
## —————————————————————————————————————————————————————————————
## 2 161.304784331389 1 wt
## 3 150.7504611172 1 wt cyl
## 3 153.019337447204 1 wt hp
## 3 158.369641697526 1 wt vs
## 4 146.595187612804 1 wt cyl am
## 4 147.017463011528 1 wt cyl hp
## 4 148.411791388542 1 wt hp am
## 5 145.725338025984 1 wt cyl hp am
## 5 148.951593685218 1 wt hp vs am
## 5 149.437618782264 1 wt cyl disp hp
## 6 148.135750495875 1 wt cyl disp hp am
## 6 149.129861532524 1 wt cyl hp vs am
## 6 150.642710344858 1 wt disp hp vs am
## 7 151.290651094496 1 wt cyl disp hp vs am
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under AICc
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## NumberOfVariables AICc VariablesInModel
## —————————————————————————————————————————————————————————————
## 2 195.897376923981 1 wt
## 3 186.904307271046 1 wt cyl
## 3 189.17318360105 1 wt hp
## 3 194.523487851372 1 wt vs
## 4 184.755187612804 1 wt cyl am
## 4 185.177463011528 1 wt cyl hp
## 4 186.571791388542 1 wt hp am
## 5 186.392004692651 1 wt cyl hp am
## 5 189.618260351885 1 wt hp vs am
## 5 190.104285448931 1 wt cyl disp hp
## 6 191.874880930657 1 wt cyl disp hp am
## 6 192.868991967306 1 wt cyl hp vs am
## 6 194.381840779641 1 wt disp hp vs am
## 7 198.745196549042 1 wt cyl disp hp vs am
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Parameter Estimates for Response mpg under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ———————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 36.1465357519024 3.10478079459192 11.6422182895696 4.94480374933663e-12
## wt -2.60648070821658 0.919837490381642 -2.83363173981433 0.00860321812827099
## cyl -0.745157023930062 0.582787409987315 -1.27860865070212 0.211916611111083
## hp -0.0249510591437429 0.0136461447865208 -1.82843283096253 0.0785533736998695
## am 1.47804770540896 1.44114927311401 1.02560347701888 0.314179886317532
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Parameter Estimates for Response drat under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## —————————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 4.53925999228833 0.407323665632239 11.1441106306519 1.32299213312141e-11
## wt -0.093678528460361 0.120675694406779 -0.776283318035738 0.444329239934457
## cyl -0.1691236910122 0.0764572830822192 -2.21200236516816 0.0356145054513225
## hp 0.00171645750563225 0.00179027058073661 0.958769877638278 0.346182118711362
## am 0.377500876754036 0.189067841977936 1.99664243694118 0.056037276070909
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 7. Parameter Estimates for Response mpg under AICc
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 39.4179334351865 2.6414572997099 14.9227978962656 7.42499755293912e-15
## wt -3.12514220026708 0.910882701148664 -3.43089422636541 0.00188589438685631
## cyl -1.5102456624971 0.422279222208057 -3.57641480582487 0.00129160458914754
## am 0.176493157719669 1.30445145498685 0.135300671439281 0.89334214792396
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 8. Parameter Estimates for Response drat under AICc
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error t value Pr(>|t|)
## ———————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 4.31421082403332 0.332409660876704 12.9785963881282 2.29282220122696e-13
## wt -0.0579982631143607 0.114628470360106 -0.505967347659431 0.616840130742605
## cyl -0.116490969949067 0.053141004045333 -2.19211081991774 0.0368483162012922
## am 0.467038681922974 0.164156454783472 2.84508265324694 0.00821005565960176
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the "subset" strategy under the information criteria AIC and AICc.
plot(exam3)
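The best-subset idea behind Example 3 can be sketched in base R under two simplifying assumptions: a single response (mpg) is used instead of cbind(mpg, drat), and models are ranked with stats::AIC rather than StepReg's own criterion values. The sketch enumerates every subset of the free predictors, always keeps wt in the model, and retains at most best_n models per model size. All object names are illustrative.

```r
forced <- "wt"                                 # always in the model
free   <- c("cyl", "disp", "hp", "vs", "am")   # candidates to add
best_n <- 3                                    # models kept per model size

# Every subset of the free predictors, including the empty subset
subsets <- c(
  list(character(0)),
  unlist(lapply(seq_along(free),
                function(k) combn(free, k, simplify = FALSE)),
         recursive = FALSE)
)

# Fit one linear model per subset and record its AIC
results <- do.call(rbind, lapply(subsets, function(vars) {
  terms_in <- c(forced, vars)
  fit <- lm(reformulate(terms_in, response = "mpg"), data = mtcars)
  data.frame(n_predictors = length(terms_in),
             AIC = AIC(fit),
             predictors = paste(terms_in, collapse = " "))
}))

# Within each model size, keep the best_n models with the lowest AIC
ranked <- results[order(results$n_predictors, results$AIC), ]
top <- do.call(rbind, lapply(split(ranked, ranked$n_predictors), head, n = best_n))
rownames(top) <- NULL
print(top)
```

With five free predictors there are 32 candidate models, and keeping at most three per size leaves 14 rows, the same layout as the selection tables in Example 3. Exhaustive enumeration grows as 2^k in the number of free predictors, so it is practical only for small k.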
remission
Example 4: In this run, we used remiss as the response and the other six variables as predictors. The variable cell was forced into all models. Variable selection used the "forward" strategy, with AIC and SL each applied separately as the criterion, and the significance levels for entry (sle) and stay (sls) were both set to 0.05.
data(remission)
formula <- remiss ~ .
exma4 <- stepwise(formula = formula,
                  data = remission,
                  type = "logit",
                  include = "cell",
                  strategy = "forward",
                  metric = c("AIC", "SL"),
                  sle = 0.05,
                  sls = 0.05)
exma4
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ——————————————————————————————————————————
## included variable cell
## strategy forward
## metric AIC & SL
## entry significance level (sle) 0.05
## test method Rao
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## intercept 1
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ——————————————————————————————————————————————
## Dependent remiss numeric
## Independent cell numeric
## Independent smear numeric
## Independent infil numeric
## Independent li numeric
## Independent blast numeric
## Independent temp numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn AIC
## —————————————————————————————————————————————————————————————————————————————————————
## 0 1 1 1 36.3717650879199
## 0 cell -1 2 35.7917917196118
## 1 li 3 3 30.3407188099083
## 2 temp 4 4 29.953368109419
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step EnteredEffect RemovedEffect NumberEffectIn NumberParmsIn SL
## ————————————————————————————————————————————————————————————————————————————————————————
## 0 1 1 1 1
## 0 cell -1 2 0.169276579083175
## 1 li 3 3 0.00801481059473564
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Parameter Estimates for remiss under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## —————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 67.6339061281107 56.8875473471752 1.18890529267068 0.234476937196249
## cell 9.65215222462213 7.75107586200402 1.24526612775618 0.213033942178284
## li 3.86710032908392 1.77827772175707 2.17463238827675 0.0296576752212302
## temp -82.0737742795405 61.7123821323401 -1.32994014237752 0.183537993940985
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Parameter Estimates for remiss under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## —————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) -9.58583007808603 6.2743262779057 -1.52778635561899 0.126565591026528
## cell 6.29163359335162 6.15249803344084 1.0226144826304 0.306490159285688
## li 2.87858063854095 1.25185701053452 2.299448430865 0.0214794885107115
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the "forward" strategy under AIC and SL.
plot(exma4)
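The parameter summary for Example 4 lists Rao as the test method for the SL metric. A single forward entry step with a Rao score test can be sketched in base R with add1(), which accepts test = "Rao" for glm fits. To keep the sketch self-contained it uses mtcars with the binary column am as a stand-in outcome rather than the remission data, and it illustrates the general technique, not StepReg's implementation.

```r
# Stand-in binary outcome: transmission type `am` from mtcars
base_fit <- glm(am ~ 1, data = mtcars, family = binomial)

# Rao score test for adding each candidate predictor to the current model
entry_tests <- add1(base_fit, scope = ~ wt + hp + qsec, test = "Rao")
print(entry_tests)

# Extract the p-values, dropping the "<none>" row for the current model
p_col    <- grep("^Pr", names(entry_tests), value = TRUE)
p_values <- entry_tests[-1, p_col]
names(p_values) <- rownames(entry_tests)[-1]

# A predictor is eligible to enter only if its p-value is below sle
sle <- 0.05
candidates <- names(p_values)[p_values < sle]
print(candidates)
```

In a full forward procedure, the eligible predictor with the smallest p-value would be added and the step repeated until no candidate falls below sle.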
Example 5: In this analysis, remiss was retained as the response variable and the other six variables served as predictors. Variable selection used the "subset" strategy, with AIC and SL each applied separately as the criterion.
data(remission)
formula <- remiss ~ .
exma5 <- stepwise(formula = formula,
                  data = remission,
                  type = "logit",
                  strategy = "subset",
                  metric = c("AIC", "SL"),
                  best_n = 3)
exma5
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ——————————————————————————————————————————
## included variable NULL
## strategy subset
## metric AIC & SL
## test method Rao
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## intercept 1
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ——————————————————————————————————————————————
## Dependent remiss numeric
## Independent cell numeric
## Independent smear numeric
## Independent infil numeric
## Independent li numeric
## Independent blast numeric
## Independent temp numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## NumberOfVariables AIC VariablesInModel
## ————————————————————————————————————————————————————————————————————————
## 2 30.0729645050923 1 li
## 2 34.8205019629019 1 blast
## 2 35.7917917196118 1 cell
## 3 30.3407188099083 1 cell li
## 3 30.6478222686119 1 li temp
## 3 31.4904839517614 1 infil li
## 4 29.953368109419 1 cell li temp
## 4 31.5343469686038 1 li blast temp
## 4 31.7759437799494 1 infil li temp
## 5 31.8579906429728 1 cell smear li temp
## 5 31.8691398459254 1 cell infil li temp
## 5 31.9325085686912 1 cell li blast temp
## 6 33.7550450527665 1 cell smear infil li temp
## 6 33.8571463530981 1 cell smear li blast temp
## 6 33.8686584978099 1 cell infil li blast temp
## 7 35.7506522854459 1 cell smear infil li blast temp
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## NumberOfVariables SL VariablesInModel
## ————————————————————————————————————————————————————————————————————————
## 2 7.93109912391465 1 li
## 2 3.52581053761541 1 blast
## 2 1.88933825259033 1 cell
## 3 8.66108166784948 1 cell li
## 3 8.36482911322912 1 li temp
## 3 8.17463918794755 1 infil li
## 4 9.25024541927856 1 cell li temp
## 4 8.79127825261338 1 smear infil li
## 4 8.68174276768291 1 cell li blast
## 5 9.44759197671312 1 smear infil li temp
## 5 9.27906944643618 1 cell smear li temp
## 5 9.26147971975759 1 cell infil li temp
## 6 9.46088500719727 1 cell smear infil li temp
## 6 9.45015507221275 1 smear infil li blast temp
## 6 9.32952313436336 1 cell smear li blast temp
## 7 9.46088923922502 1 cell smear infil li blast temp
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Parameter Estimates for remiss under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## —————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 67.6339061281107 56.8875473471752 1.18890529267068 0.234476937196249
## cell 9.65215222462213 7.75107586200402 1.24526612775618 0.213033942178284
## li 3.86710032908392 1.77827772175707 2.17463238827675 0.0296576752212302
## temp -82.0737742795405 61.7123821323401 -1.32994014237752 0.183537993940985
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Parameter Estimates for remiss under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## ——————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) 58.0384871144701 71.2364334179606 0.814730389068539 0.415226654399395
## cell 24.6615438508061 47.8376944382513 0.515525343359495 0.606185964565718
## smear 19.2935745808349 57.9500115690838 0.332934783935884 0.739183512053845
## infil -19.6012612370258 61.6814798296488 -0.317781954829235 0.750650339683589
## li 3.89596332799396 2.33711543506734 1.66699653322075 0.0955150942220478
## blast 0.151092333239208 2.27857061152582 0.0663101386785774 0.947130911494031
## temp -87.4339023538899 67.5735358529564 -1.29390746318427 0.195697386304521
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the subset strategy under the metrics AIC and SL.
plot(exma5)
lung
Example 6: Cox stepwise regression using the forward method for variable selection, with IC(1) and SL applied in parallel as the stop-rule criteria and the variable age included in all models. The significance level for entry (sle) was set to 0.05.
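# Load the lung data and drop rows with missing values
lung <- survival::lung
my.data <- na.omit(lung)
# Recode status (1 = censored, 2 = dead) into a 0/1 event indicator
my.data$status1 <- ifelse(my.data$status == 2, 1, 0)
# Use every remaining column except the original status as a candidate predictor
formula <- Surv(time, status1) ~ . - status
exma6 <- stepwise(formula = formula,
                  data = my.data,
                  type = "cox",
                  include = "age",
                  strategy = "forward",
                  metric = c("IC(1)", "SL"),
                  sle = 0.05)
exma6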
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ————————————————————————————————————————————
## included variable age
## strategy forward
## metric IC(1) & SL
## entry significance level (sle) 0.05
## test method efron
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ————————————————————————————————————————————————————
## Dependent Surv(time, status1) nmatrix.2
## Independent inst numeric
## Independent age numeric
## Independent sex numeric
## Independent ph.ecog numeric
## Independent ph.karno numeric
## Independent pat.karno numeric
## Independent meal.cal numeric
## Independent wt.loss numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under IC(1)
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  IC(1)
## —————————————————————————————————————————————————————————————————————————————————————
## 0     age                           1               1              1013.71004248687
## 1     ph.ecog                       2               2              1005.03577133648
## 2     sex                           3               3              999.220449574499
## 3     inst                          4               4              996.745082496922
## 4     ph.karno                      5               5              993.164700650656
## 5     wt.loss                       6               6              990.378814285337
## 6     pat.karno                     7               7              989.5365169151
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  SL
## ————————————————————————————————————————————————————————————————————————————————————————
## 0     age                           1               1              0.0605025541359114
## 1     ph.ecog                       2               2              0.00186866385928123
## 2     sex                           3               3              0.00903790177522826
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Parameter Estimates for Surv(time, status1) under IC(1)
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable coef exp(coef) se(coef) z Pr(>|z|)
## —————————————————————————————————————————————————————————————————————————————————————————————————————————————————
## age 0.0127911717927875 1.01287332875159 0.0117657197510269 1.08715591255444 0.276967911173407
## ph.ecog 0.907317172186787 2.47766645634186 0.238503963744317 3.80420164907388 0.000142262259259764
## sex -0.566868101335099 0.56729938352366 0.200032540541155 -2.83387942682491 0.00459866794108441
## inst -0.0303746283354971 0.970082045244345 0.0131043742343092 -2.3178999464142 0.0204547593691531
## ph.karno 0.026580081421336 1.02693648250202 0.0116170285677177 2.28802755079718 0.0221359167402799
## wt.loss -0.0167121591832758 0.983426714247836 0.00791193897857379 -2.11227099052884 0.0346632125632795
## pat.karno -0.0108962907298638 0.98916285881416 0.00799900477152 -1.36220580448451 0.173132944510343
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Parameter Estimates for Surv(time, status1) under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable coef exp(coef) se(coef) z Pr(>|z|)
## ——————————————————————————————————————————————————————————————————————————————————————————————————————————————
## age 0.00803434264043171 1.00806670458218 0.011086104693833 0.724721880436602 0.46862266905784
## ph.ecog 0.455257333389347 1.5765790372648 0.136856945048725 3.32651977016775 0.000879377814698571
## sex -0.502179683704786 0.605210054490898 0.197336202068922 -2.5447924832839 0.0109342697305065
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the forward strategy under the metrics IC(1) and SL.
plot(exma6)
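As a cross-check on Table 6, the model selected under SL can be refitted directly with survival::coxph(). The sketch below repeats the preprocessing from Example 6 and assumes that StepReg relies on an equivalent Cox fit with Efron tie handling (the test method reported in Table 1). If that assumption holds, the coefficients should agree with Table 6 up to numerical precision. The object name fit_sl is introduced here for illustration only.

```r
library(survival)

# Same preprocessing as in Example 6
my.data <- na.omit(survival::lung)
my.data$status1 <- ifelse(my.data$status == 2, 1, 0)

# Refit the SL-selected model: age (forced in), ph.ecog and sex.
# coxph() uses Efron's method for ties by default.
fit_sl <- coxph(Surv(time, status1) ~ age + ph.ecog + sex, data = my.data)

# Columns: coef, exp(coef), se(coef), z, Pr(>|z|)
summary(fit_sl)$coefficients
```

Comparing this matrix with Table 6 is a quick way to confirm that the reported estimates come from a model containing only the selected variables.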
Example 7: Cox stepwise regression using the backward method for variable selection, with SL and AIC as the stop-rule criteria. The significance level for staying (sls) was set to 0.05.
formula <- Surv(time, status1) ~ . - status
exma7 <- stepwise(formula = formula,
                  data = my.data,
                  type = "cox",
                  strategy = "backward",
                  metric = c("SL", "AIC"),
                  sls = 0.05)
exma7
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ——————————————————————————————————————————
## included variable NULL
## strategy backward
## metric SL & AIC
## stay significance level (sls) 0.05
## test method efron
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ————————————————————————————————————————————————————
## Dependent Surv(time, status1) nmatrix.2
## Independent inst numeric
## Independent age numeric
## Independent sex numeric
## Independent ph.ecog numeric
## Independent ph.karno numeric
## Independent pat.karno numeric
## Independent meal.cal numeric
## Independent wt.loss numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  SL
## ———————————————————————————————————————————————————————————————————————————————————————
## 0                                   8               8              1
## 1                    meal.cal       7               7              0.992244442233114
## 2                    age            6               6              0.276967911173407
## 3                    pat.karno      5               5              0.150201287416114
## 4                    ph.karno       4               4              0.0554707154521576
## 5                    wt.loss        3               3              0.0652481881858785
## 6                    inst           2               2              0.0670860630827791
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  AIC
## —————————————————————————————————————————————————————————————————————————————————————
## 0                                   8               8              998.536422466941
## 1                    meal.cal       7               7              996.5365169151
## 2                    age            6               6              995.742173541071
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Parameter Estimates for Surv(time, status1) under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable coef exp(coef) se(coef) z Pr(>|z|)
## —————————————————————————————————————————————————————————————————————————————————————————————————————————————
## sex -0.510099064468472 0.600436093983396 0.196899845516193 -2.59065243617229 0.00957941855627087
## ph.ecog 0.48251852871466 1.62014966165472 0.132315991621882 3.6467136194206 0.000265615670889065
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Parameter Estimates for Surv(time, status1) under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable coef exp(coef) se(coef) z Pr(>|z|)
## —————————————————————————————————————————————————————————————————————————————————————————————————————————————————
## inst -0.0291538752858262 0.971266998980296 0.0129546401503024 -2.25045813296061 0.0244198783334591
## sex -0.562968129724542 0.569516154877045 0.199295953293519 -2.82478454991715 0.00473124175773548
## ph.ecog 0.901507779010255 2.46331444633504 0.240838552326673 3.74320377822007 0.000181688764023633
## ph.karno 0.0238044694133571 1.02409005738304 0.0113996516289851 2.08817516430336 0.0367820367652906
## pat.karno -0.0115478833807012 0.988518537505187 0.00802593552960615 -1.43882084999353 0.150201287416114
## wt.loss -0.0168103912518805 0.983330114952026 0.00781085562695218 -2.15218307119574 0.0313829384207112
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the backward strategy under the metrics AIC and SL.
plot(exma7)
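The IC(1) values in Example 6 and the AIC values in Example 7 can be related by simple arithmetic. The sketch below assumes that IC(c) stands for -2 times the log-likelihood plus c times the number of parameters, so that AIC corresponds to c = 2. This reading is inferred from the printed values rather than taken from the package documentation. Both examples contain the same seven-variable Cox model (all predictors except meal.cal), which makes a direct comparison possible.

```r
# Values copied from the output above for the same seven-variable Cox model
ic1_7 <- 989.5365169151   # IC(1): Example 6, Table 3, step 6
aic_7 <- 996.5365169151   # AIC:   Example 7, Table 4, step 1
k <- 7                    # number of parameters in that model

# Under the assumed definition IC(c) = -2 * logLik + c * k
neg2loglik <- aic_7 - 2 * k
ic1_from_aic <- neg2loglik + 1 * k

# If the assumption holds, the value recovered from AIC matches the printed IC(1)
all.equal(ic1_from_aic, ic1_7)
```

Under this reading, a smaller multiplier penalizes each extra parameter less, which is consistent with IC(1) retaining more variables in Example 6 than AIC does in Example 7.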
CreditCard
Example 8: In this example, we designated reports as the response variable, with the remaining variables serving as predictors. The analysis employed the forward method for variable selection, using SL and IC(3/2) as the stop-rule criteria, and the significance level for entry (sle) was set to 0.05.
data(CreditCard, package = "AER")
formula <- reports ~ .
exma8 <- stepwise(formula = formula,
                  data = CreditCard,
                  type = "poisson",
                  strategy = "forward",
                  metric = c("SL", "IC(3/2)"),
                  sle = 0.05)
exma8
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ——————————————————————————————————————————————
## included variable NULL
## strategy forward
## metric SL & IC(3/2)
## entry significance level (sle) 0.05
## test method Rao
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## intercept 1
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ——————————————————————————————————————————————
## Dependent reports numeric
## Independent card factor
## Independent age numeric
## Independent income numeric
## Independent share numeric
## Independent expenditure numeric
## Independent owner factor
## Independent selfemp factor
## Independent dependents numeric
## Independent months numeric
## Independent majorcards numeric
## Independent active numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  SL
## —————————————————————————————————————————————————————————————————————————————————————————
## 0     1                             1               1              1
## 1     card                          2               2              8.5877221503339e-235
## 2     active                        3               3              7.53645358139937e-61
## 3     expenditure                   4               4              0.00016672060857812
## 4     months                        5               5              0.00149658752249917
## 5     owner                         6               6              0.000289704544697657
## 6     majorcards                    7               7              0.00853660247624029
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under IC(3/2)
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  IC(3/2)
## —————————————————————————————————————————————————————————————————————————————————————
## 0     1                             1               1              2998.46727451496
## 1     card                          2               2              2161.54380348632
## 2     active                        3               3              1970.86790229148
## 3     expenditure                   4               4              1962.20751098534
## 4     months                        5               5              1954.79845463846
## 5     owner                         6               6              1942.91316602911
## 6     majorcards                    7               7              1937.16269545213
## 7     income                        8               8              1937.02611660997
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Parameter Estimates for reports under SL
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## ———————————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) -0.298643659038297 0.109685399601467 -2.7227293707585 0.00647450714191624
## cardyes -2.70352225795467 0.117195939295856 -23.0683953232351 9.61653770781882e-118
## active 0.0654296707660895 0.00399754905523789 16.367446618412 3.26632926223479e-60
## expenditure 0.000672431213470284 0.000177638845762774 3.78538382515879 0.00015347151873607
## months 0.00212461501050368 0.000530320086460442 4.00628802254911 6.16804294228691e-05
## owneryes -0.343769864333074 0.0926480304376423 -3.71049295607479 0.000206856052610907
## majorcards 0.274039347897117 0.104512901787067 2.62206237901079 0.00873994324369934
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Parameter Estimates for reports under IC(3/2)
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## ———————————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) -0.370064654071273 0.122569709425332 -3.01921784596146 0.00253428230748237
## cardyes -2.69192183928203 0.117368471922898 -22.935646985763 2.04939317764651e-116
## active 0.064733432332392 0.00402555478259611 16.0806238713375 3.48848117936459e-58
## expenditure 0.000598437222203223 0.000185715652467924 3.22233055884496 0.00127152347189399
## months 0.00202983041537343 0.000534807209066601 3.79544325686278 0.000147379901803219
## owneryes -0.368861130713364 0.0948758378946128 -3.88783001972632 0.000101144415795062
## majorcards 0.257131033787657 0.105294602110372 2.44201534204124 0.0146055259166052
## income 0.0328317155222303 0.0253062467170453 1.29737593604176 0.194501868167294
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the forward strategy under the metrics IC(3/2) and SL.
plot(exma8)
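Poisson coefficients are on the log scale, so exponentiating them gives multiplicative effects on the expected count. The sketch below does this for a few estimates copied from Table 5, assuming the model uses the usual log link. The vector names est and rate_ratio are illustrative.

```r
# Estimates copied from Table 5 (Parameter Estimates for reports under SL)
est <- c(cardyes    = -2.70352225795467,
         active     =  0.0654296707660895,
         owneryes   = -0.343769864333074,
         majorcards =  0.274039347897117)

# Under a log link, exp(estimate) is the rate ratio for a one-unit increase
# (or for the indicated factor level versus its reference level)
rate_ratio <- exp(est)
round(rate_ratio, 3)
```

For instance, the cardyes estimate corresponds to a rate ratio of roughly 0.067: holding the other predictors fixed, the expected number of reports for observations with card = yes is about 7% of that for card = no.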
Example 9: The reports variable was again used as the response, with the remaining variables as predictors. The variables card and months were forced into all models. Poisson stepwise regression was performed using the bidirection method for variable selection, with IC(3/2) and AIC applied in parallel as the stop-rule criteria.
formula <- reports ~ .
exma9 <- stepwise(formula = formula,
                  data = CreditCard,
                  type = "poisson",
                  include = c("card", "months"),
                  strategy = "bidirection",
                  metric = c("IC(3/2)", "AIC"))
exma9
## Table 1. Summary of Parameters
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Parameter Value
## ———————————————————————————————————————————————
## included variable card months
## strategy bidirection
## metric IC(3/2) & AIC
## tolerance of multicollinearity 1e-07
## multicollinearity variable NULL
## intercept 1
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 2. Type of Variables
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable type Variable name Variable class
## ——————————————————————————————————————————————
## Dependent reports numeric
## Independent card factor
## Independent age numeric
## Independent income numeric
## Independent share numeric
## Independent expenditure numeric
## Independent owner factor
## Independent selfemp factor
## Independent dependents numeric
## Independent months numeric
## Independent majorcards numeric
## Independent active numeric
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 3. Selection Process under IC(3/2)
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  IC(3/2)
## —————————————————————————————————————————————————————————————————————————————————————
## 0     1                             1               1              2998.46727451496
## 0     card months                   -2              3              2153.23072693751
## 1     active                        4               4              1963.56167073775
## 2     owner                         5               5              1952.16998343778
## 3     expenditure                   6               6              1942.91316602911
## 4     majorcards                    7               7              1937.16269545213
## 5     income                        8               8              1937.02611660997
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 4. Selection Process under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Step  EnteredEffect  RemovedEffect  NumberEffectIn  NumberParmsIn  AIC
## —————————————————————————————————————————————————————————————————————————————————————
## 0     1                             1               1              2998.96727451496
## 0     card months                   -2              3              2154.73072693751
## 1     active                        4               4              1965.56167073775
## 2     owner                         5               5              1954.66998343778
## 3     expenditure                   6               6              1945.91316602911
## 4     majorcards                    7               7              1940.66269545213
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 5. Parameter Estimates for reports under IC(3/2)
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## ———————————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) -0.370064654071274 0.122569709425332 -3.01921784596147 0.00253428230748225
## cardyes -2.69192183928203 0.117368471922898 -22.935646985763 2.04939317764651e-116
## months 0.00202983041537344 0.000534807209066601 3.79544325686279 0.000147379901803215
## active 0.064733432332392 0.00402555478259611 16.0806238713375 3.48848117936579e-58
## owneryes -0.368861130713364 0.0948758378946128 -3.88783001972632 0.000101144415795061
## expenditure 0.000598437222203224 0.000185715652467925 3.22233055884496 0.00127152347189399
## majorcards 0.257131033787657 0.105294602110372 2.44201534204124 0.0146055259166053
## income 0.0328317155222304 0.0253062467170453 1.29737593604176 0.194501868167292
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
##
## Table 6. Parameter Estimates for reports under AIC
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
## Variable Estimate Std. Error z value Pr(>|z|)
## ———————————————————————————————————————————————————————————————————————————————————————————————————
## (Intercept) -0.298643659038298 0.109685399601467 -2.72272937075851 0.00647450714191609
## cardyes -2.70352225795467 0.117195939295856 -23.0683953232351 9.61653770781724e-118
## months 0.00212461501050368 0.000530320086460441 4.00628802254911 6.16804294228673e-05
## active 0.0654296707660895 0.0039975490552379 16.367446618412 3.26632926223498e-60
## owneryes -0.343769864333074 0.0926480304376423 -3.71049295607478 0.00020685605261091
## expenditure 0.000672431213470282 0.000177638845762774 3.78538382515879 0.000153471518736071
## majorcards 0.274039347897117 0.104512901787067 2.62206237901079 0.0087399432436993
## ‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗‗
Visualization of the selection process using the bidirection strategy under the metrics IC(3/2) and AIC.
plot(exma9)
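Tables 3 and 4 of Example 9 differ only in the last step: income enters under IC(3/2) but not under AIC. Assuming IC(c) is -2 times the log-likelihood plus c times the number of parameters, with AIC as the case c = 2 (an assumption inferred from the printed values, not from the package documentation), the AIC of the eight-parameter model can be recovered from its IC(3/2) and compared with the AIC of the seven-parameter model.

```r
# Values copied from the Example 9 output
ic32_8 <- 1937.02611660997  # IC(3/2) with income, 8 parameters (Table 3, step 5)
ic32_7 <- 1937.16269545213  # IC(3/2) without income, 7 parameters (Table 3, step 4)
aic_7  <- 1940.66269545213  # AIC without income, 7 parameters (Table 4, step 4)

# Recover -2 * logLik of the 8-parameter model, then apply the AIC penalty
neg2loglik_8 <- ic32_8 - 1.5 * 8
aic_8 <- neg2loglik_8 + 2 * 8

# Adding income lowers IC(3/2) but raises AIC, so only IC(3/2) admits it
c(ic32_improves = ic32_8 < ic32_7, aic_improves = aic_8 < aic_7)
```

Under this assumption the gain in fit from income is enough to offset a penalty of 1.5 per parameter but not a penalty of 2, which would explain why the two selection paths diverge at the final step.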
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur/Monterey 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] StepReg_1.5.0 BiocStyle_2.22.0
##
## loaded via a namespace (and not attached):
## [1] zip_2.3.1 Rcpp_1.0.12 highr_0.10
## [4] bslib_0.4.2 compiler_4.1.3 pillar_1.9.0
## [7] BiocManager_1.30.20 jquerylib_0.1.4 tools_4.1.3
## [10] digest_0.6.31 lattice_0.21-8 jsonlite_1.8.4
## [13] evaluate_0.20 lifecycle_1.0.4 tibble_3.2.1
## [16] gtable_0.3.4 pkgconfig_2.0.3 rlang_1.1.3
## [19] openxlsx_4.2.5.2 Matrix_1.5-3 cli_3.6.2
## [22] rstudioapi_0.14 ggrepel_0.9.5 yaml_2.3.7
## [25] xfun_0.38 fastmap_1.1.1 withr_3.0.0
## [28] stringr_1.5.1 dplyr_1.1.4 knitr_1.42
## [31] generics_0.1.3 sass_0.4.5 vctrs_0.6.5
## [34] tidyselect_1.2.0 grid_4.1.3 glue_1.7.0
## [37] R6_2.5.1 fansi_1.0.6 survival_3.5-5
## [40] rmarkdown_2.21 bookdown_0.33 farver_2.1.1
## [43] purrr_1.0.2 ggplot2_3.4.4 magrittr_2.0.3
## [46] splines_4.1.3 scales_1.3.0 htmltools_0.5.5
## [49] colorspace_2.1-0 labeling_0.4.3 utf8_1.2.4
## [52] stringi_1.8.3 munsell_0.5.0 cachem_1.0.7