You are running a large logistic regression for 1,000 feature variables by using the rxLogisticRegression() function
in the MicrosoftML package. All of the predictor variables are numeric. Currently, you specify the input variables separately by using the following formula.
Outcome ~ Feature000 + Feature001 + Feature002 + … + Feature999
You discover that it takes 20 minutes to estimate each model.
You need to reduce the amount of time required to estimate each model without losing any information in the
What should you do?
Use stepControl() to perform stepwise regression to limit the number of variables that contribute to the
Use selectFeatures() to select the features that provide the most information about the outcome variable.
Use princomp() on the correlation matrix of Features, and then use only the first 100 principal components
to reduce the number of input variables.
Use concat() to create a single array variable named Features, and then specify a new formula named
Outcome ~ Features.
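
For reference, the concat() option can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the data frame df, the row count n, and the randomly generated values are hypothetical stand-ins for the real data. It assumes an R installation that includes the MicrosoftML package (Microsoft R Server / Machine Learning Server).

```r
# Requires the MicrosoftML package (ships with Microsoft R Server / ML Server).
library(MicrosoftML)

set.seed(1)
n <- 500  # hypothetical row count, for illustration only

# Synthetic stand-in data: Feature000 ... Feature999, all numeric.
featureNames <- sprintf("Feature%03d", 0:999)
df <- as.data.frame(matrix(rnorm(n * length(featureNames)), nrow = n))
names(df) <- featureNames
df$Outcome <- rbinom(n, 1, 0.5) == 1  # hypothetical logical label

# concat() combines the 1,000 numeric columns into one vector-valued
# column named Features, so the formula needs only a single term.
model <- rxLogisticRegression(
  Outcome ~ Features,
  data = df,
  type = "binary",
  mlTransforms = list(concat(vars = list(Features = featureNames)))
)
```

Because concat() only repackages the existing columns into one array-valued column, all 1,000 predictors still enter the model; nothing is dropped or projected away, unlike feature selection or principal components.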