EZ Study



Logistic Regression Analysis Study Notes 3
Intuitively Interpret Output: Odds ratio, ROC Curve, Concordant, Discordant


In this tutorial, we use the High School and Beyond data set, hsb2.sas7bdat, to describe what a logistic model is, how to fit a logistic regression model, and how to interpret the results. Our dependent variable is a dichotomous variable indicating whether a student's writing score is greater than or equal to 52. We call it hiwrite. The predictor variables include female and other test scores. The data set has 200 observations.
data hsb2;
  set hsb2;
  hiwrite = (write >= 52);  /* 1 if the writing score is 52 or higher, 0 otherwise */
run;
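
Before fitting a model, it can help to confirm that the new 0/1 variable looks the way we expect. The step below is only a suggested check, not part of the original notes; it assumes the hsb2 data set created above is available in the WORK library.

```sas
/* Suggested sanity check (not in the original notes):           */
/* count how the 200 students split on the new hiwrite variable, */
/* overall and separately for males and females.                 */
proc freq data = hsb2;
  tables hiwrite female*hiwrite;
run;
```

The one-way table shows how many students fall at or above the cutoff of 52, and the two-way table gives a first look at how the outcome differs by female.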

Let's now take a look at a model with both a continuous variable math and a categorical variable female as predictors. We will focus on how to interpret the parameter estimate for the continuous variable.

proc logistic data = hsb2;  /* the DESCENDING option on this statement has the same effect as event='1' */
  model hiwrite (event='1') = female math / clodds = wald;
  units math = 5;
  output out = m2 p = prob xbeta = logit;
run;

proc template;  /* display parameter estimates with more decimals */
  define table Stat.Logistic.ParameterEstimates;
    dynamic NRows;
    column Variable GenericClassValue Response DF Estimate
           StdErr WaldChiSq ProbChiSq StandardizedEst ExpEst Label;
    define Estimate;
      header = "Estimate";
      parent = Stat.Logistic.vbest8;
      format = 20.8;
    end;
  end;
run;

proc logistic data = A descending;
  model Y = X1 X2 X3 X4;
  test X1 = 0;           /* tests H0: Beta1 = 0 */
  test X1 = 0, X2 = 0;   /* tests H0: Beta1 = Beta2 = 0 */
  test X1 = X2;          /* tests H0: Beta1 = Beta2 */
run;

             Analysis of Maximum Likelihood Estimates
                               Standard          Wald
Parameter    DF    Estimate       Error    Chi-Square    Pr > ChiSq
Intercept     1    -10.3651      1.5535       44.5153        <.0001
FEMALE        1      1.6304      0.4052       16.1922        <.0001
MATH          1      0.1979      0.0293       45.5559        <.0001

           Odds Ratio Estimates
             Point          95% Wald
Effect    Estimate      Confidence Limits
FEMALE       5.106       2.308      11.298
MATH         1.219       1.151       1.291

    Wald Confidence Interval for Adjusted Odds Ratios
Effect         Unit     Estimate     95% Confidence Limits
MATH         5.0000        2.689        2.018        3.584

The interpretation of the parameter estimate for math is very similar to that for the categorical variable female. On the logit scale, each one-unit increase in the math score raises the logit by 0.198, holding everything else constant. Equivalently, a one-unit increase in math score multiplies the odds of scoring 52 or higher on the writing test by 1.219, an increase of (1.219 - 1)*100% = 22%.
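
To make the logit scale concrete, the fitted coefficients from the output above can be plugged in by hand. The sketch below is an illustration only: the student (female = 1, math = 50) is a hypothetical case chosen for the example, and the variable names are made up here.

```sas
/* Illustration with a hypothetical student; coefficients are copied */
/* from the Analysis of Maximum Likelihood Estimates table.          */
data _null_;
  b0       = -10.3651;   /* intercept */
  b_female =   1.6304;   /* coefficient for female */
  b_math   =   0.1979;   /* coefficient for math */

  female = 1;            /* hypothetical values, not from the data */
  math   = 50;

  logit = b0 + b_female*female + b_math*math;   /* linear predictor */
  prob  = 1 / (1 + exp(-logit));                /* inverse logit */

  /* Raising math by one unit adds b_math to the logit */
  logit_plus1 = logit + b_math;
  prob_plus1  = 1 / (1 + exp(-logit_plus1));

  put logit= prob= logit_plus1= prob_plus1=;
run;
```

The log shows that the logit moves by a constant 0.1979 per math point, while the change in probability depends on where the student starts.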

You may wonder how the parameter estimate for math (0.1979) relates to its odds ratio (1.219). The odds ratio is the exponential of the parameter estimate: exp(0.1979) = 1.219.
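
The link between the two numbers can be checked with a short DATA step. This is a minimal sketch that simply exponentiates the estimate reported above; the variable names are chosen for illustration.

```sas
/* Convert the math coefficient to an odds ratio and a percent change */
data _null_;
  beta_math  = 0.1979;                   /* estimate from the output above */
  odds_ratio = exp(beta_math);           /* odds ratio for a one-unit increase */
  pct_change = (odds_ratio - 1) * 100;   /* percent change in the odds */
  put odds_ratio= 8.3 pct_change= 8.1;
run;
```

The odds ratio written to the log should agree with the 1.219 in the Odds Ratio Estimates table, up to rounding of the estimate.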


Sometimes a one-unit change is not the most useful scale. We can ask SAS to report odds ratios for other units of change. For example, it may make more sense to talk about a change of 5 units in math score. This is done with the units statement.

We also add the option clodds = wald to the model statement so that a confidence interval is calculated for the odds ratio requested in the units statement. Of course, you can always compute the odds ratio for a 5-unit change in math score by hand as 1.219^5 = 2.69.
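
The hand calculation can be extended to the confidence limits as well. The sketch below assumes the usual Wald construction (estimate plus or minus a normal quantile times the standard error, scaled by the unit and then exponentiated); the variable names are illustrative.

```sas
/* Odds ratio and Wald limits for a k-unit change in math, by hand */
data _null_;
  beta = 0.1979;        /* estimate for math */
  se   = 0.0293;        /* its standard error */
  k    = 5;             /* units of change, matching: units math = 5 */
  z    = probit(0.975); /* normal quantile for a 95% interval */

  or_k  = exp(k * beta);
  lower = exp(k * (beta - z*se));
  upper = exp(k * (beta + z*se));

  put or_k= 8.3 lower= 8.3 upper= 8.3;
run;
```

Because the inputs are rounded to four decimals, the results may differ from the SAS table (2.689, 2.018, 3.584) in the last digit.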

We can compare the linear predictions and the probabilities in terms of the math scores for the males and females.

proc sort data = m2;
  by math;
run;
symbol1 i = join v = star   l = 32 c = black;
symbol2 i = join v = circle l = 1  c = black;
proc gplot data = m2;
  plot logit*math = female;
  plot prob*math = female;
run;
quit;
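
If SAS/GRAPH is not available, a similar plot can probably be drawn with proc sgplot. This is an alternative sketch, not the method used in the original notes; it assumes the m2 data set produced by the output statement above, already sorted by math.

```sas
/* Alternative to gplot (an assumption, not from the original notes): */
/* predicted probability against math, one line per value of female.  */
proc sgplot data = m2;
  series x = math y = prob / group = female markers;
  yaxis label = "Predicted probability of hiwrite = 1";
run;
```

Replacing y = prob with y = logit gives the corresponding plot of the linear predictions.
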
Acknowledgement: The tutorial is based on the notes from: www.ats.ucla.edu.
