    /* example thresholds; the same values are passed to the macro call further down */
    %let slent = 0.2;   /* significance level for a variable to enter the model */
    %let slst  = 0.1;   /* significance level for a variable to stay in the model */

    proc logistic data=datain descending namelen=100;
        model dep_var = var1-var10
              / selection=stepwise
                slentry=&slent
                slstay=&slst;
        weight split;
        output out=dataout pred=pred;
    run;
    quit;

In logistic regression, you can simply run stepwise selection on the full dataset to find the variables you want, and in general the result tells you almost everything you need. With a large dataset, however, fitting the model on all of the data can take quite some time, so you can run it on a sample instead. The question then is how to make sure the sample is representative. You don't actually need to worry about that: rather than relying on a single sample, you can draw many samples and run the stepwise selection on each of them.
    %let varlist = var1-var10;

    %macro var_filt(input, dep_var, nboots, bootsize, slent, slst, out);
        %local i;
        /* run the stepwise logistic regression nboots times */
        %do i = 1 %to &nboots;

            /* draw a small simple random sample for this run */
            proc surveyselect method=srs data=&input out=boot&i
                              seed=%sysevalf(1000 + &i*10)
                              n=&bootsize;
            run;

            proc logistic data=boot&i desc noprint outest=log_var_filt_&i;
                model &dep_var = &varlist
                      / selection=stepwise
                        slentry=&slent   /* threshold for entering the model */
                        slstay=&slst;    /* threshold for staying in the model */
            run;

            /* append this run's parameter estimates to the combined output */
            proc datasets nolist;
                append data=log_var_filt_&i base=&out force;
            run;
            quit;
        %end;
    %mend var_filt;

    options mprint mlogic spool;

    /* remove results left over from a previous run before appending */
    proc sql;
        drop table dataout1;
    quit;

    %var_filt(Data1, tag, 20, 30000, 0.2, 0.1, dataout1);

    /* summary statistics of each coefficient across the runs */
    proc means data=dataout1 noprint;
        output out=subset;
    run;

    ods html file='var_selection.html';
    title 'Variable Selection Logistic';
    proc print data=subset;
    run;
    title;
    ods html close;
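To read the result as a variable ranking, one option is to look only at the N row of the PROC MEANS output. This is a minimal sketch, not part of the original code. It assumes the candidate variables are var1-var10, and that the OUTEST= dataset leaves the estimate missing for any variable that stepwise selection did not keep, so that N counts the runs in which a variable was selected. The names sel_freq, variable and n_selected are made up for illustration.

```sas
/* Keep the N row: for each variable, the number of runs with a non-missing */
/* estimate (assumed here to mean the variable was selected in that run).   */
proc transpose data=subset(where=(_STAT_='N'))
               out=sel_freq(rename=(col1=n_selected))
               name=variable;
    var var1-var10;
run;

/* Most frequently selected variables first */
proc sort data=sel_freq;
    by descending n_selected;
run;

proc print data=sel_freq noobs;
    var variable n_selected;
run;
```

Variables that appear in most of the 20 runs are the more stable candidates; where to put the cutoff is a judgment call.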