EZ Study

Actuarial Biology Chemistry Economics Calculators Confucius Engineer
Physics
C.S.


Cool Tips & Tricks of Proc SQL - 1
How to Rank a variable, Draw a horizontal bar chart
Random sampling, Replicate a data set without data

Question: Proc SQL is some tool that once you get used to, you will definitely fall in love with it! In this tutorial we introduce some cool features of proc SQL. We demonstrate by using the sel-included dataset class in sashelp library. To Start,

• Rank a variable
Say, we need to rank the dataset by the variable weight within each group sex.We can use PROC RANK to accomplish it.
proc rank data=class out=results ties=low descending;
   by sex;
   var weight;
   ranks Weight_Rank;
run;
In case if you are using SQL, you can use PROC SQL for the ranking as well.
proc sql;
   select sex, name, weight, (select count(distinct b.weight)
   from class b
   where b.weight <= a.weight and a.sex=b.sex
   ) as age_rank
   from class a
   group by sex
   order by sex, age_rank;
quit;
You can even use Proc SQL for the 30 days' Moving average calculation.
proc sql;
   select division, date, amount as amount_daily, 
   (select sum(b.amount)   from data1 b
   where a.division=b.division and a.date-b.date>=0 and a.date-b.date<=29 
   ) as amount_last30
   from data1 a
   order by division, date;
quit;

• Draw a horizontal bar chart
It is a substitute in case that we want to have a look at the distribution of a variable but don’t know PROC GPLOT or other plotting procedures.
proc sql;
   select age, '|',
   repeat('*',count(*)*4) as frequency
   from class
   group by age
   order by age;
quit;
• Return the running total for a variable
PROC SQL get get the running total by using a subquery at the SELECT clause.
proc sql;
   select name, weight, 
      (select sum(a.weight) from class as a where a.obs <= b.obs) as running_total
   from class as b;
quit;
• Report data with subtotal   The first thing in my mind is the powerful REPORT procedure. However, PROC SQL is an alternative with the set operator.
proc sql;
   select name, weight
   from class
   union all
   select 'Total', sum(weight) 
   from class;
quit;
• Random sampling
PROC SURVEYSELECT involves with too much statistics. If we just need a simple random sampling, PROC SQL is a more popular option. For example, if I want to pick out 8 observations, I can use PROC SQL like:
proc sql outobs = 8;
   select *
   from class
   order by ranuni(1234);
quit;
• Replicate a data set without data
In PROC SQL, it is a simple one-sentence statement to create an empty data set.
proc sql;
   create table class2 like class;
quit;
or you can use data step:
data a;
set sashelp.class;
stop;   /* or use: if _n_>=1 then delete; */
run;
• Count the missing values   The great thing in PROC SQL is that the NMISS and N functions under it work for both numeric and character variables.
proc sql
   select count(*), nmiss(weight), n(weight)
   from class;
quit;
Acknowledgement: some tutorials are coming from popular SAS Blog by Dr. Charlie.

Related links:
Continue to next: SAS Cool Tips & Tricks of Proc SQL: SAS Library dictionaries   SAS tutorial home
Back to: Tricky way to process data, extend weekly to daily   Statistics tutorial home