EZ Study

Actuarial Biology Chemistry Economics Calculators Confucius Engineer
Physics
C.S.


Format SQL Statement to look nice
Tranpose data efficiently in Netezza/SAS/SQL

Question: Previously we have a few tutorials: how to Trun around & Transpose data
nicely in SAS, by using array and proc transpose in SAS
(Also a tutorial for data tranpose in R, how to format SQL to look nice). You may realize that if you have millions of records to tranpose, then it may take a while by using proc tranpose, especially when the number of columns after transposing is really big.

In this tutorial, we introduce some cool features of SQL to transpose data nicely and quickly.
     Proc sql;
        CREATE TABLE car1a AS
        SELECT DISTINCT DriveTrain,Origin,count(*) AS count
        FROM sashelp.cars
        GROUP BY DriveTrain,Origin
        ORDER BY DriveTrain,Origin;
        quit;

      Proc transpose data = car1a out = car1b;
          BY DriveTrain;
          var count;
          id Origin;
          run;

      *you can get better same OUTPUT BY using the following SQL **;
           
      Proc sql;
      CREATE TABLE car2a AS
      SELECT DISTINCT DriveTrain,

count(DISTINCT (CASE WHEN Origin = 'Asia'  THEN make END )) AS Asia_distinct_make_count,
count(CASE WHEN Origin = 'Asia' THEN Origin	END) AS Asia_count,
count(CASE WHEN Origin = 'USA' THEN Origin	END) AS USA_count,
avg(CASE WHEN Origin = 'Asia' THEN Weight	END) AS Avg_Asia_weight format 8.1,
avg(CASE WHEN Origin = 'USA' THEN Weight	END) AS Avg_USA_weight format 8.1,
max(CASE WHEN Origin = 'Asia' THEN Weight	END) AS Max_Asia_weight,
sum(CASE WHEN Origin = 'Asia' THEN MPG_City	END) AS Sum_Asia_MPG,
      
      FROM sashelp.cars
      GROUP BY DriveTrain
      ORDER BY DriveTrain;
      quit;
If it's inside Netezza/SAS, we can use the following to do tranpose inside netezza SQL.
proc sql;
      create table Data1 as
      select Var, count(*) as cnt
      from Data0
      group by Var
      ;quit;
      
Then print that table into excel, use 
="sum(nvl(case when "&A1&"='"&A2&"' then Var_ID end,0)) as " &A2&" ,"
to generate the code needed for the following: 
(It's a little bit tricky here, or use max if not numericals)
      	
proc sql stimer; 
        connect to netezza (server='server_name' database=database1 
        user=myuser pwd=mypassword connection=global); 
        execute (create table outputfile as  
        select TRANSACTION_ID,
sum(nvl(case when Option='ABS'  then  12345 end,0)) as ABS ,
...
from Datain
group by 1
distribute on random ) 
by netezza; quit;   

Formatting SQL Statement

SQL statement is so useful that not only analysts are necessarily to get familiar with, also many IT stuff or database adminstrators are using that quite often. The question is, sometime you get a very long SQL query, and it seems not that readable; Just like the following SQL query, it's really not that clear at the first glance of it.

The good news is that, we have some good tools to format the SQL query, so it's much easy to understand. One of the popular useful SQL formatting websites that LDD(Little Dumb Doctor) highly recommended is: www.poorsql.com. Also many
are using this one: http://www.dpriver.com/pp/sqlformat.htm.


Related links:
Continue to next: NameLen option: NO truncation in the model output   SAS tutorial home
Back to: SAS Cool Tips & Tricks of Proc SQL; How to Rank a variable ?   Statistics tutorial home