EZ Study

Actuarial Biology Chemistry Economics Calculators Confucius Engineer

How SAS automatically generate dummy variables?

Question: Say, we have a data with one column containing all different character values, as you can see the following column in the yellow box, we want to automatically generate a sequence of dummy variables based on the different values in that column, as showed in the table:

Answer: We used some automation skills to generate the dummy varaibles.

        proc sql;
        select distinct scan(make,1,"-'/*?_+-|\ ")
        into: make_1- from sashelp.cars;
        /* we can leave it blank if we don't know how many */
        /* notice we can't have some characters like "/+-?\|" in varaibles name */

        %let make_count=&SQLOBS.;

        %macro output;
        data dataout1 ;
        set sashelp.cars(keep=make);

        %do i=1 %to &make_count.;
        &&make_&i=(scan(make,1,"-'/*?_+-|\ ")="&&make_&i");

        %end; run;


        proc print data=dataout1; run;

        /*****part-II: automatically rename all variables*********/
        proc contents data=dataout1 out=varnames; run;

        filename rename_file "./rename.txt";

        data _null_; file rename_file;
        set varnames end=eof;

        if _n_=1 then put "rename ";
            put name '= ' new_var;
        if eof then put ';'; run;

        data dataout2; set dataout1;
        %include rename_file;

        proc print data=dataout2; run;

you can simply copy the above code to run in your sas programm, see how it works.

    Note (1): For the scan function we used, scan(make,1,"-'/*?_+-\ ")|, we try to select the first word/string before any of those delimiters showing up. To get the last word in the string: last_word=scan( string, -1, ' ' );

    Note (2): We used the left function, left(string,32) since the variable names can't be more than 32 characters long.

    Note (3): for the double ampersand we have used for the macro variable, we need to pay some extra attention: &&make_&i. here the counter i only works within the macro, not working outside of macro.

    Note (4): Another useful application for into: statement, by applying TRIMMED function to get rid of leading blanks.

        proc sql noprint;
        select distinct style, sqfeet
        into :s1, :s2 TRIMMED
        from proclib.houses;

        %put &s1 &s2;
        %put There were &sqlobs distinct values.;

Related links:

Continue to next: How to format Excel file by SAS DDE?   SAS tutorial home
Back to: How to read dynamic macro parameters?   Statistics tutorial home