Proc Expand - Quite useful in modeling

Ques:  How would you create lag(s) of various orders of a variable in SAS ?
Ans  :  Using lag n (x) function, Simple !

Ques:  How would you create moving average of a variable in SAS?
Ans  :  Will take various lag, and then will take average ?

Ques : How will you create lead(s) of various orders of a variable in SAS ?
Ans  : We can't ... No function available for this, sorry !

Are you sure ??? Don't be. Here comes one of the most versatile SAS procedures that would make you a super hero.


Proc Expand

Let's use the following data  for understanding it.

Data Demo;
infile datalines ;
Input year Month $ sales;
cards;
2011  Jan  100
2011  Feb  110
2011  Mar  120
2011  Apr  130
2011  May  140
2011  Jun  150
2012  Jan  160
2012  Feb  170
2012  Mar  180
2012  Apr  190
2012  May  200
2012  Jun  210
;

run;

Try this one :


Proc expand data= Demo out= new_demo;
convert Sales = lag_sales /transformout = (lag 1);
convert sales = lead_sales /transformout = (lead 2);
convert sales = sales_movave / transformout=(movave 3);
run;

It would give you lag of first order, lead of 2nd order and moving average of 3rd order  (you can customize orders).



By the way, in the demo data, we had data only for six months for year 2011, so principally data for Jun 2011 and Jan 2012 should not act as lead or lag ... there should be a  break by year.

Yes we can do that ... don't get so impatient J


Use by statement with proc expand

Proc expand data= Demo out= new_demo_year;
convert Sales = lag_sales /transformout = (lag 1);
convert sales = lead_sales /transformout = (lead 1);
convert sales = sales_movave / transformout=(movave 3);
by year;

run;

In the above example data was already sorted in the order by Year, otherwise you had to first use proc sort to do so.



You can use this procedure for creation of lags, moving averages during modeling exercise for transforming the variables.

Returning to where we started from, 

Ques : How will you create lead(s) of various orders of a variable in SAS ?
Ans  : We can't ... No function available for this, sorry !

Answer was wrong. You can invert the data, take lag ... and invert the data once again .... You get your lead.

Try this code to understand it better:

Data check;
infile datalines ;
Input year Month $ sales;
cards;
2011  Jan  100
2011  Feb  110
2011  Mar  120
2011  Apr  130
2011  May  140
2011  Jun  150
;
run;

Data check ;
set check;
count + 1;
run;

proc sort data = check ; by descending count; run;

data check ;
set check ;
lead = lag(sales);
run;

proc sort data = check ; by  count; run;

There is so much more that we can do with Proc Expand but that we would "Expand" later ... 

Enjoy reading our other articles and stay tuned with ...

Kindly do provide your feedback in the 'Comments' Section and share as much as possible.

No comments:

Post a Comment

Do provide us your feedback, it would help us serve your better.