SAS provides Proc Freq for frequency analysis. The procedure counts how often each value occurs, either for a single variable or for combinations of variables.

The procedure can also perform a Chi-Square test, which helps establish whether two or more samples are independent on the basis of frequency.

Let's understand it with an example.

Download the data using the following link; we will use the same data for almost all tests:

## Data for tests
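
The examples in this post read the data from a SAS library named `a`. As a minimal sketch of assigning that library, assuming a hypothetical folder path (replace it with the location where you saved the file):

```sas
/* Hypothetical path: point this at the folder that holds the downloaded data */
libname a "/path/to/your/data";
```

Once the library is assigned, the dataset can be referenced as `a.sample_1`, as in the steps that follow.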

*Download the data, keep it at a location and assign the location to library "a" in SAS.*

Let's try to answer the following questions:

**1. How many males and females are there in the data?**

    proc freq data = a.sample_1;
        tables female / out = gender_freq;
    run;

**2. How many males and females are there in each program type?**

    proc freq data = a.sample_1;
        tables female * prog / list out = prog_gender_freq;
    run;

The `list` option prints the table in list format. If it is omitted, the output is printed in matrix (crosstab) format; the output dataset remains the same either way.

    proc freq data = a.sample_1;
        tables female * prog / out = prog_gender_freq;
    run;

**The matrix format shows two important statistics: Row Pct and Col Pct, which are simply the row percentage and the column percentage.**

In the cell where female = 0 and prog = 1, the frequency is 21 people.

The Row Pct for that cell (prog = 1 within the male population, female = 0) is 23.08, calculated as 21 / (21 + 47 + 23). It means that, in the male population, prog 1 accounts for 23.08% of all programs.
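
If the row and column percentages are also needed in the output dataset, one option is the `outpct` option of the `tables` statement. This is a minimal sketch reusing the dataset from above; the output name `prog_gender_pct` is chosen here only for illustration:

```sas
proc freq data = a.sample_1;
    /* outpct adds row and column percentages (PCT_ROW, PCT_COL) to the out= dataset */
    tables female * prog / out = prog_gender_pct outpct;
run;

/* Inspect the saved counts and percentages */
proc print data = prog_gender_pct;
run;
```

For the row with female = 0 and prog = 1, `PCT_ROW` should correspond to the 23.08 shown in the matrix output.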

If we don't put an asterisk (`*`) between the variables, SAS treats them as separate one-way tables, and the output dataset contains the frequencies of the last variable only.

Try:

    proc freq data = a.sample_1;
        tables female prog / list out = prog_gender_freq;
    run;
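
To keep the frequencies of both variables, one approach is to give each variable its own `tables` statement with its own `out=` dataset. A minimal sketch, where the dataset name `prog_freq` is chosen here only for illustration:

```sas
proc freq data = a.sample_1;
    tables female / out = gender_freq;   /* one-way frequencies of female */
    tables prog   / out = prog_freq;     /* one-way frequencies of prog */
run;
```

Each output dataset then holds the counts and percentages for a single variable.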

Another use of Proc Freq is performing a Chi-Square test: