### How to subset a data logically in SAS

In day to day work, we need to fetch only selective data for analysis purposes e.g. for one analysis, we need data only for female customer, for another data for male customers between age 14 - 40 might be required. Such type of conditional sub-settings, we have named as logical sub-setting.

How to do it ... explore here with examples.

This article has been written in continuation of

This article has been written in continuation of

**Base SAS - Logical Sub-setting part 1 (Covering Where).**

**Let's begin learning IF statement for sub-setting in contrast with Where, so that we would be clear about their applicability along with.**
Let's use the same data, copy and paste the below code to make the data (Insurance).

Data Insurance;

infile datalines ;

input Name $ Gender $ Year Claim_amount Policy_Type : $20. Sum_assured Monthly_Income

;

Datalines;

VINOD M 2009 6069 ENDOWMENT 300000 21000

DEEPA F 2010 9911 TERM 100000 7000

DINESH M 2009 7147 TERM 400000 28000

MONAL F 2011 9418 TERM 200000 14000

NEETU F 2011 9635 ENDOWMENT 200000 14000

VINOD M 2010 6105 WHOLELIFE 100000 7000

TANGO M 2010 5914 TERM 400000 28000

RAJAT M 2011 7599 ENDOWMENT 300000 21000

NEETU F 2011 1466 TERM 200000 14000

MONAL F 2012 5265 TERM 200000 14000

BALJEET M 2010 8197 ENDOWMENT 500000 35000

SANTOSH M 2011 1078 ENDOWMENT 400000 28000

RAJAT M 2012 1006 WHOLELIFE 400000 28000

DINESH M 2010 3739 WHOLELIFE 500000 35000

DEEPA F 2011 3818 TERM 200000 14000

;

Run;

**Now let's solve the following 9 queries, that we have solved in part 1 of this article using Where statement; Now we would use IF statement:**

Create the following data sets by using the

**IF**:
1. List out the insured people who has term policy

2. Insured getting salary in the range 10000-25000

3. People who claimed in year 2009 and 2010

4. two different dataset for male and female.

5. Insured whose name starting with ‘S ‘ or ‘V’

6. Insured whose name ending with ‘H’ OR ‘T’

7. Insured whose name consist at least one ‘N’

8. Insured whose name does not consist ‘N’

9. Insured who has term policy and getting salary in range of 15000-30000 claimed in the year 2009.

1. List out the insured people who has term policy.

data Insurance_1;

set insurance;

if policy_type = "TERM";

Run;

It works same as where statement.

2. Insured getting salary in the range 10000-25000

data
Insurance_2;

set insurance;

If monthly_income between 10000 and 25000;

Run;

"Between" doesn't work with IF

data Insurance_2;

set insurance;

if monthly_income > 10000 and monthly_income
< 25000;

Run;

It works same as where statement.

3. People who claimed in year 2009 and 2010

data
Insurance_3;

set insurance;

If
year in (2009,2010);

Run;

It works same as where statement.

4. two different dataset for male and female.

data
female ;

set insurance ;

if gender='F';

run;

data male ;

set insurance ;

if gender='M';

run;

It works same as where statement.

5. Insured whose name starting with ‘S ‘ or ‘V’

data Insurance_5;

set insurance;

WHERE name like 'S%' or name like 'V%' ;

Run;

Like and "%" don't work with IF

6. Insured whose name ending with ‘H’ OR ‘T’

data Insurance_6;

set insurance;

WHERE name like '%H' or name like '%T' ;

Run;

Like and "%" don't work with IF

data Insurance_7;

set insurance;

WHERE name contains 'N' ;

Run;

"Contains" don't work with IF

8. Insured whose name does not consist ‘N’

data Insurance_8;

set insurance;

WHERE name not contains 'N' ;

Run;

Already told you, "Contains" don't work with IF

9. Insured who has term policy and getting salary in range of 15000-30000 claimed in the year 2009.

Data high_income_9;

set insurance;

if policy_type = 'TERM'

and (Monthly_income
ge 15000 and monthly_income
le 30000)

and year = 2009 ;

Run;

It works same as where statement.

## So Finally we can summarize the contrast of Where as if as :

So can we conclude, Where can be applied everywhere but if only in few case...when why should i waste my energy in learning

**IF**, I can easily spend my life with**Where.**

Please don't be judgmental so early, let's solve one last question and see the real contrast between where and IF.

Question:

**List out the insured people who has sum assured to claim amount ratio as more than 100.**

**First let me use If :**

Data high_income_11;

set insurance;

If Sum_assured
/Claim_amount > 100;

Run;**it works well.**

**First let's try the same with Where :**

**Something went wrong !**

In this case, if helped me but not Where ...

Let's understand how If an Where work; Why there is difference between the two:

**Where condition filter the data before input buffer whereas if condition filter the data after PDV.**

**Where condition is faster because not all observation have to be read.**

**Where statement work only on existing variable, but If can work on derived/calculated variables too.**

That's all for now ... I hope now you are clear about where, If, logical sub-setting and all the stuff covered.

Enjoy reading our other articles and stay tuned with ...

Kindly do provide your feedback in the 'Comments' Section and share as much as possible.

## No comments:

## Post a Comment

Do provide us your feedback, it would help us serve your better.