How to modify data with IF-Else conditions

R Tutorial 4.0

It is very important to know how to apply logic while learning a programming language. This article covers how to use IF-Else conditions in R to perform logical operations. We will see how they are used to create new variables, impute missing values, and so on.

Let's consider an inbuilt R dataset for learning IF-Else based logical operations:

# We make a copy of the inbuilt R dataset cars, which has only two columns: speed and dist

cars_data = cars

# We will now work on the copied dataset. Let's create a new variable: pickup

cars_data$pickup = cars_data$speed / cars_data$dist

# Or we can first attach the dataset to the search path, so that its columns can be referenced by name

attach(cars_data)

cars_data$pickup = speed / dist
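If you would rather not attach a dataset, base R's with() evaluates an expression inside a data frame without changing the search path. The lines below are a minimal sketch of the same pickup calculation, assuming the same cars copy as above:

```r
# Make a fresh copy of the inbuilt cars data
cars_data <- cars

# with() looks up speed and dist inside cars_data for this one expression only
cars_data$pickup <- with(cars_data, speed / dist)

head(cars_data)
```

Columns added after attach() is called are not visible by bare name, so with() or the cars_data$ prefix is usually the safer choice for later steps.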

Let's now learn the basic method for the logical IF operation

Syntax :   ifelse(condition, "What if true", "What if false")
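To see what this syntax does, it helps to try it on a tiny vector first. ifelse() checks the condition for every element and returns a vector of the same length. The vector and labels below are made up for illustration only:

```r
# A small, made-up vector of speeds
speeds <- c(4, 10, 25)

# The condition is tested element by element
speed_label <- ifelse(speeds > 9, "fast", "slow")

print(speed_label)
# "slow" "fast" "fast"
```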

# Simplest ifelse: will create a column with two values

cars_data$category = ifelse(cars_data$pickup > 1, "High", "Low")

# Simplest nested ifelse: will create a column with three values

cars_data$category = ifelse(cars_data$pickup > 1, "High", ifelse(cars_data$pickup > 0.5, "Medium", "Low"))
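When the conditions are simple numeric cut-offs like these, base R's cut() is one possible alternative to nesting. This is only a sketch of the same three bands. The column name category_cut is our own choice, and note that cut() returns a factor rather than a character vector:

```r
# Rebuild the working copy so that this example runs on its own
cars_data <- cars
cars_data$pickup <- cars_data$speed / cars_data$dist

# Bands: up to 0.5 is Low, above 0.5 and up to 1 is Medium, above 1 is High
cars_data$category_cut <- cut(cars_data$pickup,
                              breaks = c(-Inf, 0.5, 1, Inf),
                              labels = c("Low", "Medium", "High"))

table(cars_data$category_cut)
```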

A slightly more complex If-Else-If operation using another inbuilt R dataset: iris

# Let's first create Data_1 as a copy of iris 

Data_1 = iris

# Microsoft Excel style coding for nested IFs: will create a column with 6 variants
# with() lets us refer to the columns of Data_1 by name inside the expression

Data_1$category = with(Data_1,
  ifelse(Species == "setosa" & Sepal.Length < 5, "C-1",
  ifelse(Species == "setosa" & Sepal.Length >= 5, "C-2",
  ifelse(Species == "versicolor" & Sepal.Length < 6, "C-3",
  ifelse(Species == "versicolor" & Sepal.Length >= 6, "C-4",
  ifelse(Species == "virginica" & Sepal.Length < 7, "C-5",
  ifelse(Species == "virginica" & Sepal.Length >= 7, "C-6", "Other")))))))
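Deeply nested IFs get hard to read as the number of cases grows. One possible alternative, shown here only as a sketch, is to keep the per-species cut-offs in named lookup vectors. The names thresholds, offsets and category_lookup are hypothetical and chosen just for this example:

```r
Data_1 <- iris

# Sepal.Length cut-off for each species, taken from the six rules above
thresholds <- c(setosa = 5, versicolor = 6, virginica = 7)

# Number of the "below cut-off" category for each species (C-1, C-3, C-5)
offsets <- c(setosa = 1, versicolor = 3, virginica = 5)

sp <- as.character(Data_1$Species)

# Adding TRUE (treated as 1) moves a row to the "at or above cut-off" category
band <- offsets[sp] + (Data_1$Sepal.Length >= thresholds[sp])

Data_1$category_lookup <- paste0("C-", band)

table(Data_1$category_lookup)
```

This sketch assumes every row belongs to one of the three listed species, so it has no "Other" fallback.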


How to use the ifelse function for missing value imputation

# Let's first define a vector
xyz = c(11,12,NA,13,15)

# To identify the position of the missing value
which(is.na(xyz))

# The summary function works even when missing values are present
summary(xyz)

# The mean function, however, returns NA when missing values are present
mean(xyz)

# To make it work, we can use a simple trick: drop the missing values with na.rm
mean(xyz, na.rm = TRUE)

# Now, let's impute the missing value with a hard-coded value
abc = ifelse(is.na(xyz), 0, xyz)

# Now, let's impute the missing value with the mean of the non-missing values
pqr = ifelse(is.na(xyz), mean(xyz, na.rm = TRUE), xyz)
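If the same imputation is needed for several vectors or columns, it can be wrapped in a small helper. The function name impute_mean below is hypothetical, not a built-in R function, and the code is only a minimal sketch of the idea:

```r
# Hypothetical helper: replace NA values with the mean of the non-missing values
impute_mean <- function(x) {
  ifelse(is.na(x), mean(x, na.rm = TRUE), x)
}

xyz <- c(11, 12, NA, 13, 15)
pqr <- impute_mean(xyz)

print(pqr)
# 11.00 12.00 12.75 13.00 15.00
```

The mean of 11, 12, 13 and 15 is 12.75, so that is the value filled in at the third position.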
