R Tutorial 3.0
Let's use one of R's built-in datasets for this tutorial.
Create a new data frame "Data_1" as a copy of the "iris" data. Use the simple R syntax:
Data_1 = iris
Voila! The data is created.
I want to check how many rows and columns there are in the data.
Code : dim(Data_1)
Here is the answer in the console:
[1] 150 5
It means 150 rows and 5 columns.
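As a small companion to dim(), here is a minimal sketch using the base R functions nrow(), ncol() and str(). It is only an illustration of other ways to inspect the size and structure of the data; the variable names n_rows and n_cols are made up for this example.

```r
# Work on a copy of the built-in iris data, as in the tutorial
Data_1 <- iris

n_rows <- nrow(Data_1)   # number of rows (observations)
n_cols <- ncol(Data_1)   # number of columns (variables)
print(c(n_rows, n_cols))

# str() gives a compact overview: one line per column with its type
str(Data_1)
```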
I want to check a sample of the data.
Code :
head(Data_1) # will show top 6 observations of data
or
tail(Data_1) # will show bottom 6 observations of data
and you can see the sample.
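Both head() and tail() accept an n argument if 6 observations is not what you want. A minimal sketch, with top_3 and bottom_2 being names invented for this illustration:

```r
# Work on a copy of the built-in iris data, as in the tutorial
Data_1 <- iris

top_3    <- head(Data_1, n = 3)   # first 3 observations
bottom_2 <- tail(Data_1, n = 2)   # last 2 observations

print(top_3)
print(bottom_2)
```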
I just want the column names of the data.
Code : names(Data_1)
Here is the answer in the console:
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
I want to see specific rows/columns of the data.
Code : Data_1[c(2,3,4),]
Will show the 2nd, 3rd and 4th observations and all columns.
Code : Data_1[,c(1,3)]
Will show all rows of the 1st and 3rd columns.
Code : Data_1[c(2,3,4), 1:3]
Will show the 2nd, 3rd and 4th observations of the 1st to 3rd columns.
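One thing to watch with this syntax: when you pick a single column, R returns a plain vector by default instead of a data frame. The sketch below shows the difference and how drop = FALSE keeps the data frame shape. The names one_col_vector and one_col_frame are made up for this illustration.

```r
# Work on a copy of the built-in iris data, as in the tutorial
Data_1 <- iris

# Selecting one column returns a plain numeric vector by default
one_col_vector <- Data_1[, 1]

# drop = FALSE keeps the result as a one-column data frame
one_col_frame <- Data_1[, 1, drop = FALSE]

print(class(one_col_vector))
print(class(one_col_frame))
print(dim(one_col_frame))
```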
Using the above syntax, we can perform vertical sub-setting on a data frame, i.e. keeping or dropping variables.
Keeping variables :
Suppose you want to keep the 2nd to 4th variables:
Data_2 = Data_1[ , 2:4]
You can also keep columns by specifying their names:
Data_3 = Data_1[ , c("Sepal.Width", "Petal.Length","Petal.Width")]
The same can be written using the indirect (vector) method:
to_keep = c("Sepal.Width", "Petal.Length","Petal.Width")
Data_3 = Data_1[ , to_keep]
Doesn't it look like a SAS macro?
Dropping variables :
Suppose you want to drop the 1st to 3rd variables:
Data_4 = Data_1[ , -(1:3)]
You can also drop columns by specifying their names:
to_drop = names(Data_1) %in% c("Sepal.Width","Petal.Width")
Data_5 = Data_1[!to_drop]
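Dropping by name can also be written the other way round, by working out which names to keep. The sketch below puts the %in% approach next to a setdiff() version. The names drop_names, keep_names and Data_5b are invented for this illustration.

```r
# Work on a copy of the built-in iris data, as in the tutorial
Data_1 <- iris

drop_names <- c("Sepal.Width", "Petal.Width")

# Approach from the tutorial: logical vector marking the columns to drop
to_drop <- names(Data_1) %in% drop_names
Data_5  <- Data_1[!to_drop]

# Alternative: compute the names to keep, then select them
keep_names <- setdiff(names(Data_1), drop_names)
Data_5b    <- Data_1[, keep_names]

print(names(Data_5))
print(names(Data_5b))
```

Both versions should leave the same three columns; which one reads better is mostly a matter of taste.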
Clear everything with
rm(list = ls())
Note that this also removes Data_1, so recreate it before continuing:
Data_1 = iris
All right ... Let's practice logical sub-setting now.
Create a data frame from Data_1 with observations pertaining to the species setosa only:
setosa = Data_1[ Species == "setosa" , ]
OMG! It is giving an error.
Error in `[.data.frame`(Data_1, Species == "setosa", ) :
object 'Species' not found
Basically, R is not able to find the vector Species, because it lives inside Data_1. There are two ways to let R know where it is:
1. We specifically tell R that "Species" is within the data "Data_1" using $.
setosa = Data_1[ Data_1$Species == "setosa" , ]
2. We first attach "Data_1" to R's search path, so that its columns can be referred to directly by name.
attach(Data_1)
setosa = Data_1[Species == "setosa" , ]
Now it would work fine.
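A third option, not used in this tutorial, is the base R function with(), which evaluates an expression using the columns of a data frame without attaching it. This can be handy because attached data can get masked by other objects with the same name. A minimal sketch:

```r
# Work on a copy of the built-in iris data, as in the tutorial
Data_1 <- iris

# with() evaluates the condition using the columns of Data_1 as variables,
# so neither $ nor attach() is needed inside the expression
setosa <- Data_1[with(Data_1, Species == "setosa"), ]

print(nrow(setosa))
print(unique(as.character(setosa$Species)))
```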
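Create a data frame from Data_1 for the species setosa with a petal length of at least 1.4 (Data_1 is already attached, so attach() need not be called again):
setosa_1 = Data_1[which(Species == "setosa" & Petal.Length >= 1.4),]
& is used for logical AND operations.
| is used for logical OR operations.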
Create a data frame from Data_1 for the species setosa and virginica:
setosa_virginica = Data_1[which(Species == "setosa" | Species == "virginica") ,]
** Using which() is not mandatory here.
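which() starts to matter once the data has missing values. The sketch below deliberately puts one NA into a copy of the data (a hypothetical missing value, purely for illustration; the real iris data has none) and compares the two ways of sub-setting. The names Data_na, cond, without_which and with_which are made up for this example.

```r
# Work on a copy of the built-in iris data, as in the tutorial
Data_1 <- iris

# Hypothetical missing value, introduced only for this illustration
Data_na <- Data_1
Data_na$Petal.Length[1] <- NA

# The condition is NA for the first row (TRUE & NA gives NA)
cond <- Data_na$Species == "setosa" & Data_na$Petal.Length >= 1.4

# Without which(): the NA in the condition yields an extra row full of NAs
without_which <- Data_na[cond, ]

# With which(): only positions where the condition is TRUE are kept
with_which <- Data_na[which(cond), ]

print(nrow(without_which))
print(nrow(with_which))
```

The version without which() should have exactly one row more, and that extra row contains only NA values.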
The logical sub-setting can also be done using the "subset" function.
To get only setosa data where petal length is more than or equal to 1.4:
setosa_only = subset(Data_1,(Species == "setosa" & Petal.Length >= 1.4))
To get data for the setosa and virginica species, but only two columns:
setosa_virginica = subset(Data_1,(Species == "setosa" | Species == "virginica"), select =c("Species","Petal.Length"))
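To see that subset() and the bracket syntax give the same result, here is a minimal sketch that builds the two-species, two-column data both ways. It uses %in% as a shorter way of writing the OR condition. The names by_subset and by_bracket are invented for this illustration.

```r
# Work on a copy of the built-in iris data, as in the tutorial
Data_1 <- iris

wanted <- c("setosa", "virginica")

# subset(): columns can be referred to directly by name
by_subset <- subset(Data_1, Species %in% wanted,
                    select = c(Species, Petal.Length))

# Bracket syntax: the data frame has to be named explicitly with $
by_bracket <- Data_1[Data_1$Species %in% wanted,
                     c("Species", "Petal.Length")]

print(dim(by_subset))
print(isTRUE(all.equal(by_subset, by_bracket)))
```

Note that Species is a factor, so the unused level "versicolor" is still listed in the result; droplevels() can remove it if needed.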
That's enough to know about sub-setting for now. Enjoy reading our other articles and stay tuned with us.
Kindly do provide your feedback in the 'Comments' Section and share as much as possible.