One to many and many to many joins

<<< Click here to go back


Till now, all the merging examples we have seen were having one to one relationship between matching keys. 

We also need to understand how R behaves, if the matching keys are having one to many or many to many type of relationship.


One to Many or Many to many joins in R

For understanding the one to many (same as "many to one") and many to many merging, let's prepare three datasets:

#Clearing workspace
rm(list = ls())

Data_1 = data.frame(CustomerId = c(1:5), Product = c(rep("Apple", 2), rep("Banana", 3)))

Data_2 = data.frame( Product = c("Apple","Banana"), price = c("14","20"))

Data_3 = data.frame(Product = c(rep("Apple", 2), rep("Banana", 3)), Color = c("Red","Green","Yellow","Green","Black"))
Click to enlarge




These datasets have one to many and many to many relationships in matching keys:











Let's first do a one to many merge
Data_one_to_many

Data_one_to_many =  merge(x = Data_1 , y = Data_2, by = "Product")


It merges data as one would expect  >>> 






And now it's time for doing a many to many merge



Try_many_to_many = merge(Data_1,Data_3,by = "Product")




What's that ? Yes, it is nothing but sort of a cartesian product of the datasets. 

2 instances of Apple in Data_1 cross joined with 2 instances of Apple  in Data_3 and 4 observations are created.

Similarly, 3 instances of Banana in Data_1 cross joined with 3 instances of Banana in Data_3 and 9 observations are created.

We can conculde that in case of many to many, it behaves like PROC SQL join of SAS.

No comments:

Post a Comment

Do provide us your feedback, it would help us serve your better.