Market Basket Analysis in R with an Example

How can we identify products that can be bundled together to increase sales? One answer is Market Basket Analysis, which is commonly carried out with the Apriori algorithm.

Do you know how to run the Apriori algorithm in R?


This article continues the previous article, which covered the basics of Market Basket Analysis.


We use a very common example, a grocery store, to walk you through the algorithm step by step. A sample snapshot of the data is shown on the right:



There are only two variables in the dataset:

1) Customer Id – unique identity number of the customer

2) Products – products bought by the customer

   



We will now cover all the steps to run the Apriori algorithm in R.


1. Let's first import the data.

Code 1: 

mba_data <- read.csv("C:\\MBA_data_new.csv")  # create a data frame by importing the CSV file




[Screenshot: the mba_data data frame in R]

mba_data (a data frame in R) has two variables, Customer_Id and Products.

Each customer id has bought some products. For example, customer id 1 has bought:

Bread
Butter
Eggs
Milk









2. We cannot run the Apriori algorithm directly on the imported data. We first need to aggregate it by customer id and transform it into a different format.

Code 2 :

trans <- split(mba_data$Products, mba_data$Customer_Id)   # one vector of products per customer id

head(trans)                               # check the first 6 transactions using the head() function
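
To see what this reshaping does, here is a minimal base-R sketch on a small made-up data frame. The column names match the ones used above, but the rows and the names toy_data and toy_trans are hypothetical and are not taken from MBA_data_new.csv.

```r
# Hypothetical long-format data: one row per (customer, product) pair
toy_data <- data.frame(
  Customer_Id = c(1, 1, 1, 1, 2, 2, 3),
  Products    = c("Bread", "Butter", "Eggs", "Milk", "Buns", "Mustard", "Milk"),
  stringsAsFactors = FALSE
)

# split() groups the Products column by Customer_Id and returns a named list
# with one character vector of products per customer
toy_trans <- split(toy_data$Products, toy_data$Customer_Id)

print(toy_trans[["1"]])   # products bought by customer id 1
print(length(toy_trans))  # number of distinct customers
```

Each list element is one basket, which is the shape the later steps work on.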

[Screenshot: the top transactions]




We have transformed the data into the desired format. To run the Apriori algorithm, we first need to install and load the arules package using the code below.

Code 3:

install.packages("arules")    # install the arules package

library(arules)               # load the arules package


[Screenshots: R console output while installing and loading arules]

We have installed the arules package. Now we can run the Apriori algorithm using the following statement:

Code 4:

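# minimum support 10%, minimum confidence 50%, rules with exactly 2 items (one on each side)
rules <- apriori(trans, parameter = list(support = 0.10, confidence = 0.5, minlen = 2, maxlen = 2))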
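
To make the four parameters more concrete, here is a rough base-R sketch that brute-forces every one-item-to-one-item rule (the rule length that minlen=2 and maxlen=2 restrict the search to) and keeps those meeting the support and confidence thresholds. This is not the Apriori algorithm itself and not how arules works internally; the baskets and the helper name pair_rules are made up for illustration.

```r
# Hypothetical baskets (not the article's data)
toy_trans <- list(
  c("Buns", "Mustard", "Milk"),
  c("Buns", "Mustard"),
  c("Buns", "Bread"),
  c("Milk", "Bread"),
  c("Milk", "Eggs")
)

# Hypothetical helper: enumerate all rules {a} => {b} and filter by thresholds
pair_rules <- function(baskets, min_support, min_confidence) {
  n <- length(baskets)
  items <- sort(unique(unlist(baskets)))
  # Share of baskets that contain every item in `set`
  supp <- function(set) {
    sum(vapply(baskets, function(b) all(set %in% b), logical(1))) / n
  }
  # All ordered pairs of distinct items
  grid <- expand.grid(lhs = items, rhs = items, stringsAsFactors = FALSE)
  grid <- grid[grid$lhs != grid$rhs, ]
  # support = share of baskets with both items; confidence = support / support(lhs)
  grid$support <- mapply(function(a, b) supp(c(a, b)), grid$lhs, grid$rhs)
  grid$confidence <- grid$support / vapply(grid$lhs, supp, numeric(1))
  out <- grid[grid$support >= min_support & grid$confidence >= min_confidence, ]
  rownames(out) <- NULL
  out
}

toy_rules <- pair_rules(toy_trans, min_support = 0.10, min_confidence = 0.5)
print(toy_rules)
```

Raising min_support or min_confidence shrinks the list of rules, which is the same trade-off you tune through the parameter list passed to apriori().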




[Screenshot: R output of the apriori() call]



We have successfully derived 66 rules (reported in the second-to-last line of the output above).

Now let's have a look at those rules.


Code 5:

inspect(rules)        # print the rules along with their quality measures


[Screenshot: output of inspect(rules)]



Let's manually validate the first two rules (buns => mustard and mustard => buns).

I hope you have read our last blog post on Market Basket Analysis, in which we explained support, confidence and lift in detail.

Please read that article before going through the calculation below, so that you can follow it more easily.
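
As a rough illustration of how such a manual check could be done, here is a base-R sketch that computes support, confidence and lift for a pair of rules by hand. The five baskets and the helper names support_of and rule_measures are made up, so the numbers will not match the article's output; the formulas are the standard definitions of the three measures.

```r
# Hypothetical baskets (not the article's data), one character vector per customer
toy_trans <- list(
  "1" = c("Buns", "Mustard", "Milk"),
  "2" = c("Buns", "Mustard"),
  "3" = c("Buns", "Bread"),
  "4" = c("Milk", "Bread"),
  "5" = c("Milk", "Eggs")
)
n <- length(toy_trans)

# Share of baskets that contain every item in `items`
support_of <- function(items) {
  sum(vapply(toy_trans, function(b) all(items %in% b), logical(1))) / n
}

# Quality measures for the rule lhs => rhs
rule_measures <- function(lhs, rhs) {
  supp <- support_of(c(lhs, rhs))   # share of baskets with both sides
  conf <- supp / support_of(lhs)    # share of lhs baskets that also contain rhs
  lift <- conf / support_of(rhs)    # confidence relative to the overall share of rhs
  c(support = supp, confidence = conf, lift = lift)
}

buns_to_mustard <- rule_measures("Buns", "Mustard")
mustard_to_buns <- rule_measures("Mustard", "Buns")
print(round(buns_to_mustard, 3))
print(round(mustard_to_buns, 3))
```

In this toy data both rules share the same support and lift, but their confidence differs, because confidence depends on which item sits on the left-hand side.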









If you want to export the rules to a CSV file, you can use the code below.
Code 6:

write(rules, file = "mba_rules1.csv", sep = ",", row.names = FALSE)


I hope this article is useful in understanding market basket analysis.



