Association Rule Mining using Arules in R

Let's consider a dataset. arules expects the data as a transactions object, with one basket of items per transaction. A long-format CSV like ours, with one row per customer/product pair, has to be converted first. In the example below we read a CSV file and then convert it to transactions.

Let's create the dataset (saved as ProductsSmall.csv, the file the script below reads):

CustomerId,Products
100, Savings Pre
100,Home20
101,Home20
102,Checking Zero
102,Home20
102,Gold10
103,Home20
103,Gold20
104,Checking Zero
104,Savings Pre
104,Home20
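
To try the script below without creating the file by hand, the rows above can be written out from R. This is only a convenience sketch: the file name ProductsSmall.csv is the one the script reads, and the rows are copied as listed, including the space after the comma in the first data row.

```r
# Write the sample rows to the CSV file that ArulesUsage.R reads.
# Rows are copied verbatim from the listing above (leading space included).
rows <- c(
  "CustomerId,Products",
  "100, Savings Pre",
  "100,Home20",
  "101,Home20",
  "102,Checking Zero",
  "102,Home20",
  "102,Gold10",
  "103,Home20",
  "103,Gold20",
  "104,Checking Zero",
  "104,Savings Pre",
  "104,Home20"
)
writeLines(rows, "ProductsSmall.csv")

# Quick check: read it back and look at the structure
ds <- read.csv("ProductsSmall.csv")
str(ds)
```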


ArulesUsage.R
------------

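library(arules)  # transactions class, apriori(), inspect(), is.redundant()
library(pmml)    # pmml() converter for the mined rules
library(XML)     # saveXML()

# Read the long-format CSV: one row per (CustomerId, Products) pair
ds <- read.csv("ProductsSmall.csv")
colnames(ds)

# Group products by customer and coerce the list of baskets to transactions
trans <- as(split(ds[, "Products"], ds[, "CustomerId"]), "transactions")
summary(trans)  # also lists the most frequent items

# Mine rules that contain at least two items (minlen = 2)
rules <- apriori(trans, parameter = list(support = 0.14,
                                         confidence = 0.05,
                                         minlen = 2))
inspect(rules)

# Sort by lift and drop redundant rules
rules <- sort(rules, by = "lift")
rules_output <- rules[!is.redundant(rules)]
inspect(rules_output)

# Export the remaining rules as PMML
saveXML(pmml(rules_output), "Apriori_ProductsSmall.pmml")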
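
One thing to watch in the input: the first data row is "100, Savings Pre", with a space after the comma. read.csv keeps that space by default, so " Savings Pre" and "Savings Pre" end up as two different items, and the exported model below lists six items rather than five. If the space is unintentional (an assumption on my part), one possible fix is to pass strip.white = TRUE when reading the file. The sketch below uses a cut-down, three-row excerpt held in a hypothetical csv_text variable, purely for illustration.

```r
# Cut-down excerpt of the data, used only to show the effect of strip.white
csv_text <- "CustomerId,Products
100, Savings Pre
100,Home20
104,Savings Pre"

# Default behaviour: the leading space is kept, giving two 'Savings Pre' variants
raw <- read.csv(text = csv_text)
unique(as.character(raw$Products))

# strip.white = TRUE trims leading/trailing spaces in unquoted fields
clean <- read.csv(text = csv_text, strip.white = TRUE)
unique(as.character(clean$Products))
```

The same argument should work on the real file, for example read.csv("ProductsSmall.csv", strip.white = TRUE). The mined rules would then differ from the output shown in this post, which was produced with the space still in place.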


Apriori_ProductsSmall.pmml
--------------------------
<?xml version="1.0"?>
<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_3 http://www.dmg.org/pmml/v4-3/pmml-4-3.xsd">
 <Header copyright="Copyright (c) 2018 Binu" description="arules association rules model">
  <Extension name="user" value="Binu" extender="Rattle/PMML"/>
  <Application name="Rattle/PMML" version="1.4"/>
  <Timestamp>2018-11-27 21:58:39</Timestamp>
 </Header>
 <DataDictionary numberOfFields="2">
  <DataField name="transaction" optype="categorical" dataType="string"/>
  <DataField name="item" optype="categorical" dataType="string"/>
 </DataDictionary>
 <AssociationModel functionName="associationRules" numberOfTransactions="5" numberOfItems="6" minimumSupport="0.14" minimumConfidence="0.05" numberOfItemsets="6" numberOfRules="14">
  <MiningSchema>
   <MiningField name="transaction" usageType="group"/>
   <MiningField name="item" usageType="active"/>
  </MiningSchema>
  <Item id="1" value=" Savings Pre"/>
  <Item id="2" value="Checking Zero"/>
  <Item id="3" value="Gold10"/>
  <Item id="4" value="Gold20"/>
  <Item id="5" value="Home20"/>
  <Item id="6" value="Savings Pre"/>
  <Itemset id="1" numberOfItems="1">
   <ItemRef itemRef="3"/>
  </Itemset>
  <Itemset id="2" numberOfItems="1">
   <ItemRef itemRef="2"/>
  </Itemset>
  <Itemset id="3" numberOfItems="1">
   <ItemRef itemRef="6"/>
  </Itemset>
  <Itemset id="4" numberOfItems="1">
   <ItemRef itemRef="1"/>
  </Itemset>
  <Itemset id="5" numberOfItems="1">
   <ItemRef itemRef="5"/>
  </Itemset>
  <Itemset id="6" numberOfItems="1">
   <ItemRef itemRef="4"/>
  </Itemset>
  <AssociationRule support="0.2" confidence="1" lift="2.5" antecedent="1" consequent="2"/>
  <AssociationRule support="0.2" confidence="0.5" lift="2.5" antecedent="2" consequent="1"/>
  <AssociationRule support="0.2" confidence="1" lift="2.5" antecedent="3" consequent="2"/>
  <AssociationRule support="0.2" confidence="0.5" lift="2.5" antecedent="2" consequent="3"/>
  <AssociationRule support="0.2" confidence="1" lift="1" antecedent="4" consequent="5"/>
  <AssociationRule support="0.2" confidence="0.2" lift="1" antecedent="5" consequent="4"/>
  <AssociationRule support="0.2" confidence="1" lift="1" antecedent="6" consequent="5"/>
  <AssociationRule support="0.2" confidence="0.2" lift="1" antecedent="5" consequent="6"/>
  <AssociationRule support="0.2" confidence="1" lift="1" antecedent="1" consequent="5"/>
  <AssociationRule support="0.2" confidence="0.2" lift="1" antecedent="5" consequent="1"/>
  <AssociationRule support="0.2" confidence="1" lift="1" antecedent="3" consequent="5"/>
  <AssociationRule support="0.2" confidence="0.2" lift="1" antecedent="5" consequent="3"/>
  <AssociationRule support="0.4" confidence="1" lift="1" antecedent="2" consequent="5"/>
  <AssociationRule support="0.4" confidence="0.4" lift="1" antecedent="5" consequent="2"/>
 </AssociationModel>
</PMML>
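
To see where the numbers in the rules come from, the three measures can be recomputed by hand for the first rule in the file (itemset 1 => itemset 2, that is Gold10 => Checking Zero). The sketch below uses base R only and is not how arules computes them internally. The helper has() is a name made up for this example, and the data frame simply repeats the rows of the CSV.

```r
# Same rows as ProductsSmall.csv (leading space in " Savings Pre" kept)
ds <- data.frame(
  CustomerId = c(100, 100, 101, 102, 102, 102, 103, 103, 104, 104, 104),
  Products = c(" Savings Pre", "Home20", "Home20", "Checking Zero", "Home20",
               "Gold10", "Home20", "Gold20", "Checking Zero", "Savings Pre",
               "Home20"),
  stringsAsFactors = FALSE
)

# One basket of products per customer
baskets <- split(ds$Products, ds$CustomerId)
n <- length(baskets)  # number of transactions

# has(): hypothetical helper, TRUE for each basket containing all given items
has <- function(items) {
  vapply(baskets, function(b) all(items %in% b), logical(1))
}

lhs <- "Gold10"
rhs <- "Checking Zero"

support    <- sum(has(c(lhs, rhs))) / n        # share of baskets with both
confidence <- support / (sum(has(lhs)) / n)    # P(rhs | lhs)
lift       <- confidence / (sum(has(rhs)) / n) # confidence / P(rhs)

c(support = support, confidence = confidence, lift = lift)
```

Only customer 102 has both products, so the support is 1/5 = 0.2. Every basket with Gold10 also has Checking Zero, so the confidence is 1. Checking Zero appears in 2 of 5 baskets, so the lift is 1 / 0.4 = 2.5, matching the first AssociationRule element above.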



