This is Part 4 of the series. It shows how to generate confident association rules by using the R packages arules and arulesViz. To test the script, you must have already completed the earlier parts.
The Basket Data
In [Part 2]( {{site.url}}{{site.baseurl}}{% post_url 2018-10-15-association-rule-read-transactions %} ), we read the following five shopping baskets into `transactions`, an object of the arules `transactions` class.
f,a,c,d,g,l,m,p
a,b,c,f,l,m,o
b,f,h,j,o
b,c,k,s,p
a,f,c,e,l,p,m,n
In an earlier part, we ran `arules::apriori` with the parameter `target` set to `"frequent itemsets"`. When we assign a value to the parameter `support` and set `minlen` and `maxlen` equal to each other, the `apriori` function returns all itemsets of that specific length whose support is at least the minimum.
In this part, we generate association rules that meet a given threshold on a selected measure. A measure evaluates how certain or how strong a rule is. Common measures include confidence, lift, and leverage.
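To make the three measures concrete, here is a minimal base R sketch that computes them by hand for the rule {a} => {m} on the five baskets above. It does not use arules; the helper `itemset_support` is a hypothetical name introduced only for this illustration, and the formulas are the standard definitions (confidence = support of the whole rule divided by support of the left-hand side, lift = confidence divided by support of the right-hand side, leverage = support of the rule minus the product of the two sides' supports).

```r
# The five baskets from Part 2, written as plain character vectors.
baskets <- list(
  c("f", "a", "c", "d", "g", "l", "m", "p"),
  c("a", "b", "c", "f", "l", "m", "o"),
  c("b", "f", "h", "j", "o"),
  c("b", "c", "k", "s", "p"),
  c("a", "f", "c", "e", "l", "p", "m", "n")
)

# Hypothetical helper: fraction of baskets that contain every item in `items`.
itemset_support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- "a"
rhs <- "m"

supp_rule <- itemset_support(c(lhs, rhs), baskets)  # support of {a, m}
supp_lhs  <- itemset_support(lhs, baskets)          # support of {a}
supp_rhs  <- itemset_support(rhs, baskets)          # support of {m}

confidence <- supp_rule / supp_lhs             # how often m appears when a does
lift       <- confidence / supp_rhs            # confidence relative to m's base rate
leverage   <- supp_rule - supp_lhs * supp_rhs  # excess over independence

print(c(support = supp_rule, confidence = confidence,
        lift = lift, leverage = leverage))
```

Items a and m both appear in baskets 1, 2 and 5, so the support is 0.6, the confidence is 1, and the lift is about 1.67.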
To generate the association rules, run the same function, `arules::apriori`, with a different set of parameters.
Generate the association rules
Parameters
To find the strong association rules, pass values to the following parameters:
- `support`: minimum support
- `confidence`: minimum confidence
- `target`: `"rules"`
The following script assigns to `rules` the rules whose support is at least 0.5 and whose confidence is at least 0.6.
# rules with support of at least 0.5 and confidence of at least 0.6
rules <- apriori(
transactions,
parameter = list(support=0.5, confidence=0.6, target="rules")
)
A Summary of the Rules
To display a summary of the confident rules, call `summary` on `rules`. The summary shows the number of rules, the distribution of rule lengths, and the ranges of support, confidence, and lift.
summary(rules)
## set of 84 rules
##
## rule length distribution (lhs + rhs):sizes
## 1 2 3 4 5
## 7 22 30 20 5
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 3.000 2.929 4.000 5.000
##
## summary of quality measures:
## support confidence lift count
## Min. :0.6000 Min. :0.6000 Min. :0.9375 Min. :3.000
## 1st Qu.:0.6000 1st Qu.:1.0000 1st Qu.:1.2500 1st Qu.:3.000
## Median :0.6000 Median :1.0000 Median :1.2500 Median :3.000
## Mean :0.6048 Mean :0.9446 Mean :1.4152 Mean :3.024
## 3rd Qu.:0.6000 3rd Qu.:1.0000 3rd Qu.:1.6667 3rd Qu.:3.000
## Max. :0.8000 Max. :1.0000 Max. :1.6667 Max. :4.000
##
## mining info:
## data ntransactions support confidence
## transactions 5 0.5 0.6
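The summary shows that 84 rules are returned. The confidence of the rules ranges from 0.6 to 1, and the lift ranges from about 0.94 to 1.67. A lift below 1 means that the two sides of a rule occur together less often than would be expected if they were independent.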
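The minimum support in the summary is 0.6 even though the threshold passed to `apriori` was 0.5. The short sketch below shows the arithmetic behind this: with only five baskets, a support value is always a multiple of 1/5, so the smallest value that reaches 0.5 is 3/5. The variable names are illustrative only, and this is plain arithmetic, not a description of how arules works internally.

```r
n_transactions <- 5    # number of baskets
min_support    <- 0.5  # threshold passed to apriori

# Smallest whole number of baskets whose share reaches the threshold.
min_count <- ceiling(min_support * n_transactions)

# The corresponding smallest attainable support value.
smallest_support <- min_count / n_transactions

print(c(min_count = min_count, smallest_support = smallest_support))
```

This also matches the `count` column of the summary, whose minimum is 3.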
The Top-N Strong Rules
To print the top 10 rules in descending order of lift, sort the rules by lift and pass the first 10 to `inspect`:
#print the top-10 rules in descending order of lift score
inspect(head(sort(rules, by="lift", decreasing=TRUE),10))
## lhs rhs support confidence lift count
## [1] {a} => {m} 0.6 1 1.666667 3
## [2] {m} => {a} 0.6 1 1.666667 3
## [3] {a} => {l} 0.6 1 1.666667 3
## [4] {l} => {a} 0.6 1 1.666667 3
## [5] {m} => {l} 0.6 1 1.666667 3
## [6] {l} => {m} 0.6 1 1.666667 3
## [7] {a,m} => {l} 0.6 1 1.666667 3
## [8] {a,l} => {m} 0.6 1 1.666667 3
## [9] {l,m} => {a} 0.6 1 1.666667 3
## [10] {a,f} => {m} 0.6 1 1.666667 3
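Sorting is one way to narrow down a long rule list; filtering is another. The sketch below uses the `subset` method that arules provides for rules to keep only the rules that predict item m and have a lift above 1.5. It rebuilds the transactions from a plain list so that it runs on its own; the names `demo_baskets`, `demo_transactions`, `demo_rules`, and `strong_m`, as well as the 1.5 cut-off, are assumptions made for this example and are not part of the original script.

```r
library(arules)

# The five baskets, rebuilt here so the example is self-contained.
demo_baskets <- list(
  c("f", "a", "c", "d", "g", "l", "m", "p"),
  c("a", "b", "c", "f", "l", "m", "o"),
  c("b", "f", "h", "j", "o"),
  c("b", "c", "k", "s", "p"),
  c("a", "f", "c", "e", "l", "p", "m", "n")
)
demo_transactions <- as(demo_baskets, "transactions")

# Same thresholds as above; verbose = FALSE only silences the progress log.
demo_rules <- apriori(
  demo_transactions,
  parameter = list(support = 0.5, confidence = 0.6, target = "rules"),
  control = list(verbose = FALSE)
)

# Keep rules whose right-hand side is m and whose lift exceeds 1.5.
strong_m <- subset(demo_rules, subset = rhs %in% "m" & lift > 1.5)

# Show up to five of them, highest confidence first.
inspect(head(sort(strong_m, by = "confidence"), 5))
```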
Generate Association Rules longer than 1
To exclude rules that contain only one item (rules with an empty left-hand side), set the parameter `minlen` to 2.
# rules with at least 2 items, support of at least 0.5, and confidence of at least 0.6
rules <- apriori(
transactions,
parameter = list(support=0.5, confidence=0.6, minlen=2, target="rules")
)
The `minlen` argument cuts the total number of rules down to 77. Print the summary of `rules`:
summary(rules)
## set of 77 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3 4 5
## 22 30 20 5
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 3.000 3.104 4.000 5.000
##
## summary of quality measures:
## support confidence lift count
## Min. :0.6 Min. :0.7500 Min. :0.9375 Min. :3
## 1st Qu.:0.6 1st Qu.:1.0000 1st Qu.:1.2500 1st Qu.:3
## Median :0.6 Median :1.0000 Median :1.6667 Median :3
## Mean :0.6 Mean :0.9708 Mean :1.4529 Mean :3
## 3rd Qu.:0.6 3rd Qu.:1.0000 3rd Qu.:1.6667 3rd Qu.:3
## Max. :0.6 Max. :1.0000 Max. :1.6667 Max. :3
##
## mining info:
## data ntransactions support confidence
## transactions 5 0.5 0.6
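The two summaries differ by the 7 rules of length 1 (84 versus 77). A rule of length 1 has an empty left-hand side, so its confidence equals the support of its single item, and it passes both thresholds only if that item appears in at least 3 of the 5 baskets (support 0.6). The base R sketch below counts such items directly from the baskets; it does not call arules, and the variable names are illustrative.

```r
# The five baskets from Part 2, written as plain character vectors.
baskets <- list(
  c("f", "a", "c", "d", "g", "l", "m", "p"),
  c("a", "b", "c", "f", "l", "m", "o"),
  c("b", "f", "h", "j", "o"),
  c("b", "c", "k", "s", "p"),
  c("a", "f", "c", "e", "l", "p", "m", "n")
)

# Number of baskets containing each item (no item repeats within a basket).
item_counts <- table(unlist(baskets))

# An item needs support >= 0.6, i.e. at least 3 of the 5 baskets.
min_count <- 3
frequent_items <- names(item_counts[item_counts >= min_count])

print(frequent_items)
print(length(frequent_items))
```

The seven items found this way (a, b, c, f, l, m, p) correspond to the seven single-item rules that `minlen=2` removes.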
Exercise
Write a script that returns all rules of size 2 with a minimum support of 0.5 and a minimum confidence of 0.6, and displays the top 10 rules in descending order of lift.