This is Part 4 of the series. It shows how to generate confident association rules with the R packages arules and arulesViz. To test the script, you must have already completed the following parts.
The Basket Data
In [Part 2]( {{site.url}}{{site.baseurl}}{% post_url 2018-10-15-association-rule-read-transactions %} ), we read the following five shopping baskets into transactions, an object of the transactions class.
f,a,c,d,g,l,m,p
a,b,c,f,l,m,o
b,f,h,j,o
b,c,k,s,p
a,f,c,e,l,p,m,n
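If the object from Part 2 is not at hand, the same five baskets can be rebuilt in memory. This is only a sketch: it assumes that coercing a list of character vectors with as(..., "transactions") is an acceptable stand-in for however Part 2 loaded the data, and the name baskets is introduced here for illustration.

```r
library(arules)

# The five baskets listed above, one comma-separated string per basket
# (`baskets` is a hypothetical helper name, not from the original series)
baskets <- strsplit(
  c("f,a,c,d,g,l,m,p", "a,b,c,f,l,m,o", "b,f,h,j,o", "b,c,k,s,p", "a,f,c,e,l,p,m,n"),
  ","
)

# Coerce the list of character vectors into a transactions object
transactions <- as(baskets, "transactions")

# Rows are baskets, columns are distinct items
dim(transactions)
```

With these baskets, dim should report 5 transactions over 16 distinct items.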
Previously, we ran arules::apriori with the parameter target set to frequent itemsets. By assigning a value to the parameter support and setting minlen and maxlen equal to each other, the apriori function returns all itemsets of a specific length that have at least the minimum support.
In this part, we will generate association rules that meet a given threshold for a selected measure. A measure evaluates how certain or strong a rule is. The measures include confidence, lift and leverage.
To generate the association rules, run the same function, arules::apriori, with a different set of parameters.
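To make the measures concrete, they can be computed by hand for one rule, {a} => {m}, using only base R. This is a minimal sketch based on the standard definitions (confidence = supp(lhs and rhs) / supp(lhs), lift = confidence / supp(rhs), leverage = supp(lhs and rhs) - supp(lhs) * supp(rhs)). The helper name support_of is made up for this example and is not part of arules.

```r
# The five baskets from the section above
baskets <- strsplit(
  c("f,a,c,d,g,l,m,p", "a,b,c,f,l,m,o", "b,f,h,j,o", "b,c,k,s,p", "a,f,c,e,l,p,m,n"),
  ","
)

# Fraction of baskets that contain every item in `items`
# (hypothetical helper for illustration only)
support_of <- function(items) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- "a"
rhs <- "m"

supp_rule  <- support_of(c(lhs, rhs))                        # support of the whole rule
confidence <- supp_rule / support_of(lhs)                    # how often rhs appears when lhs does
lift       <- confidence / support_of(rhs)                   # confidence relative to rhs's base rate
leverage   <- supp_rule - support_of(lhs) * support_of(rhs)  # gap from independence

# support 0.6, confidence 1, lift 1.6667, leverage 0.24
round(c(support = supp_rule, confidence = confidence, lift = lift, leverage = leverage), 4)
```

Items a and m appear together in three of the five baskets and never apart, which is why the confidence is 1 and the lift is 1 / 0.6.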
Generate the association rules
Parameters
To find the strong association rules, pass values to the following parameters:
support: minimum support
confidence: minimum confidence
target: rules
The following script stores in rules the rules whose support is at least 0.5 and whose confidence is at least 0.6.
# rules with a support of at least 0.5 and a confidence of at least 0.6
rules <- apriori(
transactions,
parameter = list(support=0.5, confidence=0.6, target="rules")
)
A Summary of the Rules
To display a summary of the confident rules, call summary on rules. The summary shows the number of rules, the distribution of rule lengths, and the ranges of support, confidence and lift.
summary(rules)
## set of 84 rules
##
## rule length distribution (lhs + rhs):sizes
## 1 2 3 4 5
## 7 22 30 20 5
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 3.000 2.929 4.000 5.000
##
## summary of quality measures:
## support confidence lift count
## Min. :0.6000 Min. :0.6000 Min. :0.9375 Min. :3.000
## 1st Qu.:0.6000 1st Qu.:1.0000 1st Qu.:1.2500 1st Qu.:3.000
## Median :0.6000 Median :1.0000 Median :1.2500 Median :3.000
## Mean :0.6048 Mean :0.9446 Mean :1.4152 Mean :3.024
## 3rd Qu.:0.6000 3rd Qu.:1.0000 3rd Qu.:1.6667 3rd Qu.:3.000
## Max. :0.8000 Max. :1.0000 Max. :1.6667 Max. :4.000
##
## mining info:
## data ntransactions support confidence
## transactions 5 0.5 0.6
The summary shows that 84 rules are returned. The confidence of the rules ranges from 0.6 to 1. The lift ranges from 0.94 to 1.67.
The Top-N Strong Rules
To print the top 10 rules in descending order of lift:
# print the top-10 rules in descending order of lift score
inspect(head(sort(rules, by="lift", decreasing=TRUE),10))
## lhs rhs support confidence lift count
## [1] {a} => {m} 0.6 1 1.666667 3
## [2] {m} => {a} 0.6 1 1.666667 3
## [3] {a} => {l} 0.6 1 1.666667 3
## [4] {l} => {a} 0.6 1 1.666667 3
## [5] {m} => {l} 0.6 1 1.666667 3
## [6] {l} => {m} 0.6 1 1.666667 3
## [7] {a,m} => {l} 0.6 1 1.666667 3
## [8] {a,l} => {m} 0.6 1 1.666667 3
## [9] {l,m} => {a} 0.6 1 1.666667 3
## [10] {a,f} => {m} 0.6 1 1.666667 3
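The output above reports support, confidence and lift, but not leverage. One possible way to add it is arules::interestMeasure, followed by subset to keep only the rules whose lift is above 1. The sketch below rebuilds the baskets in memory so that it runs on its own. The names baskets and positive are introduced here for illustration, and control = list(verbose = FALSE) is added only to silence the mining log.

```r
library(arules)

# Rebuild the five baskets in memory (assumption: equivalent to the Part 2 object)
baskets <- strsplit(
  c("f,a,c,d,g,l,m,p", "a,b,c,f,l,m,o", "b,f,h,j,o", "b,c,k,s,p", "a,f,c,e,l,p,m,n"),
  ","
)
transactions <- as(baskets, "transactions")

rules <- apriori(
  transactions,
  parameter = list(support = 0.5, confidence = 0.6, target = "rules"),
  control = list(verbose = FALSE)
)

# Append leverage as an extra quality column
quality(rules)$leverage <- interestMeasure(
  rules, measure = "leverage", transactions = transactions
)

# Keep only the rules whose lift exceeds 1, then show the five with the highest leverage
positive <- subset(rules, subset = lift > 1)
inspect(head(sort(positive, by = "leverage", decreasing = TRUE), 5))
```

Filtering on lift > 1 drops the rules whose two sides co-occur no more often than independence would predict.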
Generate Association Rules Longer than One Item
To exclude the rules that are only one item long, set the parameter minlen to 2.
# rules with at least two items and a confidence of at least 0.6
rules <- apriori(
transactions,
parameter = list(support=0.5, confidence=0.6, minlen=2, target="rules")
)
The minlen argument cuts the total number of rules down to 77. Print the summary of rules:
summary(rules)
## set of 77 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3 4 5
## 22 30 20 5
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 3.000 3.104 4.000 5.000
##
## summary of quality measures:
## support confidence lift count
## Min. :0.6 Min. :0.7500 Min. :0.9375 Min. :3
## 1st Qu.:0.6 1st Qu.:1.0000 1st Qu.:1.2500 1st Qu.:3
## Median :0.6 Median :1.0000 Median :1.6667 Median :3
## Mean :0.6 Mean :0.9708 Mean :1.4529 Mean :3
## 3rd Qu.:0.6 3rd Qu.:1.0000 3rd Qu.:1.6667 3rd Qu.:3
## Max. :0.6 Max. :1.0000 Max. :1.6667 Max. :3
##
## mining info:
## data ntransactions support confidence
## transactions 5 0.5 0.6
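As an alternative to mining again with minlen, an existing rule set can be filtered after the fact by rule length. The following is a sketch under the assumption that size(), which counts the items in lhs and rhs together, is a suitable proxy for the minlen setting. The names all_rules and longer are illustrative.

```r
library(arules)

# Rebuild the five baskets in memory (assumption: equivalent to the Part 2 object)
baskets <- strsplit(
  c("f,a,c,d,g,l,m,p", "a,b,c,f,l,m,o", "b,f,h,j,o", "b,c,k,s,p", "a,f,c,e,l,p,m,n"),
  ","
)
transactions <- as(baskets, "transactions")

# Mine without a minlen restriction, as in the first script
all_rules <- apriori(
  transactions,
  parameter = list(support = 0.5, confidence = 0.6, target = "rules"),
  control = list(verbose = FALSE)
)

# Keep only the rules with two or more items in lhs and rhs combined
longer <- all_rules[size(all_rules) >= 2]

# Compare the number of rules before and after filtering
c(all = length(all_rules), longer = length(longer))
```

Filtering afterwards avoids a second mining pass, while minlen keeps the short rules from being generated in the first place.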
Exercise
Write a script that returns all the rules of size 2 with a minimum support of 0.5 and a minimum confidence of 0.6, and displays the top 10 rules in descending order of lift.