This is Part 3 of a series showing how to perform association rule mining with the R packages arules and arulesViz. To test the scripts below, you must first complete Part 1 and Part 2.
The Basket Data
In Part 2, Read Transaction Data, we read the following five shopping baskets into R as an object of class transactions.
f,a,c,d,g,l,m,p
a,b,c,f,l,m,o
b,f,h,j,o
b,c,k,s,p
a,f,c,e,l,p,m,n
To find the frequent 1-itemsets, set the minimum support to 0.5 and set both minlen and maxlen to 1. The target parameter is set to "frequent itemsets".
The following script stores in itemsets all the 1-itemsets whose support is at least 50%.
# all the 1-itemsets with a support of at least 0.5
itemsets <- apriori(
  transactions,
  parameter = list(minlen = 1, maxlen = 1, support = 0.5, target = "frequent itemsets")
)
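As a sanity check on what apriori returns, the supports can also be counted by hand. The sketch below is only an illustration in base R and does not use arules; the names baskets, item_counts, item_support and frequent_items are made up for this example, and the baskets are retyped from the listing above rather than taken from the transactions object.

```r
# The five baskets from Part 2, written out as a plain list (assumption:
# retyped here by hand instead of reusing the transactions object).
baskets <- list(
  c("f", "a", "c", "d", "g", "l", "m", "p"),
  c("a", "b", "c", "f", "l", "m", "o"),
  c("b", "f", "h", "j", "o"),
  c("b", "c", "k", "s", "p"),
  c("a", "f", "c", "e", "l", "p", "m", "n")
)

# Count how many baskets contain each item; unique() guards against
# an item being listed twice in the same basket.
item_counts <- table(unlist(lapply(baskets, unique)))

# Support of an item = number of baskets containing it / number of baskets.
item_support <- item_counts / length(baskets)

# Keep only the items that reach the minimum support of 0.5.
frequent_items <- item_support[item_support >= 0.5]
print(sort(frequent_items, decreasing = TRUE))
```

The items that survive the filter should match the 1-itemsets that apriori reports for the same threshold.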
A Summary of the Frequent k-Itemsets
To display a summary of the frequent 1-itemsets, call summary on itemsets.
summary(itemsets)
## set of 7 itemsets
##
## most frequent items:
## a b c f l (Other)
## 1 1 1 1 1 2
##
## element (itemset/transaction) length distribution:sizes
## 1
## 7
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 1 1 1 1 1
##
## summary of quality measures:
## support count
## Min. :0.6000 Min. :3.000
## 1st Qu.:0.6000 1st Qu.:3.000
## Median :0.6000 Median :3.000
## Mean :0.6571 Mean :3.286
## 3rd Qu.:0.7000 3rd Qu.:3.500
## Max. :0.8000 Max. :4.000
##
## includes transaction ID lists: FALSE
##
## mining info:
## data ntransactions support confidence
## transactions 5 0.5 1
The summary shows 7 frequent 1-itemsets, with support ranging from 0.6 to 0.8.
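The same figures can be extracted programmatically instead of being read off the printed summary. The following is a minimal sketch, assuming the arules package is installed; to keep it self-contained it rebuilds the transactions from a plain list (an assumption, since Part 2 reads them from a file), and the names n_itemsets and support_range are hypothetical.

```r
library(arules)

# Rebuild the five baskets as a transactions object (assumption: Part 2
# reads them from a file; a plain list is used here to stay self-contained).
baskets <- list(
  c("f", "a", "c", "d", "g", "l", "m", "p"),
  c("a", "b", "c", "f", "l", "m", "o"),
  c("b", "f", "h", "j", "o"),
  c("b", "c", "k", "s", "p"),
  c("a", "f", "c", "e", "l", "p", "m", "n")
)
transactions <- as(baskets, "transactions")

# Same mining call as above, with the progress output switched off.
itemsets <- apriori(
  transactions,
  parameter = list(minlen = 1, maxlen = 1, support = 0.5, target = "frequent itemsets"),
  control = list(verbose = FALSE)
)

# Number of frequent itemsets found.
n_itemsets <- length(itemsets)

# quality() returns a data frame of measures; take min and max of support.
support_range <- range(quality(itemsets)$support)

print(n_itemsets)
print(support_range)
```

Working from quality(itemsets) rather than the printed summary makes the values available for later steps, such as filtering or reporting.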
The Top-N Frequent k-Itemsets
To print all 1-itemsets in descending order of support, sort them with sort and pass the result to inspect:
#print all 1-itemsets in descending order of support
inspect(sort(itemsets, by="support"))
## items support count
## [1] {f} 0.8 4
## [2] {c} 0.8 4
## [3] {b} 0.6 3
## [4] {p} 0.6 3
## [5] {a} 0.6 3
## [6] {m} 0.6 3
## [7] {l} 0.6 3
To print only the top five 1-itemsets in descending order of support, wrap the sorted itemsets in head:
#print top-5 1-itemsets in descending order of support
inspect(head(sort(itemsets, by="support"), 5))
## items support count
## [1] {f} 0.8 4
## [2] {c} 0.8 4
## [3] {b} 0.6 3
## [4] {p} 0.6 3
## [5] {a} 0.6 3
Exercise
Write a script that returns all the 2-itemsets whose support is at least 50%, finds their minimum and maximum support, reports the number of frequent 2-itemsets, and prints all of them.