Database Reference


One strategy for frequent itemset generation is to reduce the number of candidates by pruning, which is what the Apriori algorithm does.

The Apriori algorithm

Apriori is one of the oldest and most commonly used algorithms for mining association rules. It is built on the notion of a frequent itemset.

For example, suppose we define L as an itemset (L = {Bread, Jam}) and set our minimum support to 50 percent (s = 50%). If at least 50 percent of the transactions contain the itemset L, we say L is a frequent itemset.
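
The support check described above can be sketched in a few lines of Python. This is only an illustration: the function name `support`, the variable names, and the four transactions are assumptions made up for this example, not data from the text.

```python
def support(itemset, transactions):
    """Return the fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


# Hypothetical transactions, invented purely for illustration.
transactions = [
    frozenset({"Bread", "Jam"}),
    frozenset({"Bread", "Jam", "Milk"}),
    frozenset({"Bread", "Milk"}),
    frozenset({"Milk"}),
]

L = {"Bread", "Jam"}
min_support = 0.5  # s = 50%

# L is frequent if its support reaches the minimum support threshold.
is_frequent = support(L, transactions) >= min_support
```

With these made-up transactions, {Bread, Jam} appears in two of the four baskets, so it just meets the 50 percent threshold.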

It follows that if 50 percent of the transactions contain {Bread, Jam}, then at least 50 percent of the transactions contain {Bread}, and at least 50 percent contain {Jam}.

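The Apriori principle is that every subset of a frequent itemset is also frequent. Equivalently, if an itemset is not frequent, none of its supersets can be frequent, so they can be pruned without counting their support.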
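
One way this principle is commonly used for pruning is sketched below: a candidate of size k is kept only if all of its subsets of size k-1 are already known to be frequent. The function name `can_be_frequent` and the set of frequent pairs are hypothetical, chosen only for this example.

```python
from itertools import combinations


def can_be_frequent(candidate, frequent_smaller):
    """Return True if every subset one item smaller than candidate is frequent.

    If any such subset is missing from frequent_smaller, the candidate
    cannot be frequent and can be pruned without counting its support.
    """
    candidate = frozenset(candidate)
    size = len(candidate) - 1
    return all(
        frozenset(sub) in frequent_smaller
        for sub in combinations(candidate, size)
    )


# Hypothetical result of an earlier pass: {Jam, Milk} was not frequent.
frequent_pairs = {
    frozenset({"Bread", "Jam"}),
    frozenset({"Bread", "Milk"}),
}

# {Bread, Jam, Milk} contains the infrequent pair {Jam, Milk}, so it is pruned.
keep = can_be_frequent({"Bread", "Jam", "Milk"}, frequent_pairs)
```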

The Apriori approach works bottom-up. We start with all itemsets of size 1 (for example, Bread, Jam, Milk, and so on), determine their support, and keep the frequent ones.

Then we start pairing the frequent items and find the support for each pair, say {Bread, Jam}, {Jam, Milk}, or {Milk, Bread}.
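
The bottom-up procedure can be sketched as a short, unoptimized Python function. It is a minimal sketch under stated assumptions, not a reference implementation: the function name `apriori`, its parameters, and the sample transactions are all invented for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset to its support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: keep the single items that meet the support threshold.
    items = {item for t in transactions for item in t}
    current = {
        frozenset({i}) for i in items if support(frozenset({i})) >= min_support
    }

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join: combine frequent itemsets into candidates of size k.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop candidates that have an infrequent subset of size k-1.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count support only for the candidates that survived pruning.
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical transactions, invented purely for illustration.
transactions = [
    frozenset({"Bread", "Jam"}),
    frozenset({"Bread", "Jam", "Milk"}),
    frozenset({"Bread", "Milk"}),
    frozenset({"Milk"}),
]

result = apriori(transactions, min_support=0.5)
```

With this sample data, all three single items and the pairs {Bread, Jam} and {Bread, Milk} are frequent. The pair {Jam, Milk} falls below the threshold, so the candidate {Bread, Jam, Milk} is pruned before its support is ever counted.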

The following figure illustrates the pruning performed by the Apriori algorithm:
