Apriori

An implementation of the well-known Apriori algorithm for the data mining step. It operates on a sample read from the database; the sample size is given by the parameter SampleSize.

The input format is fixed. There is one input concept (TheInputConcept) with a BaseAttribute for the customer ID (parameter: CustID), one for the transaction ID (parameter: TransID), and one for an item (parameter: Item) that belongs to the itemset of this customer/transaction. All entries of these BaseAttributes must be integers; null values are not allowed.
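The input layout described above (one row per item) can be illustrated with a minimal Python sketch that groups rows into itemsets. The function name rows_to_itemsets and the choice to key each itemset by the (CustID, TransID) pair are assumptions made for illustration, not part of the operator itself.

```python
from collections import defaultdict


def rows_to_itemsets(rows):
    """Group (cust_id, trans_id, item) triples into one itemset per transaction.

    Assumption: an itemset is identified by the (cust_id, trans_id) pair.
    """
    itemsets = defaultdict(set)
    for cust_id, trans_id, item in rows:
        # The operator requires integer, non-null entries in all three columns.
        if cust_id is None or trans_id is None or item is None:
            raise ValueError("null values are not allowed")
        itemsets[(int(cust_id), int(trans_id))].add(int(item))
    return dict(itemsets)


# Hypothetical input rows: (CustID, TransID, Item)
rows = [(1, 10, 100), (1, 10, 101), (1, 11, 100), (2, 12, 101), (2, 12, 102)]
baskets = rows_to_itemsets(rows)
```

Here the five rows collapse into three itemsets, one per customer/transaction pair.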

It then finds all rules that are frequent (parameter: MinSupport) and have at least the specified confidence (parameter: MinConfidence). Keep in mind that these thresholds, especially the minimal support, are applied to the sample, not to the full database.
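To make the roles of the two thresholds concrete, the following is a minimal Python sketch of the textbook Apriori procedure. It is not the operator's actual code. It assumes that the minimal support is an absolute count of itemsets containing the rule's items (matching the integer type of MinSupport) and that confidence is a fraction in [0, 1]; the function name apriori and the example data are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support, min_confidence):
    """Return rules as (premise, conclusion, support, confidence) tuples."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Number of transactions that contain each candidate itemset.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Level 1: frequent single items.
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {c: n for c, n in count(items).items() if n >= min_support}
    all_frequent = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unite frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = {c: n for c, n in count(candidates).items() if n >= min_support}
        all_frequent.update(frequent)
        k += 1

    # Rule generation: split each frequent itemset into premise and conclusion.
    rules = []
    for itemset, support in all_frequent.items():
        for r in range(1, len(itemset)):
            for premise in map(frozenset, combinations(itemset, r)):
                confidence = support / all_frequent[premise]
                if confidence >= min_confidence:
                    rules.append((premise, itemset - premise, support, confidence))
    return rules


# Hypothetical sample of five itemsets.
sample = [{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]
rules = apriori(sample, min_support=3, min_confidence=0.7)
```

Because the support threshold is evaluated on the sample, it usually needs to be chosen relative to SampleSize rather than to the size of the full database.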

The output is specified by three parameters. TheOutputConcept is the concept the output table is attached to. It has two BaseAttributes, PremiseBA for the premises of rules and ConclusionBA for the conclusions. Each entry for one of these attributes contains a set of whitespace-separated item IDs (integers).
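The encoding of premises and conclusions as whitespace-separated item IDs can be sketched in Python as follows. The helper names format_itemset and parse_itemset are hypothetical, and writing the IDs in ascending order with single spaces is an assumption; the description above only states that the IDs are separated by whitespace.

```python
def format_itemset(items):
    """Encode a set of integer item IDs as one whitespace-separated string."""
    # Assumption: ascending order, single space as separator.
    return " ".join(str(i) for i in sorted(items))


def parse_itemset(text):
    """Decode a PremiseBA or ConclusionBA entry back into a set of integers."""
    # str.split() without arguments splits on any run of whitespace.
    return {int(token) for token in text.split()}


entry = format_itemset({3, 1, 2})
items = parse_itemset(entry)
```

A consumer of the output table could use a parser like this to recover the item sets of each rule from the two attributes.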

ParameterName     ObjType  Type  Remarks
TheInputConcept   CON      IN    inherited
CustID            BA       IN    customer id (integer, not NULL)
TransID           BA       IN    transaction id (integer, not NULL)
Item              BA       IN    item id (integer, not NULL)
MinSupport        V        IN    minimal support (integer)
MinConfidence     V        IN    minimal confidence (in [0,1])
SampleSize        V        IN    the size of the sample to be used
PremiseBA         BA       OUT   premises of rules
ConclusionBA      BA       OUT   conclusions of rules
TheOutputConcept  CON      OUT   inherited