Operator Apriori

An implementation of the well-known Apriori algorithm for the data mining step. It works on a sample read from the database; the sample size is given by the parameter SampleSize.

The input format is fixed. There is one input concept (TheInputConcept) with three BaseAttributes: one for the customer ID (parameter: CustID), one for the transaction ID (parameter: TransID), and one for an item (parameter: Item) that belongs to the itemset of that customer/transaction. All entries of these BaseAttributes must be integers; null values are not allowed.
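To illustrate the input layout described above, here is a minimal Python sketch that groups (CustID, TransID, Item) rows into one itemset per customer/transaction and rejects non-integer or null entries. The function name build_itemsets and the tuple-based row format are assumptions made for illustration; they are not part of the operator.

```python
from collections import defaultdict

def build_itemsets(rows):
    """Group (cust_id, trans_id, item) rows into one itemset per customer/transaction."""
    itemsets = defaultdict(set)
    for cust_id, trans_id, item in rows:
        for value in (cust_id, trans_id, item):
            # Mirror the documented constraint: integers only, no nulls.
            # bool is excluded explicitly because it is a subclass of int.
            if value is None or isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"expected a non-null integer, got {value!r}")
        itemsets[(cust_id, trans_id)].add(item)
    return dict(itemsets)

# Hypothetical example rows: (CustID, TransID, Item)
rows = [(1, 10, 100), (1, 10, 101), (1, 11, 100), (2, 20, 102)]
itemsets = build_itemsets(rows)
```

Each key of the resulting dictionary is a (CustID, TransID) pair and each value is the set of item IDs observed for that pair.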

The operator then finds all rules that are frequent (parameter: MinSupport) and have at least the specified confidence (parameter: MinConfidence). Keep in mind that these thresholds, especially the minimal support, are applied to the sample, not to the full database.
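The following Python sketch shows how the two thresholds interact in a textbook Apriori run. It is not the operator's actual code. It assumes that MinSupport is an absolute count of sampled transactions (the parameter is documented only as an integer) and that confidence is the support of the whole rule divided by the support of its premise. The function names frequent_itemsets and find_rules are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori: return {frozenset: absolute support count}."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_support:
                current[c] = n
        frequent.update(current)
        k += 1
    return frequent

def find_rules(frequent, min_confidence):
    """Return (premise, conclusion, confidence) for rules meeting min_confidence."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for premise in combinations(sorted(itemset), r):
                premise = frozenset(premise)
                # All subsets of a frequent itemset are frequent, so the lookup is safe.
                confidence = support / frequent[premise]
                if confidence >= min_confidence:
                    found.append((premise, itemset - premise, confidence))
    return found

# Hypothetical sample of five transactions (assumption: support is an absolute count).
sample = [{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]
freq = frequent_itemsets(sample, min_support=3)
found = find_rules(freq, min_confidence=0.7)
```

Because the support threshold is checked against the sample, a value that suits the full table may be too strict for a small SampleSize; under the absolute-count assumption it would have to be scaled down accordingly.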

The output is specified by three parameters. TheOutputConcept is the concept to which the output table is attached. It has two BaseAttributes: PremiseBA for the premises of rules and ConclusionBA for their conclusions. Each entry of these attributes contains a set of whitespace-separated item IDs (integers).
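As an illustration of the output encoding described above, the sketch below turns one rule into a pair of whitespace-separated strings and parses such a string back into a set of item IDs. The helper names format_rule and parse_items are hypothetical, and sorting the IDs is an assumption made only to get a stable result; the operator does not document any particular order.

```python
def format_rule(premise, conclusion):
    """Return (PremiseBA, ConclusionBA) entries as whitespace-separated item IDs."""
    # Sorting is an assumption for reproducibility, not documented behaviour.
    premise_entry = " ".join(str(i) for i in sorted(premise))
    conclusion_entry = " ".join(str(i) for i in sorted(conclusion))
    return premise_entry, conclusion_entry

def parse_items(entry):
    """Parse one PremiseBA or ConclusionBA entry back into a set of integers."""
    return {int(token) for token in entry.split()}

# Hypothetical rule: items 3 and 12 imply item 7.
row = format_rule({12, 3}, {7})
```

A consumer of the output table can apply parse_items to both columns of a row to recover the rule as two integer sets.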

Parameters

Parameter         Object         Type    optional  min_arg  max_arg  Remarks
TheInputConcept   Concept        Input   no        1        1        inherited
TheOutputConcept  Concept        Output  no        1        1        inherited
CustID            BaseAttribute  Input   no        1        1        customer id (integer, not NULL)
TransID           BaseAttribute  Input   no        1        1        transaction id (integer, not NULL)
Item              BaseAttribute  Input   no        1        1        item id (integer, not NULL)
MinSupport        Value          Input   no        1        1        minimal support (integer)
MinConfidence     Value          Input   no        1        1        minimal confidence (in [0,1])
SampleSize        Value          Input   no        1        1        the size of the sample to be used
PremiseBA         BaseAttribute  Output  no        1        1        premises of rules
ConclusionBA      BaseAttribute  Output  no        1        1        conclusions of rules