MiningMart Approach Part III

Operators

As mentioned before, each step is related to exactly one operator and holds all of its input arguments. An operator performs data transformations such as discretization, handling of null values, aggregation of attributes into a new one, or collection of sequences from time-stamped data. The operators access the database directly and can handle large volumes of data. Machine learning is not restricted to the data mining step; it is also applicable in preprocessing. This view offers a variety of learning tasks that are not as well investigated as the learning of classifiers. For instance, an important task is to acquire events and their duration (i.e. a time interval) on the basis of time series (i.e. measurements at time points).

There are two kinds of operators, distinguished by their output on the conceptual level: those that have an output Concept (Concept operators and Feature selection operators), and those that have an output BaseAttribute (Feature construction operators). All operators have parameters, such as the input Concept or the output BaseAttribute.

Concept operators

All Concept operators take an input Concept and create at least one new ColumnSet, which they attach to the output Concept. The output Concept must have all its Features attached to it before the operator is compiled. All Concept operators have the two parameters TheInputConcept and TheOutputConcept, which are marked as inherited in the following parameter descriptions.
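
The contract described above can be illustrated with a small sketch. It is not MiningMart code: the classes Concept and ConceptOperator, the compile method, and the naming of the ColumnSet are assumptions made for illustration. Only the two parameters TheInputConcept and TheOutputConcept, and the rule that the output Concept needs its Features before compilation, come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Concept:
    """Conceptual-level entity (hypothetical, simplified model)."""
    name: str
    features: List[str] = field(default_factory=list)
    column_sets: List[str] = field(default_factory=list)


@dataclass
class ConceptOperator:
    """Hypothetical base class holding the two inherited parameters."""
    the_input_concept: Concept    # corresponds to TheInputConcept
    the_output_concept: Concept   # corresponds to TheOutputConcept

    def compile(self) -> str:
        # The output Concept must already carry its Features.
        if not self.the_output_concept.features:
            raise ValueError("output Concept has no Features attached")
        # Create one new ColumnSet (represented here only by a name)
        # and attach it to the output Concept.
        column_set = f"cs_{self.the_output_concept.name}"
        self.the_output_concept.column_sets.append(column_set)
        return column_set


customers = Concept("Customer", features=["age", "income"])
adults = Concept("AdultCustomer", features=["age", "income"])
created = ConceptOperator(customers, adults).compile()
```

In this sketch, compiling an operator whose output Concept has no Features raises an error, which mirrors the requirement that Features be attached beforehand.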
Feature selection operators

Feature selection operators are also Concept operators in that their output is a Concept, but they are listed in their own section because they share some special properties. All of them except FeatureSelectionByAttributes use external algorithms to determine which features are carried over to the output Concept. As a result, when an operator chain is designed, it is not yet known which features will be selected.
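
The consequence of relying on an external algorithm can be sketched as follows. The variance filter used here is only a stand-in chosen for illustration, not one of the algorithms MiningMart uses, and the function names and the table representation are assumptions. The point is that the set of output features is determined by the data when the operator runs, not when the chain is designed.

```python
from statistics import pvariance
from typing import Callable, Dict, List

Table = Dict[str, List[float]]  # column name -> values (simplified)


def select_by_variance(table: Table, threshold: float) -> List[str]:
    """Stand-in external algorithm: keep columns with variance above threshold."""
    return [name for name, values in table.items() if pvariance(values) > threshold]


def run_feature_selection(table: Table, selector: Callable[[Table], List[str]]) -> Table:
    """Hypothetical operator: the output features depend on the selector's result."""
    selected = selector(table)
    return {name: table[name] for name in selected}


data = {
    "age": [23.0, 45.0, 31.0, 60.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0],
    "income": [20.0, 55.0, 32.0, 80.0],
}
# Which columns survive is only known once the selector has seen the data.
output = run_feature_selection(data, lambda t: select_by_variance(t, 0.0))
```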
Feature construction operators

All operators in this section are loopable. Across loops, TheInputConcept remains the same, while TheTargetAttribute, TheOutputAttribute, and further operator-specific parameters change from loop to loop. Loop numbers start at 1.
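
A looped feature construction step might look like the following sketch. The transformation (filling missing values), the `default` parameter, and the dictionary-based representation are assumptions made for illustration. Only the parameter names TheTargetAttribute and TheOutputAttribute, the fixed input, and the loop numbering from 1 are taken from the text.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]
Table = Dict[str, Column]


def fill_missing(values: Column, default: float) -> Column:
    """Illustrative transformation: replace None with a default value."""
    return [default if v is None else v for v in values]


def run_loops(input_concept: Table, loops: Dict[int, dict]) -> Table:
    """Apply the same operator once per loop; loop numbers start at 1."""
    result = dict(input_concept)
    for loop_nr in range(1, len(loops) + 1):
        params = loops[loop_nr]
        # The input stays the same in every loop; only the attributes change.
        source = input_concept[params["TheTargetAttribute"]]
        result[params["TheOutputAttribute"]] = fill_missing(source, params["default"])
    return result


customer = {"age": [30.0, None, 41.0], "income": [None, 52.0, 48.0]}
loops = {
    # "default" is a hypothetical operator-specific parameter.
    1: {"TheTargetAttribute": "age", "TheOutputAttribute": "age_filled", "default": 0.0},
    2: {"TheTargetAttribute": "income", "TheOutputAttribute": "income_filled", "default": 50.0},
}
constructed = run_loops(customer, loops)
```

Each loop adds one new attribute to the result and leaves the original attributes unchanged.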