Emmanuel Müller, Fabian Keller, Sebastian Blanc, and Klemens Böhm
OutRules
Outlier rules enable the interpretation of outlying objects in databases by providing descriptions. These descriptions answer the question of how an outlier deviates from what is considered normal. To enable the exploitation of these outlier rules, we provide OutRules. OutRules is based on the WEKA framework and extends our previous toolkit SOREX with extraction techniques for outlier descriptions. In particular, it facilitates the generation of outlier rules for the outliers in an outlier ranking. Such outlier rankings can, for example, be produced by the various (subspace) outlier mining algorithms included in the framework. In our preliminary compilation of this framework, we provide first visualization techniques. For example, the generated outlier rules can be visualized by a quantitative analysis of the deviation or by a graphical plot using parallel coordinates for the k-neighborhood of the outlier to clarify the characteristics of the outlier.
Screenshots
The OutRules framework for outlier rule exploitation is explained via the screenshots below.
Main screen
for outlier mining using different outlier mining algorithms. Outlier
rules can be generated for the resulting outliers. The outlier
explanation functionality must be activated; settings may be provided
via "Edit".
Settings
screen for the OutRules algorithm with OutRules specific settings and
the possibility to activate the precomputation of explanations for all
outliers found by the upstream outlier mining algorithm.
The outlier
explanations are available in the OutlierRanking-Plot, which is
accessible by right-clicking on a result set of an outlier mining
algorithm and choosing "Visualization" and "Show OutlierRanking-Plot".
For
every outlier in the outlier ranking (upper window) a set of
outlier rules with quality measures is provided (lower window). Please
click on the screenshot for a detailed view.
Further
information about every outlier rule is provided by double-clicking
on the rules in the set of outlier rules (see previous
screenshot). The provided information focuses on the deviation of the
outlier from other objects. Further insights are provided by the
visualization with parallel coordinates plots. Please click on the
screenshot for a more detailed view.
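To make the idea of a quantitative deviation analysis more concrete, the following minimal Python sketch compares one object with its k-neighborhood, attribute by attribute. It is an illustration only and not the OutRules implementation: the function names, the Euclidean distance, the standardized (z-score style) deviation, and the toy data are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def k_neighborhood(data, idx, k, dims):
    """Return the indices of the k objects closest to data[idx],
    measured by Euclidean distance within the attributes `dims`."""
    def dist(j):
        return math.sqrt(sum((data[idx][d] - data[j][d]) ** 2 for d in dims))

    others = [j for j in range(len(data)) if j != idx]
    return sorted(others, key=dist)[:k]


def deviation_per_attribute(data, idx, k, dims):
    """Return, per attribute, how far data[idx] lies from the mean of its
    k-neighborhood, in units of the neighborhood's standard deviation."""
    neighbors = k_neighborhood(data, idx, k, dims)
    result = {}
    for d in dims:
        values = [data[j][d] for j in neighbors]
        mu, sigma = mean(values), pstdev(values)
        diff = data[idx][d] - mu
        if sigma == 0:
            # Neighborhood has no spread: any difference is unbounded.
            result[d] = 0.0 if diff == 0 else math.copysign(math.inf, diff)
        else:
            result[d] = diff / sigma
    return result


# Hypothetical toy data: the last object is unremarkable in attribute 0
# but far from its neighbors in attribute 1.
toy_data = [
    [1.0, 1.0],
    [1.2, 0.9],
    [0.9, 1.1],
    [1.1, 1.0],
    [1.0, 5.0],
]
deviations = deviation_per_attribute(toy_data, idx=4, k=3, dims=(0, 1))
for attribute, value in deviations.items():
    print(f"attribute {attribute}: standardized deviation {value:.2f}")
```

In this toy setting, a large absolute value for an attribute indicates that the object deviates strongly from its neighborhood in that attribute, which is the kind of information a parallel coordinates plot of the k-neighborhood shows visually.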
Resources
Executables and a test data set (thyroid data set from UCI Machine Learning Repository) for outlier rule exploration: OutRules.zip
Access to this website & Citation Information
We encourage researchers in this area to use the proposed framework in their own publications as a basis for comparison and evaluation. Our implementations are available for anyone to use.