OutRules: A Framework for Outlier Descriptions in Multiple Context Spaces
In Proc. ECML PKDD 2012

Emmanuel Müller, Fabian Keller, Sebastian Blanc, and Klemens Böhm

OutRules

Outlier rules enable the interpretation of outlying objects in databases by providing descriptions. They answer the question of how an outlier deviates from what is considered normal. To enable the exploitation of these outlier rules, we provide OutRules. OutRules is based on the WEKA framework and extends our previous toolkit SOREX with extraction techniques for outlier descriptions. In particular, it facilitates the generation of outlier rules for the outliers in an outlier ranking. Such rankings can, for example, be produced by the various (subspace) outlier mining algorithms included in the framework. This preliminary version of the framework also provides first visualization techniques. For example, the generated outlier rules can be visualized by a quantitative analysis of the deviation, or by a graphical plot that uses parallel coordinates for the k-neighborhood of the outlier to clarify the characteristics of the outlier.
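To make the idea of a quantitative deviation analysis over the k-neighborhood more concrete, the following is a minimal sketch in Python. It is not the OutRules implementation: this page does not specify the deviation measure, so the sketch assumes a simple per-attribute z-score of the outlier relative to its k nearest neighbors within a chosen attribute subset. The function names `k_nearest` and `deviation_per_attribute` and the toy data are hypothetical.

```python
import math


def k_nearest(data, idx, k, dims):
    """Return the indices of the k objects closest to data[idx],
    using Euclidean distance restricted to the attributes in dims."""
    def dist(j):
        return math.sqrt(sum((data[idx][d] - data[j][d]) ** 2 for d in dims))

    others = [j for j in range(len(data)) if j != idx]
    return sorted(others, key=dist)[:k]


def deviation_per_attribute(data, idx, k, dims):
    """For each attribute in dims, report how many neighborhood standard
    deviations data[idx] lies away from the mean of its k-neighborhood.
    Assumption: a z-score stands in for the deviation measure."""
    neighbors = k_nearest(data, idx, k, dims)
    scores = {}
    for d in dims:
        values = [data[j][d] for j in neighbors]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        diff = data[idx][d] - mean
        if std > 0:
            scores[d] = diff / std
        else:
            # All neighbors agree on this attribute: any difference is extreme.
            scores[d] = 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return scores


# Hypothetical toy data: the last object is unremarkable in attribute 0
# but far away from its neighbors in attribute 1.
data = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (1.0, 5.0)]
scores = deviation_per_attribute(data, idx=4, k=3, dims=(0, 1))
for attribute, score in scores.items():
    print(f"attribute {attribute}: deviation {score:.2f}")
```

In this toy example the score for attribute 1 is large while the score for attribute 0 stays close to zero, which is the kind of per-attribute contrast a description of an outlier would point to.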

Screenshots

The screenshots below illustrate how the OutRules framework is used to explore outlier rules.

Main screen for outlier mining with different outlier mining algorithms. Outlier rules can be generated for the resulting outliers. The outlier explanation functionality must be activated; settings can be provided via "Edit".


Settings screen for the OutRules algorithm, with OutRules-specific settings and the option to activate the precomputation of explanations for all outliers found by the upstream outlier mining algorithm.


The outlier explanations are available in the OutlierRanking-Plot, which is accessible by right-clicking on a result set of an outlier mining algorithm and choosing "Visualization" and then "Show OutlierRanking-Plot".


For every outlier in the outlier ranking (upper window), a set of outlier rules with quality measures is provided (lower window). Please click on the screenshot for a detailed view.


Further information about every outlier rule is available by double-clicking on a rule in the set of outlier rules (see previous screenshot). The information provided focuses on the deviation of the outlier from other objects. Further insights are provided by the visualization with parallel coordinates plots. Please click on the screenshot for a more detailed view.

Resources

Executables and a test data set (thyroid data set from UCI Machine Learning Repository) for outlier rule exploration: OutRules.zip
For testing, we recommend the OutRank_Proclus algorithm as the outlier mining algorithm for low runtime, or the OutRulesRank algorithm for best results.

Access to this website & Citation Information

We encourage researchers in this area to use the proposed framework in their own publications as a basis for comparison and evaluation. Our implementations are available for anyone to use.

If you publish material based on databases, algorithms or evaluation measures obtained from this repository, then please note the assistance you received by using this repository. This will help others to obtain the same data sets, algorithms and evaluation measures and replicate your experiments.

We suggest the following reference format for referring to this project:
 
Müller E., Keller F., Blanc S., Böhm K.:
OutRules:
A Framework for Outlier Descriptions in Multiple Context Spaces
http://www.ipd.kit.edu/~muellere/OutRules/
In Proc. ECML PKDD 2012