Emmanuel Müller, Fabian Keller, Sebastian Blanc, and Klemens Böhm
OutRules
Outlier rules make outlying objects in databases interpretable by describing them. Each rule answers the question of how the outlier deviates from what is considered normal. To enable the exploitation of these outlier rules, we provide OutRules. OutRules is based on the WEKA framework and extends our previous toolkit SOREX with extraction techniques for outlier descriptions. In particular, it facilitates the generation of outlier rules for outliers in an outlier ranking. Such rankings can, for example, be produced by the various (subspace) outlier mining algorithms included in the framework. This preliminary version of the framework includes first visualization techniques. For example, the generated outlier rules can be visualized by a quantitative analysis of the deviation or by a graphical plot that uses parallel coordinates for the k-neighborhood of the outlier to clarify its characteristics.
Screenshots
The OutRules framework for outlier rule exploitation is explained via the screenshots below.
Main screen for outlier mining using different outlier mining algorithms. Outlier rules can be generated for the resulting outliers. The outlier explanation functionality must be activated; settings can be provided via "Edit".
Settings screen for the OutRules algorithm with OutRules-specific settings and the option to activate the precomputation of explanations for all outliers found by the upstream outlier mining algorithm.
The outlier explanations are available in the OutlierRanking-Plot, which is accessible by right-clicking on a result set of an outlier mining algorithm and choosing "Visualization" and then "Show OutlierRanking-Plot".
For every outlier in the outlier ranking (upper window), a set of outlier rules with quality measures is provided (lower window).
Further information about every outlier rule is provided by double-clicking on a rule in the set of outlier rules (see previous screenshot). The provided information focuses on the deviation of the outlier from other objects. Further insights are provided by the visualization with parallel coordinates plots.
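To make the idea of a quantitative deviation analysis over the k-neighborhood more concrete, the following minimal Python sketch computes, for one object, how far each attribute lies from the values of its k nearest neighbors. It is an illustration only and not the OutRules implementation: the function names `k_nearest` and `attribute_deviations`, the choice of Euclidean distance, the z-score style measure, and the toy data are all assumptions made for this example.

```python
import math


def k_nearest(data, idx, k):
    """Return the indices of the k objects closest to data[idx] (Euclidean distance)."""
    target = data[idx]
    dists = sorted(
        (math.dist(target, row), i) for i, row in enumerate(data) if i != idx
    )
    return [i for _, i in dists[:k]]


def attribute_deviations(data, idx, k):
    """Per-attribute deviation of data[idx] from its k-neighborhood.

    Illustrative measure (an assumption, not the OutRules quality measure):
    (value - neighborhood mean) / neighborhood standard deviation.
    """
    neighbors = [data[i] for i in k_nearest(data, idx, k)]
    deviations = []
    for a, value in enumerate(data[idx]):
        column = [row[a] for row in neighbors]
        mean = sum(column) / len(column)
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
        diff = value - mean
        if std > 0:
            deviations.append(diff / std)
        else:
            # All neighbors share one value: any difference is unbounded.
            deviations.append(0.0 if diff == 0 else math.copysign(math.inf, diff))
    return deviations


# Toy data (made up): the last object deviates only in the second attribute.
sample = [
    [1.00, 10.0],
    [1.10, 10.2],
    [0.90, 9.8],
    [1.00, 10.1],
    [1.05, 25.0],
]

deviations = attribute_deviations(sample, idx=4, k=3)
# Index of the attribute with the largest absolute deviation.
most_deviating = max(range(len(deviations)), key=lambda a: abs(deviations[a]))
print(most_deviating, deviations)
```

In this sketch, the attribute with the largest absolute deviation is the one a description of the outlier would most plausibly highlight. The same per-attribute values are what a parallel coordinates plot of the outlier and its neighbors would show visually.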
Resources
Executables and a test data set (the thyroid data set from the UCI Machine Learning Repository) for outlier rule exploration: OutRules.zip
Access to this website & Citation Information
We encourage researchers in this area to use the proposed framework as a basis for comparison and evaluation in their own publications. Our implementation is available for anyone to use.