RapidMiner Contributions

On this website we provide code extensions for RapidMiner, a leading tool for data mining. Over the last few years, RapidMiner has become our tool of choice for data mining, so we decided to give something back to the community and have contributed a variety of algorithms. Most of the software was developed as part of the BMBF-funded project PaREn (Pattern Recognition Engineering).

The following list gives a brief overview of the extensions, the code already integrated and our experimental contributions. Many parts have been presented at RCOMM, the RapidMiner Community Meeting and Conference.


RapidMiner Extensions

Anomaly Detection Extension

The Anomaly Detection extension is the first approach to use RapidMiner for unsupervised anomaly detection. It currently comes with a number of the most well-known unsupervised anomaly detection algorithms. The extension analyzes a dataset and computes an anomaly score for every example in an ExampleSet. The score can be used either for detecting outliers (e.g. in fraud detection or medical applications) or for removing outliers as a preprocessing step before training classifiers. More detailed information is available on its website: Anomaly Detection Extension
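To make the idea of a per-example anomaly score concrete, here is a minimal Python sketch. It assumes one simple scoring rule, the mean distance of each example to its k nearest neighbors; it is an illustration only and not the extension's code. The function name knn_anomaly_scores and the sample data are hypothetical.

```python
import math

def knn_anomaly_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbors.

    Higher scores mean the point lies farther from the rest of the data.
    """
    if not 0 < k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Four points in a tight cluster and one far away (made-up data).
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_anomaly_scores(data, k=2)
outlier_index = max(range(len(data)), key=scores.__getitem__)
print(outlier_index, scores)
```

The scores can then be thresholded to flag outliers, or the highest-scoring examples can be dropped before training a classifier.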




PaREn Automatic System Construction Wizard

The PaREn Automatic System Construction Wizard is a tool for supporting you in constructing a classification process within RapidMiner. For a given data set, it automatically recommends and constructs a classification process based on certain characteristics of the data set. More info on this webpage:

RapidMiner Extension: PaREn Automatic System Construction Wizard

It also contains the Landmarking Operator for extracting features from data sets used for Meta-Learning. More details can be found in our publication Landmarking for Meta-Learning using RapidMiner.

Contact: Matthias.Reif@dfki.de

RapidMiner Contributions

The code in this section has been integrated into RapidMiner and is available in the latest version.

X-Means Clustering and k-means++

Integrated in RapidMiner since 5.3.x



AutoMLP

AutoMLP is a simple algorithm for both learning rate and size adjustment of neural networks during training. The algorithm combines ideas from genetic algorithms and stochastic optimization. It maintains a small ensemble of networks that are trained in parallel with different rates and different numbers of hidden units. After a small, fixed number of epochs, the error rate is determined on a validation set and the worst performers are replaced with copies of the best networks, modified to have different numbers of hidden units and learning rates. Hidden unit numbers and learning rates are drawn according to probability distributions derived from successful rates and sizes.
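The selection loop described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the AutoMLP implementation: the models are plain dictionaries, the training and validation functions are toy stand-ins rather than a neural network, and the log-normal perturbation of the best models' learning rates and sizes is only one possible way to draw new values from successful ones. All names (automlp_style_search, make_model, train, val_error) are hypothetical.

```python
import copy
import math
import random

def automlp_style_search(make_model, train, val_error, n_models=4,
                         generations=10, epochs_per_round=2, seed=0):
    """Train a small ensemble and repeatedly replace the worst with varied copies of the best."""
    rng = random.Random(seed)
    # Start with different learning rates and hidden-layer sizes.
    population = [make_model(lr=10 ** rng.uniform(-3, 0), hidden=rng.randint(2, 32))
                  for _ in range(n_models)]
    for generation in range(generations):
        for model in population:
            train(model, epochs_per_round)      # a small, fixed number of epochs
        population.sort(key=val_error)          # best performer first
        if generation == generations - 1:
            break                               # keep the final, fully trained ranking
        half = n_models // 2
        for i in range(half):
            parent = population[i]
            child = copy.deepcopy(parent)
            # Assumption: perturb the successful values log-normally.
            child["lr"] = parent["lr"] * math.exp(rng.gauss(0, 0.3))
            child["hidden"] = max(1, round(parent["hidden"] * math.exp(rng.gauss(0, 0.3))))
            population[n_models - 1 - i] = child  # overwrite a worst performer
    return population[0]

# Toy stand-ins so the sketch runs; a real setup would train actual networks.
def make_model(lr, hidden):
    return {"lr": lr, "hidden": hidden, "epochs": 0}

def train(model, epochs):
    model["epochs"] += epochs

def val_error(model):
    # Made-up surrogate error: lowest near lr = 0.1 and 8 hidden units.
    mismatch = (math.log10(model["lr"]) + 1) ** 2 + (math.log2(model["hidden"]) - 3) ** 2
    return mismatch + 1.0 / (1 + model["epochs"])

best = automlp_style_search(make_model, train, val_error)
print(best)
```

With a real network, changing the number of hidden units in a copied model also requires adding or removing weights, which this sketch leaves out.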

You can find more information and the download link here: AutoMLP Website

More details are also in the following publications:

Pattern Recognition Engineering
AutoMLP: Simple, Effective, Fully Automated Learning Rate and Size Adjustment

Contact: Faisal.Shafait@dfki.de

Fast k-Means

The Fast k-Means operator is an implementation of the k-Means algorithm according to Charles Elkan, which in many cases is much faster than the standard implementation.

You can find more information and the download link here: Fast k-Means Website

More details are also in the following (external) publication: Using the Triangle Inequality to Accelerate k-Means.
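The core idea behind the speed-up can be illustrated with a short Python sketch. It shows only the basic triangle-inequality test in the assignment step: if the distance between the current best center and another center is at least twice the distance from the point to the best center, the other center cannot be closer, so its distance is never computed. The full algorithm additionally maintains upper and lower bounds across iterations, which this simplified sketch omits. The function name assign_with_pruning and the sample data are hypothetical, and this is not the operator's code.

```python
import math

def assign_with_pruning(points, centers):
    """Assign each point to its nearest center, skipping centers that provably cannot win."""
    k = len(centers)
    # Center-to-center distances, computed once per assignment step.
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    labels, skipped = [], 0
    for x in points:
        best, best_d = 0, math.dist(x, centers[0])
        for j in range(1, k):
            # Triangle inequality: d(c_best, c_j) >= 2 * d(x, c_best)
            # implies d(x, c_j) >= d(x, c_best), so c_j is not closer.
            if cc[best][j] >= 2 * best_d:
                skipped += 1
                continue
            d = math.dist(x, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped

# Made-up data: two tight groups and three candidate centers.
points = [(0.0, 0.0), (0.2, 0.1), (10.0, 10.0), (9.8, 10.1)]
centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
labels, skipped = assign_with_pruning(points, centers)
print(labels, skipped)
```

The assignments are identical to those of a brute-force nearest-center search; only the number of distance computations changes.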

Contact: Christian.Kofler@dfki.de

Experimental Section

Distributed Pattern Recognition

DisPaRe is a framework for processing RapidMiner operations in a distributed environment. You can find more information in this publication:

Distributed Pattern Recognition in RapidMiner

The DisPaRe framework is the result of the diploma thesis by Alexander Arimond.
You can find all details here: diploma thesis by Alexander Arimond

Finally, the plain code of the system is here: DisPaRe code

Please note that we consider the software to be in alpha status. Since we no longer work on this project, we cannot provide any support.

Contact: Christian.Kofler@dfki.de


Image Mining

This extension is intended to make working with images possible in RapidMiner. This includes handling image collections, applying transformations to these images, and extracting certain features for further data mining tasks.

You can find more information and the download link here:

Image Mining Website

Contact: Christian.Kofler@dfki.de

     
Last modified: 16.04.2015