RapidMiner Extension: PaREn Automatic System Construction Wizard

The PaREn Automatic System Construction Wizard helps you construct a classification process within RapidMiner. For a given data set, it recommends and automatically constructs a classification process based on characteristics of that data set.

More precisely, the Wizard follows an approach called Meta-Learning. Based on the Meta-Features and classification performances of 90 UCI data sets, it predicts the accuracies that certain classifiers will achieve on a new data set.
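The idea behind Meta-Learning can be illustrated with a minimal sketch. It assumes three simple Meta-Features (number of examples, attributes, and classes) and a nearest-neighbour predictor. Both are assumptions made for illustration; the features and the prediction model the Wizard actually uses are described in the publications, not here. All numbers and names (`KNOWN_DATASETS`, `predict_accuracy`) are hypothetical.

```python
import math

# Hypothetical meta-knowledge base: one entry per known data set, holding
# its meta-features and the accuracy one classifier achieved on it.
# All numbers are made up for illustration.
KNOWN_DATASETS = [
    # (n_examples, n_attributes, n_classes), measured accuracy
    ((150, 4, 3), 0.95),
    ((1000, 20, 2), 0.74),
    ((200, 60, 2), 0.81),
    ((5000, 10, 5), 0.67),
]


def scale(features):
    """Log-scale the counts so that large data sets do not dominate distances."""
    return [math.log10(value) for value in features]


def predict_accuracy(new_features, knowledge_base, k=2):
    """Predict accuracy as the mean accuracy of the k most similar data sets."""
    target = scale(new_features)
    by_distance = sorted(
        knowledge_base,
        key=lambda entry: math.dist(target, scale(entry[0])),
    )
    nearest = by_distance[:k]
    return sum(accuracy for _, accuracy in nearest) / len(nearest)


predicted = predict_accuracy((180, 5, 3), KNOWN_DATASETS)
print(f"predicted accuracy: {predicted:.2f}")
```

Repeating this prediction once per classifier yields one predicted accuracy per classifier, which is the kind of information needed to rank them for a new data set.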

You can find more details in the following publications:

Pattern Recognition Engineering
Landmarking for Meta-Learning using RapidMiner

How to install the Wizard?

There are two ways to install the Wizard:

  • The easiest way is to install it through the RapidMiner update mechanism. Click Help ⇒ Update RapidMiner in the menu bar and select the PaREn Automatic System Construction Wizard for installation. After the automatic installation and a restart of RapidMiner, the Wizard is available in RapidMiner.
  • You can also download and install the Wizard manually. Download the archive, extract it, and copy the file “Wizard.jar” into the lib/plugins folder of your RapidMiner installation. You can download the software (including source code) here: PaREn Wizard (zip).
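The manual installation can also be scripted. The following is a minimal sketch, not an official installer: the function name `install_wizard` and the example paths are hypothetical, and it assumes only what is stated above, namely that the archive contains a file named “Wizard.jar” and that your RapidMiner installation has a lib/plugins folder.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_wizard(archive_path, rapidminer_home):
    """Extract Wizard.jar from the archive and copy it into lib/plugins."""
    plugins_dir = Path(rapidminer_home) / "lib" / "plugins"
    if not plugins_dir.is_dir():
        raise FileNotFoundError(f"no plugins folder at {plugins_dir}")

    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(tmp)
        # The jar may sit in a subfolder of the archive, so search for it.
        matches = list(Path(tmp).rglob("Wizard.jar"))
        if not matches:
            raise FileNotFoundError("Wizard.jar not found in archive")
        target = plugins_dir / "Wizard.jar"
        shutil.copy2(matches[0], target)
    return target
```

A call might look like `install_wizard("paren_wizard.zip", "/opt/rapidminer")`, where both arguments are placeholders for your own download location and installation folder. Restart RapidMiner afterwards so that the plugin is picked up.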

How to use the Wizard?

Using the Wizard is straightforward. The steps for constructing your classification process are explained in detail below:

Step 1:

First, start the Wizard by clicking on Tools ⇒ Automatic System Construction in the menu bar.

The Wizard opens, and you can select the data set for which you want to construct a classification process.

Step 2:

After you select your data set and click Next, the Wizard extracts Meta-Features from it and predicts accuracies for certain classifiers. This may take a while, depending on the size of your data set. When finished, the Wizard presents the classifiers ranked by their predicted accuracies, each with a Root Mean Squared Error (RMSE) value. A small RMSE value indicates that the Wizard is confident about its prediction.
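To make the ranking and the RMSE value more concrete, here is a small sketch. It assumes that the RMSE is computed by comparing accuracies predicted for reference data sets with the accuracies actually measured on them; how the Wizard derives its RMSE internally is not specified here. The classifier names and all numbers are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical figures: the accuracy predicted for the new data set, plus
# predicted and measured accuracies on reference data sets (assumed inputs).
CANDIDATES = {
    "Naive Bayes": {
        "predicted": 0.82,
        "past_predicted": [0.80, 0.70, 0.90],
        "past_actual": [0.78, 0.73, 0.91],
    },
    "Decision Tree": {
        "predicted": 0.88,
        "past_predicted": [0.85, 0.75, 0.95],
        "past_actual": [0.70, 0.90, 0.80],
    },
    "k-NN": {
        "predicted": 0.79,
        "past_predicted": [0.75, 0.72, 0.88],
        "past_actual": [0.74, 0.70, 0.90],
    },
}


def rank(candidates):
    """Return (name, predicted accuracy, RMSE) rows, best prediction first."""
    rows = [
        (name, info["predicted"], rmse(info["past_predicted"], info["past_actual"]))
        for name, info in candidates.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, accuracy, error in rank(CANDIDATES):
    print(f"{name:15s} predicted accuracy {accuracy:.2f}  RMSE {error:.3f}")
```

In this made-up example the classifier with the highest predicted accuracy also has the largest RMSE, which is exactly the situation in which it pays to look at both columns before choosing what to evaluate.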

You can now select which classifiers to evaluate in more detail in order to find an optimal parameterization for your data set. Note that the evaluation usually takes some time (as it would if you did it manually). The classifier ranking helps you pick the most promising classifiers and thereby reduce evaluation time substantially. Select the classifiers via their check boxes and click Next.
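What "evaluating a classifier to find an optimal parameterization" means can be sketched with a toy example. The sketch assumes a k-nearest-neighbour classifier, a leave-one-out evaluation, and a fixed list of candidate values for k. These are illustrative choices only and are not taken from the Wizard, whose search strategy is not described here; the data is made up.

```python
import math
from collections import Counter


def knn_predict(train, point, k):
    """Classify point by majority vote among its k nearest training examples."""
    nearest = sorted(train, key=lambda example: math.dist(example[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Fraction of examples classified correctly when each is held out in turn."""
    correct = 0
    for i, (point, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if knn_predict(train, point, k) == label:
            correct += 1
    return correct / len(data)


def best_parameter(data, candidates):
    """Return (k, accuracy) for the candidate k with the highest accuracy."""
    scored = [(k, leave_one_out_accuracy(data, k)) for k in candidates]
    return max(scored, key=lambda pair: pair[1])


# Toy data, made up for illustration: two well separated clusters.
DATA = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"), ((1.1, 1.3), "a"),
    ((4.0, 4.0), "b"), ((4.2, 3.9), "b"), ((3.8, 4.1), "b"), ((4.1, 4.3), "b"),
]

k, accuracy = best_parameter(DATA, [1, 3, 5])
print(f"best k = {k}, accuracy = {accuracy:.2f}")
```

Every candidate value requires a full evaluation run, which is why restricting the evaluation to a few promising classifiers saves time.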

Step 3:

As noted above, evaluating the classifiers may take some time. You can cancel the evaluation of a classifier by clicking the cancel button. Once the evaluation has finished, the actual accuracy of each previously selected classifier is shown. You can now decide for which classifier you want to construct your classification process. By this measure, the best classifier is the one with the highest accuracy value. Check its box in the Load column and click Finish.

The Process

The resulting process is a classification process that consists of a source operator for your data set, the preprocessing steps required by your classifier, and the classifier itself. Running the process produces a classification model, which you can then apply to new data.

     
Last modified: 23.09.2010