The Evaluate Tab displays all the options available for evaluating the performance of our models, and for deploying the model over new datasets.
The different types of evaluation are presented as a series of radio buttons, so that only a single evaluation type can be chosen at any time. Each type of evaluation is covered in the following sections of this chapter.
Below the row of evaluation types is a row of check buttons for choosing the models we wish to evaluate. These check buttons only become sensitive once the corresponding model has been built, so on a fresh start of Rattle no model can be checked. As models are built, their check buttons become sensitive, and when we move from the Model tab to this Evaluate tab the most recently built model is automatically checked. This matches a common pattern of behaviour: we build and tune a model, then move to the Evaluate tab to explore its performance. The check status of the other model check buttons does not otherwise change, except by our clicking them.
To evaluate a model we need to identify a dataset on which to perform the evaluation.
The first option (though not the best one) is to evaluate our model on the training dataset. This is generally not a good idea, and an information dialogue will be displayed each time we perform an evaluation on the training dataset. The output of any such evaluation will also highlight this fact. The problem is that we have built our model on this training dataset, so the model will often perform very well on it! It should, because we have tried hard to make sure it does. But this gives us little idea of how well the model will perform in general, on previously unseen data.
For a better guide to how well the model will perform in general, that is, on new and previously unseen data, we need to apply the model to such data and obtain an error rate. This error rate, rather than the error rate from the training dataset, gives a more realistic estimate of how well the model will perform.
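To illustrate the difference, here is a minimal sketch in R, using the rpart package (which Rattle itself uses for decision trees) and R's standard iris data standing in for our own dataset. It builds a tree on a 70% training sample and compares the error rate on the training data with the error rate on the 30% held aside for testing. The test error is typically the higher, and the more honest, of the two.

```r
library(rpart)

set.seed(42)

## The iris data stands in for our own dataset here.
ds <- iris

## Hold out 30% of the data for testing; build the model on the other 70%.
train.idx <- sample(nrow(ds), round(0.7 * nrow(ds)))
ds.train  <- ds[train.idx, ]
ds.test   <- ds[-train.idx, ]

## A decision tree built on the training data only.
model <- rpart(Species ~ ., data = ds.train)

## Error rate on the training data (optimistic).
train.err <- mean(predict(model, ds.train, type = "class") != ds.train$Species)

## Error rate on the previously unseen test data (more realistic).
test.err <- mean(predict(model, ds.test, type = "class") != ds.test$Species)

cat(sprintf("Training error: %.1f%%   Test error: %.1f%%\n",
            100 * train.err, 100 * test.err))
```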
We discussed the concept of a training dataset in Section , where we presented the Sample tab. The Sample tab provides a simple but effective mechanism for setting aside a portion of the dataset from training, to be used explicitly as the testing dataset. As indicated there, the default in Rattle is to use 70% of the dataset for training and 30% for testing.
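In plain R, that default partition amounts to something like the following sketch (the dataset name ds and the use of mtcars are placeholders for illustration, not Rattle's own code):

```r
ds <- mtcars              # whatever dataset we are working with

set.seed(42)              # so the same partition can be reproduced

## 70% of the rows, chosen at random, form the training dataset;
## the remaining 30% are held aside as the testing dataset.
train.idx <- sample(nrow(ds), round(0.7 * nrow(ds)))
ds.train  <- ds[train.idx, ]
ds.test   <- ds[-train.idx, ]
```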
The final piece of information displayed in the common area of the Evaluate tab is the Risk Variable. The concept of the Risk Variable has been discussed in Section . It is used as a measure of how significant each case is, with a typical example recording the dollar value of the fraud related to the case. The Risk Chart makes use of this variable if there is one, and it is included in the common area of the Evaluate tab for information purposes only.
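As a rough illustration of how such a variable enters an evaluation, the following sketch works with made-up case scores and dollar-valued risk (none of this is Rattle's own code). Cases are ordered from most to least suspicious according to the model's score, and we report how much of the total dollar risk is covered by the top-scored portion of the caseload, which is essentially what a Risk Chart plots across all caseloads.

```r
set.seed(42)

## Hypothetical evaluation data: a model score per case, and the dollar
## value of the fraud (the risk variable) for cases that are fraudulent.
n     <- 1000
score <- runif(n)                                # model's predicted probability
fraud <- rbinom(n, 1, score * 0.3)               # actual outcome, loosely tied to score
risk  <- ifelse(fraud == 1, rexp(n, 1/5000), 0)  # dollar value for fraudulent cases

## Order cases as if investigating the most suspicious ones first.
ord <- order(score, decreasing = TRUE)

## Proportion of the total dollar risk covered within the top 20% of cases.
top <- ord[seq_len(round(0.2 * n))]
cat(sprintf("Top 20%% of cases cover %.0f%% of the total risk\n",
            100 * sum(risk[top]) / sum(risk)))
```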