DATA MINING
Desktop Survival Guide by Graham Williams
Decision tree induction algorithms are generally what are called greedy algorithms. They are greedy in that they choose the best question to ask at each step and do not reconsider that choice later on. The algorithms are also divide and conquer algorithms, in that each question partitions the dataset into smaller subsets. In fact, the general algorithm continually partitions the dataset into smaller subsets until each subset contains only a single value of the output variable.
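As a small illustration (not from the book itself), the rpart package in R performs exactly this kind of greedy recursive partitioning; the iris data is used here purely as a convenient example:

  library(rpart)

  # rpart greedily chooses one splitting question at a time and then
  # recursively partitions each resulting subset (divide and conquer).
  model <- rpart(Species ~ ., data=iris)

  # The printed tree shows the sequence of questions and the
  # increasingly pure subsets at each node.
  print(model)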
Note that pruning is a mechanism for reducing the variance of the resulting models. However, for large datasets the reduction in variance is usually of little benefit, and so unpruned trees may actually perform better.
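A minimal sketch of the comparison, again using rpart (the cp value of 0.01 below is chosen only for illustration): growing a large, effectively unpruned tree by setting the complexity parameter to zero, then pruning it back.

  library(rpart)

  # Grow a large, essentially unpruned tree.
  full <- rpart(Species ~ ., data=iris,
                control=rpart.control(cp=0, minsplit=2))

  # The cross-validated error in the cp table indicates how much
  # pruning actually helps on this dataset.
  printcp(full)

  # Prune back at an illustrative complexity parameter.
  pruned <- prune(full, cp=0.01)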