DATA MINING
Desktop Survival Guide by Graham Williams
The example data for contact lenses comes from Cendrowska (1987) and is available from the machine learning repository at ftp://ftp.ics.uci.edu/pub/machine-learning-databases/lenses/lenses.data.
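A problem with naïve Bayes arises when the training database has no examples of a particular value of a variable for a particular class. The estimated conditional probability of that value given the class is then zero, and because naïve Bayes multiplies the conditional probabilities together, the score for that class becomes zero whatever the other variables indicate.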
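The effect can be seen in a few lines of Python. This is a minimal sketch rather than the implementation used elsewhere in this guide: the toy rows, the variable names and the functions `fit_counts` and `score` are all hypothetical, and the `alpha` parameter (additive, or "add-one", smoothing) is a commonly used remedy that is assumed here, not one described in the text above.

```python
from collections import Counter, defaultdict


def fit_counts(rows, labels):
    """Count classes and (variable index, value) pairs within each class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # class -> Counter of (index, value)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][(i, value)] += 1
    return class_counts, value_counts


def score(row, label, class_counts, value_counts, n_values, alpha=0.0):
    """Unnormalised naive Bayes score: P(class) * product of P(value | class).

    alpha=0 uses raw relative frequencies; alpha=1 is add-one smoothing
    (an assumed remedy). n_values[i] is the number of distinct values
    that variable i can take.
    """
    total = sum(class_counts.values())
    result = class_counts[label] / total
    for i, value in enumerate(row):
        count = value_counts[label][(i, value)]
        result *= (count + alpha) / (class_counts[label] + alpha * n_values[i])
    return result


# Hypothetical toy data: each row is (astigmatic, tear_rate).
rows = [("no", "normal"), ("no", "normal"), ("yes", "normal"), ("yes", "reduced")]
labels = ["soft", "soft", "hard", "none"]
class_counts, value_counts = fit_counts(rows, labels)
n_values = [2, 2]  # both variables are binary in this toy example

# "soft" never occurs with astigmatic == "yes", so the raw score collapses to zero.
query = ("yes", "normal")
raw = score(query, "soft", class_counts, value_counts, n_values)
smoothed = score(query, "soft", class_counts, value_counts, n_values, alpha=1.0)
print(raw, smoothed)
```

With `alpha=0` the single unseen combination rules out the class entirely, even though the other variable supports it; with `alpha=1` the class keeps a small non-zero score.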
Bayesian networks relax the conditional independence assumption by identifying conditional independence among subsets of variables.
Kohavi (1996) addressed the problem of independence by combining naïve Bayes with decision trees. The decision tree is used to partition the database, and for each resulting partition (each corresponding to a separate path through the decision tree) a naïve Bayes classifier is built using only the variables that do not appear on that path. Whilst some improvement in accuracy can result, the final knowledge structures tend to be less compact, since similar structures are replicated across partitions. Nonetheless this may be a useful approach for very large databases.
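The partition-then-classify idea can be illustrated with a deliberately simplified Python sketch. It is not the published algorithm: the "tree" here is a single split on a variable chosen by hand, whereas a real decision tree would choose its splits from the data. The function names, the add-one smoothing and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_nb(rows, labels, variables):
    """Fit a naive Bayes model over the given variable indices only."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # class -> Counter of (index, value)
    # Values of each variable observed in this partition.
    domains = {i: {row[i] for row in rows} for i in variables}
    for row, label in zip(rows, labels):
        for i in variables:
            value_counts[label][(i, row[i])] += 1
    return class_counts, value_counts, domains


def predict_nb(model, row):
    """Return the class with the highest add-one smoothed score (assumed smoothing)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        s = n / total
        for i, domain in domains.items():
            s *= (value_counts[label][(i, row[i])] + 1) / (n + len(domain))
        if s > best_score:
            best_label, best_score = label, s
    return best_label


def fit_hybrid(rows, labels, split_var):
    """Partition on split_var (a depth-one stand-in for a tree), then fit one
    naive Bayes model per partition using the variables not on the path."""
    others = [i for i in range(len(rows[0])) if i != split_var]
    parts = defaultdict(lambda: ([], []))
    for row, label in zip(rows, labels):
        parts[row[split_var]][0].append(row)
        parts[row[split_var]][1].append(label)
    return {value: fit_nb(r, l, others) for value, (r, l) in parts.items()}


def predict_hybrid(models, row, split_var):
    """Follow the path for this row, then ask that partition's naive Bayes model."""
    return predict_nb(models[row[split_var]], row)


# Hypothetical toy data where the class depends on an interaction between
# the two variables, which a single naive Bayes model cannot represent.
rows = [("0", "0"), ("0", "1"), ("1", "0"), ("1", "1")]
labels = ["same", "diff", "diff", "same"]
models = fit_hybrid(rows, labels, split_var=0)
predictions = [predict_hybrid(models, row, 0) for row in rows]
print(predictions)
```

Each partition carries its own set of counts, which is the replication of structure noted above: the more paths the tree has, the more separate naïve Bayes models must be stored.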