DATA MINING
Desktop Survival Guide by Graham Williams
Part I constitutes a complete guide to using Rattle for data mining.
In Chapter 2 we introduce Rattle as a graphical user interface (GUI) developed for making any data mining project a lot simpler. This covers the installation of both R and Rattle, as well as basic interaction with Rattle.
The chapters that follow then detail the steps of the data mining process, corresponding to the straightforward interface presented through Rattle. We first describe how to get data into Rattle, how to select variables, and how to perform sampling. A further chapter then reviews various approaches to exploring the data, so that we can gain insight into the data we are looking at, understand its distribution, and assess the appropriateness of any modelling.
Several chapters then cover modelling, including descriptive and predictive modelling, and text mining. The evaluation of the performance of the models and their deployment are covered in a chapter of their own.
A further chapter provides an introduction to migrating from Rattle to the underlying R system. It does not attempt to cover all aspects of interacting with R, but is sufficient for a competent programmer or software engineer to extend and further fine-tune the modelling performed in Rattle. A separate chapter covers troubleshooting within Rattle.
Part II delves much deeper into the use of R for data mining. In particular, R is introduced as a programming language for data mining. Its chapters introduce the basic environment of R, cover data and data types, and introduce R's extensive capabilities for producing stunning graphics. We then pull together the capabilities of R to help us understand data, before moving on, chapter by chapter, to preparing our data for data mining, building models, and evaluating our models.
A further part reviews the algorithms employed in data mining. This encyclopedic overview covers many tools and techniques deployed within data mining, ranging from decision tree induction and association rules to multivariate adaptive regression splines and patient rule induction methods. We also cover standards for sharing data and models.
We continue the Desktop Guide with a snapshot of some current alternative data mining products, first open source and then commercial, in the parts titled Open Source Products and Commercial Off The Shelf Products.
Copyright © 2004-2006 Graham.Williams@togaware.com