DATA MINING
Desktop Survival Guide
by
Graham Williams
Data Mining with Rattle
Subsections
Introduction
Installing GTK, R, and Rattle
The Initial Interface
Interacting with Rattle
Menus and Buttons
Project Menu and Buttons
Edit Menu
Tools Menu and Toolbar
Execute
Export
Settings
Help
Paradigms
Interacting with Plots
Data
Nomenclature
Loading Data
CSV File Option
ODBC Option
RData File Option
R Dataset Option
Variable Roles
Moving into R
Data
Help
Transform
Sampling Data
Moving into R
Impute
Zero/Missing
Mean/Median
Nearest Neighbours
Explore
Summary Option
Summary
Describe
Basics
Kurtosis
Skewness
Distributions Option
Box Plot
Histogram
Cumulative Distribution Plot
Benford's Law
Bar Plot
Dot Plot
GGobi Option
Correlation Option
Hierarchical Correlation
Principal Components
Moving into R
Single Variable Overviews
Textual Summaries
Stem and Leaf Plots
Histogram
Barplot
Density Plot
Basic Histogram
Basic Histogram with Density Curve
Practical Histogram
Correlation Plot
Colourful Correlations
Measuring Data Distributions
Textual Summaries
Boxplot
Multiple Boxplots
Boxplot by Class
Box and Whisker Plot
Box and Whisker Plot: With Means
Clustered Box Plot
Further Resources
A Model Building Framework
Unsupervised Modelling
Associate
Basket Analysis
General Rules
Moving into R
Two Class Models
Decision Trees
Boosting
Random Forests
Support Vector Machines
Logistic Regression
Moving into R
Multi Class Models
Regression Models
Text Mining
Evaluation
The Evaluate Tab
Confusion Matrix
Basics
Measures
Cross Validation
Graphical Measures
Issues
Overfitting
Imbalanced Decisions
Risk Charts
Lift
ROC Curves
Other Examples
10 Fold Cross Validation
Area Under Curve
Precision versus Recall
Sensitivity versus Specificity
Score Option
Moving into R
Calibration Curves
Moving into R
Internal Structures
Current Rattle State
Projects
The Rattle Log
Troubleshooting
A factor has new levels
Copyright © 2004-2006 Graham.Williams@togaware.com
Support further development through the purchase of the PDF version of the book.
Brought to you by Togaware.