DATA MINING
Desktop Survival Guide by Graham Williams
A correlation (http://en.wikipedia.org/wiki/correlation) measures how two variables are related and is useful for quantifying the association between them. A linear relationship between two variables means that as the value of one changes, the value of the other changes in proportion. A correlation plot shows the strength of any linear relationship between each pair of variables. The ellipse package provides the plotcorr function for this purpose.
The degree of correlation ranges from -1 to 1, with 1 being perfect positive correlation, -1 perfect negative correlation and 0 no linear correlation. The Pearson correlation coefficient is the usual statistic. R also supports Kendall's tau and Spearman's rho, rank-based measures of association that are regarded as more robust and are recommended unless the data come from a bivariate normal distribution. The cor function calculates the correlation between two numeric vectors, or the correlation matrix for the columns of a numeric matrix or data frame. A correlation matrix is always symmetric about the diagonal, and the diagonal consists of 1s (each variable is perfectly correlated with itself!).
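As a minimal sketch of the three statistics, the following computes each correlation matrix with cor and checks the symmetry and unit diagonal described above. It assumes the built-in mtcars data frame as stand-in numeric data (it is not the wine dataset), and the variable names d, pearson, kendall and spearman are chosen only for illustration.

```r
# mtcars ships with base R; used here only as stand-in numeric data
d <- mtcars[, c("mpg", "disp", "hp", "wt")]

pearson  <- cor(d)                        # default method is "pearson"
kendall  <- cor(d, method = "kendall")    # Kendall's tau
spearman <- cor(d, method = "spearman")   # Spearman's rho

# A correlation matrix is symmetric with 1s on the diagonal
print(isSymmetric(pearson))
print(all.equal(unname(diag(pearson)), rep(1, ncol(d))))

# cor also accepts two numeric vectors and returns a single coefficient
print(cor(d$mpg, d$wt, method = "spearman"))
```

Comparing the three matrices side by side is one way to see how much a rank-based measure differs from Pearson on a given dataset.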
The sample R code below generates the correlations for the variables in
the wine dataset (cor) and then orders the variables according to their
correlation with the first variable (Type, selected with [1,]). The rows
and columns of the correlation matrix are rearranged into that order, and
the ellipses are plotted with a colour fill taken from cm.colors.
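library(ellipse)                           # provides plotcorr
wine.corr <- cor(wine)                     # assumes the wine data frame is already loaded
ord <- order(wine.corr[1,])                # order variables by correlation with Type
xc <- wine.corr[ord, ord]                  # reorder both rows and columns
plotcorr(xc, col=cm.colors(11)[5*xc + 6])  # rescale [-1, 1] to palette indices 1..11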
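To see how the expression 5*xc + 6 selects a fill colour, the following sketch maps a few correlation values onto an 11-colour palette. The vector r holds made-up illustrative values, not results from the wine dataset, and the names idx, pal and cols are likewise only for illustration.

```r
# Hypothetical correlation values spanning the full range
r <- c(-1, -0.5, 0, 0.5, 1)

# 5*r + 6 rescales [-1, 1] linearly onto [1, 11]
idx <- 5 * r + 6

# cm.colors(11) is an 11-step cyan-to-magenta palette from grDevices
pal <- cm.colors(11)

# Fractional indices are truncated towards zero when subsetting,
# so an index of 3.5 selects the third colour
cols <- pal[idx]
print(data.frame(r = r, index = idx, colour = cols))
```

Strong negative correlations therefore land at the cyan end of the palette, strong positive correlations at the magenta end, and values near zero in the pale middle.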
The correlation matrix is:
> wine.corr
                       Type     Alcohol       Malic          Ash  Alcalinity
Type             1.00000000 -0.32822194  0.43777620 -0.049643221  0.51785911
Alcohol         -0.32822194  1.00000000  0.09439694  0.211544596 -0.31023514
Malic            0.43777620  0.09439694  1.00000000  0.164045470  0.28850040
Ash             -0.04964322  0.21154460  0.16404547  1.000000000  0.44336719
Alcalinity       0.51785911 -0.31023514  0.28850040  0.443367187  1.00000000
Magnesium       -0.20917939  0.27079823 -0.05457510  0.286586691 -0.08333309
Phenols         -0.71916334  0.28910112 -0.33516700  0.128979538 -0.32111332
Flavanoids      -0.84749754  0.23681493 -0.41100659  0.115077279 -0.35136986
Nonflavanoids    0.48910916 -0.15592947  0.29297713  0.186230446  0.36192172
Proanthocyanins -0.49912982  0.13669791 -0.22074619  0.009651935 -0.19732684
Color            0.26566757  0.54636420  0.24898534  0.258887259  0.01873198
Hue             -0.61736921 -0.07174720 -0.56129569 -0.074666889 -0.27395522
Dilution        -0.78822959  0.07234319 -0.36871043  0.003911231 -0.27676855
Proline         -0.63371678  0.64372004 -0.19201056  0.223626264 -0.44059693
                  Magnesium     Phenols Flavanoids Nonflavanoids
Type            -0.20917939 -0.71916334 -0.8474975     0.4891092
Alcohol          0.27079823  0.28910112  0.2368149    -0.1559295
Malic           -0.05457510 -0.33516700 -0.4110066     0.2929771
Ash              0.28658669  0.12897954  0.1150773     0.1862304
Alcalinity      -0.08333309 -0.32111332 -0.3513699     0.3619217
Magnesium        1.00000000  0.21440123  0.1957838    -0.2562940
Phenols          0.21440123  1.00000000  0.8645635    -0.4499353
Flavanoids       0.19578377  0.86456350  1.0000000    -0.5378996
Nonflavanoids   -0.25629405 -0.44993530 -0.5378996     1.0000000
Proanthocyanins  0.23644061  0.61241308  0.6526918    -0.3658451
Color            0.19995001 -0.05513642 -0.1723794     0.1390570
Hue              0.05539820  0.43368134  0.5434786    -0.2626396
Dilution         0.06600394  0.69994936  0.7871939    -0.5032696
Proline          0.39335085  0.49811488  0.4941931    -0.3113852
                Proanthocyanins       Color         Hue     Dilution    Proline
Type               -0.499129824  0.26566757 -0.61736921 -0.788229589 -0.6337168
Alcohol             0.136697912  0.54636420 -0.07174720  0.072343187  0.6437200
Malic              -0.220746187  0.24898534 -0.56129569 -0.368710428 -0.1920106
Ash                 0.009651935  0.25888726 -0.07466689  0.003911231  0.2236263
Alcalinity         -0.197326836  0.01873198 -0.27395522 -0.276768549 -0.4405969
Magnesium           0.236440610  0.19995001  0.05539820  0.066003936  0.3933508
Phenols             0.612413084 -0.05513642  0.43368134  0.699949365  0.4981149
Flavanoids          0.652691769 -0.17237940  0.54347857  0.787193902  0.4941931
Nonflavanoids      -0.365845099  0.13905701 -0.26263963 -0.503269596 -0.3113852
Proanthocyanins     1.000000000 -0.02524993  0.29554425  0.519067096  0.3304167
Color              -0.025249931  1.00000000 -0.52181319 -0.428814942  0.3161001
Hue                 0.295544253 -0.52181319  1.00000000  0.565468293  0.2361834
Dilution            0.519067096 -0.42881494  0.56546829  1.000000000  0.3127611
Proline             0.330416700  0.31610011  0.23618345  0.312761075  1.0000000
Copyright © 2004-2006 Graham.Williams@togaware.com