DATA MINING
Desktop Survival Guide by Graham Williams
A stem-and-leaf plot (http://en.wikipedia.org/wiki/Stem_and_leaf_diagram) is a simple textual plot of numeric data that is useful for getting an idea of the shape of a distribution. It is similar to the graphical histograms that we will see next, but it is a quick place to start for smaller datasets. A stem-and-leaf plot has the advantage of showing the actual data values in the plot rather than just a bar indicating frequency.
In reviewing a stem-and-leaf plot we might look to see whether there is a clear central value, or whether the data is very spread out. We look at the spread to see whether it is roughly symmetric about the central value or skewed in one particular direction. We might also look for any data values that lie a long way from the bulk of the data.
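The visual checks described above can be backed up with a few summary numbers. The following is a minimal sketch, not part of the original text: it uses the built-in precip dataset as a stand-in for a column such as wine$Magnesium (the wine data is not assumed to be loaded), and the variable names centre, spread, skew_hint and far_out are illustrative only.

```r
# Stand-in data: 'precip' ships with R's datasets package. Any numeric
# vector, such as wine$Magnesium, could be used in its place.
x <- as.numeric(precip)

# Textual view of the distribution.
stem(x)

# Central value and spread.
centre <- median(x)
spread <- IQR(x)             # width of the middle 50% of the values

# Crude skew indicator: a mean well above the median suggests a longer
# right tail, a mean well below it a longer left tail.
skew_hint <- mean(x) - median(x)

# Values flagged by the usual boxplot rule (more than 1.5 * IQR beyond
# the quartiles); these are candidates for a closer look, nothing more.
far_out <- boxplot.stats(x)$out

print(c(median = centre, IQR = spread, mean_minus_median = skew_hint))
print(far_out)
```

These numbers are only a rough companion to the plot; the stem-and-leaf display remains the quicker way to see the overall shape.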
    > stem(wine$Magnesium)

      The decimal point is 1 digit(s) to the right of the |

       7 | 0
       7 | 888
       8 | 0000012444
       8 | 55555566666666666777888888888888899999
       9 | 0000112222233444444
       9 | 55566666666777778888888889
      10 | 000111111111222222233333444
      10 | 55666677778888
      11 | 00011122222233
      11 | 5566678889
      12 | 0001234
      12 | 678
      13 | 24
      13 | 69
      14 |
      14 |
      15 | 1
      15 |
      16 | 2
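Note the change in where the decimal point is. For Magnesium it is one digit to the right of the |, so 7 | 0 represents the value 70. For Alcohol, in the next plot, it is one digit to the left of the |, so the stem 110 corresponds to 11.0. Each row of the Alcohol plot also covers two stems (110 and 111, for example), which is why the leaves within a row can start again from a low digit.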
    > stem(wine$Alcohol)

      The decimal point is 1 digit(s) to the left of the |

      110 | 3
      112 |
      114 | 1566
      116 | 1245669
      118 | 1224476
      120 | 000478888867
      122 | 01255599993346777777
      124 | 22235711238
      126 | 004790022779
      128 | 124556783369
      130 | 355555578116677
      132 | 034478902469
      134 | 0015889900126688
      136 | 2347891123345678
      138 | 23346678804
      140 | 266002369
      142 | 01223047889
      144 |
      146 | 5
      148 | 3
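The position of the decimal point and the number of rows both depend on the range of the data and on the scale argument of stem(). The sketch below illustrates this, again using the built-in precip dataset as a stand-in rather than the wine data; the helper stem_lines is a hypothetical convenience defined here, not part of R.

```r
# Stand-in data from R's datasets package.
x <- as.numeric(precip)

# Hypothetical helper: capture the printed plot as a character vector
# so that it can be inspected or compared.
stem_lines <- function(v, ...) capture.output(stem(v, ...))

default_plot  <- stem_lines(x)             # default scale = 1
expanded_plot <- stem_lines(x, scale = 2)  # asks for a longer plot
rescaled_plot <- stem_lines(x / 10)        # same shape, values ten times smaller

# The header line of each plot reports where the decimal point sits
# relative to the | separator.
print(grep("decimal point", default_plot,  value = TRUE))
print(grep("decimal point", rescaled_plot, value = TRUE))

# Compare how many lines each plot occupies.
print(c(default = length(default_plot), expanded = length(expanded_plot)))
```

Dividing the data by 10 leaves the shape of the plot unchanged but moves the reported decimal point, which is the same effect seen between the Magnesium and Alcohol plots above.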
Copyright © 2004-2006 Graham.Williams@togaware.com