DATA MINING
Desktop Survival Guide by Graham Williams
The most basic data structure is a simple vector, a list-like data structure that stores values which are all of the same data type or class. You can create a vector directly using the R function c (for combine), or have R generate a vector of random numbers for you using, for example, runif (which generates random numbers uniformly distributed between the supplied limits).
    > v <- c(1, 2, 3, 4, 5)
    > v
    [1] 1 2 3 4 5
    > class(v)
    [1] "numeric"
    > v <- runif(20, 0, 100)
    > v
     [1] 69.717291 98.491863 98.541503 72.558488 85.607629 35.441444 59.622427
     [8] 40.191194  8.311273 24.215177 77.378846 55.563735 71.554547 97.522348
    [15]  2.186403 52.528335 69.281037 44.634309  2.063750 47.125579
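Because every element of a vector shares one type, combining values of different types with c makes R coerce them all to a single common type. The following is a minimal sketch of that behaviour; the variable names are illustrative only.

```r
# Mixing types in c() coerces every element to one common type.
mixed_num <- c(1L, 2.5)        # integer + double    -> double
mixed_lgl <- c(TRUE, 2)        # logical + double    -> double (TRUE becomes 1)
mixed_chr <- c(1, "a", TRUE)   # anything + character -> character

class(mixed_num)   # "numeric"
class(mixed_lgl)   # "numeric"
class(mixed_chr)   # "character"
mixed_chr          # "1" "a" "TRUE"
```

The coercion is silent, so checking class(v) after building a vector from mixed inputs is a cheap way to catch an unintended conversion to character.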
The vector function will create a vector of a specific
mode (logical, by default):
    > vector(length=10)
     [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    > vector(mode="numeric", length=10)
     [1] 0 0 0 0 0 0 0 0 0 0
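As a further sketch, vector accepts other modes as well, filling the result with the empty value for that mode. The base R constructors numeric, character and logical do the same job more briefly. The variable names here are illustrative.

```r
# vector() fills the new vector with the "empty" value for the requested mode.
flags  <- vector(mode = "logical",   length = 3)   # FALSE FALSE FALSE
tags   <- vector(mode = "character", length = 3)   # "" "" ""
counts <- vector(mode = "integer",   length = 3)   # 0 0 0

# numeric(), character() and logical() are shorthand constructors.
identical(numeric(3), vector(mode = "numeric", length = 3))       # TRUE
identical(character(3), vector(mode = "character", length = 3))   # TRUE
```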
Various sequences of numbers can be generated to produce a vector
using the seq function:
    > seq(10)                 # [1] 1 2 3 4 5 6 7 8 9 10
    > seq(1, 10)              # [1] 1 2 3 4 5 6 7 8 9 10
    > seq(length=10)          # [1] 1 2 3 4 5 6 7 8 9 10
    > seq(2, 10, 2)           # [1] 2 4 6 8 10
    > seq(10, 2, -2)          # [1] 10 8 6 4 2
    > seq(length = 0)         # integer(0)
    > seq(0)                  # [1] 1 0
    > seq(0, 1, by=.1)        # [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
    > seq(0, 1, length=11)    # [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
    > 1:10                    # [1] 1 2 3 4 5 6 7 8 9 10
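Note that seq(0) counts down from 1 to 0 rather than returning an empty sequence, and 1:length(x) behaves the same way when x is empty. The base R functions seq_len and seq_along avoid this. A brief sketch, with an illustrative variable name:

```r
empty <- numeric(0)

# 1:length(x) counts down from 1 to 0 when x is empty.
1:length(empty)                # 1 0

# seq_along() and seq_len() return a zero-length sequence instead.
seq_along(empty)               # integer(0)
seq_len(0)                     # integer(0)
seq_len(5)                     # 1 2 3 4 5
seq_along(c("a", "b", "c"))    # 1 2 3
```

A loop written as for (i in seq_along(x)) therefore runs zero times on an empty vector, whereas for (i in 1:length(x)) would run twice.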
R arithmetic operators and most mathematical functions operate element by element when given vectors as arguments:
    > c(2, 4, 6, 8, 10)/2                   # [1] 1 2 3 4 5
    > c(2, 4, 6, 8, 10)/c(1, 2, 3, 4, 5)    # [1] 2 2 2 2 2
    > log(c(0.1, 1, 10, 100), 10)           # [1] -1  0  1  2
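The same element-by-element behaviour applies to comparisons and to many other built-in functions. A small sketch, using an illustrative vector x:

```r
x <- c(1, 4, 9, 16)

sqrt(x)       # 1 2 3 4
x * x         # 1 16 81 256
x > 5         # FALSE FALSE TRUE TRUE

# A logical vector can be summed, since TRUE counts as 1 and FALSE as 0.
sum(x > 5)    # 2
```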
In vector operations, shorter vectors are recycled when additional
values are required. The longer vector's length should be a multiple
of the shorter vector's length; if it is not, R still returns a result
but issues a warning:
    > c(1, 2, 3, 4) + c(1, 2)       # [1] 2 4 4 6
    > c(1, 2, 3, 4, 5) + c(1, 2)
    [1] 2 4 4 6 6
    Warning message:
    longer object length is not a multiple of shorter object length in: c(1, 2, 3, 4, 5) + c(1, 2)
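Recycling is also what makes arithmetic with a single number work, since a single number is simply a vector of length 1. The sketch below illustrates this; recycles_cleanly is a hypothetical helper written for this example, not a base R function, and it assumes both vectors are non-empty.

```r
prices <- c(10, 20, 30, 40)

# A single number is a length-1 vector, recycled across every element.
prices * 2            # 20 40 60 80

# A length-2 vector is recycled twice over a length-4 vector.
prices * c(1, -1)     # 10 -20 30 -40

# Hypothetical helper: TRUE when the longer length is a multiple of the
# shorter one, so recycling raises no warning. Assumes non-empty vectors.
recycles_cleanly <- function(a, b) {
  max(length(a), length(b)) %% min(length(a), length(b)) == 0
}

recycles_cleanly(1:4, 1:2)   # TRUE
recycles_cleanly(1:5, 1:2)   # FALSE
```

Checking lengths before combining vectors can be worthwhile, because the warning is easy to miss while the partially recycled result carries on through later calculations.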