DATA MINING
Desktop Survival Guide by Graham Williams
The next example changes the values in one vector (weights) according to conditions on the values in another vector (data). The data vector is a random sample of ten letters of the alphabet, and both vectors have the same length. Where an element of data sorts after "m", the corresponding weight is set to 2. Where it lies between "d" and "m" inclusive, the weight is set to 3. All other weights keep their initial value of 1.
> weights <- rep(1, 10)
> data <- letters[sample(seq(1,length(letters)), 10)]
> data
 [1] "y" "b" "j" "m" "c" "q" "o" "a" "i" "p"
> weights[data > "m"] <- 2
> weights
 [1] 2 1 1 1 1 2 2 1 1 2
> weights[data <= "m" & data >= "d"] <- 3
> weights
 [1] 2 1 3 3 1 2 2 1 3 2
An example of where this might be useful is in data mining pre-processing, where we wish to selectively change the weights associated with entities in a modelling exercise. The weights might indicate the relative importance of the specific entities. An example of this transformation is included in the usage of rpart; see Chapter .
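As a minimal sketch of this kind of pre-processing, the same conditional assignment can be applied to a column of a data frame. The data frame ds, its columns id and grade, and the values in it are hypothetical and chosen only for illustration. The thresholds "d" and "m" are carried over from the example above. String comparison in R follows the collation order of the current locale, so the result shown assumes the usual alphabetical ordering of lowercase letters.

```r
# Hypothetical toy data: column names and values are illustrative only.
ds <- data.frame(id = 1:6,
                 grade = c("a", "f", "m", "q", "d", "z"),
                 stringsAsFactors = FALSE)

# Start with a weight of 1 for every entity.
ds$weight <- rep(1, nrow(ds))

# Entities whose grade sorts after "m" get a weight of 2.
ds$weight[ds$grade > "m"] <- 2

# Entities whose grade lies between "d" and "m" (inclusive) get a weight of 3.
ds$weight[ds$grade >= "d" & ds$grade <= "m"] <- 3

ds$weight
# [1] 1 3 3 2 3 2
```

A vector built this way has one weight per row, which is the shape a modelling function that accepts per-entity weights would typically expect.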