 DATA MINING
Desktop Survival Guide
by Graham Williams

All R objects can be saved using the save function and then restored at a later time using the load function. The data will be saved into a .RData file. To illustrate this we make use of a standard dataset called iris.

We create a random sample of 20 entities from the dataset. This is done by randomly sampling 20 numbers between 1 and the number of rows (nrow) in the iris dataset, using the sample function. The vector of row numbers generated by sample is then used to index the iris dataset, selecting the sampled rows, by supplying it as the first argument in the square brackets. The second argument in the square brackets is left blank, indicating that all columns are required in our new dataset. We then save the dataset to file using the save function, which compresses the data for storage when compress=TRUE is given:

```
> rows <- sample(1:nrow(iris), 20)
> myiris <- iris[rows,]
> dim(myiris)
[1] 20  5
> save(myiris, file="myiris.RData", compress=TRUE)
```

At a later time the dataset can be restored with the load function:

```
> load("myiris.RData")
> dim(myiris)
[1] 20  5
```
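As a quick check of the round trip, the following sketch saves a sample to a temporary file, removes it from the workspace, and restores it. The seed value, the use of tempfile, and the names path, backup and restored are illustrative assumptions rather than part of the original example. It also relies on the fact that load returns, invisibly, the names of the objects it restored.

```r
# A minimal sketch of a save/load round trip (path, backup, restored are assumed names).
set.seed(42)                          # arbitrary seed so the sample is repeatable
rows <- sample(1:nrow(iris), 20)
myiris <- iris[rows, ]

path <- tempfile(fileext = ".RData")  # temporary file rather than the working directory
save(myiris, file = path, compress = TRUE)

backup <- myiris                      # keep a copy for comparison
rm(myiris)                            # simulate starting with an empty workspace

restored <- load(path)                # recreates myiris and returns the object names
print(restored)
print(identical(myiris, backup))      # compare the restored data with the copy
```

Note that load recreates the object under the name it had when it was saved, so any existing object called myiris in the workspace would be overwritten.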

Setting the compress option to TRUE reduces the disk space required to store the dataset.
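To see the effect of the compress option, the sketch below saves the same data twice, once with compress=FALSE and once with compress=TRUE, and compares the file sizes. The names iris_copy, plain and packed are hypothetical, and the exact sizes will depend on the R version, so none are quoted here. In current versions of R, compress=TRUE is already the default for binary saves, so spelling it out mainly documents the intent.

```r
# A minimal sketch comparing file sizes with and without compression.
# iris_copy, plain and packed are illustrative names, not from the original text.
iris_copy <- iris                      # local copy of the standard dataset

plain  <- tempfile(fileext = ".RData")
packed <- tempfile(fileext = ".RData")

save(iris_copy, file = plain,  compress = FALSE)
save(iris_copy, file = packed, compress = TRUE)

# File sizes in bytes
sizes <- c(uncompressed = file.size(plain), compressed = file.size(packed))
print(sizes)
```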

You can save any object in an R binary file. For example, suppose you have built a model and want to save it for later exploration:

```
> library(rpart)
> iris.rp <- rpart(Species ~ ., data=iris)
> save(iris.rp, file="irisrp.RData", compress=TRUE)
```

At a later stage, perhaps on a fresh start of R, you can load the model:

```
> load("irisrp.RData")
> iris.rp
n= 150

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 150 100 setosa (0.33333333 0.33333333 0.33333333)
  2) Petal.Length< 2.45 50   0 setosa (1.00000000 0.00000000 0.00000000) *
  3) Petal.Length>=2.45 100  50 versicolor (0.00000000 0.50000000 0.50000000)
    6) Petal.Width< 1.75 54   5 versicolor (0.00000000 0.90740741 0.09259259) *
    7) Petal.Width>=1.75 46   1 virginica (0.00000000 0.02173913 0.97826087) *
```
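A restored model can be used just like the original. The sketch below, which assumes the rpart package is installed and uses a temporary file and the hypothetical names path, before and after, saves a model, removes it, reloads it, and checks that its predictions are unchanged. In a fresh session it is generally a good idea to load rpart again before working with the model, so that its print and predict methods are available.

```r
# A minimal sketch: save a fitted rpart model, reload it, and reuse it.
library(rpart)                         # provides the print and predict methods

iris.rp <- rpart(Species ~ ., data = iris)
path <- tempfile(fileext = ".RData")   # temporary file, an illustrative choice
save(iris.rp, file = path, compress = TRUE)

before <- predict(iris.rp, iris, type = "class")
rm(iris.rp)                            # simulate starting again

load(path)                             # recreates iris.rp
after <- predict(iris.rp, iris, type = "class")

print(identical(before, after))        # do the predictions match?
print(table(predicted = after, actual = iris$Species))
```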

To identify what is saved into an RData file you can attach the file and then get a listing of its contents:

```
attach("irisrp.RData")
ls(2)
...
detach(2)
```
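An alternative to attaching the file, sketched here under the assumption that you would rather not alter the search path, is to load the file into a separate environment and list that environment. The objects x and y and the names path, contents and loaded are hypothetical and exist only for the illustration.

```r
# A minimal sketch: inspect a .RData file without attaching it to the search path.
# x, y, path, contents and loaded are illustrative names only.
x <- 1:10
y <- letters
path <- tempfile(fileext = ".RData")
save(x, y, file = path)

contents <- new.env()
loaded <- load(path, envir = contents)  # objects go into contents, not the workspace
print(loaded)                           # names reported by load
print(ls(contents))                     # the same names, listed from the environment
str(contents$x)                         # examine one object without overwriting anything
```

Because nothing is placed in the workspace, this approach avoids accidentally overwriting an existing object that shares a name with one in the file.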
