DATA MINING
Desktop Survival Guide
by Graham Williams

D

Data Cube: A block of data, often extracted from a data warehouse, offering fast access/views of data in any number of dimensions.
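
As a rough illustration of the idea, the following Python sketch precomputes totals for every combination of dimensions, so that any view is afterwards a simple lookup. The records, the dimension names, and the build_cube function are hypothetical, invented for this example. Real data cube implementations use far more efficient storage and aggregation strategies.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact records: (region, product, year, sales).
RECORDS = [
    ("north", "widget", 2023, 10),
    ("north", "gadget", 2023, 5),
    ("south", "widget", 2023, 7),
    ("south", "widget", 2024, 3),
]
DIMENSIONS = ("region", "product", "year")


def build_cube(records, dimensions):
    """Sum the measure (last field) for every subset of the dimensions."""
    cube = {}
    for k in range(len(dimensions) + 1):
        for dims in combinations(range(len(dimensions)), k):
            totals = defaultdict(int)
            for row in records:
                # Keep only the chosen dimensions; the rest are aggregated away.
                key = tuple(row[i] for i in dims)
                totals[key] += row[-1]
            cube[tuple(dimensions[i] for i in dims)] = dict(totals)
    return cube


cube = build_cube(RECORDS, DIMENSIONS)

# Views at different levels of aggregation are now dictionary lookups.
grand_total = cube[()][()]
north_total = cube[("region",)][("north",)]
south_2023 = cube[("region", "year")][("south", 2023)]
```

With three dimensions the sketch stores eight aggregated views, one per subset of the dimensions. Precomputing them trades storage for the fast access that the definition describes.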

Data Mining: A technology that applies a variety of techniques, including associations (http://en.wikipedia.org/wiki/Association_(statistics)), classification, and segmentation, to problem domains that require analysing extremely large collections of data, such as fraud and churn. It typically draws on tools and techniques from machine learning, Parallel Computation, OLAP (http://en.wikipedia.org/wiki/olap), visualisation, Mathematical Computation, and statistics. Fayyad defines data mining as a single step in the KDD (knowledge discovery in databases) process that, under acceptable computational efficiency limitations, enumerates structures (patterns or models) over the data.

Data Warehouse: Provides a multi-dimensional view of an organisation's data for efficient analysis, as distinct from typical transactional relational database systems, which manage the day-to-day operations of the organisation. Data warehouses particularly lend themselves to quick ad-hoc queries on large volumes of data on a read-only basis. Inmon defined a data warehouse as a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.

Decision Tree:

Demographic Clustering: Clustering performed on the characteristics of population groups in terms of size, distribution, and other vital statistics, rather than on the individuals of the population. Related to Data Mining in Data Cubes in that Data Cubes allow varying degrees of aggregation over any of the variables.
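
The distinction between clustering groups and clustering individuals can be sketched in Python as follows. Individuals are first aggregated into one profile per population group, and only those profiles are clustered. The data, the function names, and the choice of a plain k-means with hand-picked starting centroids are all assumptions made for illustration. The definition above does not prescribe a particular clustering algorithm.

```python
from collections import defaultdict

# Hypothetical individuals: (suburb, age, income in thousands).
PEOPLE = [
    ("A", 25, 40), ("A", 30, 45),
    ("B", 27, 42), ("B", 29, 44),
    ("C", 65, 20), ("C", 70, 22),
    ("D", 68, 25),
]


def group_profiles(people):
    """Aggregate individuals into one (mean age, mean income) row per group."""
    members = defaultdict(list)
    for group, age, income in people:
        members[group].append((age, income))
    return {
        g: tuple(sum(col) / len(rows) for col in zip(*rows))
        for g, rows in members.items()
    }


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, centroids, iterations=10):
    """Plain k-means starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


profiles = group_profiles(PEOPLE)
points = list(profiles.values())
# Assumption: start from the first and third group profiles.
centroids = kmeans(points, [points[0], points[2]])
labels = {g: nearest(p, centroids) for g, p in profiles.items()}
```

Here suburbs A and B fall into one cluster and C and D into the other, even though no individual was clustered directly. Group size or other summary statistics could be added to each profile in the same way.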

Dependent Variable: The classical term for an Output Variable.

Brought to you by Togaware.