Wednesday, June 19, 2013

Classification and prediction are forms of data analysis used to extract models from data. Classification predicts categorical (discrete) class labels, while prediction models continuous-valued functions.
The same data preprocessing steps discussed earlier apply here as well.
ID3, C4.5 and CART are greedy algorithms for decision tree induction. At each non-leaf node, the algorithm evaluates every candidate attribute with an attribute selection measure (information gain, gain ratio, and the Gini index, respectively) and splits on the best one.
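As a minimal sketch of the attribute selection step, here is the information gain measure used by ID3, computed on a hypothetical toy weather dataset (the data and function names are illustrative, not from any particular library):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    """Reduction in entropy from splitting the rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Toy data: attribute 0 = outlook, label = whether to play.
rows = [("sunny",), ("sunny",), ("overcast",), ("rain",), ("rain",)]
labels = ["no", "no", "yes", "yes", "no"]
print(information_gain(rows, 0, labels))  # about 0.571 bits
```

ID3 would compute this gain for every candidate attribute at a node and split on the largest; C4.5 and CART differ only in the measure used.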
Naïve Bayes classification and Bayesian belief networks are both based on Bayes' theorem. The former assumes the attributes are conditionally independent given the class, while the latter allows dependencies to be modeled among subsets of variables.
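A bare-bones naïve Bayes classifier for categorical attributes can be sketched as follows; the add-one (Laplace) smoothing and the toy data are my own illustrative choices:

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    priors = Counter(labels)
    cond = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            cond[(i, label)][value] += 1
    return priors, cond

def classify(row, priors, cond):
    """Pick the class maximizing P(class) * product of P(value | class)."""
    total = sum(priors.values())
    best, best_score = None, -1.0
    for label, count in priors.items():
        score = count / total
        for i, value in enumerate(row):
            counts = cond[(i, label)]
            # Laplace smoothing avoids zeroing out unseen attribute values.
            score *= (counts[value] + 1) / (sum(counts.values()) + len(counts) + 1)
        if score > best_score:
            best, best_score = label, score
    return best

rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
priors, cond = train_naive_bayes(rows, labels)
print(classify(("rain", "mild"), priors, cond))  # -> "yes"
```

The multiplication across attributes is exactly the conditional independence assumption the text mentions; a belief network would replace it with factors that follow the network's dependency structure.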
A rule based classifier uses a set of IF-THEN rules for classification. Rules can be generated directly from training data using sequential covering algorithms and associative classification algorithms.
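To make the IF-THEN idea concrete, here is a tiny hand-written decision list; the rules and attribute names are invented for illustration, and a real sequential covering algorithm would learn them from the training data:

```python
# Each rule: (list of (attribute, value) conditions, predicted class).
rules = [
    ([("outlook", "overcast")], "play"),
    ([("outlook", "rain"), ("wind", "strong")], "no_play"),
]
DEFAULT_CLASS = "play"  # fallback when no rule fires

def rule_classify(example, rules, default_class):
    """Fire the first rule whose conditions all hold (decision-list order)."""
    for conditions, label in rules:
        if all(example.get(attr) == value for attr, value in conditions):
            return label
    return default_class

print(rule_classify({"outlook": "rain", "wind": "strong"}, rules, DEFAULT_CLASS))
```

Sequential covering builds such a list one rule at a time, removing the tuples each new rule covers before learning the next.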
Associative classification applies association rule mining techniques, which search for frequently occurring patterns in large databases, to derive classification rules.
Classifiers that use the training data to build a generalization model up front are called eager learners. Lazy learners, also known as instance-based learners, instead store the training tuples in pattern space and postpone generalization until a test tuple arrives. Because every query must search the stored tuples, lazy learners require efficient indexing techniques.
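k-nearest-neighbor classification is the standard example of a lazy learner; a minimal sketch (with made-up two-dimensional data and a brute-force search rather than an index) looks like this:

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Lazy learner: no model is built; all work happens at query time.

    train is a list of (feature_vector, label) pairs.
    """
    sq_dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    neighbors = sorted(train, key=lambda pair: sq_dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"),
         ((5.0, 5.0), "B"), ((5.1, 4.8), "B")]
print(knn_classify(train, (1.1, 1.0), k=3))  # -> "A"
```

The sort over all stored tuples is what makes the indexing techniques mentioned above (e.g. spatial trees) necessary at scale.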
Linear, non-linear and generalized linear regression models can be used for prediction. Some non-linear models can be converted to linear ones by transforming the predictor variables.
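The predictor-transformation trick can be sketched with ordinary least squares: a model that is quadratic in x becomes linear in the transformed predictor z = x². The data below are constructed to satisfy y = 1 + 2x² exactly:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Non-linear model y = a + b*x^2 is linear in the transformed predictor z = x^2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 9.0, 19.0, 33.0]  # exactly y = 1 + 2*x^2
a, b = fit_line([x ** 2 for x in xs], ys)
print(a, b)  # close to 1.0 and 2.0
```

The same substitution idea underlies polynomial regression generally: fit a linear model in the transformed features, then interpret the coefficients in terms of the original variable.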
