Cluster computing

Wednesday, May 17, 2017

We were discussing the rxFastTrees algorithm. rxFastTrees is a fast tree algorithm used for binary classification or regression; one application is bankruptcy prediction. It is an implementation of FastRank, which is a form of the MART gradient boosting algorithm. It builds each regression tree in a stepwise fashion using a predefined loss function. The loss function measures the error at the current step so that the next step can correct it. The term boosting here refers to numerical optimization in function space, which is connected to steepest-descent minimization. When the individual additive components are regression trees, this boosting is termed TreeBoost. Gradient boosting of regression trees is said to produce competitive, highly robust, interpretable procedures for both regression and classification.
When the mapping function is restricted to be a member of a parameterized class of functions, it can be represented as a weighted sum of individual functions from that class. This is called an additive expansion. This technique is very helpful for approximations. With gradient boosting, the constraint is applied to the rough solution by fitting the parameterized function to the "pseudoresponses". This permits the replacement of the difficult minimization problem by a least-squares function minimization followed by only a single-parameter optimization based on the original criterion.
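For reference, the additive expansion and the pseudoresponses can be written out as follows. The notation is assumed from Friedman's paper rather than taken from this post: h(x; a) is a member of the parameterized class with parameters a, β_m are the weights, L is the loss function, and F_{m-1} is the approximation after m-1 stages.

```latex
% Additive expansion: a weighted sum of M members of the parameterized class
F\bigl(x; \{\beta_m, a_m\}_{1}^{M}\bigr) = \sum_{m=1}^{M} \beta_m \, h(x; a_m)

% Pseudoresponses: the negative gradient of the loss at each training point
\tilde{y}_i = -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F(x) = F_{m-1}(x)},
\qquad i = 1, \dots, N
```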
The gradient boost algorithm, as described by Friedman, is therefore:
1. Describe the problem as the minimization of a loss function over a parameterized class of functions, and start from an initial approximation.
2. For each stage m from 1 to M, do steps 3 to 6:
3. Compute the pseudoresponses as the negative gradient of the loss at each training point, for i = 1 to N.
4. Find the smoothed negative gradient by fitting the parameterized function to the pseudoresponses with any fitting criterion, such as least squares.
5. Perform the line search along the constrained negative gradient, as in steepest descent, taking the step size that leads to the minimum loss.
6. Update the approximation by taking that step along the line-search direction.
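The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the rxFastTrees or FastRank implementation: it assumes a squared-error loss, one-dimensional inputs, and depth-one regression trees (stumps) as the parameterized functions. The names fit_stump, stump_predict, gradient_boost, predict and mse are hypothetical helpers made up for this sketch.

```python
def fit_stump(xs, targets):
    """Least-squares regression stump on 1-D inputs.

    Returns (threshold, left_value, right_value).
    """
    best = None
    distinct = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    if best is None:
        # All x values are identical, so fall back to a constant prediction.
        mean = sum(targets) / len(targets)
        return (xs[0], mean, mean)
    return best[1:]


def stump_predict(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv


def gradient_boost(xs, ys, n_rounds=20):
    # Step 1: the constant that minimizes squared error is the mean.
    f0 = sum(ys) / len(ys)
    preds = [f0] * len(ys)
    stages = []
    # Step 2: loop over the M stages.
    for _ in range(n_rounds):
        # Step 3: pseudoresponses. For the loss 0.5 * (y - F)^2 the
        # negative gradient is simply the residual y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        # Step 4: fit the stump to the pseudoresponses by least squares.
        stump = fit_stump(xs, residuals)
        h = [stump_predict(stump, x) for x in xs]
        # Step 5: line search. For squared error the best step size has
        # a closed form: sum(r * h) / sum(h * h).
        denom = sum(v * v for v in h)
        rho = sum(r * v for r, v in zip(residuals, h)) / denom if denom > 0 else 0.0
        # Step 6: update the approximation along the fitted direction.
        preds = [p + rho * v for p, v in zip(preds, h)]
        stages.append((rho, stump))
    return f0, stages


def predict(model, x):
    f0, stages = model
    return f0 + sum(rho * stump_predict(s, x) for rho, s in stages)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6, 7, 8]
    ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
    print("training MSE, constant only:", mse(gradient_boost(xs, ys, 0), xs, ys))
    print("training MSE, 20 rounds:", mse(gradient_boost(xs, ys, 20), xs, ys))
```

With squared error the line search is trivial, because the least-squares stump already sits at the best step size. For other losses, such as the logistic loss used in classification, step 5 would need a genuine one-dimensional minimization.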
