Tuesday, May 16, 2017

We were discussing the MicrosoftML rxFastLinear algorithm, a fast linear model trainer based on the Stochastic Dual Coordinate Ascent method that combines the capabilities of logistic regression and SVM algorithms. Today we discuss the rxFastTrees algorithm. rxFastTrees is a fast tree algorithm used for binary classification or regression; it can be applied, for example, to bankruptcy prediction. It is an implementation of FastRank, a form of the MART gradient boosting algorithm. It builds each regression tree in a stepwise fashion using a predefined loss function: the loss function measures the error at the current step so that the next tree can correct it. The term boosting refers to improving numerical optimization in function space by relating it to steepest-descent minimization. When the individual additive components are regression trees, this boosting is termed TreeBoost. Gradient boosting of regression trees is said to produce competitive, highly robust, and interpretable procedures for both regression and classification.
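To make this concrete, here is a minimal sketch of training a boosted-tree binary classifier on a made-up, bankruptcy-style dataset. It uses scikit-learn's GradientBoostingClassifier in Python only as a stand-in for the same family of models; the synthetic features and labels are invented for illustration, and the actual rxFastTrees call in MicrosoftML has its own interface and options.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative sketch: gradient-boosted trees for binary classification,
# analogous in spirit to rxFastTrees (not the MicrosoftML API itself).
rng = np.random.default_rng(0)
n = 1000
# Hypothetical firm-level features (placeholders, not real financial data).
X = rng.normal(size=(n, 3))
# Synthetic "bankruptcy" label, correlated with the first two features.
y = (X[:, 0] - X[:, 1] + 0.5 * rng.normal(size=n) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree is built stagewise; the loss measures the error of the current
# ensemble, and the next tree is fit to help correct it.
model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))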
Generally we start with a system consisting of a random output or response variable y and a set of random input or explanatory variables x. Using a training sample of known input and output values, the goal is to obtain an estimate or approximation of the function that maps x to y and minimizes the expected value of a specified loss function, such as squared error or absolute error. One approach is to restrict the mapping function to be a member of a parameterized class of functions. The mapping function can then be represented as a weighted sum of the individual functions in that parameterized set. This is called an additive expansion, and it underlies many approximation methods. In the steepest descent method, the negative gradient at the current fit defines the "line of search" and a step is taken along that direction.
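As a rough sketch of that idea, the approximation can be held as a weighted sum of simple base functions, and one steepest-descent step just appends another term along the negative gradient direction. The base learner (interpolation of the residual) and the step size below are placeholders chosen purely for illustration, not Friedman's exact construction.

import numpy as np

# Additive expansion F(x) = sum_m rho_m * h_m(x), plus one steepest-descent
# step in "function space" for squared-error loss.
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x)                      # target response

# Current additive model: list of (weight, base_function) pairs,
# starting from the constant mean prediction.
ensemble = [(1.0, lambda t: np.full_like(t, y.mean()))]

def F(t):
    return sum(rho * h(t) for rho, h in ensemble)

print("loss before step:", np.mean((y - F(x)) ** 2))

# For squared error, the negative gradient at each training point is the
# residual y - F(x); steepest descent adds a term along that direction.
residual = y - F(x)
rho = 0.5                                      # placeholder step size
ensemble.append((rho, lambda t: np.interp(t, x, residual)))

print("loss after step:", np.mean((y - F(x)) ** 2))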
With gradient boosting, this constraint is applied to the rough solution by fitting the parameterized function set to "pseudo-responses", the negative gradients of the loss at the current fit. This replaces the difficult minimization problem with a least-squares function minimization followed by only a single parameter optimization based on the original criterion. As long as a suitable least-squares algorithm exists for the parameterized function set, any differentiable loss can then be minimized through this stagewise additive modeling.
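Putting the pieces together, a bare-bones version of the stagewise procedure might look like the following: at each stage the pseudo-responses are the negative gradients of the loss at the current fit, a regression tree is fit to them by least squares, and the tree is added with a learning rate. This is a simplified sketch for squared-error loss (where the pseudo-responses reduce to residuals and the line search is trivial), not the FastRank/MART implementation behind rxFastTrees.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Bare-bones stagewise gradient boosting with regression trees and
# squared-error loss, on a synthetic one-dimensional dataset.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

n_stages, learning_rate = 50, 0.1
F = np.full(len(y), y.mean())        # initial constant model
trees = []

for m in range(n_stages):
    # Pseudo-responses: negative gradient of 0.5*(y - F)^2 w.r.t. F is (y - F).
    pseudo = y - F
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, pseudo)              # least-squares fit to the pseudo-responses
    F += learning_rate * tree.predict(X)
    trees.append(tree)

print("training MSE:", np.mean((y - F) ** 2))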
Courtesy: Friedman's Gradient Boosting Machine
#codingexercise 
http://collabedit.com/sghvg
