# Greedy function approximation: A gradient boosting machine.

```bibtex
@article{Friedman2001GreedyFA,
  title   = {Greedy function approximation: A gradient boosting machine},
  author  = {Jerome H. Friedman},
  journal = {Annals of Statistics},
  year    = {2001},
  volume  = {29},
  pages   = {1189--1232}
}
```

Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent boosting paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic…
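The stagewise paradigm the abstract describes is simplest in the least-squares case, where the negative gradient of the loss at the current fit is just the ordinary residual. Below is a minimal sketch of that case with regression stumps as base learners; the function names and parameter choices are illustrative, not taken from the paper.

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares regression stump: best single threshold on a 1-D feature."""
    best = None
    for t in x:
        left, right = r[x <= t], r[x > t]
        if len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        err = float(((r - pred) ** 2).sum())
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda z: np.where(z <= t, lv, rv)

def gradient_boost(x, y, n_rounds=50, shrinkage=0.1):
    """Stagewise additive fit: each round fits a stump to the current residuals,
    which are the negative gradient of squared-error loss at the current fit."""
    F = np.full(len(y), y.mean())          # F_0: best constant fit
    stumps = []
    for _ in range(n_rounds):
        r = y - F                          # pseudo-residuals for L2 loss
        h = fit_stump(x, r)
        F = F + shrinkage * h(x)           # small step along the new direction
        stumps.append(h)
    predict = lambda z: y.mean() + shrinkage * sum(h(z) for h in stumps)
    return predict, F
```

The shrinkage factor plays the role of a step length in the function-space descent; other losses change only how the pseudo-residuals `r` are computed.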


#### 11,839 Citations

Optimization by gradient boosting

- Mathematics, Computer Science
- ArXiv
- 2017

A thorough analysis of two widespread versions of gradient boosting is provided, and a general framework for studying these algorithms from the point of view of functional optimization is introduced.

Gradient Boosting Trees

- 2020

Gradient boosting is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically…

Stochastic gradient boosting

- Mathematics
- 2002

Gradient boosting constructs additive regression models by sequentially fitting a simple parameterized function (base learner) to current "pseudo"-residuals by least squares at each iteration. The…
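The stochastic variant this entry describes fits each round's base learner on a random subsample of the training pseudo-residuals rather than the full set. A minimal sketch, using a simple line fit as the base learner (the learner choice and all names here are illustrative):

```python
import numpy as np

def stochastic_gradient_boost(x, y, n_rounds=100, shrinkage=0.1,
                              subsample=0.5, seed=0):
    """Each round: draw a random subsample without replacement, fit the base
    learner to the pseudo-residuals on that subsample only, then update the
    model on the full data. Base learner: a degree-1 fit via np.polyfit."""
    rng = np.random.default_rng(seed)
    n = len(y)
    F = np.full(n, y.mean())
    for _ in range(n_rounds):
        r = y - F                                       # pseudo-residuals (L2)
        idx = rng.choice(n, size=max(2, int(subsample * n)), replace=False)
        coef = np.polyfit(x[idx], r[idx], deg=1)        # fit on the subsample
        F = F + shrinkage * np.polyval(coef, x)         # update everywhere
    return F
```

Subsampling both speeds up each round and injects randomness that often improves generalization, which is the point of the stochastic variant.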

A Fast Sampling Gradient Tree Boosting Framework

- Computer Science, Mathematics
- ArXiv
- 2019

This work combines gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance and uses a regularizer to improve the diagonal approximation in the Newton step of gradient boosting.

Boosting with early stopping: Convergence and consistency

- Mathematics
- 2005

Boosting is one of the most significant advances in machine learning for classification and regression. In its original and computationally flexible version, boosting seeks to minimize empirically a…
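Early stopping in this setting means halting the boosting iterations when held-out loss stops improving, rather than running to a fixed round count. A sketch under illustrative assumptions (simple linear base learner, hypothetical function and parameter names):

```python
import numpy as np

def boost_with_early_stopping(x, y, x_val, y_val,
                              max_rounds=500, shrinkage=0.1, patience=10):
    """Gradient boosting on (x, y) that monitors squared error on a held-out
    validation set and stops once it fails to improve for `patience` rounds."""
    F = np.full(len(y), y.mean())
    F_val = np.full(len(y_val), y.mean())
    best, stale = np.inf, 0
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        r = y - F                                  # training pseudo-residuals
        coef = np.polyfit(x, r, deg=1)             # simple linear base learner
        F = F + shrinkage * np.polyval(coef, x)
        F_val = F_val + shrinkage * np.polyval(coef, x_val)
        val_mse = float(np.mean((y_val - F_val) ** 2))
        if val_mse < best - 1e-12:
            best, stale = val_mse, 0
        else:
            stale += 1
            if stale >= patience:
                break                              # validation loss has flattened
    return rounds, best
```

The round at which validation loss flattens acts as the regularization parameter; the convergence and consistency results cited here analyze exactly this stopped iterate.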

Conjugate direction boosting for regression

- 2004

Boosting in the context of linear regression has gained additional attraction by the invention of least angle regression (LARS), where the connection between the lasso and forward stagewise fitting…

Boosting algorithms: Regularization, prediction and model fitting

- Mathematics
- 2007

We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as…

Special Invited Paper: Additive logistic regression: A statistical view of boosting

- Mathematics
- 2000

Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data…

Gradient and Newton Boosting for Classification and Regression

- Computer Science, Mathematics
- Expert Syst. Appl.
- 2021

The experiments show that Newton boosting outperforms gradient and hybrid gradient-Newton boosting in terms of predictive accuracy on the majority of datasets, and empirical evidence is presented that this difference is not primarily due to faster convergence of Newton boosting, but rather because Newton boosting often achieves lower test errors while at the same time having lower training losses.
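The gradient-versus-Newton contrast can be sketched on the log-loss with the simplest possible base learner, a single constant per round: the gradient step uses only first derivatives, while the Newton step rescales by the summed second derivative. All names and parameter values below are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def boost_intercept(y, n_rounds=30, shrinkage=0.5, newton=True):
    """Boost an intercept-only model on the log-loss, y in {0, 1}.
    Gradient step: move opposite the mean gradient.
    Newton step: additionally divide by the summed curvature."""
    F = np.zeros(len(y))
    for _ in range(n_rounds):
        p = sigmoid(F)
        g = p - y                 # dL/dF for the log-loss
        h = p * (1.0 - p)         # d2L/dF2, always positive
        step = -g.sum() / h.sum() if newton else -g.mean()
        F = F + shrinkage * step
    return F
```

With, say, 80% positive labels both variants head toward the log-odds logit(0.8) ≈ 1.386, but the Newton step adapts its size to the local curvature of the loss and gets there in far fewer rounds.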

#### References

Showing 1-10 of 32 references

Additive Logistic Regression: A Statistical View of Boosting

- Computer Science
- 1998

This work develops more direct approximations of boosting that exhibit performance comparable to other recently proposed multi-class generalizations of boosting, and suggests a minor modification to boosting that can reduce computation, often by factors of 10 to 50.

A Geometric Approach to Leveraging Weak Learners

- Computer Science
- EuroCOLT
- 1999

A new leveraging algorithm is introduced based on a natural potential function that has bounds that are incomparable to AdaBoost's, and their empirical performance is similar to AdaBoost's.

Generalized Additive Models

- Computer Science, Mathematics
- 1990

The class of generalized additive models is introduced, which replaces the linear form Σ β_j X_j by a sum of smooth functions Σ s_j(X_j), and has the advantage of being completely automatic, i.e., no "detective work" is needed on the part of the statistician.

Improved Boosting Algorithms using Confidence-Rated Predictions

- Mathematics, Computer Science
- COLT
- 1998

We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a…

Experiments with a New Boosting Algorithm

- Computer Science
- ICML
- 1996

This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.

Prediction Games and Arcing Algorithms

- Mathematics, Computer Science
- Neural Computation
- 1999

The theory behind the success of adaptive reweighting and combining algorithms (arcing) such as AdaBoost and others in reducing generalization error has not been well understood, and an explanation of why AdaBoost works in terms of its ability to produce generally high margins is offered.

Radial basis functions

- Computer Science
- Acta Numerica
- 2000

This paper gives a selective but up-to-date survey of several recent developments that explains their usefulness from the theoretical point of view and contributes useful new classes of radial basis function.

Learning representations by back-propagating errors

- Computer Science
- Nature
- 1986

Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

Improving Regressors using Boosting Techniques

- Computer Science
- ICML
- 1997

This work uses regression trees as fundamental building blocks in bagging committee machines and boosting committee machines to build a committee of regressors that may be superior to a single regressor.