Journal of Credit Risk

Risk.net

Benchmarking machine learning models to predict corporate bankruptcy

Emmanuel Alanis, Sudheer Chava and Agam Shah

  • We use a comprehensive sample of bankruptcies to benchmark the performance of various machine learning models in predicting the financial distress of publicly traded US firms.
  • We find that gradient-boosted trees outperform the other models in one-year-ahead forecasts.
  • The predictive accuracy of our ML models does not improve in a meaningful way when we include features based on companies’ annual 10-K filings.
  • In a credit competition model that accounts for the asymmetric cost of default misclassification, the random survival forest algorithm captures large dollar profits.

Using a comprehensive sample of 2585 bankruptcies from 1990 to 2019, we benchmark the performance of various machine learning models in predicting the financial distress of publicly traded US firms. We find that gradient-boosted trees outperform the other models in one-year-ahead forecasts. Permutation tests show that excess stock returns, idiosyncratic risk and relative size are the most important variables for predictions. Textual features derived from corporate filings do not materially improve performance. In a credit competition model that accounts for the asymmetric cost of default misclassification, the random survival forest captures large dollar profits.
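
The idea behind a permutation test of variable importance can be illustrated with a minimal sketch. This is not the authors' code, data or model: the observations are synthetic, the feature names are borrowed from the abstract only as labels, and `distress_score` is a hypothetical fixed scoring rule standing in for a fitted classifier such as gradient-boosted trees. Each feature is shuffled in turn, and the resulting drop in AUC is taken as that feature's importance.

```python
import random

def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(score_fn, X, y, n_repeats=10, seed=0):
    """Mean drop in AUC when each feature column is shuffled independently."""
    rng = random.Random(seed)
    baseline = auc([score_fn(row) for row in X], y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += baseline - auc([score_fn(r) for r in shuffled], y)
        drops.append(total / n_repeats)
    return baseline, drops

# Synthetic data (assumption): distress is more likely with low excess
# returns and high idiosyncratic risk; the third feature is pure noise.
feature_names = ["excess_return", "idiosyncratic_risk", "noise"]
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in feature_names] for _ in range(200)]
y = [1 if (-x[0] + x[1] + 0.3 * data_rng.gauss(0, 1)) > 1.0 else 0 for x in X]

def distress_score(row):
    """Hypothetical stand-in for a fitted model's predicted distress score."""
    return -row[0] + row[1]

baseline, drops = permutation_importance(distress_score, X, y)
print(f"baseline AUC: {baseline:.3f}")
for name, drop in zip(feature_names, drops):
    print(f"{name}: mean AUC drop {drop:.3f}")
```

In this sketch the noise feature receives an importance of exactly zero because the stand-in score ignores it, while shuffling either informative feature lowers the AUC. With a real fitted model the same procedure would be run on held-out data.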
