Journal of Risk Model Validation
ISSN:
1753-9579 (print)
1753-9587 (online)
Editor-in-chief: Steve Satchell
On the mathematical modeling of point-in-time and through-the-cycle probability of default estimation/validation
Xin Zhang and Tony Tung
Need to know
- First formal mathematical model for PIT and TTC PD.
- Correct variance evaluation for PD estimation/validation.
- Discussion over IFRS9 PD possibly based on rank statistics.
- Critiques on Moody’s TTC EDF and potential alternative solutions.
Abstract
Since Basel II, the second of the Basel Accords, was first published in June 2004, banks around the world have been engaged in a continuous effort to develop methodologies to estimate the key parameters: probability of default (PD), loss given default (LGD) and exposure at default (EAD). In this paper, we focus on PD estimation and validation. We provide the mathematical modeling for both point-in-time (PIT) and through-the-cycle (TTC) PD estimation, and discuss their relationship and application in our banking system.
1 Introduction
Under Basel II (Basel Committee on Banking Supervision 2006):
“Generally, all banks using the IRB [internal ratings-based] approaches must estimate a PD for each internal borrower grade for corporate, sovereign and bank exposures or for each pool in the case of retail exposures.” (Paragraph 446)
“PD estimates must be a long-run average of one-year default rates for borrowers in the grade, with the exception of retail exposures.” (Paragraph 447)
“In order to avoid over-optimism, a bank must add to its estimates a margin of conservatism that is related to the likely range of errors.” (Paragraph 451)
“Banks must use information and techniques that take appropriate account of the long-run experience when estimating the average PD for each rating grade. For example, banks may use one or more of the three specific techniques set out below: internal default experience, mapping to external data, and statistical default models.” (Paragraph 461)
The four paragraphs quoted above define the key IRB requirements on the estimation of the probability of default (PD) parameter. Among the three techniques specified, internal default experience is the predominant one, especially for wholesale portfolios, because of the difficulty of relating internal ratings to external ones and the insufficiency of data for developing statistical default models (although such models are predominant in IFRS 9 implementations; see, for example, Yang 2014, 2017). Based on internal default experience, banks developed their PD estimation methodologies from historical annual cohorts, treating the task as the classical statistics problem of binomial proportion estimation and confidence interval evaluation. The widely chosen long-run averaging methods are the simple average and the obligor-weighted average of the realized annual default rate (DR). To add a margin of conservatism, the confidence interval can be evaluated using a wide range of approaches, such as Clopper–Pearson (Clopper and Pearson 1934), Wald (normal approximation), Wilson (Wilson 1927) and Agresti–Coull (Agresti and Coull 1998), with the first two being the most popular. These confidence intervals, however, are only applicable to a single period. Banks often find more breaches than their chosen confidence interval implies when backtesting during validation, as illustrated using the Standard & Poor’s (S&P) data in Section 3.1.
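To make the single-period intervals mentioned above concrete, the following minimal Python sketch computes the Wald and Wilson intervals for one cohort. The function names and the example counts are illustrative assumptions and do not come from the paper; the formulas are the standard textbook ones.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(defaults, obligors, level=0.95):
    """Two-sided Wald (normal approximation) interval for a default rate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = defaults / obligors
    half = z * sqrt(p_hat * (1 - p_hat) / obligors)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)


def wilson_interval(defaults, obligors, level=0.95):
    """Two-sided Wilson score interval for a default rate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = defaults / obligors
    denom = 1 + z * z / obligors
    center = (p_hat + z * z / (2 * obligors)) / denom
    half = z / denom * sqrt(p_hat * (1 - p_hat) / obligors
                            + z * z / (4 * obligors ** 2))
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical cohort: 100 obligors, no defaults observed in the year.
print(wald_interval(0, 100))
print(wilson_interval(0, 100))
```

With zero observed defaults the Wald interval collapses to a single point, whereas the Wilson interval still yields a positive upper bound, which is one reason the choice of method matters for low-default grades.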
One of the main reasons for this backtesting failure, we believe, is related to the concept of rating philosophy. Rating philosophy is commonly referred to as either point-in-time (PIT) or through-the-cycle (TTC) ratings. Note the difference between the PIT/TTC PD and the PIT/TTC rating. There might be some misuse of these terminologies, but, by PIT/TTC PD, we are referring to the realized/expected PD at different instances in time for PIT PD and throughout time for TTC PD. Together, they describe the dynamics of rating systems, the evaluation of which is part of the validation process (Basel Committee on Banking Supervision 2005). On the other hand, PIT ratings measure credit risk given the current state of a borrower in their current economic environment, whereas TTC ratings measure credit risk, taking into account the (assumed) state of the borrower over a whole economic cycle (Blochwitz et al 2011). Whether a rating system is PIT or TTC could be evaluated through a ratings migration study (Araten et al 2004). We can observe both TTC and PIT PD associated with a given credit rating whether it is assigned by a PIT or a TTC philosophy. The difference is in how the PIT PD varies over time. Most credit rating agencies, eg, S&P and Moody’s, embrace the TTC philosophy for rating stability. Banks’ internal ratings, with an effort to incorporate some PIT elements, usually follow a PIT–TTC hybrid philosophy, with wholesale portfolios closer to TTC.[1]
[1] A TTC rating philosophy might be preferred by the Basel Committee on Banking Supervision (BCBS) as well. In Basel Committee on Banking Supervision (2006, Paragraph 414), it states that: “Although the time horizon used in PD estimation is one year (as described in Paragraph 447), banks are expected to use a longer time horizon in assigning ratings.”
In a TTC rating system, when an obligor’s rating is stabilized throughout the economic cycle, the implied PD of each rating will fluctuate along economic expansions and recessions. Therefore, we will observe lower realized annual DRs during economic peaks and higher realized annual DRs during economic troughs. In other words, we would observe time-varying PD in a TTC rating framework. Apparently, the long-run average PD calculated by banks is an invariant long-term PD (usually modeled as a single binomial process), and the year-to-year variation of the realized/observed annual DRs will be much larger than the confidence interval defined on a long-term PD based on the simple binomial process.
As the transition of our current accounting standard to International Financial Reporting Standard 9 (IFRS 9) gets underway, a lot of attention has been paid to the conversion between so-called PIT PD and TTC PD. In this paper, we will also touch on this problem from a mathematical perspective. However, the discussion will be around its basic concepts rather than practical implementation.
In the next section, we will introduce our model for PIT and TTC PD estimation. The application of the methodology to PD estimation/validation is illustrated using the S&P default data, and results are shown in Section 3. PIT versus TTC PD is discussed in Section 4. Section 5 will be a summary of this paper and possible future work.
2 Probability of default estimation model
According to Basel II requirements (Basel Committee on Banking Supervision 2006, Paragraph 463), the length of the underlying historical observation period used for the bank’s PD estimation must be at least five years. Most banks nowadays possess data that covers a much longer history. Therefore, a long time series of annual DRs can easily be obtained. To compute the long-run average from time series, our options are generally a simple average, an obligor-weighted average or a default-weighted average. A default-weighted average is deemed too conservative and biased. The first two are the most commonly used averaging methods. We will show in the following subsections that, despite the converging numeric values yielded by these two methods when a longer time period is used, they actually have very different interpretations.
2.1 TTC PD
The underlying assumption when using an obligor-weighted average of annual DRs is that all obligors of a certain credit rating are associated with a unique TTC PD (independent of their time of default; this contrasts with the PIT PD model discussed in the next subsection). This can be justified with mathematical equations.
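Assume that we have $T$ years of data, and for each year $t$, $t = 1, \dots, T$, there are $N_t$ obligors. Whether a specific obligor defaulted or not can be modeled by independent and identically distributed (iid) Bernoulli random variables $X_{t,i}$, $i = 1, \dots, N_t$. The annual DR would be expressed as $\mathrm{DR}_t = \frac{1}{N_t}\sum_{i=1}^{N_t} X_{t,i}$. The obligor-weighted long-run average PD ($\mathrm{LRPD}_{\mathrm{TTC}}$) would be
$$\mathrm{LRPD}_{\mathrm{TTC}} = \frac{\sum_{t=1}^{T}\sum_{i=1}^{N_t} X_{t,i}}{\sum_{t=1}^{T} N_t}, \tag{2.1}$$
which implies that $X_{t,i} \sim \mathrm{Bernoulli}(p)$ for all $t$ and $i$, and $p$ can be estimated with (2.1). In other words, $\sum_{t}\sum_{i} X_{t,i}$ follows a binomial distribution, $B\bigl(\sum_{t} N_t,\, p\bigr)$. The variance of $\mathrm{LRPD}_{\mathrm{TTC}}$ can be computed by
$$\operatorname{Var}(\mathrm{LRPD}_{\mathrm{TTC}}) = \frac{p(1-p)}{\sum_{t=1}^{T} N_t}.$$
Usually, $\sum_{t} N_t$ is large, so $p$ can be estimated with high precision. The confidence intervals evaluated using all types of methodologies (eg, Clopper–Pearson, Wald, etc) would be narrow and close to one another. However, these small confidence intervals offer false security. The simple derivation of the mean and variance of LRPD is based on the over-simplified assumption of all historical obligors following the same $p$, at the cost of losing the annual DR information. Note how $\mathrm{DR}_t$ disappeared in (2.1).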
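As a rough illustration of the obligor-weighted estimate and its binomial deviation, the sketch below pools obligor and default counts over all years. The function name is hypothetical; the example counts are the pooled BBB totals reported in Table 3 (17 722 obligors, 26 defaults).

```python
from math import sqrt


def ttc_lrpd(obligors_per_year, defaults_per_year):
    """Obligor-weighted long-run PD and its binomial standard deviation."""
    n_total = sum(obligors_per_year)
    d_total = sum(defaults_per_year)
    p = d_total / n_total                    # pooled default proportion
    deviation = sqrt(p * (1 - p) / n_total)  # sqrt of p(1 - p) / sum of N_t
    return p, deviation


# Pooled BBB totals from Table 3, passed as a single "year" for brevity.
p, dev = ttc_lrpd([17722], [26])
print(f"LRPD = {100 * p:.3f}%, deviation = {100 * dev:.3f}%")
```

Only the pooled totals enter the calculation, so any yearly split with the same totals gives the same answer; this is the loss of annual DR information noted above.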
2.2 PIT PD
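In recognition of the fact that the realized annual DRs of a certain credit rating (from a TTC rating system) do vary from time to time, a more reasonable assumption is that all obligors of a certain credit rating are associated with a PIT PD (a time-varying PD dependent on the time of default). This PIT PD is a function of time and, in practice, is assumed to be constant over a year. Therefore, in mathematical terms, the obligors in year $t$ are represented by iid Bernoulli random variables $X_{t,i}$, $i = 1, \dots, N_t$, and $X_{t,i} \sim \mathrm{Bernoulli}(p_t)$, $t = 1, \dots, T$, where $p_t$ are iid random variables following an arbitrary distribution; this contrasts with the TTC PD model in the previous subsection, where $p$ was assumed to be constant throughout time. The optimal estimate of the long-run PD is therefore the expectation of $p_t$ (not a function of $t$), which can be calculated as the simple average of the estimate of $p_t$, $\hat{p}_t = \mathrm{DR}_t$, as the realized annual DR:
$$\mathrm{LRPD}_{\mathrm{PIT}} = \frac{1}{T}\sum_{t=1}^{T} \mathrm{DR}_t = \frac{1}{T}\sum_{t=1}^{T} \frac{1}{N_t}\sum_{i=1}^{N_t} X_{t,i}. \tag{2.2}$$
To evaluate the deviation of this estimate of the LRPD, note that $N_t\,\mathrm{DR}_t \mid p_t \sim B(N_t, p_t)$. By the law of total variance, with $\mathbf{p} = (p_1, \dots, p_T)$, the variance of $\mathrm{LRPD}_{\mathrm{PIT}}$ can be broken down into two parts:
$$\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}}) = E\bigl[\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}} \mid \mathbf{p})\bigr] + \operatorname{Var}\bigl(E[\mathrm{LRPD}_{\mathrm{PIT}} \mid \mathbf{p}]\bigr).$$
The first part captures the variance generated by the binomial process:
$$E\bigl[\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}} \mid \mathbf{p})\bigr] = E\biggl[\frac{1}{T^2}\sum_{t=1}^{T} \frac{p_t(1-p_t)}{N_t}\biggr] = \frac{1}{T^2}\sum_{t=1}^{T} \frac{E[p_t] - E[p_t^2]}{N_t}.$$
The second part captures the variance of the mean of $p_t$ itself:
$$\operatorname{Var}\bigl(E[\mathrm{LRPD}_{\mathrm{PIT}} \mid \mathbf{p}]\bigr) = \operatorname{Var}\biggl(\frac{1}{T}\sum_{t=1}^{T} p_t\biggr) = \frac{1}{T}\bigl(E[p_t^2] - (E[p_t])^2\bigr).$$
The total variance of $\mathrm{LRPD}_{\mathrm{PIT}}$ is the combination of the two:
$$\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}}) = \frac{1}{T^2}\sum_{t=1}^{T} \frac{E[p_t] - E[p_t^2]}{N_t} + \frac{1}{T}\bigl(E[p_t^2] - (E[p_t])^2\bigr). \tag{2.3}$$
Since the distribution of $p_t$ is unknown, without loss of generality, $E[p_t]$ and $E[p_t^2]$ could be replaced with their approximations $\frac{1}{T}\sum_{t} \mathrm{DR}_t$ and $\frac{1}{T}\sum_{t} \mathrm{DR}_t^2$. Except for $E[p_t]$ in the first component, all other terms are second-order values. So, $\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}})$ is expected to be dominated by the first component (the binomial process) when $N_t \ll E[p_t]/\operatorname{Var}(p_t)$ and by the second component (the time series) when $N_t \gg E[p_t]/\operatorname{Var}(p_t)$. In the next section, we will use the S&P data to evaluate exactly how the two components affect the deviation of the estimate.
A special case assuming the iid normal distribution of $p_t$ was assessed by Cantor and Falkenstein (2001). Our derivation below shows that a minor error exists in their final result. Based on this assumption, $p_t \sim N(\mu, \sigma^2)$. We could replace the terms in (2.3) using $E[p_t] = \mu$ and $E[p_t^2] = \mu^2 + \sigma^2$ and get
$$\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}}) = \frac{1}{T^2}\sum_{t=1}^{T} \frac{\mu(1-\mu) - \sigma^2}{N_t} + \frac{\sigma^2}{T}. \tag{2.4}$$
For each year ($T = 1$), the observed $\mathrm{DR}_t$ (or $\hat{p}_t$) follows a distribution with mean and variance
$$E[\mathrm{DR}_t] = \mu, \qquad \operatorname{Var}(\mathrm{DR}_t) = \frac{\mu(1-\mu)}{N_t} + \sigma^2 - \frac{\sigma^2}{N_t}. \tag{2.5}$$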
Note the missing term in Cantor and Falkenstein (2001, Equation (3)).
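The two components of (2.3) can be estimated from a series of realized annual DRs, as in the minimal sketch below. The function name is hypothetical. Two assumptions are made and flagged in the comments: the moments of the PIT PD are replaced by moments of the realized DRs, and the time-series term uses the sample (n − 1) variance, which appears to match the BBB second term in Table 5. The yearly obligor counts in the example are made up, since the paper does not list them.

```python
from statistics import mean, variance


def pit_lrpd_variance(annual_dr, obligors):
    """Simple-average LRPD and the two variance components of (2.3)."""
    T = len(annual_dr)
    lrpd = mean(annual_dr)                               # estimate of E[p_t]
    second_moment = mean([dr * dr for dr in annual_dr])  # estimate of E[p_t^2]
    # Binomial-process term: (1/T^2) * sum_t (E[p_t] - E[p_t^2]) / N_t
    binomial_term = sum((lrpd - second_moment) / n_t for n_t in obligors) / T ** 2
    # Time-series term: Var(p_t) / T (assumption: sample variance of the DRs)
    time_series_term = variance(annual_dr) / T
    return lrpd, binomial_term, time_series_term


# Realized BBB default rates for 1995-2015 (Table 4, as fractions; the 2002
# value uses the maximum reported in Table 5).
bbb_dr = [0.00505, 0, 0, 0.00314, 0.00143, 0.00262, 0.00492, 0.01004,
          0.00222, 0.00107, 0.00208, 0, 0, 0.00110, 0, 0, 0, 0, 0, 0, 0]
bbb_n = [800] * len(bbb_dr)  # hypothetical obligor counts per year

lrpd, first, second = pit_lrpd_variance(bbb_dr, bbb_n)
print(f"LRPD = {100 * lrpd:.3f}%, first term = {first:.2e}, second term = {second:.2e}")
```

The total deviation of the estimate is the square root of the sum of the two terms.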
2.2.1 PIT PD with correlated obligors
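For the PIT PD model, we can go a step further to make the assumptions more realistic by allowing the default events of obligors to be correlated. Keeping all the rest of the assumptions unchanged, we now assume within each year $t$ (still independent across different years) that the defaults of obligors are represented by identically distributed Bernoulli random variables $X_{t,i}$, $i = 1, \dots, N_t$, with $X_{t,i} \sim \mathrm{Bernoulli}(p_t)$ and with pairwise correlation $\rho$ between $X_{t,i}$ and $X_{t,j}$, $i \neq j$. This will only change the first term in the total variance formula, $E[\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}} \mid \mathbf{p})]$, where $\mathbf{p} = (p_1, \dots, p_T)$ and $T$ is the number of years. We evaluate it under the new assumption:
$$\operatorname{Var}(\mathrm{DR}_t \mid p_t) = \frac{1}{N_t^2}\Bigl[N_t\,p_t(1-p_t) + N_t(N_t-1)\,\rho\,p_t(1-p_t)\Bigr] = \frac{\bigl(1 + (N_t-1)\rho\bigr)\,p_t(1-p_t)}{N_t},$$
$$E\bigl[\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}} \mid \mathbf{p})\bigr] = \frac{1}{T^2}\sum_{t=1}^{T} \frac{\bigl(1 + (N_t-1)\rho\bigr)\bigl(E[p_t] - E[p_t^2]\bigr)}{N_t}.$$
Compared to the result under the independent assumption, the total variance is increased by
$$\frac{1}{T^2}\sum_{t=1}^{T} \frac{(N_t-1)\,\rho\,\bigl(E[p_t] - E[p_t^2]\bigr)}{N_t}.$$
The full total variance in (2.3) becomes
$$\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}}) = \frac{1}{T^2}\sum_{t=1}^{T} \frac{\bigl(1 + (N_t-1)\rho\bigr)\bigl(E[p_t] - E[p_t^2]\bigr)}{N_t} + \frac{1}{T}\bigl(E[p_t^2] - (E[p_t])^2\bigr).$$
When $N_t$ is large, the covariance term dominates, and
$$\operatorname{Var}(\mathrm{LRPD}_{\mathrm{PIT}}) \approx \frac{\rho}{T}\bigl(E[p_t] - E[p_t^2]\bigr) + \frac{1}{T}\bigl(E[p_t^2] - (E[p_t])^2\bigr).$$
The reasonable assumption/estimation of $\rho$ is another topic and is beyond the scope of this paper.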
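The effect of the pairwise correlation on the binomial-process term can be sketched as follows. The function name and all inputs (moments, obligor counts and the correlation value) are hypothetical and chosen only to show the behavior; the paper does not propose a value for the correlation.

```python
def correlated_binomial_term(mean_p, mean_p2, obligors, rho):
    """Binomial-process term of the total variance with pairwise default
    correlation rho inside each year (years remain independent)."""
    T = len(obligors)
    return sum((1 + (n_t - 1) * rho) * (mean_p - mean_p2) / n_t
               for n_t in obligors) / T ** 2


# Hypothetical inputs: E[p_t] = 2%, E[p_t^2] = 0.05%, five years of data.
obligors = [500, 600, 700, 800, 900]
independent = correlated_binomial_term(0.02, 0.0005, obligors, rho=0.0)
correlated = correlated_binomial_term(0.02, 0.0005, obligors, rho=0.05)
print(independent, correlated)
```

Setting `rho=0.0` recovers the independent case, and for very large yearly cohorts the term approaches `rho * (mean_p - mean_p2) / T`, so it no longer shrinks as more obligors are added.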
2.3 TTC PD versus PIT PD
The two mathematical models are built on the same assumption; namely, for each credit rating there is a fundamental implied long-term PD. For the TTC PD estimation model, the annual PDs are assumed to be constant and equal to this long-term PD. For the PIT PD estimation model, the annual PDs are assumed to be realizations of a random variable whose expectation is the long-term PD. As the number of years gets larger, the obligor-weighted estimate $\mathrm{LRPD}_{\mathrm{TTC}}$ and the simple-average estimate $\mathrm{LRPD}_{\mathrm{PIT}}$ will converge.
Despite having the same assumption on the existence of long-term PDs, the two mathematical models have different focuses. The TTC PD model ignores the fluctuation of the implied PD of credit ratings over time and is inappropriate for the PD estimation of a practical credit system whose ratings have a TTC component.[2] However, it is more suitable (or easier to use) for some processes in PD validation, eg, the calibration of the rating system and the evaluation of rating consistency between different portfolios. It may best be used to answer the following question: in the long run, does the rating system yield reliable/consistent rankings of obligors’ credit? In other words, when the monotonicity of implied PDs and the separability of their confidence intervals are evaluated (Hanson and Schuermann 2005), the TTC PD model is the simplest and best option. On the other hand, the PIT PD model focuses on the annual realized PD and provides reliable answers to the following questions: what is the best PD estimate for the coming years, and what is its confidence level?
[2] Having a TTC component in the rating means that the rating is not pure PIT; a constant PD assumption is only appropriate for a PIT rating system where the economic conditions are 100% (ideally) absorbed in the rating so that the implied PD can stay constant.
3 Model application
To better understand the mathematical models introduced in the previous section, we demonstrate the application of the methodologies to the S&P data. The visualization and interpretation of the following results will be helpful for practitioners in making decisions on assumptions and methods to adopt during their PD estimation and validation processes.
The S&P data available is for large corporates and covers the years 1982–2015. According to BCBS requirements (Basel Committee on Banking Supervision 2006, Paragraph 463), the length of the underlying historical observation period used for PD estimation must be at least five years. If the available observation period spans a longer period, and the data is relevant and material, this longer period must be used. As the credit rating system evolves along with the macroeconomic environment and regulations, it is reasonable to expect that older, historic data will become irrelevant. How to determine whether data is “relevant and material” is at the discretion of practitioners. In this paper, as a simple demonstration, we use the population stability index (PSI, a common metric used to evaluate population distribution shifts between the development and validation periods of a rating system) as a metric to indicate changes in the population, the rating system or both:
$$\mathrm{PSI} = \sum_{i} (A_i - E_i)\,\ln\frac{A_i}{E_i},$$
where $A_i$ and $E_i$ are the proportions of obligors in rating grade $i$ in the two populations being compared.
We start from the most recent year and move backward in time. We calculate the PSI of year $t$’s rating distribution against that of years $t$–2015. Results are shown in Table 1.
Year | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 |
---|---|---|---|---|---|---|---|---|---|---|---|
PSI | 0.30 | 0.24 | 0.21 | 0.17 | 0.16 | 0.11 | 0.11 | 0.09 | 0.09 | 0.11 | 0.09 |

Year | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 |
---|---|---|---|---|---|---|---|---|---|---|---|
PSI | 0.08 | 0.07 | 0.04 | 0.02 | 0.03 | 0.06 | 0.03 | 0.02 | 0.02 | 0.01 | 0 |
We base this on the criteria that $\mathrm{PSI} < 0.1$ indicates insignificant change, $0.1 \le \mathrm{PSI} < 0.25$ indicates some minor change and $\mathrm{PSI} \ge 0.25$ indicates major change. On this basis, we consider that S&P data back to 1995 is relevant and will use data from 1995 to 2015 as our analysis data set. The rating distribution shift can be easily spotted in Figure 1.
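A minimal sketch of the PSI calculation is given below, using the common textbook definition. The function name and the example counts are hypothetical, and skipping grades that are empty in either population is an assumption of this sketch (a small floor value is another common choice), not a procedure described in the paper.

```python
from math import log


def population_stability_index(actual_counts, expected_counts):
    """PSI between two rating distributions given as obligor counts per grade."""
    a_total = sum(actual_counts)
    e_total = sum(expected_counts)
    psi = 0.0
    for a, e in zip(actual_counts, expected_counts):
        pa, pe = a / a_total, e / e_total
        if pa == 0 or pe == 0:
            continue  # assumption: ignore grades empty in either population
        psi += (pa - pe) * log(pa / pe)
    return psi


# Hypothetical counts per grade: one year versus a reference window.
year_counts = [50, 50]
window_counts = [25, 75]
print(round(population_stability_index(year_counts, window_counts), 4))
```

Identical distributions give a PSI of exactly zero, which is consistent with the 2015 entry in Table 1, where the year is compared against a window that contains only itself.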
3.1 TTC PD evaluation
The results for the TTC PD evaluation are summarized in Table 2.
Rating | Total number of obligors | Total number of defaults | TTC PD (%) | Deviation (%) | Coefficient of variance (%) |
---|---|---|---|---|---|
AAA | 532 | 0 | 0.000 | 0.000 | |
AA+ | 399 | 0 | 0.000 | 0.000 | |
AA | 1078 | 0 | 0.000 | 0.000 | |
AA- | 1585 | 0 | 0.000 | 0.000 | |
A+ | 2461 | 2 | 0.081 | 0.057 | 70.682 |
A | 4179 | 0 | 0.000 | 0.000 | |
A- | 4813 | 0 | 0.000 | 0.000 | |
BBB+ | 5690 | 5 | 0.088 | 0.039 | 44.702 |
BBB | 6798 | 7 | 0.103 | 0.039 | 37.777 |
BBB- | 5234 | 14 | 0.267 | 0.071 | 26.690 |
BB+ | 3201 | 4 | 0.125 | 0.062 | 49.969 |
BB | 4145 | 27 | 0.651 | 0.125 | 19.182 |
BB- | 5598 | 45 | 0.804 | 0.119 | 14.847 |
B+ | 7734 | 184 | 2.379 | 0.173 | 7.284 |
B | 6599 | 255 | 3.864 | 0.237 | 6.140 |
B- | 3005 | 260 | 8.652 | 0.513 | 5.927 |
CCC+ | 931 | 206 | 22.127 | 1.360 | 6.148 |
CCC | 500 | 168 | 33.600 | 2.112 | 6.287 |
CCC- | 143 | 73 | 51.049 | 4.180 | 8.189 |
CC | 139 | 85 | 61.151 | 4.134 | 6.761 |
The most notable issue in Table 2 is that the monotonicity of the implied PD is broken for ratings above BB. This result is consistent with that in Hanson and Schuermann (2005), which indicates that, due to the limited default data, fine rating grades in high credit rankings are difficult to calibrate and deemed unreliable. In our subsequent analyses, we will therefore combine the ratings BB and above by ignoring their modifiers.[3] Table 2 becomes Table 3. The monotonicity of implied PDs is conserved and the TTC PD estimates of ratings below AA are separated by at least 2.44 standard deviations (98.5% confidence).
[3] Although we are tempted to combine modifiers in investment grades only, Table 11 in the online appendix shows that rating BB+ is not distinguishable from BBB-.
Rating | Total number of obligors | Total number of defaults | TTC PD (%) | Deviation (%) | Coefficient of variance (%) |
---|---|---|---|---|---|
AAA | 532 | 0 | 0.000 | 0.000 | |
AA | 3062 | 0 | 0.000 | 0.000 | |
A | 11453 | 2 | 0.017 | 0.012 | 70.705 |
BBB | 17722 | 26 | 0.147 | 0.029 | 19.597 |
BB | 12944 | 76 | 0.587 | 0.067 | 11.437 |
B+ | 7734 | 184 | 2.379 | 0.173 | 7.284 |
B | 6599 | 255 | 3.864 | 0.237 | 6.140 |
B- | 3005 | 260 | 8.652 | 0.513 | 5.927 |
CCC+ | 931 | 206 | 22.127 | 1.360 | 6.148 |
CCC | 500 | 168 | 33.600 | 2.112 | 6.287 |
CCC- | 143 | 73 | 51.049 | 4.180 | 8.189 |
CC | 139 | 85 | 61.151 | 4.134 | 6.761 |
The validation results of the above monotonicity and confidence interval analyses indicate that banks should not try to keep fine rating notches with high credit ratings. In the PD estimation process, rather than applying techniques such as line-of-best-fit to get monotonic PD numbers, fine rating notches could be combined into coarse rating grades over which PD estimation may be done. The advantage of this alternative approach is two-fold. First, it would help improve product pricing. Second, it would not distort the true underlying PDs of the rating system.
Back to the discussion on PD parameter estimation: under the TTC PD methodology, with the common assumption by most practitioners that the annual DR follows an iid normal distribution whose mean is the long-run PD $p$, we can derive the 95% confidence interval of the estimate of $p$ for each rating grade. The upper limit is considered an estimate with a sufficient margin of conservatism. However, as we mentioned in the introduction of this paper, if we conduct a backtest, excessive breaches (highlighted in bold in Table 4) may be observed in all rating grades with meaningful historical defaults, even though only one breach is expected over the twenty-one-year history. This inconsistency indicates that the TTC PD assumption is not appropriate for a TTC rating system, whose rating stays stable over the economic cycle while its implied PD fluctuates. This problem could be fixed with PIT PD modeling, which we will show in the next subsection.
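The backtest described above can be sketched for a single grade as follows. The function names are hypothetical. The use of a one-sided 95% normal quantile is an assumption of this sketch; it appears to reproduce the BBB upper limit of 0.194% shown in Table 4. The inputs are the pooled BBB counts from Table 3 and the BBB annual DRs from Table 4.

```python
from math import sqrt
from statistics import NormalDist


def ttc_upper_limit(n_total, d_total, level=0.95):
    """Upper confidence limit of the TTC PD (one-sided normal quantile,
    an assumption of this sketch)."""
    p = d_total / n_total
    deviation = sqrt(p * (1 - p) / n_total)
    return p + NormalDist().inv_cdf(level) * deviation


def count_breaches(annual_dr, limit):
    """Number of years whose realized DR exceeds the limit."""
    return sum(1 for dr in annual_dr if dr > limit)


# BBB realized annual default rates, 1995-2015, in percent (Table 4).
bbb_dr_pct = [0.505, 0, 0, 0.314, 0.143, 0.262, 0.492, 1, 0.222, 0.107,
              0.208, 0, 0, 0.110, 0, 0, 0, 0, 0, 0, 0]

limit_pct = 100 * ttc_upper_limit(17722, 26)
print(f"upper limit = {limit_pct:.3f}%, breaches = {count_breaches(bbb_dr_pct, limit_pct)}")
```

For BBB this counts seven years above the limit over the twenty-one-year history, far more than the single breach a 95% limit would suggest.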
3.2 PIT PD evaluation
Rating | AAA | AA | A | BBB | BB | B+ | B | B- | CCC+ | CCC | CCC- | CC |
---|---|---|---|---|---|---|---|---|---|---|---|---|
95% confidence upper limit (TTC) | 0.000 | 0.000 | 0.038 | 0.194 | 0.698 | 2.664 | 4.254 | 9.496 | 24.364 | 37.075 | 57.925 | 67.951 |
1995 | 0 | 0 | 0 | 0.505 | 0.857 | 2.575 | 8.140 | 7.692 | 0 | 50 | 0 | — |
1996 | 0 | 0 | 0 | 0 | 0.535 | 1.762 | 3.226 | 10.417 | 12.500 | 25 | — | — |
1997 | 0 | 0 | 0 | 0 | 0.456 | 0.837 | 5.691 | 9.091 | 6.667 | 16.667 | 100 | 100 |
1998 | 0 | 0 | 0 | 0.314 | 0.198 | 1.739 | 6.107 | 6.667 | 43.750 | 40 | — | 50 |
1999 | 0 | 0 | 0 | 0.143 | 1.024 | 5.252 | 10.500 | 21.978 | 32 | 46.154 | 50 | 28.571 |
2000 | 0 | 0 | 0 | 0.262 | 1.495 | 4.822 | 10.577 | 16.667 | 13.333 | 26.667 | 54.545 | 28.571 |
2001 | 0 | 0 | 0.339 | 0.492 | 1.846 | 5.747 | 16.146 | 29.464 | 53.061 | 42.308 | 55.556 | 83.333 |
2002 | 0 | 0 | 0 | 1 | 3.442 | 9.348 | 13.115 | 26.263 | 40.351 | 44.828 | 36.364 | 66.667 |
2003 | 0 | 0 | 0 | 0.222 | 0.504 | 3.155 | 9.302 | 19.780 | 34.043 | 35.135 | 46.154 | 65 |
2004 | 0 | 0 | 0 | 0.107 | 0.308 | 0.287 | 4.500 | 2.222 | 13.636 | 22.581 | 28.571 | 40 |
2005 | 0 | 0 | 0 | 0.208 | 0.144 | 1.171 | 1.695 | 4.348 | 13.953 | 25 | 0 | 40 |
2006 | 0 | 0 | 0 | 0 | 0.556 | 0.225 | 0.735 | 2.326 | 11.475 | 11.765 | 33.333 | 33.333 |
2007 | 0 | 0 | 0 | 0 | 0 | 0 | 0.287 | 2.381 | 11.538 | 15.789 | 20 | 100 |
2008 | 0 | 0 | 0 | 0.110 | 0.141 | 2.506 | 1.695 | 4.420 | 20.455 | 40 | 62.500 | 50 |
2009 | 0 | 0 | 0 | 0 | 0.917 | 7.267 | 13.506 | 25.455 | 47.273 | 60.606 | 100 | 85.714 |
2010 | 0 | 0 | 0 | 0 | 0 | 0.366 | 0.917 | 2.740 | 10.465 | 33.333 | 54.545 | 73.684 |
2011 | 0 | 0 | 0 | 0 | 0 | 0 | 0.964 | 3.620 | 10.638 | 18.750 | 50 | 57.143 |
2012 | 0 | 0 | 0 | 0 | 0.145 | 0.528 | 1.035 | 5.213 | 20.755 | 40 | 66.667 | 57.143 |
2013 | 0 | 0 | 0 | 0 | 0.145 | 0.501 | 1.107 | 3.540 | 19.048 | 37.838 | 72.727 | 75 |
2014 | 0 | 0 | 0 | 0 | 0 | 0 | 0.699 | 3.200 | 19.403 | 31.429 | 57.143 | 75 |
2015 | 0 | 0 | 0 | 0 | 0.362 | 0.693 | 2.083 | 4.319 | 18.310 | 29.167 | 45 | 100 |
When we tried to evaluate the annual realized DRs for S&P, we discovered that, for certain rating grades, data is not available for all of the years from 1995 to 2015. Namely, for some years, there are no obligors in the S&P data set rated at certain rating grades. These are regarded as missing data points, and the total number of years is adjusted accordingly. Table 5 summarizes the key statistics from applying the PIT PD model to the S&P large corporate data set (1995–2015).
Rating | Min PIT PD (%) | Max PIT PD (%) | Expectation (%) | Deviation (%) | Coefficient of variance (%) | First term | Second term |
---|---|---|---|---|---|---|---|
AAA | 0 | 0 | 0 | 0 | | 0 | 0 |
AA | 0 | 0 | 0 | 0 | | 0 | 0 |
A | 0 | 0.339 | 0.016 | 0.020 | 124.302 | 1.42E-08 | 2.61E-08 |
BBB | 0 | 1.004 | 0.160 | 0.063 | 39.406 | 9.71E-08 | 3.02E-07 |
BB | 0 | 3.442 | 0.623 | 0.192 | 30.813 | 4.95E-07 | 3.19E-06 |
B+ | 0 | 9.348 | 2.323 | 0.609 | 26.205 | 2.99E-06 | 3.41E-05 |
B | 0.287 | 16.146 | 5.335 | 1.147 | 21.495 | 1.05E-05 | 1.21E-04 |
B- | 2.222 | 29.464 | 10.086 | 2.083 | 20.655 | 3.87E-05 | 3.95E-04 |
CCC+ | 0 | 53.061 | 21.555 | 3.483 | 16.158 | 0.00022 | 0.00100 |
CCC | 11.765 | 60.606 | 33.001 | 3.838 | 11.630 | 0.00073 | 0.00075 |
CCC- | 0 | 100 | 49.111 | 7.818 | 15.919 | 0.00233 | 0.00378 |
CC | 28.571 | 100 | 63.640 | 7.088 | 11.137 | 0.00205 | 0.00297 |
As expected, the S&P data set is large enough that the LRPDs computed under the two methodologies are quite close to each other. However, some deviations are obvious at ratings “B” and “B-” in Figure 2. These are where neither the number of obligors nor the PD is large enough. Figure 2 also shows that the LRPDs from both models can be well fitted by lines, indicating that the S&P rating system is nicely calibrated. The two lines are close to each other too. Although we do not suggest using the PDs from fitted lines for ratings with default data (“A” and below), the extrapolated PDs for ratings “AAA” and “AA” are good candidates. The readings are 0.003% for “AA” and 0.001% for “AAA”, which will be used as the PD estimates for these two ratings.
The last two columns in Table 5 show the contributions of variance from binomial processes and time series to the total variance. The variances from time series are always larger than those from binomial processes, especially in the middle credit rating ranges, where the numbers of obligors $N_t$ are relatively large compared to $E[p_t]/\operatorname{Var}(p_t)$, the ratio of the mean to the variance of the PIT PD $p_t$. However, at both ends of the spectrum, they are comparable. This observation suggests that the deviation from the TTC PD model is not appropriate to derive the confidence interval when LRPD is used as the prediction of PD for the coming year; this is evident from the backtest result shown earlier. It misses the variance coming from the annual realization of $p_t$. Indeed, if we take the square root of the first term of variance in Table 5, the numbers are very close to the deviation column in Table 3.
Conducting the same backtest using the deviation under the PIT PD model, we see (in Table 6) that the numbers of breaches are as expected.
3.3 PD estimate for capital use
Rating | AAA | AA | A | BBB | BB | B+ | B | B- | CCC+ | CCC | CCC- | CC |
---|---|---|---|---|---|---|---|---|---|---|---|---|
95% confidence upper limit (PIT) | 0.000 | 0.000 | 0.168 | 0.619 | 2.040 | 6.876 | 13.720 | 25.317 | 46.517 | 58.611 | 95.835 | 100.000 |
1995 | 0 | 0 | 0 | 0.505 | 0.857 | 2.575 | 8.140 | 7.692 | 0 | 50 | 0 | — |
1996 | 0 | 0 | 0 | 0 | 0.535 | 1.762 | 3.226 | 10.417 | 12.500 | 25 | — | — |
1997 | 0 | 0 | 0 | 0 | 0.456 | 0.837 | 5.691 | 9.091 | 6.667 | 16.667 | 100 | 100 |
1998 | 0 | 0 | 0 | 0.314 | 0.198 | 1.739 | 6.107 | 6.667 | 43.750 | 40 | — | 50 |
1999 | 0 | 0 | 0 | 0.143 | 1.024 | 5.252 | 10.500 | 21.978 | 32 | 46.154 | 50 | 28.571 |
2000 | 0 | 0 | 0 | 0.262 | 1.495 | 4.822 | 10.577 | 16.667 | 13.333 | 26.667 | 54.545 | 28.571 |
2001 | 0 | 0 | 0.339 | 0.492 | 1.846 | 5.747 | 16.146 | 29.464 | 53.061 | 42.308 | 55.556 | 83.333 |
2002 | 0 | 0 | 0 | 1 | 3.442 | 9.348 | 13.115 | 26.263 | 40.351 | 44.828 | 36.364 | 66.667 |
2003 | 0 | 0 | 0 | 0.222 | 0.504 | 3.155 | 9.302 | 19.780 | 34.043 | 35.135 | 46.154 | 65 |
2004 | 0 | 0 | 0 | 0.107 | 0.308 | 0.287 | 4.500 | 2.222 | 13.636 | 22.581 | 28.571 | 40 |
2005 | 0 | 0 | 0 | 0.208 | 0.144 | 1.171 | 1.695 | 4.348 | 13.953 | 25 | 0 | 40 |
2006 | 0 | 0 | 0 | 0 | 0.556 | 0.225 | 0.735 | 2.326 | 11.475 | 11.765 | 33.333 | 33.333 |
2007 | 0 | 0 | 0 | 0 | 0 | 0 | 0.287 | 2.381 | 11.538 | 15.789 | 20 | 100 |
2008 | 0 | 0 | 0 | 0.110 | 0.141 | 2.506 | 1.695 | 4.420 | 20.455 | 40 | 62.500 | 50 |
2009 | 0 | 0 | 0 | 0 | 0.917 | 7.267 | 13.506 | 25.455 | 47.273 | 60.606 | 100 | 85.714 |
2010 | 0 | 0 | 0 | 0 | 0 | 0.366 | 0.917 | 2.740 | 10.465 | 33.333 | 54.545 | 73.684 |
2011 | 0 | 0 | 0 | 0 | 0 | 0 | 0.964 | 3.620 | 10.638 | 18.750 | 50 | 57.143 |
2012 | 0 | 0 | 0 | 0 | 0.145 | 0.528 | 1.035 | 5.213 | 20.755 | 40 | 66.667 | 57.143 |
2013 | 0 | 0 | 0 | 0 | 0.145 | 0.501 | 1.107 | 3.540 | 19.048 | 37.838 | 72.727 | 75 |
2014 | 0 | 0 | 0 | 0 | 0 | 0 | 0.699 | 3.200 | 19.403 | 31.429 | 57.143 | 75 |
2015 | 0 | 0 | 0 | 0 | 0.362 | 0.693 | 2.083 | 4.319 | 18.310 | 29.167 | 45 | 100 |
One major application of a PD estimate is for IRB capital calculation. In this case, the PD estimate is used as a prediction of the coming year’s DR. Clearly, the PIT PD estimation model would be more appropriate for this purpose because the assumption of the coming year’s $\mathrm{DR}_{T+1}$, with its parameter $p_{T+1}$ following the same distribution as previous years, and with an expectation of $\mu$ and a variance of $\sigma^2$, is much more realistic than the assumption of every year’s DR being equal to a constant $p$. So, we have the expected DR in the coming year as $E[p_t]$, estimated as the average of the historical annual realized DRs ($\mathrm{DR}_t$), $\frac{1}{T}\sum_{t=1}^{T} \mathrm{DR}_t$, also known as $\mathrm{LRPD}_{\mathrm{PIT}}$. The expected deviation of DR is
$$\sqrt{\operatorname{Var}(\mathrm{DR}_{T+1})} = \sqrt{\frac{E[p_t] - E[p_t^2]}{N_{T+1}} + E[p_t^2] - (E[p_t])^2},$$
by replacing $T$ with 1 in (2.3), where $N_{T+1}$ is the number of obligors in the coming year. Here, like $E[p_t]$, $E[p_t^2] - (E[p_t])^2$ can be approximated as the standard deviation (STD) of $\mathrm{DR}_t$ squared, $\mathrm{STD}^2(\mathrm{DR}_t)$. Some might be tempted to use the STD directly to compute the confidence interval for the coming year’s PD. However, as we have shown in the previous subsection, the deviation of the PD prediction would be underestimated, especially for credit ratings at the two extremes (good and bad), where the first term of $\operatorname{Var}(\mathrm{DR}_{T+1})$ would play an important role because $N_{T+1}$ is not much larger than $E[p_t]/\operatorname{Var}(p_t)$.
Back to the S&P large corporate data set (1995–2015), if we assume that the numbers of obligors do not change between 2015 and 2016, Table 7 shows our PD estimate/prediction for 2016.
Rating | No. of obligors | PD estimate (%) | Deviation: first term (%) | Deviation: second term (%) | Deviation: total (%) | One-sided confidence 80% (%) | One-sided confidence 90% (%) |
---|---|---|---|---|---|---|---|
AAA | 10 | 0.001* | 0 | 0 | 0 | — | — |
AA | 94 | 0.003* | 0 | 0 | 0 | — | — |
A | 520 | 0.016 | 0.056 | 0.074 | 0.093 | 0.094 | 0.135 |
BBB | 1118 | 0.160 | 0.119 | 0.252 | 0.279 | 0.395 | 0.518 |
BB | 828 | 0.623 | 0.272 | 0.818 | 0.862 | 1.348 | 1.727 |
B+ | 433 | 2.323 | 0.712 | 2.675 | 2.768 | 4.653 | 5.870 |
B | 816 | 5.335 | 0.767 | 5.040 | 5.098 | 9.625 | 11.868 |
B- | 301 | 10.086 | 1.654 | 9.111 | 9.260 | 17.879 | 21.953 |
CCC+ | 71 | 21.555 | 4.568 | 14.472 | 15.176 | 34.327 | 41.003 |
CCC | 24 | 33.001 | 9.252 | 12.523 | 15.570 | 46.105 | 52.954 |
CCC- | 20 | 49.111 | 9.438 | 26.793 | 28.406 | 73.018 | 85.515 |
CC | 3 | 63.640 | 24.148 | 23.762 | 33.878 | 92.153 | 100.000 |
The right two columns in Table 7 provide the 80% and 90% one-sided confidence intervals. These can be interpreted as an expectation that there will be no more than two and one breaches in ten years, respectively. A simple check of the S&P historic data found that two years for “CCC-” and three years for “CC” observed 100% default rates; this confirms that the 100% PD estimate for these low credit ratings is not overly conservative, considering their high expected PD and low number of obligors. However, the evaluation of the confidence interval is done on each individual rating grade independently. Since the annual PDs $p_t$ of the different ratings are not perfectly correlated (see Table 8 for the realized correlation coefficients), the 80% one-sided confidence intervals might be good enough for the overall portfolio level.
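The one-year prediction and its one-sided limits can be sketched as below. The function name is hypothetical. The sketch assumes a normal quantile and caps limits at 100%, which appears consistent with the figures in Table 7. The example inputs are the rounded BBB values from Table 7 (PD estimate 0.160%, time-series STD 0.252%, 1118 obligors), so the results agree with the table only up to rounding.

```python
from math import sqrt
from statistics import NormalDist


def one_year_pd_limits(mu, sigma, n_next, levels=(0.80, 0.90)):
    """Deviation of next year's DR and one-sided upper limits.

    mu and sigma are the mean and STD of the historical annual DRs;
    n_next is the assumed number of obligors in the coming year.
    """
    # First term: (E[p_t] - E[p_t^2]) / N, with E[p_t^2] = mu^2 + sigma^2
    binomial_var = (mu - (mu ** 2 + sigma ** 2)) / n_next
    total_dev = sqrt(binomial_var + sigma ** 2)
    limits = {level: min(1.0, mu + NormalDist().inv_cdf(level) * total_dev)
              for level in levels}
    return sqrt(binomial_var), total_dev, limits


first_dev, total_dev, limits = one_year_pd_limits(0.00160, 0.00252, 1118)
print(f"first term = {100 * first_dev:.3f}%, total = {100 * total_dev:.3f}%")
print({level: f"{100 * value:.3f}%" for level, value in limits.items()})
```

Using the time-series STD alone (0.252%) instead of the total deviation would give lower limits, which is the underestimation discussed above.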
Rating | BBB | BB | B+ | B | B- | CCC+ | CCC | CCC- | CC |
---|---|---|---|---|---|---|---|---|---|
BBB | 1 | 0.8472 | 0.7051 | 0.6511 | 0.5752 | 0.4117 | 0.4107 | 0.3541 | 0.1349 |
BB | 0.8472 | 1 | 0.8746 | 0.7846 | 0.7874 | 0.4836 | 0.3792 | 0.0302 | 0.0607 |
B+ | 0.7051 | 0.8746 | 1 | 0.8954 | 0.9217 | 0.6830 | 0.6599 | 0.1565 | 0.0703 |
B | 0.6511 | 0.7846 | 0.8954 | 1 | 0.9448 | 0.6817 | 0.5961 | 0.1678 | 0.0160 |
B- | 0.5752 | 0.7874 | 0.9217 | 0.9448 | 1 | 0.7440 | 0.5797 | 0.2562 | 0.0334 |
CCC+ | 0.4117 | 0.4836 | 0.6830 | 0.6817 | 0.7440 | 1 | 0.6120 | 0.3407 | 0.0756 |
CCC | 0.4107 | 0.3792 | 0.6599 | 0.5961 | 0.5797 | 0.6120 | 1 | 0.2216 | 0.0393 |
CCC- | 0.3541 | 0.0302 | 0.1565 | 0.1678 | 0.2562 | 0.3407 | 0.2216 | 1 | 0.3764 |
CC | 0.1349 | 0.0607 | 0.0703 | 0.0160 | 0.0334 | 0.0756 | 0.0393 | 0.3764 | 1 |
4 PIT versus TTC
Up to now, our discussion has been from the perspectives of PD estimation and validation based on realized PD using historical data. PIT PD is the realized PD for a certain period, usually a year. TTC PD, on the other hand, is the PD for the whole history. At the same time, we have the notion of a PIT and TTC rating system/philosophy. We are able to observe both TTC PD and PIT PD from a rating system no matter what philosophy it deploys. As discussed in Basel Committee on Banking Supervision (2005), we expect the PIT PD of a PIT rating system to be stable around TTC PD, and the PIT PD of a TTC rating system to be negatively correlated with the economic cycle. The more TTC the rating system is, the more the PIT PD varies. The plot (shown in Figure 3) of the realized annual PDs of a retail rating system XXX (PIT) versus S&P (TTC) confirms this statement.
The realized PD of S&P rating “B” showed high spikes in 2002–3 and 2009, indicating a delayed effect from the macroeconomic recessions in 2001–2 and 2008–9. Meanwhile, the realized PD of the XXX rating system stays close to its TTC PD and does not follow the economic cycle at all.
4.1 PIT PD prediction for IFRS 9
Recently, a lot of research has been conducted to predict the PIT PD term structure in response to the IFRS 9 requirement (International Financial Reporting Standards 2014, Section 5.5.10): “At each reporting date, an entity shall measure the loss allowance for a financial instrument at an amount equal to the lifetime expected credit losses if the credit risk on that financial instrument has increased significantly since initial recognition.”
By lifetime expected credit losses (ECLs), we are to understand the “worst annual” ECL in a lifetime. Since ECL is computed based on the parameters PD, loss given default (LGD) and exposure at default (EAD), banks try to compute lifetime (usually five years) PIT PDs. There are several approaches to this, and each has different drawbacks.
- (1) Derive the five-year cumulative PD term based on the historical data. This is a direct extension of the one-year PD estimate based on the Cohort method. However, the insufficiency of data becomes more serious as more years are considered (eg, very few obligors have rating data for five years). Therefore, more nonmonotonicity and missing data points for realized PD mean that techniques like curve-fitting and extrapolation are overused. This will result in unreliable PD term estimation and oftentimes unreasonable values ( or ).
- (2) Project the multiperiod PD based on rating transition models. There are two main issues with this approach. First, it assumes that rating migration follows a stationary Markov chain process. This assumption has been shown to be questionable because of the momentum in credit ratings. Second, again because of limited data availability, it is extremely difficult to get a reliable estimate of the transition matrix.
- (3) Predict the forward PD from macroeconomic variables. As shown at the beginning of this section, the variation of realized PD in a TTC rating system can be explained in part by macroeconomic conditions. For a PIT rating system, however, changes in macroeconomic conditions are already embedded in the ratings assigned, so the observed PD varies less and is uncorrelated with macroeconomic parameters. Even though, for a TTC rating system, the macroeconomic variables (current or lagged by one or two years) might provide good explanatory power for a one- or two-year PD, we should not expect them to be predictive of long-term PD. The reasoning behind this is as follows. Assume that some macroeconomic variables are able to reliably predict the five-year PD; at the same time, the PD is highly correlated with the recent macroeconomic parameters. This would imply that these macroeconomic parameters could themselves be reliably predicted over a five-year range, which is contrary to the basic principles of economics.
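To make approach (2) concrete, the following minimal sketch projects cumulative PDs by raising a one-year transition matrix to successive powers, which is exactly where the stationary Markov assumption criticized above enters. The three-state matrix `P` and the helper names are hypothetical, chosen for illustration only; they are not estimated from the S&P data set.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cumulative_pd(transition, horizon):
    """Cumulative PD per starting grade for years 1..horizon.

    Assumes a stationary Markov chain whose last state is default
    and is absorbing (the assumption questioned in the text).
    """
    power = transition
    term_structure = []
    for _ in range(horizon):
        # The last column of P^t is the probability of having defaulted by year t.
        term_structure.append([row[-1] for row in power[:-1]])
        power = mat_mul(power, transition)
    return term_structure


# Hypothetical one-year matrix: states are "good", "weak", "default".
P = [
    [0.90, 0.08, 0.02],
    [0.10, 0.75, 0.15],
    [0.00, 0.00, 1.00],
]

term = cumulative_pd(P, 5)
for year, pds in enumerate(term, start=1):
    print(year, [round(p, 4) for p in pds])
```

Because the same matrix is reused every year, any estimation error in `P` compounds with the horizon, which is one way to see why sparse data make the five-year projection fragile.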
If we cannot hold much faith in these long-term PD predictions, are there any other options? If our understanding of lifetime ECL as worst annual ECL in a lifetime is correct, its estimate can be based on the worst annual PD in a lifetime. Here, the PD should be the one-year PIT PD: after all, IFRS 9 is an accounting standard for annual reports. According to common practice, five years can be used as a proxy for “lifetime”. The problem is thus reduced to finding the worst one-year PD in five years. Assuming the underlying annual PDs follow a normal distribution $N(\mu, \sigma^2)$, and based on our result in (2.5) in Section 2.2, the realized PD prediction . The worst one-year PD in five years would be the expected value of the largest order statistic of five iid random variables. Using the results of Teichroew (1956) directly (partially quoted in Table 9), we get the worst one-year PD in five years:
$$\mathrm{PD}_{\text{worst}} = \mu + 1.16296\,\sigma \tag{4.1}$$
where $\mu$ and $\sigma$ are the mean and STD of the historical observed PD, respectively. The numerical results applied to the S&P data set are shown in Table 10.
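Sample size n | Rank i (from largest) | Expected value of the i-th largest standard normal order statistic |
---|---|---|
2 | 1 | 0.56418 95835 |
3 | 1 | 0.84628 43753 |
4 | 1 | 1.02937 53730 |
4 | 2 | 0.29701 13823 |
5 | 1 | 1.16296 44736 |
5 | 2 | 0.49501 89705 |
6 | 1 | 1.26720 63606 |
6 | 2 | 0.64175 50388 |
6 | 3 | 0.20154 68338 |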
Rating | Number of obligors | PD estimate (%) | Deviation (%) | Worst one-year PD in five years (%) |
---|---|---|---|---|
AAA | 10 | 0.001* | — | — |
AA | 94 | 0.003* | — | — |
A | 520 | 0.016 | 0.093 | 0.124 |
BBB | 1118 | 0.160 | 0.279 | 0.485 |
BB | 828 | 0.623 | 0.862 | 1.625 |
B+ | 433 | 2.323 | 2.768 | 5.542 |
B | 816 | 5.335 | 5.098 | 11.263 |
B- | 301 | 10.086 | 9.260 | 20.855 |
CCC+ | 71 | 21.555 | 15.176 | 39.204 |
CCC | 24 | 33.001 | 15.570 | 51.108 |
CCC- | 20 | 49.111 | 28.406 | 82.146 |
CC | 3 | 63.640 | 33.878 | 100.000 |
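The order-statistic estimate can be checked numerically. The sketch below computes the expected largest of $n$ iid standard normal draws by integrating $x\,n\,\Phi(x)^{n-1}\varphi(x)$ with Simpson's rule, then adds that multiple of the STD to the mean PD. The function names, the integration bounds and the cap at 100% are our assumptions for illustration (the cap is inferred from the CC row of Table 10); the input figures are taken from Table 10.

```python
import math


def expected_max_std_normal(n, lo=-10.0, hi=10.0, steps=20000):
    """E[max of n iid N(0,1)] via composite Simpson's rule (steps must be even)."""
    def phi(x):  # standard normal density
        return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

    def Phi(x):  # standard normal distribution function
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    def f(x):  # x times the density of the largest of n draws
        return x * n * Phi(x) ** (n - 1) * phi(x)

    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


# Multiplier for "worst of five years"; compare with the n = 5, i = 1 entry in Table 9.
K5 = expected_max_std_normal(5)


def worst_annual_pd(mean_pct, std_pct, k=K5):
    """Mean plus k standard deviations, capped at 100% (assumed cap)."""
    return min(mean_pct + k * std_pct, 100.0)


# (PD estimate %, deviation %) pairs copied from Table 10.
rows = {"A": (0.016, 0.093), "BBB": (0.160, 0.279),
        "BB": (0.623, 0.862), "CC": (63.640, 33.878)}
for rating, (m, s) in rows.items():
    print(rating, round(worst_annual_pd(m, s), 3))
```

The printed values should agree with the last column of Table 10 up to rounding of the tabulated inputs.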
4.2 PIT to TTC rating conversion
Another area involving the PIT versus TTC discussion is the rating conversion from PIT to TTC. One typical objective of such a conversion is to get stable regulatory capital that does not vary greatly from year to year in response to changes in the economic cycle. The optimal way of developing another TTC rating system based on borrowers’ long-term credit characteristics involves too much effort. Moody’s (Hamilton et al 2011) proposed a method to convert its expected default frequency (EDF), a PIT credit measure based on market data (distance to default, DD), directly to TTC EDF by transforming DD. This simple shortcut, as recognized by Moody’s, comes at the cost of reduced efficiency and default prediction accuracy. Given the current input DD, the original mapped EDF is the best estimate at the moment. Any change to the mapping will result in a distorted/suboptimal EDF estimate. Keeping this in mind, we evaluate Moody’s PIT to TTC transformation. The fundamental transform is done using the simple linear equation
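$$\mathrm{DD}^{\mathrm{TTC}}_t = \overline{\mathrm{DD}} + \alpha\,(\mathrm{DD}_t - \overline{\mathrm{DD}}) \tag{4.2}$$
where $0 < \alpha < 1$. This keeps the zero-frequency component and attenuates all other frequency components equally. $\mathrm{DD}^{\mathrm{TTC}}_t$ is thus a suppressed version of $\mathrm{DD}_t$ around its mean $\overline{\mathrm{DD}}$, and the TTC EDF is a suppressed version of EDF around $\mathrm{EDF}(\overline{\mathrm{DD}})$. In terms of variation of EDF, the amplitude is decreased. However, the assumption of the constant $\overline{\mathrm{DD}}$ is a big flaw in this conversion. $\mathrm{DD}_t$ includes not only the cyclical component, but also the idiosyncratic credit factors whose changes are not mean-reverting. The exponentially weighted moving average (EWMA) seems to be a better alternative: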
(4.3) |
In general, the idea of the direct PIT to TTC rating conversion is to apply a low-pass (LP) filter to smooth out the resultant rating generated by the rating system. Whether this LP filter should be in the form of a simple moving average (SMA) or an EWMA, and whether the LP filter should be applied to the rating system input (risk drivers, ie, DD in Moody’s case) or to the final rating, are choices left up to the practitioners, as long as they are theoretically correct and fulfill the practical objectives. It should always be kept in mind that the original PIT ratings are the best ratings assigned based on the best current knowledge. The converted TTC ratings are compromised ratings for capital calculation only. PIT ratings should still be kept and used for pricing purposes, at least for short-term loans, since PIT ratings incorporate more recent information and provide risk rankings that reflect obligors’ most current credit situations. Araten et al (2004) actually showed that the discriminatory power of Moody’s KMV is higher than that of Moody’s TTC ratings based on its rating methodology papers (71% versus 64.4% during 1997–2002 in terms of accuracy ratios).
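The difference between the two low-pass choices discussed in this section can be illustrated on a toy series. The sketch below is ours, not Moody’s implementation: `shrink_to_mean` attenuates deviations around a constant full-sample mean, `ewma` uses the standard recursion with a hypothetical smoothing weight `lam`, and the DD series is synthetic, with a permanent (non-mean-reverting) drop from 4 to 2.

```python
def shrink_to_mean(series, alpha):
    """Keep the sample mean and scale every deviation from it by alpha (0 < alpha < 1)."""
    mean = sum(series) / len(series)
    return [mean + alpha * (x - mean) for x in series]


def ewma(series, lam):
    """Standard EWMA recursion: s_t = lam * s_{t-1} + (1 - lam) * x_t."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(lam * smoothed[-1] + (1.0 - lam) * x)
    return smoothed


# Synthetic DD path with a permanent deterioration halfway through.
dd = [4.0] * 6 + [2.0] * 6

dd_shrunk = shrink_to_mean(dd, alpha=0.5)
dd_ewma = ewma(dd, lam=0.7)

for t, (raw, a, b) in enumerate(zip(dd, dd_shrunk, dd_ewma)):
    print(t, raw, round(a, 3), round(b, 3))
```

With a constant mean, the shrunk series stays permanently above the new level of 2, whereas the EWMA drifts toward it over time. This is the sense in which a constant mean handles non-mean-reverting changes poorly.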
5 Conclusion
In this paper, we used as our starting point the fundamental mathematical modeling of the PD parameter estimation as required by Basel II. The two methodologies, based on TTC and PIT PD respectively, were presented and evaluated using an S&P large corporation data set. We believe this modeling will be helpful to practitioners currently working on PD estimation and validation in the Basel II area. On this theoretical ground, the discussion was extended to the current hot topic of PD forecasting to meet IFRS 9 requirements. Rather than trying to predict the long-term PD term structure, we proposed a purely statistical approach, namely using order statistics to estimate the worst PD for the lifetime ECL estimation required by IFRS 9. We think this makes more business sense, but we are open to further discussion. Finally, we touched on the PIT to TTC rating conversion problem with the objective of getting a stable regulatory capital calculation. It would be worth looking at this in more detail in future work.
Declaration of interest
The views expressed in this paper are not necessarily those of the Royal Bank of Canada or any of its affiliates.
Acknowledgements
We thank Dr Michael Clayton, Darko Lakota, Dr Biao Wu and the anonymous reviewers for their valuable suggestions.
References
- Agresti, A., and Coull, B. A. (1998). Approximate is better than “exact” for interval estimation of binomial proportions. American Statistician 52, 119–126.
- Araten, M., Jacobs, M., Jr., Varshney, P., and Pellegrino, C. R. (2004). An internal ratings migration study. Journal of the Risk Management Association April, 92–97.
- Basel Committee on Banking Supervision (2005). Studies on the validation of internal rating systems. Working Paper 14, May, Bank for International Settlements.
- Basel Committee on Banking Supervision (2006). International convergence of capital measurement and capital standards: a revised framework. Report, June, Bank for International Settlements.
- Blochwitz, S., Martin, M. R. W., and Wehn, C. S. (2011). XIII. Statistical approaches to PD validation. In The Basel II Risk Parameters, 2nd edn. Springer (https://doi.org/10.1007/978-3-642-16114-8_14).
- Cantor, R., and Falkenstein, E. (2001). Testing for rating consistency in annual default rates. Journal of Fixed Income 11(2), 36–51 (https://doi.org/10.3905/jfi.2001.319296).
- Clopper, C., and Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26, 404–413 (https://doi.org/10.1093/biomet/26.4.404).
- Hamilton, D., Sun, Z., and Ding, M. (2011). Through-the-cycle EDF credit measures. Report, August, Moody’s Analytics.
- Hanson, S., and Schuermann, T. (2005). Confidence intervals for probabilities of default. Journal of Banking and Finance 30(8), 2281–2301 (https://doi.org/10.2139/ssrn.766345).
- International Financial Reporting Standards (2014). IFRS 9 Financial Instruments. Project Summary, July, IFRS. URL: https://bit.ly/2Mfpqrl.
- Teichroew, D. (1956). Tables of expected values of order statistics and products of order statistics for samples of size twenty and less from the normal distribution. Annals of Mathematical Statistics 27(2), 410–426 (https://doi.org/10.1214/aoms/1177728266).
- Wilson, E. B. (1927). Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22, 209–212 (https://doi.org/10.1080/01621459.1927.10502953).
- Yang, B. H. (2014). Modeling systematic risk and point-in-time probability of default under the Vasicek asymptotic single-risk-factor model framework. The Journal of Risk Model Validation 8(3), 33–48 (https://doi.org/10.21314/JRMV.2014.126).
- Yang, B. H. (2017). Point-in-time PD term structure models for multi-period-scenario loss projection: methodologies and implementations for IFRS 9 ECL and CCAR testing. MPRA Paper 76271, University Library of Munich, Germany.
Copyright Infopro Digital Limited. All rights reserved.