Journal of Risk Model Validation

Risk.net

On the mathematical modeling of point-in-time and through-the-cycle probability of default estimation/ validation

Xin Zhang and Tony Tung

  • First formal mathematical model for PIT and TTC PD.
  • Correct variance evaluation for PD estimation/validation.
  • Discussion over IFRS9 PD possibly based on rank statistics.
  • Critiques on Moody’s TTC EDF and potential alternative solutions.

Since Basel II, the second of the Basel Accords, was first published in June 2004, banks around the world have been engaged in a continuous effort to develop methodologies to estimate the key parameters: probability of default (PD), loss given default (LGD) and exposure at default (EAD). In this paper, we focus on PD estimation and validation. We provide the mathematical modeling for both point-in-time (PIT) and through-the-cycle (TTC) PD estimation, and discuss their relationship and application in the banking system.

1 Introduction

Under Basel II (Basel Committee on Banking Supervision 2006):

“Generally, all banks using the IRB [internal ratings-based] approaches must estimate a PD for each internal borrower grade for corporate, sovereign and bank exposures or for each pool in the case of retail exposures.” (Paragraph 446)

“PD estimates must be a long-run average of one-year default rates for borrowers in the grade, with the exception of retail exposures.” (Paragraph 447)

“In order to avoid over-optimism, a bank must add to its estimates a margin of conservatism that is related to the likely range of errors.” (Paragraph 451)

“Banks must use information and techniques that take appropriate account of the long-run experience when estimating the average PD for each rating grade. For example, banks may use one or more of the three specific techniques set out below: internal default experience, mapping to external data, and statistical default models.” (Paragraph 461)

The four paragraphs quoted above define the key IRB requirements on the probability of default (PD) parameter estimation. Among the three techniques specified, internal default experience is the predominant one because of the difficulty involved in relating the internal rating to external ones and the data insufficiency for statistical default model development (although predominant under IFRS 9 implementation (see, for example, Yang 2014, 2017)), especially for wholesale portfolios. Based on internal default experience, as a classical statistics problem, known as binomial proportion estimation and confidence interval evaluation, banks developed their PD estimation methodologies through a historical annual PD cohort. The widely chosen long-run averaging methods are a simple average and the obligor-weighted sum of realized annual default rate (DR). To add a margin of conservatism, the confidence interval could be evaluated using a wide range of approaches, such as Clopper–Pearson (Clopper and Pearson 1934), Wald (normal approximation), Wilson (Wilson 1927), Agresti–Coull (Agresti and Coull 1998), etc, with the first two being the most popular. These confidence intervals, however, are only applicable to a single period. Oftentimes, banks find more breaches than expected from their chosen confidence interval while backtesting during validation processes, which is illustrated using the Standard & Poor’s (S&P) data in Section 3.1.

One of the main reasons for this backtesting failure, we believe, is related to the concept of rating philosophy. Rating philosophy is commonly referred to as either point-in-time (PIT) or through-the-cycle (TTC) ratings. Note the difference between the PIT/TTC PD and the PIT/TTC rating. There might be some misuse of these terminologies, but, by PIT/TTC PD, we are referring to the realized/expected PD at different instances in time for PIT PD and throughout time for TTC PD. Together, they describe the dynamics of rating systems, the evaluation of which is part of the validation process (Basel Committee on Banking Supervision 2005). On the other hand, PIT ratings measure credit risk given the current state of a borrower in their current economic environment, whereas TTC ratings measure credit risk, taking into account the (assumed) state of the borrower over a whole economic cycle (Blochwitz et al 2011). Whether a rating system is PIT or TTC could be evaluated through a ratings migration study (Araten et al 2004). We can observe both TTC and PIT PD associated with a given credit rating whether it is assigned by a PIT or a TTC philosophy. The difference is in how the PIT PD varies over time. Most credit rating agencies, eg, S&P and Moody’s, embrace the TTC philosophy for rating stability. Banks’ internal ratings, with an effort to incorporate some PIT elements, usually follow a PIT–TTC hybrid philosophy, with wholesale portfolios closer to TTC.¹

¹ A TTC rating philosophy might be preferred by the Basel Committee on Banking Supervision (BCBS) as well. In Basel Committee on Banking Supervision (2006, Paragraph 414), it states that: “Although the time horizon used in PD estimation is one year (as described in Paragraph 447), banks are expected to use a longer time horizon in assigning ratings.”

In a TTC rating system, when an obligor’s rating is stabilized throughout the economic cycle, the implied PD of each rating will fluctuate along economic expansions and recessions. Therefore, we will observe lower realized annual DRs during economic peaks and higher realized annual DRs during economic troughs. In other words, we would observe time-varying PD in a TTC rating framework. Apparently, the long-run average PD calculated by banks is an invariant long-term PD (usually modeled as a single binomial process), and the year-to-year variation of the realized/observed annual DRs will be much larger than the confidence interval defined on a long-term PD based on the simple binomial process.

As the transition of our current accounting standard to International Financial Reporting Standard 9 (IFRS 9) gets underway, a lot of attention has been paid to the conversion between so-called PIT PD and TTC PD. In this paper, we will also touch on this problem from a mathematical perspective. However, the discussion will be around its basic concepts rather than practical implementation.

In the next section, we will introduce our model for PIT and TTC PD estimation. The application of the methodology to PD estimation/validation is illustrated using the S&P default data, and results are shown in Section 3. PIT versus TTC PD is discussed in Section 4. Section 5 will be a summary of this paper and possible future work.

2 Probability of default estimation model

According to Basel II requirements (Basel Committee on Banking Supervision 2006, Paragraph 463), the length of the underlying historical observation period used for the bank’s PD estimation must be at least five years. Most banks nowadays possess data that covers a much longer history. Therefore, a long time series of annual DRs can easily be obtained. To compute the long-run average from time series, our options are generally a simple average, an obligor-weighted average or a default-weighted average. A default-weighted average is deemed too conservative and biased. The first two are the most commonly used averaging methods. We will show in the following subsections that, despite the converging numeric values yielded by these two methods when a longer time period is used, they actually have very different interpretations.

2.1 TTC PD

The underlying assumption when using an obligor-weighted average of annual DRs is that all obligors of a certain credit rating are associated with a unique TTC PD (independent of their time of default; this contrasts with the PIT PD model discussed in the next subsection). This can be justified with mathematical equations.

Assume that we have $T$ years of data, and for each year there are $N_t$, $t=1,\dots,T$, obligors. Whether a specific obligor defaulted or not can be modeled by independent and identically distributed (iid) Bernoulli random variables $x_{k,t}$, $k=1,\dots,N_t$. The annual DR would be expressed as $\mathrm{DR}_t=\sum_k x_{k,t}/N_t$. The obligor-weighted long-run average PD ($\mathrm{LRPD}_{\mathrm{TTC}}$) would be

  $$\mathrm{LRPD}_{\mathrm{TTC}}=\frac{\sum_t (N_t\,\mathrm{DR}_t)}{\sum_t N_t}=\frac{\sum_t\sum_k x_{k,t}}{\sum_t N_t}, \qquad (2.1)$$

which implies that $x_{k,t}\sim\mathrm{Bern}(p)$ for all $k,t$, and $p$ can be estimated with (2.1). In other words, $\mathrm{LRPD}_{\mathrm{TTC}}\cdot\sum_t N_t$ follows a binomial distribution, $B(\sum_t N_t,p)$. The variance of $\mathrm{LRPD}_{\mathrm{TTC}}$ can be computed by

  $$\sigma^2_{\mathrm{LRPD}_{\mathrm{TTC}}}=\frac{p(1-p)}{\sum_t N_t}.$$

Usually, $\sum_t N_t$ is large, so $\mathrm{LRPD}_{\mathrm{TTC}}$ can be estimated with high precision. The confidence intervals evaluated using all types of methodologies (eg, Clopper–Pearson, Wald, etc) would be narrow and close to one another. However, these small confidence intervals offer false security. The simple derivation of the mean and variance of LRPD is based on the over-simplified assumption of all historical obligors following the same $\mathrm{LRPD}_{\mathrm{TTC}}$ at the cost of losing the annual DR information. Note how $\mathrm{DR}_t$ disappeared in (2.1).
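As a minimal sketch of (2.1) and its variance, assuming the per-year obligor and default counts are available as plain lists: the function name `ttc_long_run_pd` and the one-sided Wald (normal) upper limit are illustrative choices, not something prescribed by the text above.

```python
from math import sqrt
from statistics import NormalDist


def ttc_long_run_pd(n_obligors, n_defaults, confidence=0.95):
    """Obligor-weighted long-run PD under the constant-p (TTC) model.

    n_obligors, n_defaults: per-year counts N_t and sum_k x_{k,t}.
    Returns (estimate, standard deviation, one-sided upper limit).
    """
    total_n = sum(n_obligors)
    total_d = sum(n_defaults)
    p = total_d / total_n                    # equation (2.1)
    sd = sqrt(p * (1.0 - p) / total_n)       # binomial standard deviation
    z = NormalDist().inv_cdf(confidence)     # assumption: Wald-type limit
    return p, sd, min(1.0, p + z * sd)


# Pooled BBB counts from Table 3 (17722 obligors, 26 defaults),
# passed as a single aggregate period for brevity.
p, sd, upper = ttc_long_run_pd([17722], [26])
print(f"{100 * p:.3f}% {100 * sd:.3f}% {100 * upper:.3f}%")
```

With the pooled BBB counts, the estimate and deviation round to the 0.147% and 0.029% reported for BBB in Table 3. Only the totals matter under this model, which is exactly why the annual DR information is lost.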

2.2 PIT PD

In recognition of the fact that the realized annual DRs of a certain credit rating (from a TTC rating system) do vary from time to time, a more reasonable assumption is that all obligors of a certain credit rating are associated with a PIT PD (a time-varying PD dependent on the time of default). This PIT PD is a function of time and, in practice, is assumed to be constant over a year. Therefore, in mathematical terms, the obligors are represented by iid Bernoulli random variables $x_{k,t}$, $k=1,\dots,N_t$, and $x_{k,t}\sim\mathrm{Bern}(p_t)$, $t=1,\dots,T$, where $p_t$ are iid random variables following an arbitrary distribution; this contrasts with the TTC PD model in the previous subsection, where $p$ was assumed to be constant throughout time. The optimal estimate of the long-run PD is therefore the expectation of $p_t$ (not a function of $t$), which can be calculated as the simple average of the estimates of $p_t$ ($t=1,\dots,T$), $\hat p_t=\mathrm{DR}_t$, ie, the realized annual DRs:

  $$\mathrm{LRPD}_{\mathrm{PIT}}=Y=\frac{\sum_t \hat p_t}{T}=\frac{\sum_t\left(\sum_k x_{k,t}/N_t\right)}{T}. \qquad (2.2)$$

To evaluate the deviation of this estimate of the LRPD, note that, conditional on $p_t$, the annual default count follows a binomial distribution, $N_t\hat p_t\sim B(N_t,p_t)$. By the law of total variance, the variance of $\mathrm{LRPD}_{\mathrm{PIT}}$ can be broken down into two parts:

  $$\mathrm{Var}(Y)=E[\mathrm{Var}(Y\mid p_1,\dots,p_T)]+\mathrm{Var}(E[Y\mid p_1,\dots,p_T]).$$

The first part captures the variance generated by the binomial process:

  $$\begin{aligned}
  E[\mathrm{Var}(Y\mid p_1,\dots,p_T)] &= E\left[\frac{1}{T^2}\sum_{t=1}^{T}\frac{p_t(1-p_t)}{N_t}\right] \\
    &= \frac{1}{T^2}\sum_{t=1}^{T}\frac{E[p_t]-E^2[p_t]-\mathrm{Var}[p_t]}{N_t}.
  \end{aligned}$$

The second part captures the variance of the mean of $p_t$ itself:

  $$\mathrm{Var}(E[Y\mid p_1,\dots,p_T])=\mathrm{Var}\left(\frac{1}{T}\sum_{t=1}^{T}p_t\right)=\frac{1}{T}\mathrm{Var}[p_t].$$

The total variance of $\mathrm{LRPD}_{\mathrm{PIT}}$ is the combination of the two:

  $$\mathrm{Var}(\mathrm{LRPD}_{\mathrm{PIT}})=\frac{1}{T^2}\sum_{t=1}^{T}\frac{E[p_t]-E^2[p_t]-\mathrm{Var}[p_t]}{N_t}+\frac{1}{T}\mathrm{Var}[p_t]. \qquad (2.3)$$

Since the distribution of $p_t$ is unknown, without loss of generality, $E[p_t]$ and $\mathrm{Var}[p_t]$ could be replaced with their approximations $\bar p=(1/T)\sum_t\hat p_t$ and $(1/(T-1))\sum_t(\hat p_t-\bar p)^2$. Except for $E[p_t]$ in the first component, all other terms are second-order values. So, $\mathrm{Var}(\mathrm{LRPD}_{\mathrm{PIT}})$ is expected to be dominated by the first component (the binomial process) when $N_t\ll 1/\bar p$ and by the second component (the time series) when $N_t\gg 1/\bar p$. In the next section, we will use the S&P data to evaluate exactly how the two components affect the deviation of the estimate.
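The plug-in evaluation of (2.2) and (2.3) can be sketched as follows. This is an illustration under stated assumptions: the function name `pit_long_run_pd` and the two-year input series are hypothetical, and years without obligors are assumed to have been dropped by the caller.

```python
from statistics import mean, variance


def pit_long_run_pd(n_obligors, default_rates):
    """Simple-average long-run PD and the two variance terms of (2.3).

    n_obligors:    N_t for each year with at least one obligor.
    default_rates: realized annual DR_t (as fractions) for the same years.
    Returns (estimate, binomial term, time-series term).
    """
    if len(n_obligors) != len(default_rates) or len(default_rates) < 2:
        raise ValueError("need matching series covering at least two years")
    T = len(default_rates)
    p_bar = mean(default_rates)        # plug-in for E[p_t], equation (2.2)
    var_p = variance(default_rates)    # plug-in for Var[p_t], 1/(T-1) form
    base = p_bar - p_bar ** 2 - var_p  # E[p_t] - E^2[p_t] - Var[p_t]
    binomial_term = sum(base / n for n in n_obligors) / T ** 2
    time_series_term = var_p / T
    return p_bar, binomial_term, time_series_term


# Hypothetical two-year history: 100 obligors per year, DRs of 2% and 4%.
p_bar, first, second = pit_long_run_pd([100, 100], [0.02, 0.04])
print(p_bar, first, second, first + second)
```

Returning the two terms separately makes it easy to see which one dominates for a given grade, in line with the comparison of $N_t$ against $1/\bar p$ above.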

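A special case assuming the iid normal distribution of $p_t$ was assessed by Cantor and Falkenstein (2001). Our derivation below shows that a minor error exists in their final result. Based on this assumption, $p_t\sim N(p,\sigma^2)$. We could replace the terms in (2.3) using $E[p_t]=p$ and $\mathrm{Var}[p_t]=\sigma^2$ and get

  $$\mathrm{Var}(\mathrm{LRPD}_{\mathrm{PIT}})_{\mathrm{Normal}}=\frac{1}{T^2}\sum_{t=1}^{T}\frac{p-p^2-\sigma^2}{N_t}+\frac{1}{T}\sigma^2. \qquad (2.4)$$

For each year ($T=1$), the observed $\hat p_t$ (or $\mathrm{DR}_t$) follows the distribution

  $$\text{Annual } \hat p_t\sim N\left(p,\ \frac{p(1-p)-\sigma^2}{N_t}+\sigma^2\right). \qquad (2.5)$$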

Note the missing term in Cantor and Falkenstein (2001, Equation (3)).
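For readers who want the intermediate step, the variance in (2.5) can be recovered by applying the law of total variance to a single year. The lines below are a worked restatement using only $E[p_t]=p$ and $\mathrm{Var}[p_t]=\sigma^2$; the normal shape of the resulting distribution is taken from the text as an approximation.

```latex
\begin{align*}
\operatorname{Var}(\hat p_t)
  &= E\bigl[\operatorname{Var}(\hat p_t \mid p_t)\bigr]
     + \operatorname{Var}\bigl(E[\hat p_t \mid p_t]\bigr) \\
  &= \frac{E[p_t(1-p_t)]}{N_t} + \operatorname{Var}[p_t] \\
  &= \frac{p - (p^2 + \sigma^2)}{N_t} + \sigma^2
   = \frac{p(1-p) - \sigma^2}{N_t} + \sigma^2 .
\end{align*}
```

The $-\sigma^2/N_t$ contribution comes from $E[p_t^2]=p^2+\sigma^2$, so it vanishes only when $p_t$ is constant.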

2.2.1 PIT PD with correlated obligors

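For the PIT PD model, we can go a step further to make the assumptions more realistic by allowing the default events of obligors to be correlated. Keeping all the rest of the assumptions unchanged, we now assume within each year (still independent across different years) that the defaults of obligors are represented by identically distributed Bernoulli random variables $x_{k,t}$, $k=1,\dots,N_t$, and $x_{k,t}\sim\mathrm{Bern}(p_t)$ with pairwise correlation $\mathrm{Corr}(x_{i,t},x_{j,t})=\rho_t$, $i\neq j$. This will only change the first term in the total variance formula $\mathrm{Var}(Y)=E[\mathrm{Var}(Y\mid p_1,\dots,p_T)]+\mathrm{Var}(E[Y\mid p_1,\dots,p_T])$. We evaluate it under the new assumption:

  $$\begin{aligned}
  E[\mathrm{Var}(Y\mid p_1,\dots,p_T)] &= E\left[\frac{1}{T^2}\sum_{t=1}^{T}\mathrm{Var}\left(\frac{\sum_{k=1}^{N_t}x_{k,t}}{N_t}\right)\right] \\
    &= E\left\{\frac{1}{T^2}\sum_{t=1}^{T}p_t(1-p_t)\left[\frac{1+\rho_t(N_t-1)}{N_t}\right]\right\} \\
    &= \frac{1}{T^2}\sum_{t=1}^{T}\left\{\frac{E[p_t]-E^2[p_t]-\mathrm{Var}[p_t]}{N_t}\,[1+\rho_t(N_t-1)]\right\}.
  \end{aligned}$$

Compared to the result under the independent assumption, the total variance is increased by

  $$\frac{1}{T^2}\sum_{t=1}^{T}\left\{\frac{E[p_t]-E^2[p_t]-\mathrm{Var}[p_t]}{N_t}\,[\rho_t(N_t-1)]\right\}.$$

The full total variance in (2.3) becomes

  $$\mathrm{Var}(\mathrm{LRPD}_{\mathrm{PIT}})=\frac{1}{T^2}\sum_{t=1}^{T}\left\{\frac{E[p_t]-E^2[p_t]-\mathrm{Var}[p_t]}{N_t}\,[1+\rho_t(N_t-1)]\right\}+\frac{1}{T}\mathrm{Var}[p_t].$$

When $N_t$ is large, $[1+\rho_t(N_t-1)]/N_t$ approaches $\rho_t$, so the covariance term dominates, and

  $$\mathrm{Var}(\mathrm{LRPD}_{\mathrm{PIT}})\approx\sum_{t=1}^{T}\frac{\rho_t}{T^2}\left(E[p_t]-E^2[p_t]-\mathrm{Var}[p_t]\right)+\frac{1}{T}\mathrm{Var}[p_t].$$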

The reasonable assumption/estimation of ρt is another topic and is beyond the scope of this paper.
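To see how much a given correlation inflates the variance, the correlated-obligor formula can be evaluated directly. The sketch below assumes that values for $E[p_t]$, $\mathrm{Var}[p_t]$ and $\rho_t$ are supplied by the user; the function name and the numbers are hypothetical, and no method for estimating $\rho_t$ is implied.

```python
def pit_variance_correlated(n_obligors, mean_p, var_p, rho):
    """Total variance of the simple-average long-run PD with pairwise
    default correlation rho_t within each year.

    n_obligors: N_t per year; rho: rho_t per year (same length).
    mean_p, var_p: plug-in values for E[p_t] and Var[p_t].
    """
    if len(n_obligors) != len(rho):
        raise ValueError("n_obligors and rho must have the same length")
    T = len(n_obligors)
    base = mean_p - mean_p ** 2 - var_p       # E[p_t] - E^2[p_t] - Var[p_t]
    binomial_term = sum(
        base / n * (1.0 + r * (n - 1)) for n, r in zip(n_obligors, rho)
    ) / T ** 2
    time_series_term = var_p / T
    return binomial_term + time_series_term


# Hypothetical inputs: two years, 100 obligors each, E[p_t]=3%, Var[p_t]=0.0002.
independent = pit_variance_correlated([100, 100], 0.03, 0.0002, [0.0, 0.0])
correlated = pit_variance_correlated([100, 100], 0.03, 0.0002, [0.01, 0.01])
print(independent, correlated)
```

Setting every $\rho_t$ to zero recovers (2.3), which gives a quick consistency check on any implementation.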

2.3 TTC PD versus PIT PD

The two mathematical models are built on the same assumption; namely, for each credit rating there is a fundamental implied long-term PD. For the TTC PD estimation model, the annual PDs are assumed to be constant and equal to this long-term PD. For the PIT PD estimation model, the annual PDs are assumed to be realizations of a random variable whose expectation is the long-term PD. As the number of years $T$ gets larger, $\mathrm{LRPD}_{\mathrm{TTC}}$ and $\mathrm{LRPD}_{\mathrm{PIT}}$ will converge.

Despite having the same assumption on the existence of long-term PDs, the two mathematical models have different focuses. The TTC PD model ignores the fluctuation of the implied PD of credit ratings over time and is inappropriate for the PD estimation of a practical credit system whose ratings have a TTC component.² However, it is more suitable (or easier to use) for some processes in PD validation, eg, the calibration of the rating system and the evaluation of rating consistency between different portfolios. It may best be used to answer the following question: in the long run, does the rating system yield reliable/consistent rankings of obligors’ credit? In other words, when the monotonicity of implied PDs and the separability of their confidence intervals are evaluated (Hanson and Schuermann 2005), the TTC PD model is the simplest and best option. On the other hand, the PIT PD model focuses on the annual realized PD and provides reliable answers to the following questions: what is the best PD estimate for the next $T$ years, and what is its confidence level?

² Having a TTC component in the rating means that the rating is not pure PIT; a constant PD assumption is only appropriate for a PIT rating system where the economic conditions are 100% (ideally) absorbed in the rating so that the implied PD can stay constant.

3 Model application

To better understand the mathematical models introduced in the previous section, we demonstrate the application of the methodologies to the S&P data. The visualization and interpretation of the following results will be helpful for practitioners in making decisions on assumptions and methods to adopt during their PD estimation and validation processes.

The S&P data available is for large corporates and covers the years 1982–2015. According to BCBS requirements (Basel Committee on Banking Supervision 2006, Paragraph 463), the length of the underlying historical observation period used for PD estimation must be at least five years. If the available observation period spans a longer period, and the data is relevant and material, this longer period must be used. As the credit rating system evolves along with the macroeconomic environment and regulations, it is reasonable to expect that older, historic data will become irrelevant. How to determine whether data is “relevant and material” is at the discretion of practitioners. In this paper, as a simple demonstration, we use the population stability index (PSI, a common metric used to evaluate population distribution shifts between the development and validation periods of a rating system) as a metric to indicate the changing of the population or the rating system or both:

  $$\mathrm{PSI}=\sum\left[(\mathrm{Actual}\%-\mathrm{Expected}\%)\times\ln\left(\frac{\mathrm{Actual}\%}{\mathrm{Expected}\%}\right)\right].$$

We start from the most recent year and move backward in time. We calculate the PSI of year Y’s rating distribution against that of years (Y+1)–2015. Results are shown in Table 1.

Table 1: PSI of year Y against years (Y+1)–2015 for S&P data set.

Year   1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004
PSI    0.30  0.24  0.21  0.17  0.16  0.11  0.11  0.09  0.09  0.11  0.09

Year   2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015
PSI    0.08  0.07  0.04  0.02  0.03  0.06  0.03  0.02  0.02  0.01  0

We base this on the criteria that PSI<0.1 indicates insignificant change, 0.1<PSI<0.25 indicates some minor change and PSI>0.25 indicates major change. We consider that S&P data back to 1995 is relevant and will use data from 1995 to 2015 as our analysis data set. The rating distribution shift can be easily spotted in Figure 1.
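The PSI calculation and the thresholds above can be sketched in a few lines. The function names and the two-bucket example are hypothetical. The treatment of values exactly at 0.1 or 0.25 is an assumption, since the criteria above leave the boundaries open. Buckets with a zero share are rejected rather than smoothed, which is also a choice made here for simplicity.

```python
from math import log


def psi(actual, expected):
    """Population stability index between two rating distributions.

    actual, expected: bucket shares (fractions summing to one), same order.
    """
    if len(actual) != len(expected):
        raise ValueError("distributions must have the same number of buckets")
    if any(a <= 0 or e <= 0 for a, e in zip(actual, expected)):
        raise ValueError("every bucket share must be strictly positive")
    return sum((a - e) * log(a / e) for a, e in zip(actual, expected))


def classify_psi(value):
    """Map a PSI value to the change categories used in the text."""
    if value < 0.1:
        return "insignificant"
    if value <= 0.25:          # assumption: boundary values count as minor
        return "minor"
    return "major"


# Hypothetical two-bucket distributions.
value = psi([0.6, 0.4], [0.5, 0.5])
print(round(value, 4), classify_psi(value))
print(classify_psi(0.24), classify_psi(0.30))  # PSI values of 1995 and 1994 in Table 1
```

Applied to Table 1, the 1995 value falls in the minor-change band and the 1994 value in the major-change band, which matches the choice of 1995 as the start of the analysis data set.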

Figure 1: S&P large corporate rating distribution migration (1982–2015).

3.1 TTC PD evaluation

The results for the TTC PD evaluation are summarized in Table 2.

Table 2: S&P TTC PD estimate and deviation: twenty rating grades.

          Total number   Total number   LRPD_TTC   Deviation   Coefficient of
Rating    of obligors    of defaults    (%)        (%)         variation (%)
AAA       532            0              0.000      0.000
AA+       399            0              0.000      0.000
AA        1078           0              0.000      0.000
AA-       1585           0              0.000      0.000
A+        2461           2              0.081      0.057       70.682
A         4179           0              0.000      0.000
A-        4813           0              0.000      0.000
BBB+      5690           5              0.088      0.039       44.702
BBB       6798           7              0.103      0.039       37.777
BBB-      5234           14             0.267      0.071       26.690
BB+       3201           4              0.125      0.062       49.969
BB        4145           27             0.651      0.125       19.182
BB-       5598           45             0.804      0.119       14.847
B+        7734           184            2.379      0.173       7.284
B         6599           255            3.864      0.237       6.140
B-        3005           260            8.652      0.513       5.927
CCC+      931            206            22.127     1.360       6.148
CCC       500            168            33.600     2.112       6.287
CCC-      143            73             51.049     4.180       8.189
CC        139            85             61.151     4.134       6.761

The most notable issue in Table 2 is that the monotonicity of the implied PD is broken for ratings above BB+. This result is consistent with that in Hanson and Schuermann (2005), which indicates that, due to the limited default data, fine rating grades in high credit rankings are difficult to calibrate and deemed unreliable. In our subsequent analyses, we will therefore combine the ratings BB- and above by ignoring their modifiers.³ Table 2 becomes Table 3. The monotonicity of implied PDs is conserved and the TTC PD estimates of ratings below AA are separated by at least 2.44 standard deviations (98.5% confidence).

³ Although we are tempted to combine modifiers in investment grades only, Table 11 in the online appendix shows that rating BB+ is not distinguishable from BBB.

Table 3: S&P TTC PD estimate and deviation: twelve rating grades.

          Total number   Total number   LRPD_TTC   Deviation   Coefficient of
Rating    of obligors    of defaults    (%)        (%)         variation (%)
AAA       532            0              0.000      0.000
AA        3062           0              0.000      0.000
A         11453          2              0.017      0.012       70.705
BBB       17722          26             0.147      0.029       19.597
BB        12944          76             0.587      0.067       11.437
B+        7734           184            2.379      0.173       7.284
B         6599           255            3.864      0.237       6.140
B-        3005           260            8.652      0.513       5.927
CCC+      931            206            22.127     1.360       6.148
CCC       500            168            33.600     2.112       6.287
CCC-      143            73             51.049     4.180       8.189
CC        139            85             61.151     4.134       6.761

The validation results of the above monotonicity and confidence interval analyses indicate that banks should not try to keep fine rating notches with high credit ratings. In the PD estimation process, rather than applying techniques such as line-of-best-fit to get monotonic PD numbers, fine rating notches could be combined into coarse rating grades over which PD estimation may be done. The advantage of this alternative approach is two-fold. First, it would help improve product pricing. Second, it would not distort the true underlying PDs of the rating system.

Returning to the discussion on PD parameter estimation, under the TTC PD methodology with the common assumption by most practitioners that the annual DR follows an iid normal distribution with mean $p$, we can derive the 95% confidence interval of the estimate of $p$ for each rating grade. The upper limit is considered an estimate with a sufficient margin of conservatism. However, as we mentioned in the introduction of this paper, if we conduct a backtest, excessive breaches (annual DRs above the upper limit in Table 4) may be observed in all rating grades with meaningful historical defaults, even though only one breach is expected over the twenty-one-year history. This inconsistency indicates that the TTC PD assumption is not appropriate for a TTC rating system, whose rating stays stable over the economic cycle while its implied PD fluctuates. This problem could be fixed with PIT PD modeling, which we will show in the next subsection.
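The backtest itself is a simple count of years in which the realized DR exceeds the upper limit. As an illustration, the sketch below applies it to the B- annual DRs and the 95% TTC upper limit for B- listed in Table 4; the function name and the use of `None` for years without obligors are conventions assumed here.

```python
def count_breaches(annual_drs, upper_limit):
    """Number of years whose realized DR exceeds the upper limit.

    None marks a year with no obligors in the grade and is skipped.
    """
    return sum(1 for dr in annual_drs if dr is not None and dr > upper_limit)


# Annual DRs (%) for rating B-, 1995-2015, read from Table 4.
b_minus = [7.692, 10.417, 9.091, 6.667, 21.978, 16.667, 29.464,
           26.263, 19.780, 2.222, 4.348, 2.326, 2.381, 4.420,
           25.455, 2.740, 3.620, 5.213, 3.540, 3.200, 4.319]

breaches = count_breaches(b_minus, 9.496)  # 95% TTC upper limit for B-
print(breaches, "breaches in", len(b_minus), "years")
```

For B- this counts seven breaches in twenty-one years, against roughly one expected at the 95% level.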

3.2 PIT PD evaluation

Table 4: Backtest of the annual DR against the 95% confidence upper limit of TTC PD. [All values given are percentages; n/a marks a year with no obligors in that rating grade.]

Year                   AAA      AA       A     BBB      BB      B+       B      B-    CCC+     CCC    CCC-      CC
95% confidence
upper limit (TTC)    0.000   0.000   0.038   0.194   0.698   2.664   4.254   9.496  24.364  37.075  57.925  67.951
1995                     0       0       0   0.505   0.857   2.575   8.140   7.692       0      50       0     n/a
1996                     0       0       0       0   0.535   1.762   3.226  10.417  12.500      25     n/a     n/a
1997                     0       0       0       0   0.456   0.837   5.691   9.091   6.667  16.667     100     100
1998                     0       0       0   0.314   0.198   1.739   6.107   6.667  43.750      40     n/a      50
1999                     0       0       0   0.143   1.024   5.252  10.500  21.978      32  46.154      50  28.571
2000                     0       0       0   0.262   1.495   4.822  10.577  16.667  13.333  26.667  54.545  28.571
2001                     0       0   0.339   0.492   1.846   5.747  16.146  29.464  53.061  42.308  55.556  83.333
2002                     0       0       0       1   3.442   9.348  13.115  26.263  40.351  44.828  36.364  66.667
2003                     0       0       0   0.222   0.504   3.155   9.302  19.780  34.043  35.135  46.154      65
2004                     0       0       0   0.107   0.308   0.287   4.500   2.222  13.636  22.581  28.571      40
2005                     0       0       0   0.208   0.144   1.171   1.695   4.348  13.953      25       0      40
2006                     0       0       0       0   0.556   0.225   0.735   2.326  11.475  11.765  33.333  33.333
2007                     0       0       0       0       0       0   0.287   2.381  11.538  15.789      20     100
2008                     0       0       0   0.110   0.141   2.506   1.695   4.420  20.455      40  62.500      50
2009                     0       0       0       0   0.917   7.267  13.506  25.455  47.273  60.606     100  85.714
2010                     0       0       0       0       0   0.366   0.917   2.740  10.465  33.333  54.545  73.684
2011                     0       0       0       0       0       0   0.964   3.620  10.638  18.750      50  57.143
2012                     0       0       0       0   0.145   0.528   1.035   5.213  20.755      40  66.667  57.143
2013                     0       0       0       0   0.145   0.501   1.107   3.540  19.048  37.838  72.727      75
2014                     0       0       0       0       0       0   0.699   3.200  19.403  31.429  57.143      75
2015                     0       0       0       0   0.362   0.693   2.083   4.319  18.310  29.167      45     100

When we tried to evaluate the annual realized DRs for S&P, we discovered that, for certain rating grades, data is not available for all of the years from 1995 to 2015. Namely, for some years, there are no obligors in the S&P data set rated at certain rating grades. These are regarded as missing data points, and the total number of years T is adjusted accordingly. Table 5 summarizes the key statistics from applying the PIT PD model to the S&P large corporate data set (1995–2015).

Table 5: S&P PIT PD evaluation: twelve rating grades.

          PIT PD               LRPD_PIT                   Coefficient     Var(LRPD_PIT)
          Min       Max        Expectation   Deviation    of variation    First       Second
Rating    (%)       (%)        (%)           (%)          (%)             term        term
AAA       0         0          0             0                            0           0
AA        0         0          0             0                            0           0
A         0         0.339      0.016         0.020        124.302         1.42E-08    2.61E-08
BBB       0         1.004      0.160         0.063        39.406          9.71E-08    3.02E-07
BB        0         3.442      0.623         0.192        30.813          4.95E-07    3.19E-06
B+        0         9.348      2.323         0.609        26.205          2.99E-06    3.41E-05
B         0.287     16.146     5.335         1.147        21.495          1.05E-05    1.21E-04
B-        2.222     29.464     10.086        2.083        20.655          3.87E-05    3.95E-04
CCC+      0         53.061     21.555        3.483        16.158          0.00022     0.00100
CCC       11.765    60.606     33.001        3.838        11.630          0.00073     0.00075
CCC-      0         100        49.111        7.818        15.919          0.00233     0.00378
CC        28.571    100        63.640        7.088        11.137          0.00205     0.00297

As expected, the S&P data set is large enough that the LRPDs computed under the two methodologies are quite close to each other. However, some deviations are obvious at ratings “B” and “B-” in Figure 2. These are where neither the number of obligors nor the PD is large enough. Figure 2 also shows that the LRPDs from both models can be well fitted into lines, indicating that the S&P rating system is nicely calibrated. The two lines are close to each other too. Although we do not suggest using the PDs from fitted lines for ratings with default data (“A” and below), the extrapolated PDs for ratings “AAA” and “AA” are good candidates. The readings are 0.003% for “AA” and 0.001% for “AAA”, which will be used as the PD estimations for these two ratings.

Figure 2: S&P long-run PD with its log-linear fitting.

The last two columns in Table 5 show the contributions of variance from binomial processes and time series to the total variance. The variances from time series are always larger than those from binomial processes, especially in the middle credit rating ranges, where $N_t$ are relatively large compared to $1/p_t$. However, at both ends of the spectrum, they are comparable. This observation suggests that the deviation from the TTC PD model is not appropriate to derive the confidence interval when LRPD is used as the prediction of PD for the coming year; this is evident from the backtest result shown earlier. It misses the variance coming from the annual realization of $p_t$. Actually, if we take the square root of the first term of variance in Table 5, the numbers are very close to the deviation column in Table 3.

Conducting the same backtest using the deviation under the PIT PD model, we see (in Table 6) that the numbers of breaches are as expected.

3.3 PD estimate for capital use

Table 6: Backtest of the annual DR against the 95% confidence upper limit of PIT PD. [n/a marks a year with no obligors in that rating grade.]

Year                   AAA      AA       A     BBB      BB      B+       B      B-    CCC+     CCC    CCC-      CC
95% confidence
upper limit (PIT)    0.000   0.000   0.168   0.619   2.040   6.876  13.720  25.317  46.517  58.611  95.835 100.000
1995                     0       0       0   0.505   0.857   2.575   8.140   7.692       0      50       0     n/a
1996                     0       0       0       0   0.535   1.762   3.226  10.417  12.500      25     n/a     n/a
1997                     0       0       0       0   0.456   0.837   5.691   9.091   6.667  16.667     100     100
1998                     0       0       0   0.314   0.198   1.739   6.107   6.667  43.750      40     n/a      50
1999                     0       0       0   0.143   1.024   5.252  10.500  21.978      32  46.154      50  28.571
2000                     0       0       0   0.262   1.495   4.822  10.577  16.667  13.333  26.667  54.545  28.571
2001                     0       0   0.339   0.492   1.846   5.747  16.146  29.464  53.061  42.308  55.556  83.333
2002                     0       0       0       1   3.442   9.348  13.115  26.263  40.351  44.828  36.364  66.667
2003                     0       0       0   0.222   0.504   3.155   9.302  19.780  34.043  35.135  46.154      65
2004                     0       0       0   0.107   0.308   0.287   4.500   2.222  13.636  22.581  28.571      40
2005                     0       0       0   0.208   0.144   1.171   1.695   4.348  13.953      25       0      40
2006                     0       0       0       0   0.556   0.225   0.735   2.326  11.475  11.765  33.333  33.333
2007                     0       0       0       0       0       0   0.287   2.381  11.538  15.789      20     100
2008                     0       0       0   0.110   0.141   2.506   1.695   4.420  20.455      40  62.500      50
2009                     0       0       0       0   0.917   7.267  13.506  25.455  47.273  60.606     100  85.714
2010                     0       0       0       0       0   0.366   0.917   2.740  10.465  33.333  54.545  73.684
2011                     0       0       0       0       0       0   0.964   3.620  10.638  18.750      50  57.143
2012                     0       0       0       0   0.145   0.528   1.035   5.213  20.755      40  66.667  57.143
2013                     0       0       0       0   0.145   0.501   1.107   3.540  19.048  37.838  72.727      75
2014                     0       0       0       0       0       0   0.699   3.200  19.403  31.429  57.143      75
2015                     0       0       0       0   0.362   0.693   2.083   4.319  18.310  29.167      45     100

One major application of a PD estimate is for IRB capital calculation. In this case, the PD estimate is used as a prediction of the coming year’s DR. Apparently, the PIT PD estimation model would be more appropriate for this purpose because the assumption of the coming year’s default count $N_t\,\mathrm{DR}\sim B(N_t,p_t)$, with its parameter $p_t$ following the same distribution as previous years, and with an expectation of $E[p_t]$ and a variance of $\mathrm{Var}[p_t]$, is much more realistic than the assumption of every year’s DR being equal to $p$. So, we have the expected DR in the coming year as $E[p_t]$, estimated as the average of the historical annual realized DRs ($\hat p_t$), $\bar p=(1/T)\sum_t\hat p_t$, also known as $\mathrm{LRPD}_{\mathrm{PIT}}$. The expected variance of DR is

  $$\mathrm{Var}(\mathrm{DR})=\left(\frac{E[p_t]-E^2[p_t]-\mathrm{Var}[p_t]}{N_t}\right)+\mathrm{Var}[p_t]$$

by replacing $T$ with 1 in (2.3). Here, like $E[p_t]$, $\mathrm{Var}[p_t]$ can be approximated as the standard deviation (STD) of $\hat p_t$ squared, $(1/(T-1))\sum_t(\hat p_t-\bar p)^2$. Some might be tempted to use the STD $\sqrt{(1/(T-1))\sum_t(\hat p_t-\bar p)^2}$ directly to compute the confidence interval for the coming year’s PD. However, as we have shown in the previous subsection, the deviation of the PD prediction would be underestimated, especially for credit ratings at the two extremes (good and bad), where the first term, $(E[p_t]-E^2[p_t]-\mathrm{Var}[p_t])/N_t$, would play an important role because $N_t$ is not much larger than $1/p_t$.

Back to the S&P large corporate data set (1995–2015), if we assume that the numbers of obligors do not change between 2015 and 2016, Table 7 shows our PD estimate/prediction for 2016.

Table 7: S&P PD prediction. [* PD estimates for AAA and AA are from linear extrapolation.]

                                Deviation                         One-sided confidence
          No. of     PD         First      Second     Total       80%        90%
Rating    obligors   estimate   term (%)   term (%)   (%)         (%)        (%)
AAA       10         0.001*     0          0          0
AA        94         0.003*     0          0          0
A         520        0.016      0.056      0.074      0.093       0.094      0.135
BBB       1118       0.160      0.119      0.252      0.279       0.395      0.518
BB        828        0.623      0.272      0.818      0.862       1.348      1.727
B+        433        2.323      0.712      2.675      2.768       4.653      5.870
B         816        5.335      0.767      5.040      5.098       9.625      11.868
B-        301        10.086     1.654      9.111      9.260       17.879     21.953
CCC+      71         21.555     4.568      14.472     15.176      34.327     41.003
CCC       24         33.001     9.252      12.523     15.570      46.105     52.954
CCC-      20         49.111     9.438      26.793     28.406      73.018     85.515
CC        3          63.640     24.148     23.762     33.878      92.153     100.000

The right two columns in Table 7 provide the 80% and 90% one-sided confidence intervals. These can be interpreted as an expectation that there will be no more than two and one breaches in ten years, respectively. A simple check of the S&P historic data found that two years for “CCC-” and three years for “CC” observed 100% default rates; this confirms that the 100% PD estimate for these low credit ratings is not overly conservative, considering their high expected PD and low number of obligors. However, the evaluation of the confidence interval is done on each individual rating grade independently. Since the $p_t$ for all ratings are not perfectly correlated (see Table 8 for the realized correlation coefficients), the 80% one-sided confidence intervals might be good enough for the overall portfolio level.
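The arithmetic behind a row of Table 7 can be sketched as below. The sketch assumes a normal approximation for the one-sided limit and a cap at 100%; both are assumptions made here, although they appear consistent with the tabulated values. The function name is hypothetical, and the inputs are the B- figures from Table 7 (PD estimate 10.086%, second-term deviation 9.111%, 301 obligors).

```python
from math import sqrt
from statistics import NormalDist


def predict_annual_dr(mean_p, var_p, n_next, confidence):
    """Deviation terms and one-sided upper limit for next year's DR.

    mean_p, var_p: plug-in values for E[p_t] and Var[p_t] (as fractions).
    n_next:        assumed number of obligors in the coming year.
    Returns (first-term sd, second-term sd, total sd, upper limit).
    """
    # Binomial part of Var(DR); floored at zero as a guard for noisy inputs.
    binomial_var = max(0.0, (mean_p - mean_p ** 2 - var_p) / n_next)
    total_sd = sqrt(binomial_var + var_p)
    z = NormalDist().inv_cdf(confidence)     # assumption: normal quantile
    upper = min(1.0, mean_p + z * total_sd)  # assumption: cap at 100%
    return sqrt(binomial_var), sqrt(var_p), total_sd, upper


first, second, total, upper80 = predict_annual_dr(0.10086, 0.09111 ** 2, 301, 0.80)
print(f"{100 * first:.3f} {100 * second:.3f} {100 * total:.3f} {100 * upper80:.3f}")
```

With these inputs the sketch gives values close to the B- row of Table 7 (first term about 1.654%, total about 9.260%, 80% limit about 17.879%), which suggests the tabulated limits are of the form estimate plus normal quantile times total deviation.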

Table 8: S&P annual DR correlation between ratings (1995–2015).
  BBB BB B+ B B- CCC+ CCC CCC- CC
BBB 1 0.8472 0.7051 0.6511 0.5752 0.4117 0.4107 -0.3541 -0.1349
BB 0.8472 1 0.8746 0.7846 0.7874 0.4836 0.3792 -0.0302 -0.0607
B+ 0.7051 0.8746 1 0.8954 0.9217 0.6830 0.6599 0.1565 -0.0703
B 0.6511 0.7846 0.8954 1 0.9448 0.6817 0.5961 0.1678 -0.0160
B- 0.5752 0.7874 0.9217 0.9448 1 0.7440 0.5797 0.2562 0.0334
CCC+ 0.4117 0.4836 0.6830 0.6817 0.7440 1 0.6120 0.3407 0.0756
CCC 0.4107 0.3792 0.6599 0.5961 0.5797 0.6120 1 0.2216 0.0393
CCC- -0.3541 -0.0302 0.1565 0.1678 0.2562 0.3407 0.2216 1 0.3764
CC -0.1349 -0.0607 -0.0703 -0.0160 0.0334 0.0756 0.0393 0.3764 1

4 PIT versus TTC

Up to now, our discussion has been from the perspectives of PD estimation and validation based on realized PD using historical data. PIT PD is the realized PD for a certain period, usually a year. TTC PD, on the other hand, is the PD for the whole history. At the same time, we have the notion of a PIT and TTC rating system/philosophy. We are able to observe both TTC PD and PIT PD from a rating system no matter what philosophy it deploys. As discussed in Basel Committee on Banking Supervision (2005), we expect the PIT PD of a PIT rating system to be stable around TTC PD, and the PIT PD of a TTC rating system to be negatively correlated with the economic cycle. The more TTC the rating system is, the more the PIT PD varies. The plot (shown in Figure 3) of the realized annual PDs of a retail rating system XXX (PIT) versus S&P (TTC) confirms this statement.

Figure 3: PIT PD variation of a retail rating system (XXX) versus S&P.

The realized PD of S&P rating “B-” showed high spikes in 2002–3 and 2009, indicating a delayed effect from the macroeconomic recessions in 2001–2 and 2008–9. Meanwhile, the realized PD of the XXX rating system stays close to its TTC PD and does not follow the economic cycle at all.

4.1 PIT PD prediction for IFRS 9

Recently, a lot of research has been conducted to predict the PIT PD term structure in response to the IFRS 9 requirement (International Financial Reporting Standards 2014, Section 5.5.10): “At each reporting date, an entity shall measure the loss allowance for a financial instrument at an amount equal to the lifetime expected credit losses if the credit risk on that financial instrument has increased significantly since initial recognition.”

By lifetime expected credit losses (ECLs), we are to understand the “worst annual” ECL in a lifetime. Since ECL is computed based on the parameters PD, loss given default (LGD) and exposure at default (EAD), banks try to compute lifetime (usually five years) PIT PDs. There are several approaches to this, and each has different drawbacks.

  (1) Derive the five-year cumulative PD term structure from historical data. This is a direct extension of the one-year PD estimate based on the cohort method. However, the insufficiency of data becomes more serious as more years are considered (eg, very few obligors have rating data for five years). The resulting nonmonotonicity and missing data points in the realized PD mean that techniques such as curve-fitting and extrapolation are overused. This leads to unreliable PD term structure estimates and, oftentimes, unreasonable values (PD<0 or PD>1).

  (2) Project the multiperiod PD based on rating transition models. The two main issues with this approach are as follows. First, it assumes that rating migration follows a stationary Markov chain process. This assumption is questionable because of the momentum in credit ratings. Second, again because of limited data availability, it is extremely difficult to get a reliable estimate of the transition matrix.

  (3) Predict the forward PD from macroeconomic variables. As shown at the beginning of this section, the variation of realized PD in a TTC rating system could be explained in part by macroeconomic conditions. For a PIT rating system, however, changes in macroeconomic conditions are already embedded in the rating assigned, and the observed PD therefore varies less and is uncorrelated with macroeconomic parameters. Even though, for a TTC rating system, the macroeconomic variables (current or lagged by one or two years) might provide good explanatory power for a one- or two-year PD, we should not expect them to have predictive power for long-term PD. The reasoning behind this is as follows. Assume that some macroeconomic variables are able to reliably predict the five-year PD; at the same time, the PD is highly correlated with the recent macroeconomic parameters. This would imply that these macroeconomic parameters could themselves be reliably predicted over a five-year range, which is contrary to the basic principles of economics.
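
To make approach (2) concrete, the following minimal sketch projects a cumulative PD term structure from a one-year transition matrix under the stationary Markov assumption criticized above. The three-state matrix (Good, Weak, Default) and all of its entries are hypothetical values chosen for illustration; they are not estimated from any data set used in this paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def cumulative_pd(transition, horizon):
    """Cumulative PD per non-default starting rating for years 1..horizon.

    Assumes the last state is an absorbing default state and that the same
    one-year matrix applies every year (stationary Markov chain).
    """
    power = [row[:] for row in transition]
    term_structure = []
    for _ in range(horizon):
        # Last column of P^t holds the probability of having defaulted by year t
        term_structure.append([row[-1] for row in power[:-1]])
        power = mat_mul(power, transition)
    return term_structure


# Hypothetical one-year transition matrix: states = (Good, Weak, Default)
P = [
    [0.90, 0.09, 0.01],
    [0.10, 0.80, 0.10],
    [0.00, 0.00, 1.00],
]

term = cumulative_pd(P, 5)

if __name__ == "__main__":
    for year, pds in enumerate(term, start=1):
        print(year, [round(x, 4) for x in pds])
```

Because every entry of the matrix must be estimated from observed migrations, sparse data in the low grades feeds directly into the projected term structure, which is the second issue raised in approach (2).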

If we cannot hold much faith in these long-term PD predictions, are there any other options? If our understanding of lifetime ECL as worst annual ECL in a lifetime is correct, its estimate can be based on the worst annual PD in a lifetime. Here, the PD should be the one-year PIT PD: after all, IFRS 9 is an accounting standard for annual reports. According to common practice, five years can be used as a proxy for “lifetime”. The problem is thus reduced to finding the worst one-year PD in five years. Assume the underlying annual PDs follow a normal distribution, $p_t \sim N(p, \sigma^2)$. Based on our result in (2.5) in Section 2.2, the realized PD prediction is then $\hat{p}_t \sim N(p, (p(1-p)-\sigma^2)/N_t + \sigma^2)$. The worst one-year PD in five years is the expected value of the largest order statistic of five iid $N(p, (p(1-p)-\sigma^2)/N_t + \sigma^2)$ random variables. Using the results of Teichroew (1956) directly (partially quoted in Table 9), we get the worst one-year PD in five years:

  $E[\hat{p}_{\max,5}] = p + 1.1629644736 \sqrt{\dfrac{p(1-p)-\sigma^2}{N_t} + \sigma^2}$,   (4.1)

where p and σ are the mean and STD of the historical observed PD, respectively. The numerical results applied to the S&P data set are shown in Table 10.
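
As a worked illustration of (4.1), the sketch below evaluates the formula for one set of inputs. The function name, the input values (p = 2%, σ = 1.5%, 500 obligors) and the cap at 100% are assumptions made for illustration only; the inputs are not taken from the S&P data set.

```python
from math import sqrt

# Expected value of the largest of five iid N(0, 1) draws (Teichroew 1956)
K_MAX_OF_5 = 1.1629644736


def worst_one_year_pd(p, sigma, n_obligors, k=K_MAX_OF_5):
    """Evaluate (4.1): expected worst one-year realized PD in five years.

    p          -- mean of the historical observed PD (as a fraction)
    sigma      -- STD of the historical observed PD (as a fraction)
    n_obligors -- number of obligors N_t in the rating grade
    The result is capped at 1.0, which is an assumption of this sketch.
    """
    variance = (p * (1.0 - p) - sigma**2) / n_obligors + sigma**2
    return min(1.0, p + k * sqrt(variance))


if __name__ == "__main__":
    # Hypothetical grade: 2% mean PD, 1.5% STD, 500 obligors
    print(worst_one_year_pd(0.02, 0.015, 500))
```

The first term under the square root shrinks as the number of obligors grows, so for large grades the result is driven mainly by σ, while for small grades the sampling noise of the realized PD adds visibly to the estimate.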

Table 9: Expected values of order statistics from normal distribution.
$n$ $i$ $E(x_{i,n})$
2 1 0.5641895835
3 1 0.8462843753
4 1 1.0293753730
4 2 0.2970113823
5 1 1.1629644736
5 2 0.4950189705
6 1 1.2672063606
6 2 0.6417550388
6 3 0.2015468338
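
The tabulated constants can be checked numerically. The sketch below integrates the density of an order statistic with Simpson's rule, assuming that the second column of Table 9 is the rank counted from the largest observation (so that rank 1 is the sample maximum, consistent with the constant used in (4.1)). The function names and integration settings are illustrative choices.

```python
from math import erf, exp, factorial, pi, sqrt


def normal_pdf(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def expected_order_stat(n, i, lo=-10.0, hi=10.0, steps=20000):
    """Expected value of the i-th largest of n iid N(0, 1) draws.

    Integrates x times the order-statistic density with composite
    Simpson's rule on [lo, hi]; steps must be even.
    """
    coef = factorial(n) / (factorial(i - 1) * factorial(n - i))

    def integrand(x):
        cdf = normal_cdf(x)
        return coef * x * cdf ** (n - i) * (1.0 - cdf) ** (i - 1) * normal_pdf(x)

    h = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(lo + j * h)
    return total * h / 3.0


if __name__ == "__main__":
    for n, i in [(2, 1), (4, 2), (5, 1), (6, 3)]:
        print(n, i, round(expected_order_stat(n, i), 6))
```

The same routine gives the constant for other proxies of "lifetime" (for example, a horizon other than five years) without relying on the printed table.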

Table 10: PD estimation for normal instrument versus deteriorated instrument. [*PD estimates for AAA and AA are from linear extrapolation.]
Rating  Number of obligors  PD estimate (%)  Deviation (%)  $E[\hat{p}_{\max,5}]$ (%)
AAA     10      0.001*
AA      94      0.003*
A       520     0.016    0.093    0.124
BBB     1118    0.160    0.279    0.485
BB      828     0.623    0.862    1.625
B+      433     2.323    2.768    5.542
B       816     5.335    5.098    11.263
B-      301     10.086   9.260    20.855
CCC+    71      21.555   15.176   39.204
CCC     24      33.001   15.570   51.108
CCC-    20      49.111   28.406   82.146
CC      3       63.640   33.878   100.000

4.2 PIT to TTC rating conversion

Another area involving the PIT versus TTC discussion is the rating conversion from PIT to TTC. One typical objective of such a conversion is to get stable regulatory capital that does not vary greatly from year to year in response to changes in the economic cycle. The optimal way of developing another TTC rating system based on borrowers’ long-term credit characteristics involves too much effort. Moody’s (Hamilton et al 2011) proposed a method to convert its expected default frequency (EDF), a PIT credit measure based on market data (distance to default, DD), directly to TTC EDF by transforming DD. This simple shortcut, as recognized by Moody’s, comes at the cost of reduced efficiency and default prediction accuracy. Given the current input DD, the original mapped EDF is the best estimate at the moment. Any change to the mapping will result in a distorted/suboptimal EDF estimate. Keeping this in mind, we evaluate Moody’s PIT to TTC transformation. The fundamental transform is done using the simple linear equation

  $DD_t^{\mathrm{TTC}} = \mu + \beta (DD_t - \mu)$,   (4.2)

where $0 < \beta < 1$. This keeps the zero-frequency component and attenuates all other frequency components equally. $DD_t^{\mathrm{TTC}}$ is thus a suppressed version of $DD_t$ around its mean $\mu$, and the TTC EDF is a suppressed version of EDF around EDF($\mu$). In terms of variation of EDF, the amplitude is decreased. However, the assumption of the constant $\mu$ is a big flaw in this conversion. $DD_t$ includes not only the cyclical component, but also the idiosyncratic credit factors whose changes are not mean-reverting. The exponentially weighted moving average (EWMA) seems to be a better alternative:

  $DD_t^{\mathrm{TTC}} = (1-\beta) DD_{t-1}^{\mathrm{TTC}} + \beta DD_t$.   (4.3)

In general, the idea of the direct PIT to TTC rating conversion is to apply a low-pass (LP) filter to smooth out the resultant rating generated by the rating system. Whether this LP filter should be in the form of a simple moving average (SMA) or an EWMA, and whether the LP filter should be applied to the rating system input (risk drivers, ie, DD in Moody’s case) or to the final rating, are choices left up to the practitioners, as long as they are theoretically correct and fulfill the practical objectives. It should always be kept in mind that the original PIT ratings are the best ratings assigned based on the best current knowledge. The converted TTC ratings are compromised ratings for capital calculation only. PIT ratings should still be kept and used for pricing purposes, at least for short-term loans, since PIT ratings incorporate more recent information and provide risk rankings that reflect obligors’ most current credit situations. Araten et al (2004) actually showed that the discriminatory power of Moody’s KMV is higher than that of Moody’s TTC ratings based on its rating methodology papers (71% versus 64.4% during 1997–2002 in terms of accuracy ratios).
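
The difference between the linear transform (4.2) and the EWMA (4.3) can be seen in a small sketch. The DD series, the value of β and the choice to initialize the EWMA at the first observation are all hypothetical assumptions for illustration; they are not taken from Moody's methodology.

```python
def shrink_to_mean(dd_series, mu, beta):
    """Linear transform (4.2): shrink each DD value toward a fixed mean mu."""
    return [mu + beta * (dd - mu) for dd in dd_series]


def ewma(dd_series, beta, initial=None):
    """EWMA filter (4.3). The starting value defaults to the first
    observation, which is an assumption of this sketch."""
    previous = dd_series[0] if initial is None else initial
    smoothed = []
    for dd in dd_series:
        previous = (1.0 - beta) * previous + beta * dd
        smoothed.append(previous)
    return smoothed


# Hypothetical DD path with a permanent (non-mean-reverting) downward shift
dd_path = [4.0, 4.0, 4.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]

linear_ttc = shrink_to_mean(dd_path, mu=4.0, beta=0.5)
ewma_ttc = ewma(dd_path, beta=0.5)

if __name__ == "__main__":
    for raw, lin, ew in zip(dd_path, linear_ttc, ewma_ttc):
        print(raw, lin, round(ew, 4))
```

With a permanent shift in the idiosyncratic credit quality, the linear transform stays anchored to the old mean indefinitely, whereas the EWMA output gradually converges to the new level.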

5 Conclusion

In this paper, we used as our starting point the fundamental mathematical modeling of the PD parameter estimation as required by Basel II. The two methodologies, based on TTC and PIT PD respectively, were presented and evaluated using an S&P large corporation data set. We believe this modeling would be helpful to practitioners currently working on PD estimation and validation in the Basel II area. On this theoretical ground, the discussion was extended to the current hot topic of PD forecasting to meet IFRS 9 requirements. Rather than trying to predict the long-term PD term structure, we proposed a purely statistical approach, namely using order statistics to estimate the worst PD for the lifetime ECL estimation required by IFRS 9. We think this makes more business sense, but we are open to further discussion. Finally, we touched on the PIT to TTC rating conversion problem with the objective of getting a stable regulatory capital calculation. It would be worth looking at this in more detail in future work.

Declaration of interest

The views expressed in this paper are not necessarily those of the Royal Bank of Canada or any of its affiliates.

Acknowledgements

We thank Dr Michael Clayton, Darko Lakota, Dr Biao Wu and the anonymous reviewers for their valuable suggestions.

References

  • Agresti, A., and Coull, B. A. (1998). Approximate is better than “exact” for interval estimation of binomial proportions. American Statistician 52, 119–126.
  • Araten, M., Jacobs, M., Jr., Varshney, P., and Pellegrino, C. R. (2004). An internal ratings migration study. Journal of the Risk Management Association April, 92–97.
  • Basel Committee on Banking Supervision (2005). Studies on the validation of internal rating systems. Working Paper 14, May, Bank for International Settlements.
  • Basel Committee on Banking Supervision (2006). International convergence of capital measurement and capital standards: a revised framework. Report, June, Bank for International Settlements.
  • Blochwitz, S., Martin, M. R. W., and Wehn, C. S. (2011). XIII. Statistical approaches to PD validation. In The Basel II Risk Parameters, 2nd edn. Springer (https://doi.org/10.1007/978-3-642-16114-8_14).
  • Cantor, R., and Falkenstein, E. (2001). Testing for rating consistency in annual default rates. Journal of Fixed Income 11(2), 36–51 (https://doi.org/10.3905/jfi.2001.319296).
  • Clopper, C., and Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26, 404–413 (https://doi.org/10.1093/biomet/26.4.404).
  • Hamilton, D., Sun, Z., and Ding, M. (2011). Through-the-cycle EDF credit measures. Report, August, Moody’s Analytics.
  • Hanson, S., and Schuermann, T. (2005). Confidence intervals for probabilities of default. Journal of Banking and Finance 30(8), 2281–2301 (https://doi.org/10.2139/ssrn.766345).
  • International Financial Reporting Standards (2014). IFRS 9 Financial Instruments. Project Summary, July, IFRS. URL: https://bit.ly/2Mfpqrl.
  • Teichroew, D. (1956). Tables of expected values of order statistics and products of order statistics for samples of size twenty and less from the normal distribution. Annals of Mathematical Statistics 27(2), 410–426 (https://doi.org/10.1214/aoms/1177728266).
  • Wilson, E. B. (1927). Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22, 209–212 (https://doi.org/10.1080/01621459.1927.10502953).
  • Yang, B. H. (2014). Modeling systematic risk and point-in-time probability of default under the Vasicek asymptotic single-risk-factor model framework. The Journal of Risk Model Validation 8(3), 33–48 (https://doi.org/10.21314/JRMV.2014.126).
  • Yang, B. H. (2017). Point-in-time PD term structure models for multi-period-scenario loss projection: methodologies and implementations for IFRS 9 ECL and CCAR testing. MPRA Paper 76271, University Library of Munich, Germany.
