We propose a dependent hidden Markov model of credit quality. We suppose that the "true" credit quality is not observed directly but only through noisy observations given by posted credit ratings. The model is formulated in discrete time with a Markov chain observed in martingale noise, where the "noise" terms of the state and observation processes are possibly dependent. The model provides estimates of the state of the Markov chain governing the evolution of credit quality, together with estimates of the model parameters obtained using the EM algorithm. The dependent dynamics allow for the so-called "rating momentum" discussed in the credit literature and also provide a convenient test of independence between the state and observation dynamics.
1. Introduction
Credit ratings summarise a range of qualitative and quantitative information about the creditworthiness of debt issuers and are therefore a convenient signal for the credit quality of the debtor. The estimation of credit quality transition matrices is at the core of credit risk measures with applications to pricing and portfolio risk management. In view of pending regulations regarding the calculation of capital requirements for banks, there is renewed interest in the efficiency of credit ratings as indicators of credit quality and in models of their dynamics (Basel Committee on Banking Supervision [1]).
In the study of credit quality dynamics, it is convenient to assume that the credit rating process is a time-homogeneous Markov chain, with past changes in credit quality characterised by a transition matrix. The assumptions of time homogeneity and Markovian behaviour of the rating process have been challenged by some empirical studies; see, for example, Bangia et al. [2] or Lando and Skødeberg [3]. In particular, it has been proposed that ratings exhibit “rating momentum” or “drift,” where a rating change in response to a change in credit quality does not fully reflect that change in credit quality. As pointed out by Löffler in [4, 5], these violations of information efficiency could be the result of some of the agencies’ rating policies, namely, rating through the cycle and avoiding rating reversals.
In recent years, a number of modelling alternatives were suggested to address departures from the Markov assumption. In Frydman and Schuermann [6], a mixture of two independent continuous time homogeneous Markov chains is proposed for the ratings migration process, so that the future distribution of a firm's ratings depends not only on its current rating but also on the past history of ratings. Wendin and McNeil [7] suppose that credit ratings are subject to both observed and unobserved systematic risk. Rating transition patterns (e.g., rating momentum) are captured within the context of a generalised linear mixed model (GLMM) that is estimated using Bayesian techniques. Stefanescu et al. [8] propose a Bayesian hierarchical framework, based on Markov Chain Monte Carlo (MCMC) techniques, to model non-Markovian dynamics in ratings migrations. In Wozabal and Hochreiter [9], a coupled Markov chain model is introduced to model dependency among rating migrations of issuers.
In this paper we follow the hidden Markov model (HMM) approach taken in Korolkiewicz and Elliott [10] and assume that the “true” credit quality evolution can be described by a Markov chain but we do not observe this Markov chain directly. Rather, it is hidden in “noisy” observations represented by posted credit ratings. The model is formulated in discrete time, with a Markov chain of “true” credit quality observed in martingale noise. However, we suppose that noise terms of the signal and observation processes are not independent, which allows for the presence of “rating momentum” in posted credit ratings. Application of such dependent hidden Markov model dynamics to modelling credit quality appears to be new. We employ hidden Markov filtering and estimation techniques described in Elliott et al. [11] and use the filter-based EM (Expectation Maximization) algorithm to estimate the parameters of the model. By construction parameters are revised as new information is obtained and so the resulting filters are adaptive and “self-tuning.”
The paper is organized as follows. In Section 2 we describe a hidden Markov model (HMM) of credit quality and in Section 3 the dependent dynamics. Recursive filters are given in Section 4 and the parameter estimation procedure is described in Section 5. Section 6 provides an implementation example.
2. Dynamics of the Markov Chain and Observations
Here we briefly describe a hidden Markov model as given in Chapter 2 of Elliott et al. [11]. Formally, a discrete-time, finite-state, time homogeneous Markov chain is a stochastic process {Xk} with the state space S={1,2,…,N} and a transition matrix A=(aji)1≤i,j≤N. Without loss of generality, we can assume that the elements of S are identified with the standard unit vectors {e1,e2,…,eN}, ei=(0,…,0,1,0,…,0)′∈ℝN.
Write ℱk=σ{X0,X1,…,Xk}, so that the filtration {ℱk} models all possible histories of X. The relationship between the state of the process at time k and the state of the process at time k+1 is then given by E[Xk+1|Xk]=AXk.
Define Vk+1=Xk+1-AXk∈ℝN. Then, the semimartingale representation of the chain X is
(2.1)Xk+1=AXk+Vk+1,k=0,1,…,
where Vk+1 is a martingale increment with E[Vk+1|ℱk]=0∈ℝN.
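As a concrete illustration of the unit-vector dynamics (not part of the model specification above), the transition Xk+1 = AXk + Vk+1 can be simulated directly; the 2-state matrix A below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state transition matrix; column i holds P(X_{k+1} = e_j | X_k = e_i),
# so each column sums to one.
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])

def step(A, x, rng):
    """One transition X_{k+1} = A X_k + V_{k+1}: draw a unit vector with probabilities A x."""
    probs = A @ x                          # conditional distribution of X_{k+1}
    j = rng.choice(len(probs), p=probs)
    x_next = np.zeros(len(probs))
    x_next[j] = 1.0
    v = x_next - probs                     # the martingale increment V_{k+1}
    return x_next, v

x_next, v = step(A, np.array([1.0, 0.0]), rng)
```

Because x_next is a unit vector and probs sums to one, the increment v always sums to zero, reflecting E[Vk+1|ℱk]=0.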
Suppose we do not observe X directly. Rather, we observe a process Y such that
(2.2)Yk+1=c(Xk,ωk+1),k=0,1,…,
where c is a function with values in a finite set and {ωk} is a sequence of i.i.d. random variables independent of X. Random variables {ωk} represent the noise present in the system. Suppose the range of c consists of M points which are identified with unit vectors {f1,f2,…,fM},fj=(0,…,0,1,0,…,0)′∈ℝM.
Write
(2.3) 𝒴k=σ{Y0,Y1,…,Yk}, 𝒢k=σ{X0,…,Xk,Y0,…,Yk}.
These increasing families of σ-fields are filtrations representing possible histories of the state process X, the observation process Y, and both processes (X,Y). Write cji=P(Yk+1=fj∣Xk=ei), 1≤i≤N, 1≤j≤M, for the probability of observing a state fj when the signal process is in fact in state ei. Then, it can be shown that E[Yk+1|Xk]=CXk, where C=(cji), 1≤j≤M, 1≤i≤N, is a matrix with cji≥0 and ∑j=1Mcji=1.
Define Wk+1=Yk+1-CXk. The semimartingale representation of the process Y is
(2.4)Yk+1=CXk+Wk+1,k=0,1,…,
where Wk+1 is a martingale increment with E[Wk+1|𝒢k]=0∈ℝM. In our context, the process Y represents posted credit ratings and X "true" credit quality. For reasons which will become apparent in the next section, we assume a one-period delay between X and Y.
In summary, the model for the Markov chain X hidden in martingale noise is as follows.
Hidden Markov Model (HMM)
Under a probability measure P,
(2.5) Xk+1 = AXk + Vk+1 (signal equation: true credit quality), Yk+1 = CXk + Wk+1 (observation equation: posted rating).
A and C are matrices of transition probabilities whose entries satisfy
(2.6) ∑j=1N aji = 1, aji ≥ 0; ∑j=1M cji = 1, cji ≥ 0.
Vk and Wk are martingale increments satisfying
(2.7)E[Vk+1∣Fk]=0,E[Wk+1∣Gk]=0.
Parameters of this model are (aji), 1≤i,j≤N, and (cji), 1≤j≤M, 1≤i≤N.
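A minimal simulation of the HMM (2.5), under hypothetical matrices A and C with N = 2 states and M = 3 rating labels, illustrates the one-period delay between the chain and the observations:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameters: N = 2 hidden credit-quality states, M = 3 rating labels.
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])            # sum_j a_ji = 1 down each column
C = np.array([[0.7, 0.1],
              [0.2, 0.3],
              [0.1, 0.6]])            # sum_j c_ji = 1 down each column

def simulate(A, C, x0, T, rng):
    """Simulate (2.5): the rating Y_{k+1} is drawn from the column of C picked by X_k."""
    N, M = A.shape[0], C.shape[0]
    X, Y = [x0], []
    for _ in range(T):
        i = int(np.argmax(X[-1]))                      # current state index
        Y.append(np.eye(M)[rng.choice(M, p=C[:, i])])  # Y_{k+1} = C X_k + W_{k+1}
        X.append(np.eye(N)[rng.choice(N, p=A[:, i])])  # X_{k+1} = A X_k + V_{k+1}
    return np.array(X), np.array(Y)

X, Y = simulate(A, C, np.array([1.0, 0.0]), 50, rng)
```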
3. Dependent Dynamics
The situation considered in this section is that of a hidden Markov model (HMM) for which the “noise” terms in the state and observation processes are possibly dependent.
The dynamics of the state process X and the observation process Y are as given in Section 2. However, the noise terms Vk and Wk are not independent. Instead, we suppose that the joint dynamics of Yk+1 and Xk+1 are given by
(3.1)Yk+1Xk+1′=SXk+Γk+1,k=0,1,…,
where S=(srji) denotes a (M×N)×N matrix mapping ℝN into ℝM×ℝN and
(3.2) srji = P(Yk+1=fr, Xk+1=ej ∣ Xk=ei), 1≤r≤M, 1≤i,j≤N.
Γk+1 is a {𝒢k}-martingale increment with E[Γk+1|𝒢k]=0. Write 1=(1,1,…,1)′ for the vector in ℝN or ℝM depending on the context. Then, for 1∈ℝM, 〈1,SXk〉=AXk and, for 1∈ℝN, 〈SXk,1〉=CXk, where 〈,〉 denotes the scalar product in ℝM and ℝN, respectively.
Write γrji=P(Yk+1=fr∣Xk+1=ej,Xk=ei)=srji/aji, and let C~ be the M×(N×N) matrix (γrji),1≤r≤M,1≤i,j≤N. Then it can be shown that Yk+1=C~(Xk+1Xk′)+W~k+1, where E[W~k+1|𝒢k]=0.
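The relations between S, A, C, and C~ can be checked numerically. The sketch below uses a hypothetical tensor S[r, j, i] = srji with N = M = 2; summing out the rating index recovers A, summing out the next state recovers C, and dividing by aji gives γrji.

```python
import numpy as np

# Hypothetical tensor S[r, j, i] = s_rji with N = M = 2; each column i sums to one
# over the pairs (r, j).
S = np.array([[[0.50, 0.10],
               [0.05, 0.10]],
              [[0.30, 0.05],
               [0.15, 0.75]]])

A = S.sum(axis=0)        # a_ji = sum_r s_rji, i.e. <1, S X_k> = A X_k
C = S.sum(axis=1)        # c_ri = sum_j s_rji, i.e. <S X_k, 1> = C X_k
gamma = S / A            # gamma_rji = s_rji / a_ji (assumes a_ji > 0)

# Each gamma[:, j, i] is a probability distribution over ratings r.
```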
In summary, the model is now as follows.
Dependent Hidden Markov Model (Dependent HMM)
Under a probability measure P,
(3.3) Xk+1 = AXk + Vk+1, Yk+1 = C~(Xk+1Xk′) + W~k+1.
A and C~ are matrices of transition probabilities whose entries satisfy
(3.4) ∑j=1N aji = 1, ∑r=1M γrji = 1.
Vk and W~k are martingale increments satisfying
(3.5)E[Vk+1∣Fk]=0,E[W~k+1∣Gk]=0.
Parameters of this model are (aji), 1≤i,j≤N, (cji), 1≤j≤M, 1≤i≤N, and (srji), 1≤i,j≤N, 1≤r≤M.
We are in a situation analogous to the dependent hidden Markov model case discussed in Chapter 2, Section 10 of Elliott et al. [11]. The difference is that we are assuming dynamics where the observation Yk depends on both Xk and Xk-1. In other words, we suppose that the current credit rating contains information about both current and previous credit quality, thus allowing for the situation where a rating does not immediately reflect all available information about credit quality, as indicated by a number of empirical studies (see, e.g., Lando and Skødeberg [3]). Put differently, in this model Xk and observation Yk jointly depend on Xk-1, which means that, in addition to previous period’s credit quality, knowledge of current credit rating carries information about current credit quality. Moreover, probabilities γrji provide the distribution of the next period’s credit rating given both current and next period’s credit quality, thus allowing us to capture “rating momentum” or “rating drift.”
In the following sections we present estimates for the state of the Markov chain X, the number of jumps from one state to another, the occupation time of X in any state, the number of transitions from a state of X to a particular observation of Y, and the number of joint transitions of X and Y. We then use the filter-based EM (expectation maximization) algorithm, as described in Elliott et al. [11], to obtain optimal estimates of the model parameters, making the model adaptive or "self-tuning."
Note that if the noise terms in the state X and observation Y are independent, we have
(3.6)P(Yk+1=fr,Xk+1=ej∣Xk=ei)=P(Yk+1=fr∣Xk=ei)P(Xk+1=ej∣Xk=ei).
Hence if the noise terms are independent,
(3.7) srji = cri aji
for 1≤r≤M,1≤i,j≤N. Consequently, a test of independence is to check whether parameter estimates satisfy
(3.8) s^rji = c^ri a^ji.
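Under independence, the joint probabilities factor into the product of the observation and transition probabilities. The sketch below builds such a factorised tensor from hypothetical A and C (with cri = P(Yk+1=fr∣Xk=ei) as defined in Section 2) and confirms that marginalising it recovers both matrices.

```python
import numpy as np

# Hypothetical A and C with N = M = 2; c_ri = P(Y_{k+1} = f_r | X_k = e_i).
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
C = np.array([[0.7, 0.1],
              [0.3, 0.9]])

S = np.einsum('ri,ji->rji', C, A)   # independent case: s_rji = c_ri * a_ji

# Marginalising the factorised tensor recovers A and C exactly.
A_back = S.sum(axis=0)
C_back = S.sum(axis=1)
```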
4. Recursive Filter
Following Elliott et al. [11], suppose that under some probability measure P¯ on (Ω,ℱ), {Yk} is a sequence of i.i.d. uniform variables, that is, P¯(Yk+1=fj∣𝒢k)=P¯(Yk+1=fj)=1/M. Further, under P¯, X is a Markov chain independent of Y, with state space S={e1,…,eN} and transition matrix A=(aji). That is, Xk+1=AXk+Vk+1, where E¯[Vk+1∣𝒢k]=E¯[Vk+1∣ℱk]=0∈ℝN. Suppose C~=(γrji), 1≤r≤M, 1≤i,j≤N, is a matrix with γrji≥0 and ∑r=1Mγrji=1.
Define λ¯l=M∑j=1M〈C~(XlXl-1′),fj〉〈Yl,fj〉 and Λ¯k=∏l=1kλ¯l. Define a new probability measure P by putting (dP/dP¯)∣𝒢k=Λ¯k. Then, under P,X remains a Markov chain with transition matrix A and P(Yk+1=fr∣Xk+1=ej,Xk=ei)=γrji. That is, under P,Xk+1=AXk+Vk+1 and Yk+1=C~(Xk+1Xk′)+W~k+1.
Suppose we observe Y0,…,Yk, and we wish to estimate X0,…,Xk. The best (mean square) estimate of Xk given 𝒴k=σ{Y0,…,Yk} is E[Xk∣𝒴k]∈ℝN. However, P¯ is a much easier measure under which to work. Using Bayes’ Theorem as described in Elliott et al. [11], we have
(4.1) E[Xk∣𝒴k] = E¯[Λ¯kXk∣𝒴k] / E¯[Λ¯k∣𝒴k].
Write q~k:=E¯[Λ¯kXk∣𝒴k]∈ℝN. q~k is then an unnormalized conditional expectation of Xk given the observations 𝒴k. Note that E¯[Λ¯k∣𝒴k]=〈q~k,1〉, where 1=(1,1,…,1)′∈ℝN. It then follows that
(4.2) E[Xk∣𝒴k] = q~k / 〈q~k,1〉.
Hence, to estimate E[Xk∣𝒴k] we need to know the dynamics of q~. Using the methods of Elliott et al. [11], the following recursive formula for q~k+1 is obtained:
(4.3)q~k+1=MYk+1′Sq~k.
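A sketch of the recursion (4.3), with S stored as a hypothetical (M, N, N) array so that observing Yk+1 = fr selects the N×N slice S[r]:

```python
import numpy as np

# Hypothetical joint tensor S[r, j, i] = s_rji with M = N = 2; observing Y_{k+1} = f_r
# selects the N x N slice S[r], so (4.3) becomes q_{k+1} = M * S[r] @ q_k.
S = np.array([[[0.50, 0.10],
               [0.05, 0.10]],
              [[0.30, 0.05],
               [0.15, 0.75]]])
M = S.shape[0]

def filter_states(S, ratings, q0):
    """Return the normalised estimates E[X_k | Y_k] along a rating path."""
    q = q0.copy()
    estimates = []
    for r in ratings:                   # r is the index of the observed rating f_r
        q = M * (S[r] @ q)              # unnormalised recursion (4.3)
        estimates.append(q / q.sum())   # normalise: E[X_k | Y_k] = q / <q, 1>
    return estimates

est = filter_states(S, ratings=[0, 1, 1, 0], q0=np.array([0.5, 0.5]))
```

Only the normalisation step involves the whole vector q~k, so the filter runs in O(N²) per observation.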
5. Parameter Estimates
To estimate parameters of the model, matrices A,C, and S, we need estimates of the following processes:
(5.1)
Jkij = ∑n=1k 〈Xn-1,ei〉〈Xn,ej〉, 1≤i,j≤N,
Oki = ∑n=1k 〈Xn-1,ei〉, 1≤i≤N,
Tkir = ∑n=1k 〈Xn-1,ei〉〈Yn,fr〉, 1≤i≤N, 1≤r≤M,
Lkijr = ∑n=1k 〈Xn-1,ei〉〈Xn,ej〉〈Yn,fr〉, 1≤r≤M, 1≤i,j≤N.
The above processes are interpreted as follows:
𝒥kij is the number of jumps of X from state ei to state ej up to time k. 𝒪ki is the amount of time, up to time k-1, that X has spent in state ei. 𝒯kir is the number of transitions, up to time k, from state ei to observation fr. ℒkijr is the number of jumps of X from state ei to state ej while Y was in state fr, up to time k.
Note that ∑j=1N𝒥kij = ∑r=1M𝒯kir = ∑r=1M∑j=1Nℒkijr = 𝒪ki.
Consider first the jump process {𝒥kij}. We wish to estimate 𝒥kij given the observations Y0,…,Yk. As in the case of a filter for the state X described in Section 4, the best (mean-square) estimate is
(5.2) E[Jkij∣𝒴k] = E¯[Λ¯kJkij∣𝒴k] / E¯[Λ¯k∣𝒴k] := σ(Jij)k / 〈q~k,1〉.
We wish to know how σ(𝒥ij)k is updated as time passes and new information arrives. However, as noted in Elliott et al. [11], we work with σ(JijX)k=E¯[Λ¯kJkijXk∣𝒴k] rather than σ(Jij)k=E¯[Λ¯kJkij∣𝒴k], in order to obtain closed-form recursions. The quantity of interest, namely, σ(𝒥ij)k, is then readily obtained as σ(𝒥ij)k=〈σ(𝒥ijX)k,1〉. We have
(5.3)σ(JijX)k+1=MYk+1′Sσ(JijX)k+〈q~k,ei〉(M∑r=1Msrji〈Yk+1,fr〉)ej.
Similarly, we consider the best (mean square) estimates of 𝒪ki, 𝒯kir, and ℒkijr given 𝒴k:
(5.4)
E[Oki∣𝒴k] = E¯[Λ¯kOki∣𝒴k] / E¯[Λ¯k∣𝒴k] := σ(Oi)k / 〈q~k,1〉,
E[Tkir∣𝒴k] = E¯[Λ¯kTkir∣𝒴k] / E¯[Λ¯k∣𝒴k] := σ(Tir)k / 〈q~k,1〉,
E[Lkijr∣𝒴k] = E¯[Λ¯kLkijr∣𝒴k] / E¯[Λ¯k∣𝒴k] := σ(Lijr)k / 〈q~k,1〉.
Recursive formulae for the processes
(5.5)σ(OiX)k:=E¯[Λ¯kOkiXk∣Yk],σ(TirX)k:=E¯[Λ¯kTkirXk∣Yk],σ(LijrX)k:=E¯[Λ¯kLkijrXk∣Yk]
are as follows:
(5.6)
σ(OiX)k+1 = MYk+1′Sσ(OiX)k + 〈q~k,ei〉∑j=1N(M∑r=1Msrji〈Yk+1,fr〉)ej,
σ(TirX)k+1 = MYk+1′Sσ(TirX)k + M〈q~k,ei〉(∑j=1Nsrjiej)〈Yk+1,fr〉,
σ(LijrX)k+1 = MYk+1′Sσ(LijrX)k + 〈q~k,ei〉Msrji〈Yk+1,fr〉ej.
As in the case of the number of jumps of the state process X, quantities of interest σ(𝒪i)k, σ(𝒯ir)k, and σ(ℒijr)k are obtained by taking inner products with 1=(1,1,…,1)′:
(5.7)σ(Oi)k=〈σ(OiX)k,1〉,σ(Tir)k=〈σ(TirX)k,1〉,σ(Lijr)k=〈σ(LijrX)k,1〉.
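The recursions for the filtered counts can be run alongside the state filter. The fragment below is a sketch that propagates σ(JijX)k and σ(OiX)k for a hypothetical S with N = M = 2; the recursions for σ(TirX)k and σ(LijrX)k follow the same pattern.

```python
import numpy as np

# Sketch of recursions (5.3) and (5.6) for a hypothetical S with N = M = 2.
# sJX[i, j] stores the vector sigma(J^{ij} X)_k; sOX[i] stores sigma(O^i X)_k.
S = np.array([[[0.50, 0.10],
               [0.05, 0.10]],
              [[0.30, 0.05],
               [0.15, 0.75]]])
M_, N = S.shape[0], S.shape[1]

def update(S, r, q, sJX, sOX):
    """One step of the filters given the observed rating index r (Y_{k+1} = f_r)."""
    U = M_ * S[r]                                  # update matrix M * (Y_{k+1}' S)
    sJX_new = np.einsum('ab,ijb->ija', U, sJX)     # propagate sigma(J^{ij} X)_k
    sOX_new = np.einsum('ab,ib->ia', U, sOX)       # propagate sigma(O^i X)_k
    for i in range(N):
        sOX_new[i] += q[i] * U[:, i]               # + <q_k, e_i> sum_j M s_rji e_j
        for j in range(N):
            sJX_new[i, j, j] += q[i] * U[j, i]     # + <q_k, e_i> M s_rji e_j
    return U @ q, sJX_new, sOX_new                 # U @ q is the state filter (4.3)

q = np.array([0.5, 0.5])
sJX, sOX = np.zeros((N, N, N)), np.zeros((N, N))
for r in [0, 1, 1]:                                # a short hypothetical rating path
    q, sJX, sOX = update(S, r, q, sJX, sOX)
```

Since ∑j 𝒥kij = 𝒪ki, the two sets of statistics remain consistent after every update, which provides a useful check on an implementation.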
The model is determined by parameters θ={aji,1≤i,j≤N;cji,1≤i≤N,1≤j≤M;srji,1≤r≤M,1≤i,j≤N}. These satisfy
(5.8)aji≥0,∑j=1Naji=1,cji≥0,∑j=1Mcji=1,srji≥0,∑r=1M∑j=1Nsrji=1.
We want to determine a new set of parameters θ^={a^ji, 1≤i,j≤N; c^ji, 1≤i≤N, 1≤j≤M; s^rji, 1≤r≤M, 1≤i,j≤N} given the arrival of new information embedded in the values of the observation process Y. This requires maximum likelihood estimation. As in [11], we proceed by using the filter-based EM (Expectation Maximization) algorithm, which retains the well-established statistical properties of the EM algorithm while reducing memory costs and thus allowing for faster computation (see, e.g., Krishnamurthy and Chung [12]).
Consider first the parameter aji. Suppose that, under measure Pθ,X is a Markov chain with transition matrix A=(aji). We define a new probability measure Pθ^ such that, under Pθ^,X is a Markov chain with transition matrix A^=(a^ji), that is,
(5.9)Pθ^(Xk+1=ej∣Xk=ei)=a^ji,a^ji≥0,∑j=1Na^ji=1. Define
(5.10) Λ0=1, Λk=∏l=1k(∑r,s=1N(a^sr/asr)〈Xl,es〉〈Xl-1,er〉).
In case asr=0, take a^sr=0 and a^sr/asr=1.
Define Pθ^ by setting (dPθ^/dPθ)∣ℱk=Λk. It can then be shown that, under Pθ^,X is a Markov chain with transition matrix A^=(a^ji). Moreover, given the observations up to time k, {Y0,Y1,…,Yk}, and given the parameter set θ={aji,1≤i,j≤N;cji,1≤i≤N,1≤j≤M}, the EM estimates a^ji are given by
(5.11)a^ji=σ(Jij)kσ(Oi)k.
Consider now the parameter cji. Suppose that, under measure Pθ,
(5.12)Yk+1=CXk+Wk+1,
where C=(cji). We define a new probability measure Pθ^ as follows. Put
(5.13) Λ0=1, Λk=∏l=1k(∑r=1N∑s=1M(c^sr/csr)〈Xl-1,er〉〈Yl,fs〉).
In case csr=0, take c^sr=0 and c^sr/csr=1.
Define Pθ^ by setting (dPθ^/dPθ)∣𝒢k=Λk. Again it can be shown that, under Pθ^,
(5.14)Yk+1=C^Xk+W^k+1,
that is, Pθ^(Yk+1=fs∣Xk=er)=c^sr. Moreover, given the observations up to time k,{Y0,Y1,…,Yk}, and given the parameter set θ={aji,1≤i,j≤N;cji,1≤i≤N,1≤j≤M}, the EM estimates c^ji are given by
(5.15)c^ji=σ(Tij)kσ(Oi)k.
Finally, consider the parameter srji. A new probability measure Pθ^ is defined by putting
(5.16) Λ0=1, Λk=∏l=1k(∑r=1M∑i,j=1N(s^rji/srji)〈Yl,fr〉〈Xl,ej〉〈Xl-1,ei〉).
In case srji=0 take s^rji=0 and s^rji/srji=1. Define Pθ^ by setting (dPθ^/dPθ)∣𝒢k=Λk. Then, under Pθ^, Yk+1Xk+1′=S^Xk+Γ^k+1, that is,
(5.17)Pθ^(Yk+1=fr,Xk+1=ej∣Xk=ei)=s^rji.
Given the observations up to time k,{Y0,Y1,…,Yk}, and given the parameter set θ={aji,1≤i,j≤N;cji,1≤i≤N,1≤j≤M;srji,1≤r≤M,1≤i,j≤N}, the EM estimates s^rji are then given by
(5.18)s^rji=σ(Lijr)kσ(Oi)k.
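Given the filtered statistics, the three re-estimates (5.11), (5.15), and (5.18) are simple ratios. The values below are hypothetical placeholders for σ(Jij)k, σ(Oi)k, σ(Tir)k, and σ(Lijr)k, chosen to satisfy the consistency relations of this section.

```python
import numpy as np

# EM re-estimates (5.11), (5.15), (5.18) as ratios of filtered statistics.
# The arrays below are hypothetical stand-ins, not filtered from data.
N, M = 2, 2
sigma_J = np.array([[4.0, 1.0],
                    [2.0, 5.0]])               # sigma_J[i, j]: jumps e_i -> e_j
sigma_O = sigma_J.sum(axis=1)                  # occupation times: sum_j J^{ij} = O^i
sigma_T = np.array([[3.0, 2.0],
                    [1.5, 5.5]])               # sigma_T[i, r]: sum_r T^{ir} = O^i
weights = np.random.default_rng(2).dirichlet(np.ones(N * M), size=N)
sigma_L = (weights * sigma_O[:, None]).reshape(N, N, M)  # sigma_L[i, j, r]

A_hat = sigma_J.T / sigma_O                       # a^_ji = sigma(J^{ij})_k / sigma(O^i)_k
C_hat = sigma_T.T / sigma_O                       # c^_ri = sigma(T^{ir})_k / sigma(O^i)_k
S_hat = np.einsum('ijr->rji', sigma_L) / sigma_O  # s^_rji = sigma(L^{ijr})_k / sigma(O^i)_k
```

Because the statistics satisfy the consistency relations, the re-estimated columns of A^, C^, and S^ automatically sum to one.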
6. Implementation Example
The dependent hidden Markov model (Dependent HMM) described in previous sections was applied to a dataset of Standard & Poor’s credit ratings. Description of the data and implementation results are given below.
6.1. Data Description
Our analysis takes advantage of the Standard & Poor's COMPUSTAT database, which contains rating histories for 1,301 obligors over the period 1985–1999 (Standard & Poor's [13]). The universe of obligors is mainly large US and Canadian corporate institutions. The obligors include industrials, utilities, insurance companies, banks and other financial institutions, and real-estate companies. The COMPUSTAT database provides annual ratings. Every year, each of the rated obligors is assigned to one of Standard & Poor's seven rating categories, ranging from AAA (highest) to CCC (lowest), or to D (payment in default) or the NR (not rated) state.
We have a total of 19,515 firm-years in our sample. However, only 34% of those observations correspond to one of the eight Standard & Poor’s rating labels in a given year. The remaining 66% of observations represent the so-called NR (not rated) status. As discussed in the literature, transitions to NR may be due to several reasons, such as expiration of the debt, calling of the debt, or the issuer deciding to bypass an agency rating (see, e.g., Bangia et al. [2]). Unfortunately, details of individual transitions to NR are not known.
Excluding NR, approximately 85% of the remaining ratings are in categories A down to B. The median rating is BB, the highest non-investment-grade rating. Approximately 1% of the observed ratings are AAA and 2% are defaults. The most common rating is B, two rating categories above default, which accounts for 25.5% of the observations.
6.2. Implementation Results
Since individual firms generally experience few rating changes, and changes that do occur are to neighbouring categories, we apply the Dependent HMM algorithm to an aggregate of firms in the dataset rather than to individual firms, to allow for more observed transitions between rating categories and make inference possible. Specifically, we follow the filter-based cohort approach adopted in Korolkiewicz and Elliott [10]: instead of estimating the distribution and parameters for the Markov chain Xkl for each firm l, we estimate the distribution and parameters for ∑l=1LXkl, given the additivity of all stochastic processes discussed in Sections 4 and 5.
Given the fairly large number of parameters to be estimated compared to the number of rating transitions in the dataset, we have reclassified all firms in the sample as IG (investment grade), SG (speculative grade), D, or NR and then applied the Dependent HMM algorithm to the new dataset. This classification is motivated by the fact that a corporation which can issue higher rated debt usually receives better financing terms. Further, as a matter of policy or law, some institutional investors can only purchase investment-grade bonds. Hence it is often crucial for a borrower to maintain an investment-grade rating and so it is interesting to see if rating transition data reflects this.
Each modified credit rating category IG, SG, as well as default D and NR, was identified with a unit vector in ℝ4. Given the relatively short time period, parameter estimates were updated with the arrival of every new observation for the 1,301 firms in the dataset. Repetition of the estimation procedure ensures that the model and estimates improve with each iteration. Estimated parameters of the model, namely, matrices A^, C^, and S^, are given in Table 1.
Table 1: Estimates of matrices A, C, and S.

Estimated matrix A
        IG      SG      D       NR
IG      0.408   0.018   0.000   0.000
SG      0.068   0.249   0.000   0.000
D       0.000   0.017   1.000   0.000
NR      0.524   0.715   0.000   0.999

Estimated matrix C
        IG      SG      D       NR
IG      0.126   0.038   0.000   0.000
SG      0.094   0.118   0.010   0.000
D       0.001   0.004   0.000   0.000
NR      0.780   0.840   0.990   1.000

Estimated matrix S (IG category)
        IG      SG      D       NR
IG      0.119   0.007   0.000   0.000
SG      0.076   0.018   0.000   0.000
D       0.000   0.001   0.000   0.000
NR      0.211   0.043   0.000   0.524

Estimated matrix S (SG category)
        IG      SG      D       NR
IG      0.007   0.032   0.000   0.000
SG      0.006   0.105   0.008   0.000
D       0.000   0.003   0.000   0.000
NR      0.008   0.108   0.009   0.715

Estimated matrix S (D category)
        IG      SG      D       NR
IG      0.000   0.000   0.000   0.000
SG      0.000   0.000   0.010   0.000
D       0.000   0.000   0.000   0.000
NR      0.000   0.000   0.990   0.000

Estimated matrix S (NR category)
        IG      SG      D       NR
IG      0.000   0.000   0.000   0.000
SG      0.000   0.000   0.000   0.000
D       0.000   0.000   0.000   0.000
NR      0.000   0.000   0.000   0.999
Considering the estimated transition matrix A^, note that entries above the diagonal correspond to rating upgrades and those below the diagonal to rating downgrades. Nonzero transition probabilities are concentrated on the diagonal and in the last row, indicating that obligors generally either maintain their rating or enter the NR (not rated) category. Our results show that investment-grade firms generally hold on to their status. The probability of downgrade to speculative-grade status is estimated as 6.8%. However, for speculative-grade firms, the probability of upgrade to investment-grade status is lower (estimated probability of 1.8%). Speculative-grade firms tend to maintain their status or disappear from the dataset through either default or a withdrawn rating. The probability of transition to the NR status is higher for speculative-grade obligors (71.5%) than for investment-grade obligors (52.4%).
Recall from Section 3 that, given estimates of matrices A and C, our Dependent HMM also provides the distribution of posted credit ratings at time k+1 given “true” credit quality at times k and k+1, namely, estimates of conditional probabilities γrji=P(Yk+1=fr∣Xk+1=ej,Xk=ei). To illustrate, consider a borrower with investment-grade “true” credit quality at times k and k+1. The probability that this borrower is assigned to a speculative-grade rating class is P(Yk+1=SG∣Xk+1=IG,Xk=IG), which, given our model parameter estimates, is given by s^SG,IG,IG/a^IG,IG=0.007/0.408=0.017. Similarly, for a borrower whose “true” credit quality improves from SG to IG, the probability of being assigned to an IG rating class is given by P(Yk+1=IG∣Xk+1=IG,Xk=SG), which we would estimate to be 0.007/0.018=0.389. These estimates again suggest that rating agencies may be somewhat reluctant to downgrade (upgrade) borrowers from (to) investment grade, thus introducing a degree of “rating momentum.”
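The two conditional probabilities quoted above are straightforward ratios of the estimates in Table 1:

```python
# gamma_rji = s_rji / a_ji, using the estimates quoted in the text.
s_SG_IG_IG, a_IG_IG = 0.007, 0.408   # s^_{SG,IG,IG} and a^_{IG,IG}
s_IG_IG_SG, a_IG_SG = 0.007, 0.018   # s^_{IG,IG,SG} and a^_{IG,SG}

p_down = s_SG_IG_IG / a_IG_IG        # P(Y_{k+1}=SG | X_{k+1}=IG, X_k=IG), about 0.017
p_up   = s_IG_IG_SG / a_IG_SG        # P(Y_{k+1}=IG | X_{k+1}=IG, X_k=SG), about 0.389
```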
6.3. Test of Independence
Recall that the Dependent HMM allows the "noise" terms in the state and observation processes to be possibly dependent. As indicated in Section 3, a convenient test of independence is to check whether the estimated parameters of the model satisfy s^rji=c^ria^ji.
Given our estimates of matrices A^ and C^, the products c^ria^ji were calculated and then compared to the corresponding entries of the estimated matrix S^ using linear regression. The regression results are given in Table 2. As indicated by the high F-statistic (4728.10) and high R2 value (98.71%), the fitted regression model is significant. The slope estimate is very close to one, with a low standard error and P value of 0.000, while the intercept estimate is very close to zero and not significant (P value of 0.91). These regression results suggest no major departures from independence, which agrees with findings in Kiefer and Larson [14] indicating that the Markov assumption, implicit in most credit risk models, does not seem to be "too wrong" for typical forecast horizons. However, longer rating histories may be necessary to verify these results.
Table 2: Linear regression results.

Regression statistics
Multiple R          0.9935
R square            0.9871
Adjusted R square   0.9868
Standard error      0.0236
Observations        64

ANOVA
            df    SS       MS       F         Significance F
Regression   1    2.6345   2.6345   4728.10   0.0000
Residual    62    0.0345   0.0006
Total       63    2.6690

            Coeff     Std err   t stat    P value   Lower 95%   Upper 95%
Intercept   −0.0003   0.0031    −0.1109   0.9121    −0.0065     0.0058
s(rji)       1.0058   0.0146    68.7612   0.0000     0.9765     1.0350
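The diagnostic can be reproduced in outline: stack the products of the estimated observation and transition probabilities against the corresponding entries of S^ as paired observations and fit a line. The matrices below are hypothetical (generated at random), not the paper's estimates, so only the shape of the computation is illustrated.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical estimates (random, NOT the paper's): columns sum to one.
A_hat = rng.dirichlet(np.ones(4), size=4).T
C_hat = rng.dirichlet(np.ones(4), size=4).T

# Build an S_hat that satisfies independence exactly, then perturb it slightly.
products = np.einsum('ri,ji->rji', C_hat, A_hat)      # c_ri * a_ji
S_hat = products + rng.normal(0.0, 1e-3, products.shape)

# Regress the 64 entries of S_hat on the corresponding products, as in Table 2.
slope, intercept = np.polyfit(products.ravel(), S_hat.ravel(), 1)
```

Under independence the fitted slope should be close to one and the intercept close to zero, mirroring the pattern in Table 2.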
7. Conclusion
We have proposed a Dependent Hidden Markov Model for the evolution of credit quality in discrete time with a Markov chain observed in martingale noise. We have applied the estimation techniques of hidden Markov models from Elliott et al. [11] to obtain the best estimate of the Markov chain representing “true” credit quality and estimates of the parameters. The estimation procedure was repeated to ensure that the model and estimates improved with each iteration. The model was applied to a dataset of Standard & Poor’s issuer ratings and our preliminary results agree with some qualitative observations made in the literature regarding credit rating systems but also indicate no significant dependence in the dynamics of the “state” (credit quality) and “observation” (credit rating) processes.
References

[1] Basel Committee on Banking Supervision, Studies on the Validation of Internal Rating Systems.
[2] A. Bangia, F. X. Diebold, A. Kronimus, C. Schagen, and T. Schuermann, "Ratings migration and the business cycle, with application to credit portfolio stress testing."
[3] D. Lando and T. M. Skødeberg, "Analyzing rating transitions and rating drift with continuous observations."
[4] G. Löffler, "An anatomy of rating through the cycle."
[5] G. Löffler, "Avoiding the rating bounce: why rating agencies are slow to react to new information."
[6] H. Frydman and T. Schuermann, "Credit rating dynamics and Markov mixture models."
[7] J. Wendin and A. J. McNeil, "Dependent credit migrations."
[8] C. Stefanescu, R. Tunaru, and S. Turnbull, "The credit rating process and estimation of transition probabilities: a Bayesian approach."
[9] D. Wozabal and R. Hochreiter, "A coupled Markov chain approach to credit risk modeling."
[10] M. W. Korolkiewicz and R. J. Elliott, "A hidden Markov model of credit quality."
[11] R. J. Elliott, L. Aggoun, and J. B. Moore, Hidden Markov Models: Estimation and Control.
[12] V. Krishnamurthy and S. H. Chung, "Signal processing based on hidden Markov models for extracting small channel currents."
[13] Standard & Poor's, Standard & Poor's COMPUSTAT (North America) User's Guide, Englewood, Colorado, 2000.
[14] N. M. Kiefer and C. E. Larson, "A simulation estimator for testing the time homogeneity of credit rating transitions."