8/19/2019 Agb Rss 2016
1/27
Challenges arising when including
macroeconomic variables in survivalmodels of default
Dr Tony Bellotti
Department of Mathematics, Imperial College [email protected]
Royal Statist ical Society, 10 February 2016
8/19/2019 Agb Rss 2016
2/27
Discrete Survival Models for Retail redit Scoring
Outline of presentation
1. Background: credit scoring models
2. Including macroeconomic variables (MEVs) using
a discrete survival model.
3. Challenges
4. Results based on mortgage data including stress
test
8/19/2019 Agb Rss 2016
3/27
Background - Credit Scoring
• Typically, risk models for retail credit are models of default.
• Hence this is a statistical classification problem.
• Almost universally, logistic regression is the model of choice inretail banks.
= 0| = = where = ∙
where ∈ 0,1 is a default indicator, is the logit link function and
is the scorecar d.
• Default models are used for various functions including application
decisions, behavioural scoring and loss provisioning.
For example, for application scoring, the bank sets a threshold ,
depending on risk appetite, then accepts new applications n iff
n > .
8/19/2019 Agb Rss 2016
4/27
Introducing macroeconomic variables (MEVs)
• More accurate risk management, sensitive to changing economic
conditions.
• Pressure from regulators (Basel Accord internationally and
Prudential Risk Authority specifically in UK) to develop models that
calibrate economic conditions against credit risk, enabling forecastsof risk during recession periods (stress test).
• Logistic regression does not naturally allow inclusion of
macroeconomic time series, but survival models do through time-
varying covariates (TVCs).
• Discrete survival model is good option since:
(1) The default (failure) event is observed at discrete intervals (ie
monthly accounting data).
(2) Computationally efficient.
8/19/2019 Agb Rss 2016
5/27
List of Challenges
1. Selection of MEVs
2. Structural form of MEVs (ie lag and transformations)
3. Time trend in MEVs
4. Correlations amongst MEVs5. Systematic effects not fully explained by MEVs
6. Change in MEV risk factors over time
7. Segmentation and interactions
8. Confounding economic effects with behavioural variables
We will illustrate these challenges in this presentation using an
example of US mortgage credit risk modelling.
8/19/2019 Agb Rss 2016
6/27
Discrete survival model structure for credit risk
Outcome on account after some discrete duration > 0:
1 = default, 0 = non-default.Typically, duration is age of the account.
Non-linear transformation of duration; Baseline hazard.
eg, = , 2, log , log 2
Static variables; eg application variables and cohort effect.
− Behavioural variables over time (with some lag ).
Date of origination of account .
Frailty term on account .
+− MEVs over calendar time (with some lag ).
+
Unknown systematic (calendar time) effect.
= 1| = 0 for < , , − , +−
=
−
+− +
where
8/19/2019 Agb Rss 2016
7/27
Model estimation
• This is a panel model structure over accounts and duration .
• Need to specify a link function . This could be logit or probit.
• Taking to be complementary log-log, ie
= 1 exp exp , yields a discrete version of the Coxproportional hazard model.
• Most of the variables are included as fixed effect terms.
• Frailty can be included as a random effect term to deal withheterogeneity.
• Maximum marginal likelihood can be used to estimate coefficients on
fixed effects (, , , , ) and variance of the random effects.
8/19/2019 Agb Rss 2016
8/27
US Mortgage data
• We will use a large data set of account-level mortgage data for
illustration.
• Freddie Mac loan-level mortgage data set.
• Origination: 1999 to 2012.
• 181,000 loans (random sample, stratified by origination).
• Default event: D180 (180 days delinquency), short sale or short
payoff prior to D180 or deed-in-lieu of foreclosure prior to D180.
8/19/2019 Agb Rss 2016
9/27
Heat map:
Default rate by calendar month and account age
150
100
50
0 Jan1999
Jan2003
Jan2007
Jan2011
Jan2013
Jan2009
A c c o u n t a g e ( m o n t h s )
8/19/2019 Agb Rss 2016
10/27
Age of mortgage effect = Hazard probability
0 50 100 150Loan age (months)
H a z a r d p r o b a b i l i t y
8/19/2019 Agb Rss 2016
11/27
Calendar time effect
• This is the estimated risk over calendar time with (full line) and without
(dashed line) age, vintage and seasonality included in the model.
• We see that not including other time components would lead to
inaccurate coefficient estimates for the MEV effects.
Jan1999
Jan2003
Jan2007
Jan2011
D e f a u l t
r i s k
8/19/2019 Agb Rss 2016
12/27
1. Consider MEVs that we would expect to have a direct effect on default.
2. National versus local MEVs.
3. Consider MEVs that are required for stress testing, as specified by
regulators or the business.
°
For this exercise: US GDP, Unemployment rate (UR), House priceindex (HPI) and interest rate (IR).
Challenge 1: Selection of MEVs
8/19/2019 Agb Rss 2016
13/27
Challenge 2: Structural form of MEVs
How should we include the MEVs in our model?
Things to consider:-
• Choice of lag structure (including possibly geometric lag);
• Whether to use difference in MEV;
• Whether to smooth the MEV time series prior to inclusion in the
model;
• Whether to include seasonally adjusted or real values;
• Cumulative effects (eg we may expect high unemployment tohave a greater effect, the longer it continues);
• Whether the MEV need to be transformed prior to inclusion in
the model (eg log transform for price index variables).
8/19/2019 Agb Rss 2016
14/27
Challenge 3: Time trends in MEVs
• Time trends in the MEVs are problematic since this could lead to
spurious correlation with default risk.
• Simulation studies in the survival model setting show time trends
in MEVs can lead to errors in coefficient estimates.
• Therefore do not include MEVs with time trends. This effects
GDP and HPI, in particular.
• Solution #1: First difference? But this will only fit default risk
against short changes in MEVs (is this just noise?).• Solution #2 : Annual difference? This is better since this gives
information regarding change in economy over a period of time;
eg GDP growth. Also, do not need to worry whether or not to
seasonally adjust.
8/19/2019 Agb Rss 2016
15/27
First attempt at modelling…
Variable Lag *
(months)
Coefficient
estimate
SE P-value Expected
sign
Age effect…
Vintage effect…
Seasonality…
IR 0 +1.80 0.029
8/19/2019 Agb Rss 2016
16/27
… try removing Δ UR
… but now we have a problem with estimate for ΔGDP.
What is going on?
Variable Lag
(months)
Coefficient
estimate
SE P-value Expected
sign
Age effect…
Vintage effect…
Seasonality… IR 0 +1.82 0.029
8/19/2019 Agb Rss 2016
17/27
Challenge 4: Correlations between MEVs
ΔHPI ΔGDP UR ΔUR
ΔHPI 1 0.564 -0.835 -0.431
ΔGDP 0.564 1 -0.728 -0.842
UR -0.835 -0.728 1 0.543
ΔUR -0.431 -0.842 0.543 1
• This correlation matrix demonstrates some very high correlations
amongst the MEVs.• Solution #1: Variable selection – but this may remove some
variables that are required in stress testing.
• Solution #2 : Factor analysis to determine macroeconomic factors
(MFs) to include in the model.
8/19/2019 Agb Rss 2016
18/27
Principal Component Analysis on MEVs
Variable MF1 MF2 MF3
ΔHPI +0.474 -0.596 -0.562
ΔGDP +0.528 0.359 0.431
UR -0.524 0.375 -0.512
ΔUR -0.471 -0.613 0.486
Proportion
of variance
74.5% 18.5% 4.5%
• The first component (MF1) represents much of the economic effect
among the MEVs.• MF1 also has an unambiguous interpretation as a measure of
economic health.
• The remaining components do not account for much of the variance
and do not have a natural interpretation, hence only MF1 will be
included in the model.
8/19/2019 Agb Rss 2016
19/27
Structure: Relationship between MF1 and default
Bad ---------------- MF1 --------------- Good
S c o r e
There is a distinct “breakpoint” in the risk profile of MF1.
This can be modelled with an interaction term.
Bad
Good
8/19/2019 Agb Rss 2016
20/27
Model with MF1
Variable Coefficient
estimate
SE P-value Expected
sign
Age effect…
Vintage effect…
Seasonality…
IR +1.83 0.030
8/19/2019 Agb Rss 2016
21/27
8/19/2019 Agb Rss 2016
22/27
Challenge 7: Economic model breakpoints
How stable are MEV risk factors?
• One question we may have is whether the effects of
MEVs on default risk are stable over time.
• In particular, after an economic regime changes.
• Use a breakpoint model to test for this…
8/19/2019 Agb Rss 2016
23/27
Breakpoint model…
D = indicator variable: 1 if calendar date before or during Feb 2006,
0 otherwise
• D × MF1 effect is not significant, indicating no evident difference in
economic effect in the two different time periods.
Variable Coefficient
estimate
SE P-value Expected
sign
Other time
effects…
MF1 -0.076 0.0059
8/19/2019 Agb Rss 2016
24/27
Further challenges
7. Segmentations / interactions
• It is plausible that different segments of the population will react to
different MEVs in different ways.
• Eg high LTV accounts may be more sensitive to changes in HPI.
• Therefore, explore different model segments or variable interactions.
8. Confounding economic effects with behavioural variables (BVs)
• We may want to include BVs as TVCs; however, the MEVs may be
confounders for these effects.
• Eg economic conditions may affect repayments, in general.
• Therefore, test for confounding and build the model in stages (eg
include MEVs in first stage, then introduce BVs).
8/19/2019 Agb Rss 2016
25/27
Validate model with back-testing
• Use Default Rate (DR) during that period to measure performance.
• Use conservatism (as 2 × s.d of unknown systematic effect).
D e f a u l t r a t e (
m o n t h l y )
Jan1999
Jan2003
Jan2007
Jan2011
Jan2013
8/19/2019 Agb Rss 2016
26/27
Result 2: Stress test results
Scenario UR GDP HPI IR
Baseline -2% over 2 years 2.5% growth per
annum
7% increase
per annum
No change
Stress Rise from 7.4% to
peak of 10.6%
over 2 years
Reduction from
+2% to -2%
growth per annum
Zero increase No change
IR rise -2% over 2 years 2.5% growth per
annum
7% increase
per annum
Average +2%
increase over 2
years
Scenario Conservatism Year 1 Year 2Baseline No 1.21% 0.83%
Stress No 1.67% 2.34%
Stress Yes 2.21% 3.20%
IR rise No 1.73% 2.76%
Projection of annual default rate:
8/19/2019 Agb Rss 2016
27/27
Conclusion
1. Including MEVs in credit risk models is valuable to enable
more accurate risk management, sensitive to economic
changes, as well as stress testing.
2. We have illustrated several challenges when including MEVs
in credit risk models.
3. And suggested several approaches to handle these
challenges, demonstrating results on a large US mortgagedata set.