Download pdf - Agb Rss 2016

  • 8/19/2019 Agb Rss 2016


    Challenges arising when including

    macroeconomic variables in survivalmodels of default

    Dr Tony Bellotti

    Department of Mathematics, Imperial College [email protected] 

    Royal Statist ical Society, 10 February 2016

    mailto:[email protected]:[email protected]

  • 8/19/2019 Agb Rss 2016


    Discrete Survival Models for Retail redit Scoring 

    Outline of presentation

    1. Background: credit scoring models

    2. Including macroeconomic variables (MEVs) using

    a discrete survival model.

    3. Challenges

    4. Results based on mortgage data including stress


  • 8/19/2019 Agb Rss 2016


    Background - Credit Scoring

    • Typically, risk models for retail credit are models of default.

    • Hence this is a statistical classification problem.

    •  Almost universally, logistic regression is the model of choice inretail banks.

    = 0| = = where =   ∙  

    where ∈ 0,1  is a default indicator,  is the logit link function and

     is the scorecar d.

    • Default models are used for various functions including application

    decisions, behavioural scoring and loss provisioning.

    For example, for application scoring, the bank sets a threshold ,

    depending on risk appetite, then accepts new applications n iff

    n   > .

  • 8/19/2019 Agb Rss 2016


    Introducing macroeconomic variables (MEVs)

    • More accurate risk management, sensitive to changing economic


    • Pressure from regulators (Basel Accord internationally and

    Prudential Risk Authority specifically in UK) to develop models that

    calibrate economic conditions against credit risk, enabling forecastsof risk during recession periods (stress test).

    • Logistic regression does not naturally allow inclusion of

    macroeconomic time series, but survival models do through time-

    varying covariates (TVCs).

    • Discrete survival model is good option since:

    (1) The default (failure) event is observed at discrete intervals (ie

    monthly accounting data).

    (2) Computationally efficient.

  • 8/19/2019 Agb Rss 2016


    List of Challenges

    1. Selection of MEVs

    2. Structural form of MEVs (ie lag and transformations)

    3. Time trend in MEVs

    4. Correlations amongst MEVs5. Systematic effects not fully explained by MEVs

    6. Change in MEV risk factors over time

    7. Segmentation and interactions

    8. Confounding economic effects with behavioural variables

    We will illustrate these challenges in this presentation using an

    example of US mortgage credit risk modelling.

  • 8/19/2019 Agb Rss 2016


    Discrete survival model structure for credit risk

      Outcome on account  after some discrete duration > 0:

    1 = default, 0 = non-default.Typically, duration  is age of the account.


      Non-linear transformation of duration; Baseline hazard.

    eg, = , 2, log , log   2  

      Static variables; eg application variables and cohort effect. 

    −   Behavioural variables over time (with some lag ). 

      Date of origination of account . 

      Frailty term on account . 

    +−  MEVs over calendar time    (with some lag ). 

    Unknown systematic (calendar time) effect. 

     = 1| = 0 for < , , −   , +−  



    +−  +  


  • 8/19/2019 Agb Rss 2016


    Model estimation

    • This is a panel model structure over accounts  and duration .

    • Need to specify a link function . This could be logit or probit.

    • Taking  to be complementary log-log, ie

    = 1 exp exp , yields a discrete version of the Coxproportional hazard model.

    • Most of the variables are included as fixed effect terms.

    • Frailty  can be included as a random effect term to deal withheterogeneity.

    • Maximum marginal likelihood can be used to estimate coefficients on

    fixed effects (, , , , ) and variance of the random effects.

  • 8/19/2019 Agb Rss 2016


    US Mortgage data

    • We will use a large data set of account-level mortgage data for


    • Freddie Mac loan-level mortgage data set.

    • Origination: 1999 to 2012.

    • 181,000 loans (random sample, stratified by origination).

    • Default event: D180 (180 days delinquency), short sale or short

    payoff prior to D180 or deed-in-lieu of foreclosure prior to D180.

  • 8/19/2019 Agb Rss 2016


    Heat map:

    Default rate by calendar month and account age




    0 Jan1999






       A  c  c  o  u  n   t  a  g  e   (  m  o  n   t   h  s   )

  • 8/19/2019 Agb Rss 2016


     Age of mortgage effect = Hazard probability

    0 50 100 150Loan age (months)

       H  a  z  a  r   d  p  r  o   b  a   b   i   l   i   t  y

  • 8/19/2019 Agb Rss 2016


    Calendar time effect

    • This is the estimated risk over calendar time with (full line) and without

    (dashed line) age, vintage and seasonality included in the model.

    • We see that not including other time components would lead to

    inaccurate coefficient estimates for the MEV effects.





       D  e   f  a  u   l   t

      r   i  s   k

  • 8/19/2019 Agb Rss 2016


    1. Consider MEVs that we would expect to have a direct effect on default.

    2. National versus local MEVs.

    3. Consider MEVs that are required for stress testing, as specified by

    regulators or the business.


    For this exercise: US GDP, Unemployment rate (UR), House priceindex (HPI) and interest rate (IR).

    Challenge 1: Selection of MEVs

  • 8/19/2019 Agb Rss 2016


    Challenge 2: Structural form of MEVs

    How should we include the MEVs in our model?

    Things to consider:-

    • Choice of lag structure (including possibly geometric lag);

    • Whether to use difference in MEV;

    • Whether to smooth the MEV time series prior to inclusion in the


    • Whether to include seasonally adjusted or real values;

    • Cumulative effects (eg we may expect high unemployment tohave a greater effect, the longer it continues);

    • Whether the MEV need to be transformed prior to inclusion in

    the model (eg log transform for price index variables).

  • 8/19/2019 Agb Rss 2016


    Challenge 3: Time trends in MEVs

    • Time trends in the MEVs are problematic since this could lead to

    spurious correlation with default risk.

    • Simulation studies in the survival model setting show time trends

    in MEVs can lead to errors in coefficient estimates.

    • Therefore do not include MEVs with time trends. This effects

    GDP and HPI, in particular.

    • Solution #1: First difference? But this will only fit default risk

    against short changes in MEVs (is this just noise?).• Solution #2 : Annual difference? This is better since this gives

    information regarding change in economy over a period of time;

    eg GDP growth. Also, do not need to worry whether or not to

    seasonally adjust.

  • 8/19/2019 Agb Rss 2016


    First attempt at modelling… 

    Variable Lag *




    SE P-value Expected


     Age effect… 

    Vintage effect… 


    IR 0 +1.80 0.029

  • 8/19/2019 Agb Rss 2016


    … try removing Δ UR

    … but now we have a problem with estimate for ΔGDP.

    What is going on?

    Variable Lag




    SE P-value Expected


     Age effect… 

    Vintage effect… 

    Seasonality… IR 0 +1.82 0.029

  • 8/19/2019 Agb Rss 2016


    Challenge 4: Correlations between MEVs


     ΔHPI 1 0.564 -0.835 -0.431

     ΔGDP 0.564 1 -0.728 -0.842

    UR -0.835 -0.728 1 0.543

     ΔUR -0.431 -0.842 0.543 1

    • This correlation matrix demonstrates some very high correlations

    amongst the MEVs.• Solution #1: Variable selection – but this may remove some

    variables that are required in stress testing.

    • Solution #2 : Factor analysis to determine macroeconomic factors

    (MFs) to include in the model.

  • 8/19/2019 Agb Rss 2016


    Principal Component Analysis on MEVs

    Variable MF1 MF2 MF3

     ΔHPI +0.474 -0.596 -0.562

     ΔGDP +0.528 0.359 0.431

    UR -0.524 0.375 -0.512

     ΔUR -0.471 -0.613 0.486


    of variance

    74.5% 18.5% 4.5%

    • The first component (MF1) represents much of the economic effect

    among the MEVs.• MF1 also has an unambiguous interpretation as a measure of

    economic health.

    • The remaining components do not account for much of the variance

    and do not have a natural interpretation, hence only MF1 will be

    included in the model.

  • 8/19/2019 Agb Rss 2016


    Structure: Relationship between MF1 and default

    Bad ---------------- MF1 --------------- Good

       S  c  o  r  e

    There is a distinct “breakpoint” in the risk profile of MF1.

    This can be modelled with an interaction term.



  • 8/19/2019 Agb Rss 2016


    Model with MF1

    Variable Coefficient


    SE P-value Expected


     Age effect… 

    Vintage effect… 


    IR +1.83 0.030

  • 8/19/2019 Agb Rss 2016


  • 8/19/2019 Agb Rss 2016


    Challenge 7: Economic model breakpoints

    How stable are MEV risk factors?

    • One question we may have is whether the effects of

    MEVs on default risk are stable over time.

    • In particular, after an economic regime changes.

    • Use a breakpoint model to test for this… 

  • 8/19/2019 Agb Rss 2016


    Breakpoint model… 

    D = indicator variable: 1 if calendar date before or during Feb 2006,

    0 otherwise

    • D × MF1 effect is not significant, indicating no evident difference in

    economic effect in the two different time periods.

    Variable Coefficient


    SE P-value Expected


    Other time


    MF1 -0.076 0.0059

  • 8/19/2019 Agb Rss 2016


    Further challenges

    7. Segmentations / interactions

    • It is plausible that different segments of the population will react to

    different MEVs in different ways.

    • Eg high LTV accounts may be more sensitive to changes in HPI.

    • Therefore, explore different model segments or variable interactions.

    8. Confounding economic effects with behavioural variables (BVs)

    • We may want to include BVs as TVCs; however, the MEVs may be

    confounders for these effects.

    • Eg economic conditions may affect repayments, in general.

    • Therefore, test for confounding and build the model in stages (eg

    include MEVs in first stage, then introduce BVs).

  • 8/19/2019 Agb Rss 2016


     Validate model with back-testing

    • Use Default Rate (DR) during that period to measure performance.

    • Use conservatism (as 2 × s.d of unknown systematic effect).

       D  e   f  a  u   l   t  r  a   t  e   (

      m  o  n   t   h   l  y   )






  • 8/19/2019 Agb Rss 2016


    Result 2: Stress test results

    Scenario UR GDP HPI IR

    Baseline -2% over 2 years 2.5% growth per


    7% increase

    per annum

    No change

    Stress Rise from 7.4% to

    peak of 10.6%

    over 2 years

    Reduction from

    +2% to -2%

    growth per annum

    Zero increase No change

    IR rise -2% over 2 years 2.5% growth per


    7% increase

    per annum

     Average +2%

    increase over 2


    Scenario Conservatism Year 1 Year 2Baseline No 1.21% 0.83%

    Stress No 1.67% 2.34%

    Stress Yes 2.21% 3.20%

    IR rise No 1.73% 2.76%

    Projection of annual default rate:

  • 8/19/2019 Agb Rss 2016



    1. Including MEVs in credit risk models is valuable to enable

    more accurate risk management, sensitive to economic

    changes, as well as stress testing.

    2. We have illustrated several challenges when including MEVs

    in credit risk models.

    3. And suggested several approaches to handle these

    challenges, demonstrating results on a large US mortgagedata set.