A. Ruckstuhl -- WBL 2019, Lecture 5 of SAoFD

ETH Zürich, WBL Angewandte Statistik 2019
Block course

Statistical Analysis of Financial Data
Lecture 5

Prof. Dr. Andreas Ruckstuhl
Lecturer in Statistical Data Analysis
Institut für Datenanalyse und Prozess Design IDP
Zürcher Hochschule für Angewandte Wissenschaften ZHAW
[email protected]
28 January 2019




Outline of Lecture 5

• Lecture 1: Financial Data and Their Properties

• Lecture 2: Model for Conditional Heteroskedasticity and Risk Measures

• Lecture 3: Statistical Issues When Applying Portfolio Theory

• Lecture 4: (Financial) Factor Models

• Lecture 5: Copulas
  – Basic Theory
  – Gaussian and t-Copulas
  – Archimedean Copulas
  – Rank Correlation
  – Tail Dependence
  – Calibrating Copulas (i.e., Estimating Copulas)
  – *Vine Copula


8.1 Introduction

The statistical analysis of multivariate data is classically based on the multivariate normal distribution.

• As a consequence, we only need to estimate
  – the mean (expectation) vector and
  – the covariance matrix
  from the data.

• The stochastic dependence between two normally distributed variables is described by their covariance. The covariance between two random variables depends on their variances as well as on the strength of the linear relationship between them (see also slide 6, Part 3).

• To understand the relationship between variables, it is much better to examine their (sample) correlation matrix.


Example: Daily returns on two equities, GE and IBM, and on CRSP, a value-weighted index including dividends (Jan 3, 1969 to Dec 31, 1998)

Most people use the correlation matrix for examination:
  – All sample correlations are positive
  – The largest correlation is between CRSP and GE
  – …

• The correlation coefficient is only a summary of the linear relationship between variables.

• Interesting features, such as nonlinearity or the joint behaviour of extreme values, remain hidden.

> round(cov(Stocks),6)
           ge      ibm    CRSP
ge   0.000188 0.000080 7.6e-05
ibm  0.000080 0.000306 6.6e-05
CRSP 0.000076 0.000066 6.0e-05

> round(cor(Stocks),3)
        ge   ibm  CRSP
ge   1.000 0.334 0.715
ibm  0.334 1.000 0.486
CRSP 0.715 0.486 1.000
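The link between the two matrices can be checked by hand: each covariance divided by the product of the two standard deviations gives the correlation. The sketch below re-enters the rounded covariance matrix printed above (the object names `S`, `sds`, `R_manual` and `R` are ours, not from the lecture) and compares a manual computation with base R's `cov2cor()`. Because the inputs are rounded to six decimals, the result should agree with the printed correlations only up to about the third decimal.

```r
# Rounded sample covariance matrix of the daily returns, copied from the output above
S <- matrix(c(0.000188, 0.000080, 0.000076,
              0.000080, 0.000306, 0.000066,
              0.000076, 0.000066, 0.000060),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("ge", "ibm", "CRSP"), c("ge", "ibm", "CRSP")))

# Correlation = covariance scaled by both standard deviations
sds <- sqrt(diag(S))
R_manual <- S / outer(sds, sds)

# Base R performs the same scaling with cov2cor()
R <- cov2cor(S)
print(round(R, 3))
```

The covariances differ by up to a factor of five across pairs mainly because the variances differ; after scaling, only the strength of the linear relationship remains.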


A way out of this shortcoming is the so-called scatterplot matrix:

• One sees little evidence of nonlinear relationships, but some structures caused by “non-normal” (i.e., non-Gaussian) events.

• This lack of nonlinearities is typical of returns on equities, but it should not be taken for granted – instead, one should always look at the scatterplot matrix.


8.2 Basic Theory

Modelling multivariate distributions beyond the normal distribution needs a new approach to modelling dependencies (instead of correlations).

A copula is a multivariate cumulative distribution function (CDF) whose univariate marginal distributions are all Uniform(0,1).

Remember: If Y has a continuous CDF F, then F(Y) has a Uniform(0,1) distribution.

• Suppose $Y = (Y_1, \dots, Y_d)^T$ has a multivariate CDF $F_Y$ with continuous univariate marginal CDFs $F_{Y_1}, \dots, F_{Y_d}$.

  – Then the CDF of $(F_{Y_1}(Y_1), \dots, F_{Y_d}(Y_d))^T$ is called the copula of Y and denoted by $C_Y$, that is

    $$C_Y(u_1, \dots, u_d) = P\{F_{Y_1}(Y_1) \le u_1, \dots, F_{Y_d}(Y_d) \le u_d\} = F_Y\{F_{Y_1}^{-1}(u_1), \dots, F_{Y_d}^{-1}(u_d)\}$$

  – $C_Y$ contains all information about the dependencies among the components of Y, but no information about the marginal CDFs of Y.
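The reminder that F(Y) is Uniform(0,1) can be checked by simulation. The following is a minimal sketch with an arbitrarily chosen margin (exponential with rate 2; the choice and all object names are ours). It also shows that a strictly increasing transformation of Y leaves the ranks untouched, which is the intuition behind the copula carrying no information about the margins.

```r
set.seed(2019)
n <- 10000

# Y has a continuous CDF F (here: exponential with rate 2, an arbitrary choice)
y <- rexp(n, rate = 2)
u <- pexp(y, rate = 2)          # U = F(Y)

# U should look Uniform(0,1): mean close to 1/2, variance close to 1/12
u_summary <- c(mean = mean(u), var = var(u))
print(u_summary)

# A strictly increasing transformation changes the margin but not the ranks
same_ranks <- all(rank(y) == rank(log(y)))
print(same_ranks)
```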


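• Sklar’s Theorem “justifies” the copula approach. It states that the joint CDF $F_Y$ can be decomposed into

  – the copula $C_Y$, which contains all information about the dependencies among $(Y_1, \dots, Y_d)$, and

  – the univariate marginal CDFs $F_{Y_1}, \dots, F_{Y_d}$, which contain all information about the univariate marginal distributions.

• That is,

  $$F_Y(y_1, \dots, y_d) = C_Y\{F_{Y_1}(y_1), \dots, F_{Y_d}(y_d)\}$$

  Motivation: because each $F_{Y_j}$ is continuous and non-decreasing,

  $$P(Y_1 \le y_1, \dots, Y_d \le y_d) = P\{F_{Y_1}(Y_1) \le F_{Y_1}(y_1), \dots, F_{Y_d}(Y_d) \le F_{Y_d}(y_d)\}$$

  and the right-hand side is the CDF of $(F_{Y_1}(Y_1), \dots, F_{Y_d}(Y_d))^T$ evaluated at $u_j = F_{Y_j}(y_j)$.

• For calibration we need an expression in terms of the joint density:

  $$f_Y(y_1, \dots, y_d) = c_Y\{F_{Y_1}(y_1), \dots, F_{Y_d}(y_d)\}\, f_{Y_1}(y_1) \cdots f_{Y_d}(y_d)$$

  where $c_Y$ is the density of $C_Y$ and $f_{Y_j}$ is the density of $Y_j$.

• The independence copula:

  $$C_0(u_1, \dots, u_d) = u_1 u_2 \cdots u_d$$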
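As a small illustration of Sklar's decomposition in the simplest case, the sketch below simulates two independent variables with different margins and compares the empirical joint CDF at one point with the independence copula applied to the marginal CDFs. The margins (standard Gaussian and unit exponential), the evaluation point and all object names are assumptions made for this example only.

```r
set.seed(7)
n  <- 20000
y1 <- rnorm(n)                 # margin 1: standard Gaussian
y2 <- rexp(n)                  # margin 2: unit exponential, independent of y1

# Independence copula C0(u1, u2) = u1 * u2
C0 <- function(u1, u2) u1 * u2

# Joint CDF at an arbitrary point (a, b): empirical versus copula-based
a <- 0.5
b <- 1.2
F_emp   <- mean(y1 <= a & y2 <= b)
F_sklar <- C0(pnorm(a), pexp(b))
print(c(empirical = F_emp, sklar = F_sklar))
```

The two numbers should agree up to simulation noise; with dependent variables, only the function `C0` would have to be replaced by another copula.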


Example: Scatterplot of the three daily return series (GE, IBM and CRSP) (cf. slide 5) transformed to uniform marginals


8.3 Gaussian and t-Copulas

To model multivariate dependencies with copulas, we need parametric families of copulas.

Let Y have a multivariate Gaussian distribution.

→ The dependencies within Y are described by the correlation matrix Ω of Y, since the copula does not depend on the univariate marginal distributions (including their variances).

→ Hence, the copula $C_Y$ depends only on the correlation matrix Ω of Y.

• The Gaussian copula with correlation matrix Ω will be denoted $C^{\mathrm{Gauss}}(u_1, \dots, u_d \mid \Omega)$.

• A distribution with a Gaussian copula is called a meta-Gaussian distribution. This does not mean that Y has a multivariate Gaussian distribution, since the univariate marginal distributions could be any distributions at all.

• Example: Meta-Gaussian distribution in 2 dimensions: transform the marginal distributions to uniformity and apply a standard two-dimensional Gaussian distribution with correlation ρ to them:

  $$C^{\mathrm{Gauss}}(u_1, u_2 \mid \rho) = \Phi_2\{\Phi^{-1}(u_1), \Phi^{-1}(u_2), \rho\}$$

  where $\Phi^{-1}(\cdot)$ is the inverse of the standard Gaussian CDF and $\Phi_2(\cdot, \cdot, \rho)$ is the joint CDF of two standard Gaussian variables with correlation ρ.


Example in R: Gaussian copula for 2 dim – different marginals

• library(copula)
  nC1 <- normalCopula(0.8, dim=2)   # define the copula model
  set.seed(5640)
  rnC1 <- rCopula(1000, nC1)        # generate 1000 random numbers

• Margins: Gamma and beta distributed
  x <- qgamma(rnC1[,1], 2)
  y <- qbeta(rnC1[,2], 1, 2)
  plot(x, y)   # they correlate!
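For intuition, the same construction can be sketched in base R without the copula package. This illustrates the definition of a meta-Gaussian distribution; it is not meant to describe how `rCopula()` works internally, and the object names (`Omega`, `z`, `u`) are ours.

```r
set.seed(5640)
n   <- 1000
rho <- 0.8

# Step 1: bivariate standard Gaussian sample with correlation rho
Omega <- matrix(c(1, rho, rho, 1), nrow = 2)
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(Omega)

# Step 2: the Gaussian CDF turns each column into a Uniform(0,1) margin;
# the dependence that remains is the Gaussian copula
u <- pnorm(z)

# Step 3: impose arbitrary margins via their quantile functions
x <- qgamma(u[, 1], shape = 2)
y <- qbeta(u[, 2], shape1 = 1, shape2 = 2)

# The margins changed, the rank-based dependence did not
spearman_u  <- cor(u[, 1], u[, 2], method = "spearman")
spearman_xy <- cor(x, y, method = "spearman")
print(c(spearman_u = spearman_u, spearman_xy = spearman_xy))
```

Since `qgamma()` and `qbeta()` are increasing, they preserve the ranks, so the two Spearman correlations coincide even though the Pearson correlations of (u1, u2) and (x, y) generally differ.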


Meta-Gaussian distributions became enormously popular ... in the end causing problems because the market believed in them too strongly (→ 2007 subprime crisis).

→ “The Gauss-copula is the worst invention ever for credit risk management” (a risk probabilist); see also http://www.wired.com/techbiz/it/magazine/17-03/wp_quant?currentPage=all

Why? It does not capture tail dependence (cf. Sec. 8.5 / slide 20 ff.)

t-copula

Similarly, let $C^{t}(u_1, \dots, u_d \mid \nu, \Omega)$ be the copula of a multivariate t-distribution with correlation matrix Ω and degrees of freedom ν (the same for all dimensions). The shape parameter ν affects both the univariate marginal distributions and the copula, so ν is a parameter of the copula.

A distribution with a t-copula is called a t-meta distribution.

> library(copula)
> tC1.3 <- tCopula(c(-.6,0.75,0), dim=3, dispstr="un", df=1)
> set.seed(5640)
> r.tC1.3 <- rCopula(500, tC1.3)
> pairs(r.tC1.3)
> cor(r.tC1.3)
           [,1]        [,2]        [,3]
[1,]  1.0000000 -0.49562992  0.72293656
[2,] -0.4956299  1.00000000 -0.04395247
[3,]  0.7229366 -0.04395247  1.00000000

## three correlation parameters require the unstructured form dispstr="un"
> tC25.3 <- tCopula(c(-0.6,0.75,0), dim=3, dispstr="un", df=25)
> set.seed(5640)
> r.tC25.3 <- rCopula(500, tC25.3)
> pairs(r.tC25.3)
> cor(r.tC25.3)
           [,1]       [,2]       [,3]
[1,]  1.0000000 -0.5702485  0.7576589
[2,] -0.5702485  1.0000000 -0.0279640
[3,]  0.7576589 -0.0279640  1.0000000


8.4 Archimedean Copulas

An Archimedean copula with a strict generator has the form

  $$C(u_1, \dots, u_d) = g^{-1}\{g(u_1) + \cdots + g(u_d)\}, \qquad (†)$$

where g() is the generator of the copula and satisfies
  1. g is a continuous, strictly decreasing, convex function mapping [0,1] onto [0,∞]
  2. g(0) = ∞ and g(1) = 0

Notice that in (†), the value of C(u1, …, ud) is unchanged if we permute u1, …, ud.

One consequence of this property (called exchangeability) is that rank correlations (see later) are the same for all pairs of variables.

Archimedean copulas are most useful in the bivariate case or in applications where we expect all pairs to have similar dependencies.


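Three generators for Archimedean copulas:

• Frank copula
  Generator: $g_{Fr}(u \mid \theta) = -\log\left\{\frac{e^{-\theta u} - 1}{e^{-\theta} - 1}\right\}$, $-\infty < \theta < \infty$
  Inverse generator: $g_{Fr}^{-1}(y \mid \theta) = -\frac{1}{\theta}\log\{e^{-y}(e^{-\theta} - 1) + 1\}$

  → bivariate Frank copula, using (†):

  $$C_{Fr}(u_1, u_2 \mid \theta) = -\frac{1}{\theta}\log\left\{1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)}{e^{-\theta} - 1}\right\}$$

  – “θ=0” requires some care. Evaluate the limit of $C_{Fr}(u_1, u_2 \mid \theta)$ as θ → 0, which yields the (bivariate) independence copula $C_0$:

  $$\lim_{\theta \to 0} C_{Fr}(u_1, u_2 \mid \theta) = u_1 u_2$$

• Clayton copula
  Generator: $g_{Cl}(u \mid \theta) = \frac{1}{\theta}(u^{-\theta} - 1)$, θ > 0

  → Clayton copula:

  $$C_{Cl}(u_1, \dots, u_d \mid \theta) = (u_1^{-\theta} + \cdots + u_d^{-\theta} - d + 1)^{-1/\theta}$$

  The Clayton copula for θ=0 is defined as the independence copula $C_0$. For -1 < θ < 0 the generator is not strict and there is a region without probability mass; e.g., for d=2, $u_1^{-\theta} + u_2^{-\theta} - 1 < 0$ occurs if u1 and u2 are both small. In these cases, the Clayton copula is set equal to 0.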


• Gumbel copula
  Generator: $g_{Gu}(u \mid \theta) = (-\log u)^{\theta}$, θ ≥ 1
  Gumbel copula:

  $$C_{Gu}(u_1, \dots, u_d \mid \theta) = \exp\left[-\{(-\log u_1)^{\theta} + \cdots + (-\log u_d)^{\theta}\}^{1/\theta}\right]$$

  The Gumbel copula is the independence copula when θ=1, but it cannot have negative dependence.

• Simulations
  On the next slides, simulations of bivariate samples are shown using scatterplots.
  – All simulations use a sample size of 200
  – and values of θ that give dependencies ranging from strongly negative (if possible) to strongly positive
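The Archimedean construction C(u1, u2) = g^{-1}(g(u1) + g(u2)) translates almost literally into code. The sketch below is a base-R illustration using the standard generator forms of the Frank, Clayton (θ > 0) and Gumbel (θ ≥ 1) families; the function names (`arch_copula`, `frank`, `clayton`, `gumbel`) are ours and are not functions of the copula package.

```r
# Generic bivariate Archimedean copula built from a generator g and its inverse
arch_copula <- function(u1, u2, g, g_inv) g_inv(g(u1) + g(u2))

frank <- function(u1, u2, theta) {            # theta != 0
  g     <- function(u) -log((exp(-theta * u) - 1) / (exp(-theta) - 1))
  g_inv <- function(y) -log(exp(-y) * (exp(-theta) - 1) + 1) / theta
  arch_copula(u1, u2, g, g_inv)
}

clayton <- function(u1, u2, theta) {          # theta > 0
  g     <- function(u) (u^(-theta) - 1) / theta
  g_inv <- function(y) (1 + theta * y)^(-1 / theta)
  arch_copula(u1, u2, g, g_inv)
}

gumbel <- function(u1, u2, theta) {           # theta >= 1
  g     <- function(u) (-log(u))^theta
  g_inv <- function(y) exp(-y^(1 / theta))
  arch_copula(u1, u2, g, g_inv)
}

# Near the independence case all three should be close to u1 * u2 = 0.21
indep_check <- c(frank   = frank(0.3, 0.7, 1e-4),
                 clayton = clayton(0.3, 0.7, 1e-4),
                 gumbel  = gumbel(0.3, 0.7, 1))
print(indep_check)

# Exchangeability: permuting the arguments does not change the value
print(frank(0.3, 0.7, 5) == frank(0.7, 0.3, 5))
```

Writing the three families through one helper makes the exchangeability property visible: `arch_copula()` only ever sees the sum g(u1) + g(u2).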


• Random samples from Frank copulas


• Random samples from Clayton copulas


• Random samples from Gumbel copulas


• Some remarks on the simulation results:
  – The Frank and Clayton copulas are rather different when the amount of dependence is somewhere between the extremes.
    • In particular, the Clayton copula's exclusion of the region $u_1^{-\theta} + u_2^{-\theta} - 1 < 0$ when -1 < θ < 0 is evident, especially in the example with θ = -0.7.
    • The Frank copula is symmetric about the diagonal from (0, 1) to (1, 0), but the Clayton copula does not have this symmetry.
  – It is clearly seen that the Gumbel copula cannot model negative dependencies.
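Samples like those in the scatterplots can be generated in several ways. One possibility for the bivariate Clayton copula with θ > 0 is conditional inversion: draw u1 and an independent uniform w, then solve ∂C/∂u1 = w for u2. The sketch below is our own base-R illustration (the function name `rclayton2` is hypothetical, and this is not necessarily how the slides' figures were produced). As a plausibility check it uses the standard result, not stated on the slides, that Kendall's tau of a Clayton copula equals θ/(θ+2).

```r
# Sample n pairs from a bivariate Clayton copula (theta > 0) by conditional inversion
rclayton2 <- function(n, theta) {
  u1 <- runif(n)
  w  <- runif(n)   # value of the conditional CDF of u2 given u1
  u2 <- (u1^(-theta) * (w^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)
  cbind(u1 = u1, u2 = u2)
}

set.seed(200)
theta <- 2
U <- rclayton2(2000, theta)

# Sample Kendall's tau versus the theoretical value theta / (theta + 2)
tau_hat <- cor(U[, 1], U[, 2], method = "kendall")
print(c(tau_hat = tau_hat, tau_theory = theta / (theta + 2)))
```

A call such as `plot(rclayton2(200, 2))` gives a scatterplot with the sample size used on the slides; the points cluster in the lower-left corner, which is the asymmetry mentioned in the remarks.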


8.5 Tail Dependence

• Tail dependence measures the association between the extreme values of two random variables and depends only on their copula.

Suppose that $Y = (Y_1, Y_2)^T$ is a bivariate random vector with copula $C_Y$.

Lower Tail Dependence

• The coefficient of lower tail dependence is denoted by $\lambda_l$ and defined as

  $$\begin{aligned}
  \lambda_l &:= \lim_{q \downarrow 0} P\{Y_2 \le F_{Y_2}^{-1}(q) \mid Y_1 \le F_{Y_1}^{-1}(q)\} \\
  &= \lim_{q \downarrow 0} \frac{P\{Y_2 \le F_{Y_2}^{-1}(q) \text{ and } Y_1 \le F_{Y_1}^{-1}(q)\}}{P\{Y_1 \le F_{Y_1}^{-1}(q)\}} \\
  &= \lim_{q \downarrow 0} \frac{P\{F_{Y_2}(Y_2) \le q \text{ and } F_{Y_1}(Y_1) \le q\}}{P\{F_{Y_1}(Y_1) \le q\}} \\
  &= \lim_{q \downarrow 0} \frac{C_Y(q, q)}{q}
  \end{aligned}$$

• If $Y_1$ and $Y_2$ are independent, then $C_Y(q, q) = q^2$ and hence $\lambda_l = \lim_{q \downarrow 0} q^2 / q = 0$. Therefore, $\lambda_l = 0$ implies that in the extreme left tail, $Y_1$ and $Y_2$ behave as if they were independent.


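• Upper Tail Dependence
  The coefficient of upper tail dependence is defined analogously:

  $$\lambda_u := \lim_{q \uparrow 1} P\{Y_2 \ge F_{Y_2}^{-1}(q) \mid Y_1 \ge F_{Y_1}^{-1}(q)\} = 2 - \lim_{q \uparrow 1} \frac{1 - C_Y(q, q)}{1 - q}$$

• For Gaussian and t-copulas, $\lambda_l = \lambda_u$.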

• Knowing whether or not there is tail dependence is important for risk management.

– If there are no tail dependencies among the returns on the assets in a portfolio, then there is little risk of clusters of very negative returns, and the risk of an extreme negative return on the portfolio is low.

– Conversely, if there are tail dependencies, then the likelihood of extreme negative returns occurring simultaneously on several assets in the portfolio can be high.


Tail Dependence for Gaussian and t-Copulas

• Gaussian copula: $\lambda_l = 0$ if the correlation is ρ ≠ 1. That is, Gaussian copulas do not have tail dependence except in the case ρ = 1.

• t-copula with correlation ρ:

  $$\lambda_l = 2\, F_{t,\nu+1}\left\{-\sqrt{\frac{(\nu + 1)(1 - \rho)}{1 + \rho}}\right\}$$

  where $F_{t,\nu+1}$ is the CDF of the t-distribution with ν+1 degrees of freedom.

  – Since $F_{t,\nu+1}(-\infty) = 0$, we see that $\lambda_l \to 0$ if ν → ∞ (i.e., consistent with the Gaussian copula)

– Left: Coefficients of tail-dependence for some t-copulas
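The tail-dependence coefficient of a t-copula is easy to evaluate numerically with `pt()`. The sketch below assumes the standard formula λ = 2 F_{t,ν+1}(−sqrt((ν+1)(1−ρ)/(1+ρ))); the function name `lambda_t` and the grid of ν values are ours and need not match the curves in the figure.

```r
# Coefficient of (lower = upper) tail dependence of a t-copula
lambda_t <- function(rho, nu) {
  2 * pt(-sqrt((nu + 1) * (1 - rho) / (1 + rho)), df = nu + 1)
}

# Tail dependence decreases as the degrees of freedom grow (rho fixed at 0.5)
nu_grid <- c(1, 4, 25, 250)
lam <- lambda_t(rho = 0.5, nu = nu_grid)
print(round(setNames(lam, paste0("nu=", nu_grid)), 4))

# Even with zero correlation a t-copula with few degrees of freedom has tail dependence
print(lambda_t(rho = 0, nu = 4))
```

A call such as `curve(lambda_t(x, nu = 4), from = -1, to = 1)` plots the coefficient against ρ for one value of ν.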


• Tail Dependence of Archimedean copulas
  In the case d=2, it can be shown that

  – The Gumbel copula is able to model upper tail dependence, but not lower: $\lambda_l = 0$ and $\lambda_u = 2 - 2^{1/\theta}$, whereas

  – the Clayton copula (θ > 0) can model lower tail dependence, but not upper: $\lambda_l = 2^{-1/\theta}$ and $\lambda_u = 0$.

  – The Frank copula is not able to model either tail dependence.

• Financial markets: losses occur more often jointly than gains do
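These limits can be checked numerically from the copula formulas. The sketch below is our own illustration in base R: it evaluates C(q, q)/q for q close to 0 (lower tail) and the standard expression 2 − (1 − C(q, q))/(1 − q) for q close to 1 (upper tail), using the usual closed forms of the bivariate Clayton and Gumbel copulas. The value θ = 2 and the cut-offs 1e-6 are arbitrary choices.

```r
clayton <- function(u1, u2, theta) (u1^(-theta) + u2^(-theta) - 1)^(-1 / theta)
gumbel  <- function(u1, u2, theta) {
  exp(-((-log(u1))^theta + (-log(u2))^theta)^(1 / theta))
}

theta  <- 2
q_low  <- 1e-6        # "q close to 0"
q_high <- 1 - 1e-6    # "q close to 1"

# Lower tail: C(q, q) / q for small q
lower_clayton <- clayton(q_low, q_low, theta) / q_low
lower_gumbel  <- gumbel(q_low, q_low, theta) / q_low

# Upper tail: 2 - (1 - C(q, q)) / (1 - q) for q near 1
upper_gumbel  <- 2 - (1 - gumbel(q_high, q_high, theta)) / (1 - q_high)
upper_clayton <- 2 - (1 - clayton(q_high, q_high, theta)) / (1 - q_high)

print(rbind(numeric = c(lower_clayton, upper_gumbel),
            limit   = c(2^(-1 / theta), 2 - 2^(1 / theta))))
```

With θ = 2 the Clayton copula gives a lower coefficient of about 0.71 and the Gumbel copula an upper coefficient of about 0.59, while the opposite tails are close to zero. This is why the Clayton copula is the natural candidate when joint losses are more frequent than joint gains.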


8.6 Calibrating Copulas (i.e., Estimating Copulas)

Assume that we have an i.i.d. sample $Y_i = (Y_{i,1}, \dots, Y_{i,d})^T$, $i = 1, \dots, n$, and we wish to estimate the joint distribution using a copula approach.

• An important task is choosing a copula model. An appropriate copula model can be selected using graphical techniques as well as with AIC (see below) after fitting.

• The fitting of the selected model is based on the log-likelihood of the joint density (cf. slide 7):

  $$\log L(\theta_1, \dots, \theta_d, \theta_C) = \sum_{i=1}^{n} \Big[ \log c_Y\{F_{Y_1}(Y_{i,1} \mid \theta_1), \dots, F_{Y_d}(Y_{i,d} \mid \theta_d) \mid \theta_C\} + \sum_{j=1}^{d} \log f_{Y_j}(Y_{i,j} \mid \theta_j) \Big]$$

  where $\theta_j$ are the parameters of the j-th marginal distribution and $\theta_C$ are the parameters of the copula.

• A direct maximum likelihood estimation may be too challenging because
  – the large number of parameters causes numerical problems
  – if any of the marginals is not well fitted by a convenient parametric family, this may cause bias in the estimated parameters of both the marginals and the copula

• A pseudo-maximum likelihood estimation may be more convenient. It is a two-step process, where each part (the copula term, red on the slide, and the marginal terms, green on the slide) is estimated separately.


• In a first step, each of the d marginal distribution functions is estimated, one at a time (green part): → $\hat F_{Y_j}$, $j = 1, \dots, d$

  There are two approaches, parametric and nonparametric.
  – Parametric: fit a convenient parametric distribution function to each of the marginal distributions using maximum likelihood estimators
  – Nonparametric: the marginal distribution is estimated by the empirical CDF:

    $$\hat F_{Y_j}(y) = \frac{1}{n + 1} \sum_{i=1}^{n} I\{Y_{i,j} \le y\}$$

    [The divisor (n+1) ensures that the maximum value of $\hat F_{Y_j}(Y_{i,j})$ is n/(n+1) < 1]

• In a second step,

  $$\sum_{i=1}^{n} \log c_Y\{\hat F_{Y_1}(Y_{i,1}), \dots, \hat F_{Y_d}(Y_{i,d}) \mid \theta_C\}$$

  is maximised over the parameter $\theta_C$ of the copula.
  – The maximisation in this step can also be difficult when $\theta_C$ is high-dimensional. E.g., if one uses a t-copula, then there are d(d-1)/2 correlation parameters.
  → Good starting values are needed (see next slide)
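The nonparametric first step amounts to replacing each observation by its rank divided by n + 1. A minimal base-R sketch follows; the helper name `pseudo_obs` and the simulated margins are ours (the copula package provides `pobs()` for the same purpose).

```r
# Pseudo-observations: column-wise ranks divided by (n + 1)
pseudo_obs <- function(Y) {
  Y <- as.matrix(Y)
  apply(Y, 2, rank) / (nrow(Y) + 1)
}

set.seed(1)
n <- 500
Y <- cbind(rt(n, df = 4), rexp(n))   # two made-up margins
U_hat <- pseudo_obs(Y)

# The values stay strictly inside (0, 1): from 1/(n+1) up to n/(n+1)
print(range(U_hat))

# Same as evaluating the rescaled empirical CDF at an observation
i <- 3
print(c(U_hat[i, 1], sum(Y[, 1] <= Y[i, 1]) / (n + 1)))
```

Keeping the values away from 0 and 1 matters in the second step, because many copula densities are unbounded at the corners of the unit square.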


8.6.1 Estimation of Ω in the Meta-t-Distribution

• The Pearson correlation coefficient ρ measures the linear dependence between two random variables. As such, it is not convenient for the copula approach because

  – It is not invariant under nonlinear strictly increasing transformations such as the marginal CDFs

  – Hence, it depends on the univariate marginal distributions as well as on the copula.

  – Marginal distributions and correlation do not determine the joint distribution, except when the joint distribution is Gaussian

• Rank correlation coefficients remedy this problem, since they depend only on the copula

• Spearman’s rank correlation depends only on the copula because it is invariant to any monotonically increasing transformation of the random variables, and hence to the CDF transformation as well.

  It is calculated as the Pearson correlation of the ranks of the variables and estimates the Pearson correlation of the CDF-transformed variables:

  $$\rho_S(Y_1, Y_2) = \mathrm{Corr}\{F_{Y_1}(Y_1), F_{Y_2}(Y_2)\}$$
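Both properties are easy to see on simulated data. In the sketch below (our own example with made-up data), Spearman's coefficient is computed once as the Pearson correlation of the ranks and once with the built-in method, and the variables are then passed through strictly increasing nonlinear transformations.

```r
set.seed(42)
y1 <- rnorm(300)
y2 <- 0.6 * y1 + rnorm(300)

# Spearman = Pearson correlation of the ranks
rho_s_ranks   <- cor(rank(y1), rank(y2))
rho_s_builtin <- cor(y1, y2, method = "spearman")

# Strictly increasing transformations: exp() and the cube
t1 <- exp(y1)
t2 <- y2^3
rho_s_transformed <- cor(t1, t2, method = "spearman")   # unchanged
rho_p_original    <- cor(y1, y2)
rho_p_transformed <- cor(t1, t2)                        # changes

print(c(spearman = rho_s_builtin, spearman_transformed = rho_s_transformed,
        pearson = rho_p_original, pearson_transformed = rho_p_transformed))
```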


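Another measure which is based on ranks is Kendall’s tau

• Let $(Y_1, Y_2)$ and $(Y_1^*, Y_2^*)$ be two independent copies of a bivariate random vector. Then $(Y_1, Y_2)$ and $(Y_1^*, Y_2^*)$ are called a concordant pair if $(Y_1 - Y_1^*)(Y_2 - Y_2^*) > 0$, i.e., either $Y_1 > Y_1^*$ and $Y_2 > Y_2^*$, or $Y_1 < Y_1^*$ and $Y_2 < Y_2^*$.

  Similarly, they are called a discordant pair if $(Y_1 - Y_1^*)(Y_2 - Y_2^*) < 0$.

• Kendall's tau is the probability of a concordant pair minus the probability of a discordant pair. Therefore, Kendall’s tau for $(Y_1, Y_2)$ is

  $$\rho_\tau(Y_1, Y_2) = P\{(Y_1 - Y_1^*)(Y_2 - Y_2^*) > 0\} - P\{(Y_1 - Y_1^*)(Y_2 - Y_2^*) < 0\} = E[\mathrm{sign}\{(Y_1 - Y_1^*)(Y_2 - Y_2^*)\}]$$

• Kendall’s tau of a bivariate sample is estimated by

  $$\hat\rho_\tau = \frac{\#\{\text{concordant pairs}\} - \#\{\text{discordant pairs}\}}{\binom{n}{2}}$$

  Technically, let $(Y_{i,1}, Y_{i,2})$, $i = 1, \dots, n$, be a bivariate sample, then

  $$\hat\rho_\tau = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} \mathrm{sign}\{(Y_{i,1} - Y_{j,1})(Y_{i,2} - Y_{j,2})\}$$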
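The sample version can be written out directly as "concordant minus discordant pairs, divided by the number of pairs". The sketch below is a plain O(n²) illustration on made-up data; the function name `kendall_tau` is ours. For data without ties it should agree with R's built-in `cor(..., method = "kendall")`.

```r
# Kendall's tau from the definition: average sign over all pairs i < j
kendall_tau <- function(y1, y2) {
  n <- length(y1)
  s <- sign(outer(y1, y1, "-") * outer(y2, y2, "-"))
  sum(s[upper.tri(s)]) / choose(n, 2)
}

set.seed(7)
y1 <- rnorm(200)
y2 <- 0.5 * y1 + rnorm(200)

tau_manual  <- kendall_tau(y1, y2)
tau_builtin <- cor(y1, y2, method = "kendall")
print(c(manual = tau_manual, builtin = tau_builtin))
```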


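• Kendall’s tau has the following relation to $\Omega_{i,j}$:

  $$\rho_\tau(Y_i, Y_j) = \frac{2}{\pi} \arcsin(\Omega_{i,j})$$

• Hence we can use

  $$\hat\Omega^{**}_{i,j} = \sin\left\{\frac{\pi}{2}\, \hat\rho_\tau(Y_i, Y_j)\right\}$$

  as estimate (or starting value)

• Unfortunately, this matrix may not be positive (semi-)definite; $\hat\Omega^{**}$ must then be adjusted:

  – Determine the eigenvalues $\lambda_1, \dots, \lambda_d$ and eigenvectors of $\hat\Omega^{**}$
  – Calculate

    $$\hat\Omega^{*} = E\, \mathrm{diag}\{\max(\varepsilon, \lambda_i)\}\, E^T$$

    where E is the matrix of the eigenvectors and ε a small number > 0, e.g., ε = 0.001

  – But now the diagonal elements of $\hat\Omega^{*}$ may not be equal to 1
  – Hence, an easy fix yields the final solution: multiply the i-th row and the i-th column of $\hat\Omega^{*}$ by $(\hat\Omega^{*}_{i,i})^{-1/2}$

  The final result is called $\hat\Omega$.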
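The adjustment consists of two steps: lift the eigenvalues to at least ε, then rescale rows and columns so that the diagonal is 1 again. A minimal base-R sketch follows. The function name `fix_correlation` and the test matrix `Omega_raw` are hypothetical; the matrix was chosen only because it has unit diagonal but a negative eigenvalue.

```r
# Turn a symmetric "correlation-like" matrix into a valid correlation matrix
fix_correlation <- function(Omega, eps = 0.001) {
  ed <- eigen(Omega, symmetric = TRUE)
  lam <- pmax(eps, ed$values)                       # lift small/negative eigenvalues
  Omega_star <- ed$vectors %*% diag(lam, nrow = length(lam)) %*% t(ed$vectors)
  s <- 1 / sqrt(diag(Omega_star))                   # rescale row i and column i
  Omega_star * outer(s, s)
}

# Unit diagonal, but not positive semidefinite (one eigenvalue is negative)
Omega_raw <- matrix(c( 1.0, 0.9, -0.9,
                       0.9, 1.0,  0.9,
                      -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)
print(eigen(Omega_raw, symmetric = TRUE)$values)

Omega_hat <- fix_correlation(Omega_raw)
print(round(Omega_hat, 3))
print(eigen(Omega_hat, symmetric = TRUE)$values)
```

The rescaling is a congruence with a positive diagonal matrix, so it cannot make the matrix indefinite again; the result can therefore serve as a starting value for a fit.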


8.6.2 Example: Bivariate data set of returns on IBM and the CRSP index

library(Ecdat)
data(CRSPday, package="Ecdat")
IBM <- as.numeric(CRSPday[,5])
CRSP <- as.numeric(CRSPday[,7])

## Fitting marginal t-distributions using package QRM
> library(QRM)
> IBM.tfit <- fit.st(IBM)   ## fit a "scale-location" t-distribution
> (IBM.est <- IBM.tfit$par.ests)
          nu           mu        sigma
4.1533842239 0.0002873347 0.0125834588

> CRSP.tfit <- fit.st(CRSP)
> (CRSP.est <- CRSP.tfit$par.ests)
          nu           mu        sigma
3.4706229974 0.0009034934 0.0052143529

## Transformation of the data to uniform margins
pstd <- function (q, mean=0, sigma=1, nu = 5) {   # maps X to U
  ## sd=sigma*s with s=sqrt(nu/(nu - 2))
  z <- (q - mean)/sigma   ## i.e., sd(z) = sqrt(nu/(nu - 2))
  pt(q = z, df = nu)
}

UD <- cbind(pstd(IBM, IBM.est["mu"], IBM.est["sigma"], IBM.est["nu"]),
            pstd(CRSP, CRSP.est["mu"], CRSP.est["sigma"], CRSP.est["nu"]))


## Are the marginal distributions uniformly distributed?
> h.n1 <- length(IBM)/10
> h.t <- h.n1 + 2*c(1,-1)*sqrt(h.n1)   ## "confidence band" for height
                                       ## of bars in histogram
> hist(IBM, col=gray(0.9))
> hist(CRSP, col=gray(0.9))
> hist(UD[,1], col=gray(0.9))
> abline(h=h.t, col="gray")
> hist(UD[,2], col=gray(0.9))
> abline(h=h.t, col="gray")


## Fitting t copula in several steps
## from Kendall’s tau to an estimation of Omega
> (rhoTau <- cor(IBM, CRSP, method="kendall"))
[1] 0.3308049
> (omega <- sin(pi/2*rhoTau))
[1] 0.4965565

## Define the t-copula using Omega as correlation parameter
> require(copula)
> tc2 <- tCopula(omega, dim=2)

## fit by "maximum pseudo-likelihood"
> (UD.tc2 <- fitCopula(tc2, data=UD, method="mpl", start=c(omega,8),
+       optim.method="L-BFGS-B", lower=c(0.1, 5.5), upper=c(0.7, 15)))
fitCopula() estimation based on 'maximum pseudo-likelihood'
and a sample of size 2528.
      Estimate Std. Error z value Pr(>|z|)
rho.1  0.49324    0.02458   20.07   <2e-16 ***
df     9.80896         NA      NA       NA
---
The maximized loglikelihood is 362.2
Optimization converged


## Fitting Gumbel, Frank and Clayton copulas
> (UD.gc2 <- fitCopula(copula=gumbelCopula(3,dim=2), data=UD,
+       method="mpl", optim.method="BFGS"))
fitCopula() estimation based on 'maximum pseudo-likelihood'
and a sample of size 2528.
      Estimate Std. Error z value Pr(>|z|)
param  1.43005    0.02205   64.85   <2e-16 ***
---
The maximized loglikelihood is 313.8
Optimization converged

## Fitting Frank and Clayton copulas without R output
> (UD.fc2 <- fitCopula(copula=frankCopula(3,dim=2), data=UD,
+       method="mpl", optim.method="BFGS"))

> (UD.cc2 <- fitCopula(copula=claytonCopula(3,dim=2), data=UD,
+       method="mpl", optim.method="BFGS"))


## Compare the empirical copula with the estimated ones
## using simulations
> n <- nrow(UD)
## define the 4 copulas
> ht <- tCopula(param=UD.tc2@estimate[1], df=round(UD.tc2@estimate[2]))
> hg <- gumbelCopula(UD.gc2@estimate, dim=2)
> hf <- frankCopula(UD.fc2@estimate, dim=2)
> hc <- claytonCopula(UD.cc2@estimate, dim=2)

> par(mfrow=c(2,3))
## display data
> plot(UD, main="Data")
## simulate from the fitted copulas
> set.seed(4711)
> htr <- rCopula(n, ht); hgr <- rCopula(n, hg)
> hfr <- rCopula(n, hf); hcr <- rCopula(n, hc)
## display simulated data
> h.pplot <- function(x, main=""){
+   plot(x, main=main, xlab=expression(u[1]), ylab=expression(u[2]),
+        col=rgb(0,0,1, alpha=0.2), pch=16) }
> h.pplot(htr, main="From t copula")
> h.pplot(hgr, main="From Gumbel copula")
> h.pplot(hfr, main="From Frank copula")
> h.pplot(hcr, main="From Clayton copula")


Which simulated copula is most similar to the data?


u1 <- UD[,1]; u2 <- UD[,2]
Udex <- (1:n)/(n+1)   ##
h.l <- c(0.02, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9)   ## contour levels
## The function 'C.n()' from package copula computes the empirical copula
Cn <- C.n(u=cbind(rep(Udex,n), rep(Udex,each=n)), U=UD, offset=0, method="C")
## prepare the contours of the empirical copula:
EmpCop <- expression(contour(Udex, Udex, matrix(Cn,n,n), col=2, add=T, levels=h.l))
par(mfrow=c(2,3), las=1, mar=c(3.5,4,2,1), cex.lab=1.2, mgp=c(2.2,0.8,0))
## "raw" data
plot(u1, u2, main="Uniform...", xlab=expression(hat(U)[1]),
     ylab = expression(hat(U)[2]))
## t
contour(tCopula(param=UD.tc2@estimate[1], dim=2, df=round(UD.tc2@estimate[2])),
        pCopula, main=expression(hat(C)[t]), levels=h.l, xlab=...)
eval(EmpCop)
## Frank
contour(frankCopula(param= UD.fc2@estimate[1], dim=2), pCopula,
        main=expression(hat(C)[Fr]), levels=h.l, xlab=...)
eval(EmpCop)
## Clayton
contour(claytonCopula(param=UD.cc2@estimate[1], dim = 2), pCopula,
        main=expression(hat(C)[Cl]), levels=h.l, xlab=...)
eval(EmpCop)
## Gumbel (uses the fitted Gumbel parameter UD.gc2)
contour(gumbelCopula(param=UD.gc2@estimate[1], dim = 2), pCopula,
        main = expression(hat(C)[Gu]), levels=h.l, xlab=...)
eval(EmpCop)


Assess the fit by AIC

Akaike’s information criterion: AIC = -2 · log-likelihood + 2p, where p is the number of estimated parameters

## with R
> h.AIC <- function(obj){
+   np <- length(obj@estimate)
+   h <- obj@loglik
+   c(loglik=h, AIC=-2*h+2*np)}

> rbind(h.AIC(UD.tc2), h.AIC(UD.fc2), h.AIC(UD.gc2), h.AIC(UD.cc2))
       loglik       AIC
[1,] 362.2008 -720.4016
[2,] 325.0530 -648.1059
[3,] 313.8293 -625.6587
[4,] 293.7249 -585.4498

## According to the AIC the t copula clearly fits the data best


8.7* Vine Copula

• There is a wide range of parametric copula families (Gauss, t, Frank, Clayton, ...)

• But: Standard multivariate copulas
  – can become inflexible in high dimensions.
  – do not allow for different dependency structures between pairs of variables.

• → Way out: e.g., vine copulas for higher-dimensional data

• Vine Copulas: The dependency structure is determined by bivariate copulas and a nested set of trees. Example in d=3 dimensions:

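  – One possible decomposition of f(x1,x2,x3) is:

    $$f(x_1, x_2, x_3) = f_1(x_1)\, f_{2|1}(x_2 \mid x_1)\, f_{3|12}(x_3 \mid x_1, x_2)$$

  – Since

    $$f_{2|1}(x_2 \mid x_1) = c_{12}\{F_1(x_1), F_2(x_2)\}\, f_2(x_2)$$

    $$f_{3|12}(x_3 \mid x_1, x_2) = c_{13|2}\{F_{1|2}(x_1 \mid x_2), F_{3|2}(x_3 \mid x_2)\}\, c_{23}\{F_2(x_2), F_3(x_3)\}\, f_3(x_3)$$

  – We obtain

    $$f(x_1, x_2, x_3) = f_1(x_1)\, f_2(x_2)\, f_3(x_3)\; c_{12}\{F_1(x_1), F_2(x_2)\}\; c_{23}\{F_2(x_2), F_3(x_3)\}\; c_{13|2}\{F_{1|2}(x_1 \mid x_2), F_{3|2}(x_3 \mid x_2)\}$$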


• The decomposition is not unique

• A graphical structure called a regular vine structure has been introduced to help organize the possible decompositions:

• See also R Package ‘VineCopula’

• More research is needed, particularly towards applications


Take Home Message Lecture 5

Copulas are a general concept to model dependencies between random variables

• A copula is a multivariate cumulative distribution function (CDF) whose univariate marginal distributions are all Uniform(0,1)

• To use copulas to model multivariate dependencies, we need parametric families of copulas
  → Gaussian copula, t-copula, Archimedean copulas (e.g., Frank, Gumbel and Clayton copulas)

• In finance, the ability to model tail dependence is an advantageous characteristic of copulas
  → t-copula, Clayton copula, Gumbel copula

• Practically,
  – use the introduced copulas mainly to model 2-dimensional random variables
  – vine copulas enable flexible modelling of the dependence structure for portfolios of large dimensions


• Lecture 5 in the books:

Chapter 8 Copulas

• Sections marked by * are not examined