Exchangeability of Copulas

Michael Harder

Dissertation submitted in fulfillment of the requirements for the degree of Dr. rer. nat. at the Faculty of Mathematics and Economics of Ulm University.

Submitted by Michael Harder (né Weyhmüller) from Ulm an der Donau in 2016.


Date of examination: 7 June 2016

Reviewers:

Prof. Dr. Ulrich Stadtmüller

Prof. Dr. Robert Stelzer

Acting Dean: Prof. Dr. Werner Smolny


For you.


“There is never enough time to do all the nothing you want.” — Bill Watterson, Calvin and Hobbes, 1988-08-28


Preface

Let me express deepest gratefulness towards everyone who supported me, in particular during my time as a doctoral candidate.

First and foremost, I wish to thank my advisor Ulrich Stadtmüller, who encouraged me when encouragement was needed and showed patience when things took their time. Regardless of time being tight, he was always receptive to questions, discussions and the need for advice. Furthermore, he accepted me as a member of the Institute of Number Theory and Probability Theory at Ulm University, where I not only had the chance to work as class teacher during various lectures by him and others, but where I also got the possibility to present my work and attend inspiring talks at various conferences and workshops in Cracow, Munich, Moscow and Ulm.

I also appreciate that Robert Stelzer agreed to be my second advisor, despite numerous other duties and several doctoral candidates competing for his time.

Moreover, my present and former colleagues at the Faculty of Mathematics and Economics shall not be left unmentioned. They made me enjoy the last years not only with fruitful discussions and comments but also offered help and advice. In particular, I want to thank Marta Zampiceni and Ivan Lecei for proofreading this dissertation, notwithstanding the former being on maternity leave. Besides, it was always a pleasure to work with Karin Stadtmüller and Hartmut Lanzinger, especially when teaching classes accompanying their lectures and when trying to spread the interest in mathematics among audiences with a somewhat mixed motivational structure.

Additionally, I am grateful for the assistance of Florian Schaub in my ongoing struggle with the English language, grammar, and punctuation.

My gratitude is also directed towards my friends, who always offered numerous diversions on land, on sea and sometimes even in the air. Still, they accepted my regrets whenever I was occupied with mathematics.

I would probably not be the person I am without my parents. They sparked my curiosity about the world around me and their ongoing support gave me the possibility to pursue my studies.

Finally, words cannot describe my thankfulness to Nadine for being her, staying with me through the highs and lows of more than a decade, marrying me and now carrying our child. Thank you for finding a way to put things into perspective whenever I am about to get overwhelmed by life, the universe, and everything.

If you are not among the aforementioned persons, then you, the reader, are nevertheless important to me, as any written work is nothing but ink-stained paper (or a “one”-stained sequence of zeroes, in case you are reading a digital version) as long as nobody reads it. Please be assured of my gratefulness for transforming all this into a voice inside your head.

Michael Harder
Ulm, April 2016


Contents

Preface

1 Introduction

2 Copulas
  2.1 Basics
  2.2 Elementary Properties and Bounds
  2.3 Sklar’s Theorem
  2.4 Some Important Classes
    2.4.1 Elliptical Copulas
    2.4.2 Archimedean Copulas
    2.4.3 Nested Archimedean Copulas
  2.5 Sampling
  2.6 Measures of Association
    2.6.1 Correlation Coefficient
    2.6.2 Measures of Concordance
    2.6.3 Tail Dependence

3 Empirical Processes
  3.1 Basics
  3.2 Convergence of Random Variables and Vectors
  3.3 Random Functions
  3.4 Weak Convergence
  3.5 Donsker’s Theorem

4 Limits of Non-Exchangeability
  4.1 Some Concepts of Multivariate Symmetry
    4.1.1 Random Vectors
    4.1.2 Copulas
  4.2 Exchangeability and its Antonym
  4.3 Limits of Non-Exchangeability
    4.3.1 Main Result
    4.3.2 Proof of the Main Result
  4.4 Additional Results
    4.4.1 Some Aspects of Uniqueness
    4.4.2 Marginal Distributions
    4.4.3 Non-Maximal Non-Exchangeability

5 Tests for Non-Exchangeability
  5.1 Test Statistics
  5.2 Asymptotics
  5.3 Test Procedures
  5.4 Simulation Study and Data Application
    5.4.1 First Stage
    5.4.2 Second Stage
    5.4.3 Third Stage
    5.4.4 Nutrient Data
    5.4.5 Discussion

Appendix

Zusammenfassung

Bibliography

Nomenclature

Index


1 Introduction

There are two primary perspectives of randomness, both of which may be considered more philosophical than mathematical in nature: one is deterministic, the other probabilistic. For instance, consider rolling some dice. On the one hand, one might believe that such an experiment is completely deterministic. This means that if there were the means to know all variables involved (like the speed of the dice, the direction of their movement, their consistency, air drag as well as the properties of the material they are rolled on, et cetera), one would be able to compute the exact outcome. This might be impossible. For example, according to Heisenberg’s uncertainty principle in quantum mechanics it is impossible to know both the position and the momentum of a particle. But even if it were possible, it might still be more practical to view the experiment as random instead of having to handle a large number of infinitesimal influences. On the other hand, one might believe in true randomness as in the existence of an effect without any cause. For example, radioactive decay of a given particle seems to happen without any cause and the probability that it decays does not change over time (according to quantum theory).

Regardless of how randomness is viewed, in the end it boils down to the understanding of the laws of probability, which drive some object or event of interest, as well as their implications. This kind of reasoning is part of human history. According to David (1955), games of chance date back at least to ancient Egypt and Greece. It is hard to imagine that these games took place without any thought about the chances involved. Walker (1929, page 5) gives examples of Chinese reasoning about “the probability that an expected child will turn out to be a boy or to be a girl” from about 2 000 years ago. Furthermore, she cites “a statement in a commentary (Venice, 1477) on Dante’s Divine Comedy concerning the different throws which can be made with three dice” as “the first reference to the theory of probability in a European work.” Not only from Walker (1929) it becomes clear that at least since the age of enlightenment, a constant (if not growing) stream of research has been conducted in the field of probability theory.

Of course, there is always the theoretical interest in certain features and implications of, for example, distribution functions. But in many cases, the interest comes from the search for answers to practical problems and questions. When trying to answer such questions with mathematical theory, one first has to find a theoretical model that describes the real world. Typically, many assumptions have to be made in the course of modeling. Some of these assumptions are made explicitly and may be quite obvious, for example when modeling the outcome of a dice roll with a discrete random variable as opposed to a random variable with a continuous distribution function. Other assumptions are implicit and may be more subtle, for example implications about certain symmetries and dependence structures in the distribution of the random variables under consideration.

For instance, human height is often modeled with a normal distribution, especially in introductory textbooks on statistics and probability. When using a normal distribution one implicitly assumes that the unknown probability distribution of the random variable which generated the sample is symmetric. To be precise, it is symmetric in the sense that exceeding any positive deviation from the mean is just as probable as falling below the corresponding negative deviation from the mean, as long as the absolute values of said deviations coincide. Stated more formally, given µ ∈ R, σ² > 0 and X ∼ N(µ, σ²), then

P(X ≥ µ + a) = P(X ≤ µ − a)

holds for all a ∈ R.

In the case of modeling more than one variable (for example, when considering height and weight), there is not just the question of how to model each variable on its own; their dependence is often a topic of interest as well. So, before choosing a model and making further inferences, one has to ensure that the dependence structure of the theoretical model in question fits the dependence in the sample. Of course, the first question is if there is any dependence at all or if the elements of the vectors of the sample are independent. One approach to obtain an answer is to conduct an appropriate statistical test. For example, Pearson (1900) investigated the properties of a statistical test for independence which is nowadays known as Pearson’s χ² test.

Fortunately, the one-dimensional marginal distributions and the dependence structure may be treated somewhat separately. This is due to the fact that for each multivariate distribution H, there exists a function C that couples H with its one-dimensional marginal distributions F1, . . . , Fd, in the sense that

H(x1, . . . , xd) = C(F1(x1), . . . , Fd(xd))

holds for all x ∈ R^d. This mapping C is called a copula and the above equation is valid by Sklar’s theorem (see Theorem 2.3.4). Therefore, the copula contains all the information about the dependence structure, and the marginal distributions may be considered separately. Chapter 2 is devoted to copulas. Some of their properties are given, as well as the aforementioned important theorem by Sklar (1959). Furthermore, some popular classes are discussed, such as Archimedean copulas. An Archimedean copula C is a copula of the form

C(u1, . . . , ud) = ϕ⁻(ϕ(u1) + . . . + ϕ(ud))    (1.0.1)

for all (u1, . . . , ud)⊤ ∈ [0, 1]^d with a generating function ϕ, which is required to fulfill certain prerequisites. Such a generator is often given by the inverse of the Laplace transform of a distribution function. One reason for the popularity of this class, especially for large dimensions (which often occur in fields like risk management or finance), is the possibility to model a certain range of dependence structures with a relatively simple generating function. If this generator is endowed with a parameter, the whole dependence structure may be controlled by a single number. However, there are also drawbacks to the use of a model employing an Archimedean copula. One of those drawbacks is the lack of flexibility as far as a certain type of symmetry, namely exchangeability, is concerned. In the remainder of Chapter 2, selected algorithms for sampling are presented, as well as popular measures of association, which map some aspects of dependence to a value between −1 and 1.
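As a small numerical illustration (our own; all names are arbitrary), the following Python sketch evaluates (1.0.1) for the generator ϕ(t) = −ln t, whose inverse is ϕ⁻(y) = e^{−y}; this particular generator recovers the independence copula Π(u) = u1 · · · ud (see Example 2.4.12).

```python
import numpy as np

def archimedean_copula(u, phi, phi_inv):
    """Evaluate C(u_1, ..., u_d) = phi_inv(phi(u_1) + ... + phi(u_d)), cf. (1.0.1)."""
    u = np.asarray(u, dtype=float)
    return phi_inv(np.sum(phi(u)))

phi = lambda t: -np.log(t)      # generator of the independence copula
phi_inv = lambda y: np.exp(-y)

u = np.array([0.3, 0.5, 0.9])
print(archimedean_copula(u, phi, phi_inv))  # 0.135
print(np.prod(u))                           # 0.135 as well
```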

As mentioned before, it is often desirable to have access to the multivariate distribution which was involved in the generation of an observed sample. As this distribution is usually unknown, the empirical distribution function is used instead. However, it is inherently more difficult to identify important features of the theoretical distribution function in its empirical counterpart. Nevertheless, the expected value of the empirical distribution function at each point equals the theoretical distribution function evaluated at that very point. Moreover, there is almost sure, pointwise convergence due to the law of large numbers, and, therefore, we should get a better estimate of the theoretical distribution function, and thus more information, the larger the sample is.


In Chapter 3, a definition of this empirical counterpart of the distribution function is given, and various well known modes of convergence associated with random objects are summarized. However, when viewing an empirical distribution function not pointwise, but as a random element in a space of functions, problems with measurability might appear (see Theorem 3.3.11). This was already pointed out by Skorokhod in the 1950s. One way to overcome this issue is to use so-called weak convergence. Some facts about the weak convergence of the empirical process (which is closely related to the empirical distribution function) are summarized in Chapter 3, mostly following Billingsley (1999) and van der Vaart and Wellner (1996).

One of the aforementioned properties, which must be taken into account when considering a specific situation or model, is exchangeability. This property may be understood as a weaker form of independence, where the elements of a random vector may be permuted at will without changing the common distribution. Kingman (1978) lists several situations where exchangeability may be useful. A copula C is called exchangeable if

C(u1, . . . , ud) = C(uπ(1), . . . , uπ(d))

holds for all permutations π ∈ Sd and for all (u1, . . . , ud)⊤ ∈ [0, 1]^d. For example, if an Archimedean copula is fitted to some sample, then this sample is implicitly assumed to be exchangeable (or rather to come from an exchangeable distribution). This is due to the fact that all Archimedean copulas are exchangeable, which is a straightforward consequence of their representation in terms of a generator as in (1.0.1). But Genest et al. (2012, Section 7) show that there is indeed real-world data which seems to be non-exchangeable. The assumption of exchangeability is thus not always fulfilled. In the literature, exchangeability is sometimes called symmetry, but as is shown in Chapter 4, there exist various concepts of symmetry of multivariate distribution functions, none of which is necessary or sufficient for exchangeability.

The main point of Chapter 4 is the derivation of a previously unknown (at least to the best of our knowledge) limit for non-exchangeability in arbitrary dimensions. It generalizes a result for the bivariate case by Klement and Mesiar (2006) which was discovered independently by Nelsen (2007). To be precise, if C is a d-dimensional copula, then

|C(u1, . . . , ud) − C(uπ(1), . . . , uπ(d))| ≤ (d − 1)/(d + 1)

holds for all (u1, . . . , ud)⊤ ∈ [0, 1]^d as well as for all permutations π ∈ Sd, and the bound is best possible (see Theorem 4.3.3). This limit is not only of theoretical interest, but it is also important as a normalizing factor in measures of non-exchangeability or in statistical tests. With some refinements of the proof, it becomes clear that there is either just one point or an uncountable subset of a lower-dimensional manifold where this limit is attained. Surprisingly, which of these two alternatives occurs depends on the dimension being even or odd (see Theorem 4.4.3).
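For d = 2 this limit equals 1/3. The following Python sketch (our own illustration) measures the non-exchangeability of a Marshall–Olkin copula with unequal parameters, a standard example of a non-exchangeable copula which is not treated further in this work, and compares it with the limit.

```python
import numpy as np

def mo_copula(u, v, alpha=0.2, beta=0.9):
    """Marshall-Olkin copula min(v * u^(1-alpha), u * v^(1-beta));
    non-exchangeable whenever alpha != beta."""
    return min(v * u ** (1 - alpha), u * v ** (1 - beta))

grid = np.linspace(0.0, 1.0, 201)
asym = max(abs(mo_copula(u, v) - mo_copula(v, u)) for u in grid for v in grid)
print(asym)               # strictly positive: non-exchangeable ...
print((2 - 1) / (2 + 1))  # ... but below the limit 1/3 for d = 2
```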

If it is not obvious (which is probably the case in most scenarios) whether an exchangeable copula is suited for modeling the distribution of some data, a test should be performed. Several test statistics for the bivariate case were suggested by Genest et al. (2012). We are interested in tests for larger dimensions, but then additional problems may occur. One such problem is that there are d! permutations of dimension d. Hence, the number of permutations under which the copula is possibly non-exchangeable grows faster than exponentially as the dimension gets large. Nevertheless, we show that it suffices to consider a subset of permutations with as few as two elements, no matter how large d gets (see Theorem 5.1.3). Thus, we generalize the bivariate test statistics by Genest et al. (2012) to arbitrarily large dimensions in Chapter 5. Usually, the copula cannot be observed directly and instead the empirical copula is used as an approximation. But if the empirical copula is non-exchangeable, it is not clear if its non-exchangeability is the result of an approximation error or if this is due to the (real) copula being non-exchangeable. In order to improve performance, the use of a continuous weight function is proposed, which is based on the results of Chapter 4. Less weight is put on areas where little non-exchangeability is possible and more weight is put on areas with larger theoretical limits. Furthermore, it is shown in Chapter 5 that the test statistics are consistent, i. e. they almost surely converge to their theoretical counterparts when the sample size tends to infinity. All proofs are conducted with a continuous but otherwise arbitrary weight function. Yet, continuity of the weight function is not necessary for most results, and it is noted in the proofs where this prerequisite may be relaxed. As the (asymptotic) distribution of the test statistic depends on the unknown copula, the computation of (empirical) p-values and thus the test decision relies on a special bootstrap procedure, which was proposed by Rémillard and Scaillet (2009) and is based on a multiplier central limit theorem by van der Vaart and Wellner (1996). In this procedure, some random variables are generated (based on the observed data) whose distribution is similar to the theoretical distribution of the test statistic (or rather its limit) under the null hypothesis. By comparing these bootstrap random variables with the test statistic, it can be decided if the hypothesis that the sample comes from an exchangeable distribution should be rejected or not. The performance of the test procedures on samples from different copulas with known properties is analyzed in a small simulation study. Different combinations of the test statistics, the weight functions and the sets of permutations are used with data from exchangeable and from non-exchangeable distributions. A brief discussion of the results of the simulation study concludes Chapter 5.

In summary, whenever a multivariate distribution function is used, one should not only be aware of the existence of the property of exchangeability, but also of its limits. Before employing a model which exhibits exchangeability, it should at least be checked whether the data matches this assumption; even better, a test should be performed. Both these limits and such a test are presented in this work.


2 Copulas

As Nelsen (2006) states, Abe Sklar coined the term “copula” in his seminal paper in 1959. Still, works like Hoeffding (1940), Hoeffding (1941) or Fréchet (1951) predate Sklar (1959) and discuss the same functions as well as some of their basic properties, but without using a designation that has been retained. Sklar himself recalls in Sklar (1996) that he came up with the term copula as follows:

“Knowing the word ‘copula’ as a grammatical term for a word or expression that links a subject and predicate, I felt that this would make an appropriate name for a function that links a multidimensional distribution to its one-dimensional margins, and used it as such.”

The property of copulas mentioned in the quote (i. e. linking multivariate distributions to their margins) was shown in a theorem in Sklar (1959), which now bears his name (see Theorem 2.3.4). Copulas started to become a popular subject in the 1990s. Schweizer (1991) gives a historical overview of the developments in copula theory until then. By now, the books by Joe (1997) and Nelsen (2006) (first edition published in 1998) are cited as standard introductory literature in most talks and papers about copulas. Since then, increasing computing power brought the possibility to make inferences about the dependence structure of data in two and more dimensions. This is closely related to the study of the underlying copula, as we will see in this chapter. Therefore, in recent years, a broad range of applications of copula theory has emerged, not only in quantitative risk management (see, e. g., McNeil et al. (2005) and Embrechts (2009)) and finance (see, e. g., Hofert and Scherer (2011)), but also in hydrology (see, e. g., Genest and Favre (2007)) and even seemingly remote fields such as dairy farming (see, e. g., Massonnet et al. (2009)).

The plural form “copulæ” is seen in the literature, but throughout this work, we will follow the convention of Nelsen (2006) and use the plural form “copulas”.

2.1 Basics

In order to define what constitutes a valid copula, we first have to give some preliminary definitions. In the following, d ∈ N \ {1} denotes the dimension.

Definition 2.1.1. For j ∈ {1, . . . , d}, let v_{j,0}, v_{j,1} ∈ [0, 1] with v_{j,0} ≤ v_{j,1}. Given a mapping H : [0, 1]^d → R and the hyperrectangle V := ×_{j=1}^d [v_{j,0}, v_{j,1}], we call

H(V) := ∑_{i ∈ {0,1}^d} (−1)^{d−(i1+...+id)} H(v_{1,i1}, . . . , v_{d,id})

the H-volume of V. The sum is taken over all vertices i = (i1, . . . , id) of the unit hypercube.


An analogous definition can be given for mappings on other hypercubes, other hyperrectangles, or on R^d, but as this work is mainly concerned with mappings on the unit hypercube, this definition suffices. The same holds for the next definition.

Definition 2.1.2. A mapping H : [0, 1]^d → R is called d-increasing, if H(V) ≥ 0 for all hyperrectangles V ⊂ [0, 1]^d (as in Definition 2.1.1).

Note that for a mapping H on [0, 1]^d, being strictly increasing in each variable is not sufficient and being non-decreasing in each variable is not necessary for being d-increasing. This can be seen in the following examples, which are based on the examples by Nelsen (2006).

Example 2.1.3. For d = 2, the mapping H1 : [0, 1]² → [0, 1] with H1(x, y) := (1/3)(max{x, y} + x + y) is strictly increasing in each variable, but H1([0, 1]²) = −1/3. The mapping H2 : [0, 1]² → [0, 1] with H2(x, y) := (1/2)((2x − 1)(2y − 1) + 1) is 2-increasing, as

H2([x0, x1] × [y0, y1]) = 2(x1 − x0)(y1 − y0),

but H2 is strictly decreasing in x for each y ∈ (0, 1/2) and strictly decreasing in y for each x ∈ (0, 1/2).

The preceding example also shows that a 2-increasing function exists. Existence of a d-increasing function for d > 2 is given by the following example.

Example 2.1.4. The mapping H : [0, 1]^d → [0, 1] with H(x) := ∏_{i=1}^d xi is d-increasing, as

H(V) = ∏_{j=1}^d (v_{j,1} − v_{j,0})

holds for all hyperrectangles V ⊂ [0, 1]^d with V = ×_{j=1}^d [v_{j,0}, v_{j,1}]. Moreover, each cumulative distribution function HX of a random vector X with P(X ∈ [0, 1]^d) = 1 is d-increasing, as HX(V) = P(X ∈ V) ≥ 0.
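The H-volume from Definition 2.1.1 is straightforward to compute by iterating over the 2^d vertices of the hyperrectangle. The following Python sketch (our own illustration; all names are arbitrary) verifies the claims of Examples 2.1.3 and 2.1.4 numerically.

```python
import numpy as np
from itertools import product

def h_volume(H, lower, upper):
    """H-volume of [lower_1, upper_1] x ... x [lower_d, upper_d] as in
    Definition 2.1.1: a signed sum of H over all 2^d vertices."""
    d = len(lower)
    vol = 0.0
    for i in product((0, 1), repeat=d):  # vertices of the unit hypercube
        vertex = [upper[j] if i[j] else lower[j] for j in range(d)]
        vol += (-1) ** (d - sum(i)) * H(vertex)
    return vol

H1 = lambda u: (max(u) + u[0] + u[1]) / 3             # Example 2.1.3
H2 = lambda u: ((2*u[0] - 1) * (2*u[1] - 1) + 1) / 2  # Example 2.1.3
H3 = lambda u: float(np.prod(u))                      # Example 2.1.4

print(h_volume(H1, [0, 0], [1, 1]))                    # -1/3: not 2-increasing
print(h_volume(H2, [0.1, 0.2], [0.4, 0.8]))            # 2 * 0.3 * 0.6 = 0.36
print(h_volume(H3, [0.1, 0.2, 0.3], [0.4, 0.8, 0.9]))  # 0.3 * 0.6 * 0.6 = 0.108
```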

If Definition 2.1.2 is given for mappings on R^d, then each d-dimensional cumulative distribution function is d-increasing. Now we turn to a special subset of the set of d-increasing mappings.

Definition 2.1.5. The mapping C : [0, 1]^d → R is called copula, if

  • C is d-increasing,

  • C is grounded, i. e. if there exists i ∈ {1, . . . , d} with ui = 0, then C(u) = 0, and

  • C has uniform margins, i. e. C(1, . . . , 1, ui, 1, . . . , 1) = ui for all i ∈ {1, . . . , d}.

Emphasizing the dimension, we also use the terms d-dimensional copula or d-copula. The set of all d-dimensional copulas will be denoted by Cd.

When verifying whether a given mapping is a copula, in most cases the first point needs most effort. In the next example, we will show the existence of such mappings and, simultaneously, derive the subsequent corollary.

Example 2.1.6. As we have seen in Example 2.1.4, each cumulative distribution function HX on the unit hypercube is d-increasing. If we assume P(X ∈ (0, 1]^d) = 1, then HX is grounded. Therefore, HX being a copula depends on the margins being uniform (by Definition 2.1.5). This means, it depends on the components of X following a standard uniform distribution. Thus, let Ui ∼ U[0, 1] for all i ∈ {1, . . . , d}; then HX is a copula for X := (U1, . . . , Ud). If additionally the Ui are independent, then HX coincides with the function H from Example 2.1.4. As we will see in Corollary 2.3.6, this is an important copula, which we will call independence copula and denote by Π.


Corollary 2.1.7. Let U be a d-dimensional random vector, such that Ui ∼ U[0, 1] for all i ∈ {1, . . . , d}. Furthermore, let C be the cumulative distribution function of U. Then C is a copula.

Definition 2.1.8. Let C be a d-dimensional copula and U ∼ C. Then the cumulative distribution function Ĉ of the random vector (1 − U1, . . . , 1 − Ud)⊤ is called the survival copula of C.

Note that the survival copula is indeed a valid copula by Corollary 2.1.7 and because 1 − Ui follows a standard uniform distribution if and only if Ui does. It must not be mistaken for the survival function C̄ of a copula C. Remember that if U ∼ C, then C̄(u) := P(U1 > u1, . . . , Ud > ud). This somewhat confusing naming convention will become clear in Corollary 2.3.5. The following theorem establishes a connection between C̄ and Ĉ.

Theorem 2.1.9. Let C be a d-dimensional copula. Then

C̄(u1, . . . , ud) = Ĉ(1 − u1, . . . , 1 − ud)

holds for all u ∈ [0, 1]^d.

Proof. Let U ∼ C and u ∈ [0, 1]^d. Then

C̄(u1, . . . , ud) = P(U1 > u1, . . . , Ud > ud) = P(1 − U1 < 1 − u1, . . . , 1 − Ud < 1 − ud)
= P(1 − U1 ≤ 1 − u1, . . . , 1 − Ud ≤ 1 − ud) = Ĉ(1 − u1, . . . , 1 − ud)

holds, because Ĉ is the distribution function of the vector (1 − U1, . . . , 1 − Ud)⊤ by definition and, as Ui ∼ U[0, 1] for all i ∈ {1, . . . , d}, we get P(Ui < ui) = P(Ui ≤ ui) for all ui ∈ [0, 1].

A direct consequence of Theorem 2.1.9 is that C̄ and Ĉ never coincide. The assumption Ĉ ≡ C̄ yields the contradiction that C̄(u) → 1 for u → 0 and

C̄(u) = Ĉ(1 − u) = C̄(1 − u) → 0

hold at the same time.

Some of the properties and deductions in this work remain valid when some of the requirements of Definition 2.1.5 are relaxed. In the literature, two such relaxed notions are known as sub-copulas and quasi-copulas. We give their definitions according to Nelsen (2006).

Definition 2.1.10. Let I1, . . . , Id ⊂ [0, 1] be subsets of the unit interval (not necessarily intervals), with {0, 1} ⊂ Ii for each i ∈ {1, . . . , d}, and let I := ×_{i=1}^d Ii. A mapping Cs : I → R which is d-increasing, grounded and has uniform margins (see Definition 2.1.5) is called sub-copula.

Definition 2.1.11. If a mapping Cq : [0, 1]^d → R is grounded, has uniform margins (see Definition 2.1.5), is non-decreasing in each argument and satisfies a Lipschitz condition, i. e.

|Cq(u) − Cq(v)| ≤ ∑_{i=1}^d |ui − vi|

holds for all u, v ∈ [0, 1]^d, then Cq is called quasi-copula.


As Beliakov et al. (2014) state, quasi-copulas were introduced by Alsina et al. (1993) when examining whether certain operations on univariate distribution functions are derivable from corresponding operations on random variables. Genest et al. (1999) showed that Definition 2.1.11 and the definition by Alsina et al. (1993) yield the same functions in the bivariate case. Equality in the multivariate case is due to Cuculescu and Theodorescu (2001). Examples of proper quasi-copulas (i. e. quasi-copulas which are not d-increasing) are given by Genest et al. (1999) and Nelsen (2006). We will show (among other things) that each copula is a quasi-copula in the following section.

2.2 Elementary Properties and Bounds

In Example 2.1.3 it was shown that for a mapping H on the unit hypercube, being strictly increasing in each variable is not sufficient and being non-decreasing in each variable is not necessary for being d-increasing. But this changes when groundedness is added, as can be seen in the next lemma.

Lemma 2.2.1. Let H : [0, 1]^d → R be d-increasing and grounded. Then H is non-decreasing in each variable.

Proof. Let j ∈ {1, . . . , d}, u ∈ [0, 1]^d and vj ∈ [uj, 1]. Then the hyperrectangle

V := [0, u1] × . . . × [0, u_{j−1}] × [uj, vj] × [0, u_{j+1}] × . . . × [0, ud]

yields 0 ≤ H(V) = H(u1, . . . , u_{j−1}, vj, u_{j+1}, . . . , ud) − H(u), as all vertex terms involving a lower endpoint 0 vanish by groundedness.

Therefore, as every copula is grounded and d-increasing by definition, it is non-decreasing in each variable; in particular 0 = C(0, . . . , 0) ≤ C(u) for all u ∈ [0, 1]^d. In addition, each copula has uniform margins and thus C(u) ≤ C(1, . . . , 1) = 1, which yields the following corollary.

Corollary 2.2.2. Let C be a d-dimensional copula. Then C(u) ∈ [0, 1] for all u ∈ [0, 1]^d.

Of course, this rather crude bound on the range of copulas can be improved. The improvement in question happened even before Sklar (1959) gave the name “copulas” to this class of functions. According to Nelsen (2006), Hoeffding (1940, 1941) already established best possible bounds for copulas, but because of the Second World War (and their publication in “relatively obscure German journals”) this was unknown to Fréchet (1951), when he rediscovered those bounds. They are given in the following definition. In Theorem 2.2.4, their limiting property is established.

Definition 2.2.3. The mappings Md, Wd : [0, 1]^d → [0, 1] with

Md(u) := min{u1, . . . , ud},
Wd(u) := max{0, ∑_{i=1}^d ui − (d − 1)}

for all u ∈ [0, 1]^d are called (upper and lower) Fréchet–Hoeffding bounds.

Note that the notation M or W is used whenever the dimension d is clear from the context.

Theorem 2.2.4 (Hoeffding (1940, 1941); Fréchet (1951)). Let C : [0, 1]^d → [0, 1] be a copula. Then

W(u) ≤ C(u) ≤ M(u)

holds for all u ∈ [0, 1]^d and the bounds are best possible.


The following lemma gives some important properties of the Fréchet–Hoeffding bounds, which will be needed in the proof of Theorem 2.2.4. The proof of the theorem will be given after Lemma 2.2.6.

Lemma 2.2.5. The upper Fréchet–Hoeffding bound M is a copula for any d ∈ N \ {1}. The lower Fréchet–Hoeffding bound W is a copula if and only if d = 2.

Proof. Let U ∼ U[0, 1], let U := (U, . . . , U)⊤ be a d-dimensional random vector on the unit hypercube and HU its cumulative distribution function. Then

HU(u) = P(U ≤ u1, . . . , U ≤ ud) = P(U ≤ min{u1, . . . , ud}) = M(u)

holds for all u ∈ [0, 1]^d. By Corollary 2.1.7, every cumulative distribution function of a random vector with standard uniform components is a copula and therefore HU, and thus M, is a valid copula.

Now let V := (U, 1 − U)⊤. Then

HV(u) = P(U ≤ u1, 1 − U ≤ u2) = P(U ≤ u1, U ≥ 1 − u2) = W(u)

holds for all u ∈ [0, 1]². As (1 − U) ∼ U[0, 1], again, we have a cumulative distribution function of a random vector with standard uniform components and therefore a valid copula. For d > 2, the lower Fréchet–Hoeffding bound is obviously grounded and has uniform margins. But W is not d-increasing, because

W([1/(d − 1), 1]^d) = W(1, . . . , 1) − d/(d − 1) = −1/(d − 1) < 0

holds, as

W(1, . . . , 1) = 1,
W(1, . . . , 1, 1/(d − 1), 1, . . . , 1) = 1/(d − 1)

(each of the d vertices with the entry 1/(d − 1) in one position enters the H-volume with a negative sign) and W(u) = 0 for all remaining vertices u of [1/(d − 1), 1]^d.
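The following Python sketch (our own illustration) checks Theorem 2.2.4 for the independence copula Π on random points, and exhibits the negative W-volume of [1/(d − 1), 1]^d for d = 3 from the proof above.

```python
import numpy as np
from itertools import product

M = lambda u: min(u)
W = lambda u: max(sum(u) - (len(u) - 1), 0.0)
Pi = lambda u: float(np.prod(u))  # the independence copula as a test case

rng = np.random.default_rng(0)
for u in rng.random((1000, 4)):   # W <= Pi <= M, cf. Theorem 2.2.4
    assert W(u) <= Pi(u) <= M(u)

d = 3                             # W-volume of [1/2, 1]^3, cf. Lemma 2.2.5
vol = sum((-1) ** (d - sum(i)) * W([1.0 if k else 1 / (d - 1) for k in i])
          for i in product((0, 1), repeat=d))
print(vol)                        # -0.5 = -1/(d - 1) < 0: W is no 3-copula
```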

In order to prove Theorem 2.2.4, we need to verify that copulas satisfy the Lipschitz condition from Definition 2.1.11.

Lemma 2.2.6. Let C : [0, 1]^d → R be a copula. Then C is 1-Lipschitz-continuous, i. e.,

|C(u) − C(v)| ≤ ∑_{i=1}^d |ui − vi|

holds for all u, v ∈ [0, 1]^d.

Proof. Let u, v ∈ [0, 1]^d and i, j ∈ {1, . . . , d}, such that i < j. Assume without loss of generality that ui ≤ vi (otherwise replace u and v by ũ := v and ṽ := u). Now we use the hyperrectangle V := ×_{k=1}^d [ak, bk], given by

[ak, bk] :=
  [ui, vi]  if k = i,
  [uj, 1]   if k = j,
  [0, uk]   otherwise.


As a copula, C is d-increasing (and grounded) and thus 0 ≤ C(V), which is equivalent to

C(u1, . . . , u_{i−1}, vi, u_{i+1}, . . . , ud) − C(u)
≤ C(u1, . . . , u_{i−1}, vi, u_{i+1}, . . . , u_{j−1}, 1, u_{j+1}, . . . , ud) − C(u1, . . . , u_{j−1}, 1, u_{j+1}, . . . , ud).

In the case i > j, we get an analogous result. An iteration of this procedure for each j ∈ {1, . . . , d} \ {i} yields

C(u1, . . . , u_{i−1}, vi, u_{i+1}, . . . , ud) − C(u) ≤ vi − ui,

as C has uniform margins. By Lemma 2.2.1, C is non-decreasing in each argument and therefore

|C(u1, . . . , u_{i−1}, vi, u_{i+1}, . . . , ud) − C(u)| ≤ |vi − ui|    (2.2.1)

holds. Then

|C(u) − C(v)| ≤ ∑_{k=1}^d |C(u1, . . . , u_{d−k+1}, v_{d−k+2}, . . . , vd) − C(u1, . . . , u_{d−k}, v_{d−k+1}, . . . , vd)| ≤ ∑_{k=1}^d |uk − vk|

concludes the proof, by first applying the triangle inequality and then inequality (2.2.1). Note that the arguments of C vanish in all of the above expressions whenever their index is not between 1 and d. For example, C(u1, . . . , u_{d−k+1}, v_{d−k+2}, . . . , vd) = C(u) holds for k = 1.

Not only are we now ready to prove Theorem 2.2.4; along the way we have also shown that each copula is a quasi-copula, as it is non-decreasing in each argument (Lemma 2.2.1) and 1-Lipschitz-continuous (Lemma 2.2.6).

Proof of Theorem 2.2.4. Let C : [0, 1]^d → [0, 1] be a copula. According to Lemma 2.2.1, C is non-decreasing in each variable. Therefore

C(u) ≤ C(1, . . . , 1, ui, 1, . . . , 1) = ui

holds for all i ∈ {1, . . . , d}. This directly implies C(u) ≤ M(u) and the bound is best possible, as it was established in Lemma 2.2.5 that M is a valid copula itself.

For the lower bound, let u ∈ [0, 1]^d. It follows from Lemma 2.2.6 that

1 − C(u) = C(1, . . . , 1) − C(u) = |C(1, . . . , 1) − C(u)| ≤ ∑_{i=1}^d |1 − ui| = d − ∑_{i=1}^d ui

holds. As C(u) ≥ 0 (see Corollary 2.2.2), this directly implies C(u) ≥ W(u). For d = 2 the bound is best possible, as W itself is a valid copula. For d > 2, it can be shown that for every u ∈ [0, 1]^d there exists a copula Cu, such that Cu(u) = W(u). For details on this, see, e. g., the proof of Theorem 1.2.4 in Hofert (2010).


2.3 Sklar’s Theorem

Even if a univariate distribution function is continuous, it does not have to be strictly increasing and therefore, in general, no inverse function exists. Because of this, we have to use a generalized inverse function as in the following definition.

Definition 2.3.1. For any non-decreasing function F : R̄ → R̄, the mapping F⁻ : R̄ → R̄ with

F⁻(y) := inf{x ∈ R | F(x) ≥ y}

is called generalized inverse of F (as usual, we use inf ∅ = ∞ and R̄ := R ∪ {−∞, ∞}).

It follows directly from the definition (and from F being non-decreasing) that F⁻ is non-decreasing. In Proposition 1 of Embrechts and Hofert (2013), many useful properties of generalized inverses are derived under general assumptions, especially for distribution functions F which are not continuous.
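As a small illustration of Definition 2.3.1, the following Python sketch (our own; the bisection approach and all names are arbitrary) computes F⁻ numerically for a continuous distribution function with a flat piece, where no ordinary inverse exists; the second printed value previews Lemma 2.3.2 below.

```python
import math

def gen_inverse(F, y, lo=-1e6, hi=1e6, tol=1e-9):
    """F^-(y) = inf{x | F(x) >= y} for non-decreasing F, found by bisection."""
    if F(hi) < y:
        return math.inf          # inf of the empty set
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) >= y:
            hi = mid             # mid belongs to the set; the infimum is <= mid
        else:
            lo = mid
    return hi

def F(x):                        # continuous, but flat on [1, 2]
    if x < 0: return 0.0
    if x < 1: return x / 2
    if x < 2: return 0.5
    if x < 3: return (x - 1) / 2
    return 1.0

x = gen_inverse(F, 0.5)
print(x, F(x))                   # ~1.0 (left end of the flat piece) and 0.5
```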

Lemma 2.3.2. If F : R → [0, 1] is a continuous distribution function, then F(F⁻(y)) = y for all y ∈ (0, 1).

Proof. Let y ∈ (0, 1). As F is continuous, lim_{x→∞} F(x) = 1 and lim_{x→−∞} F(x) = 0, there exists at least one x ∈ R, such that F(x) = y and thus S := {x ∈ R | F(x) ≥ y} ≠ ∅. Furthermore, F⁻(y) ≤ x and, as F is non-decreasing, F(F⁻(y)) ≤ F(x) = y; as y > 0 and F is continuous, inf S > −∞. There exists a sequence (xn)_{n∈N} ⊂ S, such that lim_{n→∞} xn = F⁻(y) holds. By the continuity of F and F(xn) ≥ y for all n ∈ N, we get F(F⁻(y)) ≥ y, which concludes the proof.

With this, we are ready to prove the following theorem, which is not only needed in the proof of Sklar’s theorem, but will also be useful for sampling from copulas in Section 2.5.

Theorem 2.3.3. Let F be a continuous distribution function and X ∼ F; then F(X) ∼ U[0, 1]. Furthermore, if U ∼ U[0, 1], then F⁻(U) ∼ F.

Proof. Let u ∈ (0, 1). Then

P(F(X) ≤ u) = P(F(X) ≤ u, X ≤ F⁻(u)) + P(F(X) ≤ u | X > F⁻(u)) P(X > F⁻(u))
= P(X ≤ F⁻(u)) + P(F(X) = u) P(X > F⁻(u))

by Lemma 2.3.2 and non-decreasingness of F. The level set S := {x ∈ R : F(x) = u} is an interval, as F is non-decreasing, and closed, as F is continuous; therefore S = [a, b] for some a, b ∈ R with a ≤ b. This yields

P(F(X) = u) = P(a ≤ X ≤ b) = F(b) − F(a) = 0    (2.3.1)

and thus P(F(X) ≤ u) = u by Lemma 2.3.2. If u = 1, then obviously P(F(X) ≤ u) = u, and for u = 0 this also holds by (2.3.1). This means F(X) ∼ U[0, 1].

Now let x ∈ R and U ∼ U[0, 1]. As F is continuous, we get F(F⁻(U)) =d U by Lemma 2.3.2 and hence

F(x) = P(F(F⁻(U)) ≤ F(x))
= P(F(F⁻(U)) ≤ F(x), F⁻(U) ≤ x) + P(F(F⁻(U)) ≤ F(x), F⁻(U) > x)
= P(F⁻(U) ≤ x) + P(F(F⁻(U)) ≤ F(x), F⁻(U) > x)

holds, because F⁻(U) ≤ x implies F(F⁻(U)) ≤ F(x), as F is non-decreasing. For the same reason, F⁻(U) > x implies F(F⁻(U)) ≥ F(x) and thus

P(F(F⁻(U)) ≤ F(x), F⁻(U) > x) = P(U = F(x), F⁻(U) > x) = 0

completes the proof.
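Both statements of Theorem 2.3.3 are easy to check by simulation. The following Python sketch (our own illustration) uses the standard exponential distribution, for which F(x) = 1 − e^{−x} and F⁻(u) = −ln(1 − u) are explicit.

```python
import numpy as np

rng = np.random.default_rng(42)
F = lambda x: 1.0 - np.exp(-x)     # standard exponential distribution function
F_inv = lambda u: -np.log1p(-u)    # its (generalized) inverse

# F(X) ~ U[0, 1]: empirical quantiles of F(X) match the uniform ones.
x = rng.exponential(size=100_000)
print(np.quantile(F(x), [0.1, 0.5, 0.9]))  # approx. [0.1, 0.5, 0.9]

# F^-(U) ~ F: the sample mean of F^-(U) approximates E(X) = 1.
u = rng.random(100_000)
print(F_inv(u).mean())                     # approx. 1.0
```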

In the next theorem, it becomes clear that copulas link multivariate distributions to their univariate margins.

Theorem 2.3.4 (Sklar (1959)). For every multivariate cumulative distribution function H : R^d → R with univariate margins F1, . . . , Fd, there exists a copula C, such that

H(x) = C(F1(x1), . . . , Fd(xd))

holds for all x ∈ R^d. If, additionally, F1, . . . , Fd are continuous, then C is unique; else it is uniquely determined on ×_{i=1}^d ran(Fi), i. e. on the set

{y ∈ R^d | ∃x ∈ R^d ∀i ∈ {1, . . . , d} : Fi(xi) = yi}.

Given a copula C and univariate distribution functions F1, . . . , Fd, the function H as given above is a multivariate distribution function with margins F1, . . . , Fd.

Proof. We will give the proof for the case of continuous margins, similar to the proof in Hering (2011). For the proof of the general case, some additional theory about extensions of sub-copulas to copulas is needed. The interested reader may kindly be referred to Nelsen (2006) and the references therein, or to Hofert (2010).

Now let H : R^d → R be a multivariate cumulative distribution function with univariate continuous margins F1, . . . , Fd. Let x ∈ R^d and X ∼ H. As usual, inequalities of vectors are meant to hold component-by-component, i. e.

X ≤ x ⟺ Xi ≤ xi for all i ∈ {1, . . . , d}.    (2.3.2)

By F(x), we denote the vector (F1(x1), . . . , Fd(xd))⊤. Continuity of Fi yields P(Xi = xi) = 0 for all i, and thus

H(x) = P(X ≤ x) = P(X < x)
= P(X < x | F(X) < F(x)) P(F(X) < F(x))
+ P(X < x | F(X) ≤ F(x), F(X) ≮ F(x)) P(F(X) ≤ F(x), F(X) ≮ F(x))
+ P(X < x | ∃i ∈ {1, . . . , d} : Fi(Xi) > Fi(xi)) P(∃i ∈ {1, . . . , d} : Fi(Xi) > Fi(xi))

by the law of total probability. As the margins Fi are non-decreasing, the first of the above conditional probabilities equals 1 and the last one vanishes. Because F(X) ≤ F(x) and F(X) ≮ F(x) imply Fi(Xi) = Fi(xi) for at least one i,

P(F(X) ≤ F(x), F(X) ≮ F(x)) ≤ P(∃i ∈ {1, . . . , d} : Fi(Xi) = Fi(xi))

holds and the second summand vanishes (see (2.3.1) in the proof of Theorem 2.3.3). Therefore,

H(x) = P(F(X) < F(x)) = P(F(X) ≤ F(x)) = H_{F(X)}(F(x))

holds, where H_{F(X)} is the cumulative distribution function of the vector F(X). By Theorem 2.3.3, the components of F(X) follow a standard uniform distribution, which makes C := H_{F(X)} a valid copula (see Corollary 2.1.7). If C̃ is another such copula, let u ∈ [0, 1]^d. By continuity of the margins Fi of H, there exists x ∈ R^d, such that F(x) = u. Then

|C(u) − C̃(u)| = |C(F(x)) − C̃(F(x))| = |H(x) − H(x)| = 0

yields uniqueness of C.

For the second statement of the theorem, note that, by definition, each copula is a distribution function on the unit hypercube. Now, given a copula C and continuous univariate distributions F1, . . . , Fd, let U ∼ C. By Theorem 2.3.3, for each i ∈ {1, . . . , d}, the random variable Xi := Fi⁻(Ui) is distributed according to Fi. The cumulative distribution function of the vector X is given by

P(X ≤ x) = P(X < x) = P(F(X) < F(x)) = P(U < F(x)) = P(U ≤ F(x)) = C(F1(x1), . . . , Fd(xd))

with the same arguments as in the proof of the first statement.
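The construction in this last step is also a sampling recipe: draw U ∼ C and set Xi := Fi⁻(Ui). The following Python sketch (our own; the copula M and the margins are arbitrary choices) checks the resulting distribution function H(x) = C(F1(x1), F2(x2)) at one point.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

u = rng.random(n)                 # C = M: U = (U, U), cf. Lemma 2.2.5
X1 = -np.log1p(-u)                # F1^-(u) = -ln(1 - u), exponential margin
X2 = 2.0 * u                      # F2^-(u) = 2u, U[0, 2] margin

x1, x2 = 1.0, 0.6
empirical = np.mean((X1 <= x1) & (X2 <= x2))
theoretical = min(1.0 - np.exp(-x1), x2 / 2.0)  # M(F1(x1), F2(x2))
print(empirical, theoretical)     # both approx. 0.3
```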

With Sklar’s theorem, it can now be clarified that Ĉ as in Definition 2.1.8 is called survival copula, because it connects the joint survival function with the marginal survival functions in the same way a copula connects a joint distribution function with its margins.

Corollary 2.3.5. Let H : R^d → R be a multivariate distribution function with univariate, continuous margins F1, . . . , Fd and a copula C as in Theorem 2.3.4. Then

H̄(x) = Ĉ(F̄1(x1), . . . , F̄d(xd))

holds for all x ∈ R^d, where H̄ and F̄i denote the joint and marginal survival functions, i. e. H̄(x) = P(X > x) for X ∼ H and F̄i := 1 − Fi.

Proof. Let X ∼ H and x ∈ R^d. In the proof of Sklar’s theorem, it was shown that the random vector (F1(X1), . . . , Fd(Xd))⊤ is distributed according to C. Then we get

(1 − F1(X1), . . . , 1 − Fd(Xd))⊤ ∼ Ĉ

by Definition 2.1.8, and

H̄(x) = P(X > x) = P(F1(X1) ≥ F1(x1), . . . , Fd(Xd) ≥ Fd(xd))
= P(1 − F1(X1) ≤ 1 − F1(x1), . . . , 1 − Fd(Xd) ≤ 1 − Fd(xd))
= Ĉ(F̄1(x1), . . . , F̄d(xd))

completes the proof.

Given a random vector X, it is often of interest to make inferences about the dependence structure of its components. By the decomposition of the multivariate distribution function as in Sklar’s theorem, it becomes clear that all the information about the dependence structure is contained in the copula. With the theorem’s statement about uniqueness of the copula, it becomes easy to see that the lack of any dependence (i. e. independence) of the components of X is linked to one special copula.

Corollary 2.3.6. For i ∈ {1, . . . , d}, let Xi ∼ Fi with Fi continuous, and let X := (X1, . . . , Xd)⊤ ∼ H with copula C. Then X is independent, if and only if C(u) = Π(u) = ∏_{i=1}^d ui holds for all u ∈ [0, 1]^d, i. e. the copula of X coincides with the independence copula.

Proof. The random vector X is independent if and only if H(x) = ∏_{i=1}^d Fi(xi), which is equivalent to H(x) = Π(F1(x1), . . . , Fd(xd)). As all Fi are assumed to be continuous, the copula is unique by Sklar’s theorem.


2.4 Some Important Classes

With M, Π and W, some important examples of copulas have already been introduced. In this section, we will present some more examples, which will be grouped according to certain shared properties or ways of construction.

2.4.1 Elliptical Copulas

The following definition of elliptical distributions is due to Cambanis et al. (1981). Sometimes these distributions are also referred to as “elliptically contoured distributions,” because contour plots of their densities yield concentric ellipses (or ellipsoids). We will use notation similar to Mroz (2012) and Hofert (2010).

Definition 2.4.1. Let H : R^d → [0, 1] be a distribution function and let X ∼ H. If there exist a vector µ ∈ R^d, a (symmetric) positive semi-definite matrix Σ ∈ R^{d×d} and a mapping φ : R → R, such that

E(e^{it⊤(X−µ)}) = φ(t⊤Σt)    (2.4.1)

holds for all t ∈ R^d (i. e. the characteristic function of X − µ is a function of t⊤Σt), then H is called elliptically contoured distribution or just elliptical distribution and we write X ∼ Ed(µ, Σ, φ).

Cambanis et al. (1981) introduce a stochastic representation of elliptically distributed random variables, which will be useful for sampling in Section 2.5. It is based on the following representation of φ from Definition 2.4.1 which, according to Cambanis et al. (1981), is due to Schoenberg (1938).

Lemma 2.4.2. A mapping φ(t⊤t) with t ∈ R^d is a characteristic function (for a d-dimensional distribution), if and only if

φ(t) = ∫₀^∞ Ψd(tx²) dFR(x)    (2.4.2)

holds for all t ≥ 0, where FR : [0, ∞) → [0, 1] is a distribution function and Ψd is a mapping, such that Ψd(t⊤t) is the characteristic function of a random vector U_{Ud} which is uniformly distributed on the d-dimensional unit hypersphere Ud = {x ∈ R^d | x⊤x = 1}.

Proof. See Schoenberg (1938) or the proof of Theorem 2.2 by Schmidt (2002).

Now, we are ready for the aforementioned representation of elliptically distributed random variables by Cambanis et al. (1981).

Proposition 2.4.3 (Cambanis et al. (1981)). Given a positive semi-definite matrix Σ ∈ R^{d×d} of rank k, X ∼ Ed(µ, Σ, φ) holds if and only if

X =d µ + R A U_{Uk}

holds, with U_{Uk} uniformly distributed on the k-dimensional unit hypersphere Uk (as in Lemma 2.4.2), R ≥ 0 independent of U_{Uk} with R ∼ FR as in (2.4.2) (thus related to φ), and A ∈ R^{d×k} of rank k, such that AA⊤ = Σ (i. e., a rank factorization of Σ).


Cambanis et al. (1981) also showed that the parameters µ, Σ and φ of a d-dimensional elliptical distribution are unique up to a constant factor (at least as long as the distribution is non-degenerate, i. e. the mass is not concentrated in a single point). A similar result holds for the representation of Proposition 2.4.3. The details are outlined in the following proposition, which is Theorem 3 of Cambanis et al. (1981).

Proposition 2.4.4 (Cambanis et al. (1981)). Let X be a d-dimensional random vector which is non-degenerate, i. e. P(X = x) < 1 for all x ∈ R^d. If X ∼ Ed(µ, Σ, φ) and X ∼ Ed(µ̃, Σ̃, φ̃) as in Definition 2.4.1, then there exists c > 0, such that

µ̃ = µ,  Σ̃ = cΣ,  φ̃(t) = φ(t/c)

holds for all t ∈ R. If X =d µ + R A U_{Uk} and X =d µ̃ + R̃ Ã U_{Uk} as in Proposition 2.4.3, then there exists c > 0, such that

µ̃ = µ,  Ã Ã⊤ = c A A⊤,  R̃ =d R/√c

holds.

The multivariate normal distribution is probably part of most lectures on probability and statistics. The multivariate t-distribution is also quite well known. As we will see in the following examples, both belong to the class of elliptical distributions. Some contour plots of their densities in two dimensions can be found in the appendix of Mroz (2012).

Example 2.4.5. Let Σ ∈ R^{d×d} be positive semi-definite, µ ∈ R^d and φ(x) = exp(−x/2). Then X ∼ Ed(µ, Σ, φ) if and only if X ∼ N(µ, Σ).
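For the normal case, Proposition 2.4.3 can be made concrete: with AA⊤ = Σ and R² following a χ²-distribution with d degrees of freedom (a standard fact which is not derived in the text), µ + RAU_{Ud} is N(µ, Σ)-distributed. A minimal Python sketch (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 100_000, 3
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.linalg.cholesky(Sigma)                     # A A^T = Sigma

Z = rng.standard_normal((n, d))
U = Z / np.linalg.norm(Z, axis=1, keepdims=True)  # uniform on the unit sphere
R = np.sqrt(rng.chisquare(d, size=n))             # radial part, independent of U

X = mu + R[:, None] * (U @ A.T)                   # X = mu + R A U, row-wise

print(X.mean(axis=0))                             # approx. mu
print(np.cov(X, rowvar=False))                    # approx. Sigma
```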

Example 2.4.6. Let µ ∈ R^d, let Σ ∈ R^{d×d} be positive semi-definite and A ∈ R^{d×k}, such that AA⊤ = Σ. Furthermore, let U_{Ud} be as in Proposition 2.4.3 and let R be independent of U_{Ud}, such that R²/d ∼ Fd,ν, i. e. R²/d has an F-distribution with d and ν degrees of freedom. Then X =d µ + R A U_{Ud} for X ∼ tν(µ, Σ), i. e. X has a multivariate t-distribution with ν degrees of freedom, location vector µ and scale matrix Σ; see Example 2.5 of Fang et al. (1990). Because of Proposition 2.4.3, X ∼ Ed(µ, Σ, φ) holds for some mapping φ.

Definition 2.4.7. Let X ∼ Ed(µ, Σ, φ); then the copula C of X is called elliptical copula.

The copula of a multivariate normal distribution is called the Gaussian copula; the copula of a multivariate tν-distribution is called tν-copula. Of course, as there is no analytical expression for the distribution function of the multivariate normal distribution or the multivariate tν-distribution, there is no analytical expression for the corresponding copula. By Sklar’s theorem, the distribution function can be given as an integral over a density in both cases, see Hofert (2010), or Demarta and McNeil (2005) for additional information about the tν-copula.

2.4.2 Archimedean Copulas

One reason Archimedean copulas became popular might be that they capture the dependence structure of a multivariate random vector in a (in most cases) simple generating function, often with only one parameter. Nonetheless, a broad range of copulas belongs to this class. They share a common construction principle, which will become clear in Definition 2.4.9.


Definition 2.4.8. A mapping ϕ : [0, 1] → [0, ∞] is called (Archimedean) generator or generating function, if it is continuous, strictly decreasing and ϕ(1) = 0. The generator is called strict if ϕ(0) = ∞ and non-strict if ϕ(0) < ∞.

Obviously, strict generators are invertible. If ϕ is non-strict, it is strictly decreasing by definition and therefore its inverse exists on [0, ϕ(0)]. This inverse can be extended to ϕ⁻ : [0, ∞] → [0, 1] by

ϕ⁻(y) := ϕ⁻¹(y) if y ∈ [0, ϕ(0)],  and  ϕ⁻(y) := 0 if y > ϕ(0).    (2.4.3)

Note that ϕ⁻ is not a generalized inverse as in Definition 2.3.1, where non-decreasingness was assumed. With such a generator, we are ready to define what makes a copula Archimedean.

Definition 2.4.9. A copula C : [0, 1]^d → [0, 1] is called Archimedean copula if there exists a generator ϕ, such that

C(u) = ϕ⁻(∑_{i=1}^d ϕ(ui))

holds for all u ∈ [0, 1]^d.

In some cases the definition of ψ := ϕ⁻ as a generator is more convenient. Then a non-increasing, continuous mapping ψ : [0, ∞] → [0, 1] with ψ(0) = 1 and ψ(∞) = lim_{x→∞} ψ(x) = 0 is called a generator, if it is strictly decreasing on the interval [0, x0], where

x0 := inf{x ∈ [0, ∞] | ψ(x) = 0}.

Similarly to (2.4.3), the inverse ψ⁻¹ on (0, 1] can be extended to ψ⁻ on [0, 1] by ψ⁻(0) := x0. The Archimedean copula then takes the form C(u) = ψ(∑_{i=1}^d ψ⁻(ui)).

According to Nelsen (2006), these copulas have been given the name “Archimedean”, as they share a property similar to the Archimedean axiom, i. e. for every a, b > 0 there exists n ∈ N, such that na > b. How this relates to Archimedean copulas will become clear in the following theorem.

Theorem 2.4.10. Let ϕ be a generator and ϕ⁻ as in (2.4.3). Then for every u, v ∈ (0, 1), there exists d ∈ N \ {1}, such that ϕ⁻(∑_{i=1}^d ϕ(u)) < v.

Proof. As u, v < 1, ϕ(1) = 0 and ϕ is strictly decreasing by definition, we get ϕ(u), ϕ(v) > 0. By the Archimedean axiom, there exists d ∈ N, such that dϕ(u) > ϕ(v). This remains true for all larger d and therefore, we can assume d ≥ 2. Now if dϕ(u) > ϕ(0), then ϕ⁻(dϕ(u)) = 0 < v. Otherwise dϕ(u), ϕ(v) ∈ (0, ϕ(0)], where ϕ⁻ is strictly decreasing, i. e. the inverse ϕ⁻¹ exists, and thus ϕ⁻(dϕ(u)) < ϕ⁻(ϕ(v)) = v.

This result can be easily translated to Archimedean copulas, as long as it is known when a given generator ϕ generates a valid copula. This means the question is for which d ∈ N the mapping C : [0, 1]^d → [0, 1] with C(u) = ϕ⁻(∑_{i=1}^d ϕ(ui)) fulfills the requirements of Definition 2.1.5. It follows from the definition that all generated mappings are grounded and have uniform margins, but for d-increasingness, additional information about the generator is required.

Theorem 2.4.11. Let ϕ be a generator and ϕ⁻ as in (2.4.3); then C : [0, 1]² → [0, 1] with C(u, v) = ϕ⁻(ϕ(u) + ϕ(v)) is a copula if and only if ϕ is convex.


For a proof, see Nelsen (2006, Theorem 4.1.4, pages 111–112), Schweizer and Sklar (1983) or Alsina et al. (2006). In the following example (which is basically Example 4.2 of Nelsen (2006)), the existence of such copulas is shown by presenting generators for the independence copula Π from Example 2.1.6 and the lower Fréchet–Hoeffding bound W from Definition 2.2.3.

Example 2.4.12. The mappings ϕΠ, ϕW : [0, 1] → [0, ∞] with

ϕΠ(t) := −ln(t),  ϕW(t) := 1 − t

for t ∈ (0, 1] and ϕΠ(0) := ∞, ϕW(0) := 1 meet the requirements of Definition 2.4.8 and therefore are generators. Their respective inverses as in (2.4.3) are given by

ϕΠ⁻(y) := e^{−y},  ϕW⁻(y) := (1 − y)1(y ≤ 1)

for y ∈ [0, ∞) and ϕΠ⁻(∞) := 0, ϕW⁻(∞) := 0. The indicator function 1 is defined as usual, i. e. if y ≤ 1, then 1(y ≤ 1) = 1, else 1(y ≤ 1) = 0. Simple calculations show that

ϕΠ⁻(ϕΠ(u) + ϕΠ(v)) = Π(u, v),  ϕW⁻(ϕW(u) + ϕW(v)) = W(u, v)

holds for all (u, v) ∈ [0, 1]².

The question whether all copulas are Archimedean is answered in the next example, regarding the upper Fréchet–Hoeffding bound M from Definition 2.2.3.

Example 2.4.13. Assume that M is Archimedean, i. e. there exists a generator ϕM, such that M(u) = ϕM⁻(∑_{i=1}^d ϕM(ui)) for all u ∈ [0, 1]^d and all d ≥ 2. Now let u := 1/2 and ui := u for all i ∈ N; then M(u1, . . . , ud) = 1/2 for all d. With v := 1/4 we get

ϕM⁻(∑_{i=1}^d ϕM(u)) = 1/2 > 1/4 = v

for all d, which contradicts Theorem 2.4.10. Therefore there exists no such generator and M is not Archimedean.

Both mappings ϕΠ and ϕW in Example 2.4.12 are convex generators. For dimensions d > 2, they still generate Π and W, but the latter fails to be a valid copula (see Lemma 2.2.5). Therefore, an analogue of Theorem 2.4.11 for arbitrary dimensions will have to assume more than just convexity of the generator. Instead, complete monotonicity, as in the following definition, is needed.

Definition 2.4.14. A function f is called completely monotone on an interval J, if it is continuous on J, derivatives f^(d) of all orders exist on the interior J° of J and

(−1)^d f^(d)(x) ≥ 0

holds for all x ∈ J° and all d ∈ N ∪ {0}.

Theorem 2.4.15 (Kimberling (1974)). Let ϕ be a generator and ϕ⁻ as in (2.4.3); then C : [0, 1]^d → [0, 1] with C(u) = ϕ⁻(∑_{i=1}^d ϕ(ui)) is a copula for all d ≥ 2 if and only if ϕ⁻ is completely monotone on [0, ∞).

An alternative proof can be found in Ressel (2011). A direct consequence of Theorem 3 in Kimberling (1974) is that all completely monotone generators are strict. Bernstein (1929) showed that completely monotone generators are just inverses of Laplace transforms of distribution functions. In the literature this result is called Bernstein’s theorem. A proof can be found in Feller (1971).


Theorem 2.4.16 (Bernstein (1929)). Given a function ψ : [0, ∞) → R, there exists a distribution function F with F(0) = 0, such that

ψ(t) = ∫₀^∞ e^{−xt} dF(x)

for all t ∈ [0, ∞), if and only if ψ is completely monotone and ψ(0) = 1.

If the generated function C is only needed to be a copula up to a fixed dimension d, then the assumption of complete monotonicity may be relaxed. This was independently discovered by Malov (2001) as well as by McNeil and Nešlehová (2009). Before stating the theorem, the notion of d-monotonicity is introduced.

Definition 2.4.17. Given d ∈ N \ {1}, a function f is called d-monotone on an interval J, if it is continuous on J, derivatives f^(k) exist on the interior J° for all k ∈ {1, . . . , d − 2}, (−1)^{d−2} f^(d−2) is non-increasing and convex on J°, and

(−1)^k f^(k)(x) ≥ 0

holds for all x ∈ J° and all k ∈ {0, 1, . . . , d − 2}.

Theorem 2.4.18 (McNeil and Nešlehová (2009); Malov (2001)). Let ϕ be a generator and ϕ⁻ as in (2.4.3); then C : [0, 1]^d → [0, 1] with C(u) = ϕ⁻(∑_{i=1}^d ϕ(ui)) is a copula if and only if ϕ⁻ is d-monotone on [0, ∞).

As d-monotonicity implies k-monotonicity for 2 ≤ k ≤ d, the generator of a d-dimensional copula generates copulas in any dimension k ≤ d as well. This also follows from the fact that if ϕ generates the d-dimensional copula C, then it also generates its lower-dimensional margins (as ϕ(1) = 0 for every generator ϕ and by Definition 2.4.9). It is easy to verify that these lower-dimensional margins are copulas themselves. In the following example, which is Example 2.2 in McNeil and Nešlehová (2009), a generator is introduced whose inverse (as in (2.4.3)) is d-monotone but not (d + 1)-monotone for a fixed d ≥ 2.

Example 2.4.19. Let d ≥ 2 and ϕ_L : [0, 1] → [0, 1] with ϕ_L(t) := 1 − t^{1/(d−1)}. Then ϕ_L is a generator and the inverse ϕ_L⁻(t) = max{ (1 − t)^{d−1}, 0 } as in (2.4.3) is d-monotone, but not (d + 1)-monotone.

For d = 2, the generator ϕ_L obviously coincides with the generator ϕ_W of the lower Fréchet-Hoeffding-bound from Example 2.4.12. In fact, it generates a lower bound for the set of Archimedean copulas.

Proposition 2.4.20. Let C : [0, 1]^d → [0, 1] be an Archimedean copula. Then

    C_L(u) := ( max{ ∑_{i=1}^d u_i^{1/(d−1)} − (d − 1), 0 } )^{d−1} ≤ C(u) ≤ M(u)

holds for all u ∈ [0, 1]^d and the bounds are best possible.

The first inequality is Proposition 4.6 of McNeil and Nešlehová (2009) and it is best possible, as C_L is an Archimedean copula itself (see Example 2.2 in McNeil and Nešlehová (2009)). The second inequality was established in Theorem 2.2.4 for all copulas. Although M is not Archimedean (see Example 2.4.13), it is still best possible, as for example the parametric Archimedean copula C_θ from Example 2.4.21 converges pointwise to M for θ → ∞.


One advantage of Archimedean copulas is that, if the generator is endowed with a parameter, the dependence structure of a distribution, contained in the copula by Sklar's theorem, is captured by just one real number, namely said parameter. In the following examples, some popular parametric Archimedean copulas are presented. Even more may be found in Table 4.1 in Nelsen (2006).

Example 2.4.21. According to Nelsen (2006), the generator ϕ_θ(t) := (1/θ)(t^{−θ} − 1) was first discussed by Clayton (1978) and therefore, with the inverse generator ϕ_θ⁻(t) = (θt + 1)^{−1/θ} 1(θt + 1 > 0) as in (2.4.3), the generated copula

    C_θ(u) = ( max{ ∑_{i=1}^d u_i^{−θ} − (d − 1), 0 } )^{−1/θ}

is known as Clayton-copula. For a fixed d ≥ 2 this is a valid copula if and only if θ ∈ [−1/(d − 1), ∞) \ {0} (see Example 2.3 in McNeil and Nešlehová (2009)). The generator is strict for θ > 0 and non-strict for θ < 0. Let u ∈ [0, 1]^d and θ = −1/(d − 1); then C_θ(u) coincides with the lower bound C_L(u) from Proposition 2.4.20. For θ → 0, we get C_θ(u) → Π(u) and, for θ → ∞, we get C_θ(u) → M(u).

Example 2.4.22. According to Nelsen (2006), the generator ϕ_θ(t) := −ln( (exp(−θt) − 1)/(exp(−θ) − 1) ) was first discussed by Frank (1979) and therefore, with the inverse generator ϕ_θ⁻(t) = −(1/θ) ln( 1 + (exp(−θ) − 1)/exp(t) ) as in (2.4.3), the generated copula

    C_θ(u) = −(1/θ) ln( 1 + (e^{−θ} − 1)^{−d+1} ∏_{i=1}^d (e^{−θu_i} − 1) )

is known as Frank-copula. If θ > 0, then C_θ is a valid copula for all d ≥ 2 (see e.g. Example 4.24 in Nelsen (2006)). If θ < 0 and 2 ≤ d ≤ 6, then C_θ is a copula if and only if θ ∈ Θ as in Table 2.1. In order to give a limit for dimensions d > 6 via Theorem 2.4.18, the roots of a polynomial of degree larger than four have to be computed. The generator is strict for all θ ∈ Θ. Let u ∈ [0, 1]^d; then C_θ(u) → W(u) for θ → −∞ and d = 2. For d ≥ 3, the lower bound C_L from Proposition 2.4.20 is not attained. For any dimension d ≥ 2 and θ → 0 we get C_θ(u) → Π(u), as well as C_θ(u) → M(u) for θ → ∞.

Table 2.1: Parameter space Θ for a d-dimensional Frank-copula

    d    Θ                                                        θ >
    2    (−∞, 0) ∪ (0, ∞)                                         −∞
    3    [−ln 2, 0) ∪ (0, ∞)                                      −0.6931
    4    [−ln(3 − √3), 0) ∪ (0, ∞)                                −0.2374
    5    [−ln(6 − 2√6), 0) ∪ (0, ∞)                               −0.0962
    6    [−ln(15 + √105 − √(270 + 26√105)) + ln 2, 0) ∪ (0, ∞)    −0.0421

Example 2.4.23. According to Nelsen (2006), the generator ϕ_θ(t) := (−ln t)^θ was first discussed by Gumbel (1960) and therefore, with the inverse generator ϕ_θ⁻(t) = exp(−t^{1/θ}) as in (2.4.3), the generated copula

    C_θ(u) = exp( −( ∑_{i=1}^d (−ln u_i)^θ )^{1/θ} )

is known as Gumbel-copula. The inverse generator ϕ_θ⁻ as in (2.4.3) is completely monotone for θ ∈ [1, ∞) (see, e.g., Example 4.25 by Nelsen (2006)) and, because of Theorem 2.4.15, C_θ is a valid copula in any dimension. The generator is strict for all θ ∈ [1, ∞). Let u ∈ [0, 1]^d; then C_θ(u) → M(u) for θ → ∞ and obviously C_1(u) = Π(u).
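To make the three families concrete, the following minimal R sketch (an illustration added here, not part of the original text; the function names are ad hoc) evaluates the generated copulas directly from the formulas of Examples 2.4.21 to 2.4.23 as C(u) = ϕ⁻( ∑_i ϕ(u_i) ).

    ## Minimal R sketch of the Clayton-, Frank- and Gumbel-copulas.
    clayton <- function(u, theta)
      max(sum(u^(-theta)) - (length(u) - 1), 0)^(-1/theta)
    frank <- function(u, theta)
      -log1p((exp(-theta) - 1)^(1 - length(u)) * prod(expm1(-theta * u))) / theta
    gumbel <- function(u, theta)
      exp(-sum((-log(u))^theta)^(1/theta))
    u <- c(0.3, 0.7)
    clayton(u, 2)   # 0.2868..., between W(u) = 0 and M(u) = 0.3
    frank(u, 5)     # a value between W(u) and M(u) as well
    gumbel(u, 1)    # 0.21 = Pi(u), since theta = 1 gives the independence copula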

2.4.3 Nested Archimedean Copulas

We will see that an even larger class of copulas can be created with Archimedean generators by nesting Archimedean copulas within each other.

Definition 2.4.24. Let ϕ be a generator and ϕ⁻ as in (2.4.3). Let furthermore k < d, j_1 = 1, j_{k+1} = d + 1 and j_i < j_{i+1} for i ∈ {1, . . . , k}. If C : [0, 1]^d → [0, 1] with

    C(u) := ϕ⁻( ∑_{i=1}^k ϕ( C_i(u_{j_i}, . . . , u_{j_{i+1}−1}) ) ),

where C_i are Archimedean copulas (or the identity function if C_i is one-dimensional), is a valid copula, then C is called nested Archimedean copula. If, additionally, for one or more i ∈ {1, . . . , k}, C_i is a nested Archimedean copula itself, C is also called nested Archimedean copula.

Some authors use the term hierarchical Archimedean copula instead of nested Archimedean copula. However, not every combination of generators always yields a valid copula. McNeil (2008) gives a sufficient condition, namely complete monotonicity of (ϕ ∘ ϕ_i⁻)′ for all i ∈ {1, . . . , k}, where ϕ_i is the generator of C_i in Definition 2.4.24. Hofert (2010) examined, among others, the case where parametric generators of the same family are used. For most families of Archimedean copulas, including those of Clayton, Frank, and Gumbel, it is sufficient that the parameter of the inner generator is not smaller than the parameter of the outer generator. For the Archimedean copulas presented in the examples 2.4.21, 2.4.22, and 2.4.23, a larger value of the parameter coincides with a larger value of several dependence measures, which will be introduced in Section 2.6.

Example 2.4.25. Let C_1 and C_2 be bivariate Gumbel-copulas with parameters θ_1 = 1 and θ_2 = 2, respectively. Then the mapping C_{1,2} : [0, 1]³ → [0, 1] with

    C_{1,2}(u, v, w) := C_1( u, C_2(v, w) ) = u C_2(v, w)

is a copula. Note that C_{1,2} is not Archimedean, because for every three-dimensional Archimedean copula C, the two-dimensional margins of C coincide (as they are all generated by the same generator, namely the generator of C). Especially, C(1, v, w) = C(v, 1, w) holds for all v, w ∈ [0, 1]. But, for example, with v = 1/2 and w = 1/3, we get

    C_{1,2}(1, v, w) = C_2(v, w) > 1/6 = C_1(v, w) = C_{1,2}(v, 1, w)

and therefore C_{1,2} is a nested Archimedean copula, which is not Archimedean.
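The inequality of this example is easily checked numerically; the following short R sketch (an illustration, not part of the original text) evaluates both bivariate margins of C_{1,2}.

    ## Numerical check of Example 2.4.25: the two bivariate margins differ.
    gumbel2 <- function(u, v, theta)
      exp(-((-log(u))^theta + (-log(v))^theta)^(1/theta))
    c12 <- function(u, v, w) u * gumbel2(v, w, theta = 2)  # C1 (theta = 1) is Pi
    c12(1, 1/2, 1/3)   # = C2(1/2, 1/3) = 0.2728... > 1/6
    c12(1/2, 1, 1/3)   # = 1/6, so C_{1,2}(1, v, w) != C_{1,2}(v, 1, w)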

If the copulas C_i in Definition 2.4.24 are all generated by the same generator as the outer copula, namely ϕ from Definition 2.4.24, then the nested Archimedean copula C is equal to a (non-nested) d-dimensional Archimedean copula generated by ϕ. Together with Example 2.4.25, it becomes clear that the set of nested Archimedean copulas contains, but is not equal to, the set of all Archimedean copulas.


2.5 Sampling

The ability to generate samples of a given uni- or multivariate distribution is not only useful when examining statistical inference procedures in a simulation study, but also in applications like Monte-Carlo-simulations. In the absence of real random number generators, for example based on radioactive decay, digits from tables of random numbers were extracted (see e.g. RAND Corporation (1955)). Nowadays, with sufficient computational power, pseudorandom number generators are used instead. A pseudorandom number generator is a deterministic algorithm that outputs a sequence of numbers which should be indistinguishable from a sequence of random numbers, as far as certain statistical properties are concerned. Among others, subsequent elements of the output sequence should not be significantly dependent. But whenever a pseudorandom number generator is put into a specific initial status, the same sequence of numbers will be produced. At first glance, this might seem as oxymoronic as the aforementioned tables of random numbers, but it assures reproducibility of simulations. In our simulations, we used the Mersenne-twister-algorithm suggested by Matsumoto and Nishimura (1998) as implemented in R (see R Core Team (2014)). This algorithm outputs a sequence of 32-bit integers, which can be easily transformed to the unit interval. Still, the realizations are discrete in the sense that there are at most 2^32 different outcomes, as opposed to infinitely many numbers in the unit interval. Nonetheless, we may assume that a realization of a sequence of independent and standard uniformly distributed random variables can be created.

Algorithm 2.1 Sampling from a continuous, univariate distribution F

    Generate a realization u of a random variable U ∼ U[0, 1]
    x ← F⁻(u)
    return x

One way to sample from a continuous, univariate distribution can be derived from Theorem 2.3.3 and is given in Algorithm 2.1. This, as well as other ways of sampling from univariate distributions, can be found in Devroye (1986). Of course, this works best whenever F⁻ is given in a closed form.
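As an illustration (not part of the original text), the following R sketch applies Algorithm 2.1 to the exponential distribution with rate 2, whose generalized inverse F⁻(u) = −ln(1 − u)/2 is available in closed form.

    ## Algorithm 2.1 in R for F = Exp(2).
    set.seed(1)             # fix the initial status of the Mersenne twister
    u <- runif(1e5)         # realizations of U ~ U[0, 1]
    x <- -log(1 - u) / 2    # x <- F^-(u)
    mean(x)                 # close to 1/2, the expectation of Exp(2)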

According to Joe (2015), the conditional approach of Algorithm 2.2 for multivariate distributions is also known as Rosenblatt-transform, as it is based on the following theorem.

Theorem 2.5.1 (Rosenblatt (1952)). Let H : R^d → [0, 1] be a multivariate distribution function and X ∼ H. Consider U with U_1 := F_1(X_1) (where X_1 ∼ F_1) and

    U_i := F_{i|1,...,i−1}(X_i | X_1, . . . , X_{i−1})

for i ∈ {2, . . . , d}, where F_{i|1,...,i−1}(x_i | x_1, . . . , x_{i−1}) := P(X_i ≤ x_i | X_1 = x_1, . . . , X_{i−1} = x_{i−1}) is a conditional margin of H. Then U ∼ Π, i.e. U_1, . . . , U_d are i.i.d. and U_1 ∼ U[0, 1].

Now, instead of transforming a vector X with a multivariate distribution to a vector U on the unit hypercube with independent and standard uniformly distributed components, in Algorithm 2.2 (as in Joe (2015)) the transformation of Theorem 2.5.1 is inverted in order to transform d independent samples from U[0, 1] to one sample from a d-dimensional copula C. The correctness of Algorithm 2.2 is established in Lemma 2.5.2.

Algorithm 2.2 Sampling from a d-dimensional copula C

    Generate independent realizations u_1, . . . , u_d of a random variable U ∼ U[0, 1]
    x_1 ← u_1
    for i ∈ {2, . . . , d} do
        x_i ← C⁻_{i|1,...,i−1}(u_i | u_1, . . . , u_{i−1}) with C⁻_{i|1,...,i−1} as in Lemma 2.5.2
    end for
    return (x_1, . . . , x_d)

Lemma 2.5.2. Let X_1, . . . , X_d ∼ U[0, 1] be independent. Furthermore, let C : [0, 1]^d → [0, 1] be a copula, V ∼ C and let C_{i|1,...,i−1} be the conditional probability of V_i, given V_1, . . . , V_{i−1}, i.e.

    C_{i|1,...,i−1}(u_i | u_1, . . . , u_{i−1}) := P(V_i ≤ u_i | V_1 = u_1, . . . , V_{i−1} = u_{i−1})

for i ∈ {2, . . . , d} and u_1, . . . , u_i ∈ [0, 1]. Consider U with U_1 := X_1 and

    U_i := C⁻_{i|1,...,i−1}(X_i | U_1, . . . , U_{i−1})

for i ∈ {2, . . . , d}; then U ∼ C.

Proof. See Joe (2015, page 271) for d = 2 and d = 3 or Hofert (2010, Theorem 1.8.1).

If the generalized inverses C⁻_{i|1,...,i−1} of the conditional distributions C_{i|1,...,i−1} are not known in closed form, numeric approximations can be used in Algorithm 2.2, at the cost of substantially slowing down computation. When continuous partial derivatives of the copula exist, the conditional distributions C_{i|1,...,i−1} can be expressed in terms of partial derivatives of marginal copulas, see Theorem 1.8.2 of Hofert (2010) or Theorem 2.27 of Schmitz (2003).
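For a bivariate Clayton-copula (Example 2.4.21), differentiating C_θ with respect to the first argument and inverting gives the closed form C⁻_{2|1}(p | u) = ( u^{−θ}(p^{−θ/(θ+1)} − 1) + 1 )^{−1/θ}, so Algorithm 2.2 fits into a few lines of R; the following sketch is an illustration, not part of the original text.

    ## Algorithm 2.2 for a bivariate Clayton-copula via C^-_{2|1}.
    rclayton2 <- function(n, theta) {
      u <- runif(n)                                               # x1 <- u1
      p <- runif(n)                                               # u2
      v <- (u^(-theta) * (p^(-theta/(theta + 1)) - 1) + 1)^(-1/theta)
      cbind(u, v)                                                 # a sample from C_theta
    }
    set.seed(42)
    uv <- rclayton2(2e3, theta = 2)
    cor(uv[, 1], uv[, 2], method = "kendall")  # near theta/(theta+2) = 0.5, cf. Table 2.2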

If a sample from a multivariate distribution H with a given copula C and marginal distributions F_1, . . . , F_d is needed, then this sample can be generated via Sklar's theorem and the above algorithms in the following way: First, generate a realization U ∼ C with Algorithm 2.2. Second, use the components U_1, . . . , U_d of U as input for Algorithm 2.1 (of course, F has to be replaced by F_i when the input is U_i) and thus create X_1, . . . , X_d. Then, due to Algorithm 2.1, X_i ∼ F_i for all i ∈ {1, . . . , d} and, by Sklar's theorem, X ∼ H. Of course, with the inverse Rosenblatt-transformation, it is also possible to sample from H directly, without computing C beforehand. To this end, C⁻_{i|1,...,i−1} in Algorithm 2.2 has to be replaced by F⁻_{i|1,...,i−1} from Theorem 2.5.1.

For most copula classes, there are more convenient ways to generate a sample than Algorithm 2.2. The same holds true when sampling from an elliptical distribution as in Section 2.4.1. Algorithm 2.3, based on Algorithm 1.9.4 of Hofert (2010), uses the representation of a random vector with elliptical distribution as in Proposition 2.4.3. Cambanis et al. (1981) state that Proposition 2.4.3 still holds true if A ∈ R^{d×d} is not a rank factorization of Σ, but AA^⊤ = Σ still holds. Therefore, in the first step of Algorithm 2.3, a Cholesky-factorization may be used to compute A ∈ R^{d×d}. In Section 4.2.8 of Golub and Van Loan (2013), an algorithm for computing such a matrix A is given, even when Σ is not positive definite, but only positive semi-definite. A direct consequence of Lemma 1 of Cambanis et al. (1981) is

    U_{U_k} =_d Y / ‖Y‖,    (2.5.1)

where Y ∼ N(0, I_k) (see also Corollary 3.23 in McNeil et al. (2005)).

Algorithm 2.3 Sampling from an elliptical distribution E_d(µ, Σ, φ)

    Compute A ∈ R^{d×k}, such that AA^⊤ = Σ.
    Generate a realization u_{U_k} of U_{U_k} as in Lemma 2.4.2.    ▷ see (2.5.1)
    Generate a realization r of R ∼ F_R as in (2.4.2), independent of u_{U_k}.
    x ← µ + rAu_{U_k}
    return x

A sample from the corresponding elliptical copula can be generated by application of Sklar's theorem. To be precise, whenever X ∼ H for a d-dimensional distribution H with one-dimensional (continuous) margins F_1, . . . , F_d and copula C, then U ∼ C, where U_i := F_i(X_i). When samples from the Gaussian copula of the multivariate normal distribution of Example 2.4.5 and from the tν-copula of the multivariate tν-distribution of Example 2.4.6 are generated by Algorithm 2.4 and Algorithm 2.5, respectively, there is, conveniently, no need for a sample from F_R as in Algorithm 2.3. However, this comes at a price. Instead of F_R, the cumulative distribution functions of the univariate normal distribution or the univariate tν-distribution, both of which are not given in closed form, have to be evaluated. At least there is no need for inverting the standard normal distribution function when generating realizations of N(0, 1) by using the well-known Box-Muller-transform (see Box and Muller (1958) or Algorithm P in Section 3.4.1 of Knuth (1998)) instead of Algorithm 2.1. Algorithm 2.4 is a version of Algorithm 5.9 of McNeil et al. (2005). Algorithm 2.5 is a version of the algorithm given in Section 2.2 of Demarta and McNeil (2005). As mentioned before, a decomposition AA^⊤ = Σ exists not only for positive definite, but also for positive semi-definite matrices Σ according to Golub and Van Loan (2013, Section 4.2.8).

Algorithm 2.4 Sampling from a d-dimensional Gaussian copula with correlation matrix Σ

    Compute A ∈ R^{d×d}, such that AA^⊤ = Σ.
    Generate d independent realizations x_1, . . . , x_d of X ∼ N(0, 1).
    y ← Ax
    (u_1, . . . , u_d) ← (Φ(y_1), . . . , Φ(y_d))    ▷ Φ is the N(0, 1)-cdf.
    return u

Algorithm 2.5 Sampling from a d-dimensional tν-copula with correlation matrix Σ

    Compute A ∈ R^{d×d}, such that AA^⊤ = Σ.
    Generate d independent realizations x_1, . . . , x_d of X ∼ N(0, 1).
    Generate a realization w of W ∼ χ²_ν independent of x.
    y ← √(ν/w) Ax    ▷ y is a realization of Y ∼ tν(0, Σ).
    (u_1, . . . , u_d) ← (t_ν(y_1), . . . , t_ν(y_d))    ▷ t_ν is the tν(0, 1)-cdf.
    return u
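A direct transcription of Algorithm 2.4 into R for d = 2 may look as follows (an illustration, not part of the original text); chol() provides a factor with AA^⊤ = Σ.

    ## Algorithm 2.4 in R for d = 2 and correlation 0.7.
    Sigma <- matrix(c(1, 0.7, 0.7, 1), 2, 2)
    A <- t(chol(Sigma))                    # lower triangular, A %*% t(A) = Sigma
    set.seed(7)
    Z <- matrix(rnorm(2 * 1e4), nrow = 2)  # two independent N(0, 1) values per column
    U <- t(pnorm(A %*% Z))                 # y <- Ax, then the N(0, 1)-cdf componentwise
    cor(U)[1, 2]                           # near (6/pi) arcsin(0.7/2) = 0.683, cf. (2.6.3)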

In some cases, sampling from Archimedean copulas is also possible without the use of the conditional approach of Algorithm 2.2. In Section 5 of Marshall and Olkin (1988), an algorithm for sampling from a broad range of distributions based on mixtures of distributions on (0, ∞)^d and copulas is presented. The special case of a mixture of M and Π yields an algorithm for sampling from Archimedean copulas whose generators ϕ are inverses of Laplace-transforms of distribution functions. According to Bernstein's theorem (see Theorem 2.4.16), these are all generators ϕ, such that ϕ⁻ is completely monotone. Algorithm 2.6 is based on Algorithm 2.5.1 of Hofert (2010) and Section 6.9.4 of Joe (2015). For the Clayton-, Frank-, and Gumbel-family of Examples 2.4.21, 2.4.22, and 2.4.23, ϕ_θ⁻ is completely monotone if and only if θ > 0. The corresponding distribution F_ψ, such that ϕ⁻ is the Laplace-transform of F_ψ, is a gamma-distribution, a logarithmic distribution, and a stable distribution, respectively. For details on this and more examples, see Table 2.1 of Hofert (2010) or Appendix A of Joe (2015). Obviously, Algorithm 2.6 cannot be used to sample from a Clayton- or Frank-copula with θ < 0 as, in this case, there is no distribution F_ψ, such that ϕ⁻ is the Laplace-transform of F_ψ.


Algorithm 2.6 Sampling from a d-dimensional Archimedean copula whose generator ϕ is the inverse of the Laplace transform of a distribution F_ψ

    Generate a realization x of X ∼ F_ψ.
    Generate d independent realizations v_1, . . . , v_d of V ∼ U[0, 1], independent of x.
    (y_1, . . . , y_d) ← (1/x)(−ln v_1, . . . , −ln v_d)    ▷ y_1, . . . , y_d are realizations of Y ∼ Exp(x).
    (u_1, . . . , u_d) ← (ϕ⁻(y_1), . . . , ϕ⁻(y_d))
    return u
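For a Clayton-copula with θ > 0, a gamma distribution with shape and rate both 1/θ has Laplace transform (θt + 1)^{−1/θ}, i.e. exactly ϕ_θ⁻ from Example 2.4.21; the following R sketch of Algorithm 2.6 (an illustration, not part of the original text) uses this choice of F_ψ.

    ## Algorithm 2.6 for a d-dimensional Clayton-copula, theta > 0.
    rclayton_mo <- function(n, d, theta) {
      x <- rgamma(n, shape = 1/theta, rate = 1/theta)  # X ~ F_psi
      v <- matrix(runif(n * d), n, d)                  # V ~ U[0, 1], independent of x
      y <- -log(v) / x                                 # Y ~ Exp(x), rowwise
      (theta * y + 1)^(-1/theta)                       # u <- phi^-(y)
    }
    set.seed(11)
    U <- rclayton_mo(2e3, d = 3, theta = 2)
    cor(U, method = "kendall")   # off-diagonal entries near theta/(theta+2) = 0.5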

According to Hofert (2010), algorithms for sampling nested Archimedean copulas were introduced by McNeil (2008). Just as Algorithm 2.6, they depend on sampling from a distribution F_ψ connected to the inverse generator via the Laplace-transform and on sampling from nested Archimedean copulas recursively. The recursion always ends with sampling from an Archimedean copula as in Algorithm 2.6. Of course, the algorithm has to be adapted to the given nesting structure, which is mainly notationally demanding. For more information see also the works of Hofert (2008), Hofert (2011) when efficiency is an issue, and Hofert and Pham (2013) for densities of nested Archimedean copulas.

When using R (see R Core Team (2014)), in most situations one can rely on the above algorithms being implemented in the package copula (see Hofert et al. (2015)). It is based on Yan (2007) and Kojadinovic and Yan (2010). Recently, the package nacopula by Hofert and Mächler (2011), which, among others, provides functions for sampling from nested Archimedean copulas, was merged into the package copula.

2.6 Measures of Association

Given two random variables X ∼ F_1 and Y ∼ F_2 with F_1, F_2 continuous and joint distribution H, it was shown in Corollary 2.3.6 that X and Y are independent if and only if their copula coincides with the independence copula Π. In that case, the distribution of one random variable is not altered by conditioning on the other random variable. Therefore, information about the outcome of X cannot be gathered by observing the outcome of Y (and vice versa). Without independence, i.e. in the case of dependence, the dependence structure is captured by the copula as well, by Sklar's theorem. Various ways of measuring the extent of dependence between two random variables and expressing it in a single number have been proposed. Some of these concepts are presented in this section.

2.6.1 Correlation Coefficient

According to Rodgers and Nicewander (1988), Karl Pearson introduced the correlation coefficient in Pearson (1895). It was based on work by Sir Francis Galton and others (see Table 1 of Rodgers and Nicewander (1988)), but nevertheless, today it is widely known as Pearson's correlation coefficient.

Definition 2.6.1. Given two random variables X and Y, such that m_X := E(X), m_Y := E(Y), σ²_X := var(X), σ²_Y := var(Y) are finite and σ_X σ_Y ≠ 0, we call

    ρ_{X,Y} := cov(X, Y) / (σ_X σ_Y)

Pearson's correlation coefficient of X and Y, or just correlation coefficient, where cov(X, Y) = E( (X − m_X)(Y − m_Y) ) denotes the covariance, as usual.


The correlation coefficient takes values in [−1, 1], where ρ_{X,Y} = ±1 if and only if Y is almost surely a linear transformation of X, to be precise P(aX + b = Y) = 1 for some a ≠ 0 and b ∈ R. The case ρ_{X,Y} = 1 corresponds to a > 0 and ρ_{X,Y} = −1 corresponds to a < 0. If X and Y are independent, then ρ_{X,Y} = 0. But as the correlation coefficient captures only linear dependence, the converse is not true in general. This is demonstrated in Example 2.6.2. However, if (X, Y) ∼ N(µ, Σ), then ρ_{X,Y} = 0 is equivalent to X and Y being independent.

Example 2.6.2. Let X ∼ U[−1, 1] and Y := X². Then ρ_{X,Y} = 0 but, for example,

    P(X ≤ x, Y ≤ y) = 1/2 ≠ 3/8 = P(X ≤ x) P(Y ≤ y)

for x = 1/2 and y = 1/4.
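The probabilities of this example are quickly confirmed by simulation; the following R lines are an illustration and not part of the original text.

    ## Example 2.6.2 numerically: zero correlation despite dependence.
    set.seed(3)
    x <- runif(1e6, -1, 1)
    y <- x^2
    cor(x, y)                          # close to 0
    mean(x <= 0.5 & y <= 0.25)         # close to 1/2
    mean(x <= 0.5) * mean(y <= 0.25)   # close to 3/8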

2.6.2 Measures of Concordance

Before stating a set of axioms which measures of concordance should fulfill, we first give the definition of "concordance" as in Definition 4 of Scarsini (1984).

Definition 2.6.3. Let H_1, H_2 : R² → [0, 1] be bivariate, continuous distribution functions with copulas C_1 and C_2. Then H_1 is called more concordant than H_2, whenever

    C_1(u, v) ≥ C_2(u, v)

for all (u, v) ∈ [0, 1]². We also write C_1 ⪰ C_2.

The notion of concordance is also used regarding (bivariate) random vectors instead of distribution functions (or copulas). The pair of random variables (U_1, V_1) ∼ C_1 is said to be more concordant than the pair of random variables (U_2, V_2) ∼ C_2, whenever C_1 ⪰ C_2. This is equivalent to the fact that, given any point (u, v) ∈ [0, 1]², the probability that (U_1, V_1) lies in [0, u] × [0, v] or in [u, 1] × [v, 1] is always greater than, or equal to, the probability that (U_2, V_2) is located in said rectangles. It is easy to verify that the binary relation ⪰ is reflexive, antisymmetric and transitive over the set 𝒞_2 of all bivariate copulas. In the following example, we will see that there exist copulas C_1, C_2 for which neither C_1 ⪰ C_2 nor C_1 ⪯ C_2 holds. Therefore, ⪰ implies a partial order on 𝒞_2. Similarly, by the analogous binary relation over 𝒞_d, a partial order is implied.

Example 2.6.4. Convex combinations of copulas are copulas as well. Especially, C_1 := (1/3)M + (2/3)W is a bivariate copula. Let C_2 := Π, then

    C_1(1/2, 1/2) = 1/6 < 1/4 = C_2(1/2, 1/2),
    C_1(3/4, 3/4) = 7/12 > 9/16 = C_2(3/4, 3/4)

holds and thus neither C_1 ⪰ C_2 nor C_1 ⪯ C_2 is true.

Now a measure of concordance, as Scarsini (1984) puts it, should establish a total order on 𝒞_2 which is consistent with the partial order ⪰. This is done via a function κ, which maps the set 𝒞_2 (or random vectors (X, Y), respectively) to the interval [−1, 1] as in the following set of axioms due to Scarsini (1984).

Definition 2.6.5. Let (X, Y) ∼ H with copula C and continuous margins. Then κ is called measure of concordance, whenever the following points hold.


1. κ(X̃, Ỹ) is defined for all random variables X̃, Ỹ with continuous distribution functions.

2. κ(X, Y) = κ(Y, X), i.e. κ is symmetric.

3. If C̃ is the copula of (X̃, Ỹ) and C̃ ⪰ C, then κ(X̃, Ỹ) ≥ κ(X, Y), i.e. κ is consistent with the concordance (partial) ordering ⪰.

4. κ(X, Y) ∈ [−1, 1].

5. If X and Y are independent, then κ(X, Y) = 0.

6. κ(−X, Y) = −κ(X, Y).

7. If H_n is a sequence of continuous distributions, such that H_n(x, y) → H(x, y) for all (x, y) ∈ R² and n → ∞, and if (X_n, Y_n) ∼ H_n, then lim_{n→∞} κ(X_n, Y_n) = κ(X, Y), i.e. κ is continuous.

We frequently write κ_{X,Y} or κ_C instead of κ(X, Y) (where C is the copula of (X, Y)). In the (somewhat rare) cases when there is no ambiguity, we might also just use κ.

From the following lemma, it will become clear that the value of κ(X, Y) (for some random variables X, Y and a measure of concordance κ) solely depends on the copula of the random vector (X, Y), and thus the notation κ_C instead of κ(X, Y) is justified.

Lemma 2.6.6. Let C be a copula. Let H and H̃ be distributions, such that C coincides with the copula of each of the two. Then

    κ(X, Y) = κ(X̃, Ỹ)

holds for (X, Y) ∼ H, (X̃, Ỹ) ∼ H̃ and every measure of concordance κ.

Proof. It was assumed that if C̃ is the copula of (X̃, Ỹ) ∼ H̃, then C̃ ≡ C. Therefore, C̃ ⪰ C and C ⪰ C̃ both hold. By Axiom 3 of Definition 2.6.5, the measure of concordance κ must satisfy κ(X, Y) ≤ κ(X̃, Ỹ) as well as κ(X, Y) ≥ κ(X̃, Ỹ) and thus κ(X, Y) = κ(X̃, Ỹ).

In the next example (or rather counterexample) it will be demonstrated that Pearson's correlation coefficient (as in Definition 2.6.1) is not a measure of concordance, because being one would contradict Lemma 2.6.6.

Example 2.6.7. Let U ∼ U[0, 1]. From the proof of Lemma 2.2.5, we know that (U, U) ∼ M and consequently M is the copula of the vector (U, U). Let V := U²; then its distribution function F_V is given by F_V(v) = √v for v ∈ [0, 1]. Thus its inverse F_V⁻ is given by F_V⁻(v) = v² for v ∈ [0, 1]. It is not too hard to see that the distribution function H of the vector (U, V) is

    H(u, v) = min{u, √v}

for (u, v) ∈ [0, 1]². By Sklar's theorem, the copula C of H is equal to M, as

    C(u, v) = H( F_U⁻(u), F_V⁻(v) ) = min{u, v}

holds for all (u, v) ∈ [0, 1]². Obviously ρ_{U,U} = 1 and basic computation yields ρ_{U,V} = √15/4. Therefore, the value of Pearson's correlation coefficient does not exclusively depend on the copula, but might change when the marginal distributions are altered in a non-linear way.


According to Kruskal (1958), the sample version of a correlation coefficient based on ranks was first suggested by Spearman (1904) and therefore it is known as "Spearman's rho". We will use a notation similar to Nelsen (2006), who attributes the population version of Spearman's rho, which is given in the following definition, to Kruskal (1958).

Definition 2.6.8. Let H : R² → [0, 1] be a distribution function with copula C. Furthermore, let (X_i, Y_i) ∼ H be independent for i ∈ {1, 2, 3}. Then

    ρ^(S)_{X,Y} := 3( P( (X_1 − X_2)(Y_1 − Y_3) > 0 ) − P( (X_1 − X_2)(Y_1 − Y_3) < 0 ) )

is called Spearman's rho. We will also use the notation ρ^(S)_C.

The use of ρ^(S)_C, instead of ρ^(S)_{X,Y}, whenever (X, Y) has copula C, will be justified in Theorem 2.6.11. But before, we will give the definition of "Kendall's tau", whose sample version was, according to Kruskal (1958), "independently proposed by several authors" including Fechner (1897) and Lipps (1905), but also Kendall (1938). Again, we will use notation similar to Nelsen (2006), who attributes the population version of Kendall's tau, which is given in the following definition, to Kruskal (1958).

Definition 2.6.9. Let H : R² → [0, 1] be a distribution function with copula C. Furthermore, let (X_i, Y_i) ∼ H be independent for i ∈ {1, 2}. Then

    τ_{X,Y} := P( (X_1 − X_2)(Y_1 − Y_2) > 0 ) − P( (X_1 − X_2)(Y_1 − Y_2) < 0 )

is called Kendall's tau. We will also use the notation τ_C.

In order to justify the use of τ_C instead of τ_{X,Y} whenever (X, Y) has copula C, we introduce some mapping Q in the following lemma (which is Theorem 5.1.1 and Corollary 5.1.2(2) of Nelsen (2006)). The subsequent Theorem 2.6.11 (which is a combination of Theorem 5.1.3, Theorem 5.1.6 and of a statement on page 170 in the book of Nelsen (2006)) will make use of this mapping Q in order to show that both Spearman's rho and Kendall's tau solely depend on the copula. At the same time, a connection between Spearman's rho and Kendall's tau will become clear, namely that they both may be expressed in terms of Q.

Lemma 2.6.10. Consider the mapping Q : 𝒞_2 × 𝒞_2 → [−1, 1] with

    Q(C_1, C_2) := 4 ∫_{[0,1]²} C_2(u, v) dC_1(u, v) − 1

for C_1, C_2 ∈ 𝒞_2. Now, let C_1, C_2 ∈ 𝒞_2. Furthermore, let (X_1, Y_1) and (X_2, Y_2) be independent random vectors with copula C_1 and C_2, respectively, such that X_i ∼ F and Y_i ∼ G for i ∈ {1, 2} and some continuous distribution functions F and G. Then

    P( (X_1 − X_2)(Y_1 − Y_2) > 0 ) − P( (X_1 − X_2)(Y_1 − Y_2) < 0 ) = Q(C_1, C_2)

holds. Additionally, Q is monotone in the sense that, if C̃_i ⪰ C_i for i ∈ {1, 2} and some copulas C̃_1 and C̃_2, then Q(C̃_1, C̃_2) ≥ Q(C_1, C_2).

Proof. As F and G are assumed to be continuous, P(X_1 = X_2) = 0 = P(Y_1 = Y_2) and thus

    P( (X_1 − X_2)(Y_1 − Y_2) > 0 ) − P( (X_1 − X_2)(Y_1 − Y_2) < 0 ) = 2 P( (X_1 − X_2)(Y_1 − Y_2) > 0 ) − 1

holds. Obviously, the last probability can be expressed in the form

    P( (X_1 − X_2)(Y_1 − Y_2) > 0 ) = P(X_2 < X_1, Y_2 < Y_1) + P(X_2 > X_1, Y_2 > Y_1)    (2.6.1)

as the two events on the right-hand side are disjoint. Now

    P(X_2 < X_1, Y_2 < Y_1) = ∫_{R²} P(X_2 < x, Y_2 < y) dC_1( F(x), G(y) )
                            = ∫_{R²} C_2( F(x), G(y) ) dC_1( F(x), G(y) )
                            = ∫_{[0,1]²} C_2(u, v) dC_1(u, v)

holds by Sklar's theorem and with the substitution (u, v) = (F(x), G(y)) in the last step. The same substitution is used for the second term in (2.6.1) and

    P(X_2 > X_1, Y_2 > Y_1) = ∫_{R²} P(X_2 > x, Y_2 > y) dC_1( F(x), G(y) )
                            = ∫_{[0,1]²} P( F(X_2) > u, G(Y_2) > v ) dC_1(u, v)
                            = ∫_{[0,1]²} ( 1 − P( F(X_2) ≤ u ) − P( G(Y_2) ≤ v ) + C_2(u, v) ) dC_1(u, v)
                            = ∫_{[0,1]²} C_2(u, v) dC_1(u, v)

holds, as F(X_2) and G(Y_2) are standard uniformly distributed by Theorem 2.3.3. Altogether, this yields

    P( (X_1 − X_2)(Y_1 − Y_2) > 0 ) − P( (X_1 − X_2)(Y_1 − Y_2) < 0 ) = Q(C_1, C_2).    (2.6.2)

Observe that, by the above equation, the arguments of Q are exchangeable in the sense that Q(C_1, C_2) = Q(C_2, C_1). This is the case, as on the left-hand side of (2.6.2), the indices may be swapped without changing the outcome. Therefore, it suffices to show monotonicity of Q (with respect to ⪰) in one argument. Let C̃ be a copula, such that C̃ ⪰ C_2. Then monotonicity of the integral and

    Q(C_1, C_2) = 4 ∫_{[0,1]²} C_2(u, v) dC_1(u, v) − 1 ≤ 4 ∫_{[0,1]²} C̃(u, v) dC_1(u, v) − 1 = Q(C_1, C̃)

conclude the proof.

Theorem 2.6.11. Let X ∼ F, Y ∼ G, where F and G are continuous distribution functions, and let C be the copula of the random vector (X, Y). Furthermore, consider Q as in Lemma 2.6.10. Then

    ρ^(S)_{X,Y} = 3Q(C, Π) = 12 ∫₀¹ ∫₀¹ C(u, v) du dv − 3 = ρ_{F(X),G(Y)},
    τ_{X,Y} = Q(C, C) = 4 ∫_{[0,1]²} C(u, v) dC(u, v) − 1 = 4 E( C(F(X), G(Y)) ) − 1

holds. Therefore, both ρ^(S)_{X,Y} = ρ^(S)_C and τ_{X,Y} = τ_C solely depend on the copula and not on the (marginal) distributions of X and Y.


Proof. For Spearman’s rho, note that Q(C1, C2) = Q(C2, C1) holds, as (2.6.2) remains the same,when the indices are exchanged. Additionally, in Definition 2.6.8, X2 and Y3 are independent

and by Corollary 2.3.6, Π is the copula of (X2, Y3) and thus ρ(S)X,Y = 3Q(C,Π) by Lemma 2.6.10.

For the equivalence of Spearman’s rho and Pearson’s correlation coefficient of F (X) and G(Y ),note that the vector

(F (X), G(Y )

)has distribution C. This yields

3Q(C,Π) = 12E(F (X)G(Y )

)− 3 =

E(F (X)G(Y )

)− 1

4112

= ρF (X),G(Y )

as F (X) and G(Y ) are standard uniformly distributed by Theorem 2.3.3 and thus expected valueand standard deviation are given by 1

2 and 112 , respectively. In case of Kendall’s tau, the claim

follows directly from Definition 2.6.9, Lemma 2.6.10 and the fact that(F (X), G(Y )

)∼ C.

Now we are ready to show that Spearman's rho and Kendall's tau are indeed measures of concordance as in Definition 2.6.5 (see Theorem 5.1.9 of Nelsen (2006) or Theorem 4 and Theorem 5 of Scarsini (1984)). Together with Lemma 2.6.6, this is another way to see that both Spearman's rho and Kendall's tau are invariant under different marginal distributions F and G, as long as the copula C stays the same.

Theorem 2.6.12. Spearman’s rho and Kendall’s tau are measures of concordance as in Defini-tion 2.6.5.

Proof. For both, Spearman’s rho and Kendall’s tau, Axioms 1, 2, 4 and 6 of Definition 2.6.5follow directly from their respective definitions.

Concerning Axiom 3, let C_1 and C_2 be copulas, such that C_2 ⪰ C_1. Then

    ρ^(S)_{C_2} = 3Q(C_2, Π) ≥ 3Q(C_1, Π) = ρ^(S)_{C_1},
    τ_{C_2} = Q(C_2, C_2) ≥ Q(C_1, C_1) = τ_{C_1}

holds by Lemma 2.6.10 and Theorem 2.6.11.

Concerning Axiom 5, let X and Y be independent. Then, by Corollary 2.3.6, the copula of the random vector (X, Y) is given by Π. Simple integration yields Q(Π, Π) = 0 and consequently ρ^(S)_Π = 0 = τ_Π by Theorem 2.6.11.

Concerning Axiom 7, let H_n be a sequence of continuous distribution functions, which converges pointwise to a continuous distribution H. Then the margins F_n and G_n of H_n converge pointwise to the margins F and G of H, and the copula C_n of H_n converges pointwise to the copula C of H. By Lemma 2.2.6, the sequence C_n and C are uniformly equicontinuous, and as copulas they are of course uniformly bounded by 1. By the well-known theorem of Arzelà-Ascoli, there exists a subsequence that converges uniformly. If we assume that the (pointwise convergent) sequence C_n does not converge uniformly, then there exists a subsequence C_{n_k} and an ε > 0, such that for all n_0 ∈ N, there exists a k ∈ N with n_k > n_0, such that ‖C_{n_k} − C‖_∞ > ε. To be precise, there exists (u_k, v_k) ∈ [0, 1]², such that |C_{n_k}(u_k, v_k) − C(u_k, v_k)| ≥ ε for all k ∈ N. As (u_k, v_k) is a sequence in the compact set [0, 1]², there exists a convergent subsequence by the well-known theorem of Bolzano-Weierstraß, i.e. (u_{k_i}, v_{k_i}) → (u, v) for i → ∞ and some (u, v) ∈ [0, 1]². But, for i → ∞, the first and third term on the right-hand side of

    |C_{n_{k_i}}(u_{k_i}, v_{k_i}) − C(u_{k_i}, v_{k_i})| ≤ |C_{n_{k_i}}(u_{k_i}, v_{k_i}) − C_{n_{k_i}}(u, v)| + |C_{n_{k_i}}(u, v) − C(u, v)| + |C(u, v) − C(u_{k_i}, v_{k_i})|

tend to 0 by uniform equicontinuity of copulas (see Lemma 2.2.6) and the second term tends to 0 by pointwise convergence of C_n. This results in a contradiction and therefore C_n converges uniformly to C. Then, an application of Theorem 2.6.11 in

    |ρ^(S)_{C_n} − ρ^(S)_C| = |3Q(C_n, Π) − 3Q(C, Π)| ≤ 12 ∫₀¹ ∫₀¹ |C_n(u, v) − C(u, v)| du dv ≤ 12 ‖C_n − C‖_∞

and in

    |τ_{C_n} − τ_C| ≤ |Q(C_n, C_n) − Q(C_n, C)| + |Q(C, C_n) − Q(C, C)|
                   ≤ 4 ∫_{[0,1]²} |C_n(u, v) − C(u, v)| dC_n(u, v) + 4 ∫_{[0,1]²} |C_n(u, v) − C(u, v)| dC(u, v)
                   ≤ 8 ‖C_n − C‖_∞

completes the proof, as the arguments of Q may be exchanged (see the proof of Theorem 2.6.11).

For some of the copulas from Section 2.4, simpler expressions of Spearman's rho or Kendall's tau are known. Several of these expressions are given in the following examples.

Example 2.6.13. Let X ∼ E_d(µ, Σ, φ), such that X_i is a continuous random variable for all i ∈ {1, . . . , d}. By Theorem 2 of Lindskog et al. (2003),

    τ_{X_i,X_j} = (2/π) arcsin( Σ_{ij} / √(Σ_{ii}Σ_{jj}) )

holds for i, j ∈ {1, . . . , d}, i ≠ j. If X ∼ N(µ, Σ), then in Theorem 5.36 of McNeil et al. (2005) it is stated that Spearman's rho is given by

    ρ^(S)_{X_i,X_j} = (6/π) arcsin( Σ_{ij} / (2√(Σ_{ii}Σ_{jj})) ).    (2.6.3)

However, McNeil et al. (2005) point to a counterexample by Hult and Lindskog (2002), which shows that (2.6.3) does not hold for all elliptical distributions. To be precise, let (X_1, X_2) ∼ N(µ, Σ) with Σ_{11} > 0, Σ_{22} > 0, Σ_{12} ≠ 0 and X =_d µ + RAU_{U_k} as in Proposition 2.4.3 (with R ∼ χ²_2); then W := AU_{U_k} is an elliptically distributed random vector for which, according to Hult and Lindskog (2002), (2.6.3) does not hold.

Example 2.6.14. If the generator ϕ of a bivariate Archimedean copula C is two times continuously differentiable on (0, 1), then

    τ_C = 4 ∫₀¹ ϕ(t)/ϕ′(t) dt + 1    (2.6.4)

by Proposition 3.3 of Genest and MacKay (1986). For some of the parametric generators, this leads to even simpler formulas, which are subsumed in Table 2.2. If C_θ is a (bivariate) Clayton-copula as in Example 2.4.21, then (2.6.4) and basic integration yield τ_{C_θ} = θ/(θ + 2) for θ ∈ [−1, ∞) \ {0}. If C_θ is a (bivariate) Gumbel-copula as in Example 2.4.23, then τ_{C_θ} = (θ − 1)/θ for θ ∈ [1, ∞). If C_θ is a Frank-copula as in Example 2.4.22, then Nelsen (1986) shows that

    ρ^(S)_{C_θ} = 1 + (12/θ)( D_2(θ) − D_1(θ) ),    (2.6.5)
    τ_{C_θ} = 1 + (4/θ)( D_1(θ) − 1 )    (2.6.6)


holds for θ ∈ (−∞, ∞) \ {0}, where D_k, given by

    D_k(θ) := (k/θ^k) ∫₀^θ t^k/(e^t − 1) dt,

is the Debye-function of order k. Note that the terms in (2.6.5) and (2.6.6) differ from the corresponding terms in Nelsen (1986), as a different parametrization of the Frank-copula is used.

Table 2.2: Kendall’s tau for some parametric Archimedean copulas

Family τCθ

Clayton θθ+2

Frank 1 + 4θ

(1θ

∫ θ0

tet−1 dt− 1

)Gumbel θ−1

θ
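The Frank entry of Table 2.2 is easy to evaluate numerically; the following R sketch (an illustration, not part of the original text) computes D_1 by numerical quadrature for θ ≠ 0.

    ## Kendall's tau of a Frank-copula via the Debye function D_1.
    frank_tau <- function(theta) {
      D1 <- integrate(function(t) t / expm1(t), 0, theta)$value / theta
      1 + 4 * (D1 - 1) / theta
    }
    frank_tau(5)   # 0.4567...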

As all two-dimensional margins of a nested Archimedean copula are Archimedean copulas themselves, the formulas of Example 2.6.14 may be applied. There exist multivariate extensions of measures of concordance, see e.g. Joe (1990). The axioms of Scarsini (1984) are extended to the case of dimension d > 2 by Dolati and Úbeda-Flores (2006), and independently by Taylor (2007), in slightly different ways.

Furthermore, there exist other measures of concordance which will not be discussed here, but should nevertheless be mentioned. As Nelsen (2006) states, Corrado Gini introduced the sample version of what is now known as "Gini's gamma" around 1910. Blomqvist (1950) discusses what is now known as "Blomqvist's beta", although he admits (see Section 3 of Blomqvist (1950)) that its sample version was known before. Scarsini (1984) shows that the population version of Gini's gamma, as well as the population version of Blomqvist's beta, are measures of concordance in the sense of Definition 2.6.5.

2.6.3 Tail Dependence

Another way to see dependence between two random variables X ∼ F and Y ∼ G is by looking at their tails. To be precise, one might consider the probability that Y exceeds the α-quantile of G for some α ∈ [0, 1], given that X exceeds the α-quantile of F. The question is: does a limit for α → 1 exist and, if so, what is its value? This obviously concerns the upper tail. An analogous question may be asked regarding the lower tail. According to Schmid and Schmidt (2007), (lower) tail dependence was first discussed by Sibuya (1960). The following formal definition of tail dependence, as well as most of the remainder of this section, is an adaption of Section 5.4 of Nelsen (2006) and also of Section 1.7.4 of Hofert (2010).

Definition 2.6.15. Let X ∼ F and Y ∼ G, where F, G : R → [0, 1] are continuous distribution functions. If the limits exist, then

    λ_U(X, Y) := lim_{t→1} P( Y > G⁻(t) | X > F⁻(t) )

is called upper tail dependence coefficient and

    λ_L(X, Y) := lim_{t→0} P( Y ≤ G⁻(t) | X ≤ F⁻(t) )

is called lower tail dependence coefficient (where, of course, t ∈ (0, 1) for both limits). If λ_U(X, Y) > 0 or λ_L(X, Y) > 0, then X and Y are called upper or lower tail dependent. Otherwise, they are called tail independent. We will also write λ_L(C) and λ_U(C), if C is the copula of (X, Y), or simply λ_U and λ_L, when there is no ambiguity.

Just as for the measures of concordance, it will become clear in the following theorem that this quantity is invariant under different choices of marginal distributions F and G, as it solely depends on the copula C. To be precise, the tail dependence coefficients exclusively depend on the diagonal section of C.

Theorem 2.6.16. Let X, Y, F, G, and C be as in Definition 2.6.15; then

    P( Y > G⁻(t) | X > F⁻(t) ) = 2 − (1 − C(t, t))/(1 − t),
    P( Y ≤ G⁻(t) | X ≤ F⁻(t) ) = C(t, t)/t

holds for all t ∈ (0, 1).

Proof. Let t ∈ (0, 1). As F and G are assumed to be continuous, we get

    P( Y > G⁻(t) | X > F⁻(t) ) = P( G(Y) > t, F(X) > t ) / P( F(X) > t )
                               = ( 1 − 2t + P( F(X) ≤ t, G(Y) ≤ t ) ) / (1 − t)
                               = 2 − (1 − C(t, t))/(1 − t),
    P( Y ≤ G⁻(t) | X ≤ F⁻(t) ) = P( G(Y) ≤ t, F(X) ≤ t ) / P( F(X) ≤ t )
                               = C(t, t)/t

by Lemma 2.3.2 and because (F(X), G(Y)) ∼ C by Sklar's theorem.

It will become clear in the following example that if two random variables are independent, then they are tail independent, but the converse is not true.

Example 2.6.17. With Theorem 2.6.16, it is easy to see that

    λ_U(M) = 1 = λ_L(M),
    λ_U(Π) = 0 = λ_L(Π),
    λ_U(W) = 0 = λ_L(W)

holds. This means M shows perfect upper and lower tail dependence, while neither W nor Π is tail dependent. Let U ∼ U[0, 1]; then (U, 1 − U) ∼ W is not independent (see Corollary 2.3.6), but nevertheless tail independent.

Just like for the measures of concordance, for some families of distributions or copulas there exist special formulas which simplify the computation of the tail dependence coefficients. Although the general formulas of Theorem 2.6.16 seem to be fairly simple, for example in the case of a Gaussian copula, computation is still far from obvious, as the copula is not given in closed form. Schmidt (2002) showed that tail dependence of elliptical distributions (and thus the bivariate normal distribution, see Example 2.4.5) is related to regular variation, which is defined as follows.


Definition 2.6.18. A measurable function f : (0, ∞) → (0, ∞) is called regularly varying (at ∞ with index α ∈ R), if

    lim_{x→∞} f(xt)/f(x) = t^α

holds for all t > 0. For a fixed α ∈ R, the set of all regularly varying functions with index α is denoted by RV_α. A function f ∈ RV_0 is called slowly varying.

Proposition 2.6.19 (Schmidt (2002)). Let X ∼ E_2(µ, Σ, φ) for some µ ∈ R², Σ ∈ R^{2×2} positive definite, and φ : R → R as in (2.4.1). Furthermore, let X =_d µ + RAU_{U_k} and R ∼ F_R as in Proposition 2.4.3. If F̄_R ∈ RV_{−α} (as usual, F̄_R(x) := 1 − F_R(x)) and α > 0, then the tail dependence coefficients of X = (X_1, X_2) are given by

    λ_U(X_1, X_2) = λ_L(X_1, X_2) = ( ∫₀^{h(ρ)} u^α/√(1 − u²) du ) / ( ∫₀¹ u^α/√(1 − u²) du ),    (2.6.7)

where ρ := Σ_{12}/√(Σ_{11}Σ_{22}) and h(ρ) := ((1 + ρ)/2)^{1/2}.

For more necessary and sufficient conditions, such that elliptical distributions are tail dependent, especially if the vector X possesses a density, see Schmidt (2002). As Abdous et al. (2005) point out, Equation (2.6.7) is misprinted in Schmidt (2002, Equation (5.2)), as well as in Frahm et al. (2003, Theorem 3.4), where √(u² − 1) should be replaced by √(1 − u²).

Example 2.6.20. Let X ∼ N(µ, Σ); then all bivariate margins are tail independent by Theorem 6.2 of Schmidt (2002). If X ∼ t_ν(µ, Σ) is a bivariate random vector, then Frahm et al. (2003) state that the tail dependence coefficients are given by

    λ_U(X_1, X_2) = λ_L(X_1, X_2) = 2 F̄_{t_{ν+1}}( √( (ν + 1)(1 − ρ)/(1 + ρ) ) ),

where F̄_{t_{ν+1}} is the survival function of a univariate t-distribution with (ν + 1) degrees of freedom and with ρ as in Proposition 2.6.19.

Of course, the tail dependence of Archimedean copulas depends on the generator ϕ. This leads naturally to the following corollary, which is a revised version of Theorem 2.6.16.

Corollary 2.6.21. Let C be a bivariate Archimedean copula with generator ϕ, such that the limits in Definition 2.6.15 exist. If ϕ is strict, then the tail dependence coefficients are given by

    λ_U(C) = 2 − lim_{x→0} (1 − ϕ⁻(2x))/(1 − ϕ⁻(x)),
    λ_L(C) = lim_{x→∞} ϕ⁻(2x)/ϕ⁻(x).

If ϕ is non-strict, then λ_L(C) = 0 and λ_U(C) is given as in the strict case.

Proof. Writing C in terms of the generator ϕ in Theorem 2.6.16 yields

    λ_U(C) = 2 − lim_{t→1} (1 − ϕ⁻(2ϕ(t)))/(1 − t) = 2 − lim_{x→0} (1 − ϕ⁻(2x))/(1 − ϕ⁻(x)),

where we use the substitution x = ϕ(t) for the second equation. As t → 1, we may assume t > 0, and therefore ϕ⁻ is the ordinary inverse of ϕ.

Regarding the lower tail dependence coefficient in the non-strict case, note that, for all t ∈ (0, ϕ⁻(ϕ(0)/2)], we get ϕ⁻(2ϕ(t)) = 0 and thus C(t, t)/t = 0. If ϕ is strict, then the same substitution x = ϕ(t) as for the upper tail dependence coefficient yields the claim.

For most parametric Archimedean copulas, it is possible to derive simple connections between the parameter and the tail dependence coefficients by applying Corollary 2.6.21. We will give the tail dependence coefficients of the parametric copulas from Section 2.4.2 in the following example, as well as in Table 2.3. For the coefficients of several other parametric Archimedean families, see Example 5.22 of Nelsen (2006).

Example 2.6.22. Let ϕ_θ be the generator of a Clayton-copula as in Example 2.4.21. Then an application of L'Hôpital's rule and Corollary 2.6.21 yields λ_U(C_θ) = 0 for all θ ∈ [−1, ∞) \ {0}. If θ < 0, then ϕ_θ is non-strict, which means λ_L(C_θ) = 0 by Corollary 2.6.21. For θ > 0 the generator is strict and we get λ_L(C_θ) = 2^{−1/θ}.

If ϕ_θ is the generator of a bivariate Frank-copula as in Example 2.4.22, then it is strict for all θ ∈ (−∞, ∞) \ {0}. An application of L'Hôpital's rule and Corollary 2.6.21 yields λ_U(C_θ) = 0 and λ_L(C_θ) = 0 for all θ ∈ (−∞, ∞) \ {0}.

If ϕ_θ is the generator of a bivariate Gumbel-copula as in Example 2.4.23, then it is strict for all θ ∈ [1, ∞). An application of L'Hôpital's rule and Corollary 2.6.21 yields λ_U(C_θ) = 2 − 2^{1/θ} and λ_L(C_θ) = 0 for all θ ∈ [1, ∞).
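The limit for the Clayton family can also be observed numerically; the following R lines are an illustration and not part of the original text.

    ## Corollary 2.6.21 for the Clayton generator, theta = 2:
    ## phi^-(2x)/phi^-(x) -> 2^(-1/theta) as x -> infinity.
    psi <- function(t, theta = 2) (theta * t + 1)^(-1/theta)
    x <- 10^(1:6)
    psi(2 * x) / psi(x)   # tends to 2^(-1/2) = 0.7071..., the value of lambda_L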

Remark 2.6.23. As Nelsen (2006) remarks, note that for some of the parametric families the tail dependence coefficients of the limiting copulas are given by the limits of the tail dependence coefficients, but for others they differ. For example, if C_θ is a Clayton-copula, then, on the one hand, C_θ → Π for θ → 0 and λ_U(C_θ) → 0 = λ_U(Π), as well as λ_L(C_θ) → 0 = λ_L(Π). On the other hand, C_θ → M for θ → ∞, but λ_U(C_θ) = 0 for all θ ∈ [−1, ∞), whereas λ_U(M) = 1.

Table 2.3: Tail dependence coefficients for some parametric Archimedean copulas

    Family            λ_L(C_θ)     λ_U(C_θ)
    Clayton, θ < 0    0            0
    Clayton, θ > 0    2^{−1/θ}     0
    Frank             0            0
    Gumbel            0            2 − 2^{1/θ}

A rather straightforward multivariate extension of the upper tail dependence coefficient of Definition 2.6.15 is suggested in Definition 7.1 of Schmidt (2002). Some drawbacks of the bivariate tail dependence coefficients of Definition 2.6.15 are mentioned in Section 3.3 of Schmid and Schmidt (2007). Subsequently, Schmid and Schmidt (2007) introduce a multivariate concept of tail dependence in order to remedy the aforementioned drawbacks. However, this measure of multivariate tail dependence may differ from the bivariate tail dependence coefficients, unless λ_U, λ_L ∈ {0, 1}. Besides Schmidt (2002) and Schmid and Schmidt (2007), four more references for multivariate concepts of tail dependence are given in Joe et al. (2010).


3 Empirical Processes

3.1 Basics

At first, we state some basic definitions. Most of them are given in any introductory lecture on measure and probability theory. Apart from giving precise meaning to certain terms, they also fix certain notations. For further information, see for example Bauer (1992) or Rao (2004). The following definitions, but not necessarily the notations, are based on those in Bauer (1992).

Definition 3.1.1. Consider a set Ω and a set 𝒜 ⊂ {A : A ⊂ Ω} of subsets of Ω, such that Ω ∈ 𝒜, A^c := Ω \ A ∈ 𝒜 for all A ∈ 𝒜, and such that ∪_{n∈N} A_n ∈ 𝒜 for A_n ∈ 𝒜 (and for n ∈ N). This set 𝒜 is called a σ-algebra (on Ω).

If (𝒜_i)_{i∈I} is a family of σ-algebras on the same set Ω, then, by verifying the three conditions of Definition 3.1.1, it is easy to see that ∩_{i∈I} 𝒜_i is again a σ-algebra. Therefore the σ-algebra in the following definition exists and is uniquely determined.

Definition 3.1.2. Let 𝒢 ⊂ {A : A ⊂ Ω} be a set of subsets of Ω. Then a σ-algebra σ(𝒢) is called the σ-algebra generated by 𝒢, if 𝒢 ⊂ σ(𝒢) and if, additionally, 𝒢 ⊂ 𝒜 implies σ(𝒢) ⊂ 𝒜 for all σ-algebras 𝒜.

Definition 3.1.3.

1. Let d ∈ N and Ω := R^d, as well as

    𝒢 := { ⨉_{i=1}^d (x_i, y_i) : x, y ∈ R^d, x_i ≤ y_i },

then B^d := σ(𝒢) is called the Borel σ-algebra (on R^d). The elements B ∈ B^d are called Borel sets.

2. If Ω is a general topological space with topology 𝒢, then σ(𝒢), i.e. the σ-algebra generated by all open sets in Ω, is (also) called Borel σ-algebra (on Ω).

According to Rao (2004), the aforementioned sets and σ-algebra are named after Émile Borel, who showed some important results concerning these structures in 1898. For example, in Proposition 5 of Rao (2004), it is shown that using all closed or half-open intervals in Definition 3.1.3 results in the same σ-algebra B^d on R^d.

Definition 3.1.4. Let 𝒜 be a σ-algebra on a set Ω and let A_n ∈ 𝒜 for n ∈ N, such that A_i ∩ A_j = ∅ whenever i ≠ j (i.e. pairwise disjoint). A mapping µ : 𝒜 → [0, ∞), such that µ(∅) = 0 and

    µ( ∪_{n=1}^∞ A_n ) = ∑_{n=1}^∞ µ(A_n),

is called a measure (on 𝒜). If µ(Ω) = 1, then µ is called a probability measure.

The following proposition may be found as Theorem 6.2 in Bauer (1992).

Proposition 3.1.5. There exists exactly one measure λ^d on B^d, such that

    λ^d( ⨉_{i=1}^d [x_i, y_i) ) = ∏_{i=1}^d (y_i − x_i)

holds for all x, y ∈ R^d with x_i ≤ y_i for all i ∈ {1, . . . , d}.

As Rao (2004) states, Henri Lebesgue, a student of Émile Borel, extended the theory of measures, which was discussed by Camille Jordan in 1892, and published the results in his PhD thesis, see Lebesgue (1902). Therefore, the previously mentioned unique measure is nowadays commonly associated with his name.

Definition 3.1.6. The measure λ^d on B^d as in Proposition 3.1.5 is called Lebesgue-Borel-measure. If the dimension is obvious, we also write λ instead of λ^d.

The properties of the Lebesgue-measure may be found in most introductory textbooks on measure theory. Recall that all intervals of the real line are Borel sets, and that λ( (a, b) ) = λ( [a, b] ) holds for all a, b ∈ R with a < b, as λ(A) = 0 for all countable sets A ⊂ R.

Definition 3.1.7. For each i ∈ {1, 2}, let Ω_i be a set with σ-algebra 𝒜_i. Then the pair (Ω_i, 𝒜_i) is called measurable space. Each element A ∈ 𝒜_i is called measurable set. A mapping f : Ω_1 → Ω_2 is called measurable, or more precisely 𝒜_1-𝒜_2-measurable, if f⁻¹(A) ∈ 𝒜_1 holds for all A ∈ 𝒜_2. We will also write f⁻¹(𝒜_2) ⊂ 𝒜_1. Let additionally µ be a measure on Ω_1. Then (Ω_1, 𝒜_1, µ) is called a measure space. If µ is a probability measure, then (Ω_1, 𝒜_1, µ) is called a probability space.

The following theorem will be of some use in Section 3.2; it, as well as its proof, is based on Theorem 7.5 in Bauer (1992).

Theorem 3.1.8. Let (Ω_1, 𝒜_1, P) be a probability space, let (Ω_2, 𝒜_2) be a measurable space and let f : Ω_1 → Ω_2 be a measurable function. Then the mapping µ : 𝒜_2 → [0, 1] with

    µ(A) := P( f⁻¹(A) )

for A ∈ 𝒜_2 is a measure.

Proof. First of all, µ(∅) = P(∅) = 0. Then, for n ∈ N, let A_n ∈ 𝒜_2, such that A_i ∩ A_j = ∅ whenever i ≠ j. This means f⁻¹(A_i) ∩ f⁻¹(A_j) = ∅ whenever i ≠ j, which yields

    µ( ∪_{n=1}^∞ A_n ) = P( ∪_{n=1}^∞ f⁻¹(A_n) ) = ∑_{n=1}^∞ P( f⁻¹(A_n) ) = ∑_{n=1}^∞ µ(A_n),

because P is a measure.


The measure µ in Theorem 3.1.8 is called the measure which is induced by f, or just induced measure.

Due to the following lemma, in the case of a σ-algebra σ(𝒢) generated by some set 𝒢, there exists a shortcut when examining a mapping for measurability. To be precise, it suffices to consider the preimages of the elements of 𝒢.

Lemma 3.1.9. Let (Ω_1, 𝒜_1, P) be a probability space and let (Ω_2, σ(𝒢)) be a measurable space, where 𝒢 ≠ ∅ is some set of subsets of Ω_2 (see Definition 3.1.2). Let f : Ω_1 → Ω_2 be a mapping. If f⁻¹(A) ∈ 𝒜_1 holds for all A ∈ 𝒢, then f is measurable.

Proof. For all A ∈ 𝒢, let f⁻¹(A) ∈ 𝒜_1. Thus, the set

    𝒜_2 := { A ⊂ Ω_2 | f⁻¹(A) ∈ 𝒜_1 }

is not empty. Obviously, f(ω) ∈ Ω_2 holds for all ω ∈ Ω_1 and thus f⁻¹(Ω_2) = Ω_1 ∈ 𝒜_1, which implies Ω_2 ∈ 𝒜_2. Let A ∈ 𝒜_2. Then f⁻¹(A) ∈ 𝒜_1 holds by the definition of 𝒜_2 and, because 𝒜_1 is a σ-algebra, we get Ω_1 \ f⁻¹(A) ∈ 𝒜_1 and thus

    f⁻¹(Ω_2 \ A) = { ω ∈ Ω_1 | f(ω) ∉ A } = Ω_1 \ f⁻¹(A) ∈ 𝒜_1.

Let A_n ∈ 𝒜_2 for n ∈ N, i.e. f⁻¹(A_n) ∈ 𝒜_1. Then

    f⁻¹( ∪_{n=1}^∞ A_n ) = { ω ∈ Ω_1 | ∃ n ∈ N such that f(ω) ∈ A_n } = ∪_{n=1}^∞ f⁻¹(A_n) ∈ 𝒜_1

holds, because 𝒜_1 is a σ-algebra. Therefore 𝒜_2 is also a σ-algebra and, by Definition 3.1.2, 𝒢 ⊂ 𝒜_2 implies σ(𝒢) ⊂ 𝒜_2, which completes the proof.

3.2 Convergence of Random Variables and Vectors

In the preceding chapters, it was assumed that the reader is familiar with the notions of a "random variable" and a "random vector" and therefore no formal definition was given. Nonetheless, we will now do so, in order to precisely separate random variables from the random functions and processes in the other sections of this chapter.

Definition 3.2.1. Let (Ω, 𝒜, P) be a probability space and consider R^d with the Borel σ-algebra B^d. Then a measurable mapping X : Ω → R is called a random variable and a measurable mapping X : Ω → R^d is called a (d-dimensional) random vector for d ≥ 2.

As usual (and already done in the previous chapters), the argument of a random variable will be suppressed. This means, instead of writing P({ω ∈ Ω : X(ω) ≤ x}) for x ∈ R, we will in most cases just write P(X ≤ x). In case of a random vector X and x ∈ R^d, the inequality X ≤ x is meant to hold for each component.

The following corollary is a direct consequence of Theorem 3.1.8.

Corollary 3.2.2. Let X be a random variable. Then the mapping µ_X : B → [0, ∞) with µ_X(B) := P( X⁻¹(B) ) is a measure on (R, B). In the same way, each random vector X induces a measure µ_X on (R^d, B^d).

We already made extensive use of those measures in the previous chapter, as can be seen by the following definition.


Definition 3.2.3. Let d ∈ N and X be a d-dimensional random vector. Then the mapping H : R^d → [0, 1] with H(x) := P(X ≤ x) is called distribution function of X. For d = 1, i.e. a random variable X, we usually write F instead of H.

For consequences and results on the properties of distribution functions, again, we refer to introductory textbooks. Just note that (for d = 1) it is also possible to define distribution functions by their properties (i.e. càdlàg, non-decreasing and lim_{x→∞} F(x) = 1, lim_{x→−∞} F(x) = 0). Then, from each distribution function F, a measure µ_F may be constructed by putting µ_F(B) := ∫_B dF for all B ∈ B. Furthermore, there exists a probability space (Ω, 𝒜, P) and a random variable X, such that µ_F(B) = µ_X(B) holds for all B ∈ B (see e.g. Billingsley (1995, Theorem 14.1)). To this end, just consider Ω := R, 𝒜 := B, P := µ_F and X ≡ id, i.e. X(ω) = ω for all ω ∈ Ω. An analogous result holds for dimension d > 1.

If a sequence of random vectors X_n is considered, naturally, the question of convergence arises. As it is not a sequence in R^d, but a sequence of functions, several notions of convergence exist. In what follows, we will give those which are used within this work. They are based on the definitions by Bauer (2002). For additional concepts of convergence of random variables, see e.g. Gut (2013, Chapter 5).

Definition 3.2.4. Let d ∈ N and let X be a d-dimensional random vector, as well as X_n for n ∈ N. Let H be the distribution function of X and let H_n be the distribution function of X_n. If

    P( lim sup_{n→∞} ‖X_n − X‖ > ε ) = 0

holds for all ε > 0, then X_n is said to converge almost surely to X and we write X_n →_{a.s.} X. If

    lim_{n→∞} P( ‖X_n − X‖ > ε ) = 0

holds for all ε > 0, then X_n is said to converge in probability to X and we write X_n →_P X. If

    lim_{n→∞} H_n(x) = H(x)    (3.2.1)

for all x ∈ R^d where H is continuous, then X_n is said to converge in distribution to X and we write X_n →_d X.

The norm used in the above definition for almost sure convergence and convergence in probability is not specified, as all norms on R^d are equivalent and therefore convergence according to one norm implies convergence according to any other norm.

A result which may be found in most introductory textbooks is that almost sure convergence implies convergence in probability, which in turn implies convergence in distribution; in general, the converse implications are not true. For details and more information, see e.g. Theorem 2.7 of van der Vaart (1998).

Note that convergence in distribution is often called "weak convergence." When considering the measures induced by H_n and H in (3.2.1), this naming convention will become clearer in the so-called "portmanteau theorem," succeeding the following definition.

Definition 3.2.5.

1. Let X_n ∼ H_n for n ∈ N, as well as X ∼ H. If (3.2.1) holds for all x ∈ R^d where H is continuous (i.e. X_n →_d X), then the sequence H_n is said to converge weakly to H and we write H_n ⇒ H.

2. For n ∈ N, let µ_n as well as µ be probability measures on the same measurable space (Ω, 𝒜). If µ_n(A) → µ(A) holds for all sets A ∈ 𝒜, such that the boundary is a null set, i.e. µ(∂A) = 0, then µ_n is said to converge weakly to µ and we write µ_n ⇒ µ.

Page 47: Exchangeability of Copulas

3.2. Convergence of Random Variables and Vectors

2. For n ∈ N, let µn as well as µ be probability measures on the same measurable space(Ω,A). If µn(A)→ µ(A) holds for all sets A ∈ A, such that the boundary is a null-set, i. e.µ(∂A) = 0, then µn is said to converge weakly to µ and we write µn ⇒ µ.

That convergence in distribution of random vectors is equivalent to weak convergence of the induced measures is a consequence of the following proposition, which is known as "portmanteau theorem," see, e.g., Billingsley (1999, Theorem 2.1).

Proposition 3.2.6 (Portmanteau theorem). Let µ_n, µ be probability measures on some measurable space (Ω, 𝒜); then, for n → ∞, these five conditions are equivalent:

1. µ_n ⇒ µ.

2. ∫_Ω f dµ_n → ∫_Ω f dµ for all bounded, continuous functions f : Ω → R.

3. ∫_Ω f dµ_n → ∫_Ω f dµ for all bounded, uniformly continuous functions f : Ω → R.

4. lim sup_{n∈N} µ_n(A) ≤ µ(A) for all closed sets A.

5. lim inf_{n∈N} µ_n(A) ≥ µ(A) for all open sets A.

A well-known occurrence of convergence of random variables is the law of large numbers. The following proposition is Theorem 22.1 of Billingsley (1995), where a proof may be found as well.

Proposition 3.2.7 (Law of large numbers). Let (X_n)_{n∈N} be a sequence of independent and identically distributed random variables with finite expectation E(X_1) < ∞. Then

    X̄_n := (1/n) ∑_{i=1}^n X_i →_{a.s.} E(X_1)

holds for n → ∞.

Using the law of large numbers, the following example is Example 14.4 of Billingsley (1995) and illustrates why it makes sense to request H_n(x) → H(x) in (3.2.1) only for points x where H is continuous.

Example 3.2.8. Let X1 be a random variable such that P(X1 = −1) = P(X1 = 1) = 1/2 and consider independent copies Xn for n ∈ N. Obviously E(X1) = 0 and, by the law of large numbers (Proposition 3.2.7), X̄n →a.s. 0 for n → ∞. Let X be a so-called “degenerate” random variable, i. e. it is almost surely constant, in this case P(X = 0) = 1. Let Fn be the distribution function of X̄n and let F be the distribution function of X, i. e. F(x) = 1_{[0,∞)}(x). Then, as almost sure convergence implies convergence in distribution, we get Fn(x) → 1_{[0,∞)}(x) for all x ∈ R \ {0}. As a distribution function, the limiting function F must be right-continuous, which means that F(0) = 1 is the only choice. But F_{2n+1}(0) = 1/2 for all n ∈ N, as X̄_{2n+1} = 0 is impossible and P(X̄_{2n+1} > 0) = P(X̄_{2n+1} < 0) holds by definition of X1, to be precise, because of P(Xi = −1) = P(Xi = 1) for all i ∈ N. Therefore Fn(0) → 1 = F(0) for n → ∞ is impossible. Indeed, it can be shown that P(X̄_{2n} = 0) = (2n choose n) 2^{−2n} → 0 for n → ∞ (by using Stirling's formula on the central binomial coefficient) and thus F_{2n}(0) → 1/2 as well. This means that Fn converges pointwise to a function that is not right-continuous, or, to put it differently, the function of pointwise limits is not a distribution function.
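
The behaviour of Fn(0) in the preceding example is easy to check exactly; by symmetry, F_{2n}(0) = 1/2 + P(X̄_{2n} = 0)/2, which the following sketch evaluates for a few values of n:

    from math import comb

    # P(mean of 2n steps = 0) = P(n of the 2n summands equal +1) = comb(2n, n) / 4**n
    for n in (1, 10, 100, 1000):
        p_zero = comb(2 * n, n) / 4**n
        print(2 * n, 0.5 + p_zero / 2)   # F_{2n}(0), tending to 1/2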


For independent and identically distributed random variables Xi, by the law of large numbers (Proposition 3.2.7), n^{−1} ∑_{i=1}^n Xi almost surely converges to E(X1). If we replace n^{−1} by n^{−α} for some α > 1, then we get

    n^{−α} ∑_{i=1}^n Xi = n^{1−α} X̄n →a.s. 0 · E(X1) = 0.    (3.2.2)

A theorem, which is attributed to Marcinkiewicz and Zygmund (1937) by Gut (2013, Theorem 7.1), states that for α ∈ (1/2, 1), if E(X1) = 0 and E|X1|^{1/α} < ∞ hold, then (3.2.2) holds as well. The following proposition, which is known as the “central limit theorem” or “Lindeberg–Lévy theorem,” shows that there is also convergence in the case α = 1/2. However, the limit is not a real number as in (3.2.2), but a random variable.

Proposition 3.2.9 (Central limit theorem). Let (Xn)n∈N be a sequence of independent and identically distributed random variables with 0 < σ² := var(X1) < ∞. Then

    √n ( X̄n − E(X1) ) →d X

where X ∼ N(0, σ²). If (Xn)n∈N is a sequence of independent and identically distributed, d-dimensional random vectors with covariance matrix Σ, then

    (1/√n) ∑_{i=1}^n ( Xi − E(X1) ) →d X

where X ∼ Nd(0, Σ).

For a proof of the univariate case and similar results when relaxing the assumption of identical distributions or even of independence in Proposition 3.2.9, see, for example, Section 27 of Billingsley (1995). The multivariate case is discussed in Example 2.18 of van der Vaart (1998).
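
A small simulation (illustrative parameters only) makes the normal limit of Proposition 3.2.9 visible; for exponentially distributed variables with E(X1) = var(X1) = 1, the standardized means should have moments close to those of N(0, 1):

    import numpy as np

    rng = np.random.default_rng(7)
    n, reps = 1_000, 10_000
    x = rng.exponential(scale=1.0, size=(reps, n))   # E(X_1) = 1, var(X_1) = 1

    z = np.sqrt(n) * (x.mean(axis=1) - 1.0)          # sqrt(n) (mean - E(X_1))
    print(z.mean(), z.var())                         # close to 0 and 1, respectively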

One characteristic of continuous functions g : R^{d1} → R^{d2} is the possibility to move limits inside and outside the function, i. e. g( lim_{x→x0} x ) = lim_{x→x0} g(x). As stated in the following proposition (which is Theorem 2.3 of van der Vaart (1998), where also a proof may be found), the same holds true for all three notions of convergence encountered in Definition 3.2.4. This result is known as the “continuous mapping theorem” and will be useful in Chapter 5.

Proposition 3.2.10 (Continuous mapping theorem). Let d1, d2 ∈ N, let X be a d1-dimensional random vector and let g : R^{d1} → R^{d2} be continuous at every x ∈ A for some set A ⊂ R^{d1} such that P(X ∈ A) = 1. For a sequence of d1-dimensional random vectors (Xn)n∈N, it holds for n → ∞ that

1. if Xn →a.s. X, then g(Xn) →a.s. g(X),

2. if Xn →P X, then g(Xn) →P g(X),

3. if Xn →d X, then g(Xn) →d g(X).


3.3 Random Functions

In Definition 3.2.1, measurable mappings from some probability space to the real line or to R^d were considered. Instead of random numbers or random vectors, we now consider random functions. In order to do so, we first need some spaces of functions that a measurable mapping may map to.

Definition 3.3.1. We denote the space of all continuous and real-valued functions on the hypercube [0,1]^d ⊂ R^d by C[0,1]^d, i. e.

    C[0,1]^d := { f : [0,1]^d → R | f is continuous }

for d ∈ N. Similarly, the space of all real-valued càdlàg functions on [0,1]^d is denoted by D[0,1]^d, i. e.

    D[0,1]^d := { f : [0,1]^d → R | ∀ x0 ∈ [0,1]^d: lim_{x→x0+} f(x) = f(x0) and lim_{x→x0−} f(x) exists },

where lim_{x→x0+} f(x) is the limit from the upper right quadrant and lim_{x→x0−} f(x) is the limit from any given other quadrant. To be precise, in the case x → x0+, only sequences of those x are considered for which x ≥ x0 holds, i. e. xi ≥ x0,i for all i ∈ {1, . . . , d} (as well as x → x0). In the case x → x0−, for all subsets I ⊂ {1, . . . , d}, only sequences of those x are considered for which xi < x0,i holds for all i ∈ I and xi ≥ x0,i holds for all i ∉ I (as well as x → x0). Finally, the space of all real-valued bounded functions on [0,1]^d is denoted by l∞[0,1]^d, i. e.

    l∞[0,1]^d := { f : [0,1]^d → R | ‖f‖∞ < ∞ },

where, as usual, ‖·‖∞ denotes the supremum norm.

Before giving a definition of random functions, we first discuss some properties of the spaces in Definition 3.3.1, which will be needed in the course of this section.

Definition 3.3.2. Let M be a topological space. If every open cover has a countable subcover, i. e., if for every index set I and open sets Oi ⊂ M there exists a sequence (i_k)_{k∈N} ⊂ I such that ⋃_{i∈I} Oi = ⋃_{k=1}^∞ O_{i_k} holds, then M is said to have the Lindelöf property (or to be a Lindelöf space).

Definition 3.3.3. Let (M, d) be a metric space. If there exists a countable subset M̃ ⊂ M which is dense in M, then M is called separable. The subset M̃ is called dense in M if for all x ∈ M there exists a sequence (xn)n∈N ⊂ M̃ such that xn → x holds for n → ∞.

The following result from topology will be useful in the proof of Theorem 3.3.8.

Lemma 3.3.4. Let (M, d) be a metric space. Then M has the Lindelöf property if and only if it is separable.

A proof of Lemma 3.3.4 is given in Appendix A.1.

Lemma 3.3.5. Let d ∈ N. The spaces of Definition 3.3.1 are subsets of l∞[0,1]^d; moreover,

    C[0,1]^d ⊂ D[0,1]^d ⊂ l∞[0,1]^d

holds and each inclusion is strict.


A proof of Lemma 3.3.5 may be found in Appendix A.1.

Example 3.3.6. Consider the space C[0,1] with the metric d : C[0,1] × C[0,1] → [0,∞) such that d(f, g) := ‖f − g‖∞. Let C̃[0,1] ⊂ C[0,1] with

    C̃[0,1] := { f ∈ C[0,1] | f piecewise linear and f(m/n) ∈ Q if m, n ∈ N, 0 ≤ m ≤ n };

then C̃[0,1] is countable because Q × Q is countable. Let f ∈ C[0,1] and ε > 0. Then there exists f̃ ∈ C̃[0,1] such that sup_{x∈[0,1]∩Q} |f(x) − f̃(x)| < ε/3, because Q is dense in R. Furthermore, both f and f̃ are continuous on the compact set [0,1], and therefore they are both uniformly continuous. This means there exists a δ > 0 such that |x − y| < δ implies |f(x) − f(y)| < ε/3 as well as |f̃(x) − f̃(y)| < ε/3. Let x ∈ [0,1] \ Q. Then, with y ∈ (x, x + δ) ∩ [0,1] ∩ Q,

    |f(x) − f̃(x)| ≤ |f(x) − f(y)| + |f(y) − f̃(y)| + |f̃(y) − f̃(x)| < ε

holds. This means ‖f − f̃‖∞ < ε, and C̃[0,1] is a countable subset which is dense in C[0,1], i. e. C[0,1] is separable.

An analogous consideration shows that C[0,1]^d is separable for d ≥ 2, but in order to avoid confusion between the dimension d and the metric d(·,·), we solely considered d = 1 in the preceding example. In Example 3.3.12 it becomes clear that D[0,1] is not separable. Then, of course, l∞[0,1] is not separable either, because D[0,1]^d ⊂ l∞[0,1]^d holds for all d ∈ N.

Definition 3.3.7. Let d ∈ N, let S[0,1]^d ⊂ { f : [0,1]^d → R } be a set of real-valued functions on [0,1]^d and let (S[0,1]^d, S) be a measurable space. Let additionally (Ω,A,P) be a probability space. Then a measurable mapping X : Ω → S[0,1]^d is called a random function.

Another way to define random functions is to endow a random variable with a parameter t, i. e. to consider mappings X : Ω × [0,1] → R. Then it becomes obvious (if not, see the proof of Theorem 3.3.8) that if X is a random function, then X(·, t), or (X(·))(t) respectively, is a random variable for any t. Of course, in Definition 3.3.7, one may also consider a set of real-valued functions which are defined on R^d or on N (i. e. random sequences) instead of S[0,1]^d, but we confine ourselves to sets of functions which are needed in the course of Chapter 5. In this section, we will see that X(t) being a random variable for all t ∈ [0,1] does not imply in general that X is a random function. To be precise, we will see that for D[0,1], the choice of σ-algebra S in Definition 3.3.7 is essential. Measurability might already be an issue when considering something as simple as an empirical distribution function as in Definition 3.3.9. But first, we will show that for random functions on C[0,1], X(t) being a random variable for all t ∈ [0,1] implies X being a random function. The following theorem, as well as its proof, are based on Section 8 of Billingsley (1968).

Theorem 3.3.8. Let (Ω,A,P) be a probability space and consider (C[0,1], σ(U_C)), where U_C is the set of all open sets in C[0,1] equipped with the supremum norm, i. e.

    A ∈ U_C ⟺ ∀ f ∈ A ∃ ε > 0, such that { g ∈ C[0,1] | ‖f − g‖∞ < ε } ⊂ A,    (3.3.1)

and let X : Ω → C[0,1] be a mapping. Then X is measurable (i. e. a random function) if and only if X(t) is a random variable for all t ∈ [0,1].


Proof. For t ∈ [0,1], the mappings πt : C[0,1] → R with πt(f) := f(t) are called projections. These projections are continuous: let t ∈ [0,1], f ∈ C[0,1], ε > 0, δ := ε and g ∈ C[0,1] such that ‖f − g‖∞ < δ; then

    |πt(f) − πt(g)| = |f(t) − g(t)| ≤ ‖f − g‖∞ < δ = ε

holds. As πt is continuous, open sets O ⊂ R have open preimages π_t^{−1}(O) and thus π_t^{−1}(O) ∈ U_C. It is easy to verify that π_t^{−1}(O1), π_t^{−1}(O2) ∈ U_C for two open sets O1, O2 ⊂ R implies π_t^{−1}(R \ O1) ∈ σ(U_C) as well as π_t^{−1}(O1 ∪ O2) ∈ U_C. This means that π_t^{−1}(B) ∈ σ(U_C) holds for all B ∈ B (where B is the Borel σ-algebra on R, as usual).

Now let X : Ω → C[0,1] be a random function and let t ∈ [0,1]. Then X^{−1}(U) ∈ A for all U ∈ σ(U_C), especially for U := π_t^{−1}(B), given a B ∈ B. This means that for B ∈ B,

    (X(t))^{−1}(B) = { ω ∈ Ω | (X(ω))(t) ∈ B } = { ω ∈ Ω | πt(X(ω)) ∈ B }
                   = { ω ∈ Ω | X(ω) ∈ π_t^{−1}(B) } = X^{−1}( π_t^{−1}(B) ) ∈ A

holds and therefore X(t) is a random variable.

For the inverse implication, let X(t) be a random variable for all t ∈ [0,1]. This means πt(X) is measurable, or to put it differently: (πt(X))^{−1}(B) ∈ A holds for all B ∈ B and for all t ∈ [0,1]. As usual, we denote the open ball with radius ε > 0 around f ∈ C[0,1] by Bε(f). Here, this means

    Bε(f) := { g ∈ C[0,1] | ‖f − g‖∞ < ε }

for f ∈ C[0,1]. And of course, B̄ε(f) denotes the closure of said open ball. Let ε > 0 and f ∈ C[0,1]; then

    B̄ε(f) = ⋂_{q∈Q∩[0,1]} { g ∈ C[0,1] | |f(q) − g(q)| ≤ ε } = ⋂_{q∈Q∩[0,1]} π_q^{−1}( [f(q) − ε, f(q) + ε] )

holds by the continuity of f and as Q is dense in R. This yields

    X^{−1}( B̄ε(f) ) = { ω ∈ Ω | X(ω) ∈ B̄ε(f) }
                    = ⋂_{q∈Q∩[0,1]} { ω ∈ Ω | πq(X(ω)) ∈ [f(q) − ε, f(q) + ε] }
                    = ⋂_{q∈Q∩[0,1]} ( πq(X) )^{−1}( [f(q) − ε, f(q) + ε] ) ∈ A,

because (πq(X))^{−1}(B) ∈ A for all B ∈ B and, as a σ-algebra, A is closed under countable intersections. This means that X^{−1}(B) ∈ A holds for all closed balls B ⊂ C[0,1]. Let again f ∈ C[0,1]; then we have

    X^{−1}( Bε(f) ) = ⋃_{n∈N} X^{−1}( B̄_{ε−1/n}(f) ) ∈ A,

which means that X^{−1}(B) ∈ A also holds for all open balls B ⊂ C[0,1]. Let A ∈ U_C; then A is a union of open balls, because according to (3.3.1), for all f ∈ A there exists εf > 0 such that B_{εf}(f) ⊂ A. By Example 3.3.6, C[0,1] is separable, which is equivalent to C[0,1] having the Lindelöf property by Lemma 3.3.4. Thus each open cover of A has a countable subcover and therefore there exists a sequence (fn)n∈N ⊂ A such that

    A = ⋃_{f∈A} B_{εf}(f) = ⋃_{n=1}^∞ B_{ε_{fn}}(fn)

holds. But then

    X^{−1}(A) = { ω ∈ Ω | X(ω) ∈ ⋃_{n=1}^∞ B_{ε_{fn}}(fn) } = ⋃_{n=1}^∞ X^{−1}( B_{ε_{fn}}(fn) ) ∈ A

means that X^{−1}(A) ∈ A holds for all A ∈ U_C. Thus X is measurable (i. e. a random function) by Lemma 3.1.9.

But as mentioned earlier, for mappings from some probability space (Ω,A,P) to D[0,1] (and therefore also to l∞[0,1]), pointwise measurability is not sufficient for measurability. In order to show this in Theorem 3.3.11, we first introduce empirical distribution functions.

Definition 3.3.9. Let (Xn)n∈N be a sequence of independent and identically distributed random variables with X1 ∼ F for some distribution function F : R → [0,1]. The mapping Fn : R → [0,1] with

    Fn(x) := (1/n) ∑_{i=1}^n 1(Xi ≤ x)

is called the empirical distribution function (of the sample X1, . . . , Xn). We will write Hn for the empirical distribution function of a sample of random vectors X1, . . . , Xn with distribution function H, where 1(Xi ≤ x) is replaced by its vector analogue 1(Xi ≤ x) and, as usual, inequalities of vectors are meant to hold component-by-component as in (2.3.2).

It is easy to see that any empirical distribution function, given a sample of random vectors on the unit hypercube (i. e. for a fixed ω ∈ Ω), is an element of D[0,1]^d.
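
For a fixed ω, the empirical distribution function of Definition 3.3.9 is a simple step function; a minimal sketch for the one-dimensional case (sample and grid are arbitrary choices):

    import numpy as np

    def ecdf(sample: np.ndarray, x: np.ndarray) -> np.ndarray:
        # F_n(x) = (1/n) #{ i : X_i <= x }, evaluated at each point of x
        return (sample[:, None] <= x[None, :]).mean(axis=0)

    rng = np.random.default_rng(0)
    u = rng.uniform(size=50)                    # sample from U[0, 1], so F(x) = x
    print(ecdf(u, np.linspace(0.0, 1.0, 11)))   # roughly 0.0, 0.1, ..., 1.0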

Example 3.3.10. As usual, we consider the random variables on some probability space (Ω,A,P) and B is the Borel σ-algebra on R. Given independent and identically distributed random variables U1, . . . , Un with U1 ∼ U[0,1] and the corresponding empirical distribution function Fn as in Definition 3.3.9, it is easy to verify that Yi := 1(Ui ≤ x) is a random variable for all i ∈ {1, . . . , n} and for all x ∈ [0,1]. To this end, let B ∈ B. If neither 0 ∈ B nor 1 ∈ B, then Y_i^{−1}(B) = ∅ ∈ A by Definition 3.1.1. If 0 ∈ B and 1 ∉ B, then Y_i^{−1}(B) = U_i^{−1}( (x,∞) ) ∈ A because Ui is a random variable. If 0 ∉ B and 1 ∈ B, then Y_i^{−1}(B) = U_i^{−1}( (−∞, x] ) ∈ A because Ui is a random variable. And if both 0 ∈ B and 1 ∈ B, then Y_i^{−1}(B) = Ω ∈ A by Definition 3.1.1. Therefore, Fn(x) is a random variable for any fixed x ∈ [0,1]. As E( Fn(x) ) = F(x) = x and the Yi are independent, Fn(x) →a.s. F(x) by the law of large numbers (Proposition 3.2.7).

Due to the preceding example, it is tempting to consider the empirical distribution function as a random function. A natural choice of measurable space would be (D[0,1], σ(U_D)), where U_D is the set of all open sets in D[0,1] equipped with the supremum norm, i. e.

    A ∈ U_D ⟺ ∀ f ∈ A ∃ ε > 0, such that { g ∈ D[0,1] | ‖f − g‖∞ < ε } ⊂ A.    (3.3.2)

This means that σ(U_D) is the Borel σ-algebra on D[0,1] with the topology of the supremum norm. However, this results in Fn of Example 3.3.10 not being measurable (under the axiom of choice), as we will see in the following theorem.


Theorem 3.3.11. Let (Ω,A,P) be a probability space and, with U_D as in (3.3.2), consider the measurable space (D[0,1], σ(U_D)). If the mapping Fn : Ω → D[0,1] is defined as in Example 3.3.10, then Fn is not measurable.

The proof is based on Section 18 of Billingsley (1968), which is the first edition of Billingsley (1999).

Proof. Let U1, . . . , Un be independent and identically distributed random variables with U1 ∼ U[0,1] and Yi := 1(Ui ≤ x) for i ∈ {1, . . . , n} as in Example 3.3.10. We will show that already for n = 1, Fn is not measurable. In fact, consider the mapping F1 : Ω → D[0,1] with F1(ω) : [0,1] → R such that

    ( F1(ω) )(x) := 1( U1(ω) ≤ x ) for all x ∈ [0,1].

Now, for θ ∈ [0,1], consider the sets Aθ, which are given by

    Aθ := { f ∈ D[0,1] | ‖f − 1_{[θ,1]}‖∞ < 1/2 }    (3.3.3)

with 1_M(x) := 1(x ∈ M) as usual. Furthermore, for an arbitrary subset J ⊂ [0,1], the set AJ is given by AJ := ⋃_{θ∈J} Aθ. Then AJ is open because Aθ is open for any θ ∈ [0,1], and as an open set, AJ ∈ U_D, which means AJ ∈ σ(U_D). It holds that F_1^{−1}(AJ) = U_1^{−1}(J), or to put it differently:

    { ω ∈ Ω | F1(ω) ∈ AJ } = { ω ∈ Ω | U1(ω) ∈ J }.    (3.3.4)

In order to verify (3.3.4), let ω ∈ F_1^{−1}(AJ), i. e. F1(ω) ∈ AJ. If U1(ω) ∉ J, then

    ‖F1(ω) − 1_{[θ,1]}‖∞ = ‖1_{[U1(ω),1]} − 1_{[θ,1]}‖∞ = 1 > 1/2

holds for all θ ∈ J and therefore F1(ω) ∉ Aθ for all θ ∈ J, which would mean F1(ω) ∉ AJ. Thus F1(ω) ∈ AJ implies U1(ω) ∈ J. For the inverse implication, let ω ∈ U_1^{−1}(J). Then U1(ω) ∈ J means

    F1(ω) = 1_{[U1(ω),1]} ∈ A_{U1(ω)} ⊂ AJ

and thus F1(ω) ∈ AJ.

The above considerations yield that if F1 is measurable, then U_1^{−1}(J) ∈ A holds for all J ⊂ [0,1]. Moreover, if F1 is measurable, then there exists a measure µ on the power set of [0,1] with µ([a, b]) = b − a for all a, b ∈ [0,1] with a ≤ b. But this leads to a contradiction, which is derived by an example based on the construction of a non-measurable set in the proof of Theorem 19 by Rao (2004), where it is attributed to Vitali (1905).

As usual, let P([0,1]) be the set of all subsets of [0,1]. Now, let F1 be a random mapping. Then the mapping µ : P([0,1]) → [0,1] defined by

    µ(J) := P( F_1^{−1}(AJ) ) = P( U_1^{−1}(J) ) = P(U1 ∈ J)

is a probability measure on ([0,1], P([0,1])) by Theorem 3.1.8. For the further construction, we need addition modulo 1, which is denoted by +₁, i. e.

    x +₁ y := x + y if x + y ∈ [0,1), and x +₁ y := x + y − 1 else,

for x, y ∈ [0,1]. For J ⊂ [0,1] and x ∈ [0,1],

    J +₁ x := { y +₁ x | y ∈ J }

denotes the translation of all elements by x ∈ [0,1] modulo 1. Because U1 follows a uniform distribution, µ(J +₁ x) = µ(J) holds for all J ⊂ [0,1] and all x ∈ [0,1]. It is easy to see that

    x ∼₁ y :⟺ x − y ∈ Q

is an equivalence relation for x, y ∈ [0,1]. As an equivalence relation, ∼₁ divides [0,1] into disjoint equivalence classes. By the axiom of choice, it is possible to select one element from each equivalence class. Let J be the set of all those elements. Due to Q ∩ [0,1) being countable, there exists a sequence (qn)n∈N which contains all elements of Q ∩ [0,1) exactly once. For i ≠ j this means

    (J +₁ qi) ∩ (J +₁ qj) = ∅,

as well as ⋃_{n∈N} (J +₁ qn) = [0,1). But because of U1 ∼ U[0,1] and µ being a measure, we get

    1 = P( U1 ∈ [0,1) ) = µ( [0,1) ) = µ( ⋃_{n=1}^∞ (J +₁ qn) ) = ∑_{n=1}^∞ µ(J +₁ qn) = ∑_{n=1}^∞ µ(J),

which shows that it is not possible to assign a value to µ(J), because neither µ(J) = 0 nor µ(J) > 0 results in ∑_{n=1}^∞ µ(J) = 1. Therefore F1 is not measurable. Analogous considerations show that Fn is not measurable for arbitrary n ∈ N.

Example 3.3.12. With the notation of the preceding proof, it may be seen in the following way that D[0,1] with the metric induced by ‖·‖∞ is not separable. For θ ∈ [0,1], let Aθ ⊂ D[0,1] be defined as in (3.3.3). Then Aθ is an open set for all θ ∈ [0,1] and thus ⋃_{θ∈[0,1]} Aθ is an open cover. But

    ‖1_{[θ,1]} − 1_{[ϕ,1]}‖∞ = 1 > 1/2

for θ ≠ ϕ implies 1_{[θ,1]} ∉ Aϕ for θ ≠ ϕ. This means that the open cover ⋃_{θ∈[0,1]} Aθ covers uncountably many elements 1_{[θ,1]}, each of which is included in exactly one set Aθ, and therefore no countable subcover can contain all of them. Thus D[0,1] is not a Lindelöf space and by Lemma 3.3.4 it is not separable.

Because of D[0,1] ⊂ l∞[0,1], the following corollary is a direct consequence of Theorem 3.3.11 (just take f := Fn with Fn as in Theorem 3.3.11).

Corollary 3.3.13. Let (Ω,A,P) be a probability space and consider l∞[0,1] with the Borel σ-algebra of the open sets according to the metric induced by the supremum norm ‖·‖∞. Then there exists a mapping f : Ω → l∞[0,1] which is not measurable.

According to van der Vaart and Wellner (1996), there exist two popular ways to resolve the issue of non-measurability in Theorem 3.3.11. The first one is to refrain from using the metric induced by ‖·‖∞ in favor of the so-called Skorokhod metric (or certain variations of it), which will be given in Definition 3.3.14. The second one is to relax the requirement of measurability and will be discussed in Section 3.4 and the remainder of this chapter.

As van der Vaart and Wellner (1996) state, Skorokhod (1956) endowed D[0,1] with the metric which nowadays bears his name. We will give the definition as in Chapter 3 (Section 12) of Billingsley (1999) and implicitly take for granted that it is indeed a metric.


Definition 3.3.14. Let Λ ⊂ C[0,1] be the set of all bijective, strictly increasing continuous functions λ : [0,1] → [0,1]. Then the metric dS : D[0,1] × D[0,1] → [0,∞) with

    dS(x, y) := inf_{λ∈Λ} max{ ‖λ − id‖∞, ‖x − y ∘ λ‖∞ }

is called the Skorokhod metric.

If dU(x, y) := ‖x − y‖∞ denotes the uniform metric, then dS(x, y) ≤ dU(x, y) holds for all x, y ∈ D[0,1], as obviously λ = id ∈ Λ. But when considering x := 1_{[θ,1]} and y := 1_{[ϕ,1]} with θ, ϕ ∈ (0,1) and θ < ϕ, then dU(x, y) = 1, whereas dS(x, y) ≤ ϕ − θ. For example, let λ : [0,1] → [0,1] with

    λ(t) := (ϕ/θ) t if t ∈ [0, θ), and λ(t) := ϕ + ((1−ϕ)/(1−θ)) (t − θ) if t ∈ [θ, 1];

then it is easy to see that λ ∈ Λ (with Λ as in Definition 3.3.14), that x(t) = y( λ(t) ) holds for all t ∈ [0,1], and that ‖λ − id‖∞ = ϕ − θ.

Unfortunately, D[0,1] with the Skorokhod metric dS is not complete, but Billingsley (1999) shows that there is a metric d°S which induces an equivalent topology and for which D[0,1] is complete.
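
The effect of the time change λ above is easy to verify numerically; the following sketch (with the illustrative choice θ = 0.3, ϕ = 0.4) confirms ‖λ − id‖∞ = ϕ − θ and x = y ∘ λ on a grid:

    import numpy as np

    theta, phi = 0.3, 0.4                       # illustrative values with theta < phi

    def lam(t):
        # piecewise linear time change with lam(theta) = phi
        return np.where(t < theta, phi / theta * t,
                        phi + (1 - phi) / (1 - theta) * (t - theta))

    t = np.linspace(0.0, 1.0, 10_001)
    x = (t >= theta).astype(float)              # x = 1_{[theta, 1]}
    y_lam = (lam(t) >= phi).astype(float)       # y(lam(t)) with y = 1_{[phi, 1]}

    print(np.max(np.abs(lam(t) - t)))           # about phi - theta = 0.1
    print(np.max(np.abs(x - y_lam)))            # 0.0, i.e. x = y ∘ lam on the grid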

3.4 Weak Convergence

The fact that, in general, already empirical distribution functions fail to be random functions was discussed in Theorem 3.3.11 and the part of Section 3.3 following it. But still, at each fixed point x, Fn(x) is a random variable (see Example 3.3.10). This suggests relaxing the prerequisite of measurability of random functions in Definition 3.3.7 as follows.

Definition 3.4.1. Let d ∈ N, let S[0,1]^d ⊂ { f : [0,1]^d → R } be a set of real-valued functions on [0,1]^d and let (S[0,1]^d, S) be a measurable space. Let additionally (Ω,A,P) be a probability space. Then a mapping X : Ω → S[0,1]^d is called a random process if (X(·))(t) : Ω → R is a random variable for all t ∈ [0,1]^d. For any ω ∈ Ω, the realization X(ω) : [0,1]^d → R is called a path (of the random process).

This means that a random process X may be seen as a collection of random variables Xt : Ω → R, indexed by t ∈ [0,1]^d, such that Xt(ω) := (X(ω))(t). Furthermore (see the proof of Theorem 3.3.8), each random function is a random process, but the converse is not true, as an empirical distribution function is in general not a random function (see Theorem 3.3.11) but a random process. Just as with random functions, it is of course possible to define random processes in a more general setting, but for the needs of the remaining chapters, it suffices to consider random processes on [0,1]^d. In the literature, random processes are also called stochastic processes.

Still, the main interest in empirical distribution functions lies not in their properties as functions with random values or as random processes, but in their asymptotic properties. Therefore, as van der Vaart and Wellner (1996) suggest, one might consider some kind of convergence where only the limit, but not necessarily the elements of the converging sequence, needs to be measurable. To this end, we first define outer expectation as in van der Vaart and Wellner (1996).

Definition 3.4.2. Let (Ω,A,P) be a probability space and let X : Ω → R be a function. Then

    E*(X) := inf{ E(Y) | Y : Ω → R is measurable, E(Y) exists and Y ≥ X }

is called the outer expectation of X. Just like in van der Vaart and Wellner (1996), “E(Y) exists” means that either E( max{Y, 0} ) or E( max{−Y, 0} ) is finite, and measurability of Y is meant to hold for the Borel σ-algebra on R.

Just like most of the results and definitions of this section, also the following definition of weak convergence of random processes is adapted to the needs of this work and may be found in more generality in the book of van der Vaart and Wellner (1996).

Definition 3.4.3. Let M be some metric space (for example one of the spaces of Definition 3.3.1 or R^d). Let Cb(M) be the set of all bounded, continuous and real-valued functions on M, i. e.

    Cb(M) := { f : M → R | ‖f‖∞ < ∞ and f is continuous }.

Additionally, let (Ω,A,P) be a probability space and let Xn : Ω → M be a function for n ∈ N (which is not necessarily measurable). Furthermore, let X : Ω → M be measurable with respect to the Borel σ-algebra on M. If

    E*( f(Xn) ) → E( f(X) )

holds for all f ∈ Cb(M) (and for n → ∞), then Xn is said to converge weakly to X and we write Xn ⇝ X.

In the following example, it is shown that this definition of weak convergence is a generalization of the weak convergence of probability measures in Definition 3.2.5. To be precise, if all elements of a weakly convergent sequence of functions are measurable, i. e. random functions, then their induced probability measures converge weakly to the induced measure of the limit.

Example 3.4.4. Let S[0,1]^d be one of the spaces of Definition 3.3.1 and let S be the respective Borel σ-algebra. Additionally, let the functions X : Ω → S[0,1]^d and Xn : Ω → S[0,1]^d be measurable for all n ∈ N and Xn ⇝ X for n → ∞. Let µn and µ be the probability measures which are induced by Xn and X on S[0,1]^d, i. e.,

    µn(A) := P( X_n^{−1}(A) ),    µ(A) := P( X^{−1}(A) )

for all A ∈ S. Because 1_A( Xn(ω) ) = 1_{X_n^{−1}(A)}(ω) holds for all ω ∈ Ω and for all A ∈ S, it is easy to see that

    ∫_{S[0,1]^d} f dµn = ∫_Ω f(Xn) dP = E( f(Xn) )

holds for all measurable functions f : S[0,1]^d → R, especially for f ∈ Cb(S[0,1]^d) (with Cb as in Definition 3.4.3). Now let f ∈ Cb(S[0,1]^d); then f(Xn) is measurable because Xn is measurable and f is continuous. This yields E*( f(Xn) ) = E( f(Xn) ) and thus

    ∫_{S[0,1]^d} f dµn = E*( f(Xn) ) → E( f(X) ) = ∫_Ω f(X) dP = ∫_{S[0,1]^d} f dµ

holds, which means µn ⇒ µ by Definition 3.2.5 and Proposition 3.2.6.

Example 3.4.5. Similarly, if Xn, X : Ω → R^d are given as in Definition 3.4.3 and additionally Xn is measurable (i. e. a random vector) for all n ∈ N, then weak convergence coincides with convergence in distribution as in Definition 3.2.4. Or to put it differently, if the elements of the sequence (Xn)n∈N are all measurable, then Xn ⇝ X implies Xn →d X.


In order to give a sufficient condition for weak convergence in Proposition 3.4.8, the notions of “inner probability” and “asymptotic tightness” as in the following definitions are needed. They are both versions of the respective definitions in the book of van der Vaart and Wellner (1996).

Definition 3.4.6. Let (Ω,A,P) be a probability space and let B ⊂ Ω. Then

    P∗(B) := sup{ P(A) | A ⊂ B and A ∈ A }

is called the inner probability of B.

Of course, “outer probability” as well as “inner expectation” may be defined similarly but are not needed in this work.

Definition 3.4.7. Let d ∈ N, let S[0,1]^d ⊂ { f : [0,1]^d → R } be a set of real-valued functions on [0,1]^d and let (S[0,1]^d, S) be a measurable space. Let additionally (Ω,A,P) be a probability space and let Xn : Ω → S[0,1]^d be a function. The sequence (Xn)n∈N is called asymptotically tight if for all ε > 0 there exists a compact set K ⊂ S[0,1]^d such that

    lim inf_{n→∞} P∗( Xn ∈ ⋃_{x∈K} Bδ(x) ) ≥ 1 − ε

holds for all δ > 0. If X1 is a random function, then it is called tight if for all ε > 0 there exists a compact set K ⊂ S[0,1]^d such that P(X1 ∈ K) ≥ 1 − ε holds. If Xn is a random function for all n ∈ N, then the sequence is called uniformly tight if for all ε > 0 there exists a compact set K ⊂ S[0,1]^d such that P(Xn ∈ K) ≥ 1 − ε holds for all n ∈ N.

Obviously, any uniformly tight sequence is also asymptotically tight. Now we are set to give a sufficient condition for weak convergence in the following proposition. It is part of Theorem 1.5.4 in van der Vaart and Wellner (1996), where also a proof may be found.

Proposition 3.4.8. Let (Ω,A,P) be a probability space and consider l∞[0,1]^d with the Borel σ-algebra. Let furthermore Xn : Ω → l∞[0,1]^d be an arbitrary mapping for n ∈ N and let X : Ω → l∞[0,1]^d be a random function. If the sequence (Xn)n∈N is asymptotically tight and

    ( Xn(t1), . . . , Xn(tk) )⊤ →d ( X(t1), . . . , X(tk) )⊤

holds for n → ∞, for all t1, . . . , tk ∈ [0,1]^d and for all k ∈ N, then Xn ⇝ X holds.

Note that the respective theorem in van der Vaart and Wellner (1996) is more general, in that it suffices that the margins of the asymptotically tight sequence converge to the margins of a random process X. Then there is a version of X that is measurable and maps to l∞[0,1]^d, such that Xn ⇝ X holds.

As we are mainly interested in the weak convergence of empirical distribution functions, it would suffice to consider D[0,1]^d instead of the larger space l∞[0,1]^d. However, Proposition 3.4.8 yields a limiting process in l∞[0,1]^d. A result where the limit is a process with continuous paths is given in Section 3.5 (see Corollary 3.5.9).

For most theorems on convergence in distribution or weak convergence of measures, there is an analogous version for the generalized weak convergence of Definition 3.4.3. But in view of Chapter 5, the following version of the continuous mapping theorem is the most important.


Theorem 3.4.9 (Continuous mapping theorem). Let S1[0,1]^d, S2[0,1]^d be spaces of real-valued functions on [0,1]^d, let k ∈ N and let g1 : S1[0,1]^d → S2[0,1]^d as well as g2 : S1[0,1]^d → R^k be continuous mappings. Furthermore, let (Ω,A,P) be a probability space with functions Xn, X : Ω → S1[0,1]^d. Then Xn ⇝ X implies g1(Xn) ⇝ g1(X) and g2(Xn) ⇝ g2(X) for n → ∞. If g2(Xn) is a random vector for all n ∈ N, then g2(Xn) →d g2(X) holds as well.

Proof. Let Xn ⇝ X and let f ∈ Cb(S2[0,1]^d) as in Definition 3.4.3. Then f ∘ g1 is continuous as a composition of continuous mappings, and it is bounded because f is bounded. By definition,

    E*( h(Xn) ) → E( h(X) )

holds for all h ∈ Cb(S1[0,1]^d), and therefore especially for h := f ∘ g1 ∈ Cb(S1[0,1]^d). This implies

    E*( f( g1(Xn) ) ) → E( f( g1(X) ) ),

which is equivalent to g1(Xn) ⇝ g1(X). Similarly, let f ∈ Cb(R^k); then f ∘ g2 ∈ Cb(S1[0,1]^d) yields g2(Xn) ⇝ g2(X). If g2(Xn) is measurable for all n ∈ N, then f( g2(Xn) ) is a random variable as well and the outer expectation becomes a regular expectation, i. e.

    E*( f( g2(Xn) ) ) = E( f( g2(Xn) ) )

for all n ∈ N. The portmanteau theorem (see Proposition 3.2.6) thus yields g2(Xn) →d g2(X).

3.5 Donsker’s Theorem

An empirical distribution function may be seen as a sum of mappings from some probability space (Ω,A,P) to the space D[0,1]^d (and thus to l∞[0,1]^d). If we consider a sequence (Yn)n∈N of independent and identically distributed random variables with distribution F and empirical distribution function Fn as in Definition 3.3.9, as well as the mappings Xn : Ω → D[0,1] given by Xn := 1_{[Yn,∞)}, then we have Fn = X̄n. As E( Fn(x) ) = F(x) holds for every x ∈ R, the question arises whether there exists some analogy to the central limit theorem (see Proposition 3.2.9). To be precise, the question is whether √n( Fn − F ) converges in some sense. Although Fn may not be measurable (see Theorem 3.3.11), we will see in this section that there is indeed weak convergence to a continuous limit. But first, some definitions are needed.

Definition 3.5.1. Let d ∈ N and let (Xn)n∈N be a sequence of independent and identically distributed random vectors (or random variables if d = 1) with distribution function H : [0,1]^d → [0,1] and empirical distribution function Hn (see Definition 3.3.9) on the probability space (Ω,A,P). Then the function

    √n( Hn − H ) : Ω → D[0,1]^d

is called the empirical process (of H).

Note that we are mainly concerned with distributions on the unit hypercube and thus the empirical process maps to D[0,1]^d. If H is not a distribution on the unit hypercube, then the empirical process maps to D(R^d) ⊂ l∞(R^d), where l∞(R^d) is defined as the set of bounded functions on R^d and D(R^d) is defined as the set of functions on R^d which are continuous from the upper right quadrant and which possess limits from any (fixed) quadrant (see Definition 3.3.1). Note in addition that the empirical process is indeed a random process, by considerations as in Example 3.3.10.


Definition 3.5.2. Let (Ω,A,P) be a probability space and let W : Ω → C[0,1] be a random process (and thus a random function by Theorem 3.3.8), such that

1. P( W(0) = 0 ) = 1 (the process starts at 0),

2. if 0 ≤ t1 ≤ . . . ≤ tn ≤ 1 for some n ∈ N, then

    ( W(tn) − W(t_{n−1}), . . . , W(t2) − W(t1) )⊤ ∼ N(0, Σ)

holds with Σij = (t_{n−i+1} − t_{n−i}) · 1(i = j) for i, j ∈ {1, . . . , n − 1} (independent and normally distributed increments),

then W is called a Wiener process or Brownian motion. The random process B : Ω → C[0,1] with B(t) := W(t) − t W(1) for all t ∈ [0,1] is called a Brownian bridge. A random process G : Ω → C[0,1]^d, with d ∈ N, such that ( G(t1), . . . , G(tk) )⊤ follows a k-dimensional normal distribution for all k ∈ N and all ti ∈ [0,1]^d, is called a Gaussian process.

The Wiener process is named after Norbert Wiener, who studied the mathematical implications of a phenomenon which was observed by Robert Brown: the seemingly random movement of certain particles under the microscope. Note that, in the preceding definition as well as in most other cases, the argument ω of the processes is suppressed in order to improve readability. For example, W(t2) − W(t1) denotes the random variable

    ( W(·) )(t2) − ( W(·) )(t1) : Ω → R.

The existence of a Wiener process is derived, for example, in Section 37 of Billingsley (1995, Chapter 7) or in Section 8 of Billingsley (1999). The notion of “Brownian bridge” is justified as not only P( B(0) = 0 ) = 1, but also P( B(1) = 0 ) = 1 holds for a process B as in the above definition. Note that all three processes of Definition 3.5.2 have continuous paths as they map to C[0,1] or C[0,1]^d, respectively.

Corollary 3.5.3. Let W be a Wiener process and B be a Brownian bridge. Then the respective covariances are given by

    cov( W(s), W(t) ) = min{s, t},
    cov( B(s), B(t) ) = min{s, t} − st

for s, t ∈ [0,1].

Proof. Both covariances are straightforward to derive by the linearity of the covariance operator, the independent increments of the Wiener process and the normal distribution of the margins. For example, for s ≤ t, we get

    cov( W(s), W(t) ) = cov( W(s), W(t) − W(s) ) + cov( W(s), W(s) ) = 0 + s,

and similar considerations for s > t yield cov( W(s), W(t) ) = t.
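
Corollary 3.5.3 may be spot-checked by Monte Carlo simulation; the following sketch (grid size, number of paths and seed are arbitrary choices) builds Wiener paths from independent N(0, dt) increments, forms B(t) = W(t) − t W(1) and compares an empirical covariance with min{s, t} − st:

    import numpy as np

    rng = np.random.default_rng(1)
    n_paths, n_steps = 20_000, 200
    dt = 1.0 / n_steps
    t = np.linspace(0.0, 1.0, n_steps + 1)

    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
    B = W - t * W[:, -1:]                        # Brownian bridge B(t) = W(t) - t W(1)

    s_idx, t_idx = 50, 150                       # s = 0.25, t = 0.75
    # E(B(s)) = 0, so the mean of the products estimates the covariance.
    print(np.mean(B[:, s_idx] * B[:, t_idx]))    # about min{s, t} - s t = 0.0625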

For d = 1, the following proposition gives a version of the central limit theorem for the empirical process. It is based on Theorem 19.3 of van der Vaart (1998).


Proposition 3.5.4 (Donsker (1952)). Let F : [0,1] → [0,1] be a continuous distribution function and let Fn be its empirical counterpart (for n ∈ N). The empirical process then converges weakly to a Gaussian process GF, i. e.

    √n( Fn − F ) ⇝ GF

in D[0,1], where GF(x) = B( F(x) ) for a Brownian bridge B, and thus E( GF(x) ) = 0 as well as

    cov( GF(x), GF(y) ) = min{ F(x), F(y) } − F(x) F(y)

holds for all x, y ∈ [0,1].

According to Dudley (1999, Notes to Section 1.1), some issues concerning measurability were found in the original proof by Donsker (1952), but the theorem which nowadays bears Donsker's name was finally proved four years later by Kolmogorov and Skorokhod, who resolved the measurability issues by developing a version of what is now known as the Skorokhod metric (see Definition 3.3.14).
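
Numerically, the covariance structure of Proposition 3.5.4 already shows up for moderate n; the following sketch (uniform case F(x) = x; simulation sizes are arbitrary choices) compares the empirical covariance of the empirical process at two points with min{F(x), F(y)} − F(x)F(y):

    import numpy as np

    rng = np.random.default_rng(2)
    n, reps = 500, 10_000
    u = rng.uniform(size=(reps, n))              # U[0, 1] samples, F(x) = x

    def emp_proc(x):
        # one realization of sqrt(n) (F_n(x) - F(x)) per replication
        return np.sqrt(n) * ((u <= x).mean(axis=1) - x)

    gx, gy = emp_proc(0.3), emp_proc(0.6)
    print(np.mean(gx * gy))                      # about min{0.3, 0.6} - 0.3 * 0.6 = 0.12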

Just like in van der Vaart (1998) and van der Vaart and Wellner (1996, Section 2.1), the proof of Proposition 3.5.4, as well as an extension to the case d > 1, follows from more general considerations. Until now, we mostly considered mappings from a probability space Ω to a space of real-valued functions on [0,1]^d like C[0,1]^d or l∞[0,1]^d (as for example the Gaussian and Wiener processes in Definition 3.5.2). This may be readily extended to l∞(S), where S is a set of functions f : R^d → R. More precisely, each element of l∞(S) maps the functions in S to R and is bounded; to put it differently, g ∈ l∞(S) means that g : S → R is a function with sup_{f∈S} |g(f)| < ∞. This may be harder to imagine but nevertheless does not change the preceding results.

Definition 3.5.5. Let H : R^d → R be a d-dimensional distribution function and, for n ∈ N, let X, X1, . . . , Xn be independent random vectors on a probability space (Ω,A,P) which are identically distributed according to H. Furthermore, let S be a set of measurable functions f : R^d → R such that E( f²(X) ) exists. Then the mapping Gn : Ω → l∞(S), where Gn(ω) : S → R with

    ( Gn(ω) )(f) := √n ( (1/n) ∑_{i=1}^n f( Xi(ω) ) − E( f(X) ) )    (3.5.1)

for all f ∈ S and ω ∈ Ω, is called the empirical process (of H) on S if ( Gn(·) )(f) is measurable for each f ∈ S. If there exists a Gaussian process GH on S such that E( GH(f) ) = 0 and

    cov( GH(f), GH(g) ) = E( f(X) g(X) ) − E( f(X) ) E( g(X) )    (3.5.2)

holds for all f, g ∈ S, and if Gn ⇝ GH, then S is called H-Donsker or just Donsker.

With the preceding definition, the weak convergence of the empirical process to a Gaussian process may be derived by showing that the set

    S := { 1_{(−∞,x]} | x ∈ R^d }

is Donsker, where as usual (−∞, x] := (−∞, x1] × . . . × (−∞, xd] for x ∈ R^d. With this special set S of indicator functions, the empirical process Gn of (3.5.1), with f = 1_{(−∞,x]} ∈ S for some x ∈ R^d, becomes

    Gn(f) = √n ( (1/n) ∑_{i=1}^n 1_{(−∞,x]}(Xi) − E( 1_{(−∞,x]}(X) ) ) = √n ( Hn(x) − H(x) ),

which is the classical empirical process.

Finite sets S are Donsker by the central limit theorem. A criterion for an infinite set S to be H-Donsker is given in Proposition 3.5.7. As van der Vaart (1998, page 270) puts it: “whether a class of functions is Donsker, depends on the size of the class.” In order to quantify this size, some definitions are needed, which are based on the definitions by van der Vaart (1998) and by van der Vaart and Wellner (1996).

Definition 3.5.6. Let H be a continuous d-dimensional distribution function and let S be a set of measurable functions such that E( f²(X) ) exists for X ∼ H and for all f ∈ S.

1. The L²(H)-norm on S is denoted by

    ‖f‖_{H,2} := ( ∫_{R^d} |f(x)|² dH(x) )^{1/2} = √( E( |f(X)|² ) )

for f ∈ S and X ∼ H.

2. If fl, fu : R^d → R are two functions (not necessarily in S, but with finite norms), then the set

    [fl, fu] := { f ∈ S | fl(x) ≤ f(x) ≤ fu(x) for all x ∈ R^d }

is called a bracket.

3. Let ‖·‖ be a norm on the real-valued functions on R^d and let ε > 0. If [fl, fu] is a bracket such that ‖fl − fu‖ < ε holds, then it is called an ε-bracket.

4. Let N_{[ ]}( ε, S, ‖·‖ ) be the minimum number of ε-brackets (according to the norm ‖·‖) which are needed to cover S (i. e. S is a subset of the union of those ε-brackets). Then N_{[ ]}( ε, S, ‖·‖ ) is called the bracketing number.

The following proposition is Theorem 19.5 of van der Vaart (1998), where also a proof is given. Other criteria for a set S to be H-Donsker may be found in Section 2.5 of van der Vaart and Wellner (1996).

Proposition 3.5.7. Let H be a continuous d-dimensional distribution function and let S be a set of measurable functions such that ‖f‖_{H,2} exists for all f ∈ S. If

    ∫_0^1 √( log N_{[ ]}( ε, S, ‖·‖_{H,2} ) ) dε < ∞

holds, then S is H-Donsker.

In the following theorem, which is based on Example 19.6 of van der Vaart (1998) and Example 2.5.4 of van der Vaart and Wellner (1996), it is shown that a certain class of indicator functions is Donsker.

Theorem 3.5.8. Let H be a continuous d-dimensional distribution function and let S be the set of functions given by

    S := { 1_{(−∞,x]} | x ∈ R^d };

then S is H-Donsker.


Proof. First of all, the indicator functions in S are Borel measurable by considerations analogous to the ones in Example 3.3.10. And ‖1_{(−∞,x]}‖_{H,2} = √( H(x) ) exists for all x ∈ R^d, and thus ‖f‖_{H,2} exists for all f ∈ S.

For i ∈ {1, . . . , d}, let Fi be the i-th one-dimensional marginal distribution of H. Let 1 ≥ ε > 0, let x_{i,0} := −∞ for i ∈ {1, . . . , d}, as well as x_{i,j} := F_i^{−}( (j ε²/(2d)) ∧ 1 ) for j ∈ {1, . . . , n}, where 2 ∨ (2d/ε²) ≤ n ≤ 3d/ε². Finally, set x_{i,n+1} := ∞ for i ∈ {1, . . . , d}. Then Fi(x_{i,j+1}) − Fi(x_{i,j}) < ε²/d holds for all i and for all j ≤ n. Let j ∈ {0, 1, . . . , n}^d; then [f_j, f_{j+1}] with

    f_j := 1_{(−∞, x_j]},
    f_{j+1} := 1_{(−∞, x_{j+1}]},

where x_j := (x_{1,j1}, . . . , x_{d,jd})⊤ and x_{j+1} := (x_{1,j1+1}, . . . , x_{d,jd+1})⊤, is an ε-bracket due to

    ‖f_{j+1} − f_j‖²_{H,2} = E( |f_{j+1}(X) − f_j(X)|² ) = P( X ∈ (−∞, x_{j+1}] \ (−∞, x_j] )
                          ≤ ∑_{i=1}^d P( Xi ∈ (x_{i,ji}, x_{i,ji+1}] ) = ∑_{i=1}^d ( Fi(x_{i,ji+1}) − Fi(x_{i,ji}) ) < ε².

Let f ∈ S; then there exists x ∈ R^d such that f = 1_{(−∞,x]}. Furthermore, there exists j ∈ {0, 1, . . . , n}^d such that x_{i,ji} < xi ≤ x_{i,ji+1} holds for all i ∈ {1, . . . , d} and thus

    f_j(y) ≤ f(y) ≤ f_{j+1}(y)

holds for all y ∈ R^d. The ε-brackets [f_j, f_{j+1}] with j ∈ {0, 1, . . . , n}^d therefore cover S. This yields

    1 ≤ N_{[ ]}( ε, S, ‖·‖_{H,2} ) ≤ (n+1)^d ≤ (4d)^d / ε^{2d},

and thus log N_{[ ]}( ε, S, ‖·‖_{H,2} ) ≤ c1 + c2 log(1/ε) holds for some constants c1, c2 ∈ (1,∞). But then

    √( c1 + c2 log(1/ε) ) ≤ c1 + c2 log(1/ε)

for ε ∈ (0, 1], and by

    ∫_0^1 ( c1 + c2 log(1/ε) ) dε = c1 + c2 < ∞

we may apply Proposition 3.5.7, which completes the proof.

Corollary 3.5.9. Let H : [0,1]^d → [0,1] be a continuous d-dimensional distribution function and let Hn be the empirical distribution function based on n independent random vectors with distribution H. Then

    √n( Hn − H ) ⇝ GH

holds on D[0,1]^d, where GH is a Gaussian process (and thus has continuous paths) with E( GH(x) ) = 0 and

    cov( GH(x), GH(y) ) = H(x ∧ y) − H(x) H(y)

for x, y ∈ [0,1]^d.


As usual, the minimum x ∧ y is meant to hold for all components, i. e. (x ∧ y)i := min{xi, yi} for i ∈ {1, . . . , d}. Corollary 3.5.9 is a direct consequence of Theorem 3.5.8 by the considerations between Definition 3.5.5 and Definition 3.5.6. The covariance structure is derived by using the indicator functions of Theorem 3.5.8 in (3.5.2).

The Gaussian process GH of Corollary 3.5.9 is called a (tucked down) Brownian sheet. This notion becomes especially clear for d = 2 and a distribution function H on the unit square: then P( GH(u) = 0 ) = 1 holds for all u ∈ [0,1]² \ (0,1)².


4 Limits of Non-Exchangeability

Parts of this chapter, especially Section 4.3 and the following, are based on joint work with Ulrich Stadtmüller and were published as Harder and Stadtmüller (2014).

4.1 Some Concepts of Multivariate Symmetry

For univariate distributions, most authors agree on one definition of symmetry (about some point a ∈ R), namely Definition 4.1.1. But when it comes to multivariate distributions and especially copulas, different notions of symmetry exist. In this section, multivariate (where appropriate) extensions of the four different concepts of bivariate symmetry in Nelsen (1993) are given. Serfling (2006) also gives four definitions of multivariate symmetry. The content of one of the definitions of these two authors coincides, but still they use different designations. Exchangeability, which will be discussed in Section 4.2, is different from all of these notions of symmetry; nevertheless, exchangeable distributions are referred to as being symmetric by some authors. This demonstrates that in a multivariate setting, it is important to precisely define which notion of symmetry is employed.

4.1.1 Random Vectors

We commence this section with the definition of symmetry for univariate distribution functions as in Nelsen (1993) or Nelsen (2006).

Definition 4.1.1. Let F be a continuous univariate distribution function, X ∼ F and a ∈ R. If X − a and a − X are identically distributed, i. e.

    P(X − a ≤ x) = P(a − X ≤ x)

holds for all x ∈ R, then X is called symmetric (about a).

Lemma 4.1.2. If F possesses a density f, then X ∼ F is symmetric about a if and only if f(a − x) = f(a + x) holds for all x ∈ R \ N, where N ⊂ R is a set with Lebesgue measure 0.

Proof. Obviously, X is symmetric about a if and only if F(a + x) = F̄(a − x) holds for all x ∈ R, where F̄ := 1 − F denotes the survival function. Then the claim follows, because

    F(a + x) = ∫_{−∞}^{a+x} f(s) ds = ∫_{−∞}^{x} f(a + t) dt = ∫_{−∞}^{x} f(a − t) dt = ∫_{a−x}^{∞} f(u) du = F̄(a − x)

holds for all x ∈ R, with the substitutions s = a + t and u = a − t.


Now, we give definitions for a multivariate extension of the three bivariate concepts of symmetry of Nelsen (2006). These concepts are also discussed in Nelsen (1993), where an additional concept, called conditional symmetry, is suggested. But we will omit conditional symmetry, as it has no straightforward extension to arbitrary dimensions. In the first definition, we consider random vectors whose components exhibit univariate symmetry.

Definition 4.1.3. Let F1, . . . , Fd be continuous univariate distribution functions and Xi ∼ Fi for i ∈ {1, . . . , d}. If there exists a ∈ R^d such that Xi is symmetric about ai for all i ∈ {1, . . . , d} (in the sense of Definition 4.1.1), then X is called marginally symmetric (about a).

Next, we replace both the random variable and the real number in Definition 4.1.1 by corresponding vectors.

Definition 4.1.4. Let F1, . . . , Fd be continuous univariate distribution functions and Xi ∼ Fi for i ∈ {1, . . . , d}. If there exists a ∈ R^d such that X − a and a − X are identically distributed, then X is called radially symmetric (about a).

Nelsen (1993) states that he prefers the term “radial symmetry” to others, as it emphasizes that the points where the distribution function and the survival function of X are evaluated “lie on rays, emanating in opposite directions from a.” In the third notion of symmetry, we request that the distribution of the centered random vector remains unchanged not only when all signs are reversed simultaneously, but also when an arbitrary subset of components is multiplied by −1. Along the way, with x ◦ y, a simplifying notation for the component-by-component multiplication of vectors x, y ∈ R^d, also known as the “Hadamard product,” is defined. Of course, it should not be confused with the composition of mappings f, g : R → R as in (f ◦ g)(x) = f( g(x) ).

Definition 4.1.5. Let F1, . . . , Fd be continuous univariate distribution functions and Xi ∼ Fi for i ∈ {1, . . . , d}. By x ◦ y, we denote the Hadamard product of two vectors x, y ∈ R^d, i. e. (x ◦ y)i = xi yi for i ∈ {1, . . . , d}. If there exists a ∈ R^d such that the random vectors b ◦ (X − a) are identically distributed for all 2^d choices of b ∈ {−1, 1}^d, then X is called jointly symmetric (about a).

Before discussing the differences between these symmetries, we first explore some implications in the next two theorems. The following theorem is the multivariate version of Theorem 2.1 of Nelsen (1993), which is concerned with the bivariate case.

Theorem 4.1.6. Let X be a d-dimensional, continuous random vector and a ∈ R^d. If X is jointly symmetric about a, then X is also radially symmetric about a. If X is radially symmetric about a, then it is also marginally symmetric about a.

Proof. Let X be jointly symmetric about a. Then, by choosing b = 1 or b = −1, respectively (where 1 := (1, . . . , 1)⊤ ∈ R^d as usual), in Definition 4.1.5, X is obviously radially symmetric about a.

Now, let X be radially symmetric about a. If H1 is the distribution function of X − a and H2 is the distribution function of a − X, then H1(x) = H2(x) for all x ∈ R^d and therefore the univariate margins must coincide as well.

Due to the following theorem in combination with Theorem 4.1.6, all three notions of multivariate symmetry coincide if the components of the random vector X in consideration are independent (but not necessarily identically distributed). It is a multivariate version of Theorem 2.4 of Nelsen (1993), where the bivariate case is discussed.


Theorem 4.1.7. Let X be a d-dimensional, continuous random vector with copula Π and let a ∈ R^d. If X is marginally symmetric about a, then X is jointly symmetric about a.

Proof. Let H be the joint distribution function of X with univariate margins F1, . . . , Fd and let b ∈ {−1, 1}^d. As X is assumed to be marginally symmetric about a, we get

    P( bi(Xi − ai) ≤ xi ) = P( Xi − ai ≤ xi )

for all xi ∈ R and i ∈ {1, . . . , d}. Then, due to the independence of the components of X,

    P( X − a ≤ x ) = ∏_{i=1}^d P( Xi − ai ≤ xi ) = ∏_{i=1}^d P( bi(Xi − ai) ≤ xi ) = P( b ◦ (X − a) ≤ x )

holds for any x ∈ R^d. Therefore, the vectors X − a and b ◦ (X − a) are identically distributed. Now let b1, b2 ∈ {−1, 1}^d; then

    b1 ◦ (X − a) =d X − a =d b2 ◦ (X − a)

concludes the proof.

Just like Nelsen (1993), we will give some examples in order to demonstrate that the above notions of symmetry may indeed be different. At the same time, the following examples serve as counterexamples to further conjectures about implications of symmetries.

Example 4.1.8. If h1, as in Figure 4.1, is the density of a random vector X1 on the unit square, then it is easy to verify that X1 is jointly symmetric about a := (1/2, 1/2)⊤. Due to Theorem 4.1.6, X1 is radially and marginally symmetric about a as well. For other examples of densities of jointly symmetric random vectors, see Figures 1.1 and 1.4 of Nelsen (1993).

Example 4.1.9. If h2, as in Figure 4.1, is the density of a random vector X2 on the unit square, then it is easy to verify that X2 is marginally symmetric about a := (1/2, 1/2)⊤, as the components of X2 are standard uniformly distributed. Nevertheless, X2 is neither jointly nor radially symmetric. For example,

    P( X2,1 − 1/2 ≤ −1/4, X2,2 − 1/2 ≤ −1/4 ) = 0 ≠ 1/8 = P( 1/2 − X2,1 ≤ −1/4, 1/2 − X2,2 ≤ −1/4 )

holds and therefore X2 − a and a − X2 are not identically distributed. Thus, X2 is not radially symmetric and consequently it is not jointly symmetric by Theorem 4.1.6. For other examples of densities of marginally symmetric random vectors which are neither jointly nor radially symmetric, see Figures 1.3, 1.6, 1.9 and 1.12 of Nelsen (1993).

Example 4.1.10. If h3, as in Figure 4.1, is the density of a random vector X3 on the unit square, then it is easy to verify that X3 is radially symmetric about a := (1/2, 1/2)⊤. Due to Theorem 4.1.6, X3 is marginally symmetric about a as well. Nevertheless, X3 is not jointly symmetric. For example,

    P( X3,1 − 1/2 ≤ −1/4, X3,2 − 1/2 ≤ −1/4 ) = 3/16 ≠ 0 = P( 1/2 − X3,1 ≤ −1/4, X3,2 − 1/2 ≤ −1/4 )

holds and therefore X3 − a and b ◦ (X3 − a) are not identically distributed for b = (−1, 1)⊤. For other examples of densities of radially (and thus marginally) symmetric random vectors which are not jointly symmetric, see Figures 1.2, 1.5, 1.8 and 1.11 of Nelsen (1993).


[Figure 4.1: four density plots h1, h2, h3 and h4 on the unit square; graphic not reproduced.]

Figure 4.1: Densities hi of random vectors Xi which exhibit different symmetries (see Examples 4.1.8 through 4.1.11): hi(x, y) = 2 if and only if (x, y)⊤ lies within the blue area (hatched from top left to bottom right), and hi(x, y) = 4 if and only if (x, y)⊤ lies within the red area (hatched from top right to bottom left). X1 is jointly (and thus radially and marginally) symmetric. X2 is marginally but not radially (and thus not jointly) symmetric. X3 is radially (and thus marginally) but not jointly symmetric. X4 is not marginally (and thus neither radially nor jointly) symmetric.


Example 4.1.11. If h4, as in Figure 4.1, is the density of a random vector X4 on the unit square, then X4 is not marginally symmetric, which can be seen as follows. Obviously, X4,1 (i. e. the first component of X4) possesses the density f1(x) = 2x on the unit interval. If X4 were marginally symmetric about some a ∈ R², then f1(a1 + x) = f1(a1 − x) would hold for almost all x ∈ [0,1] by Lemma 4.1.2. But for x0 ≠ 0, there exists no a1 ∈ R such that 2(a1 + x0) = 2(a1 − x0) holds. Therefore X4 is not marginally symmetric and due to Theorem 4.1.6 it is neither radially nor jointly symmetric. For other examples of densities of random vectors which are not marginally (and thus neither radially nor jointly) symmetric, see Figures 1.7, 1.10, 1.13 and 1.14 of Nelsen (1993).

Of course, all of the above examples (4.1.8 through 4.1.11) may be extended to dimensions d > 2 by considering random vectors whose first two components are given by Xi (for i ∈ {1, . . . , 4} as in the respective example) and whose remaining components follow a standard uniform distribution and are independent (of Xi as well as of each other).

4.1.2 Copulas

Clearly, marginal symmetry is a property of the marginal distributions of a random vector, hence the name. In the following example, we will see that a random vector whose distribution is a copula is always marginally symmetric. Other connections between symmetry and the copula will be given in the subsequent theorems.

Example 4.1.12. If U ∼ C for some d-dimensional copula C, then all components Ui are standard uniformly distributed by definition. As the density of the standard uniform distribution is f(x) = 1( x ∈ [0,1] ), it is easy to see that f(1/2 − x) = f(1/2 + x) holds for all x ∈ R. This means that U is marginally symmetric about (1/2, . . . , 1/2)⊤ ∈ R^d by Lemma 4.1.2.

In the next theorem, which is a multivariate version of Theorem 3.2 of Nelsen (1993), we will see that, given univariate symmetric margins, radial symmetry is equivalent to a certain property of the copula.

Theorem 4.1.13. Let X be a d-dimensional, continuous random vector with copula C which is marginally symmetric about some vector a ∈ R^d. Then X is radially symmetric about a if and only if C coincides with its survival copula Ĉ, i. e. C(u) = Ĉ(u) holds for all u ∈ [0,1]^d.

Proof. Let H be the distribution function of X with margins F1, . . . , Fd. By F(x) we denote the vector ( F1(x1), . . . , Fd(xd) )⊤ and, analogously, F̄(x) denotes the vector of the marginal survival functions evaluated at x. It is a direct consequence of Definition 4.1.4 that X being radially symmetric is equivalent to H(a + x) = H̄(a − x) for all x ∈ R^d, where H̄ denotes the survival function of X. Therefore, let x ∈ R^d; then

    H(a + x) = H̄(a − x) ⟺ C( F(a + x) ) = Ĉ( F̄(a − x) )    (4.1.1)
                         ⟺ C( F(a + x) ) = Ĉ( F(a + x) )    (4.1.2)

holds, where the first equivalence is due to Sklar's theorem and Corollary 2.3.5, and the second equivalence is due to the assumed marginal symmetry of X and thus F̄i(ai − xi) = Fi(ai + xi) for all i ∈ {1, . . . , d}.

Now let X be radially symmetric about a. Furthermore, let u ∈ [0,1]^d and xi := F_i^{−}(ui) − ai for all i ∈ {1, . . . , d}. Then F(a + x) = u by Lemma 2.3.2 and thus C(u) = Ĉ(u) by (4.1.1) and (4.1.2).

For the converse implication, we may assume that C(u) = Ĉ(u) holds for all u ∈ [0,1]^d. Let x ∈ R^d and u := F(a + x). Then, with (4.1.1) and (4.1.2), we get H(a + x) = H̄(a − x) and thus X is radially symmetric.


Next, we establish a property of the copula under which the random vector is jointly symmetric, as long as it is marginally symmetric. Other than in Theorem 4.1.13, where only one condition needs to be verified, in the case of joint symmetry the amount of work increases exponentially with the dimension d, as 2^d − 1 conditions have to be checked. Therefore, the application of the theorem might be inconvenient in most situations. Nevertheless, it clarifies that joint symmetry, just like radial symmetry, is a property of the copula (given symmetric margins).

Theorem 4.1.14. Let X be a d-dimensional, continuous random vector with copula C which is marginally symmetric about some vector a ∈ R^d. Additionally, for u ∈ [0,1]^d and b ∈ {−1,1}^d, consider the hyperrectangle V(u, b) := ×_{j=1}^d [ v_{j,0}(uj, bj), v_{j,1}(uj, bj) ], given by

    v_{j,0}(uj, bj) := (1 − uj) (1 − bj)/2,
    v_{j,1}(uj, bj) := 1 + (uj − 1) (1 + bj)/2

for j ∈ {1, . . . , d}. Then X is jointly symmetric about a if and only if C(u) = C( V(u, b) ) holds for all u ∈ [0,1]^d and for all b ∈ {−1,1}^d, where C( V(u, b) ) is the C-volume of V(u, b) as in Definition 2.1.1.

Proof. Let b ∈ −1, 1d. With the index set I := i | bi = 1 and the marginal distributionsF1, . . . , Fd of X, we get

P(b (X − a) ≤ x

)= P(Xi ≤ ai + xi for i ∈ I, Xi > ai − xi for i 6∈ I)

= P(Fi(Xi) ≤ Fi(ai + xi) for i ∈ I, Fi(Xi) ≥ Fi(ai − xi) for i 6∈ I

)= C

(d×i=1

[wi,0(xi, bi), wi,1(xi, bi)

]),

for x ∈ Rd (for the second equation, see Theorem 2.3.4, i. e. Sklar’s theorem, or the proof thereof),where the intervals

[wi,0(xi, bi), wi,1(xi, bi)

]are given by

[wi,0(xi, bi), wi,1(xi, bi)

]=

[0, Fi(ai + xi)

]if i ∈ I,[

Fi(ai − xi), 1]

otherwise.

Note that Fi(ai − xi) = Fi(ai + xi) holds for all i ∈ 1, . . . , d and all xi ∈ R, as it was assumedthat X is marginally symmetric about a. This yields

P(b (X − a) ≤ x

)= C

(V(F (a+ x), b

))(4.1.3)

with F as in the proof of Theorem 4.1.13.

Now, let X be jointly symmetric about a. Futhermore, let u ∈ [0, 1]d, b ∈ −1, 1d andconsider xi := F−i (ui)− ai for all i ∈ 1, . . . , d. Then F (a+ x) = u by Lemma 2.3.2 and thus

C(u) = C(F (a+ x)

)= P(X − a ≤ x) = P

(b (X − a) ≤ x

)= C

(V(F (a+ x), b

))= C

(V (u, b)

)holds by (4.1.3) and Sklar’s theorem.

62

Page 71: Exchangeability of Copulas

4.1. Some Concepts of Multivariate Symmetry

For the converse implication, we may assume that C(u) = C(V(u, b)) holds for all u ∈ [0,1]^d and all b ∈ {−1,1}^d. Let x ∈ ℝ^d, b ∈ {−1,1}^d and u := F(a + x). Then

  P(b ∘ (X − a) ≤ x) = C(V(F(a + x), b)) = C(V(u, b)) = C(u) = C(F(a + x)) = P(X − a ≤ x)

holds by (4.1.3) and Sklar's theorem. Therefore the random vectors X − a and b ∘ (X − a) are identically distributed for all b ∈ {−1,1}^d, which means that X is jointly symmetric about a.
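Since the criterion of Theorem 4.1.14 involves nothing but C-volumes, it can be checked mechanically, at least on sampled points and for small d. The following minimal Python sketch (our own illustration; all function names are ours) computes the C-volume of V(u, b) by inclusion–exclusion over the vertices of the hyperrectangle, as in Definition 2.1.1, and confirms the criterion for the independence copula Π, which is jointly symmetric.

    import itertools
    import numpy as np

    def indep_copula(u):
        # independence copula Pi(u) = u_1 * ... * u_d
        return float(np.prod(u))

    def c_volume(C, lower, upper):
        # C-volume of the box [lower, upper] via inclusion-exclusion
        # over its 2^d vertices (Definition 2.1.1)
        d = len(lower)
        vol = 0.0
        for eps in itertools.product((0, 1), repeat=d):
            vertex = [upper[i] if e else lower[i] for i, e in enumerate(eps)]
            vol += (-1) ** (d - sum(eps)) * C(vertex)
        return vol

    def v_box(u, b):
        # the hyperrectangle V(u, b) of Theorem 4.1.14
        lower = [(1 - ui) * (1 - bi) / 2 for ui, bi in zip(u, b)]
        upper = [1 + (ui - 1) * (1 + bi) / 2 for ui, bi in zip(u, b)]
        return lower, upper

    d = 3
    rng = np.random.default_rng(0)
    for u in rng.random((50, d)):
        for b in itertools.product((-1, 1), repeat=d):
            lo, up = v_box(u, b)
            assert abs(indep_copula(u) - c_volume(indep_copula, lo, up)) < 1e-12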

The following definition is a consequence of Theorem 4.1.13 and Theorem 4.1.14, where it is shown that radial as well as joint symmetry is a property of the copula.

Definition 4.1.15. Let C be a d-dimensional copula. If C ≡ Ĉ, then C is called radially symmetric, because in that case every marginally symmetric vector X with copula C is radially symmetric by Theorem 4.1.13.

If C(u) = C(V(u, b)) holds for all u ∈ [0,1]^d and all b ∈ {−1,1}^d with V(u, b) as in Theorem 4.1.14, then C is called jointly symmetric, because in that case every marginally symmetric vector X with copula C is jointly symmetric by Theorem 4.1.14.

It was demonstrated in Example 4.1.12 that every d-dimensional copula is marginally symmetric about (1/2, …, 1/2) ∈ ℝ^d. As the various symmetries of copulas correspond to symmetries of random vectors, the following corollary is an immediate consequence of Theorem 4.1.6. In the subsequent example, copulas which exhibit various symmetries will be introduced. At the same time, they serve as counterexamples to further conjectures about implications of symmetries of copulas.

Corollary 4.1.16. Let C be a d-dimensional copula and a := (1/2, …, 1/2) ∈ ℝ^d. If C is jointly symmetric about a, then C is radially symmetric about a.

Example 4.1.17. For i ∈ {1, 2, 3}, let C_i : [0,1]² → [0,1] be a distribution function generated by the density h_i of Figure 4.1. It is easy to verify that every C_i is a distribution function on the unit square with uniform margins. By Corollary 2.1.7, each C_i is a copula. Therefore C_i exhibits the same symmetry (or absence thereof) as the vector X_i of the respective Example 4.1.8, 4.1.9 or 4.1.10. This means C_1 is jointly (and thus radially) symmetric, C_2 is not radially (and thus not jointly) symmetric, and C_3 is radially but not jointly symmetric.

Of course, each copula C_i of the above example may be generalized to dimension d > 2 by considering C̃_i : [0,1]^d → [0,1] with

  C̃_i(u) := C_i(u_1, u_2) ∏_{j=3}^d u_j

for u ∈ [0,1]^d. Next, we examine the classes of copulas from Section 2.4 with regard to radial symmetry as well as joint symmetry.

Theorem 4.1.18. Let C be a d-dimensional, elliptical copula. Then C is radially symmetric.

Proof. Let X be an elliptically distributed random vector with copula C, i.e. X ∼ E_d(µ, Σ, φ) for some vector µ, matrix Σ and mapping φ as in Definition 2.4.1. By Theorem 4.1.13, C is radially symmetric whenever X is. Therefore, it suffices to show that X is radially symmetric about µ or, to put it differently, that X − µ and µ − X are identically distributed. To this end, let t ∈ ℝ^d; then

  E(e^{it^⊤(X−µ)}) = φ(t^⊤Σt) = φ((−t)^⊤Σ(−t)) = E(e^{i(−t)^⊤(X−µ)}) = E(e^{it^⊤(µ−X)})

holds, and uniqueness of the characteristic function yields X − µ =_d µ − X.

Not all elliptically distributed random vectors are jointly symmetric. Consider, for example, the bivariate normal distribution with standard normal margins and correlation ρ = 1/2, which is an elliptical distribution by Example 2.4.5. Then the joint density h does not fulfill h(x, y) = h(x, −y), for example at the point x = 1 = y. But if h is the density of a jointly symmetric, bivariate random vector, then

  h(x, y) = h(x, −y) = h(−x, y) = h(−x, −y)

holds for all (x, y)^⊤ ∈ ℝ² (see Definition 1.3 of Nelsen (1993)).

Theorem 4.1.19. Let C be a d-dimensional, elliptical copula of a random vector X ∼ E_d(µ, Σ, φ). If Σ = I_d, then C is jointly symmetric.

Proof. Similarly to the proof of Theorem 4.1.18, by Theorem 4.1.14 it suffices to show that X is jointly symmetric. Due to Definition 4.1.5, this is the case if b ∘ (X − µ) and X − µ are identically distributed for all b ∈ {−1,1}^d. To this end, let t ∈ ℝ^d and b ∈ {−1,1}^d; then

  E(e^{it^⊤(X−µ)}) = φ(t^⊤t) = φ((b ∘ t)^⊤(b ∘ t)) = E(e^{i(b∘t)^⊤(X−µ)}) = E(e^{it^⊤(b∘(X−µ))})

holds, and uniqueness of the characteristic function yields X − µ =_d b ∘ (X − µ).

Note that, unlike in the case of the multivariate normal distribution, Σ = I_d does not imply independence in general. For example, the density of a multivariate t-distribution with Σ = I_d is not the product of the marginal densities; see, for example, Section 3.3.6 of Fang et al. (1990).

Due to the last two theorems, all elliptical copulas are radially symmetric and some (but not all) are jointly symmetric. We will see that the situation is different for Archimedean copulas.

Proposition 4.1.20 (Frank (1979)). Let C be a bivariate Archimedean copula. Then C is radially symmetric if and only if there exists θ ∈ (−∞, ∞) \ {0} such that C ≡ C_θ, where C_θ is a Frank-copula as in Example 2.4.22.
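Proposition 4.1.20 is easy to illustrate numerically: for d = 2 the survival copula is Ĉ(u, v) = u + v − 1 + C(1 − u, 1 − v), so radial symmetry amounts to a grid check. The following sketch (ours; the parameter value θ = 5 is an arbitrary choice) shows that a Frank-copula passes the check up to floating-point error, while a Clayton-copula does not.

    import numpy as np

    def frank2(u, v, theta):
        # bivariate Frank-copula
        return -np.log1p(np.expm1(-theta * u) * np.expm1(-theta * v)
                         / np.expm1(-theta)) / theta

    def clayton2(u, v, theta):
        # bivariate Clayton-copula
        return (u ** -theta + v ** -theta - 1) ** (-1 / theta)

    def radial_defect(C, theta, n=200):
        # max |C(u,v) - Chat(u,v)| on a grid, where the survival copula
        # is Chat(u,v) = u + v - 1 + C(1-u, 1-v)
        g = np.linspace(0.01, 0.99, n)
        u, v = np.meshgrid(g, g)
        chat = u + v - 1 + C(1 - u, 1 - v, theta)
        return float(np.abs(C(u, v, theta) - chat).max())

    print(radial_defect(frank2, 5.0))    # ~1e-16: radially symmetric
    print(radial_defect(clayton2, 5.0))  # clearly positive: not radially symmetric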

The next theorem shows that this result cannot be extended to arbitrary dimensions. At first, it may come as a surprise that for d = 3 a Frank-copula is not radially symmetric, whereas all of its bivariate marginal copulas, generated by the very same generator, are radially symmetric. Then again, there are bivariate copulas (for example Clayton-copulas, by Proposition 4.1.20) which are not radially symmetric, although both univariate margins are symmetric (see Example 4.1.12).

Theorem 4.1.21. Let C be a d-dimensional Archimedean copula. If d > 2, then C is not radially symmetric.

Proof. Let d > 2 and let C be generated by the generator ϕ. If ϕ is not a generator of the Frank family of copulas (see Example 2.4.22), then in particular the bivariate margins of C are Archimedean copulas generated by ϕ, and thus they are not Frank-copulas. By Proposition 4.1.20, they are not radially symmetric and hence C is not radially symmetric. Therefore it suffices to consider a copula C ≡ C_θ for some d-dimensional Frank-copula C_θ. Furthermore, it suffices to examine the case d = 3, because if C_θ is not radially symmetric for d = 3, then all Frank-copulas of larger dimension have three-dimensional margins which are not radially symmetric and thus are not radially symmetric themselves. Now assume that C_θ is a radially symmetric, three-dimensional Frank-copula. Then by Theorem 4.1.13, C_θ(u) = Ĉ_θ(u) holds for all u ∈ [0,1]³ and especially for u := (1/2, 1/2, 1/2). Theorem 2.1.9 and u = 1 − u yield

  C_θ(u) = Ĉ_θ(u) ⟺ 3C_θ(1/2, 1/2) − 2C_θ(1/2, 1/2, 1/2) − 1/2 = 0,

where we denote the two-dimensional margins by C_θ as well, because, as Archimedean copulas, they are all generated by the same generator ϕ_θ and thus coincide. Some elementary computations, which are given in Section A.2 of the appendix, result in

  |3C_θ(1/2, 1/2) − 2C_θ(1/2, 1/2, 1/2) − 1/2| > 0    (4.1.4)

for θ ∈ [−ln 2, ∞) \ {0}. Additionally, (4.1.4) tends to 0 for θ → 0. We would expect this, as lim_{θ→0} C_θ(u) = Π(u) and the independence copula Π is radially symmetric. But for all θ ∈ Θ, where Θ = [−ln 2, ∞) \ {0} is the parameter space of the three-dimensional Frank family of copulas (see Table 2.1), we get C_θ(u) ≠ Ĉ_θ(u), and thus the assumption is wrong.
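The elementary computations behind (4.1.4) can also be reproduced numerically. The following sketch (ours; a minimal check, not a reprise of the appendix computation) evaluates the d-dimensional Frank-copula with its explicit generator inverse and prints the left-hand side of (4.1.4) for several admissible θ; the values are positive throughout and tend to 0 as θ → 0.

    import numpy as np

    def frank(u, theta):
        # d-dimensional Frank-copula, d = len(u)
        u = np.asarray(u, dtype=float)
        num = np.prod(np.expm1(-theta * u))
        den = np.expm1(-theta) ** (len(u) - 1)
        return -np.log1p(num / den) / theta

    for theta in (-0.5, -0.1, 0.1, 1.0, 5.0):
        lhs = abs(3 * frank([0.5, 0.5], theta)
                  - 2 * frank([0.5, 0.5, 0.5], theta) - 0.5)
        print(f"theta = {theta:5.1f}: {lhs:.6f}")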

For joint symmetry of Archimedean copulas, there is only a negative result. This might not come as a big surprise, because radial symmetry is necessary for joint symmetry by Theorem 4.1.6 (and Definition 4.1.15) and, due to Proposition 4.1.20 and Theorem 4.1.21, only some bivariate Archimedean copulas (namely the bivariate Frank-copulas) are radially symmetric.

Theorem 4.1.22. Let C be a d-dimensional Archimedean copula. Then C is not jointly symmetric.

Proof. If C is jointly symmetric, then its bivariate margins must be jointly symmetric as well. Theorem 4.1.6 implies that these bivariate margins are radially symmetric and therefore, because they are Archimedean copulas, they belong to the Frank family of copulas, according to Proposition 4.1.20. Thus, it suffices to consider a bivariate Frank-copula C_θ as in Example 2.4.22 and to show that it is not jointly symmetric. By Theorem 4.1.14, a bivariate copula C is jointly symmetric if and only if

  C(u, v) = u − C(u, 1 − v) = v − C(1 − u, v) = u + v − 1 + C(1 − u, 1 − v)

holds for all (u, v)^⊤ ∈ [0,1]². For u = v = 1/2, the first equation is equivalent to C_θ(1/2, 1/2) = 1/4, but

  C_θ(1/2, 1/2) ≠ Π(1/2, 1/2) = 1/4

for all θ ∈ ℝ \ {0}. Hence, C_θ is not jointly symmetric.

Because the bivariate margins of nested Archimedean copulas are always Archimedean themselves, the following corollary is a direct consequence of the non-existence of jointly symmetric (non-nested) Archimedean copulas.

Corollary 4.1.23. Let C be a d-dimensional nested Archimedean copula. Then C is not jointly symmetric.


Similarly, if a nested Archimedean copula C is radially symmetric, then its bivariate margins must be radially symmetric as well. By Proposition 4.1.20, all bivariate margins must be Frank-copulas and therefore C is generated exclusively by generators of the Frank family. If the parameters of all generators coincide, then C is a non-nested Frank-copula with dimension d > 2, which is not radially symmetric by Theorem 4.1.21. As the mapping θ ↦ C_θ(u) is continuous, there exist nested Frank-copulas C_θ such that C_θ(u) ≠ Ĉ_θ(u), at least for u = (1/2, 1/2, 1/2, 1, …, 1).

4.2 Exchangeability and its Antonym

In Section 4.1, three concepts of symmetry were introduced, and examples showed that they are indeed different. More precisely, there exist random vectors or copulas which exhibit symmetry of one kind but not of the other, unless, of course, one kind of symmetry is implied by the other, as in the case of radial and joint symmetry (see for example Theorem 4.1.6). In this section we will consider a fourth concept of symmetry, namely exchangeability. It will be demonstrated that exchangeability is different from the aforementioned concepts of symmetry, in that it neither implies nor is implied by these. Note that, just as before, d ∈ ℕ \ {1} denotes the dimension.

Definition 4.2.1. A random vector X := (X_1, …, X_d)^⊤ is called exchangeable if its law coincides with the law of the random vector X_π := (X_{π(1)}, …, X_{π(d)})^⊤ for every permutation π ∈ S_d of {1, …, d}.

Exchangeability may also be defined as a property of multivariate mappings, as in the following definition.

Definition 4.2.2. A mapping F : ℝ^d → ℝ is called exchangeable if

  F(x_1, …, x_d) = F(x_{π(1)}, …, x_{π(d)})

holds for all (x_1, …, x_d)^⊤ ∈ ℝ^d and all permutations π ∈ S_d.

At first, it may seem unusual to use the same expression for two different things. But they are indeed closely related, at least as far as multivariate distribution functions are concerned, which becomes clear in the following theorem.

Theorem 4.2.3. Let H be a continuous, d-dimensional distribution function with copula C and let X ∼ H. Then X is exchangeable if and only if H is exchangeable. Furthermore, H is exchangeable if and only if all k-dimensional margins of H coincide for each k ∈ {1, …, d − 1} and C is exchangeable.

Proof. Let X ∼ H. For π ∈ S_d let H_π be the distribution function of X_π. By Definition 4.2.1, X is exchangeable if and only if H ≡ H_π for all π ∈ S_d. The first claim follows, as

  H_π(x) = P(X_{π(1)} ≤ x_1, …, X_{π(d)} ≤ x_d) = P(X_1 ≤ x_{π^{−1}(1)}, …, X_d ≤ x_{π^{−1}(d)}) = H(x_{π^{−1}})

holds for all x ∈ ℝ^d and all π ∈ S_d.

For the second claim, suppose H is exchangeable. Let k ∈ {1, …, d − 1} and let

  J := {j_1, …, j_k} ⊂ {1, …, d}

be an index set of size k, as well as I := {1, …, k}. Then there exists π ∈ S_d such that π(i) = j_i for all i ∈ I, for example by mapping i > k to the (i − k)-smallest index that is not contained in J. The k-dimensional margins H_I and H_J then coincide, because

  H_I(x_1, …, x_k) = lim_{x_i→∞, i∉I} H(x_1, …, x_d) = lim_{x_i→∞, i∉I} H(x_{π^{−1}(1)}, …, x_{π^{−1}(d)})
                   = lim_{y_j→∞, j∉J} ( H(y_1, …, y_d) |_{(y_{j_1}, …, y_{j_k}) = (x_1, …, x_k)} ) = H_J(x_1, …, x_k)

holds for all x ∈ ℝ^k, as H is exchangeable. For the exchangeability of C, let u ∈ [0,1]^d. As H is assumed to be continuous, there exists x ∈ ℝ^d such that (u_1, …, u_d)^⊤ = (F_1(x_1), …, F_d(x_d))^⊤. Thus

  C(u) = C(F_1(x_1), …, F_d(x_d)) = H(x) = H(x_π) = C(F_{π(1)}(x_{π(1)}), …, F_{π(d)}(x_{π(d)})) = C(u_π)

holds for all π ∈ S_d by Sklar's theorem.

Conversely, if C is exchangeable and all margins of H coincide, then in particular the one-dimensional margins F_1, …, F_d are identical. Applying Sklar's theorem then yields

  H(x) = C(F_1(x_1), …, F_1(x_d)) = C(F_1(x_{π(1)}), …, F_1(x_{π(d)})) = H(x_π)

for all x ∈ ℝ^d and all π ∈ S_d.

From the proof of Theorem 4.2.3 it becomes obvious that the coincidence of all one-dimensional marginal distributions suffices for exchangeability, at least as long as the copula is exchangeable.

Corollary 4.2.4. Let H be a continuous, d-dimensional distribution function with copula C. Then H is exchangeable if and only if all one-dimensional marginal distributions of H coincide and C is exchangeable.

So, apart from coinciding marginal distributions, exchangeability is a property of the copula (see Theorem 4.2.3). Therefore, in most cases we will address the exchangeability of copulas, or rather the lack of this property. But, of course, all results may easily be transferred to general distribution functions by applying Sklar's theorem (Theorem 2.3.4).

The existence of exchangeable and non-exchangeable copulas is demonstrated in the following examples. Along the way, it becomes clear that exchangeability may be thought of as a weaker form of independence.

Example 4.2.5. If the components of a random vector X are independent and identically distributed, then X is exchangeable, as the independence copula Π (see Example 2.1.6) is obviously exchangeable. However, independence and exchangeability are not equivalent, as there exist copulas that are exchangeable but different from Π. For example, the upper Fréchet–Hoeffding bound M is exchangeable for all dimensions d, and Π(u) ≠ M(u) for all u ∈ (0,1)^d. Therefore, independence is sufficient but not necessary for exchangeability.

Example 4.2.6. Of course, not all copulas are exchangeable. For example, consider the density h_2 from Figure 4.1 and let c : [0,1]^d → [0, ∞) with

  c(u) := h_2(u_1, u_2) ∏_{i=3}^d 1(u_i ∈ [0,1])    (4.2.1)

for u ∈ [0,1]^d be the density of a copula C (which is a valid copula by Corollary 2.1.7, as it is obviously a distribution function on the unit hypercube with uniform one-dimensional margins).


Then C is not exchangeable, as for example

  C(1/2, 1/4, 1, …, 1) = 1/8 ≠ 0 = C(1/4, 1/2, 1, …, 1)

holds. Similarly, it is easy to verify that the other densities of Figure 4.1 yield non-exchangeable distributions as well. To be precise, let H_i be the distribution with density h_i for i ∈ {1, 3, 4}; then H_1(1/4, 1/8) = 1/32 but H_1(1/8, 1/4) = 1/16, H_3(1/4, 1/8) = 1/16 but H_3(1/8, 1/4) = 1/8, and H_4(u, u/2) = (3/4)u² but H_4(u/2, u) = (1/4)u² for u ∈ (0,1). Note that H_1 and H_3 are copulas. Examples for dimension d > 2 are easily created by adding independent components, just like in (4.2.1).

Next, we examine the classes of copulas from Section 2.4 with regard to exchangeability.

Lemma 4.2.7. Let X ∼ E_2(µ, Σ, φ) be a two-dimensional, elliptically distributed random vector which is exchangeable. Then µ_1 = µ_2 and Σ_{11} = Σ_{22} hold.

Proof. If Σ = 0, then µ_1 = µ_2, as all one-dimensional margins coincide by Corollary 4.2.4, and the second claim is obviously true. Therefore, in the remaining proof we may assume Σ ≠ 0. By Proposition 2.4.3, there exist a random variable R, a matrix A and a random vector U (independent of R and such that U^⊤U = 1) such that X =_d µ + RAU. If X is exchangeable, then (X_1, X_2)^⊤ =_d (X_2, X_1)^⊤ holds in particular. Therefore

  (µ_1, µ_2)^⊤ + RAU =_d (X_1, X_2)^⊤ =_d (X_2, X_1)^⊤ =_d (µ_2, µ_1)^⊤ + R (a_{21}U_1 + a_{22}U_2, a_{11}U_1 + a_{12}U_2)^⊤ =_d µ_τ + R A_τ U_τ,

where τ is the transposition (12) and A_τ denotes the matrix with rows (a_{22}, a_{21}) and (a_{12}, a_{11}), i.e. each a_{ij} is replaced by a_{τ(i)τ(j)}. As U_τ is independent of R and uniformly distributed on the unit circle, an application of Proposition 2.4.4 yields µ = µ_τ as well as the existence of a c > 0 such that

  Σ = AA^⊤ = c A_τ A_τ^⊤ = c Σ_τ

holds. This implies Σ_{11} = cΣ_{22} and Σ_{22} = cΣ_{11}. Thus we get either c = 1 or Σ_{11} = 0 = Σ_{22}.

The previous lemma, which states only a necessary condition for the two-dimensional case, is extended to arbitrary dimensions (d ≥ 2) in the following theorem. At the same time, it is shown that the necessary condition on the parameters µ and Σ of Lemma 4.2.7 is also sufficient.

Theorem 4.2.8. Let X ∼ E_d(µ, Σ, φ) be a d-dimensional, elliptically distributed random vector. Then X is exchangeable if and only if µ_1 = … = µ_d and Σ_{11} = … = Σ_{dd} as well as Σ_{ij} = Σ_{kl} for all i ≠ j and k ≠ l.

Proof. First, let µ_1 = … = µ_d and Σ_{11} = … = Σ_{dd} as well as Σ_{ij} = Σ_{kl} for all i ≠ j and k ≠ l. Additionally, let π ∈ S_d. As in the proof of Lemma 4.2.7, the components of the matrix Σ_π are given by (Σ_π)_{ij} := Σ_{π(i)π(j)} for i, j ∈ {1, …, d}; note that Σ_π = Σ by the assumptions on Σ. Let t ∈ ℝ^d; then, by reordering the sums, it is easy to see that t_{π^{−1}}^⊤ Σ t_{π^{−1}} = t^⊤ Σ_π t holds. Thus, as X ∼ E_d(µ, Σ, φ),

  E(exp(it^⊤(X_π − µ))) = E(exp(it^⊤(X_π − µ_π))) = E(exp(it_{π^{−1}}^⊤(X − µ))) = φ(t_{π^{−1}}^⊤ Σ t_{π^{−1}}) = φ(t^⊤ Σ_π t) = E(exp(it^⊤(X − µ))),

and uniqueness of the characteristic function yields X =_d X_π, i.e. exchangeability of X.

Next, let X be exchangeable. By Theorem 4.2.3, all two-dimensional margins coincide. As they are all exchangeable as well, Lemma 4.2.7 may be applied and yields µ_1 = … = µ_d and Σ_{11} = … = Σ_{dd}. For 1 ≤ i < j ≤ d, consider the matrix

  Σ^{(ij)} := ( Σ_{11}  Σ_{ij}
                Σ_{ij}  Σ_{11} )

of the vector (X_i, X_j)^⊤ ∼ E_2((µ_1, µ_1)^⊤, Σ^{(ij)}, φ_{ij}). Note that Σ^{(ij)} is symmetric, as Σ is assumed to be symmetric (see Definition 2.4.1). If Σ_{11} = 0, then, as Σ^{(ij)} has to be positive semidefinite by Definition 2.4.1, we get Σ_{ij} = 0. This implies Σ = 0, and Σ_{ij} = Σ_{kl} for all i ≠ j and k ≠ l obviously holds. Therefore, we now assume Σ_{11} ≠ 0. Let 1 ≤ i < j ≤ d and 1 ≤ k < l ≤ d. As stated before, (X_i, X_j)^⊤ and (X_k, X_l)^⊤ are identically distributed by Theorem 4.2.3. Due to Proposition 2.4.4, there exists c > 0 such that Σ^{(ij)} = cΣ^{(kl)}. Because of Σ_{ii} = Σ_{jj} ≠ 0, we get c = 1 and thus Σ_{ij} = Σ_{kl}, which completes the proof.

As a consequence of Theorem 4.2.8 and Corollary 4.2.4, all bivariate elliptical copulas are exchangeable, but there is neither a positive nor a negative result for exchangeability of elliptical copulas in dimension d > 2. Even if all its components are identically distributed (i.e. the one-dimensional margins coincide), a random vector X ∼ E_d(µ, Σ, φ) does not have to be exchangeable for d > 2 (but, of course, it may be). This will be demonstrated in Example 4.2.10.

Corollary 4.2.9. Let C be a d-dimensional, elliptical copula as in Definition 2.4.7. If d = 2, then C is exchangeable. If d > 2, then exchangeability of C depends on the matrix Σ of the underlying elliptical distribution E_d(µ, Σ, φ) via Theorem 4.2.8.

Example 4.2.10. Let X ∼ N(0, I_3) be a three-dimensional random vector with independent components which follow a standard normal distribution. By Corollary 2.3.6, the copula of X is given by Π, which is obviously exchangeable. As all components of X are identically distributed, X is exchangeable by Corollary 4.2.4.

Let Y ∼ N(0, Σ), where

  Σ := ( 1    1/2  0
         1/2  1    0
         0    0    1 ),

then the two-dimensional marginal distributions, to be precise those of (Y_1, Y_2)^⊤ and (Y_2, Y_3)^⊤, do not coincide, as

  (Y_1, Y_2)^⊤ ∼ N(0, ( 1  1/2 ; 1/2  1 )),   (Y_2, Y_3)^⊤ ∼ N(0, I_2),

and therefore Y is not exchangeable by Theorem 4.2.3. However, the one-dimensional margins are all identical and thus, by Corollary 4.2.4, the copula of Y is an elliptical copula which is not exchangeable.
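By Corollary 2.3.6 the copula of Y can be evaluated directly as C(u) = Φ_Σ(Φ^{−1}(u_1), …, Φ^{−1}(u_d)), so the non-exchangeability of Example 4.2.10 is visible numerically. A minimal sketch using SciPy (the evaluation point u is an arbitrary choice of ours):

    import numpy as np
    from scipy.stats import multivariate_normal, norm

    sigma = np.array([[1.0, 0.5, 0.0],
                      [0.5, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])

    def gauss_copula(u, cov):
        # Gaussian copula: C(u) = Phi_Sigma(Phi^{-1}(u_1), ..., Phi^{-1}(u_d))
        return multivariate_normal(mean=np.zeros(len(u)), cov=cov).cdf(norm.ppf(u))

    u = np.array([0.3, 0.5, 0.7])
    print(gauss_copula(u, sigma))             # C(u)
    print(gauss_copula(u[[2, 1, 0]], sigma))  # C(u_pi) for pi = (13): differs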

It is easy to see that Archimedean copulas are exchangeable. One key feature of nested Archimedean copulas is non-exchangeability. For the sake of completeness, we will give both results in the following theorem.

Theorem 4.2.11. Let C be a d-dimensional nested Archimedean copula. Then C is exchangeable if and only if there exists a non-nested Archimedean copula C̃ such that C(u) = C̃(u) holds for all u ∈ [0,1]^d.


Proof. Let C̃ be a d-dimensional non-nested Archimedean copula with generator ϕ. Obviously

  C̃(u) = ϕ^−( Σ_{i=1}^d ϕ(u_i) ) = ϕ^−( Σ_{i=1}^d ϕ(u_{π(i)}) ) = C̃(u_π)

holds for all u ∈ [0,1]^d and all π ∈ S_d. Therefore C̃ is exchangeable, and thus C is exchangeable whenever C ≡ C̃.

Now, let C be exchangeable and denote the outermost generator by ϕ_1. Note that C may be rewritten in a form where each sum of generators consists of exactly two summands. For example, let ϕ_j be an Archimedean generator; then

  ϕ_j^−( ϕ_j(u) + ϕ_j(v) + ϕ_j(w) ) = ϕ_j^−( ϕ_j(u) + ϕ_j( ϕ_j^−( ϕ_j(v) + ϕ_j(w) ) ) )

holds for all u, v, w ∈ [0,1]. As C is assumed to be exchangeable, all two-dimensional margins coincide by Theorem 4.2.3. Therefore, the terms ϕ_j^−(ϕ_j(v) + ϕ_j(w)) may all be replaced by ϕ_1^−(ϕ_1(v) + ϕ_1(w)), and C collapses to a d-dimensional non-nested copula C̃ with generator ϕ_1.
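The first half of the proof is immediate in code: a non-nested Archimedean copula is invariant under permutations of its argument. A small sketch with the Clayton generator ϕ_θ(t) = (t^{−θ} − 1)/θ (the evaluation point and θ = 2 are arbitrary choices of ours):

    import itertools
    import numpy as np

    def clayton(u, theta):
        # non-nested Archimedean copula with Clayton generator
        # phi(t) = (t^(-theta) - 1) / theta
        u = np.asarray(u, dtype=float)
        return float((np.sum(u ** -theta) - len(u) + 1) ** (-1 / theta))

    u = (0.2, 0.5, 0.9)
    vals = [clayton(p, theta=2.0) for p in itertools.permutations(u)]
    print(max(vals) - min(vals))  # 0 up to floating-point error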

In Example 4.2.6 it was demonstrated that there exist non-exchangeable distributions which are jointly (and thus radially and marginally) symmetric, or marginally but not radially (and thus not jointly) symmetric, or radially (and thus marginally) but not jointly symmetric, or not marginally (and thus neither radially nor jointly) symmetric. This means that no possible combination of the concepts of symmetry of Section 4.1 (including the absence of all three types of symmetry) implies exchangeability. In the following examples, it becomes clear that this is also true the other way around, i.e. exchangeability does not imply any possible combination (or the absence) of the concepts of symmetry of Section 4.1.

Example 4.2.12. The distribution H_1 in Example 4.2.6 is a jointly (and by Theorem 4.1.6 radially and marginally) symmetric but not exchangeable copula. An obvious example of a jointly symmetric copula which is exchangeable is the independence copula Π. The elliptical distribution E_d(0, I_d, φ) is a non-trivial example of a jointly symmetric and exchangeable distribution by Theorem 4.1.19 and Theorem 4.2.8. Note that the copula of E_d(0, I_d, φ) is not necessarily the independence copula, e.g. if E_d(0, I_d, φ) is a t-distribution.

Example 4.2.13. The copula C in Example 4.2.6 is a non-exchangeable copula which is not radially (and by Theorem 4.1.6 not jointly) symmetric. But as a copula, C is marginally symmetric (see Example 4.1.12). Let C_θ be a d-dimensional Clayton-copula (see Example 2.4.21). Then C_θ is also marginally symmetric and, by Proposition 4.1.20 in combination with Theorem 4.1.21, it is not radially (and thus not jointly) symmetric. Nevertheless, as an Archimedean copula, C_θ is exchangeable by Theorem 4.2.11.

Example 4.2.14. Let h_5 be a d-dimensional density with uniform mass on the two hypercubes [0, 1/2]^d and [1/2, 1]^d and let H_5 be the distribution function induced by h_5, i.e.

  h_5(u) := 2^{d−1} ( 1(u ∈ [0, 1/2]^d) + 1(u ∈ [1/2, 1]^d) ),
  H_5(u) = 2^{d−1} ( ∏_{i=1}^d min{u_i, 1/2} + ∏_{i=1}^d max{u_i − 1/2, 0} )

for u ∈ [0,1]^d. Then H_5 is a copula, as it is a distribution function on the unit hypercube with uniform margins. As a copula, it is marginally symmetric about a = (1/2)(1, …, 1)^⊤. Furthermore,


it is easy to see that H_5 is exchangeable. As H_5 is a copula, radial symmetry is equivalent to H_5 ≡ Ĥ_5 by Theorem 4.1.13. If U ∼ H_5, then 1 − U ∼ Ĥ_5 by Definition 2.1.8. Basic calculations yield

  Ĥ_5(u) = P(1 − u ≤ U) = ∫_{1−u_1}^1 ⋯ ∫_{1−u_d}^1 h_5(t) dt_d ⋯ dt_1
         = 2^{d−1} ( ∏_{i=1}^d max{1/2 − (1 − u_i), 0} + ∏_{i=1}^d (1 − max{1 − u_i, 1/2}) ) = H_5(u)

for u ∈ [0,1]^d, and thus radial symmetry of H_5. But H_5 is not jointly symmetric, as

  P(U − a ≤ 0) = 1/2 ≠ 0 = P(b ∘ (U − a) ≤ 0)

holds for U ∼ H_5 and b = (−1, 1, …, 1)^⊤ (see Definition 4.1.5).
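Both claims of Example 4.2.14 can be verified mechanically. The sketch below (ours) implements H_5, obtains the survival function P(U > 1 − u) by inclusion–exclusion over subsets of components, and confirms Ĥ_5 = H_5 at random points; the last line computes the probability P(U_1 ≥ 1/2, U_i ≤ 1/2 for i ≥ 2) = H_5(1, 1/2, …, 1/2) − H_5(1/2, …, 1/2) from the joint-symmetry counterexample.

    import itertools
    import numpy as np

    def h5_cdf(u):
        # distribution function H_5 of Example 4.2.14
        u = np.asarray(u, dtype=float)
        d = len(u)
        return 2 ** (d - 1) * (np.prod(np.minimum(u, 0.5))
                               + np.prod(np.maximum(u - 0.5, 0.0)))

    def survival_copula(cdf, u):
        # Hhat(u) = P(U > 1 - u) by inclusion-exclusion over subsets
        d = len(u)
        total = 0.0
        for s in itertools.product((0, 1), repeat=d):
            v = [1 - u[i] if si else 1.0 for i, si in enumerate(s)]
            total += (-1) ** sum(s) * cdf(v)
        return total

    d = 3
    rng = np.random.default_rng(1)
    for u in rng.random((20, d)):
        assert abs(h5_cdf(u) - survival_copula(h5_cdf, u)) < 1e-12  # radial symmetry

    # joint symmetry fails: this probability is 0, while P(U - a <= 0) = 1/2
    print(h5_cdf([1.0] + [0.5] * (d - 1)) - h5_cdf([0.5] * d))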

Example 4.2.15. Let h_6 be a d-dimensional density with uniform mass on the d-polytope P_d between the origin and the positive unit vectors, i.e.

  P_d := { x ∈ [0,1]^d : Σ_{i=1}^d x_i ≤ 1 },
  h_6(x) := v_d^{−1} 1(x ∈ P_d)

for x ∈ ℝ^d, where v_d = ∫_{ℝ^d} 1(x ∈ P_d) dx = 1/d! is the volume of P_d. Furthermore, let H_6 be the

∫Rd1(x ∈ Pd) dx is the volume of Pd. Furthermore, let H6 be the

distribution function induced by h6. It is easy to see that h6 is exchangeable and thus H6 isexchangeable (but H6 is no copula, as its one-dimensional margins are not uniform). In orderto show that H6 is not marginally symmetric, it suffices to consider the marginal distribution

H(1,2)6 of the first two components, or its density h

(1,2)6 . Note that h

(1,2)6 vanishes outside P2 but

it is not necessarily constant on P2 (e. g. for d = 3). The marginal density h(1)6 (x) of the first

component is positive for all x ∈ (0, 1) and vanishes for all x ∈ R \ [0, 1]. Therefore, the only

chance for h(1)6 to be symmetric about some a ∈ R is a = 1

2 . If (X1, X2)> ∼ H(1,2)6 and H

(1)6 is

the distribution induced by h(1)6 , then

  H_6^{(1)}(a + 1/4) = 1 − P((X_1, X_2)^⊤ ∈ [3/4, 1] × [0, 1]) = 1 − P((X_1, X_2)^⊤ ∈ [0, 1] × [3/4, 1])    (4.2.2)

and

  H̄_6^{(1)}(a − 1/4) = 1 − P((X_1, X_2)^⊤ ∈ [0, 1/4] × [0, 1])
    = 1 − P((X_1, X_2)^⊤ ∈ [0, 1/4] × [0, 3/4]) − P((X_1, X_2)^⊤ ∈ [0, 1/4] × [3/4, 1])
    = 1 − P((X_1, X_2)^⊤ ∈ [0, 1/4] × [0, 3/4]) − P((X_1, X_2)^⊤ ∈ [0, 1] × [3/4, 1]),
      where the first subtracted probability is strictly positive,    (4.2.3)

hold for a = 1/2, because (X_1, X_2)^⊤ =_d (X_2, X_1)^⊤ holds as H_6 was assumed to be exchangeable, and as

  P((X_1, X_2)^⊤ ∈ [0, 1/4] × [3/4, 1]) = P((X_1, X_2)^⊤ ∈ [0, 1] × [3/4, 1])

is implied by [1/4, 1] × [3/4, 1] ⊂ [0,1]² \ P_2 and h_6^{(1,2)}(x) vanishing for x ∉ P_2. Obviously, (4.2.2) and (4.2.3) yield H_6^{(1)}(a + 1/4) ≠ H̄_6^{(1)}(a − 1/4) and thus H_6^{(1)} is not symmetric (see Definition 4.1.1), which means that H_6 is not marginally symmetric. Due to Theorem 4.1.6, H_6 is neither radially nor jointly symmetric either.


4.3 Limits of Non-Exchangeability

Being interested in statistical tests to decide whether a given sample comes from an exchangeable copula, it is important to know how large the difference between a copula and itself with permuted arguments can be. For exchangeable copulas this difference is obviously zero. A first result for d = 2 is due to Klement and Mesiar (2006) and was discovered independently by Nelsen (2007).

Proposition 4.3.1 (Klement and Mesiar (2006); Nelsen (2007)). Let d = 2 and let C : [0,1]^d → [0,1] be a copula. Then

  |C(u) − C(u_π)| ≤ 1/3    (4.3.1)

holds for all u ∈ [0,1]^d and all π ∈ S_d.

Of course, for π ≡ id (where id denotes the identity mapping, i.e. id(x) = x) this is obvious, so for d = 2 there is only one interesting permutation, namely the transposition of u_1 and u_2. The bound in (4.3.1) is best possible, as Nelsen (2007) demonstrates by showing that

  C(u_1, u_2) := min{ u_1, u_2, (u_1 − 1/3)_+ + (u_2 − 2/3)_+ }    (4.3.2)

is a copula and that for u := (1/3, 2/3)^⊤ the bound in (4.3.1) is attained. As usual, we denote f_+ := max{f, 0}.

By defining C̃(u_1, u_2) := C(u_2, u_1) for any (u_1, u_2)^⊤ ∈ [0,1]², we obviously get another copula C̃. Therefore, (4.3.1) could be rewritten as

  max_{u ∈ [0,1]²} |C(u) − C̃(u)| ≤ 1/3,

i.e. as a bound on the maximal absolute difference between two copulas. However, as a consequence of the following theorem, the difference between two arbitrary two-dimensional copulas C_a and C_b at the same point u ∈ [0,1]² is at most 1/2.

Theorem 4.3.2. Let C_a, C_b : [0,1]^d → [0,1] be d-dimensional copulas. Then

  |C_a(u) − C_b(u)| ≤ (d − 1)/d

holds for all u ∈ [0,1]^d, and the bound is best possible.

Proof. Let u ∈ [0,1]^d. An application of Theorem 2.2.4 yields

  |C_a(u) − C_b(u)| ≤ M(u) − W(u),

where M and W denote the Fréchet–Hoeffding bounds, as usual (see Definition 2.2.3). Because M and W are exchangeable, we may assume M(u) = u_1. Due to

  M(u) − W(u) = u_1 if Σ_{i=1}^d u_i ≤ d − 1,   and   M(u) − W(u) = Σ_{i=2}^d (1 − u_i) if Σ_{i=1}^d u_i > d − 1,

M − W is non-decreasing in every component as long as Σ_{i=1}^d u_i ≤ d − 1 holds, and non-increasing in every component as long as Σ_{i=1}^d u_i > d − 1 holds. As M − W is continuous, M − W is maximized by a u ∈ [0,1]^d with Σ_{i=1}^d u_i = d − 1. In this case W(u) vanishes and M(u) is maximal on the diagonal, i.e. for u^(m) := ((d − 1)/d)(1, …, 1)^⊤, which means

  M(u) − W(u) ≤ M((d − 1)/d, …, (d − 1)/d) − W((d − 1)/d, …, (d − 1)/d) = (d − 1)/d.

The bound is best possible, as we may choose C_a := M by Lemma 2.2.5 and C_b := C_W, where C_W is a copula which coincides with W at the point u^(m) (as W is a copula only for d = 2 by Lemma 2.2.5). For the construction of such a copula C_W, see e.g. the proof of Theorem 1.2.4 of Hofert (2010); for an exact form of such a copula C_W with given diagonal section, see Jaworski (2009).
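The proof suggests an immediate sanity check: sample the unit hypercube and confirm that M(u) − W(u) never exceeds (d − 1)/d, with the maximum attained at u^(m). A minimal sketch (ours):

    import numpy as np

    def M(u):
        # upper Frechet-Hoeffding bound
        return u.min(axis=-1)

    def W(u):
        # lower Frechet-Hoeffding bound
        return np.maximum(u.sum(axis=-1) - (u.shape[-1] - 1), 0.0)

    d = 3
    u = np.random.default_rng(2).random((100_000, d))
    print((M(u) - W(u)).max(), (d - 1) / d)  # sampled maximum stays below 2/3
    u_m = np.full(d, (d - 1) / d)
    print(M(u_m) - W(u_m))                   # exactly 2/3 at u^(m)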

4.3.1 Main Result

Next comes the main theorem of this section, generalizing the inequality (4.3.1) to arbitrary dimension d. Given a vector u ∈ [0,1]^d, we write u_π := (u_{π(1)}, …, u_{π(d)})^⊤ for the vector whose components are permuted according to some permutation π ∈ S_d, just like in Definition 4.2.1.

Theorem 4.3.3. Let C be a d-copula. Then

  max_{u ∈ [0,1]^d} |C(u) − C(u_π)| ≤ (d − 1)/(d + 1)    (4.3.3)

holds true for any permutation π ∈ S_d. The bound is best possible, i.e. for each dimension d ≥ 2 there exist a d-dimensional copula C, a permutation π ∈ S_d and a vector u* ∈ [0,1]^d such that

  |C(u*) − C(u*_π)| = (d − 1)/(d + 1)

holds.

In view of Theorem 4.3.3, it becomes clear that the limit of non-exchangeability is strictly smaller than the difference between two arbitrary copulas at the same point, which is discussed in Theorem 4.3.2.

Under the condition that u*_i ≤ u*_{i+1} for all i ∈ {1, …, d − 1}, i.e. under the condition that the components of u* are in ascending order, we get uniqueness of u* depending on the dimension d being even or odd. For d = 2n + 2, n ∈ ℕ, there are infinitely many choices for such a u*, yet they lie within some lower-dimensional manifold. In any case, for a fixed u* and d > 2, there is always more than one choice for u*_π. This will be discussed in more detail in Section 4.4.

Based on Theorem 4.3.3, it seems reasonable to define the mapping µ_d : 𝒞_d → [0,1] with

  µ_d(C) := (d + 1)/(d − 1) · max_{π ∈ S_d} max_{u ∈ [0,1]^d} |C(u) − C(u_π)|

as a measure of non-exchangeability for d-dimensional copulas C ∈ 𝒞_d. By Theorem 4.3.3, µ_d(C) takes values in [0,1]. Obviously, µ_d(C) = 0 if and only if C is exchangeable, and µ_d(C) = 1 if and only if C attains the bound of Theorem 4.3.3, i.e. C is maximally non-exchangeable. Note, however, that the definition of measures of non-exchangeability by Durante et al. (2010) covers only bivariate copulas and is therefore not applicable in this case.
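For d = 2 the measure µ_2 can be approximated by a grid search. The following sketch (ours; the grid resolution is arbitrary but chosen so that u* = (1/3, 2/3)^⊤ lies on the grid) recovers µ_2 = 1 for the copula in (4.3.2), i.e. maximal non-exchangeability:

    import numpy as np

    def nelsen(u1, u2):
        # the maximally non-exchangeable 2-copula of (4.3.2)
        return np.minimum(np.minimum(u1, u2),
                          np.maximum(u1 - 1/3, 0) + np.maximum(u2 - 2/3, 0))

    g = np.linspace(0.0, 1.0, 301)  # step 1/300, so 1/3 and 2/3 are grid points
    u1, u2 = np.meshgrid(g, g)
    mu2 = 3 * np.abs(nelsen(u1, u2) - nelsen(u2, u1)).max()
    print(mu2)  # 1.0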


4.3.2 Proof of the Main Result

Before proving Theorem 4.3.3, we state some auxiliary results needed in the proof. By τ_{ij} we denote the transposition of i and j, i.e. the permutation interchanging components i and j and leaving the others unchanged.

Lemma 4.3.4. Let u ∈ [0,1]^d and let i, j ∈ {1, …, d}. Then

  |C(u) − C(u_{τ_{ij}})| ≤ |u_i − u_j|    (4.3.4)

holds for any d-copula C.

The original proof of Lemma 4.3.4 may be found in Section A.2 of the appendix. The following simpler proof was suggested by an anonymous referee during the submission of Harder and Stadtmüller (2014).

Proof. Let C be a d-copula, u ∈ [0,1]^d and i, j ∈ {1, …, d}. Now define v by

  v_k := max{ u_k, u_{τ_{ij}(k)} }

for k ∈ {1, …, d}, which implies v_k = u_k for k ∉ {i, j}, as well as u ≤ v and u_{τ_{ij}} ≤ v. Thus we get

  max{ C(u), C(u_{τ_{ij}}) } ≤ C(v)    (4.3.5)

due to the monotonicity of C. As a copula, C is Lipschitz-continuous (see Lemma 2.2.6), which yields

  |C(v) − C(u)| ≤ Σ_{k=1}^d |v_k − u_k| = |v_i − u_i| + |v_j − u_j|,    (4.3.6)

where the last equation is due to the choice of v. As v_i = v_j = max{u_i, u_j}, either |v_i − u_i| or |v_j − u_j| vanishes. Together with (4.3.5) we conclude

  C(u) ∈ [C(v) − |u_i − u_j|, C(v)].

By replacing u in (4.3.6) by u_{τ_{ij}}, it is easy to see that C(u_{τ_{ij}}) lies within the same interval, which completes the proof.

In the next lemma, we show that the inequality of Theorem 4.3.3 holds. For the proof we need the following example of the generation of a permutation π ∈ S_d by at most (d − 1) transpositions.

Example 4.3.5. Let u ∈ [0,1]^d and π ∈ S_d. Let τ_d be the transposition which exchanges d and π^{−1}(d). Thus, τ_d puts u_d in the right place, in the sense that if π(i) = d, then u_d is the i-th component of both u_π and u_{τ_d}. Now let τ_{d−1} be the transposition which puts u_{d−1} in u_{τ_d} in the right place. If (d − 1) is a fixed point of τ_d (i.e. τ_d(d − 1) = d − 1), then τ_{d−1} is the transposition which exchanges (d − 1) and π^{−1}(d − 1) (note that π^{−1}(d − 1) ≠ π^{−1}(d), so u_d remains untouched). Otherwise, τ_d(d) = d − 1 and then τ_{d−1}(d − 1) = d − 1 and, even more importantly, τ_{d−1}(d) = π^{−1}(d − 1). Now we have u_d and u_{d−1} in the right places, i.e. at the same positions in u_π and u_{τ_{d−1}τ_d}. Like this we can go on, until τ_2 finally puts u_2 into its place. There is no need to worry about u_1, because whenever u_2, …, u_d are all in their places, then u_1 is taken care of as well.


Let us consider a concrete example, namely π : (1, 2, 3, 4) ↦ (3, 2, 4, 1). One way to generate π is π = τ_2 ∘ τ_3 ∘ τ_4, where the transpositions τ_j are characterized by

  τ_4 = (34),  τ_3 = (14),  τ_2 = id.

In this case, as τ_2 = id, even two transpositions suffice to generate π = (134).
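The construction of Example 4.3.5 is essentially a selection sort and is easily cast into code. The following sketch (ours) takes a permutation in one-line notation and returns transpositions of positions which, applied successively to the identity, settle the values d, d − 1, …, 2 in turn and reproduce π:

    def decompose(p):
        # transpositions (as 1-based position pairs) generating the
        # permutation p, mirroring Example 4.3.5
        d = len(p)
        work = list(range(1, d + 1))
        swaps = []
        for m in range(d, 1, -1):           # settle d, d-1, ..., 2
            i, j = work.index(m), p.index(m)
            if i != j:
                work[i], work[j] = work[j], work[i]
                swaps.append((min(i, j) + 1, max(i, j) + 1))
        assert work == list(p)              # the swaps indeed generate p
        return swaps

    print(decompose([3, 2, 4, 1]))  # [(3, 4), (1, 4)]: tau_4 = (34), tau_3 = (14)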

Next, we prove the part of Theorem 4.3.3 concerned with (d − 1)/(d + 1) being an upper bound, as stated in the following lemma.

Lemma 4.3.6. Let u ∈ [0,1]^d and let π ∈ S_d. Then

  |C(u) − C(u_π)| ≤ (d − 1)/(d + 1)    (4.3.7)

holds for any d-dimensional copula C.

Proof. Let C be a d-dimensional copula. Without loss of generality we may assume u_1 ≤ … ≤ u_d; otherwise replace C in the proof by C̃ with C̃(v) := C(v_{σ^{−1}}) for all v ∈ [0,1]^d, where σ ∈ S_d is the permutation which orders the components of u by size, i.e. u_σ = (u_{(1)}, …, u_{(d)})^⊤.

If there exists at least one i ∈ {1, …, d} with u_i < (d − 1)/(d + 1), the claim follows immediately by

  |C(u) − C(u_π)| ≤ M(u) − W(u_π) ≤ M(u) ≤ u_i < (d − 1)/(d + 1).

Hence we may now assume (d − 1)/(d + 1) ≤ u_1. In the following we write ū_i := u_i − (d − 1)/(d + 1), so we have 0 ≤ ū_i ≤ 2/(d + 1). The permutation π is generated by at most (d − 1) transpositions as described in Example 4.3.5; therefore we are able to write π = τ_2 ∘ … ∘ τ_{d−1} ∘ τ_d. Next we use the triangle inequality to derive

  |C(u) − C(u_π)| ≤ |C(u) − C(u_{τ_d})| + |C(u_{τ_d}) − C(u_{τ_{d−1}τ_d})| + … + |C(u_{τ_3⋯τ_d}) − C(u_π)|
                  ≤ Σ_{i=2}^d (u_i − u_1) ≤ Σ_{i=2}^d ū_i,    (4.3.8)

where the second inequality follows from Lemma 4.3.4. At the same time, we have

  |C(u) − C(u_π)| ≤ M(u) − W(u) ≤ u_1 − ( Σ_{i=1}^d u_i − (d − 1) ) = 2(d − 1)/(d + 1) − Σ_{i=2}^d ū_i    (4.3.9)

with the Fréchet–Hoeffding bounds M and W of Theorem 2.2.4. Therefore, we may conclude that

  |C(u) − C(u_π)| ≤ min{ Σ_{i=2}^d ū_i, 2(d − 1)/(d + 1) − Σ_{i=2}^d ū_i } ≤ (d − 1)/(d + 1),

which completes the proof.

In the proof of Lemma 4.3.6 we need u_1 ≤ … ≤ u_d only for notational convenience. Therefore, it is straightforward to derive the following corollary.


Corollary 4.3.7. With the prerequisites of Lemma 4.3.6 and u_{(1)} := min{u_1, …, u_d},

  |C(u) − C(u_π)| ≤ min{ u_1, …, u_d, Σ_{i=1}^d (u_i − u_{(1)}), (d − 1) + u_{(1)} − Σ_{i=1}^d u_i }

holds for any d-copula C.

By now we have established the upper inequality in Theorem 4.3.3. In order to prove that this bound is best possible, we have to find a proper d-copula for which the bound in (4.3.3) is attained at some point u ∈ [0,1]^d and for some permutation π ∈ S_d. To this end, let u* ∈ [0,1]^d be given by

  u*_j := (d − 1)/(d + 1)  for 1 ≤ j ≤ (d + 1)/2,
  u*_j := d/(d + 1)        for j = d/2 + 1 and d even,
  u*_j := 1                otherwise,    (4.3.10)

for j ∈ {1, …, d}. In the following we consider the mapping C* : [0,1]^d → ℝ with

  C*(u) := Σ_{j=0}^{d−1} ⋀_{k=0}^{d−1} ( u_{((j+k) mod d)+1} − Σ_{i=1, i∉I(j,k)}^{d} (1 − u*_i) )_+    (4.3.11)

where the set I(j, k) ⊂ ℕ is given by

  I(j, k) := { ((j + l) mod d) + 1 : l = 0, 1, …, k }

and ⋀_{i=1}^d a_i := min{a_1, …, a_d}.

Example 4.3.8. In the case d = 2, with u* = (1/3, 2/3)^⊤, this mapping C* is given by

  C*(u_1, u_2) = min{ (u_1 − 1/3)_+, u_2 } + min{ u_1, (u_2 − 2/3)_+ }    (4.3.12)

for all (u_1, u_2)^⊤ ∈ [0,1]². In the case d = 3, with u* = (1/2, 1/2, 1)^⊤, we get

  C*(u_1, u_2, u_3) = min{ (u_1 − 1/2)_+, u_2, u_3 } + min{ u_1, (u_2 − 1/2)_+, (u_3 − 1/2)_+ }

for (u_1, u_2, u_3)^⊤ ∈ [0,1]³. In the case d = 4, with u* = (3/5, 3/5, 4/5, 1)^⊤, we get

  C*(u) = min{ (u_1 − 3/5)_+, (u_2 − 1/5)_+, u_3, u_4 }
        + min{ u_1, (u_2 − 3/5)_+, (u_3 − 2/5)_+, (u_4 − 2/5)_+ }
        + min{ (u_1 − 2/5)_+, u_2, (u_3 − 4/5)_+, (u_4 − 4/5)_+ }    (4.3.13)

for u ∈ [0,1]^4.
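Definition (4.3.11) is directly computable. The following sketch (ours) implements u* from (4.3.10) and C* from (4.3.11) and checks the two values claimed in Lemma 4.3.9 below, namely C*(u*) = 0 and C*(u*_π) = (d − 1)/(d + 1) for the order-reversing permutation π:

    import numpy as np

    def u_star(d):
        # the point u* from (4.3.10)
        u = np.ones(d)
        u[: (d + 1) // 2] = (d - 1) / (d + 1)
        if d % 2 == 0:
            u[d // 2] = d / (d + 1)
        return u

    def c_star(u):
        # the mapping C* from (4.3.11), with 0-based indices
        d = len(u)
        us = u_star(d)
        total = 0.0
        for j in range(d):
            terms = []
            for k in range(d):
                i_jk = {(j + l) % d for l in range(k + 1)}  # I(j, k), 0-based
                s = sum(1.0 - us[i] for i in range(d) if i not in i_jk)
                terms.append(max(u[(j + k) % d] - s, 0.0))
            total += min(terms)
        return total

    for d in (2, 3, 4, 5):
        us = u_star(d)
        # prints ~0, (d-1)/(d+1), (d-1)/(d+1) for each d
        print(d, c_star(us), c_star(us[::-1]), (d - 1) / (d + 1))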


A small calculation shows that in the case d = 2 the mapping C* as in (4.3.12) satisfies

  C*(u_1, u_2) = min{ u_1, u_2, (u_1 − 1/3)_+ + (u_2 − 2/3)_+ }

for all (u_1, u_2)^⊤ ∈ [0,1]². This means that C* coincides with the copula in (4.3.2), which is given by Nelsen (2007) in order to show that the bound of non-exchangeability is best possible for d = 2 (see Proposition 4.3.1). Therefore, C* is a maximally non-exchangeable copula for d = 2. For d > 2, the following lemma shows that the bound of Theorem 4.3.3 is attained. Then, in the (remaining) proof of Theorem 4.3.3, which is stated right after the proof of Lemma 4.3.9, it becomes clear that the mapping C* is indeed a proper copula for any d ≥ 2.

Lemma 4.3.9. Let C* be the mapping defined in (4.3.11) and u* ∈ [0,1]^d as in (4.3.10). Let π ∈ S_d be the order-reversing permutation, i.e. π(k) := d − k + 1. Then C*(u*) = 0 and C*(u*_π) = (d − 1)/(d + 1).

Proof. First and foremost, note that

  Σ_{i=1}^d u*_i = d − 1  and therefore  Σ_{i=1}^d (1 − u*_i) = 1

holds by the choice of u*.

Now for the first claim: let j ∈ {0, …, d − 1} and k := 0. Thus I(j, k) = I(j, 0) = {j + 1}, and because of

  Σ_{i=1, i∉I(j,0)}^d (1 − u*_i) = Σ_{i=1}^d (1 − u*_i) − (1 − u*_{j+1}) = u*_{j+1}

we get

  ( u*_{((j+k) mod d)+1} − Σ_{i=1, i∉I(j,k)}^d (1 − u*_i) )_+ = ( u*_{j+1} − u*_{j+1} )_+ = 0

whenever k = 0. As this holds true for each j, we have C*(u*) = 0.

In order to prove the second claim, note that C*(u*_π) = Σ_{j=0}^{d−1} ⋀_{k=0}^{d−1} (m_{j,k})_+ with

  m_{j,k} := u*_{([(d−1)−(j+k)] mod d)+1} − Σ_{i=1, i∉I(j,k)}^d (1 − u*_i),

because

  d − ( ((j + k) mod d) + 1 ) + 1 = ( ((d − 1) − (j + k)) mod d ) + 1

holds for all j, k ∈ {0, …, d − 1}. Now let j ∈ {0, …, d − 1} and 0 ≤ k ≤ d − 2. We want to show that m_{j,k} is non-decreasing in k, i.e. m_{j,k} ≤ m_{j,k+1}. This is the case if and only if

  α_{j,k} := u*_{([(d−1)−(j+k)] mod d)+1} − u*_{([(d−1)−(j+k+1)] mod d)+1} ≤ 1 − u*_{((j+k+1) mod d)+1} =: β_{j,k}    (4.3.14)

holds. The left-hand side of (4.3.14) is the difference between consecutive components of u*, so α_{j,k} = 0 for most choices of k. The cases where α_{j,k} ≠ 0 depend on d being odd or even. If d is even, then α_{j,k} ≠ 0 if and only if:

1. (((d − 1) − (j + k)) mod d) + 1 = 1. Then α_{j,k} = u*_1 − u*_d < 0 ≤ β_{j,k}.

2. (((d − 1) − (j + k)) mod d) + 1 = d/2 + 2. In this case

  ((j + k + 1) mod d) + 1 = d − (d/2 + 1) + 1 = d/2

and therefore

  α_{j,k} = u*_{d/2+2} − u*_{d/2+1} = 1/(d + 1) < 2/(d + 1) = 1 − u*_{d/2} = β_{j,k}

holds.

3. (((d − 1) − (j + k)) mod d) + 1 = d/2 + 1. In this case

  ((j + k + 1) mod d) + 1 = d − d/2 + 1 = d/2 + 1

and therefore

  α_{j,k} = u*_{d/2+1} − u*_{d/2} = 1/(d + 1) = 1 − u*_{d/2+1} = β_{j,k}

holds.

If d is odd, then α_{j,k} ≠ 0 if and only if:

1. as in case 1 for even d.

2. (((d − 1) − (j + k)) mod d) + 1 = (d + 1)/2 + 1. In this case

  ((j + k + 1) mod d) + 1 = d − (d + 1)/2 + 1 = (d + 1)/2

and therefore

  α_{j,k} = u*_{(d+3)/2} − u*_{(d+1)/2} = 2/(d + 1) = 1 − u*_{(d+1)/2} = β_{j,k}

holds.

So we have α_{j,k} ≤ β_{j,k} and thus m_{j,k} ≤ m_{j,k+1} for all choices of j and k. This means the minimum in (4.3.11) is always attained at k = 0, which gives us

  C*(u*_π) = Σ_{j=0}^{d−1} (m_{j,0})_+ = Σ_{j=0}^{d−1} ( u*_{d−j} − u*_{j+1} )_+ = (d − 1)/(d + 1),

as for j > d/2 the term (u*_{d−j} − u*_{j+1})_+ vanishes by the construction of u*.

Now we are finally set to prove Theorem 4.3.3.


Proof of Theorem 4.3.3. Let π ∈ S_d and let C be a d-copula. Then by Lemma 4.3.6 we get the upper bound of (4.3.3). In Lemma 4.3.9 it is shown that there exist a point u* ∈ [0,1]^d, a permutation π ∈ S_d and a mapping C* : [0,1]^d → ℝ such that

  |C*(u*) − C*(u*_π)| = (d − 1)/(d + 1).

So, all we need to do in order to prove Theorem 4.3.3 is to show that C* is indeed a valid copula. This is the case, as it can be constructed by the shuffle-of-min method. In two dimensions, Mikusinski et al. (1992) show that by slicing the unit square vertically (including the mass of the upper Fréchet–Hoeffding bound on the main diagonal) and rearranging it, i.e. shuffling the strips, the resulting mass distribution will yield a proper copula. Mikusinski and Taylor (2010, Section 6) state that this also works for d > 2 by rearranging [0,1]^d, with all the mass equally spread on the main diagonal, i.e. on the set

  { u ∈ [0,1]^d | u_1 = … = u_d }.

To this end, the hypercube [0,1]^d is separated along hyperplanes of the form u_k = λ_k. The separate parts are then rearranged. The resulting shuffle of the original mass distribution corresponds to a proper copula. C* can be obtained this way, by using hyperplanes with λ_k := Σ_{i=1}^k (1 − u*_i). Durante and Fernandez-Sanchez (2010) generalize this concept by applying it to arbitrary copulas. By Remark 2.1 therein, and following their notation, we get a copula C indicated by

  ⟨ (J^k)_{k=1}^d, (C_i)_{i=1}^d ⟩,

where C_i(u) := M_d(u) for i ∈ {1, …, d}, and J^k = (J^k_j)_{j=1}^d with

  J^k_j := [1 − Σ_{i=j}^k (1 − u*_i), 1 − Σ_{i=j+1}^k (1 − u*_i)]   if j < k,
  J^k_j := [u*_k, 1]                                               if j = k,
  J^k_j := [Σ_{i=k+1}^{j−1} (1 − u*_i), Σ_{i=k+1}^{j} (1 − u*_i)]  if j > k,    (4.3.15)

for k = 1, …, d. In Proposition 2.2 of Durante and Fernandez-Sanchez (2010), an explicit expression of C is given, namely

  C(u) = Σ_{j=1}^d λ(J^1_j) M_d( (u_1 − a^1_j)_+ / λ(J^1_j), …, (u_d − a^d_j)_+ / λ(J^1_j) ),    (4.3.16)

where a^k_j is the left endpoint of the interval J^k_j. Showing that C(u) = C*(u) is just notationally demanding. The sums in (4.3.11) and in (4.3.15) look similar, but in (4.3.11) we circumvent the distinction of cases by using modular arithmetic. Note that in (4.3.16) we write (u_i − a^i_j)_+ instead of u_i − a^i_j as in Proposition 2.2 of Durante and Fernandez-Sanchez (2010). But from their proof it is clear that a summand is 0 whenever u_i < a^i_j for at least one i ∈ {1, …, d}.

Now that the proof of Theorem 4.3.3 is complete, it should be noted that the mapping C* in (4.3.11) and in Example 4.3.8 is a maximally non-exchangeable copula. The intervals J^k_j as in (4.3.15), corresponding to the copula C* in dimension d = 4 as in Example 4.3.8, are given in the following example. Other examples of maximally non-exchangeable copulas will be given in Section 4.4.


Example 4.3.10. For dimension d = 4, the vector u* as in (4.3.10) is given by u* = (3/5, 3/5, 4/5, 1)^⊤. The intervals J^k_j of (4.3.15) for this case are given in Table 4.1. Note that the intervals J^k_4 may be omitted, as their length is zero. The graphs in Figure 4.2 show the two-dimensional projections of the four-dimensional hypercubes J^1_j × … × J^4_j with mass |J^1_j| uniformly spread on their diagonals, as C_i ≡ M_d for all i ∈ {1, …, d} is used in the construction of C*. By using other copulas for C_i, the mass |J^1_j| is still confined to the blue square labelled j. The hatched areas correspond to the hypercube between 0 and u* and to the hypercube between 0 and u*_π = (1, 4/5, 3/5, 3/5)^⊤, respectively. In order for C*(u*) = 0 to hold, all the blue squares must lie outside the red area in at least one projection. And C*(u*_π) = 3/5 is due to the location of blue squares (at least in parts) with mass 3/5 inside the green area in all projections. An explicit form of the copula C* is given in (4.3.13).

Table 4.1: Intervals J^k_j as in (4.3.15) for d = 4.

  j | k = 1      | k = 2      | k = 3      | k = 4
  1 | [3/5, 1]   | [1/5, 3/5] | [0, 2/5]   | [0, 2/5]
  2 | [0, 2/5]   | [3/5, 1]   | [2/5, 4/5] | [2/5, 4/5]
  3 | [2/5, 3/5] | [0, 1/5]   | [4/5, 1]   | [4/5, 1]
  4 | [3/5, 3/5] | [1/5, 1/5] | [0, 0]     | [1, 1]
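Table 4.1 can be regenerated from (4.3.15). A short sketch (ours; Fraction is used only to obtain exact endpoints):

    from fractions import Fraction as F

    u_star = [F(3, 5), F(3, 5), F(4, 5), F(1)]
    w = [1 - x for x in u_star]  # the lengths 1 - u*_i
    d = len(u_star)

    def J(k, j):
        # the interval J^k_j from (4.3.15), with 1-based k and j
        if j < k:
            return (1 - sum(w[j - 1:k]), 1 - sum(w[j:k]))
        if j == k:
            return (u_star[k - 1], F(1))
        return (sum(w[k:j - 1]), sum(w[k:j]))

    for j in range(1, d + 1):
        print(j, [J(k, j) for k in range(1, d + 1)])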

4.4 Additional Results

We now present some results concerning uniqueness and other implications which follow from maximal non-exchangeability or from modifications of some proofs in the preceding section.

4.4.1 Some Aspects of Uniqueness

As mentioned at the beginning of Section 4.3, if we assume u*_1 ≤ u*_2, Nelsen (2007) shows that for d = 2 there is exactly one u* (namely u* = (1/3, 2/3)^⊤) for which the maximum in (4.3.3) is attained. Assuming u*_1 ≤ u*_2 ≤ u*_3, the same holds true for d = 3 and u* = (1/2, 1/2, 1)^⊤. If we exclusively consider C* as in (4.3.11) (see also Example 4.3.8), which is shown to be a copula in the proof of Theorem 4.3.3 above, there are still two possible choices for u*_π, or rather for the permutation π: both u*_π = (u*_3, u*_1, u*_2)^⊤ and u*_π = (u*_3, u*_2, u*_1)^⊤ yield the upper bound of Theorem 4.3.3. For d = 4 we lose uniqueness of u*, even if we assume u*_i ≤ u*_{i+1} for all i ∈ {1, 2, 3}. The existence of a copula which is maximally non-exchangeable at u* = (0.6, 0.6, 0.9, 0.9)^⊤ will be derived in the proof of Theorem 4.4.3.

In summary, for d > 2, the point u* where equality in (4.3.3) holds is unique if and only if d is odd (assuming u*_i ≤ u*_j for i ≤ j). If d = 2n + 2 (n ∈ ℕ), then there exists a (d/2 − 1)-dimensional manifold with a subset M ⊂ [0,1]^d such that for all u* ∈ M there exist a copula C and a permutation π ∈ S_d with |C(u*) − C(u*_π)| = (d − 1)/(d + 1). This is shown in Theorem 4.4.3. For the proof, we first improve the bound in (4.3.8), which was derived in the proof of Lemma 4.3.6.

Lemma 4.4.1. Let d ≥ 2 and u ∈ [0,1]^d with u_i ≤ u_j for i ≤ j. Then

  |C(u) − C(u_π)| ≤ Σ_{i=⌈d/2⌉+1}^d (u_i − u_1)    (4.4.1)

holds for any copula C and any permutation π ∈ S_d, where ⌈x⌉ denotes the smallest integer z ∈ ℤ such that z ≥ x, as usual.

[Figure 4.2: Two-dimensional projections of the four-dimensional hypercubes J^1_j × … × J^4_j with mass |J^1_j| uniformly spread on their diagonals. The intervals J^k_j depend on u* = (3/5, 3/5, 4/5, 1)^⊤ and are given in (4.3.15) and Table 4.1. The squares J^{k1}_j × J^{k2}_j in the projection onto the x_{k1},x_{k2}-plane are colored blue and labelled j. The red area, hatched from top left to bottom right, is the projection of the hypercube [0, u*], whereas the green area, hatched from top right to bottom left, is the projection of the hypercube [0, u*_π], where u*_π = (1, 4/5, 3/5, 3/5)^⊤.]

Before proving Lemma 4.4.1 for an arbitrary π ∈ S_d, we give the proof for a special case in the following example.

Example 4.4.2. Let d ≥ 3, let u ∈ [0,1]^d be as in Lemma 4.4.1 and let π ∈ S_d be such that π(i) ≠ i for exactly three indices i ∈ {1, …, d}. This means there are exactly three components u_{i_1}, u_{i_2}, u_{i_3} in u which are permuted in u_π. Without loss of generality, we may assume i_1 < i_2 < i_3, i.e. u_{i_1} ≤ u_{i_2} ≤ u_{i_3}, because the components of u being ordered is a prerequisite of Lemma 4.4.1. As π cannot be a transposition (otherwise there is one k with π(i_k) = i_k), either π is a left-shift of the three components, i.e.

  π = π_l := (i_1 i_2 i_3),

or π is a right-shift of the three components, i.e.

  π = π_r := (i_1 i_3 i_2),

as there are no other derangements in S_3. Now consider the transpositions τ_1 := (i_1 i_2) and τ_2 := (i_2 i_3). Then π_l and π_r are generated by those two transpositions in the following way:

  π_l = τ_1 ∘ τ_2,
  π_r = τ_2 ∘ τ_1.

Thus we have

  |C(u) − C(u_{π_l})| ≤ |C(u) − C(u_{τ_2})| + |C(u_{τ_2}) − C(u_{π_l})|,
  |C(u) − C(u_{π_r})| ≤ |C(u) − C(u_{τ_1})| + |C(u_{τ_1}) − C(u_{π_r})|,

and applying Lemma 4.3.4 yields

  |C(u) − C(u_{π_l})| ≤ |u_{i_3} − u_{i_2}| + |u_{i_2} − u_{i_1}| = |u_{i_3} − u_{i_1}| ≤ |u_d − u_1|,
  |C(u) − C(u_{π_r})| ≤ |u_{i_2} − u_{i_1}| + |u_{i_3} − u_{i_2}| = |u_{i_3} − u_{i_1}| ≤ |u_d − u_1|.

Note that the last equation holds, as u_1 ≤ u_{i_1} ≤ u_{i_2} ≤ u_{i_3} ≤ u_d by the prerequisites of Lemma 4.4.1. Now, in this special case, (4.4.1) follows immediately, as either π = π_l or π = π_r.

For more information on generating permutations by transpositions, see e.g. Dummit and Foote (2009), Coxeter and Moser (1980) or Kerber (1971) and the references therein. We will make use of Example 4.4.2 in the following proof of Lemma 4.4.1.

Proof of Lemma 4.4.1. Let d ≥ 2, u ∈ [0,1]^d with u_i ≤ u_j for i ≤ j, and π ∈ S_d. We will need p ∈ ℕ, defined by

  p := |{ i ∈ ℕ | 1 ≤ i ≤ d ∧ π(i) ≠ i }|,

so p is the number of elements of {1, …, d} which are not fixed points of π. Note that for p = 0 there is nothing to show and that p = 1 is impossible. Therefore, we may assume p ≥ 2 and have p indices 1 ≤ i_1 < … < i_p ≤ d with π(i_k) ≠ i_k for k ∈ {1, …, p}. We will prove Lemma 4.4.1 by establishing the similar claim

  |C(u) − C(u_π)| ≤ Σ_{k=⌈p/2⌉+1}^p (u_{i_k} − u_{i_1}).    (4.4.2)


Then (4.4.1) follows immediately, as the components of u are ordered and thus

  Σ_{k=⌈p/2⌉+1}^p (u_{i_k} − u_{i_1}) ≤ Σ_{i=⌈d/2⌉+1}^d (u_i − u_1)

holds true for all p and the corresponding index sets.

The proof of (4.4.2) is an induction on p. For p = 2, equation (4.4.2) holds true due to Lemma 4.3.4. The case p = 3 is discussed in Example 4.4.2. Now assume (4.4.2) holds for p − 1 (with 3 < p ≤ d). As i_p is not a fixed point of π, there exist x, y ∈ {1, …, p − 1} such that i_x = π^{−1}(i_p) and i_y = π(i_p). We distinguish the cases i_x ≠ i_y and i_x = i_y.

Case 1: If i_x ≠ i_y, then consider the shift σ := (i_x i_p i_y). With the permutation π_1 := σ^{−1} ∘ π we get

  σ ∘ π_1 = σ ∘ σ^{−1} ∘ π = π

as a decomposition of π. Because of σ(i_x) = π(i_x) and σ(i_p) = π(i_p), both i_x and i_p are fixed points of π_1. This means that the permutation π_1 exhibits at most p − 2 non-fixed points, contained in the set

  {j_1, …, j_{p−2}} = {i_1, …, i_{p−1}} \ {i_x},

where j_k = i_k if k < x and j_k = i_{k+1} otherwise. If π(i_y) = i_x, then π_1 has just p − 3 non-fixed points, but the proof remains unchanged. As j_k ≤ i_{k+1} for all k ∈ {1, …, p − 2} as well as j_1 ≥ i_1, we get

  |C(u) − C(u_{π_1})| ≤ Σ_{k=⌈(p−2)/2⌉+1}^{p−2} (u_{j_k} − u_{j_1}) ≤ Σ_{k=⌈p/2⌉+1}^{p−1} (u_{i_k} − u_{i_1})

by the induction hypothesis and because the components of u are assumed to be in ascending order. Furthermore, as σ is a shift of three elements,

  |C(u_{π_1}) − C(u_{σ∘π_1})| ≤ u_{π_1(i_p)} − min{ u_{π_1(i_x)}, u_{π_1(i_y)} } ≤ u_{i_p} − u_{i_1}

holds because of Example 4.4.2 and i_p being a fixed point of π_1 (see above). Altogether, this yields

  |C(u) − C(u_π)| ≤ |C(u) − C(u_{π_1})| + |C(u_{π_1}) − C(u_{σ∘π_1})|
                  ≤ Σ_{k=⌈p/2⌉+1}^{p−1} (u_{i_k} − u_{i_1}) + (u_{i_p} − u_{i_1}) = Σ_{k=⌈p/2⌉+1}^p (u_{i_k} − u_{i_1}),

which completes the proof in this case.

Case 2: If i_x = i_y, then all arguments of Case 1 remain valid if the shift σ of Case 1 is replaced by the transposition τ = (i_x i_p).

Therefore (4.4.2) holds for all p ∈ {2, …, d}.

Now we are able to prove that for d > 2 the point u* where maximal non-exchangeability is possible is unique if and only if the dimension is odd.


Theorem 4.4.3. Let d > 2 and let

  M := { u ∈ [0,1]^d | u_1 ≤ … ≤ u_d ∧ ∃ π ∈ S_d ∃ C ∈ 𝒞_d : |C(u) − C(u_π)| = (d − 1)/(d + 1) }

denote the subset of the unit hypercube where maximal non-exchangeability is attained. Then |M| = 1 if and only if d = 2n + 1 (for some n ∈ ℕ). If d = 2n, then M is a subset of an (n − 1)-dimensional manifold.

As usual, 𝒞_d denotes the set of all d-dimensional copulas. In the case d = 2n, M being a subset of an (n − 1)-dimensional manifold means that there are (n − 1) degrees of freedom when looking for points where maximal non-exchangeability is attained.

Proof. Let u ∈ M and, just like in the proof of Lemma 4.3.6, consider

  ū_i := u_i − (d − 1)/(d + 1) ∈ [0, 2/(d + 1)],    (4.4.3)

where the left bound of ū_i follows from

  (d − 1)/(d + 1) = |C(u) − C(u_π)| ≤ M(u) ≤ u_i

for any i = 1, …, d, and the right bound of ū_i is a direct consequence of u_i ∈ [0, 1]. From (4.3.9) we find that any u ∈ M satisfies

  2(d − 1)/(d + 1) − Σ_{i=2}^d ū_i ≥ (d − 1)/(d + 1),

i.e. it holds that

  (d − 1)/(d + 1) ≥ Σ_{i=2}^d ū_i ≥ Σ_{i=⌈d/2⌉+1}^d ū_i.

This, together with the inequality

  Σ_{i=⌈d/2⌉+1}^d ū_i ≥ Σ_{i=⌈d/2⌉+1}^d (ū_i − ū_1) = Σ_{i=⌈d/2⌉+1}^d (u_i − u_1) ≥ (d − 1)/(d + 1)

from Lemma 4.4.1, yields

  Σ_{i=⌈d/2⌉+1}^d ū_i = (d − 1)/(d + 1)    (4.4.4)

for every u ∈ M. Let d = 2n + 1. The only way for (4.4.4) to be true is

  ū_i = 0 if i ≤ ⌈d/2⌉,   ū_i = 2/(d + 1) if i ≥ ⌈d/2⌉ + 1,

because, as mentioned before, 0 ≤ ū_i ≤ 2/(d + 1) holds for all i ∈ {1, …, d}. Therefore |M| = 1 if the dimension d is odd.


Now, let d = 2n and for j ∈ {1, …, n} let δ_j ∈ [0, 1/(d + 1)] be such that δ_1 ≤ … ≤ δ_n as well as

  Σ_{j=1}^n δ_j = (n − 1)/(d + 1)

hold. Furthermore, consider u ∈ [0,1]^d where

  u_i := (d − 1)/(d + 1) if i ≤ n,   u_i := d/(d + 1) + δ_{i−n} if i > n,

and let M̃ be the set of all such u. For each u ∈ M̃ there exist a permutation π and a copula C such that |C(u) − C(u_π)| = (d − 1)/(d + 1) holds. We will give such a permutation π and construct such a copula C by the shuffle-of-min method, presented by Mikusinski et al. (1992) and Durante and Fernandez-Sanchez (2010), in Appendix A.2. Therefore, we have M̃ ⊆ M.

It remains to show that M ⊆ M̃ holds as well. To this end, let u ∈ M. Note first that, as Σ_{i=2}^d ū_i ≤ (d − 1)/(d + 1), equation (4.4.4) forces ū_i = 0 for all i ≤ n. If we assume that u_{n+1} < d/(d + 1) holds, or, to put it differently, if we assume that ū_{n+1} < 1/(d + 1) holds, then equation (4.4.4) implies that there exists some ū_{n+j} > 2/(d + 1), contradicting (4.4.3) at the beginning of the proof. Hence, we may write

  u_{n+j} = d/(d + 1) + δ_j

for j ∈ {1, …, n} with

  0 ≤ δ_1 ≤ … ≤ δ_n ≤ 1/(d + 1),

as the components of all u ∈ M are in ascending order. Consequently, we have

  ū_{n+j} = 1/(d + 1) + δ_j

for all j ∈ {1, …, n}, and equation (4.4.4) implies

  Σ_{j=1}^n δ_j = (n − 1)/(d + 1),

which means that u ∈ M̃ and thus M ⊆ M̃.

In what follows, we give an example of a copula as constructed in the proof of Theorem 4.4.3 by the shuffle-of-min method as in (A.4). This copula is maximally non-exchangeable in dimension d = 4 and different from the four-dimensional, maximally non-exchangeable copula of Example 4.3.8.

Example 4.4.4. Let d = 4. As the dimension is even, the point u* = (3/5, 3/5, 4/5, 1)^⊤ of Example 4.3.8 is not the only one at which maximal non-exchangeability is attained (see Theorem 4.4.3). Let δ_1 ∈ [0, 1/10] and u = (3/5, 3/5, 4/5 + δ_1, 1 − δ_1)^⊤. Note that 1 − δ_1 = 4/5 + δ_2 if δ_1 + δ_2 = 1/5, as in


Appendix A.2. The copula in (A.4) is given by

  C(u) = min{ u_1, (u_2 − 3/5)_+, u_3, u_4, 1/5 + δ_1 }
       + min{ (u_1 − 1/5 − δ_1)_+, (u_2 − 4/5 − δ_1)_+, (u_3 − 3/5 − δ_1)_+, (u_4 − 3/5 − δ_1)_+, 1/5 − δ_1 }
       + min{ (u_1 − 2/5)_+, u_2, (u_3 − 4/5 − δ_1)_+, (u_4 − 4/5)_+, 1/5 − δ_1 }
       + min{ (u_1 − 3/5 + δ_1)_+, (u_2 − 1/5 + δ_1)_+, (u_3 − 4/5)_+, (u_4 − 1 + δ_1)_+, δ_1 }
       + min{ (u_1 − 3/5)_+, (u_2 − 1/5)_+, (u_3 − 1/5 − δ_1)_+, (u_4 − 1/5 − δ_1)_+, 2/5 }.    (4.4.5)

Therefore, we have

  |C(u) − C(1 − δ_1, 4/5 + δ_1, 3/5, 3/5)| = 3/5,

and C is maximally non-exchangeable at u. For δ_1 ≠ 0, this copula C is different from the copula C* of Example 4.3.8, as

  C(3/5, 3/5, 4/5 + δ_1, 1 − δ_1) = 0 ≠ δ_1 = C*(3/5, 3/5, 4/5 + δ_1, 1 − δ_1)

holds. The intervals J^k_i of the construction in Appendix A.2 are given in Table 4.2. Because of the condition δ_1 + δ_2 = 1/5, we use only δ_1 for better readability. The blue areas in Figure 4.3 depict the two-dimensional projections of the four-dimensional hypercubes J^1_i × … × J^4_i with mass |J^1_i| uniformly spread on their diagonals. This means that each of the hypercubes contains the upper Fréchet–Hoeffding bound M, scaled to fit the respective hypercube. Instead of M, other copulas might be used (see Durante and Fernandez-Sanchez (2010)). For example, spreading the mass uniformly within the hypercubes results in scaled versions of the independence copula Π. The hatched areas correspond to the hypercube between 0 and u and to the hypercube between 0 and u_π = (1 − δ_1, 4/5 + δ_1, 3/5, 3/5)^⊤, respectively. In order for C(u) = 0 to hold, all the blue squares must lie outside the red area in at least one projection. And C(u_π) = 3/5 is due to the location of blue squares (at least in parts) with mass 3/5 inside the green area in all projections. An explicit form of the copula C is given in (4.4.5). A comparison of Figure 4.2 and Figure 4.3 clarifies that even for δ_1 = 0 the respective copulas do not coincide.

4.4.2 Marginal Distributions

In the following theorem, it becomes clear that Theorem 4.3.3 is not just a statement aboutexchangeability, but also has consequences for the possible choices of lower dimensional marginsof a copula. For example, for d > 3, there exists no copula of which two (d − 1)-dimensionalmargins Ca and Cb coincide on the point d−2

d−1 (1, . . . , 1)> with the Frechet-Hoeffding-bounds ofTheorem 2.2.4.

86

Page 95: Exchangeability of Copulas

4.4. Additional Results

1

2

34

5

1

2

34

5

1

2

34

5

1

2

34

5

1

2

34

5

1

2

34

5

1 x1

1

x4

0

1 x1

1

x3

0

1 x1

1

x2

0

1 x3

1

x4

0

1 x2

1

x4

0

1 x2

1

x3

0

Figure 4.3: Two-dimensional projections of the four-dimensional hypercubes J1i × . . .× J4

i withmass

∣∣J1i

∣∣ uniformly spread on their diagonals as in the proof of Theorem 4.4.3 (see Appendix A.2).

The intervals Jki depend on δ1 ∈[0, 1

10

]as well as on u∗ =

(35 ,

35 ,

45 + δ1, 1− δ1

)> and are given

in Appendix A.2 and Table 4.2. The squares Jk1i × Jk2i in the projection onto the xk1 , xk2 -plane

are colored blue and labelled i, i. e. i . The red area, hatched from top left to bottom right, i. e.

is the projection of the hypercube [0,u∗], whereas the green area, hatched from top right tobottom left, i. e. is the projection of the hypercube [0,u∗π], where u∗π =

(1− δ1, 4

5 + δ1,35 ,

35

)>.Here, δ1 = 3

40 was used.

87

Page 96: Exchangeability of Copulas

4. Limits of Non-Exchangeability

Table 4.2: Intervals Jki of the construction in Appendix A.2 for d = 4.

i k = 1 k = 2 k = 3 k = 4

1[0, 1

5 + δ1] [

35 ,

45 + δ1

] [0, 1

5 + δ1] [

0, 15 + δ1

]2

[15 + δ1,

25

] [45 + δ1, 1

] [35 + δ1,

45

] [35 + δ1,

45

]3

[25 ,

35 − δ1

] [0, 1

5 − δ1] [

45 + δ1, 1

] [45 , 1− δ1

]4

[35 − δ1, 3

5

] [15 − δ1, 1

5

] [45 ,

45 + δ1

][1− δ1, 1]

5[

35 , 1] [

15 ,

35

] [15 + δ1,

35 + δ1

] [15 + δ1,

35 + δ1

]

Theorem 4.4.5. Let d > 3, let C be a d-dimensional copula and let k ∈ N, such that k < d−12 .

Furthermore, let C(d−k),a and C(d−k),b be two (d− k)-dimensional margins of C. Then

∣∣C(d−k),a(u)− C(d−k),b(u)∣∣ ≤ d− 1

d+ 1<d− k − 1

d− k = Md−k(u∗)−Wd−k(u∗)

holds for all u ∈ [0, 1]d−k and u∗ := d−k−1d−k (1, . . . , 1)> ∈ [0, 1]d−k

By Md−k we denote the upper (d− k)-dimensional Frechet-Hoeffding-bound, and by Wd−k a(d− k)-dimensional copula which coincides with the lower (d− k)-dimensional Frechet-Hoeffding-bound in u∗. For details on the existence and construction of such a copula Wd−k, see, e. g.,the proof of Theorem 1.2.4 of Hofert (2010). For an exact form of such a copula Wd−k withgiven diagonal section, see Jaworski (2009). Note that Theorem 4.4.5 is still correct for d = 3,but gives no information. To be precise, for d = 3, there exists a copula with M and W astwo-dimensional margins. For example consider U ∼ U[0, 1] and the copula C of the randomvector (1 − U,U, U)> ∼ C. Then C(u, v, 1) = W (u, v) and C(1, u, v) = M(u, v) holds for all(u, v)> ∈ [0, 1]2.

Proof. Let d > 3 and let C be a d-dimensional copula. As C(d−k),a and C(d−k),b are margins of

C, for a fixed u ∈ [0, 1]d−k, there exist ua ∈ [0, 1]d and ub ∈ [0, 1]d with exactly k componentsequal to 1, such that

C(d−k),a(u) = C(ua),

C(d−k),b(u) = C(ub)

holds. These two d-dimensional vectors ua and ub might not be unique, but it is possible tochoose them in such a way that they are the same, up to the order of their components (justexpand u by inserting 1 into the appropriate places). Therefore, there exists a permutationπ ∈ Sd, such that ub = (ua)π and thus∣∣C(d−k),a(u)− C(d−k),b(u)

∣∣ =∣∣∣C(ua)− C

((ua)π

)∣∣∣ ≤ d− 1

d+ 1.

A basic computation yields

d− 1

d+ 1<d− k − 1

d− k ⇐⇒ k <d− 1

2

which completes the proof.

88

Page 97: Exchangeability of Copulas

4.4. Additional Results

4.4.3 Non-Maximal Non-Exchangeability

Besides the maximal non-exchangeable copulas which are mentioned in this chapter, there existalso examples for copulas which are non-exchangeable, but for which the limit of Theorem 4.3.3is not attained. Some of those non-exchangeable but not maximal non-exchangeable copulaswere already introduced in Example 4.2.6 or Theorem 4.2.8. Theorem 4.2.11 shows that nestedArchimedean copulas are not exchangeable in general. And as the generators of the familiesdiscussed in Example 2.4.21, Example 2.4.22 and Example 2.4.23 are continuous in their respectiveparameter θ, a nested Archimedean copula with generators that belong to the same family becomesexchangeable, whenever the difference between the parameters goes to zero. Another way toconstruct non-exchangeable copulas from a set of exchangeable copulas is given in the followingproposition. According to Genest et al. (1998), it was introduced by Khoudraji (1995) in thebivariate case. An extension to the multivariate case is discussed by Liebscher (2008) (see alsoLiebscher (2011)). The following proposition is Theorem 2.1 of Liebscher (2008).

Proposition 4.4.6. Let k ∈ N. For j ∈ 1, . . . , k, let Cj ∈ Cd be a d-dimensional copula andfor each j as well as for each i ∈ 1, . . . , d, let gji : [0, 1]→ [0, 1] be a function, which is eitherstrictly increasing or gji ≡ 1. If

k∏j=1

gji(v) = v as well as limv→0

gij(v) = gji(0)

holds for all v ∈ [0, 1] and for all i ∈ 1, . . . , d, j ∈ 1, . . . , k, then the function C : [0, 1]d → [0, 1]with

C(u) :=

k∏j=1

Cj(gj1(u1), . . . , gjd(ud)

)is a copula.

In Liebscher (2008), the functions gji(v) := vθji for θji ∈ [0, 1], such that∑kj=1 θji = 1 are

given as an example for functions, which fulfill the prerequisites of the above proposition. Ofcourse the extent of non-exchangeability, i. e. the value of

maxπ∈Sd

maxu∈[0,1]d

∣∣C(u)− C(uπ)∣∣,

for the copulas in the preceding proposition, depends on the concrete choice of C and g.

89

Page 98: Exchangeability of Copulas
Page 99: Exchangeability of Copulas

5 Tests for Non-Exchangeability

Before using an Archimedean or any other exchangeable copula in order to make inferences from agiven sample, a test for exchangeability of the underlying copula of the data should be performed.In this chapter we present some test statistics for testing exchangeability of copulas in arbitrarydimension and their asymptotical behavior as well as some small simulation study in order tocompare their performance on finite samples. Essential parts of this chapter are submitted forpublication as Harder and Stadtmuller (2015) and a revision is currently under review.

5.1 Test Statistics

Before introducing some test statistics, we first clarify some notations in the following definition.

Definition 5.1.1. Let n ∈ N and let X1, . . . ,Xn be a sample of a d-variate distribution H withunivariate continuous marginal distributions F1, . . . , Fd, i. e.

Xi := (Xi1, . . . , Xid)> ∼ H,

Xij ∼ Fj ,

for i ∈ 1, . . . , n, j ∈ 1, . . . , d. For x ∈ R and j ∈ 1, . . . , d, the function Fjn : R→ [0, 1] with

Fjn(x) :=1

n

n∑i=1

1(Xij ≤ x)

denotes the empirical distribution function of the margin Fj (see Definition 3.3.9). Furthermore,

the random process Cn on D[0, 1]d with

Cn(u) :=1

n

n∑i=1

d∏j=1

1(Fjn(Xij) ≤ uj

)(5.1.1)

for u ∈ [0, 1]d is called empirical copula. The random vectors U1, . . . , Un with

Ui := (Ui1, . . . , Uid)>

Uij := Fjn(Xij)(5.1.2)

for i ∈ 1, . . . , n and j ∈ 1, . . . , d are referred to as rank transformed sample, because nFjn(Xij)gives the rank of Xij within the j-th elements of X1, . . . ,Xn.

91

Page 100: Exchangeability of Copulas

5. Tests for Non-Exchangeability

Just like Genest et al. (2012) we will refer to Cn as empirical copula, being aware of itsnon-uniform margins thus not being a valid copula. Although the notation suggests so, thefunction Cn in (5.1.1) must not be mistaken for the empirical distribution function Cn of thecopula C. It is easy to see that the empirical distribution function of the copula C, belonging toH via Sklar’s theorem, and Cn in (5.1.1) do not coincide in general. One reason is that Fjn(Xij)only takes the discrete values 1

n , . . . ,nn , whereas Fj(Xij) follows a standard uniform distribution.

That the empirical copula is indeed a random process (i. e.(Cn(·)

)(u) is a random variable for

all u ∈ [0, 1]d), is straightforward as the Xi are random vectors and thus Fjn(Xij) are randomvariables for all i and j (see also Example 3.3.10).

Note that some authors use generalized inverses F−jn and Xij ≤ F−jn(uj) instead of Fjn(Xij) ≤uj in Definition 5.1.1, but as the resulting version of Cn differs at most by 2

n from the one inDefinition 5.1.1, this is negligible, particularly when looking at the asymptotics. In Ruschendorf(1976) some regularity conditions are given, under which (5.1.1) is a consistent estimator forC(u). In Proposition 3.1 of Segers (2012) it is stated that if all first order partial derivatives of

C exist and are continuous on the interior of the unit hypercube, then Cn, suitably normalized,converges weakly to a Gaussian process in l∞[0, 1]d.

We want to test the hypothesis

H0 : ∀u ∈ [0, 1]d, ∀π ∈ Sd C(u) = C(uπ)

versus

H1 : ∃u ∈ [0, 1]d, ∃π ∈ Sd C(u) 6= C(uπ)

where the vector uπ results from permuting the components of u ∈ [0, 1]d according to somepermutation π ∈ Sd, i. e. uπ :=

(uπ(1), . . . , uπ(d)

)> as in Section 4.2.It is well known that the set of permutations of d elements Sd has a magnitude of d!.

Therefore, the effort of checking C(u) = C(uπ) for each permutation π ∈ Sd is growing fast withthe dimension. Because of Theorem 5.1.3, for d > 2 it suffices to test for exchangeability underpermutations in a set G ⊂ Sd, which generates Sd, as in the following definition.

Definition 5.1.2. Let d ∈ N and let G ⊂ Sd. If, for all π ∈ Sd, there exist π1, . . . , πk ∈ G forsome k ∈ N, such that

π = π1 . . . πkholds, then G is said to be a generating set of Sd.

An obvious example for such a generating set is G0 := Sd. Other examples include the set ofall transpositions (1i), i. e.

G1 :=

(1i) ∈ Sd : 2 ≤ i ≤ d

with d− 1 elements and the set

G2 :=

(12), (12 . . . d)

with only two elements (for any d > 2). For these and other generating sets, see for exampleCoxeter and Moser (1980, Chapter 6.2), Kerber (1971) or Conrad (2013). We will concentrateonto the two generating sets G1 and particularly G2, the latter consisting of τ := τ12 := (12)(transposition of the first two elements) and σ := (12 . . . d). The cycle σ will also be referred to asleftshift, because for any vector u = (u1, . . . , ud)

>, shifting all elements one position to the leftand moving the first element to the last position yields the vector uσ.

92

Page 101: Exchangeability of Copulas

5.1. Test Statistics

Theorem 5.1.3. Let C : [0, 1]d → [0, 1] be a copula, and let G ⊂ Sd be a set of permutationsgenerating Sd. Then C is exchangeable if and only if C(u) = C(uπ) holds for all π ∈ G and forall u ∈ [0, 1]d

Proof. First, let C be exchangeable and let u ∈ [0, 1]d. Then obviously C(u) = C(uπ) holds forall π ∈ G by Definition 4.2.2.

Now, for all u ∈ [0, 1]d and for all π ∈ G, let C(u) = C(uπ). Let π0 ∈ Sd. As G is assumed togenerate Sd, by Definition 5.1.2, there exist k ∈ N and (not necessarily unique) π1, . . . , πk ∈ G,such that

π0 ≡ πk . . . π1

holds and thus we have, for all u ∈ [0, 1]d,

C(u) = C(uπ1) = C(uπ2π1

) = . . . = C(uπk−1...π1) = C(uπ0

),

which completes the proof.

In other words, C is exchangeable if and only if C is invariant under certain permutationsπ ∈ G of its argument u. If C is not exchangeable, then

∣∣C(u)− C(uπ)∣∣ is positive for at least

one permutation π ∈ G and some u ∈ [0, 1]d.As we do not observe C, we use the empirical copula instead and, given a set G ⊂ Sd which

generates Sd, as well as a weight function w : [0, 1]d × G → [0,∞), such that w(·, π) is continuousfor any π ∈ G, we propose

Rn :=∑π∈G

∫[0,1]d

(Cn(u)− Cn(uπ)

)2

w(u, π) du (5.1.3)

Sn :=∑π∈G

∫[0,1]d

(Cn(u)− Cn(uπ)

)2

w(u, π) dCn(u) (5.1.4)

Tn :=∑π∈G

supu∈[0,1]d

∣∣∣Cn(u)− Cn(uπ)∣∣∣w(u, π)

(5.1.5)

as a generalization of the test statistics by Genest et al. (2012) for detecting non-exchangeabilityin the bivariate case. In the proofs of Section 5.2, it becomes clear that the condition of continuityof w( · , π) may be relaxed, depending on the test statistic. Apart from w ≡ 1, another continuousweight function is proposed in Section 5.4. In most of the following sections, it is assumedthat each test statistic is a random variable, i. e. that the mappings Rn, Sn, Tn : Ω → [0,∞)are Borel-measurable, where (Ω,A,P) is a probability space. This holds for continuous weightfuncitons by the following theorem. If a weight function is used, such that some of the teststatistics are not measurable, then they might still converge weakly (as in Definition 3.4.3) to thelimits discussed in Section 5.2.

Theorem 5.1.4. If w( · , π) is continuous for all u ∈ [0, 1]d and for all π ∈ Sd, then Rn, Sn andTn are random variables.

Proof. As Cn is the empirical distribution function of the rank-transformed data, there are onlydiscontinuities if ui = k

n holds for some k ∈ 1, . . . , n and some i ∈ 1, . . . , d. Therefore itsuffices to consider the supremum over all points in [0, 1]d∩Q. And as the supremum of countablymany random variables is measurable, so is Tn.

93

Page 102: Exchangeability of Copulas

5. Tests for Non-Exchangeability

The test statistic Sn is measurable, because due to

Sn =∑π∈G

n∑i=1

(Cn(Ui)− Cn

((Ui)π

))2

w(Ui, π

)it is given by a finite sum of random variables.

In order to deduce measurability of Rn, consider the mapping f : [0, 1]d×n → [0,∞) with

f(A) :=∑π∈G

∫[0,1]d

(1

n

n∑i=1

1(ai ≤ u)− 1

n

n∑i=1

1(ai ≤ uπ)

)2

w(u, π) du

where ai denotes the i-th column of A ∈ [0, 1]d×n. If U is a matrix with columns U1, . . . , Un,

then obviously Rn = f(U) holds. It suffices to show that f is continuous for each matrix

in

1n , . . . ,

nn

d×nbecause Ui : Ω → [0, 1]d is measurable and Ui ∈

1n , . . . ,

nn

dholds for all

i ∈ 1, . . . , n. Continuity of f is demonstrated in Appendix A.3.

Of course even for exchangeable copulas we obtain∣∣Cn(u)− Cn(uπ)

∣∣ 6= 0 for most u ∈ [0, 1]d

and π ∈ Sd. So in general, the statistics will not vanish when C is exchangeable but they shouldget larger the larger

maxπ∈Sd

supu∈[0,1]d

∣∣C(u)− C(uπ)∣∣

gets, i. e. the more non-exchangeable C is. Note that using Tn with

Tn := supu∈[0,1]d

∑π∈G

∣∣∣Cn(u)− Cn(uπ)∣∣∣w(u, π)

is possible, but not superior to Tn as discussed in Remark 5.2.4.

5.2 Asymptotics

In the paper by Deheuvels (1981) (or Kiefer (1961, Theorem 2)) a proof of the following propositionmay be found. It shows that under general assumptions (i. e. continuous margins F1, . . . , Fd) the

empirical copula Cn is a consistent estimator for C.

Proposition 5.2.1. Let H be a d-dimensional distribution function with continuous marginsF1, . . . , Fd. Then

maxu∈[0,1]d

∣∣Cn(u)− C(u)∣∣ = O

(√log log n

n

)holds almost surely as n→∞.

Therefore, a multivariate (but somewhat weaker) version of the well-known Glivenko-Cantelli

theorem holds and Cn(u)a.s.−−→ C(u) uniformly for u ∈ [0, 1]d.

As mentioned in the preceding section, it will be assumed throughout this chapter that thetest statistics given in (5.1.3), (5.1.4), and (5.1.5) are random variables. If a weight function isused, such that some of the test statistics are not measurable, then they might still convergeweakly (as in Definition 3.4.3) to the limits given in the following Theorem.

94

Page 103: Exchangeability of Copulas

5.2. Asymptotics

Theorem 5.2.2. Let G ⊂ Sd and let w(·, π) be continuous for every π ∈ G. If H has continuousmargins F1, . . . , Fd, then

Rna.s.−−→ R :=

∑π∈G

∫[0,1]d

(C(u)− C(uπ)

)2w(u, π) du (5.2.1)

Sna.s.−−→ S :=

∑π∈G

∫[0,1]d

(C(u)− C(uπ)

)2w(u, π) dC(u) (5.2.2)

Tna.s.−−→ T :=

∑π∈G

supu∈[0,1]d

∣∣C(u)− C(uπ)∣∣w(u, π)

(5.2.3)

Tna.s.−−→ T := sup

u∈[0,1]d

∑π∈G

∣∣C(u)− C(uπ)∣∣w(u, π)

(5.2.4)

holds for n→∞.

Proof. The limit relation (5.2.1) follows from an application of the well-known dominated conver-gence theorem, because ∣∣∣(Cn(u)− Cn(uπ)

)2w(u, π)

∣∣∣ ≤ w(u, π)

and w(·, π) was assumed to be integrable (as it is continuous on a compact set and thus bounded).

Moreover(Cn(u)− Cn(uπ)

)2converges almost surely to

(C(u)− C(uπ)

)2for each u ∈ [0, 1]d

due to∣∣∣(Cn(u)− Cn(uπ))2 − (C(u)− C(uπ)

)2∣∣∣ =

=∣∣∣Cn(u)− Cn(uπ) + C(u)− C(uπ)︸ ︷︷ ︸

∈[−2,2]

∣∣∣∣∣∣Cn(u)− C(u) + C(uπ)− Cn(uπ)∣∣∣

≤ 4 supu∈[0,1]d

∣∣Cn(u)− C(u)∣∣ a.s.−−→ 0.

(5.2.5)

Of course, (5.2.1) may be proved with (5.2.5) and without implying the dominated convergencetheorem. But still integrability of w( · , π) for each π ∈ G is needed.

For (5.2.3), consider

supu∈[0,1]d

∣∣∣Cn(u)− Cn(uπ)∣∣∣w(u, π)

− supu∈[0,1]d

∣∣C(u)− C(uπ)∣∣w(u, π)

≤ supu∈[0,1]d

∣∣∣Cn(u)− C(u)∣∣∣w(u, π)

+ supu∈[0,1]d

∣∣∣Cn(uπ)− C(uπ)∣∣∣w(u, π)

as well as

supu∈[0,1]d

∣∣C(u)− C(uπ)∣∣w(u, π)

− supu∈[0,1]d

∣∣∣Cn(u)− Cn(uπ)∣∣∣w(u, π)

≤ supu∈[0,1]d

∣∣∣Cn(u)− C(u)∣∣∣w(u, π)

+ supu∈[0,1]d

∣∣∣Cn(uπ)− C(uπ)∣∣∣w(u, π)

95

Page 104: Exchangeability of Copulas

5. Tests for Non-Exchangeability

and thus

|Tn − T | ≤ 2(d!) supu∈[0,1]d

∣∣∣Cn(u)− C(u)∣∣∣∥∥w( · , π)

∥∥∞

a.s.−−→ 0

holds, where boundedness of w( · , π) for each π ∈ G is needed. Obviously, (5.2.4) follows fromanalogous considerations.

The following proof of (5.2.2) is adapted from the proof of SnP−→ S in the bivariate case,

given by Genest et al. (2012). In order to prove Sna.s.−−→ S in (5.2.2), it suffices to show that

Sn :=

∫[0,1]d

(Cn(u)− Cn(uπ)

)2w(u, π) dCn(u)

a.s.−−→ S :=

∫[0,1]d

(C(u)− C(uπ)

)2w(u, π) dC(u)

holds for any π ∈ Sd. Similar to Genest et al. (2012), we look at the sequences γn and ξn, as in

∣∣Sn − S∣∣ ≤ ∣∣∣∣ ∫[0,1]d

((Cn(u)− Cn(uπ)

)2 − (C(u)− C(uπ))2)w(u, π) dCn(u)︸ ︷︷ ︸

=:γn

∣∣∣∣+

∣∣∣∣ ∫[0,1]d

(C(u)− C(uπ)

)2w(u, π) dCn(u)− S︸ ︷︷ ︸

=:ξn

∣∣∣∣ = |γn|+ |ξn|.

Because Cn(u) ∈ [0, 1] and C(u) ∈ [0, 1], obviously∣∣∣Cn(u)− Cn(uπ)∣∣∣ ≤ 1 and

∣∣C(u)− C(uπ)∣∣ ≤ 1

holds. The second bound could be improved to∣∣C(u) − C(uπ)

∣∣ ≤ d−1d+1 (lowest bound possi-

ble, see Theorem 4.3.3), but we omit that for the sake of simplicity. With c > 0, such thatsupu∈[0,1]d

∣∣w(u, π)∣∣ ≤ c and∣∣∣(Cn(u)− Cn(uπ)

)2 − (C(u)− C(uπ))2∣∣∣ ≤ 4 sup

u∈[0,1]d

∣∣Cn(u)− C(u)∣∣

as in (5.2.5), we obtain

|γn| ≤ 4c supu∈[0,1]d

∣∣Cn(u)− C(u)∣∣ a.s.−→ 0

for n→∞ (see Deheuvels (1981)).

Next we show |ξn| a.s.−→ 0 in a similar way as in the proof of Proposition A.1 in Genest

et al. (1995). By Definition 5.1.1, Cn : Ω → D[0, 1]d is a random process on some probabilityspace (Ω,A,P). As usual, we suppress the argument ω in most instances. But now, let ω ∈ Ω,

such that(Cn(ω)

)(·) converges to C. Such an ω exists, due to Cn

a.s.−−→ C. Then the mapping

Cn(ω) : [0, 1]d → [0, 1] is a (non-continuous but nonetheless) valid distribution function. As theempiciral copula converges with respect to the supremum, we get pointwise convergence as well,especially in all points u ∈ [0, 1]d where C is continuous (which are all points in [0, 1]d). A

sequence of random vectors Un ∼ Cn(ω) therefore converges in distribution to a vector U ∼ C,

i. e. Und−→ U for n→∞. Now let ψ : [0, 1]d → R be a function with

ψ(u) :=(C(u)− C(uπ)

)2w(u, π)

96

Page 105: Exchangeability of Copulas

5.2. Asymptotics

for u ∈ [0, 1]d. Then ψ is continuous on [0, 1]d as C is continuous by definition and with thecontinuous mapping theorem (see Proposition 3.2.10) follows

Vn := ψ(Un) d−→ ψ(U) =: V

for n→∞. For ε > 0 we obtain

E(|Vn|1+ε

)=

∫[0,1]d

∣∣ψ(u)∣∣1+ε

dCn(u) ≤(

maxu∈[0,1]d

w(u, π))1+ε

for every n ∈ N and thus lim supn→∞E(|Vn|1+ε

)<∞. Then E

(V qn

)→ E

(V q), i. e.∫

[0,1]d

(ψ(u)

)qdCn(u)→

∫[0,1]d

(ψ(u)

)qdC(u)

holds for all q < 1 + ε, in particular for q = 1. For details on this, see for example Section 1 ofChapter VIII in Feller (1971) or Theorem 1.11.3 (and Example 1.11.4 respectively) in van derVaart and Wellner (1996). Now we have E

(Vn)→ E

(V)

for almost every (fixed) ω ∈ Ω and

n→∞, which is |ξn| a.s.−−→ 0. Altogether this yields

|Sn − S| ≤ |γn|+ |ξn| a.s.−→ 0

which completes the proof.

Remark 5.2.3. The assumption on the weight w can be relaxed considerably, e.g., to w ∈ L1[0, 1]d

(i. e. integrability) for (5.2.1) and boundedness in the other cases. Still, these assumptions are notnecessary, as for example

w(u, π) :=

∣∣C(u)− C(uπ)∣∣−1

if∣∣C(u)− C(uπ)

∣∣ > 0

0 otherwise

for u ∈ [0, 1]d and for π ∈ G is neither bounded nor continuous but still yields convergence inTheorem 5.2.2. Of course, integrability of this weight function depends on the copula C.

Remark 5.2.4. Obviously Tn is superior to Tn. If C is exchangeable, then T = 0 as well as T = 0.But if C is non-exchangeable, then T ≤ T as the supremum does not have to be attained in thesame point for all permutations π ∈ G. Similarly, Tn tends to be smaller than Tn. This is adisadvantage in detecting non-exchangeability. Under the null hypothesis, Tn should be moreconservative than Tn. But in most cases of the simulation study in Section 5.4, Tn was rather tooliberal. Therefore, we will not consider Tn any further.

Among others, Ganssler and Stute (1987), and Tsukahara (2005) (see also Tsukahara (2011))showed that if all partial derivatives of C are continuous on [0, 1]d, then

√n(Cn − C

) GC (5.2.6)

for n→∞ on l∞[0, 1]d where GC is a Gaussian process which depends on the copula C (see alsoFermanian et al. (2004)). The process GC can be decomposed as

GC(u) = BC(u)−d∑k=1

BC(1, . . . , 1, uk, 1, . . . , 1)∂

∂ukC(u) (5.2.7)

97

Page 106: Exchangeability of Copulas

5. Tests for Non-Exchangeability

where BC is a tucked down, d-dimensional, Brownian sheet, i. e. a centered Gaussian process with

cov(BC(u),BC(v)

)= C(u ∧ v)− C(u)C(v) (5.2.8)

as covariance function for u,v ∈ [0, 1]d. As usual, u ∧ v denotes the vector whose elements aregiven by the minimum of the respective elements of the vectors u and v. The partial derivativesin (5.2.7) originate from the use of empirical marginal distributions in Cn. If one had access tothe real margins F1, . . . , Fd (or to a sample from the copula), with

Cn(u) :=1

n

n∑i=1

n∏j=1

1(Fj(Xij) ≤ uj

)we obtain

√n(Cn−C

) BC on l∞[0, 1]d with the Brownian sheet BC as above. The continuous

mapping theorem (see Theorem 3.4.9) yields

√n(Cn − C + Cπ − Cπn

) GC −GπC

on l∞[0, 1]d for any π ∈ Sd. Here, the permutation π in the exponent denotes the function withpermuted argument, for example Cπ(u) := C(uπ) for u ∈ [0, 1]d. Note that this convergence doesnot depend on C being exchangeable or not, yet if C is exchangeable (i. e. under H0), we get

√n(Cn − Cπn

) GC −GπC

on l∞[0, 1]d for all π ∈ Sd, whereas for non-exchangeable C, there exists at least one u ∈ [0, 1]d

and one π ∈ Sd, such that√n(Cn(u)− Cn(uπ)

)does not converge.

Theorem 5.2.5. Let G ⊂ Sd and let w(·, π) be continuous for all π ∈ G. If all partial derivativesof C are continuous, then under H0

nRnd−→∑π∈G

∫[0,1]d

(GC(u)−GC(uπ)

)2w(u, π) du =: RC (5.2.9)

nSnd−→∑π∈G

∫[0,1]d

(GC(u)−GC(uπ)

)2w(u, π) dC(u) =: SC (5.2.10)

√nTn

d−→∑π∈G

supu∈[0,1]d

∣∣GC(u)−GC(uπ)∣∣w(u, π)

=: TC (5.2.11)

holds for n→∞.

In order to prove the preceding theorem, some definitions as well as some additional theory isrequired. The proof may be found on page 99. First, we need another space of functions on theunit hypercube, namely functions of bounded variation.

Definition 5.2.6. For a function f ∈ D[0, 1]d the Vitali variation is given by

V V (f) := sup∑i

∣∣f(Vi)∣∣where the Vi are non-overlapping hypercubes, whose union is the unit hypercube. The supremumis taken over all such partitions (into non-overlapping hypercubes Vi) of [0, 1]d and f

(Vi)

is the

98

Page 107: Exchangeability of Copulas

5.2. Asymptotics

f -volume of Vi as in Definition 2.1.1. We call the Hardy variation of f bounded by c > 0, ifV V (f) ≤ α and the Vitali variation of f with respect to any d < d variables is bounded by c aswell. Then, for c > 0,

BVc[0, 1]d :=f ∈ D[0, 1]d

∣∣ Hardy variation of f bounded by c

is referred to as the space of functions with bounded (Hardy) variation.

There exist several different notions of total variation for multivariate functions, for detailssee for example the paper of Clarkson and Adams (1933). Note that in the literature Hardyvariation is sometimes also attributed to Krause.

Next, the definition of Hadamard-differentiability as in Section 3.9 of van der Vaart andWellner (1996) is given. It is needed for the functional delta method of Proposition 5.2.8.

Definition 5.2.7. Let S1 and S2 be normed spaces and let S ⊂ S1. A function φ : S → S2 iscalled Hadamard-differentiable at θ ∈ S if there exists a continuous linear map φ′θ : S1 → S2, suchthat for n→∞

φ(θ + tnhn)− φ(θ)

tn→ φ′θ(h)

holds for all sequences (tn)n∈N and (hn)n∈N for which θ + tnhn ∈ S holds for all n ∈ N andtn → 0 as well as hn → h whenever n→∞. If there exists S0 ⊂ S1, such that h ∈ S0 holds forall h in the preceding definition, then φ is called Hadamard-differentiable tangentially to S0 (andφ′ may only be defined on S0).

The statement of the following proposition is known as functional delta method. It is a versionof Theorem 3.9.4 of van der Vaart and Wellner (1996), where it is stated in more generality andalso a proof may be found.

Proposition 5.2.8. Let S1 and S2 be metric spaces and let S ⊂ S1, S0 ⊂ S1 as well as θ ∈ S.Let φ : S → S2 be Hadamard-differentiable at θ tangentially to S0. Let (Ω,A,P) be a probabilityspace and let X : Ω→ S be tight. If Xn : Ω→ S is a sequence of functions, such that there existsa sequence (rn)n∈N ⊂ R with rn →∞, for which

rn(Xn − θ) X

holds on S1, then

rn(φ(Xn)− φ(θ)

) φ′θ(X)

holds as well on S2 (for n→∞).

With the functional delta method at hand, we are set to prove Theorem 5.2.5.

Proof of Theorem 5.2.5. For the proof of (5.2.9) and (5.2.11), we use the continuous mappingtheorem. To this end, consider the mapping f1 : D[0, 1]d → R with

f1(g) := supu∈[0,1]d

g(u)w(u, π

)=∥∥g · w( · , π)

∥∥∞

for g ∈ D[0, 1]d, given some π ∈ Sd. Then f1 is continuous by the inverse triangular inequality,i. e. ∣∣f1(g1)− f1(g2)

∣∣ =∣∣∣∥∥g1 · w( · , π)

∥∥∞ −

∥∥g2 · w( · , π)∥∥∞

∣∣∣ ≤ ‖g1 − g2‖∞‖w( · , π)‖∞

99

Page 108: Exchangeability of Copulas

5. Tests for Non-Exchangeability

holds for all g1, g2 ∈ D[0, 1]d. Thus f1 is continuous and the continuous mapping theorem (seeTheorem 3.4.9) yields

f1

(√n(Cn − Cπn

)) f1

(GC −GπC

)on l∞[0, 1]d which implies (5.2.11) as Tn was assumed to be a random variable.

Now consider the function f2 : D[0, 1]d → R with

f2(g) :=

∫[0,1]d

g(u)w(u, π) du

for g ∈ D[0, 1]d, given some π ∈ Sd. Then f2 is continuous, because

∣∣f2(g1)− f2(g2)∣∣ ≤ ∫

[0,1]d

∣∣g1(u)− g2(u)∣∣w(u, π) du ≤ ‖g1 − g2‖∞

∫[0,1]d

w(u, π) du

holds for all g1, g2 ∈ D[0, 1]d. The last integral is finite because the weight function w isintegrable by the prerequisites. Thus f2 is continuous and the continuous mapping theorem (seeTheorem 3.4.9) yields

f2

(n(Cn − Cπn

)2) f2

((GC −GπC

)2)on l∞[0, 1]d which implies (5.2.9) as Rn was assumed to be a random variable.

The proof of (5.2.10) is a little more involved, because not only the integrand but also themeasure converges. It is adapted from the proof of the bivariate case, given by Genest et al.(2012). We will show (5.2.10) for G = π, given one permutation π ∈ Sd. The general case thenfollows immediately.

By (5.2.6) and the continuous mapping theorem, under H0, we find(n(Cn − Cπn )2

√n(Cn − C)

)

((GC −GπC)2

GC

)(5.2.12)

in the space l∞[0, 1]d × l∞[0, 1]d (with the supremum-norm, see Fermanian et al. (2004)) forn→∞ and thus

√n

((AnCn

)−(AC

))

((GC −GπC)2

GC

)

with A ≡ 0 and An :=√n(Cn−Cπn )2 on l∞[0, 1]d×l∞[0, 1]d. Now with the mapping φ : l∞[0, 1]d×

BV1[0, 1]d → R, given by

φ(α, β) :=

∫(0,1]d

α(u)w(u, π) dβ(u),

for (α, β) ∈ l∞[0, 1]d × BV1[0, 1]d, we can write

n

∫[0,1]d

(Cn(u)− Cn(uπ)

)2

w(u, π) dCn(u) =√n(φ(An, Cn

)− φ(A,C)

).

100

Page 109: Exchangeability of Copulas

5.3. Test Procedures

Lemma 5.2.9, succeeding this proof, states that φ is Hadamard-differentiable, tangentially toC[0, 1]d ×D[0, 1]d at each (A,B) ∈ l∞[0, 1]d × BV1[0, 1]d, such that

∫|dA| <∞. Applying the

functional delta-method (see Proposition 5.2.8) yields

√n(φ(An, Cn

)− φ(A,C)

) φ′(A,C)

((GC −GπC)2,GC

)=

∫(0,1]d

A(u)w(u, π) dGC(u)︸ ︷︷ ︸=0

+

∫(0,1]d

(GC(u)−GC(uπ)

)2w(u, π) dC

on l∞[0, 1]d which completes the proof.

The incorporation of the weight function in the preceding proof is treated through thecontinuous mapping theorem (and the functional delta method, respectively), it could also beincorporated into the weak convergence of the processes. See, for example, the paper of Berghauset al. (2015) for a special weight function.

Lemma 5.2.9. Let w(·, π) be continuous for each π ∈ Sd. For all c > 0 and all π ∈ Sd, themapping φ : l∞[0, 1]d × BVc[0, 1]d → R with

φ(α, β) :=

∫(0,1]d

α(u)w(u, π) dβ(u)

for (α, β) ∈ l∞[0, 1]d × BVc[0, 1]d is Hadamard-differentiable tangentially to C[0, 1]d ×D[0, 1]d ateach (A,B) ∈ l∞[0, 1]d × BVc[0, 1]d, such that

∫|dA| <∞. The derivative is given by

φ′(A,B)(α, β) =

∫(0,1]d

A(u)w(u, π) dβ(u) +

∫(0,1]d

α(u)w(u, π) dB(u).

Note that∫A(u)w(u, π) dβ(u) exists, even if β is not of bounded variation. See Theorem 8.8

in Chapter III in the book of Hildebrandt (1963) for computing this integral using integration byparts in the two-dimensional case or Lemma 3.3 of Liflyand et al. (2011) for the multidimensionalcase or alternatively see Fermanian (1998). A proof of Lemma 5.2.9 may be found in Appendix A.3.It is based on the proof of Lemma 4.2.2 of Carabarin Aguirre (2008) which is similiar to the proofof Lemma 3.9.17 of van der Vaart and Wellner (1996).

5.3 Test Procedures

As mentioned before, each of the test statistics proposed in Section 5.1 will in general be positive,even if the sample’s underlying copula is exchangeable (i. e. H0 holds). But, as seen in Section5.2, under H0 the test statistics tend to 0 (almost surely) with increasing sample size, wherein contrast under H1 they tend to infinity. As the exact distribution of each test statistic isunkown and the asymptotic distribution depends on the (unknown) copula, we cannot computep-values directly. Instead, (bootstrap-)versions of the test statistic are computed, which should be(asymptotically) distributed according to the asymptotic distribution of the test statistic underH0.

In Example 3 of Romano (1989) a randomization test is suggested. Each random vector in theoriginal sample is permuted according to some (random) permutation. The distribution of therandom vectors in this new sample is exchangeable. Furthermore their distribution is identical tothe distribution of the original vectors if and only if H0 holds. Like this, several new samples arecreated and each time the test statistic is computed. From these results, a p-value is computed.

101

Page 110: Exchangeability of Copulas

5. Tests for Non-Exchangeability

Suppose all partial derivatives of C exist and are continuous. Remillard and Scaillet (2009)propose a multiplier approach to approximate the process BC , based on a multiplier central limittheorem by van der Vaart and Wellner (1996). Then an approximation of a realization of GCcan be computed as in (5.2.7) by using estimates for the partial derivatives of C. Bucher andDette (2010) call this approach pdm-bootstrap and additionally propose a direct multiplier (dm)bootstrap. In the dm-bootstrap, the use of estimates for the partial derivatives is avoided bydirectly approximating GC . First, a simulation study by Bucher and Dette (2010) showed that thepdm-bootstrap performed best in most cases. Second, in our implementation, the pdm-bootstrapalso took substantially less time to compute than the dm-bootstrap or any resampling strategieslike the randomization proposed by Romano (1989). That is why we use the pdm-bootstrap ofRemillard and Scaillet (2009) in the simulation study of Section 5.4. Throughout this and thefollowing sections, notation similar to Bucher and Dette (2010) and Genest et al. (2012) is used.And now for a brief outline of how this pdm-bootstrap works.

Let ξ1, . . . , ξn be i. i. d. random variables which are independent of the sample with E(ξ1) = 0,var(ξ1) = 1 and

‖ξ1‖2,1 :=

∫ ∞0

√P(|ξ1| > x

)dx <∞.

Acoording to van der Vaart and Wellner (1996, Section 2.9, page 177) it should be noted that “inspite of the notation, this is not a norm (but there exists a norm that is equivalent to ‖ · ‖2,1)”.With ξn := 1

n

∑ni=1 ξi and

C∗n(u) :=1

n

n∑i=1

ξi1(Ui ≤ u

)where, as usual, 1(x ≤ y) = 1 if and only if xj ≤ yj holds for all j ∈ 1, . . . , d, Remillard andScaillet (2009) state that if all partial derivatives of C are continuous on [0, 1]d, then(√

n(Cn − C

),√n(C∗n − ξnCn

))> (BC ,B

′C

)>(5.3.1)

for n → ∞ in D[0, 1]d ×D[0, 1]d. Note that in Cn the true margins are used for transforming

the sample (i. e. Fj , such that Cn is the empirical distribution function of copula C) but in Cnthe empirical margins are employed (i. e. Fjn as in (5.1.1)). The process B′C is an independentcopy of the Brownian sheet BC whose covariance structure is given in (5.2.8). Under the sameassumptions, with

B∗n(u) :=√n(C∗n(u)− ξnCn(u)

),

G∗n(u) := B∗n(u)−d∑l=1

B∗n(1, . . . , 1, ul, 1, . . . , 1)Cl,n(u)

Remillard and Scaillet (2009) have shown that(√n(Cn − C

), G∗n

)> (GC ,G

′C

)>(5.3.2)

on l∞[0, 1]d × l∞[0, 1]d in their Theorem 2.1, if Cl,n is a consistent estimator for the l-th partialderivative of C. The process G′C is an independent copy of GC . In Section 5.4 we use

Cl,n(u) :=Cn(u+ hnel)− Cn(u− hnel)

2hn

102

Page 111: Exchangeability of Copulas

5.3. Test Procedures

with hn := 1√n

and el the l-th column of the d× d identity matrix as in Remillard and Scaillet

(2009) or Bucher and Dette (2010) and for which the former (see Proposition A.2) showed uniformconsistency under the prerequisite of continuity of ∇C. Genest et al. (2012) state conditions (seeProposition 5) under which other choices of hn still yield (5.3.2). Proposition 3.2 of Segers (2012)shows (5.3.2) under weaker assumptions on C and its partial derivatives.

By (5.3.1), the continuous mapping theorem and with

D∗n,π(u) :=1√n

n∑i=1

(ξi − ξn

) (1(Ui ≤ u

)− 1

(Ui ≤ uπ

))=√n(C∗n(u)− ξnCn(u)− C∗n(uπ) + ξnCn(uπ)

)we obtain D∗n,π BC −BπC on l∞[0, 1]d for any permutation π ∈ Sd. As usual BπC is given by

BπC(u) := BC(uπ) for all u ∈ [0, 1]d. Furthermore, with

Dn,π(u) := D∗n,π(u)−d∑l=1

D∗n,π(1, . . . , 1, ul, 1, . . . , 1)Cl,n(u),

the continuous mapping theorem and (5.3.2), we get(Dn,π,

√n(Cn − C

))> (GC −GπC ,G′C

)>(5.3.3)

on l∞[0, 1]d × l∞[0, 1]d. Again G′C is an independent copy of GC . The following propositionclarifies, how to derive bootstrap-versions of the test statistics from (5.3.2).

Proposition 5.3.1. Let G ⊂ Sd and let w(·, π) be continuous for all π ∈ G. If all partialderivatives of C are continuous, and

Rn :=1

n

∑π∈G

∫[0,1]d

(Dn,π(u)

)2

w(u, π) du

Sn :=1

n

∑π∈G

∫[0,1]d

(Dn,π(u)

)2

w(u, π) dCn(u)

Tn :=1√n

∑π∈G

supu∈[0,1]d

∣∣∣Dn,π(u)∣∣∣w(u, π)

then (

nRn, nRn

)> d−→(RC ,R′C

)>(nSn, nSn

)> d−→(SC ,S′C

)>(√

nTn,√nTn

)> d−→(TC ,T′C

)>holds for n → ∞ under both hypotheses. Each limiting vector has independent and identicallydistributed elements and Rn, Sn, and Tn as well as RC , SC , and TC are given in Theorem 5.2.5.

Therefore, we use Rn, Sn, and Tn as bootstrap-versions of Rn, Sn, and Tn respectively. Theconvergence of nRn and

√nTn follows directly from (5.3.3) and the continuous mapping theorem.

103

Page 112: Exchangeability of Copulas

5. Tests for Non-Exchangeability

For the convergence of nSn, an argument analogous to the proof of Theorem 5.2.5 can be used,where (5.2.12) is replaced by an adequate version of (5.3.3).

Let M ∈ N. If, for each h = 1, . . . ,M , such a realization of ξ(h)1 , . . . , ξ

(h)n (independent of the

sample and each other) and thus M versions of R(h)n , S

(h)n or T

(h)n as in Proposition 5.3.1 are

created, then an approximate p-value can be computed by

1

M

M∑h=1

1(R(h)n > Rn

),

1

M

M∑h=1

1(S(h)n > Sn

)or

1

M

M∑h=1

1(T (h)n > Tn

)(5.3.4)

respectively.In practice and just like in Genest et al. (2012), it is possible to speed up the computation

of the bootstrap-versions R(h)n , S

(h)n or T

(h)n in the following way. Given Ui, computed from a

sample Xi and given a realization of ξ(h)i (i = 1, . . . , n), we write

Dn,π(u) =1√n

Ξ(h)n ·Qn,π(u)

with the vectors

Ξ(h)n :=

((ξ

(h)1 − ξ(h)

n

). . .

(h)n − ξ(h)

n

))∈ R1×n, (5.3.5)

Qn,π(u) := Pn,π(u)−n∑l=1

Pn,π(1, . . . , 1, ul, 1, . . . , 1)Cl,n(u) ∈ Rn×1 (5.3.6)

and Pn,π(u) ∈ Rn×1, such that the i-th component is given by

(Pn,π(u)

)i

:=

d∏k=1

1(Uik ≤ uk

)−

d∏k=1

1(Uik ≤ uπ(k)

)(5.3.7)

for i ∈ 1, . . . , n. Therefore, we find the representations

R(h)n =

1

n2

∑π∈G

∫[0,1]d

(Ξ(h)n Qn,π(u)

)2

w(u, π) du

≈ 1

n2Nd

∑π∈G

N∑i1,...,id=1

(Ξ(h)n Qn,π

(i1N , . . . ,

idN

))2

w(i1N , . . . ,

idN , π

) (5.3.8)

S(h)n =

1

n

∑π∈G

∫[0,1]d

(Ξ(h)n Qn,π(u)

)2

w(u, π) dCn(u)

=1

n3

∑π∈G

n∑i=1

(Ξ(h)n Qn,π

(Ui))2

w(Ui, π

) (5.3.9)

T (h)n =

1

n

∑π∈G

supu∈[0,1]d

∣∣Ξ(h)n Qn,π(u)

∣∣w(u, π)

≈ 1

n

∑π∈G

maxi∈1,...,Nd

∣∣∣Ξ(h)n Qn,π

(i1N , . . . ,

idN

)∣∣∣w( i1N , . . . , idN , π) (5.3.10)

104

Page 113: Exchangeability of Copulas

5.3. Test Procedures

for large N ∈ N and G ⊂ Sd. Note that the evaluations of Qn,π have to be computed just

once and then, with each realization of Ξ(h)n a new bootstrap-version of the test statistic can be

computed. In the following small simulation study, it will prove to be an advantage in terms of

computation time that for S(h)n only n evaluations of each Qn,π have to be computed as opposed

to Nd in the other cases.To conclude this section, we sum things up in the following algorithms, depending on the

choice of test statistic. For either algorithm, the vectors x1, . . . ,xn ∈ Rd which are realizationsof the random vectors X1, . . . ,Xd ∼ H with copula C are used as input. In a first step, theyare transformed to realizations u1, . . . , un ∈ [0, 1]d via a rank transformation with the empiricaldistribution functions of the margins as in (5.1.2). Each test is conducted at a level α ∈ (0, 1).Usually α = 0.05 or α = 0.1 are used. An appropriate weight function w must be chosen as wellas the number M ∈ N of bootstraps and for Rn and Tn a number N ∈ N for the approximation ofthe integral or the supremum, respectively. Of course, M and N should be as large as computationtime allows.

Algorithm 5.1 Testing for non-exchangeability with Rn at level α

Compute Rn as in (5.1.3).for i1, . . . , id ∈ 1, . . . , N, π ∈ G do

Compute Qn,π(i1N , . . . ,

idN

)as in (5.3.6).

Compute w(i1N , . . . ,

idN , π

)end forfor h ∈ 1, . . . ,M do

Generate a realization ξ(h)1 , . . . , ξ

(h)n of a random variable ζ with E(ζ) = 0, var(ζ) = 1 and

‖ζ‖2,1 <∞, for example ζ ∼ Exp(1).

Compute Ξ(h)n as in (5.3.5).

Compute an approximation of R(h)n as in (5.3.8).

end forCompute p := 1

M

∑Mh=1 1

(R

(h)n > Rn

).

if p ≥ α thenDo not reject H0.

elseReject H0.

end if

105

Page 114: Exchangeability of Copulas

5. Tests for Non-Exchangeability

Algorithm 5.2 Testing for non-exchangeability with Sn at level α

Compute Sn as in (5.1.4).for i ∈ 1, . . . , N, π ∈ G do

Compute Qn,π(Ui)

as in (5.3.6).

Compute w(Ui, π

)end forfor h ∈ 1, . . . ,M do

Generate a realization ξ(h)1 , . . . , ξ

(h)n of a random variable ζ with E(ζ) = 0, var(ζ) = 1 and

‖ζ‖2,1 <∞, for example ζ ∼ Exp(1).

Compute Ξ(h)n as in (5.3.5).

Compute S(h)n as in (5.3.9).

end forCompute p := 1

M

∑Mh=1 1

(S

(h)n > Sn

).

if p ≥ α thenDo not reject H0.

elseReject H0.

end if

Algorithm 5.3 Testing for non-exchangeability with Tn at level α

Compute Tn as in (5.1.5).for i1, . . . , id ∈ 1, . . . , N, π ∈ G do

Compute Qn,π(i1N , . . . ,

idN

)as in (5.3.6).

Compute w(i1N , . . . ,

idN , π

)end forfor h ∈ 1, . . . ,M do

Generate a realization ξ(h)1 , . . . , ξ

(h)n of a random variable ζ with E(ζ) = 0, var(ζ) = 1 and

‖ζ‖2,1 <∞, for example ζ ∼ Exp(1).

Compute Ξ(h)n as in (5.3.5).

Compute an approximation of T(h)n as in (5.3.10).

end forCompute p := 1

M

∑Mh=1 1

(T

(h)n > Tn

).

if p ≥ α thenDo not reject H0.

elseReject H0.

end if

106

Page 115: Exchangeability of Copulas

5.4. Simulation Study and Data Application

5.4 Simulation Study and Data Application

The proposed test was implemented using R (see R Core Team (2014)) and simulations wereconducted on Ulm University’s CUSS Compute-Server. The implementation was also used withdata from a study by the United States Departement of Agriculture. As mentioned before, weconsidered the sets

G1 :=

(1i) ∈ Sd | 2 ≤ i ≤ d

,

G2 :=

(12), (12 . . . d) (5.4.1)

of permutations generating Sd. As a weight function, we used different exponents of the mappingwm : [0, 1]d × Sd →

[0, d−1

d+1

]with

wm(u, π) := min

Md(u), ω(u, π), d− 1 +Md(u)−

d∑i=1

ui

(5.4.2)

where

ω(u, π) :=

|ui − uj | if π = (ij),∑di=d d2 e+1

(u(i) −Md(u)

)else

and u(i) denotes the i-th smallest value in u, i. e. u(1) = Md(u) (with Md as in Definition 2.2.3).

Note that wm is a best-possible upper bound for∣∣C(u)−C(uπ)

∣∣ in the case of any transpositionπ = (ij) and any permutation π that yields maximal non-exchangeability. For all transpositionsand the order-reversing permutation π, this is a consequence of Lemma 4.3.4, Corollary 4.3.7 andLemma 4.4.1. The mapping wm is also a best upper bound for

∣∣C(u)−C(uσ)∣∣ (with the leftshift

σ = (12 . . . d) we used in G2) as there exists a copula C and u∗ ∈ (0, 1]d, such that∣∣C(u∗)− C(u∗σ)∣∣ =

d− 1

d+ 1= w(u∗, σ) = wm(u∗, σ)

holds. One such copula C and vector u∗ may be found by permuting the arguments of C∗ asin (4.3.11), as well as the elements of u∗ as in (4.3.10), appropriately. Furthermore, wm( · , π) iscontinuous for all π ∈ Sd and bounded by d−1

d+1 .The simulation study consists of three stages. In the first stage (see Subsection 5.4.1), the

performance of Rn, Sn and Tn is compared, without weights (i. e. w ≡ 1) and using the smallestgenerating set G2. In the second stage (see Subsection 5.4.2), the performance of Sn using thegenerating sets G1 and G2 is analyzed. Again, no weights are used. In the third stage (seeSubsection 5.4.3), the implications of the use of a weight function in Sn, which is different fromw ≡ 1, are considered, while the generating set is fixed (i. e., G = G2).

After the third stage of the simulation study, the test statistic Sn is used with a datasetthat was not simulated but is the result of a study on nutrients, whose details are given inSubsection 5.4.4. Four combinations of weights and generating sets are used.

A brief discussion of the results of the simulation study and of the application to the nutrientdataset follows in Subsection 5.4.5.

For every case in every stage of the simulation study, a sample of n = 1 000 random vectors ofa given copula was created and an empirical p-value, based on M = 1 000 bootstrap-variables wascomputed. Bootstrap-versions of the test statistics were always based on exponentially distributed

ξ(h)i ∼ Exp(1). Note that Z

(h)i := ξ

(h)i − 1 fulfills

E(Z

(h)i

)= 0, Var

(Z

(h)i

)= 1,

∥∥∥Z(h)i

∥∥∥2,1<∞,

107

Page 116: Exchangeability of Copulas

5. Tests for Non-Exchangeability

Table 5.1: Fraction of rejections out of 1 000 samples, each based on n = 1 000 realizations andM = 1 000 bootstrap-variables, dimension d = 3, no weights, generating set G = G2, approximation

for R(h)n and T

(h)n with N = 20.

α = 0.05 α = 0.1

Copula Rn Sn Tn Rn Sn Tn

Π 0.034 0.044 0.031 0.058 0.083 0.075Clayton 0.028 0.037 0.037 0.051 0.082 0.085Frank 0.014 0.030 0.034 0.046 0.075 0.063

as in Section 5.3 but both, ξ(h)i and Z

(h)i yield the same Ξ

(h)n in (5.3.5). Other distributions

(normal or uniform) of ξ(h)i did not affect the outcome. Therefore we used the Exp(1)-distribution

for all simulations. This was repeated 1 000 times, which means that in each case 1 000 empiricalp-values were computed, based on independent samples.

5.4.1 First Stage

For the first stage, random vectors of dimension d = 3 were created with independent andidentically distributed components following a uniform distribution on [0, 1], i. e. a sample fromthe independence copula Π. The test statistics Rn, Sn and Tn were computed with w ≡ 1 and

G = G2. For R(h)n and T

(h)n an approximation with N = 20 as described in the end of Section 5.3

was used. A graphical comparison of the ordered p-values is given in Figure 5.1a.The same was carried out with realizations of a three-dimensional Clayton copula (see

Example 2.4.21) and a three-dimensional Frank copula (see Example 2.4.21). For each family, theparameter was chosen corresponding to Kendall’s tau of 1

3 (see Definition 2.6.9 and Table 2.2).The fraction of rejections according to the significance levels α = 0.05 and α = 0.1 is given inTable 5.1 for each of the three distributions.

In order to compare the test statistics in a non-exchangeable setting, the above was repeatedwith random vectors which are distributed according to a maximal non-exchangeable copula asin (4.3.11). We used a non-singular version where mass 1

2 is uniformly distributed within each ofthe cubes

[0, 1

2

]×[

12 , 1]×[

12 , 1]

and[

12 , 1]×[0, 1

2

]×[0, 1

2

], i. e.

C(u) = 4(u1 ∧ 1

2

)(u2 − 1

2

)+(u3 − 1

2

)++ 4(u1 − 1

2

)+(u2 ∧ 1

2

)(u3 ∧ 1

2

)for u ∈ [0, 1]d. This copula is maximal non-exchangeable in the sense that

maxπ∈S3

supu∈[0,1]3

∣∣C(u)− C(uπ)∣∣ =

1

2

holds. For each test statistic, all of the 1 000 p-values, which were computed, were 0.In order to assess the power of the test statistics, samples from non-exchangeable random

vectors of dimension d = 3 were generated. They were distributed according to nested (orhierarchical) Clayton copulas Cfav

θ0,θ1(favorable case, see below) and Cadv

θ0,θ1(adverse case, see

below) with

Cfavθ0,θ1(u) = Cθ0

(u1, Cθ1(u2, u3)

), Cadv

θ0,θ1(u) = Cθ0(Cθ1(u1, u2), u3

)(5.4.3)

for different values of (θ0, θ1)> and Cθ0 , Cθ1 being Clayton copulas (see Example 2.4.21). Notethat Cfav

θ0,θ1and Cadv

θ0,θ1are valid copulas, if θ0 ≤ θ1 (see e. g. Joe (1997)). For θ0 6= θ1 the copulas

108

Page 117: Exchangeability of Copulas

5.4. Simulation Study and Data Application

0 200 400 600 800 1000

0.0

0.2

0.4

0.6

0.8

1.0

ranks

computedp-values

RnSnTn

(a) w ≡ 1

0 200 400 600 800 1000

0.0

0.2

0.4

0.6

0.8

1.0

ranks

computedp-values

δ = − 12

δ = 0δ = 2

(b) Sn, w = wδm

Figure 5.1: Ordered p-values for 1 000 simulations of data from the independence copula,n = M = 1 000, N = 20, d = 3, G = G2, using different test statistics (with fixed weightfunction) in (a) and different weights (with fixed test statistic Sn) in (b). The weight functionwm is given in (5.4.2).

0.00 0.05 0.10 0.15

0.0

0.2

0.4

0.6

0.8

1.0

τ1 − τ0

empirical

pow

er

Cfav

Cadv

α = 0.05α = 0.1

(a) G = G2

0.00 0.05 0.10 0.15

0.0

0.2

0.4

0.6

0.8

1.0

τ1 − τ0

empirical

pow

er

Cfav

Cadv

α = 0.05α = 0.1

(b) G = G1

Figure 5.2: Empirical power of Sn based on 1 000 samples using nested Clayton copulasCfavθ0,θ1

and Cadvθ0,θ1

as in (5.4.5), n = M = 1 000, d = 4, w ≡ 1.

109

Page 118: Exchangeability of Copulas

5. Tests for Non-Exchangeability

Table 5.2: Empirical power estimated from 1 000 samples, each based on n = 1 000 randomvectors, M = 1 000 bootstrap-variables, dimension d = 3, generating set G2, no weights (w ≡ 1),according to α = 0.05, favorable and adverse case as in (5.4.3).

favorable adverse

Copula k Rn Sn Tn Rn Sn Tn

Clayton 0 0.021 0.044 0.026 0.023 0.033 0.0261 0.146 0.182 0.156 0.063 0.112 0.0712 0.643 0.667 0.567 0.353 0.416 0.2483 0.973 0.981 0.952 0.850 0.865 0.5984 1.000 1.000 1.000 0.991 0.999 0.9055 1.000 1.000 1.000 1.000 1.000 0.997

Frank 0 0.019 0.032 0.025 0.015 0.030 0.0201 0.121 0.149 0.153 0.057 0.083 0.0692 0.563 0.597 0.621 0.259 0.311 0.2533 0.913 0.932 0.960 0.731 0.781 0.6624 0.997 1.000 1.000 0.965 0.973 0.9445 1.000 1.000 1.000 0.999 1.000 0.993

Cfavθ0,θ1

and Cadvθ0,θ1

are not exchangeable and thus R, S and T in Lemma 5.2.2 are positive. Butthe test statistics tend to be larger, when the underlying copula is non-exchangeable under allpermutations in the generating set G, like Cfav

θ0,θ1(u) and unlike Cadv

θ0,θ1(u), as far as G1 and G2 are

concerned. Parameters (θ0, θ1) were chosen corresponding to Kendall’s tau of

τ0 =5

12− k

60, τ1 =

5

12+

k

60(5.4.4)

for k ∈ 0, . . . , 5, which yields θ0 = θ1 = 107 for k = 0 and (θ0, θ1)> = (1, 2)> for k = 5. All

realizations of Ui ∼ Cadvθ0,θ1

in the simulation study were generated by shifting the components

of the realizations of the vectors Ui ∼ Cfavθ0,θ1

which were used in the simulation of the favorable

case, as (U1, U2, U3)> ∼ Cfavθ0,θ1

implies (U2, U3, U1)> ∼ Cadvθ0,θ1

. This was repeated with Cθ0 , Cθ1from the Frank family. Empirical power for all three test statistics and both copulas (α = 0.05)is given in Table 5.2.

5.4.2 Second Stage

As mentioned before, no weights are used in the second stage of the simulation study and theperformance of the test statistic Sn was compared under the generating sets G1 and G2. At first,empirical p-values were computed, based on samples from exchangeable distributions, i. e. theindependence copula Π, a Clayton copula and a Frank copula, each for d = 3 and d = 4. Theparameter for the Archimedean copulas was chosen corresponding to Kendall’s tau of 1

3 . Theratio of rejections according to the significance levels α = 0.05 and α = 0.1 for all cases are givenin Table 5.3.

In order to assess the power of Sn with different generating sets (but with fixed weightsw ≡ 1), samples from non-exchangeable random vectors were generated. The random vectorswere distributed according to nested (or hierarchical) Archimedean copulas Cfav

θ0,θ1(favorable case,

as in the first stage) and Cadvθ0,θ1

(adverse case, as in the first stage). For dimension d = 3, thenesting structure as in the first stage, i. .e. as in (5.4.3) was used. For dimension d = 4, the

110

Page 119: Exchangeability of Copulas

5.4. Simulation Study and Data Application

Table 5.3: Fraction of rejections out of 1 000 samples, each based on n = 1 000 realizations andM = 1 000 bootstrap-variables, test statistic Sn, dimension d, generating sets G1 = (12), . . . , (1d)or G2 = (12), (1 . . . d), weight function w with wm as in (5.4.2).

α = 0.05 α = 0.1

w ≡ 1 w ≡ 1 w = w2m w = w

− 12

m w ≡ 1 w ≡ 1 w = w2m w = w

− 12

m

Cop. d G = G1 G = G2 G = G2 G = G2 G = G1 G = G2 G = G2 G = G2

Π 3 0.046 0.044 0.052 0.043 0.091 0.083 0.097 0.075Cl. 0.041 0.037 0.037 0.038 0.079 0.082 0.085 0.073Fr. 0.036 0.030 0.036 0.028 0.079 0.075 0.089 0.063Π 4 0.045 0.043 0.038 0.039 0.098 0.082 0.094 0.072Cl. 0.032 0.040 0.041 0.032 0.081 0.070 0.079 0.066Fr. 0.036 0.036 0.037 0.028 0.081 0.061 0.077 0.059

nesting structure

Cfavθ0,θ1(u) = Cθ0

(u1, Cθ1(u2, u3, u4)

), Cadv

θ0,θ1(u) = Cθ0(Cθ1(u1, u2, u3), u4

)(5.4.5)

was used. Note that, in each dimension and for each of the generating sets G1 and G2, the copulaCfavθ0,θ1

is non-exchangeable under all permutations π ∈ G, whereas Cadvθ0,θ1

(u) is non-exchangeable

for exactly one permutation π ∈ G. Each was realized for different values of (θ0, θ1)>, withKendall’s tau as in (5.4.4), and Cθ0 , Cθ1 from the Clayton- as well as the Frank-family of copulas.The fraction of rejections at the level α = 0.05 in each case are given in Table 5.4. Empirical-power-plots for the favorable and the adverse case and for both G = G1 as well as G = G2, usingthe samples from the four-dimensional, nested Clayton-copula may be found in Figure 5.2.

5.4.3 Third Stage

For the third stage, the same realizations as in the second stage were used. But this time, thegenerating set was fixed to be G2 and either w ≡ 1 or w = w2

m as in (5.4.2) was used. The ideawas to put more weight on areas where potentially large non-exchangeability might occur and toput less weight to areas near the diagonal and the margins, where less non-exchangeability ispossible for C but where (relatively) large non-exchangeability might occur due to using Cn based

on rank-transformed data Ui. We conjecture that a result similar to Berghaus et al. (2015) holds,i. e. that w = wδm yields asymptotic convergence as well, for δ ∈ (−1, 0). Therefore everything

in the third stage was repeated for w = w− 1

2m , as can be seen in Table 5.4. For δ = −2, d = 3,

n = M = 1 000 and a sample from the independence copula, no reasonable results were produced,i. e. out of 1 000 empirical p-values, not one was smaller than 0.1. Ordered p-values computed fromsamples of the independence copula with Sn and using w = wδm for δ ∈

− 1

2 , 0, 2

are depictedin Figure 5.1b. The ratio of rejections according to the significance levels α = 0.05 and α = 0.1for all exchangeable cases are given in Table 5.3 and for all non-exchangeable cases in Table 5.4.

5.4.4 Nutrient Data

Johanna Neslehova kindly provided us with the dataset which was used by McNeil and Neslehova(2010) to demonstrate the advantages of a non-exchangeable extension to the class of Archimedeancopulas they introduced, so-called Liouville copulas. Furthermore, the same dataset was used

111

Page 120: Exchangeability of Copulas

5. Tests for Non-Exchangeability

Table 5.4: Empirical power estimated from 1 000 samples, each based on n = 1 000 random vectors,M = 1 000 bootstrap-variables, test statistic Sn, dimension d, generating sets G1 = (12), . . . , (1d)or G2 = (12), (1 . . . d), weight function w with wm as in (5.4.2), according to α = 0.05, favorableand adverse case as in (5.4.3) and (5.4.5).

favorable adverse

w ≡ 1 w ≡ 1 w = w2m w = w

− 12

m w ≡ 1 w ≡ 1 w = w2m w = w

− 12

m

C d k G = G1 G = G2 G = G2 G = G2 G = G1 G = G2 G = G2 G = G2

Cl. 3 0 0.041 0.044 0.046 0.042 0.038 0.033 0.043 0.0301 0.202 0.182 0.192 0.167 0.120 0.112 0.119 0.0902 0.736 0.667 0.668 0.651 0.473 0.416 0.454 0.3773 0.990 0.981 0.974 0.975 0.887 0.865 0.885 0.8414 1.000 1.000 1.000 1.000 0.996 0.999 0.998 0.9975 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

Fr. 0 0.032 0.032 0.040 0.028 0.037 0.030 0.031 0.0281 0.170 0.149 0.197 0.127 0.097 0.083 0.118 0.0702 0.662 0.597 0.674 0.519 0.385 0.311 0.464 0.2483 0.960 0.932 0.952 0.910 0.816 0.781 0.881 0.6954 1.000 1.000 1.000 0.998 0.984 0.973 0.992 0.9615 1.000 1.000 1.000 1.000 0.999 1.000 1.000 0.999

Cl. 4 0 0.034 0.032 0.044 0.028 0.032 0.035 0.043 0.0341 0.333 0.254 0.247 0.243 0.126 0.125 0.159 0.1062 0.914 0.845 0.788 0.852 0.544 0.611 0.620 0.5723 1.000 0.998 0.994 0.998 0.943 0.970 0.965 0.9684 1.000 1.000 1.000 1.000 0.998 0.999 0.999 0.9995 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

Fr. 0 0.026 0.030 0.044 0.028 0.034 0.035 0.043 0.0341 0.253 0.188 0.247 0.243 0.098 0.092 0.159 0.1062 0.845 0.720 0.788 0.852 0.415 0.454 0.620 0.5723 0.994 0.986 0.994 0.998 0.862 0.910 0.965 0.9684 1.000 1.000 1.000 1.000 0.996 0.998 0.999 0.9995 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

by Genest et al. (2012) in order to demonstrate the existence of non-exchangeability in the realworld by applying the test they introduced to the bivariate margins of said dataset. We refer tothis dataset as “nutrient data” because, according to Genest et al. (2012), it contains the dailyintake of five nutrients (calcium, iron, protein, vitamin A, and vitamin C) by n = 737 womenwhose age ranged between 25 and 50 years. The values in the nutrient dataset were originallycollected by the United States Departement of Agriculture in 1985. The dataset is assumed to bea sample of size n = 737 from an unknown five-dimensional distribution.

A test based on Sn was conducted on all d-dimensional margins of the nutrient data ford ∈ 2, . . . , 5. Each of the two generating sets G1 and G2 of (5.4.1) was used together with theweight function w2

m as in (5.4.2) and additionally without any weights, i. e., w ≡ 1. Just as in thesimulation study, the empirical p-values were computed based on M = 1 000 bootstrap-variables

using exponentially distributed ξ(h)i ∼ Exp(1). The empirical p-values that were computed for

the bivariate margins are given in Table 5.5. Of course, there is only one generating set inthe two-dimensional case. Plots of the rank-transformed bivariate margins of the dataset for

112

Page 121: Exchangeability of Copulas

5.4. Simulation Study and Data Application

Table 5.5: Empirical p-values for bivariate margins of the nutrient dataset, based on M = 1 000 bootstrap-variables, test statistic Sn, weight function w with wm as in (5.4.2) and generating set G = {(12)}.

nutrients                  w ≡ 1    w = wm²
calcium   iron             0.003    0.002
          protein          0        0
          vitamin A        0        0
          vitamin C        0.190    0.093
iron      protein          0.411    0.510
          vitamin A        0.002    0.004
          vitamin C        0.004    0.011
protein   vitamin A        0.003    0.004
          vitamin C        0.135    0.145
vitamin A vitamin C        0.609    0.547

Table 5.6: Empirical p-values for three-dimensional margins of the nutrient dataset, based on M = 1 000 bootstrap-variables, test statistic Sn, weight function w with wm as in (5.4.2) and generating sets G1 = {(12), (13)} or G2 = {(12), (123)}.

                                        w ≡ 1              w = wm²
nutrients                           G = G1   G = G2    G = G1   G = G2
calcium iron      protein           0        0         0        0
                  vitamin A         0        0         0        0
                  vitamin C         0.002    0.003     0.009    0.010
        protein   vitamin A         0        0         0.001    0.001
                  vitamin C         0.009    0.007     0.016    0.004
        vitamin A vitamin C         0        0         0        0
iron    protein   vitamin A         0        0         0        0
                  vitamin C         0        0         0        0
        vitamin A vitamin C         0.001    0         0        0
protein vitamin A vitamin C         0.001    0.001     0.001    0.001

Plots of the rank-transformed bivariate margins of the dataset for which exchangeability was rejected (p < 0.05) are given in Figure 5.3. Plots of the remaining rank-transformed bivariate margins of the dataset are given in Figure 5.4. The p-values for the three-dimensional margins may be found in Table 5.6 and those for the four-dimensional margins in Table 5.7. The p-values that were computed with all of the five nutrients (i. e., the complete dataset) were all 0, no matter which generating set or weight function was used. This means that every single bootstrap-variable was smaller than the test statistic. The p-values for the nutrient data based on M = 10 000 bootstrap-variables are given in Table 5.8.

5.4.5 Discussion

We formulate our conclusions of the simulation study with respect to four aspects:

Test statistics: In a first stage, comparing Rn, Sn and Tn, in most cases Sn performed best under both hypotheses. However, there are some cases (i. e. the independence copula and the Frank copula in the favorable setting) where Tn was slightly better. Computation time for a test


[Figure 5.3 consists of six scatter plots on [0, 1]², one for each rejected pair: calcium vs. iron, calcium vs. protein, calcium vs. vitamin A, iron vs. vitamin A, iron vs. vitamin C, and protein vs. vitamin A; both axes of each panel run from 0.0 to 1.0.]

Figure 5.3: Plots of the rank-transformed bivariate margins of the nutrient dataset for which exchangeability was rejected (p < 0.05).


[Figure 5.4 consists of four scatter plots on [0, 1]², one for each pair that was not rejected: calcium vs. vitamin C, iron vs. protein, protein vs. vitamin C, and vitamin A vs. vitamin C; both axes of each panel run from 0.0 to 1.0.]

Figure 5.4: Plots of the rank-transformed bivariate margins of the nutrient dataset for which exchangeability was not rejected (p ≥ 0.05).

Table 5.7: Empirical p-values for four-dimensional margins of the nutrient dataset, based on M = 1 000 bootstrap-variables, test statistic Sn, weight function w with wm as in (5.4.2) and generating sets G1 = {(12), (13), (14)} or G2 = {(12), (1234)}.

                                                 w ≡ 1              w = wm²
nutrients                                    G = G1   G = G2    G = G1   G = G2
calcium iron    protein   vitamin A          0        0         0        0
                          vitamin C          0        0         0        0
                vitamin A vitamin C          0        0         0        0
        protein vitamin A vitamin C          0.002    0         0.003    0
iron    protein vitamin A vitamin C          0        0         0        0


Table 5.8: Empirical p-values for the nutrient dataset, based on M = 10 000 bootstrap-variables, test statistic Sn, weight function w with wm as in (5.4.2) and generating sets G1 = {(12), (13), (14), (15)} or G2 = {(12), (12345)}.

                                                 w ≡ 1              w = wm²
nutrients                                    G = G1   G = G2    G = G1   G = G2
calcium, iron, protein, vitamin A, vitamin C 0.0005   0         0.0013   0.0001

decision based on Rn and Tn was more than 4 times larger for sample size n = 1 000 (and more than 17 times larger for n = 3 000) than computation time for a test decision based on Sn. Hence, we did not consider Rn or Tn for larger values of d or in other stages of the simulation study.

Generating permutations: Both generating sets yielded similar results for exchangeable copulas. In the case of a sample from a non-exchangeable copula and dimension d = 3, the use of the generating set G1 resulted in slightly larger empirical power in almost all cases. Why this happened is unclear, as for both generating sets, the copulas are exchangeable under no permutation (in the favorable case) or under exactly one permutation (in the adverse case) in the generating set. For dimension d = 4, G1 performed better in the favorable case. This was expected, as the underlying copula is non-exchangeable under all permutations in G1 and G2, but G1 consists of one permutation more than G2. However, in the adverse case, G2 performed better, maybe due to |C(u) − C(u_π)| taking potentially larger values for π = σ than for π = (1i), as the upper bound wm is smaller in the latter case. Another downside of G1 was that its computation time was about 1.5 times longer than the computation time for a test decision with G2. This was expected, as the most time-consuming step was the evaluation of Qn,π for each permutation, of which there are three (i. e. d − 1) in G1 but only two (for any d ≥ 2) in G2.
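Both G1 and G2 are generating sets of the symmetric group Sd (cf. Conrad, 2013), which is the property the test requires of G. For small d this can be verified by brute force; the following sketch (hypothetical helper code, not taken from the simulation study) computes the group generated by each set through repeated composition and checks that its order is d!.

    from itertools import permutations

    def compose(p, q):
        # permutations as tuples of 0-based images: (p o q)(j) = p(q(j))
        return tuple(p[j] for j in q)

    def generated_group(generators, d):
        group = set(generators) | {tuple(range(d))}
        while True:
            new = {compose(p, q) for p in group for q in group} - group
            if not new:
                return group
            group |= new

    def transposition(i, j, d):
        p = list(range(d))
        p[i], p[j] = p[j], p[i]
        return tuple(p)

    d = 4
    G1 = [transposition(0, k, d) for k in range(1, d)]        # (12), (13), (14)
    G2 = [transposition(0, 1, d), tuple(range(1, d)) + (0,)]  # (12), (1234)
    n_perms = len(list(permutations(range(d))))               # d! = 24
    assert len(generated_group(G1, d)) == n_perms
    assert len(generated_group(G2, d)) == n_perms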

Weights: In the third stage, w = wm² as defined in (5.4.2) was slightly superior to w ≡ 1 and w = wm^(−1/2) for most samples from exchangeable as well as from non-exchangeable copulas.

Dimensionality: The larger the dimension d, the more important efficient procedures become. This is why we focus on the smallest generating set G2 and on the test statistic Sn. The power of the test shows a slight increase from d = 3 to d = 4 in the simulation study of Section 5.4, but in general one should expect a slight decrease for larger d, since there is a larger number of subsets of variables, each of which might be exchangeable, i. e. there are more potential cases where the underlying (non-exchangeable) distribution is exchangeable under some of the permutations in the generating set G, just like in the adverse case of Section 5.4.

Therefore, we propose to use Sn with the weight function w = wm² and the generating set G2, unless computation time is not an issue, in which case one should use G as large as possible.

The application of Sn to various margins of the nutrient data confirmed the findings of Genest et al. (2012) that there are six out of ten two-dimensional margins for which the null-hypothesis of exchangeability is rejected with p < 0.05. All results coincide, as far as the rejections are concerned, no matter which combination of generating set and weight function is used. The exchangeability of all margins in dimension d ∈ {3, 4, 5} was rejected with p < 0.05 by all tests that were conducted. This was expected, as there is a pair of nutrients in any combination of three or more for which exchangeability was rejected. For example, the three-dimensional margin consisting of the values for iron, protein, and vitamin C cannot be exchangeable if the two-dimensional margin consisting of the values for iron and vitamin C is non-exchangeable. Nevertheless, by testing the whole dataset at once instead of performing tests on all two-dimensional margins separately, there is no need to worry about the effects of multiple testing. The lack of exchangeability in the dataset makes it obvious that no Archimedean copula is suited for modeling the dependence structure,


because all Archimedean copulas are exchangeable. Neither should a nested Archimedean copula be used, as all bivariate margins of a nested Archimedean copula are Archimedean and thus exchangeable, but, as mentioned before, not all pairs of nutrients seem to be exchangeable. Just as expected from the results of the simulation study, computation time for the generating set G1 was about twice as long as for G2 because the former contains twice as many permutations. When increasing the number of bootstrap-variables by a factor of ten for the computation of the values in Table 5.8, computation time only increased by a factor of about 1.5.


Appendix

A Some Proofs

A.1 Some Proofs from Chapter 3

Proof of Lemma 3.3.4

Recall that (M, d) is a metric space. Note that, unlike in most of the rest of this work, here d does not denote the dimension but a metric. This should not cause too much confusion, as it is used with arguments, i. e. d(x, y), in almost all occurrences. We need to show that M has the Lindelof property if and only if it is separable.

Proof. Let M have the Lindelof property, i. e. each open cover has a countable subcover. As usual,

B_δ(x) := {y ∈ M | d(x, y) < δ}

denotes the open ball with radius δ > 0 around some x ∈ M. Obviously, ⋃_{x∈M} B_{1/n}(x) is an open cover of M for every n ∈ N. As M is a Lindelof space, given n ∈ N and said open cover, there exists a countable subcover. This means, for every n ∈ N, there exists a sequence (q_k^(n))_{k∈N} ⊂ M, such that

⋃_{x∈M} B_{1/n}(x) = ⋃_{k=1}^{∞} B_{1/n}(q_k^(n))

holds. Let M̃ := ⋃_{n=1}^{∞} ⋃_{k=1}^{∞} {q_k^(n)}; then M̃ is countable. It remains to show that M̃ is dense in M. To be precise, we need to show that it is possible, for each x ∈ M, to find a sequence (x_n)_{n∈N} ⊂ M̃, such that x_n → x for n → ∞. Therefore, let x ∈ M and n ∈ N. It suffices to show that there exists an x_n ∈ M̃ with d(x, x_n) < 1/n. But since x ∈ M, we get

x ∈ ⋃_{y∈M} B_{1/n}(y) = ⋃_{k=1}^{∞} B_{1/n}(q_k^(n)),

and thus there exists at least one k ∈ N, such that x ∈ B_{1/n}(q_k^(n)), which means

d(x, q_k^(n)) < 1/n,

and therefore M̃ is dense in M, i. e. M is separable.

Now let M be separable, and let M̃ be a countable subset which is dense in M and whose existence is assured by Definition 3.3.3. Let I be some index set and let O_i ⊂ M be open for i ∈ I, i. e. ⋃_{i∈I} O_i is some open covering. If O_i = ∅ holds for all i ∈ I, then ⋃_{i∈I} O_i = O_{i0} for any i0 ∈ I and we have a finite (and thus countable) subcover. Otherwise, M̃ ∩ (⋃_{i∈I} O_i) ≠ ∅, and for every x ∈ ⋃_{i∈I} O_i there exist i ∈ I and δ > 0, such that B_δ(x) ⊂ O_i. Because of the separability of M, there exists a sequence (x_n)_{n∈N} ⊂ M̃, such that x_n → x for n → ∞, and thus there exists n0 ∈ N, such that d(x_n, x) < δ for n ≥ n0, which is equivalent to x_n ∈ B_δ(x) ⊂ ⋃_{i∈I} O_i for all n ≥ n0. To cut a long story short, M̃ ∩ (⋃_{i∈I} O_i) ≠ ∅ is countable, which means there exists a sequence (q_k)_{k∈N}, such that

{q_k | k ∈ N} = M̃ ∩ (⋃_{i∈I} O_i)

holds (if M̃ ∩ (⋃_{i∈I} O_i) is finite, we may just repeat some q_k). For each k ∈ N, there exists m(k) ∈ N, with

m(k) := min{n ∈ N | ∃i ∈ I, such that B_{1/n}(q_k) ⊂ O_i},    (A.1)

because the non-existence of such an m(k) would imply q_k ∉ O_i for all i ∈ I. By the axiom of choice, for every k ∈ N, it is possible to select one (of the possibly infinitely many) i ∈ I such that B_{1/m(k)}(q_k) ⊂ O_i holds and put Õ_k := O_i. It remains to show that ⋃_{k=1}^{∞} Õ_k is a subcover, i. e. that

⋃_{k=1}^{∞} Õ_k = ⋃_{i∈I} O_i

holds. To this end, let x ∈ ⋃_{i∈I} O_i. Therefore, there exists some i ∈ I, such that x ∈ O_i. As O_i is open, there exists an n ∈ N, such that B_{1/n}(x) ⊂ O_i. And because M̃ is dense in M, there exists a subsequence (q_{k_j})_{j∈N}, such that q_{k_j} → x for j → ∞. This means, there exists a j0 ∈ N, such that d(q_{k_j}, x) < 1/(4n) for all j ≥ j0. By the triangle inequality, we get

B_{1/(2n)}(q_{k_j}) ⊂ B_{1/n}(x) ⊂ O_i,

and (A.1) implies m(k_j) ≤ 2n. But this means

B_{1/(2n)}(q_{k_j}) ⊂ B_{1/m(k_j)}(q_{k_j}) ⊂ Õ_{k_j},

which, together with x ∈ B_{1/(2n)}(q_{k_j}), implies x ∈ ⋃_{k=1}^{∞} Õ_k. Thus ⋃_{k=1}^{∞} Õ_k ⊃ ⋃_{i∈I} O_i and, as the other inclusion is trivial, we have a countable subcover and M has the Lindelof property.

Proof of Lemma 3.3.5

Remember that we need to show that

C[0, 1]^d ⊂ D[0, 1]^d ⊂ l∞[0, 1]^d

holds and that both inclusions are strict.

Proof. First, let f ∈ C[0, 1]^d. Because f is continuous, lim_{x→x0} f(x) = f(x0) holds for all x0 ∈ [0, 1]^d and all sequences x_n → x0 in [0, 1]^d, especially for x_n → x0+ and x_n → x0−, and thus f ∈ D[0, 1]^d.

Now, let f ∈ D[0, 1]^d and assume f ∉ l∞[0, 1]^d. This means that f is unbounded, i. e. for each n ∈ N, there exists x_n ∈ [0, 1]^d, such that |f(x_n)| > n holds. As the sequence (x_n)_{n∈N} is bounded (by 1), there exists a convergent subsequence (x_{n_k})_{k∈N} by the well-known theorem of Bolzano and Weierstraß. This means, there exists x0 ∈ [0, 1]^d, such that x_{n_k} → x0 holds for k → ∞. There must be at least one quadrant around x0 which contains infinitely many x_{n_k}. To be precise, out of the 2^d index sets I ⊂ {1, . . . , d}, there must be (at least) one, such that, for infinitely many k ∈ N, x_{n_k,i} < x_{0,i} holds for all i ∈ I and x_{n_k,i} ≥ x_{0,i} holds for all i ∉ I. If we denote this subsubsequence by (x_{n_{k_j}})_{j∈N}, we get x_{n_{k_j}} → x0, but |f(x_{n_{k_j}})| → ∞, and therefore the limit of f at x0 from the corresponding quadrant does not exist, which contradicts f ∈ D[0, 1]^d. Thus the assumption is wrong and f ∈ l∞[0, 1]^d holds.

It is easy to see that f : [0, 1]^d → R with f(x) = 1(x = 0) is bounded, i. e. f ∈ l∞[0, 1]^d, but f ∉ D[0, 1]^d because lim_{x→0+} f(x) ≠ f(0) = 1. For example, consider the sequence x_n := (1/n, 0, . . . , 0)^⊤ for n ∈ N; then x_n → 0+, but f(x_n) → 0 ≠ f(0).

If we consider a := (1/2, . . . , 1/2)^⊤ and the mapping f : [0, 1]^d → R with f(x) := 1(a ≤ x), then obviously f ∈ D[0, 1]^d, but f ∉ C[0, 1]^d. Thus both inclusions in Lemma 3.3.5 are strict.

A.2 Some Proofs from Chapter 4

Technical part of the proof of Theorem 4.1.21

Remember that C_θ is a two- or three-dimensional Frank copula as in Example 2.4.22. It should be demonstrated that

|3 C_θ(1/2, 1/2) − 2 C_θ(1/2, 1/2, 1/2) − 1/2| > 0    (A.2)

holds for θ ∈ [− ln 2, ∞) \ {0}.

Proof. Let θ ∈ [− ln 2, ∞) \ {0}. First note that

1 + (e^{−θ/2} − 1)² / (e^{−θ} − 1) = 1 + (e^{−θ/2} − 1)/(e^{−θ/2} + 1) = 2 e^{−θ/2} / (e^{−θ/2} + 1),

1 + (e^{−θ/2} − 1)³ / (e^{−θ} − 1)² = 1 + (e^{−θ/2} − 1)/(e^{−θ/2} + 1)² = e^{−θ/2} (e^{−θ/2} + 3) / (e^{−θ/2} + 1)²

holds and thus

3 C_θ(1/2, 1/2) − 2 C_θ(1/2, 1/2, 1/2) − 1/2
  = −(3/θ) ln(1 + (e^{−θ/2} − 1)² / (e^{−θ} − 1)) + (2/θ) ln(1 + (e^{−θ/2} − 1)³ / (e^{−θ} − 1)²) − 1/2
  = −(3/θ) ln(2 e^{−θ/2} / (e^{−θ/2} + 1)) + (2/θ) ln(e^{−θ/2} (e^{−θ/2} + 3) / (e^{−θ/2} + 1)²) − 1/2
  = (1/θ) ln((e^{−θ/2} + 3)² / (8 (e^{−θ/2} + 1))).

The mapping f : (0, ∞) → R with f(x) := (x + 3)² / (8(x + 1)) attains its minimum in x0 = 1, and as f(x) = 1 if and only if x = 1, we get f(x) > 1 for x ∈ (0, ∞) \ {1}. Therefore

|3 C_θ(1/2, 1/2) − 2 C_θ(1/2, 1/2, 1/2) − 1/2| = (1/|θ|) ln(f(e^{−θ/2})) > 0

holds for θ ≠ 0.
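The chain of identities above is easy to confirm numerically. The following sketch assumes the usual form of the d-dimensional Frank copula, C_θ(u) = −(1/θ) ln(1 + ∏_{i=1}^{d} (e^{−θu_i} − 1)/(e^{−θ} − 1)^{d−1}), and checks for a few admissible values of θ that the left-hand side of (A.2) equals (1/|θ|) ln f(e^{−θ/2}) and is strictly positive.

    import numpy as np

    def frank(u, theta):
        # d-dimensional Frank copula in its usual form, theta != 0
        u = np.asarray(u, dtype=float)
        num = np.prod(np.expm1(-theta * u))
        den = np.expm1(-theta) ** (len(u) - 1)
        return -np.log1p(num / den) / theta

    def f(x):
        # the auxiliary mapping from the proof, with minimal value f(1) = 1
        return (x + 3.0) ** 2 / (8.0 * (x + 1.0))

    for theta in (-np.log(2.0), -0.3, 0.7, 2.5):
        lhs = abs(3 * frank([0.5, 0.5], theta)
                  - 2 * frank([0.5, 0.5, 0.5], theta) - 0.5)
        rhs = np.log(f(np.exp(-theta / 2))) / abs(theta)
        assert np.isclose(lhs, rhs) and lhs > 0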


Original proof of Lemma 4.3.4

It should be demonstrated that

|C(u) − C(u_{τ_{ij}})| ≤ |u_i − u_j|

holds for any d-copula C and for all u ∈ [0, 1]^d and i, j ∈ {1, . . . , d}.

Proof. Let C be a d-copula. Without loss of generality, let i = 1, j = 2 and u_i ≤ u_j (otherwise, the proof stays the same, up to notational changes). In order to avoid confusion, we will write ν_C(A) := C(A) for the copula-volume of a hyperrectangle A ⊂ [0, 1]^d as in Definition 2.1.1. One may think of ν_C as the induced measure of C, i. e. if U ∼ C, then ν_C(A) = P(U ∈ A). In this way, the copula-volume is extended to arbitrary subsets A ⊂ [0, 1]^d. As C is a copula, it is d-increasing and thus ν_C(A) ≥ 0 for all A ⊂ [0, 1]^d. We will consider the specific hyperrectangles

A_1 := [u_1, u_2] × [u_1, 1] × ⨉_{j=3}^{d} [0, u_j],

A_i := [u_1, u_2] × ⨉_{j=2}^{i−1} [0, 1] × [u_i, 1] × ⨉_{j=i+1}^{d} [0, u_j]

for i ∈ {3, . . . , d}. Because of C(u) = 0 whenever u_i = 0 for at least one i ∈ {1, . . . , d} (i. e. groundedness of C), we get

ν_C(A_1) = C(u_2, 1, u_3, . . . , u_d) − C(u_1, 1, u_3, . . . , u_d) − C(u_2, u_1, u_3, . . . , u_d) + C(u_1, u_1, u_3, . . . , u_d),

ν_C(A_i) = C(u_2, 1, . . . , 1, u_{i+1}, . . . , u_d) − C(u_1, 1, . . . , 1, u_{i+1}, . . . , u_d) − C(u_2, 1, . . . , 1, u_i, . . . , u_d) + C(u_1, 1, . . . , 1, u_i, . . . , u_d)

for i ∈ {3, . . . , d}. When we sum all these terms, most of them cancel each other out, and one part of the claim can be shown by verifying

0 ≤ ν_C(A_1) + Σ_{i=3}^{d} ν_C(A_i) = u_2 − u_1 + C(u_1, u_1, u_3, . . . , u_d) − C(u_2, u_1, u_3, . . . , u_d)
  ≤ u_2 − u_1 + C(u_1, u_2, u_3, . . . , u_d) − C(u_2, u_1, u_3, . . . , u_d).    (A.3)

Note that the last inequality follows from the fact that copulas are non-decreasing in each component. Thus we have

C(u_{τ_{1,2}}) − C(u) ≤ u_2 − u_1.

Replacing A_1 and A_i in (A.3) by

Ã_1 := [u_1, 1] × [u_1, u_2] × ⨉_{j=3}^{d} [0, u_j],

Ã_i := [0, 1] × [u_1, u_2] × ⨉_{j=3}^{i−1} [0, 1] × [u_i, 1] × ⨉_{j=i+1}^{d} [0, u_j]

for i ∈ {3, . . . , d} yields

C(u) − C(u_{τ_{1,2}}) ≤ u_2 − u_1

and thus completes the proof.
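The proof above establishes the inequality for all copulas; a Monte Carlo check in dimension two merely makes it tangible. As a non-exchangeable test case, the following sketch uses a Marshall-Olkin-type copula C(u, v) = u^{1−a} v^{1−b} min(u^a, v^b), a Khoudraji-style asymmetrization of the upper Frechet-Hoeffding bound; the parameter values are arbitrary.

    import numpy as np

    def mo_copula(u, v, a=0.8, b=0.3):
        # Marshall-Olkin-type copula; non-exchangeable for a != b
        return u ** (1 - a) * v ** (1 - b) * np.minimum(u ** a, v ** b)

    rng = np.random.default_rng(1)
    u, v = rng.uniform(size=(2, 100_000))
    gap = np.abs(mo_copula(u, v) - mo_copula(v, u))   # |C(u) - C(u_tau12)|
    assert np.all(gap <= np.abs(u - v) + 1e-12)       # Lemma 4.3.4 in d = 2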


Construction of a copula for the proof of Theorem 4.4.3

Let d = 2n and u ∈ M as described in the proof of Theorem 4.4.3, i. e. for j ∈ {1, . . . , n} let δ_j ∈ [0, 1/(d+1)], such that δ_1 ≤ . . . ≤ δ_n as well as

Σ_{j=1}^{n} δ_j = (n − 1)/(d + 1)

holds, and define u ∈ [0, 1]^d by

u_i := (d − 1)/(d + 1) if i ≤ n,   u_i := d/(d + 1) + δ_{i−n} if i > n.

Let π ∈ S_d be the order-reversing permutation, i. e. π(j) = d − j + 1 for all j ∈ {1, . . . , d}. We will construct a shuffle-of-min copula C, such that |C(u) − C(u_π)| = (d − 1)/(d + 1) holds.

According to Remark 2.1 in Durante and Fernandez-Sanchez (2010), all that is needed for the construction of such a copula is a so-called shuffling structure of d-dimensional orthotopes and a system of copulas (C_i). We use C_i ≡ M_d for all i for simplicity, but other choices, especially non-singular copulas, are possible. Now for the orthotopes J_i^1 × . . . × J_i^d (with i ∈ {1, . . . , 3n − 1}): in the following, we will give J_i^k for all cases of i ∈ {1, . . . , 3n − 1} and k ∈ {1, . . . , d}.

Case 1: i ∈ {1, . . . , n − 1}.

Case 1.1: k ∈ {1, . . . , n − i} ∪ {n + 1, . . . , 2n}.
J_i^k := [ Σ_{j=1}^{i−1} (1/(d+1) + δ_j), Σ_{j=1}^{i} (1/(d+1) + δ_j) ]

Case 1.2: k = n − i + 1.
J_i^k := [ (d − 1)/(d + 1), d/(d + 1) + δ_i ]

Case 1.3: i ≥ 2 and k ∈ {n − i + 2, . . . , n}.
J_i^k := [ Σ_{j=1, j≠n+1−k}^{i−1} (1/(d+1) + δ_j), Σ_{j=1, j≠n+1−k}^{i} (1/(d+1) + δ_j) ]

Case 2: i ∈ {n, . . . , 2n − 2}.

Case 2.1: k = 1.
J_i^k := [ (d − 2)/(d + 1) − δ_n + Σ_{j=1}^{i−n} (1/(d+1) − δ_j), (d − 2)/(d + 1) − δ_n + Σ_{j=1}^{i−n+1} (1/(d+1) − δ_j) ]

Case 2.2: n ≥ 3 and k ∈ {2, . . . , 2n − i − 1}.
J_i^k := [ (d − 3)/(d + 1) − δ_n − δ_{n+1−k} + Σ_{j=1}^{i−n} (1/(d+1) − δ_j), (d − 3)/(d + 1) − δ_n − δ_{n+1−k} + Σ_{j=1}^{i−n+1} (1/(d+1) − δ_j) ]

Case 2.3: k = 2n − i.
J_i^k := [ d/(d + 1) + δ_{n+1−k}, 1 ]

Case 2.4: i ≥ n + 1 and k ∈ {2n − i + 1, . . . , n}.
J_i^k := [ (d − 4)/(d + 1) − δ_n + Σ_{j=1}^{i−n} (1/(d+1) − δ_j), (d − 4)/(d + 1) − δ_n + Σ_{j=1}^{i−n+1} (1/(d+1) − δ_j) ]

Case 2.5: k ∈ {n + 1, . . . , 2n}.
J_i^k := [ d/(d + 1) − δ_n + Σ_{j=1}^{i−n} (1/(d+1) − δ_j), d/(d + 1) − δ_n + Σ_{j=1}^{i−n+1} (1/(d+1) − δ_j) ]

Case 3: i ∈ {2n − 1, . . . , 3n − 2}.

Case 3.1: k = 1.
J_i^k := [ (d − 2)/(d + 1) + Σ_{j=1}^{i−2n+1} (1/(d+1) − δ_j), (d − 2)/(d + 1) + Σ_{j=1}^{i−2n+2} (1/(d+1) − δ_j) ]

Case 3.2: k ∈ {2, . . . , n}.
J_i^k := [ (d − 4)/(d + 1) + Σ_{j=1}^{i−2n+1} (1/(d+1) − δ_j), (d − 4)/(d + 1) + Σ_{j=1}^{i−2n+2} (1/(d+1) − δ_j) ]

Case 3.3: k ∈ {n + 1, . . . , 2n} \ {i − n + 2}.
J_i^k := [ d/(d + 1) + Σ_{j=1, j≠k−n}^{i−2n+1} (1/(d+1) − δ_j), d/(d + 1) + Σ_{j=1, j≠k−n}^{i−2n+2} (1/(d+1) − δ_j) ]

Case 3.4: k = i − n + 2.
J_i^k := [ d/(d + 1) + δ_{k−n}, 1 ]

Case 4: i = 3n − 1.

Case 4.1: k = 1.
J_i^k := [ (d − 1)/(d + 1), 1 ]

Case 4.2: k ∈ {2, . . . , n}.
J_i^k := [ (d − 3)/(d + 1), (d − 1)/(d + 1) ]

Case 4.3: k ∈ {n + 1, . . . , 2n}.
J_i^k := [ (d − 2)/(d + 1) − δ_n, d/(d + 1) − δ_n ]

By Definition 2.1 of Durante and Fernandez-Sanchez (2010), the intervals J_i^k must fulfill four conditions in order to yield a proper copula:

1. First, i must run in a finite or countable index set. Here, this is obviously the case, as 1 ≤ i ≤ 3n − 1.

2. Second, for every k ∈ {1, . . . , d} and i1 ≠ i2, the intervals J_{i1}^k and J_{i2}^k must have at most one endpoint in common. This condition is tedious to verify, but nonetheless fulfilled.

3. Third, the orthotopes must be d-hypercubes, i. e. |J_i^{k1}| = |J_i^{k2}| for every i and every pair k1, k2. This is the case, as

|J_i^k| = 1/(d+1) + δ_i for i ∈ {1, . . . , n − 1},
|J_i^k| = 1/(d+1) − δ_{i−n+1} for i ∈ {n, . . . , 2n − 2},
|J_i^k| = 1/(d+1) − δ_{i−2n+2} for i ∈ {2n − 1, . . . , 3n − 2},
|J_i^k| = 2/(d+1) for i = 3n − 1

holds for every k ∈ {1, . . . , d}.

4. Last, for every k ∈ {1, . . . , d}, the lengths |J_i^k| of the intervals J_i^k must sum up to 1. This is the case, as

Σ_{i=1}^{3n−1} |J_i^k| = Σ_{i=1}^{n−1} (1/(d+1) + δ_i) + Σ_{i=n}^{2n−2} (1/(d+1) − δ_{i−n+1}) + Σ_{i=2n−1}^{3n−2} (1/(d+1) − δ_{i−2n+2}) + 2/(d+1) = 1

for every k.

Analogous to (4.3.16), we get an explicit expression for C, namely

C(u) = Σ_{i=1}^{3n−1} min( (u_1 − a_i^1)_+, . . . , (u_d − a_i^d)_+, |J_i^1| ),    (A.4)

where a_i^k is the left endpoint of the interval J_i^k. The distribution of mass within the d-hypercubes is arbitrary, as long as there is exactly the mass |J_i^1| in the hypercube J_i^1 × . . . × J_i^d. In our example, all the mass is on the diagonal. For a non-singular copula, one could spread the mass evenly within the hypercubes, for example by replacing M_d in (4.3.16) by the independence copula Π_d.
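For the smallest instance n = 1 (hence d = 2 and necessarily δ_1 = 0), the case list above reduces to the two squares J_1^1 × J_1^2 = [0, 1/3] × [2/3, 1] (Case 3) and J_2^1 × J_2^2 = [1/3, 1] × [0, 2/3] (Case 4), so (A.4) can be evaluated directly. The following sketch does so and recovers |C(u) − C(u_π)| = 1/3 = (d − 1)/(d + 1) at u = (1/3, 2/3).

    def shuffle_of_min(u, boxes):
        # (A.4): C(u) = sum_i min( (u_k - a_i^k)_+ over k, |J_i^1| )
        total = 0.0
        for box in boxes:
            width = box[0][1] - box[0][0]  # |J_i^1|, the same in every coordinate
            total += min(min(max(uk - a, 0.0) for uk, (a, _) in zip(u, box)), width)
        return total

    boxes = [
        [(0.0, 1/3), (2/3, 1.0)],  # i = 1 (Case 3 with n = 1)
        [(1/3, 1.0), (0.0, 2/3)],  # i = 2 (Case 4 with n = 1)
    ]
    u = (1/3, 2/3)
    print(shuffle_of_min(u, boxes), shuffle_of_min(u[::-1], boxes))  # 0.0 and 1/3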

A.3 Some Proofs from Chapter 5

Technical part of the proof of Theorem 5.1.4

Remember that we need to show that the function f : [0, 1]^{d×n} → [0, ∞) with

f(A) := Σ_{π∈G} ∫_{[0,1]^d} ( (1/n) Σ_{i=1}^{n} 1(a_i ≤ u) − (1/n) Σ_{i=1}^{n} 1(a_i ≤ u_π) )² w(u, π) du,

where a_i denotes the i-th column of A ∈ [0, 1]^{d×n}, is continuous at each matrix in {1/n, . . . , n/n}^{d×n}.

Proof. Let A ∈ {1/n, . . . , n/n}^{d×n} and let 0 < δ < 1/(3n). Additionally, let B ∈ [0, 1]^{d×n}, such that

‖A − B‖_∞ := max_{i,j} |a_{ij} − b_{ij}| < δ

holds. In order to improve readability, g_A(u) := (1/n) Σ_{i=1}^{n} 1(a_i ≤ u) will be used in the remainder of this proof (and g_B(u) is defined analogously). Then, as w(·, π) is continuous on the compact set [0, 1]^d, it is bounded, and thus

|f(A) − f(B)| ≤ Σ_{π∈G} ‖w(·, π)‖_∞ ∫_{[0,1]^d} | (g_A(u) − g_A(u_π))² − (g_B(u) − g_B(u_π))² | du    (A.5)

holds, as well as

| (g_A(u) − g_A(u_π))² − (g_B(u) − g_B(u_π))² |
  = | (g_A(u) − g_A(u_π)) + (g_B(u) − g_B(u_π)) | · | (g_A(u) − g_A(u_π)) − (g_B(u) − g_B(u_π)) |
  ≤ 2 | (g_A(u) − g_B(u)) + (g_B(u_π) − g_A(u_π)) |
  ≤ 2 |g_A(u) − g_B(u)| + 2 |g_B(u_π) − g_A(u_π)|

for any π ∈ G, since the first factor lies in [−2, 2]. Therefore, the integrand in (A.5) vanishes outside the grid

G̃ := { x ∈ [0, 1]^d | ∃i ∈ {1, . . . , d}, ∃k ∈ {1, . . . , n}, such that |x_i − k/n| < δ }.

Indeed, if u lies outside G̃, then no coordinate of u is within δ of a grid point k/n; since every a_{ij} lies on this grid and |a_{ij} − b_{ij}| < δ, the indicators 1(a_i ≤ u) and 1(b_i ≤ u) coincide, whence g_A(u) = g_B(u) (and likewise at u_π). Moreover, the integrand is bounded by 4. This yields

|f(A) − f(B)| ≤ Σ_{π∈G} ‖w(·, π)‖_∞ · 4 · λ(G̃) ≤ d! · max_{π∈G} ‖w(·, π)‖_∞ · 4 · 2dnδ → 0

for δ → 0, where λ(G̃) denotes the Lebesgue measure of the grid G̃ and we used the union bound λ(G̃) ≤ d · n · 2δ. Thus B → A implies f(B) → f(A), which completes the proof.
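For intuition, the functional f can also be evaluated numerically. The following sketch is only an illustration of the object whose continuity was just shown (not code from the simulation study): it takes w ≡ 1, replaces the integral by a Monte Carlo average, and encodes the generating set G2 = {(12), (123)} for d = 3 with 0-based images.

    import numpy as np

    rng = np.random.default_rng(2)
    d, n = 3, 50
    A = rng.integers(1, n + 1, size=(d, n)) / n  # columns a_i on the grid {1/n, ..., n/n}
    G = [(1, 0, 2), (1, 2, 0)]                   # 0-based images of (12) and (123)

    def g(A, U):
        # g_A(u) = (1/n) sum_i 1(a_i <= u), vectorized over the rows of U
        return (A[None, :, :] <= U[:, :, None]).all(axis=1).mean(axis=1)

    def f_unweighted(A, n_mc=20_000):
        # Monte Carlo version of f(A) for w = 1
        U = rng.uniform(size=(n_mc, A.shape[0]))
        return sum(np.mean((g(A, U) - g(A, U[:, list(perm)])) ** 2) for perm in G)

    print(f_unweighted(A))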

Proof of Lemma 5.2.9

It should be demonstrated that for all c > 0 and all π ∈ S_d, the mapping φ : l∞[0, 1]^d × BV_c[0, 1]^d → R with

φ(α, β) := ∫_{(0,1]^d} α(u) w(u, π) dβ(u)

for (α, β) ∈ l∞[0, 1]^d × BV_c[0, 1]^d is Hadamard-differentiable tangentially to C[0, 1]^d × D[0, 1]^d at each (A, B) ∈ l∞[0, 1]^d × BV_c[0, 1]^d, such that ∫ |dA| < ∞. The derivative is given by

φ′_{(A,B)}(α, β) = ∫_{(0,1]^d} A(u) w(u, π) dβ(u) + ∫_{(0,1]^d} α(u) w(u, π) dB(u).

Proof. This proof is based on the proof of Lemma 4.2.2 in Carabarin Aguirre (2008), which is similar to the proof of Lemma 3.9.17 in van der Vaart and Wellner (1996). In order to improve readability, we suppress the argument u in all functions on the unit hypercube, as well as π in the case of the weight function w. Now let c > 0 and (A, B) ∈ l∞[0, 1]^d × BV_c[0, 1]^d, such that ∫ |dA| < ∞. Additionally, let (α, β) ∈ C[0, 1]^d × D[0, 1]^d, and for t ∈ (0, ∞) let (α_t, β_t) ∈ l∞[0, 1]^d × D[0, 1]^d, such that

α_t → α,   β_t → β

holds for t → 0 (with respect to the supremum-norm), and additionally

A_t := A + t α_t ∈ l∞[0, 1]^d,   B_t := B + t β_t ∈ BV_c[0, 1]^d

holds. Then

(φ(A_t, B_t) − φ(A, B))/t − φ′_{(A,B)}(α, β)
  = (1/t) ( ∫ A_t w dB_t − ∫ A w dB ) − ∫ A w dβ − ∫ α w dB
  = (1/t) ( ∫ A w d(B_t − B) + t ∫ α_t w dB_t ) − ∫ A w dβ − ∫ α w dB      [since B_t − B = t β_t]
  = ∫ A w dβ_t + ∫ α_t w dB_t − ∫ A w dβ − ∫ α w dB
  = ∫ A w d(β_t − β) + ∫ α_t w dB_t − ∫ α w dB,

where the first term vanishes for t → 0, which can be verified by using integration by parts, as A is of bounded variation and ‖β_t − β‖_∞ → 0. For the remaining two terms, we write

| ∫ α_t w dB_t − ∫ α w dB | = | ∫ α_t w dB_t − ∫ α w dB_t + ∫ α w dB_t − ∫ α w dB |
  ≤ | ∫ (α_t − α) w dB_t | + | ∫ α w d(B_t − B) |
  ≤ c ‖w‖_∞ ‖α_t − α‖_∞ + | ∫ α w d(B_t − B) |.

In order to show that the last term is smaller than any ε > 0 if t is small enough, we introduce some notation and ᾱ, a discretization of α · w. For a given N ∈ N, let ᾱ : [0, 1]^d → R with

ᾱ(u) := (α · w)(i_1/N, . . . , i_d/N)   ⟺   u ∈ ((i_1 − 1)/N, i_1/N] × . . . × ((i_d − 1)/N, i_d/N] =: V_{i_1,...,i_d}

for i_1, . . . , i_d ∈ {1, . . . , N}, and

ᾱ(i_1/N, . . . , i_d/N) := (α · w)(i_1/N, . . . , i_d/N)

whenever i_j = 0 for at least one j ∈ {1, . . . , d}. As (α · w) ∈ C[0, 1]^d, there exists N ∈ N, such that 2c ‖(α · w) − ᾱ‖_∞ < ε/2. By B(V̄_{i_1,...,i_d}) we denote the volume of the closure of V_{i_1,...,i_d} with respect to B (and analogously for (B_t − B)(V̄_{i_1,...,i_d})), i. e. the B-volume of V̄_{i_1,...,i_d} as in Definition 2.1.1. With this fixed N, we obtain

| ∫ α w d(B_t − B) | ≤ | ∫ (α · w − ᾱ) d(B_t − B) | + | ∫ ᾱ d(B_t − B) |
  ≤ 2c ‖(α · w) − ᾱ‖_∞
    + Σ_{i_1,...,i_d=1}^{N} |(α · w)(i_1/N, . . . , i_d/N)| · |(B_t − B)(V̄_{i_1,...,i_d})|
    + Σ_{i_1,...,i_d ∈ {0,...,N}, ∃j: i_j=0} |(α · w)(i_1/N, . . . , i_d/N)| · |(B_t − B)(i_1/N, . . . , i_d/N)|
  ≤ ε/2 + ‖α · w‖_∞ ‖B_t − B‖_∞ (2^d N^d + ((N + 1)^d − N^d)) < ε

for t small enough, as ‖B_t − B‖_∞ → 0 for t → 0.


Zusammenfassung (Summary)

In the present work we address the question whether the distribution underlying a given sample is exchangeable. Exchangeability denotes a property of a function of several variables: if such a function is exchangeable, its value does not depend on the order of its variables. More precisely, a d-dimensional distribution function H is exchangeable if

H(x_1, . . . , x_d) = H(x_{π(1)}, . . . , x_{π(d)})

holds for all permutations π ∈ S_d and all vectors (x_1, . . . , x_d)^⊤ ∈ R^d. As is easy to see, a distribution function can only be exchangeable if all of its marginal distributions coincide, in particular the one-dimensional ones. The equality of all one-dimensional margins is therefore necessary for the exchangeability of a distribution function, but it is not sufficient. Given identical one-dimensional margins, a distribution function is exchangeable if and only if its copula is exchangeable.

Copulas are a special class of functions defined on the unit square (in the case of two variables), the unit cube (in the case of three variables) or the unit hypercube (in the case of more than three variables). Copulas are precisely the distribution functions of random vectors with standard uniformly distributed components; the one-dimensional margins of such a random vector thus all coincide with the uniform distribution on the interval [0, 1]. The importance of these functions becomes evident through a theorem of Sklar (1959). It states that for every d-dimensional distribution function H with one-dimensional margins F_1, . . . , F_d there exists a copula C, such that

H(x_1, . . . , x_d) = C(F_1(x_1), . . . , F_d(x_d))

holds for all vectors (x_1, . . . , x_d)^⊤ ∈ R^d. The copula thus couples the one-dimensional margins into a joint distribution function.

Given a sample from an unknown multivariate distribution, one is often interested in the properties of this distribution. For instance, one may ask whether the components of the random vectors whose realizations are observed are independent. If they are not, the question of the dependence structure, or of the strength of the dependence between the variables, arises immediately. Whether and in which way the variables depend on each other is determined by the copula alone; this follows directly from the above-mentioned theorem of Sklar. The situation is quite similar for various forms of symmetry, such as exchangeability (apart from the above-mentioned equality of the one-dimensional margins). If random variables are independent and identically distributed, then their joint distribution (and hence the corresponding copula) is exchangeable. Since the converse does not hold in general, exchangeability is a weaker form of independence. In order to reproduce the dependence structure of a sample, one may use a parametric model, more precisely a parametric family of copulas, such as the Clayton copula or other families of Archimedean copulas. A copula C is called Archimedean if there exists a so-called generator function ϕ (with certain properties), such that

C(u_1, . . . , u_d) = ϕ^−(ϕ(u_1) + . . . + ϕ(u_d))

holds for all vectors (u_1, . . . , u_d)^⊤ ∈ [0, 1]^d. Many of these generator functions are inverses of Laplace transforms of distribution functions. The popularity of Archimedean copulas, especially in areas where high-dimensional distribution functions occur (such as the insurance or finance sector), may be due to the fact that they allow diverse dependence structures to be modeled by a simple function (sometimes with only one parameter). Because of their construction via a generator function as described above, all Archimedean copulas are exchangeable. In particular, when using an Archimedean copula one implicitly assumes that the unknown copula underlying a given sample is exchangeable. It is therefore advisable to perform a statistical test, such as the one presented in this work, before employing exchangeable copulas.

Since the distribution underlying a sample, as well as the corresponding copula, are in general unknown, they cannot be checked for exchangeability directly. Instead, the empirical distribution and the empirical copula are considered, which are, in a certain sense, approximations of the theoretical distribution and copula, respectively. If these approximations are not exchangeable, it is at first unclear whether this is due to an error in the approximation of the theoretical quantities or to the non-exchangeability of the unknown copula. A weight function can attenuate deviations from exchangeability in regions where they can theoretically hardly occur; regions in which larger deviations are possible can be given a larger weight. For this, however, it is necessary to know the maximal possible deviations. This maximal non-exchangeability was determined in the two-dimensional case independently by Klement and Mesiar (2006) and by Nelsen (2007). In this work, a least upper bound is derived for arbitrary dimension. It then turns out that (under certain restrictions) the uniqueness of a point at which maximal non-exchangeability can occur depends on whether the dimension is even or odd.

The work is structured as follows: Chapter 1 contains a short introduction.

At the beginning of Chapter 2, basic notions of the theory of copulas are defined and the above-mentioned theorem of Sklar is presented. Subsequently, some important classes of copulas are considered, such as elliptical copulas (to which the copula of the multivariate normal distribution belongs) or Archimedean copulas. With a view to the later simulation study, some procedures for the numerical simulation of realizations of random vectors distributed according to a copula are mentioned. Chapter 2 closes with a discussion of some measures of dependence.

In Chapter 3, the foundations for the asymptotic considerations of later chapters are laid by summarizing the corresponding results from the literature, in particular those of the books by Billingsley (1999) and by van der Vaart and Wellner (1996). After an explanation of various kinds of convergence of random variables and vectors, the usual definition of the above-mentioned empirical distribution function is given. Although every value of the empirical distribution function is a random variable, it is pointed out that problems can arise when the empirical distribution function is regarded as a random map into a space of functions. These problems were already pointed out by Skorokhod around 1950. In order to still be able to state convergence results, so-called "weak convergence" is used. Finally, the definition of a Gaussian process is given, together with a well-known result stating that, analogously to a theorem of Donsker, a suitably scaled version of the empirical distribution function, the so-called "empirical process", converges weakly to such a Gaussian process.

In Chapter 4, the limits of non-exchangeability are exhibited. Since parts of the literature speak of "symmetry" instead of "exchangeability", various concepts of symmetry of random vectors and of multivariate functions are presented first. After the definition of exchangeability and of non-exchangeability, it is shown that each of the aforementioned kinds of symmetry is neither sufficient nor necessary for exchangeability. Finally, a best-possible bound on non-exchangeability in arbitrary dimension (greater than or equal to 2) is derived which, as mentioned above, was previously known only in the two-dimensional case. More precisely, it is shown that for every d-dimensional copula C, every permutation π and every vector (u_1, . . . , u_d)^⊤ ∈ [0, 1]^d,

|C(u_1, . . . , u_d) − C(u_{π(1)}, . . . , u_{π(d)})| ≤ (d − 1)/(d + 1)

holds, and that this bound cannot be improved. Sharpening one estimate in the proof, however, leads to a statement about the set of points at which the bound is attained: whether this set is finite or contains uncountably many points depends on whether the dimension is even or odd. Moreover, a consequence of the result on maximal non-exchangeability for the possible choice of the marginal distributions of a copula is pointed out. The chapter closes with the presentation of a result of Liebscher (2008) on the construction of a non-exchangeable copula from an exchangeable one.

Chapter 5 begins with the introduction of several test statistics for testing the hypothesis that an exchangeable copula underlies a given sample. The test statistics are generalizations of the test statistics proposed by Genest et al. (2012) for the two-dimensional case. However, the number of permutations under which the copula could be exchangeable or non-exchangeable grows exponentially with the dimension. Nevertheless, it is shown that it suffices to consider a subset of all permutations; more precisely, for every dimension such a subset with only two elements is already sufficient. In addition, the test statistics are, as described above, supplemented by a weight function. A concrete, continuous weight function is proposed, but the asymptotic behavior of the test statistics is investigated for a general weight function, under the assumption that it is continuous. In the course of the proofs it is pointed out that (and in which form) the continuity assumption can partly be weakened. Since the (asymptotic) distribution of the test statistics depends on the unknown copula, a so-called "bootstrap procedure" is applied for the computation of empirical p-values and thus for the decision about rejecting the null hypothesis. Based on the given sample, realizations of random variables are generated which, in a certain sense, come close to the limit distribution under the null hypothesis. By comparing these realizations with the test statistic, it is decided whether the hypothesis that the observed sample is based on an exchangeable copula should be rejected or not. Finally, the different test statistics, some weight functions and some subsets of permutations are compared in a simulation study. For this purpose, realizations of various random vectors are generated whose copulas are known to be exchangeable or not. Chapter 5 ends with a short discussion of the results of the simulation study.

Some proofs, or their technical details, have been moved to the appendix.


Bibliography

Abdous, B., Genest, C., and Remillard, B. (2005). Dependence properties of meta-elliptical distributions. In Duchesne, P. and Remillard, B., editors, Statistical modeling and analysis for complex data problems, pages 1–15. Springer, New York.

Alsina, C., Frank, M. J., and Schweizer, B. (2006). Associative Functions: Triangular Norms and Copulas. World Scientific Publishing Co. Pte. Ltd., Hackensack.

Alsina, C., Nelsen, R. B., and Schweizer, B. (1993). On the characterization of a class of binary operations on distribution functions. Statistics & Probability Letters, 17:85–89.

Bauer, H. (1992). Maß- und Integrationstheorie. de Gruyter Lehrbuch. Walter de Gruyter & Co., Berlin, second edition.

Bauer, H. (2002). Wahrscheinlichkeitstheorie. de Gruyter Lehrbuch. Walter de Gruyter & Co., Berlin, fifth edition.

Beliakov, G., De Baets, B., De Meyer, H., Nelsen, R. B., and Ubeda-Flores, M. (2014). Best-possible bounds on the set of copulas with given degree of non-exchangeability. Journal of Mathematical Analysis and Applications, 417(1):451–468.

Berghaus, B., Bucher, A., and Volgushev, S. (2015). Weak convergence of the empirical copula process with respect to weighted metrics. To appear in Bernoulli.

Bernstein, S. (1929). Sur les fonctions absolument monotones. Acta Mathematica, 52(1):1–66.

Billingsley, P. (1968). Convergence of probability measures. John Wiley & Sons, Inc., New York-London-Sydney.

Billingsley, P. (1995). Probability and measure. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, Inc., New York, third edition.

Billingsley, P. (1999). Convergence of probability measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons, Inc., New York, second edition.

Blomqvist, N. (1950). On a measure of dependence between two random variables. Annals of Mathematical Statistics, 21(4):593–600.

Box, G. E. P. and Muller, M. E. (1958). A note on the generation of random normal deviates. The Annals of Mathematical Statistics, 29(2):610–611.

Bucher, A. and Dette, H. (2010). A note on bootstrap approximations for the empirical copula process. Statistics & Probability Letters, 80(23–24):1925–1932.


Cambanis, S., Huang, S., and Simons, G. (1981). On the theory of elliptically contoured distributions. Journal of Multivariate Analysis, 11:368–385.

Carabarin Aguirre, A. (2008). Set-indexed survival analysis with generalized censoring. PhD thesis, University of Ottawa.

Clarkson, J. A. and Adams, C. R. (1933). On definitions of bounded variation for functions of two variables. Transactions of the American Mathematical Society, 35(4):824–854.

Clayton, D. G. (1978). A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65(1):141–151.

Conrad, K. (2013). Generating sets. Expository, unpublished paper on the author's personal homepage.

Coxeter, H. S. M. and Moser, W. O. J. (1980). Generators and relations for discrete groups, volume 14 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer-Verlag Berlin Heidelberg GmbH, fourth edition.

Cuculescu, I. and Theodorescu, R. (2001). Copulas: diagonals, tracks. Revue roumaine de mathematiques pures et appliquees, 46(6):731–742.

David, F. N. (1955). Studies in the history of probability and statistics. I. Dicing and gaming (a note on the history of probability). Biometrika, 42:1–15.

Deheuvels, P. (1981). An asymptotic decomposition for multivariate distribution-free tests of independence. Journal of Multivariate Analysis, 11:102–113.

Demarta, S. and McNeil, A. J. (2005). The t copula and related copulas. International Statistical Review, 73(1):111–129.

Devroye, L. (1986). Nonuniform random variate generation. Springer-Verlag, New York.

Dolati, A. and Ubeda-Flores, M. (2006). On measures of multivariate concordance. Journal of Probability and Statistical Science, 4(2):147–163.

Donsker, M. D. (1952). Justification and extension of Doob's heuristic approach to the Kolmogorov-Smirnov theorems. Annals of Mathematical Statistics, 23:277–281.

Dudley, R. M. (1999). Uniform central limit theorems, volume 63 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge.

Dummit, D. S. and Foote, R. M. (2009). Abstract algebra. Wiley, third edition.

Durante, F. and Fernandez-Sanchez, J. (2010). Multivariate shuffles and approximation of copulas. Statistics & Probability Letters, 80(23–24):1827–1834.

Durante, F., Klement, E. P., Sempi, C., and Ubeda-Flores, M. (2010). Measures of non-exchangeability for bivariate random vectors. Statistical Papers, 51(3):687–699.

Embrechts, P. (2009). Copulas: A personal view. Journal of Risk and Insurance, 76(3):639–650.

Embrechts, P. and Hofert, M. (2013). A note on generalized inverses. Mathematical Methods of Operations Research, 77(3):423–432.


Fang, K. T., Kotz, S., and Ng, K. W. (1990). Symmetric multivariate and related distributions, volume 36 of Monographs on Statistics and Applied Probability. Chapman and Hall, Ltd., London.

Fechner, G. T. (1897). Kollektivmasslehre: im Auftrage der Koniglich sachsischen Gesellschaft der Wissenschaften herausgegeben von Gottl. Friedr. Lipps. Wilhelm Engelmann, Leipzig.

Feller, W. (1971). An Introduction to Probability Theory and Its Applications, volume 2. Wiley, New York, second edition.

Fermanian, J.-D. (1998). Contributions a l'Analyse Nonparametrique des Fonctions de Hasard sur Donnees Multivariees et Censurees. PhD thesis, Universite Paris 6.

Fermanian, J.-D., Radulovic, D., and Wegkamp, M. (2004). Weak convergence of empirical copula processes. Bernoulli, 10(5):847–860.

Frahm, G., Junker, M., and Szimayer, A. (2003). Elliptical copulas: applicability and limitations. Statistics & Probability Letters, 63(3):275–286.

Frank, M. J. (1979). On the simultaneous associativity of F(x, y) and x + y − F(x, y). aequationes mathematicae, 19(1):194–226.

Frechet, M. R. (1951). Sur les tableaux de correlation dont les marges sont donnees. Annales de l'Universite de Lyon. Section A (3), 14:53–77.

Ganssler, P. and Stute, W. (1987). Seminar on Empirical Processes, volume 9 of DMV Seminar. Birkhauser, Basel.

Genest, C. and Favre, A.-C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering, 12(4):347–368.

Genest, C., Ghoudi, K., and Rivest, L.-P. (1995). A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika, 82(3):543–552.

Genest, C., Ghoudi, K., and Rivest, L.-P. (1998). "Understanding relationships using copulas," by Edward Frees and Emiliano Valdez, January 1998. North American Actuarial Journal, 2(3):143–149.

Genest, C. and MacKay, R. J. (1986). Copules archimediennes et familles de lois bidimensionnelles dont les marges sont donnees. The Canadian Journal of Statistics / La Revue Canadienne de Statistique, 14(2):145–159.

Genest, C., Neslehova, J., and Quessy, J.-F. (2012). Tests of symmetry for bivariate copulas. Annals of the Institute of Statistical Mathematics, 64(4):811–834.

Genest, C., Quesada Molina, J. J., Rodriguez Lallena, J. A., and Sempi, C. (1999). A characterization of quasi-copulas. Journal of Multivariate Analysis, 69(2):193–205.

Golub, G. H. and Van Loan, C. F. (2013). Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, fourth edition.

Gumbel, E. J. (1960). Distributions des valeurs extremes en plusieurs dimensions. Publications de l'Institut de Statistique de l'Universite de Paris, 9:171–173.

Gut, A. (2013). Probability: A Graduate Course. Springer Texts in Statistics. Springer, New York, second edition.


Harder, M. and Stadtmuller, U. (2014). Maximal non-exchangeability in dimension d. Journal of Multivariate Analysis, 124:31–41.

Harder, M. and Stadtmuller, U. (2015). Testing exchangeability of copulas in arbitrary dimension. Under revision.

Hering, C. (2011). Estimation Techniques and Goodness-of-fit Tests for Certain Copula Classes in Large Dimensions. PhD thesis, Universitat Ulm.

Hildebrandt, T. H. (1963). Introduction to the Theory of Integration. Academic Press, New York.

Hoeffding, W. (1940). Masstabinvariante Korrelationstheorie. Schriften des Mathematischen Instituts und des Instituts fur Angewandte Mathematik der Universitat Berlin, 5(Heft 3):179–233.

Hoeffding, W. (1941). Masstabinvariante Korrelationsmasse. Archiv fur mathematische Wirtschafts- und Sozialforschung, 7:49–70.

Hofert, M. (2008). Sampling Archimedean copulas. Computational Statistics & Data Analysis, 52(12):5163–5174.

Hofert, M. (2010). Sampling nested Archimedean copulas with applications to CDO pricing. PhD thesis, Universitat Ulm.

Hofert, M. (2011). Efficiently sampling nested Archimedean copulas. Computational Statistics & Data Analysis, 55(1):57–70.

Hofert, M., Kojadinovic, I., Machler, M., and Yan, J. (2015). copula: Multivariate Dependence with Copulas. R package version 0.999-13.

Hofert, M. and Machler, M. (2011). Nested Archimedean copulas meet R: The nacopula package. Journal of Statistical Software, 39(9):1–20.

Hofert, M. and Pham, D. (2013). Densities of nested Archimedean copulas. Journal of Multivariate Analysis, 118:37–52.

Hofert, M. and Scherer, M. (2011). CDO pricing with nested Archimedean copulas. Quantitative Finance, 11(5):775–787.

Hult, H. and Lindskog, F. (2002). Multivariate extremes, aggregation and dependence in elliptical distributions. Advances in Applied Probability, 34(3):587–608.

Jaworski, P. (2009). On copulas and their diagonals. Information Sciences, 179(17):2863–2871.

Joe, H. (1990). Multivariate concordance. Journal of Multivariate Analysis, 35(1):12–30.

Joe, H. (1997). Multivariate Models and Dependence Concepts, volume 73 of Monographs on Statistics and Applied Probability. Chapman & Hall, London.

Joe, H. (2015). Dependence modeling with copulas, volume 134 of Monographs on Statistics and Applied Probability. CRC Press, Boca Raton.

Joe, H., Li, H., and Nikoloulopoulos, A. K. (2010). Tail dependence functions and vine copulas. Journal of Multivariate Analysis, 101(1):252–270.

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30(1/2):81–93.


Kerber, A. (1971). Representations of permutation groups. I, volume 240 of Lecture Notes in Mathematics. Springer-Verlag, Berlin Heidelberg New York.

Khoudraji, A. (1995). Contributions a l'etude des copules et a la modelisation de valeurs extremes bivariees. PhD thesis, Universite Laval, Quebec.

Kiefer, J. C. (1961). On large deviations of the empiric D. F. of vector chance variables and a law of the iterated logarithm. Pacific Journal of Mathematics, 11:649–660.

Kimberling, C. H. (1974). A probabilistic interpretation of complete monotonicity. aequationes mathematicae, 10(2–3):152–164.

Kingman, J. F. C. (1978). Uses of exchangeability. The Annals of Probability, 6(2):183–197.

Klement, E. P. and Mesiar, R. (2006). How non-symmetric can a copula be? Commentationes Mathematicae Universitatis Carolinae, 47(1):199–206.

Knuth, D. E. (1998). The art of computer programming. Vol. 2: Seminumerical algorithms. Addison-Wesley, Reading, third edition.

Kojadinovic, I. and Yan, J. (2010). Modeling multivariate distributions with continuous margins using the copula R package. Journal of Statistical Software, 34(9):1–20.

Kruskal, W. H. (1958). Ordinal measures of association. Journal of the American Statistical Association, 53:814–861.

Lebesgue, H. (1902). Integrale, longueur, aire. Annali di Matematica Pura ed Applicata (1898-1922), 7(1):231–359.

Liebscher, E. (2008). Construction of asymmetric multivariate copulas. Journal of Multivariate Analysis, 99(10):2234–2250.

Liebscher, E. (2011). Erratum to "Construction of asymmetric multivariate copulas". Journal of Multivariate Analysis, 102(4):869–870.

Liflyand, E., Stadtmuller, U., and Trigub, R. (2011). An interplay of multidimensional variations in Fourier analysis. Journal of Fourier Analysis and Applications, 17(2):226–239.

Lindskog, F., McNeil, A., and Schmock, U. (2003). Kendall's tau for elliptical distributions. In Bol, G., Nakhaeizadeh, G., Rachev, S. T., Ridder, T., and Vollmer, K.-H., editors, Credit Risk: Measurement, Evaluation and Management, Contributions to Economics, pages 149–156. Physica-Verlag, Heidelberg.

Lipps, G. F. (1905). Die Bestimmung der Abhangigkeit zwischen den Merkmalen eines Gegenstandes. Berichte der Koniglich Sachsischen Gesellschaft der Wissenschaften, Mathematisch-Physische Klasse, 57:1–32.

Malov, S. V. (2001). On finite-dimensional Archimedean copulas. In Balakrishnan, N., Ibragimov, I., and Nevzorov, V., editors, Asymptotic Methods in Probability and Statistics with Applications, Statistics for Industry and Technology, pages 19–35. Birkhauser, Boston.

Marcinkiewicz, J. and Zygmund, A. (1937). Sur les fonctions independantes. Fundamenta Mathematicae, 29:60–90.

Marshall, A. W. and Olkin, I. (1988). Families of multivariate distributions. Journal of the American Statistical Association, 83(403):834–841.


Massonnet, G., Janssen, P., and Duchateau, L. (2009). Modelling udder infection data using copula models for quadruples. Journal of Statistical Planning and Inference, 139(11):3865–3877.

Matsumoto, M. and Nishimura, T. (1998). Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation, 8(1):3–30.

McNeil, A. J. (2008). Sampling nested Archimedean copulas. Journal of Statistical Computation and Simulation, 78(6):567–581.

McNeil, A. J., Frey, R., and Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques, Tools. Princeton Series in Finance. Princeton University Press, Princeton.

McNeil, A. J. and Neslehova, J. (2009). Multivariate Archimedean copulas, d-monotone functions and l1-norm symmetric distributions. The Annals of Statistics, 37(5B):3059–3097.

McNeil, A. J. and Neslehova, J. (2010). From Archimedean to Liouville copulas. Journal of Multivariate Analysis, 101(8):1772–1790.

Mikusinski, P., Sherwood, H., and Taylor, M. D. (1992). Shuffles of min. Stochastica, 13(1):61–74.

Mikusinski, P. and Taylor, M. D. (2010). Some approximations of n-copulas. Metrika, 72(3):385–414.

Mroz, M. (2012). Time-Varying Copula Models for Financial Time Series. PhD thesis, Universitat Ulm.

Nelsen, R. B. (1986). Properties of a one-parameter family of bivariate distributions with specified marginals. Communications in Statistics. A. Theory and Methods, 15(11):3277–3285.

Nelsen, R. B. (1993). Some concepts of bivariate symmetry. Journal of Nonparametric Statistics, 3(1):95–101.

Nelsen, R. B. (2006). An Introduction to Copulas. Springer Series in Statistics. Springer, New York, second edition.

Nelsen, R. B. (2007). Extremes of nonexchangeability. Statistical Papers, 48:329–336.

Pearson, K. (1895). Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58:240–242.

Pearson, K. (1900). X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine Series 5, 50(302):157–175.

R Core Team (2014). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

RAND Corporation (1955). A million random digits with 100,000 normal deviates. The Free Press, Glencoe.

Rao, M. M. (2004). Measure theory and integration, volume 265 of Monographs and Textbooks in Pure and Applied Mathematics. Marcel Dekker, Inc., New York, second edition.

Remillard, B. and Scaillet, O. (2009). Testing for equality between two copulas. Journal of Multivariate Analysis, 100(3):377–386.


Ressel, P. (2011). A revision of Kimberling's results—With an application to max-infinite divisibility of some Archimedean copulas. Statistics & Probability Letters, 81(2):207–211.

Rodgers, J. L. and Nicewander, W. A. (1988). Thirteen ways to look at the correlation coefficient. The American Statistician, 42(1):59–66.

Romano, J. P. (1989). Bootstrap and randomization tests of some nonparametric hypotheses. The Annals of Statistics, 17(1):141–159.

Rosenblatt, M. (1952). Remarks on a multivariate transformation. The Annals of Mathematical Statistics, 23(3):470–472.

Ruschendorf, L. (1976). Asymptotic distributions of multivariate rank order statistics. The Annals of Statistics, 4(5):912–923.

Scarsini, M. (1984). On measures of concordance. Stochastica, 8(3):201–218.

Schmid, F. and Schmidt, R. (2007). Multivariate conditional versions of Spearman's rho and related measures of tail dependence. Journal of Multivariate Analysis, 98(6):1123–1140.

Schmidt, R. (2002). Tail dependence for elliptically contoured distributions. Mathematical Methods of Operations Research, 55(2):301–327.

Schmitz, V. (2003). Copulas and Stochastic Processes. PhD thesis, Rheinisch-Westfalische Technische Hochschule Aachen.

Schoenberg, I. J. (1938). Metric spaces and completely monotone functions. Annals of Mathematics. Second Series, 39(4):811–841.

Schweizer, B. (1991). Thirty years of copulas. In Dall'Aglio, G., Kotz, S., and Salinetti, G., editors, Advances in Probability Distributions with Given Marginals, volume 67 of Mathematics and Its Applications, pages 13–50. Springer Netherlands.

Schweizer, B. and Sklar, A. (1983). Probabilistic Metric Spaces. North-Holland, New York.

Segers, J. (2012). Asymptotics of empirical copula processes under non-restrictive smoothness assumptions. Bernoulli, 18(3):764–782.

Serfling, R. J. (2006). Multivariate symmetry and asymmetry. In Kotz, S., Read, C. B., Balakrishnan, N., and Vidakovic, B., editors, Encyclopedia of Statistical Sciences, volume 8: Mizutani Distribution to Nyquist Frequency, pages 5338–5345. John Wiley & Sons, New York, second edition.

Sibuya, M. (1960). Bivariate extreme statistics. I. Annals of the Institute of Statistical Mathematics, 11:195–210.

Sklar, A. (1959). Fonctions de repartition a n dimensions et leurs marges. Publications de L'Institut de Statistique de L'Universite de Paris, 8:229–231.

Sklar, A. (1996). Random variables, distribution functions, and copulas—a personal look backward and forward. In Distributions with fixed marginals and related topics (Seattle, WA, 1993), volume 28 of IMS Lecture Notes Monograph Series, pages 1–14. Institute of Mathematical Statistics, Hayward.

Skorokhod, A. V. (1956). Limit theorems for stochastic processes. Theory of Probability & Its Applications, 1(3):261–290.


Spearman, C. E. (1904). The proof and measurement of association between two things. The American Journal of Psychology, 15(1):72–101.

Taylor, M. D. (2007). Multivariate measures of concordance. Annals of the Institute of Statistical Mathematics, 59(4):789–806.

Tsukahara, H. (2005). Semiparametric estimation in copula models. The Canadian Journal of Statistics / La Revue Canadienne de Statistique, 33(3):357–375.

Tsukahara, H. (2011). Erratum to "Semiparametric estimation in copula models". The Canadian Journal of Statistics / La Revue Canadienne de Statistique, 39(4):734–735.

van der Vaart, A. W. (1998). Asymptotic statistics, volume 3 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge.

van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer Series in Statistics. Springer, New York.

Vitali, G. (1905). Sul problema della misura dei gruppi di punti di una retta. Tipografia Gamberini e Parmeggiani, Bologna.

Walker, H. M. (1929). Studies in the history of Statistical Method: With special reference to certain educational problems. The Williams & Wilkins Company, Baltimore.

Yan, J. (2007). Enjoy the joy of copulas: With a package copula. Journal of Statistical Software, 21(4):1–21.


Nomenclature

$(\Omega, \mathcal{A}, P)$ probability space, consisting of a set $\Omega$, a σ-algebra $\mathcal{A}$ and a probability measure $P$

$[0, 1]^d$ unit hypercube, i.e. $\times_{j=1}^{d} [0, 1]$

$\mu$ unless stated otherwise, vector of expected values

$F(x)$ vector of univariate functions $F_i$, evaluated at $x_i$, i.e. $(F_1(x_1), \dots, F_d(x_d))$

$x \circ y$ Hadamard product of two vectors $x, y \in \mathbb{R}^d$, i.e. $(x \circ y)_i = x_i y_i$ for $i \in \{1, \dots, d\}$, not to be confused with the composition of mappings as in $(f \circ g)(x) = f(g(x))$

$x_\pi$ vector $x \in \mathbb{R}^d$ whose components are permuted according to some permutation $\pi \in S_d$, i.e. $x_\pi = (x_{\pi(1)}, \dots, x_{\pi(d)})^\top$
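For concreteness, a small worked example of these two operations (the numbers are chosen arbitrarily for illustration): with $x = (1, 2, 3)^\top$ and $y = (4, 5, 6)^\top$,
\[
x \circ y = (1 \cdot 4,\; 2 \cdot 5,\; 3 \cdot 6)^\top = (4, 10, 18)^\top,
\]
and for the permutation $\pi \in S_3$ with $\pi(1) = 2$, $\pi(2) = 3$, $\pi(3) = 1$,
\[
x_\pi = (x_{\pi(1)}, x_{\pi(2)}, x_{\pi(3)})^\top = (x_2, x_3, x_1)^\top = (2, 3, 1)^\top.
\]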

$X_n \xrightarrow{\text{a.s.}} X$ almost sure convergence of random vectors, see Definition 3.2.4

$X_n \xrightarrow{d} X$ convergence in distribution, see Definition 3.2.4

$X_n \xrightarrow{P} X$ convergence in probability, see Definition 3.2.4

$BV_c[0, 1]^d$ functions on $[0, 1]^d$ with Hardy variation bounded by $c$ as in Definition 5.2.6

$\overline{C}$ survival copula of $C$ as in Definition 2.1.8

$\chi^2_\nu$ chi-square distribution with $\nu$ degrees of freedom

$E(X)$ expected value of a random variable $X$, i.e. $\int_{\mathbb{R}} x \, \mathrm{d}F(x)$ if $X \sim F$

$E^*(X)$ outer expectation of $X$ as in Definition 3.4.2

$F_n(x)$ empirical distribution function as in Definition 3.3.9, also $H_n$, not to be confused with the empirical copula $C_n$

$F_{jn}(x)$ empirical distribution function of (a one-dimensional margin) $F_j$ as in Definition 5.1.1
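As a reminder (the precise version used in this work is the one of Definition 3.3.9), the empirical distribution function of a sample $X_1, \dots, X_n$ takes the standard form
\[
F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}(X_i \le x),
\]
i.e. the proportion of observations not exceeding $x$.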

$\mathbb{1}(y \le 1)$ indicator function, i.e. $\mathbb{1}(y \le 1) = 1$ if $y \le 1$ and $\mathbb{1}(y \le 1) = 0$ otherwise; an equivalent notation is $\mathbb{1}_{(-\infty, 1]}(y)$

$\kappa(X, Y)$ measure of concordance of the random variables $X$ and $Y$ as in Definition 2.6.5, also $\kappa_{X,Y}$ and $\kappa_C$

$\lambda^d$ Lebesgue–Borel measure


$\lambda_L(X, Y)$ lower tail dependence coefficient as in Definition 2.6.15, also $\lambda_L(C)$

$\lambda_U(X, Y)$ upper tail dependence coefficient as in Definition 2.6.15, also $\lambda_U(C)$
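In the bivariate case these coefficients admit the familiar copula representations (standard formulas, stated here only for orientation and assuming the limits exist; Definition 2.6.15 is authoritative for the form used in this work):
\[
\lambda_L(C) = \lim_{u \downarrow 0} \frac{C(u, u)}{u}, \qquad \lambda_U(C) = \lim_{u \uparrow 1} \frac{1 - 2u + C(u, u)}{1 - u}.
\]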

$\lceil x \rceil$ ceiling function, i.e. the smallest integer $z \in \mathbb{Z}$ such that $z \ge x$

$\|f\|_\infty$ supremum norm of $f\colon D \to \mathbb{R}$, i.e. $\sup_{x \in D} |f(x)|$, where $D \subset \mathbb{R}^d$ is the domain of $f$

$\|f\|_{H,2}$ $L^2(H)$-norm of some function $f\colon \mathbb{R}^d \to \mathbb{R}$ as in Definition 3.5.6

$\mathcal{B}^d$ $d$-dimensional Borel σ-algebra

$S_d$ symmetric group of $d$ elements, i.e. the set of all permutations of $\{1, \dots, d\}$ (all bijective mappings $\pi\colon \{1, \dots, d\} \to \{1, \dots, d\}$)

$\mathcal{U}_d$ $d$-dimensional unit hypersphere, i.e. $\{x \in \mathbb{R}^d \mid x^\top x = 1\}$

$\operatorname{cov}(X, Y)$ covariance of the random variables $X$ and $Y$, i.e. $\operatorname{cov}(X, Y) = E(XY) - E(X)E(Y)$

$\operatorname{Exp}(\lambda)$ exponential distribution with rate $\lambda$, i.e. with mean $1/\lambda$

$E_d(\mu, \Sigma, \phi)$ $d$-dimensional elliptical distribution with parameters $(\mu, \Sigma, \phi)$

$\operatorname{id}$ identity mapping, i.e. $\operatorname{id}(x) = x$

$N(\mu, \Sigma)$ (multivariate) normal distribution with expectation $\mu$ and covariance matrix $\Sigma$

$\operatorname{ran}(F)$ range of $F$, i.e. the set $\{y \in \mathbb{R} \mid \exists x \in \mathbb{R}\colon F(x) = y\}$

$U[a, b]$ uniform distribution on the interval $[a, b]$

$\operatorname{var}(X)$ variance of a random variable $X$, i.e. $\operatorname{var}(X) = E(X^2) - (E(X))^2$

$\mathcal{C}^d$ the set of all $d$-dimensional copulas

$\mathcal{P}([0, 1])$ power set, i.e. the set of all subsets of $[0, 1]$

$\mu_n \Rightarrow \mu$ weak convergence of measures as in Definition 3.2.5 and Proposition 3.2.6

$\nu_C(A)$ copula volume of a hyperrectangle $A \subset [0, 1]^d$, induced measure of $C$ for arbitrary $A \subset [0, 1]^d$, i.e. if $U \sim C$, then $\nu_C(A) = P(U \in A)$

$\overline{\mathbb{R}}$ closure of $\mathbb{R}$, i.e. $\mathbb{R} \cup \{-\infty, \infty\}$

$\overline{F}$ survival function of the distribution function $F$, i.e. $\overline{F}(x) := P(X > x)$, where $X \sim F$

$\overline{X}_n$ mean of $X_1, \dots, X_n$, i.e. $\overline{X}_n := \frac{1}{n} \sum_{i=1}^{n} X_i$

$\Phi$ cumulative distribution function of the standard normal distribution, i.e. of $N(0, 1)$

$\phi$ characteristic function, generating function for elliptical distributions as in Definition 2.4.1

$\Pi$ independence copula

$\prec$ concordance ordering, $C_1 \prec C_2$ means $C_1$ is less concordant than $C_2$

$P_*(B)$ inner probability of $B$ as in Definition 3.4.6


$RV_\alpha$ set of functions which are regularly varying with index $\alpha$ as in Definition 2.6.18

$\rho^{(S)}_{X,Y}$ Spearman's rho as in Definition 2.6.8, also $\rho^{(S)}_C$

$\rho_{X,Y}$ correlation coefficient of $X$ and $Y$ as in Definition 2.6.1

$\Sigma$ positive semi-definite matrix, in most cases a covariance or correlation matrix

$\sigma(\mathcal{G})$ smallest σ-algebra generated by the set $\mathcal{G}$

$\sigma_X$ standard deviation of a random variable $X$, i.e. $\sigma_X^2 = E(X^2) - (E(X))^2$

$\succ$ concordance ordering, $C_1 \succ C_2$ means $C_1$ is more concordant than $C_2$

$\tau_{X,Y}$ Kendall's tau as in Definition 2.6.9, also $\tau_C$
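For orientation, the familiar bivariate copula forms of these two measures of concordance (standard identities; Definitions 2.6.8 and 2.6.9 give the versions used in this work):
\[
\rho^{(S)}_C = 12 \int_{[0,1]^2} C(u, v) \, \mathrm{d}u \, \mathrm{d}v - 3, \qquad \tau_C = 4 \int_{[0,1]^2} C(u, v) \, \mathrm{d}C(u, v) - 1.
\]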

$\varphi$ (Archimedean) generator, parametric form: $\varphi_\theta$

$\varphi^{-}$ inverse generator as in (2.4.3)

$C_n(u)$ empirical copula as in (5.1.1), not to be confused with the empirical distribution function $C_n$ of a copula $C$
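Since the empirical copula plays a central role in Chapter 5, the following minimal Python sketch may help to make the object concrete. It is illustrative only: the function name is chosen for this example, ties are not treated specially, and the pseudo-observations are normalized by $n$ (one common convention; others, such as dividing by $n + 1$, appear in the literature, and (5.1.1) is authoritative for the definition used here).

import numpy as np

def empirical_copula(sample, u):
    """Evaluate an empirical copula C_n at a point u in [0, 1]^d.

    sample: (n, d) array of observations; u: (d,) array.
    Illustrative sketch only; ties are ignored.
    """
    n, d = sample.shape
    ranks = np.empty_like(sample, dtype=float)
    for j in range(d):
        # rank of each observation within its margin (1, ..., n)
        ranks[:, j] = np.argsort(np.argsort(sample[:, j])) + 1
    pseudo_obs = ranks / n  # rank-transformed sample in (0, 1]
    # proportion of pseudo-observations dominated by u in every coordinate
    return np.mean(np.all(pseudo_obs <= u, axis=1))

# example: 100 draws from a bivariate normal with positive correlation
rng = np.random.default_rng(1)
z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=100)
print(empirical_copula(z, np.array([0.5, 0.5])))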

$B_\varepsilon(x)$ open ball with radius $\varepsilon > 0$ around $x$

$C[0, 1]^d$ space of all continuous and real-valued functions on the hypercube $[0, 1]^d \subset \mathbb{R}^d$ as in Definition 3.3.1

$C_L$ smallest Archimedean copula (pointwise)

$C_{i|1,\dots,i-1}$ conditional margin as in Lemma 2.5.2

$d$ dimension of a copula or the number of elements in a vector, $d \ge 2$ unless stated differently

$D[0, 1]^d$ space of all real-valued càdlàg functions on $[0, 1]^d$ as in Definition 3.3.1

$D_k$ Debye function of order $k$ as in Example 2.6.14

$F^{-}$ generalized inverse of $F$

$H_n \Rightarrow H$ $\lim_{n \to \infty} H_n(x) = H(x)$ for all $x$ where $H(x)$ is continuous, see Definition 3.2.5

$J^{\circ}$ the interior of the set $J$, i.e. all points $x \in J$ for which an open set $O \subset J$ exists such that $x \in O$

$\ell^\infty[0, 1]^d$ space of all real-valued bounded functions on $[0, 1]^d$ as in Definition 3.3.1

$M_d$ upper $d$-dimensional Fréchet–Hoeffding bound, also $M$

$N_{[\,]}(\varepsilon, S, \|\cdot\|)$ bracketing number as in Definition 3.5.6

$t_\nu(\mu, \Sigma)$ (multivariate) $t$-distribution with $\nu$ degrees of freedom, location vector $\mu$ and scale matrix $\Sigma$

$VV(f)$ Vitali variation of $f$ as in Definition 5.2.6

$W_d$ lower $d$-dimensional Fréchet–Hoeffding bound, also $W$

$X_n \rightsquigarrow X$ weak convergence of $X_n$ to $X$ as in Definition 3.4.3


Index

almost sure convergence, 38
Archimedean
    copula, 16
    generator, 16
asymptotic tightness, 49

Bernstein's theorem, 18
Blomqvist's beta, 31
Borel
    set, 35
    σ-algebra, 35
bounded variation, 99
bracket, 53
bracketing number, 53
Brownian
    bridge, 51
    motion, 51
    sheet, 55

C-volume, see H-volume
càdlàg-functions, 41
central limit theorem, 40
Clayton-copula, 19
completely monotone, 17
concordance, 25
continuous mapping theorem, 40, 50
convergence
    almost sure, 38
    in distribution, 38
    in probability, 38
copula, 6
    tν-, 15
    Archimedean, 16
    Clayton-, 19
    elliptical, 15
    empirical, 91
    Frank-, 19
    Gaussian, 15
    Gumbel-, 20
    hierarchical Archimedean, 20
    independence, 6
    jointly symmetric, 63
    maximal non-exchangeable, 76, 85
    nested Archimedean, 20
    quasi-, 8
    radially symmetric, 63
    sub-, 7
    survival, 7
correlation coefficient, 24
covariance, 24

d-increasing, 6
d-monotone, 18
Debye-function, 31
degenerate random variable, 39
dense, 41
distribution function, 38
Donsker's theorem, 52

elliptical
    copula, 15
    distribution, 14
empirical
    copula, 91
    distribution function, 44
    process, 50
ε-bracket, 53
exchangeable
    copula, see exchangeable mapping
    mapping, 66
    random vector, 66

Fréchet-Hoeffding-bounds, 8
Frank-copula, 19
functional delta method, 99

Gaussian
    copula, 15
    process, 51
generalized inverse, 11
generating set, 92
generator, 16
Gini's gamma, 31
grounded mapping, 6
Gumbel-copula, 20

H-volume, 5
Hadamard
    differentiable (tangentially to a set), 99
    product, 58
Hardy variation, 99
hierarchical Archimedean copula, 20

independence copula, 6
indicator function, 17
inner probability, 49

joint symmetry, 58
jointly symmetric copula, 63

Kendall's tau, 27

law of large numbers, 39
Lebesgue-Borel-measure, 36
Lindelöf property, 41
Lipschitz-condition, 9

marginal symmetry, 58
Marshall-Olkin-algorithm, 24
maximal non-exchangeable copula, 76, 85
measurable
    mapping, 36
    set, 36
    space, 36
measure, 36
    induced, 37
    space, 36
measure of concordance, 26
    Blomqvist's beta, 31
    Gini's gamma, 31
    Kendall's tau, 27
    Spearman's rho, 27
measure of non-exchangeability, 73
monotone
    d-, 18
    completely, 17

nested Archimedean copula, 20
non-degenerate random vector, 15
non-measurable set, 45
normal distribution, 15

outer expectation, 48

path (of a random process), 47
Pearson's correlation coefficient, 24
portmanteau theorem, 39
probability
    measure, 36
    space, 36
process
    Gaussian, 51
    random, 47
    Wiener, 51

quasi-copula, 8

radial symmetry, 58
radially symmetric copula, 63
random
    function, 42
    process, 47
    variable, 37
    vector, 37
rank transformed sample, 91
regular variation, 33
Rosenblatt-transform, 21

separable space, 41
σ-algebra, 35
Sklar's theorem, 12
Skorokhod metric, 47
slowly varying, 33
Spearman's rho, 27
sub-copula, 7
survival copula, 7
symmetry, 57
    joint, 58, 63
    marginal, 58
    radial, 58, 63
    univariate, 57

tν-copula, 15
tν-distribution, 15
tail dependence
    lower, 32
    upper, 31
theorem
    Bernstein, 18
    continuous mapping, 40, 50
    Donsker, 52
    Lindeberg-Lévy, 40
    portmanteau, 39
    Rosenblatt, 21
    Sklar, 12
tightness, 49

uniform
    margins, 6
    tightness, 49

variation
    Hardy, 99
    Vitali, 99
Vitali variation, 99

weak convergence, 48
    of distribution functions, 39
    of probability measures, 39
    of random processes, 48
Wiener process, 51


The curriculum vitae is not included in this version of the thesis for reasons of data protection.


Parts of this dissertation have already been published in the following articles:

Harder, M. and Stadtmüller, U. (2014). Maximal non-exchangeability in dimension d. Journal of Multivariate Analysis, 124:31–41.

Harder, M. and Stadtmüller, U. (2015). Testing exchangeability of copulas in arbitrary dimension. Under revision.