Johannes Kepler Universität Linz (JKU)

Faculty of Engineering and Natural Sciences (Technisch-Naturwissenschaftliche Fakultät)

Mathematical Modelling and Simulation of Ion Channels

DISSERTATION

submitted for the academic degree of

Doktorin

in the doctoral programme in

Technical Sciences (Technische Wissenschaften)

Submitted by:

Dipl.-Math.techn. Kattrin Arning

Carried out at:

Radon Institute for Computational and Applied Mathematics

Assessors:

Univ. Prof. Dipl. Ing. Dr. Martin Burger (supervisor)
Univ. Prof. Dipl. Ing. Dr. Christoph Romanin

Linz, October 2009

Mathematics is the alphabet with which God has written the world.

Galileo Galilei

Abstract

This work is concerned with the mathematical modelling and simulation of ion channels. Ion channels are of major interest and form an area of intensive research in biophysics and medicine, since they control many vital physiological functions. As certain aspects of ion channel structure and function are hard or impossible to address in experimental investigations, mathematical models are a useful complement and alternative to these studies.

This thesis mainly deals with two aspects of channel behaviour, namely ion conduction and gating.

The first part is dedicated to the description of ion transport across single open channels, focusing on a macroscopic model composed of a system of coupled nonlinear partial differential equations (PDEs), known as the Poisson-Nernst-Planck (PNP) system in the biological context. A one-dimensional approximation of this PDE system is derived by introducing additional potentials to account for the channel protein geometry. Furthermore, a computationally efficient way to include size-exclusion effects, which become important in narrow geometries, is developed.

Since for most ion channels the structure of the selectivity filter (the region where the specificity of the ion channel is determined) cannot yet be resolved in experiments, the PNP model is subsequently used to address questions from the field of inverse problems. It is investigated whether electrophysiological measurements such as current-voltage curves can be used to characterise the underlying channel structure. A special focus is put on the use of surrogate models in the identification procedure.

The second part of the thesis deals with the opening and closing of ion channels, a process known as gating. Different modelling approaches that can be used to simulate the behaviour of voltage-gated ion channels are presented, and a model of Fokker-Planck type is derived to describe the gating currents and open probabilities on a macroscopic, i.e. whole-cell, level. This model is then used to analyse certain characteristic features arising in gating current data, such as the existence of a rising phase under certain conditions.

As in the case of ion conduction, the derived gating model is subsequently employed to address inverse problems from the field of parameter identification. Macroscopic current data are used to investigate what can be inferred about the underlying physical system.


Zusammenfassung (Summary)

This thesis is concerned with the mathematical modelling and simulation of ion channels. Ion channels are of great importance in biophysics and medicine and are the subject of intensive research, since they control and regulate many vital functions. Because certain aspects of their protein structure and function cannot yet be investigated, or only with great difficulty, using today's experimental methods, mathematical models provide a helpful complement and alternative.

The thesis concentrates mainly on two aspects of channel function: ion transport through single channel proteins, and the opening and closing of the channels, a process also known as gating.

The first part deals with modelling ion transport through single open channels. The main focus is on a macroscopic model consisting of coupled nonlinear partial differential equations (PDEs), known in the biological context as the Poisson-Nernst-Planck (PNP) model. Based on additional potentials that account for the protein geometry, a one-dimensional approximation of the PNP system is derived, and an efficient way of computing additional local interactions between the individual ions is developed. These interactions become particularly important in the narrow geometry of the channel filter.

Since no crystal structure of the filter region is available so far for most ion channels, the PNP model is subsequently used to address questions from the field of inverse problems. It is investigated whether anything can be inferred about the underlying structure of the filter from electrophysiological data such as current-voltage curves. A particular focus is the study of surrogate models.

The second part of the thesis deals with the gating of channels. Different modelling approaches for simulating voltage-gated channels are presented, and a model based on a Fokker-Planck equation is developed. This model is then used to analyse certain characteristic properties of gating currents.

As in the first part on ion transport, the gating model is subsequently used to address questions from the field of parameter identification. Based on macroscopic data, it is investigated which properties of the underlying physical system can be identified.


Acknowledgments

First of all I would like to thank Prof. Martin Burger for his supervision and the time he invested in my work. Special thanks also go to Prof. Christoph Romanin for being the co-referee of my thesis and to Prof. Heinz Engl for introducing me to the PhD program “Molecular Bioanalytics” and for the great scientific environment he provides at RICAM.

Furthermore I would like to thank Prof. Peter Pohl and the other organizers and members of the PhD program for creating this great research opportunity in Linz. In particular I would like to thank Angela Vlad for her friendship and encouragement throughout my work.

I also owe a great deal to Prof. Bob Eisenberg, who introduced me to the field of biophysics and to ion channels in particular. I would like to thank him for all the helpful and lively discussions over the last years.

I am very grateful to my colleagues at RICAM for the friendly environment, especially to my former office mate Marie-Therese Wolfram and to Clemens Zarzer for all the helpful discussions about work and beyond. In addition I very much appreciated the help of Rainer Schindl from the biophysics department, who patiently answered all my questions concerning ion channels and membranes.

Special thanks go to my husband Markus and my family and friends for all their encouragement and their steady support concerning work and life.

This research was supported as part of the PhD program “Molecular Bioanalytics: From molecular recognition to membrane transport” by the Austrian Science Fund (FWF) through the project grant DK W1201-N13 and by the Austrian Academy of Sciences through RICAM.


Contents

1 Introduction
1.1 Background
1.2 On this work

2 Modelling ion conduction
2.1 The Poisson-Nernst-Planck model
2.1.1 Size-exclusion effects
2.1.2 Derivation of the 1D-Poisson-Nernst-Planck model
2.1.3 Analysis of the 1D-Poisson-Nernst-Planck model
2.2 Other modelling approaches
2.3 Evaluation of the different models

3 Inverse problems with PNP
3.1 Inverse problems - basic setup
3.1.1 Parameter identification
3.1.2 Design problems
3.2 The full forward model
3.3 Surrogate models
3.3.1 Surrogate model with fixed DFT
3.3.2 Linear surrogate model
3.4 Comparison of the models

4 Gating of Ion Channels
4.1 Different models of channel gating
4.1.1 Discrete state Markov models
4.1.2 Fokker-Planck type models
4.1.3 Statistics from single channel recordings
4.1.4 Comparison of the different models
4.2 Analysis of the gating current
4.2.1 The general case
4.2.2 The one-dimensional model

5 The Cole-Moore effect
5.1 Cole-Moore and the different gating models
5.2 A mathematical model

6 Inverse problems related to gating
6.1 Inverse problems - basic setup
6.2 A one-dimensional model
6.2.1 Identification of potential
6.2.2 Combined potential and diffusion coefficient

7 Concluding Remarks

A Adjoint systems
A.1 Adjoint system for full PNP model
A.2 Adjoint system for linear surrogate model
A.3 Adjoint system for the gating model

Chapter 1

Introduction

1.1 Background

“Ion channels are proteins with a hole down their middle” - this basic message already conveys the most important fact about ion channels: they form the passageway for ions across otherwise impermeable cell membranes. Being present in every living cell of every living organism, ion channels control many vital functions such as signal transduction, muscle contraction and regulation of blood pressure. Of course ion channels are more than just a hole. They regulate the flow of ions into and out of the cell, with each channel type being permeable only to particular ion species.

Already in the 19th century researchers like Sidney Ringer ([92], [93]) and Walther Nernst ([80], [81]) began to investigate and understand the importance of ions for basic physiological functions. Since then tremendous insights have been gained, and experimental techniques as well as simulation tools for the investigation of ion transport across membranes have been developed.

Traditionally two different classes of transport mechanisms are distinguished: the transporters (or carriers) and the channels (or pores) ([47]). The first class is associated with active transport across the cell membrane, a process requiring energy. A prominent example of a transporter is the Na+-K+ pump, which moves Na+ ions out of the cell and K+ ions into the cell, thereby establishing a concentration gradient between the inside of the cell and its surroundings. As both ion species are transported against their concentration gradients, this process consumes energy in the form of ATP. Ion channels, on the other hand, are related to passive transport across the membrane. They were shown to exhibit the following three fundamental properties ([70]):

1) Ion channels conduct ions rapidly.

2) Many ion channels are highly selective for certain ion species.

3) Their function is regulated by gating, i.e. turning the ion conduction on and off in response to specific environmental stimuli.

The modern history of ion channels began in the 1950s, when Hodgkin and Huxley performed their famous experiments on the squid giant axon ([49], [50], [51], [52]). Subsequently, electrophysiological methods were used to demonstrate that ions like Na+ and K+ indeed cross the membrane via unique protein pores ([7], [46]). These studies also led to the identification of the basic parts of an ion channel: The selectivity filter in the pore determines which ions can pass through the channel, e.g. the KcsA potassium channel mainly conducts K+ ions. The gate is responsible for opening and closing the channel in response to some environmental stimulus, e.g. a change in membrane voltage, the binding of a specific ligand or some mechanical distortion of the membrane. The sensing device in turn is responsible for detecting these changes in the environment and conveying this information in some way to the gate. A prominent example is the voltage sensor in voltage-gated ion channels, which reacts to changes in the membrane voltage.

The invention of the patch clamp technique by Erwin Neher and Bert Sakmann in the 1970s ([76]) made it possible to perform measurements on single channels, and new experimental approaches and data became available. Developments in the fields of protein expression, mutational techniques and crystallography provided more and more insight into the structure of ion channels, culminating in the first high-resolution structure of an ion channel, namely the potassium channel KcsA, in 2001 ([25], [121]). For this contribution Roderick MacKinnon was awarded the Nobel Prize in Chemistry in 2003, together with Peter Agre “for the discovery of water channels”.

Alongside these seminal experimental developments and findings, theoretical investigations and mathematical modelling developed in parallel. At the end of the 19th century the basic equations of electro-diffusion were formulated by Nernst and Planck in the ionic context ([79], [88]). Following their experiments on the squid giant axon, Hodgkin and Huxley also proposed a mathematical model describing the potassium and sodium currents across the membrane. (Together with J. C. Eccles, Hodgkin and Huxley were awarded the Nobel Prize in Physiology or Medicine in 1963 “for their discoveries concerning the ionic mechanisms involved in excitation and inhibition in the peripheral and central portions of the nerve cell membrane”.) With the increasing use of computers after the Second World War, numerical simulations and computer modelling began to emerge as an alternative to experimental studies. The first basic Molecular Dynamics simulations were carried out as early as the late 1950s ([4], [5]).

As with the experimental development, the theoretical modelling of ion transport and channel gating has advanced greatly over the past decades, leading to more and more realistic characterizations. Since there is nowadays a variety of different modelling approaches, ranging from detailed atomistic descriptions to macroscopic continuum models and kinetic models for channel gating, there is also a controversy about which models are adequate and help in the further understanding of the underlying physical processes.

In this thesis we will mainly focus on the investigation of the electro-diffusion model (also referred to as the Poisson-Nernst-Planck model in the biological context) for the description of ion transport. We are aware that this model has its limitations (as do the other modelling approaches), but since we want to turn our attention to questions from the area of inverse problems it is the most suitable for our needs. Apart from ion conduction we are also going to derive models of channel gating and again investigate inverse problems such as parameter identification on the basis of electrophysiological data.


This work is a further step in investigating how mathematical techniques can be used to gain additional insight into ion channel structure and function, and hence to contribute to a better knowledge and understanding of their behaviour.

1.2 On this work

Following the above general introduction to the historical and biophysical background of this thesis, the remainder of this work is organized as follows:

In Chapter 2 we introduce the Poisson-Nernst-Planck (PNP) system as a model to describe ion transport through single open pores. After stating the general three-dimensional equations, a computationally efficient way to introduce size-exclusion effects into the model is demonstrated, and alternative ways to derive a one-dimensional approximation of the setup are shown. Finally other modelling approaches for ion conduction are presented and their individual advantages and disadvantages are discussed.

In Chapter 3 the PNP model is used to address parameter identification problems. Besides the full model, two different surrogate models are also used to carry out the identification task in an iterative algorithm. Their performance is investigated and compared to the results from the full model.

In Chapter 4 we turn from ion conduction to the gating behaviour of voltage-gated ion channels. Different mathematical models are presented and a continuum model of Fokker-Planck type is derived. With the help of this model we analyse certain characteristic features appearing in macroscopic (e.g. whole-cell) gating currents.

In Chapter 5 a specific time delay in the macroscopic ionic current under certain experimental conditions, known as the Cole-Moore effect, is considered. We investigate which of the models introduced in the previous chapter are able to generate such a time delay and propose a simple two-state Markov model with a newly introduced time dependence in the rate constants that is in principle capable of generating the desired delay.

In Chapter 6 we again turn our attention to inverse problems, this time using the Fokker-Planck type gating model derived in Chapter 4 to perform parameter identification with respect to the energy landscape of the voltage sensor.

In Chapter 7 we conclude by pointing out some interesting open questions and prospects for future research.


Chapter 2

Modelling of ion conduction on single-channel basis

As ion channels are of major importance for the function of any living organism, great effort has been made to understand and manipulate their behaviour. Experimental investigations using electrophysiological methods gained attention in the 1950s with the famous studies on squid giant axons performed by Kenneth S. Cole and John W. Moore ([22], [23]) and by Hodgkin, Huxley and Katz ([49], [50], [53]). Besides these and other experimental techniques, theoretical approaches also became a field of intensive research ([11], [52], [100]). The basic idea behind those theoretical investigations was (and still is) to find the main components, i.e. the driving forces and physical laws, that explain the behaviour of ion channels. Several different approaches can be taken to address this question, ranging from very detailed models on the atomic level to continuum models describing the average behaviour of the system.

In this work we want to focus on a macroscopic model based on the Poisson-Nernst-Planck equations. After introducing the general equations in a three-dimensional setup, we will concentrate on the development of a one-dimensional approximation, since in ion channel geometry one direction (the axial direction) usually has a much larger extension than the remaining two (cross-sectional) directions.

Different microscopic modelling approaches such as molecular dynamics will be briefly discussed afterwards, and we wrap up this chapter with a discussion of the advantages and disadvantages of the various model types.

2.1 The Poisson-Nernst-Planck model

The macroscopic model based on the Poisson-Nernst-Planck (PNP) equations is used to describe the transport of charged ions through a single open channel pore. The underlying assumption is that the ion movement through the pore is mainly driven by diffusion and electrostatic interactions among the moving ions, protein charges, and an externally applied electric field. PNP is a mean field approach, i.e. time averages of the electrostatic field are computed rather than the fluctuations on atomic time scales. Omitting molecular detail, the model describes the average charge densities of the ions instead of the movements of individual ions. (The latter is done in molecular dynamics simulations, see Section 2.2.)

The PNP equations have been used for decades as a standard model to describe electro-diffusion in various systems. Apart from their application to ion channel transport they have been used with great success in the simulation of semiconductor devices ([116], [105] and references therein). Semiconductor systems are not that different from ion channels in their basic transport characteristics, and hence it has been a natural extension to apply the PNP equations in both the technical and the biological context.

The PNP system in three dimensions consists of a Poisson equation for the electrostatic potential $V$ and a set of continuity equations for the ion densities $\rho_k$:

\begin{align}
-\operatorname{div}(\epsilon \nabla V) &= e_0 \sum_{k=1}^{m} z_k \rho_k \tag{2.1a} \\
\frac{\partial \rho_k}{\partial t} &= -\operatorname{div}(J_k) && \forall\, k = 1, \dots, m \tag{2.1b} \\
J_k &= -\frac{1}{k_B T} D_k \rho_k \nabla \mu_k && \forall\, k = 1, \dots, m \tag{2.1c} \\
\mu_k &= k_B T \ln(\rho_k / \rho_{scale}) + z_k e_0 V + \mu_k^{ex} + \mu_k^{0} && \forall\, k = 1, \dots, m. \tag{2.1d}
\end{align}

Here $\epsilon$ denotes the local dielectric coefficient, $k_B$ is the Boltzmann constant, $T$ is the absolute temperature and $e_0$ is the unit charge. The index $k$ refers to the $k$-th ion species and $m$ gives the total number of different ionic species present in the system. In the context of ion channels we distinguish between two basic types of ion species: the free ions comprised of the ions in solution (such as K+, Na+ or Cl−), and the confined species made up of the charged groups that belong to the channel protein. In the following the $m$-th species will always refer to the confined species used to model the charged protein residues lining the inner channel wall. The valence of each ion species is denoted by $z_k$ and the species-dependent diffusion coefficient by $D_k$. Generally $D_k = D_k(x)$ is a space-dependent function. The electrochemical potential $\mu_k$ can be decomposed into three parts: the ideal part constituted by diffusion and mean field electrostatic interactions (the first two terms on the right-hand side), local short-range electrochemical interactions $\mu_k^{ex}$ and additional externally applied potentials $\mu_k^{0}$. A derivation of the PNP system in the ion channel context can be found e.g. in [104]. A detailed introduction to electro-diffusion systems is also given by Rubinstein in his book Electro-Diffusion of Ions ([100]).

The main focus is put on modelling the ion transport through the filter region of the channel, assumed to be the rate-limiting step in the transport process across the whole channel. In the most simplified setup this filter region is modelled as a cylinder attached to two conical atria that open into the baths on either side of the channel (see Figure 2.1). Modifications of this simplified geometry can of course be made, e.g. by introducing a larger cavity between filter region and bath on one side in accordance with the structure of most ion channels.

Let $\Omega \subset \mathbb{R}^3$ denote the system domain, i.e. the channel plus adjacent baths on either side. In the baths we model the applied electrostatic potential $U$ and the fixed bath concentrations $\eta_k$ of the free ions on the Dirichlet boundary by conditions of the form

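\begin{align}
V &= U && \text{on } \Gamma_D \tag{2.2a} \\
\rho_k &= \eta_k && \text{on } \Gamma_D, \quad k = 1, \dots, m-1 \tag{2.2b} \\
\frac{\partial \mu_m}{\partial n} &= 0 && \text{on } \Gamma_D, \tag{2.2c}
\end{align}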


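where $\partial \cdot / \partial n$ denotes the normal derivative and $\Gamma_D \subset \partial\Omega$. Note that the confined species (index $m$) is not present in the bath and hence no Dirichlet boundary values are prescribed for this species. On the insulated part $\Gamma_N = \partial\Omega \setminus \Gamma_D$ of the boundary we use homogeneous Neumann boundary conditions, i.e.

\begin{align}
\frac{\partial V}{\partial n} &= 0 && \text{on } \Gamma_N \tag{2.3a} \\
\frac{\partial \mu_k}{\partial n} &= 0 && \text{on } \Gamma_N, \quad k = 1, \dots, m. \tag{2.3b}
\end{align}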

The fixed bath concentrations $\eta_k$ of the free ion species are constrained by demanding charge neutrality, i.e.

\[ \sum_{k=1}^{m-1} z_k \eta_k = 0. \]
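The neutrality constraint is easy to check numerically before a simulation is set up. The following minimal Python sketch does so; the function name `is_charge_neutral`, the tolerance and the example bath composition are assumptions chosen for illustration and do not come from this work.

```python
def is_charge_neutral(valences, concentrations, tol=1e-12):
    """Return True if sum_k z_k * eta_k vanishes up to the tolerance tol."""
    if len(valences) != len(concentrations):
        raise ValueError("need exactly one concentration per ion species")
    net_charge = sum(z * eta for z, eta in zip(valences, concentrations))
    return abs(net_charge) <= tol


# Hypothetical bath (illustration only): 100 mM NaCl plus 50 mM CaCl2.
valences = [+1, +2, -1]            # Na+, Ca2+, Cl-
bath_mM = [100.0, 50.0, 200.0]     # Cl- balances both cation species
print(is_charge_neutral(valences, bath_mM))
```

Only the free species enter the check, in line with the sum running up to m − 1.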

The above system (2.1) together with (2.2) and (2.3) has the same structure as the drift-diffusion equations (another name for the PNP equations frequently used in physics) used to model semiconductor devices ([71], [105]), but in the ion channel context more species are included in the system (as compared to only electrons and holes in semiconductor applications).

Figure 2.1: Simplified sketch of the two-dimensional system (channel between two baths, with boundary parts ΓD and ΓN).

For ion channel modelling the above system is supplemented with additional constraints for the confined species. The confined species represents the charged protein residues lining the selectivity filter of the channel. It is modelled in equilibrium, $\nabla \mu_m = 0$, and the number of protein charges is fixed by

\begin{equation}
\int_{\Omega_{filter}} \rho_m \, dx = N_m. \tag{2.4}
\end{equation}

Here $N_m$ denotes the total number of confined particles and $\Omega_{filter}$ describes the region of confinement (in our case this will usually be the selectivity filter of the channel protein). Finally the measured current $I$ flowing through the channel from one bath to the other is given by

\[ I = e_0 \sum_{k=1}^{m-1} \int_{\Gamma_0} z_k J_k \cdot dn, \]

where $\Gamma_0 \subset \Gamma_D$ constitutes one part of the Dirichlet boundary. Note that only the free ions contribute to the measured currents, since the confined species is not present in the baths.
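To illustrate how the flux (2.1c) follows from the electrochemical potential (2.1d), the following sketch evaluates both on a one-dimensional grid, keeping only the ideal part of the potential (the excess and external terms are set to zero). The grid, the linear potential profile, the diffusion coefficient and the temperature are hypothetical values, and the finite-difference gradient from NumPy is a simple stand-in rather than the discretisation used in this work. For a Boltzmann-distributed density the potential is constant and the flux should vanish, whereas a uniform density leaves only the drift term.

```python
import numpy as np

K_B = 1.380649e-23      # Boltzmann constant in J/K
E_0 = 1.602176634e-19   # unit charge in C


def ideal_flux_1d(x, rho, V, z, D, T=298.15, rho_scale=1.0):
    """Flux J = -(D / (k_B T)) * rho * d(mu)/dx with the ideal potential
    mu = k_B T ln(rho / rho_scale) + z e_0 V (excess/external terms omitted)."""
    mu = K_B * T * np.log(rho / rho_scale) + z * E_0 * V
    dmu_dx = np.gradient(mu, x)   # finite-difference derivative on the grid
    return -D / (K_B * T) * rho * dmu_dx


# Hypothetical test data: linear potential drop of 100 mV over 10 nm.
x = np.linspace(0.0, 10e-9, 201)
V = 0.1 * x / x[-1]
z, D, T = 1, 1e-9, 298.15

# Boltzmann-distributed (dimensionless) density: mu is constant, flux ~ 0.
rho_eq = np.exp(-z * E_0 * V / (K_B * T))
J_eq = ideal_flux_1d(x, rho_eq, V, z, D, T)

# Uniform density: only the drift term -D z e_0 / (k_B T) * rho * dV/dx remains.
rho_uniform = np.ones_like(x)
J_drift = ideal_flux_1d(x, rho_uniform, V, z, D, T)
```

The densities are taken relative to `rho_scale`, so the flux values carry no physical units here; the point is only the balance between the diffusion and drift contributions.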


The classical PNP approach ignores the short-range interaction terms $\mu_k^{ex}$ and additional external potentials $\mu_k^{0}$ (compare (2.1d)). However, especially the former will become important when considering the restricted geometry of the selectivity filter. As space is limited inside the filter region, crowding effects gain importance for the proper determination of the particle densities. They can be included by means of density functional theory or a local density approximation, as will be discussed in the next section.

Chemical energy contributions such as dehydration energies for the individual ions are not included in the classical PNP equations, which is one of the critical points about the PNP model for ion transport across membrane channels. The external potentials $\mu_k^{0}$ could be used to include such additional effects in the model, provided a reasonable assumption for assigning the corresponding energy barriers is available.

Before going into more detail concerning the size-exclusion effects inside the selectivity filter, we briefly mention the scaling of the equations used for computational purposes and asymptotics, which renders the system dimensionless.

Scaling of the equations

Appropriate scaling and non-dimensionalization of mathematical models is important from a numerical point of view when the system is to be solved on a computer. In addition, non-dimensionalization and proper scaling can help in the structural analysis of the system: due to the scaling, small parameters may appear in front of certain terms. This allows limiting cases (such as setting the parameter equal to zero) to be considered analytically and thereby gives some insight into the qualitative behaviour of the system. Such analytical methods are summarized under the name perturbation theory in the mathematical context. An introduction to this field can be found e.g. in [55].

For our PNP system the main equations are given by

\[ -\operatorname{div}(\epsilon \nabla V) = e_0 \sum_{k=1}^{m} z_k \rho_k \]

and

\[ \frac{\partial \rho_k}{\partial t} = \operatorname{div}\left( D_k \left( \nabla \rho_k + z_k \frac{e_0}{k_B T} \rho_k \nabla V + \frac{1}{k_B T} \rho_k \nabla \mu_k^{ex} + \frac{1}{k_B T} \rho_k \nabla \mu_k^{0} \right) \right). \]

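The dielectric coefficient $\epsilon(x) = \epsilon_r(x) \epsilon_0$ can be decomposed into the relative permittivity $\epsilon_r$ and the permittivity of free space $\epsilon_0$. Let $\bar{L}$, $\bar{V}$ and $\bar{\rho}$ denote some typical length, voltage and density in the system, respectively, and $\bar{D}$ a typical diffusion coefficient. With the scaled quantities

\[ x_s = x/\bar{L}, \quad V_s = V/\bar{V}, \quad \rho_{k,s} = \rho_k/\bar{\rho}, \quad D_{k,s} = D_k/\bar{D}, \quad \mu_{k,s}^{0,ex} = \mu_k^{0,ex}/(k_B T) \]

we obtain

\begin{equation}
-\lambda^2 \operatorname{div}(\epsilon_r \nabla V_s) = \sum_{k=1}^{m} z_k \rho_{k,s} \tag{2.5}
\end{equation}

and

\begin{equation}
\frac{\partial \rho_{k,s}}{\partial t_s} = \operatorname{div}\left( D_{k,s} \left( \nabla \rho_{k,s} + z_k c \, \rho_{k,s} \nabla V_s + \rho_{k,s} \nabla \mu_{k,s}^{ex} + \rho_{k,s} \nabla \mu_{k,s}^{0} \right) \right), \tag{2.6}
\end{equation}

with the constants given by

\[ \lambda^2 = \frac{\epsilon_0 \bar{V}}{e_0 \bar{L}^2 \bar{\rho}}, \qquad c = \frac{e_0 \bar{V}}{k_B T} \]

and the scaled time $t_s = t \bar{D} / \bar{L}^2$ (we are going to omit the subscript $s$ and the subscript $r$ at $\epsilon$ again in the following). This kind of scaling is also commonly used in semiconductor applications ([71]). For typical ion channel quantities ($\bar{L} = 10$ nm, $\bar{V} = 100$ mV and $\bar{\rho} = 100$ mM) the scaling parameter is $\lambda^2 \approx 10^{-3}$ and can be considered a small parameter. This makes it possible, for example, to employ methods from perturbation theory to analyse the mathematical model, as was done e.g. in [37], [110] and [111].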
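The order of magnitude of these constants can be checked with a few lines of Python. The sketch below evaluates λ² and c for the typical length, voltage and density quoted above; the temperature T = 298.15 K is an assumption (it is not specified in the text), and the function name `scaling_constants` is purely illustrative.

```python
# Physical constants in SI units
EPS_0 = 8.8541878128e-12   # permittivity of free space in F/m
E_0 = 1.602176634e-19      # unit charge in C
K_B = 1.380649e-23         # Boltzmann constant in J/K
N_A = 6.02214076e23        # Avogadro constant in 1/mol


def scaling_constants(L, V, rho_mM, T):
    """Return (lambda^2, c) for a typical length L [m], voltage V [V],
    concentration rho_mM [mmol/l] and temperature T [K]."""
    rho = rho_mM * N_A     # 1 mM = 1 mol/m^3, so this is particles per m^3
    lam2 = EPS_0 * V / (E_0 * L**2 * rho)
    c = E_0 * V / (K_B * T)
    return lam2, c


# Typical values from the text; T = 298.15 K is an assumed temperature.
lam2, c = scaling_constants(L=10e-9, V=0.1, rho_mM=100.0, T=298.15)
print(f"lambda^2 = {lam2:.2e}, c = {c:.2f}")
```

With these inputs λ² comes out at roughly 9 × 10⁻⁴, in line with the estimate of about 10⁻³, while c is of order one (about 3.9).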

2.1.1 Size-exclusion effects

As was already mentioned in the last section, when modelling the transport through ion channels we have to keep in mind the restricted space available inside the selectivity filter, which is also small compared to the ion diameters. Classical Poisson-Nernst-Planck theory is based on the computation of particle densities and does not deal with the individual ion as a physical particle. Imagine the ions as little spheres that move around the baths and eventually pass through the ion channel. As the selectivity filter is a very narrow region, a few particles are already enough to crowd it (water is also included in the model as a separate uncharged species) and hence alter the behaviour of other particles subsequently trying to enter the channel. Put another way, two particles cannot occupy the same space at the same time. In very narrow channels (like K+ channels) this might even lead to single-filing, where particles cannot overtake each other while crossing the channel ([112]). However, this effect is not considered when only dealing with densities. The macroscopic modelling approach is based on the independent movement of the ions. A partial remedy is found by introducing short-range interaction terms into the model, denoted by $\mu_k^{ex}$ in the general PNP system (2.1).

These terms account for hard-sphere and short-range electrostatic interactions among the particles. Single-filing cannot be modelled by them in their current form, since for this purpose interactions like correlated movement would also have to be included. Thus we should keep in mind that the PNP approach is best used for relatively large channels like the L-type Ca2+ channel or the ryanodine receptor ([38], [42]), which have a wider selectivity filter than e.g. K+ channels.

In the following we present two different approaches to compute the local short-range interaction terms. The first one is based on density functional theory (DFT) of fluids; in the second one we make use of a local density approximation, resulting in a computationally more efficient way to determine the hard-sphere interactions. The numerical solution of the resulting equations will be more difficult though.

The short-range interaction terms can be split up into two different contributions: a local correction to electrostatic interactions (the long-range electrostatic interaction is taken into account via the mean electrostatic field, i.e. the Poisson equation (2.1a)) and volume exclusion effects (termed hard-sphere interactions). We focus on the latter in the following. Since the next two subsections will be quite technical, we give a brief summary of the general ideas beforehand: The hard-sphere interactions are based on the real physical extensions of the ions as individual particles. As PNP only deals with densities, these effects have to be artificially included via excess potentials. The general idea is that the distribution of one ion species is influenced by the presence of all other species, competing for space. To account for these effects, generalized densities are introduced that arise as convolutions of the densities with weight functions which account for the fundamental geometrical properties of the ions. The resulting convolution integrals are numerically quite time-consuming to handle. In order to reduce computational effort, in the second part we use a local expansion of the regular ion densities to approximate the convolution integrals by simple expressions reflecting the ion dimensions.
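The convolution structure of such generalized densities can be sketched in one dimension. The example below convolves a density with the indicator function of an interval of half-width R, which is only one simple choice of weight function assumed here for illustration; the weight functions of fundamental measure theory are not reproduced. The function name, the grid and the ion radius are hypothetical.

```python
import numpy as np


def weighted_density_1d(x, rho, radius):
    """Convolve rho with the indicator weight w(s) = 1 for |s| <= radius
    (0 otherwise) on a uniform grid: n(x) = integral rho(x') w(x - x') dx'."""
    dx = x[1] - x[0]
    half_width = int(round(radius / dx))
    weight = np.ones(2 * half_width + 1)
    # Trapezoidal end weights so the discrete weight integrates to 2 * radius.
    weight[0] = weight[-1] = 0.5
    return np.convolve(rho, weight, mode="same") * dx


# Hypothetical example: uniform density, ion radius 0.1 nm, grid in nm.
x = np.linspace(0.0, 5.0, 501)     # grid spacing 0.01 nm
rho = np.full_like(x, 2.0)         # particles per nm
n = weighted_density_1d(x, rho, radius=0.1)
```

For a uniform density ρ0 the interior values equal 2Rρ0, the fraction of the line occupied by particles of radius R in this toy setting. Close to the ends of the grid the values are smaller, because `np.convolve` implicitly pads the density with zeros.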

Density Functional Theory

Density functional theory (DFT) was first introduced by Walter Kohn in collaboration with Pierre Hohenberg ([54]) and Lu J. Sham ([60]) in the 1960s. About the same time Mermin ([73]) also contributed to the emerging theory of density functionals. It was a computationally feasible approach to deal with many-electron systems based on their density distribution rather than on the many-electron wavefunction ([59]). The main aim of density functional theory is to construct a functional for the excess free energy of a system as a functional of the density distribution ([96]). One approach for classical systems such as hard-sphere mixtures is based on the fundamental measure theory (FMT) introduced by Yaakov Rosenfeld and his co-workers ([95], [94]). A modified version of this was used by Gillespie et al. ([40], [41]) to include size-exclusion effects and local electrostatic interactions into the PNP equations. For a good and detailed introduction to DFT and FMT we refer the reader to [96].
The basic idea is that the grand potential (as a functional of the ion densities) can be separated into an ideal (id), hard-sphere (HS) and local electrostatic (ES) component,

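\[
F(\rho_k) = F_{\mathrm{id}}(\rho_k) + F_{\mathrm{HS}}(\rho_k) + F_{\mathrm{ES}}(\rho_k), \tag{2.7}
\]

where the ideal part is given by

\[
F_{\mathrm{id}}(\rho_k) = k_B T \sum_k \int \rho_k(x) \left[ \ln\!\left(\rho_k(x)/\rho_{\mathrm{scale}}\right) - 1 \right] dx
+ \sum_k \int \rho_k(x) \left[ z_k e_0 V(x) + \mu_k^0(x) - \mu_k(x) \right] dx.
\]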

The last integral on the right-hand side accounts for the influence of all external potentials (electrostatic ($V$) and others ($\mu_k^0$)).
The excess chemical potentials $\mu_k^{\mathrm{ex}}$ that are used to include the short-range particle interaction terms into the PNP system (see (2.1d)) are then derived as variations of the grand potential components with respect to the densities,

\[
\mu_k^{\mathrm{HS}} = \frac{\delta F_{\mathrm{HS}}(\rho_i)}{\delta \rho_k}
\quad\text{and}\quad
\mu_k^{\mathrm{ES}} = \frac{\delta F_{\mathrm{ES}}(\rho_i)}{\delta \rho_k},
\]

and the excess chemical potentials are given by

\[
\mu_k^{\mathrm{ex}} = \mu_k^{\mathrm{HS}} + \mu_k^{\mathrm{ES}}.
\]


For a more detailed description and statement of the equations we refer to [40], [82] and [95].
The above approach leads to a numerically intensive procedure that, even in the one-dimensional setup, takes up a substantial part of the computation time when solving the PNP system. Especially the ES part of the excess chemical potential requires a large computational effort. As outlined in [119], the inclusion of those terms into two- or three-dimensional simulations is an even more challenging problem.

In order to compute the HS part of the excess chemical potential, the variations of $F_{\mathrm{HS}}(\rho_i)$ with respect to the densities have to be determined. The HS part of the grand potential can be expressed via the excess free energy density $\Phi_{\mathrm{HS}}$ as ([40])

\[
F_{\mathrm{HS}}(\rho_i) = k_B T \int \Phi_{\mathrm{HS}}(n_\alpha(x'))\, dx',
\]

where the $n_\alpha$ can be interpreted as weighted or generalized densities. They are given by

\[
n_\alpha(x) = \sum_{k=1}^{m} \int \rho_k(x')\, \omega_k^{(\alpha)}(x' - x)\, dx' \tag{2.8}
\]

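for $\alpha = 0, 1, 2, 3, V1, V2$ with

\[
\begin{aligned}
\omega_k^{(2)}(r) &= \delta(|r| - R_k), & \omega_k^{(0)}(r) &= \frac{1}{4\pi R_k^2}\,\omega_k^{(2)}(r), \\
\omega_k^{(3)}(r) &= \Theta(|r| - R_k), & \omega_k^{(1)}(r) &= \frac{1}{4\pi R_k}\,\omega_k^{(2)}(r), \\
\omega_k^{(V2)}(r) &= \frac{r}{|r|}\,\delta(|r| - R_k), & \omega_k^{(V1)}(r) &= \frac{1}{4\pi R_k}\,\omega_k^{(V2)}(r).
\end{aligned}
\]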

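Here $\delta$ denotes the Dirac delta function and $\Theta$ is a unit step function with $\Theta(x) = 1$ for $x \le 0$ and $\Theta(x) = 0$ for $x > 0$. The excess free energy density $\Phi_{\mathrm{HS}}$ can be determined from the weighted densities $n_\alpha$:

\[
\Phi_{\mathrm{HS}}(n_\alpha) = -n_0 \ln(1 - n_3) + \frac{n_1 n_2 - n_{V1} \cdot n_{V2}}{1 - n_3}
+ \frac{n_2^3}{24\pi (1 - n_3)^2} \left( 1 - \frac{n_{V2} \cdot n_{V2}}{n_2^2} \right)^{3}. \tag{2.9}
\]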

Note that there are also different versions of the excess free energy density ([95], [97]). To compute $\Phi_{\mathrm{HS}}$ and subsequently its derivatives with respect to the weighted densities in order to arrive at the excess chemical potentials $\mu_k^{\mathrm{HS}}$, the integrals in (2.8) have to be evaluated. This can be done numerically using standard integration methods or analytically by assuming one of the integrand functions ($\rho_k(x')$) to be piecewise linear (the other one is a polynomial in the integration variable) ([40]). In both cases the integration has to be carried out with care, since the integration domains extend from $x - R_k$ to $x + R_k$ (in the one-dimensional setting), where $R_k$ are the radii of the different ion species, and the end points of the integration domain usually do not correspond to grid points of the discretization.
To avoid the direct computation of the integrals in (2.8), in the next section we propose another way to determine the weighted densities $n_\alpha$ based on a local density approximation of the densities $\rho_k$.
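To give a rough impression of what evaluating the integrals in (2.8) involves, the following Python sketch computes $n_2$ and $n_3$ for densities that depend on $x$ only. It is an illustration under stated assumptions, not the scheme of [40]: it assumes the planar reduction of the weight functions, in which integrating out the two transverse directions turns $\delta(|r| - R_k)$ into the constant $2\pi R_k$ and the ball indicator into $\pi(R_k^2 - s^2)$ for $|s| \le R_k$, with $s = x' - x$. The function names and the choice of composite Simpson quadrature are hypothetical.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0

def weighted_densities_planar(x, densities, radii):
    """Return (n2, n3) at position x for densities depending on x only.

    densities: list of callables rho_k(x'); radii: list of radii R_k.
    Assumes the planar reduction of the weight functions (see lead-in).
    """
    n2 = 0.0
    n3 = 0.0
    for rho, R in zip(densities, radii):
        # The integration domain [x - R, x + R] is independent of any grid,
        # so the densities are passed as functions rather than grid values.
        n2 += simpson(lambda xp: rho(xp) * 2.0 * math.pi * R, x - R, x + R)
        n3 += simpson(lambda xp: rho(xp) * math.pi * (R**2 - (xp - x)**2),
                      x - R, x + R)
    return n2, n3
```

For a uniform density the sketch returns $4\pi R_k^2 \rho_k$ and $\frac{4\pi}{3} R_k^3 \rho_k$, i.e. the surface area and volume of the ion times the density. On a real grid the densities would have to be interpolated between grid points, which is where the care mentioned above comes in.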


Local Density Approximation

The local density approximation (LDA) ([26]) used to determine the weighted densities $n_\alpha$ makes use of a local expansion of the densities $\rho_k$ around a given point $x$ and the assumption that terms that are not of leading order can be neglected.
We start by making the following basic assumptions:

• $\rho_k \in C^2$, i.e. the densities are smooth;

• the ionic radii $R_k$ are small ($R_k \ll 1$);

• $\rho_k R_k^3 = O(1)$, i.e. we have crowding of the ions in the filter;

• $\frac{\nabla \rho_k}{\rho_k} = O(1)$; $|\nabla \rho_k| R_k^3 = O(1)$;

• $\frac{D^2 \rho_k}{\rho_k} = O(1)$; $|D^2 \rho_k| R_k^3 = O(1)$

(here $D^2 \rho_k$ denotes the second derivative, or Hessian, of $\rho_k$).

The last two assumptions reflect the fact that we need to assume larger scale variations in the densities in order for the macroscopic model approach to make sense.
For $\rho_k$ we use the local expansion

\[
\rho_k(x') = \rho_k(x) + \nabla \rho_k(\xi_k(x, x')) \cdot (x' - x),
\]

which follows from the mean value theorem, or the second order expansion

\[
\rho_k(x') = \rho_k(x) + \nabla \rho_k(x) \cdot (x' - x) + \frac{1}{2} (x' - x)^T (D^2 \rho_k)(\xi_k(x, x')) (x' - x),
\]

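respectively.
Let $B_R(x)$ denote the ball of radius $R$ around $x$ and $\partial B_R(x)$ its surface. Then we can compute the weighted densities as defined in (2.8):

\[
\begin{aligned}
n_3(x) &= \sum_k \int \rho_k(x')\, \Theta(|x' - x| - R_k)\, dx' \\
&= \sum_k \int_{B_{R_k}(x)} \rho_k(x')\, dx' \\
&= \sum_k \int_{B_{R_k}(x)} \left[ \rho_k(x) + \nabla \rho_k(\xi_k(x, x')) \cdot (x' - x) \right] dx' \\
&= \frac{4\pi}{3} \sum_k \rho_k(x) R_k^3 + \sum_k \int_{B_{R_k}(x)} \nabla \rho_k(\xi_k(x, x')) \cdot (x' - x)\, dx'.
\end{aligned}
\]

The last integral is of order $O(|\nabla \rho_k| R_k^4) = O(R_k)$ and can be neglected relative to the first part, which is of order $O(1)$ according to our assumptions. Hence we get

\[
n_3(x) \approx \frac{4\pi}{3} \sum_k \rho_k(x) R_k^3. \tag{2.10}
\]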


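For $n_2$ we have

\[
\begin{aligned}
n_2(x) &= \sum_k \int \rho_k(x')\, \delta(|x' - x| - R_k)\, dx' \\
&= \sum_k \int_{\partial B_{R_k}(x)} \rho_k(x')\, d\sigma \\
&= \sum_k \int_{\partial B_{R_k}(x)} \left[ \rho_k(x) + \nabla \rho_k(x) \cdot (x' - x) + \frac{1}{2} (x' - x)^T (D^2 \rho_k)(\xi) (x' - x) \right] d\sigma \\
&= \sum_k \left( 4\pi \rho_k(x) R_k^2 + \int_{\partial B_{R_k}(x)} \nabla \rho_k(x) \cdot (x' - x)\, d\sigma + \frac{1}{2} \int_{\partial B_{R_k}(x)} (x' - x)^T (D^2 \rho_k)(\xi) (x' - x)\, d\sigma \right).
\end{aligned}
\]

Since $\int_{\partial B_{R_k}(x)} (x' - x)\, d\sigma = 0$, the second term on the right-hand side vanishes. The last term is again of order $O(|D^2 \rho_k| R_k^4) = O(R_k)$ and will be neglected, as the first term is of order $O(R_k^{-1})$. Hence we are left with

\[
n_2(x) \approx 4\pi \sum_k \rho_k(x) R_k^2. \tag{2.11}
\]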

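For $n_1$ we get

\[
\begin{aligned}
n_1(x) &= \sum_k \frac{1}{4\pi R_k} \int \rho_k(x')\, \delta(|x' - x| - R_k)\, dx' \\
&= \sum_k \frac{1}{4\pi R_k} \int_{\partial B_{R_k}(x)} \rho_k(x')\, d\sigma \\
&\approx \sum_k \left( \rho_k(x) R_k + \frac{1}{8\pi R_k} \int_{\partial B_{R_k}(x)} (x' - x)^T (D^2 \rho_k)(x) (x' - x)\, d\sigma \right) \\
&= \sum_k \left( \rho_k(x) R_k + \frac{R_k^3}{8\pi} \int_{\partial B_1(0)} \eta^T (D^2 \rho_k)(x)\, \eta\, d\sigma \right).
\end{aligned}
\]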

Note that we changed the integration domain to the unit sphere in the last step.
For $n_0$ it follows analogously that

\[
n_0(x) \approx \sum_k \left( \rho_k(x) + \frac{R_k^2}{8\pi} \int_{\partial B_1(0)} \eta^T (D^2 \rho_k)(x)\, \eta\, d\sigma \right).
\]

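Straightforward computations show that

\[
\int_{\partial B_1(0)} \eta\, \eta^T\, d\sigma = \frac{4\pi}{3}\, \mathrm{Id},
\]

with $\mathrm{Id}$ denoting the identity matrix. With this we get that

\[
\int_{\partial B_1(0)} \eta^T (D^2 \rho_k)(x)\, \eta\, d\sigma = \frac{4\pi}{3}\, \mathrm{trace}\big((D^2 \rho_k)(x)\big) = \frac{4\pi}{3}\, \Delta \rho_k(x)
\]

and hence

\[
n_1(x) \approx \sum_k \rho_k(x) R_k + \frac{1}{6} \sum_k R_k^3\, \Delta \rho_k(x) \tag{2.12}
\]

and

\[
n_0(x) \approx \sum_k \rho_k(x) + \frac{1}{6} \sum_k R_k^2\, \Delta \rho_k(x). \tag{2.13}
\]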
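For completeness, the second-moment integral over the unit sphere that enters (2.12) and (2.13) can be obtained from a short symmetry argument, sketched here in components. By rotational symmetry the off-diagonal entries vanish and all diagonal entries are equal to some constant $c$, which is fixed by taking the trace:

\[
\int_{\partial B_1(0)} \eta_i \eta_j\, d\sigma = c\, \delta_{ij},
\qquad
3c = \sum_{i=1}^{3} \int_{\partial B_1(0)} \eta_i^2\, d\sigma = \int_{\partial B_1(0)} |\eta|^2\, d\sigma = 4\pi
\quad\Rightarrow\quad c = \frac{4\pi}{3}.
\]

For any symmetric matrix $M$ this gives

\[
\int_{\partial B_1(0)} \eta^T M\, \eta\, d\sigma = \sum_{i,j} M_{ij} \int_{\partial B_1(0)} \eta_i \eta_j\, d\sigma = \frac{4\pi}{3}\, \mathrm{trace}(M),
\]

and with $M = (D^2 \rho_k)(x)$ the trace is the Laplacian $\Delta \rho_k(x)$.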

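The two remaining vector quantities $n_{V1}$ and $n_{V2}$ can be computed as follows:

\[
\begin{aligned}
n_{V2}(x) &= \sum_k \int \rho_k(x')\, \frac{x' - x}{|x' - x|}\, \delta(|x' - x| - R_k)\, dx' \\
&= \sum_k \int_{\partial B_{R_k}(x)} \rho_k(x')\, \frac{x' - x}{R_k}\, d\sigma \\
&\approx \sum_k \frac{1}{R_k} \left( \int_{\partial B_{R_k}(x)} \rho_k(x) (x' - x)\, d\sigma + \int_{\partial B_{R_k}(x)} \big( \nabla \rho_k(x) \cdot (x' - x) \big) (x' - x)\, d\sigma \right. \\
&\qquad\qquad \left. + \frac{1}{2} \int_{\partial B_{R_k}(x)} \left[ (x' - x)^T (D^2 \rho_k)(x) (x' - x) \right] (x' - x)\, d\sigma \right),
\end{aligned}
\]

where the first integral on the right-hand side vanishes again due to the fact that $\int_{\partial B_{R_k}(x)} (x' - x)\, d\sigma = 0$. The second integral can be computed to be

\[
\frac{1}{R_k} \int_{\partial B_{R_k}(x)} \big( \nabla \rho_k(x) \cdot (x' - x) \big) (x' - x)\, d\sigma = \frac{4\pi}{3}\, \nabla \rho_k(x)\, R_k^3
\]

and the last term also evaluates to zero. Hence we get

\[
n_{V2}(x) \approx \frac{4\pi}{3} \sum_k \nabla \rho_k(x)\, R_k^3. \tag{2.14}
\]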

Analogously $n_{V1}$ can be computed to be

\[
n_{V1}(x) \approx \frac{1}{3} \sum_k \nabla \rho_k(x)\, R_k^2. \tag{2.15}
\]

When inserting these approximated weighted densities into the excess free energy density (2.9), we realize that $n_1 n_2$ is of order $O(R_k^{-3})$ and $n_{V1} \cdot n_{V2}$ is of order $O(R_k^{-1})$. This suggests ignoring the term $n_{V1} \cdot n_{V2}$ relative to $n_1 n_2$. A similar conclusion can be drawn when considering the term $\frac{n_{V2} \cdot n_{V2}}{n_2^2}$. It is of order $O(R_k^2)$ and therefore negligible compared to one. This leaves us with the expression

\[
\Phi_{\mathrm{HS}}^{\mathrm{LDA}}(n_\alpha) = -n_0 \ln(1 - n_3) + \frac{n_1 n_2}{1 - n_3} + \frac{n_2^3}{24\pi (1 - n_3)^2} \tag{2.16}
\]

for the excess free energy density in the local density approximation.

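Now there are different ways to proceed in order to finally compute $\mu_k^{\mathrm{HS}}$. One possibility is to continue as in [40], just with the newly derived excess free energy density $\Phi_{\mathrm{HS}}^{\mathrm{LDA}}$. This would amount to evaluating the following integral,

\[
\mu_k^{\mathrm{ex}} = k_B T \sum_\alpha \int \frac{\partial \Phi_{\mathrm{HS}}^{\mathrm{LDA}}}{\partial n_\alpha}(x')\, \omega_k^{(\alpha)}(x - x')\, dx',
\]

again making use of the extension of the $\omega_k^{(\alpha)}$. Another possibility is to determine directly

\[
\mu_k^{\mathrm{HS}} = \frac{\delta F_{\mathrm{HS}}(\rho_i)}{\delta \rho_k} = \frac{\delta \Phi_{\mathrm{HS}}^{\mathrm{LDA}}(\rho_i)}{\delta \rho_k},
\]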

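which can be computed to be

\[
\begin{aligned}
\mu_k^{\mathrm{HS}} ={}& -\ln\Big( 1 - \frac{4\pi}{3} \sum_i \rho_i R_i^3 \Big)
+ 4\pi\, \frac{R_k \big( \sum_i \rho_i R_i^2 \big) + R_k^2 \big( \sum_i \rho_i R_i \big) + \frac{1}{3} R_k^3 \big( \sum_i \rho_i \big)}{1 - \frac{4\pi}{3} \sum_i \rho_i R_i^3} \\
&+ \frac{16\pi^2}{3}\, \frac{R_k^3 \big( \sum_i \rho_i R_i \big) \big( \sum_i \rho_i R_i^2 \big) + \frac{3}{2} R_k^2 \big( \sum_i \rho_i R_i^2 \big)^2}{\big( 1 - \frac{4\pi}{3} \sum_i \rho_i R_i^3 \big)^2} \\
&+ \frac{64\pi^3}{9}\, \frac{R_k^3 \big( \sum_i \rho_i R_i^2 \big)^3}{\big( 1 - \frac{4\pi}{3} \sum_i \rho_i R_i^3 \big)^3},
\end{aligned} \tag{2.17}
\]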

when only the leading order term of each weighted density $n_\alpha$ is considered. The advantage of this approach (in the following termed LDA) is that no integrals need to be evaluated, which saves a considerable amount of computation time.
The numerical results in the one-dimensional setup are comparable to the DFT approach. The densities for Ca$^{2+}$, Na$^+$ and Cl$^-$ shown in Figure 2.2 were used to compute the HS interaction terms on the one hand with the DFT approach and on the other hand with the LDA approach.
The resulting excess chemical potentials are shown in Figure 2.3(a) (exemplarily for Ca$^{2+}$). The blue curve is computed with the DFT approach, the red curve results from the local density approximation. The yellow rectangle indicates the filter region. We see that both results are comparable. In the baths away from the filter region, where we have uniform bulk concentrations, both methods yield the same value. Only inside the narrow filter region do we get slight deviations between the two results. The pointwise relative error $(\mu^{\mathrm{HS}}_{\mathrm{LDA}} - \mu^{\mathrm{HS}}_{\mathrm{DFT}}) / \mu^{\mathrm{HS}}_{\mathrm{DFT}}$ between the LDA and DFT approach is around 2-3% at most (see Figure 2.3(b) for Ca$^{2+}$). The relative error in the $L^2$-norm is less than one percent. The DFT approach results in broader potentials, whereas the LDA gives steeper ones.
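As an illustration of how cheap the pointwise evaluation of (2.17) is, the following Python sketch computes $\mu_k^{\mathrm{HS}}$ at a single point from the local densities and radii. It is a minimal sketch under assumptions: the function names are hypothetical, densities and radii are assumed to be given in consistent units, and the result is in the units of (2.17) as written (no additional prefactor). The helper `phi_lda` evaluates (2.16) with the leading-order weighted densities and is included only so that the formula can be checked against a numerical derivative.

```python
import math

def mu_hs_lda(k, rho, R):
    """Hard-sphere excess chemical potential of species k, formula (2.17).

    rho: list of local densities rho_i(x); R: list of ion radii R_i.
    """
    s0 = sum(rho)
    s1 = sum(r * a for r, a in zip(rho, R))
    s2 = sum(r * a**2 for r, a in zip(rho, R))
    s3 = sum(r * a**3 for r, a in zip(rho, R))
    free = 1.0 - 4.0 * math.pi / 3.0 * s3   # 1 - n3, must stay positive
    if free <= 0.0:
        raise ValueError("packing fraction n3 must be below one")
    Rk = R[k]
    t1 = -math.log(free)
    t2 = 4.0 * math.pi * (Rk * s2 + Rk**2 * s1 + Rk**3 * s0 / 3.0) / free
    t3 = (16.0 * math.pi**2 / 3.0
          * (Rk**3 * s1 * s2 + 1.5 * Rk**2 * s2**2) / free**2)
    t4 = 64.0 * math.pi**3 / 9.0 * Rk**3 * s2**3 / free**3
    return t1 + t2 + t3 + t4

def phi_lda(rho, R):
    """Excess free energy density (2.16) with leading-order weighted densities."""
    n0 = sum(rho)
    n1 = sum(r * a for r, a in zip(rho, R))
    n2 = 4.0 * math.pi * sum(r * a**2 for r, a in zip(rho, R))
    n3 = 4.0 * math.pi / 3.0 * sum(r * a**3 for r, a in zip(rho, R))
    return (-n0 * math.log(1.0 - n3) + n1 * n2 / (1.0 - n3)
            + n2**3 / (24.0 * math.pi * (1.0 - n3)**2))
```

Differentiating `phi_lda` numerically with respect to one density reproduces `mu_hs_lda` for that species, which is a convenient consistency check when transcribing the formula.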


Figure 2.2: Densities for Ca$^{2+}$, Na$^+$ and Cl$^-$ (one panel per species, plotted against $x$) used to compare DFT and LDA. The yellow rectangle indicates the filter region.

Figure 2.3: Comparison of DFT and LDA; the yellow rectangle indicates the filter region. (a) $\mu^{\mathrm{HS}}_{\mathrm{Ca}}$ for DFT and LDA: HS component of the excess chemical potential for Ca$^{2+}$ (in units of $k_B T$, plotted against $x$) for the densities shown in Figure 2.2; blue: DFT; red: LDA. (b) Relative error $(\mu^{\mathrm{HS}}_{\mathrm{LDA}} - \mu^{\mathrm{HS}}_{\mathrm{DFT}}) / \mu^{\mathrm{HS}}_{\mathrm{DFT}}$ between the LDA and DFT approach, plotted against $x$.

While the LDA version leads to a drastic reduction in computation time (from approximately 1.8 s for one DFT step to far less than 0.001 s for LDA, with MATLAB® 6.5 (The MathWorks, Inc.) on a standard Windows XP laptop with a 1.73 GHz Pentium M processor and 512 MB RAM), it is numerically highly unstable. Special methods would have to be employed in order to prevent a destabilization and a blow-up of the excess chemical potentials $\mu_k^{\mathrm{ex}}$ in successive iterations. One possibility to stabilize the computations is the use of a two-step recursion of the form

\[
y^{k+1} = (1 - \omega)\, y^k + \omega\, y^{k+1}_{\mathrm{LDA}}
\]

with the weight $\omega \in [0, 1]$ chosen small enough. Here $y^k$ stands for the $\mu^{\mathrm{HS}}$ computed in the last iteration step and $y^{k+1}_{\mathrm{LDA}}$ is the exact new value resulting from (2.17). Instead of taking this exact value for the further computations, a weighted average of the new value and the old value from the previous iteration is used.
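The damped recursion can be sketched generically as follows. This is a toy illustration, not the iteration used in the thesis: `update` is a hypothetical stand-in for the map that recomputes $\mu^{\mathrm{HS}}$ via (2.17) after a PNP solve, the unknown is a single scalar rather than a vector of grid values, and the stopping rule is an assumption.

```python
def damped_fixed_point(update, y0, omega=0.1, tol=1e-10, max_iter=10000):
    """Iterate y <- (1 - omega) * y + omega * update(y) until the step is below tol.

    update: callable returning the undamped new value (plays the role of y_LDA).
    Returns the final iterate and the number of iterations used.
    """
    y = y0
    for it in range(1, max_iter + 1):
        y_new = (1.0 - omega) * y + omega * update(y)
        if abs(y_new - y) < tol:
            return y_new, it
        y = y_new
    raise RuntimeError("damped iteration did not converge")

# Toy map with fixed point y = 1 and slope -2: the undamped iteration
# y <- update(y) diverges, the damped one with omega = 0.1 contracts
# with factor |(1 - omega) + omega * (-2)| = 0.7.
y, iterations = damped_fixed_point(lambda y: -2.0 * y + 3.0, y0=0.0, omega=0.1)
```

The example shows why a small weight helps: damping replaces the slope $s$ of the undamped map by $(1 - \omega) + \omega s$, which can be brought inside the unit interval even when $|s| > 1$, at the price of more iterations.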

Other algorithms were not tested during this thesis and remain a field of further investigation. When using the LDA in the PNP system, the resulting equations belong to the class of nonlinear diffusion equations. The numerical treatment of this type of equation is investigated in [119].


2.1.2 Derivation of the 1D-Poisson-Nernst-Planck model

The full PNP system is a three-dimensional model. Nevertheless, the channel geometry renders the ion transport an almost one-dimensional process inside the filter, since the cross-section of the filter is generally much smaller than its axial extension. Hence it seems reasonable to try to approximate the three-dimensional model by a one-dimensional one. Also from a numerical point of view a one-dimensional model is faster and easier to handle than the full three-dimensional version. Of course we should keep in mind that more detailed structure can be put into the three-dimensional model, like spatial distributions of protein charges. But the one-dimensional approximations used so far showed reasonable agreement with experimental data ([38], [42]), and hence they seem to be a viable alternative. When it comes to more detailed structures and channels where the structural information is available, the use of the three-dimensional model should be considered.

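The simplest way of reducing the three-dimensional model to one dimension would be just to take the one-dimensional versions of the Poisson and Nernst-Planck equations, which are given by (all derivatives with respect to $y$ and $z$ being zero)

\[
-\frac{d}{dx} \Big( \epsilon \frac{dV}{dx} \Big) = e_0 \sum_{k=1}^{m} z_k \rho_k \tag{2.18a}
\]

\[
\frac{\partial \rho_k}{\partial t} = \frac{d}{dx} \left( D_k \Big( \frac{d\rho_k}{dx} + \frac{z_k e_0}{k_B T} \rho_k \frac{dV}{dx} + \frac{1}{k_B T} \rho_k \frac{d\mu_k^{\mathrm{ex}}}{dx} + \frac{1}{k_B T} \rho_k \frac{d\mu_k^0}{dx} \Big) \right) \quad \forall\, k = 1, ..., m - 1. \tag{2.18b}
\]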

The problem with those equations is that they also treat the bath regions as behaving like a one-dimensional system. In fact, the baths on either side of the channel have a genuinely three-dimensional extension. The question is how to incorporate this transition from an essentially one-dimensional filter to a three-dimensional bath into an overall one-dimensional model. One idea is to begin with the full three-dimensional model and to reduce it to one dimension by including some kind of area function or shape function into the reduced model that accounts for variations in the cross-sectional area along the axial direction of the whole system. There are different ways to deduce such a reduced one-dimensional model. One version proposed by Nonner, Eisenberg and their co-workers ([40], [84]) is based on averaging the three-dimensional model over equipotential and equiconcentration surfaces, respectively, i.e. surfaces where the electrostatic potential and the ionic concentrations are constant. A detailed description of this method is given in [37]. The resulting one-dimensional version of the PNP system is

\[
-\frac{1}{A} \frac{d}{dx} \Big( \epsilon A \frac{dV}{dx} \Big) = e_0 \sum_{k=1}^{m} z_k \rho_k \tag{2.19a}
\]

\[
\frac{\partial \rho_k}{\partial t} = \frac{1}{A} \frac{d}{dx} \left( D_k A \Big( \frac{d\rho_k}{dx} + \frac{z_k e_0}{k_B T} \rho_k \frac{dV}{dx} + \frac{1}{k_B T} \rho_k \frac{d\mu_k^{\mathrm{ex}}}{dx} + \frac{1}{k_B T} \rho_k \frac{d\mu_k^0}{dx} \Big) \right) \quad \forall\, k = 1, ..., m - 1. \tag{2.19b}
\]

Here $A = A(x)$ denotes the area function that describes the surface area of the equipotential surfaces. Note that the diffusion coefficient in the equation above has also transformed into an effective one-dimensional diffusion coefficient. For the simple approximation where the filter is modelled as a cylinder with two conical atria opening into the bath on either side (see Figure 2.1), the area function can be taken as the cross-section inside the filter region and as the surface of spherical shells connecting perpendicular to the boundaries of the system in the remaining part ([40]).

The second method to derive a one-dimensional approximation is based on the assumption that the channel can be described as a rotationally symmetric domain

\[
\Omega_\epsilon = \{ (x, y, z) : 0 < x < 1,\ y^2 + z^2 < g^2(x, \epsilon) \},
\]

where $g$ is a smooth function describing the shape of the filter boundary and the parameter $\epsilon$ ($\epsilon \ll 1$) gives the maximal channel radius. The axial direction $x$ has been scaled to the interval $[0, 1]$ for convenience. The one-dimensional PNP system is then derived as the limiting system for $\epsilon \to 0$ ([67], [68]). Following this approach the 1D limiting system can be determined to be

\[
-\frac{1}{g_0^2} \frac{d}{dx} \Big( \epsilon\, g_0^2 \frac{dV}{dx} \Big) = e_0 \sum_{k=1}^{m} z_k \rho_k \tag{2.20a}
\]

\[
\frac{\partial \rho_k}{\partial t} = \frac{1}{g_0^2} \frac{d}{dx} \left( D_k\, g_0^2 \Big( \frac{d\rho_k}{dx} + \frac{z_k e_0}{k_B T} \rho_k \frac{dV}{dx} + \frac{1}{k_B T} \rho_k \frac{d\mu_k^{\mathrm{ex}}}{dx} + \frac{1}{k_B T} \rho_k \frac{d\mu_k^0}{dx} \Big) \right) \quad \forall\, k = 1, ..., m - 1, \tag{2.20b}
\]

with $g_0(x) = \frac{\partial g}{\partial \epsilon}(x, 0)$.

Identifying $A(x)$ with $g_0^2$, we see that both approaches lead to the same one-dimensional PNP system.

In the next section we propose a different approach for taking the geometrical constraints inside the filter into account. It introduces an additional potential into the Nernst-Planck equations, closely related to the entropic barriers appearing in the Fick-Jacobs approximation ([56]) and to the 1D system derived in [61]. The Fick-Jacobs approximation was first derived by Jacobs in [56] and deals with the diffusion of a Brownian particle in a narrow symmetric tube with varying cross-section $A(x)$. The Fick-Jacobs equation approximates the three-dimensional diffusion process as a one-dimensional process in the longitudinal direction (i.e. along the channel axis, $x$). The basic assumption underlying this approximation is that the equilibration in the transversal directions (i.e. $y$, $z$) is much faster than in the longitudinal direction. The original equation has been set up for purely diffusive motion without any external driving forces and a constant diffusion coefficient $D_0$. It is given by

\[
\frac{\partial C}{\partial t} = D_0 \frac{\partial}{\partial x} \left[ \frac{\partial C}{\partial x} - \frac{1}{A(x)} \frac{dA(x)}{dx}\, C \right], \tag{2.21}
\]

where $C = C(x, t) = \int_{A(x)} c(x, y, z, t)\, dy\, dz$ denotes the one-dimensional concentration and $A(x)$ gives the cross-section of the channel at $x$. The last term, $\frac{1}{A(x)} \frac{dA(x)}{dx}$, can be seen as an entropic barrier constituted by the varying geometry of the channel. Later this equation was rederived by Zwanzig [122] in a more general way. He showed that the accuracy of the Fick-Jacobs approximation is restricted to channels where the cross-section varies smoothly, $|R'(x)| < 1$, with $R(x)$ denoting the channel radius, because otherwise the assumption of local equilibrium in the transverse direction breaks down. Furthermore, by introducing a spatially dependent effective diffusion coefficient $D(x)$ that depends on the radial change $R'(x)$, Zwanzig could extend the range of validity of the Fick-Jacobs approximation. Reguera and Rubí ([90]) proposed the following effective diffusion coefficient,

\[
D(x) = \frac{D_0}{(1 + R'(x)^2)^{\alpha}},
\]

with $\alpha = 1/3$ in the two-dimensional case and $\alpha = 1/2$ in the three-dimensional case. In [13] and [91], the effect of a constant external force acting in $x$-direction along the channel axis was investigated in addition. Burada et al. ([13]) found that for large driving forces along the channel axis the Fick-Jacobs approximation loses validity, since the assumption of equilibration along the transverse direction fails as the particles tend to crowd along the axis of the channel. The influence of the entropic barriers generated by the channel geometry loses importance as the force in $x$-direction gets larger.
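The position-dependent diffusion coefficient of Reguera and Rubí quoted above is simple to evaluate. The following Python sketch is only an illustration of that formula; the function name and the way the dimension is passed are hypothetical choices.

```python
def effective_diffusion(D0, dR_dx, dim=3):
    """Effective diffusion coefficient D(x) = D0 / (1 + R'(x)^2)^alpha.

    D0: bulk diffusion coefficient; dR_dx: slope R'(x) of the channel radius;
    dim: 2 or 3, selecting alpha = 1/3 or alpha = 1/2 as stated in the text.
    """
    alpha = {2: 1.0 / 3.0, 3: 0.5}[dim]
    return D0 / (1.0 + dR_dx**2) ** alpha

# In a straight section (R' = 0) the bulk value is recovered, while a wall
# slope of R' = 1 in three dimensions reduces D by a factor 1/sqrt(2).
D_straight = effective_diffusion(1.0, 0.0)
D_sloped = effective_diffusion(1.0, 1.0, dim=3)
```

The correction only lowers the diffusion coefficient, and it does so more strongly where the channel radius changes quickly, which is where the plain Fick-Jacobs equation is least reliable.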

After this short introduction to the Fick-Jacobs approximation, we now turn to the derivation of a one-dimensional approximation of the PNP equations that leads to similar results in the Nernst-Planck equations.

Geometrical constraints via potentials

To derive a one-dimensional approximation of the PNP equations, we start with the classical PNP system, neglecting the short-range interaction terms for simplicity. For the sake of clarity we start with a two-dimensional system and derive a one-dimensional approximation, but the same analysis can also be carried out for the three-dimensional case.

The general idea of the following derivation is to include the geometry of the membrane-channel-baths system via an additional potential acting on the ion densities. This approach is motivated by the fact that using the channel geometry is actually only an idealization. Ions crossing the filter will not be constrained by a real wall blocking their way; rather, energetic effects will prevent them from approaching the protein too closely, thereby confining their movement essentially to the axial direction. Hence the use of potentials instead of the geometry can even be considered the more realistic approach.

As the available space for the ions inside the filter is small in the $y$-direction, we get terms of different orders of magnitude for the two directions $x$ and $y$ after a rescaling of the system. This separation of scales is then used to derive the one-dimensional approximation.

Let the two-dimensional system be given on a rectangular domain containing membrane, channel and baths (see Figure 2.4).

Let $\Omega_x = [L_{x-}, L_{x+}]$ denote the range of $x$ and $\Omega_y = [L_{y-}, L_{y+}]$ the range of $y$. The whole domain is then given by $\Omega = \Omega_x \times \Omega_y$. The channel itself shall be located between $y = 0$ and $y = \delta$, i.e. $\delta$ denotes the maximal channel diameter. There is no need to assume the channel to be cylindrical (of constant cross-section); rather, the cross-section can vary. Furthermore we assume $\delta$ to be small ($\delta \ll 1$) compared to the channel extension in $x$-direction.

Figure 2.4: Simplified sketch of the two-dimensional system (rectangular domain from $L_{x-}$ to $L_{x+}$ and from $L_{y-}$ to $L_{y+}$, with the bath and membrane regions labelled).

The boundaries of the system domain now do not correspond to the channel walls anymore as before; instead the domain includes part of the membrane. To get the geometrical constraints of the channel shape into the model, we introduce a confining potential $\Upsilon$ into the Nernst-Planck equations (comparable to $\mu_k^0$ in (2.1)). This confining potential can, for example, be taken as zero in the "free space" inside the channel and the baths and as a large value in the membrane regions where there is a geometrical hindrance. The two-dimensional PNP equations then read as

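\[
\partial_t \rho_i = \partial_x \Big[ D_i \Big( \partial_x \rho_i + z_i \frac{e_0}{k_B T} \rho_i\, \partial_x V + \frac{1}{k_B T} \rho_i\, \partial_x \Upsilon \Big) \Big]
+ \partial_y \Big[ D_i \Big( \partial_y \rho_i + z_i \frac{e_0}{k_B T} \rho_i\, \partial_y V + \frac{1}{k_B T} \rho_i\, \partial_y \Upsilon \Big) \Big]
\]

for the densities and

\[
-\lambda^2 \Big( \partial_x \big[ \epsilon\, \partial_x V \big] + \partial_y \big[ \epsilon\, \partial_y V \big] \Big) = \sum_{i=1}^{m} z_i \rho_i
\]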

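for the Poisson equation. The constant $\lambda$ contains physical constants and scaling parameters (see the section on scaling of the PNP system) and $\epsilon$ denotes the space-dependent dielectric coefficient. For a clearer arrangement we use the abbreviation $\partial_z = \frac{\partial}{\partial z}$ for $z = x, y, t$. For simplicity we assume the diffusion coefficients to depend solely on $x$, $D_i = D_i(x)$. The boundary conditions can be put into the form

\[
\begin{aligned}
V &= U && \text{on } \partial\Omega_x \times \Omega_y, \\
\rho_k &= \eta_k && \text{on } \partial\Omega_x \times \Omega_y, \\
\frac{\partial V}{\partial n} &= 0 && \text{on } \Omega_x \times \partial\Omega_y, \\
J_i \cdot n &= 0 && \text{on } \Omega_x \times \partial\Omega_y,
\end{aligned}
\]

with the flux $J_i$ given by $J_i = -D_i \big( \nabla \rho_i + z_i \frac{e_0}{k_B T} \rho_i \nabla V + \frac{1}{k_B T} \rho_i \nabla \Upsilon \big)$ and $n$ denoting the outward normal vector.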

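The geometrical restrictions are evidently only important inside the channel, which has a transversal extension of the order of $\delta$. Hence we introduce the rescaled coordinate $\tilde y = y/\delta$, so that derivatives with respect to the second coordinate transform as $\partial_y = \delta^{-1} \partial_{\tilde y}$. Furthermore we introduce the $\tilde y$-dependent quantities

\[
u(x, \tilde y, t) = \rho(x, \delta \tilde y, t), \qquad
\tilde V(x, \tilde y) = V(x, \delta \tilde y), \qquad
\tilde\Upsilon(x, \tilde y) = \Upsilon(x, \delta \tilde y), \qquad
\tilde\epsilon(x, \tilde y) = \epsilon(x, \delta \tilde y)
\]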


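(we omit the subscript $i$ for the individual ion types for ease of reading and writing). With these new variables the Poisson equation becomes

\[
-\lambda^2 \Big( \partial_x \big[ \tilde\epsilon\, \partial_x \tilde V \big] + \delta^{-2}\, \partial_{\tilde y} \big[ \tilde\epsilon\, \partial_{\tilde y} \tilde V \big] \Big) = \sum_{i=1}^{m} z_i u_i. \tag{2.22}
\]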

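Since $\delta \ll 1$, the leading order in this expression is given by $\delta^{-2}$. We expand $\tilde V$ and $u$ in powers of $\delta^2$,

\[
\tilde V(x, \tilde y) = V_0(x, \tilde y) + \delta^2 V_1(x, \tilde y) + ...
\]

and

\[
u(x, \tilde y, t) = u_0(x, \tilde y, t) + \delta^2 u_1(x, \tilde y, t) + ... .
\]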

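With these expansions we get from (2.22)

• leading order: $\delta^{-2}$

⇒ $\partial_{\tilde y} \big[ \tilde\epsilon\, \partial_{\tilde y} V_0 \big] = 0$

⇒ $\tilde\epsilon\, \partial_{\tilde y} V_0 = K(x)$, where $K(x)$ denotes a (possibly) $x$-dependent constant with respect to $\tilde y$;

⇒ from the homogeneous Neumann boundary condition for $\tilde V$ (and hence also for $V_0$) on the $\tilde y$-boundary we get that $\partial_{\tilde y} V_0 = 0$ for $\tilde y \in \partial\Omega_y$ and all $x \in \Omega_x$, so that $K(x) = 0$;

⇒ under the assumption that $\tilde\epsilon \neq 0$ for all $x, \tilde y$, it follows that also $\partial_{\tilde y} V_0 = 0$ for all $\tilde y \in \Omega_y$ and all $x \in \Omega_x$;

⇒ $V_0 = V_0(x)$ solely depends on $x$ and is independent of the $\tilde y$-coordinate.

• next order: $\delta^{0}$

⇒ $-\lambda^2 \Big( \partial_x \big( \tilde\epsilon\, \partial_x V_0 \big) + \partial_{\tilde y} \big( \tilde\epsilon\, \partial_{\tilde y} V_1 \big) \Big) = \sum_{i=1}^{m} z_i u_{0,i}$

⇒ $-\lambda^2 \int \partial_x \big( \tilde\epsilon\, \partial_x V_0 \big)\, d\tilde y - \lambda^2 \int \partial_{\tilde y} \big( \tilde\epsilon\, \partial_{\tilde y} V_1 \big)\, d\tilde y = \sum_{i=1}^{m} z_i \int u_{0,i}\, d\tilde y$,
where the second term on the left-hand side vanishes due to the Neumann boundary conditions imposed on $\tilde V$;

⇒ $-\lambda^2 \partial_x \big( \epsilon\, \partial_x V_0 \big) = \sum_{i=1}^{m} z_i \int u_{0,i}\, d\tilde y$,
since $V_0$ is independent of $\tilde y$ and where we have defined the effective dielectric coefficient $\epsilon(x) = \int \tilde\epsilon\, d\tilde y$.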

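Next we turn our attention to the continuity equation, which in the scaled variables and using the standard abbreviation $\beta = \frac{1}{k_B T}$ is given by

\[
\partial_t u = \partial_x \Big( D \big( \partial_x u + z e_0 \beta\, u\, \partial_x \tilde V + \beta\, u\, \partial_x \tilde\Upsilon \big) \Big)
+ \delta^{-2}\, \partial_{\tilde y} \Big( D \big( \partial_{\tilde y} u + z e_0 \beta\, u\, \partial_{\tilde y} \tilde V + \beta\, u\, \partial_{\tilde y} \tilde\Upsilon \big) \Big). \tag{2.23}
\]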

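Again we use the above expansion for $u$ and $\tilde V$ and get from (2.23):

• leading order: $\delta^{-2}$

⇒ $\partial_{\tilde y} \Big( D \big( \partial_{\tilde y} u_0 + z e_0 \beta\, u_0\, \partial_{\tilde y} V_0 + \beta\, u_0\, \partial_{\tilde y} \tilde\Upsilon \big) \Big) = 0$

⇒ $D \big( \partial_{\tilde y} u_0 + z e_0 \beta\, u_0\, \partial_{\tilde y} V_0 + \beta\, u_0\, \partial_{\tilde y} \tilde\Upsilon \big) = K(x)$, where $K(x)$ denotes a (possibly) $x$-dependent constant with respect to $\tilde y$;

⇒ from the no-flux condition on the $\tilde y$-boundary we get that
$D \big( \partial_{\tilde y} u_0 + z e_0 \beta\, u_0\, \partial_{\tilde y} V_0 + \beta\, u_0\, \partial_{\tilde y} \tilde\Upsilon \big) = 0$
for all $x \in \Omega_x$ and all $\tilde y \in \Omega_y$;

⇒ $u_0(x, \tilde y, t) = v(x, t)\, e^{-\beta \tilde\Upsilon(x, \tilde y)}$,
where we have used the fact shown above that $V_0$ is independent of $\tilde y$ and the assumption that $D(x) \neq 0$ for all $x$; $v$ is a prefactor depending solely on $x$ and $t$.

• next order: $\delta^{0}$

⇒ $\partial_t u_0 = \partial_x \Big( D \big( \partial_x u_0 + z e_0 \beta\, u_0\, \partial_x V_0 + \beta\, u_0\, \partial_x \tilde\Upsilon \big) \Big) + \partial_{\tilde y} \Big( D \big( \partial_{\tilde y} u_1 + z e_0 \beta\, u_0\, \partial_{\tilde y} V_1 + \beta\, u_1\, \partial_{\tilde y} \tilde\Upsilon \big) \Big)$,
where we have used the fact that $\partial_{\tilde y} V_0 = 0$;

⇒ $e^{-\beta \tilde\Upsilon}\, \partial_t v = \partial_x \Big( D \big( e^{-\beta \tilde\Upsilon}\, \partial_x v + z e_0 \beta\, e^{-\beta \tilde\Upsilon}\, v\, \partial_x V_0 \big) \Big) + \partial_{\tilde y} \Big( D \big( \partial_{\tilde y} u_1 + z e_0 \beta\, u_0\, \partial_{\tilde y} V_1 + \beta\, u_1\, \partial_{\tilde y} \tilde\Upsilon \big) \Big)$,
where we have inserted the above expression $u_0 = v\, e^{-\beta \tilde\Upsilon}$;

⇒ under integration with respect to $\tilde y$ the last term on the right-hand side vanishes again due to the no-flux boundary conditions and we are left with
$a(x)\, \partial_t v = \partial_x \Big( D \big( a(x)\, \partial_x v + z e_0 \beta\, a(x)\, v\, \partial_x V_0 \big) \Big)$,
where $a(x) = \int e^{-\beta \tilde\Upsilon}\, d\tilde y$.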

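Now we have the one-dimensional quantities (functions of $x$ only, to be distinguished from the two-dimensional rescaled quantities marked by a tilde)

\[
\begin{aligned}
\rho(x, t) &= \int u_0(x, \tilde y, t)\, d\tilde y = a(x)\, v(x, t), \\
V(x) &= V_0(x), \\
\Upsilon(x) &= -k_B T \ln(a(x)) = -k_B T \ln\Big( \int e^{-\beta \tilde\Upsilon}\, d\tilde y \Big), \\
\epsilon(x) &= \int \tilde\epsilon\, d\tilde y,
\end{aligned}
\]

and finally end up with the following one-dimensional approximation of the PNP system. The Poisson equation is given by

\[
-\lambda^2 \partial_x \big( \epsilon\, \partial_x V \big) = \sum_{i=1}^{m} z_i \rho_i \tag{2.24}
\]

and the continuity equations are given by

\[
\partial_t \rho_i = \partial_x \Big( D \Big( \partial_x \rho_i + z_i \frac{e_0}{k_B T} \rho_i\, \partial_x V + \frac{1}{k_B T} \rho_i\, \partial_x \Upsilon \Big) \Big). \tag{2.25}
\]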

The last term in the continuity equation corresponds to the entropic part in the Fick-Jacobs approximation (2.21) (see also [91]), as it can be recast into

\[
\frac{1}{k_B T} \rho_i\, \partial_x \Upsilon = \frac{1}{k_B T} \rho_i\, \partial_x \big( -k_B T \ln(a(x)) \big) = -\frac{1}{a(x)} \frac{da(x)}{dx}\, \rho_i.
\]
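To make the link between the confining potential and the effective one-dimensional potential concrete, the following Python sketch evaluates $a(x) = \int e^{-\beta \Upsilon}\, dy$ and $-k_B T \ln a(x)$ numerically at one axial position. It is a sketch under assumptions: the function name, the midpoint quadrature and the example potential (zero inside a slit of width 0.4, and $50\, k_B T$ in the membrane) are hypothetical choices for illustration, and energies are measured in units of $k_B T$.

```python
import math

def effective_potential(confining, x, y_min, y_max, kBT=1.0, n=2000):
    """Return (a, U) with a = integral of exp(-confining(x, y) / kBT) over y
    and U = -kBT * ln(a), using the midpoint rule with n cells."""
    h = (y_max - y_min) / n
    a = 0.0
    for j in range(n):
        y = y_min + (j + 0.5) * h
        a += math.exp(-confining(x, y) / kBT) * h
    return a, -kBT * math.log(a)

def slit_potential(x, y, width=0.4, barrier=50.0):
    """Hypothetical confining potential: free inside the slit, high outside."""
    return 0.0 if abs(y) < width / 2.0 else barrier

a, U = effective_potential(slit_potential, x=0.0, y_min=-1.0, y_max=1.0)
```

For such a steep potential $a(x)$ is essentially the local channel width, so the effective potential reduces to $-k_B T \ln(\text{width})$ and its derivative reproduces the Fick-Jacobs term $-\frac{1}{a} \frac{da}{dx}$. Softer confining potentials give a smooth effective width instead of a hard wall.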

2.1.3 Analysis of the 1D-Poisson-Nernst-Planck model

When dealing with systems of partial differential equations, natural questions to ask from the mathematical point of view concern the existence and uniqueness of a solution to the system. (From the physical point of view a solution to the underlying physical system naturally exists, since the experiment "has an outcome".) In order to be able to make statements about this, the functions (e.g. the ion densities and the electrostatic potential) and parameters (e.g. the diffusion coefficients and dielectric coefficient as well as the other potentials) are usually taken to be in certain (rather abstract) mathematical spaces. These spaces generally impose special properties on the functions and parameters, like boundedness and smoothness, that are necessary to prove the existence of a solution. Over the past decades, the analysis of classical PNP systems has been a major subject of research, especially in the context of semiconductor devices (e.g. [72], [71], [105]). Results cover the stationary steady state, i.e. $\frac{\partial \rho}{\partial t} = 0$, as well as the transient system ([33]). In this section we recall the major results concerning existence and uniqueness of solutions to the PNP equations without giving the proofs. For more details we refer to the literature.
Under certain conditions on the domain $\Omega$, the boundary data and the potentials $\mu_k^{\mathrm{ex}}$, $\mu_k^0$, one can formulate the following result:

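Theorem 2.1. The stationary PNP system has a solution $(V, \rho_1, ..., \rho_m) \in H^1(\Omega)^{m+1} \cap L^\infty(\Omega)^{m+1}$.

Here the function spaces are defined as

\[
L^2(\Omega) := \Big\{ u \;\Big|\; \int_\Omega u^2\, dx < \infty \Big\},
\]

\[
H^1(\Omega) := \big\{ u \in L^2(\Omega) \;\big|\; \nabla u \in L^2(\Omega) \big\}
\]

and

\[
L^\infty(\Omega) := \big\{ u \;\big|\; \operatorname*{ess\,sup}_{x \in \Omega} |u(x)| < \infty \big\}.
\]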

A list of the necessary assumptions can be found in [119]. For the transient system also uniqueness of a solution can be shown ([119]).

Note that standard results for classical drift-diffusion systems ([72], [71], [105]) usually do not include the excess free energy terms $\mu_k^{\mathrm{ex}}$ that are included in our PNP system (2.1). In [15] the existence of a locally unique solution to the system (2.1), (2.2), (2.3), and (2.4) including the excess terms is shown for small bath concentrations:


Theorem 2.2. Let $\|\eta_k\|_{H^{1/2}(\Gamma_D)}$ and $\|\eta_k\|_{L^\infty(\Gamma_D)}$ be sufficiently small. Then, for each $U \in H^{1/2}(\Gamma_D)$ there exists a locally unique solution $(V, \rho_1, ..., \rho_m) \in H^1(\Omega)^{m+1} \cap L^\infty(\Omega)^{m+1}$ of the above system.

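The space $H^{1/2}(\Gamma)$, $\Gamma$ being of dimension $d - 1$, can be defined as the closure

\[
H^{1/2}(\Gamma) := \overline{C^\infty(\Gamma)}^{\,\|\cdot\|_{H^{1/2}}}
\]

with $C^\infty$ denoting the space of infinitely often continuously differentiable functions and the norm induced by the inner product

\[
(u, v)_{H^{1/2}(\Gamma)} := \int_\Gamma \int_\Gamma \frac{(u(x) - u(y))(v(x) - v(y))}{|x - y|^d}\, ds_x\, ds_y + \int_\Gamma u v\, ds_x.
\]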

Apart from existence and uniqueness issues, the qualitative behaviour of solutions has also been studied for the classical PNP system ([86]). A prominent approach in this respect is the use of singular perturbation techniques ([111]). Scaling of the PNP equations leads to a (generally) small parameter $\lambda$ appearing in the Poisson equation in front of the Laplace operator (see (2.5)). The basic idea of singular perturbation is to consider two solutions on different time scales (fast time scale for small parameter and slow time scale for the parameter tending to zero) and match them afterwards ([37], [110] and [111]).

2.2 Other modelling approaches

As was already mentioned in the beginning of this chapter, PNP is by far not the only approach to model the conduction of ions through membrane channels. In this section we introduce other well-known methods that are considered in the ion channel context. Perhaps the most prominent approach is Molecular Dynamics simulation, based on the atomic details of a system. Monte Carlo methods are also frequently used for computational simulations. A good introduction to these two methods can be found e.g. in [65].

Molecular Dynamics

Molecular Dynamics (MD) is a method that is based on the atomistic details of a system. In contrast to PNP, where densities and mean fields are used, MD takes each individual particle in the system into account. It can thus be seen as the most detailed, but also as the most complex method for simulation.

The basic idea of MD is that every particle in the system behaves according to Newton's laws of motion. It is thus a deterministic approach, since for a given initial configuration the evolution of the system is determined by Newton's law. The aim is to determine the time evolution of the system, i.e. the positions and velocities of the particles as they change with time. Hence an MD simulation consists of computing the trajectories of every particle in the system, which is done by integrating Newton's second law, given by

\[
\frac{d^2 x}{dt^2} = \frac{F}{m}.
\]


Here $x$ is the coordinate of the particle, $m$ its mass and $F$ is the force acting on the particle. One major issue in MD is to appropriately design this force field $F$. Ideally, all interactions between all the particles in the system should be included, leading to a strongly coupled system of equations, since the movement of one particle will influence all the other particles and vice versa.

The first MD simulations based on simple models were carried out as early as 1957 by Alder and Wainwright ([4]). Since then, far more accurate and complex force fields have been developed. In the commonly used programs for MD simulations some standard force fields for different situations are usually implemented and the user can choose which one to take. Prominent force fields are e.g. AMBER, CHARMM and GROMOS/GROMACS, which differ in certain parameter settings (see also http://ambermd.org, http://www.charmm.org and http://www.gromacs.org).

When integrating the equations of motion to compute the trajectories, special care has to be taken with respect to the time steps used. If the time steps are too small, the phase space (i.e. the space of all possible states of the system, in this case positions and momenta of the particles) is covered too slowly. Time steps that are too large lead to instabilities, since two particles that would otherwise have collided smoothly might already have crashed into each other during the long time step. See Figure 2.5 for an illustration.

Figure 2.5: Effects of different time steps in MD simulations (figure reproduced from [65]); left: time step too small; middle: time step too large; right: adequate time step

There is no fixed rule for the choice of the time steps. In his book Molecular Modelling ([65]) Leach suggests time steps ranging from $10^{-14}$ s for atom systems and translational motion to $5 \times 10^{-16}$ s for flexible molecules with flexible bonds, when translation, rotation, torsion and vibration are present. From these numbers we already see that it is difficult to simulate the ion transport through membrane channels using MD. To generate a reliable flux, time spans of micro- to milliseconds have to be simulated, while the range covered by MD simulations is currently on the time scale of nanoseconds. Nevertheless, with certain simplifications like potential of mean force approximations of the solvent, MD might be used to get an idea of the detailed picture, e.g. of the protein residues lining the filter wall. However, setting up the force field required for the simulation can pose several problems, especially in the case of synthetic nanopores, since the conventional force fields for simulating biomolecular systems do not include synthetic materials ([2]).
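The integration of Newton's second law and the role of the time step discussed above can be illustrated by a minimal sketch. It advances a single particle in one dimension with the velocity Verlet scheme. The scheme itself, the harmonic test force and all names (`velocity_verlet`, `harmonic_force`, `total_energy`) are assumptions chosen for illustration; they are not taken from this work or from any particular MD package, and the units are dimensionless.

```python
def velocity_verlet(x0, v0, force, mass, dt, n_steps):
    """Integrate d^2x/dt^2 = F(x)/m for one particle in one dimension."""
    x, v = x0, v0
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Hypothetical test force field: a linear spring, F = -k x."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Same initial state, two different time steps.
stable = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, dt=0.01, n_steps=1000)
unstable = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, dt=2.5, n_steps=50)

energy_stable = total_energy(*stable[-1])
energy_unstable = total_energy(*unstable[-1])
print(energy_stable, energy_unstable)
```

With the small time step the total energy stays close to its initial value of 0.5, whereas the large time step makes the trajectory blow up. This toy instability is only an analogue of the effect described above, not a simulation of colliding particles.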

Models where the explicit solvent is replaced by some approximate model, introducing random forces into the system, are also known as Stochastic or Brownian Dynamics. There Newton’s law of motion is replaced by the Langevin equation that includes friction due to the solvent and some random fluctuations.

Another stochastic approach to determine the possible configurations of a system is given by the so-called Monte Carlo methods, which we are going to introduce in the next paragraph.

Monte Carlo methods

Monte Carlo (MC) methods allow the simulation of more complex systems including lots of atoms in reasonable computation time. While the MD simulations described above are a deterministic approach, i.e. the evolution of the system is known for a fixed initial configuration (and hence several computations starting with the same initial configuration should yield the same output), MC methods are based on stochastics. Good introductions are given e.g. in [6] and [32].

Imagine a system consisting of $N$ different particles whose coordinates are denoted by $r^N$. The probability density of finding the system in configuration $r^N$ is given by

$$\frac{\exp(-\beta\, U(r^N))}{\int \exp(-\beta\, U(x^N))\, dx^N},$$

where the integral is over the whole configuration space. $U$ in the Boltzmann factor $\exp(-\beta\, U(r^N))$ denotes the energy of the system and $\beta = 1/(k_B T)$.

The main purpose of MC methods is to determine equilibrium properties of the system, e.g. the equilibrium distribution of ions inside the channel. In order to do this, random changes to the system configuration are introduced in such a way that the probability of visiting a particular point $r^N$ of the configuration space (also termed phase space) is proportional to the Boltzmann factor $\exp(-\beta\, U(r^N))$ ([32]). This procedure is also known as importance sampling. In contrast, in random sampling the configuration points are distributed randomly over the whole phase space. Thus most of the computational effort is spent on points where the Boltzmann factor is negligible, which is not the region of interest. Importance sampling, on the other hand, focuses on regions of the phase space that make an important contribution to the ensemble average ([6]). One way to achieve such a behaviour was introduced by Metropolis et al. in [74], and the resulting method is hence called the Metropolis MC method. It is nowadays widely used in MC simulations. The method can be summarized by the following steps (see also [32]):

1) Given a configuration $r^N$ of the system, compute the energy $U(r^N)$;

2) select a particle at random and give it a random displacement $r' = r + \Delta$;

3) compute the resulting energy $U(r'^N)$;

4) accept the move from $r^N$ to $r'^N$ with probability

$$p_{acc} = \min\left\{1, \exp\left(-\beta\, [U(r'^N) - U(r^N)]\right)\right\}.$$

The last step implies that a move is always accepted if it does not lead to an increase in the energy (i.e. $U(r'^N) \le U(r^N)$), since then $p_{acc} = 1$. If the energy is increased by the move, i.e. $U(r'^N) > U(r^N)$, the usual procedure to determine whether the step is accepted goes as follows: a random number from a uniform distribution in the interval $[0, 1]$ is generated. If this random number is less than $p_{acc}$ the move is accepted, otherwise the move is rejected and the old configuration $r^N$ is kept for the next step.

The advantage of this MC method is that it allows the system to get out of local energy minima again. Also “unphysical” moves can be performed in MC simulations that would not be possible in MD simulations, thereby speeding up the sampling of the phase space. In order to get an efficient algorithm the displacement $\Delta$ has to be chosen appropriately. If it is chosen too large, the moves are rejected too often and little movement through the phase space is achieved. On the other hand, if the displacement is chosen too small, the acceptance rate for the moves is high but the phase space is still explored slowly.
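The four steps of the Metropolis method and the influence of the displacement can be sketched for a single particle in one dimension. This is a minimal illustration under stated assumptions: the harmonic energy $U(x) = x^2/2$, the uniform displacement in `[-max_disp, max_disp]`, the fixed seed and all function names are hypothetical choices and not part of the simulations referred to in this work.

```python
import math
import random


def metropolis(energy, x0, beta, max_disp, n_steps, rng):
    """Sample exp(-beta * U(x)) for one particle in 1D with Metropolis moves."""
    x = x0
    u = energy(x)                                      # step 1: energy of current state
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_disp, max_disp)   # step 2: random displacement
        u_new = energy(x_new)                          # step 3: energy of trial state
        delta_u = u_new - u
        # step 4: downhill moves are always accepted, uphill moves
        # with probability exp(-beta * delta_u)
        if delta_u <= 0.0 or rng.random() < math.exp(-beta * delta_u):
            x, u = x_new, u_new
            n_accepted += 1
        samples.append(x)                              # a rejected move repeats the old state
    return samples, n_accepted / n_steps


def harmonic_energy(x):
    """Hypothetical test energy U(x) = x^2 / 2 (unit variance at beta = 1)."""
    return 0.5 * x * x


rng = random.Random(0)
samples, acc = metropolis(harmonic_energy, 0.0, 1.0, 2.0, 50000, rng)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)

# Acceptance rates for a very small and a very large displacement.
_, acc_small = metropolis(harmonic_energy, 0.0, 1.0, 0.1, 5000, rng)
_, acc_large = metropolis(harmonic_energy, 0.0, 1.0, 10.0, 5000, rng)
print(mean, variance, acc_small, acc_large)
```

For this test energy the sampled mean and variance should be close to 0 and 1. The two extra runs show the trade-off described above: a tiny displacement is almost always accepted but moves the particle very little, while a large one is rejected most of the time.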

Barrier models

The next modelling approach we are going to introduce might be summarized under the name barrier models or, a bit more colloquially, hopping models. The underlying idea of these particle-based models is that the channel can be divided into a series of distinct binding sites, where each site can only be occupied by one particle (see Figure 2.6). The individual ions can then hop along the channel from one binding site to the next. For a more detailed introduction we refer the reader to [47].

Figure 2.6: Sketch of channel system for barrier model (hopping rates shown: $k_{out,1}$, $k_{1,out}$, $k_{i,i+1}$, $k_{i+1,i}$, $k_{N,out}$, $k_{out,N}$).

Equations describing either the probability of finding a particle at a certain binding site ([119]) or the probability of finding the channel in a certain occupation state ([20], [114]) can be set up; they depend on the hopping rates of the ions. A detailed derivation and statement of the equations can also be found in [18] and [19]. One standard approach for defining these hopping rates is to use Arrhenius type rate constants,

$$k = k_0\, e^{-\frac{\Delta U}{k_B T}},$$

where $\Delta U$ denotes the energy barrier a particle has to cross to jump to the next binding site. This ansatz for the hopping rates also explains the name barrier models for this approach. The prefactor $k_0$ has been a subject of many discussions over the past decades. The “traditional” prefactor used in barrier models, $k_B T/h_P$, where $h_P$ denotes Planck’s constant, is not species-dependent and has been considered inadequate in [17], since in a situation with no energy barriers present, all ion species would have the same transition rates. Instead, Chen and his co-workers ([17]) derive an alternative prefactor depending on the diffusion coefficient and the potential landscape seen by the ions. In the case of high barriers it reads as

$$\frac{2D}{d^2 \sqrt{\pi}} \sqrt{\left| \frac{\Delta U}{k_B T} \right|},$$

$D$ being the diffusion coefficient and $d$ the width of the energy barrier. A more general version of the transition rates (without the assumption of high barriers) can be given as

$$k = \frac{D}{d^2}\, \frac{\exp(V_{appl}/k_B T)}{\frac{1}{d} \int_0^d \exp(\Delta U/k_B T)\, dx},$$

$V_{appl}$ denoting the applied external potential.
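To get a feeling for how strongly the Arrhenius type rate depends on the barrier height, the following sketch evaluates $k = k_0 \exp(-\Delta U/k_B T)$ with the high-barrier prefactor quoted above. The function names and the numerical values of `D` and `d` are hypothetical and serve only as an illustration; the barrier is passed in units of $k_B T$.

```python
import math


def arrhenius_rate(k0, barrier_over_kT):
    """Arrhenius type rate k = k0 * exp(-dU / (kB T))."""
    return k0 * math.exp(-barrier_over_kT)


def high_barrier_prefactor(D, d, barrier_over_kT):
    """Prefactor 2 D / (d^2 sqrt(pi)) * sqrt(|dU / (kB T)|) for high barriers."""
    return 2.0 * D / (d * d * math.sqrt(math.pi)) * math.sqrt(abs(barrier_over_kT))


# Hypothetical values: diffusion coefficient in m^2/s, barrier width in m.
D = 1.0e-9
d = 1.0e-9

for barrier in (2.0, 5.0, 10.0):
    k0 = high_barrier_prefactor(D, d, barrier)
    print(barrier, arrhenius_rate(k0, barrier))
```

Although the prefactor grows with the square root of the barrier height, the exponential factor dominates, so the rate drops by orders of magnitude between the smallest and the largest barrier in the loop.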

Others

Besides PNP there are also other continuum approaches to model ion transport through channels, e.g. Poisson-Boltzmann (PB) models. In PB theory, the Poisson equation for the electrostatic potential $V$ is supplemented with the assumption that (at equilibrium) the distribution of mobile ions can be approximated by the Boltzmann factor,

$$\rho_i(x) = z_i e_0 \eta_i\, e^{-z_i \frac{e_0 V(x)}{k_B T}},$$

with $\eta_i$ denoting the bulk concentration of ion species $i$. This results in the PB equation

$$-\lambda^2\, \nabla \cdot (\epsilon \nabla V) = \sum_i z_i e_0 \eta_i\, e^{-z_i \frac{e_0 V(x)}{k_B T}} + \rho_{fix},$$

where $\rho_{fix}$ stands for some fixed charge distribution in the system. For small potentials $V$ it is common to linearize this equation, and one then speaks of linearized Poisson-Boltzmann theory. A comparison of PB with Brownian Dynamics simulations in [75] showed that the PB approximation is inappropriate in the narrow channel geometry.
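The linearization mentioned above can be written out as a short worked step. It is a sketch obtained by a first-order Taylor expansion of the exponential for $|z_i e_0 V| \ll k_B T$ and is not a formula quoted from the references:

```latex
e^{-z_i \frac{e_0 V}{k_B T}} \approx 1 - \frac{z_i e_0 V}{k_B T}
\quad\Longrightarrow\quad
-\lambda^2\, \nabla \cdot (\epsilon \nabla V)
\approx \sum_i z_i e_0 \eta_i
- \Big( \sum_i \frac{z_i^2 e_0^2 \eta_i}{k_B T} \Big) V
+ \rho_{fix}.
```

Under the additional assumption of an electroneutral bulk, $\sum_i z_i \eta_i = 0$, the constant term vanishes and the right-hand side becomes linear in $V$ up to the fixed charge $\rho_{fix}$.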

2.3 Evaluation of the different models

After introducing several different modelling approaches to the subject of ion transport across membrane channels, in this section we want to discuss the advantages and disadvantages of the different models (see also [83], [99] for an overview). At first glance the Molecular Dynamics approach might seem to be the most attractive one, since all atomistic detail can in principle be put into this model. However, as was already mentioned in the last section, in MD simulations all the particles in the system, including solvent molecules like water, have to be modelled explicitly. This leads to a huge number of particles for which Newton’s equations of motion have to be integrated. As this integration can only be performed with a small time step, the time range covered by MD simulations is currently too short to simulate the transport of several ions across a channel. Furthermore, as the simulation time increases, numerical errors from the integration process might start to accumulate and falsify the final result ([65]). Another problem with MD is the question of how to correctly simulate different bath concentrations of ions. To simulate physically relevant concentrations of e.g. 100 nM intracellular Ca$^{2+}$, vast numbers of water particles would have to be simulated. Besides, in current applications using standard software only periodic boundary conditions can be employed, which are inadequate for the general ion channel setup. The standard force fields used in MD simulations, like CHARMM and AMBER, are mainly based on Lennard-Jones interactions and Coulombic electrostatic interactions with fixed atomic partial charges. No polarization effects are included that might play an important role in ion channel selectivity on the microscopic level. Therefore, in recent years especially the group around Benoit Roux has started to develop force fields that incorporate polarization effects ([45], [63], [64]). Generally, the setup of an appropriate force field is one of the major difficulties in MD simulations ([2]).

Due to all these shortcomings of MD with respect to ion channel transport, it is probably not the best model to choose in this respect. Nevertheless, because of the great microscopic detail that can be incorporated, MD is a valuable approach to study parts of the process and of the filter structure, especially for channels where the structure is known to a large extent (e.g. the KcsA potassium channel) and hence the detailed information needed to set up the model is available.

The Monte Carlo approach performs better with respect to ion channel transport. It can be regarded as a coarse-grained particle approach that includes less detail than MD models, but nevertheless retains the particle nature of the ions of interest. In this approach solvent, protein and membrane are usually treated as continuum dielectrics and only the mobile ions are treated as real particles. This drastically reduces the computational effort and allows longer simulation times. But since Monte Carlo methods are a stochastic approach, the outcome of every simulation will be different and a sufficiently large number of trajectories has to be computed in order to get a reliable average. For applications where a model output has to be computed over and over again, such as in the area of inverse problems that we are going to address in the next chapter, MC simulations are inefficient. For such a situation macroscopic continuum models, like PNP, are in fact the best choice, since with efficient algorithms they can be solved fast and only one solution has to be determined, i.e. no ensemble of possible outcomes has to be computed first in order to arrive at the average.

But although continuum models are the most attractive approach from a computational point of view, we also have to consider their physical validity and restrictions. As they represent the most coarse-grained version of the above models, they incorporate the least atomistic detail. As an example, classical PNP is merely based on long-range electrostatic interactions and uses a mean field approach. This means that not the actual “true” trajectories of ions are computed, as would be the case in an MD simulation, but instead an average distribution is determined. Due to the lack of atomistic details in the continuum models, direct ion-protein interactions, for example, cannot be investigated in detail. Instead, the continuum models are based on the assumption that the major effects in ion transport can be reproduced by considering the right basic interactions, without incorporating the structural details. As a consequence, the models can help to understand the basic functions and illustrate fundamental principles of ion permeation ([99]), but they will not give detailed information about exact positions and charges of e.g. filter residues or about the molecular basis of certain diffusion coefficients. Their great advantage lies in their computational feasibility.

Another critical point when dealing with continuum models is their validity in the really narrow filter geometry. Does it make sense to talk of densities when two ions are basically enough to crowd the filter? And can one simply ignore the particle nature of ions in such a confined geometry? It is a fact that certain effects only appear due to the physical presence of the ions, like crowding effects and single-filing. Hence macroscopic models can be improved by trying to include such effects, at least to some extent, into the model. This has for example been done by including volume-exclusion effects and short-range electrostatic interactions into the PNP equations by means of excess chemical potentials (see sections above). For other situations one has to keep in mind that macroscopic continuum models might just be inadequate, due to the lack of structural detail, for example when it comes to single-filing. Additional terms would have to be included and the models would have to be refined in order to match such situations.

Nevertheless, PNP has been shown to reproduce experimental data in several channels fairly well, as for example in the porin OmpF ([115]) and in the Ryanodine receptor RyR ([38], [42]).

Probably the most disputed and discussed model class over the decades are the barrier models (see e.g. [83] and references therein). The critical point is the assignment of transition rates. They should in some way depend on the energy landscape in the system. What is frequently done in the classical barrier models is that a fixed energy landscape $U$ is taken and the transition rates are assumed to be of Arrhenius type (see above). The problem with this approach is that the energy landscape seen by the ions is not fixed, but depends on the positions of other ions in the system (ion-ion interaction). Hence the barrier seen by an ion trying to enter the channel will be different when the second channel site is occupied by an ion of like charge as compared to a solely water-filled channel. Thus the energy landscape should depend on the occupation state of the channel and hence should also change over time as the state of the channel changes. In other words, the energy landscape needs to be computed self-consistently and should not be taken as a fixed input parameter of the model. The following terms could for example be included in the computation of the field acting on an individual ion at a specific site:

• Electrostatic forces due to an external applied voltage,

• electrostatic interaction between the ion and the (fixed) protein charges,

• electrostatic interaction between the ion and other mobile ions in the system,

• volume exclusion effects between particles,

• “hydration”: an ion in the filter is in an energetically more favourable state if the protein charges tend to “solvate” it, mimicking the water hydration shell.

These computations would have to be carried out for all possible occupancies of the channel.

Another critical point in the barrier models is the choice of an appropriate prefactor, which was already mentioned above. The prefactor might be taken as a fitting parameter in order to adjust the model output to data, but then the physical meaning of this parameter becomes unclear. The traditional prefactor $k_B T/h_P$ can be regarded as inappropriate ([48]).

Furthermore, for the hopping models one has to decide under which circumstances an ion might jump to the next binding site. Can an ion move only if the neighbouring site is not occupied by another ion, or can an ion push the neighbouring particle away from the site? This question relates to the process of concerted movement inside ion channels, since in reality the channel will never be empty, but solvent molecules like water will fill the gaps between charged particles.

Nevertheless, barrier models might be useful in determining whether a particular mechanism can account for a particular phenomenon, e.g. a single-site channel compared to a multi-ion channel ([10]).

When deciding on the type of model to address ion transport, one also has to define the intention of the modelling and which prerequisites are given. If we are looking for some detailed structural information (e.g. the exact position of a charged side chain), a macroscopic model might not yield the desired information. On the other hand, if little is known about the structure of the channel under consideration, a detailed atomistic approach does not make sense, since the information required to set up the model is not available.

From the above considerations we see that each modelling approach has its benefits and shortcomings. As a conclusion we could say that a combination of the above models would be best to account for ion channel transport: a detailed, particle-based approach describing the situation inside the constrained region of the channel filter, where a continuum model has its limitations, coupled to a coarse-grained continuum model describing the situation in the baths. This would allow detailed simulations inside the filter while at the same time accounting for sufficiently large baths and for different ion concentrations in the baths, without drastically increasing the computation time (as would be the case with an all-particle-based model). Coupling a discrete model to a continuum model also poses interesting mathematical questions, for example how to define the boundary conditions at the crossover and how to transfer macroscopic quantities into parameters suitable for atomistic description and vice versa. This will be a field of intensive research for the future.

For now, we are going to turn our attention again to the PNP approach. As we will address questions dealing with the inverse identification of parameters in the next chapter, a continuum approach is the best currently available, due to the relatively short computation times.


Chapter 3

Inverse problems with Poisson-Nernst-Planck

In this chapter we are going to use the one-dimensional Poisson-Nernst-Planck model we discussed in the previous chapter to study questions from the field of inverse problems. We will start by giving a general introduction to the topic of inverse problems, pointing out their two major classes, namely parameter identification and design problems. In the subsequent sections we are first going to investigate parameter identification with the full forward model, but the main focus will be on the use of surrogate models for the solution of the inverse problem. This approach is motivated by the relatively high numerical complexity of solving the full PNP system including the short-range interaction terms. Faster and easier-to-solve surrogate models can improve the performance in this respect.

3.1 Inverse problems - basic setup

The mathematical field of inverse problems has gained tremendous importance over the past three decades. The needs of various applications in other scientific disciplines and industry led to a fast development of the mathematical tools related to inverse problems. Good introductions to the topic can be found e.g. in [27], [44], [69] and many others. The notion of inverse problems naturally implies that there is also something like a direct problem involved. These two problems are closely related, and while the direct problem in physical applications usually describes the future development of a system from a known present state (including all parameters and state variables), inverse problems are “concerned with determining causes for a desired or an observed effect” ([27]). To find “causes for an observed effect” is frequently referred to as identification or reconstruction, while determining “causes for a desired effect” can be summarized under the name design or control problems. In order to be able to address these kinds of questions, one first needs a forward operator describing the direct problem. This is usually created in a modelling process based on physical laws and assumptions and can consist e.g. in solving a system of partial differential equations. Let $F$ denote such a forward operator. Then we can state the

• direct problem: Input x given, compute output y via

y = F (x),


• inverse problem: Output y given, determine input x such that

F (x) = y.

In other words, in the inverse problem we are looking for an inversion of the operator $F$.

Inverse problems are closely related to the notion of ill-posed problems. A problem is called ill-posed in the sense of Hadamard if at least one of the following conditions is not fulfilled:

• For all admissible data a solution exists.

• The solution is unique.

• The solution depends continuously on the data.

Especially violation of the last point leads to serious problems when trying to recover parameters from given data. Very small variations in the data might lead to huge variations in the recovered parameter and thus the identification is unstable. As in real-life applications data are usually obtained from measurements, e.g. current flow measurements, they are generally noisy and can vary due to measurement errors. Hence one needs to take special care in order to obtain meaningful results in an identification process despite the instability inherent in the problem. The use of regularization methods is one way to deal with such situations.

As noisy data $y^\delta$ will generally not be in the range of the forward operator $F$, one usually tries to approximate it by minimizing $\|F(x) - y^\delta\|^2 + \alpha R(x)$, where the first term represents the data fit and the second one is a regularization term to stabilize the problem. The regularization parameter $\alpha$ is used to balance between stability and quality of the data fit.
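The stabilizing effect of the regularization term can be seen in a very small linear example. The sketch below assumes a hypothetical, nearly singular $2 \times 2$ forward operator and the penalty $R(x) = \|x\|^2$ (Tikhonov regularization). The matrix, the noise vector, the value of $\alpha$ and all function names are illustrative assumptions and have nothing to do with the PNP forward operator.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


def tikhonov_2x2(F, y_delta, alpha):
    """Minimise ||F x - y_delta||^2 + alpha ||x||^2 via the normal equations."""
    # Normal equations: (F^T F + alpha I) x = F^T y_delta
    ftf = [[sum(F[k][i] * F[k][j] for k in range(2)) + (alpha if i == j else 0.0)
            for j in range(2)] for i in range(2)]
    fty = [sum(F[k][i] * y_delta[k] for k in range(2)) for i in range(2)]
    return solve_2x2(ftf, fty)


def error_norm(x, x_ref):
    """Euclidean distance between two vectors."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, x_ref)) ** 0.5


F = [[1.0, 1.0], [1.0, 1.0001]]      # nearly singular forward operator
x_true = [1.0, 1.0]
noise = [1.0e-3, -1.0e-3]            # small data perturbation
y_delta = [sum(F[i][j] * x_true[j] for j in range(2)) + noise[i] for i in range(2)]

x_naive = solve_2x2(F, y_delta)                  # direct inversion of noisy data
x_reg = tikhonov_2x2(F, y_delta, alpha=1.0e-2)   # regularized reconstruction

print(error_norm(x_naive, x_true), error_norm(x_reg, x_true))
```

A data perturbation of size $10^{-3}$ changes the directly inverted solution by more than 20, while the regularized solution stays within about $10^{-2}$ of the true value, at the price of a small bias introduced by $\alpha$.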

After giving this general introduction we would now like to come back to the special case of ion channels and possible inverse problems that might arise in this context. Mathematical models can be extremely helpful when the parameters of interest are not directly accessible in experiments. For example, in many channel types it has not yet been possible to resolve the structure of the selectivity filter, which in turn is a key aspect when trying to understand the functioning of ion conduction. Apart from the basic understanding of the processes it can also be of great importance, e.g. in medical applications, to learn how to alter the behaviour of an ion channel or how to design an artificial ion-conducting pore that has certain attributes (e.g. with respect to selectivity). We are going to focus on inverse problems concerned with the structure of the selectivity filter (like length, radius and charge distribution) of single ion channels, but other questions are also possible. For example, in [31] and [30] French and his co-workers investigate the spatial distribution of ion channels based on temporal measurements of transmembrane current.

3.1.1 Parameter identification

In order to get deeper insights into the channel behaviour and conductance properties of single ion channels, the structure and composition of the selectivity filter are of great importance. In our one-dimensional PNP model (2.19), introduced in detail in Chapter 2, the geometrical aspects of the channel structure (radius, length, shape of the selectivity filter) enter the forward model via the area function $A$, while the protein charges lining the selectivity filter are modelled as a separate species restricted to the filter region. The structural information concerning these protein charges is parametrized by $N_m$, the number of fixed charges (compare (2.4)), and some confining potential $\mu_c$. This confining potential is an example of an externally applied potential $\mu_k^0$ that is included in the equations as a possibility to account for additional factors (see Equation (2.1d)), i.e. $\mu_m^0 = \mu_c$. The information conveyed in the confining potential is related to the distribution of the protein charges. Although in our PNP model these charges are moving according to the electrostatic field and the interactions with the other species in the system, with the confining potential it is possible to restrict their range of movement to certain areas in the filter. Instead of assuming a completely rigid channel structure, this procedure allows a certain flexibility for the charged residues, which is also most likely the case in real ion channels. Hence the confining potential encodes the part of the channel structure that is related to the distribution of the charged filter residues.

An alternative approach would be to consider the protein charges as fixed, thus eliminating the equation for the confined species from the PNP system. The density of the confined species would then merely appear as a parameter on the right-hand side of the Poisson equation (2.19a). It could then directly be sought as an output in a parameter identification process. In the former approach the confining potential $\mu_c$ is the parameter of interest.

The data available for the reconstruction process is composed of current-voltage curves for various bath concentrations of the free ion species. In fact, quite a large number of data sets can be acquired this way by varying the bath concentrations of the free ions. Identification problems have also been studied in the semiconductor context; there, however, the electron and hole concentrations on the boundary cannot be altered and only the applied voltage can be varied. As larger data sets generally lead to better reconstruction results, it is an advantage that in ion channels the assembly of sufficiently many data is not an issue, but can be accomplished once the experimental setup (e.g. a patch clamp stand) has been established.

The recorded data can then be used for the identification of the above mentioned systemparameters, i.e. the channel structure and channel geometry.

Let $q$ denote the parameter to be identified (e.g. $q = \mu_c$ or $q$ = channel radius) and $I^\delta$ the (noisy) data measurable in experiments (i.e. current-voltage curves for different bath concentrations). The forward operator $F$ for our PNP model then consists of solving the one-dimensional steady-state version of the PNP system (2.1)-(2.3) for a fixed parameter value $q$ and afterwards mapping the outcome to the current flow via

$$I = e_0 \sum_{i=1}^{m-1} z_i J_i.$$

Since $I = F(q)$ depends implicitly on the sought parameter, the resulting optimization problem can be written as

$$Q(q) \to \min_q$$

with

$$Q(q) = \frac{1}{2} \|F(q) - I^\delta\|^2.$$


To stabilize the problem and/or to incorporate some a priori knowledge about the parameter $q$, a penalty term can be added and the final optimization problem reads as

$$J_\alpha(q) := Q(q) + \alpha R(q) \to \min_q.$$

The regularization parameter $\alpha$ is used to balance between data fit (first term) and stability (second term). The penalty functional can e.g. be taken as $R(q) = \|q - q^*\|^2$, where $q^*$ denotes some ideal solution or incorporates some additional knowledge about the parameter.

Apart from the parameter identification problem, it might also be of importance to alter the channel behaviour in a certain desired way. This brings us to the second class of inverse problems, the design issue.

3.1.2 Design problems

While in parameter identification we try to recover the underlying causes that produce a given data set, in the design problem we would like to generate system parameters that lead to a desired output. Hence, in contrast to the parameter identification problem, no measured data are involved in the design problem. There the noisy data could lead to instabilities in the reconstruction process, a factor that is absent in the design problem. Nevertheless, instabilities of a different kind might arise, e.g. robustness with respect to applied voltages might not be given. Slight variations in the applied voltage might yield a completely different parameter design, which is not meaningful from a practical point of view.

In ion channels one desirable goal could be to increase the selectivity for one ion species over others. In order to formulate the design problem in this case in a mathematical way, we first need to define a selectivity measure $S_k$ that allows us to quantify the selectivity with respect to species $k$. Examples of such selectivity measures are the conductance or the flux of species $k$. For more details see [39]. If $S_1$ and $S_2$ are the selectivity measures of species 1 and 2, respectively, and the aim is to increase the selectivity for species 1 over species 2, the design problem can also be formulated as a minimization problem:

$$Q(q) \to \min_q$$

with

$$Q(q) = -\frac{S_1(q)}{S_2(q)}$$

or

$$Q(q) = -S_1(q) + S_2(q).$$

As above, the design parameter enters the minimization functional via the PNP model, whose output is needed to compute the selectivity measures. And also as in the case of parameter identification, an additional penalty term can be added to prevent a blow-up of the design parameter and to favour a reasonable design (e.g. one that is easy to realize in practice). In [14] and [15] such design problems have been addressed, e.g. for the confining potential $\mu_c$. The benefit of an additional regularization term to generate a reasonable confining potential has been demonstrated there.


In the following we are going to focus on parameter identification problems. After considering the full forward model for this issue, the main emphasis will be put on the development of so-called surrogate models to perform the identification task.

3.2 Parameter identification with the full forward model

As introduced before, mathematical models can be a tool to learn something about theunderlying physical system and to reconstruct quantities that cannot be measured directlyin an experiment. The general procedure for parameter identification has been outlinedabove. In this section we want to present some results for the reconstruction of the confiningpotential µc (or more precisely exp(−µc)), using the full PNP system with hard-sphereparticle interactions included. The data we are going to consider consist of current-voltagecurves for different bath concentrations of the free ions. The optimization functional we wantto minimize with respect to the parameter q = exp(−µc) reads

$$Q(q) = \frac{1}{2}\,\|F(q) - I^\delta\|^2,$$

where $I^\delta$ denotes the (possibly noisy) data and $F(q)$ is our forward operator. It is given by

$$F : q \mapsto (V, \rho_i, \mu_i^{\mathrm{ex}}) \mapsto I,$$

i.e. first mapping the parameter q to the solution of the PNP PDE system and subsequently computing the current I from this solution. The current is given by

$$I = e_0 \sum_{i=1}^{m-1} z_i J_i.$$

Recall that the m-th species refers to the protein charges inside the filter and thus does not contribute to the measured current. From the steady-state continuity equations for the densities $\rho_i$, $i = 1, ..., m-1$, together with the boundary conditions $\rho_i(-L) = \eta_i^-$ and $\rho_i(L) = \eta_i^+$ (here $-L$ and $L$ denote the left and right boundary of our one-dimensional system, respectively) the fluxes $J_i$ can be computed to be

$$J_i = \frac{\eta_i^- \exp\big(z_i c\, V(-L) + \mu_i^{\mathrm{ex}}(-L)\big) - \eta_i^+ \exp\big(z_i c\, V(L) + \mu_i^{\mathrm{ex}}(L)\big)}{\int_{-L}^{L} \frac{1}{D_i A} \exp\big(z_i c\, V + \mu_i^{\mathrm{ex}}\big)\, dx}, \quad i = 1, ..., m-1. \tag{3.1}$$

The fluxes do not depend explicitly on the optimization parameter q but on the solutions $V$ and $\mu_i^{\mathrm{ex}}$ of the PNP system, which in turn depend on q.
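Formula (3.1) can be evaluated directly once a PNP solution is available on a grid. The following is a minimal sketch, assuming the solution is given as NumPy arrays on a grid `x` spanning $[-L, L]$ and that a trapezoidal rule is accurate enough; the function names `trapezoid`, `flux` and `current` are hypothetical and not part of the original implementation.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on a possibly non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def flux(x, V, mu_ex, A, D, z, c, eta_left, eta_right):
    """Evaluate (3.1) for one free species on the grid x spanning [-L, L]."""
    w = np.exp(z * c * V + mu_ex)            # weight exp(z c V + mu_ex)
    numerator = eta_left * w[0] - eta_right * w[-1]
    denominator = trapezoid(w / (D * A), x)  # integral of w / (D A) over [-L, L]
    return numerator / denominator

def current(fluxes, valences, e0=1.0):
    """I = e0 * sum_i z_i J_i over the free species i = 1, ..., m-1."""
    return e0 * float(np.dot(valences, fluxes))
```

For $V \equiv 0$, $\mu^{\mathrm{ex}} \equiv 0$ and constant $D A = 1$ the formula reduces to $J = (\eta^- - \eta^+)/(2L)$, which is a convenient sanity check for the quadrature.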

In order to minimize Q with respect to q we employ an iterative gradient method of the form

$$q_{n+1} = q_n - \gamma_n Q'(q_n)$$

($\gamma_n$ denoting the step size) to compute the next iterate, starting with some initial guess $q_0$. Since we are trying to identify a whole function and not just a single scalar, finite difference methods to compute the derivative $Q'$ are very inefficient. Instead we use the adjoint approach for its determination (see [36] for an introduction to this topic). This results in an adjoint PDE system, so that the gradient can be computed by solving a single additional (adjoint) system. For the derivation of the resulting equations we refer to the appendix.
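The cost argument can be illustrated on a small discrete stand-in problem. The sketch below is not the adjoint system of the PNP model (which is derived in the appendix); it assumes a hypothetical linear state equation $K u = C q$ with observation operator $B$, all matrices being random placeholders. The adjoint gradient needs one forward and one adjoint solve, whereas central finite differences need two forward solves per parameter component.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_param, n_data = 30, 20, 5

# Hypothetical discrete "PDE": K u = C q, observed through B.
K = 4.0 * np.eye(n_state) + 0.1 * rng.standard_normal((n_state, n_state))
C = rng.standard_normal((n_state, n_param))
B = rng.standard_normal((n_data, n_state))
y = rng.standard_normal(n_data)

def objective(q):
    u = np.linalg.solve(K, C @ q)                     # one forward solve
    return 0.5 * np.sum((B @ u - y) ** 2)

def gradient_adjoint(q):
    u = np.linalg.solve(K, C @ q)                     # forward solve
    alpha = np.linalg.solve(K.T, B.T @ (B @ u - y))   # one adjoint solve
    return C.T @ alpha                                # gradient with respect to q

def gradient_fd(q, h=1e-4):
    g = np.zeros_like(q)
    for j in range(q.size):                           # two forward solves per component
        dq = np.zeros_like(q)
        dq[j] = h
        g[j] = (objective(q + dq) - objective(q - dq)) / (2.0 * h)
    return g
```

Both routines return the same gradient up to rounding, but the number of solves in `gradient_fd` grows with the number of discretization points of q, while `gradient_adjoint` always needs two.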

The data for the numerical tests are generated by solving the forward model with the exact parameter value. To mimic more realistic experimental setups, artificial noise of different levels can be added afterwards.

For the results shown in Figures 3.1 and 3.2 we used twelve different current-voltage curves, each consisting of three different applied voltages. Around 4% noise was added to the data before performing the reconstruction in the case of the more complex confining potential in Figure 3.2.

[Figure 3.1: panel (a), "Parameter reconstruction", shows $\exp(-\mu_c)$ versus x; panel (b), "Data residual", shows the residual versus the number of iterations.]

Figure 3.1: Reconstruction results for identification of confining potential $\mu_c$ with the full model including hard-sphere interaction terms (no noise added to data). (a) Reconstruction results as $\exp(-\mu_c)$; blue: exact value; red: reconstruction; green: initial guess; (b) data residual during iteration.

The first example gives the reconstruction result for a rather simple confining potential that restricts the charged residues to the central filter region. It has been recovered very well, as can be seen in Figure 3.1(a). The blue curve gives the exact value, red is the reconstruction. The green line at one indicates the initial guess with which the iteration was started. This initial guess assumes no restrictions on the movement of the charged residues inside the filter (i.e. $\mu_c \equiv 0$). The right part of Figure 3.1 demonstrates the decrease in the data residual Q.

In the second example (Figure 3.2) a more complex confining potential has been used to generate the data and 4% noise was added to the data before performing the reconstruction.

The blue curve in Figure 3.2(a) gives the true parameter value used for the data generation, the red curve is the reconstruction at the end of the iteration process. The green line gives the initial guess with which the iteration was started. Again the initial guess assumes no restrictions for the protein charges inside the filter, i.e. theoretically they could be anywhere inside the filter ($\mu_c \equiv 0$). For this more complex confining potential, too, a reasonable reconstruction could be achieved in the presence of data noise. The iteration was stopped when the residual approached the noise level.
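Stopping the iteration once the residual reaches the noise level can be sketched on a toy problem. The example below assumes a hypothetical linear smoothing operator `G` in place of the PNP forward operator, Gaussian noise scaled to a relative level of 4%, and a stopping rule of discrepancy-principle type with a safety factor `tau`; neither the operator nor the value of `tau` is taken from the computations above.

```python
import numpy as np

def add_relative_noise(data, level, rng):
    """Perturb data by Gaussian noise whose norm equals level * ||data||."""
    noise = rng.standard_normal(data.shape)
    noise *= level * np.linalg.norm(data) / np.linalg.norm(noise)
    return data + noise

def gradient_iteration(G, data, q0, noise_norm, tau=1.5, max_iter=5000):
    """Iterate q <- q - gamma G^T (G q - data) until ||G q - data|| <= tau * noise_norm."""
    gamma = 1.0 / np.linalg.norm(G, 2) ** 2        # fixed step size 1 / ||G||^2
    q = q0.copy()
    for n in range(max_iter):
        residual = G @ q - data
        if np.linalg.norm(residual) <= tau * noise_norm:
            return q, n                            # residual has reached the noise level
        q = q - gamma * (G.T @ residual)
    return q, max_iter

rng = np.random.default_rng(1)
x = np.linspace(-0.1, 0.1, 40)
G = np.exp(-((x[:, None] - x[None, :]) / 0.03) ** 2)   # hypothetical smoothing operator
q_true = np.where(np.abs(x) < 0.05, 1.0, 0.2)          # toy stand-in for exp(-mu_c)
exact = G @ q_true
noisy = add_relative_noise(exact, 0.04, rng)
noise_norm = np.linalg.norm(noisy - exact)
q_rec, n_stop = gradient_iteration(G, noisy, np.ones_like(x), noise_norm)
```

Iterating beyond this point would start to fit the noise rather than the parameter, which is why the residual is not driven to zero for noisy data.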

Parameter identification problems with respect to the total charge $N_m$ and the charge distribution using the full model have also been addressed in [15].


[Figure 3.2: panel (a) shows $\exp(-\mu_c)$ versus x; panel (b), "Data residual", shows the residual versus the number of iterations.]

Figure 3.2: Reconstruction results for identification of confining potential $\mu_c$ with the full model including hard-sphere interaction terms (4% noise added to data). (a) Reconstruction results as $\exp(-\mu_c)$; blue: exact value; red: reconstruction; green: initial guess; (b) data residual during iteration.

Since the forward problem has to be solved repeatedly during the iterative reconstruction procedure, the identification process is quite time-consuming when considering the full PNP system with self-consistently computed short-range particle interactions. In the next section we are thus going to investigate whether surrogate models can be used alongside the full model in the identification process to speed up the computations.

3.3 Parameter identification with surrogate models

The general motivation for using surrogate or replacement models usually lies in the computational complexity of the full model. In the PNP case the full forward model consists of solving the coupled nonlinear PNP system including the short-range particle interactions. This is computationally quite expensive, and in a parameter identification process the forward model has to be solved repeatedly. Furthermore we should keep in mind that in every iteration step the coupled PDE system has to be solved for every combination of bath ion concentrations and applied voltages. Since larger data sets generally yield better reconstruction results, the idea is to simplify the computations for the single data record rather than to decrease the amount of data.

The surrogate models still depend in some way on the original full model (otherwise we would not speak of a surrogate model but would just use a different model describing the process). Let F denote the original model, S a surrogate model and q the parameter to be identified. Furthermore let $p = p(q)$ denote the output of the original model that is relevant for the surrogate model. $I^\delta$ refers to the measured data. With this notation we propose the following iterated algorithm that combines the full forward model with the simpler surrogate model (Figure 3.3): We start with an initial guess $q_{old} = q_0$ for the parameter of interest and use this together with the experimental boundary conditions BC (like ion bath concentrations and applied voltages) as an input for the original model F. Solving this full model once for the fixed parameter value $q_{old}$ generates the quantities p that are needed as an input for the surrogate model S. As we are going to see later on, in the PNP case these quantities can e.g. be the ion densities and electrostatic potential at a specific applied voltage or the short-range particle interaction terms.


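[Figure 3.3, flowchart: the current parameter $q_{old}$ and the boundary conditions BC enter the full model F, which returns $p(q_{old})$ and $I(q_{old})$ and thereby fixes the surrogate model S. If $\|F(q_{old}) - I^\delta\| \le \epsilon$, the algorithm stops; otherwise q is optimized with the surrogate model S, giving $q_{new}$, and the loop restarts with $q_{old} = q_{new}$.]

Figure 3.3: The iterated algorithm for parameter identification with a surrogate model.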

If the data are not yet described sufficiently well, the update procedure for the parameter q is then carried out only with the surrogate model S (depending on the fixed quantities p) in what we call the inner iteration. This avoids complex computations with the original F. (As the surrogate model is only an approximation to the full model, there is no need to perform a full minimization in the inner iteration. It should be stopped when the variation of the optimization parameter q from the old value $q_{old}$ is larger than some threshold, since then the reference quantities p should be updated.) The resulting new parameter value $q_{new}$ is then fed into the original model to update the quantities p in what we call the outer iteration. These nested iterations are repeated until some stopping criterion is reached.
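The nested structure can be sketched with a scalar toy problem. Everything in the following example is hypothetical: the "full model" is $F(q) = q + 0.3 \sin q$, where the sine term stands in for the expensive self-consistent part, and the surrogate freezes this term at its value p from the last outer iteration. The thresholds `eps`, `tol` and `max_change` are illustrative choices.

```python
import math

def full_model(q):
    """Toy 'expensive' model: returns the output I and the reference quantity p."""
    p = 0.3 * math.sin(q)              # stands in for the costly self-consistent part
    return q + p, p

def surrogate(q, p):
    """Toy surrogate: same structure, but p is frozen."""
    return q + p

def identify(data, q0, eps=1e-8, tol=1e-12, gamma=0.5, max_change=0.2,
             max_outer=100, max_inner=100):
    q_old = q0
    for outer in range(max_outer):
        output, p = full_model(q_old)          # outer iteration: one full solve
        if abs(output - data) <= eps:          # data described well enough: stop
            return q_old, outer
        q = q_old
        for _ in range(max_inner):             # inner iteration: surrogate only
            grad = surrogate(q, p) - data      # derivative of 0.5 * (S(q; p) - data)^2, dS/dq = 1
            q -= gamma * grad
            if abs(q - q_old) > max_change or abs(grad) < tol:
                break                          # p is outdated, or inner problem is solved
        q_old = q
    return q_old, max_outer
```

In this toy setting each outer step needs a single evaluation of the full model, while all gradient steps are carried out with the cheap surrogate.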

Next we want to discuss which prerequisites a surrogate model should fulfil in order to be a reasonable alternative in the iteration procedure. The following points should be taken into consideration:

1) Since the drawback of the original model is its computational complexity, the surrogate model should reduce the computational effort.

2) The surrogate model has to represent the measurements, i.e. it must have the measurable quantities as output.

3) Since it is used to identify a system parameter of the original system, the surrogate model must in some way depend on the parameters of interest.

According to these points we are going to investigate two different surrogate models. The first one stems from computational considerations. As the short-range particle interactions take up the major part of the computation time, a natural choice is a PNP system with fixed short-range interaction terms $\mu_k^{\mathrm{ex}}$ as a computationally less expensive surrogate model. Points 2) and 3) from above are automatically satisfied by such a model. In the following we are going to call this approach the noDFT model, to acknowledge the fact that the short-range interaction terms are not computed self-consistently in this model (as is the case in the original PNP model), but merely enter as fixed parameters.

The second surrogate model under investigation is motivated by physical considerations. When studying the current-voltage curves of several ion channels for various bath concentrations, it turns out that many channels have a nearly linear current-voltage characteristic (either over the full or at least over part of the relevant voltage range). Figure 3.4 shows an example of such a current-voltage curve measured from the cation channel TRPA1.

Figure 3.4: Current-voltage curve for TRPA1.

Thus another idea is to approximate the measured outcome (the current-voltage curve) by a linear model of the form

$$I = g \cdot (U - U_0) + I_0.$$

Here g is generally known as the conductance of the single channel, $U_0$ is some reference potential, e.g. the reversal potential, and $I_0$ is the current flowing at the reference voltage. This linear model will be calibrated using the original PNP model, yielding an expression for the conductance g dependent on the system parameters and thereby establishing the relation between the surrogate model and the parameter to be identified (compare point 3) above).
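On the data side, the conductance of a measured current-voltage curve can be estimated by a least-squares line fit. The following sketch assumes that `numpy.polyfit` is acceptable for this purpose; the helper name `fit_linear_iv` and the numerical values are hypothetical and do not correspond to measured data.

```python
import numpy as np

def fit_linear_iv(voltages, currents, U0):
    """Least-squares fit of I = g * (U - U0) + I0; returns (g, I0)."""
    shifted = np.asarray(voltages, dtype=float) - U0
    # polyfit returns the coefficients with the highest degree first: slope, intercept
    g, I0 = np.polyfit(shifted, np.asarray(currents, dtype=float), deg=1)
    return float(g), float(I0)

voltages = [-0.05, 0.0, 0.05]      # hypothetical applied voltages
currents = [-1.1, 0.1, 1.3]        # hypothetical currents in arbitrary units
g, I0 = fit_linear_iv(voltages, currents, U0=0.0)
```

With three voltages per curve, as in the examples of this chapter, each current-voltage curve then contributes exactly one fitted conductance.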

Let us start by considering the noDFT model. All minimizations with respect to the system parameter q in the inner iterations have been carried out using the iterative gradient method

$$q_{n+1} = q_n - \gamma_n Q'(q_n),$$

with $\gamma_n$ denoting the step size and the derivative being computed via the adjoint approach. For an exemplary derivation and statement of the resulting equations in the case of the linear surrogate model we refer to the appendix.

3.3.1 Surrogate model with fixed DFT

In this subsection we are going to investigate the performance of the noDFT surrogate model with respect to the identification of the confining potential $\mu_c$. The noDFT model consists of the PNP system equations, with the short-range particle interactions $\mu_k^{\mathrm{ex}}$ entering the continuity equations as fixed input parameters. In contrast to this surrogate approach, in the original PNP model these short-range interaction terms have to be computed self-consistently in every step when solving the PDE system. Hence, using the notation from Figure 3.3, we have

$$F \simeq \text{PNP with DFT}, \qquad p \simeq \mu_k^{\mathrm{ex}}, \; k = 1, ..., m,$$

$$S \simeq \text{PNP noDFT}, \qquad q \simeq \mu_c.$$

From the equilibrium assumption $\nabla \mu_m = 0$ for the confined species (here $\mu_m$ denotes the electrochemical potential of the confined species) it follows in the one-dimensional scaled version that

$$\rho_m = K\, e^{-z_m c V - \mu_m^{\mathrm{ex}} - \mu_c}, \tag{3.2}$$

where the constant K can be determined with the help of the normalization condition (2.4). Hence we see that the term actually acting on the confined species is $e^{-\mu_c}$, which could be taken as the optimization parameter instead of $\mu_c$. Since the confining potential has some large value where the charged residues are not supposed to be and is basically zero where the residues are allowed to move around freely, the expression $q = e^{-\mu_c}$ will be between 0 and 1, $q \in (0, 1]$. Thus in an iterative procedure we can project the parameter q onto this interval (or rather onto $[\epsilon, 1]$ with $\epsilon$ small) to stay within the admissible set.
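Both steps, the projection onto $[\epsilon, 1]$ and the evaluation of (3.2), are simple pointwise operations on a grid. The sketch below assumes that the normalization condition takes the form $\int_{\mathrm{filter}} A \rho_m \, dx = N_m$ (as in the statement of the PNP system in Section 3.3.2) and that the grid `x` covers exactly the filter region; the function names are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on a possibly non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def project(q, eps=1e-6):
    """Pointwise projection of q = exp(-mu_c) onto the admissible set [eps, 1]."""
    return np.clip(q, eps, 1.0)

def confined_density(x, q, V, mu_ex_m, A, z_m, c, N_m):
    """rho_m = K q exp(-z_m c V - mu_ex_m), with K fixed by int A rho_m dx = N_m."""
    shape = q * np.exp(-z_m * c * V - mu_ex_m)
    K = N_m / trapezoid(A * shape, x)      # normalization constant
    return K * shape
```

Working with q instead of $\mu_c$ keeps the admissible set a simple box, so the projection after each gradient step is a single call to `project`.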

The forward operator F and the surrogate model S can be decomposed into two parts, the first one mapping the parameter q to the solution of the PDE system and the second one mapping the solution of the PDE system to the current output, denoted by I in the following. Let $w := (V, \rho_k)$ denote the solution of the full PNP system (including self-consistent computation of short-range interaction terms) and $M := (\mu_k^{\mathrm{ex}})$ the short-range interaction terms. Both quantities depend on the parameter q, i.e. $w = w(q)$ and $M = M(q)$.

Then we can write the forward operator F as

$$q \overset{A}{\mapsto} w \overset{B}{\mapsto} I.$$

Let $u := (V, \rho_k)$ denote the solution of the noDFT model with fixed short-range interaction terms, i.e. $u = u(q, M)$. The surrogate model S can then be split up into

$$q \overset{A_M}{\longmapsto} u \overset{B_M}{\longmapsto} I.$$

In this case the operators mapping the parameter to the solution of the PDE system ($A_M$) and mapping the PDE solution to the current ($B_M$) depend on the fixed quantities M. The functional to be minimized with respect to q in the inner iteration is given by

$$Q_M(q) = \frac{1}{2}\,\|B_M(A_M(q)) - I^\delta\|^2,$$

and the iteration process can be written as

• outer iteration (index i):

$$q^i \to \begin{cases} w^i = w(q^i) \\ M^i = M(q^i), \end{cases}$$

• inner iteration (index n):

$$q_0^i = q^i, \quad M^i \text{ fixed},$$

$$q_{n+1}^i = q_n^i - \gamma_n \nabla_q Q_{M^i}(q_n^i),$$

with a subsequent projection of $q_{n+1}^i$ onto the admissible set and $q^{i+1} = q_{n+1}^i$ at the end of the inner iteration.

Figure 3.5 shows the reconstruction result for a rather simple confining potential. Twelve different bath concentrations and three different voltages for each bath combination have been used for the reconstructions, resulting in 36 data points in total (no noise added to data, short-range particle interactions restricted to hard-sphere part). The exact confining potential (given in blue in Figure 3.5(a)) is recovered very accurately (red curve), starting from an initial guess (green curve) that assumes no restrictions on the residue movement inside the filter. Figure 3.5(b) demonstrates the decrease of the optimization functional.

[Figure 3.5: panel (a) shows $\exp(-\mu_c)$ versus x; panel (b), "Data residual", shows the residual versus the number of outer iterations.]

Figure 3.5: Reconstruction results for identification of confining potential $\mu_c$ with noDFT surrogate model (no noise added to data). (a) Reconstruction results as $\exp(-\mu_c)$; blue: exact value; red: reconstruction; green: initial guess; (b) residual in outer iteration.

We performed the same computations for a more complex confining potential (see Figure 3.6(a), blue curve), this time adding around 4% noise to the data and again starting with the same initial guess as before. In this case, too, the reconstruction (red curve) gives a qualitatively correct result, leading to a considerable decrease in the data error (Figure 3.6(b)).

The surrogate model with the fixed short-range particle interactions performs quite well in the numerical examples considered here, but one inner iteration step still involves the solution of a nonlinear coupled PDE system. In the next section we are going to explore a linear surrogate model that has a larger potential for time savings than the noDFT model.

3.3.2 Linear surrogate model

In this section we are going to investigate a linear surrogate model that comes from experimental considerations. In the first part we will derive the model in detail and afterwards investigate its performance in the parameter identification.


[Figure 3.6: panel (a) shows $\exp(-\mu_c)$ versus x; panel (b), "Data residual", shows the residual in the outer iteration.]

Figure 3.6: Reconstruction results for identification of confining potential $\mu_c$ with noDFT surrogate model (4% noise added to data). (a) Reconstruction results as $\exp(-\mu_c)$; blue: exact value; red: reconstruction; green: initial guess; (b) residual in outer iteration.

As in many ion channels the current-voltage curve is almost linear (at least in parts of the voltage range), we use the following approach:

$$I = g \cdot (U - U_0) + I_0 \tag{3.3}$$

with $U_0$ denoting some reference voltage and $I_0$ the corresponding current. The conductance g determines the slope of the curve. In order to establish the relationship between the underlying system parameters like channel geometry and charge distribution and this surrogate model, we use the PNP model to calibrate the linear one. The current in the original one-dimensional PNP model depending on the applied membrane voltage U is given by

$$I(U) = e_0 \sum_{k=1}^{m-1} z_k J_k(U), \tag{3.4}$$

with $e_0$ denoting the unit charge, $z_k$ the valence of species k and $J_k$ its flux. Remember that the m-th index refers to the protein charges that do not contribute to the measured ionic current. In order to get a linear relation between current I and applied membrane voltage U of the form (3.3) we perform a linearization of (3.4) around the reference potential $U_0$. This results in

$$I(U) = e_0 \sum_{k=1}^{m-1} z_k J_k(U) \approx e_0 \sum_{k=1}^{m-1} z_k J_k(U_0) + e_0 \sum_{k=1}^{m-1} z_k \frac{dJ_k}{dU}(U_0)\, (U - U_0),$$

which is of the form (3.3) when setting

$$g = e_0 \sum_{k=1}^{m-1} z_k \frac{dJ_k}{dU}(U_0)$$

and

$$I_0 = e_0 \sum_{k=1}^{m-1} z_k J_k(U_0).$$
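As a cross-check of the linearization (not the route taken in this chapter, where the derivative is obtained from a linearized PDE system), the slope and offset of the tangent can also be approximated by a central difference of any routine that returns the current for a given voltage. The sketch below uses a hypothetical nonlinear toy curve `toy_current` in place of the PNP current.

```python
import math

def linearize_current(current_of_U, U0, h=1e-4):
    """Return (g, I0) of the tangent I ~ g * (U - U0) + I0 at the reference voltage U0."""
    I0 = current_of_U(U0)
    g = (current_of_U(U0 + h) - current_of_U(U0 - h)) / (2.0 * h)   # central difference
    return g, I0

def toy_current(U):
    """Hypothetical nonlinear current-voltage relation, used only for illustration."""
    return math.sinh(2.0 * U)

g, I0 = linearize_current(toy_current, U0=0.0)
```

Such a finite-difference estimate requires two additional solves of the full model, so it is suited for validation rather than for use inside the inner iteration.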


For ease of reading and writing in the sequel we define the following quantities to express the derivatives with respect to voltage:

$$v(x) = \frac{dV(x)}{dU}\Big|_{U_0}, \qquad r_k(x) = \frac{d\rho_k(x)}{dU}\Big|_{U_0}, \qquad j_k = \frac{dJ_k}{dU}\Big|_{U_0}.$$

Then the conductance can be written as

$$g = e_0 \sum_{k=1}^{m-1} z_k j_k.$$

To determine the quantities $j_k$, the full PNP system is also linearized around the reference voltage $U_0$. Recall that the full steady-state PNP system is given by

$$\begin{aligned}
-\lambda^2 \frac{1}{A} \frac{d}{dx}\Big(\epsilon A \frac{dV}{dx}\Big) &= \sum_{k=1}^{m} z_k \rho_k, \\
J_k &= -D_k A \Big(\frac{d\rho_k}{dx} + z_k c\, \rho_k \frac{dV}{dx} + \rho_k \frac{d\mu_k^{\mathrm{ex}}}{dx}\Big) \quad \forall\, k = 1, ..., m-1, \\
\frac{dJ_k}{dx} &= 0, \\
\frac{d}{dx}\big(\ln(\rho_m) + z_m c\, V + \mu_m^{\mathrm{ex}} + \mu_c\big) &= 0, \\
\int_{\mathrm{filter}} A\, \rho_m\, dx &= N_m,
\end{aligned}$$

(omitting additional external potentials $\mu_k^0$ for the free ion species). Using the expansions

$$V(x, U) = V(x, U_0) + v(x)(U - U_0) + O((U - U_0)^2)$$

and

$$\rho_k(x, U) = \rho_k(x, U_0) + r_k(x)(U - U_0) + O((U - U_0)^2)$$

and neglecting terms of order $O((U - U_0)^2)$ leads to the following linear PDE system to be solved for $v$ and $r_k$:

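$$\begin{align}
-\lambda^2 \frac{1}{A} \frac{d}{dx}\Big(\epsilon A \frac{dv}{dx}\Big) &= \sum_{k=1}^{m} z_k r_k, \tag{3.6a} \\
j_k &= -D_k A \Big(\frac{dr_k}{dx} + z_k c\, r_k \frac{dV_0}{dx} + z_k c\, \rho_{k,0} \frac{dv}{dx} + r_k \frac{d\mu_{k,0}^{\mathrm{ex}}}{dx} + \rho_{k,0} \Big(\sum_{i=1}^{m} \frac{\partial \mu_k^{\mathrm{ex}}}{\partial \rho_i}\Big|_{U_0} r_i\Big)\Big) \quad \forall\, k = 1, ..., m-1, \tag{3.6b} \\
\frac{dj_k}{dx} &= 0, \tag{3.6c} \\
\frac{d}{dx}\Big(\frac{r_m}{\rho_{m,0}} + z_m c\, v + \sum_{i=1}^{m} \frac{\partial \mu_m^{\mathrm{ex}}}{\partial \rho_i}\Big|_{U_0} r_i\Big) &= 0, \tag{3.6d} \\
\int_{\mathrm{filter}} A\, r_m\, dx &= 0. \tag{3.6e}
\end{align}$$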

Here $V_0$, $\rho_{k,0}$ and $\mu_{k,0}^{\mathrm{ex}}$ denote the electrostatic potential, densities and short-range interaction terms at the reference voltage $U_0$, i.e. $V_0(x) = V(x, U_0)$, $\rho_{k,0}(x) = \rho_k(x, U_0)$ and $\mu_{k,0}^{\mathrm{ex}}(x) = \mu_k^{\mathrm{ex}}(x, U_0)$. For simplicity we are going to ignore the last term in the linearized flux equation (3.6b) and in (3.6d), i.e. the voltage dependence of the short-range interaction terms.

In order to introduce the optimization parameter $q = \mu_c$ into the linearized system we use (3.2) to express $\rho_{m,0} = K\, e^{-z_m c V_0 - \mu_{m,0}^{\mathrm{ex}} - \mu_c}$ in the above system.

Let $-L$ and $L$ denote the left and right system boundary, respectively. Since the boundary values for the original PNP system are given by

$$\eta_k^{\pm} = \rho_k(\pm L, U) \approx \rho_k(\pm L, U_0) + r_k(\pm L)(U - U_0) = \eta_k^{\pm} + r_k(\pm L)(U - U_0)$$

and

$$U = V(L, U) \approx V(L, U_0) + v(L)(U - U_0) = U_0 + v(L)(U - U_0)$$

(for membrane voltage applied at the right side), it follows that the boundary values of the linearized system have to fulfil

$$r_k(-L) = 0, \quad r_k(L) = 0, \quad k = 1, ..., m-1, \tag{3.7a}$$

$$v(-L) = 0, \; v(L) = 1 \quad \text{or} \quad v(-L) = 1, \; v(L) = 0 \tag{3.7b}$$

for membrane voltage applied at the right or left side, respectively.

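From (3.6b) together with the boundary conditions (3.7a) the quantity $j_k$ can be computed to be

$$j_k = -z_k c\, \frac{\int_{-L}^{L} \exp\big(z_k c\, V_0 + \mu_{k,0}^{\mathrm{ex}}\big)\, \rho_{k,0}\, \frac{dv}{dx}\, dx}{\int_{-L}^{L} \frac{1}{D_k A} \exp\big(z_k c\, V_0 + \mu_{k,0}^{\mathrm{ex}}\big)\, dx},$$

and the surrogate model mapping the parameter q to the conductance g can be expressed as

$$S(q) = -e_0 \sum_{k=1}^{m-1} z_k^2 c\, \frac{\int_{-L}^{L} \exp\big(z_k c\, V_0 + \mu_{k,0}^{\mathrm{ex}}\big)\, \rho_{k,0}\, \frac{dv}{dx}(q)\, dx}{\int_{-L}^{L} \frac{1}{D_k A} \exp\big(z_k c\, V_0 + \mu_{k,0}^{\mathrm{ex}}\big)\, dx}. \tag{3.8}$$

Note that in this expression the reference components $V_0$, $\rho_{k,0}$ and $\mu_{k,0}^{\mathrm{ex}}$ are fixed quantities coming from the original PNP system and hence are not changed during the optimization with the surrogate model.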
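The step from (3.6b) to the expression for $j_k$ can be spelled out as follows. This is a sketch under the simplification stated above (the last term of (3.6b) is neglected); the shorthand $E_k$ is introduced here only for readability. It uses that $j_k$ is constant in x by (3.6c) and that $r_k(\pm L) = 0$ by (3.7a).

```latex
% Shorthand (introduced here): E_k(x) := \exp(z_k c V_0 + \mu^{ex}_{k,0})
\begin{align*}
E_k \Big( \frac{dr_k}{dx} + z_k c\, r_k \frac{dV_0}{dx}
        + r_k \frac{d\mu^{\mathrm{ex}}_{k,0}}{dx} \Big)
  &= \frac{d}{dx}\big( E_k\, r_k \big), \\
\frac{j_k}{D_k A}\, E_k
  &= -\frac{d}{dx}\big( E_k\, r_k \big)
     - z_k c\, E_k\, \rho_{k,0}\, \frac{dv}{dx}, \\
j_k \int_{-L}^{L} \frac{E_k}{D_k A}\, dx
  &= -\big[ E_k\, r_k \big]_{-L}^{L}
     - z_k c \int_{-L}^{L} E_k\, \rho_{k,0}\, \frac{dv}{dx}\, dx
   = - z_k c \int_{-L}^{L} E_k\, \rho_{k,0}\, \frac{dv}{dx}\, dx .
\end{align*}
```

Dividing by the integral on the left-hand side gives the stated formula for $j_k$, in the same way as (3.1) follows from the nonlinear flux equation.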

As before we can split up the operators and write the forward operator F as

$$q \overset{A}{\mapsto} w \overset{B}{\mapsto} g,$$

with $w := (V, \rho_k)$ denoting the solution of the full PNP system and g the conductance corresponding to the currents.

Let $u := (v, r_k)$ denote the solution of the linearized PNP system (3.6). In this case the solution u depends not only on the parameter q and the short-range interaction terms M but also on the fixed reference quantities $V_0$ and $\rho_{k,0}$, $k = 1, ..., m-1$, i.e. $u = u(q, w_0, M_0)$. Hence we can express the surrogate model S as

$$q \overset{A_{w_0,M_0}}{\longmapsto} u \overset{B_{w_0,M_0}}{\longmapsto} g,$$

with the operators depending on the reference solution of the full PNP system.

Let $g^\delta$ denote the measured conductance; then the optimization functional in the inner iteration can be written as

$$Q_{w_0,M_0}(q) = \frac{1}{2}\,\|B_{w_0,M_0}(A_{w_0,M_0}(q)) - g^\delta\|^2.$$

The overall iteration is given by

• outer iteration (index i):

$$q^i \to \begin{cases} w_0^i = w_0(q^i) \\ M_0^i = M_0(q^i), \end{cases}$$

• inner iteration (index n):

$$q_0^i = q^i, \quad w_0^i,\, M_0^i \text{ fixed},$$

$$q_{n+1}^i = q_n^i - \gamma_n \nabla_q Q_{w_0^i, M_0^i}(q_n^i),$$

with a subsequent projection of $q_{n+1}^i$ onto the admissible set and $q^{i+1} = q_{n+1}^i$ at the end of the inner iteration.

Having derived the linear surrogate model calibrated with the original PNP system, we are now going to investigate its performance with respect to identification of the confining potential $\mu_c$.

We will explore two different cases: In a first approximation we are going to ignore the short-range interaction terms and take the PNP system without $\mu_k^{\mathrm{ex}}$ as the full forward model F. In the second case we are going to include the hard-sphere part of the short-range interaction terms into F.

Reconstruction results for the first case and the two confining potentials from the last section are shown in Figures 3.7 and 3.8 (without data noise).

The reconstruction result is very good in this case where the full model does not include the short-range interaction terms. Nevertheless we have to mention that in some other test cases the iteration got stuck: no decrease of the optimization functional Q could be achieved in the inner iteration, presumably due to the accumulation of numerical errors.

A difficulty arising from the linear approach becomes apparent when taking a closer look at the model:

$$I = g \cdot (U - U_0) + I_0$$

with the conductance given by

$$g = e_0 \sum_{k=1}^{m-1} z_k j_k$$

and

$$I_0 = e_0 \sum_{k=1}^{m-1} z_k J_k(U_0).$$

[Figure 3.7: panel (a) shows $\exp(-\mu_c)$ versus x; panel (b), "Data residual", shows the residual in the outer iteration.]

Figure 3.7: Reconstruction results for identification of confining potential $\mu_c$ with linear surrogate model (no noise added to data). (a) Reconstruction results as $\exp(-\mu_c)$; blue: exact value; red: reconstruction; green: initial guess; (b) residual in outer iteration.

[Figure 3.8: panel (a) shows $\exp(-\mu_c)$ versus x; panel (b), "Data residual", shows the residual versus the number of outer iterations.]

Figure 3.8: Reconstruction results for identification of confining potential $\mu_c$ with linear surrogate model (no noise added to data). (a) Reconstruction results as $\exp(-\mu_c)$; blue: exact value; red: reconstruction; green: initial guess; (b) residual in outer iteration.

The $j_k$ needed for the computation of the conductance are determined by solving the linear PDE system (3.6) and thereby depend on the optimization parameter $q = \mu_c$. The reference current $I_0$, on the other hand, enters the surrogate model as a fixed quantity coming from the full forward model. It is therefore not changed during the inner iteration and only the conductance will be optimized. It could thus happen that the resulting new parameter value $q^{i+1}$ improves the data fit with respect to the slope, but the resulting reference current $I_0$ might even be further away from the true value, i.e. data curve and computed model output lie further apart than before the iteration step. In order to overcome this problem, either a way to include the direct parameter dependence of $I_0$ into the surrogate model has to be sought, or an initial guess and data sets whose reference currents are already close together should be used.

Next we turn our attention to the second case, where the full forward model F includes the hard-sphere short-range interactions. For this case we could not perform the optimization procedure on whole data sets; some data had to be excluded in order to get a decreasing residual during the inner iteration. This might be due to numerical problems that could not yet be resolved. An example of a reconstruction based on only part of the data set is shown in Figure 3.9. There the iteration got stuck after three outer steps since no further decrease in the outer residual could be achieved.

[Figure 3.9: plot of $\exp(-\mu_c)$ versus x.]

Figure 3.9: Reconstruction results for identification of confining potential $\mu_c$ with linear surrogate model (no noise added to data). Reconstruction results as $\exp(-\mu_c)$; blue: exact value; red: reconstruction; green: initial guess.

Another difficulty with the linear surrogate model could be that the term actually acting on the confined species is the combination of $\mu_c$ and the short-range interaction $\mu_m^{\mathrm{ex}}$, see (3.2),

$$\rho_m = K\, e^{-\mu_m^{\mathrm{ex}} - \mu_c}\, e^{-z_m c V}.$$

At the beginning of the iteration process the optimization of the confining potential may thus try to compensate for the wrong $\mu_m^{\mathrm{ex}}$. The $\mu_m^{\mathrm{ex}}$ in the true solution is in fact large where the confined species will be located. This is the region where $\mu_c$ is supposed to be small. But since at the beginning a wrong $\mu_m^{\mathrm{ex}}$ (and additionally wrong reference densities and electrostatic potential) is used in the linear model, $\mu_c$ might become large in these regions, trying to mimic the influence of the true $\mu_m^{\mathrm{ex}}$. Hence the iteration might also get stuck due to this effect.

3.4 Comparison of full model and surrogate models

In this section we are going to compare the different surrogate models from above and their performance relative to the full forward model. The main motivation behind using surrogate models for the identification task was to reduce the computational effort. The full PNP system including the self-consistent computation of the short-range particle interactions is by far the most time-consuming model due to the self-consistency iteration in every step. The noDFT model omits this part, resulting in the solution of a system of nonlinear coupled PDEs. Due to the nonlinearity the solution of this system has to be computed with an iterative procedure. Finally, for the linear surrogate model only the linear PDE system (3.6) has to be solved, which can be done in a single step without any iterative procedure. Hence the linear model is by far the fastest from the computational point of view.

In Figure 3.10(a) a comparison between the identification of the channel radius using only the full PNP model with hard-sphere interactions included (blue) and using the iterated algorithm with the linear surrogate model (red) is shown. We see that both approaches recover the exact value of the radius (given by the dotted line) sufficiently well, but the iterated algorithm performs a lot faster than the full model approach (see Figure 3.10(b)).

[Figure 3.10: panel (a) shows the radius [nm] versus the iteration number; panel (b), "Computation time", shows the computation time versus the iteration number.]

Figure 3.10: Comparison of full PNP model and iterated algorithm using the linear surrogate model for reconstruction of the channel radius. (a) Parameter reconstruction; blue: reconstruction with only full model (HS short-range interactions included); red: reconstruction with linear surrogate model in iterated algorithm; dotted line gives exact parameter value; (b) computation time for full model reconstruction (blue) and iterated algorithm (red).

Let us compare the noDFT and the linear surrogate model next. When taking the same number of data sets for reconstructions with the two surrogate models, it becomes apparent that in the noDFT case the amount of data actually used for the identification process is larger than for the linear surrogate model. One current-voltage curve from our above examples consisted of three different applied voltages, hence resulting in three different data points that can be used in the reconstruction process with the noDFT model. But the linear model has the conductance as an output instead of the currents, and one current-voltage curve only corresponds to one conductance. For example, for computations performed on twelve different data sets (i.e. twelve different bath concentrations) and three different applied voltages for each set, the reconstruction with the linear model is based on 12 data points (the conductances of the 12 data sets), while the reconstruction with the noDFT model relies on 36 data points (three currents for every set). As a consequence, more measurements should be performed for the linear surrogate model in order to arrive at the same number of different data points for the reconstruction.

If we compare the two surrogate models, we see that the linear model is more closely tied to the previous outer iteration than the noDFT model. As introduced above, the two surrogate models can be written as

$$S_{\mathrm{noDFT}}(q) = B_M(A_M(q))$$

for the noDFT model and

$$S_{\mathrm{lin}}(q) = B_{w_0,M_0}(A_{w_0,M_0}(q))$$

for the linear model. Recall that the short-range particle interactions M and the solution w of the full PNP system depend on the last iterate from the outer iteration. While $S_{\mathrm{noDFT}}$ depends on the previous outer iterate only via M, the linear model $S_{\mathrm{lin}}$ additionally depends on w and is hence coupled more strongly to the previous outer iteration step. Especially at the beginning of the overall iteration process it could hence be reasonable to perform only a few inner iterations, as the reference components are presumably still quite far from their true values.

As a conclusion from the numerical tests performed, we can say that the linear surrogate model has the highest potential for time savings, but its robustness in the reconstruction task is questionable when short-range interaction terms are to be included in the full forward model. The noDFT model performs much better under these circumstances. Comparing the reconstruction results from the first part, where only the full model has been used for the reconstructions, and the results from the noDFT surrogate model, we see that both approaches lead to qualitatively correct results. The faster performance of the noDFT model makes it a viable alternative to algorithms using only the full forward model.

After the numerical investigation of the surrogate approach above, we want to end this chapter by posing some open questions related to the mathematical analysis of this approach in general. The most apparent question concerns the convergence of the iterated surrogate model approach. Is it possible to find conditions for the involved operators such that the iteration can be shown to converge to a solution? In [103] Scherzer investigated convergence criteria for iterative methods based on Landweber iteration. It would be interesting to see if similar assumptions on the full operator F and the surrogate model S could be used to get results concerning the convergence of the iterated approach.

The inner iterations performed with the surrogate model could also be considered as using approximate adjoints. To illustrate this let us consider the following setup: let $u$ denote the state variables (e.g. the ion densities and electrostatic potential in the channel system) and $q$ the parameter to be optimized. By $B$ we denote the operator mapping the state $u$ to the data $y$, i.e. $B(u) = y$, and we demand that $u$ fulfills some side constraints $e(u, q) = 0$ (e.g. $u$ is the solution of a PDE system). The minimization task

\[
\frac{1}{2}\,\|B(u) - y\|^2 \to \min_q
\]

under the constraints $e(u, q) = 0$ leads to the following adjoint system ($\alpha$ denoting the Lagrange parameter):

\[
\begin{aligned}
B^*(B(u) - y) + \Bigl(\frac{\partial e}{\partial u}\Bigr)^{*} \alpha &= 0, \\
\Bigl(\frac{\partial e}{\partial q}\Bigr)^{*} \alpha &= 0, \\
e(u, q) &= 0.
\end{aligned}
\]

If we now assume that the surrogate model changes the side constraints in some way (e.g. from self-consistently computed short-range interaction terms in the PNP system to fixed ones), i.e. $\tilde{e}(u, q) = 0$, an interesting question is under which conditions on the surrogate model the resulting adjoint is still a good approximation to the original one. Assume for example that the side constraints can be posed in the form $e(u, q) = Au + Du - Cq$ and $\tilde{e}(u, q) = Au + Du_k - Cq$ ($u_k$ denoting the solution from the last iterate). Then we get

\[
\begin{aligned}
B^*(B(u) - y) + A^* \alpha + D^* \alpha &= 0, \\
-C^* \alpha &= 0, \\
Au + Du - Cq &= 0
\end{aligned}
\]

for the original system and

\[
\begin{aligned}
B^*(B(u) - y) + A^* \alpha &= 0, \\
-C^* \alpha &= 0, \\
Au + Du_k - Cq &= 0
\end{aligned}
\]

for the surrogate system.
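To get a feeling for how much the two adjoint systems can differ, the following minimal sketch compares the reduced gradients $-C^*\alpha$ obtained from the original and from the surrogate adjoint equation in a small finite-dimensional setting. Everything concrete here is an assumption made for illustration only: the matrices `A`, `D`, `C`, `B` are random, `D` is chosen small relative to `A`, and the state is evaluated at a fixed point $u_k = u$, so that both systems share the same state.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                   # hypothetical state/parameter dimension

# Hypothetical operators: A dominates, D is a small coupling term.
A = 2.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
D = 0.01 * rng.standard_normal((n, n))
C = np.eye(n)
B = rng.standard_normal((3, n))         # observation operator
y = rng.standard_normal(3)              # synthetic data
q = rng.standard_normal(n)              # current parameter iterate


def state(q):
    """Solve the original constraint (A + D) u = C q for u."""
    return np.linalg.solve(A + D, C @ q)


def objective(q):
    """Least-squares misfit 0.5 * ||B u(q) - y||^2."""
    r = B @ state(q) - y
    return 0.5 * r @ r


def gradient(q, adjoint_operator):
    """Reduced gradient -C^T alpha, alpha solving operator^T alpha = -B^T (B u - y)."""
    residual = B @ state(q) - y
    alpha = np.linalg.solve(adjoint_operator.T, -B.T @ residual)
    return -C.T @ alpha


g_exact = gradient(q, A + D)            # adjoint of the original system
g_surrogate = gradient(q, A)            # adjoint of the surrogate system (D dropped)

# Central finite differences as an independent check of the exact gradient.
eps = 1e-6
g_fd = np.array([
    (objective(q + eps * e_i) - objective(q - eps * e_i)) / (2 * eps)
    for e_i in np.eye(n)
])

cosine = g_exact @ g_surrogate / (
    np.linalg.norm(g_exact) * np.linalg.norm(g_surrogate)
)
print("cosine between exact and surrogate gradient:", cosine)
```

In this toy setting the surrogate gradient stays close to the exact one because `D` is a small perturbation of `A`; the sketch says nothing about the PNP system itself, where the size of the neglected coupling is exactly the open question.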

Chapter 4

Gating of Ion Channels

After discussing the important property of ion conduction through the open pore, we will now focus on a second fundamental property of ion channels: the gating behaviour. The term gating refers to the opening and closing of the channels, rendering the channel conductive or non-conductive, respectively. Since ion channels control, to a large extent, the contents and behaviour of a cell, they must be able to react to different conditions in their surroundings. A prominent example is the signal transduction along a nerve fibre. When the nerve membrane is depolarized to a certain extent, formerly closed Na+ channels open, allowing the influx of Na+ into the cell, thereby increasing the depolarization. In turn voltage-dependent K+ channels open, bringing (together with the inactivation of the Na+ channels) the membrane back to its resting potential. This generation of an action potential is just one example where the proper functioning of the gating mechanism is crucial for the correct behaviour of the cell.

In general, the gating of ion channels can be governed by different mechanisms. One can roughly distinguish three different types of channels (compare Figure 4.1):

• voltage-gated ion channels,

• ligand-gated ion channels,

• mechanically gated ion channels.

Also a combination of the above mechanisms is possible.

In the voltage-gated type, the channel protein comprises a region generally referred to as the voltage sensor. With this sensor the channel is able to detect and react to changes in the membrane potential. Voltage gating will be the mechanism of interest in this thesis and hence will be discussed in more detail later on. Generally, a change in membrane voltage is detected at the voltage sensor of the protein, inducing a conformational change. This in turn leads to the opening of the pore, rendering the channel conductive. For ligand-gated ion channels, a ligand, e.g. a second messenger, has to bind to (or possibly unbind from) a specific site at the channel protein. This will induce a conformational change in the protein, resulting in the opening or closing of the channel. Two prominent classes of ligand-gated ion channels are the ryanodine receptors (RyR) and the inositol 1,4,5-trisphosphate receptors (IP3R). Both of them are Ca2+ channels located in intracellular membranes. They mediate the release of Ca2+ from intracellular stores ([9]).


Figure 4.1: Types of ion channel gating mechanisms (figure taken from [3]).

The third type of channels gates in response to a mechanical stimulus. This can be deformations or strain in the membrane close to the channel protein, e.g. due to a stretching of the membrane. There exist different families of these mechanosensitive channels; for more information see e.g. [87] and references therein.

In the following we are going to focus on the class of voltage-gated ion channels only.

Although gating is an important part of the ion channel function, the process is still not understood in detail. As already mentioned above, the fact that a voltage-gated ion channel reacts to changes in membrane potential necessarily implies that the channel needs to have some kind of “device” to sense those variations. Since a change in membrane potential actually leads to a change in the electric field across the membrane, right from the beginning of investigations on channel gating it seemed reasonable to assume that the role of the “voltage sensing device” was taken by some charges present in the channel protein. First studies on the gating behaviour of voltage-sensitive channels began in the early fifties (e.g. [28], [52]) and since then a large number of studies have confirmed the existence of a voltage-sensing domain in the channel protein. This domain is frequently termed the voltage sensor of the ion channel.

Voltage-gated ion channels usually consist of four homologous domains (in the case of Na+ and Ca2+ channels) or a tetramer of four identical subunits (in the case of K+ channels) that together form the central pore for the conduction of ions ([16]). The individual subunits are made up of six transmembrane segments, labelled S1 to S6, where the segments S5 and S6 together with a re-entrant P-loop form the ion pore, and the remaining segments, S1-S4, make up the voltage-sensing domain. As several experimental studies have shown, the S4 segment throughout all voltage-gated channels shows a highly conserved sequence of positively charged amino acids that make this segment predestined to be the major player in the voltage-sensing machinery. Mutational experiments and electrophysiological measurements have confirmed the hypothesis that the charges located on the S4 segment contribute a major part of the measured gating charge when the membrane voltage is changed ([1]).

The movement of the voltage sensor due to a change in the membrane potential induces some other conformational change in the channel protein which finally leads to the opening of the pore. This coupling between the motion of the voltage sensor and the actual opening of the pore is not yet fully understood and continues to be a field of intensive research.

Over the past decades, besides experimental studies, theoretical models of channel gating have been employed in order to get a better understanding of the underlying processes. Perhaps one of the earliest models taking gating behaviour into account is the famous Hodgkin-Huxley model (see e.g. [35], [52]). Named after its developers, Alan Lloyd Hodgkin and Andrew Fielding Huxley, the model describes the total current flowing through the membrane of a squid giant axon as the sum of a potassium current $I_K$, a sodium current $I_{Na}$ and a leakage current $I_L$. The individual currents are given by

\[
\begin{aligned}
I_K &= g_K\, n^4\, (V - E_K), \\
I_{Na} &= g_{Na}\, m^3 h\, (V - E_{Na}), \\
I_L &= g_L\, (V - E_L),
\end{aligned}
\]

where $g_k$ denotes the maximal conductance with respect to $k$, $V$ is the applied electrostatic potential and $E_k$ gives the reversal potential for $k$. The newly introduced gating variables $n$, $m$ and $h$ account for the dynamic behaviour of the channels. They can be interpreted as fictitious gating particles, $n^4$ signifying the assumption that four “activation particles” have to be in a permissive state for the potassium channel to conduct ions (e.g. thinking of the four subunits of the channel protein). For the more complex behaviour of the sodium current Hodgkin and Huxley introduced an “activation particle” $m$ as well as an “inactivation particle” $h$. The functions $n(t)$, $m(t)$ and $h(t)$ are then found as solutions of first-order ordinary differential equations (see e.g. [58]).

Since then many more models have been developed to describe the gating behaviour of voltage-gated ion channels. The next section of this chapter will give an introduction to the different modelling approaches and how the different model types can be related.
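As a small illustration of the three current expressions, the sketch below evaluates them for given values of the gating variables. The default conductances and reversal potentials are commonly quoted textbook values for the squid giant axon (in mS/cm² and mV) and are assumptions of this example, as are the function name and the chosen gating values; none of them are taken from this thesis.

```python
def hodgkin_huxley_currents(V, n, m, h,
                            g_K=36.0, g_Na=120.0, g_L=0.3,
                            E_K=-77.0, E_Na=50.0, E_L=-54.4):
    """Return (I_K, I_Na, I_L) for a membrane potential V in mV.

    n, m, h are the gating variables in [0, 1]; with conductances in
    mS/cm^2 the currents are in uA/cm^2. All defaults are assumed
    textbook values, not parameters from the text.
    """
    I_K = g_K * n**4 * (V - E_K)          # four activation particles
    I_Na = g_Na * m**3 * h * (V - E_Na)   # three activation, one inactivation
    I_L = g_L * (V - E_L)                 # ohmic leak
    return I_K, I_Na, I_L


# Gating values roughly corresponding to a resting membrane (assumed).
I_K, I_Na, I_L = hodgkin_huxley_currents(V=-65.0, n=0.32, m=0.05, h=0.6)
print("I_K, I_Na, I_L:", I_K, I_Na, I_L)
```

Near rest the outward potassium current and the inward sodium and leak currents have opposite signs, which is the balance the gating variables shift during an action potential.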

4.1 Different models of channel gating

4.1.1 Discrete state Markov models

The most prominent class of models in the context of ion channel gating are the so-called discrete state Markov models (DSMMs). The underlying assumption for these models is that the channel system can reside in several well-defined states. The discrete states correspond to local energy minima of the system. In order to traverse to a different state a certain energy barrier has to be overcome. These passages among different states are characterized by transition rates. Hence in general DSMMs are basically composed of two elements: the number of discrete states and the corresponding transition rates among them. The simplest DSMM would be a system with only two states and the forward and backward transition rates between them. A sketch of this system can be seen in Figure 4.2.

The general notation is that the rate $k_{ij}$ refers to the transition from state $i$ into state $j$. Generally DSMMs can have any number of states and arbitrary connections among those states. Beginning from a basic sequential setup, where there is only one path to get from the first state to the last state, there can be circular or even more complex arrangements, where there are several possibilities for the system to get from one state to another. Circular models become extremely interesting when not only channel activation is studied, but when de- and inactivation is also taken into account. Since we are mainly concerned with the activation process after a voltage step in this thesis, we stick to the sequential Markov models in the following.


[Diagram: state 1 (open) and state 2 (closed), connected by the transition rates k12 and k21.]

Figure 4.2: Discrete state Markov model with two states.

A crucial point in using DSMMs is always the proper definition of the transition rates. A common approach taken in this respect is based on Eyring rate theory or Kramers reaction rate theory. A detailed derivation of the latter can be found in [48].

Based on these theories the transition rates are generally stated in the form

\[
k_{ij} = k^0_{ij} \exp\Bigl(-\frac{\Delta E_{ij}}{k_B T}\Bigr), \tag{4.1}
\]

where $k_B$ and $T$ denote the Boltzmann constant and absolute temperature, respectively, and $\Delta E_{ij}$ stands for the height of the energy barrier that needs to be overcome when traversing from state $i$ to state $j$. The prefactor $k^0_{ij}$ also has to be chosen appropriately.

When applying the above ansatz in the context of the DSMMs for voltage-induced channel gating, it is generally assumed that the energy barrier $\Delta E_{ij}$ can be decomposed into two parts. One is the protein-intrinsic potential landscape $G$ that does not depend on the applied membrane voltage, and the second part is contributed by a linear additive term that gives the influence of the membrane voltage on the overall energy landscape. Hence the rates in the context of channel gating are often expressed as

\[
k_{ij} = k^0_{ij} \exp\Bigl(-\frac{\Delta G_{ij}}{k_B T} \pm \frac{z_{ij} e_0 V}{k_B T}\Bigr)
       = \tilde{k}^0_{ij} \exp\Bigl(\pm \frac{z_{ij} e_0 V}{k_B T}\Bigr) \tag{4.2}
\]

(see e.g. [12], [117]), where the voltage-independent factor $\exp(-\Delta G_{ij}/(k_B T))$ has been absorbed into the prefactor $\tilde{k}^0_{ij}$.

In these expressions $z_{ij}$ denotes the amount of charge moving in the corresponding transition times the fraction of the field it traverses. $V$ is the electrostatic potential across the membrane and $e_0$ has the usual meaning of unit charge. The sign in front of the second term in the exponent is determined by the impact the membrane voltage has on the transition rate. So-called forward rates, meaning transitions towards an open state of the channel, have a positive sign in the exponent, reflecting the fact that for large negative voltages $V$ the rate towards the open state will be small, but will increase for positive potentials $V$. For the backward rates, i.e. transitions away from the open state, it is the converse. A negative sign leads to larger backward transition rates when a negative potential $V$ is applied and vice versa for positive potentials $V$.

This choice of signs is only appropriate for channels that tend to open upon depolarization (which is in fact the case for most of the voltage-gated channels). However, there are exceptions where the channel opens upon hyperpolarization, like several pacemaker channels and the potassium channel of Methanococcus jannaschii ([106]). In this case the signs have to be adapted appropriately.
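The voltage dependence in (4.2) can be made concrete with a short sketch. The helper below evaluates the second form of (4.2) for a forward or a backward rate; the function name, the numerical prefactor and effective valence, and the thermal voltage $k_B T / e_0 \approx 25.3$ mV (roughly room temperature) are assumptions chosen for illustration, and the sign convention is the one for channels that open upon depolarization.

```python
import math

# Thermal voltage k_B * T / e_0 in mV near room temperature (assumed value).
KB_T_OVER_E0_MV = 25.3


def transition_rate(k0, z, V_mV, forward=True):
    """Rate k0 * exp(+/- z * e0 * V / (kB * T)) as in the second form of (4.2).

    k0      -- prefactor including the voltage-independent barrier term
    z       -- effective valence (charge times fraction of the field)
    V_mV    -- membrane potential in mV
    forward -- True for transitions towards the open state (+ sign),
               False for transitions away from it (- sign)
    """
    sign = 1.0 if forward else -1.0
    return k0 * math.exp(sign * z * V_mV / KB_T_OVER_E0_MV)


# Hypothetical transition with k0 = 100 / s and effective valence z = 1.
for V in (-80.0, 0.0, 40.0):
    kf = transition_rate(100.0, 1.0, V, forward=True)
    kb = transition_rate(100.0, 1.0, V, forward=False)
    print(f"V = {V:6.1f} mV  forward = {kf:10.3f}  backward = {kb:10.3f}")
```

For a hyperpolarization-activated channel the two signs would simply be swapped.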

Once the number of states and the structure of the Markov model under consideration is fixed, a system of ordinary differential equations (ODEs) can be set up that describes the time development of the state probabilities. Let $N$ be the number of states in the Markov model, $s_i(t)$ denote the probability that the system is in state $i$ at time $t$ and $k_{ij}$ be the transition rates as introduced before. In the general setup the transition rates can also depend on time $t$, since they are dependent on the applied voltage $V$ (see (4.2)), which can change over time.

The probability that the channel is in state $i$ at time $t + dt$ can then be computed as

\[
s_i(t + dt) = s_i(t) + \sum_{\substack{j=1 \\ j \neq i}}^{N} \bigl[ s_j(t)\, k_{ji}(t)\, dt \bigr] - \sum_{\substack{j=1 \\ j \neq i}}^{N} \bigl[ s_i(t)\, k_{ij}(t)\, dt \bigr],
\]

where the second term on the right-hand side describes all the transitions into state $i$ and the last term accounts for the transitions out of state $i$. Transition rates among states that are not connected are set to zero.

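Subtracting $s_i(t)$ on both sides, dividing by $dt$ and taking the limit $dt \to 0$ we arrive at the ODE

\[
\frac{ds_i}{dt}(t) = \sum_{\substack{j=1 \\ j \neq i}}^{N} \bigl[ s_j(t)\, k_{ji}(t) \bigr] - \sum_{\substack{j=1 \\ j \neq i}}^{N} \bigl[ s_i(t)\, k_{ij}(t) \bigr] \tag{4.3}
\]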
for the probability that the system is in state $i$ at time $t$. The above equation holds for all $i = 1, ..., N$, and thus we end up with a system of $N$ coupled ODEs. In compact notation this can be written as

\[
\dot{S}(t) = A(t)\, S(t). \tag{4.4}
\]

Here the dot denotes the derivative with respect to time and $S(t) = [s_1(t) \, ... \, s_N(t)]^T$ is the vector of all state probabilities. $A(t) = [a_{ij}(t)]_{i,j=1,...,N}$ is the system matrix whose entries are composed of the transition rates via

\[
a_{ij}(t) = k_{ji}(t) \quad (i \neq j), \qquad
a_{ii}(t) = -\sum_{\substack{j=1 \\ j \neq i}}^{N} k_{ij}(t).
\]

The solution of this system of ODEs is formally given by

\[
S(t) = \exp\Bigl( \int_{t_0}^{t} A(\tau)\, d\tau \Bigr) S_0,
\]

where $S_0 = S(t_0)$ represents the probability distribution at initial time $t_0$. For time-independent rates the ODEs simplify to a system with constant coefficient matrix $A$,

\[
\dot{S}(t) = A\, S(t),
\]

and the corresponding solution

\[
S(t) = e^{A (t - t_0)} S_0.
\]
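For constant rates the construction of $A$ and the matrix-exponential solution can be sketched in a few lines. The three-state sequential model and its rate values below are hypothetical and chosen only so that the result is easy to check by hand; `generator_matrix` is a helper name introduced for this example, and `scipy.linalg.expm` is used for the matrix exponential.

```python
import numpy as np
from scipy.linalg import expm


def generator_matrix(k):
    """Build A with a_ij = k_ji (i != j) and a_ii = -sum_j k_ij.

    k[i, j] is the transition rate from state i to state j; diagonal
    entries of k are ignored.
    """
    k = np.array(k, dtype=float)
    np.fill_diagonal(k, 0.0)
    return k.T - np.diag(k.sum(axis=1))


# Hypothetical sequential three-state model: 1 <-> 2 <-> 3 (rates in 1/ms).
k = [[0.0, 2.0, 0.0],
     [1.0, 0.0, 3.0],
     [0.0, 0.5, 0.0]]
A = generator_matrix(k)

S0 = np.array([1.0, 0.0, 0.0])          # all channels start in state 1
for t in (0.0, 0.5, 2.0, 50.0):
    S = expm(A * t) @ S0                # S(t) = exp(A (t - t0)) S0 with t0 = 0
    print(f"t = {t:5.1f}  S = {S}  sum = {S.sum():.6f}")

S_late = expm(A * 50.0) @ S0
```

For these rates detailed balance gives the equilibrium distribution (1/15, 2/15, 12/15), which the solution approaches for large times while the probabilities keep summing to one.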

A detailed description of how to deal with the above systems can be found e.g. in [24] and [102]. In order to learn something about the behaviour of the solution of the ODE system, we first consider the eigenvalues of the system matrix $A$ in the following lemma:

Lemma 4.1. All eigenvalues of A have non-positive real parts.

Proof. $A$ is a singular matrix, since its columns sum to zero, i.e. one eigenvalue is given by $\lambda_1 = 0$. For the other eigenvalues we use the notion of Gerschgorin circles, which are defined by

\[
C_j = C\Bigl(a_{jj}, \sum_{\substack{i=1 \\ i \neq j}}^{N} |a_{ij}|\Bigr), \quad j = 1, ..., N,
\]

where $C(x, r)$ denotes the closed circle with radius $r$ around $x$. According to Gerschgorin's theorem, the spectrum of $A$, and hence its eigenvalues, are a subset of $\bigcup_{j=1}^{N} C_j$. By the definition of our diagonal elements, $a_{jj} \le 0$ and the radius of $C_j$ equals $\sum_{i \neq j} a_{ij} = -a_{jj}$, so every circle lies in the closed left half of the complex plane and touches the imaginary axis only at the origin. It thus follows that all eigenvalues have non-positive real parts.

With this lemma it is now easy to show the following behaviour of the ODE solution.

Proposition 4.2. Let A have pairwise different eigenvalues $\lambda_i$, $i = 1, ..., N$. The solution of the ODE converges towards the equilibrium solution as a sum of $N - 1$ exponentials.

Proof. The matrix $A$ is diagonalizable and can be expressed as

\[
A = V \Lambda V^{-1},
\]

where $\Lambda = \mathrm{diag}(\lambda_1, ..., \lambda_N)$ and $V = [v_1 \; v_2 \; ... \; v_N]$ is composed of eigenvectors. Let $v^j$ denote the $j$th row of $V^{-1}$. The solution of the ODE can be rewritten as

\[
S(t) = V e^{\Lambda (t - t_0)} V^{-1} S_0
     = \Bigl[ \sum_{i=1}^{N} (v_i \cdot v^i)\, e^{\lambda_i (t - t_0)} \Bigr] S_0 \tag{4.5}
\]

(note that $v_i \cdot v^i$ is a matrix). Let $\lambda_1 = 0$, then $v_1$ can be taken as the equilibrium solution $S(\infty)$, since $A S(\infty) = 0$, and $v^1 = [1 \; 1 \; ... \; 1]$. It follows that

\[
(v_1 \cdot v^1)\, S_0 = S(\infty),
\]

where we have used the fact that $\sum_{j=1}^{N} S_j(t_0) = 1$. Inserting this expression into (4.5) gives

\[
S(t) = S(\infty) + \Bigl[ \sum_{i=2}^{N} (v_i \cdot v^i)\, e^{\lambda_i (t - t_0)} \Bigr] S_0,
\]

where the sum of the $N - 1$ exponentials goes to zero as $t \to \infty$, since $\mathrm{Re}(\lambda_i) < 0$ for $i = 2, ..., N$ (since we have assumed pairwise different eigenvalues and from $\mathrm{Re}(\lambda) = 0$ it automatically follows that $\lambda = 0$, compare proof of Lemma 4.1).
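The spectral representation (4.5) can be checked numerically on a small example. The sketch below uses a hypothetical $3 \times 3$ system matrix of a sequential three-state model (columns summing to zero, eigenvalues 0, −1.5 and −5); the matrix entries and variable names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical system matrix of a sequential three-state model.
A = np.array([[-2.0,  1.0,  0.0],
              [ 2.0, -4.0,  0.5],
              [ 0.0,  3.0, -0.5]])
S0 = np.array([1.0, 0.0, 0.0])

lam, V = np.linalg.eig(A)
order = np.argsort(-lam.real)           # sort so that lambda_1 = 0 comes first
lam, V = lam[order], V[:, order]
V_inv = np.linalg.inv(V)                # rows of V_inv are the vectors v^i


def S(t):
    """Evaluate (4.5): sum_i (v_i v^i) exp(lambda_i t) S0 with t0 = 0."""
    terms = [np.outer(V[:, i], V_inv[i, :]) @ S0 * np.exp(lam[i] * t)
             for i in range(len(lam))]
    return np.sum(terms, axis=0).real


# The lambda_1 = 0 term alone is the equilibrium solution S(infinity).
S_inf = (np.outer(V[:, 0], V_inv[0, :]) @ S0).real

print("eigenvalues:", lam.real)
print("S(infinity):", S_inf)
print("S(5.0):     ", S(5.0))
```

The remaining two terms decay with the rates 1.5 and 5, so the relaxation towards equilibrium is a sum of $N - 1 = 2$ exponentials, as stated in Proposition 4.2.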


The quantities measured in experiments are generally the macroscopic ionic currents and the gating currents. The macroscopic ionic current $I_{ion}$ is defined by

\[
I_{ion}(t) = M_c \cdot \gamma_s(V(t)) \cdot (V(t) - V_{eq}) \cdot P_{open}(t), \tag{4.6}
\]

under the assumption that there are only identical channels in the membrane patch under consideration with the single-channel conductance $\gamma_s$. $M_c$ denotes the number of channels and the term $(V(t) - V_{eq})$ gives the driving force, $V(t)$ being the applied membrane potential and $V_{eq}$ the equilibrium potential. The open probability $P_{open}(t)$ can be determined from the solution of the ODE system (4.4) by summing up all the state probabilities $s_i(t)$ that correspond to a conducting state of the channel:

\[
P_{open}(t) = \sum_{i \in \Lambda} s_i(t) \tag{4.7}
\]

with $\Lambda$ denoting the index set of conducting states. In the simplest case where there is only one conducting state, e.g. state 1, the open probability is $P_{open}(t) = s_1(t)$.

The macroscopic gating current $I_{gate}$ can be defined as (see e.g. [117])

\[
I_{gate}(t) = \Bigl[ \sum_{(i,j) \in FT} z_{ij}\, k_{ij}(t)\, s_i(t) - \sum_{(i,j) \in BT} z_{ij}\, k_{ij}(t)\, s_i(t) \Bigr] \cdot e_0 \cdot M_c. \tag{4.8}
\]

Here $FT$ refers to the set of all forward transitions, i.e. those transitions contributing a positive part to the gating current, and $BT$ to the set of all backward transitions, i.e. those transitions that contribute a negative part to the gating current. Here $e_0$ again is the unit charge and $M_c$ the number of channels.

For the simpler case of a sequential model (i.e. $k_{ij} = 0$ for $|i - j| > 1$) with $N$ states the gating current then is given by

\[
I_{gate}(t) = \sum_{i=1}^{N-1} \bigl[ z_{i(i+1)}\, k_{i(i+1)}(t)\, s_i(t) - z_{(i+1)i}\, k_{(i+1)i}(t)\, s_{i+1}(t) \bigr] \cdot e_0 \cdot M_c
\]

when we consider the open states to be associated with higher numbers than the closed states.

When using Markov models to describe the gating behaviour of ion channels, usually the transition rates cannot be measured directly in the experiments. Hence they have to be determined by fitting the model output to the measured data. The data stemming from experiments is to a large extent made up of macroscopic ionic currents and gating currents, sometimes supplemented by single-channel currents. The transition rates then represent the free parameters of the system. It turns out that for certain properties of the measured data quite a large number of states would have to be included in the model in order to get a satisfactory result ([117]). (As an example up to 24 states would have to be used for a simplified sequential model that includes inactivation, see [113].) One such specific data property for example would be the Cole-Moore shift, a time delay in the development of the macroscopic ionic current when starting from more hyperpolarized potentials. This effect was first recognized by Kenneth S. Cole and John W. Moore in the 1950s when they did their famous experiments on the squid giant axon ([23]). A more detailed description of the Cole-Moore effect will be given in a later chapter dealing with the analysis of the proposed models.


The introduction of more and more states naturally increases the number of transition rates needed in order to describe the system. Keeping in mind the rate theory approach (4.2), each transition rate is determined by two parameters, the prefactor and the effective valence $z_{ij}$. The number of free parameters in the system consequently increases dramatically when introducing several new states. With more free parameters to tune the system, it becomes easier to fit a large range of data satisfactorily. The drawback, however, is that in a large parameter space there might exist several local or global minimizers and thus different combinations might lead to the desired output. Unique identification of physically meaningful parameters becomes more unlikely when increasing the number of states. In order to get usable results out of a fitting procedure, additional constraints have to be imposed on the rates, such as assuming equality of several transition rates. Another crucial point to be kept in mind is the expression of the gating current. The rate theory approach is only valid if large energy barriers are separating the individual states. In this context “large” refers to barriers of several $k_B T$. Ways of expressing transition rates between states that are separated only by a modest energy barrier are not included in this rate theory approach.

From Markov models to Fokker-Planck

Keeping in mind the necessity to include a sufficiently large number of states in a Markov model description of channel gating in order to obtain a certain characteristic behaviour, an attractive approach is to step from a discrete many-state Markov model to a continuous description. In this section we will start out with a sequential model as can be found e.g. in [12], and derive a continuum description of this one-dimensional gating process, where the kinetics are described by diffusion coefficients and some potential landscape, instead of rate constants. The relation between Markov models and continuous models has also been presented by Sigg and his coworkers ([109]), where they start with the general Smoluchowski equation and use it to derive rate constants for a DSMM under the assumption of sufficiently high barriers. We would like to go the other direction, starting from a discrete Markov description, and show the natural emergence of a continuous characterization.

For the sake of simplicity we deal just with the case of sequential Markov models. The treatment could be expanded to different types of Markov models as well, e.g. a model representing the parallel action of four different voltage sensors (see e.g. [108]). The outcome would then not be a one-dimensional Fokker-Planck model (as in the case of sequential DSMMs), but a higher-dimensional version.

The initial model consists of a sequence of $N + 1$ discrete states as is sketched in Figure 4.3. We assume the state denoted by 0 to refer to the open state. To each transition a forward ($\alpha$) and a backward ($\beta$) rate constant is assigned, as well as an effective amount of charge ($z$) that is associated with this transition and contributes to the gating charge of the system. In fact this will correspond to the charge moved in reality times the fraction of the electric field, since one elementary charge travelling a certain distance in the electric field will produce the same current in an external circuit as two elementary charges travelling half the distance.

Using the notation introduced in Figure 4.3 and denoting by $P(n, t)$ the probability that the channel is in state $n$ at time $t$, a system of ordinary differential equations for the state probabilities can be derived in the usual manner. This leads to

\[
\frac{dP}{dt}(n, t) = \alpha(n + \tfrac{1}{2})\, P(n + 1, t) + \beta(n - \tfrac{1}{2})\, P(n - 1, t) - \bigl[ \alpha(n - \tfrac{1}{2}) + \beta(n + \tfrac{1}{2}) \bigr] P(n, t), \tag{4.9}
\]


[Diagram: chain of states 0, 1, 2, ..., N-1, N. The transition between neighbouring states n and n+1 carries the forward rate α(n+1/2), the backward rate β(n+1/2) and the effective charge z(n+1/2), for n = 0, ..., N-1.]

Figure 4.3: Sequential DSM model with N + 1 states.

where the rates at the boundaries (for $n = 0$, $n = N$) are taken as zero. So far this is equivalent to the general Markov models that we have introduced in the last section. We just changed the notation a little bit. The basic idea now is to introduce more and more substates in between the already fixed Markov states, forcing $N$ to become large (in the limiting case $N \to \infty$). One has to note that by introducing the substates into the system the assumption that the different states are separated by large energy barriers eventually breaks down. The states no longer necessarily correspond to clearly defined energy wells, but can also refer to energy states in between. Thus the Eyring rate theory approach, which relies on the presence of sufficiently large barriers, is no longer applicable when introducing a large number of substates. On the contrary, while the transition rates in the original high-barrier Markov case could greatly vary from one state to the next, we now assume that due to the introduction of enough substates, the transition coefficients tend to vary smoothly among neighbouring states.

Next we introduce a change in variables and set $x_i = \frac{i}{N}$ for $i = 0, ..., N$, $h := \frac{1}{N}$ and furthermore define the probability density function

\[
p(x, t) = N \cdot P(Nx, t).
\]

In a similar fashion we define $\bar{\alpha}(x) = \alpha(Nx)$ and $\bar{\beta}(x) = \beta(Nx)$. For the sake of simplicity we are going to omit the bar over $\alpha$ and $\beta$ in the following. The use of the variable $x$ scales our system under consideration to the interval $\Omega := [0, 1]$ and inserting additional substates into the model can visually be interpreted as considering a denser and denser packing of the interval $[0, 1]$.

Using the new quantities equation (4.9) reads as

\[
\frac{\partial p}{\partial t}(x_i, t) = \alpha(x_i + \tfrac{h}{2})\, p(x_i + h, t) + \beta(x_i - \tfrac{h}{2})\, p(x_i - h, t) - \bigl[ \alpha(x_i - \tfrac{h}{2}) + \beta(x_i + \tfrac{h}{2}) \bigr] p(x_i, t). \tag{4.10}
\]

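Taylor expansion up to second order around $x_i$ yields

\[
\begin{aligned}
\frac{\partial p}{\partial t}(x_i, t)
&= \alpha(x_i + \tfrac{h}{2})\, p(x_i + h, t) + \beta(x_i - \tfrac{h}{2})\, p(x_i - h, t) - \bigl[ \alpha(x_i - \tfrac{h}{2}) + \beta(x_i + \tfrac{h}{2}) \bigr] p(x_i, t) \\
&= \alpha(x_i) p(x_i, t) + h \alpha(x_i) \frac{\partial p}{\partial x}(x_i, t) + \frac{h}{2} \frac{\partial \alpha}{\partial x}(x_i) p(x_i, t) \\
&\quad + \frac{h^2}{2} \alpha(x_i) \frac{\partial^2 p}{\partial x^2}(x_i, t) + \frac{h^2}{2} \frac{\partial \alpha}{\partial x}(x_i) \frac{\partial p}{\partial x}(x_i, t) + \frac{h^2}{8} \frac{\partial^2 \alpha}{\partial x^2}(x_i) p(x_i, t) \\
&\quad + \beta(x_i) p(x_i, t) - h \beta(x_i) \frac{\partial p}{\partial x}(x_i, t) - \frac{h}{2} \frac{\partial \beta}{\partial x}(x_i) p(x_i, t) \\
&\quad + \frac{h^2}{2} \beta(x_i) \frac{\partial^2 p}{\partial x^2}(x_i, t) + \frac{h^2}{2} \frac{\partial \beta}{\partial x}(x_i) \frac{\partial p}{\partial x}(x_i, t) + \frac{h^2}{8} \frac{\partial^2 \beta}{\partial x^2}(x_i) p(x_i, t) \\
&\quad - \alpha(x_i) p(x_i, t) + \frac{h}{2} \frac{\partial \alpha}{\partial x}(x_i) p(x_i, t) - \frac{h^2}{8} \frac{\partial^2 \alpha}{\partial x^2}(x_i) p(x_i, t) \\
&\quad - \beta(x_i) p(x_i, t) - \frac{h}{2} \frac{\partial \beta}{\partial x}(x_i) p(x_i, t) - \frac{h^2}{8} \frac{\partial^2 \beta}{\partial x^2}(x_i) p(x_i, t) + O(h^3) \\
&= h \frac{\partial}{\partial x} \bigl[ (\alpha(x_i) - \beta(x_i))\, p(x_i, t) \bigr] + \frac{h^2}{2} \frac{\partial}{\partial x} \Bigl[ (\alpha(x_i) + \beta(x_i)) \frac{\partial p}{\partial x}(x_i, t) \Bigr] + O(h^3).
\end{aligned}
\]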

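Defining $D(x) = \frac{h^2}{2} [\alpha(x) + \beta(x)]$ and $\vartheta(x) = h [\alpha(x) - \beta(x)]$ yields a Fokker-Planck like equation for the probability density distribution $p$ for $N$ sufficiently large (neglecting terms of order $O(h^3)$):

\[
\frac{\partial p}{\partial t} = \frac{\partial}{\partial x} \Bigl[ \vartheta p + D \frac{\partial p}{\partial x} \Bigr]. \tag{4.11}
\]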
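The order of the neglected terms can be checked numerically. The following sketch evaluates the right-hand side of the discrete equation (4.10) and of the continuous expression for hypothetical smooth choices $\alpha(x) = 3 + x$, $\beta(x) = 1$ and a test function $p(x) = e^x$; these functions are assumptions made only so that all derivatives can be written down by hand. If the difference is indeed $O(h^3)$, halving $h$ should reduce it by a factor of about 8.

```python
import math


def alpha(x):
    return 3.0 + x        # hypothetical smooth forward rate, alpha' = 1


def beta(x):
    return 1.0            # hypothetical constant backward rate, beta' = 0


def p(x):
    return math.exp(x)    # hypothetical smooth test function, p' = p'' = p


def discrete_rhs(x, h):
    """Right-hand side of the discrete equation (4.10)."""
    return (alpha(x + h / 2) * p(x + h)
            + beta(x - h / 2) * p(x - h)
            - (alpha(x - h / 2) + beta(x + h / 2)) * p(x))


def continuum_rhs(x, h):
    """h d/dx[(alpha - beta) p] + h^2/2 d/dx[(alpha + beta) dp/dx],
    with the derivatives of the choices above inserted by hand."""
    drift = h * (1.0 * p(x) + (alpha(x) - beta(x)) * p(x))
    diffusion = 0.5 * h**2 * (1.0 * p(x) + (alpha(x) + beta(x)) * p(x))
    return drift + diffusion


def error(x, h):
    return abs(discrete_rhs(x, h) - continuum_rhs(x, h))


ratio = error(0.5, 0.01) / error(0.5, 0.005)
print("error ratio when halving h:", ratio)   # close to 8 for an O(h^3) remainder
```

The check concerns only the consistency of the expansion at an interior point; it does not address the boundary states or the convergence of solutions.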

Together with the no-flux boundary conditions and initial condition

\[
\vartheta p + D \frac{\partial p}{\partial x} = 0 \quad \text{for } x = 0,\ x = 1,\ \forall t, \tag{4.12}
\]

\[
p(x, 0) = p_0(x), \quad x \in \Omega, \tag{4.13}
\]

equation (4.11) represents the continuous model for the probability density distribution.

To simplify matters we have assumed the rates in the above treatment to be time-independent. But the whole procedure outlined above can also be carried out with time-dependent transition rates $\alpha = \alpha(x, t)$ and $\beta = \beta(x, t)$, leading to the same continuous model stated in (4.11), now just with time-dependent coefficients $\vartheta$ and $D$. The voltage-dependence of the above model is now hidden in the two coefficient functions $\vartheta$ and $D$. It might be reasonable to assume mainly $\vartheta$ to be voltage-dependent and $D$ to be voltage-independent, since $D$ as defined above is composed of the sum of forward and backward transition rates, of which one decreases while the other increases for a change in membrane voltage.

The open probability of the channel corresponds to

\[
P_{open}(t) = \int_I p(x, t)\, dx, \tag{4.14}
\]

where $I \subset \Omega$ denotes the interval that is associated with the open states. More generally, the open probability can also be defined as

\[
P_{open}(t) = \int_\Omega \omega(x)\, p(x, t)\, dx, \tag{4.15}
\]

where the function $\omega : \Omega \to [0, 1]$ is a weighting factor giving the probability that the channel is open when it is in a certain state. For example $\omega(x) = 0$ when the channel is definitely non-conducting and $\omega(x) = 1$ when the channel is certainly open when being in state $x$. If there is a certain chance that the channel might open when being in state $x$, $\omega(x)$ can be assigned a value between 0 and 1.

As we have seen in the preceding derivation, the continuous model arises naturally from the discrete Markov approach, when introducing a large number of substates into the system. Note that no explicit assumptions on the Markovian transition rates had to be made in order to derive the continuous limit. Hence the treatment is also applicable in cases where no large energy barriers are involved and the Eyring rate theory approach would break down.

In the following paragraph we are going to consider a more general version of the Fokker-Planck type model, introducing the corresponding expression of the gating current for the continuous model.

4.1.2 Fokker-Planck type models

In the preceding paragraph we have used the Markov model approach for channel gating to derive a continuum description of the process and to show the connection between the two model classes. In this section we are going to formulate a more physical approach to derive a Fokker-Planck model of channel gating. This approach is based on the Langevin equations of motion. The underlying assumption is that all relevant particles in the system behave according to the Langevin equation of motion, a stochastic differential equation that describes the Brownian motion of a particle in some potential. The Langevin equation in its most general form is given by

\[
\dot{x} = H(x, t) + W(t).
\]

The right-hand side describes the force exerted on the particle under consideration. The first part $H(x, t)$ gives the deterministic contribution and $W(t)$ gives the stochastic contribution which is independent of the state $x$. For a technical introduction to stochastic processes see e.g. [57]. In our case the deterministic part $H(x, t)$ will be made up of the interaction terms among the particles forming our system (protein charges and free ions) like size exclusion forces, and electrostatic forces acting on the particles due to the other charges in the system and external forces like an applied electrostatic potential. Let us assume in the following that our system is composed of $M$ relevant particles that we want to include in the modelling process. Each particle has a charge $z_j e_0$, a position vector $x_j(t) = (x_{j,1}(t), x_{j,2}(t), x_{j,3}(t))^T$ in the general three-dimensional setup and a velocity $v_j(t) = (v_{j,1}(t), v_{j,2}(t), v_{j,3}(t))^T$ at time $t$. Let $x(t) = (x_j(t))^T_{j=1,...,M}$ denote the position vector of all $M$ particles ($x \in \mathbb{R}^{3M}$). Then the deterministic force $H_j(x, t)$ acting on the $j$th particle can be written as

\[
H_j(x, t) = -\eta_j(x_j) \bigl( \nabla_{x_j} \mu_j(x) + z_j e_0 \nabla_{x_j} V_j(x_j, t) \bigr). \tag{4.16}
\]

$\eta_j$ is the mobility of particle $j$ and can vary with the position of the particle. The interaction potential $\mu_j(x)$ depends on the position of all other particles in the system and can for example be taken as pairwise interaction potentials:

\[
\mu_j(x) = \sum_{\substack{i=1 \\ i \neq j}}^{M} \mu_{ij}(x_i, x_j).
\]

The electrostatic potential $V_j$ is determined by the Poisson equation

\[
-\nabla \cdot (\epsilon \nabla V_j) = \sum_{\substack{i=1 \\ i \neq j}}^{M} z_i e_0 \delta_{x_i}, \tag{4.17}
\]


where $\epsilon$ is the dielectric coefficient and $\delta_x$ denotes the Dirac delta function centered at $x$. The boundary conditions to (4.17) are given by homogeneous Neumann conditions on insulated parts of the boundary and by the Dirichlet boundary conditions

\[
V = U \quad \text{on } \Gamma_1, \tag{4.18}
\]

\[
V = 0 \quad \text{on } \Gamma_2 \tag{4.19}
\]

on the parts where the electrostatic potential is maintained. This means that at $\Gamma_1 \subset \partial\Omega$ the potential $U$ is applied, and the part $\Gamma_2 \subset \partial\Omega$ is grounded to zero. Here $\Omega \subset \mathbb{R}^3$ denotes the three-dimensional system domain.

Because of the linearity of the Poisson equation we can split the electrostatic potential $V(y)$ ($y \in \mathbb{R}^3$) generated by all charges in the system into the two contributions

\[
V(y) = u(y) + \sum_{i=1}^{M} z_i e_0\, G(y, x_i),
\]

where $u$ satisfies the Laplace problem

\[
-\nabla \cdot (\epsilon \nabla u) = 0
\]

with the Dirichlet boundary conditions

\[
u = U \quad \text{on } \Gamma_1
\]

and

\[
u = 0 \quad \text{on } \Gamma_2,
\]

and $G$ denotes the Green's function for the Laplace problem, i.e. $G$ satisfies

\[
-\nabla_x \cdot (\epsilon \nabla_x G(x, y)) = \delta_y
\]

together with homogeneous boundary conditions.

The stochastic contribution $W(t)$ is modelled as a Brownian motion.

Hence the Langevin equation of motion for the $j$th particle reads as

\[
dx_j = -\eta_j(x_j) \Bigl[ \nabla_{x_j} \mu_j(x) + z_j e_0 \nabla u(x_j) + z_j e_0 \sum_{\substack{i=1 \\ i \neq j}}^{M} z_i e_0 \nabla_{x_j} G(x_j, x_i) \Bigr] dt + \sigma_j\, dW_j(t). \tag{4.20}
\]

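The corresponding probability density function $p(x, t)$ of the $M$ particles is then given by the Fokker-Planck equation

\[
\frac{\partial p}{\partial t}(x, t) = \sum_{j=1}^{M} \nabla_{x_j} \cdot \Bigl\{ \eta_j(x_j) \Bigl[ \nabla_{x_j} \mu_j(x) + z_j e_0 \nabla u(x_j) + z_j e_0 \sum_{\substack{i=1 \\ i \neq j}}^{M} z_i e_0 \nabla_{x_j} G(x_j, x_i) \Bigr] p(x, t) + D_j \nabla_{x_j} p(x, t) \Bigr\}. \tag{4.21}
\]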


A detailed derivation of multivariate Langevin and Fokker-Planck equations can be found, amongst others, in [43]. Note that $p$ maps from $\mathbb{R}^{3M} \times \mathbb{R}_+$ ($3M$ space dimensions plus time) to $\mathbb{R}_+$. If we now define the probability density flux

\[
J_j = -\bigg[ \eta_j(x_j)\Big(\nabla_{x_j}\mu_j(x) + z_j e_0 \nabla u(x_j) + z_j e_0 \sum_{\substack{i=1 \\ i \neq j}}^{M} z_i e_0 \nabla_{x_j} G(x_j, x_i)\Big)\, p(x,t) + D_j \nabla_{x_j} p(x,t) \bigg], \tag{4.22}
\]

equation (4.21) can be written in the standard continuity form

\[
\frac{\partial p}{\partial t}(x,t) = -\sum_{j=1}^{M} \nabla_{x_j} \cdot J_j. \tag{4.23}
\]

The above partial differential equation for the probability density function is supplemented with no-flux boundary conditions,

\[
J_j \cdot n = 0 \quad \text{on } \partial\Omega, \quad j = 1, \dots, M, \tag{4.24}
\]

and the initial condition

\[
p(x, 0) = p_0(x), \quad x \in \Omega^M. \tag{4.25}
\]

In order to be able to compare the model with experimental output, the open probability $P_{\mathrm{open}}(t)$ is defined as before to be

\[
P_{\mathrm{open}}(t) = \int_{\Omega^M} \omega(x)\, p(x,t)\, dx
\]

(see equation (4.15)).

What we are missing so far is an expression for determining the gating current from the Fokker-Planck model. Since gating currents are readily measurable in many experiments, every model concerned with gating should in some way be able to include this additional information. The next paragraph of this section will hence be dedicated to the derivation of an expression linking the probability density function $p$ to the gating current.

Gating current from Fokker-Planck models

The term gating current refers to the movement of charged particles within the channel protein when the applied electrostatic potential is changed. In experimental measurements of the gating current using the voltage-clamp setup, the total current flowing between two electrodes is recorded, while one electrode is held at a fixed potential $U$ and the other is grounded to zero. With appropriate techniques the current due to ion conduction through the pore can be eliminated (e.g. using pore blocking), but the current detected at the electrodes is still composed of two different contributions ([85]): the free bath ions entering or leaving the electrodes, and the displacement current due to charges that move between the two electrodes without reaching them (e.g. charged residues moving within the channel protein change the electric field and consequently induce a charge movement in the electrodes). The total current flowing through the electrodes is given by the so-called Ramo-Shockley theorem ([89], [107]). This theorem states that the current measured in the external circuit of the electrodes due to a charge moving in between is given by the product of the charge $z e_0$, its velocity $v$ and the electric field generated from a unit potential $\Phi$:

\[
I = -\frac{1}{1\,\mathrm{Volt}}\, z e_0\, \nabla\Phi \cdot v.
\]

The unit potential $\Phi$ in this equation is determined by applying 1 Volt across the electrodes with all charges removed from the system. Only the dielectric geometry ([85]) of the system is kept. As a consequence, the potential $\Phi$ depends neither on the positions or velocities of the charges moving in the system nor on the experimentally applied potential, and can be treated as a fixed quantity as long as the dielectric geometry is not changed. On the other hand, the velocity $v$ of the particle does depend on the experimental voltage and on the other charges that are in fact present in the system.

According to the superposition principle the total current is then given as the sum over all moving charges in the system:

\[
I_{RS}(t) = -\frac{1}{1\,\mathrm{Volt}} \sum_{j=1}^{M} z_j e_0\, \nabla\Phi(x_j) \cdot v_j(t). \tag{4.26}
\]
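The sum in (4.26) can be evaluated directly once the gradient of the unit potential at each particle position and the particle velocities are known. The following is a minimal sketch only: the function name, the planar geometry with a linearly decaying unit potential, the thickness `L` and the velocities are illustrative assumptions and are not taken from the text.

```python
VOLT = 1.0             # the "1 Volt" normalisation appearing in (4.26)
E0 = 1.602176634e-19   # elementary charge in coulomb


def ramo_shockley_current(valences, grad_phi, velocities):
    """Evaluate I_RS = -(1/1V) * sum_j z_j e0 grad(Phi)(x_j) . v_j.

    valences   : list of z_j
    grad_phi   : list of 3-tuples, gradient of the unit potential at x_j (V/m)
    velocities : list of 3-tuples, particle velocities v_j (m/s)
    """
    total = 0.0
    for z, g, v in zip(valences, grad_phi, velocities):
        dot = sum(gk * vk for gk, vk in zip(g, v))
        total += z * E0 * dot
    return -total / VOLT


# Hypothetical planar geometry: the unit potential drops linearly from 1 V
# to 0 V over a thickness L along the first coordinate axis.
L = 4e-9
grad = (-VOLT / L, 0.0, 0.0)

# Two monovalent charges: one moving along the field axis, one moving
# perpendicular to it (the second one contributes nothing to the current).
i_rs = ramo_shockley_current([1, 1], [grad, grad],
                             [(1e-3, 0.0, 0.0), (0.0, 2e-3, 0.0)])
```

In this assumed geometry only the velocity component along the gradient of the unit potential contributes, which illustrates why charge movement parallel to the membrane plane would not be visible in the recorded current.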

As before, we consider a large number of identical channels located in the membrane patch of interest. Then, due to the law of large numbers, the total current measured at the electrodes can be well described by

\[
I_g(t) \approx M_c \cdot E[I_{RS}],
\]

the number of channels $M_c$ times the expectation value $E[I_{RS}]$ of the Ramo-Shockley current (4.26) for a single-channel system. This expression for the gating current is only an approximation, since on the one hand the limiting case of infinitely many channels in the membrane patch will never occur and on the other hand interactions among the individual channels are not taken into account. Depending on the location and density of the channels in the membrane patch, the channels would interact via the electric field and hence influence each other. Also the externally applied electric field will be slightly inhomogeneous for the individual channel proteins, depending on the relative location of ion channel and electrode.

Nevertheless, assuming these interactions and inhomogeneities in the electric field to be negligible, for a sufficiently large number of channels $I_g(t) = M_c \cdot E[I_{RS}]$ is an adequate approximation of the gating current.

The expected value $E[I_{RS}]$ is determined via

\[
\begin{aligned}
E[I_{RS}] &= \int_\Omega \int_\Omega \dots \int_\Omega p\, I_{RS}\; dx_1 dx_2 \dots dx_M \\
&= \sum_{j=1}^{M} \int_\Omega \int_\Omega \dots \int_\Omega -\frac{1}{1\,\mathrm{Volt}}\, z_j e_0\, \nabla\Phi(x_j) \cdot v_j\, p\; dx_1 dx_2 \dots dx_M \\
&= \sum_{j=1}^{M} \int_\Omega \int_\Omega \dots \int_\Omega -\frac{1}{1\,\mathrm{Volt}}\, z_j e_0\, \nabla\Phi(x_j) \cdot J_j\; dx_1 dx_2 \dots dx_M
\end{aligned}
\]

and hence the gating current can be computed as

\[
I_g(t) = M_c \cdot \sum_{j=1}^{M} \int_\Omega \int_\Omega \dots \int_\Omega -\frac{1}{1\,\mathrm{Volt}}\, z_j e_0\, \nabla\Phi(x_j) \cdot J_j\; dx_1 dx_2 \dots dx_M, \tag{4.27}
\]


where the probability flux $J_j$ is given by equation (4.22). Note that $M_c$ stands for the number of channels in the membrane patch under consideration, while $M$ denotes the number of particles contributing to a single-channel system. Equations (4.27) and (4.15) now establish the relation between the measurable quantities and the Fokker-Planck model (4.22)-(4.25) for the probability density function.

Before using the above model for some analytical investigations with respect to certain features in data, we would like to mention yet another modelling approach in the context of channel gating, based on single-channel statistics.

4.1.3 Statistics from single channel recordings

The approach presented in this section is quite different from the model classes mentioned so far (discrete state Markov models and Fokker-Planck models). While the previous models start from a more or less physical description of the gating process (transitions among several states corresponding to different conformations of the channel in the DSMM case, or equations of motion for charged parts of the channel protein in the Fokker-Planck case), the next model is based on statistical single-channel behaviour. The starting point for this model is a sufficiently large number of single-channel records. As we will see in the following, the major quantities in the single-channel based model are the probability of an opening event occurring at time $t$ and a probability related to the open duration of the single channel.

The general idea is that the macroscopic current $I_{\mathrm{mac}}(t)$ is composed of overlaid single-channel currents $I_{\mathrm{sc}}(t)$. Let $M_c$ denote the number of channels contributing to the macroscopic current and $I^{\max}_{\mathrm{sc}}$ the maximal single-channel current (flowing when the channel is open), given by

\[
I^{\max}_{\mathrm{sc}} = \gamma_s(V)\,(V - V_{\mathrm{eq}}),
\]

where $\gamma_s$ denotes the maximal single-channel conductance and $V_{\mathrm{eq}}$ the equilibrium potential. Since the macroscopic current $I_{\mathrm{mac}}(t)$ is generated by overlying single-channel currents we have

\[
I_{\mathrm{mac}}(t) = \sum_{i=1}^{M_c} I^{i}_{\mathrm{sc}}(t) = (V(t) - V_{\mathrm{eq}})\, \gamma_s(V) \sum_{i=1}^{M_c} s_i(t).
\]

Here $s_i(t)$ is a unit step function, being 1 if the channel is open and 0 if it is closed, assuming that all channels contributing to the current are of the same type and thus have the same conductance. Figure 4.4 shows an example of a single-channel time course and the corresponding unit step function.

As in [77], we can express one open period as the difference of two step functions $o_{ji}(t)$ and $c_{ji}(t)$, where $o_{ji}(t)$ corresponds to the $j$th opening in the $i$th single channel and $c_{ji}(t)$ to the $j$th closing in the $i$th single channel (see Figure 4.5).

Figure 4.4: Single channel current and corresponding unit step function.

Figure 4.5: Single current pulse and corresponding step functions.

Denoting by $K_i$ the number of current pulses in the $i$th channel, the macroscopic current can thus be expressed as

\[
\begin{aligned}
I_{\mathrm{mac}}(t) &= (V(t) - V_{\mathrm{eq}})\, \gamma_s(V) \sum_{i=1}^{M_c} \sum_{j=1}^{K_i} \big(o_{ji}(t) - c_{ji}(t)\big) \\
&= (V(t) - V_{\mathrm{eq}})\, \gamma_s(V) \sum_{j=1}^{T} \big(o_j(t) - c_j(t)\big).
\end{aligned}
\]

After renumbering the opening and closing events, $T$ is the total number of opening events in all channels. Defining $O(t) := \frac{1}{M_c} \sum_{j=1}^{T} o_j(t)$, the average of opening events, and $C(t) := \frac{1}{M_c} \sum_{j=1}^{T} c_j(t)$, the average of closing events, the macroscopic current can finally be written as

\[
I_{\mathrm{mac}}(t) = (V(t) - V_{\mathrm{eq}})\, \gamma_s(V)\, M_c\, \big(O(t) - C(t)\big). \tag{4.28}
\]

We assume that all channels start in the closed state, which is a sensible assumption, keeping in mind that we want to investigate macroscopic currents developing after a voltage step (i.e. before the voltage is stepped from some holding potential to a certain test pulse, no current is recorded and thus all channels have to be non-conducting).

$O(t)$ is the proportion of transitions from the closed to the open state before time $t$ and $C(t)$ is the proportion of transitions from the open to the closed state before time $t$. We get ([77])

\[
O(t) = \int_0^t H(s)\, ds,
\]

where $H(t)\,dt$ defines the probability that an opening event occurs between time $t$ and $t + dt$. In order for a closing event to occur before time $t$, an opening must have occurred at some time $\tau < t$ and the duration of the opening should be $d < t - \tau$. We assume that the Markov property holds, i.e. the probability does not depend on the history of the channel but only on the last time step. This implies that the probability for an opening event to occur does not depend on how long the channel stayed closed or how often it has already opened, but only on whether it is closed or open at that instant. We can write

\[
\begin{aligned}
&P(\text{opening occurs at } \tau \text{ for a duration } d < t - \tau) \\
&\quad = P(\text{opening occurs at } \tau) \cdot P(\text{opening duration } d < t - \tau \mid \text{channel opened at time } \tau)
\end{aligned}
\]

and we define $Q(s,t)$ as the probability that the channel stays open for a period less than $s$ given that it opened at time $t$. This conditional probability is necessary to account for different distributions of open durations, especially considering the time after a voltage step. Compared to [77] we keep this more general version of $Q(s,t)$ and abandon the assumption that the open durations have to be independent of the channel opening time.

Integration over all possible values of $\tau$ then yields

\[
C(t) = \int_0^t H(s)\, Q(t - s, s)\, ds.
\]

Thus the macroscopic current can be computed as

\[
I_{\mathrm{mac}}(t) = (V(t) - V_{\mathrm{eq}})\, \gamma_s(V)\, M_c \int_0^t H(s)\,\big(1 - Q(t - s, s)\big)\, ds. \tag{4.29}
\]

Equivalently, the macroscopic open probability $P_{\mathrm{open}}(t)$ can be considered as before, which is now given by

\[
P_{\mathrm{open}}(t) = \int_0^t H(s)\,\big(1 - Q(t - s, s)\big)\, ds. \tag{4.30}
\]
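Given estimates of $H$ and $Q$, the integral in (4.30) can be approximated by simple quadrature. The sketch below uses the trapezoidal rule; the function name `open_probability` and the particular choices of `H` and `Q` (an exponentially decaying opening probability and exponentially distributed open durations, with made-up rates) are assumptions for illustration only and do not come from the text or from [77].

```python
import math


def open_probability(H, Q, t, n=2000):
    """Trapezoidal approximation of P_open(t) = int_0^t H(s) (1 - Q(t - s, s)) ds.

    H(s)    : probability density of an opening event at time s
    Q(d, s) : probability that an opening starting at s lasts less than d
    """
    if t <= 0.0:
        return 0.0
    h = t / n
    total = 0.0
    for k in range(n + 1):
        s = k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * H(s) * (1.0 - Q(t - s, s))
    return total * h


# Hypothetical inputs (rates in 1/ms): openings become rarer with time,
# open durations are exponentially distributed and independent of s.
a, b, c = 0.5, 1.0, 3.0
H = lambda s: a * math.exp(-b * s)
Q = lambda d, s: 1.0 - math.exp(-c * d)

p_numeric = open_probability(H, Q, t=2.0)
# For these particular H and Q the integral has a closed form.
p_exact = a * (math.exp(-b * 2.0) - math.exp(-c * 2.0)) / (c - b)
```

Multiplying the result by $(V(t) - V_{\mathrm{eq}})\,\gamma_s(V)\,M_c$ gives the macroscopic current of (4.29).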

As Nekouzadeh and Rudy have shown in [77] and [78], the two statistical properties $H$, the number of openings per unit time in a single-channel record, and $Q$, the distribution function of the open duration, uniquely determine the shape of the macroscopic ionic current, whereas other commonly used statistical properties like the probability density function (pdf) of closed duration, the pdf of latency to first opening or the distribution of the number of openings per record are not sufficient to uniquely determine the macroscopic current. This means that two different sets of single-channel sweeps that agree in these other statistical properties can give rise to different kinetic behaviour in the macroscopic ionic current. On the other hand, single-channel sweeps showing the same statistical properties $H$ and $Q$ will give rise to the same macroscopic behaviour.

One of the drawbacks of this model is, however, that it is not straightforward to supplement it with an expression for the gating current. As the statistical properties come from single-channel measurements, they mainly incorporate information with respect to transitions between open and closed states of the channel. But the major part of the gating current is related to pre-open states, referring to conformational changes among closed states well before the channel opens. The only information about this period contained in the single-channel data sets is the time to first opening after a voltage step has occurred.

The advantage of employing such a statistical approach is the fact that single-channel recordings can be used instead of macroscopic whole-cell recordings. Hence the number of channels and the single-channel conductance can be explicitly read from measurements. Furthermore, as Aldrich and Yellen point out in [102], Chapter 13, single-channel recordings have some additional advantages over macroscopic measurements, such as avoiding imperfect current separation. This means that in macroscopic measurements it is hardly possible to measure only the current flowing through one specific channel type, eliminating all other possible current contributions. In single-channel measurements one specific channel can be addressed directly.

4.1.4 Comparison of the different models

In this section we are going to compare the three different model types introduced above, namely the Markov approach, the Fokker-Planck approach and the single-channel approach. Each model has its own parameters and assumptions and we are going to see in which way the different parameters of the individual models can be related. As we have already seen above, the Markov approach and the Fokker-Planck type model are more closely related to each other than to the last approach based on single-channel statistics. By increasing the number of states in the Markov model we have discovered that a Fokker-Planck type equation naturally arises as the limiting case for the number of states tending to infinity. Table 4.1 gives an overview of the different parameters involved in the individual model types. In order to be able to compute the macroscopic ionic and gating currents, the single-channel conductance $\gamma_s$ and the number of channels in the membrane patch, $M_c$, also need to be supplied.

Model            Parameters
---------------  -------------------------------------------------
Markov           transition rates k_ij
                 effective charge valence z_ij
Fokker-Planck    energy landscape µ
                 mobilities η_j
                 diffusion coefficients D_j
                 charges z_j
                 open probability weighting function ω (optional)
single-channel   probability of opening events H(t)
                 probability of opening duration Q(s, t)

Table 4.1: Summary of the main parameters appearing in the individual model types for the macroscopic ionic and gating current.

Remember that the single-channel based model is not supplied with an expression for the gating current. The optional weighting function $\omega$ introduced in the expression for the macroscopic ionic current based on the Fokker-Planck model (see (4.15)) could also be included in Markov type models.


The macroscopic open probability

Let us start by comparing the expressions for the macroscopic open probability $P_{\mathrm{open}}$, which in turn gives rise to the macroscopic ionic current. The single-channel conductance $\gamma_s$ and an estimate for the number of channels $M_c$ in the membrane patch of interest are required for all modelling approaches to finally compute the ionic current from the open probability. As already mentioned before, the Markov and Fokker-Planck models are quite closely related. If we compare the expressions for the macroscopic open probability we have

\[
P_{\mathrm{open}}(t) = \int_{\Omega^M} \omega(x)\, p(x,t)\, dx \quad \text{for the Fokker-Planck model,}
\]

with $\omega$ denoting the restriction to the subsets that are associated with conducting states, and

\[
P_{\mathrm{open}}(t) = \sum_{i \in \Lambda} s_i(t) \quad \text{for the Markov model,}
\]

with $\Lambda$ denoting the index set of conducting states. We see that the latter is in fact just a discrete version of the former if we take the weighting function $\omega$ to be either 0 (for non-conducting states) or 1 (for conducting states). This insight is not surprising, keeping in mind the close relation of the two model types.

It should, however, be noted that the Fokker-Planck model, as derived in the next-to-last section, incorporates a physical geometry in the domain $\Omega$ (recall that $\Omega \subset \mathbb{R}^3$ is the real physical domain in which the different charges can move), while the Markov states are an abstraction of functional states rather than a description of the real physical arrangement of the system. Hence it might be justified to say that the Fokker-Planck approach bears the closest resemblance to the underlying physics, while the Markov models (and also the single-channel approach) work on a more abstract basis concerned with the functionally distinguishable states of the system.

The transition rates $k_{ij}$ in the Markov model form the counterpart to the energy landscape $\mu$, the diffusion coefficients $D_j$ and the mobilities $\eta_j$ in the Fokker-Planck model. If Einstein's relation can be applied, the mobilities $\eta_j$ can be expressed by means of the diffusion coefficients via $D = k_B T\, \eta$. As we have done when performing the Taylor expansion of the multiple-state Markov model, an energy landscape $\mu$ and diffusion coefficients and mobilities could be defined with the help of the Markov transition rates.

The other way round is the more prominent approach, starting from a given potential landscape and deriving transition rates for a Markov model. This direction might also be seen as the more reasonable approach, since it starts from the physical basis (the energy landscape) and goes to the abstract level, while the other one starts from the abstract level and tries to recover the physical basis for it. Assuming an energy landscape with adequately high barriers ($> 5 k_B T$, see [108]) separating local energy minima, transition rates for a corresponding Markov model can be derived using the so-called mean first passage time. This quantity describes how long on average it takes the system to pass from one energy minimum to the next. Imagine a particle moving in a double-well potential as can be seen in Figure 4.6.

If at time $t = 0$ the particle sits in $i$, the mean first passage time $T_{ij}$ will give the average time it takes the particle to cross the barrier to $j$. It can be computed from the shape of the potential landscape (see [34] for details). The associated Markov transition rate can then be approximated as the reciprocal of the mean first passage time, $k_{ij} = 1/T_{ij}$, if the energy barriers are large enough ([48], [109], [108]).

Figure 4.6: Double-well potential (with the two minima labelled $i$ and $j$).
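To make the step from an energy landscape to a transition rate concrete, the following sketch evaluates a mean first passage time numerically. It relies on assumptions that go beyond the text: a one-dimensional overdamped particle with constant diffusion coefficient `D`, a potential `U` measured in units of $k_B T$, a reflecting boundary at the left end `a` and an absorbing target at `b`. Under these assumptions the standard one-dimensional result is $T = \frac{1}{D}\int_a^b e^{U(y)} \int_a^y e^{-U(z)}\,dz\,dy$; this need not be the exact form used in [34]. The quartic double well and all numbers are hypothetical.

```python
import math


def mean_first_passage_time(U, a, b, D, n=2000):
    """T = (1/D) * int_a^b exp(U(y)) * [int_a^y exp(-U(z)) dz] dy.

    U is the potential in units of kB*T; a is a reflecting boundary and
    b the absorbing target. Both integrals use the trapezoidal rule.
    """
    h = (b - a) / n
    inner = 0.0                      # running value of int_a^y exp(-U(z)) dz
    outer = 0.0                      # running value of the outer integral
    prev_in = math.exp(-U(a))
    prev_out = 0.0                   # the inner integral vanishes at y = a
    for k in range(1, n + 1):
        y = a + k * h
        cur_in = math.exp(-U(y))
        inner += 0.5 * h * (prev_in + cur_in)
        cur_out = math.exp(U(y)) * inner
        outer += 0.5 * h * (prev_out + cur_out)
        prev_in, prev_out = cur_in, cur_out
    return outer / D


def double_well(barrier):
    """Hypothetical quartic double well, minima at x = -1 and x = +1."""
    return lambda x: barrier * (x * x - 1.0) ** 2


# Passage from the left well region to the right minimum for two barrier heights.
t_low = mean_first_passage_time(double_well(3.0), -2.0, 1.0, D=1.0)
t_high = mean_first_passage_time(double_well(6.0), -2.0, 1.0, D=1.0)
k_high = 1.0 / t_high   # approximate Markov rate, k_ij = 1 / T_ij
```

As expected, the higher barrier yields the longer passage time and hence the smaller transition rate.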

Next we turn our attention to the model based on single-channel statistics. The parameters in this model are the two time-dependent functions $H(t)$ and $Q(s,t)$. Recall that $H(t)\,dt$ denotes the probability that an opening event occurs in the time interval $[t, t + dt]$, and $Q(s,t)$ is the probability that the channel stays open for less than $s$ time units given that it opened at $t$. The macroscopic open probability is then computed via (compare (4.30))

\[
P_{\mathrm{open}}(t) = \int_0^t H(s)\,\big(1 - Q(t - s, s)\big)\, ds.
\]

Generally this model is derived for measurements performed on the single-channel level, and the two functions $H$ and $Q$ can be determined from those single-channel data as described in [77]. The general idea in doing so is to count the number of events occurring over all single-channel recordings to get an estimate for the average probability in different time intervals. But relating this statistical approach to the other models, it is also possible to derive the two quantities $H(t)$ and $Q(s,t)$ from an underlying Markov model. First we are going to deduce an expression for $H(t)$ based on the transition rates of a Markov model, as outlined in [77]. Then we are also going to show how the quantity $Q(s,t)$ can be expressed by means of transition rates.

Let us assume that the underlying Markov model consists of $N$ distinct states, where for simplicity we assume that the first $m$ states correspond to the conducting states of the channel and the states $m+1$ to $N$ refer to the non-conducting states. The probability of being in state $i$ at time $t$ shall be denoted by $s_i(t)$. For a large ensemble this is equivalent to the fractional number of channels in the state $i$ at time $t$. The transition rate between state $i$ and state $j$ is again expressed in its general form as $k_{ij}(t)$. In order for an opening to occur between time $t$ and $t + dt$ the channel should be in one of the closed states and transition to one of the open states during $dt$. To arrive at the probability that the channel passes from any closed state to any open state during $dt$, all possible transition paths have to be summed up:

\[
H(t) = \sum_{j=m+1}^{N} \sum_{i=1}^{m} s_j(t)\, k_{ji}(t). \tag{4.31}
\]

From the section on Markov models we know that $S(t) = [s_1(t) \dots s_N(t)]^T$ is the solution of the system of ordinary differential equations $\frac{dS}{dt} = A(t) S(t)$, where the system matrix $A$ is composed of the transition rates via $a_{ij}(t) = k_{ji}(t)$ and $a_{ii}(t) = -\sum_{j=1,\, j \neq i}^{N} k_{ij}(t)$. Labelling with the subscript $o$ all quantities related to open states and with $c$ all quantities related to closed states, we can decompose $S(t)$ into $S(t) = [S_o(t)\,;\, S_c(t)]$ and the system matrix $A$ into

\[
A(t) = \begin{bmatrix} A_{oo}(t) & A_{co}(t) \\ A_{oc}(t) & A_{cc}(t) \end{bmatrix}.
\]

Here the double indices $oo$ and $cc$ refer to transitions only among the open or closed states, respectively, while $co$ refers to transitions from closed to open states and $oc$ to transitions from open to closed states. With this notation (4.31) can be written in the compact form

\[
H(t) = \langle A_{co}(t)\, S_c(t), \mathbf{1} \rangle,
\]

with $\mathbf{1} = [1 \dots 1]^T$ being a vector of length $m$ (number of open states) with all entries equal to one, and $\langle \cdot, \cdot \rangle$ the standard scalar product in $\mathbb{R}^m$. The above expression shows that $H(t)$ is composed of the sum over all transitions that transfer the channel from a closed to an open state, as could be expected, since $H(t)\,dt$ is defined as the probability of an opening event occurring during $dt$.
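The equivalence between the double sum (4.31) and the block form $\langle A_{co} S_c, \mathbf{1} \rangle$ can be checked on a small example. The sketch below is illustrative only: it uses 0-based indices, plain Python lists instead of a linear algebra library, and a hypothetical three-state scheme (one open state followed by two closed states) with made-up rates and state probabilities.

```python
def generator_matrix(k):
    """Build A from rates k[i][j] (i -> j): a_ij = k_ji, a_ii = -sum_{j != i} k_ij."""
    n = len(k)
    A = [[k[j][i] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        A[i][i] = -sum(k[i][j] for j in range(n) if j != i)
    return A


def opening_rate(k, s, m):
    """Double sum (4.31): closed states j = m..N-1, open states i = 0..m-1."""
    n = len(k)
    return sum(s[j] * k[j][i] for j in range(m, n) for i in range(m))


def opening_rate_block(A, s, m):
    """<A_co S_c, 1>: rows of A belonging to open states, columns to closed states."""
    n = len(A)
    return sum(A[i][j] * s[j] for i in range(m) for j in range(m, n))


# Hypothetical sequential scheme O <-> C1 <-> C2 (state 0 is the open state).
k = [[0.0, 2.0, 0.0],    # O  -> C1
     [0.5, 0.0, 1.0],    # C1 -> O, C1 -> C2
     [0.0, 3.0, 0.0]]    # C2 -> C1
s = [0.1, 0.6, 0.3]      # assumed state probabilities at some time t
m = 1                    # number of open states

A = generator_matrix(k)
h_sum = opening_rate(k, s, m)
h_block = opening_rate_block(A, s, m)
column_sums = [sum(A[i][j] for i in range(len(A))) for j in range(len(A))]
```

Only the closed state adjacent to the open state contributes here, so both expressions reduce to the rate C1 -> O times the probability of C1.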

To derive an expression for the probability of opening duration, $Q(\tau, t)$, we need to consider the following setup: If the channel is in an open state at time $t$, what is the probability that it will reach a closed state during the timespan $\tau$? (In order to avoid confusion in the following, we have switched to the timespan $\tau$ instead of $s$, which will be used to denote state probabilities coming from the Markov model.) To find the distribution of the lifetime of the open state at time $t$, we consider a new absorbing process, i.e. transitions from shut states are impossible (all transition rates out of closed states are set to zero). Since we are interested in the distribution of the lifetime at time $t$, consider the new process starting in the open state at time $t = 0$. Denoting by a tilde all quantities related to the new process we can define the conditional probabilities ([24])

\[
\tilde{p}_{ij}(\tau) = P(\text{channel is in state } j \text{ at time } \tau \mid \text{state } i \text{ at time } 0)
\]

and the corresponding matrix $\tilde{P}(\tau) = [\tilde{p}_{ij}(\tau)]_{i,j=1,\dots,N}$. This matrix can be decomposed in a way analogous to the decomposition of the matrix $A$,

\[
\tilde{P}(\tau) = \begin{bmatrix} \tilde{P}_{oo}(\tau) & \tilde{P}_{oc}(\tau) \\ \tilde{P}_{co}(\tau) & \tilde{P}_{cc}(\tau) \end{bmatrix}, \tag{4.32}
\]

with the subscripts $oo$ referring to the conditional probabilities starting and ending in an open state, $oc$ to the conditional probabilities starting in an open and ending up in a closed state, and $co$ and $cc$ beginning in a closed state and ending up in an open or closed state, respectively.

The transition rates for the new process are the transition rates of the original process starting from time $t$ onwards:

\[
\tilde{k}_{ij}(s) = k_{ij}(t + s).
\]

For time-independent transition rates both are just equal.

Now if we fix a time $t$, the probability that the lifetime of the open state is less than $\tau$, $Q(\tau, t)$, is the probability that the channel is in any shut state at time $t + \tau$ given that it is in the open state at time $t$. Reopenings from closed states are impossible due to the fact that we are considering an absorbing process. In other words,

\[
\begin{aligned}
Q(\tau, t) &= P(\text{any shut state at time } t + \tau \mid \text{open state at time } t) \\
&= 1 - P(\text{open state at time } t + \tau \mid \text{open state at time } t) \\
&= 1 - \sum_{j=1}^{m} \sum_{i=1}^{m} \tilde{p}_{ij}(\tau)\, \tilde{s}_i(0).
\end{aligned}
\]

Here, $\tilde{s}_i(0)$, $i = 1, \dots, m$, denotes the probability that the channel begins in the open state $i$ for our new absorbing process. For this process we assume that the channel at time $t = 0$ is open, but we do not know in which open state the channel resides. Hence we define $\tilde{s}_i(0)$ with the help of the state probabilities $s_i(t)$ coming from the original Markov process ($\frac{dS}{dt} = A(t) S(t)$), keeping in mind that the time $t$ in the original process corresponds to the starting point of the absorbing process. We get

\[
\tilde{s}_i(0) = s_i(t) / f_{\mathrm{open}}(t),
\]

stating that the initial distribution of the absorbing process corresponds to the open state distribution of the original process scaled by the total fraction of open channels $f_{\mathrm{open}}(t) = \sum_{i=1}^{m} s_i(t) = \langle S_o(t), \mathbf{1} \rangle$. This scaling ensures that the initial distribution of the absorbing process sums up to unity, $\sum_{i=1}^{m} \tilde{s}_i(0) = 1$, which has to be the case since the absorbing process starts in an open state. Here, $m$ again denotes the number of open states. Using the decomposition of the state probability vector $S(t) = [S_o(t)\,;\, S_c(t)]$ that we already used in the derivation of $H(t)$, we can write

\[
\begin{aligned}
Q(\tau, t) &= 1 - \langle \tilde{P}_{oo}(\tau)^T \tilde{S}_o(0), \mathbf{1} \rangle \\
&= 1 - \frac{\langle \tilde{P}_{oo}(\tau)^T S_o(t), \mathbf{1} \rangle}{\langle S_o(t), \mathbf{1} \rangle},
\end{aligned}
\]

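with $\mathbf{1} = [1 \dots 1]^T \in \mathbb{R}^m$ as before. $\tilde{P}_{oo}(\tau)^T$ can be determined as the solution of the ordinary differential equation (ODE) system

\[
\frac{d\tilde{P}_{oo}^T}{d\tau} = \tilde{A}_{oo}\, \tilde{P}_{oo}^T
\]

with $\tilde{A}_{oo}(\tau) = A_{oo}(t + \tau)$ and $\tilde{P}_{oo}(0)^T = \mathrm{Id}$, where $\mathrm{Id}$ denotes the identity matrix in $\mathbb{R}^{m \times m}$.

This ODE system follows from the fact that we are considering an absorbing process where transitions from a closed to an open state are impossible: For the full system we could set up the evolution equations for each conditional probability $\tilde{p}_{ij}$,

\[
\frac{d\tilde{p}_{ij}}{d\tau} = -\sum_{\substack{l=1 \\ l \neq j}}^{N} \tilde{k}_{jl}\, \tilde{p}_{ij} + \sum_{\substack{l=1 \\ l \neq j}}^{N} \tilde{k}_{lj}\, \tilde{p}_{il},
\]

which in compact notation transform into

\[
\frac{d\tilde{P}^T}{d\tau} = \tilde{A}\, \tilde{P}^T.
\]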


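Using the decompositions (4.32) and $\tilde{A} = \begin{bmatrix} \tilde{A}_{oo} & \tilde{A}_{co} \\ \tilde{A}_{oc} & \tilde{A}_{cc} \end{bmatrix}$ together with the fact that $\tilde{A}_{co} = 0 \in \mathbb{R}^{m \times (N-m)}$ (transitions from shut states to open states are impossible) finally gives us the above ODE system for $\tilde{P}_{oo}^T$, with the formal solution

\[
\tilde{P}_{oo}(\tau)^T = \exp\Big( \int_0^\tau \tilde{A}_{oo}(\xi)\, d\xi \Big).
\]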

From the above considerations we see that the parameters of the single-channel based model, $H(t)$ and $Q(s,t)$, can indeed be related to the parameters showing up in the Markov model, although we get quite a complicated relation and not a simple one-to-one correspondence for the general case.

To get a better understanding of what the expressions for $H$ and $Q$ actually mean, we consider as an example the case of a sequential Markov model with only one open state ($m = 1$) and time-independent transition rates. The expressions then simplify to $A_{oo} = -k_{12}$, $A_{co} = [k_{21}\ 0\ \dots\ 0]^T$, $S_o(t) = s_1(t)$ and $f_{\mathrm{open}}(t) = s_1(t)$. This yields the following expressions for $H$ and $Q$:

\[
H(t) = k_{21} \cdot s_2(t), \qquad Q(\tau, t) = 1 - e^{-k_{12}\tau},
\]

and we get for the macroscopic open probability

\[
\begin{aligned}
P_{\mathrm{open}}(t) &= \int_0^t H(\tau)\,\big(1 - Q(t - \tau, \tau)\big)\, d\tau \\
&= \int_0^t k_{21}\, s_2(\tau)\, e^{-k_{12}(t - \tau)}\, d\tau \\
&= s_1(t)
\end{aligned}
\]

for $s_1(0) = 0$ (recall that we have used this assumption in the derivation of the single-channel model). The last equality holds because the integral is exactly the solution of $\frac{ds_1}{dt} = -k_{12} s_1 + k_{21} s_2$ with $s_1(0) = 0$. For the special case of $k_{12} = 0$, i.e. channels cannot close once they are open, $Q$ vanishes and the macroscopic open probability depends only on the opening rate, $P_{\mathrm{open}}(t) = \int_0^t H(\tau)\, d\tau$.
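The identity $P_{\mathrm{open}}(t) = s_1(t)$ for a sequential model with one open state can also be checked numerically. The following sketch is an illustration under assumptions not taken from the text: a hypothetical three-state chain (state 1 open, states 2 and 3 closed) with made-up time-independent rates, a classical fourth-order Runge-Kutta integrator for $\frac{dS}{dt} = AS$, and the trapezoidal rule for the convolution integral.

```python
import math

# Hypothetical rates in 1/ms for the chain 1 (open) <-> 2 <-> 3.
k12, k21, k23, k32 = 2.0, 1.5, 0.5, 3.0


def rhs(s):
    """Right-hand side of dS/dt = A S for the three-state chain."""
    s1, s2, s3 = s
    return (-k12 * s1 + k21 * s2,
            k12 * s1 - (k21 + k23) * s2 + k32 * s3,
            k23 * s2 - k32 * s3)


def rk4_step(s, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(u, v, c):
        return tuple(a + c * b for a, b in zip(u, v))
    f1 = rhs(s)
    f2 = rhs(shifted(s, f1, h / 2))
    f3 = rhs(shifted(s, f2, h / 2))
    f4 = rhs(shifted(s, f3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(s, f1, f2, f3, f4))


h, n = 0.001, 3000
traj = [(0.0, 0.0, 1.0)]            # all channels start in the last closed state
for _ in range(n):
    traj.append(rk4_step(traj[-1], h))

t_end = n * h
# Integrand H(tau) * (1 - Q(t - tau, tau)) = k21 * s2(tau) * exp(-k12 * (t - tau)).
vals = [k21 * traj[i][1] * math.exp(-k12 * (t_end - i * h)) for i in range(n + 1)]
p_open = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
s1_end = traj[-1][0]
```

Up to discretisation error, `p_open` and `s1_end` agree, in line with the closed-form result above.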

What can also be seen from the relations for $H$ and $Q$ is the fact that only the open and directly neighbouring closed states explicitly enter the expressions for $H$ and $Q$. Anything that happens among the closed states "far away" from the open states only influences the quantities $H$ and $Q$ indirectly by changing the behaviour of the probability distribution $s_i(t)$.

The following numerical example demonstrates the relation derived above for $H$ and $Q$ based on Markov transition rates. For this we have used the 8-state sequential Markov model introduced by Bezanilla, Perozo and Stefani in [12]. The model contains one open state and a sequence of 7 closed states with the transition rates given by $\alpha_i = \alpha_{0i} \exp(z_i x_i e_0 V / (k_B T))$ and $\beta_i = \beta_{0i} \exp(-z_i (1 - x_i) e_0 V / (k_B T))$. The parameters are given in Figure 4.7.
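As an illustration of how the voltage-dependent rates of such a sequential model could be evaluated, consider the sketch below. Several ingredients are assumptions: the parameter values are a reading of the (partly garbled) table in Figure 4.7, the assignment of rate pairs to the seven transitions follows the scheme drawn there, and the thermal voltage $k_B T / e_0$ is set to roughly 25 mV, a typical value near room temperature that is not stated in the text. The function name `rates` is hypothetical.

```python
import math

KB_T_OVER_E0_MV = 25.0   # assumed thermal voltage kB*T/e0 in mV

# (z_i, x_i, alpha_0i [1/ms], beta_0i [1/ms]) as read from Figure 4.7.
PARAMS = [
    (1.81, 0.24, 1.55, 0.03),
    (1.24, 0.80, 4.88, 0.45),
    (0.89, 0.75, 0.90, 0.80),
    (3.50, 0.81, 50.0, 0.018),
    (0.35, 0.30, 10.0, 7.0),
]

# Index of the rate pair used by each of the seven transitions
# C0 -> C1 -> C11 -> C12 -> C2 -> C3 -> C4 -> O.
STEP_INDEX = [0, 1, 1, 1, 2, 3, 4]


def rates(v_mv):
    """Return [(alpha, beta), ...] for the seven transitions at voltage v_mv (mV)."""
    out = []
    for idx in STEP_INDEX:
        z, x, a0, b0 = PARAMS[idx]
        alpha = a0 * math.exp(z * x * v_mv / KB_T_OVER_E0_MV)
        beta = b0 * math.exp(-z * (1.0 - x) * v_mv / KB_T_OVER_E0_MV)
        out.append((alpha, beta))
    return out


rates_hold = rates(-80.0)   # holding potential before the step
rates_test = rates(0.0)     # test potential after the step
```

At 0 mV the exponentials equal one, so the rates reduce to the prefactors, while at the holding potential of −80 mV the forward rates are strongly suppressed and the backward rates enhanced.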

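To illustrate the above relations for $H$ and $Q$ we first generate a set of single-channel recordings using this Markov model. Then we deduce the statistical quantities $H$ and $Q$ from these single-channel data as described in [77] and compare them to the analytical expressions based on the transition rates that we have derived above. In order to mimic single-channel experiments, we simulated a set of 1000 single-channel time courses, of which the first five can be seen in Figure 4.8. The voltage step from −80 mV to 0 mV occurs at time $t = 0$.

\[
C_0 \underset{\beta_0}{\overset{\alpha_0}{\rightleftharpoons}} C_1 \underset{\beta_1}{\overset{\alpha_1}{\rightleftharpoons}} C_{11} \underset{\beta_1}{\overset{\alpha_1}{\rightleftharpoons}} C_{12} \underset{\beta_1}{\overset{\alpha_1}{\rightleftharpoons}} C_2 \underset{\beta_2}{\overset{\alpha_2}{\rightleftharpoons}} C_3 \underset{\beta_3}{\overset{\alpha_3}{\rightleftharpoons}} C_4 \underset{\beta_4}{\overset{\alpha_4}{\rightleftharpoons}} O
\]

i    z_i    x_i    α_0i [ms^-1]    β_0i [ms^-1]
---  -----  -----  --------------  --------------
0    1.81   0.24   1.55            0.03
1    1.24   0.8    4.88            0.45
2    0.89   0.75   0.9             0.8
3    3.5    0.81   50.0            0.018
4    0.35   0.3    10              7

Figure 4.7: Eight-state sequential Markov model and parameters for the transition rates given in [12].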

Figure 4.8: Set of single-channel time-courses (horizontal axis: t [ms], 0 to 7).

Figure 4.9: Comparison of the macroscopic open probability Popen(t) derived from single-channel data (blue) and as determined from the Markov model, Popen(t) = s1(t) (red). Axes: t [ms] from 0 to 7, Popen from 0 to 1.

The macroscopic open probability $P_{\mathrm{open}}$ stemming from the Markov model is shown in Figure 4.9 in red, while the open probability arising as the result of the summed-up single-channel time courses is drawn in blue. We see that for a set of 1000 single-channel time courses both agree well with each other.

Figures 4.10 and 4.11 show in blue the properties deduced directly from the single-channel data and in red the analytical expressions $H(t) = k_{21} \cdot s_2(t)$ and $Q(\tau, t) = 1 - e^{-k_{12}\tau}$. We see a very good correspondence between the two results, confirming our theoretical investigations.


Figure 4.10: Comparison of H derived from single-channel data (blue) and analytical expression H(t) = k21 · s2(t) (red). Axes: t [ms] from 0 to 7, H from 0 to 4.5.

Figure 4.11: Comparison of Q(τ, t) derived from single-channel data (blue) and analytical expression 1 − exp(−k12 τ) (red). Axes: τ [ms] from 0 to 1.2, Q from 0 to 1.

Gating current

Next we turn our attention to the gating current and compare the expressions for a sequential Markov model with a one-dimensional single-particle version of the Fokker-Planck model. This one-dimensional single-particle Fokker-Planck version is derived in detail in the next section and hence we just state it here and refer to the next section for explanations. The gating currents are given by

\[
I_g^{MM}(t) = M_c \cdot e_0 \cdot \sum_{i=1}^{N} \Big[ z_{i(i+1)}\, k_{i(i+1)}(t)\, s_i(t) - z_{(i+1)i}\, k_{(i+1)i}(t)\, s_{i+1}(t) \Big]
\]

in the case of the Markov model ($N$ being the number of states and $z_{ij}$ the effective valence transported during the transition from $i$ to $j$) and

\[
I_g^{FP}(t) = M_c \cdot e_0 \cdot \int_0^L -\frac{1}{1\,\mathrm{Volt}}\, z\, \frac{d\Phi}{dx}\, J\, dx
\]

for the Fokker-Planck model. Here, $z$ denotes the valence of the moving particle, $\Phi$ is the unit potential across the system and $J$ denotes the probability density flux. The integration domain $[0, L]$ gives the region accessible to the moving particle.

Comparing the two expressions, we can relate the discrete version connected to the Markov model to the continuous Fokker-Planck expression. If we assume that $z_{i(i+1)} = z_{(i+1)i} =: z_{i+\frac{1}{2}}$ and introduce the effective valence $\tilde{z} := -\frac{1}{1\,\mathrm{Volt}}\, z\, \frac{d\Phi}{dx}$, we get

\[
\sum_{i=1}^{N} z_{i+1/2} \Big[ k_{i(i+1)}(t)\, s_i(t) - k_{(i+1)i}(t)\, s_{i+1}(t) \Big] = \int_0^L \tilde{z}\, J\, dx,
\]

where the bracketed term on the left-hand side corresponds to the probability density flux $J$ on the right-hand side. To better understand this we take a look at the graphical interpretation shown in Figure 4.12.


Figure 4.12: Visualisation of probability density fluxes (flux J between neighbouring cells i and i+1, with rates k_i(i+1) and k_(i+1)i).

Imagine the domain [0, L] to be discretized into little compartments, each in which theprobability density p is constant. The probability density flux J then describes the netflow across the compartment boundaries from one cell to the next. If we now identify onecell with the Markov state i and the neighbouring one with the state i + 1, the expressionki(i+1)(t) si(t) − k(i+1)i(t) si+1(t) in fact also gives the net probability flow from one cell tothe next.

Actually what we have done is nothing more than to use a discretization of the continuous Fokker-Planck model to come to a discrete description, which is another way to show the relation between Markov schemes and Fokker-Planck type models.
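The correspondence between cells and Markov states can be sketched numerically. The following is only an illustration under stated assumptions: it uses the dimensionless one-dimensional flux $J = -D\,(W' p + \partial_x p)$ with a combined drift potential $W$, and a symmetric exponential choice of the rates, which is one of several possible discretizations and is not claimed to be the scheme used in this work. The function names, the potential and all numbers are hypothetical.

```python
import numpy as np

def markov_rates(W, D, h):
    """Nearest-neighbour rates for a drift potential W given at the cell centres.

    Assumed (hypothetical) rate law: k = D/h^2 * exp(-+dW/2), which fulfils
    detailed balance with respect to s_i ~ exp(-W_i).
    Returns (k_fwd, k_bwd) for the transitions i -> i+1 and i+1 -> i.
    """
    dW = np.diff(W)
    k_fwd = D / h**2 * np.exp(-dW / 2.0)
    k_bwd = D / h**2 * np.exp(+dW / 2.0)
    return k_fwd, k_bwd

def net_flux(s, k_fwd, k_bwd):
    """Net probability flow k_{i(i+1)} s_i - k_{(i+1)i} s_{i+1} across each cell boundary."""
    return k_fwd * s[:-1] - k_bwd * s[1:]

# Hypothetical setup on the unit interval.
n = 50
h = 1.0 / n
x = (np.arange(n) + 0.5) * h          # cell centres
D = 1.0
W = 4.0 * (x - 0.5) ** 2 - 3.0 * x    # illustrative drift potential

k_fwd, k_bwd = markov_rates(W, D, h)

# (1) For the equilibrium occupancies s_i ~ exp(-W_i) every net flow vanishes.
s_eq = np.exp(-W)
s_eq /= s_eq.sum()
flux_eq = net_flux(s_eq, k_fwd, k_bwd)

# (2) For a uniform density p = 1 the continuous flux is J = -D W'(x).
s_uniform = np.full(n, h)             # cell probability = density * cell width
flux_uniform = net_flux(s_uniform, k_fwd, k_bwd)
x_face = x[:-1] + 0.5 * h             # positions of the cell boundaries
J_exact = -D * (8.0 * (x_face - 0.5) - 3.0)
```

With these choices the discrete net flow across a cell boundary approximates the continuous flux at that boundary, and the approximation improves as the cells get smaller.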

4.2 Analysis of the gating current

In this section we are going to take a closer look at the properties of the gating current related to the Fokker-Planck model described in the previous section. We start with an analysis of the full 3D model in the first part and investigate a simplified one-dimensional model in the second part.

4.2.1 The general case

Let us state again the system of equations that models the probability density function $p(x, t)$ for the distribution of $M$ different particles in a three-dimensional domain $\Omega$ and the relation to macroscopic open probability and gating current (see equations (4.22)-(4.25), (4.15) and (4.27)):

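\[
J_j = -\Big[\eta_j(x_j)\Big(\nabla_{x_j}\mu_j(x) + z_j e_0 \nabla u(x_j) + z_j e_0 \sum_{\substack{i=1\\ i\neq j}}^{M} z_i e_0 \nabla_{x_j} G(x_j, x_i)\Big)\, p(x,t) + D_j \nabla_{x_j} p(x,t)\Big] \tag{4.33}
\]

\[
\frac{\partial p}{\partial t}(x,t) = -\sum_{j=1}^{M} \nabla_{x_j} \cdot J_j \tag{4.34}
\]

\[
J_j \cdot n = 0 \quad \text{on } \partial\Omega, \quad j = 1, \dots, M \tag{4.35}
\]

\[
p(x, 0) = p_0(x), \quad x \in \Omega^M \tag{4.36}
\]

\[
P_{open}(t) = \int_{\Omega^M} \omega(x)\, p(x,t)\, dx \tag{4.37}
\]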


\[
I_g(t) = M_c \cdot \sum_{j=1}^{M} \int_\Omega \int_\Omega \dots \int_\Omega -\frac{1}{1\,\mathrm{Volt}}\, z_j e_0 \nabla\Phi(x_j) \cdot J_j \; dx_1\, dx_2 \dots dx_M. \tag{4.38}
\]

Again $M_c$ denotes the number of channels and $M$ the number of particles contributing to one single-channel subsystem.

Throughout the analysis we are going to consider the following setup: The membrane patch under consideration is held at some applied potential $U_0$ for a sufficiently long time, such that the system reaches its equilibrium state. Then at time $t = 0$ the applied potential is changed from $U_0$ to $U_1$. In the general case, when $U_0$ corresponds to the resting potential or some other large hyperpolarized voltage, all channels in the membrane are supposed to be closed. The change in applied voltage to $U_1$ usually corresponds to a depolarization. If this depolarization is sufficiently large, the channels will eventually start to open and a macroscopic ionic current is going to develop. (For simplicity we assume only channels opening upon depolarization, i.e. closed at large negative voltages and open at positive voltages.) If the depolarization is not large enough to actually open the channel, one nevertheless expects a gating current to develop when there is a change in membrane potential towards less negative potentials.

We begin by considering the equilibrium state that the system will eventually move into when a constant voltage is applied for a sufficiently long time. The general equilibrium state for an applied potential $U$ is characterized by

\[
J_j = 0, \quad j = 1, \dots, M, \tag{4.39}
\]

i.e. the probability density flux for each particle vanishes throughout the whole domain $\Omega$. This leads to the following equation, with $p_{eq}$ denoting the equilibrium solution:

\[
\eta_j(x_j)\Big[\nabla_{x_j}\mu_j(x) + z_j e_0 \nabla u(x_j) + z_j e_0 \sum_{\substack{i=1\\ i\neq j}}^{M} z_i e_0 \nabla_{x_j} G(x_j, x_i)\Big]\, p_{eq} + D_j \nabla_{x_j} p_{eq} = 0 \tag{4.40}
\]

Keeping in mind that the applied electrostatic potential is changed from $U_0$ to $U_1$ at time $t = 0$, we can formulate the following lemma:

Lemma 4.3. If at time $t = 0$ the applied voltage is changed from $U_0$ to $U_1$, the probability density flux right after the voltage step satisfies

\[
J_j(0+) = -\eta_j\, z_j e_0\, \frac{U_1 - U_0}{1\,\mathrm{Volt}}\, \nabla\Phi(x_j)\; p_{eq,0}. \tag{4.41}
\]

Proof. Let $p_{eq,0}$ denote the equilibrium solution to the applied potential $U_0$. Right after the voltage step it holds for the probability density flux

\[
\begin{aligned}
J_j(0+) &= -\Big[\eta_j(x_j)\Big(\nabla_{x_j}\mu_j(x) + z_j e_0 \nabla u_1(x_j) + z_j e_0 \sum_{\substack{i=1\\ i\neq j}}^{M} z_i e_0 \nabla_{x_j} G(x_j, x_i)\Big)\, p_{eq,0} + D_j \nabla_{x_j} p_{eq,0}\Big] \\
&= -\Big[\eta_j(x_j)\, z_j e_0 \nabla\big(u_1(x_j) - u_0(x_j)\big)\, p_{eq,0}\Big],
\end{aligned}
\]

making use of (4.40). Due to the linearity of the Poisson equation it generally holds that for homogeneous right-hand side the two solutions $u_1$ and $u_2$ corresponding to the boundary conditions $U_1$ and $U_2$ are related via $u_1(x)\, U_2 = u_2(x)\, U_1$. Hence we can express the potentials $u_0$ and $u_1$ in the above equation by means of the unit potential $\Phi$:

\[
u_0(x_j) = \frac{U_0}{1\,\mathrm{Volt}}\, \Phi(x_j), \qquad u_1(x_j) = \frac{U_1}{1\,\mathrm{Volt}}\, \Phi(x_j).
\]

This yields

\[
J_j(0+) = -\eta_j(x_j)\, z_j e_0\, \frac{U_1 - U_0}{1\,\mathrm{Volt}}\, \nabla\Phi(x_j)\; p_{eq,0},
\]

the statement of the lemma.

Making use of the above result, we next turn to the fact that whenever the applied membrane voltage is changed, there occurs an immediate jump in the gating current. As can be seen in Figure 4.13, no matter which prepulse is applied, the jump in the gating current occurs right after the depolarization.

Figure 4.13: (a) Gating currents from a squid giant axon for prepulses to −50, −70 and −110 mV followed by a depolarization to 0 mV; (b) shifted gating currents (figure taken from [113]).

This is different from the behaviour seen in the macroscopic ionic current, where a time delay arises when starting with more negative electrostatic potentials (see Figure 5.1).

Figure 4.14: (a) Macroscopic sodium currents from a squid giant axon for prepulses to −50, −70 and −140 mV followed by a depolarization to 0 mV; (b) shifted ionic currents (figure taken from [113]).

This time delay in the developing macroscopic ionic current is also referred to as the Cole-Moore shift, an effect first investigated in the 1950s ([23]). In their experiments on the squid giant axon membrane, Cole and Moore realized that when applying different hyperpolarizing prepulses before depolarizing the membrane, the ionic current will show the same kinetic development but with a certain time delay depending on the amplitude and duration of the hyperpolarizing prepulse. As can be seen in Figure 4.13, in the case of the gating currents the kinetics are also somewhat shifted, but the onset of the gating current stays the same, beginning right after the voltage step. This effect can be explained using our above model of the gating current, and we can formulate the following theorem:

Theorem 4.4. Let at time $t = 0$ the applied potential be changed from $U_0$ to $U_1$ and let $p_{eq,0}$ denote the equilibrium solution to $U_0$. Then the amplitude of the jump in the gating current, $I_g(0+)$, after the voltage step is given by

\[
I_g(0+) = M_c \cdot \frac{U_1 - U_0}{(1\,\mathrm{Volt})^2} \sum_{j=1}^{M} (z_j e_0)^2 \int_\Omega \int_\Omega \dots \int_\Omega \eta_j\, |\nabla\Phi(x_j)|^2\, p_{eq,0} \; dx_1\, dx_2 \dots dx_M. \tag{4.42}
\]

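Proof. According to our definition (4.38) of the gating current, right after the voltage step we have

\[
\begin{aligned}
I_g(0+) &= M_c \cdot \sum_{j=1}^{M} \int_\Omega \int_\Omega \dots \int_\Omega -\frac{1}{1\,\mathrm{Volt}}\, z_j e_0 \nabla\Phi(x_j) \cdot J_j(0+) \; dx_1\, dx_2 \dots dx_M \\
&= M_c \cdot \sum_{j=1}^{M} \int_\Omega \int_\Omega \dots \int_\Omega (z_j e_0)^2\, \eta_j\, \frac{U_1 - U_0}{(1\,\mathrm{Volt})^2}\, |\nabla\Phi(x_j)|^2\, p_{eq,0} \; dx_1\, dx_2 \dots dx_M \\
&= M_c \cdot \frac{U_1 - U_0}{(1\,\mathrm{Volt})^2} \sum_{j=1}^{M} (z_j e_0)^2 \int_\Omega \int_\Omega \dots \int_\Omega \eta_j\, |\nabla\Phi(x_j)|^2\, p_{eq,0} \; dx_1\, dx_2 \dots dx_M,
\end{aligned}
\]

where the second equality follows from Lemma 4.3.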

The statement of Theorem 4.4 can be further simplified if we assume that the unit potential to leading order varies linearly in the region of interest, which implies that $\nabla\Phi$ is approximately constant. This is a reasonable assumption if we consider the fact that most of the electrostatic potential falls off across the membrane and the major contribution to the gating current comes from mobile charges moving within the membrane protein. Assigning a uniform dielectric constant to the protein region leads to a linearly varying electrostatic potential $\Phi$ in this area. Furthermore assuming approximately constant mobilities $\eta_j$, the jump in the gating current to leading order is given by

\[
I_g(0+) \approx M_c \cdot \frac{U_1 - U_0}{(1\,\mathrm{Volt})^2}\, |\nabla\Phi|^2 \sum_{j=1}^{M} (z_j e_0)^2\, \eta_j \underbrace{\int_\Omega \int_\Omega \dots \int_\Omega p_{eq,0} \; dx_1\, dx_2 \dots dx_M}_{=1}. \tag{4.43}
\]

Equation (4.43) now tells us that as soon as there are mobile charges present in the system ($\eta_j > 0$), there will be a current whenever the applied electrostatic potential is changed! Thus from our theoretical investigations we cannot expect a time delay in the onset of the gating current, which is confirmed by experimental results (compare Figure 4.13). Furthermore from equation (4.43) we see that the amplitude of the initial jump in the gating current is proportional to the change in the applied voltage. The larger the difference between $U_0$ and $U_1$, the bigger the initial jump becomes. Interestingly, the precise location of the individual charges does not seem to matter for the existence and amplitude of the initial jump, as long as the mobilities can be approximated as constant.

Also the self-consistent interaction forces among the mobile charges are not of leading-order importance for the initial jump, which implies that the very first instant is dominated by the change in the external boundary conditions, and only afterwards do the interaction forces gain influence. This appears reasonable when we associate the initial jump with the capacitance current that occurs due to a change in the applied membrane voltage. It is related to a charging of the membrane and does not reflect a movement of the gating charges inside the channel protein. In the very beginning the capacitance current is superimposed on the gating current due to protein charge movement, resulting in the initial spike seen in measurements. But since this charging of the membrane occurs on a much faster time scale than the gating movements, the capacitance current decays very fast and only the current due to movement of gating charges (i.e. the informative part of the measurements) remains. Hence it makes sense to consider the gating current measurements only from some small time $\tau > 0$ onwards, when the influence of the capacitance current has ceased and mostly the information related to the gating process is contained in the data. For experimental investigations there are also certain procedures to correct for the capacitance current, e.g. so-called P/4 methods ([101]) that assume a linear behaviour of the capacitance, while the gating charge movement is assumed to be nonlinear in nature.

Apart from the immediate jump in the current when the applied membrane potential is changed, another experimentally well-known observation is that the decaying phase of the gating currents can usually be fitted by the sum of several exponentials. As we have seen in Section 4.1.1 at the beginning of this chapter, using the discrete state Markov model approach one can prove that the system converges towards its equilibrium state as a sum of exponentials. The equilibrium distribution corresponds to zero gating current, since no changes take place any more and thus no charges are moving. In the following paragraph we will show that the Fokker-Planck model also explains the exponential decay of the gating current for sufficiently large times $t$.

The decay kinetics of the gating current are governed by the time-dependent development of the probability density $p(x, t)$. In order to be able to analyse the long-term behaviour of $p$, we need to take a closer look at the Fokker-Planck operator defining the time evolution of $p$. For simplicity of writing we introduce the potential $W_j(x)$, an abbreviated notation for the external and interaction potentials in the Fokker-Planck equation:

Definition 4.5. The overall drift potential $W_j(x)$ for particle $j$ is defined via

\[
W_j(x) = \mu_j(x) + z_j e_0\, u(x_j) + z_j e_0 \sum_{\substack{i=1\\ i\neq j}}^{M} z_i e_0\, G(x_j, x_i). \tag{4.44}
\]

With the above definition we can now write the Fokker-Planck operator in the following form:

Definition 4.6. Let $L$ denote the Fokker-Planck operator defined as

\[
Lp = -\sum_{j=1}^{M} \nabla_{x_j} \cdot \Big[\eta_j(x_j)\, \nabla_{x_j} W_j(x)\, p(x,t) + D_j \nabla_{x_j} p(x,t)\Big] \tag{4.45}
\]

such that

\[
\frac{\partial p}{\partial t} = -Lp.
\]


Furthermore we introduce a weighted scalar product:

Definition 4.7. Let $\langle \cdot, \cdot \rangle_w$ denote the scalar product given by

\[
\langle p, q \rangle_w = \int_{\Omega^M} p\, q\, w \; dx^M, \tag{4.46}
\]

with

\[
w = \exp(-\Upsilon),
\]

where $\Upsilon$ is defined via the relation

\[
\nabla_{x_j} \Upsilon = -\frac{\eta_j}{D_j}\, \nabla_{x_j} W_j.
\]

With the above definitions it follows that

\[
\nabla_{x_j} w = w\, \frac{\eta_j}{D_j}\, \nabla_{x_j} W_j, \tag{4.47}
\]

a fact that we are now going to use to show that $L$ is a symmetric positive-semidefinite operator in an appropriate scalar product. In the one-dimensional case it turns out that $w$ exactly corresponds to the inverse of the equilibrium solution of the system.

Proposition 4.8. The operator $L$ defined by (4.45) is symmetric and positive-semidefinite with respect to the scalar product introduced in (4.46) on the set of functions fulfilling the no-flux boundary conditions

\[
J_j \cdot n = 0 \quad \text{on } \partial\Omega, \quad j = 1, \dots, M.
\]

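Proof. For the symmetry of the operator $L$ we need to show that $\langle q, Lp \rangle_w = \langle Lq, p \rangle_w$:

\[
\begin{aligned}
\langle q, Lp \rangle_w &= \int_{\Omega^M} q\, (Lp)\, w \; dx^M \\
&= \int_{\Omega^M} q \Big(-\sum_{j=1}^{M} \nabla_{x_j} \cdot \big[\eta_j \nabla_{x_j} W_j\, p + D_j \nabla_{x_j} p\big]\Big)\, w \; dx^M \\
&= \sum_{j=1}^{M} \int_{\Omega^M} \nabla_{x_j}(q w) \cdot \big(\eta_j \nabla_{x_j} W_j\, p + D_j \nabla_{x_j} p\big) \; dx^M \\
&= \sum_{j=1}^{M} \int_{\Omega^M} \big(\nabla_{x_j} q\; w + q\, \nabla_{x_j} w\big) \cdot \big(\eta_j \nabla_{x_j} W_j\, p + D_j \nabla_{x_j} p\big) \; dx^M \\
&= \sum_{j=1}^{M} \int_{\Omega^M} \Big(\nabla_{x_j} q\; w + q\, w\, \frac{\eta_j}{D_j} \nabla_{x_j} W_j\Big) \cdot \big(\eta_j \nabla_{x_j} W_j\, p + D_j \nabla_{x_j} p\big) \; dx^M \\
&= \sum_{j=1}^{M} \int_{\Omega^M} \big(D_j \nabla_{x_j} q + q\, \eta_j \nabla_{x_j} W_j\big)\, w \cdot \Big(\frac{\eta_j}{D_j} \nabla_{x_j} W_j\, p + \nabla_{x_j} p\Big) \; dx^M \\
&= \sum_{j=1}^{M} \int_{\Omega^M} \big(D_j \nabla_{x_j} q + q\, \eta_j \nabla_{x_j} W_j\big) \cdot \nabla_{x_j}(w p) \; dx^M \\
&= \sum_{j=1}^{M} \int_{\Omega^M} -\nabla_{x_j} \cdot \big(\eta_j \nabla_{x_j} W_j\, q + D_j \nabla_{x_j} q\big)\, p\, w \; dx^M \\
&= \int_{\Omega^M} (Lq)\, p\, w \; dx^M \\
&= \langle Lq, p \rangle_w,
\end{aligned}
\]

where we have used Gauß' theorem twice and the fact that the boundary terms vanish due to the no-flux conditions imposed on $\partial\Omega$. The positive-semidefiniteness follows along the same lines,

\[
\begin{aligned}
\langle p, Lp \rangle_w &= \sum_{j=1}^{M} \int_{\Omega^M} \big(D_j \nabla_{x_j} p + p\, \eta_j \nabla_{x_j} W_j\big)\, \frac{w}{D_j} \cdot \big(\eta_j \nabla_{x_j} W_j\, p + D_j \nabla_{x_j} p\big) \; dx^M \\
&= \sum_{j=1}^{M} \int_{\Omega^M} \big|D_j \nabla_{x_j} p + p\, \eta_j \nabla_{x_j} W_j\big|^2\, \frac{w}{D_j} \; dx^M \\
&\ge 0,
\end{aligned}
\]

since $D_j > 0$ and $w > 0$ by definition. Note that for $p = p_{eq}$, the equilibrium solution, it holds that $D_j \nabla_{x_j} p + p\, \eta_j \nabla_{x_j} W_j = 0$ for all $j$ and hence we only get semidefiniteness instead of definiteness.

With the above properties of the Fokker-Planck operator at hand, we can now consider the time-dependent development of the probability density $p$ after a voltage step.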

Theorem 4.9. The probability density $p(x, t)$ governed by the Fokker-Planck equation

\[
\frac{\partial p}{\partial t} = -Lp,
\]

with boundary conditions given by $J_j \cdot n = 0$ on $\partial\Omega$ and initial condition $p(x, 0) = p_0(x)$, decays exponentially to its equilibrium solution as $t \to \infty$.

Proof. Let $L^2_w$ denote the $L^2$ space with the weighted scalar product introduced in (4.46) and functions fulfilling the no-flux boundary conditions (compare Proposition 4.8), and let $D^\perp \subset L^2_w$ denote the orthogonal complement to the nullspace $D$ of $L$. Every solution $p$ of the Fokker-Planck equation can then be expressed in the form

\[
p(x, t) = p_{eq}(x) + \tilde p(x, t)
\]

with $p_{eq} \in D$ the equilibrium solution and $\tilde p \in D^\perp$. We consider the non-equilibrium part $\tilde p$. $L : D^\perp \to L^2_w$ is injective and surjective and hence its inverse $L^{-1}$ exists. $L^{-1}$ is compact and symmetric (compare Proposition 4.8) and with the spectral theorem it follows that $\tilde p$ can be expressed in the form

\[
\tilde p(x, t) = \sum_{k=1}^{\infty} T_k(t)\, Y_k(x)
\]

with $Y_k$ forming an orthonormal system of eigenvectors of $L^{-1}$, the eigenvalues $\mu_k$ of $L^{-1}$ being all positive. Inserting this expression for $\tilde p$ into the Fokker-Planck equation leads to the following ODE for each eigenvalue $\mu_k$,

\[
\dot T_k(t) = -\frac{1}{\mu_k}\, T_k(t), \tag{4.48}
\]

since $\frac{\partial \tilde p}{\partial t}(x, t) = \sum_{k=1}^{\infty} \dot T_k(t)\, Y_k(x)$ and

\[
-L\tilde p(x, t) = -\sum_{k=1}^{\infty} T_k(t)\, L Y_k(x) = -\sum_{k=1}^{\infty} T_k(t)\, \frac{1}{\mu_k}\, Y_k(x).
\]

Denoting by $\lambda_k = \frac{1}{\mu_k}$ the nonzero eigenvalues of $L$, the general solution of (4.48) is given by

\[
T_k(t) = a_k\, e^{-\lambda_k t}.
\]

The solution $\tilde p$ can hence be expressed as a Fourier series of the form

\[
\tilde p(x, t) = \sum_{k=1}^{\infty} T_k(t)\, Y_k(x) = \sum_{k=1}^{\infty} a_k\, Y_k(x)\, e^{-\lambda_k t},
\]

where the coefficients $a_k$ are determined from the initial condition, which can also be expanded into a Fourier series with respect to $Y_k$,

\[
p_0(x) = \sum_{k=1}^{\infty} p_{0k}\, Y_k(x) \quad \text{with} \quad p_{0k} := \int_{\Omega^M} p_0(x)\, Y_k(x)\, dx,
\]

and thus we get $a_k = p_{0k}$.

The final solution to our original Fokker-Planck equation is then given by

\[
p(x, t) = p_{eq}(x) + \sum_{k=1}^{\infty} p_{0k}\, Y_k(x)\, e^{-\lambda_k t}
\]

and consequently for $t \to \infty$ we have an exponential decay towards the equilibrium solution.

From the above proof we see that the kinetic behaviour of the probability density $p$ for large times is dominated by the smallest non-zero eigenvalues of the Fokker-Planck operator $L$.
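The spectral picture can be checked on a small discrete analogue. The sketch below is an illustration only: it assumes the dimensionless one-dimensional setting, a nearest-neighbour discretization with a hypothetical exponential rate law, and a hypothetical convex drift potential. It builds a rate matrix `A` with $\dot s = A s$ (the discrete counterpart of $-L$), checks symmetry in the scalar product weighted by the inverse equilibrium solution, and lists the decay rates.

```python
import numpy as np

def generator(W, D, h):
    """Rate matrix A (ds/dt = A s) for a nearest-neighbour discretization.

    The rate law D/h^2 * exp(-+dW/2) is an assumption made for this sketch.
    """
    n = len(W)
    dW = np.diff(W)
    k_fwd = D / h**2 * np.exp(-dW / 2.0)   # cell i   -> cell i+1
    k_bwd = D / h**2 * np.exp(+dW / 2.0)   # cell i+1 -> cell i
    A = np.zeros((n, n))
    i = np.arange(n - 1)
    A[i + 1, i] += k_fwd
    A[i, i] -= k_fwd
    A[i, i + 1] += k_bwd
    A[i + 1, i + 1] -= k_bwd
    return A

n = 80
h = 1.0 / n
x = (np.arange(n) + 0.5) * h
W = 4.0 * (x - 0.5) ** 2 - 2.0 * x     # hypothetical convex drift potential
A = generator(W, D=1.0, h=h)

s_eq = np.exp(-W)
s_eq /= s_eq.sum()

# Discrete analogue of the weight w: the inverse of the equilibrium solution.
weighted = A / s_eq[:, None]           # equals diag(1/s_eq) @ A
is_symmetric = np.allclose(weighted, weighted.T)

# Similarity transform to a symmetric matrix with the same eigenvalues as A.
sq = np.sqrt(s_eq)
S = A * (sq[None, :] / sq[:, None])
S = 0.5 * (S + S.T)                    # remove rounding asymmetry
decay_rates = -np.linalg.eigvalsh(S)[::-1]   # ascending: 0, lambda_1, lambda_2, ...
```

Here `decay_rates[0]` corresponds to the equilibrium solution (rate zero), and `decay_rates[1]` is the slowest decay rate that governs the long-time behaviour in this discrete setting.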

So far we have considered the time immediately after a voltage step (the jump in the gating current) and the long-term behaviour of the gating current. What we have left out so far is the intermediate time span. What happens after the immediate effect of the voltage change has ceased but before the long-term behaviour, namely the exponential decay, sets in? Several experimental studies concerned with ion channel gating and conduction (e.g. [8]) have shown that under certain conditions the gating current need not be monotonically decreasing. It can happen that the gating current exhibits a rising phase before the multi-exponential decay becomes the dominating kinetics. So we would also like to investigate under what prerequisites and assumptions our Fokker-Planck model is able to generate a rising phase in the macroscopic gating current. A rising phase in this context refers to an explicit increase in the gating current visible after the capacitance current has ceased, i.e. after some small time $\tau > 0$.

In order to investigate this question, we turn to a simplified one-dimensional model in the next section.

4.2.2 The one-dimensional model

In this section we are going to consider a simplified one-dimensional model in order to get some qualitative insights into the system behaviour. We start by making some restrictive assumptions on the single-channel system underlying the Fokker-Planck equation. Assume for simplicity that there is basically only one particle moving within our single-channel subsystem (and hence producing the gating current). This means we are ignoring the fact that inside the voltage sensor there is actually an interplay between several charged flexible residues. Furthermore we assume that our single “gating particle” is only moving in the direction perpendicular to the membrane, thereby reducing our model from three dimensions to one. For the moment we also ignore the presence of baths (and bath ions) beside the membrane and consider the electrodes to be attached directly to the protein surface on both sides. The region between the electrodes is assigned just a single dielectric coefficient $\epsilon$; no dielectric boundaries are present. This simple geometry causes the electrostatic potential to vary linearly between the two electrodes. For convenience we consider the whole system to live on the interval $[0, L]$; the voltage is applied at the left side at $x = 0$, and the right-hand side $x = L$ is grounded to zero. The variable $L$ gives the system dimension (e.g. 10 Å) for the range of movement of the gating charge. Furthermore we consider the right-hand side of the system to represent the open state of the channel, meaning that when the channel opens the probability density will shift towards $x = L$. Figure 4.15 shows a sketch of the simplified one-dimensional model.

Figure 4.15: Sketch of the simplified 1D setup (interval from 0 to $L$ between inside and outside, with the applied voltage $U$).

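The model equations related to this setup read as

\[
J = -\Big[\eta\Big(\frac{d\mu}{dx}\, p + z e_0 \frac{dV}{dx}\, p\Big) + D \frac{\partial p}{\partial x}\Big] \tag{4.49}
\]

\[
\frac{\partial p}{\partial t}(x, t) = -\frac{\partial J}{\partial x} \tag{4.50}
\]

\[
J = 0 \quad \text{for } x = 0,\ x = L \tag{4.51}
\]

\[
p(x, 0) = p_0(x), \quad x \in [0, L] \tag{4.52}
\]

\[
P_{open}(t) = \int_0^L \omega(x)\, p(x, t)\, dx \tag{4.53}
\]

\[
I_g(t) = M_c \int_0^L -\frac{1}{1\,\mathrm{Volt}}\, z e_0\, \frac{d\Phi}{dx}\, J \, dx. \tag{4.54}
\]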

Note that in this case $x \in \mathbb{R}$ is a one-dimensional variable. The potential $\mu$ appearing in equation (4.49) represents the time-invariant protein structure, assumed to be independent of the applied voltage $U$. It can be seen as the protein-intrinsic restrictions on the movement of the gating particle, generated e.g. by charged residues of the voltage sensor and nearby channel parts that are not explicitly considered as particles in this simplified model. The electrostatic potential $V$ is just a linear function in our specified geometry:

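\[
V(x) = U\Big(1 - \frac{x}{L}\Big),
\]

where $U$ is the voltage applied at $x = 0$, so that $V(0) = U$ and $V(L) = 0$. The same applies for the unit potential $\Phi$, which is the electrostatic potential under application of 1 Volt:

\[
\Phi(x) = 1\,\mathrm{Volt} \cdot \Big(1 - \frac{x}{L}\Big).
\]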

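Furthermore we assume that Einstein's relation holds, expressing the mobility $\eta$ via the diffusion coefficient: $\eta = D/(k_B T)$. For convenience we rescale our above model equations and introduce the following non-dimensionalized quantities. Let $L$ denote the system length and $\tau$ some typical timescale of the system (e.g. milliseconds). Then we can define the dimensionless variables

\[
\begin{aligned}
\hat x &= x/L, & \hat t &= t/\tau, \\
\hat U &= \frac{e_0 U}{k_B T}, & \hat\mu(\hat x) &= \frac{\mu(L \hat x)}{k_B T}, \\
\hat D &= \frac{\tau}{L^2}\, D, & \hat p(\hat x, \hat t) &= p(L \hat x, \tau \hat t) \cdot L, \\
\hat\Phi &= \frac{\Phi}{1\,\mathrm{Volt}}, & \hat\omega(\hat x) &= \omega(L \hat x).
\end{aligned}
\]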

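The rescaled system now lives on the unit interval $[0, 1]$ and the corresponding set of equations reads as

\[
\hat J = -\Big[\hat D \Big(\frac{d\hat\mu}{d\hat x} + z \frac{d\hat V}{d\hat x}\Big)\, \hat p + \hat D \frac{\partial \hat p}{\partial \hat x}\Big] \tag{4.55}
\]

\[
\frac{\partial \hat p}{\partial \hat t}(\hat x, \hat t) = -\frac{\partial \hat J}{\partial \hat x} \tag{4.56}
\]

\[
\hat J = 0 \quad \text{for } \hat x = 0,\ \hat x = 1 \tag{4.57}
\]

\[
\hat p(\hat x, 0) = \hat p_0(\hat x), \quad \hat x \in [0, 1] \tag{4.58}
\]

\[
P_{open}(\hat t) = \int_0^1 \hat\omega(\hat x)\, \hat p(\hat x, \hat t)\, d\hat x \tag{4.59}
\]

\[
I_g(\hat t) = M_c \int_0^1 -z e_0\, \frac{1}{\tau}\, \frac{d\hat\Phi}{d\hat x}\, \hat J \, d\hat x. \tag{4.60}
\]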

For simplicity of reading and writing we are going to omit the hat over the scaled quantities again in the sequel. Note that potentials can now be considered as given in units of $k_B T$ and the applied electrostatic potential $U$ is scaled by the thermal voltage $k_B T/e_0$.
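To make the scaling concrete, the following small sketch converts a voltage and a diffusion coefficient into the dimensionless quantities defined above. The temperature (293.15 K) and the sample values for $D$, $L$ and $\tau$ are assumptions chosen for illustration only; they are not taken from this work.

```python
K_B = 1.380649e-23      # Boltzmann constant in J/K
E_0 = 1.602176634e-19   # elementary charge in C

def thermal_voltage(temperature_kelvin):
    """Thermal voltage k_B T / e_0 in Volt."""
    return K_B * temperature_kelvin / E_0

def scale_voltage(u_volt, temperature_kelvin=293.15):
    """Dimensionless voltage e_0 U / (k_B T); the default temperature is an assumption."""
    return u_volt / thermal_voltage(temperature_kelvin)

def scale_diffusion(d_m2_per_s, length_m, tau_s):
    """Dimensionless diffusion coefficient (tau / L^2) * D."""
    return tau_s / length_m**2 * d_m2_per_s

u_hat = scale_voltage(0.080)                 # an 80 mV voltage difference
d_hat = scale_diffusion(1e-13, 1e-9, 1e-3)   # hypothetical D, L = 1 nm, tau = 1 ms
```

At the assumed temperature the thermal voltage is roughly 25 mV, so a step of 80 mV corresponds to a dimensionless voltage difference of a little more than 3.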


Taking into account the linear nature of the electrostatic potentials $V$ and $\Phi$, their spatial derivatives are simply given by $\frac{dV}{dx} = -U$ and $\frac{d\Phi}{dx} = -1$. Insertion of these values into (4.55) and (4.60) leads to the following expressions for the probability density flux and the gating current,

\[
J = -\Big[D\Big(\frac{d\mu}{dx} - zU\Big)\, p + D \frac{\partial p}{\partial x}\Big] \tag{4.61}
\]

and

\[
I_g(t) = c \int_0^1 z\, J \, dx, \tag{4.62}
\]

where we have set the constant $c = M_c\, e_0/\tau$.

For this one-dimensional model we can compute the equilibrium distribution $p_{eq}(x)$, when we assume that the electrostatic potential $U$ has been applied for a sufficiently long time. From the equilibrium condition $J = 0$ it follows that

\[
p_{eq}(x) = c_0 \exp\Big(-\int_0^x \Big(\frac{d\mu}{dx}(s) - zU\Big)\, ds\Big)
\]

and hence

\[
p_{eq}(x) = c_0\, e^{-(\mu(x) - zUx)}. \tag{4.63}
\]

The constant $c_0$ is determined via the normalization condition $\int_0^1 p \, dx = 1$, signifying that the gating particle has to be somewhere between 0 and 1. With this condition it follows

\[
c_0 = \Big(\int_0^1 e^{-(\mu(x) - zUx)}\, dx\Big)^{-1}.
\]
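The equilibrium distribution (4.63) and its normalization constant are easy to evaluate numerically. The sketch below does this on a uniform grid with a simple trapezoidal rule; the quadratic potential, the valence and the two dimensionless voltages are hypothetical values chosen only to show how the density shifts with the applied voltage.

```python
import numpy as np

def trapezoid(f, x):
    """Composite trapezoidal rule for samples f on the grid x."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def equilibrium_density(mu, z, U, x):
    """Equilibrium density p_eq = c0 * exp(-(mu - z U x)), normalized on the grid x."""
    weight = np.exp(-(mu - z * U * x))
    c0 = 1.0 / trapezoid(weight, x)
    return c0 * weight

x = np.linspace(0.0, 1.0, 2001)
mu = 4.0 * (x - 0.5) ** 2      # hypothetical protein-intrinsic potential (units of k_B T)
z = 2.0                        # hypothetical valence of the gating particle

p_neg = equilibrium_density(mu, z, U=-3.0, x=x)   # hyperpolarized (dimensionless) voltage
p_pos = equilibrium_density(mu, z, U=+2.0, x=x)   # depolarized (dimensionless) voltage

mean_neg = trapezoid(x * p_neg, x)   # expectation value of x for U = -3
mean_pos = trapezoid(x * p_pos, x)   # expectation value of x for U = +2
```

For these sample values the density sits near $x = 0$ at the negative voltage and near $x = 1$ at the positive voltage, in line with the convention that the right-hand side represents the open state.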

Next we are going to use the above stated model to investigate the existence of the characteristic features that we already mentioned in the context of the general multi-dimensional model. Recall that those features with respect to the macroscopic gating current were

• an immediate jump in the gating current when the membrane potential is changed;

• the existence of a rising phase in the gating current (under appropriate conditions).

Again we assume that the membrane voltage is stepped from $U_0$ to $U_1$ at time $t = 0$ and that the first voltage $U_0$ has been applied long enough for the system to reach its equilibrium state $p_{eq,U_0}$. The gating current right after the voltage step is then given by

\[
\begin{aligned}
I_g(0+) &= c \int_0^1 -z\Big[D\Big(\frac{d\mu}{dx} - zU_1\Big)\, p_{eq,U_0} + D \frac{\partial p_{eq,U_0}}{\partial x}\Big]\, dx \\
&= c \int_0^1 -z^2 D\, (U_0 - U_1)\, p_{eq,U_0}\, dx \\
&= c\, (U_1 - U_0) \int_0^1 z^2 D\, p_{eq,U_0}\, dx \\
&= c\, (U_1 - U_0)\, z^2 D,
\end{aligned}
\]

where the last equality only holds under the assumption of a constant diffusion coefficient $D$ and constant charge of the gating particle throughout the whole domain. As expected from the analysis of the general case, in the simplified one-dimensional setup the height of the jump in the gating current is in fact mainly determined by the change in membrane voltage. It does not depend on the detailed distribution of $p_{eq,U_0}$ before the voltage step and is not influenced by the shape of the intrinsic potential $\mu$. This can also be nicely seen in the numerical examples in Figure 4.16.

Figure 4.16: Gating currents for different applied electrostatic potentials and two different protein-intrinsic potentials $\mu$ (axes: $t$ [ms] and $I_{gate}$ [$\tau/(M_c e_0)$]; marked value $I_g(0+) = 13.82$). The difference in applied voltage is constant, $\Delta U = U_1 - U_0 = 80$ mV. The line of the arrow indicates the theoretical result $I_g(0+) = c\,(U_1 - U_0)\, z^2 D$. Blue: quadratic $\mu$, $U_0 = -80$ mV, $U_1 = 0$ mV; red: quadratic $\mu$, $U_0 = -50$ mV, $U_1 = 30$ mV; cyan: $\mu = 0$, $U_0 = -80$ mV, $U_1 = 0$ mV; green: $\mu = 0$, $U_0 = -50$ mV, $U_1 = 30$ mV.

In these examples we consider two different protein-intrinsic potentials $\mu$, and for each potential $\mu$ we apply two different initial voltages $U_0$. In each case the voltage is then increased by an increment of 80 mV and we see that the initial jump in the gating current is the same in all tests, although the further development of the gating current differs depending on the applied voltage and protein-intrinsic potential $\mu$.
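The independence of the initial jump from $\mu$ and $U_0$ can also be checked directly from (4.61) and (4.62), without solving the time-dependent problem. The following sketch evaluates the flux right after the step with finite differences. It is an illustration under assumptions: the valence, the diffusion coefficient, the quadratic potential and the thermal voltage used to scale the voltages are hypothetical choices, so the resulting number is not meant to reproduce the value marked in Figure 4.16.

```python
import numpy as np

def trapezoid(f, x):
    """Composite trapezoidal rule for samples f on the grid x."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def equilibrium_density(mu, z, U, x):
    """Normalized equilibrium density c0 * exp(-(mu - z U x)), cf. (4.63)."""
    weight = np.exp(-(mu - z * U * x))
    return weight / trapezoid(weight, x)

def initial_jump(mu, z, D, U0, U1, x, c=1.0):
    """Gating current right after a step U0 -> U1, from (4.61) and (4.62)."""
    p = equilibrium_density(mu, z, U0, x)
    dmu_dx = np.gradient(mu, x, edge_order=2)
    dp_dx = np.gradient(p, x, edge_order=2)
    J = -(D * (dmu_dx - z * U1) * p + D * dp_dx)   # flux with the new voltage U1
    return c * trapezoid(z * J, x)

x = np.linspace(0.0, 1.0, 4001)
z, D = 2.0, 1.0                    # hypothetical valence and diffusion coefficient
thermal = 25.26e-3                 # assumed thermal voltage in Volt
mu_quadratic = 4.0 * (x - 0.5) ** 2
mu_zero = np.zeros_like(x)

cases = {
    "quadratic, -80 -> 0 mV": (mu_quadratic, -80e-3, 0.0),
    "quadratic, -50 -> 30 mV": (mu_quadratic, -50e-3, 30e-3),
    "zero, -80 -> 0 mV": (mu_zero, -80e-3, 0.0),
    "zero, -50 -> 30 mV": (mu_zero, -50e-3, 30e-3),
}
jumps = {
    name: initial_jump(mu, z, D, U0 / thermal, U1 / thermal, x)
    for name, (mu, U0, U1) in cases.items()
}
expected = (80e-3 / thermal) * z**2 * D   # c (U1 - U0) z^2 D with c = 1
```

All four entries of `jumps` agree with `expected` up to discretization error, which mirrors the statement that only the voltage difference enters the initial jump.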

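In order to get some insight into what happens during the subsequent time course, we consider the time derivative of the gating current:

\[
\begin{aligned}
\frac{dI_g}{dt} &= c \int_0^1 z\, \frac{\partial J}{\partial t}\, dx \\
&= -c \int_0^1 z\Big[D\Big(\frac{d\mu}{dx} - zU_1\Big) \frac{\partial p}{\partial t} + D \frac{\partial}{\partial x} \frac{\partial p}{\partial t}\Big]\, dx
\end{aligned} \tag{4.64}
\]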

for $t \ge \tau$ (remember that we defined a rising phase as an increase in the gating current after some small time $\tau > 0$). We are interested in the question under what circumstances we find a rising phase in the gating current and which circumstances make it impossible. We start by considering the simplest case of a constant or linear protein-intrinsic potential $\mu$. Furthermore assuming constant diffusion coefficient $D$ as well as charge valence $z$, the above equation (4.64) can be rewritten as

\[
\frac{dI_g}{dt} = -c\, zD \left\{ \Big(\frac{d\mu}{dx} - zU_1\Big) \int_0^1 \frac{\partial p}{\partial t}\, dx + \int_0^1 \frac{\partial}{\partial x} \frac{\partial p}{\partial t}\, dx \right\}.
\]

The term $\int_0^1 \frac{\partial p}{\partial t}\, dx$ vanishes for $p$ fulfilling the no-flux boundary conditions and we remain with

\[
\frac{dI_g}{dt} = -c\, zD \Big( \frac{\partial p}{\partial t}(1, t) - \frac{\partial p}{\partial t}(0, t) \Big).
\]

For $U_1 > U_0$ the probability density will shift towards $x = 1$ (under the presumption that after the voltage step the open state is favoured) and hence we have $\frac{\partial p}{\partial t}(1, t) \ge 0$ and $\frac{\partial p}{\partial t}(0, t) \le 0$, which overall gives

\[
\frac{dI_g}{dt} \le 0.
\]

This means that an at most linear potential landscape $\mu$ will not be able to generate a rising phase in the gating current. Instead we will not see a change in the gating current as long as $\frac{\partial p}{\partial t}(0, t) = 0$ and $\frac{\partial p}{\partial t}(1, t) = 0$. Put in descriptive terms, when all channels have left $x = 0$ but have not reached $x = 1$ yet, the gating current will remain constant.

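Doing the above analysis for a nonlinear potential is a bit more involved. We start again with the time derivative (4.64) of the gating current and consider a quadratic potential $\mu(x) = c_1 (x - c_2)^2 + c_3$, with $c_1$, $c_2$ and $c_3$ some constants.

\[
\begin{aligned}
\frac{dI_g}{dt} &= c \int_0^1 z\, \frac{\partial J}{\partial t}\, dx \\
&= -c \int_0^1 z\Big[D\Big(\frac{d\mu}{dx} - zU_1\Big) \frac{\partial p}{\partial t} + D \frac{\partial}{\partial x} \frac{\partial p}{\partial t}\Big]\, dx \\
&= -c \int_0^1 z\Big[D\big(2c_1 (x - c_2) - zU_1\big) \frac{\partial p}{\partial t} + D \frac{\partial}{\partial x} \frac{\partial p}{\partial t}\Big]\, dx \\
&= -c\, zD \left\{ 2c_1 \int_0^1 x\, \frac{\partial p}{\partial t}\, dx - \big(2c_1 c_2 + zU_1\big) \int_0^1 \frac{\partial p}{\partial t}\, dx + \int_0^1 \frac{\partial}{\partial x} \frac{\partial p}{\partial t}\, dx \right\}
\end{aligned} \tag{4.65}
\]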

For the last equality we have used, as before, the assumption that $D$ and $z$ are constant. For $p$ fulfilling the no-flux boundary condition the second term on the right-hand side vanishes and we remain with

\[
\frac{dI_g}{dt} = -c\, zD \left\{ 2c_1 \int_0^1 x\, \frac{\partial p}{\partial t}\, dx + \Big( \frac{\partial p}{\partial t}(1, t) - \frac{\partial p}{\partial t}(0, t) \Big) \right\}.
\]

The difference of the boundary values of the time derivative of the probability density can again be considered as positive (see above), $\frac{\partial p}{\partial t}(1, t) - \frac{\partial p}{\partial t}(0, t) \ge 0$, and hence the existence of a rising phase strongly depends on the sign and amplitude of the other term on the right-hand side,

\[
2 c_1 \int_0^1 x\, \frac{\partial p}{\partial t}\, dx.
\]

The integral can be considered as the change of the expectation value for $x$ with time:

\[
\int_0^1 x\, \frac{\partial p}{\partial t}\, dx = \frac{\partial}{\partial t} \int_0^1 x\, p \, dx = \frac{\partial}{\partial t} E[x].
\]

In order to get a rising phase, the weighted rate of change of the expectation value has to be smaller than the difference of the boundary terms,

\[
2 c_1 \frac{\partial}{\partial t} E[x] < \frac{\partial p}{\partial t}(0, t) - \frac{\partial p}{\partial t}(1, t),
\]


which says that for $c_1 > 0$ the change in the expectation value for $x$ has to be negative (under the presumption that after the voltage step the system moves from the closed state ($x = 0$) towards the open state ($x = 1$)). This means that the expectation value for $x$ has to shift to the left, which is in contradiction to the assumption that the system tends to move towards $x = 1$ after the voltage step (which would correspond to a shift of the expectation value in the direction of $x = 1$). Hence a convex quadratic potential (opening upwards) will never produce a rising phase either. Figures 4.17 and 4.18 illustrate the behaviour of the expectation value.

Figure 4.17: Development of the probability density function $p(x, t)$ for a convex quadratic potential $\mu$ and a voltage step from $U_0 = -80$ mV to $U_1 = 30$ mV (seen from two different rotational angles; axes: $t$ [ms], $x$ and $p(x, t)$).

Figure 4.18: Shift of expectation value $E[x]$ with time (for a convex quadratic potential $\mu$; axes: $t$ [ms] and $E[x]$). Voltages are stepped from $U_0 = -80$ mV to $U_1 = 0$ mV (blue), $U_1 = 10$ mV (red), $U_1 = 20$ mV (green) and $U_1 = 30$ mV (cyan).

The first one shows the development of the probability density function $p(x, t)$ for a convex quadratic protein-intrinsic potential $\mu$, when the voltage is stepped from $U_0 = -80$ mV to $U_1 = 30$ mV. It can be seen that the maximum of $p(\cdot, t)$ shifts towards larger values of $x$ for increasing time, starting with a peak at $x = 0$ for $t = 0$. The corresponding shift in the expectation value for $x$, $E[x]$, is shown in Figure 4.18 in light-blue colour. The other curves in this figure demonstrate the shift of the expectation value for different applied voltages $U_1$. They all show qualitatively the same behaviour, constantly increasing from an initial expectation value close to zero to the final value depending on the applied potential $U_1$.

As we have seen above, a convex quadratic potential will not generate a rising phase in the gating current. But how about a concave quadratic potential, i.e. $c_1 < 0$? In this case it has to hold

\[
2 |c_1| \frac{\partial}{\partial t} E[x] > \frac{\partial p}{\partial t}(1, t) - \frac{\partial p}{\partial t}(0, t)
\]

in order for a rising phase to exist. Note that for our standard assumption that the system moves away from $x = 0$ towards $x = 1$, the right-hand side of the above inequality is greater than or equal to zero. Hence the change in the expectation value definitely also has to be greater than zero, which complies with a system shift from $x = 0$ towards $x = 1$. For $|c_1|$ large enough the left side of the above inequality is indeed larger than the right side and the gating current exhibits a rising phase, as can also be seen in numerical examples (see Figure 4.19). Note that the display starts at time $t = 0.05$ ms.

Figure 4.19: Gating current for a concave quadratic potential $\mu$ showing a rising phase ($\tau = 0.05$ ms; axes: $t$ [ms] and $I_{gate}$ [$\tau/(M_c e_0)$]).
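For readers who want to explore such time courses themselves, the following sketch simulates the gating current of the dimensionless one-dimensional model after a voltage step. It is not the numerical method used for the figures: it assumes a nearest-neighbour (Markov-type) discretization with a hypothetical exponential rate law, and the concave potential, valence, diffusion coefficient and voltages are hypothetical values. Whether the flag `has_rising_phase` comes out true depends on these choices.

```python
import numpy as np

def simulate_gating_current(mu_of_x, z, D, U0, U1, t, n=200, c=1.0):
    """Gating current I_g(t) after a step U0 -> U1 (dimensionless 1D model).

    Uses a nearest-neighbour discretization (assumed rate law D/h^2 * exp(-+dW/2))
    and an exact-in-time solution via a symmetric eigendecomposition.
    Returns (Ig, s_t) with s_t[i, k] the occupancy of cell i at time t[k].
    """
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h
    mu = mu_of_x(x)
    W0 = mu - z * U0 * x          # drift potential before the step
    W1 = mu - z * U1 * x          # drift potential after the step

    dW = np.diff(W1)
    k_fwd = D / h**2 * np.exp(-dW / 2.0)
    k_bwd = D / h**2 * np.exp(+dW / 2.0)
    A = np.zeros((n, n))
    i = np.arange(n - 1)
    A[i + 1, i] += k_fwd
    A[i, i] -= k_fwd
    A[i, i + 1] += k_bwd
    A[i + 1, i + 1] -= k_bwd

    s0 = np.exp(-(W0 - W0.min()))
    s0 /= s0.sum()                # equilibrium occupancies at U0
    s1 = np.exp(-(W1 - W1.min()))
    s1 /= s1.sum()                # equilibrium occupancies at U1

    # Symmetrize A with the equilibrium weights and diagonalize.
    sq = np.sqrt(s1)
    S = A * (sq[None, :] / sq[:, None])
    lam, Q = np.linalg.eigh(0.5 * (S + S.T))
    coeff = Q.T @ (s0 / sq)
    s_t = sq[:, None] * (Q @ (np.exp(np.outer(lam, t)) * coeff[:, None]))

    # Net flow across each cell boundary; I_g = c * z * (integral of J over [0, 1]).
    flux = k_fwd[:, None] * s_t[:-1, :] - k_bwd[:, None] * s_t[1:, :]
    Ig = c * z * h * flux.sum(axis=0)
    return Ig, s_t

z, D = 2.0, 1.0                    # hypothetical valence and diffusion coefficient
U0, U1 = -3.167, 1.188             # roughly -80 mV and 30 mV in thermal-voltage units
t = np.linspace(0.0, 3.0, 601)
Ig, s_t = simulate_gating_current(lambda x: -3.0 * x**2, z, D, U0, U1, t)

tau = 0.05                         # ignore the very first instant, as in the text
after = t >= tau
has_rising_phase = bool(np.any(np.diff(Ig[after]) > 1e-6 * abs(Ig[0])))
```

Replacing the lambda by a convex or linear potential gives a quick way to compare the qualitative cases discussed above; the value `Ig[0]` approximates the initial jump $c\,(U_1 - U_0)\, z^2 D$ up to discretization error.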

We should mention that we solely performed qualitative investigations with our model and did not attempt to fit real experimental data. Hence by adjusting the model parameters (e.g. the diffusion coefficient) different time scales for the kinetics could also be achieved.

Due to the complexity of the system, the analytical investigations performed above are only feasible for relatively simple potential landscapes $\mu$. More realistic assumptions concerning the shape of the energy landscape governing the behaviour of the gating charges will definitely involve more complex potentials, having several local minima and maxima. They can be investigated using numerical analysis.

Chapter 5

The Cole-Moore effect

5.1 Cole-Moore and the different gating models

After analysing the gating current in great detail in the last chapter, we would now like to consider the phenomenon of the time shift in the macroscopic ionic current in a bit more detail. In the 1950s Kenneth S. Cole and John W. Moore performed a series of experiments on the squid giant axon [23] using a voltage clamp technique. The experimental details are described in [22]. Among other things, they investigated the development of macroscopic ionic currents after stepping the membrane voltage from different hyperpolarizing prepulses to some activation voltages. They found that when more negative prepulses are applied, the ionic current will show a very similar kinetic development but with a certain time delay. The size of this delay depends on the duration and amplitude of the hyperpolarizing prepulses. An example for these time shifts can be seen in Figure 5.1, reproduced by Taylor and Bezanilla in [113].

Figure 5.1: (a) Macroscopic sodium currents from a squid giant axon for prepulses to −50, −70 and −140 mV followed by a depolarization to 0 mV; (b) shifted ionic currents (figure taken from [113]).

This characteristic behaviour is nowadays often referred to as the Cole-Moore (CM) effect after its two discoverers.

The famous Hodgkin & Huxley (HH) model, which was developed in the early 1950s ([52]), has been shown to be inadequate for describing the potassium current $I_K$ in squid giant axons [21]. It assumes that $I_K$ is proportional to the fourth power of some time- and voltage-dependent rate parameter $n$, $I_K \sim n^4(V, t)$. This dependence can be interpreted as a representation of the four subunits of the K$^+$ channel, which each have to switch conformation for the channel to become conductive. The time-dependence of $n$ is given by

dn

dt= α(V )(1 − n) − β(V )n.

Increasing the power of n significantly leads to better results in describing IK , but the phys-ical interpretation with respect to the four subunits gets lost.

We would like to use the different gating models introduced in the first part of Chapter 4 to investigate which models are in fact able to produce such a time delay in the macroscopic gating current.

The models we are going to consider are (compare Chapter 4):

1) the Fokker-Planck type model, where the macroscopic open probability is given by

\[ P_{open}(t) = \int_{\Omega_M} \omega(x)\, p(x, t)\, dx \]

with p fulfilling the Fokker-Planck equation

\[ \frac{\partial p}{\partial t} = -\frac{\partial J}{\partial x} \]

with

\[ J = 0 \quad \text{for } x = 0,\ x = L \]

and

\[ p(x, 0) = p_0(x), \quad x \in [0, L]; \]

2) a sequential Markov model with five closed states and one open state, with the parameters as given in [113]. The open probability is given by

\[ P_{open}(t) = s_1(t), \]

where the vector $S(t) = (s_1(t), \dots, s_6(t))$ of probabilities of being in the open or closed states fulfills the ODE system

\[ \dot{S}(t) = A\,S(t) \]

with A incorporating the transition rates;

3) the statistics from single channel measurements with the open probability given by

\[ P_{open}(t) = \int_0^t H(s)\,\bigl(1 - Q(t - s, s)\bigr)\, ds \]

and the functions H and Q determined from the above Markov model;

4) a two-state Markov model with only one open and one closed state.

Figure 5.2: Voltage protocol for investigation of the Cole-Moore shift (applied voltage U [mV] over time t [ms]). Voltage was stepped from −100, −80 and −60 mV to 50 mV.

The voltage protocol used for all computations is illustrated in Figure 5.2. Holding potentials were taken as −100, −80 and −60 mV; at time t = 0 the voltage was stepped to 50 mV. In Figure 5.3 the resulting open probabilities for the above four models are shown.

We see that all models apart from the two-state Markov model (lower right panel, (d)) are able to produce a Cole-Moore shift. Note that the size of the shift and the shape of the curves can be altered by changing the model parameters, i.e. the transition rates in the Markov model and the diffusion coefficient and energy landscape in the Fokker-Planck model. Just two states in a Markov model with conventional transition rates of the form

\[ k_{ij} = k^0_{ij} \exp\left(\pm \frac{z e_0}{k_B T} V\right), \]

i.e. depending only on the actual applied voltage, are not enough to produce a Cole-Moore shift. Several closed states have to be included in series to provide a basis for the time delay. Depending on the choice of parameters, up to 14 closed states ([120]) might have to be included in order to get the desired time shifts.

In the subsequent part we want to introduce a relatively simple mathematical model that reduces the number of states to be considered and that is in principle capable of reproducing the Cole-Moore shift.
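Why two states with instantaneously switching rates cannot generate a time shift can be seen from a short calculation. The following sketch assumes a single closed state C and a single open state O with opening rate $k_{CO}(V)$ and closing rate $k_{OC}(V)$ of the conventional form, and it assumes that the channel is equilibrated at the holding potential $V_0$ before the step to $V_1$ at t = 0. The labels $k_{CO}$, $k_{OC}$ and $P_\infty$ are introduced here for illustration only.

\[ \frac{dP_{open}}{dt} = k_{CO}(V_1)\,(1 - P_{open}) - k_{OC}(V_1)\,P_{open}, \qquad P_{open}(0) = P_\infty(V_0), \qquad P_\infty(V) = \frac{k_{CO}(V)}{k_{CO}(V) + k_{OC}(V)}, \]

\[ P_{open}(t) = P_\infty(V_1) + \bigl(P_\infty(V_0) - P_\infty(V_1)\bigr)\, e^{-(k_{CO}(V_1) + k_{OC}(V_1))\,t}, \]

\[ \frac{P_{open}(t) - P_{open}(0)}{P_\infty(V_1) - P_{open}(0)} = 1 - e^{-(k_{CO}(V_1) + k_{OC}(V_1))\,t}. \]

Under these assumptions the holding potential $V_0$ enters only through the amplitude of the relaxation. After rescaling between the initial and the final value, the time course is identical for all holding potentials, so no delay can appear.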

5.2 A mathematical model

The model we want to propose in this section is similar to the idea presented in [21]. Instead of considering Markov models with many closed states and solely voltage-dependent transition rates $k_{ij}(V)$, we limit the number of states (e.g. just one open and one closed state) and investigate voltage- and time-dependent rates. Usually, transition rates between states i and j are posed in the form

\[ k_{ij} = k^0_{ij} \exp\left(\pm \frac{z e_0}{k_B T} V\right), \]

with V denoting the applied electrostatic potential, $z e_0$ the charge moved during the transition and $k^0_{ij}$ some constant prefactor. This formulation implies that the transition rates change their value immediately when the membrane potential is stepped to another voltage. We introduce a continuous change from the old transition rate to its new value. For simplicity we use the notation that at time t = 0 the applied membrane voltage is stepped from $V_0$ to $V_1$. In order to define the time-dependent transition rates that gradually change from $k(V_0)$ to $k(V_1)$, we introduce an effective electrostatic potential W(t) that changes from $V_0$ to $V_1$. Its time-dependence is given by

\[ \frac{dW}{dt} = \frac{V_1 - W}{\tau} \]

and $W(0) = V_0$. The time constant τ describes how fast the effective potential approaches its final value. The model transition rates are then defined in analogy to the above as

\[ k_{ij}(t) = k^0_{ij} \exp\left(\pm \frac{z e_0}{k_B T} W(t)\right). \]

Figure 5.3: Open probabilities (normalized to one) over time t [ms]; in all panels blue corresponds to a holding potential of −100 mV, red to −80 mV and green to −60 mV. The potential is stepped to 50 mV in all cases. (a) Fokker-Planck model; (b) 6-state Markov model; (c) statistics model; (d) 2-state Markov model.

With such a relatively simple approach it is indeed possible to reproduce a time delay in the corresponding current. Figure 5.4 shows a model output corresponding to the voltage protocol in Figure 5.2.

We see that when introducing continuous time-dependent transition rates into a Markov model, two states are in fact enough in principle to produce a time delay for more hyperpolarizing prepulses. Whether such a model is also capable of reproducing the appropriate kinetics remains to be investigated.
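The behaviour described above can be explored with a small simulation. The following is a minimal sketch, not the computation behind Figure 5.4: it assumes a two-state scheme with opening rate $k^0 \exp(+z e_0 W/(k_B T))$ and closing rate $k^0 \exp(-z e_0 W/(k_B T))$, and the values of `z`, `k0`, `tau` and the thermal voltage are hypothetical choices made for illustration. The open probability is rescaled between its initial and final value, which is one possible reading of "normalized to one".

```python
import math

KT_OVER_E0 = 25.0  # thermal voltage k_B*T/e_0 in mV (approximate, assumed)


def rates(w, z=1.0, k0=2.0):
    """Opening and closing rate (1/ms) at effective potential w (mV)."""
    a = z * w / KT_OVER_E0
    return k0 * math.exp(a), k0 * math.exp(-a)


def steady_state(v):
    """Equilibrium open probability at a constant potential v (mV)."""
    k_open, k_close = rates(v)
    return k_open / (k_open + k_close)


def simulate(v0, v1=50.0, tau=0.1, t_end=1.0, dt=1e-4):
    """Explicit Euler for dP/dt = k_open(W)(1 - P) - k_close(W) P.

    Returns the times and the open probability rescaled to run from 0 to 1.
    tau = 0 switches the rates instantaneously (conventional model).
    """
    p0, p_inf = steady_state(v0), steady_state(v1)
    p, times, scaled = p0, [], []
    for i in range(int(round(t_end / dt)) + 1):
        t = i * dt
        times.append(t)
        scaled.append((p - p0) / (p_inf - p0))
        # W(t) solves dW/dt = (V1 - W)/tau with W(0) = V0
        w = v1 if tau == 0 else v1 + (v0 - v1) * math.exp(-t / tau)
        k_open, k_close = rates(w)
        p += dt * (k_open * (1.0 - p) - k_close * p)
    return times, scaled


def half_time(v0, **kwargs):
    """First time at which the rescaled open probability reaches 0.5."""
    for t, y in zip(*simulate(v0, **kwargs)):
        if y >= 0.5:
            return t
    raise ValueError("half-activation not reached; increase t_end")


if __name__ == "__main__":
    for v0 in (-100.0, -80.0, -60.0):
        print(v0, half_time(v0), half_time(v0, tau=0.0))
```

With a finite `tau` the half-activation time grows as the holding potential becomes more negative, whereas with `tau=0.0` it is the same for all holding potentials. Since $W(t) = V_1 + (V_0 - V_1)e^{-t/\tau}$, two holding potentials produce effective potentials that are shifted copies of each other in time, which is the origin of the delay in this sketch.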

Although the above approach might produce the right effects, it still needs to be supplied with a physical basis. One possibility might be that the voltage sensor does not immediately “see” the actual applied voltage, but that it is somehow shielded by other charges (e.g. polarization charges) in the protein that need some time to adjust to the new voltage. It might thus be that at the location of the voltage sensor a continuous change in the local electric field arises, making the above time-continuous transition rates reasonable.

Figure 5.4: Open probabilities (normalized to one) over time t [ms] for the “delayed voltage” model; blue corresponds to a holding potential of −100 mV, red to −80 mV and green to −60 mV. The potential is stepped to 50 mV in all cases.


Chapter 6

Inverse problems related to gating

In this chapter we are going to address some inverse questions with respect to the gating mechanism. The general idea is the same as already discussed in Chapter 3. Since most of the quantities determining the gating behaviour of voltage-gated ion channels are not directly accessible and measurable in experiments, an attractive approach is to use those quantities that are readily measurable to learn something about the underlying structures. In the case of ion channel gating the data we are going to consider consist of the electrophysiological measurements, i.e. the macroscopic ionic current and the macroscopic gating current. The aim is to learn about the underlying features such as the energy landscape of the channel protein.

6.1 Inverse problems - basic setup

In order to address an inverse problem we first need to define what the direct or forward problem is. If we are interested in some quantity q of the system under consideration, we need a model relating q to the output y that would also be measurable in the experimental setup. Let F denote such a model, mapping between the spaces X and Y,

\[ F : X \to Y, \qquad q \mapsto y. \]

The parameter space X is made up of all possible inputs and the space Y contains all possible outputs. To give a concrete example, in the case of channel gating the quantity q could be the energy landscape µ (or rather its derivative $d\mu/dx$) and the output y would be made up of the pair y = (ionic current, gating current).

Hence the forward problem in this case would be to determine the macroscopic gating and ionic currents for a given potential landscape µ. The corresponding inverse problem can be formulated as: “Given pairs of macroscopic gating and ionic currents, what does the energy landscape µ generating these currents look like?”

In other words, we are looking for the underlying cause of the currents, a problem that is referred to as parameter identification in mathematical terminology. An illustration of the setup can be seen in Figure 6.1. In physical applications the process of identifying a certain parameter that gives rise to some measured behaviour is also frequently called “data fitting”.

[Diagram. Forward problem: channel properties and experimental boundary conditions → model F → currents $I_{gate}$, $I_{ion}$. Inverse problem: currents $I_{gate}$, $I_{ion}$ and experimental boundary conditions → ? → channel properties.]

Figure 6.1: Forward and inverse problem in channel gating.

It should be kept in mind that when dealing with real experimental data the measured quantities are not the ideal solution y that is computed with the forward model. Instead, due to measurement errors and noise only the quantity $y^\delta$ with

\[ \| y - y^\delta \| < \delta \]

will be measured, and often special care has to be taken when using such noisy data for the identification of underlying system properties and parameters. Slight variations in the measured data can give rise to huge variations in the identified parameters, a fact referred to as ill-posedness in the mathematical field of inverse problems. Regularization techniques have to be employed in this case in order to get a sensible result.

Which system parameters might be identified depends on the model used for the forward problem. In the case of a discrete state Markov model, the tunable parameters are the transition rates and the number of states. For the Fokker-Planck model the diffusion coefficients, mobilities and energy landscape are the major quantities determining the system behaviour.

A large number of adjustable parameters increases the flexibility of the model, making it more likely to fit a large class of data. However, with an increasing number of free parameters, their identifiability and uniqueness become questionable. Several combinations of different parameters might generate the same output and additional information would be needed to decide on one specific parameter set. In [29] the question of identifiability and uniqueness in the case of Markov models is addressed.

Usually the identification improves with the variety of data used in the identification process. In the case of channel gating the amount of data can be increased e.g. by using different voltages (holding potentials, test potentials) and different pulse durations. Ideally one would hope to find those parameters that predict the right currents for all possible boundary conditions.

In the next section we are going to use the Fokker-Planck type model derived in the last chapter to investigate inverse problems with respect to the energy landscape µ that determines the gating behaviour.

6.2 Parameter identification with a one-dimensional model

We are going to consider the one-dimensional version of the Fokker-Planck type model to study the identifiability of model parameters such as the energy landscape µ and the diffusion coefficient D, based on macroscopic gating currents and the macroscopic open probability. In other words, we want to find those model parameters that describe both the gating currents and the macroscopic open probability at the same time.

In the following we denote the measured open probability over time by $y^\delta(t)$ and the measured gating current by $u^\delta(t)$. The final time until which the data are considered is denoted by T. The optimization functional, i.e. the functional we want to minimize with respect to the model parameter of interest, then reads

\[ Q = \frac{\alpha}{2} \int_0^T |P_{open}(t) - y^\delta(t)|^2 \, dt + \frac{\beta}{2} \int_0^T |I_g(t) - u^\delta(t)|^2 \, dt. \tag{6.1} \]

The two constants α and β are weighting factors that allow different emphasis to be put on the two data sets. This might be useful since the measured time course of the channel open probability, $y^\delta$, will contain more information related to the “close to open” states of the channel, and might not contribute enough information to identify the parameters far away from the open state. On the other hand, the gating current $u^\delta$ conveys information about the whole system, since especially for large negative initial potentials the gating current reflects what happens among the closed states far from open. The quantities $P_{open}(t)$ and $I_g(t)$ are the model output, depending on the model parameters to be optimized. In our one-dimensional model they are given by (see (4.59), (4.62))

\[ P_{open}(t) = \int_0^1 \omega(x)\, p(x, t) \, dx, \]

with ω(x) being the weighting function for the pore to be conducting when the gating particle is in a certain state, and

\[ I_g(t) = -c \int_0^1 z \left[ D \left( \frac{d\mu}{dx} - zU \right) p + D \frac{\partial p}{\partial x} \right] dx, \]

where $c = M_c e_0 / \tau$ is related to the total number of channels $M_c$ and some characteristic time τ of the system (compare the section on scaling of the equations in the last chapter).
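For sampled time courses the functional (6.1) has to be evaluated by numerical quadrature. The following is a minimal sketch under the assumption that all four time courses are given on the same uniform time grid with spacing `dt`; the function names are hypothetical and the trapezoidal rule is merely one convenient choice, not necessarily the one used for the computations in this chapter.

```python
def trapezoid(values, dt):
    """Composite trapezoidal rule on a uniform grid with spacing dt."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def gating_misfit(p_open, y_delta, i_gate, u_delta, dt, alpha=10.0, beta=1.0):
    """Discrete version of Q in (6.1).

    p_open, i_gate: model outputs; y_delta, u_delta: (noisy) data,
    all sampled at the same times. alpha and beta weight the two data sets.
    """
    res_open = [(a - b) ** 2 for a, b in zip(p_open, y_delta)]
    res_gate = [(a - b) ** 2 for a, b in zip(i_gate, u_delta)]
    return (0.5 * alpha * trapezoid(res_open, dt)
            + 0.5 * beta * trapezoid(res_gate, dt))


if __name__ == "__main__":
    n, dt = 101, 0.01
    model_open, data_open = [0.6] * n, [0.5] * n
    current = [0.0] * n
    print(gating_misfit(model_open, data_open, current, current, dt))
```

When several voltage protocols are used, one such value would be computed per protocol and the values summed, which is again an assumption about the setup rather than a statement from the text.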

Let q denote the parameter (or set of parameters) to be identified in the inverse problem. For example, q can be chosen as the derivative of the energy landscape, $d\mu/dx$, or the diffusion coefficient D, or as the pair of them, $q = (d\mu/dx, D)$. Note that we will not be able to identify the absolute energy landscape µ, since the model only involves its spatial derivative $d\mu/dx$. Hence energy landscapes that are equal up to an additive constant will always yield the same model output.


In order to determine a q describing the available data set, the optimization functional (6.1) has to be minimized with respect to the parameter q, i.e. we want to solve

\[ Q(q) \to \min_{q \in X}. \]

A prominent way to perform this minimization procedure is the use of gradient methods. Beginning with an initial guess $q_0$ for the parameter in question, an update is determined by computing the gradient $\nabla_q Q$ of Q with respect to q. This results in the iterative scheme

\[ q_{k+1} = q_k - \gamma_k \nabla_q Q(q_k), \tag{6.2} \]

which is carried out until some stopping criterion is reached. The parameter $\gamma_k$ gives the stepsize of the iteration process. Since the gradient of Q needs to be computed in every iteration step, an efficient way of doing so is required. Especially when the sought parameter is a function and not just a scalar, the use of finite differences to compute the gradient becomes quite time consuming and more convenient methods are needed. In our one-dimensional model case we employ adjoint techniques for the computation of the gradient. An introduction to the adjoint approach can be found e.g. in [36].

In fact, we would like to minimize Q with respect to q under the constraint that the probability density p fulfills the Fokker-Planck equation

\[ \frac{\partial p}{\partial t}(x, t) = \frac{\partial}{\partial x} \left[ D \left( \frac{d\mu}{dx} - zU \right) p + D \frac{\partial p}{\partial x} \right]. \]
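Every evaluation of Q requires a solution of this constraint equation. The following is a minimal sketch of a forward solver, assuming an explicit finite-volume discretization on [0, 1] with zero flux at both ends, a uniform initial density, a sharp weighting function ω, and scaled units with the hypothetical value U = 1. It is not the discretization used for the results in this chapter.

```python
import numpy as np


def solve_fokker_planck(dmu_dx, D=0.27, z=4.0, U=1.0, t_end=5.0, dt=2e-4):
    """Explicit finite-volume scheme for
    dp/dt = d/dx [ D (dmu/dx - z U) p + D dp/dx ] on [0, 1],
    with the bracket term set to zero at x = 0 and x = 1 (no flux).
    dmu_dx holds the values of dmu/dx at the interior cell faces.
    Returns the cell centres and the density at time t_end.
    """
    dmu_dx = np.asarray(dmu_dx, dtype=float)
    n = dmu_dx.size + 1             # number of cells
    dx = 1.0 / n
    x = (np.arange(n) + 0.5) * dx   # cell centres
    p = np.ones(n)                  # uniform initial density (assumed)
    drift = D * (dmu_dx - z * U)
    flux = np.zeros(n + 1)          # bracket term at all faces; ends stay zero
    for _ in range(int(round(t_end / dt))):
        flux[1:-1] = drift * 0.5 * (p[:-1] + p[1:]) + D * (p[1:] - p[:-1]) / dx
        p = p + dt * (flux[1:] - flux[:-1]) / dx
    return x, p


def open_probability(x, p, threshold=0.8):
    """P_open = integral of omega * p with omega = 1 for x >= threshold."""
    dx = x[1] - x[0]
    return float(np.sum(p[x >= threshold]) * dx)


if __name__ == "__main__":
    x, p = solve_fokker_planck(np.zeros(49))  # flat landscape, 50 cells
    print(open_probability(x, p))
```

Because the update is written in flux form with vanishing boundary fluxes, the total probability is conserved up to rounding. For a flat landscape the density relaxes towards a profile proportional to $\exp(zUx)$, which gives a simple check of the implementation. The explicit time step has to satisfy roughly $D\,\Delta t / \Delta x^2 \le 1/2$; an implicit scheme would remove this restriction.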

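The associated Lagrange functional is then given by

\[ L(p, q, \lambda) = Q(p, q) + \int_0^1 \int_0^T \lambda \left[ \frac{\partial p}{\partial t} - \frac{\partial}{\partial x} \left( D \left( \frac{d\mu}{dx} - zU \right) p + D \frac{\partial p}{\partial x} \right) \right] dt \, dx, \tag{6.3} \]

where λ = λ(x, t) denotes the Lagrange parameter. The adjoint approach now consists of successively solving the equations $\partial L / \partial \lambda = 0$ and $\partial L / \partial p = 0$ and evaluating $\partial L / \partial q$ to determine the derivative of the optimization functional.

From $\partial L / \partial p = 0$ we get the following adjoint system to be solved backward in time for λ:

\[ -\frac{\partial \lambda}{\partial t} - \frac{\partial}{\partial x} \left( D \frac{\partial \lambda}{\partial x} \right) + D \left( \frac{d\mu}{dx} - zU \right) \frac{\partial \lambda}{\partial x} = -R \tag{6.4} \]

with the right-hand side R given by a weighted sum of the residuals,

\[ R(x, t) = \alpha \, \omega(x) \bigl( P_{open}(t) - y^\delta(t) \bigr) + \beta \left[ c \frac{\partial}{\partial x}(zD) - c z D \left( \frac{d\mu}{dx} - zU \right) \right] \bigl( I_g(t) - u^\delta(t) \bigr). \]

The terminal condition and the boundary conditions are given by

\[ \lambda(x, T) = 0 \tag{6.5} \]

and

\[ D \frac{\partial \lambda}{\partial x}(0, t) = \beta \, c \, z \, D \, \bigl( I_g(t) - u^\delta(t) \bigr) \tag{6.6} \]

as well as

\[ D \frac{\partial \lambda}{\partial x}(1, t) = \beta \, c \, z \, D \, \bigl( I_g(t) - u^\delta(t) \bigr). \tag{6.7} \]

For a detailed computation of the partial derivatives of L and the adjoint system we refer to the Appendix.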

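The gradient of the minimization functional with respect to the parameter q can then be evaluated as

\[ \nabla_q Q(q) = \frac{\partial L}{\partial q}, \]

according to the adjoint method. In the case of $q = d\mu/dx$, this becomes (see Appendix)

\[ \nabla_{\mu'} Q(\mu') = \int_0^T \left[ \frac{\partial \lambda}{\partial x} D \, p - c \, \beta \, z \, D \, p \, \bigl( I_g(t) - u^\delta(t) \bigr) \right] dt. \]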
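The mechanics of the adjoint approach (one forward solve, one backward solve, one integral) can be illustrated on a much simpler problem. The sketch below is a toy example and not the gating model: it assumes the scalar state equation $y' = -q\,y$, $y(0) = 1$, and the functional $Q(q) = \frac{1}{2} \int_0^T (y - d)^2\, dt$ with data d. The Lagrange functional $L = Q + \int_0^T \lambda\,(y' + q\,y)\, dt$ then yields the adjoint equation $\lambda' = q\,\lambda + (y - d)$ with $\lambda(T) = 0$, and the gradient $dQ/dq = \int_0^T \lambda\, y\, dt$. All function names are hypothetical.

```python
import math


def forward(q, n, T=1.0):
    """Explicit Euler for y' = -q y, y(0) = 1, on n steps."""
    dt = T / n
    y = [1.0]
    for _ in range(n):
        y.append(y[-1] - dt * q * y[-1])
    return y


def toy_misfit(y, d, dt):
    """Trapezoidal approximation of 1/2 * integral of (y - d)^2."""
    r = [0.5 * (a - b) ** 2 for a, b in zip(y, d)]
    return dt * (sum(r) - 0.5 * (r[0] + r[-1]))


def adjoint_gradient(q, d, n, T=1.0):
    """dQ/dq from one forward and one backward (adjoint) solve."""
    dt = T / n
    y = forward(q, n, T)
    lam = [0.0] * (n + 1)  # lam[n] = lambda(T) = 0
    for k in range(n, 0, -1):
        # step backward in time: lambda(t - dt) ~ lambda(t) - dt * lambda'(t)
        lam[k - 1] = lam[k] - dt * (q * lam[k] + (y[k] - d[k]))
    g = [l * v for l, v in zip(lam, y)]
    return dt * (sum(g) - 0.5 * (g[0] + g[-1]))


if __name__ == "__main__":
    n = 20000
    data = [math.exp(-1.5 * k / n) for k in range(n + 1)]  # generated with q = 1.5
    h = 1e-6
    fd = (toy_misfit(forward(1.0 + h, n), data, 1.0 / n)
          - toy_misfit(forward(1.0 - h, n), data, 1.0 / n)) / (2 * h)
    print(adjoint_gradient(1.0, data, n), fd)
```

The adjoint gradient agrees with the finite-difference quotient up to discretization error. For a scalar parameter nothing is gained, but when q is a function sampled at many points, the finite-difference approach needs one forward solve per sample, while the adjoint approach still needs only one forward and one backward solve.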

6.2.1 Identification of potential

In this section we are going to investigate the performance of our approach for identifying the shape of the energy landscape, i.e. $d\mu/dx$, under the assumption that we have a fixed constant diffusion coefficient D. The next section will deal with the case that both the energy landscape µ and the diffusion coefficient are unknown. The optimization is carried out as described above, using the adjoint approach to determine the gradient of the optimization functional (6.1).

The data used in the following tests are generated by solving the forward model with the exact parameter. Then artificial noise at different noise levels is added to these data in order to mimic a more realistic experimental situation. For the tests we consider four different holding potentials and three test pulses for each holding potential, resulting in 24 time course measurements in total (12 ionic currents and 12 gating current measurements). The applied membrane potentials and parameter settings are given in Table 6.1.

U0 [mV]    U1 [mV]    Scaled model parameters
−100       0          D = 0.27
−80        20         z = 4
−60        40         ω(x ≥ 0.8) = 1
−40                   α = 10
                      β = 1

Table 6.1: Applied membrane potentials and scaled model parameters. U0 denotes the holding potential, U1 denotes the test pulse potential. The weighting function ω for the probability that the channel is open when the gating particle is in a certain state, with ω(x ≥ 0.8) = 1 and ω(x < 0.8) = 0, has been smoothed for computational purposes.
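The generation of artificially perturbed data can be sketched as follows. This is an assumption about one common convention, not a description of the procedure actually used here: Gaussian noise is rescaled such that the Euclidean norm of the perturbation equals a prescribed fraction of the norm of the exact data. The function name and the seed handling are hypothetical.

```python
import math
import random


def add_relative_noise(data, level, seed=0):
    """Return data plus Gaussian noise, rescaled so that
    ||noisy - data|| = level * ||data|| in the Euclidean norm."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in data]
    data_norm = math.sqrt(sum(v * v for v in data))
    noise_norm = math.sqrt(sum(n * n for n in noise))
    scale = level * data_norm / noise_norm
    return [v + scale * n for v, n in zip(data, noise)]


if __name__ == "__main__":
    exact = [math.exp(-0.01 * i) for i in range(300)]  # stand-in time course
    noisy = add_relative_noise(exact, 0.07)            # 7% noise level
    print(noisy[:3])
```

With this convention the noise level δ needed for a stopping rule is known exactly, namely `level` times the norm of the exact data.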

The first example deals with an energy landscape µ that has a single maximum. The reconstructions of the open probability and the gating currents are shown by way of example in Figure 6.2 for the first holding potential. 7% artificial noise was added to the exact data before starting the reconstruction.

As we can see from Figure 6.2, the data sets (given in blue) are recovered very well, and the same holds for the other holding potentials (not shown). The red curves correspond to the data at the end of the reconstruction and cyan gives the data corresponding to the initial guess. But we also want to use this example to illustrate the instability inherent in the problem. Although the data error (i.e. Q from (6.1)) constantly decreases, the error in the reconstructed parameter $d\mu/dx$ starts to increase after a few iterations (see Figure 6.3). Hence, without any precautions the parameter reconstruction gets worse if we iterate too long. Performing the iteration with a smaller stepsize $\gamma_k$ (see (6.2)) leads to a slower increase in the parameter error.

Figure 6.2: (a) Time courses of open probabilities for a holding potential of −100 mV and test pulses to 0, 20 and 40 mV; blue: data used for reconstruction, red: data after reconstruction, cyan: data corresponding to initial guess; (b) time courses of gating currents (in units of $M_c e_0 / \tau$); colour coding as in (a).

Figure 6.3: (a) Data residual Q over the number of iterations; (b) error in the parameter $d\mu/dx$, $\| \mu' - \mu'_{exact} \|$, over the number of iterations.

A regularization technique like the Landweber method with an appropriate stopping criterion like Morozov’s discrepancy principle can be a remedy for this problem. Morozov’s discrepancy principle states that the iteration is stopped as soon as the data error is smaller than a multiple of the data noise,

\[ Q < \kappa \delta, \]

with e.g. κ = 2 and δ denoting the noise level. With such a stopping criterion the optimization procedure in the above example would have stopped after the first few iterations. The reconstructed parameter would be much closer to the exact value without losing much accuracy with respect to the data fit (compare Figure 6.3). The energy landscape µ corresponding to the parameter reconstruction is shown in Figure 6.4.

Here blue denotes the exact potential that has been used to generate the data, red is the reconstruction after 500 iterations and green is the reconstruction obtained if the above stopping criterion is applied. We see that the green curve qualitatively gives the right result, reflecting the existence of one single maximum. The “unregularized” red solution is not too far off from the exact result and reconstructs the main energy maximum, but it gives wrong additional energy minima. The cyan curve gives the initial guess for the reconstruction.

Figure 6.4: Reconstruction of energy landscape (µ [1/kBT] over x); blue: exact value; red: reconstruction after 500 iterations; green: reconstruction after 4 iterations; cyan: initial guess.
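The effect of early stopping can be reproduced on a small linear model problem. The sketch below is a toy example and not the gating problem: it assumes a diagonal, ill-conditioned matrix A, a deterministic perturbation of known norm δ, and the discrepancy principle in its usual form, which compares the residual norm $\| A x - y^\delta \|$ with κδ. All names and values are hypothetical.

```python
import numpy as np


def landweber(A, y_delta, delta, kappa=2.0, gamma=None, max_iter=5000):
    """Landweber iteration x <- x - gamma * A^T (A x - y_delta), stopped by
    the discrepancy principle ||A x - y_delta|| <= kappa * delta.
    Returns the iterate and the number of iterations performed."""
    if gamma is None:
        gamma = 1.0 / np.linalg.norm(A, 2) ** 2  # stays below 2 / ||A||^2
    x = np.zeros(A.shape[1])
    for k in range(max_iter):
        residual = A @ x - y_delta
        if np.linalg.norm(residual) <= kappa * delta:
            return x, k
        x = x - gamma * (A.T @ residual)
    return x, max_iter


# Toy ill-conditioned problem (hypothetical, for illustration only)
n = 30
A = np.diag(1.0 / np.arange(1, n + 1))      # singular values 1, 1/2, ..., 1/n
x_exact = 1.0 / np.arange(1, n + 1)
y = A @ x_exact
delta = 0.05 * np.linalg.norm(y)            # 5% noise level
noise = delta / np.sqrt(n) * (-1.0) ** np.arange(n)  # norm exactly delta
y_delta = y + noise

x_stop, k_stop = landweber(A, y_delta, delta)          # stopped early
x_long, _ = landweber(A, y_delta, delta, kappa=0.0)    # runs all 5000 steps

if __name__ == "__main__":
    print("stopped after", k_stop, "iterations")
    print("error with stopping rule:", np.linalg.norm(x_stop - x_exact))
    print("error after 5000 steps:  ", np.linalg.norm(x_long - x_exact))
```

In this toy problem the residual keeps decreasing with every iteration, while the error in x is smaller at the stopping index than after the full 5000 iterations. This mirrors the qualitative behaviour of the data residual and the parameter error described above.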

In the second example we want to show the results for a more complex energy landscape having two maxima. In this case only 2% noise has been added to the exact data; the other settings are as given in Table 6.1. Also for this more complex case the reconstruction works well; the results are shown in Figure 6.5. In particular the energy minima and maxima are well reconstructed.

As the above reconstructions have been carried out with the exact value for the diffusion coefficient D, we also want to investigate how robust the reconstruction is with respect to variations in D. For this purpose we fix a deviating diffusion coefficient and try to reconstruct the energy landscape under these conditions. It turns out that the reconstructions still give qualitatively right results as long as the deviation of D from its true value is not too large. Figure 6.6 shows an example for reconstructions with a diffusion coefficient 11% larger (D1) and 11% smaller (D2) than the exact value (no data noise).

For larger deviations in D, e.g. 25%, no reliable reconstruction of the energy landscape could be achieved (see Figure 6.7).

Hence, in the next section we address the question of what happens if the diffusion coefficient D is not taken as a fixed quantity, but is also included as an optimization parameter.

6.2.2 Combined potential and diffusion coefficient

As we would like to reconstruct the energy landscape µ (more precisely its derivative $d\mu/dx$) and the diffusion coefficient D simultaneously, our optimization variable is given by $q = (d\mu/dx, D)$. As before we want to solve

\[ Q(q) \to \min_{q \in X} \]

with Q given by (see (6.1))

\[ Q = \frac{\alpha}{2} \int_0^T |P_{open}(t) - y^\delta(t)|^2 \, dt + \frac{\beta}{2} \int_0^T |I_g(t) - u^\delta(t)|^2 \, dt. \]


Figure 6.5: (a) Time courses of open probabilities for a holding potential of −100 mV and test pulses to 0, 20 and 40 mV; blue: data used for reconstruction, red: data after reconstruction, cyan: data corresponding to initial guess; (b) time courses of gating currents (colour coding as in (a)); (c) reconstruction of energy landscape; blue: exact value; red: reconstruction; cyan: initial guess; (d) data residual Q over the number of iterations.

Both quantities $d\mu/dx$ and D are updated simultaneously. Again we use the adjoint approach to compute the update direction $\nabla_q Q$ for the iterative scheme

\[ q_{k+1} = q_k - \gamma_k \nabla_q Q(q_k). \]

For a statement of the involved equations we refer to the Appendix. The adjoint PDE system to be solved for the Lagrange parameter λ remains the same as in the previous identification problem, see (6.4)-(6.7); only the computation of the update has to be modified.

Assuming a scalar diffusion coefficient, i.e. D independent of x, the simultaneous identification of the energy landscape shape $d\mu/dx$ and the diffusion coefficient D yields reasonable results. The final reconstruction result for the hat-shaped energy profile is shown in Figure 6.8 together with the development of the diffusion coefficient D during the iteration. The corresponding open probabilities and gating currents for the first holding potential can be seen in Figure 6.9.

The right part of Figure 6.8 shows the evolution of the diffusion coefficient during the iteration for two different starting values. The exact value of the diffusion coefficient has been scaled to one for illustrative purposes. We see that in both cases, for an initial guess that is too large and one that is too small, the exact value is approximated rather well. The corresponding energy landscape is also nicely recovered. If we compare these results with the ones in Figure 6.7, we see that the combined identification process gives definitely better results than the energy optimization with a fixed wrong diffusion coefficient. This also holds true for large deviations of the diffusion coefficient, where we could not recover the right potential shape at all in the last section.

Figure 6.6: (a) Reconstruction of energy landscape for $(D_1 - D_{exact})/D_{exact} = 0.11$; blue: exact value; red: reconstruction; cyan: initial guess; (b) corresponding time courses of open probabilities for a holding potential of −100 mV and test pulses to 0, 20 and 40 mV; blue: data used for reconstruction, red: data after reconstruction, cyan: data corresponding to initial guess; (c) reconstruction of energy landscape for $(D_2 - D_{exact})/D_{exact} = -0.11$; (d) corresponding time courses of open probabilities.


Figure 6.7: (a) Reconstruction of energy landscape for 25% deviation in D; blue: exact value; red: reconstruction; cyan: initial guess; (b) time courses of open probabilities for a holding potential of −100 mV and test pulses to 0, 20 and 40 mV; blue: data used for reconstruction, red: data after reconstruction, cyan: data corresponding to initial guess.

Figure 6.8: Reconstruction results for simultaneous identification of $d\mu/dx$ and D. (a) Reconstruction of energy landscape; blue: exact value; red: reconstruction; cyan: initial guess; (b) development of the diffusion coefficient over the number of iterations, with exact value normalized to one; blue: exact value, red and green: two different initial guesses for D.

Figure 6.9: Data sets for simultaneous identification of $d\mu/dx$ and D for a holding potential of −100 mV and test pulses to 0, 20 and 40 mV; blue: data used for reconstruction, red: data after reconstruction, cyan: data corresponding to initial guess. (a) Time courses of open probabilities; (b) time courses of gating currents.

Chapter 7

Concluding Remarks

This thesis was dedicated to theoretical investigations regarding transport through ion channels and their gating behaviour. Mathematical models are a helpful tool for gaining more insight into biophysical systems and can form a beneficial complement to experimental investigations. We have demonstrated that the presented models can in principle be used to address questions of inverse parameter identification both in the case of ion conduction and regarding voltage gating. The results presented in this thesis are based on artificially generated data to investigate the possibilities and the performance of the algorithms. The natural next step is the application of the developed tools to real experimental data. To be able to do this, the models, especially the gating model, need to be validated, i.e. one needs to show that the models indeed capture the main aspects of the channel behaviour. (The PNP model has already been successfully applied e.g. to the ryanodine receptor ([42]).) Validation can be accomplished by checking whether the model can fit a sufficiently large class of experimental data by adjusting the current model parameters. Once a parameter set has been established, a further step of validation is to use this parameter set to predict data that have not been involved in the fitting procedure.

We started some investigations with respect to the cation channel TRPA1, a member of the family of transient receptor potential (TRP) channels. A survey of these channels can be found e.g. in [62]. Detailed structural and geometrical information is not known in the case of TRPA1, and the introduced mathematical models could help to gain more insight e.g. with respect to the channel radius and the essential charged residues that are responsible for the selectivity and conductance properties of the channel. Several mutational studies performed on the pore region of TRPA1 ([118]) and other TRP channels ([66]) showed that the conductance properties can be substantially altered by changing some key residues. The mutations changed either the charge or the size of the residues, resulting in shifts of the reversal potential in whole cell recordings. First tests with the PNP model could qualitatively reproduce these shifts, but calibration of the model and further refinements still need to be done. If this is accomplished and the experimental results can be reproduced satisfactorily, one can try to address the inverse problem to get some information about the underlying system properties or to make predictions that could suggest other interesting mutations to be performed in the experiments.

Apart from the combination of experimental techniques and mathematical models the com-

109

110 CHAPTER 7. CONCLUDING REMARKS

bination of existing model types to generate more accurate and efficient schemes for thedescription of ion transport is a field of major research. As we pointed out in the thesis,every modelling approach has its advantages and drawbacks. Hence an ideal answer wouldbe a model that combines all the advantages from the different model classes, providing adetailed description of the channel structure and at the same time being computationally soefficient that e.g. inverse problems could be addressed with it. As the experimental tech-niques and prospects become better, allowing a more and more detailed resolution of channelstructure and composition, also efficient models that can incorporate detailed structures areneeded. The general idea would be to combine a detailed atomistic model describing thefilter region with a macroscopic description of the attached baths and a reasonable modelfor the channel protein and the surrounding membrane. Apart from physical considerationssuch as which regions need to be modelled in detail and which interactions have to be re-garded, the coupling of discrete microscopic and macroscopic continuum models also posessome interesting mathematical questions. Suitable transition conditions have to be definedat the interface of the two descriptions and macroscopic quantities need to be transformedinto microscopic ones and vice versa.

Theoretical investigations of ion channel gating up to now have been restricted either to ab-stract models like discrete state Markov models or to models solely focussing on the voltagesensor. The usual assumption is that if the voltage sensor is in a certain state/position thepore opens and the channel becomes conductive. But to our knowledge there is no modelthat really describes the opening process coupled to the movement of the voltage sensor. Oneapproach could be to start from the multi-particle Fokker-Planck model that we derived inChapter 4, including not only the voltage sensor but also the pore region into the simulationdomain. It would be interesting to see if a sufficiently simple mechanism could be includedthat automatically leads to an opening of the pore once the voltage sensor has performeda certain movement. In order to do this some hypothesis about the voltage sensor-porecoupling needs to be available (e.g. a mechanical coupling or coupling via electrostatic in-teractions).

Similar questions could be posed regarding the pore blocking mechanism that disrupts the ion conduction. Several possible ways have been proposed for this and it would be interesting to see if conduction models like PNP could be supplemented with a mechanism generating such a pore block under certain conditions. One idea put forward was e.g. the formation of a gas bubble inside the filter that blocks the ion pathway ([98]).

Until experimental procedures are available to prove or disprove such hypotheses, mathematical models can also be a helpful tool to investigate the plausibility and feasibility of different theories.

Appendix A

Adjoint systems

A.1 Adjoint system for full PNP model

In this section we will derive the adjoint system for optimizations with the full one-dimensional PNP model.

The current output of the forward operator F is given by

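\[
I = e_0 \sum_{i=1}^{m-1} z_i J_i
\]

with

\[
J_i = \frac{\eta_i^- \exp\big(z_i c V(-L) + \mu_i^{ex}(-L)\big) - \eta_i^+ \exp\big(z_i c V(L) + \mu_i^{ex}(L)\big)}{\int_{-L}^{L} \frac{1}{D_i A} \exp\big(z_i c V + \mu_i^{ex}\big)\, dx}, \quad i = 1, \dots, m-1
\]

(see (3.1)).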
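As a minimal numerical sketch of the flux and current formulas above, the following Python functions evaluate $J_i$ and $I$ from grid samples of $V$, $\mu_i^{ex}$, $D_i$ and $A$ on a uniform grid over $[-L, L]$. The function names, the uniform grid and the use of the trapezoidal rule are assumptions made for illustration only; they are not taken from the thesis.

```python
import math


def trapezoid(values, h):
    """Composite trapezoidal rule for samples on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def species_flux(V, mu_ex, D, A, z, c, eta_minus, eta_plus, dx):
    """Flux J_i of one species; all lists are samples on a uniform grid over [-L, L]."""
    left = eta_minus * math.exp(z * c * V[0] + mu_ex[0])     # bath term at x = -L
    right = eta_plus * math.exp(z * c * V[-1] + mu_ex[-1])   # bath term at x = +L
    integrand = [math.exp(z * c * v + m) / (d * a)
                 for v, m, d, a in zip(V, mu_ex, D, A)]
    return (left - right) / trapezoid(integrand, dx)


def total_current(fluxes, valences, e0=1.0):
    """Current I = e0 * sum_i z_i J_i over the mobile species."""
    return e0 * sum(z * J for z, J in zip(valences, fluxes))
```

For vanishing potentials and unit coefficients on $[-1, 1]$ the integral in the denominator equals 2, so the flux reduces to $(\eta^- - \eta^+)/2$, which gives a quick sanity check of an implementation.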

The minimization functional reads as

\[
Q(q) = \frac{1}{2} \| I - I^\delta \|^2
\]

and we want to solve

\[
Q(q) \to \min_q
\]

under the constraint that $V$, $\rho_i$, $i = 1, \dots, m$, solve the steady-state PNP system. We take $q = \exp(-\mu_c)$. The associated Lagrange functional $\mathcal{L} = \mathcal{L}(V, \rho_i, q, \alpha_v, \alpha_i)$ is given by

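\[
\begin{aligned}
\mathcal{L} = Q(q) &+ \int \alpha_v \Big[ -\lambda^2 \frac{d}{dx}\Big( \epsilon A \frac{dV}{dx} \Big) - \sum_{i=1}^{m} z_i A \rho_i \Big]\, dx \\
&+ \sum_{i=1}^{m-1} \int \alpha_i \frac{d}{dx}\Big[ D_i A \Big( \frac{d\rho_i}{dx} + z_i c \rho_i \frac{dV}{dx} + \rho_i \frac{d\mu_i^{ex}}{dx} \Big) \Big]\, dx \\
&+ \int_{\text{filter}} \alpha_m \Big[ \rho_m - \frac{N_m\, q \exp(-z_m c V - \mu_m^{ex})}{\int_{\text{filter}} q A \exp(-z_m c V - \mu_m^{ex})\, ds} \Big]\, dx.
\end{aligned}
\]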


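Here $\alpha_v = \alpha_v(x)$ and $\alpha_i = \alpha_i(x)$ denote the Lagrange parameters and the integrals are over the system domain (i.e. from $-L$ to $L$) unless stated otherwise.

Computing the partial derivatives of $\mathcal{L}$ and setting them equal to zero,

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial V}\, V &= 0 \quad \text{for all } V, \\
\frac{\partial \mathcal{L}}{\partial \rho_i}\, \rho_i &= 0 \quad \text{for all } \rho_i, \; i = 1, \dots, m, \\
\frac{\partial \mathcal{L}}{\partial \alpha_i}\, \alpha_i &= 0 \quad \text{for all } \alpha_i, \; i = 1, \dots, m, v,
\end{aligned}
\]

leads to the following PDE system to be solved for the Lagrange parameters (for fixed $V$, $\rho_i$, $\mu_i^{ex}$ and given $q$):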

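\[
\begin{aligned}
-\lambda^2 \frac{d}{dx}\Big( \epsilon A \frac{d\alpha_v}{dx} \Big) + \sum_{i=1}^{m-1} \frac{d}{dx}\Big( D_i A\, z_i c\, \rho_i \frac{d\alpha_i}{dx} \Big) + FT &= RHS, \\
-z_i A\, \alpha_v + \frac{d}{dx}\Big( D_i A \frac{d\alpha_i}{dx} \Big) - D_i A \Big( z_i c \frac{dV}{dx} + \frac{d\mu_i^{ex}}{dx} \Big) \frac{d\alpha_i}{dx} &= 0, \quad i = 1, \dots, m-1, \\
-z_m A\, \alpha_v + \alpha_m &= 0 \quad \text{in the filter region,}
\end{aligned}
\]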

with the filter term FT (only added in filter region) given by

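\[
FT = N_m z_m c \left[ \frac{\alpha_m\, q \exp(-z_m c V - \mu_m^{ex})}{\int_{\text{filter}} A q \exp(-z_m c V - \mu_m^{ex})\, ds} - \frac{\int_{\text{filter}} \alpha_m\, q \exp(-z_m c V - \mu_m^{ex})\, ds}{\Big( \int_{\text{filter}} A q \exp(-z_m c V - \mu_m^{ex})\, ds \Big)^2}\, A q \exp(-z_m c V - \mu_m^{ex}) \right]
\]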

and the right-hand side

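\[
RHS = (I - I^\delta)\, e_0 \sum_{i=1}^{m-1} \left[ z_i^2 c\, \frac{\eta_i^- \exp\big(z_i c V(-L) + \mu_i^{ex}(-L)\big) - \eta_i^+ \exp\big(z_i c V(L) + \mu_i^{ex}(L)\big)}{\Big( \int_{-L}^{L} \frac{1}{D_i A} \exp\big(z_i c V + \mu_i^{ex}\big)\, dx \Big)^2} \cdot \frac{1}{D_i A} \exp\big(z_i c V + \mu_i^{ex}\big) \right].
\]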

The boundary conditions to the above system are given by

\[
\alpha_i(-L) = \alpha_i(L) = 0, \quad i = 1, \dots, m-1, v.
\]

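The gradient $\nabla_q Q$ is then determined by evaluating $\frac{\partial \mathcal{L}}{\partial q}$, which can be computed to be

\[
\frac{\partial \mathcal{L}}{\partial q} = N_m \left[ \frac{\int \alpha_m\, q \exp(-z_m c V - \mu_m^{ex})\, dx}{\Big( \int A q \exp(-z_m c V - \mu_m^{ex})\, dx \Big)^2}\, A \exp(-z_m c V - \mu_m^{ex}) - \frac{\alpha_m \exp(-z_m c V - \mu_m^{ex})}{\int A q \exp(-z_m c V - \mu_m^{ex})\, dx} \right],
\]

where in this case all integrals are over the filter region.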
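The gradient expression above can be checked numerically. The sketch below discretizes the only part of the Lagrange functional that depends explicitly on $q$, namely $-N_m \int \alpha_m q E\, dx / \int A q E\, ds$ with $E = \exp(-z_m c V - \mu_m^{ex})$, by a quadrature rule with weights `w`, and evaluates the closed-form gradient on the same grid. All names, the quadrature weights and the sample data are hypothetical; this is an illustration under these assumptions, not the implementation used in the thesis.

```python
def _filter_sums(q, alpha, A, E, w):
    """Quadrature sums approximating int(alpha q E) and int(A q E) over the filter."""
    num = sum(wj * aj * qj * Ej for wj, aj, qj, Ej in zip(w, alpha, q, E))
    den = sum(wj * Aj * qj * Ej for wj, Aj, qj, Ej in zip(w, A, q, E))
    return num, den


def filter_term(q, alpha, A, E, N, w):
    """Discrete q-dependent part of the Lagrange functional (rho_m term omitted)."""
    num, den = _filter_sums(q, alpha, A, E, w)
    return -N * num / den


def filter_gradient(q, alpha, A, E, N, w):
    """Pointwise values of dL/dq from the closed-form expression."""
    num, den = _filter_sums(q, alpha, A, E, w)
    return [N * (num / den ** 2 * Aj * Ej - aj * Ej / den)
            for aj, Aj, Ej in zip(alpha, A, E)]


def filter_fd(q, alpha, A, E, N, w, k, h=1e-6):
    """Central finite difference of filter_term with respect to the node value q[k]."""
    qp = list(q)
    qm = list(q)
    qp[k] += h
    qm[k] -= h
    return (filter_term(qp, alpha, A, E, N, w)
            - filter_term(qm, alpha, A, E, N, w)) / (2.0 * h)
```

Since the discrete functional sums over nodes with weights `w`, the finite difference with respect to `q[k]` should agree with `w[k]` times the pointwise gradient value at node `k`.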


A.2 Adjoint system for linear surrogate model

As we used surrogate models for the identification process in Chapter 3, we will derive the adjoint system for the linear surrogate functional in the following, which has been used in the inner iteration in the iterated algorithm.

The aim is to minimize the following objective functional

\[
Q(q) = \frac{1}{2} \| S(q) - g^\delta \|^2 \tag{A.1}
\]

with respect to the parameter $q$ to be identified, taken as $q = \exp(-\mu_c)$. Here $g^\delta$ stands for the measured conductance.

The operator $S$ describes the parameter-to-output map for the conductance $g$. It can be written as

\[
S : q \mapsto g(q)
\]

with

\[
S(q) = k \sum_{i=1}^{m-1} z_i j_i = -k \sum_{i=1}^{m-1} z_i^2 c\, \frac{\int_{-L}^{L} \exp\big(z_i c V + \mu_i^{ex}\big)\, \rho_i \frac{dv}{dx}\, dx}{\int_{-L}^{L} \frac{1}{D_i A} \exp\big(z_i c V + \mu_i^{ex}\big)\, dx}
\]

where $k$ denotes a scaling constant and $V$, $\rho_i$, $\mu_i^{ex}$ and $v$ denote the solution of the full and linearized PDE system, depending on the parameter $q$.

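The associated Lagrange functional $\mathcal{L} = \mathcal{L}(v, r_i, q, \alpha_i)$ then reads as

\[
\begin{aligned}
\mathcal{L} = Q(q) &+ \int \alpha_v \Big[ -\lambda^2 \frac{d}{dx}\Big( \epsilon A \frac{dv}{dx} \Big) - \sum_{i=1}^{m} z_i A r_i \Big]\, dx \\
&+ \sum_{i=1}^{m-1} \int \alpha_i \frac{d}{dx}\Big[ D_i A \Big( \frac{dr_i}{dx} + z_i c\, r_i \frac{dV}{dx} + z_i c\, \rho_i \frac{dv}{dx} + r_i \frac{d\mu_i^{ex}}{dx} \Big) \Big]\, dx \\
&+ \int \alpha_m \Big[ r_m + z_m c N q\, e^{-z_m c V - \mu_m^{ex}} \Big( \frac{v}{\int A q\, e^{-z_m c V - \mu_m^{ex}}\, ds} - \frac{\int A q\, e^{-z_m c V - \mu_m^{ex}}\, v\, ds}{\big( \int A q\, e^{-z_m c V - \mu_m^{ex}}\, ds \big)^2} \Big) \Big]\, dx
\end{aligned}
\]

with $\alpha_i = \alpha_i(x)$ denoting the Lagrange parameters.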

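Computing the partial derivatives of $\mathcal{L}$ and setting them equal to zero,

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial v}\, v &= 0 \quad \text{for all } v, \\
\frac{\partial \mathcal{L}}{\partial r_i}\, r_i &= 0 \quad \text{for all } r_i, \; i = 0, \dots, m, \\
\frac{\partial \mathcal{L}}{\partial \alpha_i}\, \alpha_i &= 0 \quad \text{for all } \alpha_i, \; i = 0, \dots, m, v,
\end{aligned}
\]

leads to the following PDE system to be solved for the Lagrange parameters (for fixed $V$, $\rho_i$, $\mu_i^{ex}$ and given $q$):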


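\[
\begin{aligned}
-\lambda^2 \frac{d}{dx}\Big( \epsilon A \frac{d\alpha_v}{dx} \Big) + \sum_{i=1}^{m-1} \frac{d}{dx}\Big( D_i A\, z_i c\, \rho_i \frac{d\alpha_i}{dx} \Big) + FT &= RHS, \\
-z_i A\, \alpha_v + \frac{d}{dx}\Big( D_i A \frac{d\alpha_i}{dx} \Big) - D_i A \Big( z_i c \frac{dV}{dx} + \frac{d\mu_i^{ex}}{dx} \Big) \frac{d\alpha_i}{dx} &= 0, \quad i = 1, \dots, m-1, \\
-z_m A\, \alpha_v + \alpha_m &= 0,
\end{aligned}
\]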

with the filter term FT given by

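\[
FT = N_m \left[ \frac{\alpha_m z_m c\, q \exp(-z_m c V - \mu_m^{ex})}{\int A q \exp(-z_m c V - \mu_m^{ex})\, dx} - \frac{\int \alpha_m z_m c\, q \exp(-z_m c V - \mu_m^{ex})\, dx}{\Big( \int A q \exp(-z_m c V - \mu_m^{ex})\, dx \Big)^2}\, A q \exp(-z_m c V - \mu_m^{ex}) \right]
\]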

and the right-hand side

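\[
RHS = -\big( S(q) - g^\delta \big)\, k \sum_{i=1}^{m-1} \left[ \frac{z_i^2 c}{\int \frac{1}{D_i A}\, e^{z_i c V + \mu_i^{ex}}\, ds}\, \frac{d}{dx}\Big( e^{z_i c V + \mu_i^{ex}}\, \rho_i \Big) \right].
\]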

The boundary conditions are given by

\[
\alpha_i(-L) = \alpha_i(L) = 0, \quad i = 1, \dots, m-1, v.
\]

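The update is then computed by evaluating $\frac{\partial \mathcal{L}}{\partial q}$. Writing $E = e^{-z_m c V - \mu_m^{ex}}$ for brevity, it reads

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial q} = z_m c N \Bigg[ \; & \alpha_m E \left( \frac{v}{\int A q E\, ds} - \frac{\int A q E\, v\, ds}{\big( \int A q E\, ds \big)^2} \right) \\
& - \frac{\int \alpha_m q E\, v\, ds}{\big( \int A q E\, ds \big)^2}\, A E \\
& - \frac{\int \alpha_m q E\, ds}{\big( \int A q E\, ds \big)^2}\, A E\, v \\
& + 2\, \frac{\int \alpha_m q E\, ds \int A q E\, v\, ds}{\big( \int A q E\, ds \big)^3}\, A E \; \Bigg].
\end{aligned}
\]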
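The four-term update formula above can be verified against finite differences. The following sketch discretizes the $q$-dependent filter term of the surrogate Lagrange functional with quadrature weights `w`, writing `E` for samples of $e^{-z_m c V - \mu_m^{ex}}$ and `zc` for the product $z_m c$. Function names, weights and sample values are hypothetical and serve only to illustrate the structure of the derivative.

```python
def _surrogate_sums(q, alpha, A, E, v, w):
    """Quadrature sums I0 = int(A q E), I1 = int(A q E v), S0 = int(alpha q E), S1 = int(alpha q E v)."""
    I0 = sum(wj * Aj * qj * Ej for wj, Aj, qj, Ej in zip(w, A, q, E))
    I1 = sum(wj * Aj * qj * Ej * vj for wj, Aj, qj, Ej, vj in zip(w, A, q, E, v))
    S0 = sum(wj * aj * qj * Ej for wj, aj, qj, Ej in zip(w, alpha, q, E))
    S1 = sum(wj * aj * qj * Ej * vj for wj, aj, qj, Ej, vj in zip(w, alpha, q, E, v))
    return I0, I1, S0, S1


def surrogate_term(q, alpha, A, E, v, zc, N, w):
    """Discrete q-dependent part of the surrogate Lagrange functional (r_m term omitted)."""
    I0, I1, S0, S1 = _surrogate_sums(q, alpha, A, E, v, w)
    return zc * N * (S1 / I0 - S0 * I1 / I0 ** 2)


def surrogate_gradient(q, alpha, A, E, v, zc, N, w):
    """Pointwise values of dL/dq from the four-term closed-form expression."""
    I0, I1, S0, S1 = _surrogate_sums(q, alpha, A, E, v, w)
    grad = []
    for aj, Aj, Ej, vj in zip(alpha, A, E, v):
        direct = aj * Ej * (vj / I0 - I1 / I0 ** 2)   # explicit factor q
        t2 = -S1 / I0 ** 2 * Aj * Ej                  # from 1/I0 in v/I0
        t3 = -S0 / I0 ** 2 * Aj * Ej * vj             # from I1
        t4 = 2.0 * S0 * I1 / I0 ** 3 * Aj * Ej        # from 1/I0^2
        grad.append(zc * N * (direct + t2 + t3 + t4))
    return grad


def surrogate_fd(q, alpha, A, E, v, zc, N, w, k, h=1e-6):
    """Central finite difference of surrogate_term with respect to q[k]."""
    qp = list(q)
    qm = list(q)
    qp[k] += h
    qm[k] -= h
    return (surrogate_term(qp, alpha, A, E, v, zc, N, w)
            - surrogate_term(qm, alpha, A, E, v, zc, N, w)) / (2.0 * h)
```

As in any discretize-then-differentiate check, the finite difference with respect to `q[k]` is compared with `w[k]` times the pointwise gradient value.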

A.3 Adjoint system for the gating model

Here we present a detailed derivation of the adjoint system for the one-dimensional Fokker-Planck like gating model.

The Lagrange functional is given by

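\[
\mathcal{L}(p, q, \lambda) = Q(p, q) + \int_0^1 \int_0^T \lambda \left[ \frac{\partial p}{\partial t} - \frac{\partial}{\partial x}\Big( D \Big( \frac{d\mu}{dx} - zU \Big) p + D \frac{\partial p}{\partial x} \Big) \right] dt\, dx,
\]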


and the corresponding partial derivatives can be computed as in the following. We start by determining the partial derivatives of the optimization functional

\[
Q = \frac{\alpha}{2} \int_0^T |P_{open}(t) - y^\delta(t)|^2\, dt + \frac{\beta}{2} \int_0^T |I_g(t) - u^\delta(t)|^2\, dt \tag{A.2}
\]

with

\[
P_{open}(t) = \int_0^1 \omega(x)\, p(x, t)\, dx,
\]

and

\[
I_g(t) = -c \int_0^1 z \left[ D \Big( \frac{d\mu}{dx} - zU \Big) p + D \frac{\partial p}{\partial x} \right] dx.
\]
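The two output quantities entering the functional (A.2) are plain spatial integrals of the density $p$ and can be evaluated by quadrature at each time level. The sketch below assumes a uniform grid on $[0, 1]$, the trapezoidal rule and finite differences for $\partial p / \partial x$; the list `drift` holds samples of $\frac{d\mu}{dx} - zU$. All names and discretization choices are illustrative assumptions.

```python
def trapezoid(values, h):
    """Composite trapezoidal rule for samples on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def open_probability(omega, p, dx):
    """P_open = integral of omega(x) p(x, t) over [0, 1] at one time level."""
    return trapezoid([o * pj for o, pj in zip(omega, p)], dx)


def gating_current(p, drift, D, z, c, dx):
    """I_g = -c * integral of z [D drift p + D dp/dx] over [0, 1] at one time level."""
    n = len(p)
    dpdx = [0.0] * n
    dpdx[0] = (p[1] - p[0]) / dx               # one-sided difference at x = 0
    dpdx[-1] = (p[-1] - p[-2]) / dx            # one-sided difference at x = 1
    for j in range(1, n - 1):
        dpdx[j] = (p[j + 1] - p[j - 1]) / (2.0 * dx)
    integrand = [z[j] * D[j] * (drift[j] * p[j] + dpdx[j]) for j in range(n)]
    return -c * trapezoid(integrand, dx)
```

For example, with $p(x) = x$, $\omega = D = z = c = 1$ and zero drift, these routines should return $P_{open} = 1/2$ and $I_g = -1$ up to rounding.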

It is

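\[
\begin{aligned}
\frac{\partial Q}{\partial p}\, p = {} & \alpha \int_0^T \int_0^1 \big( P_{open}(t) - y^\delta(t) \big)\, \omega(x)\, p(x, t)\, dx\, dt \\
& - c \beta \int_0^T \big( I_g(t) - u^\delta(t) \big) \cdot \left\{ \int_0^1 \Big[ z D \Big( \frac{d\mu}{dx} - zU \Big) - \frac{\partial}{\partial x}(z D) \Big] p\, dx + z \big[ D(1)\, p(1, t) - D(0)\, p(0, t) \big] \right\} dt \\
= {} & \alpha \int_0^T \int_0^1 \big( P_{open}(t) - y^\delta(t) \big)\, \omega(x)\, p(x, t)\, dx\, dt \\
& - c \beta \int_0^T \int_0^1 \big( I_g(t) - u^\delta(t) \big) \cdot \Big[ z D \Big( \frac{d\mu}{dx} - zU \Big) - \frac{\partial}{\partial x}(z D) \Big] p\, dx\, dt \\
& - c \beta \int_0^T \big( I_g(t) - u^\delta(t) \big)\, z\, \big[ D(1)\, p(1, t) - D(0)\, p(0, t) \big]\, dt.
\end{aligned}
\]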

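Next we turn our attention to $\frac{\partial \mathcal{L}}{\partial p}$:

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial p}\, p = \frac{\partial Q}{\partial p}\, p & + \int_0^1 \int_0^T \left[ -\frac{\partial \lambda}{\partial t} + \frac{\partial \lambda}{\partial x}\, D \Big( \frac{d\mu}{dx} - zU \Big) - \frac{\partial}{\partial x}\Big( D \frac{\partial \lambda}{\partial x} \Big) \right] p\, dt\, dx \\
& + \int_0^1 \big[ \lambda(x, T)\, p(x, T) - \lambda(x, 0)\, p(x, 0) \big]\, dx \\
& + \int_0^T \left[ D(1)\, \frac{\partial \lambda}{\partial x}(1, t)\, p(1, t) - D(0)\, \frac{\partial \lambda}{\partial x}(0, t)\, p(0, t) \right] dt.
\end{aligned}
\tag{A.3}
\]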

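Since $\frac{\partial \mathcal{L}}{\partial p}\, p = 0$ has to hold for all $p$, the resulting adjoint partial differential equation for the Lagrange parameter $\lambda$ is given by

\[
-\frac{\partial \lambda}{\partial t} + \frac{\partial \lambda}{\partial x}\, D \Big( \frac{d\mu}{dx} - zU \Big) - \frac{\partial}{\partial x}\Big( D \frac{\partial \lambda}{\partial x} \Big) = -R,
\]

with $R$ given by

\[
R(x, t) = \alpha\, \omega(x)\, \big( P_{open}(t) - y^\delta(t) \big) + \beta \cdot \Big[ c\, \frac{\partial}{\partial x}(z D) - c\, z D \Big( \frac{d\mu}{dx} - zU \Big) \Big] \big( I_g(t) - u^\delta(t) \big).
\]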


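The adjoint system will be solved backward in time and the appropriate initial and boundary conditions for $\lambda$ are determined by demanding that the boundary terms in (A.3) vanish. Since the initial value $p(x, 0) = p_0(x)$ is fixed, the variation satisfies $p(x, 0) = 0$, and this yields

\[
\lambda(x, T) = 0
\]

and

\[
\frac{\partial \lambda}{\partial x}(1, t) = c\, z\, \beta\, \big( I_g(t) - u^\delta(t) \big)
\]

as well as

\[
\frac{\partial \lambda}{\partial x}(0, t) = c\, z\, \beta\, \big( I_g(t) - u^\delta(t) \big)
\]

(assuming $D(1) \neq 0$, $D(0) \neq 0$).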
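To illustrate what solving the adjoint system backward in time means in practice, the following sketch marches the adjoint equation from the terminal condition $\lambda(x, T) = 0$ down to $t = 0$ with an explicit Euler step and central differences, imposing the Neumann data through ghost values. This simple scheme is an assumption chosen for brevity and is not claimed to be the discretization used in the thesis; it is only stable for time steps of roughly `dt <= dx**2 / (2 * max(D))`. The arrays `R`, `g0` and `g1` (source term and prescribed values of $\partial\lambda/\partial x$ at $x = 0$ and $x = 1$) are hypothetical inputs indexed by time level.

```python
def solve_adjoint(R, D, drift, g0, g1, dx, dt):
    """Explicit backward-in-time march for
        -lam_t + D*drift*lam_x - (D*lam_x)_x = -R,   lam(x, T) = 0,
    with lam_x(0, t) = g0 and lam_x(1, t) = g1.
    R[n][j] is the source at time level n and node j; returns lam[n][j]."""
    nt = len(R) - 1
    nx = len(D)
    lam = [[0.0] * nx for _ in range(nt + 1)]    # lam[nt] is the terminal condition
    for n in range(nt, 0, -1):
        cur = lam[n]
        new = [0.0] * nx
        for j in range(nx):
            # neighbours, using ghost values from the Neumann data at the ends
            left = cur[j - 1] if j > 0 else cur[1] - 2.0 * dx * g0[n]
            right = cur[j + 1] if j < nx - 1 else cur[nx - 2] + 2.0 * dx * g1[n]
            D_l = 0.5 * (D[j] + D[j - 1]) if j > 0 else D[0]
            D_r = 0.5 * (D[j] + D[j + 1]) if j < nx - 1 else D[nx - 1]
            diff = (D_r * (right - cur[j]) - D_l * (cur[j] - left)) / dx ** 2
            adv = D[j] * drift[j] * (right - left) / (2.0 * dx)
            # lam_t = adv - diff + R, hence lam(t - dt) = lam(t) - dt * lam_t
            new[j] = cur[j] + dt * (diff - adv - R[n][j])
        lam[n - 1] = new
    return lam
```

A simple consistency check: for a constant source $R = r$, zero drift and homogeneous Neumann data, the exact solution is $\lambda(x, t) = -r\,(T - t)$, which the explicit scheme reproduces up to rounding.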

As the Fokker-Planck model actually depends solely on the derivative $\mu' = \frac{d\mu}{dx}$ of the energy landscape and not on its absolute value $\mu$, it makes more sense to consider $q = \mu'$ instead of $q = \mu$. For the optimization functional we get

\[
\frac{\partial Q}{\partial \mu'}\, \mu' = -c \beta \int_0^1 \int_0^T z D\, p\, \big( I_g(t) - u^\delta(t) \big)\, dt\; \mu'\, dx
\]

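and the partial derivative of the Lagrange functional $\frac{\partial \mathcal{L}}{\partial \mu'}$ is then given by

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \mu'}\, \mu' &= \frac{\partial Q}{\partial \mu'}\, \mu' + \int_0^1 \int_0^T \frac{\partial \lambda}{\partial x}\, D\, p\, \mu'\, dt\, dx \\
&= \int_0^1 \int_0^T \Big[ \frac{\partial \lambda}{\partial x}\, D\, p - c \beta\, z D\, p\, \big( I_g(t) - u^\delta(t) \big) \Big]\, \mu'\, dt\, dx,
\end{aligned}
\]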

making use of the no-flux boundary conditions on $p$.

It might also be necessary to consider the diffusion coefficient $D$ as the optimization parameter, $q = D$. The derivatives in this case are given by

\[
\frac{\partial Q}{\partial D}\, D = -c \beta \int_0^1 \int_0^T \big( I_g(t) - u^\delta(t) \big)\, z \left[ \Big( \frac{d\mu}{dx} - zU \Big) p + \frac{\partial p}{\partial x} \right] dt\; D\, dx
\]

and

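\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial D}\, D &= \frac{\partial Q}{\partial D}\, D + \int_0^1 \int_0^T \frac{\partial \lambda}{\partial x} \left[ \Big( \frac{d\mu}{dx} - zU \Big) p + \frac{\partial p}{\partial x} \right] D\, dt\, dx \\
&= \int_0^1 \int_0^T \left\{ \frac{\partial \lambda}{\partial x} \left[ \Big( \frac{d\mu}{dx} - zU \Big) p + \frac{\partial p}{\partial x} \right] - c \beta\, \big( I_g(t) - u^\delta(t) \big)\, z \left[ \Big( \frac{d\mu}{dx} - zU \Big) p + \frac{\partial p}{\partial x} \right] \right\} dt\; D\, dx.
\end{aligned}
\]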

Note that the boundary terms arising from the partial integration vanish since the flux $J = -D \left[ \big( \frac{d\mu}{dx} - zU \big) p + \frac{\partial p}{\partial x} \right]$ vanishes at the boundaries.

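In the case that both parameters $\mu'$ and $D$ should be optimized at the same time, the partial derivative $\frac{\partial \mathcal{L}}{\partial q}$ for $q = (\mu', D)$ is given by $\frac{\partial \mathcal{L}}{\partial q} = \big( \frac{\partial \mathcal{L}}{\partial \mu'}, \frac{\partial \mathcal{L}}{\partial D} \big)$.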
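Once the forward solution $p$ and the adjoint $\lambda$ are available on a space-time grid, the two gradient components derived in this section reduce to time integrals at each spatial node. The sketch below assembles them with the trapezoidal rule in time. The array layout (`[time level][node]`), the names and the precomputed residual `resid[n]` $= I_g(t_n) - u^\delta(t_n)$ are assumptions made for illustration.

```python
def trapezoid(values, h):
    """Composite trapezoidal rule for samples on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def gating_gradients(lam_x, p, p_x, drift, D, z, resid, c, beta, dt):
    """Pointwise gradients with respect to mu' and D.

    lam_x[n][j], p[n][j], p_x[n][j]: samples of d(lambda)/dx, p and dp/dx.
    drift[j]: samples of dmu/dx - z U.  resid[n]: I_g(t_n) - u_delta(t_n).
    Returns (g_mu, g_D), each a list over the spatial nodes."""
    nt = len(resid)
    nx = len(D)
    g_mu = [0.0] * nx
    g_D = [0.0] * nx
    for j in range(nx):
        # common factor: d(lambda)/dx - c * beta * z * (I_g - u_delta)
        factor = [lam_x[n][j] - c * beta * z[j] * resid[n] for n in range(nt)]
        f_mu = [factor[n] * D[j] * p[n][j] for n in range(nt)]
        f_D = [factor[n] * (drift[j] * p[n][j] + p_x[n][j]) for n in range(nt)]
        g_mu[j] = trapezoid(f_mu, dt)
        g_D[j] = trapezoid(f_D, dt)
    return g_mu, g_D
```

Both components share the factor $\frac{\partial \lambda}{\partial x} - c \beta z\, (I_g - u^\delta)$, so it is computed once per node and reused.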

Bibliography

[1] S. K. Aggarwal and R. MacKinnon. Contribution of the S4 segment to gating charge in the Shaker K+ channel. Neuron, 16(6):1169–1177, 1996.

[2] A. Aksimentiev, R. Brunner, E. Cruz-Chu, J. Comer, and K. Schulten. Modeling transport through synthetic nanopores. IEEE Nanotechnology Magazine, 3(1):20–28, 2009.

[3] B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter. Molecular Biology of the Cell. Garland Science, New York, 4th edition, 2002.

[4] B. J. Alder and T. E. Wainwright. Phase transition for a hard sphere system. J. Chem. Phys., 27:1208–1209, 1957.

[5] B. J. Alder and T. E. Wainwright. Studies in Molecular Dynamics. I. General method. The Journal of Chemical Physics, 31(2):459–466, 1959.

[6] M. P. Allen and D. J. Tildesley. Computer Simulation of Liquids. Oxford University Press, New York, 1996.

[7] C. M. Armstrong, F. Bezanilla, and E. Rojas. Destruction of sodium conductance inactivation in squid axons perfused with pronase. J. Gen. Physiol., 62(4):375–391, 1973.

[8] C. M. Armstrong and W. F. Gilly. Fast and slow steps in the activation of sodium channels. J. Gen. Physiol., 74:691–711, 1979.

[9] F. M. Ashcroft. Ion Channels and Disease. Academic Press, San Diego, California, 2000.

[10] R. H. Ashley. Ion Channels - a practical approach. Oxford University Press, New York, 1995.

[11] O. M. Becker, A. D. MacKerell Jr., B. Roux, and M. Watanabe. Computational Biochemistry and Biophysics. Marcel Dekker, Inc., New York, 2001.

[12] F. Bezanilla, E. Perozo, and E. Stefani. Gating of Shaker K+ channels: II. The components of gating currents and a model of channel activation. Biophysical Journal, 66(4):1011–1021, 1994.

[13] P. S. Burada, G. Schmid, and P. Hänggi. Entropic transport - a test bed for the Fick-Jacobs approximation. Phil. Trans. R. Soc. A, 367:3157–3171, 2009.


[14] M. Burger, R. S. Eisenberg, and H. W. Engl. Mathematical design of ion channel selectivity via inverse problems technology. US patent application, submitted 12/04/2006, 2006.

[15] M. Burger, R. S. Eisenberg, and H. W. Engl. Inverse problems related to ion channel selectivity. SIAM Journal on Applied Mathematics, 67(4):960–989, 2007.

[16] B. Chanda and F. Bezanilla. A common pathway for charge transport through voltage-sensing domains. Neuron, 57(3):345–351, 2008.

[17] D. Chen, L. Xu, A. Tripathy, G. Meissner, and B. Eisenberg. Permeation through the calcium release channel of cardiac muscle. Biophys. J., 73(3):1337–1354, 1997.

[18] T. Chou. How fast do fluids squeeze through microscopic single-file pores? Phys. Rev. Lett., 80(1):85–88, Jan 1998.

[19] T. Chou. Kinetics and thermodynamics across single-file pores: Solute permeability and rectified osmosis. The Journal of Chemical Physics, 110(1):606–615, 1999.

[20] T. Chou and D. Lohse. Entropy-driven pumping in zeolites and biological channels. Phys. Rev. Lett., 82(17):3552–3555, Apr 1999.

[21] J. R. Clay. A simple model of K+ channel activation in nerve membrane. J. Theor. Biol., 175.

[22] K. S. Cole and J. W. Moore. Ionic current measurements in the squid giant axon membrane. J. Gen. Physiol., 44:123–167, 1960.

[23] K. S. Cole and J. W. Moore. Potassium ion current in the squid giant axon: dynamic characteristic. Biophysical Journal, 1(1):1–14, 1960.

[24] D. Colquhoun and A. G. Hawkes. Relaxation and fluctuations of membrane currents that flow through drug-operated channels. Proc. of the Royal Society Of London. Series B, Biological Sciences, 199:231–262, 1977.

[25] D. A. Doyle, J. M. Cabral, R. A. Pfuetzner, A. Kuo, J. M. Gulbis, S. L. Cohen, B. T. Chait, and R. MacKinnon. The structure of the potassium channel: Molecular basis of K+ conduction and selectivity. Science, 280(5360):69–77, 1998.

[26] R. M. Dreizler and E. K. U. Gross. Density Functional Theory - An Approach to the Quantum Many-Body Problem. Springer-Verlag, Berlin Heidelberg, 1990.

[27] H. W. Engl, M. Hanke, and A. Neubauer. Regularization of Inverse Problems. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1996.

[28] P. Fatt and B. Katz. The electrical properties of crustacean muscle fibres. J. Physiol., 120:171–204, 1953.

[29] M. Fink and D. Noble. Markov models for ion channels: versatility versus identifiability and speed. Phil. Trans. R. Soc. A, 367:2161–2179, 2009.


[30] D. A. French, R. J. Flannery, C. W. Groetsch, W. B. Krantz, and S. J. Kleene. Numerical approximation of solutions of a nonlinear inverse problem arising in olfaction experimentation. Mathematical and Computer Modelling, 43(7-8):945–956, 2006.

[31] D. A. French and C. W. Groetsch. Integral equation models for the inverse problem of biological ion channel distributions. Journal of Physics: Conference Series, 73:012006, 2007.

[32] D. Frenkel and B. Smit. Understanding Molecular Simulation - From Algorithms to Applications. Academic Press, San Diego, 1996.

[33] H. Gajewski. On existence, uniqueness and asymptotic behavior of solutions of the basic equations for carrier transport in semiconductors. Z. Angew. Math. Mech., 65(2):101–108, 1985.

[34] C. W. Gardiner. Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. Springer-Verlag, Berlin, 2nd edition, 1997.

[35] W. Gerstner and W. Kistler. Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press, Cambridge, 2002.

[36] M. B. Giles and N. A. Pierce. An introduction to the adjoint approach to design. Flow, Turbulence and Combustion, 65:393–415, 2000.

[37] D. Gillespie. A Singular Perturbation Analysis of the Poisson-Nernst-Planck System, Applications to Ionic Channels. PhD thesis, Rush University, Chicago, 1999.

[38] D. Gillespie. Energetics of divalent selectivity in a calcium channel: The ryanodine receptor case study. Biophysical J., 94:1169–1184, 2008.

[39] D. Gillespie and R. S. Eisenberg. Physical descriptions of experimental selectivity measurements in ion channels. European Biophysics Journal, 31:454–466, 2002.

[40] D. Gillespie, W. Nonner, and R. S. Eisenberg. Coupling Poisson-Nernst-Planck and density functional theory to calculate ion flux. J. Phys.: Condens. Matter, 14:12129–12145, 2002.

[41] D. Gillespie, W. Nonner, and R. S. Eisenberg. Density functional theory of charged, hard-sphere fluids. Phys. Rev. E, 68(3):031503, Sep 2003.

[42] D. Gillespie, L. Xu, Y. Wang, and G. Meissner. (De)constructing the ryanodine receptor: Modeling ion permeation and selectivity of the calcium release channel. J. Phys. Chem. B, 109:15598–15610, 2005.

[43] D. T. Gillespie. The multivariate Langevin and Fokker-Planck equations. Am. J. Phys., 64:1246–1257, 1996.

[44] C. W. Groetsch. Inverse Problems in the Mathematical Sciences. Vieweg, Braunschweig, 1993.


[45] E. Harder, A. D. MacKerell Jr., and B. Roux. Many-body polarization effects and the membrane dipole potential. Journal of the American Chemical Society, 131(8):2760–2761, 2009.

[46] B. Hille. Ionic channels in nerve membranes. Progress in Biophysics and Molecular Biology, 21:1–32, 1970.

[47] B. Hille. Ion Channels of Excitable Membranes. Sinauer Associates, Inc., Sunderland, Massachusetts, USA, 3rd edition, 2001.

[48] P. Hänggi, P. Talkner, and M. Borkovec. Reaction-rate theory: fifty years after Kramers. Rev. Mod. Phys., 62(2):251–341, Apr 1990.

[49] A. H. Hodgkin and A. F. Huxley. The components of membrane conductance in the giant axon of Loligo. J. Physiol., 116(4):473–496, 1952.

[50] A. H. Hodgkin and A. F. Huxley. Currents carried by sodium and potassium ions through the membrane of the giant axon of Loligo. J. Physiol., 116(4):449–472, 1952.

[51] A. H. Hodgkin and A. F. Huxley. The dual effect of membrane potential on sodium conductance in the giant axon of Loligo. J. Physiol., 116(4):497–506, 1952.

[52] A. H. Hodgkin and A. F. Huxley. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol., 117:500–544, 1952.

[53] A. H. Hodgkin, A. F. Huxley, and B. Katz. Measurement of current-voltage relations in the membrane of the giant axon of Loligo. J. Physiol., 116(4):424–448, 1952.

[54] P. Hohenberg and W. Kohn. Inhomogeneous electron gas. Phys. Rev., 136(3B):B864–B871, 1964.

[55] M. H. Holmes. Introduction to Perturbation Methods. Springer Verlag, New York, 1995.

[56] M. H. Jacobs. Diffusion Processes. Springer, New York, 1967.

[57] N. G. van Kampen. Stochastic processes in physics and chemistry. North-Holland Physics Publishing, Amsterdam, The Netherlands, 1981.

[58] C. Koch. Biophysics of Computation: Information Processing in Single Neurons. Oxford University Press, New York, 1998.

[59] W. Kohn. Nobel lecture: Electronic structure of matter - wave functions and density functionals. Rev. Mod. Phys., 71(5):1253–1266, 1999.

[60] W. Kohn and L. J. Sham. Inhomogeneous electron gas. Phys. Rev., 140(4A):A1133–A1138, 1965.

[61] I. D. Kosinska, I. Goychuk, M. Kostur, G. Schmid, and P. Hänggi. Rectification in synthetic conical nanopores: A one-dimensional Poisson-Nernst-Planck model. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), 77(3):031131, 2008.


[62] B. Lackner. Activation and modulation of Transient Potential Ankyrin 1 (TRPA1) channels by allyl isothiocyanate and Ca2+. Diploma thesis, Johannes Kepler University Linz, Austria, 2008.

[63] G. Lamoureux, E. Harder, I. V. Vorobyov, B. Roux, and A. D. MacKerell Jr. A polarizable model of water for molecular dynamics simulations of biomolecules. Chemical Physics Letters, 418(1-3):245–249, 2006.

[64] G. Lamoureux and B. Roux. Modeling induced polarization with classical Drude oscillators: Theory and molecular dynamics simulation algorithm. The Journal of Chemical Physics, 119(6):3025–3039, 2003.

[65] A. R. Leach. Molecular Modelling - Principles and Applications. Pearson Education Limited, England, second edition, 2001.

[66] C. H. Liu, T. Wang, M. Postma, A. G. Obukhov, C. Montell, and R. C. Hardie. In vivo identification and manipulation of the Ca2+ selectivity filter in the Drosophila Transient Receptor Potential channel.

[67] W. Liu. One-dimensional steady-state Poisson-Nernst-Planck systems for ion channels with multiple ion species. Journal of Differential Equations, 246(1):428–451, 2009.

[68] W. Liu and B. Wang. Poisson-Nernst-Planck systems for narrow tubular-like membrane channels. Submitted for publication.

[69] A. K. Louis. Inverse und schlecht gestellte Probleme. Teubner, Stuttgart, 1989.

[70] R. MacKinnon. Potassium channels and the atomic basis of selective ion conduction. Nobel lecture, 2003.

[71] P. A. Markowich. The Stationary Semiconductor Device Equations. Springer-Verlag Wien - New York, 1986.

[72] P. A. Markowich, C. A. Ringhofer, and C. Schmeiser. Semiconductor Equations. Springer-Verlag Wien, 1990.

[73] N. D. Mermin. Thermal properties of the inhomogeneous electron gas. Phys. Rev., 137(5A):A1441–A1443, 1965.

[74] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of state calculations by fast computing machines. J. Chem. Phys., 21:1087–1092, 1953.

[75] G. Moy, B. Corry, S. Kuyucak, and S.-H. Chung. Tests of continuum theories as models of ion channels. I. Poisson-Boltzmann theory versus Brownian Dynamics. Biophys. J., 78(5):2349–2363, 2000.

[76] E. Neher and B. Sakmann. Single-channel currents recorded from membrane of denervated frog muscle fibres. Nature, 260:799–802, 1976.


[77] A. Nekouzadeh and Y. Rudy. Statistical properties of ion channel records. Part I: Relationship to the macroscopic current. Mathematical Biosciences, 210:291–314, 2007.

[78] A. Nekouzadeh and Y. Rudy. Statistical properties of ion channel records. Part II: Estimation from the macroscopic current. Mathematical Biosciences, 210:315–334, 2007.

[79] W. Nernst. Zur Kinetik der in Lösung befindlichen Körper. Z. Physik. Chem., 2:613, 1888.

[80] W. Nernst. Die elektromotorische Wirksamkeit der Ionen. Z. Physik. Chem., 4:129–181, 1889.

[81] W. Nernst. Zur Theorie der elektrischen Reizung. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse, pages 104–108, 1899.

[82] W. Nonner, L. Catacuzzeno, and R. S. Eisenberg. Binding and selectivity in L-type calcium channels: A mean spherical approximation. Biophysical J., 79:1976–1992, 2000.

[83] W. Nonner, D. P. Chen, and R. S. Eisenberg. Progress and prospects in permeation. J. Gen. Physiol., 113:773–782, 1999.

[84] W. Nonner and R. S. Eisenberg. Ion permeation and glutamate residues linked by Poisson-Nernst-Planck theory in L-type calcium channels. Biophysical J., 75:1287–1305, 1998.

[85] W. Nonner, A. Peyser, D. Gillespie, and R. S. Eisenberg. Relating microscopic charge movement to macroscopic currents: The Ramo-Shockley theorem applied to ion channels. Biophysical Journal, 87:3716–3722, 2004.

[86] J.-H. Park and J. W. Jerome. Qualitative properties of steady-state Poisson-Nernst-Planck systems: mathematical study. SIAM J. Appl. Math., 57(3):609–630, 1997.

[87] C. D. Pivetti, M. R. Yen, S. Miller, W. Busch, Y. H. Tseng, I. R. Booth, and M. H. Saier Jr. Two families of mechanosensitive channel proteins. Microbiol. and Molecular Biol. Rev., 67(1):66–85, 2003.

[88] M. Planck. Über die Erregung von Elektrizität und Wärme in Elektrolyten. Ann. Phys. und Chem., 39:161, 1890.

[89] S. Ramo. Currents induced by electron motion. Proc. IRE, 27:584–585, 1939.

[90] D. Reguera and J. M. Rubí. Kinetic equations for diffusion in the presence of entropic barriers. Phys. Rev. E, 64(6):061106, Nov 2001.

[91] D. Reguera, G. Schmid, P. S. Burada, J. M. Rubí, P. Reimann, and P. Hänggi. Entropic transport: Kinetics, scaling, and control mechanisms. Physical Review Letters, 96(13):130603, 2006.


[92] S. Ringer. Concerning the influence exerted by each of the constituents of the blood on the contraction of the ventricle. J. Physiol., 3(5-6):380–393, 1882.

[93] S. Ringer. A further contribution regarding the influence of the different constituents of the blood on the contraction of the heart. J. Physiol., 4(1):29–42.3, 1883.

[94] Y. Rosenfeld. Free-energy model for the inhomogeneous hard-sphere fluid mixture and density-functional theory of freezing. Phys. Rev. Lett., 63(9):980–983, 1989.

[95] Y. Rosenfeld, M. Schmidt, H. Löwen, and P. Tarazona. Fundamental-measure free-energy density functional theory for hard spheres: Dimensional crossover and freezing. Phys. Rev. E, 55(4):4245–4263, 1997.

[96] R. Roth. Introduction to density functional theory of classical systems: Theory and applications. Lecture notes, 2006.

[97] R. Roth, R. Evans, A. Lang, and G. Kahl. Fundamental measure theory for hard-sphere mixtures revisited: the white bear version. J. Phys.: Condens. Matter, 14:12063–12078, 2002.

[98] R. Roth, D. Gillespie, W. Nonner, and R. S. Eisenberg. Bubbles, gating, and anesthetics in ion channels. Biophys. J., 94(11):4282–4298, 2008.

[99] B. Roux. Theoretical and computational studies of ion channels. Curr. Op. Struc. Biol., 12:182–189, 2002.

[100] I. Rubinstein. Electro-Diffusion of Ions. Society for Industrial and Applied Mathematics, Philadelphia, 1990.

[101] B. Rudy and L. E. Iverson, editors. Ion Channels - Methods in Enzymology. Academic Press, New York, 1992.

[102] B. Sakmann and E. Neher, editors. Single-Channel Recording. Plenum Press, New York, 1983.

[103] O. Scherzer. Convergence criteria of iterative methods based on Landweber iteration for solving nonlinear problems. Journal of Mathematical Analysis and Applications, 194(3):911–933, 1995.

[104] Z. Schuss, B. Nadler, and R. S. Eisenberg. Derivation of Poisson and Nernst-Planck equations in a bath and channel from a molecular model. Phys. Rev. E, 64(3):036116, Aug 2001.

[105] S. Selberherr. Analysis and Simulation of Semiconductor Devices. Springer-Verlag Wien - New York, 1984.

[106] F. Sesti, S. Rajan, R. Gonzalez-Colaso, N. Nikolaeva, and S. A. Goldstein. Hyperpolarization moves S4 sensors inward to open MVP, a methanococcal voltage-gated potassium channel. Nature Neuroscience, 6(4):353–361, 2003.

[107] W. Shockley. Currents to conductors induced by a moving point charge. J. Appl. Phys., 9:635–636, 1938.


[108] D. Sigg and F. Bezanilla. A physical model of potassium channel activation: From energy landscape to gating kinetics. Biophysical Journal, 84:3703–3716, 2003.

[109] D. Sigg, H. Qian, and F. Bezanilla. Kramers' diffusion theory applied to gating kinetics of voltage-dependent ion channels. Biophysical Journal, 76:782–803, 1999.

[110] A. Singer, D. Gillespie, J. Norbury, and R. S. Eisenberg. Singular perturbation analysis of the steady-state Poisson-Nernst-Planck system: Applications to ion channels. Euro. J. of Appl. Math., 19:541–560, 2009.

[111] A. Singer and J. Norbury. A Poisson-Nernst-Planck model for biological ion channels - an asymptotic analysis in a three-dimensional narrow funnel. SIAM Journal on Applied Mathematics, 70(3):949–968, 2009.

[112] W. D. Stein, editor. Current Topics in Membranes and Transport, volume 21. Academic Press, Orlando, Florida, 1984.

[113] R. E. Taylor and F. Bezanilla. Sodium and gating current time shifts resulting from changes in initial conditions. J. Gen. Physiol., 81(6):773–784, 1983.

[114] I. S. Tolokh, S. Goldman, and C. G. Gray. Unified modeling of conductance kinetics for low- and high-conductance potassium channels. Phys. Rev. E, 74:011902, 2006.

[115] T. A. van der Straaten, J. Tang, and R. S. Eisenberg. Three-dimensional continuum simulations of ion transport through biological ion channels: Effect of charge distribution in the constriction region of porin. J. Comp. Electronics, 1(3):335–340, 2002.

[116] W. V. van Roosbroeck. Theory of flow of electrons and holes in germanium and other semiconductors. Bell Systems Tech. J., 29:560–607, 1950.

[117] C. A. Vandenberg and F. Bezanilla. A sodium channel gating model based on single channel, macroscopic ionic, and gating currents in the squid giant axon. Biophysical Journal, 60(6):1511–1533, 1991.

[118] Y. Y. Wang, R. B. Chang, H. N. Waters, D. D. McKemy, and E. R. Liman. The nociceptor ion channel TRPA1 is potentiated and inactivated by permeating calcium ions. Journal of Biological Chemistry, 283(47):32691–32703, 2008.

[119] M.-T. Wolfram. Forward and Inverse Solvers for Electro-Diffusion Systems. PhD thesis, Johannes Kepler University Linz, Austria, 2008.

[120] W. N. Zagotta, T. Hoshi, and R. W. Aldrich. Shaker potassium channel gating III: Evaluation of kinetic models for activation. J. Gen. Physiol., 103:321–362, 1994.

[121] Y. Zhou, J. H. Morais-Cabral, A. Kaufman, and R. MacKinnon. Chemistry of ion coordination and hydration revealed by a K+ channel-Fab complex at 2.0 Å resolution. Nature, 414(6859):43, 2001.

[122] R. Zwanzig. Diffusion past an entropy barrier. J. Phys. Chem., 96(10):3926–3930, 1992.

Statutory Declaration

I, Kattrin Arning, declare under oath that I have written this dissertation independently and without outside help, that I have not used any sources or aids other than those indicated, and that I have marked as such all passages taken verbatim or in substance from other sources.

Linz, October 2009

Kattrin Arning

Curriculum Vitae

Personal Data

Name: Kattrin Arning
Date of birth: 08.04.1982
Place of birth: Bremen, Germany
Nationality: German
Family status: married

Education

1988-1998: Grundschule Arsten (elementary school), Bilinguales Gymnasium Habenhausen (grammar school), Bremen
1998-2001: bilinguale Oberstufe (high school), Bremen
2001-2006: Technomathematik with minor subject physics at the University of Bremen, Germany
since 2006: PhD studies in the DK "Molecular Bioanalytics" at the University of Linz, Austria

Career

since 2006: research assistant at the Radon Institute for Computational and Applied Mathematics (RICAM), Austrian Academy of Sciences
