
IEKP-KA/2012-18

Precision Tracking Studies at Belle II

Philipp Oehler

Diploma thesis

at the Department of Physics of the Karlsruhe Institute of Technology (KIT)

Referee: Prof. Dr. M. Feindt, Institut für Experimentelle Kernphysik

Co-referee: Prof. Dr. Th. Müller, Institut für Experimentelle Kernphysik

October 31, 2012


As we look out into the universe and identify the many accidents of physics and astronomy that have worked together to our benefit, it almost seems as if the universe must in some sense have known that we were coming.

Freeman Dyson


German Summary

Despite ground-breaking recent discoveries in particle physics, many questions remain open. Understanding CP violation, one of the Sakharov criteria, is necessary for a deeper understanding of the universe. The discovery of CP violation in B mesons at the Belle experiment was an important step in this direction. The current upgrade to Belle II, with a luminosity forty times higher than that of its predecessor, offers great scope for new discoveries, but at the same time places new and demanding requirements on the experimental methods. Among other things, a background twenty times higher is expected at Belle II. This calls for highly efficient data processing methods, in particular in track finding and track fitting. In this diploma thesis, the track fitting software was optimised and developed further.

Two goals were pursued. First, the error propagation software Geant4e, itself part of the renowned detector simulation Geant4, was integrated into GenFit, the track fitting software used at Belle II. The aim was both to obtain a more precise tool (as was assumed at the time) for extrapolating tracks in the future and to ensure compatibility with the existing data format for detector geometries.

The second goal was the speed optimisation of GenFit itself. In cooperation with the GenFit developers, starting points were identified from which track fitting can be accelerated. The approach chosen in this thesis was to replace the algebra classes in the source code of the Kalman filter.

With an estimated number of tracks to be processed of the order of $10^{13}$ over the entire running time of Belle II, the importance of processing them optimally and quickly becomes evident.

The Geant4e integration was tackled first. For this, GenFit was extended by a new kind of so-called track representation class. Track representations in GenFit serve to store the parameters of a track and to extrapolate them to arbitrary detector planes or points in space. These extrapolation methods are needed above all in the Kalman fitter. The previously existing implementations used both purpose-built methods, such as the Runge-Kutta-based extrapolation method of RKTrackRep, and methods of other frameworks, such as Geane in the class GeaneTrackRep2. The close structural similarity of GeaneTrackRep2 was used as a guideline for implementing the extrapolation method of the new class G4eTrackRep, since Geane, as the predecessor of Geant4e, is operated in a very similar way. For the remaining class methods, the structure of RKTrackRep served as the guideline, since it is already used at Belle II and is well adapted to its needs.

For the development of this new G4eTrackRep, a dedicated simulation environment was designed with Geant4, well suited to the tests the new class required. It simulated a simple detector setup whose signals could be processed with GenFit. Several difficulties arose during the implementation of the new class. First, the highly complex class structure of Geant4 made it difficult to run it in parallel with Geant4e in the same program. Second, large errors occurred in the propagation of the track's covariance matrix, causing the fit to fail every time. A serious additional obstacle was that Geant4e is only very sparsely documented, which made it nearly impossible to obtain information on the correct implementation. As these problems could not be solved at first, the project was put on hold for the time being.

In early summer 2012, however, a member of the Belle II collaboration published another software module based on Geant4e. The details it contained were very important for correct execution and were adopted in G4eTrackRep. After initial successes, a working version of the new class could unfortunately still not be completed. One reason was the sparse documentation, which to the end left essential questions about implementation details open. The other is that Geant4e appears to contain severe internal errors, which made a working implementation impossible under the given circumstances. As emerged during development in personal contact with the author of Geant4e, Geant4e has hardly been used so far. Even after intensive research, hardly any users of this software, and no experiments that had used it, could be found. The few users that were found consistently complained about faulty behaviour.

This unreliability of Geant4e is very surprising, since the Geant4 package is widely used even outside particle physics; in some areas of medical technology and radiology it can even be regarded as an industry standard.

Besides the implementation of G4eTrackRep, the speed optimisation of GenFit was then addressed. First, the time consumed in processing a track was examined in detail using so-called profiling programs. In particular, the method GFKalman::ProcessTrack() was analysed, which is responsible for processing the track and determining its parameters. The analysis showed that a large share of the processor time is needed for matrix operations on the one hand and for navigation through the detector geometry on the other. As the concepts for more efficient navigation could not have been worked out satisfactorily in the remaining time frame, the optimisation of the matrix operations was tackled first.

This task consisted of two separate problems. First, the numerous matrix calculations in the extrapolation method of RKTrackRep had to be optimised, since this method is used intensively by the ProcessTrack method. Second, the ProcessTrack method itself had of course to be optimised. This was achieved in cooperation with the GenFit developers: they worked on RKTrackRep, so that the ProcessTrack method was investigated in this diploma thesis.

Replacing the matrix class responsible for these calculations turned out to be a suitable approach. For this, a test environment for matrix classes was implemented first. This software, based on so-called template classes, examines and compares arbitrary matrix classes from different algebra packages. The static matrices of the Eigen library emerged from this test as the clear favourite, offering a speed advantage in multiplications of up to a factor of forty over the matrices used in GenFit.

The final implementation turned out to be non-trivial. First, the matrix class proved incompatible with the ROOT C interpreter used in GenFit. Second, the interface of the previous ROOT matrices had to be preserved in many places. A way out was to share the same data array, which made it possible to use the extremely fast operations of the Eigen library and the ROOT interface at the same time.

After further fine-tuning, a speed gain of about 12% over the non-optimised version was obtained. This lies within the expected range. Including the optimised extrapolation of the track representation, the speed gain even reaches almost 60% over the non-optimised version. This is a significant advance in the speed optimisation of track fitting.

Besides the optimisation of the matrix operations, optimising the geometry navigation suggested itself. Here, the SiliMap concept of the CDF experiment was examined with regard to an adaptation for Belle II. This was not implemented within this thesis; a study of the concept can nevertheless be found in the last chapter.


Contents

German Summary

1 Introduction

2 Theoretical Overview
2.1 The Standard Model of Particle Physics
2.2 Interaction of Particles with Matter
2.3 Particle Colliders

3 The Belle II Experiment
3.1 Overview
3.2 Tracking Detectors
3.3 Particle Identification
3.4 Calorimeters

4 Computing at Belle II
4.1 The basf2 Framework
4.2 Simulation with Geant4
4.3 Geant4e Error Propagation
4.4 GenFit - a Generic Tracking Framework

5 The Geant4e Track Representation
5.1 General Approach and Testing Environments
5.1.1 RKTrackRep and GeaneTrackRep2
5.1.2 Structure of G4eTrackRep::Extrap()
5.1.3 States and Matrix Conversions
5.1.4 Run Management
5.1.5 Malfunctions in Matrix Propagation and Track Fitting

6 Performance Optimisations on GenFit
6.1 Profiling
6.1.1 Profiling using inbuilt basf2 module
6.1.2 Profiling using Valgrind
6.2 Alternatives to the ROOT matrices
6.2.1 The Eigen Matrices
6.3 Performance Optimisations at the GFKalman Class
6.3.1 Team Play with ROOT Matrices
6.3.2 Adjustments
6.3.3 Results

7 Ideas for Further Performance Optimisations
7.1 Simplified Geometry Models
7.2 SiliMap at CDF
7.3 Similar Approaches at Belle II

8 Conclusion

A Appendix
A.1 Performance of GFFastKalman

Acknowledgements

Chapter 1

Introduction

The contemporary world view of physics yields a broad understanding of the world surrounding us, on the macroscopic as well as on the microscopic scale. Yet physics has still left several urgent questions open.

One of these is a better understanding of CP violation, which is essential for deeper knowledge about the creation of the universe, as it is one of the Sakharov criteria [1]. After the Belle experiment provided ground-breaking knowledge about CP violation in B meson systems [2], its current upgrade to Belle II will open up new possibilities for exploring physics beyond the Standard Model. The massive amount of raw data gained from this experiment undergoes complex processing, whose precision and efficiency are essential for a correct interpretation of the reactions taking place in the experiment. This places great demands on the computing.

Especially at Belle II, where the luminosity exceeds that of its forerunner by a factor of forty, precise detection and processing are indispensable. With this high luminosity, and thus reaction rate, the background radiation will also be about twenty times higher than at Belle. This complicates the process of combining detector hits into a track (track finding) and reconstructing the particle's properties from it (track fitting). Upgrading the track fitting and adapting it to the new challenges is crucial for a working experiment. As one part of this, this diploma thesis is concerned with improving the software needed for track fitting at Belle II.

To get an idea of how important efficient track fitting is, one should keep in mind that over the years of running Belle II the number of track fits performed is of the order of $10^{13}$. The number of track extrapolations is even higher, of the order of $3 \cdot 10^{14}$, as approx. 30 extrapolations take place per fit. The main attention in this thesis has been devoted to both the fitting process and the track extrapolation.

For this, several issues have been dealt with. First, an attempt was made to integrate the error propagation framework of the simulation toolkit Geant4¹ into the present tracking framework GenFit², in order to provide a more precise and better adapted means of track extrapolation. Secondly, as performance plays a major role in Belle II's computing, several approaches to optimising the speed of GenFit were examined and worked out.

First, a quick overview of contemporary particle physics and its experimental methods is given in Chapter 2. After this, the most important features of the Belle II detector and the KEKB accelerator are presented in Chapter 3. As this thesis focuses on computing issues, Belle II's computing infrastructure and software framework are presented in Chapter 4.

The work regarding the integration of Geant4's error propagation into GenFit is described in Chapter 5, while the optimisation issues are handled in Chapter 6. Finally, Chapter 7 points out some further possibilities for performance optimisation.

¹ Geant4 is a framework to simulate the passage of particles through matter. It is widely used in particle physics. A detailed description can be found in section 4.2.

² GenFit is a highly object-oriented and very flexible framework for track fitting. See section 4.4 for details.

Chapter 2

Theoretical Overview

By convention sweet is sweet, bitter is bitter, hot is hot, cold is cold, color is color; but in truth there are only atoms and the void.

Democritus

Centuries ago, when the classical Greek philosophers began to think about the deeper nature of the world, one basic idea came up, first mentioned by Leucippus: all matter consists of tiny, indivisible particles, and whatever one tries to break up, no matter how often, one eventually ends up at these indivisible particles, called atoms. Although it took centuries, this theory turned out to be true, even if the particles once considered indivisible, which are still called 'atoms' today, proved to have a substructure, too.

The nature of the 'atoms' we know today, the elementary particles, is described by the Standard Model of Particle Physics. This chapter gives a brief overview of the achievements of physics and its contemporary view of the subatomic world.

Yet we still do not know whether these elementary particles are really indivisible 'atoms'. It is possible that there is a substructure which can be revealed at higher energies. But until now, the Standard Model of Particle Physics gives a good description of elementary particle physics.

2.1 The Standard Model of Particle Physics

The Standard Model of Particle Physics is the physical theory describing theinteraction between the known elementary particles [3]. It takes into accountthree ways of interaction (forces):


• electromagnetic interaction

• strong interaction

• weak interaction

The fourth known interaction, gravity, is disregarded by the Standard Model, because up to now no quantum-mechanical theory of gravity has been proven valid. Nevertheless, this does not affect the Standard Model: the three relevant forces are stronger than gravity by many orders of magnitude, which makes gravity irrelevant, at least at the energies considered here.

These forces are themselves transmitted by particles, since mass and energy are equivalent. So not only matter itself, but also the radiation between it can be described by particles.

These particles are divided into two groups, bosons and fermions. Fermions have half-integer spin, bosons have integer spin¹. The elementary bosons are the particles responsible for the interaction between matter in first order, and fermions are what matter is made of.

An overview of all particles of the Standard Model and their properties is shown in figure 2.1.

Bosons

As mentioned above, in first order bosons are the carriers of force. In the Standard Model, these elementary bosons are called gauge bosons, for theoretical reasons. Each can be assigned to one of the interactions introduced above.

• Photons are the carriers of the electromagnetic force.

• Eight gluons mediate the strong force. Like quarks, they carry color charge. Because of this, the gluons interact strongly among themselves, which means that the strong force interacts with its own radiation.

• The two W± bosons and the Z boson carry the weak force.

Besides these elementary gauge bosons, there are other bosons, which are made of interacting fermions (quarks). They can also carry force. One example is the pions, which contribute to the nuclear force in the corresponding Yukawa theory.

Another important elementary boson is the recently discovered Higgs boson, which couples to the elementary particles that have mass.

¹ The spin is a quantum mechanical property which can be interpreted as an intrinsic angular momentum, although the particle is point-like. It has no analogue in our macroscopic world.

Figure 2.1: Particles of the Standard Model; each tile gives the particle's name, mass, charge and spin. To each charged particle belongs an anti-particle with reversed quantum numbers. The quarks are the particles on the green tiles, the leptons on the blue and the bosons on the yellow tiles (the Higgs boson is left out). From left to right, the first three columns reflect the generations (values from [4]).

Fermions

Fermions can be arranged into three 'generations' according to their mass range. Most of the matter surrounding us every day is made of first-generation fermions. The elementary fermions can also be arranged by the forces they are affected by. The fermions subject to the strong force are called quarks, the others are called leptons.

Leptons

Leptons are fermions which are not affected by the strong force. There are the electron, the muon and the tau, in ascending order of generation. Each of these is associated with a neutrino, an almost massless particle without electric charge.

Leptons carry a lepton flavour, which is always conserved during interactions².

Quarks

Besides gluons, quarks are the only particles which can interact via the strong force, so they carry color charge. There are two quarks per generation, each with its own conserved quark flavour. In the first and lightest generation, there are the up and the down quark, in the second the charm and the strange quark, and in the third and heaviest the bottom and the top quark [5].

² But not necessarily during the propagation of neutrinos, as seen in neutrino oscillations.

Due to the complex behaviour of the strong force (gluons themselves are 'colored' and thus interact with each other), quarks only occur in colorless bound states. Until now, only bound states of two or three quarks have been found³. Bound states of a quark and an antiquark are called mesons, those of three quarks are called baryons.

All mesons are unstable. They are bosons, because the half-integer spins of the two quarks always add up to an integer. Thus they can serve as carriers of interactions in first order. A meson can be excited to higher energy levels by gaining orbital angular momentum or through higher spin combinations.

In the Belle experiments, the spectroscopy of B mesons is one of the focuses of research. Each consists of an anti-b quark in combination with a u quark ($B^+$), a d quark ($B^0$), an s quark ($B^0_s$) or a c quark ($B^+_c$). The neutral ones can spontaneously transform into their anti-particles and back. This behaviour is closely connected to the CP violation described later in section 2.1.

Baryons are particles consisting of three quarks in arbitrary combinations. Their spins are half-integer. As both the proton and the neutron are baryons (uud and udd), most of the matter surrounding us every day is baryonic. The baryons can be arranged in schemes depending on their spin: the spin-1/2 baryons are arranged in the baryon octet, the spin-3/2 ones in the baryon decuplet.

CKM-Mechanism

As quark flavour is conserved by strong interactions, a quark cannot decay strongly into another quark. But as flavour is conserved only in strong interactions, quarks can decay into each other weakly. However, the weak eigenstates of the quarks are not identical to their mass eigenstates. This leads to differences in how likely a quark is to decay into one of a different flavour. This was first discovered in s-quark meson systems⁴. The weak eigenstates are related to the mass eigenstates by a unitary operator, which can be written as a 3×3 matrix, the CKM matrix. In a weak decay this must be taken into account, and it has a large influence on the branching ratios.

³ There are several hints for 'tetraquarks'.

⁴ Their behaviour was found to be 'strange', so this quark was called 'strange'.

The CKM matrix can be parametrized as a rotation matrix with three Euler angles and one additional complex phase. This additional complex phase is responsible for the CP violation, which is described later in this section.
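The role of the complex phase can be illustrated with a short numerical sketch. The Python snippet below builds a 3×3 matrix in the commonly used standard parametrization (three rotation angles and one phase), checks that it is unitary, and evaluates the Jarlskog invariant, a measure of CP violation that vanishes when the phase is zero. The function names and the angle values in the usage note are illustrative assumptions and are not taken from this thesis.

```python
import cmath
import math


def ckm_matrix(theta12, theta23, theta13, delta):
    """Return the 3x3 matrix V = R23 * U13(delta) * R12 as nested lists.

    Rows are (u, c, t), columns are (d, s, b). Angles are in radians.
    """
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    e = cmath.exp(1j * delta)  # e^{+i delta}
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e,
         c12 * c23 - s12 * s23 * s13 * e,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e,
         -c12 * s23 - s12 * c23 * s13 * e,
         c23 * c13],
    ]


def unitarity_deviation(v):
    """Largest absolute entry of V V^dagger - 1 (zero for a unitary V)."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(v[i][k] * v[j][k].conjugate() for k in range(3))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(entry - target))
    return worst


def jarlskog(v):
    """Jarlskog invariant J = Im(V_us V_cb V_ub* V_cs*)."""
    product = v[0][1] * v[1][2] * v[0][2].conjugate() * v[1][1].conjugate()
    return product.imag
```

With illustrative angles such as `ckm_matrix(0.227, 0.042, 0.0037, 1.2)`, the unitarity deviation stays at the level of floating-point rounding, and `jarlskog` returns a non-zero value; setting the phase to zero makes it vanish, which is the sense in which the phase carries the CP violation.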

Conservation Laws

It is widely known that conservation laws play a big role in all branches of physics, and particle physics is no exception. The universally conserved quantities such as energy/mass, momentum and angular momentum are of course conserved in particle physics as well, and there are several others besides them. Only those which are important for the Belle experiments are discussed here in detail.

It is important to mention that in quantum mechanics the conservation of these quantities is only guaranteed when integrated over a sufficiently long time. Due to the time-energy uncertainty relation, particles can exceed their energy for short times, and thus turn into other, higher-energy particles for this time.

Parity and Parity Violation

The conservation of parity means that a physical process is not affected by the chirality of its participants. That is, a parity-conserving reaction takes place in the same way if space is mirrored. It was long thought that parity conservation is a universal law, but it was found to be violated by weak interactions⁵. For example, the direction of the electron emitted in the decay of a muon is influenced by the spin of the muon. After the discovery of parity violation in weak interactions, it was still thought that CP, the combined application of the parity and the charge conjugation operator, is conserved in every interaction.

CP Violation

A few years later, this had to be abandoned, too, after the discovery of CP violation in neutral kaon systems. Due to weak mixing, $K^0$ and $\bar{K}^0$ are not the physical kaon states, but their mixtures $K^0_1$ and $K^0_2$ are. These are CP eigenstates, which should only decay conserving CP, with a short-lived $K^0_1 = K^0_S$ and a long-lived $K^0_2 = K^0_L$. However, it was discovered by Cronin and Fitch [6] that these $K^0_S$ and $K^0_L$ are still convertible into each other, and are thus mixed, which violates CP. Years later, CP violation was also discovered in $B^0$ systems by the BaBar and the Belle experiment [2].

⁵ Only left-handed neutrinos and right-handed anti-neutrinos exist, or more precisely, are affected by the weak force.

CP violation contributes, as one of the Sakharov criteria [1], to the asymmetry between matter and antimatter in the universe. Yet the sources of CP violation discovered up to now are too small to fully explain this asymmetry. Analysing this phenomenon in B meson systems is one of the most important tasks for Belle II.

2.2 Interaction of Particles with Matter

As in particle physics basically all tools available to examine particles are made of ordinary materials, it is important to look at how elementary particles interact with matter. The most important effects are illustrated here.

Effects of Uncharged Particles

The focus of this section lies on photons and their effects within matter. There are three main effects:

• The Photo Effect describes the absorption of a low-energy photon by an atomic electron, which is ejected from the atom.

• The Compton Effect is the scattering of a mid-energy photon off an electron of the atomic shell. The scattered electron is released.

• Pair Production denotes the creation of an electron-positron pair from a high-energy photon. Due to the conservation of momentum, pair production cannot take place in vacuum.

These effects produce secondary electrons, positrons and photons, which are seeds for further interactions with the material. If the energy of the initial photon is high enough, this causes a cascade of particles. As the size of this particle shower is nearly linear in the energy of the initial photon, it offers a suitable method to measure a photon's energy.

Effects of Charged Particles

Charged particles propagating through matter can either excite atoms, by transferring some of their energy to the atoms' electrons, or ionize them, by removing electrons from the atoms. Both, of course, cause an energy loss of the charged particle. This energy loss, or stopping power, can be described as a function of the relativistic factor $\beta$ by the Bethe-Bloch formula

$$-\left\langle \frac{dE}{dx} \right\rangle = K z^2 \frac{Z}{A \cdot \beta^2} \left[ \frac{1}{2} \ln\left( \frac{2 m_e c^2 \beta^2 \gamma^2 T_{max}}{I^2} \right) - \beta^2 - \frac{\delta(\beta\gamma)}{2} \right]$$

where Z is the atomic number of the material, z the particle's charge, and I the excitation energy [7]. A is the atomic mass, $T_{max}$ the maximum possible energy transfer and K a constant factor. This energy loss within matter can be used, for example, to identify particles. A plot of the stopping power over the momentum for $\mu^+$ in copper can be seen in fig. 2.2, along with other effects.

Figure 2.2: Stopping power/energy loss for $\mu^+$ in copper [7].

It is important to mention the minimum at $\beta\gamma \approx 4$. Particles at this energy are called minimum ionizing particles. After the following flat region, the relativistic rise leads to another increase of stopping power for particles of higher energy. The $\delta$ term in the Bethe-Bloch formula describes the so-called density effect, which damps the relativistic rise.
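To make the shape of the curve tangible, the Bethe-Bloch formula can be evaluated numerically. The following Python sketch does this for a heavy charged particle with the density correction set to zero by default, so it corresponds to the curve without $\delta$. The numerical constants (K, the electron mass, and the copper values Z = 29, A = 63.546, I = 322 eV) and the kinematic expression for $T_{max}$ are standard values that are assumed here and are not quoted in this thesis; the function names are illustrative.

```python
import math

K = 0.307075       # MeV mol^-1 cm^2 (assumed standard value)
ME_C2 = 0.510999   # electron rest energy in MeV (assumed standard value)


def t_max(beta_gamma, mass_mev):
    """Maximum energy transfer to a free electron in one collision, in MeV."""
    gamma = math.sqrt(1.0 + beta_gamma ** 2)
    ratio = ME_C2 / mass_mev
    return 2.0 * ME_C2 * beta_gamma ** 2 / (1.0 + 2.0 * gamma * ratio + ratio ** 2)


def bethe_stopping_power(beta_gamma, mass_mev, z=1, Z=29, A=63.546,
                         I_ev=322.0, delta=0.0):
    """Mean stopping power -<dE/dx> in MeV cm^2/g.

    Defaults describe copper as absorber (assumed material constants).
    delta is the density correction; delta = 0 neglects the density effect.
    """
    beta2 = beta_gamma ** 2 / (1.0 + beta_gamma ** 2)
    excitation = I_ev * 1.0e-6  # convert eV to MeV
    log_arg = (2.0 * ME_C2 * beta_gamma ** 2 * t_max(beta_gamma, mass_mev)
               / excitation ** 2)
    bracket = 0.5 * math.log(log_arg) - beta2 - delta / 2.0
    return K * z ** 2 * (Z / A) / beta2 * bracket
```

Evaluating `bethe_stopping_power(bg, 105.658)` for a muon over a range of `bg` values shows the steep fall at low $\beta\gamma$, a shallow minimum of roughly 1.5 MeV cm²/g at a few units of $\beta\gamma$, and the logarithmic rise beyond it; a realistic density correction would have to be supplied through `delta`.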

For electrons, the Bethe-Bloch formula is not valid, due to their large bremsstrahlung losses and the special interactions with the identical shell electrons.

Besides ionization and excitation, charged particles are affected by bremsstrahlung. Bremsstrahlung occurs when charged particles are accelerated, which happens in matter due to interaction with the Coulomb field of the nuclei. The energy loss through bremsstrahlung is proportional to the particle's energy E

$$-\frac{dE}{dx} = \frac{E}{X_0}$$

and inversely proportional to the radiation length $X_0$. Integrating this formula yields

$$E = E_0 \exp(-x/X_0)$$

which is also a definition of the radiation length: it is the distance over which the particle's energy drops to 1/e of its initial value [8].
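The integration step can be spelled out as a short derivation. It assumes only that $X_0$ is constant along the path and that the particle starts with energy $E_0$ at $x = 0$:

```latex
\begin{aligned}
-\frac{dE}{dx} = \frac{E}{X_0}
\;&\Rightarrow\; \frac{dE}{E} = -\frac{dx}{X_0} \\
&\Rightarrow\; \int_{E_0}^{E(x)} \frac{dE'}{E'} = -\int_0^x \frac{dx'}{X_0} \\
&\Rightarrow\; \ln\frac{E(x)}{E_0} = -\frac{x}{X_0} \\
&\Rightarrow\; E(x) = E_0 \exp(-x/X_0)
\end{aligned}
```

Setting $x = X_0$ gives $E = E_0/e$, which is the 1/e statement of the radiation length.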

Multiple Scattering

As mentioned in the discussion of bremsstrahlung, particles propagating through matter interact with the Coulomb field of the nuclei and are thus scattered multiple times⁶. This not only makes them lose energy through bremsstrahlung; their direction of propagation is influenced as well. For small angles, this is well described by a Gaussian distribution [9].

Let $\theta_0 = \theta^{rms}_{plane}$ be the root mean square of the scattering angle in the scattering plane; then the width of the Gaussian is given by the Highland formula

$$\theta_0 = \frac{13.6\,\mathrm{MeV}}{\beta c p}\, z \sqrt{x/X_0}\, \left[ 1 + 0.038 \ln(x/X_0) \right]$$

with momentum p, velocity $\beta c$, particle charge z and the thickness of the traversed medium in radiation lengths $x/X_0$. With this, the angular distributions are given by

$$\frac{1}{2\pi\theta_0^2} \exp\left[ -\frac{\theta_{space}^2}{2\theta_0^2} \right] d\Omega$$

for space angles and

$$\frac{1}{\sqrt{2\pi}\,\theta_0} \exp\left[ -\frac{\theta_{plane}^2}{2\theta_0^2} \right] d\theta_{plane}$$

for the plane angles, where the space angle is $\theta_{space}^2 \approx \theta_{plane,x}^2 + \theta_{plane,y}^2$ with independent but identically distributed $\theta_{plane,x}$ and $\theta_{plane,y}$.

Other important quantities, as shown in fig. 2.3, are given by

$$\psi^{rms}_{plane} = \frac{1}{\sqrt{3}}\,\theta_0$$

$$y^{rms}_{plane} = \frac{1}{\sqrt{3}}\, x\, \theta_0$$

$$s^{rms}_{plane} = \frac{1}{4\sqrt{3}}\, x\, \theta_0$$

Figure 2.3: Quantities of the Highland formulas of multiple scattering [10]

⁶ Actually, for hadronic particles, strong interactions also contribute to multiple scattering.

However, it must be kept in mind that these formulas apply only to small-angle scattering [7].
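The Highland formula and the derived quantities are simple enough to evaluate directly. The Python sketch below does so; the function and parameter names are illustrative assumptions, the momentum is taken in MeV/c so that the 13.6 MeV constant can be used as written, and the logarithmic term means the result should only be trusted for thin to moderately thick layers.

```python
import math


def highland_theta0(p_mev, beta, z, x_over_x0):
    """RMS plane scattering angle theta_0 in radians (Highland formula).

    p_mev:      momentum in MeV/c
    beta:       velocity in units of c
    z:          charge of the particle in units of e
    x_over_x0:  traversed thickness in radiation lengths
    """
    return (13.6 / (beta * p_mev) * abs(z) * math.sqrt(x_over_x0)
            * (1.0 + 0.038 * math.log(x_over_x0)))


def derived_rms(theta0, x_cm):
    """RMS deflection angle psi, lateral offset y and sagitta s after a
    layer of thickness x_cm (y and s come out in the unit of x_cm)."""
    sqrt3 = math.sqrt(3.0)
    return {
        "psi": theta0 / sqrt3,
        "y": x_cm * theta0 / sqrt3,
        "s": x_cm * theta0 / (4.0 * sqrt3),
    }
```

For a singly charged particle with p = 1 GeV/c and $\beta \approx 1$ crossing 1% of a radiation length, `highland_theta0(1000.0, 1.0, 1, 0.01)` gives a scattering angle of about 1.1 mrad, which indicates the scale of the effect a track fit has to account for in each thin detector layer.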

2.3 Particle Colliders

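Accelerating particles to high energies comes with several effects. Beyond its design, a particle collider is characterized by specific parameters. One of them is the center-of-mass energy $\sqrt{s}$, which is relevant for head-on colliders like Belle II:

$$
\begin{aligned}
s &= (p_1 + p_2)^2 = (E_1 + E_2)^2 - (\vec{p}_1 + \vec{p}_2)^2 \\
&= E_1^2 + E_2^2 + 2E_1E_2 - |\vec{p}_1|^2 - |\vec{p}_2|^2 - 2|\vec{p}_1||\vec{p}_2| \cos\theta_{1,2} \\
&= m_1^2 + m_2^2 + 2E_1E_2 - 2\cos\theta_{1,2} \sqrt{E_1^2 - m_1^2} \sqrt{E_2^2 - m_2^2}
\end{aligned}
$$

with the 4-vectors $p_{1/2}$ and the angle $\theta_{1,2}$ between the two beam momenta. With the conditions at the interaction point, $m_i \ll E_i$ and, for a head-on collision, $\cos\theta_{1,2} \approx -1$, it is

$$\sqrt{s} = \sqrt{4 \cdot E_1E_2}$$

expressed in terms of the beam energies $E_{1/2}$ in the detector frame.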
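The approximation can be checked numerically against the full expression. In the Python sketch below, the angle is taken between the two beam momenta, so that a head-on collision corresponds to a cosine of -1. The beam energies of 7 GeV and 4 GeV used in the usage note are assumed nominal values for the electron and positron beams and are not quoted in this chapter; the function names are illustrative.

```python
import math


def mandelstam_s(e1, e2, m1, m2, cos_theta):
    """Invariant s for two colliding particles (energies and masses in GeV).

    cos_theta is the cosine of the angle between the two momenta;
    a head-on collision has cos_theta = -1.
    """
    p1 = math.sqrt(e1 ** 2 - m1 ** 2)
    p2 = math.sqrt(e2 ** 2 - m2 ** 2)
    return m1 ** 2 + m2 ** 2 + 2.0 * e1 * e2 - 2.0 * cos_theta * p1 * p2


def sqrt_s_approx(e1, e2):
    """Center-of-mass energy for head-on collisions with m << E."""
    return 2.0 * math.sqrt(e1 * e2)
```

With the electron mass of 0.000511 GeV, `math.sqrt(mandelstam_s(7.0, 4.0, 0.000511, 0.000511, -1.0))` and `sqrt_s_approx(7.0, 4.0)` agree to well below a keV, and both give about 10.58 GeV, which is the region of the Υ(4S) resonance.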


Important to the particle reactions is the cross section σ, which canbe seen as the area around a particle which has to be crossed by anotherparticle to cause an interaction; thus it can also be interpreted as a probabilitydistribution for interaction. Other important quantities are the luminosity Land the rate dN/dt, which are related in the following way:

dN

dt= L · σ

The luminosity is the number of particles, which pass in the beam per areaper time. In a collider with two colliding bunches, it can be expressed with

L = fn1n2

4πσxσy

where f is the collision frequency of the bunches, $n_i$ the number of particles per bunch and $\sigma_{x/y}$ the horizontal and vertical beam sizes (the widths of the transverse beam profile), where it is assumed that the beam profile of both bunches is the same. The total number of events can be obtained by integrating the rate:

$$N = \sigma \cdot \int L\,dt$$
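The three relations above translate directly into code. This is a minimal sketch with hypothetical function names; it assumes CGS-style units (cm, s) and, for the event count, a luminosity that stays constant over the integration time.

```cpp
#include <cassert>
#include <cmath>

// Luminosity of two colliding bunches with identical Gaussian profiles.
// f: collision frequency [1/s], n1, n2: particles per bunch,
// sigmaX, sigmaY: transverse beam sizes [cm]. Result in cm^-2 s^-1.
double luminosity(double f, double n1, double n2, double sigmaX, double sigmaY) {
    assert(sigmaX > 0.0 && sigmaY > 0.0);
    const double pi = std::acos(-1.0);
    return f * n1 * n2 / (4.0 * pi * sigmaX * sigmaY);
}

// Event rate dN/dt = L * sigma, with the cross section given in cm^2.
double eventRate(double lumi, double crossSectionCm2) {
    return lumi * crossSectionCm2;
}

// Number of events N = sigma * L * t for a constant luminosity over t seconds.
double eventCount(double lumi, double crossSectionCm2, double seconds) {
    return crossSectionCm2 * lumi * seconds;
}
```

As a usage example, a hypothetical process with a cross section of 1 nb (1e-33 cm²) at a luminosity of 8e35 cm⁻² s⁻¹ yields `eventRate(8e35, 1e-33)`, i.e. about 800 events per second. The formula also shows why squeezing the beam helps: halving `sigmaY` doubles the luminosity.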

One possibility to obtain a higher luminosity is to focus the beam at the interaction point, so that the same number of particles passes through a smaller area. At Belle II, the highly focused nano beam scheme is used, where the beam is squeezed in the vertical direction, leading to the previously mentioned factor of 40 in luminosity compared to Belle.

Chapter 3

The Belle II Experiment

"Keine Experimente!" ("No experiments!")
Konrad Adenauer

This chapter gives an overview of the Belle II experiment, especially of the components of its detector. Belle II is an upgrade of the first Belle experiment, which was shut down in 2010. At this point, Belle II is still under construction, with the start of operation planned for 2016. The final experimental setup may differ from what is specified here; however, these changes are expected to be small. Most of the information contained in this chapter was taken from the Belle II Technical Design Report [11].

3.1 Overview

Belle II is situated at the (Super-)KEKB collider in Tsukuba, Japan. SuperKEKB is an asymmetric e+e− collider with a design luminosity of $8 \cdot 10^{35}\,\mathrm{cm^{-2}\,s^{-1}}$. This significant rise compared to KEKB, nearly forty times the luminosity of its forerunner, will be accomplished by a smaller beam size and a doubling of the beam current. This will put greater strain on both the hardware and the data processing of the detector, due to the much higher event rate and the significantly increased background. The collision energy lies at the Υ(4S) resonance, so most of the particles produced are Υ(4S) mesons. These decay into B meson pairs with a branching ratio of 96%; the B mesons are the main subject of interest at Belle II.

The innermost detector component, enclosing the interaction point (IP), is the Pixel Detector (PXD), which is just about the size of a soft drink can (see fig. 3.1 for a cross-section). The next outer components are the Silicon Vertex Detector (SVD) and the Central Drift Chamber (CDC). These three innermost components serve as tracking devices, measuring the curvature (and thus momentum and charge) and the vertex of particles flying through. They lie within an almost homogeneous magnetic field of 1.5 T along the z axis¹.

Behind the tracking detectors, the components for particle identification (PID) are situated. They consist of the Time-of-Propagation Counter (TOP), which is seated around the barrel, covering the whole φ range, and of the Aerogel Ring-Imaging Cherenkov Detector (ARICH), located at the end caps.

Between the TOP and the solenoid sits the Electromagnetic Calorimeter (ECL), detecting photons and electrons; outermost, the barrel and endcap K-Long and Muon Detectors (KLM) detect these two longer-lived particles.

3.2 Tracking Detectors

Pixel Detector (PXD)

The PXD consists of two thin layers of pixel sensors, arranged in a barrel shape around the interaction region. Its high resolution provides the detailed information required for vertex reconstruction. This is necessary to determine the propagation length of the B mesons. In order to reduce multiple scattering, these layers are designed as only 75 µm thick arrays of DEPFETs (DEpleted P-channel Field Effect Transistors) with the readout electronics on their sides, outside of the relevant area. Due to the very low power consumption of the DEPFETs, only the exterior readout system requires cooling. Another big advantage of these DEPFET sensors is that they can be built to be very radiation-hard.

The inner layer has a radius of 14 mm and consists of eight ladders of sensors, fitting exactly around the beam pipe. The outer layer is designed similarly, at a radius of 22 mm, using twelve ladders. Each ladder is 15 mm wide and has a length of 90 mm (123 mm) in the inner (outer) layer. This length allows the PXD to cover a wide angular range, from 17° in the forward to 150° in the backward direction².

The DEPFET sensor cell consists of field effect transistors whose back sides are completely covered by a p+ doped contact, which depletes the silicon bulk (for its structure, see fig. 3.2). The field effect transistors serve as a first amplification of the signal. When radiation passes through the cell, electron-hole pairs are produced, and the field causes the holes to drift to the back contact, which absorbs them. The electrons are shifted towards the n-doped internal gate. This switches the FET, and a signal can be measured.

¹ Due to the cylindrical layout of the detector, a cylindrical coordinate system is used. Its z axis lies parallel to the beam pipe; the angular position around the barrel is described by the angle φ. Different systems might be used, but they all have the z axis in common. In Cartesian systems, the y axis points upwards and the x axis sideways.

² This asymmetry is necessary due to the forward boost of the center-of-mass frame. The electrons and positrons do not have the same momentum, which makes it possible to determine the propagation length of the B mesons.

Figure 3.1: Cross-section of the Belle II detector [12]. (Labelled components: PXD, SVD, CDC, TOP, Endcap PID, ECL, Solenoid, Barrel KLM, Endcap KLM; scale in m.)

Figure 3.2: Internal architecture of a DEPFET cell [11]. (Labelled parts: p+ source, FET gate, p+ drain, n+ clear, clear gate, amplifier, deep n-doping 'internal gate', deep p-well, depleted n-Si bulk, p+ back contact.)

Via the external gate, the DEPFET can be switched off and on. In the switched-on state, the amount of charge in the internal gate affects the conductivity of the FET. By measuring this conductivity, information about the energy deposited by the particle can be gained.

After a particle has passed through, the DEPFET has to be cleared by applying a high voltage to the clear gate. This removes the signal electrons from the internal gate. After this clearing, the DEPFET is ready for detection again.

The matrix-arranged sensors are read out row by row. For that, a gate voltage is applied to a row, and the drain currents are measured for each pixel. This readout procedure takes 20 µs; during this period, the sensor has a dead time of only 100 ns, i.e. 0.5%.


The chipset is designed to withstand up to 100 kGy of radiation without significant losses in quality, which should suffice for five years of operation at Belle II.

Silicon Vertex Detector (SVD)

As its name implies, the purpose of the SVD, along with the PXD, is to determine the vertex of particles flying through. It basically consists of four layers of silicon strip sensors. Most of the sensors in these layers are arranged as barrels around the beam pipe. The additional ones in the forward region are slanted towards the beam pipe. This wedge-like shape covers the full required angular range from 17° to 150°. This concept was also chosen because of its lower cost and its better $p_z$ resolution. The SVD ranges from a radius of 38 mm to 140 mm.

In contrast to the PXD, the design of these layers is not pixel-based: the sensors have silicon strips on both sides, one side p-doped, the other one n-doped. The strips on the two sides are rotated by 90° with respect to each other. Whenever a particle propagates through a layer, it creates electron-hole pairs, which are accelerated by the field of the p-n junction and cause a temporary change of voltage that can be measured. The effect of the magnetic field of 1.5 T must be taken into account, as it deflects the charge carriers through the Lorentz force. This lengthens the path the charge has to travel. To minimize this effect and the flight length, the layers are tilted along the z axis.

Depending on which strips are hit, the position of the hit can be determined uniquely. If multiple particles hit the sensor (which will be the most likely case), a number of 'ghost hits' are detected, because it cannot be determined which vertical strip hit belongs to which horizontal strip hit. So each combination will be detected as a hit. Nevertheless, these ghost hits can be ruled out using the information provided by the other layers. As long as ghost hits do not line up in tracks, they can be discarded in the track finding.
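The combinatorics behind the ghost hits can be made explicit with a few lines of code. The sketch below is a toy model with hypothetical names: strips are identified by integer numbers, and every combination of a fired strip on one side with a fired strip on the other side is reported as a hit candidate.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// All (u, v) strip combinations that a double-sided strip sensor reports
// when the given strips fire on its two sides.
std::vector<std::pair<int, int>> hitCandidates(const std::vector<int>& uStrips,
                                               const std::vector<int>& vStrips) {
    std::vector<std::pair<int, int>> candidates;
    candidates.reserve(uStrips.size() * vStrips.size());
    for (int u : uStrips)
        for (int v : vStrips)
            candidates.emplace_back(u, v);
    // Sanity check: one candidate per strip combination.
    assert(candidates.size() == uStrips.size() * vStrips.size());
    return candidates;
}

// Number of ghost hits if nTrue particles cross the sensor and no two of
// them share a strip on either side: n*n candidates, n of which are real.
std::size_t ghostCount(std::size_t nTrue) {
    return nTrue * nTrue - nTrue;
}
```

Two particles on distinct strips thus produce four candidates, two of which are ghosts; with three particles there are already six ghosts, which is why the information from the other layers is needed to resolve them.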

The SVD's location farther away from the IP and its faster readout make it less prone to background than the PXD.

Central Drift Chamber (CDC)

The CDC is built around the SVD. It basically consists of a chamber filled with gas (50% helium and 50% ethane) with more than 55,000 wires strung through it. Its main purpose is to determine the momentum of particles precisely. In addition, information about the particle type can be obtained from the energy loss within the gas.

The chamber ranges from a radius of 160 mm to a radius of 1130 mm. The wires are 30 µm in diameter. There are earthed 'sense wires' (anodes) and 'field wires' with voltage applied; they are arranged in such a way that each sense wire is surrounded by several field wires. Whenever a particle flies through the chamber, gas molecules along the track become ionized. Their electrons are accelerated towards the sense wire, ionizing more and more gas molecules and producing more electrons. This is called the 'avalanche effect'. When they arrive at the sense wire, a current proportional to the energy loss of the particle within the cell can be measured.

This avalanche effect only takes place close to the sense wires. Farther away, the energy gained from the field is lost in collisions with the gas, resulting in a nearly constant drift speed. The farther away from the wire the particle passed, the longer it takes for the electrons to reach the sense wire. This time is called the drift time and is a good measure of the minimal distance between the particle and the wire.
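How a drift time turns into a position constraint can be sketched as follows. The code assumes a strictly constant drift velocity, which is a simplification of the "nearly constant" drift speed described above, and treats the problem in the two-dimensional plane perpendicular to the wire; all names and the numbers in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Distance of closest approach between track and sense wire [mm] for a
// measured drift time [ns], assuming a constant drift velocity [mm/ns].
double driftDistance(double driftTimeNs, double driftVelocityMmPerNs) {
    assert(driftTimeNs >= 0.0 && driftVelocityMmPerNs > 0.0);
    return driftTimeNs * driftVelocityMmPerNs;
}

// Residual of a straight track with respect to the drift circle of a wire.
// The wire sits at (wireX, wireY); the track passes through (x0, y0) with
// direction angle phi. A track tangent to the drift circle has residual 0.
double driftResidual(double wireX, double wireY, double x0, double y0,
                     double phi, double driftRadius) {
    const double distance =
        std::abs((wireX - x0) * std::sin(phi) - (wireY - y0) * std::cos(phi));
    return distance - driftRadius;
}
```

For an assumed drift velocity of 0.04 mm/ns, a drift time of 100 ns corresponds to `driftDistance(100.0, 0.04)`, i.e. 4 mm. The measurement only fixes the radius of a circle around the wire, not the side on which the particle passed, which is why `driftResidual` uses the unsigned distance.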

A feature of the CDC is that it allows determining not only the motion of the particle in the r-φ plane, but also track information in the z direction. For that, every second super-layer of wires (the wires are combined into barrel-shaped layers) is not parallel to the z axis, but slightly tilted. These tilted super-layers are called stereo layers (U for negative, V for positive tilt angle), the non-tilted ones are called axial layers (A). Beginning with the innermost one, every second super-layer is axial, with alternating U and V stereo layers in between. The innermost super-layer contains a denser packing of wires to achieve a higher robustness against background, allowing a higher resolution.

All in all, the CDC is an effective and reasonably precise³ tracking device at relatively low cost. Its simple construction allows a fast readout and offers a low dead time. Also, it covers a wide radial range, which is essential for a precise determination of tracks with low curvature.

3.3 Particle Identification

Time-of-Propagation Counter (TOP)

Right outside the tracking detectors, the detectors responsible for particle identification are located. The one for the barrel region is the TOP counter. Basically, the TOP measures the velocity of a particle flying through by measuring its Cherenkov angle. Combined with the information about its momentum, determined by the tracking detectors, the mass and hence the identity can be determined. The TOP consists of quartz plates arranged next to each other, with an array of photomultipliers at one end and a mirror at the other.

³ About ten times less precise than the silicon detectors.

The TOP uses the Cherenkov radiation of the particles in the quartz. If the speed of the particle flying through is higher than the speed of light in this medium, photons are emitted in a cone. The opening angle of this cone is related to the speed of the particle:

$$\cos(\theta) = \frac{c'}{v} \quad\Rightarrow\quad v = \frac{c'}{\cos(\theta)}$$

with $c' = c/n$ being the speed of light in the medium (n is its refractive index).

The Cherenkov photons are then reflected several times on the inner sides of the quartz until they reach the PMT at the end. The time they travel is influenced by the Cherenkov cone angle; the smaller it is, the more often the photons are reflected within the quartz. This time is measured by the PMTs, both directly for the photons flying towards the PMTs, and indirectly for the photons which were reflected back from the focusing mirror.
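The chain from Cherenkov angle to particle mass can be written down in a few lines. This is an illustrative sketch in natural units (c = 1, momenta and masses in GeV), not the reconstruction actually used for the TOP; the function names and the refractive index of 1.47 in the usage note are assumptions.

```cpp
#include <cassert>
#include <cmath>

// Velocity beta = v/c from the Cherenkov angle theta [rad] in a medium
// with refractive index n, using cos(theta) = c'/v = 1 / (n * beta).
double betaFromCherenkovAngle(double theta, double n) {
    assert(n > 1.0);
    return 1.0 / (n * std::cos(theta));
}

// Mass from momentum p and velocity beta: m = p * sqrt(1/beta^2 - 1).
double massFromMomentumAndBeta(double p, double beta) {
    assert(beta > 0.0 && beta < 1.0);
    return p * std::sqrt(1.0 / (beta * beta) - 1.0);
}

// Cherenkov angle for a particle of mass m and momentum p. Returns a
// negative value if the particle is slower than light in the medium and
// therefore emits no Cherenkov radiation.
double cherenkovAngle(double p, double m, double n) {
    const double beta = p / std::sqrt(p * p + m * m);
    const double cosTheta = 1.0 / (n * beta);
    return cosTheta <= 1.0 ? std::acos(cosTheta) : -1.0;
}
```

At the same momentum, a lighter particle is faster and therefore has a larger Cherenkov angle: `cherenkovAngle(2.0, 0.1396, 1.47)` (a pion) exceeds `cherenkovAngle(2.0, 0.4937, 1.47)` (a kaon). Feeding a measured angle back through `betaFromCherenkovAngle` and `massFromMomentumAndBeta` recovers the mass, which is the identification principle described above.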

Aerogel Ring Imaging Cherenkov Detector (ARICH)

The second PID device, covering the endcap of the barrel in the forward direction, is the ARICH. It consists of two layers of aerogel, in which the Cherenkov photons are created, and one layer of PMTs to detect these photons. Its principle is similar to the TOP; in order to determine the velocity, the cone angle of the Cherenkov light is measured. But due to its peripheral position, no quartz crystals are needed to guide the photons to the PMTs. Therefore the PMTs capture the full ring image of the Cherenkov cone.

3.4 Calorimeters

Electromagnetic Calorimeter (ECL)

None of the inner parts of the Belle II detector described so far detects photons. Except for electron/positron pairs from γ conversion, none of the inner detectors is sensitive to these uncharged particles, whose energy and position need to be known. Furthermore, this information is also important for a precise measurement of electrons.

Photons interact in several ways with the detector material. The higher-energy ones produce secondary electrons by pair production, lower-energy ones eject electrons from the material via the Compton and the photoelectric effect. Both secondary and primary electrons release other electrons from the detector material, produce other photons, and so on. All these effects add up to an electromagnetic cascade.

The ECL surrounds both the TOP and the ARICH. The part surrounding the barrel consists of 6624 CsI scintillation crystals, all shaped as truncated pyramids; the endcaps are made of 2112 crystals, giving 8736 in total. Behind each crystal, photodiodes are attached for the detection of the scintillation photons. As the intensity of a cascade is proportional to the energy of the particle, this energy can be obtained by measuring the light intensity within the scintillation crystal.

K-Long and Muon Detector (KLM)

The KLM is the detector part responsible for the detection of $K^0_L$ and muons. It lies behind the solenoid and has a part covering the barrel area and another covering the endcaps. It is made of a sandwich structure of alternating 4.7 cm thick iron plates and resistive plate chambers.

The resistive plate chambers consist of two glass electrodes with a thin gas layer in between. When a high voltage is applied to the glass electrodes, high-energy particles flying through the gas layer leave a trail of ionized gas behind. Because this trail is conductive, it causes a local drop of voltage, which cannot be evened out immediately due to the high resistance of the glass. This drop of the electric field can then be detected by pickup strips arranged orthogonally behind each glass plate. Due to this orthogonal arrangement, position information in the φ and z directions for the barrel detector and in the φ and θ directions for the endcaps can be obtained.

To distinguish muons from other particles, their high penetration power is used. For hadrons, the way from the ECL through the solenoid and the KLM is several interaction lengths long, which prevents them from reaching the KLM.

The $K^0_L$ do, of course, also reach the KLM, as (mostly) the only hadrons. They create clusters in both the KLM and the ECL. Every such event will be taken as a $K^0_L$ candidate.

Chapter 4

Computing at Belle II

If you don't know anything about computers, just remember that they are machines that do exactly what you tell them but often surprise you in the result.
Richard Dawkins

As mentioned before, Belle II will reach a luminosity which exceeds the former Belle luminosity by a factor of 40 to 50. This implies that the data produced by the detector will increase by about the same factor. For example, in a high-rate scenario, we expect to obtain 1,800 MB of data per second, at a rate of 6,000 Hz and 300 kB per event. This places new, high demands on computing at Belle II compared to Belle.
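The quoted data rate follows directly from the trigger rate and the event size. The small sketch below reproduces this arithmetic; the function names are hypothetical, decimal prefixes (1 MB = 1000 kB, 1 PB = 1e9 MB) are assumed, and the live time in the usage note is only an example value.

```cpp
#include <cassert>

// Raw data rate in MB/s from the trigger rate [Hz] and the event size [kB].
double dataRateMBPerSecond(double triggerRateHz, double eventSizeKB) {
    assert(triggerRateHz >= 0.0 && eventSizeKB >= 0.0);
    return triggerRateHz * eventSizeKB / 1000.0;
}

// Accumulated data volume in PB for a given rate [MB/s] and live time [s].
double dataVolumePB(double rateMBPerSecond, double liveTimeSeconds) {
    assert(liveTimeSeconds >= 0.0);
    return rateMBPerSecond * liveTimeSeconds / 1.0e9;
}
```

`dataRateMBPerSecond(6000.0, 300.0)` gives the 1,800 MB per second stated above. With an assumed live time of one million seconds at this rate, `dataVolumePB(1800.0, 1.0e6)` amounts to 1.8 PB.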

The new approach at Belle II is to change from Belle's KEK-centralized computing to a distributed form of computing. Although this approach is both technically and organizationally much more complex, it provides several advantages. First, the Belle II members, who are located all over the world, can use their domestic facilities for their contributions, and thus save data transfer volume. Another advantage is redundancy: a local blackout would not paralyse the whole computing system, as happened to KEK's computing facility during the 2011 earthquake disaster in Japan. Finally, there is broad experience with computing centers used for other experiments and facilities using similar approaches.

The tasks of computing at Belle II can be divided into three levels. First, the raw data produced by the detector needs to be stored and processed. Due to the high data production of several petabytes per year, this will be done directly at KEK, where the data is initially stored on tape. To achieve redundancy, the raw data will also be copied directly to PNNL¹ via the SINET4 and PacificWave backbones. After the raw processing, it is necessary to convert the obtained data into a format which is compatible with the analysis framework. Therefore, n-tuples of physically relevant data are created. This takes place at the second level, the distributed Grid sites. Another very important task done there is the production of Monte Carlo² data, a process which in fact requires most of the computing resources at Belle II. About six times as much MC data as experimental data will be generated. The third level consists of the local computing resources where the physics analyses will be done. These use the n-tuples from the Grid sites. As these analyses require random access to the data, the data will be stored on disks rather than on tape; only the raw data is stored on tape.

¹ Pacific Northwest National Laboratory (PNNL), located in Richland near Seattle, USA

4.1 The basf2 Framework

The Belle Analysis Software Framework 2, or basf2 for short, is the heart of the software at Belle II. It is based on a collection of various software packages for particle physics, which provide simulation, reconstruction, analysis, and data acquisition. Its design was inspired by the frameworks of other experiments, mainly Belle (basf).

basf2 consists of several modules. Each module is assigned to one specific task, for example to run the simulation with specific parameters, or to load the geometry model. These modules are controlled by a Python steering script, which calls each module consecutively. As a data interface for reading from and writing to the modules, the ROOT API is used [13]. ROOT is a data analysis framework developed at CERN; as free software, it has become one of the standard applications in particle physics and is widely used.
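The idea of a chain of single-purpose modules that are called one after another can be illustrated with a small toy model. The classes below are hypothetical stand-ins written for this illustration; they are not the basf2 API, and the real steering is done from Python rather than C++.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a framework module: one task per module.
class Module {
public:
    virtual ~Module() = default;
    virtual std::string name() const = 0;
    // Process one event; the shared log stands in for a common data store.
    virtual void event(std::vector<std::string>& log) { log.push_back(name()); }
};

class GeometryLoader : public Module {
public:
    std::string name() const override { return "geometry"; }
};

class Simulation : public Module {
public:
    std::string name() const override { return "simulation"; }
};

class TrackFitter : public Module {
public:
    std::string name() const override { return "fitting"; }
};

// The path calls its modules consecutively, in the order they were added.
class Path {
public:
    void add(std::unique_ptr<Module> m) {
        assert(m != nullptr);
        modules_.push_back(std::move(m));
    }
    void processEvent(std::vector<std::string>& log) {
        for (auto& m : modules_) m->event(log);
    }
private:
    std::vector<std::unique_ptr<Module>> modules_;
};
```

Adding a `GeometryLoader`, a `Simulation` and a `TrackFitter` to a `Path` and calling `processEvent` runs them in exactly this order for every event. Exchanging one step, for example a different fitter, only requires adding a different module.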

As the focus of this thesis lies on tracking, two important parts of basf2 responsible for simulation and tracking take center stage here: the simulation framework Geant4 with its error propagation tool Geant4e on the one hand, and the tracking framework GenFit on the other.

4.2 Simulation with Geant4

For the simulation of the particle reactions within the detector, basf2 uses the software framework Geant4. Geant4 is the fourth part of the Geant series, originally developed at CERN for the simulation of particle reactions within matter [14][15]. It is the successor of Geant3 and, in contrast to its forerunners, which were written in FORTRAN, Geant4 was completely redesigned in a highly object-oriented way, using C++. Besides its use in many particle physics experiments, it has found wide application in medical research. Geant4 can nowadays be seen as an industry standard in its field.

² Monte Carlo data, or MC data for short, is detector data which is not obtained from the real detector, but from a simulation of the experiment. This plays an important role in later analyses.

In a program using Geant4, first a geometry of the detector or a specific component is set up³. This is done via a nested class structure of different objects, representing different shapes of a specific material. Then, primary particles are created using Monte Carlo methods. After the primaries are shot, Geant4 initializes a stepper, propagating every particle step by step through the material. In every step, particle reactions (like decays etc.) and material effects are applied stochastically for every particle. This gives a realistic simulation of the events. Geant4 also offers several tools to visualize these events, and provides interfaces to extract the properties of an event.

Software Design

As the focus of this diploma thesis lies on developing and improving software for Belle II, a quick overview of the software structure of Geant4 is given. A top-level category diagram can be seen in figure 4.1. These categories are not closed objects within Geant4, but a loose arrangement of classes belonging to the same task area.

At the bottom of this diagram, the most fundamental components can be found, with the next higher ones relying on them. At the foot, the globals category can be found. This category contains basic features of Geant4, like the system of units and several constants, as well as numerics and random number handling. Other basic categories are materials, particles, and graphical representations. The geometry category provides the different volumes of which the detector geometry consists, and the routines to navigate through them. intercoms contains the communication interface between the classes. It is also directly accessed by the user interfaces.

Above these basic categories lie the ones responsible for physical processes and particle tracking. The track category is responsible for the handling of tracks and steps; these are invoked by the classes of physical processes and hits. The tracking category makes use of all these classes to update the track classes. Above, the events category manages these tracks, and the run category manages collections of events.

The visualisation, persistency, readout and (user) interfaces categories provide possibilities to access the runs.

³ Or, in medical applications, the 'detectors' are body parts.


Figure 4.1: A top-level category diagram of the Geant4 structure. The lines show the relations between the categories. The category at the circle end makes use of the connected one [14]. (Categories shown: Geant4, Interfaces, Visualization, Persistency, Readout, Run, Event, Tracking, Digits+Hits, Processes, Track, Geometry, Particle, Material, Graphic_Reps, Intercoms, Global.)


Implementation

This section focuses on how Geant4 can be used in practice. This is shown with the help of a very basic C++ program.

At first, an instance of G4RunManager is allocated.

G4RunManager* runManager = new G4RunManager;

This is the basic class managing the simulation. To equip the simulation with individual properties, like a custom detector geometry, a specified physics list and a primary generator, individual classes defining these properties can be attached to the instance of the G4RunManager by calling

runManager->SetUserInitialization(new myDetectorConstruction);
runManager->SetUserInitialization(new myPhysicsList);
runManager->SetUserAction(new myPrimaryGeneratorAction);

Classes like myDetectorConstruction have to be defined beforehand by the user, by inheriting from a Geant4 user class such as G4VUserDetectorConstruction. These user classes are designed as abstract (virtual) base classes. For example, the individual composition of the detector is defined by implementing its virtual method Construct(), which builds a tree of volume classes like G4LogicalVolume and G4VPhysicalVolume⁴, containing instances of G4Material.

After attaching the user classes to the G4RunManager, it needs to be initialized before a run can be started by the BeamOn(int) command:

runManager->Initialize();
int numberOfEvent = 3;
runManager->BeamOn(numberOfEvent);

By extending the user classes, which are designed as virtual classes, the developer can deploy arbitrary code here. The most important user classes are

• G4VUserDetectorConstruction, defining the detector geometry.

• G4VUserPhysicsList, defining the physics.

• G4VUserPrimaryGeneratorAction, defining how primary particles shall be created.
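The pattern behind these user classes, an abstract base class that the framework calls and that the user fills with content, can be shown without Geant4 itself. The classes below are toy stand-ins written for this illustration and are explicitly not the Geant4 API; only the roles (a run manager, a mandatory detector construction, initialization before the run) mirror the description above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in (NOT the Geant4 class): abstract user class for the geometry.
class VUserDetectorConstruction {
public:
    virtual ~VUserDetectorConstruction() = default;
    // The user must implement this and describe the detector.
    virtual std::string Construct() = 0;
};

// Toy stand-in for a run manager that owns the attached user class.
class ToyRunManager {
public:
    void SetUserInitialization(std::unique_ptr<VUserDetectorConstruction> d) {
        detector_ = std::move(d);
    }
    // Initialization fails if the mandatory user class is missing.
    bool Initialize() {
        if (!detector_) return false;
        world_ = detector_->Construct();
        return true;
    }
    // Returns the number of processed events (0 if not initialized).
    int BeamOn(int nEvents) {
        assert(nEvents >= 0);
        return world_.empty() ? 0 : nEvents;
    }
    const std::string& World() const { return world_; }
private:
    std::unique_ptr<VUserDetectorConstruction> detector_;
    std::string world_;
};

// User code: a concrete detector description.
class MyDetectorConstruction : public VUserDetectorConstruction {
public:
    std::string Construct() override { return "world{pxd,svd,cdc}"; }
};
```

The manager never needs to know the concrete user class: it only calls the virtual method when it is initialized. This is the reason why the user classes have to be attached before the initialization and why a run cannot start without them.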

⁴ The detailed difference between physical and logical volumes is explained in the Geant4 user documentation. As one might have noticed, G4VPhysicalVolume is a virtual class, too.


Along with these three, which are mandatory to define, there are several others. Only the G4VSensitiveDetector class will be mentioned here, as it is important for the simulation done in the context of this thesis. Derived implementations of this virtual class can be attached to the parts of the detector which are marked as sensitive⁵. By implementing the virtual method EndOfEvent(G4HCofThisEvent*), the user can access the produced hits and process them individually, for example by calling GenFit code.

4.3 Geant4e Error Propagation

One part which comes with Geant4 is its error propagation framework Geant4e [16]. It uses the software structure of Geant4's tracking to track average trajectories and their covariance matrices through detector material.

Tracking and Error Propagation

As pointed out in section 4.4 about the tracking framework GenFit, the algorithm we use for fitting measured hits back together to a track (in our case, the Kalman filter) needs a precise prediction of the trajectory of a particle and its covariance matrix. The tracking interface of Geant4 does, in principle, provide this trajectory prediction. But in this case, what is needed is not a single particle, which is affected by random fluctuations, but the average trajectory of all Monte Carlo particles.

Using Geant4 tracking, this can be done by switching off all 'individual' effects inside the physics list, leaving just ionisation. This will provide the 'average' trajectory. After this, the covariance matrix of the trajectory is propagated. This is done first by a simple transformation of the matrix from the old position to the new one. After this, the physical effects are applied.
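The two steps, transporting the covariance matrix and then adding the material effects, can be illustrated in a strongly reduced setting. The sketch below is not Geant4e code: it assumes a two-parameter state (a position y and a slope) that is moved along a straight line of length d, so that the transport is C' = J C Jᵀ with the Jacobian J = [[1, d], [0, 1]], and it models the material as a thin scatterer that only widens the slope variance. All names are hypothetical.

```cpp
#include <array>
#include <cassert>

using Matrix2 = std::array<std::array<double, 2>, 2>;

// Step 1: transport the covariance of (y, slope) along a straight line of
// length d. With the Jacobian J = [[1, d], [0, 1]] this is C' = J C J^T.
Matrix2 transportCovariance(const Matrix2& c, double d) {
    assert(c[0][1] == c[1][0]);  // covariance matrices are symmetric
    Matrix2 out{};
    out[0][0] = c[0][0] + 2.0 * d * c[0][1] + d * d * c[1][1];
    out[0][1] = c[0][1] + d * c[1][1];
    out[1][0] = out[0][1];
    out[1][1] = c[1][1];
    return out;
}

// Step 2: add a material effect, here multiple scattering in a thin layer
// with angular width theta0, which only increases the slope variance.
Matrix2 addThinScatterer(Matrix2 c, double theta0) {
    c[1][1] += theta0 * theta0;
    return c;
}
```

Even without any material, the position uncertainty grows during the transport, because the slope uncertainty is projected over the distance d. The material then adds further uncertainty on top, which is why the two steps are applied one after the other.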

Structure of Geant4e

As a part of Geant4, Geant4e is designed in a highly object-oriented way. Like Geant4, it contains a basic manager class, G4ErrorPropagatorManager, which steers the run and manages the track parameters and their covariances. Before initializing an instance of G4ErrorPropagatorManager, instances of a target class (G4ErrorTarget) and a trajectory state class (G4ErrorTrajState) have to be attached.

⁵ Sensitivity means that Geant4 will collect hits for these parts. For non-sensitive parts, no hits will be collected.


Figure 4.2: Step-wise propagation of a particle (black); ionisation process/energy loss (red); secondary process (green) (only applied in Monte Carlo tracking); error (blue) (only calculated in error propagation)

The target classes define a geometrical condition at which the extrapolation should stop. There are different types of target classes, like G4ErrorGeomVolumeTarget, stopping the extrapolation when a volume is reached, or the most commonly used G4ErrorPlaneSurfaceTarget, defining a plane as stopping point. Besides the several types of target classes, which inherit from G4ErrorTarget, there are two important children of G4ErrorTrajState, storing the trajectory in specific parametrizations.

• The G4ErrorFreeTrajState handles the trajectory in five parameters in spherical coordinates.

• The G4ErrorSurfaceTrajState manages the trajectory by five parameters, set in plane coordinates with a previously determined plane.

These states are described in detail in section 5.1.3. Like in Geant4, the propagation is then done stepwise, using the stepping methods of the Geant4 core and the detector geometry previously defined in the G4RunManager instance.

During the work with Geant4e for this thesis, it turned out that there are several severe bugs within Geant4e. These bugs caused massive malfunctions in the propagation of specific G4ErrorTrajStates and their covariance matrices. More details on this, and its implications for this thesis, are given in section 5.1.5.


4.4 GenFit - a Generic Tracking Framework

GenFit is a flexible framework for track fitting. It was developed at the Technische Universität München as a fitting tool for the PANDA experiment [17][18]. It is designed in a highly object-oriented way, with its core tasks split into several exchangeable classes.

The three core classes in GenFit are the track representations, inheriting from GFAbsTrackRep, the reconstruction hits, derived from GFAbsRecoHits, and the fitting algorithms, like the Kalman filter implementation GFKalman. The basic class of the track object is GFTrack, which stores (lists of) implementations of these classes.

Figure 4.3: Class structure of GenFit [17]. (Diagram content: the fitting algorithm acts on a track, which contains reconstruction hits and track representations. Concrete hit classes, e.g. silicon pixel, TPC and silicon strip reco hits, inherit from a common interface class; concrete track models, e.g. track model 1 for proton, track model 2 for pion and track model 1 for pion, inherit from another interface class. All hits are used to fit each track representation.)

Reconstruction Hits

A special feature of GenFit is its versatility: hits from different detector components can be combined and fitted within the same track object. As the raw data from the different components differ a lot (for example, a PXD hit is a coordinate in a plane, while a CDC hit consists of a wire number and a drift time), the hits from each component need specific treatment. As a virtual base class for the hits, GFAbsRecoHits is used. The hits do not directly inherit from this class, but from special class templates called GFRecoHitIfc<HitPolicy>, which are themselves directly derived from the RecoHits base class. For every kind of hit, a HitPolicy is defined, providing methods to derive the information needed.

The RecoHits break down the hit coordinates onto detector planes, and provide them in a planar coordinate system, defined by the unit vectors spanning the plane. If a set of hits does not lie in a physical detector plane, like e.g. CDC hits, virtual detector planes are introduced. This connection to (virtual) detector planes is GenFit's basic treatment of hit coordinates.

To convert the raw hits into the plane coordinates, the RecoHits provide a projection matrix H and several methods to get the plane coordinates.
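What such a projection matrix does can be seen in a small stand-alone example. In a Kalman filter, H is commonly used to map the five-parameter track state onto the coordinates a detector actually measures. The sketch below is not GenFit code: the type and function names are hypothetical, and the ordering of the state vector, (u, v, u', v', q/p), is an assumption made only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using State5 = std::array<double, 5>;  // assumed order: (u, v, u', v', q/p)
using ProjectionMatrix = std::vector<std::array<double, 5>>;  // one row per measured coordinate

// Pixel-like hit: both plane coordinates u and v are measured.
ProjectionMatrix pixelProjection() {
    ProjectionMatrix h(2);  // rows are zero-initialized
    h[0][0] = 1.0;          // first measured coordinate is u
    h[1][1] = 1.0;          // second measured coordinate is v
    return h;
}

// Strip-like hit: only one plane coordinate is measured (0 = u, 1 = v).
ProjectionMatrix stripProjection(std::size_t axis) {
    assert(axis < 2);
    ProjectionMatrix h(1);
    h[0][axis] = 1.0;
    return h;
}

// Predicted measurement m = H * x for a given track state x.
std::vector<double> project(const ProjectionMatrix& h, const State5& x) {
    std::vector<double> m(h.size(), 0.0);
    for (std::size_t i = 0; i < h.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            m[i] += h[i][j] * x[j];
    return m;
}
```

The number of rows of H equals the dimension of the measurement: two for a pixel hit, one for a single strip. The fitter can therefore treat hits of different detectors in the same way, as long as each hit type supplies its own H.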

Track Representations

While the RecoHits are used to store and convert the hits, a class containing the parameters of the fitted track is also needed. Another necessary feature, needed by the fitting algorithms, is the ability to extrapolate the track parameters and their covariances to detector planes or close to drift wires. These functionalities are provided by the track representations, whose interface is defined by the virtual class GFAbsTrackRep.

Like the GFAbsRecoHits class for the RecoHits, the GFAbsTrackRep class gives a basic structure for the design of track representations. All specific implementations inherit from this class. The different implementations differ in the way they extrapolate the track parameters.

The track state is stored in five parameters. As the tracks in GenFit are always evaluated on detector planes, these parameters are given in a plane coordinate system. This coordinate system is spanned by the two unit vectors of the plane ($\vec{u}$, $\vec{v}$), while the plane itself is defined by its support vector $\vec{O}$ and normal vector $\vec{n}$. The five parameters consist of the two coordinates of the point where the track crosses the plane ($u$, $v$), the two projections⁶ of the momentum direction at this point onto the plane ($u'$, $v'$), and the charge divided by the magnitude of the momentum ($q/p$). The errors on these parameters are stored in a 5 × 5 covariance matrix. These quantities can be converted into common phase space quantities. The following transformations give the conversion into position and momentum:

⁶ Important: These projections are often referred to as 'direction cosines' in the GenFit papers, but in fact they are 'direction tangents', as the normal vector $\vec{n}$ is always added once. If they were cosines, the transformations could only give back direction vectors with a cone angle of at least 45°.


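• position:

$$\vec{x} = \vec{O} + u \cdot \vec{u} + v \cdot \vec{v}$$

• momentum:

$$\vec{p} = |p| \cdot \mathrm{spu} \cdot \frac{\vec{n} + u' \cdot \vec{u} + v' \cdot \vec{v}}{\left|\vec{n} + u' \cdot \vec{u} + v' \cdot \vec{v}\right|}$$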

These five parameters (almost) fully represent a track state, but when constructing the momentum from them, one does not know whether the momentum vector is pointing towards or away from the plane. To resolve this ambiguity, the spu parameter is stored separately within the track representation to determine this direction. More detailed information about the state vectors of track representations can be found in section 5.1.3.
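The two transformations can be sketched in a few lines of C++. This is a minimal illustration, not GenFit code: the `Vec3`, `Plane` and `State5` types and both function names are hypothetical (GenFit itself works with ROOT vector classes), spu is assumed to take the values +1 or -1, and a particle of unit charge is assumed so that |p| = 1/|q/p|.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Hypothetical helper types for illustration only.
struct Plane {
    Vec3 O;  // support vector
    Vec3 u;  // first in-plane unit vector
    Vec3 v;  // second in-plane unit vector
    Vec3 n;  // unit normal
};

struct State5 {
    double qOverP, uPrime, vPrime, u, v;
};

// x = O + u * u_vec + v * v_vec
Vec3 position(const Plane& pl, const State5& s) {
    Vec3 x{};
    for (int i = 0; i < 3; ++i) x[i] = pl.O[i] + s.u * pl.u[i] + s.v * pl.v[i];
    return x;
}

// p = |p| * spu * (n + u' * u_vec + v' * v_vec) / |n + u' * u_vec + v' * v_vec|
Vec3 momentum(const Plane& pl, const State5& s, double spu) {
    assert(s.qOverP != 0.0 && (spu == 1.0 || spu == -1.0));
    Vec3 d{};
    double norm2 = 0.0;
    for (int i = 0; i < 3; ++i) {
        d[i] = pl.n[i] + s.uPrime * pl.u[i] + s.vPrime * pl.v[i];
        norm2 += d[i] * d[i];
    }
    // |p| = 1 / |q/p| holds for unit charge only (assumption of this sketch)
    const double scale = spu / (std::fabs(s.qOverP) * std::sqrt(norm2));
    for (int i = 0; i < 3; ++i) d[i] *= scale;
    return d;
}
```

Flipping spu from +1 to -1 reverses the momentum vector while leaving all five parameters unchanged, which is exactly the ambiguity the separately stored sign resolves.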

Besides storing the track parameters and their covariances, the main task of the track representations is to extrapolate them towards other detector planes. Each implementation does this extrapolation in a different way. The different kinds of track representations are introduced later in the G4eTrackRep chapter, where a more detailed technical description is also given.

Fitting Algorithms

As its name implies, GenFit provides classes to fit a track. Unlike the other two important parts, the fitting algorithms do not inherit from a virtual base class, nor are they part of a GFTrack object. They are instantiated outside of the track objects and operate on them.

So far, there are two algorithms: the widely used Kalman fitter, implemented in GFKalman, and the Deterministic Annealing Filter (DAF), implemented in GFDaf, which is basically a weighted Kalman fitter.

Kalman Fitter

The GFKalman class is an implementation of the Kalman filter, a common method for fitting tracks that is widely used well beyond high energy physics.

It can be proven that the Kalman filter is the optimal estimator under the following conditions:

• the model used for propagation is linear

• the errors are Gaussian

• noise is white and uncorrelated


These conditions may seem strict for most physical systems. In most cases, however, both the propagation model and the error model can be approximated to meet them. For example, the deflection due to multiple scattering can be modelled as Gaussian for small angles, as seen in section 2.2. The basic principle for calculating the first state is to take a weighted mean of all available measurements, assuming that the state is Gaussian distributed as well, with the weights chosen according to the errors. For each further step that includes a new measurement, the weighted mean of this new measurement and of the extrapolation of the former state is taken. Not only the states, but also the errors are calculated according to this principle. A big advantage is that the fit can be complemented by new measurements at any time, without repeating the whole fit [19]. This explanation is a very basic one; the implementation in the 5-dimensional parameter space with varying track extrapolation comes with several adjustments. But basically, GFKalman follows this principle.


Chapter 5

The Geant4e Track Representation

This chapter introduces one of the main tasks of this diploma thesis: the integration of Geant4's error propagation into the GenFit fitting framework as a new track representation. The approach to this task is shown here, also in comparison to the two existing and mainly used track representations, the RKTrackRep and the GeaneTrackRep2.

There were two major motivations for creating this track representation based on Geant4e. First, the RKTrackRep, which is currently used within basf2, is not compatible with the Geant4 detector geometry model of the Belle II detector, because it uses the ROOT class TGeo. This requires storing the geometry twice, which leads to additional memory consumption. Moreover, there are several known bugs within TGeo, e.g. bugs leading to endless loops during navigation. Secondly, at the beginning of this work in November 2011, there were serious doubts about the accuracy of the RKTrackRep. This could be seen, for example, in several biased pull distributions. In the meantime, these issues have been solved, and the RKTrackRep runs reliably and gives reasonable results.

One of the great challenges was to get Geant4e to run properly, despite the severe lack of documentation. Even with the exchange of experiences with other users, this work unfortunately did not turn out to be successful. The reasons for this are analysed in section 5.1.5 of this chapter.


5.1 General Approach and Testing Environments

As the new track representation needs to have almost the same structure as the two existing ones, the first approach to get a functioning version was to take the code of the latest version of the RKTrackRep and replace the relevant Extrap() method by an implementation based on Geant4e. The RKTrackRep was chosen because it is the more up-to-date one and was already well adapted to the needs of the tracking software within basf2.

To be able to test the new track representation, a new testing environment was set up using a Geant4 simulation. This program simulates a particle gun, shooting arbitrary particles (mostly electrons/positrons and muons) at specific energies through a 'detector' setup. This 'detector' consisted only of a 40 cm long tube filled with argon gas. The detector volume was made sensitive, i.e., a G4VSensitiveDetector class was attached to it, which made Geant4 collect hits within this volume. These hits were transformed into a new GenFit RecoHit class and attached to a GFTrack object. The track could then be extrapolated and fitted using GenFit and its track representations, including the G4eTrackRep (see fig. 5.1 and 5.5).

Figure 5.1: Side view of a track shot through the argon tube in the test environment. The red trace is the Geant4 truth trajectory, while the black dots are extrapolated and re-fitted points generated with the RKTrackRep.

Later in the development, a second testing facility was used. For this, one of the GenFit examples, which was used to generate pull distributions of a fit, was adapted to the needs of the G4eTrackRep. This test program can be found in $GENFIT/test/PullsTest. It was chosen because of its large number of extrapolate() calls and because of GenFit's recently added event display.

To get a better understanding of the implementation of the G4eTrackRep, the next section first takes a closer look at the two existing ones and their base class.

5.1.1 RKTrackRep and GeaneTrackRep2

The GFAbsTrackRep Base Class

The GFAbsTrackRep is the virtual base class of all track representations. It provides their basic structure by inheritance. It defines the basic member objects and I/O methods. The specific methods, like extrapolation, are declared as virtual methods. The following list gives a short overview of the most important objects and methods.

Objects:

• TMatrixD fState is the 5 × 1 matrix in which the 5 track parameters are stored, and TMatrixD fCov stores their 5 × 5 covariance.

• fRefPlane contains the parameters of the current reference plane, in which the coordinates of the state vector are given.

Methods:

• virtual double extrapolate() is the virtual method to extrapolate the parameters to a different detector plane. This method is not implemented by the base class and must be implemented in every track representation. In addition, virtual double extrapolateToPoint() and virtual double extrapolateToLine() are declared here.

• void setPosMomCov(), void setData() and other setters, along with getters like void getPosMomCov(), provide an I/O interface.

The two implementations add several other methods and objects, and differ slightly in details.
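The division of labour between base class and implementations can be sketched as follows. This is a deliberately reduced illustration, not the GenFit interface: the class names `AbsTrackRepSketch` and `StraightLineRepSketch` are hypothetical, the target plane is reduced to a single z position with its normal along z, and the toy implementation ignores magnetic field and material.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using State = std::array<double, 5>;  // (q/p, u', v', u, v)

// Hypothetical, stripped-down stand-in for an abstract track representation.
class AbsTrackRepSketch {
public:
    explicit AbsTrackRepSketch(const State& state) : fState(state) {}
    virtual ~AbsTrackRepSketch() = default;

    // Pure virtual: every concrete representation must provide its own
    // extrapolation. Returns the covered distance.
    virtual double extrapolate(double zTarget, State& statePred) const = 0;

    const State& getState() const { return fState; }

protected:
    State fState;        // track parameters on the reference plane
    double fRefZ = 0.0;  // toy reference plane z = fRefZ
};

// Toy implementation: straight line, no field, no material effects.
class StraightLineRepSketch : public AbsTrackRepSketch {
public:
    using AbsTrackRepSketch::AbsTrackRepSketch;

    double extrapolate(double zTarget, State& statePred) const override {
        assert(std::isfinite(zTarget));
        const double dz = zTarget - fRefZ;
        statePred = fState;              // start from the stored member state
        statePred[3] += fState[1] * dz;  // u advances by u' * dz
        statePred[4] += fState[2] * dz;  // v advances by v' * dz
        return std::fabs(dz) *
               std::sqrt(1.0 + fState[1] * fState[1] + fState[2] * fState[2]);
    }
};
```

As in the real track representations, the prediction is written to the output argument while the stored member state is left untouched.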

The RKTrackRep

The RKTrackRep uses a Runge-Kutta method to solve the equations of motion to propagate the track through matter. The class was developed by J. Rauch and Ch. Höppner at the Technische Universität München. It is based on code from Geant3, which was ported to the C version by I. Gavrilenko.

Its extrapolate() method is split into several steps. First, the call of extrapolate() itself converts the input data to the required format:


    double RKTrackRep::extrapolate(const GFDetPlane& pl,
                                   TMatrixD& statePred,
                                   TMatrixD& covPred)
    {
        // get the current 7-dim. state
        M1x7 state7;
        getState7(state7);

        // transform the covariance to the 7-dim. space
        M7x7 cov7x7;
        transformPM7(fCov, cov7x7, fRefPlane, fState, fSpu);

        // extrapolation; returns the covered distance
        double coveredDistance = Extrap(pl, state7, &cov7x7);

        // back transformation of the state
        statePred.ResizeTo(5, 1);
        statePred = getState5(state7, pl, fCacheSpu);
        fCachePlane = pl;

        // back transformation of the covariance
        covPred.ResizeTo(5, 5);
        transformM7P(cov7x7, covPred, pl, state7);

        return coveredDistance;
    }

where statePred and covPred are the (ROOT) matrices in which the extrapolation results are stored (all GenFit code can be found in [20]). As initial values, the state and covariance stored in the member variables are taken, not the ones handed over.

Before the extrapolation itself takes place in the Extrap() method, the 5-dimensional track model of GenFit must be converted into a 7-dimensional track model, as this is the model the code within Extrap() works with. This 7-dimensional representation consists of q/p, the position coordinates, and the direction of the track (i.e., the momentum normalised to magnitude 1), all in phase space coordinates. The basic functioning is not shown here in detail. A structural overview can be seen in figure 5.2.

Extrap() calls the RKutta() method, which calculates the steps of the extrapolation using a Runge-Kutta method. After the stepping, the material effects are invoked to calculate the effects of energy loss and scattering on the track and the covariance matrix.


Figure 5.2: Basic functioning of the RKTrackRep::Extrap() method [21].

The GeaneTrackRep2

This track representation is older than the RKTrackRep and was not developed further. It therefore contains designs which are outdated and are solved better in the RKTrackRep, which better fits the needs of basf2. However, as its extrapolation method is quite similar to the one used in the G4eTrackRep, it is still worth a look. The GeaneTrackRep2 uses Geane, the error propagation framework of Geant4's predecessor Geant3. As Geant4e is a C++ re-implementation of Geane, its internal processing is close to that of Geane, which makes it equivalent in usage and in the data input format. The most significant difference between them is that Geane was written in FORTRAN, while Geant4e uses C++.

A closer look at GeaneTrackRep2::extrapolate() reveals the correct usage of Geane and thus of Geant4e. The most important steps within this method are presented here.

When the extrapolate() method is called, a target plane and two matrices to hold the new state and covariance are handed over.

    GeaneTrackRep2::extrapolate(const GFDetPlane& pl,
                                TMatrixT<double>& statePred,
                                TMatrixT<double>& covPred)

For the extrapolation itself, Geane is used, or more precisely, a C++ wrapper for the FORTRAN-coded Geane. Before the extrapolation can be called, the extrapolation parameters must be handed over to this wrapper:

    gMC3->Eufilp(1, ein, pli, plf);

where 1 is a flag for the input format, ein is the input covariance matrix, and pli and plf are the reference and target planes.

The extrapolation itself is then performed by the Ertrak() method,

    gMC3->Ertrak(x1, p1, x2, p2, fG3ParticleID,
                 geaneOption.Data());

where x1 and p1 are the initial position and momentum, x2 and p2 the final ones, and geaneOption is a string containing either 'PE' for forward or 'BPE' for backward propagation.

The distinction between forward and backward propagation is made beforehand by calculating the distance vector between the current position and the target plane; if the scalar product of this vector and the current momentum (i.e. direction) is negative, a back propagation must be performed:

    TString geaneOption("PE");
    TVector3 dir = pl.dist(x1vect);  // direction from pos to plane
    if ((dir*p1vect) < 0)
        geaneOption = "BPE";

After the extrapolation, the final data can be read out using the fErtrio object:

    Ertrio_t* ertrio = gMC3->fErtrio;
    Double_t cov15[15];
    for (Int_t i = 0; i < 15; i++) cov15[i] = ertrio->errout[i];

The flag used at the beginning within the Eufilp command determines the parametrization of the track. The flag '1' chosen here selects the surface trajectory state, which is very similar to the parametrization used in GenFit¹. Another parametrization is the free trajectory state, in which the coordinates are given in a sphere-like parametrization. As Geant4e is a re-implementation of Geane, it provides both representations, too.

Unfortunately, the very practical surface trajectory state cannot be used within the G4eTrackRep, for reasons demonstrated in later sections. Nevertheless, the basic sequence of Geane calls within this track representation was used as an archetype for the G4eTrackRep.

¹ The only difference is that the 'charge / momentum' parameter is changed to '1 / momentum'. This simplifies the transformation between those representations enormously; in particular, the otherwise complex transformation of the covariance matrix can be done by just multiplying the first row and the first column by the charge.

5.1.2 Structure of G4eTrackRep::Extrap()

The core of the new track representation is its Extrap() method, which is also the basic difference to the other track representations. As mentioned before, most of the other code was taken from the RKTrackRep, as both must fulfil the same basic tasks and thus need the same structure. In this section, the functioning of the extrapolation is explained in detail. Further changes and differences with respect to the other track representations are left out here, in particular the complex run management of Geant4e; they are covered in the following sections.

Extrap() can't be called directly. It is invoked by the extrapolate() method, which prepares state and covariance.

    double G4eTrackRep::extrapolate(const GFDetPlane& pl,
                                    TMatrixD& statePred,
                                    TMatrixD& covPred)
    {
        TMatrixD state7(getState7());
        TMatrixD cov5x5(5, 5);

        double coveredDistance = Extrap(pl, &state7, &cov5x5);

        statePred.ResizeTo(5, 1);
        statePred = getState5(state7, pl, fCacheSpu);
        fCachePlane = pl;

        covPred.ResizeTo(5, 5);
        covPred = cov5x5;

        return coveredDistance;
    }

As one can see, no transformation of the covariance matrix takes place here; only the 5-dimensional state vector is transformed to the phase-space-like parametrization also used within the RKTrackRep. This is because Geant4e is initialized with phase space coordinates.

First, the initial values for Geant4e are created from the state vector which was handed over. The target plane is set as well.

    double G4eTrackRep::Extrap(const GFDetPlane& plane,
                               TMatrixD* state, TMatrixD* cov)
    ...
    // determine current position and momentum
    TVector3 postemp((*state)[0][0], (*state)[1][0], (*state)[2][0]);
    G4ThreeVector g4pos(GFtoG4Vec(postemp));
    TVector3 momtemp((*state)[3][0], (*state)[4][0], (*state)[5][0]);
    G4ThreeVector g4mom(GFtoG4Vec(momtemp));
    momtemp.SetMag(fabs(1/(*state)[6][0]));
    g4mom.setMag(fabs(1/(*state)[6][0])*1000.);

    // make target plane
    G4Point3D pln(plane.getNormal().X(),
                  plane.getNormal().Y(), plane.getNormal().Z());
    G4Point3D pl0 = GFtoG4Vec(plane.getO());
    G4ErrorTarget* target = new G4ErrorPlaneSurfaceTarget(pln, pl0);

It is important to notice that the coordinate values can't just be copied from GenFit to Geant4e, because different systems of units are used. GenFit uses cm and GeV, while Geant uses mm and MeV for the coordinates, and GeV and cm for the covariance. This necessitates a conversion, which is done here by the GFtoG4e and G4etoGF functions. Position values must be multiplied by a factor of 10 when going from GenFit to Geant4e and divided by 10 on the way back. Momentum values must correspondingly be multiplied or divided by a factor of 1000.
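The conversion factors can be written down as a small set of helpers. This is only a sketch of the arithmetic described in the text; the function and constant names are hypothetical and are not the conversion functions used in the G4eTrackRep.

```cpp
#include <cassert>

// Conversion factors between the GenFit units (cm, GeV) and the
// Geant4 units (mm, MeV) named in the text.
constexpr double kMmPerCm = 10.0;
constexpr double kMeVPerGeV = 1000.0;

constexpr double lengthGFtoG4(double cm) { return cm * kMmPerCm; }
constexpr double lengthG4toGF(double mm) { return mm / kMmPerCm; }
constexpr double momentumGFtoG4(double gev) { return gev * kMeVPerGeV; }
constexpr double momentumG4toGF(double mev) { return mev / kMeVPerGeV; }

// Compile-time check: the 40 cm argon tube is 400 mm long in Geant4 units.
static_assert(lengthGFtoG4(40.0) == 400.0, "cm to mm conversion");
```

Keeping the factors in one place avoids scattering literal values such as 10 and 1000 through the extrapolation code, where a missed conversion is easy to overlook.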

As seen in the GeaneTrackRep2, the direction of propagation must be determined before the extrapolation takes place. This is done in the same way as in the GeaneTrackRep2:

    // direction from pos to plane
    TVector3 in_out = plane.dist(postemp);

    // determine whether forward or backward propagation
    G4ErrorMode currentG4ErrorMode;
    if ((in_out*momtemp) < 0)
        currentG4ErrorMode = G4ErrorMode_PropBackwards;
    else
        currentG4ErrorMode = G4ErrorMode_PropForwards;
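The direction test is independent of ROOT and Geant4 and can be reproduced with plain vectors. The sketch below assumes that the dist() call returns the perpendicular vector from the given point to the plane, as the comment in the listing suggests; `Vec3`, `PropDirection`, `distToPlane` and `chooseDirection` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

enum class PropDirection { Forwards, Backwards };

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Perpendicular vector from pos to the plane given by the support point
// planeO and the unit normal planeN (assumed behaviour of dist()).
Vec3 distToPlane(const Vec3& planeO, const Vec3& planeN, const Vec3& pos) {
    assert(std::fabs(dot(planeN, planeN) - 1.0) < 1e-9);  // unit normal required
    const Vec3 diff{planeO[0] - pos[0], planeO[1] - pos[1], planeO[2] - pos[2]};
    const double s = dot(diff, planeN);
    return {s * planeN[0], s * planeN[1], s * planeN[2]};
}

// Backward propagation is needed if the momentum points away from the plane.
PropDirection chooseDirection(const Vec3& planeO, const Vec3& planeN,
                              const Vec3& pos, const Vec3& mom) {
    const Vec3 inOut = distToPlane(planeO, planeN, pos);
    return dot(inOut, mom) < 0.0 ? PropDirection::Backwards
                                 : PropDirection::Forwards;
}
```

A track sitting exactly on the target plane gives a zero scalar product and is treated as forward propagation, matching the `else` branch of the listing.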

Now the basic initial objects for Geant4e are set up, except for the covariance matrix. As mentioned before, there are two different trajectory states in Geant4e. The surface trajectory state lends itself to be used here within GenFit, as its 5-dimensional track parametrization is almost the same (see section 5.1.1 about GeaneTrackRep2). Unfortunately, serious doubts arose about the proper functioning of this trajectory state class². This led the author to use the free trajectory state class instead, which is supposed to work properly and is used by other users, but uses a different parametrization. More on this and other malfunctions of Geant4e can be read in the following section.

The change of the track parametrization makes it necessary to convert both the state parameters and the covariance matrix. Although the constructor of the free trajectory state accepts phase space coordinates of position and momentum for initialization, doing the transformation internally, the covariance matrix must be handed over in an already transformed version. This transformation is left out here and is described in the next section, as it is part of a patch which is presented there.

After these preparations, Geant4e is ready to perform the extrapolation. Geant4e offers two ways to extrapolate: the extrapolation can either be done at once or stepwise. Because every track representation has to return the covered distance after the extrapolation, this number must be calculated somehow. Unfortunately, there is no way to get this information out of Geant4e after an at-once extrapolation. It can, however, be obtained indirectly by performing the stepwise extrapolation and retrieving the step length after every step; these numbers are then added up. The stepper runs in a loop, which is left as soon as the Geant4e manager class sets the flag for the last step.

    // determine particle and state
    G4String particlename = G4ParticleTable::GetParticleTable()
                                ->FindParticle(fPdg)->GetParticleName();
    g4state = new G4ErrorFreeTrajState(particlename, g4pos, g4mom, error);

    // init extrapolation
    double coveredDistance(0.);
    g4eData->SetTarget(target);
    g4eMgr->InitTrackPropagation();

    // extrapolate stepwise
    bool moreEvt = true;
    int count = 0;
    while (moreEvt) {
        int ierr = g4eMgr->PropagateOneStep(g4state, currentG4ErrorMode);

        // calculate covered distance (step length converted from mm to cm)
        if (g4state->GetG4Track() != NULL)
            coveredDistance += g4state->GetG4Track()->GetStepLength()/10.;
        else
            G4cerr << "NO G4Track !!" << G4endl;

        // check if target is reached
        if (g4eMgr->GetPropagator()->CheckIfLastStep(g4state->GetG4Track())) {
            g4eMgr->GetPropagator()
                ->InvokePostUserTrackingAction(g4state->GetG4Track());
            moreEvt = false;
            G4cout << "STEP_BY_STEP propagation: Last Step " << G4endl;
        }
        count++;
    }

² The surface trajectory state is only a different interface; the extrapolation itself is done by the free state after conversion. This conversion gave different results after forward and backward transformation, which is not reasonable.

After this, the extrapolation has finished and the final results are available. They are stored within the free trajectory state object used before and can be retrieved from it in phase space coordinates. Here, again, the conversion factors due to the different systems of units must be kept in mind.

    // set up final results
    TVector3 finPos = G4toGFVec(g4state->GetPosition());
    TVector3 fin_direction = G4toGFVec(g4state->GetMomentum());
    fin_direction.SetMag(1.);
    double GFmag = g4state->GetMomentum().mag()/1000.;

    (*state)[0][0] = finPos.x();
    (*state)[1][0] = finPos.y();
    (*state)[2][0] = finPos.z();
    (*state)[3][0] = fin_direction.x();
    (*state)[4][0] = fin_direction.y();
    (*state)[5][0] = fin_direction.z();
    (*state)[6][0] = fCharge/GFmag;

The state vector object which was filled here is one of the values returned by reference. As seen before, it is handed back to the extrapolate() method, which does the conversion back to the 5-dimensional representation.

5.1.3 States and Matrix Conversions

The basic functioning of the extrapolation of the G4eTrackRep was shown in the last section, except for the conversion of the covariance matrix, which was intentionally left out to be shown here. As mentioned before, the free trajectory state was chosen as the representation within Geant4e. Before the reasons for this are explained, the different representations are discussed first. Let $\vec{x} = (x, y, z)$ be the position vector, $\vec{p} = (p_x, p_y, p_z)$ the momentum vector in the cartesian detector coordinate system, and $q$ the charge of the particle. In plane-based representations, let $\vec{O}$ be a support vector of the plane, $\vec{u}$ and $\vec{v}$ the unit vectors spanning the 2-dimensional plane coordinate system, and $\vec{n}$ the unit vector perpendicular to the plane:

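GenFit 5-dim.:

• $q/p = q / |\vec{p}|$

• $u' = (\vec{p} \cdot \vec{u}) / (\vec{p} \cdot \vec{n})$

• $v' = (\vec{p} \cdot \vec{v}) / (\vec{p} \cdot \vec{n})$

• $u = (\vec{x} - \vec{O}) \cdot \vec{u}$

• $v = (\vec{x} - \vec{O}) \cdot \vec{v}$

G4e surface state:

• $1/p = 1 / |\vec{p}|$

• $u' = (\vec{p} \cdot \vec{u}) / (\vec{p} \cdot \vec{n})$

• $v' = (\vec{p} \cdot \vec{v}) / (\vec{p} \cdot \vec{n})$

• $u = (\vec{x} - \vec{O}) \cdot \vec{u}$

• $v = (\vec{x} - \vec{O}) \cdot \vec{v}$

GenFit 7-dim.:

• $x = x$

• $y = y$

• $z = z$

• $a_x = p_x / |\vec{p}|$

• $a_y = p_y / |\vec{p}|$

• $a_z = p_z / |\vec{p}|$

• $q/p = q / |\vec{p}|$

G4e free state [22] [23]:

• $1/p = 1 / |\vec{p}|$

• $\lambda = \pi/2 - \theta$

• $\phi = \operatorname{atan2}(p_y, p_x)$

• $y_\perp = -\sin(\phi) \cdot x + \cos(\phi) \cdot y$

• $z_\perp = -\sin(\lambda)\cos(\phi) \cdot x - \sin(\lambda)\sin(\phi) \cdot y + \cos(\lambda) \cdot z$

In the spherical free trajectory state, $\lambda$ and $\phi$ are the dip angle and the azimuthal angle of the momentum in spherical coordinates, with the polar angle $\theta = \arccos\left( p_z / \sqrt{p_x^2 + p_y^2 + p_z^2} \right)$.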

The $(x_\perp, y_\perp, z_\perp)$ coordinate system of the free trajectory state is an orthonormal cartesian system, where $x_\perp$ points in the direction of the track and $y_\perp$ lies parallel to the xy-plane. The vector $\vec{a}$ of the 7-dim. state denotes the direction.
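The free trajectory state parameters can be computed directly from position and momentum. The following is a minimal sketch of the definitions listed above, not Geant4e code; `FreeStateSketch` and `toFreeState` are hypothetical names, and the azimuthal angle is computed with the usual argument order `std::atan2(p_y, p_x)`, which is what the expressions for $y_\perp$ and $z_\perp$ imply.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Hypothetical container for the five free-state parameters.
struct FreeStateSketch {
    double invP;    // 1/|p|
    double lambda;  // dip angle, pi/2 - theta
    double phi;     // azimuthal angle
    double yPerp;   // transverse coordinate parallel to the xy-plane
    double zPerp;   // second transverse coordinate
};

FreeStateSketch toFreeState(const Vec3& x, const Vec3& p) {
    const double kPi = 3.14159265358979323846;
    const double pMag = std::sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
    assert(pMag > 0.0);  // a direction is needed to define the angles

    const double theta = std::acos(p[2] / pMag);  // polar angle
    const double lambda = kPi / 2.0 - theta;
    const double phi = std::atan2(p[1], p[0]);

    FreeStateSketch s{};
    s.invP = 1.0 / pMag;
    s.lambda = lambda;
    s.phi = phi;
    s.yPerp = -std::sin(phi) * x[0] + std::cos(phi) * x[1];
    s.zPerp = -std::sin(lambda) * std::cos(phi) * x[0]
              - std::sin(lambda) * std::sin(phi) * x[1]
              + std::cos(lambda) * x[2];
    return s;
}
```

For a track along the x axis both angles vanish, so $y_\perp$ and $z_\perp$ reduce to the ordinary y and z coordinates, which is a convenient sanity check for the sign conventions.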

As one can easily notice, the free trajectory state differs most from all other representations, while the surface trajectory state is almost equal to the standard 5-dimensional GenFit representation. For this reason, the surface trajectory state was the first choice at the beginning of the development. Unfortunately, several serious problems with the extrapolation of the G4eTrackRep occurred, which were at this point inexplicable and unsolvable and led to an interim abandonment of this project. More on these malfunctions can be found in the dedicated section 5.1.5.

In the early summer of 2012, basf2 developer L. Piilonen committed a new module used for track extrapolation from the CDC to the outer detectors. For this extrapolation, the G4ext module also uses Geant4e. In this module, the extrapolation finally worked properly, after it had faced problems similar to those of the G4eTrackRep. One fundamental difference in the design of G4ext was the usage of the free trajectory state instead of the surface trajectory state. This led the author to implement this in the G4eTrackRep as well, including the more complex conversion of the covariance matrix.

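Coordinate transformations of covariance matrices are done using the Jacobian matrices:

$$C' = J \cdot C \cdot J^T$$

with the Jacobian

$$J = \begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \cdots & \frac{\partial f_m}{\partial x_n}
\end{pmatrix}.$$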
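The transformation rule $C' = J \cdot C \cdot J^T$ can be sketched for 5 × 5 matrices without any matrix library. The code below is an illustration only: `Matrix5`, `transformCovariance` and `chargeJacobian` are hypothetical names, and the example Jacobian covers just the change between the 'charge / momentum' and '1 / momentum' parameters for a particle of charge +1 or -1.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kDim = 5;
using Matrix5 = std::array<std::array<double, kDim>, kDim>;

// C' = J * C * J^T
Matrix5 transformCovariance(const Matrix5& J, const Matrix5& C) {
    Matrix5 JC{};  // zero-initialised
    for (std::size_t i = 0; i < kDim; ++i)
        for (std::size_t j = 0; j < kDim; ++j)
            for (std::size_t k = 0; k < kDim; ++k)
                JC[i][j] += J[i][k] * C[k][j];

    Matrix5 out{};
    for (std::size_t i = 0; i < kDim; ++i)
        for (std::size_t j = 0; j < kDim; ++j)
            for (std::size_t k = 0; k < kDim; ++k)
                out[i][j] += JC[i][k] * J[j][k];  // J[j][k] is (J^T)[k][j]
    return out;
}

// Diagonal Jacobian for q/p -> 1/p; all other parameters are unchanged.
Matrix5 chargeJacobian(int q) {
    assert(q == 1 || q == -1);
    Matrix5 J{};
    for (std::size_t i = 0; i < kDim; ++i) J[i][i] = 1.0;
    J[0][0] = 1.0 / q;
    return J;
}
```

With this diagonal Jacobian, the product only flips the sign of the off-diagonal entries in the first row and first column for a negative charge, i.e. it amounts to multiplying the first row and the first column by the charge.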

For the transformation between the 5-dimensional GenFit representation and the surface trajectory state, this matrix is quite sparse:³

$$J_{GF5 \to G4eS} = \begin{pmatrix}
\frac{1}{q} & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

³ The Jacobian matrix for the back transformation $J_{G4eS \to GF5}$ is left out here, as it is the same, but with flipped numerator and denominator in all charge-containing entries.

The transformation from GenFit to the free trajectory state is more complex. Its Jacobian can't be shown in matrix format here, as it would not fit on a page; it is displayed element-wise instead. The transformation from GenFit's 5-dimensional representation to the free trajectory state uses:

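$$\begin{aligned}
J_{GF5 \to G4eF}[1][1] &= q \\
J_{GF5 \to G4eF}[2][2] &= \mathrm{spu} \cdot \frac{-p_x \cdot p_z \cdot u_x - p_y \cdot p_z \cdot u_y + p_\perp^2 \cdot u_z}{|p| \cdot p_\perp} \\
J_{GF5 \to G4eF}[2][3] &= \mathrm{spu} \cdot \frac{-p_x \cdot p_z \cdot v_x - p_y \cdot p_z \cdot v_y + p_\perp^2 \cdot v_z}{|p| \cdot p_\perp} \\
J_{GF5 \to G4eF}[3][2] &= \mathrm{spu} \cdot \frac{p_x \cdot u_y - p_y \cdot u_x}{p_\perp^2} \\
J_{GF5 \to G4eF}[3][3] &= \mathrm{spu} \cdot \frac{p_x \cdot v_y - p_y \cdot v_x}{p_\perp^2} \\
J_{GF5 \to G4eF}[4][4] &= -\sin\phi \cdot u_x + \cos\phi \cdot u_y \\
J_{GF5 \to G4eF}[4][5] &= -\sin\phi \cdot v_x + \cos\phi \cdot v_y \\
J_{GF5 \to G4eF}[5][4] &= -\sin\lambda \cdot \cos\phi \cdot u_x - \sin\lambda \cdot \sin\phi \cdot u_y + \cos\lambda \cdot u_z \\
J_{GF5 \to G4eF}[5][5] &= -\sin\lambda \cdot \cos\phi \cdot v_x - \sin\lambda \cdot \sin\phi \cdot v_y + \cos\lambda \cdot v_z,
\end{aligned}$$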

with $\vec{p} = \mathrm{spu} \cdot (\vec{n} + u' \cdot \vec{u} + v' \cdot \vec{v})$ in the plane coordinate system and $p_\perp$ denoting its transverse component, i.e. $\rho$ in cylindrical coordinates. Entries not mentioned are zero. The Jacobian was taken from [24] and cross-checked by calculation and unit testing. For the back transformation, the same Jacobian was calculated and inverted. As every entry is a derivative evaluated at a non-constant state vector, the entries of this matrix are non-constant as well and must be recalculated every time the transformation takes place.
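As a plausibility check, the position entries [4][4] and [5][4] can be derived by hand. The sketch below assumes that the angles $\lambda$ and $\phi$ depend only on the direction parameters $u'$ and $v'$ and are therefore held fixed when differentiating with respect to $u$; the same steps with $v$ in place of $u$ give the entries [4][5] and [5][5].

```latex
\begin{aligned}
\vec{x} &= \vec{O} + u \cdot \vec{u} + v \cdot \vec{v}
  \quad\Rightarrow\quad
  \frac{\partial \vec{x}}{\partial u} = \vec{u} = (u_x, u_y, u_z), \\[4pt]
y_\perp &= -\sin\phi \cdot x + \cos\phi \cdot y
  \quad\Rightarrow\quad
  \frac{\partial y_\perp}{\partial u}
    = -\sin\phi \cdot u_x + \cos\phi \cdot u_y, \\[4pt]
z_\perp &= -\sin\lambda \cos\phi \cdot x - \sin\lambda \sin\phi \cdot y + \cos\lambda \cdot z \\
  &\Rightarrow\quad
  \frac{\partial z_\perp}{\partial u}
    = -\sin\lambda \cos\phi \cdot u_x - \sin\lambda \sin\phi \cdot u_y + \cos\lambda \cdot u_z .
\end{aligned}
```

Both results agree with the listed entries, which supports the reading that the position block of the Jacobian is simply the projection of the plane vectors onto the transverse axes of the free state.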

5.1.4 Run Management

Another issue with Geant4e was its run management. Although it sounds easy to just set up the needed classes, several problems occurred during the initialization. These problems were caused by running Geant4 and Geant4e in parallel. As both use the same program kernel, e.g. for stepping, material effects etc., they can't just be initialized in parallel, as each would have to be run in an exclusive environment. Like other peculiarities of Geant4e, this problem is not mentioned anywhere, nor is there any documented solution for it.

In earlier versions of the G4eTrackRep, this initialization was done in the same way as in the Geant4e example program (which can be found in $G4INSTALL/examples/extended/errorpropagation/errprop.cc). This way is quite simple:


    void Initialize()
    {
        // Initialize Stepping
        G4VSteppingVerbose::SetInstance(new G4SteppingVerbose);

        // Initialize the GEANT4e manager classes
        G4ErrorPropagatorManager* g4emgr =
            G4ErrorPropagatorManager::GetErrorPropagatorManager();
        G4ErrorPropagatorData* g4edata =
            G4ErrorPropagatorData::GetErrorPropagatorData();

        // Detector class attached
        g4emgr->SetUserInitialization(new ExErrorDetectorConstruction);
        ...
        // Final initialization
        g4emgr->InitGeant4e();
        ...
    }

This approach worked properly for one instance of Geant4e with no Geant4 running in parallel. However, in both testing environments and in the basf2 framework, the track representations need to run in parallel with Geant4, and most GenFit applications need to run multiple instances of track representations in parallel. When the track representation was tested in the first testing environment, its initialization failed because the Geant4 kernel state was blocked, and the program quit by throwing an exception⁴. In the pure GenFit environment of the second testing environment (GenFit's modified PullsTest program), where multiple instances of the selected track representation are initialized, the program crashed after initializing the second track representation (a single one worked well).

As this behaviour is unacceptable, another way of run management had to be found which allows running the G4eTrackRep in parallel to Geant4 and with other instances of Geant4e. At first, no solution was found for this, which was the second reason for the previously mentioned temporary abandonment of this project.

⁴ To be able to continue with further development of the G4eTrackRep, this was circumvented by splitting the one testing program into two, one for hit creation using Geant4, and the other for fitting the hits with the G4eTrackRep.


    *** G4Exception : RunInitializationAtIncorrectstate
          issued by : G4RunManagerKernel::RunInitialization
    Geant4 kernel not in Idle state : method ignored.
    *** This is just a warning message.

Figure 5.3: First warning after initialization...

    ERROR - G4ErrorPropagator::PropagateOneStep()
      Called before initialization is done for this track.
      Please call G4ErrorPropagatorManager::InitGeant4e().
    *** G4Exception : Invalidcall
          issued by : G4ErrorPropagator::PropagateOneStep()
    Called before initialization is done for this track!
    *** Fatal Exception *** core dump ***

Figure 5.4: ...before it crashes.

Again, with the release of G4ext, which is compatible with other instances of Geant4, it was possible to see a working implementation of Geant4e without these problems, and thus to adapt its run management for the G4eTrackRep. This way of run management turned out to be considerably more complex, as can be seen in the following code cited from the G4eTrackRep::Initialize() method⁵.

    void testTrackRep::Initialize()
    {
        // Initialize the GEANT4e manager classes
        g4eMgr = G4eTrackRepManager::GetManager();
        g4eData = G4ErrorPropagatorData::GetErrorPropagatorData();

        // Check if Geant4/Geant4e is already running
        if (G4ParticleTable::GetParticleTable()->entries() == 0) {
            // G4eTrackRep will run without Geant4
            // Run Manager is not used
            g4RunMgr = NULL;
            g4trk = NULL;
            g4stp = NULL;

            // Detector class attached
            DetectorConstruction* detector = new DetectorConstruction();
            g4eMgr->SetUserInitialization(detector);

            // Region is set
            G4Region* region = (*(G4RegionStore::GetInstance()))[0];
            region->SetProductionCuts(
                G4ProductionCutsTable::GetProductionCutsTable()
                    ->GetDefaultProductionCuts());

            // New G4VPhysicsList is invoked
            g4eMgr->SetUserInitialization(new PhysList());

            // Mag. field is created
            fieldMgr = G4TransportationManager::GetTransportationManager()
                           ->GetFieldManager();
            G4UniformMagField* magField =
                new MagField(G4ThreeVector(0., 0., .0015*tesla));
            fieldMgr->SetDetectorField(magField);
            fieldMgr->CreateChordFinder(magField);

            // Final initialization
            g4eMgr->InitGeant4e();
        } else {
            // ... runs simultaneously!
            // existing RunManager is used;
            // if non-existent, initialization is done anyway
            g4RunMgr = G4RunManager::GetRunManager();
            if (g4RunMgr != NULL) {
                g4trk = const_cast<G4UserTrackingAction*>(
                    g4RunMgr->GetUserTrackingAction());
                g4stp = const_cast<G4UserSteppingAction*>(
                    g4RunMgr->GetUserSteppingAction());
            }
            g4eMgr->InitGeant4e();
            G4StateManager::GetStateManager()->SetNewState(G4State_Idle);
        }
        G4cout << "Initialisation done.\n";
    }

⁵ Some minor commands unrelated to the run management are left out.

As can be seen, the initialization is split into two parts: one for an initialization without other instances of Geant4/Geant4e, and the other for a simultaneous run. The decision is made by checking whether the particle table already has entries. This distinction is one of the important parts of the new run management, but the least complicated one.

Another important part is the introduction of a new, adjusted G4VPhysicsList class, in which the particles and physical effects are defined. If no such class is attached to the RunManager, Geant4/Geant4e takes its standard implementation. The new class differs from the normal physics list in that all particles used for the propagation are redefined under new names and PIDs. These new particles are exact copies of the 'old' ones, but with their names changed to, for example, 'g4e muon' instead of 'muon', and with PID numbers shifted by an offset of 1000000 (or -1000000 for negative numbers). Geant4e then uses only these dedicated particles for tracking, which prevents it from interfering with the tracking of Geant4.

Besides these two modifications, several others were made. The complete run management requires further helper classes, which were implemented for the G4eTrackRep in the same way as for G4ext. Also, the original Geant4e classes were not used directly, but through new classes wrapping them, which add some minor features and checks. Their usage has basically remained the same. A description of these classes would go beyond the scope of this thesis and is not necessary for a basic understanding of the run management of the G4eTrackRep.
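The PID shift described above is a simple mapping. The sketch below only illustrates the arithmetic stated in the text (offset 1000000, sign preserved); the function names are hypothetical and the actual physics list code of G4ext and the G4eTrackRep is not reproduced here.

```cpp
#include <cassert>

constexpr int kG4ePdgOffset = 1000000;  // offset quoted in the text

// Map an ordinary particle ID to the ID of its dedicated Geant4e copy.
int toG4ePdg(int pdg) {
    assert(pdg != 0);  // 0 is not a valid particle ID in this sketch
    return pdg > 0 ? pdg + kG4ePdgOffset : pdg - kG4ePdgOffset;
}

// Inverse mapping: recover the ordinary particle ID from a shifted one.
int fromG4ePdg(int g4ePdg) {
    assert(g4ePdg > kG4ePdgOffset || g4ePdg < -kG4ePdgOffset);
    return g4ePdg > 0 ? g4ePdg - kG4ePdgOffset : g4ePdg + kG4ePdgOffset;
}
```

Because the offset is applied away from zero, particle and antiparticle copies keep opposite signs, so sign-based logic on the IDs continues to work for the shifted set.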

With these many modifications, no further problems occurred in getting the G4eTrackRep running alongside Geant4/Geant4e instances.

5.1.5 Malfunctions in Matrix Propagation and Track Fitting

As mentioned several times before, several serious problems occurred during the development of this track representation. They made it impossible to complete a properly running version, which led to the final abandonment of this project.


Malfunctions before G4ext adaptations

Figure 5.5: Same view in the testing environment as in fig. 5.1, but using the G4eTrackRep with free trajectory state for the re-fit. The extrapolation fails after a short distance.

One major problem was found in the run management. As the focus of this section lies on the malfunctions of the extrapolation, this problem and its solution are described in the previous section.

The second major problem was the behaviour of the propagation of the covariance matrix, which is one of the central tasks of the G4eTrackRep. During the extrapolation the covariance matrix, figuratively speaking, 'exploded': after a few steps, its entries reached values up to $O(10^{100})$, which caused the track fit to crash and (mostly) left the covariance matrix with 'not a number' ('NaN') entries. Such a failed fit can be seen in fig. 5.5.

However, the extrapolation of the state vector itself mostly worked, so the error seemed to be located in the matrix propagation. Interestingly, this error did not occur (or occurred less severely) when the covariance matrix was zero. L. Piilonen stated in [25] that he was experiencing the same problems.

An attempt was made to solve this problem by adapting the extrapolation structure of G4ext to the G4eTrackRep. For this, the formerly used surface trajectory state with its simple structure was replaced by the free trajectory state, together with the matrix conversions described in section 5.1.3.

Malfunctions after G4ext adaptations

After the changes to the run management taken from G4ext had been implemented successfully, which led to a version that did not crash on initialization and extrapolation in any environment, the changes concerning the extrapolation and the matrix conversions were implemented with high expectations.

Unfortunately, this led to further problems. The matrix then contained much more reasonable values after each step. Nevertheless, the process failed.

It is interesting that the G4e works reasonably and the G4eTrackRep does not, especially when used by the Kalman fitter. One difference between the usage of Geant4e by the bare extrapolation on the one hand and by the Kalman fit on the other hand is that the Kalman fitter extrapolates the same track from varying positions with varying covariances back and forth many times, changing the direction of propagation several times, whereas a bare extrapolation performs just one extrapolation from one initial state and initial covariance. This behaviour could be a hint that Geant4e's propagation procedure still contains smaller errors, which do not show up in a single extrapolation as in G4ext, but build up in a chain of multiple extrapolations in the G4eTrackRep.

Another problem is the proper implementation of the magnetic field for Geant4e. The G4eTrackRep works partly reasonably with the magnetic field switched off or set to a very small value. After a slight increase of the field the extrapolation fails completely, with discontinuous jumps, often setting the z-values of the position to ±50 cm.

This suggests that the magnetic field is not implemented correctly, but it could not be determined whether this is an issue of Geant4e or of the G4eTrackRep. It remained unknown to the author how the magnetic field has to be implemented correctly. Neither adding a field class to the detector class, as was done in the example program, nor attaching a field class directly to Geant4's field manager led to success. Again, this issue might have been solved if the usage of Geant4e were better documented.

So far, it has not been possible to obtain a properly working version of the G4eTrackRep, which unfortunately means that this project had to be abandoned once again.

Spread of G4e and Other Users' Experience

During the development of this track representation, it became necessary to get into personal contact with other users and the developers of Geant4e, for lack of sufficient documentation. The Geant4 documentation provides only one section about the usage of Geant4e, briefly describing the basic classes and their handling. Additionally, a simple example program showing its basic usage can be found in the Geant4 directory $G4INSTALL/examples/extended/errorpropagation [20].

Geant4e does not seem to be widely used. Apart from L. Piilonen's G4ext module and the example program, the author could not find any evidence that Geant4e has ever been used successfully [25].

A check of the bug reports for all Geant4 versions revealed two submitted bug fixes concerning Geant4e. These were submitted by M. Wysocki, a member of the PHENIX experiment. He found the bugs while constructing a Kalman fitter based on Geant4e. This is an interesting case, as a Kalman fitter with Geant4e is basically the same as the G4eTrackRep, which is mostly used by GFKalman for its extrapolation parts. The bugs he found were reported and fixed in subsequent Geant4 versions, but unfortunately he never got it to work either [26].

From correspondence with the author of Geant4e, P. Arce, CERN, it could be gathered that Geant4e was not really used by anyone, not even by himself, except for the example program. All this suggests that Geant4e has never been tested properly and thus does not work correctly [27].

This lack of dependability is unusual and was not expected at the beginning of this project, as Geant4 itself is widely used software and an industry standard in its field, with many applications in medical engineering and radiology. However, none of these applications needs the error propagation of covariance matrices, as this is purely an issue of particle physics.

Unfortunately, the complexity of these bugs exceeds the scope of this diploma thesis; they could neither be found nor fixed.

Chapter 6

Performance Optimisations on GenFit

Outrageously fast systems make mistakes outrageously fast.
Stanislaw Lem

The enormous amount of data produced by the Belle II detector places high demands on the efficiency of data processing. This includes, next to memory efficiency, fast and well optimised processing that consumes as little CPU time as possible. This chapter focuses on the question of where there is room for optimisation in the basf2 tracking modules, and in addition it presents a way to speed up the fitting algorithm within GFKalman by optimising matrix calculations. In the next chapter, possible ways of achieving faster geometry navigation will be described.

6.1 Profiling

Before any optimisation can be done, one must determine how much time each part of the program consumes and which parts provide room for optimisation. For this, two major approaches were taken. First, the inbuilt basf2 module was used to see the time consumption of the tracking modules relative to the other modules usually used. Secondly, Valgrind was used, which allows a detailed analysis of the CPU time consumed by each process¹.

¹ All benchmarks, performance tests and all profiling were run on a desktop PC with an Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz processor, 3.759 GB RAM, with operating system Ubuntu 10.04 Lucid Lynx and kernel 3.0.0-26-generic.


6.1.1 Profiling using inbuilt basf2 module

The inbuilt statistics module of basf2 can be used for a simple profiling of the time consumption of the different modules used in a basf2 script. To take a quick look at how much time GenFit (or rather its module GenFitter) takes in comparison with the other modules, this module is used in a simple script. For this, the MCFitting.py from the basf2 examples was used. For the results, see figure 6.1.

Event Statistics:
====================================================
Name          | Calls | Time(s) | Time(ms)/Call
----------------------------------------------------
EvtMetaGen    |     2 |   0.000 |         0.014
Geometry      |     1 |   0.000 |         0.001
EvtMetaInfo   |     1 |   0.000 |         0.054
Gearbox       |     1 |   0.000 |         0.001
ParticleGun   |     1 |   0.001 |         1.493
Fullsim       |     1 |   1.748 |      1747.689
CDCDigi       |     1 |   0.015 |        15.259
MCTrackFinder |     1 |   0.005 |         4.678
GenFitter     |     1 |   0.210 |       210.367
simpleoutput  |     1 |   0.488 |       487.740
----------------------------------------------------
Total         |     2 |   2.469 |      1234.547
====================================================

Figure 6.1: Output of basf2 profiling

When examining these numbers, it must be kept in mind that for a real event, the track fitter must be called five times, once for each particle hypothesis². As the MCTrackFinder³ is a very simple track finding algorithm, it needs only a tiny fraction of the time of the GenFitter. Future track finders will certainly take much more time. But even compared to the FullSim module, which carries out the simulation of the analysed event, the time consumed by the fitting lies in the same order of magnitude⁴.

A closer look at the time consumption of the different parts of the GenFitter is given in figure 6.2. The time for the initialization and termination does not play a role, because they are only called once in the whole script. The other parts (beginRun(), event() and endRun()) are called repeatedly. Among them, only event() plays a role; the others are negligible.

² The fit must be done for pions, kaons, electrons/positrons, muons and protons.
³ It just takes the Monte Carlo truth to 'find' the track.
⁴ $O(5 \cdot 210.4\,\mathrm{ms}) \approx O(1747.7\,\mathrm{ms})$.

Module GenFitter:
->initialize(): 161.910 ms, 1 calls, 161.910 ms/call
->beginRun():     0.001 ms, 1 calls,   0.001 ms/call
->event():      210.367 ms, 1 calls, 210.367 ms/call
->endRun():       0.050 ms, 1 calls,   0.050 ms/call
->terminate():    0.002 ms, 1 calls,   0.002 ms/call

Figure 6.2: Closer look on the GenFitter module.

6.1.2 Profiling using Valgrind

To get a closer look at which parts of the fit take the most resources, the tool Valgrind is used [28]. Valgrind is a toolkit for debugging and profiling programs. It sets up a virtual processor, which executes the program. This allows a deep analysis of the execution structure, including the consumed time, and an efficient localization of runtime errors.

The time analysis, referred to as 'profiling', is done by Callgrind [29], which can be invoked by a command line option. As Callgrind measures the number of CPU clock cycles instead of the exact time, these numbers are comparable across systems and not only valid for the particular one on which the program was executed.

Callgrind writes its measurements to a file, which can be visualized by KCachegrind [30]. This provides a detailed look at every called function, showing both the cycles it ran and which subfunctions were called, how often, and how many cycles they needed. A visualized map can be seen in figure 6.3.

The pure cycle numbers are still hard to analyse and to compare within this output. KCachegrind lists these numbers in great detail for every single process and arranges all processes hierarchically, but when a function or a class is called by different parent processes in different hierarchical trees, the numbers are summed up. Fortunately, the graphical 'Callee Map' view of KCachegrind groups similar processes with similar colors, which allows a quantitative overview. Such a map of the relevant ProcessTrack() can be seen in figure 6.3.

KCachegrind has grouped most matrix operations related to TMatrixD in brown and beige colors (some are blue-green), while geometry operations are grouped in green and turquoise (voxel operations). Even at a quick glance it can be seen that both matrix operations and geometry operations play a big role in how fast ProcessTrack() will work. They provide the most room for optimisations.

Figure 6.3: Callee Map of ProcessTrack(). The area of the map represents the consumed CPU time, each tile representing one subprocess. Similar processes are colored similarly; the most time-consuming are TMatrixD operations (brown/beige, some lime green) and TGeo operations (green, turquoise). For a detailed explanation, see 6.1.2.

Figure 6.4: Callee Map of ProcessTrack(), with O2 optimisation turned on.

It is also interesting to compare this map with the one from the optimised version, see figure 6.4. This profile was created with a version of the program which was compiled with the 'O2' option of the g++ compiler [31]. This option makes the compiler look for possible optimisations and apply them to the machine code. A look at this 'Callee Map' reveals that the areas for matrices and geometry have grown even bigger. This is of course not because these operations became slower, but because all other operations became faster. This shows that the possibilities for compiler optimisations of the current algorithms are exhausted. Thus, further optimisations can only be gained by structural modifications of the program.

Based on this analysis, it was estimated that the potential for optimisation gained by exchanging the matrix classes lies between 10 and 20%.

6.2 Alternatives to the ROOT matrices

As seen in the previous section, geometry navigation operations and matrix calculations together account for by far the largest share of the calculation time. Possible optimisations of the geometry navigation were not worked out in this thesis, as they would have gone beyond its scope. Nevertheless, possible new approaches are shown in chapter 7. In this chapter, approaches to optimising the matrix calculations, especially within GenFit's GF class, are demonstrated.

The amount and type of matrix calculations vary within the examined GFKalman::ProcessTrack() method. First, there are the matrix operations needed to calculate the fit, and secondly there are matrix operations within the extrapolate() methods of the track representations called by the fitter. Those of the extrapolation are mostly the calculation of the Jacobians needed for the conversions between the different track parametrizations described in 5.1.3 on page 45. The structure of these matrices is fixed in advance, and they contain many zero entries. One approach, followed by J. Rauch from the GenFit team, is to retrieve the array in which a ROOT TMatrixD stores its entries and to write the calculations directly into it, without using the normal methods of TMatrixD [32]. With this approach, a speed-up of about 20-30% in the extrapolation could be gained.
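The idea of writing directly into the storage array can be sketched as follows. This is not J. Rauch's code: a plain row-major array stands in for the array a matrix class exposes, and the sparsity pattern (identity plus two off-diagonal entries) as well as all names are assumptions chosen only for illustration.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kDim = 5;

// Row-major 5x5 storage, standing in for the entry array of a matrix object.
using Storage = std::array<double, kDim * kDim>;

// Fills a Jacobian whose non-zero pattern is known in advance.
// Only the entries that can be non-zero are written individually;
// no generic element accessors or matrix products are involved.
void fillJacobian(Storage& j, double dxdu, double dydv) {
    j.fill(0.0);
    for (std::size_t i = 0; i < kDim; ++i) {
        j[i * kDim + i] = 1.0;       // diagonal entries
    }
    j[3 * kDim + 1] = dxdu;          // row 3, column 1 (hypothetical entry)
    j[4 * kDim + 2] = dydv;          // row 4, column 2 (hypothetical entry)
}
```

The gain comes from skipping bounds checks, temporary objects and multiplications by entries that are known to be zero.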

For the more varied calculations in the Kalman fitter, this approach is not suitable. Most operations here are inversions and multiplications of matrices whose form is not known beforehand. This suggests a different approach: replacing the TMatrixD matrix class from ROOT by another one which provides higher-performance calculations.

To find a suitable algebra package, a matrix benchmark program was created. This program runs standalone and was designed as a template class providing different methods to test the performance of specific matrix operations. This template class can load and analyse any matrix data type for which the basic arithmetic operators, i.e. '+ − ∗ /', are overloaded.

Its core is a class called benchmatr. After it has been initialized with an arbitrary matrix class, various benchmarking actions, such as measuring the performance of filling, multiplying and transforming, can be executed and evaluated. For the time measurement, the very precise timer package from the boost library [33] is used. It can precisely measure the time consumed between two calls.

For a benchmarking process, a boost timer object is initialized and started. Then the operation of interest is executed over and over again, after which the timer measures the elapsed time. From this, an estimated time per examined operation can be calculated. Measuring the mean time in this way is necessary because the time a single operation takes is too short to be measured directly.

An interesting effect could be seen here. If the same operation was repeated over and over again and the program was compiled with the optimisation options of the g++ compiler [31], the repetitions were optimised out and calculated just once, which massively distorted the measured values. This led to measured times of fractions of nanoseconds, which is simply impossible.

This behaviour could be suppressed by changing single entries of the matrices in a way which is non-deterministic for the compiler. For this, the standard C++ random number generator was used, which can be regarded as non-deterministic. This is, of course, an intervention in the measurement and can thus bias the results. However, as it is a single operation, its effect on the measured time should be negligible.
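The measuring procedure described above can be sketched in a few lines. This is not the benchmatr program: `std::chrono` is used here in place of the boost timer, a hand-written 5x5 multiplication stands in for the matrix classes under test, and all names are hypothetical. A seeded generator is deterministic in the strict sense; what matters is that the compiler cannot predict the values and therefore cannot hoist the operation out of the loop.

```cpp
#include <chrono>
#include <cstddef>
#include <random>

struct Mat5 { double a[5][5]; };

Mat5 multiply(const Mat5& x, const Mat5& y) {
    Mat5 r{};
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 5; ++j)
            for (int k = 0; k < 5; ++k)
                r.a[i][j] += x.a[i][k] * y.a[k][j];
    return r;
}

// Returns the mean time per multiplication in nanoseconds.
// 'checksum' consumes every result so that no repetition can be discarded.
double meanMultiplyTimeNs(std::size_t repetitions, double& checksum) {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    Mat5 x{}, y{};
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 5; ++j) { x.a[i][j] = dist(rng); y.a[i][j] = dist(rng); }

    checksum = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t n = 0; n < repetitions; ++n) {
        x.a[n % 5][(n / 5) % 5] = dist(rng);   // perturb one entry per repetition
        const Mat5 r = multiply(x, y);
        checksum += r.a[0][0];
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(repetitions);
}
```

The perturbation and the checksum are part of the timed loop, so the result slightly overestimates the pure multiplication time, which is the bias mentioned in the text.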

The benchmark was executed with the following matrix classes:

• TMatrixD⁵ from ROOT, which is used as the standard within GenFit.

• the standard matrices from the UBLAS package of the widely used boost library [33].

• the static matrices from the Eigen library [34].

• the dynamic matrices from the Eigen library.

The selection of these matrices was based on recommendations from within the Belle II collaboration and from colleagues from other Karlsruhe institutes.

The results of this benchmark session were surprising in several ways. First, the widely used and well-renowned boost matrices performed far worse than expected, while the matrices from the Eigen library were stunningly fast, even compared to the better-than-expected TMatrixD. Among the Eigen matrices, the static ones had clear advantages over the dynamically allocated ones. The exact results can be seen in figure 6.5. The gain of the static Eigen matrices over the TMatrixD was, depending on the dimensionality, up to a factor of 40-50 [sic!].

⁵ TMatrixD is a typedef for TMatrixT<double>.


[Bar chart: elapsed time in ms for filling and multiplication, for boost, TMatrixD, Eigen (dyn.) and Eigen (stat.) matrices. Bar labels as extracted: 420, 140, 43, 38, 3600, 820, 350, 14.]

Figure 6.5: Results of the matrix package benchmark.

The poor performance of the boost matrices can be explained by their optimisation for calculations with large dimensions, in contrast to the Eigen matrices⁶.

All this led to the decision to use the static Eigen matrices within GFKalman. The static variant was chosen over the dynamic one because of its additional gain in speed.

6.2.1 The Eigen Matrices

The Eigen library is a free C++ algebra library with very fast and versatile components [34]. Its design is based on an extensive use of templates, which provides several unique advantages (but also some disadvantages regarding a flexible implementation in a ROOT environment). This section gives a quick overview of Eigen, focussing on its static matrix classes.

Eigen relies heavily on template metaprogramming, a programming technique which makes it possible for important configuration decisions to be made not by the program itself at run time, but by the compiler at compile time. For example, this can be used to determine the size of matrices already during compilation and to select the calculation algorithms that fit this size best. Also, with a known size, the memory for the matrices can be allocated statically on the stack and does not need to be allocated dynamically at run time, which again gives an appreciable gain in speed⁷.

⁶ "[...] the main development was on (large) dynamic sized matrices and sparse or structured matrices", as stated in the boost FAQ [35].
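The effect of fixing the matrix size at compile time can be shown with a small sketch. It is not how Eigen is implemented; the class `StaticMatrix` and its interface are assumptions made for this illustration. The dimensions are template parameters, so the storage is a fixed-size array inside the object (no heap allocation), and a product with mismatching inner dimensions does not compile.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size matrix: dimensions are part of the type.
template <std::size_t Rows, std::size_t Cols>
class StaticMatrix {
public:
    double& operator()(std::size_t r, std::size_t c) { return data_[r * Cols + c]; }
    double operator()(std::size_t r, std::size_t c) const { return data_[r * Cols + c]; }
    static constexpr std::size_t rows() { return Rows; }
    static constexpr std::size_t cols() { return Cols; }

private:
    std::array<double, Rows * Cols> data_{};  // lives inside the object, zero-initialized
};

// The inner dimension K must match for both operands; the loop bounds are
// compile-time constants, which lets the compiler unroll them.
template <std::size_t R, std::size_t K, std::size_t C>
StaticMatrix<R, C> operator*(const StaticMatrix<R, K>& a, const StaticMatrix<K, C>& b) {
    StaticMatrix<R, C> out;
    for (std::size_t i = 0; i < R; ++i)
        for (std::size_t j = 0; j < C; ++j)
            for (std::size_t k = 0; k < K; ++k)
                out(i, j) += a(i, k) * b(k, j);
    return out;
}
```

Multiplying a `StaticMatrix<2, 3>` by a `StaticMatrix<3, 1>` yields a `StaticMatrix<2, 1>`, whereas multiplying it by a `StaticMatrix<2, 1>` is rejected by the compiler.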

However, this concept has several disadvantages. First, as a part of GenFit, every class should be compatible with ROOT. At first glance, this should not be a problem, as ROOT is coded in standard C++, too. But several of its concepts cause difficulties: every class in GenFit is added to a dictionary for the ROOT streamer, both of which are created automatically by ROOT's own C++ interpreter ROOTCINT. This is done by parsing the header files of the classes concerned. Unfortunately, ROOTCINT cannot handle modern standard C++ such as the multiply nested templates used within Eigen. This led to big challenges in designing the new GFKalman class, as the solution was to avoid every appearance of Eigen objects within the header file.

Secondly, GenFit makes high demands on flexibility, as its name implies ('Generic Fitting Framework'), especially regarding the size of the matrices. Because any track representation with an arbitrary parametrization, and thus dimensionality, can be used, the classes holding and processing these parameters must be flexible as well. This includes, of course, the ability to resize the matrix classes within GFKalman. Nevertheless, this constraint can be circumvented by several tricks, shown in the next section.

6.3 Performance Optimisations at the GFKalman Class

As mentioned before, some adjustments were necessary to make the new GFFastKalman class compatible with ROOT, whose C++ interpreter ROOTCINT is not fully capable of modern C++. To overcome this hurdle, the GFFastKalman was designed without any appearance of Eigen objects within its header. This was possible because only two private methods and no public ones pass matrices. These two methods, calcGain() and chi2Increment(), were taken out of the class and implemented as free functions in the source file. This might not be the most sophisticated solution regarding design aspects, but it is certainly sufficient for these needs.

⁷ Allocating memory on the stack is much faster than on the heap because it only involves incrementing/decrementing pointers within the program's own memory, whereas allocating heap memory involves the more complex memory management of the operating system, with additional overhead.

6.3.1 Team Play with ROOT Matrices

As stated above, at various points within the Kalman fitter other methods and classes are called, with both varying dimensions and a fixed TMatrixD interface. This makes it necessary to find a way which provides these functionalities and at the same time has the performance of the Eigen matrices.

A suitable way to reconcile these two needs is to implement both kinds of matrices and make them share the same data array. Fortunately, both TMatrixD and the Eigen matrices provide the functionality to use an external data array. Since the advantages of the stack-based arrays are desirable, the TMatrixD uses the data array of the Eigen matrix, and not vice versa. This is done in the following way:

EigMat state(5, 1);
EigMat cov(5, 5);

TMatrixD state_temp;
TMatrixD cov_temp;

// let the ROOT matrices use the arrays owned by the Eigen matrices
state_temp.Use(5, 1, state.data());
cov_temp.Use(5, 5, cov.data());

Now, no matter which matrix is changed, the changes appear in both matrices, as both share identical content. Moreover, both classes provide their full functionality⁸, so the fast calculations of Eigen can be used alongside the TMatrixD interface.
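The sharing mechanism can be reproduced without ROOT or Eigen in a minimal sketch. Both classes below are hypothetical stand-ins: `OwningMatrix` plays the role of the Eigen matrix that owns the array, `MatrixView` plays the role of the TMatrixD after the call to Use().

```cpp
#include <array>
#include <cstddef>

// Owns the storage; row-major layout.
struct OwningMatrix {
    std::array<double, 25> buf{};
    std::size_t rows = 5;
    std::size_t cols = 5;
    double* data() { return buf.data(); }
    double& operator()(std::size_t r, std::size_t c) { return buf[r * cols + c]; }
};

// Non-owning view onto an external array; must not outlive the owner.
class MatrixView {
public:
    void use(std::size_t rows, std::size_t cols, double* data) {
        rows_ = rows;
        cols_ = cols;
        data_ = data;
    }
    double& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

private:
    std::size_t rows_ = 0;
    std::size_t cols_ = 0;
    double* data_ = nullptr;
};
```

A write through the view is visible in the owner and vice versa, because there is only one array. This only gives matching elements if both sides interpret the array with the same storage order; the sketch uses row-major order on both sides.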

However, the dimensionality of the track representation and of the reconstruction hits used within the fitter is not known in advance, whereas the Eigen matrices need to know their size at compile time. An easy way to solve this is to declare the Eigen matrices with a sufficiently large fixed maximum size; their actual size can then be varied within this maximum. This works in the following way (note that the type EigMat is a typedef):

// the option value 1 of the original corresponds to Eigen::RowMajor
typedef Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic,
                      Eigen::RowMajor, MAXROW, MAXCOL> EigMat;

EigMat myMatrix(5, 4);

The size stated in the constructor (here 5 × 4) can be chosen within the range given by the MAXROW and MAXCOL options. This procedure has only minor effects on the performance, but makes Eigen better suited to GenFit.
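What a compile-time maximum size with a run-time actual size means can be sketched with a small hypothetical class. The name `BoundedMatrix` and the choice to throw on oversize requests are assumptions of this sketch and say nothing about how Eigen reacts in that case.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Capacity is fixed at compile time, the used size is chosen at run time.
template <std::size_t MaxRows, std::size_t MaxCols>
class BoundedMatrix {
public:
    BoundedMatrix(std::size_t rows, std::size_t cols) { resize(rows, cols); }

    // Changes the logical size without any allocation.
    void resize(std::size_t rows, std::size_t cols) {
        if (rows > MaxRows || cols > MaxCols) {
            throw std::length_error("requested size exceeds compile-time capacity");
        }
        rows_ = rows;
        cols_ = cols;
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    double& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

private:
    std::array<double, MaxRows * MaxCols> data_{};  // always MaxRows * MaxCols doubles
    std::size_t rows_ = 0;
    std::size_t cols_ = 0;
};
```

The price of this flexibility is that every object reserves memory for the maximum size, which is acceptable for the small matrices of a track fit.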

⁸ Except for functionalities which alter attributes of the array, such as resizing.


6.3.2 Adjustments

In general, the changes made within the new fitter GFFastKalman wereoriented on the following principles:

• Use Eigen objects wherever it’s possible.

• Use Eigen objects with a shared data array when a TMatrixD interface is needed.

• Use memcpy to copy matrix content when shared array matrices can't be used.

• Create as few temporary objects as possible to reduce calls of constructors and destructors and to reduce memory allocation operations; make them member variables if possible.

The adjustments made to the fitter will be explained in detail using example code. For this purpose, the most important changes within the method processHit(), which is the only function using matrix operations during the fit, are shown here. Only the relevant changes are included, as showing the complete function would take too much space. Additionally, the two free functions are shown as well.

As previously described, the different matrix classes are assigned to the same array⁹.

void
GFFastKalman::processHit(GFTrack* tr, int ihit, int irep, int direction)
...
EigMat state(repDim, 1);
EigMat cov(repDim, repDim);

// the TMatrixD members share the arrays of the Eigen matrices
state_temp.Use(repDim, 1, state.data());
cov_temp.Use(repDim, repDim, cov.data());

Here, the _temp and _ext instances are TMatrixD objects which are declared in the header as member variables, to save calls of constructors and destructors. This is valid because they are overwritten in each call of processHit().

For the required extrapolation, additional matrices (_ext) are initialized without a shared array, to guarantee the resizeability needed by the extrapolation:

⁹ Note that every ... stands for omitted code.


GFDetPlane pl;
...
rep->extrapolate(pl, state_ext, cov_ext);

// copy the extrapolated values into the shared arrays
memcpy(state_temp.GetMatrixArray(), state_ext.GetMatrixArray(),
       state.size() * sizeof(double));
memcpy(cov_temp.GetMatrixArray(), cov_ext.GetMatrixArray(),
       cov.size() * sizeof(double));

To copy the content of the matrices, no for-loops or similar constructs are used, but only memcpy, which is faster.

The following matrices are set up similarly. Note that only the H matrix shares its array, as the others need to be resizeable:

TMatrixD H_temp = hit->getHMatrix(rep);
TMatrixD m_temp(state_temp.GetNcols(), H_temp.GetNcols());
TMatrixD V_temp(H_temp.GetNcols(), H_temp.GetNcols());

// H shares its array between the Eigen matrix and the TMatrixD
EigMat H(H_temp.GetNrows(), H_temp.GetNcols());
H_temp.Use(H_temp.GetNrows(), H_temp.GetNcols(), H.data());
H_temp = hit->getHMatrix(rep);

hit->getMeasurement(rep, pl, state_temp, cov_temp, m_temp, V_temp);

// m and V are copied, as their TMatrixD counterparts must stay resizeable
EigMat m(H_temp.GetNrows(), 1);
EigMat V(H_temp.GetNrows(), H_temp.GetNrows());
memcpy(V.data(), V_temp.GetMatrixArray(), V.size() * sizeof(double));
memcpy(m.data(), m_temp.GetMatrixArray(), m.size() * sizeof(double));

All further calculations are done purely by Eigen matrices:

EigMat res(m_temp.GetNrows(), m_temp.GetNcols());
res = m - (H * state);

// calculate kalman gain
EigMat Gain(cov_temp.GetNrows(), V_temp.GetNcols());
calcGain(cov, V, H, Gain);

// calculate update
EigMat update(state_temp.GetNrows(), state_temp.GetNcols());
update = Gain * res;
...
res = m - (H * state);
double chi2 = chi2Increment(res, H, cov, V);
...

The two free functions called here use pure Eigen as well:

void calcGain(const EigMat& Ecov, const EigMat& EHitCov,
              const EigMat& EH, EigMat& gain)
{
  gain = Ecov * (EH.transpose() * ((EHitCov + EH * Ecov * (EH.transpose())).inverse()));
}

double chi2Increment(const EigMat& Er, const EigMat& EH,
                     const EigMat& Ecov, const EigMat& EV)
{
  EigMat chisq(1, 1);
  chisq = (Er.transpose()) * ((EV - EH * Ecov * (EH.transpose())).inverse()) * Er;
  ...
  return chisq(0, 0);
}
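Written as equations, the three matrix expressions in the code above read as follows. The symbols are introduced here only for readability and are not the thesis's notation: $x$ stands for state, $C$ for cov, $H$ for the H matrix, $m$ and $V$ for the measurement and its covariance, $K$ for Gain and $r$ for res.

```latex
\begin{align}
  r &= m - H x, \\
  K &= C H^{T} \left( V + H C H^{T} \right)^{-1}, \\
  \Delta\chi^{2} &= r^{T} \left( V - H C H^{T} \right)^{-1} r .
\end{align}
```

The plus sign in the gain and the minus sign in the $\chi^2$ increment are transcribed directly from calcGain() and chi2Increment(). The minus sign matches the usual form for a residual evaluated with the already updated state and covariance; whether state and cov have been updated at that point is an assumption, since the corresponding lines are elided in the listing.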

6.3.3 Results

The GFFastKalman class was then tested with a modified version of GenFit's PullsTest¹⁰, with an additional time measurement.

Two times were measured: first the overall time for the whole program, and then the average duration of all calls of the ProcessTrack() method of the (fast) Kalman fitter. For the overall measurement, the standard C type clock_t was used, and for the exact measurement the boost timer was used again, as in the benchmatr program in section 6.2.

These measurements were taken 100 times each: once for the non-optimised GenFit version¹¹, once for the latest GenFit version with J. Rauch's optimised RKTrackRep, and once with both the optimised RKTrackRep and GFFastKalman, to see the full potential of the optimisation.

¹⁰ This program can be found in $GENFIT/test/PullsTest.
¹¹ Revision 575 from May 9, 2012.

They were taken on the author's desktop computer. Of course, a better computer would have achieved faster calculations, but as only the percentage gain is of interest, the absolute run time is not relevant. For the sake of completeness, this data can be found in appendix A.

During this benchmark, the author chose to switch off geometry operations by simply not loading any geometry file. The two optimisation problems, the matrix optimisation and the geometry navigation optimisation discussed later, are independent of each other, as their operations run sequentially and separately, so that their run times simply add up. The geometry was switched off to save computing resources and to get a result that is as exact as possible. Nevertheless, the benchmark was also run with the geometry switched on, without any relevant differences. These results can be found in appendix A, too.

Fortunately, the results turned out as expected. Overall, with the optimised RKTrackRep by J. Rauch combined with the author's GFFastKalman, a speed improvement of 59.4% with respect to the non-optimised version could be gained. This massive improvement of the performance of the ProcessTrack() method marks a significant benefit for Belle II tracking. In large part these improvements come from the optimised RKTrackRep, which alone cuts the runtime in half. The GFFastKalman contributes a smaller gain: with respect to the measurement with only the new TrackRep it gains 11.4%, which is in the expected range. These results can be seen in figures 6.6 and 6.7.

The gains from the GFFastKalman were smaller than those from the RKTrackRep. This is due, as mentioned before, to the much bigger room for optimisation in RKTrackRep, which contains many more matrix operations, and these are less complex than those in the fitter. Nevertheless, the potential for improvement had previously been estimated to lie between 10 and 20%, which is fulfilled by the measured 11.4%.
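The quoted percentages follow from the relative run times of 100, 45.8 and 40.6 shown in figures 6.6 and 6.7. With $t_0$, $t_1$ and $t_2$ denoting the run times of the non-optimised version, the version with the optimised RKTrackRep and the fully optimised version (symbols introduced here only for this calculation):

```latex
\begin{align}
  1 - \frac{t_1}{t_0} &= 1 - \frac{45.8}{100} = 54.2\,\% && \text{(RKTrackRep alone)} \\
  1 - \frac{t_2}{t_0} &= 1 - \frac{40.6}{100} = 59.4\,\% && \text{(both optimisations)} \\
  1 - \frac{t_2}{t_1} &= 1 - \frac{40.6}{45.8} \approx 1 - 0.886 = 11.4\,\% && \text{(GFFastKalman relative to optimised RKTrackRep)}
\end{align}
```

The 11.4% is therefore a gain relative to the already optimised track representation, not relative to the original run time, of which it corresponds to 5.2 percentage points.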


[Bar chart "Optimisation vs. Non-Optimisation": time in percentage of non-opt.; no opt.: 100, trackrep opt.: 45.8, full opt.: 40.6.]

Figure 6.6: Measurement of the CPU time consumption of the ProcessTrack() method, in percent of the unoptimised version.

[Bar chart: time in percentage of the version with only the track representation optimised; trackrep opt.: 100, full opt.: 88.6.]

Figure 6.7: Measurement of the CPU time consumption of ProcessTrack(), comparing the versions with only the track representation optimised and with full optimisation.

Chapter 7

Ideas for Further Performance Optimisations

All models are false but some models are useful.
George E. P. Box

7.1 Simplified Geometry Models

As seen in the last chapter, there is one big field left for further optimisation in the Belle II tracking. Now that the first issue, the slow matrix operations, has been reduced reasonably, the focus of this chapter lies on possible optimisations of the geometry navigation. The RKTrackRep uses the ROOT geometry classes for navigation during the extrapolation. These provide maximum flexibility and precision. However, the material information is not needed with this precision during the fit. Moreover, as seen before, navigation takes a far from negligible fraction of the time needed for the fit. These two points suggest creating a simplified geometry model which provides faster navigation with reasonable accuracy.

The least complex approach to a simplified geometry is to reconstruct the full geometry manually, combining nearby volumes with the same (or similar) material into one volume and smoothing out substructures. But this approach has one big disadvantage: after every little change in the full geometry, the simplified model must be rebuilt in detail. It can be seen as a makeshift solution.

A better solution would be a program that automatically generates a simplified model from the full geometry. One elegant approach was the SiliMap program by K. Rinnert [36] at the CDF experiment, described in detail in the next section.


7.2 SiliMap at CDF

This program was implemented for the CDF experiment for the same reason, namely faster geometry navigation for the track fit. Most of the information in this section was taken from [36]. It was created in the context of K. Rinnert's dissertation at the Institut für Experimentelle Kernphysik in Karlsruhe, to improve the performance of the track fit within the silicon detectors.

The program has two main starting points for performance optimisation:

• Automatically breaking down the full geometry into a less complex, symmetrical one.

• A new navigation for this model, fully exploiting its symmetry.

The construction of the simplified geometry works on the following principle. First, the geometry is voxelized. For this, the geometry of the silicon detectors is binned uniformly in the z and φ directions (fig. 7.1). In the radial direction the binning is not uniform; the division is chosen according to the existing silicon layers, as can be seen in fig. 7.2. The boundaries are chosen such that each layer is merged into one radial bin; the air in between is treated the same way. This is done by the SiliMapScanMod function.
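
The binning scheme described above can be sketched as follows. This is an illustrative Python sketch, not the SiliMapScanMod code; the helper names `uniform_edges` and `radial_edges`, the detector extent and the layer radii are assumptions made for the example.

```python
import math


def uniform_edges(lo, hi, n_bins):
    """Equidistant bin edges, as used for the z and phi directions."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def radial_edges(layers, r_max):
    """Non-uniform radial edges: one bin per layer, one bin per gap.

    `layers` is a sorted list of (r_inner, r_outer) pairs that do not overlap.
    """
    edges = [0.0]
    for r_in, r_out in layers:
        if r_in > edges[-1]:
            edges.append(r_in)   # closes the air gap below this layer
        edges.append(r_out)      # closes the layer itself
    if r_max > edges[-1]:
        edges.append(r_max)      # outermost air gap
    return edges


# Hypothetical toy detector: 100 cm long, two thin silicon layers.
z_edges = uniform_edges(-50.0, 50.0, 10)
phi_edges = uniform_edges(0.0, 2.0 * math.pi, 12)
r_edges = radial_edges([(2.0, 2.1), (4.0, 4.1)], r_max=10.0)
```

Here `r_edges` alternates between air gaps and layers, so each silicon layer ends up in exactly one radial bin, while z and φ keep a constant bin width.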

After the binning is done, the material in these bins is scanned. This is done via GEANT, measuring the material properties¹ by shooting particles through a bin. In this way, the average material properties of each bin are determined. A projection of a scan of the material density in CDF can be seen in fig. 7.3.
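
The averaging step can be illustrated with a toy scan. Note the substitution: instead of shooting particles through the bin with GEANT, this sketch simply samples random points inside the bin and averages a density function. The function `toy_density`, the layer position and the sample count are assumptions for illustration, and the points are drawn uniformly in r, φ and z rather than weighted by volume.

```python
import math
import random

SILICON_DENSITY = 2.33  # g/cm^3
AIR_DENSITY = 0.0012    # g/cm^3


def toy_density(r, phi, z):
    """Hypothetical full geometry: one silicon shell between r = 2.0 and 2.1 cm."""
    return SILICON_DENSITY if 2.0 <= r < 2.1 else AIR_DENSITY


def scan_bin(density, r_range, phi_range, z_range, n_samples=2000, seed=1):
    """Average the density over random points inside one (r, phi, z) bin."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        r = rng.uniform(*r_range)
        phi = rng.uniform(*phi_range)
        z = rng.uniform(*z_range)
        total += density(r, phi, z)
    return total / n_samples


# A bin lying entirely inside the silicon shell, and one straddling its inner edge.
inside = scan_bin(toy_density, (2.01, 2.09), (0.0, math.pi / 6), (0.0, 10.0))
mixed = scan_bin(toy_density, (1.9, 2.1), (0.0, math.pi / 6), (0.0, 10.0))
```

The bin that straddles the layer boundary receives an averaged density between air and silicon. This is exactly the loss of detail that the choice of radial boundaries along the layers is meant to avoid.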

This procedure creates a geometry model made of symmetrical voxels, with arbitrary accuracy. By itself, however, it yields no performance improvement. In addition, a new navigation method adapted to the new geometry is needed.

In SiliMap, this is done by extensively exploiting the symmetry of the new model. Because the binning is equidistant in the z and φ directions, essentially a simple modulo operation suffices to determine the bin and to get its material information. This is highly efficient, as its complexity does not rise with the number of voxels.
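
A minimal sketch of such a lookup is given below. It is not the SiliMap implementation: the class `VoxelMap` and its parameters are hypothetical. The z and φ indices follow from one division each (plus a modulo for the periodic angle), independent of the number of bins; for the non-uniform radial direction the sketch uses a bisection over the handful of radial edges, which is an assumption of this example.

```python
import math
from bisect import bisect_right


class VoxelMap:
    """Hypothetical voxel grid: uniform in z and phi, layer-based in r."""

    def __init__(self, z_min, z_max, n_z, n_phi, r_edges):
        self.z_min, self.z_max, self.n_z = z_min, z_max, n_z
        self.dz = (z_max - z_min) / n_z
        self.n_phi = n_phi
        self.dphi = 2.0 * math.pi / n_phi
        self.r_edges = list(r_edges)

    def index(self, r, phi, z):
        """Return the (ir, iphi, iz) voxel indices of a point."""
        if not (self.z_min <= z < self.z_max):
            raise ValueError("z outside the binned volume")
        if not (self.r_edges[0] <= r < self.r_edges[-1]):
            raise ValueError("r outside the binned volume")
        # Uniform binning: the index is a single division, no search needed.
        iz = min(int((z - self.z_min) / self.dz), self.n_z - 1)
        # The modulo maps any angle into [0, 2*pi); the second modulo guards
        # against rounding that would otherwise yield iphi == n_phi.
        iphi = int((phi % (2.0 * math.pi)) / self.dphi) % self.n_phi
        # Non-uniform radial bins: bisection over a few edges.
        ir = bisect_right(self.r_edges, r) - 1
        return ir, iphi, iz


vm = VoxelMap(-50.0, 50.0, 10, 12, [0.0, 2.0, 2.1, 4.0, 4.1, 10.0])
in_first_layer = vm.index(2.05, 0.1, -49.0)
wrapped_angle = vm.index(2.05, 0.1 + 2.0 * math.pi, -49.0)
outer_air = vm.index(5.0, -0.1, 49.0)
```

The material of a voxel could then be read from an array indexed by these three integers, so the cost of a lookup stays constant no matter how finely z and φ are binned.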

The SiliMap approach brought an enormous gain in the performance of the track fit at CDF. The next section examines whether and how this concept can be transferred to the Belle II tracking.

¹ Consisting of the energy loss constant C_{dE/dx}, the mean excitation potential I_0, the specific radiation length X_0, a particle hypothesis and the path length traversed by the particle.


Figure 7.1: The SiliMap binning in the z and φ directions. [36]

Figure 7.2: Sketch of the radial binning of SiliMap (projection onto the φ-plane). The right picture shows a zoomed view of the inner region. The binning is done such that each silicon layer is merged into one bin. [36]

Figure 7.3: Measured density after a scan, in z-φ projection. [36]

7.3 Similar Approaches at Belle II

First, one big difference between the detector designs of Belle II and CDF is that, because of its concept as a B factory, the Belle II detector is designed asymmetrically. This calls the whole concept of SiliMap into question, since it relies heavily on symmetry. Nevertheless, with some adjustments, parts of this concept could be carried over to Belle II.

The purely cylindrical binning must of course be abandoned, because some parts of the detector are not arranged symmetrically. One solution could be to take these parts, for example the slanted parts of the SVD (see 3.2), out of the global parametrization of the binning and to give them a binning of their own, such as a conical parametrization for the slanted SVD. Exceptions like these would not cause a loss in navigation performance. As long as an appropriate binning is found, the scan can be performed just as at CDF.

For the navigation itself, the TGeo class from ROOT, which the RKTrackRep uses, must be replaced by a dedicated class that deals with this binned model. If a varying binning is chosen, the simple approach of using a modulo to determine the voxel cannot be used directly. However, it can be used to determine first the binning area and then the single bin.
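
The two-step lookup (first the binning area, then the bin inside it) might look like the following sketch. It is restricted to the z direction for brevity, and the classes `Region` and `RegionedBinning` as well as the region boundaries are hypothetical; nothing here reflects an existing Belle II or GenFit class. The region is found by bisection over the few region boundaries, which is a choice made for this example.

```python
from bisect import bisect_right


class Region:
    """Hypothetical binning area with its own uniform bin width along z."""

    def __init__(self, z_min, z_max, n_bins):
        self.z_min, self.z_max, self.n_bins = z_min, z_max, n_bins
        self.dz = (z_max - z_min) / n_bins

    def local_bin(self, z):
        # Within one region the binning is uniform, so one division suffices.
        return min(int((z - self.z_min) / self.dz), self.n_bins - 1)


class RegionedBinning:
    """Step 1: find the region. Step 2: find the bin inside that region."""

    def __init__(self, regions):
        self.regions = sorted(regions, key=lambda reg: reg.z_min)
        self.starts = [reg.z_min for reg in self.regions]

    def locate(self, z):
        i = bisect_right(self.starts, z) - 1
        if i < 0 or z >= self.regions[i].z_max:
            raise ValueError("z outside all binning regions")
        return i, self.regions[i].local_bin(z)


# Asymmetric toy layout: coarse bins backward, finer bins forward.
binning = RegionedBinning([Region(-30.0, 0.0, 3), Region(0.0, 40.0, 8)])
backward = binning.locate(-25.0)
forward = binning.locate(12.0)
boundary = binning.locate(0.0)
```

Since the number of regions is small and fixed, the cost per lookup remains essentially independent of the total number of voxels, which preserves the main advantage of the SiliMap navigation.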

Though the very high-performance SiliMap approach does not fit the Belle II concepts perfectly, its basic ideas could be transferred to Belle II with slight changes. Given the need for further optimisation of the Belle II tracking, approaches like these should be considered.


Chapter 8

Conclusion

In the context of this thesis, numerous aspects of the Belle II tracking software were examined in detail, and some were improved significantly. The outcome of the two main topics of this thesis, the integration of Geant4e into GenFit by creating the G4eTrackRep on the one hand and the optimisation of GenFit itself on the other, is discussed here in detail.

First, let us briefly recall the challenges of track fitting at Belle II. With forty times the luminosity of its predecessor, and an expected background rate twenty times higher, precise track fitting is indispensable. Also, as computing resources are limited, both memory and CPU time consumption should be kept within limits.

To make track fitting more precise and less memory consuming, one topic of this thesis was the creation of a new class using Geant4e as track extrapolator. As Geant4e uses the same geometry format as Geant4 in the Belle II simulation, the memory-consuming geometry could be shared; in addition, this approach makes use of the precise particle stepping of Geant4.

To test the created class, called G4eTrackRep, a standalone testing environment using Geant4 as simulation was set up. A simple detector was simulated to produce hits, which could then be fitted by this new class within the track fitting framework GenFit. This direct integration of GenFit into a pure Geant4 environment was one of the first of its kind.

As described in section 5.1.5, severe malfunctions within Geant4e and a serious lack of documentation complicated the work with it enormously. Nevertheless, after analysing the work of L. Piilonen [25], Geant4e was successfully integrated into Belle II's software environment despite its complicated structure. Unfortunately, some severe malfunctions could not be fixed, and in the end no properly working version of the G4eTrackRep could be established.


This work on the G4eTrackRep was, along with the work of M. Wysocki [26], the first attempt at creating a Kalman fitter based on Geant4e, as far as the author's inquiries have shown. Given the experience with Geant4e, namely its small user base, its lack of documentation and the severe problems in its implementation, developers in future experiments should consider carefully whether Geant4e should really be used.

As the other topic of this thesis, the performance of GenFit was analysed. This revealed great room for improvement, especially concerning matrix operations. The author's benchmark of matrix packages demonstrated the excellent characteristics of the Eigen library, which made it the best candidate for integration into GenFit's GFKalman class, which is responsible for the track fit. This approach brought significant speed advantages in the fit, leading to a performance gain of 11.4%; combined with the work of J. Rauch [32], it reduced the time needed for a track fit by 59.4%. Keeping in mind that about 10^13 fits are performed during the Belle II experiment, this is an enormous gain in performance.

Additionally, new directions for further optimisation of the geometry navigation were illustrated, with great room for further performance improvements. It was shown that the SiliMap approach taken at CDF [36], which lowers the time for geometry navigation to a fraction, can be adapted to Belle II with some changes.

Appendix A

Appendix

A.1 Performance of GFFastKalman

Note: All benchmarks, performance tests and profiling were run on a desktop PC with an Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz, 3.759 GB RAM, the operating system Ubuntu 10.04 Lucid Lynx and kernel 3.0.0-26-generic.

[Figure A.1, two bar charts; y-axis: time in percentage of non-opt. Left: no opt. 100, trackrep opt. 45.8, full opt. 40.6. Right: trackrep opt. 100, full opt. 88.6]

Figure A.1: Performance of the optimisations without geometry navigation compared to non-optimised versions, in percent.


[Figure A.2, two bar charts; y-axis: time in ms. (a): no opt. 2.06186, trackrep opt. 0.94446, full opt. 0.83679. (b): trackrep opt. 0.94446, full opt. 0.83679]

Figure A.2: Performance of the optimisations without geometry navigation compared to non-optimised versions, in absolute values with CPU time in ms. The CPU time does not reflect the actual elapsed time.
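
As a cross-check, the relative values of figure A.1 can be recomputed from the absolute CPU times of figure A.2. The short Python sketch below does only this arithmetic; the helper name `relative_percent` is introduced here for illustration, and the input numbers are the ones printed in the figure.

```python
# CPU times per ProcessTrack() call in ms, as read from figure A.2.
times_ms = {"no opt.": 2.06186, "trackrep opt.": 0.94446, "full opt.": 0.83679}


def relative_percent(times, reference):
    """Express every time as a percentage of the reference entry."""
    ref = times[reference]
    return {name: round(100.0 * t / ref, 1) for name, t in times.items()}


# Left panel of figure A.1: everything relative to the unoptimised version.
vs_unoptimised = relative_percent(times_ms, "no opt.")

# Right panel: full optimisation relative to the optimised track representation.
optimised_only = {k: v for k, v in times_ms.items() if k != "no opt."}
vs_trackrep = relative_percent(optimised_only, "trackrep opt.")
```

The rounded results reproduce the 45.8, 40.6 and 88.6 percent shown in figure A.1.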

[Figure A.3, two bar charts "Optimisation vs. Non-Optimisation + Geo. navigation"; y-axis: time in ms. Left: no opt. 2.07562, trackrep opt. 0.946649485, full opt. 0.83465. Right: trackrep opt. 0.946649485, full opt. 0.83465]

Figure A.3: Performance of the optimisations with geometry navigation switched on compared to non-optimised versions, in absolute values with CPU time in ms. The CPU time does not reflect the actual elapsed time.


[Figure A.4, six histograms with 10000 entries each:
q/p pull: mean -0.2453, RMS 1.004; fit: χ²/ndf 115.1/114, prob. 0.4545, constant 238.9 ± 3.0, mean -0.252 ± 0.010, sigma 0.9909 ± 0.0073.
p-value: mean 0.4973, RMS 0.2888; fit: χ²/ndf 113.6/98, prob. 0.1339, p0 100.3 ± 2.0, p1 -2.789 ± 3.447.
u' pull: mean -0.005973, RMS 0.9923.
v' pull: mean -0.02998, RMS 1.007; fit: χ²/ndf 102/111, prob. 0.7181, constant 237.4 ± 2.9, mean -0.0312 ± 0.0101, sigma 0.9989 ± 0.0071.
u pull: mean 0.0001461, RMS 0.9986.
v pull: mean -0.004481, RMS 1.004.]

Figure A.4: Pull distributions created with $GENFIT/test/PullsTest using both the optimised RKTrackRep and the GFFastKalman. These pulls are identical to the ones created with the non-optimised classes.
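
For readers unfamiliar with pull distributions, the following toy example shows the quantity being histogrammed: the difference between fitted and true value, divided by the estimated uncertainty. For an unbiased fit with correct error estimates, the pulls should have mean 0 and width 1. This is a generic sketch with made-up numbers (true value, resolution, random seed), not the code of PullsTest.

```python
import random
import statistics


def pull(fitted, truth, sigma):
    """Deviation of the fitted value from the truth in units of its uncertainty."""
    return (fitted - truth) / sigma


rng = random.Random(42)
truth, sigma = 1.5, 0.2  # hypothetical true parameter and its resolution

# Emulate 10000 fits whose results scatter around the truth with width sigma.
pulls = [pull(rng.gauss(truth, sigma), truth, sigma) for _ in range(10000)]

mean = statistics.fmean(pulls)
rms = statistics.pstdev(pulls)  # spread of the pulls around their mean
```

With 10000 entries the statistical uncertainty on the mean is about 0.01, which sets the scale for judging the means quoted in figure A.4.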


Bibliography

[1] A.D. Sakharov. ‘Violation of CP Invariance, C Asymmetry, and Baryon Asymmetry of the Universe’. In: Pisma Zh.Eksp.Teor.Fiz. 5 (1967), pp. 32–35. doi: 10.1070/PU1991v034n05ABEH002497.

[2] K. Abe. ‘Observation of Large CP Violation in the Neutral B Meson System’. In: Phys. Rev. Lett. 87.hep-ex/0107061. BELLE-2001-10. KEK-2001-50 (July 2001), p. 091802.

[3] Bogdan Povh et al., eds. Particles and nuclei: an introduction to the physical concepts; with 11 tables, and 58 problems and solutions. 3rd ed. Berlin: Springer, 2002.

[4] Claude Amsler. Kern- und Teilchenphysik. UTB 2885: Physik. Zürich: vdf Hochschulverl. AG an der ETH Zürich, 2007. isbn: 3-8252-2885-1; 978-3-8252-2885-9. url: http://deposit.d-nb.de/cgi-bin/dokserv?id=2895537&prov=M&dok_var=1&dok_ext=htm.

[5] F. Abe et al. ‘Observation of top quark production in p̄p collisions’. In: Phys.Rev.Lett. 74 (1995), pp. 2626–2631. doi: 10.1103/PhysRevLett.74.2626. eprint: hep-ex/9503002.

[6] M. Banner et al. ‘Measurement of the Branching Ratio K_L → γγ / K_L → 3π^0’. In: Palmer Physical Laboratory, Princeton University, NJ 16 (1968), p. 16.

[7] J. Beringer et al. ‘Review of Particle Physics’. In: Phys. Rev. D 86 (2010), p. 225. url: http://pdg.lbl.gov.

[8] C. Grupen. Particle Detectors. Ed. by C. Grupen et al. Cambridge Monographs on Particle Physics, Nuclear Physics and Cosmology, 2008.

[9] V.L. Highland. ‘Some practical remarks on multiple scattering’. In: Nucl.Instrum.Meth. 129 (1975), p. 497.

[10] J. Beringer et al. ‘Review of Particle Physics - Passage of Particles through Matter’. In: Phys. Rev. D 86 (2012), pp. 16–18. url: http://pdg.lbl.gov.


[11] T. Abe et al. Belle II Technical Design Report. Tech. rep. High Energy Accelerator Research Organization (KEK), 2010.

[12] Christian Pulvermacher. ‘dE/dx Particle Identification and Pixel Detector Data Reduction for the Belle II Experiment’. MA thesis. Institut für Experimentelle Kernphysik Karlsruhe, 2012.

[13] Rene Brun and Fons Rademakers. ROOT - An Object Oriented Data Analysis Framework. 1997. url: http://root.cern.ch/.

[14] S. Agostinelli et al. ‘Geant4 - a simulation toolkit’. In: Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 506.3 (2003), pp. 250–303.

[15] K. Amako et al. ‘Geant4 developments and applications’. In: IEEE Transactions on Nuclear Science 53 (2006), pp. 270–278.

[16] P. Arce. ‘GEANT4E - Error propagation for track reconstruction inside the GEANT4 framework’. In: CHEP 2006 Mumbai. 2006.

[17] C. Höppner et al. ‘A novel generic framework for track fitting in complex detector systems’. In: Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 620.2-3 (2010), pp. 518–525. issn: 0168-9002. doi: 10.1016/j.nima.2010.03.136. url: http://www.sciencedirect.com/science/article/pii/S0168900210007473.

[18] C. Höppner et al. ‘A Novel Generic Framework for Track Fitting in Complex Detector Systems’. In: Nucl.Instrum.Meth. A620 (2010), pp. 518–525. doi: 10.1016/j.nima.2010.03.136. eprint: 0911.1008.

[19] S. Menzemer. ‘Spurrekonstruktion im Silizium-Vertexdetektor des CDFII-Experiments’. PhD thesis. Institut für Experimentelle Kernphysik Karlsruhe, 2003.

[20] J. Rauch, C. Höppner, S. Neubert. GenFit Code. url: http://genfit.svn.sourceforge.net/viewvc/genfit/trunk/.

[21] J. Rauch. ‘Tracking with a High-Rate GEM-TPC’. MA thesis. Technische Universität München, E18, 2012.

[22] Geant4 User Manual for Application Developers. url: http://geant4.web.cern.ch/geant4/UserDocumentation/UsersGuides/ForApplicationDeveloper/html/ch05s08.html.

[23] W. Wittek. ‘EMC Internal Report EMCSW/81/18’. In: EMCSW/81/18. 1981.


[24] L. Piilonen. Source code of G4ext. url: https://belle2.cc.kek.jp/browse/viewvc.cgi/svn/trunk/software/tracking/modules/ext/src/ExtModule.cc.

[25] L. Piilonen. Ext. Belle II Collaboration. Software Developers Meeting. Apr. 2012.

[26] M. Wysocki. Personal correspondence. 2012.

[27] P. Arce. Personal correspondence. 2012.

[28] Julian Seward et al. Valgrind. 2012. url: http://www.valgrind.org.

[29] Josef Weidendorfer. Callgrind. 2012. url: http://www.valgrind.org/info/tools.html#callgrind.

[30] Nicholas Nethercote. Cachegrind. 2012. url: http://www.valgrind.org/info/tools.html#cachegrind.

[31] R. Stallman et al. GDB - The GNU Debugger Project. 2012. url: http://gcc.gnu.org/.

[32] J. Rauch. Personal correspondence. 2012.

[33] Various authors. Boost C++ Libraries. url: www.boost.org.

[34] G. Guennebaud and B. Jacob. The Eigen Library. 2012. url: http://eigen.tuxfamily.org/.

[35] BOOST WIKI. Frequently Asked Questions. 2012. url: http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Frequently_Asked_Questions_Using_UBLAS.

[36] K. Rinnert. ‘Ein hoch effizientes Geometriemodell für die Spurrekonstruktion im Silizium-Vertex-Detektor bei CDF2 und Suche nach korrelierter Charm-Produktion’. PhD thesis. Institut für Experimentelle Kernphysik Karlsruhe, 2005.


Acknowledgements

First of all, I would like to sincerely thank everyone who contributed to the success of this diploma thesis.

Great thanks go to Prof. Dr. Michael Feindt for acting as referee and for his excellent supervision. His great knowledge and many years of experience, combined with an ever open ear for the concerns of his students, contributed decisively to the success of this work. I also thank him sincerely for his unparalleled support in the context of the Data Mining Cup.

Many thanks to Prof. Dr. Thomas Müller, both for acting as co-referee and for the very enjoyable and instructive lectures during my basic studies, which I will always remember fondly.

Sincere thanks to Dr. Martin Heck and Dr. Thomas Kuhr for their thorough and comprehensive supervision, support and advice during this year. Without their great help this diploma thesis would certainly not have come about in this form.

Many thanks to the entire Belle/Belle II working group of the EKP, especially my office mates from room 9-2, for the friendly and good working atmosphere. Special thanks to all our doctoral students for their great helpfulness, in particular to Bastian Kronenbitter, who gave me important impulses for this work.

Many thanks to Dr. Christian Höppner and Johannes Rauch (Technische Universität München) of the GenFit project for the good collaboration and the hospitality.

Special thanks go to the Institut für Experimentelle Kernphysik and the Belle II collaboration for the unique opportunity to take part in a large international experiment, and for allowing me to attend meetings and conferences all over Germany in this context. This experience was a great enrichment for me.

Many, many thanks to everyone who proofread this thesis, especially Timo Holzherr, Dennis Weber, Stephan Spath, Kristina Tanz, Johannes Grygier and Andreas Lang. I know how laborious proofreading is, and appreciate it all the more!

Very special thanks go to my family, to my brother Julien and to my parents Doris and Karl-Ludwig, for always standing behind me and always supporting me. And last but not least, once again great thanks to my parents for making my studies possible in the first place!

I hereby declare that I have written this thesis independently and have used only the stated aids.

Philipp Oehler
Karlsruhe, 31 October 2012
