IEKP-KA/2012-14

Unterdrückung des Z-Untergrundes zur Higgs-Suche im Kanal H → ττ durch Multivariate Analyse von ττ-Endzuständen in pp-Kollisionen am LHC

Thomas Müller

Diploma thesis at the Department of Physics of the Karlsruhe Institute of Technology

Reviewer: Prof. Dr. Günter Quast, Institute of Experimental Nuclear Physics
Second reviewer: Prof. Dr. Wim de Boer, Institute of Experimental Nuclear Physics

11 April 2012




German Summary

The Standard Model of particle physics is the theoretical foundation of modern particle physics. To date it describes all observed phenomena of the electromagnetic, the strong and the weak interaction with impressive precision. The interactions between the fermions, the matter particles, are mediated by the exchange of so-called gauge bosons. Within quantum field theory, the corresponding gauge fields can be introduced by requiring a local gauge symmetry of the Lagrangians that describe such systems. All exchange bosons introduced in this way are initially massless.

Experimental observation shows, however, that the weak gauge bosons, the W± and Z particles, have relatively large masses of 80 GeV/c² and 91 GeV/c², respectively. The W± and Z bosons acquire mass through spontaneous breaking of the SU(2) symmetry group that underlies the weak interaction. This requires introducing the Higgs field as a new field, together with the associated Higgs boson, whose mass mH is a free parameter. Detecting the Higgs boson and measuring its mass as well as its couplings to the fermions and bosons of the Standard Model would confirm the so-called Higgs mechanism and thereby complete the Standard Model experimentally as well.

For the search for the Higgs boson, data from the CMS detector (Compact Muon Solenoid) are analysed. This detector, a masterpiece of current engineering, is one of the two general-purpose detectors at CERN (European Organization for Nuclear Research) near Geneva, Switzerland, and is installed at the LHC (Large Hadron Collider), currently the most powerful particle accelerator. There, proton beams collide at the highest intensities at a centre-of-mass energy of 7 TeV. The neutral Higgs bosons of the Standard Model can be produced, for example, by the fusion of two gluons via a loop of heavy quarks. The decay of the Higgs boson into pairs of τ leptons, as studied in this thesis, is particularly well suited to the direct search for the Higgs boson in the low Higgs mass range. On the one hand, the branching ratio of almost 10 % is fairly large; on the other hand, ττ events can be identified well in the detector compared with the more frequent decay into pairs of b quarks. Because of its similar decay topology, the decay of the Z boson into pairs of tau leptons constitutes an irreducible background process that is hard to suppress, especially in searches for a very light Higgs boson. In this thesis, two approaches are pursued to suppress the contributions of Z decays in the search for the Higgs boson.

Mass Reconstruction with Artificial Neural Networks

When distinguishing Higgs decays from Z decays, the invariant mass of all decay products is the most important separating information. In the decay into pairs of tau leptons, two to four neutrinos, depending on the decay modes of the tau leptons, carry away energy and momentum undetected and thus make a complete reconstruction of the ττ system impossible. This in turn makes it difficult to reconstruct an invariant mass of all decay products.

Official CMS analyses have so far used three different mass definitions. The mass of the visible decay products underestimates the true mass because all neutrinos are neglected, but it provides a very simple and robust estimator of the ττ mass. Furthermore, a collinear approximation for the flight directions of the neutrinos yields an unbiased mass definition, whose spectrum, however, is characterised by pronounced tails towards high masses. These tails significantly lower the expected significance for a discovery of the Higgs boson. Finally, there is the so-called SVfit algorithm, which attempts to estimate the most probable mass in a likelihood approach by evaluating the compatibility of measured kinematic quantities of the decay with hypotheses from theoretical matrix-element calculations. This mass reconstruction, the best in terms of bias and resolution, comes at the cost of a comparatively long computing time.
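The first two mass definitions can be illustrated with a few lines of Python. This is a minimal sketch and not the CMS implementation: the four-vector layout `(E, px, py, pz)`, the function names `invariant_mass` and `collinear_mass`, and the toy momenta are assumptions made for illustration. Natural units (c = 1, energies and momenta in GeV) are used, the visible decay products are treated as massless in the collinear formula, and SVfit is not reproduced.

```python
import math

def invariant_mass(*four_vectors):
    """Invariant mass of the summed four-vectors, each given as (E, px, py, pz)."""
    energy = sum(p[0] for p in four_vectors)
    px = sum(p[1] for p in four_vectors)
    py = sum(p[2] for p in four_vectors)
    pz = sum(p[3] for p in four_vectors)
    m2 = energy**2 - px**2 - py**2 - pz**2
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

def collinear_mass(vis1, vis2, met_x, met_y):
    """Collinear-approximation mass, or None if the approximation has no solution.

    Assumes each neutrino system flies along its visible decay product, so the
    missing transverse momentum is a * pT(vis1) + b * pT(vis2).
    """
    det = vis1[1] * vis2[2] - vis1[2] * vis2[1]
    if abs(det) < 1e-12:
        return None  # visible products parallel or back-to-back in the transverse plane
    a = (met_x * vis2[2] - met_y * vis2[1]) / det
    b = (vis1[1] * met_y - vis1[2] * met_x) / det
    x1 = 1.0 / (1.0 + a)  # visible momentum fraction of the first tau
    x2 = 1.0 / (1.0 + b)  # visible momentum fraction of the second tau
    if x1 <= 0.0 or x2 <= 0.0:
        return None  # unphysical solution
    return invariant_mass(vis1, vis2) / math.sqrt(x1 * x2)

# Toy event (made-up numbers): a muon and a tau jet at right angles.
vis_mu = (30.0, 30.0, 0.0, 0.0)
vis_jet = (40.0, 0.0, 40.0, 0.0)
m_vis = invariant_mass(vis_mu, vis_jet)
m_col = collinear_mass(vis_mu, vis_jet, met_x=30.0, met_y=40.0)
print(f"visible mass: {m_vis:.1f} GeV, collinear mass: {m_col:.1f} GeV")
```

In this toy event each neutrino system carries as much momentum as its visible partner, so the visible mass of about 49 GeV underestimates the collinear mass of about 98 GeV by a factor of two. The `None` branches reflect the configurations in which the collinear approximation breaks down.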

The first part of the analysis investigates whether the mass reconstruction can be improved further with the help of artificial neural networks in order to achieve a better separation between Higgs and Z events. The NeuroBayes package is used for this purpose. It is distinguished by advanced preprocessing routines for the input variables and by the option of so-called density trainings. These reconstruct a complete probability density function (PDF) as a function of the target quantity. Estimators for arbitrary target quantities can then be derived, for example, from the mean of the PDF.
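How a point estimate follows from a reconstructed density can be sketched as below. The example assumes that the PDF is available as a list of bin centres with associated probabilities; this binned representation and the function name `point_estimators` are assumptions for illustration and not the NeuroBayes interface. Besides the mean named in the text, the sketch also returns the median and the mode as other common choices of estimator.

```python
def point_estimators(bin_centres, probabilities):
    """Return (mean, median, mode) of a binned probability density."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("the density must have positive total probability")
    norm = [p / total for p in probabilities]  # normalise to unit area

    mean = sum(c * p for c, p in zip(bin_centres, norm))
    mode = bin_centres[max(range(len(norm)), key=norm.__getitem__)]

    cumulative = 0.0
    median = bin_centres[-1]
    for centre, p in zip(bin_centres, norm):
        cumulative += p
        if cumulative >= 0.5:
            median = centre  # first bin where the cumulative probability reaches 50 %
            break
    return mean, median, mode

# Toy density with a tail towards high masses (made-up numbers, in GeV).
centres = [90.0, 110.0, 130.0, 150.0]
density = [0.1, 0.5, 0.3, 0.1]
mean, median, mode = point_estimators(centres, density)
print(mean, median, mode)
```

For a skewed density such as this one the mean is pulled towards the tail while the median and the mode stay at the peak, which is why the choice of estimator matters.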

The simplest network-based way to reconstruct the ττ mass is to predict the sought mass directly with a density training, where the training target is the generated invariant mass of the ττ system. The network is trained on simulated events whose distribution of true masses is flat, in order to prevent the network from learning preferences in the mass. As expected, the most important contributions to the training come from the mass of the visible decay products, as a first estimator of the mass, and from the missing transverse energy (MET), as an indication of the energy carried away by the neutrinos. In addition, angular relations between the respective decay products of the two tau leptons and with respect to the MET, as well as impact parameters of the tracks of the visible decay products, are important. Since the analysis concentrates on the decay mode in which one τ lepton decays hadronically into a so-called jet and the second tau lepton decays leptonically into a muon, the mass of the τ jet is a further variable that contributes significantly to the mass reconstruction by providing information about the hadronic decay.
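The training described above uses simulated events that already have a flat true-mass distribution. If only a non-flat sample were available, a comparable effect could be approximated with per-event weights. The following sketch shows that alternative; it is not the procedure used in this thesis, and the function name `flattening_weights`, the bin edges and the toy masses are assumptions for illustration.

```python
import bisect

def flattening_weights(true_masses, bin_edges):
    """Per-event weights that give every mass bin the same total weight.

    bin_edges must be sorted; each bin includes its lower edge and excludes
    its upper edge.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    bin_index = []
    for mass in true_masses:
        i = bisect.bisect_right(bin_edges, mass) - 1
        if not 0 <= i < n_bins:
            raise ValueError(f"mass {mass} lies outside the binned range")
        bin_index.append(i)
        counts[i] += 1
    # Each event is weighted by the inverse population of its bin.
    return [1.0 / counts[i] for i in bin_index]

# Toy sample (made-up numbers, in GeV): two light events, one heavy event.
weights = flattening_weights([60.0, 65.0, 150.0], bin_edges=[50.0, 100.0, 200.0])
print(weights)
```

With these weights both bins contribute a total weight of one, so a training on the weighted sample would see no preferred mass region within the binning.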

[Figure 1, left panel: reconstructed mass m_ττ^reco in GeV/c² versus generated mass m_ττ^gen in GeV/c²; right panel: reconstruction resolution σ(m_ττ^reco)/m_ττ^reco versus generated mass m_ττ^gen in GeV/c²; curves in both panels: visible mass, SVfit mass, NeuroBayes.]

Figure 1: Results of the direct ττ mass reconstruction with a neural network in comparison with existing mass definitions. Left: the reconstructed masses as a function of the true mass. The means of the reconstructed distributions are indicated by the lines and the spreads by the bands around the means. Right: the mass resolution as a function of the true mass. Edge regions in which the network-based reconstruction does not yield reliable results for technical reasons are shaded.

Figure 1 shows the results of the reconstruction in comparison with the SVfit method and with the mass of the visible decay products. It is evident that the neural network is in principle able to reconstruct the ττ mass in this simple way. Only at the edges of the interval of true masses used in the training does the network-based reconstruction fail to yield reliable results for technical reasons; this can be avoided by extending the range of true masses used during training. Over the entire mass range, the new reconstruction reaches the precision of the SVfit mass, and in the interesting intermediate mass range between the nominal Z mass and about 150 GeV/c² it improves on the SVfit resolution by up to five percentage points, which corresponds to a relative improvement of slightly more than 20 %. The linear dependence of the mean reconstructed mass on the true mass is comparable to that of the SVfit method in the interesting mass range of 90 to 150 GeV/c². More complex approaches to the prediction with neural networks, which take into account special effects such as resonances occurring in the hadronic τ decay, did not, however, achieve any significant improvements.
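The resolution quoted here is the spread of the reconstructed mass divided by its mean, and an absolute gain in percentage points translates into a relative gain only once the reference resolution is known. The sketch below makes both steps explicit. It assumes that the spread is a standard deviation, and the masses and the resolutions of 0.25 and 0.20 are illustrative numbers, not values read from the thesis.

```python
import statistics

def relative_resolution(reco_masses):
    """Spread of the reconstructed masses divided by their mean."""
    return statistics.pstdev(reco_masses) / statistics.mean(reco_masses)

def improvement(reference_resolution, new_resolution):
    """Return (absolute gain, relative gain) of a new resolution over a reference."""
    absolute = reference_resolution - new_resolution
    return absolute, absolute / reference_resolution

# Toy reconstructed masses for events with one generated mass (made-up, in GeV).
resolution = relative_resolution([90.0, 100.0, 110.0])

# Illustrative case: a gain of five percentage points on a reference of 0.25.
absolute_gain, relative_gain = improvement(0.25, 0.20)
print(f"resolution: {resolution:.3f}")
print(f"absolute gain: {absolute_gain:.2f}, relative gain: {relative_gain:.0%}")
```

A gain of five percentage points equals exactly 20 % when the reference resolution is 0.25, so a relative improvement of slightly more than 20 % implies a reference resolution slightly below that value.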

To bring more detailed physical considerations into the mass reconstruction, two further approaches are pursued in which the mass is not predicted directly by a single network. The attempt to reconstruct corrections to the mass of the visible decay products, however, only confirms the previous results and does not improve on them.

In a further approach, the contribution of the neutrinos to the ττ mass is parametrised in terms of three quantities that represent relations between visible and invisible four-momentum components of the ττ system. An independent network is trained for each parameter, and the mass is then calculated analytically from the reconstructed estimators of the parameters. In addition, a fourth network is trained that receives as input variables the measured quantities, the reconstructed parameters entering the analytic calculation, and the analytic result itself.

[Figure 2: reconstruction resolution σ(m_ττ^reco)/m_ττ^reco versus generated mass m_ττ^gen in GeV/c²; curves: SVfit mass; NB, one-staged; NB, analyt. param. comb.; NB, net param. combination.]

Figure 2: Comparison of the mass resolution as a function of the true mass for the simple direct reconstruction with a single neural network and for the multi-stage parametrisation approach.

Figure 2 shows the resolutions of the ττ masses reconstructed in this way in comparison with the one-stage direct mass reconstruction using a neural network and with the SVfit mass. The analytic calculation yields the worst resolution because the reconstruction uncertainties of the individual parameters propagate to the result. Adding a further network to combine the three parameter networks suppresses this effect somewhat but cannot prevent it completely. The simpler direct reconstruction of the mass therefore remains the best solution.

The actual impact of the improved mass resolution on the separation of H → ττ and Z → ττ events, as well as systematic effects, are not yet investigated in this thesis. Instead, a classification of Higgs events based on established variables is studied.


Classification of Higgs Events with Artificial Neural Networks

The second part of the analysis deals with the classification of Higgs events in the channel H → ττ → µµ. Multivariate analysis methods are used to improve the suppression of the Z → ττ and Z → µµ backgrounds in comparison with the official likelihood-based analysis. Further background processes such as tt̄ + jets, diboson (WW, WZ and ZZ production) and W + jets events as well as QCD processes are also included.

[Figure 3: background rejection 1 − ε_bkg versus signal efficiency ε_sig; curves: NeuroBayes, Likelihood.]

Figure 3: Separation performance of the network-based discriminator, which is equivalent to the one-dimensional likelihood method, in comparison with that likelihood method. The values of the signal efficiency and of the fraction of rejected background events correspond to selections with variable cuts on the distributions of the discriminators.

In the first step, the official two-part likelihood approach is reproduced with neural networks, using the same discriminating variables as a basis. In analogy to a one-dimensional likelihood method, a neural network is first trained to separate Higgs events of arbitrary mass from background processes. Important variables here are, for example, the distance of closest approach (DCA) of the muon tracks and the ratio of the transverse momenta in the system of the two muons. Figure 3 shows that, for a selection of events after a cut on the discriminator output by the network, the background fraction can be reduced on average by a factor of 2 compared with a selection based on the likelihood quantity, while the amount of selected signal stays the same. A purer signal sample can therefore be selected, which is of great benefit for subsequent analyses.
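The curves in Figure 3 are obtained by varying a cut on the discriminator and recording, for each cut, the signal efficiency and the fraction of rejected background. A minimal sketch of that scan is given below; the function name `roc_points` and the toy discriminator values are assumptions for illustration and are not taken from the analysis.

```python
def roc_points(signal_scores, background_scores, cuts):
    """Return (signal efficiency, background rejection) for each cut value.

    An event passes the selection if its discriminator value exceeds the cut.
    """
    points = []
    for cut in cuts:
        eff_sig = sum(s > cut for s in signal_scores) / len(signal_scores)
        eff_bkg = sum(b > cut for b in background_scores) / len(background_scores)
        points.append((eff_sig, 1.0 - eff_bkg))
    return points

# Toy discriminator outputs (made-up numbers).
signal = [0.9, 0.8, 0.4, 0.7]
background = [0.1, 0.2, 0.6, 0.3]
curve = roc_points(signal, background, cuts=[0.0, 0.5, 1.0])
print(curve)
```

Two discriminators can then be compared at equal signal efficiency: the one with the higher background rejection leaves a smaller background fraction in the selected sample.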

Following this likelihood-based reduction of background contributions, the official analysis determines expected upper limits on the Higgs production cross section on the basis of two-dimensional mass distributions. For statistical reasons, these distributions of the µµ mass and the SVfit mass offer little scope for extension by further separating variables. With one further network for each Higgs mass hypothesis, a one-dimensional discriminator is obtained, on the basis of which the expected upper limits can be calculated analogously. The result is shown in Figure 4 (right) and demonstrates that the sensitivity of the network-based method confirms that of the existing method. It is therefore to be expected that adding further significant variables can increase the sensitivity of the analysis.

In the classification, the discrimination against the Z → ττ and the Z → µµ backgrounds has to happen simultaneously. By means of its preprocessing routines, NeuroBayes prepares the input variables optimally for the respective classification problem. It is therefore natural to suppress the two dominant background processes separately in a two-stage procedure. This is favoured by the fact that some of the variables used are particularly suited to reducing the Z → ττ background, whereas others discriminate against the Z → µµ background.

In the first step, a neural network classifies ττ final states. The necessary separation from the dominant Z → µµ background is provided mainly by the µµ mass. A second network then discriminates H → ττ events against Z → ττ events for each Higgs mass hypothesis. Here the reconstructed ττ mass is particularly important, although further variables are needed for small Higgs masses.

[Figure 4, left panel: background rejection 1 − ε_bkg versus signal efficiency ε_sig for mH = 120, 160, 250 and 350 GeV/c²; right panel: 95 % CL upper limit on σ·BR in pb versus Higgs mass mH in GeV/c²; curves: 2D masses, L cut; NB, two-staged; NB, traditional; Mainstream analysis.]

Figure 4: Performance of the two-stage classification approach in a cut-based analysis (left) and the expected upper limits on the Higgs production cross section (right) for the methods studied.

Figure 4 shows the performance of this approach, in which the two network outputs are multiplied to obtain a combined discriminator. In contrast to the official analysis with two-dimensional mass distributions, events can now also be selected by a cut on the combined discriminator. The signal purity of a subsample selected in this way increases with the mass of the Higgs boson. In addition, the expected upper limits on the Higgs production cross section can be lowered by 40 % on average if systematic effects are disregarded. This represents a clear gain in the sensitivity of the analysis that can be attributed directly to the method. When NeuroBayes is used, it is therefore very useful to treat the two dominant Z backgrounds with separate networks.
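The combination of the two network outputs and the subsequent cut-based selection can be sketched as follows. The example assumes that both outputs are scaled to the interval [0, 1], so that their product is large only when both networks favour the signal; this scaling, the field names `nn_tautau`, `nn_higgs_vs_z` and `is_signal`, and the toy events are assumptions for illustration and not part of the analysis code.

```python
def combined_discriminator(nn_tautau, nn_higgs_vs_z):
    """Product of the two stage outputs, both assumed to lie in [0, 1]."""
    return nn_tautau * nn_higgs_vs_z

def select_events(events, cut):
    """Keep the events whose combined discriminator exceeds the cut."""
    return [
        ev for ev in events
        if combined_discriminator(ev["nn_tautau"], ev["nn_higgs_vs_z"]) > cut
    ]

def signal_purity(selected):
    """Fraction of true signal events in a selection, or None if it is empty."""
    if not selected:
        return None
    return sum(ev["is_signal"] for ev in selected) / len(selected)

# Toy events (made-up numbers); the truth flag would come from simulation.
events = [
    {"nn_tautau": 0.9, "nn_higgs_vs_z": 0.8, "is_signal": True},
    {"nn_tautau": 0.9, "nn_higgs_vs_z": 0.2, "is_signal": False},  # Z -> tautau-like
    {"nn_tautau": 0.1, "nn_higgs_vs_z": 0.9, "is_signal": False},  # Z -> mumu-like
    {"nn_tautau": 0.8, "nn_higgs_vs_z": 0.7, "is_signal": False},
]
selected = select_events(events, cut=0.5)
purity = signal_purity(selected)
print(len(selected), purity)
```

An event that fails either stage receives a small product, so a single cut on the combined quantity rejects both kinds of Z background at once.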

With regard to further use of the method studied, some open questions cannot yet be answered within this thesis. The main point that remains to be investigated is how the tails of the discriminators, in which the actual separation from background contributions takes place, can be used further in a statistically sound way and how changes affect the sensitivity of the analysis.


Suppression of the Z Background for the Higgs Search in the H → ττ Channel by a Multivariate Analysis of ττ Final States in pp Collisions at the LHC

Thomas Müller

Diploma Thesis

at the Department of Physics

of the Karlsruhe Institute of Technology

Reviewer: Prof. Dr. Günter Quast, Institute of Experimental Nuclear Physics

Second Reviewer: Prof. Dr. Wim de Boer, Institute of Experimental Nuclear Physics

April 11th, 2012


Contents

Introduction

1 The Standard Model of Particle Physics
1.1 Quantum Field Theory
1.1.1 Quantum Electrodynamics
1.1.2 The Weak Interaction
1.1.3 Glashow-Weinberg-Salam Theory – The Electroweak Unification
1.1.4 The Higgs Mechanism – Spontaneous Symmetry Breaking
1.1.5 Yukawa Interaction
1.2 Experimental Verification
1.2.1 Standard Model Higgs Boson Production and Decay
1.2.2 Experimental Results – Higgs Exclusion Limits
1.2.3 Limitations of the Standard Model

2 The CMS Experiment at the LHC
2.1 The Large Hadron Collider (LHC)
2.2 The Compact Muon Solenoid (CMS)
2.2.1 Coordinate System
2.2.2 Silicon Tracking Detector
2.2.3 Electromagnetic Calorimeter
2.2.4 Hadronic Calorimeter
2.2.5 Superconducting Solenoid
2.2.6 Muon System
2.2.7 Data Acquisition
2.3 Simulation, Reconstruction and Software
2.3.1 Monte Carlo Event Generators
2.3.2 Detector Simulation


2.3.3 Reconstruction of Physical Objects
2.3.4 Software Frameworks

3 Multivariate Analysis Techniques
3.1 Discriminating Between Two Classes of Events
3.1.1 Determination of Test Statistics
3.1.2 Hypothesis Testing
3.2 Reconstruction of Arbitrary Quantities
3.3 The Neural Network Package NeuroBayes

4 Mass Reconstruction with Artificial Neural Networks
4.1 The Semi-leptonic Decay Mode
4.1.1 Event Generation and Preselection
4.2 Current Mass Definitions
4.2.1 Visible Mass
4.2.2 Collinear Approximation Mass
4.2.3 Secondary Vertex Fit Mass
4.3 Mass Reconstruction Using NeuroBayes
4.3.1 Input Variables and Preprocessing
4.3.2 Simple One-staged Network Topology
4.3.3 Reconstruction of Corrections for the Visible Mass
4.3.4 Physically Motivated Approach Using a Parametrisation of the Missing Momentum
4.4 Conclusion and Outlook

5 Classification of Higgs Events with Artificial Neural Networks
5.1 Overview Over the Current Analysis Strategy
5.1.1 Final States with Two Muons
5.1.2 Data and Monte Carlo Events and Preselection
5.1.3 Current Multivariate Classification of Higgs Events
5.2 Two Subsequent Neural Networks Following the Current Analysis
5.2.1 First Network Equivalent to the One-dimensional Likelihood Analysis
5.2.2 Second Network Equivalent to the Two-dimensional Mass Analysis
5.3 New Two-staged Network Approach
5.3.1 Classification of ττ Final States
5.3.2 Discriminating H → ττ Events Against Z → ττ Events
5.3.3 Performance of the Combined Discriminator
5.4 Conclusion and Outlook

Conclusion – Summary of the Results and Perspective


A Additional Information for the Mass Reconstruction Analysis
A.1 Rankings of the Network Input Variables
A.2 Additional Performance Tests for the Straightforward One-staged Network

B Additional Information for the Higgs Classification Analysis
B.1 Rankings of the Network Input Variables
B.2 Discriminator Distributions for Medium and High Higgs Mass Hypotheses
B.3 Plots Showing All Studied Discriminating Variables
B.4 NeuroBayes Preprocessing


List of Figures

1.1 The four main leading order Higgs production modes at hadron colliders.
1.2 Standard Model cross sections as a function of the collider energy.
1.3 Standard Model Higgs boson decay branching ratios and width.
1.4 Feynman diagrams for the leptonic and hadronic decay of the τ lepton.
1.5 Higgs mass constraints from electroweak precision measurements.
1.6 Upper limits on the Standard Model Higgs boson production cross section based on the full analyses by ATLAS and CMS.
1.7 Upper limits on tan β as a function of mA for searches for the neutral MSSM Higgs bosons in the ττ decay channel at CMS.

2.1 The CERN accelerator complex.
2.4 The coordinate system of the CMS detector.
2.2 Three-dimensional schematic view of CMS.
2.3 Slice through the CMS detector.
2.5 The inner silicon tracking system of CMS.
2.6 Schematic view of the electromagnetic calorimeter of CMS.
2.7 The CMS detector with respect to the hadronic calorimeter.
2.8 The CMS detector with respect to the muon system.
2.9 The CMS data acquisition system.
2.10 The WLCG tier structure for CMS.

3.1 Schematic representation of a three-layer multilayer perceptron.
3.2 Neural network activation function.
3.3 Testing of two hypotheses.
3.4 Illustration of regression methods.

4.1 Typical pp → Z/H → ττ → µ + τjet event topology.
4.2 Mass peaks for the current mass definitions used in CMS analyses.


4.3 Illustration of the collinear approximation.
4.4 Distributions of generated opening angles as a function of the generated ττ mass.
4.5 Distribution of three point estimators derived from the probability density reconstructed by NeuroBayes.
4.6 Uncertainties of the reconstruction for the three point estimators determined from the probability density reconstructed by NeuroBayes.
4.7 Comparison of the performance of current mass definitions with the new neural network approach.
4.8 Mass spectrum of the τ jet.
4.9 Comparison of the network performances for trainings on different hadronic resonances.
4.10 Comparison of the network performances for trainings on different combinations of Z and Higgs events.
4.11 Distribution of the generated and reconstructed correction values for the visible mass.
4.12 Comparison of the straightforward one-staged network approach with a two-staged multi-network approach.
4.13 Comparison of a direct training on the generated ττ mass and a two-staged training that follows the parametrisation approach.
4.14 Mass peaks for the straightforward one-staged network reconstruction.

5.1 Discriminator distributions for the one-dimensional likelihood quantity and the equivalent network output.
5.2 Comparison of the performance of the likelihood quantity and the equivalent network method in terms of the background rejection as a function of the signal efficiency where the signal sample is formed by a 120 GeV/c² Higgs sample.
5.3 Network discriminator distribution for the training on the mass variables for a Higgs mass hypothesis of mH = 120 GeV/c² and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses.
5.4 Expected upper limits on the product of the cross section for the MSSM Higgs boson production and the ττ decay branching ratio for tan β = 30 for the subsequent multivariate analysis consisting of the one-dimensional likelihood analysis with five input variables and the analysis of the visible mass and the SVfit mass.
5.5 Network discriminator distribution for the training purposed for classifying ττ final states and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses.
5.6 Network discriminator distribution for the Higgs-Z separation training for a Higgs mass hypothesis of mH = 120 GeV/c² and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses.


5.7 Distribution of the combined discriminator for classifying Higgs events for a Higgs mass hypothesis of mH = 120 GeV/c² and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses.
5.8 Comparison of the performance of the traditional network approach with the new two-staged network approach.
5.9 Comparison of the expected upper limits on the product of the cross section for the MSSM Higgs boson production and the ττ decay branching ratio for tan β = 30 for the traditional network approach with the ones resulting from the new two-staged network approach.
5.10 Impact of binning of the test statistics on the expected upper limits.

A.1 Comparison of the performances for various tests of the simple straightforward mass reconstruction network.

B.1 Network discriminator distributions for the trainings on the mass variables for a medium and a high Higgs mass hypothesis.
B.2 Network discriminator distributions for the Higgs-Z separation trainings for a medium and a high Higgs mass hypothesis.
B.3 Distributions of the combined discriminator for classifying Higgs events for a medium and a high Higgs mass hypothesis.
B.4 Discriminating variables the one-dimensional likelihood quantity is based on.
B.5 Mass variables, as used for the two-dimensional mass analysis, and the missing transverse energy.
B.6 Additional variables not used in the current multivariate analysis.
B.7 NeuroBayes output informing about the preprocessing performance for a single input variable.


List of Tables

1.1 Gauge bosons and their interactions

1.2 Elementary fermions

1.3 Leptonic and hadronic τ decay modes

1.4 Branching ratios for decays of ττ pairs

4.1 Summary of the preselection of semi-leptonically decaying ττ final states for the H → ττ → µ + τjet analysis

4.2 Ranking of the nine most important input variables used for the straightforward one-staged network

5.1 Summary of the preselection of events with double muon final states for the H → ττ → µµ analysis

5.2 Numbers of events that remained after the preselection for each studied process and real data

5.3 Ranking of the most important input variables used for the network classifying ττ final states

5.4 Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events for a low Higgs mass hypothesis of mH = 120 GeV/c²

5.5 Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events for a medium Higgs mass hypothesis of mH = 250 GeV/c²

5.6 Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events for a high Higgs mass hypothesis of mH = 500 GeV/c²

A.1 Ranking of all input variables for the straightforward one-staged mass reconstruction network


A.2 Ranking of the input variables used for the training purposed to reconstruct corrections for the visible mass

A.3 Ranking of the input variables used for the training on the parameter α for the mass parametrisation approach

A.4 Ranking of the input variables used for the training on the parameter β for the parametrisation approach

A.5 Ranking of the input variables used for the training on the parameter γ for the mass parametrisation approach

A.6 Ranking of the input variables used for the combination training for the mass parametrisation approach to improve the analytically calculated ττ mass

B.1 Ranking of the input variables used for the Higgs classification training that is equivalent to the one-dimensional likelihood method based on five input variables

B.2 Ranking of the input variables used for the Higgs classification training that is equivalent to the two-dimensional mass analysis for a low Higgs mass hypothesis of mH = 120 GeV/c²

B.3 Ranking of the input variables used for the Higgs classification training that is equivalent to the two-dimensional mass analysis for a medium Higgs mass hypothesis of mH = 250 GeV/c²

B.4 Ranking of the input variables used for the Higgs classification training that is equivalent to the two-dimensional mass analysis for a high Higgs mass hypothesis of mH = 500 GeV/c²

B.5 Ranking of all input variables used for the network classifying ττ final states

B.6 Ranking of all input variables used for the network discriminating H → ττ events against Z → ττ events for a low Higgs mass hypothesis of mH = 120 GeV/c²

B.7 Ranking of all input variables used for the network discriminating H → ττ events against Z → ττ events for a medium Higgs mass hypothesis of mH = 250 GeV/c²

B.8 Ranking of all input variables used for the network discriminating H → ττ events against Z → ττ events for a high Higgs mass hypothesis of mH = 500 GeV/c²


Introduction

Particle physics concentrates on fundamental questions concerning the structure of matter and the interactions between elementary particles. Modern research is characterised by the interplay between theories, which are proposed to describe experimental observations and to predict the outcome of future experiments, and experiments, which are performed to verify or falsify theories and to find hints of phenomena beyond existing theoretical models.

Since the second half of the past century, the Standard Model has been developed and is now able to precisely predict the results of many particle physics experiments, such as collider experiments. Up to now, there is no experimentally verified theory that is able to describe the observed heavy masses of the W± and Z bosons. The Higgs mechanism is the most favoured theoretical approach to introduce these masses into the Standard Model. After introducing the so-called Higgs field, the weak vector bosons acquire mass via spontaneous symmetry breaking. To verify this theory, the Higgs boson has to be discovered and its mass, the last unknown parameter of the Standard Model, has to be measured. Chapter 1 provides an overview of the theoretical framework of the Standard Model with particular focus on the Higgs mechanism, which is proposed to explain the origin of mass. Additionally, Higgs production at hadron colliders and its decay are presented phenomenologically.

Technological progress makes it possible to engineer more and more sophisticated experiments. The Large Hadron Collider (LHC), situated at CERN (European Organization for Nuclear Research) near Geneva, Switzerland, is the most powerful accelerator ever built. Protons can be collided at centre-of-mass energies of up to 14 TeV, enabling studies at the TeV scale. Chapter 2 presents the Compact Muon Solenoid experiment (CMS) as one of the two general-purpose particle detectors installed at the LHC. The main goal of this state-of-the-art detector is to disclose the secret of the Higgs particle, namely to exclude it or, ideally, to discover it and measure its properties. Additionally, CMS is capable of probing the Standard Model at the TeV energy scale and of searching for processes that can only be explained by theories beyond the Standard Model.

The search for the Higgs boson is like the search for a needle in a haystack and therefore requires sophisticated analysis techniques. Since this thesis focuses on the study of multivariate analysis techniques, the introductory part is concluded with Chapter 3, which introduces the statistical methods employed throughout the work, especially artificial neural networks and their usage.

The main part of this thesis consists of two analysis chapters presenting studies of the suppression of the Z background for the Higgs search in the H → ττ channel. This channel is characterised by both a promisingly high branching ratio and clear detector signatures, as well as by the challenging appearance of multiple neutrinos in the final state, which prevents a full reconstruction of the ττ system. Chapter 4 focuses on network-based mass reconstruction techniques, whereas Chapter 5 concentrates on a multivariate classification of Higgs events with ττ final states. An improvement of the mass reconstruction resolution for ττ events is needed for a better separation of H → ττ and Z → ττ events, which, apart from the invariant mass of the ττ system, lead to extremely similar detector signatures. In the low Higgs mass region in particular, but also for higher Higgs mass hypotheses, the reconstructed ττ mass alone does not provide enough discrimination power to distinguish between signal and background events, and it is not the only discriminating variable. Therefore, network-based classification approaches are studied in the scope of the official CMS analysis for the Higgs boson search.


1 The Standard Model of Particle Physics

Since ancient times, mankind has searched for the fundamental building blocks of matter in the microcosm as well as for a description of the development and the structure of our universe. Particle physics focuses on the first part, although the investigation of the structure of matter cannot be fully separated from the study of cosmological problems.

Two prominent pioneers should be named. The Greek philosopher Democritus postulated indivisible constituents of all matter, which he called atoms [1]. This possibly oldest model of particle physics was stated more than two thousand years ago. The era of modern particle physics was mainly ushered in by Ernest Rutherford and his scattering experiment at the beginning of the last century [2]. By directing beams of α and β particles at a gold foil and analysing the elastically scattered particles, he discovered the atomic nucleus.

Since this experiment, the methods of modern particle physics have changed only slightly. Scattering experiments are still performed, but the energy of the colliding particles has increased by orders of magnitude and the focus has switched to inelastic scattering processes. Therefore, smaller and smaller structures of composite particles can be resolved and new unstable particles can be created by transforming energy into mass. At present, the collider flagship, the LHC at CERN, collides protons at a centre-of-mass energy of 7 TeV.

Today, physics knows four fundamental forces. All processes involving interactions between two systems can be traced back to either one of these forces or a combination of them. Firstly, there is gravitation, which describes the interaction between massive particles. Secondly, an electromagnetic force acts between electrically charged particles. Thirdly, the weak interaction is responsible for nuclear processes such as beta decays. Finally, there is the strong interaction describing the forces between colour-charged particles such as quarks and gluons. For example, this force holds together the nucleons of the atomic nucleus.

Presently, gravitation can only be described classically by the theory of general relativity. A quantum field theory that fits into the scheme of the Standard Model is required to describe gravitational phenomena at high energies or small distances as well. For current research at the energy scale of colliders, gravitation does not play a significant role, since its strength is negligible in comparison with the other three fundamental forces. Table 1.1 summarises the properties of the gauge bosons that mediate these three forces. Because they carry a spin of 1ħ, they are also called vector bosons.

Interaction        Gauge boson    Mass [GeV/c²]    Range [m]
Electromagnetic    Photon γ       0                ∞
Weak               Z0             91.18            10^-15
Weak               W±             80.40            10^-15
Strong             8 gluons g     0                10^-18

Table 1.1: Gauge bosons and their interactions

In the Standard Model all matter particles are fermions, as their spin equals 1/2 ħ (often the unit ħ is omitted). Table 1.2 summarises the properties of the elementary fermions. Firstly, two groups are distinguished: leptons and quarks. Only quarks take part in strong interactions because leptons do not carry any colour charge. Secondly, all fermions appear in three generations or so-called flavours. These generations are distinguished by the mass of the particles. Stable particles, such as electrons and up and down quarks, belong to the first generation as they have the lightest masses. The charges are a measure of the couplings between the particles and the force carriers and therefore determine the interaction strength, although the couplings also depend on the energy scale of the studied processes. Each fermion has a sibling with opposite charges, which is referred to as its antiparticle.

Fermions    Generation            El. charge    Weak isospin    Colour
            1      2      3
Leptons     νe     νµ     ντ      0             +1/2            0
            e      µ      τ       −e            −1/2            0
Quarks      u      c      t       +2/3 e        +1/2            r, g, b
            d      s      b       −1/3 e        −1/2            r, g, b

Table 1.2: Left-handed elementary fermions. Right-handed elementary fermions do not carry any weak isospin.

In the following, a short introduction to the theoretical framework describing quantum electrodynamics, the weak interaction and its unification with the electromagnetic interaction as well as the Higgs mechanism is presented. The reader is pointed to textbooks for more details, such as [3, 4], or to the summarising article [5]. The chapter concludes with a phenomenological overview of Higgs boson production and decay in the Standard Model and a summary of the latest results in the search for the Higgs boson.


1.1 Quantum Field Theory

In classical mechanics the dynamics of a point-like system is described by a Lagrangian $L(q, \dot{q}) = E_\mathrm{kin} - E_\mathrm{pot}$. The action $S = \int L \, \mathrm{d}t$ is minimised (more precisely, made stationary) by the path the system actually takes between two states, from which an equation of motion, the so-called Euler-Lagrange equation, can be derived.
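The principle of least action can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes a harmonic oscillator with m = k = 1 (a hypothetical choice), discretises the action, and compares the classical path q(t) = sin(t) with perturbed paths that share the same end points. The time interval is kept short, because for long intervals the classical path is only a stationary point of the action and not necessarily a minimum.

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Discretised action S = sum (E_kin - E_pot) * dt for a harmonic oscillator."""
    total = 0.0
    for q0, q1 in zip(path, path[1:]):
        velocity = (q1 - q0) / dt          # finite-difference velocity
        midpoint = 0.5 * (q0 + q1)         # position at the centre of the step
        e_kin = 0.5 * m * velocity ** 2
        e_pot = 0.5 * k * midpoint ** 2
        total += (e_kin - e_pot) * dt
    return total

def make_path(n_steps, t_end, epsilon):
    """Classical solution sin(t) plus a perturbation vanishing at both end points."""
    dt = t_end / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    path = [math.sin(t) + epsilon * math.sin(math.pi * t / t_end) for t in times]
    return path, dt

N_STEPS, T_END = 200, 1.0                  # illustrative values
classical, dt = make_path(N_STEPS, T_END, 0.0)
s_classical = action(classical, dt)
s_perturbed = {eps: action(make_path(N_STEPS, T_END, eps)[0], dt)
               for eps in (-0.1, 0.1)}

for eps, s in s_perturbed.items():
    print(f"epsilon = {eps:+.1f}: S - S_classical = {s - s_classical:.4f}")
```

Both perturbed paths yield a larger action than the classical one, as expected for a short time interval.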

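In quantum field theory particles are described by wave functions $\psi(x)$. These continuous systems are described by a Lagrange density function $\mathcal{L}(\psi, \partial_\mu \psi)$, where $\partial_\mu \psi$ denotes the derivative $\partial\psi / \partial x^\mu$. In turn, the action

$$ S = \int \mathcal{L}(\psi, \partial_\mu \psi) \, \mathrm{d}^4x $$

is minimised, leading to the Euler-Lagrange equation

$$ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \psi)} \right) - \frac{\partial \mathcal{L}}{\partial \psi} = 0 $$

that expresses the equation of motion for the wave function $\psi(x)$ described by the Lagrangian $\mathcal{L}(\psi, \partial_\mu \psi)$. For instance, the Lagrangian for a free fermion with mass $m$ and the corresponding Euler-Lagrange equation are

$$ \mathcal{L} = i \psi^\dagger \gamma^\mu \partial_\mu \psi - m \psi^\dagger \psi \quad\Rightarrow\quad \left( i \gamma^\mu \partial_\mu - m \right) \psi = 0 \tag{1.1} $$

The fermion fields $\psi$ are four-component Dirac spinors and $\psi^\dagger$ denotes the adjoint wave function (the Dirac adjoint, often written $\bar\psi = \psi^\dagger \gamma^0$). The equation of motion is called the Dirac equation. Here, the $\gamma^\mu$ denote the gamma matrices.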
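The defining property of the gamma matrices is the anticommutation relation γ^µ γ^ν + γ^ν γ^µ = 2 g^µν, with the metric g = diag(1, −1, −1, −1). The sketch below checks this relation with plain Python lists. The explicit Dirac representation used here is an assumption made for illustration; the text does not fix a representation, and all helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Multiply every matrix element by the scalar c."""
    return [[c * x for x in row] for row in a]

def block(upper_left, upper_right, lower_left, lower_right):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [ul + ur for ul, ur in zip(upper_left, upper_right)]
    bottom = [ll + lr for ll, lr in zip(lower_left, lower_right)]
    return top + bottom

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Dirac representation (assumed): gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [block(ONE2, ZERO2, ZERO2, scale(-1, ONE2))]
GAMMA += [block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI]

METRIC = [1, -1, -1, -1]

def anticommutator(mu, nu):
    """gamma^mu gamma^nu + gamma^nu gamma^mu."""
    return add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]))

def expected(mu, nu):
    """2 g^{mu nu} times the 4x4 unit matrix."""
    value = 2 * METRIC[mu] if mu == nu else 0
    return [[value if i == j else 0 for j in range(4)] for i in range(4)]

clifford_ok = all(anticommutator(mu, nu) == expected(mu, nu)
                  for mu in range(4) for nu in range(4))
print("Clifford algebra satisfied:", clifford_ok)
```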

1.1.1 Quantum Electrodynamics

Quantum Electrodynamics describes the interaction between electrically charged particles through the exchange of photons as force carriers. For the introduction of interactions between particles, the principle of local gauge invariance has proved to be adequate.

In this theory, global gauge transformations are phase rotations of the wave functions coming from the underlying U(1) symmetry group. They are expected to have no influence on any measurement because only the probability density $|\psi|^2$ can be measured.

Local U(1) gauge transformations of both the wave function and its derivative can be expressed as

$$ \psi \to \psi' = e^{i\alpha(x)} \psi \quad\text{and}\quad \partial_\mu \psi \to \partial_\mu \psi' = e^{i\alpha(x)} \Big( \underbrace{i \psi \, \partial_\mu \alpha(x)}_{\neq 0} + \partial_\mu \psi \Big) $$

depending on the local parameter $\alpha(x)$. The derivative of this parameter yields an additional term in the derivative of the wave function that destroys the invariance of the Lagrangian. A compensating field $A_\mu$ has to be included to conserve the local gauge invariance.

$$ \partial_\mu \to D_\mu = \partial_\mu + i e A_\mu(x) \quad\text{with}\quad A_\mu \to A'_\mu = A_\mu - \frac{1}{e} \partial_\mu \alpha(x) $$
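To see why this replacement restores the invariance, it helps to apply the transformed covariant derivative to the transformed wave function. The following short derivation is a sketch that uses only the transformation rules stated above; the extra term from the derivative of the phase is cancelled by the shift of the field A.

```latex
\begin{align*}
D'_\mu \psi'
  &= \left( \partial_\mu + i e A'_\mu \right) e^{i\alpha(x)} \psi \\
  &= e^{i\alpha(x)} \left( i \, \partial_\mu \alpha(x) + \partial_\mu
       + i e A_\mu - i \, \partial_\mu \alpha(x) \right) \psi \\
  &= e^{i\alpha(x)} \left( \partial_\mu + i e A_\mu \right) \psi
   = e^{i\alpha(x)} D_\mu \psi
\end{align*}
```

The covariant derivative of the wave function therefore transforms with the same phase factor as the wave function itself, so that a term of the form $\psi^\dagger \gamma^\mu D_\mu \psi$ is unchanged by a local phase rotation.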

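The Lagrangian (1.1) has to be completed with terms resulting from the new field $A$.

$$ \mathcal{L}_\mathrm{QED} = \psi^\dagger \left( i \gamma^\mu \partial_\mu - m \right) \psi - j^\mu A_\mu - \frac{1}{4} F_{\mu\nu} F^{\mu\nu} \tag{1.2} $$

$$ \text{with}\quad j^\mu = q \, \psi^\dagger \gamma^\mu \psi \quad\text{and}\quad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu $$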

Here, $j^\mu = (\rho, \mathbf{j})$ denotes the four-current and $A^\mu = (\Phi, \mathbf{A})$ the electromagnetic four-potential as known from classical electrodynamics. The second term expresses the coupling of the fermion with charge $q$ to the field $A$, which can be identified as the photon field. Thus the photon is introduced as the gauge boson mediating the electromagnetic interaction between charged particles. The last term, a kinetic term for the photon field, is added for completeness. A mass term $\sim m_\gamma^2 A_\mu A^\mu$ would spoil the local gauge invariance: the photon is massless.

The Dirac equation as well as the Maxwell equations follow from the QED Lagrangian.

$$ \left( i \gamma^\mu \partial_\mu - m \right) \psi = 0 \quad\text{and}\quad \partial_\mu F^{\mu\nu} = j^\nu \tag{1.3} $$

1.1.2 The Weak Interaction

The weak interaction describes transitions between fermions (either leptons or quarks) of the same flavour through the exchange of W or Z bosons. These transitions are elegantly expressed in terms of a spin formalism. Left-handed fermions are arranged in so-called weak isospin doublets and transitions are described by rotations in the isospin space. The chirality of a particle refers to its helicity, the projection of the spin onto the direction of momentum. Right-handed particles (and left-handed antiparticles) have a positive helicity, whereas left-handed particles (and right-handed antiparticles) are characterised by a negative helicity. For instance, a muon with a third component of its weak isospin of $T_3 = -\frac{1}{2}$ is converted into a muon neutrino with $T_3 = +\frac{1}{2}$ by emitting a W− boson with $T_3 = -1$. Since in the Standard Model right-handed fermions do not couple to weak gauge bosons, their weak isospin is zero. They are therefore regarded as isospin singlets.

1.1.3 Glashow-Weinberg-Salam Theory – The Electroweak Unification

In analogy to the theory of the electromagnetic interaction, attempts were made to describe the weak interaction by a similar theory, a so-called Yang-Mills theory. Sheldon Glashow [6], Abdus Salam [7] and Steven Weinberg [8] formulated a unified theory that is capable of describing the electromagnetic as well as the weak interaction. This introduction focuses on the electroweak interaction.


The formalism of the electroweak interaction is based on local SU(2) ⊗ U(1) gauge transformations. They can be expressed as follows, depending on four local parameters: the three components of $\boldsymbol{\alpha}(x)$ and $\beta(x)$.

$$ \psi_L \to \psi'_L = e^{i \boldsymbol{\alpha}(x) \cdot \frac{\boldsymbol{\sigma}}{2} + i \beta(x) Y} \psi_L \quad\text{and}\quad \psi_R \to \psi'_R = e^{i \beta(x) Y} \psi_R $$

Here, the hypercharge $Y = 2 \left( \frac{q}{e} - T_3 \right)$ denotes the generator of the U(1) symmetry group describing phase rotations, whereas the Pauli matrices $\frac{\boldsymbol{\sigma}}{2}$ are the generators of the SU(2) symmetry group initiating rotations in the isospin space. The left-handed fermion fields are referred to as isospin doublets of Dirac spinors. For example, the leptonic doublet containing a charged lepton $l = e, \mu, \tau$ and its neutrino $\nu_l$ can be written as the following vector.

$$ \psi_L = \begin{pmatrix} \psi_{\nu_l} \\ \psi_l \end{pmatrix}_L $$
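As a quick cross-check of the hypercharge definition Y = 2 (q/e − T3), the sketch below evaluates it for the first-generation fermions, using the charges and weak isospins listed in Table 1.2. The dictionary layout and function name are hypothetical. Both members of a left-handed doublet come out with the same hypercharge, as required for Y to commute with the SU(2) rotations.

```python
from fractions import Fraction as F

def hypercharge(charge, t3):
    """Weak hypercharge Y = 2 (q/e - T3); the charge is given in units of e."""
    return 2 * (charge - t3)

# (electric charge in units of e, third component of the weak isospin),
# values taken from Table 1.2 for left-handed first-generation fermions
LEFT_HANDED = {
    "nu_e": (F(0), F(1, 2)),
    "e": (F(-1), F(-1, 2)),
    "u": (F(2, 3), F(1, 2)),
    "d": (F(-1, 3), F(-1, 2)),
}

Y_LEFT = {name: hypercharge(q, t3) for name, (q, t3) in LEFT_HANDED.items()}

# Right-handed fermions are isospin singlets, so T3 = 0
Y_E_RIGHT = hypercharge(F(-1), F(0))

for name, y in Y_LEFT.items():
    print(f"Y({name}_L) = {y}")
print(f"Y(e_R) = {Y_E_RIGHT}")
```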

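In turn, the local gauge invariance of the Lagrangian (1.1) for free fermions is preserved after four new gauge fields $\boldsymbol{W}_\mu$ and $B_\mu$ have been introduced.

$$ \partial_\mu \to D_\mu = \partial_\mu - i g_2 \frac{\boldsymbol{\sigma}}{2} \cdot \boldsymbol{W}_\mu - i g_1 \frac{Y}{2} B_\mu $$

$$ \boldsymbol{W}_\mu \to \boldsymbol{W}'_\mu = \boldsymbol{W}_\mu - \frac{1}{g_2} \partial_\mu \boldsymbol{\alpha}(x) - \boldsymbol{\alpha}(x) \times \boldsymbol{W}_\mu $$

$$ B_\mu \to B'_\mu = B_\mu - \frac{1}{g_1} \partial_\mu \beta(x) $$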

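The Lagrangian that is invariant under local SU(2) ⊗ U(1) gauge transformations reads as follows.

$$ \mathcal{L}_\mathrm{EWK} = \psi^\dagger \, i D_\mu \gamma^\mu \psi - \frac{1}{4} \boldsymbol{W}_{\mu\nu} \cdot \boldsymbol{W}^{\mu\nu} - \frac{1}{4} B_{\mu\nu} B^{\mu\nu} \tag{1.4} $$

$$ \text{with}\quad \boldsymbol{W}_{\mu\nu} = \partial_\mu \boldsymbol{W}_\nu - \partial_\nu \boldsymbol{W}_\mu \quad\text{and}\quad B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu $$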

Couplings between the fermions and the new gauge bosons are given by the mixed terms with fermion and boson fields originating from the covariant derivative. The coupling constants are denoted by $g_1$ for the couplings to the field $B_\mu$ and by $g_2$ for the couplings to the three fields $\boldsymbol{W}_\mu$. Again, the Lagrangian contains no mass term, since such a term would spoil the gauge invariance.

The actual fields of the corresponding weak and electromagnetic force carriers are linear combinations of the newly introduced fields. The charged W± bosons mediate charged weak couplings, whereas the neutral Z boson is connected to neutral currents. The field $A$ represents the photon γ responsible for the electromagnetic interaction. The Z and photon fields are orthogonal linear combinations parametrised by the mixing angle $\theta_W$, the so-called Weinberg angle.

$$ W^\pm_\mu = \frac{1}{\sqrt{2}} \left( W^1_\mu \mp i W^2_\mu \right) \quad\text{and}\quad \begin{pmatrix} Z_\mu \\ A_\mu \end{pmatrix} = \begin{pmatrix} \cos\theta_W & -\sin\theta_W \\ \sin\theta_W & \cos\theta_W \end{pmatrix} \begin{pmatrix} W^3_\mu \\ B_\mu \end{pmatrix} $$

$$ \text{with}\quad \sin^2\theta_W = \frac{g_1^2}{g_1^2 + g_2^2} \approx 0.23 $$

In an analogous way, the theory of the strong interaction, quantum chromodynamics (QCD), is established. SU(3) symmetry operations describe rotations in the three-dimensional colour space. This symmetry is preserved by introducing eight gluon fields mediating interactions between colour-charged objects, based on a Lagrangian term $\mathcal{L}_\mathrm{QCD}$.

1.1.4 The Higgs Mechanism – Spontaneous Symmetry Breaking

Although the quantum field theory derived from the fundamental principle of local gauge invariance describes three fundamental forces very well, it is not capable of explaining the large masses of the weak gauge bosons W± and Z, which cannot be neglected. Peter W. Higgs [9, 10], Robert Brout and Francois Englert [11] as well as Gerald S. Guralnik, Carl R. Hagen and Tom W. B. Kibble [12] suggested a mechanism of spontaneous electroweak symmetry breaking by which the weak gauge bosons acquire mass. The mechanism is known for short as the Higgs mechanism.

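A new complex scalar field $\Phi$ is introduced, based on the following Lagrangian term, which is invariant under SU(2) ⊗ U(1) gauge transformations.

$$ \mathcal{L}_\mathrm{Higgs} = \left( D_\mu \Phi \right)^\dagger \left( D^\mu \Phi \right) \underbrace{- \mu^2 \, \Phi^\dagger \Phi - \lambda \left( \Phi^\dagger \Phi \right)^2}_{= -V(\Phi)} \quad\text{with}\quad \Phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} \tag{1.5} $$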

The potential, which is known as a Mexican hat potential, contains a mass term with the mass-type constant $\mu^2$ and self-couplings with the positive dimensionless constant $\lambda$. For negative values of $\mu^2$ the potential does not have a single minimum at $\langle\Phi\rangle_0 = 0$, as is the case for $\mu^2 > 0$. Instead, there are multiple ground states $\langle\Phi\rangle_0$ of the scalar field fulfilling the following condition.

$$ \left| \langle\Phi\rangle_0 \right|^2 = -\frac{\mu^2}{2\lambda} \equiv \frac{v^2}{2} $$

A so-called vacuum expectation value $v$ is introduced. Since there is no distinctive ground state, the system chooses one at random. Because of the SU(2) invariance, the ground state can be taken as the following one.

$$ \langle\Phi\rangle_0 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix} $$
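The position of the minimum can be checked numerically. The sketch below scans the potential V = µ² r² + λ r⁴ as a function of r = |Φ| on a grid. The parameter values µ² = −1 and λ = 0.5 are arbitrary illustrative choices in arbitrary units, not physical values, and the function names are hypothetical. For negative µ² the minimum moves away from the origin to r = v/√2, whereas for positive µ² it stays at r = 0.

```python
import math

def potential(r, mu2, lam):
    """Higgs potential as a function of r = |Phi|: V = mu2 * r^2 + lam * r^4."""
    return mu2 * r ** 2 + lam * r ** 4

def scan_minimum(mu2, lam, r_max=3.0, n=30000):
    """Brute-force grid scan for the value of r >= 0 that minimises V."""
    grid = [r_max * i / n for i in range(n + 1)]
    return min(grid, key=lambda r: potential(r, mu2, lam))

MU2, LAM = -1.0, 0.5                      # illustrative values only

r_min = scan_minimum(MU2, LAM)            # numerical minimum for mu^2 < 0
v = math.sqrt(-MU2 / LAM)                 # vacuum expectation value, v^2 = -mu^2 / lambda
r_symmetric = scan_minimum(+1.0, LAM)     # minimum for mu^2 > 0

print(f"numerical minimum : |Phi| = {r_min:.4f}")
print(f"analytic v/sqrt(2): {v / math.sqrt(2):.4f}")
print(f"minimum for mu^2 > 0: |Phi| = {r_symmetric:.4f}")
```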


After the system has fallen into such a ground state, the SU(2) rotation symmetry is no longer apparent. The symmetry is said to be spontaneously broken. After transforming the scalar field $\Phi$ and expressing it in terms of the so-called Higgs field $H$,

$$ \Phi(x) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + H(x) \end{pmatrix} $$

the Lagrangian (1.5) can be expanded around the ground state $\langle\Phi\rangle_0$. By doing so, the kinetic term $\left| D_\mu \Phi \right|^2$ yields mass terms for the vector bosons W± and Z, terms describing the couplings between the weak gauge bosons and the Higgs boson, and kinetic terms for the Higgs field.

$$ \left| D_\mu \Phi \right|^2 = \frac{1}{8} v^2 g_2^2 \left| W^1_\mu + i W^2_\mu \right|^2 + \frac{1}{8} v^2 \left| g_2 W^3_\mu - g_1 B_\mu \right|^2 + \dots = m_W^2 \, W^+_\mu W^{-\mu} + \frac{1}{2} m_Z^2 \, Z_\mu Z^\mu + \dots $$

Therefore, the boson masses can be expressed in terms of the vacuum expectation value $v$. The relationship between the W mass and the Z mass is given by the Weinberg angle.

$$ m_W = \frac{v}{2} g_2 \quad\text{and}\quad m_Z = \frac{v}{2} \sqrt{g_1^2 + g_2^2} \quad\text{with}\quad \frac{m_W}{m_Z} = \frac{g_2}{\sqrt{g_1^2 + g_2^2}} = \cos\theta_W $$
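The mass relation can be evaluated with the boson masses quoted in Table 1.1. The following sketch applies the tree-level relation only; the result comes out slightly below the value of about 0.23 quoted earlier, which is expected because the numerical value of the mixing angle depends on the definition used and on higher-order corrections.

```python
import math

# Boson masses in GeV/c^2 as listed in Table 1.1
M_W = 80.40
M_Z = 91.18

cos_theta_w = M_W / M_Z                       # tree-level relation m_W / m_Z = cos(theta_W)
sin2_theta_w = 1.0 - cos_theta_w ** 2
theta_w_deg = math.degrees(math.acos(cos_theta_w))

print(f"cos(theta_W)   = {cos_theta_w:.4f}")
print(f"sin^2(theta_W) = {sin2_theta_w:.4f}")
print(f"theta_W        = {theta_w_deg:.1f} degrees")
```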

In turn, the expansion of the potential terms yields a mass term for the Higgs boson.

$$ V = \frac{1}{2} \mu^2 (v + H)^2 + \frac{1}{4} \lambda (v + H)^4 \quad\Rightarrow\quad m_H = \sqrt{-2\mu^2} = v \sqrt{2\lambda} $$

The vacuum expectation value can be expressed in terms of the measured vector boson masses and the Weinberg angle, but one free parameter, i.e. the Higgs mass $m_H$, is not predicted by the theory and therefore has to be measured. From theoretical considerations, merely an approximate upper bound of $m_H < 650$ GeV/c² can be derived [5].

By this mechanism, only the SU(2) symmetry is spontaneously broken and the weak force carriers acquire mass. Since the U(1) symmetry remains exact, the photon is still massless. The same is true for the massless gluons as the gauge fields of QCD.

1.1.5 Yukawa Interaction

The fermion masses are neither described by the electroweak Lagrangian (1.4), since this would mix up left-handed and right-handed terms, nor are they introduced by the Lagrangian describing the scalar field (1.5). However, these masses can be described in a similar manner as the vector boson masses.


The coupling between the fermion fields $\psi$ and the scalar field $\Phi$ is expressed by a Yukawa interaction Lagrangian. For leptons it reads as follows, where the second expression holds after spontaneous symmetry breaking and is written in terms of the remaining Higgs field $H$.

$$ \mathcal{L}_\mathrm{Yukawa} = -\lambda_f \, \psi^\dagger \Phi \psi = -\frac{1}{\sqrt{2}} \lambda_f \, (v + H) \, \psi^\dagger \psi \tag{1.6} $$

This Lagrangian introduces both the fermion masses $m_f$ and their couplings to the Higgs field $g_{Hff}$.

$$ m_f = \frac{v}{\sqrt{2}} \lambda_f \quad\text{and}\quad g_{Hff} = \frac{m_f}{v} $$

Since the parameters $\lambda_f$ are not fixed, the fermion masses appear as additional parameters of the theory. The Yukawa coupling of the fermions to the Higgs boson is proportional to their mass.
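The proportionality between coupling and mass can be made concrete with a small numerical sketch based on the relation g_Hff = m_f / v. The vacuum expectation value of about 246 GeV and the muon mass are well-known values that are assumptions here, since neither is quoted in this section; the τ mass is the one given later in this chapter. The function and variable names are hypothetical.

```python
V_GEV = 246.0                       # assumed vacuum expectation value in GeV
MASSES_GEV = {
    "mu": 0.10566,                  # assumed muon mass in GeV/c^2
    "tau": 1.777,                   # tau mass in GeV/c^2
}

def higgs_coupling(mass_gev, v_gev=V_GEV):
    """Coupling of a fermion to the Higgs field, g_Hff = m_f / v."""
    return mass_gev / v_gev

couplings = {name: higgs_coupling(m) for name, m in MASSES_GEV.items()}
ratio = couplings["tau"] / couplings["mu"]

for name, g in couplings.items():
    print(f"g_H{name}{name} = {g:.2e}")
print(f"coupling ratio tau/mu = {ratio:.1f}")
```

The ratio of the couplings equals the ratio of the masses and does not depend on the assumed value of v, which illustrates why the Higgs boson couples far more strongly to τ leptons than to muons.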

In summary, the full Lagrangian of the Standard Model comprises the following four terms. Note that the terms have to be summed over all fermions in all generations. In this overview these sums have been omitted.

$$ \mathcal{L}_\mathrm{SM} = \mathcal{L}_\mathrm{EWK} + \mathcal{L}_\mathrm{Higgs} + \mathcal{L}_\mathrm{Yukawa} + \mathcal{L}_\mathrm{QCD} $$

1.2 Experimental Verification

As an impressively successful theory, the Standard Model has been verified by many particle physics experiments and its parameters have been precisely measured. Its last missing element is the Higgs boson, which is determined by one parameter that can be taken to be its mass. With a discovery of the Higgs boson and the examination of its properties, i.e. its couplings to fermions and bosons, the theory of electroweak symmetry breaking could be confirmed and the origin of mass explained.

Therefore, scattering experiments at colliders have been designed to search directly for the Higgs boson. In the following, a phenomenological overview of Higgs boson production at proton-proton colliders and its decay is given. Additionally, recent results of direct and indirect searches for the Higgs boson are briefly presented, before an outlook on theories extending the Standard Model is given.

1.2.1 Standard Model Higgs Boson Production and Decay

At the LHC the main production mode for Higgs bosons is gluon fusion (Figure 1.1a). Since the Higgs boson couples only to the mass of particles and gluons are massless, the production is mediated by a loop of heavy quarks.

The cross section for the second important production mode, vector boson fusion (Figure 1.1b), is about one order of magnitude smaller than the one for gluon fusion. This process provides an opportunity to discriminate such Higgs events from background events, because forward jets from the proton remnants are produced, which leave low central event activity in the so-called rapidity gap in between.

Figure 1.1: The four main leading-order Higgs production modes at hadron colliders: (a) gluon fusion, (b) vector boson fusion, (c) Higgs strahlung, (d) associated production.

Especially for light Higgs masses, two other processes contribute significantly: Higgs strahlung (Figure 1.1c) together with a W or Z boson and associated production (Figure 1.1d) together with two heavy quarks.

The neutral Higgs boson can also be produced via quark-antiquark annihilation, similar to Z boson production. Since at a proton-proton collider the antiquark has to be a sea quark, this production mode is suppressed at the LHC. At the proton-antiproton collider Tevatron this was the most important production mode.

Figure 1.2 shows the cross sections for different Standard Model processes as a function of the centre-of-mass energy √s for hadron colliders. It clearly discloses the low expected production rates for Higgs events compared to already established Standard Model processes such as W and Z boson production. Therefore, it is challenging to select Higgs events cleanly and to distinguish them from a huge background.
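A cross section translates into an event rate once it is multiplied by the instantaneous luminosity. The sketch below shows this conversion for the luminosity of 10^33 cm^-2 s^-1 that Figure 1.2 uses for its event-rate axis. The cross-section values in the example are hypothetical placeholders chosen only to illustrate the scaling; they are not read off the figure.

```python
NB_TO_CM2 = 1e-33                     # 1 nb = 1e-9 barn = 1e-33 cm^2

def event_rate_hz(sigma_nb, luminosity_cm2_s):
    """Expected event rate in events per second: rate = sigma * L."""
    return sigma_nb * NB_TO_CM2 * luminosity_cm2_s

LUMINOSITY = 1e33                     # cm^-2 s^-1, as used for the rate axis of Figure 1.2

# Hypothetical placeholder cross sections in nb, for illustration only
EXAMPLE_SIGMAS_NB = {"process A": 10.0, "process B": 0.01}

for name, sigma in EXAMPLE_SIGMAS_NB.items():
    rate = event_rate_hz(sigma, LUMINOSITY)
    print(f"{name}: sigma = {sigma} nb -> {rate:.3g} events/s")
```

At this luminosity, a cross section of 1 nb corresponds to one event per second, so the cross section in nb and the rate in Hz are numerically equal.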

Since the Higgs boson couples to the mass of elementary particles, it preferably decays into the heaviest particle-antiparticle pairs that are kinematically allowed. Therefore, light Higgs bosons primarily decay into $b\bar{b}$ quark or τ+τ− lepton pairs. The decay into pairs of gluons, which is mediated via a heavy-quark loop, is experimentally not accessible due to the large irreducible QCD background, although its branching ratio is comparably large, too. Higgs events with b jets in the final state require a sophisticated b-tagging strategy due to the large hadronic background. This increases the importance of studying Higgs bosons decaying into pairs of τ leptons, as presented in this thesis.

Also for low masses the investigation of Higgs bosons decaying into pairs of photons(via a quark loop) is very powerful. Although the branching ratio is relatively small,this decay mode grains from the fact, the it is almost only contaminated by the wellknown Drell-Yan background which can be subtracted easily.

Above the WW and ZZ mass thresholds the decay of heavy Higgs bosons is dominated by these pairs of vector bosons. Figure 1.3 depicts the branching ratios of the different decay modes as a function of the Higgs mass and shows the decay width of the Higgs particle.

Figure 1.2: Standard Model cross sections as a function of the collider energy [13]. [Plot of proton-(anti)proton cross sections σ (nb) and event rates (events/sec for L = 10³³ cm⁻² s⁻¹) versus √s (TeV), with curves for σtot, σb, σjet, σW, σZ, σt and σHiggs; the Tevatron and LHC energies are marked.]

Decay into Pairs of τ Leptons

In the search for light Higgs bosons, which are favoured by the most recent analyses (see Section 1.2.2), the decay into pairs of τ leptons is very important. One reason is the comparably high branching ratio (see Figure 1.3a): about 5 to 10 % of all Higgs bosons with masses below 135 GeV/c² decay into pairs of tauons. Another reason is the well-known irreducible background from Z boson decays, whereas other background processes can be suppressed with little effort. This issue is taken up in Chapter 5.

Figure 1.3: Standard Model Higgs boson decay branching ratios and width [14]. [Panels: (a) branching ratios and (b) decay width ΓH in GeV, each as a function of MH in GeV.]

Tauons, having a mass of mτ = 1777 MeV/c², are the heavy siblings of the lighter leptons, electrons and muons. Since they decay only weakly, high-energy τ leptons can fly mean distances of c∆t = 87 µm before they decay. This makes it possible to resolve secondary decay vertices distinct from the primary production vertex for specific decay modes. Due to their mass, tauons do not only decay leptonically; a majority of about 65 % decays hadronically into so-called τ jets. Table 1.3 shows the most prominent τ decay modes together with their branching fractions.

| Decay Mode | Resonance | Branching Ratio [%] |
|---|---|---|
| τ− → e− νe ντ | | 17.8 |
| τ− → µ− νµ ντ | | 17.4 |
| τ− → π− ντ | π(140) | 11.6 |
| τ− → π− π0 ντ | ρ(770) | 26.0 |
| τ− → π− π0 π0 ντ | a1(1260) | 10.8 |
| τ− → π− π+ π− ντ | a1(1260) | 9.8 |
| τ− → π− π+ π− π0 ντ | | 4.8 |
| Other hadronic modes | | 1.7 |
| All hadronic modes | | 64.8 |

Table 1.3: Leptonic and hadronic τ decay modes together with their branching ratios [15, 16]. Although the pions represent the majority of the hadronic decays, the suppressed decay into kaons is possible too, which is not included in the table.
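The mean flight distance quoted above can be put into numbers with a short calculation. The sketch below assumes that the 87 µm is the proper decay length cτ, so that the mean distance in the laboratory frame is βγ·cτ; the 50 GeV energy in the example is a hypothetical value chosen only for illustration.

```python
import math

M_TAU_GEV = 1.777   # tau mass from the text, in GeV/c^2
C_TAU_UM = 87.0     # mean decay length from the text, taken here as c*tau in micrometres


def mean_flight_distance_um(energy_gev: float) -> float:
    """Mean lab-frame flight distance beta*gamma*c*tau of a tau lepton."""
    if energy_gev <= M_TAU_GEV:
        raise ValueError("energy must exceed the tau rest mass")
    gamma = energy_gev / M_TAU_GEV
    beta_gamma = math.sqrt(gamma**2 - 1.0)  # equals |p| / m
    return beta_gamma * C_TAU_UM


# Hypothetical example: a tau lepton with an energy of 50 GeV
distance_um = mean_flight_distance_um(50.0)
print(f"mean flight distance: {distance_um / 1000.0:.2f} mm")
```

Under these assumptions a 50 GeV tauon travels a few millimetres on average, which illustrates why secondary vertices can be resolved with a sufficiently precise tracking detector.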

Figure 1.4 shows the Feynman diagrams for leptonic and hadronic τ decays at leading order. Two neutrinos in the leptonic and one in the hadronic mode carry away energy and momentum that cannot be directly measured with a detector such as CMS. This makes a full reconstruction of the four-momentum of the original tauon impossible. Hadronic τ decays may lead to resonances in the spectrum of the visible decay products. The spin of such a resonance has an impact on the decay kinematics due to conservation of angular momentum.

Figure 1.4: Feynman diagrams for the (a) leptonic and (b) hadronic decay of the τ lepton.

The branching ratios listed above lead to the branching ratios for the decays of ττ systems as listed in Table 1.4. In this thesis, decays into pairs of muons (see Chapter 5) and into one muon and a τ jet (see Chapter 4) are studied.

| Decay Mode | Branching Ratio [%] |
|---|---|
| ττ → µ + µ + νννν | 3.0 |
| ττ → e + e + νννν | 3.2 |
| ττ → e + µ + νννν | 6.2 |
| ττ → µ + τhad + ννν | 22.5 |
| ττ → e + τhad + ννν | 23.1 |
| ττ → τhad + τhad + νν | 42.0 |

Table 1.4: Branching ratios for decays of ττ pairs. There are three fully leptonic, two semi-leptonic/hadronic and one fully hadronic decay mode.
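The step from the single-τ branching ratios to those of ττ pairs can be sketched as follows. The code assumes that both tauons decay independently, so that the pair fraction is the product of the single fractions, with a factor of two for mixed final states; the function name is illustrative only.

```python
from itertools import combinations_with_replacement

# Single-tau branching fractions taken from Table 1.3
BR = {"e": 0.178, "mu": 0.174, "had": 0.648}


def ditau_branching_ratios(br):
    """Return {(mode1, mode2): fraction} for all unordered pairs of decay modes."""
    result = {}
    for a, b in combinations_with_replacement(sorted(br), 2):
        # A mixed final state can arise in two ways (either tau takes mode a)
        multiplicity = 1 if a == b else 2
        result[(a, b)] = multiplicity * br[a] * br[b]
    return result


ditau = ditau_branching_ratios(BR)
for modes, fraction in sorted(ditau.items(), key=lambda item: item[1]):
    print(f"{modes[0]:>3} + {modes[1]:<3}: {100 * fraction:5.1f} %")
```

Up to rounding in the last digit, the printed values agree with Table 1.4, and the six fractions add up to one because the three single-τ fractions do.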

1.2.2 Experimental Results – Higgs Exclusion Limits

In experimental Higgs analyses direct and indirect searches for the Higgs boson are distinguished. Indirect measurements allow for probing the theory for much higher Higgs masses than are directly accessible.

Since the masses of the vector bosons W and Z, which have been precisely measured at the lepton collider LEP and also at the Tevatron, depend on the Higgs boson mass via quantum loop corrections, it is possible to constrain the Higgs mass from fits to the vector boson masses. Such fits to electroweak precision measurements [17] yield a most probable Higgs mass of $m_H = 94^{+29}_{-24}$ GeV/c² at 68 % confidence level. At 95 % confidence level this analysis can exclude Higgs masses above 152 GeV/c². The resulting χ² quantity of the fit is shown in the so-called blueband plot depicted in Figure 1.5.

Figure 1.5: Result of the fit for the Higgs mass based on electroweak precision measurements obtained at LEP and the Tevatron [17]. Mass intervals excluded by direct searches are marked in yellow. The remaining interval is compatible with the 2σ-interval resulting from the fit, i.e. ∆χ² < 4.

By direct searches for the Higgs boson at LEP and the Tevatron, masses below 114.4 GeV/c² and between 156 GeV/c² and 177 GeV/c² can be excluded at 95 % confidence level.

At the LHC direct searches are performed by the two general-purpose detectors ATLAS [18] and CMS [19]. In the years 2010 and 2011, both experiments collected data corresponding to an integrated luminosity of roughly 5 fb⁻¹ and searched for direct observations of the Higgs boson within a mass range of 110 GeV/c² to 600 GeV/c². Exclusion limits for a combination of all studied channels are shown in Figure 1.6. The upper limits are expressed in terms of Higgs boson production cross sections σ normalised to the expected Standard Model cross section σSM. Within intervals where the observed limits are below 1, the existence of the Standard Model Higgs boson can be excluded at a confidence level of at least 95 %. In the remaining regions both experiments observe slight excesses in comparison to the expectation for the absence of a Higgs signal. The ATLAS experiment observes its largest excess at mH ≈ 126 GeV/c² with a local significance of 2.5σ, whereas the CMS experiment sees a 3.1σ excess at a Higgs mass of 124 GeV/c².

Based on new data collected in 2012 it will be possible to discover the Standard Model Higgs boson in this low mass region or fully exclude it.
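The exclusion criterion described above (observed limit on σ/σSM below 1) can be sketched in a few lines. The limit values in the example are made up purely for illustration and are not measured results; the helper name is hypothetical, and the sketch only looks at the scanned mass points without interpolating between them.

```python
def excluded_intervals(masses, limits):
    """Return mass intervals [(low, high), ...] where the limit on sigma/sigma_SM is below 1.

    `masses` must be sorted in ascending order; `limits` holds the observed
    95 % CL upper limit on sigma/sigma_SM at each mass point.
    """
    intervals = []
    start = None
    previous = None
    for mass, limit in zip(masses, limits):
        if limit < 1.0:
            if start is None:
                start = mass          # an excluded interval begins here
        elif start is not None:
            intervals.append((start, previous))
            start = None              # the excluded interval ended at the previous point
        previous = mass
    if start is not None:
        intervals.append((start, previous))
    return intervals


# Made-up illustration values, NOT measured limits
masses = [110, 120, 130, 140, 150, 160, 170]
limits = [1.8, 1.4, 0.9, 0.6, 0.4, 0.7, 1.2]
print(excluded_intervals(masses, limits))
```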

Figure 1.6: Expected and observed upper limits on the Standard Model Higgs boson production cross section as a function of the Higgs boson mass based on the full analyses of the ATLAS collaboration (left) [18] and the CMS collaboration (right) [19]. The right plot also shows the exclusion determined at previous collider experiments and at the CMS experiment. [Axes: 95 % CL limit on σ/σSM versus Higgs boson mass (GeV); CMS: L = 4.6-4.8 fb⁻¹, √s = 7 TeV.]

1.2.3 Limitations of the Standard Model

Up to now there is no experimental observation that disproves the Standard Model of particle physics. All experiments whose outcomes can be predicted by the Standard Model confirm the theory with an impressive accuracy. As a prominent example, the electroweak precision measurements at LEP [17] should be mentioned here.

However, there are experimental observations that cannot be described by the Standard Model. For example, the renowned cosmic microwave background mapped by the WMAP satellite disclosed that the fraction of baryonic matter, which is described by the Standard Model, amounts to only 4.56 %, whereas the dominating fraction consists of dark matter (22.7 %) and dark energy (72.8 %) that cannot be particles from the Standard Model [20]. Additionally, there are open questions from a theoretical point of view, such as possible grand unified theories (GUT) [21].

A simple and promising extension to the Standard Model is the so-called Minimal Supersymmetric Standard Model (MSSM) [21]. For every Standard Model particle a partner is predicted which differs in spin by 1/2 and interacts only weakly with known matter. An additional discrete symmetry, the so-called R-parity, enforcing lepton and baryon number conservation, leads to massive stable supersymmetric particles which are possible dark matter candidates.

This model also introduces five (instead of one) Higgs particles for mass generation through a local symmetry breaking mechanism: two CP-even neutral Higgs bosons h and H, a pseudoscalar A boson and a pair of charged scalar particles H±. However, there are only two new parameters to be measured in the MSSM Higgs sector. They can be chosen as the mass mA and the ratio of two vacuum expectation values tanβ. All five Higgs masses can be expressed in terms of these two parameters.

The study of neutral MSSM Higgs bosons decaying into pairs of τ leptons is very promising, since in this supersymmetric model the Higgs couplings to down-type fermions are enhanced proportional to tanβ. Therefore this decay mode is even more interesting in the higher Higgs mass region where the Standard Model Higgs boson mainly decays into pairs of massive vector bosons. The production in association with b quarks is enhanced as well, which motivates a study of final states with b jets resulting from the associated quarks. In particular, the two dominant production modes are the gluon fusion via a b quark loop (see Figure 1.1a) and the production in association with a bb quark pair (see Figure 1.1d).

Figure 1.7 shows the parameter space of tanβ versus mA as the two free parameters in the MSSM Higgs sector, including the latest expected and observed upper limits on tanβ as a function of mA at 95 % confidence level for searches for the neutral MSSM Higgs bosons at the CMS experiment with respect to different decay modes in the ττ decay channel.

Figure 1.7: Expected and observed upper limits on tanβ as a function of mA at 95 % confidence level for searches for the neutral MSSM Higgs bosons at the CMS experiment with respect to semi-leptonic decay modes [22] (left) and muonic decay modes [23] (right) in the ττ decay channel. Excluded regions in the MSSM Higgs sector parameter space spanned by tanβ and mA are coloured. The left plot also contains boundaries obtained at LEP. [Both plots: tanβ versus mA in the MSSM mhmax scenario with MSUSY = 1 TeV at √s = 7 TeV; left: L = 4.6 fb⁻¹; right: CMS Preliminary, 4.5 fb⁻¹, τµτµ channel.]

In this thesis, the three neutral supersymmetric Higgs bosons h, H and A will be subsumed under H for simplicity, since they cannot be distinguished in ττ searches in the low mass region.


2 The CMS Experiment at the LHC

2.1 The Large Hadron Collider (LHC)

The LHC is a proton-proton collider hosted by CERN¹ situated near Geneva (Switzerland) at the French border. The hadron collider delivers particle collisions at the highest center-of-mass energies reached so far. With an achieved center-of-mass energy of 7 TeV it surpasses the Tevatron collider near Chicago (USA) with a center-of-mass energy of up to 1.96 TeV. A detailed technical introduction is given in the LHC Design Reports [24–26].

Before bunches of protons or ions are injected into the 27 km long LHC ring, they pass several connected pre-accelerators as shown in Figure 2.1. In the ring accelerators the bunches gain their energy from microwave radiation in cavities. The cavities also ensure a longitudinal stability of the bunches. The bending is performed by superconducting dipole magnets which reach magnetic fields of up to 8.3 T in the LHC ring. The radial focussing of the beam particles is performed by quadrupole and sextupole magnets.

Hadron colliders are well suited for the exploration of new energy regimes. By using protons, which have a much higher mass than the electrons used at lepton accelerators, the synchrotron radiation loss is limited to a reasonable amount. However, the constituents of the protons, quarks and gluons, which contribute to the hard interaction processes, carry a variable fraction of the proton momentum. Therefore the center-of-mass energy of the scattering particles is not as well defined a quantity as in collisions of leptons and varies over a wide range.

The LHC provides four major experiments with collision events at four interaction points. Both ATLAS² [28] and CMS³ [29] are general-purpose particle detectors. ALICE⁴ [30] focuses on studying heavy ion collisions and the resulting quark-gluon plasma, whereas the asymmetric detector LHCb⁵ [31] concentrates on the CP violation in B hadrons.

¹ Conseil Européen pour la Recherche Nucléaire (engl.: European Organisation for Nuclear Research)
² A Toroidal LHC Apparatus
³ Compact Muon Solenoid
⁴ A Large Ion Collider Experiment

Figure 2.1: The CERN accelerator complex [27] showing the pre-accelerators and the LHC ring.

An important performance quantity of a collider is the instantaneous luminosity ℒ. It is a measure for the number of particles colliding at a specific area per time interval and is simply based on machine parameters such as the geometry of the beam, the number of bunches and the number of particles per bunch. Together with the cross section σ for a specific process, the event rate dN/dt reads

$$\frac{\mathrm{d}N}{\mathrm{d}t} = \sigma \mathcal{L} \quad \text{and} \quad L = \int \mathcal{L}\,\mathrm{d}t \tag{2.1}$$

The integrated luminosity L is a quantity measuring the amount of data accumulated by a collider experiment. The design luminosity of ℒ = 10³⁴ cm⁻² s⁻¹ has not been achieved yet; like the center-of-mass energy upgrade to 14 TeV, it requires another technical stop.
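As a minimal numerical sketch of Equation 2.1, the following code converts a cross section and a luminosity into an event rate and an expected event count. The function names and the 1 nb example cross section are illustrative assumptions; only the unit conversions and the luminosity values come from the text.

```python
NB_TO_CM2 = 1e-33   # 1 nanobarn = 1e-33 cm^2
FB_TO_NB = 1e-6     # 1 femtobarn = 1e-6 nanobarn


def event_rate_hz(cross_section_nb, inst_lumi_cm2_s):
    """dN/dt = sigma * (instantaneous luminosity), sigma in nb, luminosity in cm^-2 s^-1."""
    return cross_section_nb * NB_TO_CM2 * inst_lumi_cm2_s


def expected_events(cross_section_nb, int_lumi_fb_inv):
    """N = sigma * (integrated luminosity), sigma in nb, luminosity in fb^-1."""
    cross_section_fb = cross_section_nb / FB_TO_NB
    return cross_section_fb * int_lumi_fb_inv


# Hypothetical process with a cross section of 1 nb
print(event_rate_hz(1.0, 1e33))   # rate at 1e33 cm^-2 s^-1
print(event_rate_hz(1.0, 1e34))   # rate at the design luminosity
print(expected_events(1.0, 5.0))  # events in roughly 5 fb^-1
```

A cross section of 1 nb corresponds to about one event per second at 10³³ cm⁻² s⁻¹, which is the convention used for the right-hand axis of Figure 1.2.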

2.2 The Compact Muon Solenoid (CMS)

CMS is one of the two general-purpose detectors designed to investigate particle physics up to the energy of 1 TeV. The main task is to study the electroweak symmetry breaking by directly searching for the predicted Higgs boson (see Section 1.1.4). Additionally, the Standard Model has to be probed at the TeV energy scale. Further goals are searches for possible processes that could extend the Standard Model, such as supersymmetry or extra dimensions. As one of the biggest detectors ever built, CMS has been designed to achieve an excellent muon detection, a good di-photon mass resolution and a spatial tracking resolution that enables the reconstruction of secondary vertices of tauons and b quarks.

⁵ Large Hadron Collider beauty

The high demands from physicists and the technological feasibility required an elaborate construction. The cooling, the mechanical stability and the radiation hardness have been big challenges.

The basic structure is depicted in Figure 2.2: in the barrel region sub-detectors are arranged in layers around the collision point. Endcaps with the same sub-detectors cover these tubes. A solenoid magnet encloses the tracking detector and the calorimeters. This requires a compact design. Outside the coil is the muon system. A brief overview of these components is given in the following subsections. More detailed information can be found in the CMS Physics Technical Design Reports [33, 34] and a more recent official documentation [29]. In the following all numbers are taken from these sources unless indicated otherwise. Figure 2.3 shows the different kinds of particle interactions with the detector components that are also mentioned in the following.

In total, CMS measures 21.6 m in length, 14.6 m in diameter and about 12500 t in weight. About 10⁸ data channels can be read out every 25 ns.

2.2.1 Coordinate System

Figure 2.4: The coordinate system of the CMS detector. [Sketch showing the x, y and z axes, the distance r, the angles ϑ and ϕ, the beam axis and the nominal collision point.]

Figure 2.2: Three-dimensional schematic view of CMS [32].

Figure 2.3: Slice through the CMS detector showing signatures of different types of particles traversing the various sub-detectors [32]. [Transverse slice with, from the inside out, silicon tracker, electromagnetic calorimeter, hadron calorimeter, superconducting solenoid and iron return yoke interspersed with muon chambers; tracks are shown for an electron, a charged hadron (e.g. pion), a muon, a photon and a neutral hadron (e.g. neutron).]

The CMS coordinate system has its origin in the nominal collision point. In Cartesian coordinates, the x axis points radially inward with respect to the LHC ring, the y axis points vertically upward and the z axis points along the beam axis. Together, the three axes form a right-handed coordinate system. In spherical coordinates r denotes the distance from the origin of the coordinate system, whereas ϕ is the azimuthal angle in the x-y plane measured from the x axis and ϑ is the polar angle in the y-z plane measured from the beam line. ϕ ranges from −π to +π. Usually, the polar angle ϑ is translated into the dimensionless pseudorapidity η defined as

$$\eta = -\ln \tan\left(\frac{\vartheta}{2}\right)$$

The pseudorapidity yields the value 0 for points in the x-y plane with ϑ = π/2 and increases while ϑ decreases to 0. In comparison to the rapidity y, the pseudorapidity only takes into account the flight direction of a particle given by its three-momentum and is therefore independent of the particle energy and mass. Angular distances between two points can then be quoted according to the following formula:

$$\Delta R = \sqrt{\Delta\varphi^2 + \Delta\eta^2}$$
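The two definitions above translate directly into code. The following sketch uses illustrative function names; wrapping the azimuthal difference into [−π, π) is an added assumption that follows from the stated range of ϕ and is not spelled out in the text.

```python
import math


def pseudorapidity(theta):
    """eta = -ln(tan(theta / 2)) for a polar angle 0 < theta < pi (in radians)."""
    return -math.log(math.tan(theta / 2.0))


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta phi^2 + Delta eta^2)."""
    return math.hypot(delta_phi(phi1, phi2), eta1 - eta2)


print(pseudorapidity(math.pi / 2))    # direction perpendicular to the beam
print(pseudorapidity(0.1))            # direction close to the beam line
print(delta_r(0.0, 3.0, 0.0, -3.0))   # two directions on either side of phi = +-pi
```

Without the wrapping step, two directions at ϕ = 3.0 and ϕ = −3.0 would appear to be almost 2π apart although they are close neighbours in azimuth.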

Longitudinal directions (index L) always refer to the beam line. Therefore, the x-y plane is denoted as the transverse plane (index T). For example, the transverse momentum $p_T = \sqrt{p_x^2 + p_y^2}$ and the missing transverse energy $E_T^{\mathrm{miss}}$ coming from the transverse energy imbalance are common variables.
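A minimal sketch of both transverse quantities is given below. It assumes that the missing transverse energy is estimated as the magnitude of the negative vector sum of the transverse momenta of all visible objects, which is a common convention but is not defined in detail here; the function names and the example momenta are purely illustrative.

```python
import math


def transverse_momentum(px, py):
    """p_T = sqrt(px^2 + py^2)."""
    return math.hypot(px, py)


def missing_transverse_energy(visible):
    """Magnitude and azimuth of the negative vector sum of visible transverse momenta.

    `visible` is an iterable of (px, py) pairs in GeV.
    """
    visible = list(visible)  # allow generators to be passed in
    sum_px = sum(px for px, _ in visible)
    sum_py = sum(py for _, py in visible)
    met_x, met_y = -sum_px, -sum_py
    return math.hypot(met_x, met_y), math.atan2(met_y, met_x)


# Illustrative event with three visible objects (values in GeV)
objects = [(30.0, 0.0), (-10.0, 0.0), (0.0, -15.0)]
met, met_phi = missing_transverse_energy(objects)
print(transverse_momentum(3.0, 4.0), met, met_phi)
```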

2.2.2 Silicon Tracking Detector

Tracks of charged particles originating from the collision point are detected as hits in the inner silicon tracking system. A p-n junction operated in reverse direction has a region with almost no free charges between the p-type and the n-type semiconductor. An ionising particle passing through this zone causes a measurable current. That is the functional principle of each silicon tracking cell [35].

The beam pipe is surrounded by three cylindrical layers of pixel detectors at radii between 4.4 and 10.2 cm that are covered by two layers of disc pixel modules at each side. Each pixel provides three-dimensional position information per hit. The high spatial resolution ensures the capability to precisely measure particle momenta and to reconstruct secondary decay vertices. For example, the transverse impact parameter resolution of high-pT tracks with a value of 10 µm is dominated by the pixel size of 100 × 150 µm² in r-ϕ and z.

Adjacent to the pixel detector, multiple silicon strip detectors are installed up to an outer radius of r = 116 cm, a longitudinal distance of |z| = 282 cm and a maximum pseudorapidity of |η| = 2.5. These components only provide two-dimensional information. Therefore the single layers are tilted towards each other in order to obtain three-dimensional information with multiple modules, unless the particle flux is small enough to avoid ambiguities. In total, 66 million pixels and 9.3 million strips are read out by this sub-detector, which covers an overall area of about 200 m². A schematic view of the sub-detector is shown in Figure 2.5.

This innermost sub-detector is exposed to the high particle flux near the collision point. Although all modules are designed for this environment, radiation damage reduces its lifetime to about one decade. Another problem is the high electrical power density. The necessary cooling system reduces the volume of active detector material and leads to additional unwanted interaction of the particles with matter.

2.2.3 Electromagnetic Calorimeter

In the electromagnetic calorimeter mainly photons and electrons are stopped. By producing electromagnetic showers they deposit their energy in the dense absorber material. Incoming electrons radiate photons via bremsstrahlung, whereas photons themselves produce pairs of electrons and positrons as long as their energy lies above the pair production threshold. Therefore a shower of secondary particles spreads out over the calorimeter. Scintillators are then used to detect these particles. Their material is excited by high-energy photons and radiates scintillation light that can be detected by photodiodes. The energy of the incoming particle is proportional to the number of generated photons. Therefore the statistically induced energy uncertainty scales as √E [35].

Figure 2.5: The inner silicon tracking system of CMS [29], shown in the r-z plane (in mm) with lines of constant η. The pixel detector (PIXEL) is surrounded by the strip tracker. As parts of this component the Tracker Inner Barrel and Disks (TIB/TID) are followed by the Tracker Outer Barrel (TOB) and the Tracker EndCaps (TEC).
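The √E scaling of the statistical energy uncertainty stated for the electromagnetic calorimeter follows from Poisson statistics of the photon count. A minimal sketch, assuming a purely statistical contribution and a hypothetical light yield of 1000 detected photons per GeV (this number is chosen for illustration only and is not a CMS parameter):

```python
import math


def relative_stat_resolution(energy_gev, photons_per_gev):
    """sigma_E / E from Poisson fluctuations of the photon count N = k * E."""
    n_photons = photons_per_gev * energy_gev
    sigma_n = math.sqrt(n_photons)   # Poisson: sigma_N = sqrt(N)
    return sigma_n / n_photons       # equals 1 / sqrt(k * E)


K = 1000.0  # hypothetical number of detected photons per GeV, illustration only
for energy in (1.0, 10.0, 100.0):
    rel = relative_stat_resolution(energy, K)
    print(f"E = {energy:6.1f} GeV  sigma_E/E = {rel:.4f}  sigma_E = {rel * energy:.3f} GeV")
```

The absolute uncertainty grows like √E, while the relative uncertainty falls like 1/√E, so the measurement becomes relatively more precise at higher energies.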

CMS uses homogeneous lead tungstate (PbWO4) crystals both as absorber with a high density and a short radiation length and as scintillator material. This allows a compact calorimeter. The small Molière radius results in a high granularity of the sub-detector, and the scintillation decay time is short enough to detect approximately 80 % of the light within the 25 ns between the proton bunches. 61200 crystals in the central barrel part of the detector and 7324 in each of the endcaps surround the collision point hermetically up to |η| = 3. The scintillation light is read out by avalanche photodiodes (APDs) in the barrel and vacuum phototriodes (VPTs) in the endcaps. Additionally, a pre-shower sampling calorimeter is installed in front of the endcap crystals to reject neutral pions. Figure 2.6 shows a schematic view of the sub-detector.

The main goal of the design of the electromagnetic calorimeter was the capability to detect the decay of a possible Higgs boson into two photons with a good energy resolution.

2.2.4 Hadronic Calorimeter

Hadronic calorimeters are used to measure the energy of hadrons. Their functional principle is the same as that of electromagnetic calorimeters. Hadronic particles, mainly in jets, produce showers in dense absorber materials, depositing their energy before they are stopped. The deposited energy can be measured with scintillators and photodiodes.

Figure 2.6: Schematic view of the electromagnetic calorimeter of CMS [33] with the barrel ECAL (EB), the endcap ECAL (EE) and the preshower (ES). The dashed lines indicate polar angles in terms of the pseudorapidity η.

CMS uses a hadronic sampling calorimeter where brass absorber plates alternate with plastic scintillators. Only a small fraction of the energy of an incoming particle is deposited in the scintillator and gets measured. To account for this effect, the energy has to be corrected. It also causes a worse energy resolution compared with the homogeneous electromagnetic calorimeter. The absorber thickness in the barrel region lies between 5.82 and 10.6 interaction lengths depending on the polar angle, while the electromagnetic calorimeter adds about 1.1 interaction lengths. Since not all high-energy hadrons can be stopped within the volume limited by the surrounding magnet coil, an additional outer calorimeter detects tails of these showers. The CMS calorimeters are completed by a forward calorimeter placed 11.2 m away from the beam crossing, which is used to measure the instantaneous luminosity. A schematic view of the sub-detector is shown in Figure 2.7.

On account of the large spatial coverage of the calorimeter, the total energy deposit in the detector can be measured well, which helps to draw conclusions about the missing transverse energy originating from undetected neutrinos or exotic particles.

2.2.5 Superconducting Solenoid

A superconducting solenoid magnet enclosing the tracking detector and the electromagnetic and inner hadronic calorimeters provides a longitudinal magnetic field with a maximum field strength of 4 T. The magnetic field is needed to bend the tracks of charged particles such as muons. The curvature of a charged track which is bent by the magnetic field indicates the momentum of the particle. Therefore a high momentum resolution needs both a good spatial resolution of the tracking detectors and a high magnetic field strength.

Figure 2.7: The CMS detector with respect to the hadronic calorimeter [29]. The hadron barrel (HB) and endcap (HE) calorimeters are located inside the magnet coil, whereas the outer (HO) calorimeter measures the tails of high-energy showers outside the coil. The forward calorimeter (HF) is placed 11.2 m away from the beam crossing. The dashed lines indicate polar angles in terms of the pseudorapidity η.

Due to the compact design of the inner sub-detectors, these could be placed inside the magnetic coil with an inner radius of about 3 m and a length of 12.5 m. The magnetic flux is returned by a 10000-t steel yoke.
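The relation between track curvature and momentum in the solenoid field can be illustrated with the standard formula pT [GeV/c] ≈ 0.3 · |q| · B [T] · R [m] for a particle moving in a homogeneous field. The sketch below assumes a perfectly homogeneous 4 T field; the function names and the 10 GeV example are illustrative and not taken from the thesis.

```python
K = 0.299792458  # GeV/c per (T * m) for a particle with unit charge


def radius_from_pt_m(pt_gev, b_tesla, charge=1):
    """Bending radius R = pT / (0.3 * |q| * B) in metres."""
    return pt_gev / (K * abs(charge) * b_tesla)


def pt_from_radius_gev(radius_m, b_tesla, charge=1):
    """Transverse momentum pT = 0.3 * |q| * B * R in GeV/c."""
    return K * abs(charge) * b_tesla * radius_m


# Hypothetical muon with pT = 10 GeV/c in a 4 T field
radius = radius_from_pt_m(10.0, 4.0)
print(f"bending radius: {radius:.2f} m")
```

The bending radius of such a track is larger than the coil radius of about 3 m, so only a small arc is observed; this is why the momentum resolution depends on both the field strength and the spatial resolution of the tracker.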

2.2.6 Muon System

All stable particles except for muons and undetectable neutrinos are shielded by the inner detector components and the solenoid magnet. Thus, a tracking detector specialised for muons is installed outside the magnet coil between the yoke elements. Low noise levels result from this shielding effect. Since a large surface has to be covered, a cheaper solution than for the innermost tracking detector is selected: gaseous particle detectors. Traversing charged particles (here mostly muons) ionise gas atoms. The charge carriers are then collected at wires and measured as a current.

CMS uses three different kinds of gaseous particle detectors covering an angular interval of |η| < 2.4 in total. The barrel region is equipped with drift tubes, whereas the endcaps use cathode strip chambers that are capable of working in inhomogeneous magnetic fields. Additional fast resistive plate chambers are used for triggering. A schematic view of the sub-detector is shown in Figure 2.8.

The momentum resolution of the muon system itself (about 9 % for muons with pT ≤ 200 GeV) can be improved by one order of magnitude in combination with the inner tracking system. Muons are objects of high interest, since they appear in many final states and can be detected with high efficiencies. One reason for the special efforts invested in the muon system is the so-called golden Higgs decay channel H → ZZ → µµµµ.

Figure 2.8: The CMS detector with respect to the muon system [33], shown in the R-Z plane (in cm) with lines of constant η. The barrel drift tubes (DT) and the endcap cathode strip chambers (CSC) are completed by resistive plate chambers (RPC). Iron yoke components intersect the muon system to return the magnetic flux of the solenoid.

2.2.7 Data Acquisition

At a bunch crossing rate of 40 MHz with about 20 proton-proton interactions per bunch crossing (in case of achieving the LHC design luminosity of 10³⁴ cm⁻² s⁻¹) the collected detector information has to be reduced dramatically before recording. One bunch crossing produces about 1 MB of uncompressed data. The majority of low-energy scattering processes are uninteresting for most physics analyses and can be discarded. Several fast triggers reduce the data rate to a recordable size.

At the first stage, the Level-1 trigger is implemented in hardware. Based on calorimeter information and information from the muon system it reduces the event rate from 40 MHz to 100 kHz and about 100 GB/s. Events passing this trigger are processed by a second-stage trigger called High Level Trigger (HLT). Since this trigger is implemented in software, it is able to evaluate the full detector information. Additionally, this trigger is configurable and its performance can simply be increased by adding more computing power. The High Level Trigger reduces the event rate by three orders of magnitude to reduce the amount of data that has to be stored to a technically processible size. About 400 events per second are recorded after they are processed by computer farms. Figure 2.9 illustrates the data acquisition system of CMS.
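The data rates implied by these numbers can be checked with simple arithmetic. The sketch below uses only the rounded values quoted above (40 MHz, 100 kHz, about 400 events per second and about 1 MB per event); the names are illustrative.

```python
EVENT_SIZE_BYTES = 1e6  # about 1 MB of uncompressed data per bunch crossing

# (stage name, event rate in Hz), rounded values as quoted in the text
STAGES = [
    ("bunch crossings", 40e6),
    ("after Level-1 trigger", 100e3),
    ("after High Level Trigger", 400.0),
]


def data_rate_bytes_per_s(rate_hz, event_size_bytes=EVENT_SIZE_BYTES):
    """Data rate for a given event rate and event size."""
    return rate_hz * event_size_bytes


def reduction_factors(stages):
    """Rate reduction achieved between consecutive stages."""
    return [stages[i][1] / stages[i + 1][1] for i in range(len(stages) - 1)]


for name, rate in STAGES:
    print(f"{name:<26} {rate:12.0f} Hz  {data_rate_bytes_per_s(rate) / 1e9:10.1f} GB/s")
print(reduction_factors(STAGES))
```

With these rounded inputs the Level-1 output corresponds to the quoted 100 GB/s, and the recorded stream amounts to roughly 0.4 GB/s. The High Level Trigger reduction comes out as a factor of 250, i.e. between two and three orders of magnitude.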

Such huge amounts of data obviously require sophisticated offline storage and processing. High energy physicists chose a distributed computing infrastructure called the Worldwide LHC Computing Grid (WLCG) [36, 37]. Computing centres, referred to as sites, are located all over the world and share both data and computing power. The system is organised in a tiered structure. More details about the CMS computing model can be found in [38, 39].

Figure 2.9: The CMS data acquisition system [29]. [Diagram labels: Detector Front-Ends, Level-1 Trigger, Readout Systems, Event Manager, Builder Network (100 GB/s), Control and Monitor, Filter Systems, Computing Services; event rates of 40 MHz, 10^5 Hz and 10^2 Hz.]

The sole Tier-0 site is located at CERN. Raw data coming from the data acquisition systems are stored there on tape archives together with the first reconstructed samples. This site also distributes the data to eleven Tier-1 sites [40], which are large computing centres responsible for storing copies of the data as well as for the large-scale reprocessing and skimming of the data. Fast network connections enable them to exchange data with the Tier-2 sites. The about 140 Tier-2 sites pool local resources for user analyses. Besides grid-based analysis, they are supposed to provide resources for event simulation. Finally, Tier-3 sites form the last stage. These are small clusters that do not have to fulfil any obligations towards the WLCG and are mainly used for user analyses. A schematic view of the tier structure for CMS is shown in Figure 2.10. Access to grid resources is mediated by so-called middleware, a software layer that allows a standardised interaction between user software and the inhomogeneous grid hardware.

Figure 2.10: The WLCG tier structure for CMS [40]. [Diagram labels: Tier 0; Tier 1 sites RAL (United Kingdom), ASGC (Taiwan), CNAF (Italy), PIC (Spain), CC-IN2P3 (France), FNAL (United States) and GridKa (Germany); Tier 2/3 sites including Aachen and DESY; Tier 2 sites Warsaw and CSCS; Tier 3 site Karlsruhe.]


2.3 Simulation, Reconstruction and Software

2.3.1 Monte Carlo Event Generators

Pythia is a general-purpose event generator used for hadron and lepton collisions [41]. Pythia simulates the complete scattering of particles with a partonic substructure. Therefore, the results are comparable to those of a hypothetical ideal detector.

Starting with particles having a partonic substructure described by parton distribution functions, Pythia describes the gluon radiation of colour-charged objects in the initial state by a parton shower model. After that, two partons participate in the hard process. The calculations are based on a library containing descriptions of various $2 \to 1/2/3$ processes. Final state radiation is considered as well as the hadronisation of colour-charged objects in the final state. The latter is based on the so-called Lund string model [42]. In addition, remnants of the initial hadrons not participating in the hard process are described by the underlying event model of Pythia. Even multiple partonic interactions are possible.

Since detailed aspects of certain processes may be treated better by other packages, Pythia allows some calculations to be replaced by plugins. One example important for this thesis is the package Tauola [43], which enables Pythia to handle spin and polarisation effects of decaying tau leptons.

MadGraph is a general-purpose matrix-element based event generator [44]. In contrast to Pythia, only the hard process is simulated. Interfaces to event generators such as Pythia are available.

For given sets of initial and final state particles, MadGraph computes all possible Feynman diagrams and generates code to calculate the matrix elements. MadGraph is able to deal with every renormalizable or effective $2 \to n$ theory that is based on a Lagrangian. The included tool MadEvent can then be used to generate events.

2.3.2 Detector Simulation

After generating pure events originating from proton-proton collisions, the detector response has to be simulated in order to yield outputs that are compatible with real recorded data. Two different methods for the detector simulation are available. Both take inputs from arbitrary event generators via the so-called HepMC file interface and produce outputs ready for the reconstruction algorithms that are also used for measured data.

Full Simulation: the whole detector geometry is modelled in detail by the GEANT4 [45] simulation toolkit. This enables the simulation of the passage of particles through the detector material and of their interactions with it. Besides this physical simulation of the detector behaviour, the toolkit also covers the digitisation step, in which the read-out electronics are simulated.

Fast Simulation: the procedure of the full simulation can be accelerated by simplifying the detector geometry. In addition, measured quantities are simply smeared according to distributions obtained from studies of recorded data or fully simulated events. The method is referred to as FastSim for short.

The computing time can be shortened by three orders of magnitude by using this algorithm instead of the full simulation. Typically, one event is simulated in about one second. Since the results are tuned to the fully simulated samples, the agreement with the full simulation is at the acceptable level of a percent or below.
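The smearing idea behind a fast simulation can be illustrated with a toy example. The sketch below is not the FastSim implementation: it assumes a purely Gaussian response with a relative resolution of 2 %, whereas the text states that the actual distributions come from data or fully simulated events. The function name `smear` and all numbers are illustrative.

```python
import math
import random


def smear(true_value, relative_resolution, rng):
    """Return a toy 'measured' value drawn from a Gaussian around the true value."""
    return rng.gauss(true_value, relative_resolution * true_value)


rng = random.Random(42)      # fixed seed so that the toy is reproducible
true_pt = 40.0               # GeV, illustrative generator-level transverse momentum
resolution = 0.02            # assumed 2 % relative resolution

smeared = [smear(true_pt, resolution, rng) for _ in range(10000)]
mean_pt = sum(smeared) / len(smeared)
width_pt = math.sqrt(sum((pt - mean_pt) ** 2 for pt in smeared) / len(smeared))

print(f"mean = {mean_pt:.2f} GeV, width = {width_pt:.2f} GeV")
```

The smeared sample is centred on the generated value with a width close to the assumed resolution times the true value (0.8 GeV here), without any particle having been propagated through detector material.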

2.3.3 Reconstruction of Physical Objects

Before physical objects such as muons with a certain track and momentum can be used in final analyses, the information from many detector cells, i.e. tracker hits and calorimeter energy deposits, has to be combined. This process is referred to as the reconstruction of events.

Muon Reconstruction and Identification

The muon reconstruction is performed in two steps. At first, a local reconstruction only takes the information from the muon system into account. After taking seeds from hits in the innermost muon chamber, the state vectors (track position, momentum and direction) of the candidates are propagated from the inside out. Only small regions of interest have to be considered. A comparison with the hits in the next layer results in an update of the state vectors until the outermost layer is reached.

Secondly, a global reconstruction is performed based on candidates (seeds) from the local reconstruction. Here, hits in the inner tracking system that are compatible with tracks extrapolated from the muon system candidates are taken into account. A global fit combines all information and improves the resolution of the muon momentum compared to the standalone local version.

While the muons are propagated through the various detector components, their energy loss in the material, the effect of multiple scattering, and the non-uniform magnetic field in the muon system are taken into account.

Muons originating from Z or τ decays are considered to be isolated from hadronic activity since no colour-charged particles appear in these decays. Consequently, isolation criteria are applied to muon candidates to reject muons that accompany QCD processes. The energy deposited and the number of additional tracks in a so-called isolation cone around the muon track are required to fall below certain thresholds.
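An isolation requirement of this kind can be sketched in a few lines. The example below is a simplified illustration and not the CMS isolation algorithm: it uses the summed transverse momentum of additional tracks as a stand-in for the deposited energy, and the cone size of 0.3 as well as both thresholds are assumed values. The dictionary layout of the track objects is hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def is_isolated(muon, other_tracks, cone_size=0.3, max_rel_pt_sum=0.1, max_tracks=2):
    """Toy isolation: few additional tracks with little pT inside the cone.

    cone_size, max_rel_pt_sum and max_tracks are illustrative assumptions.
    """
    in_cone = [
        t for t in other_tracks
        if delta_r(muon["eta"], muon["phi"], t["eta"], t["phi"]) < cone_size
    ]
    rel_pt_sum = sum(t["pt"] for t in in_cone) / muon["pt"]
    return rel_pt_sum < max_rel_pt_sum and len(in_cone) <= max_tracks


muon = {"pt": 30.0, "eta": 0.0, "phi": 3.1}
quiet_event = [
    {"pt": 1.0, "eta": 0.05, "phi": -3.1},   # close to the muon across the phi wrap
    {"pt": 25.0, "eta": 1.5, "phi": 0.0},    # hard track far away from the muon
]
jet_like_event = quiet_event + [
    {"pt": 5.0, "eta": 0.1, "phi": 3.0},
    {"pt": 4.0, "eta": -0.1, "phi": 3.05},
]

print(is_isolated(muon, quiet_event), is_isolated(muon, jet_like_event))
```

The first track lies inside the cone although its azimuthal angle has the opposite sign, which is why the difference in ϕ has to be wrapped before the distance is computed.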


Tauon Jet Reconstruction and Identification

Hadronically decaying tau leptons as well as other hadronic processes produce collimated streams of hadrons. These streams are referred to as jets. Reconstruction software has to cluster the individual detector signals into physical objects.

Different algorithms exist for this purpose. Cone-type algorithms define cones with a certain opening angle R in (η, ϕ) around a high-energy track. The cone axis is corrected iteratively by merging tracks in the cone until a certain precision is reached. Cluster-type algorithms group objects by taking certain distance measures into account. In addition to the geometrical distance between objects, these can also involve momentum or energy information.

Due to the large background of QCD-induced jets at a hadron collider, jets originating from hadronically decaying tauons have to be selected carefully. One characteristic difference between tau jets and quark or gluon induced jets is the narrow cone size, since only few hadrons form a tau jet. This allows isolation criteria to be applied to the reconstructed jets in order to select tau jets, in analogy to the muon isolation criteria. Additionally, more sophisticated approaches are available that use neural networks to analyse kinematic information of the decay products (TaNC⁶) or that extend the idea of the isolation criterion using hypotheses about the tau decay signature (HPS⁷) [46].

2.3.4 Software Frameworks

Large amounts of data require sophisticated tools to perform the data analysis. The analysis software development is an important issue for experimenters.

ROOT

Most frequently, the analysis code in high energy physics uses the software framework ROOT [47], which has mainly been developed at CERN. This object-oriented framework provides a great variety of classes that cover almost every step of the analysis procedure, from event generation and detector simulation through event reconstruction and data acquisition to the data analysis. For the first four areas the CMS experiment provides its own framework CMSSW on top of ROOT. Data structures such as n-tuples and trees, which allow access to the data event by event, are used for final analyses. Histograms with various functionalities including fitting routines as well as visualisation capabilities are also among the most frequently used classes.

The CMSSW Application Framework

The CMS collaboration provides a software framework called CMSSW. This software, based on the ROOT framework, covers all parts of the analysis from data taking or generation to final studies on small sets of selected events.

⁶ Tau Neural Classifier
⁷ Hadron Plus Strips classifier


CMSSW is characterised by a modular architecture that enables a plug-in mechanism for three types of modules. So-called producers generate new physics objects and add them to the data of single events. The data acquisition system and Monte Carlo event generators are examples of such producers. Filters handle triggering and the selection of events, and analysers are used for final analyses as they are specialised in producing histograms and graphs. Besides the modules there are services and utility toolkits that can be accessed within every module.

Via configuration scripts, various predefined modules can be added to an execution chain and are then executed consecutively. User-written modules can simply be added as each kind of module follows a strict design. A noteworthy advantage of this plug-in system is that the use of configuration scripts allows configurations to be changed easily without any modification of the software itself.
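The interplay of producers, filters and analysers in an execution chain can be mimicked with a small toy model. The following sketch is plain Python and deliberately does not use the CMSSW API: the event dictionaries, the module names and the `run_path` function are hypothetical and only illustrate the control flow, in which a filter that rejects an event stops the chain for that event.

```python
def muon_producer(event):
    """Producer: add a new collection to the event (muons above an assumed pT threshold)."""
    event["goodMuons"] = [pt for pt in event["muonPts"] if pt > 20.0]
    return True  # producers never reject an event


def dimuon_filter(event):
    """Filter: keep only events with at least two good muons."""
    return len(event["goodMuons"]) >= 2


class CountingAnalyzer:
    """Analyser: here it only counts the events that reach it."""

    def __init__(self):
        self.n_events = 0

    def __call__(self, event):
        self.n_events += 1
        return True


def run_path(modules, events):
    """Execute the modules consecutively; stop the chain if a module returns False."""
    for event in events:
        for module in modules:
            if not module(event):
                break


analyzer = CountingAnalyzer()
path = [muon_producer, dimuon_filter, analyzer]  # plays the role of the configuration
events = [{"muonPts": [25.0, 30.0]}, {"muonPts": [25.0, 10.0]}, {"muonPts": [50.0, 22.0, 5.0]}]
run_path(path, events)
print(analyzer.n_events)
```

Changing the analysis then amounts to editing the list `path`, which corresponds to changing the configuration rather than the modules themselves.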


3 Multivariate Analysis Techniques

This chapter gives an overview of the statistical methods used for the analyses presented in this thesis. Detailed descriptions can be found in statistics textbooks such as [48–50].

Two main statistical issues will be presented in this thesis: the determination of quantities that are capable of discriminating H → ττ events from background events and the mass reconstruction of ττ final states. Both tasks require statistical methods, so-called multivariate analysis techniques, that can handle multiple variables to arrive at a decision or to yield a numerical result.

3.1 Discriminating Between Two Classes of Events

For the search for the Higgs boson it is crucial to be able to separate the very few Higgs signal events from the large number of background events. Discriminating variables have to be identified that express the differences between signal and background processes. One important example of such a discriminating variable for the classification of Higgs events is the invariant mass of the decay products.

Here, the multivariate analysis techniques come into play. The aim is to combine the information contributed by various variables into one single discriminating variable that is then used to classify the events.

3.1.1 Determination of Test Statistics

The discriminating quantities introduced in this section are referred to as test statistics or discriminators. In order to avoid confusion with arbitrary discriminating variables, the term test statistic is used exclusively in this section. An ideal test statistic is characterised by the following properties:

• It is calculable for every event independently.

• It clusters in different intervals for the different event classes.

• It acts as a distance measure to enable a meaningful discrimination between the different event classes, i.e. events that are distinctly separable yield significantly different values of the discriminator.

Owing to their statistical character, actual test statistics generally do not fulfil all of these properties for all events.

Based on $n$ discriminating quantities $\mathbf{x}$, the function

$$ t : \mathbb{R}^n \to \mathbb{R}, \quad \mathbf{x} \mapsto t(\mathbf{x}) $$

has to be determined as a discriminating variable. This procedure is often called training or teaching of a method and is performed on a training sample. A sample containing the true class information, the so-called target, is required for the training. After the training procedure has been performed, the function can be applied to the observables of the actual measurement.

In the following, two approaches for determining test statistics, the likelihood ratio method and artificial neural networks, are introduced as they are used for the analysis presented in this thesis. Several other methods such as Fisher's linear discriminant analysis or boosted decision trees exist, too [48].

Likelihood Method

Knowing the marginal signal and background probability density functions (pdfs) $p_{\mathrm{sig},i}(x_i)$ and $p_{\mathrm{bkg},i}(x_i)$ for each discriminating variable $x_i$, a likelihood quantity can be determined as a product of the $n$ signal probabilities based on the single variables according to the following equation.

$$ t(\mathbf{x}) = L(x_1, \ldots, x_n) = \prod_{i=1}^{n} \frac{p_{\mathrm{sig},i}(x_i)}{p_{\mathrm{sig},i}(x_i) + p_{\mathrm{bkg},i}(x_i)} \tag{3.1} $$

After normalising this quantity, it equals the signal probability density for a vector of discriminating variables $\mathbf{x}$. As the test statistic is proportional to the signal probability, it yields higher values for signal-like events than for background-like events. Because it uses marginal distributions, this common approach does not take correlations between the variables $x_i$ into account. Such likelihood quantities are the most powerful discriminating quantities for uncorrelated variables as they minimise the probabilities for wrong decisions (see Section 3.1.2). This is stated by the Neyman-Pearson lemma [51].

To benefit from the Neyman-Pearson lemma in the case of correlated variables, it is possible either to decorrelate the variables or to use a multi-dimensional pdf that does not only contain marginal information. In practice, the weakness of the latter often lies in poor statistical precision due to low event numbers in some regions of the variable phase space.
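Equation 3.1 translates directly into code. The following minimal sketch assumes, purely for illustration, Gaussian marginal pdfs with unit width centred at +1 for signal and at -1 for background in each of two variables; in an analysis the marginal pdfs would instead be estimated from training samples. The function names are chosen here and do not refer to any particular package.

```python
import math


def gauss_pdf(x, mean, sigma):
    """Unit-normalised Gaussian density, used here as an assumed marginal pdf."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood_statistic(x, sig_pdfs, bkg_pdfs):
    """Product over all variables of p_sig,i / (p_sig,i + p_bkg,i), as in Eq. (3.1)."""
    t = 1.0
    for x_i, p_sig, p_bkg in zip(x, sig_pdfs, bkg_pdfs):
        s, b = p_sig(x_i), p_bkg(x_i)
        t *= s / (s + b)  # undefined if both pdfs vanish at x_i
    return t


# Two variables with identical, assumed marginal pdfs.
sig_pdfs = [lambda v: gauss_pdf(v, +1.0, 1.0)] * 2
bkg_pdfs = [lambda v: gauss_pdf(v, -1.0, 1.0)] * 2

t_signal_like = likelihood_statistic((1.0, 1.0), sig_pdfs, bkg_pdfs)
t_ambiguous = likelihood_statistic((0.0, 0.0), sig_pdfs, bkg_pdfs)
t_background_like = likelihood_statistic((-1.0, -1.0), sig_pdfs, bkg_pdfs)
print(t_signal_like, t_ambiguous, t_background_like)
```

An event exactly between the two hypotheses contributes a factor of 1/2 per variable, so the statistic is 1/4 for two variables, while signal-like events move towards 1 and background-like events towards 0.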


Artificial Neural Networks

Highly non-linear discrimination between event classes can be achieved by employing artificial neural networks [52]. The basic structure of a multilayer perceptron type neural network is depicted in Figure 3.1.

Figure 3.1: Schematic representation of the functionality of artificial neural networks. A three-layer multilayer perceptron with one hidden layer and with one output node (red) for classification and multiple output nodes for regression purposes (see Section 3.2) is shown. [Diagram labels: input layer $\mathbf{x}$ with nodes $x_1, \ldots, x_9$ and a bias node of value 1; weights $w_{ij}$; hidden layer $h(\mathbf{x}, w)$ with activation function $s(x)$; weights $w'_{ij}$; output layer (target) $t(\mathbf{x}, w, w')$ with activation function $s(x)$.]

The network consists of several layers of nodes which are connected, in analogy to the neurons in a brain. The information is passed through the layers of the network and weighted to yield a decision. The weighted inputs of each neuron determine its output. An activation function applied to each node value supports the numerical stability and allows a non-linear response to be introduced.

Intermediate layers are called hidden layers and the last layer of nodes is the output layer. For classification purposes it is sufficient to use only one output node. For this special case, the one-dimensional output of a three-layered network can be expressed according to the following formula.

$$ t(\mathbf{x}) = s\Bigl( \sum_{i=1}^{m} w'_i \cdot \underbrace{s\Bigl( w_{0,i} + \sum_{j=1}^{n} w_{ji} \cdot x_j \Bigr)}_{h_i(\mathbf{x})} \Bigr) \tag{3.2} $$

where $w_{ji}$ denotes the first and $w'_i$ the second layer of weights, $h_i(\mathbf{x})$ are the values of the $m$ hidden layer nodes and $s(x)$ indicates the activation function. Bias nodes with constant values of 1 (entering with the weights $w_{0,i}$) are sometimes also introduced in order to enable simple shifts in the target range. A basic activation function, the symmetric sigmoid function, is shown in Figure 3.2. This function depends strongly on the original node value in a small interval around zero and maps numerical outliers into the bounded output range.

Figure 3.2: Symmetric sigmoid functions are common choices for activation functions at neural network nodes. [Plot: activation function $s(x) = \frac{2}{1 + e^{-x}} - 1$ versus the node value $x$, rising from $-1.0$ to $1.0$ and passing through $0.0$ at $x = 0.0$.]

During the training process the weights are optimised by minimising a distance measure between the net output $t(\mathbf{x})$ and the true target value $T$ for a set of training events of size $N$. One approach is to minimise a quadratic loss function

$$ \chi^2 = \frac{1}{2} \sum_{i=1}^{N} \bigl( t(\mathbf{x}_i) - T_i \bigr)^2 . $$

The number of weights to be determined grows quickly with the number of nodes and the number of hidden layers. Thus, a high-dimensional minimisation has to be performed. The algorithm which is mainly used for this purpose is the backpropagation algorithm [53].
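The network output of Equation 3.2 and the quadratic loss can be written down compactly. The sketch below implements only the forward pass and the loss in plain Python; the weights are set by hand for illustration and no backpropagation is performed. The symmetric sigmoid follows the formula shown in Figure 3.2; targets of +1 and -1 are an assumption made here to match its output range, and the argument names are chosen for this example.

```python
import math


def sigmoid(x):
    """Symmetric sigmoid s(x) = 2 / (1 + exp(-x)) - 1 with values between -1 and 1."""
    return 2.0 / (1.0 + math.exp(-x))  - 1.0


def forward(x, w_in, w_bias, w_out):
    """Output of a three-layer network: w_in[j][i] connects input j to hidden node i."""
    m = len(w_out)
    hidden = [
        sigmoid(w_bias[i] + sum(w_in[j][i] * x[j] for j in range(len(x))))
        for i in range(m)
    ]
    return sigmoid(sum(w_out[i] * hidden[i] for i in range(m)))


def quadratic_loss(samples, targets, w_in, w_bias, w_out):
    """Half the sum of squared differences between net output and target."""
    return 0.5 * sum(
        (forward(x, w_in, w_bias, w_out) - target) ** 2
        for x, target in zip(samples, targets)
    )


samples = [(1.0, 1.0), (-1.0, -1.0)]   # one signal-like and one background-like event
targets = [1.0, -1.0]                  # assumed targets matching the sigmoid range

zero_weights = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [0.0, 0.0])
hand_tuned = ([[1.0, -1.0], [1.0, -1.0]], [0.0, 0.0], [1.0, -1.0])

loss_untrained = quadratic_loss(samples, targets, *zero_weights)
loss_hand_tuned = quadratic_loss(samples, targets, *hand_tuned)
print(loss_untrained, loss_hand_tuned)
```

With all weights set to zero the network returns 0 for every event, so the loss is 1.0 for the two example events; the hand-tuned weights separate the two events and give a smaller loss. A training algorithm such as backpropagation would search for weights that reduce the loss further.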

3.1.2 Hypothesis Testing

This section addresses the question of how to determine the class to which single events belong, according to previously stated hypotheses. Again, two classes of signal and background events are studied with respect to a one-dimensional discriminating variable as the only basis for the decision. The following explanations concentrate on test statistics, as introduced in the previous section, used as discriminating variables. There are then two abstract hypotheses: $H_{\mathrm{sig}}$ for signal events and $H_{\mathrm{bkg}}$ for background events. Considering only simple hypotheses, the expectations for signal and background events in terms of the discriminating variable are fully known as the pdfs $p_{\mathrm{sig}}(t)$ and $p_{\mathrm{bkg}}(t)$, depicted by way of example in Figure 3.3. The pdfs are separately normalised to unity.

Figure 3.3: Event distributions following the signal (red) and background (black) hypotheses, $p(t \mid H_{\mathrm{sig}}) \equiv p_{\mathrm{sig}}(t)$ and $p(t \mid H_{\mathrm{bkg}}) \equiv p_{\mathrm{bkg}}(t)$, shown as probability densities $p(t)$ of the test statistic $t$. The blue line shows a possible cut $t_{\mathrm{cut}}$ on the test statistic and therefore divides all events into four groups: correctly or falsely accepted or discarded events with regard to the alternative hypothesis $H_1$ (here $H_{\mathrm{sig}}$). Usually, one selects events above the cut.

To classify events, a cut value $t_{\mathrm{cut}}$ has to be defined. Events with $t$-values above the cut value are taken as signal events. All other events are assumed to belong to the class of background events; these events are said to be discarded with respect to the signal hypothesis.

Depending on the selection defined by the cut value, the certainty of the decision can be expressed in terms of several quantities based on the underlying pdfs $p_{\mathrm{sig}}(t)$ and $p_{\mathrm{bkg}}(t)$ and the numbers of expected signal and background events $S$ and $B$. Since the hypotheses overlap, every choice of the selection cut is a compromise.

The expected numbers of selected or discarded events $S_{\mathrm{sel/disc}}$ and $B_{\mathrm{sel/disc}}$ as functions of $t_{\mathrm{cut}}$ are determined by integrating over the pdfs.

$$ S_{\mathrm{sel}}(t_{\mathrm{cut}}) = S \int_{t_{\mathrm{cut}}}^{+\infty} p_{\mathrm{sig}}(t)\,\mathrm{d}t \quad\text{and}\quad S_{\mathrm{disc}}(t_{\mathrm{cut}}) = S \int_{-\infty}^{t_{\mathrm{cut}}} p_{\mathrm{sig}}(t)\,\mathrm{d}t $$

$$ B_{\mathrm{sel}}(t_{\mathrm{cut}}) = B \int_{t_{\mathrm{cut}}}^{+\infty} p_{\mathrm{bkg}}(t)\,\mathrm{d}t \quad\text{and}\quad B_{\mathrm{disc}}(t_{\mathrm{cut}}) = B \int_{-\infty}^{t_{\mathrm{cut}}} p_{\mathrm{bkg}}(t)\,\mathrm{d}t $$

Several quality measures for the selections can be stated based on these event numbers. Firstly, two different types of errors are defined: decisions where background events are identified as signal events yield the so-called type 1 error. The type 2 error is a measure of the number of falsely rejected signal events.

$$ P_{\text{type 1 error}}(t_{\mathrm{cut}}) = \frac{B_{\mathrm{sel}}}{B} \quad\text{and}\quad P_{\text{type 2 error}}(t_{\mathrm{cut}}) = \frac{S_{\mathrm{disc}}}{S} $$

$$ P_{\text{right decision}}(t_{\mathrm{cut}}) = \frac{B_{\mathrm{disc}} + S_{\mathrm{sel}}}{S + B} $$

The probability $P_{\text{type 1 error}}$ is called the significance level. The power of the hypothesis test is the probability $(1 - P_{\text{type 2 error}})$. The probability $P_{\text{right decision}}$ depends on the relative composition of signal and background in the studied sample. Additional quality measures are listed in the following.

Signal purity $\wp_{\mathrm{sig}}$: Fraction of selected signal events compared to all selected events.

$$ \wp_{\mathrm{sig}}(t_{\mathrm{cut}}) = \frac{S_{\mathrm{sel}}}{S_{\mathrm{sel}} + B_{\mathrm{sel}}} $$

This quantity is the probability that a selected event is a signal event. The preferred value is $\wp_{\mathrm{sig}} = 1$, resulting in a selection of only signal events. Since purities depend on the relative composition of signal and background in the studied sample, it is more straightforward to examine efficiencies instead.

Signal efficiency $\varepsilon_{\mathrm{sig}}$: Fraction of selected signal events compared to all signal events.

$$ \varepsilon_{\mathrm{sig}}(t_{\mathrm{cut}}) = \frac{S_{\mathrm{sel}}}{S} = \frac{S_{\mathrm{sel}}}{S_{\mathrm{sel}} + S_{\mathrm{disc}}} = 1 - P_{\text{type 2 error}} $$

This quantity is the probability that a signal event is selected. The preferred value is $\varepsilon_{\mathrm{sig}} = 1$, resulting in a selection of all signal events.

Background misidentification probability $\varepsilon_{\mathrm{bkg}}$: Fraction of selected background events compared to all background events.

$$ \varepsilon_{\mathrm{bkg}}(t_{\mathrm{cut}}) = \frac{B_{\mathrm{sel}}}{B} = \frac{B_{\mathrm{sel}}}{B_{\mathrm{sel}} + B_{\mathrm{disc}}} = P_{\text{type 1 error}} $$

This quantity is the probability that a background event is selected and therefore misidentified as a signal event. The preferred value is $\varepsilon_{\mathrm{bkg}} = 0$, resulting in a selection of no background events.

The so-called background rejection $1 - \varepsilon_{\mathrm{bkg}}$ can be used instead of the background misidentification probability.

Significances $\mathcal{S}$: These quantities are particularly important for assessing the composition of the sample in terms of desired signal events and contaminating background events. The Poisson-type significance of the hypothesis with both signal and background events compared to the background-only hypothesis is given by the following equation.

$$ \mathcal{S} = \sqrt{2 \ln\left( \frac{L_{\mathrm{sig+bkg}}}{L_{\mathrm{bkg}}} \right)} = \sqrt{2\,(S_{\mathrm{sel}} + B_{\mathrm{sel}}) \ln\left( 1 + \frac{S_{\mathrm{sel}}}{B_{\mathrm{sel}}} \right) - 2\,S_{\mathrm{sel}}} \tag{3.3} $$

Additionally, the significances $\mathcal{S}_B$ and $\mathcal{S}_{S+B}$ are rather common quantities for comparing different selections.

$$ \mathcal{S}_B = \frac{S_{\mathrm{sel}}}{\sqrt{B_{\mathrm{sel}}}} \quad\text{and}\quad \mathcal{S}_{S+B} = \frac{S_{\mathrm{sel}}}{\sqrt{S_{\mathrm{sel}} + B_{\mathrm{sel}}}} $$

Both of them express the number of signal events in units of the uncertainty $\sqrt{N}$ of a Poisson process with mean $N$. It can be shown that optimising the significance $\mathcal{S}_B$ simultaneously optimises upper limits on the signal strength in counting experiments, even for small event numbers [54]. This is important for searches for as yet unknown processes. The significance $\mathcal{S}_{S+B}$, on the other hand, can be optimised for measurements.
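The quality measures of this section can be evaluated numerically for a given cut. The sketch below assumes Gaussian pdfs for the test statistic (signal centred at 1.0, background at 0.0, both with width 0.5) and expected event numbers of S = 100 and B = 10000; these shapes and numbers are illustrative assumptions, as are the function names. The integrals over the pdfs are evaluated with the complementary error function.

```python
import math


def gauss_tail(t_cut, mean, sigma):
    """Integral of a unit-normalised Gaussian pdf from t_cut to +infinity."""
    return 0.5 * math.erfc((t_cut - mean) / (sigma * math.sqrt(2.0)))


def selection_summary(t_cut, n_sig, n_bkg, sig=(1.0, 0.5), bkg=(0.0, 0.5)):
    """Quality measures for a cut on t; sig and bkg are assumed (mean, width) pairs."""
    s_sel = n_sig * gauss_tail(t_cut, *sig)
    b_sel = n_bkg * gauss_tail(t_cut, *bkg)
    s_disc, b_disc = n_sig - s_sel, n_bkg - b_sel
    return {
        "eff_sig": s_sel / n_sig,                        # 1 - P(type 2 error)
        "eff_bkg": b_sel / n_bkg,                        # P(type 1 error)
        "purity": s_sel / (s_sel + b_sel),
        "p_right": (b_disc + s_sel) / (n_sig + n_bkg),
        "signif_poisson": math.sqrt(
            2.0 * (s_sel + b_sel) * math.log(1.0 + s_sel / b_sel) - 2.0 * s_sel
        ),                                               # Eq. (3.3)
        "signif_B": s_sel / math.sqrt(b_sel),
        "signif_SB": s_sel / math.sqrt(s_sel + b_sel),
    }


result = selection_summary(t_cut=0.5, n_sig=100.0, n_bkg=10000.0)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For this example the Poisson-type significance lies between the two simpler approximations, and the purity is small even though the signal efficiency is high, which illustrates why purities depend on the sample composition while efficiencies do not.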

3.2 Reconstruction of Arbitrary Quantities

The second issue of multivariate analysis techniques covers the reconstruction of arbitrary quantities with so-called regression methods. Within this thesis they are needed for reconstructing the invariant ττ mass based on measured quantities.

The methods for the determination of discriminators (see Section 3.1.1) can be extended to a formalism that yields a probability density as a function of the studied quantity for each event. This can be achieved by performing multiple classifications assigned to different thresholds in the range of target values. The procedure is illustrated in Figure 3.4.

Based on the measured variables $\mathbf{x}$, a quantity $Q$ is sought. For the training sample the true target value $Q_{\mathrm{true}}$ is known. A number $N_{\mathrm{thresh}}$ of thresholds at the values $q_1, \ldots, q_{N_{\mathrm{thresh}}}$ in the expected range of values for $Q$ is given. The classification with regard to the $i$-th threshold yields the discriminator $t_i$, whose values are distributed according to the pdf $p_i(t_i)$. Then the conditional probability for having an event with a value $Q_{\mathrm{true}}$ greater than a threshold value $q_i$ after obtaining a discriminator value $t_i$ can be calculated as the fraction of events fulfilling the condition $Q_{\mathrm{true}} > q_i$ in a small discriminator interval, i.e. $[t_i, t_i + \mathrm{d}t_i]$.

$$ P\bigl( Q_{\mathrm{true}} > q_i \mid t_i \bigr) = \frac{p_i(t_i \mid Q_{\mathrm{true}} > q_i)\,\mathrm{d}t_i}{p_i(t_i)\,\mathrm{d}t_i} $$

Altogether, these probabilities are supposed to form a cumulative distribution function $F(Q \mid \mathbf{x})$ whose derivative gives the pdf in the quantity $Q$.

$$ F\bigl( q_i \mid \mathbf{x} \bigr) = P\bigl( Q_{\mathrm{true}} > q_i \mid t_i(\mathbf{x}) \bigr) \quad\Rightarrow\quad p\bigl( Q \mid \mathbf{x} \bigr) = \frac{\mathrm{d}}{\mathrm{d}Q} F\bigl( Q \mid \mathbf{x} \bigr) $$

Figure 3.4: Illustration of regression methods. First, classifications are performed with respect to multiple thresholds in the studied quantity $Q$. The evaluation of the results leads to a cumulative distribution $F(Q \mid \mathbf{x})$ that is then spline-fitted and differentiated to yield a probability density $p(Q \mid \mathbf{x})$ as a function of the studied quantity for every event characterised by $\mathbf{x}$. Panel (a) shows the cumulative distribution and panel (b) the probability density function, each as a function of $Q$ with the true value and the reconstructed (reco.) curve.

In practice, $F(q_i \mid \mathbf{x})$ can be fitted with a monotonic function before differentiation for numerical reasons, i.e. to avoid negative pdf values.

In summary, the described procedure yields a probability density function $p(Q \mid \mathbf{x})$ depending on the studied quantity $Q$ for each set of input variables $\mathbf{x}$. Instead of using the whole information provided by the pdf, it is of course possible to extract single point estimators, e.g. the mean, the median or the most likely value, from the pdf as a result. Uncertainties on these values can then also be derived based on the width of the distribution.
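The step from threshold classifications to a per-event pdf can be illustrated with a toy example. The sketch below makes several simplifying assumptions that are not taken from the text: the rising curve is built as one minus the probability of exceeding each threshold (so that it increases from 0 to 1 as in Figure 3.4), monotonicity is enforced with a running maximum instead of a monotonic spline fit, and the derivative is taken by finite differences. The threshold values and probabilities are invented for illustration.

```python
def cdf_from_threshold_probs(p_greater):
    """Turn P(Q_true > q_i) for increasing thresholds into a non-decreasing curve."""
    cdf, running_max = [], 0.0
    for p in p_greater:
        value = min(max(1.0 - p, 0.0), 1.0)     # clip to the range [0, 1]
        running_max = max(running_max, value)   # remove downward fluctuations
        cdf.append(running_max)
    return cdf


def pdf_from_cdf(thresholds, cdf):
    """Finite-difference derivative of the cumulative curve, given at bin centres."""
    centres, density = [], []
    for i in range(len(thresholds) - 1):
        width = thresholds[i + 1] - thresholds[i]
        centres.append(0.5 * (thresholds[i] + thresholds[i + 1]))
        density.append((cdf[i + 1] - cdf[i]) / width)
    return centres, density


def median_from_cdf(thresholds, cdf):
    """Point estimator: linear interpolation of the value where the curve reaches 0.5."""
    for i, value in enumerate(cdf):
        if value >= 0.5:
            if i == 0:
                return thresholds[0]
            fraction = (0.5 - cdf[i - 1]) / (value - cdf[i - 1])
            return thresholds[i - 1] + fraction * (thresholds[i] - thresholds[i - 1])
    return None  # the curve never reaches 0.5 within the threshold range


thresholds = [50.0, 100.0, 150.0, 200.0]   # illustrative thresholds q_i
p_greater = [0.9, 0.5, 0.55, 0.1]          # invented classifier outputs for one event
cdf = cdf_from_threshold_probs(p_greater)
centres, density = pdf_from_cdf(thresholds, cdf)
median = median_from_cdf(thresholds, cdf)
print(cdf, density, median)
```

The third probability fluctuates upwards, which would give a negative density without the monotonicity step; with it, the density in that interval is zero instead.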

3.3 The Neural Network Package NeuroBayes

While the common analysis framework ROOT (see Section 2.3.4) now includes the TMVA package [55] for multivariate analysis, which offers a variety of classification and regression techniques such as likelihood methods and artificial neural networks, this work uses the neural network package NeuroBayes [56]. The algorithm was developed by Prof. Dr. Michael Feindt [52] at the University of Karlsruhe and the project is continued by Phi-T¹, where it demonstrates its analysis capabilities in, among other applications, the insurance and banking businesses.

¹ Phi-T Physics Information Technologies GmbH, http://www.phi-t.de

NeuroBayes extends the idea of a neural network as described in Section 3.1.1 by a sophisticated pre-processing. Each variable is pre-processed individually before linear correlations among the input variables are eliminated globally by rotating the phase space of the variables. This approach allows the neural network to focus on non-linear effects in the input data during the learning process. The individual pre-processing comprises the following four steps:

1. Discretization and flattening: the range of an input variable is subdivided into a fixed number of bins with individual bin widths and equal event numbers in each bin. This ensures a meaningful treatment of peaking variables because the resolution adapts to the local density of events.

2. Target-respective pre-processing: for each bin the mean of the (true) target values is calculated. For a classification with target values of zero and one, this mean corresponds to the signal purity of the events in each bin. The network thus receives the information about the target as a function of the input variable as input. For a density training it is also possible to feed the width of the target values in a certain bin into the network instead of their mean. These second moments are useful for the reconstruction of a pdf.

3. Smoothing: the purity or target mean values are fitted with a spline function to smooth out statistical fluctuations.

4. Simple variable transformation: the distributions of the spline-fit values are transformed to have a mean of zero and a width of one. This enables NeuroBayes to improve the numerical calculations based on well-defined value ranges.

Figure B.7 illustrates this procedure with an exemplary output from NeuroBayes for one variable. The type of pre-processing can be adjusted to the type of the input variable. For example, discrete variables can be treated specifically by the spline fit, or undefined values of single variables in single events can be ignored.
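The individual pre-processing steps can be imitated in a simplified form. The sketch below is not the NeuroBayes implementation: it covers equal-frequency binning, the mean target per bin and the final transformation to zero mean and unit width, but it omits the spline smoothing of step 3. All function names and the toy data are chosen for illustration.

```python
import bisect
import math


def equal_frequency_edges(values, n_bins):
    """Inner bin edges such that each bin holds (nearly) the same number of events."""
    ordered = sorted(values)
    return [ordered[(k * len(ordered)) // n_bins] for k in range(1, n_bins)]


def mean_target_per_bin(values, targets, edges):
    """Mean of the true target in each bin (the signal purity for 0/1 targets)."""
    sums = [0.0] * (len(edges) + 1)
    counts = [0] * (len(edges) + 1)
    for value, target in zip(values, targets):
        k = bisect.bisect_right(edges, value)
        sums[k] += target
        counts[k] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def standardise(xs):
    """Shift and scale to a mean of zero and a width of one."""
    mean = sum(xs) / len(xs)
    width = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    if width == 0.0:
        return [0.0 for _ in xs]
    return [(x - mean) / width for x in xs]


# Toy input: one variable, signal (target 1) at high values, background (target 0) below.
values = [float(v) for v in range(100)]
targets = [1.0 if v >= 50.0 else 0.0 for v in values]

edges = equal_frequency_edges(values, n_bins=4)
purity = mean_target_per_bin(values, targets, edges)
transformed = standardise([purity[bisect.bisect_right(edges, v)] for v in values])
print(edges, purity, transformed[0], transformed[-1])
```

After these steps each event carries the standardised purity of its bin instead of the raw variable value, which is the kind of target-related, well-ranged input described above.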

Moreover, several quality measures are calculated by NeuroBayes for each inputvariable in order to give detailed information about the impact of each variable onthe training performance.

• Correlation to all other input variables together: $C_\text{others}$

• Correlation to the target: $C_\text{target}$. The significance for single variables $S_\text{single var.}$ expresses the correlation to the target with respect to the number of training events $N_\text{train}$.

$$S_\text{single var.} = \left|C_\text{target}\right| \cdot \sqrt{N_\text{train}}$$

• The so-called added significance $S_\text{added}$ mainly defines the ranking of the variables. It describes the loss of correlation to the target for the complete set of the input variables if the considered variable is removed. To simplify matters, the corresponding correlation will be called added correlation $C_\text{added}$.

$$S_\text{added} = \left|C_\text{added}\right| \cdot \sqrt{N_\text{train}}$$

In the analysis chapters the correlation is quoted instead of the significance as it is independent of the number of training events.


• The significance loss $S_\text{loss}$ describes the information loss for the training when the considered variable is removed from the set of input variables. The quantity takes into account the correlation of the input variables, as one variable can be partially replaced by another correlated variable. In the following, the corresponding correlation $C_\text{loss}$ is quoted instead, as it is independent of the training sample size.

$$S_\text{loss} = \left|C_\text{loss}\right| \cdot \sqrt{N_\text{train}}$$
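The relation between a correlation and the corresponding significance can be made concrete with a short Python sketch. The function names are invented for illustration, and the plain Pearson coefficient used here is only a stand-in: the quantities above refer to the decorrelated, preprocessed variables inside NeuroBayes.

```python
import math


def pearson(xs, ys):
    """Linear (Pearson) correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def significance(correlation, n_train):
    """S = |C| * sqrt(N_train), as in the formulas above."""
    return abs(correlation) * math.sqrt(n_train)


# The same correlation gives twice the significance for four times the events.
s_small = significance(0.259, 100_000)
s_large = significance(0.259, 400_000)
```

The example shows why the correlation is the more convenient quantity to quote: it stays the same when the training sample grows, while the significance scales with the square root of the sample size.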

All correlations refer to the situation after the decorrelation of the input variables. After the pre-processing there are two different training options.

• Iterative training: This training procedure of a neural network follows the description in Section 3.1.1. NeuroBayes also features an internal boosting mechanism which changes the event weights iteratively to increase the importance of events that were misidentified in the preceding iterations. The iterative training process stops automatically if the training error calculated with a loss function falls below a certain threshold.

• Zero-iteration training: This method is a fully analytical transformation of the preprocessed input variables, which essentially has nothing to do with a neural network as shown in Figure 3.1. The advantages of this method are the very fast training compared to the iterative method and the exact reproducibility of the results. Owing to the elaborate pre-processing, the results of the zero-iteration training are in general only slightly worse than those of a full iterative training. However, the results tend to be numerically less stable, and consequently this training mode is only used for testing purposes throughout this work.

Each quantity resulting from an output node is transformed in order to allow an easy probabilistic interpretation of the test statistic values. The aim is that the test statistic value of an event can itself be taken as the probability that this event is a signal event. The central value $t_i$ of each bin $i$ of the binned output range should be a measure of the signal probability $\wp_\text{sig}(\text{bin } i)$ in this bin after the training process.

$$\wp_\text{sig}(\text{bin } i) = \frac{S(\text{bin } i)}{S(\text{bin } i) + B(\text{bin } i)} \overset{!}{=} t_i \qquad (3.4)$$

In order to achieve this behaviour, a monotonic non-linear transformation called diagonal fit is performed. The quality of this fit can be monitored in the plot of the signal purity per bin $\wp_\text{sig}(\text{bin } i)$ as a function of the bin centres $t_i$, where the plotted points are supposed to lie close to the diagonal.
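The monitoring plot described above can be emulated with a small Python sketch. It assumes outputs in the range from zero to one and takes the purity as the fraction of signal events among all events of a bin. The toy sample is constructed such that the output is a perfectly calibrated probability, so the points should lie on the diagonal. All names are invented for illustration.

```python
import random


def purity_vs_output(outputs, is_signal, n_bins=10):
    """Return (bin centre, signal purity) pairs for outputs in [0, 1]."""
    signal = [0] * n_bins
    total = [0] * n_bins
    for t, s in zip(outputs, is_signal):
        i = min(int(t * n_bins), n_bins - 1)  # keep t == 1.0 in the last bin
        total[i] += 1
        if s:
            signal[i] += 1
    return [((i + 0.5) / n_bins, signal[i] / total[i])
            for i in range(n_bins) if total[i] > 0]


# Toy sample: an event with output t is signal with probability t.
rng = random.Random(2)
outputs = [rng.random() for _ in range(50_000)]
is_signal = [rng.random() < t for t in outputs]

points = purity_vs_output(outputs, is_signal)
largest_deviation = max(abs(purity - centre) for centre, purity in points)
```

For a real network output, a systematic departure of the points from the diagonal would indicate that the output cannot be read directly as a signal probability.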

Two main applications are distinguished: firstly, the Teacher for training the neural network and writing the results to an expertise file and secondly, the Expert for reading an expertise file, applying the trained network to data and obtaining the network outputs. Besides the numerical outputs, NeuroBayes provides a large set of monitoring plots for each training. The Expert is able to provide both the results of a classification and access to the results of a full regression.

• Classification: NeuroBayes outputs the test statistic t(x) which can be studied according to the explanations in Section 3.1.2.

• Regression: Several outputs are available. Internally, NeuroBayes calculates a probability density function for each event. This complete set of information about the unknown true target value $Q_\text{true}$ for an input event $x$ can be retrieved in terms of pdf values $p(Q \mid x)$ at arbitrary values of $Q$. In most cases it is easier to work with simple point estimators, for which NeuroBayes provides the mean, the median and the most likely value of the pdf. It is also possible to obtain the width of the pdf as well as the expected value of an arbitrary function $f(Q)$.
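How such point estimators follow from a per-event pdf can be illustrated with a pdf tabulated on an equidistant grid. The sketch below is not the NeuroBayes interface. The function names, the grid and the Gaussian toy pdf are assumptions made for this example.

```python
import math


def point_estimators(q_values, pdf_values):
    """Mean, median, most likely value and width of a pdf tabulated on an equidistant grid."""
    total = sum(pdf_values)
    weights = [p / total for p in pdf_values]
    mean = sum(q * w for q, w in zip(q_values, weights))
    width = math.sqrt(sum(w * (q - mean) ** 2 for q, w in zip(q_values, weights)))
    mode = q_values[max(range(len(weights)), key=weights.__getitem__)]
    cumulative = 0.0
    median = q_values[-1]
    for q, w in zip(q_values, weights):
        cumulative += w
        if cumulative >= 0.5:  # first grid point at which half of the probability is reached
            median = q
            break
    return {"mean": mean, "median": median, "mode": mode, "width": width}


def expectation(q_values, pdf_values, f):
    """Expected value of an arbitrary function f(Q) under the tabulated pdf."""
    total = sum(pdf_values)
    return sum(f(q) * p for q, p in zip(q_values, pdf_values)) / total


# Toy pdf: a Gaussian centred at 90 with width 15 on a grid from 0 to 200.
q_values = [float(q) for q in range(0, 201)]
pdf_values = [math.exp(-0.5 * ((q - 90.0) / 15.0) ** 2) for q in q_values]
estimates = point_estimators(q_values, pdf_values)
mean_via_expectation = expectation(q_values, pdf_values, lambda q: q)
```

For a symmetric pdf like this one the three point estimators coincide. They differ for skewed or multi-peaked pdfs, which is why it can matter which estimator is chosen.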


4 Mass Reconstruction with Artificial Neural Networks

This thesis focuses on the study of Higgs bosons decaying into pairs of τ leptons and the corresponding background processes. The process Z → ττ is a background process that can only be separated from H → ττ events by exploiting sophisticated techniques.

One important tool is the mass reconstruction, since the invariant mass of the ττ system provides important discriminating information. The aim of the analysis presented in this chapter is to find suitable mass definitions for events which improve the ττ mass resolution in order to achieve a better separation of Z → ττ and H → ττ events based on mass information.

In this chapter a new mass reconstruction method is introduced which uses a neural network to estimate the mass of the Higgs or Z boson in the semi-leptonic decay into a muon and a τ jet accompanied by three neutrinos. According to the energy-momentum relation from special relativity, the four-momentum vectors p of all decay products, i.e. visible (vis) and invisible or missing (miss) ones, have to be added before the square of the sum yields the mass of the decayed particle.

$$m_{\tau\tau}^2 = \Big(\sum_\text{vis} p + \sum_\text{miss} p\Big)^2 = \Big(p_\mu + p_{\tau_\text{jet}} + \sum_{\nu=1}^{3} p_\nu\Big)^2 \quad\text{with}\quad p^2 = E^2 - \vec{p}^{\,2}$$

The three-momentum vector is denoted by $\vec{p}$. When there are neutrinos in the final state, their contribution to the vectorial momentum sum is missing because they cannot be detected. A full reconstruction of the ττ system is therefore impossible. Sophisticated mass reconstruction algorithms are needed to estimate the mass of the intermediate particle.
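The formula above translates directly into code. The following Python sketch uses natural units (c = 1) and four-momenta stored as tuples (E, px, py, pz). The numerical values are invented for illustration and do not come from the simulation.

```python
import math


def add_four_vectors(*vectors):
    """Component-wise sum of four-momenta given as (E, px, py, pz)."""
    return tuple(sum(components) for components in zip(*vectors))


def invariant_mass(p):
    """m = sqrt(E^2 - |p|^2); small negative values from rounding are clipped to zero."""
    energy, px, py, pz = p
    m_squared = energy * energy - (px * px + py * py + pz * pz)
    return math.sqrt(max(m_squared, 0.0))


# Toy event: two visible objects and the summed neutrino momenta of each tau leg,
# all treated as massless and arranged along the x axis.
visible_1 = (30.0, 30.0, 0.0, 0.0)
missing_1 = (15.6, 15.6, 0.0, 0.0)
visible_2 = (25.0, -25.0, 0.0, 0.0)
missing_2 = (20.6, -20.6, 0.0, 0.0)

full_mass = invariant_mass(add_four_vectors(visible_1, missing_1, visible_2, missing_2))
visible_mass = invariant_mass(add_four_vectors(visible_1, visible_2))
```

In this toy event the full sum reproduces a mass close to the Z mass, while the sum over the visible objects alone gives a clearly smaller value. This is the effect of the undetected neutrinos that the methods discussed in this chapter try to compensate.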


4.1 The Semi-leptonic Decay Mode

Semi-leptonically decaying ττ pairs originating from Higgs and Z bosons are characterised by a reasonably high branching ratio, see Table 1.4. Events with one muon and events with one electron in the final state each occur in about 23 % of all ττ pairs. Since muons can be triggered efficiently, the decay mode with a muon and a τ jet in the final state plays an important role in the study of ττ final states. The mass reconstruction study can in principle be performed in the same way in the electron mode.

Hadronically decaying tauons provide at least the same reconstruction information as muons from such decays, and sometimes even more, since the secondary decay vertex can be exploited if more than one charged track is reconstructed. As a consequence, the methods presented below can also be applied to fully hadronically decaying ττ events. The majority of ττ final states can thus be covered by the mass reconstruction methods presented below without the need for significant adjustments.

Another advantage of this channel is the opportunity to study the impact of two significantly different τ decay modes on the mass reconstruction. The muonic decay yields a precisely measured four-momentum vector of the visible decay products. Nevertheless, two neutrinos escape detection. In general, the system of these two neutrinos has a non-vanishing mass. In the hadronic τ decay only one neutrino escapes. Its mass is negligible, which leaves one free parameter less than in the leptonic decay. On the other hand, the hadronic decay results in larger uncertainties of the visible momentum due to the jet reconstruction. However, the greatest advantage of hadronically decaying τ leptons is the chance to reconstruct a secondary vertex and thereby obtain information about the flight direction of the original tauon. Most of the hadronically decaying τ leptons produce one or three tracks of charged particles, so-called prongs. In the case of a 3-prong decay, a secondary vertex can be reconstructed by determining the intersection point of the three charged tracks. Figure 4.1 schematically illustrates a typical event topology.

Figure 4.1: Typical pp → Z/H → ττ → µ + τ_jet event topology. Three neutrinos in the final state preclude a full reconstruction of the ττ system. The τ jet produces one or three charged tracks (prongs) in most of the cases. For 3-prong decays it is, to some extent, possible to reconstruct a secondary vertex which allows an estimation of the original τ flight direction.


For the analysis presented below, several definitions have to be stated. The original τ leptons are denoted by $\tau_\text{lep}$ and $\tau_\text{had}$, whereas their visible decay products are indicated by $\tau_\text{lep}^\text{vis} = \mu$ and $\tau_\text{had}^\text{vis} = \tau_\text{jet}$. The symbols $\tau_\text{lep}^\text{miss} = \nu_\tau + \nu_\mu$ and $\tau_\text{had}^\text{miss} = \nu_\tau$ are used for the undetected decay products. In the transverse plane, momentum conservation allows for a measurement of the missing transverse energy $E_T^\text{miss}$ or momentum $\vec{p}_T^{\,\text{miss}}$. Neutrinos originating from other processes such as pile-up can also lead to a momentum imbalance in the transverse plane. This effect is neglected in the study presented in this thesis.

4.1.1 Event Generation and Preselection

Officially simulated Monte Carlo samples are only available for a very limited number of Higgs and Z events and the mass distributions are not suitable for the mass reconstruction study. Therefore, the fast simulation (FastSim) has been employed to simulate the CMS detector. Pythia together with Tauola has been used for the event generation (see Section 2.3). Through this approach, a very large number of events could be simulated within a reasonable time interval to bring the statistical uncertainties to a level that is limited by the mass reconstruction method itself.

Both Z → ττ and H → ττ events have been generated in the interval between 45 and 250 GeV/c². The Z events have been simulated in slices of the true Z mass with a width of 5 GeV/c². This ensures a sufficient number of events with masses that differ significantly from the mean Z mass. The Standard Model Higgs events have also been generated with Higgs masses in steps of 5 GeV/c². Having an almost continuous distribution of generated masses over a wide range allows for an unbiased mass reconstruction, as explained below.

The Z/H → ττ events have been simulated inclusively. Therefore, µ + τ_jet final states had to be reconstructed and selected. The reconstruction of muons and τ jets is performed with the standard algorithms in the CMSSW framework as described in Section 2.3.3. The selections chosen follow the official approach as presented in [57] and also in [58]. Table 4.1 summarises the preselection. After the preselection, about 6 million Z → ττ events and 10 million H → ττ events remained.

| Property | Selection |
|---|---|
| trigger | pass single-muon L1 trigger and HLT |
| well-defined primary vertex | pass |
| muon | isolated, $p_T > 15$ GeV/c and $\lvert\eta\rvert < 2.1$ |
| τ jet | $p_T > 20$ GeV/c and $\lvert\eta\rvert < 2.3$ |
| charge | µ and τ jet oppositely charged |
| transverse mass of µ and $E_T^\text{miss}$ | $m_T(\mu, E_T^\text{miss}) > 40$ GeV/c² |

Table 4.1: Summary of the preselection of semi-leptonically decaying ττ final states for the H → ττ → µ + τ_jet analysis.
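The preselection of Table 4.1 can be written as a single boolean function. The sketch below is an illustration only: the `Candidate` container and its field names are hypothetical and not part of CMSSW, and the transverse mass is computed with the common massless-particle formula $m_T = \sqrt{2\, p_T\, E_T^\text{miss}\, (1 - \cos\Delta\varphi)}$, which is an assumption since the text does not spell out the definition. The thresholds and the direction of each requirement are taken as printed in the table.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-event summary; momenta in GeV/c, angles in radians."""
    passes_trigger: bool
    has_good_primary_vertex: bool
    muon_isolated: bool
    muon_pt: float
    muon_eta: float
    muon_phi: float
    muon_charge: int
    tau_jet_pt: float
    tau_jet_eta: float
    tau_jet_charge: int
    met: float
    met_phi: float


def transverse_mass(pt, phi, met, met_phi):
    """Transverse mass of a (massless) particle and the missing transverse energy."""
    return math.sqrt(2.0 * pt * met * (1.0 - math.cos(phi - met_phi)))


def passes_preselection(c):
    """Apply the requirements of Table 4.1 in the order in which they are listed."""
    return (
        c.passes_trigger
        and c.has_good_primary_vertex
        and c.muon_isolated and c.muon_pt > 15.0 and abs(c.muon_eta) < 2.1
        and c.tau_jet_pt > 20.0 and abs(c.tau_jet_eta) < 2.3
        and c.muon_charge * c.tau_jet_charge < 0
        and transverse_mass(c.muon_pt, c.muon_phi, c.met, c.met_phi) > 40.0
    )


good = Candidate(True, True, True, 30.0, 0.5, 0.0, -1, 35.0, -1.0, +1, 30.0, math.pi)
soft_muon = Candidate(True, True, True, 10.0, 0.5, 0.0, -1, 35.0, -1.0, +1, 30.0, math.pi)
```

Here `passes_preselection(good)` is true, while `soft_muon` fails the muon transverse momentum requirement.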


Unless stated otherwise, all methods are trained and tested on Z samples with a continuous true ττ mass distribution. Trainings and tests involving Higgs samples are mentioned explicitly. Results shown in this chapter are always based on a testing sample that is statistically independent of the training sample.

For each network training a sample with a flat distribution of true invariant ττ masses is used. This procedure prevents NeuroBayes from “learning” mass preferences. It has been shown that a “normal” Z sample peaking at the nominal Z mass is not well suited for a mass reconstruction method that should work over a broad mass range. Networks trained this way always yielded the Z mass as a result, independent of any input information.
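One conceivable way to obtain a flat distribution of true masses from a peaking sample is to keep the same number of events in every mass bin. The Python sketch below illustrates this idea. It is an assumption about how such a flattening could be done, not a description of the procedure actually used for the trainings, and the toy sample is invented.

```python
import random


def flat_subsample(masses, low, high, n_bins, rng):
    """Return indices of a subsample with equal event counts in all mass bins."""
    width = (high - low) / n_bins
    bins = [[] for _ in range(n_bins)]
    for index, mass in enumerate(masses):
        if low <= mass < high:
            bins[min(int((mass - low) / width), n_bins - 1)].append(index)
    n_keep = min(len(b) for b in bins)  # the least populated bin sets the scale
    selected = []
    for b in bins:
        selected.extend(rng.sample(b, n_keep))
    return selected, n_keep


# Toy sample: a peak near 91 GeV/c^2 on top of a broad continuum.
rng = random.Random(3)
masses = [rng.gauss(91.2, 5.0) for _ in range(20_000)]
masses += [rng.uniform(45.0, 250.0) for _ in range(20_000)]

selected, n_keep = flat_subsample(masses, 45.0, 250.0, 41, rng)
```

The price of this simple approach is that the least populated bin limits the size of the flat sample, which is one reason why generating events in mass slices, as described above, is helpful.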

4.2 Current Mass Definitions

Within current H → ττ analyses from the CMS experiment, three different mass definitions are used. Their distributions are depicted in Figure 4.2 for three different Monte Carlo samples: a Z sample and two Higgs samples with different Higgs masses $m_H = 120$ GeV/c² and $m_H = 200$ GeV/c².

Figure 4.2: Mass peaks for the current mass definitions used in CMS analyses. The visible mass $m_{\tau\tau}^\text{vis}$ (left, a), the collinear approximation mass $m_{\tau\tau}^\text{coll}$ (centre, b) and the SVfit mass $m_{\tau\tau}^\text{SVfit}$ (right, c) are shown in arbitrary units for three different Monte Carlo samples: a Z sample (red) and two Higgs samples (black) with two different Higgs masses $m_H = 120$ GeV/c² and $m_H = 200$ GeV/c².

4.2.1 Visible Mass

As its name suggests, the visible mass only takes into account the visible decay products of the ττ system. Hence, the presence of neutrinos is neglected.

$$\left(m_{\tau\tau}^\text{vis}\right)^2 = \Big(\sum_\text{vis} p\Big)^2 = \left(p_\mu + p_{\tau_\text{jet}}\right)^2$$

Figure 4.2a demonstrates the performance of this simple definition. It naturally underestimates the true mass. The linear correlation with the true ττ mass (see Figure 4.7) makes it possible to correct the mass scale by multiplying the visible mass by a constant factor.

4.2.2 Collinear Approximation Mass

Simplifying assumptions about the momenta carried away by neutrinos lead to the collinear approximation mass. Since the τ leptons are about one to two orders of magnitude lighter than the Higgs or Z boson, they are boosted. The back-to-back geometry of the two-body decay in the rest frame of the decaying particle is transformed into a far more collimated geometry in the detector frame for boosted particles. Therefore the neutrinos are supposed to fly in almost the same direction as the visible decay products.

Two assumptions are made for the collinear approximation mass:

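• The direction of the invisible τ decay products is exactly the same as the direction of the visible decay products. Therefore, each visible four-momentum can be expressed as a fraction of the momentum of the original tauon.

$$p_\mu = x_\mu \cdot p_{\tau_\text{lep}} \quad\text{and}\quad p_{\tau_\text{jet}} = x_{\tau_\text{jet}} \cdot p_{\tau_\text{had}} \quad\text{with}\quad x_\mu, x_{\tau_\text{jet}} \in \mathbb{R}$$

• The sum of the transverse momenta of the missing contributions must match the measured missing transverse momentum.

$$(1 - x_\mu) \cdot \vec{p}_T^{\,\tau_\text{lep}} + (1 - x_{\tau_\text{jet}}) \cdot \vec{p}_T^{\,\tau_\text{had}} = \vec{p}_T^{\,\text{miss}}$$

The two equations above can be solved for the fractions $x_\mu$ and $x_{\tau_\text{jet}}$ depending on the measured visible momenta $p_\mu$ and $p_{\tau_\text{jet}}$ as well as the known missing transverse momentum $\vec{p}_T^{\,\text{miss}}$. Physical solutions yield $x_\mu, x_{\tau_\text{jet}} \in [0, 1]$.

$$\left(m_{\tau\tau}^\text{coll}\right)^2 = \left(\frac{1}{x_\mu} \cdot p_\mu + \frac{1}{x_{\tau_\text{jet}}} \cdot p_{\tau_\text{jet}}\right)^2$$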

Physically contradictory solutions appear in situations where the projection of the MET vector onto at least one of the τ legs points in the direction opposite to the τ leg itself. Figure 4.3 illustrates the collinear approximation.

With the restriction to physical values of $x_\mu$ and $x_{\tau_\text{jet}}$, the collinear approximation is not able to provide a mass hypothesis for all events. This results in a loss of statistical precision, as can be seen in Figure 4.2b. The same plot also shows that this method produces long tails in the invariant mass distribution.
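The collinear approximation can be sketched in Python as follows. The sketch takes $x$ as the fraction of the tauon momentum carried by the visible decay products, so that the missing momentum of each leg is $(1/x - 1)$ times the visible momentum. This convention is an assumption chosen to be consistent with the mass formula containing $1/x$ and with physical solutions in $[0, 1]$. Natural units (c = 1) are used and the example event is constructed by hand.

```python
import math


def collinear_mass(p_mu, p_tau_jet, met):
    """Collinear approximation mass, or None if no physical solution exists.

    p_mu, p_tau_jet: visible four-momenta (E, px, py, pz); met: (px, py).
    """
    ax, ay = p_mu[1], p_mu[2]
    bx, by = p_tau_jet[1], p_tau_jet[2]
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        return None  # visible objects parallel or back-to-back in the transverse plane
    # Solve r_mu * a + r_tau * b = met with r = 1/x - 1 (Cramer's rule).
    r_mu = (met[0] * by - met[1] * bx) / det
    r_tau = (ax * met[1] - ay * met[0]) / det
    if r_mu < 0.0 or r_tau < 0.0:
        return None  # would require x > 1, i.e. an unphysical solution
    x_mu = 1.0 / (1.0 + r_mu)
    x_tau = 1.0 / (1.0 + r_tau)
    total = [a / x_mu + b / x_tau for a, b in zip(p_mu, p_tau_jet)]
    m_squared = total[0] ** 2 - sum(c * c for c in total[1:])
    return math.sqrt(max(m_squared, 0.0))


# Hand-made event: massless tauons (50, 30, 40, 0) and (60, -36, 0, 48),
# of which the muon carries 40 % and the tau jet 50 %.
muon = (20.0, 12.0, 16.0, 0.0)
tau_jet = (30.0, -18.0, 0.0, 24.0)
met = (0.0, 24.0)  # transverse sum of the neutrino momenta

mass = collinear_mass(muon, tau_jet, met)
no_solution = collinear_mass(muon, tau_jet, (0.0, -24.0))
```

Because the example obeys the collinear assumption exactly, the function recovers the true mass of the tauon pair. Flipping the sign of the missing transverse momentum moves the event into the unphysical region, for which no mass hypothesis is returned.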

4.2.3 Secondary Vertex Fit Mass

Recently, a more advanced technique called secondary vertex fit (SVfit) has been developed within the CMS collaboration. Details can be found in [59, 60].


Figure 4.3: Illustration of the collinear approximation. Missing transverse momentum vectors lying in the grey area lead to unphysical solutions.

This analytical method constructs a likelihood function depending on parameters describing the missing momentum. In order to determine the full four-momentum of the original ττ system, this likelihood function is maximised. The main result used is a mass hypothesis $m_{\tau\tau}^\text{SVfit}$ which is extracted from the four-momentum information.

Several likelihood terms are combined: they describe the matching between the measured τ kinematics and the decay information from matrix element calculations as well as the compatibility of the obtained missing transverse energy with the neutrino momentum hypotheses. Contrary to its name, this method does not yet include information about the secondary vertex of hadronically decaying tauons and the τ flight distance.

Figure 4.2c shows the performance of this algorithm. The results are nearly unbiased and have an almost Gaussian shape. The method also yields smaller widths relative to the reconstructed mean than the visible or collinear approximation mass and therefore separates Higgs and Z events more significantly. A mass hypothesis is given for each event. The time-consuming calculation slightly lessens its attractiveness.

4.3 Mass Reconstruction Using NeuroBayes

One main part of the work presented in this thesis has been the study of new ττ mass reconstruction approaches based on the neural network package NeuroBayes (see Section 3.3). The necessary input variables, the general performance and the usage of the network output have been studied. Additionally, more physically motivated approaches have been investigated, focusing on more comprehensible procedures.

4.3.1 Input Variables and Preprocessing

The four-momentum vectors of all decay products, visible and invisible ones, contribute to the mass of the decaying particle, the Higgs or Z boson in the present study. The total visible four-momentum vector is simply measured by adding up the four-momentum vectors of all visible decay products. The missing transverse energy (MET) provides information about the momentum of the invisible neutrinos, but only in the transverse plane.

Other contributions of the neutrino momenta have to be compensated by additional available information. Angular relations between the flight directions of various reconstructed decay products provide additional information that helps to deduce knowledge about the neutrino momenta. Also, the impact parameters of reconstructed tracks with respect to the primary vertex contribute knowledge about the opening angle between the visible and invisible τ decay products. Besides angular relations, other components of the visible four-momenta and their relations (averages or differences) may also contain useful information.

Knowledge about the secondary vertex (SV) of 3-prong hadronically decaying tauons has been expected to bring important additional information because, together with the reconstructed primary vertex (PV), the secondary vertex gives the flight direction of the original τ lepton, referred to as $\tau_\text{had}^\text{reco}$. With the knowledge about the vector pointing from the primary vertex to any reasonable secondary vertex, one can state relationships between this direction and the flight directions of visible decay products and MET. Secondary vertex information can only be exploited for a certain fraction of all 3-prong or 5-prong decays. NeuroBayes is also capable of using input information that is only available in a subset of the training sample. In the other cases the network is supplied with the information that the secondary vertex is missing.

For the analysis the secondary vertex has been reconstructed by a vertex finding algorithm from CMSSW which searches for the most compatible vertex reconstructed in the event matching the tracks of charged particles from the τ jet. Due to technical limitations this algorithm sometimes returns the innermost hit of the hadronic τ tracks instead of the correct secondary vertex. But even the vector pointing from the primary vertex to this hit forms a reasonable estimator for the flight direction of the original τ lepton since it defines the decay plane together with the τ jet direction.

All tests in the following show that the impact of variables containing information about the secondary vertex of the hadronic τ decay on the mass reconstruction is not as large as expected. The main reason for this might be that this information is not available for all events and that it is not clear whether the secondary vertex has been correctly reconstructed. Figure 4.4 depicts another possible reason. The generated angles between the flight directions of the hadronically decaying tauon and its visible and invisible decay products are very small. Especially for masses that are significantly larger than the Z mass, the decay products fly in almost the same direction as the original tauon due to its large boost. This point and the uncertainty of the secondary vertex reconstruction explain the weak influence of secondary vertex information on the mass reconstruction.

It is not possible to simply imitate the SVfit algorithm with a neural network since this method also takes into account theoretical information from matrix element calculations. Such information is only available in terms of functions depending on hypotheses for the missing momentum. However, a neural network generally requires simple input values. Moreover, the aim of the network-based approach is a mass hypothesis based on measured quantities without additional hypotheses in the set of input variables.

Figure 4.4: Distributions of generated opening angles (in degrees) as a function of the generated ττ mass $m_{\tau\tau}$ (in GeV/c²). Especially the small angles between the original $\tau_\text{had}$ flight direction and its visible decay products (left), but also the only slightly larger angles between visible and missing $\tau_\text{had}$ decay products (right), indicate a large boost of the original hadronically decaying tauon.

Before each training procedure all variables are decorrelated and ranked by the NeuroBayes preprocessing routines. For each training discussed in the following, a test training with all available variables as described above is performed first in order to obtain the variable ranking from NeuroBayes. This ranking is the basis of every further study of the variables. Thresholds on the added correlation $C_\text{added}$ as defined in Section 3.3 are used to prune input variables which have only little or even no impact on the training performance.¹ Variables with meaningful information for each event are used in a twofold way: both their information with respect to the target mean and with respect to the target width is used as input. For variables with missing information in certain events this is not possible. A dedicated value (-999) indicates to the network that the information for such a variable is missing in a certain event. This is the case for all variables containing secondary vertex information, such as the distance between primary and secondary vertex or angular relations between the reconstructed tauon momentum and the visible momenta.

¹ NeuroBayes typically prints out the added significance. The problem with this quantity is that it depends on the size of the training sample, so that it is only meaningful relative to other added significances. To avoid this problem, the added correlation as defined in Section 3.3 is used instead.
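The two bookkeeping steps described above, pruning by a threshold on the added correlation and flagging missing information, can be sketched in Python. The added correlations are copied from Table 4.2. The short variable names, the helper functions and the alternative threshold of 7 % are invented for this illustration.

```python
MISSING = -999.0  # flag value quoted in the text for missing information


def encode(value):
    """Replace a missing measurement (None) by the flag value."""
    return MISSING if value is None else float(value)


def prune(added_correlation, threshold):
    """Keep the variables whose added correlation (in percent) exceeds the threshold."""
    return [name for name, c in added_correlation.items() if c > threshold]


# Added correlations in percent from Table 4.2 (short names are hypothetical).
ranking = {
    "m_vis": 77.1,
    "met": 25.9,
    "dphi_taujet_mu": 9.6,
    "dphi_taujet_met": 7.7,
    "m_vis_width": 7.4,
    "d_mu_significance": 7.2,
    "d_tau_lead_significance": 6.9,
    "m_taujet_width": 5.5,
    "deta_taujet_mu": 5.4,
}

kept_at_5_percent = prune(ranking, 5.0)
kept_at_7_percent = prune(ranking, 7.0)
flight_distance = encode(None)  # e.g. no secondary vertex reconstructed
```

With the threshold of 5 % all nine variables of the table are kept, whereas a tighter threshold of 7 % would leave six of them.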


4.3.2 Simple One-staged Network Topology

The first approach is a straightforward one-staged network topology which directly predicts the invariant ττ mass. The training target therefore is $m_{\tau\tau}^\text{gen}$. The input variables are selected according to the condition $C_\text{added} > 5\,\%$ for the added correlation, which indicates the impact of single variables on the training performance (see Section 3.3). The nine resulting variables form a network of manageable complexity with respect to both its dependency on reconstructed quantities and the time needed for evaluating single events. Table 4.2 lists the input variables together with information about the NeuroBayes ranking.

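| Variable | $C_\text{target}$ [%] | $C_\text{added}$ [%] | $C_\text{loss}$ [%] | $C_\text{others}$ [%] |
|---|---|---|---|---|
| $m_{\tau\tau}^\text{vis}$ | 77.1 | 77.1 | 32.9 | 79.9 |
| $E_T^\text{miss}$ | 45.7 | 25.9 | 26.7 | 59.3 |
| $\lvert\Delta\varphi(\tau_\text{jet}, \mu) - \pi\rvert$ | 25.7 | 9.6 | 13.6 | 62.2 |
| $\lvert\Delta\varphi(\tau_\text{jet}, \text{MET}) - \pi\rvert$ | 26.3 | 7.7 | 7.7 | 61.4 |
| $m_{\tau\tau}^\text{vis}$ (*) | 58.7 | 7.4 | 8.3 | 73.2 |
| $\lvert d_\text{min}^{\mu}\rvert / \sigma$ | 15.4 | 7.2 | 6.9 | 22.3 |
| $\lvert d_\text{min}(\tau_\text{jet}^\text{lead})\rvert / \sigma$ | 10.8 | 6.9 | 6.9 | 7.4 |
| $m_{\tau_\text{jet}}$ (*) | 4.0 | 5.5 | 5.4 | 5.8 |
| $\lvert\Delta\eta(\tau_\text{jet}, \mu)\rvert$ | 23.1 | 5.4 | 5.4 | 41.7 |

Table 4.2: Ranking of the nine most important input variables used for the straightforward one-staged network. Variables marked with an asterisk (*) are preprocessed with respect to the target width instead of its mean. Four different correlations as defined in Section 3.3 determine the quality of each variable. The list is ordered by the NeuroBayes ranking that mainly follows the added correlation $C_\text{added}$.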

The ranking clearly shows that the most important variables are the visible ττ mass and the missing transverse energy, because they have a large correlation $C_\text{target}$ with the target and contain at least partially complementary information. However, the impact of MET as indicated by the added correlation is significantly smaller than its actual correlation with the target. The information that MET is able to contribute is therefore already partially covered by the visible ττ mass.

Three groups of additional variables contribute substantially to the training. These are angular relations between visible and missing decay products, impact parameter significances and the mass of the τ jet. The first group, $\Delta\varphi(\tau_\text{jet}, \mu)$, $\Delta\varphi(\tau_\text{jet}, \text{MET})$ and $\Delta\eta(\tau_\text{jet}, \mu)$, describes the opening angles between the visible decay products, the muon and the τ jet, as well as knowledge about the flight direction of the sum of all neutrinos in relation to the muon flight direction. The second group, the impact parameter of the muon track $d_\text{min}^{\mu}$ and that of the leading track belonging to the τ jet $d_\text{min}(\tau_\text{jet}^\text{lead})$, indirectly provides information about the opening angles between the visible and invisible decay products of each τ lepton separately. Dividing by the individual reconstruction uncertainties σ yields the significances of the impact parameters. Finally, the mass of the τ jet $m_{\tau_\text{jet}}$ contributes knowledge about the hadronic decay mode as described above.

Further tests with larger sets of input variables have been performed (see Figure A.1a). It turned out that the overall mass reconstruction cannot be improved significantly by using more input variables. In fact, this only leads to a method that is more complicated, since the agreement between real data and the simulation has to be examined for more variables.

The training is first performed only on Z events with an invariant mass between 45 GeV/c² and 250 GeV/c². The events used form a uniform distribution. This prevents the network from learning preferences in the target distribution. The testing is done on a statistically independent Z sample.

Two other tests concern the impact of the number of training events $N_\text{train}$ and the number of output nodes $N_\text{out}$ of the neural network. Figures A.1b and A.1c show that $N_\text{train} \approx 100{,}000$ and $N_\text{out} = 20$ are good choices as they minimise the relative reconstruction uncertainties, although the differences are small.

The following list summarises the standard settings for the trainings discussed in the following.

• Training sample: Z events

• Mass range: 45 GeV/c² ≤ $m_{\tau\tau}^\text{gen}$ ≤ 250 GeV/c² (uniformly distributed)

• Number of training events: Ntrain ≈ 100,000

• Used input variables: Cadded > 5 %, see Table 4.2

• Number of output nodes: Nout = 20

• Testing sample: Z events (uniformly distributed and statistically independent from training sample)

All current mass definitions yield at most one estimator for the ττ mass. An intuitive comparison of the network performance with the performance of the current methods is therefore based on point estimators derived from the per-event pdf that NeuroBayes outputs. Figure 4.5 depicts three point estimators: the maximum likelihood value, the mean and the median of the pdf.

The first plot, showing the reconstructed Z mass distributions, allows the shape of the distributions to be evaluated, using the Z mass as an example. The spiky shape of the maximum likelihood distribution is immediately apparent. NeuroBayes essentially determines a cumulative distribution as a function of the mass. This distribution is fitted by a spline function and then differentiated to obtain a probability density for the mass. Numerical issues occurring during this procedure are expected to cause the clustering of the maximum likelihood values. The distributions of the mean and median values are smoother and better shaped.
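The step from a cumulative distribution to a density can be illustrated with a deliberately crude Python sketch in which the spline fit is replaced by finite differences between a few nodes. This is a stand-in chosen for brevity and says nothing about the NeuroBayes internals. The logistic toy distribution and all names are assumptions.

```python
import math


def density_from_cdf(q_nodes, cdf_values):
    """Finite-difference derivative of a tabulated cumulative distribution.

    Returns a list of (interval midpoint, density) pairs.
    """
    density = []
    for k in range(len(q_nodes) - 1):
        dq = q_nodes[k + 1] - q_nodes[k]
        midpoint = 0.5 * (q_nodes[k] + q_nodes[k + 1])
        density.append((midpoint, (cdf_values[k + 1] - cdf_values[k]) / dq))
    return density


def most_likely_value(density):
    """Midpoint of the interval with the largest density."""
    return max(density, key=lambda point: point[1])[0]


# Toy cumulative distribution: logistic curve centred at 90, tabulated at ten nodes.
q_nodes = [45.0 + 10.0 * k for k in range(10)]
cdf_values = [1.0 / (1.0 + math.exp(-(q - 90.0) / 10.0)) for q in q_nodes]

density = density_from_cdf(q_nodes, cdf_values)
mode = most_likely_value(density)
```

In this simplified picture the most likely value can only take one of a few discrete positions, whereas the mean and the median vary continuously. This is one conceivable mechanism for a spiky distribution of maximum likelihood values, offered here as an illustration rather than as an established explanation.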


[Figure 4.5, two panels. Left: distribution (arbitrary units) of the reconstructed mass $m_{\tau\tau}^{\mathrm{reco}}$ [GeV/c²]. Right: reconstructed mass $m_{\tau\tau}^{\mathrm{reco}}$ [GeV/c²] versus generated mass $m_{\tau\tau}^{\mathrm{gen}}$ [GeV/c²]. Curves in both panels: NB (max. likeli.), NB (mean), NB (median).]

Figure 4.5: Distribution of three point estimators derived from the probability density reconstructed by NeuroBayes. The left plot shows the Z mass spectra as a result of applying the method to an original Z sample, whereas the right plot provides information about the reconstructed masses as a function of the generated mass. The lines indicate the average reconstructed masses and the bands around the lines show their standard deviations.

The second plot shows the averages of the reconstructed mass estimators as a function of the generated mass. The means of the distributions shown in the left plot correspond to the function values in the right plot at the generated Z mass ($m_{\tau\tau}^{\mathrm{gen}} = m_Z$). The dashed green line indicates a perfect reconstruction with $m_{\tau\tau}^{\mathrm{reco}} = m_{\tau\tau}^{\mathrm{gen}}$. The standard deviations of the mass hypotheses for each point estimator in slices of generated mass are shown as transparent bands around the average values. These widths are identified with the reconstruction uncertainties. At the Z mass they correspond to the widths of the mass estimator distributions shown in the left plot of Figure 4.5.

The mean and median mass values overestimate the true target values, but they preserve the linear dependency on the target values in the interesting mass range between the Z mass and about 150 GeV/c². Outside this range the curves flatten out, with the effect being more pronounced at the high-mass boundary, since means and medians of distributions do not tend to lie at the boundaries of the intervals on which the distributions are defined. These boundary effects are therefore expected and can be avoided by using a larger interval for the target values during training.
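The boundary effect can be made plausible with a toy calculation, given here as a minimal Python sketch. It assumes, purely for illustration, that the per-event pdf is a Gaussian of fixed width truncated to the training interval from 45 to 250 GeV/c². The width of 20 GeV/c² and all function names are assumptions and are not taken from the network output.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_mean(mu, sigma, low, high):
    """Mean of a Gaussian(mu, sigma) restricted to the interval [low, high]."""
    a = (low - mu) / sigma
    b = (high - mu) / sigma
    return mu + sigma * (normal_pdf(a) - normal_pdf(b)) / (normal_cdf(b) - normal_cdf(a))


LOW, HIGH, SIGMA = 45.0, 250.0, 20.0  # GeV/c^2, toy values (assumption)
for peak in (150.0, 200.0, 230.0, 250.0):
    # The mean follows the peak in the centre and lags behind it near the boundary.
    print(peak, round(truncated_mean(peak, SIGMA, LOW, HIGH), 1))
```

In this toy model a peak located exactly at the upper boundary of 250 GeV/c² yields a mean of about 234 GeV/c², whereas a peak at 150 GeV/c² is reproduced almost exactly. This mirrors the flattening of the curves towards the interval boundaries.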

Figure 4.6 examines the reconstruction uncertainties in more detail. The left plot illustrates the estimator distribution width as a function of the generated mass. The values correspond to the widths of the bands shown in Figure 4.5. Dividing them by the averages of the reconstructed mass distributions gives the reconstruction resolutions depicted in the right plot. For instance, the NeuroBayes mean reconstruction for true Z masses yields a mean of about 110 GeV/c² and a width slightly above 20 GeV/c².


[Figure 4.6, two panels, both as a function of the generated mass $m_{\tau\tau}^{\mathrm{gen}}$ [GeV/c²]. Left: reconstruction uncertainty $\sigma^{\mathrm{reco}}_{m_{\tau\tau}}$ [GeV/c²]. Right: reconstruction resolution $\sigma^{\mathrm{reco}}_{m_{\tau\tau}} / m_{\tau\tau}^{\mathrm{reco}}$. Curves in both panels: NB (max. likeli.), NB (mean), NB (median).]

Figure 4.6: Uncertainties of the reconstruction for the three point estimators determined from the probability density reconstructed by NeuroBayes. The left plot shows the widths of the reconstructed mass distributions as a function of the generated mass. These widths are the same as those indicated by the bands in the right plot of Figure 4.5. Divided by the average reconstructed mass as a function of the generated mass, one gets the relative reconstruction uncertainty or resolution (right).

This gives a relative width or resolution of about 18 %. Due to the boundary effects mentioned above, the width values, and therefore also the resolutions, cannot exceed a certain threshold. The spiky distribution of the maximum likelihood estimators explains the comparatively large uncertainties shown in the plot.
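The quantities shown in Figures 4.5 and 4.6 (average, width and relative width per slice of generated mass) can be computed as in the following minimal Python sketch. The function name, the slice edges and the toy events are hypothetical, and the population standard deviation is assumed as the definition of the width.

```python
from statistics import mean, pstdev


def resolution_profile(gen_masses, reco_masses, edges):
    """Average, width and relative width of the reconstructed mass per generated-mass slice.

    Returns a list of (slice centre, average, width, width / average) tuples;
    empty slices are skipped.
    """
    profile = []
    for low, high in zip(edges[:-1], edges[1:]):
        in_slice = [r for g, r in zip(gen_masses, reco_masses) if low <= g < high]
        if not in_slice:
            continue
        average = mean(in_slice)
        width = pstdev(in_slice)  # assumption: population standard deviation
        profile.append((0.5 * (low + high), average, width, width / average))
    return profile


# Toy events in a single slice around the Z mass (values in GeV/c^2, made up).
gen = [91.0, 91.0, 91.0, 91.0]
reco = [90.0, 130.0, 90.0, 130.0]
profile = resolution_profile(gen, reco, edges=[85.0, 95.0, 105.0])
```

The toy values are chosen such that the slice has an average of 110 GeV/c² and a width of 20 GeV/c², which reproduces the relative width of about 18 % quoted above.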

For mass reconstructions that depend linearly on the true mass, the resolution has to be optimised. This is crucial for reconstructing mass peaks that are as narrow as possible, which is essential in the Higgs search for separating Higgs and Z events. Hence, the right plot can be seen as a benchmark plot. In the following, this representation is used to compare different methods. To avoid confusion, only the mean of the per-event pdf is used, since it results in the best resolution over the entire mass range.

Comparison with the Current Mass Definitions

Now that it has been shown that a mass reconstruction with the neural network package NeuroBayes works in principle, the method has to demonstrate its superiority over the existing methods in order to justify deploying it in actual analyses. Figure 4.7 shows two quality plots comparing the network estimator with the visible mass and the SVfit mass. The collinear approximation mass is omitted because its long tails would dominate the plots, see Figure 4.2b.

In the left plot, the almost perfectly linear dependency of the reconstructed masses on the generated masses is visible for the current mass definitions. Nevertheless, the network reconstruction also has this property in the interesting intermediate mass range,


[Figure 4.7, two panels, both as a function of the generated mass $m_{\tau\tau}^{\mathrm{gen}}$ [GeV/c²]. Left: reconstructed mass $m_{\tau\tau}^{\mathrm{reco}}$ [GeV/c²]. Right: reconstruction resolution $\sigma^{\mathrm{reco}}_{m_{\tau\tau}} / m_{\tau\tau}^{\mathrm{reco}}$. Curves in both panels: Vis. mass, SVfit mass, NeuroBayes.]

Figure 4.7: Comparison of the performance of current mass definitions with the new neural network approach. For benchmarking the performances, both the dependency of the averages of the mass reconstructions on the generated mass (left) and the resolutions as a function of the generated mass (right) are important. The boundary regions are greyed out because a meaningful evaluation of the network performances is not possible there.

which can be extended by using another training sample with greater generated ττ masses. Moreover, in this mass range, the SVfit and the network mass yield almost the same slope and therefore the same dependency on the true ττ mass.

The right plot shows the reconstruction resolutions of the three methods. The overall performance of the NeuroBayes method exceeds that of the current methods in the interesting region above 100 GeV/c². The visible mass cannot be recommended for separating the two mass peaks of Z and Higgs events because its resolution depends strongly on the true mass. Although the visible mass achieves the narrowest mass peak for original Z events, its resolution for Higgs events with higher masses is worse than that of the SVfit reconstruction. The SVfit algorithm shows a weaker dependency of the resolution on the true mass, as its relative uncertainties lie between 20 % and 25 % over a broad mass range. The new network method yields results at least as precise as the SVfit mass over the entire mass range. In the intermediate mass range its relative uncertainty falls below that of the SVfit mass by 1 to 5 percentage points. For example, at a true mass of 150 GeV/c² the SVfit mass yields a relative uncertainty of about 23 %, whereas the network method achieves a resolution of about 18 %.

In current analyses searching for the Higgs boson, the SVfit mass is the preferred mass definition when Z → ττ and H → ττ events have to be distinguished (see Chapter 5). The improvement of the SVfit mass over the visible mass in terms of relative reconstruction uncertainties is surpassed significantly by the results of the neural network approach. Therefore it can be expected that Higgs-Z separation analyses will benefit from this gain in performance, even though


this improvement is not quantified in the scope of this thesis.

For further comparisons of new methods with current ones, only the SVfit mass is shown in the following, because new methods have to outperform this best algorithm to date.

Performance on Different Hadronic Resonances

Additional tests investigate the possibility of further improving the performance of the previously described neural network mass reconstruction in special cases. The first study concerns the impact of the spin of the resonance in the hadronic τ lepton decay.

Figure 4.8 shows the generated mass spectrum of the hadronically decaying τ leptons with identified resonances. Whereas events with jet masses below 0.51 GeV/c²

[Figure 4.8. Horizontal axis: generated τ jet mass $m_{\tau_{\mathrm{jet}}}^{\mathrm{gen}}$ [GeV/c²]; vertical axis: arbitrary units. Peak labels: π, K*, a1.]

Figure 4.8: Mass spectrum of the τ jet. While τ jets with masses below 0.51 GeV/c² (blue line) mainly originate from pion and kaon resonances with zero spin, jets with masses above this threshold form resonances with spin 1ħ in most cases. The pion peak is cropped in order to achieve a good depiction of all resonances.

mainly decay via spin-0 resonances, e.g. pions and kaons, the higher jet masses produce resonances with spin 1ħ such as ρ or a1 mesons. In particular, a1 mesons also feature 3-prong decays, i.e. decays with three charged particles. The spin of the resonant intermediate particle influences the angular distributions of its decay products, which could introduce new information to the neural network.

Therefore it has been studied whether this fact also influences the reconstruction. Two new trainings have been performed with the same standard settings as discussed above (in particular the same input variables) except for the training sample. One network has been trained on low generated τ jet masses and the other on high masses, separated by the threshold at 0.51 GeV/c². Similar training sample sizes have been ensured.

Figure 4.9 illustrates the performance of these two networks in comparison with the full training. Each network is tested separately on events with mainly spin-0 hadronic


[Figure 4.9, two panels. Both show the reconstruction resolution $\sigma^{\mathrm{reco}}_{m_{\tau\tau}} / m_{\tau\tau}^{\mathrm{reco}}$ as a function of the generated mass $m_{\tau\tau}^{\mathrm{gen}}$ [GeV/c²]. Curves in both panels: SVfit mass; NB, training: all $m_{\tau_{\mathrm{jet}}}^{\mathrm{gen}}$; NB, training: $m_{\tau_{\mathrm{jet}}}^{\mathrm{gen}}$ < 0.51 GeV/c²; NB, training: $m_{\tau_{\mathrm{jet}}}^{\mathrm{gen}}$ > 0.51 GeV/c².]

Figure 4.9: Comparison of the network performances for trainings on different hadronic resonances. The reconstruction resolutions are evaluated on a testing sample with mainly spin-0 hadronic resonances with $m_{\tau_{\mathrm{jet}}}^{\mathrm{gen}}$ < 0.51 GeV/c² (left) and on another with jet masses above the threshold (right) that contains mainly hadronic resonances with spin 1. The SVfit mass is added to the figures to provide a reference for the comparison.

resonances and on events with mainly spin-1 resonances. Compared to the relatively large differences in the performance of the SVfit method, the different networks show no significant trends. Only in the low generated ττ mass region does the spin-0 training yield slightly worse resolutions. Additionally, hadronically decaying tauons decay about three times as often into jets with masses above 0.51 GeV/c². Since the performances of the training on all events and of the training on events with high τ jet masses do not differ at all, it can be stated that the reconstruction does not gain from such a distinction. Thus, there is no need to study possible spin measurements for this purpose, which would otherwise be required because the mass spectrum is smeared by the detector response and the jet reconstruction.

Performance on a Higgs Sample

A second test of the network approach discussed in this section evaluates the performance of the network on Z and Higgs events. In addition to the network trained on Z events, two new networks have been trained that differ only in the training sample. One network has been trained on Higgs events only and the other on a mixture of Z and Higgs events. Both networks have also been trained with about 100,000 events that are characterised by a flat true mass distribution. The mixture of Z and Higgs events therefore contains only every second event used in the trainings on the pure Z or Higgs samples.

Again, the resolutions are compared on two different testing samples, Z and Higgs events. Figure 4.10 shows the performance of all three networks. Over the entire


[Figure 4.10, two panels. Both show the reconstruction resolution $\sigma^{\mathrm{reco}}_{m_{\tau\tau}} / m_{\tau\tau}^{\mathrm{reco}}$ as a function of the generated mass $m_{\tau\tau}^{\mathrm{gen}}$ [GeV/c²]. Curves in both panels: SVfit mass; NB, training: Z; NB, training: H; NB, training: Z and H.]

Figure 4.10: Comparison of the network performances for trainings on different combinations of Z and Higgs events. The reconstruction resolutions are evaluated on a pure Z testing sample (left) and on another that contains only Higgs events (right). The SVfit mass is added to the figures to provide a reference for the comparison.

mass range and for both testing samples, the resolution of the Z training can be improved by training on a mixture of Z and Higgs events. The improvement is even more pronounced for the training on Higgs events only. The study of possible reasons for this behaviour is beyond the scope of this thesis.

The absolute performances on Z and Higgs events can also be compared for the different trainings. The resolutions determined on Z and Higgs samples differ visibly only in the low mass range, whereas Z and Higgs events with higher generated ττ masses are reconstructed with the same relative uncertainty, independent of the training method.

In comparison with the SVfit method, it has to be noted that the network version is more stable, in the sense that its reconstruction resolution does not vary between the different samples as much as that of the SVfit method does.

4.3.3 Reconstruction of Corrections for the Visible Mass

Until now, only a straightforward approach has been discussed. Its most critical point is that the entire procedure of predicting a ττ mass is carried out by the network as a statistical tool. It is not easy to understand the reconstruction procedure in physical terms. Also, no knowledge about any physical background has been introduced (except for the spin information of the hadronic resonance). Therefore, two methods are introduced in the following that address these points in some respects.

The visible ττ mass is characterised by the almost perfectly linear dependency of the average estimated masses on the true mass (see Figure 4.7). As an existing simple mass estimator, the visible mass yields reasonable results in the correct order


of magnitude if its values are multiplied by a factor of two. With this knowledge, the work that has to be done by the neural network can be reduced by splitting the mass prediction into two terms:

$$m_{\tau\tau} = 2\, m_{\tau\tau}^{\mathrm{vis,reco}} + \delta m$$

The neural network is then supposed to focus on estimating the corrections $\delta m$ to twice the visible mass in order to improve its performance. Figure 4.11 shows the distribution of the target values as well as that of the reconstructed estimators. Both distributions are peaked with a mean reasonably close to zero.
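The split of the mass prediction can be written down in a few lines, given here as a minimal Python sketch. It assumes that the training target is obtained by solving the formula above for δm with the generated mass on the left-hand side. The function names and the toy values are hypothetical.

```python
def correction_target(m_gen, m_vis):
    """Training target (assumption): true mass minus twice the visible mass."""
    return m_gen - 2.0 * m_vis


def corrected_mass(m_vis_reco, delta_m):
    """Mass estimate from the reconstructed visible mass and an estimated correction."""
    return 2.0 * m_vis_reco + delta_m


# Toy event (values in GeV/c^2, made up): true mass 91, visible mass 40.
target = correction_target(91.0, 40.0)
estimate = corrected_mass(40.0, target)
```

With a perfectly estimated correction the generated mass is recovered exactly. In practice the network output for δm replaces `target` in the second call.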

[Figure 4.11. Horizontal axis: correction δm [GeV/c²]; vertical axis: arbitrary units. Curves: gen., reco.]

Figure 4.11: Distribution of the generated and reconstructed correction values for the visible mass for a Z sample that is uniformly distributed in the true ττ mass.

As in the previous section, the impact of the input variables on the training performance has also been studied for this training procedure. All variables have been tested for their impact on the training performance, and a threshold of $C_{\mathrm{added}}$ > 5 % has been defined in order to use only the most important variables for the actual training. Again, the visible mass and the missing transverse energy turned out to be the most important variables. A list of the ten variables used, including the NeuroBayes ranking results, can be found in Table A.2.

Apart from only slight changes with respect to the previous trainings, the transverse momentum of the muon $p_{\mathrm{T}}^{\mu}$ and the ratio of missing transverse energy to total visible energy $E_{\mathrm{T}}^{\mathrm{miss}}/E_{\tau\tau}$ gained importance here. Both variables, together with the visible mass, are considered to bring in knowledge about the missing momentum, which in turn provides information about the corrections needed for the visible mass.

Figure 4.12 compares the performance of this new method with the straightforward network discussed in the previous section as well as with the SVfit mass. Again, the left plot helps to evaluate the linearity of the reconstruction. The boundary effects are not reduced significantly, although the distribution of target values depicted in Figure 4.11 shows that most of the events lie in its central


[Figure 4.12, two panels, both as a function of the generated mass $m_{\tau\tau}^{\mathrm{gen}}$ [GeV/c²]. (a) Reconstructed vs. generated mass: reconstructed mass $m_{\tau\tau}^{\mathrm{reco}}$ [GeV/c²]. (b) Reconstruction resolution: $\sigma^{\mathrm{reco}}_{m_{\tau\tau}} / m_{\tau\tau}^{\mathrm{reco}}$. Curves in both panels: SVfit mass; NB, target: $m_{\tau\tau}^{\mathrm{gen}}$; NB, target: δm.]

Figure 4.12: Comparison of the simple one-staged network approach with a two-staged multi-network approach. The average reconstructed masses (left) and the reconstruction resolutions (right) are shown as a function of the generated mass. The SVfit mass is added to the figures to provide a reference for the comparison.

region. A possible reason for this is that the corrections δm themselves depend on the visible mass in a non-trivial way, so that the network reconstruction is not simplified at all in comparison with the straightforward network approach.

The right plot depicts the reconstruction resolution. For true ττ masses in the interesting mass range above about 110 GeV/c², the resolution is improved only slightly, whereas the reconstruction of Z masses yields higher uncertainties. At a true mass of 90 GeV/c² the relative uncertainty worsens from about 19 % to 20 %. Therefore, the straightforward approach is preferred over this new approach for Higgs-Z separation purposes, since not only a good Higgs mass reconstruction but also a small relative uncertainty of the Z mass reconstruction is required.

4.3.4 Physically Motivated Approach Using a Parametrisation of the Missing Momentum

The concept of subdividing the entire mass reconstruction problem into small tasks with certain physical relevance is used extensively in this section.

The missing information about the total neutrino four-momentum is contained in two independent quantities, because the missing momentum is fully known in the transverse plane. These unknown quantities can be chosen as the invariant mass of the sum of all neutrinos and the polar angle of the missing momentum vector. As neither quantity depends on any measured quantities in an intuitive way, another approach with three parameters is studied. Obviously, these parameters cannot be fully independent of each other.


The invariant ττ mass can be expressed in terms of the measured visible energy $E_{\mathrm{vis}}$ and the modulus of the measured momentum $|\vec{p}_{\mathrm{vis}}|$ and parametrised by three parameters α, β and γ. Each parameter has a simple physical interpretation. The first two describe the ratios between missing and visible energy and momentum, respectively, whereas γ denotes the angle between the total visible momentum vector and the missing momentum.

$$m_{\tau\tau}^2 = \left[E_{\mathrm{vis}}\,(1+\alpha)\right]^2 - \left|\vec{p}_{\mathrm{vis}}\right|^2 \cdot \left(1 + 2\beta\cos\gamma + \beta^2\right)$$

$$\text{with}\quad \alpha = \frac{E_{\mathrm{miss}}}{E_{\mathrm{vis}}}, \quad \beta = \frac{\left|\vec{p}_{\mathrm{miss}}\right|}{\left|\vec{p}_{\mathrm{vis}}\right|} \quad\text{and}\quad \gamma = \angle\left(\tau\tau_{\mathrm{miss}}, \tau\tau_{\mathrm{vis}}\right)$$

For each parameter a separate network has been trained. Tables A.3, A.4 and A.5 list the input variables used for each network that fulfil the usual condition $C_{\mathrm{added}}$ > 5 %. It has to be emphasised that for every training the most important variable determined by NeuroBayes is the one expected from the physical understanding of the parameters. The parameter α depends strongly on the ratio of the missing transverse energy to the visible ττ energy $E_{\mathrm{T}}^{\mathrm{miss}}/E_{\tau\tau}$, whereas the ratio of the missing transverse momentum to the measured ττ momentum $E_{\mathrm{T}}^{\mathrm{miss}}/p_{\tau\tau}$ has the greatest impact on the reconstruction of the parameter β. Moreover, the most important variable for the last parameter γ meets the expectations: it is the angle between the muon and the τ jet flight directions, $\angle(\tau_{\mathrm{jet}}, \mu)$.

After the parameters have been reconstructed, they have to be combined to obtain the invariant ττ mass. The easiest way is an analytical calculation of the mass with the formula given above. This causes the loss of very few events for which the radicand in the formula becomes negative. Alternatively, it is possible to train a second-stage network on the ττ mass which takes the estimated parameters of the first-stage networks, the energy and momentum of the visible decay products as well as the analytical solution as input variables. The inclusion of the analytical solution is necessary to enhance the total correlation of the set of input variables with the target and to enable a reasonable mass reconstruction. With this procedure, it can be investigated whether the analytical calculation can be improved by an additional network. Table A.6, which lists all input variables used for the combining fourth network, clearly indicates that the training is strongly dominated by the analytically calculated mass.
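The analytical combination, including the loss of events with a negative radicand, can be sketched in Python as follows. The function names and the toy four-momenta are hypothetical. The second helper only serves to cross-check the formula against the invariant mass of the summed four-momentum and is not part of the reconstruction, where the parameters are estimated by the networks.

```python
import math


def mass_from_parameters(e_vis, p_vis, alpha, beta, gamma):
    """Analytical tau-pair mass from visible energy, momentum modulus and the three parameters.

    Returns None if the radicand is negative, i.e. the event is lost.
    """
    radicand = (e_vis * (1.0 + alpha)) ** 2 - p_vis ** 2 * (
        1.0 + 2.0 * beta * math.cos(gamma) + beta ** 2
    )
    if radicand < 0.0:
        return None
    return math.sqrt(radicand)


def parameters_from_momenta(e_vis, p_vis_vec, e_miss, p_miss_vec):
    """alpha, beta, gamma for given visible and missing energies and momentum vectors."""
    p_vis = math.sqrt(sum(c * c for c in p_vis_vec))
    p_miss = math.sqrt(sum(c * c for c in p_miss_vec))
    dot = sum(v * m for v, m in zip(p_vis_vec, p_miss_vec))
    cos_gamma = max(-1.0, min(1.0, dot / (p_vis * p_miss)))
    return e_miss / e_vis, p_miss / p_vis, math.acos(cos_gamma)


# Toy event (made-up values): visible system E=5, p=(3,0,0); missing system E=4, p=(0,4,0).
alpha, beta, gamma = parameters_from_momenta(5.0, (3.0, 0.0, 0.0), 4.0, (0.0, 4.0, 0.0))
mass = mass_from_parameters(5.0, 3.0, alpha, beta, gamma)
# Cross-check: invariant mass of the summed four-momentum, sqrt(9^2 - (3^2 + 4^2)).
direct = math.sqrt((5.0 + 4.0) ** 2 - (3.0 ** 2 + 4.0 ** 2))
```

Returning `None` for a negative radicand makes the event loss explicit, so that the caller can count or skip such events.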

Figure 4.13 shows the performance of the analytical combination and the network combination in comparison with the straightforward one-staged network reconstruction and the SVfit mass. Firstly, the analytical parameter combination yields, on average, a highly linear dependency between reconstructed and generated masses over the entire studied mass range, as shown in the left plot. The boundary effects are noticeably reduced compared to the other statistical approaches. In comparison to the reconstruction of corrections for the visible mass, the weaker boundary effects here may be explained by training targets that depend less on the mass. As can be seen in Tables A.3, A.4 and A.5, the visible mass is superseded by at least one other variable in each parameter training in terms of correlations with the


[Figure 4.13, two panels, both as a function of the generated mass $m_{\tau\tau}^{\mathrm{gen}}$ [GeV/c²]. Left: reconstructed mass $m_{\tau\tau}^{\mathrm{reco}}$ [GeV/c²]; curves: NB, one-staged; NB, analyt. param. comb.; NB, net param. combination. Right: reconstruction resolution $\sigma^{\mathrm{reco}}_{m_{\tau\tau}} / m_{\tau\tau}^{\mathrm{reco}}$; curves: SVfit mass; NB, one-staged; NB, analyt. param. comb.; NB, net param. combination.]

Figure 4.13: Comparison of a direct training on the generated ττ mass and a two-staged training that follows the parametrisation approach. The average reconstructed masses (left) and the reconstruction resolutions (right) are shown as a function of the generated mass. It is clearly apparent that the more complicated but also more comprehensible approach results in less precise reconstructions. The SVfit mass is added to the figures to provide a reference for the comparison.

target $C_{\mathrm{target}}$. However, this method results in a reconstruction uncertainty that does not fall below 25 %, as shown in the right plot. Even the SVfit mass yields better resolutions over the entire mass range. This can be explained by the uncertainty with which every reconstructed parameter is afflicted. The analytical calculation propagates these uncertainties, which only grow in the process.
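The growth of the uncertainty in the analytical combination can be made explicit with first-order error propagation. The following expressions are a sketch under the simplifying assumptions that the three parameter estimates are uncorrelated and that the measured visible energy and momentum are exact. Since the parameters are not fully independent, covariance terms would have to be added in a complete treatment. The symbols $\sigma_\alpha$, $\sigma_\beta$ and $\sigma_\gamma$ for the parameter uncertainties are introduced here for illustration only.

$$\sigma_{m_{\tau\tau}}^2 \approx \left(\frac{\partial m_{\tau\tau}}{\partial \alpha}\right)^2 \sigma_\alpha^2 + \left(\frac{\partial m_{\tau\tau}}{\partial \beta}\right)^2 \sigma_\beta^2 + \left(\frac{\partial m_{\tau\tau}}{\partial \gamma}\right)^2 \sigma_\gamma^2$$

$$\frac{\partial m_{\tau\tau}}{\partial \alpha} = \frac{E_{\mathrm{vis}}^2\,(1+\alpha)}{m_{\tau\tau}}, \qquad \frac{\partial m_{\tau\tau}}{\partial \beta} = -\frac{\left|\vec{p}_{\mathrm{vis}}\right|^2 (\beta + \cos\gamma)}{m_{\tau\tau}}, \qquad \frac{\partial m_{\tau\tau}}{\partial \gamma} = \frac{\left|\vec{p}_{\mathrm{vis}}\right|^2 \beta \sin\gamma}{m_{\tau\tau}}$$

All three contributions enter quadratically and add up, so within this approximation the combined mass uncertainty is at least as large as the largest single contribution.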

Therefore it is necessary to employ a more sophisticated method of combining the parameters and measured quantities. For this purpose the second-stage network has been trained. Its results show the smallest slope of the average reconstructed masses as a function of the generated masses. This can be explained by the smaller total correlation of the set of input variables with the target of 72.6 %, compared to 83.5 % for the straightforward one-staged network. The right plot shows that the resolution can definitely be improved by using a second-stage network on top of the analytical calculation. However, the performance does not reach that of the straightforward one-staged network, although even the SVfit mass is outperformed by this approach in the interesting mass range.

It is of course possible simply to add more variables to the second-stage network, especially the invariant ττ mass and the missing transverse energy, in order to achieve a higher correlation of the set of input variables with the training target and thereby a better performance of the method. This procedure, however, defeats the purpose of the parametrisation approach introduced above and makes the method more complicated. This example strongly suggests keeping the mass reconstruction method as simple as possible.


4.4 Conclusion and Outlook

In this chapter a new method for reconstructing the invariant ττ mass in Z and Higgs decays has been presented that is based on the neural network package NeuroBayes. The proof of principle has been demonstrated with a straightforward approach. A simple network was able to reconstruct the ττ mass as precisely as the SVfit mass, a method that reconstructs the mass by maximising likelihoods based on the ττ kinematics.

The methods presented here demonstrate an interesting application of the density mode provided by NeuroBayes. The neural network package has been used extensively, not only for reconstructing masses but also for its preprocessing features, including the input variable ranking routines. The program proved its power by confirming the expected importance of the individual variables with numerical outputs such as the added significance or correlation.

Surprisingly, the simplest network approach resulted in the best mass resolutions. The straightforward one-staged network that directly reconstructs the ττ mass outperformed both the approach of reconstructing corrections for the visible mass and the parametrisation approach in terms of reconstruction performance. This performance is mainly measured by the reconstruction resolution, but the dependency of the reconstructed masses on the true masses also plays a role. Only by training on a mixture of Z and Higgs events, or even on Higgs events only, could the resolution be improved compared to the training on Z events only. Many of the tests discussed yield no significant improvement over the simple approach. This indicates the limitations of the method and suggests that no further significant improvement is possible with comparable methods. To conclude this study, three mass peaks reconstructed with the first network discussed are depicted in Figure 4.14.

The simple approach has a great advantage: since all input variables except for the τ jet mass are not jet-specific, the method is expected to be easily applicable, in principle, to all other ττ decay modes.

The network reconstruction showed its superiority over the SVfit mass. The mass resolution can be improved with the neural network by up to 5 percentage points in the interesting mass range, i.e. from about 23 % for the SVfit mass to about 18 % for the neural network. Below the Z mass and above about 150 GeV/c², the network estimators are characterised by boundary effects that can be avoided by training on a sample with a larger generated mass range, which is mainly necessary for higher Higgs masses.

The use of the information provided by the entire per-event pdf that NeuroBayes outputs has not been studied in detail. In most cases the pdf is characterised by comparatively broad peaks, often even multiple peaks. This makes it impossible to simply sum up all pdfs in order to obtain a mass peak that contains all the information provided by the network, because the mass peak is smeared. Meaningful use of this information requires a sophisticated analysis of the pdf that is beyond the scope of this thesis.

The work presented in this chapter is only a feasibility study. Neither systematic


[Figure 4.14. Horizontal axis: NeuroBayes mass $m_{\tau\tau}^{\mathrm{NB}}$ [GeV/c²]; vertical axis: arbitrary units. Curves: Z → ττ, H → ττ (120 GeV/c²), H → ττ (200 GeV/c²).]

Figure 4.14: Mass peaks for the straightforward one-staged network reconstruction. Three different Monte Carlo samples are shown: a Z sample (red) and two Higgs samples (black) with different Higgs masses $m_H$ = 120 GeV/c² and $m_H$ = 200 GeV/c². The shapes of the reconstructions are well defined, whereas the last sample clearly shows boundary effects.

effects nor the application to real data have been studied. How a possible separation of Higgs and Z events based on this new mass reconstruction method can benefit from the better resolution in comparison to the SVfit mass is the subject of further studies. Moreover, the uncertainty that comes with using the fast detector simulation instead of the full detector simulation remains to be studied, which is difficult because it would require large sets of fully simulated events that are flatly distributed in the invariant ττ mass.

However, in the next chapter another approach for improving the separability of Z and Higgs events is studied. There the SVfit mass is taken as an established method and the impact of additional information from other discriminating variables is investigated.


5 Classification of Higgs Events with Artificial Neural Networks

Besides improving the reconstruction of the invariant τ pair mass, it can be studied whether additional variables are able to contribute to a discrimination between H → ττ events and background events. This chapter addresses the question whether neural network techniques can improve the sensitivity of the multivariate analysis for the Higgs boson search. This study is performed in the scope of the H → ττ → µµ analysis at the CMS experiment. The official analysis is documented in [23, 61].

5.1 Overview Over the Current Analysis Strategy

5.1.1 Final States with Two Muons

The CMS detector is characterised by its ability to clearly identify muons and to precisely measure their momenta. Therefore channels with muons in the final state are favoured for data analyses. Although the branching ratio for τ pairs decaying into two muons with four neutrinos of about 3 % is small in comparison to other final states (see Table 1.4), this channel contributes to the search for the Higgs boson [19].

The H → ττ → µµ signal is contaminated by a huge number of events from other background processes. Z → ττ → µµ and Z → µµ events form the irreducible background, which cannot be eliminated by simple selection criteria. Nevertheless, there are quantities that can be exploited for a discrimination between these processes. H → ττ and Z → ττ events are distinguished by different invariant ττ masses. Although the challenging mass reconstruction is afflicted with large uncertainties (see Chapter 4), the reconstructed ττ mass contributes significantly to the separation of these two event classes. In contrast, Z → µµ background events do not come along with neutrinos in the final state. Therefore the visible mass and the missing transverse energy are expected to discriminate between Z → ττ → µµ and Z → µµ events. Additionally, tauons originating from Z and Higgs decays are highly boosted, which leads to displaced decay vertices for each τ lepton. Thus, muon pairs originating from tauon pairs do not come from the same production vertex, as is the case for muons that originate directly from Z bosons.

Other non-dominant background processes are studied: diboson production, where at least two muons originate from ZZ, WW or WZ decays, tt̄ + jets and W + jets events. Additionally, QCD processes can also result in two muons in the final state.

5.1.2 Data and Monte Carlo Events and Preselection

The analysis is based on the full data set collected with the CMS experiment in proton-proton collisions at a centre-of-mass energy of √s = 7 TeV in 2011. The sample corresponds to an integrated luminosity of L = 4.5 fb⁻¹ [62]. New aspects of the multivariate analysis presented in this chapter focus on simulations. Nevertheless, the data sample is included in every histogram in order to show the agreement between data and simulation samples.

This work concentrates on the search for the neutral MSSM Higgs boson decaying into pairs of τ leptons. With respect to the decay kinematics of the Higgs boson and the separability against background processes there is no difference between the Standard Model Higgs boson and the studied supersymmetric Higgs boson. Therefore it is in principle possible to transfer all results concerning the methods of Higgs event classification to the Standard Model Higgs search. The study has been performed for Higgs mass hypotheses between mH = 90 GeV/c² and 500 GeV/c².

The Higgs signal, diboson and QCD Monte Carlo samples have been simulated with Pythia, whereas all other background samples, i.e. inclusive Z and W production and tt̄ events, have been generated with MadGraph. As these samples belong to official CMS simulations, a full detector simulation has been performed. All τ decays have been modelled using Tauola.

Table 5.1 summarises the preselection with respect to the MSSM Higgs search. Two categories are distinguished in order to increase the sensitivity, one with and the other without a b-tagged jet. These categories try to exploit this information to distinguish the Higgs signal from Z boson production. The new multivariate methods presented in this chapter do not take these categories into account; rather, an inclusive study is performed. Therefore the comparison with current methods refers to the non b-tag category because it covers almost all events. Subsequently a multivariate analysis is performed, which is introduced in the next subsection.

Finally, Table 5.2 gives an overview of the numbers of both simulated and data events which remain after the preselection. For simulated events both the expected event numbers according to an integrated luminosity of 4.5 fb⁻¹ and the total number of simulated events are listed. In particular, the numbers for the Higgs signal events are quite small due to the limited size of the official Monte Carlo simulated samples.


Property                         Selection
single isolation muon trigger    pass, pT > 17 GeV/c (first quarter of dataset)
double muon trigger              pass, pT > 8 GeV/c and pT > 13 GeV/c
charge                           at least one muon pair with opposite charges
leading muon                     pT > 20 GeV/c and |η| < 2.1
subleading muon                  pT > 10 GeV/c and |η| < 2.4
isolation                        certain threshold on ∑pT within cone with ∆R < 0.4 around muon tracks
non b-tag category               at most one jet with pT > 30 GeV/c, no b-tagged jet with pT > 20 GeV/c
b-tag category                   at most one jet with pT > 30 GeV/c, at least one b-tagged jet with pT > 20 GeV/c

Table 5.1: Summary of the preselection of events with double muon final states for the H → ττ → µµ analysis.

5.1.3 Current Multivariate Classification of Higgs Events

The multivariate analysis consists of two steps. Firstly, Higgs signal events are classified based on a likelihood quantity. After applying a cut on the likelihood quantity, mass information about the ττ system is exploited for the calculation of upper limits on the Higgs boson production cross section.

The likelihood quantity is determined according to the formula (3.1). It discriminates between H → ττ signal and the two backgrounds, Z → ττ and Z → µµ. Other background processes are not considered during the construction of the likelihood function. Five discriminating variables are used:

• The ratio of the transverse momentum of the muon pair to the scalar sum of the two muon transverse momenta, $p_T^{\mu\mu}/(p_T^{\mu^+} + p_T^{\mu^-})$

• The pseudorapidity of the muon pair system, $\eta_{\mu\mu}$

• The significance of the distance of closest approach of the muon tracks, $\log_{10}[\mathrm{DCA\ signif}(\mu\mu)]$

• The azimuthal angle between the flight direction of the positively charged muon and the missing transverse energy, $\Delta\phi(\mu^+, \mathrm{MET})$

• A binary variable that contains the validity of the collinear approximation (CA)

The discriminating power and the agreement between data and simulation are illustrated in Figure B.4 for the five variables.
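How a likelihood quantity can be built from marginal distributions may be sketched as follows. The sketch assumes the common projective form, i.e. the product of the signal pdf values divided by the sum of the signal and background products. Formula (3.1) is not reproduced in this chapter, so this form, the function name `projective_likelihood` and the example numbers are illustrative assumptions rather than the implementation of the official analysis.

```python
def projective_likelihood(sig_pdf_values, bkg_pdf_values):
    """Combine per-variable pdf values of one event into a discriminator.

    sig_pdf_values[i] and bkg_pdf_values[i] are the values of the marginal
    signal and background pdfs of variable i, evaluated for this event.
    Correlations between the variables are ignored by construction.
    """
    prod_sig = 1.0
    prod_bkg = 1.0
    for p_sig, p_bkg in zip(sig_pdf_values, bkg_pdf_values):
        prod_sig *= p_sig
        prod_bkg *= p_bkg
    total = prod_sig + prod_bkg
    # If both products vanish the event carries no information.
    return prod_sig / total if total > 0.0 else 0.5


# Hypothetical pdf values for an event with two discriminating variables.
signal_like = projective_likelihood([0.8, 0.5], [0.2, 0.5])
uninformative = projective_likelihood([0.5, 0.5], [0.5, 0.5])
```

Because only marginal pdfs enter the product, any correlation between the input variables is lost, which is why weakly correlated variables are preferred for this kind of discriminator.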

Because these variables are only weakly correlated, they are well suited for a likelihood analysis, which optimises the discrimination between signal and


Process                     Events after preselection
                            expected       simulated
Data                        1,458,488
Z → µµ                      1,433,948      3,773,476
Z → ττ → µµ                 8,528          23,977
tt̄ + jets                   2,338          12,817
Diboson (ZZ, WW, WZ)        1,882          135,296
QCD                         363            19
W + jets                    73             37
H → ττ → µµ
  mH = 120 GeV/c²           1,183          2,241
  mH = 250 GeV/c²           112            4,869
  mH = 500 GeV/c²           4              6,082

Table 5.2: Numbers of events that remain after the preselection for each studied process and for real data. For the Higgs signal, three exemplary mass hypotheses are listed. Both the number of expected events corresponding to an integrated luminosity of 4.5 fb⁻¹ and the total number of simulated events after the preselection are listed.

background for sets of uncorrelated variables. However, the most important separation variable, the mass information, is not used here. In order to avoid higher correlations between the variables and to keep variables that can be used as test statistics for the limit calculation, mass variables are used separately.

Both the visible mass and the SVfit mass contain discriminating power depending on the studied Higgs mass. To account for this fact, two-dimensional mass histograms are used as test statistics for the limit calculation. They are filled with events with a likelihood value above a Higgs mass dependent cut on the likelihood quantity for which the signal significance is optimised according to the formula (3.3).
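The filling of such a two-dimensional mass histogram can be sketched as follows. This is a minimal illustration in plain Python; the event tuples, the bin edges and the cut value are hypothetical, and the function name `fill_mass_histogram` is not taken from the analysis code.

```python
def fill_mass_histogram(events, cut, edges_vis, edges_svfit):
    """Fill a 2D (visible mass, SVfit mass) histogram with weighted events.

    events: iterable of (m_vis, m_svfit, likelihood, weight) tuples.
    Only events with a likelihood value above the cut are filled.
    """
    def find_bin(value, edges):
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                return i
        return None  # value lies outside the histogram range

    n_vis = len(edges_vis) - 1
    n_svfit = len(edges_svfit) - 1
    hist = [[0.0] * n_svfit for _ in range(n_vis)]
    for m_vis, m_svfit, likelihood, weight in events:
        if likelihood <= cut:
            continue  # background-like event, rejected by the cut
        i = find_bin(m_vis, edges_vis)
        j = find_bin(m_svfit, edges_svfit)
        if i is not None and j is not None:
            hist[i][j] += weight
    return hist


# Hypothetical events: (visible mass, SVfit mass, likelihood, weight).
example_events = [
    (45.0, 120.0, 0.9, 1.0),
    (91.0, 95.0, 0.2, 2.0),   # fails the likelihood cut
    (60.0, 150.0, 0.7, 0.5),
]
hist = fill_mass_histogram(
    example_events, cut=0.5,
    edges_vis=[0.0, 50.0, 100.0], edges_svfit=[0.0, 100.0, 200.0],
)
```

In the analysis the cut value depends on the Higgs mass hypothesis, so one histogram of this kind would be filled per hypothesis.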

Based on these histograms, upper limits with 95 % confidence level are set on the cross section of the MSSM Higgs boson production multiplied by the branching fraction for the Higgs boson decaying into τ lepton pairs for tanβ = 30. Based on the background-only hypothesis, the limits are computed with an asymptotic CLs approach as described in [63]. With this official procedure it is possible to compare the sensitivity of arbitrary methods separating signal and background processes. Therefore, in this thesis expected upper limits [64] based on Monte Carlo simulations are considered to quantify a possible improvement of the new neural network classifications.


5.2 Two Subsequent Neural Networks Following the Current Analysis

In the following two sections, multivariate methods based on the neural network package NeuroBayes are presented. All networks are trained on a composition of simulated Monte Carlo samples for all studied processes, unless a training is introduced differently. For the training each event is weighted with an individual event weight describing its reconstruction efficiency and a weight accounting for the cross section of the underlying process. Additionally, the weight of every signal event is increased by a constant factor to ensure a prior signal probability of 50 %, so that NeuroBayes does not learn the relative composition of the sample in terms of signal and background fractions. The relative compositions of the background sample and the signal sample are separately compatible with the actual expectations.
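The constant factor applied to the signal weights can be illustrated with a short sketch. It assumes that the factor is simply the ratio of the summed background weights to the summed signal weights; the function name `signal_scale_factor` and the example weights are hypothetical and not taken from the NeuroBayes interface.

```python
def signal_scale_factor(signal_weights, background_weights):
    """Return the constant factor for all signal weights that equalises the
    weighted signal and background sums (prior signal probability of 50 %)."""
    sum_sig = sum(signal_weights)
    sum_bkg = sum(background_weights)
    if sum_sig <= 0.0:
        raise ValueError("signal weights must have a positive sum")
    return sum_bkg / sum_sig


# Hypothetical per-event weights (efficiency weight times cross-section weight).
sig_w = [0.5, 0.25, 0.25]
bkg_w = [10.0, 20.0, 30.0, 40.0]

k = signal_scale_factor(sig_w, bkg_w)
scaled_sig_w = [k * w for w in sig_w]
prior = sum(scaled_sig_w) / (sum(scaled_sig_w) + sum(bkg_w))
```

Since every signal event is multiplied by the same factor, the relative composition within the signal sample is left unchanged, as required in the text.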

After each training, the network output $t_{\mathrm{flat}}$ according to the formula (3.2) is non-linearly transformed into a discriminator $t_{\mathrm{corr}}$ that gives the signal probability conserving the correct prior signal expectation [65]:

$$ t_{\mathrm{corr}} = \left[ 1 + \left( \frac{1}{t_{\mathrm{flat}}} - 1 \right) \cdot \frac{N_{\mathrm{bkg}}}{N_{\mathrm{sig}}} \right]^{-1} \qquad (5.1) $$

The expected numbers of signal and background events are denoted by $N_{\mathrm{sig}}$ and $N_{\mathrm{bkg}}$.
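Equation (5.1) can be written down directly as a small function. The following is a sketch for illustration only; the function name `prior_correct` and the example yields are hypothetical.

```python
def prior_correct(t_flat, n_sig, n_bkg):
    """Apply equation (5.1) to the output t_flat of a network trained with a
    prior signal probability of 50 %, given the expected yields."""
    if not 0.0 < t_flat <= 1.0:
        raise ValueError("t_flat must lie in the interval (0, 1]")
    return 1.0 / (1.0 + (1.0 / t_flat - 1.0) * n_bkg / n_sig)


# With equal expected yields the output is unchanged.
unchanged = prior_correct(0.3, n_sig=5.0, n_bkg=5.0)

# With a hypothetical signal-to-background ratio of 1:99, an output of 0.5
# corresponds to a signal probability of only 1 %.
rare_signal = prior_correct(0.5, n_sig=1.0, n_bkg=99.0)
```

The second example shows why the corrected discriminator distributions are concentrated at small values whenever the expected signal yield is much smaller than the background yield.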

The first neural network approach introduced in this section concentrates more on reproducing the current likelihood-based procedure than on improving its results. Therefore a similar twofold way of discriminating between Higgs signal and other backgrounds is chosen.

5.2.1 First Network Equivalent to the One-dimensional Likelihood Analysis

A first network is trained on the five discriminating variables the likelihood quantity is based on to classify H → ττ final states. Therefore this network training is referred to as the method equivalent to the one-dimensional likelihood method. A list of these variables including the ranking by NeuroBayes can be taken from Table B.1. In contrast to the likelihood method, here variables that are symmetric with respect to a certain value, e.g. pseudorapidity quantities such as ηµµ, are used in a twofold way: the modulus and the sign of the variables with respect to the central value are used independently as input variables to enable the network to focus on relevant information provided by these variables.
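The twofold use of a symmetric variable can be sketched as a simple preprocessing step. The helper `split_symmetric` is hypothetical, and the central value of 91.0 in the second example is only a round placeholder for the nominal Z mass.

```python
import math


def split_symmetric(value, centre=0.0):
    """Return (modulus, sign) of a variable relative to its symmetry centre.

    The sign is +1.0 or -1.0, and 0.0 if the value lies exactly on the centre.
    """
    shifted = value - centre
    sign = math.copysign(1.0, shifted) if shifted != 0.0 else 0.0
    return abs(shifted), sign


# Pseudorapidity of the dimuon system, symmetric around zero.
eta_modulus, eta_sign = split_symmetric(-1.5)

# A mass relative to a central value (placeholder value, see lead-in).
mass_modulus, mass_sign = split_symmetric(95.0, centre=91.0)
```

Both outputs are then passed to the network as separate input variables.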

Unlike the likelihood quantity that is determined only based on Higgs and Z boson decay channels, the network is trained on every process mentioned above. For the signal sample a composition of all available Higgs mass hypotheses with mH = 90, . . . , 500 GeV/c² is used.


It has to be noted that a correct evaluation of the network output requires a Higgs mass dependent prior correction, since the expected number of Higgs signal events Nsig is also a function of the Higgs boson mass. In this step, the network output is just corrected with respect to the inclusive number of Higgs events for all studied Higgs mass hypotheses together because it is not interpreted as a signal probability.

Figure 5.1 shows the distributions of the prior-corrected network outputs for all simulated processes as well as for data. Signal efficiency and the background rejection

[Figure 5.1 (plots): events per bin on a logarithmic scale as a function of the likelihood L for H → ττ vs. others (left, range 0 to 1) and of the network discriminator H → ττ vs. others (right, range 0 to 0.7). Shown are H → ττ, data and the stacked contributions W + jets, QCD, diboson, tt̄ + jets, Z → ττ and Z → µµ, each with a ratio panel (range 0.8 to 1.2).]

Figure 5.1: Discriminator distributions for the one-dimensional likelihood quantity (left) and the equivalent network output (right). The Higgs signal distribution for a Higgs mass hypothesis of mH = 120 GeV/c² is shown in front of the stacked histograms based on the Monte Carlo samples for the various studied channels. The ratio plot shows the agreement between data and the sum of all backgrounds.

values are then calculated as described in Section 3.1.2 by scanning the threshold for the selection cut value based on these discriminator distributions. Figure 5.2 depicts a comparison of the performance of the two methods expressed in terms of the background rejection as a function of the signal efficiency. The figure allows one to define a working point by choosing a cut threshold with a certain signal efficiency and the corresponding background rejection. On average, the number of background events which are rejected by cutting on the discriminator can be doubled for arbitrary signal efficiencies if one takes the network discriminator instead of the likelihood quantity.
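The threshold scan can be sketched as follows. This is a simplified, unweighted illustration of the idea and not the procedure of Section 3.1.2 itself, which is not reproduced here; the discriminator values are hypothetical and event weights are ignored.

```python
def roc_points(sig_scores, bkg_scores, thresholds):
    """Scan cut thresholds on a discriminator.

    For each threshold t, events with a score >= t are selected. Returns a
    list of (signal efficiency, background rejection) pairs.
    """
    points = []
    for t in thresholds:
        eff_sig = sum(s >= t for s in sig_scores) / len(sig_scores)
        eff_bkg = sum(b >= t for b in bkg_scores) / len(bkg_scores)
        points.append((eff_sig, 1.0 - eff_bkg))
    return points


# Hypothetical discriminator values of signal and background events.
sig = [0.9, 0.8, 0.6, 0.3]
bkg = [0.7, 0.4, 0.2, 0.1]
curve = roc_points(sig, bkg, thresholds=[0.0, 0.5, 1.0])
```

Each pair in `curve` corresponds to one possible working point; in a weighted analysis the event counts would be replaced by sums of event weights.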

The better performance of the neural network method is explained by the fact that the neural network is able to exploit non-linear correlations to some extent. The small but existing linear correlations between the input variables can also be treated better by the network. Since the likelihood method is based on the marginal distributions of the discriminating variables, it does not take into account any correlations between variables. In particular, the correlation between the dimuon pT ratio, $p_T^{\mu\mu}/(p_T^{\mu^+} + p_T^{\mu^-})$, and the pseudorapidity of the dimuon system, $|\eta_{\mu\mu}|$, cannot be neglected according to Table B.1. If the absolute value of the $\eta_{\mu\mu}$ quantities is not


[Figure 5.2 (plot): background rejection 1 − ε_bkg (range 0.85 to 1.00) as a function of the signal efficiency ε_sig (range 0 to 1) for the NeuroBayes and the likelihood method.]

Figure 5.2: Comparison of the performance of the likelihood quantity and the equivalent network method in terms of the background rejection as a function of the signal efficiency, where the signal sample is formed by a 120 GeV/c² Higgs sample.

taken, the linear correlation nearly disappears, but a non-linear correlation between the two variables remains, which degrades the likelihood method.

Although at least 85 % of all background events can be rejected for the network-based method by choosing a cut threshold in a way that all bins but the lowest one in Figure 5.1 are considered, according to Table 5.2 the number of background events remains large. For the likelihood quantity the effect is even worse. Therefore it is necessary to perform a subsequent analysis.

5.2.2 Second Network Equivalent to the Two-dimensional Mass Analysis

In turn, a second network referred to as the masses network is trained on the visible mass mµµ and the SVfit mass mττ to classify Higgs events. This training imitates the current two-dimensional mass analysis. For the current limit calculation based on the two-dimensional mass analysis only signal-like events selected above a determined Higgs mass dependent cut threshold on the likelihood quantity are exploited. Since neural network methods are more easily extendable to multiple input variables than likelihood methods based on multidimensional pdfs, the output of the likelihood equivalent network is used as an additional network input instead of cutting on this quantity.

As for the current analysis, the network analysis is performed for each Higgs mass hypothesis separately. Within this chapter, the focus will be on a low Higgs mass hypothesis of mH = 120 GeV/c². Additional information about both a medium Higgs mass hypothesis of 250 GeV/c² and a high Higgs mass hypothesis of 500 GeV/c² can be inferred from Appendix B.

Figure 5.3 shows the distribution of the prior-corrected network output for a low Higgs mass hypothesis as well as the performance of the discriminator in terms of signal efficiencies and background rejections. It is clearly visible that the separation

[Figure 5.3 (plots): left, events per bin on a logarithmic scale as a function of the network discriminator H → ττ vs. others (range 0 to 0.30) for H → ττ, data and the stacked contributions W + jets, QCD, diboson, tt̄ + jets, Z → ττ and Z → µµ, with a ratio panel (range 0.8 to 1.2); right, background rejection 1 − ε_bkg (range 0.95 to 1.00) as a function of the signal efficiency ε_sig for mH = 120, 160, 250 and 350 GeV/c².]

Figure 5.3: Network discriminator distribution for the masses training for a Higgs mass hypothesis of mH = 120 GeV/c² (left) and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses (right). The network outputs for higher Higgs mass hypotheses are shown in Figure B.1.

power increases with the mass of the studied Higgs boson. For a low Higgs mass hypothesis, Higgs and Z events can hardly be separated by mass variables, whereas for higher Higgs masses the separation power increases as expected according to the explanations in Chapter 4.

This fact is also reflected in the variable rankings of NeuroBayes as shown in Tables B.2, B.3 and B.4 for trainings with respect to different Higgs masses. For Higgs masses up to 250 GeV/c² the visible mass of the dimuon system is the most important variable. From 300 GeV/c² on, the SVfit mass supersedes the visible mass in terms of the impact on the training performance. The same exchange in the importance of the two mass variables is visible when the expected limits are directly calculated based on single mass variables as done in [66]. The relative importance of the SVfit mass steadily increases with the Higgs mass since it is the variable that separates Higgs from Z events decaying into τ lepton pairs, as illustrated in Figure B.5.

This second network taking into account mass variables that provide strong separation power is capable of improving the background rejection in comparison to the training equivalent to the one-dimensional likelihood method as shown in Figure 5.2. The SVfit mass distinguishes between H → ττ and Z → ττ events, whereas the dimuon mass separates them from Z → µµ events that yield visible masses at the nominal Z mass since no neutrinos are involved in the decay.

The official procedure of CMS Higgs analyses to compare the separation power of any multivariate analysis is to calculate expected upper limits in the absence of a Higgs signal with a certain confidence level, because the expected upper limit is a direct measure of the sensitivity of an analysis. Figure 5.4 shows the expected upper limits on the product of the cross section for the MSSM Higgs boson production and the ττ decay branching ratio for tanβ = 30. The mainstream analysis, i.e. the MSSM Higgs search in the no b-tag

[Figure 5.4 (plot): 95 % CL upper limit on σ·BR in pb (logarithmic scale) as a function of the Higgs mass mH in GeV/c² (range 100 to 500) for the curves "2D masses, L cut", "NB, traditional" and "Mainstream analysis".]

Figure 5.4: Expected upper limits on the product of the cross section for the MSSM Higgs boson production and the ττ decay branching ratio for tanβ = 30 for the subsequent multivariate analysis consisting of the one-dimensional likelihood analysis with five input variables and the analysis of the visible mass and the SVfit mass [54]. The mainstream analysis refers to the official analysis for the MSSM Higgs search in the no b-tag category evaluated without a study of systematic effects.

category without a study of systematic effects, is compared with the performance of the two-dimensional mass analysis and that of the equivalent network method that uses the same input variables. The good agreement between the mainstream analysis and the inclusive two-dimensional mass analysis proves the consistency of the presented network methods with the current official methods.

In comparison with these two-dimensional methods the network-based approach yields no overall improvement. Although the likelihood-equivalent method yields a better separation, this improvement is compensated by the performance of the two-dimensional mass analysis. Being based on two-dimensional mass distributions, the latter takes into account the full correlation between the visible mass and the SVfit mass. According to the Neyman-Pearson lemma [49], this is the best that can be done.

Nevertheless, there is an advantage of the methods using neural networks. Since they do not use any cut on a discriminator, the full dataset is available for the limit calculation, resulting in higher statistical precision. Moreover, the network training can easily be extended with additional variables improving the separation. Likelihood methods suffer from correlations between the variables. Before new variables can be added to the set of input variables of an existing likelihood quantity, their correlations to the used variables have to be examined. Therefore, variables cannot be easily added because in most cases they are correlated to other already used variables. For methods such as the limit calculation based on the two-dimensional mass distributions, it is also almost impossible to add new discriminating variables due to limited numbers of available events and resulting statistical problems.

5.3 New Two-staged Network Approach

The previous section has shown that neural network methods can reach the separation power of the current analysis for the same list of discriminating input variables. In this section two points are modified.

The first one concerns the comprehension of the functions the networks have to fulfil. Two main background processes have to be rejected. There are variables that are specialised on discriminating against Z → µµ events and others that discriminate against Z → ττ events. So, two networks with individual purposes are trained and evaluated.

The second point is to include additional variables to improve the separation power. Neural networks are not restricted to uncorrelated input variables the way likelihood methods are. Therefore it is easy to add arbitrary new variables to the training. The following additional variables have been tested:

• The visible dimuon mass with respect to the nominal Z mass, $m_{\mu\mu} - m_Z$

• The missing transverse energy $E_T^{\mathrm{miss}}$

• The reconstructed pseudorapidity of the ττ system $\eta_{\tau\tau}$, which is an additional result of the SVfit algorithm

• The angle $\omega^*$ between the flight direction of the positively charged muon and the production plane which is spanned by the three-momentum vector of the dimuon system and the beam axis [67]

• The angle $\theta^*$ between the three-momentum vectors of the positively charged muon and the dimuon system [67]

• The discriminator $P_\zeta - 1.85 \cdot P_\zeta^{\mathrm{vis}}$, which takes into account the visible and missing transverse momenta projected on the direction ζ of the visible ττ decay products perpendicular to the beam axis [68].
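The last variable of the list can be illustrated with a short sketch. It assumes the commonly used definition in which ζ is the bisector of the transverse directions of the two visible decay products, $P_\zeta^{\mathrm{vis}}$ is the projection of the visible transverse momentum sum onto ζ, and $P_\zeta$ additionally includes the missing transverse momentum. The exact definition is given in [68] and is not reproduced here, so the function `pzeta_discriminator` and the input vectors are illustrative assumptions.

```python
import math


def pzeta_discriminator(mu1, mu2, met, factor=1.85):
    """Return P_zeta - factor * P_zeta^vis.

    mu1, mu2, met: (px, py) transverse momentum vectors of the two muons and
    of the missing transverse momentum.
    """
    def unit(vec):
        norm = math.hypot(vec[0], vec[1])
        if norm == 0.0:
            raise ValueError("direction is undefined for a zero vector")
        return (vec[0] / norm, vec[1] / norm)

    u1, u2 = unit(mu1), unit(mu2)
    # Assumed zeta axis: bisector of the two visible transverse directions
    # (undefined for exactly back-to-back muons).
    zeta = unit((u1[0] + u2[0], u1[1] + u2[1]))
    vis = (mu1[0] + mu2[0], mu1[1] + mu2[1])
    p_zeta_vis = vis[0] * zeta[0] + vis[1] * zeta[1]
    p_zeta = p_zeta_vis + met[0] * zeta[0] + met[1] * zeta[1]
    return p_zeta - factor * p_zeta_vis


# Hypothetical transverse momenta in GeV/c.
value = pzeta_discriminator(mu1=(30.0, 0.0), mu2=(0.0, 40.0), met=(10.0, 10.0))
```

In this picture, events in which the missing momentum points along the visible decay products yield larger values than events in which it points away from them.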

5.3.1 Classification of ττ Final States

The purpose of the first-stage network is to classify events with two τ leptons in the final state, namely Z → ττ and H → ττ events, by distinguishing them mainly from the huge Z → µµ background. The network is trained on all Higgs samples with all mass hypotheses at once.

Table 5.3 shows a list of the five most important input variables. The NeuroBayes ranking clearly identifies the visible mass (transformed with respect to the nominal Z mass) as by far the most important variable because the dominant Z → µµ background events produce dimuon masses that are approximately equal to the nominal Z mass. The second most important variable is the information about the distance of closest


Variable                                       Ctarget [%]   Cadded [%]   Closs [%]   Cothers [%]
$|m_{\mu\mu} - m_Z|$                           83.1          83.1         39.0        82.0
$\log_{10}[\mathrm{DCA\ signif}(\mu\mu)]$      42.3          13.7         12.1        37.0
$E_T^{\mathrm{miss}}$                          28.5          7.8          6.2         44.7
$|\cos\omega^*|$                               28.3          7.4          6.4         42.7
$\mathrm{sign}(m_{\mu\mu} - m_Z)$              45.9          5.7          5.2         50.4

Table 5.3: Ranking of the most important input variables used for the network classifying ττ final states. Four different correlations as defined in Section 3.3 determine the quality of each variable. The list is ordered by the NeuroBayes ranking that mainly follows the added correlation Cadded. A full list of all variables used is given in Table B.5.

approach of the two muon tracks. As expected and shown in Figure B.4, the muons originating directly from Z decay can be separated from ones that originate from secondary vertices, i.e. the τ decay vertices. Moreover, the missing transverse energy, pointing at neutrinos in the final state, and the angle ω∗ provide significant separation power.

Figure 5.5 illustrates the network output and the performance of the classification. For simplicity, only the Z → ττ signal events are taken into account for the

[Figure 5.5 (plots): left, events per bin on a logarithmic scale as a function of the network discriminator Z/H → ττ vs. others (range 0 to 0.8) for H → ττ, data and the stacked contributions W + jets, QCD, diboson, tt̄ + jets, Z → ττ and Z → µµ, with a ratio panel (range 0.8 to 1.2); right, background rejection 1 − ε_bkg (range 0.92 to 1.00) as a function of the signal efficiency ε_sig for mH = 120, 160, 250 and 350 GeV/c².]

Figure 5.5: Network discriminator distribution for the training intended for classifying ττ final states (left) and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses (right). As an example, the discriminator distribution includes the distribution for a low Higgs mass hypothesis of mH = 120 GeV/c².

prior-correction of the network output. The contribution of Higgs events to the signal is small and even decreases with larger Higgs mass hypotheses. So, the performance is nearly independent of the Higgs mass hypothesis because the sample is dominated by Z → ττ events. Therefore it is not necessary to focus on subsets of the combined Higgs samples or single Higgs mass hypotheses.

More than 92 % of all background events can be rejected by choosing a cut threshold in a way that all bins but the lowest one in the left plot in Figure 5.5 are considered, whereas almost all signal events remain in the selected sample. It has to be studied whether the following multivariate analysis can benefit from such a reduction of the background sample. For the analysis presented in the following, no cut is applied on the shown discriminator, so that expected upper limits are calculated on the entire sample with the highest statistical precision possible.

5.3.2 Discriminating H → ττ Events Against Z → ττ Events

The previous network is capable of classifying final states which are characterised by a τ lepton pair, mainly discriminating them against the dominant Z → µµ background. In the second stage Z → ττ events have to be separated from H → ττ events. For each Higgs mass hypothesis a separate network has been trained to specialise the network on classifying Higgs events with a certain mass.

This specialisation manifests itself also in the ranking of the input variables resulting from the NeuroBayes training procedures. Tables 5.4, 5.5 and 5.6 present the lists of the most important input variables exceeding the threshold Cadded > 5 % for a low, a medium and a high Higgs mass hypothesis.

Variable                                           Ctarget [%]   Cadded [%]   Closs [%]   Cothers [%]
$m_{\tau\tau}$                                     28.4          28.4         7.2         81.0
$|\eta_{\tau\tau}|$                                24.7          24.5         18.4        46.2
$|m_{\mu\mu} - m_Z|$                               27.2          11.1         12.4        76.5
$p_T^{\mu\mu}/(p_T^{\mu^+} + p_T^{\mu^-})$         14.7          11.0         3.4         72.5
$\log_{10}[\mathrm{DCA\ signif}(\mu\mu)]$          4.2           6.6          6.8         10.5
$|\Delta\phi(\mu^+, \mathrm{MET}) - \pi/2|$        7.7           5.5          4.5         27.7

Table 5.4: Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events for a low Higgs mass hypothesis of mH = 120 GeV/c². Four different correlations as defined in Section 3.3 determine the quality of each variable. The list is ordered by the NeuroBayes ranking that mainly follows the added correlation Cadded. A full list of all variables used is given in Table B.6.

Variable                                       Ctarget [%]   Cadded [%]   Closs [%]   Cothers [%]
$m_{\tau\tau}$                                 82.8          82.8         33.6        88.0
$\log_{10}[\mathrm{DCA\ signif}(\mu\mu)]$      19.0          7.0          7.1         16.2
$|\eta_{\tau\tau}|$                            18.2          6.3          4.7         44.5

Table 5.5: Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events for a medium Higgs mass hypothesis of mH = 250 GeV/c². A full list of all variables used is given in Table B.7.


Variable                            C_target [%]  C_added [%]  C_loss [%]  C_others [%]
m_ττ                                93.0          93.0         21.2        94.5
sign(m_μμ − m_Z)                    88.4          10.2         9.9         91.0

Table 5.6: Ranking of the most important input variables used by the network discriminating H → ττ events against Z → ττ events for a high Higgs mass hypothesis of m_H = 500 GeV/c². A full list of all used variables is provided in Table B.8.

Although the SVfit mass m_ττ remains the most important variable for all Higgs mass hypotheses, its relative impact on the training performance increases steadily with the Higgs mass. Only for Higgs masses below 120 GeV/c² does the reconstructed pseudorapidity of the ττ system provide more discriminating power, since the masses of the Higgs and the Z boson do not differ much. As already found in the training equivalent to the two-dimensional mass analysis, the trainings for low Higgs masses also need the visible mass for the separation, but for higher masses it is only necessary to know whether the visible mass lies above or below the nominal Z mass.

The ranking discloses that for the difficult Higgs-Z separation in the low mass region the information from many discriminating variables is needed, since none of them separates signal from background very efficiently, whereas for high Higgs masses the reconstructed ττ mass yields significant information for a good separation. Therefore a better mass reconstruction resolution, especially for low Higgs masses, would improve the separation of Higgs and Z events.

Figure 5.6 illustrates the network output, as an example, after a training for a low Higgs mass hypothesis of m_H = 120 GeV/c², and the performance of the classification for trainings with respect to different masses. For the purpose of evaluating the Higgs-Z separation, the performance plot showing the background rejection as a function of the signal efficiency is based only on the two studied processes and does not take other background processes into account.
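The performance curve described above can be sketched in a few lines. This is a minimal illustration and not the thesis code: the function name `rejection_vs_efficiency`, the toy scores and the convention that an event passes a cut when its discriminator value is at or above the threshold are assumptions made for this example.

```python
def rejection_vs_efficiency(sig_scores, bkg_scores, thresholds):
    """Return (signal efficiency, background rejection) for each cut value.

    An event passes a cut t if its discriminator value is >= t
    (assumed convention: larger values are more signal-like).
    """
    points = []
    for t in thresholds:
        eff_sig = sum(s >= t for s in sig_scores) / len(sig_scores)
        eff_bkg = sum(b >= t for b in bkg_scores) / len(bkg_scores)
        points.append((eff_sig, 1.0 - eff_bkg))
    return points


# Toy discriminator values (hypothetical, for illustration only).
signal = [0.9, 0.8, 0.6, 0.4]      # stands in for H -> tautau events
background = [0.7, 0.3, 0.2, 0.1]  # stands in for Z -> tautau events

curve = rejection_vs_efficiency(signal, background, [0.0, 0.5, 1.0])
```

Each entry of `curve` is one point of a plot of background rejection 1 − ε_bkg against signal efficiency ε_sig; only the two processes passed in enter the curve, mirroring the restriction to H → ττ and Z → ττ events.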

Due to the small size of the training sample, the output of the neural network results in a spiky distribution, especially for the low Higgs mass region. The effect can be avoided by using more simulated events, in particular for the signal process. For higher Higgs mass hypotheses more simulated events are available and therefore the distributions become smoother, see Figure B.2.

The separation power of the networks increases with the Higgs mass hypothesis, as expected. The small background rejection values in comparison with the previously discussed neural networks indicate the challenge of the Higgs-Z separation problem. It is indispensable to pay closer attention to this separation problem, as done with the presented networks. When a network tries to discriminate the Higgs signal against both the Z → ττ and the Z → µµ background at once, as done in the previous network approach following the current analysis, the problem of separating H → ττ and Z → ττ events is masked by the necessary discrimination against the huge Z → µµ background.


[Figure 5.6, left panel: events per bin (logarithmic scale) versus the discriminator H → ττ vs. Z → ττ for data and the processes H → ττ, W+Jets, QCD, Diboson, tt+Jets, Z → ττ and Z → µµ, with a ratio panel below. Right panel: background rejection 1 − ε_bkg versus signal efficiency ε_sig for m_H = 120, 160, 250 and 350 GeV/c².]

Figure 5.6: Network discriminator distribution for the Higgs-Z separation training for a Higgs mass hypothesis of m_H = 120 GeV/c² (left) and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses (right). Although the neural network has only been trained on H → ττ and Z → ττ events, the network output is shown for all studied processes. The performance plot (right) is based only on the two studied processes. The network outputs for higher Higgs mass hypotheses are shown in Figure B.2.

5.3.3 Performance of the Combined Discriminator

In the previous sections two discriminators have been introduced, each for a specific separation purpose. It has been shown that the discriminating variables used in the official multivariate analysis fulfil different tasks. Some are appropriate for discriminating against Z → µµ events, such as the visible mass or the distance of closest approach of the muon tracks, and others separate H → ττ events from Z → ττ events, such as the SVfit mass and the dimuon ratio of transverse momenta.

After partitioning the problem into two well-defined tasks, the resulting discriminators have to be combined efficiently, preserving a possible gain introduced by a more meaningful employment of the discriminating variables. Two ways are possible.

Firstly, a third stage of neural networks can be trained on all studied processes at once, based on the two discriminators, in order to classify Higgs events. The problem of these networks is that they automatically optimise the separation from the most dominant Z → µµ background. Therefore the power of the Higgs-Z discriminator is not fully exploited, and the achieved results are merely comparable to those of the networks equivalent to the current analysis. Thus, this approach is not pursued further.

A far more promising approach is the simple multiplication of the two discriminators. The first network, intended for the classification of ττ final states, results in a discriminator $P(\tau\tau \mid x_{\tau\tau})$ giving the probability of obtaining a ττ final state given a certain set of input variables $x_{\tau\tau}$, whereas the second network, used to separate H → ττ events from Z → ττ events, yields the Higgs signal probability $P(H \mid \tau\tau, x_{H/Z})$ given that the event is characterised by a final state with two tauons and given a certain set of input variables $x_{H/Z}$. By multiplying these two probabilities one gets the probability

$$P(H \to \tau\tau \mid x_{\tau\tau}, x_{H/Z}) = P(\tau\tau \mid x_{\tau\tau}) \cdot P(H \mid \tau\tau, x_{H/Z})$$

of obtaining an H → ττ event given the observed realisations of the input variables. In the following, this quantity is studied as the final discriminator classifying the sought Higgs events.
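The multiplication of the two stage outputs can be written down directly. The following is a minimal sketch under stated assumptions: the function name `combined_discriminator` and the example values are hypothetical, and both inputs are assumed to be already expressed as probabilities in [0, 1] (how the network outputs are mapped onto that range is not covered here).

```python
def combined_discriminator(p_tautau, p_higgs_given_tautau):
    """Multiply the two stage outputs into P(H -> tautau | x_tautau, x_H/Z).

    Both arguments are assumed to already be probabilities in [0, 1].
    """
    for p in (p_tautau, p_higgs_given_tautau):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return p_tautau * p_higgs_given_tautau


# Hypothetical per-event outputs (stage 1, stage 2) of the two networks.
events = [(0.5, 0.5), (1.0, 0.25), (0.0, 0.75)]
scores = [combined_discriminator(p1, p2) for p1, p2 in events]
```

An event that the first stage assigns no probability of being a ττ final state (the last toy event) receives a combined value of zero regardless of the second stage, which is the behaviour expected from a conditional probability.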

Figure 5.7 shows the distribution of the discriminator for a low Higgs mass hypothesis and its performance for multiple masses. As for the previous network outputs, the ratio plot comparing data with the simulation looks problematic here, although the error bars indicate a good agreement within the statistical uncertainties. The problem is the strongly decreasing number of events towards higher discriminator values. This is expected, however, as the discriminator equals the Higgs signal probability for every event. At this stage a first solution would be to reduce the number of bins, since the discriminator resolution in the region of high values might not be sufficient for the chosen fine binning. Since most of the separation power is achieved at higher discriminator values, this approach is not pursued. The problem is taken up again more quantitatively below.

[Figure 5.7, left panel: events per bin (logarithmic scale) versus the discriminator H → ττ vs. others for data and the processes H → ττ, W+Jets, QCD, Diboson, tt+Jets, Z → ττ and Z → µµ, with a ratio panel below. Right panel: background rejection 1 − ε_bkg versus signal efficiency ε_sig for m_H = 120, 160, 250 and 350 GeV/c².]

Figure 5.7: Distribution of the combined discriminator for classifying Higgs events for a Higgs mass hypothesis of m_H = 120 GeV/c² (left) and the background rejection as a function of the signal efficiency for four Higgs mass hypotheses (right). The network outputs for higher Higgs mass hypotheses are shown in Figure B.3.

The performance plot showing the background rejection as a function of the signal efficiency first has to be compared with the performance of the ττ classification network as depicted in Figure 5.5. It is noticeable that for a low Higgs mass hypothesis of m_H = 120 GeV/c² there is almost no improvement, which is compatible with the performance measured for the Higgs-Z separation network shown in Figure 5.6. For higher Higgs masses there is a significant improvement as the classification of H → ττ events gets better. Because the performance of the ττ classification network is nearly independent of the Higgs mass, the improvement visible for higher Higgs masses can be attributed to a better rejection of Z → ττ events. This justifies the two-staged approach. For low Higgs masses there is an improvement as well, which is more clearly visible in the quantities discussed below.

Before quantifying the improvements in terms of expected upper limits, Figure 5.8 compares the performance of the two-staged approach with the traditional network approach in terms of the signal and background efficiencies. For cuts resulting in the same signal efficiency for both discriminators, the number of wrongly selected background events can be reduced by up to one third by replacing the traditional approach with the two-staged approach. This clearly discloses that the Z → ττ background is not treated very well by simply trying to separate H → ττ events from all other backgrounds, due to the large Z → µµ background. It is necessary to assign a dedicated network to the treatment of the Z → ττ background, as done in the two-staged approach.

[Figure 5.8: background rejection 1 − ε_bkg versus signal efficiency ε_sig for the curves "NB, two-staged" and "NB, traditional".]

Figure 5.8: Comparison of the performance of the traditional network approach (see Section 5.2.2) with the new two-staged network approach presented in this section. The background rejection is shown as a function of the signal efficiency, where the signal sample is formed by a 120 GeV/c² Higgs sample.

Finally, the separation power has to be expressed officially in terms of expected upper limits on the production cross section of the studied MSSM Higgs boson (for tan β = 30) times the branching ratio for its decay into pairs of τ leptons. Figure 5.9 compares the limits for the traditional analysis, based on the two-dimensional mass distributions after a cut on the one-dimensional likelihood quantity, with the limits achieved by the new method introduced in this section. The right plot indicates that the expected upper limits can be improved by about 40 % on average with the new two-staged approach. The decrease in the limits implies an improvement in the sensitivity of the search for the Higgs boson in the studied decay mode. In the absence of a Higgs signal, as assumed for the limit calculation, smaller limits demonstrate a better exclusion power, since also smaller signal event numbers can be faked by fluctuations of background processes.

[Figure 5.9, left panel: 95% CL upper limit on σ·BR [pb] (logarithmic scale) versus Higgs mass m_H [GeV/c²] for the curves "2D masses, L cut" and "NB, two-staged". Right panel: relative improvement versus Higgs mass m_H [GeV/c²].]

Figure 5.9: Comparison of the expected upper limits on the product of the cross section for the MSSM Higgs boson production and the ττ decay branching ratio for tan β = 30 for the traditional network approach (see Section 5.2.2) with the ones resulting from the new two-staged network approach presented in this section [54]. Whereas the left plot shows the expected limits, the right plot depicts the relative improvement of about 40 % on average.

The separation power of the analysis is improved methodically, since the improvement reveals no systematic dependency on the Higgs mass hypothesis. It has to be concluded that the particular attention paid to the problem of discriminating H → ττ events against Z → ττ events is the key to success. Additionally, the combination of the individual network outputs is a crucial point. By simply multiplying them to yield a conditional probability, statistical uncertainties are avoided that would have been introduced by any statistical method such as an additional network.
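The relative improvement quoted above can be made concrete with a small sketch. It assumes that the improvement is defined as the fractional decrease of the expected upper limit with respect to the traditional analysis; this definition, the function name `relative_improvement` and the numbers are assumptions for illustration and are not read off Figure 5.9.

```python
def relative_improvement(limit_traditional, limit_new):
    """Fractional decrease of the expected upper limit (assumed definition)."""
    if limit_traditional <= 0:
        raise ValueError("the reference limit must be positive")
    return (limit_traditional - limit_new) / limit_traditional


# Hypothetical limits in pb: a drop from 10 pb to 6 pb is a 40 % improvement.
improvement = relative_improvement(10.0, 6.0)
```

With this convention a value of 0 means no change, and positive values mean that the new method excludes smaller cross sections.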

It was already mentioned that mainly the region of higher discriminator values contributes to the separation of signal and background events. Additionally, Figure 5.10 shows that the binning of the discriminator distributions has a strong impact on the expected limits that are calculated from these distributions. For lower Higgs mass hypotheses it seems to be better to use more bins, whereas for higher Higgs masses the better solution may be achieved with fewer bins. When the number of bins is increased, and even with 100 bins, the region of higher discriminator values is populated by only very few events, resulting in a low statistical precision. In order to be independent of this effect, an unbinned fit of the distributions may be studied. The usage of the discriminators is the subject of further studies beyond the scope of this thesis.
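The interplay between bin number and bin population can be illustrated with a toy histogram. This is only a sketch: the helper names `histogram` and `count_sparse_bins`, the threshold of five events and the toy values are assumptions, and the actual limit calculation is not reproduced.

```python
def histogram(values, n_bins, lo=0.0, hi=1.0):
    """Fill equal-width bins on [lo, hi]; hi itself goes into the last bin."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def count_sparse_bins(counts, min_events=5):
    """Number of bins with fewer than `min_events` entries (empty ones included)."""
    return sum(1 for c in counts if c < min_events)


# Toy discriminator values falling steeply towards high values (hypothetical).
values = [0.05] * 50 + [0.15] * 20 + [0.35] * 6 + [0.75] * 2

coarse = histogram(values, n_bins=2)
fine = histogram(values, n_bins=10)
```

With two bins only one bin is sparsely populated, whereas with ten bins seven of them hold fewer than five events: the finer binning resolves the high-value region but at the price of low statistical precision per bin.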


[Figure 5.10: 95% CL upper limit on σ·BR [pb] (logarithmic scale) versus Higgs mass m_H [GeV/c²] for the curves "NB, N_bins = 100", "NB, N_bins = 300" and "NB, N_bins = 500".]

Figure 5.10: Impact of binning of the test statistics on the expected upper limits [54].

5.4 Conclusion and Outlook

The multivariate methods presented in this chapter are studied in the scope of the current search for the Higgs boson in the H → ττ → µµ channel at the CMS experiment. Whereas the current multivariate analysis uses likelihood methods, new strategies based on neural networks have been developed. Two main questions have been answered.

The first question focused on replacing the current methods with neural networks using the same set of input variables. The separation power of the one-dimensional likelihood quantity based on five input variables is outperformed by an equivalent network, as the network discriminator enables cut-based selections with a higher signal purity for comparable signal sample sizes.

The gain achieved in the multivariate analysis of the five likelihood variables could not be transferred to a subsequent network training mainly on the visible mass and the SVfit mass to separate H → ττ events from all other backgrounds. However, the achieved expected upper limits on the Higgs boson production cross section proved consistent with the current two-dimensional methods. Two conclusions have to be drawn: The current full analysis is simplified by the network approach, as the final test statistic, which is needed for the calculation of the limits, is no longer two-dimensional. After demonstrating the proof of principle, the network allows for a simple addition of further discriminating variables.

The second question concerned the usage of discriminating variables and the investigation of new variables. The currently used variables fulfil different tasks by separating the Higgs signal from different backgrounds. The two important backgrounds, Z → µµ and Z → ττ events, have been studied. By examining the distributions of all available inputs and with the help of the NeuroBayes ranking routines, variables have been identified that are capable of either classifying ττ final states or separating H → ττ events from Z → ττ events. For these two purposes individual networks have been trained, and a final discriminator classifying Higgs events has been developed, which is calculated by multiplying the individual network outputs.

By selecting signal-like samples based on a cut on the final discriminator, the purity of the selection is increased in comparison with the traditional approach. That makes this method interesting for further studies of the properties of a possible Higgs boson, where a Higgs signal sample as pure as possible is needed. Before a discovery, the sensitivity of the analysis for the search for the Higgs boson is expressed in terms of expected upper limits on the Higgs boson production cross section, where the new two-staged approach yields a relative improvement of 40 % on average. Because the improvement shows no systematic dependency on the Higgs mass hypothesis, and according to the NeuroBayes variable rankings, this method gains more by outsourcing the Higgs-Z separation problem and by combining the network outputs without loss than by adding more variables.


Conclusion – Summary of the Results and Perspective

Within the scope of this thesis, two simulation-based multivariate analyses have been performed to improve the suppression of Z background for the Higgs search in the H → ττ channel.

New ττ mass reconstruction approaches based on artificial neural networks have been studied in order to improve the resolution of currently employed mass definitions, since a full reconstruction of ττ final states is prevented by the presence of neutrinos in the τ decays. Reduced reconstruction uncertainties allow for a better separation of H → ττ and Z → ττ events, especially in the low Higgs mass region.

The best reconstruction resolution has been achieved with a one-staged neural network directly predicting estimators for the ττ mass. With the information provided by nine input variables, including the visible mass and the missing transverse energy as the most important ones, the neural network package NeuroBayes was able to reconstruct the ττ mass with resolutions of 18 to 19 % for true masses between the nominal Z mass and 150 GeV/c², whereas the SVfit mass as the best method at present yields resolutions between 19 and 23 % in the same interval of generated masses (see Figure 4.7). The improvement of the network-based reconstruction in comparison to the SVfit mass increases with the true ττ mass and yields a reduction of relative reconstruction uncertainties of up to five percentage points within the interesting mass range. Additionally, the new network approach yields at least as precise results as the SVfit method over the entire studied mass range. Boundary effects that afflict the network estimators can be avoided with a training sample that covers an extended interval of generated ττ masses.

More elaborate, physically motivated approaches have been studied, too. The reconstruction of corrections for the visible mass, as an easy and good first mass estimator, could at most confirm the results obtained by the simpler direct mass reconstruction in the interesting mass range (see Figure 4.12). Estimating three quantities that parametrise the missing four-momentum yields mass resolutions which improve on those of the SVfit mass but cannot reach those of the simple straightforward approach in terms of precision, due to the propagation of reconstruction uncertainties during the combination of the parameter estimators (see Figure 4.13).

The second analysis concerned the classification of Higgs events in the final state with a pair of tauons. The analysis has been performed in the scope of the official search for the Higgs boson in the H → ττ → µµ channel. Therefore the existing likelihood-based multivariate analysis has been implemented based on neural networks imitating the current approach. Firstly, a network has been trained on the five variables used for the current one-dimensional likelihood method to classify Higgs events. The background contamination of a sample that is selected based on the network discriminator after applying a cut threshold could be reduced by a factor of about two, while selecting equal numbers of signal events, in comparison to the likelihood quantity (see Figure 5.2).

Secondly, Higgs mass specific network trainings have been performed exploiting mass information, i.e. the visible µµ mass and the reconstructed ττ mass (SVfit), in order to obtain a test statistic which is used for the calculation of expected upper limits on the Higgs production cross section. This approach was able to confirm the expected limits calculated from two-dimensional mass distributions as done in the current analysis. The benefit of the method lies in the simplification achieved by substituting two-dimensional analyses with one-dimensional ones, which enables an easy addition of more discriminating variables to the multivariate analysis.

In order to improve the sensitivity of the analysis significantly, a new two-staged classification approach has been developed within the scope of this thesis. A first neural network is intended to classify ττ final states, i.e. Z → ττ and H → ττ events, in order to first discriminate against the most dominant Z → µµ background. The most important discriminating variable for this network is the visible mass. Secondly, for each Higgs mass hypothesis a second-stage network has been trained in order to discriminate H → ττ events from Z → ττ events as the second most dominant background process. For these trainings the SVfit mass is of particular importance. As expected, the discrimination power of the network outputs increased with the Higgs mass hypothesis.

Both the ττ classification discriminator and the Higgs-Z separating discriminator have been combined through a multiplication. The resulting discriminator can be taken as a conditional probability for obtaining a Higgs signal. The expected upper limits calculated for this approach yielded a significant relative improvement of about 40 % with respect to the existing approach, nearly independent of the Higgs mass hypothesis and without considering any systematic effects (see Figure 5.9). The improvement of the sensitivity for the Higgs search is mainly the result of a more meaningful usage of the discriminating variables.

Further studies are needed to combine the independently treated analyses presented in this thesis. The performance of the network-based mass reconstruction techniques has to be examined for real data. Furthermore, limitations of the employed fast simulation with respect to the full detector simulation have to be studied. After that, the impact of the improved mass resolution on the separation of Higgs and Z events has to be investigated.

In the scope of the classification, the actual usage of the final discriminators has to be decided. Due to statistical limitations a parametrisation of the network outputs may be considered. Then the method will be ready for its application to real data. The great improvements in terms of expected upper limits finally have to be brought to the official analysis and systematic effects have to be investigated.


A Additional Information for the Mass Reconstruction Analysis

A.1 Rankings of the Network Input Variables

All rankings provided by NeuroBayes evaluate the quality of single variables with the help of four different correlation quantities as defined in Section 3.3. The lists provided below are ordered by the NeuroBayes ranking, which mainly follows the added correlation C_added. Variables marked with an asterisk (*) are preprocessed with respect to the target width instead of its mean. The correlations C_others to all other variables are rather large due to the fact that most of the variables are used twice.

Impact parameters, i.e. the closest distance between a track and the primary vertex (PV), are denoted by d_min, whereas their reconstruction uncertainties are symbolised by either σ(d_min) or simply σ. Impact parameter significances therefore correspond to the ratio d_min/σ. The reconstructed hadronically decaying tauon resulting from the secondary vertex (SV) reconstruction is denoted by τ_had^reco.
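The ranking tables list the magnitude of an impact parameter significance and its sign as separate inputs. A minimal sketch of that decomposition follows; the function name `impact_parameter_inputs` and the numerical values are hypothetical, and the actual preprocessing in the analysis may differ.

```python
import math


def impact_parameter_inputs(d_min, sigma):
    """Split the significance d_min / sigma into (magnitude, sign).

    `sigma` is the reconstruction uncertainty of d_min and must be positive.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    significance = d_min / sigma
    return abs(significance), int(math.copysign(1, significance))


# Hypothetical values in arbitrary length units.
magnitude, sign = impact_parameter_inputs(-0.5, 0.25)
```

Here a signed impact parameter of −0.5 with an uncertainty of 0.25 gives a significance magnitude of 2 and a sign of −1.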

Variable                            C_target [%]  C_added [%]  C_loss [%]  C_others [%]
m_ττ^vis                            77.1          77.1         5.5         99.4
E_T^miss                            45.7          25.9         6.6         97.1
|Δϕ(τ_jet, μ) − π|                  25.7          9.6          2.4         97.9
|Δϕ(τ_jet, MET) − π|                26.3          7.7          2.8         94.9
m_ττ^vis (*)                        58.7          7.4          4.3         94.8
|d_min^μ| / σ                       15.4          7.2          1.5         98.1
|d_min(τ_jet^lead)| / σ             10.8          6.9          0.6         98.6
m_τjet (*)                          4.0           5.5          3.3         69.1
|Δη(τ_jet, μ)|                      23.1          5.4          0.8         99.6
E_T^miss / E_ττ (*)                 15.3          3.2          0.9         98.0
distance(PV → SV)                   4.6           3.3          2.4         58.5
Δp_T(τ_jet, μ)                      48.4          2.5          2.7         98.5
|Δϕ(μ, MET)| (*)                    20.9          2.2          1.6         89.5
|Δϕ(μ, MET)|                        29.0          2.7          1.8         93.3
E_T^miss / E_T^ττ (*)               6.4           2.3          2.3         96.0
E_T^miss / E_T^ττ                   10.0          2.4          1.4         81.4
σ(d_min^μ) (*)                      26.6          2.3          1.9         97.4
σ(d_min^μ)                          27.7          2.2          1.3         97.2
Δp_T(τ_jet, μ) (*)                  46.8          1.5          1.6         98.0
p_T(τ_jet) (*)                      64.6          1.6          0.5         99.6
p_T(τ_jet^lead) (*)                 54.3          1.7          1.3         95.8
m(τ_jet^lead) (*)                   11.0          1.4          0.9         97.2
m_τjet > 200 GeV/c²                 6.4           1.6          1.1         86.5
E_T^miss / p_ττ (*)                 9.3           1.4          0.8         88.1
p_T^μ (*)                           45.8          1.1          1.6         99.2
p_T^μ                               46.6          1.5          1.2         99.5
|Δϕ(τ_jet, MET) − π| (*)            22.1          1.3          1.4         92.1
|Δϕ(τ_jet, μ) − π| (*)              24.9          1.3          1.1         98.2
E_T^miss (*)                        42.6          1.1          1.4         96.9
|d_min(τ_jet^lead)|                 10.6          0.5          1.1         96.4
|d_min(τ_jet^lead)| (*)             10.2          1.1          1.1         96.5
|d_min^μ|                           14.0          1.1          0.8         96.1
|⟨η_μ, η_τjet⟩|                     7.3           1.1          0.2         99.0
E_ττ (*)                            30.4          0.9          0.5         96.0
p_ττ (*)                            18.7          0.9          0.9         95.9
|η_τjet|                            2.5           0.9          0.4         98.0
sign⟨η(τ_had^reco), η_μ⟩            5.6           0.7          0.1         100.0
|η(τ_had^reco)|                     5.7           0.9          1.0         98.2
classes (SV)                        6.0           1.0          0.1         100.0
|d_min^μ| / σ (*)                   15.0          1.0          1.1         97.7
p_T^ττ (*)                          15.2          1.0          1.0         76.3
|ΔR(τ_jet, μ) − π|                  20.6          0.8          0.5         90.0
ΣE_T^miss                           36.9          0.8          0.5         93.9
σ[d_min(τ_jet^lead)]                27.9          0.3          0.7         97.3
σ[d_min(τ_jet^lead)] (*)            27.2          0.7          0.6         97.3
p_T(τ_jet^lead)                     57.4          0.7          0.6         96.7
∠(τ_jet, τ_had^reco)                7.1           0.7          0.8         85.8
sign(ϕ_μ)                           0.3           0.7          0.4         86.6
∠(τ_jet, μ)                         16.0          0.5          0.3         96.6
E_ττ                                54.9          0.6          0.8         99.2
E_T^miss / p_ττ                     12.9          0.5          0.6         96.2
E_T^miss / p_T^ττ                   16.5          0.6          0.4         89.2
|Δη(τ_jet, τ_had^reco)|             6.0           0.5          0.5         94.6
p_T^ττ                              35.9          0.4          0.5         97.0
|η_ττ| (*)                          4.8           0.4          0.5         96.3
|d_min(τ_jet^lead)| / σ (*)         10.5          0.4          0.4         98.1
|ϕ_τjet|                            0.6           0.1          0.2         97.2
|ϕ_τjet| (*)                        0.5           0.4          0.3         96.2
|Δη(τ_jet, μ)| (*)                  22.8          0.4          0.4         99.3
sign(η_μ)                           0.7           0.3          0.2         78.5
sign[ΔR(τ_jet, μ) − π]              25.0          0.3          0.3         79.5
|Δϕ(τ_had^reco, μ) − π|             8.6           0.2          0.3         80.2
||Δϕ(τ_had^reco, MET) − π| − π/2|   8.6           0.3          0.3         76.2
E_τjet                              49.7          0.2          0.3         98.7
E_τjet (*)                          6.5           0.2          0.3         86.7
E_μ                                 33.7          0.3          0.2         98.7
|η_μ| (*)                           0.1           0.3          0.3         43.7
sign[Δϕ(μ, MET)]                    0.1           0.3          0.2         68.6
m_τjet                              10.3          0.2          0.3         92.2
m_τjet < 200 GeV/c² (*)             6.8           0.2          0.2         88.0
E_T^miss / E_ττ                     16.2          0.2          0.3         98.9
E_T^miss / p_T^ττ (*)               12.8          0.2          0.2         96.2
E_T^miss / p_T^ττ (*)               13.5          0.2          0.2         97.5
sign[d_min(τ_jet^lead) / σ]         0.6           0.2          0.1         78.7
|ΔR(τ_jet, μ) − π| (*)              6.1           0.2          0.2         71.0
|⟨η_μ, η_τjet⟩| (*)                 5.5           0.2          0.2         94.9
sign(ϕ_MET)                         0.3           0.2          0.2         64.1
sign(ϕ_ττ)                          0.2           0.2          0.2         62.4
|ϕ_ττ| (*)                          0.5           0.2          0.2         53.2
ΣE_T^miss (*)                       33.7          0.2          0.2         92.1
|Δϕ(τ_jet, τ_had^reco)|             7.7           0.2          0.2         84.6
|ϕ_μ| (*)                           0.1           0.2          0.2         80.3
|ϕ_MET| (*)                         0.3           0.2          0.2         53.3
m(τ_jet^lead)                       18.6          0.2          0.1         86.3
p_T(τ_jet)                          65.7          0.2          0.2         99.7
sign[Δη(τ_jet, τ_had^reco)]         5.6           0.1          0.1         99.8
|ϕ_MET|                             0.7           0.1          0.1         65.8
|⟨η(τ_had^reco), η_μ⟩|              7.0           0.1          0.1         81.2
sign(d_min^μ / σ)                   0.1           0.1          0.1         40.5
|ϕ(τ_had^reco)|                     5.6           0.1          0.1         99.5
|η_τjet| (*)                        2.6           0.1          0.1         92.8
sign[Δϕ(τ_jet, μ) − π]              0.2           0.1          0.1         43.3
∠(τ_jet, μ) (*)                     15.0          0.1          0.1         98.4
sign[Δϕ(τ_jet, τ_had^reco)]         5.6           0.1          0.1         99.8
sign[Δη(τ_jet, μ)]                  0.0           0.1          0.1         68.1
sign(η_τjet)                        0.7           0.0          0.1         85.6
sign(η_ττ)                          0.5           0.1          0.1         94.7
|ϕ_μ|                               0.6           0.1          0.1         89.0
η(τ_jet^lead) (*)                   1.7           0.1          0.1         91.9
ϕ(τ_jet^lead)                       0.5           0.1          0.1         86.8
sign(ϕ_τjet)                        0.5           0.1          0.1         93.5
ϕ(τ_jet^lead) (*)                   0.0           0.1          0.1         88.5
E_μ (*)                             29.4          0.1          0.1         91.5
p_ττ                                30.8          0.1          0.1         98.4

Table A.1: Ranking of all input variables for the straightforward one-staged mass reconstruction, see Section 4.3.2.


Variable                            C_target [%]  C_added [%]  C_loss [%]  C_others [%]
m_ττ^vis                            52.7          52.7         34.1        58.5
E_T^miss                            29.1          33.3         8.5         86.6
p_T^μ (*)                           38.2          13.6         10.7        57.6
|Δϕ(τ_jet, μ) − π| (*)              2.6           12.7         15.3        47.2
|d_min(τ_jet^lead)|                 13.2          10.8         9.4         11.7
|d_min^μ|                           21.9          9.1          9.0         24.9
|Δϕ(τ_jet, MET) − π| (*)            1.5           7.9          8.9         51.9
E_T^miss / E_ττ                     41.4          8.1          10.5        89.4
m_τjet > 200 GeV/c²                 10.7          7.6          7.3         8.2
|⟨η_μ, η_τjet⟩|                     4.8           6.8          6.8         63.8

Table A.2: Ranking of the input variables used for the training intended to reconstruct corrections for the visible mass, see Section 4.3.3. The visible mass and the missing transverse energy play the most important role for the training.

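| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $E^{\text{miss}}_T/E_{\tau\tau}$ | 37.4 | 37.4 | 37.1 | 51.7 |
| $m^{\text{vis}}_{\tau\tau}$ | 32.8 | 26.7 | 20.9 | 31.8 |
| $\lvert\eta_\mu\rvert$ | 11.3 | 20.2 | 21.5 | 28.3 |
| $\lvert\Delta\varphi(\tau_{\text{jet}}, \mu) - \pi\rvert$ (*) | 7.0 | 14.1 | 18.5 | 53.9 |
| $\lvert\Delta\varphi(\tau_{\text{jet}}, \text{MET}) - \pi\rvert$ (*) | 1.0 | 15.4 | 13.4 | 57.7 |
| $\lvert d^{\mu}_{\min}\rvert$ | 20.8 | 10.6 | 10.2 | 23.5 |
| $\lvert d_{\min}(\tau^{\text{lead}}_{\text{jet}})\rvert$ | 11.8 | 9.2 | 9.1 | 9.4 |
| $m_{\tau_{\text{jet}}} > 200\ \mathrm{GeV}/c^2$ | 9.1 | 6.7 | 6.7 | 6.6 |
| $E_{\tau_{\text{jet}}}$ | 18.0 | 6.1 | 6.1 | 36.5 |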

Table A.3: Ranking of the input variables used for the training on the parameter α for the mass parametrisation approach, see Section 4.3.4. For this training the measured energy ratio $E^{\text{miss}}_T/E_{\tau\tau}$ is important, as the parameter α refers to the ratio of missing and visible energy of the ττ system.


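| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $E^{\text{miss}}_T/p_{\tau\tau}$ | 47.7 | 47.7 | 10.1 | 96.1 |
| $\lvert\eta_\mu\rvert$ | 12.3 | 31.2 | 16.8 | 95.6 |
| $m^{\text{vis}}_{\tau\tau}$ | 24.8 | 19.1 | 15.2 | 46.4 |
| $\lvert\Delta\varphi(\tau_{\text{jet}}, \mu) - \pi\rvert$ | 8.1 | 10.7 | 11.7 | 58.3 |
| $\lvert\Delta\varphi(\tau_{\text{jet}}, \text{MET}) - \pi\rvert$ | 6.4 | 10.0 | 10.3 | 64.1 |
| $\lvert d^{\mu}_{\min}\rvert$ | 18.6 | 9.8 | 8.7 | 25.6 |
| $E^{\text{miss}}_T/E_{\tau\tau}$ | 43.9 | 6.4 | 8.2 | 94.1 |
| $\lvert\eta_\mu\rvert$ (*) | 11.5 | 6.4 | 7.8 | 94.7 |
| $p_{\tau\tau}$ | 25.8 | 7.7 | 11.8 | 95.7 |
| $p^{\mu}_T$ (*) | 15.6 | 7.3 | 7.6 | 67.0 |
| $\lvert\langle\eta_\mu, \eta_{\tau_{\text{jet}}}\rangle\rvert$ | 16.5 | 7.9 | 10.7 | 96.1 |
| $\lvert d_{\min}(\tau^{\text{lead}}_{\text{jet}})\rvert$ | 9.1 | 6.5 | 6.6 | 9.6 |
| $p_T(\tau_{\text{jet}})$ (*) | 2.9 | 6.5 | 6.8 | 82.6 |
| $\angle(\tau_{\text{jet}}, \mu)$ (*) | 8.5 | 1.9 | 5.8 | 95.5 |
| $\lvert\langle\eta_\mu, \eta_{\tau_{\text{jet}}}\rangle\rvert$ (*) | 1.8 | 5.6 | 5.6 | 94.8 |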

Table A.4: Ranking of the input variables used for the training on the parameter β for the parametrisation approach, see Section 4.3.4. For this training the measured momentum ratio $E^{\text{miss}}_T/p_{\tau\tau}$ is important, as the parameter β refers to the ratio of missing and visible momentum of the ττ system.

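| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $\angle(\tau_{\text{jet}}, \mu)$ (*) | 63.3 | 63.3 | 23.2 | 85.2 |
| $\Delta p_T(\tau_{\text{jet}}, \mu)$ | 46.2 | 45.2 | 16.9 | 68.2 |
| $E^{\text{miss}}_T/p_{\tau\tau}$ | 70.0 | 25.2 | 14.1 | 80.5 |
| $\lvert\Delta\varphi(\tau_{\text{jet}}, \text{MET}) - \pi\rvert$ | 39.3 | 13.8 | 15.1 | 61.3 |
| $\lvert\eta_{\tau\tau}\rvert$ | 65.6 | 7.1 | 7.7 | 85.4 |
| $\lvert\eta_{\tau_{\text{jet}}}\rvert$ | 40.1 | 7.3 | 6.8 | 77.4 |
| $\lvert\Delta\eta(\tau_{\text{jet}}, \mu)\rvert$ | 8.9 | 7.2 | 5.7 | 59.7 |
| $E_\mu$ | 45.5 | 5.2 | 5.2 | 60.5 |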

Table A.5: Ranking of the input variables used for the training on the parameter γ for the mass parametrisation approach, see Section 4.3.4. For this training the angle $\angle(\tau_{\text{jet}}, \mu)$ is important, as the parameter γ refers to the angle between the directions of the missing and the visible momentum of the ττ system.
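The captions of Tables A.3 to A.5 describe α, β and γ as the ratio of missing to visible energy, the ratio of missing to visible momentum, and the angle between the missing and visible momentum directions. As a minimal sketch under exactly these assumptions, a ττ mass can be obtained from them by four-momentum addition. The function name, the argument names and the use of natural units (c = 1) are choices made for this illustration and are not taken from the analysis code.

```python
import math


def mass_from_parametrisation(e_vis, p_vis, alpha, beta, gamma):
    """Invariant mass of the visible + missing system (natural units, c = 1).

    Assumptions of this sketch:
      alpha = E_miss / E_vis
      beta  = |p_miss| / |p_vis|
      gamma = angle between the missing and the visible momentum directions
    """
    # Total energy: E_vis + E_miss = E_vis * (1 + alpha)
    e_tot = e_vis * (1.0 + alpha)
    # |p_vis + p_miss|^2 via the law of cosines
    p_tot_sq = p_vis ** 2 * (1.0 + beta ** 2 + 2.0 * beta * math.cos(gamma))
    # Guard against small negative values caused by imperfect inputs
    m_sq = max(e_tot ** 2 - p_tot_sq, 0.0)
    return math.sqrt(m_sq)
```

For a massless visible system with equal and back-to-back missing momentum (α = β = 1, γ = π), the total momentum cancels and the mass equals the total energy; for collinear momenta (γ = 0) the mass vanishes.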


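| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $m^{\text{analyt}}_{\tau\tau}$ | 71.2 | 71.2 | 54.4 | 61.1 |
| $\alpha_{\text{reco}}$ | 31.9 | 5.6 | 5.1 | 78.9 |
| $\beta_{\text{reco}}$ (*) | 6.6 | 5.0 | 8.2 | 79.6 |
| $\alpha_{\text{reco}}$ (*) | 6.8 | 5.9 | 6.7 | 72.8 |
| $E_{\text{miss}}/E_{\text{vis}}$ (*) | 15.3 | 2.0 | 4.6 | 96.9 |
| $E_{\text{miss}}/E_{\text{vis}}$ | 16.2 | 5.5 | 5.6 | 97.8 |
| $p_{\text{miss}}/p_{\text{vis}}$ | 12.9 | 5.7 | 4.7 | 93.8 |
| $\gamma_{\text{reco}}$ | 24.3 | 3.8 | 6.0 | 94.1 |
| $\gamma_{\text{reco}}$ (*) | 21.7 | 5.2 | 4.4 | 91.5 |
| $p_{\text{miss}}/p_{\text{vis}}$ (*) | 9.3 | 3.1 | 3.0 | 85.1 |
| $\beta_{\text{reco}}$ | 25.2 | 0.3 | 0.3 | 78.8 |
| $m^{\text{analyt}}_{\tau\tau}$ (*) | 31.1 | 0.1 | 0.1 | 56.2 |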

Table A.6: Ranking of the input variables used for the combination training for the mass parametrisation approach to improve the analytically calculated ττ mass, see Section 4.3.4.


A.2 Additional Performance Tests for the Straightforward One-staged Network

[Plot (a): reconstruction resolution $\sigma^{\text{reco}}_{m_{\tau\tau}}/m^{\text{reco}}_{\tau\tau}$ versus generated mass $m^{\text{gen}}_{\tau\tau}$ [GeV/$c^2$]. Curves: SVfit mass; NB, $C_{\text{added}} > 5\,\%$; NB, $C_{\text{added}} > 1\,\%$; NB, $C_{\text{added}} > 0\,\%$.]

(a) Reconstruction resolution for different sets of input variables selected according to the added correlation $C_{\text{added}}$.

[Plot (b): same axes. Curves: SVfit mass; NB, $N_{\text{train}} \approx 200{,}000$; NB, $N_{\text{train}} \approx 100{,}000$; NB, $N_{\text{train}} \approx 50{,}000$.]

(b) Reconstruction resolution for different numbers of used training events $N_{\text{train}}$.

[Plot (c): same axes. Curves: SVfit mass; NB, $N_{\text{out}} = 20$; NB, $N_{\text{out}} = 40$; NB, $N_{\text{out}} = 80$.]

(c) Reconstruction resolution for different numbers of network output nodes $N_{\text{out}}$.

Figure A.1: Comparison of the performance for various tests of the simple straightforward mass reconstruction network described in Section 4.3.2. The plots show the impact of the number of training events, the input variables used and the number of output nodes on the reconstruction resolution. In each plot only one of these aspects was varied during the training; everything else was kept at the standard settings described in Section 4.3.2. For comparison, the performance of the SVfit mass is shown as well. The figures reveal no significant differences. Only for low masses does the training with 80 output nodes yield slightly worse results, which can be avoided by using a smaller number of output nodes that suits the problem better.
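The quantity plotted in Figure A.1 is a relative resolution as a function of the generated mass. A minimal sketch of how such a curve could be computed is given here, assuming that the resolution in each generated-mass bin is estimated as the sample standard deviation of the reconstructed mass divided by its mean. This estimator, the function name and the binning are assumptions made for illustration; the definition actually used is the one given in Section 4.3.2.

```python
import statistics


def relative_resolution(m_gen, m_reco, bin_edges):
    """Relative resolution sigma(m_reco) / mean(m_reco) per generated-mass bin.

    Assumption of this sketch: sigma is the sample standard deviation of the
    reconstructed masses in each bin. Bins with fewer than two events give None.
    """
    n_bins = len(bin_edges) - 1
    per_bin = [[] for _ in range(n_bins)]
    for gen, reco in zip(m_gen, m_reco):
        for i in range(n_bins):
            # Half-open bins [lower edge, upper edge)
            if bin_edges[i] <= gen < bin_edges[i + 1]:
                per_bin[i].append(reco)
                break
    result = []
    for values in per_bin:
        if len(values) < 2 or statistics.fmean(values) == 0:
            result.append(None)
        else:
            result.append(statistics.stdev(values) / statistics.fmean(values))
    return result
```

Evaluating this function for the outputs of differently configured trainings on the same test sample would give curves that can be compared in the way the figure does.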


B Additional Information for the Higgs Classification Analysis

B.1 Rankings of the Network Input Variables

All rankings provided by NeuroBayes evaluate the quality of single variables with the help of four different correlation quantities as defined in Section 3.3. The lists provided below are ordered by the NeuroBayes ranking, which mainly follows the added correlation $C_{\text{added}}$.
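To illustrate how such a ranking can be handled, here is a minimal sketch that orders variables by their added correlation and optionally keeps only those above a threshold, similar in spirit to the $C_{\text{added}}$ selections used for the tests in Appendix A.2. The values are copied from Table B.2; the class and function names are hypothetical, and a plain sort by $C_{\text{added}}$ only approximates the NeuroBayes ranking, which the text says it "mainly" follows.

```python
from typing import NamedTuple


class RankedVariable(NamedTuple):
    name: str
    c_target: float  # correlation to the target [%]
    c_added: float   # added correlation [%]
    c_loss: float    # correlation loss when removed [%]
    c_others: float  # correlation to the other variables [%]


def rank_by_added(rows, min_added=0.0):
    """Keep variables with c_added > min_added, sorted by descending c_added."""
    kept = [row for row in rows if row.c_added > min_added]
    return sorted(kept, key=lambda row: row.c_added, reverse=True)


# Values from Table B.2 (m_H = 120 GeV/c^2), deliberately given unsorted.
table_b2 = [
    RankedVariable("m_tautau", 49.2, 4.5, 4.5, 63.9),
    RankedVariable("m_mumu", 80.7, 80.7, 51.4, 70.4),
    RankedVariable("t_net1", 58.8, 18.0, 18.3, 55.1),
]

ranked_names = [row.name for row in rank_by_added(table_b2)]
selected_names = [row.name for row in rank_by_added(table_b2, min_added=5.0)]
```

With the threshold of 5 %, the ττ mass drops out of the selection for this low-mass training because its added correlation is only 4.5 %.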

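| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $\log_{10}[\text{DCA signif}(\mu\mu)]$ | 44.4 | 44.4 | 33.7 | 23.9 |
| $p^{\mu\mu}_T/(p^{\mu^+}_T + p^{\mu^-}_T)$ | 43.0 | 34.9 | 20.5 | 71.4 |
| CA validity | 17.8 | 12.5 | 13.7 | 17.7 |
| $\lvert\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}\rvert$ | 19.5 | 12.7 | 12.7 | 19.0 |
| $\lvert\eta_{\mu\mu}\rvert$ | 34.9 | 4.5 | 4.5 | 70.8 |
| $\operatorname{sign}(\eta_{\mu\mu})$ | 1.0 | 0.3 | 0.3 | 2.6 |
| $\operatorname{sign}[\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}]$ | 1.3 | 0.2 | 0.2 | 9.6 |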

Table B.1: Ranking of the input variables used for the Higgs classification training that is equivalent to the one-dimensional likelihood method based on five input variables (see Section 5.2.1).


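| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $m_{\mu\mu}$ | 80.7 | 80.7 | 51.4 | 70.4 |
| $t_{\text{net 1}}$ | 58.8 | 18.0 | 18.3 | 55.1 |
| $m_{\tau\tau}$ | 49.2 | 4.5 | 4.5 | 63.9 |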

Table B.2: Ranking of the input variables used for the Higgs classification training that is equivalent to the two-dimensional mass analysis (see Section 5.2.2) for a low Higgs mass hypothesis of $m_H = 120\ \mathrm{GeV}/c^2$.

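| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $m_{\mu\mu}$ | 70.6 | 70.6 | 34.4 | 58.4 |
| $t_{\text{net 1}}$ | 62.6 | 36.5 | 31.5 | 45.6 |
| $m_{\tau\tau}$ | 61.2 | 20.6 | 20.6 | 56.4 |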

Table B.3: Ranking of the input variables used for the Higgs classification training that is equivalent to the two-dimensional mass analysis (see Section 5.2.2) for a medium Higgs mass hypothesis of $m_H = 250\ \mathrm{GeV}/c^2$.

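| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $m_{\tau\tau}$ | 90.2 | 90.2 | 25.8 | 88.7 |
| $t_{\text{net 1}}$ | 67.7 | 15.9 | 15.3 | 61.2 |
| $m_{\mu\mu}$ | 85.3 | 12.6 | 12.6 | 87.4 |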

Table B.4: Ranking of the input variables used for the Higgs classification training that is equivalent to the two-dimensional mass analysis (see Section 5.2.2) for a high Higgs mass hypothesis of $m_H = 500\ \mathrm{GeV}/c^2$.


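| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $\lvert m_{\mu\mu} - m_Z\rvert$ | 83.1 | 83.1 | 39.0 | 82.0 |
| $\log_{10}[\text{DCA signif}(\mu\mu)]$ | 42.3 | 13.7 | 12.1 | 37.0 |
| $E^{\text{miss}}_T$ | 28.5 | 7.8 | 6.2 | 44.7 |
| $\lvert\cos\omega^*\rvert$ | 28.3 | 7.4 | 6.4 | 42.7 |
| $\operatorname{sign}(m_{\mu\mu} - m_Z)$ | 45.9 | 5.7 | 5.2 | 50.4 |
| $p^{\mu\mu}_T/(p^{\mu^+}_T + p^{\mu^-}_T)$ | 39.6 | 4.8 | 2.5 | 82.9 |
| CA validity | 15.0 | 3.9 | 3.8 | 40.1 |
| $\lvert\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}\rvert$ | 15.9 | 2.7 | 2.9 | 28.9 |
| $\lvert\eta_{\mu\mu}\rvert$ | 29.2 | 1.7 | 1.6 | 70.4 |
| $P_\zeta - 1.85 \cdot P^{\text{vis}}_\zeta$ | 14.7 | 1.3 | 1.4 | 66.5 |
| $m_{\tau\tau}$ | 58.1 | 0.8 | 0.9 | 81.2 |
| $\operatorname{sign}[\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}]$ | 2.0 | 0.5 | 0.5 | 9.2 |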

Table B.5: Ranking of all input variables used for the network classifying ττ final states (see Section 5.3.1).

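| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $m_{\tau\tau}$ | 28.4 | 28.4 | 7.2 | 81.0 |
| $\lvert\eta_{\tau\tau}\rvert$ | 24.7 | 24.5 | 18.4 | 46.2 |
| $\lvert m_{\mu\mu} - m_Z\rvert$ | 27.2 | 11.1 | 12.4 | 76.5 |
| $p^{\mu\mu}_T/(p^{\mu^+}_T + p^{\mu^-}_T)$ | 14.7 | 11.0 | 3.4 | 72.5 |
| $\log_{10}[\text{DCA signif}(\mu\mu)]$ | 4.2 | 6.6 | 6.8 | 10.5 |
| $\lvert\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}\rvert$ | 7.7 | 5.5 | 4.5 | 27.7 |
| $P_\zeta - 1.85 \cdot P^{\text{vis}}_\zeta$ | 17.7 | 3.7 | 4.3 | 73.6 |
| CA validity | 6.9 | 4.1 | 4.0 | 50.7 |
| $\lvert\eta_{\mu\mu}\rvert$ | 16.0 | 3.5 | 3.7 | 61.0 |
| $E^{\text{miss}}_T$ | 12.4 | 2.1 | 1.8 | 62.1 |
| $\operatorname{sign}(m_{\mu\mu} - m_Z)$ | 8.1 | 1.9 | 1.8 | 31.4 |
| $\operatorname{sign}[\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}]$ | 3.8 | 1.4 | 1.4 | 15.0 |
| $\lvert\cos\theta^*\rvert$ | 8.7 | 1.2 | 1.2 | 45.9 |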

Table B.6: Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events (see Section 5.3.2) for a low Higgs mass hypothesis of $m_H = 120\ \mathrm{GeV}/c^2$.


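| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $m_{\tau\tau}$ | 82.8 | 82.8 | 33.6 | 88.0 |
| $\log_{10}[\text{DCA signif}(\mu\mu)]$ | 19.0 | 7.0 | 7.1 | 16.2 |
| $\lvert\eta_{\tau\tau}\rvert$ | 18.2 | 6.3 | 4.7 | 44.5 |
| $\operatorname{sign}(m_{\mu\mu} - m_Z)$ | 61.0 | 4.7 | 5.3 | 75.2 |
| $\lvert\cos\theta^*\rvert$ | 21.9 | 4.7 | 2.4 | 51.2 |
| $\lvert m_{\mu\mu} - m_Z\rvert$ | 53.9 | 3.9 | 3.7 | 65.0 |
| $\lvert\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}\rvert$ | 17.3 | 3.3 | 2.6 | 47.8 |
| $\lvert\eta_{\mu\mu}\rvert$ | 20.1 | 2.5 | 2.1 | 58.5 |
| $P_\zeta - 1.85 \cdot P^{\text{vis}}_\zeta$ | 49.5 | 1.6 | 1.7 | 80.0 |
| $E^{\text{miss}}_T$ | 42.9 | 2.0 | 2.1 | 72.6 |
| $p^{\mu\mu}_T/(p^{\mu^+}_T + p^{\mu^-}_T)$ | 19.8 | 1.1 | 1.1 | 58.4 |
| $\operatorname{sign}[\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}]$ | 3.7 | 0.5 | 0.5 | 10.7 |
| CA validity | 12.3 | 0.0 | 0.0 | 36.8 |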

Table B.7: Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events (see Section 5.3.2) for a medium Higgs mass hypothesis of $m_H = 250\ \mathrm{GeV}/c^2$.

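| Variable | $C_{\text{target}}$ [%] | $C_{\text{added}}$ [%] | $C_{\text{loss}}$ [%] | $C_{\text{others}}$ [%] |
|---|---|---|---|---|
| $m_{\tau\tau}$ | 93.0 | 93.0 | 21.2 | 94.5 |
| $\operatorname{sign}(m_{\mu\mu} - m_Z)$ | 88.4 | 10.2 | 9.9 | 91.0 |
| $\log_{10}[\text{DCA signif}(\mu\mu)]$ | 27.1 | 4.5 | 4.3 | 25.5 |
| $p^{\mu\mu}_T/(p^{\mu^+}_T + p^{\mu^-}_T)$ | 29.3 | 3.5 | 2.3 | 55.6 |
| $\lvert\eta_{\mu\mu}\rvert$ | 26.9 | 2.8 | 1.9 | 61.3 |
| $\lvert m_{\mu\mu} - m_Z\rvert$ | 71.9 | 2.0 | 1.8 | 78.5 |
| $P_\zeta - 1.85 \cdot P^{\text{vis}}_\zeta$ | 70.8 | 1.5 | 1.0 | 87.7 |
| $\lvert\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}\rvert$ | 32.1 | 0.9 | 0.7 | 59.1 |
| $\lvert\cos\theta^*\rvert$ | 34.0 | 0.7 | 0.6 | 60.0 |
| CA validity | 12.8 | 0.5 | 0.5 | 36.9 |
| $E^{\text{miss}}_T$ | 67.9 | 0.3 | 0.3 | 85.3 |
| $\lvert\eta_{\tau\tau}\rvert$ | 8.0 | 0.1 | 0.1 | 38.6 |
| $\operatorname{sign}[\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}]$ | 0.1 | 0.1 | 0.1 | 9.7 |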

Table B.8: Ranking of the most important input variables used for the network discriminating H → ττ events against Z → ττ events (see Section 5.3.2) for a high Higgs mass hypothesis of $m_H = 500\ \mathrm{GeV}/c^2$.


B.2 Discriminator Distributions for Medium and High Higgs Mass Hypotheses

[Plots: events per bin (logarithmic scale) versus the discriminator "H → ττ vs. others" for H → ττ, Data, W+Jets, QCD, Diboson, $t\bar{t}$+Jets, Z → ττ and Z → µµ, each with a ratio panel below.]

(a) $m_H = 250\ \mathrm{GeV}/c^2$

(b) $m_H = 500\ \mathrm{GeV}/c^2$

Figure B.1: Network discriminator distributions for the trainings on the mass variables (see Section 5.2.2) for a medium (left) and a high (right) Higgs mass hypothesis. The outputs refer to separate networks.

[Plots: events per bin (logarithmic scale) versus the discriminator "H → ττ vs. Z → ττ" for H → ττ, Data, W+Jets, QCD, Diboson, $t\bar{t}$+Jets, Z → ττ and Z → µµ, each with a ratio panel below.]

(a) $m_H = 250\ \mathrm{GeV}/c^2$

(b) $m_H = 500\ \mathrm{GeV}/c^2$

Figure B.2: Network discriminator distributions for the Higgs-Z separation trainings (see Section 5.3.2) for a medium (left) and a high (right) Higgs mass hypothesis. The outputs refer to separate networks. Although the neural networks have only been trained on H → ττ and Z → ττ events, the network output is shown for all studied processes.


[Plots: events per bin (logarithmic scale) versus the combined discriminator "H → ττ vs. others" for H → ττ, Data, W+Jets, QCD, Diboson, $t\bar{t}$+Jets, Z → ττ and Z → µµ, each with a ratio panel below.]

(a) $m_H = 250\ \mathrm{GeV}/c^2$

(b) $m_H = 500\ \mathrm{GeV}/c^2$

Figure B.3: Distributions of the combined discriminator for classifying Higgs events (see Section 5.3.3) for a medium (left) and a high (right) Higgs mass hypothesis.

B.3 Plots Showing All Studied Discriminating Variables

The figures below show the discriminating power of the variables used for the Higgs classification methods (left). They are shown for the dominant Z → µµ and Z → ττ background processes as well as for two H → ττ signal samples with a low and a medium Higgs mass hypothesis. The right plots show the distributions of these variables for all simulated processes as well as for real data, in order to evaluate the agreement between measurement and simulation. All histograms are filled with events remaining after the preselection; no additional cut is applied.
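The two kinds of plots described above can be sketched with two small helpers: one that scales a histogram to unit area so that only shapes are compared (the "arbitrary units" of the left plots), and one that forms a bin-by-bin ratio of data to the summed simulation (the ratio panels of the right plots). This is a minimal sketch; the function names are hypothetical, and it is an assumption that the plotted ratio is data divided by the total simulated expectation.

```python
def normalise(counts):
    """Scale bin contents so that they sum to one (shape comparison only)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def data_to_simulation_ratio(data, simulated_processes):
    """Bin-by-bin ratio of data to the sum of all simulated processes.

    `simulated_processes` is a list of histograms (one list of bin contents
    per process). Bins without simulated events yield None.
    """
    summed = [sum(bins) for bins in zip(*simulated_processes)]
    return [d / s if s > 0 else None for d, s in zip(data, summed)]
```

A ratio close to one in all populated bins would indicate good agreement between measurement and simulation for the variable in question.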

[Plots of $\log_{10}[\text{DCA significance}(\mu\mu)]$. Left: distributions in arbitrary units for Z → µµ, Z → ττ, H → ττ (250 GeV/$c^2$) and H → ττ (120 GeV/$c^2$). Right: events per bin (logarithmic scale) for H → ττ, Data, W+Jets, QCD, Diboson, $t\bar{t}$+Jets, Z → ττ and Z → µµ, with a ratio panel below.]

Figure B.4: Discriminating variables on which the one-dimensional likelihood quantity is based.


[Plots of $p^{\mu\mu}_T/(p^{\mu^+}_T + p^{\mu^-}_T)$, CA validity, $\Delta\varphi(\mu^+, \text{MET}) - \frac{\pi}{2}$ and $\eta_{\mu\mu}$. Left: distributions in arbitrary units for Z → µµ, Z → ττ, H → ττ (250 GeV/$c^2$) and H → ττ (120 GeV/$c^2$). Right: events per bin (logarithmic scale) for H → ττ, Data, W+Jets, QCD, Diboson, $t\bar{t}$+Jets, Z → ττ and Z → µµ, each with a ratio panel below.]

Figure B.4: (Cont.) Discriminating variables on which the one-dimensional likelihood quantity is based.


[Plots of $m_{\mu\mu}$ [GeV/$c^2$], $m_{\tau\tau}$ [GeV/$c^2$] and $E^{\text{miss}}_T$ [GeV]. Left: distributions in arbitrary units for Z → µµ, Z → ττ, H → ττ (250 GeV/$c^2$) and H → ττ (120 GeV/$c^2$). Right: events per bin (logarithmic scale) for H → ττ, Data, W+Jets, QCD, Diboson, $t\bar{t}$+Jets, Z → ττ and Z → µµ, each with a ratio panel below.]

Figure B.5: Mass variables (visible mass and SVfit mass), as used for the two-dimensional mass analysis, and the missing transverse energy.


[Plots of $\eta_{\tau\tau}$, $\lvert\cos\omega^*\rvert$, $\lvert\cos\theta^*\rvert$ and $P_\zeta - 1.85 \cdot P^{\text{vis}}_\zeta$. Left: distributions in arbitrary units for Z → µµ, Z → ττ, H → ττ (250 GeV/$c^2$) and H → ττ (120 GeV/$c^2$). Right: events per bin (logarithmic scale) for H → ττ, Data, W+Jets, QCD, Diboson, $t\bar{t}$+Jets, Z → ττ and Z → µµ, each with a ratio panel below.]

Figure B.6: Additional variables not used in the current multivariate analysis. These variables are used in the new Higgs classification networks to study their impact on the training performance.


B.4 NeuroBayes Preprocessing

[NeuroBayes (Phi-T Teacher) preprocessing output for input node 8, MuonDCASig (PrePro: 14). Header: only this 61.00; corr. to others 37.00 %; added signi. 19.73; signi. loss 17.49; 2nd most important. Panels: flattened distribution (events versus flattened variable from 0 to 1); purity versus bin number with spline fit; distribution of the final net input (events, for signal and background); signal purity versus signal efficiency (separation).]

Figure B.7: NeuroBayes output showing the preprocessing performance for a single input variable with regard to a classification training as described in Section 3.3. The first plot shows the flattened distributions for signal (red) and background (black), whereas the signal purities per bin are depicted in the second plot together with a spline fit. The third plot then shows the distribution of the final network inputs, which are in principle the purity values from the second plot but transformed to have a mean of 0 and a standard deviation of 1. The last plot illustrates the separation power of the single variable (black) in comparison to that of the entire set of input variables (red). The header provides the significances indicating the quality of the variable.
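The three preprocessing steps named in the caption (flattening, purity per bin, transformation to zero mean and unit standard deviation) can be sketched as follows. This is an illustrative approximation only, not the NeuroBayes implementation: the spline fit to the purities is omitted, the flattening is done by simple equal-population binning, and the function name and the default number of bins are assumptions.

```python
import statistics


def preprocess(values, is_signal, n_bins=4):
    """Flatten a variable, compute the signal purity per bin and standardise.

    Returns (purity_per_bin, net_input), where net_input holds, for every
    event, the purity of its bin shifted and scaled to mean 0 and standard
    deviation 1.
    """
    n = len(values)
    # Step 1: flatten by rank so that every bin holds about the same number of events.
    order = sorted(range(n), key=lambda i: values[i])
    bin_of = [0] * n
    for rank, i in enumerate(order):
        bin_of[i] = rank * n_bins // n
    # Step 2: signal purity (signal / all events) in each bin.
    n_sig = [0] * n_bins
    n_all = [0] * n_bins
    for i in range(n):
        n_all[bin_of[i]] += 1
        if is_signal[i]:
            n_sig[bin_of[i]] += 1
    purity = [s / t if t else 0.0 for s, t in zip(n_sig, n_all)]
    # Step 3: replace each value by the purity of its bin and standardise.
    raw = [purity[b] for b in bin_of]
    mean = statistics.fmean(raw)
    std = statistics.pstdev(raw)
    if std == 0:
        return purity, [0.0] * n
    return purity, [(x - mean) / std for x in raw]
```

Replacing the raw value by a purity makes the network input monotonically related to the signal probability, which is the property the second and third plots in the figure illustrate.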


Bibliography

[1] Rudolf Löbl. Demokrits Atomphysik. Erträge der Forschung; 252. Darmstadt: Wiss. Buchges., 1987. isbn: 3-534-03132-6.

[2] Ernest Rutherford. ‘The Scattering of the Alpha and Beta Rays and the Structure of the Atom’. In: Proceedings of the Manchester Literary and Philosophical Society IV (1911), pp. 18–20. url: http://www.math.ubc.ca/~cass/rutherford/rutherford.html.

[3] Helmut Hilscher. Elementare Teilchenphysik. Facetten. Wiesbaden: Vieweg, 1996. isbn: 3-528-06670-9.

[4] Michael E. Peskin and Daniel V. Schroeder. An introduction to quantum field theory. The advanced book program. Westview Press, 1995. isbn: 0-201-50397-2; 978-0-201-50397-5.

[5] Abdelhak Djouadi. ‘The Anatomy of electro-weak symmetry breaking. I: The Higgs boson in the standard model’. In: Phys. Rept. 457 (2008), pp. 1–216. doi: 10.1016/j.physrep.2007.10.004. eprint: hep-ph/0503172.

[6] Sheldon L. Glashow. ‘Partial-symmetries of weak interactions’. In: Nuclear Physics 22.4 (1961), pp. 579–588. issn: 0029-5582. doi: 10.1016/0029-5582(61)90469-2.

[7] Abdus Salam. ‘Weak and Electromagnetic Interactions’. In: Conf. Proc. C680519 (1968), pp. 367–377.

[8] Steven Weinberg. ‘A Model of Leptons’. In: Phys. Rev. Lett. 19 (21 1967), pp. 1264–1266. doi: 10.1103/PhysRevLett.19.1264.

[9] P.W. Higgs. ‘Broken symmetries, massless particles and gauge fields’. In: Physics Letters 12.2 (1964), pp. 132–133. issn: 0031-9163. doi: 10.1016/0031-9163(64)91136-9.


[10] Peter W. Higgs. ‘Broken Symmetries and the Masses of Gauge Bosons’. In: Phys. Rev. Lett. 13.16 (Oct. 1964), pp. 508–509. doi: 10.1103/PhysRevLett.13.508.

[11] François Englert and Robert Brout. ‘Broken Symmetry and the Mass of Gauge Vector Mesons’. In: Phys. Rev. Lett. 13.9 (Aug. 1964), pp. 321–323. doi: 10.1103/PhysRevLett.13.321.

[12] Gerald S. Guralnik, Carl R. Hagen and Tom W. B. Kibble. ‘Global Conservation Laws and Massless Particles’. In: Phys. Rev. Lett. 13.20 (Nov. 1964), pp. 585–587. doi: 10.1103/PhysRevLett.13.585.

[13] Martin-Stirling-Thorne-Watt Parton Distribution Functions. url: http://projects.hepforge.org/mstwpdf/.

[14] LHC Higgs Cross Section Working Group et al. ‘Handbook of LHC Higgs Cross Sections: 1. Inclusive Observables’. In: CERN-2011-002 (CERN, Geneva, 2011). eprint: 1101.0593.

[15] K. Nakamura et al. ‘Review of particle physics’. In: J. Phys. G37 (2010). url: http://pdg.lbl.gov/2011/reviews/rpp2011-rev-higgs-boson.pdf.

[16] Michail S. Bachtis et al. ‘Performance of tau reconstruction algorithms with 2010 data in CMS’. CMS AN-2011/045.

[17] LEP Electro-Weak Working Group. url: http://lepewwg.web.cern.ch/LEPEWWG/.

[18] ATLAS Collaboration. An update to the combined search for the Standard Model Higgs boson with the ATLAS detector at the LHC using up to 4.9 fb$^{-1}$ of pp collision data at $\sqrt{s} = 7$ TeV. Tech. rep. ATLAS-CONF-2012-019. Geneva: CERN, Mar. 2012.

[19] CMS Collaboration. ‘Combined results of searches for the standard model Higgs boson in pp collisions at $\sqrt{s} = 7$ TeV’. In: (Feb. 2012). CMS-HIG-11-032; CERN-PH-EP-2012-023. doi: 10.1016/j.physletb.2012.02.064.

[20] N. Jarosik et al. ‘Seven-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Sky Maps, Systematic Errors, and Basic Results’. In: Astrophys. J. Suppl. 192 (2011), p. 14. doi: 10.1088/0067-0049/192/2/14. eprint: 1001.4744.

[21] Abdelhak Djouadi. ‘The Anatomy of electro-weak symmetry breaking. II. The Higgs bosons in the minimal supersymmetric model’. In: Phys. Rept. 459 (2008), pp. 1–241. doi: 10.1016/j.physrep.2007.10.005. eprint: hep-ph/0503173.

[22] CMS Collaboration. ‘Search for Neutral Higgs Bosons Decaying to Tau Pairs in pp Collisions at $\sqrt{s} = 7$ TeV’. In: (2011). CMS-PAS-HIG-11-029. url: http://cdsweb.cern.ch/record/1406353.

[23] CMS Collaboration. ‘Search for Neutral Higgs Bosons Decaying into Tau Leptons in the Dimuon Channel with CMS in pp Collisions at 7 TeV’. In: (2012). CMS-PAS-HIG-12-007. url: http://cdsweb.cern.ch/record/1429929.


[24] Oliver Sim Brüning et al. LHC Design Report, Volume I: the LHC Main Ring. Geneva: CERN, 2004. url: http://cdsweb.cern.ch/record/782076.

[25] Oliver Sim Brüning et al. LHC Design Report, Volume II: the LHC Infrastructure and General Services. Geneva: CERN, 2004. url: http://cdsweb.cern.ch/record/815187.

[26] Michael Benedikt et al. LHC Design Report, Volume III: the LHC Injector Chain. Geneva: CERN, 2004. url: http://cdsweb.cern.ch/record/823808.

[27] The CERN accelerator complex. url: http://public.web.cern.ch/public/en/Research/AccelComplex-en.html.

[28] ATLAS Collaboration. ‘The ATLAS Experiment at the CERN Large Hadron Collider’. In: Journal of Instrumentation 3.08 (2008). doi: 10.1088/1748-0221/3/08/S08003.

[29] CMS Collaboration. ‘The CMS experiment at the CERN LHC’. In: Journal of Instrumentation 3.08 (2008). doi: 10.1088/1748-0221/3/08/S08004.

[30] ALICE Collaboration. ‘The ALICE experiment at the CERN LHC’. In: Journal of Instrumentation 3.08 (2008). doi: 10.1088/1748-0221/3/08/S08002.

[31] LHCb Collaboration. ‘The LHCb Detector at the LHC’. In: Journal of Instrumentation 3.08 (2008). doi: 10.1088/1748-0221/3/08/S08005.

[32] CMS Media. url: http://cmsinfo.web.cern.ch/cmsinfo/Media/.

[33] Albert de Roeck et al. CMS Physics Technical Design Report Volume I: Detector Performance and Software. Ed. by D Acosta. Technical Design Report CMS. Geneva: CERN, 2006. url: http://cdsweb.cern.ch/record/922757.

[34] Martin Grunwald et al. CMS physics Technical Design Report, Volume II: Physics Performance. Ed. by A de Roeck. Vol. 34. CERN-LHCC-2006-021. CMS-TDR-008-2. 2007, pp. 995–1579. doi: 10.1088/0954-3899/34/6/S01.

[35] Christoph Berger. Elementarteilchenphysik – von den Grundlagen zu den modernen Experimenten. 2., aktualisierte und überarb. Aufl. Springer-Lehrbuch. Berlin: Springer, 2006. isbn: 3-540-23143-9; 978-3-540-23143-1.

[36] Christoph Eck et al. LHC computing Grid: Technical Design Report. Version 1.06 (20 Jun 2005). Technical Design Report LCG. Geneva: CERN, 2005. url: http://cdsweb.cern.ch/record/840543.

[37] Worldwide LHC Computing Grid, Technical Site. url: http://lcg.web.cern.ch.

[38] G. L. Bayatyan et al. CMS computing: Technical Design Report. Technical Design Report CMS. Submitted on 31 May 2005. Geneva: CERN, 2005. url: http://cdsweb.cern.ch/record/838359.

[39] The CMS Offline WorkBook, Twiki page. url: https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBook.


[40] Armin Scheurer et al. ‘German Contributions to the CMS Computing Infra-structure’. In: Journal of Physics: Conference Series 219.6 (2010). ID:CHEP299.doi: 10.1088/1742-6596/219/6/062064.

[41] Torbjorn Sjostrand, Stephen Mrenna and Peter Z. Skands. ‘PYTHIA 6.4 Physicsand Manual’. In: Journal of High Energy Physics 5 (2006). doi: 10.1088/1126-6708/2006/05/026.

[42] B. Andersson et al. ‘Parton fragmentation and string dynamics’. In: PhysicsReports 97.2-3 (1983), pp. 31–145. issn: 0370-1573. doi: 10.1016/0370-

1573(83)90080-7.

[43] Zbigniew Was. ‘TAUOLA the library for tau lepton decay, and KKMC/KOR-ALB/KORALZ/... status report’. In: Nucl. Phys. Proc. Suppl. 98 (2001). doi:10.1016/S0920-5632(01)01200-2.

[44] Johan Alwall et al. ‘MadGraph 5 : Going Beyond’. In: Journal of High EnergyPhysics 6 (2011). doi: 10.1007/JHEP06(2011)128.

[45] S. Agostinelli et al. ‘Geant4 – a simulation toolkit’. In: Nuclear Instruments andMethods in Physics Research Section A: Accelerators, Spectrometers, Detectorsand Associated Equipment 506.3 (2003), pp. 250–303. issn: 0168-9002. doi:10.1016/S0168-9002(03)01368-8.

[46] CMS Collaboration. ‘Tau Identification in CMS’. In: (2011). CMS-PAS-TAU-11-001. url: http://cdsweb.cern.ch/record/1337004.

[47] Rene Brun and Fons Rademakers. ‘ROOT — An object oriented data analysisframework’. In: Nuclear Instruments and Methods in Physics Research SectionA: Accelerators, Spectrometers, Detectors and Associated Equipment 389.1–2(1997). New Computing Techniques in Physics Research V, pp. 81 –86. issn:0168-9002. doi: 10.1016/S0168-9002(97)00048-X.

[48] Gerhard Bohm and Günter Zech. Introduction to Statistics and Data Analysis for Physicists. Verlag Deutsches Elektronen-Synchrotron, 2010. isbn: 978-3-935702-41-6. doi: 10.3204/DESY-BOOK/statistics.

[49] Roger Barlow. Statistics: a guide to the use of statistical methods in the physical sciences. The Manchester physics series. Wiley, 1989. isbn: 0-471-92294-3; 0-471-92295-1.

[50] Glen Cowan. Statistical data analysis, with applications from particle physics. Repr. Oxford science publications. Oxford: Clarendon Press, 2002. isbn: 0-19-850156-0; 0-19-850155-2.

[51] Jerzy Neyman and Egon S. Pearson. ‘On the Problem of the Most Efficient Tests of Statistical Hypotheses’. In: Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 231.694-706 (1933), pp. 289–337. doi: 10.1098/rsta.1933.0009.

[52] Michael Feindt. ‘A Neural Bayesian Estimator for Conditional Probability Densities’. In: ArXiv Physics e-prints (Feb. 2004). eprint: arXiv:physics/0402093.

[53] Yann LeCun et al. ‘Efficient BackProp’. In: (1998). Ed. by G. Orr and K. Müller. url: http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf.

[54] Private communication with Gregory Schott.

[55] Toolkit for Multivariate Data Analysis (TMVA), Twiki page. url: https://twiki.cern.ch/twiki/bin/view/TMVA.

[56] NeuroBayes, Twiki page. url: https://twiki.cern.ch/twiki/bin/view/Main/NeuroBayes.

[57] CMS Collaboration. ‘Measurement of the inclusive Z → τ+τ− cross section in pp collisions at √s = 7 TeV’. In: (2010). CMS-PAS-EWK-10-013. url: http://cdsweb.cern.ch/record/1343465.

[58] Armin Burgmeier. ‘Data-Driven Estimation of Z0 Background Contributions to the Higgs Search in the H → τ+τ− Channel with the CMS Experiment at the LHC’. In: (2011). IEKP-KA/2011-21. url: http://www-ekp.physik.uni-karlsruhe.de/pub/web/thesis/iekp-ka2011-21.pdf.

[59] CMS Collaboration. ‘Search for Neutral Higgs Boson Production and Decay to Tau Pairs’. In: (2011). CMS-PAS-HIG-10-002. url: http://cdsweb.cern.ch/record/1335388.

[60] John Conway et al. ‘Search for MSSM neutral Higgs → τ+τ− Production using the TaNC Tau id. algorithm’. CMS AN-2010/460.

[61] Agni Bethani et al. ‘Search for Neutral Higgs Bosons Decaying into Tau Leptons in the Dimuon Channel with CMS in pp Collisions at 7 TeV’. CMS-AN-12-018.

[62] CMS Collaboration. ‘Measurement of CMS Luminosity’. In: (2010). CMS-PAS-EWK-10-004. url: http://cdsweb.cern.ch/record/1279145.

[63] Glen Cowan et al. ‘Asymptotic formulae for likelihood-based tests of new physics’. In: The European Physical Journal C - Particles and Fields 71.2 (2011), pp. 1–19. doi: 10.1140/epjc/s10052-011-1554-0.

[64] K. Nakamura et al. ‘Review of particle physics’. In: J. Phys. G37 (2010). url: http://pdg.lbl.gov/2011/reviews/rpp2011-rev-statistics.pdf.

[65] Private communication with Michael Feindt.

[66] Timo Doll. ‘Vergleich verschiedener multi-variater Verfahren zur Bestimmung von Ausschlussgrenzen auf den Produktionsquerschnitt des Higgs-Bosons im Kanal H → ττ → 2µ4ν’. In: (2011). IEKP-KA/2011-31. url: http://www-ekp.physik.uni-karlsruhe.de/pub/web/thesis/iekp-ka2011-31.pdf.

[67] Private communication with Agni Bethani.

[68] CMS Collaboration. ‘Search for Neutral Higgs Bosons Decaying to Tau Pairs in pp collisions at √s = 7 TeV’. In: (2011). CMS-PAS-HIG-11-009. url: http://cdsweb.cern.ch/record/1369552.

Acknowledgments

My special thanks go to Prof. Dr. Günter Quast for his excellent supervision over the past year and for welcoming me into his working group. In particular, I will fondly remember his commitment to the group and his constant overview of all the activities of its members. I would also like to thank Prof. Dr. Wim de Boer for kindly agreeing to act as second reviewer.

I owe thanks to Dr. Manuel Zeise and Dr. Gregory Schott for their supervision during the preparation of this thesis and for proofreading it. They always had good advice for me on questions of any kind about its content. I also thank Daniel Martschei and Prof. Dr. Michael Feindt for their help in working with NeuroBayes. For the fruitful collaboration with the group at DESY, I thank in particular Dr. Alexei Raspereza, Armin Burgmeier and Agni Bethani.

I would likewise like to thank the working group for the good working atmosphere, the frequent help and stimulating discussions, and for proofreading my thesis. Specifically, these are Dr. Oliver Oberst, Fred-Markus Stober, Joram Berger, Thomas Hauth, Georg Sieber, Raphael Friese and Dominik Haitz, as well as the former colleagues Dr. Andreas Oehler, Dr. Michael Heinrich, Dr. Christoph Hackstein, Dr. Armin Scheurer, Timo Doll, Stephan Riedel and David Kernert.

Finally, I would like to thank my parents for their support in so many ways throughout my studies.

I hereby declare that I have written this thesis independently and have used only the aids indicated.

Thomas Müller
Karlsruhe, 11 April 2012