Concepts of molecular analysis I: Big Data and Knowledge Mining

SAMO Interdisciplinary Workshop on Molecular Analysis in Clinical Practice
Hotel Hermitage, Lucerne, October 21, 2016

Peter J. Wild
Systems Pathology, Institute of Pathology and Molecular Pathology

FAZ, 15.6.2016: 2 terabytes per patient. «Big Data can help save lives. But only if the flood of information is managed.»




Disclosures

• Participation in advisory boards or speakers bureau (compensated): Thermo Fisher Scientific, Roche Diagnostics, Sophia Genetics SA, Myriad Genetics, Ventana Medical Systems, Life Technologies, Astellas Pharma AG, Merck AG, Sanofi-Aventis, Janssen-Cilag AG, Astra Zeneca.

• Research Support: Gilead, Astra Zeneca, Ventana Medical Systems


Big Data = Automation of Experience

The term “Big Data” is on everyone’s lips, but not everyone understands the same thing by it.

Donald Kossmann: “My favourite definition of big data is the ‘automation of experience.’ Essentially, this means that you learn from the past with an eye on the future and avoid making the same mistake twice.”

Globe magazine, ETHZ, June 2014

Joachim Buhmann: “Machine learning algorithms search data sets for patterns and characteristic structures. Typical tasks are the classification of data, ...”


Outline
1. Big Data in Molecular Analysis
2. Examples for Knowledgebase Mining
3. ZurichCancerMaps


KRAS Mutation

Example: c.34G>T (G12C), Exon 2

            Normal (wildtype)      Missense mutation KRAS c.34G>T (p.G12C)
DNA         ... GCT GGT GGC ...    ... GCT TGT GGC ...
RNA
Protein     ... A - G - G ...      ... A - C - G ...
            Glycine (G)            Cysteine (C)
Function    Normal                 Activation
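The step from the DNA change to the protein change can be traced in a few lines of code. The sketch below is illustrative only: the helper names are hypothetical, the codon table covers just the four codons shown on the slide, and the three wildtype codons are assumed to be codons 11 to 13 (which follows from c.34 being the first base of codon 12).

```python
# Codon table restricted to the codons that appear on the slide
# (an illustrative subset, not the full genetic code).
CODON_TABLE = {"GCT": "A", "GGT": "G", "GGC": "G", "TGT": "C"}


def apply_substitution(codons, first_codon_number, cdna_pos, ref, alt):
    """Apply a single-base substitution given as a 1-based cDNA position."""
    codon_number = (cdna_pos - 1) // 3 + 1   # c.34 falls in codon 12
    offset = (cdna_pos - 1) % 3              # c.34 is the first base of that codon
    idx = codon_number - first_codon_number
    codon = codons[idx]
    if codon[offset] != ref:
        raise ValueError(f"expected {ref} at c.{cdna_pos}, found {codon[offset]}")
    mutated = codon[:offset] + alt + codon[offset + 1:]
    return codon_number, codons[:idx] + [mutated] + codons[idx + 1:]


def translate(codons):
    """Translate a list of codons into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons)


wildtype = ["GCT", "GGT", "GGC"]  # assumed to be codons 11-13, as on the slide
codon_number, mutant = apply_substitution(wildtype, 11, 34, "G", "T")

wt_aa = translate(wildtype)[codon_number - 11]
mut_aa = translate(mutant)[codon_number - 11]
protein_change = f"p.{wt_aa}{codon_number}{mut_aa}"
print(protein_change)  # p.G12C
```

The position arithmetic is the only non-obvious part: cDNA positions 34 to 36 make up codon 12, so a G>T change at position 34 turns GGT (glycine) into TGT (cysteine).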


Many Alterations Already Aligned to Therapies

Evidence              Indication                       Alteration            Drug(s)
FDA Approved Labels   Breast Cancer                    ERBB2 amplification   pertuzumab, trastuzumab
                      Colorectal Cancer                KRAS mutation         cetuximab, panitumumab (contraindicated)
                      Gastric Cancer                   ERBB2 amplification   trastuzumab
                      Melanoma                         BRAF mutation         dabrafenib, trametinib, vemurafenib
                      Non-Small Cell Lung Cancer       ALK fusion            ceritinib, crizotinib
                      Non-Small Cell Lung Cancer       EGFR mutation         afatinib, erlotinib
NCCN Guidelines       Gastrointestinal Stromal Tumor   PDGFRA mutation       dasatinib
                      Colorectal Cancer                NRAS mutation         cetuximab, panitumumab (contraindicated)
                      Melanoma                         KIT mutation          imatinib mesylate
                      Non-Small Cell Lung Cancer       BRAF mutation         dabrafenib, vemurafenib
                      Non-Small Cell Lung Cancer       ERBB2 mutation        afatinib
                      Non-Small Cell Lung Cancer       MET amplification     crizotinib
                      Non-Small Cell Lung Cancer       RET fusion            cabozantinib
                      Non-Small Cell Lung Cancer       ROS1 fusion           crizotinib

12 different alterations aligned to 14 different approved therapies
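The alignment on this slide is essentially a lookup table keyed by indication and alteration. As a rough sketch (the `Entry` type and `lookup` helper are hypothetical names, not part of any vendor tool), the rows can be encoded directly and the "12 alterations, 14 therapies" tally recomputed from them:

```python
from typing import NamedTuple


class Entry(NamedTuple):
    evidence: str              # "FDA" label or "NCCN" guideline
    indication: str
    alteration: str
    drugs: tuple
    contraindicated: bool = False


NSCLC = "Non-Small Cell Lung Cancer"

# Rows transcribed from the slide.
TABLE = [
    Entry("FDA", "Breast Cancer", "ERBB2 amplification", ("pertuzumab", "trastuzumab")),
    Entry("FDA", "Colorectal Cancer", "KRAS mutation", ("cetuximab", "panitumumab"), True),
    Entry("FDA", "Gastric Cancer", "ERBB2 amplification", ("trastuzumab",)),
    Entry("FDA", "Melanoma", "BRAF mutation", ("dabrafenib", "trametinib", "vemurafenib")),
    Entry("FDA", NSCLC, "ALK fusion", ("ceritinib", "crizotinib")),
    Entry("FDA", NSCLC, "EGFR mutation", ("afatinib", "erlotinib")),
    Entry("NCCN", "Gastrointestinal Stromal Tumor", "PDGFRA mutation", ("dasatinib",)),
    Entry("NCCN", "Colorectal Cancer", "NRAS mutation", ("cetuximab", "panitumumab"), True),
    Entry("NCCN", "Melanoma", "KIT mutation", ("imatinib mesylate",)),
    Entry("NCCN", NSCLC, "BRAF mutation", ("dabrafenib", "vemurafenib")),
    Entry("NCCN", NSCLC, "ERBB2 mutation", ("afatinib",)),
    Entry("NCCN", NSCLC, "MET amplification", ("crizotinib",)),
    Entry("NCCN", NSCLC, "RET fusion", ("cabozantinib",)),
    Entry("NCCN", NSCLC, "ROS1 fusion", ("crizotinib",)),
]


def lookup(indication, alteration):
    """Return all table rows matching an indication and an alteration."""
    return [e for e in TABLE
            if e.indication == indication and e.alteration == alteration]


alterations = {e.alteration for e in TABLE}
drugs = {d for e in TABLE for d in e.drugs}
print(len(alterations), len(drugs))  # 12 14

for hit in lookup("Colorectal Cancer", "KRAS mutation"):
    status = "contraindicated" if hit.contraindicated else "indicated"
    print(hit.evidence, ", ".join(hit.drugs), status)
```

Counting distinct alteration strings and distinct drug names reproduces the figures quoted on the slide, with the contraindicated antibodies included in the therapy count.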


More Therapies are Under Investigation...

Alteration                                Indication               Investigational drug(s)
AKT1 mutation                             Multiple                 MK-2206, MSC-2363318A
CCND1 amplification                       Multiple                 palbociclib
CDK4 amplification, mutation              Melanoma, NSCLC          palbociclib
CDK6 amplification                        NSCLC                    palbociclib
DDR2 mutation                             Multiple                 crizotinib + dasatinib
KRAS mutation                             Multiple                 various MEKi combinations
ERBB3 mutation                            Multiple                 neratinib
FGFR1-4 mutation, amplification, fusion   Multiple                 BGJ-398, JNJ-42756493
GNA11 mutation                            Melanoma                 vorinostat
GNAQ mutation                             Melanoma                 vorinostat
HRAS mutation                             Multiple                 binimetinib + panitumumab, BVD-523
IDH1 mutation                             Multiple                 AG-120
KIT amplification                         Melanoma                 dasatinib
NRAS mutation                             Multiple                 various MEKi combinations
MET mutation                              Multiple                 AMG-337, crizotinib, INCB-028060
MTOR mutation                             Multiple                 MSC-2363318A
MYCN amplification                        Multiple                 GSK-525762
PDGFRA amplification                      Glioblastoma             nilotinib, sorafenib
PIK3CA mutation                           Multiple                 various PI3K pathway combinations
PPARG fusion                              Thyroid Cancer           pioglitazone
PTCH1 mutation                            Multiple                 vismodegib
RET mutation                              NSCLC, Thyroid Cancer    ponatinib, sunitinib
SMO mutation                              Multiple                 vismodegib
STK11 mutation                            Multiple                 MSC-2363318A


Today’s Challenges

• Growing number of oncology biomarkers with clinical utility

• Multiple, global sources of information are not standardized

• Variants identified via NGS need to be quickly and accurately associated with actionable information

Molecular Biomarkers in Oncology

Current methods are time-intensive and require extensive research of multiple sources to map actionable information to variants

OncoPortal (Sophia Genetics)
Oncomine Knowledge Base (Thermo Fisher)


Science 2016


Outline
1. Big Data in Molecular Analysis
2. Examples for Knowledgebase Mining
3. ZurichCancerMaps


The Oncomine Knowledgebase

The world’s largest curated cancer genomic database, gathered from public sources, peer-reviewed literature, and published clinical trials

Hovelson et al., Neoplasia 2015


Solid Tumor Variant Map

Hovelson et al., Neoplasia 2015

Data analysis performed using the Oncomine Knowledgebase


Oncomine Focus Assay – Gene List

Detects variants in 52 solid tumor genes that are associated with current oncology drugs and backed by published evidence


Oncomine Knowledgebase Reporter (OKR)

Diagram elements:
• Research Laboratory
• Ion Reporter™ Workflow*
• Oncomine™ Knowledgebase Reporter*
• Lab Generated Report*

Genetic variants
Associated published evidence

Sources:
• US FDA labels
• US NCCN Guidelines
• EMA labels
• ESMO Guidelines
• Global clinical trials

23 cancer types
69 countries with enrolling trials
4 sources for labels and guidelines

For Research Use Only. Not for use in diagnostic procedures.


Generating a Custom Report



The Report: 1 Variant Summary

Variant summary: shows all gene variants with associated information in the report and the cancer type information by source as well as global clinical trial status.



The Report: 2 Relevant Therapy Summary

For each gene variant, published therapies from each source are given an evidence label:

• In this cancer type
• In other cancer type
• In this cancer type and other cancer type
• Contraindicated
• Both for use and contraindicated
• No evidence

Global clinical trials are also labeled.
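One way to picture how a therapy could end up with one of these six labels is a small decision function. This is a sketch under assumptions: the slide lists only the label names, so the selection logic, the function name `evidence_label` and its parameters are illustrative and do not describe the vendor's implementation.

```python
def evidence_label(sample_cancer_type, indicated_in, contraindicated_in=frozenset()):
    """Pick one of the six labels named on the slide.

    indicated_in / contraindicated_in: cancer types for which one source
    lists the therapy as indicated or contraindicated for this variant.
    The precedence of the checks below is an assumption.
    """
    indicated_here = sample_cancer_type in indicated_in
    indicated_elsewhere = bool(set(indicated_in) - {sample_cancer_type})
    contraindicated_here = sample_cancer_type in contraindicated_in

    if contraindicated_here and indicated_here:
        return "Both for use and contraindicated"
    if contraindicated_here:
        return "Contraindicated"
    if indicated_here and indicated_elsewhere:
        return "In this cancer type and other cancer type"
    if indicated_here:
        return "In this cancer type"
    if indicated_elsewhere:
        return "In other cancer type"
    return "No evidence"


# BRAF mutation + vemurafenib: the FDA label rows earlier in the talk list melanoma.
print(evidence_label("Melanoma", {"Melanoma"}))
print(evidence_label("Non-Small Cell Lung Cancer", {"Melanoma"}))
# KRAS mutation + cetuximab in colorectal cancer is listed as contraindicated.
print(evidence_label("Colorectal Cancer", set(), {"Colorectal Cancer"}))
```

The three calls return "In this cancer type", "In other cancer type" and "Contraindicated" respectively, using only indications that appear in the FDA rows shown earlier.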



The Report: 3 Current Source Information

For each gene variant, a summary of each therapy is given with:

• Cancer type
• Label date
• Class
• Indication and usage summary for FDA labels
• Reference



The Report: 4 Global Clinical Trial Information

For each gene variant, a summary of open global clinical trials with:

• Summary: Trial identifier, Trial title
• Cancer type
• Class
• Other identifiers
• Population segments
• Phase
• Published therapies
• Countries
• US States
• Contact information
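The trial fields listed above map naturally onto a record type. The following is a minimal sketch: the class name, field names, types and the example values are all hypothetical placeholders chosen to mirror the slide, not the report's actual schema or a real trial.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ClinicalTrialRecord:
    """One open trial as summarized in section 4 of the report (field names assumed)."""
    trial_identifier: str
    trial_title: str
    cancer_type: str
    drug_class: str                                   # "Class" on the slide
    phase: str
    other_identifiers: list = field(default_factory=list)
    population_segments: list = field(default_factory=list)
    published_therapies: list = field(default_factory=list)
    countries: list = field(default_factory=list)
    us_states: list = field(default_factory=list)
    contact_information: str = ""


def trials_enrolling_in(trials, country):
    """Filter trials by enrolling country."""
    return [t for t in trials if country in t.countries]


# Placeholder record; identifier and title are made up for illustration.
example = ClinicalTrialRecord(
    trial_identifier="EXAMPLE-0001",
    trial_title="Placeholder trial title",
    cancer_type="Non-Small Cell Lung Cancer",
    drug_class="EGFR tyrosine kinase inhibitor",
    phase="II",
    countries=["Switzerland"],
)

print([f.name for f in fields(ClinicalTrialRecord)])
print(len(trials_enrolling_in([example], "Switzerland")))
```

Keeping the country list on each record makes the "countries with enrolling trials" view a simple filter rather than a separate data source.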



Example: EGFR del exon 19

53 y/o male pt. with lung adenocarcinoma

• 6/2015: Biopsy of the right pleura due to recurrent effusions with advanced adenocarcinoma of lung

• Sanger sequencing: EGFR deletion in exon 19 (p.E746_A750del)


Therapy with Afatinib

Whole body PET scan:

• Initial diagnosis: 07/2015
• Partial response (PR) 3 months later: 10/2015
• Oligo-progressive disease: 6/2016


Mechanisms of drug resistance to EGFR tyrosine kinase inhibitors in EGFR-mutant NSCLC

EGFR p.T790M (50%)

Sharma et al., Nature Rev. Cancer 2007
Cortot, Jänne. ERR 2014


Case: Liquid Biopsy from cfDNA

Resistance mechanism?

Osimertinib in EGFR-TKI-resistant, EGFR p.T790M-positive NSCLC

Jänne et al, NEJM 2015

53 y/o male pt. with lung adenocarcinoma


Outline
1. Big Data in Molecular Analysis
2. Examples for Knowledgebase Mining
3. ZurichCancerMaps


The concept of ZurichCancerMaps, or how do we get from Big Data to Precision Medicine (PM)?

Combine big medical data & cancer genomics data to model patients, predict outcomes, optimize treatment and design clinical trials.

Gunnar Rätsch

etc.


The problem

• To date, large amounts of molecular, image and clinical data are stored in unstructured form and are not accessible.

• Measuring the quantitative molecular make-up of a particular specimen to gain clinically important insights is a central component of Precision Medicine.

• Clinical specimens are unique, finite and cannot be reproduced.


ZurichCancerMaps

Definition

Generation of a Digital Biobank of clinical specimens where the genomic and expressed transcriptomic, proteomic and metabolomic information is recorded in searchable digital files that are stored in a database, along with clinical metadata
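To make the definition concrete, a digital biobank can be pictured as a specimen table with clinical metadata, linked to one file entry per omics layer. The sketch below uses an in-memory SQLite database; the table names, column names, specimen identifier and file paths are all hypothetical and are not the ZurichCancerMaps schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen (
    specimen_id   TEXT PRIMARY KEY,
    diagnosis     TEXT,          -- clinical metadata
    specimen_type TEXT
);
CREATE TABLE omics_file (
    file_id     INTEGER PRIMARY KEY,
    specimen_id TEXT NOT NULL REFERENCES specimen(specimen_id),
    layer       TEXT NOT NULL CHECK (layer IN
                  ('genomic', 'transcriptomic', 'proteomic', 'metabolomic')),
    path        TEXT NOT NULL    -- location of the searchable digital file
);
""")

# Illustrative specimen with one file per recorded layer.
conn.execute("INSERT INTO specimen VALUES (?, ?, ?)",
             ("S-001", "lung adenocarcinoma", "tissue"))
conn.executemany(
    "INSERT INTO omics_file (specimen_id, layer, path) VALUES (?, ?, ?)",
    [("S-001", "genomic", "S-001/variants.example"),
     ("S-001", "proteomic", "S-001/proteome.example")],
)


def find_files(diagnosis, layer):
    """Return (specimen_id, path) pairs for a diagnosis and an omics layer."""
    return conn.execute(
        """SELECT s.specimen_id, f.path
           FROM specimen s JOIN omics_file f ON f.specimen_id = s.specimen_id
           WHERE s.diagnosis = ? AND f.layer = ?
           ORDER BY f.path""",
        (diagnosis, layer),
    ).fetchall()


print(find_files("lung adenocarcinoma", "proteomic"))
```

Because the physical specimen is finite, the point of such a structure is that later questions are answered by querying the stored files rather than by consuming more tissue.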


Data Warehouse

Connected systems:
• Clinical information system (KISIM)
• LIMS (Molis*, Patho-/Dermapro****)
• Study system (Secutrial)
• PACS (Impax; radiology images)
• General images (Synedra / Histo-DB)
• Cancer registry (not only USZ; survival data)

Biobanks:
• Liquid* (blood, urine, saliva, cerebrospinal fluid, effusions)
• Tissue & Cell** (tissue and cells cultured from tumors)
• Reproduction*** (stem cells, oocytes, sperm)

Sample, Processing, Sequencing, Measurements

Raw files
Processing
Input files (including VCF)

Research Data Service Center (Oracle TRC)

Provision of anonymized and aggregated information

Separate, sealed-off zone for external parties

* Molis: for liquid diagnostics and possibly a liquid biobank
** SLIMS: possible solution for the tissue & cell biobank
*** RURO: already in use for sample storage
**** Patho-/Dermapro: for tissue diagnostics (no biobank functionality)


Data Visualization cBioPortal (The Hyve, Cambridge)

EGFR p.T790M


2008: “Computational Pathology”

Detects cancer cell nuclei of renal cell carcinoma and predicts immunohistochemical staining of Ki-67 on TMAs

Fuchs, Wild, et al., MICCAI 2008
Raman et al., BMC Bioinformatics 2010
Schüffler et al., J Pathol Inform 2013
Rupp et al., J Pathol Inform 2016
Zhong et al., Sci Rep 2016
Zerhouni et al., Proc. of SPIE 2016


Total no. of digital slide scans since 2007 at USZ

[Figure: scans per month (y-axis, 500 to 2000) from 2007.12 to 2016.05 (x-axis).]


2011-2013: TMARKER software

P.J. Schüffler, et al. J Pathol Inform 2013

http://www.nexus.ethz.ch/equipment_tools/software/tmarker.html

• Generic
• Integrative
• Open-source


PrECISE Project

Team
• Maria Rodriguez Martinez (IBM)
• Heinz Köppl (TU Darmstadt)
• Pavel Sumazin (Baylor College)
• Zsolt Torok (Astrid Bio Technologies Kft.)
• Julio Saez-Rodriguez (EBI)
• Rudolf Aebersold (ETHZ)
• Laurence Calzone (Institut Curie)
• Walter Koch (Technikon)
• Peter Wild (UZH/USZ)


Zurich - Basel Alliance


Impact

• The project will democratize PM research because it will support a multitude of research projects with unique data resources and enable in silico research (e.g. search for drug resistance patterns across different cancer cohorts)

• The project will be a data and knowledge hub for many research projects expected to be funded through national and international PM programs

• The project is presently unique and will strengthen the standing of Swiss science in the field


Identifying Personal Genomes by Surname Inference

Gymrek et al., Science 2013

Surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases.
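The re-identification risk described here comes down to matching a marker profile against a reference table. The toy sketch below only illustrates that idea and is not the procedure of Gymrek et al.: the function names, the repeat counts and the "Surname A/B" entries are invented placeholders, and exact matching on all shared markers is an assumed simplification.

```python
def count_matches(profile, reference):
    """Count markers with identical repeat numbers among the shared markers."""
    shared = profile.keys() & reference.keys()
    matches = sum(1 for marker in shared if profile[marker] == reference[marker])
    return matches, len(shared)


def infer_surname(profile, database, min_fraction=1.0):
    """Return the surname whose reference profile best matches, or None."""
    best_surname, best_matches = None, 0
    for surname, reference in database.items():
        matches, shared = count_matches(profile, reference)
        if shared and matches / shared >= min_fraction and matches > best_matches:
            best_surname, best_matches = surname, matches
    return best_surname


# Invented toy database: Y-STR marker name -> repeat count.
GENEALOGY_DB = {
    "Surname A": {"DYS19": 14, "DYS390": 24, "DYS391": 11},
    "Surname B": {"DYS19": 15, "DYS390": 23, "DYS391": 10},
}

query = {"DYS19": 14, "DYS390": 24, "DYS391": 11}
guess = infer_surname(query, GENEALOGY_DB)
print(guess)
```

Even this crude matcher shows why releasing a "de-identified" genome is not the same as releasing anonymous data once public genealogy records exist.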

Craig Venter


Acknowledgements

Ruedi Aebersold, Gunnar Rätsch, Bernd Wollscheid, Tiannan Guo, Yasuo Uchida, Alex Ebhardt, Joachim Buhmann, Niko Beerenwinkel, Christian Beisel, Manfred Claassen, Christian Stirnimann, Wilhelm Krek

Qing Zhong, Markus Rechsteiner, Christine Fritz, Nadejda Valtcheva, Ulrich Wagner, Vanessa Frey, Annette Bohnert, Nadezhda Velizheva, Kathrin Oehl, Elisa Bellini, Malamati Koletou, Niels Rupp, Jan Rüschoff, Lorenz Buser, Simone Brandt, Dario Vischi, Livia Baldini, Christian Fankhauser, Ailsa Christiansen, Nathan Eschbacher, Laura De Vargas Roditi, Nora Toussaint

Silke Gillessen, Markus Joerger, Wolfram Jochum, Aurelius Omlin, Arnoud Templeton, Christian Rothermund

Bernd Bodenmiller, Ian Frew, Andrea Jacobs, Lukas Pelkmans, Markus Hermann, Christian von Mering

Thomas Fuchs, Peter J. Schüffler

Holger Moch, Peter Schraml, Norbert Wey, Andre Wethmar, Monika Bieri, Aurelia Noske, Andre Fitsche, Chrissie Mittmann, Simone Brandt, Verena Tischler, Susanne Dettwiler, Martina Storz

Markus Manz, Stefan Balabanov

Matthias Guckenberger

Tullio Sulser

Roger Stupp, Alessandra Curioni, Thomas Winder, Christian Britschgi

Martin Matter, Emmanuel Eschmann

Andreas Wicki, Luigi Terracciano


Systems Pathology Order Sheet


Oncomine Focus Assay & Oncomine Knowledgebase Reporter Workflow

• Low sample input: FFPE material including fine needle aspirates, needle biopsy (10 ng)
• Prepare library: Oncomine Focus Assay
• Prepare template: Ion OneTouch 2 System, Ion OneTouch™ Select Template 200 Kit
• Run sequence: Ion PGM System, Ion Select™ 318 Chip
• Analyze data: Ion Reporter Software, Oncomine Knowledgebase Reporter


EGFR p.T790M in liquid biopsies

[Figure: y-axis from 0.00 to 14.00; x-axis from 1 to 22.]

53 y/o male pt. with lung adenocarcinoma