Data Analysis in Experimental Particle Physics

C. Javier Solano S.
Grupo de Física Fundamental
Facultad de Ciencias – Universidad Nacional de Ingeniería


Page 1: Data Analysis in Experimental Particle Physics

Data Analysis in Experimental Particle Physics

C. Javier Solano S.
Grupo de Física Fundamental
Facultad de Ciencias – Universidad Nacional de Ingeniería

Page 2: Data Analysis in Experimental Particle Physics

Data Analysis in Particle Physics: Outline of Lecture

• Characteristics of data from particle experiments
• From DAQ data to Event Records: Event Building
• From hits to tracks and clusters
• From tracks and clusters to “particles”: Correlating sub-detector information
• Uncertainties and resolution
• Data reconstruction and “production”: Data Summary “Tapes”
• Personal data analysis: n-tuples

Page 3: Data Analysis in Experimental Particle Physics

Data Analysis in Particle Physics: Outline of Lecture (cont.)

• Monte Carlo simulation
• Statistics and error analysis
• Hypothesis testing
• Simulation of particle production and interactions with the detector
• Digital representations of event data
• Monitoring and Calibration
• Why physicists don’t (yet) use Excel and Oracle for their daily analysis
• The challenge of analysis for the LHC experiments
• The challenge of computing for the LHC
• Solving the LHC computing challenge

Page 4: Data Analysis in Experimental Particle Physics

Characteristics of data from particle experiments

Page 5: Data Analysis in Experimental Particle Physics

Characteristics of data from particle experiments

• Most data comes from digitized information from sensors activated by particles crossing them.
• We call the data resulting from the observation of a particle collision an event.
• During hours, days, weeks, months, years or even decades, we observe many events. We group them according to the time-varying experimental conditions into runs.
• Calibration and environmental information is also stored, usually in a periodic fashion.
• For practical reasons, this data is stored in data files of many events.
• Almost always, events are independent from each other.

Page 6: Data Analysis in Experimental Particle Physics

Characteristics of data from particle experiments

[Figure: The Experimental Particle Physics Data Worm. Labels: Calibration records; Run 137, Run 138, Run 139, Run 140; Data file 418, Data file 419; Event number 31896.]
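The run / data file / event organisation in the Data Worm can be sketched in a few lines of Python. This is only an illustration: the class names, the `first_run` validity convention for calibration records, the gain values, and the assignment of event 31896 to run 138 and data file 418 are all assumptions, not taken from any experiment's software.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    number: int      # event number, e.g. 31896 as in the figure
    run: int         # run the event belongs to (assumed here)
    data_file: int   # data file in which the event is stored (assumed here)


@dataclass
class CalibrationRecord:
    first_run: int   # hypothetical convention: valid from this run onwards
    constants: dict = field(default_factory=dict)


def calibration_for(event, records):
    """Return the most recent record whose validity starts at or before the event's run."""
    valid = [r for r in records if r.first_run <= event.run]
    return max(valid, key=lambda r: r.first_run) if valid else None


# Hypothetical calibration records stored periodically along the worm.
records = [
    CalibrationRecord(first_run=137, constants={"gain": 1.00}),
    CalibrationRecord(first_run=139, constants={"gain": 1.02}),
]
event = Event(number=31896, run=138, data_file=418)
print(calibration_for(event, records).constants)
```

Because events are almost always independent of each other, a lookup like this is the only coupling an event needs to the rest of the stream.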

Page 7: Data Analysis in Experimental Particle Physics

From DAQ data to Event Records: “Event Building”

Page 8: Data Analysis in Experimental Particle Physics

From hits to tracks and clusters

Page 9: Data Analysis in Experimental Particle Physics

From hits to tracks and clusters

Occupancy and point resolution are related to ambiguities in track finding

Page 10: Data Analysis in Experimental Particle Physics

From hits to tracks and clusters

Calibration, monitoring and software are needed to resolve these ambiguities

Page 11: Data Analysis in Experimental Particle Physics

From hits to tracks and clusters

What you see is not always what there was!

[Figure label: Nuclear interaction]

Page 12: Data Analysis in Experimental Particle Physics

Monitoring and Calibration

• Particles deposit energy in sensors
• Sensors give Voltages, Currents, Charges
• Space position of sensor is known
• On-detector Analog-to-Digital Converters change these into numbers representing these or other quantities (for example clock-ticks between V pulses)
• Calibration establishes the relationship between the ADC units and the physical units (eV, {x,y,z}, ns)
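As a minimal sketch of what "establishing the relationship between ADC units and physical units" can look like, the snippet below assumes the simplest possible model, a linear response with a pedestal and a gain per channel. The function name and all constants are hypothetical; real detectors often need non-linear or time-dependent corrections.

```python
def calibrate(raw_counts, pedestal, gain):
    """Convert raw digitized counts to a physical quantity (assumed linear model)."""
    return gain * (raw_counts - pedestal)


# Hypothetical calorimeter channel: pedestal in ADC counts, gain in MeV per count.
energy_mev = calibrate(raw_counts=612, pedestal=100, gain=0.25)

# Hypothetical timing channel: clock ticks between pulses, 2.5 ns per tick.
time_ns = calibrate(raw_counts=40, pedestal=0, gain=2.5)

print(energy_mev, time_ns)
```

Monitoring then amounts to checking that constants such as `pedestal` and `gain` stay stable over time and updating the calibration records when they drift.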

Page 13: Data Analysis in Experimental Particle Physics

From tracks and clusters to “particles”: Correlating sub-detector information

Page 14: Data Analysis in Experimental Particle Physics

Uncertainties and resolution

• Each measurement or hit has some uncertainty, due to alignment and the characteristics of the sensor.
• These uncertainties get propagated, often in a non-linear manner, to resolution functions for the physics quantities used in analysis.
• Resolution has various consequences:
    • Direct on measurements
    • Signal-Background confusion
    • Combinatorics
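A toy illustration of non-linear propagation, assuming a Gaussian uncertainty on a measured track curvature and a momentum proportional to 1/curvature (units chosen so that the proportionality constant is 1). The function name and the numbers are made up for the example; the point is only that a symmetric uncertainty on the measured quantity becomes an asymmetric resolution function on the derived one.

```python
import random
import statistics


def smeared_momenta(true_curvature, sigma, n, seed=1):
    """Sample curvatures with Gaussian smearing and convert each to a momentum."""
    rng = random.Random(seed)
    # Momentum is taken as 1 / curvature (arbitrary units, assumed for this sketch).
    return [1.0 / rng.gauss(true_curvature, sigma) for _ in range(n)]


momenta = smeared_momenta(true_curvature=1.0, sigma=0.1, n=50_000)
mean_p = statistics.mean(momenta)
median_p = statistics.median(momenta)

# The median stays close to the true value 1.0, while the mean is pulled upwards
# by the long tail towards high momentum.
print(f"mean = {mean_p:.4f}, median = {median_p:.4f}")
```

The shift of the mean is roughly the square of the relative curvature uncertainty, which is why the effect grows quickly for poorly measured, high-momentum tracks.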

Page 15: Data Analysis in Experimental Particle Physics

Data reconstruction and “production”: Data Summary “Tapes”

• Reconstruction turns hits + calibration + geometry into particle hypotheses
• Reconstruction is time consuming and must be done coherently ⇒ centrally organized production
• Output is one or more levels of so-called Data Summary Tapes (DST), which are used as input to Personal Analysis
• In practice, there is a lot of utility software to organize these data for easy analysis (bookkeeping)
• Programming of complicated event structures
    • Old: FORTRAN with home-made memory managers
    • Today: Object-Oriented design using C++ or Java

Page 16: Data Analysis in Experimental Particle Physics

Personal data analysis

• Most modern detectors can address multiple physics topics.
• Hundreds or thousands of professors and students distributed around the world.
• Modern experimental collaborations are an early example of virtual communities.
• Historical enablers for virtual communities:
    • Fellowship and exchange programmes
    • Telegraph, telex, telephone and telefax
    • National and International Laboratories
    • Reasonably priced airline tickets
    • Computer inter-networking, e-mail and ftp
    • The World Wide Web
    • Multi-media applications on the Internet

Page 17: Data Analysis in Experimental Particle Physics

Personal data analysis

• Today, physics analysis topics are increasingly tackled by virtual teams within these virtual communities.
• Must maintain coherency of data and algorithms within the virtual team.
• “Production” for a modern detector is very complex and consumes many resources.
• A DST contains all imagined reconstruction objects for all foreseen analyses, so DSTs are big.
• Handling a DST often requires installation of special software libraries and writing code in “reconstruction dialect”.

Page 18: Data Analysis in Experimental Particle Physics

Personal data analysis

• Solution: Each virtual team develops code to extract a common analysis dataset for a given topic, which is written and manipulated using a “lingua franca”: n-tuples and the Physics Analysis Workstation (PAW)/ROOT
• Physicist’s version of business data mining with Excel
• Iterative process (time-scale of weeks or months):
    1. Team agrees on complex algorithms to be coded in the extraction program.
    2. Algorithms coded and tested, extraction from DST.
    3. n-tuple file is rapidly distributed via computer network.
    4. n-tuple is analyzed using non-compiled platform-independent code (PAW/ROOT macros today, Java in future?) that is easily modified and shared by e-mail.
    5. Eventually limitations are reached, go back to step 1.
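To make the n-tuple idea concrete, here is a small sketch in plain Python rather than PAW or ROOT: an n-tuple is treated as a flat table with one row per event and a fixed set of columns, and the analysis step is a cut followed by a histogram. The column names, the values, and the helper functions `select` and `histogram` are all invented for the illustration.

```python
from collections import namedtuple

# Hypothetical n-tuple layout: one row per event, fixed columns.
Row = namedtuple("Row", ["run", "event", "n_tracks", "mass"])

ntuple = [
    Row(137, 31896, 4, 91.3),
    Row(137, 31897, 2, 3.1),
    Row(138, 12, 6, 90.8),
    Row(138, 13, 5, 88.9),
]


def select(rows, cut):
    """Keep only the rows that pass the cut (a function of one row)."""
    return [row for row in rows if cut(row)]


def histogram(values, low, high, n_bins):
    """Count values in n_bins equal-width bins over [low, high)."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:
            counts[min(int((v - low) / width), n_bins - 1)] += 1
    return counts


selected = select(ntuple, lambda row: row.n_tracks >= 4)
mass_hist = histogram([row.mass for row in selected], low=80.0, high=100.0, n_bins=4)
print(mass_hist)  # [0, 1, 2, 0]
```

The cut is an ordinary function that can be edited and re-run in seconds, which is the property that makes the personal n-tuple cycle fast compared with re-running the extraction from the DST.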

Page 19: Data Analysis in Experimental Particle Physics

Personal data analysis

• PAW was the “killer application” for physics in the 90s
    • Interactive, just as powerful workstations became available
    • Platform independent, in a very diverse workstation world
    • Graphical, just as X-windows gave graphics over network
    • Simple to write analysis macros, just as the complexity of FORTRAN programming required in experiments decoupled most of the collaborators from the experiment’s code.
    • In summary, PAW was like going from DOS to Macintosh.
• One major limitation of PAW is the lack of variable length structures or more generally data objects.
• ROOT overcomes these limitations keeping a similar philosophy as PAW.
• Java Analysis Studio tries to go further with “agents”.

Page 20: Data Analysis in Experimental Particle Physics

Personal data analysis

• Which will be the “killer application” for LHC analysis?
• Is a Mac Classic on Appletalk enough or do we need the conceptual leap equivalent of Web + Java-enabled browser?
• Will the personal n-tuple model work for LHC?
• Do we need and can we afford to support our own interactive data analysis tool?
• Will one of the newer tools, such as Java Analysis Studio, go exponential in the open source world?
• Many questions, one simple answer: It will be young people like you who will make the next step happen.

Page 21: Data Analysis in Experimental Particle Physics

Monte Carlo simulation

• Monte Carlo simulation uses random numbers (→ mathematics textbooks)
• Try the following:
    • Find a source of random numbers in the interval [0,1] (calculator, Excel, etc.)
    • Take a function that you want to simulate (e.g. y = x²) and normalize it to fit in the interval [0,1] for both x and y.
    • Find graph paper to histogram values of x
    • Repeat this at least 20 times:
        • Throw two random numbers. Use the first as the value for x
        • Evaluate the function y and compare its value to the 2nd random number
        • If the 2nd random number is less than the function value, add a count to the histogram in the correct bin for x
        • If the 2nd random number is more than the function value, forget it
    • Compare your histogram to the shape of the function
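For readers who prefer code to graph paper, the recipe can be sketched as follows. This is one possible implementation of the accept-reject procedure, with a hypothetical function name, a fixed seed for reproducibility, and 10 bins chosen arbitrarily. An entry is accepted when the second random number falls below the function value, so the histogram of accepted x values follows the shape of the function.

```python
import random


def accept_reject_histogram(f, n_trials, n_bins=10, seed=12345):
    """Histogram x values accepted by the accept-reject method on [0, 1] x [0, 1]."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_trials):
        x = rng.random()   # first random number: candidate x
        r = rng.random()   # second random number: compared with f(x)
        if r < f(x):       # accept: r lies under the curve
            counts[min(int(x * n_bins), n_bins - 1)] += 1
    return counts


n_trials = 100_000
counts = accept_reject_histogram(lambda x: x * x, n_trials)
efficiency = sum(counts) / n_trials

# For y = x^2 on [0, 1] the area under the curve is 1/3, so about one third
# of the trials should be accepted.
print(counts)
print(f"efficiency = {efficiency:.3f}")
```

With only 20 or 100 trials the bin contents fluctuate strongly; increasing `n_trials` shows the histogram converging to the parabola.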

Page 22: Data Analysis in Experimental Particle Physics

Monte Carlo simulation

• If you don’t know how to program, you can pick up an Excel file from http://cern.ch/Manuel.Delfino/Brazil
• Here is the result for 100 trials:
    • Note there are 30 entries so the “efficiency” is 30%
    • Note the statistical fluctuations
• Homework: How is the normalization done?

[Figure: Example of Monte Carlo simulation of y=x*x. Histogram with x axis from 0 to 0.9 and y axis from 0 to 10; legend: Y, SAMPLE.]

Page 23: Data Analysis in Experimental Particle Physics

Statistics and error analysis

• Analysis involves selecting, counting and normalizing.
• Things are easier when you actually have a signal.
    • Understand underlying statistics: Poisson, Binomial, Multinomial, etc.
    • If measuring a differential distribution, understand relation between normalization of binned counts vs. total counts.
    • Understand selection biases and their impact on observed distributions.
• Things are a lot harder when you place limits.
• Two observations:
    • If you cannot make an analytical estimate of the uncertainties, I won’t believe your result.
    • The expression “n-sigma effect” should be banned.
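As a small example of an analytical uncertainty estimate, the sketch below applies the textbook binomial formula to an efficiency measured by counting (for instance 30 accepted entries out of 100 trials) and the Poisson formula to a plain count. The function names are hypothetical, and the binomial expression used is the simple normal approximation, which becomes unreliable when the efficiency is close to 0 or 1 or the counts are small.

```python
import math


def efficiency_with_error(n_pass, n_total):
    """Efficiency and its binomial standard error (normal approximation)."""
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


def poisson_error(n_counts):
    """Standard deviation of a Poisson-distributed count, estimated from the count."""
    return math.sqrt(n_counts)


eff, err = efficiency_with_error(30, 100)
print(f"efficiency = {eff:.2f} +/- {err:.3f}")  # efficiency = 0.30 +/- 0.046
print(f"Poisson error on 30 counts = {poisson_error(30):.2f}")
```

Note the difference between the two cases: the 30 accepted entries are a subset of a fixed 100 trials, so the binomial formula applies to the efficiency, whereas a count with no fixed total is treated as Poisson.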

Page 24: Data Analysis in Experimental Particle Physics

Hypothesis testing

• You must understand Bayes’ theorem.
• And every time you think you understand it, you must make a big effort to understand it better!
• Compare differential distributions of data with predictions of “theory” or “model”
    • Different theories
    • Different parameters for same model
• Setting up the statistical test is often straightforward, which is why it is surprising most people do it wrong
• Taking account of resolution and systematic uncertainties is hard
    • Make simulation look like data to get your answers
    • Even if graphics looks better the other way around!!!
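A minimal numerical illustration of Bayes’ theorem, using an invented particle-identification example: the prior fractions of kaons and pions in a sample and the probabilities that each fires an identification tag are hypothetical numbers chosen for the arithmetic, not measurements.

```python
def posterior(prior, likelihood):
    """Bayes' theorem for a discrete set of hypotheses.

    prior[h]      : probability of hypothesis h before the observation
    likelihood[h] : probability of the observation if h is true
    """
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}


# Hypothetical sample composition and tag probabilities.
prior = {"kaon": 0.10, "pion": 0.90}
likelihood = {"kaon": 0.90, "pion": 0.05}   # P(tag fires | particle type)

post = posterior(prior, likelihood)
print({h: round(p, 3) for h, p in post.items()})  # {'kaon': 0.667, 'pion': 0.333}
```

Even with a tag that is 90% efficient for kaons and fires for only 5% of pions, one tagged particle in three is still a pion, because pions dominate the prior. Forgetting the prior is one common way of getting a statistical test wrong.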

Page 25: Data Analysis in Experimental Particle Physics

Simulation of particle production and interactions with the detector

• For particle production, combine Monte Carlo with
    • Detailed particle properties
    • Detailed cross-sections predicted by theory or phenomenology
    • Computation of phase-space
    • Output consists of event records containing simulated particles (often called 4-vectors by experimentalists)
• For simulating the detector, combine MC with
    • Detailed description of the detector
    • Detailed cross-sections for interaction with detector materials
    • Detailed phenomenology of mechanism producing signal
    • Transport (ray-tracing) algorithms including B fields
    • Digitization model mapping of {x,y,z} to read-out channel
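To show what a generator-level event record of "4-vectors" can be used for, here is a short sketch that computes the invariant mass of a set of particles from their energy and momentum components, in natural units (c = 1). The tuple layout `(E, px, py, pz)`, the function name, and the two-photon example are assumptions made for the illustration.

```python
import math


def invariant_mass(particles):
    """Invariant mass of a list of (E, px, py, pz) 4-vectors, natural units (c = 1)."""
    e = sum(p[0] for p in particles)
    px = sum(p[1] for p in particles)
    py = sum(p[2] for p in particles)
    pz = sum(p[3] for p in particles)
    m2 = e * e - (px * px + py * py + pz * pz)
    # Guard against small negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


# Hypothetical event record: two back-to-back massless particles, energies in GeV.
event_record = [
    (45.6, 0.0, 0.0, 45.6),
    (45.6, 0.0, 0.0, -45.6),
]
mass = invariant_mass(event_record)
print(f"invariant mass = {mass:.1f} GeV")
```

The same function applied to reconstructed particles instead of simulated ones gives the measured mass, smeared by the detector resolution.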

Page 26: Data Analysis in Experimental Particle Physics

Simulation of particle production and interactions with the detector

[Figure: Example: small part of the design of GEANT4. Annotation: Reference to Jackson’s textbook in documentation!]

Page 27: Data Analysis in Experimental Particle Physics

Digital representations of event data

• In principle, representing event data digitally should be very simple, except:
    • everything comes in variable numbers: hits, tracks, clusters
    • ambiguities lead to multiple relations
    • particle identification may depend on analysis hypothesis
    • etc.
• In simple terms, events don’t look like bank account data, they look like collections of objects.
• You can do a reasonable representation using relational tables, but actually using the data structures from Fortran/ROOT programs is still cumbersome
• Object Oriented Programming is a better match, but C++ does not resolve all problems → Frameworks

Page 28: Data Analysis in Experimental Particle Physics

Why physicists don’t (yet) use Excel and Oracle for their daily analysis

• Spreadsheets like Excel and relational databases like Oracle have a very “square” view of data.
• This is not a good match to the Data Worm.
• “Normal” people (banks and insurance companies) can define a priori the quantities that they will select on (the keys of the database).
• We usually derive selection criteria a posteriori using quantities calculated from the stored data.
• We like (need?) to express queries as individualistic detailed low-level computer codes. Difficult to support in database.
• But this is changing very rapidly due to Data Mining:
    • Businesses are interested in analyzing their raw data in unpredictable ways.
    • Example: Cash register tickets to choose sale items
• Support for this requires a more “organic” view of data, for example object-relational databases.

Page 29: Data Analysis in Experimental Particle Physics

Why physicists don’t (yet) use Excel and Oracle for their daily analysis

[Diagram: Idealized data model with simple relations. The boxes are linked by four “One to Many” relations.]

• Particle hypothesis: Mass, Charge, Momentum, Origin
• Track: Origin, Curvature, Extrapolation, Number of hits
• Tracker hit: Position, Response
• Cluster: Position, Width, Depth, Energy, Number of hits
• Calorimeter hit: Position, Response

Page 30: Data Analysis in Experimental Particle Physics

Why physicists don’t (yet) use Excel and Oracle for their daily analysis

[Diagram: Reality, with complicated algorithmic relations. The same boxes are linked by four “Many to Many” relations.]

• Particle hypothesis: Mass, Charge, Momentum, Origin
• Track: Origin, Curvature, Extrapolation, Number of hits
• Tracker hit: Position, Response
• Cluster: Position, Width, Depth, Energy, Number of hits
• Calorimeter hit: Position, Response
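The many-to-many picture can be sketched with a few Python classes. Only a subset of the boxes and attributes in the diagram is modelled, the class and function names are hypothetical, and the numbers are illustrative (the masses are the charged pion and kaon masses in GeV). The sketch shows one hit shared by two tracks and one track interpreted under two particle hypotheses, which is exactly what a single "square" table handles poorly.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TrackerHit:
    position: Tuple[float, float, float]
    response: float


@dataclass
class Track:
    curvature: float
    hits: List[TrackerHit] = field(default_factory=list)   # variable length


@dataclass
class ParticleHypothesis:
    mass: float
    charge: int
    tracks: List[Track] = field(default_factory=list)       # variable length


def tracks_using(hit, tracks):
    """All tracks that include this exact hit object (identity, not equality)."""
    return [t for t in tracks if any(h is hit for h in t.hits)]


# An ambiguous hit claimed by two track candidates.
shared_hit = TrackerHit(position=(1.0, 2.0, 3.0), response=0.7)
track_a = Track(curvature=0.5, hits=[shared_hit, TrackerHit((1.5, 2.5, 3.5), 0.9)])
track_b = Track(curvature=0.6, hits=[shared_hit])

# The same track interpreted under two different particle hypotheses.
pion = ParticleHypothesis(mass=0.1396, charge=+1, tracks=[track_a])
kaon = ParticleHypothesis(mass=0.4937, charge=+1, tracks=[track_a])

print(len(tracks_using(shared_hit, [track_a, track_b])))  # 2
```

In a relational schema each of these links would need its own join table, and the algorithm that decides which links exist would still live outside the database.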

Page 31: Data Analysis in Experimental Particle Physics

The challenge of analysis for the LHC experiments

Page 32: Data Analysis in Experimental Particle Physics

The challenge of analysis for the LHC experiments

[Figure labels: 1:10¹²; Online: 1:10⁷; Analysis: 1:10⁵]

Page 33: Data Analysis in Experimental Particle Physics

The challenge of analysis for the LHC experiments

Page 34: Data Analysis in Experimental Particle Physics

The challenge of analysis for the LHC experiments

[Diagram: data flow for one experiment.]

• Processing stages: Detector; Event Filter (selection & reconstruction); Event Reconstruction; Event Simulation; Batch Physics Analysis
• Data stores: Raw data; Event Summary Data
• Quoted figures: 0.1 to 1 GB/sec; 35K SI95; ~200 MB/sec; 250K SI95; 350K SI95; 64 GB/sec; 500 TB; 1 PB / year; ~100 MB/sec analysis objects
• Thousands of scientists distributed around the planet

Page 35: Data Analysis in Experimental Particle Physics

The challenge of computing for the LHC

[Chart: Long Term Tape Storage Estimates. Series: Current Experiments, COMPASS, LHC. Horizontal axis: Year (1995 to 2006). Vertical axis: TeraBytes (0 to 14'000).]

Page 36: Data Analysis in Experimental Particle Physics

The challenge of computing for the LHC

[Chart: Long Term Tape Storage Estimates. Series: Current Experiments, COMPASS, LHC. Horizontal axis: Year (1995 to 2006). Vertical axis: TeraBytes (0 to 14'000).]

• Accumulation: 10 PB/year
• Signal/Background up to 1:10¹²

Page 37: Data Analysis in Experimental Particle Physics

The challenge of computing for the LHC

[Chart: Estimated CPU Capacity required at CERN. Horizontal axis: year (1998 to 2010). Vertical axis: K SI95 (0 to 5,000). Annotations: Jan 2000: 3.5K SI95; LHC.]

Moore’s law – some measure of the capacity technology advances provide for a constant number of processors or investment

Page 38: Data Analysis in Experimental Particle Physics

The challenge of computing for the LHC

[Chart: CERN Centre Physics Computing Capacity. Horizontal axis: year (1988 to 2000). Vertical axis: Thousands of CERN Units (0 to 120). Annotations: Mainframes decommissioned; First PC services; Moore's law (based on 1988); CERN RD47 project; RISC decommissioning agreed.]

Page 39: Data Analysis in Experimental Particle Physics

The challenge of computing for the LHC

[Chart: CERN Centre Physics Computing Capacity. Horizontal axis: year (1988 to 2000). Vertical axis: Thousands of CERN Units (0 to 120). Annotations: Mainframes decommissioned; First PC services; Moore's law (based on 1988); CERN RD47 project; RISC decommissioning agreed; Continued innovation.]

Page 40: Data Analysis in Experimental Particle Physics

Solving the LHC Computing Challenge: Technology Development Domains

[Diagram labels: DEVELOPER VIEW; GRID; FABRIC; APPLICATION; USER VIEW]

Page 41: Data Analysis in Experimental Particle Physics

Solving the LHC Computing Challenge

[Diagram: Computing fabric at CERN (2006).]

• Components: LAN-WAN Routers; Storage Network; Farm Network; Grid Interface; Real-time detector data
• 10 thousand dual-CPU boxes
• 10 thousand disk units
• Hundreds of tape drives
• Figures shown on the links, as extracted: 5; 250; 0.88; 24 *; 960 *; 6 *; 1.5; 12; 0.8; 0.8 (* Data rate in Gbps)

Page 42: Data Analysis in Experimental Particle Physics

Solving the LHC Computing Challenge: Data-Intensive Grid Research

Grid Protocol Architecture:

• Application
• Fabric: “Controlling things locally”: access to, & control of, resources
• Connectivity: “Talking to things”: communication (Internet protocols) & security
• Resource: “Sharing single resources”: negotiating access, controlling use
• Collective: “Managing multiple resources”: ubiquitous infrastructure services
• User: “Specialized services”: user- or application-specific distributed services

Internet Protocol Architecture: Internet; Transport; Application; Link

Page 43: Data Analysis in Experimental Particle Physics

Acknowledgements

Many of the figures in this talk are from the Web sites of ATLAS, CMS, Aleph and Delphi.