B.S. JAGADEESH, COMPUTER DIVISION,
BARC, TROMBAY, MUMBAI – 400 085
First National Workshop of the National Knowledge Network
Indian Institute of Technology, Powai, 31 October 2012
Bunches, each containing 100 billion protons, cross 40 million times a second at the centre of each experiment, giving ~1 billion proton-proton interactions per second in ATLAS & CMS!
Large numbers of collisions per event: ~1000 tracks stream into the detector every 25 ns.
A very large number of channels (~100 M) produces ~1 MB per 25 ns, i.e. 40 TB/s!
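A quick arithmetic check of the quoted raw rate, as a minimal sketch (pure unit conversion, no detector specifics assumed):

```python
# Back-of-envelope check of the quoted raw detector data rate.
event_size_bytes = 1e6      # ~1 MB of channel data per bunch crossing
crossing_period_s = 25e-9   # one bunch crossing every 25 ns
rate_bytes_per_s = event_size_bytes / crossing_period_s
print(f"{rate_bytes_per_s / 1e12:.0f} TB/s")  # -> 40 TB/s, as quoted
```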
LHC is a very large scientific instrument…
[Figure: aerial view of the LHC near Lake Geneva; the 27 km circumference ring hosts the four experiments CMS, ATLAS, LHCb and ALICE.]
James Casey, CERN, IT Department
Experiment   CPU (MSI2k*yr)   Disk (TB)   MSS (TB)
ALICE        59.2             23,903      17,880
ATLAS        150              72,453      48,398
CMS          108.2            34,403      46,800
LHCb         17.88            4,749       11,632
~200,000 computers!
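As a rough cross-check of that scale, a minimal sketch that sums the table above (the ~1 kSI2k per contemporary CPU core used to turn CPU capacity into a machine count is my assumption):

```python
# Sums the experiment resource table; assumes ~1 kSI2k per CPU core
# (a rough figure for hardware of that era) to estimate machine count.
requirements = {
    # experiment: (CPU in MSI2k*yr, disk TB, MSS TB)
    "ALICE": (59.2, 23903, 17880),
    "ATLAS": (150.0, 72453, 48398),
    "CMS":   (108.2, 34403, 46800),
    "LHCb":  (17.88, 4749, 11632),
}
cpu = sum(v[0] for v in requirements.values())
disk = sum(v[1] for v in requirements.values())
mss = sum(v[2] for v in requirements.values())
print(f"CPU:  {cpu:.1f} MSI2k*yr (~{cpu * 1000:,.0f} cores at 1 kSI2k/core)")
print(f"Disk: {disk / 1000:.0f} PB   Tape (MSS): {mss / 1000:.0f} PB")
```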
Result
Same (as)                                           Yet different (from)
Web (information)                                   Allows collaboration too (many resources, of which information is one)
Cluster/distributed computing (unifies resources)   Unifies resources belonging to different administrative domains
Virtualization (single resource)                    Allows virtualization of a large number of resources: computation, data, storage, information, etc.
More than 140 computing centres
12 large centres for primary data management: CERN (Tier-0) and eleven Tier-1s
38 federations of smaller Tier-2 centres
India: BARC, TIFR, VECC
Relies on the EGEE and OSG grids
[Figure: the tiered LHC data-grid hierarchy. The Online System feeds an Offline Processor Farm (~20 TIPS) at the CERN Computer Centre (Tier 0) at ~PBytes/sec; data flows at ~622 Mbits/sec (or by air freight, deprecated) to Tier-1 regional centres (FermiLab ~4 TIPS; France, Italy and Germany regional centres), then at ~622 Mbits/sec to Tier-2 centres (~1 TIPS each, e.g. Caltech), and on to institute servers (~0.25 TIPS) holding physics data caches, with physicist workstations reading at ~1 MBytes/sec.]
There is a "bunch crossing" every 25 nsecs.
There are ~100 "triggers" per second.
Each triggered event is ~1 MByte in size.
Physicists work on analysis "channels". Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
1 TIPS is approximately 25,000 SpecInt95 equivalents.
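The trigger chain reduces the raw 40 TB/s stream to what is actually recorded; a minimal sketch of the numbers above (the 10^7 seconds of live running per year is my assumption, a common rule of thumb):

```python
# Post-trigger recorded rate and a rough annual raw-data volume.
triggers_per_s = 100            # ~100 triggers per second
event_mb = 1.0                  # ~1 MB per triggered event
recorded_mb_per_s = triggers_per_s * event_mb
live_s_per_year = 1e7           # assumed live time per year (rule of thumb)
annual_pb = recorded_mb_per_s * live_s_per_year / 1e9
print(f"{recorded_mb_per_s:.0f} MB/s recorded, ~{annual_pb:.0f} PB/year raw")
```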
MIDDLEWARE FUNCTIONALITIES?
Specific to WLCG: GridFTP, LFC, GUIDs
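To make the WLCG naming pieces concrete: a toy sketch of the LFN -> GUID -> replica (SURL) mapping that the LFC maintains. This is pure illustration; the real catalogue is a service with its own client tools, and the paths and host names below are invented:

```python
# Toy replica catalogue: a logical file name (LFN) maps to one GUID,
# which maps to one or more physical replicas (SURLs) on GridFTP storage.
import uuid

lfn_to_guid = {}     # LFN  -> GUID
guid_to_surls = {}   # GUID -> list of replica SURLs

def register_replica(lfn, surl):
    guid = lfn_to_guid.setdefault(lfn, str(uuid.uuid4()))
    guid_to_surls.setdefault(guid, []).append(surl)
    return guid

guid = register_replica("/grid/cms/run2012/raw/file001",
                        "gsiftp://se01.example.org/cms/raw/file001")
register_replica("/grid/cms/run2012/raw/file001",
                 "gsiftp://se02.example.org/cms/raw/file001")
print(guid, "->", guid_to_surls[guid])  # one GUID, two replicas
```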
DAE-CERN Joint Co-ordination Meeting, Nov 2, 2011, CERN
[Figure: quattor-style fabric management architecture. The CDB (configuration database), with SQL and XML backends and accessed via CLI, GUI, scripts and SOAP, publishes XML configuration profiles over HTTP. On each managed node, the Node Configuration Manager (NCM) runs components (CompA, CompB, CompC) that configure local services (ServiceA, ServiceB, ServiceC), while the SW Package Manager (SPMA) installs RPMs/PKGs. The SW server(s) comprise an HTTP software repository serving RPMs and an install server whose Install Manager drives base-OS installation over HTTP/PXE.]
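A minimal sketch of the desired-state idea behind an SPMA-like package manager: diff the package list in the node's profile against what is installed and derive actions. The package names here are invented:

```python
# Desired-state package reconciliation, SPMA-style (illustrative only).
def reconcile(desired, installed):
    """Return (to_install, to_remove) as sorted lists."""
    return sorted(desired - installed), sorted(installed - desired)

profile_pkgs = {"openssh-server-5.3", "lemon-agent-2.14", "glite-WN-3.2"}
node_pkgs = {"openssh-server-5.3", "telnet-server-0.17"}

to_install, to_remove = reconcile(profile_pkgs, node_pkgs)
print("install:", to_install)  # packages the profile wants but node lacks
print("remove :", to_remove)   # packages present but not in the profile
```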
The key monitoring areas in GridView include:
Service Availability Monitoring
File Transfer Monitoring
Job Monitoring
[Figure: GridView data-flow architecture. The SAM Framework runs SAM tests against service nodes; test results enter the SAM DB via a publishing web service and reach GridView through a SAM XSQL export module. RB job logs and GridFTP logs from RBs and SEs (gridftp) arrive through WS clients and the R-GMA and web-service archiver modules; site fabric monitoring systems (LEMON/Nagios) publish over HTTP/XML; a GOCDB sync module imports topology from GOCDB. The GRIDVIEW DB feeds the Data Analysis & Summarization Module, which computes availability metrics, and the Visualization Module, which produces graphs and reports.]
Displays periodic Graphs and Reports for:
Detailed SAM test results for tests run for services at a particular site
Hourly, Daily, Weekly and Monthly basis
Full traceability from aggregate Availability to detailed SAM test results
Provision for saving user preferences based on certificates
Refer http://gridview.cern.ch/GRIDVIEW/
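For intuition, a minimal sketch of how an aggregate availability figure can be rolled up from individual SAM test results (the record layout is invented, not the real SAM schema):

```python
# Aggregate daily availability from hourly SAM-style test results.
from datetime import datetime

results = [  # (test timestamp, status) for one service, one day
    (datetime(2012, 10, 31, h), "CRITICAL" if h in (3, 4) else "OK")
    for h in range(24)
]

ok = sum(1 for _, status in results if status == "OK")
print(f"daily availability: {ok / len(results):.1%}")  # -> 91.7%
```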
Gridview computes job statistics based on RB job logs.
Displays periodic Graphs and Reports for:
Job Status (total number of jobs in various states)
Job Success Rate
Job Resource Utilization (elapsed time, CPU, memory)
Average Job Turnaround time (RB waiting, site waiting, execution time)
Site, VO and RB-wise distribution
Hourly, Daily, Weekly and Monthly reports
Displays periodic Graphs and Reports for:
Overall Summary
○ sites with high/low job execution rate
○ sites with high/low job success rate
○ VOs running more/fewer jobs, etc.
Possible to view job statistics for any user-selected combination of VO, Site and RB
Refer http://gridview.cern.ch/GRIDVIEW/
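As a sketch of the job-statistics roll-up, computing success rate and mean turnaround from parsed RB job-log records (the record layout below is invented for illustration):

```python
# Job success rate and mean turnaround from RB-style job records.
jobs = [  # (final_state, rb_wait_s, site_wait_s, exec_s)
    ("Done(OK)",     40, 300, 5400),
    ("Done(OK)",     15, 120, 7200),
    ("Aborted",      90,   0,    0),
    ("Done(Failed)", 20, 600, 1800),
]

ok_jobs = [j for j in jobs if j[0] == "Done(OK)"]
success_rate = len(ok_jobs) / len(jobs)
mean_turnaround = sum(sum(j[1:]) for j in ok_jobs) / len(ok_jobs)
print(f"success rate: {success_rate:.0%}")          # -> 50%
print(f"mean turnaround: {mean_turnaround:.0f} s")  # RB + site wait + exec
```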
Please visit http://gridview.cern.ch/GRIDVIEW/ for the most recent snapshot.
Fully web-based system providing:
Tracking: tracking reported bugs, defects, feature requests, etc.
Assignment: automatic routing and notification to support staff to get issues resolved
Communication: capturing discussion and sharing knowledge
Enforcement: automatic reminders according to the severity of the issues
Accountability: history and logs
NKN is a state-of-the-art multi-gigabit pan-India network: www.nkn.in
► Idea of setting up NKN was finalized at the Office of the PSA & NKC.
► INR 100 crores allocated in the 2008 budget; NIC working as the Project Execution Agency.
► 9th April 2009: President of India inaugurated the NKN project, completed in nine months: 16 PoPs, 26 backbone links, 57 edge links.
► Connects R&D, educational, health and agricultural labs and institutes, etc.
► GoI approved a budget of INR 5,990 crores for NKN in March 2010.
► 5th March 2011: launched the logo & website of NKN: 27 PoPs, 76 backbone links, 216 edge links.
► 1,500+ institutes in the final phase.
NKN HAS ENABLED EVERYONE TO COME ON BOARD!!
[Figure: logical communication domains through NKN. The BARC NKN router links: National Grid Computing (CDAC, Pune); the WLCG Collaboration; the Common Users Group (CUG); Anunet (DAE units); the BARC-IGCAR link; NKN-Internet (Grenoble, France); NKN-General (national collaborations); and the intranet and internet segments of BARC.]
Category                     Examples                       Characteristics
Distributed Supercomputing   Ab-initio molecular dynamics   Large CPU/memory required
High Throughput              Cryptography                   Harnesses idle cycles
On Demand                    Medical instruments            Cost effectiveness
Data Intensive               CERN LHC                       Information from large data sets
Collaborative                Data exploration               Supports communication
KEEPING PROVENANCE INFORMATION TO MAKE DATA DISCERNIBLE TO THE NEXT GENERATION
MEETING PROVISIONING CHALLENGES (CLOUDMAN PROJECT)
COMPLETE SWITCH-OVER TO "CLOUDS"? (SECURITY OF DATA?)