The Current Status of CDF Grid
October 21, 2005, The Korean Physical Society
YuChul Yang (ycyang@mail.knu.ac.kr)
YuChul Yang*, DaeHee Han, DaeJung Kong, JiEun Kim, JunSeok Seo, SungHyun Jang, KiHyeon Cho, YoungDo Oh, Shabeer MIAN, Adil AHMAD KHAN, DongHee Kim (Kyungpook National University)
SooBong Kim, HyunSoo Kim, ChangSung Moon, YoungJang Lee, EunJu Jeon, JiEun Jung, KyungKwang Joo (Seoul National University)
InTae Yu, JaeSeung Lee, IlSung Cho, DongHyuk Choi (Sungkyunkwan University)
Introduction to CDF Computing
Developed in 2001-2002 in response to the experiment's greatly increased need for computational and data-handling resources for Run II.
One of the first large-scale cluster approaches to user computing for general analysis.
Greatly increased the CPU power and data available to physicists.
CDF Grid via CAF, DCAF, SAM, and JIM:
☞ DCAF (Decentralized Analysis Farm)
☞ SAM (Sequential Access through Metadata), the data handling system (illustrated below)
☞ JIM (Job Information Management), the resource broker
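SAM's key idea, that users ask for data by metadata query rather than by file path, can be shown with a minimal conceptual sketch. Everything below (the catalog contents, dataset query, and function names) is a hypothetical illustration of the idea, not the actual SAM interface.

```python
# Conceptual sketch of metadata-driven data access in the spirit of SAM.
# The catalog contents and all names here are hypothetical illustrations,
# not the real SAM API.

CATALOG = [
    {"file": "bphys_0001.root", "stream": "B", "run": 152000, "events": 50000},
    {"file": "bphys_0002.root", "stream": "B", "run": 152001, "events": 48000},
    {"file": "jpsi_0001.root",  "stream": "J", "run": 152000, "events": 52000},
]

def define_dataset(stream, min_run):
    """A 'dataset' is the result of a metadata query, not a fixed path list."""
    return [f for f in CATALOG if f["stream"] == stream and f["run"] >= min_run]

def consume(dataset):
    """Hand files to the job one at a time; the real system would also
    stage each file to a local cache before delivering it."""
    for entry in dataset:
        yield entry["file"]

for filename in consume(define_dataset(stream="B", min_run=152000)):
    print("processing", filename)
```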
Outline
CAF (Central Analysis Farm): a large central computing resource at Fermilab, based on Linux cluster farms with a simple job management scheme.
DCAF (Decentralized CDF Analysis Farm): we extended the above model, including its command-line interface and GUI, to manage and work with remote resources.
Grid: we are now in the process of adapting and converting our workflow to the Grid.
Environment on CAF
All basic CDF software is pre-installed on CAF. Authentication is via Kerberos:
☞ Jobs are run via mapped accounts, with the actual user authenticated through a special principal.
☞ For database and data handling, the remote user's ID is passed on through lookup of the actual user via the special principal.
The user's analysis environment comes over in a tarball, so there is no need to pre-register or to submit only certain kinds of jobs. The job returns results to the user via secure ftp/rcp, controlled by the user's script and principal.
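As a rough sketch of the user side of this workflow: pack the analysis directory into a tarball and hand it to the submission client. The client name and options below are assumptions for illustration; only the tarball-plus-Kerberos pattern is taken from the slide.

```python
# Minimal sketch of a CAF-style submission: bundle the user's analysis
# environment into a tarball, then call the submission client. The
# "CafSubmit" name and its options are assumptions, not the verbatim CLI.
import subprocess
import tarfile
from pathlib import Path

def make_tarball(workdir: str, tarball: str) -> str:
    """Bundle the analysis directory; the farm unpacks it on the worker
    node, so no pre-registration of user software is needed."""
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(workdir, arcname=Path(workdir).name)
    return tarball

def submit(tarball: str, farm: str, script: str) -> None:
    # The user's Kerberos ticket (special principal) authenticates the
    # submission; the job itself runs under a mapped account.
    subprocess.run(
        ["CafSubmit", "--farm", farm, "--tarball", tarball, "--script", script],
        check=True,
    )

submit(make_tarball("my_analysis", "my_analysis.tar.gz"),
       farm="korcaf", script="run_job.sh")
```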
In 2005, 50% of analysis farm outside of FNAL
Distributed clusters in Korea, Taiwan, Japan, Italy, Germany, Spain, UK, USA and Canada
Current DCAF approach
Cluster technology (CAF = Central Analysis Farm) extended to remote sites (DCAFs = Decentralized CDF Analysis Farms)
Multiple batch systems supported: converting from the FBSNG system to Condor on all DCAFs
SAM data handling system required for offsite DCAFs
http://www-cdf.fnal.gov/internal/fastnavigator/fastnavigator.html (2005/Oct/17)
Current CDF Dedicated Resources
Detail of KorCAF resources

TYPE             HOST                          CPU              RAM  HDD    NO
head node        cluster46.knu.ac.kr           AMD MP2000 x 2   2G   80G    1
SAM station      cluster67.knu.ac.kr           Pentium 4 2.4G   1G   80G    1
submission node  cluster52.knu.ac.kr           Pentium 4 2.4G   1G   80G    1
worker nodes     cluster39~cluster73 (21),     AMD MP2000 x 2   2G   80G    4
                 cluster102~cluster114 (13),   AMD MP2200 x 2   1G   80G    2
                 cluster122~cluster130 (9)     AMD MP2800 x 2   2G   80G    11
                                               AMD MP2800 x 2   2G   250G   2
                                               Pentium 4 2.4G   1G   80G    15
                                               Xeon 3.0G x 2    2G   80G    9
Total            75 CPUs (173.9 GHz)                            73G  4020G  46
Storage upgrade status

CPU                  RAM  HDD    NO
Current                   0.6TB
Opteron dual (2005)  2G   4TB    1
Xeon dual (2005)     1G   1TB    1
Total                     5.6TB  2
Now converting to the Condor batch system (a minimal submission sketch follows the software list below).
cdfsoft installed products: 4.11.1, 4.11.2, 4.8.4, 4.9.1, 4.9.1hpt3, 5.2.0, 5.3.0, 5.3.1, 5.3.3, 5.3.3_nt, 5.3.4, development
Installed binary products: 4.11.2, 5.3.1, 5.3.3, 5.3.3_nt, 5.3.4
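To make the Condor conversion concrete, here is a minimal sketch of a single-job submission to a Condor batch system; the file names and job script are placeholders, not KorCAF's actual configuration.

```python
# Write a classic Condor submit description and queue it with
# condor_submit. Job script and file names are placeholders.
import subprocess
import textwrap

submit_description = textwrap.dedent("""\
    universe             = vanilla
    executable           = run_job.sh
    transfer_input_files = my_analysis.tar.gz
    output               = job.out
    error                = job.err
    log                  = job.log
    queue 1
""")

with open("job.submit", "w") as f:
    f.write(submit_description)

# condor_submit queues the job on the local scheduler, which matches it
# to a free worker node in the pool.
subprocess.run(["condor_submit", "job.submit"], check=True)
```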
CAF GUI & Monitoring System
The GUI covers farm selection, process type, submission status, the user script and I/O file locations, and data access.
Monitoring: http://cluster46.knu.ac.kr/condorcaf
Functionality for User

Feature                            Status
Self-contained user interface      Yes
Runs arbitrary user code           Yes
Automatic identity management      Yes
Network delivery of results        Yes
Input and output data handling     Yes
Batch system priority management   Yes
Automatic choice of farm           Not yet
Negotiation of resources           Not yet
Runs on arbitrary grid resources   Not yet
Grid
Luminosity and Data Volume
Expectations are for continued high-volume growth as luminosity and the data logging rate continue to improve:
Luminosity is on target to reach the goal of 2.5x the present rate.
The data logging rate will increase to 25-40 MB/s in 2005.
The rate will further increase to 60 MB/s in FY 2006.
Total Computing Requirements

           Input Conditions                        Resulting Requirements
Fiscal  Int L   Evts     Peak rate  Peak rate  Ana    Reco   Disk  Tape I/O  Tape Vol
Year    (fb-1)  (x10^9)  (MB/s)     (Hz)       (THz)  (THz)  (PB)  (GB/s)    (PB)

Actual
2003    0.3     0.6      20         80         1.5    0.5    0.2   0.2       0.4
2004    0.7     1.1      20         80         4.0    0.7    0.3   0.5       1.0

Estimated
2005    1.2     2.4      35         220        7.2    1.0    0.7   0.9       2.0
2006    2.7     4.7      60         360        16     1.4    1.2   1.9       3.3
2007    4.4     7.1      60         360        26     2.8    1.8   3.0       4.9

Analysis CPU, disk, and tape needs scale with the number of events.
The FNAL portion of analysis CPU is assumed to be roughly 50% beyond 2005.
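As a quick consistency check of the table, the 2006 numbers can be tied together with simple arithmetic; the 70% average-of-peak duty factor below is an assumption for illustration, not a number from the slide.

```python
# Does the 2006 peak logging rate roughly reproduce the tape-volume
# growth from 2.0 PB to 3.3 PB? The duty factor is an assumed 70%.
PEAK_RATE_MB_PER_S = 60      # FY2006 peak logging rate from the table
DUTY_FACTOR = 0.70           # assumed average fraction of peak over the year
SECONDS_PER_YEAR = 3.156e7

logged_pb = PEAK_RATE_MB_PER_S * 1e6 * DUTY_FACTOR * SECONDS_PER_YEAR / 1e15
print(f"estimated data logged in FY2006: {logged_pb:.2f} PB")
# ~1.3 PB, consistent with the 2.0 PB -> 3.3 PB tape-volume step
```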
Movement to Grid
It is the worldwide trend for HEP experiments.
We need to take advantage of global innovations and resources.
CDF still has a lot of data to be analyzed.
We cannot continue to expand dedicated resources.
☞ Use the Grid.
Activities for CDF Grid
Testing various approaches to using Grid resources (Grid3/OSG and LCG):
Adapt the CAF infrastructure to run on top of the Grid using Condor glide-ins (see the sketch after this list).
Use direct submission via the CAF interface to OSG and LCG.
Use SAMGrid/JIM sandboxing as an alternate way to deliver experiment and user software.
Combine DCAFs with Grid resources.
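The glide-in approach named in the list can be sketched as the familiar pilot-job pattern: a generic grid job lands on a remote worker, joins the experiment's pool, and pulls real user jobs from the central queue. The names and the in-process queue below are hypothetical stand-ins, not the actual Condor glide-in machinery.

```python
# Conceptual sketch of the glide-in / pilot pattern. The pilot is what
# the Grid sees; CDF users keep submitting through the usual CAF queue.
import queue

central_queue = queue.Queue()      # stands in for the CAF scheduler's queue
for job_id in ("ana_001", "ana_002", "ana_003"):
    central_queue.put(job_id)

def pilot(site: str) -> None:
    """What a glide-in does after landing on a grid worker node:
    join the pool, then pull and run real user jobs until none remain."""
    while not central_queue.empty():
        job_id = central_queue.get()
        print(f"[{site}] running {job_id}")

pilot("lcg-worker-01")  # a single pilot drains the demo queue
```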
Conclusions
CDF has successfully deployed a global computing environment (DCAFs) for user analysis.
A large portion (50%) of the experiment's total CPU resources is now provided offsite through a combination of DCAFs and other clusters, and KorCAF (the DCAF in Korea) has switched to the Condor batch system.
Active work is in progress to build bridges to true Grid methods and protocols, providing a path to the future.
Backup
Abstract
CDF is a large-scale collaborative experiment in particle physics currently taking data at the Fermilab Tevatron. As a running experiment, it generates a large amount of physics data that require processing for user analysis. The collaboration has developed techniques for such analysis and the related simulations based on distributed clusters at several locations throughout the world. We will describe the evolution of CDF's global computing approach, which exceeded 5 THz of aggregate CPU capability during the past year, and its plans for putting increasing amounts of user analysis and simulation onto the Grid.
CDF Data Analysis Flow
CDF detector → Level-3 Trigger → Tape Storage → Production Farm → Central Analysis Farm