

Management System of Event Processing and Data Files Based on XML Software Tools at Belle

Ichiro Adachi, Nobu Katayama, Masahiko Yokoyama
IPNS, KEK, Tsukuba, Japan

Yutaka Ueshima, Yukimasa Matsuda, Takashi Okamoto
Quatre-i Science Ltd., Kyoto, Japan

Belle Experiment at KEKB Accelerator

• B-factory experiment at KEK, where 8 GeV electrons collide with 3.5 GeV positrons, to explore the frontier of flavor physics with millions of B-meson pairs

• KEKB accelerator: world’s highest-luminosity e+e- collider, providing a luminosity of 1.6 x 10^34 /cm^2/sec

• More than 700/fb of data has been accumulated since 1999

Luminosity Records

peak luminosity: 1.71 x 10^34 /cm^2/sec

Integrated luminosities
daily: 1.23/fb
8 hour: 0.43/fb

further improvement expected due to installation of crab cavities
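As a rough cross-check of the records quoted above, the sketch below computes how much integrated luminosity the peak value would deliver if it were held constant for 24 hours and for 8 hours. It uses the standard conversion 1/fb = 10^39 /cm^2. The function and constant names are illustrative. The peak and integrated records were not necessarily set in the same period, so the ratios are only indicative.

```python
# Integrated luminosity delivered by a constant instantaneous luminosity,
# expressed in inverse femtobarns (1 /fb = 1e39 /cm^2).
INV_CM2_PER_INV_FB = 1e39


def integrated_inv_fb(lumi_cm2_s: float, seconds: float) -> float:
    """Return luminosity * time, converted from /cm^2 to /fb."""
    return lumi_cm2_s * seconds / INV_CM2_PER_INV_FB


PEAK = 1.71e34  # /cm^2/sec, peak record quoted on the poster

day_at_peak = integrated_inv_fb(PEAK, 24 * 3600)
shift_at_peak = integrated_inv_fb(PEAK, 8 * 3600)

print(f"24 h at peak: {day_at_peak:.2f} /fb, record/ideal = {1.23 / day_at_peak:.2f}")
print(f" 8 h at peak: {shift_at_peak:.2f} /fb, record/ideal = {0.43 / shift_at_peak:.2f}")
```

Running at the peak value for a full day would give about 1.48/fb, so the daily record of 1.23/fb corresponds to roughly 83% of that ideal; the 8-hour record corresponds to roughly 87%.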

Belle Event Processing and File Management

[Flow diagram. Labels: raw data; real-time processing; DST data; reprocessing; calibration constant making; physics event skims; hadronic; mu-pair; Bhabha; physics analysis files; file registration into user analysis list; archive for backup; appropriate distribution for user analysis; detector experts’ analysis; data quality monitoring]

• Most steps in event processing have been automated; however, human intervention is still necessary in some parts.

• More than 1 million data files are difficult to manage in a consistent way.

A systematic procedure for checking and monitoring each step, as well as stable management of data files, is needed.

Introduce RCM system tools

RCM (R&D Chain Management) System Software

• Automated repetition of event processing procedure
• Capability of automatic retrieval of error
• High traceability of each step
• Keep history in database
• Clear visualization of processing status using Web

[Figure: Concept of RCM system. Labels: 1st layer; 2nd layer; 3rd layer; XML-USER INTERFACE; XML-CONTROL; XML DATABASE; access control; log-database; Web access; XML viewer; CLI template; REST; SSH; SCP; User PC; Internet, LAN, etc.; Web server; File server; Analyze server; Sensor devices; Visualize server; DB server; Super computer; Control server]

[Panel labels: server configuration; work-flow configuration; automation of repetitive work; login window in a browser; Layer 2; Layer 3]

File name: I_250mA.dat, Date: 2004-04-27
File name: I_250Time_V.png, Date: 2004-04-27, Size: 100k
File name: I_250Dist.png, Date: 2004-04-27, Size: 180k
Initial value: 220 mA, Date: 2004-04-26, Code Ver: 1.8

Flexible shared level: Public, Private, Group

[Diagram labels: workflow; log; DB request; RCM-DB; design-free database; annotation]

• Analysis flow is defined as a set of “workflow templates” written in XML
• In each workflow template, checkpoints can be provided at the user’s request
• Each checkpoint ensures that the flow proceeds correctly
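The idea of a workflow template with checkpoints can be sketched as follows. The poster does not show the RCM template schema, so the element names (`workflow`, `step`, `checkpoint`), the attributes, and the action and check names here are all hypothetical. The sketch runs the steps in document order and stops at the first checkpoint that fails.

```python
import xml.etree.ElementTree as ET

# Hypothetical template layout; the real RCM schema is not given in the poster.
TEMPLATE = """
<workflow name="dst-example">
  <step id="1" action="produce"/>
  <checkpoint id="c1" require="output_exists"/>
  <step id="2" action="register"/>
</workflow>
"""


def run_workflow(xml_text, actions, checks):
    """Run steps in document order; stop at the first failing checkpoint."""
    state = {}
    log = []
    for node in ET.fromstring(xml_text):
        if node.tag == "step":
            actions[node.get("action")](state)
            log.append(("step", node.get("id"), "done"))
        elif node.tag == "checkpoint":
            ok = checks[node.get("require")](state)
            log.append(("checkpoint", node.get("id"), "ok" if ok else "failed"))
            if not ok:
                break  # do not continue past a failed checkpoint
    return log


# Illustrative actions and checks operating on a shared state dictionary.
actions = {
    "produce": lambda s: s.update(output="run001.dst"),
    "register": lambda s: s.update(registered=True),
}
checks = {"output_exists": lambda s: "output" in s}

for entry in run_workflow(TEMPLATE, actions, checks):
    print(entry)
```

The returned log doubles as a simple processing history, which is the kind of record the poster describes keeping in a database.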

XML user interface

A multi-functional viewer interprets query results written in XML and automatically displays text, thumbnails, and download links
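One way such a viewer could decide what to show is by mapping each entry of an XML query result to a display kind. This is a minimal sketch under assumptions: the `<result>`/`<item>` layout, the `path` attribute, the `run/` directory, and the extension sets are invented for illustration and are not the RCM query format.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Hypothetical query result; the actual RCM result format is not shown.
RESULT = """
<result>
  <item path="run/I_250mA.dat"/>
  <item path="run/I_250Time_V.png"/>
  <item path="run/notes.txt"/>
</result>
"""

IMAGE_SUFFIXES = {".png", ".jpg", ".gif"}
TEXT_SUFFIXES = {".txt", ".log", ".xml"}


def render_plan(xml_text):
    """Return (path, display kind) pairs for every item in the query result."""
    plan = []
    for item in ET.fromstring(xml_text).iter("item"):
        path = item.get("path")
        suffix = PurePosixPath(path).suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            kind = "thumbnail"
        elif suffix in TEXT_SUFFIXES:
            kind = "text"
        else:
            kind = "download link"  # fallback for anything not shown inline
        plan.append((path, kind))
    return plan


for path, kind in render_plan(RESULT):
    print(f"{path}: {kind}")
```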

[Screenshots: list of templates; edit window of template; GUI window of template]

XML workflow templates, which are shared and easily edited, automate complex data search, analysis, and registration in the XML-DB.

DST processing, including output data management, was described as 27 steps, and each step was defined in one XML template

[Diagram: data file management. Labels: data file servers; RAID disk server; database server; HSM archive server; data file quality; MD5 checksum; if fail, copy back from HSM archive server; issue copy back from HSM; data; database update; data location stored in DB, updated in a consistent way]

• Processed data files are also managed by RCM
• An MD5 checksum is always computed and the result is kept when output files are produced
• In case of disk trouble, files are automatically examined and, if corrupted, automatic recovery is issued from RCM
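The check-and-recover cycle described above can be sketched with Python's standard `hashlib`. This is an assumption-laden illustration rather than the RCM implementation: the recorded checksum would come from the database, and `shutil.copyfile` stands in for the copy-back request to the HSM archive server. The function names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_or_restore(disk_copy, archive_copy, recorded_md5):
    """Return 'ok', 'restored', or 'unrecoverable' for one data file."""
    disk_copy = Path(disk_copy)
    if disk_copy.exists() and md5sum(disk_copy) == recorded_md5:
        return "ok"
    # Stand-in for the copy-back from the HSM archive (assumption).
    shutil.copyfile(archive_copy, disk_copy)
    return "restored" if md5sum(disk_copy) == recorded_md5 else "unrecoverable"


# Small demonstration with temporary files.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "archive.dat"
    disk = Path(tmp) / "disk.dat"
    archive.write_bytes(b"event data")
    recorded = md5sum(archive)           # checksum kept when the file was produced
    disk.write_bytes(b"event dat\x00")   # simulate corruption on the disk server
    first = verify_or_restore(disk, archive, recorded)
    second = verify_or_restore(disk, archive, recorded)
    print(first, second)
```

The first call detects the mismatch and restores the file; the second call finds the checksum in agreement and does nothing.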

Intelligent Belle Processing & Data File Management System Based on RCM

[Diagram labels: produced data consistency check; DST job error analysis; DST job completion; set of 27 templates; test / answer; query / answer]

• In each template, one can communicate with a database server and run any script as one step of an RCM template
• Necessary checking and monitoring can be done automatically
• If a processing error occurs, managers can be notified via email
• Recovery can also be carried out by RCM if it is defined beforehand
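The notify-then-recover behaviour listed above might look like the following sketch. The function `run_step`, the subject prefix, and the recipient address are hypothetical. The message is built with the standard `email.message.EmailMessage` class and handed to a `notify` callable, so the example runs without a mail server.

```python
from email.message import EmailMessage


def run_step(name, action, recover=None, notify=print):
    """Run one step; on error, notify managers and try a predefined recovery."""
    try:
        action()
        return "ok"
    except Exception as exc:
        msg = EmailMessage()
        msg["Subject"] = f"[RCM] step '{name}' failed"
        msg["To"] = "managers@example.org"  # placeholder address (assumption)
        msg.set_content(f"{type(exc).__name__}: {exc}")
        notify(msg)
        if recover is None:
            return "failed"  # no recovery was defined beforehand
        try:
            recover()
            return "recovered"
        except Exception:
            return "failed"


def broken_copy():
    raise RuntimeError("disk full")


sent = []  # collects messages instead of sending them
status = run_step("copy-output", broken_copy, recover=lambda: None, notify=sent.append)
print(status)
print(sent[0]["Subject"])
```

In a real deployment the `notify` callable could pass the message to `smtplib.SMTP(...).send_message(msg)`; collecting messages in a list keeps the sketch testable offline.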

Event processing File management

Abstract number : 282

2006-2007 Quatre-i Science. Ltd. All rights reserved ref: http://www.i4s.co.jp/