5 Dec., 2001
A summary of system tests at BNL related to the ROD
presented by Kin Yip
From a user point of view :
Description of the system of the test stand at BNL
Various capabilities of our test stand, the work done and the experience gained
Data integrity, continuity and noise up to 100 kHz
[Diagram: layout of the test stand. FE crate with FEB, calibration board and TSA board; TTC control; PTG; ROD with PU, OC and SDRAM in a ~VME crate (CPU board hidden inside); Host 1 and Host 2. Labelled links: trigger, veto, signal from a pulser (triggered by TTC), data (through optical link).]
“Host 2”, a single board in the same crate as the ROD, is a diskless node booted from “Host 1” through the network
Brief system descriptions
Front-End Crate (~final) : FEB (Nevis, U. Columbia), calibration board (LAPP, Annecy), trigger summing analyzer (U. Pittsburgh)
TTC (just like the testbeam, Saclay)
PTG (Pulse Trigger Generator) etc., “home-made” by BNL (H. Chen)
ROD (Read-Out Driver) Demo Board:
Motherboard from U. Geneva (input up to 2 FEB)
PU (TI 6202) from Nevis, U. Columbia (64 input each)
Optical Glink Rx/Tx cards from SMU, Texas
Transition module from KTH, Stockholm
Software and firmware concerned :
DSP code originally and mostly from Nevis (sometimes modified by me)
Input FPGA firmware from Nevis (5 samples vs 32 samples)
Output Controller firmware from U. Geneva (quite a few upgrades)
ROD system software, U. Geneva and BNL
Relative Trigger and Data timing
[Timing diagram, not drawn to scale: signal, FEB, ROD and Trigger traces with the delay and Data intervals marked in µs (one of them 10 µs).]
The E, T, χ² calculation takes less time than passing the entire FEB event
Nothing is trivial 1
After some initial struggles with the VME crate, the ROD demo board and other “accessories”, we first managed:
data in host PC → through VME → ROD → E, T, χ² → VME → PC
Then we took data from the FEB and immediately realized that it is useful to program the PU to just copy all the data bits from the FEB, which helps debugging and detailed calibration analysis → this has become the most popular running mode
We first read 5-sample FEB data (nominal)
For calibration and other purposes, we have also taken 32-sample data. This requires changing the DSP code and the FPGA firmware in the PU.
A lot of detailed data debugging was needed to make the whole scheme work
Nothing is trivial 2
One typical mode of running :
Trigger → FEB → PU/OC → SDRAM on ROD motherboard
When the SDRAM is full, “BUSY” is raised → veto trigger
Transfer all data in SDRAM → host PC through VME (~3 seconds)
Transfer data through the network to hard disk in another host (~3 seconds)
We routinely read 5-sample data at trigger rates up to 100 kHz, verified using the oscilloscope … more later
For 32-sample data, the max. rate is about 12 kHz
The SDRAM has a capacity of 16 Mbytes, which can hold, for example, 15947 events of 263 words each (including some redundant words). That is already plenty of events for checking data integrity etc.
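The capacity figure quoted above can be cross-checked with a little arithmetic. The sketch below assumes 4-byte (32-bit) words and 1 Mbyte = 2^20 bytes, neither of which is stated on the slide; under those assumptions it reproduces the quoted 15947 events. The fill time and live fraction are rough estimates based on the ~3 s + ~3 s transfer times listed above, and the function names are only illustrative.

```python
# Cross-check of the SDRAM capacity quoted on the slide.
# Assumptions (not stated in the talk): 4-byte words, 1 Mbyte = 2**20 bytes.
SDRAM_BYTES = 16 * 2**20
WORDS_PER_EVENT = 263      # from the slide, includes redundant words
BYTES_PER_WORD = 4         # assumption


def events_per_fill(sdram_bytes=SDRAM_BYTES,
                    words_per_event=WORDS_PER_EVENT,
                    bytes_per_word=BYTES_PER_WORD):
    """Number of whole events that fit in the SDRAM."""
    return sdram_bytes // (words_per_event * bytes_per_word)


def fill_time_s(trigger_hz, n_events):
    """Time to fill the SDRAM at a regular trigger rate."""
    return n_events / trigger_hz


def live_fraction(fill_s, readout_s):
    """Fraction of wall-clock time spent taking triggers."""
    return fill_s / (fill_s + readout_s)


if __name__ == "__main__":
    n = events_per_fill()
    t = fill_time_s(100e3, n)
    print(n, "events per fill")
    print(round(t, 3), "s to fill at 100 kHz")
    print(round(live_fraction(t, 3.0 + 3.0), 4), "live fraction")
```

With these assumptions, one fill at 100 kHz takes a fraction of a second, so the wall-clock time of a long run is dominated by the two transfers rather than by the trigger rate.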
Routine measurements/plots
5-sample pulse (with the right delay time, of course)
By changing the firmware/software in the input FPGA/DSP of the PU, we can take data with 32 samples.
We have successfully used 2 PUs to read all 128 channels of a FEB.
Capacitor underneath
Two controllers in two different crates
Controlling trigger rate
A couple of notes on problem solving …
Since the early days, we have been checking errorflag, the ADC bits, the BCID’s etc. This has been very useful in debugging the data-taking system.
The OC in the ROD has had problems storing data to the SDRAM (mainly due to partitions in the RAM and communication between the PU and the OC). All along, our diagnostic tool has spotted repeatable data corruption and identified symptoms of the problem, and has therefore provided hints for eventually solving it.
The temperature at the back of the VME crate was too high, leading to a temperature of 60 °C for the Glink chip/clock on the receiver card
Glink chips malfunctioned such that the phase between the data and the clock would shift → corrupted data (including wrong error codes)
Solution :
I put a small fan blowing air right towards the Glink receiver card
The temperature drops and the phase doesn’t change any more, which one can observe even on a scope.
Thermometer
Pulse from the trigger summing analyzer board
Data seen from the ROD using 1 calibration channel
Pulse from the trigger summing analyzer board
Analog vs Digital
Signals from calibration board
Testing at 100 kHz : Noise/Pedestal Run
Testing at 100 kHz : Fixed Pattern of 0x111
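A fixed-pattern run lends itself to a very simple integrity check: every data word should equal the injected pattern. The sketch below is only an illustration of that idea, not the diagnostic tool used at BNL. The data layout (a list of events, each a list of integer ADC words) and the function name `find_pattern_errors` are assumptions made for the example.

```python
# Minimal sketch of a fixed-pattern integrity check.
# The event layout (list of lists of ints) is hypothetical.
EXPECTED_PATTERN = 0x111   # pattern named in the slide title


def find_pattern_errors(events, expected=EXPECTED_PATTERN):
    """Return (event index, word index, flipped-bit mask) for each bad word."""
    errors = []
    for i, event in enumerate(events):
        for j, word in enumerate(event):
            if word != expected:
                # XOR shows which bits differ from the expected pattern.
                errors.append((i, j, word ^ expected))
    return errors


if __name__ == "__main__":
    good = [0x111] * 4
    bad = [0x111, 0x110, 0x111, 0x111]   # one word with the lowest bit cleared
    for err in find_pattern_errors([good, bad]):
        print("event %d, word %d, flipped bits 0x%03x" % err)
```

Reporting the flipped-bit mask rather than just the bad value makes repeatable corruption (the same bit, the same word position) easier to spot.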
Continuity
I have checked that the BCID’s (Bunch Crossing ID) from the OC and PU agree with those of the FEB (except for a consistent difference of 1)
The BCID is reset at ~0xdef / 0xde7 and one has to take that into consideration when making comparisons
All BCID’s originate from the same “BCRST” from the TTC
The trigger is synchronous with the 40 MHz clock (either from the PTG or a clock divider)
At trigger rates up to 100 kHz, we see regular intervals of BCID’s between consecutive events (such as 400 for ~100 kHz),
e.g. BCID(i+1) – BCID(i) = 400 (in this example, after taking care of the 0xdef resetting etc.)
→ No missing or skipped events
At trigger rates above 100 kHz, we occasionally see irregular intervals of BCID’s
→ There are events skipped by the FEB because the FEB needs ~9.6 µs for digitization
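The continuity check described above can be sketched in a few lines. With a 40 MHz clock and a regular 100 kHz trigger, consecutive events should be 400 bunch crossings apart, and the difference has to be taken modulo the BCID reset period. The reset period is passed in as a parameter because the slide only gives it approximately (~0xdef / 0xde7); the value `0xDEF + 1` used in the example is an assumption, and the function names are hypothetical.

```python
# Sketch of a BCID continuity check with wrap-around handling.
CLOCK_HZ = 40_000_000   # 40 MHz clock, from the slide


def expected_interval(trigger_hz, clock_hz=CLOCK_HZ):
    """Bunch crossings between consecutive regular triggers."""
    return round(clock_hz / trigger_hz)


def find_discontinuities(bcids, interval, period):
    """Return (index, step) wherever BCID(i+1) - BCID(i) != interval.

    The step is computed modulo `period` so that a BCID reset between
    two events does not count as a discontinuity.
    """
    bad = []
    for i in range(len(bcids) - 1):
        step = (bcids[i + 1] - bcids[i]) % period
        if step != interval:
            bad.append((i, step))
    return bad


if __name__ == "__main__":
    period = 0xDEF + 1                    # assumed reset period
    interval = expected_interval(100e3)   # 400 for 100 kHz
    # Third value follows a BCID reset; last value skips one trigger.
    bcids = [3000, 3400, 232, 632, 1432]
    print(find_discontinuities(bcids, interval, period))
```

A step of twice the expected interval, as in the last pair of the example, is what one skipped event would look like under these assumptions.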
Incidences of discontinuities vs trigger rate
At trigger rates above 100 kHz, the FEB is quite unstable, but I have managed to measure at certain rates
Apparently, events are simply skipped at higher rates, but the data do not seem to be overwritten
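The ~9.6 µs digitization time quoted earlier suggests where skipping should set in. The sketch below treats the FEB as busy for a fixed time after each trigger and assumes regular trigger spacing; this is a deliberately simple dead-time model for illustration, not a description of the actual FEB logic, and the function names are hypothetical.

```python
# Simple dead-time estimate from the digitization time quoted on the slide.
DIGITIZATION_S = 9.6e-6   # ~9.6 us, from the slide (approximate)
CLOCK_HZ = 40e6           # 40 MHz clock


def max_rate_hz(busy_s=DIGITIZATION_S):
    """Highest regular trigger rate with no trigger arriving while busy."""
    return 1.0 / busy_s


def min_spacing_bc(busy_s=DIGITIZATION_S, clock_hz=CLOCK_HZ):
    """Minimum trigger spacing in bunch crossings."""
    return busy_s * clock_hz


def feb_can_follow(trigger_hz, busy_s=DIGITIZATION_S):
    """True if regular triggers at this rate never hit a busy FEB."""
    return trigger_hz <= max_rate_hz(busy_s)


if __name__ == "__main__":
    print(round(max_rate_hz() / 1e3, 1), "kHz")
    print(round(min_spacing_bc()), "bunch crossings")
    print(feb_can_follow(100e3), feb_can_follow(110e3))
```

In this model the limit sits slightly above 100 kHz, which is at least consistent with discontinuities appearing only once the rate is pushed past that point.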
[Plot: incidences of discontinuities (0 to 400) vs trigger rate (98 to 118 kHz)]
Comparison of noise measurements
Noise RMS of the pedestals
Not surprising probably because our events are separated by the same time interval and there is no overlap between events
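For reference, the pedestal mean and noise RMS per channel can be computed as below. This is a minimal sketch with a hypothetical data layout (a mapping from channel number to a list of ADC samples taken without signal); it uses the population standard deviation as the RMS about the mean, which is an assumption about the convention, not something stated in the talk.

```python
# Minimal sketch: pedestal mean and noise RMS per channel.
# The input layout (dict of channel -> list of ADC counts) is hypothetical.
from statistics import fmean, pstdev


def pedestal_stats(samples_by_channel):
    """Return {channel: (mean, rms)} for pedestal samples.

    The RMS is taken about the mean (population standard deviation).
    """
    return {
        channel: (fmean(samples), pstdev(samples))
        for channel, samples in samples_by_channel.items()
    }


if __name__ == "__main__":
    run = {0: [100, 102, 98, 100], 1: [250, 250, 250, 250]}
    for channel, (mean, rms) in sorted(pedestal_stats(run).items()):
        print("channel %d: pedestal %.2f, noise RMS %.3f" % (channel, mean, rms))
```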
Overnight long runs and “free service”
I have run the system (ROD-Glink-FEB) overnight
4 bad events out of ~2×10^8 events (where the data recovered in the next event)
~10K bad events out of ~2×10^8 events when the data don’t recover immediately after one bad event
We sometimes even provide a free service for people outside BNL to use our entire system to make sophisticated measurements, though I personally would prefer that our service be not free but more profitable.
A few weeks ago, we set up the necessary software for E. Ladygin to make various measurements for his pre-shaper in the HEC. We provided the script and he could just run it to take millions of events using our setup (ROD, FEB etc.)
It shows how robust and reliable our system is.
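To put the overnight-run numbers quoted above in perspective, the bad-event fractions can be worked out directly. The sketch reads the total event count as roughly 2×10^8 (an interpretation of the garbled figure on the slide) and takes “~10K” as 10^4; both are approximate, so the results are order-of-magnitude estimates only.

```python
# Order-of-magnitude bad-event fractions for the overnight run.
# Assumptions: ~2e8 events in total, "~10K" read as 1e4.
TOTAL_EVENTS = 2e8


def bad_fraction(n_bad, n_total=TOTAL_EVENTS):
    """Fraction of bad events in a run."""
    return n_bad / n_total


if __name__ == "__main__":
    print("isolated bad events: %.1e" % bad_fraction(4))
    print("non-recovering case: %.1e" % bad_fraction(1e4))
```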
[Plot: Equivalent noise current of PZ v.1. ENI (nA, 0 to 40) vs channel # (0 to 64); Shaper HI gain; QC limitation level = 33 nA]
E. Ladygin’s noise measurement at BNL for his pre-shaper
This includes measurements of the amplitudes of the pulses, averages and RMS’ of the pedestals for all 64 channels etc.
Summary
In the past ~year, we have set up, integrated and tested a system with the ROD at the end of the chain that can be used to take serious calibration/physics measurements.
We have done a lot of debugging and got help from the designers of all the different boards in the LArG/ATLAS collaboration.
It has now reached a state that is stable and robust enough for sophisticated measurements, even by people outside.