Some Challenges in Signal Processing and Computing in Astrophysics
Yashwant Gupta
National Centre for Radio Astrophysics, Tata Institute of Fundamental Research
Scientific Discovery Through Intensive Data Exploration, JNCASR, Bangalore, 4th February 2011


Introduction : Different Ways of Scientific Discovery

Computational Astrophysics :
• Numerical Simulations
• Large Volume Data Analysis (offline)
• Real-time Data Processing (Instrumentation)

Courtesy : A.H.Nelson, Cardiff University

Numerical Simulations in Astrophysics

Examples of simulations :
• Large scale structure of the Universe
• Formation & interaction of galaxies
• Formation of stars and solar systems
• Complex structure of the Sun

Common features :
• Involve up to 10^9 or more “particles”
• Complex interactions between particles : gravity, magneto-hydrodynamics etc.
• Long time-scale evolution, many steps

Large scale Universe : Complex web of cold dark matter

Collision of 2 gas clouds leading to formation of stars

Galaxy formation & evolution

Some large cluster computer systems in astrophysics

• Swinburne University of Technology, Melbourne, Australia : The Swinburne Green machine
  145 nodes, dual processor quad-core Dell 1950 (# of cores = 1160)
  10.8 TFlops @ 46 kW
• Canadian Institute for Theoretical Astrophysics (CITA) : The CITA cluster
  200 nodes, dual processor quad-core Dell 1950 (# of cores = 1600)
  15 TFlops @ 64 kW
• Harish-Chandra Research Institute (HRI), Allahabad
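
The core counts and power figures quoted for the first two clusters can be cross-checked with a few lines of arithmetic. This is only a sketch: the dictionary layout and the helper name `summarise` are illustrative, and the per-core and per-watt numbers are derived from the slide's figures rather than quoted from it.

```python
# Figures quoted on the slide for the two clusters with full specifications.
clusters = {
    "Swinburne Green machine": {"nodes": 145, "tflops": 10.8, "kilowatts": 46},
    "CITA cluster": {"nodes": 200, "tflops": 15.0, "kilowatts": 64},
}
CPUS_PER_NODE = 2  # "dual processor"
CORES_PER_CPU = 4  # "quad-core"


def summarise(spec):
    """Return (cores, GFlops per core, MFlops per watt) for one cluster."""
    cores = spec["nodes"] * CPUS_PER_NODE * CORES_PER_CPU
    gflops_per_core = spec["tflops"] * 1e3 / cores
    mflops_per_watt = spec["tflops"] * 1e6 / (spec["kilowatts"] * 1e3)
    return cores, gflops_per_core, mflops_per_watt


for name, spec in clusters.items():
    cores, per_core, per_watt = summarise(spec)
    print(f"{name}: {cores} cores, {per_core:.2f} GFlops/core, {per_watt:.0f} MFlops/W")
```

Both machines come out at roughly 9.3 GFlops per core and about 235 MFlops per watt, as expected for identical node hardware.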

Computational Astrophysics

• Numerical Simulations
• Large Volume Data Analysis (offline)
• Real-time Data Processing (Instrumentation)

Radio Astronomy (study of the Universe at radio wavelengths)

A Basic Radio Telescope

For high sensitivity (to see faint sources out to the distant reaches of the Universe) :
• large dishes (several 10s of metres in diameter)
• high quality, low noise electronics in the receivers
• large bandwidth of observation
• long integration times of observation

THE CHALLENGE :
• Celestial radio signals are VERY weak ; unit of flux used is : 1 Jy = 10^-26 W / m^2 / Hz
• Input radio power into a typical telescope is ~ -100 dBm ! (would take 1000 years of continuous operation to collect 1 milliJoule of energy !!)

Large volumes of data
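
To give a feel for how little power this is, the sketch below converts dBm to watts and estimates the time needed to collect 1 mJ. The function names are illustrative. With exactly -100 dBm the answer is a few hundred years, the same order of magnitude as the slide's round figure of 1000 years; the exact number depends on the input power assumed (about -105 dBm would give 1000 years).

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def dbm_to_watts(p_dbm):
    """Convert a power level in dBm (decibels relative to 1 mW) to watts."""
    return 1e-3 * 10 ** (p_dbm / 10.0)


def years_to_collect(energy_joules, p_dbm):
    """Years of continuous operation needed to collect the given energy."""
    return energy_joules / dbm_to_watts(p_dbm) / SECONDS_PER_YEAR


power_w = dbm_to_watts(-100.0)          # about 1e-13 W
years = years_to_collect(1e-3, -100.0)  # time to collect 1 milliJoule
print(f"{power_w:.1e} W, {years:.0f} years")
```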

Single Dish versus Array Telescopes

• Resolution and sensitivity depend on the physical size (aperture) of the radio telescope
• Due to practical limits, fully steerable single dishes of more than ~ 100 m diameter are very difficult to build
  resolution (λ / D) ~ 0.5 degree at 1 metre wavelength (very poor compared to optical telescopes)
• To synthesize telescopes of larger size, many individual dishes spread out over a large area on the Earth are used
• Signals from such array telescopes are combined and processed in a particular fashion to generate a map of the source structure : EARTH ROTATION APERTURE SYNTHESIS
  resolution = λ / D_s , D_s = largest separation
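
The λ / D rule of thumb is easy to evaluate. The sketch below applies it to a 100 m single dish and to a 28 km synthesized aperture (the size this deck quotes for the GMRT), both at 1 m wavelength. The function name is illustrative, and the formula ignores order-unity factors that depend on the aperture illumination.

```python
import math


def resolution_deg(wavelength_m, aperture_m):
    """Diffraction-limited angular resolution, lambda / D, in degrees."""
    return math.degrees(wavelength_m / aperture_m)


single_dish_deg = resolution_deg(1.0, 100.0)        # 100 m dish at 1 m wavelength
array_arcsec = resolution_deg(1.0, 28_000.0) * 3600  # 28 km separation at 1 m

print(f"single dish: {single_dish_deg:.2f} deg, array: {array_arcsec:.1f} arcsec")
```

The single dish gives about 0.57 degree, matching the "~ 0.5 degree" on the slide, while the 28 km separation gives about 7 arcseconds.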

The new 100-m Green Bank Telescope

The Very Large Array Telescope

A typical modern radio telescope : The GMRT

• The Giant Metre-wave Radio Telescope (GMRT) is a new, world class instrument for studying astrophysical phenomena at low radio frequencies (50 to 1450 MHz)
• Designed and built primarily by NCRA, a national centre of TIFR.
• Array telescope consisting of 30 antennas of 45 metres diameter, operating at metre wavelengths -- the largest in the world at these frequencies

14 km

1 km x 1 km

Location and Configuration of the GMRT

• 30 dishes; 45 m diameter
• 12 dishes in central compact array
• Remaining along 3 arms of Y-array
• Total extent : 14 km radius; resolution of a 28 km size antenna is achieved
• Latitude : 19 deg N
• Longitude : 74 deg E
• About 70 km N of Pune, 160 km E of Mumbai.

Radio Interferometry & Aperture Synthesis

• Signals from a pair of antennas are cross-correlated (cross-spectrum is obtained)
• This functions like a Young’s double slit, multiplying the sky brightness distribution by a sinusoidal response pattern
• Thus, an interferometer measures one Fourier component of the image
• From measurements using different pairs of antennas, several Fourier components of the image are obtained
• Inverse Fourier transform of the combined “visibilities” gives a reconstruction of the original image : aperture synthesis
• The signal processing has both real time and off-line components
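
The cross-correlation step for one antenna pair can be sketched in the "FX" order: Fourier transform each signal block (F), then multiply one spectrum by the conjugate of the other and accumulate (X). This is a toy illustration, not the GMRT implementation: the signals are simulated noise, the 5-sample delay, block sizes and all function names are assumptions, and a slow direct DFT stands in for an FFT to keep the example dependency-free.

```python
import cmath
import random

N_CHAN, N_BLOCKS, DELAY = 64, 16, 5  # toy sizes, chosen for illustration


def dft(x):
    """Direct discrete Fourier transform (O(n^2), fine for a small demo)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft()."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def fx_correlate(sig_a, sig_b, n_chan):
    """F: transform each block; X: accumulate conj(A) * B per frequency channel."""
    n_blocks = len(sig_a) // n_chan
    acc = [0j] * n_chan
    for b in range(n_blocks):
        fa = dft(sig_a[b * n_chan:(b + 1) * n_chan])
        fb = dft(sig_b[b * n_chan:(b + 1) * n_chan])
        for k in range(n_chan):
            acc[k] += fa[k].conjugate() * fb[k]
    return [v / n_blocks for v in acc]


random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(N_CHAN * N_BLOCKS + DELAY)]
sig_a = noise[DELAY:]   # antenna A sees the wavefront first
sig_b = noise[:-DELAY]  # antenna B sees the same signal DELAY samples later

cross_spectrum = fx_correlate(sig_a, sig_b, N_CHAN)
lags = idft(cross_spectrum)  # back to the lag domain
lag = max(range(N_CHAN), key=lambda t: lags[t].real)
print("recovered delay (samples):", lag)
```

The peak of the lag-domain correlation sits at the delay between the two antennas, which is the quantity that delay correction and fringe stopping remove in a real correlator.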

GMRT Receiver : Digital Back-ends

Main components :
• FX Correlator
• Pulsar Receiver
(can operate simultaneously)

• Common signal processing stages : sampling, delay correction, fringe stopping & FFT
• Input data rate : 1.9 Gsamples/s
• Output data rate : few Mbytes/s
• Total compute power : ~ 150 GFlops !
• Uses mostly ASICs + some FPGAs

The difficulties of pulsar searching

Requires large volumes of data to be handled & processed :
• 10 mins of data from the GMRT is one data file ~ 4 GBytes in size
• In one observing session, there may be ~ 100 such data files : ~ 400 GBytes per night !

It is a highly compute intensive job :
• Collapsing multi-channel data into single channel data with different dispersion delays, e.g. 4 GB data explodes to 16 GB data
• Searching each of the above time series data for presence of periodic pulsar signals, using spectral domain search algorithms (4M point FFT + analysis algorithms)
• Sorting and classification of results to identify the best candidates

Full analysis of a 10 min data stretch (~ 4 GBytes) takes ~ 15 hours on a single 3.6 GHz Xeon processor and requires ~ 16 GBytes of intermediate data storage

However, the task is highly amenable to parallelisation !
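
The two compute-heavy steps, dedispersion at many trial dispersion measures (DMs) followed by a spectral search of each resulting time series, can be sketched on toy data. Everything specific here is an assumption made for illustration: the 1000 to 1350 MHz band, 1 ms sampling, noise-free one-sample pulses, the five trial DMs, and the crude score (total power outside the zero-frequency bin). A direct DFT stands in for the 4M point FFT. Only the cold-plasma dispersion constant is a standard value.

```python
import cmath

K_DM = 4.148808e3  # dispersion constant, s MHz^2 pc^-1 cm^3


def delay_samples(dm, f_mhz, f_ref_mhz, t_samp):
    """Dispersion delay of channel f_mhz relative to f_ref_mhz, in samples."""
    return round(K_DM * dm * (f_mhz ** -2 - f_ref_mhz ** -2) / t_samp)


def make_data(freqs, dm, period, n_samp, t_samp):
    """Noise-free toy data: periodic one-sample pulses, delayed per channel."""
    f_ref = max(freqs)
    data = []
    for f in freqs:
        d = delay_samples(dm, f, f_ref, t_samp)
        data.append([1.0 if (t - d) % period == 0 else 0.0 for t in range(n_samp)])
    return data


def dedisperse(data, freqs, dm, t_samp):
    """Undo the trial-DM delay in each channel, then collapse to one time series."""
    n = len(data[0])
    f_ref = max(freqs)
    series = [0.0] * n
    for chan, f in zip(data, freqs):
        shift = delay_samples(dm, f, f_ref, t_samp)
        for t in range(n):
            series[t] += chan[(t + shift) % n]
    return series


def power_spectrum(series):
    """One-sided power spectrum via a direct DFT (an FFT would be used in practice)."""
    n = len(series)
    return [abs(sum(series[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) ** 2 for k in range(n // 2 + 1)]


def search(data, freqs, trial_dms, t_samp):
    """Return (best DM, fundamental bin) using total non-DC power as the score."""
    best = None
    for dm in trial_dms:
        spec = power_spectrum(dedisperse(data, freqs, dm, t_samp))
        score = sum(spec[1:])
        if best is None or score > best[0]:
            peak = max(spec[1:])
            fundamental = next(k for k in range(1, len(spec)) if spec[k] >= 0.5 * peak)
            best = (score, dm, fundamental)
    return best[1], best[2]


N_SAMP, T_SAMP, PERIOD = 256, 1e-3, 64
freqs = [1000.0 + 50.0 * i for i in range(8)]  # channel centres in MHz
data = make_data(freqs, dm=10.0, period=PERIOD, n_samp=N_SAMP, t_samp=T_SAMP)
best_dm, k = search(data, freqs, [0.0, 5.0, 10.0, 15.0, 20.0], T_SAMP)
period_samples = N_SAMP / k
print(best_dm, period_samples)
```

Each trial DM produces its own full-length time series, which is why the intermediate data volume grows and why the job parallelises so naturally: every trial DM, and every data file, can be processed independently.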

Moving to real-time data processing…

Computational Astrophysics :
• Numerical Simulations
• Large Volume Data Analysis (offline)
• Real-time Data Processing (Instrumentation)

Radio Astronomy (study of the Universe at radio wavelengths)

GMRT Receiver : Digital Back-ends

Main components :
• FX Correlator
• Pulsar Receiver
(can operate simultaneously)

• Common signal processing stages : sampling, delay correction, fringe stopping & FFT
• Input data rate : 1.9 Gsamples/s
• Output data rate : few Mbytes/s
• Total compute power : ~ 150 GFlops !
• Uses mostly ASICs + some FPGAs

Can this data processing be done in REAL-TIME on a MULTI-CORE COMPUTE CLUSTER ?

• Computing resources are off-the-shelf components : can put together as many as needed to meet the real-time requirements !
• Highly flexible : can change parameters (e.g. frequency resolution) and algorithms (e.g. polyphase filter bank instead of FFT) almost at will
• Can do full floating point calculations : better accuracy, more dynamic range, better protection against interference signals
• Ability to add new, sophisticated algorithms, e.g. to filter out interference signals

We have recently completed such a software based back-end for the GMRT

Software Based Back-ends

Basic Architecture :
• A 32-channel back-end (32 antennas, single polarisation) using 16 compute nodes, connected over Gigabit Ethernet

Node configuration :
• Quad core, dual processor Intel Xeon CPUs
• 2 GB RAM, 1 TB SATA RAID storage
• Dual Gigabit Ethernet ports
• 8-bit, 4 Channel, 100 MSPS, PCI-X compliant ADC card

A Software Back-end for the GMRT

Two Modes of Operation :
1. Real-time data acquisition + correlation
2. Real-time data acquisition + writing to disks (on each node); offline read-back and correlation

Jayanta Roy et al (2010)

Block Diagram of the GMRT Software Back-end

Jayanta Roy et al (2010)

Basic Methodology :
• Run synchronous sampling on all 8 ADC boards (32 antennas), 16/32 MHz BW
• Transfer data from ADC board to CPU unit via interrupt driven DMA over PCI bus in large blocks (32 MB size : 8 MB per antenna)
• Distribute data from all antennas (using time division multiplexing) to all nodes -- each node handles 1/8 time slice from each block
• Carry out FFT, fringe stop, MAC and other required operations at each node
• Record integrated visibility results to local disk on each node, or send them to “collector nodes”
• Optimise all the operations to meet real-time processing requirements
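
The time-division multiplexing step can be illustrated with a toy model: every block is cut into 8 time slices, each node receives its slice from all antennas, forms the multiply-accumulate (MAC) products for every antenna pair, and the per-node results are summed. The sketch is deliberately simplified and is not the actual back-end code: it uses 4 antennas and 64 samples, skips the FFT and fringe-stopping stages, and all function names are illustrative.

```python
import random

N_NODES = 8  # "each node handles 1/8 time slice from each block"


def time_slices(block, n_nodes=N_NODES):
    """Cut a block {antenna: samples} into n_nodes slices; each keeps all antennas."""
    n_samples = len(next(iter(block.values())))
    step = n_samples // n_nodes
    return [{ant: samples[i * step:(i + 1) * step] for ant, samples in block.items()}
            for i in range(n_nodes)]


def mac(chunk):
    """Multiply-accumulate for every antenna pair (auto-correlations included)."""
    ants = sorted(chunk)
    return {(a, b): sum(x * y for x, y in zip(chunk[a], chunk[b]))
            for i, a in enumerate(ants) for b in ants[i:]}


def combine(partials):
    """Sum the per-node partial results, as a collector node would."""
    total = {}
    for partial in partials:
        for pair, value in partial.items():
            total[pair] = total.get(pair, 0) + value
    return total


random.seed(0)
# Toy block: 4 antennas, 64 samples each, with 8-bit signed sample values.
block = {ant: [random.randint(-128, 127) for _ in range(64)] for ant in range(4)}

per_node = [mac(chunk) for chunk in time_slices(block)]  # work done on each node
visibilities = combine(per_node)
print(len(per_node), "nodes,", len(visibilities), "antenna pairs")
```

Because accumulation is a plain sum, splitting along the time axis leaves the final result unchanged. For 32 antennas the same scheme yields 32 x 33 / 2 = 528 pairs per frequency channel.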

A Software Back-end for the GMRT

Possibilities for Multi-beaming of the GMRT :
• For raw data recording mode, total data rate to disk is 1.8 Gsamples / sec ~ 500 Mbytes/sec (at 2 bits / sample)
• Transfer recorded data to a large compute cluster for off-line analysis : e.g. to the cluster at Pune campus, using a dedicated 4 Gbps link over fibre from GMRT to Pune
• Run off-line analysis to do the correlations and also to produce multiple phased array beams (total # required is ~ few hundred beams)
• Carry out the pulsar search analysis for each beam output : ~ 100 x increase in the computation load, for data acquired in the same duration !

Running this application is a major CHALLENGE for High Performance Computing !!
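
A quick check of the raw-recording numbers, as a sketch: the sample rate and bit depth are the slide's figures, the 4 Gbps link speed is the one named on the off-line computing slide, and the function name and the 10 minute example are illustrative.

```python
def bytes_per_second(samples_per_s, bits_per_sample):
    """Raw data rate in bytes per second."""
    return samples_per_s * bits_per_sample / 8


raw_rate = bytes_per_second(1.8e9, 2)   # 1.8 Gsamples/s at 2 bits per sample
link_needed_gbps = raw_rate * 8 / 1e9   # sustained link speed to keep up in real time
link_load = link_needed_gbps / 4.0      # fraction of a 4 Gbps link
ten_minutes_gb = raw_rate * 600 / 1e9   # raw volume recorded in 10 minutes

print(f"{raw_rate / 1e6:.0f} MB/s, {link_needed_gbps:.1f} Gbps, "
      f"{link_load:.0%} of link, {ten_minutes_gb:.0f} GB per 10 min")
```

The exact rate is 450 MB/s (the slide rounds to ~ 500), which needs 3.6 Gbps, so a 4 Gbps link can just carry the raw data at the rate it is recorded.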

Offline Computing for Software Back-end

• Data acquisition + real-time data recording (32 node quad-core dual-CPU cluster at the GMRT)
• Dedicated 4 Gbps link
• Off-line data read-back + distributed data analysis (NCRA Pune Main Compute Cluster)

Another Software Correlator : LOFAR + Blue Gene

The LOFAR Radio Telescope : A pan-European effort

LOFAR + Blue Gene

The IBM “Blue Gene” Supercomputer

The LOFAR Radio Telescope uses an off-the-shelf supercomputer (the Blue Gene from IBM) to implement a correlator !!

The LOFAR Correlator : Layout

The LOFAR Correlator : Specifications

Inner core

Station

Wide-angle radio camera + radio “fish-eye lens”

Future Projections : The SKA

• The Square Kilometre Array (SKA) -- next big step in Radio Astronomy (an international telescope)
• Total collecting area of 1 million sq. metres (about 30 times the GMRT) !
• Will be spread over a much larger area : ~ thousand km !! (continental size)

Future Projections : The SKA

• Will have large range of frequencies and bandwidths
• To be completed in 2020
• Estimated cost : 1 billion dollars !
• Will require astronomical signal processing : PETA FLOPS !!

The LOFAR Radio Telescope

HPC for modern Radio Telescopes

Summary

• Data Intensive computing is of great importance in astrophysics for SIMULATIONS, OFF-LINE DATA ANALYSIS and REAL-TIME SIGNAL PROCESSING
• Multicore compute clusters with ~ 10s of TFlops capacity for exclusive use by astronomers (for off-line processing) are becoming quite common
• Radio Astronomy involves long duration observations of very faint radio signals from celestial objects, using sensitive telescopes with large bandwidths. It requires significant amounts of real-time signal processing and computing to make the final images from the telescopes
• A large, modern radio telescope like the GMRT requires ~ 150 GFlops of computing on data coming in at ~ 2 Gsamples/sec
• Doing the real-time signal processing using compute clusters has significant advantages, and it is now becoming technologically feasible; e.g. the GMRT software back-end
• New telescopes being made will require ~ 500 TFlops with data rates of 600 Gsamples/s
• The SKA -- the discovery instrument of the future, will ensure that radio astronomy requirements for real-time and off-line processing remain on the cutting edge of technology !

Thank You

CMB is a tool to study Cosmology on largest scales

Linear transforms : linear algebra on large bases, for each frequency channel
• Time → sky : N_t (10 G) → N_pix (10 M) (data compression)
• Sky → harmonic : N_pix → l_max (few k)

Ongoing Planck analysis effort :
• Joint estimation of C_l and BipoSH coefficients using Gibbs sampling
• 90k CPU hours (30 Tflops-days/year)
• Storage : 10 Tb/year

Current angular power spectrum (figure; labels : 3rd, 4th, 5th and 6th peaks; ML (optimal), Qest (suboptimal))

Image Credit: NASA / WMAP Science Team

GW Astronomy with Intl. Network of GW Observatories

LIGO-LLO: 4km
LIGO-LHO: 2km, 4km
GEO: 0.6km
VIRGO: 3km
TAMA: 0.3km
LIGO-Australia ?
LIGO-India ?

• Detection confidence
• Source direction
• Polarization info.

Indian Initiative in Gravitational-wave Observations (IndIGO) : www.gw-indigo.org

Primary Science : Online coherent search for GW signal from binary mergers using data from global detector network

Role of IndIGO data centre :
• Large Tier-2 data/compute centre for archival of g-wave data and analysis
• Bring together data-analysts within the Indian gravity wave community.
• Puts IndIGO on the global map for international collaboration with LIGO Science Collab. wide facility. Part of LSC participation from IndIGO

100 Tflops = 8500 cores x 3 GHz/core
Need 8500 cores to carry out a half decent coherent search for gravitational waves from compact binaries.
(1 Tflop = 250 GHz = 85 cores x 3 GHz / core)
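
The core count follows from simple arithmetic once a per-core throughput is fixed. The sketch below assumes 4 floating point operations per clock cycle, which is what "1 Tflop = 250 GHz" implies; that factor and the function name are assumptions made here, not figures stated on the slide.

```python
import math

FLOPS_PER_CYCLE = 4  # assumption implied by "1 Tflop = 250 GHz"


def cores_needed(tflops, clock_ghz=3.0):
    """Smallest number of cores that delivers the requested peak Tflops."""
    gflops_per_core = clock_ghz * FLOPS_PER_CYCLE
    return math.ceil(tflops * 1e3 / gflops_per_core)


cores_for_1_tflop = cores_needed(1)      # slide rounds this to 85
cores_for_100_tflops = cores_needed(100)  # slide rounds this to 8500
print(cores_for_1_tflop, cores_for_100_tflops)
```

Under this assumption the exact figures are 84 and 8334 cores, which the slide rounds up to 85 and 8500.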

Storage: 4x100TB per year per interferometer.

Network: gigabit backbone, National Knowledge Network.

Courtesy: Anand Sengupta, IndIGO

IndIGO Data Centre @ IUCAA

Time Domain Astronomy

• Real-time search in and characterization of 4-D data arrays : position (x,y), colour (wavelength), time
• Large surveys are in the offing at many places across the world including India. Indian interest will include the processing of data from local as well as some of the international facilities
• Need both large network bandwidth and high compute power
• Most of the computation will be for automatic detection, characterization and classification of transient events, on which the decision of follow-up operations will be based

Theoretical Astrophysics : Near and Medium term projections in India

• MHD and Hydrodynamic simulations at progressively higher resolution (sun, galactic dynamos, accretion flows, jets, cosmological structure formation…)
• Cosmological and stellar N-body simulations
• Monte-Carlo and Matrix-based radiative transfer problems (comptonization, cyclotron resonant scattering, solar and planetary atmospheres…)
• Molecular structure of dust grains and their optical properties; radiative transfer in dusty media; astrochemistry

A net computational need of a few tens of teraflops is envisaged for these problems in India over the next few years