HiPEAC info 40

NETWORK OF EXCELLENCE ON HIGH PERFORMANCE AND EMBEDDED ARCHITECTURE AND COMPILATION

THE 10TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE AND EMBEDDED ARCHITECTURES AND COMPILERS (HIPEAC 2015), 19-21 JANUARY 2015, AMSTERDAM, THE NETHERLANDS

WELCOME TO THE AUTUMN COMPUTING SYSTEMS WEEK, 8-10 OCTOBER 2014, ATHENS, GREECE

WWW.HiPEAC.NET

Appears quarterly, October 2014


CONTENT

hipeac activity
ACACES 2014 REPORT

hipeac announce
BOOK: MICROARCHITECTURE OF NETWORK ON CHIP ROUTERS: A DESIGNER'S PERSPECTIVE
HIPEAC 2015 FROM 19 TO 21 JANUARY IN AMSTERDAM, THE NETHERLANDS

hipeac news
PARVEC: VECTORIZED PARSEC BENCHMARKS
SUPEROPTIMIZATION: A FEASIBILITY STUDY
WORKSHOP ON SOFTWARE TOOLS FOR NEXT GENERATION COMPUTING
SOME NEWS FROM MAXELER TECHNOLOGIES

hipeac startup news
LOWRISC - AN SOC THAT IS OPEN TO THE CORE

in the spotlight
TETRACOM COMPLETES 1ST CALL FOR TECHNOLOGY TRANSFER PROJECTS

hipeac students
COLLABORATION GRANT REPORT - ATHENA ELAFROU
COLLABORATION GRANT REPORT - MARIUS ENACHESCU

phd news

upcoming events

intro

MESSAGE FROM THE HIPEAC COORDINATOR

I hope you have enjoyed a relaxing summer break with your family and friends. This summer, one small news item caught my attention. On June 7, there was a Turing test competition organized by the University of Reading, in which a Russian chatterbot convinced 33% of the judges that it was human. The organizers claim that this was the first time that the Turing test was passed, but this claim was not generally accepted. It is not especially important whether or not the Turing test was passed. What mattered to me was that it clearly shows that the artificial intelligence community is making significant progress. In 2011, Watson won Jeopardy, beating the human Jeopardy champions. Recently it became known that Watson is contributing to oncology research by searching and automatically interpreting the fast-growing body of oncology literature. IBM recently announced its SyNAPSE chip, a brain-inspired computer architecture powered by 1 million neurons and 256 million synapses. With 5.4 billion transistors and 4096 neurosynaptic cores, it is the largest chip IBM has ever built. And of course, everybody knows about the worldwide race to develop a driverless car.

For sure, cognitive computing will become the next big thing in computing.

In July, 172 participants enjoyed the annual ACACES summer school in Fiuggi, Italy. To celebrate ten years of ACACES, this year’s technical program was delivered by instructors who had taught in previous summer schools. It was nice to meet them again after a few years, and they were also happy to be part of the summer school again. Soon we will start the preparations for the eleventh edition of the summer school.

From October 8 to 10, we will organize the fall Computing Systems Week in Athens. As part of this event, we will also organize the third HiPEAC Industry Partner Program, in which we bring industry and academia together to think about common challenges. This Computing Systems Week is the ideal event to form consortia for the ICT4 call on customized and low-power computing (deadline April 2015).

In January 2015, the HiPEAC Conference will take place in Amsterdam. This is the fourth year in which we have outsourced the reviewing process for the conference to ACM TACO, and we again received many high-quality submissions for this year’s conference. We already have 16 papers accepted, and 38 revised papers are currently being reviewed. The conference itself will be a major networking and recruiting event for the computing systems community in Europe. Throughout the conference, 32 associated workshops, tutorials and special sessions will be organized in parallel with the main paper track, and all HiPEAC companies and FP7 computing systems projects will be invited to promote their research and activities. We hope the HiPEAC2015 conference will once more be one of the top networking events for the HiPEAC community.

Take care,
Koen De Bosschere


intro

MESSAGE FROM THE PROJECT OFFICER

The end of the summer brought us a new European Commission and a new Commissioner for the Digital Economy and Society, Günther Oettinger. The new Commission has been defined as “a strong and experienced team standing for change”, ready for the big challenges Europe is facing in all areas, including research and technology. Horizon 2020 is now running at full speed, with its strong coupling between research and innovation, and its emphasis on excellent science, industrial leadership and tackling societal challenges. We all know that science, innovation and technology are very important for the future of Europe; what is at stake is not only our financial future and the employment prospects of our kids, but the capacity of Europe to stay relevant and maintain its role in this connected world, where the “old continent” could easily become irrelevant, economically and strategically, in just a few years. If we value the European model of a society where only old people remember what war is, and where welfare and public services are available for everybody, then we, people active in science and technology, should roll up our sleeves and start working hard, because Europe can stay relevant only if it is a leader in scientific knowledge, innovation and technology.

You already know that the European Commission is trying hard to push the research community towards more innovation. Starting from September, a new initiative will try to better support innovators in our research community. It is called “Innovation Radar”, and its objective is to identify the high-potential innovators and innovations within EU-funded projects, together with their specific “go to market” needs. In practical terms, during the project review process, a questionnaire about innovation activities will be used to gather information about the innovation profile of the project; these data will then be used to identify the most promising innovators and to put them in contact with an EU-funded support action, which will help the innovators in the “go to market” effort. The support may be expertise on networking, market, finance, intellectual property and skills, and can often be delivered via other EU-funded or EU-endorsed support actions.

Another very important financial tool has been introduced recently, and is called FTI, Fast Track to Innovation. You can find the general description at http://europa.eu/rapid/press-release_MEMO-14-492_en.htm and, more importantly, you can already find an FTI pilot in the 2015 work programme: http://ec.europa.eu/research/participants/portal/desktop/en/opportunities/h2020/topics/9096-ftipilot-1-2015.html. FTI offers grants to innovative businesses and organisations in order to give a final push to get great ideas to market. Proposals are not limited to ICT but can span all enabling and industrial technologies, so you can expect very strong competition for those grants, but FTI should really be considered a great opportunity for moving the results of research projects to the market.

Things are changing, and new opportunities arise all the time. I really hope that the HiPEAC community will be up to the challenge of supporting scientists in their research activities, but also in boosting the impact of their work; you deserve it, because understanding the world is an exciting objective for any good researcher, but changing the world can be even more exciting.

Sandro D’Elia


hipeac activity

ACACES 2014 REPORT

The ACACES summer school is an event where researchers in computer architecture, compilers and programming languages can follow high-quality lectures, and also meet and interact with other researchers working in similar areas. We would like to start by saying that participating in the ACACES summer school was a fantastic experience.

ACACES offers a large number of courses to choose from, which gave us the opportunity to select courses that were closely related to our research. The courses were well thought out and well organized. As it was the tenth ACACES summer school, the organizers wanted to give it a celebratory tone, and decided to invite teachers who had taught in previous ACACES summer schools. The result was outstanding, as most of the teachers were well aware of the audience they were teaching. The lecturers were acclaimed professors and accomplished industry researchers. Meeting those people and hearing their stories was very inspiring. In addition to the courses, two keynote speakers gave motivating talks about research on big data and processors in the aerospace industry. The poster session was a nice opportunity to get acquainted with different topics and the work of other talented students. The event presented a fantastic chance for those who were stuck in their work and needed help, as well as for those who were seeking collaboration opportunities. All in all, the networking possibilities were abundant.

The location of the school was very beautiful. Fiuggi is a village on a mountainside, not far from Rome, with famous spring waters. It is small but quite alive. In it you’ll find all the traditional Italian offerings, i.e. people, food, wines and ice creams! Fiuggi is famous for its spa hotels, and we were staying in a very good one. The summer school offered lots of coffee breaks, lunches and dinners, plenty of free time and a big party on the last day, which provided perfect opportunities not only to relax from the courses but also to meet and interact with the teachers and other researchers. A huge cake marked the 10th anniversary of the event and the end of this year’s school. It was kind of a bittersweet moment with frosting! Then followed a lively party, live music and dancing. Quite memorable!

All in all, we hope we will have the opportunity to attend a future ACACES summer school. We would also highly recommend it to any student who has not been there already.

Leyla Ghazanfari, Tampere University of Technology, Finland
Nikos Nikoleris, Uppsala University, Sweden


hipeac announce

BOOK: MICROARCHITECTURE OF NETWORK ON CHIP ROUTERS: A DESIGNER'S PERSPECTIVE

The network-on-chip paradigm solves the problem of on-chip communication by applying well-established networking principles at the silicon chip level, after suitably adapting them to the silicon chip characteristics and the application demands. Routers are the heart and backbone of any network-on-chip, and their purpose is to provide arbitrary connectivity between the inputs and outputs and to allow for the implementation of arbitrary network topologies.

This book focuses on the microarchitecture of network-on-chip routers, following a designer’s perspective and providing ready-to-use solutions for simple and more sophisticated design cases. All aspects of the design of a network-on-chip router, including flow control, buffering architectures, arbitration and allocation, as well as pipelined organizations, are presented in detail, building on detailed examples and, when necessary, practical abstract models. Router microarchitectural options are presented in a step-by-step manner, beginning with the basic design principles. Even highly sophisticated design alternatives are categorized and broken down into simpler pieces that can be easily understood and analyzed. This book is an invaluable reference for system, architecture, circuit, and EDA researchers and developers who are interested in understanding the overall picture of the microarchitecture of network-on-chip routers, the associated design challenges, and the available solutions.

Additional material such as PowerPoint slides and figures can be found at http://utopia.duth.gr/~dimitrak/noc-router-book/

Giorgos Dimitrakopoulos, Anastasios Psarras, Ioannis Seitanidis, Democritus University of Thrace, Greece

10th Conference of the European Network of Excellence on High Performance and Embedded Architecture and Compilation

HIPEAC2015 FROM JANUARY 19 TO 21, 2015 IN AMSTERDAM, THE NETHERLANDS

The annual HiPEAC conference provides a high-quality forum for computer architects and compiler and tools builders for high-performance, mobile and embedded computing. The 10th HiPEAC conference will take place at the RAI Convention Center in the bustling city of Amsterdam, The Netherlands, from Monday January 19 to Wednesday January 21, 2015.

Paper selection for the HiPEAC conference is being done by ACM TACO, the ACM Transactions on Architecture and Code Optimization, as part of the journal-first publication model. Authors of accepted papers are invited to present their work in the main paper track of the conference. Onur Mutlu from Carnegie Mellon University is the program chair.

The following keynote speakers have already confirmed their participation (with preliminary presentation titles):
• Jim Larus, EPFL: It’s the End of the World as We Know It
• Rudy Lauwereins, IMEC: New memory technologies and their impact on computer architectures
• Burton Smith, Microsoft: Resource Management in PACORA
• Bill Dally, NVIDIA: TBA

Throughout the conference, up to nine associated workshops, tutorials and special sessions will run in parallel with the main paper track. Several EU projects will hold their meetings and organize scientific events during the conference. EU projects will present their results during a poster session on Wednesday. The high number and variety of activities will offer all participants the opportunity to create their own personalized conference program.

Tuesday is dedicated to the computing systems industry, with an industrial session and an industrial exhibit throughout the day. The conference has increasingly become a recruitment event for the computing systems industry in Europe. Companies interested in sponsoring the event can contact the general chairs.

Another highlight of the conference will be the boat tour (provisional) and networking dinner at the Maritime Museum of Amsterdam on Tuesday evening.

The three-day event attracts about 500 delegates each year. You can find more information about the conference at http://www.hipeac.net/2015/amsterdam. We look forward to seeing you there!

Andy D. Pimentel, Stephan Wong
General chairs of HiPEAC2015


hipeac news

SUPEROPTIMIZATION: A FEASIBILITY STUDY

A collaborative project between Embecosm Ltd. and the University of Bristol funded by the UK TSB

How to produce highly optimised code has been of interest to developers, users and compiler-writers alike since the very first optimising compiler. However, producing efficient assembly code from a high-level language is still a challenge. Superoptimization has the ability to change this. By using either brute-force or program synthesis techniques, it is possible to create optimal, energy-efficient, fast and short sequences of code.

Superoptimization began in the 1980s with Henry Massalin's work describing how his 'superoptimizer' had found very short sequences of code that nobody had thought were possible. These sequences were better than the best code produced by assembly programmers, using the processor in weird, wonderful ways and truly exploiting every feature the processor had to offer. This brute-force approach has been further refined, culminating in two parallel strands of research: brute-forcing more efficiently, and using ever more powerful solvers to synthesise sections of code. The first approach has been used to feed results back into compilers. This means we only have to do the brute-force search once, and then all applications can benefit from the effort put in. The latter approach is also promising, since it does not have to go through the entire search space and can instead intelligently choose candidate code sequences. More recently, machine learning has been applied to superoptimization. This removes the guarantee of optimality, yet still gives impressive gains in efficiency: a high-performance Montgomery multiplication kernel in the popular OpenSSL library was optimized to be 1.6 times faster than the best code GCC could produce.

Many such techniques have been slowly accruing in the academic world, ripe for bringing together into a commercial tool. This feasibility study will explore these techniques and evaluate how they can be applied to real-world situations. In essence, we want to bring superoptimization to the masses, and our project is the first step in doing so. The Superoptimization Feasibility Study is funded by the Technology Strategy Board (TSB) in the UK, following on from James Pallister’s 2013 HiPEAC internship with Embecosm. This internship studied how we can apply intelligent compilation to reduce energy consumption, and ultimately investigated the use of superoptimizers.

Embecosm is now interested in exploring the commercial feasibility of superoptimization in a collaborative project with the University of Bristol. During this study, prototypes of different superoptimizers

PARVEC: VECTORIZED PARSEC BENCHMARKS

Optimized Hardware for Suboptimal Software: The Case for SIMD-aware Benchmarks

Energy efficiency has recently replaced performance as the main design goal for microprocessors across all market segments. Vectorization, parallelization, specialization and heterogeneity are the key approaches that both academia and industry embrace to make energy efficiency a reality.

As technologies move forward, benchmarks should be extended to cover new hardware features, or else researchers may end up over- or underestimating the impact of their contributions. Vectorization capabilities are available in processors designed for different market segments, from embedded devices (NEON) and desktops and servers (SSE and AVX) to GPUs and accelerators. However, many of the benchmark suites frequently used by the research community (SPEC, SPLASH-2, PARSEC, Rodinia, Parboil, etc.) have no or very limited SIMD capabilities. ParVec aims to close this gap by extending most of the PARSEC benchmarks with SIMD capabilities.

ParVec can target SSE, AVX and NEON SIMD architectures by means of custom vectorization and math libraries. The performance and energy efficiency improvements from vectorization depend greatly on the fraction of code that can be vectorized. Vectorization-friendly benchmarks obtain energy improvements per thread of up to ten times. The ParVec benchmark suite is available to the research community, to serve as a new baseline for the evaluation of future computer systems.

Juan Manuel Cebrian, Magnus Jahre and Lasse Natvig, NTNU, Trondheim, Norway (http://www.ntnu.edu/ime/eecs)

S: Scalable. RL: Resource Limited. CI: Code/Input limited.
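
The ParVec article's remark that the gains from vectorization depend on the fraction of code that can be vectorized can be made concrete with Amdahl's law. The article does not state this formula; it is used here as the standard first-order model. The sketch below assumes that a fraction f of the run time is vectorizable and that vectorized code runs w times faster. It models execution time only, not energy, and the values are illustrative rather than ParVec measurements.

```python
def vector_speedup(f, w):
    """Amdahl's law: overall speedup when a fraction f of the run time
    is accelerated by a factor w and the rest is unchanged."""
    if not 0.0 <= f <= 1.0 or w <= 0:
        raise ValueError("need 0 <= f <= 1 and w > 0")
    return 1.0 / ((1.0 - f) + f / w)

# Illustrative values only: w = 8 corresponds to the ideal case of
# 8 single-precision lanes in a 256-bit vector register.
for f in (0.25, 0.50, 0.90, 0.99):
    print(f"vectorizable fraction {f:.2f}: speedup {vector_speedup(f, 8):.2f}x")
```

Under this model, a benchmark needs a large vectorizable fraction before the overall speedup approaches the lane count.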


hipeac news

SOME NEWS FROM MAXELER TECHNOLOGIES

ENABLING LOW-COST DFE ACCELERATION

For all Maxeler University Program (MAX-UP) members, an affordable plug-in PCIe card will be available soon as an alternative to the MaxWorkstations currently in use. Thanks to a generous price discount on the reconfigurable chip from Altera, Maxeler was able to reduce the price to only US$4,999. This is expected to allow a much larger number of universities and non-commercial organizations to benefit from the power of Maxeler's DataFlow Engine (DFE) technology. This new product (codename Galava) is based on the MAIA DFE card, which has been scaled down in order to meet the tight budget target while still supporting very competitive DFE acceleration. With Galava, almost any Linux-based desktop computer can be extended with DFE technology and transformed into a powerful data-crunching workstation. Galava MAX-UP cards can be ordered now, with an expected delivery time of no longer than 15 weeks. You can contact Prof. Gaydadjiev ([email protected]) if you are interested in additional technical details.

BECOME A SOUGHT-AFTER DATAFLOW EXPERT

Maxeler is looking for highly qualified candidates to strengthen its engineering teams. In addition, there are various opportunities for exciting internships in London as part of an extremely motivating environment. Maxeler engineers and interns work on solving some of the most challenging computational problems of today, and their products are able to outperform any other computing technology in both performance and power consumption. More information about a career at Maxeler can be found at: http://www.maxeler.com/about-us/careers/opportunities/

Georgi Gaydadjiev, Maxeler Technologies, UK

WORKSHOP ON SOFTWARE TOOLS FOR NEXT GENERATION COMPUTING

In June, the European Commission organized a workshop on software tools and methods, with a specific focus on software development for cyber-physical systems and for industrial and time-critical applications used in sectors like aerospace, automotive, transport and automation.

The objective of the workshop was to discuss the state of the art and future perspectives in view of the expected evolution of computing hardware. A small number of experts gathered in Brussels on June 24th to participate in the workshop. The participants were asked to identify technologies and ideas expected to have significant market and cross-market influence. Furthermore, the participants gave their views on where investments should go in order to strengthen the competitive position of actors in the European Union.

The workshop began by setting out the current scenario, identifying software trends (such as complexity, parallelism and specialization) and cross-cutting issues (such as energy, security and economics). Considering the expected evolution of the computing hardware and the associated software stack, the participants were asked to advise the European Commission on future actions and directions for the research and innovation work programme, with a focus on the 2016-2017 timeframe.

Key messages from the participants were classified into two broad categories: economic and general aspects, and technical aspects. On the economic side, the participants discussed a variety of topics, such as:
• The necessity of “friendly” ecosystems to help the European software tool industry
• The advantages and disadvantages of free and open software
• The importance of standards and participation in standardization activities
• European approaches to funding and commercial exploitation
• The role of HPC and access to HPC infrastructure
• How to foster innovation in the simulation software market

On the technical side, the participants focused on aspects such as:
• The need for formal high-level tools and approaches
• The importance of domain-specific approaches
• How to manage the expected increase in code size
• Collaboration environments for model-driven tools
• Multicore software development
• Impact of hardware evolution on the European tool industry
• Impact of the Internet of Things
• New algorithms and structures

To summarize the discussion, the workshop produced a report discussing the various viewpoints on these topics. We encourage our readers to take a look at the full report, which is available at: http://ec.europa.eu/digital-agenda/en/news/software-tools-next-generation-computing-0

will be investigated, and their applicability extended from a subset of the target processor’s instructions to include all instructions. By the end of the project we hope to have a prototype methodology for applying superoptimization in a commercial setting, as well as a roadmap for implementing this. As with all of Embecosm's projects, the Superoptimization Feasibility Study will be fully open source, and the progress of the project can be followed at http://superoptimization.org.

James Pallister, Embecosm & University of Bristol, Kerstin Eder, University of Bristol, Jeremy Bennett, Embecosm
URL: http://superoptimization.org
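
The brute-force strategy described in the superoptimization article can be illustrated with a toy example. The sketch below is not Embecosm's tooling or Massalin's superoptimizer. It assumes a hypothetical single-register machine with three instructions (`neg`, `inc`, `dbl`) invented for this illustration, enumerates programs from shortest to longest, and accepts the first one that agrees with a reference function on a set of test inputs. A real superoptimizer would also have to prove equivalence rather than rely on tests.

```python
from itertools import product

# Hypothetical single-register instruction set, invented for this sketch.
INSTRUCTIONS = {
    "neg": lambda x: -x,      # x = -x
    "inc": lambda x: x + 1,   # x = x + 1
    "dbl": lambda x: x * 2,   # x = 2 * x
}

def run(program, x):
    """Execute a sequence of instruction names on the register value x."""
    for name in program:
        x = INSTRUCTIONS[name](x)
    return x

def superoptimize(reference, test_inputs, max_len=4):
    """Return a shortest program that agrees with `reference` on the tests.

    Programs are enumerated in order of increasing length, so the first
    match is a shortest one with respect to the tests (this is evidence
    of equivalence, not a proof).
    """
    test_inputs = list(test_inputs)
    for length in range(max_len + 1):
        for program in product(INSTRUCTIONS, repeat=length):
            if all(run(program, x) == reference(x) for x in test_inputs):
                return list(program)
    return None  # nothing found within max_len instructions

# Target function: f(x) = -2x - 2
found = superoptimize(lambda x: -2 * x - 2, test_inputs=range(-5, 6))
print(found)
```

The search space grows exponentially with program length, which is why the article distinguishes between searching more efficiently and using solvers to pick candidates.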


hipeac startup news / in the spotlight

TETRACOM COMPLETES 1ST CALL FOR TECHNOLOGY TRANSFER PROJECTS

2nd Call to be opened in November 2014

After one year of operation, the TETRACOM (Technology Transfer in Computing Systems, www.tetracom.eu) Coordination Action is now financially and contractually supporting fifteen European Technology Transfer Projects (TTPs). Each TTP is based on a bilateral and highly focused academia-industry transfer activity with concrete impact goals and timelines. The most recent ten TTPs resulted from an open call for TTPs in February 2014, which received a great response from the embedded and computing systems communities, with a total of 31 TTP proposals from all over Europe. The TETRACOM steering committee implemented an efficient review process, employing independent evaluators, and has carefully selected the highest-ranked proposals for co-funding. The total funding volume in this call amounts to approx. 340,000 EUR.

In addition to the TTP call, TETRACOM helped to stimulate European technology transfer activities through various initiatives, including its individual consultation service and special events at the HiPEAC conference and DATE 2014. Further key project milestones included a successful first-year project review in May, as well as the first meeting, in September, of the TETRACOM consortium with its Industrial Advisory Board.

At this point, the TETRACOM management is strongly focused on the upcoming 2nd call for TTPs, to be opened in November. Our ambition is to receive a record number of high-quality proposals and, even more importantly, to make participation as simple and lightweight as possible.

These are the major rules:
• A TTP is based on a bilateral technology transfer agreement between an “academic partner” (i.e. a publicly funded EU research institution) and an “industry partner” (i.e. a privately funded entity with business activities in the EU).
• Only the academic partner can submit a TTP proposal and, if granted, receive TETRACOM funding of up to 50% of the total TTP volume. The remaining budget comes from the industry partner. Typical funding size is 25,000 to 50,000 EUR per TTP.
• TTP proposals are externally and confidentially evaluated based on impact and other transparent criteria.
• TTP proposals are short and focused. The evaluation procedure is fast and lightweight, too. Successful academic partners formally join the TETRACOM consortium to implement a granted TTP.

So, if your academic R&D has recently resulted in some exciting new piece of hardware or software technology with a reasonable readiness level, and you have a potentially interested industry partner, do not hesitate to leverage the TETRACOM resources. If you have questions, please feel free to contact the project coordinator at [email protected]-aachen.de or to use the free consultation service at https://tetracom-service.doc.ic.ac.uk.

Rainer Leupers, RWTH Aachen University, Germany

LOWRISC - AN SOC THAT IS OPEN TO THE CORE

Can open ISAs and SoC implementations help kickstart a greater number of semiconductor startups?

lowRISC is a new not-for-profit organisation working closely with the University of Cambridge and the open-source community. Our aim is to produce a completely open computing ecosystem, including an SoC manufactured at volume in a modern process with a low-cost development board.

Our open-source SoC designs will be based on the 64-bit RISC-V instruction set architecture (www.riscv.org), with extensions focusing on security and with helper cores to provide a flexible platform.

The goal is to create a competitive and novel SoC, to explore low-complexity design techniques, and to create a benchmark design to aid the research community. Ultimately, we hope our IP will help support semiconductor startups by lowering the cost of reaching their first tape-out.

More details can be found at www.lowrisc.org.

We would also very much like to hear from HiPEAC members who might be interested in collaborating.

Robert Mullins, Computer Laboratory, University of Cambridge


hipeac news

COLLABORATION GRANT REPORT - ATHENA ELAFROU

COLLABORATION GRANT REPORT - MARIUS ENACHESCU

I am a PhD student at the National Technical University of Athens (NTUA), and I have recently been working on sparse linear algebra. As part of my HiPEAC-sponsored industrial internship, I spent four months at Movidius in Dublin, Ireland. Movidius designs vision processing chips that enable high-end computational imaging and visual awareness at extreme levels of energy efficiency. Many of the algorithms required to provide such functionality come down to solving linear systems, which involves performing linear algebra computations on matrices and vectors. In many cases, the matrices involved have a sparse block structure, where the size of the blocks corresponds to the number of degrees of freedom of the variables of the system. Currently, sparse matrices are being treated as dense, leading to a significant waste of bandwidth and performance.

The first goal of my work was to examine whether a compact representation for sparse matrices would improve the performance of a specific linear algebra computation, namely Sparse Matrix-by-Vector Multiplication (SpMV). This kernel is a fundamental operation in the solution of linear systems, and it is very often the bottleneck of iterative methods. Furthermore, it is highly memory bound, due to its low arithmetic intensity, and could therefore benefit from a compact representation of the sparse matrix. To this end, the commonly used Compressed Sparse Row (CSR) format was used to store the sparse matrix.

After verifying the benefits of using a sparse matrix storage format, my main goal was to design a more elaborate (and potentially more efficient) format, taking advantage of Myriad's hardware support for sparsity. The Myriad platform allows for a very compact representation of sparse data and, more specifically, sparse blocks of data. This feature was used to implement a more efficient variant of the well-known Block Compressed Sparse Row (BCSR) format, improving the performance of the SpMV kernel over both BCSR and CSR. The Myriad-tailored format relies on bitmaps to store only the nonzero elements of each block in the matrix, leading to a smaller memory footprint. Our results verify that hardware support for sparsity can further improve the performance of the SpMV kernel.

Concerning the actual implementation, the SpMV kernel proved to be more memory-latency bound than memory-bandwidth bound on the Myriad 1 platform. This indicated that high performance would mainly rely on “hiding” the memory transfers to the extent possible. This was achieved through efficient use of the DMA engines that are available on each SHAVE processor. A combination of double buffering and fine-grain scheduling of the memory transfers produced the best results.

I would like to thank HiPEAC for giving me the opportunity to work with a group of highly skilled and inspiring professionals at a company committed to enabling innovations in computational imaging, vision and photography.

Athena Elafrou, National Technical University of Athens, Athens, Greece
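
For readers unfamiliar with the storage formats mentioned in this report, the sketch below shows SpMV over the standard CSR layout and over a simplified bitmap-per-block layout in the spirit of the description above. It is not the Myriad implementation: the block size `B`, the function names and the bitmap encoding are assumptions made for illustration, and plain Python lists stand in for the hardware-supported structures.

```python
def dense_to_csr(a):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in a:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row in `values`
    return values, col_idx, row_ptr

def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x, touching only the stored nonzeros of A."""
    y = [0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

B = 2  # block edge length; an assumption for this sketch

def dense_to_bitmap_blocks(a):
    """Store each nonzero BxB block as (block_row, block_col, bitmap, nonzeros).

    Bit r*B + c of the bitmap is set when element (r, c) of the block is
    nonzero; only those elements are kept, in row-major order.
    Assumes the matrix dimensions are multiples of B.
    """
    blocks = []
    for bi in range(0, len(a), B):
        for bj in range(0, len(a[0]), B):
            bitmap, nonzeros = 0, []
            for r in range(B):
                for c in range(B):
                    v = a[bi + r][bj + c]
                    if v != 0:
                        bitmap |= 1 << (r * B + c)
                        nonzeros.append(v)
            if bitmap:  # all-zero blocks are not stored at all
                blocks.append((bi // B, bj // B, bitmap, nonzeros))
    return blocks

def spmv_bitmap_blocks(blocks, x, n_rows):
    """Compute y = A @ x using the bitmap-compressed blocks."""
    y = [0] * n_rows
    for block_row, block_col, bitmap, nonzeros in blocks:
        k = 0  # index of the next stored element in this block
        for bit in range(B * B):
            if (bitmap >> bit) & 1:
                r, c = divmod(bit, B)
                y[block_row * B + r] += nonzeros[k] * x[block_col * B + c]
                k += 1
    return y

A = [[4, 0, 0, 0],
     [1, 3, 0, 0],
     [0, 0, 0, 2],
     [0, 0, 5, 0]]
x = [1, 2, 3, 4]
print(spmv_csr(*dense_to_csr(A), x))
print(spmv_bitmap_blocks(dense_to_bitmap_blocks(A), x, len(A)))
```

Both calls compute the same product. The bitmap variant stores one small integer per block instead of one column index per nonzero, which is the kind of footprint saving the report attributes to the Myriad-tailored format.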

As part of a HiPEAC-sponsored industrial internship, I spent four months at ARM’s Research and Development in Cambridge, United Kingdom.

ARM’s main development is a family of microprocessor instruction set architectures based on a reduced instruction set computing (RISC) architecture. The low power consumption of ARM processors has made them very popular. My PhD topic being “Hybrid NEMS-CMOS Architectures for Ultra Low Power Smart Systems”, the road towards getting the most out of low-power NEMS-CMOS hybrid integration quickly turned to 3D-stacked IC heterogeneous integration. I therefore started a theoretical analysis of the power distribution network (PDN) and IR-drop analysis. In order to have a complete overview of the design of the PDN, practical experience is required. My work at ARM therefore focused on modeling and analyzing the PDN in ARM’s Juno Development Platform.

In order to thoroughly analyze Juno’s core supply bounce, ARM also includes models of the package and printed circuit board (PCB). The internship began with understanding the package extraction outputs, closely followed by developing package extraction methodologies based on Apache extraction tools, identifying the most promising one (i.e., the one whose results correlate with those of the package house), and evaluating its potential performance in terms of supply rail impedance, loop inductance, and voltage droop.

This involved drawing on knowledge gained during my research, the paper published on the subject of TSV-based PDN, as well as ideas and requests from users of the package model inside ARM, in order to extract improvements which make sense in the context of the core supply bounce analysis and signal integrity of high speed peripherals.

In addition, I also contributed to the development of in-house understanding of package analysis to be used by test chip teams, which gave me the opportunity to learn about signal integrity techniques used in PCB-package-die simulations.

This internship was also the first time I used the package extraction and analysis tools, which has given me many ideas for new areas to explore in my future research. I am grateful to HiPEAC for providing the opportunity to participate in this internship, as well as to ARM for hosting it.

Marius Enachescu, Delft University of Technology, The Netherlands


PhD news

TECHNIQUES FOR ADAPTING INDUSTRIAL SIMULATION SOFTWARE FOR POWER DEVICES AND NETWORKS TO MULTI- AND MANY-CORE ARCHITECTURES

Thomas Müller, Technische Universität München, Germany
Advisors: Prof. Dr. Arndt Bode, Prof. Dr. Carsten Trinitis, Dr. Andreas Blaszczyk
Graduation date: April 2014

In recent years processor characteristics have changed significantly towards parallel processing. In contrast to performance improvements from rising processor clock frequency in the past, existing software typically does not automatically benefit from additional processor cores, but it has to be adapted to efficiently use multi- and many-core architectures.

My thesis devises techniques to cost-effectively increase the efficiency of industrial high voltage engineering applications on multi- and many-core architectures. This includes strategic changes to improve runtime complexity, exploiting characteristic hardware properties, dynamic code generation and assessing the implications of implementing new and improved theoretical methods.

RUN-TIME RESOURCE MANAGEMENT AND APPLICATION CUSTOMIZATION FOR MANY-CORE EMBEDDED PLATFORMS

Iraklis Anagnostopoulos, National Technical University of Athens
Advisor: Assoc. Prof. Dimitrios Soudris
Graduation date: February 2014

Dr. Iraklis Anagnostopoulos received his Ph.D. from the Microprocessors Laboratory and Digital Systems Lab (MicroLab) of the ECE school at the National Technical University of Athens. In his Ph.D. thesis, entitled "Run-time resource management and application customization for many-core embedded platforms", he presented (i) memory management middleware acceleration and customization methodologies for applying customized dynamic memory managers (allocators) and (ii) frameworks for distributed run-time resource management on many-core platforms. Firstly, the customization is achieved by applying, on the middleware level, custom microcoded memory allocators. Secondly, the run-time resource management on the platform is achieved by using cores in different roles and by applying a distributed on-chip communication scheme.

DEVELOPMENT OF A SYSTEMATIC METHODOLOGY FOR DYNAMIC RESOURCE MANAGEMENT FOR EMBEDDED SYSTEMS

Nikolaos Zompakis, National Technical University of Athens
Advisors: Assoc. Prof. D. Soudris, Prof. K. Pekmestzi, Ass. Prof. G. Economakos, Prof. G. Stassinopoulos, Ass. Prof. Panagopoulos, Ass. Prof. D. Reisis, Ass. Prof. G. Theodoridis
Graduation date: February 2014

Next generation wireless systems support a wide range of communication protocols and services, opening new design challenges. These devices experience transient overloads that require the system to react in a timely manner to unexpected usage scenarios. The current thesis concentrates on these design challenges, exploiting the system scenario methodology and proposing solutions especially for wireless communication systems. More precisely, it studies the trade-offs between the representativeness of the scenarios (clustering overhead), implementation of scenario detection (detection overhead) and the platform tuning cost (switching overhead).


HARDWARE DESIGN OF TASK SUPERSCALAR ARCHITECTURE

Fahimeh Yazdanpanah Ahmadabadi, Polytechnic University of Catalonia, Spain
Advisors: Dr. Carlos Álvarez Martínez, Dr. Daniel Jiménez-González
Graduation date: June 2014

As larger numbers of cores are placed in a single chip, parallel programming models such as OmpSs and OpenMP 4.0 struggle to exploit data-flow task-based parallelism by pushing the task scheduling effort to programming model software run-times. Although run-time systems have demonstrated the ability to manage thousands of tasks, the management overhead per task limits their capacity to manage large numbers of small tasks in short periods of time.

In this thesis, developed under the TERAFLUX project, a hardware data-flow task scheduler has been designed and tested. It provides a proof-of-concept, showing that it works with feasible hardware resources and that it can efficiently manage real applications with thousands of tasks distributed among hundreds of processors, with almost negligible processing overhead.

COMPILE-TIME SUPPORT FOR THREAD-LEVEL SPECULATION

Sergio Aldea, Universidad de Valladolid, Spain
Advisor: Dr. Diego R. Llanos
Graduation date: July 2014

Parallelizing sequential code is not straightforward, and it is often not possible to apply automatic compile-time techniques. Runtime techniques, such as Thread-Level Speculation (TLS), solve this restriction, but still require a manual augmentation of the code. My Ph.D. thesis addresses this problem by defining a new OpenMP clause, called speculative, which identifies the variables that may lead to a dependency violation; a system that automatically classifies such variables; and a compile-time GCC-based system that automatically and seamlessly translates this new OpenMP clause into calls to our TLS runtime system, which runs the target loop speculatively in parallel.

DESIGN OF LOW-POWER AND RESOURCE-EFFICIENT ON-CHIP NETWORKS

Lizhong Chen, University of Southern California, USA
Advisor: Prof. Timothy M. Pinkston
Graduation date: August 2014

This dissertation explores the opportunities, challenges and viable solutions at the architecture level in designing low-power and resource-efficient on-chip networks for future many-core chips. Promising schemes are proposed that effectively decouple computational and communication resources to maximize static power savings while minimizing the performance penalty, by dynamically powering on and off resources based on runtime application traffic loads. This research also proposes different techniques for reducing on-chip network resource requirements by conveying global information using only local resources and by exploiting emerging regional on-chip traffic behaviors. These proposed schemes and techniques increase the effectiveness of on-chip networks and widen the spectrum of trade-offs among power, resource and performance.


upcoming events

The 1st Workshop on MEthods and TOols for Dataflow PrOgramming (METODO)
Colocated with DASIP 2014 - 7 October 2014, Madrid, Spain
http://www.ecsi.org/dasip/metodo2014

The Nordic Microelectronics Conference (NORCHIP 2014)
27-28 October 2014, Tampere, Finland
http://www.norchip.org/

The International Symposium on System-on-Chip (SoC 2014)
28-29 October 2014, Tampere, Finland
http://soc.cs.tut.fi/2014/index.php

The 47th International Symposium on Microarchitecture (MICRO-47)
13-17 December 2014, Cambridge, UK
http://www.microarch.org/micro47/

The 10th International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC 2015)
19-21 January 2015, Amsterdam, The Netherlands
http://www.hipeac.net/conference

The 21st IEEE International Symposium on High Performance Computer Architecture (HPCA 2015)
The 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2015)
The 2015 International Symposium on Code Generation and Optimization (CGO 2015)
7-11 February 2015, Bay Area, California, USA
http://darksilicon.org/hpca/
http://ppopp15.soe.ucsc.edu/
http://cgo.org/cgo2015/

The European Conference on Computer Systems (EuroSys 2015)
21-24 April 2015, Bordeaux, France
http://eurosys2015.labri.fr/

Design, Automation & Test in Europe (DATE-15)
9-13 March 2015, Grenoble, France
http://www.date-conference.com/

20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2015)
14-18 March 2015, Istanbul, Turkey
http://asplos15.bilkent.edu.tr/

HIPEAC 2015 CONFERENCE, 19-21 JANUARY 2015, AMSTERDAM, NETHERLANDS
WWW.HIPEAC.NET/2015/AMSTERDAM

HiPEAC info is a quarterly newsletter published by the HiPEAC Network of Excellence, funded by the 7th European Framework Programme (FP7) under contract no. FP7/ICT 287759
Website: http://www.hipeac.net
Subscriptions: http://www.hipeac.net/newsletter

Contributions: If you are a HiPEAC member and would like to contribute to future HiPEAC newsletters, please visit http://www.hipeac.net/hipeacinfo

Design: www.magelaan.be