System on Chip
DEA 2002
Market evolution
• Increasingly present in everyday life
– Computers, PDAs
– GSM, GPRS, UMTS, GPS
– Digital TV
– Embedded automotive electronics
– Portable CD/MP3 players, DVD
Standards
• Standards ease this evolution toward integrated services:
– PDA + GSM
– GSM + MP3
– UMTS + MPEG4 + MP3 + Hiperlan2 + ...
More performance
• GSM => GPRS => EDGE => UMTS
• Bluetooth 11 Mbit/s => Hiperlan2 at 54 Mbit/s
Reducing time to market
• Product lifetimes are getting shorter and shorter
– Reduce time to market
– Reuse in order to design other products (recoup the investment)
REUSE
• Approach adopted to limit costs
• Design a SoC from predefined blocks: Intellectual Property (IP) blocks
Cost reduction
• Consequences of reducing hardware design costs
• Reducing hardware costs proportionally increases the share of software costs
– 80% of the development cost of a SoC is now due to software
• The cost of testing grows exponentially
– Verification teams are twice as large as development teams
The revolution
• The number of SoCs sold grows by 30% per year
• Forecast breakdown by sector for 2004:
– Communications: 44% (growth of 24% per year)
– Consumer electronics: 28% (growth of 43% per year)
– Other: 28%
[Figure: number of SoCs sold, in millions (0 to 1400), from 1995 to 2005]
Evolution of requirements
• More features
• More computing power
• Lower power consumption
• Smaller size
• Low cost
• Shorter time to market
• Reusability
Evolution of tools
• Design tools evolve more slowly than the technology
• Reuse existing elements
– Libraries, IP
Architecture evolution
Design techniques
• 1970-80: full custom
– Schematics
– Mask layout
– Electrical simulation
• 1980-90: standard cells, FPGA
– Reuse of elementary building blocks
– Modeling, simulation
• 2000-20xx: SoC
– Reuse of hardware and software
– Co-design, verification
Design principles
• A hardware architecture
– Standard blocks (CPU, memory)
– Application-specific blocks
– Communication bus
• Software resources
• SoC = these resources coexisting on a single chip, with hardware and software considered together during implementation
Traditional design approach
• Designing a SoC: a vast optimization problem
• A set of choices driven by several criteria
– Performance targets met
– Minimum cost
– Controlled communications
– Reduced time to market
– Minimized power consumption
Which architecture?
• General-purpose or specialized architecture?
Classic design cycle
Verification by co-simulation
Formal verification techniques
• Verification by model equivalence
• Verification by formal proof of properties
• Difficulty for the designer: identifying the properties that are meaningful
Limitations of the traditional approach
• Verification by co-simulation is increasingly limited: coverage vs. time
• Timing constraints are verified by testing the system, i.e. at the end of the cycle
• Design changes have far-reaching consequences across the cycle
• The design approach is “processor centric”; move toward “communication centric”
• Put more effort into the early stages:
– Methods and tools that operate at system level
– Need for models
Toward system-level design
• Application modeling
• Architecture construction
• The partitioning problem
• The communication problem
• The power consumption problem
SoC design
Building a SoC
• Reuse blocks already designed in-house;
• Use macro-cell generators (RAM, multipliers, …);
• Buy blocks designed outside the company.
PLATFORM-BASED DESIGN
• Further cost reduction
• Design a reusable SoC
• One SoC for a family of applications
Notion of IP (Intellectual Property)
• Complex, reusable functional blocks
– Hard: already laid out, technology-dependent, highly optimized
– Soft: written in a high-level language (VHDL, Verilog, C++, …), parameterizable
• Standardized interfaces
• Development environment (co-design, co-specification, co-verification)
• Average performance (little optimization)
Using IP
• Reusable block (IP)
– Know its functionality
– Estimate its performance within a system
– Be sure the IP works correctly
– Integrate the IP into the system
– Validate the system
IP trading: “Design & Reuse”
RISC processor cores
• Wide variety of RISC cores available as IP: ARM, Hitachi, MIPS, LSI Logic
• Example: ARM (Advanced RISC Machines) 32-bit RISC processors
DSPs
• DSP architecture: targeted at the needs of an application (or class of applications)
– Example: TMS320C54x for GSM
– Shallow pipeline: determinism, power consumption, area
– Computational parallelism: performance, power consumption
– Registers: specialized, just enough of them
– Memory: parallel multi-bank, on-chip and off-chip: performance from 1 to 3
DSP IP
• Many vendors, each offering many DSPs
VSI Alliance: standardization
• Goals:
– Reuse, exchange, and sell virtual components
• Principles:
– Specifications and recommendations on:
  • Software and hardware interfaces
  • Formats
  • Design guidelines
– Models for:
  • Specification at different levels of abstraction
  • Documentation
  • Test
  • Simulation
VSI Alliance: design
Virtual Socket Interface Alliance
SoC vs SoPC
• SoC
– Hard to evolve
– Large production volumes
– Long and costly fabrication and test
• System on Programmable Chip (SoPC)
– Rapid prototyping on FPGA
– Component reconfigurable at will
– Fewer logic gates available
– Higher power consumption
– Lower performance
SOPHOCLES
System level develOpment Platform based on HeterOgeneous models and Concurrent LanguagEs for System applications implementation
Thomson-CSF Communications, Thomson Marconi Sonar, LIFL, Esterel Technologies - France
Philips - Netherlands
ENEA, Ipitec - Italy
General presentation
• Advent of “all-digital” in telecom and multimedia applications
– Growth of the processing power required
– Systematic use of programmable processors
– Heterogeneous systems, matching compute units to processing needs:
  • SIMD structures for systematic signal processing
  • DSPs for signal processing
  • RISC processors for supervision
– Integration of multi-source Virtual Components (VC)
– Continuous reduction of time to market
General presentation (2)
• The challenges:
– Mastering the design and development of complex real-time applications:
  • global system simulations,
  • heterogeneous distributed simulations,
  • formal techniques for early validation,
  • high-level programming environments,
  • a “cyber enterprise” built over the Internet.
General presentation (3)
• Techniques used:
– Introduction of new formalisms:
  • SyncCharts / Esterel
  • ArrayOL
  • Evolving Grammars
  • Made
– Use of various languages and techniques:
  • UML, XML, VDM++
  • Java, Jini, RMI, Corba
  • MPI, ZZ
  • Design patterns, Esterel ++, intelligent agents
SOPHOCLES environment
The “Cyber Enterprise”
[Figure: Sophocles Cyber Enterprise Architecture. Three levels (User-System Level, Cyber Enterprise Web Level, Core Level): client managers (Client1 to Client3) with a multimedia user interface, a cognitive interactions manager and a user advisor (IDSS); a Cyber-Enterprise web manager with simulation and configuration task managers; a distributed configuration and simulation manager driving virtual components VC1 to VCn supplied by Provider1 to ProviderN]
[Figure: web server, client user and client provider, central database, local databases on workstations WS1 to WSn, communication manager module, VC manager and interface for VC1 to VCn]
Partnership organization and contributions
• SOPHOCLES is organized on a cooperative basis:
– France (TCC, TMS, LIFL, Simulog)
  • Project lead (TCC)
  • Simulation techniques and environment
– Italy (ENEA, Nergàl)
  • Cyber Enterprise (MUI, WEB) and simulation techniques
– Netherlands (Philips)
  • Performance analysis environment
The Gaspard environment
http://www.lifl.fr/west/dart/
Scientific area
• Intersection of two different domains
– Intensive Signal Processing (ISP)
  • Huge quantities of data
  • Real-time constraints
– High-performance computing
  • Heterogeneity of tasks
  • Large and infinite data flows
Gaspard overview
[Figure: Gaspard at the intersection of high-performance computing (Hiperf), real time (RT) and ISP]
Effective goal
• To propose, at the highest level and in a single “standard” environment:
– A formal model and an explicit specification model
  • Validation, performance evaluation, verification
– All inputs are used at different levels
  • Code analysis
  • Mapping and scheduling
  • Code production
– Feasible for a particular application domain
Modeling and specification
• High-level, complete functional specification
• To reduce the time to market
• No new programming language
– Visual expression of data dependences
• Express all the potential parallelism
– Task and data parallelism paradigms
• Specify different levels of complexity
– Exchange network of a cluster
– Data transfers on an SMP board, on a SoC
• Take into account the methods used by the industrial partners
– Multiple
– Several levels of specification: functional, application, cycle-accurate, …
– Separate specification and execution
« Y » model
• Visual specification
– ISP applications
– Target architectures
– Mapping of applications onto architectures
• Model separation allows reuse
• Typical programming techniques in the SP world
[Figure: “Y” diagram. The Algorithm and Architecture branches join at Mapping; labels: User applications, Models, Compilers, VC]
Why three levels of formalism
• Application:
– Complete formal description (a priori validation)
– Hardware independent
– Simulation and compilation compatibility
• Architecture:
– Functional description
– Iterative refinement
– Application independent
• Mapping:
– Deployment of one application on one architecture
– Data allocations
– Data transfers
– Processing distribution
Application model
Data dependence expression
• Specification of tasks applied to objects
• Express only which objects are needed to process another object (similar to a demand-driven model)
• Concurrency: partial order based on data dependences
• No runtime control (no mapping, no scheduling)
• High-level specification of the algorithm
– Common formalism
– Development of tools (hierarchy, modularity, GUI)
– Reduced programming effort
• Different types of dependences
– Array dependences
– Iterative dependences
– Recursive dependences
– Alternative dependences
Iterative dependences
• Regular computational structure (data parallel)
• Allows a compact and parametric representation
– Local model of Array-OL
– Forall statements
• No or partial execution order (SPMD): explicit or automatic extraction
$A(z_1, z_2) = f\big(A(z_1, z_2 - 1),\ A(z_2 - 1, z_2 - 1)\big)$
[Figure: local model. An elementary transform reads a pattern from the input array and writes a pattern to the output array; fitting and paving examples; axes z1 and z2]
Array (object) dependences
• Graph of actions applied to objects
– Global model of Array-OL
– Kahn Process Networks, YAPI
• Time and space dependences should be expressed in the same way
[Figure: picture-in-picture example as a process network. Processes: user interface, TS demux, PES parser, two MPEG-2 decoders, video resizer, video mixer; channels: ts, pid, pes, es1, es2, yuv1, yuv2, pip, size, position, out]
Recursive/alternative dependences
• Irregular applications
– Reduce add, cyclic graph, …
– Dynamic arrays
– Collections
• Inherited from functional languages
– Fold operator of Caml, …
• Unsystematic applications
– Switch components
– End of recurrence
• Toward a model to specify types of dependences (meta)
SOPHOCLES
Array-OL™ Techniques
Thomson Marconi Sonar (TMS)
Document reference: BSR/TBR/STN/ARC/012 001
Some properties of Systematic Signal Processing (SSP)
Dedicated to Systematic Signal Processing, Array-OL takes advantage of some simple features of this domain:
• Data are naturally organized into arrays,
• Dimensions of the arrays have a static size (at least as long as the same mode is running),
• Accesses to the arrays are predefined and, most of the time, “regular”,
• There are no conditional branches inside the application.
(However, conditional operations, e.g. if cond. then a+b, else a-b, are authorized. Sorting, for instance, may be considered to belong to SSP.)
Specificities of arrays in SSP
• One dimension may be of infinite size (e.g. time)
• Some dimensions may wrap around (e.g. sensors). In fact, some arrays may be cylindrical or even toroidal.
• The size of some dimensions may change during the application (e.g. undersampling, oversampling)
• Some dimensions may appear or vanish (e.g. frequency)
Array-OL takes all of these specificities into account.
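Wrap-around dimensions can be illustrated with modular indexing. The following is a minimal Python sketch, not Array-OL syntax; the function name `toroidal_get` and the sample data are hypothetical.

```python
def toroidal_get(array, index):
    """Read array[i][j] with both indices wrapped around (toroidal access)."""
    rows = len(array)
    cols = len(array[0])
    i, j = index
    return array[i % rows][j % cols]


# A 2 x 3 array, e.g. 2 time samples for 3 sensors arranged in a ring.
samples = [[10, 11, 12],
           [20, 21, 22]]

# Index 3 on the sensor axis wraps around to sensor 0.
wrapped = toroidal_get(samples, (0, 3))
# Negative offsets wrap too, so sensor -1 is the last sensor.
last = toroidal_get(samples, (1, -1))
```

Wrapping only one of the two indices would give the cylindrical case.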
Graphical Array-OL
With Graphical Array-OL, an SSP application is specified through two models:
• the Global Model, which first gives a complete view of the dependencies between tasks at the array level, then
• the Local Model, which gives, separately for each task, the dependencies between the output arrays and the input arrays at the component level.
The Global Model
The organization of the processing for the complete application is defined by a graph in which:
• nodes are the tasks, and
• edges carry arrays between tasks.
In the Global Model, one gives the names of the tasks and the names and sizes of the arrays.
[Figure: graph of tasks connected by arrays]
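A Global Model of this kind can be sketched as a plain dependency graph. The example below is an illustrative Python sketch, not an Array-OL tool; the task names, array names and shapes are hypothetical, and `None` is an assumed marker for an infinite dimension.

```python
from graphlib import TopologicalSorter

# Hypothetical application: each task lists the arrays it reads and writes,
# together with the shape of every array (None marks an infinite dimension).
arrays = {"A1": (None, 128), "A2": (None, 64), "A3": (None, 64)}
tasks = {
    "T1": {"reads": ["A1"], "writes": ["A2"]},
    "T2": {"reads": ["A2"], "writes": ["A3"]},
}


def task_dependencies(tasks):
    """Map each task to the set of tasks producing the arrays it reads."""
    producer = {a: name for name, t in tasks.items() for a in t["writes"]}
    return {
        name: {producer[a] for a in t["reads"] if a in producer}
        for name, t in tasks.items()
    }


deps = task_dependencies(tasks)
# One valid execution order; any order respecting the dependencies is allowed.
order = list(TopologicalSorter(deps).static_order())
```

The graph only constrains the order through data dependences; it does not prescribe a schedule.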
The Local Model
[Figure: an elementary transform reads a pattern from the input array and writes a pattern to the output array; fitting and paving examples]
The Local Model makes the following assumptions:
• A few elements of the input array(s) are consumed and a few elements of the output array(s) are produced each time an elementary transform (ET) is performed.
• The task is finished when its output arrays are completely filled.
• These “few elements” are called a pattern.
• Fitting and paving examples graphically define the sampling strides.
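The roles of the pattern, fitting stride and paving stride can be sketched on a one-dimensional case. This is a minimal Python illustration under assumed conventions: the fitting stride steps between elements of a pattern, the paving stride steps between successive patterns, and the function names and values are hypothetical.

```python
def pattern_indices(origin, paving, fitting, q, pattern_size):
    """Indices of the pattern read or written at repetition q (1-D case)."""
    reference = origin + paving * q
    return [reference + fitting * d for d in range(pattern_size)]


def run_task(input_array, output_size, elementary_transform):
    """Apply the elementary transform until the output array is filled."""
    output = [None] * output_size
    for q in range(output_size):
        # Input tiler: patterns of 3 elements, stepping 2 elements per repetition.
        in_idx = pattern_indices(origin=0, paving=2, fitting=1, q=q, pattern_size=3)
        # Output tiler: patterns of 1 element, stepping 1 element per repetition.
        out_idx = pattern_indices(origin=0, paving=1, fitting=1, q=q, pattern_size=1)
        output[out_idx[0]] = elementary_transform([input_array[i] for i in in_idx])
    return output


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
result = run_task(signal, output_size=4, elementary_transform=sum)
```

Because repetitions do not depend on each other, the loop over `q` could run in any order or in parallel.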
So far, what is defined with Array-OL?
So far, Array-OL only gives an exhaustive view of the dependencies in an SSP application.
SSP applications offer strong potential parallelism, both data parallelism and task parallelism; the choice among them is left completely free at execution, provided that the dependencies are respected.
First literal formalism: Array-OL Q.D
Array-OL Q.D is restricted to a formal representation of the Local Model. It is centered on the “Q.D” array, which orders computation grains rather than data.
The “Q.D” array is composed of:
• Q dimensions, which are the quotients (Q) obtained by dividing one of the Result arrays by a divider (D), the corresponding Result pattern,
• Result pattern dimensions,
• Operand pattern dimensions.
To define accesses to the data, each element (processing grain) of the Q.D array is projected onto the Operand and Result arrays via matrices whose coefficients define the paving and fitting strides.
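Array-OL Q.D: an example
Graphical view
[Figure: Operand array (axes OPE.0 and OPE.1, 64 x 128) and Result array (axes RES.0 and RES.1, 8 x 6)]
Formal view
Operand array: $0 \le \mathrm{ope.0} < 64$, $0 \le \mathrm{ope.1} < 128$
Result array: $0 \le \mathrm{res.0} < 8$, $0 \le \mathrm{res.1} < 6$
Q.D array: $0 \le q_1 < 3$, $0 \le q_0 < 8$, $0 \le d_{res} < 2$, $0 \le d_{ope} < 4$
Projection Q.D => Operand:
$$\begin{pmatrix} \mathrm{ope.1} \\ \mathrm{ope.0} \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 2 & 0 & 6 \end{pmatrix} \cdot \begin{pmatrix} q_1 \\ q_0 \\ d_{res} \\ d_{ope} \end{pmatrix}$$
Projection Q.D => Result:
$$\begin{pmatrix} \mathrm{res.1} \\ \mathrm{res.0} \end{pmatrix} = \begin{pmatrix} 2 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} q_1 \\ q_0 \\ d_{res} \\ d_{ope} \end{pmatrix}$$
NB: the main shortcoming of Array-OL Q.D is that it does not establish continuity between the Result space and the Operand space, which prevents computing dependences beyond a single task.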
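The two projections of this example can be checked numerically. The sketch below assumes the matrices are read off the slide as res.1 = 2·q1 + d_res, res.0 = q0, ope.1 = 3·q1 and ope.0 = 2·q0 + 6·d_ope, which is an interpretation of the flattened layout rather than a confirmed transcription; all Python names are hypothetical.

```python
from itertools import product

# Bounds of the Q.D array dimensions, as read from the example.
Q1, Q0, D_RES, D_OPE = 3, 8, 2, 4


def project_result(q1, q0, d_res, d_ope):
    """Projection Q.D => Result: (res.1, res.0)."""
    return (2 * q1 + 1 * d_res, 1 * q0)


def project_operand(q1, q0, d_res, d_ope):
    """Projection Q.D => Operand: (ope.1, ope.0)."""
    return (3 * q1, 2 * q0 + 6 * d_ope)


# Every processing grain of the Q.D array.
grains = list(product(range(Q1), range(Q0), range(D_RES), range(D_OPE)))
written = {project_result(*g) for g in grains}
read = {project_operand(*g) for g in grains}

# The Result array (6 x 8) is expected to be covered exactly.
full_result = set(product(range(6), range(8)))
covers_result = written == full_result
# Every operand access must stay inside the 128 x 64 Operand array.
operand_in_bounds = all(0 <= o1 < 128 and 0 <= o0 < 64 for o1, o0 in read)
```

Under this reading the Q dimensions are indeed the quotients of the Result sizes by the Result pattern sizes (6/2 = 3 and 8/1 = 8).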
Second literal formalism: Array Distribution Operators (ADOs)
ADOs describe links between an Emitter space of type N x N x ..., written $\mathbb{N}^e$ (dimension e), and a Receiver space $\mathbb{N}^r$ (dimension r).
More precisely, an ADO describes, for each element of $\mathbb{N}^e$, whether it is connected to elements of $\mathbb{N}^r$ and, if so, to which ones.
[Figure: an ADO linking the emitter space $\mathbb{N}^e$ (axes E0, E1) to the receiver space $\mathbb{N}^r$ (axes R0, R1)]
ADO composition
An ADO is composed of a chain of “Elementary Operators” (EOs) implicitly separated by $\mathbb{N}^d$ spaces.
An EO may cut ( ), prolong ( ), multiply ( ), or concentrate ( ) the links received from its input space, but only for the components already reached by links coming from the previous EO.
To describe dependencies, distributions or layout, 8 EOs (4 direct EOs and 4 “mirror” EOs) have been proved sufficient:
• Gauge (direct), Gauge⁻¹ (mirror)
• Modulo (direct), Modulo⁻¹ or Blow-out (mirror)
• Shift (direct), Shift⁻¹ (mirror)
• Projection (direct), Segmentation (mirror)
ADOs may be seen as a multi-dimensional arithmetic with some non-classical operations (e.g. segmentation).
ADO example: dependencies in an Array-OL task
General expression of the links between the Operand space and the Result space (Operand space <= Result space):
[Figure: chain of elementary operators M, S, PRO, G, SEG, G, parameterized by the operand offset, the operand paving and fitting (p_ope, f_ope), the result paving and fitting (p_res, f_res), and the index vectors q, d_res and d_ope]
Tasks composition computation
The standard expression of an Array-OL task t is (R1):
$\{ M_t \cdot S_t \; \mathrm{PRO}_t \; G_t \; \mathrm{SEG}_t \cdot G_t \}$
where the EOs contain only integer values.
Links between the results of this task t and the operands of task t-1 are expressed by (R2):
$\{ M_{t-1} \cdot S_{t-1} \; \mathrm{PRO}_{t-1} \; G_{t-1} \; \mathrm{SEG}_{t-1} \cdot G_{t-1} \} \; \{ M_t \cdot S_t \; \mathrm{PRO}_t \; G_t \; \mathrm{SEG}_t \cdot G_t \}$
The patterns, fitting and paving of the macro-task obtained by composing t and t-1 are defined when, by modifying the EOs in (R2), one gets back to an expression of the same type as (R1). Modifications must respect the rule: cutting any link established in (R2) is forbidden; only adding new links is allowed.
A (graphical) tool for merging tasks, with automatic computation of pattern sizes and of fitting and paving parameters, has been developed in cooperation with the “Laboratoire d'Informatique Fondamentale de Lille”.
Transformations
Gaspard visual interface
Hierarchisation in Array-OL
Hierarchisation was added to Array-OL mainly to provide a way to define the layout of an application on an SSP machine.
Starting from a flat view of the application, hierarchisation consists of:
• composing a segment of successive tasks into a single “macro-task”,
• considering that the relationship between the “macro-patterns” and the “macro-ET” of this macro-task may be seen as an application at the next lower level.
Hierarchisation follows the architecture of the SSP target machine, which in turn may be seen as a hierarchical organisation: several clusters, several nodes per cluster, several PEs per node, ...
Hierarchisation example
[Figure: a flat chain of tasks (Beam Forming, Fix=>Float, FFT, Cx. MPY, IFFT) over arrays A1 to A6. By composition, FFT, Cx. MPY and IFFT become a single Pulse Compression macro-task (arrays A1, A2, A5 and A6 remain visible), whose Local Model is detailed as a Global Model over arrays A'2, A'3, A'4 and A'6. Labels: Level “L-1” Global Model, Level “L” Global Model]
UML language
• Standard also used in industrial development
• Extension mechanisms (“stereotypes”, “tagged values”, “profiles”)
• No method is imposed
– Used to validate our own “Y” model
• Visual as well as textual
• Many tools exist (Rational Rose, Objecteering, ...)
• A metamodel exists and is specified with MOF
• We propose our own metamodel dedicated to ISP applications
• Interoperability through XMI/XML
SOPHOCLES
UML Array-OL for Signal Processing
Cédric Dumoulin
LIFL
Goals
• Visual modeling of Intensive Signal Processing Applications (ISP-A)
• Automatic exploitation of application models by tools
– Code generation, transformations, simulations, …
The “Y” Model
• This work is part of the Sophocles project and the DaRT project
• It proposes the “Y” model:
– Separation of algorithm, architecture and mapping
– Allows reuse of algorithms and architectures
[Figure: “Y” diagram. The Algorithm and Architecture branches join at Mapping; labels: User applications, Models, Compilers, VC]
Modeling Applications with UML
• Restricted to Intensive Signal Processing (ISP) applications
• The framework clearly defines what should be modeled and how
• We can model a complete application and exploit it (transformation, code generation, …)
Basic Concepts to Model an ISP-A
• Inspired by Array-OL (Array-Oriented Language) (TUS, Alain Demeure)
– Global model, task, array, local model, pattern, QD, elementary task, hierarchical description
• Our proposal:
– SP-Component (signal processing component): unit of computation
– Port: typed input/output of an sp-component
– Graph of dependencies:
  • made of sp-components interconnected by their ports
  • represents the internal structure of an sp-component
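These three concepts can be sketched as plain data structures. The Python classes below are a hypothetical illustration, not the project's metamodel; the rule that only ports of identical type may be linked is an assumption made for the example, as are the component and port names.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    name: str
    port_type: str      # e.g. "Complex512Port"
    direction: str      # "in" or "out"


@dataclass
class SpComponent:
    name: str
    ports: list = field(default_factory=list)
    # Internal structure: sub-components and (source, target) port links.
    subcomponents: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def port(self, name):
        """Return the port with the given name."""
        return next(p for p in self.ports if p.name == name)

    def connect(self, source, target):
        """Link two ports; only ports of the same type may be connected."""
        if source.port_type != target.port_type:
            raise TypeError(f"incompatible ports: {source.name} -> {target.name}")
        self.links.append((source, target))


# A hypothetical dot product built from two sub-components.
product = SpComponent("Product", [Port("A", "Complex512Port", "in"),
                                  Port("B", "Complex512Port", "in"),
                                  Port("result", "Complex512Port", "out")])
reduction = SpComponent("Reduction", [Port("input", "Complex512Port", "in"),
                                      Port("result", "ComplexPort", "out")])
dot = SpComponent("DotProduct", subcomponents=[product, reduction])
dot.connect(product.port("result"), reduction.port("input"))
```

The `links` list of a component is its graph of dependencies: it records which ports of which sub-components are interconnected.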
ISP UML and UML Array-OL
Three key elements:
• Components
– Defined by an interface and a behavior
– Elementary, hierarchical, or iterative / data-parallel
• Ports
– Represent the data being handled
– Connection points between components
• Connections
– Between the ports of components
– “Direct” or “controlled” by a “Tiler”
[Figure: an elementary task (TacheElementaire) with its input and output ports, each connected through a tiler (T)]
Sp-Components and Ports in UML
• Sp-component
– Class stereotyped <<spComponent>>
• Port
– Class stereotyped <<port>>
– Specifies a type description
• Adding a port to an sp-component
– Component attribute stereotyped <<inPort>> or <<outPort>>, with the appropriate port type
– Plus a UML association
UML components
[Figure: UML components c1:C1, c2:C2 and c3:C3 with ports input1, input2 and output]
Components in XML (excerpt)
<UML:Package xmi.id="a0" name="myPackage" isSpecification="false" isRoot="false" isLeaf="false" isAbstract="false">
  <UML:Namespace.ownedElement>
    <UML:SpComponent xmi.id="a3" name="InputComponent" isSpecification="false" isRoot="false" isLeaf="false" isAbstract="false" isActive="false">
      <UML:Classifier.feature>
        <UML:PortAttribute xmi.id="a4" type="a5" name="outStream" isSpecification="false" direction="out"></UML:PortAttribute>
      </UML:Classifier.feature>
    </UML:SpComponent>
    <UML:SpComponent xmi.id="a6" name="OutputComponent" isSpecification="false" isRoot="false" isLeaf="false" isAbstract="false" isActive="false">
      <UML:Classifier.feature>
        <UML:PortAttribute xmi.id="a7" type="a5" name="inStream" isSpecification="false" direction="in"></UML:PortAttribute>
      </UML:Classifier.feature>
    </UML:SpComponent>
    <UML:SpComponent xmi.id="a8" name="CompoundComponent" isSpecification="false" isRoot="false" isLeaf="false" isAbstract="false" isActive="false">
      <UML:Classifier.feature>
        <UML:PortAttribute xmi.id="a9" type="a5" name="input" isSpecification="false" direction="in"></UML:PortAttribute>
        <UML:PortAttribute xmi.id="a10" type="a5" name="output" isSpecification="false" direction="out"></UML:PortAttribute>
      </UML:Classifier.feature>
    </UML:SpComponent>
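A tool can exploit such a file by listing the components and their ports. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened, self-contained document modeled on the excerpt; the namespace URI bound to the `UML` prefix is an assumption, since the excerpt does not show its declaration, and `list_components` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "UML" prefix; the excerpt does not show it.
UML_NS = "org.omg.xmi.namespace.UML"

DOCUMENT = f"""
<UML:Package xmlns:UML="{UML_NS}" xmi.id="a0" name="myPackage">
  <UML:Namespace.ownedElement>
    <UML:SpComponent xmi.id="a3" name="InputComponent">
      <UML:Classifier.feature>
        <UML:PortAttribute xmi.id="a4" type="a5" name="outStream" direction="out"/>
      </UML:Classifier.feature>
    </UML:SpComponent>
    <UML:SpComponent xmi.id="a8" name="CompoundComponent">
      <UML:Classifier.feature>
        <UML:PortAttribute xmi.id="a9" type="a5" name="input" direction="in"/>
        <UML:PortAttribute xmi.id="a10" type="a5" name="output" direction="out"/>
      </UML:Classifier.feature>
    </UML:SpComponent>
  </UML:Namespace.ownedElement>
</UML:Package>
"""


def list_components(xml_text):
    """Return {component name: [(port name, direction), ...]}."""
    root = ET.fromstring(xml_text)
    ns = {"UML": UML_NS}
    components = {}
    for comp in root.iterfind(".//UML:SpComponent", ns):
        ports = [(p.get("name"), p.get("direction"))
                 for p in comp.iterfind(".//UML:PortAttribute", ns)]
        components[comp.get("name")] = ports
    return components


components = list_components(DOCUMENT)
```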
Graph of dependencies in UML
• Collaboration diagram containing:
– Instances of the sp-component's ports
– Instances of sub sp-components with their associated ports
– Links between instances
• The graph is associated with the sp-component it describes
• Possible improvement: structure diagram
– Better suited to our domain
[Figure: DotProduct component. Input ports vectorA and vectorB (Complex512Port) feed a Product sub-component, whose output feeds a Reduction sub-component producing result (ComplexPort)]
Application Creation
• Define port types
• Create sp-components
• One top-level sp-component: the application
– No ports
– Its structure represents the top-level graph of dependencies
Component Creation
• Create a class stereotyped <<spComponent>>
• Add ports:
– attributes stereotyped <<inPort>> or <<outPort>>
– associations
• Associate a collaboration diagram
– Draw the sp-component's internal structure
[Figure: class AxB stereotyped <<spComponent>> with attributes <<inPort>> A : Complex512x512Port, <<inPort>> B : Complex512x512Port and <<outPort>> result : Complex512x512Port, associated (+A, +B, +result) with the <<port>> class Complex512x512Port (from Aol); its collaboration diagram connects ports A and B to a MatrixProduct instance producing result]
Ports
• Typed: only “compatible” ports can be connected
• Ports and Array-OL arrays:
– Array-typed ports
– One port type for each kind of array
– Extend class AolPort
[Figure: class diagram of the <<port>> classes]
• AolPort: attributes dimsSize : String, elementsType : String; operations getNbDims(), getSizeOfDim(), getElementsType(), isDimDynamic()
• ComplexAolPort: elementsType : String = "complex"
• Complex512Port: dimsSize : String = {512}
• Complex512x512Port: dimsSize : String = {512,512}
• SpVectorPort: dimsSize : String = {UNDEFINED}; operation getSize()
• SpComplexVectorPort
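The port types above can be mirrored by a small class hierarchy. This Python sketch is illustrative only: the inheritance links are guessed from the class names, the snake_case method names are adaptations of the operations listed on the slide, and the `compatible` rule (identical types) is an assumption.

```python
UNDEFINED = None  # marks a dimension whose size is not fixed by the type


class AolPort:
    dims_size = ()          # one entry per dimension
    elements_type = None

    def get_nb_dims(self):
        return len(self.dims_size)

    def get_size_of_dim(self, dim):
        return self.dims_size[dim]

    def get_elements_type(self):
        return self.elements_type

    def is_dim_dynamic(self, dim):
        return self.dims_size[dim] is UNDEFINED


class ComplexAolPort(AolPort):
    elements_type = "complex"


class Complex512Port(ComplexAolPort):
    dims_size = (512,)


class Complex512x512Port(ComplexAolPort):
    dims_size = (512, 512)


class SpVectorPort(AolPort):
    dims_size = (UNDEFINED,)


def compatible(a, b):
    """Assumed rule: ports are compatible when their types are identical."""
    return type(a) is type(b)
```

Declaring one subclass per kind of array keeps the shape and element type in the type itself, so a connection check needs no runtime data.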
Is everything an Sp-Component?
• Application: top-level sp-component
• Sensors, data I/O: specialized sp-components
• Elementary tasks, calls to external functions, …
• Data-parallel components
• All modeled by classes stereotyped <<spComponent>>
• Some components use stereotypes extending <<spComponent>>
– E.g. <<dataParallel>>
DataParallel Sp-Component (AOL local model)
• Such a component contains one and only one sub-component
• Ports represent arrays of data
• Tilers describe how arrays are divided into patterns and iterated over
– Values are specified in the tiler instances
• The sub-component is executed once for each set of patterns
[Figure: MatrixProduct component. Ports A, B and C (Complex512x512Port) are each connected through an AolIterativeTiler to the DotProduct sub-component (inputs A and B, output result)]
Tiler values:
• First input tiler: direction = IN, origin = {0,0}, paving = {{0,0},{0,1}}, fitting = {1,0}
• Second input tiler: direction = IN, origin = {0,0}, paving = {{1,0},{0,0}}, fitting = {0,1}
• Output tiler: direction = OUT, origin = {0,0}, paving = {{1,0},{0,1}}, fitting = {}, master = true
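How three tilers drive a data-parallel component can be sketched on a small matrix product. The Python code below is an interpretation, not the Gaspard implementation: it assumes that `paving` lists one step vector per repetition dimension, that `fitting` is the step between consecutive pattern elements, that indices are (x, y) = (column, row), and it uses small n x n matrices instead of 512 x 512.

```python
from itertools import product as cartesian


def tile(origin, paving, fitting, q, pattern_size, shape):
    """Array indices of the pattern selected at repetition index q.

    paving holds one step vector per repetition dimension; fitting is the
    step vector between consecutive pattern elements (empty: single element).
    """
    reference = [
        origin[k] + sum(step[k] * q[j] for j, step in enumerate(paving))
        for k in range(len(shape))
    ]
    if not fitting:
        return [tuple(r % s for r, s in zip(reference, shape))]
    return [
        tuple((reference[k] + fitting[k] * d) % shape[k] for k in range(len(shape)))
        for d in range(pattern_size)
    ]


def matrix_product(a, b, n):
    """C = A x B on n x n matrices, driven by three tilers.

    Indices are (x, y) = (column, row), so element (x, y) of m is m[y][x].
    """
    shape = (n, n)
    c = [[0] * n for _ in range(n)]
    for q in cartesian(range(n), repeat=2):
        row_a = tile((0, 0), [(0, 0), (0, 1)], (1, 0), q, n, shape)
        col_b = tile((0, 0), [(1, 0), (0, 0)], (0, 1), q, n, shape)
        cx, cy = tile((0, 0), [(1, 0), (0, 1)], (), q, 1, shape)[0]
        # Elementary transform: dot product of the two patterns.
        c[cy][cx] = sum(a[ay][ax] * b[by][bx]
                        for (ax, ay), (bx, by) in zip(row_a, col_b))
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matrix_product(a, b, 2)
```

With these conventions the first input tiler selects a row of A, the second a column of B, and the output tiler a single element of C, so the sub-component runs once per output element.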
Advanced Concepts
• Ports can carry sp-components
• Each sp-component has a special port named “thisComponent”
– As output: carries the sp-component itself
– As input: the instance is replaced by the sp-component provided by the port
• Useful to build multiplexers, switches, and parameterized components
[Figure: a multiplexer selects between components CaseA and CaseB, both implementing the ApplyCase <<Interface>>, according to a key; the selected component is passed through the “this” port to an ApplyCase instance with AolPort ports A, B and C]
Advanced Concepts (cont'd)
• Recursive calls
• New kinds of array tilers
– Enumerative: describes explicitly how an array is enumerated
• Irregular iteration:
– E.g. extract the first data item and drop the rest, …
[Figure: ReductionPattern sub-component between the input and output ports (SpVectorPort), each connected through an AolEnumerativeTiler (E)]
Tiler values:
• Input tiler: origin={0} dims={input.getSize()/2} fitting={1}; origin={input.getSize()/2} dims={input.getSize()/4} fitting={1}
• Output tiler: origin={0} dims={1} fitting={1}; origin={1} dims={1} fitting={1}
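An enumerative tiler lists its tiles explicitly instead of deriving them from a paving. The Python sketch below enumerates the tiles for a given input size, assuming that `origin`, `dims` and `fitting` mean start index, number of elements and stride; the function names are hypothetical.

```python
def enumerate_tile(origin, dims, fitting):
    """Indices of a 1-D tile: dims elements from origin with stride fitting."""
    return [origin + fitting * d for d in range(dims)]


def reduction_tiles(size):
    """Two input tiles and two output tiles, following the values above."""
    inputs = [enumerate_tile(0, size // 2, 1),
              enumerate_tile(size // 2, size // 4, 1)]
    outputs = [enumerate_tile(0, 1, 1),
               enumerate_tile(1, 1, 1)]
    return inputs, outputs


inputs, outputs = reduction_tiles(8)
```

For an input of 8 elements the first tile covers the first half and the second tile the next quarter, which is the kind of irregular iteration a paving with a constant step cannot express.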
Irregular iterations
Example of an irregular iteration:
[Figure: ReductionPattern sub-component between the input and output ports (SpVectorPort), each connected through an AolEnumerativeTiler (E). Input tiler: origin={0} dims={input.getSize()/2} fitting={1}; origin={input.getSize()/2} dims={input.getSize()/4} fitting={1}. Output tiler: origin={0} dims={1} fitting={1}; origin={1} dims={1} fitting={1}]
Recursive components
• Modeling recursive applications:
– Recursive definition of components
Models Exploitation
[Figure: tool chain. The UML, ISP-UML and AOL metamodels are specified with MOF (OMG); JMI derives Java interfaces and implementations for UML, ISP and AOL from them; models are imported and exported as UML, ISP and AOL XMI; the ISP-UML metamodel extends the UML metamodel; UML visual tools, NetBeans visual tools and Ptolemy II connect to these interfaces, which feed transformations, code generation and simulations]
Works in Progress and Future Work
• Provide a similar framework for:
– Architecture modeling
– Mapping description
• Complete the “Y” model
– Simulator, code generator, transformations, …
Architecture specification
• The same specification environment in UML
• Improve the architecture modeling of UML
– Model similar to the application specification
– Support the targets of applications: SoC, COTS, distributed simulators, ...
– Different levels of specification for different levels of simulation
– Hierarchical and iterative architecture building
• Two kinds of components
– Active components to process data
– Passive components to store data
• Take into account the industrial environments of the Sophocles partners
– Array-OL architecture, mAgiSim, Vcc, SynDEx, …
Mapping an application on an architecture
• Last step before code generation: simulation or execution
• Specified explicitly by the programmer
• Integrate new deployment elements into UML
– Link between architecture and algorithm
– Same application on different architectures, and vice versa
• Express associations between components of the application and active and passive elements of the architecture
– The same kind of iterators is used for iterative mapping in space and time
Possible SSP Architectures suggested by “Flat” Array-OL
“Flat” Array-OL may suggest that:
• tasks in the Global Model could be executed in a pipeline fashion on different resources (if available),
• ETs in the Local Model could be executed in SIMD fashion if the resource considered in the Global Model is composed of several identical “Elementary Resources”.
“Flat” Array-OL naturally calls for multi-SPMD architectures.
[Figure: Global Model with tasks T1 and T2 chaining arrays A1, A2 and A3; Local Model with an elementary transform (ET) between A1 and A2]
Possible SSP Architectures suggested by Hierarchical Array-OL
With hierarchy, Array-OL may also suggest that an Elementary Resource in the Local Model could appear as composed of several heterogeneous resources in the Global Model of the next level.
[Figure: Local Model at level L expanded into a Global Model at level L+1, with resources M1 and M2 and arrays A'1, A'3 and A'4]
SOPHOCLES : Array-OLTM Techniques (TMS)
BS
R/T
BR
/ST
N/A
RC
/012
001
98
Resources Classification
We consider that an SSP architecture may be seen as a set of only 2 types of resources:
• Passive resources, used as a medium for arrays:
– memories to store the arrays,
– peripherals producing or consuming data outside the architecture.
• Active resources, capable of reading or writing data in those passive resources:
– DMAs to move data, without modification, between passive resources,
– CPUs to read, modify and write data in one or several memories.
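The classification above can be captured in a small data model. This is a sketch only: the `Resource` class, the `Kind` enum and the category strings are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    PASSIVE = "passive"
    ACTIVE = "active"


PASSIVE_CATEGORIES = ("memory", "peripheral")
ACTIVE_CATEGORIES = ("dma", "cpu")


@dataclass(frozen=True)
class Resource:
    name: str
    category: str  # "memory", "peripheral", "dma" or "cpu"

    def __post_init__(self):
        if self.category not in PASSIVE_CATEGORIES + ACTIVE_CATEGORIES:
            raise ValueError(f"unknown resource category: {self.category}")

    @property
    def kind(self):
        """Passive resources hold arrays; active ones read or write them."""
        if self.category in PASSIVE_CATEGORIES:
            return Kind.PASSIVE
        return Kind.ACTIVE

    @property
    def may_modify_data(self):
        """Only CPUs transform data; DMAs move it unchanged."""
        return self.category == "cpu"


sdram = Resource("SDRAM1", "memory")
dma = Resource("DMA1", "dma")
cpu = Resource("CPU1", "cpu")
print(sdram.kind, dma.kind, cpu.may_modify_data)
```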
Possible representation of resources
[Figure: icons for Passive Resources (Peripheral, Memory) and Active Resources (DMA, CPU)]
From Array-OL Appli to Array-OL Archi
From Array-OL (which we will now call “Array-OL Appli”) it is possible to derive a very similar formalism (which we will call “Array-OL Archi”) to describe SSP architectures.
Like Array-OL Appli, Array-OL Archi uses:
– a global model to define connections between resources,
– a local model to show the (spatial or temporal) repetitive nature of each active resource and any constraints on its surrounding memories,
– the hierarchy to detail the architecture gradually.
The general principle is to replace the Arrays and Tasks of Array-OL Appli with, respectively, the Passive and Active resources of Array-OL Archi.
The Global Model in Array-OL Archi
Like Array-OL Appli, the Global Model of Array-OL Archi uses a graph describing data paths between passive and active resources.
E.g., for a given level, a machine composed of 2 PPC G4 COTS boards:
[Figure: graph connecting TD, FEC, SDRAM1, SDRAM2, DMA1, DMA2, DMA3, CPU1 and CPU2]
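A global model of this kind can be held as a directed graph and checked for the one property the text states: every data path links a passive and an active resource. The sketch below reuses the resource names from the example, but the edges are hypothetical (the connections of the original figure are not recoverable here), and treating TD and FEC as passive peripherals is an assumption.

```python
# Assumption: TD and FEC are peripherals, hence passive resources.
PASSIVE = {"TD", "FEC", "SDRAM1", "SDRAM2"}
ACTIVE = {"DMA1", "DMA2", "DMA3", "CPU1", "CPU2"}

# Hypothetical data paths, for illustration only.
EDGES = [
    ("TD", "DMA1"), ("DMA1", "SDRAM1"),
    ("SDRAM1", "CPU1"), ("CPU1", "SDRAM1"),
    ("SDRAM1", "DMA2"), ("DMA2", "SDRAM2"),
    ("SDRAM2", "CPU2"), ("CPU2", "SDRAM2"),
    ("SDRAM2", "DMA3"), ("DMA3", "FEC"),
]


def invalid_paths(edges, passive, active):
    """Return the edges that do not link a passive and an active resource."""
    bad = []
    for src, dst in edges:
        passive_to_active = src in passive and dst in active
        active_to_passive = src in active and dst in passive
        if not (passive_to_active or active_to_passive):
            bad.append((src, dst))
    return bad


print(invalid_paths(EDGES, PASSIVE, ACTIVE))                       # []
print(invalid_paths(EDGES + [("CPU1", "CPU2")], PASSIVE, ACTIVE))  # [('CPU1', 'CPU2')]
```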
The Local Model in Array-OL Archi
The Local Model of Array-OL Archi defines for each Active Resource how its activity is distributed along a temporal or spatial (user’s choice) axis, on identical Elementary resources.
Memory icons are used to define which dimension in a memory has to be paved or fitted in the last level of the hierarchy.
E.g., CPU1 of the previous example, spatially distributed over 4 Elementary Processors (PE):
[Figure: CPU1 repeated x4; SDRAM1 (1 Mega x 64B) shown with the dimensions @, Node (4) and Width (16B)]
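One plausible reading of the figure labels is that the 64-byte width of SDRAM1 is paved by 4 nodes of 16 bytes each, one per Elementary Processor. The arithmetic can be sketched as follows; the `split_width` helper is hypothetical and the interpretation of the labels is an assumption.

```python
def split_width(total_width_bytes, n_nodes):
    """Pave a memory width with equal, contiguous byte ranges, one per node.

    Returns a list of (node, first_byte, end_byte) with end_byte exclusive.
    """
    if total_width_bytes % n_nodes:
        raise ValueError("width is not a multiple of the number of nodes")
    per_node = total_width_bytes // n_nodes
    return [(node, node * per_node, (node + 1) * per_node)
            for node in range(n_nodes)]


slices = split_width(total_width_bytes=64, n_nodes=4)
print(slices)  # [(0, 0, 16), (1, 16, 32), (2, 32, 48), (3, 48, 64)]
```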
Compilation and simulation
• Mapping/scheduling and code generation of ISP applications
• The “Y” architecture allows
– Compilation techniques to be applied just after the mapping specification
– Standard automatic parallelization techniques to be applied to ISP
• Code is produced for a particular architecture
• The diversity of targets (IP, COTS, simulator, VSIA...) calls for many code generators
– A “generic” code generator
– Distributed simulator
Compilation
• Different types of optimizations
– Single assignment, recursion and first-class functions are similar to functional languages
– Data-flow dependences are similar to forall loops
• Loop transformations, tiling…
– Signal processing allows other optimizations
• Memory management, data transfers…
• Infinite arrays and memory management
– Overlapping of time and space dimensions
• Integration of multiple optimization methods in Gaspard
• Toward a formal model to express code transformations
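Tiling, one of the loop transformations listed above, can be illustrated with a generic sketch. It shows the standard technique on a 2-D iteration space and is not taken from Gaspard; the function name and tile sizes are assumptions.

```python
def tiled_indices(n_rows, n_cols, tile_rows, tile_cols):
    """Yield (i, j) tile by tile, in row-major order inside each tile.

    The iteration space is the same as a plain double loop; only the
    visiting order changes, which improves data locality.
    """
    for ti in range(0, n_rows, tile_rows):
        for tj in range(0, n_cols, tile_cols):
            for i in range(ti, min(ti + tile_rows, n_rows)):
                for j in range(tj, min(tj + tile_cols, n_cols)):
                    yield i, j


order = list(tiled_indices(4, 4, 2, 2))
print(order[:4])  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Because the loop nest has no dependences between iterations (as with forall loops), reordering it this way does not change the result.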
Mapping/scheduling techniques
• Unified time and space dimensions allow a joint optimization
• We propose automatic transformations of regular ISP applications that allow
– Choosing the grain of the mapping to minimize communications
– Load balancing of memory occupation and performance
• Gaspard can already map regular ISP applications onto heterogeneous SoCs
• Extension to irregular ISP applications
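The effect of the mapping grain on communications can be shown with a toy cost model. This is not Gaspard's transformation: the model, the function names and the numbers are assumptions. It supposes 1-D patterns of `pattern_size` elements taken every `step` elements, so that neighbouring patterns overlap, and it counts the elements shared across block boundaries when `grain` consecutive patterns are mapped together.

```python
import math


def boundary_traffic(n_patterns, pattern_size, step, grain):
    """Elements shared across block boundaries for a given grain (toy model)."""
    overlap = max(0, pattern_size - step)
    n_blocks = math.ceil(n_patterns / grain)
    return overlap * (n_blocks - 1)


def block_sizes(n_patterns, n_processors):
    """Balanced split: block sizes differ by at most one pattern."""
    base, extra = divmod(n_patterns, n_processors)
    return [base + (1 if p < extra else 0) for p in range(n_processors)]


print(boundary_traffic(64, pattern_size=8, step=4, grain=1))   # 252
print(boundary_traffic(64, pattern_size=8, step=4, grain=16))  # 12
print(block_sizes(10, 4))                                      # [3, 3, 2, 2]
```

In this model a coarser grain reduces the shared data, while the balanced split keeps the load per processor even.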
Code generation
• Different levels of heterogeneity and communication
– SoCs are very heterogeneous; data transfers and execution time can be finely controlled (no OS)
– SMPs are homogeneous but use an OS and compilers (C, C++)
– Metacomputing ensures interoperability and communication/synchronization between software components
• Generate code for different compilers according to the mapping
• Modularization and parameterization of the Gaspard code generator
Distributed simulation
• Sophocles cyber-enterprise for SoC simulation from distributed IP
– Runtime support, distributed component coupling with good performance characteristics
• Distributed process network
– Good match between this runtime and our specification model
– Dynamic runtime support in CORBA based on multithreaded component interconnection via FIFOs
– Gaspard/Yapi gateway
• Integration of data-parallelism in PN
– Data-parallel CORBA
• VCI, defined by the Virtual Socket Interface Alliance, could be supported through CORBA wrapper integration
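The idea of multithreaded components connected by FIFOs can be sketched with the Python standard library. This is a local illustration only: it uses plain threads and `queue.Queue`, not CORBA or Yapi, and the component names are hypothetical.

```python
import queue
import threading

END = object()  # end-of-stream marker passed through the FIFOs


def producer(out_fifo, values):
    """Source process: write every value, then signal the end of the stream."""
    for v in values:
        out_fifo.put(v)
    out_fifo.put(END)


def worker(in_fifo, out_fifo, fn):
    """Processing node: blocking read, compute, blocking write."""
    while True:
        item = in_fifo.get()
        if item is END:
            out_fifo.put(END)
            return
        out_fifo.put(fn(item))


def run_network(values):
    """Connect producer -> worker -> collector with bounded FIFOs."""
    a, b = queue.Queue(maxsize=4), queue.Queue(maxsize=4)
    threads = [
        threading.Thread(target=producer, args=(a, values)),
        threading.Thread(target=worker, args=(a, b, lambda x: x * x)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = b.get()
        if item is END:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


print(run_network([1, 2, 3, 4, 5, 6]))  # [1, 4, 9, 16, 25, 36]
```

Each process only blocks on its own FIFOs, so the result does not depend on how the threads are scheduled.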
Gaspard compilation environment
Architecture Overview of Gaspard
[Figure: Gaspard architecture overview. Labels: UML tools (Rose, Together, …), classes, XML / XMI, interfaces, Gaspard files, transformations, visualization, scheduling, mapping, code generation, runtime interfaces, SIMD architecture, distributed simulation, Java, others …]