Groupe de travail Biopuces, INRA d'Auzeville March 12th, 2010
Metabolomic data: learning and wavelets
Nathalie Villa-Vialaneix
http://www.nathalievilla.org
Institut de Mathématiques de Toulouse
IUT de Carcassonne (Université de Perpignan)
Groupe de travail BioPuces, INRA de Castanet
March 12, 2010
General presentation of the data
The data were provided by Alain Paris (INRA): they are metabolomic recordings (¹H NMR) of mouse urine, with 950 variables ranging from 0.50 ppm to 9.99 ppm.
Using automatic procedures, the peaks were aligned and the baseline was (partially) corrected.
Biological question
Study of the effects of ingesting Hypochoeris radicata (HR), or toxic dandelion, on metabolism: the flowers of this plant are responsible for a disease that is fatal to horses, "Australian stringhalt" (damage to the nervous system, tremors, ...).
The experiments were carried out on 72 mice.
Description of the experiments
The mice are divided into several groups according to:
- sex: 36 males; 36 females
- daily dose of HR ingested: 0 (control): 24 mice; 3%: 24 mice; 9%: 24 mice
- 3 dates of death: day 8: 24 mice; day 15: 24 mice; day 21: 24 mice
⇒ 18 groups (although the groups derived from the dates of death are not very relevant to the question under study).
Measurement days
Urine was collected on the following days:
Day                      0   1   4   8   11  15  18  21
Number of observations   68  68  68  66  46  44  19  18
For each mouse, between 2 and 22 measurements were made.
In total, 397 observations of 950 variables.
Basic principle of wavelet decomposition
For a given integer J, the spectrum f can be decomposed at level J as:
\[
f(x) = \underbrace{\sum_{k} \alpha_k \, 2^{-J/2} \, \Psi\left(2^{-J} x - k\right)}_{\text{trend, based on the father wavelet } \Psi}
 + \underbrace{\sum_{j=1}^{J} \sum_{k} \beta_{jk} \, 2^{-j/2} \, \Phi\left(2^{-j} x - k\right)}_{\text{details at levels } 1, \dots, J \text{, based on the mother wavelet } \Phi}
\]
Example of hierarchical decomposition of a metabolome spectrum
[Figure: successive decomposition steps of a spectrum (arrows ↓ ↘), ending with details 1 to 8.]
Special case: Haar wavelets
Starting from a discretized signal $(\beta_{0,1}, \dots, \beta_{0,2^n})$, the discrete Haar wavelet transform consists of the following iterative process:
- Trend coefficients: $\beta_{j,k} = \frac{\beta_{j-1,2k-1} + \beta_{j-1,2k}}{\sqrt{2}}$ for $j = 1, \dots, n$ and $k = 1, \dots, 2^{n-j}$;
- Detail coefficients: $\alpha_{j,k} = \frac{\beta_{j-1,2k} - \beta_{j-1,2k-1}}{\sqrt{2}}$ for $j = 1, \dots, n$ and $k = 1, \dots, 2^{n-j}$.
In what follows, we keep the finest detail coefficients $(\alpha_{1,k})_k$ and the finest detail coefficients of the shifted spectrum $(\beta_{0,2}, \dots, \beta_{0,2^n}, 0)$. Together, these suffice to reconstruct the initial spectrum.
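The Haar recursion and the reconstruction claim can be checked numerically. The sketch below is an illustration with NumPy on a synthetic 950-point signal, not the code used for the slides; the function names are hypothetical, and the names `d` and `d2` only echo the labels D.k and D2.k seen in the figures (that correspondence is an assumption).

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_step(beta):
    """One Haar step on an even-length signal: returns (trend, details)."""
    beta = np.asarray(beta, dtype=float)
    if beta.size % 2:
        raise ValueError("signal length must be even")
    odd, even = beta[0::2], beta[1::2]   # beta_{2k-1} and beta_{2k} (1-based k)
    trend = (odd + even) / SQRT2
    details = (even - odd) / SQRT2
    return trend, details

def retained_coefficients(spectrum):
    """Finest details of the spectrum and of the spectrum shifted by one point."""
    spectrum = np.asarray(spectrum, dtype=float)
    _, d = haar_step(spectrum)
    shifted = np.append(spectrum[1:], 0.0)   # (beta_{0,2}, ..., beta_{0,2^n}, 0)
    _, d2 = haar_step(shifted)
    return d, d2

def reconstruct(d, d2):
    """Recover the spectrum from the two sets of finest details."""
    n = 2 * d.size
    diffs = np.empty(n - 1)              # diffs[i] = x[i+1] - x[i]
    diffs[0::2] = SQRT2 * d              # differences inside each pair
    diffs[1::2] = SQRT2 * d2[:-1]        # differences across neighbouring pairs
    x = np.empty(n)
    x[-1] = -SQRT2 * d2[-1]              # last shifted detail is (0 - x[n-1]) / sqrt(2)
    for i in range(n - 2, -1, -1):       # walk backwards using the differences
        x[i] = x[i + 1] - diffs[i]
    return x

rng = np.random.default_rng(0)
spectrum = rng.normal(size=950)          # synthetic stand-in for a 950-point spectrum
d, d2 = retained_coefficients(spectrum)
recovered = reconstruct(d, d2)
```

The two detail vectors together hold every difference between consecutive points plus the last value, which is why the walk backwards recovers the whole signal. Only an even length is needed for this single-level step, so 950 points require no padding here.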
Retained wavelet coefficients
[Figure: two panels, "Before scaling" (vertical axis from about −40 to 40) and "After scaling" (vertical axis from about −15 to 15); horizontal axis labeled with coefficient names D.1, D.57, D.125, D.297, D.370, D.443, D2.41, D2.120, D2.304, D2.389, D2.474.]
Normalization problem
[Figure: six scatter plots of principal components (PC1 vs. PC2, PC1 vs. PC3, PC1 vs. PC4, PC2 vs. PC3, PC2 vs. PC4, PC3 vs. PC4); legend: Day 0, Day 1, Day 4, Day 8, Day 11, Day 15, Day 18, Day 21.]
PCA of the coefficients: highlighting the sampling day for the control group.
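A PCA of this kind can be sketched as follows. This is an illustration with NumPy on synthetic data, assuming centered, unit-variance columns (the slides do not state the exact preprocessing); `pca_scores` is a hypothetical helper.

```python
import numpy as np

def pca_scores(X, n_components=4):
    """Scores of the rows of X on the first principal components."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # center and scale each column
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]     # coordinates on PC1..PCk
    explained = s**2 / np.sum(s**2)                     # share of variance per component
    return scores, explained[:n_components]

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 20))            # synthetic stand-in: 60 observations, 20 coefficients
days = rng.choice([0, 1, 4, 8, 11, 15, 18, 21], size=60)  # label that would color the points
scores, explained = pca_scores(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]` with one color per value of `days` would give a panel comparable to "PC1 vs. PC2".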
Chosen normalization procedure
- Determine, for each day, the median of each wavelet coefficient in the control group.
- Use these values to normalize all observations.
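One possible reading of this procedure is sketched below with NumPy. The slides say that the control-group medians are used to normalize, but not how; subtracting the median is therefore an assumption, and `normalize_by_control_median` is a hypothetical name.

```python
import numpy as np

def normalize_by_control_median(coefs, day, is_control):
    """Subtract, for each day, the per-coefficient median of the control group."""
    coefs = np.asarray(coefs, dtype=float)
    day = np.asarray(day)
    is_control = np.asarray(is_control, dtype=bool)
    out = np.empty_like(coefs)
    for d in np.unique(day):
        same_day = day == d
        ref = coefs[same_day & is_control]      # control observations of that day
        if ref.shape[0] == 0:
            raise ValueError(f"no control observation on day {d}")
        out[same_day] = coefs[same_day] - np.median(ref, axis=0)
    return out

# Synthetic example: a day effect is added to every observation.
rng = np.random.default_rng(2)
day = np.repeat([0, 1, 4, 8], 12)
is_control = np.tile([True, False, False], 16)
coefs = rng.normal(size=(48, 5)) + day[:, None]   # shift growing with the day
normalized = normalize_by_control_median(coefs, day, is_control)
```

After this step the control group has a zero median on every day, so a day effect shared by all mice no longer dominates the coefficients.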
Before normalization:
[Figure: four panels, one per coefficient (D2.444, D.78, D.332, D2.289); horizontal axis: Day (0, 1, 4, 8, 11, 15, 18, 21); vertical axis: Wavelet coefficients.]
After normalization:
[Figure: the same four panels (D2.444, D.78, D.332, D2.289); horizontal axis: Day; vertical axis: Wavelet coefficients.]
PCA after normalization
[Figure: six scatter plots of principal components (PC1 vs. PC2, PC1 vs. PC3, PC1 vs. PC4, PC2 vs. PC3, PC2 vs. PC4, PC3 vs. PC4); legend: Day 0, Day 1, Day 4, Day 8, Day 11, Day 15, Day 18, Day 21.]
PCA of the coefficients: highlighting the sampling day for the control group after normalization.
Link between PCA and total ingested dose
[Figure: six scatter plots of principal components (PC1 vs. PC2, PC1 vs. PC3, PC1 vs. PC4, PC2 vs. PC3, PC2 vs. PC4, PC3 vs. PC4); legend: TD 0, TD 3, TD 9, TD 12, TD 24, TD 33, TD 36, TD 45, TD 54, TD 63, TD 72, TD 99, TD 135, TD 162, TD 189.]
PCA of the coefficients: highlighting the total ingested dose after normalization.
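The legend values (TD) are consistent with a total dose computed as the daily dose (0, 3 or 9%) multiplied by the sampling day. The slides do not state this definition, so it is an inference; the short check below only shows that it reproduces the legend.

```python
daily_doses = (0, 3, 9)                       # daily HR dose, in %
days = (0, 1, 4, 8, 11, 15, 18, 21)           # urine collection days

# Assumed definition: total dose = daily dose x number of days elapsed.
total_doses = sorted({dose * day for dose in daily_doses for day in days})

# TD values read from the figure legend.
legend = [0, 3, 9, 12, 24, 33, 36, 45, 54, 63, 72, 99, 135, 162, 189]
matches_legend = total_doses == legend
```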
Metabolites involved in the phenomenon
[Figure: plot with a horizontal axis from 2 to 10 and a vertical axis from 0 to 20.]
Most correspond to known metabolites that are involved in the biological process (according to a preliminary study).
Motivations
The idea is to validate the impact of HR ingestion on the metabolome by trying to predict the total dose of HR ingested from the normalized and scaled wavelet coefficients. If the prediction turns out to be of good quality, the impact is not an artifact of the data, and the biological dependence is validated.
Methods compared:
- random forest
- ridge regression
- LASSO
- Elastic net
- Partial Least Squares (PLS)
- sparse PLS
Methodology
- Split the data into training and test sets while preserving the balance of the 18 groups presented in the introduction;
- Train the 6 methods on the training data, tuning the hyperparameters by cross-validation;
- Compute the mean squared error on the test data.
This procedure was repeated 250 times.
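The evaluation loop described above can be sketched as follows. This is a simplified illustration with NumPy only: it uses synthetic data, implements ridge regression alone (the other five methods would plug into the same loop), and assumes a 25% test fraction, 5 folds and a small grid of penalties, none of which is stated in the slides. All function names are hypothetical, and 20 repetitions are run instead of 250 to keep the example fast.

```python
import numpy as np

def stratified_split(groups, test_fraction, rng):
    """Boolean test mask keeping roughly the same share of each group."""
    test = np.zeros(groups.size, dtype=bool)
    for g in np.unique(groups):
        idx = rng.permutation(np.flatnonzero(groups == g))
        n_test = max(1, int(round(test_fraction * idx.size)))
        test[idx[:n_test]] = True
    return test

def ridge_fit(X, y, lam):
    """Ridge regression with an unpenalized intercept (closed form)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def mse(X, y, model):
    """Mean squared error of a fitted (weights, intercept) pair."""
    w, b = model
    return float(np.mean((X @ w + b - y) ** 2))

def tune_lambda(X, y, lambdas, n_folds, rng):
    """Pick the penalty with the lowest cross-validated MSE."""
    folds = rng.permutation(y.size) % n_folds       # balanced random fold labels
    cv_error = [
        np.mean([mse(X[folds == k], y[folds == k],
                     ridge_fit(X[folds != k], y[folds != k], lam))
                 for k in range(n_folds)])
        for lam in lambdas
    ]
    return lambdas[int(np.argmin(cv_error))]

def repeated_evaluation(X, y, groups, n_repeats, rng,
                        lambdas=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """Test MSE over repeated stratified train/test splits."""
    errors = []
    for _ in range(n_repeats):
        test = stratified_split(groups, 0.25, rng)
        lam = tune_lambda(X[~test], y[~test], lambdas, 5, rng)   # tuned on training data only
        model = ridge_fit(X[~test], y[~test], lam)
        errors.append(mse(X[test], y[test], model))
    return np.array(errors)

# Synthetic stand-in for the real data (18 groups, linear signal plus noise).
rng = np.random.default_rng(3)
groups = np.repeat(np.arange(18), 10)
X = rng.normal(size=(groups.size, 30))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=groups.size)
test_errors = repeated_evaluation(X, y, groups, n_repeats=20, rng=rng)
```

The hyperparameter is tuned inside each repetition using the training part only, so the test error is never used for model selection. The resulting vector `test_errors` is the kind of quantity that a per-method comparison plot would summarize.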
Results
[Figure: comparison of the methods (Lasso, Ridge, ELN 0.1, ELN 0.25, ELN 0.5, ELN 0.75, PLS, SPLS 5, SPLS 10, SPLS 20, RF); vertical axis from 14 to 24.]
ELN: prediction quality
[Figure: two scatter plots, "Mixing: 10%" and "Mixing: 25%"; horizontal axis: True value (0 to 150); vertical axis: Predicted value (0 to 150).]
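For reference, the elastic net (ELN) combines the LASSO and ridge penalties. One common parametrization is given below, with regression weights $w$, penalty level $\lambda \ge 0$ and mixing parameter $\rho \in [0, 1]$. Which convention lies behind the "mixing" percentage is not stated in the slides, so reading it as $\rho$ is an assumption; the symbols are chosen to avoid clashing with the wavelet coefficients $\alpha$ and $\beta$.

```latex
\hat{w} = \arg\min_{w} \; \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - x_i^{\top} w \bigr)^2
  + \lambda \Bigl( \rho \, \|w\|_1 + \frac{1 - \rho}{2} \, \|w\|_2^2 \Bigr)
```

With $\rho = 1$ this reduces to the LASSO and with $\rho = 0$ to ridge regression. Under this reading, a mixing of 10% or 25% would correspond to $\rho = 0.1$ or $\rho = 0.25$.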
ELN: coefficients involved
[Figure: plot with a horizontal axis in ppm (2 to 10) and a vertical axis from 0 to 20.]
Some coefficients are the same as those known and previously identified, some metabolites are missing from the list, and some metabolites in the list appear to be unknown. ⇒ scale effect?
Perspectives and questions
- Current normalization? Mixed-effects model?
- What is relevant for finding the most important coefficients in ELN?