Metabolomics Technologies applied to the Identification of Compounds in Plants Sofia Moço


Promotors
Prof. Dr. R. J. Bino, Professor of Plant Metabolomics, Wageningen University
Prof. Dr. S. C. de Vries, Professor of Biochemistry, Wageningen University

Co-promotors
Dr. ir. J. J. M. Vervoort, Associate Professor, Laboratory of Biochemistry, Wageningen University
Dr. R. C. H. de Vos, Senior Researcher, Plant Research International, Wageningen

Doctoral committee
Prof. Dr. Ton Bisseling, Wageningen University, The Netherlands
Dr. Robert D. Hall, Plant Research International, Wageningen, The Netherlands
Dr. Joachim Kopka, Max Planck Institute for Molecular Plant Physiology, Potsdam, Germany
Dr. John P. M. van Duynhoven, Unilever, Vlaardingen, The Netherlands

This research was carried out within the graduate school Voeding, Levensmiddelentechnologie, Agrobiotechnologie en Gezondheid (Nutrition, Food Technology, Agrobiotechnology and Health).


Metabolomics Technologies applied to the Identification of Compounds in Plants

A liquid chromatography-mass spectrometry / nuclear magnetic resonance perspective over the tomato fruit

Sofia Isabel Abraúl Viana Moço

Thesis
submitted in fulfilment of the requirements for the degree of doctor at Wageningen University, by the authority of the Rector Magnificus, Prof. Dr. M. J. Kropff, to be defended in public on Monday 15 October 2007 at four o'clock in the afternoon in the Aula

Metabolomics Technologies applied to the Identification of Compounds in Plants

Sofia Isabel Abraúl Viana Moço, 2007

PhD thesis, Wageningen University, The Netherlands. With references; with summaries in English and Dutch.

ISBN 978-90-8504-742-1


Contents

Preface

CHAPTER 1: Metabolomics technologies and metabolite identification

CHAPTER 2: Untargeted large scale plant metabolomics using liquid chromatography coupled to mass spectrometry

CHAPTER 3: A liquid chromatography mass spectrometry based metabolome database for tomato

CHAPTER 4: Tissue specialization at the metabolite level is perceived during the development of tomato fruit

CHAPTER 5: Building up a comprehensive database of flavonoids based on nuclear magnetic resonance data

CHAPTER 6: Push-button flavonoid identification: a NMR database integrated with a 1H NMR predictive model

CHAPTER 7: Metabolite correlations in tomato obtained by fusion of liquid chromatography-mass spectrometry and nuclear magnetic resonance data

Summarizing discussion and conclusions

Samenvatting (summary in Dutch)
References
Curriculum vitae
List of publications
Acknowledgements / Agradecimentos
Training and supervision plan


Preface

A new era of plant biochemistry at the systems level is emerging in which the detailed description of biochemical phenomena, at the cellular level, is important for a better understanding of physiological, developmental, and biomolecular processes in plants. This emerging field is oriented towards the characterisation of small molecules (metabolites) that act as substrates, products, ligands or signalling entities in cells. This thesis concerns the development and establishment of such metabolomics strategies for screening and identifying metabolites in biological systems. Most technological strategies were applied to the assignment of metabolites from tomato (Solanum lycopersicum) fruit. Tomato was chosen because it is a widely consumed crop with nutritional attributes and represents a model for the Solanaceae family. In order to achieve both high coverage of detected metabolites and valuable information for identification purposes, liquid chromatography coupled to mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) technologies were used. In addition, metabolite databases based on experimental data (mass-based in the case of LC-MS and chemical shift-based in the case of NMR) were initiated in order to systematise the extensive metabolite information. The chapters in this thesis describe method developments and their applications in plant metabolomics that can also be implemented in other biological systems.

A review of the technologies used for metabolomics, with a perspective on compound identification, is presented in Chapter 1.

In Chapter 2, a robust large scale LC-MS method for the analysis of metabolites in plants is described in detail. It presents a step-by-step protocol with thorough information about the reagents used, sample preparation, instrument set-up, methods of analysis and data processing strategies. The described analytical method combines LC with photo diode array (PDA) and MS detection, and allows the analysis of mostly semi-polar secondary metabolites present in plants, such as phenolic acids, flavonoids, glucosinolates, saponins, alkaloids and derivatives thereof.

Chapter 3 presents an application of the LC-PDA-MS method for the profiling of metabolites present in tomato fruit. The metabolites putatively identified in this fruit were included in a tomato-dedicated database (the MoTo DB) that is available for public search on the web (see: http://appliedbioinformatics.wur.nl). Using this MoTo DB, the metabolite contents of two tomato fruit tissues, peel and flesh, were compared.

Using the same LC-PDA-MS setup, several different tomato fruit tissues were compared in more detail, along the fruit ripening timeline, in Chapter 4. The presence of tissue-specific metabolites at specific ripening stages suggests developmental control of metabolite biosynthesis. Such a tissue-specific metabolomics approach may provide a biological view of metabolite compartmentalisation.

Chapters 5 and 6 describe the implementation of an NMR database for secondary metabolites, mostly flavonoids: the Flavonoid Database (see: Flavonoid Database under http://www.wnmrc.nl). The acquisition of a large data set of related standard compounds allowed the analysis of shifts in NMR characteristics caused by the presence of certain functional groups or substituents in the flavonoid backbone. In addition, a 1H NMR-based prediction model was iteratively trained from the acquired experimental data and can be used for predictions on unknown related molecules. This approach greatly increases the efficiency of the identification of (flavonoid) metabolites.

Chapter 7 describes correlations of metabolomics data derived from LC-MS and NMR analyses of a large number of different tomato cultivars. Metabolites were identified using, among other available sources, the MoTo DB and the Flavonoid Database. This approach illustrates the complementarity and agreement of NMR and MS as analytical techniques applied to the detection of metabolites in tomato fruit.

The summarizing discussion and conclusions set the work presented in this thesis in a biochemical perspective and offer suggestions for the future.


Chapter 1

Metabolomics Technologies and Metabolite Identification

Sofia Moco, Raoul J. Bino, Ric C.H. De Vos, Jacques Vervoort. Trends in Analytical Chemistry (2007), in press

Metabolomics studies rely on the analysis of the multitude of small molecules (metabolites) present in a biological system. Most commonly, metabolomics is heavily supported by mass spectrometry (MS) and nuclear magnetic resonance (NMR) as parallel techniques. These two technologies provide an overview of the metabolome and possess high compound-elucidation power. Beyond the capacity for large scale analysis, a major effort should be devoted to the unequivocal identification of metabolites. The combination of liquid chromatography (LC)-MS and NMR is a powerful methodology to achieve metabolite identification. A better chemical characterization of the metabolome will undoubtedly enlarge the knowledge of any biological system.


INTRODUCTION

Biological systems are under constant challenges from the environment. Adaptation to environmental stimuli is reflected in alterations in the genome, the transcription of genes, the expression level and post-translational modifications of proteins, and in the primary and secondary metabolism. The phenotype of the organism is the product of its genotype within its environment. The metabolic composition is reflected onto the phenotype and hence a detailed analysis of the metabolome is a representation of the phenotype under study. Phenotypic changes are therefore most adequately monitored by reliable metabolomics studies.

Metabolomics stands out from any other organic compound analysis in scale and in chemical diversity, i.e., it aims to describe all metabolites, both primary and secondary, present in an organism or biological system. Perhaps the most striking feature of metabolomics lies in its integrative capacity, as part of the “omics” disciplines, which has resulted in a shift from mainly pure (organic) chemistry-based characterization (as in phytochemistry) into a biochemical context. In plants, the characterization of endogenous primary and secondary metabolites is of interest for the quality and improvement of crops, as well as in the study of e.g. physiology, ecology and developmental phenomena in plant biochemistry. Metabolomics can thus provide valuable tools, relevant in a wide range of applications (Table 1.1) including the perception of cellular phenomena through systems biology approaches (Bino et al., 2004; Hall, 2006).

Table 1.1. Fields of metabolomics applications.

Area of application
Plant breeding and crop quality assessment
Food assessment and safety
Toxicity assessment
Nutrition assessment
Medical diagnosis and assessment of disease status
Pharmaceutical/drug developments
Yield improvement in crops and fermentations
Biomarker discovery
Technological advances in analytical chemistry
Genotyping
Environmental adaptations
Gene function elucidation
Integrated in systems biology

Systems biology is considered to be the latest strategy to describe cellular mechanisms at a global scope, making use of transcriptomic, proteomic and metabolomic information. This discipline is expected to provide a better understanding of cell biology by enabling the study of the function and behaviour of molecular interactions in complex networks (Galperin and Ellison, 2006). The ability to address biological questions using a systems biology approach depends on the information that can be obtained from the system under study. However, the quality of the conclusions extracted from such a study also depends on the information that is fed into the system. Hence, a better understanding of a given biological system through a metabolomics approach relies on the number of participating metabolites of known identity.

Figure 1.1. Heuristic representation of the metabolome, indicating that only a small fraction of metabolites has been identified up to now, while the majority of naturally occurring metabolites is still unknown. The analytical techniques most commonly used for metabolite identification are LC-MS, NMR, GC-MS, CE-MS and PDA.

In metabolomics, the need to enlarge the list of identified metabolites is increasingly a main constraint (Fig. 1.1). The extensive data sets nowadays obtained from analytical platforms, such as the most commonly used MS- and NMR-based systems, create a gap between “signal x (or at most, detected metabolite X)” and “metabolite with IUPAC name 2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-3-[(2S,3R,4S,5R,6R)-3,4,5-trihydroxy-6-[[(2R,3R,4R,5S,6S)-3,4,5-trihydroxy-6-methyl-oxan-2-yl]oxymethyl]oxan-2-yl]oxy-chromen-4-one, also commonly known as rutin with CAS registry number 153-18-4, described using a unique InChI identifier” (Fig. 1.2). The construction of (experimental) spectrometric and spectroscopic-based metabolite databases and the accessibility of searchable chemical databases are some of the initiatives that can help to narrow this gap. The challenge resides not only in obtaining high quality data suitable for identification from the available analytical technologies, but also in the integration and development of bio-computational tools for automation of the data analysis.

The identification of metabolites is a necessity for understanding the molecular nature of the biochemical processes in which they participate, as substrates or products, in reactions at the (sub)cellular level. The characterization of compounds can be detailed up to the stereochemical configuration, accounting for their three-dimensional (3D) structure. In general, chiral structures profoundly influence chemical and biological mechanisms, being essential for structure-activity relationships in catalysis, drug development applications and medicinal chemistry.

[Figure 1.2 panels: LC (composition of extract; retention time, hydrophobicity); UV (spectra 200-600 nm; chromophores, λmax); MS (accurate mass; elemental composition CaHbNcOdPeSf; isotopic pattern); MS/MS (fragmentation pattern; structural information); NMR (chemical shifts; coupling constants); standard compounds; literature; (bio)chemical databases.]

Figure 1.2. Pieces of information given by analytical technologies and knowledge resources that lead to the identification of a metabolite, here exemplified for the metabolite rutin (IUPAC name: 2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-3-[(2S,3R,4S,5R,6R)-3,4,5-trihydroxy-6-[[(2R,3R,4R,5S,6S)-3,4,5-trihydroxy-6-methyl-oxan-2-yl]oxymethyl]oxan-2-yl]oxy-chromen-4-one; CAS registry number 153-18-4; InChI identifier: InChI=1/C27H30O16/c1-8-17(32)20(35)22(37)26(40-8)39-7-15-18(33)21(36)23(38)27(42-15)43-25-19(34)16-13(31)5-10(28)6-14(16)41-24(25)9-2-3-11(29)12(30)4-9/h2-6,8,15,17-18,20-23,26-33,35-38H,7H2,1H3/t8-,15+,17-,18+,20+,21-,22+,23+,26+,27-/m0/s1): liquid chromatography (LC), mass spectrometry (MS), fragmentation pattern analysis (MS/MS), ultraviolet/visible range spectroscopy (UV/Vis) obtained by the photo diode array (PDA), nuclear magnetic resonance (NMR), experimental validation by standard compounds and information present in literature and databases.

The technologies used in metabolomics allow the characterization of molecules by providing pieces of information that can lead to their annotation and ultimately to their identification. In this study, we pinpoint several major considerations to be taken into account in any metabolomics approach: sample preparation, analytical technique, data analyses, identification tools and databases, and finally hypothesis testing and conclusions. Special attention is given to the identification of metabolites in plants by means of LC-MS and NMR strategies.


SAMPLE PREPARATION

Sample preparation is perhaps the most underestimated part of metabolomics analyses. In any biological system, metabolites of a wide chemical diversity are present in a dynamic range of concentrations that can exceed 10^6 (e.g. the ratio in concentrations between sucrose and brassinolide in Arabidopsis). In plants, a major part of the large diversity in the metabolome is due to the presence of a wide range of secondary metabolites, whose number generally strongly exceeds that of the primary metabolites. The composition and quantity of detected metabolites depend to a large extent on the sample preparation chosen. Among the estimated hundreds of thousands of metabolites that can exist in the plant kingdom, there is an impressive chemical variation. This large chemical variation exists not only between different plant species but also between different tissues of a single plant. According to Krishnan et al. (2005), a typical cell may contain 5,000 metabolites (expectedly in diverse concentrations and with diverse chemical properties), which challenges the ability of a sample preparation method to capture as many of these metabolites as possible. The extent of the detected metabolome is therefore dependent on the contents of the (prepared) biological sample. The more steps introduced in the sample preparation, such as sequential extractions and concentrations (for example to favour a particular class of compounds), the narrower will be the chemical diversity of compounds finally present in the extract. One should be aware that the further one moves along the analysis pipeline, the narrower the overall view of the metabolites present in the sample becomes. On the other hand, the knowledge of the (narrow) set of metabolites that endures the complete analytical pipeline, i.e. from sample preparation to identified metabolite, is progressively richer (Fig. 1.3).

In order to have reproducible measurements, the biological material should be as homogeneous as possible in terms of environmental conditions (light, temperature, humidity, nutrients, time of sampling, etc.), ideally leaving the biological variation as the only inherent variation. For metabolomics applications, a fast, reproducible and unselective extraction method is preferred for the detection of a wide range of metabolites that occur in the plant, avoiding unforeseen chemical modifications. There are various methodologies for extracting compounds from biological materials: liquid extraction (temperature or pressure-assisted), solid phase extraction (SPE), solid phase microextraction, and microwave assisted extraction. In general, metabolites of interest are extracted by liquid extraction with one solvent, aqueous or organic, or with a combination of solvents (liquid-liquid extraction), implying that the type of metabolites extracted is dependent on the chemical properties of the solvent used. For a certain class of metabolites, a particular solvent can be more adequate, yet not unique, for its extraction. For plants, semi-polar compounds such as phenolic acids, flavonoids, alkaloids and glycosylated sterols are successfully extracted in solutions of methanol/water, while the apolar carotenoids are better extracted in chloroform. The choice of solvent should also be compatible with the analytical instruments used. For reversed phase LC-MS analyses, solvents such as ethyl acetate or chloroform are not advisable, as these do not dissolve in the mobile phase used for the chromatography, nor do they produce an efficient spray in the case of direct flow injection analysis. On the other hand, in NMR analyses any solvent can be used, preferably deuterated in the case of 1H NMR measurements. More important than the choice of sample preparation protocol is the reproducibility of the extraction and the ability to distinguish naturally occurring compounds.

[Figure 1.3 schematic: biological system (thousands of metabolites) → biological tissue → biological extract → separation and detection (CE, LC, GC, ...; PDA, MS, NMR, ...) → chromatogram + spectrum → list of peaks → visualization; differential metabolites → list of putative metabolites → identified metabolite. Resources feeding the pipeline: publications; species DBs (DNP, HMDB, KNApSAcK); spectral DBs (AMDIS/NIST, ACD Labs, PERCH, AMIX); chemical DBs (PubChem, SciFinder, Beilstein, Merck Index). Integration: genomics, transcriptomics, proteomics, interactomics; fluxomics; metabolic pathways, metabolic networks, systems biology.]

Figure 1.3. Metabolomics pipeline towards a systems biology approach: from the whole metabolome to identified metabolites. The large number of metabolites present in a biological system (e.g. a plant) undergoes a strategy of combined experimental design and data interpretation to achieve the identification of only a few metabolites naturally occurring in the system. This procedure includes: sampling, sample preparation and extraction, analysis by typically CE-, LC-, GC-MS or NMR, interpretation of the chromatograms and spectra obtained, a list of statistically relevant candidate peaks, visualisation of the multivariate data, extraction of differential peaks, construction of a list of putative metabolites, and identification of a metabolite. Along this procedure, resources such as species databases, literature, spectral databases and chemical databases can be fed into the analytical pipeline, narrowing the number of ambiguities for candidate metabolites. Metabolite information can aid the interpretation of metabolic pathways and networks in combination with dynamic and transient measurements made by flux analyses. The integration of genomics, transcriptomics, proteomics and interactomics with metabolomics contributes to a systems biology overview of the system (for abbreviations see Table 1.2).


ANALYTICAL TECHNOLOGIES

LC-MS

MS is a spectrometric method that allows the detection of mass-to-charge species pointing to the molecular mass (MM) of the detected metabolite. As a developing technology in metabolomics applications, there are various configurations of mass spectrometers, in terms of ion acceleration and mass detection, ion production interfaces and ion fragmentation capabilities (Fig. 1.4). Moreover, there have been constant adjustments over the years in the hardware and software of mass spectrometers to meet robustness, practicality, applicability and efficiency of the analyses.

[Figure 1.4 schematic: analyte A → ion A, via ion production (EI, ESI, APCI, MALDI, DESI, APPI), ion acceleration and detection (Q-MS, TripleQ-MS, Q-Ion trap-MS, TOF-MS, Q-TOF-MS, FT-ICR-MS, FT-Orbitrap-MS) and ion fragmentation (CID, SID, IRMPD, ECD).]

Figure 1.4. Configuration possibilities of mass spectrometers. There are different configurations of mass spectrometers, according to the ion acceleration and detection: quadrupole-MS (Q-MS), triple quadrupole-MS (tripleQ-MS), quadrupole-ion trap-MS (Q-ion trap-MS), time-of-flight-MS (TOF-MS), Fourier transform-ion cyclotron resonance-MS (FT-ICR-MS) and FT-Orbitrap-MS. There are different interfaces for the production of ions: electron impact (EI), electrospray (ESI), atmospheric pressure chemical ionisation (APCI), matrix assisted laser desorption ionisation (MALDI), desorption electrospray ionisation (DESI) and atmospheric pressure photoionisation (APPI). In terms of ion fragmentation techniques, collision-induced dissociation (CID) is the most conventional method. Other fragmentation techniques include surface-induced dissociation (SID), infrared multiphoton dissociation (IRMPD) and electron-capture dissociation (ECD), especially for the fragmentation of multiple-charged polypeptides.

The performance of soft-ionisation mass spectrometers, as used in LC-MS applications, can be described (and compared) by means of several intrinsic parameters (Fig. 1.5): the mass resolving power (or resolution), the mass accuracy, the linear dynamic range and the sensitivity (McLuckey and Wells, 2001). The improvement of these parameters enables a more effective identification of the MM of the analyte injected into the MS instrument. In general, the widely used quadrupole (Q)-MS instruments have a mass resolving power about 4 times lower than that of a time-of-flight (TOF)-MS, while a Fourier transform (FT)-ion cyclotron resonance (ICR)-MS can reach a resolving power higher than 1,000,000 (or 400 times higher than a Q-MS) (Balogh, 2004). A higher mass accuracy facilitates a finer distinction between closely related mass-to-charge signals. Consequently, the quality and the quantity of the assignments of mass signals to metabolites can be much improved by using high and ultra-high resolution accurate mass spectrometers.

Hybrid TOF-MS instruments such as QTOF-MS instruments are widely used in metabolomics due to their high sensitivity, mass resolving power (about 10,000) and mass accuracy, combined with semi-automated instrument control. However, in terms of linear dynamic range, (Q)TOF-MS instruments are limited by the properties of the time-to-digital converter detector, which is only able to record one ion per dead time. Intense mass signals become saturated, masking their real intensity and leading to distortions of the mass peak shape that produce deviations in mass accuracy, typically towards lower mass-to-charge values (Chernushevich et al., 2001; Verhoeven et al., 2006). Recently, some improvements have been implemented in (Q)TOF-MS instruments, extending their dynamic range. The use of an online lock mass spray, acting as an internal standard, can help to correct for deviations in the mass-to-charge axis, and can dictate the ion intensity interval for which the mass accuracy is highest and adequate for elemental composition calculation (Moco et al., 2006a).
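As a rough illustration of the lock mass idea, the Python sketch below rescales measured m/z values by the ratio between the theoretical and the observed m/z of a reference ion. The purely proportional correction, the function name and all numbers are assumptions made for illustration; instrument software may apply a different calibration model.

```python
def lock_mass_correct(mz_values, lock_measured, lock_theoretical):
    """Rescale m/z values so that the lock mass lands on its theoretical value.

    Assumes (for illustration only) that the calibration drift is a single
    multiplicative factor across the whole m/z axis.
    """
    factor = lock_theoretical / lock_measured
    return [mz * factor for mz in mz_values]


# Hypothetical scan: a reference ion expected at m/z 556.2771 is observed at 556.2799,
# so every signal in the same scan is shifted down by the same relative amount.
corrected = lock_mass_correct([609.1492, 301.0369], 556.2799, 556.2771)
print([round(mz, 4) for mz in corrected])
```

In this picture, a saturated analyte signal would still be mis-assigned after correction, which is why the intensity interval over which the correction holds matters.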

[Figure 1.5: mass peak plotted as % relative abundance (0 to 100) against m/z (999.94 to 1000.04), with the peak maximum, the level at 50% of the maximum and the FWHM (Δm) marked. Panel definitions:
Mass resolving power: $R = m/\Delta m$
Mass accuracy: $\Delta m = (m_{\mathrm{measured}} - m_{\mathrm{real}})/m_{\mathrm{real}}$
Linear dynamic range: range over which the ion signal is linear with the analyte concentration
Sensitivity: $\mathrm{SNR} = (\text{intensity of } m)/(\text{noise intensity})$]

Figure 1.5. Parameters used to describe the performance of mass spectrometers. The mass resolving power (or resolution), m/Δmx, can be described in two ways: (i) m being the averaged mass-to-charge ratio associated with two adjacent mass signals of equal size and shape that overlap by x% (50% is commonly used nowadays) and Δmx being the difference in mass-to-charge between the two adjacent mass signals, or (ii) m being the mass at the apex of the mass signal and Δmx being the width at x% height (typically 50%) of this mass signal, designated FWHM (full width at half maximum). The mass accuracy is described by the ratio between the mass error (difference between measured and real mass) and the theoretical mass, often expressed in parts per million (ppm). The sensitivity is described by the ratio between the intensity level of the mass signal and the intensity level of the noise. The linear dynamic range is described as the range over which the measured ion signal is linear as a function of the analyte concentration.
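The definitions in Fig. 1.5 translate directly into simple arithmetic. The following Python sketch uses hypothetical function names and illustrative values that are not taken from any instrument; it assumes the FWHM definition of resolving power and the ppm form of mass accuracy.

```python
def resolving_power(mz, fwhm):
    """Mass resolving power R = m / delta_m, with delta_m taken as the FWHM."""
    return mz / fwhm


def mass_error_ppm(measured, theoretical):
    """Mass accuracy: mass error relative to the theoretical mass, in ppm."""
    return (measured - theoretical) / theoretical * 1e6


def signal_to_noise(signal_intensity, noise_intensity):
    """Sensitivity expressed as the signal-to-noise ratio."""
    return signal_intensity / noise_intensity


# Illustrative values: a peak at m/z 1000 that is 0.1 wide at half maximum
# corresponds to a resolving power of about 10,000.
print(resolving_power(1000.0, 0.1))
# A peak measured 0.002 above its theoretical m/z of 1000 is off by about 2 ppm.
print(mass_error_ppm(1000.002, 1000.0))
print(signal_to_noise(500.0, 5.0))
```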


FT-MS instruments, both the cyclotron (FT-ICR-MS) and the Orbitrap type (FT-Orbitrap-MS), enable measurements at a higher mass accuracy in a wider dynamic range. The FT-ICR-MS has the highest mass resolving power so far reported for any mass spectrometer (>1,000,000) and a mass accuracy generally within 1 ppm (Brown et al., 2005). The recently developed FT-Orbitrap-MS has a more modest performance compared to the FT-ICR-MS (maximum resolving power > 100,000 and 2 ppm of mass accuracy with internal standard), but is a high speed and high ion transmission instrument due to shorter accumulation times. This is a very advantageous characteristic especially when hyphenated to a separation technique, such as LC, and also when carrying out MS/MS experiments (Makarov et al., 2006). The appearance of high mass accuracy instruments in a wide dynamic range can improve immensely the identification capabilities in the online methods applied to complex mixtures.

The mass detection of a molecule in soft-ionisation MS is conditioned by the capacity of the analyte to ionise while being part of a complex mixture. Because only ions, either anions or cations, can be measured by MS, metabolites unable to ionise cannot be detected. Apart from the chemical properties of the molecule itself, the eluent flow and composition, the sample matrix and the ionisation source all influence the ionisation. Ion suppression and matrix effects can become a main issue, in particular in semi-quantitative measurements. The use of ionisation enhancers, sample clean-up methods and different ionisation sources are some of the possibilities that can improve the ionisation of the analytes under study (Mallet et al., 2004).

Most MS applications in metabolomics make use of a separation method before mass detection, typically LC, gas chromatography (GC) or capillary electrophoresis (CE). Such a separation step introduces an extra dimension for identification (retention time) to the data, and reduces the complexity of the data analysis by avoiding ion suppression at the source (especially relevant when soft ionisation is used). A separation method, however, diminishes the throughput of analyses compared to a direct flow injection method. Different classes of compounds can be measured according to the separation technique used. LC is probably the most versatile separation method, as it allows the separation of compounds of a wide range of polarity. Using reversed-phase columns, semi-polar compounds (phenolic acids, flavonoids, glycosylated steroids, alkaloids and other glycosylated species) can be separated, and by using hydrophilic columns, polar compounds (sugars, amino sugars, amino acids, vitamins, carboxylic acids and nucleotides) can also be measured by LC-MS (Tolstikov and Fiehn, 2002). The appearance of ultra performance liquid chromatography (UPLC) can improve the speed of analysis but, more importantly, provides a better chromatographic resolution. The hyphenation of UPLC to MS can be advantageous for a better assignment of metabolites from chromatographic mass signals.

Regardless of the configuration of the (UP)LC-MS system, the robustness and reproducibility of the analyses (in retention time and mass accuracy), as well as efficient ionisation, are essential for obtaining consistent data (De Vos et al., 2007). The chromatographic parameters (temperature, pH, column, flow rate, eluents, gradient), injection parameters, sample properties, MS and MS/MS parameters (calibration and instrumental parameters such as capillary voltage, orientation of lens, etc.) and all other parameters related to the configuration of the LC-MS system (presence of other detectors such as photo diode array (PDA), tube widths, etc.) may all influence the performance of the metabolomics analyses. An adequate configuration should be adopted that fits the aim of the analyses and the limitations of the instruments.

Using LC-MS in the identification of metabolites

Metabolite assignments using LC-MS as a tool for compound identification are usually obtained by combining accurate mass, isotopic distribution, fragmentation patterns and any other mass spectrometric information available.

The calculation of the elemental compositions that fit a certain accurate mass is generally one of the first steps in obtaining a set of alternatives that can lead to the identity of the detected metabolite. This set of alternatives becomes smaller if the mass spectrometer can provide a more accurate MM value (Kind and Fiehn, 2006). With an instrument that provides very high mass accuracy, the range of possible molecular formulae (MF’s) is limited and can, especially at lower m/z values, lead directly to the correct MF. The number of possible MF’s increases with increasing MM. Furthermore, in most cases a pre-selection of chemical elements can be made, which avoids generating the excessive number of false alternatives that results from including all elements of the periodic table. For general applications in plant or animal metabolomics, most metals can be excluded (except perhaps Na and K, which are common adducts in mass spectra), leaving C, H, O, N, P and S as the core elements. Logically, any other element for which there is the slightest evidence of presence in the analysed sample should be included in the elemental composition calculation. Another point to take into account when MF’s are calculated from MM’s is the algorithm used for the calculation. There are more mathematical combinations of elements that fit a certain MM than there are chemically existing MF’s. This is related to chemical rules, such as the octet rule, that limit chemical bonding according to the electronic distribution of the participating atoms. The widely applied nitrogen rule is used to assess the presence or absence of N-atoms in a molecule or ion. Another useful item is the number of rings and double bonds. As described by Bristow (2006), the number of rings and double bonds can be calculated from the number of C, H and N atoms that a molecule contains (assuming a C, H, N and O containing molecule).
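As a rough illustration of the elemental composition calculation described above, the sketch below enumerates C, H, N, O formulae whose monoisotopic mass falls within a ppm window of a measured neutral mass, and discards candidates whose ring-plus-double-bond count is negative or non-integer. The function names, the element limits and the 5 ppm default are assumptions made for this example; it is not the algorithm of any particular instrument software.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def rings_plus_double_bonds(c, h, n):
    """Rings + double bonds of a neutral CHNO molecule: C - H/2 + N/2 + 1."""
    return c - h / 2 + n / 2 + 1


def formula_string(c, h, n, o):
    """Write atom counts as a formula, omitting absent elements and counts of 1."""
    parts = (("C", c), ("H", h), ("N", n), ("O", o))
    return "".join(f"{el}{k if k > 1 else ''}" for el, k in parts if k > 0)


def candidate_formulas(mass, ppm=5.0, max_c=30, max_n=5, max_o=20):
    """Return (formula, error in ppm, rings + double bonds), best match first."""
    tol = mass * ppm * 1e-6
    hits = []
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                rest = mass - c * MASS["C"] - n * MASS["N"] - o * MASS["O"]
                if rest < -tol:
                    break  # adding more oxygen only overshoots further
                h = round(rest / MASS["H"])  # hydrogens fill the remaining mass
                error = rest - h * MASS["H"]
                if h < 0 or abs(error) > tol:
                    continue
                rdb = rings_plus_double_bonds(c, h, n)
                # A neutral, closed-shell molecule needs a non-negative integer.
                if rdb >= 0 and rdb.is_integer():
                    hits.append((formula_string(c, h, n, o), 1e6 * error / mass, rdb))
    return sorted(hits, key=lambda hit: abs(hit[1]))
```

Calling `candidate_formulas(180.06339)` returns a list that includes C6H12O6, the formula of a hexose, with one ring or double bond. Widening the ppm window or the element limits lengthens the list, in line with the trend described in the text.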

One of the most powerful methods for narrowing the number of MF’s is to make use of the isotopic pattern of a mass signal. For most small organic molecules M, the intensity of the second isotopic signal, corresponding to the 13C signal, can unravel the number of carbons that the molecular ion contains knowing that the natural abundance of 13C is 1.11%. This is therefore of major assistance in the assignment of MF’s from MM’s. According to Kind and Fiehn (2006), this strategy can remove more than 95% of the false positives and can even outperform an analysis of solely accurate mass using a (yet non-existing) mass spectrometer capable of 0.1 ppm mass accuracy. With the appearance of large dynamic range MS instruments (with good isotopic intensity measurements), this is certainly an efficient strategy when combined with the MS spectra analyses tools described below.
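The carbon count estimate mentioned above can be sketched as follows. The sketch assumes that the M+1 signal arises from 13C only (contributions of 2H, 15N, 17O or 33S are ignored) and uses the 1.11% natural abundance quoted in the text; the function name is hypothetical.

```python
def estimate_carbon_count(intensity_m, intensity_m1, abundance_13c=0.0111):
    """Estimate the number of carbons from the M+1 / M intensity ratio.

    Assumes I(M+1) / I(M) = n * p / (1 - p), with p the 13C natural
    abundance and n the number of carbon atoms in the ion.
    """
    ratio = intensity_m1 / intensity_m
    return round(ratio * (1 - abundance_13c) / abundance_13c)
```

For an M signal of 100 and an M+1 signal of 16.84 (arbitrary units), the estimate is 15 carbons. In practice the result is only as good as the measured isotope intensities, which is why the dynamic range of the instrument matters.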

The fragmentation pattern of a mass signal can provide structural information about the fragmented ion. From the fragments obtained the structure of the molecule can be deduced, knowing that the breakages will occur at the weakest points of the ion. For example, an O-glycosylated flavonoid will firstly fragment on the glycosidic linkage and only afterwards in the aglycone backbone, if sufficient energy is provided. The possibility of isolating one ion and performing tandem MS to the successively obtained fragments can be highly informative for tracking functional groups and connectivity of fragments for structure elucidation of metabolites. In addition, the possibility of obtaining accurate mass fragments is also another advantage when there is little knowledge about the possible atomic arrangements of the molecular ion.

Moreover, there is a series of MS experiments that can enhance our knowledge about the metabolites of interest. These include comparisons of analyses obtained in positive and negative ion modes (either by online switching or offline) and neutral mass loss experiments, which can aid in the identification of certain functional groups or substituents, such as hydroxyls, carbonyls or glycosides (Fabre et al., 2001). The use of 13C material as internal standard is also an elegant method of obtaining metabolite information (Mashego et al., 2004).
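A neutral loss search of the kind mentioned above can be sketched in a few lines. The list of losses, the tolerance and the function name are assumptions for illustration, and singly charged ions are assumed throughout.

```python
# Monoisotopic masses (Da) of a few common neutral losses (illustrative list).
NEUTRAL_LOSSES = {
    "H2O": 18.010565,
    "CO": 27.994915,
    "CO2": 43.989829,
    "deoxyhexose (C6H10O4)": 146.057909,
    "hexose (C6H10O5)": 162.052823,
}


def annotate_losses(precursor_mz, fragment_mzs, tol_da=0.005):
    """Label fragments whose distance to the precursor matches a known loss."""
    annotations = []
    for fragment in fragment_mzs:
        delta = precursor_mz - fragment
        for name, loss in NEUTRAL_LOSSES.items():
            if abs(delta - loss) <= tol_da:
                annotations.append((fragment, name))
    return annotations
```

With the illustrative values `annotate_losses(463.0882, [301.0354, 271.0248])`, the fragment at m/z 301.0354 is labelled as a hexose loss, as expected for an O-glycosylated flavonoid that first cleaves at the glycosidic linkage.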


Additionally, when a separation method is coupled to the mass spectrometer, the retention time is a parameter that can give information about the polarity of the metabolite. Nowadays, in stabilized (LC or GC)-MS setups, the retention time variation can be relatively low, allowing direct comparison of chromatograms and the construction of metabolite databases (Lisec et al., 2006; Moco et al., 2006a; De Vos et al., 2007).

Data obtained from additional detectors can also be a complementary source of structural information of a metabolite. Typically, for a well-separated chromatographic signal with sufficient intensity, a full absorbance spectrum can be obtained in the ultraviolet/visible (UV/Vis) range using a PDA detector. For many secondary metabolites, their light-absorbance spectra can indicate at least the classes of compounds that these belong to, as the type of chromophores can be inferred from the absorbance maxima and the shape of the spectrum. Absorbance maxima can undergo slight shifts with the introduction of conjugations in the polyaromatic system.

Possibly the most straightforward approach for confirming the identity of metabolites in a biological sample is to test commercially available standard compounds on the same analytical system. However, this approach depends on the (commercial) availability of such standard compounds, which is limited, especially for secondary metabolites. When standard compounds are available, they are useful not only for confirming the identity of compounds but also for (semi-)quantitative analyses and, most importantly, for the construction of metabolite databases containing experimental data of tested compounds on a fully characterized system.

In summary, the ability to assign metabolites using MS resides in the possibility of combining different features of the MS analysis (accurate mass, fragmentation pattern and isotopic pattern) with additional experimental parameters such as retention time, UV/Vis spectra and confirmation with standard compounds. Biochemical, literature and species information, as well as any other relevant information, is also valuable for the assignment of the metabolite under study (Fig. 1.2).

NMR

NMR is a spectroscopic technique that takes advantage of the spin properties of atomic nuclei. The nuclear spin is the total angular momentum of the nucleus. Only nuclei with a non-zero nuclear spin exhibit nuclear magnetic resonance. Among these are 1H, 13C, 15N and 31P, isotopes of elements that are present in bio-organic molecules. Depending on the nucleus, the nuclear spin can assume at least 2 different spin states. When exposed to an external magnetic field, the spins of the nuclei (re)orient along the magnetic field axis, making it possible to change from one spin state to the other by absorbing energy. This is the basis of nuclear magnetic resonance. The energy difference between the nuclear spin states depends on the (external) magnetic field strength and the gyromagnetic ratio of the nucleus, and the populations of the spin states follow a Boltzmann distribution that additionally depends on the temperature of the sample. Because the nuclear transition energy is much lower (typically by a factor in the order of 10^4) than that of an electronic transition, NMR is not as sensitive as other techniques such as infrared (IR) or UV/Vis spectroscopy (Claridge, 1999; Nave, 2005).
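To give a feeling for the small energy differences involved, the sketch below computes the 1H resonance frequency and the Boltzmann population excess of a spin-1/2 nucleus. The constants are standard values, while the function names and the 298 K default are assumptions made for the example.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant, J s
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J / K
GAMMA_1H = 2.6752218744e8    # proton gyromagnetic ratio, rad s^-1 T^-1


def larmor_frequency_hz(b0_tesla, gamma=GAMMA_1H):
    """Resonance frequency of a nucleus in a static field B0."""
    return gamma * b0_tesla / (2 * math.pi)


def population_excess(b0_tesla, temperature_k=298.0, gamma=GAMMA_1H):
    """Fractional excess (N_low - N_high) / (N_low + N_high) for spin 1/2."""
    delta_e = H_PLANCK * larmor_frequency_hz(b0_tesla, gamma)
    return math.tanh(delta_e / (2 * K_BOLTZMANN * temperature_k))
```

At 14.1 T the proton frequency comes out close to 600 MHz, and the population excess at room temperature is only a few spins in every 100,000, which illustrates why NMR is inherently less sensitive than optical spectroscopies.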

Nevertheless, NMR is perhaps the most selective analytical technique available, being able to provide unambiguous information from the magnetic signatures of the atoms that take part in a molecule. One of the many applications of NMR is the ability to elucidate chemical structures, as it can provide highly specific evidence for the identification of a molecule. Furthermore, NMR is a quantitative technique, because the number of nuclear spins is directly related to the intensity of the signal (Pauli, 2001).
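Because signal area scales with the number of contributing nuclei, a concentration can be estimated relative to an internal standard of known concentration. The sketch below shows this proportionality; it assumes fully relaxed spectra and well-resolved signals, and the function and parameter names are hypothetical.

```python
def concentration_from_integral(area_analyte, n_h_analyte,
                                area_standard, n_h_standard,
                                conc_standard):
    """Analyte concentration from integrals normalised per contributing proton."""
    per_proton_analyte = area_analyte / n_h_analyte
    per_proton_standard = area_standard / n_h_standard
    return conc_standard * per_proton_analyte / per_proton_standard
```

For example, a one-proton analyte signal with area 2.0 next to a nine-proton standard signal with area 9.0 at 1.0 mM corresponds to an analyte concentration of 2.0 mM.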

Different metabolomics approaches can be applied when using NMR (Ratcliffe and Shachar-Hill, 2005). The first is related to the capacity for molecule identification. Because 1H is part of almost all bio-organic molecules and has a very high natural abundance (99.9816-99.9974% (de Laeter, 2003)) and good NMR properties, it is the most used nucleus for NMR measurements. In general, the compounds of interest are isolated from their tissues, often through laborious analytical procedures, and dissolved in a (when possible deuterated) solvent for the acquisition of 1H NMR and, when adequate, two-dimensional (2D) NMR spectra. For most bio-organic compounds a 1D 1H NMR spectrum is not sufficient for a full structural elucidation. Homonuclear 1H 2D spectra such as COSY (correlated spectroscopy), TOCSY (total correlation spectroscopy) and NOESY (nuclear Overhauser enhancement spectroscopy) are very informative about the connectivity and 3D position of the protons in a molecule. To capture connectivities between different nuclei, such as between 1H and 13C, heteronuclear 2D spectra can be acquired, detecting direct 1H-13C bonds by HMQC (heteronuclear multiple quantum coherence) or correlations over a longer range by HMBC (heteronuclear multiple bond correlation). There is a wide diversity of NMR measurements to choose from, according to the interest of the user in particular chemical features.

Another metabolomics application is in vivo NMR. Because NMR is a non-destructive technique, measurements can be performed without sample loss, which allows the monitoring of time series of changing biological materials and the analysis of samples without extraction, i.e. on solid tissues. In vivo NMR can be advantageous for measuring metabolites in specific cellular compartments, which after an extraction step could no longer be attributed to specific organelles or tissues (Aubert et al., 1999). Furthermore, HR-MAS (high resolution magic angle spinning) is of high importance in the medical field for the analysis of biopsies for clinical judgements (Sitter et al., 2006), in the food processing industry (Shintu et al., 2004) and in any other case where sampling proves to be difficult.

A fast-growing approach, in particular in the animal/human research area, is NMR fingerprinting. This approach involves the acquisition of NMR spectra of complex mixtures, such as biofluids or plant extracts, to pinpoint differences between samples with the intention of biomarker discovery (Ratcliffe and Shachar-Hill, 2005; Kochhar et al., 2006). This strategy pairs with MS fingerprinting for obtaining a global overview of the metabolome. Tomato fruits and Arabidopsis leaves have been profiled by NMR (Le Gall et al., 2003a; Ward et al., 2003). Most studies so far use 1H NMR, as it is the least selective for the type of molecules and provides the highest sensitivity. However, 13C NMR (Vlahov, 2006) and 2D measurements such as J-resolved (JRes) (Viant, 2003), COSY (Xi et al., 2006) and HMBC (Masoum et al., 2006) have also been used. In NMR profiling, the need for spectral comparisons demands extremely rigorous spectrum acquisition and control of conditions. Small changes in temperature or pH, the presence of impurities or degradation of the sample material can lead to the detection of false metabolic alterations and therefore to the indication of incorrect differential metabolites.

Nowadays, in a 14.1 Tesla (600 MHz for 1H NMR) instrument, the limit of detection is in the microgram (1H and 1H-13C NMR) or even sub-microgram region (1H NMR). The sensitivity of NMR has been improving over the years, increasing the suitability of this technique for analytical applications. The detection of less sensitive nuclei such as 13C or 15N through magnetization of 1H, using probe heads with pulsed gradients for the acquisition of 2D-NMR spectra, increased the spectral resolution and sensitivity compared to 1D NMR of 13C or 15N. The nuclear properties as well as the natural abundance of the nuclei chosen for NMR acquisition also condition the NMR signal: 1H is naturally far more abundant than 13C, so the amount of 13C available for detection in a sample is limited by its natural abundance. The resolution and the signal-to-noise of the measurement can be improved by using instruments with higher magnetic field strengths. The number of nuclei, or the number of moles of the analyte, in the detection volume used for the measurement also influences the sensitivity of the NMR measurement. Thus, for high MM compounds, larger amounts (in mass) are needed to achieve sufficient sensitivity (Fig. 1.6). The labelling of low abundant metabolites with stable isotopes can also be a strategy for performing 2D-NMR analysis on low amounts of material. In flux analysis, the labelling of compounds to follow the propagation of the isotope label in pathway analysis and kinetics measurements is a known application (Ratcliffe and Shachar-Hill, 2006). In NMR spectroscopy, the signal-to-noise ratio (Equation 1.1) is dependent on the T2* of the signals measured (Claridge, 1999). The T2* is inversely related to the line width of the signals obtained (πΔν½ = 1/T2*) and is influenced by magnetic field inhomogeneities. These inhomogeneities can be caused by magnetic susceptibility fluctuations in the sample (for instance due to large particles, paramagnetic ions or inferior NMR tubes) or by poor shimming. Automated shimming procedures available on the most recent NMR instruments largely alleviate the latter, leaving sample preparation as the major cause of inferior NMR spectra.

$$\frac{S}{N} \propto N\,A\,T^{-1}\,B_0^{3/2}\,\gamma_{exc}\,\gamma_{obs}^{3/2}\,T_2^{*}\,(NS)^{1/2} \qquad (1.1)$$

S/N = signal-to-noise ratio
N = number of molecules in the observed sample volume
A = abundance of the NMR active spins involved in the experiment
T = temperature
B0 = static magnetic field
γexc = magnetogyric ratio of the initially excited spins
γobs = magnetogyric ratio of the observed spins
T2* = effective transverse relaxation time
NS = total number of accumulated scans
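The proportionality of Equation 1.1, read as S/N ∝ N·A·T^-1·B0^(3/2)·γexc·γobs^(3/2)·T2*·NS^(1/2), lends itself to quick what-if comparisons. The sketch below evaluates the right-hand side in relative units, so only ratios between two settings are meaningful; the function name and the default values of 1 are assumptions made for the example.

```python
def relative_signal_to_noise(n=1.0, a=1.0, temperature=1.0, b0=1.0,
                             gamma_exc=1.0, gamma_obs=1.0, t2_star=1.0,
                             scans=1.0):
    """Right-hand side of Equation 1.1 in arbitrary units."""
    return (n * a / temperature * b0 ** 1.5 * gamma_exc
            * gamma_obs ** 1.5 * t2_star * scans ** 0.5)


# Gain expected from moving from a 400 MHz to a 600 MHz magnet,
# all other factors being equal: (600 / 400) ** 1.5, roughly 1.84.
field_gain = relative_signal_to_noise(b0=600 / 400)

# Quadrupling the number of scans only doubles the signal-to-noise.
scan_gain = relative_signal_to_noise(scans=4)
```

The square-root dependence on the number of scans explains why measurement time rises steeply for dilute samples, whereas gains in field strength, analyte amount or line width (through T2*) act more directly.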

The appearance of cryogenic probeheads brought important improvements in NMR sensitivity (Kovacs et al., 2005). By reducing thermal noise through the use of low temperature detection coils, a signal-to-noise up to 5-fold higher than with conventional probes can be obtained. In addition, the possibility of miniaturizing the active volume of the detection cell enabled the appearance of microprobes. Moreover, the signal-to-noise of the detection coil is inversely related to its diameter. These miniaturized NMR probes are available with active volumes as low as 1.5 µL, providing new possibilities for analysing molecules in smaller detection volumes, increasing the concentration of the analyte at no expense to the signal-to-noise. This low active volume is compatible with chromatographic elution volumes in capillary chromatography, making the use of capillary microcoil NMR (CapNMR) feasible (Schroeder and Gronquist, 2006).


LC-(SPE)-NMR

The coupling of LC with NMR is becoming increasingly useful as the NMR sensitivity improves, avoiding excessive analytical demands on obtaining enough material to perform NMR measurements. In practical terms, the hyphenation of LC with NMR is still not as clear-cut as LC-MS but it is establishing itself as a powerful system for identifying related metabolites from complex mixtures such as natural extracts from plants. There are different configurations that can be used when coupling LC to NMR (Exarchou et al., 2005). More recently the online coupling of LC to SPE and subsequent NMR became available and improved some of the existing analytical barriers of the previous modes. In this configuration, the chromatographic peaks are trapped in SPE cartridges and can be concentrated up to several times by multi-trapping into the same cartridge. The chromatography itself can be done with (less expensive) protonated solvents because the analytes within the cartridges are dried and then eluted with fully deuterated solvents. The separation of flavonoids and phenolic acids present in Greek oregano extract was accomplished by this method (Exarchou et al., 2003). This method is suitable for the analysis of less abundant compounds in complex mixtures, since it allows the separation, concentration and NMR acquisition of metabolites within a single system, avoiding the often tedious analytical preparations before NMR analysis.

Using NMR in the identification of metabolites

The magnetic resonance of nuclei present in a molecule is displayed as signals with a determined frequency, represented by chemical shift values, δ, in the NMR spectrum. The analysis of a NMR spectrum can be extremely puzzling due to overlapping signals and multiplicities within the signals. The NMR spectrum of a particular molecule is unique, and for this reason NMR is considered one of (and perhaps even) the most selective techniques for compound elucidation. For the analysis of NMR spectra, the number, position and area of the signals in the spectrum as well as the multiplicity of these are some of the aspects that are used in order to attempt the assignment of a molecule. An aspect that can be both highly informative and difficult to interpret is the multiplicity of signals. The signal splitting or multiplicity of the signals is caused by the spin-spin coupling between the proton and the nearby atoms. The coupling constants, J, transmit structural information, necessary for the elucidation of most molecules.
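Two elementary relations behind the multiplicity and coupling constants discussed above can be sketched as follows: in the first-order approximation, n equivalent spin-1/2 neighbours split a signal into n + 1 lines with binomial intensities, and a splitting measured in ppm is converted to a coupling constant in Hz by multiplying with the spectrometer frequency in MHz. The function names are hypothetical, and strongly coupled (second-order) systems are not covered.

```python
from math import comb


def first_order_multiplet(n_neighbours):
    """Relative line intensities for n equivalent spin-1/2 coupling partners."""
    return [comb(n_neighbours, k) for k in range(n_neighbours + 1)]


def ppm_to_hz(delta_ppm, spectrometer_mhz):
    """Convert a chemical shift difference in ppm to a frequency in Hz."""
    return delta_ppm * spectrometer_mhz
```

Three equivalent neighbours give a quartet with intensities 1:3:3:1, and a line spacing of 0.0125 ppm on a 600 MHz instrument corresponds to J = 7.5 Hz.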

The interpretation of NMR spectra can be quite demanding, especially for highly related structures or higher MM molecules. There are several software tools (ACD/Labs, ChemOffice, and PERCH Solutions) that can help in 1H NMR spectral analysis by providing NMR spectral predictions. The aim of these prediction tools is to help analysts assign spectral δ’s and J’s to the analysed molecule. Strictly theoretical calculations of NMR spectra from molecular properties are an option, yet experimental spectra often show unaccounted effects that are difficult to incorporate in the theoretical prediction routines. In particular, the prediction of 1H NMR spectra proves to be difficult to implement because of the effect of 3D conformational structures on the 1H NMR chemical shifts. The construction of prediction models based on experimental data can be a successful alternative for describing chemical phenomena at a detailed molecular level (Moco et al., 2006b).

LC-MS-NMR

The identification of metabolites can be aided by metabolite profiling methods such as MS or NMR, but often the full chemical description of a molecule is only achieved by integrating metabolite information taken from different sources. The combination of MS with NMR for unravelling the identity of a molecule is one of the most powerful strategies (Fig. 1.2). On the one hand, MS can indicate not only the MM of a compound and therefore the possible MF’s, but also the presence of certain functional groups or substitution patterns. However, ambiguity still remains in the absence of standard compounds from which mass values, fragment ions and fragmentation energies can be compared to the unknown molecule. On the other hand, NMR allows the structural elucidation of molecules up to the isomer level. The most efficient way to seize the advantages of both technologies is to use them in parallel or if possible online.

The coupling of LC with both MS and NMR has been described and it is an elegant and efficient way of obtaining useful data for the identification of compounds (Exarchou et al., 2003). The advantage of performing the same separation for both MS and NMR makes the correspondence of the chromatographic signals between these two instruments clear. However, due to the complex analytical setup, the analyses done by LC-MS and LC-(SPE)-NMR separately are still the most common.

Developments in chemometric methods can assist in the rapid identification of molecules present in complex mixtures. The method depends on data obtained from a large number of samples that are measured by both LC-MS and NMR. The different data matrices obtained from these fingerprints can be fused using concatenation or other data fusion methods. In theory, fluctuations in the LC-MS matrix should be reflected in similar changes in the NMR data matrix. When the sample preparation and analysis are done in a coherent manner, this method might enable high throughput identification of molecules. This approach has been tested for biofluid analysis, by coupling LC-MS and NMR data of urine samples (Crockford et al., 2006; Forshed et al., 2007), and can be a promising strategy in biomarker discovery.
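The idea that an LC-MS signal and an NMR signal of the same metabolite should co-vary across samples can be illustrated with a plain Pearson correlation. This is only a minimal sketch of the principle, not the statistical procedure of the cited studies; the data layout (one intensity per sample, NMR bins keyed by a label) and the function names are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_nmr_partner(ms_feature, nmr_bins):
    """Return (bin label, r) of the NMR bin that best follows an LC-MS feature."""
    scored = ((label, pearson(ms_feature, values))
              for label, values in nmr_bins.items())
    return max(scored, key=lambda item: item[1])
```

Given the intensities of one mass signal over four samples and a dictionary of NMR bin intensities over the same samples, `best_nmr_partner` points to the bin that most likely belongs to the same compound. With real data, many samples and a correction for multiple testing would be needed before drawing conclusions.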

DATA ANALYSES

The extraction of valuable conclusions from the analysis of metabolomics data is as important as performing the analytical measurements themselves. A variety of methods allow the transformation of raw data, taken directly from the instrument, through different treatments, ultimately leading to a list of metabolites.

Prior to any data analysis, it is important to be aware of the possible sources of variation in the samples that can influence the final conclusions if they are overlooked. Factors such as biological variation among individuals, sampling, sample preparation and the analytical measurement influence the reproducibility of the results and should be monitored as much as possible by the measurement of replicates, both analytical and biological. In principle, the biological variance should surpass all analytical variance.

Signal irreproducibility is an obstacle for reliable comparison of chromatograms and spectra. Retention time shifts in GC and more severely LC are common, as are occasional shifts in NMR spectra. In the latter, non-reproducibilities seem to be strictly related to sample preparation and hardly due to instrumental incoherence. Nevertheless, even in strictly controlled conditions signal shifts may persist. For this reason, the use of signal alignment software has become a routine procedure for comparison of chromatograms or spectra. MetAlign (De Vos et al., 2007), XCMS (Smith et al., 2006) and MZmine (Katajamaa et al., 2006) are some of the available alignment toolboxes for MS applications and HiRes for NMR applications (Zhao et al., 2006). These are relevant items for the reduction of raw data into a still informative but workable sized data set.
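The core step of any alignment, deciding which peaks in two runs represent the same signal, can be sketched as a tolerance-based match in m/z and retention time. This is a deliberately simplified illustration and not the algorithm used by MetAlign, XCMS or MZmine; the peak layout (m/z, retention time in minutes, intensity), the tolerances and the function name are assumptions.

```python
def match_peaks(reference, sample, mz_tol=0.01, rt_tol=0.2):
    """Pair peaks of two runs that agree within m/z and retention time windows."""
    pairs = []
    unused = list(sample)
    for ref in reference:
        candidates = [peak for peak in unused
                      if abs(peak[0] - ref[0]) <= mz_tol
                      and abs(peak[1] - ref[1]) <= rt_tol]
        if candidates:
            # Prefer the candidate closest in retention time.
            best = min(candidates, key=lambda peak: abs(peak[1] - ref[1]))
            unused.remove(best)  # each sample peak is matched at most once
            pairs.append((ref, best))
    return pairs
```

Peaks left unmatched are either absent from one of the runs or shifted beyond the chosen windows, which is where dedicated software applies local retention time correction before matching.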

For masking or emphasising variable and sample deviations, scaling and standardisation tools can be applied, as long as these do not lead to artificial distortions of the original data. As for all the “omics” technologies, multidimensionality is one of the characteristics of metabolomics data, which gives the data set an inherent complexity. Supervised and unsupervised methods such as principal component analysis (PCA), hierarchical cluster analysis (HCA), partial least squares (PLS) and discriminant analysis (DA), among others, are widely applied in metabolomics (Scholz et al., 2005; Masoum et al., 2006). These methods not only simplify the data by reduction of dimensionality but can also provide a visual representation of the data.
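One common standardisation applied before multivariate analysis is autoscaling, in which every variable is mean-centred and divided by its standard deviation so that abundant and minor signals carry equal weight. The sketch below assumes a samples-by-variables matrix stored as a list of rows; the function name is hypothetical, and other scalings may suit a given data set better.

```python
from statistics import mean, stdev


def autoscale(matrix):
    """Mean-centre each column (variable) and scale it to unit variance.

    Columns with zero variance would cause a division by zero and should
    be removed beforehand.
    """
    columns = list(zip(*matrix))
    centres = [mean(column) for column in columns]
    spreads = [stdev(column) for column in columns]
    return [[(value - centre) / spread
             for value, centre, spread in zip(row, centres, spreads)]
            for row in matrix]
```

After autoscaling, a signal ranging from 100 to 300 and one ranging from 1 to 3 both span the same scaled range. Note that this also inflates the weight of noisy low-intensity variables, which is one way in which scaling can distort the original data.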

More sophisticated methods of emphasising relationships between metabolites, such as correlation matrices and metabolic correlation networks, can help to establish relationships between different metabolites and even between metabolites and transcripts, genes or proteins. In this way, a systems level overview is envisioned (Joyce and Palsson, 2006). Different tools, either for visualisation purposes or databases, can be used to display the coupling of different “omics” data: KEGG (www.genome.jp/kegg), MetaCyc (http://metacyc.org), MAPMAN (gabi.rzpd.de/projects/MapMan) and KappaView (kpv.kazusa.or.jp/kappa-view), for example.

IDENTIFICATION TOOLS AND DATABASES

There are still only a few tools that can automatically produce a list of possible metabolites from the mass signals at a particular retention time (MS) or from δ’s (NMR). The analysis of spectrometric or spectroscopic data implies an intensive manual effort, hindering the throughput of the analysis setup.

In fact, the bridge between experimental data (MS and NMR spectra, retention time, fragmentation pattern, chemical shift, coupling constant) and the available chemical databases (Table 1.2) is still weak, let alone automatic. Some identification tools, such as elemental composition or MM calculators, exist in the different instrument software packages, but these seldom offer a spectral matching tool linked to a public database, as is common in proteomics applications. One of the few examples is AMDIS (Automated mass spectral deconvolution and identification system) (www.amdis.net), which can be used mostly for the identification of GC-MS signals. Advanced Chemistry Development Labs (ACDLabs) also commercially provides spectral matching with databases for MS and NMR, as well as predictor tools (Advanced Chemistry Development, Inc.). Nevertheless, many plant metabolites, such as secondary metabolites, are not present in these databases.
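The simplest form of such a bridge, a lookup of a measured m/z value against a compound list, can be sketched as follows. The three-compound library, the restriction to C, H and O, the protonated and deprotonated ion types and the function names are all assumptions made for illustration; a real tool would query a public database and consider further adducts such as Na or K.

```python
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.9949146}  # monoisotopic, Da
PROTON = 1.00727647  # mass of a proton, Da

# Hypothetical in-memory library: name -> (C, H, O) atom counts.
LIBRARY = {
    "naringenin": (15, 12, 5),
    "chlorogenic acid": (16, 18, 9),
    "rutin": (27, 30, 16),
}


def monoisotopic_mass(c, h, o):
    """Neutral monoisotopic mass of a CHO compound."""
    return c * MASS["C"] + h * MASS["H"] + o * MASS["O"]


def lookup(mz, mode="negative", ppm=5.0):
    """Return (name, error in ppm) for library entries matching [M-H]- or [M+H]+."""
    shift = -PROTON if mode == "negative" else PROTON
    hits = []
    for name, counts in LIBRARY.items():
        expected = monoisotopic_mass(*counts) + shift
        error_ppm = 1e6 * (mz - expected) / expected
        if abs(error_ppm) <= ppm:
            hits.append((name, round(error_ppm, 2)))
    return hits
```

A mass match of this kind only yields candidates: isomers share the same elemental composition, so retention time, fragmentation and, where possible, NMR data remain necessary to confirm an assignment.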


Table 1.2. Number of metabolite records present in MS and NMR, pathway and chemical databases.

| DB | Source | No. Records (ca.) |
|---|---|---|
| MS-based DBs | | |
| NIST/EPA/NIH Mass Spectral Library (NIST 05) | National Institute of Standards and Technology (NIST) | 163,000 |
| SpecInfo | Daresbury Laboratory | 139,000 |
| Spectral Database for Organic Compounds, SDBS | National Institute of Advanced Industrial Science and Technology (AIST) | 23,500 |
| KNApSAcK (Comprehensive Species-Metabolite Relationship Database) | Nara Institute of Science and Technology (NAIST) | 15,500 |
| Metlin | The Scripps Research Institute | 15,000 |
| Human Metabolome Database (HMDB) | Genome Alberta and Genome Canada | 2,300 |
| Golm Metabolome Database ([email protected]) | Max Planck Institute of Molecular Plant Physiology | |
| Metabolome of Tomato Database (MoTo DB) | Plant Research International | 100 |
| NMR-based DBs | | |
| Flavonoid Database | Wageningen University | 250 (13C and 1H) |
| Human Metabolome Database (HMDB) | Genome Alberta and Genome Canada | 400 (13C); 350 (1H) |
| ACD Databases | Advanced Chemistry Development, Inc. | 15,000 (13C and 1H); 8,800 (15N); 26,100 (31P) |
| Spectral Database for Organic Compounds, SDBS | National Institute of Advanced Industrial Science and Technology (AIST) | 12,500 (13C); 14,300 (1H) |
| SpecInfo | Daresbury Laboratory | 102,000 (13C); 117,000 (1H); 1,000 (15N); 1,000 (17O); 17,000 (31P); 25,000 (19F) |
| Standard Compounds on Biological Magnetic Resonance Bank (BMRB) | University of Wisconsin | 275 (13C and 1H) |
| NMRShiftDB | University of Koeln | 19,500 (13C); 3,000 (1H) |
| Pathways DBs | | |
| Kyoto Encyclopedia of Genes and Genomes (KEGG) | Kyoto University / Tokyo University | 14,000 |
| Chemical DBs | | |
| SciFinder | Chemical Abstracts Service (CAS) | 30,500,000 |
| PubChem | National Institutes of Health (NIH) | 10,100,000 |
| Beilstein Database | MDL | 9,400,000 |
| eMolecules | eMolecules | > 5,600,000 |
| Available Chemicals Directory | Elsevier MDL | >> 200,000 |
| Combined Chemical Dictionary (CCD): | Chapman & Hall/CRC Press | |
| - Dictionary of Organic Compounds | | 265,000 |
| - Dictionary of Natural Products | | 170,000 |
| - Dictionary of Inorganic and Organometallic Compounds | | 103,000 |
| - Dictionary of Drugs | | 44,000 |
| - Dictionary of Analytical Reagents | | 14,000 |
| ChemIDplus | National Institutes of Health (NIH) | 380,000 |
| Substance Registry System (SRS) | Environmental Protection Agency (EPA) | 98,000 |
| ChemFinder | CambridgeSoft Corporation | 72,000 |
| Merck Index | John Wiley & Sons, Inc. | 10,200 |
| Chemical Entities of Biological Interest (ChEBI) | European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI) | 10,000 |

Public metabolite databases are starting to be built up by laboratories within the community (Table 1.2). One of the largest initiatives for the identification of metabolites is the Human Metabolome Project, in which MS and NMR data are combined with molecule information (Wishart et al., 2007). A detailed description of the methods of sample preparation and analysis, the conditions of the analytical experiment, chemical information about the metabolites (name, IUPAC name, chemical descriptors such as CAS registry numbers and InChI and/or structural information, links to chemical databases), experimental spectra and the biological source are some of the features to include in metabolite databases. A troublesome issue resides already in the nomenclature of molecules: the list of common names for a given molecule can be quite extensive, and the same common name can be attributed to distinct molecules. This is a real impediment to a reliable and unambiguous classification and creates false interpretations, in particular in the organisation of databases and searching tools. Only with an accurate description of the experimental conditions and the chemical identity of the metabolites is the comparison and exchange of data relevant. Perhaps at this stage priority will have to be given to a rigorous identification of metabolites, as dealing with unknowns and not fully identified metabolites creates many incongruent hits in the databases. Ideally, the separate metabolite databases will be accessible through a common search engine as an open source web service, as in BioMOBY (Wilkinson et al., 2005).

CONCLUSIONS

The description of the metabolome can be achieved by different methods, either in parallel or in combination. MS and NMR profiling techniques in particular are powerful methods to detect the metabolome as a whole. Comparison of metabolic profiles can elucidate differences between organisms and pinpoint the responsible metabolites. However, if we can identify differences but not chemically describe them, very little can be said about the underlying nature of the metabolic phenomena. There is still a long way to go to completely describe the metabolome of an organism, which points to the elucidation of unknowns as a priority. As yet, no single analytical method can capture the whole metabolome, and the analytical method chosen limits the number of metabolites that can be identified. Currently, the integration of high resolution LC-MS and NMR approaches provides the necessary information for the elucidation of compounds. The development of bioinformatic tools will facilitate the management of large amounts of data and help in the integration of different data sets by sieving the metabolite information from the instrumental chromatograms and spectra. The expansion of our view of the metabolome of organisms will improve the description of metabolic networks and cellular phenomena in general.


Chapter 2

Untargeted Large Scale Plant Metabolomics using Liquid Chromatography coupled to Mass Spectrometry

Ric C.H. De Vos*, Sofia Moco*, Arjen Lommen*, Joost J.B. Keurentjes, Raoul J. Bino and Robert D. Hall

Nature Protocols 2: 778-791 (2007)

*equally contributing authors

Untargeted metabolomics aims to gather information on as many metabolites as possible in biological systems by taking into account all information present in the data sets. Here we describe a detailed protocol for large scale untargeted metabolomics of plant tissues, based on reversed phase liquid chromatography coupled to high resolution mass spectrometry (LC-QTOF-MS) of aqueous-methanol extracts. Dedicated software, metAlign, is used for automated baseline correction and alignment of all extracted mass peaks across all samples, producing detailed information on the relative abundance of thousands of mass signals representing hundreds of metabolites. Subsequent statistics and bioinformatics tools can be used to provide a detailed view on the differences and similarities between (groups of) samples or to link metabolomics data to other systems biology information, genetic markers and/or specific quality parameters. The complete procedure from metabolite extraction towards a data matrix with aligned mass signal intensities takes about 6 days for 50 samples.


INTRODUCTION

Metabolomics has emerged as a valuable technology for the comprehensive profiling and comparison of metabolites in biological systems and a multitude of applications for human, microbial and plant systems have already been reported or predicted (Sumner et al., 2003; Bino et al., 2004; Jenkins et al., 2004; Trethewey, 2004; van der Greef et al., 2004; Vaidyanathan et al., 2005; Dixon et al., 2006; Hall, 2006; Saito et al., 2006). Plants are especially rich in chemically diverse metabolites, which are usually present in a large range of concentrations, and no single analytical method is currently capable of extracting and detecting all metabolites. Over the past decade, several methods suitable for large scale analysis and comparison of metabolites in plant extracts have been established (Dixon et al., 2006; Hall, 2006), including gas chromatography coupled to mass spectrometry (GC-MS) (Fiehn et al., 2000; Roessner et al., 2001; Roessner et al., 2002; Fernie, 2003; Schauer et al., 2005a; Lisec et al., 2006), direct flow injection-mass spectrometry (DFI-MS) (Aharoni et al., 2002; Goodacre et al., 2003; Hirai et al., 2005; Overy et al., 2005), liquid chromatography-mass spectrometry (LC-MS) (Tolstikov et al., 2003; Jander et al., 2004; von Roepenack-Lahaye et al., 2004; Vorst et al., 2005; Moco et al., 2006a; Rischer et al., 2006), capillary electrophoresis-mass spectrometry (CE-MS) (Sato et al., 2004), and nuclear magnetic resonance (NMR) technologies (Le Gall et al., 2003a; Ward et al., 2003). LC-MS based approaches are expected to be of particular importance in plants, due to the highly rich biochemistry of plants which covers many semi-polar compounds, including key secondary metabolite groups, which can best be separated and detected by LC-MS approaches (Huhman and Sumner, 2002; Tolstikov et al., 2003; von Roepenack-Lahaye et al., 2004; Breitling et al., 2006; Dixon et al., 2006; Hall, 2006; Moco et al., 2006a; Saito et al., 2006). 
Of the many semi-polar compounds not involved in primary metabolism, quite a number have already been shown to have phenotypic / physiological importance. It is also mainly secondary metabolites that are attracting much attention from health, food and nutrition groups (Beekwilder et al., 2005; Vaidyanathan et al., 2005; Dixon et al., 2006; Rischer et al., 2006) owing to, for example, their resistance effects, antioxidant properties, and colour and flavour characteristics. These and other so-called quality aspects of plant materials are generally not centred on individual metabolites but rather are related to a particular mixture of compounds from diverse, biochemically-related and unrelated groups. As such, a metabolomics approach to help understand better how complex these mixtures are, which components play the most important role, and how their biosynthesis is controlled, is likely to be of great future value and importance.


Large Scale Metabolomics using LC-MS

Commonly used plant metabolomics approaches and their advantages and limitations

Although NMR is in principle the most uniform detection technique and is essential for the unequivocal identification of unknown compounds, NMR-based metabolomics approaches still suffer from a relatively low sensitivity compared to MS. As yet, MS-based platforms are most widely used in plant metabolomics (Hall, 2006). GC coupled to electron-impact time-of-flight (TOF)-MS was the first approach used in large scale plant metabolomics (Fiehn et al., 2000), and a detailed protocol for sample extraction, derivatization and subsequent data analyses has recently been described (Lisec et al., 2006). This approach covers a large variety of nonvolatile metabolites, mainly those involved in primary metabolism, including organic and amino acids, sugars, sugar alcohols, phosphorylated intermediates (in the polar fraction of extracts), as well as lipophilic compounds such as fatty acids and sterols (in the apolar fraction). GC-(TOF)MS produces highly reproducible separation and fragmentation patterns of metabolites, which enables the development of common GC-TOF-MS based metabolite libraries (Kopka et al., 2005; Schauer et al., 2005a). Although CE-MS also enables good separation and detection of many polar primary metabolites (Sato et al., 2004), it is seldom used compared to GC-TOF-MS. As most primary metabolites have commercially available standard compounds, both GC-TOF MS and CE-MS can produce quantitative data for hundreds of compounds involved in central metabolism.

The preferred method for analysing semi-polar metabolites is LC-MS with a soft-ionisation technique, such as electrospray ionisation (ESI) or atmospheric pressure chemical ionisation (APCI), resulting in protonated (in positive mode) or deprotonated (in negative mode) molecular masses. Compounds detectable by LC-MS include the large and often economically important group of plant secondary metabolites such as alkaloids, saponins, phenolic acids, phenylpropanoids, flavonoids, glucosinolates, polyamines, and all kinds of derivatives thereof (Huhman and Sumner, 2002; Tolstikov et al., 2003; Moco et al., 2006a; Rischer et al., 2006). These compounds can be effectively extracted with aqueous alcohol solutions and directly analysed without derivatization. Depending on the type of column used, various primary metabolites including several polar organic acids and amino acids can be reliably analysed using LC-MS (Tolstikov and Fiehn, 2002). Based on the high mass resolution of time-of-flight (TOF)-MS and Fourier Transform-Ion Cyclotron Resonance-MS (FT-ICR-MS) instruments, enabling elemental formulae calculations of detected ions, rapid DFI-MS approaches without any prior compound separation have been developed to compare metabolite fingerprints of crude plant extracts (Aharoni et al., 2002; Goodacre et al., 2003; Hirai et al., 2005; Overy et al., 2005). However, such direct injection approaches, irrespective of the resolution and accuracy of the mass spectrometer, may suffer from significant adduct formation and ion suppression phenomena upon ionisation of complete crude extracts. Moreover, by definition, direct injection methods cannot discriminate between the many molecular isomers. Therefore, most MS-based platforms in plant metabolomics perform at least some separation. LC preceding MS not only results in the detection of isomeric compounds, which are often abundantly present in plants, but also enables valuable structural information to be collected online, for example, MS/MS fragmentation patterns and UV/Vis absorbance spectra using photodiode array (PDA) detection (Huhman and Sumner, 2002; Tolstikov and Fiehn, 2002; Tolstikov et al., 2003; von Roepenack-Lahaye et al., 2004; Moco et al., 2006a; Rischer et al., 2006; Saito et al., 2006). It has been estimated that extensive LC in combination with high resolution MS (e.g. TOF-MS) enables the detection of several hundreds of compounds in a single crude plant extract (von Roepenack-Lahaye et al., 2004; Vorst et al., 2005; Moco et al., 2006a). With continually improving tools for data acquisition, processing and mining, LC-MS will certainly grow in value for biochemical profiling and metabolite identification. Combining LC with ultra-high resolution mass spectrometry such as FTMS (Breitling et al., 2006; Peterman et al., 2006) and other identification tools like LC-NMR-MS (Exarchou et al., 2003; Wilson and Brinkman, 2003; Wolfender et al., 2003), as well as making use of improved separation technologies such as ultra-performance LC (UPLC) coupled to MS (Laaksonen et al., 2006; Nordström et al., 2006), will further improve our potential to identify metabolites and to provide an even more detailed metabolite profile of plant extracts.

Untargeted LC-MS for plant metabolomics

Compared with primary metabolites, the number of commercially available standards for secondary metabolites per plant species or tissue is still very limited. Consequently, metabolomics approaches based on analyses of compounds for which standards are available, which is common practice in GC-(TOF)MS based metabolomics studying primary metabolism, would very much limit the great potential of LC-MS in plant research. Recent developments in processing software for unbiased mass peak extraction and alignment of LC-MS data, such as metAlign (Bino et al., 2005; Vorst et al., 2005; Keurentjes et al., 2006; Moco et al., 2006a), XCMS (Nordström et al., 2006; Smith et al., 2006), MZmine (Katajamaa et al., 2006) and Markerlynx (Idborg et al., 2005) now offer possibilities for more holistic untargeted metabolomics approaches that aim to gather information on as many metabolites as possible present in the extracts analysed. In such untargeted approaches, mass peak identification using standards is not the primary step in data processing. Instead, all analytical information present in the profiles is first transformed into coordinates on the basis of mass, retention time and signal amplitude. These coordinates are then aligned across all samples. By applying appropriate statistical and multivariate analysis tools, differential mass peaks or mass peaks correlating with a specific trait can be filtered out, identified to some degree using accurate mass and MS/MS fragmentation, and then confirmed with standards when available. Examples of such untargeted approaches in plant research are the comparison of secondary metabolites in roots and leaves of wild-type and mutant Arabidopsis (Arabidopsis thaliana) plants (von Roepenack-Lahaye et al., 2004), studying metabolic alterations in fruits of a light-hypersensitive mutant of tomato (Solanum lycopersicum) (Bino et al., 2005), comparing tubers of potato (Solanum tuberosum) of different genetic origin and developmental stages (Vorst et al., 2005), determining tissue-specificity of metabolic pathways in tomato fruit (Moco et al., 2006a), establishing gene-to-metabolite networks in Catharanthus roseus (Rischer et al., 2006), and identifying quantitative trait loci (QTLs) controlling metabolite composition in Arabidopsis (Keurentjes et al., 2006; Fu et al., 2007).
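The idea of turning each profile into (mass, retention time, amplitude) coordinates and aligning them across samples can be illustrated with a minimal sketch. This is not the metAlign algorithm: the grouping rule (same nominal mass, retention time within a fixed tolerance of the running group mean), the function name `align_peaks`, the tolerance and the example peak lists are all assumptions made for illustration.

```python
def align_peaks(samples, rt_tolerance=0.2):
    """Group peaks from several samples into rows of aligned signals.

    samples: dict mapping sample name -> list of (nominal_mass, rt_min, amplitude).
    Peaks are grouped when they share a nominal mass and their retention
    time lies within rt_tolerance (min) of the running mean of the group.
    Returns rows of (nominal_mass, mean_rt, [amplitude per sample]), with
    samples in alphabetical order and 0 where no peak was found.
    """
    names = sorted(samples)
    groups = []  # each group: {"mass": int, "rts": [float], "amps": {name: amp}}
    for name in names:
        for mass, rt, amp in samples[name]:
            for group in groups:
                mean_rt = sum(group["rts"]) / len(group["rts"])
                if (group["mass"] == mass
                        and abs(mean_rt - rt) <= rt_tolerance
                        and name not in group["amps"]):
                    group["rts"].append(rt)
                    group["amps"][name] = amp
                    break
            else:
                # No matching group: this peak starts a new aligned signal
                groups.append({"mass": mass, "rts": [rt], "amps": {name: amp}})
    rows = []
    for group in sorted(groups, key=lambda g: (g["mass"], g["rts"][0])):
        mean_rt = round(sum(group["rts"]) / len(group["rts"]), 2)
        rows.append((group["mass"], mean_rt,
                     [group["amps"].get(name, 0) for name in names]))
    return rows


# Hypothetical peak lists for two extracts
example = {
    "mutant": [(609, 24.50, 300), (422, 9.10, 700)],
    "wild_type": [(609, 24.40, 1500), (1078, 33.19, 900)],
}
for row in align_peaks(example):
    print(row)
```

The resulting rows form the kind of data matrix that subsequent statistical filtering operates on; a real implementation would also have to handle retention time drift between runs, which a fixed tolerance does not.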

For our metabolomics approaches we prefer to use the freeware metAlign (www.metalign.nl and www.rikilt.wur.nl/UK/services/MetAlign+download) to process large LC-MS (Bino et al., 2005; Vorst et al., 2005; Keurentjes et al., 2006; Moco et al., 2006a) as well as GC-MS (Tikunov et al., 2005) data sets, based on a number of features:

• compatibility with most mass spectrometry software such as MassLynx, Xcalibur, ChemStation, Agilent, Bruker and ANDI/netCDF formats and output in any of these formats as well as in Excel;

• compatibility with both LC and GC, and independent of mass spectrometer type (e.g. quadrupole-MS, TOF-MS, FTMS) or instrument maker;

• an easy interface for user-defined parameter settings;

• automated local noise calculation and mass-specific baseline corrections;

• capability to align up to hundreds of data sets.

Examples of using metAlign for the comparison of ten to hundreds of LC-MS data files are available (Bino et al., 2005; Vorst et al., 2005; Keurentjes et al., 2006; Moco et al., 2006a). Though metAlign converts accurate mass data into nominal masses, mainly for reasons of faster data processing, the masses of aligned signals can automatically be recovered using a script called MetAccure (Vorst et al., 2005; Moco et al., 2006a).

Considerations for tissue sampling and handling

Although no limitations regarding sample type are foreseen, except from a technical point of view, care must be taken in acquiring reproducible data. Sources of variation contributing to the total ‘noise’ in subsequent statistical analyses are biological variation (e.g. variation in plant growth conditions, development, etc.), perturbations during and after tissue collection, and variation in tissue sampling for metabolite extraction including weighing errors. Metabolic conversions in tissues can be abolished by flash-freezing samples in liquid nitrogen immediately after harvest. Frozen samples should be fully homogenized into a fine powder in order to facilitate and standardise metabolite extraction. Nevertheless, each analysis provides only a single snapshot of the metabolic state of that sample without further information on biological variation or measurement errors. To estimate these variations, sufficient biological replicates and sufficient technical replicates from the same batch of tissue powder, respectively, need to be prepared and analysed.
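As a rough illustration of how replicate analyses separate technical from biological variation, the sketch below computes the coefficient of variation (relative standard deviation) of one mass signal across replicates. The intensity values and variable names are hypothetical and only serve to show the calculation.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the relative standard deviation (%) of replicate intensities."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical intensities of one mass signal
technical = [1000, 1040, 960, 1010]   # same tissue powder, extracted 4 times
biological = [1000, 1300, 750, 1150]  # 4 individual plants

print(f"technical CV:  {coefficient_of_variation(technical):.1f}%")
print(f"biological CV: {coefficient_of_variation(biological):.1f}%")
```

A signal whose technical variation approaches its biological variation is unlikely to yield reliable differences between samples, which is one reason to analyse both types of replicate.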

Considerations for metabolite extraction and LC-PDA-MS analyses

The extraction procedure is crucial for the detection of metabolites naturally occurring in the extracted tissues. Therefore, the extraction protocol should be reproducible and with high recovery and stability of most compounds, at least those of prime interest. We have tested a number of different solvents, such as methanol, ethanol and acetone, at different ratios of water versus organic solvent, for extraction efficiency, chromatographic behaviour and extract stability. Acidified aqueous-methanol at a final concentration of 75% methanol (v/v) and 0.1% formic acid (v/v) was the most suitable solvent for efficient extraction of a wide range of compounds of our prime interest, mostly secondary metabolites, from different plant species and tissues (Bino et al., 2005; Vorst et al., 2005; Keurentjes et al., 2006; Moco et al., 2006a). Enzymes present in the sample should be inactivated by directly adding the solvent to frozen plant powder and mixing immediately. Extraction efficiency was tested using several (poly)phenolic compounds added to the frozen powder before extraction. At a solvent/sample ratio of 3 and a sonication time of 15 min, the recovery of all standards tested was higher than 90%. Sonication for up to 2 h did not significantly change the metabolite profile as compared to 15-min sonication. However, it is advised to check the extraction efficiency upon analysis of a completely different plant matrix or in case of main interest in specific key compounds.

[Figure 2.1 image not reproduced. Panels A-C: base peak intensity (BPI) chromatograms (TOF MS ES-), relative intensity (%) versus retention time (5-50 min), with peaks labelled by retention time and m/z. Panels D-F: accurate-mass and MS/MS spectra, relative intensity (%) versus m/z, annotated as follows: (D) m/z 1078.5415, C51H84NO23, -1.8 ppm; (E) m/z 422.0250, C11H20NO10S3, +0.2 ppm; (F) m/z 609.1445, C27H29O16, -1.7 ppm.]

Figure 2.1. LC-QTOF MS profiling of crude extracts from three different plant species. Upper panel shows typical ion chromatograms, obtained in ESI negative mode, of (A) tomato fruit, (B) Arabidopsis leaf, and (C) strawberry fruit. Lower panels show detected accurate masses of [M-H]- ions and LC-MS/MS spectra of three compounds from different classes of secondary metabolites: (D) α-tomatine, an alkaloid, detected as formic acid adduct; (E) glucoiberin, a glucosinolate; (F) rutin, a flavonoid.

The chromatographic conditions applied are always a compromise between metabolite resolution, retention time stability and sample throughput. In the standard protocol we use a C18-reversed phase microbore column with a relatively small particle size. This column was selected after testing different types of columns for their ability to retain and separate semi-polar compounds of our prime interest, including flavonoids and phenolic acids (Bino et al., 2005; Keurentjes et al., 2006; Moco et al., 2006a), alkaloids (Bino et al., 2005; Vorst et al., 2005; Moco et al., 2006a) and glucosinolates (Keurentjes et al., 2006). A gentle and continuous acetonitrile gradient of 45 min, followed by 15 min column washing and stabilization, resulted in adequate separation of many semi-polar compounds including isomeric forms (Fig. 2.1). We tend to use the same chromatographic conditions in our untargeted metabolomics work, in order to compare mass signals from different samples and to enable compound identification using LC-MS databases (Moco et al., 2006a). In most of our experiments, the LC-MS run itself is not the limiting factor in sample throughput. Instead, sample harvest, grinding, weighing and extraction, and finally data analyses usually take much more time. For large series of samples, e.g. more than 300 extracts, steeper gradients with shorter run times may be useful in order to decrease the total run time and therefore the chance of perturbations as the analysis time increases. Such perturbations might occur due to (pre-)column deterioration or disturbances in the MS-electronics or LC pump, thus introducing extra variation in the final data set. Thus, during analyses of an Arabidopsis recombinant inbred line (RIL) population consisting of 409 extracts including controls, we doubled the sample throughput by using a total run time of 30 min per extract (Keurentjes et al., 2006). However, speeding up the LC-run time, with the same type of column, unavoidably results in an increased number of co-eluting compounds and thus may lead to a loss of resolution of isomers and to increased ion suppression and adduct formation at the ionisation source.
We advise starting with the standard 60 min protocol as outlined below and, if needed, modifying the chromatographic conditions (gradient, column type, etc.) in such a way that at least the compounds of key interest are adequately separated and detected.
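The effect of run time on the total length of a series is simple arithmetic, sketched here for the 409-extract RIL series mentioned above. The function name and the optional `conditioning_runs` parameter are illustrative assumptions, not part of the protocol.

```python
def series_duration_hours(n_extracts, run_min, conditioning_runs=0):
    """Total instrument time (h) for a series of consecutive LC-MS runs."""
    return (n_extracts + conditioning_runs) * run_min / 60.0


for run_min in (60, 30):
    hours = series_duration_hours(409, run_min)
    print(f"{run_min}-min runs: {hours:.1f} h ({hours / 24:.1f} days)")
```

Halving the run time halves the period over which column deterioration or instrument drift can accumulate, at the cost of chromatographic resolution.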

Upon starting up a new series of analyses, the chromatography is relatively unstable due to (pre-)column conditioning by the crude extracts themselves. To avoid suboptimal alignment resulting from this early-stage system instability, several “dummy” runs of extracts should be performed before collecting the actual data. We routinely program the LC-MS software to inject and analyse the first sample extract repeatedly, at least 4 times. Standard solutions should not be injected between crude extracts, as during analysis of these relatively clean samples the column can partly be re-conditioned, resulting in small retention shifts. To ensure constant and reproducible ionisation, regularly check the actual pressure and supply of the nitrogen and argon gases. In our system we can check this pressure by comparing the intensity of the reference mass (lock mass; see below) over the samples. If the intensity of this mass signal is markedly changed in one or more samples, these samples should be reanalysed within the same series.

Analyse extracts in a randomized order to avoid possible variation from time-dependent changes, e.g. due to slow deterioration of (pre-) column or ionisation source. Due to the high variability of metabolites present in crude extracts with respect to their chemical characteristics and intrinsic behaviours upon sample preparation, the use of a single internal standard to correct for variation in extraction and detection of all mass signals over the samples is of dubious value. Adding a series of internal standards, e.g. each representing a different class of plant metabolites, may be a better option but may introduce ion suppression effects in the case of co-eluting compounds. Consequently, we recommend preparing a statistically relevant number of replicates from a homogenous (pooled) batch of material and analysing these throughout the entire sample series, in order to estimate technical reproducibility and, if needed, to correct for this type of variation.
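A randomized injection list with conditioning runs and regularly spaced replicates of a pooled extract could be generated along the following lines. This is only a sketch: the function name, the label `POOL`, the spacing of one pooled injection per ten samples and the fixed random seed are assumptions chosen for illustration, not values prescribed by the protocol.

```python
import random


def build_injection_sequence(sample_ids, pooled_id="POOL",
                             conditioning_runs=4, pool_every=10, seed=0):
    """Return an injection order for one analysis series.

    The first randomized extract is injected conditioning_runs times as
    dummy runs, then all extracts follow in random order with a pooled
    replicate before every block of pool_every samples and one at the end.
    """
    rng = random.Random(seed)  # fixed seed so the list can be regenerated
    order = list(sample_ids)
    rng.shuffle(order)
    sequence = [order[0]] * conditioning_runs  # data from these runs are discarded
    for index, sample in enumerate(order):
        if index % pool_every == 0:
            sequence.append(pooled_id)
        sequence.append(sample)
    sequence.append(pooled_id)
    return sequence


# Hypothetical series of 25 extracts
for position, label in enumerate(build_injection_sequence(
        [f"S{i:02d}" for i in range(25)]), start=1):
    print(position, label)
```

Because the pooled replicates are crude extracts themselves, they do not re-condition the column the way clean standard solutions would.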

Figure 2.2. Schematic overview of experimental set-up and data flow for untargeted LC-QTOF MS based metabolomics of plant materials. A detailed description of each step is given in the PROCEDURE.

[Figure 2.2 image not reproduced. Boxes in the flow chart: growth and harvest of plant material; freezing and grinding; extraction, centrifugation and filtration; transfer of vials to autosampler; LC-PDA-QTOF MS (LC-MS profiles); data processing with metAlign for mass peak extraction and alignment over samples (output: CSV file); data analyses (t-tests, multivariate analysis tools, correlation analyses, etc.); identification of relevant mass peaks. Stage labels: Steps 1-2, 3-7, 8-9, 10-14, 15 and 16-19.]

With our LC-QTOF MS system we normally acquire data in centroid mode. In contrast to the continuum mode, in which the mass signal is represented by a Gaussian curve, the centroid mode projects each mass signal as an accurate m/z value by on-the-fly mathematical transformation. Although relevant information on mass peak shape and purity may be lost upon centroiding, the raw data files are markedly reduced from about 500 Mb to a more useful size of about 10 Mb per sample (at a run time of 1 h and sampling rate of 1 scan per second). Especially upon analysing and processing large series of extracts, and for storing and databasing thousands of raw data files gathered over years of analyses, acquiring data in centroid mode is the most practical option. In addition, by using a separate lock mass spray as reference and by continuously switching between sample and reference, the MassLynx software can automatically correct the centroid mass values in the sample for small deviations from the exact mass measurement (Wolff et al., 2001), resulting in a mass accuracy that is generally better than 5 ppm.
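Mass accuracy in ppm is the relative deviation of the observed m/z from the calculated one. The sketch below works this out for the rutin ion annotated in Fig. 2.1 (observed m/z 609.1445, elemental composition C27H29O16). The rounded monoisotopic masses and the helper names are assumptions for illustration, and the electron mass is neglected for simplicity, so the result should be read as approximate.

```python
# Monoisotopic masses (Da), rounded; the electron mass is neglected here
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


def formula_mass(composition):
    """Sum the monoisotopic masses for a dict such as {"C": 27, "H": 29}."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


def ppm_error(observed, theoretical):
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


rutin_ion = {"C": 27, "H": 29, "O": 16}  # composition annotated in Fig. 2.1
calculated = formula_mass(rutin_ion)
print(f"calculated m/z {calculated:.4f}, "
      f"error {ppm_error(609.1445, calculated):.1f} ppm")
```

Under these assumptions the deviation comes out close to the -1.7 ppm printed in the figure. At m/z 609, a 5 ppm window corresponds to roughly 0.003 Da.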

This paper describes a detailed protocol for untargeted LC-MS based metabolomics of large numbers of extracts. The standard procedure is schematized in Fig. 2.2 and consists of: tissue sampling and extract preparation; LC-QTOF MS analysis using an ESI source; metAlign-assisted mass peak extraction and alignment across samples; and the identification of mass peaks selected by means of appropriate statistical filtering. In principle, the methodology described below is applicable to a wide range of plant species, tissues or products derived from them.

MATERIALS

Reagents

• Acetonitrile, HPLC supra-gradient grade (Biosolve, cat. no. 01203502, CAS [75-05-8]). CAUTION: Acetonitrile is harmful and highly flammable and should be handled in a fume hood.

• Methanol absolute, HPLC supra-gradient grade (Biosolve, cat. no. 13683502, CAS [67-56-1]). CAUTION: Methanol is toxic and highly flammable and should be handled in a fume hood.

• Formic acid for analysis 98-100% (Merck-KGaA, cat. no 1.00264.1000, CAS [64-18-6]). CAUTION: Formic acid is corrosive and volatile, and should be handled in a fume hood.

• Leucine enkephalin, ≥ 95% pure, isolated by HPLC (Sigma, cat. no. L9133, CAS [81678-16-2]).

• Phosphoric acid p.a. 85% in water solution (m/v) (Acros, cat. no. 20114-0010, CAS [7664-38-2]). CAUTION: Phosphoric acid is corrosive and should be handled in a fume hood.

• Ultrapure water (Elga Maxima, Bucks)


• Liquid nitrogen for freezing samples. CAUTION: Liquid nitrogen is a low temperature refrigerant and should be handled with protective glasses and protective gloves.

• Liquid nitrogen for applying gas to mass spectrometer ionisation source.

• Argon 5.0, at least 99.999% pure, for applying gas to mass spectrometer collision cell.

• Sample extraction solution (see REAGENT SETUP)

• HPLC mobile phase (see REAGENT SETUP)

• MS calibration solution (see REAGENT SETUP)

• Lock mass solution (see REAGENT SETUP)

Equipment

• Storage tubes or plastic bags resistant to liquid nitrogen, e.g. polypropylene 50-mL tubes with screw cap (Greiner, cat. no. 210261), Eppendorf micro-test tubes, 12 mL glass tubes with screw caps (Omnilabo)

• IKA A11 basic grinder

• Pipettes and tips suitable for handling organic solvents (Microman, Gilson)

• Ultrasonic bath (Branson 3510)

• Single-use sterile and non-pyrogenic latex-free syringes, 0.01-1 mL Tuberkulin Omnifix-F (B.Braun Melsungen AG, cat. no 9161406V)

• Single-use syringe filters free of polymers, such as Anotop 10 (diameter 10 mm, pore size 0.2 µm; Whatman, cat. no 6809-1022) or Minisart RC4 (diameter 4 mm, pore size 0.2 µm; Sartorius, cat. no 17821). CRITICAL: Filters for MS analyses should be resistant to extraction solution (75% methanol + 0.1% FA) and free of polyethylene glycol or any other soluble polymer.

• Crimp cap autosampler vials of 1-2 mL with aluminum crimp caps containing natural rubber/ polytetrafluoroethylene septum

• Tecan Genesis Workstation with TeVacs vacuum filtration unit

• Protein filtration plates in 96 wells format (Captiva 0.45 µm, Ansys Technologies)

• Ninety-six-well plates with 700 µL glass inserts (Waters) and 96-square well polytetrafluoroethylene-coated seal (Waters)

• Analytical column Luna C18(2), 2.0 mm diameter, 150 mm length, 100 Å pore size, spherical particles of 3 µm (Phenomenex)

• Pre-columns Luna C18(2), 2.0 mm diameter, 4 mm length (Security Guard, Phenomenex)

• PEEK in-line filter holder with PEEK frit 0.5 µm pore size (UpChurch Scientific)

• Alliance 2795 HT liquid chromatography system equipped with an internal degasser, sample cooler and column heater (Waters)

• Photodiode array detector 2996 (Waters)

• Quadrupole-time-of-flight Ultima V4.00.00 mass spectrometer equipped with an electrospray ionisation (ESI) source (Waters) and separate lock mass spray inlet

• Separate HPLC pump (e.g. Bromma 2150; LKB) for continuously pumping the lock mass solution at 10 µL min-1

• PEEK tubings (Upchurch Scientific) for connecting the LC-PDA (125 µm inner diameter) and the lock mass pump (250 µm inner diameter) to the mass spectrometer

• PHD 4400 syringe pump (Harvard)

• Gastight glass syringe 0.1-1.0 mL (Hamilton-Bonaduz Schweiz, cat. no. 1001)

• Software: MassLynx data management software 4.0 (Waters), metAlign (www.metalign.nl or www.rikilt.wur.nl/UK/services/MetAlign+download), Microsoft Office Excel 2003. Optional: multivariate analyses software such as GeneMaths 2.01 (Applied Maths, Belgium).

Reagent Setup

Plant growth and sampling conditions: Samples to be prepared for metabolomics studies should be as representative as possible for the genotype or tissues to be analysed. For small plants like Arabidopsis seedlings, a combinatorial approach of controlled plant growth, pooling and replicate analyses can be used to minimize biological and experimental variation. For instance, in the large scale metabolomics study in Arabidopsis RILs (Keurentjes et al., 2006), seeds were sown on 10 mL ½ MS (Murashige and Skoog) agar (2%) in Ø 6 cm Petri-dishes with a density of a few hundred seeds per dish. Dishes were placed in a cold room at 4°C for 7 days in the dark to promote uniform germination and were then randomly placed in five blocks in a climate chamber where each block contained one replicate dish of each line. Growth conditions were 16 h light (30 W m-2) at 20°C and 8 h dark at 15°C, at 75% relative humidity. After 6 days the lids of the Petri-dishes were removed to ensure that seedlings were free of condensed water on the day of harvest. On day 7, at 7 h into the light period, all seedlings were harvested within 2 h by submerging the complete Petri-dish briefly in liquid nitrogen and scraping off the aerial parts with a razor blade. Finally, for each line, material from 2 dishes was pooled to make one replicate sample and material from the other 3 dishes was pooled to make the second. To obtain representative material from large plant tissues, such as fruits of tomato or apple, or tubers of potato, representative “pie” segments are taken from at least 5 fruits or tubers per plant using a sharp knife. Segments are snap-frozen in liquid nitrogen and pooled per plant. Once harvested, plant material can be stored at -80°C until further processing.

Sample extraction solution: Prepare 99.875% methanol solution acidified with 0.125% (v/v) formic acid (FA). CAUTION: methanol is toxic and highly flammable, while formic acid is corrosive. Both solvents should be handled in a fume hood.
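The stock concentrations follow from the intended final concentrations of about 75% methanol and 0.1% formic acid once the solvent is diluted by the water in the tissue. The sketch below checks this arithmetic under two assumptions that are not stated explicitly in the protocol: that the solvent/sample ratio of 3 means 3 mL of solvent per gram of fresh tissue, and that 1 g of tissue contributes roughly 1 mL of water.

```python
def final_percent(stock_percent, solvent_ml, sample_water_ml):
    """Concentration (% v/v) after the stock is diluted by tissue water."""
    return stock_percent * solvent_ml / (solvent_ml + sample_water_ml)


# Assumed: 3 mL extraction solution added to 1 g tissue (~1 mL water)
methanol = final_percent(99.875, solvent_ml=3.0, sample_water_ml=1.0)
formic_acid = final_percent(0.125, solvent_ml=3.0, sample_water_ml=1.0)
print(f"methanol: {methanol:.1f}% (v/v), formic acid: {formic_acid:.3f}% (v/v)")
```

For dry materials such as seeds, the same calculation suggests that water would have to be supplied with the solvent to reach comparable final concentrations.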

HPLC mobile phase: Two eluents are used as mobile phase; eluent A is 0.1% FA (v/v) in ultrapure water, and eluent B is 0.1% FA (v/v) in acetonitrile. CAUTION: Both methanol and acetonitrile are toxic and highly flammable, while FA is corrosive; all solutions should be handled in a fume hood. CRITICAL: As the retention of some metabolites, especially alkaloids, is very sensitive to slight variations in the acidity of the mobile phase, always precisely add 0.1% (v/v) FA to both eluents and prepare sufficient eluents to analyse the entire sample series.

MS calibration solution: To calibrate the mass spectrometer, freshly prepare about a 1 mL solution of phosphoric acid at a concentration of 0.05% (v/v) in 50% acetonitrile / ultra pure water and load into the gastight glass syringe. CAUTION: Handle solvents in fume hood.

Lock mass solution: Prepare a solution of leucine enkephalin in 50% (v/v) acetonitrile / ultra pure water to obtain a final concentration of 0.1 µg mL-1. Prepare sufficient solution for analysis of the complete series of samples. CAUTION: Handle solvent in fume hood.

Equipment Setup

LC-PDA-QTOF MS setup: see Boxes 2.1 and 2.2. CRITICAL: The LC-PDA system needs to be conditioned for a minimum of 1 h before sample analysis; the QTOF MS should be conditioned for a minimum of 2 h.

Data pre-processing and alignment: We routinely program the metAlign software to extract and align all mass signals having a signal to noise ratio of at least 3 (normally used as a threshold in analytical chemistry). The software performs the following processing steps: (i) mass data smoothing using a digital filter related to average peak width; (ii) local noise calculation as a function of retention time and ion trace; (iii) baseline correction of all ion traces and introduction of a threshold to obtain noise reduction; (iv) scaling and calculation and storage of peak maximum amplitudes; (v) between-chromatogram alignment using high signal/noise peaks common to all chromatograms; (vi) iterative fine alignment by including an increasing number of low signal peaks; (vii) output of aligned data into a csv-file compatible with Microsoft Excel and most multivariate programs; and, finally and optionally, (viii) significant difference filtering at user-defined thresholds and output of selected data back to the MS-software platforms for visualisation of differential chromatographic mass peaks. A picture of the metAlign interface is given in Fig. 2.3. The parameters used for processing the 30-min LC-MS runs are shown in the figure itself; for the 60-min runs the differing parameters are given in the legend. The software, examples, manual, etc., can be downloaded for free from www.metalign.nl or www.rikilt.wur.nl/UK/services/MetAlign+download/. It is recommended to carefully read the manual to become acquainted with the effect of the different parameters and how to optimize the settings. Box 2.3 gives a summarized account of this information. Default parameters for some other MS systems can be found in the metAlign manual.
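To make the signal to noise criterion concrete, the following minimal Python sketch picks local maxima in a single ion trace whose height above the baseline is at least three times the noise. It is not the metAlign algorithm: metAlign calculates local noise as a function of retention time and ion trace, whereas this sketch assumes one global baseline (the median) and one global noise estimate (the median absolute deviation). The function name and the example trace are hypothetical.

```python
from statistics import median


def pick_peaks(trace, snr_threshold=3.0):
    """Return indices of local maxima with height >= snr_threshold * noise."""
    baseline = median(trace)
    # Global noise estimate: median absolute deviation from the baseline.
    noise = median(abs(value - baseline) for value in trace)
    peaks = []
    for i in range(1, len(trace) - 1):
        height = trace[i] - baseline
        is_local_max = trace[i] > trace[i - 1] and trace[i] >= trace[i + 1]
        if is_local_max and height >= snr_threshold * noise:
            peaks.append(i)
    return peaks


# Hypothetical ion trace (ion counts per scan) with one real peak at scan index 6.
trace = [10, 12, 9, 11, 10, 60, 140, 65, 11, 9, 12, 10, 13, 10]
peak_indices = pick_peaks(trace)
```

With this trace only the maximum at index 6 passes the threshold; the small fluctuations around the baseline are rejected as noise.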

Figure 2.3. Interface of metAlign software used for untargeted processing of LC-QTOF MS data files. The program is divided into three parts: part A deals with program configuration, data selection, peak extraction and baseline correction; part B covers the actual alignment of extracted mass peaks and output of [mass peak intensity x samples]-data matrix; part C is used to identify and visualise chromatographic peaks that are statistically different between two groups of samples (optional). Parameter settings given in this figure correspond to the default values for processing of 30 min LC-MS runs. For 60 min LC-MS runs the following default parameter settings are recommended: 4=70; 5=2450; 8=3; 9=25; 13=69, 35 and 2450, 35; 16=10, 5. A short description of buttons and parameters is given in Box 2.3.

LC-PDA-MS/MS set up: If needed, mass signals can be further identified using LC-MS/MS. For this purpose, masses of interest are incorporated into a mass inclusion list (data-directed MS/MS). We perform LC-MS/MS on the QTOF Ultima with a scan time of 0.4 s and an interscan delay of 0.1 s. The collision energy profile is programmed to increase sequentially from 5, 10, 20 to 30 eV (ESI positive mode) or 10, 15, 30 to 50 eV (ESI negative mode). If these settings are insufficient to obtain informative MS/MS information for the masses of interest, the collision energy profile can be adjusted. CRITICAL: In case of random LC-MS/MS experiments, in which up to the eight highest intensity ions per survey scan can be automatically selected for MS/MS, use a mass exclusion list containing abundant eluent mass signals in order to prevent switching to MS/MS mode for these impurities.

PROCEDURE

Tissue sampling and extraction

1. Harvest a reproducible amount of tissue (leaf, roots, fruit, etc.) by rapid freezing in liquid nitrogen. Large plant parts such as tomato fruits or potato tubers should first be cut rapidly into representative smaller parts with a sharp knife before freezing. In the case of seeds or small seedlings (e.g. Arabidopsis) use 1.5- or 2.2-mL Eppendorf tubes; in case of larger tissues use 50-mL Greiner tubes or plastic bags that are resistant to liquid nitrogen.

CAUTION: To prevent storage tubes or bags from exploding, remove all liquid nitrogen by gently pouring off before closing and do not screw tube lids firmly!

PAUSE POINT: Frozen tissue can be stored at -80°C for at least 1 year.

2. Homogenize the frozen tissue in liquid nitrogen into a fine powder using a pestle and mortar, but preferably use a ball mill (Retsch Mixer Mill MM 301 for Arabidopsis) or analytical mill (IKA A11 for larger tissues) which have been thoroughly pre-cooled with liquid nitrogen. Transfer homogenized powder into pre-cooled storage containers resistant to liquid nitrogen.

CRITICAL STEP: Take care that tissues stay fully frozen during homogenization; discard any samples that start to thaw. If needed carefully pour a small volume of liquid nitrogen onto the sample, let the nitrogen evaporate and continue homogenization.

PAUSE POINT: Frozen powder can be stored at -80°C for at least 1 year.

3. Weigh 100 mg frozen powder of Arabidopsis with an accuracy better than 5% in a pre-cooled Eppendorf tube, or 500 mg in the case of larger amounts of tissue (e.g. tomato fruit or potato tuber) in a 10-mL glass tube with screw cap. Lower amounts can be used as well, but this is not advisable in view of the inherently higher relative weighing error when using frozen material.

CRITICAL STEP: Take care that tissues stay fully frozen; discard any samples that start to thaw. Unless there are specific practical reasons for it, lyophilization of tissue is not recommended without knowing the effect of the lyophilization procedure on the metabolite profile.

PAUSE POINT: Frozen powder in tubes can be stored at -80°C for at least 1 month.

4. Prepare extracts freshly at the beginning of a series of analyses. Add ice-cold sample extraction solution (99.875% methanol acidified with 0.125% FA) in a volume/fresh weight ratio of 3 to the tube containing the weighed frozen powder, close lid and immediately vortex for 10 s. Assuming a tissue-water content of about 95%, this will result in a final concentration of 75% methanol and 0.1% FA. In the case of samples with highly variable water contents or lyophilized material, pure water can be added to adjust each sample to a final solvent concentration of 75% methanol and 0.1% FA. Store extracts on ice until all samples are ready.

5. Sonicate 15 min at maximum frequency (40 kHz) in a water bath at room temperature (20 ºC).

6. Centrifuge 10 min at maximum speed (20,000 x g for Eppendorf tubes; 3,000 x g for glass tubes) at room temperature.

7. Filter the supernatant over a 0.2 µm PTFE filter using a disposable syringe into a 1.8-mL glass vial and close vial with cap. In the case of large numbers of samples, use suitable filtration plates in 96-well format and a vacuum filtration unit. We use a TECAN Genesis Workstation 150 equipped with a 4-channel pipetting robot and a TeVacS 96-well filtration unit. Pre-wash filtration plates (Captiva 0.45 µm, Ansys Technologies) at least three times with 700 µL of 75% methanol containing 0.1% FA. Dry bottom tips of the filters by blotting on filter paper. Place a 96-well plate with 700 µL glass inserts in the filtration unit under the pre-washed filtration plate. Load each well with 700 µL of extract and vacuum-filter twice for 20 s until dry. Carefully remove air-bubbles trapped at the bottom of the inserts. Cover the plate with a 96-square well PTFE-coated seal.

CRITICAL STEP: All filters used should be free of aqueous-methanol soluble polymers, such as polyethylene glycol.
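The solvent arithmetic of step 4 can be checked with a short Python sketch. It assumes that 1 g of tissue water corresponds to 1 mL and that volumes are additive (no volume contraction on mixing); the function names are illustrative only.

```python
def final_extract_composition(fresh_weight_g, water_fraction=0.95,
                              solvent_ml_per_g=3.0,
                              methanol_fraction=0.99875, fa_fraction=0.00125):
    """Return (methanol %, formic acid %) (v/v) in the final extract."""
    solvent_ml = solvent_ml_per_g * fresh_weight_g
    tissue_water_ml = water_fraction * fresh_weight_g  # assumes 1 g water = 1 mL
    total_ml = solvent_ml + tissue_water_ml
    methanol_pct = 100.0 * methanol_fraction * solvent_ml / total_ml
    fa_pct = 100.0 * fa_fraction * solvent_ml / total_ml
    return methanol_pct, fa_pct


def water_to_add_ml(solvent_ml, sample_water_ml,
                    target_methanol_fraction=0.75, methanol_fraction=0.99875):
    """Volume of pure water needed to bring an extract down to the target methanol fraction."""
    total_needed_ml = methanol_fraction * solvent_ml / target_methanol_fraction
    return max(0.0, total_needed_ml - solvent_ml - sample_water_ml)


# 500 mg of fresh tissue with about 95% water, extracted with 1.5 mL solvent.
methanol_pct, fa_pct = final_extract_composition(0.5)
# Lyophilized material (no tissue water) extracted with the same 1.5 mL of solvent.
extra_water_ml = water_to_add_ml(1.5, 0.0)
```

For 500 mg of fresh tissue this gives roughly 76% methanol and 0.095% FA, in line with the nominal 75% and 0.1%; for dry material about 0.5 mL of water would have to be added to the 1.5 mL of extraction solution.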

LC-PDA-QTOF MS analysis

8. Place vials or 96-well plates in the autosampler, conditioned at 20°C.

9. Check for the presence of sufficient eluents, lock mass solution and nitrogen gas, and start sample series with at least 4 “dummy” injections, to stabilize the LC-PDA-MS system, using the setup detailed in Boxes 2.1 and 2.2. Check system performance and mass accuracy during these first runs. Deviations of observed known parent masses from their calculated masses should be less than 5 ppm (at signal intensities similar to that of the local lock mass), otherwise recalibrate system. ? TROUBLESHOOTING

PAUSE POINT: Raw data can be stored on hard disks, tapes, DVDs or other digital storage devices until further processing.
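The 5 ppm criterion of step 9 amounts to a simple relative mass error. The sketch below shows the calculation in Python; the observed m/z values are invented for illustration, while the calculated value is the positive-mode lock mass used in this protocol.

```python
def ppm_error(observed_mz, calculated_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


def needs_recalibration(observed_mz, calculated_mz, tolerance_ppm=5.0):
    """True when the absolute deviation is not below the tolerance."""
    return abs(ppm_error(observed_mz, calculated_mz)) >= tolerance_ppm


# Hypothetical observation of a known parent mass at m/z 556.2767.
error = ppm_error(556.2790, 556.2767)
```

An observed value of 556.2790 deviates by about 4.1 ppm and would still pass, whereas 556.2800 (about 5.9 ppm) would call for recalibration.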

Pre-processing and alignment of LC-MS data

10. Configure metAlign (see Equipment Setup) and select the data to be processed (buttons 1-3, see Box 2.3 for more details). The first sample selected in button 2B is used as the reference file in the actual alignment (part B, see Fig. 2.3). We recommend selecting the sample that has been analysed just in the middle of the entire LC-MS series as this reference file, to minimize the extent of retention profile correction between first and last samples analysed.

11. Perform a test baseline correction (part A, Fig. 2.3) and alignment (part B, Fig. 2.3) on only a few variable samples to check whether the default settings are suitable to extract and align mass peaks that are of specific interest (if any). Define parameters for peak extraction and noise (buttons 4-9, see Box 2.3 for more details) and run baseline correction (button 11, see Box 2.3 for more details). Manually inspect corresponding mass peaks in the beginning, middle and at the end of the baseline-corrected chromatograms and compare with the original raw data. If it is obvious that some mass signals from relatively broad chromatographic peaks are missing in the baseline corrected data, set parameter 9 (see Box 2.3 for more details) at a slightly higher value and re-run baseline correction. On the other hand, if closely eluting peaks of compounds with similar (nominal) mass have been extracted as single peaks, lower the value of parameter 9.

12. Once peak extraction and baseline correction settings are satisfactory, run baseline correction for all samples. Note that baseline correction is the most time-consuming part of metAlign and can take a few hours for 100 samples (depending on configuration of the computer).

13. After baseline correction of the entire series, inspect retention shifts in the baseline-corrected data files of the reference sample and of the first and last sample of the entire data set. Set maximum shift at initial peak searching criteria (parameter 13, see Box 2.3 for more details) according to default settings, or to a value at least a factor of 2 higher than visually observed retention shifts and higher than that set in parameter 9. In most experiments on related samples we use the iterative alignment with parameters indicated in Fig. 2.3 and its legend (see also examples in the metAlign manual). ? TROUBLESHOOTING.

14. To prevent metAlign outputting mass peaks that are detected in only one or a few samples, e.g. due to impurities present in one extract, it is recommended to increase parameter 18 (see Box 2.3 for more details) to a value corresponding to the number of replicates or to relevant statistical units.

15. After running the alignment (button 20), create the data output file (button 21, see Box 2.3 for more details).
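The idea behind step 14, discarding mass peaks that occur in only one or a few samples, can also be applied to an exported data matrix. The Python sketch below illustrates that idea only; it is not how metAlign implements parameter 18, and the peak names, intensities and the in-memory table layout are assumptions.

```python
def filter_by_detection_count(peak_table, min_samples, noise_level=0.0):
    """Keep peaks whose intensity exceeds noise_level in at least min_samples samples."""
    kept = {}
    for peak_id, intensities in peak_table.items():
        detected = sum(1 for value in intensities if value > noise_level)
        if detected >= min_samples:
            kept[peak_id] = intensities
    return kept


# Hypothetical aligned table: peak identifier -> intensity in each of four samples.
aligned = {
    "peak_A": [120, 135, 0, 128],
    "peak_B": [0, 0, 950, 0],    # present in a single extract only, e.g. an impurity
    "peak_C": [40, 0, 38, 45],
}
filtered = filter_by_detection_count(aligned, min_samples=3)
```

With a minimum of three samples, `peak_B` is dropped while `peak_A` and `peak_C` are retained.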

Identification of relevant metabolites

16. Retrieve accurate masses of filtered mass peaks in the raw data file manually. Inspect absorbance spectra, recorded by the PDA detector, of compounds of interest. ? TROUBLESHOOTING

17. Perform additional LC-QTOF MS/MS fragmentation experiments for further identification. Enter selected masses into a mass inclusion list to ensure isolation in the quadrupole (data-directed MS/MS).

18. Predict the elemental composition of the mass peaks of interest from the accurate mass calculation, together with MS/MS fragmentation, isotopic patterns and, if possible, specific absorbance spectra.

19. Use the elemental formulae obtained to search the internet or commercially available compound databases (e.g. Database of Natural Products on CD-ROM) for possible candidates. As a first step to facilitate the query of LC-MS based plant metabolomics data, an open access database for identified semi-polar metabolites, currently mainly (poly)phenolic compounds, detected in tomato fruit has recently been developed (Moco et al., 2006a) and can be searched at http://appliedbioinformatics.wur.nl/moto. This database is derived using the exact protocol described in the present paper. However, in untargeted LC-MS most of the elemental compositions detected in plant extracts still correspond to unknown compounds, or reference compounds are not commercially available (von Roepenack-Lahaye et al., 2004; Vorst et al., 2005; Moco et al., 2006a). Therefore, many of the putatively annotated structures cannot yet be unambiguously identified without using NMR or other tools.
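Checking a candidate elemental formula against an observed accurate mass (steps 18 and 19) comes down to computing the monoisotopic ion mass and its deviation in ppm. The Python sketch below does this for leucine enkephalin, the lock mass compound of this protocol. The elemental formula C28H37N5O7 and the atomic masses are standard reference values rather than values taken from this chapter, and only singly charged [M+H]+ and [M-H]- ions are considered.

```python
# Monoisotopic masses (u) of the most abundant isotopes; standard reference values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ion_mz(formula, mode):
    """m/z of the singly charged [M+H]+ ('positive') or [M-H]- ('negative') ion."""
    shift = PROTON if mode == "positive" else -PROTON
    return monoisotopic_mass(formula) + shift


def ppm_error(observed_mz, calculated_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


leucine_enkephalin = {"C": 28, "H": 37, "N": 5, "O": 7}
mz_positive = ion_mz(leucine_enkephalin, "positive")
mz_negative = ion_mz(leucine_enkephalin, "negative")
```

The calculated values agree with the lock masses quoted in Box 2.2 (m/z 556.2767 and 554.2619) to well within 1 ppm; in practice the same comparison is run for every candidate formula, and isotopic patterns and MS/MS fragments are needed to discriminate between candidates of similar mass.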

TIMING

[Figure 2.4: timeline running from day 1 to day 6, showing the stages harvest, sampling, extraction, LC-MS analyses, data pre-processing and data output.]

Figure 2.4. Timeline of standard procedure of untargeted LC-MS analyses, based on 50 Arabidopsis seedling samples and LC-MS analysis time of 1 h. For large plant tissues such as tomato fruits, the sampling step (including grinding and weighing) can take 4 days, resulting in a total time of 8 days for 50 samples.

The timeline of the procedure from tissue handling up to the final output for subsequent statistical analyses (matrix of intensity of aligned mass peaks versus samples) is schematized in Fig. 2.4. For about 50 Arabidopsis samples, the sampling step, which includes grinding in liquid nitrogen using a ball mill and weighing of frozen tissues, can be done in 2 days. However, for the same number of samples from larger plant tissues such as tomato fruit and potato tubers, these activities usually take more time: about 4 days. Subsequent sample extraction, conditioning the LC-MS, extract analysis and mass peak alignment by metAlign will take about 4 days for 50 samples, irrespective of the type and origin of tissue. Depending upon the research question, much more time may be needed for further interpretation of the comprehensive metabolomics data set including statistical filtering and identification of relevant mass peaks.

BOX 2.1: LC-PDA-QTOF MS setup; Conditioning the HPLC-PDA system

1. Prepare the mobile phase solvents, prime HPLC pump and tubing, and degas both solvents for at least 10 min using the in-line degasser of the Alliance 2795 HT.

2. Install one PEEK in-line solvent filter between injection system and pre-column cartridge. Place two pre-columns in tandem in the cartridge, fix in front of the analytical column and place both columns in the column oven conditioned at 40°C.

3. Precondition column system by increasing the percentage of eluent A step-wise (starting at 100% eluent B) until the initial gradient conditions are reached.

4. Program the inlet file according to the gradient settings given below (Tables 2.1 and 2.2). In the standard set up we use relatively long chromatographic runs of 1 h, including column washing and re-conditioning, with a mobile phase flow of 0.19 mL min-1 into the analytical column (diameter of 2.0 mm). This flow rate corresponds to 1 mL min-1 on a 4.6 mm column, which is standard in most HPLC-UV/Vis applications. In the case of a large sample series, e.g. more than 300 extracts, we consider the use of a 30-min run at a slightly higher flow rate, to lower the chance of possible perturbations.

Table 2.1. Gradient settings for a 60 min run; Flow rate 0.19 mL min-1.

Time (min)    %A    %B
 0            95     5
45            65    35
47            25    75
52            25    75
54            95     5
60            95     5

Table 2.2. Gradient settings for a 30 min run; Flow rate 0.20 mL min-1.

Time (min)    %A    %B
 0            95     5
20            25    75
25            25    75
26            95     5
30            95     5

5. The PDA detector is placed between analytical column and the QTOF-MS. Connect column outlet to flow cell of the PDA detector and switch on the detector. Program PDA to acquire data every second from 210 nm to 600 nm with a resolution of 4.8 nm. Wavelength range, scan rate and resolution can be adjusted according to LC run times and research aims.

CRITICAL: Check HPLC pump for air bubbles and connections for leakage by verifying pressure stability.

CRITICAL: Precondition PDA-lamp, column oven temperature and analytical column for at least 1 hour before starting sample analyses. Meanwhile, the mass spectrometer can be calibrated and checked for performance as described in Box 2.2.

6. Place the aqueous-methanol extracts in trays inside the autosampler (20 ºC) during the analysis series. Program the injection system to operate in sequential mode and to load the syringe with 5 µL of sample with 5 µL of air both before and after the sample. The injection needle is washed with 50% (v/v) methanol/water between injections.
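The correspondence between 0.19 mL min-1 on a 2.0 mm column and 1 mL min-1 on a 4.6 mm column, mentioned in step 4 of this box, follows from keeping the linear velocity of the mobile phase constant, so that the flow scales with the square of the column diameter. A minimal Python sketch of this scaling, assuming columns with identical packing:

```python
def scale_flow_rate(flow_ml_min, from_diameter_mm, to_diameter_mm):
    """Flow rate giving the same linear velocity on a column of a different diameter."""
    return flow_ml_min * (to_diameter_mm / from_diameter_mm) ** 2


# 1 mL min-1 on a 4.6 mm column translated to a 2.0 mm column.
scaled_flow = scale_flow_rate(1.0, 4.6, 2.0)
```

The result is about 0.189 mL min-1, which rounds to the 0.19 mL min-1 used in the standard set up.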

BOX 2.2: LC-PDA-QTOF MS setup; Conditioning the MS system

Before each series of sample analyses, the mass spectrometer should be conditioned and calibrated to obtain good performance in terms of mass accuracy and resolution. In contrast to electron impact ionisation, as used in most GC-(TOF)MS applications, detection sensitivity and mass spectra obtained by soft-ionisation LC-MS are completely dependent on the type of mass spectrometer, ionisation source and chromatographic system used. The procedure and settings described here are for a QTOF Ultima with ESI source and the TOF-tube in V-mode, in combination with the HPLC conditions described above.

1. Connect the outlet of the PDA, with eluent flow of 0.19 mL min-1, to the inlet of the mass spectrometer and set the capillary voltage at 2.75 kV, cone voltage at 35 V, source temperature at 120°C and desolvation temperature at 250°C. Use a cone gas flow of 50 L h-1 and desolvation gas flow of 600 L h-1.

CRITICAL: pre-condition MS for at least 2 h at these standard settings.

2. Disconnect the eluent tubing from the MS and use the syringe pump to inject the phosphoric acid calibration solution directly into the ESI source, at an initial flow of 5 µL min-1.

3. Acquire data from m/z 80-1,500 at a scan rate of 0.9 s and an interscan delay of 0.1 s. A series of phosphoric acid cluster peaks should appear throughout the entire range of the mass spectrum.

CRITICAL: to obtain proper calibration and accurate mass calculations, none of the mass calibration peaks should exceed an intensity of 250 counts s-1 (in continuum mode) and the intensity of the clusters over the mass range should be as uniform as possible. Adjust pump flow, capillary voltage, cone voltage, desolvation gas flow and/or collision energy until criteria are fulfilled.

4. Combine spectra of about 50 scans during acquisition mode at optimal settings in continuum mode, centre the mass signals and check mass resolution of the machine for m/z 488.8772 (negative ionisation mode) or 490.8918 (positive ionisation mode). Mass resolution is calculated by dividing the m/z value of the centred mass signal by the mass difference at half height of the Gaussian-shaped mass peak in continuum mode, and should be better than 8,500 (with QTOF Ultima in V-mode); otherwise re-tune instrument and repeat procedure.

5. Use the centred mass data for calibration of the instrument using a polynomial-5 fit.

CRITICAL: mean residual mass deviation should be less than 1.5 ppm, otherwise adjust calibration settings.

6. Check calibration using leucine enkephalin as a standard. Inject the leucine enkephalin solution through the separate lock mass inlet into the ESI source and acquire data under MS conditions as used during sample analyses, but in continuum mode. Adjust flow to obtain a specific mass intensity of 250 counts s-1. Collect and combine about 50 spectra and centre the mass peak.

CRITICAL: the observed mass should be within 20 ppm deviation of m/z 556.2767 in positive mode and 554.2619 in negative mode, otherwise recalibrate instrument.

7. Reconnect the outlet of the PDA to the inlet of the mass spectrometer. Check the effluent from the LC system, including mobile phase, tubing, columns and PDA flow cell, by acquiring centroid data from m/z 80-1,500 under the exact conditions of sample analysis. Individual mass signals at initial gradient conditions should preferably be less than 200 counts per scan in negative mode or less than 500 counts per scan in positive mode, to prevent excessive ion suppression of sample compounds.

8. Prepare MS method file to acquire mass data from m/z 80-1,500, at a scan rate of 0.9 s and an interscan delay of 0.1 s and in centroid mode.

CRITICAL: the range of masses to be detected in sample extracts should fall within the range of calibration masses. During sample analyses, the standard setting of collision energy is 10 eV in negative ion mode and 5 eV in positive ion mode. If needed for optimal ionisation of key compounds, the collision energy may be adapted. The MS is programmed to switch from sample to lock spray every 10 s and to average two scans for lock mass correction (m/z 556.2767 in positive mode and 554.2619 in negative mode). The lock mass solution is used for online calibration of the mass accuracy during sample analysis (Wolff et al., 2001; Moco et al., 2006a).

CRITICAL: adjust flow rate or concentration of the lock mass solution to obtain an intensity of about 500 counts per scan (in centroid mode) during LC-MS runs, to enable accurate mass calculation of as many compounds in the extracts as possible.
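The mass resolution check in step 4 of this box is the m/z value of the centred signal divided by the peak width at half height. A minimal Python sketch, in which the peak width of 0.055 is an invented example value:

```python
def mass_resolution(mz, width_at_half_height):
    """Resolution as m/z divided by the full peak width at half maximum."""
    return mz / width_at_half_height


def passes_resolution_check(mz, width_at_half_height, minimum=8500.0):
    """True when the resolution is better than the required minimum."""
    return mass_resolution(mz, width_at_half_height) > minimum


# Phosphoric acid cluster at m/z 490.8918 (positive mode) with a hypothetical width of 0.055.
resolution = mass_resolution(490.8918, 0.055)
```

This example gives a resolution of about 8,925, which satisfies the 8,500 criterion; a width of 0.060 at the same m/z (about 8,180) would call for re-tuning.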

BOX 2.3: Description of metAlign buttons and parameters

A more detailed description can be found in the manual, which can be downloaded from www.metalign.nl or www.rikilt.wur.nl/UK/services/MetAlign+download/.

PART A: Program configuration, data set selection and baseline correction

• Buttons 1-3 are used to define the data sets as well as define folders and formats for input and output.

• Parameters 4 and 5 (value in scans) refer to the region in the chromatogram, which should be processed. In particular parameter 5 should be taken in an empty region of the chromatogram at the highest concentration of organic modifier in the gradient or at an earlier time point. This enables metAlign to calculate a matrix of noise vs. retention time vs. mass. This noise matrix together with parameter 7 and 8 is then used as a basis to find real mass peaks.

• Parameter 6 (value in ion counts of a single mass) is machine dependent and should be set at about 70% of the maximum value a detector can record, to be able to deal with artefacts due to detector saturation. MetAlign creates artificial maxima at this value for all peaks above this value.

• Parameter 7 and 8 (factor times local noise) are peak slope and threshold factors used to filter out peaks from noise.

• Parameter 9 (value in scans) should be the average mass peak width at half height of non-saturated compounds. This parameter is used in determining the data smoothing (digital filter) as well as for a window in the alignment (see “14. Tuning Alignment Options and Criteria”).

• Parameter 10 is “de-clicked” to indicate that the peak shapes should not be saved, which only in this mode is compatible with alignment; “clicked” keeps peak shapes and renders the output incompatible with alignment, but on the other hand is compatible with deconvolution algorithms from third party software.

• Button 11 consecutively processes all data sets defi ned by buttons

the mass accuracy during sample analysis (Wolff et al., 2001; Moco et al., 2006a).

CRITICAL: adjust flow rate or concentration of the lock mass solution to obtain an intensity of about 500 counts per scan (in centroid mode) during LC-MS runs, to enable accurate mass calculation of as many compounds in the extracts as possible.

BOX 2.3: Description of metAlign buttons and parameters

A more detailed description can be found in the manual, which can be downloaded from www.metalign.nl or www.rikilt.wur.nl/UK/services/MetAlign+download/.

PART A: Program configuration, data set selection and baseline correction

• Buttons 1-3 are used to define the data sets as well as define folders and formats for input and output.

• Parameters 4 and 5 (value in scans) refer to the region in the chromatogram, which should be processed. In particular parameter 5 should be taken in an empty region of the chromatogram at the highest concentration of organic modifier in the gradient or at an earlier time point. This enables metAlign to calculate a matrix of noise vs. retention time vs. mass. This noise matrix together with parameters 7 and 8 is then used as a basis to find real mass peaks.

• Parameter 6 (value in ion counts of a single mass) is machine dependent and should be set at about 70% of the maximum value a detector can record, to be able to deal with artefacts due to detector saturation. MetAlign creates artificial maxima at this value for all peaks above this value.

• Parameters 7 and 8 (factor times local noise) are peak slope and threshold factors used to filter out peaks from noise.

• Parameter 9 (value in scans) should be the average mass peak width at half height of non-saturated compounds. This parameter is used in determining the data smoothing (digital filter) as well as for a window in the alignment (see “14. Tuning Alignment Options and Criteria”).

• Parameter 10 is “de-clicked” to indicate that the peak shapes should not be saved, which only in this mode is compatible with alignment; “clicked” keeps peak shapes and renders the output incompatible with alignment, but on the other hand is compatible with deconvolution algorithms from third party software.

• Button 11 consecutively processes all data sets defined by buttons 1 to 3. It starts the noise estimation as a function of time and mass, the smoothing, maximum amplitude correction (if needed), baseline correction, noise elimination, peak picking and exporting of baseline corrected peaks.
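As a rough illustration of how the saturation ceiling (parameter 6) and a noise-based threshold factor (cf. parameters 7 and 8) could act on a single mass trace, consider the Python sketch below. It is not metAlign's actual algorithm: the function names, the constant local noise, the detector maximum and the example trace are all assumptions made for illustration.

```python
def cap_saturated(intensities, detector_max, fraction=0.7):
    """Clip intensities at a fraction of the detector maximum (cf. parameter 6)."""
    ceiling = fraction * detector_max
    return [min(value, ceiling) for value in intensities]


def above_noise(intensities, local_noise, threshold_factor):
    """Return the scan indices whose intensity exceeds threshold_factor x local noise."""
    return [i for i, value in enumerate(intensities)
            if value > threshold_factor * local_noise[i]]


# Hypothetical trace of one mass over six scans (ion counts)
trace = [12, 15, 480, 9000, 300, 10]
noise = [10, 10, 10, 10, 10, 10]   # assumed constant local noise
capped = cap_saturated(trace, detector_max=10000)
peaks = above_noise(capped, noise, threshold_factor=4)
print(capped)
print(peaks)
```

In this toy trace the saturated scan is clipped to the ceiling, and only the scans that rise clearly above the assumed noise level are retained as candidate peak points.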

PART B: Scaling and aligning data sets

• Button 12 provides different modes of scaling data sets. Options are: (a) no scaling, (b) scaling on the basis of the sum of all the amplitudes of the peaks picked, (c) scaling using a specific mass.

• The parameters in “13. Initial Peak Search Criteria” provide the window (in ± the indicated scans) at a position (in scans) in the chromatogram in which a search for identical masses is done over all chromatograms. This window may vary with retention time; the parameters in 13 provide coordinates used for linear interpolation of the window size for the whole chromatogram.

• The options in “14. Tuning Alignment Options and Criteria” determine whether the rough or the iterative alignment is performed. In brief, the alignment works as follows. In both modes of alignment the window determined by “13. Initial Peak Search Criteria” is used to restrict searches for identical masses in different data sets. For the rough mode the alignment finishes here. For the iterative alignment this is the starting point for the first estimation of a retention shift profile for all data sets with regard to the first data set. For each time point in a retention shift profile, the differences in retention time between files are calculated according to criteria (parameters 16 and 17) that require a minimum number of aligned masses present in all data sets, which are above a minimum amplitude (factor times noise) and occur in a chromatogram sub-window (of two times parameter 9). The next iteration starts from here. Using this first retention shift profile, the alignment is refined by keeping track of the differences in retention and automatically decreasing the parameters in “13. Initial Peak Search Criteria” to obtain a smaller search window throughout the chromatogram. The second alignment is then done as described, using the smaller retention-corrected search window (13). Parameters 16 (number of masses) and 17 (factor times noise) are also automatically reduced, and a new and better retention shift profile is calculated analogously to the first iteration. Iterations continue until the final values of parameters 16 and 17 are reached and the search window is 2 times the value of parameter 9 (average peak width). After the last iteration, incomplete mass peak sets spread over neighbouring scans are combined in a fine-alignment process.

• Parameter 15 restricts changes in retention time shifts between calculated points in a retention shift profile to a maximum value (in scans per 100 scans). This restriction is used after calculation of a retention shift profile and serves to filter out possible anomalies.

• Parameters 18 and 19 are filters for aligned mass peaks, which indicate minimum completeness of aligned mass peak sets.

• Button 20 starts the scaling and alignment of data obtained in PART A.

• Button 21 is used to obtain information on the alignment of masses. There are 3 options: (i) a normal ASCII output, (ii) an Excel-compatible CSV-file output, (iii) a graphical display of the retention shift profiles of individual data sets with regard to the first reference file.

• Button 29 executes the calculations under Button 11, Button 20 and Button 28.

• Button 30 exits the program saving the parameters set.
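The linear interpolation of the search window described under “13. Initial Peak Search Criteria” can be pictured with a short Python sketch. The function name, the two coordinate pairs and their values are hypothetical; how metAlign handles positions outside the two coordinates is not described here, so the sketch simply extends the same line.

```python
def search_window(scan, begin, end):
    """Linearly interpolate the search window (in +/- scans) at a scan position.

    begin and end are (scan_position, window) pairs, standing in for the two
    coordinates entered under "13. Initial Peak Search Criteria".
    """
    (scan0, window0), (scan1, window1) = begin, end
    fraction = (scan - scan0) / (scan1 - scan0)
    return window0 + fraction * (window1 - window0)


# Hypothetical settings: +/-20 scans at scan 0 and +/-30 scans at scan 3000
print(search_window(1500, (0, 20), (3000, 30)))
```

Halfway between the two coordinates the window is the average of the two window sizes, which is the behaviour one would expect from a window that widens gradually with retention time.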

PART C: Peak selection and export to MS software format for visualisation (only applicable when comparing 2 groups of data)

• Parameter 22 is the significance percentage restriction when selecting differences between data in group 1 vs. group 2.

• Parameter 23 restricts selection of differences between groups on the basis of the ratio in the means of individual aligned masses.

• Parameter 24 restricts selection of differences between groups on the basis of the minimum amplitudes defined as a factor times noise, i.e. it determines what is defined as present.

• Parameter 25 is used to filter on peaks which are only present in one group. The extra edit box is a filter for this option. It determines the minimum number of masses which should be present for a “compound”, which is only present in one group.

• Parameter 26 is a condition. With this condition you select if peaks present in group 2 are larger than in group 1 or vice versa.

• Button 27 executes PART C and creates a selection of peaks on the basis of the parameters set (22-26).

• Button 28 gives similar output as described at Button 21.
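To make the roles of the ratio filter (parameter 23) and the presence filter (parameter 24) more concrete, the following Python sketch applies both to a single aligned mass peak. It is only an assumption-laden illustration, not the metAlign implementation: the function name and example intensities are invented, the significance test of parameter 22 is left out, and treating a zero mean as "present in one group only" is a simplification.

```python
from statistics import mean


def select_peak(group1, group2, noise, min_ratio, min_factor):
    """Decide whether one aligned mass peak differs between two groups.

    The peak is kept if the larger group mean reaches min_factor x noise
    (cf. parameter 24) and the ratio of the group means is at least
    min_ratio in either direction (cf. parameter 23).
    """
    m1, m2 = mean(group1), mean(group2)
    high, low = max(m1, m2), min(m1, m2)
    if high < min_factor * noise:
        return False   # not considered present in either group
    if low == 0:
        return True    # signal found in one group only
    return high / low >= min_ratio


# Hypothetical intensities of one aligned mass in two groups of three samples
print(select_peak([100, 120, 110], [400, 420, 380], noise=10, min_ratio=2, min_factor=4))
```

A direction condition such as parameter 26 could be added by comparing the two means before taking their ratio.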

TROUBLESHOOTING

Major problems are not expected when applying this protocol and keeping in mind the indicated critical steps. If by accident the LC flow stops, or for some reason has to be stopped even for only a short time, or upon running out of nitrogen gas, at least 4 samples should be analysed as “dummy” injections to re-stabilize the system. Upon malfunction of the MS system, e.g. sudden decrease in detector sensitivity, the extracts can be stored at 4-10°C for at least 1 week. After storage, always sonicate vials or inserts to re-dissolve possible precipitates in the extracts or filter extracts once again.

If, upon metAlign processing, there seem to be insufficient landmark peaks (i.e. mass signals common to all samples) for proper iterative alignment, a message will automatically be displayed. This can be the case when comparing highly unrelated samples (“apples and pears”). If such a comparison is still essential for the research question, we recommend lowering parameters 16 and/or 17 or, alternatively, using the rough alignment tool at button 14 (see also Box 2.2).

With regard to accurate mass calculation, the mass accuracy of an ion detected by the QTOF-Ultima MS is in principle highest at signal intensities that are comparable to that of the local lock mass (Moco et al., 2006a). Thus, if in all samples the mass signal of interest is lower than about half the intensity of the lock mass, it is impossible to calculate its exact mass using this type of mass spectrometer. Lowering the lock mass intensity during analysis is not recommended, as this will prevent an accurate estimation of the lock mass itself. At low mass signals, it is difficult to obtain informative MS/MS fragmentation as well. Strategies to increase the mass signal, such as injecting higher sample volumes, analyzing in the opposite ionisation mode, using a different ionisation source (e.g. APCI) or post-column addition of ionisation promoter (e.g. ammonium acetate), may be tested. Alternatively, the compound of interest can be concentrated or the sample can be re-analysed by other instruments with higher mass accuracy and/or MS/MS capabilities at a low mass intensity range.

ANTICIPATED RESULTS

As this untargeted metabolomics protocol makes use of crude 75% aqueous-methanol extracts of plants coupled to C18-reversed phase LC and ESI-MS, the technique described is slightly biased towards semi-polar secondary metabolites. Nevertheless, within the same extracts a number of primary metabolites, e.g. several organic acids, nucleotides, amino acids, sugars and their phosphorylated forms, can be detected by this technique as well. However, as most of these primary metabolites are highly polar and usually co-elute with other compounds in the injection peak when using this type of column, one should be aware that differences detected in the intensity of polar mass signals may result from differential degrees of ion suppression. Results on polar compounds obtained with this protocol should be checked with alternative LC systems (Tolstikov and Fiehn, 2002; Jander et al., 2004) or other metabolomics techniques (e.g. GC-TOF MS, CE-MS).

[Figure 2.5, panels A-C: retention time variation (s), intensity variation (%) and mass accuracy (ppm), each plotted against time of LC-MS analyses (0-240 h).]

Figure 2.5. Stability of the LC-QTOF MS system during 240 h continuous analyses of crude plant extracts (ESI negative mode). From a homogeneous batch of Brassica nigra leaf tissue, 16 replicate extracts were prepared and analysed throughout a series of 240 samples, using a run time of 1 h per sample. Variation between replicates in the detection of rutin (for identification see Fig. 2.2F) is indicated. (A) Retention drift during analyses, expressed in seconds deviation from the mean retention time (23.195 min ± 1.3 s; n=16). (B) Variation in mass signal intensity (peak height calculated by metAlign), expressed as percentage deviation from the mean intensity (1721 ± 355 counts scan⁻¹, coefficient of variation=21%; n=16) versus time of analysis. Variation is the sum of all technical variation including weighing, extraction, LC-MS analysis and data-processing. (C) Variation in accurate mass measurement, in ppm deviation from the mean of accurate masses calculated on the top of chromatographic peaks. Scale of y-axis: -10.0 to +10.0 ppm.

As shown in Fig. 2.5, the protocol described here enables highly stable chromatography and mass signal detection throughout analysis of large sample series. As the quality of metAlign-assisted data alignment and untargeted sample comparison is higher with increasing reproducibility of chromatography, the maximum drift in retention time of (known) compounds over the sample series analysed should be as small as possible and preferably less than 10 s (Fig. 2.5A). Larger retention shifts usually indicate column deterioration, trapped air bubbles or changes in eluent pH. Technical variation in relative quantification of mass signals between samples, which can be introduced at each step from 1 to 16 of the PROCEDURE, can be calculated from the intensities of (known) mass peaks (Fig. 2.5B). The coefficient of variation in intensities between replicate samples should be less than 25% overall, and is usually less than 10% for the higher abundant signals (Moco et al., 2006a).
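The coefficient of variation quoted for rutin in Fig. 2.5B follows directly from the reported mean and standard deviation. The short Python sketch below reproduces that arithmetic; the function names and the three replicate values in the second example are illustrative assumptions.

```python
from statistics import mean, stdev


def cv_percent(mean_value, standard_deviation):
    """Coefficient of variation as a percentage of the mean."""
    return 100.0 * standard_deviation / mean_value


def cv_from_replicates(values):
    """Coefficient of variation computed from replicate intensities."""
    return cv_percent(mean(values), stdev(values))


# Values reported for rutin in Fig. 2.5B: 1721 +/- 355 counts per scan (n=16)
print(round(cv_percent(1721, 355)))

# Hypothetical replicate intensities of another mass peak
print(cv_from_replicates([90, 100, 110]))
```

The first call rounds to the 21% given in the figure legend, which lies within the 25% limit recommended above.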

In addition, technical reproducibility can be estimated by creating scatter plots of all mass peaks from replicate samples (Vorst et al., 2005). Upon adequate mass calibration and by using lock mass correction on-line, the accurate masses of ions detected are usually stable throughout large sample series (Fig. 2.5C). With the TOF resolution used and at a signal intensity that is comparable to that of the lock mass, the observed accurate mass of a compound of interest should be within 5 ppm deviation from the calculated mass. In our laboratory, we use a script called MetAccure (Vorst et al., 2005; Moco et al., 2006a) to select scans within a user-defined intensity ratio of sample versus lock mass, to enable automated and correct accurate mass calculations. By calculating the mean values of observed accurate masses of compounds across all samples analysed, mass accuracies of 2 ppm or better can be obtained (Moco et al., 2006a).
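The two calculations mentioned here, the ppm deviation and the selection of scans by their intensity relative to the lock mass, can be sketched in Python as follows. This is not the MetAccure script: the function names, the intensity ratio window of 0.5 to 2.0, the example scans and the calculated m/z of 609.1461 assumed for the rutin [M-H]- ion are illustrative assumptions.

```python
from statistics import mean


def ppm_deviation(observed_mz, calculated_mz):
    """Mass deviation in parts per million."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


def mean_mass_near_lock_intensity(scans, lock_intensity, low=0.5, high=2.0):
    """Average the observed m/z over scans whose intensity lies within
    [low, high] x the lock mass intensity; return None if no scan qualifies."""
    selected = [mz for mz, intensity in scans
                if low * lock_intensity <= intensity <= high * lock_intensity]
    return mean(selected) if selected else None


# Hypothetical scans across one chromatographic peak: (observed m/z, counts)
scans = [(609.1502, 90), (609.1470, 400), (609.1458, 700), (609.1431, 3000)]
best = mean_mass_near_lock_intensity(scans, lock_intensity=500)
ppm = ppm_deviation(best, 609.1461)
print(round(ppm, 2))
```

In this toy example the weakest and the strongest scans are discarded, and the mass averaged over the remaining scans deviates by well under 5 ppm from the assumed calculated value.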

[Figure 2.6: scatter plot of MS signal, Ln(mass peak intensity), against PDA signal, Ln(area at 360 nm).]

Figure 2.6. Correlation between conventional LC-PDA analysis and untargeted LC-MS based metabolomics with regard to detection of the flavonoid rutin (for identification see Fig. 2.2F). Ripe fruits of 114 different tomato cultivars were analysed by LC-PDA-QTOF MS in ESI negative mode, as described in this protocol. LC-PDA signals (peak areas at 360 nm) were subsequently extracted in a targeted manner using the QuanLynx tool of MassLynx, while LC-MS parent ion signals were retrieved in an untargeted manner using metAlign. Ln-transformed data show high linear correlation (y = 1.0937x with r² = 0.972; p < 2.5 × 10⁻⁷), indicating that the untargeted approach is equivalent to the targeted (conventional) LC-PDA approach.

Reversed phase LC with PDA detection has been used for decades for quantitative analysis of many secondary metabolites in plants. As the analytical system described in this protocol consists of reversed phase LC coupled to both PDA and MS, the quality of the untargeted LC-MS data can be checked by comparing with LC-PDA data of the same samples (Fig. 2.6). After log-transformation of both data sets, a significant and linear correlation should be achieved between the mass peak signal obtained by untargeted metabolomics and the peak area obtained by conventional LC-PDA analysis. A low correlation may indicate significant ion suppression, MS detector saturation or marked misalignments. However, correlations can only be established for compounds that show clearly separated PDA peaks in the chromatograms.
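The comparison of ln-transformed MS and PDA signals can be sketched in plain Python as below. The paired values are invented for illustration, the function names are hypothetical, and the slope is fitted through the origin only because the relation in Fig. 2.6 is reported in the form y = a·x; the software actually used for that fit is not stated here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def slope_through_origin(xs, ys):
    """Least-squares slope of y = a * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical paired signals for one compound in five samples
pda_area = [400, 900, 1500, 3000, 6000]       # peak area at 360 nm
ms_height = [700, 1700, 3000, 6500, 13500]    # mass peak intensity
ln_pda = [math.log(v) for v in pda_area]
ln_ms = [math.log(v) for v in ms_height]
r = pearson_r(ln_pda, ln_ms)
slope = slope_through_origin(ln_pda, ln_ms)
print(round(r ** 2, 3), round(slope, 3))
```

A markedly lower r² for a compound with a well-separated PDA peak would point to the problems listed above, such as ion suppression or detector saturation.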

Figure 2.7. Hierarchical clustering (Pearson correlation) of 180 A. thaliana genotypes consisting of a recombinant inbred line (RIL) population and their parents, based on untargeted metabolomics data. Samples were analysed by LC-QTOF MS (30 min run) and 5783 mass peaks, extracted and aligned by metAlign, were loaded into GeneMaths software for multivariate analyses. Mass signal intensities (y-axis) were ln-transformed and standardised per row average (each row representing a single mass peak), with colour scale given in the lower panel (green means relatively low, red means relatively high intensity). Replicate samples are indicated with the same colour on the sample key (x-axis): yellow- and blue-coloured samples are replicate analyses of two different samples each composed of a mixture of RILs, to check for LC-MS reproducibility and alignment; green- and red-coloured samples represent 5 biological replicates of the Ler and Cvi parents, respectively.

The aligned data sets can also be imported into software packages for large scale multivariate or statistical analyses, such as GeneMaths (Vorst et al., 2005) and MetaNetwork (Fu et al., 2007). We recommend loading mass peak data as ln-transformed values. We routinely use GeneMaths software to check the quality of the mass signal output from large scale experiments, by applying principal component analysis and hierarchical clustering. In these multivariate approaches, replicate samples should cluster relatively close, as compared to e.g. different genotypes (Fig. 2.7), plant treatments or tissues, and the segregation of the scores should be according to the expected data structure (Vorst et al., 2005) (if applicable).

Chapter 3

A Liquid Chromatography Mass Spectrometry based Metabolome Database for Tomato

Sofia Moco, Raoul J. Bino, Oscar Vorst, Harrie A. Verhoeven, Joost de Groot, Teris A. van Beek, Jacques Vervoort and Ric C.H. De Vos
Plant Physiology 141: 1205-1218 (2006)

For the description of the metabolome of an organism, the development of common metabolite databases is of utmost importance. Here we present the MoTo DB (Metabolome Tomato Database), a metabolite database dedicated to liquid chromatography-mass spectrometry (LC-MS)-based metabolomics of tomato fruit (Solanum lycopersicum). A reproducible analytical approach consisting of reversed phase LC coupled to quadrupole time-of-flight MS and photodiode array detection (PDA) was developed for large scale detection and identification of mainly semi-polar metabolites in plants and for the incorporation of the tomato fruit metabolite data into the MoTo DB. Chromatograms were processed using software tools for mass signal extraction and alignment, and intensity dependent accurate mass calculation. The detected masses were assigned by matching their accurate mass signals with tomato compounds reported in literature and complemented, as much as possible, by PDA and MS/MS information, as well as by using reference compounds. Several novel compounds not previously reported for tomato fruit were identified in this manner and added to the database. The MoTo DB is available at http://appliedbioinformatics.wur.nl and contains all information so far assembled using this LC-PDA-QTOF MS platform, including retention times, calculated accurate masses, PDA spectra, MS/MS fragments and literature references. Unbiased metabolic profiling and comparison of peel and flesh tissues from tomato fruits validated the applicability of the MoTo DB, revealing that all flavonoids and α-tomatine were specifically present in the peel, while several other alkaloids and some particular phenylpropanoids were mainly present in the flesh tissue.

INTRODUCTION

For understanding the dynamic behaviour of a complex biological system, it is essential to follow, as unbiased as possible, its response to a conditional perturbation at the transcriptome, proteome and metabolome levels. To study the dynamics of the metabolome, to analyse fluxes in metabolic pathways and to decipher the biological roles of metabolites, the identification of the participating metabolites should be as unambiguous as possible. Metabolomics is defined as the analysis of all metabolites in an organism and concerns the simultaneous (‘multiparallel’) measurement of all metabolites in a given biological system (Dixon and Strack, 2003). However, this is a technically challenging task, as no single analytical method is capable of extracting and detecting all metabolites at once due to the enormous chemical variety of metabolites and the large range of concentrations at which metabolites can be present. Therefore the characterization of a complete metabolome requires different complementary analytical technologies. Currently, mass spectrometry (MS) is the most sensitive method enabling the detection of hundreds of compounds within single extracts.

Ideally, metabolome data should be incorporated into open access databases where information can be viewed, sorted and matched. Different pathway resources are available that combine information from the “omics” technologies such as Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg), MetaCyc (http://metacyc.org) or The Arabidopsis Information Resource (http://www.arabidopsis.org). Hitherto, research on plant metabolic profiling using chromatographic techniques coupled to MS technologies for database purposes has been accomplished by gas chromatography-mass spectrometry (GC-MS) analysis of extracts (Schauer et al., 2005a; Tikunov et al., 2005). GC-MS entails high reproducibility in both chromatography and mass fragmentation patterns. This reproducibility enabled the development of common metabolite databases, e.g. GMD@CSB.DB (http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/gmd.html) and the Fiehn-Library (http://fiehnlab.ucdavis.edu/compounds), that gather information mainly on primary metabolites.

Liquid chromatography (LC)-MS is the preferred technique for the separation and detection of the large and often unique group of semipolar secondary metabolites in plants. Specifically, high resolution accurate mass MS enables the detection of large numbers of parent ions present in a single extract and can provide valuable information on the chemical composition and thus the putative identity of large numbers of metabolites. Recently, accurate mass LC-MS was performed to detect secondary metabolites present in roots and leaves of Arabidopsis (Arabidopsis thaliana) (von Roepenack-Lahaye et al., 2004), to study metabolic alterations in a light hypersensitive mutant of tomato (Solanum lycopersicum) (Bino et al., 2005) and to compare tubers of potato (Solanum tuberosum) of different genetic origin and developmental stages (Vorst et al., 2005). The variety of LC-MS systems, and the generally poorer retention time reproducibility of LC compared to GC, limits the establishment of a single optimised analytical procedure and hampers the comparison of LC-MS chromatograms between laboratories. Moreover, software tools able to transform automatically mass spectrometry data into a list of (putative) plant metabolites, in particular for LC-MS, are not yet available. This implies that analyses of mass signal data sets are left to manual searches in the available chemical databases such as SciFinder, PubChem or Dictionary of Natural Products (DNP). To extend the applicability of LC-MS in plant metabolomics, efforts should be made in (i) the establishment of a routine and reproducible LC-MS method, (ii) the annotation of the large numbers of mass signals detected, (iii) the unambiguous identification of compounds, and (iv) the development of a common reference database and searching tools for secondary metabolites in plants.

In this article we present an open access metabolite database for LC-MS, called MoTo DB, dedicated to tomato fruit. This database is based on literature information combined with experimental data derived from LC-MS based metabolomics experiments. A reproducible and robust C18-based reversed phase LC-PDA-electrospray ionisation (ESI)-QTOF-MS method was developed for the detection and putative identification of predominantly secondary metabolites of semi-polar nature. The assignment of mass signals detected relies on the combination of four parameters: i) accurate mass, ii) retention time, iii) UV/Vis spectral information, and iv) MS/MS fragmentation data. To demonstrate the applicability of the established LC-MS metabolomics platform including database searching, peel and flesh tissues from ripe tomato fruit were compared for differences in metabolic composition. Statistically significant differences in LC-QTOF MS profiles between the tissues were identified in an unbiased manner, and differential mass peaks were annotated by searching in the MoTo DB. Several compounds not previously reported in tomato were also identified and have been incorporated into the database. All available information in the MoTo DB can be searched at http://appliedbioinformatics.wur.nl.

MATERIALS AND METHODS

Plant material: A large pool of tomato (Lycopersicon esculentum, now Solanum lycopersicum) fruit material was prepared by combining fruits from turning, pink and red ripe stages of development of 96 different tomato cultivars representing the three major types of tomato fruits (i.e. cherry, Dutch beef and normal round tomatoes). These plants were grown in an environmentally controlled greenhouse located in Wageningen, The Netherlands, during the summer and autumn of 2003. Plants were grown in rock wool plugs connected to an automatic irrigation system comparable to standard commercial cultivation conditions. For analysis of anthocyanins, purple-coloured fruits from offspring of a crossing of two natural mutants, Af x hp-2 j (van Tuinen et al., 2005), were harvested at the ripe stage of development. Peel (about 2 mm thickness) was removed from fruits, ground into a fine powder in liquid nitrogen and stored at –80 °C until further analysis. For metabolite profile comparison of peel and flesh, red ripe fruits of cultivar Money Maker were used, of which peel (2 mm thickness) and flesh (rest of fruit) were separated and used as described.

Extraction: Of the frozen tomato powder, 0.5 g FW was weighed and extracted with 1.5 mL pure methanol (final methanol concentration in the extract ~ 75%). Hydrolysed extracts were prepared by sequentially adding 1 mL of 0.1% TBHQ in methanol solution and 0.4 mL of HCl 6 M to 0.6 g FW tomato material, shaking in a water bath at 90-95°C for 1 h, and adding 2 mL of methanol (Bovy et al., 2002). All samples were sonicated for 15 min, filtered through a 0.2 μm inorganic membrane filter (Anotop 10 Whatman, Maidstone, England) and analysed.
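The stated final methanol concentration of about 75% can be checked with simple arithmetic, as in the Python sketch below. The assumptions are not from the source: a tissue water content of 95%, 1 g of tissue water taken as 1 mL, and no volume contraction on mixing. The function name is hypothetical.

```python
def methanol_fraction(fresh_weight_g, methanol_ml, water_content=0.95):
    """Approximate final methanol fraction (v/v) after adding pure methanol
    to fresh tissue, treating 1 g of tissue water as 1 mL and ignoring
    volume contraction on mixing."""
    tissue_water_ml = fresh_weight_g * water_content
    return methanol_ml / (methanol_ml + tissue_water_ml)


# 0.5 g fresh weight extracted with 1.5 mL pure methanol
fraction = methanol_fraction(0.5, 1.5)
print(round(100 * fraction, 1))
```

Under these assumptions the result is close to three quarters methanol, in line with the approximate value given in the text.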

Chemicals: Standard compounds p-coumaric acid, protocatechuic acid, salicylic acid, caffeic acid, ferulic acid, cinnamic acid, myricetin and naringenin were purchased from ICN (Ohio, USA); p-hydroxybenzoic acid, chlorogenic acid, quercetin, phenylalanine, sinapic acid and α-tomatine from Sigma (St. Louis, USA); vanillic acid and rutin (quercetin-3-O-rutinoside) from Acros (New Jersey, USA); naringenin chalcone from Apin Chemicals (Abingdon, UK); kaempferol and kaempferol-3-O-rutinoside from Extrasynthese (Genay, France) and tert-butylhydroquinone (TBHQ) from Aldrich (Steinheim, Germany). Acetonitrile HPLC supra gradient and methanol absolute HPLC supra gradient were obtained from Biosolve (Valkenswaard, The Netherlands). Formic acid for synthesis 98-100% was from Merck-Schuchardt (Hohenbrunn, Germany), hydrochloric acid (HCl) 37% p.a. from Acros (New Jersey, USA) and ultra pure water was obtained from an Elga Maxima purification unit (Bucks, UK). Leucine enkephalin was purchased from Sigma (St. Louis, USA).

Chromatographic conditions: HPLC was carried out using a Waters Alliance 2795 HT system with a column oven. For chromatographic separation, a Luna C18(2) pre-column (2.0 × 4 mm) and analytical column (2.0 × 150 mm, 100 Å, particle size 3 μm) from Phenomenex (Torrance, CA, USA) were used. Five μL of sample was injected into the system for LC-PDA-MS analysis. Degassed solutions of formic acid:ultra pure water (1:10³, v/v) (eluent A) and formic acid:acetonitrile (1:10³, v/v) (eluent B) were pumped at 0.19 mL min⁻¹ into the HPLC system. The gradient applied started at 5% B and increased linearly to 35% B in 45 min. Then, for 15 minutes the column was washed and equilibrated before the next injection. The column temperature was kept at 40 °C and the samples at 20 °C. The room temperature was maintained at 20 °C.
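The linear part of the gradient can be written as a small function, shown here as a Python sketch. The function names are hypothetical; the composition during the 15 min wash and equilibration is not detailed in the text, so the sketch returns None outside the 45 min gradient, and it ignores any delay between the pump and the column.

```python
def percent_b(time_min, start_b=5.0, end_b=35.0, gradient_min=45.0):
    """Percentage of eluent B during the linear gradient.

    Returns None outside the 0-45 min window, because the wash and
    equilibration programme is not specified in detail.
    """
    if not 0 <= time_min <= gradient_min:
        return None
    return start_b + (end_b - start_b) * time_min / gradient_min


def eluent_volume_ml(run_min=60.0, flow_ml_per_min=0.19):
    """Total eluent volume pumped during one run (gradient plus wash)."""
    return run_min * flow_ml_per_min


print(percent_b(22.5))
print(round(eluent_volume_ml(), 1))
```

Such a function is convenient for estimating the eluent composition at which a compound elutes from its retention time, or the eluent consumption of a long sample series.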

Detection of metabolites by PDA and MS: The HPLC system was connected online to a Waters 2996 PDA detector, set to acquire data every second from 240 to 600 nm with a resolution of 4.8 nm, and subsequently to a QTOF Ultima V4.00.00 mass spectrometer (Waters Corporation, MS Technologies, Manchester, UK). An ESI source working in either positive or negative ion mode was used for all MS analyses. Before each series of analyses, the mass spectrometer was calibrated using a phosphoric acid:acetonitrile:water (1:10³:10³, v/v) solution. Capillary voltage, collision energy and desolvation temperature were optimised to obtain a series of phosphoric acid clusters suitable for calibration between m/z 80 and 1500. During sample analyses, the capillary voltage was set to 2.75 kV and the cone voltage to 35 V. Source and desolvation temperatures were set to 120 °C and 250 °C, respectively. Cone gas and desolvation gas flows were 50 and 500 L h⁻¹, respectively. The collision energy was 5 eV in the positive ion mode and 10 eV in the negative ion mode. The resolution was set at 10,000, and during calibration the MS parameters were adjusted to achieve this resolution.

TOF MS data were acquired in centroid mode. During LC-MS analyses scan durations of 0.9 s and an interscan time of 0.1 s were used. For LC-MS/MS measurements 10 μL of sample was injected into the system and MS/MS measurements were made with 0.40 s of scan duration and 0.10 s of interscan delay with increasing collision energies according to the following program: 5 (ESI-positive) or 10 (ESI-negative), 15, 30 and 50 eV.

The mass spectrometer was equipped with a lock spray source, allowing online mass correction to obtain high mass accuracy of analytes. Leucine enkephalin, [M+H]+ = 556.2766 and [M-H]- = 554.2620, was used as a lock mass. It was continuously sprayed into a second ESI source using an LKB Bromma 2150 HPLC pump and sampled every 10 s, producing an average intensity of 500 counts scan⁻¹ in centroid mode (~100 counts scan⁻¹ in continuum mode).

Data analysis and alignment: Acquisition of LC-PDA-MS data was performed under MassLynx 4.0 (Waters). MassLynx was used for visualisation and manual processing of LC-PDA-MS/MS data. Mass data were automatically processed by metAlign version 1.0 (www.metalign.nl). MetAlign transforms accurate masses into nominal masses to shorten the calculation time and minimise the number of mass bins. Baseline and noise calculations were performed from scan number 225 to 2,475, corresponding to retention times 4.0 min to 49.3 min. The maximum amplitude was set to 15,000 and peaks below three times the local noise were discarded. The .csv output file, containing nominal mass peak intensity data (peak heights, i.e. ion counts scan⁻¹ at the centre of the peak) at aligned retention times (scans) over all samples processed, was used for further data processing. A script called metAccure was used to calculate accurate masses from the metAlign-extracted peaks. For each mass signal, metAccure calculates the accurate mass using only those scans in which the signal intensity lies within a user-defined window relative to the lock mass intensity. It combines the .csv files containing the retention time alignments from the metAlign analysis with the original data in NetCDF format, created from MassLynx .raw files by Dbridge (Vorst et al., unpublished data). Extracts from peel and flesh tissues were compared for significant differences in the intensity of each aligned mass signal using the Student's t-test tool within metAlign (level of significance set at 0.05). The settings for baseline correction and signal alignment were analogous to those described above.
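The two preprocessing rules stated above (conversion to nominal mass and discarding peaks below three times the local noise) can be sketched as follows. This is not metAlign code: the function names and the peak tuple layout are hypothetical, and simple rounding to the nearest integer is assumed for the nominal mass conversion.

```python
def to_nominal(mz: float) -> int:
    """Assumed conversion: round the accurate m/z to the nearest integer bin."""
    return int(round(mz))

def keep_peaks(peaks, noise_factor=3.0):
    """Keep peaks whose height is at least noise_factor times the local noise.

    peaks: iterable of (accurate_mz, peak_height, local_noise) tuples.
    Returns a list of (nominal_mz, peak_height) tuples.
    """
    return [
        (to_nominal(mz), height)
        for mz, height, noise in peaks
        if height >= noise_factor * noise
    ]
```

With a local noise of 100 counts, a peak of 900 counts is retained and a peak of 250 counts is discarded.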

Annotation of metabolites: Data sets obtained after metAlign and metAccure treatment were analysed as [retention time × accurate mass × peak intensity] matrices for metabolite identification. [M+H]+ and [M-H]- values were calculated for the metabolites listed in Table 3.1 and used for sorting within the matrices. Data collected during the first 4.0 min of chromatography were discarded. Novel metabolites were identified by calculating the elemental composition from accurate mass measurements using the MassLynx software. The tolerance was set at 5 ppm, taking into account the correct analyte-lock mass signal ratio. For an observed accurate mass, a list of possible molecular formulae was obtained, selected for the presence of C, H, O and N. In addition, raw data sets were checked manually in MassLynx for retention time, UV/Vis spectra and QTOF-MS/MS fragmentation patterns of chromatographically separated peaks, complementing the accurate mass-based elemental formulae. The combination of accurate mass data, retention time (as an indication of polarity), UV/Vis spectra and MS/MS data allowed a putative identification of metabolites. Best matches were searched in the Dictionary of Natural Products (DNP) and SciFinder databases for possible structures. The putative identifications were confirmed by published data and with standard compounds, if commercially available.
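The calculation of [M+H]+ and [M-H]- values and of the mass deviation in ppm can be sketched as below. The function names are hypothetical, and the proton mass constant (1.007276 Da) is an assumed standard value that is not quoted in the text.

```python
PROTON_MASS = 1.007276  # Da, assumed standard value

def ion_mz(neutral_mass: float, mode: str) -> float:
    """m/z of the singly charged [M+H]+ ("pos") or [M-H]- ("neg") ion."""
    if mode == "pos":
        return neutral_mass + PROTON_MASS
    if mode == "neg":
        return neutral_mass - PROTON_MASS
    raise ValueError("mode must be 'pos' or 'neg'")

def ppm_error(observed: float, theoretical: float) -> float:
    """Signed deviation of the observed from the theoretical mass, in ppm."""
    return (observed - theoretical) / theoretical * 1e6

def within_tolerance(observed: float, theoretical: float, tol_ppm: float = 5.0) -> bool:
    """True if the absolute mass deviation does not exceed tol_ppm."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm
```

For rutin (monoisotopic mass 610.1534, Table 3.1), `ion_mz(610.1534, "neg")` gives the [M-H]- value of about 609.1461, and an observed mass of 609.1451 falls within the 5 ppm tolerance.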

MoTo DB build-up: Starting from the available literature on compounds identified in tomato, information acquired from LC-PDA-MS analysis of tomato fruit was used to validate each metabolite: i] retention time; ii] accurate mass in the form of monoisotopic mass (neutral) and in the ion forms [M+H]+ and [M-H]-; iii] elemental composition; iv] MS/MS fragments; v] maximum absorbance peaks in UV/Vis. Given a found mass and a ∆ppm (or ∆mDa) set by the user, the database returns possible matches. Formic acid, if detected, was also included in the database. The database is implemented in MySQL and runs on a Linux cluster.
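A mass search with a user-defined ∆ppm can be sketched as follows. This is not the MoTo DB implementation: an in-memory SQLite database stands in for MySQL so that the example is self-contained, and the table name, column names and function names are hypothetical. The four entries and their monoisotopic masses are taken from Table 3.1; the proton mass is an assumed standard value.

```python
import sqlite3

PROTON_MASS = 1.007276  # Da, assumed standard value

def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in database with a few Table 3.1 entries."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE metabolite (name TEXT, formula TEXT, mono_mass REAL)")
    con.executemany(
        "INSERT INTO metabolite VALUES (?, ?, ?)",
        [
            ("Rutin", "C27H30O16", 610.1534),
            ("Kaempferol-3-7-di-O-glucoside", "C27H30O16", 610.1534),
            ("Naringenin", "C15H12O5", 272.0685),
            ("Chlorogenic acid", "C16H18O9", 354.0951),
        ],
    )
    return con

def search(con, observed_mz: float, mode: str, tol_ppm: float):
    """Return (name, formula, deviation in ppm) for entries within tol_ppm."""
    if mode == "neg":      # [M-H]-: add the proton back
        neutral = observed_mz + PROTON_MASS
    elif mode == "pos":    # [M+H]+: remove the proton
        neutral = observed_mz - PROTON_MASS
    else:
        raise ValueError("mode must be 'pos' or 'neg'")
    delta = neutral * tol_ppm / 1e6
    rows = con.execute(
        "SELECT name, formula, mono_mass FROM metabolite "
        "WHERE mono_mass BETWEEN ? AND ? ORDER BY name",
        (neutral - delta, neutral + delta),
    ).fetchall()
    return [(name, formula, (neutral - mass) / mass * 1e6) for name, formula, mass in rows]
```

Searching m/z 609.1451 in negative mode at 5 ppm returns both rutin and kaempferol-3-7-di-O-glucoside, which share one elemental composition. This illustrates why retention time, UV/Vis and MS/MS data are stored alongside the mass.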

RESULTS

Metabolites present in tomato fruit according to literature

First, a database was constructed based on literature research to include metabolites reported to be present in tomato fruit from wild and cultivated varieties as well as from transgenic tomato plants. Though some tomato varieties are known to contain anthocyanins in their fruit (Jones et al., 2003), to our knowledge there are so far no reports on the identification of this class of compounds in fruit tissue. Therefore, our literature search included reports on anthocyanin identification in tomato seedlings. Names (common and International Union of Pure and Applied Chemistry (IUPAC)), Chemical Abstracts Service (CAS) registry number, molecular formula, monoisotopic accurate mass, published references and other properties of each metabolite are systematized in this database. The database includes polar, semi-polar and apolar compounds. Because the procedure we used for extraction, separation and detection (see below) is biased towards compounds of semi-polar nature, we expected mostly secondary metabolites such as (poly)phenols, alkaloids and their derivatives to be detected. Table 3.1 summarizes all (poly)phenolic compounds (48) and alkaloids (15) so far reported to be present in tomato fruit extracts, including compounds that have been identified only in fruits of transgenic tomato plants. Many compounds were assigned before mass spectrometry technologies became available. The number of compounds identified by NMR is very limited.

Table 3.1. List of secondary metabolites identified in tomato fruit extracts according to literature. a – identified after hydrolysis, b – identified in transgenic tomato plants, c – identified using NMR data, d – identified in seedlings, Mol Form – molecular formula, MM – monoisotopic molecular mass.

Compound Mol Form MM Reference

p-Hydroxybenzoic acid C7H6O3 138.0317 (Mattila and Kumpulainen, 2002)

Salicylic acid C7H6O3 138.0317 (Schmidtlein and Herrmann, 1975), (Petró-Turza, 1987)

Cinnamic acid C9H8O2 148.0524 (Petró-Turza, 1987)

Protocatechuic acid C7H6O4 154.0266 (Mattila and Kumpulainen, 2002)a

m-Coumaric acid C9H8O3 164.0474 (Hunt and Baker, 1980)a

p-Coumaric acid C9H8O3 164.0473 (Schmidtlein and Herrmann, 1975)a, (Hunt and Baker, 1980)a, (Petró-Turza, 1987), (Martinez-Valverde et al., 2002), (Mattila and Kumpulainen, 2002), (Raffo et al., 2002), (Le Gall et al., 2003a)bc

Vanillic acid C8H8O4 168.0423 (Schmidtlein and Herrmann, 1975), (Mattila and Kumpulainen, 2002)

Caffeic acid C9H8O4 180.0423 (Schmidtlein and Herrmann, 1975)a, (Hunt and Baker, 1980)a, (Martinez-Valverde et al., 2002), (Mattila and Kumpulainen, 2002), (Raffo et al., 2002), (Sakakibara et al., 2003), (Minoggio et al., 2003), (Le Gall et al., 2003a)bc

Ferulic acid C10H10O4 194.0579 (Schmidtlein and Herrmann, 1975)a, (Hunt and Baker, 1980)a, (Martinez-Valverde et al., 2002), (Mattila and Kumpulainen, 2002), (Raffo et al., 2002), (Minoggio et al., 2003)

Sinapic acid C11H12O5 224.0685 (Schmidtlein and Herrmann, 1975)a

Naringenin C15H12O5 272.0685 (Hunt and Baker, 1980)a, (Justesen et al., 1998)a, (Martinez-Valverde et al., 2002)a, (Raffo et al., 2002), (Minoggio et al., 2003)

Naringenin chalcone C15H12O5 272.0685 (Hunt and Baker, 1980)a, (Krause and Galensa, 1992), (Muir et al., 2001), (Le Gall et al., 2003b)b, (Minoggio et al., 2003)

Kaempferol C15H10O6 286.0477 (Stewart et al., 2000), (Martinez-Valverde et al., 2002)a, (Tokusoglu et al., 2003)a

Quercetin C15H10O7 302.0427 (Hertog et al., 1992), (Crozier et al., 1997)a, (Justesen et al., 1998)a, (Stewart et al., 2000), (Martinez-Valverde et al., 2002)a, (Raffo et al., 2002), (Sakakibara et al., 2003), (Tokusoglu et al., 2003)a

Myricetin C15H10O8 318.0376 (Raffo et al., 2002), (Sakakibara et al., 2003), (Tokusoglu et al., 2003)a

p-Coumaric acid-O-β-D-glucoside C15H18O8 326.1002 (Fleuriet and Macheix, 1977), (Reschke and Herrmann, 1982)a, (Winter and Herrmann, 1986)c, (Buta and Spaulding, 1997)

p-Coumaroylquinic acid C16H18O8 338.1002 (Fleuriet and Macheix, 1977)

Caffeic acid-4-O-β-D-glucoside C15H18O9 342.0951 (Fleuriet and Macheix, 1977), (Winter and Herrmann, 1986)

Chlorogenic acid (5-O-caffeoylquinic acid) C16H18O9 354.0951 (Fleuriet and Macheix, 1977), (Fleuriet and Macheix, 1981), (Winter and Herrmann, 1986), (Buta and Spaulding, 1997), (Martinez-Valverde et al., 2002), (Mattila and Kumpulainen, 2002), (Raffo et al., 2002), (Sakakibara et al., 2003), (Minoggio et al., 2003), (Le Gall et al., 2003a; Le Gall et al., 2003b)bc

4-O-Caffeoylquinic acid C16H18O9 354.0951 (Winter and Herrmann, 1986), (Mattila and Kumpulainen, 2002)

5-O-Caffeoylquinic acid C16H18O9 354.0951 (Winter and Herrmann, 1986)

Ferulic acid-O-β-D-glucoside C16H20O9 356.1107 (Fleuriet and Macheix, 1977), (Reschke and Herrmann, 1982), (Winter and Herrmann, 1986)

Feruloylquinic acid C17H20O9 368.1107 (Fleuriet and Macheix, 1977)

Tomatidine C27H45NO2 415.3450 (Juvik et al., 1982)a, (Friedman et al., 1998)a

Tomatidenol C27H43NO2 413.3294 (Juvik et al., 1982)a, (Friedman et al., 1994)a, (Friedman et al., 1997)a, (Friedman, 2002)a

Naringenin-7-O-glucoside C21H22O10 434.1213 (Hunt and Baker, 1980), (Le Gall et al., 2003a)bc, (Le Gall et al., 2003b)bc

Naringenin chalcone-glucoside C21H22O10 434.1213 (Bino et al., 2005)

Astragalin C21H20O11 448.1006 (Le Gall et al., 2003a)bc, (Le Gall et al., 2003b)bc

Dihydrokaempferol-7-O-hexoside and Dihydrokaempferol-?-O-hexoside C21H22O11 450.1162 (Le Gall et al., 2003a)bc, (Le Gall et al., 2003b)bc

Isoquercitrin C21H20O12 464.0955 (Muir et al., 2001)b, (Le Gall et al., 2003a)b, (Le Gall et al., 2003b)b

Myricitrin C21H20O12 464.0955 (Sakakibara et al., 2003)

Naringin C27H32O14 580.1792 (Bovy et al., 2002)abd

Kaempferol-3-O-rutinoside C27H30O15 594.1585 (Bovy et al., 2002)bd, (Le Gall et al., 2003b)bc

Kaempferol-3-7-di-O-glucoside C27H30O16 610.1534 (Le Gall et al., 2003a)bc, (Le Gall et al., 2003b)bc

Rutin C27H30O16 610.1534 (Fleuriet and Macheix, 1977), (Buta and Spaulding, 1997), (Stewart et al., 2000), (Muir et al., 2001), (Raffo et al., 2002), (Le Gall et al., 2003a)bc, (Le Gall et al., 2003b)bc, (Minoggio et al., 2003)

Quercetin-3-O-trisaccharide C32H38O20 742.1956 (Muir et al., 2001), (Minoggio et al., 2003)

p-Coumaric acid-rutin conjugate C36H36O18 756.1902 (Buta and Spaulding, 1997)

Kaempferol-3-O-rutinoside-7-O-glucoside C33H40O20 756.2113 (Le Gall et al., 2003a)bc, (Le Gall et al., 2003b)bc

Delphinidin-3-O-rutinoside-5-O-glucoside C33H41O21+ 773.2135 (Mathews et al., 2003)bd

Petunidin-3-O-rutinoside-5-O-glucoside C34H43O21+ 787.2291 (Mathews et al., 2003)bd

Malvidin-3-O-rutinoside-5-O-glucoside C35H45O21+ 801.2448 (Mathews et al., 2003)bd

Delphinidin-3-O-(p-coumaroyl)rutinoside-5-O-glucoside C42H47O23+ 919.2503 (Mathews et al., 2003)bd

Petunidin-3-O-(p-coumaroyl)rutinoside-5-O-glucoside C43H49O23+ 933.2659 (Bovy et al., 2002)bd, (Mathews et al., 2003)bd

Delphinidin-3-O-(caffeoyl)rutinoside-5-O-glucoside C42H47O24+ 935.2452 (Mathews et al., 2003)bd

Malvidin-3-O-(p-coumaroyl)rutinoside-5-O-glucoside C44H51O23+ 947.2816 (Bovy et al., 2002)bd, (Mathews et al., 2003)bd

Petunidin-3-(caffeoyl)rutinoside-5-O-glucoside C43H49O24+ 949.2608 (Bovy et al., 2002)bd, (Mathews et al., 2003)bd

Malvidin-3-(caffeoyl)rutinoside-5-O-glucoside C44H51O24+ 963.2765 (Mathews et al., 2003)bd

δ-Tomatine C33H55NO7 577.3979 (Friedman et al., 1998)a

γ-Tomatine C39H65NO12 739.4507 (Friedman et al., 1998)a

β-Tomatine C45H75NO17 901.5035 (Friedman et al., 1998)a

Dehydrotomatine C50H81NO21 1031.5301 (Friedman et al., 1994), (Kozukue and Friedman, 2003)

α-Tomatine C50H83NO21 1033.5458 (Juvik et al., 1982), (Willker and Leibfritz, 1992)c, (Friedman et al., 1994), (Yahara et al., 1996), (Friedman et al., 1997), (Friedman et al., 1998), (Friedman, 2002), (Bianco et al., 2002), (Kozukue and Friedman, 2003)

Lycoperoside H C50H83NO22 1049.5407 (Yahara et al., 1996)c, (Yahara et al., 2004)c

Lycoperoside A C52H85NO23 1091.5512 (Yahara et al., 1996)c, (Yahara et al., 2004)c

Lycoperoside B C52H85NO23 1091.5512 (Yahara et al., 1996)c, (Yahara et al., 2004)c

Lycoperoside C C52H85NO23 1091.5512 (Yahara et al., 1996)c, (Yahara et al., 2004)c

Esculeoside B C56H93NO28 1227.5884 (Fujiwara et al., 2004)c, (Yahara et al., 2004)c

Esculeoside A C58H95NO29 1269.5990 (Fujiwara et al., 2003)c, (Fujiwara et al., 2004)c, (Yahara et al., 2004)c, (Yoshizaki et al., 2005)c

Lycoperoside F C58H95NO29 1269.5990 (Yahara et al., 2004)c

Lycoperoside G C58H95NO29 1269.5990 (Yahara et al., 2004)c

Metabolite Extraction and LC-PDA-MS analysis

A representative tomato fruit sample was obtained by combining fruits of 96 different tomato cultivars, producing ripe red or orange-coloured fruits of the beef, round or cherry type at different stages of ripening (Tikunov et al., 2005). In addition, some purple-skinned fruits were selected for the analysis of anthocyanins, a class of tomato fruit compounds that occurs only in specific varieties (Jones et al., 2003) or in transgenic plants (Mathews et al., 2003). Peel was chosen as the starting material, as this tissue contains the highest levels of flavonoids (Muir et al., 2001), which represent an important class of secondary metabolites. The 75% methanol/water extract enabled separation by C18-reversed phase LC and detection by both PDA and MS of semi-polar metabolites. Fig. 3.1 shows an example of a chromatogram obtained upon LC-PDA-QTOF-MS analysis of 75% methanol/water extracts from tomato peel. These extracts were stable for several months at -20 °C, as determined by comparing LC-PDA chromatograms. Only naringenin chalcone was observed to decay slowly into naringenin while standing in the autosampler (20 °C) during a series of analyses (about 1.4 μg g⁻¹ FW h⁻¹).

To test the reproducibility of the LC system, chromatograms of tomato fruit material analysed over a period of 2 years (>100 samples) were manually compared for retention time shifts using some typical tomato compounds (Table 3.2). Within a single series of analyses, the standard deviation was very small (about 2 s) for all compounds tested. Between series of analyses over this time period, the maximum standard deviation was 30 s, with a maximum retention time window of 1.1 min for naringenin chalcone. During this prolonged period, LC columns of different batches were used.

Figure 3.1. Typical chromatograms obtained from reversed phase LC-PDA-ESI+-QTOF-MS analysis of tomato peel extract: A, total ion signal (QTOF MS); B, absorbance signal (PDA). Retention times (in min) are indicated for the most intense peaks (difference between the two detectors is 0.15 min). Inserts in A show accurate mass (I) and MS/MS spectrum (II) and in B absorbance spectrum (III) obtained for the compound rutin eluting at 23.3 min.

Comparison of ionisation modes

Since compounds may preferentially ionise in either positive or negative mode in our LC system, which is based on a gradient of acetonitrile acidified with formic acid, we analysed tomato extracts sequentially in both modes and compared the absolute mass signal intensities, expressed as peak heights, of the monoisotopic parent ions of some identified compounds. Phenolic acids and their carboxylic acid derivatives ionised better in negative ionisation mode, while flavonoids generated higher signal intensities in positive ionisation mode (Fig. 3.2). Nitrogen-containing compounds such as phenylalanine and some alkaloids ionised better in positive mode, and were mainly detected as formic acid adducts in negative mode. These adducts were formed in the ionisation source and were readily recognized in MS/MS mode from the loss of 46 Da (formic acid). A loss of 18 Da, corresponding to a loss of H2O, was also regularly observed in negative ionisation mode.
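Recognising these two neutral losses from a precursor/fragment pair can be sketched as follows. The function name and the 0.01 Da tolerance are hypothetical; the monoisotopic masses of formic acid (CH2O2) and water are assumed standard values computed from the elemental masses.

```python
FORMIC_ACID = 46.005480  # Da, CH2O2 (assumed standard monoisotopic mass)
WATER = 18.010565        # Da, H2O (assumed standard monoisotopic mass)

def classify_neutral_loss(precursor_mz: float, fragment_mz: float, tol_da: float = 0.01):
    """Label the mass difference between precursor and fragment, if recognised."""
    loss = precursor_mz - fragment_mz
    if abs(loss - FORMIC_ACID) <= tol_da:
        return "formic acid"
    if abs(loss - WATER) <= tol_da:
        return "water"
    return None
```

For instance, the difference between the formic acid adduct of α-tomatine (theoretical m/z 1078.5440, Table 3.3) and its [M-H]- ion (1032.5385, derived from the monoisotopic mass in Table 3.1) is labelled as a formic acid loss.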

Table 3.2. Retention time shifts observed during LC-QTOF-MS analysis of tomato fruit. Ret (min) = retention time, in minutes; Av = average; StDev = standard deviation; Wd = retention time window.

Ret (min) | Chlorogenic acid: Av | StDev | Wd | Rutin: Av | StDev | Wd | Naringenin chalcone: Av | StDev | Wd
Within series (n=13) | 14.42 | 0.03 | 0.09 | 23.40 | 0.04 | 0.13 | 41.81 | 0.03 | 0.11
In-between series (n=6) | 14.92 | 0.33 | 0.79 | 23.85 | 0.50 | 0.99 | 42.26 | 0.50 | 1.12

Automatic mass alignment and exact mass calculation

First, the reproducibility of sample preparation and of the subsequent automated extraction and comparison of mass signal intensities (expressed as peak height) by the metAlign software (Bino et al., 2005; Vorst et al., 2005) was assessed on a data set obtained from LC-MS analysis of 8 replicate extractions of tomato peel. The retention time correction used by the software to align all mass signals was, on average, 2.5 s, which is in accordance with the retention time shift observed on manual inspection of the chromatograms (Table 3.2). The overall variation in mass signal intensities between these replicate samples was < 15%.

Automation of the calculation of the accurate mass of detected LC-MS signals was tested using a data set of 44 tomato extracts obtained from both peel and flesh tissues analysed in negative ionisation mode. Upon metAlign-assisted data processing, 4,958 mass signals with signal-to-noise ratios > 3 were extracted. It is known that exact mass measurements on QTOF instruments using lock mass correction provide the highest accuracy at analyte signal intensities that are similar to the lock mass signal (Colombo et al., 2004). To establish the dynamic range in signal intensity for producing high mass accuracy in our TOF MS, the deviation of the manually measured mass (i.e. the mean of the 3 top scans of the extracted mass peak) from the theoretical mass was plotted against the parent mass signal intensity (ion counts at top scan) for some known tomato metabolites (Fig. 3.3). Typically, accurate mass measurements derived from peak intensities lower than the lock mass intensity resulted in a positive deviation from the real mass, while mass measurements from peak intensities higher than the lock mass intensity resulted in a negative deviation. High mass accuracies (i.e. mass deviation less than 5 ppm) were observed within an analyte signal intensity window of 0.25-2.0 times the lock mass intensity. Thus, to automatically calculate correct accurate masses for signals extracted and aligned by metAlign, a script called metAccure (Vorst et al., unpublished data) was programmed to use only those scans with mass signal intensities within this intensity window. In this way, appropriate accurate masses were automatically obtained for 479 (about 10%) of the total mass signals detected in ESI-negative mode, a number that includes isotopes, adducts and fragments. This indicates that for the majority of extracted mass signals, though having a chromatographically relevant signal-to-noise ratio of at least 3, the intensities in the samples analysed were too low to estimate their accurate mass properly, either by automated calculation through metAccure or by manual calculation.
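The intensity-window selection can be sketched as below. This is not the metAccure script, which is unpublished: the function name and the scan tuple layout are hypothetical, and an unweighted mean over the selected scans is assumed.

```python
def windowed_accurate_mass(scans, lock_intensity, low=0.25, high=2.0):
    """Mean m/z over scans whose intensity lies within the lock mass window.

    scans: iterable of (mz, intensity) pairs for one chromatographic peak.
    Returns None when no scan falls inside [low, high] * lock_intensity.
    """
    lower, upper = low * lock_intensity, high * lock_intensity
    selected = [mz for mz, intensity in scans if lower <= intensity <= upper]
    if not selected:
        return None  # signal too weak or too strong for a reliable accurate mass
    return sum(selected) / len(selected)
```

With a lock mass intensity of 500 counts per scan, only scans between 125 and 1,000 counts contribute. Scans outside that range, which tend to deviate positively (weak signals) or negatively (strong signals), are ignored.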

[Figure 3.2: bar chart. x-axis: metabolite (quercetin-trisaccharide, 5-caffeoylquinic acid, 3-caffeoylquinic acid, rutin, naringenin, 4-caffeoylquinic acid, α-tomatine, phenylalanine, naringenin chalcone); y-axis: log [mass signal intensity (pos) / mass signal intensity (neg)], scale -1.0 to 3.0.]

Figure 3.2. Peak intensity ratios, in logarithmic scale, of mass signals (peak height) obtained in positive and negative ionisation modes for some metabolites found in tomato peel extracts.

Table 3.3. Metabolites previously reported in literature that were identified by LC-PDA-ESI-QTOF-MS/MS (negative ionisation mode) in tomato peel extracts. Ret (min) = retention time, in minutes; Av = average; StDev = standard deviation; Av m/z = average found mass signal; UV/Vis = absorbance maxima in the UV/Vis range; Mol Form = molecular formula of the metabolite; Theo. Mass = theoretical monoisotopic mass calculated for the ion [M-H]-; Mean ∆ (ppm) = deviation between the averages of found accurate mass and real accurate mass, in ppm; Putative ID = putative identification of metabolite; ()FA = formic acid adduct; - = data not found; (S) = identification confirmed by the standard compound; I, II, III, IV, V, VI = different isomers (only one reported in literature).

Ret (min) Av | Ret (min) StDev | Av m/z | UV/Vis | MS/MS fragments | Mol Form | Theo. Mass | Mean ∆ (ppm) | Putative ID
9.45 | 0.09 | 341.0883 | - | 179, 135 | C15H18O9 | 341.0878 | 1.52 | Caffeic acid-hexose I
9.75 | 0.08 | 325.0930 | 294sh, 313 | 163 | C15H18O8 | 325.0929 | 0.25 | Coumaric acid-hexose I
10.32 | 0.08 | 341.0883 | 310 | 179, 161, 135 | C15H18O9 | 341.0878 | 1.58 | Caffeic acid-hexose II
11.35 | 0.08 | 341.0883 | 302sh, 318 | 281, 251, 233, 221, 179, 161, 135 | C15H18O9 | 341.0878 | 1.53 | Caffeic acid-hexose III
12.08 | 0.06 | 355.1036 | 290sh, 313 | 193, 177, 145 | C16H20O9 | 355.1035 | 0.31 | Ferulic acid-hexose I
12.58 | 0.07 | 341.0883 | - | 181, 179, 137, 135 | C15H18O9 | 341.0878 | 1.49 | Caffeic acid-hexose IV
13.32 | 0.05 | 341.0883 | - | 281, 221, 181, 179, 161, 137, 135 | C15H18O9 | 341.0878 | 1.39 | Caffeic acid-hexose V
13.43 | 0.07 | 353.0878 | 300sh, 327 | 191, 173, 127 | C16H18O9 | 353.0878 | 0.01 | 3-Caffeoylquinic acid
13.71 | 0.07 | 325.0929 | 285 | 163, 119 | C15H18O8 | 325.0929 | 0.05 | Coumaric acid-hexose II
14.41 | 0.10 | 353.0878 | 295sh, 327 | 179, 173 | C16H18O9 | 353.0878 | -0.08 | 5-Caffeoylquinic acid (S)
15.90 | 0.05 | 355.1036 | - | 193, 175, 160 | C16H20O9 | 355.1035 | 0.42 | Ferulic acid-hexose II
15.98 | 0.06 | 341.0886 | - | 179 | C15H18O9 | 341.0878 | 2.26 | Caffeic acid-hexose VI
16.76 | 0.07 | 353.0880 | 323 | 191, 173, 161, 127 | C16H18O9 | 353.0878 | 0.49 | 4-Caffeoylquinic acid
19.53 | 0.25 | 1272.5901 | - | 1227, 1095, 1065, 933, 866, 770 | C57H95NO30 | 1272.5866 | 2.75 | (Esculeoside B)FA
21.42 | 0.04 | 741.1870 | 256, 299sh, 351 | 301, 271, 255 | C32H38O20 | 741.1884 | -1.82 | Quercetin-hexose-deoxyhexose-pentose
22.83 | 0.06 | 1314.6001 | - | 1269, 1137, 1107, 974, 770, 752 | C59H97NO31 | 1314.5972 | 2.21 | (Lycoperoside G)FA or (Lycoperoside F)FA or (Esculeoside A)FA I
23.43 | 0.04 | 609.1451 | 256, 299sh, 355 | 301, 271, 255 | C27H30O16 | 609.1461 | -1.59 | Quercetin-glucose-rhamnose (S)
25.48 | 0.16 | 1314.6005 | - | 1269, 1137, 1107, 975, 908, 866, 812, 770, 752, 275, 179, 161, 149, 143, 125, 113 | C59H97NO31 | 1314.5972 | 2.54 | (Lycoperoside G)FA or (Lycoperoside F)FA or (Esculeoside A)FA II
26.37 | 0.21 | 1314.6021 | - | 1270, 1138, 1108, 976, 909, 813, 753, 179, 161, 143, 125, 113 | C59H97NO31 | 1314.5972 | 3.74 | (Lycoperoside G)FA or (Lycoperoside F)FA or (Esculeoside A)FA III
26.41 | 0.03 | 593.1505 | 368 | 285 | C27H30O15 | 593.1512 | -1.09 | Kaempferol-glucose-rhamnose (S)
26.44 | 0.39 | 1094.5382 | - | 1049 | C51H85NO24 | 1094.5389 | -0.59 | (Lycoperoside H)FA
32.46 | 0.37 | 1078.5463 | - | 1033, 871, 738, 576, 161, 143 | C51H85NO23 | 1078.5440 | 2.14 | (α-Tomatine)FA (S)
32.59 | 0.22 | 1136.5539 | - | 1091, 958, 928, 796, 635, 149, 143, 113 | C53H87NO25 | 1136.5494 | 3.91 | (Lycoperoside C)FA or (Lycoperoside B)FA or (Lycoperoside A)FA
32.65 | 0.02 | 433.1135 | 315sh, 368 | 271, 151 | C21H22O10 | 433.1140 | -1.21 | Naringenin chalcone-hexose I
41.43 | 0.05 | 271.0617 | 288, 303sh | 151, 119, 107 | C15H12O5 | 271.0612 | 1.84 | Naringenin (S)
41.86 | 0.05 | 271.0615 | 365 | 151, 119, 107 | C15H12O5 | 271.0612 | 1.15 | Naringenin chalcone (S)

Identification of tomato metabolites

Compounds reported to be present in tomato fruit were identified using two approaches. First, the 19 available standard compounds (see Materials and Methods section) were injected and compared for retention time, accurate mass and UV/Vis spectra with LC peaks detected in the extracts from the pooled peel material of the 96 tomato cultivars. In this way, chlorogenic acid (i.e. 5-caffeoylquinic acid), rutin, kaempferol-rutinoside, naringenin, naringenin chalcone and α-tomatine were identified. Second, the chromatograms from the 44 LC-MS data sets were checked for the presence of accurate masses, as calculated by metAccure, corresponding to metabolites that were expected to be detected with our system (Table 3.1). The accurate mass hits were subsequently combined with PDA and MS/MS fragmentation data for further identification and confirmation of metabolites. As an example, data of known tomato metabolites observed in extracts of the pooled peel material of the 96 tomato cultivars, derived by LC-PDA-MS and MS/MS analyses in negative mode, are listed in Table 3.3. In an analogous way, the presence of anthocyanins was confirmed by LC-PDA-QTOF-MS/MS analysis (positive mode) in peel extracts from purple-skinned tomato fruits (data not shown). Using this primarily accurate mass-directed targeted approach, about 41% (25 compounds) of the metabolites cited in Table 3.1 were identified in both tomato peel samples. In addition, the aglycones caffeic acid, ferulic acid, p-coumaric acid, quercetin and kaempferol could be detected, but only after acid hydrolysis of the extract. All experimental LC-MS information gathered for these metabolites, including retention time window, accurate mass, PDA spectral information and MS/MS data generated at different collision energies, was added to the MoTo DB.

Figure 3.3. Difference between observed and theoretical monoisotopic masses, calculated as ∆ppm (y-axis), as a function of the parent ion signal intensity, expressed as ion counts scan⁻¹ at the centre of the peak (x-axis, log10-transformed data), for some identified compounds in tomato peel extracts. Threshold levels for mass accuracies between +5 and -5 ppm, and for analyte mass signal intensities between 0.25 and 2.0 times the lock mass signal intensity, are indicated with dotted lines.

Database building

The data from Table 3.1 were used as the foundation of the tomato fruit LC-MS database. From the molecular formula, the accurate mass of each component was calculated using the “Isotopic compositions of the elements 1997” list (Rosman and Taylor, 1998) for accurate mass assignments. The observed mass, together with a mass accuracy setting, is the main search entry for this database (Fig. 3.4). A choice on the entry form enables ionisation-specific correction of mass spectrometer data, so that the proper mass value of the uncharged molecule is submitted to the database. Mass accuracy can be set from 1 to 1,000 ppm, thus enabling the matching of data from detectors generating masses with either low or high accuracy. All other properties of the compounds are stored in a table, which can be accessed from the hit list after mass searching. Each hit suggests either a metabolite previously found in literature and validated by experimental data (Table 3.3) or a novel compound (Table 3.4). Links to the PubChem and MedLine databases are available for extended, external searches on particular or related components. The information for each compound includes molecular formula, molecular mass, CAS number, IUPAC name and analytical properties such as retention time, MS/MS fragments and UV/Vis absorbance maxima, when available. Literature references related to the occurrence in tomato fruit are also listed. Since our aim is to provide a compound database with data from literature and/or experimental MS/MS data, we did not include unknown or novel compounds that have not been validated.
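The calculation of a monoisotopic mass from a molecular formula can be sketched as follows, restricted to the elements C, H, N and O considered in this chapter. The function name is hypothetical, and the isotope masses are assumed standard values for the most abundant isotopes rather than figures quoted in the text.

```python
import re

# Assumed standard masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a neutral formula such as 'C27H30O16'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element not in MONOISOTOPIC:
            raise ValueError(f"element not supported in this sketch: {element}")
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total
```

Applied to rutin (C27H30O16), α-tomatine (C50H83NO21) and naringenin (C15H12O5), this reproduces the masses listed in Table 3.1 (610.1534, 1033.5458 and 272.0685) to four decimal places.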

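[Figure 3.4. Panel A (flow chart) labels: Accurate mass, Ret, PDA, MS/MS, MoTo, Standards, NMR, Putative Identification, Identification. Panel B: screenshot of the MoTo database query frame.]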

Figure 3.4. A. Strategy applied for data analysis and identification of metabolites in tomato fruit, using LC-PDA-QTOF MS. Key entry into the database is the (intensity-corrected) accurate mass. B. Screenshot from the MoTo database query frame. Detected masses can be filled in (in this example m/z 609 in negative ionisation mode) and searched against the database at user-defined mass accuracy (first frame). If at least one mass hit is found in the database, the elemental compositions, deviations from accurate masses and IUPAC names of the corresponding metabolites are indicated, as well as links to PubChem, if applicable, and our own experimental data (second frame). The last frame shows the experimental and literature information available for the selected compound.

Comparison of metabolic profiles of peel and flesh tissues

The applicability of the LC-MS platform and metabolite database to automatically extract and annotate (differentially accumulating) mass signals was tested with red ripe fruits of tomato cultivar Money Maker. Since we are interested in the differential distribution of metabolites and their biochemical pathways

Page 82: Metabolomics Technologies applied to the

82

CHAPTER 3

between tomato fruit tissues, peel and flesh material was separated from whole ripe fruits and analysed by LC-PDA-ESI-QTOF-MS in both positive and negative ion modes. After automatic peak extraction and alignment of samples per ionisation mode using metAlign, 2,944 mass signals (signal-to-noise ratio > 3) were obtained in negative mode and 4,059 in positive mode. Since both tissues had similar water content (flesh: 94%, peel: 93%; n=8; determined by freeze-drying), the intensities of their mass signals were directly comparable. For each aligned mass peak, the extracts from both tissues were compared for significant differences in signal intensity (based on 8 extraction repetitions) using the Student's t-test tool within metAlign. As expected, the mass profiles of these fruit tissues were markedly different. About 38% of all mass signals detected were significantly (≥ 1.5-fold) higher in the peel extracts than in the flesh extracts (1,095 signals for negative mode and 1,566 for positive mode), and about 25% were higher in flesh than in peel (794 for negative mode and 880 for positive mode). Chromatographic mass peaks detected in negative ionisation mode that were significantly different between the extracts from both tissues are visualised in Fig. 3.5. Subsequent metAccure-assisted accurate mass calculation of the differential mass peaks and searching for analogous masses in the MoTo database indicated that flavonoids and derivatives thereof and α-tomatine occurred mainly in the peel extracts. On the other hand, some phenylpropanoids (h, 52-fold; i, 2-fold) as well as glycosylated steroids such as glycosylated spirosolanols (j, 130-fold) were significantly higher in the flesh extracts. An intense mass signal, k, was solely detected in the extracts from flesh tissue and could be identified as the parent ion of a hydroxyfurostanol-tetrahexose (e.g. tomatoside A) from the accurate mass observed ([M-H]- = 1081.5442, C51H85O24-, 1.0 ppm difference from theoretical mass) and its MS/MS fragmentation pattern.
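The per-signal comparison described above (a significance test on 8 extracts per tissue combined with a 1.5-fold cut-off) can be sketched as follows. This is only an illustration under stated assumptions, not the metAlign implementation: the function names are hypothetical, the intensities are invented, and a pooled-variance Student's t statistic is compared with the tabulated two-sided 5% critical value for 14 degrees of freedom instead of computing a p-value.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided 5% critical value of Student's t for 14 degrees of freedom
# (8 + 8 - 2); a fixed value keeps the sketch free of external libraries.
T_CRIT_DF14 = 2.145


def student_t(a, b):
    """Pooled-variance (Student) t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def classify_signal(peel, flesh, min_fold=1.5, t_crit=T_CRIT_DF14):
    """Label one aligned mass signal as 'peel', 'flesh' or 'none'."""
    if abs(student_t(peel, flesh)) < t_crit:
        return "none"  # difference not significant at the 5% level
    fold = mean(peel) / mean(flesh)
    if fold >= min_fold:
        return "peel"
    if fold <= 1 / min_fold:
        return "flesh"
    return "none"  # significant, but below the fold-change cut-off


# Hypothetical intensities (ion counts) for 8 extracts per tissue.
peel = [1000, 1100, 950, 1050, 980, 1020, 1010, 990]
flesh = [400, 420, 390, 410, 405, 395, 415, 385]
print(classify_signal(peel, flesh))
```

The sketch assumes non-zero intensities with some variation in at least one tissue; a signal that is absent from one tissue or constant in both would need separate handling.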

Table 3.4. Novel metabolites identified or putatively assigned by LC-PDA-ESI-QTOF-MS/MS in tomato fruit extracts (abbreviations as in Table 3.3).

| Ret (min), Av | Ret (min), StDev | Av m/z | UV/Vis | MS/MS fragments | Mol Form | Theo. Mass | Mean Δ (ppm) | Putative ID |
|---|---|---|---|---|---|---|---|---|
| 4.74 | 0.05 | 299.0771 | 251 | 137 | C13H16O8 | 299.0772 | -0.48 | Hydroxybenzoic acid-hexose |
| 7.42 | 0.07 | 380.1558 | - | 146 | C15H27NO10 | 380.1562 | -1.11 | Pantothenic acid-hexose |
| 12.99 | 0.05 | 431.1557 | - | 269, 161, 143, 125, 119, 113, 101 | C19H28O11 | 431.1559 | -0.43 | Benzyl alcohol-dihexose |
| 14.76 | 0.05 | 771.1989 | 263sh, 351 | 609, 463, 301 | C33H40O21 | 771.1989 | -0.01 | Quercetin-dihexose-deoxyhexose |
| 15.47 | 0.06 | 595.1665 | - | 475, 385, 355 | C27H32O15 | 595.1668 | -0.51 | Naringenin chalcone-dihexose or Naringenin-dihexose |
| 15.82 | 0.04 | 401.1452 | - | 293, 269, 233, 191, 161, 149, 131, 125, 101 | C18H26O10 | 401.1453 | -0.37 | Benzyl alcohol-hexose-pentose |
| 24.77 | 0.15 | 1312.5872 | - | 1266, 1135, 1105 | C59H95NO31 | 1312.5815 | 4.33 | (Dehydrolycoperoside G)FA or (Dehydrolycoperoside F)FA or (Dehydroesculeoside A)FA |
| 27.05 | 0.12 | 515.1193 | 301sh, 323 | 353, 335, 191, 179, 173 | C25H24O12 | 515.1195 | -0.45 | Dicaffeoylquinic acid I |
| 27.60 | 0.07 | 515.1191 | 301sh, 323 | 353, 191, 179 | C25H24O12 | 515.1195 | -0.72 | Dicaffeoylquinic acid II |
| 29.71 | 0.07 | 515.1188 | 301sh, 327 | 353, 299, 203, 191, 179, 173, 135 | C25H24O12 | 515.1195 | -1.40 | Dicaffeoylquinic acid III |
| 30.11 | 0.04 | 887.2246 | 256, 301sh, 323 | 741, 723, 301, 271, 255, 179 | C41H44O22 | 887.2251 | -0.57 | Quercetin-hexose-deoxyhexose-pentose-p-coumaric acid |
| 32.16 | 0.03 | 433.1137 | 307sh, 360 | 271, 151 | C21H22O10 | 433.1140 | -0.84 | Naringenin chalcone-hexose II |
| 38.40 | 0.08 | 677.1503 | 301sh, 327 | 515 | C34H30O15 | 677.1512 | -1.29 | Tricaffeoylquinic acid I |
| 39.78 | 0.11 | 677.1493 | 292sh, 325 | 515, 353, 335, 179, 173 | C34H30O15 | 677.1512 | -2.82 | Tricaffeoylquinic acid II |

DISCUSSION

Metabolomics is developing as an important functional genomics tool. Technical improvements in the large-scale determination of metabolites in complex plant tissues and dissemination of metabolomics research data are essential (Sumner et al., 2003; Bino et al., 2004). A major challenge is to construct consolidated metabolite libraries, and to develop metabolite-specific data management systems. Here we set out to establish a reproducible LC-PDA-MS based metabolomics platform, including an LC-MS metabolite database and mass-directed searching tools, for a commonly used plant material, i.e. tomato fruit.

An in-depth literature study was performed to obtain as much information as possible on metabolites previously detected in tomato fruits. Because tomato is an important crop, numerous analytical studies aimed at identifying its constituents have been performed. However, a number of problems arise when building such a database from the literature. Firstly, finding the exact identity of a specific natural compound can be troublesome since common names or non-IUPAC nomenclatures are often used. Secondly, the validity of at least some of the compounds assigned in studies performed without MS or NMR technologies may be questioned. Thirdly, it is known that using harsh conditions during sample preparation may produce artefacts, which can result in the correct identification of a compound that does not occur in the original biological sample. For instance, it has long been thought that the flavanone naringenin, instead of naringenin chalcone, was the main tomato flavonoid (Krause and Galensa, 1992). This is probably due to unforeseen cyclisation of the chalcone to the corresponding flavanone during sample preparation and compound isolation. Likewise, some of the metabolites reported in literature have been identified after an enzymatic or chemical hydrolysis step. In the non-hydrolyzed tomato peel extract we exclusively found a range of glycosylated forms of caffeic acid, coumaric acids and the flavonols quercetin and kaempferol, while the corresponding aglycones were only detectable after acid hydrolysis of the same sample.

Figure 3.5. Unbiased LC-QTOF MS based comparative profiling of aqueous-methanol extracts from peel and flesh tissues from ripe tomato fruit (var. Money Maker). Mass chromatograms (m/z 100-1,500) were acquired in ESI negative mode. Retention times (in min) and nominal masses of the most intense signals are indicated in the chromatograms (plotted as BPI, base peak intensities, from 4 to 50 min). A, representative original chromatogram of peel tissue; B, representative original chromatogram of flesh tissue; C, differential chromatogram for metabolites that are significantly (p<0.05; n=8 extracts) at least 1.5-fold higher in extracts from peel compared to flesh tissue (peaks pointing upwards) or higher in extracts from flesh compared to peel tissue (peaks pointing downwards). a, coumaric acid-hexose II; b, quercetin-hexose-deoxyhexose-pentose; c, rutin; d, kaempferol-hexose-deoxyhexose-pentose or quercetin-dideoxyhexose-pentose; e, α-tomatine; f, naringenin; g, naringenin chalcone; h, caffeic acid-hexose II; i, 3-caffeoylquinic acid; j, spirosolanol-trihexose; k, hydroxyfurostanol-tetrahexoside.

The amount of information obtained by a single LC-QTOF MS analysis can be extensive and the use of dedicated software for data processing and comparison is crucial. The extraction of relevant mass signals and the subsequent alignment of chromatograms were performed using metAlign (Vorst et al., 2005). An average variation of 2 s within a series of analyses and of 30 s between analyses over a 2-year period indicates high chromatographic reproducibility. These retention time shifts are sufficiently low to align correctly, and thus compare, samples analysed under the same chromatographic conditions. Variation in metabolite retention is a known and common obstacle in LC and is thus important to take into account when searching LC-MS based databases for comparable masses. Representative retention times and retention indexes of unknown mass peaks relative to tomato key compounds, such as rutin, chlorogenic acid and naringenin, can be of use when comparing data generated by different LC systems or with a different type of C18-reversed phase column.

MetAccure (Vorst et al., unpublished data) is an important tool for automated accurate mass calculation of all aligned mass signals from the metAlign output. Within a specific range of mass signal intensities (depending on the specificities of the TOF MS and lock mass intensity used), the metAccure-assisted accurate mass calculations enabled the assignment of compounds. By calculating the average of all detected accurate masses of a certain aligned mass peak over all samples analysed (taking into account only those scans with the correct range of ion intensities), high mass accuracies were obtained, i.e. frequently within 1 ppm and, in all cases, within 4 ppm deviation from the predicted mass (Table 3.3). Apparently, this high mass accuracy was consistent over the entire mass range analysed (m/z 100-1500): accuracies better than 3 ppm were obtained for metabolites at both low (e.g. 271.0615 for naringenin chalcone) and high m/z values (e.g. 1314.6005 for the FA-adducts of the possible isomers lycoperoside G or F or esculeoside A). With the QTOF instrument used, the metAccure script was able to generate appropriate accurate masses for about 10% of the total mass peaks detected in ESI negative mode.

Evidently, this percentage is highly dependent on the dynamic range of accurate mass measurements of the mass spectrometer used, as well as on the concentrations of each metabolite in the samples analysed. By changing the lock mass to analyte ratio in successive analyses of the same sample it should be possible, in principle, to obtain accurate mass data for a wider range of amplitudes, leading to an expansion of the dynamic range.
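The averaging idea behind the accurate mass calculation (use only scans whose ion intensity lies in a usable window, then express the result as a ppm deviation from the predicted mass) can be illustrated with a minimal sketch. It is not the metAccure script: the function names, the scan data and the window limits are hypothetical, and only the reference mass 515.1195 is taken from Table 3.4.

```python
from statistics import mean


def ppm_deviation(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def mean_accurate_mass(scans, min_intensity, max_intensity):
    """Average m/z over the scans whose intensity lies inside the window.

    scans: iterable of (mz, intensity) pairs for one aligned mass peak.
    Returns None when no scan falls inside the window.
    """
    usable = [mz for mz, intensity in scans
              if min_intensity <= intensity <= max_intensity]
    return mean(usable) if usable else None


# Hypothetical scans of one peak; the window limits are placeholders too.
scans = [(515.1190, 500), (515.1196, 800), (515.1300, 20), (515.1100, 5000)]
average = mean_accurate_mass(scans, min_intensity=100, max_intensity=1000)
print(round(average, 4), round(ppm_deviation(average, 515.1195), 2))
```

In this invented example the weak and the saturated scans carry the largest mass errors, so excluding them is what brings the average close to the predicted mass.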

The identification of compounds, in particular secondary metabolites, through a metabolomic profiling approach encounters some major difficulties. Firstly, the number of commercially available standards of secondary metabolites reported to be present in a specific plant species, or tissue, is low. Secondly, in an automated online sequence of separation, PDA detection, MS measurement and/or MS/MS fragmentation of mass signals, it is difficult to reach optimal conditions for all eluting compounds. Due to overlapping compounds, low intensity mass signals or difficulties in the isolation of the mass signal for MS/MS fragmentation, the extraction of usable information for identification purposes can be complicated. Thirdly, the lack of dedicated software and databases that integrate spectroscopic and mass spectrometry data limits the identification procedure to a manual level. Nevertheless, by these means 43 metabolites could be readily assigned in the tomato fruit extract (Tables 3.3 and 3.4), leaving more to be identified. The total number of compounds detectable by our LC-MS system is difficult to calculate due to the presence of mass signals from isotopes, adducts and unintended in-source fragmentation. Using the strategy demonstrated in this study, the assignment of compounds relies on the integration of different sources of information (accurate mass, retention time, fragmentation pattern, UV/Vis spectra). In addition to experimental data, previous findings and biochemical evidence can complement certain putative assignments.

In the MoTo DB we established searching tools to link an observed mass in LC-MS chromatograms to a putative tomato metabolite, by calculating the exact monoisotopic mass of each metabolite for both positive and negative ionisation modes. Identifications can be validated using the retention time intervals, PDA spectra and MS/MS data available so far. The link with external databases allows searching for similar molecules from other sources.
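The calculation of a monoisotopic ion mass from a molecular formula can be sketched as below. This is a generic illustration, not the MoTo DB code: the function names are hypothetical, only singly charged [M-H]- and [M+H]+ ions are considered, and the element table is limited to C, H, N, O, P and S.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.007825032,
    "N": 14.003074004,
    "O": 15.994914620,
    "P": 30.973761998,
    "S": 31.972071174,
}
PROTON = 1.007276467  # mass of H+ (hydrogen atom minus one electron)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule, e.g. 'C25H24O12'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ion_mz(formula, mode):
    """m/z of the singly charged [M-H]- ('neg') or [M+H]+ ('pos') ion."""
    shift = -PROTON if mode == "neg" else PROTON
    return neutral_mass(formula) + shift


# Dicaffeoylquinic acid (Table 3.4) in both ionisation modes.
print(round(ion_mz("C25H24O12", "neg"), 4), round(ion_mz("C25H24O12", "pos"), 4))
```

For the formulas listed in Table 3.4, the negative-mode values obtained this way agree with the tabulated theoretical masses to four decimals, which suggests that the table entries refer to the [M-H]- ions.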

Some compounds reported in literature appear to occur more than once in our chromatograms, e.g. p-coumaroylhexoside, caffeoylhexoside and naringenin chalcone-hexoside (Table 3.3). Apparently, these metabolites can exist as different constitutional isomers in tomato fruit. The position and/or nature of the sugar substitution can influence the polarity and therefore the retention time of the compound. From the literature it is often unclear which particular isomer is mentioned. Three chromatographic peaks corresponding to caffeoylquinic acids were found. According to previous studies with comparable analytical systems (Clifford et al., 2003), the order of elution is likely 5-caffeoylquinic acid, followed by 3-caffeoylquinic acid and then 4-caffeoylquinic acid (Table 3.3).

Applying the same data analysis strategy, novel derivatives of phenolic acids and flavonoids were putatively assigned, and information on the level of their identification is presented (Table 3.4). Dicaffeoylquinic acid (3 isomers) and tricaffeoylquinic acid (2 isomers) were identified in tomato, and novel glycosides of naringenin, naringenin chalcone and quercetin were detected. The chromatographic separation of several isomers of coumaroyl- and caffeoylhexosides, of which only one has previously been described, also indicates the high resolving power of our LC-MS setup. MS/MS fragmentation can sometimes distinguish between constitutional isomers; however, in most cases other approaches such as NMR will have to be used to unravel the complete and exact structure of novel compounds. These NMR studies are part of our future activities in tomato metabolomics. Ideally, the combination of LC/MS/NMR should be performed for the unambiguous structure elucidation of metabolites (Exarchou et al., 2003; Sumner et al., 2003; Wolfender et al., 2003). Organizing all such analytical data into a single database will facilitate the identification of compounds and will further improve the quality and quantity of compound annotation through database searching.

By making use of the MoTo DB and the LC-PDA-MS platform established, extracts from two tissues in tomato fruit, peel and flesh, were compared for relative differences in LC-MS signals in an untargeted manner (Fig. 3.5). As was expected from previous experiments (e.g. Muir et al., 2001; Bovy et al., 2002), most of the flavonoid species and their glycosides were detected in the extracts of peel tissue, while in the flesh extracts these compounds were hardly detectable or not detectable at all. The specific accumulation of flavonoids in peel is in accordance with the idea that these compounds play a role in the protection against stress caused by, for example, UV light (Winkel-Shirley, 2002). On the other hand, by using this untargeted approach it became clear that tomato flesh contains markedly higher amounts of, amongst many still unknown metabolites, specific phenolic compounds such as caffeoylhexose II and 3-caffeoylquinic acid, as well as glycosylated alkaloids of the spirosolanol type. A compound uniquely present in the extracts from flesh tissue was identified as a hydroxyfurostanol-tetrahexose, which might correspond to tomatoside A (Shchelochkova et al., 1981). This molecule has a brassinosteroid-like structure and is structurally related to spirosolanes. Recently, highly active biosynthesis of brassinosteroids has been found in developing tomato fruits (Montoya et al., 2005).

As yet, neither the biological functions nor the mechanisms underlying the specific accumulation of these phenolic acids and glycosylated spirosolanols in the flesh of the fruit are known. Clearly, further research into the differential distribution of (secondary) metabolites between peel and flesh tissues of tomato fruit, by analysing these tissues from fruits from several cultivars, may provide novel information on tissue-specific regulation of biochemical pathways.

CONCLUSION

The maturation of metabolomics as the next cornerstone of functional genomics ultimately depends on the establishment of databases (Sumner et al., 2003; Bino et al., 2004). However, at the moment there are no effective database tools to query and/or comprehensively mine LC-MS based plant metabolomics data through automated database search engines. The generation of such tools depends on the availability of metabolite databases which can be trusted and for which the source of data and its history are maintained and made publicly accessible. Here we present the first step to implement such an open access metabolite database, the MoTo DB dedicated to tomato, which intends to systematize metabolite LC-MS, MS/MS and absorbance spectra information for common knowledge. The next step is to utilize the validated metabolomic information to study the dynamics of the metabolome, to elucidate mutants and gene functions based on differential metabolic profiles, and to decipher the biological relevance of each metabolite. The combination of information from other “omics” technologies can lead to a wider view on the systems biology of the plant studied. As a result, the integration of databases from these different disciplines will be inevitable.

Chapter 4

Tissue Specialization at the Metabolite Level is perceived during the Development of Tomato Fruit

Sofia Moco, Esra Capanoglu, Yury Tikunov, Raoul J. Bino, Jacques Vervoort and Ric C.H. de Vos

A detailed biochemical analysis is pursued for the comparison of metabolite patterns of different tomato (Solanum lycopersicum) fruit tissues along ripening, using liquid chromatography (LC) coupled to photo diode array (PDA) detection, fluorescence detection (FD) and mass spectrometry (MS). The peel and flesh tissues of ripe fruits from several commercial tomato cultivars were analysed for their metabolite profiles, exhibiting a stronger difference in metabolite patterns between tissues than between cultivars. In addition, metabolite profiles of different fruit tissues (vascular attachment region, columella and placenta, epidermis, pericarp and jelly parenchyma) of a single tomato cultivar were monitored at the green, breaker, turning, pink and red stages of fruit development. These tissues were analysed for isoprenoids (carotenoids, xanthophylls, chlorophylls and tocopherols) and ascorbic acid contents by LC-PDA-FD, while semi-polar metabolites (including flavonoids, phenolic acids, glycoalkaloids, saponins and other glycosylated derivatives) were profiled using LC-MS. Metabolites were identified by means of the LC-MS-based database for tomato fruit (MoTo DB) in combination with experimental data (accurate mass, MS/MS, UV/Vis). By performing multivariate data analysis, significant differences were observed between each tissue and each ripening stage, indicating preferential accumulation of metabolites in the different tissues at specific ripening stages.

INTRODUCTION

Tomato, Solanum lycopersicum (formerly Lycopersicon esculentum), belongs to the Solanaceae family, which contains many other plant species of commercial and/or nutritional interest (potato, pepper, eggplant, tobacco and petunia). The tomato fruit is one of the most widely produced fruits for consumption, with more than 122 million tons produced worldwide in 2005 (FAOSTAT Statistics Division, 2005). Several quality aspects are associated with the nutritional value of the tomato fruit, such as the contents of flavour volatiles, flavonoids, vitamins and carotenoids, all of which are of relevance for market consumption. In fact, the tomato fruit is a natural source of lycopene, a carotenoid, which has been the subject of controversy due to alleged effects on prostate cancer prevention (Basu and Imrhan, 2007; Jatoi et al., 2007).

Besides the nutritional value inherent to the tomato and the assumed health benefits, the tomato fruit is the most well studied of all fleshy fruits and represents a model of choice for developmental studies. During fruit ripening, a series of physiological phenomena occur such as alterations in pigment biosynthesis, decrease in resistance to pathogen infection, modification of cell wall structure, conversion of starch to sugars and increase of the levels of flavour and aromatic volatiles (Fraser et al., 1994; Giovannoni, 2001; Carrari et al., 2006). In climacteric fruits, such as the tomato fruit, ethylene plays a major role in fruit development and ripening, in addition to the plant hormones auxin and abscisic acid, as well as gibberellins and cytokinins (Srivastava and Handa, 2005). However, the dynamics and interactions within metabolic pathways, as well as the identity and concentrations of the interacting metabolites during fruit development, are mostly unknown.

Metabolomics allows the status of a plant to be diagnosed, in direct relation to its visible characteristics (phenotype). Using metabolomics technologies, a comprehensive description of naturally occurring metabolites (primary and secondary metabolites) in a biological system, such as tomato fruit, is feasible. The expansion of metabolomic technologies has resulted in the use of a diverse range and configuration of instruments and analytical methods. Mostly MS (Schauer et al., 2005a; Tikunov et al., 2005; van der Werf et al., 2005; Moco et al., 2006a; Fraser et al., 2007) and NMR (Keun et al., 2002; Le Gall et al., 2003a; Ward et al., 2003; Kochhar et al., 2006; Griffin and Kauppinen, 2007) technologies are used, but other techniques such as LC-photo diode array (PDA) (Porter et al., 2006), infrared and Raman spectroscopy (Ellis and Goodacre, 2006) have also been applied in plant metabolomics. Among a wide variety of applications (Hall, 2006; Schauer and Fernie, 2006), plant metabolomics approaches provide insight into the biochemical composition of the plant system, allowing the establishment of links to possible metabolite functions.

The description of metabolites in biological systems depends not only on technological developments in analytical methods, but also on the capacity to sort and store relevant information for re-use in related research. The integration of experimental data (e.g. mass spectrum, fragmentation pattern, NMR spectrum, retention time in a described separation system, UV/Vis spectrum) with biological (e.g. species name, organ, tissue) and chemical (e.g. name, chemical descriptors, molecular formula, structure) information can greatly improve and optimize the ability to describe metabolites in their biochemical context. Therefore, major efforts are being put into the management of metabolite information by means of databases (Ogata et al., 1999; Kopka et al., 2005; Moco et al., 2006a; Oikawa et al., 2006; Choi et al., 2007; Wishart et al., 2007). The development of such databases can lead to a better understanding of the biochemical composition and contributes to an overall insight into the functioning of the biological system. Consequently, interpretation of complex metabolic transformation processes, such as those occurring during fruit development, can benefit from (already available) database information, avoiding the need for intensive identification efforts.

Within the multiplicity of metabolites that constitutes the tomato fruit metabolome, carotenoids, flavonoids, phenolic acids and alkaloids can be analysed using LC-PDA/MS techniques. A variety of biological functions have been assigned to these classes of secondary metabolites, for example as elements involved in pollination, photoprotection, seed dispersal, adaptation to abiotic conditions and defense, as well as being involved in other non-ecological phenomena such as auxin transport (Tracewell et al., 2001; Friedman, 2002; Taylor and Grotewold, 2005; Kunz et al., 2006; Simons et al., 2006). Furthermore, biochemical studies on crops, including tomato fruit, may generate knowledge that potentially has a direct consumer impact, as it provides insight into nutritional and quality aspects.

Using LC-QTOF (quadrupole time-of-flight)-MS based metabolomics, we have recently compared peel and flesh tissues of the tomato cultivar Money Maker and showed that distinct metabolites could be attributed to each of these tissues (Moco et al., 2006). For instance, all flavonoids and α-tomatine were mainly found in the peel, while tomatoside A, a hydroxyfurostanol-tetrahexose, was uniquely present in the flesh (including seeds). In the present study, fruit tissues from a range of tomato cultivars and at different ripening stages were compared in terms of their metabolite profiles using both LC-QTOF MS and LC-PDA-FD. The combination of different analytical methods resulted in the detection of a large variety of metabolites (including isoprenoids, ascorbic acid, phenolic acids, flavonoids, saponins and glycoalkaloids) and provided novel insight into tissue-specificity within ripening tomato fruit.

MATERIALS AND METHODS

Plant material: Fruits from seven different cultivars of tomato (Solanum lycopersicum, cultivars Conchita, Campari, Favorita, Macarena, Cedrico, Aromata and Celine), at the red stage of development were acquired from local supermarkets, while fruits from cultivar Money Maker were harvested from plants grown in a greenhouse of Wageningen University. From these fruits, the peel was separated from the rest of the fruit (flesh) and both fruit tissues were immediately frozen in liquid nitrogen. The frozen material was ground to a fine powder and stored at -80 ºC before analysis. The peel and flesh tissues of these cultivars were used for metabolite profiling by LC-MS (Moco et al., 2006a).

Of a single tomato cultivar, Ever, fruits were harvested at different stages of development, i.e. green, breaker, turning, pink, and red (Fig. 4.2A-E), from plants grown in an environmentally controlled greenhouse located at Wageningen University. The tissues vascular attachment region (VAR), epidermis (EP), pericarp (PR), columella and placenta (CL), and jelly parenchyma (including the seeds, JE) (Fig. 4.2F) from 10 fruits, at each developmental stage, were separated, immediately frozen in liquid nitrogen, and combined. After grinding the frozen tomato material, these different tissue samples were freeze-dried and the water content was determined (Horwitz, 2000). These samples were analysed and quantified for the occurrence of specific metabolites (isoprenoid derivatives and ascorbic acid) by LC-PDA-FD and also profiled for semi-polar metabolites by LC-MS. Seeds of variety Money Maker were extracted from red fruits using 0.1 M hydrochloric acid followed by extensive washing with water and air-drying, while seeds from variety Ever were kindly provided by Seminis (Wageningen, The Netherlands). The gelatinous part of the JE of the varieties Ever and Money Maker was separated from the seeds and analysed separately.

Chemicals: The standard compounds chlorogenic acid, β-carotene, lutein, all trans-lycopene (from tomato, 90-95%), chlorophyll a and b from spinach, and α-, δ- and γ-tocopherol were purchased from Sigma (St. Louis, USA), naringenin from ICN (Ohio, USA), rutin from Acros (New Jersey, USA), naringenin chalcone from Apin Chemicals (Abingdon, UK), ascorbic acid from Merck (Darmstadt, Germany), neoxanthin and violaxanthin from CaroteNature (Lupsingen, Switzerland) and lutein from Extrasynthese (Genay, France). The solvents acetonitrile, methanol and chloroform were of HPLC supra gradient quality and obtained from Biosolve (Valkenswaard, The Netherlands) and ethyl acetate (for HPLC) from Acros (New Jersey, USA). Tris(hydroxymethyl)methylamine (Tris) was obtained from Invitrogen (Carlsbad, USA). Meta-phosphoric acid ((HPO3)n), sodium chloride (NaCl), diethylene triamine pentaacetic acid (DTPA), butylated hydroxytoluene (BHT) and leucine enkephalin were obtained from Sigma (St. Louis, USA). Formic acid (FA) for synthesis, 98-100%, was purchased from Merck-Schuchardt (Hohenbrunn, Germany), monopotassium phosphate (KH2PO4) pro analysis and hydrochloric acid were from Merck (Darmstadt, Germany) and dipotassium phosphate (K2HPO4) 98% from Sigma (St. Louis, USA). Ultra pure water was obtained from an Elga Maxima purification unit (Bucks, UK).

Extraction, separation and detection of isoprenoid derivatives: The extraction of lipid-soluble isoprenoids was performed according to Bino et al. (2005) using 25 (± 0.05) mg of freeze-dried tomato material. The extracts were injected (10 µL) into a LC-PDA-FD system composed of a W600 pump system (Waters Chromatography, Milford, MA, USA) equipped with a YMC-Pack reverse-phase C30 column (250 x 4.6 mm, particle size 5 µm), maintained at 40 °C, for chromatographic separation. Eluting compounds were detected using a Waters 996 PDA detector over the UV/Vis range of 240 to 750 nm coupled online with a Waters 2475 fluorescence detector. Data were analysed using Empower Pro software (2002; Waters). The detection of neoxanthin and violaxanthin was at 440 nm, chlorophyll b at 470 nm, β-carotene, lutein and lycopene at 478 nm, and chlorophyll a at 665 nm. α-, δ- and γ-tocopherols were detected by a fluorescence detector (Waters) with excitation at 296 and emission at 340 nm. The quantification of isoprenoids was based on calibration curves constructed from analysing the respective standard compounds. Three extractions of the same material were made and analysed.
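Quantification from a calibration curve, as used here for the isoprenoids, amounts to fitting detector response against the known concentrations of a standard and inverting the fitted line for each sample. The sketch below assumes a linear response and uses invented standard concentrations and peak areas; the function names are hypothetical and it does not reproduce the Empower Pro calculation.

```python
from statistics import mean


def fit_calibration(concentrations, responses):
    """Least-squares line: response = slope * concentration + intercept."""
    cx, cy = mean(concentrations), mean(responses)
    sxx = sum((x - cx) ** 2 for x in concentrations)
    sxy = sum((x - cx) * (y - cy) for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    return slope, cy - slope * cx


def quantify(response, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (response - intercept) / slope


# Hypothetical standard series: concentration (µg/mL) vs. peak area.
standards = [1.0, 2.0, 5.0, 10.0]
areas = [210.0, 405.0, 1010.0, 2005.0]
slope, intercept = fit_calibration(standards, areas)
print(round(quantify(800.0, slope, intercept), 2))
```

Sample responses outside the range covered by the standards would be extrapolated by this sketch, which is usually avoided in practice.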

Extraction, separation and detection of ascorbic acid: For the analysis of ascorbic acid, a 5% (m/v) (HPO3)n with 1 mM DTPA aqueous solution was prepared as extraction solution (continuous stirring and sonication were needed to obtain a homogeneous solution). This solution was stored at 4 ºC before analysis. To 25 (± 0.05) mg of freeze-dried tomato tissue, 0.475 mL of water was added, immediately followed by 2 mL of ice-cold extraction solution. The extracts were stirred and left on ice before 15 min of sonication. After centrifugation at 2,500 rpm for 10 min, the supernatants were filtered through 0.2 µm polytetrafluoroethylene filters and taken for LC-PDA analysis. The same LC-PDA system was used as for the analysis of isoprenoids. The separation was performed at 30 °C on a YMC-Pack Pro C18 (150 x 4.6 mm, 5 µm particle size) column using 50 mM phosphate buffer (pH 4.4) as mobile phase. After 15 minutes of separation, the column was washed with acetonitrile and conditioned for the injection of the next sample. The detection and quantification of ascorbic acid was made at 262 nm by means of a calibration curve.

Extraction, separation and detection of semi-polar metabolites: For the analysis of the cultivars Conchita, Campari, Favorita, Macarena, Cedrico, Aromata, Celine and Money Maker, 0.5 g fresh weight of tomato powder was extracted with 1.5 mL of methanol, following the protocol described by Moco et al. (2006a). For the analysis of the fruit tissues from cultivar Ever, the same procedure was applied using 25 (± 0.05) mg of freeze-dried material and 2 mL of 75% methanol (in 3 replicates). The seeds of Ever and Money Maker were also extracted by this procedure, using 50 (± 0.05) mg in 2 mL. The extracts obtained were taken for LC-PDA-QTOF MS analysis in ESI negative mode, as previously described (Moco et al., 2006a). In brief, a Waters Alliance 2795 HT system equipped with a Luna C18(2) pre-column (2.0 x 4 mm) and analytical column (2.0 × 150 mm, 100 Å, particle size 3 μm) from Phenomenex (Torrance, CA, USA) was used for chromatographic separation. The HPLC system was connected online to a Waters 2996 PDA detector and subsequently to a QTOF Ultima V4.00.00 mass spectrometer (Waters-Corporation, MS technologies, Manchester, UK). For LC-MS measurements 5 µL of sample (methanolic extract) was injected into the system and for LC-MS/MS 10 µL. The MS/MS measurements were made with increasing collision energies according to the following program: 10, 15, 25, 35 and 50 eV. Leucine enkephalin ([M-H]- = 554.2620) was injected through a separate inlet and used as lock mass.

Data analysis and alignment: Acquisition, visualisation and manual processing of LC-PDA-MS/MS data were performed under MassLynx 4.0 (Waters). Mass data were automatically processed by metAlign version 1.0 (www.metAlign.nl). Baseline and noise calculations were performed from scan number 70 to 2,400, corresponding to retention times 1.4 min to 48.6 min. The maximum amplitude was set to 35,000 and peaks below two times the local noise were discarded. More details about the settings of metAlign can be found elsewhere (De Vos et al., 2007). The accurate masses from the metAlign-extracted peaks were recalculated from the 3 top scans of each MassLynx-signal.

Annotation of metabolites: The obtained data sets were analysed as [retention time × accurate mass × peak intensity] matrices for metabolite identification. The matrix was reduced by discarding all signals with an intensity below 100 (ion counts/scan at the centre of the peak) and all signals eluting within the first 4.0 min of chromatography. This data set was checked for the presence of known tomato fruit metabolites using the MoTo DB (http://appliedbioinformatics.wur.nl/moto), after manually calculating the accurate masses by taking into account a mass signal intensity ratio of analyte versus lock mass of 0.25-2.0 (Moco et al., 2006a). For mass signals lower than 0.25 times the intensity of the lock mass, it was impossible to calculate a correct accurate mass. To annotate compounds, the tolerance for mass deviation was set at 5 ppm, taking into account the correct analyte/lock mass ratio. For an observed accurate mass, a list of possible molecular formulae was obtained, selected for the presence of C, H, O, and N, S or P. In addition, raw data sets were checked manually in MassLynx software for retention time, UV/Vis spectra and QTOF-MS/MS fragmentation patterns for chromatographically separated peaks not present in the MoTo DB, complementing the accurate mass-based elemental formulae.
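The mass-deviation criterion described above can be illustrated with a short calculation. The following Python sketch is not the software used in this study; it assumes standard monoisotopic atomic masses, and the function names are hypothetical. It computes the calculated [M-H]- mass of a molecular formula, the deviation of an observed mass in ppm, and the analyte/lock mass intensity check (0.25-2.0). The example uses the formula and observed mass listed in Table 4.2 for the formic acid adduct of α-tomatin.

```python
# Monoisotopic masses (Da) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.0030740,
    "O": 15.9949146, "P": 30.9737620, "S": 31.9720707,
}
ELECTRON = 0.00054858  # electron mass in Da


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def deprotonated_mass(formula):
    """Calculated mass of the [M-H]- ion: one H atom removed, one electron added."""
    return neutral_mass(formula) - MONOISOTOPIC["H"] + ELECTRON


def ppm_error(observed, calculated):
    """Deviation between observed and calculated mass in parts per million."""
    return (observed - calculated) / calculated * 1e6


def within_tolerance(observed, calculated, tol_ppm=5.0):
    """True if the observed mass lies within the 5 ppm annotation tolerance."""
    return abs(ppm_error(observed, calculated)) <= tol_ppm


def lock_mass_ratio_ok(analyte_intensity, lock_intensity, low=0.25, high=2.0):
    """True if the analyte/lock mass intensity ratio allows an accurate mass."""
    return low <= analyte_intensity / lock_intensity <= high


# Formula listed in Table 4.2 for the formic acid adduct of alpha-tomatin.
tomatin_fa = {"C": 51, "H": 85, "N": 1, "O": 23}
calc = deprotonated_mass(tomatin_fa)
err = ppm_error(1078.5438, calc)  # observed mass at 33.33 min in Table 4.2
print(f"calculated [M-H]- = {calc:.4f}, deviation = {err:.1f} ppm")
```

The intensities passed to `lock_mass_ratio_ok` are whatever the analyst reads from the raw data; the function only encodes the 0.25-2.0 window stated in the text.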

Multivariate analyses of LC-MS data: For the comparison and visualisation of the main tendencies of the LC-MS data acquired for the peel and flesh tissues of the 8 cultivars, the data matrix obtained from metAlign was loaded into GeneMaths software (Applied Maths, Belgium). Principal component analysis was performed after logarithmic (of base 2) transformation and standardisation of mass signals over the samples (by subtraction of average).
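The pre-treatment and PCA step can be sketched as follows. This is a minimal illustration, not the GeneMaths implementation: the intensity values are invented, and the first principal component is obtained here by power iteration, which is a substitute technique chosen only to keep the example free of external libraries.

```python
import math


def log2_center(matrix):
    """Log2-transform intensities and subtract each variable's mean over samples.

    matrix: list of samples, each a list of positive mass signal intensities.
    """
    logged = [[math.log2(value) for value in row] for row in matrix]
    n_samples = len(logged)
    means = [sum(column) / n_samples for column in zip(*logged)]
    return [[value - mean for value, mean in zip(row, means)] for row in logged]


def first_component(centered, iterations=200):
    """Estimate PC1 loadings and sample scores by power iteration on X^T X."""
    n_vars = len(centered[0])
    loadings = [1.0 / math.sqrt(n_vars)] * n_vars
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]
        updated = [sum(s * row[j] for s, row in zip(scores, centered))
                   for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in updated))
        loadings = [v / norm for v in updated]
    scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]
    return loadings, scores


# Hypothetical intensities for three mass signals in two peel and two flesh samples.
data = [
    [800.0, 100.0, 50.0],   # peel, replicate 1
    [1000.0, 120.0, 50.0],  # peel, replicate 2
    [100.0, 900.0, 50.0],   # flesh, replicate 1
    [120.0, 1100.0, 50.0],  # flesh, replicate 2
]
centered = log2_center(data)
loadings, scores = first_component(centered)
print([round(s, 2) for s in scores])
```

In this toy data set the two peel samples receive PC1 scores of one sign and the two flesh samples scores of the opposite sign, which is the kind of tissue separation discussed in the Results.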

The LC-MS derived data set from the tissues of Ever at different ripening stages, processed by metAlign, initially consisted of about 20,000 mass peaks aligned across all samples analysed. Low intensity mass peak patterns were discarded (as described above), thereby reducing the data set to 10,388 mass peaks. Most compounds are usually represented by a number of ions (isotopes and unintended fragments and adducts), which makes the LC-MS data highly redundant. This redundancy was removed by clustering of mass peak patterns using an approach called Multivariate Mass Spectra Reconstruction (MMSR) (Tikunov et al., 2005). This resulted in 504 mass peak clusters, each of which was represented by a single mass signal in further analyses. A small data set containing the quantified levels of carotenoids, tocopherols, chlorophylls and ascorbic acid, analysed by LC-PDA-FD, was appended to the LC-MS data, resulting in a final data set of 528 components (variables). Each variable was normalised over the samples using range scaling (Smilde et al., 2005). The normalised data were subjected to unsupervised clustering using the Self Organizing Tree Algorithm (SOTA) (Herrero et al., 2001). Fourteen clusters with significant internal variability (p < 0.001) were derived.
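Range scaling can be illustrated with a few lines of Python. The sketch below assumes the commonly used definition in which each variable is mean-centred and divided by its range (maximum minus minimum) over the samples; the exact variant applied in this study is the one described by Smilde et al. (2005), and the input values are invented.

```python
def range_scale(values):
    """Mean-centre one variable and divide by its range over the samples."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant variable carries no pattern; return zeros instead of dividing by 0.
        return [0.0] * len(values)
    mean = sum(values) / len(values)
    return [(value - mean) / (highest - lowest) for value in values]


# Two hypothetical variables on very different scales with the same relative pattern.
abundant = [10.0, 20.0, 40.0]   # e.g. a quantified level in µg/g dry weight
minor = [0.1, 0.2, 0.4]         # e.g. a low-level compound
scaled_abundant = range_scale(abundant)
scaled_minor = range_scale(minor)
print(scaled_abundant)
print(scaled_minor)
```

After scaling, both variables span a range of exactly 1 and show the same pattern, so that quantified LC-PDA-FD levels and LC-MS ion counts contribute comparably to the subsequent clustering.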


RESULTS

The metabolites in tomato fruit extracts were analysed by different LC-hyphenated methods, in order to profile a wide variety of compounds naturally occurring in tomato fruit and to establish differences between the various fruit tissues and ripening stages. To assess the tissue specificity of metabolites, fruits from a series of tomato cultivars were separated into peel and flesh, followed by a more detailed study on a single cultivar using a finer separation of the fruit tissues at different stages of ripening.

LC-MS analyses of peel and flesh from different tomato cultivars

The specificity of metabolite accumulation in two fruit tissues (peel and flesh) has been tested previously for one tomato cultivar, Money Maker, at the red stage of fruit development. The epidermis (also containing some pericarp tissue) was classified as peel and the rest of the fruit (including the seeds) was classified as flesh (Moco et al., 2006a). In the present study these two tissues were further tested for other tomato cultivars, also at the red stage of development. The cultivars chosen (Conchita, Campari, Favorita, Macarena, Cedrico, Aromata and Celine) are all commercial cultivars available for consumption. Three replicates of 75% methanolic extracts of the same biological material per cultivar and per tissue were analysed by LC-PDA-QTOF-MS.

Firstly, the extraction procedure and the LC-MS measurements were tested for reproducibility. The standard error of the means of 3 repetitive measurements of the same extract was 5.2%, which indicates a good reproducibility of the LC-MS analyses. The overall standard error of the means of replicate (n = 3) mass signal intensities of extracts prepared from the same biological material was 6.8%, indicating that the extraction procedure was also highly reproducible.
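As an illustration of how such a reproducibility figure can be derived, the sketch below computes the standard error of the mean of replicate intensities, expressed as a percentage of the mean. It is assumed here that the reported percentages are relative standard errors; the replicate values are invented.

```python
import math
import statistics


def relative_sem_percent(replicates):
    """Standard error of the mean as a percentage of the mean."""
    mean = statistics.mean(replicates)
    sem = statistics.stdev(replicates) / math.sqrt(len(replicates))
    return 100.0 * sem / mean


# Hypothetical intensities of one mass signal in three replicate extracts.
triplicate = [100.0, 110.0, 90.0]
rel_sem = relative_sem_percent(triplicate)
print(f"{rel_sem:.1f}%")
```

An overall figure for a data set would then be obtained by averaging this value over all mass signals, under the same assumption.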

Secondly, in order to compare the LC-MS profiles of the different tomato cultivars, including Money Maker, a principal component analysis (PCA) was performed (Fig. 4.1). The x-axis of the PCA plot (first component) coincides with the separation of peel and flesh, while the y-axis (second component) corresponds to the different cultivars. This result supports a stronger tissue-driven variation than a cultivar-driven variation. The metabolite profiles of the same fruit tissue in different cultivars are more similar than the profiles of the two tissues within each cultivar. The metabolite putatively annotated as tomatoside A (Moco et al., 2006a) was one of the main signals responsible for the segregation of the flesh from the peel tissues for all the analysed tomato cultivars, while flavonoids were found to be specific for the peel tissues.


Figure 4.1. Principal component analysis (PC1 versus PC2) of tomato fruit tissues peel and flesh of different cultivars: Favorita, Campari, Conchita, Cedrico, Aromata, Celine, Macarena and Money Maker for three replicate extractions (explained variance in the x-axis, PC1, 22.2% and in the y-axis, PC2, 9.6%; both axes run from -40 to 40).

Tissue specificity of metabolites during fruit ripening

In order to evaluate the tissue distribution of metabolites upon ripening, fruits from a single tomato cultivar, Ever, were chosen for more extensive analysis. Fruits at five ripening stages were divided into five different tissues. From the outside to the centre of the fruit, the following fruit parts were separated and analysed separately: the fruit tissue where the sepals were directly connected (VAR), the external epidermal tissue layer (exocarp or EP), the fleshy tissue layer below the epidermis (PR), the gelatinous locular tissue of the fruit including the seeds (JE), and the central inner fleshy tissue of the fruit (CP) (Fig. 4.2F). These tissues were tested for their metabolite profiles by both LC-PDA-FD and LC-PDA-QTOF-MS.

During ripening of the tomato fruit, changes in pigmentation were evident across the developmental points chosen for the analyses, from the green stage through breaker, turning and pink to the red stage (Fig. 4.2A-E). No obvious changes in fruit size were observed between the analysed developmental stages.


Figure 4.2. Fruit ripening stages of tomato cultivar Ever: green (A), breaker (B), turning (C), pink (D) and red (E) and different tissues within the fruit (F): VAR (1), EP (2), JE (3), CP (4) and PR (5).

Isoprenoids and ascorbic acid in fruit tissues during ripening

The amounts of specific isoprenoids were determined, using LC-PDA-FD, in the different fruit tissues and ripening stages of tomato (Table 4.1). The tendencies observed during ripening were similar for all tissues: there was an increase in lycopene during fruit development and a decrease in chlorophylls (a and b), which was also obvious from the changes in fruit colour (from a green to a red coloured-fruit). β-Carotene increased, neoxanthin slightly decreased, while lutein was virtually constant during development. Violaxanthin showed a profile that was slightly different from the other xanthophylls: this compound first increased up to breaker/pink stage and then decreased in the red stage. In general, the α-, γ- and δ-tocopherols increased during development in all tissues except the JE, in which all tocopherols decreased.

However, there were clear differences in the levels of isoprenoids between the different tissues, as well as in the increase or decrease rate upon development. Lycopene increased mostly in the EP: more than 20,000 fold from the green to the red stage (up to about 2.5 mg/g dry weight). Chlorophyll a was higher than chlorophyll b, with a ratio of about 3 in all tissues. Both chlorophylls were most abundant in the VAR and the JE, while lowest levels were detected in the CP and EP. The amount of β-carotene ranged from 4 (green CP) to 85 (red EP) µg/g dry weight. In green fruits the xanthophylls were highest in the JE, while in red fruits they were highest in the VAR. Low amounts of neoxanthin occurred in all tissues of tomato, below 15 µg/g dry weight. The levels of violaxanthin were relatively stable during development of all tissues. In the EP, lutein showed the lowest amount and in the VAR the highest amount for all developmental stages.


Table 4.1. Water content (W), in %, and mass contents, in µg/g dry weight, of isoprenoid derivatives (neoxanthin, violaxanthin, β-carotene, all trans-lycopene, lutein, chlorophyll a, chlorophyll b, α-tocopherol, δ-tocopherol, γ-tocopherol) and ascorbic acid in the tissues of tomato fruit (VAR, CP, EP, PR and JE), at different ripening stages (green, breaker, turning, pink and red), presented as means ± standard error of the means (n = 3); nd = not detectable (< 1 µg/g dry weight); Ripening stages: G = green, B = breaker, T = turning, P = pink, R = red.

Tissue | Dev | W (%) | Neoxanthin | Violaxanthin | β-Carotene | All trans-lycopene | Lutein | Chlorophyll a | Chlorophyll b | δ-tocopherol | γ-tocopherol | α-tocopherol | Ascorbic acid
VAR | G | 94 | 11.8 ± 1.31 | 20.9 ± 0.47 | 10.3 ± 1.70 | 1.91 ± 0.48 | 34.12 ± 4.27 | 254.56 ± 23.51 | 96.3 ± 10.70 | 0.22 ± 0.02 | 10.1 ± 0.16 | 406.29 ± 9.39 | 274.09 ± 2.83
VAR | B | 93 | 10.6 ± 0.81 | 27.0 ± 0.35 | 20.0 ± 3.70 | 4.96 ± 0.80 | 34.02 ± 0.89 | 214.12 ± 19.65 | 69.3 ± 6.63 | 1.08 ± 0.00 | 15.42 ± 0.20 | 483.93 ± 15.6 | 796.03 ± 2.31
VAR | T | 94 | 10.97 ± 0.69 | 36.44 ± 0.10 | 38.12 ± 5.39 | 48.86 ± 8.25 | 36.63 ± 1.18 | 166.13 ± 11.84 | 53.4 ± 3.59 | 2.60 ± 0.07 | 32.09 ± 0.47 | 554.89 ± 7.49 | 412.44 ± 6.73
VAR | P | 94 | 8.46 ± 0.42 | 32.6 ± 0.19 | 52.2 ± 5.48 | 110.15 ± 18.02 | 30.06 ± 0.44 | 108.02 ± 5.01 | 30.60 ± 2.38 | 2.84 ± 0.04 | 37.9 ± 0.81 | 631.76 ± 9.76 | 415 ± 3.47
VAR | R | 94 | 5.51 ± 0.27 | 22.87 ± 0.67 | 74.03 ± 3.89 | 400.83 ± 35.19 | 30.7 ± 0.67 | 49.1 ± 2.02 | 14.2 ± 1.29 | 3.23 ± 0.09 | 62.73 ± 1.13 | 537.59 ± 3.40 | 1286.8 ± 5.84
CP | G | 94 | 6.25 ± 0.40 | 12.1 ± 0.24 | 4.17 ± 0.56 | nd | 16.2 ± 0.99 | 98.0 ± 6.85 | 37.4 ± 2.09 | 0.03 ± 0.02 | 3.11 ± 0.11 | 209.03 ± 1.36 | 146.56 ± 2.51
CP | B | 94 | 4.91 ± 0.19 | 20.29 ± 0.42 | 21.81 ± 1.77 | 20.22 ± 1.09 | 18.24 ± 1.28 | 51.40 ± 3.37 | 18.32 ± 1.59 | 0.44 ± 0.01 | 7.34 ± 0.06 | 210.75 ± 2.79 | 549.76 ± 3.2
CP | T | 95 | 5.73 ± 0.56 | 22.8 ± 0.40 | 28.15 ± 3.82 | 35.68 ± 4.51 | 15.85 ± 0.60 | 45.95 ± 7.42 | 15.99 ± 1.24 | 0.82 ± 0.01 | 10.0 ± 0.14 | 278.47 ± 2.66 | 701.97 ± 3.37
CP | P | 94 | 4.04 ± 0.45 | 21.43 ± 0.35 | 43.84 ± 2.47 | 110.26 ± 7.53 | 11.42 ± 0.24 | 15.73 ± 1.46 | 7.52 ± 0.92 | 0.83 ± 0.01 | 9.99 ± 0.12 | 279.53 ± 2.47 | 869.48 ± 6.38
CP | R | 95 | 3.62 ± 0.44 | 23.95 ± 0.33 | 60.98 ± 3.10 | 253.98 ± 16.29 | 12.2 ± 0.61 | nd | 2.23 ± 0.50 | 1.06 ± 0.00 | 15.34 ± 0.34 | 316.95 ± 1.13 | 1302.3 ± 5.08
EP | G | 94 | 2.01 ± 0.46 | 8.55 ± 0.45 | 7.64 ± 0.12 | nd | 9.15 ± 1.27 | 102.56 ± 7.22 | 26.0 ± 0.67 | 1.76 ± 0.03 | 33.67 ± 0.31 | 181.19 ± 5.54 | 1176.5 ± 7.24
EP | B | 94 | 3.13 ± 0.56 | 15.71 ± 0.87 | 23.61 ± 1.53 | 40.18 ± 3.24 | 11.35 ± 1.20 | 81.02 ± 5.15 | 20.56 ± 1.74 | 3.27 ± 0.06 | 41.5 ± 1.54 | 193.69 ± 7.94 | 1609.36 ± 10.28
EP | T | 94 | 1.7 ± 0.16 | 13.23 ± 0.94 | 39.54 ± 0.46 | 214.69 ± 2.32 | 8.95 ± 0.56 | 24.4 ± 1.49 | 3.75 ± 0.45 | 5.86 ± 0.04 | 62.11 ± 0.34 | 208.25 ± 5.31 | 1553.4 ± 5.96
EP | P | 94 | 1.8 ± 0.21 | 6.47 ± 0.49 | 64.8 ± 1.78 | 874.25 ± 23.38 | 8.58 ± 0.13 | 5.03 ± 0.51 | 1.28 ± 0.08 | 7.38 ± 0.05 | 77.38 ± 0.73 | 214.31 ± 1.89 | 1616.4 ± 3.27
EP | R | 93 | nd | 9.92 ± 0.14 | 84.64 ± 4.54 | 2786.5 ± 86.83 | 9.04 ± 0.38 | nd | nd | 7.57 ± 0.11 | 69.11 ± 1.21 | 193.69 ± 3.26 | 1670.7 ± 7.61
PR | G | 95 | 6.34 ± 0.56 | 11.3 ± 0.08 | 8.29 ± 0.23 | 0.48 ± 0.04 | 18.0 ± 0.54 | 121.83 ± 13.37 | 43.41 ± 4.03 | 2.72 ± 0.57 | 77.3 ± 25.33 | 155.4 ± 11.52 | 703.75 ± 2.48
PR | B | 95 | 6.73 ± 0.62 | 16.84 ± 0.31 | 29.15 ± 0.33 | 41.58 ± 1.22 | 25.36 ± 0.50 | 73.10 ± 6.25 | 23.54 ± 3.33 | 6.16 ± 0.22 | 151.07 ± 5.64 | 157.54 ± 5.72 | 1174.49 ± 8.25
PR | T | 95 | 2.91 ± 0.61 | 11.9 ± 1.18 | 44.6 ± 6.89 | 78.4 ± 33.75 | 21.37 ± 0.60 | 27.57 ± 2.74 | 8.49 ± 0.70 | 5.19 ± 0.16 | 128.69 ± 4.63 | 165.18 ± 1.71 | 1404.2 ± 10.97
PR | P | 96 | 3.09 ± 0.20 | 11.0 ± 1.32 | 50.0 ± 5.68 | 301.50 ± 9.98 | 20.9 ± 0.78 | 6.65 ± 0.26 | 3.34 ± 0.05 | 4.66 ± 0.16 | 85.58 ± 2.81 | 195.6 ± 3.80 | 1141.8 ± 7.67
PR | R | 95 | nd | 5.48 ± 0.09 | 49.9 ± 1.83 | 845.68 ± 17.21 | 14.1 ± 0.45 | nd | nd | 4.33 ± 0.13 | 129.74 ± 6.14 | 216.62 ± 7.06 | 1517.8 ± 6.08
JE | G | 94 | 13.5 ± 1.57 | 25.0 ± 0.77 | 15.2 ± 0.59 | nd | 39.73 ± 1.44 | 281.95 ± 8.1 | 119.37 ± 9.09 | 3.07 ± 0.51 | 102.11 ± 22.16 | 133.02 ± 8.58 | 830.34 ± 12.8
JE | B | 93 | 4.62 ± 0.12 | 17.8 ± 1.61 | 64.9 ± 4.35 | 115.02 ± 7.44 | 25.7 ± 1.01 | 21.7 ± 3.34 | 5.48 ± 0.82 | 1.02 ± 0.00 | 13.3 ± 0.21 | 121.36 ± 1.60 | 1040.81 ± 11.57
JE | T | 94 | 2.90 ± 0.38 | 13.29 ± 0.59 | 73.99 ± 1.59 | 213.18 ± 5.47 | 20.23 ± 0.21 | 2.39 ± 1.43 | 0.86 ± 0.43 | 1.37 ± 0.00 | 21.6 ± 1.19 | 132.45 ± 2.41 | 1108.41 ± 8.82
JE | P | 93 | 2.63 ± 0.20 | 12.8 ± 0.77 | 73.7 ± 4.11 | 366.83 ± 16.84 | 21.16 ± 0.96 | 0.96 ± 0.96 | nd | 1.42 ± 0.00 | 25.2 ± 0.96 | 136.40 ± 1.80 | 1039 ± 7.24
JE | R | 94 | 1.93 ± 0.32 | 8.34 ± 0.40 | 69.27 ± 2.41 | 542.63 ± 21.10 | 21.4 ± 1.09 | 0.32 ± 0.32 | nd | 1.52 ± 0.01 | 22.22 ± 0.31 | 114.29 ± 0.54 | 1141.9 ± 8.36


The tocopherols showed different concentrations in diverse parts of the tomato fruit. Vitamin E (α-tocopherol) was the most abundant tocopherol in all tissues in any of the ripening stages, being highest in the VAR and lowest in the PR. γ-Tocopherol, which is the biosynthetic precursor of α-tocopherol, was highest in the JE of the tomato fruit. The ratio α- versus γ-tocopherol clearly differed between tissues, suggesting tissue-dependent differences in the activity of the corresponding γ-tocopherol methyltransferase. The levels of δ-tocopherol were relatively low in all tissues (highest in the red EP), while β-tocopherol was not detectable at all (less than 0.1 µg/g dry weight).

The level of ascorbic acid (vitamin C) increased during ripening in all tissues, though its increase was generally largest between green and breaker or turning (Table 4.1). In the VAR, vitamin C displayed a rather specific pattern upon fruit ripening: a nearly 3-fold increase from green to breaker, followed by a 2-fold decrease from breaker to turning and again a 3-fold increase from pink to red stage. When red fruit is compared to green fruit, ascorbic acid increased nearly 10 fold in the CP and less than 2 fold in the EP. At all ripening stages, the highest levels of this antioxidant were detected in the EP of the fruits.

Semi-polar metabolites in the fruit tissues during ripening

The LC-PDA-QTOF-MS analysis of metabolites present in the semi-polar extracts, over a definite range of polarity imposed by the reversed-phase column used for the analytical separation, allowed the detection of mostly glycosylated derivatives of phenolic acids, flavonoids, alkaloids and other small molecules. The different fruit tissue profiles were quite diverse, as shown by the obtained mass chromatograms, Fig. 4.3. It was also visible that, in all tissues, marked changes in metabolites occurred during ripening of the fruit, such as the complete disappearance as well as the appearance of mass signals.

From principal component analyses of these LC-MS profiles (Fig. 4.4), it appeared that metabolite differences between tissues were more pronounced than differences between ripening stages. The largest metabolite variations were observed between the tissues EP and JE, which corresponded to the first principal component in the PCA plot, while the second and third components corresponded to fruit development. During ripening the differences between tissues became more pronounced, suggesting ripening-dependent tissue differentiation of metabolites.

Quantification of compounds could not be performed in these analyses, as most of the compounds detected are not commercially available as standards.


Compounds identified using LC-PDA-MS/MS in the analysed tissues are listed in Table 4.2. The performance of the LC-PDA-MS system and the results obtained in this study are in accordance with previous findings (Moco et al., 2006a), indicating the robustness of the method. Some of the compounds reported in Table 4.2 have been detected before in tomato peel and are present in the MoTo DB. The analysis of different fruit tissues enabled a complementation of the putative identifications with additional or improved experimental data, in addition to newly found compounds.

Figure 4.3. LC-MS chromatograms (TOF MS, ESI negative mode, base peak intensity; all panels scaled to 3.38 × 10^4; x-axis: retention time, 10 to 50 min) of the tissues VAR (A), CP (B), EP (C), PR (D) and JE (E) in red fruits of tomato fruit Ever.


Figure 4.4. Principal component analysis of LC-MS data from tomato fruit Ever over different developmental stages (G = green, B = breaker, T = turning, P = pink and R = red) and different tissues within the fruit (VAR, EP, JE, CP and PR). PCA plot of the mass signals (A) and of the samples (B) with an explained variance over the x axis (PC1) of 33.6%, y axis (PC2) of 22.2% and z axis (PC3) of 13.2%. Tissue names in the plot legend: jelly parenchyma, columella & placenta, vascular attachment region, epidermis and pericarp.

Phenolic acids and derivatives thereof

Conjugated forms of caffeic acid, coumaric acid (most likely p-coumaric acid) and ferulic acid were detected as relatively high signals in all tissues of tomato fruit. Three isomers of caffeic acid-hexose, eluting at 10.27, 10.88 and 13.19 min, were present in all fruit tissues. These compounds, in particular the first eluting isomer, were most abundant in the JE and highest at the turning and pink stages of ripening. The second isomer (10.88 min) was almost uniquely present in the VAR. Four isomers of coumaric acid-hexose were found in tomato fruit, eluting at 10.47, 13.90, 14.26 and 15.39 min. The first and fourth isomers were most abundant in the JE, the second in the EP and the third in the VAR. Three isomers of ferulic acid-hexose, eluting at 12.91, 16.38 and 17.29 min, were detected and these compounds were highest in the JE (first isomer) or VAR (second and third isomers).

Three isomers of caffeoylquinic acid were present in all different tomato tissues, appearing in the chromatograms at 14.23, 14.86 and 17.18 min, respectively. In addition, there were three isomers of dicaffeoylquinic acids (eluting at 27.82, 28.47 and 30.58 min), as well as three isomers of tricaffeoylquinic acids (appearing at 39.32, 40.63 and 41.35 min) present in all tomato tissues, in particular in the EP, and which all increased upon fruit ripening.


Table 4.2. Metabolites putatively identified by LC-PDA-ESI-QTOF-MS/MS (negative ionisation mode) in tissues of tomato fruit. Ret (min) = averaged retention time, in minutes. Mass = averaged accurate mass ([M-H]-), in Da, obtained from signals with an intensity ratio 0.25 < analyte / lock mass < 2.0 - in italic, from signals with analyte / lock mass < 0.25. Δmass (ppm) = deviation between the averages of observed and calculated accurate masses, in ppm – in italic, obtained manually, by calculation from analyte mass signal at intensities comparable to the lock mass. UV/Vis = absorbance maxima in the UV/Vis range (not detectable absorbances are represented by “-“). MS/MS = fragments obtained through increased collision energy on indicated parent mass. Metabolite name = common name of putatively identified metabolite. Mol Form = molecular formula of the metabolite. MM = molecular monoisotopic mass of the metabolite. ()FA = formic acid adduct. (T) = metabolite previously described in the MoTo DB. I, II, III, IV = different isomers (only one reported in literature).

Ret (min) | Max intensity | Mass | Δmass (ppm) | UV/Vis | MS/MS | Metabolite Name | Mol Form | MM
4.82 | 797 | 164.0725 | 4.7 | | 146, 103 | Phenylalanine | C9H11NO2 | 165.0790
7.38 | 1071 | 380.1561 | -3.8 | - | 308, 263, 218, 200, 174, 161, 146, 134 | Zeatin hexose | C16H23N5O6 | 381.1648
10.27 | 8816 | 341.0880 | 0.6 | | | Caffeic acid-hexose I (T) | C15H18O9 | 342.0951
10.47 | 2864 | 325.0930 | 0.3 | | | Coumaric acid-hexose I (T) | C15H18O8 | 326.1002
10.88 | 913 | 341.0884 | 1.6 | | | Caffeic acid-hexose II (T) | |
12.67 | 1081 | 443.1918 | -1.0 | - | 381, 307, 281, 237, 219, 201, 189, 179, 161, 153, 143, 119, 113, 101, 89 | Dehydrophaseic acid-hexose | C21H32O10 | 444.1995
12.91 | 2077 | 355.1035 | 0.0 | | | Ferulic acid-hexose I (T) | C16H20O9 | 356.1107
13.19 | 2340 | 341.0879 | 0.3 | | | Caffeic acid-hexose III (T) | C15H18O9 | 342.0951
13.90 | 1630 | 325.0929 | 0.1 | | 163, 119, 93 | Coumaric acid-hexose II (T) | C15H18O8 | 326.1002
14.23 | 14901 | 353.0873 | -1.3 | | 191 | 3-Caffeoylquinic acid (T) | C16H18O9 | 354.0951
14.26 | 1510 | 325.0935 | 2.0 | - | | Coumaric acid-hexose III | C15H18O8 | 326.1002
14.86 | 2800 | 353.0878 | -0.1 | | | 5-Caffeoylquinic acid (T) | C16H18O9 | 354.0951
15.20 | 7603 | 411.1872 | 0.0 | - | 249, 161, 101 | (iso)pentyl dihexose | C17H32O11 | 412.1945
15.39 | 286 | 325.0936 | 2.1 | - | 265, 235, 205, 163, 145, 117 | Coumaric acid-hexose IV | C15H18O8 | 326.1002
15.55 | 7842 | 771.1989 | -0.1 | | | Quercetin-dihexose-deoxyhexose (T) | C33H40O21 | 772.2062
15.89 | 631 | 595.1660 | -1.4 | | 549, 475, 433, 415, 385, 355, 313, 271, 263 | Naringenin dihexose (T) | C27H32O15 | 596.1741
16.38 | 9995 | 355.1038 | 1.0 | | | Ferulic acid-hexose II (T) | C16H20O9 | 356.1107
17.18 | 2280 | 353.0876 | -0.6 | | | 4-Caffeoylquinic acid (T) | C16H18O9 | 354.0951
17.29 | 1312 | 355.1031 | -0.9 | 297sh, 329 | 193, 175, 160, 132 | Ferulic acid-hexose III | C16H20O9 | 356.1107
17.83 | 2177 | 755.2036 | -0.6 | 264, 349 | 593, 447, 285 | Kaempferol-dihexose-deoxyhexose | C33H40O20 | 756.2113
20.40 | 1396 | 1272.5891 | 2.0 | | | (Esculeoside B) FA (T) | C57H95NO30 | 1273.5939
22.47 | 16993 | 741.1946 | 8.4 | | | Quercetin-hexose-deoxyhexose-pentose (T) | C32H38O20 | 742.1956
23.85 | 3619 | 1314.5978 | 0.5 | | | (Lycoperoside F) FA or (Lycoperoside G) FA or (Esculeoside A) FA I (T) | C59H97NO31 | 1315.6045
24.14 | 232 | 1094.5459 | 6.4 | - | 1049, 917, 887, 754, 736, 718, 700, 688, 609, 592, 395, 305, 143, 89 | (Lycoperoside H) FA I | C51H85NO24 | 1095.5462
24.44 | 33024 | 609.1459 | -0.4 | | | Rutin (T) | C27H30O16 | 610.1534
24.76 | 7396 | 725.1936 | 0.2 | 264, 345 | 593, 575, 285, 255 | Kaempferol-hexose-deoxyhexose-pentose | C32H38O19 | 726.2007
25.50 | 737 | 1094.5419 | 2.8 | - | 1049, 917, 887, 754, 688, 592, 455, 305, 143 | (Lycoperoside H) FA II | C51H85NO24 | 1095.5462
25.70 | 353 | 425.1821 | 1.0 | - | 263, 153 | Abscisic acid-hexose | C21H30O9 | 426.1890
25.83 | 980 | 1314.5954 | -1.4 | | | (Lycoperoside F) FA or (Lycoperoside G) FA or (Esculeoside A) FA II (T) | C59H97NO31 | 1315.6045
25.87 | 2053 | 1312.5817 | 0.2 | | | (Dehydrolycoperoside F) FA or (Dehydrolycoperoside G) FA or (Dehydroesculeoside A) FA I (T) | C59H95NO30 | 1313.5888
26.54 | 2044 | 1094.5397 | 0.7 | - | 1049, 917, 887, 754, 688, 592, 179, 143, 125 | (Lycoperoside H) FA III | C51H85NO24 | 1095.5462
26.62 | 695 | 1312.5881 | 5.0 | - | 1267, 1137, 1107, 975, 944, 812, 746, 650, 275, 143 | (Dehydrolycoperoside F) FA or (Dehydrolycoperoside G) FA or (Dehydroesculeoside A) FA II | C59H95NO30 | 1313.5888
26.62 | 32144 | 1314.6003 | 2.4 | | | (Lycoperoside F) FA or (Lycoperoside G) FA or (Esculeoside A) FA III (T) | C59H97NO31 | 1315.6045
27.31 | 537 | 1094.5421 | 3.0 | | | (Lycoperoside H) FA IV (T) | C51H85NO24 | 1095.5462
27.45 | 16802 | 593.1514 | 0.3 | | | Kaempferol-3-O-rutinoside (T) | C27H30O15 | 594.1585
27.60 | 1810 | 1312.5843 | 2.1 | - | 1267 | (Dehydrolycoperoside F) FA or (Dehydrolycoperoside G) FA or (Dehydroesculeoside A) FA III | C59H95NO30 | 1313.5888
27.62 | | 1314.5920 | 1.8 | - | | (Lycoperoside F) FA or (Lycoperoside G) FA or (Esculeoside A) FA IV | C59H97NO31 | 1315.6045
27.82 | 1662 | 515.1199 | 0.8 | | | Dicaffeoylquinic acid I (T) | C25H24O12 | 516.1268
28.47 | 868 | 515.1199 | 0.7 | | | Dicaffeoylquinic acid II (T) | C25H24O12 | 516.1268
30.58 | 4972 | 515.1193 | -0.5 | | | Dicaffeoylquinic acid III (T) | C25H24O12 | 516.1268
31.13 | 3614 | 887.2255 | 0.4 | 258, 321 | | Quercetin-hexose-deoxyhexose-pentose-p-coumaric acid (T) | C41H44O22 | 888.2324
32.19 | 3517 | 1076.5283 | 0.0 | - | 1031, 899, 868, 736, 670, 574, 305, 143, 119, 113 | (α-Dehydrotomatin) FA I | C51H83NO23 | 1077.5356
32.75 | 380 | 1136.5520 | 2.2 | - | | (Lycoperoside A) FA or (Lycoperoside B) FA or (Lycoperoside C) FA I | C53H87NO25 | 1137.5567
32.82 | 1081 | 433.1141 | 0.2 | | | Naringenin chalcone-hexose (T) | C21H22O10 | 434.1213
32.84 | 1960 | 1078.5451 | 1.1 | - | | (α-Tomatin) FA I | C51H85NO23 | 1079.5512
33.31 | 686 | 1076.5309 | 2.4 | - | | (α-Dehydrotomatin) FA II | C51H83NO23 | 1077.5356
33.33 | 33734 | 1078.5438 | -0.1 | | 1033, 901, 870, 738, 672, 576, 305, 143, 119, 113 | (α-Tomatin) FA II (T) | C51H85NO23 | 1079.5512
33.35 | 23219 | 1136.5489 | -0.4 | | 1092, 959, 929, 796, 731, 634 | (Lycoperoside A) FA or (Lycoperoside B) FA or (Lycoperoside C) FA II (T) | C53H87NO25 | 1137.5567
33.46 | 2614 | 433.1143 | 0.6 | | | Naringenin chalcone-hexose II (T) | C21H22O10 | 434.1213
34.09 | 918 | 1136.5505 | 0.9 | - | | (Lycoperoside A) FA or (Lycoperoside B) FA or (Lycoperoside C) FA III | C53H87NO25 | 1137.5567
37.94 | 33859 | 1081.5448 | 1.1 | - | 1037, 919, 903, 757, 740, 595, 161 | Tomatoside A | C51H86O24 | 1082.5509
39.32 | 7103 | 677.1519 | 1.0 | | | Tricaffeoylquinic acid I (T) | C34H30O15 | 678.1585
40.63 | 145 | 677.1533 | 1.1 | | | Tricaffeoylquinic acid II (T) | C34H30O15 | 678.1585
41.35 | 112 | 677.1533 | 4.5 | - | 515, 353 | Tricaffeoylquinic acid III | C34H30O15 | 678.1585
42.17 | 5885 | 271.0621 | 3.4 | | | Naringenin (T) | C15H12O5 | 272.0685
42.73 | 34406 | 271.0622 | 3.6 | | | Naringenin chalcone (T) | C15H12O5 | 272.0685

Flavonoids and derivatives thereof

Flavonoids were typically present in the epidermal tissues. Quercetin, kaempferol, naringenin and naringenin-chalcone derivatives were found mostly in the EP while some others, such as the aglycons naringenin-chalcone and naringenin and the trisaccharides of kaempferol and quercetin, were present in both EP and VAR. Some flavonoids (e.g. quercetin-dihexose-deoxyhexose, quercetin-hexose-deoxyhexose-pentose-coumaric acid, kaempferol-dihexose-deoxyhexose, naringenin, naringenin chalcone-hexose and naringenin-dihexose) increased during development, while quercetin-hexose-deoxyhexose-pentose decreased. Rutin and naringenin chalcone were the most abundant flavonoids in the fruit, exhibiting intense mass signals in the EP extracts. It has been reported that these flavonoids increase during fruit development (Bovy et al., 2002). However, in the present study an increase of these flavonoids was only observed from the green to the breaker stage, followed by stabilisation of their levels up to the red stage. Two glycosylated derivatives of kaempferol, namely kaempferol-rutinoside and kaempferol-hexose-deoxyhexose-pentose, exhibited non-linear patterns during development. The first derivative increased from breaker to red stage, after a decrease from the green to the breaker stage. The second one decreased from green to turning stage, increased again to the pink stage, and returned to lower intensities at the red stage of the fruits. The ripening and tissue-distribution profiles of the flavonoids rutin, naringenin and naringenin chalcone were confirmed by LC-PDA analyses (data not shown).


Glycoalkaloids

A variety of glycoalkaloids, detected as their formic acid adducts (Moco et al., 2006a), were present in different tissues and at specific developmental stages. Esculeoside B, eluting at a retention time of 20.40 min, mainly occurred in the JE and its signal intensity slightly decreased during fruit ripening (1.4 fold from green to red stage). Four isomers of lycoperoside H were detected in tomato fruit, at retention times of 24.14, 25.50, 26.54 and 27.31 min, and all these were present essentially in the EP and at highest levels at the early stages of ripening (green and breaker). The alkaloids lycoperoside F and G and esculeoside A have the same molecular mass, which appeared four times in the LC-MS chromatogram, suggesting four different isomers (at 23.85, 25.83, 26.62 and 27.62 min). These compounds were present in all tissues, in particular in the EP and JE, but hardly occurred in the green fruit. In the red stage, the third isomer appeared as one of the highest signals in the mass chromatograms. Three derivatives of lycoperosides F and G or esculeoside A, with a mass difference of 2 Da, and therefore recently suggested to be named dehydrolycoperosides F and G or dehydroesculeoside A (Moco et al., 2006a), occurred in all tomato fruit tissues. These metabolites had lower signal intensities than the lycoperosides F and G or esculeoside A isomers, but displayed a similar tissue distribution and response upon ripening. The last dehydro isomer, eluting at 27.60 min, was only present in the EP and increased more than 1500 fold from the green (below the detection limit) to the red stage. The best-known tomato alkaloids, α-tomatin and dehydrotomatin, occurred as two different isomers in all tissues at the green stage, in particular in the EP and JE, decreasing towards levels below the detection limit at the red stage of development. For both compounds, one isomer was more abundant than the other: the first eluting isomer of dehydrotomatin, retention time 32.19 min (more than 110 fold decrease from green to red) and the second eluting isomer of α-tomatin, retention time 33.33 min (more than 65 fold decrease from green to red). Three isomers of the equal mass metabolites lycoperosides A, B and C were found in all tissues, with the second isomer (at 33.35 min) being the most abundant. Analogous to α-tomatin, these alkaloids preferentially accumulated in the EP and JE, and were highest at the green stage.

Other metabolites

A large number of metabolites present in the extract appear in the chromatogram before 4 minutes of retention time, as a large and mostly asymmetrical peak. These are (very) polar metabolites that do not interact with the stationary phase. Amino acids, nucleosides, mono-, di-, and tri-phosphate nucleotides, sugars and organic acids are present in this part of the chromatogram. Phenylalanine was the only amino acid actually separated by our instrumental setup (retention time 4.8 min). This amino acid was present in all fruit tissues, though most abundant in the PR and CP, and slightly increased during development (about 2-fold from green to red).

The metabolite assigned as the saponin tomatoside A seemed to be specific for the JE extracts, as it displayed extremely high signals at all stages of development of this tissue. From the analysis of the seeds of Ever and Money Maker and their comparison to the gelatinous part of the JE tissue, it appeared that this saponin was highly abundant in the seeds (Fig. 4.5). These results suggest that this compound may be uniquely present in the seeds of tomato. Likewise, a compound with m/z 962.5, at 26.2 min, is also present in the seeds and not in the gelatinous part of the JE (Fig. 4.5), and appears to be a formic acid adduct of a dihexose alkaloid. On the other hand, m/z 1314.595 at retention time 25.8 min, annotated as lycoperoside F or G or esculeoside A (Table 4.2), was not detectable in the isolated seeds.
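Because the glycoalkaloids were detected as formic acid adducts, the observed m/z has to be converted back to a neutral mass before it can be compared with a candidate structure. The sketch below shows that arithmetic, assuming a singly charged [M+HCOOH-H]- ion in negative mode; the function names are hypothetical, and only the two m/z values are taken from the text.

```python
FORMIC_ACID = 46.005479  # monoisotopic mass of HCOOH (CH2O2), in Da
PROTON = 1.007276        # mass of a proton, in Da


def formate_adduct_mz(neutral_mass):
    """m/z of the singly charged [M+HCOOH-H]- ion of a neutral molecule."""
    return neutral_mass + FORMIC_ACID - PROTON


def neutral_mass_from_formate_adduct(mz):
    """Neutral monoisotopic mass implied by an observed [M+HCOOH-H]- m/z."""
    return mz - FORMIC_ACID + PROTON


def ppm_error(observed, theoretical):
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# m/z values quoted in the text; the adduct type is the assumption stated above.
for mz in (1314.595, 962.5):
    print(mz, round(neutral_mass_from_formate_adduct(mz), 3))
```

The same helper also makes the 2 Da offset of the dehydro forms explicit: their adduct ions are expected about 2 Da below those of the corresponding lycoperoside or esculeoside isomers.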

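[Figure 4.5, panels A and B: TOF MS ES- base peak intensity (BPI) chromatograms, both scaled to 3.41e4; x-axis: time (10 to 50 min); y-axis: relative intensity (0 to 100 %). Peak labels recoverable from the extracted figure text (retention time in min / nominal m/z): 9.83 / 341; 15.28 / 411; 19.70 / 1273; 20.26 / 625; 22.66 / 609; 23.20 / 1315; 25.90 / 1315; 26.46 / 962; 32.32 / 946; 34.49 / 1244; 37.56 / 1081; 42.02 / 1066; 44.23 / 1455.]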

Figure 4.5. Chromatograms of seeds (A) and gelatinous part (B) of JE of Ever tomato fruit samples (both 1:10 diluted).


Glycosylated hormonal metabolites, zeatin hexose and abscisic acid hexose, were also found in the fruit tissues of cultivar Ever, in particular in the JE extracts (excluding seeds). Both metabolites increased during development (up to 8-fold). The metabolite dehydrophaseic acid-hexose, belonging to the abscisic acid pathway, was found in all tissues (including the seeds) and at all developmental stages of the fruit, and was highest in the red VAR.

Metabolite pattern classification according to tissue and development

To enable grouping of metabolites according to their relative distribution over the samples, the LC-PDA-FD and LC-MS data sets from the analyses of tissues and ripening stages of the tomato variety Ever were concatenated. The combined data set was processed using the Multivariate Mass Spectra Reconstruction (MMSR) method, previously described for gas chromatography-MS data (Tikunov et al., 2005). This processing approach grouped the data set into 14 different pattern clusters. Metabolites previously identified (Tables 4.1 and 4.2; Moco et al., 2006a) fitted into 9 of these 14 clusters (Fig. 4.6). Each cluster grouped chemically diverse metabolites, indicating that the clustering was independent of the chemical class of the metabolites. Instead, the pattern analysis led to a biological classification according to tissue and ripening stage. The clusters A, B and C included metabolites that are relatively high in the EP, either present at constant levels (A), or increasing (C) or decreasing (B) upon fruit ripening. In these clusters, flavonoids, phenolic acids and alkaloids are present. The clusters D and E group metabolites that are abundant in the JE. Metabolites relatively high in the VAR are grouped in the clusters F, G and I, while metabolites most abundant in the PR are grouped in cluster H. The mass intensity signals (taken from the LC-MS data) or the quantitative values (µg/g dry weight) of each classified metabolite fit the proposed cluster patterns (see Supplementary Materials Figs. 4S.1-3).
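The idea of grouping metabolites by the shape of their abundance pattern over the samples, regardless of absolute signal level or chemical class, can be illustrated with a much simpler stand-in than MMSR. The sketch below is not the MMSR algorithm of Tikunov et al. (2005); it greedily groups profiles by Pearson correlation, and the threshold, the metabolite names and all intensities are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def group_by_pattern(profiles, threshold=0.9):
    """Greedy grouping: a metabolite joins the first cluster whose seed
    profile it correlates with at or above the threshold, otherwise it
    starts a new cluster. Returns a list of lists of metabolite names."""
    clusters = []  # each entry: (seed_profile, member_names)
    for name, profile in profiles.items():
        for seed, members in clusters:
            if pearson(seed, profile) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((profile, [name]))
    return [members for _, members in clusters]


# Hypothetical intensities over four samples: EP-G, EP-R, JE-G, JE-R.
profiles = {
    "metabolite_1": [10, 100, 1, 2],        # high in EP, increasing
    "metabolite_2": [900, 9000, 100, 150],  # same shape, other scale
    "metabolite_3": [1, 2, 50, 400],        # high in JE, increasing
}
print(group_by_pattern(profiles))
```

Because correlation ignores scale, a quantified pigment (µg/g dry weight) and an LC-MS signal intensity can fall in the same group, which is why concatenating the two data sets is meaningful for pattern analysis.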

DISCUSSION

Tissue specialization during fruit development is an intricate biological process in which transformations at the physical, chemical and biological level take place simultaneously. These transformations result in a series of dynamic modifications in the whole metabolic pathway network, leading to cell division, cell expansion and ripening. The transformation of the ovary into the mature fruit has been the target of intensive studies, covering practically all aspects of fruit development.

[Figure 4.6: nine panels (A to I), each showing the relative abundance pattern of one metabolite cluster over the samples, ordered by tissue (VAR, CP, EP, JE, PR) and, within each tissue, by ripening stage (G, B, T, P, R). Metabolite labels per panel, as far as recoverable from the extracted figure text:
A: (Lycoperoside H)FA II + IV; Coumaric acid-hexose II + IV; Kaempferol-3-rutinose; Kaempferol-hexose-deoxyhexose-pentose; Quercetin-hexose-deoxyhexose-pentose; Rutin.
B: (Lycoperoside A/B/C)FA III; (Lycoperoside H)FA I; (α-Dehydrotomatin)FA I; (α-Tomatin)FA I + II; Caffeoylquinic acid III.
C: (Dehydro(lycoperoside F/G/Esculeoside A))FA III; (Esculeoside B)FA; (Lycoperoside F/G/Esculeoside A)FA I + III; Dicaffeoylquinic acid II + III; Kaempferol-dihexose-deoxyhexose; Naringenin; Naringenin chalcone; Naringenin chalcone-hexose; Naringenin dihexose; Quercetin-dihexose-deoxyhexose; Quercetin-hexose-deoxyhexose-pentose-coumaric acid; Tricaffeoylquinic acid I + II + III; Lycopene; β-Carotene; Ascorbic acid.
D: Caffeic acid-hexose I; Tomatoside A; Zeatin hexose; γ-Tocopherol.
E: (Dehydro(lycoperoside F/G/Esculeoside A))FA II; (Iso)pentyl dihexose; Caffeoylquinic acid II.
F to I (the assignment of labels to the individual panels is not recoverable): Caffeic acid-hexose II; Dehydrophaseic acid-hexose; Ferulic acid-hexose II + III; Caffeoylquinic acid I; α-Tocopherol; Lutein; Violaxanthin; Phenylalanine; Chlorophyll b; Neoxanthin.]

Figure 4.6. Classification of metabolites according to their relative abundance (in µg/g dry weight or LC-MS signal intensity) in the fruit tissues upon ripening. Plots A to I represent the 9 clusters, out of a total of 14 recognized clusters, that contain (putatively) assigned metabolites (see Tables 4.1 and 4.2). The patterns display, from left to right, the different fruit tissues: VAR (black), CP (grey), EP (black), JE (grey) and PR (black); within each tissue, the ripening stages are displayed in the order green (G), breaker (B), turning (T), pink (P) and red (R).


In particular at the early stage of fruit set, the hormonal regulation and the interaction of different hormones have been studied extensively (Gillaspy et al., 1993; Giovannoni, 2001). The assignment of genes involved in fruit formation and function, the analysis of transcripts and proteins, and the alterations of a limited set of known metabolites during development have been documented previously (Gillaspy et al., 1993; Buta and Spaulding, 1997; Srivastava and Handa, 2005). Recently, a comprehensive study on the primary metabolism in developing tomato fruit, using a GC-TOF MS based metabolomics approach, has been performed at both metabolite and transcript levels (Carrari et al., 2006). So far, tissue differentiation during fruit formation and maturation has been monitored mostly at the morphological level and much less at the biochemical level. In the present study, we followed the (relative) abundance of a large number of metabolites present in specific fruit tissues during ripening, using LC-PDA-FD and LC-PDA-QTOF-MS based metabolomics approaches. Most of the metabolites detected by these approaches are part of the plant’s secondary metabolism and are therefore generally believed not to be involved in vital plant processes, but rather to confer resistance to biotic or abiotic stress.

The LC-MS analyses of peel and flesh tissues from several different commercial tomato cultivars, followed by principal component analysis, indicated that tissue specificity was the major factor determining the differences in metabolic profiles between samples (Fig. 4.1). These findings suggest the presence of common metabolites localized in the same tissue among all cultivars, despite the different genetic backgrounds of the fruits. Furthermore, the presence of certain metabolites specifically accumulating in particular tissues provides information about metabolite localization and tissue specificity within the tomato fruit, as general trends.

The observed loss of rigidity during ripening has been documented before and is associated with physicochemical changes occurring in the fruit cell walls, derived from modifications in the cell wall polymers, mainly due to enzymatic hydrolysis of pectin (Carrari et al., 2006; Tomassen et al., 2007). Changes in fruit size are almost negligible after the green stage, as the fruit size appears stable (Giovannoni, 2001).

The pigmentation of the tomato fruit is due to the presence of light-absorbing metabolites such as chlorophylls, carotenoids and flavonoids. The green colour of the tissues in the early stages of development is caused by the presence of chlorophylls. In tomato, both chlorophyll a and b are detected; they differ in that chlorophyll b carries an aldehyde group where chlorophyll a carries a methyl group. These two related metabolites complement each other in their photoreception capabilities, enlarging the light-absorbing spectrum. At each ripening stage the level of total chlorophylls was the highest in the VAR and the lowest in the EP. In fact, there is evidence (Smillie et al., 1999) that the chlorophylls present in all tissues of the fruit are photosynthetically active, including those in the VAR and in inner tissues such as the JE. The same authors suggested that the photosynthetic abilities of the JE might be of importance in the development and maturation of the seeds. The presence of chlorophylls in the inner fruit layers of the tomato fruit is in accordance with the transcriptional patterns of chlorophyll a/b related proteins, which were found preferentially expressed in the locular fruit tissue (Lemaire-Chamley et al., 2005).

The transformation of chloroplasts into chromoplasts during fruit ripening is paired with the degradation of chlorophylls and the production of carotenoids, in particular lycopene. Lycopene is the pigment conferring the red colour of ripe tomato fruits. Fruit-localized phytochromes regulate colour development in tomato by controlling the amount of accumulated lycopene (Alba et al., 2000). Lycopene is present in all fruit tissues, though the highest levels were detected in the EP. Carotenoids, in general, attract seed dispersers and can therefore influence the propagation of the species. The carotenoid profiles in tomato fruit during ripening are comparable to those reported in other studies (Fraser et al., 1994), although a different cultivar was used. The main carotenoids detected (neoxanthin, violaxanthin, β-carotene, lycopene and lutein) are present in all tissues of the fruit, including inner parts such as the JE.

LC-MS analysis of tomato fruit tissues can provide important information about the tissue distribution of semi-polar metabolites and their fate upon ripening of the fruit. From these analyses, it became clear that most metabolites are not equally distributed over the fruit, but show preferential accumulation in one or more specific tissues. Moreover, variation in metabolites between tissues was more pronounced than the variation induced by ripening, which was the second source of profile segregation.

At all ripening stages, the most extreme differences in metabolite composition were observed between the EP and JE. Glycosylated flavonoids, including rutin, kaempferol rutinoside and a quercetin trisaccharide (i.e. quercetin-hexose-deoxyhexose-pentose), are specifically abundant in the EP, either present at similar levels in all developmental stages or accumulating upon ripening. Deglycosylated flavonoids in particular are known for their capacity for electron transfer, as antioxidants but also as prooxidants (Awad et al., 2001; Lemanska et al., 2001). As such, flavonoids can participate as plant protective elements in both biotic and abiotic phenomena: defense against pathogens and environmental stress (drought, UV radiation, wounding) (Pourcel et al., 2007). In addition, flavonoids have been associated with auxin transport in the plant, acting as endogenous mediators of auxin flow (Besseau et al., 2007). During fruit development, high levels of naringenin chalcone are formed in the EP. Also, multiply glycosylated forms of flavonoids (dihexose-deoxyhexoses of kaempferol and quercetin) increase in concentration during development, possibly due to the increase in sugar production and therefore higher conjugation possibilities. The presence of flavonoid derivatives in the outer fruit layers of tomato is in accordance with the fact that the transcription of genes involved in flavonoid biosynthesis, including phenylalanine ammonia-lyase, flavanone 3-hydroxylase, flavonol synthase and sugar transporters, is higher in the exocarp tissues than in the inner locular tissue (Lemaire-Chamley et al., 2005). Likewise, the antioxidant ascorbic acid increases in the fruit upon ripening and is particularly high in the EP. By acting as a major electron transfer element in numerous reactions, ascorbic acid is likely involved in vital plant processes such as hormone biosynthesis (e.g. abscisic acid), detoxification of reactive oxygen species, regeneration of isoprenoid derivatives (e.g. α-tocopherol, zeaxanthin) and, consequently, photosynthetic activity and plant growth (Chen and Gallie, 2006). In agreement with our findings that ascorbic acid is relatively high in the EP, the transcript level of guanosine diphosphate-mannose pyrophosphorylase, an enzyme involved in ascorbic acid biosynthesis, is higher in the exocarp than in the locular tissue (Lemaire-Chamley et al., 2005).

The JE material, including the seeds, is relatively rich in specific semi-polar metabolites. This locular tissue surrounding the seeds is of importance to the maturation of seeds during fruit development. The presence of hormonal compounds in this tissue suggests that the cellular environment surrounding the seeds is subjected to hormonal regulation processes (Gillaspy et al., 1993). In fact, abscisic acid is involved in seed dormancy phenomena, which can explain the presence of abscisic acid pathway related metabolites such as dehydrophaseic acid-hexose and abscisic acid-hexose in the JE material. Elevated levels of abscisic acid-hexose were detected in the gelatinous part (without seeds) of the JE, as well as of the glycosylated cytokinin zeatin-hexose, which increased in concentration during fruit ripening. This is in agreement with the preferential expression of the zeatin-O-glucosyltransferase gene in the locular tissue (Lemaire-Chamley et al., 2005). The increase of glycosylated hormonal metabolites during fruit development is in accordance with the discontinued need for hormonal (non-glycosylated) triggering molecules, such as abscisic acid and zeatin, after the cell expansion of the fruit (prior to the mature green formation) (Gillaspy et al., 1993). In addition to these plant hormones, the saponin tomatoside A was specifically present in the seeds of tomato. However, no biological function has yet been assigned to this compound.

A class of compounds that exhibited marked developmental features is the glycoalkaloids. Glycoalkaloids such as tomatine are proposed to be formed through the cholesterol pathway, in which a series of modifications takes place in the steroid moiety. Only at the last step is the steroid tomatidine glycosylated into α-tomatine (Friedman, 2002). The green fruit-specific metabolites α-tomatine and dehydrotomatine could be detected in all fruit tissues, being specifically abundant in the EP, and both alkaloids strongly decreased at the first signs of fruit ripening. In contrast, several other alkaloids, also preferentially accumulating in the EP, markedly increased upon ripening: lycoperosides F, G and esculeoside A and their dehydro forms dehydrolycoperosides F, G and dehydroesculeoside A. Esculeoside A was postulated to be formed from α-tomatine, through a ring rearrangement in the steroid moiety (Fujiwara et al., 2004). The functions of these high molecular mass metabolites in the plant are still not fully understood. Glycoalkaloids are known to have antimicrobial and anti-insect properties and allelopathic activity, and to participate in plant defense mechanisms (Friedman, 2002). The antifungal activity of both α-tomatine and its aglycone tomatidine has been studied recently (Simons et al., 2006). α-Tomatine readily crosses the cell membrane; however, it has lower antifungal activity than tomatidine, suggesting that the sugar moiety is important for cell penetration while the steroid moiety confers toxicity to the fungus. In fact, upon internalization, α-tomatine is recognized by enzymes produced by the fungal pathogens that can hydrolyse one sugar unit or even the complete sugar moiety, yielding tomatidine. Based on fungal gene expression data, tomatidine (and not α-tomatine) inhibits ergosterol biosynthesis, which is a key target for chemical control of fungal pathogens of plants and animals (Simons et al., 2006).

CONCLUSION

Tomato fruit ripening involves unique biological processes and regulation of metabolic pathways taking place in specific tissues of the fruit. In the present study, an extensive biochemical characterization of different fruit tissues, along different ripening stages, was performed through LC-based metabolomics approaches biased towards secondary metabolites. Many of the metabolites detected are not uniformly distributed over the tomato fruit upon fruit ripening, but rather preferentially accumulate in specific fruit tissue(s) as well as at specific ripening stage(s). This information adds to the understanding of tissue differentiation of metabolic pathways during fruit development. Future studies, e.g. using fruits from other genotypes and natural or induced mutants, might be performed to confirm the observed tissue preference of metabolites and to give more insight into the tissue-specific regulation of metabolic pathways and the biological function of (secondary) metabolites.


SUPPLEMENTARY MATERIALS

[Figure 4S.1: one plot per metabolite; x-axis: tissue - ripening stage (VAR, CP, EP, JE and PR, each at stages G, B, T, P and R); y-axis: mass signal intensity (height).
Cluster A: Lycoperoside H II; Lycoperoside H IV; Coumaric acid-hexose II; Coumaric acid-hexose IV; Kaempferol-3-O-rutinoside; Kaempferol hexose deoxyhexose pentose; Quercetin hexose deoxyhexose pentose; Rutin.
Cluster B: Lycoperoside A/B/C III; Lycoperoside H I; Dehydrotomatine I; Tomatine I; Tomatine II; Caffeoylquinic acid III.]

Figure 4S.1 Mass signal intensities (peak height) or levels (µg/g dry weight) of (putatively) assigned metabolites (see Tables 4.1 and 4.2) from the LC-MS analyses of the fruit tissues of tomato cultivar Ever, organized in the metabolite clusters A and B, according to the obtained pattern classification (see Fig. 4.6).

[Figure 4S.2: one plot per metabolite; x-axis: tissue - ripening stage (VAR, CP, EP, JE and PR, each at stages G, B, T, P and R); y-axis: mass signal intensity (height).
Cluster C: Lycoperoside F/G / Esculeoside A I; Lycoperoside F/G / Esculeoside A III; Dehydrolycoperoside F/G / Dehydroesculeoside A III; (Esculeoside B)FA; Dicaffeoylquinic acid II; Dicaffeoylquinic acid III; Kaempferol hexose deoxyhexose pentose; Naringenin; Naringenin chalcone; Naringenin chalcone hexose; Naringenin dihexose; Quercetin dihexose deoxyhexose; Quercetin hexose deoxyhexose-pentose-coumaric acid; Tricaffeoylquinic acid I; Tricaffeoylquinic acid II; Tricaffeoylquinic acid III.]

Figure 4S.2 Mass signal intensities (peak height) or levels (µg/g dry weight) of (putatively) assigned metabolites (see Tables 4.1 and 4.2) from the LC-MS analyses of the fruit tissues of tomato cultivar Ever, organized in the metabolite cluster C, according to the obtained pattern classification (see Fig. 4.6).

Page 119: Metabolomics Technologies applied to the

119

Metabolomics analyses of tomato fruit tissues

[Figure 4S.3: bar charts, one panel per metabolite. x-axis: tissue - ripening stage (VAR, CP, EP, JE and PR, each at G, B, T, P and R). Cluster C (cont.): β-carotene, lycopene, ascorbic acid (µg/g dry weight). Cluster D: caffeic acid-hexose I, zeatin-hexose, tomatoside A (mass signal intensity, peak height), γ-tocopherol (µg/g dry weight). Cluster E: caffeoylquinic acid II, dehydrolycoperoside F/G / dehydroesculeoside A II, (iso)pentyl dihexose (mass signal intensity, peak height). Cluster F: caffeic acid-hexose II, dehydrophaseic acid-hexose, ferulic acid-hexose II, ferulic acid-hexose III (mass signal intensity, peak height).]

Figure 4S.3 Mass signal intensities (peak height) or levels (µg/g dry weight) of (putatively) assigned metabolites (see Tables 4.1 and 4.2) from the LC-MS analyses of the fruit tissues of tomato cultivar Ever, organized in the metabolite clusters C (continuation), D, E and F, according to the obtained pattern classification (see Fig. 4.6).

[Figure 4S.4: bar charts, one panel per metabolite. x-axis: tissue - ripening stage (VAR, CP, EP, JE and PR, each at G, B, T, P and R). Panels, in order: caffeoylquinic acid I (cluster label G; mass signal intensity, peak height), α-tocopherol, violaxanthin, lutein (µg/g dry weight), phenylalanine (mass signal intensity, peak height), chlorophyll b, neoxanthin (µg/g dry weight). The cluster labels H and I appear between the lutein and phenylalanine panels.]

Figure 4S.4 Mass signal intensities (peak height) or levels (µg/g dry weight) of (putatively) assigned metabolites (see Tables 4.1 and 4.2) from the LC-MS analyses of the fruit tissues of tomato cultivar Ever, organized in the metabolite clusters G, H and I, according to the obtained pattern classification (see Fig. 4.6).

Chapter 5

Building up a Comprehensive Database of Flavonoids based on Nuclear Magnetic Resonance Data

Sofia Moco, Li-Hong Tseng, Manfred Spraul, Zheng Chen, Jacques Vervoort
Chromatographia 64: 503-508 (2006)

The improvements in separation and analysis of complex mixtures by LC-NMR during the last decade have shifted its emphasis from data acquisition to data analysis. Correct data analysis requires not only high quality data sets, but also adequate software and adequate databases for semi- (or fully) automated assignments of complex molecules. Only by using NMR, when necessary in combination with MS, can the identification of molecules, as present for example in natural products, be achieved. Here we report on the ongoing efforts required for the construction of an NMR database of flavonoids, implemented for automated assignments of flavonoids. The procedure is demonstrated for a series of flavonoids.

INTRODUCTION

Flavonoids are an important class of secondary metabolites naturally occurring in plants. Flavonoids are flavone-based compounds and, following Harborne (1980), can be classified into 12 subclasses according to the oxidation level of the central pyran ring (C): anthocyanins, chalcones, aurones, flavones, flavonols, flavanones, dihydrochalcones, proanthocyanidins, catechins, flavan-3,4-diols, biflavonoids and isoflavonoids (Fig. 5.1). Phenolic compounds occur in plants primarily in a conjugated form. Common substitution patterns of flavonoids include hydroxylation, methoxylation, methylation and/or glycosylation (Harborne, 1980).

[Figure 5.1: structure drawings of the flavonoid skeleton (rings A, B and C, with position numbering 2-8 and 1'-6') and of one example per flavonoid class, panels A-L; see caption.]

Figure 5.1. Flavonoid structure (with numbering) and examples of molecular structures belonging to the different flavonoid classes: A, delphinidin (anthocyanin); B, aureusidin (aurone); C, amentoflavone (biflavonoid); D, catechin (catechin); E, naringenin chalcone (chalcone); F, phloretin (dihydrochalcone); G, flavan-3,4-diol (flavan-3,4-diol); H, naringenin (flavanone); I, apigenin (flavone); J, quercetin (flavonol); K, genistein (isoflavonoid); L, proanthocyanidin B2 (proanthocyanidin).

The interest in studying polyphenols, in particular flavonoids, lies not only in their biological role in plants, but perhaps even more in their potential health benefits for humans. A large number of observational epidemiological studies have shown that specific flavonoid-containing diets can be associated with reduced risks of specific forms of cancer or cardiovascular diseases (Ross and Kasum, 2002). Identification of the actual compound in a specific diet responsible for the claimed health effect remains an important bottleneck. The detection, isolation and characterization of low-abundance compounds from complex mixtures depend on an efficient analytical procedure to assure correct identification.

LC-MS is a fast and useful method for the profiling of metabolites present in mixtures. In combination with photodiode array (PDA) detection and MS/MS information, LC-MS can provide valuable information for identification purposes. With this combined technology, compounds can be identified by testing commercially available standards and by making use of prior biochemical knowledge about the chemical composition of the mixture, either from the literature or from dedicated databases (Moco et al., 2006a). In many situations, especially for complex molecules, LC-PDA-MS/MS may not provide sufficient selectivity for structure elucidation. In fact, possible (dia)stereoisomers and some constitutional isomers cannot be distinguished conclusively using LC-PDA-MS/MS alone. A common difficulty is the elucidation of conjugated forms of flavonoids, in which the position of the substitution and/or the nature of the substituent cannot be defined on the basis of LC-PDA-MS/MS data, meaning that additional NMR data are required. The combination of LC with MS, PDA and NMR is one of the most powerful methods to separate and structurally elucidate unknown compounds from biochemical mixtures (Wolfender et al., 2003). It combines separation over a wide range of polarities (LC) with spectrophotometric information (PDA), molecular mass value (MS) and full structural elucidation capabilities (NMR). Specifically, the recently developed LC-solid phase extraction-NMR (LC-SPE-NMR) (Exarchou et al., 2003; Exarchou et al., 2005) and capillary LC-NMR (capLC-NMR) (Krucker et al., 2004) methods improve the isolation and the efficiency of identification of compounds present in mixtures.

Given the diversity of secondary metabolites, especially concerning the variety of possible conjugations, the assignment of structures by NMR, and the often tedious analytical isolation, it is essential that reliable NMR-based metabolite databases are constructed. These can facilitate the identification procedure by removing the need to isolate large amounts of material for a full structure elucidation based on 1H-13C NMR data. In order to improve the identification of flavonoids present in complex mixtures, we are assembling a database of a wide variety of flavonoids. This database is mainly focussed on NMR data of more than 220 flavonoids, acquired under controlled experimental conditions. For all flavonoids, one- (1D) and two-dimensional (2D) NMR data sets have been obtained and the 1H and 13C resonances have been assigned. In addition, the 1D 1H NMR resonances have been fitted with the PERCH software [http://www.perchsolutions.com, (Laatikainen et al., 1996)] in order to obtain precise coupling constants. In this study, the precise chemical shift values and coupling constants obtained by applying PERCH to the 1H NMR spectra of 12 related flavonoids are shown: quercetin (1) and the derivatives quercetin-4’-O-glucoside (2) and quercetin-3-O-glucoside (3), naringenin (4) and the derivative naringenin-7-O-glucoside (5), kaempferol (6) and the derivatives kaempferol-3-O-rutinoside (7) and kaempferol-7-O-neohesperoside (8), apigenin (9), syringetin (10) and the derivatives syringetin-3-O-galactoside (11) and syringetin-3-O-glucoside (12). From the data obtained, we evaluate the possibility of identifying many flavonoids and their derivatives solely from 1H NMR chemical shifts and 1H-1H NMR coupling constant values, without the need for extensive 13C NMR data acquisition routines.

MATERIALS AND METHODS

Materials and Reagents: The flavonoids 2, 3, 5-8 and 10-12 were purchased from Extrasynthese (France), 1 from Sigma (Germany), 4 from Aldrich (Germany) and 9 from Fluka (Germany). Methanol-d4 was obtained from Deutero GmbH (Germany) and tetramethylsilane (TMS) from Merck (Germany).

Sample Preparation: A total of 3-5.5 mg of flavonoid was dissolved in 650 µL methanol-d4. The solutions were homogenized by vortexing and, for some samples, an ultrasonic bath was used to aid dissolution. From the homogenized solution, 600 µL was transferred to the NMR tube.

NMR measurements: NMR measurements were acquired at 300 K using a Bruker Avance 600 spectrometer, proton frequency 600.23 MHz, equipped with a 5 mm TXI probe. Data acquisition was controlled under ICON-NMR version 3.5.6, Bruker XWIN-NMR version 3.5 and Bruker TopSpin version 1.3 (Germany). The 1D 1H spectra were acquired with 65 K data points over a spectral width of 20.028 ppm. The following 2D experiments were recorded: COSY and TOCSY (spectral width 16.0194 ppm in both dimensions; 400 experiments in t1), J-resolved (spectral width 16.6602 ppm in t2 and 0.1302 ppm in t1; 128 experiments in t1), HSQC (spectral width 16.0194 ppm in t2 and 185.0601 ppm in t1; 400 experiments in t1) and HMBC (spectral width 16.0194 ppm in t2 and 222.3160 ppm in t1; 400 experiments in t1).
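The spectral widths above are quoted in ppm. As a minimal sketch, they can be converted to Hz with the reported proton frequency, since 1 ppm corresponds to 600.23 Hz on a 1H axis. The dictionary keys and the function name are illustrative only, and the J-resolved t1 width of 0.1302 is assumed here to be in ppm on the 1H scale; the 13C axes are left out because the carbon frequency is not stated in the text.

```python
# Proton carrier frequency reported for the spectrometer, in MHz.
PROTON_MHZ = 600.23

# Spectral widths on 1H axes, in ppm, taken from the acquisition details.
# Assumption: the J-resolved t1 width (0.1302) is expressed in ppm.
WIDTHS_PPM = {
    "1D 1H": 20.028,
    "COSY/TOCSY": 16.0194,
    "J-resolved (t2)": 16.6602,
    "J-resolved (t1)": 0.1302,
}


def ppm_to_hz(width_ppm: float, carrier_mhz: float = PROTON_MHZ) -> float:
    """Convert a width in ppm to Hz: 1 ppm equals carrier_mhz Hz."""
    return width_ppm * carrier_mhz


WIDTHS_HZ = {name: ppm_to_hz(ppm) for name, ppm in WIDTHS_PPM.items()}

if __name__ == "__main__":
    for name, hz in WIDTHS_HZ.items():
        print(f"{name}: {hz:.1f} Hz")
```

Under these assumptions the two J-resolved widths come out close to 10 kHz and 78 Hz, which are typical round settings for such an experiment.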

Data Analysis: Chemical shifts were referenced to the TMS signal (δ = 0 ppm). Bruker TopSpin version 1.3 was used for data conversion and data analysis. PERCH 2005 (Finland) (Laatikainen et al., 1996) was used for 1H NMR spectral and line shape analysis. AMIX 3.6.8 (Germany) was used for data integration.

RESULTS AND DISCUSSION

Measurement and assignment of the 1H NMR spectra of the flavonoids

Assignments of protons and carbons within each measured flavonoid (1-12) were made by data analysis of the 1D and 2D NMR spectra.

Strategy for the identification of the flavonoid and its sites of substitution: glycosylated flavonoids

As a first example of the subtle but consistent changes in the NMR characteristics of a molecule that occur on substitution, quercetin (1) and two of its glucosides, 2 and 3 (Fig. 5.2), have been studied. Compounds 2 and 3 cannot be easily distinguished by MS(/MS), as they are constitutional isomers in which a glucose moiety is placed at two different positions in the flavonoid molecule, 4’ and 3, respectively. However, 1H NMR analysis can reveal the identity of these two closely related molecules based on chemical shift data. Table 5.1 shows the 1H NMR chemical shift values of these two glycosylated quercetin derivatives (2 and 3), as well as the resonances of the aglycone quercetin (1) itself. The H6 and H8 protons of both 2 and 3 do not shift relative to 1. In contrast, the H2’, H5’ and H6’ protons of 2 shift in comparison to 1 by 0.03, 0.42 and 0.08 ppm, respectively. The large downfield shift of H5’ in 2 relative to 1 indicates a substitution at a position ortho to the H5’ proton, i.e., at C4’. The observed shift effects for 2, especially for protons located ortho to the substituting group, are consistent with other analogous examples in our enlarged database. The protons of the A- and B-rings of 3 hardly shift relative to 1 (by at most 0.05 ppm), indicating that the glucose moiety is not attached to either of these two aromatic rings. This absence of substantial chemical shift effects for the aromatic ring protons of 3 indicates that the electronic configuration of the backbone structure does not change, which suggests that the glucose moiety is attached to the 3-OH group in the C-ring.
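The comparison described for quercetin and its two glucosides can be sketched as a small calculation on the aromatic shifts of Table 5.1. The function and variable names below are illustrative and not part of any software used in this study; the sketch only reproduces the arithmetic of subtracting the aglycone shifts and picking the proton that moves most.

```python
# Aromatic 1H chemical shifts (ppm) copied from Table 5.1 for quercetin (1),
# quercetin-4'-O-glucoside (2) and quercetin-3-O-glucoside (3).
SHIFTS = {
    1: {"H6": 6.18, "H8": 6.39, "H2'": 7.73, "H5'": 6.88, "H6'": 7.63},
    2: {"H6": 6.19, "H8": 6.39, "H2'": 7.76, "H5'": 7.30, "H6'": 7.71},
    3: {"H6": 6.20, "H8": 6.39, "H2'": 7.71, "H5'": 6.87, "H6'": 7.58},
}


def shift_changes(derivative: dict, aglycone: dict) -> dict:
    """Shift of each proton in the derivative minus the aglycone value (ppm)."""
    return {p: round(derivative[p] - aglycone[p], 2) for p in aglycone}


def largest_change(changes: dict) -> tuple:
    """Return (proton, change) for the proton with the largest absolute change."""
    proton = max(changes, key=lambda p: abs(changes[p]))
    return proton, changes[proton]


if __name__ == "__main__":
    for compound in (2, 3):
        changes = shift_changes(SHIFTS[compound], SHIFTS[1])
        print(compound, changes, largest_change(changes))
```

For compound 2 the largest change falls on H5’, the proton ortho to C4’, whereas for compound 3 no aromatic proton moves by more than 0.05 ppm.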

From this database, a small selection of related glycosylated forms of the aglycones 4, 6 and 10 is shown in Fig. 5.2 (5, 7, 8, 11, 12). The 1H NMR chemical shift values of the glycosylated forms of these flavonoids and their aglycones (Table 5.1) indicate that the protons located ortho to the O-glycosylation position shift to higher ppm values by at least 0.24 ppm. The chemical shift values of the H6 and H8 protons of structurally related molecules (1, 2, 3, 6, 7, 9, 10, 11, 12), in the absence of substitution on C5 or on C7, are virtually identical. The presence of a saturated bond between C2 and C3, as in naringenin (4) and its derivatives (for example, 5), shifts the H6 and H8 protons to lower ppm values: in 4 by 0.3 and 0.5 ppm, respectively, relative to the aglycones 1 and 6 (Table 5.1). This observation indicates a change in backbone structure, as well as in electronic configuration.

[Figure 5.2: structure drawings of compounds 1-12. Substituent legend: 1, R1 = H, R2 = H; 2, R1 = H, R2 = glucose; 3, R1 = glucose, R2 = H; 4, R = H; 5, R = glucose; 6, R1 = H, R2 = H; 7, R1 = H, R2 = rutinose; 8, R1 = neohesperose, R2 = H; 9, no variable substituent; 10, R = H; 11, R = galactose; 12, R = glucose.]

Table 5.1. 1H NMR chemical shifts (ppm) of a selected group of flavonoids (see Fig. 5.2). The 1H NMR chemical shift values of the rhamnoside moiety of compounds 7 and 8 are not shown. M = methoxy. A dash marks a position for which no value is listed.

position  1     2     3     4     5     6     7     8     9     10    11    12
H2        -     -     -     5.33  5.38  -     -     -     -     -     -     -
H3a       -     -     -     2.69  2.75  -     -     -     6.59  -     -     -
H3b       -     -     -     3.11  3.17  -     -     -     -     -     -     -
H6        6.18  6.19  6.20  5.88  6.19  6.18  6.21  6.42  6.21  6.19  6.21  6.21
H8        6.39  6.39  6.39  5.89  6.21  6.39  6.41  6.74  6.46  6.42  6.42  6.42
H2’       7.73  7.76  7.71  7.31  7.32  8.08  8.06  8.12  7.85  7.57  7.52  7.57
H3’       -     -     -     6.81  6.81  6.90  6.89  6.91  6.93  -     -     -
H5’       6.88  7.30  6.87  6.81  6.81  6.90  6.89  6.91  6.93  -     -     -
H6’       7.63  7.71  7.58  7.31  7.32  8.08  8.06  8.12  7.85  7.57  7.52  7.57
H1”       -     4.91  5.25  -     4.96  -     5.13  5.18  -     -     5.47  5.40
H2”       -     3.55  3.48  -     3.45  -     3.42  3.69  -     -     3.46  3.82
H3”       -     3.51  3.42  -     3.43  -     3.33  3.63  -     -     3.45  3.57
H4”       -     3.43  3.34  -     3.38  -     3.25  3.40  -     -     3.30  3.84
H5”       -     3.48  3.22  -     3.45  -     3.41  3.54  -     -     3.25  3.48
H6a”      -     3.93  3.57  -     3.87  -     3.80  3.70  -     -     3.75  3.60
H6b”      -     3.74  3.71  -     3.68  -     3.38  3.92  -     -     3.57  3.66
H3’5’M    -     -     -     -     -     -     -     -     -     3.93  3.93  3.94

Figure 5.2 Structure of flavonoids: 1, quercetin; 2, quercetin-4’-O-glucoside; 3, quercetin-3-O-glucoside; 4, naringenin; 5, naringenin-7-O-glucoside; 6, kaempferol; 7, kaempferol-3-O-rutinoside; 8, kaempferol-7-O-neohesperoside; 9, apigenin; 10, syringetin; 11, syringetin-3-O-galactoside; 12, syringetin-3-O-glucoside.

Strategy for the identification of the flavonoid and its sites of substitution: methoxylated flavonoids

In addition to the systematic shift changes observed for glycosylated flavonoids, systematic changes are also observed on, for example, O-methylation. Our observations are consistent with the recent publications of Lambert et al. (2005), in which isoflavonoids from Smirnowia iranica were identified by LC-SPE-NMR, and of Kim et al. (2006), in which a complete set of 1H NMR assignments of flavonol derivatives was obtained. O-methylation of phenol rings of flavonols (Kim et al., 2006) or O-methylation of ortho-disubstituted phenols (Lambert et al., 2005) results in higher ppm values for protons located at the ortho, meta or para position. The O-methylation-induced chemical shifts can therefore be used for the identification of O-methylated flavonoids.

Strategy for identification of the type of glycoside in a glycosylated flavonoid

The distinction between stereoisomeric sugar moieties of equal mass cannot be readily achieved by electrospray ionization (ESI)-MS. This implies that flavonoids glycosylated with either of two of the most commonly occurring monosaccharides in plants, glucose and galactose, cannot be discriminated by LC-MS/MS, as these two moieties have identical mass. Discrimination between a glucosylated and a galactosylated flavonoid should therefore be done by NMR-based identification. We expect that identification is feasible exclusively on the basis of 1D 1H NMR data, provided that good quality spectra are obtained. However, in contrast to the identification of the flavonoid backbone from the 1H chemical shifts, as described above, the carbohydrate moiety cannot be easily identified by analysis of chemical shift values. But from careful analyses of more than 25 glucosylated flavonoid derivatives (data not shown), it can be seen that the coupling constant values in the glucopyranoside ring are very similar for identical through-bond couplings. The 3JH-H coupling constants obtained from evaluating the NMR data of the glucosylated flavonoids 2, 3, 5 and 12 using PERCH were 3J(H1”,H2”) = 7.7 ± 0.3 Hz, 3J(H2”,H3”) = 9.3 ± 0.2 Hz, 3J(H3”,H4”) = 9.0 ± 0.1 Hz and 3J(H4”,H5”) = 9.8 ± 0.1 Hz (Table 5.2). It is well known that these coupling constants depend on the dihedral angle and the substitution pattern in the sugar ring (Haasnoot et al., 1980). Almost no change in the values of the coupling constants of the different glucopyranosides studied was observed, implying that all these sugar moieties adopt the same (minimum-energy) conformation in the solvent used (methanol-d4). This is an encouraging observation, as it indicates that the type of sugar moiety present in a flavonoid molecule can be obtained from detailed analysis of its 1H NMR splitting patterns. In our opinion, the software tool PERCH (Laatikainen et al., 1996) best suits the analysis of complex 1H NMR splitting patterns, even in the presence of strong coupling and overlapping signals.

Table 5.2. 1H NMR coupling constants (Hz) of a selected group of flavonoids (see Fig. 5.2). The 1H NMR coupling constant values of the rhamnoside moiety of compounds 7 and 8 are not shown. A dash marks a coupling for which no value is listed.

Coupling        1      2      3      4      5      6      7      8      9      10     11     12
4J(H6,H8)       2.1    2.1    2.1    2.2    2.3    2.1    2.1    2.1    2.1    2.0    2.1    2.1
3J(H2’,H3’)     -      -      -      8.4    8.4    8.8    8.7    8.8    8.7    -      -      -
3J(H5’,H6’)     8.5    8.7    8.5    8.4    8.4    8.8    8.7    8.8    8.7    -      -      -
4J(H2’,H6’)     2.2    2.2    2.2    2.6    2.5    2.6    2.4    2.5    2.5    2.0    2.0    2.0
4J(H3’,H5’)     -      -      -      2.4    2.4    2.7    2.5    2.5    2.4    -      -      -
3J(H1”,H2”)     -      7.8    7.9    -      8.0    -      7.4    7.8    -      -      7.8    7.7
3J(H2”,H3”)     -      9.4    9.3    -      9.5    -      9.2    9.2    -      -      9.6    9.3
3J(H3”,H4”)     -      9.0    9.0    -      8.9    -      9.0    9.2    -      -      3.4    8.9
3J(H4”,H5”)     -      9.8    9.8    -      9.8    -      9.9    9.8    -      -      1.1    9.8
3J(H5”,H6a”)    -      2.3    2.4    -      2.0    -      1.8    2.3    -      -      6.5    2.3
3J(H5”,H6b”)    -      5.7    5.4    -      5.0    -      6.1    5.9    -      -      5.6    5.6
2J(H6a”,H6b”)   -      -12.1  -11.9  -      -12.1  -      -11.2  -12.3  -      -      -11.3  -11.9
2J(H3b,H3a)     -      -      -      -17.1  -17.2  -      -      -      -      -      -      -
3J(H3b,H2)      -      -      -      13.0   12.9   -      -      -      -      -      -      -
3J(H3a,H2)      -      -      -      3.0    3.1    -      -      -      -      -      -      -

As an example of two closely related glycosylated molecules with the same molecular formula (which therefore cannot be discriminated by MS, as both have the same molecular mass, 508 Da), the sugar regions of the 1H NMR spectra of 11 and 12 are shown (Fig. 5.3). The complexity of the splitting patterns is due to partially overlapping resonances and to the presence of strong coupling effects. Nevertheless, when fitting the NMR spectra using the PERCH software, all coupling constants are adequately obtained (Fig. 5.3) and the difference between a glucopyranose and a galactopyranose can be readily observed. In our fitting procedure, a peak-top interpolation is first performed, including a line-shape analysis, followed by deconvolution with total-line-shape fitting. When the coupling constants are fixed to specific preset values, as discussed above, it is possible to obtain, in an iterative manner, a convergent fit of the observed 1H NMR spectrum, resulting in the 1H NMR chemical shift values for the glucopyranose or galactopyranose moieties. When H2” and H3” are exchanged in Fig. 5.3D (12), for example, the PERCH fit does not adequately converge, indicating that the proton resonances have not been correctly assigned.
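The discrimination between a glucopyranoside and a galactopyranoside rests mainly on the ring couplings 3J(H3”,H4”) and 3J(H4”,H5”), which are large in the former and small in the latter (Table 5.2). The sketch below illustrates this with a simple nearest-reference rule. It is not the PERCH fitting procedure used in this study: the reference vectors, the Euclidean distance and the function name are assumptions made for illustration, and the galactopyranoside reference relies on a single compound (11).

```python
import math

# Reference 3J(H,H) values in Hz, ordered as
# (H1"-H2", H2"-H3", H3"-H4", H4"-H5").
# Glucopyranoside: mean values quoted in the text for compounds 2, 3, 5 and 12.
# Galactopyranoside: the single example in Table 5.2 (compound 11).
REFERENCE_J = {
    "glucopyranoside": (7.7, 9.3, 9.0, 9.8),
    "galactopyranoside": (7.8, 9.6, 3.4, 1.1),
}


def classify_sugar(observed_j: tuple) -> str:
    """Return the reference sugar whose couplings lie closest (Euclidean)."""
    return min(
        REFERENCE_J,
        key=lambda name: math.dist(observed_j, REFERENCE_J[name]),
    )


if __name__ == "__main__":
    # Couplings listed in Table 5.2 for compounds 7 and 8.
    print(classify_sugar((7.4, 9.2, 9.0, 9.9)))
    print(classify_sugar((7.8, 9.2, 9.2, 9.8)))
```

A rule of this kind presupposes that the coupling constants have already been extracted reliably, which for overlapping and strongly coupled sugar resonances is exactly what the total-line-shape fit provides.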

[Figure 5.3: four 1H NMR traces (A-D) of the sugar region, with the resonances H2”, H3”, H4”, H5”, H6a” and H6b” labelled; the ppm axes span 3.8-3.5 and 3.7-3.3.]

Figure 5.3. Spectra of syringetin-3-O-galactoside (A) and syringetin-3-O-glucoside (C) fitted using PERCH with preset coupling constants for the different sugar pyranoside rings, adjusting the 1H resonances of H2”, H3”, H4”, H5”, H6a” and H6b” to the optimal chemical shift position. The sugar regions of the measured 1H NMR spectra of syringetin-3-O-galactoside (B) and syringetin-3-O-glucoside (D) are depicted. Note the solvent resonance of methanol (3.30 ppm), still visible in D, which in B is simulated using PERCH.

Applicability of fitting strategies for identification purposes

Identification of conjugations present in the flavonoid backbone using the procedure described requires some form of purification, either by LC-NMR or, preferably, by LC-SPE-NMR (Wolfender et al., 2003). Peaks separated on an LC column can contain more than one compound but, as first shown by Exarchou et al. (2003), even with two or three compounds trapped on the same SPE cartridge, the NMR spectra can be of the required quality. Subsequent fitting of the data, as proposed in this study, can result in the correct identification, thereby avoiding time-consuming 1H-13C HSQC or 1H-13C HMBC data acquisition and analysis. We expect that, when using LC-SPE-NMR with cryoprobes with a flow insert, flavonoids can be identified in the 500-800 ng range in a 30-minute 1H NMR run. This amount of material is much less than is needed when one has to rely on identification based on 2D 1H-13C HSQC and 1H-13C HMBC data (Exarchou et al., 2003).

Building up an NMR database of flavonoids

The incorporation of the flavonoid 1H NMR chemical shifts into a searchable database will provide predictive 1H NMR values for conjugated flavonoids, facilitating identification. Moreover, the small but systematic changes observed can be of value for theoretical chemists who are developing programs for the prediction of 1H NMR chemical shift values (Abraham et al., 2005). Over the last decade, these theoretical prediction programs have steadily improved in performance but, as yet, they are not able to predict the 1H NMR chemical shift values for flavonoids or related complicated biological molecules with the precision needed for identification based solely on 1H NMR chemical shift values. All our data are being incorporated into AMIX (MOL files, 1H and 13C chemical shifts) and will be published in more detail in the near future. The use of the database in complex mixture analysis will be demonstrated.
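How a searchable table of 1H shifts could support identification can be illustrated with a minimal lookup sketch. This is a hypothetical example and does not describe AMIX or the actual database: the three reference entries are the quercetin rows of Table 5.1, and the scoring by root-mean-square deviation and all names are assumptions.

```python
import math

# Aromatic 1H shifts (ppm) from Table 5.1, keyed by compound name.
REFERENCE_SHIFTS = {
    "quercetin": {
        "H6": 6.18, "H8": 6.39, "H2'": 7.73, "H5'": 6.88, "H6'": 7.63},
    "quercetin-4'-O-glucoside": {
        "H6": 6.19, "H8": 6.39, "H2'": 7.76, "H5'": 7.30, "H6'": 7.71},
    "quercetin-3-O-glucoside": {
        "H6": 6.20, "H8": 6.39, "H2'": 7.71, "H5'": 6.87, "H6'": 7.58},
}


def rmsd(observed: dict, reference: dict) -> float:
    """Root-mean-square shift deviation over the protons both sets share."""
    shared = observed.keys() & reference.keys()
    if not shared:
        return math.inf
    total = sum((observed[p] - reference[p]) ** 2 for p in shared)
    return math.sqrt(total / len(shared))


def rank_candidates(observed: dict, library: dict = REFERENCE_SHIFTS) -> list:
    """Return (rmsd, name) pairs sorted from best to worst match."""
    return sorted((rmsd(observed, ref), name) for name, ref in library.items())


if __name__ == "__main__":
    # Invented query values, chosen to lie near the 4'-O-glucoside entry.
    query = {"H6": 6.19, "H8": 6.40, "H2'": 7.75, "H5'": 7.29, "H6'": 7.70}
    for score, name in rank_candidates(query):
        print(f"{name}: {score:.3f} ppm")
```

A lookup of this kind only ranks entries that are already in the table; predicting shifts for conjugates that have not been measured would require the substituent effects discussed above.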

CONCLUSION

The 1H NMR chemical shift values of flavonoids show systematic differences depending on the substitution pattern. These systematic differences apply to the protons of the backbone moiety. The resonances of the sugar moieties in glycosylated flavonoids are too scattered to be of use in direct identification. However, as we have observed, the coupling constants are extremely predictable and, by fitting the 1H resonances with preset coupling constants in PERCH, the types of sugar, for example galactopyranoside and glucopyranoside, can be readily discriminated. As the 1H NMR spectrum of glycosylated flavonoids is crowded with resonances in the 2.8-4.0 ppm region (often with strong coupling effects), it is essential to isolate the flavonoid under study to a reasonable degree of homogeneity, preferably through LC-SPE-NMR.

Chapter 6

Push-button flavonoid identification: A database of NMR data integrated with a 1H NMR predictive model

Sofia Moco, Li-Hong Tseng, Frank van Zimmeren, Zheng Chen, Matthias Niemitz, Reino Laatikainen, Manfred Spraul, Jacques Vervoort

Flavonoids are a class of natural compounds produced essentially by plants and are part of the diets of animals and humans. Polyphenols, in particular flavonoids, have assumed health-promoting benefits. For the identification of bio-organic molecules such as flavonoids from complex mixtures, the combination of spectroscopic information from mass spectrometry (MS) and nuclear magnetic resonance (NMR) has been shown to be quite powerful. In principle, two-dimensional (2D) NMR spectra are essential to achieve complete compound identification, including structure elucidation.

In theory, 1H NMR data can direct the identification of bio-organic molecules, avoiding the acquisition of 2D NMR data sets. Based on a database of detailed NMR data analyses of over 200 flavonoids, we show that the acquisition of 1H NMR data alone can guide the identification process. In order to guarantee the robustness of this approach, NMR spectra were acquired under controlled conditions, followed by a fitting procedure by means of a prediction model (PERCH).

From the 1H NMR analyses of related flavonoids, substituent and modification effects on the flavonoid backbone (hydroxylation, methoxylation, methylation, glycosylation, unsaturation on the 2 position and presence of a cyano group on the 4 position) were compared and related to shifts in the 1H NMR spectrum. The observed shifts can be linked to the presence of certain chemical features, which allows direct metabolite identification through 1H NMR analysis.

INTRODUCTION

Flavonoids are a class of compounds with diverse functional roles and significant effects on plants and animals. They are almost exclusively biosynthesized in plants and are involved in many biologically relevant functions. Flavonoids can act as signalling molecules in the attraction of pollinators by conferring colours on flowers and fruits, they can influence the flux of hormonal molecules in the plant, they can confer on the plant discriminative power towards certain pathogens in the rhizosphere, and they can promote pollen-tube germination (Taylor and Grotewold, 2005).

Nevertheless, flavonoids are probably best known for their antioxidant activity and assumed health-promoting benefits, such as anti-proliferative and anti-tumour behaviour. The consumption of flavonoid-containing food products (e.g. vegetables, fruit, soybean, olive oil) and drinks (e.g. tea, wine) (USDA database for the flavonoid content of selected foods, 2003) has been the subject of numerous reflections about dietary habits and longevity. However, it has also been demonstrated that flavonoids can have other, “non-health promoting” effects: isoflavones were shown to exhibit estrogenic activity, while catechins and flavonols were able to bind to DNA and RNA (Piersen, 2003; van der Woude et al., 2005; Kuzuhara et al., 2006; Henley et al., 2007).

Major developments in biochemical fields are being pursued in order to understand the biological activities of flavonoids in plants and animals. The biological phenomena in which flavonoids play a role depend on the organism and its physicochemical environment. Flavonoids can act as ligands that bind to receptor proteins, influencing physiological processes. For example, soy is a rich source of genistein, which targets the proteasome; the inhibition of proteasome activity has been related to the cancer-preventive properties of genistein (Kazi et al., 2003). Thus, the chemical nature of the flavonoids and their binding interactions with different types of receptors are important for understanding their biological effects. A deeper chemical knowledge of flavonoids can therefore lead to a better understanding of their biological roles and consequences.

About 4,000 flavonoids were reported to have been identified from natural sources (Iwashina, 2000) but, given the known diversity of the plant kingdom, many more are bound to exist. This large number of flavonoids is a consequence of modifications or conjugations of the flavonoid backbone (e.g. methylation, acylation, sulphation, glycosylation, glucuronidation, etc.). These modifications or conjugations have an impact on the physicochemical characteristics of the flavonoids, such as solubility, receptor-binding abilities and antioxidant behaviour. Flavonoids have been classified into different chemical classes, according to their chemical features (Harborne, 1980).

The detection, isolation and elucidation of flavonoids are essential steps in the chemical characterization of these molecules, prior to their use in biochemical or toxicological studies. In most cases, flavonoids have to be administered or handled in their pure form in order to study their biological effect. This implies that sufficient amounts of naturally occurring flavonoids have to be isolated from plants. The isolation of natural compounds from plants can be laborious, involving several analytical steps, ranging from sample preparation to extraction, analytical detection, preparative isolation and purification, all without chemical degradation along the procedure. Flavonoids have been detected and isolated through a variety of analytical methods, used separately, sequentially or hyphenated: thin layer chromatography, gas chromatography, photodiode array detection, liquid chromatography (LC), MS and NMR (de Rijke et al., 2006). As with most other bio-organic molecules, however, structure elucidation is usually achieved by NMR spectroscopy.

For most molecules, structural elucidation is achieved by performing direct one-dimensional (1D) NMR measurements on highly abundant NMR-active nuclei (e.g. 1H), in addition to 2D (or multi-dimensional) spectra for the detection of less abundant nuclei (e.g. 13C). For small organic molecules, these measurements therefore include 2D homonuclear 1H-1H correlated experiments (COSY (correlation spectroscopy) and TOCSY (total correlation spectroscopy)), as well as heteronuclear 1H-13C correlated experiments (heteronuclear multiple bond correlation (HMBC) and heteronuclear single quantum coherence (HSQC)). The structure elucidation of molecules by NMR can be a challenging and puzzling task that requires experience and expertise. The process can benefit from MS information on the molecule under study: molecular mass, MS/MS fragments and expected molecular formulae. In particular, the high mass accuracy over a wide dynamic range provided by some MS instruments can be very useful in obtaining molecular formulae from accurate mass values (Kind and Fiehn, 2006; Moco et al., 2006a). Nevertheless, the full elucidation of unknown molecules is hardly achieved by MS alone. The combination of MS and NMR is therefore probably the most informative pairing of analytical techniques for the structure elucidation and identification of metabolites from complex mixtures (Exarchou et al., 2003).
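As a minimal sketch of how an accurate mass can narrow down candidate molecular formulae, the Python fragment below computes monoisotopic masses and ranks candidates by their mass error in ppm. The measured value, the 5 ppm window, the second candidate formula and all function names are illustrative assumptions, not values or tools from this study. Neutral masses are used for simplicity, whereas a real MS measurement would refer to an ion.

```python
# Monoisotopic masses (u) of the most abundant isotopes of C, H and O.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def monoisotopic_mass(composition):
    """Sum isotope masses for a composition such as {"C": 27, "H": 30, "O": 16}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def rank_formulae(measured, candidates, max_ppm=5.0):
    """Return candidate formulae within max_ppm, smallest absolute error first."""
    hits = []
    for formula, composition in candidates.items():
        error = ppm_error(measured, monoisotopic_mass(composition))
        if abs(error) <= max_ppm:
            hits.append((abs(error), formula))
    return [formula for _, formula in sorted(hits)]


# Two candidate formulae with the same nominal mass (610).
candidates = {
    "C27H30O16": {"C": 27, "H": 30, "O": 16},
    "C28H34O15": {"C": 28, "H": 34, "O": 15},
}
measured_mass = 610.1530  # hypothetical accurate mass of the neutral molecule
print(rank_formulae(measured_mass, candidates))
```

Even when a single formula survives the mass filter, isomers remain indistinguishable at this stage, which is where the NMR information comes in.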

NMR is the most selective technique for molecular recognition and offers a quantitative response which makes it an advantageous choice for molecular studies.

However, NMR suffers from a relatively low sensitivity compared to other analytical techniques such as MS. Improving NMR sensitivity is one of the priorities in NMR technological development (Exarchou et al., 2005; Kovacs et al., 2005), opening the application of this technique to a broader analytical range. An approach to avoid intensive analytical preparations, time-consuming 2D NMR measurements and demanding spectral interpretation is to make use of databases containing experimental NMR spectra. Thus, 1H NMR spectra of (unknown) molecules can be matched to the (already assigned and characterized) spectra present in these NMR databases. The Human Metabolome Database (http://www.hmdb.ca/), the Spectral Database for Organic Compounds SDBS (http://www.aist.go.jp/RIODB/SDBS/cgi-bin/cre_index.cgi), the NMRShiftDB (http://nmrshiftdb.ice.mpg.de/) and the Biological Magnetic Resonance Bank (http://www.bmrb.wisc.edu/) are publicly available databases that contain NMR spectra of metabolites. Even though there is a wealth of NMR spectral information, small differences in chemical shift values between spectra of the same molecule make unambiguous assignments difficult in the absence of tools that help to match 1H NMR data sets. Several considerations have to be taken into account when comparing experimental NMR data. The experimental conditions of the measurements (solvent, pH, temperature, pulse sequence, magnetic field strength) should be analogous for a reliable spectral comparison. In fact, differences in experimental conditions can lead to shifts in NMR properties that are difficult to foresee and interpret. This is particularly the case for 1H NMR, as protons are known to be sensitive to their chemical environment. This chemical sensitivity is the basis for the advantageous selectivity of 1H NMR, but makes 1H NMR prediction a delicate task. The prediction of 1H NMR spectra can help in testing possible alternatives for the elucidation of unknown molecules as well as in verifying elucidations.
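The database-matching idea can be illustrated with a small Python sketch that compares a list of observed 1H chemical shifts with reference lists and keeps the entries whose mean deviation stays within a tolerance. The compound labels, shift values and the 0.02 ppm tolerance are made-up placeholders, and the pairing of sorted shift lists is a deliberately naive assumption; none of this reproduces the matching logic of the databases named above.

```python
def mean_shift_deviation(observed, reference):
    """Mean |delta| in ppm after pairing the sorted shift lists.

    Returns None when the two lists contain a different number of signals.
    """
    if len(observed) != len(reference):
        return None
    pairs = zip(sorted(observed), sorted(reference))
    return sum(abs(o - r) for o, r in pairs) / len(observed)


def rank_candidates(observed, library, tolerance_ppm=0.02):
    """Return library entries within tolerance, best match first."""
    hits = []
    for name, reference in library.items():
        deviation = mean_shift_deviation(observed, reference)
        if deviation is not None and deviation <= tolerance_ppm:
            hits.append((deviation, name))
    return [name for _, name in sorted(hits)]


# Placeholder reference shifts (ppm); not taken from any real database.
library = {
    "compound_A": [6.20, 6.40, 7.65],
    "compound_B": [6.21, 6.90, 7.65],
    "compound_C": [6.20, 6.40],
}
observed_shifts = [7.66, 6.21, 6.39]
print(rank_candidates(observed_shifts, library))
```

A tolerance this tight is only meaningful when the reference and the unknown were measured under analogous conditions, which is the point made in the paragraph above.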

The most advanced methods used for the prediction of NMR properties are based on ab initio calculations or density functional theory methods. These strategies make use of molecular mechanics calculations based on fundamental quantum chemistry principles in order to predict chemical shifts and coupling constants (Helgaker et al., 1999; Abraham et al., 2005; Bagno et al., 2006). However, these methods can be computationally intensive and are often feasible only on a small scale. Studies on the prediction of 13C NMR chemical shifts for flavonoids have been performed before (Biekofsky et al., 1991; Burns et al., 2007), though not for 1H NMR. Reliable 1H NMR models are more difficult to implement, due to the chemical sensitivity of the protons. However, a robust 1H NMR prediction tool can be more advantageous than a 13C NMR prediction tool in the identification of metabolites.

In this study, a large number of NMR spectra were acquired for (mostly) naturally occurring phenolic compounds. By covering a large collection of related molecules, measured under the same experimental conditions, spectral comparison became feasible. Hence, the effects of substitutions and conjugations on 1H chemical shifts were investigated, which allowed a better understanding of the origin of proton shifts due to molecular chemical features. All the molecules were modelled in the 1H NMR prediction software PERCH, which enabled the accurate extraction of NMR properties (chemical shifts and coupling constants) and the confirmation of the assignments. The prediction of 1H NMR spectra is a useful tool for the identification of metabolites: it avoids major efforts in obtaining carbon information through 13C NMR acquisition and aids spectroscopists in the interpretation and assignment of (often complex) NMR spectra to the correct molecular structure.

MATERIALS AND METHODS

Materials and Reagents: The standard compounds (Table 6.1): F-227 and F-244 were purchased from Acros (New Jersey, USA), F-2, F-3, F-5, F-6, F-11, F-12, F-22, F-23, F-27, F-30, F-31-34, F-36, F-38, F-40, F-41, F-155, F-157, F-159, F-163, F-245, F-246, F-248 and F-249 from Aldrich (Steinheim, Germany and Milwaukee, WI, USA), F-168-200, F-220-226 and F-229-234 from Apin (Oxon, United Kingdom), F-16, F-42-152, F-154, F-156, F-202, F-203, F-215, F-218 and F-237 from Extrasynthese (Genay, France), F-153, F-164-167 and F-210-213 from Fluka (Steinheim, Germany), F-20 and F-228 from ICN (Ohio, USA), F-4, F-7-10, F-13-15, F-17-19, F-21, F-25 and F-236 from Indofine (Hillsborough, NJ, USA), F-242 and F-247 from Janssen (Beerse, Belgium), F-241 and F-243 from Merck (Darmstadt, Germany) and F-1, F-24, F-26, F-28, F-29, F-35, F-37, F-39, F-158, F-160-162, F-201, F-204-206, F-207-209, F-214, F-216, F-217, F-219, F-235, F-238-240 and F-250 from Sigma (St. Louis, USA). The methanol-d4 was obtained from Deutero (Kastellaun, Germany) and tetramethylsilane (TMS) from Merck (Darmstadt, Germany).

Sample Preparation: Two hundred and fifty standard compounds were prepared for NMR analysis in eight batches of experiments, from April 2004 to July 2006. A total of 1-7 mg of compound was dissolved in 650 µL methanol-d4 (C²H₃O²H) in order to achieve a high concentration. The solutions were homogenized by vortexing and, for some samples, an ultrasonic bath was used to aid dissolution. From the homogenized solution, 600 µL was transferred into the NMR tube for analysis. To some of the compounds, water-d2 (²H₂O) or chloroform-d (C²HCl₃) was added up to a maximum of 50% (v/v) to promote sufficient solubility. All samples were prepared just before analysis. Some of the solutions were prepared a second time and re-analysed because degradation over time was evident in the NMR spectra.

NMR measurements and processing: NMR measurements were acquired at 300 K using a Bruker Avance 600 spectrometer (proton frequency 600.23 MHz) equipped with a 5 mm TXI probe and an Automatic Sample Changer autosampler for 60 samples. The instrument was set up before each series of analyses, so that these could be acquired under automatic control. Data acquisition and processing were controlled with ICON-NMR version 3.5.6, Bruker XWIN-NMR version 3.5 and Bruker TopSpin versions 1.3 and 2.0 (Germany). For each sample, zg, NOESY, COSY, TOCSY, JRes, HMBC and HSQC spectra were measured, taking less than 5 h of acquisition. Two 1D 1H spectra were acquired, a NOESY and a zg with pre-saturation, with 65k data points over a spectral width of 20.043 ppm, 2.73 s of acquisition time and 6 s of pre-saturation delay. The Fourier transformation was processed with an exponential multiplication. Three 2D 1H-1H measurements were acquired: COSY, TOCSY and JRes (J-resolved). The COSY and TOCSY measurements were performed with 0.21 s acquisition time and a spectral width of 16.0194 ppm in both dimensions, 4k (COSY) and 2k (TOCSY) data points in t2 and 400 increments in t1. The Fourier transformation was processed with a sine function. The JRes was performed with 0.41 s of acquisition time, a spectral width of 16.6602 ppm in t2 and 0.1302 ppm in t1, 8k data points in t2 and 128 increments in t1. 13C information was acquired through the following 2D 1H-13C measurements: HSQC and HMBC (0.11 s acquisition time, spectral width of 16.0194 ppm in t2 and 185.0601 ppm in t1, 2k data points in t2 and 400 increments in t1). All the data sets were corrected for phase (if needed), calibrated on the chemical shift axis to the TMS resonance signal (δ = 0 ppm) and baseline corrected.
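The acquisition times quoted above follow from the number of acquired points and the spectral width. The short Python sketch below reproduces that arithmetic under two assumptions that are not stated in the text: that "65k" and "8k" mean 65536 and 8192 points, and that the point count includes real and imaginary points, so that the acquisition time is the point count divided by twice the spectral width in Hz (the convention generally used on Bruker instruments). The function name is hypothetical.

```python
def acquisition_time_s(td_points, spectral_width_ppm, proton_mhz):
    """Acquisition time in seconds, assuming AQ = TD / (2 * SW[Hz]).

    td_points counts real plus imaginary points (assumed convention);
    the spectral width in Hz is the width in ppm times the proton frequency in MHz.
    """
    spectral_width_hz = spectral_width_ppm * proton_mhz
    return td_points / (2.0 * spectral_width_hz)


PROTON_MHZ = 600.23  # spectrometer proton frequency given in the text

# 1D 1H: 65k points over 20.043 ppm (text reports 2.73 s)
aq_1d = acquisition_time_s(65536, 20.043, PROTON_MHZ)
# JRes direct dimension: 8k points over 16.6602 ppm (text reports 0.41 s)
aq_jres = acquisition_time_s(8192, 16.6602, PROTON_MHZ)

print(f"1D 1H: {aq_1d:.2f} s, JRes: {aq_jres:.2f} s")
```

Under these assumptions the computed values come out close to the reported acquisition times, which is a quick sanity check on a transcribed parameter set.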

Flavonoid Database build-up: Common and IUPAC (International Union of Pure and Applied Chemistry) names were searched and confirmed in chemical databases: PubChem (http://pubchem.ncbi.nlm.nih.gov) and SciFinder Scholar (American Chemical Society, USA). The IUPAC nomenclature followed that given in PubChem. The molecular mass of each molecule was calculated from the atomic masses documented by the IUPAC (de Laeter, 2003). Molecular structures were drawn using ChemDraw Ultra 10.0 (1986-2005, CambridgeSoft, USA). The database was implemented in PHP and runs on a Linux cluster. In this database, molecule information (names, CAS (Chemical Abstracts Service) number, EINECS (European Inventory of Existing Chemical Substances) number, molecular formulae, molecular masses) and NMR assignments (1H and 13C) were included and can be viewed.
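The molecular mass calculation can be sketched in a few lines of Python. The parser below handles only simple formulae without brackets or charges, and the function name and the restriction to four elements are assumptions made for this illustration; it is not the code used to build the database. The atomic masses are standard atomic weights for C, H, N and O.

```python
import re

# Standard atomic weights (u) for the elements needed here.
ATOMIC_MASS = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994}


def molecular_mass(formula):
    """Average molecular mass of a simple formula such as 'C27H30O16'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


print(f"{molecular_mass('C27H30O16'):.2f}")  # formula C27H30O16 (rutin)
```

For C27H30O16 the result agrees, to within rounding, with the molecular mass of 610.517 quoted for rutin later in this chapter.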

Data Analysis: PERCH 2005 (PERCH Solutions Ltd, Kuopio, Finland) was used for 1H and 13C NMR spectral and line shape analysis (Laatikainen et al., 1996). Spectra were directly imported from TopSpin (Bruker) into PERCH. The three-dimensional (3D) structures of molecules were edited and minimized by a molecular mechanics tool within the module Molecular Modelling Software (MMS) of PERCH. Solvent effects are integrated within this minimization. A preliminary spectral parameter prediction was obtained. This simulation was iteratively modified, making use of the experimental spectrum. Thus, a fine approximation of the predicted spectrum to the experimental spectrum was obtained for all analysed compounds.

RESULTS AND DISCUSSION

Measurement of standard compounds by NMR

A total of 250 standard compounds were analysed by NMR spectroscopy using 1H and 1H-13C measurements. The following spectra were measured for each standard compound in order to obtain a wide NMR characterization: 1D-NOESY, COSY, TOCSY, JRes, HSQC and HMBC. The common name (for reasons of compactness) and internal identifier of the measured compounds are listed in Table 6.1. All compounds were measured under the same experimental conditions, so that they could be consistently compared. The solvent chosen was methanol-d4 and the temperature of the measurements was set to 300 K. However, some of the compounds showed poor solubility in methanol-d4. For the less polar compounds, as judged from their molecular structures, chloroform-d was added to methanol-d4 (up to 50% (v/v)): F-42, F-108, F-114, F-128, F-136, F-30, F-204, F-31, F-146, F-169, F-180 and F-206. In an analogous way, the more polar compounds with poor solubility in methanol-d4 were solubilised in methanol-d4/water-d2 (85%/15% (v/v)): F-55, F-172, F-205 and F-103.

Table 6.1. Common names and identifiers (F-i, i = 1, …, n, n = 250) of compounds measured by NMR.

| F-i | Common name | F-i | Common name | F-i | Common name | F-i | Common name |
|---|---|---|---|---|---|---|---|
| F-1 | Flavanone | F-64 | 4-Deoxyphloridzin | F-127 | 4’,6,7-Trimethoxyisoflavone | F-190 | 3,2’-Dihydroxyflavone |
| F-2 | 6-Hydroxyflavanone | F-65 | 3’,4’,5’,5,7-Pentamethoxyflavone | F-128 | 3,5,7-Trihydroxy-3’,4’,5’-trimethoxyflavone | F-191 | 3,6-Dimethoxyflavone |
| F-3 | 3-Hydroxyflavone | F-66 | Chrysoeriol | F-129 | Kaempferol-3-O-glucoside | F-192 | 6,4’-Dihydroxyflavone |
| F-4 | 5-Hydroxyflavone | F-67 | 3’,4’,5’,5,6,7-Hexamethoxyflavone | F-130 | Quercetin-3-O-beta-D-glucopyranosyl-6’’-acetate | F-193 | 8-Carboxy-3-methylflavone |
| F-5 | 6-Hydroxyflavone | F-68 | 2’,6’-Dihydroxy-4,4’-dimethoxychalcone | F-131 | Datiscoside | F-194 | 3,7-Dimethoxyflavone |
| F-6 | 7-Hydroxyflavone | F-69 | Luteolin-4’-O-glucoside | F-132 | Homoeriodictyol | F-195 | Texasin |
| F-7 | 3’,4’-Dihydroxyflavone | F-70 | Eriodictyol-7-O-glucoside | F-133 | Tiliroside | F-196 | Tamarixetin-7-rutinoside |
| F-8 | 5,4’-Dihydroxyflavone | F-71 | Homobutein | F-134 | Karanjin | F-197 | Aromadendrin |
| F-9 | 7,3’-Dihydroxyflavone | F-72 | 6,7-Dihydroxyflavone | F-135 | Vitexin | F-198 | Ampelopsin |
| F-10 | 7,4’-Dihydroxyflavone | F-73 | 6-Methoxyluteolin | F-136 | Tectochrysin | F-199 | Capillarisin |
| F-11 | Flavone | F-74 | Baicalein-7-methylether | F-137 | Vitexin-2’’-O-rhamnoside | F-200 | 3,4’-Dihydroxyflavone |
| F-12 | Galangin | F-75 | Luteolin tetramethylether | F-138 | Quercetagetin | F-201 | Hesperidin |
| F-13 | Gossypin | F-76 | Orientin | F-139 | 2,3-Dimethoxy-2’-hydroxychalcone | F-202 | 2’,4’,6’-Dihydroxy-4-methoxychalcone-4’-O-neohesperidoside |
| F-14 | 4’-Hydroxyflavone | F-77 | Isorhamnetin-3-O-glucoside | F-140 | 3’,4’-Dimethoxyflavone | F-203 | Bavachinin A |
| F-15 | Isoquercitrin | F-78 | Myricitrin | F-141 | Baicalein-5,6,7-trimethylether | F-204 | alpha-Naphthoflavone |
| F-16 | Isorhamnetin | F-79 | 2’,6’-Dihydroxy-4,4’-dimethoxydihydrochalcone | F-142 | Cupressuflavone | F-205 | Diosmin |
| F-17 | Quercitrin | F-80 | Diosmetin | F-143 | Chrysin dimethylether | F-206 | Acacetin |
| F-18 | Rhoifolin | F-81 | Linarin | F-144 | 5,7-Dimethoxyflavanone | F-207 | Hesperetin |
| F-19 | 3,3’,4’-Trihydroxyflavone | F-82 | Gossypetin-3,3’,4’,7-tetramethylether | F-145 | Pinocembrin-7-methylether | F-208 | beta-Naphthoflavone |
| F-20 | Chrysin | F-83 | 4’-Hydroxychalcone | F-146 | 5,7-Dihydroxy-3’,4’,5’-trimethoxyflavone | F-209 | Pinocembrine |
| F-21 | Resokaempferol | F-84 | Homoorientin | F-147 | 2’,4’,6’,3,4-Pentahydroxychalcone | F-210 | Phloridzin dihydrate |
| F-22 | 3,6-Dihydroxyflavone | F-85 | Luteolin-3’,7-di-O-glucoside | F-148 | Malvidin chloride | F-211 | Rhamnetin |
| F-23 | 3,7-Dihydroxyflavone | F-86 | Liquiritigenin | F-149 | Peltatoside | F-212 | Amentoflavone |
| F-24 | 7,8-Dihydroxyflavone | F-87 | Flavanone Hydrazone | F-150 | Narirutin | F-213 | Phloretin |
| F-25 | 3,3’-Dihydroxyflavone | F-88 | Fortunellin | F-151 | 3,7-Dihydroxy-3’,4’,5’-trimethoxyflavone | F-214 | 2’-Hydroxyflavanone |
| F-26 | 5,7-Dihydroxy-4’-methoxyisoflavone | F-89 | Syringetin | F-152 | Syringetin-3-O-galactoside | F-215 | Malvidin-3-O-galactoside |
| F-27 | 6-Methoxyflavanone | F-90 | Sulfuretin | F-153 | Genistin | F-216 | Neohesperidin dihydrochalcone |
| F-28 | 5-Methoxyflavanone | F-91 | 3’,4’,7,8-Tetrahydroxyflavone | F-154 | Kaempferide | F-217 | 3-hydroxy-6-methoxyflavone |
| F-29 | Morin | F-92 | Syringetin-3-O-glucoside | F-155 | Trans-chalcone | F-218 | Oenin |
| F-30 | 4-Methoxychalcone | F-93 | Spiraeoside | F-156 | Isorhamnetin-3-O-rutinoside | F-219 | Silibinin |
| F-31 | 6-Methoxyflavone | F-94 | Sinensetin | F-157 | Rutin Hydrate | F-220 | 3-Hydroxy-2’-methoxyflavone |
| F-32 | 6-Methylflavone | F-95 | 3’,5,7-Trihydroxy-3,4’-dimethoxyflavone | F-158 | Naringenin | F-221 | 6-Hydroxyflavone-β-D-glucose |
| F-33 | 3-Methoxyflavone | F-96 | Scutellarein tetramethylether | F-159 | Baicalin hydrate | F-222 | 8-Hydroxy-7-methoxyflavone |
| F-34 | 2’-Hydroxy-4,4’,6’-trimethoxychalcone | F-97 | Tamarixetin | F-160 | 2’-methoxyflavone | F-223 | 4’-Methoxy-alpha-naphthoflavone |
| F-35 | 5-Methoxyflavone | F-98 | Kaempferol-7-O-neohesperidoside | F-161 | Chlorogenic acid | F-224 | 2’-Methoxy-beta-naphthoflavone |
| F-36 | 4’-Methoxychalcone | F-99 | Quercetin-3-O-glucose | F-162 | 3-Hydroxy-7-Methoxyflavone | F-225 | 5,2’-Dihydroxyflavone |
| F-37 | 4’-Hydroxyflavanone | F-100 | Robinin | F-163 | Baicalin | F-226 | 5,4’-Dihydroxyflavone |
| F-38 | 7-Methoxyflavone | F-101 | Gardenin A | F-164 | Eriodictyol | F-227 | Quercetin |
| F-39 | 7,8-Dimethoxyflavone | F-102 | 5,7-Dihydroxy-3’,4’,5’-trimethoxyflavanone | F-165 | Prunetin | F-228 | Kaempferol |
| F-40 | Baicalein | F-103 | Gossypin | F-166 | Apigenin | F-229 | Formononetin |
| F-41 | Fisetin hydrate | F-104 | Maritimein | F-167 | Geraldol | F-230 | Naringin |
| F-42 | Quercetin-3,7,3’,4’-tetramethylether | F-105 | Fustin | F-168 | Cyrtopterinetin | F-231 | Catechin |
| F-43 | Sakuranetin | F-106 | 2’,4-Dihydroxy-4’,6’-dimethoxychalcone | F-169 | Kaempferol-7,4’-dimethylether | F-232 | Piceid |
| F-44 | Robinetin | F-107 | 4’-Methoxyflavone | F-170 | 7-Hydroxyflavanone | F-233 | Butein |
| F-45 | Sciadopitysin | F-108 | Genkwanin | F-171 | Tangeretin | F-234 | Epicatechin |
| F-46 | Saponarin | F-109 | Naringenin-7-O-glucoside | F-172 | 7-Hydroxyflavone-β-D-glucoside | F-235 | Morin |
| F-47 | Scutellarein | F-110 | Kaempferol-3-O-rutinoside | F-173 | 3’-Hydroxyflavanone | F-236 | Luteolin |
| F-48 | 3’,4’,7-Trihydroxyflavone | F-111 | Flavanone diacetyl hydrazone | F-174 | 2’-Methoxy-α-naphtoflavone | F-237 | Diosmetin |
| F-49 | Quercetin-3,5,7,3’,4’-pentamethylether | F-112 | Eupatorin-5-methylether | F-175 | Wogonin | F-238 | Trans-Stilbene |
| F-50 | 2’-Hydroxychalcone | F-113 | 5,7,8-Trihydroxyflavone | F-176 | Pseudobaptigenin | F-239 | Tomatidine |
| F-51 | 4,4’-Dimethoxychalcone | F-114 | Kaempferol-3,7,4’-trimethylether | F-177 | Kaempferol-3,4’-dimethylether | F-240 | Tomatine |
| F-52 | Isosakuranetin | F-115 | Dihydrorobinetin | F-178 | 5-Hydroxyflavanone | F-241 | Pyrogallol |
| F-53 | 2’,6’-Dihydroxy-4’-methoxychalcone | F-116 | Eupatorin | F-179 | 7-Hydroxyisoflavone | F-242 | Catechol |
| F-54 | Pratol | F-117 | Flavanomarein | F-180 | Ombuin | F-243 | Phenol |
| F-55 | Isorhoifolin | F-118 | Homoeriodyctiol | F-181 | 7-Methoxyflavanone | F-244 | 4-methoxyphenol |
| F-56 | 2-Hydroxychalcone | F-119 | 4-Hydroxychalcone | F-182 | Gossypetin | F-245 | 3-methoxyphenol |
| F-57 | 3’,4’,5-Trihydroxy-6,7-dimethoxyflavone | F-120 | 3’,4’,7,8-Tetramethoxyflavone | F-183 | 5,2’-Dimethoxyflavone | F-246 | Guaiacol |
| F-58 | 3’,4’,7-Trimethoxyflavone | F-121 | Daidzein | F-184 | 7,2’-Dihydroxyflavone | F-247 | Anisole |
| F-59 | Neoeriocitrin | F-122 | 3,4-Dimethoxychalcone | F-185 | 6,2’-Dihydroxyflavone | F-248 | 1,4-dimethoxybenzene |
| F-60 | Marein | F-123 | 7-Hydroxy-5-methoxyflavone | F-186 | 3,2’-Dimethoxyflavone | F-249 | 1,4-Dihydroxy-3-methoxybenzene |
| F-61 | Hinokiflavone | F-124 | Isoliquiritigenin | F-187 | 2’-Hydroxyflavone | F-250 | Veratrol |
| F-62 | Apigenin-4’,5,7-trimethylether | F-125 | 7-Hydroxyflavonol | F-188 | 4’-Hydroxyflavone | | |
| F-63 | Neodiosmin | F-126 | Datiscetin | F-189 | 3,4’-Dimethoxyflavone | | |

The reproducibility of the 1H NMR measurements was tested. Two separately prepared solutions of the same standard compound were analysed independently by NMR. Four different standard compounds were measured twice under the same analytical conditions (F-13 and F-103, F-23 and F-125, F-15 and F-99, F-8 and F-226). The identifiers reflect the chronology of the measurements (lower F-i numbers were measured before higher F-i numbers). Three of these pairs were analysed for reproducibility (F-23 and F-125, F-15 and F-99, F-8 and F-226). The standard error of the mean in proton chemical shifts (δH) between identical duplicates, averaged over the 3 standard compounds, was 0.0018 ppm. Thus, it can be concluded that the 1H NMR results show no apparent dependence on the time of measurement.
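A duplicate comparison of this kind can be sketched as follows. The exact statistic used in this study is not spelled out beyond the description above, so the per-proton standard error of the mean followed by an average over protons is an assumption, as are the function name and the shift values, which are placeholders rather than measured data.

```python
import math
import statistics


def mean_sem_ppm(run_a, run_b):
    """Average, over protons, of the standard error of the mean of duplicate shifts.

    run_a and run_b hold the chemical shifts (ppm) of the same protons,
    in the same order, from two independent measurements.
    """
    sems = [
        statistics.stdev([a, b]) / math.sqrt(2)  # SEM for n = 2 values
        for a, b in zip(run_a, run_b)
    ]
    return statistics.mean(sems)


# Placeholder duplicate measurements of two protons (ppm).
first_run = [6.200, 7.650]
second_run = [6.204, 7.648]
print(f"{mean_sem_ppm(first_run, second_run):.4f} ppm")
```

For two values the standard error of the mean reduces to half their absolute difference, so the figure reported above corresponds to duplicate shifts that differ by only a few thousandths of a ppm.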

Classification of compounds into chemical classes

The measured compounds were organised into chemical categories (Table 6.2). As most of the compounds were flavonoids, the remaining (non-flavonoid) compounds were categorized as ‘other’ molecules. The classification of flavonoids was adopted from Harborne (1980) (see Fig. 5.1). In this study, flavonols are considered to be flavones hydroxylated in the 3 position. Therefore, flavonol derivatives are considered to be flavones O-substituted in the 3 position, i.e., 3-methoxyflavone derivatives and 3-O-glycosylated flavone derivatives are assumed to be flavonols. The classification of flavonols as a group separate from the flavones is relevant because of the different (bio)chemistry in which these two subclasses are involved.

The flavonoid compounds in this collection differ in the main chemical properties of the backbone (originating the different flavonoid classes) and also in the substitution pattern (essentially hydroxylation, methoxylation and glycosylation).

Table 6.2. Number of compounds that belong to each flavonoid class (see Fig. 5.1). The number of glycosylated compounds is indicated in parentheses. The absence of compounds in a certain class is represented by “-”.

| Class | Number (glycosylated) |
|---|---|
| Anthocyanins | 3 (2) |
| Aurones | 2 (1) |
| Biflavonoids | 3 (-) |
| Catechins | 2 (-) |
| Chalcones | 20 (2) |
| Dihydrochalcones | 5 (3) |
| Flavan-3,4-diols | - |
| Flavanones | 37 (7) |
| Flavones | 83 (17) |
| Flavonols | 63 (20) |
| Isoflavonoids | 8 (1) |
| Proanthocyanins | - |
| Other | 24 (3) |
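As a simple consistency check, the class counts of Table 6.2 can be tallied against the 250 measured compounds. The Python sketch below transcribes the table by hand (classes marked “-” are entered as zero); the variable names are arbitrary.

```python
# class: (number of compounds, number of glycosylated compounds), from Table 6.2
TABLE_6_2 = {
    "Anthocyanins": (3, 2),
    "Aurones": (2, 1),
    "Biflavonoids": (3, 0),
    "Catechins": (2, 0),
    "Chalcones": (20, 2),
    "Dihydrochalcones": (5, 3),
    "Flavan-3,4-diols": (0, 0),
    "Flavanones": (37, 7),
    "Flavones": (83, 17),
    "Flavonols": (63, 20),
    "Isoflavonoids": (8, 1),
    "Proanthocyanins": (0, 0),
    "Other": (24, 3),
}

total_compounds = sum(count for count, _ in TABLE_6_2.values())
total_glycosylated = sum(glyco for _, glyco in TABLE_6_2.values())
print(total_compounds, total_glycosylated)
```

The class totals add up to the 250 compounds listed in Table 6.1.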

Compound characterization and molecular structure assignment

Obtaining the molecular structure of each compound was essential before proceeding with the assignment of NMR signals to the respective atoms within the molecules. In some cases, ambiguity in the common names and even in the molecular formulae led to the assumption of invalid molecular structures. Names were checked in chemical databases and, through these, the molecular structures were verified. Not only are various common names used for the same chemical structure, but more than one chemical structure was also found to correspond to one name. For example, when the common name ‘rutin’ (Fig. 6.1, F-157) is used to search the PubChem database, 15 hits are obtained, of which 3 have a molecular mass of 610.517 (molecular formula C27H30O16, corresponding to the intended molecule). The intended compound has the IUPAC name 2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-3-[(2S,3R,4S,5R,6R)-3,4,5-trihydroxy-6-[[(2R,3R,4R,5S,6S)-3,4,5-trihydroxy-6-methyl-oxan-2-yl]oxymethyl]oxan-2-yl]oxy-chromen-4-one and CAS registry number 153-18-4. Of these three hits, one lacks stereochemical information and the other two differ in the stereochemical properties of the rhamnose sugar moiety. Absence of, or inconsistency in, the stereochemistry of compounds is (unfortunately) a recurring event. This ambiguity constitutes a barrier to the complete structure characterization of compounds, as the labelling of chiral compounds is not consistent.

Figure 6.1. Examples of flavonoids present in the Flavonoid Database with indicated stereochemistry and numbering: a flavonol (F-157, rutin), a biflavonoid (F-142, cupressuflavone), an aurone (F-104, maritimein), a chalcone (F-34, 2’-hydroxy-4,4’,6’-trimethoxychalcone) and a flavanone (F-209, pinocembrine).

For the characterization of chiral molecules present in our compound collection, their stereochemistry was taken into account. The priority rules of Cahn-Ingold-Prelog (CIP system or R/S system) were used. The catechins, catechin (F-231) and epicatechin (F-234), have two chiral carbons (2 and 3) and are stereoisomers differing in the geometry of the hydroxyl in the 3 position: (2R, 3S) for catechin and (2R, 3R) for epicatechin. Flavanones have one chiral carbon (2S) in the absence of substituents in the 3 position and otherwise have two chiral carbons (2R, 3R), Fig. 6.1, F-209. Aurones and chalcones have stereochemical properties attributable to the presence of a double bond. The aurones are in the Z configuration (Fig. 6.1, F-104), while the chalcones were taken to be in the E configuration (Fig. 6.1, F-34).

The glycosidic moieties attached to the flavonoid backbones were also characterized stereochemically. In our collection, there are flavonoids conjugated to monosaccharides and to disaccharides, with the flavonoid backbone mono-substituted or di-substituted. All sugar moieties were modelled in the chair conformation, as this corresponds to the measured coupling constants 3J(H,H).
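The link between ring conformation and vicinal coupling constants is commonly rationalised with a Karplus-type relation, 3J = A cos²φ + B cos φ + C, where φ is the H-C-C-H dihedral angle. The Python sketch below uses one frequently quoted coefficient set (A = 7.76, B = -1.10, C = 1.40 Hz) purely as an illustrative assumption; these are not parameters from this study or from PERCH, and real values depend on the substituents.

```python
import math


def karplus_3j(phi_deg, a=7.76, b=-1.10, c=1.40):
    """Vicinal coupling constant (Hz) from a Karplus-type relation.

    The default coefficients are illustrative, not fitted to sugar rings.
    """
    phi = math.radians(phi_deg)
    return a * math.cos(phi) ** 2 + b * math.cos(phi) + c


# In a chair, trans-diaxial protons have a dihedral angle near 180 degrees,
# axial-equatorial (or equatorial-equatorial) protons near 60 degrees.
print(f"axial-axial:      {karplus_3j(180):.1f} Hz")
print(f"axial-equatorial: {karplus_3j(60):.1f} Hz")
```

Large couplings between neighbouring ring protons therefore point to trans-diaxial arrangements, which is the kind of evidence that supports modelling a pyranose ring as a chair.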

Monosaccharides are attached to the flavonoid through their 1 position, Fig. 6.1, F-157. There are two flavonoid-O-galactosides in this list: syringetin-3-O-β-D-galactoside (F-152) and malvidin-3-O-β-D-galactoside (F-215). In these two cases, the galactose ring has the configuration (1”S, 2”R, 3”S, 4”R, 5”R). There are 24 flavonoid-O-β-D-glucosides, which have the configuration (1”S, 2”R, 3”S, 4”S, 5”R), while the four flavonoid-C-β-D-glucosides (saponarin, F-46; homoorientin, F-84; orientin, F-76 and vitexin, F-135) have the configuration (1”S, 2”R, 3”R, 4”S, 5”R). The three flavonoid-O-rhamnosides, quercitrin (F-17), myricitrin (F-78) and robinin (F-100), have the configuration (1”S, 2”R, 3”R, 4”R, 5”S) in the attached α-L-rhamnose ring.

There are four types of disaccharides conjugated to a flavonoid in our list of molecules: robinose, vicianose, rutinose and neohesperidose. Robinin (or kaempferol-3-O-(β-D-galactopyranosyl-α-L-rhamnosyl)-7-O-α-L-rhamnoside, F-100) has a robinose disaccharide in the 3 position. The galactose is attached to the flavonoid and the rhamnose to the galactose through a 6 → 1 linkage, so this sugar moiety has the configuration (1”’S, 2”’R, 3”’S, 4”’R, 5”’R, O-β-D-galactose), (1””R, 2””R, 3””R, 4””R, 5””S, O-α-L-rhamnose). Peltatoside or quercetin-3-(O-α-L-arabinopyranosyl-O-β-D-glucopyranoside) (F-149) has a vicianose in the 3 position with the configuration (1”S, 2”R, 3”S, 4”S, 5”R, O-β-D-glucose), (1”’S, 2”’R, 3”’S, 4”’S, O-α-L-arabinose). There are 10 flavonoid-O-rutinosides and these have a (1”S, 2”R, 3”S, 4”S, 5”R, O-β-D-glucose), (1”’R, 2”’R, 3”’R, 4”’S, O-α-L-rhamnose) configuration. While the glucose is attached to the rhamnose through a 6 → 1 linkage in rutinose, in neohesperidose the linkage is 2 → 1. There are 8 flavonoid-O-neohesperidosides and one flavonoid-C-neohesperidoside. All have the following configuration: (1”S, 2”R, 3”S, 4”S, 5”R, O/C-β-D-glucose), (1”’S, 2”’R, 3”’R, 4”’R, 5”’S, O-α-L-rhamnose).

After the (stereochemical) description of the respective compounds, the proton atoms were numbered. The numbering of the atoms within the flavonoid molecules followed the scheme proposed by Harborne (1975), with an exception for the chalcones and dihydrochalcones, where the α and β positions were assigned the numbers 8 and 7, respectively, for software convenience (Fig. 6.1). The numbering of anthocyanins, catechins, flavones, isoflavonoids, flavanones and flavonols is analogous, as is the numbering of chalcones and dihydrochalcones.

Assignment of the 1H and 13C resonances in the molecular structure

The assignment of the 1H and 13C resonances in all measured molecules was deduced from the acquired 1D (1H) and 2D (1H-1H COSY and TOCSY; 1H-13C HSQC and HMBC) spectra, knowing the molecular structure of the molecules. Obtaining robust assignments in this way can be quite demanding, especially for glycosylated flavonoids. In Fig. 6.2, a glycosylated flavonol, rutin (F-157), is presented as an illustrative example taken from the list of compounds that were measured and assigned.

Figure 6.2. 1H and 13C assignments of the molecule F-157 (rutin) represented in the superimposed spectra: HMBC (blue), HSQC (red) and NOESY-1D (grey). The molecular structure with stereochemical proprieties and atom numbering is placed on the top left corner (A). A magnification of the sugar region (f2 = 3.9-3.0 ppm) is represented in frame B.

Database building

In order to manage the large amount of data generated, a database, the Flavonoid Database, was built on a Linux server, Fig. 6.3. This database can be accessed at “Flavonoid Database” under http://www.wnmrc.nl. Available information was organised for all the compounds present in our collection. This information can be found by browsing the complete compound list or by using search options. Searches can be done by text in all fields, name (including IUPAC), identifier, CAS number, EINECS number, molecular formula, molecular mass, PubChem identifier and KEGG identifier. For each compound hit, the following information is displayed: common names, IUPAC name (following the format used in PubChem), CAS and EINECS registry numbers, molecular formula and molecular mass (calculated from the atomic masses indicated by the IUPAC (de Laeter, 2003)). The 1H and 13C assignments are displayed for each compound, as well as the molecular structure with numbering and stereochemical properties (if relevant). The experimental 1H NMR spectrum can be viewed as a PDF file. In addition, there are links from each compound to other databases: PubChem, PubMed, Wikipedia and KEGG.

Figure 6.3. Screenshots from the Flavonoid Database query frame. From the home page (A), information can be viewed by the browse window (B); searches can be made by text in all fields, name (common and IUPAC), CAS number, EINECS number, molecular formula, molecular mass, PubChem identifier and KEGG identifier (C) and by experimental information (mass and chemical shifts) (D). A list of hits is displayed upon a search (E) and each hit contains information about a molecule (name, CAS number, molecular formula, etc.), links to the PubChem, PubMed, KEGG and Wikipedia databases, 1H NMR spectrum in PDF format (including assignments), list of 1H and 13C chemical shifts and 2D molecular structure with numbering and stereochemical details (F). The information displayed in E and F belongs to the molecule rutin.
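The search by experimental chemical shifts mentioned for Fig. 6.3D can be pictured with a minimal sketch. The implementation of the Flavonoid Database search is not described in this chapter, so the function `search_by_shifts`, the in-memory dictionary and the 0.02 ppm tolerance below are assumptions made for illustration only. The shift values are the observed (OBS) 1H shifts of three entries as listed in Table 6S.1.

```python
# Observed 1H chemical shifts (ppm) copied from the OBS column of Table 6S.1.
OBSERVED_SHIFTS = {
    "F1": [5.552, 7.521, 7.416, 2.864, 3.126, 7.365, 7.865, 7.068, 7.557, 7.073],
    "F10": [7.877, 6.677, 6.940, 7.971, 6.928, 6.971],
    "F14": [7.937, 6.807, 6.956, 8.142, 7.490, 7.808, 7.716],
}


def search_by_shifts(query, database, tolerance=0.02):
    """Return the identifiers whose shift list matches every query shift.

    A query shift counts as matched when at least one stored shift lies
    within `tolerance` ppm of it. The tolerance is an assumed value, not
    a setting taken from the Flavonoid Database.
    """
    hits = []
    for identifier, shifts in database.items():
        if all(any(abs(q - s) <= tolerance for s in shifts) for q in query):
            hits.append(identifier)
    return sorted(hits)


# Two resonances read from a hypothetical unknown spectrum.
print(search_by_shifts([8.14, 7.49], OBSERVED_SHIFTS))
# A single resonance is usually less selective.
print(search_by_shifts([6.94], OBSERVED_SHIFTS))
```

Requiring every query shift to match keeps the sketch simple; a ranking by the number of matched shifts would be a natural alternative when the query spectrum contains impurities.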

The implementation of this database not only helped to categorize molecules and spectroscopic data but also offers public access to experimental 1H and 13C data, acquired under the same experimental conditions, for a large number of related molecules. The identification of flavonoids (and analytical efforts for their isolation) can benefit from the establishment of databases that integrate experimental and chemical information.

PERCH analysis and prediction

The 1H NMR spectra were imported into PERCH for molecule and spectral modelling in order to obtain resolved 1H NMR properties: chemical shifts and coupling constants. The principles behind the spectral analysis of complex NMR spectra used by PERCH, including deconvolution of coupling information, can be found elsewhere (Laatikainen et al., 1996). With PERCH, a 1H NMR spectrum can be simulated based on the molecular structure of a compound. The predicted 1H NMR spectrum can be fitted iteratively onto the measured 1H spectrum until the difference between the simulated and experimental spectra converges to a minimum. All molecules in the Flavonoid Database were analysed in detail by PERCH, yielding precise 1H NMR chemical shifts, linewidths and 1H-1H coupling information. Even very difficult 1H NMR spectra could be analysed in great detail. For example, the sugar regions of the predicted and experimental spectra of the metabolite rutin (F-157) are virtually identical, Fig. 6.4. The disaccharide, rutinose, substituted on the 3 position was successfully deconvoluted into the relevant chemical shifts and coupling constants, with an average difference of 0.151 ppm for the chemical shift values and 0.523 Hz for the coupling constants. The list of the experimental and predicted chemical shifts and coupling constants for all simulated molecules can be found in the Supplementary Materials (Tables 6S.1 and 6S.2).
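The idea of fitting a simulated spectrum onto a measured one until their difference reaches a minimum can be illustrated with a toy example. This is not the algorithm used by PERCH, which handles full spin systems with couplings; the sketch below refines a single chemical shift of one Lorentzian line with a shrinking-step search. The line position, linewidth, starting value and step sizes are all illustrative assumptions.

```python
def lorentzian(x, centre, width):
    """Lorentzian line of unit height; `width` is the full width at half height."""
    half = width / 2.0
    return half * half / ((x - centre) ** 2 + half * half)


def misfit(centre, axis, observed, width):
    """Sum of squared differences between simulated and observed intensities."""
    return sum((lorentzian(x, centre, width) - y) ** 2
               for x, y in zip(axis, observed))


def fit_centre(axis, observed, start, width, step=0.05, tolerance=1e-5):
    """Refine one chemical shift by a shrinking-step search on the misfit."""
    centre = start
    best = misfit(centre, axis, observed, width)
    while step > tolerance:
        moved = False
        for candidate in (centre - step, centre + step):
            value = misfit(candidate, axis, observed, width)
            if value < best:
                centre, best, moved = candidate, value, True
                break
        if not moved:
            step /= 2.0  # no improvement at this step size: search more finely
    return centre


# Synthetic "experimental" line at 3.435 ppm, 0.01 ppm wide (illustrative values).
axis = [3.2 + 0.001 * i for i in range(701)]
observed = [lorentzian(x, 3.435, 0.01) for x in axis]

# Start from a deliberately offset "predicted" shift and refine it.
fitted = fit_centre(axis, observed, start=3.400, width=0.01)
print(round(fitted, 3))
```

In a real analysis many shifts, couplings and linewidths are refined together, which is why a dedicated package is used; the sketch only shows how a predicted value serves as the starting point of an iterative refinement.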

Figure 6.4. Magnification of the sugar region (f2 = 3.9-3.2 ppm) of F-157 (rutin) simulated by PERCH: predicted spectrum (upper spectrum, grey) and experimental spectrum (lower spectrum, black). The large resonance in the lower spectrum belongs to the solvent, CH3OH. For assignments, see Fig. 6.2.

The quality of the prediction model can be assessed by the linearity of the relationship between experimental and predicted NMR properties (Bagno et al., 2006). The predicted versus experimental curves for the proton chemical shifts and coupling constants indicate a robust model for both 1H NMR properties, Fig. 6.5. The parameters concerning the accuracy of the model were calculated based on Bagno et al. (2006) and are listed in Table 6.3. Taking into account the complexity of the molecules present in the Flavonoid Database as well as the number of molecules modelled, PERCH was able to accurately fit a multitude of molecules with diverse chemical features.

Figure 6.5. Predicted proton chemical shifts, δH, as a function of experimental proton chemical shifts, δH, (A) and predicted proton coupling constants, J(H,H), as a function of experimental proton coupling constants, J(H,H), (B), for protons belonging to molecules modelled by PERCH.

Table 6.3. Obtained NMR parameters for proton chemical shifts, δH, and proton coupling constants, J(H,H), from NMR prediction properties calculated by PERCH. Determination coefficient (r2), slope (m) and intercept (b) obtained by a least squares linear regression: $\delta_{\mathrm{predicted}} = m\,\delta_{\mathrm{experimental}} + b$; mean average error (MAE) $= \sum_{i=1}^{n} |\delta_{\mathrm{predicted}} - \delta_{\mathrm{experimental}}|_i / n$, n = number of δi; corrected mean average error (CMAE) $= \sum_{i=1}^{n} |\delta_{\mathrm{corrected}} - \delta_{\mathrm{experimental}}|_i / n$, $\delta_{\mathrm{corrected}} = (\delta_{\mathrm{predicted}} - b)/m$; the calculation of J(H,H) parameters was performed analogously, by substituting δH by J(H,H).

Parameters    δH               J(H,H)
r2            0.991            0.980
m             0.988 ± 0.002    0.964 ± 0.052
b             0.086 ± 0.014    -0.017 ± 0.031
MAE           0.113            0.532
CMAE          0.019            0.182
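The parameters defined for Table 6.3 can be computed as in the following minimal sketch, written in plain Python. The function name `regression_stats` is hypothetical. The example data are the observed and predicted shifts of a single compound, F1, as listed in Table 6S.1, so the resulting numbers describe that one compound only and are not expected to reproduce the values of Table 6.3, which cover all modelled molecules.

```python
from statistics import mean


def regression_stats(experimental, predicted):
    """Return r2, slope m, intercept b, MAE and CMAE (predicted vs experimental)."""
    n = len(experimental)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long lists with at least two values")
    x_bar, y_bar = mean(experimental), mean(predicted)
    sxx = sum((x - x_bar) ** 2 for x in experimental)
    syy = sum((y - y_bar) ** 2 for y in predicted)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(experimental, predicted))
    m = sxy / sxx                      # least squares slope
    b = y_bar - m * x_bar              # least squares intercept
    r2 = (sxy * sxy) / (sxx * syy)     # determination coefficient
    mae = sum(abs(y - x) for x, y in zip(experimental, predicted)) / n
    corrected = [(y - b) / m for y in predicted]
    cmae = sum(abs(c - x) for x, c in zip(experimental, corrected)) / n
    return {"r2": r2, "m": m, "b": b, "MAE": mae, "CMAE": cmae}


# F1, Table 6S.1: protons 2, 2_6, 3_5, 3A, 3B, 4', 5, 6, 7, 8 (ppm).
f1_obs = [5.552, 7.521, 7.416, 2.864, 3.126, 7.365, 7.865, 7.068, 7.557, 7.073]
f1_pred = [5.654, 7.457, 7.392, 3.155, 2.863, 7.350, 7.837, 7.061, 7.547, 7.046]

f1_stats = regression_stats(f1_obs, f1_pred)
for name, value in f1_stats.items():
    print(f"{name}: {value:.3f}")
```

The CMAE removes the systematic part of the error (slope and intercept) before averaging, which is why it is smaller than the MAE whenever the regression line departs from the identity.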

The effect of specific chemical features on the 1H NMR chemical shifts

Small modifications in aromatic structures result in differences in 1H NMR chemical shifts. Aromatic protons are typically deshielded, appearing between 6 and 10 ppm in the 1H NMR spectrum. When benzene is placed in a magnetic field, the delocalized electrons of the carbons’ p orbitals circulate, generating a ring current. The propagation of substituent effects is therefore largely attributed to the ring current effect. As flavonoids are phenylchromene derivatives, they are partially aromatic, which leads to particular substituent chemical shifts.

The effect of substituents and conjugation groups in the flavonoid backbone on the properties of the 1H NMR spectra was analysed. Chemical shifts taken from 1H NMR spectra of compounds bearing a certain chemical feature were compared with those of analogous compounds lacking this feature. The following effects were studied: (i) hydroxyl substitution, (ii) methoxy substitution, (iii) methyl substitution, (iv) double bond in the 2 position (flavone or flavonol versus flavanone), (v) cyano group in the 4 position (instead of carbonyl) and (vi) sugar conjugation.
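The pairwise comparison described above amounts to subtracting, proton by proton, the chemical shifts of the reference compound from those of the compound carrying the feature. A minimal sketch follows, using the observed shifts of two entries of Table 6S.1, F14 and F10. The only structural fact used here is visible in that table: F14 has an H7 resonance and F10 does not. The compound names are not assumed. The helper names and the 0.2 ppm threshold are illustrative choices.

```python
# OBS values (ppm) copied from Table 6S.1; keys are the proton labels used there.
F14_OBS = {"2_6": 7.937, "3": 6.807, "3_5": 6.956, "5": 8.142,
           "6": 7.490, "7": 7.808, "8": 7.716}
F10_OBS = {"2_6": 7.877, "3": 6.677, "3_5": 6.940, "5": 7.971,
           "6": 6.928, "8": 6.971}


def shift_differences(with_feature, reference):
    """delta(with feature) - delta(reference), for protons present in both."""
    return {proton: with_feature[proton] - reference[proton]
            for proton in with_feature if proton in reference}


def strongly_shifted(differences, threshold=0.2):
    """Labels of protons whose shift changes by more than `threshold` ppm."""
    return sorted(p for p, d in differences.items() if abs(d) > threshold)


differences = shift_differences(F10_OBS, F14_OBS)
for proton, delta in differences.items():
    print(f"H{proton}: {delta:+.3f} ppm")
print(strongly_shifted(differences))
```

With these two entries the largest differences fall on H6 and H8, the two protons flanking the position at which F10 carries no proton, and both are negative, i.e. upfield.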

Effect of hydroxyl groups on the flavonoid backbone

Determining the positions of substituents in molecules is a common problem when using MS for identification purposes, as most isomers cannot be distinguished. The fragmentation of the molecular ion can lead to uninformative fragments, so that ambiguity remains. 1H NMR can readily distinguish positional isomers, as the protons of each isomer are influenced by the presence of substituents and have different chemical environments, which are made evident in the 1H NMR spectra. For example, positional isomers of a hydroxyflavone produce distinct spectra, in terms of NMR properties, Fig. 6.6.

Figure 6.6. 1H NMR spectra of six hydroxyflavone isomers: 3-hydroxyflavone (F-3), 5-hydroxyflavone (F-4), 6-hydroxyflavone (F-5), 7-hydroxyflavone (F-6), 2’-hydroxyflavone (F-187) and 4’-hydroxyflavone (F-14).

Figure 6.7. The effect of the presence of hydroxyl groups on the proton chemical shifts, δH (ppm), of the flavanone (A), flavone (B) and flavonol (C) backbone: difference in ppm between n-hydroxyflavonoid (n = mono, di, tri, tetra, penta, hexa) and flavonoid. Hydroxylated flavanones used in the analysis (A): F-37, F-2, F-214, F-170, F-173, F-178, F-86, F-209, F-158, F-105, F-197, F-115 and F-198 were compared to flavanone (F-1). Hydroxylated flavones used in the analysis (B): F-4, F-5, F-6, F-14, F-187, F-226, F-7, F-9, F-10, F-24, F-72, F-184, F-185, F-192, F-166, F-236 and F-91 were compared to flavone (F-11). Hydroxylated flavonols used in the analysis (C): F-22, F-23, F-25, F-125, F-90, F-200, F-12, F-19, F-21, F-41, F-126, F-228, F-29, F-227, F-235 and F-138 were compared to flavonol (F-3).

The presence of one or more hydroxyl groups in the flavanone (Fig. 6.7A), flavone (Fig. 6.7B) and flavonol (Fig. 6.7C) backbones was analysed. Thirteen flavanones, sixteen flavones and fifteen flavonols were used for this comparison. The differences in chemical shifts between the hydroxylated flavonoid and the non-hydroxylated flavonoid (flavone, flavanone and flavonol) were in general negative, i.e., the presence of hydroxyl groups provokes upfield shifts. This observation implies that the observed shifts are a result of poorer aromaticity, given the occurrence of the hydroxyl substituent.

General proton shift trends were found upon hydroxylation. Hydroxyl substitution at a given position leads to consistent shifts, typically of about 0.5 ppm. Multiply hydroxylated flavonoids show additive effects on the chemical shift values. For example, 7,4’-dihydroxyflavanone shows the sum of the chemical shift effects observed for 7-hydroxyflavanone and 4’-hydroxyflavanone (Fig. 6.7A). Analogously, 5,7,4’-trihydroxyflavone has the same proton chemical shift effect as the three 5-, 7- and 4’-monohydroxyflavones combined, or as 5,4’-dihydroxyflavone and 7-hydroxyflavone combined (Fig. 6.7B). The protons influenced upon hydroxylation at specific locations in the flavonoid backbone are listed in Table 6.4.

Table 6.4. Protons affected by the presence of a hydroxyl group in the backbone of flavanones, flavones and flavonols. *only for flavanones, deduced from: F-105, F-197, F-115, F-198; **deduced for flavones from: F-24; ***only applicable to flavonols; ****not applicable to flavonols.

Position of hydroxylation    Position of influenced protons
3                            2*
5                            6, 8
6                            5, 7
7                            6, 8
8                            7, 5**
2’/6’                        3’, 5’, 6’***
3’/5’                        2’, 4’****, 6’
4’                           3’, 5’
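Table 6.4 can be used as a lookup to list the protons expected to shift for a given hydroxylation pattern. The sketch below is one possible encoding, with the footnotes expressed as the subclasses to which an entry applies. The names `TABLE_6_4` and `influenced_protons` are hypothetical, and only the 2’ and 3’ rows are encoded; the 6’ and 5’ hydroxylations grouped with them in the table are left out to keep the example short. Protons at positions that themselves carry a hydroxyl group are excluded, since no proton is present there.

```python
ALL = frozenset({"flavanone", "flavone", "flavonol"})

# Table 6.4 as (influenced proton, subclasses for which the entry applies).
TABLE_6_4 = {
    "3": [("2", frozenset({"flavanone"}))],            # * only for flavanones
    "5": [("6", ALL), ("8", ALL)],
    "6": [("5", ALL), ("7", ALL)],
    "7": [("6", ALL), ("8", ALL)],
    "8": [("7", ALL), ("5", ALL)],                     # ** H5 deduced for flavones from F-24
    "2'": [("3'", ALL), ("5'", ALL),
           ("6'", frozenset({"flavonol"}))],           # *** only applicable to flavonols
    "3'": [("2'", ALL),
           ("4'", frozenset({"flavanone", "flavone"})),  # **** not applicable to flavonols
           ("6'", ALL)],
    "4'": [("3'", ALL), ("5'", ALL)],
}


def influenced_protons(hydroxyl_positions, subclass):
    """Protons expected to shift, excluding positions that carry a hydroxyl."""
    hydroxylated = set(hydroxyl_positions)
    affected = set()
    for position in hydroxylated:
        for proton, subclasses in TABLE_6_4.get(position, []):
            if subclass in subclasses and proton not in hydroxylated:
                affected.add(proton)
    return sorted(affected)


print(influenced_protons(["7", "4'"], "flavanone"))
print(influenced_protons(["3'"], "flavone"))
print(influenced_protons(["3'"], "flavonol"))
```

Because the effects are described as additive, the set for a multiply hydroxylated compound is simply the union of the sets of the individual positions.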

Protons in the ortho position relative to the hydroxyl moiety are influenced by the presence of this hydroxyl group. Additionally, protons in the para position are influenced in a similar manner. These observations are corroborated by a previous study on the prediction of chemical shifts using molecular dynamics calculations, where the protons ortho and para of phenol were found to be significantly influenced by the presence of the hydroxyl (Abraham et al., 2000). The observed effects are analogous for flavanones and flavones. However, from the analysis of flavonols, two differences were observed for this subclass of flavonoids: (i) hydroxylation in the 2’ position not only shifts the resonances of the protons H3’ and H5’ but also that of proton H6’; (ii) hydroxylation in the 3’ or 5’ positions shifts the resonances of the protons H2’ and H6’ but not that of proton H4’, in contrast to the flavanones and flavones. These differences are probably due to the electronic distribution of flavonols compared to flavones. The presence of a hydroxyl group in the 3 position might contribute to a more intense electronic flow in the whole molecule, in particular from the chromenone to the outer phenyl ring (B ring) (Awad et al., 2001).

Effect of methoxy groups on the flavonoid backbone

Together with hydroxylation, methoxylation is a common flavonoid conjugation, and methoxylated flavonoids are therefore also naturally produced in plants. Recent studies comparing toxicological effects of hydroxylated and methoxylated flavonoids have shown that the latter have a higher potential as cancer chemopreventive agents (Wen, 2006). The effect of methoxy groups in the flavonoid backbone on the chemical shifts of protons was analysed, Fig. 6.8. The effects observed for methoxylated flavonoids are in accordance with the interpretation reported for the analysed hydroxylated flavonoids (see Table 6.4). However, the shifts are neither as pronounced nor as systematic as in the hydroxylated flavonoids, in particular for the flavones. The methyl group attached to the oxygen in the methoxy group can counterbalance the highly electronegative oxygen better than the hydroxyl proton can, and therefore the neighbouring protons are less influenced. In general, methoxylation shifts ortho and para protons to lower ppm values, reaching a maximum of -1.1 ppm for protons influenced by a dimethoxylation (e.g. H8 in 5,7-dimethoxyflavone).

Flavanones show shifts of -0.4 ppm for influenced protons and -0.8 ppm for profoundly influenced protons. It is also observed that methoxylation on the 6 position causes proton H5 to shift more than proton H7. Likewise, methoxylation on the 7 position leads to larger shifts of proton H8 than of proton H6.

Figure 6.8. The effect of the presence of methoxy groups in the proton chemical shifts, δH (ppm), of flavanones, flavones and flavonols: difference in ppm between n-methoxyflavonoid (n = mono, di, tri, tetra, penta) and flavonoid. Methoxylated flavanones used in the analysis: F-28 and F-44 were compared to flavanone (F-1). Methoxylated flavones used in the analysis: F-35, F-31, F-38, F-107, F-143, F-39, F-62, F-58, F-75, F-96, F-120 and F-94 were compared to flavone (F-11). Methoxylated flavonols used in the analysis: F-191, F-194 and F-49 were compared to flavonol (F-3).

Effect of methyl groups on the flavonoid backbone

Methylation is not a common conjugation of flavonoids, though a methylated flavonoid, 6-methylflavone, has been demonstrated to act as a positive allosteric modulator of gamma-aminobutyric acid in human receptors (Hall et al., 2005). The effect of a methyl substituent in the flavone backbone was studied. Compared to hydroxylation or methoxylation, the presence of a methyl group led to only slight shifts in the resonances of the neighbouring protons. Two methylated molecules were compared to the respective non-methylated molecules. Shifts of about 0.2 ppm were found for the protons directly neighbouring the methylation, Table 6.5. This observation is in accordance with the electron-releasing properties of alkyl groups such as the methyl group, as the presence of this group seems to only slightly influence the electronic configuration of the flavone.

Table 6.5. Effect of methylation on the chemical shifts, δH (ppm). The methylated flavones, F-123 (7-hydroxy-5-methylflavone) and F-32 (6-methylflavone), were compared to the non-methylated flavones, F-6 (7-hydroxyflavone) and F-11 (flavone).

Proton position    5CH3      6CH3
2’/6’              -0.043    0.005
3                  -0.125    -0.009
3’/5’              -0.018    0.012
4’                 -0.022    0.010
5                  -         -0.209
6                  -0.267    -
7                  -         -0.161
8                  -0.176    -0.090

Effect of a double bond in the 2 position of flavones and flavonols compared to flavanones

The presence of a double bond on the 2 position is the feature that distinguishes a flavone (or flavonol) from a flavanone. This double bond provokes great differences in the activity and reactivity of flavones compared to flavanones, the former being better (anti)oxidants than the latter (Awad et al., 2001). Rigidity is conferred by this double bond in the flavonoid backbone, making the chromenone moiety planar. In contrast, the chromanone moiety of the flavanone is not planar due to the presence of the sp3 carbons in the 2 and 3 positions. Given this difference in configuration, major differences in chemical shift values are observed for the protons of the flavone compared to the flavanone. In fact, all the protons of the flavone (F-11) have an aromatic character (protons appear between 6.9 and 8.1 ppm in the 1H NMR spectrum), while the C ring protons of the flavanone lack aromaticity (these protons appear around 3 ppm). The observed structural differences lead to an increase in the chemical shift values of the flavone protons (compared to the flavanone), Fig. 6.9.

In the A ring, H5 and H7 are the least affected by the conformational change between flavones (or flavonols) and flavanones, shifting about 0.3 ppm. Nevertheless, the proton H8 is highly affected, shifting up to 0.75 ppm. In the B ring, the chemical shifts of H2’ and H6’ are also influenced by this conformational change, shifting up to 0.85 ppm, while the protons H3’, H4’ and H5’ shift less than 0.25 ppm. In the C ring, the changes in the 1H NMR spectrum are conspicuous, as the protons H3A (2.9 ppm) and H3B (3.1 ppm) of the flavanone appear as double doublets, in contrast to the H3 proton of the flavone, which appears as a singlet at 6.9 ppm. This feature provides an easy distinction between flavanones and flavones.

Figure 6.9. The effect of a double bond in the 2 position on the proton chemical shifts, δH (ppm), of the flavones and flavonols compared to the flavanone backbone: difference in ppm between molecules differing in the presence (flavone or flavonol) or absence (flavanone) of C2=C3. Twenty three pairs of n-hydroxy-, m-methoxy- or n-hydroxy-m-methoxyflav(an)ones, (flavone, flavanone), (n, m = mono, di, tri, tetra, penta) were used in this analysis: (F11, F1), (F5, F2), (F6, F170), (F187, F214), (F14, F37), (F31, F27), (F4, F178), (F35, F28), (F20, F209), (F143, F144), (F10, F86), (F108, F43), (F166, F158), (F18, F230), (F55, F150), (F237, F207), (F66, F118), (F41, F105), (F228, F197), (F146, F102) and (F44, F115).

Effect of a cyano group on the flavonoid backbone

The presence of a cyano group instead of the carbonyl in the 4 position of the flavanone backbone was studied by comparing the chemical shift values of flavanone hydrazone (F-87) with those of flavanone (F-1). All protons shifted to lower ppm values, except for H3A. Protons in the B ring were not significantly influenced by this change in functional group (|Δppm| < 0.021); however, the protons in the C ring, H2, H3A and H3B, were heavily influenced. H2 shifted by -0.460 ppm and H3B by -0.353 ppm, while H3A shifted by 0.316 ppm. In the A ring, H6 and H8 changed by about -0.1 ppm and H7 by -0.352 ppm.

Effect of sugar moieties on the flavonoid backbone

Glycosylated flavonoids are probably the most common form of storage of flavonoid aglycones in plants. The identification of glycosylated flavonoids is more problematic than that of non-glycosylated or aglycone flavonoid forms, due to the lability of these compounds when separation, concentration and isolation procedures are necessary for their identification. MS and MS/MS are useful strategies for the rapid assessment of general information about the nature of a glycosylated flavonoid: the mass of the glycosylated flavonoid is detected, and the number and mass of the sugar moieties as well as the mass of the aglycone can be determined by fragmenting the molecular ion. However, the position(s) of glycosylation and the nature of the sugar (most commonly: hexose, deoxyhexose, pentose) remain inconclusive. Due to the high chemical selectivity of NMR, the acquisition of 1H (only) NMR can already aid in resolving, if not resolve, some of the uncertainties left from the putative assignments done by MS.
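What MS/MS can and cannot reveal about the sugar part is made concrete in the sketch below, which matches the mass difference between a glycoside and its aglycone against combinations of sugar residues. The residue masses are standard monoisotopic values for the sugar minus water; the function `explain_loss`, the limit of two sugar units and the 0.01 Da tolerance are assumptions for illustration. The example uses rutin and its aglycone quercetin, with masses calculated from their molecular formulas.

```python
from itertools import combinations_with_replacement

# Monoisotopic masses (Da) of sugar residues lost as neutral fragments
# (sugar minus H2O). All hexoses, e.g. glucose and galactose, share one value.
RESIDUE_MASS = {
    "deoxyhexose": 146.0579,      # C6H10O4, e.g. rhamnose
    "glucuronic acid": 176.0321,  # C6H8O6
    "hexose": 162.0528,           # C6H10O5, e.g. glucose or galactose
    "pentose": 132.0423,          # C5H8O4, e.g. arabinose
}


def explain_loss(mass_loss, max_units=2, tolerance=0.01):
    """Return the residue combinations whose summed mass matches the loss."""
    matches = []
    for count in range(1, max_units + 1):
        for combo in combinations_with_replacement(sorted(RESIDUE_MASS), count):
            total = sum(RESIDUE_MASS[name] for name in combo)
            if abs(total - mass_loss) <= tolerance:
                matches.append(combo)
    return matches


# Monoisotopic masses of rutin (C27H30O16) and quercetin (C15H10O7).
RUTIN, QUERCETIN = 610.1534, 302.0427
print(explain_loss(RUTIN - QUERCETIN))
```

The result names sugar classes only: it does not say which hexose or deoxyhexose is present, in which order the units are linked, or where they are attached to the aglycone, which is the information that 1H NMR is used for in this chapter.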

One of the first tasks, when dealing with glycosylated flavonoids, is to determine the position of glycosylation. The effect of glycosylation on the flavonoid was analysed by comparing the chemical shifts of the protons of the glycosylated flavonoid with those of the respective aglycone, Fig. 6.10.

Glucose, glucuronic acid, glucose-coumaroyl, rutinose, vicianose and robinose when conjugated in the 3 position of the flavonoid do not cause significant shifts (|Δppm| < 0.15) on the chemical shift values of the flavonoid protons. In fact, this observation is in accordance with the hydroxylation effects described above, as the presence of a substituent in the 3 position (flavonols) does not lead to shifts on the neighbouring protons of the backbone (see Table 6.4). However, the conjugation of rhamnose, as in quercitrin (F-17), leads to upfield shifts in the protons H2’ (-0.4 ppm) and H6’ (-0.3 ppm) compared to the aglycone. The rhamnose in the quercitrin is oriented closer to the B ring of quercetin than, for example, the glucose in the isoquercetin (F-15).

Moreover, there is no apparent difference in the flavonoid backbone resonances between conjugation of a monosaccharide (glucose, glucuronic acid) and of a disaccharide (rutinose, vicianose and robinose) in which the connecting sugar is identical (glucose). The same finding is observed for mono- (glucose, glucuronic acid) and disaccharides (neohesperidose and rutinose) conjugated in the 7 position. Nevertheless, conjugation in this position by glucose-derived moieties made the protons H6 and H8 shift about 0.3 ppm to higher ppm values. Analogously, conjugation of glucose in the 6 position made the proton H5 shift by the same order of magnitude. Comparing the chemical shifts of apigenin-8-C-neohesperidose (F-137) with apigenin (F-166), hardly any differences were found in the chemical shifts of protons. In principle, the presence of 8-substitution could lead to shifts in the protons 7 and 5 (see Table 6.4); however, apigenin is hydroxylated in these positions. The protons H5’ of the 4’-glucosylated luteolin (F-69) and quercetin (F-93) have shifted by 0.4 ppm compared to the aglycone. As quercetin and luteolin are hydroxylated in the 3’ position, shifts on this position due to the presence of the glucose could not be assessed. In sum, the conjugation of sugars to flavonoids leads to shifts in (at least) the ortho position protons. It is likely that sugars influence the flavonoid backbone in a similar way to hydroxyl substitution, in qualitative terms. Hence, it should be feasible to distinguish the presence of two sugar units conjugated in two distinct positions in the flavonoid backbone from a single conjugation in which these two sugar units are linked by a glycosidic bond.

Figure 6.10. The effect of glycosidic moieties on the δH (ppm) of flavonoid aglycones: difference in ppm between flavonoid glycosides and flavonoid aglycones. Twenty three flavonoid glycosides were used in this analysis: F-15, F-92, F-129, F-130, F-133, F-17, F-157, F-110, F-131, F-149, F-100, F-221, F-137, F-109, F-172, F-159, F-230, F-18, F-88, F-150, F-55, F-69, F-93. Q = quercetin; S = syringetin; K = kaempferol; D = datiscetin; 6,7OH = 6,7-dihydroxyflavone; Ap = apigenin; N = naringenin; B = baicalein; Ac = acacetin; L = luteolin; glu = glucose; gluCOOH = glucuronic acid; rham = rhamnose; rut = rutinose; vici = vicianose; rob = robinose; neo = neohesperidose.

The identification of the sugar unit cannot be accomplished by MS, as there is neither a connectivity difference nor a mass difference between a galactose and a glucose, for example. It has been previously demonstrated that the comparison of coupling constants of the sugar moieties is crucial for the identification of the sugar isomers (Moco et al., 2006b).

In the case of disaccharide conjugation in the flavonoid, the connectivity between the two sugar units by the glycosidic bond should be elucidated for the identification of the disaccharide itself. Thus, the effect of the position of the glycosidic linkage in disaccharide moieties on the flavonoid derivatives was assessed by comparing the chemical shifts of two differently linked disaccharides: rutinose (6 -> 1 bond) and neohesperidose (2 -> 1 bond). The flavones rhoifolin (F-18) and isorhoifolin (F-55) and the flavanones naringin (F-230) and narirutin (F-150) were used in this analysis. Due to the high similarity between the molecules, most of the chemical shift values are nearly identical (|Δppm| < 0.2). However, the protons H5’’’ (about -0.26 ppm) and H1’’’ (-0.55 ppm) show noticeable shifts when comparing the rutinose to the neohesperidose. In particular, the shift of the proton H1’’’ is readily recognizable, Fig. 6.11.

Figure 6.11. Distinction between 1H NMR spectra of the rutinose and the neohesperidose moieties when conjugated to the aglycone apigenin: isorhoifolin (F-55) and rhoifolin (F-18).

CONCLUSION

The identification of compounds from extracts or mixtures can be a lengthy task in which the combination of analytical tools with spectroscopic and spectrometric techniques can provide sufficient information for the characterization of molecules. A strategy that can lead to successful results is to combine LC-MS with NMR. A putative identification can be obtained by performing LC-MS. This putative identification can be further investigated and confirmed or rejected by performing NMR. Using 1H (only) NMR spectroscopy based on an extensive experimental database, and taking advantage of a 1H NMR prediction and fitting procedure, can reveal the identity of metabolites such as flavonoids. Using this strategy, assessing the positions of hydroxylation, methoxylation, methylation and sugar conjugation is feasible, as well as identifying glycosidic links in conjugated disaccharides of flavonoids.

SUPPLEMENTARY MATERIALS

Table 6S.1. List of experimental (OBS) and predicted (PRED) 1H chemical shifts of the compounds in the Flavonoid Database extracted by PERCH. Fi is the identifier corresponding to each compound (i = 1, …, 250) and Hj are the numbered protons present in each compound; H2_6 = H2’/H6’; HjM = Hj belonging to a methoxy group, attached to carbon j.

Fi Hj OBS PRED Fi Hj OBS PRED Fi Hj OBS PRED Fi Hj OBS PRED Fi Hj OBS PRED

F1 2 5.552 5.654 F137 5’’ 3.435 3.602 F190 5 8.204 8.159 F230 2’’ 3.631 3.875 F57 6’ 7.431 7.62

F1 2_6 7.521 7.457 F137 6 6.278 6.193 F190 5’ 7.017 7.001 F230 2_6 7.317 7.487 F57 6M 3.834 3.855

F1 3_5 7.416 7.392 F137 6A 3.949 3.869 F190 6 7.453 7.519 F230 3’’ 3.58 3.723 F57 7M 3.981 3.993

F1 3A 2.864 3.155 F137 6B 3.78 3.905 F190 6’ 7.602 7.907 F230 3_5 6.814 6.914 F57 8 6.8 6.818

F1 3B 3.126 2.863 F138 2’ 7.723 7.343 F190 7 7.751 7.765 F230 3A 2.756 3.169 F58 2’ 7.562 7.804

F1 4’ 7.365 7.35 F138 5’ 6.876 6.89 F190 8 7.605 7.641 F230 3B 3.169 3.024 F58 3 6.787 6.809

F1 5 7.865 7.837 F138 6’ 7.621 7.35 F191 2_6 8.094 8.245 F230 4’’ 3.381 3.464 F58 3’M 3.956 3.964

F1 6 7.068 7.061 F138 8 6.497 6.526 F191 3_5 7.55 7.58 F230 5’’ 3.446 3.425 F58 4’M 3.925 3.921

F1 7 7.557 7.547 F139 2M 3.89 3.918 F191 3M 3.806 3.83 F230 6 6.154 6.232 F58 5 8.038 8.075

F1 8 7.073 7.046 F139 3’ 6.976 6.996 F191 4’ 7.546 7.54 F230 6A 3.861 3.708 F58 5’ 7.127 7.012

F10 2_6 7.877 7.949 F139 3M 3.892 3.937 F191 5 7.553 7.507 F230 6B 3.68 3.651 F58 6 7.071 7.087

F10 3 6.677 6.818 F139 4 7.129 7.043 F191 6M 3.906 3.839 F230 8 6.175 6.251 F58 6’ 7.678 7.674

F10 3_5 6.94 6.936 F139 4’ 7.528 7.534 F191 7 7.38 7.434 F231 2 4.558 4.784 F58 7M 3.971 3.901

F10 5 7.971 8.016 F139 5 7.15 7.098 F191 8 7.615 7.628 F231 2’ 6.832 6.979 F58 8 7.241 7.15

F10 6 6.928 6.898 F139 5’ 6.995 7.008 F192 2_6 7.904 7.994 F231 3 3.968 4.118 F59 1’’’ 5.247 5.331

F10 8 6.971 6.877 F139 6 7.469 7.333 F192 3 6.741 6.819 F231 4A 2.843 2.856 F59 2’’’ 3.926 4.027

F100 1’’ 5.568 5.396 F139 6’ 8.1 8.056 F192 3_5 6.942 6.94 F231 4B 2.5 2.653 F59 3’’’ 3.582 4.021

F100 1’’’ 5.083 5.32 F139 7 8.216 8.233 F192 5 7.424 7.483 F231 5’ 6.756 6.811 F59 4’’’ 3.383 3.491

F100 1’’’’ 4.51 4.35 F139 8 7.908 7.814 F192 7 7.266 7.303 F231 6 5.921 5.778 F59 5’’’ 3.872 4.032

F100 2’’ 4.017 3.98 F14 2_6 7.937 7.972 F192 8 7.585 7.634 F231 6’ 6.715 6.94 F59 6’’’ 1.283 1.175

F100 2’’’ 3.795 3.57 F14 3 6.807 6.858 F193 2_6 7.85 8.117 F231 8 5.849 5.91 F59 1’’ 5.108 5.212

F100 2’’’’ 3.565 3.431 F14 3_5 6.956 6.935 F193 3_5 7.557 7.567 F232 1’’ 4.888 5.021 F59 2 5.328 5.549

F100 2_6 8.117 8.049 F14 5 8.142 8.132 F193 3M 2.203 2.18 F232 2 6.786 6.662 F59 2’ 6.916 7.012

F100 3’’ 3.828 4.22 F14 6 7.49 7.471 F193 4’ 7.559 7.546 F232 2’’ 3.443 3.76 F59 2’’ 3.635 3.662

F100 3’’’ 3.532 3.971 F14 7 7.808 7.793 F193 5 8.366 8.12 F232 2_6 7.364 7.407 F59 3’’ 3.582 3.646

F100 3’’’’ 3.481 3.898 F14 8 7.716 7.653 F193 6 7.525 7.387 F232 3’’ 3.467 3.491 F59 3A 2.76 2.889

F100 3_5 6.863 6.892 F142 2_6 7.481 7.828 F193 7 8.295 8.311 F232 3_5 6.762 6.832 F59 3B 3.117 3.135

F100 4’’ 3.479 3.506 F142 26 7.481 7.798 F194 2_6 8.1 8.24 F232 4 6.445 6.501 F59 4’’ 3.381 3.435

F100 4’’’ 3.763 4.001 F142 3 6.608 7.107 F194 3_5 7.548 7.582 F232 4’’ 3.381 3.383 F59 5’ 6.775 6.805

F100 4’’’’ 3.275 3.369 F142 3_5 6.748 6.85 F194 3M 3.794 3.848 F232 5’’ 3.462 3.359 F59 5’’ 3.45 3.471

F100 5’’ 3.603 3.695 F142 35 6.748 6.811 F194 4’ 7.541 7.538 F232 6 6.61 6.651 F59 6 6.15 6.114

F100 5’’’ 3.616 3.664 F142 6 6.471 6.324 F194 5 8.062 8.046 F232 6A 3.927 3.589 F59 6’ 6.795 6.966

F100 5’’’’ 3.525 3.758 F143 2_6 7.958 8.002 F194 6 7.051 7.04 F232 6B 3.711 3.606 F59 6A 3.864 3.688

F100 6 6.48 6.443 F143 3 6.676 6.928 F194 7M 3.944 3.944 F232 7 6.847 6.395 F59 6B 3.674 3.638

F100 6’’ 1.259 1.142 F143 3_5 7.535 7.563 F194 8 7.141 7.146 F232 8 7.014 6.745 F59 8 6.176 6.231

F100 6’’’’ 1.178 1.17 F143 4’ 7.555 7.556 F195 2 8.128 8.174 F233 2 7.183 7.091 F6 2_6 8.013 8.014

F100 6A 3.722 3.46 F143 5M 3.894 3.957 F195 2_6 7.453 7.442 F233 3’ 6.29 6.455 F6 3 6.813 6.839

F100 6B 3.403 3.751 F143 6 6.5 6.607 F195 3_5 6.975 7.001 F233 5 6.82 6.833 F6 3_5 7.565 7.541

F100 8 6.763 6.608 F143 7M 3.931 3.914 F195 4’M 3.822 3.826 F233 5’ 6.417 6.504 F6 4’ 7.582 7.562

F101 2_6 7.349 7.467 F143 8 6.761 6.515 F195 5 7.472 7.527 F233 6 7.113 7.101 F6 5 7.999 8.004

F101 3 6.853 6.964 F144 2 5.46 5.593 F195 8 6.9 6.997 F233 6’ 7.944 7.825 F6 6 6.956 6.896

F101 35M 3.958 3.971 F144 2_6 7.488 7.554 F197 2 4.971 5.391 F233 7 7.721 7.448 F6 8 7.006 6.864

F101 4’M 3.861 3.888 F144 3_5 7.404 7.471 F197 2_6 7.349 7.383 F233 8 7.541 7.692 F60 1’’ 4.984 4.798

F101 6M 3.908 3.915 F144 3A 2.744 2.882 F197 3 4.535 4.667 F234 2 4.813 4.946 F60 2 7.196 7.099

F101 7M 4.105 3.944 F144 3B 3.012 3.093 F197 3_5 6.828 6.815 F234 2’ 6.97 6.877 F60 2’’ 3.555 3.458

F101 8M 3.993 4.08 F144 4’ 7.354 7.387 F197 6 5.914 5.915 F234 3 4.174 4.399 F60 3’’ 3.503 3.424

F102 2 5.4 5.621 F144 5M 3.843 3.981 F197 8 5.871 5.976 F234 4A 2.733 2.761 F60 4’’ 3.425 3.162

F102 2_6 6.811 6.825 F144 6 6.201 6.16 F198 2 4.832 5.413 F234 4B 2.856 2.797 F60 5 6.826 6.813

F102 35M 3.861 3.911 F144 7M 3.843 3.882 F198 2_6 6.523 6.575 F234 5’ 6.754 6.798 F60 5’ 6.847 6.716

F102 3A 2.759 3.021 F144 8 6.223 6.224 F198 3 4.456 4.633 F234 6 5.935 5.754 F60 5’’ 3.487 3.093

F102 3B 3.125 2.958 F145 2 5.487 5.669 F198 6 5.916 5.912 F234 6’ 6.793 6.859 F60 6 7.125 7.089

F102 4’M 3.773 3.845 F145 2_6 7.496 7.484 F198 8 5.882 5.956 F234 8 5.91 5.969 F60 6’ 7.623 7.476

F102 6 5.896 5.885 F145 3_5 7.412 7.417 F199 2_6 7.09 7.12 F235 3’ 6.436 6.452 F60 6A 3.916 3.658

F102 8 5.937 6.044 F145 3A 2.808 3.182 F199 3 5.119 5.705 F235 5’ 6.489 6.455 F60 6B 3.723 3.721

F103 1’’ 4.671 5.406 F145 3B 3.119 2.865 F199 3_5 6.882 6.904 F235 6 6.184 6.195 F60 7 7.771 7.328

F103 2’ 7.901 7.591 F145 4’ 7.364 7.363 F199 7M 3.852 3.909 F235 6’ 7.429 7.806 F60 8 7.562 7.771

F103 2’’ 3.663 3.852 F145 6 6.054 6.048 F199 8 6.413 6.385 F235 8 6.338 6.457 F62 2_6 7.881 8.077

F103 3’’ 3.478 3.774 F145 7M 3.812 3.876 F2 2 5.457 5.601 F236 2’ 7.371 7.508 F62 3 6.558 6.971

F103 4’’ 3.67 4.174 F145 8 6.09 6.127 F2 2_6 7.508 7.6 F236 3 6.534 6.839 F62 3_5 7.04 7.135

F103 5’ 6.861 6.875 F146 2_6 7.224 7.37 F2 3_5 7.408 7.442 F236 5’ 6.899 6.91 F62 4’M 3.863 3.845

F103 5’’ 3.696 3.856 F146 3 6.688 6.931 F2 3A 2.816 2.981 F236 6 6.204 6.175 F62 5M 3.881 3.964

F103 6 6.242 6.276 F146 35M 3.966 3.968 F2 3B 3.062 2.823 F236 6’ 7.378 7.456 F62 6 6.473 6.599

F103 6’ 7.883 7.501 F146 4’M 3.885 3.884 F2 4’ 7.354 7.379 F236 8 6.436 6.402 F62 7M 3.918 3.909

F104 1’’ 4.992 5.102 F146 6 6.259 6.18 F2 5 7.213 7.323 F237 2’ 7.391 7.569 F62 8 6.716 6.53

F104 10 6.766 6.01 F146 8 6.497 6.437 F2 7 7.045 6.975 F237 3 6.581 6.829 F64 1’’ 5.061 4.877

F104 2’ 7.559 6.903 F147 2 7.106 7.211 F2 8 6.941 6.993 F237 4’M 3.94 3.823 F64 2’’ 3.457 3.567

F104 2’’ 3.576 3.481 F147 3_5 5.844 5.955 F20 2_6 7.987 8.011 F237 5’ 7.075 6.99 F64 26 7.055 7.225

F104 3’’ 3.51 3.398 F147 5 6.788 6.798 F20 3 6.741 6.896 F237 6 6.449 6.152 F64 3’ 6.731 6.389

F104 4 7.249 7.402 F147 6 6.981 7.028 F20 3_5 7.558 7.541 F237 6’ 7.493 7.592 F64 3’’ 3.461 3.468

F104 4’’ 3.435 3.301 F147 7 7.633 7.405 F20 4’ 7.583 7.545 F237 8 6.21 6.436 F64 35 6.679 6.877

F104 5 7.139 6.737 F147 8 8.028 7.805 F20 6 6.234 6.162 F238 2_6 7.542 7.556 F64 4’ 7.323 7.297

F104 5’ 6.866 6.872 F149 1’’’ 4.043 4.416 F20 8 6.49 6.389 F238 26 7.542 7.587 F64 4’’ 3.368 3.327

F104 5’’ 3.495 3.527 F149 2’’’ 3.368 3.668 F200 2_6 8.191 7.905 F238 3_5 7.337 7.374 F64 5’ 6.57 6.309

F104 6’ 7.358 7.061 F149 3’’’ 3.199 3.721 F200 3_5 6.937 6.915 F238 35 7.337 7.391 F64 5’’ 3.461 3.254

F104 6A 3.916 3.71 F149 4’’’ 3.641 3.905 F200 5 8.166 8.157 F238 4 7.233 7.312 F64 6A 3.88 3.687

F104 6B 3.731 3.678 F149 5A 3.14 3.783 F200 6 7.446 7.51 F238 4’ 7.233 7.299 F64 6B 3.685 3.642

F105 2 4.932 5.379 F149 5B 3.681 3.659 F200 7 7.756 7.771 F238 7 7.162 6.982 F64 7 2.889 3.389

F105 2’ 6.979 7.047 F149 1’’ 5.193 5.451 F200 8 7.679 7.633 F238 8 7.162 6.983 F64 8 3.433 5.116

F105 3 4.475 4.563 F149 2’ 7.705 7.554 F202 1’’’ 5.267 5.531 F24 2_6 8.11 8.025 F65 2_6 7.23 7.404

F105 5 7.721 7.791 F149 2’’ 3.502 3.577 F202 2’’’ 3.935 4.098 F24 3 6.81 6.799 F65 3 6.67 6.999

F105 5’ 6.802 6.831 F149 3’’ 3.422 3.642 F202 3’’’ 3.603 4.09 F24 3_5 7.563 7.541 F65 35M 3.947 3.971

F105 6 6.528 6.496 F149 4’’ 3.305 3.602 F202 4’’’ 3.403 3.692 F24 4’ 7.579 7.561 F65 4’M 3.843 3.871

F105 6’ 6.856 6.938 F149 5’ 6.874 6.891 F202 5’’’ 3.939 4.049 F24 5 7.549 7.52 F65 5M 3.904 3.97

F105 8 6.329 6.54 F149 5’’ 3.404 3.631 F202 6’’’ 1.324 1.213 F24 6 6.978 6.962 F65 6 6.51 6.602

F106 26 7.508 7.542 F149 6 6.204 6.24 F202 1’’ 5.05 5.314 F241 46 6.307 6.382 F65 7M 3.944 3.951

F106 3’ 6.083 6.226 F149 6’ 7.683 7.508 F202 2’’ 3.648 3.678 F241 5 6.483 6.668 F65 8 6.821 6.553

F106 35 6.829 6.877 F149 6A 3.906 3.58 F202 26 7.589 7.64 F242 36 6.645 6.666 F66 2’ 7.493 7.723

F106 4’M 3.834 3.862 F149 6B 3.608 3.737 F202 3’’ 3.591 3.51 F242 45 6.743 6.855 F66 3 6.633 6.863

F106 5’ 6.096 6.238 F149 8 6.405 6.403 F202 3_5 6.088 6.153 F243 26 6.756 6.786 F66 3’M 3.966 3.958

F106 6’M 3.937 3.965 F15 1’’ 5.243 5.522 F202 35 6.972 6.96 F243 35 7.149 7.179 F66 5’ 6.941 6.982

F106 7 7.698 7.718 F15 2’ 7.71 7.626 F202 4’’ 3.397 3.416 F243 4 6.786 6.797 F66 6 6.211 6.167

F106 8 7.762 7.81 F15 2’’ 3.483 3.383 F202 4M 3.84 3.878 F244 26 6.692 6.863 F66 6’ 7.52 7.588

F107 2_6 8.015 7.953 F15 3’’ 3.429 3.53 F202 5’’ 3.448 3.458 F244 35 6.741 6.863 F66 8 6.469 6.429

F107 3 6.832 6.874 F15 4’’ 3.35 3.322 F202 6A 3.898 3.769 F244 4M 3.708 3.766 F67 2_6 7.275 7.442

F107 3_5 7.106 7.134 F15 5’ 6.868 6.876 F202 6B 3.71 3.737 F245 2 6.339 6.474 F67 3 6.71 6.969

F107 4’M 3.895 3.842 F15 5’’ 3.224 3.356 F202 7 8.078 8.098 F245 3M 3.729 3.731 F67 35M 3.954 3.954

F107 5 8.141 8.162 F15 6 6.2 6.208 F202 8 7.75 7.903 F245 4 6.382 6.536 F67 4’M 3.846 3.848

F107 6 7.491 7.481 F15 6’ 7.582 7.601 F203 11 3.222 3.27 F245 5 7.04 7.168 F67 5M 3.924 3.958

F107 7 7.81 7.801 F15 6A 3.711 3.632 F203 12 5.257 5.165 F245 6 6.355 6.521 F67 6M 3.866 3.912

F107 8 7.715 7.694 F15 6B 3.576 3.694 F203 14 1.742 1.668 F246 2M 3.833 3.802 F67 7M 4.018 3.997

F108 2_6 7.839 7.933 F15 8 6.388 6.395 F203 15 1.69 1.824 F246 3 6.78 6.861 F67 8 7.176 6.815

F108 3 6.593 6.93 F150 1’’’ 4.69 4.643 F203 2 5.386 5.421 F246 4 6.77 6.876 F69 1’’ 4.934 4.836

F108 3_5 6.957 6.937 F150 2’’’ 3.875 3.869 F203 2_6 7.327 7.368 F246 5 6.772 6.903 F69 2’ 7.427 7.654

F108 6 6.365 6.44 F150 3’’’ 3.667 3.971 F203 3_5 6.815 6.875 F246 6 6.899 6.883 F69 2’’ 3.55 3.53

F108 7M 3.912 3.895 F150 4’’’ 3.331 3.369 F203 3A 2.691 3.038 F247 1M 3.774 3.743 F69 3 6.593 6.949

F108 8 6.587 6.55 F150 5’’’ 3.62 3.825 F203 3B 3.051 2.844 F247 26 6.894 6.949 F69 3’’ 3.507 3.464

F109 1’’ 4.965 5.083 F150 6’’’ 1.188 1.143 F203 5 7.568 7.691 F247 35 7.25 7.279 F69 4’’ 3.344 3.401

F109 2 5.382 5.449 F150 1’’ 4.945 5.272 F203 7M 3.866 3.908 F247 4 6.897 6.94 F69 5’ 7.31 7.172

F109 2’’ 3.454 3.487 F150 2 5.404 5.378 F203 8 6.528 6.688 F248 14M 3.73 3.764 F69 5’’ 3.493 3.356

F109 2_6 7.318 7.384 F150 2’’ 3.438 3.932 F204 11 8.644 8.391 F248 23 6.827 6.948 F69 6 6.205 6.234

F109 3’’ 3.43 3.596 F150 2_6 7.332 7.477 F204 12 7.769 7.561 F248 46 6.827 6.948 F69 6’ 7.443 7.641

F109 3_5 6.814 6.835 F150 3’’ 3.438 3.632 F204 13 7.772 7.566 F249 2 6.939 6.522 F69 6A 3.926 3.79

F109 3A 2.748 3.01 F150 3_5 6.82 6.897 F204 14 8.012 7.804 F249 3M 3.853 3.773 F69 6B 3.735 3.806

F109 3B 3.167 3.007 F150 3A 2.758 2.888 F204 2_6 8.14 8.047 F249 5 6.747 6.75 F69 8 6.439 6.416

F109 4’’ 3.382 3.408 F150 3B 3.168 3.086 F204 3 7.033 6.836 F249 6 6.77 6.377 F7 2’ 7.452 7.486

F109 5’’ 3.453 3.32 F150 4’’ 3.369 3.392 F204 3_5 7.627 7.556 F25 2’ 7.763 7.604 F7 3 6.75 6.851

F109 6 6.186 6.438 F150 5’’ 3.584 3.612 F204 4’ 7.629 7.552 F25 4’ 7.769 6.863 F7 5 8.133 8.141

F109 6A 3.871 3.783 F150 6 6.198 6.185 F204 5 8.034 8.287 F25 5 8.182 8.167 F7 5’ 6.927 6.921

F109 6B 3.682 3.777 F150 6A 3.986 3.756 F204 6 7.852 7.766 F25 5’ 7.349 7.323 F7 6 7.484 7.476

F109 8 6.211 6.634 F150 6B 3.62 3.666 F206 2_6 7.947 8.037 F25 6 7.459 7.508 F7 6’ 7.464 7.466

F11 2_6 8.017 8.011 F150 8 6.176 6.187 F206 3 6.636 6.933 F25 6’ 6.915 7.676 F7 7 7.802 7.796

F11 3 6.884 6.879 F151 2_6 7.589 7.429 F206 3_5 7.092 7.132 F25 7 7.779 7.771 F7 8 7.69 7.677

F11 3_5 7.55 7.536 F151 35M 3.926 3.941 F206 4’M 3.89 3.873 F25 8 7.683 7.589 F70 1’’ 4.971 5.257

F11 4’ 7.575 7.568 F151 4’M 3.849 3.843 F206 6 6.214 6.134 F250 12M 3.816 3.852 F70 2 5.325 5.437

F11 5 8.126 8.129 F151 5 7.992 7.986 F206 8 6.464 6.429 F250 36 6.942 6.95 F70 2’ 6.917 7.029

F11 6 7.479 7.47 F151 6 6.921 6.916 F207 2 5.319 5.462 F250 45 6.896 6.948 F70 2’’ 3.443 3.53

F11 7 7.802 7.791 F151 8 6.94 6.936 F207 2’ 6.948 6.969 F26 2 8.061 8.164 F70 3’’ 3.443 3.559

F11 8 7.702 7.658 F152 1’’ 5.403 5.283 F207 3A 2.716 3.033 F26 2_6 7.46 7.468 F70 3A 2.754 2.961

F110 1’’’ 4.513 4.387 F152 2’’ 3.816 3.471 F207 3B 3.059 3.005 F26 3_5 6.97 7.008 F70 3B 3.121 3.026

F110 2’’’ 3.627 3.326 F152 2_6 7.573 7.527 F207 4’M 3.861 3.878 F26 4’M 3.819 3.855 F70 4’’ 3.382 3.329

F110 3’’’ 3.518 3.477 F152 3’’ 3.567 3.53 F207 5’ 6.934 6.902 F26 6 6.217 6.232 F70 5’ 6.784 6.884

F110 4’’’ 3.277 3.333 F152 35M 3.94 3.921 F207 6 5.88 5.829 F26 8 6.333 6.344 F70 5’’ 3.45 3.376

F110 5’’’ 3.447 3.825 F152 4’’ 3.84 3.524 F207 6’ 6.907 6.982 F27 2 5.499 5.647 F70 6 6.183 6.164

F110 6’’’ 1.119 1.16 F152 5’’ 3.483 3.383 F207 8 5.905 5.871 F27 2_6 7.513 7.54 F70 6’ 6.789 7.013

F110 1’’ 5.129 5.68 F152 6 6.421 6.208 F208 11 8.004 7.926 F27 3_5 7.411 7.44 F70 6A 3.875 3.662

F110 2’’ 3.42 3.898 F152 6A 3.598 3.648 F208 12 7.645 7.553 F27 3A 2.846 3.103 F70 6B 3.687 3.681

F110 2_6 8.063 7.975 F152 6B 3.66 3.671 F208 13 7.741 7.697 F27 3B 3.092 2.876 F70 8 6.209 6.21

F110 3’’ 3.326 3.49 F152 8 6.207 6.409 F208 14 9.943 8.825 F27 4’ 7.359 7.361 F71 2 7.367 7.28

F110 3_5 6.89 6.954 F156 1’’’ 4.527 4.515 F208 2_6 8.072 8.014 F27 5 7.324 7.494 F71 3’ 6.29 6.481

F110 4’’ 3.249 3.356 F156 2’’’ 3.606 3.375 F208 3 7.031 6.917 F27 6M 3.798 3.888 F71 3M 3.943 3.94

F110 5’’ 3.412 3.461 F156 3’’’ 3.478 3.587 F208 3_5 7.586 7.547 F27 7 7.173 7.053 F71 5 6.847 6.876

F110 6 6.213 6.193 F156 4’’’ 3.25 3.057 F208 4’ 7.593 7.56 F27 8 7.018 7.016 F71 5’ 6.414 6.524

F110 6A 3.802 3.802 F156 5’’’ 3.41 3.606 F208 7 8.263 8.101 F28 2 5.493 5.586 F71 6 7.218 7.127

F110 6B 3.38 3.602 F156 6’’’ 1.097 1.058 F208 8 7.767 7.615 F28 2_6 7.495 7.515 F71 6’ 8.021 7.889

F110 8 6.407 6.401 F156 1’’ 5.239 6.365 F209 2 5.454 5.484 F28 3_5 7.406 7.42 F71 7 7.781 7.758

F112 2’ 7.384 7.653 F156 2’ 7.945 7.887 F209 2_6 7.487 7.565 F28 3A 2.81 3.168 F71 8 7.644 7.815

F112 3 6.556 6.866 F156 2’’ 3.466 3.99 F209 3_5 7.408 7.444 F28 3B 3.08 2.931 F72 2_6 7.974 8.023

F112 4’M 3.931 3.908 F156 3’’ 3.433 3.758 F209 3A 2.773 3.189 F28 4’ 7.355 7.358 F72 3 6.772 6.804

F112 5’ 7.051 6.988 F156 3’M 3.949 3.825 F209 3B 3.084 2.905 F28 5M 3.872 3.974 F72 3_5 7.54 7.547

F112 5M 3.911 3.888 F156 4’’ 3.242 3.37 F209 4’ 7.358 7.377 F28 6 6.672 6.633 F72 4’ 7.553 7.542

F112 6’ 7.478 7.645 F156 5’ 6.916 6.993 F209 6 5.896 5.85 F28 7 7.475 7.479 F72 5 7.405 7.424

F112 6M 3.855 3.944 F156 5’’ 3.362 3.605 F209 8 5.933 5.863 F28 8 6.662 6.654 F72 8 7.036 7.034

F112 7M 3.997 3.993 F156 6 6.214 6.213 F21 2_6 8.121 7.955 F29 3’ 6.434 6.451 F73 2’ 7.363 7.5

F112 8 7.05 6.832 F156 6’ 7.63 7.793 F21 3_5 6.918 6.923 F29 5’ 6.487 6.445 F73 3 6.534 6.737

F113 2_6 8.075 8.011 F156 6A 3.815 3.602 F21 5 7.987 7.974 F29 6 6.183 6.206 F73 5’ 6.892 6.91

F113 3 6.709 6.804 F156 6B 3.42 3.615 F21 6 6.913 6.907 F29 6’ 7.431 7.779 F73 6’ 7.37 7.492

F113 3_5 7.546 7.536 F156 8 6.416 6.434 F21 8 6.922 6.892 F29 8 6.338 6.444 F73 6M 3.875 3.899

F113 4’ 7.571 7.559 F157 1’’’ 4.518 4.218 F210 1’’ 5.031 5.171 F3 2_6 8.291 8.221 F73 8 6.53 6.496

F113 5 6.751 7.449 F157 2’’’ 3.627 3.274 F210 2’’ 3.471 3.764 F3 3_5 7.538 7.545 F74 2_6 8.012 8.018

F113 8 6.301 7.037 F157 3’’’ 3.534 3.688 F210 26 7.059 7.273 F3 4’ 7.484 7.516 F74 3 6.77 6.914

F114 2_6 8.093 7.902 F157 4’’’ 3.276 3.312 F210 3’ 6.178 5.963 F3 5 8.19 8.164 F74 3_5 7.559 7.56

F114 3_5 7.057 7.112 F157 5’’’ 3.444 3.875 F210 3’’ 3.447 3.398 F3 6 7.464 7.512 F74 4’ 7.584 7.547

F114 3M 3.816 3.767 F157 6’’’ 1.119 1.247 F210 35 6.679 6.894 F3 7 7.784 7.762 F74 7M 3.999 3.849

F114 4’M 3.912 3.871 F157 1’’ 5.109 5.402 F210 4’’ 3.38 3.274 F3 8 7.704 7.595 F74 8 6.861 6.607

F114 6 6.348 6.45 F157 2’ 7.665 7.621 F210 5’ 5.953 5.956 F30 2_6 8.056 8.084 F75 2’ 7.487 7.809

F114 7M 3.899 3.941 F157 2’’ 3.468 3.882 F210 5’’ 3.444 3.316 F30 26 7.705 7.6 F75 3 6.636 6.97

F114 8 6.537 6.544 F157 3’’ 3.41 3.395 F210 6A 3.902 3.679 F30 3_5 7.533 7.58 F75 3’M 3.94 3.918

F115 2 4.855 5.424 F157 4’’ 3.265 3.405 F210 6B 3.711 3.623 F30 35 6.99 6.988 F75 4’M 3.914 3.85

F115 2_6 6.542 6.543 F157 5’ 6.873 6.878 F210 7A 2.882 2.896 F30 4’ 7.616 7.637 F75 5’ 7.099 7.008

F115 3 4.436 4.615 F157 5’’ 3.322 3.362 F210 7B 2.868 2.843 F30 4M 3.846 3.793 F75 5M 3.91 3.975

F115 5 7.716 7.799 F157 6 6.211 6.153 F210 8A 3.415 3.172 F30 7 7.764 7.799 F75 6 6.524 6.599

F115 6 6.526 6.489 F157 6’ 7.63 7.633 F210 8B 3.463 3.326 F30 8 7.61 7.626 F75 6’ 7.609 7.667

F115 8 6.33 6.576 F157 6A 3.802 3.583 F212 2’ 7.949 8.209 F31 2_6 8.128 8.047 F75 7M 3.949 3.853

F116 2’ 7.433 7.701 F157 6B 3.39 3.431 F212 2_6 7.555 7.843 F31 3 6.99 6.819 F75 8 6.806 6.581

F116 3 6.653 6.875 F157 8 6.401 6.443 F212 3 6.644 7.076 F31 3_5 7.66 7.547 F77 1’’ 5.406 5.45

F116 4’M 3.948 3.815 F158 2 5.333 5.458 F212 3’’ 6.623 7.02 F31 4’ 7.68 7.543 F77 2’ 7.926 7.865

F116 5’ 7.089 6.999 F158 2_6 7.308 7.361 F212 3_5 6.73 6.892 F31 5 7.641 7.506 F77 2’’ 3.462 3.563

F116 6’ 7.539 7.77 F158 3_5 6.814 6.85 F212 5’ 7.146 7.105 F31 6M 3.999 3.879 F77 3’’ 3.441 3.543

F116 6M 3.836 3.899 F158 3A 2.692 3.301 F212 6 6.195 6.131 F31 7 7.506 7.461 F77 3’M 3.943 3.973

F116 7M 3.984 3.98 F158 3B 3.105 3.034 F212 6’ 7.948 8.105 F31 8 7.776 7.622 F77 4’’ 3.306 3.36

F116 8 6.82 6.817 F158 6 5.88 5.865 F212 6’’ 6.393 6.319 F32 2_6 8.022 8.006 F77 5’ 6.906 6.982

F117 1’’ 4.935 5.217 F158 8 5.893 5.853 F212 8 6.415 6.356 F32 3 6.875 6.848 F77 5’’ 3.237 3.343

F117 2 5.416 5.488 F159 1’’ 5.189 5.075 F213 26 7.036 7.232 F32 3_5 7.561 7.552 F77 6 6.204 6.186

F117 2’ 6.982 7.048 F159 2’’ 3.639 3.599 F213 3_5 5.808 5.842 F32 4’ 7.584 7.554 F77 6’ 7.586 7.654

F117 2’’ 3.351 3.669 F159 2_6 7.993 8.058 F213 35 6.685 6.844 F32 5 7.916 7.891 F77 6A 3.73 3.628

F117 3’’ 3.466 3.422 F159 3 6.77 6.872 F213 7A 2.845 2.833 F32 6M 2.469 2.416 F77 6B 3.556 3.642

F117 3A 3.104 2.935 F159 3’’ 3.577 3.576 F213 7B 2.845 2.846 F32 7 7.64 7.569 F77 8 6.402 6.44

F117 3B 2.798 3.04 F159 3_5 7.543 7.576 F213 8A 3.264 3.297 F32 8 7.611 7.477 F8 2_6 7.91 7.966

F117 4’’ 3.351 3.332 F159 4’ 7.574 7.572 F213 8B 3.264 3.303 F33 2_6 8.115 8.242 F8 3 6.732 6.916

F117 5 7.371 7.394 F159 4’’ 3.67 4.019 F214 2 5.793 5.689 F33 3_5 7.557 7.569 F8 3_5 6.945 6.938

F117 5’ 6.784 6.869 F159 5’’ 4.156 3.849 F214 3’ 6.831 6.837 F33 3M 3.816 3.809 F8 6 6.776 6.74

F117 5’’ 3.494 3.274 F159 8 6.996 6.7 F214 3A 2.896 3.195 F33 4’ 7.553 7.558 F8 7 7.607 7.667

F117 6 6.922 6.626 F161 2’ 7.043 7.132 F214 3B 2.884 2.912 F33 5 8.187 8.19 F8 8 7.094 6.98

F117 6’ 6.848 6.836 F161 2A 2.225 1.957 F214 4’ 7.171 7.193 F33 6 7.481 7.488 F83 2_6 8.027 8.034

F117 6A 3.893 3.593 F161 2B 2.078 2.166 F214 5 7.866 7.955 F33 7 7.797 7.751 F83 26 7.732 7.67

F117 6B 3.707 3.612 F161 3 5.328 5.209 F214 5’ 6.893 6.925 F33 8 7.683 7.579 F83 3_5 6.905 6.986

F118 2 5.342 5.564 F161 4 3.72 3.819 F214 6 7.063 7.137 F34 26 7.602 7.592 F83 35 7.431 7.436

F118 2’ 7.072 7.145 F161 5 4.164 4.126 F214 6’ 7.507 7.304 F34 3’ 6.108 6.142 F83 4 7.415 7.358

F118 3A 2.708 3.077 F161 5’ 6.774 6.81 F214 7 7.554 7.552 F34 35 6.98 6.951 F83 7 7.747 7.823

F118 3B 3.137 2.87 F161 6’ 6.951 7.055 F214 8 7.086 7.121 F34 4M 3.845 3.875 F83 8 7.757 7.626

F118 3’M 3.876 3.965 F161 6A 2.174 2.363 F215 1’’ 5.303 5.1 F34 4’M 3.84 3.869 F84 1’’ 4.899 4.564

F118 5’ 6.818 6.921 F161 6B 2.044 1.799 F215 2’’ 3.986 3.524 F34 5’ 6.094 6.114 F84 2’ 7.359 7.571

F118 6 5.885 5.877 F161 7’ 7.555 7.067 F215 2_6 8.011 6.862 F34 6’M 3.945 3.938 F84 2’’ 4.158 3.527

F118 6’ 6.918 7.027 F161 8’ 6.259 6.646 F215 3’’ 3.674 3.494 F34 7 7.717 7.97 F84 3 6.54 6.922

F118 8 5.907 6.022 F162 2_6 8.27 8.186 F215 35M 4.013 3.955 F34 8 7.804 7.875 F84 3’’ 3.476 3.438

F119 2_6 8.047 8.061 F162 3_5 7.526 7.547 F215 4 9.043 7.835 F35 2_6 7.986 8.016 F84 4’’ 3.478 3.326

F119 26 7.621 7.545 F162 4’ 7.465 7.521 F215 4’’ 3.93 3.426 F35 3 6.769 6.968 F84 5’ 6.894 6.889

F119 3_5 7.531 7.568 F162 5 8.066 8.042 F215 5’’ 3.799 3.267 F35 3_5 7.548 7.554 F84 5’’ 3.419 3.368

F119 35 6.85 6.877 F162 6 7.049 7.076 F215 6 6.657 5.983 F35 4’ 7.569 7.553 F84 6’ 7.368 7.549

F119 4’ 7.612 7.652 F162 7M 3.958 3.927 F215 6A 3.755 3.73 F35 5M 3.943 3.954 F84 6A 3.876 3.777

F119 7 7.749 7.683 F162 8 7.168 7.178 F215 6B 3.777 3.734 F35 6 6.998 6.753 F84 6B 3.74 3.777

F119 8 7.564 7.652 F166 2_6 7.852 7.968 F215 8 6.965 5.882 F35 7 7.7 7.684 F84 8 6.483 6.419

F12 2_6 8.188 8.15 F166 3 6.593 6.903 F216 1’’’ 5.256 5.534 F35 8 7.248 6.966 F86 2 5.376 5.582

F12 3_5 7.499 7.562 F166 3_5 6.93 6.943 F216 2’’’ 3.926 4.13 F36 2_6 8.106 8.039 F86 2_6 7.321 7.284

F12 4’ 7.446 7.517 F166 6 6.209 6.172 F216 3’’’ 3.589 4.031 F36 26 7.742 7.718 F86 3_5 6.816 6.789

F12 6 6.199 6.206 F166 8 6.456 6.408 F216 4’’’ 3.394 3.52 F36 3_5 7.061 6.99 F86 3A 2.692 3.198

F12 8 6.414 6.389 F17 1’’ 5.35 5.554 F216 5’’’ 3.919 3.944 F36 35 7.437 7.442 F86 3B 3.047 2.909

F120 2’ 7.599 7.723 F17 2’ 7.338 7.607 F216 6’’’ 1.307 1.106 F36 4 7.423 7.425 F86 5 7.727 7.792

F120 3 6.802 6.856 F17 2’’ 4.217 4.246 F216 1’’ 5.026 5.364 F36 4’M 3.894 3.848 F86 6 6.495 6.468

F120 3’M 3.957 3.941 F17 3’’ 3.746 3.948 F216 2 6.701 6.929 F36 7 7.767 7.887 F86 8 6.353 6.476

F120 4’M 3.933 3.928 F17 4’’ 3.332 3.481 F216 2’’ 3.634 3.678 F36 8 7.767 7.652 F87 2 5.093 5.233

F120 5 7.887 7.78 F17 5’ 6.911 6.884 F216 3’’ 3.58 3.52 F37 2 5.435 5.613 F87 2_6 7.515 7.572

F120 5’ 7.152 7.039 F17 5’’ 3.419 3.978 F216 3_5 6.055 6.006 F37 2_6 7.341 7.21 F87 3_5 7.406 7.459

F120 6 7.261 7.24 F17 6 6.202 6.197 F216 4’’ 3.387 3.468 F37 3_5 6.821 6.739 F87 3A 3.18 3.003

F120 6’ 7.714 7.777 F17 6’ 7.306 7.628 F216 4M 3.81 3.839 F37 3A 2.791 3.218 F87 3B 2.595 2.984

F120 7M 4.025 4.005 F17 6’’ 0.941 1.058 F216 5 6.807 6.845 F37 3B 3.151 2.863 F87 4’ 7.344 7.389

F120 8M 4.032 3.98 F17 8 6.371 6.399 F216 5’’ 3.435 3.539 F37 5 7.855 7.841 F87 5 7.859 6.976

F121 2 8.136 8.176 F170 2 5.495 5.605 F216 6 6.652 6.812 F37 6 7.049 7.043 F87 6 6.943 7.418

F121 2_6 7.364 7.425 F170 2_6 7.501 7.515 F216 6A 3.887 3.779 F37 7 7.538 7.521 F87 7 7.205 6.92

F121 3_5 6.843 6.937 F170 3_5 7.41 7.413 F216 6B 3.698 3.774 F37 8 7.034 6.988 F87 8 6.922 7.598

F121 5 8.053 8.028 F170 3A 2.766 3.113 F216 7 2.833 2.866 F38 2_6 8.02 8.014 F88 1’’’ 5.285 5.367

F121 6 6.935 6.969 F170 3B 3.027 2.905 F216 8 3.306 3.306 F38 3 6.825 6.837 F88 2’’’ 3.942 4.118

F121 8 6.848 6.843 F170 4’ 7.357 7.37 F217 2_6 8.277 8.101 F38 3_5 7.559 7.56 F88 3’’’ 3.599 4.044

F122 2 7.389 7.343 F170 5 7.738 7.809 F217 3_5 7.53 7.561 F38 4’ 7.58 7.555 F88 4’’’ 3.406 3.5

F122 2_6 8.08 8.183 F170 6 6.513 6.467 F217 4’ 7.475 7.528 F38 5 8.031 8.047 F88 5’’’ 3.934 4.076

F122 3_5 7.536 7.614 F170 8 6.393 6.477 F217 5 7.567 7.417 F38 6 7.061 7.014 F88 6’’’ 1.326 1.183

F122 3M 3.912 3.974 F172 1’’ 5.147 5.037 F217 6M 3.918 3.905 F38 7M 3.959 3.882 F88 1’’ 5.208 5.133

F122 4’ 7.619 7.677 F172 2’’ 3.533 3.501 F217 7 7.386 7.429 F38 8 7.199 6.997 F88 2’’ 3.697 3.724

F122 4M 3.882 3.938 F172 2_6 8.057 8.046 F217 8 7.648 7.659 F39 2_6 8.049 8.053 F88 2_6 7.983 8.079

F122 5 7.009 6.912 F172 3 6.877 6.896 F218 1’’ 5.348 5.408 F39 3 6.841 6.838 F88 3 6.709 7.063

F122 6 7.309 7.159 F172 3’’ 3.52 3.606 F218 2’’ 3.639 3.228 F39 3_5 7.577 7.572 F88 3’’ 3.629 3.665

F122 7 7.751 7.931 F172 3_5 7.575 7.584 F218 2_6 7.984 6.846 F39 4’ 7.593 7.573 F88 3_5 7.096 7.1

F122 8 7.64 7.705 F172 4’ 7.596 7.574 F218 3’’ 3.547 3.454 F39 5 7.881 7.876 F88 4’’ 3.404 3.497

F123 2_6 7.971 7.989 F172 4’’ 3.413 3.51 F218 35M 4.004 3.931 F39 6 7.257 7.257 F88 4’M 3.798 3.832

F123 3 6.688 6.923 F172 5 8.093 8.134 F218 4 9.033 7.831 F39 7M 4.018 3.981 F88 5’’ 3.543 3.599

F123 3_5 7.548 7.554 F172 5’’ 3.591 3.391 F218 4’’ 3.409 3.163 F39 8M 4.018 4.006 F88 6 6.472 6.476

F123 4’ 7.56 7.545 F172 6 7.233 7.115 F218 5’’ 3.571 3.339 F4 2_6 8.044 8.023 F88 6A 3.92 3.72

F123 5M 2.764 3.977 F172 6A 3.95 3.79 F218 6 6.662 6.031 F4 3 6.882 6.949 F88 6B 3.707 3.78

F123 6 6.689 6.388 F172 6B 3.718 3.773 F218 6A 3.915 3.737 F4 3_5 7.578 7.548 F88 8 6.805 6.565

F123 8 6.831 6.41 F172 8 7.417 7.036 F218 6B 3.693 3.712 F4 4’ 7.609 7.571 F89 2_6 7.574 7.346

F124 2-6 7.619 7.541 F173 2 5.478 5.705 F218 8 6.967 5.904 F4 6 6.807 6.746 F89 35M 3.926 3.965

F124 3’ 6.291 6.464 F173 2’ 6.95 6.901 F219 2 4.985 5.419 F4 7 7.644 7.661 F89 6 6.191 6.212

F124 3-5 6.846 6.884 F173 3A 2.852 3.142 F219 2’ 7.104 7.304 F4 8 7.137 6.973 F89 6 6.191 6.212

F124 5’ 6.415 6.522 F173 3B 3.082 2.826 F219 2’’ 7.016 7.118 F40 2_6 7.962 8.013 F9 2’ 7.386 7.613

F124 6’ 7.97 7.838 F173 4’ 6.779 6.72 F219 3 4.528 4.736 F40 3 6.707 6.883 F9 3 6.738 6.831

F124 7 7.788 7.691 F173 5 7.856 7.915 F219 3M 3.872 3.963 F40 3_5 7.539 7.547 F9 4’ 6.992 6.879

F124 8 7.614 7.722 F173 5’ 7.218 7.184 F219 5’ 7.012 7.047 F40 4’ 7.562 7.546 F9 5 7.989 8.03

F125 2_6 8.234 8.233 F173 6 7.062 7.06 F219 5’’ 6.842 6.861 F40 8 6.599 6.476 F9 5’ 7.364 7.325

F125 3_5 7.516 7.546 F173 6’ 6.954 6.987 F219 6 5.924 5.977 F41 2’ 7.764 7.335 F9 6 6.949 6.896

F125 4’ 7.455 7.506 F173 7 7.555 7.547 F219 6’ 7.045 7.148 F41 5 7.978 7.975 F9 6’ 7.457 7.594

F125 5 8.014 7.961 F173 8 7.072 7.053 F219 6’’ 6.911 6.928 F41 5’ 6.893 6.891 F9 8 6.98 6.878

F125 6 6.931 6.898 F174 11 8.597 8.386 F219 8 5.893 5.996 F41 6 6.905 6.909 F90 10 6.692 6.684

F125 8 6.944 6.857 F174 12 7.759 7.587 F219 a 4.081 4.365 F41 6’ 7.662 7.337 F90 2’ 7.516 7.192

F126 3’ 6.987 6.967 F174 13 7.776 7.571 F219 b 4.929 5.233 F41 8 6.905 6.896 F90 4 7.608 7.804

F126 4’ 7.37 7.379 F174 14 8.027 7.815 F219 g1 3.711 3.866 F42 2’ 7.715 7.822 F90 5 6.695 6.608

F126 5’ 7.006 7.012 F174 2’M 4.005 3.937 F219 g2 3.496 3.927 F42 3M 3.833 3.817 F90 5’ 6.841 6.843

F126 6 6.2 6.206 F174 3 7.237 6.885 F22 2_6 8.254 8.223 F42 3’M 3.97 3.945 F90 6’ 7.236 7.045

F126 6’ 7.549 7.942 F174 3’ 7.255 7.138 F22 3_5 7.517 7.547 F42 4’M 3.978 3.921 F90 7 6.71 6.548

F126 8 6.344 6.423 F174 4’ 7.59 7.532 F22 4’ 7.464 7.505 F42 5’ 7.071 7.047 F91 2’ 7.516 7.484

F127 2 8.222 8.179 F174 5 8.053 8.268 F22 5 7.438 7.422 F42 6 6.357 6.448 F91 3 6.632 6.796

F127 2_6 7.487 7.561 F174 5’ 7.219 7.12 F22 7 7.264 7.253 F42 6’ 7.781 7.707 F91 5 7.519 7.574

F127 3_5 6.994 6.989 F174 6 7.868 7.77 F22 8 7.564 7.575 F42 7M 3.909 3.948 F91 5’ 6.915 6.928

F127 4’M 3.832 3.859 F174 6’ 8.085 7.893 F220 2’M 3.865 3.967 F42 8 6.554 6.548 F91 6 6.951 6.982

F127 5 7.576 7.604 F175 2_6 8.032 8.04 F220 3’ 7.168 7.145 F43 2 5.356 5.534 F91 6’ 7.514 7.487

F127 6M 3.935 3.934 F175 3 6.768 6.908 F220 4’ 7.51 7.534 F43 2_6 7.31 7.343 F92 1’’ 5.466 5.673

F127 7M 3.98 4.008 F175 3_5 7.579 7.541 F220 5 8.204 8.124 F43 3_5 6.814 6.818 F92 2’’ 3.462 3.635

F127 8 7.144 7.152 F175 4’ 7.599 7.562 F220 5’ 7.085 7.106 F43 3A 2.721 3.197 F92 2_6 7.523 7.349

F128 2_6 7.536 7.407 F175 6 6.296 6.182 F220 6 7.454 7.495 F43 3B 3.13 2.865 F92 3’’ 3.447 3.441

F128 35M 3.957 3.894 F175 8M 3.938 4.001 F220 6’ 7.528 7.823 F43 6 6.041 6.024 F92 35M 3.931 3.961

F128 4’M 3.908 3.899 F177 2_6 8.066 8.134 F220 7 7.744 7.739 F43 7M 3.797 3.879 F92 4’’ 3.299 3.188

F128 6 6.262 6.218 F177 3_5 7.078 7.116 F220 8 7.563 7.626 F43 8 6.03 6.048 F92 5’’ 3.248 3.399

F128 8 6.44 6.449 F177 3M 3.786 3.836 F221 1’’ 5.028 5.095 F44 2_6 7.373 6.996 F92 6 6.209 6.206

F129 1’’ 5.244 5.404 F177 4’M 3.888 3.866 F221 2’’ 3.509 3.465 F44 5 7.977 7.975 F92 6A 3.747 3.518

F129 2’’ 3.438 3.448 F177 6 6.207 6.164 F221 2_6 8.054 8.063 F44 6 6.906 6.908 F92 6B 3.565 3.461

F129 2_6 8.051 8.028 F177 8 6.411 6.403 F221 3 6.918 6.909 F44 8 6.896 6.914 F92 8 6.419 6.402

F129 3’’ 3.41 3.418 F178 2 5.531 5.492 F221 3’’ 3.504 3.477 F47 2_6 7.834 7.992 F93 1’’ 4.913 4.889

F129 3_5 6.884 6.957 F178 2_6 7.511 7.556 F221 3_5 7.577 7.569 F47 3 6.572 6.866 F93 2’ 7.762 7.646

F129 4’’ 3.306 3.268 F178 3_5 7.417 7.427 F221 4’ 7.598 7.576 F47 3_5 6.919 6.94 F93 2’’ 3.546 3.53

F129 5’’ 3.199 3.054 F178 3A 2.885 3.194 F221 4’’ 3.452 3.245 F47 8 6.573 6.49 F93 3’’ 3.505 3.442

F129 6 6.196 6.193 F178 3B 3.202 2.876 F221 5 7.778 7.563 F49 2’ 7.701 7.791 F93 4’’ 3.43 3.366

F129 6A 3.687 3.618 F178 4’ 7.369 7.364 F221 5’’ 3.533 3.222 F49 3M 3.787 3.786 F93 5’ 7.299 7.125

F129 6B 3.526 3.609 F178 6 6.486 6.464 F221 6A 3.926 3.619 F49 3’M 3.907 3.957 F93 5’’ 3.485 3.33

F129 8 6.389 6.396 F178 7 7.41 7.415 F221 6B 3.75 3.609 F49 4’M 3.911 3.941 F93 6 6.185 6.324

F130 1’’ 5.142 5.442 F178 8 6.511 6.54 F221 7 7.593 7.556 F49 5’ 7.077 7.049 F93 6’ 7.71 7.692

F130 2’ 7.613 7.652 F179 2 8.191 8.197 F221 8 7.723 7.625 F49 5M 3.903 3.965 F93 6A 3.929 3.75

F130 2’’ 3.474 3.324 F179 2_6 7.531 7.517 F222 2_6 8.102 8.009 F49 6 6.458 6.636 F93 6B 3.737 3.744

F130 3’’ 3.415 3.537 F179 3_5 7.423 7.462 F222 3 6.835 6.811 F49 6’ 7.726 7.676 F93 8 6.392 6.402

F130 4’’ 3.307 3.336 F179 4’ 7.371 7.496 F222 3_5 7.569 7.541 F49 7M 3.921 3.99 F94 2’ 7.522 7.832

F130 5’ 6.836 6.887 F179 5 8.065 7.993 F222 4’ 7.585 7.554 F49 8 6.692 6.62 F94 3 6.663 6.945

F130 5’’ 3.358 3.585 F179 6 6.95 6.949 F222 5 7.649 7.613 F5 2_6 8.027 8.03 F94 3’M 3.948 3.941

F130 6 6.209 6.259 F179 8 6.867 6.836 F222 6 7.205 7.082 F5 3 6.86 6.855 F94 4’M 3.918 3.931

F130 6’ 7.582 7.56 F18 1’’’ 5.288 5.145 F222 7M 4.024 3.934 F5 3_5 7.564 7.545 F94 5’ 7.111 7.017

F130 6A 4.175 4.048 F18 2’’’ 3.951 3.942 F223 11 8.704 8.408 F5 4’ 7.583 7.566 F94 5M 3.922 3.938

F130 6B 4.041 4.061 F18 3’’’ 3.615 3.756 F223 12 7.803 7.579 F5 5 7.436 7.449 F94 6’ 7.634 7.729

F130 8 6.403 6.45 F18 4’’’ 3.409 3.382 F223 13 7.795 7.573 F5 7 7.293 7.308 F94 6M 3.864 3.908

F131 1’’’ 4.614 4.424 F18 5’’’ 3.943 3.743 F223 14 8.047 7.817 F5 8 7.616 7.625 F94 7M 4.014 3.981

F131 2’’’ 3.744 3.358 F18 6’’’ 1.328 1.178 F223 2_6 8.14 7.948 F50 2_6 7.783 7.704 F94 8 7.142 6.848

F131 3’’’ 3.592 3.74 F18 1’’ 5.192 5.042 F223 3 6.983 6.863 F50 3’ 6.972 7 F95 2’ 7.593 7.612

F131 4’’’ 3.305 3.299 F18 2’’ 3.699 3.359 F223 3_5 7.174 7.128 F50 3_5 7.454 7.442 F95 3M 3.787 3.871

F131 5’’’ 3.513 3.787 F18 2_6 7.865 7.968 F223 4’M 3.925 3.846 F50 4 7.447 7.417 F95 4’M 3.937 3.892

F131 6’’’ 1.174 1.249 F18 3 6.64 7.049 F223 5 8.073 8.301 F50 4’ 7.526 7.54 F95 5’ 7.056 7.002

F131 1’’ 5.023 5.129 F18 3’’ 3.634 3.655 F223 6 7.888 7.771 F50 5’ 6.997 6.988 F95 6 6.195 6.199

F131 2’’ 3.23 3.635 F18 3_5 6.922 6.918 F224 11 8.036 7.929 F50 6’ 8.137 7.895 F95 6’ 7.628 7.68

F131 3’ 6.928 6.933 F18 4’’ 3.409 3.53 F224 12 7.662 7.573 F50 7 7.912 7.849 F95 8 6.389 6.404

F131 3’’ 3.339 3.408 F18 5’’ 3.543 3.58 F224 13 7.76 7.705 F50 8 7.912 7.739 F96 2_6 7.93 7.989

F131 4’ 7.344 7.361 F18 6 6.444 6.521 F224 14 9.969 8.777 F51 2_6 7.698 7.644 F96 3 6.606 6.825

F131 4’’ 3.203 3.444 F18 6A 3.927 3.799 F224 2’M 4.009 3.873 F51 26’ 8.085 8.084 F96 3_5 7.066 7.131

F131 5’ 6.94 7.066 F18 6B 3.713 3.804 F224 3 7.297 6.986 F51 3_5 6.987 6.996 F96 4’M 3.878 3.884

F131 5’’ 3.269 3.181 F18 8 6.764 6.652 F224 3’ 7.242 7.157 F51 35’ 7.05 7.058 F96 5M 3.916 3.973

F131 6 6.224 6.218 F180 2’ 7.773 7.584 F224 4’ 7.568 7.538 F51 4M 3.847 3.875 F96 6M 3.858 3.935

F131 6’ 7.547 7.988 F180 4’M 3.947 3.822 F224 5’ 7.179 7.1 F51 4’M 3.89 3.871 F96 7M 4 3.981

F131 6A 3.791 3.671 F180 5’ 7.045 7.001 F224 6’ 8.023 7.935 F51 7 7.737 7.848 F96 8 7.074 6.811

F131 6B 3.405 3.55 F180 6 6.318 6.489 F224 7 8.274 8.096 F51 8 7.633 7.646 F97 2’ 7.721 7.633

F131 8 6.332 6.43 F180 6’ 7.785 7.512 F224 8 7.752 7.612 F54 2_6 7.972 7.993 F97 4’M 3.926 3.81

F133 7’’’ 7.399 7.271 F180 7M 3.895 3.85 F226 2_6 7.906 7.98 F54 3 6.714 6.817 F97 5’ 7.04 6.974

F133 8’’’ 6.07 6.416 F180 8 6.576 6.574 F226 3 6.727 6.915 F54 3_5 7.1 7.14 F97 6 6.18 6.206

F133 1’’ 5.242 5.692 F184 3 7.354 6.883 F226 3_5 6.942 6.942 F54 4’M 3.893 3.879 F97 6’ 7.74 7.614

F133 2’’ 3.465 3.715 F184 3’ 7.667 6.957 F226 6 6.771 6.738 F54 5 7.974 8.046 F97 8 6.387 6.423

F133 2_6 7.989 7.981 F184 4’ 7.776 7.376 F226 7 7.602 7.668 F54 6 6.928 6.899 F98 1’’’ 5.283 5.226

F133 26 7.31 7.413 F184 5 7.871 8.032 F226 8 7.088 6.995 F54 8 6.97 6.894 F98 2’’’ 3.945 4.031

F133 3’’ 3.446 3.589 F184 5’ 7.46 7.021 F227 2’ 7.731 7.396 F55 1’’’ 4.724 4.779 F98 3’’’ 3.61 4.061

F133 3_5 6.82 6.925 F184 6 6.484 6.91 F227 5’ 6.882 6.899 F55 2’’’ 3.91 3.803 F98 4’’’ 3.405 3.511

F133 35 6.796 6.851 F184 6’ 8.119 7.891 F227 6 6.18 6.234 F55 3’’’ 3.721 3.865 F98 5’’’ 3.946 4.081

F133 4’’ 3.323 3.471 F184 8 6.438 6.899 F227 6’ 7.631 7.433 F55 4’’’ 3.354 3.498 F98 6’’’ 1.328 1.211

F133 5’’ 3.461 3.796 F185 3 7.307 6.875 F227 8 6.385 6.41 F55 5’’’ 3.672 3.944 F98 1’’ 5.182 5.155

F133 6 6.135 6.143 F185 3’ 6.981 6.961 F228 2_6 8.081 7.973 F55 6’’’ 1.176 1.194 F98 2’’ 3.691 3.843

F133 6A 4.294 4.179 F185 4’ 7.364 7.374 F228 3_5 6.902 6.92 F55 1’’ 5.081 4.845 F98 2_6 8.117 8.049

F133 6B 4.19 4.209 F185 5 7.431 7.484 F228 6 6.18 6.22 F55 2’’ 3.516 3.596 F98 3’’ 3.627 3.727

F133 8 6.316 6.353 F185 5’ 7.009 7.028 F228 8 6.392 6.402 F55 2_6 7.911 7.947 F98 3_5 6.906 6.901

F134 2’’ 7.985 7.777 F185 6’ 7.929 7.863 F229 2 8.156 8.169 F55 3 6.698 7.079 F98 4’’ 3.401 3.437

F134 2_6 8.179 8.193 F185 7 7.278 7.302 F229 2_6 7.466 7.405 F55 3’’ 3.693 3.645 F98 5’’ 3.536 3.468

F134 3’’ 7.341 6.569 F185 8 7.579 7.637 F229 3_5 6.983 7.008 F55 3_5 6.985 6.934 F98 6 6.42 6.469

F134 3_5 7.594 7.591 F187 3 7.379 6.903 F229 4’M 3.826 3.826 F55 4’’ 3.448 3.518 F98 6A 3.919 3.743

F134 3M 3.85 3.845 F187 3’ 6.995 6.956 F229 5 8.059 8.051 F55 5’’ 3.537 3.278 F98 6B 3.703 3.705

F134 4’ 7.577 7.551 F187 4’ 7.382 7.377 F229 6 6.942 6.954 F55 6 6.554 6.415 F98 8 6.737 6.654

F134 5 8.112 8.103 F187 5 8.152 8.167 F229 8 6.856 6.851 F55 6A 4.041 3.799 F99 1’’ 5.248 5.113

F134 6 7.653 6.95 F187 5’ 7.024 7.007 F23 2_6 8.235 8.226 F55 6B 3.686 3.99 F99 2’ 7.706 7.676

F137 1’’’ 5.093 4.836 F187 6 7.494 7.486 F23 3_5 7.517 7.542 F55 8 6.807 6.568 F99 2’’ 3.477 3.464

F137 2’’’ 3.838 3.329 F187 6’ 7.972 7.821 F23 4’ 7.456 7.508 F56 2_6 8.036 8.04 F99 3’’ 3.423 3.448

F137 3’’’ 3.389 3.335 F187 7 7.813 7.785 F23 5 8.016 7.97 F56 3 6.878 6.869 F99 4’’ 3.344 3.317

F137 4’’’ 3.114 3.336 F187 8 7.708 7.655 F23 6 6.933 6.902 F56 3_5 7.536 7.561 F99 5’ 6.866 6.898

F137 5’’’ 2.447 3.139 F19 2’ 7.827 7.365 F23 8 6.946 6.853 F56 4 7.249 7.282 F99 5’’ 3.218 3.36

F137 6’’’ 0.645 1.112 F19 5 8.161 8.172 F230 1’’’ 5.236 4.851 F56 4’ 7.616 7.651 F99 6 6.203 6.164

F137 1’’ 5.022 5.116 F19 5’ 6.914 6.901 F230 2’’’ 3.925 3.681 F56 5 6.886 6.908 F99 6’ 7.584 7.552

F137 2’’ 4.263 3.47 F19 6 7.443 7.51 F230 3’’’ 3.58 3.991 F56 6 7.664 7.548 F99 6A 3.708 3.619

F137 2_6 7.997 8.113 F19 6’ 7.734 7.349 F230 4’’’ 3.384 3.457 F56 7 8.106 7.986 F99 6B 3.572 3.57

F137 3 6.611 7.152 F19 7 7.754 7.772 F230 5’’’ 3.874 3.905 F56 8 7.799 7.723 F99 8 6.391 6.443

F137 3’’ 3.624 3.7 F19 8 7.662 7.642 F230 6’’’ 1.279 1.178 F57 2’ 7.401 7.601

F137 3_5 6.945 7.002 F190 3’ 7.002 6.95 F230 1’’ 5.109 5.346 F57 3 6.604 6.849

F137 4’’ 3.633 3.764 F190 4’ 7.383 7.376 F230 2 5.385 5.419 F57 5’ 6.908 6.923

Table 6S.2. List of experimental (OBS) and predicted (PRED) 1H coupling constants of the compounds in the Flavonoid Database, as extracted by PERCH. Fi is the identifier corresponding to each compound (i = 1, …, 250), and JHj,Hk is the coupling constant between the numbered protons Hj and Hk present in each compound; H2_6 = H2’/H6’; HjM = Hj belonging to a methoxy group attached to carbon j.
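For readers who want to process this table programmatically, the following minimal sketch shows one way to read a flattened row, assuming that each row holds consecutive groups of four whitespace-separated fields (Fi, JHj,Hk, OBS, PRED) and that proton labels contain no spaces. The names `Coupling`, `parse_row` and `abs_deviation` are hypothetical helpers introduced here for illustration only; they are not part of PERCH or of the original work.

```python
from typing import List, NamedTuple


class Coupling(NamedTuple):
    """One table entry: compound id, proton pair label, OBS and PRED values."""
    compound: str   # e.g. "F1"
    protons: str    # e.g. "2_6,3_5" (label copied verbatim from the table)
    obs: float      # experimental coupling constant
    pred: float     # predicted coupling constant


def parse_row(line: str) -> List[Coupling]:
    """Split one flattened table row into its (Fi, JHj,Hk, OBS, PRED) groups."""
    tokens = line.split()
    if len(tokens) % 4 != 0:
        # Assumption: a well-formed row is a whole number of four-field groups.
        raise ValueError(f"expected groups of 4 fields, got {len(tokens)} tokens")
    return [
        Coupling(tokens[i], tokens[i + 1], float(tokens[i + 2]), float(tokens[i + 3]))
        for i in range(0, len(tokens), 4)
    ]


def abs_deviation(entry: Coupling) -> float:
    """Absolute difference between the OBS and PRED coupling constants."""
    return abs(entry.obs - entry.pred)


# Example with one row taken from the table below.
row = "F1 2_6,3_5 7.77 7.66 F152 5’’,4’’ 1.04 0.63 F212 6’’,3’’ -0.01 0 F37 3_5,3_5 2.49 2.87"
for entry in parse_row(row):
    print(entry.compound, entry.protons, round(abs_deviation(entry), 2))
```

The labels are kept as plain strings because the table mixes several notations for equivalent protons (for example 2_6, 26 and 2-6), which this sketch does not attempt to normalise.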

Fi JHj,Hk OBS PRED Fi JHj,Hk OBS PRED Fi JHj,Hk OBS PRED Fi JHj,Hk OBS PRED

F1 2_6,2_6 1.97 1.98 F152 4’’,3’’ 3.43 3.21 F212 6,3 -0.16 0 F37 3_5,2_6 0.39 0.42

F1 2_6,3_5 7.77 7.66 F152 5’’,4’’ 1.04 0.63 F212 6’’,3’’ -0.01 0 F37 3_5,3_5 2.49 2.87

F1 2_6,4’ 1.18 1.26 F152 5’’,6A 6.47 5.33 F212 6,8 2.11 1.96 F37 3A,2 2.91 5.41

F1 3_5,2_6 0.55 0.54 F152 5’’,6B 5.67 3.22 F213 26,26 2.62 1.92 F37 3A,3B -13.91 -16.65

F1 3_5,3_5 1.33 1.48 F152 6,8 2.07 2.59 F213 26,35 8.29 8.26 F37 3B,2 13.25 5.19

F1 3_5,4’ 7.47 7.54 F152 6A,6B -11.32 -12.54 F213 26,7A -0.5 -0.79 F37 5,6 7.88 7.85

F1 3A,2 2.96 5.25 F156 2’’’,1’’’ 1.63 2.29 F213 26,7B -0.69 -0.79 F37 5,7 1.77 1.2

F1 3B,2 13.08 6.75 F156 3’’’,2’’’ 3.45 3.69 F213 3_5,3_5 2.01 1.96 F37 5,8 0.43 0.17

F1 3B,3A -16.88 -16.63 F156 3’’’,4’’’ 9.54 6.92 F213 35,26 0.38 0.39 F37 7,6 7.16 8.14

F1 4’,2 -1.19 -1.19 F156 5’’’,4’’’ 9.45 4.04 F213 35,35 2.34 2.47 F37 7,8 8.37 7.85

F1 5,6 7.87 7.85 F156 5’’’,6’’’ 6.23 6.25 F213 7A,7B -11.59 -16.11 F37 8,6 1.01 1.2

F1 5,7 1.77 1.2 F156 1’’,2’’ 7.88 6.01 F213 7A,8A 10.11 8.12 F38 2_6,2_6 2.1 1.98

F1 5,8 0.44 0.17 F156 2’,5’ 0.43 0.46 F213 7A,8B 5.83 5.79 F38 2_6,3_5 7.94 7.85

F1 7,6 7.19 8.14 F156 2’,6’ 2.07 1.87 F213 7B,8A 5.48 5.49 F38 2_6,4’ 1.18 1.32

F1 7,8 8.36 7.85 F156 3’’,2’’ 9.22 6.68 F213 7B,8B 10.2 8.09 F38 3,2_6 0.19 0

F1 8,6 1.06 1.2 F156 4’’,3’’ 8.96 3.09 F213 8A,8B -13.43 -16.88 F38 3,5 0 0

F10 2_6,2_6 2.76 2.32 F156 4’’,5’’ 9.87 1.66 F214 3A,2 1 5.47 F38 3_5,2_6 0.58 0.58

F10 2_6,3_5 8.63 8.44 F156 5’,6’ 8.45 8.45 F214 3B,2 3.5 6.1 F38 3_5,3_5 1.38 1.26

F10 3,2_6 0.02 0 F156 5’’,6A 1.69 5.71 F214 3B,3A -17.58 -16.39 F38 3_5,4’ 7.44 7.54

F10 3_5,2_6 0.35 0.46 F156 5’’,6B 6.49 3.46 F214 4’,2 -1.19 -1.18 F38 5,8 0.29 0.04

F10 3_5,3_5 2.41 2.65 F156 6,8 2.09 2.59 F214 4’,3’ 8.12 8.14 F38 6,3 -0.09 0

F10 5,8 0.04 0.04 F156 6A,6B -11.42 -12.51 F214 5’,3’ 1.11 1.13 F38 6,5 8.89 8.45

F10 6,5 8.77 8.45 F157 1’’’,2’’’ 1.65 2.47 F214 5’,4’ 7.37 7.55 F38 6,8 2.38 2.59

F10 6,8 2.26 2.59 F157 2’’’,3’’’ 3.44 3.7 F214 5,8 0.53 0.39 F39 2_6,2_6 2.01 1.98

F100 1””,2”” 1.68 2.16 F157 3’’’,4’’’ 9.55 7.46 F214 6’,2 -1.36 -0.79 F39 2_6,3_5 7.93 7.85

F100 1’’,2’’ 1.82 2.67 F157 4’’’,5’’’ 9.48 4.18 F214 6’,3’ 0.46 0.52 F39 2_6,4’ 1.2 1.32

F100 1’’’,2’’’ 7.8 7.11 F157 5’’’,6’’’ 6.25 6.25 F214 6’,4’ 1.67 1.61 F39 3,2_6 0 0

F100 2””,3”” 3.41 3.38 F157 1’’,2’’ 7.87 7.41 F214 6,5 7.89 8.1 F39 3,5 0 0

F100 2’’,3’’ 3.48 3.28 F157 2’,5’ 0.18 0.46 F214 6’,5’ 7.69 7.67 F39 3_5,2_6 0.6 0.58

F100 2’’’,3’’’ 9.75 7.31 F157 2’,6’ 2.18 1.87 F214 6,8 1.04 1.1 F39 3_5,3_5 1.36 1.26

F100 2_6,2_6 2.54 2.32 F157 3’’,2’’ 9.24 7.17 F214 7,5 1.74 1.87 F39 3_5,4’ 7.48 7.54

F100 2_6,3_5 8.76 8.44 F157 4’’,3’’ 8.94 7.46 F214 7,6 7.19 8.05 F39 6,3 -0.02 0

F100 3””,4”” 9.51 8.17 F157 5’’,4’’ 9.86 9.07 F214 7,8 8.35 8.39 F39 6,5 9 8.45

F100 3’’,4’’ 9.5 7.37 F157 5’,6’ 8.47 8.45 F215 2’’,1’’ 7.72 7.27 F4 2_6,2_6 1.99 1.98

F100 3’’’,4’’’ 3.56 3.34 F157 5’’,6A 1.74 4.53 F215 2_6,2_6 1.98 1.42 F4 2_6,3_5 7.94 7.85

F100 3_5,2_6 0.33 0.46 F157 5’’,6B 6.34 3.4 F215 3’’,2’’ 9.62 7.6 F4 2_6,4’ 1.22 1.32

F100 3_5,3_5 2.51 2.65 F157 6,8 2.09 1.96 F215 4’’,3’’ 3.34 3.21 F4 3_5,2_6 0.63 0.58

F100 4””,5”” 9.57 9.2 F157 6B,6A -11.15 -12.33 F215 5’’,4’’ 2.48 0.7 F4 3_5,3_5 1.4 1.26

F100 4’’,5’’ 9.46 8.39 F158 2_6,2_6 2.65 2.32 F215 5’’,6A 4.76 4.59 F4 3_5,4’ 7.47 7.54

F100 4’’’,5’’’ 1.12 0.69 F158 2_6,3_5 8.42 8.26 F215 5’’,6B 7.38 5.64 F4 7,6 8.25 8.14

F100 5””,6”” 5.98 6.25 F158 3_5,2_6 0.36 0.42 F215 6,4 0.63 0 F4 7,8 8.4 7.85

F100 5’’,6’’ 6.17 6.25 F158 3_5,3_5 2.39 2.87 F215 6,8 1.51 1.96 F4 8,6 0.87 1.2

F100 5’’’,6A 5.51 4.8 F158 3A,2 3.01 4.52 F215 6A,6B -11.62 -12.49 F40 2_6,2_6 2.07 1.98

F100 5’’’,6B 6.91 3.99 F158 3B,2 12.96 8 F215 8,4 0.57 0.75 F40 2_6,3_5 7.94 7.85

F100 6A,6B -10.41 -12.56 F158 3B,3A -17.1 -15.79 F216 2’’’,1’’’ 1.72 2.43 F40 2_6,4’ 1.19 1.32

F100 8,6 2.16 2.59 F158 6,8 2.16 1.96 F216 3’’’,2’’’ 3.32 3.29 F40 3,2_6 0 0

F101 2_6,2_6 1.96 1.42 F159 2’’,1’’ 7.75 7.5 F216 4’’’,3’’’ 9.56 7.94 F40 3_5,2_6 0.58 0.58

F101 3,2_6 0.41 0 F159 2_6,2_6 2.05 1.98 F216 5’’’,4’’’ 9.59 9.11 F40 3_5,3_5 1.38 1.26

F102 2,2_6 -0.71 -0.67 F159 2_6,3_5 7.95 7.85 F216 6’’’,5’’’ 6.23 6.25 F40 3_5,4’ 7.46 7.54

F102 2_6,2_6 2.03 1.42 F159 2_6,4’ 1.16 1.32 F216 1’’,2’’ 7.8 7.09 F41 2’,5’ 0.27 0.46

F102 3A,2 3.06 3.12 F159 3’’,2’’ 9.34 7.52 F216 2’’,3’’ 9.2 7.41 F41 2’,6’ 2.17 1.97

F102 3A,3B -17.12 -16.75 F159 3,2_6 0 0 F216 2,7 -0.36 -0.79 F41 5,6 8.82 8.45

F102 3B,2 12.8 10.28 F159 3_5,2_6 0.6 0.58 F216 3’’,4’’ 9.13 7.55 F41 5’,6’ 8.47 8.45

F102 6,8 2.15 2.59 F159 3_5,3_5 1.47 1.26 F216 3_5,3_5 2 1.96 F41 5,8 0.34 0.04

F103 1’’,2’’ 7.91 7.72 F159 3_5,4’ 7.49 7.54 F216 5,2 0.47 0.42 F41 6,8 2.25 2.59

F103 2’’,3’’ 9.44 7.47 F159 4’’,3’’ 9.1 7.28 F216 5’’,4’’ 9.8 8.76 F42 2’,5’ 0.37 0.46

F103 2’,5’ 0.33 0.46 F159 4’’,5’’ 9.83 8.69 F216 5’’,6A 2.29 4.94 F42 2’,6’ 2.11 1.87

F103 2’,6’ 2.15 1.87 F161 2A,2B -13.31 -14.13 F216 5’’,6B 5.55 4.71 F42 5’,6’ 8.59 8.45

F103 3’’,4’’ 8.95 7.03 F161 3,2A 4.39 3.52 F216 6,2 2.12 1.97 F42 6,8 2.21 2.59

F103 4’’,5’’ 9.7 8.62 F161 3,2B 9.51 2.57 F216 6,5 8.19 8.27 F43 2_6,2 -0.55 -0.77

F103 5’,6’ 8.49 8.45 F161 4,2B 0 2.46 F216 6,7 -0.56 -0.79 F43 2_6,2_6 2.52 2.32

F104 1’’,2’’ 7.82 7.38 F161 4,3 8.47 3.71 F216 6B,6A -12.14 -12.6 F43 2_6,3_5 8.42 8.26

F104 2’,10 -0.02 -0.72 F161 4,6A 2.46 2.46 F216 7,8 0 7.15 F43 3_5,2_6 0.38 0.42

F104 2’’,3’’ 9.4 7.42 F161 5’,2’ 0.22 0.46 F217 2_6,2_6 1.94 1.98 F43 3_5,3_5 2.52 2.87

F104 2’,5’ 0.46 0.46 F161 5,4 3.18 2.88 F217 2_6,3_5 8.02 7.85 F43 3A,2 3.02 4.09

F104 4’’,3’’ 8.99 7.51 F161 5,6A 3.34 4.8 F217 2_6,4’ 1.19 1.32 F43 3A,3B -17.14 -16.19

F104 4’’,5’’ 9.75 8.91 F161 5,6B 5.37 10.27 F217 3_5,2_6 0.57 0.58 F43 3B,2 12.96 9.05

F104 5,4 8.5 8.45 F161 6’,2’ 2.07 1.97 F217 3_5,3_5 1.39 1.26 F43 6,8 2.31 2.59

F104 5’’,6B 5.61 5.02 F161 6’,5’ 8.17 8.45 F217 3_5,4’ 7.46 7.54 F44 2_6,2_6 2.01 1.63

F104 6’,10 -0.54 -0.73 F161 6A,2B 2.1 2.1 F217 5,8 0.45 0.39 F44 5,8 0.32 0.04

F104 6’,2’ 2.06 1.97 F161 6A,6B -14.09 -13.99 F217 7,5 3.07 2.87 F44 6,5 8.84 8.45

F104 6’,5’ 8.26 8.45 F161 6B,2A 2.23 -0.19 F217 7,8 9.2 8.75 F44 6,8 2.26 2.59

F104 6A,5’’ 2.31 3.21 F161 7’,2’ -0.45 -0.78 F218 2’’,1’’ 7.81 7.26 F47 2_6,2_6 2.39 2.32

F104 6A,6B -12.09 -12.84 F161 7’,6’ -0.52 -0.79 F218 2_6,2_6 1.97 1.42 F47 2_6,3_5 8.66 8.44

F105 2,2’ -0.47 -0.76 F161 8’,7’ 15.89 17.14 F218 3’’,2’’ 9.24 7.28 F47 3,2_6 0.1 0

F105 2’,5’ 0.49 0.42 F162 2_6,2_6 1.92 1.98 F218 4’’,3’’ 8.97 7.64 F47 3_5,2_6 0.33 0.46

F105 2,6’ -0.55 -0.77 F162 2_6,3_5 8.05 7.85 F218 4’’,5’’ 9.84 9.07 F47 3_5,3_5 2.67 2.65

F105 2’,6’ 2.07 1.97 F162 2_6,4’ 1.17 1.32 F218 4’’,6A -0.46 -0.4 F49 2’,5’ 0.44 0.46

F105 3,2 11.77 7.87 F162 3_5,2_6 0.55 0.58 F218 4’’,6B -0.4 -0.4 F49 2’,6’ 2.12 1.87

F105 5,6 8.71 8.45 F162 3_5,3_5 1.51 1.26 F218 5’’,6A 2.27 3.64 F49 5’,6’ 8.59 8.45

F105 5’,6’ 8.1 8.27 F162 3_5,4’ 7.44 7.54 F218 5’’,6B 6.21 4.19 F49 6,8 2.24 2.59

F105 5,8 0.46 0.04 F162 5,6 8.95 8.45 F218 6,4 0.73 -0.01 F5 2_6,2_6 2.05 1.98

F105 6,8 2.21 2.59 F162 5,8 0.05 0.04 F218 6,8 1.28 1.96 F5 2_6,3_5 7.95 7.85

F106 26,26 2.41 1.92 F162 6,8 2.34 2.59 F218 6A,6B -12.12 -12.48 F5 2_6,4’ 1.19 1.32

F106 26,35 0.36 0.39 F166 2_6,2_6 2.52 2.32 F218 8,4 0.81 0.75 F5 3,2_6 0.15 0

F106 26,7 -0.69 -0.73 F166 2_6,3_5 8.66 8.44 F219 2’,2 -2.01 -0.63 F5 3,5 0.18 0

F106 35,26 8.47 8.44 F166 3,2_6 0.38 0 F219 2’,5’ 2.52 0.42 F5 3_5,2_6 0.58 0.58

F106 35,35 2.34 2.26 F166 3_5,2_6 0.34 0.46 F219 2’,6’ 3.06 1.87 F5 3_5,3_5 1.33 1.26

F106 5’,3’ 2.36 1.96 F166 3_5,3_5 2.44 2.65 F219 2’’,a -0.86 0 F5 3_5,4’ 7.47 7.54

F106 7,8 15.51 17.13 F166 6,3 -0.44 0 F219 3,2 11.51 9.13 F5 5,7 3.01 2.59

F107 2_6,2_6 2.51 2.32 F166 6,8 2.11 2.59 F219 5’’,2’’ 0.47 0.42 F5 5,8 0.38 0.04

F107 2_6,3_5 8.76 8.44 F17 1’’,2’’ 1.68 2.06 F219 5’,6’ 9.94 8.27 F5 7,8 9.02 8.45

F107 3,2_6 0.44 0 F17 2’’,3’’ 3.34 3.22 F219 6’,2 -2.09 -0.65 F50 2_6,2_6 2.14 1.78

F107 3,5 0.02 0 F17 2’,5’ 0.46 0.46 F219 6’’,2’’ 1.91 1.97 F50 2_6,3_5 0.63 0.39

F107 3,6 -0.04 0 F17 2’,6’ 2.13 1.87 F219 6’’,5’’ 8.1 8.27 F50 2_6,7 -0.92 -0.65

F107 3_5,2_6 0.42 0.46 F17 3’’,4’’ 9.78 8.07 F219 6,8 2 1.96 F50 3’,4’ 8.35 8.14

F107 3_5,3_5 2.54 2.65 F17 5’’,4’’ 9.7 9.04 F219 6’’,b -0.75 -0.57 F50 3’,5’ 1.16 0.9

F107 5,6 7.92 7.85 F17 5’’,6’’ 6.24 6.25 F219 b,a 8.06 8.74 F50 3’,6’ 0.36 0.52

F107 5,7 1.65 1.2 F17 6’,5’ 8.32 8.45 F219 g1,a 2.2 2.92 F50 3_5,2_6 7.68 8.09

F107 5,8 0.61 0.17 F17 6,8 2.05 1.96 F219 g1,b -0.05 -0.4 F50 3_5,3_5 1.21 1.25

F107 6,7 7.19 8.14 F170 2_6,2 -0.61 -0.79 F219 g1,g2 -12.33 -12.57 F50 4,2_6 1.35 1.73

F107 6,8 0.98 1.2 F170 2_6,2_6 1.94 1.98 F219 g2,a 4.37 3.64 F50 4,3_5 7.43 8.05

F107 7,8 8.43 7.85 F170 2_6,3_5 7.77 7.66 F22 2_6,2_6 1.98 1.98 F50 4’,5’ 7.18 7.55

F108 2_6,2_6 2.32 2.32 F170 2_6,4’ 1.2 1.26 F22 2_6,3_5 7.97 7.85 F50 4’,6’ 1.61 1.68

F108 2_6,3_5 8.44 8.44 F170 3_5,2_6 0.55 0.54 F22 2_6,4’ 1.19 1.32 F50 4,7 -0.97 -0.94

F108 3_5,2_6 0.46 0.46 F170 3_5,3_5 1.33 1.48 F22 3_5,2_6 0.63 0.58 F50 5’,6’ 8.05 7.85

F108 3_5,3_5 2.65 2.65 F170 3_5,4’ 7.48 7.54 F22 3_5,3_5 1.42 1.26 F50 8,7 13.36 17.1

F108 6,8 2.59 2.59 F170 3A,2 3.04 4.85 F22 3_5,4’ 7.41 7.54 F51 2_6,2_6 2.35 1.92

F109 1’’,2’’ 8 7.16 F170 3A,3B -16.9 -16.78 F22 5,7 2.98 2.59 F51 2_6,3_5 0.35 0.39

F109 2,2_6 -0.6 -0.77 F170 3B,2 12.94 7.46 F22 5,8 0.38 0.04 F51 2_6,7 -0.61 -0.56

F109 2_6,2_6 2.46 2.32 F170 4’,2 -1.16 -1.19 F22 7,8 9.08 8.45 F51 26’,26’ 2.73 2.32

F109 2_6,3_5 8.44 8.26 F170 5,6 8.72 8.45 F220 4’,3’ 8.45 8.14 F51 26’,35’ 8.73 8.44

F109 3’’,2’’ 9.53 7.34 F170 5,8 0.29 0.04 F220 5’,3’ 0.98 0.9 F51 3_5,2_6 8.61 8.44

F109 3_5,2_6 0.34 0.42 F170 6,8 2.26 2.59 F220 5’,4’ 7.48 7.55 F51 3_5,3_5 2.67 2.26

F109 3_5,3_5 2.45 2.87 F172 1’’,2’’ 7.66 7.09 F220 5,8 0.51 0.39 F51 35’,26’ 0.31 0.46

F109 3A,2 3.15 3.3 F172 2_6,2_6 2.05 1.98 F220 6’,3’ 0.39 0.52 F51 35’,35’ 2.3 2.65

F109 3A,3B -17.21 -16.28 F172 2_6,3_5 7.91 7.85 F220 6’,4’ 1.74 1.68 F51 7,8 15.54 17.14

F109 3B,2 12.9 9.82 F172 2_6,4’ 1.19 1.32 F220 6,5 8.09 8.1 F54 2_6,2_6 2.42 2.32

F109 4’’,3’’ 8.9 7.69 F172 3’’,2’’ 9.25 6.81 F220 6’,5’ 7.53 7.85 F54 2_6,3_5 8.75 8.44

F109 5’’,4’’ 9.8 9.04 F172 3,2_6 0 0 F220 6,8 0.99 1.1 F54 3,2_6 0.01 0

F109 5’’,6A 2 4.46 F172 3,5 0.01 0 F220 7,5 1.66 1.87 F54 3,5 0 0

F109 5’’,6B 4.97 7.9 F172 3,6 -0.09 0 F220 7,6 7.07 8.05 F54 3,6 -0.01 0

F109 6,8 2.26 2.59 F172 3_5,2_6 0.61 0.58 F220 7,8 8.53 8.39 F54 3_5,2_6 0.39 0.46

F109 6A,6B -12.14 -13.31 F172 3_5,3_5 1.33 1.26 F221 1’’,2’’ 7.65 7.4 F54 3_5,3_5 2.74 2.65

F11 2_6,2_6 2.06 1.98 F172 3_5,4’ 7.44 7.54 F221 2’’,3’’ 9.16 7.43 F54 5,6 8.78 8.45

F11 2_6,3_5 7.92 7.85 F172 4’’,3’’ 9 6.32 F221 2_6,2_6 1.77 1.98 F54 5,8 0.35 0.04

F11 2_6,4’ 1.2 1.32 F172 5’’,4’’ 9.71 7.8 F221 2_6,3_5 8.06 7.85 F54 6,8 2.25 2.59

F11 3,2_6 0.21 0 F172 5’’,6A 2.31 4.42 F221 2_6,4’ 1.12 1.32 F55 2’’’,1’’’ 1.59 2.26

F11 3_5,2_6 0.58 0.58 F172 5’’,6B 6.1 5.88 F221 3,2_6 0 0 F55 3’’’,2’’’ 3.19 3.47

F11 3_5,3_5 1.41 1.26 F172 6,5 8.85 8.45 F221 3’’,4’’ 8.95 7.33 F55 3’’’,4’’’ 9.5 7.88

F11 3_5,4’ 7.43 7.54 F172 6,8 2.31 2.59 F221 3_5,2_6 0.58 0.58 F55 4’’’,5’’’ 9 8.96

F11 5,7 1.68 1.2 F172 6A,6B -12.17 -13.3 F221 3_5,3_5 1.11 1.26 F55 6’’’,5’’’ 6.24 6.25

F11 5,8 0.17 0.17 F172 8,5 0.31 0.04 F221 3_5,4’ 7.58 7.54 F55 2’’,1’’ 7.13 7.53

F11 6,5 7.95 7.85 F173 2’,2 -0.63 -0.78 F221 4’’,5’’ 9.88 8.76 F55 2’’,3’’ 8.8 7.41

F11 6,7 7.13 8.14 F173 2’,4’ 2.49 2.65 F221 5,3 0 0 F55 2_6,2_6 2.53 2.32

F11 6,8 0.98 1.2 F173 2’,5’ 0.41 0.42 F221 5’’,6A 2.39 5.9 F55 2_6,3_5 8.68 8.44

F11 7,8 8.45 7.85 F173 2’,6’ 1.66 1.63 F221 5’’,6B 5.39 4.36 F55 3,2_6 0.25 0

F110 2’’’,1’’’ 1.63 2.51 F173 3A,2 3.01 5 F221 5,8 0.43 0.39 F55 3,6 -0.08 0

F110 3’’’,2’’’ 3.4 3.29 F173 3A,3B -16.9 -16.75 F221 6A,6B -12.16 -13.3 F55 3_5,2_6 0.36 0.46

F110 4’’’,3’’’ 9.54 7.92 F173 3B,2 12.9 5.69 F221 7,5 3 2.87 F55 3_5,3_5 2.53 2.65

F110 5’’’,4’’’ 9.59 8.92 F173 4’,2 -1.18 -1.17 F221 7,8 9.12 8.75 F55 4’’,3’’ 9.21 7.21

F110 6’’’,5’’’ 6.57 6.25 F173 4’,5’ 8.13 8.14 F222 2_6,2_6 2.02 1.98 F55 4’’,5’’ 9.7 8.77

F110 1’’,2’’ 7.23 6.98 F173 4’,6’ 0.95 0.9 F222 2_6,3_5 8.03 7.85 F55 5’’,6A 0.41 5.46

F110 2’’,3’’ 8.6 7.08 F173 5,6 7.89 7.85 F222 2_6,4’ 1.23 1.32 F55 5’’,6B 6.55 6.63

F110 2_6,2_6 2.41 2.32 F173 5’,6’ 7.63 7.67 F222 3,2_6 0 0 F55 6,8 2.18 2.59

F110 2_6,3_5 8.69 8.44 F173 5,7 1.76 1.2 F222 3_5,2_6 0.45 0.58 F55 6A,6B -11 -12.38

F110 3’’,4’’ 8.36 7.58 F173 5,8 0.48 0.17 F222 3_5,3_5 1.07 1.26 F56 2_6,2_6 1.89 1.98

F110 3_5,2_6 0.42 0.46 F173 6’,2 -0.6 -0.79 F222 3_5,4’ 7.5 7.54 F56 2_6,3_5 7.84 7.85

F110 3_5,3_5 2.51 2.65 F173 6,7 7.17 8.14 F222 5,3 0 0 F56 2_6,4’ 1.26 1.32

F110 5’’,4’’ 9.88 9.03 F173 6,8 1.04 1.2 F222 6,3 -0.34 0 F56 3,4 8.21 8.39

F110 5’’,6A 1.77 2.24 F173 7,8 8.36 7.85 F222 6,5 8.88 8.45 F56 3_5,2_6 0.59 0.58

F110 5’’,6B 6 8.67 F174 11,12 8.33 7.85 F223 12,11 8.64 7.85 F56 3_5,3_5 1.28 1.26

F110 6,8 2.11 2.59 F174 11,13 1.24 1.2 F223 13,11 1.2 1.2 F56 3_5,4’ 7.43 7.54

F110 6B,6A -11.23 -12.23 F174 11,14 0.73 0.17 F223 13,12 6.23 8.14 F56 4,7 -0.93 -1.23

F112 2’,5’ 0.35 0.46 F174 12,13 6.93 8.14 F223 14,11 1.22 0.17 F56 5,3 0.97 1.1

F112 3,2’ 0 0 F174 12,14 1.17 1.2 F223 14,12 1.2 1.2 F56 5,4 7.22 8.05

F112 3,6’ 0.05 0 F174 13,14 8.14 7.85 F223 14,13 8.49 7.85 F56 5,6 7.86 8.1

F112 6’,2’ 2.28 1.97 F174 3’,4’ 8.41 8.14 F223 2_6,2_6 2.42 2.32 F56 6,3 0.39 0.39

F112 6’,5’ 8.52 8.45 F174 3’,5’ 1.01 0.9 F223 2_6,3_5 8.76 8.44 F56 6,4 1.63 1.87

F113 2_6,2_6 1.99 1.98 F174 3,6’ 0.01 0 F223 3_5,2_6 0.3 0.46 F56 6,7 -0.64 -0.78

F113 2_6,3_5 7.96 0.58 F174 3’,6’ 0.45 0.52 F223 3_5,3_5 2.77 2.65 F56 7,8 15.79 17.13

F113 2_6,4’ 1.18 1.32 F174 4’,5’ 7.35 7.55 F223 5,11 -0.3 -0.3 F57 3,2’ 0.01 0

F113 3,2_6 0.22 0 F174 4’,6’ 1.71 1.68 F223 5,13 0.11 0.11 F57 3,6’ 0.01 0

F113 3_5,2_6 0.53 7.85 F174 5,11 -0.3 -0.3 F223 5,14 -0.15 -0.15 F57 5’,2’ 0 0.46

F113 3_5,3_5 1.36 1.26 F174 5,13 0 0.11 F223 6,11 0.82 0.85 F57 6’,2’ 2.26 1.97

F113 3_5,4’ 7.43 7.54 F174 5,14 0.01 -0.15 F223 6,12 -0.3 -0.3 F57 6’,5’ 8.39 8.45

F113 5,8 0.04 0.04 F174 5,3 0 0 F223 6,13 -0.15 -0.15 F58 2’,5’ 0.42 0.46

F114 2_6,2_6 2.29 2.32 F174 5’,6’ 7.77 7.85 F223 6,14 -0.44 -0.44 F58 2’,6’ 2.18 1.97

F114 2_6,3_5 8.84 8.44 F174 6,11 0.82 0.85 F223 6,3 -0.01 0 F58 3,2’ 0.01 0

F114 3_5,2_6 0.34 0.46 F174 6,12 -0.3 -0.3 F223 6,5 8.71 8.14 F58 3,5 0 0

F114 3_5,3_5 2.82 2.65 F174 6,13 0.04 -0.15 F224 12,11 6.97 7.85 F58 3,6 0 0

F114 8,6 2.22 2.59 F174 6,14 0.48 -0.44 F224 13,11 1.52 1.2 F58 3,6’ 0 0

F115 2_6,2_6 1.98 1.63 F174 6,5 8.72 8.14 F224 13,12 8.26 8.14 F58 5,6 8.89 8.45

F115 3,2 11.65 7.76 F175 2_6,2_6 1.59 1.98 F224 14,11 0.02 0.17 F58 5’,6’ 8.51 8.45

F115 5,8 0.02 0.04 F175 2_6,3_5 7.93 7.85 F224 14,12 1.14 1.2 F58 5,8 0.01 0.04

F115 6,5 8.71 8.45 F175 2_6,4’ 1.18 1.32 F224 14,13 7.12 7.85 F58 6,8 2.38 2.59

F115 6,8 2.23 2.59 F175 3,2_6 0.01 0 F224 3,6’ 0.01 0 F59 2’’’,1’’’ 1.53 2.4

F116 2’,5’ 0.55 0.46 F175 3_5,2_6 0.59 0.58 F224 4’,3’ 8.42 8.14 F59 3’’’,2’’’ 3.34 3.37

F116 3,2’ 0.04 0 F175 3_5,3_5 0.72 1.26 F224 5’,3’ 1.01 0.9 F59 4’’’,3’’’ 9.55 7.91

F116 3,6’ 0.28 0 F175 3_5,4’ 7.48 7.54 F224 5’,4’ 7.48 7.55 F59 5’’’,4’’’ 9.44 9.1

F116 6’,2’ 2.3 1.97 F175 6,3 0 0 F224 6’,3’ 0.51 0.52 F59 6’’’,5’’’ 6.13 6.25

F116 6’,5’ 8.53 8.45 F177 2_6,2_6 0.96 2.32 F224 6’,4’ 1.62 1.68 F59 1’’,2’’ 7.27 6.65

F117 1’’,2’’ 6.97 7.29 F177 2_6,3_5 8.08 8.44 F224 6’,5’ 7.39 7.85 F59 2’’,3’’ 9.07 6.87

F117 2,2’ -0.66 -0.74 F177 3_5,2_6 1.03 0.46 F224 7,11 0.49 -0.44 F59 2,3A 3.5 2.39

F117 2’’,3’’ 9.47 7.42 F177 3_5,3_5 1.89 2.65 F224 7,12 0.03 -0.15 F59 2,3B 12.67 10.87

F117 2’,5’ 2.6 0.42 F177 6,8 2.09 1.96 F224 7,13 -0.3 -0.3 F59 2’,5’ 0.42 0.42

F117 2,6’ -3.1 -0.75 F178 2_6,2 -0.68 -0.79 F224 7,14 0.74 0.85 F59 2,6’ -0.23 -0.78

F117 2’,6’ 1.98 1.97 F178 2_6,2_6 1.93 1.98 F224 7,8 9.06 8.45 F59 2’,6’ 1.5 1.87

F117 3’’,4’’ 9 7.5 F178 2_6,3_5 7.79 7.66 F224 8,11 0.01 -0.15 F59 3’’,4’’ 9.07 7.74

F117 3A,2 16.93 2.14 F178 2_6,4’ 1.22 1.26 F224 8,12 0 0.11 F59 3A,3B -17.24 -16.73

F117 3A,3B -11.8 -16.88 F178 3_5,2_6 0.55 0.54 F224 8,14 -0.3 -0.3 F59 5’’,4’’ 9.28 9.14

F117 3B,2 17.2 10.77 F178 3_5,3_5 1.34 1.48 F226 2_6,2_6 2.42 2.32 F59 5’,6’ 3.31 8.27

F117 5’’,4’’ 9 8.79 F178 3_5,4’ 7.49 7.54 F226 2_6,3 0.22 0 F59 5’’,6A 2.3 5.14

F117 5,6 8.92 8.45 F178 3A,2 3.54 5.44 F226 2_6,3_5 8.66 8.44 F59 5’’,6B 5.07 5.15

F117 5’’,6A 2.33 4.52 F178 3B,2 12.68 6.46 F226 3_5,2_6 0.36 0.46 F59 6,8 2.24 2.59

F117 5’’,6B 5.71 5 F178 3B,3A -17.16 -15.79 F226 3_5,3_5 2.67 2.65 F59 6B,6A -12.14 -12.44

F117 6’,5’ 8.13 8.27 F178 4’,2 -1.16 -1.19 F226 6,3 -0.03 0 F6 2_6,2_6 2.95 1.98

F117 6B,6A -12.11 -12.73 F178 6,8 0.93 0.95 F226 6,8 0.86 0.95 F6 2_6,3_5 7.93 7.85

F118 2’,2 -0.58 -0.79 F178 7,6 8.29 8.39 F226 7,6 8.23 8.39 F6 2_6,4’ 1.27 1.32

F118 2’,5’ 0.41 0.42 F178 7,8 8.29 8.39 F226 7,8 8.4 8.39 F6 3,2_6 0.2 0

F118 2’,6’ 2.02 1.87 F179 2_6,2 -0.14 0 F227 5’,2’ 0.32 0.46 F6 3_5,2_6 0.48 0.58

F118 3A,2 3 3.62 F179 2_6,2_6 1.82 1.78 F227 6’,2’ 2.17 1.97 F6 3_5,3_5 0.5 1.26

F118 3A,3B -17.12 -16.74 F179 2_6,3_5 7.73 8.09 F227 6’,5’ 8.48 8.45 F6 3_5,4’ 7.38 7.54

F118 3B,2 12.92 9.92 F179 2_6,4’ 1.2 1.73 F227 8,6 2.08 1.96 F6 5,8 0.04 0.04

F118 6’,2 -0.59 -0.79 F179 3_5,2_6 0.6 0.39 F228 2_6,2_6 2.57 2.32 F6 6,5 8.79 8.45

F118 6’,5’ 8.11 8.27 F179 3_5,3_5 1.37 1.25 F228 2_6,3_5 8.74 8.44 F6 6,8 2.25 2.59

F118 6,8 2.17 2.59 F179 3_5,4’ 7.48 8.05 F228 3_5,2_6 0.39 0.46 F60 2’’,1’’ 7.39 7.39

F119 2_6,2_6 1.83 1.98 F179 5,6 8.81 8.45 F228 3_5,3_5 2.51 2.65 F60 3’’,2’’ 7.39 7.46

F119 2_6,3_5 7.88 7.85 F179 5,8 0.35 0.04 F228 8,6 2.08 2.59 F60 4’’,3’’ 7.48 7.61

F119 2_6,4’ 1.25 1.32 F179 6,8 2.27 2.59 F229 2,2_6 0.25 0 F60 5,2 0.39 0.39

F119 26,26 2.39 1.92 F18 1’’’,2’’’ 1.77 2.37 F229 2_6,2_6 2.72 2.32 F60 5’’,4’’ 8.92 9.01

F119 26,35 0.34 0.39 F18 2’’’,3’’’ 3.34 3.44 F229 2_6,3_5 8.55 8.44 F60 5,6 8.45 8.45

F119 26,7 -0.59 -0.51 F18 3’’’,4’’’ 9.56 7.81 F229 3_5,2_6 0.41 0.46 F60 5’’,6A 4.29 5.71

F119 3_5,2_6 0.51 0.58 F18 4’’’,5’’’ 9.58 9.06 F229 3_5,3_5 2.41 2.65 F60 5’’,6B 5.1 4.44

F119 3_5,3_5 1.15 1.26 F18 5’’’,6’’’ 6.22 6.25 F229 5,8 0.36 0.39 F60 6,2 1.77 1.77

F119 3_5,4’ 7.43 7.54 F18 1’’,2’’ 7.75 7 F229 6,5 8.82 8.45 F60 6’,5’ 8.45 8.45

F119 35,26 8.49 8.44 F18 2’’,3’’ 9.2 6.63 F229 6,8 2.28 2.11 F60 6A,6B -12.37 -12.36

F119 35,35 2.5 2.26 F18 2_6,2_6 2.69 2.32 F23 2_6,2_6 1.97 1.98 F60 7,8 17.13 17.13

F119 7,8 15.56 17.12 F18 2_6,3_5 8.7 8.44 F23 2_6,3_5 7.98 7.85 F62 2_6,2_6 2.45 2.32

F12 2_6,2_6 1.96 1.98 F18 3,2_6 0 0 F23 2_6,4’ 1.2 1.32 F62 2_6,3_5 8.77 8.44

F12 2_6,3_5 8.02 7.85 F18 3’’,4’’ 9.11 7.28 F23 3_5,2_6 0.64 0.58 F62 3,2_6 0.26 0

F12 2_6,4’ 1.19 1.32 F18 3’’,6A 0.01 0 F23 3_5,3_5 1.5 1.26 F62 3,6 -0.32 0

F12 3_5,2_6 0.57 0.58 F18 3_5,2_6 0.26 0.46 F23 3_5,4’ 7.43 7.54 F62 3_5,2_6 0.35 0.46

F12 3_5,3_5 1.39 1.26 F18 3_5,3_5 2.31 2.65 F23 5,8 0.33 0.04 F62 3_5,3_5 2.75 2.65

F12 3_5,4’ 7.46 7.54 F18 4’’,5’’ 9.75 8.88 F23 6,5 8.82 8.45 F62 6,8 2.29 2.59

F12 6,8 2 2.59 F18 4’’,6A -0.27 -0.39 F23 6,8 2.21 2.59 F64 1’’,2’’ 8.33 7.39

F120 2’,5’ 0.44 0.46 F18 4’’,6B -0.03 -0.4 F230 2’’’,1’’’ 1.71 3.56 F64 26,26 2.31 2.32

F120 2’,6’ 2.18 1.97 F18 5’’,6A 2.31 4.49 F230 2’’’,3’’’ 3.36 3.24 F64 26,35 8.31 8.26

F120 3,2’ 0.01 0 F18 5’’,6B 5.94 7.47 F230 3’’’,4’’’ 9.85 2.29 F64 3’’,2’’ 9.76 7.12

F120 3,5 0.01 0 F18 6A,6B -12.25 -12.43 F230 4’’’,5’’’ 8.74 5.37 F64 3’,5’ 0.98 0.95

F120 3,6 -0.01 0 F18 8,6 2.2 2.59 F230 6’’’,5’’’ 6.14 6.25 F64 3’’,6A 2.87 0

F120 3,6’ 0.01 0 F180 2’,5’ 0.46 0.46 F230 1’’,2’’ 7.75 6.31 F64 35,26 0.33 0.42

F120 5,6 9.01 8.45 F180 2’,6’ 2.2 1.87 F230 2_6,2_6 2.46 2.32 F64 35,35 2.7 2.87

F120 5’,6’ 8.52 8.45 F180 5’,6’ 8.06 8.45 F230 2_6,3_5 8.43 8.26 F64 4’,3’ 8.35 8.39

F121 2_6,2_6 2.53 2.32 F180 6,8 2.2 1.96 F230 3’’,2’’ 9.07 5.83 F64 4’’,3’’ 10.85 7.3

F121 2_6,3_5 8.39 8.44 F184 3’,4’ 8.44 8.14 F230 3_5,2_6 0.37 0.42 F64 4’,5’ 8.32 8.39

F121 2_6,O 0.36 0 F184 3,5 0.1 0 F230 3_5,3_5 2.4 2.87 F64 4’’,5’’ 8.44 8.76

F121 3_5,2_6 0.46 0.46 F184 3’,5’ 1.01 0.9 F230 3A,2 2.86 3.05 F64 5’’,6A 5.13 3.98

F121 3_5,3_5 2.39 2.65 F184 3,6 -0.01 0 F230 3B,2 12.92 10.08 F64 5’’,6B 5.9 5.06

F121 5,8 0.45 0.04 F184 3,6’ 0.04 0 F230 3B,3A -17.12 -16.5 F64 6A,6B -12.11 -12.45

F121 6,5 8.82 8.45 F184 3’,6’ 0.49 0.52 F230 4’’,3’’ 9.61 7.25 F64 7,26 -0.42 -0.74

F121 6,8 2.29 2.59 F184 4’,5’ 7.15 7.55 F230 5’’,4’’ 9.59 9.16 F64 8,7 7.59 10.82

F122 2,7 -0.06 -0.62 F184 4’,6’ 1.67 1.68 F230 5’’,6A 2.47 6.24 F65 2_6,2_6 1.91 1.42

F122 2_6,2_6 1.87 1.98 F184 5,6 8.8 8.45 F230 5’’,6B 5.4 4.24 F65 3,2_6 0.44 0

F122 2_6,3_5 7.89 7.85 F184 5’,6’ 7.95 7.85 F230 6A,6B -12.02 -12.46 F65 3,6 -0.21 0

F122 2_6,4’ 1.25 1.32 F184 5,8 0.24 0.04 F230 8,6 2.27 2.59 F65 6,8 2.25 2.59

F122 3_5,2_6 0.51 0.58 F184 6,8 2.35 2.59 F231 2’,2 -0.54 -0.75 F66 2’,5’ 0.46 0.46

F122 3_5,3_5 1.19 1.26 F185 3’,4’ 8.24 8.14 F231 2,3 7.53 7.7 F66 2’,6’ 2.31 1.87

F122 3_5,4’ 7.41 7.54 F185 3,5 0.01 0 F231 3,4A 5.41 5.96 F66 5’,6’ 8.35 8.45

F122 5,2 0.53 0.39 F185 3’,5’ 1.12 0.9 F231 3,4B 8.15 7.78 F66 6,8 2.23 2.59

F122 5,6 8.31 8.45 F185 3,6’ 0.01 0 F231 4B,4A -16.1 -15.33 F67 2_6,2_6 2.02 1.42

F122 6,2 2.03 1.77 F185 3’,6’ 0.35 0.52 F231 5’,2’ 0.29 0.42 F67 3,2_6 0.03 0

F122 6,7 -0.54 -0.65 F185 4’,5’ 7.25 7.55 F231 6’,2 -0.56 -0.76 F69 1’’,2’’ 7.19 7.07

F122 7,8 15.56 17.13 F185 4’,6’ 1.7 1.68 F231 6’,2’ 2.06 1.97 F69 3’’,2’’ 7.33 7.28

F123 2_6,2_6 1.99 1.98 F185 5’,6’ 7.96 7.85 F231 6’,5’ 8.11 8.27 F69 3’’,4’’ 7.45 7.47

F123 2_6,3_5 7.89 7.85 F185 5,7 3.01 2.59 F231 6,8 2.31 2.59 F69 5’,2’ 0.46 0.46

F123 2_6,4’ 1.23 1.32 F185 5,8 0.4 0.04 F232 1’’,2’’ 7.7 7.39 F69 5’’,4’’ 8.76 8.68

F123 3,2_6 0 0 F185 7,8 9.02 8.45 F232 2’’,3’’ 9.21 7.46 F69 5’’,6A 6.07 4.54

F123 3_5,2_6 0.61 0.58 F187 3’,4’ 8.25 8.14 F232 2,6 1.38 1.28 F69 5’’,6B 4.65 4.88

F123 3_5,3_5 1.31 1.26 F187 3,5 0 0 F232 2,7 -0.53 -0.78 F69 6’,2’ 1.97 1.97

F123 3_5,4’ 7.49 7.54 F187 3’,5’ 1.12 0.9 F232 2_6,2_6 2.5 2.32 F69 6’,5’ 8.45 8.45

F123 6,8 2.34 2.59 F187 3,6 0 0 F232 2_6,3_5 8.45 8.44 F69 6B,6A -12.75 -12.76

F124 2-6,2-6 2.56 1.92 F187 3,6’ 0 0 F232 3’’,4’’ 9 7.73 F69 8,6 1.96 1.96

F124 2-6,3-5 0.35 0.39 F187 3’,6’ 0.38 0.52 F232 3_5,2_6 0.37 0.46 F7 2’,5’ 0.57 0.46

F124 2-6,7 -0.56 -0.7 F187 4’,5’ 7.25 7.55 F232 3_5,3_5 2.5 2.65 F7 2’,6’ 2.2 1.97

F124 3-5,2-6 8.47 8.44 F187 4’,6’ 1.7 1.68 F232 4,2 2.21 2.57 F7 3,2’ 0.6 0

F124 3-5,3-5 2.35 2.26 F187 5,6 7.97 7.85 F232 4,6 2.13 2.57 F7 3,5 0.18 0

F124 5’,3’ 2.4 2.3 F187 5’,6’ 7.96 7.85 F232 4,7 -1.2 -1.19 F7 3,6’ 0.36 0

F124 6’,3’ 0.22 0.39 F187 5,7 1.68 1.2 F232 5’’,4’’ 9.72 9.09 F7 5’,6’ 8.27 8.45

F124 6’,5’ 8.88 8.45 F187 5,8 0.51 0.17 F232 5’’,6A 2.46 4.33 F7 5,7 1.65 1.2

F124 7,8 15.37 17.12 F187 6,7 7.14 8.14 F232 5’’,6B 6.08 4.43 F7 5,8 0.56 0.17

F125 2_6,2_6 1.9 1.98 F187 6,8 1.02 1.2 F232 6,7 -0.51 -0.76 F7 6,3 -0.47 0

F125 2_6,3_5 7.98 7.85 F187 7,8 8.46 7.85 F232 6B,6A -12.08 -12.46 F7 6,5 7.92 7.85

F125 2_6,4’ 1.15 1.32 F19 2’,5’ 0.1 0.46 F232 7,8 16.26 17.13 F7 6,7 7.32 8.14

F125 3_5,2_6 0.59 0.58 F19 2’,6’ 2.17 1.97 F232 8,2_6 -0.55 -0.79 F7 6,8 1.06 1.2

F125 3_5,3_5 1.33 1.26 F19 5’,6’ 8.48 8.45 F233 2,5 0.36 0.39 F7 7,8 8.44 7.85

F125 3_5,4’ 7.44 7.54 F19 5,7 1.65 1.2 F233 2,6 2.09 1.67 F70 1’’,2’’ 6.42 7.09

F125 5,6 8.83 8.45 F19 5,8 0.52 0.17 F233 5’,3’ 2.41 2.3 F70 1’’,5’’ -6.8 -0.19

F125 5,8 0.33 0.04 F19 6,5 8.03 7.85 F233 5,6 8.17 8.45 F70 2,2’ -0.73 -0.79

F125 6,8 2.23 2.59 F19 6,7 7.06 8.14 F233 6’,3’ 0.35 0.39 F70 2’’,3’’ 7.14 7.23

F126 3’,4’ 8.29 8.14 F19 6,8 1 1.2 F233 6’,5’ 8.88 8.45 F70 2,6’ -0.01 -0.79

F126 3’,5’ 1.09 0.9 F19 7,8 8.52 7.85 F233 7,2 -0.51 -0.79 F70 3’’,4’’ 10.44 7.67

F126 3’,6’ 0.49 0.52 F190 3’,4’ 8.28 8.14 F233 7,6 -0.55 -0.78 F70 3’’,6A 1.95 0

F126 4’,5’ 7.28 7.55 F190 3’,5’ 1.08 0.9 F233 7,8 15.3 17.13 F70 3A,2 3.12 2.79

F126 4’,6’ 1.7 1.68 F190 3’,6’ 0.42 0.52 F234 2’,2 -0.64 -0.79 F70 3A,3B -17.2 -16.34

F126 5’,6’ 7.77 7.85 F190 4’,5’ 7.31 7.55 F234 2,3 2.42 2.07 F70 3B,2 12.69 10.34

F126 6,8 2.08 2.59 F190 4’,6’ 1.66 1.68 F234 2’,5’ 0.34 0.42 F70 4’’,5’’ 0.5 9.03

F127 2,2_6 0.29 0 F190 5,6 8.06 7.85 F234 2’,6’ 2.06 1.87 F70 4’’,6A -2.47 -0.4

F127 2_6,2_6 2.54 2.32 F190 5’,6’ 7.73 7.85 F234 3,4A 3.1 3.03 F70 4’’,6B -0.54 -0.4

F127 2_6,3_5 8.52 8.44 F190 5,7 1.62 1.2 F234 3,4B 4.6 2.77 F70 5’,2’ 0.57 0.42

F127 3_5,2_6 0.48 0.46 F190 5,8 0.48 0.17 F234 4A,4B -16.71 -14.82 F70 5’’,6A 0.23 5.3

F127 3_5,3_5 2.51 2.65 F190 6,7 7.06 8.14 F234 5’,6’ 8.15 8.27 F70 5’’,6B 10.05 4.1

F127 5,8 0.55 0.04 F190 6,8 0.93 1.2 F234 6’,2 -0.67 -0.79 F70 6,1’’ 1.8 0

F128 2_6,2_6 2.07 1.42 F190 7,8 8.51 7.85 F234 6,8 2.34 2.59 F70 6’,2’ 1.81 1.97

F128 6,8 2 2.59 F191 2_6,2_6 2.17 1.98 F235 3’,5’ 2.36 2.3 F70 6’,5’ 0.47 8.27

F129 1’’,2’’ 7.83 6.77 F191 2_6,3_5 6.29 7.85 F235 3’,6’ 0.38 0.39 F70 6B,6A -12.16 -13.29

F129 2_6,2_6 2.47 2.32 F191 2_6,4’ 2.34 1.32 F235 5’,6’ 8.6 8.45 F70 8,1’’ 1.8 0

F129 2_6,3_5 8.71 8.44 F191 3_5,2_6 1.3 0.58 F235 6,8 2.1 1.96 F70 8,6 1.43 1.96

F129 3’’,2’’ 9.26 6.93 F191 3_5,3_5 1.1 1.26 F236 3,2’ 0.48 0 F71 2,7 -0.51 -0.61

F129 3_5,2_6 0.38 0.46 F191 3_5,4’ 4.6 7.54 F236 3,6’ 0 0 F71 5,2 0.39 0.39

F129 3_5,3_5 2.49 2.65 F191 5,7 3.1 2.59 F236 5’,2’ 0.1 0.46 F71 5’,3’ 2.4 2.3

F129 4’’,3’’ 9 7.46 F191 5,8 0.03 0.04 F236 6’,2’ 2.16 1.97 F71 5,6 8.16 8.45

F129 4’’,5’’ 9.83 8.9 F191 7,8 9.16 8.45 F236 6’,5’ 8.47 8.45 F71 6,2 1.96 1.77

F129 5’’,6A 2.35 4 F192 2_6,2_6 2.69 2.32 F236 6,8 2.12 1.96 F71 6’,3’ 0.39 0.39

F129 5’’,6B 5.57 5.29 F192 2_6,3_5 8.67 8.44 F237 2’,3 0.22 0 F71 6’,5’ 8.88 8.45

F129 6,8 1.82 2.59 F192 3,2_6 0.22 0 F237 5’,2’ 0.34 0.46 F71 6,7 -0.54 -0.64

F129 6B,6A -11.87 -12.77 F192 3,5 0.02 0 F237 5’,6’ 8.53 8.45 F71 7,8 15.31 17.14

F130 1’’,2’’ 7.87 7.14 F192 3_5,2_6 0.29 0.46 F237 6’,2’ 2.29 1.97 F72 2_6,2_6 2.35 1.98

F130 2’,5’ 0.18 0.46 F192 3_5,3_5 2.4 2.65 F237 6,3 -0.26 0 F72 2_6,3_5 7.92 7.85

F130 2’,6’ 2.18 1.87 F192 5,7 3.01 2.59 F237 6’,3 0.27 0 F72 2_6,4’ 1.19 1.32

F130 3’’,2’’ 9.26 7.03 F192 5,8 0.4 0.04 F237 6,8 2.12 1.96 F72 3,2_6 0.29 0

F130 3’’,4’’ 8.92 7.28 F192 7,8 9.01 8.45 F238 2_6,2_6 2.03 1.78 F72 3,5 0.01 0

F130 4’’,5’’ 9.85 8.97 F193 2_6,2_6 1.99 1.98 F238 2_6,3_5 0.55 0.39 F72 3_5,2_6 0.56 0.58

F130 5’,6’ 8.46 8.45 F193 2_6,3_5 7.82 7.85 F238 2_6,4’ 1.08 1.73 F72 3_5,3_5 1.67 1.26

F130 6,8 2.08 2.59 F193 2_6,4’ 1.24 1.32 F238 2_6,8 -0.71 -0.78 F72 3_5,4’ 7.47 7.54

F130 6A,5’’ 2.21 4.29 F193 3_5,2_6 0.57 0.58 F238 26,26 1.72 1.98 F72 5,8 0.4 0.04

F130 6A,6B -11.79 -13.21 F193 3_5,3_5 1.36 1.26 F238 26,35 0.52 0.58 F73 2’,5’ 0.82 0.46

F130 6B,5’’ 5.84 4.86 F193 3_5,4’ 7.46 7.54 F238 26,4 1.26 1.32 F73 2’,6’ 2.06 1.87

F131 2’’’,1’’’ 1.67 4.53 F193 5,6 7.97 7.85 F238 26,7 -0.48 -0.78 F73 3,2’ 0.04 0

F131 3’’’,2’’’ 3.43 4.77 F193 5,7 1.76 1.2 F238 3_5,2_6 7.63 8.09 F73 3,6’ 0.29 0

F131 4’’’,3’’’ 9.53 7.99 F193 6,7 7.46 8.14 F238 3_5,3_5 1.57 1.25 F73 5’,6’ 7.94 8.45

F131 4’’’,5’’’ 9.49 7.23 F194 2_6,2_6 2.14 1.98 F238 3_5,4’ 7.44 8.05 F74 2_6,2_6 1.97 1.98

F131 5’’’,6’’’ 6.24 6.25 F194 2_6,3_5 7.87 7.85 F238 35,26 7.99 7.85 F74 2_6,3_5 7.95 7.85

F131 1’’,2’’ 7.8 7.18 F194 2_6,4’ 1.3 1.32 F238 35,35 1.15 1.26 F74 2_6,4’ 1.19 1.32

F131 3’’,2’’ 9.2 6.85 F194 3_5,2_6 0.63 0.58 F238 35,4 7.37 7.54 F74 3,2_6 0.3 0

F131 3’,4’ 8.27 8.14 F194 3_5,3_5 1.26 1.26 F238 4,7 -1.24 -1.23 F74 3_5,2_6 0.56 0.58

F131 3’’,4’’ 8.91 7.28 F194 3_5,4’ 7.43 7.54 F238 4’,8 -1.24 -1.23 F74 3_5,3_5 1.39 1.26

F131 3’,5’ 1.04 0.9 F194 5,6 8.94 8.45 F238 8,7 13.29 17.14 F74 3_5,4’ 7.41 7.54

F131 3’,6’ 0.44 0.52 F194 5,8 0.04 0.04 F24 2_6,2_6 2.76 1.98 F75 3,2’ 0.11 0

F131 4’,5’ 7.35 7.55 F194 6,8 2.34 2.59 F24 2_6,3 0.12 0 F75 3,6 -0.12 0

F131 4’’,5’’ 9.86 9.01 F195 2,2_6 0.01 0 F24 2_6,3_5 7.86 7.85 F75 3,6’ 0.38 0

F131 4’,6’ 1.71 1.68 F195 2_6,2_6 2.83 2.32 F24 2_6,4’ 1.28 1.32 F75 5’,2’ 0.5 0.46

F131 5’,6’ 7.69 7.85 F195 2_6,3_5 8.53 8.44 F24 3_5,2_6 0.6 0.58 F75 6’,2’ 2.17 1.97

F131 6,8 2.09 2.59 F195 3_5,2_6 0.37 0.46 F24 3_5,3_5 1.26 1.26 F75 6’,5’ 8.52 8.45

F131 6A,5’’ 1.8 2.99 F195 3_5,3_5 2.26 2.65 F24 3_5,4’ 7.35 7.54 F75 8,6 2.3 2.59

F131 6A,6B -11.13 -12.4 F195 5,8 0.02 0.39 F24 6,3 0 0 F77 1’’,2’’ 7.76 7.32

F131 6B,5’’ 5.71 3.54 F197 2,3 11.59 7.87 F24 6,5 8.72 8.45 F77 2’,5’ 0.21 0.46

F133 7’’’,26 -0.52 -0.79 F197 2_6,2 0.48 -0.79 F241 46,46 1.93 1.42 F77 2’,6’ 2.08 1.87

F133 8’’’,7’’’ 15.92 17.06 F197 2_6,2_6 2.27 2.32 F241 5,46 8.1 8.39 F77 3’’,2’’ 9.27 7.28

F133 1’’,2’’ 7.8 7.37 F197 2_6,3 0.01 0 F242 36,36 0.28 0.39 F77 4’’,3’’ 9.15 7.61

F133 2’’,3’’ 9.25 7.43 F197 2_6,3_5 8.4 8.26 F242 36,45 7.9 8.39 F77 5’’,4’’ 9.8 9.11

F133 2_6,2_6 2.43 2.32 F197 3_5,2 0.28 0.57 F242 45,36 1.53 1.57 F77 5’,6’ 8.44 8.45

F133 2_6,3_5 8.71 8.44 F197 3_5,2_6 0.34 0.42 F242 45,45 7.47 8.05 F77 5’’,6A 2.34 4.45

F133 26,26 2.32 2.32 F197 3_5,3_5 2.7 2.87 F243 26,26 2.57 2.57 F77 5’’,6B 5.61 6.88

F133 26,35 8.45 8.44 F197 6,8 2.1 2.59 F243 26,35 0.46 0.39 F77 6,8 2.1 2.59

F133 3’’,4’’ 8.95 7.63 F197 8,3 0 0.24 F243 35,26 8.19 8.39 F77 6A,6B -11.93 -12.35

F133 3_5,2_6 0.36 0.46 F198 2_6,2 -0.47 -0.72 F243 35,35 1.85 1.72 F8 2_6,2_6 2.39 2.32

F133 3_5,3_5 2.57 2.65 F198 2_6,2_6 1.99 1.42 F243 4,26 1.06 1.42 F8 2_6,3_5 8.68 8.44

F133 35,26 0.38 0.46 F198 3,2 11.34 8.36 F243 4,35 7.36 8.05 F8 3,2_6 0.15 0

F133 35,35 2.51 2.65 F198 6,8 2.13 2.59 F244 26,26 3.09 2.72 F8 3_5,2_6 0.33 0.46

F133 4’’,5’’ 9.86 8.95 F199 2_6,2_6 3.14 3.12 F244 26,35 8.81 8.75 F8 3_5,3_5 2.64 2.65

F133 5’’,6A 2.2 3.03 F199 2_6,3_5 8.82 8.75 F244 35,26 0.36 0.39 F8 6,7 8.24 8.14

F133 5’’,6B 6.69 7.17 F199 3_5,2_6 0.27 0.39 F244 35,35 3.09 2.72 F8 6,8 0.87 1.2

F133 6B,6A -11.76 -13.57 F199 3_5,3_5 2.92 3.12 F245 4,2 2.42 2.42 F8 7,8 8.4 7.85

F133 8,6 2.09 2.59 F2 2_6,2 -1.29 -0.79 F245 5,2 0.3 0.39 F83 2_6,2_6 2.62 2.32

F134 2_6,2_6 1.98 1.98 F2 2_6,2_6 0.92 1.98 F245 5,4 8.24 8.39 F83 2_6,3_5 8.61 8.44

F134 2_6,3_5 7.93 7.85 F2 2_6,3_5 7.75 7.66 F245 6,2 2.29 2.42 F83 26,26 1.98 1.78

F134 2_6,4’ 1.23 1.32 F2 2_6,4’ 1.06 1.26 F245 6,4 0.82 1.28 F83 26,35 0.63 0.39

F134 3’’,2’’ 2.21 5 F2 3_5,2_6 0.65 0.54 F245 6,5 8.08 8.39 F83 26,4 1.21 1.73

F134 3_5,2_6 0.63 0.58 F2 3_5,3_5 0.48 1.48 F246 4,3 7.93 8.39 F83 26,7 -0.42 -0.65

F134 3_5,3_5 1.4 1.26 F2 3_5,4’ 7.42 7.54 F246 5,3 1.57 1.57 F83 3_5,2_6 0.32 0.46

F134 3_5,4’ 7.44 7.54 F2 3A,2 3.2 3.85 F246 5,4 7.67 8.05 F83 3_5,3_5 2.29 2.65

F134 5,2’’ 0.11 0 F2 3B,2 13 10.14 F246 6,3 0.09 0.39 F83 35,26 7.77 8.09

F134 5,3’’ -0.3 -0.04 F2 3B,3A -16.99 -16.6 F246 6,4 1.47 1.57 F83 35,35 1.36 1.25

F134 5,6 8.83 8.14 F2 4’,2 -1.17 -1.19 F246 6,5 8.17 8.39 F83 35,4 7.4 8.05

F134 6,2’’ -0.3 0 F2 5,7 3.1 2.59 F247 26,26 2.71 2.57 F83 4,7 -1.07 -0.95

F134 6,3’’ 0.8 0.75 F2 5,8 0.43 0.04 F247 26,35 0.37 0.39 F83 7,8 13.99 17.12

F137 1’’’,2’’’ 1.74 2.06 F2 7,8 8.88 8.45 F247 35,26 8.39 8.39 F84 2’’,1’’ 9.91 9.16

F137 2’’’,3’’’ 3.23 3.11 F20 2_6,2_6 2.07 1.98 F247 35,35 1.84 1.72 F84 3’’,2’’ 8.96 7.61

F137 4’’’,3’’’ 9.59 8.32 F20 2_6,3_5 7.93 7.85 F247 4,26 1.03 1.42 F84 3,6’ 0 0

F137 5’’’,4’’’ 9.46 9 F20 2_6,4’ 1.2 1.32 F247 4,35 7.35 8.05 F84 4’’,3’’ 8.75 7.55

F137 5’’’,6’’’ 6.18 6.25 F20 3_5,2_6 0.59 0.58 F248 23,23 8.11 8.75 F84 4’’,5’’ 9.6 9

F137 1’’,2’’ 9.98 8.98 F20 3_5,3_5 1.37 1.26 F248 23,46 1 0.39 F84 5’,2’ 0.12 0.46

F137 2_6,2_6 2.6 2.32 F20 3_5,4’ 7.48 7.54 F248 46,23 2.03 2.72 F84 5’’,6A 2.32 6.1

F137 2_6,3_5 8.67 8.44 F20 6,8 2.11 2.59 F248 46,46 8.11 8.75 F84 5’’,6B 5.52 4.03

F137 3’’,2’’ 8.72 6.79 F200 2_6,2_6 2.71 2.32 F249 5,2 0.39 0.39 F84 6’,2’ 2.27 1.97

F137 3,2_6 0.01 0 F200 2_6,3_5 8.74 8.44 F249 5,6 8.01 8.75 F84 6’,5’ 8.49 8.45

F137 3_5,2_6 0.32 0.46 F200 3_5,2_6 0.34 0.46 F249 6,2 1.94 2.57 F84 6A,6B -12.13 -13.29

F137 3_5,3_5 2.37 2.65 F200 3_5,3_5 2.32 2.65 F25 4’,2’ 1.07 2.73 F86 2_6,2 -0.56 -0.77

F137 4’’,3’’ 8.77 7.45 F200 5,6 8.01 7.85 F25 5’,2’ 0.67 0.46 F86 2_6,2_6 2.38 2.32

F137 4’’,5’’ 9.68 9.18 F200 5,7 1.64 1.2 F25 5’,4’ 7.94 8.14 F86 2_6,3_5 8.41 8.26

F137 5’’,6A 2.23 5.9 F200 5,8 0.48 0.17 F25 5,6 8.06 7.85 F86 3_5,2_6 0.38 0.42

F137 5’’,6B 5.89 4 F200 6,7 7.07 8.14 F25 5,7 1.61 1.2 F86 3_5,3_5 2.63 2.87

F137 6A,6B -12.11 -13.29 F200 6,8 0.96 1.2 F25 5,8 0.5 0.17 F86 3A,2 2.93 5.04

F138 2’,5’ 0.47 0.46 F200 7,8 8.5 7.85 F25 6’,2’ 1.16 1.63 F86 3A,3B -16.89 -16.75

F138 2’,6’ 2.16 1.87 F202 2’’’,1’’’ 1.79 2.43 F25 6’,4’ 2.28 0.97 F86 3B,2 13.11 5.84

F138 5’,6’ 8.48 8.45 F202 3’’’,2’’’ 3.37 3.28 F25 6’,5’ 8.16 7.85 F86 5,6 8.73 8.45

F139 4’,3’ 8.37 8.14 F202 4’’’,3’’’ 9.56 9 F25 6,7 7.03 8.14 F86 5,8 0 0.04

F139 4,7 -1.24 -1.01 F202 5’’’,4’’’ 9.58 9.33 F25 6,8 0.98 1.2 F86 6,8 2.27 2.59

F139 5’,3’ 1.15 0.9 F202 6’’’,5’’’ 6.24 6.25 F25 7,8 8.55 7.85 F87 2_6,2 -0.63 -0.79

F139 5,4 8.15 8.39 F202 1’’,2’’ 7.78 6.65 F250 36,36 0.12 0.39 F87 2_6,2_6 1.91 1.98

F139 5’,4’ 7.17 7.55 F202 2’’,3’’ 9.19 6.28 F250 36,45 1.49 1.57 F87 2_6,3_5 7.74 7.66

F139 5,6 7.95 8.1 F202 26,26 2.27 1.92 F250 45,36 8.04 8.39 F87 2_6,4’ 1.26 1.26

F139 6’,3’ 0.43 0.52 F202 26,35 8.63 8.44 F250 45,45 7.59 8.05 F87 3_5,2_6 0.63 0.54

F139 6,4 1.38 1.72 F202 26,7 -0.38 -0.79 F26 2,2_6 0.08 0 F87 3_5,3_5 1.44 1.48

F139 6’,4’ 1.61 1.68 F202 3’’,4’’ 9.14 7.25 F26 2_6,2_6 2.78 2.32 F87 3_5,4’ 7.45 7.54

F139 6’,5’ 8.03 7.85 F202 3_5,3_5 2.05 1.96 F26 2_6,3_5 0.38 0.46 F87 3A,2 3.27 1.96

F139 6,7 -0.52 -0.68 F202 35,26 0.38 0.39 F26 3_5,2_6 8.56 8.44 F87 3A,3B -16.76 -16.76

F139 7,8 15.66 17.13 F202 35,35 2.56 2.26 F26 3_5,3_5 2.33 2.65 F87 3B,2 12.01 10.42

F14 2_6,2_6 2.43 2.32 F202 5’’,4’’ 9.75 8.87 F26 6,8 2.16 2.59 F87 4’,2 -1.19 -1.18

F14 2_6,3_5 8.66 8.44 F202 5’’,6A 2.34 4.04 F27 2,3B 13.19 9.59 F87 6,5 7.96 7.85

F14 3,2_6 0.16 0 F202 5’’,6B 5.52 5.46 F27 2_6,2 -0.61 -0.78 F87 7,5 1.67 1.2

F14 3_5,2_6 0.36 0.46 F202 6B,6A -12.15 -12.54 F27 2_6,2_6 1.97 1.98 F87 7,6 7.19 8.14

F14 3_5,3_5 2.61 2.65 F202 7,8 15.57 17.12 F27 2_6,3_5 7.77 7.66 F87 8,5 0.45 0.17

F14 5,7 1.68 1.2 F203 11,12 7.42 10.82 F27 2_6,4’ 1.21 1.26 F87 8,6 1.2 1.2

F14 5,8 0.49 0.17 F203 12,14 -1 -1.69 F27 3_5,2_6 0.54 0.54 F87 8,7 8.22 7.85

F14 6,5 7.94 7.85 F203 12,15 -1 -1.69 F27 3_5,3_5 1.32 1.48 F88 2’’’,1’’’ 1.54 2.39

F14 6,7 7.15 8.14 F203 2_6,2 0 -0.75 F27 3_5,4’ 7.51 7.54 F88 3’’’,2’’’ 3.21 3.25

F14 6,8 1.02 1.2 F203 2_6,2_6 2.46 2.32 F27 3A,2 2.95 3.84 F88 4’’’,3’’’ 9.71 7.96

F14 7,8 8.45 7.85 F203 2_6,3_5 8.41 8.26 F27 3A,3B -16.96 -16.54 F88 5’’’,4’’’ 9.42 9.11

F142 2_6,2_6 2.04 2.32 F203 3_5,2_6 0.43 0.42 F27 4’,2 -1.1 -1.18 F88 6’’’,5’’’ 6.3 6.25

F142 2_6,3_5 8.68 8.44 F203 3_5,3_5 2.57 2.87 F27 5,7 3.2 2.59 F88 1’’,2’’ 7.72 6.78

F142 26,26 2.85 2.32 F203 3A,2 2.92 3.55 F27 5,8 0.42 0.04 F88 2’’,3’’ 8.97 6.69

F142 26,35 8.62 8.44 F203 3B,2 13.23 9.25 F27 7,8 9.01 8.45 F88 2_6,2_6 2.99 2.32

F142 3,2_6 0.02 0 F203 3B,3A -16.9 -16.17 F28 2,2_6 -0.59 -0.79 F88 2_6,3 0.01 0

F142 3,26 0.4 0 F203 5,11 -0.75 -0.72 F28 2,4’ -1.19 -1.19 F88 2_6,3_5 0.2 0.46

F142 3,6 0 0 F203 5,8 0.01 0.39 F28 2_6,2_6 1.99 1.98 F88 3’’,4’’ 9.49 7.58

F142 3_5,2_6 0.24 0.46 F204 12,11 5.93 7.85 F28 2_6,3_5 7.73 7.66 F88 3_5,2_6 8.8 8.44

F142 3_5,3_5 2.04 2.65 F204 13,11 1.33 1.2 F28 2_6,4’ 1.24 1.26 F88 3_5,3_5 2.13 2.65

F142 35,26 0.51 0.46 F204 13,12 6.81 8.14 F28 3_5,2_6 0.63 0.54 F88 5’’,4’’ 9.87 9.21

F142 35,35 2.51 2.65 F204 14,11 0.17 0.17 F28 3_5,3_5 1.48 1.48 F88 5’’,6A 2.61 3.22

F143 2_6,2_6 2.01 1.98 F204 14,12 6.5 1.2 F28 3_5,4’ 7.47 7.54 F88 5’’,6B 5.76 3.51

F143 2_6,3_5 7.93 7.85 F204 14,13 6 7.85 F28 3A,2 3.01 4.25 F88 6,3 -0.16 0

F143 2_6,4’ 1.19 1.32 F204 2_6,2_6 0 1.98 F28 3A,3B -16.56 -16.76 F88 6,8 2.19 2.59

F143 3,2_6 0 0 F204 2_6,3_5 7.07 7.85 F28 3B,2 12.87 6.83 F88 6B,6A -12.43 -12.44

F143 3,6 -0.02 0 F204 2_6,4’ 1.32 1.32 F28 6,7 8.39 8.14 F89 2_6,2_6 2.02 1.42

F143 3_5,2_6 0.58 0.58 F204 3,2_6 1.2 0 F28 6,8 0.92 1.2 F89 6,8 2.07 2.59

F143 3_5,3_5 1.32 1.26 F204 3_5,2_6 1.07 0.58 F28 7,8 8.33 7.85 F9 2’,4’ 2.43 2.73

F143 3_5,4’ 7.45 7.54 F204 3_5,3_5 4 1.26 F29 3’,5’ 2.35 2.3 F9 2’,5’ 0.46 0.46

F143 6,8 2.29 2.59 F204 3_5,4’ 2.83 7.54 F29 3’,6’ 0.29 0.39 F9 2’,6’ 1.75 1.63

F144 2_6,2 -0.65 -0.79 F204 5,11 -0.3 -0.3 F29 5’,6’ 8.6 8.45 F9 4’,5’ 8.14 8.14

F144 2_6,2_6 1.91 1.98 F204 5,13 0 0.11 F29 6,8 2.09 2.59 F9 4’,6’ 0.91 0.97

F144 2_6,3_5 7.77 7.66 F204 5,14 -0.15 -0.15 F3 2_6,2_6 1.92 1.98 F9 5,6 8.78 8.45

F144 2_6,4’ 1.21 1.26 F204 5,3 0.02 0 F3 2_6,3_5 8.03 7.85 F9 5’,6’ 7.77 7.85

F144 3_5,2_6 0.57 0.54 F204 6,11 0.85 0.85 F3 2_6,4’ 1.19 1.32 F9 5,8 0.04 0.04

F144 3_5,3_5 1.31 1.48 F204 6,12 -0.3 -0.3 F3 3_5,2_6 0.58 0.58 F9 6,8 2.24 2.59

F144 3_5,4’ 7.47 7.54 F204 6,13 -0.15 -0.15 F3 3_5,3_5 1.42 1.26 F90 10,2’ -0.51 -0.77

F144 3A,2 3.08 2.65 F204 6,14 -0.44 -0.44 F3 3_5,4’ 7.44 7.54 F90 10,6’ -0.01 -0.79

F144 3A,3B -16.64 -16.73 F204 6,5 8.71 8.14 F3 5,6 8.04 7.85 F90 5’,2’ 0.19 0.46

F144 3B,2 12.8 10.69 F206 2_6,2_6 2.69 2.32 F3 5,7 1.66 1.2 F90 5,4 8.49 8.45

F144 4’,2 -1.15 -1.19 F206 2_6,3_5 8.79 8.44 F3 5,8 0.47 0.17 F90 6’,2’ 2.07 1.97

F144 6,8 2.3 2.59 F206 3,2_6 0.08 0 F3 6,7 7.05 8.14 F90 6’,5’ 8.23 8.45

F145 2_6,2 -0.63 -0.78 F206 3_5,2_6 0.36 0.46 F3 6,8 0.96 1.2 F90 7,4 0.26 0.04

F145 2_6,2_6 1.93 1.98 F206 3_5,3_5 2.48 2.65 F3 7,8 8.54 7.85 F90 7,5 1.97 2.59

F145 2_6,3_5 7.78 7.66 F206 6,3 -0.23 0 F30 2_6,2_6 1.92 1.98 F91 2’,5’ 0.32 0.46

F145 2_6,4’ 1.21 1.26 F206 6,8 2.13 1.96 F30 2_6,3_5 7.84 7.85 F91 2’,6’ 2.16 1.97

F145 3_5,2_6 0.57 0.54 F207 2’,2 -0.03 -0.79 F30 2_6,4’ 1.24 1.32 F91 3,6 -0.01 0

F145 3_5,3_5 1.33 1.48 F207 3A,2 3.09 3.3 F30 26,26 2.43 1.92 F91 5,6 8.71 8.45

F145 3_5,4’ 7.47 7.54 F207 3B,2 12.67 10.06 F30 26,35 0.4 0.39 F91 5’,6’ 8.36 8.45

F145 3A,2 3.13 5.07 F207 3B,3A -17.11 -16.1 F30 26,7 -0.53 -0.56 F92 1’’,2’’ 7.11 6.83

F145 3A,3B -17.14 -16.74 F207 5’,2’ 0.37 0.42 F30 3_5,2_6 0.59 0.58 F92 2’’,3’’ 7.1 6.93

F145 3B,2 12.78 6.4 F207 6’,2’ 2.19 1.97 F30 3_5,3_5 1.3 1.26 F92 2_6,2_6 1.42 1.42

F145 4’,2 -1.19 -1.18 F207 6’,5’ 8.29 8.27 F30 3_5,4’ 7.43 7.54 F92 3’’,4’’ 7.63 6.98

F145 6,8 2.3 2.59 F207 6,8 2.18 1.96 F30 35,26 8.61 8.44 F92 5’’,4’’ 9.2 8.36

F146 2_6,2_6 1.93 1.42 F208 12,11 8.06 7.85 F30 35,35 2.63 2.26 F92 5’’,6A 4.83 5.45

F146 3,2_6 0.35 0 F208 13,11 1.36 1.2 F30 7,8 15.6 17.11 F92 5’’,6B 4.64 4.82

F146 3,6 -0.23 0 F208 13,12 6.87 8.14 F31 2_6,2_6 1.98 1.98 F92 6B,6A -12.62 -12.62

F146 6,8 2.12 2.59 F208 14,11 0.66 0.17 F31 2_6,3_5 7.93 7.85 F92 8,6 2.59 2.59

F147 2,7 -0.53 -0.63 F208 14,12 1.11 1.2 F31 2_6,4’ 1.19 1.32 F93 1’’,2’’ 7.83 7.5

F147 3_5,3_5 2.08 1.96 F208 14,13 8.62 7.85 F31 3,2_6 0.02 0 F93 2’,5’ 0.19 0.46

F147 5,2 0.01 0.39 F208 2_6,2_6 1.77 1.98 F31 3,5 0.73 0 F93 2’,6’ 2.2 1.87

F147 5,6 8.13 8.45 F208 2_6,3_5 8 7.85 F31 3_5,2_6 0.56 0.58 F93 3’’,2’’ 9.39 7.49

F147 6,2 2.06 1.77 F208 2_6,4’ 1.03 1.32 F31 3_5,3_5 1.09 1.26 F93 4’’,3’’ 8.98 7.46

F147 6,7 -0.56 -0.66 F208 3,2_6 0.14 0 F31 3_5,4’ 7.3 7.54 F93 4’’,6A -0.02 -0.4

F147 7,8 15.51 17.13 F208 3_5,2_6 0.64 0.58 F31 5,7 3.09 2.59 F93 4’’,6B -0.15 -0.4

F149 1’’’,2’’’ 7.06 7.5 F208 3_5,3_5 1.31 1.26 F31 5,8 0.68 0.04 F93 5’’,3’’ 0.09 -0.4

F149 2’’’,3’’’ 8.98 7.78 F208 3_5,4’ 7.46 7.54 F31 7,8 9.14 8.45 F93 5’’,4’’ 9.85 8.95

F149 3’’’,4’’’ 3.53 3.67 F208 7,11 0.26 -0.44 F32 2_6,2_6 2.08 1.98 F93 5’,6’ 8.71 8.45

F149 4’’’,5A 1.71 2.05 F208 7,12 0.03 -0.15 F32 2_6,3_5 7.93 7.85 F93 5’’,6A 2.32 3.84

F149 4’’’,5B 3.42 1.5 F208 7,13 -0.3 -0.3 F32 2_6,4’ 1.2 1.32 F93 5’’,6B 5.68 4.76

F149 5A,5B -12.52 -12.46 F208 7,14 0.7 0.85 F32 3,2_6 0.11 0 F93 6,8 2.07 2.59

F149 2’’,1’’ 7.86 7.2 F208 7,8 9.05 8.45 F32 3_5,2_6 0.6 0.58 F93 6A,6B -12.1 -12.44

F149 2’’,3’’ 9.26 7.3 F208 8,11 0.02 -0.15 F32 3_5,3_5 1.29 1.26 F94 2’,5’ 0.28 0.46

F149 2’,5’ 0.16 0.46 F208 8,12 0 0.11 F32 3_5,4’ 7.44 7.54 F94 2’,6’ 2.17 1.87

F149 2’,6’ 2.19 1.87 F208 8,14 -0.3 -0.3 F32 5,6M 0.64 -0.59 F94 3,2’ 0.01 0

F149 3’’,4’’ 8.94 7.6 F209 2_6,2 -0.61 -0.79 F32 5,7 2.29 1.7 F94 3,6’ 0.02 0

F149 4’’,5’’ 10.01 9.08 F209 2_6,2_6 1.98 1.98 F32 5,8 0.07 0.07 F94 5’,6’ 8.52 8.45

F149 5’,6’ 8.49 8.45 F209 2_6,3_5 7.75 7.66 F32 7,6M 0.37 -0.59 F95 2’,5’ 0.17 0.46

F149 5’’,6A 1.95 4.2 F209 2_6,4’ 1.23 1.26 F32 7,8 8.58 7.97 F95 2’,6’ 2.23 1.87

F149 5’’,6B 6.28 7.21 F209 3_5,2_6 0.6 0.54 F33 2_6,2_6 1.94 1.98 F95 5’,6’ 8.62 8.45

F149 6A,6B -12.01 -12.27 F209 3_5,3_5 1.41 1.48 F33 2_6,3_5 8 7.85 F95 6,8 2.09 2.59

F149 8,6 2.09 2.59 F209 3_5,4’ 7.48 7.54 F33 2_6,4’ 1.2 1.32 F96 2_6,2_6 2.51 2.32

F15 1’’,2’’ 7.85 7.51 F209 3A,2 3.12 4.88 F33 3_5,2_6 0.61 0.58 F96 2_6,3_5 8.77 8.44

F15 2’,5’ 0.46 0.46 F209 3B,2 12.77 6.7 F33 3_5,3_5 1.39 1.26 F96 3,2_6 0.3 0

F15 2’,6’ 2.19 1.87 F209 3B,3A -17.1 -15.79 F33 3_5,4’ 7.49 7.54 F96 3_5,2_6 0.36 0.46

F15 3’’,2’’ 9.27 7.4 F209 4’,2 -1.18 -1.19 F33 5,6 8.02 7.85 F96 3_5,3_5 2.59 2.65

F15 4’’,3’’ 8.99 7.44 F209 6,8 2.17 1.96 F33 5,7 1.68 1.2 F97 2’,5’ 0.16 0.46

F15 5’’,4’’ 9.8 9.01 F21 2_6,2_6 2.65 2.32 F33 5,8 0.51 0.17 F97 2’,6’ 2.22 1.87

F15 5’,6’ 8.45 8.45 F21 2_6,3_5 8.74 8.44 F33 7,6 7.1 8.14 F97 5’,6’ 8.64 8.45

F15 5’’,6A 2.37 4.88 F21 3_5,2_6 0.35 0.46 F33 7,8 8.49 7.85 F97 6,8 2.08 2.59

F15 5’’,6B 5.38 5.69 F21 3_5,3_5 2.35 2.65 F33 8,6 1.01 1.2 F98 2’’’,1’’’ 1.76 2.42

F15 6,8 2.08 2.59 F21 5,8 0.28 0.04 F34 26,26 2.35 1.92 F98 3’’’,2’’’ 3.32 3.18

F15 6A,6B -11.89 -12.2 F21 6,5 8.87 8.45 F34 26,35 0.37 0.39 F98 4’’’,2’’’ 0.6 -0.39

F150 2’’’,1’’’ 2.28 2.33 F21 6,8 2.22 2.59 F34 26,7 -0.53 -0.63 F98 4’’’,3’’’ 9.51 8.04

F150 3’’’,2’’’ 9.59 3.54 F210 1’’,2’’ 7.61 7.02 F34 3’,5’ 2.38 1.96 F98 5’’’,4’’’ 9.59 8.94

F150 4’’’,3’’’ 4.12 7.41 F210 2’’,3’’ 9.17 7.16 F34 35,26 8.6 8.44 F98 6’’’,5’’’ 6.22 6.25

F150 5’’’,4’’’ 9.45 8.07 F210 2’’,4’’ 1.88 -0.4 F34 35,35 2.69 2.26 F98 1’’,2’’ 7.75 6.28

F150 6’’’,5’’’ 3.28 6.25 F210 26,26 2.35 1.92 F34 8,7 15.54 17.15 F98 2’’,3’’ 9.2 5.57

F150 1’’,2’’ 7 7.44 F210 26,35 8.28 8.26 F35 2_6,2_6 2.06 1.98 F98 2_6,2_6 2.47 2.32

F150 2’’,3’’ 7.98 7.46 F210 3’’,4’’ 8.92 7.56 F35 2_6,3_5 7.93 7.85 F98 2_6,3_5 8.75 8.44

F150 2_6,2 -0.01 -0.78 F210 3’’,5’’ 1.4 -0.4 F35 2_6,4’ 1.2 1.32 F98 3’’,4’’ 9.18 6.91

F150 2_6,2_6 3.06 2.32 F210 35,26 0.42 0.39 F35 3,2_6 0 0 F98 3’’,5’’ 0.22 -0.4

F150 2_6,3_5 8.59 8.26 F210 35,35 2.67 2.47 F35 3,6 -0.01 0 F98 3_5,2_6 0.37 0.46

F150 3’’,4’’ 0.1 7.39 F210 5’,3’ 2.29 1.96 F35 3_5,2_6 0.59 0.58 F98 3_5,3_5 2.46 2.65

F150 3’’,5’’ 2.19 -0.4 F210 5’’,4’’ 9.83 8.95 F35 3_5,3_5 1.32 1.26 F98 4’’,6A 0.04 -0.4

F150 3’’,6B 1.23 0 F210 5’’,6A 2.3 5.35 F35 3_5,4’ 7.47 7.54 F98 5’’,4’’ 9.79 8.75

F150 3_5,2_6 0.12 0.42 F210 5’’,6B 5.66 5.29 F35 7,6 8.32 8.14 F98 5’’,6A 2.32 5.84

F150 3_5,3_5 1.1 2.87 F210 6B,6A -12.1 -12.27 F35 7,8 8.45 7.85 F98 5’’,6B 5.91 4.71

F150 3A,2 10.36 2.18 F210 7A,7B -14.41 -16.12 F35 8,6 0.9 1.2 F98 6,8 2.14 1.96

F150 3A,3B -17.13 -16.67 F210 7A,8A 9.17 9.23 F36 2_6,2_6 2.38 2.32 F98 6B,6A -12.26 -12.69

F150 3B,2 12.92 10.87 F210 7A,8B 5.86 5.1 F36 2_6,3_5 8.73 8.44 F99 2’’,1’’ 7.85 7.52

F150 4’’,6A -0.81 -0.4 F210 7B,8A 6.37 4.99 F36 26,26 1.99 1.78 F99 2’,5’ 0.56 0.46

F150 4’’,6B -9.53 -0.39 F210 7B,8B 9.1 9.26 F36 26,35 0.59 0.39 F99 2’,6’ 2.18 1.87

F150 5’’,4’’ 9.14 8.88 F210 8A,8B -16.99 -16.87 F36 26,7 -0.83 -0.7 F99 3’’,2’’ 9.27 7.39

F150 5’’,6A 1.68 6.99 F212 2’,5’ 0.4 0.46 F36 3_5,2_6 0.35 0.46 F99 4’’,3’’ 8.99 7.25

F150 5’’,6B 0 4.71 F212 2’,6’ 2.41 2.17 F36 3_5,3_5 2.65 2.65 F99 4’’,5’’ 9.8 8.75

F150 6,8 2.21 2.59 F212 2’,6’’ -0.15 -0.15 F36 35,26 7.79 8.09 F99 4’’,6A -0.48 -0.4

F150 6B,6A -11.24 -13.26 F212 2_6,2_6 2.58 2.32 F36 35,35 1.45 1.25 F99 4’’,6B -0.51 -0.4

F151 2_6,2_6 1.98 1.63 F212 2_6,3_5 8.69 8.44 F36 4,26 1.22 1.73 F99 5’,6’ 8.45 8.45

F151 5,6 8.82 8.45 F212 3,2’ 0.01 0 F36 4,35 7.42 8.05 F99 5’’,6A 2.36 6.35

F151 5,8 0.35 0.04 F212 3’’,2_6 0.12 0 F36 4,7 -1.01 -1.05 F99 5’’,6B 5.39 4.9

F151 6,8 2.2 2.59 F212 3,6’ 0 0 F36 8,7 12.5 17.13 F99 6,8 2.09 2.59

F152 2’’,1’’ 7.81 7.33 F212 3_5,2_6 0.32 0.46 F37 2_6,2 -0.54 -0.79 F99 6B,6A -11.89 -12.59

F152 2’’,3’’ 9.62 7.63 F212 3_5,3_5 2.3 2.65 F37 2_6,2_6 2.51 2.32

F152 2_6,2_6 2.02 1.42 F212 5’,6’ 8.65 8.45 F37 2_6,3_5 8.42 8.26

Chapter 7

Metabolite correlations in tomato obtained by fusion of liquid chromatography-mass spectrometry and nuclear magnetic resonance data

Sofia Moco, Jenny Forshed, Ric C.H. De Vos, Arnaud Bovy, Raoul J. Bino, Jacques Vervoort

Nuclear magnetic resonance (NMR) and liquid chromatography (LC)-mass spectrometry (MS) are frequently used as technological platforms for metabolomics studies. The two techniques provide complementary information that supports molecular structure elucidation, giving insight into the identification of metabolites present in complex extracts such as those from plant material. Metabolomics has many applications and can help in the description of metabolic statuses, which are a reflection of the phenotype.

The metabolic profiles of ripe fruits from 50 different tomato cultivars, including beef, cherry and round tomatoes, were recorded by both 1H NMR and LC-MS. The two metabolite profiling methods showed different analytical selectivities. The NMR and LC-MS data sets were separately analysed for metabolic differences by principal component analysis (PCA). Although the two profiling methods (NMR and LC-MS) are complementary and enable the analysis of metabolites from essentially different metabolic pathways, both methods resulted in a clear segregation of, on the one hand, the cherry tomatoes and, on the other hand, the beef and round tomatoes. By performing both intra-method correlation analyses (NMR - NMR and LC-MS - LC-MS) and inter-method correlation analyses (NMR - LC-MS), highly correlating metabolite signals were identified, which may help in the annotation of metabolites. It is concluded that the combined analysis by LC-MS and NMR is a promising strategy for extending the analytical capacities of these techniques for metabolite identification purposes.

INTRODUCTION

Analytical methods such as NMR and MS provide information about the physicochemical composition of biological samples, at the molecular level. In plant metabolomics, these two technologies are commonly used as independent approaches (Le Gall et al., 2003a; Ward et al., 2003; Tikunov et al., 2005; Moco et al., 2006a). The online combination of NMR and MS technologies by LC-solid phase extraction (SPE)-NMR/MS has been used for the efficient detection, separation, isolation and unequivocal elucidation of metabolites from plant origin (Exarchou et al., 2003; Miliauskas et al., 2006; Tatsis, 2007). This analytical approach can be especially powerful for the identification of biomarkers discovered in high-throughput MS and NMR metabolomics studies.

LC-MS and NMR are distinct analytical techniques that differ in detection principle and sensitivity. On the one hand, LC-MS is a fast and sensitive technique. However, the separation of metabolites depends on the column used for chromatographic separation, the detection is dictated by the ionisation aptitude of the analytes and the molecular elucidation has some intrinsic limitations (Chapter 1). On the other hand, NMR is largely insensitive to matrix properties, provided that the analytes are soluble. NMR is a highly selective technique for distinguishing molecular structures, but has a lower sensitivity than MS.

The statistical combination of LC-MS and NMR analyses acquired for equivalent samples opens opportunities for inter-technique correlations for single metabolites, improving the analytical resolution of both techniques (Crockford et al., 2006; Forshed et al., 2007). This statistical strategy has been applied to large-scale analyses of urine. The appropriate interpretation of metabolomics data depends on consistent pre-processing and statistical validation. Multivariate analysis methods are useful in discriminating information and in dealing with the redundancy often present in metabolomics data sets (Trygg et al., 2007).

In this study, we recorded the metabolic profiles of 50 different cultivars of tomato (Solanum lycopersicum) by both 1H NMR and LC-quadrupole (Q) time-of-flight (TOF)-MS. Both MS (Schauer et al., 2005b; Tikunov et al., 2005; Moco et al., 2006a; Fraser et al., 2007) and NMR (Le Gall et al., 2003a; Mattoo et al., 2006) have been previously described for the analysis of metabolites in tomato fruit. The semi-polar metabolite content of tomato was captured by using methanol as extraction solvent. These extracts were successively analysed by 1H NMR and LC-MS, and relationships between the different signals obtained by each method were studied by performing correlation analyses within the data sets separately. Additionally, the fusion of, and correlation between, LC-MS and NMR signals across the 50 samples was investigated. Our strategy provided insight into the complementarity and coincidence of LC-MS and NMR as metabolite profiling technologies and as tools for molecular elucidation, applied to the assignment of metabolites in tomato fruit.

MATERIALS AND METHODS

Plant material: Fruits from 50 different cultivars of tomato (Solanum lycopersicum), at the ripe stage of development, were obtained from a series of 96 different genotypes grown simultaneously in greenhouses in Wageningen (Tikunov et al., 2005). Our selection comprised 17 cherry, 26 round and 7 beef-type cultivars. The fruits were chopped into small pieces and immediately frozen in liquid nitrogen. The frozen material was ground to a fine powder and stored at -80 ºC before further analysis.

Chemicals: The standard compounds tryptophan, D-(+)-glucose and citric acid were purchased from Merck (Darmstadt, Germany), rutin from Aldrich (Steinheim, Germany) and chlorogenic acid and α-tomatine from Sigma (St. Louis, USA). Methanol-d4 (HDO + D2O < 0.03 %) was purchased from Euriso-top (Gif-Sur-Yvette, France) and protonated acetonitrile of HPLC supra gradient quality was obtained from Biosolve (Valkenswaard, The Netherlands). Formic acid for synthesis, 98-100 %, was purchased from Merck-Schuchardt (Hohenbrunn, Germany). Ultra pure water was obtained from an Elga Maxima purification unit (Bucks, UK).

Sample preparation for NMR and LC-MS analysis: About 0.3 g fresh weight of tomato fruit powder was freeze-dried just before proceeding with NMR and LC-MS analyses. To the dried powder, 1.2 mL of methanol-d4 was added as extraction solvent. The extracts were sonicated for 15 min, followed by a 5 min centrifugation step (3,000 x g). After filtration of the supernatants through a 0.2 μm inorganic membrane filter (Anotop 10 Whatman, Maidstone, England), exactly 600 µL of tomato extract was transferred to a dry 5 mm NMR tube and used for NMR analysis. After NMR analysis, the methanol-d4 tomato extracts were diluted to 25 % (v/v) ultra pure water. The diluted extracts were sonicated, centrifuged and filtered before LC-MS analysis. In between analyses, the extracts were kept at 4 ºC. Standard compounds were dissolved separately in methanol-d4 to obtain dilution series of six solutions with increasing concentrations (between about 4 mg/L and 130 mg/L). These samples were analysed by 1H NMR and later prepared for LC-MS, as described above.

NMR analysis: 1H NMR measurements were carried out on a 500 MHz Bruker AMX NMR spectrometer (proton frequency 500.137 MHz) equipped with a 5 mm TXI probe. A zg pulse sequence was used for the acquisition of 1H NMR spectra. All measurements were performed at 298 K and comprised 1536 scans preceded by 4 dummy scans. The receiver gain was set to 512 and the acquisition time to 2.23 s, with a time domain of 32768 points and a spectral width of 14.7018 ppm. A 45º pulse was applied with a delay of 1.5 s. In total, each measurement took 1 h 35 min 56 s of acquisition time. Data acquisition was done under the control of Bruker XWIN-NMR version 2.1. The data sets were Fourier-transformed, phase corrected, calibrated along the chemical shift axis to the resonance of the methanol signal (δ = 3.31 ppm) and baseline corrected.
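The acquisition parameters quoted above are mutually consistent, which can be checked with a short calculation. The sketch below assumes the usual Bruker convention that the acquisition time equals the number of time-domain points divided by twice the spectral width in Hz, and it estimates the total experiment time as one relaxation delay plus one acquisition per scan. Pulse durations and instrument overhead are ignored, so the estimate is expected to fall slightly below the reported 1 h 35 min 56 s. All function names are illustrative.

```python
# Acquisition parameters quoted in the text.
SPECTROMETER_MHZ = 500.137    # proton frequency
SPECTRAL_WIDTH_PPM = 14.7018
TIME_DOMAIN_POINTS = 32768    # assumed to count real + imaginary points
SCANS = 1536
DUMMY_SCANS = 4
RELAXATION_DELAY_S = 1.5


def acquisition_time_s(td_points, sw_ppm, freq_mhz):
    """Acquisition time, assuming AQ = TD / (2 * SW[Hz])."""
    sw_hz = sw_ppm * freq_mhz
    return td_points / (2.0 * sw_hz)


def experiment_time_s(aq_s, delay_s, scans, dummy_scans):
    """Rough estimate: each scan costs one delay plus one acquisition."""
    return (scans + dummy_scans) * (aq_s + delay_s)


aq = acquisition_time_s(TIME_DOMAIN_POINTS, SPECTRAL_WIDTH_PPM, SPECTROMETER_MHZ)
total = experiment_time_s(aq, RELAXATION_DELAY_S, SCANS, DUMMY_SCANS)
print(f"AQ = {aq:.2f} s, estimated total = {total / 60:.1f} min")
```

Under these assumptions the acquisition time rounds to the 2.23 s stated in the text, and the estimated total is within a fraction of a minute of the reported duration.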

LC-MS analysis: The extracts were analysed for their metabolite contents, following the protocol described previously (Moco et al., 2006a). The LC-QTOF-MS analyses were carried out in electrospray (ESI) negative mode. In brief, a Waters Alliance 2795 HT system equipped with a Luna C18(2) pre-column (2.0 × 4 mm) and analytical column (2.0 × 150 mm, 100 Å, particle size 3 μm) from Phenomenex (Torrance, CA, USA) was used for chromatographic separation. The HPLC system was connected online to a Waters 2996 photo diode array (PDA) detector and subsequently to a QTOF Ultima V4.00.00 mass spectrometer (Waters Corporation, MS Technologies, Manchester, UK).

Data pre-processing: Acquisition, visualisation and manual processing of LC-MS data were performed under MassLynx 4.0 (Waters). Mass data were automatically processed by metAlign version 1.0 (www.metalign.nl). Baseline and noise calculations were performed from scan number 75 to 2,550, corresponding to retention times 1.5 min to 50.1 min. The maximum amplitude was set to 20,000 and signals below two times the local noise were discarded. More details about the settings of metAlign can be found elsewhere (De Vos et al., 2007).

Processing and visualisation of 1H NMR data were done under Bruker TopSpin version 2.0 (Germany). The matrix of chemical shift amplitudes across all tomato samples was calculated by a bucket analysis provided by the AMIX software (Bruker, Germany). The signals were integrated by sum of intensities and scaled to total spectral intensity. A bucket width of 0.01 ppm was applied and the following spectral intervals were excluded: 4.71-5.05 ppm (water signal) and 2.29-3.32 ppm (methanol signal).
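The bucketing step can be illustrated with a minimal sketch. It is not the AMIX implementation, whose internals are not described here; it only mirrors the stated settings (sum of intensities, 0.01 ppm buckets, the two excluded intervals as quoted, scaling to total intensity). The function name and the synthetic spectrum are hypothetical.

```python
import numpy as np


def bucket_spectrum(ppm, intensity, width=0.01,
                    exclude=((4.71, 5.05), (2.29, 3.32))):
    """Sum intensities in fixed-width ppm buckets, skip excluded regions,
    and scale the remaining buckets to unit total intensity."""
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # Drop data points that fall inside any excluded interval.
    keep = np.ones(ppm.shape, dtype=bool)
    for low, high in exclude:
        keep &= ~((ppm >= low) & (ppm <= high))
    ppm, intensity = ppm[keep], intensity[keep]

    # Assign each remaining point to a bucket and sum per bucket.
    start = np.floor(ppm.min() / width) * width
    index = np.floor((ppm - start) / width).astype(int)
    sums = np.bincount(index, weights=intensity)
    counts = np.bincount(index)
    centres = start + (np.arange(sums.size) + 0.5) * width

    # Discard empty buckets (the gaps left by the excluded regions).
    filled = counts > 0
    sums = sums[filled]
    return centres[filled], sums / sums.sum()


# Synthetic spectrum (illustrative only): one peak at 3.5 ppm plus noise.
rng = np.random.default_rng(0)
ppm_axis = np.linspace(0.0, 10.0, 20001)
signal = np.exp(-((ppm_axis - 3.5) / 0.02) ** 2) + 0.01 * rng.random(ppm_axis.size)
centres, buckets = bucket_spectrum(ppm_axis, signal)
```

Applying the function to every spectrum and stacking the resulting vectors row by row would give a samples-by-buckets matrix of the kind used in the chemometric analyses.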

The areas of the mass signals of the analysed standard compounds were obtained by manual integration in MassLynx while the mass signal heights were obtained by metAlign. The areas of the resonances of the analysed standard compounds were obtained by manual integration in TopSpin. The calculation of coefficients of correlation related to the LC-MS and NMR data of the standard compounds was obtained by least squares linear regression, using the Statistical Analysis Tools of Microsoft Excel 2003.
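As an illustration of the regression step, the following sketch fits a least-squares line to a six-point dilution series and reports the coefficient of correlation. The concentrations span the range stated in the text, but the response values are invented for the example and are not measured data. NumPy stands in for the spreadsheet tools used in the study.

```python
import numpy as np

# Hypothetical dilution series: six concentrations (mg/L) within the range
# quoted in the text, paired with made-up detector responses.
concentration = np.array([4.0, 8.0, 16.0, 32.0, 65.0, 130.0])
response = np.array([41.0, 79.0, 165.0, 318.0, 655.0, 1296.0])

# Least-squares straight line: response = slope * concentration + intercept.
slope, intercept = np.polyfit(concentration, response, deg=1)

# Coefficient of correlation between concentration and response.
r = np.corrcoef(concentration, response)[0, 1]
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r^2 = {r ** 2:.4f}")
```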

Data analysis: The data matrices of 1H NMR and LC-MS of tomato varieties were subjected to chemometric analyses using MATLAB, version 7.10. In the analyses a 0.95 confidence level was chosen. The NMR data were normalised and scaled to equal total spectral area between samples while the LC-MS data were normalised but not scaled. Principal component analyses (PCA) were performed on the NMR and LC-MS data sets separately. Correlation analyses were performed within NMR signals and within LC-MS signals, separately, as well as between NMR and LC-MS signals. For the correlation analyses, the Pearson correlation coefficient (corrcoef) was used. Equation 7.1 was applied in the calculation of NMR and LC-MS correlations, in which cov is the covariance matrix of NMR and LC-MS variables and σNMR and σLCMS are the standard deviations of NMR and LC-MS, respectively.

$$\mathrm{corrcoef} = \frac{\operatorname{cov}(\mathrm{NMR}, \mathrm{LCMS})}{\sigma_{\mathrm{NMR}}\,\sigma_{\mathrm{LCMS}}} \qquad (7.1)$$
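Equation 7.1 can be evaluated for all NMR and LC-MS variable pairs at once. The sketch below is a NumPy illustration rather than the MATLAB code used in the study; it assumes both matrices have one row per sample in the same sample order. The function name and the demonstration data are hypothetical.

```python
import numpy as np


def cross_correlation(nmr, lcms):
    """Pearson correlation between every NMR and every LC-MS variable.

    nmr  : array of shape (n_samples, n_nmr_variables)
    lcms : array of shape (n_samples, n_lcms_variables)
    Returns an array of shape (n_nmr_variables, n_lcms_variables).
    """
    nmr = np.asarray(nmr, dtype=float)
    lcms = np.asarray(lcms, dtype=float)
    n = nmr.shape[0]

    # Centre each variable, then form the covariance between the two blocks.
    nmr_c = nmr - nmr.mean(axis=0)
    lcms_c = lcms - lcms.mean(axis=0)
    cov = nmr_c.T @ lcms_c / (n - 1)

    # Divide by the product of the standard deviations (equation 7.1).
    sd_nmr = nmr.std(axis=0, ddof=1)
    sd_lcms = lcms.std(axis=0, ddof=1)
    return cov / np.outer(sd_nmr, sd_lcms)


# Synthetic demonstration: one metabolite seen by both methods, plus noise.
rng = np.random.default_rng(1)
abundance = rng.random(50)
nmr_demo = np.column_stack([abundance, rng.random(50)])
lcms_demo = np.column_stack([3.0 * abundance + 0.01 * rng.random(50),
                             rng.random(50)])
corr = cross_correlation(nmr_demo, lcms_demo)
```

In the demonstration, the entry for the shared metabolite is close to 1 while the unrelated pairs stay near zero, which is the pattern the inter-method analysis looks for.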

Using Mahalanobis distance calculations, spurious or “false” correlations were identified among the LC-MS and NMR correlations. Such correlations depended on the presence of a single sample, the remaining samples being randomly distributed in space. The Mahalanobis distances between the deviating sample and the sample group were taken from the correlation plots. This distance was calculated as the unit-variance-scaled Euclidean distance between two vectors x and y (with length p = the dimensionality, in this case two), equation 7.2. The elements of x are denoted by xi, the elements of y by yi, and σi is the standard deviation of xi over all samples. Calculations were done for each sample separately, regarded as y, and the distance to the centre (mean) of the remaining samples, x, was measured. Large (>50) Mahalanobis distances revealed “false” correlations, which were therefore discarded.

$$\text{Mahalanobis distance}(x, y) = \sqrt{\sum_{i=1}^{p} \frac{(x_i - y_i)^2}{\sigma_i^2}} \qquad (7.2)$$
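The leave-one-out check described above can be sketched as follows. This is an illustrative NumPy version, not the original MATLAB code. It assumes that σi is estimated from the remaining samples (the text leaves open whether the tested sample is included) and it uses the threshold of 50 quoted in the text. The function name and the demonstration data are hypothetical.

```python
import numpy as np


def leave_one_out_distances(points):
    """Unit-variance-scaled Euclidean distance of each sample to the mean
    of the remaining samples. `points` has shape (n_samples, p); p = 2 for
    a correlation plot of one NMR signal against one LC-MS signal."""
    points = np.asarray(points, dtype=float)
    distances = np.empty(points.shape[0])
    for k in range(points.shape[0]):
        rest = np.delete(points, k, axis=0)
        centre = rest.mean(axis=0)
        # Assumption: sigma_i is taken over the remaining samples only.
        sigma = rest.std(axis=0, ddof=1)
        distances[k] = np.sqrt(np.sum(((points[k] - centre) / sigma) ** 2))
    return distances


# Synthetic demonstration: 49 scattered samples and one deviating sample.
rng = np.random.default_rng(3)
cluster = rng.normal(0.0, 1.0, size=(49, 2))
demo = np.vstack([cluster, [100.0, 100.0]])
d = leave_one_out_distances(demo)
flagged = np.flatnonzero(d > 50)  # threshold quoted in the text
```

A correlation whose plot contains a flagged sample would be treated as outlier-driven and discarded.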

Sample outliers in the LC-MS and NMR data were discarded based on three criteria: distinct segregation from the other samples in the PCA plots of the normalised and mean-centred data; abnormal characteristics of the raw data (such as evident differences in the baseline or noise); and persistent outlier-dependent correlations in the correlation analysis within and between LC-MS and NMR data (measured by the Mahalanobis distances). The correlations were recalculated omitting these outliers.

RESULTS AND DISCUSSION

1H NMR analyses

1H NMR analyses were performed on methanol-d4 extracts of 50 different cultivars of tomato fruit. The NMR spectra were crowded with resonances, indicating the presence of a multitude of metabolites. In addition, the presence of resonances across the whole spectral width (0-10 ppm) indicated the existence of varied chemical features, from aliphatic to aromatic groups. In particular, intense signals were observed in the sugar region, 3-6 ppm, as a consequence of the presence of glycosylated metabolites and free sugars, Fig. 7.1. Visual comparison of the spectra revealed no obvious differences between the metabolic profiles of the tomato fruits.

The analysis of (non-fractionated) tomato fruit extracts by 1H NMR allowed the detection of essentially primary (polar) metabolites such as sugars, amino acids, organic acids and nucleotides, as a result of the high natural concentration of these metabolites in these fruits. The lower abundance of secondary metabolites and the large number of resonances in the spectrum (as a consequence of the presence of highly abundant metabolites and resonance overlap) made the detection of secondary metabolites, such as phenolic acids, flavonoids and alkaloids, more difficult than the detection of primary metabolites.

The assignment of NMR resonances to metabolites was based on previously reported findings for tomato fruit samples (Le Gall et al., 2003a; Sobolev et al., 2003; Mattoo et al., 2006) and on NMR-based databases: the Spectral Database for Organic Compounds (SDBS; http://www.aist.go.jp/RIODB/SDBS/cgi-bin/cre_index.cgi), the Flavonoid Database (see Chapter 6), the Human Metabolome Database (HMDB; http://www.hmdb.ca) and the Biological Magnetic Resonance Data Bank (BMRB; http://www.bmrb.wisc.edu).

[Figure 7.1 image: full 1H NMR spectrum (ppm axis) with expanded panels A (9.0-6.5 ppm), B (5.5-4.5 ppm), C (4.0-3.2 ppm) and D (2.5-0.5 ppm). Labelled resonances: trigonelline, sucrose, methanol, water, phenylalanine, glucose, tryptophan, tyrosine, arginine, ADP, adenosine, ascorbic acid, proline, alanine, glutamic acid, glutamine, GABA, unsaturated fatty acids, leucine/isoleucine, valine, uracil, galactose, fructose, citric acid and amino acids.]

Figure 7.1. NMR spectrum of a cherry tomato fruit cultivar (top frame), indicating distinct regions: (A) aromatic region, (B and C) sugar region and (D) amino acid region. Metabolites are indicated next to the respective resonances: GABA = γ-aminobutyric acid; ADP = adenosine diphosphate.

Based on the NMR signals, relative differences between the various tomato varieties were visualised by a PCA plot, Fig. 7.2. Along PC2, cherry tomatoes were clearly separated from beef and round tomatoes. Beef and round tomatoes were not separated from each other along PC2 (the second largest source of variation over all variables), which implies similar metabolic profiles, although they may differ in other respects.

[Figure 7.2: three-dimensional PCA score plot with axes PC1, PC2 and PC3; samples are grouped as Cherry, Round and Beef.]

Figure 7.2. PCA plot of normalised mean-centred bucket 1H NMR data of tomato fruit cultivars (explained variance by PC1: 98.5%, PC2: 0.6% and PC3: 0.3%).

LC-MS analyses

The same extracts used for 1H NMR analyses were taken for LC-MS analysis, after aqueous dilution to 25 % (v/v). The obtained metabolic profiles were analogous to the ones described before (Chapters 4 and 5). Phenolic acids, alkaloids and flavonoids were detected in the tomato fruit extracts by LC-MS analyses. The assignment of the metabolites is given in the LC-MS chromatogram for one of the analysed tomato extracts, Fig. 7.3.

[Figure 7.3: base peak intensity (BPI) chromatogram, TOF MS ES-; relative intensity (0-100 %) versus retention time (5-55 min). Peaks are labelled a to ac, each with its retention time (min) and detected m/z.]

Figure 7.3. Negative ion mode electrospray ionisation (ESI-)-LC-MS chromatogram of a cherry tomato fruit cultivar. For the following metabolites, the retention time (min) and the detected mass are indicated: a, (phenylalanine)FA; b, zeatin hexose; c, e, caffeic acid hexose; d, dehydrophaseic acid hexose; f, g, i, caffeoylquinic acid; h, (iso)pentyl dihexose; j, (esculeoside B)FA; k, quercetin-hexose-deoxyhexose-pentose; l, o, (lycoperoside F)FA or (lycoperoside G)FA or (esculeoside A)FA; m, rutin; n, (dehydrolycoperoside F)FA or (dehydrolycoperoside G)FA or (dehydroesculeoside A)FA; p, kaempferol-3-O-rutinoside; q, r, s, dicaffeoylquinic acid; t, v, naringenin chalcone-hexose; u, (lycoperoside A)FA or (lycoperoside B)FA or (lycoperoside C)FA; w, α-tomatine; x, tomatoside A; y, z, aa, tricaffeoylquinic acid; ab, naringenin; ac, naringenin chalcone.

Using LC-MS, mostly semi-polar metabolites are detected. The polar metabolites (i.e. more polar than the stationary phase of the column used for the chromatographic separation), which include sugars, organic acids, most amino acids and nucleotides, elute as large signals before 4 min of retention time. The differences between the tomato extracts analysed by LC-MS were visualised by PCA (Fig. 7.4). In the PCA plot, the cherry tomatoes were separated from the round and the beef tomatoes. This LC-MS-based PCA plot is to a large extent analogous to the one obtained by NMR analyses (Fig. 7.2). The similarity between both plots is somewhat surprising given that the two techniques detect largely different metabolites. It can therefore be concluded that metabolic differences separate the cherry tomatoes from the round and beef types in both primary metabolism (mostly detected by NMR) and secondary metabolism (mostly detected by LC-MS).

[Figure 7.4: three-dimensional PCA score plot with axes PC1, PC2 and PC3; samples are grouped as Cherry, Round and Beef.]

Figure 7.4. PCA of mean-centred unit-variance scaled LC-MS data of tomato fruit cultivars (explained variance by PC1: 90.0%, PC2: 4.7% and PC3: 1.2%).

NMR and LC-MS linearity analyses

Six standard compounds (glucose, citric acid, tryptophan, chlorogenic acid, rutin and tomatine) at different concentrations were analysed by NMR and LC-MS for the assessment of instrumental dose-response relationships. An increase in signal height (for MS mass signals) or area (for NMR resonances) was registered with increasing concentrations of a standard compound by both methods. Mass heights were used, as these are the intensity format given after pre-processing the LC-MS data by the alignment software metAlign. A different linearity range of the instrumental signals along the compound concentration was found for the two techniques. As an example, signal intensities obtained by NMR and LC-MS analyses of rutin are shown in Fig. 7.5. NMR signals showed a wider linearity range than LC-MS signals within the same range of concentrations (Fig. 7.5A, B). It is known that NMR spectrometers

[Figure 7.5: six panels (A-F). Axes: (A) NMR integral (×10^6) versus concentration (µM); (B) mass signal height (×10^3) versus concentration (µM); (C) NMR integral versus 7.66 ppm NMR integral (×10^6); (D) mass signal height versus 609 m/z signal height (×10^3); (E) 7.66 ppm NMR integral (×10^6) versus mass signal height (×10^3); (F) 609 m/z signal area (×10^3) versus 609 m/z signal height (×10^3). Fitted lines are annotated with their correlation coefficients (r).]

Figure 7.5. NMR (integral) and LC-MS (mass signal height) signals of rutin: (A) NMR integrals of rutin resonances (ppm) versus concentration of rutin (6, 13, 25, 51, 101, 202 µM) and fitted linear relationships for two different resonances (high and low intensity) indicating the correlation coefficient (r); (B) Mass signal heights of rutin versus concentration of rutin and fitted linear relationships for different mass signals: 609 (1st isotope [M-H]-), 610 (2nd isotope [M-H]-), 611 (3rd isotope [M-H]-), 677 (1st isotope [M+NaHCOO]-), 678 (2nd isotope [M+NaHCOO]-); (C) Fitted linear relationships between resonance integrals of rutin; (D) Fitted linear relationships between mass signal intensities of rutin; (E) NMR integral versus mass signal height of two pairs of rutin signals (7.66 ppm and 609 m/z, 7.66 ppm and 610 m/z) and fitted linear relationships; (F) Mass signal area versus mass signal height for rutin. Correlation coefficients were obtained by least squares regression calculation of linear fits of the type: y = mx, m = slope, with 95% confidence.

have a wide dynamic range that can reach up to 22 bits (more than a million to one). For this reason, the NMR instrumental response was linear with concentration for all rutin resonances (Fig. 7.5A). As a consequence, the relationship among the different resonances was, as expected, linear (the obtained correlation coefficients, r, were on average 1.00), indicating the power of NMR as a quantitative technique (Fig. 7.5C). Conversely, MS has an inherently lower dynamic range within the concentration range tested. Newer types of MS instruments have improved hardware that enables detection over a larger dynamic range, but for many instrumental designs this limited dynamic range is intrinsic and will remain a bottleneck. Limitations in the dynamic range of the QTOF-MS instrument used in this study have been reported previously (Chernushevich et al., 2001; Moco et al., 2006a) and are also clear from the analysis of a concentration series, as depicted in Fig. 7.5B. With the experimental conditions used for our LC-MS analyses, the detector response was linear for parent ion intensities up to about 20,000 counts per scan, corresponding to a rutin concentration of about 25 µM. Above parent ion intensities of about 30,000 counts per scan, corresponding for rutin to about 75 µM, the response was no longer linear with concentration, as suggested by the shape of the instrumental response curve (Fig. 7.5B). However, better linear correlations with increasing rutin concentrations were obtained for the 2nd (610 m/z) and 3rd (611 m/z) isotope signals. As a consequence, the linear correlations between ions belonging to the same compound (isotopes, adducts such as sodium formate adducts) were not consistently high (Fig. 7.5D). In particular, the molecular ion produced by rutin, 609 m/z, and its 2nd isotope (610 m/z) correlated only up to r = 0.94, exhibiting a nonlinear relationship due to detector saturation at high parent ion signal intensities.
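
The effect of detector saturation on a fit of the type y = mx can be illustrated with a short sketch. The concentrations are those of the rutin dilution series given for Fig. 7.5; both response curves are invented for illustration (they are not the measured signals), and using the Pearson coefficient as r is an assumption about how the reported values were computed.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least-squares slope m for y = m*x, plus the Pearson correlation coefficient r."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = np.sum(x * y) / np.sum(x * x)  # closed-form solution for a line through 0
    r = np.corrcoef(x, y)[0, 1]
    return slope, r

# Rutin concentrations of the dilution series (µM), as listed for Fig. 7.5
conc = np.array([6, 13, 25, 51, 101, 202], dtype=float)

# Hypothetical detector responses (NOT measured values)
linear_signal = 6.0 * conc                              # ideal, NMR-like response
saturating_signal = 40.0 * conc / (1.0 + conc / 40.0)   # levels off at high concentration

m_lin, r_lin = fit_through_origin(conc, linear_signal)
m_sat, r_sat = fit_through_origin(conc, saturating_signal)
print(f"linear:     m = {m_lin:.2f}, r = {r_lin:.3f}")
print(f"saturating: m = {m_sat:.2f}, r = {r_sat:.3f}")
```

The saturating curve gives a clearly lower r than the linear one over the same concentrations, which is the pattern described for the 609 m/z signal of rutin.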

This poorer linearity of the LC-MS response at higher intensity regions was reflected in the correlation between NMR signals and LC-MS signals (Fig. 7.5E). For intense LC-MS signals, correlations with NMR resonances can be classified as moderately high correlations (r = 0.86). However, higher NMR - LC-MS correlations (r = 0.97) were found using the 2nd mass isotope (610 m/z) than using the molecular ion (609 m/z).

The areas and heights obtained from the detected mass signals were compared for reproducibility. Upon saturation of the detector, the linearity between the height and the area of the LC-MS signal is lost, as the shapes of the saturated signals are not Gaussian. Nevertheless, even though the chromatographic signal broadens upon saturation, the signal areas were not linear with concentration over the concentration range analysed (Fig. 7.5F).

From the analyses of the standard compounds, different LC-MS response rates were observed for the different compounds, as was evident from the differences in the slope of the instrumental response with increasing compound concentration. The MS signal intensity response of a particular compound depends not only on its concentration but also on its ionisation ability, which is a consequence of the physicochemical properties of the specific analyte.

NMR-NMR correlations

The correlations between different NMR intensity signals (buckets) in the NMR data matrix of tomato cultivars were analysed, after normalisation. Correlation analyses allow the assessment of the degree of linear association between two variables. The correlation coefficient, r, can range from -1 to +1. Negative correlations (r < 0) indicate negative linear associations between two variables, while positive correlations (r > 0) indicate positive linear associations. The lack of any linear association between two variables is indicated by r = 0.

In our NMR-NMR correlation studies, the subsection of the correlation matrix, for which r ≥ 0.8, was analysed. This limit of correlation is considered to be high, allowing the identification of robust variable associations. From a data set of 1008 NMR-buckets, more than 29,000 positive correlations were found. From this analysis, the similarity in behaviour across the tomato varieties between certain NMR buckets was assessed, in terms of signal intensity. For example, the signal intensity of bucket x, present in all tomato cultivars, is consistently related to the signal intensity of bucket y; therefore, there is a relationship between these two buckets that is common to most cultivars. These associations between buckets are given by their correlation coefficients. In principle, NMR signals belonging to the same compound should have a high correlation with each other. Trigonelline, sucrose and citric acid were analysed in detail by NMR-NMR correlation analyses and are given as examples of this approach. The NMR-NMR correlation plots for some related NMR buckets of these three compounds are shown in Fig. 7.6.
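
The screening of the correlation matrix for r ≥ 0.8 can be sketched as follows. The function name `strong_pairs` and the synthetic 50 × 6 matrix are illustrative assumptions; the sketch only shows the thresholding step, not the normalisation or significance testing applied in this study.

```python
import numpy as np

def strong_pairs(data, threshold=0.8):
    """List (i, j, r) for every pair of columns with Pearson r >= threshold (i < j)."""
    r = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    i, j = np.triu_indices_from(r, k=1)    # each pair once, diagonal excluded
    keep = r[i, j] >= threshold
    return [(int(a), int(b), float(c))
            for a, b, c in zip(i[keep], j[keep], r[i, j][keep])]

# Synthetic example (NOT thesis data): 50 samples x 6 buckets
rng = np.random.default_rng(1)
compound = rng.normal(size=50)             # hypothetical concentration profile
data = rng.normal(size=(50, 6))
data[:, 0] = compound + 0.1 * rng.normal(size=50)        # two resonances of the
data[:, 3] = 2.0 * compound + 0.1 * rng.normal(size=50)  # same compound
pairs = strong_pairs(data)
print(pairs)
```

Only the two buckets that follow the same compound pass the threshold. For a matrix of 1008 buckets there are 1008 × 1007 / 2 = 507,528 distinct pairs to screen in this way.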

The alkaloid trigonelline produced very low signal intensities in the NMR spectra of tomato fruits (see Fig. 7.1). Due to high deshielding of some of the trigonelline protons (e.g. 9.2 ppm and 8.9 ppm), this compound has resonances in a relatively empty region of the NMR spectrum. The resonances belonging to trigonelline exhibited high linear correlations (r ≥ 0.81) between each other across all samples, as shown in Fig. 7.6A. In fact, this correlation analysis enabled the

[Figure 7.6: three scatter plots of normalised NMR bucket intensities. (A) x axis: bucket 9.205 ppm; y axis: buckets 8.875 ppm (r = 0.81), 8.895 ppm (r = 0.88) and 8.915 ppm (r = 0.82). (B) x axis: bucket 5.385 ppm; y axis: buckets 3.375 ppm (r = 0.85), 3.425 ppm (r = 0.98), 3.715 ppm (r = 0.91), 3.815 ppm (r = 0.95) and 4.085 ppm (r = 0.98). (C) x axis: bucket 2.805 ppm; y axis: buckets 2.685, 2.715 and 2.775 ppm (r = 0.82 to 0.96).]

Figure 7.6. Selected normalised signal intensities of NMR buckets of 1H NMR spectra of tomato cultivars belonging to resonances of the same compound: (A) trigonelline (B) sucrose (C) citric acid. The correlation coefficients (r) are displayed for each pair of NMR buckets (x axis, y axis).

assignment of signals that otherwise would be masked by the high complexity of the extracts and the relatively low concentration of trigonelline. Using this approach, a full 1H NMR characterization of trigonelline was achieved (Fig. 7S.1 and Table 7S.1).

Sucrose is an abundant compound in tomato fruits, which was made evident by the high and characteristic signal at 5.385 ppm in the 1H NMR spectra, Fig. 7.1. This disaccharide has a complex NMR spectrum, in particular in the 3-4.5 ppm region which overlaps with the resonances of other (free or conjugated) sugar moieties. Using NMR-NMR correlations across the 50 tomato samples, it was possible to identify other resonances belonging to sucrose, Fig. 7.6B.

Citric acid was also detected in tomato fruits and its resonance signals appear around 2.8 ppm. The four resonances of citric acid were readily identified from their strong correlations (r ≥ 0.82), Fig. 7.6C.

Given the complexity of the NMR spectra of tomato fruit extracts, the overlap of signals is a difficulty in the assignment of endogenous metabolites. The identification of NMR resonances belonging to the same compound was facilitated by correlation analyses, enabling the assignment of signals from the same compound within complex mixtures. As the pre-processing of NMR data was performed by bucketing, there was an associated loss in spectral resolution. However, as there are usually shifts between NMR spectra (even under the best controlled experimental conditions) a direct comparison of signals is performed more safely by applying such a pre-processing approach.
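
The trade-off between resolution and robustness to small shifts can be illustrated with a simple bucketing sketch. The bucket width of 0.01 ppm is an assumption suggested by the spacing of the bucket centres quoted in this chapter (e.g. 8.895, 8.905 and 8.915 ppm); the function name `bucket_spectrum`, the spectral range and the synthetic Lorentzian lines are likewise illustrative and do not reproduce the pre-processing software used here.

```python
import numpy as np

def bucket_spectrum(ppm, intensity, width=0.01, lo=0.0, hi=10.0):
    """Sum intensities into equal-width chemical-shift buckets between lo and hi.

    Returns (bucket_centres, bucket_sums).
    """
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    n_buckets = int(round((hi - lo) / width))
    edges = lo + width * np.arange(n_buckets + 1)
    sums, _ = np.histogram(ppm, bins=edges, weights=intensity)
    centres = edges[:-1] + width / 2.0
    return centres, sums

ppm = np.linspace(0.0, 10.0, 20001)        # 0.0005 ppm digital resolution

def lorentzian(centre, half_width=0.0005):
    """Synthetic resonance line (hypothetical, NOT measured data)."""
    return half_width**2 / ((ppm - centre) ** 2 + half_width**2)

spec_a = lorentzian(9.204)
spec_b = lorentzian(9.206)                 # same resonance, shifted by 0.002 ppm
centres, sums_a = bucket_spectrum(ppm, spec_a)
_, sums_b = bucket_spectrum(ppm, spec_b)
top_a = centres[np.argmax(sums_a)]
top_b = centres[np.argmax(sums_b)]
print(round(top_a, 3), round(top_b, 3))
```

Both shifted lines land in the same bucket, so their intensities can be compared directly across samples, at the cost of no longer resolving features narrower than one bucket.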

LC-MS - LC-MS correlations

Analogously to the NMR data, the LC-MS data set of the tomato cultivars was analysed for correlations between LC-MS signals across the samples. The pre-processed data matrix contained 3,374 mass signals, aligned by retention time and m/z. More than 125,000 significant strong positive correlations (r > 0.80) were obtained. The highest correlations (r ≥ 0.96) were found for mass signals belonging to the same metabolite, such as adducts, dimers and fragments. As an example, three compounds were analysed in more detail: sucrose (341 m/z), phenylalanine (164 m/z) and caffeoylquinic acid (353 m/z, retention time 14.9 min).

Sucrose was detected both by NMR and LC-MS. Sucrose has a short elution time from the LC column: between 2.1 and 2.8 min, with a large overlap with other eluting polar metabolites. Parent ion adducts (dimer and trimer), isotopes and formic acid adducts of sucrose exhibited correlations higher than 0.96 with the

[Figure 7.7: three scatter plots of normalised LC-MS signal heights. (A) x axis: 341 m/z at 2.5 min; y axis: 387 m/z at 2.5 min (r = 1.00), 683 m/z at 2.4 min (r = 0.99), 684 m/z at 2.4 min (r = 0.99) and 1025 m/z at 2.4 min (r = 0.96). (B) x axis: 164 m/z at 4.8 min; y axis: 165 m/z at 4.8 min (r = 0.99) and 147 m/z at 4.7 min (r = 1.00). (C) x axis: 353 m/z at 14.9 min; y axis: 353 m/z at 13.2 min (r = 0.88), 515 m/z at 27.9 min (r = 0.93), 515 m/z at 28.6 min (r = 0.92), 515 m/z at 30.7 min (r = 0.91), 677 m/z at 39.4 min (r = 0.88) and 677 m/z at 40.7 min (r = 0.86).]

Figure 7.7. Selected normalised signal heights of LC-MS signals (m/z) with high correlation coefficients (r) to the compounds: (A) sucrose (341 m/z): 387 m/z = [sucrose + HCOOH – H]-, 683 m/z = [sucrose + sucrose – H]-, 684 m/z = 2nd isotope of [sucrose + sucrose – H]-, 1025 m/z = [sucrose + sucrose + sucrose – H]-; (B) phenylalanine (164 m/z): 165 m/z = 2nd isotope of [phenylalanine – H]-, 147 m/z = [phenylalanine – NH3]-; and (C) caffeoylquinic acid II (353 m/z at 14.9 min): 677 m/z at 40.7 min = [tricaffeoylquinic acid II – H]-, 677 m/z at 39.4 min = [tricaffeoylquinic acid I – H]-, 515 m/z at 30.7 min = [dicaffeoylquinic acid III – H]-, 515 m/z at 28.6 min = [dicaffeoylquinic acid II – H]-, 515 m/z at 27.9 min = [dicaffeoylquinic acid I – H]-, 353 m/z at 13.2 min = [caffeoylquinic acid I – H]-. The correlation coefficients (r) are displayed for each pair of LC-MS signals (x axis, y axis).

molecular ion of sucrose (341 m/z), Fig. 7.7A.

Also the amino acid phenylalanine is detectable by both NMR and LC-MS in tomato fruit extracts. Very high correlations (r > 0.99) were found between isotopes, a fragment of phenylalanine and the molecular ion of phenylalanine (164 m/z), Fig. 7.7B.
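
The adduct and multimer assignments given for sucrose in Fig. 7.7A can be checked with simple mass arithmetic. The sketch below uses standard monoisotopic masses and assumes that a negative ion is formed by removing one proton from the neutral species or cluster; the dictionary layout and names are illustrative only.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647

def monoisotopic_mass(formula):
    """Mass of a neutral molecule given as an element -> count dictionary."""
    return sum(MASS[element] * count for element, count in formula.items())

sucrose = monoisotopic_mass({"C": 12, "H": 22, "O": 11})
formic_acid = monoisotopic_mass({"C": 1, "H": 2, "O": 2})

# Negative-mode ions: one proton removed from the neutral species or cluster
ions = {
    "[M-H]-": sucrose - PROTON,
    "[M+HCOOH-H]-": sucrose + formic_acid - PROTON,
    "[2M-H]-": 2 * sucrose - PROTON,
    "[3M-H]-": 3 * sucrose - PROTON,
}
for name, mz in ions.items():
    print(f"{name:14s} {mz:10.4f}")
```

Truncated to nominal mass, the four values correspond to the signals at 341, 387, 683 and 1025 m/z that correlate with each other in Fig. 7.7A.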

A series of caffeoylquinic acid derivatives has been putatively assigned in tomato (Chapters 3 and 4): three caffeoylquinic acid isomers, three dicaffeoylquinic acid isomers and three tricaffeoylquinic acid isomers. By performing correlation analyses with one of the three caffeoylquinic acid isomers (the second eluting isomer, m/z 353 at retention time 14.9 min, assigned as 5-caffeoylquinic acid (Moco et al., 2006a)), not only signals related to this specific isomer (isotopes, adducts, parent ion adducts, r > 0.98) were found to be highly correlated, but also signals from other caffeoylquinic acid derivatives. Thus, this strategy was able to verify the presence of other caffeoylquinic acid derivatives in tomato fruit: dicaffeoylquinic acid isomers and tricaffeoylquinic acid isomers, Fig. 7.7C and Table 7S.2. This correlation approach clearly showed the potential to identify related compounds within complex mixtures.

In general, strong LC-MS – LC-MS correlations were found between chemically related compounds. Two items particularly contributed to this outcome: firstly, the usage of a chromatographic step before mass detection increased the LC-MS resolution, promoting a better separation of metabolites and increasing sensitivity, and secondly the alignment and signal extraction of mass spectral data improved the robustness of the correlation analyses.

NMR - LC-MS correlations

NMR and LC-MS data sets acquired for the different tomato cultivars were further analysed using inter-method correlation analysis. For this, each mass signal (m/z value at a certain retention time) obtained by LC-MS analyses was directly compared to the chemical shifts present in the NMR buckets. Using such an approach, 149 significant positive and 185 significant negative correlations were found. As expected from the results obtained with standard compounds (see Fig. 7.5E), the correlation coefficients between NMR – LC-MS data sets were not high compared to NMR-NMR or LC-MS - LC-MS correlations. A correlation map of NMR – LC-MS data was drawn for |r| ≥ 0.6, which is considered as the lower limit for establishing correlation analyses (moderate correlations). Fig. 7.8 shows a section of this NMR – LC-MS map.
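
An inter-method correlation map of this kind can be sketched as a matrix of Pearson coefficients between every NMR bucket and every LC-MS signal, measured on the same samples. The function name `cross_correlation`, the matrix sizes and the synthetic data are illustrative assumptions; the significance testing used in this study is not reproduced.

```python
import numpy as np

def cross_correlation(a, b):
    """Pearson r between every column of a and every column of b.

    a: samples x p matrix, b: samples x q matrix; returns a p x q matrix.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    az = (a - a.mean(axis=0)) / a.std(axis=0)   # z-scores, population std
    bz = (b - b.mean(axis=0)) / b.std(axis=0)
    return az.T @ bz / a.shape[0]

# Synthetic example (NOT thesis data): 50 samples, 8 NMR buckets, 5 LC-MS signals
rng = np.random.default_rng(2)
shared = rng.normal(size=50)               # hypothetical metabolite seen by both methods
nmr = rng.normal(size=(50, 8))
lcms = rng.normal(size=(50, 5))
nmr[:, 2] = shared + 0.3 * rng.normal(size=50)
lcms[:, 4] = shared + 0.3 * rng.normal(size=50)
lcms[:, 1] = -shared + 0.3 * rng.normal(size=50)   # a signal that varies inversely

r = cross_correlation(nmr, lcms)
hits = np.argwhere(np.abs(r) >= 0.6)       # (bucket index, signal index) pairs
print(hits)
```

Displaying `r` as a colour-coded image, with buckets on one axis and peak numbers on the other, gives a map of the type shown in Fig. 7.8.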

Figure 7.8. Section of the correlation map of NMR – LC-MS for tomato fruit: [3-4 ppm (NMR); 2700-3000 (LC-MS peak number) = (16.25 min to 25.79 min)]. Horizontally, the NMR buckets (ppm) of the samples are overlaid and vertically the LC-MS peak numbers (peak numbers increase with retention time and m/z) of the samples are overlaid. In the central frame, the correlation coefficients, r, for NMR – LC-MS correlations are displayed as a blue-red heat map.

However, detailed inspection of the observed correlations in the NMR – LC-MS correlation map revealed no consistent assignment of metabolites. Either the intensity of the LC-MS signal was too low to deduce a correct molecular formula, or the metabolite annotated by its accurate mass did not correlate significantly with the expected NMR signals. As an example, phenylalanine (observed accurate mass 164.0709, i.e., -1.5 ppm from the calculated mass) correlated with the bucket at 8.315 ppm but not with any of the buckets in the 7 ppm range. Likewise, the mass signal 274 m/z at 42.1 min, which corresponded to the flavonoid naringenin (4th isotope), positively correlated with an aromatic NMR bucket (7.765 ppm), Fig. 7.9A. However, the obtained correlation seemed to be essentially dictated by one sample only, even though the distance between the outlier and the group was < 50 (so that the Mahalanobis distance analyses did not flag the correlation as irrelevant), Fig. 7.9B.
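
How a single sample can dictate a correlation, and how a Mahalanobis distance can be used to spot it, is sketched below. The data are synthetic, and the distance is computed from the mean and covariance of all samples in the two-variable plane; this is an assumed, simplified variant and does not reproduce the distance criterion (< 50) applied in this study.

```python
import numpy as np

def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each row from the mean of all rows."""
    x = np.asarray(points, dtype=float)
    centred = x - x.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(x, rowvar=False))
    return np.einsum("ij,jk,ik->i", centred, cov_inv, centred)

# Synthetic example (NOT thesis data): two unrelated variables, one extreme sample
rng = np.random.default_rng(3)
x = rng.normal(size=30)                    # hypothetical NMR bucket intensities
y = rng.normal(size=30)                    # hypothetical LC-MS signal heights
x[0], y[0] = 6.0, 6.0                      # the extreme sample

r_all = np.corrcoef(x, y)[0, 1]
d2 = mahalanobis_sq(np.column_stack([x, y]))
worst = int(np.argmax(d2))                 # sample furthest from the group
r_without = np.corrcoef(np.delete(x, worst), np.delete(y, worst))[0, 1]
print(f"r with all samples: {r_all:.2f}; without sample {worst}: {r_without:.2f}")
```

Comparing r with and without the most distant sample shows whether a correlation reflects a trend across the cultivars or rests on one observation, as appears to be the case in Fig. 7.9B.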

The origin of (strong) correlations between variables is not self-explanatory. It might be related to the presence of two (closely) chemically related compounds, as described in Fig. 7.6 and Fig. 7.7, or might be a consequence of the presence of two compounds that have a similar behaviour across the samples analysed. Future analyses can provide explanations for the cause-effect of this similarity in behaviour.

[Figure 7.9: (A) section of the NMR – LC-MS correlation map, NMR (ppm, about 7.72 to 7.84) versus LC-MS (peak number, 3296 to 3300), with a colour scale for the correlation coefficient; (B) scatter plot of peak number 3298 (42.1 min, 274 m/z) versus NMR bucket 7.765 ppm, r = 0.71.]

Figure 7.9. Correlation between the NMR bucket 7.765 ppm and the LC-MS signal (peak number 3298: 274 m/z at 42.1 min). (A) Zoom of the correlation map of NMR – LC-MS for tomato fruit on the correlation pair (7.765 ppm, 3298): red dot, (B) Correlation plot of the pair (NMR bucket 7.765 ppm, LC-MS peak number 3298) over the samples, in terms of normalised signal intensities of the NMR bucket (x axis) and the LC-MS signal (y axis).

From the present correlation analyses, it can be concluded that the chemical overlap between the NMR and LC-MS signals obtained by our tomato metabolic profiling techniques was minimal, as depicted in Fig. 7.10A. In order to expand the NMR and LC-MS overlap in compound correlations, improvements in both the analytical measurements and the data analysis strategies are needed. Firstly, other sample preparation protocols can be investigated to reduce the spectral complexity and signal overlap in 1H NMR analyses. Secondly, the interaction nature and polarity range of the chromatography used in LC-MS can be optimised. For example, the use of hydrophilic columns, which are more suitable for the separation of the polar primary metabolites (amino acids, organic acids) than the reverse-phase column used in the present study, may result in an increased detection of metabolites also detected by NMR. Thirdly, the response linearity of MS, and hence the correlation with the linear NMR signal intensities, can be improved by using MS instruments with a wider dynamic range, such as Fourier Transform (FT)-MS. Using accurate mass data acquired by such MS instruments can also facilitate the identification of metabolite signals observed in NMR – LC-MS correlations. Fourthly, improvements in the quality of NMR data, such as flat baselines, very precise temperature control and increased sensitivity (by making use of cryogenic probes, higher magnetic field instruments or other types of NMR measurements), can be advantageous. Fifthly, the development of nonlinear algorithms that can correct for the non-linearities in the data matrices can improve the quality of the inter-method correlations. The implementation of these items can lead to improvements in the quality of the NMR – LC-MS correlations by enhancing chemical overlap (as depicted in Fig. 7.10B) and may allow the application of this strategy in metabolomics studies.

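[Figure 7.10: two schematic diagrams (A and B) of the sets of metabolites detected by NMR and by LC-MS; the region shared by both is labelled NMR + LC-MS.]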

Figure 7.10. Scheme emphasising the overlap between metabolites detected by both NMR and LC-MS upon complex mixture analyses: (A) limited overlap (as observed in the present study) and (B) increased overlap by improved analytical and statistical methods.

CONCLUSION

The direct comparison of NMR with LC-MS data generated from the analysis of tomato fruit extracts is potentially a powerful strategy for obtaining metabolite information. Due to the different analytical selectivities and sensitivities of NMR and LC-MS, different arrays of metabolites are captured by these two techniques. In addition, the identification of novel biomarkers can profit from direct correlations within LC-MS and NMR signals. The inter-method NMR – LC-MS correlation is a promising strategy for the direct integration of chemical information from NMR and LC-MS analyses, once method optimization is achieved. The application of chemometric methods to the analysis and interpretation of metabolomics data sets is an advantageous tool in handling and visualising data, and also in leading results towards adequate conclusions. However, further developments in the technologies are required to fully explore the interactive opportunities that a combined NMR and MS approach may offer.

SUPPLEMENTARY MATERIALS

[Figure 7S.1: chemical structure of trigonelline with the proton positions labelled a to e.]

Figure 7S.1 Trigonelline (3-carboxy-1-methyl-pyridinium; InChI=1/C7H7NO2/c1-8-4-2-3-6(5-8)7(9)10/h2-5H,1H3; CAS = 535-83-1).

Table 7S.1 Correlation coefficients, r, (with associated p-values) between NMR buckets (ppm) corresponding to trigonelline, with proton assignments, according to Fig. 7S.1.

Bucket (ppm)   Protons   Correlated bucket (ppm)   r   p-value
9.205   a   9.205   1   0
9.205   a   8.915   0.82   1.8×10^-12
9.205   a   8.895   0.88   1.0×10^-15
9.205   a   8.875   0.81   7.5×10^-12
8.915   b   9.205   0.82   1.8×10^-12
8.915   b   8.915   1   0
8.915   b   8.905   0.86   3.0×10^-14
8.915   b   8.895   0.95   0
8.915   b   8.875   0.95   0
8.915   b   8.865   0.94   0
8.915   b   8.065   0.84   1.8×10^-13
8.915   b   8.045   0.83   7.1×10^-13
8.905   b   8.905   1   0
8.895   b   9.205   0.88   1.0×10^-15
8.895   b   8.915   0.95   0
8.895   b   8.905   0.87   4.0×10^-15
8.895   b   8.895   1   0
8.895   b   8.875   0.93   0
8.895   b   8.865   0.91   0
8.895   b   8.065   0.87   5.0×10^-15
8.895   b   8.045   0.85   6.2×10^-14
8.895   b   4.435   0.81   8.3×10^-12
8.875   b   9.205   0.81   7.52×10^-12
8.875   b   8.915   0.95   0
8.875   b   8.905   0.83   8.22×10^-13
8.875   b   8.895   0.93   0
8.875   b   8.875   1   0
8.875   b   8.865   0.93   0
8.875   b   8.075   0.81   8.52×10^-12
8.875   b   8.065   0.84   4.09×10^-13
8.875   b   8.045   0.85   4.90×10^-14
8.865   b   8.915   0.94   0
8.865   b   8.905   0.85   1.2×10^-13
8.865   b   8.895   0.91   0
8.865   b   8.875   0.93   0
8.865   b   8.865   1   0
8.865   b   8.075   0.81   8.8×10^-12
8.865   b   8.065   0.81   9.9×10^-12
8.075   c   8.875   0.81   8.5×10^-12
8.075   c   8.865   0.81   8.8×10^-12
8.075   c   8.075   1   0
8.075   c   8.065   0.89   0
8.065   c   8.915   0.84   1.8×10^-13
8.065   c   8.895   0.87   5.0×10^-15
8.065   c   8.875   0.84   4.1×10^-13
8.065   c   8.865   0.81   9.9×10^-12
8.065   c   8.075   0.89   0
8.065   c   8.065   1   0
8.065   c   8.045   0.82   3.5×10^-12
8.045   c   8.915   0.83   7.1×10^-13
8.045   c   8.905   0.81   6.4×10^-12
8.045   c   8.895   0.85   6.2×10^-14
8.045   c   8.875   0.85   4.9×10^-14
8.045   c   8.065   0.82   3.5×10^-12
8.045   c   8.045   1   0
8.045   c   4.435   0.81   5.9×10^-12
4.435   d   8.895   0.81   8.3×10^-12
4.435   d   8.045   0.81   5.9×10^-12
4.435   d   4.435   1   0

Table 7S.2 High correlations, r > 0.85, (with associated p-values) between the LC-MS signal of 5-caffeoylquinic acid (14.9 min, 353 m/z) and other LC-MS signals (m/z).

Ret (min)   m/z   Negative ion   r   p-value
13.2   353   caffeoylquinic acid I   0.88   2.0×10^-15
13.2   354   2nd isotope of caffeoylquinic acid I   0.86   1.8×10^-14
13.3   375   caffeoylquinic acid I + Na   0.91   0
14.9   179   caffeic acid   0.99   0
14.9   191   quinic acid of caffeoylquinic acid II   0.89   0
14.9   354   2nd isotope of caffeoylquinic acid II   1.00   0
14.9   355   3rd isotope of caffeoylquinic acid II   0.99   0
14.9   375   caffeoylquinic acid II + Na   1.00   0
14.9   376   2nd isotope of caffeoylquinic acid II + Na   0.99   0
14.9   707   dimer of caffeoylquinic acid II   0.98   0
27.9   515   dicaffeoylquinic acid I   0.93   0
27.9   516   3rd isotope of dicaffeoylquinic acid I   0.93   0
28.6   515   dicaffeoylquinic acid II   0.92   0
28.6   516   2nd isotope of dicaffeoylquinic acid II   0.90   0
28.6   537   dicaffeoylquinic acid II + Na   0.91   0
28.6   353   caffeoylquinic acid from dicaffeoylquinic acid II   0.92   0
30.7   515   dicaffeoylquinic acid III   0.91   0
30.7   516   2nd isotope of dicaffeoylquinic acid III   0.93   0
30.7   517   3rd isotope of dicaffeoylquinic acid III   0.93   0
30.7   537   dicaffeoylquinic acid III + Na   0.92   0
39.4   677   tricaffeoylquinic acid I   0.88   1.0×10^-15
39.4   678   2nd isotope of tricaffeoylquinic acid I   0.89   0
39.4   699   tricaffeoylquinic acid I + Na   0.89   0
40.7   677   tricaffeoylquinic acid II   0.86   1.9×10^-14

Summarizing discussion and conclusions

The chemical diversity of metabolites present in plant cells is enormous. Unlike sequence-based macrostructures such as DNA, RNA or proteins, “small” organic molecules possess highly variable chemical features at the molecular level. In tomato fruit, metabolites such as carotenoids co-exist with flavonoids, sugars, organic acids, alkaloids and lipids, among many other chemical classes of compounds. The concentration and chemical conjugation (e.g. glycosylation, methoxylation) of such metabolites depend on the plant species, the tissue and the cellular compartment. Metabolites are involved in a variety of biochemical functions and their contribution to the plant’s vital functions or adaptive plasticity is essential. The chemical diversity of metabolites is largely attributed to secondary metabolites, of which more than 200,000 are estimated to exist in the plant kingdom.

[Figure 8.1: number of publications per year (<1999 to 2007; y axis 0 to 350) for the search terms plant metabolomics, metabolomics and metabonomics.]

Figure 8.1. Number of publications listed by the ISI Web of Science for searches made on “topic” for: plant metabolomics, metabolomics and metabonomics (search performed on 12th July 2007).

Chemical analyses of natural compounds in plants have been performed for decades. Nevertheless, only with the development of robust analytical technologies, adapted to a wide chemical range of applications, has the informative detection of a large number of metabolites been made possible. With the appearance of the first metabolomics / metabonomics studies in 1999/2000, the field has blossomed in the past few years (Fig. 8.1).

The development of metabolomics technologies was set in motion by the rise in interest in global systems analyses, where the need to view cellular biology as dynamic networks triggered the interest in biochemical phenomena. Within this thesis, state-of-the-art analytical methods have been applied and further developed to establish protocols and routines and to optimise methods for metabolite analyses in plants. Liquid chromatography (LC)-mass spectrometry (MS) and nuclear magnetic resonance (NMR) were used as alternative and complementary technologies for the detection and characterization of metabolites in tomato fruit. An LC-photo diode array (PDA)-MS-based methodology was used for the analysis of semi-polar metabolites in plants (Chapter 2). With this method, the detection and (putative) identification of various phenolic acids, flavonoids, alkaloids, glucosinolates and many derivatives thereof was feasible. The instrumental set-up was further implemented using an accurate mass quadrupole (Q)-time of flight (TOF) as mass detector. Using such a hyphenated LC-PDA-QTOF-MS system, plant metabolic profiles can be obtained that serve as chemical prints of phenotypes and/or genotypes. Intensities of thousands of mass signals, representing hundreds of metabolites, are obtained, which have been deconvoluted and aligned for direct comparison of samples and multivariate analyses of the data sets. With this analytical and computational strategy, the interpretation of metabolomics data has been applied to specific research questions and placed in a biochemical context.

The developed LC-PDA-QTOF-MS method has been used to provide relevant information for the identification of metabolites in plant extracts: retention time, accurate mass value, UV/Vis absorbance and MS/MS fragmentation pattern. Such experimental data from tomato fruit, combined with literature information, formed the basis for the construction of a dedicated LC-MS-based database for tomato fruit, the MoTo DB (Chapter 3). This database is a first effort in the systematization of the tomato fruit metabolome and contains validated experimental data of semi-polar metabolites. The construction of dedicated databases that integrate chemical information with spectrometric and spectroscopic data of a plant species has enabled the linkage of metabolic characteristics to biological phenomena such as physiology and development. The biochemical difference between the flesh and peel tissues of ripe tomato fruit was studied at the metabolite level, and different metabolites were assigned to the peel compared to the flesh (Chapter 4). In that chapter, a multitude of metabolites were assigned to the different tissues of tomato fruit, reflecting the different physiology of the tissues. Moreover, certain metabolites appeared to be characteristic of certain ripening stages, suggesting differences in ecological and defense functions.
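The database lookup described above, matching a detected signal on accurate mass and retention time, can be illustrated with a minimal Python sketch. It is not the MoTo DB implementation. The `DbEntry` layout, the function names, the negative-ionisation assumption ([M-H]- ions), the tolerances and the retention times are all hypothetical choices made for illustration; only the monoisotopic masses are standard calculated values.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da; subtracted from the neutral mass to model an [M-H]- ion


@dataclass(frozen=True)
class DbEntry:
    name: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da
    retention_time: float     # minutes; placeholder values, not measured data


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_signal(mz, rt, entries, ppm_tol=5.0, rt_tol=0.5):
    """Return (name, ppm error) for entries whose [M-H]- mass and
    retention time both fall within the given tolerances."""
    hits = []
    for entry in entries:
        theoretical_mz = entry.monoisotopic_mass - PROTON_MASS
        err = ppm_error(mz, theoretical_mz)
        if abs(err) <= ppm_tol and abs(rt - entry.retention_time) <= rt_tol:
            hits.append((entry.name, round(err, 2)))
    # Best (smallest absolute mass error) candidate first.
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Illustrative entries: masses are calculated monoisotopic values,
# retention times are invented placeholders.
EXAMPLE_DB = [
    DbEntry("rutin", 610.15338, 24.0),
    DbEntry("chlorogenic acid", 354.09508, 14.0),
]

if __name__ == "__main__":
    print(match_signal(609.1470, 24.2, EXAMPLE_DB))
```

In practice UV/Vis absorbance and MS/MS fragments would be used as additional filters; the sketch stops at the two numeric criteria that are simplest to express.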

The study of plant biology by means of metabolomics relies on the number of metabolites that can be detected, and especially on our ability to describe them. The identification of metabolites is therefore a major bottleneck in metabolomics. The absolute identification of metabolites from extracts or complex mixtures can require extreme effort. The physicochemical properties of metabolites may hamper their isolation, concentration and manipulation for structural elucidation. In the elucidation of metabolites, NMR plays a central role due to its exquisite chemical selectivity. In Chapters 5 and 6, a database of mostly flavonoids based on 1H and 13C NMR data was initiated, the Flavonoid Database. About 250 standard compounds were measured by NMR, for 1H and 13C correlated spectra. The establishment of this experimental NMR database can greatly improve the identification of flavonoids present in plant extracts, limiting the need to acquire complex 2D-NMR spectra that require a larger amount of isolated material. The integration of a 1H NMR prediction model allows the extraction of NMR properties (coupling constants and chemical shifts) and confirms the assignments of protons to the respective modelled three-dimensional molecular structure. This model, presented here for the prediction of proton spectra, showed a high correlation with the experimental data, indicating the robustness of this method for identification purposes. From the acquisition of 1H NMR spectra of a large number of related flavonoid structures, the influence of substituents (e.g. hydroxyl, methoxy, sugar) on the molecule’s protons, as reflected in its 1H NMR spectrum, was analysed. Due to the partial aromaticity of flavonoids, substituent effects are also reflected in non-neighbouring protons.

In Chapter 7, the analytical capacities and complementarity of NMR and LC-MS based metabolomic profiling technologies were tested. A range of different tomato cultivars were analysed by both NMR and LC-MS for semi-polar metabolites in their ripe fruits. The obtained data matrices were fused for correlation analyses. This combined strategy expands the analytical competences of LC-MS and NMR and enables a wider overview of the tomato fruit metabolome. By NMR mostly organic acids, sugars and amino acids are detectable, while by reversed phase-LC-MS flavonoids, phenolic acids and alkaloids are detected upon analysis of analogous extracts. The assignment of metabolites in the LC-MS and NMR data sets was performed based on the MoTo DB and the Flavonoid DB, and the possibilities for establishing intra-method correlations (mass signal-mass signal and chemical shift-chemical shift) as well as inter-method correlations (mass signal-chemical shift) were investigated.
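As an illustration of an inter-method correlation, the following Python sketch computes Pearson correlations between LC-MS mass-signal intensities and NMR chemical-shift bin intensities measured over the same samples. It is a simplified sketch and not the procedure of Chapter 7: the choice of Pearson correlation, the threshold, the function names and the variable labels are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable carries no correlation information
    return sxy / sqrt(sxx * syy)


def inter_method_correlations(lcms, nmr, threshold=0.8):
    """Correlate every LC-MS variable with every NMR variable.

    lcms, nmr: dicts mapping a variable label to a list of intensities,
    one value per sample, with samples in the same order in both dicts.
    Returns (lcms_label, nmr_label, r) for pairs with |r| >= threshold.
    """
    pairs = []
    for mass_label, mass_values in lcms.items():
        for shift_label, shift_values in nmr.items():
            r = pearson(mass_values, shift_values)
            if abs(r) >= threshold:
                pairs.append((mass_label, shift_label, r))
    return pairs
```

A strong correlation between a mass signal and a chemical-shift bin suggests, but does not prove, that both originate from the same metabolite or from metabolites that vary together across samples.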

In this thesis, strategies for the simultaneous analysis of a large number of naturally occurring metabolites of a wide chemical nature were implemented using NMR and LC-PDA-QTOF MS. These methods were applied to study the metabolome of tomato fruit, but can also be directly implemented in the analysis of other plant species or other biological systems. LC-PDA-MS and NMR methods not only allow the metabolic status of plants to be diagnosed, but also provide relevant information for compound identification. In addition, metabolite databases based on LC-PDA-MS and NMR experimental data were implemented with the prospect of making metabolite identification more efficient and aiding the interpretation of metabolomics data. The methods, protocols and metabolite databases developed in this technology-oriented thesis can be directly applied in the interpretation of physiological and developmental effects on plant biochemistry.

In addition to the work presented, other topics may be scientifically interesting and biologically relevant to study as a follow-up to this thesis. The isolation of endogenous metabolites from tomato fruit for structural elucidation by NMR can be performed by online LC-PDA-solid phase extraction (SPE)-NMR or by offline preparative LC-PDA and NMR analyses. The full identification of metabolites putatively assigned in tomato fruit by LC-PDA-MS (in Chapters 3 and 4) can be achieved by these means, which may help to elucidate their biosynthetic pathways and biological or physiological functions. Extending the MoTo DB with newly detected and identified metabolites in tomato, e.g. by using specific mutants, transgenic plants, organs, tissues and growth conditions, will greatly improve our possibilities to annotate metabolite signals. The application of the developed Flavonoid Database (Chapter 6) in real metabolomics studies may reveal the importance of building such chemical databases containing NMR-based spectroscopic information for metabolite identification and quantification purposes. These applications may involve the analysis of plant extracts by LC-(SPE)-NMR with 1H (only) NMR acquisition, followed by spectral matching with the Flavonoid Database.
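Spectral matching of a 1H NMR peak list against a reference database could, in its simplest form, look like the Python sketch below. It only illustrates the idea and is not the matching procedure of the Flavonoid Database: the scoring rule (fraction of reference shifts found within a tolerance), the tolerance of 0.02 ppm and the function names are assumptions, and any database passed in would hold real reference shifts rather than the generic labels used in this sketch.

```python
def match_score(observed, reference, tol=0.02):
    """Fraction of reference 1H shifts (ppm) that find a distinct
    observed shift within +/- tol ppm."""
    if not reference:
        return 0.0
    remaining = sorted(observed)
    matched = 0
    for ref in sorted(reference):
        # Pick the closest still-unused observed shift within tolerance.
        candidates = [obs for obs in remaining if abs(obs - ref) <= tol]
        if candidates:
            best = min(candidates, key=lambda obs: abs(obs - ref))
            remaining.remove(best)  # each observed peak is used only once
            matched += 1
    return matched / len(reference)


def rank_candidates(observed, database, tol=0.02):
    """Score every database entry and return them best first.

    database: dict mapping a compound label to its list of reference shifts.
    """
    scores = [
        (name, match_score(observed, shifts, tol))
        for name, shifts in database.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

A fuller comparison would also use coupling constants and multiplicities, which the prediction model described in Chapters 5 and 6 provides, and would account for solvent-dependent shift changes.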

Furthermore, other means of describing the metabolome of tomato fruit may be tested. Exploring a range of sample preparation protocols, such as the use of different extraction solvents, can lead to the detection of compounds across a wide range of chemical diversity. However, the incompatibility of organic solvents with the analytical platforms (e.g. LC-MS) is often an obstacle to testing such procedures. The use of 13C-labelled biological material, in this case tomato fruit, is an elegant way of obtaining more insight into the identification of metabolites, but requires a robust and controlled preparation. The elucidation of metabolite functions and a better understanding of metabolites as active elements in plant physiology or plant developmental processes can benefit greatly from the integration of molecular biology and genetics information with metabolomics data.

The potential applications of metabolomics have already been demonstrated in a wide variety of disciplines: plant biology, plant breeding, medicine, toxicology, pharmacology, microbiology, phytopathology, etc. Further improvements of technologies and strategies for metabolomics analyses are likely, in particular in the area of statistics and bioinformatics for data handling, data integration and database functionalities. Metabolomics will profit greatly from automation of the large-scale data analysis inherent to metabolomics studies. Efforts in the identification of metabolites and the characterization of the metabolome of complete biological systems are now becoming evident, as seen in mega-projects such as the Human Metabolome Database. High-throughput systems, database developments and routine applications using LC-(SPE)-NMR-MS are promising means of making the identification of metabolites in biological extracts more efficient. The emergence of MS-imaging techniques, such as desorption electrospray ionisation (DESI)-MS and matrix-assisted laser desorption ionisation (MALDI)-MS imaging, is a promising development in extending metabolomics applications. In fact, improvements in sensitivity, robustness and resolution of analytical platforms may, in the future, even extend metabolomics technologies to single-cell analyses. The contribution of metabolomics to systems biology can become even more relevant in combination with transcriptomics and proteomics analyses for the progressive understanding of metabolic pathways, signalling cascades and the diagnostic status of biological systems. The time-dependency in the dynamics of metabolic pathways and systems is studied in fluxomics analyses, where metabolomics also plays a key role.

In conclusion, although metabolomics is still a developing technique in the “omics” era and just at the beginning of its integration into systems biology, it is already a powerful tool in many biochemical research areas, including those involving tomato fruit development and quality aspects.


Summary (Samenvatting)

Plants, animals and micro-organisms contain an enormous diversity of metabolites. The fruit of a plant contains carotenoids, flavonoids, sugars, organic acids, alkaloids, lipids and many other classes of compounds. The type of compound and its inter- and intracellular concentration depend on the plant species and the tissue, but also on compartmentalization within the cell. Metabolites are involved in many biochemical reactions, and the contribution of these compounds to vital functions is essential. The chemical diversity is enormous: in the plant kingdom alone there are more than 200,000 (and probably well over 1,000,000) different compounds.

The use of natural compounds is centuries old. Many plant species, for example, contain salicylic acid. At the end of the 19th century a compound was derived from it, acetylsalicylic acid, also known as aspirin. Plant extracts containing salicylic acid have been used as a fever-reducing remedy since Hippocrates. Interest in the use and identification of the various important components in plant extracts is likewise centuries old. Only recently, with the development of robust analytical technologies, has it become possible to map the enormous diversity of chemical compounds in a more targeted way. In the last decade a new field has emerged from this: metabolomics. This field has expanded rapidly, as shown by the increase in the number of published scientific articles (Figure 8.1).

The development of metabolomics technologies has been strongly stimulated in recent years by the increased interest in systems biology. In systems biology, cellular biology is represented as a dynamic network. To better understand these dynamic processes, closer insight into the underlying biochemical parameters is necessary. In this thesis, new analytical methods were developed and tested, and protocols were optimized, for the identification of metabolites in plants. Liquid chromatography (LC)-mass spectrometry (MS) and nuclear magnetic resonance (NMR) were used as distinct and complementary techniques for the identification of metabolites in tomato fruit. LC-photodiode array (PDA)-MS-based methods were used for the analysis of semi-polar metabolites in plants (Chapter 2). With such techniques it is possible to detect and identify various phenolic acids, flavonoids, alkaloids, glucosinolates and many compounds derived from them. The instrumental set-up made use of an accurate-mass quadrupole (Q)-time of flight (TOF) mass detector. Using a coupled LC-PDA-QTOF-MS system, metabolic profiles were produced that are best described as a chemical fingerprint of the organism. Thousands of signals were observed in the mass spectra, representing the many different metabolites. The mass signals were deconvoluted (separated from one another) and aligned for direct comparison of the samples and multivariate analysis of the data sets. This analytical and mathematical strategy laid the basis for the interpretation of metabolomics data.

The LC-PDA-QTOF-MS method was used to obtain important information about the identity of metabolites in plant extracts: retention time, accurate mass, UV/Vis absorbance and MS/MS fragmentation patterns. This information was then combined with experimental data from the tomato samples and resulted in a dedicated LC-MS-based database for tomato: the MoTo Database (Chapter 3). This database is the first step in systematically mapping the tomato metabolome and contains validated experimental data of semi-polar metabolites from tomato. The construction of dedicated databases, which combine chemical information with spectrometric and spectroscopic data of a plant, makes it possible to link metabolic characteristics to biological properties such as physiology and development. For example, clear differences in metabolite composition were found between the peel and the flesh of the tomato (Chapter 4). An important observation was that certain metabolites proved to be characteristic of the ripeness of the tomato. The differences in metabolite composition are probably related to ecological properties (such as resistance).

Metabolic studies such as those described in this thesis depend on the ability to detect molecules and, more importantly, to identify them. The identification of metabolites is a major bottleneck in metabolomics. Identifying metabolites present in extracts of complex mixtures can be an enormous task. For the identification of metabolites, NMR is indispensable. Chapters 5 and 6 describe a 1H and 13C NMR database of mainly flavonoids: the Flavonoid DB. About 250 compounds were measured by NMR. All relevant one-dimensional and two-dimensional 1H and 13C spectra were measured, processed and analysed at a level not previously reported in the scientific literature. This database will greatly accelerate the identification of flavonoids, because time-consuming 1H-13C correlated spectra are no longer necessary. On the basis of the data in the Flavonoid DB, it is now possible to identify flavonoids from 1H NMR spectra alone. This is a major step forward, because flavonoids can now also be identified in, for example, human or animal metabolic research.

Chapter 7 describes the coupling of LC-MS and NMR data. Different tomato cultivars were analysed by both LC-MS and NMR. The resulting data matrices were fused for correlation analyses. This combined strategy broadens the analytical possibilities of LC-MS and NMR and gives a wider overview of the tomato metabolome. In the NMR studies mainly organic acids, sugars and amino acids were observed, whereas flavonoids, phenolic acids and alkaloids were detected by LC-MS. The various metabolites in the LC-MS and NMR spectra were assigned using the MoTo DB and the Flavonoid DB. Intra-method correlations (mass-mass, NMR-NMR) as well as inter-method correlations (mass-NMR) were investigated.

In this thesis, methods based on modern analytical techniques, NMR and LC-MS, were applied for the simultaneous analysis of a large number of metabolites. These methods were used mainly to study tomato fruit, but are also applicable to other plants or other organisms. LC-MS and NMR methods can not only give a diagnostic picture of the metabolic status of a plant, but also provide relevant information for the identification of specific compounds. Metabolite databases were also developed, which offer the prospect of a large increase in the efficiency of identifying unknown compounds. As a result of this thesis, new methods, protocols and databases have been developed that can now be used directly for the interpretation of physiological properties.

Following on from the work presented in this thesis, several topics may be of interest as a follow-up. Endogenous metabolites can be isolated from tomato for structural analysis by means of LC-SPE-NMR. The compounds described in Chapters 3 and 4 on the basis of LC-MS data can then be identified unambiguously. These data can in turn be extremely useful for gaining better insight into their biological and physiological functions. Extending the MoTo DB with newly detected and identified metabolites, by making use of specific mutants, transgenic plants, organs, tissues and growth conditions, will greatly increase the possibilities for annotating (correctly labelling) metabolite signals. The application of a flavonoid database (Chapter 6) in metabolic studies will greatly accelerate identification and quantification.

In summary, although metabolomics is still a technique under development in the “omics” era and we are only at the beginning of systems biology, metabolomics is already a very powerful tool for much biochemical research, such as that described in this thesis.


References


Abraham RJ, Byrne JJ, Griffiths L, Koniotou R (2005) H-1 chemical shifts in NMR: Part 22(+) - Prediction of the H-1 chemical shifts of alcohols, diols and inositols in solution, a conformational and solvation investigation. Magnetic Resonance in Chemistry 43: 611-624

Abraham RJ, Canton M, Reid M, Griffiths L (2000) Proton chemical shifts in NMR. Part 14. Proton chemical shifts, ring currents and pi electron effects in condensed aromatic hydrocarbons and substituted benzenes. Journal of the Chemical Society-Perkin Transactions 2 4: 803-812

Aharoni A, De Vos RCH, Verhoeven HA, Maliepaard CA, Kruppa G, Bino R, Goodenowe DB (2002) Nontargeted metabolome analysis by use of Fourier Transform Ion Cyclotron Mass Spectrometry. Omics 6: 217-234

Alba R, Cordonnier-Pratt MM, Pratt LH (2000) Fruit-localized phytochromes regulate lycopene accumulation independently of ethylene production in tomato. Plant Physiology 123: 363-370

Aubert S, Hennion F, Bouchereau A, Gout E, Bligny R, Dorne A-J (1999) Subcellular compartmentation of proline in the leaves of the subantarctic Kerguelen cabbage Pringlea antiscorbutica R. Br. in vivo 13C-NMR study. Plant, Cell and Environment 22: 255-259

Awad HM, Boersma MG, Boeren S, van Bladeren PJ, Vervoort J, Rietjens I (2001) Structure-activity study on the quinone/quinone methide chemistry of flavonoids. Chemical Research in Toxicology 14: 398-408

Bagno A, Rastrelli F, Saielli G (2006) Toward the complete prediction of the H-1 and C-13 NMR spectra of complex organic molecules by DFT methods: Application to natural substances. Chemistry-a European Journal 12: 5514-5525

Balogh MP (2004) Debating resolution and mass accuracy. LC GC North America 22: 118-125

Basu A, Imrhan V (2007) Tomatoes versus lycopene in oxidative stress and carcinogenesis: conclusions from clinical trials. Eur J Clin Nutr 61: 295-303

Beekwilder J, Jonker H, Meesters P, Hall RD, van der Meer IM, De Vos RCH (2005) Antioxidants in raspberry: On-line analysis links antioxidant activity to a diversity of individual metabolites. Journal of Agricultural and Food Chemistry 53: 3313-3320

Besseau S, Hoffmann L, Geoffroy P, Lapierre C, Pollet B, Legrand M (2007) Flavonoid accumulation in Arabidopsis repressed in lignin synthesis affects auxin transport and plant growth. Plant Cell 19: 148-162

Bianco G, Schmitt-Kopplin P, De Benedetto G, Kettrup A, Cataldi TR (2002) Determination of glycoalkaloids and relative aglycones by nonaqueous capillary electrophoresis coupled with electrospray ionization-ion trap mass spectrometry. Electrophoresis 23: 2904-2912

Biekofsky RR, Buschi CA, Pomilio AB (1991) Conformational-Analysis of 5,6,7-Trisubstituted Flavones - C-13 Nmr and Molecular Mechanics Study. Magnetic Resonance in Chemistry 29: 569-575

Bino RJ, De Vos RCH, Lieberman M, Hall RD, Bovy A, Jonker HH, Tikunov Y, Lommen A, Moco S, Levin I (2005) The light-hyperresponsive high pigment-2dg mutation of tomato: alterations in the fruit metabolome. New Phytol 166: 427-438

Bino RJ, Hall RD, Fiehn O, Kopka J, Saito K, Draper J, Nikolau BJ, Mendes P, Roessner-Tunali U, Beale MH, Trethewey RN, Lange BM, Wurtele ES, Sumner LW (2004) Potential of metabolomics as a functional genomics tool. Trends Plant Sci 9: 418-425

Bovy A, De Vos RCH, Kemper M, Schijlen E, Almenar Pertejo M, Muir S, Collins G, Robinson S, Verhoeyen M, Hughes S, Santos-Buelga C, van Tunen A (2002) High-flavonol tomatoes resulting from the heterologous expression of the maize transcription factor genes LC and C1. Plant Cell 14: 2509-2526

Breitling R, Pitt AR, Barrett MP (2006) Precision mapping of the metabolome. Trends in Biotechnology 24: 543-548

Bristow AWT (2006) Accurate mass measurement for the determination of elemental formula - A tutorial. Mass Spectrometry Reviews 25: 99-111

Brown SC, Kruppa G, Dasseux JL (2005) Metabolomics applications of FT-ICR mass spectrometry. Mass Spectrometry Reviews 24: 223-231

Burns DC, Ellis DA, Li HX, Lewars EG, March RE (2007) A combined nuclear magnetic resonance and computational study of monohydroxyflavones applied to product ion mass spectra. Rapid Communications in Mass Spectrometry 21: 437-454

Buta JG, Spaulding DW (1997) Endogenous levels of phenolics in tomato fruit during growth and maturation. Journal of Plant Growth Regulation 16: 43-46

Carrari F, Baxter C, Usadel B, Urbanczyk-Wochniak E, Zanor MI, Nunes-Nesi A, Nikiforova V, Centero D, Ratzka A, Pauly M, Sweetlove LJ, Fernie AR (2006) Integrated analysis of metabolite and transcript levels reveals the metabolic shifts that underlie tomato fruit development and highlight regulatory aspects of metabolic network behavior. Plant Physiol 142: 1380-1396

Chen Z, Gallie DR (2006) Dehydroascorbate reductase affects leaf growth, development, and function. Plant Physiol 142: 775-787

Chernushevich IV, Loboda AV, Thomson BA (2001) An introduction to quadrupole-time-of-flight mass spectrometry. Journal of Mass Spectrometry 36: 849-865

Choi C, Munch R, Leupold S, Klein J, Siegel I, Thielen B, Benkert B, Kucklick M, Schobert M, Barthelmes J, Ebeling C, Haddad I, Scheer M, Grote A, Hiller K, Bunk B, Schreiber K, Retter I, Schomburg D, Jahn D (2007) SYSTOMONAS - an integrated database for systems biology analysis of Pseudomonas. Nucl. Acids Res. 35: D533-537

Claridge T (1999) High-resolution NMR techniques in organic chemistry, Vol 19. Pergamon

Clifford MN, Johnston KL, Knight S, Kuhnert N (2003) Hierarchical scheme for LC-MSn identification of chlorogenic acids. Journal of Agricultural and Food Chemistry 51: 2900-2911

Colombo M, Sirtori FR, Rizzo V (2004) A fully automated method for accurate mass determination using high-performance liquid chromatography with a quadrupole/orthogonal acceleration time-of-flight mass spectrometer. Rapid Communications in Mass Spectrometry 18: 511-517

Crockford DJ, Holmes E, Lindon JC, Plumb RS, Zirah S, Bruce SJ, Rainville P, Stumpf CL, Nicholson JK (2006) Statistical heterospectroscopy, an approach to the integrated analysis of NMR and UPLC-MS data sets: Application in metabonomic toxicology studies. Analytical Chemistry 78: 363-371

Crozier A, Lean MEJ, McDonald MS, Black C (1997) Quantitative analysis of the flavonoid content of commercial tomatoes, onions, lettuce, and celery. Journal of Agricultural and Food Chemistry 45: 590-595

de Laeter JJR (2003) Atomic weights of the elements: Review 2000 (IUPAC Technical Report). Pure and applied chemistry 75: 683

de Rijke E, Out P, Niessen WMA, Ariese F, Gooijer C, Brinkman UAT (2006) Analytical separation and detection methods for flavonoids. Journal of Chromatography A Plant Analysis 1112: 31-63

De Vos RCH, Moco S, Lommen A, Keurentjes JJB, Bino RJ, Hall RD (2007) Untargeted large-scale plant metabolomics using liquid chromatography coupled to mass spectrometry. Nat Protoc 2: 778-791

Dixon RA, Gang DR, Charlton AJ, Fiehn O, Kuiper HA, Reynolds TL, Tjeerdema RS, Jeffery EH, German JB, Ridley WP, Seiber JN (2006) Perspective - Applications of metabolomics in agriculture. Journal of Agricultural and Food Chemistry 54: 8984-8994

Dixon RA, Strack D (2003) Phytochemistry meets genome analysis, and beyond. Phytochemistry 62: 815-816

Ellis DI, Goodacre R (2006) Metabolic fingerprinting in disease diagnosis: biomedical applications of infrared and Raman spectroscopy. Analyst 131: 875-885

Exarchou V, Godejohann M, van Beek TA, Gerothanassis IP, Vervoort J (2003) LC-UV-solid-phase extraction-NMR-MS combined with a cryogenic flow probe and its application to the identification of compounds present in Greek oregano. Anal Chem 75: 6288-6294

Exarchou V, Krucker M, van Beek TA, Vervoort J, Gerothanassis IP, Albert K (2005) LC-NMR coupling technology: recent advancements and applications in natural products analysis. Magnetic Resonance in Chemistry 43: 681-687

Fabre N, Rustan I, de Hoffmann E, Quetin-Leclercq J (2001) Determination of flavone, flavonol, and flavanone aglycones by negative ion liquid chromatography electrospray ion trap mass spectrometry. Journal of the American Society for Mass Spectrometry 12: 707-715

FAOSTAT Statistics Division (2005). In FAOSTAT / Food and Agriculture Organization of the United Nations, Vol 2007

Fernie AR (2003) Metabolome characterisation in plant system analysis. Functional Plant Biology 30: 111-120

Fiehn O, Kopka J, Dormann P, Altmann T, Trethewey RN, Willmitzer L (2000) Metabolite profiling for plant functional genomics. Nat Biotechnol 18: 1157-1161

Fleuriet A, Macheix J-J (1981) Quinyl esters and glucose derivatives of hydroxycinnamic acids during growth and ripening of tomato fruit. Phytochemistry 20: 667-671

Fleuriet A, Macheix JJ (1977) Effect des blessures sur les composés phénoliques des fruits de tomates «cerise» (Lycopersicum esculentum var. cerasiforme). Physiologie Vegetale 15: 239-250

Forshed J, Idborg H, Jacobsson SP (2007) Evaluation of different techniques for data fusion of LC/MS and 1H-NMR. Chemometrics and Intelligent Laboratory Systems 85: 102-109

Fraser PD, Enfissi EMA, Goodfellow M, Eguchi T, Bramley PM (2007) Metabolite profiling of plant carotenoids using the matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Plant Journal 49: 552-564

Fraser PD, Truesdale MR, Bird CR, Schuch W, Bramley PM (1994) Carotenoid Biosynthesis during Tomato Fruit Development (Evidence for Tissue-Specific Gene Expression). Plant Physiol 105: 405-413

Friedman M (2002) Tomato glycoalkaloids: Role in the plant and in the diet. Journal of Agricultural and Food Chemistry 50: 5751-5780

Friedman M, Kozukue N, Harden LA (1997) Structure of the tomato glycoalkaloid tomatidenol-3-beta-lycotetraose (dehydrotomatine). Journal of Agricultural and Food Chemistry 45: 1541-1547

Friedman M, Kozukue N, Harden LA (1998) Preparation and characterization of acid hydrolysis products of the tomato glycoalkaloid alpha-tomatine. Journal of Agricultural and Food Chemistry 46: 2096-2101


Friedman M, Levin CE, Mcdonald GM (1994) α-tomatine determination in tomatoes by HPLC using pulsed amperometric detection. Journal of Agricultural and Food Chemistry 42: 1959-1964

Fu J, Swertz MA, Keurentjes JJB, Jansen RC (2007) MetaNetwork: a computational protocol for the genetic study of metabolic networks. Nature Protocols 2: 685-694

Fujiwara Y, Takaki A, Uehara Y, Ikeda T, Okawa M, Yamauchi K, Ono M, Yoshimitsu H, Nohara T (2004) Tomato steroidal alkaloid glycosides, esculeosides A and B, from ripe fruits. Tetrahedron 60: 4915-4920

Fujiwara Y, Yahara S, Ikeda T, Ono M, Nohara T (2003) Cytotoxic major saponin from tomato fruits. Chemical & Pharmaceutical Bulletin 51: 234-235

Galperin MY, Ellison MJ (2006) Systems biology: sprint or marathon? Curr Opin Biotechnol 17: 437-439

Gillaspy G, Bendavid H, Gruissem W (1993) Fruits - a Developmental Perspective. Plant Cell 5: 1439-1451

Giovannoni J (2001) Molecular Biology of Fruit Maturation and Ripening. Annu Rev Plant Physiol Plant Mol Biol 52: 725-749

Goodacre R, York EV, Heald JK, Scott IM (2003) Chemometric discrimination of unfractionated plant extracts analyzed by electrospray mass spectrometry. Phytochemistry 62: 859-863

Griffin JL, Kauppinen RA (2007) A metabolomics perspective of human brain tumours. Febs Journal 274: 1132-1139

Haasnoot CAG, Deleeuw FAAM, Altona C (1980) The Relationship between Proton-Proton Nmr Coupling-Constants and Substituent Electronegativities .1. An Empirical Generalization of the Karplus Equation. Tetrahedron 36: 2783-2792

Hall B, Chebibb M, JR H, Johnston G (2005) 6-Methylflavanone, a more efficacious positive allosteric modulator of gamma-aminobutyric acid (GABA) action at human recombinant alpha(2)beta(2)gamma(2L) than at alpha(1)beta(2)gamma(2L) and alpha(1)beta(2)GABA(A) receptors expressed in Xenopus oocytes. European Journal of Pharmacology 512: 97-104

Hall RD (2006) Plant metabolomics: from holistic hope, to hype, to hot topic. New Phytologist 169: 453-468

Harborne JB (1980) Secondary plant products. In EA Bell, BV Charlwood, eds, Encyclopedia of Plant Physiology, Vol 8. Springer-Verlag, Berlin, p 674

Harborne JB, Mabry TJ, Mabry H (1975) Flavonoids. Chapman and Hall Ltd., London

Helgaker T, Jaszunski M, Ruud K (1999) Ab initio methods for the calculation of NMR shielding and indirect spin-spin coupling constants. Chemical Reviews 99: 293-352

Henley DV, Lipson N, Korach KS, Bloch CA (2007) Prepubertal gynecomastia linked to lavender and tea tree oils. New England Journal of Medicine 356: 479-485

Herrero J, Valencia A, Dopazo J (2001) A hierarchical unsupervised growing neural network for clustering gene expression patterns. Bioinformatics 17: 126-136

Hertog MGL, Hollman PCH, Katan MB (1992) Content of potentially anticarcinogenic flavonoids of 28 vegetables and 9 fruits commonly consumed in the Netherlands. Journal of Agricultural and Food Chemistry 40: 2379-2383

Hirai MY, Klein M, Fujikawa Y, Yano M, Goodenowe DB, Yamazaki Y, Kanaya S, Nakamura Y, Kitayama M, Suzuki H, Sakurai N, Shibata D, Tokuhisa J, Reichelt M, Gershenzon J, Papenbrock J, Saito K (2005) Elucidation of gene-to-gene and metabolite-to-gene networks in Arabidopsis by integration of metabolomics and transcriptomics. Journal of Biological Chemistry 280: 25590-25595

Horwitz W (2000) Official methods of analysis of AOAC International. AOAC International, Maryland

Huhman DV, Sumner LW (2002) Metabolic profiling of saponins in Medicago sativa and Medicago truncatula using HPLC coupled to an electrospray ion-trap mass spectrometer. Phytochemistry 59: 347-360

Hunt GM, Baker EA (1980) Phenolic constituents of tomato fruit cuticles. Phytochemistry 19: 1415-1419

Idborg H, Zamani L, Edlund PO, Schuppe-Koistinen I, Jacobsson SP (2005) Metabolic fingerprinting of rat urine by LC/MS Part 2. Data pretreatment methods for handling of complex data. Journal of Chromatography B-Analytical Technologies in the Biomedical and Life Sciences 828: 14-20

Iwashina T (2000) The structure and distribution of the flavonoids in plants. Journal of Plant Research 113: 287-299

Jander G, Norris SR, Joshi V, Fraga M, Rugg A, Yu SX, Li LL, Last RL (2004) Application of a high-throughput HPLC-MS/MS assay to Arabidopsis mutant screening; evidence that threonine aldolase plays a role in seed nutritional quality. Plant Journal 39: 465-475

Jatoi A, Burch P, Hillman D, Vanyo JM, Dakhil S, Nikcevich D, Rowland K, Morton R, Flynn PJ, Young C, Tan W (2007) A tomato-based, lycopene-containing intervention for androgen-independent prostate cancer: results of a Phase II study from the North Central Cancer Treatment Group. Urology 69: 289-294

Jenkins H, Hardy N, Beckmann M, Draper J, Smith AR, Taylor J, Fiehn O, Goodacre R, Bino RJ, Hall R, Kopka J, Lane GA, Lange BM, Liu JR, Mendes P, Nikolau BJ, Oliver SG, Paton NW, Rhee S, Roessner-Tunali U, Saito K, Smedsgaard J, Sumner LW, Wang T, Walsh S, Wurtele ES, Kell DB (2004) A proposed framework for the description of plant metabolomics experiments and their results. Nature Biotechnology 22: 1601-1606

Jones CM, Mes P, Myers JR (2003) Characterization and inheritance of the Anthocyanin fruit (Aft) tomato. Journal of Heredity 94: 449-456

Joyce AR, Palsson BO (2006) The model organism as a system: integrating ‘omics’ data sets. Nature Reviews Molecular Cell Biology 7: 198-210

Justesen U, Knuthsen P, Leth T (1998) Quantitative analysis of flavonols, flavones, and flavanones in fruits, vegetables and beverages by high-performance liquid chromatography with photo-diode array and mass spectrometric detection. Journal of Chromatography A 799: 101-110

Juvik JA, Stevens MA, Rick CM (1982) Survey of the genus Lycopersicon for variability in alpha-tomatine content. HortScience 17: 764-766

Katajamaa M, Miettinen J, Oresic M (2006) MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 22: 634-636

Kazi A, Daniel KG, Smith DM, Kumar NB, Dou QP (2003) Inhibition of the proteasome activity, a novel mechanism associated with the tumor cell apoptosis-inducing ability of genistein. Biochemical Pharmacology 66: 965-976

Keun HC, Ebbels TMD, Antti H, Bollard ME, Beckonert O, Schlotterbeck G, Senn H, Niederhauser U, Holmes E, Lindon JC, Nicholson JK (2002) Analytical reproducibility in H-1 NMR-based metabonomic urinalysis. Chemical Research in Toxicology 15: 1380-1386

Keurentjes JJB, Fu JY, De Vos RCH, Lommen A, Hall RD, Bino RJ, van der Plas LHW, Jansen RC, Vreugdenhil D, Koornneef M (2006) The genetics of plant metabolism. Nature Genetics 38: 842-849

Kim H, Moon BH, Ahn JH, Lim Y (2006) Complete NMR signal assignments of flavonol derivatives. Magn Reson Chem 44: 188-190

Kind T, Fiehn O (2006) Metabolomic database annotations via query of elemental compositions: Mass accuracy is insufficient even at less than 1 ppm. BMC Bioinformatics 7: 234

Kochhar S, Jacobs DM, Ramadan Z, Berruex F, Fuerhoz A, Fay LB (2006) Probing gender-specific metabolism differences in humans by nuclear magnetic resonance-based metabonomics. Analytical Biochemistry 352: 274-281

Kopka J, Schauer N, Krueger S, Birkemeyer C, Usadel B, Bergmuller E, Dormann P, Weckwerth W, Gibon Y, Stitt M, Willmitzer L, Fernie AR, Steinhauser D (2005) GMD@CSB.DB: the Golm Metabolome Database. Bioinformatics 21: 1635-1638

Kovacs H, Moskau D, Spraul M (2005) Cryogenically cooled probes - a leap in NMR technology. Progress in Nuclear Magnetic Resonance Spectroscopy 46: 131-155

Kozukue N, Friedman M (2003) Tomatine, chlorophyll, beta-carotene and lycopene content in tomatoes during growth and maturation. Journal of the Science of Food and Agriculture 83: 195-200

Krause M, Galensa R (1992) Determination of naringenin and naringenin-chalcone in tomato skins by reversed phase HPLC after solid-phase extraction. Zeitschrift Fur Lebensmittel-Untersuchung Und-Forschung 194: 29-32

Krishnan P, Kruger NJ, Ratcliffe RG (2005) Metabolite fingerprinting and profiling in plants using NMR. Journal of Experimental Botany 56: 255-265

Krucker M, Lienau A, Putzbach K, Grynbaum MD, Schuler P, Albert K (2004) Hyphenation of capillary HPLC to microcoil (1)H NMR spectroscopy for the determination of tocopherol homologues. Anal Chem 76: 2623-2628

Kunz BA, Cahill DM, Mohr PG, Osmond MJ, Vonarx EJ (2006) Plant Responses to UV Radiation and Links to Pathogen Resistance. In KW Jeon, ed, A Survey of Cell Biology, International Review of Cytology, Vol 255. Academic Press, pp 1-40

Kuzuhara T, Sei Y, Yamaguchi K, Suganuma M, Fujiki H (2006) DNA and RNA as new binding targets of green tea catechins. J Biol Chem 281: 17446-17456

Laaksonen R, Katajamaa M, Paiva H, Sysi-Aho M, Saarinen L, Junni P, Lutjohann D, Smet J, Van Coster R, Seppanen-Laakso T, Lehtimaki T, Soini J, Oresic M (2006) A systems biology strategy reveals biological pathways and plasma biomarker candidates for potentially toxic statin-induced changes in muscle. PLoS ONE 1: e97

Laatikainen R, Niemitz M, Weber U, Sundelin J, Hassinen T, Vepsalainen J (1996) General strategies for total-lineshape-type spectral analysis of NMR spectra using integral-transform iterator. Journal of Magnetic Resonance Series A 120: 1-10

Lambert M, Staerk D, Hansen SH, Sairafianpour M, Jaroszewski JW (2005) Rapid extract dereplication using HPLC-SPE-NMR: analysis of isoflavonoids from Smirnowia iranica. J Nat Prod 68: 1500-1509

Le Gall G, Colquhoun IJ, Davis AL, Collins GJ, Verhoeyen ME (2003a) Metabolite profiling of tomato (Lycopersicon esculentum) using 1H NMR spectroscopy as a tool to detect potential unintended effects following a genetic modification. Journal of Agricultural and Food Chemistry 51: 2447-2456

Le Gall G, DuPont MS, Mellon FA, Davis AL, Collins GJ, Verhoeyen ME, Colquhoun IJ (2003b) Characterization and content of flavonoid glycosides in genetically modified tomato (Lycopersicon esculentum) fruits. Journal of Agricultural and Food Chemistry 51: 2438-2446

Lemaire-Chamley M, Petit J, Garcia V, Just D, Baldet P, Germain V, Fagard M, Mouassite M, Cheniclet C, Rothan C (2005) Changes in transcriptional profiles are associated with early fruit tissue specialization in tomato. Plant Physiol 139: 750-769

Lemanska K, Szymusiak H, Tyrakowska B, Zielinski R, Soffers AEMF, Rietjens IMCM (2001) The influence of pH on antioxidant properties and the mechanism of antioxidant action of hydroxyflavones. Free Radical Biology and Medicine 31: 869-881

Lisec J, Schauer N, Kopka J, Willmitzer L, Fernie AR (2006) Gas chromatography mass spectrometry-based metabolite profiling in plants. Nature Protocols 1: 387-396

Makarov A, Denisov E, Kholomeev A, Baischun W, Lange O, Strupat K, Horning S (2006) Performance evaluation of a hybrid linear ion trap/orbitrap mass spectrometer. Analytical Chemistry 78: 2113-2120

Mallet CR, Lu ZL, Mazzeo JR (2004) A study of ion suppression effects in electrospray ionization from mobile phase additives and solid-phase extracts. Rapid Communications in Mass Spectrometry 18: 49-58

Martinez-Valverde I, Periago MJ, Provan G, Chesson A (2002) Phenolic compounds, lycopene and antioxidant activity in commercial varieties of tomato (Lycopersicum esculentum). Journal of the Science of Food and Agriculture 82: 323-330

Mashego MR, Wu L, Van Dam JC, Ras C, Vinke JL, Van Winden WA, Van Gulik WM, Heijnen JJ (2004) MIRACLE: mass isotopomer ratio analysis of U-C-13-labeled extracts. A new method for accurate quantification of changes in concentrations of intracellular metabolites. Biotechnology and Bioengineering 85: 620-628

Masoum S, Bouveresse DJR, Vercauteren J, Jalali-Heravi M, Rutledge DN (2006) Discrimination of wines based on 2D NMR spectra using learning vector quantization neural networks and partial least squares discriminant analysis. Analytica Chimica Acta 558: 144-149

Mathews H, Clendennen SK, Caldwell CG, Liu XL, Connors K, Matheis N, Schuster DK, Menasco DJ, Wagoner W, Lightner J, Wagner DR (2003) Activation tagging in tomato identifies a transcriptional regulator of anthocyanin biosynthesis, modification, and transport. Plant Cell 15: 1689-1703

Mattila P, Kumpulainen J (2002) Determination of free and total phenolic acids in plant-derived foods by HPLC with diode-array detection. Journal of Agricultural and Food Chemistry 50: 3660-3667

Mattoo AK, Sobolev AP, Neelam A, Goyal RK, Handa AK, Segre AL (2006) Nuclear magnetic resonance spectroscopy-based metabolite profiling of transgenic tomato fruit engineered to accumulate spermidine and spermine reveals enhanced anabolic and nitrogen-carbon interactions. Plant Physiology 142: 1759-1770

McLuckey SA, Wells JM (2001) Mass analysis at the advent of the 21st century. Chemical Reviews 101: 571-606

Miliauskas G, van Beek T, de Waard P, Venskutonis R, Sudhölter E (2006) Comparison of analytical and semi-preparative columns for high-performance liquid chromatography-solid-phase extraction-nuclear magnetic resonance. Journal of Chromatography A 1112: 276-284

Minoggio M, Bramati L, Simonetti P, Gardana C, Iemoli L, Santangelo E, Mauri PL, Spigno P, Soressi GP, Pietta PG (2003) Polyphenol pattern and antioxidant activity of different tomato lines and cultivars. Ann Nutr Metab 47: 64-69

Moco S, Bino RJ, Vorst O, Verhoeven HA, de Groot J, van Beek TA, Vervoort J, De Vos RCH (2006a) A liquid chromatography-mass spectrometry-based metabolome database for tomato. Plant Physiology 141: 1205-1218

Moco S, Tseng LH, Spraul M, Chen Z, Vervoort J (2006b) Building-Up a Comprehensive Database of Flavonoids Based on Nuclear Magnetic Resonance Data. Chromatographia 64: 503-508

Montoya T, Nomura T, Yokota T, Farrar K, Harrison K, Jones JG, Kaneta T, Kamiya Y, Szekeres M, Bishop GJ (2005) Patterns of Dwarf expression and brassinosteroid accumulation in tomato reveal the importance of brassinosteroid synthesis during fruit development. Plant J 42: 262-269

Muir SR, Collins GJ, Robinson S, Hughes S, Bovy A, De Vos RCH, van Tunen AJ, Verhoeyen ME (2001) Overexpression of petunia chalcone isomerase in tomato results in fruit containing increased levels of flavonols. Nat Biotechnol 19: 470-474

Nave CR (2005) Hyperphysics Concepts. [http://hyperphysics.phy-astr.gsu.edu/hbase/hph.html#hph]

Nordström A, O’Maille G, Qin C, Siuzdak G (2006) Nonlinear data alignment for UPLC-MS and HPLC-MS based metabolomics: Quantitative analysis of endogenous and exogenous metabolites in human serum. Analytical Chemistry 78: 3289-3295

Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M (1999) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research 27: 29-34

Oikawa A, Nakamura Y, Ogura T, Kimura A, Suzuki H, Sakurai N, Shinbo Y, Shibata D, Kanaya S, Ohta D (2006) Clarification of pathway-specific inhibition by Fourier transform ion cyclotron resonance/mass spectrometry-based metabolic phenotyping studies. Plant Physiology 142: 398-413

Overy SA, Walker HJ, Malone S, Howard TP, Baxter CJ, Sweetlove LJ, Hill SA, Quick WP (2005) Application of metabolite profiling to the identification of traits in a population of tomato introgression lines. Journal of Experimental Botany 56: 287-296

Pauli GF (2001) qNMR - A versatile concept for the validation of natural product reference compounds. Phytochemical Analysis 12: 28-42

Peterman SM, Duczak N, Kalgutkar AS, Lame ME, Soglia JR (2006) Application of a linear ion trap/orbitrap mass spectrometer in metabolite characterization studies: Examination of the human liver microsomal metabolism of the non-tricyclic anti-depressant nefazodone using data-dependent accurate mass measurements. Journal of the American Society for Mass Spectrometry 17: 363-375

Petró-Turza M (1987) Flavor of tomato and tomato products. Food Reviews International 2: 309-351

Piersen CE (2003) Phytoestrogens in botanical dietary supplements: implications for cancer. Integr Cancer Ther 2: 120-138

Porter SEG, Stoll DR, Rutan SC, Carr PW, Cohen JD (2006) Analysis of Four-Way Two-Dimensional Liquid Chromatography-Diode Array Data: Application to Metabolomics. Anal. Chem. 78: 5559-5569

Pourcel L, Routaboul JM, Cheynier V, Lepiniec L, Debeaujon I (2007) Flavonoid oxidation in plants: from biochemical properties to physiological functions. Trends in Plant Science 12: 29-36

Raffo A, Leonardi C, Fogliano V, Ambrosino P, Salucci M, Gennaro L, Bugianesi R, Giuffrida F, Quaglia G (2002) Nutritional value of cherry tomatoes (Lycopersicon esculentum cv. Naomi F1) harvested at different ripening stages. Journal of Agricultural and Food Chemistry 50: 6550-6556

Ratcliffe RG, Shachar-Hill Y (2005) Revealing metabolic phenotypes in plants: inputs from NMR analysis. Biological Reviews 80: 27-43

Ratcliffe RG, Shachar-Hill Y (2006) Measuring multiple fluxes through plant metabolic networks. Plant Journal 45: 490-511

Reschke A, Herrmann K (1982) Vorkommen von 1-O-hydroxycinnamyl-β-D-glucosen im gemüse. 1. Phenolcarbonsäure-verbindungen des gemüses. Zeitschrift Fur Lebensmittel Untersuchung Und Forschung 174: 5-8

Rischer H, Oresic M, Seppanen-Laakso T, Katajamaa M, Lammertyn F, Ardiles-Diaz W, Van Montagu MCE, Inze D, Oksman-Caldentey KM, Goossens A (2006) Gene-to-metabolite networks for terpenoid indole alkaloid biosynthesis in Catharanthus roseus cells. Proceedings of the National Academy of Sciences of the United States of America 103: 5614-5619

Roessner U, Luedemann A, Brust D, Fiehn O, Linke T, Willmitzer L, Fernie AR (2001) Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. Plant Cell 13: 11-29

Roessner U, Willmitzer L, Fernie AR (2002) Metabolic profiling and biochemical phenotyping of plant systems. Plant Cell Reports 21: 189-196

Rosman KJR, Taylor PDP (1998) Isotopic compositions of the elements 1997. Pure and Applied Chemistry 70: 217-235

Ross JA, Kasum CM (2002) Dietary flavonoids: Bioavailability, metabolic effects, and safety. Annual Review of Nutrition 22: 19-34

Saito K, Dixon R, Willmitzer L (2006) Plant metabolomics. In T Nagata, H Lorz, JM Widholm, eds, Biotechnology in agriculture and forestry, Vol 57. Springer, Berlin, p 347

Sakakibara H, Honda Y, Nakagawa S, Ashida H, Kanazawa K (2003) Simultaneous determination of all polyphenols in vegetables, fruits, and teas. Journal of Agricultural and Food Chemistry 51: 571-581

Sato S, Soga T, Nishioka T, Tomita M (2004) Simultaneous determination of the main metabolites in rice leaves using capillary electrophoresis mass spectrometry and capillary electrophoresis diode array detection. Plant Journal 40: 151-163

Schauer N, Fernie AR (2006) Plant metabolomics: towards biological function and mechanism. Trends in Plant Science 11: 508-516

Schauer N, Steinhauser D, Strelkov S, Schomburg D, Allison G, Moritz T, Lundgren K, Roessner-Tunali U, Forbes MG, Willmitzer L, Fernie AR, Kopka J (2005a) GC-MS libraries for the rapid identification of metabolites in complex biological samples. Febs Letters 579: 1332-1337

Schauer N, Zamir D, Fernie AR (2005b) Metabolic profiling of leaves and fruit of wild species tomato: a survey of the Solanum lycopersicum complex. Journal of Experimental Botany 56: 297-307

Schmidtlein H, Herrmann K (1975) Über die phenolsäuren des gemüses. II. Hydroxyzimtsäuren und hydroxybenzoesäuren der frucht- und samengemüsearten. Z Lebensm Unters Forsch 159: 213-218

Scholz M, Kaplan F, Guy CL, Kopka J, Selbig J (2005) Non-linear PCA: a missing data approach. Bioinformatics 21: 3887-3895

Schroeder FC, Gronquist M (2006) Extending the Scope of NMR Spectroscopy with Microcoil Probes. Angewandte Chemie International Edition 45: 7122-7131

Shchelochkova AP, Vollerner YS, Koshoev KK (1981) Tomatoside A from the seeds of Lycopersicum esculentum. Chem. Nat. Compounds 16: 386-392

Shintu L, Ziarelli F, Caldarelli S (2004) Is high-resolution magic angle spinning NMR a practical speciation tool for cheese samples? Parmigiano Reggiano as a case study. Magnetic Resonance in Chemistry 42: 396-401

Simons V, Morrissey JP, Latijnhouwers M, Csukai M, Cleaver A, Yarrow C, Osbourn A (2006) Dual effects of plant steroidal alkaloids on Saccharomyces cerevisiae. Antimicrobial Agents and Chemotherapy 50: 2732-2740

Sitter B, Lundgren S, Bathen TF, Halgunset J, Fjosne HE, Gribbestad IS (2006) Comparison of HR MAS MR spectroscopic profiles of breast cancer tissue with clinical parameters. Nmr in Biomedicine 19: 30-40

Smilde AK, van der Werf MJ, Bijlsma S, van der Werff-van-der Vat BJC, Jellema RH (2005) Fusion of mass spectrometry-based metabolomics data. Analytical Chemistry 77: 6729-6736

Smillie RM, Hetherington SE, Davies WJ (1999) Photosynthetic activity of the calyx, green shoulder, pericarp, and locular parenchyma of tomato fruit. Journal of Experimental Botany 50: 707-718

Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G (2006) XCMS: Processing mass spectrometry data for metabolite profiling using Nonlinear peak alignment, matching, and identification. Analytical Chemistry 78: 779-787

Sobolev AP, Segre A, Lamanna R (2003) Proton high-field NMR study of tomato juice. Magnetic Resonance in Chemistry 41: 237-245

Srivastava A, Handa AK (2005) Hormonal regulation of tomato fruit development: A molecular perspective. Journal of Plant Growth Regulation 24: 67-82

Stewart AJ, Bozonnet S, Mullen W, Jenkins GI, Lean MEJ, Crozier A (2000) Occurrence of flavonols in tomatoes and tomato-based products. Journal of Agricultural and Food Chemistry 48: 2663-2669

Sumner LW, Mendes P, Dixon RA (2003) Plant metabolomics: large-scale phytochemistry in the functional genomics era. Phytochemistry 62: 817-836

Tatsis E (2007) Identification of the major constituents of Hypericum perforatum by LC/SPE/NMR and/or LC/MS. Phytochemistry 68: 383-393

Taylor LP, Grotewold E (2005) Flavonoids as developmental regulators. Current Opinion in Plant Biology 8: 317-323

Tikunov Y, Lommen A, De Vos RCH, Verhoeven HA, Bino RJ, Hall RD, Bovy AG (2005) A novel approach for nontargeted data analysis for metabolomics. Large-scale profiling of tomato fruit volatiles. Plant Physiology 139: 1125-1137

Tokusoglu O, Unal MK, Yildirim Z (2003) HPLC-UV and GC-MS characterization of the flavonol aglycons quercetin, kaempferol, and myricetin in tomato pastes and other tomato-based products. Acta Chromatographica 13: 196-207

Tolstikov VV, Fiehn O (2002) Analysis of highly polar compounds of plant origin: Combination of hydrophilic interaction chromatography and electrospray ion trap mass spectrometry. Analytical Biochemistry 301: 298-307

Tolstikov VV, Lommen A, Nakanishi K, Tanaka N, Fiehn O (2003) Monolithic silica-based capillary reversed-phase liquid chromatography/electrospray mass spectrometry for plant metabolomics. Analytical Chemistry 75: 6737-6740

Tomassen MMM, Barrett DM, van der Valk HCPM, Woltering EJ (2007) Isolation and characterization of a tomato non-specific lipid transfer protein involved in polygalacturonase-mediated pectin degradation. J. Exp. Bot. 58: 1151-1160

Tracewell CA, Vrettos JS, Bautista JA, Frank HA, Brudvig GW (2001) Carotenoid Photooxidation in Photosystem II. Archives of Biochemistry and Biophysics 385: 61-69

Trethewey RN (2004) Metabolite profiling as an aid to metabolic engineering in plants. Current Opinion in Plant Biology 7: 196-201

Trygg J, Holmes E, Lundstedt T (2007) Chemometrics in Metabonomics. J. Proteome Res. 6: 469-479

USDA database for the flavonoid content of selected foods (2003). Agricultural Research Service, Nutrient Data Laboratory

Vaidyanathan S, Harrigan GG, Goodacre R, eds (2005) Metabolome analyses - Strategies for systems biology. Springer, New York

van der Greef J, Stroobant P, van der Heijden R (2004) The role of analytical sciences in medical systems biology. Current Opinion in Chemical Biology 8: 559-565

van der Werf MJ, Jellema RH, Hankemeier T (2005) Microbial metabolomics: replacing trial-and-error by the unbiased selection and ranking of targets. Journal of Industrial Microbiology & Biotechnology 32: 234-252

van der Woude H, Alink GM, van Rossum BEJ, Walle K, van Steeg H, Walle T, Rietjens IMCM (2005) Formation of Transient Covalent Protein and DNA Adducts by Quercetin in Cells with and without Oxidative Enzyme Activity. Chem. Res. Toxicol. 18: 1907-1916

van Tuinen A, De Vos RCH, Hall RD, van der Plas LHW, Bowler C, Bino RJ (2005) Use of metabolomics for development of tomato mutants with enhanced nutritional value by exploiting natural non-GMO light-hyperresponsive mutants. In P Jaiwal, ed, Plant Genetic Engineering: Improvement of the Nutritional and the Therapeutic Qualities of Plants. Agritech Publications/Agricell Report, Shrub Oak, NY, USA (in press)

Verhoeven HA, De Vos RCH, Bino RJ, Hall RD (2006) Plant metabolomics strategies based upon quadrupole time of flight mass spectrometry (QTOF-MS). In K Saito, RA Dixon, L Willmitzer, eds, Plant Metabolomics - Biotechnology and Forestry, Vol 57. Springer-Verlag, Berlin Heidelberg

Viant MR (2003) Improved methods for the acquisition and interpretation of NMR metabolomic data. Biochemical and Biophysical Research Communications 310: 943-948

Vlahov G (2006) C-13 nuclear magnetic resonance spectroscopy to determine olive oil grades. Analytica Chimica Acta 577: 281-287

von Roepenack-Lahaye E, Degenkolb T, Zerjeski M, Franz M, Roth U, Wessjohann L, Schmidt J, Scheel D, Clemens S (2004) Profiling of Arabidopsis secondary metabolites by capillary liquid chromatography coupled to electrospray ionization quadrupole time-of-flight mass spectrometry. Plant Physiol 134: 548-559

Vorst O, De Vos RCH, Lommen A, Staps RV, Visser RGF, Bino RJ, Hall RD (2005) A non-directed approach to the differential analysis of multiple LC-MS derived metabolic profiles. Metabolomics 1: 169-180

Ward JL, Harris C, Lewis J, Beale MH (2003) Assessment of H-1 NMR spectroscopy and multivariate analysis as a technique for metabolite fingerprinting of Arabidopsis thaliana. Phytochemistry 62: 949-957

Wen X (2006) Methylated flavonoids have greatly improved intestinal absorption and metabolic stability. Drug metabolism and disposition 34: 1786-1792

Wilkinson M, Schoof H, Ernst R, Haase D (2005) BioMOBY successfully integrates distributed heterogeneous bioinformatics Web services. The PlaNet exemplar case. Plant Physiology 138: 4-16

Willker W, Leibfritz D (1992) Complete assignment and conformational studies of tomatine and tomatidine. Magnetic resonance in chemistry 30: 645-650

Wilson ID, Brinkman UAT (2003) Hyphenation and hypernation - The practice and prospects of multiple hyphenation. Journal of Chromatography A 1000: 325-356

Winkel-Shirley B (2002) Biosynthesis of flavonoids and effects of stress. Curr Opin Plant Biol 5: 218-223

Winter M, Herrmann K (1986) Esters and glucosides of hydroxycinnamic acids in vegetables. Journal of Agricultural and Food Chemistry 34: 616-620

Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, Cheng D, Jewell K, Arndt D, Sawhney S, Fung C, Nikolai L, Lewis M, Coutouly M-A, Forsythe I, Tang P, Shrivastava S, Jeroncic K, Stothard P, Amegbey G, Block D, Hau DD, Wagner J, Miniaci J, Clements M, Gebremedhin M, Guo N, Zhang Y, Duggan GE, MacInnis GD, Weljie AM, Dowlatabadi R, Bamforth F, Clive D, Greiner R, Li L, Marrie T, Sykes BD, Vogel HJ, Querengesser L (2007) HMDB: the Human Metabolome Database. Nucl. Acids Res. 35: D521-526

Wolfender JL, Ndjoko K, Hostettmann K (2003) Liquid chromatography with ultraviolet absorbance-mass spectrometric detection and with nuclear magnetic resonance spectroscopy: a powerful combination for the on-line structural investigation of plant metabolites. Journal of Chromatography A 1000: 437-455

Wolff JC, Eckers C, Sage AB, Giles K, Bateman R (2001) Accurate mass liquid chromatography/mass spectrometry on quadrupole orthogonal acceleration time-of flight mass analyzers using switching between separate sample and reference sprays. 2. Applications using the dual-electrospray ion source. Analytical Chemistry 73: 2605-2612

Xi Y, de Ropp JS, Viant MR, Woodruff DL, Yu P (2006) Automated screening for metabolites in complex mixtures using 2D COSY NMR spectroscopy. Metabolomics 2: 221-233

Yahara S, Uda N, Nohara T (1996) Lycoperosides A-C, three stereoisomeric 23-acetoxyspirosolan-3 beta-ol beta-lycotetraosides from Lycopersicon esculentum. Phytochemistry 42: 169-172

Yahara S, Uda N, Yoshio E, Yae E (2004) Steroidal alkaloid glycosides from tomato (Lycopersicon esculentum). Journal of Natural Products 67: 500-502

Yoshizaki M, Matsushita S, Fujiwara Y, Ikeda T, Ono M, Nohara T (2005) Tomato new sapogenols, isoesculeogenin A and esculeogenin B. Chemical & Pharmaceutical Bulletin 53: 839-840

Zhao Q, Stoyanova R, Du SY, Sajda P, Brown TR (2006) HiRes - a tool for comprehensive assessment and interpretation of metabolomic data. Bioinformatics 22: 2562-2564

Curriculum vitae

Sofia Isabel Abraúl Viana Moço was born on 28 May 1978 in Leiria, Portugal. She studied natural sciences at the Francisco Rodrigues Lobo School in Leiria, where she completed her secondary education. Between 1996 and 2001, she studied Chemical Engineering, with a specialization in Biotechnology, at the Instituto Superior Técnico (IST), Technical University of Lisbon. In the course of her studies, she completed two five-month thesis projects: one at the Organic Chemistry Laboratory of Wageningen University, within the group of Dr. Teris A. van Beek, and one at the National Institute of Engineering and Industrial Technology in Lisbon, supervised by Dr. Santino Di Berardino. For one year, she worked as a graduate student at the Centre for Biological and Chemical Engineering, part of the IST. In 2003, she accepted a project at the Biochemistry Laboratory of Wageningen University, in collaboration with Plant Research International, which resulted in the proposal for her degree of Doctor. The present thesis is the outcome of this doctoral research.

List of publications and manuscripts in preparation

Sofia Moco, Jenny Forshed, Ric C.H. De Vos, Raoul Bino, Jacques Vervoort A tomato metabolite correlation matrix obtained by liquid chromatography-mass spectrometry and nuclear magnetic resonance (in preparation)

Sofia Moco, Li-Hong Tseng, Frank van Zimmeren, Zheng Chen, Matthias Niemitz, Reino Laatikainen, Manfred Spraul, Jacques Vervoort Push-button flavonoid identification: a NMR database integrated with a 1H NMR predictive model (in preparation)

Sofia Moco, Esra Capanoglu, Yury Tikunov, Raoul J. Bino, Jacques Vervoort and Ric C.H. De Vos (2007) Tissue specialization at the metabolite level is perceived during the development of tomato fruit (submitted)

Sofia Moco, Raoul J. Bino, Ric C.H. De Vos, Jacques Vervoort (2007) Metabolomics technologies and metabolite identification. Trends in Analytical Chemistry 26 (in press)

Ric C.H. De Vos*, Sofia Moco*, Arjen Lommen*, Joost J.B. Keurentjes, Raoul J. Bino and Robert D. Hall (2007) Untargeted large-scale plant metabolomics using liquid chromatography coupled to mass spectrometry. Nature Protocols 2: 778-791 *equally contributing authors

Sofia Moco, Li-Hong Tseng, Manfred Spraul, Zheng Chen, Jacques Vervoort (2006) Building up a comprehensive database of flavonoids based on nuclear magnetic resonance data. Chromatographia 64: 503-508

Sofia Moco, Raoul J. Bino, Oscar Vorst, Harrie A. Verhoeven, Joost de Groot, Teris A. van Beek, Jacques Vervoort and Ric C.H. De Vos (2006) A liquid chromatography mass spectrometry based metabolome database for tomato. Plant Physiology 141: 1205-1218

Yongyan Qi, Sofia Moco, Sjef Boeren, Ric C.H. De Vos, Arnaud Bovy (2005) Isolation and identification of glycinol from Glycine max [L.] Merr. Chinese Journal of Chromatography 23: 353-357

Raoul J. Bino, Ric C.H. De Vos, Michal Lieberman, Robert D. Hall, Arnaud Bovy, Harry H. Jonker, Yury Tikunov, Arjen Lommen, Sofia Moco, Ilan Levin (2005) The light-hyperresponsive high pigment-2dg mutation of tomato: alterations in the fruit metabolome. New Phytologist 166: 427-438

Acknowledgements / Agradecimentos

The author of this thesis would like to thank the following people, who have contributed directly or indirectly to the work presented here. This project was part of the CBSG (Centre for BioSystems Genomics) and was carried out at the Laboratory of Biochemistry (BIC) of Wageningen University (WU) and at the Bioscience unit of Plant Research International (PRI).

The supervisors Prof. Dr. Raoul J. Bino and Prof. Dr. Sacco C. de Vries and the co-supervisors Dr. Jacques Vervoort, Dr. Ric C.H. De Vos and Dr. Teris A. van Beek.

BIC: Dr. Vassiliki Exarchou, Dr. Yiannis Fiamegos, Sjef Boeren, Kees-Jan Françoijs, Zheng Chen, Frank van Zimmeren, Dr. Cathy Albrecht, Laura van Egmond and the work discussion group.

PRI: Esra Capanoglu, Bert Schipper, Harry Jonker, Jan Blaas, Dr. Jan Cordewener, Dr. Oscar Vorst, Joost de Groot, Dr. Harrie Verhoeven, Dr. Robert Hall, Dr. Velitchka Mihaleva, Dr. Roeland van Ham, Dr. Arnaud Bovy, Dr. Yury Tikunov, Dr. Tom Dueck and the metabolic regulation cluster meeting group.

WU: Dr. Pim Lindhout (Laboratory of Plant Breeding), Dr. Pieter de Waard (Laboratory of Biophysics), Elbert van der Klift and Frank Claassen (Laboratory of Organic Chemistry), Ageeth van Tuinen (Laboratory of Plant Physiology).

RIKILT/PRI: Dr. Arjen Lommen

Karolinska Institutet: Dr. Jenny Forshed

Bruker BioSpin: Dr. Li-Hong Tseng, Dr. Peter Neidig, Dr. Christian Fischer, Dr. Markus Godejohann, Dr. Manfred Spraul, Dr. Alexandre Schefer, Renata Schefer and Andrea Jaeckel.

Radboud Universiteit Nijmegen: Prof. Dr. Lutgarde Buydens and Dr. Ron Wehrens.

Vrije Universiteit Amsterdam: Prof. Dr. Udo Brinkman

CBSG partners and sponsoring companies, in particular Syngenta Seeds Inc., Seminis Inc., Enza Zaden, Rijk Zwaan, Nickerson-Zwaan and De Ruiter Seeds Inc. for providing the seeds of the 96 tomato cultivars and the Technology, Bioinformatics and Tomato CBSG groups.

I would like to thank my colleagues at Biochemistry and Plant Research International for their help and companionship, as well as the former housemates and friends I met during my stay in The Netherlands. To Mark, a very special thanks. And finally, I thank my parents, my sister Rita and my grandmothers Idalina and Júlia.

The work presented in this thesis has received funding from the European Union and the Centre for BioSystems Genomics.

Training and Supervision Plan

Courses

16-17th April 2003. Interpretation of MS-MS Mass Spectra, Thermo Finnigan, Breda, The Netherlands (NL)

23-29th November 2003. 4th International Advanced Course Chemistry and Biochemistry of Antioxidants, VLAG, Wageningen, NL

15-23rd March 2004. Basic Course Molecular Spectroscopy, ANAC, Amsterdam, Utrecht and Wageningen, NL

28th June-1st July 2004. Summer Course Glycosciences, VLAG, Wageningen, NL

6th September-1st November 2004 (9-12h, Mondays). Scientific Writing Course, Centa, Wageningen, NL

8-16th November 2004. Bioinformation Technology –1, VLAG, Wageningen, NL

2-4th May 2005. 1st Metabolomics Workshop, VLAG/EPS, Wageningen, NL

20-29th September 2006. Summer School Plant Genomics and Bioinformatics: Metabolite Profiling and Data Analyses, ETNA (European Training and Network Activity), Max Planck Institute of Molecular Plant Physiology, Golm, Potsdam, Germany (DE)

Conferences and Meetings

25-28th April 2003. 2nd International Conference on Plant Metabolomics, Potsdam, DE

27-28th October 2003. NWO Analytical Chemistry Meeting, Lunteren, NL

5-7th November 2003. 1st International Symposium on Recent Advances in Food Analysis, IAEAC, Prague, Czech Republic

8th June 2004. 40th Thematic Meeting – Biotechnology for Food, VLAG, Wageningen, NL

15th June 2004. Next generation Waters innovative MS solutions for your pharmaceutical & proteomics research applications, Waters, Utrecht, NL

12th October 2004. Proteomics Seminar, Thermo, Tilburg, NL

20th May 2005. ALW-NWO Metabolomics Conference, Zeist, NL

1st July 2005. Uniformity in Diversity - WNMRCentre, Wageningen, NL

31st October-1st November 2005. NWO Analytical Chemistry Meeting, Lunteren, NL

8th November 2005. LC-MS/MS Seminar, Thermo Electron Corporation, Tilburg, NL

26-29th March 2006. Hyphenation Conference, Tübingen, DE

7-10th April 2006. 4th International Conference on Plant Metabolomics, Reading, England (UK)

25-28th June 2006. 2nd Scientific Meeting of the Metabolomics Society, Boston, United States of America

29th November 2006. Netherlands Bioinformatics Centre – Workshop Bioinformatics for Metabolomics, Wageningen, NL

15th March 2007. All Molecule Tour, Thermo User Meeting LC-MS/MS, Utrecht, NL

2-3rd April 2007. NWO Plant Meeting, Lunteren, NL

19th April 2007. Intermediar PhD Career Event 2007, Amsterdam, NL

6-8th June 2007. 2nd Metabolomics Workshop, VLAG/EPS, Wageningen, NL

25-28th June 2007. 3rd Scientific Meeting of the Metabolomics Society, Manchester, UK

Total ECTS credits: 44.2