103
The evolutionary history of sialylation - Perspectives from fish genomes Dissertation Zur Erlangung des Doktorgrades der Naturwissenschaften -Dr. rer. nat.- vorgelegt dem Promotionsausschuss des Fachbereichs 2 (Biologie und Chemie) der Universität Bremen von Friederike Lehmann Universität Bremen 2004

The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

The evolutionary history of sialylation -

Perspectives from fish genomes

Dissertation

Zur Erlangung des Doktorgrades der Naturwissenschaften

-Dr. rer. nat.-

vorgelegt dem Promotionsausschuss

des Fachbereichs 2 (Biologie und Chemie) der Universität Bremen

von

Friederike Lehmann

Universität Bremen

2004

Page 2: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

The evolutionary history of sialylation -

Perspectives from fish genomes

Dissertation

Zur Erlangung des Doktorgrades der Naturwissenschaften

-Dr. rer. nat.-

vorgelegt dem Promotionsausschuss

des Fachbereichs 2 (Biologie und Chemie) der Universität Bremen

von

Friederike Lehmann

1. Gutachter: Prof. Dr. Sørge Kelm

2. Gutachterin: Prof. Dr. Rita Gerardy-Schahn

Universität Bremen

2004

Page 3: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Statement of Originality

I declare that this thesis has not been submitted in any form for another degree at any other

university. The material discussed in this thesis is my own work, unless otherwise stated.

Information derived from the literature or unpublished work of others has been acknowledged

in the text and a list of references is provided.

Friederike Lehmann Bremen, 25.06.2004

Page 4: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Table of Contents

I

1 INTRODUCTION ................................................................................. 1

1.1 Sialic acids .............................................................................................. 1

Structural diversity of sialic acids .......................................................................... 1

Biological roles of sialic acids .................................................................................. 2

Occurrence of sialic acids ........................................................................................ 3

Metabolism of sialic acids ........................................................................................ 6

1.2 Sialyltransferases ................................................................................... 8

Mammalian STs and their substrate specificity .................................................... 8

Structure of vertebrate STs................................................................................... 10

1.3 Lectins................................................................................................... 12

Vertebrate pathogen lectins................................................................................... 12

Vertebrate endogenous lectins .............................................................................. 13

1.4 Evolutionary aspects of sialic acid biology........................................ 17

1.5 Project objectives................................................................................. 18

1.6 Abbreviations ....................................................................................... 20

1.7 References............................................................................................. 21

2 THE EVOLUTIONARY HISTORY OF SIALYLATION ............. 32

2.1 Abstract ................................................................................................ 34

2.2 Introduction ......................................................................................... 35

2.3 Material and Methods......................................................................... 36

Homology searches for STs in fugu and zebrafish genomic sequences............. 36

Phylogenetic analysis ............................................................................................. 37

2.4 Results................................................................................................... 37

Sequence analysis for the teleost genomes ........................................................... 37

Sequence analysis for other deuterostomian genomes........................................ 39

Phylogenetic analysis ............................................................................................. 40

Page 5: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Table of Contents

II

2.5 Discussion ............................................................................................. 40

2.6 Acknowledgements .............................................................................. 45

2.7 References............................................................................................. 46

2.8 Legends to figures................................................................................ 50

2.9 Tables and figures................................................................................ 51

3 EVOLUTION OF SIALIC ACID-BINDING PROTEINS: MOLECULAR CLONING AND EXPRESSION OF FISH SIGLEC-4 ............................................................................................ 58

3.1 Abstract ................................................................................................ 60

3.2 Introduction ......................................................................................... 61

3.3 Results................................................................................................... 62

Identification of putative siglec sequences in fish genomes ................................ 62

Structure prediction analysis of fish Siglec-4 ...................................................... 63

Molecular cloning of fugu Siglec-4 cDNA............................................................ 63

Genomic Organization of Fish Siglec-4................................................................ 64

Splice variants of fish Siglec-4............................................................................... 64

Expression pattern of Fish Siglec-4 ...................................................................... 65

Fish Siglec-4 as Sia-dependent cell adhesion molecules...................................... 65

3.4 Discussion ............................................................................................. 66

3.5 Materials and Methods ....................................................................... 68

Homology searches for siglecs in fish and frog genomes .................................... 68

Molecular cloning of a full length cDNA encoding fugu Siglec-4 ...................... 68

Cell adhesion assays ............................................................................................... 69

Production of recombinant fish Siglec-4-Fc......................................................... 69

Northern blot analysis............................................................................................ 69

Binding specificity of fish Siglec-4 ........................................................................ 69

3.6 Acknowledgements .............................................................................. 70

Page 6: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Table of Contents

III

3.7 Abbreviations ....................................................................................... 70

3.8 References............................................................................................. 71

3.9 Legends to figures................................................................................ 75

3.10 Tables and figures................................................................................ 77

4 UNPUBLISHED DATA...................................................................... 81

4.1 Background .......................................................................................... 81

4.2 Materials and Methods ....................................................................... 81

3’-RACE-PCR ........................................................................................................ 82

Molecular cloning of a truncated form of putative ST3Gal II cDNA ............... 82

Expression of put. ST3Gal II/protein A-constructs ............................................ 82

4.3 Results and Discussion ........................................................................ 83

3’ RACE-PCR......................................................................................................... 83

Expression of put. ST3Gal II/protein A-constructs ............................................ 83

5 CONCLUSIONS.................................................................................. 85

6 SUMMARY.......................................................................................... 94

7 ZUSAMMENFASSUNG..................................................................... 95

8 ACKNOWLEDGEMENTS................................................................ 97

Page 7: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

1

1. Introduction

1.1. Sialic acids Beside proteins, nucleic acids and lipids carbohydrates are one of the four major classes of biomolecules. Carbohydrates are aldehyde and ketone derivatives that have several OH groups attached to the carbon chain. They can form branching structures, and for this reason a relatively simple set of sugars can build a huge number of complex structures. The attachment of sugar residues is one of the most complicated co- or posttranslational modifications that a protein can undergo. It is an event that reaches beyond the genome and is controlled by factors that differ greatly among cell types and species. A large number of proteins are needed for glycosylation raising the question how such a complex apparatus evolved. The greatest diversity tends to be found among the outermost regions of glycans on cell surfaces and extracellular molecules. To get further insight into the evolution of glycosylation this thesis therefore focuses on the sialic acids (Sias), a family of monosaccharides mainly occuring as terminal components of cell surface glycoproteins and glycolipids. Sias are not ubiquitiously expressed in nature and exhibit a unique and complex metabolism pathway, two attributes making them particularly interesting for evolutionary investigations. To account for the different sides of the evolution of this monosaccharide family this thesis will emphasize on the proteins involved in the biosynthesis and recognition of Sias. Exemplary, the family of sialyltransferases catalyzing the transfer of Sias to glycoconjugates as well as the siglecs, a family of Sia recognizing lectins will be investigated.

Structural diversity of sialic acids More than 50 naturally occuring derivatives of the nine carbon sugar neuraminic acid (5-Amino-3,5-dideoxy-D-glycero-D-galacto-non-2-ulopyranosonic acid; Neu) have been found. The unsubstituted form, Neu, does not exist in its free form in nature. Usually the amino group is acetylated leading to N-acetylneuraminic acid (5-acetamido-3,5-dideoxy-D-glycero-D-galacto-non-2-ulopyranosonic acid; Neu5Ac), the most widespread form of Sia (Schauer and Kamerling 1997). Substituting one of the hydrogens in the methyl moiety of the acetyl group by a hydroxyl group yields N-glycolylneuraminic acid (5-hydroxyacetamido-3,5-dideoxy-D-glycero-D-galacto-non-2-ulopyranosonic acid; Neu5Gc). Substitution of the amino group by a hydroxyl at position 5 of Neu leads to the loss of the amino group resulting in deaminoneuraminic acid (2-keto-3-deoxy-D-glycero-D-galacto-non-2-ulopyranosonic acid; Kdn) (Fig. I) (Schauer and Kamerling 1997). Sias can undergo further modifications at any one of the four hydroxyl groups located at C-4, C-7, C-8 and C-9. These groups can be methylated or form esters, such as acetyl, lactyl, sulphate or phosphate esters. An introduction of a double bond between C-2 and C-3 has also been described (Schauer and Kamerling 1997).

Page 8: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

2

Figure I Structure of three important Sias

Biological roles of sialic acids The structural diversity of Sia is reflected by the variety of its biological functions. Due to their negative charge, Sias are involved in binding and transport of positively charged molecules (e.g. Ca2+) as well as attraction and repulsion phenomena between cells and molecules. Their exposed terminal position in carbohydrate chains, in addition to their size and negative charge enable them to function as a protective shield for the subterminal part of the molecule or cell. In infection processes, the colonization of bacteria can be limited by the Sia coat covering the host cell. The repulsing forces acting between their negative charges stabilise the correct conformation of glycoproteins (Varki 1992) and are important for the lubricative and protective functions of mucins, found in saliva and on epithelial cells (Schauer and Kamerling 1997). Moreover, the repulsive effects of negatively charged Sias hinder aggregation of erythrocytes (Izumida et al. 1991). Sias take part in a variety of recognition processes between cells and molecules. Thus, the immune system can distinguish between self and nonself structures according to their Sia pattern. The sugar represents an antigenic determinant, for example of blood group substances, and is a necessary component of receptors for many endogenous substances such as hormones and cytokines. Likewise, many pathogenic agents such as toxins, viruses, bacteria and protozoa also bind to host cells via Sia-containing receptors (Vimr and Lichtensteiger 2002). Additionaly, selectins, which bind to sialylated glycans (e.g. sialyl-Lewisx and sialyl-Lewisa) on the surface of leucocytes, play an important role in the initial stage of adhesion of leucocytes to endothelia prior to their invasion into the lymphatic tissue (Kannagi 2002). Furthermore, Siglecs, a group of Sia-recognising Ig-like lectins, can recognise Sias with a far greater specificity than selectins (discussed later). Sias can also mask specific cellular recognition sites, as has been observed for erythrocytes and other blood cells, as well as serum glycoproteins, where the addition of Sia to the subterminal Gal impedes the

R= NHCOCH3 (Neu5Ac) R= NHCOCH2OH (Neu5Gc) R= OH (Kdn)

Page 9: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

3

binding of Gal-specific receptors of macrophages and hepatocytes hindering the degradation of those molecules (Bratosin et al. 1995). The same masking effect can, on the other hand, help to hide antigenic sites on bacteria, protozoa and tumor cells from the host immune system (Jarvis 1995).

Occurrence of sialic acids The expression of Sias was previously thought to be unique to deuterostomes and pathogenic bacteria infecting these animals. However, recent studies have led to the discovery of these monosaccharides in formerly unexpected organisms, such as certain insects and fungi. Also, the molecular cloning of most of the bacterial and vertebrate enzymes involved in Sia biosynthesis (which show significant sequence similarities) has allowed to seek orthologous sequences in the genomes of other organisms. The available information regarding such genomic sequences has expanded explosively in recent years. The bioinformatic analysis of whole genomes along with the experimental evidence published in recent years indicate that Sias (and/or related molecules) are more widely distributed than previously thought and possibly quite ancient in their origin (see Fig. II).

Deuterostomes

Deuterostomes express four major types of glycoconjugates which are expressed on the cell surface or secreted glycoproteins, glycolipids, proteoglycans, and glycosylphosphatidyl-inositol (GPI) anchors. With some exceptions, Sias are expressed mostly on the glycan chains of glycoproteins and glycolipids. In most cases they occupy the distal (outermost) end of glycan chains, although there are also some examples in echinoderms and amphibians where the Sias are found as internal residues, i.e., substituted at one or more hydroxyl groups by another sugar other than Sia (Maes et al. 1995; Schauer and Kamerling 1997). Deuterostomes express the highest structural diversity of Sias. All types of O-substitutions (acetylation, lactylation, sulfation, methylation) can be found. Within this group the echinoderms (sea urchin, starfish, etc.) are known to express these modified Sias in large quantities compared to vertebrates, where mainly O-acetylation and occasional lactylation can be found.

Protostomes

In general, the classes of glycoconjugates expressed by protostomes are comparable to those found in deuterostomes. In addition, arthropods, such as insects and crustaceans express an ectoskeleton made of a polymer of N-acetylglucosamine (GlcNAc) called chitin. Protostomes were generally thought to be devoid of Sia, although there are some reports of these monosaccharides in gastropods (Inoue 1965). Indeed, the original extensive survey by Warren using the thiobarbituric acid test failed to show a positive response in a wide variety of protostomes examined (Warren 1963). Furthermore, with one possible exception (Davidson

Page 10: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

4

and Castellino 1993), recombinant proteins expressed in insect cells are not reported to have Sias (Altmann et al. 1999; Marchal et al. 2001). However, studies have shown that some insects may express Neu5Ac in a stage-specific manner (Malykh et al. 1999; Roth et al. 1992) and analysis of the completely cloned genome revealed sequences of several proteins involved in Sia metabolism (Kim et al. 2002; Koles et al. 2004). Also, octopuses and squids (belonging to the molluscs) have recently been shown to express gangliosides (Sia-containing glycolipids) (Saito et al. 2001).

Figure II "Universal tree" of cellular organisms and occurrence of Sias (Angata and Varki 2002). Expression of Sias (determined by physicochemical methods, e.g., NMR, mass-spectrometry, etc.) is indicated with a closed circle and putative expression (predicted from DNA sequences encoding enzymes involved in Sia biosynthesis) with an open circle. The three examples of apparently complete eukaroytic genomes which do not show any evidence for the genes involved in Sia biosynthesis (Arabidopsis thaliana; Caenorhabditis elegans; Saccharomyces cerevisiae) are indicated by open boxes. Additionaly there are also nine complete genomes of archaea that do not show any evidence for such genes.

Thus, it was suggested that in protostomes the sialylation pathway may be present as a highly specialized biochemical process which occurs in a tissue- and/or developmental stage specific manner, or perhaps that Sias are present in only a small number of protostome species.The question also arises whether these animals actually synthesize their own Sias or simply utilize Sia assimilated through the food chain or synthesized by a commensal organism.

Page 11: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

5

Fungi

There are reports on the presence of Sias in some pathogenic fungal cells (Alviano et al. 1999), such as Candida albicans, Cryptococcus neoformans, Aspergillus fumigatus, and Sporothrix schenckii with Sia species identified as Neu5Ac, Neu5Gc, and N-acetyl-9-O-acetylneuraminic acid (Neu5,9Ac2). On the other hand an analysis of the available genomic DNA sequences of three fungal species, i.e., Saccharomyces cerevisiae, Candida albicans and Cryptococcus neoformans - the latter two still incomplete - did not reveal any sequences significantly similar to enzymes known to be involved in the biosynthesis, activation, or transfer of Sias in bacteria and mammals (Angata and Varki 2002). However, it is possible that the expression of Sias in fungi is strain-specific (as in many bacteria). Alternatively, as also discussed for protostomian species, Sias may have been acquired from external sources. It also cannot be ruled out completely that these fungi have developed independently a novel pathway to synthesize and express Sias.

Plants

With one possible exception, structural studies of natural and recombinant plant glycoproteins (Lerouge et al. 1998; Matsumoto et al. 1995) have shown no evidence of Sias. The analysis of plant DNA sequences, including that of thale cress Arabidopsis thaliana, which is completely sequenced, revealed no evidence for genes known to be involved in the biosynthesis, activation, or transfer of Sias (Angata and Varki 2002).

Protozoa

Some protozoa, such as Trypanosoma cruzi (which causes Chagas' disease in South America), are known to carry Sias. However, these organisms do not synthesize their own Sias but acquire it from their host glycoproteins by the action of a unique enzyme called trans-sialidase (Schenkman et al. 1993), which is able to cleave Sia like a sialidase and transfer it like a sialyltransferase. The Sias are thought to protect these organisms from detection and attack by host immune system (Belen et al. 2000). Although there are some studies which describe the presence of Sias on other protozoal cells, it remains an open question whether these organisms are able to synthesize Sias on their own (Sorice et al. 1996; Watarai et al. 1996).

Bacteria

Bacteria express various types of glycoconjugates, such as capsular polysaccharides (K-antigens), lipopolysaccharides (O-antigens), S-layer glycoproteins, and peptidoglycans (component of the cell wall). Most of those Sias are found in capsular polysaccharides and lipopolysaccharides. In contrast to animals, in bacteria Sias mostly exist as internal residues, either in homopolymers (polySias) linked via 2-8 and/or 2-9 linkages or in repetitive units made of several sugar residues. However, there are also examples of terminal Sias in certain

Page 12: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

6

bacterial lipooligosaccharides (Gilbert et al. 2000; Wakarchuk et al. 2001). Within the Sias, bacteria express predominantly Neu5Ac, although some are able to synthesize Kdn and its derivatives (Gil-Serrano et al. 1998). So far, Neu5Gc has not been reported in bacteria. However, most bacteria do not express Sias. Notably, many of the Sia-expressing bacteria are causative agents of serious illness in humans and domestic animals, such as certain Escherichia coli strains, Neisseria meningitidis, Campylobacter jejuni and Helicobacter pylori (Kelm and Schauer 1997), where Sias on the cell surface of these bacteria are thought to provide a protective barrier to evade detection and attack by the host's immune system.

Archaea

Although information about archaeal glycans is rare, it is known that they express glycoproteins, glycolipids, and polysaccharides. The genome analysis of nine archaeal species (whose complete genomes were sequenced) do not show evidence for expression of Sias. In contrast, the genome of a hyperthermophilic archaea, Methanococcus jannaschii, show putative genes similar to Neu5Ac synthetase and CMP-Neu5Ac synthetase in a gene cluster, suggesting that the organism may express Sias or a similar molecule (Angata and Varki 2002). Further studies are neccessary to provide information about the functionality of these genes.

Metabolism of sialic acids The reactions of Sia synthesis and degradation are distributed between different compartments of the cell (Fig. III). The sugar is synthesized from N-acetylmannosamine-6-phosphate and phosphoenolpyrovate in the cytosol. After dephosphorylation of the reaction product, Neu5Ac-9-phosphate, the molecule is activated in the nucleus by the transfer of a cytidine monophosphate (CMP) residue from cytidine triphosphate (CTP) through CMP-Neu5Ac synthase. CMP-Neu5Ac is then translocated into the Golgi apparatus or the endoplasmatic reticulum (Kelm and Schauer 1997). There, the activated Sia can be transferred by a sialyltransferase onto the appropriate acceptor molecule, that is the oligosaccharide chain of a nascent glycoconjugate. The bound Sia can then be modified by O-acetylation or O-methylation before transport of the mature glycoconjugate to the cell surface, whereas the only modification known to take place before the transfer onto the glycoconjugate is the hydroxylation of the N-acetyl group of CMP-Neu5Ac, which leads to CMP-Neu5Gc in the cytosol (Schauer et al. 1995). The key enzyme of Sia catabolism is sialidase. Sia residues can be removed from cell surface or serum sialoglycoconjugates by membrane-bound sialidases. Usually, the glycoconjugates that are prone to degradation are taken up by receptor-mediated endocytosis in higher animals. After fusion of the endosome with a lysosome, the terminal Sia residues are removed by lysosomal sialidases. A prerequisite for the effective action of sialidases is the removal of O-acetyl groups by sialate-O-acetyl esterase, whereas bound Neu5Gc seems to be good

Page 13: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

7

substrate. Free Sia molecules are transported through the lysosomal membrane into the cytosol, from where they can be recycled by activation and transfer onto another nascent glycoconjugate molecule in the Golgi. Alternatively, they are degraded to acylmannosamine and pyruvate with the aid of a cytosolic acylneuraminate lyase (Schauer 2000).

Figure III Metabolism of Sias. The enzymatic reactions involved in Sia biosynthesis, activation, transfer, modification and catabolism are shown with their intracellular localization (modified from Kelm and Schauer 1997).

SAM

O-acetyltransferases

8-O-methyltransferase

plasmamembrane

ST

CT

endoplasmaticreticulum

CoA-Ac

esterasesialidase

sialic acid transferand modification

catabolic reactions

Sialic acid biosynthesisand activation

Golgiapparatus

enzymes and cosubstratesCS

CT

ST

CoA-Ac

SAM

Neu5Gc

carbohydrate structures

ManNAc-6-P

Neu5Ac-9-P

Neu5Ac+ ATP

lysosome

esterasesialidase

ManNAc +pyruvate

lyase

cytosol

nucleus

CS + CTP

OMe

O-AcO-Ac

OMe

O-Ac O-Ac

CMP

sialyltransferases

CMP-Neu5Ac synthase

CMP-Sia transporter

acetyl coenzyme A

S-adenosylmethionine

8-O-methylated sialic acid OMe

O-acetylated sialic acid O-Ac

Neu5Ac

glycoconjugate without Sia

Glc

Fru-6-P

GlcNAc-6-P

GlcNAc-1-P

UDP-GlcNAc

ManNAc

+ Gln

+ CoA-Ac

+ UTP- UDP

+ ATP

+ PEP

- P

hydroxylaseCMP

CMP

CMP

ATAcCoA-Transporter

GlcNH2-6-P

AT

Page 14: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

8

1.2. Sialyltransferases The sialyltransferase (ST) gene family represents a group of enzymes that transfer Sia from CMP-Neu5Ac to carbohydrate groups of various glycoproteins and glycolipids. Each of the ST genes is differentially expressed in a tissue-, cell type, and stage specific manner to regulate the sialylation pattern of cells. Most of the STs known so far were cloned from mammalian species (e.g. Homo sapiens, Mus musculus, Rattus norvegicus) and birds (Gallus gallus). In addition, one ST cDNA has been isolated from a amphibian species (Xenopus laevis). For protostomian species only one ST has been reported which has been isolated from Drosophila melanogaster. All of these enzymes share certain sequence similarities named “sialylmotifs” indicating their common ancestry. In contrast to the apparent common ancestry of the eukaryotic STs, at least five unrelated classes of STs have been reported in bacteria: α2-8(9) polySTs of E. coli K1/K92 strains and N. meningitidis groups B/C (Steenbergen et al. 1992), α2-6 ST of a marine bacterium Photobacterium damsela (Yamamoto et al. 1996), α2-3/6 ST of N. meningitidis and N. gonorrhoeae involved in lipooligosaccharide synthesis (Gilbert et al. 1996; Wakarchuk et al. 2001), α2-3 ST of Haemophilus ducreyi/ H. influenzae/ Streptococcus agalactiae (Bozue et al. 1999; Hood et al. 2001), and α2-3 ST of C. jejuni (Gilbert et al. 2000). Interestingly, none of these bacterial STs shows significant similarity to vertebrate enzymes, suggesting that bacterial and vertebrate enzymes were invented independently. Moreover, the lack of sequence similarity even between bacterial enzymes indicates that STs have been invented several times in bacteria.

Mammalian STs and their substrate specificity To account for all sialylated structures described to date, the mammalian ST family consists of at least 20 STs. Classically, they have been divided into 4 groups, depending on the type of linkage formed and the nature of the monosaccharide acceptor (ST3Gal, ST6Gal, ST6GalNAc, and ST8Sia).

The galactose α2,3-sialyltransferases

Six different α2,3-ST cDNAs have been cloned and characterized enzymatically. ST3Gal I and II mediate the transfer of Sia residues to a Gal residue of terminal Galβ1-3GalNAc oligosaccharide found on glycolipids or glycoproteins (Kim et al. 1996; Kitagawa and Paulson 1994). ST3Gal III is believed to be mainly involved in the α2-3-sialylation of the less common type I (Galβ1-3GlcNAc) sequences in N-linked glycan chains, although it can also catalyze the α2-3-sialylation of type II (Galβ1-4GlcNAc) sequence in N-linked glycan chains. In contrast, ST3Gal IV prefers the type II sequence to the type I sequence (Sasaki et al. 1993). ST3Gal V is specifically involved in ganglioside GM3 formation. Since GM3 is a common precursor of almost all gangliosides, GM3 synthase plays an important role in the biosynthesis

Page 15: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

9

of more complex gangliosides, and primarily regulates the amount of GM3 (Stern et al. 2000). ST3Gal VI utilizes almost exclusively Galβ1,4GlcNAc on glycoproteins and glycolipids (Okajima et al. 1999a).

The galactose α2,6-sialyltransferases

The two galactose α2,3-sialyltransferases synthesize the Neu5Acα2,6Galβ1,4GlcNAc structure. This structure is found mainly in N-linked glycans, but it has also been found in some O-glycans, glycosphingolipids, and sialyloligosaccharides. Whereas ST6Gal I exhibits broad substrate specificities, ST6Gal II is mainly involved in the synthesis of sialyloligosaccharides (Takashima et al. 2002a).

The N-acetylgalactosamine α2,6-sialyltransferases

Six different GalNAcα2,6-STs are known to exist in mammalia. As far as seen with recombinant enzymes in in vitro assays, three of them (ST6GalNAc I, II and IV) catalyze the formation of α2,6-linkages onto GalNAc residues O-glycosidically linked to Ser/Thr on proteins, whereas the three other enzymes (ST6GalNAc III, V, and VI) catalyze the addition of Sia residues onto gangliosides. ST6GalNAc I exhibits the broadest specificity for the following structures: GalNAcα-O-Ser/Thr, Galβ1-3GalNAcα-O-Ser/Thr and Neu5Acα2-3Galβ1-3GalNAcα-O-Ser/Thr (Ikehara et al. 1999). ST6GalNAc II exhibits a narrower substrate specificity using Galβ1-3GalNAcα-O-Ser/Thr and Neu5Acα2-3Galβ1-3GalNAcα-O-Ser/Thr as substrates (Samyn-Petit et al. 2000). ST6GalNAc III and IV show a most restricted substrate specificity, only utilizing the Neu5Acα2-3Galβ1-3GalNAc trisaccharide sequence found on either O-glycosylproteins or gangliosides GM1b with ST6GalNAc III preferring glycolipids to O-glycans and ST6GalNAc IV preferring O-glycans to glycolipids (Lee et al. 1999). ST6GalNAc V seems to be specific for GM1b (Okajima et al. 1999b) whereas ST6GalNAc VI appears to be specific for glycolipid acceptors synthesizing all α-series gangliosides defined so far (Okajima et al. 2000).

The Sialyl α2,8-sialyltransferases

ST8Sia I catalyzes the synthesis of GD3 and GT3 (Nakayama et al. 1996). ST8Sia II and IV are the key enzymes that control the expression of polySia. Both ST8Sia II and IV can transfer multiple α2,8-linked Sia residues to an acceptor N-glycan containing a Neu5Acα2-3 (or 6) Galβ1-4GlcNAcβ1-R structure mainly found attached to the neural cell adhesion molecule N-CAM without participation of other enzymes (Angata and Fukuda 2003). ST8Sia III transfers Sias towards Neu5Acα2-3Galβ1-4GlcNAc sequences on both N-linked oligosaccharides of glycoproteins and α2-3sialylated glycosphingolipids, such as α2-3-sialylparagloboside and GM3 (Yoshida and Tsuji 2002). ST8Sia V exhibits activity toward GM1b, GD1a, GT1b, and GD3, and synthesizes GD1c, GT1a, GQ1b, and GT3, respectively

Page 16: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

10

(Kono et al. 1996). ST8Sia VI exhibites activity toward glycolipids, glycoproteins and oligosaccharides that have the Neu5Acα2,3(6)Gal sequence at the nonreducing end of their glycans. This enzyme forms Neu5Acα2,8Neu5Ac structures, but not oligosialic or polySia structures, preferring O-glycans to N-glycans and glycolipids as acceptor substrates (Takashima et al. 2002b).

Structure of vertebrate STs STs are localized in the Golgi apparatus and share with the other Golgi-resident glycosyltransferases a typical type II architecture consisting of a short NH2-terminal cytoplasmic tail, a transmembrane domain followed by a stem region, and a large C-terminal catalytic domain facing the luminal side (Paulson and Colley 1989). Comparison of peptide sequences strongly indicates that the length of the catalytic domain is relatively well conserved and variations in protein sizes are generally attributable to differences in the length of the stem region. The stem region can be defined as the peptide portion after the transmembrane domain that can be removed without altering the activity (Jeanneau et al. 2004). This peptide portion often contains Cys residues as well as several N- and O-glycosylation sites, which could contribute to a local conformation. Recent data suggest that the stem portion could also modulate the in vivo acceptor specificity (Legaigneur et al. 2001).

Figure IV Schematic representation of ST8Sia IV and ST6Gal I. ST8Sia IV, shown on the left, contains two disulfide bridges, denoted by thin lines with Cys (circled "c"), and the COOH terminus is close to sialylmotifs L and S. In contrast, STs other than those of the 2,8 ST gene family lack the Cys at the COOH-terminal end, and the COOH terminus is not close to sialylmotif L or sialylmotif S (shown on the right). Sialylmotifs L and S are boxed. ●, ●□, and □ denote Sia, CMP-Neu5Ac and CMP, respectively (Angata et al. 2001a).

Since all STs share the same monosaccharide donor and recognize similar acceptor substrates, it was expected that they would exhibit similar protein sequences. Surprisingly, amino acid sequences of the cloned sialytransferase cDNAs show only very little homology with the

Page 17: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

11

exception of short consensus sequences called sialylmotifs. So far, 4 conserved peptide regions have been identified and named sialylmotif 1 (L-motif), sialylmotif 2 (S-motif), sialymotif 3 and sialylmotif 4 (VS-motif) (Jeanneau et al. 2004). The functional significance of the sialymotifs 1 and 2 has been assessed by site-directed mutagenesis, using ST6Gal I as a model. The mutagenesis of the most conserved residues led to the conclusion that motif 1 is mainly involved in donor substrate binding (Datta and Paulson 1995), whereas mutations in sialylmotif 2 have been shown to affect both, donor and acceptor binding (Datta et al. 1998). Considering all known ST sequences, each of sialylmotif 1 and 2 contains at least one Cys residue highly conserved among all known STs. Mutation of the two conserved Cys yielded inactive enzymes (Datta et al. 1998; Datta and Paulson 1995; Drickamer 1993) and recent data suggest that they participate in the formation of an intramolecular disulfide linkage that is essential for maintaining an active conformation of the enzyme (Datta et al. 2001). Similar observations were made with the polyST ST8Sia IV (Angata et al. 2001a). In the latter case, a second intramolecular disulfide bond, which brings the sialylmotifs and the COOH terminus within proximity, has been evidenced (Fig. IV). Formation of dimers through disulfide bonds also has been demonstrated in the case of ST6Gal I (Ma and Colley 1996). This dimer comprises 20-30% of the enzyme found in liver Golgi and exhibits reduced catalytic activity because of its lower affinity for the sugar nucleotide donor, CMP-Neu5Ac. Recent data demonstrated that the mutation of a single Cys residue in the transmembrane domain of ST6Gal I abolished dimerization of the enzyme (Qian et al. 2001). Two other Cys residues located downstream of sialylmotif 2 could also be critical for in vivo enzyme activity (Qian et al. 2001). In addition, mutational analysis of sialylmotif 3 showed that the enclosed His and Tyr residues are essential for activity since their substitution by Ala yielded inactive enzymes (Jeanneau et al. 2004). Also highly conserved are a Glu residue separated by four amino acids from a His present in the sialylmotif 4 located at the C-terminus. The functional role of the His residue has been recently examined in the polySTs ST8Sia II and IV (Kitazume-Kawaguchi et al. 2001). In both enzymes the replacement of this His by Lys was shown to be critical for their catalytic activity since they exhibited no detectable enzyme activity. Similar results were obtained with the mutation of the invariant His of motif 4 in human ST3Gal I (Jeanneau et al. 2004) thus reinforcing the hypothesis of its involvement in catalysis. In contrast, the replacement of the conserved Glu residue by Gln did not inactivate the enzyme completely. However, the strict conservation of motif 4 in all vertebrate STs characterized to date strongly suggests that both residues are necessary for optimal catalytic efficiency of the glycosyl transfer reaction. Altogether, these findings indicate, that the C-terminal part of the catalytic domain (including motifs 3, 4, and part of motif 2) is primarily dedicated to the recognition of the acceptor substrate whereas the N-terminal part is mostly involved in nucleotide sugar binding.

Page 18: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

12

1.3. Lectins Proteins which recognize sugars (excluding antibodies and enzymes) are collectively called lectins. There are numerous lectins in nature which can recognize Sias. These can be grouped into three categories: vertebrate pathogen lectins, vertebrate endogenous lectins and lectins from other sources (plants, protostomes, etc.). The natural functions of the members of the third group are not clearly understood to date, although some are thought to be involved in antibacterial self-defense (Armstrong et al. 1996; Tunkijjanukij et al. 1998). Thus, this group is not dicussed here in detail.

Vertebrate pathogen lectins Various pathogens (viruses, bacteria, and protozoa) express lectins that recognize Sias. Some pathogens use these lectins for recognition and entry into the host cells, while others express soluble lectins with a Sia-binding property. In some cases, these pathogens do not only use Sias but also some other cell surface molecules for attachment, and different strains show different degrees of dependence toward a particular type of such cellular "receptors".

Viruses

Viral Sia-recognizing lectins are usually capable of agglutinating red blood cells and thus are traditionally called hemagglutinins rather than lectins. Many viruses utilize Sias to facilitate attachment to host cells, although their degree of dependence on Sias for this purpose varies. Hemagglutinins of influenza viruses (A, B, and C), Newcastle disease virus, mouse polyoma virus, Sendai virus, mouse hepatitis virus, and some others have been isolated and shown to bind Sias (Angata and Varki 2002).

Bacteria

Some bacterial pathogens interact with host cells in a Sia-dependent manner. The lectins involved are attached to the bacterial surface and are typically called adhesins. The identity of the recognition molecule is uncertain in many cases. There are also soluble bacterial lectins, which are typically toxins and show high specificity toward Sias (Sakarya and Oncu 2003).

Protozoa

A well-studied example is the strain EBA-175 of Plasmodium falciparum, the protozoon which causes the most virulent form of malaria. EBA-175 is suspected to be involved in plasmodium infection of erythrocytes (Orlandi et al. 1992). In contrast, other laboratory and field strains are believed to not require Sia for infection (Dolan et al. 1990). There are some other Sia-binding lectins expressed by various protozoal pathogens, such as Tritrichomonas foetus (Babal et al. 1999) and Entamoeba histolytica (Feingold et al. 1984).

Page 19: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

13

Vertebrate endogenous lectins Sia-dependent receptors have been recognized to play an important role in cellular communication of mammalian cells. Many of them are found in the immune system.

Complement Factor H

This first reported example of a vertebrate Sia-recognizing lectin is part of the alternative pathway of complement, one of the earliest response components of the innate immune system. Factor H is a regulatory (inhibitory) factor, which prevents the alternate complement pathway from inadvertently attacking "self" cells by binding to cell surface glycoconjugates containing Sias (Jozsi et al. 2004).

Selectins

Selectins are a family of three C-type lectins (defined by a shared structural motif and calcium requirement for carbohydrate recognition) which recognize particular carbohydrate structures containing Sias and fucose called sialyl-Lewisx and sialyl-Lewisa, with or without sulfation at adjacent residues. E-selectin is expressed on activated endothelial cells, L-selectin on leucocytes, and P-selectin on activated platelets or endothelial cells. All three are involved in leucocyte trafficking, e.g., initiation of leucocyte localization at the site of inflammation (E- and P-selectin) and leucocyte homing to lymph nodes (L-selectin) (Patel et al. 2002). P-selectin has also been shown to be involved in tumor metastasis (Kim et al. 1998). Although the cognate glycans of the natural binding partners for selectins contain Sia, these lectins do not really recognize the entire monosaccharide. Instead, apparently the selectins interact only with its negative charge, since Sia residues can be replaced with much simpler negatively charged groups (Kelm and Schauer 1997).

Siglecs

Siglecs (Sia-binding Ig superfamily lectins) are the largest family of mammalian Sia-recognizing lectins. The initial discovery of this lectin family came about through independent studies on sialoadhesin (Sn/Siglec-1), a macrophage lectin-like adhesion molecule (Crocker et al. 1991), and CD22 (Siglec-2), a B-cell restricted member of the Ig-superfamily (IgSF) (Powell et al. 1993) that plays an important role in regulating B-cell activation. Both molecules were found to mediate cell-cell interactions in vitro via recognition of sialylated glycoconjugates (Crocker et al. 1991; Powell et al. 1993). The cloning of Sn/Siglec-1 revealed striking sequence similarities to CD22/Siglec-2 and led to the demonstration that two other related IgSF proteins, myelin-associated glycoprotein (MAG/Siglec-4) and CD33 (Siglec-3), which were never previously known to bind Sias, were also members of the Siglec family (Freeman et al. 1995; Kelm et al. 1994). Seven additional members have been characterized over the last years (Angata et al. 2002; Crocker and Varki 2001c). All are highly related to

Page 20: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

14

CD33/Siglec-3 and to each other, sharing about 50-80% sequence similarity. Therefore, they have been described collectively as “CD33-related siglecs” and can be considered as a subgroup separate from Sn, CD22/Siglec-2 and MAG/Siglec-4, both from functional and from evolutionary perspectives (Crocker 2002). With the exception of Sn/Siglec-1, all human siglec-encoding genes are found in close proximity on chromosome 19. The structures of these genes are very similar suggesting that they have arisen by relatively recent gene duplication events.

Figure V The currently known human Siglecs (modified from Crocker and Varki 2001b). Tyr residues within potential signaling motifs are shown as circles. Those motifs that fit the consensus sequence for an immunoreceptor Tyr-based inhibition motif (ITIM) are shown in red. Potential N-linked glycans are indicated in ball-and stick form. In addition positions of key residues in Sn that bind the Neu5Ac portion of 3’ sialyllactose, as revealed in a ligand-bound crystal structure of the Sn N-terminal domain (May et al. 1998), are shown.

Structurally, all siglecs are type I membrane proteins, consisting of one N-terminal V-set Ig-domain, a variable number of C2-set Ig-domains, a single-pass transmembrane domain, and a cytoplasmic tail. Ig-domains are characterized by two opposing β-sheets typically held together by a disulfide bridge (Barclay and Brown 1997). However, an unusual arrangement of Cys residues occurs in the two N-terminal domains of all siglecs described to date. Within the V-set domain this leads to an intrasheet disulfide bridge allowing a somewhat less compact stacking of the two β-sheets. Furthermore, two additional Cys residues form an interdomain disulfide bond between domains 1 and 2 (Pedraza et al. 1990). The first Ig-like domain mediates the recognition of sialylated glycans. Involved are some well-conserved amino acid residues, such as an Arg on β-strand F, and two aromatic amino acids, one located near the N-terminus, the other downstream of the conserved Arg. The conserved Arg forms a

SS

VS S

C2

COOH

SialoadhesinSiglec-1

COOH

MAGSiglec-4

COOH

CD33Siglec-3

COOH

Y

S

CD22Siglec-2

SS

VS S

S

S

C2

NH2

Arg97

Trp2

Trp106

A

F

GNeu5Ac

COOH

Siglec-10

VS S

S

NH2

SSSC2

SSC2

SSC2

SSC2

COOH

Siglec-5

NH2

S

SS

C2

SS

C2

SS

C2

VS S

S

COOH

Siglec-6

VS S

NH2

S

SS

C2

SS

C2

S

COOH

Siglec-7

S

NH2

SSSC2

SSC2

VS S

COOH

NH2

S

S

VS S

SSC2

SSC2

COOH

Siglec-9

SSC2

VS S

S

NH2

SSSC2

Siglec-8

S

S

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

S

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

NH2

NH2

NH2

VS S

S

SS

C2

SS

C2

SS

C2

SS

C2

NH2

VS S

S

S

SS

C2

VS S

S

COOH

VS S

S

NH2

SSSC2

SSC2

SSC2

SSC2

Siglec-11

ITIM motifTyr-based signaling motif

CD33-related Siglecs

SS

VS S

C2

COOH

SialoadhesinSiglec-1

COOH

MAGSiglec-4

COOH

CD33Siglec-3

COOH

Y

S

CD22Siglec-2

SS

VS S

S

S

C2

NH2

Arg97

Trp2

Trp106

A

F

GNeu5Ac

COOH

Siglec-10

VS S

S

NH2

SSSC2

SSC2

SSC2

SSC2

COOH

Siglec-5

NH2

S

SS

C2

SS

C2

SS

C2

VS S

S

COOH

Siglec-6

VS S

NH2

S

SS

C2

SS

C2

S

COOH

Siglec-7

S

NH2

SSSC2

SSC2

VS S

COOH

NH2

S

S

VS S

SSC2

SSC2

COOH

Siglec-9

SSC2

VS S

S

NH2

SSSC2

Siglec-8

S

S

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

S

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

SS

C2

NH2

NH2

NH2

VS S

S

SS

C2

SS

C2

SS

C2

SS

C2

NH2

VS S

S

S

SS

C2

VS S

S

COOH

VS S

S

NH2

SSSC2

SSC2

SSC2

SSC2

Siglec-11

ITIM motifTyr-based signaling motif

CD33-related Siglecs

Page 21: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

15

salt bridge with the carboxylate group of bound Sia. Experimental mutation of this residue markedly diminishes binding in all Siglecs studied to date (Angata and Varki 2000b; Tang et al. 1997; van der Merwe et al. 1996; Vinson et al. 1996). Each siglec has a distinct preference for specific types of Sia and linkages to subterminal sugars (Crocker and Varki 2001c) as well as distinct expression pattern in different cell types, indicating that they perform highly specific functions (see table 1). Many of the siglecs have potential Tyr phosphorylation sites in the context of an immunoreceptor-based inhibitory motif (ITIM) in their cytoplasmic tail, suggesting their involvement in intracellular signaling pathways (Crocker and Varki 2001a). A schematic representation of the Siglecs is given in Fig. V. Siglecs have only been documented in mammals and birds so far (Crocker 2002; Dulac et al. 1992). Interestingly, the comparison of siglec-related genes from human and great apes revealed differential and complex evolutionary paths indicating a high rate of evolution within this gene family (Angata et al. 2001b; Brinkman-Van der Linden et al. 2000; Gagneux and Varki 2001; Varki 2001a; Varki 2001b). Although some Ig superfamily genes are highly conserved from nematodes to primates, there seem to be no distinct siglec homologs in Caenorhabditis elegans or Drosophila melanogaster, two of the most extensively studied protostome lineage animals (Angata and Varki 2000a).

Siglecs of the immune system

With the exception of the myelin-associated glycoprotein (MAG/Siglec-4a) and Schwann cell myelin protein (SMP/Siglec-4b), all siglecs found so far are expressed by cells of the immune system. Sn is one of the receptors exclusively expressed on tissue macrophages, whereas CD22/Siglec-2 is exclusively expressed on B-cells. Most of the CD33-related Siglecs are differentially expressed on the cells of the haematopoietic system, where they seem to be involved in regulating cellular activation within the immune system (Crocker and Varki 2001b).

Table 1 Tissue distribution and glycan recognition of Siglecs

Name Other name Tissue distribution Glycan recognized

Siglec-1 Sialoadhesin Macrophages α2-3 Sia>α2-6 Sia>>α2-8 Sia Siglec-2 CD22 B cells α2-6 Sia Siglec-3 CD33 Myeloid progenitors, mature monocytes α2-6 Sia>α2-3 Sia Siglec-4 MAG Oligodendrocytes, Schwann cells α2-3 Sia Siglec-5 Monocytes, neutrophils α2-3 Sia/α2-6 Sia/α2-8 Sia Siglec-6 B cells, placental trophoblasts Siaα2-6GalNAc (sialylTn) Siglec-7 NK cells, monocytes α2-6 Sia/α2-8 Sia/α2-3 Sia (3 Ig D) α2-6 Sia>α2-8 Sia>α2-3 Sia (2 Ig D) Siglec-8 Eosinophils, mast cells α2-3 Sia>α2-6 Sia Siglec-9 Monocytes, neutrophils, NK cells α2-3 Sia/SLeX>>α2-6 Sia>>α2-8 Sia Siglec-10 B cells, eosinophils, monocytes α2-3 Sia/α2-6 Sia Siglec-11 Macrophages, brain microglia α2-8 Sia

Page 22: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

16

Myelin-associated glycoprotein (MAG/Siglec-4a)

Together with its putative orthologue Schwann cell myelin protein (SMP/Siglec-4b) the myelin-associated glycoprotein (MAG/Siglec-4a) is the only Siglec to be found exclusively in the nervous system where it is expressed by oligodendrocytes (in central nervous system, CNS) and Schwann cells (in peripheral nervous system, PNS). It is a minor constituent of myelin, comprising ~1% and ~0.1% of all myelin proteins in the CNS and PNS, respectively (Trapp 1990). Because of its early expression during myelination, and its location in the myelin membrane immediately next to the axon, MAG has been suggested to play a role in both the formation of myelin and in the maintenance of mature myelin by stabilizing the myelin-axon interface (Filbin 1995; Trapp 1990). In addition to its role in the intact nervous system, MAG has been shown to be a potent inhibitor of axonal regeneration in vitro (McKerracher et al. 1994; Mukhopadhyay et al. 1994) and is consequently likely to contribute to the lack of regeneration of the adult mammalian CNS after injury (Filbin 1996). Interestingly, MAG is a bifunctional molecule, which can either promote or inhibit axonal regeneration, depending on the age and the type of the neuron. MAG promotes axonal regeneration from dorsal root ganglion (DRG) neurons up to post-natal day 3 and from embryonic spinal motor neurons (Mukhopadhyay et al. 1994; Turnley and Bartlett 1998), but inhibits neurite outgrowth from older DRG neurons and all other post-natal neurons tested to date (deBellard et al. 1996). The capability of MAG to influence axonal outgrowth and to stabilize the axon/myelin interaction requires that MAG binds to receptors on the neuronal cell surface. The gangliosides GD1a and GT1b are known as functional MAG ligands on neuronal cells (Vinson et al. 2001; Vyas et al. 2002). It has been postulated that the MAG–dependent clustering of GD1a and GT1b in lipid rafts activates a Rho-dependent signalling pathway that leads to the inhibition of neurite outgrowth (McKerracher 2002). Notably, this Rho activation seems to be dependent on the participation of the p75 neurotrophin receptor (p75NTR) (Yamashita et al. 2002). In addition recent studies have shown that MAG can mediate high-affinity, Sia-independent interactions with the Nogo receptor, a glycosylphosphatidylinositol-linked protein that is important in neurite inhibition (Hunt et al. 2002; McGee and Strittmatter 2003). Taken together, these findings suggest that MAG-dependent inhibition of neurite outgrowth depends on a complex four-way molecular interaction bewteen MAG, GT1b, p75NTR and Nogo receptor. MAG is encoded by a single gene as two alternatively spliced isoforms. L-MAG (for large MAG) and S-MAG (for small MAG) are identical in their extracellular and transmembrane domains and share a common region in their cytoplasmic domain but are distinct in their carboxyl termini (Fujita et al. 1998). In the CNS, the expression pattern of the L- and S-MAG is developmentally regulated with L-MAG being expressed predominantly during early stages of myelination and S-MAG being the major isoform in adulthood. In the PNS, S-MAG is mainly expressed throughout development and in adulthood (Inuzuka et al. 1991; Pedraza et al. 1991). The differential expression of these two isoforms indicates different functions for S-

Page 23: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

17

and L-MAG. The role of L-MAG as a transmembrane receptor is supported by its activation of the non-receptor Tyr kinase Fyn (Umemori et al. 1994), its Tyr phosphorylation site (Jaramillo et al. 1994) and its interaction with the calcium binding protein S100beta (Kursula et al. 2000). The potential of S-MAG to interact with tubulin and microtubules supports its role as a cell adhesion molecule linking the axonal surface and the glial cell cytoskeleton (Kursula et al. 2001). How these molecular interactions affect properties of myelin-forming cells remains to be solved. SMP/Siglec-4b is an avian Siglec expressed on Schwann cells and oligodendrocytes (Dulac et al. 1992). Molecular cloning of SMP and primary sequence analysis revealed certain degree of sequence similarity between SMP and MAG, particularly in the transmembrane domain (80.0% identity), first Ig-like domain (61.4%), and an eight-amino acid segment at the C-terminal that include a Tyr residue (Dulac et al. 1992). It is still unclear whether SMP is the avian ortholog of MAG, although several studies have provoded some evidence against this possibility (Collins et al. 1997; Dulac et al. 1992). Nevertheless, SMP shows almost the identical binding specificity towards sialylated glycolipids as MAG (Collins et al. 1997), suggesting that the molecule is a functional equivalent of MAG in the avian nervous system.

1.4. Evolutionary aspects of sialic acid biology In contrast to the traditional view that Sias were invented in the deuterostome lineage, recent studies suggest that the origin of the Sia biosynthetic pathway predates the split of protostome and deuterostome lineages, possibly even the separation of the three domains of life (Angata and Varki 2002). Taken together, based on the known occurrence of Sias, the enzymes involved in the Sia pathway and the binding partners, three different evolutionary scenarios have been proposed (see Fig. VI) (Angata and Varki 2002). In the first scenario, the Sia biosynthesis pathway predates the emergence of the three major branches of life. In this scenario, the genes of this pathway are evolutionarily very ancient but were partially or completely lost in most lineages remaining only in deuterostomes, a few protostomes, and some prokaryotes. The maintenance of the Sia pathway in microbial pathogens could then be explained by its survival advantage in vertebrate hosts and its frequent lateral transfer between bacteria. In the second scenario, the Sias were invented in bacteria after the split of the bacteria and archaea/eukarya clades and later introduced via lateral gene transfer(s) to a limited number of organisms belonging to archaea and eukarya. These possibly include the single-cellular ancestor of eukaryotic animals which had a much better chance of incorporating foreign DNA into its genome compared with a multicellular organism (Angata and Varki 2002). Also in this scenario extensive gene losses in many of the lineages descended from such animal ancestors have to be assumed. In this regard, a curious fact is that a large number of protostomes seem to have soluble lectins that recognize Sias (Varki

Page 24: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

18

1997). Since many of these lectins seem to be involved in host defense, it is possible that an early ancestor in the protostome lineage eliminated expression of Sias and instead started to recognize it as a foreign epitope on invading pathogens. In the third scenario, sialylation was a unique invention in metazoan animals (deuterostomes + protostomes) and all other life forms that have Sias acquired the genes necessary to synthesize sialylated glycans from descendants of these animals. Thereafter, the pathway would have been prominently expressed in the deuterostomes and completely or partially eliminated in the protostomes. The occurrence of Sias in microbial pathogens would then have resulted either from lateral gene transfer from animals prior to the protostome:deuterostome split (as in pathogenic bacteria) or by convergent evolution involving genes that cannot be identified by searching for sequence similarities (as in some pathogenic fungi) (Angata and Varki 2002).

Figure VI Possible scenarios for the invention and spread of Sias. Presence of Sias are indicated with solid circles, and possible presence of Sia in archaea (M. jannaschii) predicted from genomic DNA sequence analysis is indicated with an open circle. In scenario 1, Sia (or rather the biosynthetic machinery for the expression of Sia) was invented in the common ancestor of cellular life, inherited in all three domains of life, but subsequently lost in most of the lineages. In scenario 2, Sia was invented in bacteria and spread to other domains of life via horizontal (lateral) transfer. In scenario 3, Sia was invented in eukaryotes (prior to the split of deuterostome and protostome lineages) and spread to other domains via horizontal transfer (Angata and Varki 2002).

1.5. Project objectives Sialylation is an important carbohydrate modification of glycoconjugates and Sias have been implicated in various biological phenomena. In spite of the crucial role Sias seem to play in biology, little is known about the evolution of these molecules. Many questions remain with regard to the evolutionary origin and evolutionary paths of Sias and proteins involved in Sia biology. The primary aim of this study was to provide new insights into the evolutionary history of sialylation using genomic information currently available. In particular, database searches in the genomes of Takifugu rubripes and Danio rerio would give information about the set of proteins involved in Sia biology in species which diverged about 450 millions years ago. Phylogenetic trees would help to further elucidate the evolutionary relationships of these genes. Furthermore, sequence analysis in comparison with mammalian orthologues would provide clues to functionally important domains of these molecules as well as confirm or reveal findings about essential residues required for specific protein function. In addition,

Page 25: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

19

analysis of protein expression and function would give information about potential physiological roles of these proteins compared to mammalian species.

Page 26: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

20

1.6. Abbreviations CMP Cytidine monophosphate CMP-Neu5Ac Cytidine monophosphate N-acetylneuraminic acid CMP-Neu5Gc Cytidine monophosphate N-glycolylneuraminic acid CNS Central nervous system CTP Cytidine triphosphate DRG Dorsal root ganglion Gal Galactose GalNAc N-acetylgalactosamine GlcNAc N-acetylglucosamine GPI Glycosylphosphatidyl-inositol Ig Immunoglobulin ITIM immunoreceptor tyrosine-based inhibitory motif KDN Deaminoneuraminic acid MAG Myelin-associated glycoprotein Neu Neuraminic acid Neu5,9Ac2 N-acetyl-9-O-acetylneuraminic acid Neu5Ac N-acetylneuraminic acid Neu5Gc N-glycolylneuraminic acid p75NTR p75 neurotrophin receptor PNS Peripheral nervous system Sia Sialic acid Siglec Sialic acid-binding Ig superfamily lectins SMP Schwann cell myelin protein Sn Sialoadhesin ST Sialyltransferase

Abbreviations and symbols for amino acids Amino acid Three letter

code One letter code

Amino acid Three letter code

One letter code

Alanine Ala A Leucine Leu L Arginine Arg R Lysine Lys K Asparagine Asn N Methionine Met M Aspartic acid Asp D Phenylalanine Phe F Cysteine Cys C Proline Pro P Glutamine Gln Q Serine Ser S Glutamic acid Glu E Threonine Thr T Glycine Gly G Tryptophan Trp W Histidine His H Tyrosine Tyr Y Isoleucine Ile V Valine Val V

Page 27: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

21

1.7. References Altmann, F., Staudacher, E., Wilson, I. B., and Marz, L. (1999) Insect cells as hosts for the expression of recombinant glycoproteins. Glycoconj.J, 16, 109-123.

Alviano, C. S., Travassos, L. R., and Schauer, R. (1999) Sialic acids in fungi: a minireview. Glycoconj.J, 16, 545-554.

Angata, K. and Fukuda, M. (2003) Polysialyltransferases: major players in polysialic acid synthesis on the neural cell adhesion molecule. Biochimie, 85, 195-206.

Angata, K., Yen, T. Y., El Battari, A., Macher, B. A., and Fukuda, M. (2001a) Unique disulfide bond structures found in ST8Sia IV polysialyltransferase are required for its activity. J Biol Chem, 276, 15369-15377.

Angata, T., Kerr, S. C., Greaves, D. R., Varki, N. M., Crocker, P. R., and Varki, A. (2002) Cloning and characterization of human Siglec-11. A recently evolved signaling that can interact with SHP-1 and SHP-2 and is expressed by tissue macrophages, including brain microglia. J Biol.Chem, 277, 24466-24474.

Angata, T. and Varki, A. (2000a) Cloning, characterization, and phylogenetic analysis of siglec-9, a new member of the CD33-related group of siglecs. Evidence for co-evolution with sialic acid synthesis pathways. J Biol Chem, 275, 22127-22135.

Angata, T. and Varki, A. (2000b) Siglec-7: a sialic acid-binding lectin of the immunoglobulin superfamily. Glycobiology, 10, 431-8.

Angata, T. and Varki, A. (2002) Chemical diversity in the sialic acids and related alpha-keto acids: an evolutionary perspective. Chem.Rev., 102, 439-469.

Angata, T., Varki, N. M., and Varki, A. (2001b) A second uniquely human mutation affecting sialic acid biology. J.Biol.Chem., 276, 40282-40287.

Armstrong, P. B., Swarnakar, S., Srimal, S., Misquith, S., Hahn, E. A., Aimes, R. T., and Quigley, J. P. (1996) A cytolytic function for a sialic acid-binding lectin that is a member of the pentraxin family of proteins. J Biol.Chem, 271, 14717-14721.

Babal, P., Pindak, F. F., Russell, L. C., and Gardner, W. A., Jr. (1999) Sialic acid-specific lectin from Tritrichomonas foetus. Biochim.Biophys.Acta, 1428, 106-116.

Barclay, A. N. and Brown, M. H. (1997) Heterogeneity of interactions mediated by membrane glycoproteins of lymphocytes. Biochem Soc.Trans., 25, 224-228.

Page 28: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

22

Belen, C. M., Gao, W., Herrera, M., Alroy, J., Moore, J. B., Beverley, S. M., and Pereira, M. A. (2000) Heterologous expression of Trypanosoma cruzi trans-sialidase in Leishmania major enhances virulence. Infect.Immun., 68, 2728-2734.

Bozue, J. A., Tullius, M. V., Wang, J., Gibson, B. W., and Munson, R. S., Jr. (1999) Haemophilus ducreyi produces a novel sialyltransferase. Identification of the sialyltransferase gene and construction of mutants deficient in the production of the sialic acid-containing glycoform of the lipooligosaccharide. J Biol.Chem, 274, 4106-4114.

Bratosin, D., Mazurier, J., Debray, H., Lecocq, M., Boilly, B., Alonso, C., Moisei, M., Motas, C., and Montreuil, J. (1995) Flow cytofluorimetric analysis of young and senescent human erythrocytes probed with lectins. Evidence that sialic acids control their life span. Glycoconjugate J., 12, 258-267.

Brinkman-Van der Linden, E.C., Sjoberg, E. R., Juneja, L. R., Crocker, P. R., Varki, N., and Varki, A. (2000) Loss of N-glycolylneuraminic acid in human evolution. Implications for sialic acid recognition by siglecs. J.Biol.Chem., 275, 8633-8640.

Collins, B. E., Kiso, M., Hasegawa, A., Tropak, M. B., Roder, J. C., Crocker, P. R., and Schnaar, R. L. (1997) Binding specificities of the sialoadhesin family of I-type lectins. Sialic acid linkage and substructure requirements for binding of myelin-associated glycoprotein, Schwann cell myelin protein, and sialoadhesin. J Biol.Chem, 272, 16889-16895.

Crocker, P. R. (2002) Siglecs: sialic-acid-binding immunoglobulin-like lectins in cell-cell interactions and signalling. Curr.Opin.Struct.Biol., 12, 609-615.

Crocker, P. R., Kelm, S., Dubois, C., Martin, B., McWilliam, A. S., Shotton, D. M., Paulson, J. C., and Gordon, S. (1991) Purification and properties of sialoadhesin, a sialic acid-binding receptor of murine tissue macrophages. Embo J, 10, 1661-1669.

Crocker, P. R. and Varki, A. (2001a) Siglecs in the immune system. Immunology, 103, 137-145.

Crocker, P. R. and Varki, A. (2001b) Siglecs, sialic acids and innate immunity. Trends Immunol, 22, 337-342.

Datta, A. K., Chammas, R., and Paulson, J. C. (2001) Conserved cysteines in the sialyltransferase sialylmotifs form an essential disulfide bond. J Biol Chem, 276, 15200-15207.

Datta, A. K. and Paulson, J. C. (1995) The sialyltransferase "sialylmotif" participates in binding the donor substrate CMP-NeuAc. J Biol Chem, 270, 1497-1500.

Page 29: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

23

Datta, A. K., Sinha, A., and Paulson, J. C. (1998) Mutation of the sialyltransferase S-sialylmotif alters the kinetics of the donor and acceptor substrates. J Biol Chem, 273, 9608-9614.

Davidson, D. J. and Castellino, F. J. (1993) The influence of the nature of the asparagine 289-linked oligosaccharide on the activation by urokinase and lysine binding properties of natural and recombinant human plasminogens. J Clin.Invest, 92, 249-254.

deBellard, M. E., Tang, S., Mukhopadhyay, G., Shen, Y. J., and Filbin, M. T. (1996) Myelin-associated glycoprotein inhibits axonal regeneration from a variety of neurons via interaction with a sialoglycoprotein. Mol Cell Neurosci., 7, 89-101.

Dolan, S. A., Miller, L. H., and Wellems, T. E. (1990) Evidence for a switching mechanism in the invasion of erythrocytes by Plasmodium falciparum. J Clin.Invest, 86, 618-624.

Drickamer, K. (1993) A conserved disulphide bond in sialyltransferases. Glycobiology, 3, 2-3.

Dulac, C., Tropak, M. B., Cameron-Curry, P., Rossier, J., Marshak, D. R., Roder, J., and Le Douarin, N. M. (1992) Molecular characterization of the Schwann cell myelin protein, SMP: structural similarities within the immunoglobulin superfamily. Neuron, 8, 323-334.

Feingold, C., Mirelman, D., Lotan, D., and Lotan, R. (1984) Enhancement by retinoic acid of the sensitivity of different tumor cell lines to the sialic acid-specific toxin of Entamoeba histolytica. Cancer Lett., 24, 263-271.

Filbin, M. T. (1995) Myelin-associated glycoprotein: a role in myelination and in the inhibition of axonal regeneration? Curr Opin Neurobiol, 5, 588-95.

Filbin, M. T. (1996) The Muddle with MAG. Mol Cell Neurosci, 8, 84-92.

Freeman, S. D., Kelm, S., Barber, E. K., and Crocker, P. R. (1995) Characterization of CD33 as a new member of the sialoadhesin family of cellular interaction molecules. Blood, 85, 2005-2012.

Fujita, N., Kemper, A., Dupree, J., Nakayasu, H., Bartsch, U., Schachner, M., Maeda, N., Suzuki, K., and Popko, B. (1998) The cytoplasmic domain of the large myelin-associated glycoprotein isoform is needed for proper CNS but not peripheral nervous system myelination. Journal Of Neuroscience, 18, 1970-8.

Gagneux, P. and Varki, A. (2001) Genetic differences between humans and great apes. Mol.Phylogenet.Evol., 18, 2-13.

Page 30: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

24

Gil-Serrano, A. M., Rodriguez-Carvajal, M. A., Tejero-Mateo, P., Espartero, J. L., Thomas-Oates, J., Ruiz-Sainz, J. E., and Buendia-Claveria, A. M. (1998) Structural determination of a 5-O-methyl-deaminated neuraminic acid (Kdn)-containing polysaccharide isolated from Sinorhizobium fredii. Biochem J, 334 ( Pt 3), 585-594.

Gilbert, M., Brisson, J. R., Karwaski, M. F., Michniewicz, J., Cunningham, A. M., Wu, Y., Young, N. M., and Wakarchuk, W. W. (2000) Biosynthesis of ganglioside mimics in Campylobacter jejuni OH4384. Identification of the glycosyltransferase genes, enzymatic synthesis of model compounds, and characterization of nanomole amounts by 600-mhz (1)h and (13)c NMR analysis. J Biol.Chem, 275, 3896-3906.

Gilbert, M., Watson, D. C., Cunningham, A. M., Jennings, M. P., Young, N. M., and Wakarchuk, W. W. (1996) Cloning of the lipooligosaccharide alpha-2,3-sialyltransferase from the bacterial pathogens Neisseria meningitidis and Neisseria gonorrhoeae. J Biol Chem, 271, 28271-28276.

Hood, D. W., Cox, A. D., Gilbert, M., Makepeace, K., Walsh, S., Deadman, M. E., Cody, A., Martin, A., Mansson, M., Schweda, E. K., Brisson, J. R., Richards, J. C., Moxon, E. R., and Wakarchuk, W. W. (2001) Identification of a lipopolysaccharide alpha-2,3-sialyltransferase from Haemophilus influenzae. Mol Microbiol, 39, 341-350.

Hunt, D., Coffin, R. S., and Anderson, P. N. (2002) The Nogo receptor, its ligands and axonal regeneration in the spinal cord; a review. J Neurocytol., 31, 93-120.

Ikehara, Y., Kojima, N., Kurosawa, N., Kudo, T., Kono, M., Nishihara, S., Issiki, S., Morozumi, K., Itzkowitz, S., Tsuda, T., Nishimura, S. I., Tsuji, S., and Narimatsu, H. (1999) Cloning and expression of a human gene encoding an N- acetylgalactosamine-alpha2,6-sialyltransferase (ST6GalNAc I): a candidate for synthesis of cancer-associated sialyl-Tn antigens. Glycobiology, 9, 1213-1224.

Inoue, S. (1965) Isolation of new polysaccharide sulphates from Charonia lampas. Biochim.Biophys.Acta, 101, 16-25.

Inuzuka, T., Fujita, N., Sato, S., Baba, H., Nakano, R., Ishiguro, H., and Miyatake, T. (1991) Expression of the large myelin-associated glycoprotein isoform during the development in the mouse peripheral nervous system. Brain Res., 562, 173-175.

Izumida, Y., Seiyama, A., and Maeda, N. (1991) Erythrocyte aggregation: bridging by macromolecules and electrostatic repulsion by sialic acid. Biochim.Biophys.Acta, 1067, 221-226.

Page 31: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

25

Jaramillo, M. L., Afar, D. E., Almazan, G., and Bell, J. C. (1994) Identification of tyrosine 620 as the major phosphorylation site of myelin-associated glycoprotein and its implication in interacting with signaling molecules. J Biol.Chem, 269, 27240-27245.

Jarvis, G. A. (1995) Recognition and control of neisserial infection by antibody and complement. Trends Microbiol, 3, 198-201.

Jeanneau, C., Chazalet, V., Auge, C., Soumpasis, D. M., Harduin-Lepers, A., Delannoy, P., Imberty, A., and Breton, C. (2004) Structure-function analysis of the human sialyltransferase ST3Gal I- Role of N-glycosylation and a novel conserved sialylmotif. J Biol.Chem.,

Jozsi, M., Manuelian, T., Heinen, S., Oppermann, M., and Zipfel, P. F. (2004) Attachment of the soluble complement regulator factor H to cell and tissue surfaces: relevance for pathology. Histol.Histopathol., 19, 251-258.

Kannagi, R. (2002) Regulatory roles of carbohydrate ligands for selectins in the homing of lymphocytes. Curr.Opin.Struct.Biol., 12, 599-608.

Kelm, S., Pelz, A., Schauer, R., Filbin, M. T., Tang, S., De Bellard, M. E., Schnaar, R. L., Mahoney, J. A., Hartnell, A., Bradfield, P., and et al. (1994) Sialoadhesin, myelin-associated glycoprotein and CD22 define a new family of sialic acid-dependent adhesion molecules of the immunoglobulin superfamily. Curr Biol, 4, 965-72.

Kelm, S. and Schauer, R. (1997) Sialic acids in molecular and cellular interactions. Int.Rev.Cytol., 175, 137-240.

Kim, K., Lawrence, S. M., Park, J., Pitts, L., Vann, W. F., Betenbaugh, M. J., and Palter, K. B. (2002) Expression of a functional Drosophila melanogaster N-acetylneuraminic acid (Neu5Ac) phosphate synthase gene: evidence for endogenous sialic acid biosynthetic ability in insects. Glycobiology, 12, 73-83.

Kim, Y. J., Borsig, L., Varki, N. M., and Varki, A. (1998) P-selectin deficiency attenuates tumor growth and metastasis. Proc.Natl.Acad.Sci.U.S.A, 95, 9325-9330.

Kim, Y. J., Kim, K. S., Kim, S. H., Kim, C. H., Ko, J. H., Choe, I. S., Tsuji, S., and Lee, Y. C. (1996) Molecular Cloning and Expression of Human Gal[beta]1,3GalNAc [alpha]2,3-Sialyltransferase (hST3Gal II). Biochemical and Biophysical Research Communications, 228, 324-327.

Kitagawa, H. and Paulson, J. C. (1994) Differential expression of five sialyltransferase genes in human tissues. J.Biol.Chem., 269, 17872-17878.

Page 32: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

26

Kitazume-Kawaguchi, S., Kabata, S., and Arita, M. (2001) Differential biosynthesis of polysialic or disialic acid structure by ST8Sia II and ST8Sia IV. J Biol Chem, 276, 15696-15703.

Koles, K., Irvine, K. D., and Panin, V. M. (2004) Functional characterization of Drosophila sialyltransferase. J Biol.Chem., 279, 4346-4357.

Kono, M., Yoshida, Y., Kojima, N., and Tsuji, S. (1996) Molecular cloning and expression of a fifth type of alpha2,8-sialyltransferase (ST8Sia V). Its substrate specificity is similar to that of SAT-V/III, which synthesize GD1c, GT1a, GQ1b and GT3. J Biol Chem, 271, 29366-29371.

Kursula, P., Lehto, V. P., and Heape, A. M. (2000) S100beta inhibits the phosphorylation of the L-MAG cytoplasmic domain by PKA. Brain Res.Mol Brain Res., 76, 407-410.

Kursula, P., Lehto, V. P., and Heape, A. M. (2001) The small myelin-associated glycoprotein binds to tubulin and microtubules. Brain Res.Mol Brain Res., 87, 22-30.

Lee, Y. C., Kaufmann, M., Kitazume-Kawaguchi, S., Kono, M., Takashima, S., Kurosawa, N., Liu, H., Pircher, H., and Tsuji, S. (1999) Molecular cloning and functional expression of two members of mouse NeuAcalpha2,3Galbeta1,3GalNAc GalNAcalpha2,6-sialyltransferase family, ST6GalNAc III and IV. J Biol Chem, 274, 11958-11967.

Legaigneur, P., Breton, C., El Battari, A., Guillemot, J. C., Auge, C., Malissard, M., Berger, E. G., and Ronin, C. (2001) Exploring the acceptor substrate recognition of the human beta- galactoside alpha 2,6-sialyltransferase. J Biol Chem, 276, 21608-21617.

Lerouge, P., Cabanes-Macheteau, M., Rayon, C., Fischette-Laine, A. C., Gomord, V., and Faye, L. (1998) N-glycoprotein biosynthesis in plants: recent developments and future trends. Plant Mol Biol., 38, 31-48.

Ma, J. Y. and Colley, K. J. (1996) A disulfide-bonded dimer of the Golgi beta-galactoside alpha 2,6- sialyltransferase is catalytically inactive yet still retains the ability to bind galactose. J.Biol.Chem., 271, 7758-7766.

Maes, E., Wieruszeski, J. M., Plancke, Y., and Strecker, G. (1995) Structure of three Kdn-containing oligosaccharide-alditols released from oviducal secretions of Ambystoma tigrinum: characterization of the carbohydrate sequence Fuc (alpha 1-5) [Fuc (alpha 1-4)] Kdn (alpha 2-3/6). FEBS Lett., 358, 205-210.

Malykh, Y. N., Krisch, B., Gerardy-Schahn, R., Lapina, E. B., Shaw, L., and Schauer, R. (1999) The presence of N-acetylneuraminic acid in Malpighian tubules of larvae of the cicada Philaenus spumarius. Glycoconj.J, 16, 731-739.

Page 33: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

27

Marchal, I., Jarvis, D. L., Cacan, R., and Verbert, A. (2001) Glycoproteins from insect cells: sialylated or not? Biol.Chem., 382, 151-159.

Matsumoto, S., Ikura, K., Ueda, M., and Sasaki, R. (1995) Characterization of a human glycoprotein (erythropoietin) produced in cultured tobacco cells. Plant Mol Biol., 27, 1163-1172.

May, A. P., Robinson, R. C., Vinson, M., Crocker, P. R., and Jones, E. Y. (1998) Crystal structure of the N-terminal domain of sialoadhesin in complex with 3' sialyllactose at 1.85 A resolution. Mol Cell, 1, 719-728.

McGee, A. W. and Strittmatter, S. M. (2003) The Nogo-66 receptor: focusing myelin inhibition of axon regeneration. Trends Neurosci., 26, 193-198.

McKerracher, L. (2002) Ganglioside rafts as MAG receptors that mediate blockade of axon growth. Proc.Natl.Acad.Sci.U.S.A, 99, 7811-7813.

McKerracher, L., David, S., Jackson, D. L., Kottis, V., Dunn, R. J., and Braun, P. E. (1994) Identification of myelin-associated glycoprotein as a major myelin-derived inhibitor of neurite growth. Neuron, 13, 805-811.

Mukhopadhyay, G., Doherty, P., Walsh, F. S., Crocker, P. R., and Filbin, M. T. (1994) A novel role for myelin-associated glycoprotein as an inhibitor of axonal regeneration. Neuron, 13, 757-767.

Nakayama, J., Fukuda, M. N., Hirabayashi, Y., Kanamori, A., Sasaki, K., Nishi, T., and Fukuda, M. (1996) Expression cloning of a human GT3 synthase. GD3 and GT3 are synthesized by a single enzyme. J Biol Chem, 271, 3684-3691.

Okajima, T., Chen, H. H., Ito, H., Kiso, M., Tai, T., Furukawa, K., Urano, T., and Furukawa, K. (2000) Molecular cloning and expression of mouse GD1alpha/GT1aalpha/GQ1balpha synthase (ST6GalNAc VI) gene. J Biol Chem, 275, 6717-6723.

Okajima, T., Fukumoto, S., Miyazaki, H., Ishida, H., Kiso, M., Furukawa, K., Urano, T., and Furukawa, K. (1999a) Molecular cloning of a novel alpha2,3-sialyltransferase (ST3Gal VI) that sialylates type II lactosamine structures on glycoproteins and glycolipids. J Biol Chem, 274, 11479-11486.

Okajima, T., Fukumoto, S., Ito, H., Kiso, M., Hirabayashi, Y., Urano, T., Furukawa, K., and Furukawa, K. (1999b) Molecular cloning of brain-specific GD1alpha synthase (ST6GalNAc V) containing CAG/glutamine repeats. J.Biol.Chem., 274, 30557-30562.

Page 34: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

28

Orlandi, P. A., Klotz, F. W., and Haynes, J. D. (1992) A malaria invasion receptor, the 175-kilodalton erythrocyte binding antigen of Plasmodium falciparum recognizes the terminal Neu5Ac(alpha 2-3)Gal-sequences of glycophorin A. J Cell Biol., 116, 901-909.

Patel, K. D., Cuvelier, S. L., and Wiehler, S. (2002) Selectins: critical mediators of leukocyte recruitment. Semin.Immunol., 14, 73-81.

Paulson, J. C. and Colley, K. J. (1989) Glycosyltransferases: Structure, localization, and control of cell type-specific glycosylation. J.Biol.Chem., 264, 17615-17618.

Pedraza, L., Frey, A. B., Hempstead, B. L., Colman, D. R., and Salzer, J. L. (1991) Differential expression of MAG isoforms during development. J Neurosci.Res., 29, 141-148.

Pedraza, L., Owens, G. C., Green, L. A., and Salzer, J. L. (1990) The myelin-associated glycoproteins: membrane disposition, evidence of a novel disulfide linkage between immunoglobulin-like domains, and posttranslational palmitylation. J Cell Biol., 111, 2651-2661.

Powell, L. D., Sgroi, D., Sjoberg, E. R., Stamenkovic, I., and Varki, A. (1993) Natural ligands of the B cell adhesion molecule CD22 b carry N- linked oligosaccharides with a-2,6-linked sialic acids that are required for recognition. J.Biol.Chem., 268, 7019-7027.

Qian, R., Chen, C., and Colley, K. J. (2001) Location and Mechanism of alpha 2,6-Sialyltransferase Dimer Formation. Role of cysteine residues in enzyme dimerization, localization, activity, and processing. J Biol Chem, 276, 28641-28649.

Roth, J., Kempf, A., Reuter, G., Schauer, R., and Gehring, W. J. (1992) Occurrence of sialic acids in Drosophila melanogaster. Science, 256, 673-675.

Saito, M., Kitamura, H., and Sugiyama, K. (2001) Occurrence of gangliosides in the common squid and pacific octopus among protostomia. Biochim.Biophys.Acta, 1511, 271-280.

Sakarya, S. and Oncu, S. (2003) Bacterial adhesins and the role of sialic acid in bacterial adhesion. Med.Sci.Monit., 9, RA76-RA82.

Samyn-Petit, B., Krzewinski-Recchi, M. A., Steelant, W. F., Delannoy, P., and Harduin-Lepers, A. (2000) Molecular cloning and functional expression of human ST6GalNAc II. Molecular expression in various human cultured cells. Biochim.Biophys.Acta, 1474, 201-211.

Sasaki, K., Watanabe, E., Kawashima, K., Sekine, S., Dohi, T., Oshima, M., Hanai, N., Nishi, and Hasegawa, M. (1993) Expression cloning of a novel gal beta(1-3/1-4)g1cNAc alpha 2,3- sialyltransferase using lectin resistance selection. J.Biol.Chem., 268, 22782-22787.

Page 35: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

29

Schauer, R. (2000) Achievements and challenges of sialic acid research. Glycoconjugate J., 17, 485-499.

Schauer, R. and Kamerling, J. P. (1997) Chemistry, biochemistry and biology of sialic acids. In Montreuil, J., Vliegenthart, J. F. G., and Schachter, H.(eds), Glycoproteins II. Elsevier, Amsterdam, pp. 243-402.

Schauer, R., Kelm, S., Reuter, G., Roggentin, P., and Shaw, L. (1995) Biochemistry and role of sialic acids. In Rosenberg, A.(ed), Biology of the Sialic Acids. Plenum, New York, pp. 7-67.

Schenkman, S., Ferguson, M. A., Heise, N., de Almeida, M. L., Mortara, R. A., and Yoshida, N. (1993) Mucin-like glycoproteins linked to the membrane by glycosylphosphatidylinositol anchor are the major acceptors of sialic acid in a reaction catalyzed by trans-sialidase in metacyclic forms of Trypanosoma cruzi. Mol Biochem Parasitol., 59, 293-303.

Sorice, M., Griggi, T., Nicodemo, G., Garofalo, T., Marangi, M., Sanguigni, S., Becker, S. I., and Mirelman, D. (1996) Evidence for the existence of ganglioside molecules in the antigen of Entamoeba histolytica. Parasite Immunol., 18, 133-137.

Steenbergen, S. M., Wrona, T. J., and Vimr, E. R. (1992) Functional analysis of the sialyltransferase complexes in Escherichia coli K1 and K92. J.Bacteriol., 174, 1099-1108.

Stern, C. A., Braverman, T. R., and Tiemeyer, M. (2000) Molecular identification, tissue distribution and subcellular localization of mST3GalV/GM3 synthase. Glycobiology, 10, 365-374.

Takashima, S., Tsuji, S., and Tsujimoto, M. (2002a) Characterization of the second type of human beta-galactoside alpha 2,6-sialyltransferase (ST6Gal II), which sialylates Galbeta 1,4GlcNAc structures on oligosaccharides preferentially. Genomic analysis of human sialyltransferase genes. J Biol.Chem, 277, 45719-45728.

Takashima, S., Ishida, H. k., Inazu, T., Ando, T., Ishida, H., Kiso, M., Tsuji, S., and Tsujimoto, M. (2002b) Molecular cloning and expression of a sixth type of alpha 2,8-sialyltransferase (ST8Sia VI) that sialylates O-glycans. J.Biol.Chem., 277, 24030-24038.

Tang, S., Shen, Y. J., deBellard, M. E., Mukhopadhyay, G., Salzer, J. L., Crocker, P. R., and Filbin, M. T. (1997) Myelin-associated glycoprotein interacts with neurons via a sialic acid binding site at ARG118 and a distinct neurite inhibition site. J Cell Biol, 138, 1355-66.

Trapp, B. D. (1990) Myelin-associated glycoprotein. Location and potential functions. Ann.N.Y.Acad.Sci., 605, 29-43.

Page 36: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

30

Tunkijjanukij, S., Giaever, H., Chin, C. C., and Olafsen, J. A. (1998) Sialic acid in hemolymph and affinity purified lectins from two marine bivalves. Comp Biochem.Physiol B Biochem.Mol.Biol., 119, 705-713.

Turnley, A. M. and Bartlett, P. F. (1998) MAG and MOG enhance neurite outgrowth of embryonic mouse spinal cord neurons. Neuroreport, 9, 1987-1990.

Umemori, H., Sato, S., Yagi, T., Aizawa, S., and Yamamoto, T. (1994) Initial events of myelination involve Fyn tyrosine kinase signalling. Nature, 367, 572-576.

van der Merwe, P. A., Crocker, P. R., Vinson, M., Barclay, A. N., Schauer, R., and Kelm, S. (1996) Localization of the putative sialic acid-binding site on the immunoglobulin superfamily cell-surface molecule CD22. J.Biol.Chem., 271, 9273-9280.

Varki, A. (1992) Diversity in the sialic acids. Glycobiology, 2, 25-40.

Varki, A. (1997) Sialic acids as ligands in recognition phenomena. FASEB J, 11, 248-255.

Varki, A. (2001a) Loss of N-glycolylneuraminic acid in humans: Mechanisms, consequences, and implications for hominid evolution. Am.J.Phys.Anthropol., Suppl 33, 54-69.

Varki, A. (2001b) N-glycolylneuraminic acid deficiency in humans. Biochimie, 83, 615-622.

Vimr, E. and Lichtensteiger, C. (2002) To sialylate, or not to sialylate: that is the question. Trends Microbiol, 10, 254-257.

Vinson, M., Strijbos, P. J., Rowles, A., Facci, L., Moore, S. E., Simmons, D. L., and Walsh, F. S. (2001) Myelin-associated glycoprotein interacts with ganglioside GT1b. A mechanism for neurite outgrowth inhibition. J Biol.Chem, 276, 20280-20285.

Vinson, M., van der Merwe, P. A., Kelm, S., May, A., Jones, E. Y., and Crocker, P. R. (1996) Characterization of the sialic acid-binding site in sialoadhesin by site-directed mutagenesis. J Biol Chem, 271, 9267-72.

Vyas, A. A., Patel, H. V., Fromholt, S. E., Heffer-Lauc, M., Vyas, K. A., Dang, J., Schachner, M., and Schnaar, R. L. (2002) Gangliosides are functional nerve cell ligands for myelin-associated glycoprotein (MAG), an inhibitor of nerve regeneration. Proc.Natl.Acad.Sci.U.S.A, 99, 8412-8417.

Wakarchuk, W. W., Watson, D., St Michael, F., Li, J., Wu, Y., Brisson, J. R., Young, N. M., and Gilbert, M. (2001) Dependence of the bi-functional nature of a sialyltransferase from Neisseria meningitidis on a single amino acid substitution. J Biol.Chem, 276, 12785-12790.

Page 37: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Introduction

31

Warren, L. (1963) The distribution of sialic acids in nature. Comp Biochem Physiol, 10, 153-171.

Watarai, S., Sugimoto, C., Hosotani-Kaihara, K., Kobayashi, K., Onuma, M., Lee, J. T., Kushi, Y., Handa, S., and Yasuda, T. (1996) Isolation and characterization of gangliosides from Theileria sergenti. J Vet.Med.Sci., 58, 1099-1105.

Yamamoto, T., Nakashizuka, M., Kodama, H., Kajihara, Y., and Terada, I. (1996) Purification and characterization of a marine bacterial beta- galactoside alpha 2,6-sialyltransferase from Photobacterium damsela JT0160. J.Biochem.Tokyo., 120, 104-110.

Yamashita, T., Higuchi, H., and Tohyama, M. (2002) The p75 receptor transduces the signal from myelin-associated glycoprotein to Rho. J Cell Biol., 157, 565-570.

Yoshida, Y. and Tsuji, S. (2002) ST8Sia-III. In Taniguchi, N., Honke, K., and Fukuda, M.(eds), Handbook of glycosyltransferases and related genes. Springer-Verlag, Tokyo, Japan, pp. 335-339.

Page 38: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

32

2. The evolutionary history of sialylation

Friederike Lehmann, Thomas Hurek, Sørge Kelm

Manuscript submitted for publication in “Journal of Molecular Biology”

Declaration of my contribution to the publication With the exception of the phylogenetic analyses, which were performed by Dr. Thomas Hurek, I have planned, processed and analysed the research stated in this publication. I have written the publication in discussion with both co-authors. In the section “Material and Methods” the part concerning phylogenetic analysis was added by Dr. Thomas Hurek.

Page 39: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

33

The evolutionary history of sialylation

Friederike Lehmann1, Thomas Hurek2, Sørge Kelm1

(1) Centre for Biomolecular Interactions Bremen, Department of Biology and Chemistry,

University Bremen, 28334 Bremen, Germany

(2) General Microbiology, Department of Biology and Chemistry, University Bremen,

28334 Bremen, Germany

Running title: Evolution of sialyltransferases

Keywords: sialic acids, sialyltransferases, evolution, Danio rerio, Takifugu rubripes

1To whom correspondance should be addressed: Phone: ++49 (421) 218-4324,

FAX:++49(421) 218-7433, email: [email protected] or [email protected]

Page 40: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

34

2.1. Abstract Sialic acids have profound effects on the biological properties of cell surfaces and are involved in a variety of biological processes. The evolution of the rather complex sialylation apparatus has remained unclear, since the enzymes involved have been described mainly for relatively recent vertebrate lineages such as birds and mammals, although sialic acids occur in all deuterostomia, some proteostomia, a few protozoa, bacteria and viruses. In this study we have identified and compared such enzymes from different deuterostomian species based on genomic sequences. Analysis of two completely sequenced teleost genomes (Takifugu rubripes and Danio rerio) revealed the existence of all sialyltransferase subgroups with orthologues to almost all mammalian sialyltransferases. In addition, putative sialyltransferase genes were identified in the partially sequenced genomes of Branchiostoma floridae, Ciona intestinalis and the sea urchin Strongylocentrotus purpuratus, which could also be assigned unambigously to these subgroups. A thorough phylogenetic analysis has provided strong evidence that these subgroups developed well before the split of protostomes and deuterostomes suggesting that the characteristic motifs of sialyltransferases were already present in the ancestor of all bilaterians. Interestingly, based on this phylogenetic relationship between the subgroups and the known metabolic pathways of glycan biosynthesis, a scenario for an acceptor substrate driven evolution of the sialyltransferase family can be deduced.

Page 41: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

35

2.2. Introduction Sialic acids (Sia) are a family of α−keto acids with 9-carbon backbones found mostly at distal positions of oligosaccharide chains of glycoproteins and glycolipids. They are glycosidically linked to either the 3- or 6-hydroxyl group of galactose residues (Gal) or to the 6-hydroxyl group of N-acetylglucosamine (GlcNAc) or N-acetylgalactosamine (GalNAc) residues, and can form oligo- to polysialic chains via their 8-hydroxyl and/or 9-hydroxyl groups. The complexity of sialylated glycans is enhanced further by modifications of basically all substituents leading to about 50 naturally occurring Sia (Angata and Varki 2002; Schauer and Kamerling 1997). Because of their terminal position and their negative charge, sialylated oligosaccharide sequences can be considered as key determinants in a variety of complex biological regulatory and signalling events (Schauer et al. 1995). Due to their negative charge, Sias are involved in binding and transport of positively charged molecules as well as in attraction and repulsion phenomena between cells and molecules. Their exposed terminal position in carbohydrate chains, in addition to their size and negative charge enable them to function as a protective shield for the subterminal part of the molecule or cell. Furthermore, Sias take part in a variety of recognition processes between cells and molecules (Kelm and Schauer 1997; Varki 1997). The biosythesis of sialylated glycans requires several enzymes, as for example the UDP-GlcNAc 2-epimerase/ManNAc kinase, Neu5Ac-9-phosphate synthetase, CMP-Neu5Ac synthase, CMP-Neu5Ac/CMP antiporter, and finally sialyltransferases (STs) for the transfer of CMP-Neu5Ac to glycoconjugates (Angata and Varki 2002; Traving and Schauer 1998). So far, about 20 different STs have been cloned in mammals, each of which exhibits a typical specificity for acceptor substrates and forms one of the four glycosidic linkages mentioned above (Harduin-Lepers et al. 2001; Jeanneau et al. 2004; Takashima et al. 2002). As far as investigated, each of the STs has a more or less characteristic acceptor substrate specificity usually accepting several related glycans but being very precise with respect to the hydroxyl group which is sialylated. Based on the linkage formed, they have been named ST6Gal, ST6GalNAc, ST3Gal and ST8Sia synthesizing Siaα2,6Gal, Siaα2,6GalNAc, Siaα2,3Gal, or Siaα2,8Sia, respectively (Tsuji et al. 1996). Within each of these subgroups the acceptor substrate specificities are quite complex with large to almost complete overlap for some enzymes (Harduin-Lepers et al. 2001; Taniguchi et al. 2002). This leads to partial redundancy of these enzymes. All known vertebrate STs have the same topology of type II transmembrane proteins with a short cytoplasmic domain and a signal anchor at their N-termini followed by a stem region and a large C-terminal catalytic domain facing the luminal side, characteristic of all glycosyltransferases localized along the secretory pathway (Amado et al. 1999). A comparison of peptide sequences strongly indicates that the length and amino acid compositions of catalytic domains are relatively well conserved between STs and variations in protein sizes are generally attributable to differences in the length of the stem region (Donadio

Page 42: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

36

et al. 2003). All eukaryotic STs show conserved peptide regions in their catalytic domain referred to as L (long)-, S (short)- and VS (very short) sialylmotifs suggesting their common ancestry (Datta and Paulson 1997; Geremia et al. 1997). In addition, another sialylmotif located between motifs S and VS has been decribed for STs, which seems to be involved in acceptor recognition (Jeanneau et al. 2004). For better clarity, the sialylmotifs will be named motif 1 (L-motif), motif 2 (S-motif), motif 3 ands motif 4 (VS-motif) as suggested recently (Jeanneau et al. 2004). These sialylmotifs contain several amino acids thought to be critical for enzymatic activity. Depending on the ST subgroup, at least two Cys residues, one located in motif 1, the other in motif 2, are conserved. Recent data suggest that they participate in an intramolecular disulfide bridge essential for maintaining an active conformation of the enzyme (Datta et al. 2001). Three additional Cys residues appear in the ST8Sia. Two of these, at the C-terminus and in sialylmotif 1, seem to form a second intramolecular disulfide bridge. The third Cys residue, also located within sialylmotif 1 has been discussed as critical residue for polysialylation as well (Angata et al. 2001). Further amino acid residues suggested to play key roles for full ST activity are a conserved Tyr residue in sialylmotif 3 as well as a conserved Glu and a conserved His residue in sialylmotif 4 (Jeanneau et al. 2004; Kitazume-Kawaguchi et al. 2001). Although expression of Sia is well documented in animals of the deuterostomian lineage (primarily in echinoderms and vertebrates), enzymes involved in Sia metabolism have been described almost exclusively for higher vertebrates like birds and mammals. To further extend our knowledge of the evolution of sialylation, we took advantage of the publicly available databases of different deuterostomian species. Here we describe the existence of putative orthologues of all key enzymes involved in Sia metabolism including the large family of STs in fish (Takifugu rubripes and Danio rerio) showing high sequence similarity to their mammalian counterparts. In addition, we have identified putative ST orthologues in other deuterostomian species (Branchiostoma floridae, Ciona intestinalis, and Strongylocentrotus purpuratus). Our phylogenetic analysis indicates that the complexity of the ST family has already evolved in the last common ancestor (LCA) of all bilaterians 1000 Mya.

2.3. Material and Methods Homology searches for STs in fugu and zebrafish genomic sequences The sequences of human and murine STs as well as other key enzymes of the sialylation apparatus were used as templates in homology searches of Takifugu rubripes whole genome shotgun assembly version 2.0 and 3.0 and the Ciona intestinalis whole genome shotgun assembly version 1.0 at the DOE Joint Genome Institute (JGI) web site (Aparicio et al. 2002), the Ensembl Fugu and Zebrafish Genome Browser at the Sanger Institut, the Ciona savignyi whole genome shotgun assembly release 1 at the Center for Genome Research and the

Page 43: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

37

Strongylocentrotus purpuratus whole genome shotgun assembly at the Sea Urchin Genome Project web site using BlastN and TblastN program. In case of the STs, sequences encoding the whole protein as well as only the sialylmotifs were used as templates. Scaffolds showing significant similarity were assembled using VectorNTI and analyzed for putative protein-coding sequences using the GENSCAN Web Server at the MIT Department of Biology (Burge and Karlin 1997). The multiple protein sequence alignments were constructed by VectorNTI using ClustalW algorithm (pairwise alignment parameters with gap openings set at 10 and gap extention set at 0.1; multiple alignment parameters with gap openings set at 10, gap extention set at 0.05 and delay divergent sequences set at 40%) (Thompson et al. 1994). Protein sequences of putative fish orthologues were analyzed using the prediction servers available at the CBS homepage. Genomic structures were deduced from comparison of predicted mRNA with genomic DNA.

Phylogenetic analysis Protein sequences corresponding to 160 amino acids were aligned by ClustalW as described above (Thompson et al. 1994). Sequences beyond sialylmotifs 1 and 2 were excluded from the phylogenetic analysis. The phylogenetic tree was inferred by a minimum evolution analysis (CNI search level 2, gapped sites were pairwise deleted) using MEGA 3.0 software (Kumar et al. 2001) with a JTT model (Jones et al. 1992) and a gamma model of among-site variation and bootstrapped 1000 times. Support for the internal branches on the phylogeny was also assessed using 1000 interior branch test replicates (Nei and Kumar 2000). The gamma shape parameter alpha for a four-category gamma distribution was estimated by the quartet puzzling method (Strimmer and von Haeseler 1996) using a JTT model as implemented by Puzzle 4.0. The likelihood mapping analysis was carried out as described by Strimmer and von Haeseler (Strimmer and von Haeseler 1997).

2.4. Results Sequence analysis for the teleost genomes To obtain information about the sialylation apparatus in fish, the genomes of Fugu rubripes and Danio rerio were searched using mammalian sequences of the key enzymes of the Sia metabolism as template. The analysis revealed the existence of putative orthologues for all enzymes crucial for the synthesis of Sias and their translocation and transfer to glycoconjugates as discussed in detail below (for Accession numbers see Fig. 1 unless mentioned in the text). Both genomes were searched using complete sequences of different mammalian STs as well as sequences encoding only the different sialylmotifs as queries. We were able to identify 20 different putative STs in the fugu genome as well as 25 different sequences showing significant sequence similarities to known STs in the genome of the zebrafish. Analysis of the

Page 44: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

38

amino acid sequences revealed a characteristic type II transmembrane architecture for all of the putative fish STs. Furthermore, for all STs known from mammalian species clear orthologues could be assigned in the teleost genomes as will be described in detail below. Comparison of complete fugu ST sequences with their mammalian orthologues showed sequence similarity values of between 51% and 85% on amino acid level (see table 1) showing a high degree of conservation in the sialylmotifs (see table 1 and figure 1).

Key enzymes of the sialic acid pathway

Homology searches were carried out using amino acid sequences of different mammalian enzymes belonging to the Sia pathway. Putative orthologues have been detected for UPD-GlcNac-2-epimerase/ManNAc kinase (Accession number AJ745719), Neu5Ac-9-phosphate synthase (Accession number AJ705104), CMP-sialic acid synthetase (Accession number AJ703819 ), the CMP-sialic acid transporter (Accession number AJ705103) and CMP-sialic acid hydroxylase (Accession number AJ703820), an enzyme, which converts CMP-Neu5Ac into CMP-Neu5Gc (see table 1). The genomic organization of all these genes seems to be highly conserved. Only for the UPD-GlcNac-2-epimerase/ManNAc kinase one additional exon occurs in both fish genomes analyzed compared to their mammalian counterparts.

Galactose α2,3-sialyltranferases

In mammals six STs with this specificity (ST3Gal I to ST3Gal VI) have been described (Harduin-Lepers et al. 2001). Analysis of both fish genomes indicated putative orthologues for all these mammalian STs except for ST3Gal VI. Instead, each of the two teleost genomes contain one additional sequence showing highest similarity values to mammalian ST3Gal V (in the following named ST3Gal V.2). In addition, the analysis of the zebrafish genome suggested the existence of at least 5 additional putative ST3Gal. Three of these sequences show highest similarity to mammalian ST3Gal I (in the following named ST3Gal I.2 , ST3Gal 1.3 and ST3Gal I.4 ), and one sequence has high similarity values to ST3Gal II (named ST3Gal II.2), and ST3Gal III (named ST3Gal III.2) respectively. The comparison of genomic structures between mammalian and putative fugu orthologues also revealed a high degree of conservation including identical split patterns of the sialylmotif 1 as shown in table 2.

Galactose α2,6-sialyltranferases

Blast search revealed putative orthologues for both ST6Gal members, ST6Gal I and ST6Gal II, in both fish genomes. The sequence and the genomic organization of putative fugu ST6Gal II could not be assigned with certainty, since it is located on two different scaffolds and therefore the sequence information is incomplete (Accession numbers AJ705075 and AJ705076). The fugu ST6Gal I is the only putative ST described here having more exons

Page 45: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

39

compared to its mammalian counterpart. The first coding exon from mammalian ST6Gal I is represented by two exons in fugu (see table 2). It is unclear, whether this is due to a loss of an intron in mammals or an introduction of an intron in fugu. All other exon-intron boundaries including the split patterns of sialylmotif 1 are conserved between fish and mammals. Unfortunately, the sequence of putative ST6Gal I orthologue from danio (Accession number AJ705077) is still incomplete and therefore no information is available for its genomic organization.

N-Acetylgalactosamine α2,6-sialyltranferases

So far, six mammalian ST6GalNAc representatives (ST6GalNAc I to ST6GalNAc VI) have been cloned and characterized (Harduin-Lepers et al. 2001). For all of them putative orthologues could be identified in at least one of the fish genomes, based on sequence similarity and complete conservation of the genomic structures. Surprisingly, no ST6GalNAc I orthologue was detected in the fugu and no ST6GalNAc IV orthologue in the zebrafish genome. The lack of these STs appears to be species-specific rather than a general fish feature, since each of them is only missing in one of the fish genomes analyzed. In addition, it cannot be excluded that this is due to the remaining gaps in the databases for these genomes. Additional sequences could be identified showing high similarity values for ST6GalNAc II (fugu) and ST6GalNAc V (danio). Also for these genes it can be assumed that they are results of species-specific duplication events which took place relatively late in fish evolution, since these sequences were only found in one the two fish genomes available.

Sialyl α2,8-sialyltranferases

Six different mammalian ST8Sia (ST8Sia I to ST8Sia VI) have been cloned and enzymatically characterized (Harduin-Lepers et al. 2001; Takashima et al. 2002). Putative fish orthologues for all of them have been identified in at least one of the two fish genomes. The only exception was the lack of an putative ST8Sia IV in the fugu genome. Whether this is a peculiarity of the fugu genome or due to the still uncomplete sequencing of the genome, has to be investigated in further studies. However, the existence of a second gene showing high sequence similarity to mammalian ST8Sia III in the fugu genome seems to be a species-specific duplication.

Sequence analysis for other deuterostomian genomes To allow a comparison also with putative STs from more distantly related deuterostomia, available genomic sequences of Ciona intestinalis and Ciona savignyi, Branchiostoma floridae and Strongylocentrotus purpuratus were searched using complete sequences of different mammalian STs as well as sequences encoding only the different sialylmotifs as queries. Putative ST sequences have been detected in both Ciona genomes showing highest

Page 46: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

40

sequence similarity to the ST3Gal subgroup. In the genome of Strongylocentrotus purpuratus putative ST sequences were found showing highest similarity to mammalian ST6GalNAc I and ST6GalNAc II. In addition, we were able to identify a putative ST sequence in Branchiostoma floridae as well as a short putative ST fragment in Strongylocentrotus purpuratus, which can be classified into the ST8Sia subgroup. However, for these genes no obvious orthologues could be assigned in mammalian species (see also phylogenetic analysis below).

Phylogenetic analysis To further investigate the relationship between fish and mouse STs, a minimum-evolution distance tree of the 44 newly identified putative ST orthologues together with the 22 known ST sequences from mouse, Xenopus tropicalis and Drosophila melanogaster was reconstructed (Fig. 2). This analysis distributes the newly identified putative STs together with their orthologues into 5 highly supported branches in accordance with substrate specificity and genomic organization. The well established classification based on substrate specificity of the STs into 4 families (Harduin-Lepers et al. 2001) is resembled in all cases except for the ST6GalNAc family which is clearly represented by two well separated branches (subgroups A and B). In addition, the ST3Gal and ST8Sia branches are divided, each in two distinct subgroups. Interestingly, the putative ST8Sia sequence from the cephalochordate Branchiostoma floridae (bfST8Sia) forms a single branch within the ST8Sia family suggesting that this gene belongs to an additional ancient ST8Sia subgroup. Although, the phylogenetic position of the Branchiostoma sequence is not well resolved, the genomic organization of this gene suggests a closer relationship to the vertebrate ST8Sia III. One of the striking results of the phylogenetic analysis was the unambiguous polyphyletic origin of the ST6GalNAc family. To evaluate this observation in more detail, a four-cluster likelihood mapping statistic was used. This analysis further supports a polyphyletic origin of the N-acetylgalactosamine α2,6-sialyltranferases by grouping one branch (ST6GalNAc A) together with ST3Gal and the other branch (ST6GalNAc B) together with ST8Sia and ST6Gal (Fig. 3). An additional likelihood mapping analysis (not shown) also grouped ST6Gal together with ST6GalNAc B and ST8Sia versus ST3Gal and ST6GalNAc A suggesting that ST genes form two monophyletic groups with one group consisting of ST3Gal and ST6GalNAc A and the other of ST6Gal, ST6GalNAc B and ST8Sia.

2.5. Discussion In the past, comparisons of animal genomes have been limited to mouse and human (96 million years of divergence), followed by human and fly or worm (protostome-deuterostome divergence, 670 million years ago) leaving a 570 million gap in the history of our lineage. Insight into this gap has become possible because of the release of two teleost gene

Page 47: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

41

assemblies (Danio rerio and Takifugu rubripes) and initial data from genomes of evolutive older deuterostomes. These data are particularly valuable for a better understanding of evolutionary processes leading to the complex sialylation apparatus. This aspect of evolution has drawn our interest mainly for three reasons. (1) The occurrence of Sia is almost, but not completely restricted to deuterostomia; (2) the biosynthetic pathway of Sia is quite unusual requiring a series of specific enzymes and (3) STs represent a family of related enzymes with distinct, but overlapping acceptor specificities. In this context the following questions have been asked: How well conserved are biosynthetic enzymes and STs in particular and are they well enough conserved to unambiguously identify them also in more distantly related genomes? How old is the complexity of the ST family? To address these questions, we have analysed genomic sequences coding for enzymes of the sialylation pathway from different deuterostomian branches.

Key enzymes of the sialic acid biosynthesis

In both teleost genomes we could readily identify putative orthologues for all key proteins of the Sia metabolism. In principle, the relative high conservation of enzymes in the Sia biosynthesis and transport is not too surprising, since the loss of any of these enzymes, with the exception of the hydroxylase, would lead to a complete loss of sialylated glycans. In this context it is important to point out that this degree of conservation is high enough to identify easily corresponding genes also in D. melanogaster (Kim et al. 2002; Koles et al. 2004). The only enzyme with a somewhat lower sequence similarity is the CMP-Sia synthase. Although this definitely is one of the key enzymes in the sialylation pathway, the lower conservation seems to be related to a broader substrate specificity of the fish enzymes accepting also Kdn, a Sia commonly found in fish but only in traces in mammals, as substrate as described for the CMP-sialic acid synthetase from rainbow trout (Nakata et al. 2001). Our observation that only one obvious orthologue of the CMP-Sia synthase exists in both fish genomes analyzed, confirms the suggestion that the activation of Neu5Ac and Kdn is catalyzed by a single enzyme in fish (Nakata et al. 2001). Another interesting feature of this enzyme is its unusual nuclear localization suggesting an additional function of this protein (Munster-Kuhnel et al. 2004) which may be the reason for the lower sequence conservation. In conclusion, the high conservation of the biosynthetic pathway further underscores the biological importance of sialylation leading to a high selective pressure maintaining sialylation throughout all vertebrates.

Assignments of sialyltransferase orthologues

Certainly, more complicated is the situation for the STs. Besides the sequences found in the two teleost genomes (Takifugu rubripes and Danio rerio), several putative STs have been identified in the genomes of species which diverged earlier in deuterostomian evolution, e.g.

Page 48: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

42

lancelets (Branchiostoma floridae), tunicates (Ciona intestinalis) and echinoderms (Strongylocentrotus purpuratus). Although these genome projects are not yet completed, those sequences identified were helpful in the phylogenetic analysis. In summary, the analysis of the newly identified ST sequences allowed several interesting conclusions and suggested possible scenarios regarding their evolution which will be discussed further down. These are: (1) All STs identified in the two teleost genomes can be assigned to mammalian orthologues. Consequently, it seems reasonable to assume that the LCA of vertebrates had the same branches within the ST family known from mammalian genomes and probably no additional branches. (2) Almost certainly, the 5 major branches of this protein family already existed before the separation of proto- and deuterostomia. This is particularly interesting, since the occurrence of Sia has been shown only in very few protostomia and no evidence for the sialylation pathway can be found in the genome of C. elegans (Angata and Varki 2002; Oriol et al. 1999) (3) The evolution of the different ST branches appears to be directed by the availability of the acceptor substrates leading to enzymes producing the same linkage (Sia2,6GalNAc) in different major branches. All putative non-mammalian STs identified in this study have the characteristic features of ST as described in the introduction. These include in particular the ST-characteristic sialylmotifs and the conserved Cys, Tyr, Glu and His residues within these motifs (Fig. 1). Further evidence for correct assignments of the ST orthologues comes from the genomic organization of the putative ST orthologues, at least for those sequences which could be completely identified, since most STs have their characteristic exons pattern. Nearly all of these putative ST orthologues in fish show the same number of exons including an almost identical length as their mammalian counterparts (Table 2). Also the assignments to the branches of the ST family described for the mammalian enzymes was unambiguous for all teleost genes. For example, the Cys residues characteristic for the ST8Sia branch, forming a second intramolecular disulfide bond and adding a free Cys residue within sialylmotif 1 (Angata et al. 2001), also exist in the putative non-mammalian ST8Sia orthologues. Exceptions are some putative danio orthologues within the ST3Gal branch (ST3Gal I.1-4; ST3Gal II.2; ST3Gal III.2; ST3Gal V.2) showing relatively low sequence similarity to their mammalian counterparts. Whether these sequences represent functional ST3Gal genes or pseudogenes or whether these sequence differences are due to incorrect predictions for the coding sequences has to be clarified in further experiments. Notably, in some cases more than one sequence showing highest similarity to a particular mammalian ST has been found in these fish genomes. This is consistent with other studies on multigene families showing a higher number of functional members in fish than in mammals (Aparicio 2000). In this context, it has been suggested that a whole genome duplication occurred in the teleost lineage since its divergence from the tetrapod lineage (Barbazuk et al. 2000; Christoffels et al. 2004). In zebrafish many of these additional genes have been retained, whereas especially in species of the fugu lineage large numbers of these genes

Page 49: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

43

appear to have been lost for unknown reasons. This would explain our observation, that up to four putative orthologues of a particular mammalian ST are present in the danio genome, whereas for fugu no more than two have been identified. Whether these additional genes represent only remnants of previously functional genes or active enzymes which possibly produce a further specialization of gene function as described for other fish proteins (Aparicio 2000), remains an open question.

Evolutionary origin of the sialyltransferase family

To get further insights into the evolution of this enzyme family, a phylogenetic tree has been reconstructed. The ST phylogeny shown in Fig. 2 places the mammalian STs with similar substrate specificities and their putative orthologues from fish and other deuterostomian species in mutually exclusive clades. Notably, our analysis revealed a polyphyletic origin of the ST6GalNAc genes. This result is consistent with the genomic organization and substrate specificity showing differences between mammalian members of the ST6GalNAc A and ST6GalNAc B groups. The implications of this observation will be discussed further below. The clustering of STs showing high sequence similarities into different branches implies that each of these subgroups has originated from one ancestral gene by a gene duplication, subsequently acquired distinct functions, and then became noninterchangeable. This is in agreement with the scenario that gene duplication and exon shuffling represent the major mechanisms for the generation of structurally related glycosyltransferases (Kurosawa et al. 1996). However, it is difficult to estimate when such events occurred. In this respect it is important to note that protostomian and deuterostomian ST6Gal representatives are members of one monophyletic clade. Together with the organismal phylogeny, the phylogenetic position of this clade clearly demonstrate that the diversification of the ST family has probably already taken place in the LCA of bilaterians. Also the occurrence of Sia in molluscs (Saito et al. 2001; Saito et al. 2002; Tunkijjanukij et al. 1998) indicates, that the sialylation apparatus evolved prior to the splitting of protostomes in ecdysozoans (including the arthropods) and lophotrochozoans (including the mollucs) (Adoutte et al. 1999; Adoutte et al. 2000). Although at present nothing is known about the proteins involved in Sia metabolism in these organisms, the glycan structures found in molluscs suggest the presence of at least ST3Gal and ST8Sia in these animals. Notably, no evidence for Sia expression has been found in the completely sequenced genome of the nematode C. elegans (Oriol et al. 1999), representing a protostomian class without sialylation pathway. It can be speculated that the development of complex organisms coincided with the evolution of more complex glycan structures, especially sialylation. Interestingly, it appears that in C. elegans this has been accomplished by fucosylation instead of sialylation, since Sia and fucose usually compete for the same acceptors and the lack of Sia

Page 50: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

44

in C. elegans coincides with the expression of 18 different fucosyltranferase genes (Oriol et al. 1999). The apparently isolated occurrence of Sia in some protostomia, especially in D. melanogaster, has lead to the discussion that a lateral gene transfer for enzymes of the Sia pathway may have occurred (Angata and Varki 2002). At least for the ST6Gal gene this can be ruled out on the basis of this phylogenetic tree. This raises the interesting question why only one ST in the completely sequenced genome of D. melanogaster can be identified. One explanation could be the lack of appropriate donor- or acceptor-substrates. In this case, it seems more likely that this could be the loss of specific glycan acceptor substrates rather that a loss of the donor substrate CMP-Sia, since it is difficult to imagine why a single ST would survive the loss of the donor substrate common for all STs. In this respect, it is interesting to note that the acceptor glycans for STs commonly occurring in vertebrates are rare or not present at all in insects (Maerz et al. 1997). Besides endogeneous factors, exogenous selection pressures, for example by viral and microbial pathogens or parasites recognizing Sia, may have influenced sialylation. In protostomian species this could have led to the loss of most functional ST genes. As an alternative strategy, deuterostomian species would have avoided such pathogens by modifications of Sia, a scenario proposed as the driving force towards the structural variety of Sia commonly found in deuterostomia (Angata and Varki 2002).

Acceptor substrate-driven evolution of sialyltransferase diversity

Based on the overall structure of the phylogenetic tree, it can be assumed that the first STs were enzymes transferring Sia to the hydroxyl group at position 6 of Gal or GalNAc, probably independent of underlying glycan structures. In this context it should be mentioned that at least some of the ST6Gal members accept GalNAc at least as well as Gal. For example, the ST6Gal from D. melanogaster prefers LacdiNac (GalNAcβ1-4GlcNAc) over LacNAc (Galβ1-4GlcNAc) as a substrate (Koles et al. 2004). On the other end of the tree, the peripheral branch topologies of the ST tree provide some interesting aspects as well. In particular, together with information about the acceptor glycan specificities of the different STs, several assumptions regarding the evolution of the ST family can be made. Although obvious, it is important to note that the STs which require sialylated structures as acceptor substrates must have emerged later than those STs building these structures. Obviously, this holds for all members of the ST8Sia branch. However, also several ST6GalNAc enzymes use sialylated acceptor structures, namely the sialylated T-antigen (Siaα2,3Galβ1,3GalNAc) although they do not transfer Sia to Sia. Interestingly, all these STs are members of the ST6GalNAc B group and are more closely related to the ST8Sia and the ST6Gal than to the enzymes of the branch formed by ST6GalNAc A and ST3Gal. This means that all STs recognizing sialylated acceptor glycans are found in the same branch of the phylogenetic tree.

Page 51: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

45

Based on this and in accordance with the phylogenetic analysis (see Fig.4) the following scenario for the evolution of the different STs can be developed. An ancestral ST recognizing Gal and/or GalNAc and building glycans with α2,6-linked Sia diverged into two different lineages. Then, enzymes evolved which transfer Sia to the hydroxyl group in position 3 of Gal instead of position 6 leading to all ST3Gal within the branch related to the ST6GalNAc A subgroup. Once the products of the ST3Gal were available as acceptor substrates, the ST6GalNAc B subgroup evolved from the ST6Gal-related ancestor in the other branch. The most simple explanation would be that an expansion of the acceptor recognition site including other parts of the glycan lead to a more restricted specificity. In contrast, for the evolution of ST8Sia two changes in specificity had to occur, the recognition of the acceptor glycan as well as the selectivity for the acceptor hydroxyl group. The structure of the phylogenetic tree suggests that after the establishment of the first ST8Sia, a diversification into different subgroups occurred. In this respect, one interesting aspect is the phylogenetic location of a putative ST8Sia orthologue from Branchiostoma floridae implicating a third ST8Sia subgroup which has maintained at least in cephalochoardates, but probably has disappeared in all vertebrates investigated so far. This subbranch may represent STs with a somewhat different substrate specificity than mammalian and teleost ST8Sia synthesizing glycans found only in these species. One example for such structures are glycans found exclusively in echinoderms, such as α2->5-Oglycolyl-linked oligo/polyNeu5Gc chains (Kitazume et al. 1997; Kitazume et al. 1996).

Conclusions

This study has provided new insights into the evolution of sialylation. The existence of putative orthologues to almost all mammalian STs in two genomes of teleosts demonstrates that the radiation of ST representatives is present at least in all vertebrates. The phylogenetic analysis presented here also indicates that the ST subfamilies were probably already present in the LCA of all bilaterians strongly suggesting that the typical characteristics of animal STs have evolved at least 1000 million years ago and have been maintained during that period. Finally, the high conservation of the proteins involved in Sia metabolism from arthropods to mammals underlines the particular importance of sialylation not only in all deuterostomes, but also in some protostomian species. In general, it can be speculated that during the development of increasingly complex organisms emerging new functions required also more complex glycans and their regulation. Certainly, the evolution of the diverse STs driven by the availability of new acceptor substrates has contributed to this structural diversity.

2.6. Acknowledgements The financial support by Zentrale Forschungsförderung der Universität Bremen is acknowledged.

Page 52: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

46

2.7. References Adoutte, A., Balavoine, G., Lartillot, N., and de Rosa, R. (1999) Animal evolution. The end of the intermediate taxa? Trends Genet., 15, 104-108.

Adoutte, A., Balavoine, G., Lartillot, N., Lespinet, O., Prud'homme, B., and de Rosa, R. (2000) The new animal phylogeny: reliability and implications. Proc.Natl.Acad.Sci.U.S.A, 97, 4453-4456.

Amado, M., Almeida, R., Schwientek, T., and Clausen, H. (1999) Identification and characterization of large galactosyltransferase gene families: galactosyltransferases for all functions. Biochim.Biophys.Acta, 1473, 35-53.

Angata, K., Yen, T. Y., El Battari, A., Macher, B. A., and Fukuda, M. (2001) Unique disulfide bond structures found in ST8Sia IV polysialyltransferase are required for its activity. J Biol Chem, 276, 15369-15377.

Angata, T. and Varki, A. (2002) Chemical diversity in the sialic acids and related alpha-keto acids: an evolutionary perspective. Chem.Rev., 102, 439-469.

Aparicio, S. (2000) Vertebrate evolution: recent perspectives from fish. Trends in Genetics, 16, 54-56.

Aparicio, S., Chapman, J., Stupka, E., Putnam, N., Chia, J. m., Dehal, P., Christoffels, A., Rash, S., Hoon, S., Smit, A. F., Sollewijn Gelpke, M. D., Roach, J., Oh, T., Ho, I. Y., Wong, M., Detter, C., Verhoef, F., Predki, P., Tay, A., Lucas, S., Richardson, P., Smith, S. F., Clark, M. S., Edwards, Y. J. K., Doggett, N., Zharkikh, A., Tavtigian, S. V., Pruss, D., Barnstead, M., Evans, C., Baden, H., Powell, J., Glusman, G., Rowen, L., Hood, L., Tan, Y. H., Elgar, G., Hawkins, T., Venkatesh, B., Rokhsar, D., and Brenner, S. (2002) Whole-Genome Shotgun Assembly and Analysis of the Genome of Fugu rubripes. Science, 297, 1301-1310.

Barbazuk, W. B., Korf, I., Kadavi, C., Heyen, J., Tate, S., Wun, E., Bedell, J. A., McPherson, J. D., and Johnson, S. L. (2000) The Syntenic Relationship of the Zebrafish and Human Genomes. Genome Res., 10, 1351.

Burge, C. and Karlin, S. (1997) Prediction of complete gene structures in human genomic DNA. J Mol.Biol., 268, 78-94.

Page 53: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

47

Christoffels, A., Koh, E. G., Chia, J. M., Brenner, S., Aparicio, S., and Venkatesh, B. (2004) Fugu Genome Analysis Provides Evidence for a Whole-Genome Duplication Early During the Evolution of Ray-Finned Fishes. Mol Biol.Evol.

Datta, A. K., Chammas, R., and Paulson, J. C. (2001) Conserved cysteines in the sialyltransferase sialylmotifs form an essential disulfide bond. J Biol Chem, 276, 15200-15207.

Datta, A. K. and Paulson, J. C. (1997) Sialylmotifs of sialyltransferases. Indian J Biochem Biophys, 34, 157-165.

Donadio, S., Dubois, C., Fichant, G., Roybon, L., Guillemot, J. C., Breton, C., and Ronin, C. (2003) Recognition of cell surface acceptors by two human alpha-2,6-sialyltransferases produced in CHO cells. Biochimie, 85, 311-321.

Geremia, R. A., Harduin-Lepers, A., and Delannoy, P. (1997) Identification of two novel conserved amino acid residues in eukaryotic sialyltransferases: implications for their mechanism of action. Glycobiology, 7, v-vii.

Harduin-Lepers, A., Vallejo-Ruiz, V., Krzewinski-Recchi, M. A., Samyn-Petit, B., Julien, S., and Delannoy, P. (2001) The human sialyltransferase family. Biochimie, 83, 727-737.

Jeanneau, C., Chazalet, V., Auge, C., Soumpasis, D. M., Harduin-Lepers, A., Delannoy, P., Imberty, A., and Breton, C. (2004) Structure-function analysis of the human sialyltransferase ST3Gal I- Role of N-glycosylation and a novel conserved sialylmotif. J Biol.Chem., 279, 13461-13468.

Jones, D. T., Taylor, W. R., and Thornton, J. M. (1992) The rapid generation of mutation data matrices from protein sequences. Comput.Appl.Biosci., 8, 275-282.

Kelm, S. and Schauer, R. (1997) Sialic acids in molecular and cellular interactions. Int.Rev.Cytol., 175, 137-240.

Kim, K., Lawrence, S. M., Park, J., Pitts, L., Vann, W. F., Betenbaugh, M. J., and Palter, K. B. (2002) Expression of a functional Drosophila melanogaster N-acetylneuraminic acid (Neu5Ac) phosphate synthase gene: evidence for endogenous sialic acid biosynthetic ability in insects. Glycobiology, 12, 73-83.

Kitazume, K., Inoue, S., Inoue, Y., and Lennarz, W. J. (1997) Identification of sulfated oligosialic acid units in the O-linked glycan of the sea urchin egg receptor for sperm. Proc Natl Acad Sci U S A, 94, 3650-3655.

Page 54: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

48

Kitazume, S., Kitajima, K., Inoue, S., Haslam, S. M., Morris, H. R., Dell, A., Lennarz, W. J., and Inoue, Y. (1996) The occurrence of novel 9-O-sulfated N-glycolylneuraminic acid-capped alpha2-->5-Oglycolyl-linked oligo/polyNeu5Gc chains in sea urchin egg cell surface glycoprotein. Identification of a new chain termination signal for polysialyltransferase. J Biol.Chem, 271, 6694-6701.

Kitazume-Kawaguchi, S., Kabata, S., and Arita, M. (2001) Differential biosynthesis of polysialic or disialic acid Structure by ST8Sia II and ST8Sia IV. J Biol Chem, 276, 15696-15703.

Koles, K., Irvine, K. D., and Panin, V. M. (2004) Functional characterization of Drosophila sialyltransferase. J Biol.Chem., 279, 4346-4357.

Kumar, S., Tamura, K., Jakobsen, I. B., and Nei, M. (2001) MEGA2: molecular evolutionary genetics analysis software. Bioinformatics, 17, 1244-1245.

Kurosawa, N., Inoue, M., Yoshida, Y., and Tsuji, S. (1996) Molecular cloning and genomic analysis of mouse Galbeta1, 3GalNAc- specific GalNAc alpha2,6-sialyltransferase. J Biol Chem, 271, 15109-15116.

Maerz, L., Altmann, F., Staudacher, E., and Kubelka, V. (1997) Protein glycosylation in insects. In Montreuil, J., Schachter, H., and Vliegenthart, J. F.(eds), Glycoproteins. Elsevier Science, Amsterdam, pp. 543-563.

Munster-Kuhnel, A. K., Tiralongo, J., Krapp, S., Weinhold, B., Ritz-Sedlacek, V., Jacob, U., and Gerardy-Schahn, R. (2004) Structure and function of vertebrate CMP-sialic acid synthetases. Glycobiology, cwh113.

Nakata, D., Munster, A. K., Gerardy-Schahn, R., Aoki, N., Matsuda, T., and Kitajima, K. (2001) Molecular cloning of a unique CMP-sialic acid synthetase that effectively utilizes both deaminoneuraminic acid (KDN) and N-acetylneuraminic acid (Neu5Ac) as substrates. Glycobiology, 11, 685-692.

Nei, M. and Kumar, S. (2000) Molecular evolution and phylogenetics.

Oriol, R., Mollicone, R., Cailleau, A., Balanzino, L., and Breton, C. (1999) Divergent evolution of fucosyltransferase genes from vertebrates, invertebrates, and bacteria. Glycobiology, 9, 323-334.

Saito, M., Kitamura, H., and Sugiyama, K. (2001) Occurrence of gangliosides in the common squid and pacific octopus among protostomia. Biochim.Biophys.Acta, 1511, 271-280.

Page 55: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

49

Saito, M., Kitamura, H., and Sugiyama, K. (2002) Occurrence and tissue distribution of c-series gangliosides in the common squid Todarodes pacificus. Comp Biochem Physiol B Biochem Mol Biol., 131, 433-441.

Schauer, R. and Kamerling, J. P. (1997) Chemistry, biochemistry and biology of sialic acids. In Montreuil, J., Vliegenthart, J. F. G., and Schachter, H.(eds), Glycoproteins II. Elsevier, Amsterdam, pp. 243-402.

Schauer, R., Kelm, S., Reuter, G., Roggentin, P., and Shaw, L. (1995) Biochemistry and role of sialic acids. In Rosenberg, A.(ed), Biology of the Sialic Acids. Plenum, New York, pp. 7-67.

Strimmer, K. and von Haeseler, A. (1996) Quartet puzzling: A quartet maximum-likelihood method for reconstructing tree topologies. Molecular Biology and Evolution, 13, 964-969.

Strimmer, K. and von Haeseler, A. (1997) Likelihood-mapping: A simple method to visualize phylogenetic content of a sequence alignment. Proceedings of the National Academy of Sciences of the United States of America, 94, 6815-6819.

Takashima, S., Ishida, H. k., Inazu, T., Ando, T., Ishida, H., Kiso, M., Tsuji, S., and Tsujimoto, M. (2002) Molecular Cloning and Expression of a Sixth Type of alpha 2,8-Sialyltransferase (ST8Sia VI) That Sialylates O-Glycans. J.Biol.Chem., 277, 24030-24038.

Taniguchi, N., Honke, K., and Fukuda, M. (2002) Handbook of glycosyltransferases and related genes.

Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673-4680.

Traving, C. and Schauer, R. (1998) Structure, function and metabolism of sialic acids. Cell Mol.Life Sci., 54, 1330-1349.

Tsuji, S., Datta, A. K., and Paulson, J. C. (1996) Systematic nomenclature for sialyltransferases. Glycobiology, 6, R5-R7.

Tunkijjanukij, S., Giaever, H., Chin, C. C., and Olafsen, J. A. (1998) Sialic acid in hemolymph and affinity purified lectins from two marine bivalves. Comp Biochem.Physiol B Biochem.Mol.Biol., 119, 705-713.

Varki, A. (1997) Sialic acids as ligands in recognition phenomena. FASEB J, 11, 248-255.

Page 56: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

50

2.8. Legends to figures Fig. 1. - Amino acid alignment Vertebrate and insect STs have been aligned with putative orthologues from Takifugu rubripes (fugu), Danio rerio (danio), Branchiostoma floridae (branchiostoma), Ciona intestinalis (Ciona in.), Ciona savignyi (Ciona sa.), and Strongylocentrotus purpuratus (sea urchin) using VectorNTI based on the ClustalW algorithm as described under Methods. Identical residues are printed white on black and conserved residues on gray background. Amino acid residues known to be essential for enzyme activity are indicated with asterisks. Fig. 2. – Phylogeny of sialyltransferases Phylogenetic tree recalculation was carried out with 20 mouse, 1 african clawfrog and 1 fruitfly STs and 44 putative deuterostomian orthologues as described in the text with 1000 bootstrap replicates (>70% bootstrap support percentages are shown at each node). Species are indicated by m, x, d, f, bf, ci, cs, sp, and dm standing for Mus musculus, Xenopus tropicalis, Danio rerio, Takifugu rubripes, Branchiostoma floridae, Ciona intestinalis, Ciona savignyi, Strongylocentrotus purpuratus, and Drosophila melanogaster, respectively. Fig. 3 – Likelyhood mapping analysis of the sialyltransferase subgroups For each quartet of sequences representing the subgroups ST3Gal, ST6GalNAc A, ST6GalNAc B, ST6Gal plus ST8Sia, the likelihood for each of the three possible phylogenetic arrangements have been plottet in a triangle where the tip indicates absolute support for one of the arrangements of the quartets. The percentage of the quartets favoring a grouping of ST6GalNAc B with ST8Sia and ST6Gal to the exclusion of ST6GalNAc A sequences is shown in the upper area of the three regions of the triangle. Fig.4 - Phylogenetic tree of sialyltransferase subgroups Interior branch test ≥ 95% and bootstrap support ≥ 80% (in brackets) from 1000 replicates or from four-cluster likelihood-mapping (mean ± SD) including all possible combinations of the five ST subgroups are given.

Page 57: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

51

2.9. Tables and figures Table 1 Comparison of amino acid sequences of putative fugu STs and other key enzymes of the sialylation apparatus with potential murine orthologues.

Enzyme Entire protein Sialylmotif 1 Sialylmotif 2 ST3Gal family

ST3Gal I 63.5/51.6 84.9/75.5 84.0/64.0 ST3Gal II 76.9/67.5 86.8/79.2 100/95.8 ST3Gal III 79.4/64.2 94.3/77.4 95.8/87.5 ST3Gal IV 53.2/42.9 79.2/69.8 79.2/66.7 ST3Gal V 65.9/53.9 84.9/73.6 87.5/75.0 ST3Gal V.2 46.0/32.1 75.0/57.7 79.2/76.7

ST6Gal family

ST6Gal I 65.5/49.7 88.0/72.0 88.0/64.0 ST6GalNAc family

ST6GalNAc II 53.6/39.4 80.0/62.2 92.0/87.0 ST6GalNAc II.2 52.7/38.8 81.3/64.6 91.3/87.0 ST6GalNAc III 68.0/59.2 77.6/67.3 83.3/79.2 ST6GalNAc IV 54.4/43.2 82.8/70.7 78.6/69.0 ST6GalNAc V 62.2/51.6 81.3/70.8 95.8/87.5 ST6GalNAc VI 56.8/45.8 78.3/65.2 91.7/79.2

ST8Sia family

ST8Sia I 69.9/58.6 82.2/73.3 86.4/68.2 ST8Sia II 65.5/56.9 84.1/72.7 92.0/84.0 ST8Sia III 85.3/75.0 87.0/80.4 95.8/83.3 ST8Sia III.2 76.0/67.0 84.4/77.8 100/91.7 ST8Sia V 74.3/64.1 87.0/80.4 95.5/81.8 ST8Sia VI 67.3/50.9 63.0/58.7 90.5/68.2

other key enzymes

UPD-GlcNac-2-epimerase/ ManNAc kinase 87.6/79.1 Neu5Ac-9-phosphate synthase 86.6/77.2 CMP-sialic acid synthase 62.5/50.0 CMP-sialic acid transporter 87.1/78.2 CMP-sialic acid hydroxylase 71.4/64.1

Page 58: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

52

NOTE.- Complete sequences were compared for all proteins. In addition, for STs the sequences encoding sialylmotifs 1 and 2 were compared separately. Similar/identical amino acids are shown as percentage of the sequences compared.

Page 59: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

53

Table 2 Comparison of the genomic structures of putative fugu STs and other key enzymes of the sialylation appparatus with potential mammalian orthologues

Length Exons containing

(aa) Number of Exons sialylmotif 1 Enzyme Fru Mmu Hsa Fru Mmu Hsa Fru Mmu Hsa ST3Gal family

ST3Gal I 333 337 340 6 6 6 2 2 2 ST3Gal II 350 350 350 6 6 6 2 2 2 ST3Gal III 370 375 375 11 11 11 2 2 2 ST3Gal IV * $280 333 333 $3 10 10 2 2 2 ST3Gal V $289 387 362 $5 6 6 2 2 2 ST3Gal V.2 383 / / 7 / / 2 / /

ST6Gal family

ST6Gal I $322 403 406 6 5 5 2 2 2 ST6GalNAc family

ST6GalNAc II 371 373 374 $8 9 9 2 2 2 ST6GalNAc II.2 371 / / $8 / / 2 / / ST6GalNAc III 311 305 305 5 5 5 1 1 1 ST6GalNAc IV 306 302 302 5 5 5 1 1 1 ST6GalNAc V 313 335 336 5 5 5 1 1 1 ST6GalNAc VI $302 333 333 $4 5 5 1 1 1

ST8Sia family

ST8Sia I 337 355 356 5 5 5 2 2 2 ST8Sia II 352 375 375 6 6 6 2 2 2 ST8Sia III 371 380 380 4 4 4 1 1 1 ST8Sia III.2 403 / / 4 / / 1 / / ST8Sia V* 372 376 376 7 7 7 2 2 2 ST8Sia VI $219 398 398 $4 8 8 2 2 2

other key enzymes

UPD-GlcNac-2-epimerase/ 716 722 722 12 11 11 ManNAc kinase Neu5Ac-9-phosphate 358 359 359 6 6 6 synthase CMP-sialic acid synthase 432 431 434 8 8 8 CMP-sialic acid transporter 338 336 337 8 8 8 CMP-sialic acid hydroxylase $419 577 / $9 14 /

Page 60: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

54

NOTE.- The genomic organization of murine (Mmu), human (Hsa) and putative fugu (Fru) STs and other enzymes involved in synthesis of sialylated glycoconjugates are presented. STs with different splice variants in mammals are marked with asterisks. Sequence length and exon numbers of putative fugu STs with incomplete 5’ and /or 3’ ends are indicated by $.

Page 61: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

55

Fig. 1

Page 62: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

56

Fig.2

Page 63: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Sialylation

57

Fig.3

Fig.4

Page 64: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

58

3. Evolution of sialic acid-binding proteins:

Molecular cloning and expression of fish Siglec-4

Friederike Lehmann, Heiko Gäthje, Sørge Kelm, Frank Dietz

Published in Glycobiology (Advance Access, June 30, 2004, cwh120)

Declaration of my contribution to the publication The experiments stated in this manuscript were designed, processed and interpreted by myself except for the Northern blot, which was performed by Dr. Frank Dietz. Specificity assays were done in collaboration with Heiko Gäthje. I have written the publication in collaboration with all co-authors.

Page 65: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

59

Evolution of sialic acid-binding proteins: Molecular cloning and expression

of fish Siglec-4

Friederike Lehmann1, Heiko Gäthje, Sørge Kelm, Frank Dietz

Centre for Biomolecular Interactions Bremen, Department of Biology and Chemistry,

University Bremen, 28334 Bremen, Germany

Running title: Evolution of siglecs

Keywords: myelin-associated glycoprotein, siglecs, evolution, Danio rerio, Takifugu rubripes

1To whom correspondance should be addressed: Phone: ++49 (421) 218-4324,

FAX:++49(421) 218-7433, email: [email protected] or [email protected]

Page 66: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

60

3.1. Abstract Siglecs are the largest family of sialic-acid recognizing lectins identified so far with eleven members in the human genome. Most of these siglecs are exclusively expressed by cells of the immune system. Comparison of different mammalian species has revealed differential and complex evolutionary paths for this protein family even within the primate lineage. To understand the evolution of siglecs, in particular the origin of this family, we investigated the occurrence of corresponding genes in bony fish. Interestingly, only unambiguous orthologues of mammalian Siglec-4, a cell adhesion molecule expressed exclusively in the nervous system, could be identified in the genomes of fugu and zebrafish, whereas no obvious orthologues of the other mammalian siglecs were found. As in mammals fish Siglec-4 expression is restricted to nervous tissues as demonstrated by Northern blot. Expressed as recombinant protein, fish Siglec-4 binds to sialic acids with a specificity similar to the mammalian orthologues. Relatively low sequence similarities in the cytoplasmic tail as well as an additional splice variant found in fish Siglec-4 suggest alternative signaling pathways compared to mammalian species. Our observations suggest that this siglec occurs at least in the nervous system of all vertebrates.

Page 67: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

61

3.2. Introduction Sialic acids (Sia) are a diverse family of acidic ketoses with 9-carbon backbones being present in all deuterostomes as well as on some protostomes, protozoa, bacteria and viruses. Based on the role of Sia as recognition determinants for various pathogenic agents such as viruses, bacteria, protozoa, and toxins, the unusual structural complexity of Sia has been interpreted as a result of evolutionary arms race between the hosts and pathogens (Varki 1997; Karlsson 1998; Angata and Varki 2002). More recent studies have revealed another prominent role of Sia, namely their functions as ligand determinants for endogeneous lectins (Kelm and Schauer 1997). A group of structurally related proteins recognizing Sia has been identified to play an important role in cellular communication of mammalian cells. These siglecs (sialic acid-binding immunoglobulin-like lectins) are the largest family of Sia-recognizing lectins discovered so far. Structurally, all siglecs are type I membrane proteins, consisting of typical Ig-like domains characterized by two opposing β-sheets held together by a disulfide bridge (Barclay and Brown 1997), one N-terminal V-set Ig-domain, a variable number of C2-set Ig-domains, a single-pass transmembrane domain, and a cytoplasmic tail. The N-terminal domains of all siglecs share several similarities in their amino acid sequences. In contrast to the regular pattern of cysteins found in typical members of the IgSF, which form an intersheet disulfide bridge between the B and F strands, siglecs contain an intrasheet bridge between the B and E strands. This allows a somewhat larger distance between the two ß-sheets which appears to be important for the binding of Sia residues. One additional cystein residue occurs at the C-terminal end of the B strand which builds an interdomain bridge to the following domain 2. Whereas most of the other conserved amino acids are important for the overall structure of the Ig-fold, three amino acid residues in domain 1 are critical for sialic acid recognition (May et al. 1998). Many of the siglecs have potential tyrosine phosphorylation sites, in most cases in the context of an immunoreceptor tyrosine-based inhibitory motif (ITIM), in their cytoplasmic tail, suggesting their involvement in intracellular signaling pathways (Crocker 2002). With the exception of Siglec-4, all siglecs are expressed on the cells of the immune system, mainly on those involved in innate immunity, such as monocytes, macrophages, natural killer cells, and granulocytes (Crocker and Varki 2001a; Crocker and Varki 2001b). By virtue of their differential tissue expression and specificities for Sia, and regarding the fact that they are the only proteins specifically recognizing sialylated glycans found in deuterostomes so far, siglecs are thought to play important roles in a wide array of recognition and signaling events (Crocker 2002; Crocker and Varki 2001a, Crocker and Varki 2001b). Although Sia occur in all deuterostomia, siglecs have only been documented in mammals and birds so far (Crocker 2002; Dulac et al. 1992). Interestingly, the comparison of siglec-related genes from human and great apes revealed differential and complex evolutionary paths

Page 68: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

62

indicating a high rate of evolution within this gene family (Gagneux and Varki 2001; Angata et al. 2001; Brinkman-Van der Linden et al. 2000; Varki 2001a; Varki 2001b). So far, only Siglec-4 has been described from a non mammalian species, from quail, where it is known as Schwann cell myelin protein (SMP) due to its restricted expression pattern (Dulac et al. 1992). In mammals, Siglec-4 is exclusively expressed by myelinating glia cells in the nervous system and is known since over 20 years as myelin-associated glycoprotein (MAG). Several lines of evidence, including those from gene knock out experiments, suggest important roles of Siglec-4/MAG in the maintenance of myelin integrity and the regulation of neuronal growth (Filbin 2003; Hunt et al. 2002; Spencer et al. 2003). To understand the evolution of siglecs on a deeper timescale, it is of particular interest to evaluate deuterostomian branches which diverged early in evolution. Here we report the molecular cloning and characterization of siglec orthologues expressed in three fish species. They exhibit all structural features typical for siglecs, including all amino acid residues known to be essential for the interaction with Sia. Apparently, these proteins represent the fish orthologues of Siglec-4, since high levels of sequence similarities were only found with this siglec.

3.3. Results

Identification of putative siglec sequences in fish genomes As a part of systematic in silico analysis of the siglec family, the Takifugu rubripes whole genome shotgun assembly and the Ensembl Zebrafish Genome Browser were searched using complete sequences of different mammalian siglecs as well as sequences encoding only the first N-terminal domains as queries. Scaffolds producing expect-values below 10-5 were assembled into contigs and the resulting genomic sequences were analyzed for putative protein-coding sequences. Searches using sequences of Siglec-4/MAG as queries lead to significant alignments with 56% identical amino acids in the N-terminal domain. Eight exons were predicted for the coding region of putative fish Siglec-4. Also against the Xenopus tropicalis genome at JGI homology searches with mammalian Siglec-4/MAG sequences revealed a fragment showing high similarity values to mammalian Siglec-4/MAG (scaffold_35387). In addition, in the fish databases we identified several sequences with low degrees of sequence similarity to domain 1 of mammalian siglecs (10.6 -27.0% identical amino acids). In an attempt to identify further siglecs, the predicted amino acid sequences were aligned with the N-terminal domains of mammalian siglecs and analysed more closely. Some of the candidates (sequences I, II, VII, and VIII; see table I) contained the main characteristic features of siglecs, such as the unusual distribution of cysteins and the typical amino acids found in the binding site. In contrast, the other genes (sequences III, IV, V, VI, and IX; see

Page 69: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

63

table I) are very unlikely to encode siglecs, since their pattern of cystein residues did not reflect the situation found in siglecs, nor could we assign the residues typical for siglecs. The potential additional fish siglecs mentioned above could not be assigned as relatives to any of the known mammalian siglecs due to the low sequence similarity to these proteins. Even the degree of similarity between fugu and zebrafish genes (12-32% identical amino acids) was to low to allow assignments. Therefore, a conclusive phylogenetic analysis of all putative siglec sequences is not possible without additional information on related genes in amphibians, birds and reptiles.

Structure prediction analysis of fish Siglec-4 An analysis of the amino acid sequences of putative fish Siglec-4 predicted a structure with 5 characteristic Ig-like domains suggesting their affiliation with the immunoglobulin superfamily (IgSF) (Gough et al. 2001). Using the complete cDNA of the putative fish Siglec-4 orthologues (s. below) as well as the sequence from X. tropicalis, a sequence alignment was produced together with the sequences of the known mammalian representatives of Siglec-4/MAG (for domains 1 and 2 see Fig. 1). A comparison of complete fish Siglec-4 sequences with mammalian orthologues revealed 38% identical amino acids, whereas the similarity to other mammalian siglecs was much lower (6.6% to 21.0% identical amino acids) . The degree of conservation is higher in the extracellular domains, especially in the first N-terminal domain with a successive decrease of conservation from first to fifth domain (56% to 35% identical amino acids). The fragment of X. tropicalis Siglec-4 show a slightly higher degree of similarity to mammalian counterparts (52% identical amino acids) than to fish representatives of Siglec-4 (about 44% identical amino acids). The cytoplasmic tail is less conserved (21% identical amino acids between fish and mammalian Siglec-4). Both fish and putative X. tropicalis Siglec-4 cDNAs encode a protein with the typical features of all siglecs previously described. Particularly, the three amino acid residues critical for sialic acid recognition are the arginine in β-strand F (Arg99 in fugu Siglec-4) and two aromatic amino acids, one in β-strand A (Trp2 in fugu Siglec-4) and one in β-strand G (Tyr108 in fugu Siglec-4). Also the cystein residues forming the characteristic intrasheet bridges between the B and F strands of domain 1 and between the C-terminal end of the B-strand in domain 1 and domain 2 are present in the fish sequences. Furthermore, they show Siglec-4-characteristic features like the absence of a salt bridge found in domain one of all other siglecs and one additional disulfide bridge in domain five.

Molecular cloning of fugu Siglec-4 cDNA From the in silico analysis described above, most parts of a mRNA coding for fugu Siglec-4 could be predicted. However, the 5’- and 3’-ends remained completely unclear. In order to confirm the predicted exons and to obtain the 5’- and 3’-exons of fugu Siglec-4 mRNA, RACE experiments were performed using fugu brain RNA as template and sequenced. The

Page 70: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

64

cDNA finally cloned contained all exons predicted from the in silico analysis plus one additional exon encoding the signal peptide leading to an open reading frame of 1902 nucleotides (Fig. 2). In addition, evidence for alternative splicing at the 3’-end leading to three different C-termini was obtained as will be described below.

Genomic Organization of Fish Siglec-4 The genomic organization of Siglec-4 from fugu and zebrafish and the different mammalian Siglec-4/MAG representatives was obtained by comparing the cloned cDNA with the genomic sequences of these genes (Fig. 3). All exon/intron boundaries were defined precisely. The signal peptide is encoded by a separate exon, followed by five exons encoding the Ig-like domains. The transmembrane domain is encoded by a separate exon and is followed by a cytoplasmatic domain which is encoded by two or more exons as described below. Only minor differences were found between the genomic organization of fish Siglec-4 orthologues and the mammalian representatives (Fig. 3). Notably, the exons encoding domains one to four have identical length in fish and mammalian Siglec-4. Only small differences were found in the lengths of exons coding for domain 5 and the membrane anchor (Fig. 3). However, substantial size differences were observed for the corresponding introns of zebrafish, fugu and mammalian Siglec-4 genes. In accordance with the overall situation in these genomes, most of the fugu introns were significantly smaller than those from the other species (Fig. 3).

Splice variants of fish Siglec-4 Sequencing of 3´RACE- PCR products revealed the presence of at least three splice variants at the 3’-end of the coding sequence leading to three different C-termini as shown schematically in Fig. 4A. The cytoplasmic region of the variant shown in Fig. 2 was termed L-tail (accession numbers AJ628724 for fugu and AJ628726 for zebrafish), since it resembles mammalian L-MAG. The second splice variant contains one additional sequence of 54 nucleotides (exon 9 in Fig. 4A) coding for additional 18 amino acids and consequently the corresponding cytoplasmic region has been termed XL-tail (accession numbers AJ628725 for fugu and AJ628727 for zebrafish). The third splice variant of the gene contains all exons present in the XL-tail variant plus the 252 nucleotides between exons 7 and 9 (exon 8 in Fig. 4A). Since this splice form introduces a termination codon at the beginning of exon 8, the resulting truncated C-terminus has been termed S-tail (accession number AJ628729), corresponding to mammalian S-MAG. In mammalian Siglec-4, the cytoplasmic regions of L- and S-MAG contain different phosphorylation sites. Since these are likely to be relevant for signal transduction, the three cytoplasmic tails of fish Siglec-4 were analyzed for consensus sequences with the potential for signal transduction. Whereas no such sites were found in the S-tail of fish Siglec-4, the L-tail and the XL-tail sequences contain three putative tyrosine-based phosphorylation sites with two of them found in the context of an ITIM (LYY545SAV and LNY581ASL in the L-form of

Page 71: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

65

fugu, SDY566QSV and LNY582AAL in the L-form of zebrafish) (Fig. 4B). Interestingly, these phosphorylation sites are significantly more similar to such sites in several other siglecs than to the phosphorylation sites in mammalian L-MAG, whereas outside of these sites the sequences of the cytoplasmic regions have only very low similarities (Fig. 4B). Surprisingly, exons 7 and 9 encode for two potential sumoylation sites (S I and SII; Fig. 4C). Wheras S I occurs in all splice variants, S II occurs only in the XL-tail of Siglec-4. In none of the known mammalian or avian siglecs a corresponding sequence has been identified.

Expression pattern of Fish Siglec-4 For Northern blot analysis we used RNA from carp tissues because of the size of this fish, the possibility to obtain sufficient amounts of RNA from different tissues and its close relationship to zebrafish. For this purpose an authentic probe from carp Siglec-4 was synthesized and sequenced (accession number AJ628728). It covered exons 2-4 showing 87% and 73% sequence identity to zebrafish and fugu, respectively. As shown in Fig. 5 the expression of Siglec-4 appears to be restricted to nervous tissues with higher levels in lobus opticus, cerebellum and spinal cord and lower levels in bulbus olfactorius, lobus inferior and rhombencephalon. The major mRNA species in carp is estimated to be about 6.0 kilobases which is twice as large as in avian or mammalian species (Fujita et al. 1989; Dulac et al. 1992).

Fish Siglec-4 as Sia-dependent cell adhesion molecules To investigate whether the predicted fish Siglec-4 orthologues represent true siglecs, it was important to demonstrate their ability to bind Sia-containing glycoconjugates. Therefore, cell adhesion assays were performed with COS-7 fibroblasts transfected with full-length fugu Siglec-4 cDNA and human erythrocytes as target cells. These experiments clearly demonstrated that fugu Siglec-4 mediates Sia-dependent adhesion of erythrocytes (Fig. 6), since binding was completely abolished by pretreating erythrocytes with sialidase. Notably, sialidase pretreatment of COS-7 cells was not required for erythrocyte binding, indicating that at least in this cell system fugu Siglec-4 is not occupied by Sia-containing glycoconjugates on the same cell surface (cis-ligands). Fc-chimeras of siglecs are commonly used to characterize their binding specificity (Kelm 2001). Therefore, recombinant Fc-chimeras containing the extracellular three or five domains of fugu and zebrafish Siglec-4 were prepared. On SDS-PAGE Fc-chimeras of fish Siglec-4 exhibited a higher apparent molecular weight than the corresponding murine protein despite identical sequence length (data not shown). This could be explained by a different glycosylation of these proteins, since fish Siglec-4 contains an additional potential N-glycosylation site in domain three (see Fig. 2). Complexed with anti-human IgG, Fc-chimeras of fugu and zebrafish Siglec-4 showed robust binding to the GlycoWell™ plates derivatized with Neu5Ac at levels similar to the

Page 72: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

66

corresponding murine protein. All siglecs investigated so far require an intact extracyclic glycerol side chain of Neu5Ac for binding. Also the binding of fish Siglec-4 to periodate oxidized Neu5Ac on GlycoWell™ plates is reduced to background levels (Fig. 7). Mammalian Siglec-4/MAG binds preferentially to α2,3-linked Sia (Strenge et al. 1998). For a further investigation on the glycan specificity, hapten inhibition assays were performed with increasing concentrations of α2,3- or α2,6-sialyllactose. Similar to murine Siglec-4/MAG, fugu Siglec-4 binding is reduced by α2,3-sialyllactose in a concentration-dependent manner, whereas no inhibition by α2,6-sialyllactose was observed (Fig. 7). These data provide evidence that also fish Siglec-4 binds with high preference to α2,3-linked Sia. In all the above mentioned binding experiments no differences were found between murine and fugu Siglec-4. However, whereas murine Siglec-4/MAG Fc-chimeras bind well to immobilized fetuin or polyacrylamide derivatives containing α2,3-linked Sia (unpublished observation), no specific binding of fish Siglec-4 to these glycoconjugates was observed (data not shown).

3.4. Discussion In mammals, siglecs appear to play critical roles in mediating cell-cell interactions and signaling functions in the haematopoietic, immune and nervous systems (Crocker and Varki 2001b; Crocker 2002). This study has been initiated to explore the occurrence of siglecs in fish, since these represent the most distant vertebrates sharing with mammals basically all blood cells, a nervous system with myelinated axons, and a complex immune system. We were able to identify genomic sequences of a putative Siglec in two fish species (Takifugu rubripes and Danio rerio). The encoded proteins were clearly identified as fish orthologues of mammalian Siglec-4/MAG based on the degree of sequence identity and exhibit all structural features described for Siglecs in general as well as those specific for Siglec-4. To identify these putative orthologues as authentic siglecs, binding studies have been performed providing direct evidence that these proteins are true fish orthologues of mammalian Siglec-4/MAG by showing specificity for α2,3-linked Sia. Although in this respect the binding specificity appears identical in fish and mammalian Siglec-4, the observation that fish Siglec-4 failed to bind to immobilized fetuin suggests that fish and mammalian orthologues have different selectivities for specific glycan structures. Further studies will be necessary to investigate these aspects of fine specificity in particular towards the modifications of sialic acids, since besides Neu5Ac other Sia such as Neu5Gc and KDN are present in high amounts in fish glycoconjugates (Inoue and Inoue 1997). In mammals Siglec-4/MAG is expressed exclusively by myelinating glia cells in the central and peripheral nervous system. It has been proposed to regulate interactions between glia cells and neurons involved in events like myelination, axonal growth and signal transduction

Page 73: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

67

(Schachner and Bartsch 2000; Domeniconi et al. 2002; Wong et al. 2002; Vinson et al. 2003). However, it has been well established that in contrast to mammalian myelin, fish myelin does not inhibit neuronal growth, and this difference has been addressed to myelin composition (Schweitzer et al. 2003; Klinger et al. 2004). As demonstrated by Northern blot analysis, also in fish the expression of Siglec-4 appears to be restricted to the nervous system. This is in good agreement to the tissue distribution described for mammalian Siglec-4/MAG. However, to clarify with certainty, whether also in fish the expression is restricted to myelinating cells, additional histochemical experiments will be necessary. Interestingly, in contrast to the two splice variants found for mammalian Siglec-4 (Fujita et al. 1989; Heape et al. 1999; Ishiguro et al. 1991) in fish alternative splicing leads to three isoforms with different cytoplasmic tails with two of them containing potential phosphorylation and sumoylation sites. Notably, two of the three potential tyrosine-based phosphorylation sites are found in the context of ITIMs, a characteristic feature of the Siglecs found in the immune system, but absent in the mammalian and avian representatives of Siglec-4. The existence and high conservation of those motifs potentially involved in signal transduction in fish suggest important biological functions different from those described for mammalian Siglec-4. Furthermore, the existence of ITIMs in fish Siglec-4 indicates, that ITIMs may be an ancestral feature of siglecs, which persisted in most siglecs (Fig. 4B), but was lost later during the evolution of Siglec-4 at least in mammals and birds. This rises the question whether Siglec-4 represents the common ancestor also of immune system-related siglecs. In this respect it is interesting that we could identify several additional potential siglec genes in both the fugu and the zebrafish genomes representing additional candidates for such a common ancestor. However, the low degree of sequence similarities does not allow a reasonable phylogenetic analysis. Additional insight in this topic could be expected to come from the analysis of other deuterostomian genomes, e.g. from amphibians, reptiles, birds, or non-vertebrate deuterostomes evolutive older than fish. If compared with the much higher degree of conservation found for the enzymes of the Sia metabolism, the results of this study suggest that the siglec representatives known to date are relatively new inventions in sialic acid evolution. In this context, our observation, that Siglec-4 is the only siglec known to date that shows a high conservation from fish to mammals is of particular interest. This indicates a high evolutive pressure for this protein suggesting an indispensable role for Siglec-4 in the maintenance of a nervous system with myelinated axons. In contrast, other putative siglecs apparently maintained a high degree of structural and possibly functional flexibility during evolution. It may be speculated that this reflects the adaptive potential of the immune system and its flexibility during evolution.

Page 74: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

68

3.5. Materials and Methods Fugu brain RNA was obtained from MRC Gene Service. Tissue culture media were obtained from Invitrogen. Unless otherwise specified, all other reagents were purchased from Sigma.

Homology searches for siglecs in fish and frog genomes The sequences of human and murine siglecs were used as templates in homology searches of Takifugu rupripes whole genome shotgun assembly version 2.0 at the DOE Joint Genome Institute (JGI) web site (Aparicio et al. 2002), the Ensembl Zebrafish Genome Browser at the Sanger Institute and JGI Xenopus tropicalis web site using the tblastn program. Scaffolds showing e-values below 10-5 were assembled using VectorNTI and analyzed for putative protein-coding sequences using the GENSCAN Web Server at the MIT Department of Biology (Burge and Karlin 1997). The multiple protein sequence alignment was constructed by VectorNTI using ClustalW algorithm (Thompson et al. 1994) and adjusted manually if necessary to accommodate structural restrictions, such as the positions of conserved cystein residues and predictable β-strands. Protein sequences of putative Siglec-4 orthologues were analyzed using the ExPASy Molecular Biology Server, the superfamily1.63 HMM library and genome assignments server and the prediction servers available at the CBS homepage. Genomic structures were deduced from comparison of cDNAs with genomic DNA sequences.

Molecular cloning of a full length cDNA encoding fugu Siglec-4 To obtain 5´- ends of the cDNA RACE-PCR was performed using the GeneRacer™ Kit (Invitrogen) and gene specific primer 5´-TAGCCATTGTCTCCGACGCAAGTATAGA-3´. To obtain the entire coding region, 5´-end and 3´-ends were first amplified separately by PCR using the primer sets 5´- GCTCTAGAGTGGAAACCATGTGGTGTTT-3´, complementary to nucleotides 1-20 (Fig. 2), and 5´-TAGCCATTGTCTCCGACGCAAGTATAGA-3´, complementary to nucleotides 914-941 for the 5´-end and 5´-TCAAGTATGCTCCTCGCTCTGTG-3´, complementary to nucleotides 716-738 and 5´-CCGGATCCGGGCCCTCATTTGGCTTTGATTTCCCTATAGTTG-3´, complementary to nucleotides 1875-1902, for the 3´-end. Fugu brain cDNA generated using RevertAid™ H Minus First Strand cDNA Synthesis Kit (Fermentas) was used as template. The above PCR was performed as follows: 94 °C for 2 min, 25 cycles of 94 °C for 30 s, 55 °C for 45 s, 72 °C for 90 s, and 72 °C for 2 min. The resulting 5´-fragment was digested with XbaI and XhoII, the resulting 3´-fragment with XhoII and BamHI and dephosphorylated with shrimp alkaline phosphatase (Fermentas). Both fragments were ligated into XbaI/BamHI cut pcDNA 3.1 vector (Invitrogen) and sequenced.

Page 75: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

69

Cell adhesion assays Fugu Siglec-4/pcDNA 3.1 constructs were transfected using ExGen 500 in vitro transfection reagent (Fermentas) into COS-7 cells. Erythrocyte adhesion assays were performed 48h after transfection, with or without Vibrio cholerae sialidase (Dade Behring) pretreatment of COS-7 cells or erythrocytes as described previously (Kelm et al. 1994).

Production of recombinant fish Siglec-4-Fc A cDNA fragment encoding the three N-terminal Ig-like domains of fugu Siglec-4 was amplified by PCR using 5´-GCTCTAGAGTGGAAACCATGTGGTGTTT-3´ (nucleotides 1-20) and 5´-CCGGATCCACTTACCTGTCTTGACCGCCAGATACATGGAGGT-3´ (complementary to nucleotides 955-978) as primers and fugu Siglec-4/pcDNA 3.1 as template. The corresponding fragment coding for the three N-terminal domains of zebrafish Siglec-4 was cloned using primers 5´-GCTCTAGAATGAAGGGCTTAGAGCTGCT-3´ and 5´-CCGGATCCACTTACCTGTATTTACTGCCAGATACAT-3´ and zebrafish brain cDNA as template. The fish Siglec-4d1-3 fragments were digested with XbaI and BamHI and ligated to XbaI/BamHI cut pIgBOS vector (van der Merwe et al. 1995). Sequences of both inserts were verified by sequencing. The constructs leading to soluble proteins were transfected as described above, culture supernatants were collected, and the chimeras were purified on protein A-sepharose (Amersham Biosciences) as described previously (Crocker and Kelm 1996).

Northern blot analysis Total RNA was prepared from different carp (Cyprinus carpio) tissues using TriFast reagent (Peqlab) following the manufacturers recommendations. Northern blots were prepared as described (Gieselmann et al. 1989) using 7µg RNA per lane. Probes for carp Siglec-4 domains 1 and 2 were prepared by nested PCR using carp brain cDNA, generated as described above, as template and the following primers: 5´- CAGTGGAATGTGTGGATGCCN-3´ (sense primer 1), 5´- ATTTCAGCCATGACAAACTCCTN- 3´ (sense primer 2), 5´- GGGGCAGGATTACTGTCTACATCACN-3´ (antisense primer 1) and 5´- GTGTTTGGGAAGTTGACCCGGCAACCCN-3´ (antisense primer 2) giving a product of the expected size (587 base pairs) which was subcloned into pCR 2.1-TOPO (Invitrogen) and sequenced. Hybridization was performed using ULTRAhyb reagent (Ambion, Inc.) following the manufacturers recommendations.

Binding specificity of fish Siglec-4

Page 76: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

70

GlycoWell™ plates with covalently linked Neu5Ac were used for Siglec-4 binding studies (SW-01-004; Lundonia Biotech). As negative control wells were treated with 150µl 10mM sodium periodate in 0.1M sodium acetate pH 5.5. for 1h at 4°C, rinsed with H2O, incubated for 30 minutes at 4°C with sodium borohydride (5mg/ml), and thoroughly washed with H2O. Purified Siglec-4 Fc-chimeras (0.5 µg/ml final concentration) were complexed with affinity purified AP-conjugated anti-human IgG (0.3 µg/ml final concentrations; Dianova) and incubated in the presence of 0 to 5 mM α2,3- or α2,6-sialyllactose in the GlycoWell™ plates (20µl/well). After an incubation for 4 hours at 4°C, the wells were washed with HBS/0.05 % Tween-20 (5 x 200 µl). Bound AP was quantified kinetically with 15µM fluorescein diphosphate (50 µl/well, MoBiTec) in 50mM Tris-HCl pH8.5/10mM MgCl2 as substrate (excitation at 485 nm, emission at 520 nm; Fluoroskan Ascent SL, Thermo Labsystems). Assays were performed in duplicate and repeated at least three times. Inhibition curves were calculated using the ligand module of SigmaPlot (SPSS Inc.) assuming a single binding site for the inhibiting oligosaccharides.

3.6. Acknowledgements We thank Nicolai Bovin, Moscow, Russia for the kind donation of sialyllactose-PAA-derivatives and Nazila Isakovic for technical assistance. The financial support by Zentrale Forschungsförderung der Universität Bremen and Volkswagen Foundation is acknowledged.

3.7. Abbreviations Ig, immunoglobulin; IgSF, Immunoglobulin superfamily; ITIM, immunoreceptor tyrosine-based inhibitory motif; MAG, myelin-associated glycoprotein; Neu5Ac, N-acetylneuraminic acid; Neu5Gc, N-glycolylneuraminic acid; PCR, polymerase chain reaction; RACE, rapid amplification of cDNA ends; Siglecs, sialic acid-binding immunoglobulin-like lectins; Sn, sialoadhesin

Page 77: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

71

3.8. References Angata, T. and Varki, A. (2002) Chemical diversity in the sialic acids and related alpha-keto acids: an evolutionary perspective. Chem.Rev., 102, 439-469.

Angata, T., Varki, N. M., and Varki, A. (2001) A second uniquely human mutation affecting sialic acid biology. J.Biol.Chem., 276, 40282-40287.

Aparicio, S., Chapman, J., Stupka, E., Putnam, N., Chia, J. m., Dehal, P., Christoffels, A., Rash, S., Hoon, S., Smit, A. F., Sollewijn Gelpke, M. D., Roach, J., Oh, T., Ho, I. Y., Wong, M., Detter, C., Verhoef, F., Predki, P., Tay, A., Lucas, S., Richardson, P., Smith, S. F., Clark, M. S., Edwards, Y. J. K., Doggett, N., Zharkikh, A., Tavtigian, S. V., Pruss, D., Barnstead, M., Evans, C., Baden, H., Powell, J., Glusman, G., Rowen, L., Hood, L., Tan, Y. H., Elgar, G., Hawkins, T., Venkatesh, B., Rokhsar, D., and Brenner, S. (2002) Whole-Genome Shotgun Assembly and Analysis of the Genome of Fugu rubripes. Science, 297, 1301-1310.

Barclay, A. N. and Brown, M. H. (1997) Heterogeneity of interactions mediated by membrane glycoproteins of lymphocytes. Biochem Soc.Trans., 25, 224-228.

Brinkman-Van der Linden, E.C., Sjoberg, E. R., Juneja, L. R., Crocker, P. R., Varki, N., and Varki, A. (2000) Loss of N-glycolylneuraminic acid in human evolution. Implications for sialic acid recognition by siglecs. J.Biol.Chem., 275, 8633-8640.

Burge, C. and Karlin, S. (1997) Prediction of complete gene structures in human genomic DNA. J Mol.Biol., 268, 78-94.

Crocker, P. R. (2002) Siglecs: sialic-acid-binding immunoglobulin-like lectins in cell-cell interactions and signalling. Curr.Opin.Struct.Biol, 12, 609-615.

Crocker, P. R. and Kelm, S. (1996) Methods for studying the cellular binding properties of lectin- like receptors. In Herzenberg, L. A., Weir, D. M., and Blackwell, C.(eds), Weir's Handbook of Experimental Immunology. Blackwell Science, Cambridge, pp. 166.1-166.11.

Crocker, P. R. and Varki, A. (2001a) Siglecs in the immune system. Immunology, 103, 137-145.

Crocker, P. R. and Varki, A. (2001b) Siglecs, sialic acids and innate immunity. Trends in Immunology, 22, 337-342.

Domeniconi, M., Cao, Z., Spencer, T., Sivasankaran, R., Wang, K., Nikulina, E., Kimura, N., Cai, H., Deng, K., Gao, Y., He, Z., and Filbin, M. (2002) Myelin-associated glycoprotein interacts with the Nogo66 receptor to inhibit neurite outgrowth. Neuron, 35, 283-290.

Page 78: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

72

Dulac, C., Tropak, M. B., Cameron Curry, P., Rossier, J., Marshak, D. R., Roder, J., and Le Douarin, N. M. (1992) Molecular characterization of the Schwann cell myelin protein, SMP: structural similarities within the immunoglobulin superfamily. Neuron, 8, 323-334.

Filbin, M. T. (2003) Myelin-associated inhibitors of axonal regeneration in the adult mammalian CNS. Nat.Rev.Neurosci., 4, 703-713.

Fujita, N., Sato, S., Kurihara, T., Kuwano, R., Sakimura, K., Inuzuka, T., Takahashi, Y., and Miyatake, T. (1989) cDNA cloning of mouse myelin-associated glycoprotein: a novel alternative splicing pattern. Biochem.Biophys.Res.Commun., 165, 1162-1169.

Gagneux, P. and Varki, A. (2001) Genetic differences between humans and great apes. Mol.Phylogenet.Evol., 18, 2-13.

Gieselmann, V., Polten, A., Kreysing, J., and von Figura, K. (1989) Arylsulfatase A pseudodeficiency: loss of a polyadenylylation signal and N-glycosylation site. Proc.Natl.Acad.Sci.U.S.A, 86, 9436-9440.

Gough, J., Karplus, K., Hughey, R., and Chothia, C. (2001) Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J Mol Biol., 313, 903-919.

Heape, A. M., Lehto, V. P., and Kursula, P. (1999) The expression of recombinant large myelin-associated glycoprotein cytoplasmic domain and the purification of native myelin-associated glycoprotein from rat brain and peripheral nerve. Protein Expr Purif, 15, 349-361.

Hunt, D., Coffin, R. S., and Anderson, P. N. (2002) The Nogo receptor, its ligands and axonal regeneration in the spinal cord; a review. J Neurocytol., 31, 93-120.

Inoue, S. and Inoue, Y. (1997) Fish Glycoproteins. In Montreuil, J., Vliegenthart, J. F., and Schachter, H.(eds), Glycoproteins II. Elsevier Science, Amsterdam, pp. 143-162.

Ishiguro, H., Sato, S., Fujita, N., Inuzuka, T., Nakano, R., and Miyatake, T. (1991) Immunohistochemical localization of myelin-associated glycoprotein isoforms during the development in the mouse brain. Brain Res., 563, 288-292.

Karlsson, K. A. (1998) Meaning and therapeutic potential of microbial recognition of host glycoconjugates. Molecular Microbiology, 29, 1-11.

Kelm, S. (2001) Ligands for Siglecs. In Crocker, P. R.(ed), Mammalian Carbohydrate Recognition Systems. Springer, Berlin, pp. 153-176.

Page 79: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

73

Kelm, S., Pelz, A., Schauer, R., Filbin, M. T., Tang, S., De Bellard, M. E., Schnaar, R. L., Mahoney, J. A., Hartnell, A., Bradfield, P., and et al. (1994) Sialoadhesin, myelin-associated glycoprotein and CD22 define a new family of sialic acid-dependent adhesion molecules of the immunoglobulin superfamily. Curr Biol, 4, 965-72.

Kelm, S. and Schauer, R. (1997) Sialic acids in molecular and cellular interactions. Int Rev Cytol, 175, 137-240.

Klinger, M., Taylor, J. S., Oertle, T., Schwab, M. E., Stuermer, C. A., and Diekmann, H. (2004) Identification of Nogo-66 Receptor (NgR) and Homologous Genes in Fish. Mol Biol.Evol., 21, 76-85.

May, A. P., Robinson, R. C., Vinson, M., Crocker, P. R., and Jones, E. Y. (1998) Crystal structure of the N-terminal domain of sialoadhesin in complex with 3' sialyllactose at 1.85 A resolution. Mol Cell, 1, 719-28.

Schachner, M. and Bartsch, U. (2000) Multiple functions of the myelin-associated glycoprotein MAG (siglec- 4a) in formation and maintenance of myelin. Glia, 29, 154-165.

Schweitzer, J., Becker, T., Becker, C. G., and Schachner, M. (2003) Expression of protein zero is increased in lesioned axon pathways in the central nervous system of adult zebrafish. Glia, 41, 301-317.

Spencer, T., Domeniconi, M., Cao, Z., and Filbin, M. T. (2003) New roles for old proteins in adult CNS axonal regeneration. Curr.Opin.Neurobiol., 13, 133-139.

Strenge, K., Schauer, R., Bovin, N., Hasegawa, A., Ishida, H., Kiso, M., and Kelm, S. (1998) Glycan specificity of myelin-associated glycoprotein and sialoadhesin deduced from interactions with synthetic oligosaccharides. Eur J Biochem, 258, 677-85.

Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673-4680.

van der Merwe, P. A., McNamee, P. N., Davies, E. A., Barclay, A. N., and Davis, S. J. (1995) Topology of the CD2-CD48 cell-adhesion molecule complex: implications for antigen recognition by T cells. Curr.Biol., 5, 74-84.

Varki, A. (1997) Sialic acids as ligands in recognition phenomena. FASEB J., 11, 248-255.

Varki, A. (2001a) Loss of N-glycolylneuraminic acid in humans: Mechanisms, consequences, and implications for hominid evolution. Am.J.Phys.Anthropol., Suppl 33, 54-69.

Page 80: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

74

Varki, A. (2001b) N-glycolylneuraminic acid deficiency in humans. Biochimie, 83, 615-622.

Vinson, M., Rausch, O., Maycox, P. R., Prinjha, R. K., Chapman, D., Morrow, R., Harper, A. J., Dingwall, C., Walsh, F. S., Burbidge, S. A., and Riddell, D. R. (2003) Lipid rafts mediate the interaction between myelin-associated glycoprotein (MAG) on myelin and MAG-receptors on neurons. Mol.Cell Neurosci., 22, 344-352.

Wong, S. T., Henley, J. R., Kanning, K. C., Huang, K. H., Bothwell, M., and Poo, M. M. (2002) A p75(NTR) and Nogo receptor complex mediates repulsive signaling by myelin-associated glycoprotein. Nat.Neurosci., 5, 1302-1308.

Yoder, J. A., Mueller, M. G., Wei, S., Corliss, B. C., Prather, D. M., Willis, T., Litman, R. T., Djeu, J. Y., and Litman, G. W. (2001) Immune-type receptor genes in zebrafish share genetic and functional properties with genes encoded by the mammalian leukocyte receptor cluster. Proc.Natl.Acad.Sci.U.S.A, 98, 6771-6776.

Page 81: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

75

3.9. Legends to figures Fig. 1. - Amino acid alignment of fish and putative Xenopus tropicalis Siglec-4 domains 1 and 2 with mammalian representatives of Siglec-4. Sequences have been aligned using VectorNTI based on the ClustalW algorithm. The strand assignment in domain 1 is based on the crystal structure of Siglec-1 (May et al. 1998). Amino acid residues known to be essential for the siglec-characteristic disulfide bridges and Sia binding are indicated with one or two asterisks, respectively. Fig. 2. - mRNA and derived amino acid sequences of Siglec-4 from Takifugu rubripes. Dashed and double lines indicate the putative signal peptide and the transmembrane domain, respectively. Potential N-glycosylation sites are marked with a single line. The putative N-terminal ends of the five Ig domains are indicated with arrows and labeled for each domain (D1, D2, D3, D4 and D5). The ITIM motif in the cytoplasmic region (L-tail is shown, see Fig. 4) is boxed. Fig. 3. - Genomic organization of Siglec-4. Schematic diagram showing the genomic structures of fugu and zebrafish Siglec-4 along with mammalian representatives. Exons are shown by boxes with the corresponding lengths indicated in base pairs and introns by the connecting lines with their lengths indicated above each line. The length of the first exon was calculated from the start codon and the length of the last exon is shown up to the stop codon. The upper panel indicates the domains encoded by the exons (TM, transmembrane region; SP, signal peptide). Only exons leading to the L-tail of Siglec-4 (see Fig. 4 for alternative splicing) are shown. Figure is not drawn to scale. Fig. 4. – Splice variants including potential signal transduction sites of fugu Siglec-4. A) The 3’-ends of the tree alternatively spliced variants found in fugu brain mRNA are shown. The names of the corresponding cytoplasmic tails are given on the right. Exon lengths are in base pairs and their denominations are shown below each box. The termination codon in exon 8 leading to the S-tail is indicated. B) Potential tyrosine phosphorylation sites in the cytoplasmic region. Shown is an alignment of the C-terminal parts of Siglec-4 (fugu, zebrafish and human) present in L-and XL-tail with those of the indicated human siglecs expressed in the immune system. Identical residues are printed white on black and conserved residues on gray background. The two potential tyrosine phosphorylation motifs are indicated. As in most siglecs of mammalian immune systems, in fugu and zebrafish Siglec-4 the proximal motif is an ITIM-motif [consensus sequence: (S/I/V/L)xYxx(L/V)], whereas this is not found in mammalian Siglec-4.

Page 82: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

76

C) Potential sumoylation motifs in fish Siglec-4. Shown is an alignment of sequences within the cytoplasmic region of fish Siglec-4 (fugu, zebrafish and carp) encoded by exons 7 and 9 corresponding to the XL-tail. Identical residues are printed white on black and conserved residues on gray background. The two potential sumoylation motifs (SI and SII; consensus sequence: ψKxE with ψ for hydrophobic amino acid) are boxed in gray. Fig. 5. – Expression pattern of fish Siglec-4. Shown is a Northern blot analysis of total RNA isolated from different tissues of adult carp as described under Methods. 1, bulbus olfactorius; 2, lobus opticus; 3, lobus inferior; 4 rhombencephalon; 5, cerebellum; 6, spinal cord; 7, liver; 8, kidney; 9, heart; 10, muscle; 11, gills; 12, thymus; 13, pronephros; 14, spleen; 15, erythrocytes; 16, leukocytes; 17, ovaries. The positions of molecular weight markers (in kb) are indicated on the left. A) The membrane was probed with a cDNA of carp Siglec-4. B) The gel was stained with ethidium bromide to check equivalent loading between lanes. Fig. 6. – Fugu Siglec-4 as Sia-recognizing cell adhesion molecule. COS-7 cells transfected with cDNA encoding full length Siglec-4 were tested for Sia-dependent cell adhesion with human erythrocytes as described under Methods. A: transfected COS-7 cells were treated with sialidase before the incubation with untreated erythrocytes; B: transfected COS-cells without sialidase-treatment incubated with untreated erythrocytes; C: transfected COS-cells without sialidase-treatment incubated with sialidase-treated erythrocytes. Fig. 7. – Specificity of fugu Siglec-4. Binding and inhibition assays with Fc- chimeras of murine (white) or fugu (black) Siglec-4 were performed in GlycoWell plates as described under Methods. The data shown represent the results of three independent experiments. A) Binding assays were performed in untreated or periodate-treated wells in the absence or presence of 5 mM α2,3- or 5 mM α2,6-sialyllactose (SL) as indicated. B) Binding of Siglec-4 Fc-chimeras was determined in the presence of either α2,3-sialyllactose (triangles) or α2,6-sialyllactose (circles) at the concentrations indicated.

Page 83: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

77

3.10. Tables and figures

Table I: Databank sources for putative fish siglecs which are not orthologues of Siglec-4

Number genome Genomic location (bp) EnsEMBL Peptide/GenscanI Danio rerio ctg24161 GENSCAN00000035206 II Danio rerio chr. 4 (17767496 - 17815647) ENSDARP00000012915 III Danio rerio ctg64 (192177 - 196340) ENSDARP00000018174 IV Danio rerio NA54084 GENSCAN00000034794 V Danio rerio chr. 15 (18579818 to 18597058) GENSCAN00000010988 VI Takifugu rubripes scaffold_124 GENSCAN00000002174 VII Takifugu rubripes scaffold_124 (263826 to 264985) / VIII Takifugu rubripes scaffold_124 GENSCAN00000002172 IX Takifugu rubripes scaffold_170 GENSCAN00000029369

Fig. 1

Page 84: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

78

Fig. 2

Page 85: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

79

Fig.3

Fig. 4

Page 86: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Evolution of Siglecs

80

Fig. 5

Fig. 6

Fig. 7

Page 87: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Unpublished Data

81

4. Unpublished Data

4.1. Background Analysing the genomes of different deuterostomian species we were able to identify different sequences of putative STs. To verify that these sequences encode functional proteins and to investigate substrate and linkage specificity, two of them were chosen for cloning and expression. To be able to compare the characteristics of STs of different species, the STs showing highest sequence similarity to mammalian ST3Gal II from Ciona intestinalis and Takifugu rubripes were selected.

4.2. Materials and Methods Fugu brain RNA was obtained from MRC Gene Service. Tissue culture media were obtained from Invitrogen. Unless otherwise specified, all other reagents were purchased from Sigma. Vectors used for cloning are modified pcDNA 3.1 and pIB/V5 (Invitrogen, modified version kindly provided by Dr. Frank Dietz, University Bremen), both containing sequences encoding a signal peptide, a protein A IgG binding domain and a 6xHis peptide .

Figure VII Vectors used for subcloning truncated cDNA encoding a putative ST3Gal II from Ciona intestinalis and Takifugu rubripes.

Preparation of cDNA Total RNA was prepared from different tissues of Ciona intestinalis using TriFast reagent (Peqlab) following the manufacturers recommendations. Ciona and fugu brain cDNA were generated using RevertAid™ H Minus First Strand cDNA Synthesis Kit (Fermentas).

pIB/V5 Prot A 6xHis4351 bp

protein A

signal peptide

6xHis

OpIE2 promoter

EcoRVXbaI

pcDNA 3.1 Prot A 6xHis6319 bp

protein A

signal peptide

6xHis

CMV promoter

EcoRVXbaI pIB/V5 Prot A 6xHis

4351 bp

protein A

signal peptide

6xHis

OpIE2 promoter

EcoRVXbaI

pIB/V5 Prot A 6xHis4351 bp

protein A

signal peptide

6xHis

OpIE2 promoter

EcoRVXbaI

pcDNA 3.1 Prot A 6xHis6319 bp

protein A

signal peptide

6xHis

CMV promoter

EcoRVXbaI

pcDNA 3.1 Prot A 6xHis6319 bp

protein A

signal peptide

6xHis

CMV promoter

EcoRVXbaI

Page 88: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Unpublished Data

82

3’-RACE-PCR To obtain 3´- ends of the cDNA nested RACE-PCR was performed using the GeneRacer™

Kit (Invitrogen) and gene specific primer 5´- CGGACATGATTTTGTCATTCG-3´ (PCR 1) and 5’- CCTACCTATAACTATG-3’ (PCR2) for Ciona intestinalis as well as 5´- GCTGTTGTGGGCAACTCTGGGAACC-3´ (PCR 1 and 2) for Takifugu rubripes.

Molecular cloning of a truncated form of putative ST3Gal II cDNA To obtain the cDNA of Ciona and fugu putative ST3Gal II for subcloning, nested PCR was perfomed using the primer sets 5´- CAACTTGTAAATTAAGTCGTGTGATTGC-3´ and 5´- GAGTAGGGCTAATTGTTTGGAAACTATG-3´ (PCR1) as well as 5´- GCGATATCC AAAAACAAAGCCGGCCAAG-3´ and 5´- CGTCTAGACTATGATGATTGTTCCGAC CTGTG-3´ (PCR2) for Ciona and primer sets 5´- CTGGAGGCGCCGGGGTGGGA -3´ and 5´-GCTCTAGATTACTTCCCAGGGAAC ACGGTGAT -3´ ( PCR1) as well as 5´- GCGATATCCCGCGTGAAGCTGGTGCCCAG -3´ and 5´-GCTCTAGATTACTTCCCA GGGAACACGGT GAT -3´ (PCR2) for fugu. The above PCR was performed using cDNA generated from heart and brain RNA of Ciona and fugu respectively as template under the following conditions: 94 °C for 2 min, 25 cycles of 94 °C for 30 s, 55 °C for 45 s, 72 °C for 90 s, and 72 °C for 2 min. The resulting fragments were digested with EcoRV and XbaI, and ligated into EcoRv/XbaI cut pcDNA 3.1 vector for expression in eukaryotic cells and pIB/V5 vector (both described above) for expression in insect cells and sequenced.

Expression of put. ST3Gal II/protein A-constructs Fugu and Ciona putative ST3Gal II/ pcDNA 3.1 constructs were transfected using ExGen 500 in vitro transfection reagent (Fermentas) into COS-7 cells. Fugu and Ciona putative ST3Gal II/ pIB/V5 constructs were transfected using Cellfectin® Reagent (Invitrogen) into S2 cells following the manufacturers recommendations. Culture supernatants were collected, and the protein constructs were purified on IgG-agarose (Sigma). Proteins were detected by Western Blot analysis using an anti-His-tag antibody (Invitrogen) and an ECL kit (Amersham Pharmacia).

Page 89: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Unpublished Data

83

4.3. Results and Discussion

3’ RACE-PCR From the in silico analysis, most parts of a mRNAs coding for fugu and Ciona putative ST3Gal II could be predicted. However, RACE experiments were performed using fugu brain and Ciona heart RNA as template and sequenced in order to confirm the predicted exons and to verify the 3’-end, since the C-terminus can be extremely critical for proper enzyme function (Gerardy-Schahn, personal communication). In case of the putative ST3Gal II of Ciona the RACE-PCR confirmed the predicted sequences, whereas for fugu three additional basepairs were identified resulting in an additional amino acid at the C-terminus of the correponding protein (see alignment in Fig. IX).

Expression of put. ST3Gal II/protein A-constructs The soluble ST/ protein A-constructs were expressed transiently in mammalian cells (COS-7) to ensure correct glycosylation which seems to be neccessary for full enzyme activity (Kelm, personal communication). In addition, enzyme production were carried out using an insect cell line (S2). Although proteins produced in insect cells are glycosylated differently, this method has the advantage that the cells are grown at 25°C. Since both species, Ciona and fugu, usually do not experience 37°C, the proteins originated from these cold-water species might be more heat-sensitive than mammalian enzymes.

Figure VIII Protein A-chimeras of fugu and Ciona putative ST3Gal II were prepared and analyzed by SDS-PAGE followed by Western blot analysis using an anti –His-tag antibody as described under Methods. Lane 1: putative fugu ST3Gal II produced in S2 cells; lane 2: putative fugu ST3Gal II produced in COS-7 cells; lane 3: putative Ciona ST3Gal II produced in S2 cells; lane 4: putative Ciona ST3Gal II produced in COS-7 cells. Positions of molecular weight markers (in kDa) are indicated on the left.

The soluble proteins were purified using IgG-sepharose and analyzed using an anti-His-tag antibody. As shown in Fig. VIII both putative ST3Gal II/prot. A-chimeras are procuced in both cell lines exhibiting the expected molecular weight of 70kDA and 66kDA for the fugu and Ciona constructs, respectively. The availability of these proteins will allow to test these proteins for enzyme activity and investigate substrate and linkage specificity where appropriate.

1 8 0

2 4

3 5

4 85 4

7 3

1 0 0

1 3 01 8 0

2 4

3 5

4 85 4

7 3

1 0 0

1 3 0

Page 90: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Unpublished Data

84

Figure IX Amino acid alignment of mammalian ST3Gal II with putative orthologues from Ciona intestinalis and Takifugu rubripes. Sequences have been aligned using VectorNTI based on the ClustalW algorithm. Transmembrane domain and sialylmotifs are indicated. Additional information optained by RACE-PCR is circled.

: : : :

---MKCSLRVWFLSVAFLLVFIMSLLFTYSHHSMATLPYLDSGALDGTHRVKLVPGYAGLQRLSKERLSG-------MKCSLRVWFLSMAFLLVFIMSLLFTYSHHSMATLPYLDSGALGGTHRVKLVPGYSGLQRLGKEGLLG--------RCSCRVWVFLVSLTLVFLTSLFFSVSLRGGVGFNYLEAPGWEESRRVKLVPSYAGARRPSAAASSQQ---MLINFKLSRVIAMLLVVAIFLTYSWLLLWSTKTALQTNRKNKAGQDEVPVINVIKEDSYVQQKTQNLNKGKRFD

: : : :

-------------KSCACRRCMGDAGASDWFDSHFDGNISPVWTRENMDLPPDVQRWWMMLQPQFKSHNTNEVL-------------RNCACSRCMGDASTSEWFDSHFDGNISPVWTRDNMNLPPDVQRWWMMLQPQFKSHNTNEVL-------------RTCECPRCVGDPGVSDWFDENYDPDVSPVWTRDNGQLPPDVYYWWVMLQPQFKPHSIQQVLLGRVNHSHPREEIQQNNKCGHQLDASQTRWFRARFNPEIEPVWTQSALEIDYLVYDWWLSLQSSEAENLDKTFE

: : : :

EKLFQIVPGENPYR---FRDPHQCRRCAVVGNSGNLRGSGYGQDVDGHNFIMRMNQAPTVGFEQDVGSRTTHHFEKLFQIVPGENPYR---FRDPQQCRRCAVVGNSGNLRGSGYGQEVDSHNFIMRMNQAPTVGFEKDVGSRTTHHFLKLFQVVPGRSPYG---SWDPGRCLRCAVVGNSGNLRGAGYGATIDGHNYIMRINLAPTVGFEEDAGSHTTHHFALYKEGVPRKDPFARLTHDREAGCRSCAVVGNSGNILNSNYGNVIDGHDFVIRMNKGPTYNYENDVGSKTTHRF

: : : :

MYPESAKN-LPANVSFVLVPFKVLDLLWIASALSTGQIRFTYAPVKSFLRVDKEKVQIYNPAFFKYIHDRWTEHMYPESAKN-LPANVSFVLVPFKALDLMWIASALSTGQIRFTYAPVKSFLRVDKEKVQIYNPAFFKYIHDRWTEHMYPESAKN-LAANVSFVLVPFKTLDLVWITSALSTGQIRFTYAPVKQFLRVDKDKVQIFNPAFFKYVHDRWTRHMYPTTAASSLPQGVSLVLVPFQPLDIKWLLSALTTGEITRTYQPLVRRVTCDKSKITIISPTFIRYVHDRWTQH

: : : :

HGRYPSTGMLVLFFALHVCDEVNVYGFGADSRGNWHHYWENN--RYAGEFRKTGVHDADFEAHIIDMLAKASKIHGRYPSTGMLVLFFALHVCDEVNVYGFGADSRGNWHRHWENN--RYAGEFRKTGVHDADFEAHIIDMLAKASKIHGRYPSTGMLVLFFALHVCDEVNVFGFGADGRGNWHHYWEQN--RYSGEFRKTGVHDADYEAQIIQRLAKAGKIHGRYPSTGLLALIYALHECDEVDVYGFGANRAGNWHHYWEDLPPHVAGAFRKTGVHDSAQENEIIDQLHIHGLL

: : : :

EVYRGN---EVYRGN---TVFPGK---RVHRSEQSS

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

sialylmotif 1

sialylmotif 2 sialylmotif 3 sialylmotif 4Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

transmembrane region : : : :

---MKCSLRVWFLSVAFLLVFIMSLLFTYSHHSMATLPYLDSGALDGTHRVKLVPGYAGLQRLSKERLSG-------MKCSLRVWFLSMAFLLVFIMSLLFTYSHHSMATLPYLDSGALGGTHRVKLVPGYSGLQRLGKEGLLG--------RCSCRVWVFLVSLTLVFLTSLFFSVSLRGGVGFNYLEAPGWEESRRVKLVPSYAGARRPSAAASSQQ---MLINFKLSRVIAMLLVVAIFLTYSWLLLWSTKTALQTNRKNKAGQDEVPVINVIKEDSYVQQKTQNLNKGKRFD

: : : :

-------------KSCACRRCMGDAGASDWFDSHFDGNISPVWTRENMDLPPDVQRWWMMLQPQFKSHNTNEVL-------------RNCACSRCMGDASTSEWFDSHFDGNISPVWTRDNMNLPPDVQRWWMMLQPQFKSHNTNEVL-------------RTCECPRCVGDPGVSDWFDENYDPDVSPVWTRDNGQLPPDVYYWWVMLQPQFKPHSIQQVLLGRVNHSHPREEIQQNNKCGHQLDASQTRWFRARFNPEIEPVWTQSALEIDYLVYDWWLSLQSSEAENLDKTFE

: : : :

EKLFQIVPGENPYR---FRDPHQCRRCAVVGNSGNLRGSGYGQDVDGHNFIMRMNQAPTVGFEQDVGSRTTHHFEKLFQIVPGENPYR---FRDPQQCRRCAVVGNSGNLRGSGYGQEVDSHNFIMRMNQAPTVGFEKDVGSRTTHHFLKLFQVVPGRSPYG---SWDPGRCLRCAVVGNSGNLRGAGYGATIDGHNYIMRINLAPTVGFEEDAGSHTTHHFALYKEGVPRKDPFARLTHDREAGCRSCAVVGNSGNILNSNYGNVIDGHDFVIRMNKGPTYNYENDVGSKTTHRF

: : : :

MYPESAKN-LPANVSFVLVPFKVLDLLWIASALSTGQIRFTYAPVKSFLRVDKEKVQIYNPAFFKYIHDRWTEHMYPESAKN-LPANVSFVLVPFKALDLMWIASALSTGQIRFTYAPVKSFLRVDKEKVQIYNPAFFKYIHDRWTEHMYPESAKN-LAANVSFVLVPFKTLDLVWITSALSTGQIRFTYAPVKQFLRVDKDKVQIFNPAFFKYVHDRWTRHMYPTTAASSLPQGVSLVLVPFQPLDIKWLLSALTTGEITRTYQPLVRRVTCDKSKITIISPTFIRYVHDRWTQH

: : : :

HGRYPSTGMLVLFFALHVCDEVNVYGFGADSRGNWHHYWENN--RYAGEFRKTGVHDADFEAHIIDMLAKASKIHGRYPSTGMLVLFFALHVCDEVNVYGFGADSRGNWHRHWENN--RYAGEFRKTGVHDADFEAHIIDMLAKASKIHGRYPSTGMLVLFFALHVCDEVNVFGFGADGRGNWHHYWEQN--RYSGEFRKTGVHDADYEAQIIQRLAKAGKIHGRYPSTGLLALIYALHECDEVDVYGFGANRAGNWHHYWEDLPPHVAGAFRKTGVHDSAQENEIIDQLHIHGLL

: : : :

EVYRGN---EVYRGN---TVFPGK---RVHRSEQSS

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

sialylmotif 1

sialylmotif 2 sialylmotif 3 sialylmotif 4Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

Homo sapiensMus musculusTakifugu rubripesCiona intestinalis

transmembrane region

Page 91: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

85

5. Conclusions

Sialic acids (Sias) are negatively charged acidic sugars, that usually occur at the terminal ends of the carbohydrate groups of glycoproteins and glycolipids. Because of their negative charge and their wide occurrence at exposed positions on cell-surface molecules, Sias function as key determinants of oligosaccharide structures that mediate a variety of biological phenomena, such as cell-cell communication, cell-substrate interaction, adhesion and protein targeting (Schauer and Kamerling 1997). Sias are underlying a complex biosynthetic mechanism involving a large number of proteins being crucial for biosynthesis, activation, transfer, modification and catabolism of these sugars (Traving and Schauer 1998). This raises the question how such a complex metabolic apparatus could have been developed during evolution. Although Sias are widely distributed among all deuterostomes, information about the enzymes required for the biosynthesis of Sias as well as the ligands recognizing them, is almost exclusively available for relatively recent vertebrate lineages such as bird and mammals (Angata and Varki 2002). To understand the evolution of sialylation on a deeper timescale, it was of particular interest to evaluate branches which diverged early in deuterostomian evolution. The increasingly available genomic sequence information allowed us to search for putative genes related to the biosynthetic machinery of Sias in various deuterostomian species. Because the genomic DNA contains the genes even if they are under strict spatio-temporal regulation, such an approach should be even more sensitive than biochemical analyses. Furthermore, access to sequence information enables to analyze the phylogenetic relationship of these genes. Although such predictions have to be confirmed by biochemical and/or genetic analysis, these screenings have provided first important insights into the evolution involved in Sia biosynthesis. In mammals, about 20 different STs are known. Each of these is highly specific for the linkage formed but more promiscuous with respect to the acceptor substrate and several STs have overlapping glycan specificities (Harduin-Lepers et al. 2001; Jeanneau et al. 2004; Takashima et al. 2002a; Takashima et al. 2002b). Therefore, from an evolutionary point of view, STs are interesting mainly in three aspects. (1) These enzymes are suitable to predict the presence and composition of sialylated structures in a particular species or phylum. (2) The high diversity and partial redundancy in specificity of these enzymes permits an evolutionary flexibility allowing the development of new lineages, which in turn leads to new structures with potentially new functions. (3) The presence of highly conserved sequence motifs enables a fairly direct identification of putative family members, whereas the otherwise rather diverse sequences allow clear assignments of genes to specific lineages even across long phylogenetic distances. In general, these three aspects together provide interesting insight in the

Page 92: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

86

development of a complex system of secondary gene products (i.e. sialylation) based on molecular evolution on the genomic level. In a first step we investigated two teleost genomes (Danio rerio and Takifugu rubripes) with regard to the proteins involved in Sia metabolism. Database searches revealed putative orthologues of almost all known mammalian sialytransferases and other key proteins of the Sia apparatus. In a second step the database search was extended to the available sequences of genomes which diverged earlier in deuterostomian evolution, e.g. lancelets (Branchiostoma floridae), tunicates (Ciona intestinalis) and echinoderms (Strongylocentrotus purpuratus). Although the sequencing of these genomes have not been completed yet, putative orthologues of several STs could be identified. All of these putative ST orthologues show a type II transmembrane architecture, and exhibit the conserved peptide region referred to as sialylmotifs (Jeanneau et al. 2004). The high sequence similarities between mammalian STs and putative fish orthologues including the amino acids known to be crucial for normal enzyme function (Jeanneau et al. 2004) strongly indicates that the identified sequences represent functional proteins. The high conservation of the genomic organization between fish and mammalian genes further supports the assumption, that these newly identified sequences represent real orthologues of the known mammalian STs. The identification of additional putative ST sequences particular in the zebrafish is consistent with previous observations from other studies on multigene families and seems to be the result of a whole genome duplication occurred in the teleost lineage (Barbazuk et al. 2000; Christoffels et al. 2004). However, the existence and high conservation of all key enzymes involved in sialylation indicates the presence of a complex sialylation apparatus in fish comparable to that of mammalian species. To get further insights into the evolution of this enzyme family, a phylogenetic tree was reconstructed distributing the putative fish STs together with their mouse orthologues onto 5 highly supported branches in accordance with substrate specificity and genomic organization. Notably, the well established classification based on substrate specificity of the STs into 4 families (Harduin-Lepers et al. 2001) is resembled in all cases except for the ST6GalNAc family which is represented by two well separated branches (subgroup A consisting of ST6GalNAc I and II and subgroup B consisting of ST6GalNAc III to VI). This observation implicates that the two groups of N-Acetylgalactosamine α2,6-sialyltranferases developed independently. Furthermore, the phylogenetic analysis revealed the splitting of the investigated STs into two monophyletic groups with one lineage consisting of all galactose α2,3-STs and the ST6GalNAc subgroup A and the other branch consisting of all galactose α2,6-sialyltransferases, sialyl α2,8-sialyltransferases and the ST6GalNAc subgroup B. The grouping of the STs into these five lineages implies that these originated from an ancestral enzyme gene by duplication, subsequently acquired distinct functions, and then became noninterchangeable. Interestingly, from the relationship between the ST subfamilies it can be

Page 93: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

87

deduced that the evolution of sialyltransferases was driven to a more restricted specificity probably by an expansion of the acceptor recognition site including other parts of the glycan. However, the timing of such gene duplication events is difficult to estimate, especially considering the controversial discussion about the phylogeny of animals and in particular of protostomes. The traditional view of an acoel-like ancestor, progressively aquiring a coelome, differentiated internal organs, segments, etc. has been abandoned in favour of a newly developed phylogeny, taken aspects of molecular evolution into account. Recent phylogenies, reconstructed based on a comparison of 18S ribosomal RNA suggest a three-branched tree for the bilateria comprising the deuterostomes and two separate protostome clades, the lophotrochozans and the ecdyzozoans (see Fig. X) (Adoutte et al. 1999). For considerations about the origin of sialylation the phylogenetic composition of the galactose α2,6-sialyltransferase branch is of particular importance representing the only lineage containing a ST from a protostomian species. Interestingly, the phylogeny of the galactose α2,6-sialyltransferase branch is in accordance with the organismal phylogeny clearly demonstrating that the splitting into STs of different specificity took place earlier in evolution than the divergence of the arthropods. Thus, according to the recently established metazoan phylogeny the diversification of the ST family must have happened prior to the protostomian divergence into lophotrochozans and ecdyzozoans.

Figure X Metazoan phylogenies. (A) The traditional phylogeny based on morphology and embryology. (B) The new molecule-based phylogeny (Adoutte et al. 2000).

Page 94: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

88

This point in evolution reflects the appearance of increasingly more complex organisms with new structures and functions. In this context, it is tempting to speculate that this development was facilitated by the potential to synthesize more complex glycans, for example those produced by STs. Beside the endogenous factors involved in the evolution of glycans, exogenous selection pressures mediated by viral and microbial pathogens and parasites that recognize glycans, probably have played a prominent role in diversification particularly of the outermost regions of the glycans. It is known that not only deuterostomian express Sias, but also some of their pathogens, generating a kind of moelcular mimicry (Angata and Varki 2002). These pathogens show a markedly reduced immunogenicity within the vertebrate organism protecting them from destruction by host complement. Therefore it is reasonable to assume, that at least some variations in sialylation may have arisen from this selection pressure. One could hypothesize, that evolution developed different strategies for multicellular organisms to evade their pathogens: First, modifications of Sia like acetylation and lactylation, as found in vertebrates and particularly in echinoderms could help to evade pathogens. Second, the reduction of Sias on cell surfaces could have been advantageous, preventing molecular mimicry as well as reducing pathogen binding sites, which could be the case for arthropods and nematodes. Arthropods do not or only marginally express Sias, although the (completely sequenced) genome of Drosophila melanogaster appears to contain the genes critical for Sia synthesis including one galactose α2,6-sialyltransferase. Maybe the need to avert pathogens has produced a selective pressure towards the reduction of the other STs. The functional role of the product of this galactose α2,6-sialyltransferase in Drosophila awaits clarification. In contrast, nematodes do not show any evidence of Sia expression (Oriol et al. 1999). Thus, it is likely that the ancestors of the recent nematodes secondarily lost the genes involved in Sia biology. The loss of only one gene encoding an key enzyme of the Sia biosynthesis, e.g. the UPD-GlcNac-2-epimerase/ ManNAc kinase or CMP-Sia synthetase, would have led to the complete disappearance of the other genes related to Sia biosynthesis due to the lack of a appropriate substrate. Interestingly, it seems that for some unknown reason the nematode C. elegans have favored fucosylation instead of sialylation expressing about 18 different fucosyltranferase genes (Oriol et al. 1999). Since Sia and fucose are usually in competition for the same acceptors, the lack of all forms of Sias in C. elegans fits well with a large expression of different fucosyltransferase genes with the potential to generate complex glycosylation pattern. In contrast to Sias, which according to our phylogenetic analysis have been evolved in the beginning of bilateria, the largest family of Sia recognizing lectins found in mammals, the Siglecs, seem to be a relatively new invention in evolution. Investigations of the completed genomes from C. elegans and Drosophila melanogaster revealed no evidence of the existence of siglec-like sequences at least in these protostomian genomes (Angata and Varki 2002).

Page 95: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

89

Structurally, siglecs belong to the immunoglobulin superfamily (IgSF) (Crocker 2002) and it seems likely that siglecs evolved from a protein-binding Ig domain precursor that was adapted to Sia recognition. This suggests, that the Sia-binding features of siglecs evolved after the emergence of the prominent expression of Sias in deuterostomes. Until the start of this thesis the information available about the occurrence of siglecs in animal species was restricted to birds and mammals. About eleven different Siglecs have been identified in the human genome and for one of them, the myelin-associated glycoprotein (MAG), an orthologue in birds called Schwann cell membrane protein (SMP) has been described (Crocker 2002; Dulac et al. 1992). Because siglecs are thought to play important roles in a wide array of recognition and signaling events in the mammalian brain and immune system (Crocker 2002; Crocker and Varki 2001a; Crocker and Varki 2001b), the investigation of other deuterostomian species seemed to be a promising approach to get further information about the evolution and in particular the origin of this carbohydrate binding protein family. The availability of two almost completely sequenced fish genomes offered the possibility to investigate species representing the most distant vertebrates sharing with mammals the two highly complex structures known to express siglecs, the nervous system with myelinated axons, and the immune system. The analysis of the genomes of Takifugu rubripes and Danio rerio revealed the existence of putative Siglec sequences in both fish species. Based on the degree of sequence identity the encoded proteins were clearly identified as fish orthologues of mammalian Siglec-4/MAG, the only Siglec expressed in the nervous system. In mammals Siglec-4/MAG is expressed exclusively by myelinating glia cells in the central and peripheral nervous system and has been proposed to regulate interactions between glia cells and neurons involved in events like myelination, axonal growth and signal transduction (Domeniconi et al. 2002; Schachner and Bartsch 2000; Vinson et al. 2003; Wong et al. 2002). The sequences identified in the fish genomes exhibit all structural features described for Siglecs in general as well as those specific for Siglec-4 and show the same genomic organisation as described for mammalian Siglec-4. To identify these putative orthologues as authentic siglecs, binding studies have been performed providing direct evidence that these proteins are true fish orthologues of mammalian Siglec-4/MAG by showing specificity for α2,3-linked Sia. As demonstrated by Northern blot analysis, the expression of fish Siglec-4 appears to be restricted to the nervous system. This is in good agreement with the tissue distribution described for mammalian Siglec-4/MAG. However, to clarify with certainty, whether also in fish the expression is restricted to myelinating cells, additional histochemical experiments with suitable antibodies will be necessary. Interestingly, unlike the highly conserved exctracellular region involved in Sia-recognition, the cytoplasmic tail seems to be implicated in functions different from those described for mammals. In contrast to the two splice variants found for mammalian Siglec-4 (Fujita et al. 1989; Heape et al. 1999; Ishiguro et al. 1991), in fish alternative splicing leads to three

Page 96: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

90

isoforms with different cytoplasmic tails with two of them containing potential phosphorylation and sumoylation sites. Notably, two of the three potential tyrosine-based phosphorylation sites are found in the context of ITIMs (immunoreceptor tyrosine-based inhibition motif), a characteristic feature of the Siglecs found in the immune system (Crocker and Varki 2001a; Crocker and Varki 2001b), but absent in mammalian and avian representatives of Siglec-4. The existence of ITIMs in fish Siglec-4 indicates, that ITIMs are an ancestral feature of siglecs, which persisted in most siglecs, but was lost later during mammalian and avian Siglec-4 evolution. This rises the question whether Siglec-4 represents the common ancestor of these immune system-related siglecs. At first sight, this seems to be quite likely. However, inconsistent with this hypothesis is the observation that all known Siglec-4 V-set domains lack the salt bridge typical for V-set Ig-domains, distinguishing Siglec-4 representatives from all other siglecs (Clements et al. 2003). This suggests the existence of another ancestral gene besides Siglec-4 which led to the evolution of the other members of the siglec family. Notably, in addition to Siglec-4 some other sequences were identified showing only low degrees of sequence similarity to mammalian siglecs but containing the main characteristic features of siglecs, such as the unusual distribution of cysteins and the typical amino acids found in the binding site (Kelm 2001). The additional putative fish siglecs mentioned above could not be assigned as relatives to any of the known mammalian siglecs due to the low sequence similarity to these proteins. Even the degree of similarity between fugu and zebrafish genes was too low to allow assignments. Therefore, functional studies of these putative fish siglecs as well as genome analyses of amphibians, reptiles and birds will be necessary to perform a conclusive phylogenetic analysis and to follow the siglec evolution from bony fishes to mammals. However, this study has provided strong evidence that most of the siglec representatives known to date are relatively new inventions in Sia evolution. This is further illustrated by the expanded repertoires of CD33-related Siglecs in humans, compared with mice (Crocker 2002). Contrary to the high divergence of the CD33-related siglecs, the high conservation of Siglec-4 from fish to mammals indicates a high evolutive pressure for this protein. Although tissue distribution of fish Siglec-4 is in good agreement to that of mammalian species, histochemical data will be necessary to clarify with certainty, whether also in fish expression of Siglec-4 is restricted to myelinating cells. In addition the analysis of cartilliganeous fish (Chondrichthyes, e.g. sharks and rays ) with fully developed myelin sheats in CNS and PNS in contrast to the non-myelinated jawless fish (Agnatha, e.g. lamprey and hagfish) will lead to a better understanding of the evolution of Siglec-4 and its function.

Page 97: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

91

References

Adoutte, A., Balavoine, G., Lartillot, N., and de Rosa, R. (1999) Animal evolution. The end of the intermediate taxa? Trends Genet., 15, 104-108.

Adoutte, A., Balavoine, G., Lartillot, N., Lespinet, O., Prud'homme, B., and de Rosa, R. (2000) The new animal phylogeny: reliability and implications. Proc.Natl.Acad.Sci.U.S.A, 97, 4453-4456.

Angata, T. and Varki, A. (2002) Chemical diversity in the sialic acids and related alpha-keto acids: an evolutionary perspective. Chem.Rev., 102, 439-469.

Barbazuk, W. B., Korf, I., Kadavi, C., Heyen, J., Tate, S., Wun, E., Bedell, J. A., McPherson, J. D., and Johnson, S. L. (2000) The Syntenic Relationship of the Zebrafish and Human Genomes. Genome Res., 10, 1351-

Christoffels, A., Koh, E. G., Chia, J. M., Brenner, S., Aparicio, S., and Venkatesh, B. (2004) Fugu Genome Analysis Provides Evidence for a Whole-Genome Duplication Early During the Evolution of Ray-Finned Fishes. Mol Biol.Evol.,

Clements, C. S., Reid, H. H., Beddoe, T., Tynan, F. E., Perugini, M. A., Johns, T. G., Bernard, C. C., and Rossjohn, J. (2003) The crystal structure of myelin oligodendrocyte glycoprotein, a key autoantigen in multiple sclerosis. Proc.Natl.Acad.Sci.U.S.A, 100, 11059-11064.

Crocker, P. R. (2002) Siglecs: sialic-acid-binding immunoglobulin-like lectins in cell-cell interactions and signalling. Curr.Opin.Struct.Biol., 12, 609-615.

Crocker, P. R. and Varki, A. (2001a) Siglecs in the immune system. Immunology, 103, 137-145.

Crocker, P. R. and Varki, A. (2001b) Siglecs, sialic acids and innate immunity. Trends Immunol, 22, 337-342.

Domeniconi, M., Cao, Z., Spencer, T., Sivasankaran, R., Wang, K., Nikulina, E., Kimura, N., Cai, H., Deng, K., Gao, Y., He, Z., and Filbin, M. (2002) Myelin-associated glycoprotein interacts with the Nogo66 receptor to inhibit neurite outgrowth. Neuron, 35, 283-290.

Dulac, C., Tropak, M. B., Cameron-Curry, P., Rossier, J., Marshak, D. R., Roder, J., and Le Douarin, N. M. (1992) Molecular characterization of the Schwann cell myelin protein, SMP: structural similarities within the immunoglobulin superfamily. Neuron, 8, 323-334.

Page 98: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

92

Fujita, N., Sato, S., Kurihara, T., Kuwano, R., Sakimura, K., Inuzuka, T., Takahashi, Y., and Miyatake, T. (1989) cDNA cloning of mouse myelin-associated glycoprotein: a novel alternative splicing pattern. Biochem.Biophys.Res.Commun., 165, 1162-1169.

Harduin-Lepers, A., Vallejo-Ruiz, V., Krzewinski-Recchi, M. A., Samyn-Petit, B., Julien, S., and Delannoy, P. (2001) The human sialyltransferase family. Biochimie, 83, 727-737.

Heape, A. M., Lehto, V. P., and Kursula, P. (1999) The expression of recombinant large myelin-associated glycoprotein cytoplasmic domain and the purification of native myelin-associated glycoprotein from rat brain and peripheral nerve. Protein Expr Purif, 15, 349-361.

Ishiguro, H., Sato, S., Fujita, N., Inuzuka, T., Nakano, R., and Miyatake, T. (1991) Immunohistochemical localization of myelin-associated glycoprotein isoforms during the development in the mouse brain. Brain Res., 563, 288-292.

Jeanneau, C., Chazalet, V., Auge, C., Soumpasis, D. M., Harduin-Lepers, A., Delannoy, P., Imberty, A., and Breton, C. (2004) Structure-function analysis of the human sialyltransferase ST3Gal I- Role of N-glycosylation and a novel conserved sialylmotif. J Biol.Chem.,

Kelm, S. (2001) Ligands for Siglecs. In Crocker, P. R.(ed), Mammalian Carbohydrate Recognition Systems. Springer, Berlin, pp. 153-176.

Oriol, R., Mollicone, R., Cailleau, A., Balanzino, L., and Breton, C. (1999) Divergent evolution of fucosyltransferase genes from vertebrates, invertebrates, and bacteria. Glycobiology, 9, 323-334.

Schachner, M. and Bartsch, U. (2000) Multiple functions of the myelin-associated glycoprotein MAG (siglec- 4a) in formation and maintenance of myelin. Glia, 29, 154-165.

Schauer, R. and Kamerling, J. P. (1997) Chemistry, biochemistry and biology of sialic acids. In Montreuil, J., Vliegenthart, J. F. G., and Schachter, H.(eds), Glycoproteins II. Elsevier, Amsterdam, pp. 243-402.

Takashima, S., Tsuji, S., and Tsujimoto, M. (2002a) Characterization of the second type of human beta-galactoside alpha 2,6-sialyltransferase (ST6Gal II), which sialylates Galbeta 1,4GlcNAc structures on oligosaccharides preferentially. Genomic analysis of human sialyltransferase genes. J Biol.Chem, 277, 45719-45728.

Takashima, S., Ishida, H. k., Inazu, T., Ando, T., Ishida, H., Kiso, M., Tsuji, S., and Tsujimoto, M. (2002b) Molecular Cloning and Expression of a Sixth Type of alpha 2,8-Sialyltransferase (ST8Sia VI) That Sialylates O-Glycans. J.Biol.Chem., 277, 24030-24038.

Page 99: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Conclusions

93

Traving, C. and Schauer, R. (1998) Structure, function and metabolism of sialic acids. Cell Mol.Life Sci., 54, 1330-1349.

Vinson, M., Rausch, O., Maycox, P. R., Prinjha, R. K., Chapman, D., Morrow, R., Harper, A. J., Dingwall, C., Walsh, F. S., Burbidge, S. A., and Riddell, D. R. (2003) Lipid rafts mediate the interaction between myelin-associated glycoprotein (MAG) on myelin and MAG-receptors on neurons. Mol.Cell Neurosci., 22, 344-352.

Wong, S. T., Henley, J. R., Kanning, K. C., Huang, K. H., Bothwell, M., and Poo, M. M. (2002) A p75(NTR) and Nogo receptor complex mediates repulsive signaling by myelin-associated glycoprotein. Nat.Neurosci., 5, 1302-1308.

Page 100: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Summary

94

6. Summary Sialic acids are key determinants of carbohydrate structures that play important roles in a variety of biological functions, like cell-cell communication, cell-substrate interaction, adhesion, and signal transduction. Though sialic acids seem to be found at least in all species of the deuterostome lineage, proteins involved in metabolism and recognition of these sugar determinants are mainly described for relatively recent vertebrate lineages such as birds and mammals. We investigated different genomes of deuterostomian species with emphasis on the proteins involved in the biosynthesis and recognition of sialic acids in order to obtain a better understanding how sialylation developed during evolution. Analysis of two completely sequenced teleost genomes (Takifugu rubripes and Danio rerio) revealed the existence of all key enzymes crucial for sialic acid metabolism including putative orthologues for almost all 20 sialyltransferases known for mammalian species. In addition, putative ST genes were identified in the partially sequenced genomes of Branchiostoma floridae, Ciona intestinalis and the sea urchin Strongylocentrotus purpuratus. A thorough phylogenetic analysis of all ST sequences has provided strong evidence that the development of the major ST branches occurred well before the diversification into protostomes and deuterostomes. These results strongly suggests that the defining characteristics of STs were already present in the ancestor of all bilaterians and have not changed much in 1000 million years. Interestingly, from the relationship between the ST subfamilies it can be deduced that the evolution of the different sialyltransferase lineages was driven by the availability of appropriate acceptor substrates. In contrast to sialyltransferases, the largest family of sialic acid-recognizing lectins found in mammals, the siglecs, seems to be a relatively new invention in evolution. Analysis of the two teleost genomes (Takifugu rubripes and Danio rerio) revealed only one unambiguous orthologue of the known mammalian siglecs. The identified fish siglec shows highest sequence similarity to mammalian Siglec-4, the myelin-associated glycoprotein (MAG). Similar to mammalian Siglec-4/MAG, fish Siglec-4 binds to glycans containing α2,3-linked sialic acids and its expression appears to be restricted to nervous tissues. Low sequence similarity in the cytoplasmic tail, different splice variants and the existence of several motifs implicated in signal transduction suggests alternative biological functions compared to mammalian Siglec-4 for this part of the protein. In addition to Siglec-4, other putative siglec genes have been identified in both fish genomes. However, the low degree of similarity of these sequences to mammalian siglecs does not allow a reasonable phylogenetic analysis which would have provided information concerning a potential ancestor of the mammalian siglecs. However, the high conservation of Siglec-4 from fish to mammals emphasizes an indispensable role for this protein in all vertebrates in contrast to other Siglecs which maintained a high degree of structural and possibly functional flexibility during evolution.

Page 101: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Zusammenfassung

95

7. Zusammenfassung Als terminale Komponenten von Glykokonjugaten spielen die Sialinsäuren eine Schlüsselrolle in einer Vielzahl biologischer Funktionen wie z.B. Zell-Zell-Kommunikation, Zell-Substrat-Interaktionen, Adhäsions- und Signaltransduktionsprozessen. Obwohl Sialinsäuren in allen bisher untersuchten Spezies der Deuterostomia nachgewiesen werden konnten, wurden die Proteine, die an Biosynthese und Erkennung dieser Zuckerdeterminanten beteiligt sind, bislang nur für höhere Vertebraten wie Säugetiere und Vögel beschrieben. Um einen tieferen Einblick in die Evolution dieser Proteine zu gewinnen, wurden Genome verschiedener Deuterostomier untersucht. Die Analyse der vollständig sequenzierten Genome zweier Knochenfisch-Arten (Takifugu rubripes und Danio rerio) ergab putative Orthologe für sämtliche Schlüsselenzyme des Sialinsäure-Stoffwechsels einschließlich fast aller bekannter Sialyltransferasen. Auch in Genomen, deren Sequenzierung noch nicht abgeschlossen ist (Branchiostoma floridae, Ciona intestinalis, Strongylocentrotus purpuratus), konnten potentielle Sialyltransferase-Gene identifiziert werden. Die phylogenetische Untersuchung der Sialyltransferase-Sequenzen legt nahe, dass sich die Hauptzweige der Sialyltransferasen, deren phylogenetische Einteilung den jeweiligen Substratspezifitäten entspricht, weit vor der Aufspaltung der Bilateria in Deuterostomier und Protostomier entwickelt haben und wahrscheinlich schon im Vorläufer aller Bilateria vorhanden waren. Daraus lässt sich folgern, dass sich die charakteristischen Merkmale der Sialyltransferasen im Laufe der letzten Milliarde Jahre kaum verändert haben. Die phylogenetischen Beziehungen der einzelnen Zweige der Sialyltransferasefamilie untereinander deutet darauf hin, dass die Evolution der Sialyltransferasen maßgeblich von der Verfügbarkeit geeigneter Akzeptorsubstrate beeinflusst wurde. Im Gegensatz zu den Sialyltransferasen scheint die größte Familie der Sialinsäure-erkennenden Lektine, der Siglecs, eine entwicklungsgeschichtlich relativ junge Proteinfamilie zu sein. Die Analyse der Genome von Takifugu rubripes und Danio rerio ermöglichte nur in einem Fall die Zuordnung einer potentiellen Siglec-Sequenz zu einem bekanntem Säugetier-Siglec. Das identifizierte Fisch-Siglec zeigte höchste Sequenz-Ähnlichkeit zu Siglec-4, dem Myelin-assoziierten Glykoprotein. In Übereinstimmung mit seinem Ortholog aus dem Säugetier zeigte das Fisch-Siglec eine Bindungsspezifität für Glykane, die α2,3-verknüpfte Sialinsäuren tragen sowie eine Expression, die auf das Nervengewebe beschränkt zu sein scheint. Starke Sequenzunterschiede im zytoplasmatischen Bereich, unterschiedliche Spleiss-Varianten sowie das Auftreten von Motiven, die in Verbindung mit Signaltransduktions-Mechanismen diskutiert werden, lassen für diesen Teil des Proteins auf biologische Funktionen schließen, die sich von denen des Säugetier-Siglec-4 unterscheiden. Zusätzlich zu Siglec-4 konnten in beiden Fisch-Genomen weitere putative Siglec-Gene identifiziert werden. Die geringen Übereinstimmungen dieser Sequenzen mit denen der Säugetier-Siglecs erlaubte

Page 102: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Zusammenfassung

96

jedoch keine sinnvolle phylogenetische Analyse und damit auch keine Aussagen über den potentiellen Vorläufer der Säugetier-Siglecs. Die Tatsache, dass Siglec-4 das einzige bisher bekannte Siglec zu sein scheint, das zwischen Fisch und Säugetier hoch konserviert ist, deutet auf eine für den Organismus des Vertebraten unverzichtbare Funktion hin, die im Kontrast steht zur strukturellen und wahrscheinlich auch funktionellen Flexibilität der übrigen Siglecs.

Page 103: The evolutionary history of sialylation - Perspectives from fish genomeselib.suub.uni-bremen.de/diss/docs/E-Diss973_lehmann.pdf · The evolutionary history of sialylation - Perspectives

Acknowledgements

97

8. Acknowledgements Mein besonderer Dank gilt Herrn Prof. Dr. Sørge Kelm für die Übertragung des Themas, die Betreuung meiner Arbeit, die große Unterstützung bei der Veröffentlichung meiner Ergebnisse und nicht zuletzt seine unermüdliche Bereitschaft, mit mir über mögliche Evolutionswege der Sialinsäuren zu philosophieren (warum bloß hat Drosophila nur eine Sialyltransferase?). Frau Prof. Dr. R. Gerardy-Schahn danke ich herzlich für die Übernahme des Koreferats sowie ihre hohe zeitliche Flexibilität bei der Begutachtung dieser Arbeit. Herrn Dr. Thomas Hurek danke ich sehr für seine Einführung in die Methoden phylogenetischer Analyse, sein Interesse an der Evolution der Sialyltransferasen und die Bereitschaft, für kurze Zeit Bakterien gegen Kugelfische einzutauschen. Der gesamten AG Kelm möchte ich für Unterstützung und Rat in fachlichen Fragen und die nette Atmosphäre danken. Es hat viel Spaß gemacht, hier zu arbeiten.