Critical Reviews in Microbiology, 31:101–135, 2005
Copyright © Taylor & Francis Inc.
ISSN: 1040-841X print / 1549-7828 online
DOI: 10.1080/10408410590922393

Protein Signatures Distinctive of Alpha Proteobacteria and Its Subgroups and a Model for α-Proteobacterial Evolution

Radhey S. Gupta
Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada

Alpha (α) proteobacteria comprise a large and metabolically diverse group. No biochemical or molecular feature is presently known that can distinguish these bacteria from other groups. The evolutionary relationships within this group, which includes numerous pathogens and agriculturally important microbes, are also not understood. Shared conserved inserts and deletions (i.e., indels or signatures) in molecular sequences provide a powerful means for identification of different groups in clear terms, and for evolutionary studies (see www.bacterialphylogeny.com). This review describes, for the first time, a large number of conserved indels in broadly distributed proteins that are distinctive and unifying characteristics of either all α-proteobacteria, or many of its constituent subgroups (i.e., orders, families, etc.). These signatures were identified by systematic analyses of proteins found in the Rickettsia prowazekii (RP) genome. Conserved indels that are unique to α-proteobacteria are present in the following proteins: Cytochrome c oxidase assembly protein Ctag, PurC, DnaB, ATP synthase α-subunit, exonuclease VII, prolipoprotein phosphatidylglycerol transferase, RP-400, FtsK, pyruvate phosphate dikinase, cytochrome b, MutY, and homoserine dehydrogenase. The signatures in succinyl-CoA synthetase, cytochrome oxidase I, alanyl-tRNA synthetase, and MutS proteins are found in all α-proteobacteria except the Rickettsiales, indicating that this group diverged prior to the introduction of these signatures. A number of proteins contain conserved indels that are specific for Rickettsiales (XerD integrase and leucine aminopeptidase), Rickettsiaceae (Mfd, ribosomal protein L19, FtsZ, Sigma 70 and exonuclease VII), or Anaplasmataceae (Tgt and RP-314), and they distinguish these groups from all others. Signatures in DnaA, RP-057, and DNA ligase A are commonly shared by various Rhizobiales, Rhodobacterales, and Caulobacter, suggesting that these groups shared a common ancestor exclusive of other α-proteobacteria. A specific relationship between Rhodobacterales and Caulobacter is indicated by a large insert in the Asn-Gln amidotransferase. The Rhizobiales group of species is distinguished from others by a large insert in the Trp-tRNA synthetase. Signature sequences in a number of other proteins (viz. oxoglutarate dehydrogenase, succinyl-CoA synthase, LytB, DNA gyrase A, LepA, and Ser-tRNA synthetase) serve to distinguish the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae families from Bradyrhizobiaceae and Methylobacteriaceae. Based on the distribution patterns of these signatures, it is now possible to logically deduce a model for the branching order among α-proteobacteria, which is as follows: Rickettsiales → Rhodospirillales-Sphingomonadales → Rhodobacterales-Caulobacterales → Rhizobiales (Rhizobiaceae-Brucellaceae-Phyllobacteriaceae, and Bradyrhizobiaceae). The deduced branching order is also consistent with the topologies in the 16S rRNA and other phylogenetic trees. Signature sequences in a number of other proteins provide evidence that the α-proteobacteria are a late-branching taxon within Bacteria, which branched after the δ,ε-subdivisions but prior to the β,γ-proteobacteria. The shared presence of many of these signatures in the mitochondrial (eukaryotic) homologs also provides evidence of the α-proteobacterial ancestry of mitochondria.

Keywords Bacterial Phylogeny; Alpha Proteobacteria Trees; Protein Signatures; Rickettsiales; Rhodobacterales; Branching Order; Mitochondrial Origin; Rickettsia prowazekii; Rhizobiales

Received 20 December 2004; accepted 8 December 2005.
Address correspondence to Radhey S. Gupta, Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada L8N 3Z5. E-mail: [email protected]

INTRODUCTION

The alpha (α) proteobacteria comprise an important group within Bacteria, which has contributed seminally to many aspects of the history of life (Margulis 1970; Kersters et al. 2003). It is now established that mitochondria, which enable eukaryotic cells to produce energy via oxidative phosphorylation, are the result of endosymbiotic capture of an α-proteobacterium by the primitive eukaryotic cell (Margulis 1970; Falah & Gupta 1994; Viale & Arakaki 1994; Andersson et al. 1998; Gray et al. 1999; Karlin & Brocchieri 2000; Emelyanov 2001a; Esser et al. 2004). There is also strong evidence indicating that the ancestral eukaryotic cell itself may have originated via a fusion event, or long-term symbiotic association, between one or more α-proteobacteria and an archaebacterium (or Archaea) (Gupta et al. 1994; Lake & Rivera 1994; Gupta & Golding 1996; Margulis 1996; Gupta 1998; Martin & Muller 1998; Ribeiro & Golding 1998; Andersson et al. 1998; Karlin et al. 1999; Lang et al. 1999; Kurland & Andersson 2000; Emelyanov 2001a, 2003b). The symbiosis between α-proteobacteria (viz. Rhizobiaceae species) and plant root nodules plays a central role in the fixation of atmospheric nitrogen by plants (Sadowsky & Graham 2000; Van Sluys et al. 2002; Kersters et al. 2003; Sawada et al. 2003). Additionally, many α-proteobacterial species (viz. Rickettsiales, Brucella, Bartonella) are adapted to an intracellular lifestyle and are major human and animal pathogens (Moreno & Moriyon 2001; Kersters et al. 2003; Yu & Walker 2003).

The α-proteobacteria exhibit enormous diversity in terms of their morphological and metabolic characteristics and they include numerous phototrophs, chemolithotrophs and chemoorganotrophs (Stackebrandt et al. 1988; De Ley 1992; Kersters et al. 2003). This group also harbors all known aerobic photoheterotrophic bacteria, which contain bacteriochlorophyll a, but are unable to grow photosynthetically under anaerobic conditions (Yurkov & Beatty 1998). These bacteria are abundant in the upper layers of oceans (Kolber et al. 2001). The α-proteobacterial species are presently recognized on the basis of their branching pattern in the 16S rRNA trees, where they form a distinct clade within the proteobacterial phylum (Woese et al. 1984; Stackebrandt et al. 1988; Olsen et al. 1994; Gupta 2000; Kersters et al. 2003). This group has been given the rank of a Class or subdivision within the Proteobacteria phylum (Stackebrandt et al. 1988; Murray et al. 1990; De Ley 1992; Stackebrandt 2000; Ludwig & Klenk 2001; Garrity & Holt 2001; Kersters et al. 2003). Other than their distinct branching in the 16S rRNA or other phylogenetic trees (De Ley 1992; Viale et al. 1994; Eisen 1995; Gupta et al. 1997; Gupta 2000; Stepkowski et al. 2003; Emelyanov 2003a; Battistuzzi et al. 2004), there is no reliable phenotypic or molecular characteristic known at present that is uniquely shared by different α-proteobacteria and that distinguishes them from all other bacteria (Kersters et al. 2003). On the basis of 16S rRNA trees the α-proteobacteria have been divided into seven main subgroups or orders (viz. Caulobacterales, Rhizobiales, Rhodobacterales, Rhodospirillales, Rickettsiales, Sphingomonadales, and Parvularculales) (Maidak et al. 2001; Garrity & Holt 2001; Kersters et al. 2003). However, the branching order and interrelationships among these subgroups are presently not resolved and no distinctive features that can distinguish these groups from each other are known (Kersters et al. 2003).

In our recent work, we have been utilizing a new approach based on identification of conserved indels (also referred to as signatures) in protein sequences that is proving very useful in identifying different groups within Bacteria in clear molecular terms and clarifying evolutionary relationships among them (see www.bacterialphylogeny.com) (Gupta 1998, 2003, 2004; Griffiths & Gupta 2002, 2004a; Gupta & Griffiths 2002; Gupta et al. 2003). We have previously described many protein signatures that are distinctive characteristics of the proteobacterial phylum and which also provided information regarding its branching position relative to other bacterial groups (Gupta 1998, 2000; Griffiths & Gupta 2004b). This review focuses on examining the evolutionary relationships among α-proteobacteria using the signature sequence as well as traditional phylogenetic approaches. In recent years, complete genomes of several α-proteobacteria (viz. Bartonella henselae, Bart. quintana, Bradyrhizobium japonicum, Brucella melitensis, Bru. suis, Caulobacter crescentus, Mesorhizobium loti, Sinorhizobium meliloti, Rhodopseudomonas palustris, Agrobacterium tumefaciens, Rickettsia conorii, Ri. prowazekii, Ri. typhi, and Wolbachia sp. (Drosophila endosymbiont)) have become available (Andersson et al. 1998; Kaneko et al. 2000, 2002; Nierman et al. 2001; Wood et al. 2001; Ogata et al. 2001; Galibert et al. 2001; DelVecchio et al. 2002; Paulsen et al. 2002; Larimer et al. 2004; McLeod et al. 2004). These provide valuable resources for identifying novel molecular features that are likely distinctive characteristics of α-proteobacteria and its various subgroups, and which may prove helpful in clarifying the evolutionary relationships among them. This article describes, for the first time, a large number of conserved indels in widely distributed proteins that are either uniquely shared by all α-proteobacteria, or which are shared by only particular subgroups (i.e., families or orders) of this Class. These signatures provide novel and definitive molecular means for distinguishing α-proteobacteria and many of its subgroups from all other bacteria. The distribution of these signatures in different α-proteobacteria also enables one to logically deduce the relative branching orders and interrelationships among different α-proteobacterial subgroups. Phylogenetic studies have also been carried out based on 16S rRNA and a number of protein sequences. Based on this information, a detailed model for the evolutionary relationships among α-proteobacteria has been developed.

PHYLOGENETIC TREE FOR ALPHA PROTEOBACTERIA BASED ON 16S rRNA SEQUENCES

Although α-Proteobacteria comprise a major group within Bacteria (Garrity & Holt 2001) with >5200 sequences in the Ribosomal Database Project II (Maidak et al. 2001), there is no detailed review or article that discusses the evolutionary relationships within this group (i.e., indicating the relationships among different subgroups and orders within this Class) (Kersters et al. 2003). Most of the articles on α-Proteobacteria are aimed at clarifying the phylogenetic placement of particular species at either genus or family levels (Dumler et al. 2001; Gaunt et al. 2001; Young et al. 2001; Taillardat-Bisch et al. 2003; van Berkum et al. 2003; Broughton 2003; Stepkowski et al. 2003; Sawada et al. 2003). The second edition of Bergey's Manual (Ludwig & Klenk 2001) and the third edition of Prokaryotes (Kersters et al. 2003) present condensed phylogenetic trees for the α-Proteobacteria (or Proteobacteria) as a whole to indicate presumed relationships among different subgroups comprising this subdivision. However, most of these trees do not show any bootstrap scores or even individual species (Ludwig & Klenk 2001; Kersters et al. 2003), making it difficult to get a clear sense of the reliability of the observed (or indicated) relationships. Hence, as an initial step toward understanding the evolutionary relationships among α-Proteobacteria, a phylogenetic tree based on 16S rRNA sequences was constructed from 65 α-proteobacterial species, covering its major subgroups. The resulting neighbor-joining bootstrapped consensus tree is presented in Figure 1. The tree shown was rooted using the 16S rRNA sequences from epsilon proteobacteria, which show deeper branching than the α-subdivision in the rRNA as well as various other trees (Olsen et al. 1994; Viale et al. 1994; Eisen 1995; Gupta 1998). Bootstrap scores >60 (out of 100) are indicated on the nodes of the tree.

FIG. 1. A neighbor-joining bootstrap consensus tree for α-proteobacteria based on 16S rRNA sequences. The tree was bootstrapped 100 times and bootstrap scores which were >60 are indicated on the nodes. The tree was rooted using H. pylori. However, the tree topology was not altered on rooting with other deep branching bacteria (e.g., Aq. aeolicus). The groups of species corresponding to some of the main subgroups within α-proteobacteria are marked. ∗ indicates anomalous branching in the tree.
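The reporting rule used for Figure 1 (100 bootstrap replicates, with only scores above 60 shown on the nodes) can be illustrated with a short Python sketch. This is not the software used for the review. It assumes, purely for illustration, that each replicate tree has already been reduced to a set of clades, each clade being a frozenset of taxon names. The function names and the toy data are hypothetical.

```python
def bootstrap_support(clade, replicate_trees):
    """Percentage of replicate trees that contain the given clade."""
    clade = frozenset(clade)
    hits = sum(1 for tree in replicate_trees if clade in tree)
    return 100.0 * hits / len(replicate_trees)


def reportable_nodes(clades, replicate_trees, cutoff=60.0):
    """Keep only the clades whose support exceeds the cutoff."""
    reported = {}
    for clade in clades:
        score = bootstrap_support(clade, replicate_trees)
        if score > cutoff:
            reported[frozenset(clade)] = score
    return reported


# Toy data (hypothetical): 100 replicates, each a set of clades.
rick = frozenset({"Rickettsia", "Orientia"})
caul = frozenset({"Caulobacter", "Rhodobacter"})
replicates = [{rick, caul}] * 40 + [{rick}] * 55 + [set()] * 5

print(reportable_nodes([rick, caul], replicates))
```

In this toy data set the first clade is recovered in 95 of 100 replicates and would be labeled on the tree, whereas the second is recovered in only 40 and would be left unlabeled.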

In the resulting tree a number of different clades are either clearly (>90% bootstrap score) or reasonably well resolved. These included the clades corresponding to groups of species that are recognized as major orders within the α-Proteobacteria (Rhizobiales, Rhodospirillales, Caulobacterales, Sphingomonadales, Rhodobacterales, and Rickettsiales) (Ludwig & Klenk 2001; Garrity & Holt 2001; Kersters et al. 2003). Within Rhizobiales, the Bradyrhizobiaceae family of species was clearly separated from some of the other families within this order (viz. Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae) (Wang et al. 1998; Sadowsky & Graham 2000; Dumler et al. 2001; van Berkum et al. 2003; Stepkowski et al. 2003). Within α-Proteobacteria, the deepest branching was observed for the Rickettsiales group of species. Within the Rickettsiales, the Rickettsia and Orientia genera, which form part of the Rickettsiaceae family, were clearly resolved from the Anaplasmataceae family comprised of Ehrlichia, Wolbachia, Anaplasma, and Neorickettsia species (Dumler et al. 2001; Yu & Walker 2003). In contrast to these well-resolved clades or relationships, various nodes indicating the interrelationships among different orders had lower bootstrap scores (<60%), indicating that interrelationships among them were not resolved. The relationships observed here are very similar to those reported in earlier studies (Olsen et al. 1994; Sadowsky & Graham 2000; Dumler et al. 2001; Yu et al. 2001; Kersters et al. 2003; Yu & Walker 2003; Stepkowski et al. 2003). The tree shown here will serve as a useful reference for determining the evolutionary significance of various signature sequences.

SIGNATURE SEQUENCES DISTINCTIVE OF α-PROTEOBACTERIA

To identify conserved indels that might be distinctive of α-Proteobacteria, or of particular groups of species within this Class, multiple sequence alignments of all proteins that are found in the genome of Ri. prowazekii (Andersson et al. 1998) were created using the CLUSTAL X program (Jeanmougin et al. 1998). These alignments were visually inspected for any conserved indels that were mainly restricted to the α-proteobacterial species. The indels that we focussed on were generally of defined size and they were present in the same position in a given protein. The indels of interest were also required to be flanked on both sides by conserved regions to ensure that the sequence alignment in the region was reliable and that the indel under consideration did not result from an alignment artefact (Gupta 1998, 2000; Rokas & Holland 2000; Gupta & Griffiths 2002). The indels that appeared unique to other groups of bacteria, or which were present in only a single α-proteobacterial species, were not further investigated. This has led to identification of many conserved indels that are specific for α-proteobacteria or its subgroups. A brief description of these signatures as well as of the proteins in which they are found is given below.
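The screening criteria described above (an indel of defined size, at the same position, flanked on both sides by conserved regions, and restricted to one group) were applied in the review by visual inspection of CLUSTAL X alignments. As an illustration only, the following Python sketch applies comparable criteria automatically to a toy alignment. The function names, the flank width, the identity threshold, and the toy sequences are assumptions made for this example and are not taken from the review.

```python
from collections import Counter


def column_identity(column):
    """Fraction of sequences sharing the most common residue (0.0 if any gap)."""
    if "-" in column:
        return 0.0
    return Counter(column).most_common(1)[0][1] / len(column)


def find_group_indels(alignment, ingroup, flank=5, min_identity=0.8):
    """Return (kind, start_column, length) for indels restricted to `ingroup`.

    `alignment` maps sequence names to equal-length aligned strings in which
    '-' marks a gap. An 'insert' is a run of columns where every ingroup
    sequence has a residue and every other sequence has a gap; a 'deletion'
    is the reverse. Both flanks must be gap-free and well conserved.
    """
    names = list(alignment)
    inside = [n for n in names if n in ingroup]
    outside = [n for n in names if n not in ingroup]
    if not inside or not outside:
        raise ValueError("need at least one ingroup and one outgroup sequence")
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("aligned sequences must have equal length")

    def state(col):
        in_gap = [alignment[n][col] == "-" for n in inside]
        out_gap = [alignment[n][col] == "-" for n in outside]
        if not any(in_gap) and all(out_gap):
            return "insert"
        if all(in_gap) and not any(out_gap):
            return "deletion"
        return None

    def conserved(cols):
        scores = [column_identity([alignment[n][c] for n in names]) for c in cols]
        return sum(scores) / len(scores) >= min_identity

    hits = []
    col = 0
    while col < length:
        kind = state(col)
        if kind is None:
            col += 1
            continue
        start = col
        while col < length and state(col) == kind:
            col += 1
        end = col  # first column after the indel
        if start - flank < 0 or end + flank > length:
            continue  # not enough sequence to judge the flanks
        if conserved(range(start - flank, start)) and conserved(range(end, end + flank)):
            hits.append((kind, start, end - start))
    return hits


# Toy alignment (hypothetical sequences): a 3-residue insert in two sequences.
toy = {
    "alpha1": "MKTAYGSPLLVNE",
    "alpha2": "MKTAYGTPLLVNE",
    "beta1": "MKTAY---LLVNE",
    "gamma1": "MKTAY---LLVNE",
}
print(find_group_indels(toy, {"alpha1", "alpha2"}))  # [('insert', 5, 3)]
```

A screen of this kind could only shortlist candidates. The additional checks described in the text, such as discarding indels found in only a single species, would still have to be applied to its output.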

A. Signature Sequences That are Common to All α-Proteobacteria

Signature sequences in the following proteins are uniquely shared by different α-proteobacteria. Cytochrome c oxidase (Cox I) is an integral component of the respiratory chain in mitochondria and various aerobic bacteria, and it serves as the terminal electron acceptor (Stryer 1995; Andersson et al. 1998; Emelyanov 2003a). This membrane-associated complex requires the association of several protein subunits and the formation of many different metal centers. One of the proteins involved in its assembly is Ctag (Cox11), which is required for the formation of CuB and magnesium centers of Cox I (Hiser et al. 2000). In the Ctag protein, a 5 aa insert in a conserved region is present in all α-proteobacteria, but not found in any other bacteria (Figure 2). Within bacteria the homologs of this protein are mainly restricted to α, β, and γ-subdivisions of proteobacteria. Although a protein which carries out a similar function (also known as Ctag) is present in gram-positive bacteria, it does not show any sequence similarity to the proteobacterial homologs (Bengtsson et al. 2004). The observed insert in the Ctag protein is also present in various eukaryotic homologs, supporting their derivation from α-proteobacteria.

Another conserved insert that is specific for α-proteobacteria is present in the enzyme 5′-phosphoribosyl-5-aminoimidazole-4-N-succinocarboxamide (SAICAR or PurC) synthetase, which carries out the seventh step in the de novo purine biosynthetic pathway (Hui & Morrison 1993; Stryer 1995). The enzyme is encoded by the purC gene and it is broadly distributed in bacteria. A 3 aa insert in this protein is present in various α-proteobacteria (Figure 3), but not in any other bacteria, indicating that it is a distinctive characteristic of the group. The eukaryotic homologs of the SAICAR synthetase do not contain this insert, but their overall similarity in this region is limited (not shown). In addition to α-proteobacteria, a 2 aa insert is also present in this position in Magnetococcus sp. MC-1, suggesting that it may be distantly related to this group. The phylogenetic assignment of Magnetococcus sp. MC-1 is presently uncertain (Garrity & Holt 2001).

Two other signatures that are specific for α-proteobacteria are present in the replicative DNA helicase (DnaB) and the α subunit of ATP synthase complex. DnaB helicase is a multifunctional enzyme involved in the DNA replication process (Soni et al. 2003). It interacts with a number of proteins involved in DNA replication and exhibits multiple enzymatic activities including helicase, ATP hydrolysis and DNA binding. An insert of between 8 and 14 aa is present in a conserved region of DnaB, which is unique to various α-proteobacteria (Figure 4). Most of the Rhizobiales as well as Rhodobacter and Caulobacter species are found to contain the 14 aa insert, whereas a smaller insert is present in various Rickettsiales and certain other α-proteobacteria. In Mag. magnetotacticum, three different homologs of DnaB are found and they all contain the 14 aa insert. Based upon the fact that this insert is present in the same position in all α-proteobacteria, it is likely that it was introduced only once in a common ancestor of the group and that subsequent genetic changes led to the observed variation in its length. The DnaB homologs are not found in eukaryotic species.

FIG. 2. Partial sequence alignment for Cytochrome c oxidase assembly protein (Ctag) showing a 5 aa insert (boxed), which is specific for α-proteobacteria. Dashes in this sequence alignment as well as all others indicate identity with the amino acid on the top line. The position of this sequence in the protein is marked on the top. The accession numbers of various sequences are shown in the second column. Only representative sequences from different bacteria are shown. The identified insert is also present in various eukaryotic (mitochondrial) homologs indicating their derivation from an α-proteobacterial ancestor. The abbreviations used in the species names are listed at the end of this review.
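The display convention stated in the Figure 2 caption, in which a dash marks identity with the residue on the top line, can be reproduced with a few lines of Python. This is a minimal sketch for readers who wish to render their own alignments in the same style. It assumes that alignment gaps are written as spaces so that they cannot be confused with identity dashes. That choice, the function name, and the toy sequences are assumptions of this example and are not taken from the review.

```python
def to_dash_notation(reference, others, gap=" "):
    """Rewrite aligned sequences so that '-' means 'same as the top line'.

    Gap positions (written with `gap`) are left untouched, and residues that
    differ from the reference are shown explicitly.
    """
    rows = [reference]
    for seq in others:
        if len(seq) != len(reference):
            raise ValueError("aligned sequences must have equal length")
        row = []
        for residue, top in zip(seq, reference):
            if residue == gap:
                row.append(gap)
            elif residue == top:
                row.append("-")
            else:
                row.append(residue)
        rows.append("".join(row))
    return rows


# Toy example (hypothetical sequences): the second sequence lacks a 3 aa insert.
for line in to_dash_notation("MKTAYGSPLL", ["MKSAY   LL"]):
    print(line)
```

Run on the toy input, the second row shows only the single substitution and the three gap positions, with every other column collapsed to a dash.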

The synthesis of ATP in different organisms is carried out by F1F0 ATP synthase, a multisubunit complex located in the cytoplasmic membrane of bacteria or inner membrane of mitochondria (Stryer 1995; Leyva et al. 2003). The F1 portion of this complex is a heteromer made up of five subunits, α, β, γ, δ and ε, with the stoichiometry α3β3γδε. The α subunit of ATP synthase contains an 8 aa insert in a highly conserved region that is commonly present in all α-proteobacteria, but which is not found in any other proteobacterial species (Figure 5). Besides α-proteobacteria, inserts of variable lengths are also present in this position in various Actinobacteria and Bacteroidetes (not shown). In a phylogenetic tree based on ATP synthase (α), these latter groups do not show any affinity for each other or to the α-proteobacteria (Gupta 2004), indicating that these inserts have likely been introduced independently. The observed insert in ATP synthase α is also present in various eukaryotic homologs, providing evidence of their α-proteobacterial ancestry.


FIG. 3. Partial sequence alignment of PurC (SAICAR synthetase) showing a 3 aa insert that is specific for various α-proteobacteria.

The enzyme exonuclease VII degrades single-stranded DNA bidirectionally and processively (Chase et al. 1986). In the large subunit of exonuclease VII, encoded by the xseA gene, a number of useful signatures for α-proteobacteria are present. These signatures include a 3 aa insert and a 1 aa deletion that are present in all known α-proteobacterial homologs (Figure 6A), but not found in any other groups of bacteria. In the same position where the 3 aa insert is found, an additional 3 aa insert is present in all α-proteobacteria except the Rickettsiales. Elsewhere in this protein, a 1 aa deletion (at position 141 in the Ri. prowazekii sequence) that is unique to various Rickettsia species is also found (not shown). In a phylogenetic tree based on exonuclease VII sequences (Figure 7), all of the α-proteobacterial homologs formed a well-defined clade, which was strongly supported by bootstrap scores. Similar to the rRNA tree, the Rickettsiales species formed the earliest branching group within the α-proteobacteria, and both the Rickettsiales clade and a clade comprising the remainder of the α-proteobacteria were clearly resolved in this tree. The inferences from signature sequences are thus strongly supported by the phylogenetic analysis. The evolutionary stages where different identified signatures have likely been introduced in this gene/protein are marked on the tree (Figure 7).
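The logic by which nested signatures imply a branching order can be made explicit with a small Python sketch. If a signature is shared by every group except one, that group is inferred to have diverged before the signature was introduced, and repeating the argument along a chain of progressively narrower signatures yields the order in which groups split off. The signature-to-group assignments below follow the summary given in the abstract, simplified to six orders. The function names and the data layout are assumptions of this example, which is only an illustration of the reasoning and not the procedure used in the review.

```python
def compatible(a, b):
    """Two signatures fit one tree if their group sets are nested or disjoint."""
    return a <= b or b <= a or not (a & b)


def branching_order(signatures, target):
    """Order in which groups split off along the lineage leading to `target`.

    `signatures` maps a signature name to the set of groups that carry it.
    Raises ValueError if any two signatures conflict (overlap without nesting).
    """
    sets = [frozenset(groups) for groups in signatures.values()]
    for i, a in enumerate(sets):
        for b in sets[i + 1:]:
            if not compatible(a, b):
                raise ValueError("conflicting signatures: %s vs %s" % (sorted(a), sorted(b)))
    # Signatures that include the target form a nested chain, widest first.
    chain = sorted({s for s in sets if target in s}, key=len, reverse=True)
    if not chain:
        raise ValueError("no signature includes the target group")
    order = [sorted(outer - inner) for outer, inner in zip(chain, chain[1:])]
    order.append(sorted(chain[-1]))
    return order


# Simplified distribution of signatures, following the abstract.
ALL = {"Rickettsiales", "Rhodospirillales", "Sphingomonadales",
       "Rhodobacterales", "Caulobacterales", "Rhizobiales"}
SIGNATURES = {
    "Ctag insert": ALL,
    "succinyl-CoA synthetase": ALL - {"Rickettsiales"},
    "DnaA": {"Rhizobiales", "Rhodobacterales", "Caulobacterales"},
    "Asn-Gln amidotransferase": {"Rhodobacterales", "Caulobacterales"},
    "Trp-tRNA synthetase": {"Rhizobiales"},
}

for step in branching_order(SIGNATURES, "Rhizobiales"):
    print(" + ".join(step))
```

With this input the sketch lists Rickettsiales first, then Rhodospirillales and Sphingomonadales, then Caulobacterales and Rhodobacterales, and finally Rhizobiales, which matches the branching order proposed in the abstract.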

Another insert that is specific for α-proteobacteria is present in the enzyme prolipoprotein-phosphatidylglycerol (PLPG) transferase, which carries out the first committed step in the pathway leading to synthesis of lipid modified proteins (Figure 6B) (Qi et al. 1995). The indicated 3 aa insert in PLPG-transferase is unique to α-proteobacteria and not found in other bacteria. The homologs of exonuclease VII and PLPG-transferase were not detected in eukaryotes.

Two other proteins where α-proteobacteria-specific insertsare found are, RP-400 and puryvuate phosphate dikinase (PPDK).The first of these is a protein of unknown function present in theRi. prowazekii genome (RP-400), which is distantly related tomurein transglycosylases. This protein contains a 4–6 aa insertin a conserved region that is a distinctive characteristic of var-ious α-proteobacteria, except Zymomonas mobilis (Figure 8).The absence of this insert in the latter species could result from

Cri

tical

Rev

iew

s in

Mic

robi

olog

y D

ownl

oade

d fr

om in

form

ahea

lthca

re.c

om b

y C

orne

ll U

nive

rsity

Fo

r pe

rson

al u

se o

nly.

PHYLOGENY AND SIGNATURES DISTINCTIVE OF α-PROTEOBACTERIA 107

FIG. 4. Partial sequence alignment of replicative DNA helicase, DnaB, showing an 8–14 aa insert that is specific for various α-proteobacteria.

either selective loss, or exchange of this gene from some otherspecies lacking the insert. PPDK is a key enzyme in photosyn-thesis, which catalyzes the reversible conversion of phospho-enolpyruvate to pyruvate (Ku et al. 1996). This enzyme is notfound in mammalian cells but it is broadly distributed in bacteriaand plants. A conserved insert in PPDK provides an informativesignature for α-proteobacteria. Rickettsiales species contain a5 aa insert in this position, whereas a larger insert of 12 aa is

found in various other α-proteobacteria (Figure 9). Interestingly,an insert of 10 aa is also present in the same position in variousδ-proteobacteria, suggesting a distant relationship of this groupto the α-proteobacteria. Because the insert sequence in vari-ous species appears to be related, it is possible that this insertwas originally introduced in a common ancestor of the α- and δ-proteobacteria. The varying lengths of the inserts in Rickettsialesand other α-proteobacteria could then result from subsequent

Cri

tical

Rev

iew

s in

Mic

robi

olog

y D

ownl

oade

d fr

om in

form

ahea

lthca

re.c

om b

y C

orne

ll U

nive

rsity

Fo

r pe

rson

al u

se o

nly.

108 R. S. GUPTA

FIG. 5. Sequence alignment of ATP synthase α subunit showing an 8 aa insert in a highly conserved region that is present in various α-proteobacterial homologs,but not found in any other proteobacteria. The shared presence of this insert in various eukaryotic homologs provides evidence of their α-proteobacterial ancestry.

genetic changes in the branches leading to these groups. Alternatively, the inserts of different lengths could have been independently introduced in these groups and the observed sequence similarity may be a consequence of their related function. It is of interest that in contrast to other α-proteobacteria, which contain only a single PPDK homolog, two homologs of this protein are found in Bradyrhizobium (Brad.) japonicum. Of these, only one contained the insert. The homolog lacking the insert is quite divergent and it is possible that this may have been acquired by means of lateral gene transfer (LGT) from other bacteria.
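The insert lengths compared above (5 aa in Rickettsiales, 12 aa in other α-proteobacteria, 10 aa in δ-proteobacteria) can be read off a gapped alignment by counting the non-gap residues within the insert columns. The following Python sketch illustrates the idea under stated assumptions: the sequences are hypothetical placeholders rather than real PPDK data, and the function name insert_length, the taxon labels, and the column range are illustrative only.

```python
def insert_length(aligned_seq: str, start: int, end: int) -> int:
    """Count non-gap residues in alignment columns [start, end)."""
    return sum(1 for residue in aligned_seq[start:end] if residue != "-")


# Hypothetical toy alignment (NOT real PPDK sequences): conserved flanks
# "MKT" and "GLV" surround a 12-column insert region padded with gaps.
alignment = {
    "rickettsiales_like": "MKT" + "AAAAA-------" + "GLV",
    "other_alpha_like": "MKT" + "AAAAAAAAAAAA" + "GLV",
    "delta_like": "MKT" + "AAAAAAAAAA--" + "GLV",
    "no_insert": "MKT" + "------------" + "GLV",
}

# Assumed column range of the insert region in this toy alignment.
INSERT_COLUMNS = (3, 15)

lengths = {
    name: insert_length(seq, *INSERT_COLUMNS)
    for name, seq in alignment.items()
}

for name, length in lengths.items():
    print(f"{name}: {length} aa insert")
```

Counting residues only tells how long each insert is; whether inserts of different lengths share a common origin still depends on the sequence similarity discussed in the text.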

Two different proteins, FtsK and Cytochrome b (PetB), contain deletions which are mainly limited to the α-proteobacterial species. In the FtsK protein, which plays a central role in cell division and chromosome segregation in bacteria (Capiaux et al. 2002; Espeli et al. 2003), two 1 aa deletions are present in conserved regions that are largely distinctive of α-proteobacteria

FIG. 6. Signature sequences in exonuclease VII (A) and prolipoprotein-phosphatidylglycerol (PLPG) transferase (B) proteins that are distinctive of α-proteobacterial species. Exonuclease VII contains a 3 aa insert and a 1 aa deletion that are unique to all α-proteobacteria. The presence of an additional 3 aa insert distinguishes Rickettsiales species from other α-proteobacteria. The enzyme PLPG-transferase also contains a 3 aa insert that is specific for α-proteobacteria.

FIG. 7. A neighbor-joining bootstrap consensus tree based on exonuclease VII sequences. Bootstrap scores >50% are indicated on various nodes. All inserts and deletions were excluded from the sequence alignment used for phylogenetic analysis. The α-proteobacteria formed a well-defined clade in this tree; however, their branching position relative to other groups was not resolved. The Rickettsiales order formed the deepest branch within α-proteobacteria and they were also clearly resolved from other α-proteobacteria. The arrows mark the suggested positions where the identified signatures were introduced in this protein.

(Figure 10). One of these deletions is a distinctive characteristic of all α-proteobacteria and not found in any other bacteria. The other deletion, in addition to the α-proteobacteria, is also commonly present in the two Desulfovibrio species (δ-proteobacteria), suggesting a distant relationship of this group to α-proteobacteria, as also seen with the PPDK protein (Figure 9). In addition to these deletions, the FtsK protein also contains a 5–6 aa insert that is unique to various α-proteobacteria in comparison to the other groups of proteobacteria (present in position corresponding to aa 513–520 in Ri. prowazekii protein). Since the region where this insert is found exhibits variability in other bacteria, this signature is not shown. The FtsK protein has also been previously shown to contain an 8–9 aa insert in a different region of the protein that is a distinctive characteristic of various Bacteroidetes and Chlorobium species (Gupta 2004). The FtsK homologs are not found in most eukaryotic organisms. However, a homolog of this protein is present in Plasmodium yoelii (GenBank accession number 23485217). The origin and

FIG. 8. Partial sequence alignment of RP-400 protein showing a 4–6 aa insert that is specific for various α-proteobacteria, except Z. mobilis.

possible significance of this gene/protein is presently unclear. A 1 aa deletion that is specific for various α-proteobacteria is also present in the Cytochrome b (Cyt b; PetB) protein (Figure 11), a subunit of the cytochrome reductase, which is an integral part of the electron transport chain (Daldal et al. 1987; Stryer 1995; Emelyanov 2003a). This indel is not present in other bacteria, including Aquifex aeolicus, indicating that it is a deletion in α-proteobacteria, rather than an insert in other bacteria. Cyt b is one of the 13 proteins that are still encoded by mitochondrial DNA (Lang et al. 1999). Sequence information for Cyt b is available from a large number (>500) of mitochondrial genomes, and phylogenetic studies based on this protein provide evidence for the origin of mitochondria from within the Rickettsiaceae (Sicheritz-Ponten et al. 1998; Emelyanov 2003a). Similar to the α-proteobacteria, Cyt b from all eukaryotic mitochondrial homologs was found to lack this 1 aa indel, providing evidence of their specific relationship to the α-proteobacteria.

B. Signature Sequences Distinguishing Rickettsiales from Other α-Proteobacteria

In phylogenetic trees based on 16S rRNA, as well as many protein sequences, the Rickettsiales are found to form the deepest branching clade within α-proteobacteria (see Figures 1 and 7) (Dumler et al. 2001; Gaunt et al. 2001; Yu et al. 2001; Kersters et al. 2003; Yu & Walker 2003; Stepkowski et al. 2003). We have identified several signatures that are present in various α-proteobacteria, except the Rickettsiales. These signatures are described below.

The enzyme succinyl-CoA synthetase, which is part of the citric acid cycle, carries out cleavage of the thioester bond in succinyl-CoA in a coupled reaction that generates succinate and GTP (Bridger et al. 1987; Stryer 1995). It is the only step in the citric acid cycle that directly leads to the formation of a high-energy phosphate bond. The beta subunit of this protein contains a conserved insert of 10 aa that is commonly present in all other α-proteobacteria, except the Rickettsiales (Figure 12). Surprisingly, this insert is also present in Ral. metallidurans (a β-proteobacterium), but not in any other β-proteobacteria, including the closely related species Ral. solanacearum. This suggests that the Succ-CoA synthetase gene in Ral. metallidurans has likely originated by non-specific means such as LGT. A smaller unrelated insert in this region, which is presumably of independent origin, is also present in Cytophaga and Rhodopirellula species (not shown). It is of interest that a 7–8 aa insert is also present in this position in various eukaryotic homologs. It is unclear at present whether this latter insert

FIG. 9. Excerpt from sequence alignment for pyruvate phosphate dikinase (PPDK) protein showing a signature for α-proteobacteria. The Rickettsiales species contain a 5 aa long insert, whereas all other α-proteobacteria have a 12 aa insert in the same position. Two different homologs of PPDK are found in Brad. japonicum, only one of which is found to contain the insert. A smaller conserved insert of 10 aa is also present in this position in various δ-proteobacteria, suggesting that they may be specifically, but distantly, related to the α-proteobacteria.

has originated from an α-proteobacterial ancestor or it is of independent origin. If these inserts are of common origin, then this would suggest that the eukaryotic homologs of Succ-CoA synthetase have originated from an α-proteobacterial ancestor other than the Rickettsiales. This observation would be at variance with other evidence pointing to a closer relationship of mitochondria to the Rickettsiales species (Viale & Arakaki 1994; Gupta 1995; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al. 1999; Lang et al. 1999; Emelyanov 2001a, 2001b, 2003a). Emelyanov (2001a, 2001b) has observed a closer relationship of mitochondrial homologs to certain rickettsial species (e.g., Holospora obtusa, Caedibacter caryophila), for which sequence information for this protein is lacking at present. It is possible that Succ-CoA synthetase from these species may contain this insert. Presently, the possibility that the insert in eukaryotic homologs was independently introduced also cannot be excluded.

Another signature showing a similar distribution pattern has been identified in cytochrome oxidase polypeptide I (Cox I). In this case, a 5 aa insert in a conserved region is commonly present in various α-proteobacterial species except the Rickettsiales (Figure 13). It should be noted that α-proteobacteria contain two different related proteins. One of these, which harbors this insert, seems to correspond to Cox I, whereas the other homologs lacking the insert are mainly those from Cytochrome o ubiquinol oxidase (Davidson & Daldal 1987). However, all Rickettsiales species contain only a single homolog of this protein, corresponding to Cox I. The observed inserts in both Succ-CoA synthetase and Cox I were thus likely introduced in a common ancestor of the remainder of the α-proteobacteria after the branching of Rickettsiales. Similar to Cyt b, the Cox I in eukaryotic cells is also encoded by mitochondrial DNA (Andersson et al. 1998; Gray et al. 1999) and sequence information for this protein is available from a large number of mitochondrial

FIG. 10. Partial sequence alignments of FtsK protein showing two different signatures (1 aa deletions) that are informative characteristics of α-proteobacteria. The deletion on the left is unique to various α-proteobacteria, whereas the one on the right is also commonly shared by two Desulfovibrio species (δ-proteobacteria), suggesting their relatedness to the α-proteobacteria.

genomes. The eukaryotic homologs of Cox I do not contain the identified insert (results not shown), indicating their possible derivation from Rickettsiales (Emelyanov 2003a).

Two other proteins were found to contain inserts of variable lengths in highly conserved regions in various α-proteobacterial species, with the exception of Rickettsiales (Figure 14). In alanyl-tRNA synthetase (AlaRS), which is ubiquitously found in all organisms, an insert of 5–11 aa is present in a highly conserved region in various α-proteobacteria, except the Rickettsiales (and also Mag. magnetotacticum) (Figure 14A). Another signature showing a similar distribution pattern is found in the MutS protein, which is involved in DNA mismatch repair (Sixma 2001; Martins-Pinheiro et al. 2004). In this case, a conserved insert of 2–5 aa is present in various α-proteobacteria (Figure 14B), but not in Rickettsiales. The simplest explanation for these signatures is that they were introduced in an ancestral α-proteobacterial lineage, after the branching of Rickettsiales (and also possibly Mag. magnetotacticum). The observed variations in the lengths of these inserts have presumably resulted from subsequent genetic changes.

We have also identified a number of α-proteobacteria-specific signatures in proteins for which no homologs are found in the Rickettsiales. In the MutY protein, which is an A-G specific DNA glycosylase involved in DNA repair (Parker & Eshleman 2003; Martins-Pinheiro et al. 2004), a 4–9 aa insert in a conserved region is present in various α-proteobacteria (Figure 15A). An insert of similar length is also present in most eukaryotic homologs (with the exception of Anopheles gambiae), indicating their possible derivation from α-proteobacteria. Another signature showing similar species distribution is present in

FIG. 11. Partial sequence alignment for Cyt b protein showing a 1 aa deletion that is specific for various α-proteobacteria. This deletion is also present in all mitochondrial homologs (Cyt b is encoded by mitochondrial DNA), providing strong evidence of their α-proteobacterial ancestry.

the protein homoserine dehydrogenase (Figure 15B). This indel consists of a 1 aa insert in a conserved region that is present in various α-proteobacteria, but not any other proteobacteria. The homologs of both these proteins were not detected in the Rickettsiales species and their absence is very likely due to selective loss of these genes in a common ancestor of the Rickettsiales (Martins-Pinheiro et al. 2004), presumably due to the intracellular life-style of these organisms (Boussau et al. 2004). The observed inserts in these genes could have been introduced in a common ancestor of the α-proteobacteria, either before or after the loss of these genes in Rickettsiales.

Several proteins contain conserved inserts that are either unique for the Rickettsiales or for the two main families, Rickettsiaceae and Anaplasmataceae, comprising this order (Dumler et al. 2001; Yu & Walker 2003). The Rickettsiales-specific signatures are present in the proteins XerD and leucine aminopeptidase

FIG. 12. Partial sequence alignment of Succ-CoA synthase showing a 10 aa insert that is present in various α-proteobacteria, except the Rickettsiales. This insert is not found in other bacteria, except Ral. metallidurans, which has likely acquired it by non-specific means. A smaller insert is also present in this position in various eukaryotic homologs.

(Figure 16). XerD protein (Figure 16A) is a part of the XerCD integrase/recombinase that is involved in the cell division process and decatenation of DNA duplexes (Ip et al. 2003). A 7 aa insert is present in a conserved region of this protein, which is uniquely shared by all Rickettsiales and not found in any other bacteria (Figure 16A). Another 2 aa insert that is specific for Rickettsiales is present in leucine aminopeptidase (Figure 16B), which is an exopeptidase that selectively releases N-terminal amino acids from peptides and proteins (Gonzales & Robert-Baudouy 1996). The signatures that are specific for Rickettsia include a 4 aa insert in a highly conserved region of the transcription repair coupling factor (Mfd) (Martins-Pinheiro et al. 2004) (Figure 17A), a 10 aa insert in ribosomal protein L19 (Figure 17B) and a 1 aa insert in the FtsZ protein (Figure 17C). Two additional Rickettsia-specific signatures consisting of a 1 aa insert in the major sigma factor-70 (at position 141 in the Ri. prowazekii sequence) and a 1 aa deletion in exonuclease VII (at position 137 in the Ri. prowazekii homolog) were also identified, but they are not shown here. The identified signatures in these proteins are present only in various Rickettsiaceae species and not found in other Rickettsiales (viz. Ehrlichia, Wolbachia, Anaplasma) or other groups of bacteria. Within eukaryotes, a homolog of the transcription repair-coupling factor is only detected in Arabidopsis thaliana and it lacks the identified insert (results not shown). The homologs of ribosomal protein L19 are found in various plants and algae but not in any of the animal

FIG. 13. Partial sequence alignment of Cox I showing a 5 aa insert that is present in various α-proteobacteria, except Rickettsiales. The other α-proteobacteria also contain a second more distantly related homolog that lacks this insert.

species. Of these, an 8 aa insert in the same position is present only in the homolog from Cyanophora paradoxa (not shown). The significance and possible origin of this insert are not clear. Similar to the ribosomal protein L19, FtsZ homologs are also found only in plants but not in animals. These homologs also lacked the insert that is present in Rickettsiaceae. The plant homologs of these proteins likely correspond to those of the plastids, which because of their cyanobacterial ancestry (Gray 1989; Morden et al. 1992; Margulis 1993; Gupta et al. 2003) are expected to be lacking Rickettsia-specific signatures.

We have also identified two large inserts that are commonly shared by the Ehrlichia, Wolbachia, and Anaplasma species but not found in any of the Rickettsia species or other bacteria. These signatures include a 15 aa insert in the HlyD family of secretory protein (Figure 18A) and a 10–11 aa insert in the tRNA guanine transglycosylase (Tgt) protein (Figure 18B), involved in the synthesis of the hypermodified nucleoside queuosine (Reuter & Ficner 1995). The eukaryotic homologs of Tgt do not contain this insert, providing evidence against their origin from the Anaplasmataceae family of species (results not shown). The homologs of HlyD are not found in eukaryotes. These signatures point to a close relationship between Ehrlichia, Wolbachia, and Anaplasma species, which is also seen in phylogenetic trees based on many other sequences (Dumler et al. 2001; Gaunt et al. 2001; Yu et al. 2001; Taillardat-Bisch et al. 2003; Yu & Walker 2003; Stepkowski et al. 2003; Emelyanov 2003a). These signatures were likely introduced in a common ancestor of the Anaplasmataceae family, which now includes all Ehrlichia, Anaplasma, Cowdria, Wolbachia, and Neorickettsia species (Dumler et al. 2001; Yu & Walker 2003).

C. Signature Sequences for Other Subgroups of α-Proteobacteria and Providing Information Regarding Their Interrelationships

Signature sequences in a number of other proteins are useful in distinguishing other subgroups of α-proteobacteria and they also provide information clarifying the interrelationships among them. In the DnaA protein involved in chromosomal replication (Messer 2002), a 5 aa insert is present in various Rhizobiales and Caulobacter/Rhodobacter species (Figure 19A). However, this insert is not found in any of the Rickettsiales, as well as most α-proteobacterial species belonging to the orders Sphingomonadales and Rhodospirillales. The species Mag. magnetotacticum contains two different homologs of this protein, only one of which is found to contain the insert. Another insert showing a similar distribution pattern is present in the protein RP-057, which is a homolog of the glucose-inhibited division protein B (Romanowski et al. 2002). This protein contains a 3 aa insert that is common to the same subgroups of α-proteobacteria

FIG. 14. Signature sequences in alanyl-tRNA synthetase (AlaRS) and MutS proteins that are informative for the α-proteobacteria. In AlaRS (upper panel) an insert of variable length in a highly conserved region is present in various α-proteobacteria, except the Rickettsiales and Mag. magnetotacticum. The DNA mismatch repair protein MutS (lower panel) also contains a 3–5 aa insert in various α-proteobacteria, except Rickettsiales. The insert lengths in this case also serve to differentiate Rhodospirillales and Sphingomonadales species from the Rickettsiales, Rhodobacterales, and Caulobacterales.

FIG. 15. Partial sequence alignments of MutY (upper panel) and homoserine dehydrogenase (lower panel) proteins showing inserts (boxed) in conserved regions that are specific for α-proteobacteria. The homologs of both these proteins are not found in the Rickettsiales. For MutY, an insert of approximately similar length is also present in various eukaryotic homologs, with the exception of Anopheles gambiae.

as the insert in the DnaA protein, but which is not found in the Rickettsiales or Rhodospirillales/Sphingomonadales species (Figure 19B). Variable-length inserts are also present in this position in other bacteria (not shown). However, within proteobacteria this insert is limited to the above subgroups of α-proteobacteria. Based on the distribution patterns of these signatures, these inserts were likely introduced in a common ancestor of the Rhizobiales and Caulobacter/Rhodobacter after the branching of Rickettsiales and Rhodospirillales/Sphingomonadales orders (Figure 19C).

FIG. 16. Signature sequences in XerD integrase (upper panel) and leucine aminopeptidase (lower panel) that are distinctive of the Rickettsiales order and not found in other α-proteobacteria or other bacteria.

FIG. 17. Signature sequences in transcription repair coupling factor Mfd (A), Ribosomal protein L19 (B), and FtsZ (C) proteins that are distinctive of Rickettsia species and not found in other α-proteobacteria including Anaplasmataceae family (e.g., Wolbachia, Ehrlichia, Anaplasma) of species. Two additional signatures showing similar distribution are found in the sigma factor-70 and exonuclease VII proteins.

FIG. 18. Signature sequences in RP-314 (A) and tRNA guanine transglycosylase (Tgt) (B) proteins that are distinctive of the Anaplasmataceae family of species and not found in Rickettsia or various other bacteria.

The protein DNA ligase (NAD dependent; Lig A) contains a 12 aa insert in a highly conserved region that is commonly shared by various Rhizobiales as well as Rhodobacterales species (Figure 20A), but which is not found in C. crescentus, Rhodospirillales (Rhodo. rubrum, Mag. magnetotacticum), and Sphingomonadales (Z. mobilis, Novo. aromaticivorans). The absence of this insert in the Mesorhizobium sp. BNC1 is somewhat surprising, but it could result from non-specific mechanisms. This signature suggests that Rhizobiales species may be more closely related to Rhodobacterales in comparison to

FIG. 19. Partial sequence alignments of DnaA (panel A) and RP-057 (panel B) proteins showing inserts in conserved regions (boxed) that are only present in various Rhizobiales, Rhodobacterales, and Caulobacter, but not found in other α-proteobacteria or bacteria. These inserts were likely introduced in a common ancestor of the above groups after the branching of Rickettsiales, Rhodospirillales, and Sphingomonadales as indicated in panel C.

Caulobacter and other α-proteobacteria. However, another prominent insert (11 aa) in a highly conserved region of the protein asparagine-glutamine amidotransferase points to a specific relationship between Rhodobacterales and Caulobacter species (Figure 20B), to the exclusion of all other α-proteobacteria. Martins-Pinheiro et al. (2004) have reported phylogenetic analysis based on LigA sequences. The α-proteobacteria formed a distinct clade in the tree, but they consisted of only certain Rhizobiaceae and Caulobacter species (Martins-Pinheiro et al. 2004). To fully understand the evolutionary significance of these signatures, it would be necessary to obtain sequence information for these proteins from additional Caulobacterales.

We have also identified many conserved inserts that are specific for species belonging to the Rhizobiales order. The

FIG. 20. Signature sequences in DNA ligase A (upper panel) and Asn-Gln amidotransferase (lower panel) that are informative for α-proteobacteria. The signature in DNA ligase is commonly shared by various Rhizobiales as well as C. crescentus species, while that in the Asn-Gln amidotransferase is uniquely shared by Rhodobacterales and Caulobacter, indicating a specific relationship between these subgroups.

Trp-tRNA synthetase (TrpRS) contains a large insert in a highly conserved region which is uniquely shared by various Rhizobiales species (Figure 21A), but not found in any of the other α-proteobacteria or other groups of bacteria (results for other groups of bacteria not shown). The absence of this insert in various Rickettsiales, Rhodospirillales, Sphingomonadales, and Rhodobacterales as well as Caulobacter provides evidence that these groups have branched off prior to the introduction of this insert (Figure 21A). The length of the insert in TrpRS also serves to distinguish the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae family of species from those belonging to Bradyrhizobiaceae and Methylobacteriaceae. The insert in the former group of species is 19 aa long, whereas the latter species contain only a 9–10 aa insert. Because the insert sequence in all of these species is conserved, it is likely that the insert was introduced only once in a common ancestor of the Rhizobiales and subsequent modification has led to the observed length variation. The distinctness of Bradyrhizobium and Rhodopseudomonas from other Rhizobiales is also supported by a signature (3 aa insert) in Seryl-tRNA synthetase (SerRS), which is uniquely present in these species (Figure 21B) and serves to distinguish them from other Rhizobiales as well as other α-proteobacteria. A schematic diagram indicating the suggested positions where signatures described in Figures 20 to 23 have been introduced is presented in Figure 21C.

We have also identified several signatures that are uniquely present in the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae families of species, but not found in other α-proteobacteria including Bradyrhizobium and Rhodopseudomonas. These signatures include a 7 aa insert in Oxoglutarate dehydrogenase (Figure 22A), a 5 aa insert in Succ-CoA synthase (Figure 22B),

FIG. 21. Signature sequences in Trp-tRNA synthetase (upper panel) and Ser-tRNA synthetase (lower panel) that are informative for α-proteobacteria. The first of these signatures is specific for Rhizobiales. The insert length in this signature also distinguishes Bradyrhizobiaceae and Methylobacteriaceae species from other Rhizobiales. The insert in the Ser-tRNA synthetase is specific for the Bradyrhizobiaceae species and distinguishes this family from other Rhizobiales.

a 3 aa insert in LytB metalloproteinase (Figure 23A) and a 2 aa insert in DNA gyrase A subunit (Figure 23B). A smaller insert in oxoglutarate dehydrogenase is also present in Novosphingobacteria, but since its sequence is unrelated, it is either of independent origin or could have resulted from LGT. In addition to these proteins, a 1–2 aa insert that is specific for Rhizobiaceae is also found in a conserved region of the LepA protein (Figure 23C). The evolutionary positions where these signatures have been introduced are indicated in Figure 21C. It is of interest that in contrast to other Rhizobiaceae species, which contain only 1 aa inserts, Sinorhizobium meliloti and Agrobacterium tumefaciens are found to contain 2 aa inserts in the LepA protein (Figure 23C). This observation points to a specific relationship between these two Rhizobiaceae species, as has been suggested based on other lines of evidence (Young et al. 2001). A 2 aa insert in the DnaK protein, which is commonly shared by species belonging to Rhizobium and Sinorhizobium genera, as well as Ehrlichia and a few other proteobacteria, has also been described by Stepkowski et al. (2003).

D. Signature Sequences Indicating the Phylogenetic Placement of α-Proteobacteria

A number of signatures described in earlier work have indicated that proteobacteria is a late branching phylum in comparison to other main groups within Bacteria (Gupta 1998, 2000, 2003; Gupta & Griffiths 2002; Griffiths & Gupta 2004b). These signatures included a 4 aa insert in alanyl-tRNA synthetase, an insert of >100 aa in RNA polymerase β (RpoB) subunit, a 10 aa insert in CTP synthase, a 2 aa insert in inorganic pyrophosphatase, and a 2 aa insert in Hsp70 protein. The identified signatures in these proteins were present in all proteobacterial homologs, but they were absent from most other bacterial phyla (viz. Firmicutes, Actinobacteria, Thermotogae, Deinococcus-Thermus, Cyanobacteria, Spirochetes). In a number of cases,

FIG. 22. Signature sequences in Oxoglutarate dehydrogenase (upper panel) and Succ-CoA synthase (lower panel) proteins that are commonly shared by only certain Rhizobiales families (e.g., Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae), and not found in Bradyrhizobiaceae or other α-proteobacteria.

where the corresponding proteins were present in Archaea (viz. RpoB, Hsp70, AlaRS), the archaeal homologs also lacked the indicated inserts, indicating that the absence of these indels constitutes the ancestral state and that these signatures were introduced after branching of the groups lacking these indels (Gupta & Griffiths 2002; Gupta 2003; Griffiths & Gupta 2004b). A number of identified signatures (7 aa insert in SecA, 1 aa deletion in the Lon protease) were uniquely shared by only the α, β, and γ-proteobacteria, providing evidence of the later branching of these subdivisions (Gupta 2000, 2001, 2003). Two additional signatures that are helpful in understanding the phylogenetic placement of α-proteobacteria are described in the following section.
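The outgroup reasoning used above (archaeal homologs lacking an insert imply that its absence is ancestral) can be expressed as a small rule. The Python sketch below is only an illustration of that logic, not the procedure used in the original analyses: the function name infer_polarity and the boolean encoding are assumptions, and the presence/absence values simply restate what the text reports for the RpoB insert.

```python
def infer_polarity(ingroup, outgroup):
    """Infer whether an indel is an insertion or a deletion.

    Both arguments map a taxon name to True when the extra residues are
    present and False when they are absent. The state shared by all
    outgroup taxa is taken as ancestral; ingroup taxa that differ from
    it carry the derived state.
    """
    outgroup_states = set(outgroup.values())
    if len(outgroup_states) != 1:
        # Outgroup taxa disagree (or none were given): polarity unknown.
        return ("unresolved", [])
    ancestral_present = outgroup_states.pop()
    derived = sorted(
        taxon for taxon, present in ingroup.items()
        if present != ancestral_present
    )
    event = "deletion" if ancestral_present else "insertion"
    return (event, derived)


# Presence/absence of the large RpoB insert as described in the text.
rpob_bacteria = {
    "Proteobacteria": True,
    "Firmicutes": False,
    "Actinobacteria": False,
}
rpob_outgroup = {"Archaea": False}

result = infer_polarity(rpob_bacteria, rpob_outgroup)
print(result)
```

The same rule applied with an outgroup that retains the residues would classify the character as a deletion in the taxa lacking them.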

Figure 24 shows the excerpt from a sequence alignment for the transcription termination factor Rho, which is an RNA-binding protein that plays a central role in RNA chain termination (Opperman & Richardson 1994). This protein is present in all main groups of bacteria, except cyanobacteria (Gupta & Griffiths 2002; Gupta 2003), where RNA chain termination presumably occurs via a Rho-independent mechanism. A 3 aa insert is present in a highly conserved region of Rho, which is a distinctive characteristic of all α, β, and γ-proteobacteria. The length of this insert is 2–3 aa longer in various Rickettsiales species, which suggests an additional insert in this group of bacteria. In contrast to the α, β, and γ-proteobacteria, this insert is not present in δ, ε-proteobacteria or any other

FIG. 23. Signature sequences in LytB (A), DNA gyrase A (B), and LepA (C) proteins that are distinctive characteristics of only certain Rhizobiales families (e.g., Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae), but not found in Bradyrhizobiaceae or other α-proteobacteria.

Cri

tical

Rev

iew

s in

Mic

robi

olog

y D

ownl

oade

d fr

om in

form

ahea

lthca

re.c

om b

y C

orne

ll U

nive

rsity

Fo

r pe

rson

al u

se o

nly.

PHYLOGENY AND SIGNATURES DISTINCTIVE OF α-PROTEOBACTERIA 127

FIG. 24. Partial sequence alignment of Rho protein showing a conserved insert that is commonly shared by various α, β, and γ -proteobacteria, but not found inany other groups of bacteria including the δ,ε-proteobacteria and all other phyla of gram-positive and gram-negative bacteria. This insert was likely introduced in acommon ancestor of the α, β, and γ -proteobacteria after the branching of other bacterial phyla (see Figure 26). Many other signatures showing similar distributionpattern and supporting the indicated branching position of α, β, and γ -proteobacteria have been described in earlier work.

groups of Gram-negative and Gram-positive bacteria. Thissignature provides evidence that the groups consisting of α,β, and γ -proteobacteria have branched off late in compari-son to the other groups of bacteria. Another novel signaturethat is useful in understanding the branching position of α-proteobacteria is present in the ATP synthase alpha subunit.In this case, an 11 aa insert in a highly conserved region

is present in various β and γ -proteobacteria, but it is notfound in any α-proteobacteria or other groups of bacteria (Fig-ure 25). The absence of this insert in various other bacteria aswell as archael homologs provides evidence that it was intro-duced in a common ancestor of the β and γ -proteobacteria af-ter the divergence of other bacteria, including α-proteobacteria(Figure 26).
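The polarity argument used here (an indel state that is also found in archaeal homologs is taken as ancestral, so the groups carrying the other state are inferred to share a common ancestor) can be written out as a short Python sketch. This is only an illustration: the function name `polarize_indel`, the group labels, and the presence/absence table (transcribed from the description of the ATP synthase alpha subunit insert in the text) are assumptions made for this example and are not part of the original analysis.

```python
def polarize_indel(has_insert, outgroup):
    """Infer which indel state is derived by comparison with an outgroup.

    has_insert: dict mapping group name -> True if the insert is present.
    outgroup:   names of the groups used as the outgroup (here, Archaea).
    Returns (derived_event, groups_sharing_it), or None when the outgroup
    is not uniform and the polarity cannot be decided this way.
    """
    outgroup_states = {has_insert[name] for name in outgroup}
    if len(outgroup_states) != 1:
        return None
    ancestral = outgroup_states.pop()
    derived_event = "deletion" if ancestral else "insert"
    sharing = sorted(
        name for name, state in has_insert.items()
        if state != ancestral and name not in outgroup
    )
    return derived_event, sharing


# Hypothetical encoding of the 11 aa insert in the ATP synthase alpha subunit.
atp_synthase_alpha = {
    "alpha": False, "beta": True, "gamma": True,
    "delta": False, "epsilon": False,
    "other bacteria": False, "Archaea": False,
}

result = polarize_indel(atp_synthase_alpha, outgroup=["Archaea"])
print(result)
```

Under these assumptions the sketch reports the insert as the derived state and lists the β and γ groups as the ones sharing it, which is the inference drawn in the text.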


FIG. 25. Partial sequence alignment of ATP synthase α-subunit showing a highly conserved insert that is commonly shared by various β and γ-proteobacteria, but not found in any other groups of bacteria including the α- and δ,ε-proteobacteria and all other phyla of Gram-positive and Gram-negative bacteria. This insert is also not present in archaeal or eukaryotic homologs, indicating that it was introduced in a common ancestor of the β and γ-proteobacteria after the branching of all other groups including α-proteobacteria. Other signatures showing similar relationships have been described in earlier work (Gupta 1998, 2000, 2001, 2003).


FIG. 26. Evolutionary relationships among α-proteobacteria based on signature sequences in different proteins. The branching position of α-proteobacteria relative to other groups of bacteria is based on signature sequences such as those shown in Figures 24 and 25. The evolutionary stages where these signatures have been introduced are indicated by thick arrows. Many other signatures that are helpful in resolving the branching order of other groups have been described in our earlier work (Gupta 1998, 2000, 2001, 2003, 2004; Gupta & Griffiths 2002; Griffiths & Gupta 2004b; see also www.bacterialphylogeny.com). The evolutionary relationship among α-proteobacteria shown here was deduced based on the distribution patterns of different signatures described in this review. The long thin arrows mark the positions where the signature sequences in various proteins have likely been introduced.


CONCLUSIONS

The α-proteobacteria are a morphologically and metabolically very diverse group of organisms, which are presently recognized as a distinct group solely on the basis of their branching pattern in the 16S rRNA tree (Woese et al. 1984; Stackebrandt et al. 1988; Murray et al. 1990; De Ley 1992; Ludwig & Klenk 2001; Kersters et al. 2003). No biochemical, molecular, or other features are presently known that are uniquely shared by various α-proteobacteria and that can clearly distinguish this group from all others. The evolutionary relationships within this group of bacteria are also presently not understood. This review describes many novel signatures consisting of conserved inserts and deletions in widely distributed proteins that provide definitive means for defining the α-proteobacteria and many of its subgroups, and for understanding evolutionary relationships among them. Because of the rarity and highly specific nature of these genetic changes, the possibility of their arising independently by either convergent or parallel evolution is low (Gupta 1998; Rokas & Holland 2000). The simplest and most parsimonious explanation for such rare genetic changes, when restricted to a particular clade(s), is that they were introduced only once in common ancestors of the particular group(s) and then passed on to various descendants. The signature approach has proven very useful in the past in clarifying a number of important evolutionary relationships, which could not be reliably resolved based on phylogenetic trees (Rivera & Lake 1992; Baldauf & Palmer 1993). Our earlier work has identified many signatures that are either specific for particular groups of bacteria (viz. chlamydiae, cyanobacteria, Bacteroidetes-Chlorobi-Fibrobacter, Deinococcus-Thermus, Proteobacteria) (Gupta 2000, 2004; Griffiths & Gupta 2002, 2004a; Gupta et al. 2003), or which are commonly shared by certain bacterial phyla, providing information regarding their interrelationships (Gupta 1998, 2003; Gupta & Griffiths 2002; Griffiths & Gupta 2004b).

A summary of the different signatures that were described in this review and the overall picture of α-proteobacterial evolution that emerges based upon them is presented in Figure 26. Most of the signatures described here were unique for either all α-proteobacteria or certain of its subgroups, and except for a few isolated instances, they were not found in other bacteria. These findings provide evidence that the genes containing these signatures have not been laterally transferred from α-proteobacteria to other bacteria, although LGT for certain other genes has been previously reported (Wolf et al. 1999). A large number of these signatures, present in broadly distributed proteins (cytochrome assembly protein Ctag, SAICAR synthetase, DnaB, ATP synthase α, exonuclease VII, PLPG transferase, RP-400, pyruvate phosphate dikinase, FtsK, and Cyt b), were distinctive characteristics of all α-proteobacteria. Two additional proteins, MutY and homoserine dehydrogenase, also contain signatures that were specific for α-proteobacteria. However, the homologs of these proteins were not found in Rickettsiales. These signatures, for the first time, describe molecular characteristics that unify all α-proteobacteria, and provide means to clearly distinguish them from all other bacteria. The unique presence of these signatures in various α-proteobacteria, which is a very diverse group (Kersters et al. 2003), strongly suggests that these indels should be functionally important for this group of organisms. Hence, studies examining their functional effects should be of much interest.

Signature sequences in other proteins are helpful in defining many of the α-proteobacterial subgroups and in clarifying evolutionary relationships among them. A number of proteins, which include Succ-CoA synthetase, Cox I, AlaRS, and MutS, contain conserved inserts that are shared by all other α-proteobacteria except the Rickettsiales. The homologs of these proteins from other bacteria also lack these indels, providing evidence that these signatures were introduced in a common ancestor of other α-proteobacteria after the divergence of Rickettsiales. The Rickettsiales order also consistently forms the deepest branching lineage in 16S rRNA and various protein trees (Dumler et al. 2001; Gaunt et al. 2001; Kersters et al. 2003; Yu & Walker 2003; Stepkowski et al. 2003). Signature sequences in a number of proteins were found to be specific for either the Rickettsiales order (viz. XerD integrase and leucine aminopeptidase) or its two main families, Rickettsiaceae (viz. transcription repair coupling factor, ribosomal protein L19, and FtsZ proteins) and Anaplasmataceae (RP-314 and Tgt proteins). These signatures were likely introduced in the common ancestors of these groups. These groups are also clearly distinguished in the phylogenetic trees based on 16S rRNA (Figure 1) (Dumler et al. 2001; Yu & Walker 2003) and various proteins (Figure 7) (Stepkowski et al. 2003; Emelyanov 2003a).

Signature sequences in a number of proteins (viz. chromosomal replication factor, RP-057 and DNA ligase) were commonly shared by various Rhizobiales, Rhodobacterales, and in most cases Caulobacterales (currently represented by only C. crescentus), but they were not present in Rickettsiales, Rhodospirillales, or Sphingomonadales species. These results provide evidence that the groups lacking these signatures diverged prior to the introduction of these signatures. A unique signature has also been identified for the Rhizobiales order (viz. TrpRS), and another that is commonly shared by Rhodobacterales and C. crescentus. The latter signature suggests a specific relationship between the Rhodobacterales and Caulobacter groups. The relationships indicated by these signatures are also generally supported by the phylogenetic trees based on 16S rRNA and various proteins (Gaunt et al. 2001; Kersters et al. 2003; Stepkowski et al. 2003; Emelyanov 2003a). Signature sequences in a number of other proteins (viz. oxoglutarate dehydrogenase, Succ-CoA synthase, DNA gyrase A, LepA, and LytB) are able to distinguish the Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae families from the Bradyrhizobiaceae species. The distinctness of Bradyrhizobiaceae from other Rhizobiales is also clearly indicated by a signature sequence in seryl-tRNA synthetase that is specific for this group. These signatures are also consistent with the observation that Bradyrhizobiaceae species are only distantly related to other Rhizobiales (viz. Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae) (Figure 1) (Sadowsky & Graham 2000; Gaunt et al. 2001; van Berkum et al. 2003; Kersters et al. 2003; Stepkowski et al. 2003; Moulin et al. 2004). A specific relationship between Sinorhizobium and Agrobacterium species was also indicated by the signature sequence in the LepA protein.

On the basis of 16S rRNA or various gene/protein trees, it has proven difficult to reliably determine the interrelationships among different α-proteobacterial subgroups (Ludwig & Klenk 2001; Kersters et al. 2003). However, based upon the distribution patterns of various signatures, it is now possible to logically deduce the branching order of the main α-proteobacterial subgroups (Figure 26). The model for α-proteobacterial evolution that has been developed here is based upon a large number of proteins that are involved in different functions. This model is internally highly consistent, and it is difficult to logically explain the observed distributions of these signatures by alternate means. The model developed here is also consistent with the relationships that are resolved in the 16S rRNA or other phylogenetic trees (viz. deep branching and distinctness of Rickettsiales; a closer relationship between Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae as compared to Bradyrhizobiaceae; a closer relationship between Rhodobacterales and Caulobacterales; distinctness of Rickettsiaceae from Anaplasmataceae species; distinctness of the Rhizobiales order containing various root nodule bacteria, etc.) (Sadowsky & Graham 2000; Dumler et al. 2001; Kersters et al. 2003; Yu & Walker 2003; Moulin et al. 2004). A few minor inconsistencies seen at present (e.g., phylogenetic placement of Ca. crescentus) should be clarified when sequence information from additional species becomes available. In this context, it is important to acknowledge that sequence information is available at present from only a limited number of α-proteobacterial species. Although these species include representatives from different α-proteobacterial orders, it is necessary to obtain sequence information for many other species from different genera and families to test and validate this model.
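The logic of deducing a branching order from signature distributions can be sketched in a few lines of Python. The sketch assumes that each signature arose once, so that the sets of groups carrying different signatures must be either nested or disjoint; nested sets then give the order in which groups diverged. The function names (`build_hierarchy`, `newick`) and the simplified table of orders are hypothetical and serve only to illustrate the reasoning (for instance, Caulobacterales is treated as carrying all three DnaA/RP-057/DNA ligase signatures, whereas the text says "in most cases"). This is not the procedure actually used to construct Figure 26.

```python
from itertools import combinations


def build_hierarchy(signatures):
    """Map each signature-defined clade to the smallest clade enclosing it.

    signatures: dict mapping signature name -> set of groups carrying it.
    Raises ValueError if two distributions partially overlap, since such
    a pair cannot both be explained by single events on one tree.
    """
    clades = sorted({frozenset(groups) for groups in signatures.values()}, key=len)
    for a, b in combinations(clades, 2):
        if not (a <= b or b <= a or a.isdisjoint(b)):
            raise ValueError(f"conflicting signatures: {sorted(a)} vs {sorted(b)}")
    parent = {}
    for i, clade in enumerate(clades):
        # Clades are sorted by size, so the first strict superset is the smallest.
        parent[clade] = next((c for c in clades[i + 1:] if clade < c), None)
    return parent


def newick(clade, parent):
    """Render the groups nested inside `clade` as a Newick-like string."""
    children = [c for c, p in parent.items() if p == clade]
    # Groups not covered by any child clade branch off unresolved at this level.
    unresolved = clade - frozenset().union(*children)
    parts = sorted(unresolved) + sorted(newick(c, parent) for c in children)
    return parts[0] if len(parts) == 1 else "(" + ",".join(parts) + ")"


# Simplified, hypothetical encoding of the distributions described in the text.
ORDERS = {"Rickettsiales", "Rhodospirillales", "Sphingomonadales",
          "Rhodobacterales", "Caulobacterales", "Rhizobiales"}

signatures = {
    "alpha-wide (e.g. DnaB, FtsK)": ORDERS,
    "Succ-CoA synthetase / Cox I / AlaRS / MutS": ORDERS - {"Rickettsiales"},
    "DnaA / RP-057 / DNA ligase": {"Rhizobiales", "Rhodobacterales", "Caulobacterales"},
    "Asn-Gln amidotransferase": {"Rhodobacterales", "Caulobacterales"},
    "TrpRS": {"Rhizobiales"},
}

parent = build_hierarchy(signatures)
root = max(parent, key=len)
tree = newick(root, parent)
print(tree)
```

With this input the nesting places Rickettsiales outside all other orders, leaves Rhodospirillales and Sphingomonadales unresolved relative to each other, and groups Rhodobacterales with Caulobacterales as the sister of Rhizobiales. A partially overlapping pair of distributions would raise an error instead, which corresponds to the kind of inconsistency that would require an explanation other than a single introduction event.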

Signature sequences in a number of proteins, a few of which are described here, also provide evidence that α-proteobacteria is a late diverging group within Bacteria (Gupta 1998, 2000, 2003; Gupta & Griffiths 2002). Within proteobacteria, the δ and ε-subdivisions are indicated to have branched prior to α-proteobacteria, whereas the β and γ-subdivisions are indicated as later branching groups (see also www.bacterialphylogeny.com). The branching of α-proteobacteria in this position is also supported by the 16S rRNA and various protein trees (Olsen et al. 1994; Viale et al. 1994; Eisen 1995; Kersters et al. 2003). The α-proteobacteria, which is a very large group within Bacteria (>5000 entries in the RDP-II database) (Maidak et al. 2001), are presently recognized as a Class within the Proteobacteria phylum (Woese et al. 1984; Stackebrandt et al. 1988; Murray et al. 1990; Ludwig & Schleifer 1999; Boone et al. 2001; Kersters et al. 2003). However, presently there are no clearly defined criteria for the higher taxa (viz. Phylum, Class, Order, etc.) within Bacteria (Woese et al. 1985; Stackebrandt 2000; Ludwig & Klenk 2001; Gupta & Griffiths 2002; Gupta 2002). Based on the observations that α-proteobacteria can now be clearly distinguished from all other bacteria based upon a large number of molecular characteristics, and that this group also branches distinctly from all other groups of bacteria including the β,γ- and δ,ε-proteobacteria, it is suggested that α-proteobacteria should be recognized as a main group or phylum within Bacteria, rather than as a subdivision or class of the Proteobacteria (Gupta 2000, 2004; Gupta & Griffiths 2002). Signature sequences in a few proteins (viz. PPDK and FtsK) indicate that α-proteobacteria might have shared a distant ancestry with the δ-proteobacteria exclusive of other bacteria, but this relationship needs to be further investigated and confirmed.

The α-proteobacteria have also given rise to mitochondria (Margulis 1970; Gray & Doolittle 1982; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al. 1999; Gupta 2000; Emelyanov 2001a, 2003a, 2003b) and very likely played a central role in the origin of the ancestral eukaryotic cell (Gupta & Singh 1994; Gupta & Golding 1996; Margulis 1996; Gupta 1998; Martin & Muller 1998; Lopez-Garcia & Moreira 1999; Karlin et al. 1999; Lang et al. 1999; Emelyanov 2003b; Rivera & Lake 2004). Many of the α-proteobacteria-specific signatures identified in the present work are also present in the mitochondrial/eukaryotic homologs, providing additional evidence of their derivation from an α-proteobacterial ancestor. In a few cases, the α-proteobacterial signatures are present in genes that are encoded by the mitochondrial DNA (viz. Cox I and Cyt b). The shared presence of these signatures in the mitochondrial homologs provides further strong evidence for the α-proteobacterial ancestry of mitochondria, as previously shown by phylogenetic analysis (Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Emelyanov 2003a). The current evidence suggests that within α-proteobacteria, the Rickettsiales group of species are the closest relatives of mitochondria (Gupta 1995; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al. 1999; Lang et al. 1999; Emelyanov 2001a, 2001b). However, this view is supported by only some of the identified signatures, and further work is needed to clarify this aspect.

LIST OF ABBREVIATIONS

AlaRS, alanyl-tRNA synthetase; CFBG, Chlamydia-Fibrobacter-Bacteroidetes-Green sulfur bacteria; Cyt., Cytochrome; Cox I, Cytochrome oxidase polypeptide I; LGT, lateral gene transfer; PLPG, Prolipoprotein-phosphatidylglycerol; PPDK, pyruvate phosphate dikinase; RP, Rickettsia prowazekii; SerRS, serine-tRNA synthetase; Succ-CoA, Succinyl-CoA; Tgt, tRNA-guanine transglycosylase; TrpRS, tryptophanyl-tRNA synthetase. Abbreviations in the species names are: A., Agrobacterium; Ana., Anaplasma; Aqu., Aquifex; Azo., Azotobacter; Azospir., Azospirillum; Bac., Bacillus; Bact., Bacteroides; Bart., Bartonella; Bdello., Bdellovibrio; Bif., Bifidobacterium; Bor., Borrelia; Bord., Bordetella; Brad., Bradyrhizobium; Bru., Brucella; Buch., Buchnera; Burk., Burkholderia; Ca., Caulobacter; Camp., Campylobacter; Cb., Chlorobium; Cfx., Chloroflexus; Chl., Chlamydia; Chlam., Chlamydophila; Chromo., Chromobacterium; Clo., Clostridium; Cor., Corynebacterium; Cox., Coxiella; Cyt., Cytophaga; Dei., Deinococcus; Dechloro., Dechloromonas; Des., Desulfovibrio; Desulf., Desulfitobacterium; Dros. endo., Drosophila endosymbiont; E., Escherichia; Ent., Enterococcus; Fuso., Fusobacterium; Geo., Geobacter; H., Haemophilus; Hel., Helicobacter; Lac., Lactococcus; Lactobac., Lactobacillus; Lep., Leptospira; Lis., Listeria; Leg., Legionella; Mag., Magnetococcus; Meso., Mesorhizobium; Methano., Methanobacterium; Methyl., Methylobacillus; Microbul., Microbulbifer; Myc., Mycobacterium; Myx., Myxococcus; Nei., Neisseria; Nit., Nitrosomonas; Nitro., Nitrosospira; Novo., Novosphingobacterium; Olig., Oligotropha; Para., Paracoccus; Pas., Pasteurella; Photobac., Photobacterium; Por., Porphyromonas; Pse., Pseudomonas; Ral., Ralstonia; Rhi., Rhizobium; Rho., Rhodobacter; Rhodo., Rhodospirillum; Rhodopseud., Rhodopseudomonas; Ri., Rickettsia; Shew., Shewanella; Sino., Sinorhizobium; Sta., Staphylococcus; Str., Streptomyces; Strep., Streptococcus; Syn., Synechococcus; Sulfo., Sulfolobus; T., Thermotoga; Thermoan., Thermoanaerobacter; Thermosyn., Thermosynechococcus; Tre., Treponema; Vib., Vibrio; Xan., Xanthomonas; Thiobac., Thiobacillus; Wol., Wolinella; Xyl., Xylella; Yer., Yersinia; Z., Zymomonas.

ACKNOWLEDGMENTS

The competent technical assistance of Pinay Kanth, Jeveon Clements, Larissa Shamseer, and Adeel Mahmood in creating sequence alignments of proteins from Rickettsia prowazekii and other genomes is gratefully acknowledged. I am also thankful to Yan Li for developing certain computer programs that facilitated the creation of signature sequence files and for help in setting up the bacterial signatures website (www.bacterialphylogeny.com). Thanks are also due to Emma Griffiths and Pinay Kanth for helpful comments on the manuscript. The work on signature sequences described here was mostly completed by August 2004. This work was supported by a research grant from the Natural Sciences and Engineering Research Council of Canada and the Canadian Institutes of Health Research.

REFERENCES

Andersson, S.G., Zomorodipour, A., Andersson, J.O., Sicheritz-Ponten, T., Alsmark, U.C., Podowski, R.M., Naslund, A.K., Eriksson, A.S., Winkler, H.H., and Kurland, C.G. 1998. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396, 133–140.

Baldauf, S.L., and Palmer, J.D. 1993. Animals and fungi are each other’s closest relatives: Congruent evidence from multiple proteins. Proc. Natl. Acad. Sci. USA 90, 11558–11562.

Battistuzzi, F.U., Feijao, A., and Hedges, S.B. 2004. A genomic timescale of prokaryote evolution: Insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evol. Biol. 4, 44.

Bengtsson, J., von Wachenfeldt, C., Winstedt, L., Nygaard, P., and Hederstedt, L. 2004. CtaG is required for formation of active cytochrome c oxidase in Bacillus subtilis. Microbiology 150, 415–425.

Boone, D.R., Castenholz, R.W., and Garrity, G.M. 2001. Bergey’s Manual of Systematic Bacteriology. Springer, New York.

Boussau, B., Karlberg, E.O., Frank, A.C., Legault, B.A., and Andersson, S.G. 2004. Computational inference of scenarios for alpha-proteobacterial genome evolution. Proc. Natl. Acad. Sci. USA 101, 9722–9727.

Bridger, W.A., Wolodko, W.T., Henning, W., Upton, C., Majumdar, R., and Williams, S.P. 1987. The subunits of succinyl-coenzyme A synthetase—function and assembly. Biochem. Soc. Symp. 103–111.

Broughton, W.J. 2003. Roses by other names: Taxonomy of the Rhizobiaceae. J. Bacteriol. 185, 2975–2979.

Capiaux, H., Lesterlin, C., Perals, K., Louarn, J.M., and Cornet, F. 2002. A dual role for the FtsK protein in Escherichia coli chromosome segregation. EMBO Rep. 3, 532–536.

Chase, J.W., Rabin, B.A., Murphy, J.B., Stone, K.L., and Williams, K.R. 1986. Escherichia coli exonuclease VII. Cloning and sequencing of the gene encoding the large subunit (xseA). J. Biol. Chem. 261, 14929–14935.

Daldal, F., Davidson, E., and Cheng, S. 1987. Isolation of the structural genes for the Rieske Fe-S protein, cytochrome b and cytochrome c1, all components of the ubiquinol: Cytochrome c2 oxidoreductase complex of Rhodopseudomonas capsulata. J. Mol. Biol. 195, 1–12.

Davidson, E., and Daldal, F. 1987. Primary structure of the bc1 complex of Rhodopseudomonas capsulata. Nucleotide sequence of the pet operon encoding the Rieske cytochrome b, and cytochrome c1 apoproteins. J. Mol. Biol. 195, 13–24.

De Ley, J. 1992. The Proteobacteria: Ribosomal RNA cistron similarities and bacterial taxonomy. In The Prokaryotes, eds. A. Balows, H.G. Truper, M. Dworkin, W. Harder, and K.H. Schleifer, 2111–2140. Springer-Verlag, New York.

DelVecchio, V.G., Kapatral, V., Redkar, R.J., Patra, G., Mujer, C., Los, T., Ivanova, N., Anderson, I., Bhattacharyya, A., Lykidis, A., Reznik, G., Jablonski, L., Larsen, N., D’Souza, M., Bernal, A., Mazur, M., Goltsman, E., Selkov, E., Elzer, P.H., Hagius, S., O’Callaghan, D., Letesson, J.J., Haselkorn, R., and Kyrpides, N. 2002. The genome sequence of the facultative intracellular pathogen Brucella melitensis. Proc. Natl. Acad. Sci. USA 99, 443–448.

Dumler, J.S., Barbet, A.F., Bekker, C.P., Dasch, G.A., Palmer, G.H., Ray, S.C., Rikihisa, Y., and Rurangirwa, F.R. 2001. Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: Unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int. J. Syst. Evol. Microbiol. 51, 2145–2165.

Eisen, J.A. 1995. The RecA protein as a model molecule for molecular systematic studies of bacteria: Comparison of trees of RecAs and 16S rRNAs from the same species. J. Mol. Evol. 41, 1105–1123.

Emelyanov, V.V. 2001a. Evolutionary relationship of Rickettsiae and mitochondria. FEBS Letters 501, 11–18.

Emelyanov, V.V. 2001b. Rickettsiaceae, rickettsia-like endosymbionts, and the origin of mitochondria. Biosci. Rep. 21, 1–17.

Emelyanov, V.V. 2003a. Common evolutionary origin of mitochondrial and rickettsial respiratory chains. Arch. Biochem. Biophys. 420, 130–141.

Emelyanov, V.V. 2003b. Mitochondrial connection to the origin of eukaryotic cell. Eur. J. Biochem. 270, 1599–1618.

Espeli, O., Lee, C., and Marians, K.J. 2003. A physical and functional interaction between Escherichia coli FtsK and topoisomerase IV. J. Biol. Chem. 278, 44639–44644.

Esser, C., Ahmadinejad, N., Wiegand, C., Rotte, C., Sebastiani, F., Gelius-Dietrich, G., Henze, K., Kretschmann, E., Richly, E., Leister, D., Bryant, D., Steel, M.A., Lockhart, P.J., Penny, D., and Martin, W. 2004. A genome phylogeny for mitochondria among α-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear genes. Mol. Biol. Evol. 21, 1643–1660.

Falah, M., and Gupta, R.S. 1994. Cloning of the hsp70 (dnaK) genes from Rhizobium meliloti and Pseudomonas cepacia: Phylogenetic analyses of mitochondrial origin based on a highly conserved protein sequence. J. Bacteriol. 176, 7748–7753.


Galibert, F., Finan, T.M., Long, S.R., Puhler, A., Abola, P., Ampe, F., Barloy-Hubler, F., Barnett, M. J., Becker, a., Boistard, P., Bothe, G., Boutry, M.,Bowser, L., Buhrmester, J., Cadieu, E., Capela, D., Chain, P., Cowie, A.,Davis, R. W., Dreano, s., Federspiel, N. A., Fisher, R. F., Gloux, S., godrie, T.,Goffeau, A., Golding, B., Gouzy, J., Gurjal, M., Hernandez-Lucas, I., Hong,A., Huizar, L., Hyman, R. W., Jons, T., Kahn, D., Kahn, M. L., Kalman, S.,Keating, D. H., Kiss, E., Komp, c., Lelaure, v., Masuy, d., Palm, C., Peck,M. C., Pohl, T. M., Portetelle, d., Purnelle, B., Ramsperger, U., Surzycki,r., Thebault, P., Vandenbol, M., Vorholter, F. J., Weidner, S., Wells, D. H.,Wong, K., Yeh, K. C., and Batut, J. 2001. The composite genome of thelegume symbiont Sinorhizobium meliloti. Science 293, 668–672.

Garrity, G.M., and Holt, J.G. 2001. The road map to the manual. In Bergey’sManual of Systematic Bacteriology, eds. D. R. Boone and R. W. Castenholz,119–166. Springer-Verlag, Berlin.

Gaunt, M.W., Turner, S.L., Rigottier-Gois, L., Lloyd-Macgilp, S.A., and Young,J.P. 2001. Phylogenies of atpD and recA support the small subunit rRNA-based classification of rhizobia. Int. J. Syst. Evol. Microbiol. 51, 2037–2048.

Gonzales, T., and Robert-Baudouy, J. 1996. Bacterial aminopeptidases: Proper-ties and functions. FEMS Microbiol. Rev. 18, 319–344.

Gray, M.W. 1989. The evolutionary origins of organelles. Trends in Genet. 5,294–299.

Gray, M.W., Burger, G., and Lang, B.F. 1999. Mitochondrial evolution. Science283, 1476–1481.

Gray, M.W., and Doolittle, W.F. 1982. Has the endosymbiont hypothesis beenproven?. Microbiol. Rev. 46, 1–42.

Griffiths, E., and Gupta, R.S. 2002. Protein signatures distinctive of chlamy-dial species: Horizontal transfer of cell wall biosynthesis genes glmU fromArchaebacteria to Chlamydiae, and murA between Chlamydiae and Strepto-myces. Microbiology 148, 2541–2549.

Griffiths, E., and Gupta, R.S. 2004a. Distinctive protein signatures pro-vide molecular markers and evidence for the monophyletic nature of theDeinococcus-Thermus phylum. J. Bacteriol. 186, 3097–3107.

Griffiths, E., and Gupta, R.S. 2004b. Signature sequences in diverse proteinsprovide evidence for the late divergence of the order Aquificales. InternationalMicrobiol. 7, 41–52.

Gupta, R.S. 1995. Evolution of the chaperonin families (Hsp60, Hsp10 and Tcp-1) of proteins and the origin of eukaryotic cells. Mol. Microbiol. 15, 1–11.

Gupta, R.S. 1998. Protein phylogenies and signature sequences: a reappraisal ofevolutionary relationships among archaebacteria, eubacteria, and eukaryotes.Microbiol. Mol. Biol. Rev. 62, 1435–1491.

Gupta, R.S. 2000. The phylogeny of Proteobacteria: Relationships to other eu-bacterial phyla and eukaryotes. FEMS Microbiol. Rev. 24, 367–402.

Gupta, R.S. 2001. The branching order and phylogenetic placement of speciesfrom completed bacterial genomes, based on conserved indels found in variousproteins. Inter. Microbiol. 4, 187–202.

Gupta, R.S. 2002. Phylogeny of Bacteria: Are we now close to understandingit?. ASM News. 68, 284–291.

Gupta, R.S. 2003. Evolutionary relationships among photosynthetic bacteria.Photosynth. Res. 76, 173–183.

Gupta, R.S. 2004. The phylogeny and signature sequences characteristics ofFibrobacters, Chlorobi and Bacteroidetes. Crit. Rev. Microbiol. 30, 123–143.

Gupta, R.S., Aitken, K., Falah, M., and Singh, B. 1994. Cloning of Giardialamblia heat shock protein HSP70 homologs: Implications regarding originof eukaryotic cells and of endoplasmic reticulum. Proc. Natl. Acad. Sci. USA91, 2895–2899.

Gupta, R.S., Bustard, K., Falah, M., and Singh, D. 1997. Sequencing of heatshock protein 70 (DnaK) homologs from Deinococcus proteolyticus and Ther-momicrobium roseum and their integration in a protein-based phylogeny ofprokaryotes. J. Bacteriol. 179, 345–357.

Gupta, R.S., and Golding, G.B. 1996. The origin of the eukaryotic cell. TrendsBiochem. Sci. 21, 166–171.

Gupta, R.S., and Griffiths, E. 2002. Critical issues in bacterial phylogenies.Theor. Popul. Biol. 61, 423–434.

Gupta, R.S., Pereira, M., Chandrasekera, C., and Johari, V. 2003. Molecularsignatures in protein sequences that are characteristic of Cyanobacteria andplastid homologues. Int. J. Syst. Evol. Microbiol. 53, 1833–1842.

Gupta, R.S., and Singh, B. 1994. Phylogenetic analysis of 70 kD heat shockprotein sequences suggests a chimeric origin for the eukaryotic cell nucleus.Curr. Biol. 4, 1104–1114.

Hiser, L., Di Valentin, M., Hamer, A.G., and Hosler, J.P. 2000. Cox11p is requiredfor stable formation of the Cu(B) and magnesium centers of cytochrome coxidase. J. Biol. Chem. 275, 619–623.

Hui, F.M., and Morrison, D.A. 1993. Identification of a purC gene from Strep-tococcus pneumoniae. J. Bacteriol. 175, 6364–6367.

Ip, S.C., Bregu, M., Barre, F.X., and Sherratt, D.J. 2003. Decatenation of DNAcircles by FtsK-dependent Xer site-specific recombination. EMBO J. 22,6399–6407.

Jeanmougin, F., Thompson, J.D., Gouy, M., Higgins, D.G., and Gibson, T.J.1998. Multiple sequence alignment with Clustal x. Trends Biochem. Sci. 23,403–405.

Kaneko, T., Nakamura, Y., Sato, S., Asamizu, E., Kato, T., Sasamoto, S.,Watanabe, a., Idesawa, K., Ishikawa, a., Kawashima, K., Kimura, t., Kimura,T., Kishida, Y., Kiyokawa, c., Kohara, M., Matsumoto, M., Matsuno, a.,Mochizuki, Y., Nakayama, S., Nakazaki, N., Shimpo, S., Sugimoto, M.,Takeuchi, C., Yamada, M., and tabata, S., Complete genome structure ofthe nitrogen-fixing symbiotic bacterium Mesorhizobium loti. DNA Res. 7,331–338.

Kaneko, T., Nakamura, Y., Sato, S., Minamisawa, K., UCHIUMI, T., Sasamoto,s., Watanabe, A., Idesawa, K., Iriguchi, M., Kawashima, K., Kohara, M.,Matsumoto, M., Shimpo, S., Tsuruoka, H., Wada, T., Yamada, M., and Tabata,S., 2002. Complete genomic sequence of nitrogen-fixing symbiotic bacteriumBradyrhizobium japonicum USDA110. DNA Res. 9, 189–197.

Karlin, S., and Brocchieri, L. 2000. Heat shock protein 60 sequence comparisons:Duplications, lateral transfer, and mitochondrial evolution. Proc. Natl. Acad.Sci. USA 97, 11348–11353.

Karlin, S., Brocchieri, L., Mrazek, J., Campbell, A.M., and Spormann, A.M.1999. A chimeric prokaryotic ancestory of mitochondria and primitive eu-karyotes. Proc. Natl. Acad. Sci. USA 96, 9190–9195.

Kersters, K., Devos, P., Gillis, M., Vandamme, P., and Stackebrandt, E. 2003.Introduction to the Proteobacteria. In The Prokaryotes: An Evolving Elec-tronic Resource for the Microbiological Community, ed. M. e. al. Dworkin,Springer-Verlag, New York.

Kolber, Z.S., Plumley, F.G., Lang, A.S., Beatty, J.T., Blankenship, R.E.,VanDover, C.L., Vetriani, C., Koblizek, M., Rathgeber, C., and Falkowski,P.G. 2001. Contribution of aerobic photoheterotrophic bacteria to the carboncycle in the ocean. Science 292, 2492–2495.

Ku, M.S., Kano-Murakami, Y., and Matsuoka, M. 1996. Evolution and expres-sion of C4 photosynthesis genes. Plant Physiol. 111, 949–957.

Kurland, C.G., and Andersson, S.G. 2000. Origin and evolution of the mito-chondrial proteome. Microbiol. Mol. Biol. Rev. 64, 786–820.

Lake, J.A., and Rivera, M.C. 1994. Was the nucleus the first endosymbiont?Proc. Natl. Acad. Sci. USA 91, 2880–2881.

Lang, B.F., Gray, M.W., and Burger, G. 1999. Mitochondrial genome evo-lution and the origin of eukaryotes. Annual Review of Genetics 33, 351–397.

Larimer, F.W., Chain, P., Hauser, L., Lamerdin, J. Malfatti, S., Do, L., Land,M. L., Pelletier, D. A., Beatty, J. t., Lang, A. S., Tabita, F. R., Gibson, J. L.,Hanson, T. E., Bobst, C., Torres, J. L., Peres, C., Harrison, F. H., Gibson,J., and Harwood, C. S., 2004. Complete genome sequence of the metaboli-cally versatile photosynthetic bacterium Rhodopseudomonas palustris. Nat.Biotechnol. 22, 55–56.

Leyva, J.A., Bianchet, M.A., and Amzel, L.M. 2003. Understanding ATP synthesis: Structure and mechanism of the F1-ATPase (Review). Mol. Membr. Biol. 20, 27–33.

Lopez-Garcia, P., and Moreira, D. 1999. Metabolic symbiosis at the origin of eukaryotes. Trends Biochem. Sci. 24, 88–93.

Ludwig, W., and Klenk, H.-P. 2001. Overview: A phylogenetic backbone and taxonomic framework for prokaryotic systematics. In Bergey’s Manual of Systematic Bacteriology, eds. D. R. Boone and R. W. Castenholz, 49–65. Springer-Verlag, Berlin.

Ludwig, W., and Schleifer, K.H. 1999. Phylogeny of Bacteria beyond the 16S rRNA Standard. ASM News 65, 752–757.

Maidak, B.L., Cole, J.R., Lilburn, T.G., Parker, C.T., Jr., Saxman, P.R., Farris, R.J., Garrity, G.M., Olsen, G.J., Schmidt, T.M., and Tiedje, J.M. 2001. The RDP-II (Ribosomal Database Project). Nucleic Acids Res. 29, 173–174.

Margulis, L. 1970. Origin of Eukaryotic Cells. Yale University Press, New Haven, CT.

Margulis, L. 1993. Symbiosis in Cell Evolution. W.H. Freeman and Company, New York.

Margulis, L. 1996. Archaeal-eubacterial mergers in the origin of Eukarya: Phylogenetic classification of life. Proc. Natl. Acad. Sci. USA 93, 1071–1076.

Martin, W., and Muller, M. 1998. The hydrogenosome hypothesis for the first eukaryote. Nature 392, 37–41.

Martins-Pinheiro, M., Galhardo, R.S., Lage, C., Lima-Bessa, K.M., Aires, K.A., and Menck, C.F. 2004. Different patterns of evolution for duplicated DNA repair genes in bacteria of the Xanthomonadales group. BMC Evol. Biol. 4, 29.

McLeod, M.P., Qin, X., Karpathy, S.E., Gioia, J., Highlander, S. K., Fox, G. E., McNeill, T. Z., Jiang, H., Muzny, D., Jacob, L. S., Hawes, A. C., Sodergren, E., Gill, R., Hume, J., Morgan, M., Fan, G., Amin, A. G., Gibbs, R. A., Hong, C., Yu, X. J., Walker, D. H., and Weinstock, G. M. 2004. Complete genome sequence of Rickettsia typhi and comparison with sequences of other rickettsiae. J. Bacteriol. 186, 5842–5855.

Messer, W. 2002. The bacterial replication initiator DnaA. DnaA and oriC, the bacterial mode to initiate DNA replication. FEMS Microbiol. Rev. 26, 355–374.

Morden, C.W., Delwiche, C.F., Kuhsel, M., and Palmer, J.D. 1992. Gene phylogenies and the endosymbiotic origin of plastids. Biosystems 28, 75–90.

Moreno, E., and Moriyon, I. 2001. The Genus Brucella. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, eds. M. Dworkin et al. Springer-Verlag, New York.

Moulin, L., Bena, G., Boivin-Masson, C., and Stepkowski, T. 2004. Phylogenetic analyses of symbiotic nodulation genes support vertical and lateral gene co-transfer within the Bradyrhizobium genus. Mol. Phylogenet. Evol. 30, 720–732.

Murray, R.G.E., Brenner, D.J., Colwell, R.R., De Vos, P., Goodfellow, M., Grimont, P.A.D., Pfennig, N., Stackebrandt, E., and Zavarzin, G.A. 1990. Report of the Ad Hoc Committee on approaches to taxonomy within the Proteobacteria. Int. J. Syst. Bacteriol. 40, 213–215.

Nierman, W.C., Feldblyum, T.V., Laub, M.T., Paulsen, I.T., Nelson, K.E., Eisen, J., Heidelberg, J.F., Alley, M.R., Ohta, N., Maddock, J.R., Potocka, I., Nelson, W.C., Newton, A., Stephens, C., Phadke, N.D., Ely, B., DeBoy, R.T., Dodson, R.J., Durkin, A.S., Gwinn, M.L., Haft, D.H., Kolonay, J.F., Smit, J., Craven, M.B., Khouri, H., Shetty, J., Berry, K., Utterback, T., Tran, K., Wolf, A., Vamathevan, J., Ermolaeva, M., White, O., Salzberg, S.L., Venter, J.C., Shapiro, L., and Fraser, C.M. 2001. Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98, 4136–4141.

Ogata, H., Audic, S., Renesto-Audiffren, P., Fournier, P.E., Barbe, V., Samson, D., Roux, V., Cossart, P., Weissenbach, J., Claverie, J.M., and Raoult, D. 2001. Mechanisms of evolution in Rickettsia conorii and R. prowazekii. Science 293, 2093–2098.

Olsen, G.J., Woese, C.R., and Overbeek, R. 1991. The winds of (evolutionary) change: Breathing new life into microbiology. J. Bacteriol. 176, 1–6.

Opperman, T., and Richardson, J.P. 1994. Phylogenetic analysis of sequences from diverse bacteria with homology to the Escherichia coli rho gene. J. Bacteriol. 176, 5033–5043.

Parker, A.R., and Eshleman, J.R. 2003. Human MutY: Gene structure, protein functions and interactions, and role in carcinogenesis. Cell Mol. Life Sci. 60, 2064–2083.

Paulsen, I.T., Seshadri, R., Nelson, K.E., Eisen, J.A., Heidelberg, J. F., Read, T. D., Dodson, R. J., Umayam, L., Brinkac, L. M., Beanan, M. J., Daugherty, S. C., DeBoy, R. T., Durkin, A. S., Kolonay, J. F., Madupu, R., Nelson, W. C., Ayodeji, B., Kraul, M., Shetty, J., Malek, J., Van Aken, S. E., Reidmuller, S., Tettelin, H., Gill, S. R., White, O., Salzberg, S. L., Hoover, D. L., Lindler, L. E., Halling, S. M., Boyle, S. M., and Fraser, C. M. 2002. The Brucella suis genome reveals fundamental similarities between animal and plant pathogens and symbionts. Proc. Natl. Acad. Sci. USA 99, 13148–13153.

Qi, H.Y., Sankaran, K., Gan, K., and Wu, H.C. 1995. Structure-function relationship of bacterial prolipoprotein diacylglyceryl transferase: Functionally significant conserved regions. J. Bacteriol. 177, 6820–6824.

Reuter, K., and Ficner, R. 1995. Sequence analysis and overexpression of the Zymomonas mobilis tgt gene encoding tRNA-guanine transglycosylase: Purification and biochemical characterization of the enzyme. J. Bacteriol. 177, 5284–5288.

Ribeiro, S., and Golding, G.B. 1998. The mosaic nature of the eukaryotic nucleus. Mol. Biol. Evol. 15, 779–788.

Rivera, M.C., and Lake, J.A. 1992. Evidence that eukaryotes and eocyte prokaryotes are immediate relatives. Science 257, 74–76.

Rivera, M.C., and Lake, J.A. 2004. The ring of life provides evidence for a genome fusion origin of eukaryotes. Nature 431, 152–155.

Rokas, A., and Holland, P.W. 2000. Rare genomic changes as a tool for phylogenetics. Trends Ecol. Evol. 15, 454–459.

Romanowski, M.J., Bonanno, J.B., and Burley, S.K. 2002. Crystal structure of the Escherichia coli glucose-inhibited division protein B (GidB) reveals a methyltransferase fold. Proteins 47, 563–567.

Sadowsky, M.J., and Graham, P.H. 2000. Root and Stem Nodule Bacteria of Legumes. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, eds. M. Dworkin et al. Springer-Verlag, New York.

Sawada, H., Kuykendall, L.D., and Young, J.M. 2003. Changing concepts in the systematics of bacterial nitrogen-fixing legume symbionts. J. Gen. Appl. Microbiol. 49, 155–179.

Sicheritz-Ponten, T., Kurland, C.G., and Andersson, S.G. 1998. A phylogenetic analysis of the cytochrome b and cytochrome c oxidase I genes supports an origin of mitochondria from within the Rickettsiaceae. Biochim. Biophys. Acta 1365, 545–551.

Sixma, T.K. 2001. DNA mismatch repair: MutS structures bound to mismatches. Curr. Opin. Struct. Biol. 11, 47–52.

Soni, R.K., Mehra, P., Choudhury, N.R., Mukhopadhyay, G., and Dhar, S.K. 2003. Functional characterization of Helicobacter pylori DnaB helicase. Nucleic Acids Res. 31, 6828–6840.

Stackebrandt, E. 2000. Defining Taxonomic Ranks. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, eds. M. Dworkin et al. Springer-Verlag, New York.

Stackebrandt, E., Murray, R.G.E., and Truper, H.G. 1988. Proteobacteria classis nov., a name for the phylogenetic taxon that includes the “Purple bacteria and their Relatives.” Int. J. Syst. Bacteriol. 38, 321–325.

Stepkowski, T., Czaplinska, M., Miedzinska, K., and Moulin, L. 2003. The variable part of the dnaK gene as an alternative marker for phylogenetic studies of rhizobia and related alpha Proteobacteria. Syst. Appl. Microbiol. 26, 483–494.

Stryer, L. 1995. Biochemistry. W.H. Freeman and Co., New York.

Taillardat-Bisch, A.V., Raoult, D., and Drancourt, M. 2003. RNA polymerase beta-subunit-based phylogeny of Ehrlichia spp., Anaplasma spp., Neorickettsia spp. and Wolbachia pipientis. Int. J. Syst. Evol. Microbiol. 53, 455–458.

van Berkum, P., Terefework, Z., Paulin, L., Suomalainen, S., Lindstrom, K., and Eardly, B.D. 2003. Discordant phylogenies within the rrn loci of Rhizobia. J. Bacteriol. 185, 2988–2998.

Van Sluys, M.A., Monteiro-Vitorello, C.B., Camargo, L.E., Menck, C.F., da Silva, A.C., Ferro, J.A., Oliveira, M.C., Setubal, J.C., Kitajima, J.P., and Simpson, A.J. 2002. Comparative genomic analysis of plant-associated bacteria. Annu. Rev. Phytopathol. 40, 169–189.

Viale, A.M., and Arakaki, A.K. 1994. The chaperone connection to the origins of the eukaryotic organelles. FEBS Letters 341, 146–151.

Viale, A.M., Arakaki, A.K., Soncini, F.C., and Ferreyra, R.G. 1994. Evolutionary relationships among eubacterial groups as inferred from GroEL (chaperonin) sequence comparisons. Int. J. Syst. Bacteriol. 44, 527–533.

Wang, E.T., van Berkum, P., Beyene, D., Sui, X.H., Dorado, O., Chen, W.X., and Martinez-Romero, E. 1998. Rhizobium huautlense sp. nov., a symbiont of Sesbania herbacea that has a close phylogenetic relationship with Rhizobium galegae. Int. J. Syst. Bacteriol. 48 Pt. 3, 687–699.

Woese, C.R., Stackebrandt, E., Macke, R.J., and Fox, G.E. 1985. A phylogenetic definition of the major eubacterial taxa. System. Appl. Microbiol. 6, 143–151.

Woese, C.R., Stackebrandt, E., Weisburg, W.G., Paster, B.J., Madigan, M.T., Fowler, C.M.R., Hahn, C.M., Blanz, P., Gupta, R., Nealson, K.H., and Fox, G.E. 1984. The phylogeny of purple bacteria: The alpha subdivision. System. Appl. Microbiol. 5, 315–326.

Wolf, Y.I., Aravind, L., and Koonin, E.V. 1999. Rickettsiae and Chlamydiae: evidence of horizontal gene transfer and gene exchange. Trends Genet. 15, 173–175.

Wood, D.W., Setubal, J.C., Kaul, R., Monks, D.E., Kitajima, J. P., Okura, V. K., Zhou, Y., Chen, L., Wood, G. E., Almeida, N. F., Jr., Woo, L., Chen, Y., Paulsen, I. T., Eisen, J. A., Karp, P. D., Bovee, D., Sr., Chapman, P., Clendenning, J., Deatherage, G., Gillet, W., Grant, C., Kutyavin, T., Levy, R., Li, M. J., McClelland, E., Palmieri, A., Raymond, C., Rouse, G., Saenphimmachak, C., Wu, Z., Romero, P., Gordon, D., Zhang, S., Yoo, H., Tao, Y., Biddle, P., Jung, M., Krespan, W., Perry, M., Gordon-Kamm, B., Liao, L., Kim, S., Hendrick, C., Zhao, Z. Y., Dolan, M., Chumley, F., Tingey, S. V., Tomb, J. F., Gordon, M. P., Olson, M. V., and Nester, E. W. 2001. The genome of the natural genetic engineer Agrobacterium tumefaciens C58. Science 294, 2317–2323.

Young, J.M., Kuykendall, L.D., Martinez-Romero, E., Kerr, A., and Sawada, H. 2001. A revision of Rhizobium Frank 1889, with an emended description of the genus, and the inclusion of all species of Agrobacterium Conn 1942 and Allorhizobium undicola de Lajudie et al. 1998 as new combinations: Rhizobium radiobacter, R. rhizogenes, R. rubi, R. undicola and R. vitis. Int. J. Syst. Evol. Microbiol. 51, 89–103.

Yu, X.J., and Walker, D.H. 2003. The Order Rickettsiales. In The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, eds. M. Dworkin et al. Springer-Verlag, New York.

Yu, X.J., Zhang, X.F., McBride, J.W., Zhang, Y., and Walker, D.H. 2001. Phylogenetic relationships of Anaplasma marginale and ‘Ehrlichia platys’ to other Ehrlichia species determined by GroEL amino acid sequences. Int. J. Syst. Evol. Microbiol. 51, 1143–1146.

Yurkov, V.V., and Beatty, J.T. 1998. Aerobic anoxygenic phototrophic bacteria. Microbiol. Mol. Biol. Rev. 62, 695–724.
