Regulatory RNA structures
RNA genes:
• sRNAs: CsrB/RsmB, CsrC, DsrA, GadY, MicC, OxyS, RyhB, RydC, etc.
• Antisense RNAs: CopA, DicF, MicF, RNAI, QaRNA, etc.
Cis-regulatory UTR RNAs: riboswitches, T-boxes, attenuators, IREs, etc.
sRNAs
DsrA RNA
E. coli, Salmonella spp., Shigella spp.
Regulation of rpoS:
• Overcoming transcriptional silencing
• Promoting translation
CsrB/RsmB RNA family
RNA binds to approximately 18 copies of the CsrA protein
enterobacteria
conserved motif CAGGXX
negative effect: glycogen biosynthesis, gluconeogenesis, glycogen catabolism
positive effect: glycolysis
PrrB/RsmZ RNA family
Pseudomonas spp.
5'-AGGA-3' repeats in loops
RNA possibly interacts with a CsrA-like protein
Involved in regulation of 2,4-diacetylphloroglucinol (Phl) and hydrogen cyanide (HCN) production
GadY
GadY interacts with the 3' UTR of mRNA gadX:
increased stability to the transcript
E. coli, Salmonella spp., Shigella spp.
RydC RNA
RydC is known to bind the protein Hfq. The Hfq/RydC complex causes degradation of the target mRNA.
E. coli, Salmonella spp.
Antisense RNAs
CopA-like RNA
• Regulates plasmid copy number
• CopA and its target mRNA (CopT) form an inhibitory four-way junction structure
MicF RNA
• Regulates ompF expression by inhibiting translation and inducing degradation
UTR RNA regulatory elements
Mediators of regulation:
• Ribosomes (transcription attenuation)
• Repressor/activator proteins (feedback inhibition of gene translation/splicing, antitermination (bgl), IREs (regulation of translation/mRNA stability), etc.)
• Uncharged tRNA (T-boxes)
• Small molecules (various riboswitch regulatory elements)
Antitermination
Termination
(anti-antitermination)
Alternative RNA structures in transcription termination
Attenuation of transcription (Yanofsky).
Prediction of attenuators:
Amino acid biosynthesis (branched amino acids (ILE, LEU, VAL), histidine, threonine, tryptophan, and phenylalanine)
(gamma- and alpha-proteobacteria, in some cases low-GC Gram-positive bacteria, Thermotogales and Bacteroidetes/Chlorobi)
Three new histidine transporters were predicted:
• orthologs of B. subtilis (BS) yuiF and yvsH
• a member of the lysQ/lysP family
• HI0325 (Haemophilus influenzae)
E. coli: three aspartate kinase isozymes, ThrA, MetL and LysC
thrA: ILE-THR attenuator
metL: MetJ
lysC: LYS-element
Pasteurellales (two aspartate kinase isozymes):
thrA: THR-MET-ILE attenuator
lysC: LYS-element
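An attenuator such as the ILE-THR one listed for thrA is usually recognized in part by a short leader peptide that is rich in the regulatory amino acids. The following is a minimal sketch of that single idea, assuming the leader peptide has already been translated; the function name, the window size, and the toy peptide are all hypothetical and are not taken from the presentation.

```python
def densest_window(peptide, residues, window=10):
    """Return (start, count) for the window holding the most regulatory residues.

    peptide  : leader peptide as one-letter amino acid codes (hypothetical input)
    residues : set of one-letter codes treated as regulatory, e.g. {"I", "T"}
    window   : window length in residues (an arbitrary choice for this sketch)
    """
    peptide = peptide.upper()
    best = (0, 0)
    # Slide a fixed-size window along the peptide and keep the first best hit.
    for start in range(max(1, len(peptide) - window + 1)):
        count = sum(1 for aa in peptide[start:start + window] if aa in residues)
        if count > best[1]:
            best = (start, count)
    return best


if __name__ == "__main__":
    # Toy peptide, invented for illustration only.
    print(densest_window("MKAAITTITTIGGA", {"I", "T"}, window=7))
```

A real attenuator search would also require the alternative terminator and antiterminator hairpins downstream of the leader peptide; this sketch covers only the codon-run signal.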
Detection of 5’ UTR RNA-elements
The RNApattern program
RNA pattern:
• consensus motifs
• RNA secondary structure: number of helices, length of each helix, loop lengths, topology parameters, and distances between pairs of helices
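The kind of pattern described above (a consensus motif plus helices with length and loop constraints) can be illustrated with a short sketch. This is not the RNApattern program: the function names, the default parameters, and the example motif are assumptions made for illustration, and only one helix with perfect pairing (Watson-Crick plus G-U) is handled.

```python
import re

# Watson-Crick pairs plus the G-U wobble pair (RNA alphabet).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(left, right):
    """True if two 5'->3' segments can form an antiparallel helix."""
    return len(left) == len(right) and all(
        (a, b) in PAIRS for a, b in zip(left, reversed(right))
    )


def find_helices(seq, stem_len, min_loop, max_loop):
    """Return (left_start, right_start) for every perfect helix of stem_len."""
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if j + stem_len > len(seq):
                break
            if can_pair(seq[i:i + stem_len], seq[j:j + stem_len]):
                hits.append((i, j))
    return hits


def search_pattern(seq, motif, stem_len=4, min_loop=3, max_loop=6, max_gap=10):
    """Find a consensus motif (regex) followed within max_gap nt by a helix.

    Returns a list of (motif_start, helix_left_start, helix_right_start).
    """
    seq = seq.upper().replace("T", "U")
    helices = find_helices(seq, stem_len, min_loop, max_loop)
    results = []
    for m in re.finditer(motif, seq):
        for i, j in helices:
            if 0 <= i - m.end() <= max_gap:
                results.append((m.start(), i, j))
    return results


if __name__ == "__main__":
    # Toy sequence: motif AGAGA, then a GGCC/GGCC hairpin with an AAAA loop.
    print(search_pattern("AGAGACCGGCCAAAAGGCC", "AGAGA"))
```

Short stems with wobble pairs produce many chance hits, so a practical pattern would combine several helices and distance limits, as the list above indicates.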
specifier hairpin ===> ==> ===> <=== <== SC<===
SA serS Ser ---GTAGGACAAGTA 19 AGAGAGCTTGTGGTT---AGTGTGAACAAG--- 15 GAA--TCTACCTACTT ->
DHA tyrZ Tyr ----AAGAACAAGTA 18 AGAAAGTTGCCGGCT---GATGAGAGGCGCTT 18 GAA--TACCTCTTTGA ->
ST trpS Trp ---ATTAGAAGAGTA 16 AGAGAGTTAGTGGTT---GGTGCAAGCTAAC- 12 GAAA-TGGACTAATGA ->
CA aspS Asp -----GAGAAAAGTA 18 AGCGAATTGGGAAAT---GGTGTGAGCCCAA- 15 GAAA-GACATCTCGGA ->
DF valS Val -GAAGAAGAGGAGTA 16 AGAGAGGAAAATTCACTGGCTGTAAGATTTTC 17 GAAT-GTAGCTTTGGA ->
PN thrS Thr ----AGAGACAAGTC 18 AGAGAGTGCGTGGTT---GCTGGAAACGCAT- 14 GAT--ACTACTCTTGA ->
MN ileS Ile ----CAAAAACACAA 17 AGCGAATAGGTGAT----GGTGTAAGACCTATT 18 -----ATCATTTTGTT ->
DF leuS Leu ----CTAGAGCAGTA 19 AGAGGAAGTGGAA-----GGTGAGAACTAATATT 10 GAA--CTTACTAGATT ->
HD argS Arg -----TGGGAGAGTA 20 AGCGAGTCGGGAT-----GGTGGGAGCCGAT- 14 GAAA-CGCACCCATGA ->
DF proS Pro ---AAAGAAATAGTA 18 AGAGAGAAAACGGT----GGTGAGAGTTTTC-- 14 GAA--CCTGTCTTTTA ->
ZC lysS Lys ---AAGAGAAGAGTA 19 AGAGAGCTCTGGTA----GCTGAGAAAGAGC-- 15 GAAAAAAGACTTGGAG ->
BQ metS Met ---AAAGGAAAAGTA 19 AGAGAGCTTCGGTA----GCTGAGAAGAAGC-- 14 GAACAATGGCCTTTGA ->
MN pheS Phe ----TGAGATTAGTA 18 AGGGAATGCGGGGCGTG-ACTGGAAACCCGC- 16 GAA--TTCACTCAGAA ->
MN glyQ Gly ---AGAAAGAGAGTT 15 AGCGAACCTGAGAG----AGTGTAAGTCAGGT 14 GACT-GGCACTTTCTC ->
ST alaS Ala -AGTTAAGAATTGTT 17 AGAAAAGTGACGGTT---GCTGCGAGTCATT- 17 -----GCTACTTAACT ->

SA trpE Trp TCTAAAGAAATAGTA 22 AGAAAGCTAATGGGT---GATGGGAATTAGC-- 14 GAAT-TGGACTTTGGA ->
BS ilvB Leu ---TGAGGATAAGTA 20 AGAGAACCGGGTTA----GCTGAGAACCGG--- 16 GAA--CTCGCCTCAGA ->
CA ilvC Val -----AGGAAGAGTA 17 AGAGAGTGAGATACT---GGTGGGAACTCAT-- 13 GAAG-GTAGCCTTTGA ->
BQ asnA Asn --AGGACGAGTAGTA 15 AGCGAGTCAGGGGT----GGTGTGAGCCTGA-- 15 GAAG-AACCTCCTGGA ->
BS proB Pro -----AGGATTAGTA 18 AGAGAGCAAAATGAACC-GCTGAAACATTTTGC 15 GAA--CCTGCCTTGGA ->
SA cysE Cys --CGAAGGATTAGTA 18 AGAGAGTGTACGGTT---GCTGTGAGTACA--- 14 GAA--TGCACCTTCGT ->
MN hisC His -----AGAGAAAAAA 16 AGAGAGTATGGGAA----GCTGAAAACATAC-- 15 -----CACATTCTTGA ->
DHA pheA Phe -----AAAGAGAGCA 19 AGGGAACTAAAGTCGGAGACTGAAAGCTTTAGT 14 GAGA-TTCACTCTGGA ->
HD serA Ser ----GAAGATGAGGA 17 AGAGAGCTGGTGGTT---GCTGTGAACCAGCT- 18 -----AGCCCTTCTGA ->
BQ phhA Tyr AGAATCGCAGTAGTA 17 AGAGAGCTAATGGTC---GGTGGAAATTGGC-- 14 GAAT-TACAATTCTGG ->
EF yxjH Met -----TAGGAAAGTA 17 AGAGAGACTTTGGTT---GGTGAAAAAAGTT-- 13 GAAAAATGGCCTAGGA ->

CA yckK Cys ----AAGAACCAGTA 17 AGAGAAAAATCTCCAAG-GCTGAAAGGGATTTT 15 GAA--TGCATCTTTGA ->
DF yqiX Arg -----AGAGAAAGTA 16 AGCGAGTTAGGGGTT---GGTGTAAGCCTAGC- 14 GAAG-AGAGCTCTGGA ->
HD BH0807 Lys ----AGAGAAGAGTA 19 AGAAAGCCTGTAGTT---GCTGAGAACGGGT-- 14 GAAGCAAGACTCTGAG ->
EF yheL Tyr -TTATTAGCCCAGTA 19 AGAAAGTCGATGGTT---GCTGCGAATCGAT-- 13 GAAT-TACACTAATAA ->
BQ ykbA Thr --GAGGACACGATCA 16 AGAGAGGGAAGCCTTTG-GCTGTGAGCTTCCT- 14 GATT-ACCACCTCTGA ->
BQ sdt2 Trp ---GCAAGAAGAGTA 18 AGAGAGCTGGGGGAA---GGTGTGAGCCCGGT- 15 GAA--TGGGCTTGCGA ->
EF yusC Met ----AAAGAAGAGTA 18 AGAGAGCCCTGTTT----GCTGAGAATGGG--- 16 GAAG-ATGGTCTTTGA ->
CA yhaG Trp ----AAGGAAGAGTA 18 AGAGAGCTGAGGGT----GGTGTGATCTCAGT- 15 GAA--TGGACCTTTTA ->
BQ brnQ Ile ----GAGAACGAGTA 19 AGAGAGTTGGCGATTT--GCTGAAAGCCAAC-- 15 GAAA-ATCATCTCCGA ->
REF01723 His --TTAGGACATAGTA 18 AGAGACTTTTTCATTG--GCTGAAAGAAAAAG- 17 -----CACACCTAAAA ->
BS yvbW Leu -----GGGAGCAGTA 18 AGAGAGCTGCGGGGT---GGTGCGACGCAGC-- 13 GAA--CTCGCCCGGGA ->
Aminoacyl-tRNA synthetases
Amino acid biosynthetic genes
Amino acid transporters
Partial alignment of predicted T-boxes
Terminator (underlined) ===========> <===========
Antiterminator ==> ===> <===<==
SA serS -> 26 CGTTA 51 AAATAGGGTGGCAACGCGTAGAC------------CACGTCCCTTGTAGGGATGTGGTCTTTTTTTA
DHA tyrZ -> 47 CGTTA 65 AGGTAAGGTGGTAACACGGGAGCA-------TACTCTCGTCCTTCTGGCAATGAAGGACGGGAGTTTTTTGTTTT
ST trpS -> 37 CCTTA 61 AATTGAGGTGGTACCGCGTATTACTT----GTAATAACGCCCTCACGTTTTAATAGCGTGGGGACTTTTTGCTAT
CA aspS -> 39 CGTTA 34 ATAAAGGATGGCACCGTGAAAA----------GCCTTCACTCCTTACTGGAGTGGAGGCTTTTTTTATTTTAAATAAA
DF valS -> 41 CGTTA 77 AATTAAGGTGGTAACGCGAGC------------TTTTCGTCCTTTTTAAAGAGGATGAAGAGCTCTTTTTTATTTCT
PN thrS -> 30 CGTTA 38 AATGAAGGTGGAACCACGTTG-------------CGACGTCCTTTCGAGGATGTCGCATTTTTTTATTAG
MN ileS -> 89 CGTTA 68 AATTAAGGTGGTACCACGAGC-------------TTTCGTCCTTTGATGAAAGTTCTTTTTTATTGAT
DF leuS -> 28 AGCTA 29 AATTAGGGTGGTACCGCGAAGATT-------TATCCTCGTCCCTAAACGTAAGTTTAGTGACGAGGATTTTTTATTTTCA
HD argS -> 41 CGTTA 27 AACGAGAGTGGTACCGCGGGTAA---------AAGCTCGCCTCTTTTTAGAAGAGGCGGGTTTTTTATTTT
DF proS -> 33 CGTTA 30 AACTAGAGTGGTACCGCGGAAAT-----TAAACCTTTCGTCTCTATACTTGTATAGAGATGAGAGGTTTTTTATATTTTCAGGA
ZC lysS -> 46 CGTTA 63 AACTGAGGTGGTACCGCGAAGCTAA-----CAACTCTCGTCCTCAAGATGAATAATCTTGGGGGTGGGAGTTTTTTTGTTGCAT
BQ metS -> 55 CGTTA 66 AAATAAGGTGGTACCGCGACTGTTTA---TACAGCCCCGCCCTTATCTTTTTTAGATAAGGGCGGGGCTTTTTATATTTAA
MN pheS -> 14 AATTA 20 AAAACGGATGGTACCGCGTGTC-------------AACGCTCCGCTTAAGGAGTTTTGGCACTTTTTTTGTTTT
MN glyQ -> 14 AGCTA 23 AATTAGGGTGGAACCGCGTTT------------CAAACGCCCCTATGTCAGTTGGCATGGGAGTGATTGAGCGTGGCTCTTTT
ST alaS -> 20 AATTA 18 AATAGAGGTGGTACCGCGGTT--------------TTCGCCCTCTGTGAGATGGACTTGTTTTGTATGGAGGACTATTTGAAA
SA trpE -> 32 AATTA 4 AACTAAGGTGGCACCACGGTA-------------ACGCGTCCTTACAGGTATATGCGTTATGTGGTGTCTTTTT
BS ilvB -> 50 CGTTA 47 AACAAGGGTGGTACCGCGGAAAGAAA---AGCCTTTTCGCCCCTTTTAGCTATCGCAGTTACTGCGCGGCTGATTGT
CA ilvC -> 40 CGTTA 14 AATTTGGGTGGTACCGCGCGACCAAA-----AATTCTCGCCCCAAGCAGGGAATTTTGGCCGTTTTTTTATATAAATAAAT
BQ asnA -> 51 CGTTA 62 AATTTGGGTGGTACCGCGGAACC-----AAAGCCTTTCGTCCCAGTTTTTTGGGAAAGAAGGGCTTTTTTTGTTGGCTT
BS proB -> 33 CGTTA 30 AATCAAGGTGGTACCACGGAAAC--------CCATTTCGTCCTTATGAATCAGGATGAAATGGGTTTTTTTATTGTAGA
SA cysE -> 33 CATTA 62 ATTCAGAGTGGAACCGTGCGG-------------AAGCGCCTCTAACAATACAATTTGTATGTTAGTGGTGCTTTTTTG
MN hisC -> 46 CGTTA 50 AATGAAGGTGGAACCACGTGTGT---------GTCAGCGTCCTTGCAAGTTTTTTGCAAGGGCGCTTTTTTGAATAGT
DHA pheA -> 41 CGTTA 50 AAAAAGGGTGGTACCGCGTGAC---------TTAACTCGTCCCTTATTTGGGGGTGAGGTAAGTCTTTTTTTATTTA
HD serA -> 42 cgtta 57 AATGAGGGTGGCACCGCGGTATG-------AACCTTCCGCCCCTCACGACAGTCGTCGTGTGGGCAGAAGGTTTTTTTACTATCA
BQ phhA -> 51 CGTTA 34 AAATAGGGTGGTACCGCGATTC------------TTTCGCCCCTATCGGATTTTCCGATAGGGGCTTTTTCTATTTC
EF yxjH -> 40 CGTTA 51 AAAAAAGGTGGTACCGCGATAA-----------TAATCGCCCTTTTACTAGTTACGGCTAGTAAAAGGGCGTTTTTTTATAAA
CA yckK -> 38 CGTTA 57 AATTAGAGTGGTACCGTGGAATT-------CAACTTCTGCCTCTAACTATGAGGATAGAAGTTTTTTGTTTTTAT
DF yqiX -> 41 CCTTA 30 AAAAAGAGTGGTAACGCGGATAT----------AATTCGTCTCTTAGCTGTAAAGCTAAGGGACTTTTTTGATTTA
HD BH0807 -> 74 TGTTA 56 AACTGGGGTGGCACCACGACAAG----------TGATCGTCCCCAAGACTTTTATCAGTCTTGGGGACGTTTTTTTGTTCAT
EF yheL -> 8 AATTA 33 AATTAAGGTGGTACCGCGGAGA-----------GATTCGTCCTTATTCTTTAAGGATGAATCTCTCTTTTTATGTAGC
BQ ykbA -> 46 CGTTA 45 AACAAGGGTGGAACCACGAATAT--------AACACTCGTCCCTTTTTTAGGGAGGAGTGTTTTTTTATT
BQ sdt2 -> 40 CGTTA 56 AATTGAGGTGGTACCACGGTATTAACATTACATATATCGTCCTCTACATGCATATTTGCGTGTAGGGGACTTTTTTATTTTC
EF yusC -> 42 CGTTA 60 AATTAAGGTGGTATCACGAAATGA-----CAAACTTTCGTCCTTTTTGCTGTAATAGCAAAAGGATGGAAGTTTTTTTGTTT
CA yhaG -> 48 CGTTA 51 AATTTAGGTGGTACCGCGGAAGT---------ATCTCCGTCCTAATTAATAAGATTAGGGCGGAGTTTTTTATTTGC
BQ brnQ -> 44 CGTTA 66 AATTAGGGTGGTATCGCGGGTAAA------TATAACTCGTCCCTTTCTTTAGGGACGAGTTTTTTGTGTTCTT
REF01723 -> 44 CGTTA 55 AATTGAGGTGGCACCACGAATGC----------GATTCGTCCTCTTGGCTCACAGCCAAGAGGCTTTTTTGTTTTTTTAATA
BS yvbW -> 56 CGTTA 32 AACAAGAGTGGTACCGCGGTCAGC--CGAAGGCTCGTCGTCTCTTTATCTATTAGATTAGGTAGGAGACGGCGGGCTTTTTT
Aminoacyl-tRNA synthetases
Amino acid biosynthetic genes
Amino acid transporters
… continued
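Each predicted T-box in the alignment is annotated with an amino acid (Ser, Tyr, Trp, and so on), and the header marks a specifier codon (SC) in the specifier hairpin. A minimal sketch of how a specifier codon maps to a predicted specificity is given below, assuming the standard genetic code; the function name `tbox_specificity` is hypothetical, and the example codons are illustrative rather than read from the alignment.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def tbox_specificity(specifier_codon):
    """Return the one-letter amino acid encoded by a specifier codon.

    Accepts DNA or RNA letters in either case. Raises ValueError for
    anything that is not a three-letter codon over A, C, G, T/U.
    """
    codon = specifier_codon.upper().replace("U", "T")
    if codon not in CODON_TABLE:
        raise ValueError(f"not a valid codon: {specifier_codon!r}")
    return CODON_TABLE[codon]


if __name__ == "__main__":
    for codon in ("UAC", "UGG", "AUC"):
        print(codon, tbox_specificity(codon))
```

Locating the specifier codon within a real leader sequence still requires the structural alignment; the lookup above covers only the final step.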
Aminoacyl-tRNA synthetases
Aromatic a/a (TRP, PHE, TYR): most Firmicutes, Atopobium minutum
Branched-chain a/a (ILE, LEU, VAL): most Firmicutes, Actinobacteria (ileS), Deinococcales/Thermales (ileS, valS), Chloroflexi (ileS), Thermomicrobium roseum (leuS)
methionine: Bacillales, Clostridiales, Thermoanaerobacter tengcongensis
proline: some Bacillales, Clostridiales
cysteine: Bacillales, some Lactobacillales, Clostridiales, Thermoanaerobacteriales
histidine: Bacillales, Lactobacillales (except Streptococcus spp.), some Clostridiales, Thermoanaerobacter tengcongensis
arginine: Bacillales, Lactobacillales (except Streptococcus spp.), Clostridiales
threonine: Bacillales, Lactobacillales, Clostridiales, Dictyoglomi, Thermomicrobium roseum
serine: most Firmicutes
alanine: Bacillales, Lactobacillales, Clostridiales
ASP, ASN: most Firmicutes (except Streptococcus spp., Mycoplasmatales, Entomoplasmatales)
glycine: most Firmicutes, Deinococcales/Thermales
lysine: Bacillus cereus, Clostridium thermocellum

Amino acid biosynthetic genes
Aromatic a/a (TRP, PHE, TYR): most Firmicutes, Chloroflexi and Dictyoglomi (trp operon), some Firmicutes (aro genes, pheA, pah)
Branched-chain a/a (ILE, LEU, VAL): Bacillales, Clostridiales, Syntrophomonas wolfei, δ-proteobacteria (leu), Dictyoglomi, Thermomicrobium roseum
methionine: Lactobacillales (except Streptococcus spp.), Desulfotomaculum reducens
proline: Bacillales, Desulfitobacterium hafniense, Desulfotomaculum reducens
cysteine: Bacillales, Enterococcus faecalis, Clostridium acetobutylicum, Dictyoglomi
histidine: some Lactobacillales
arginine: Clostridium difficile
threonine: Bacillus cereus, Clostridium difficile
serine: some Firmicutes
alanine: -
ASP, ASN: some Firmicutes
glutamine: Clostridium perfringens
glycine: -
lysine: -
New predicted amino acid transporters
Gene name | T-box specificity (T-box specifier codon) | Predicted function | Genomes
ycbK | TRP | tryptophan-specific permease | Bacillus subtilis, Bacillus licheniformis
yhaG | TRP | tryptophan-specific permease | Clostridiales
yvbW | LEU | leucine-specific permease | Bacillus subtilis, Bacillus licheniformis
ykbA | THR | threonine-specific permease | Bacillus subtilis
ybgF/aapA | ? | ? | Lactobacillus reuteri
yheL | TYR | tyrosine transporter (Na+/H+ antiporter) | some Bacillales and Lactobacillales
LysX | LYS | lysine transporter | some Bacillales
brnQ_braB | ILE | branched-chain amino acid transporter family: ILE-specific | some Bacillales, Lactobacillales and Clostridiales
brnQ_braB | THR | branched-chain amino acid transporter family: THR-specific | Bacillus cereus, Clostridium tetani
brnQ_braB | VAL | branched-chain amino acid transporter family: VAL-specific | some Lactobacillales
yusCBA | MET | methionine ABC transporter | Lactobacillales, Enterococcus faecalis
yqiXYZ | ARG | arginine ABC transporter | Clostridium difficile
hisXYZ | HIS | histidine ABC transporter | Lactobacillales, Clostridium difficile, Listeria monocytogenes, Enterococcus faecalis
(no gene name given) | CYS | cysteine ABC transporter | Clostridium acetobutylicum
yckKJI | MET | methionine ABC transporter | some Lactobacillales
aspQHMP | ASP | ASP (ASN) ABC transporter | Lactobacillus johnsonii
ytmKLM | MET | methionine ABC transporter | Leuconostoc mesenteroides
yocR(yhdH) | TRP | TRP-specific sodium-dependent transporter | Bacillus cereus
yocR(yhdH) | PHE | PHE-specific sodium-dependent transporter | Bacillus cereus
yocR(yhdH) | LEU | LEU-specific sodium-dependent transporter | Bacillus cereus
yocR(yhdH) | TYR?/MET | sodium-dependent transporter | Clostridium tetani
mtsABC opp | MET | uptake of unknown methionine precursors, possibly oligopeptides | some Lactobacillales
trpXYZ | TRP | tryptophan ABC transporter | Peptococcaceae, Streptococcus spp., Paenibacillus larvae
RDF02391 | ARG | arginine permease | Clostridium difficile
ABC-like transporter | ? | ? | Desulfotomaculum reducens
CBX | ? | ? | Clostridium botulinum
gltT-like | ? | ? | some Clostridium spp.
Conserved RNA secondary structure of the regulatory RFN element
Capitals: invariant (absolutely conserved) positions. Lower-case letters: strongly conserved positions. Dashes and stars: obligatory and facultative base pairs. Degenerate positions: R = A or G; Y = C or U; K = G or U; B = not A; V = not U. N: any nucleotide. X: any nucleotide or deletion.
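The degenerate codes in this legend translate directly into a pattern matcher. The sketch below converts a consensus string written in that notation into a regular expression. The function names are hypothetical, and it makes one simplifying assumption: lower-case (strongly conserved) positions are treated as strictly required, exactly like capitals.

```python
import re

# Degenerate codes exactly as listed in the legend (RNA alphabet).
CODES = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",      # A or G
    "Y": "[CU]",      # C or U
    "K": "[GU]",      # G or U
    "B": "[CGU]",     # not A
    "V": "[ACG]",     # not U
    "N": "[ACGU]",    # any nucleotide
    "X": "[ACGU]?",   # any nucleotide or a deletion
}


def consensus_to_regex(consensus):
    """Compile a consensus string in the legend's notation into a regex."""
    parts = [CODES[ch] for ch in consensus.upper() if not ch.isspace()]
    return re.compile("".join(parts))


def matches(consensus, sequence):
    """True if the whole sequence fits the consensus (DNA input is accepted)."""
    seq = sequence.upper().replace("T", "U")
    return consensus_to_regex(consensus).fullmatch(seq) is not None


if __name__ == "__main__":
    print(matches("GRCCYG", "GACCTG"))  # R=A, Y=U
    print(matches("RAxK", "GAG"))       # x taken as a deletion
```

Base-pairing constraints (the dashes and stars of the legend) are not expressible as a regular expression and would need a separate check.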
[Figure: consensus secondary structure of the RFN element. Recoverable labels: helices numbered 2 to 5, the 5' and 3' ends, a variable stem-loop, and an additional stem-loop.]
RFN element
5’ UTR regions of riboflavin genes from various bacteria 1 2 2’ 3 Add. 3’ Variable 4 4’ 5 5’ 1’ =========> ==> <== ===> -><- <=== -> <- ====> <==== ==> <== <========= BS TTGTATCTTCGGGG-CAGGGTGGAAATCCCGACCGGCGGT 21 AGCCCGTGAC-- 8 4 8 -----TGGATTCAGTTTAA-GCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAT BQ AGCATCCTTCGGGG-TCGGGTGAAATTCCCAACCGGCGGT 19 AGTCCGTGAC-- 8 5 8 -----TGGATCTAGTGAAACTCTAGGGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATATG BE TGCATCCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCGA--- 3 4 3 -----AGGATCCGGTGCGATTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATGCC HD TTTATCCTTCGGGG-CTGGGTGGAAATCCCGACCGGCGGT 19 AGTCCGTGAC-- 10 4 10 ----–TGGACCTGGTGAAAATCCGGGACCGACAGTGAA-AGTCTGGAT-GGGAGAAGGAAACG Bam TGTATCCTTCGGGG-CTGGGTGAAAATCCCGACCGGCGGT 23 AGCCCGTGAC-- 8 4 8 ----–TGGATTCAGTGAAAAGCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAG CA GATGTTCTTCAGGG-ATGGGTGAAATTCCCAATCGGCGGT 2 AGCCCGCAA--- 3 4 3 ------AGATCCGGTTAAACTCCGGGGCCGACAGTTAA-AGTCTGGAT-GAAAGAAGAAATAG DF CTTAATCTTCGGGG-TAGGGTGAAATTCCCAATCGGCGGT 2 AGCCCGCG---- 7 6 7 --------ATTTGGTTAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GGAAGAAGATATTT SA TAATTCTTTCGGGG-CAGGGTGAAATTCCCAACCGGCAGT 6 AGCCTGCGAC-- 11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGTTAA-AGTCTGGAT-GGGAGAAAGAATGT LLX ATAAATCTTCAGGG-CAGGGTGTAATTCCCTACCGGCGGT 2 AGCCCGCGA--- 4 4 4 -----ATGATTCGGTGAAACTCCGAGGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAATA PN AACTATCTTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 2 AGCCCACGA--- 3 4 3 -----ATGATTTGGTGAAATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAAAA TM AAACGCTCTCGGGG-CAGGGTGGAATTCCCGACCGGCGGT 3 AGCCCGCGAG-- 5 4 5 ----–TTGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAGAGCGTGA DR GACCTCTTTCGGGG-CGGGGCGAAATTCCCCACCGGCGGT 15 AGCCCGCGAA-- 8 12 9 ----–CCGATGCCGCGCAACTCGGCAGCCGACGGTCAC-AGTCCGGAC-GAAAGAAGGAGGAG TQ CACCTCCTTCGGGG-CGGGGTGGAAGTCCCCACCGGCGGT 3 AGCCCGCGAA-- 5 4 5 -----CCGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAAGGAGGGC AO AATAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGCGGT 2 AGTCCGCGA--- 7 7 7 -----AGGAACCGGTGAGATTCCGGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGATGAAA DU 
TTTAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGTGGT 2 AGTCCGCGA--- 13 4 12 -----AGGAACTAGTGAAATTCTAGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGAGCAGA CAU GAAGACCTTCGGGG-CAAGGTGAAATTCCTGATCGGCGGT 20 AGCCCGCGA--- 3 4 3 -----AGGACCCGGTGTGATTCCGGGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTCGGC FN TAAAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGTGGT 2 AGTCCACG---- 5 4 5 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GGGAGAAGAATTAG TFU ACGCGTGCTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT 3 AGTCCGCGAC-- 8 5 8 -----TGGAACCGGTGAAACTCCGGTACCGACGGTGAA-AGTCCGGAT-GGGAGGTAGTACGTG SX -AGCGCACTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT 3 AGTCCGCGAC-- 8 5 8 -----TTGACCAGGTGAAATTCCTGGACCGACGGTTAA-AGTCCGGAT-GGGAGGCAGTGCGCG BU GTGCGTCTTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT 30 AGCCCGCGAGCG 137 GTCAGCAGATCTGGTGAGAAGCCAGAGCCGACGGTTAG-AGTCCGGAT-GGAAGAAGATGTGC BPS GTGCGTCTTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 21 AGCCCGCGAGCG 8 4 8 GTCAGCAGATCTGGTCCGATGCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGATGTGC REU TTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 31 AGCCCGCGAGCG 7 5 7 GTCAGCAGATCTGGTGAGAGGCCAGGGCCGACGGTTAA-AGTCCGGAT-GAAAGAAGATGGGC RSO GTACGTCTTCAGGG-CGGGGTGGAATTCCCCACCGGCGGT 21 AGCCCGCGAGCG 11 3 11 GTCAGCAGATCCGGTGAGATGCCGGGGCCGACGGTCAG-AGTCCGGAT-GGAAGAAGATGTGC EC GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 17 AGCCCGCGAGCG 8 4 8 GACAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAG-AGTCCGGAT-GGGAGAGAGTAACG TY GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 67 AGCCCGCGAGCG 8 3 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGGGTAACG KP GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 20 AGCCCGCGAGCG 8 4 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGAGTAACG HI TCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 2 AGCCCACGAGCG 26 9 30 GTCAGCAGATTTGGTGAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAAAGAGAATAAAA VK GCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 14 AGCCCACGAGCG 11 9 11 GTCAGCAGATTTGGTGAGAATCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAGAATAAGC VC CAATATTCTCAGGG-CGGGGCGAAATTCCCCACCGGTGGT 13 AGCCCACGAGCG 5 4 5 GTCAGCAGATCTGGTGAGAAGCCAGGGCCGACGGTTAC-AGTCCGGAT-GAGAGAGAATGACA YP GCTTATTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 
40 AGCCCGCGAGCG 16 6 16 GTCAGCAGACCCGGTGTAATTCCGGGGCCGACGGTTAT-AGTCCGGAT-GGGAGAGAGTAACG AB GCGCATTCTCAGGG-CAGGGTGAAAGTCCCTACCGGTGGT 25 AGCCCACGAGCG 16 4 27 GTCAGCAGATTTGGTGCGAATCCAAAGCCGACAGTGAC-AGTCTGGAT-GAAAGAGAATAAAA BP GTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 18 AGCCCGCGAGCG 10 4 10 GTCAGCAGACCTGGTGAGATGCCAGGGCCGACGGTCAT-AGTCCGGAT-GAGAGAAGATGTGC AC ACATCGCTTCAGGG-CGGGGCGTAATTCCCCACCGGCGGT 16 AGCCCGCGAGCA 10 3 11 ---CGCAGATCTGGTGTAAATCCAGAGCCGACGGT-AT-AGTCCGGAT-GAAAGAAGACGACG Spu AACAATTCTCAGGG-CGGGGTGAAACTCCCCACCGGCGGT 34 AGCCCGCGAGCG 6 6 6 GTCAGCAGATCTGGTG 52 TCCAGAGCCGACGGT 31 AGTCCGGAT-GGAAGAGAATGTAA PP GTCGGTCTTCAGGG-CGGGGTGTAAGTCCCCACCGGCGGT 13 AGCCCGCGAGCG 7 3 7 GTCAGCAGATCTGGTGCAACTCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGGCGTCA AU GGTTGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 17 AGCCCGCGAGCG 7 9 7 GTCAGCAGATCCGGTGAGAGGCCGGAGCCGACGGT-AT-AGTCCGGAT-GGAAGAGGACAAGG PU AAACGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 19 AGCCCGCGAGCG 19 4 18 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAC-AGTCCGGATGAAGAGAGAACGGGA PY TAACGTTCTCAGGG-CGGGGTGCAACTCCCCACCGGCGGT 19 AGCCCGCGAGCG 15 4 16 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAT-AGTCCGGATGAAGAGAGAGCGGGA PA TAACGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 19 AGCCCGCGAGCG 14 4 13 GTCAGCAGACCCGGTGCGATTCCGGGGCCGACGGTCAT-AGTCCGGATAAAGAGAGAACGGGA MLO TAAAGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 16 AGCCCGCGAGCG 8 5 8 GTCAGCAGATCCGGTGTGATTCCGGAGCCGACGGTTAG-AGTCCGGAT-GAAAGAGGACGAAA SM AAGCGTTCTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT 34 AGCCCGCGAGCG 8 3 8 GTCAGCAGATCCGGTCGAATTCCGGAGCCGACGGTTAT-AGTCCGGAT-GGAAGAGAGCAAGC BME GCTTGTTCTCGGGG-CGGGGTGAAACTCCCCACCGGCGGT 17 AGCCCGCGAGCG 10 15 10 GTCAGCAGATCCGGTGAGATGCCGGAGCCGACGGTTAA-AGTCCGGAT-GGAAGAGAGCGAAT BS ATCAATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT 18 AGCCCGCGA--- 5 4 5 -----AGGATTCGGTGAGATTCCGGAGCCGACAGT-AC-AGTCTGGAT-GGGAGAAGATGGAG BQ GTCTATCTTCGGGG-CAGGGTGAAAATCCCGACCGGCGGT 27 AGCCCGCGA—-- 3 5 3 -----AGGATTTGGTGTGATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG BE ATTCATCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCGA--- 3 4 3 
-----AGGATCCGGTGCGAGTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGAAG CA AATGATCTTCAGGG-CAGGGTGAAATTCCCTACCGGCGGT 2 AGCCCGCGAG-- 3 4 3 ----TATGATCCGGTTTGATTCCGGAGCCGACAGT-AA-AGTCTGGAT-GAAAGAAGATATAT DF GAAGATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT 2 AGCCCGCG---- 6 4 6 -------GATTTGGTGAGATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAGAGAAGATATTT EF GTTCGTCTTCAGGGGCAGGGTGTAATTCCCGACCGGTGGT 3 AGTCCACGAC-- 5 3 5 ----ATTGAATTGGTGTAATTCCAATACCGACAGT-AT-AGTCTGGAT—-AAAGAAGATAGGG LLX AAATATCTTCAGGG-CACCGTGTAATTCGGGACCGGCGGT 21 ACTCCGCGAT-- 4 4 4 ----–TTGAAGCAGTGAGAATCTGCTAGCGACAGT-AA-AGTCTGGAT-GGAAGAAGATGAAC LO GTTCATCTTCGGGG-CAGGGTGCAATTCCCGACCGGTGGT 3 AGTCCACGAT-- 3 10 3 ----TTGACTCTGGTGTAATTCCAGGACCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGTTG PN AAGAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGCGGT 125 AGTCCGTG---- 3 4 3 -------GATGTGGTGAGATTCCACAACCGACAGT-AT-AGTCTGGAT-GGGAGAAGACGAAA ST AAGTGTCTTCAGGG-CAGGGTGTGATTCCCGACCGGCGGT 14 AGTCCGCG---- 3 4 3 -------GATGTGGTGTAACTCCACAACCGACAGT-AT-AGTCTGGAT-GAGAGAAGACCGGG MN AAGTGTCTTCAGGG-CAGGGTGAGATTCCCGACCGGCGGT 104 AGTCCGCG---- 3 4 3 -------GATGTGGTGAAATTCCACAACCGACAGT-AA-AGTCTGGAT-GGGAGAAGACTGAG SA ATTCATCTTCGGGG-TCGGGTGTAATTCCCAACCGGCAGT 6 AGCCTGCGAC-- 11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG AMI TCACAGTTTCAGGG-CGGGGTGCAATTCCCCACTGGCGGT 14 AGCCCGCGC--- 5 5 5 ------TGATCTGGTGCAAATCCAGAGCCAACGGT-AT-AGTCCGGAT-GGAAGAAACGGAGC DHA ACGAACCTTCGAGG-TAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCAAC-- 11 4 11 --CGACTGACTTGGTGAGACTCCAAGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTACAA FN AATAATCTTCGGGG-CAGGGTGAAATTCCCGACCGGTGGT 2 AGTCCACG---- 4 6 4 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GAGAGAAGAAAAGA GLU ---TGTTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 28 AGCCCGCGAGCG 10 4 10 GTCAGCAGATCCGGTTAAATTCCGGAGCCGACGGTCAT-AGTCCGGAT-GCAAGAGAACC---
Distribution of RFN-elements in bacterial genomes
Genomes | Number of analyzed genomes | Number of genomes with RFN | Number of the RFN elements
α-proteobacteria | 8 | 4 | 4
β-proteobacteria | 7 | 4 | 4
γ-proteobacteria | 17 | 15 | 15
ε- and δ-proteobacteria | 3 | 0 | 0
Bacillus/Clostridium | 12 | 12 | 19
Actinomycetes | 9 | 4 | 4
Cyanobacteria | 5 | 0 | 0
Other eubacteria | 7 | 5 | 6
Total | 68 | 47 | 52
RFN regulates riboflavin biosynthetic genes and transporters
Some predicted transporters are NEW
Bam GACAAAAAAATATTGATTGTATCCTTCGGGGCTGGGTG --- TCTGGATGGGAGAAGGATGA 59 ----------GTAAAGCCCCGAATGTGTAA---ACATTCGGGGCTTTTTGACGCCAAAT BS GGACAAATGAATAAAGATTGTATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGGATGA 59 ----------CTAAAGCCCCGAATTTTTTA--TAAATTCGGGGCTTTTTTGACGGTAAA BQ CTATAATTTGAGCAAACAGCATCCTTCGGGGTCGGGTG --- TCTGGATGGGAGAAGGATAT 250 -----------CCAAACCCCAAGGATATTAAA--ATCCTTGGGGTTTTTTGTTTTTTTT BE ACATAACGATATAGTGATGCATCCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGGATGC 155 ------------TGAGCCCCCGGGGACAT--------CCCGGGGGTTTCATTTTTATTG HD AAATTGAATAATTAATTTTTATCCTTCGGGGCTGGGTG --- TCTGGATGGGAGAAGGAAAC 148 -------------ATGCCCCGTGAGAACAAAA-----TCTCTGGGGCTTTTTTGCGCGC CA TAATGGTAATTTAATAGGATGTTCTTCAGGGATGGGTG --- TCTGGATGAAAGAAGAAATA 34 -------------AATCTCCGAAGGATTACC----TTTCTTTGGAGATTTTTTTATTTG DF TAAATATAAATTTAATACTTAATCTTCGGGGTAGGGTG --- TCTGGATGGAAGAAGATATT 63 ------------TAAACCCTGAGTTAATT--------CTCAGGGTTTTTTGTTTAAAAA LLX ACTTTAGCTACAATTGAATAAATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAT 127 ----------AAAAGACCCTGAAATTTT------ATTTTAGGGTCTTATTTTTTATTAG PN* ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAA 81 ----------TGTATGCCTTGAGTAGTCCCC---TATTCAAGGTATATTTTTTTGGAGG PN* ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAA 19 ------------CGTGCTCTGAAATGATTACTTGTCATTTCAGAGCATTTTTGTTAATC TM AAAACTGAATACAAAAGAAACGCTCTCGGGGCAGGGTG --- TCCGGATGGGAGAGAGCGTG 13 -----------ATGGGACCCGAGA----------------GGGTCCCTTTTCTTTTACA AO ATTTGCAACAATTTTTTAATAATCTTCAGGGCAGGGTG --- TCTGGATGGAAGAAGATGAA 33 --------TTTACAAGCCTTGAGATCGAAAG----ATTTCAAGGCTTTTTTCATCATTA DU AATTTTTTTAATACTATTTTAATCTTCAGGGCAGGGTG --- TCTGGATGGAAGAAGAAGAG 47 --------TGCATAAGCCTTGAGATCTTAG----GATTTCAAGGCTTTTTCATTAGTTA FN TAATCGAATATGTAAAATAAAGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGAATTA 18 ----------ATATTGCTCAGACTTT------------GTTTGAGCATTTTTTTATTAA SA TATAACAATTTCATATATAATTCTTTCGGGGCAGGGTG --- TCTGGATGGGAGAAAGAATG 74 ------TTTTCTCCTTGCATCTTAATT----------GATGTGAGGATTTTTGTTTATA DHA 
ACTCTTTTTAGATGAATACGAACCTTCGAGGTAGGGTG --- TCCGGATGGGAGAAGGTACA 43 -----------GTTTATGCCTCGAGGAACACCATTTCCTCGAGGCATTTTTGTTCTTTC FN GAAAAATAAATATTAAAAATAATCTTCGGGGCAGGGTG --- TCTGGATGAGAGAAGAAAAG 40 ------------CTTACCCGAATTCTAT------------AATTCGGTTTTTTTATTTT CA AATATAAAAAAATAAAGAATGATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATATA 19 ----------–-TATGCCCTGACGTTTTT---------CGTTGGGGCTTTTTTAATGCT DF AAAATTAAAAAATCAAAGAAGATCTTCGGGGCAGGGTG --- TCTGGATGAGAGAAGATATT 45 ----------ATAAAAACTCGAAGATAGGG----TCTTCGAGTTTTTTGTTTTTCCTAA BS TAATTAAATTTCATATGATCAATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGGA 103 --AAAGAACCTTTCCGTTTTCGAGTAAGATGTGATCGAAAAGGAGAGAATGAAGTGAAA BQ GGGAAAATAGAATATCGGTCTATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGGA 54 -------ATTCTCCCTTTGTGTAAA------------ACACAAAGGGTTTTTTCGTTCTATG BE ATAAAAATGTATAAGCGATTCATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGAA 114 --------GGCAGCCTTCTTCTTGTGAGGATGAATCACGAGAAGGGGAGGAGAACAAGCATG PN GTTTTTTGTTATGATAAAAGAGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGACGAA 137 -–AACTTCTTCTGATTTTATAG------------AAAATTGGAGGAACCTGTTATGACA ST TAAATCTGCTATGCTAGAAGTGTCTTCAGGGCAGGGTG --- TCTGGATGAGAGAAGACCGG 130 ---GGAACTTCTTTCAATTTGAAA-----------AAATTGGAGGAATTTTTTAATGTC MN ATTTTTTGATATGCTATAAGTGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGACTGA 138 ---–GGCCTTCTTTCGATTTGTAA-----------AAATTGGAGGAATTTTTTTATGAA SA AAATTTAATAATGTAAAATTCATCTTCGGGGTCGGGTG --- TCTGGATGGGAGAAGATGGA 17 --------TCCTCCTATTCTTACG--------AGATGAATGGAAGGAGAAAATTGAATATG EF AAAAAATATAATACAAGGTTCGTCTTCAGGGGCAGGGT --- GTCTGGATAAAGAAGATAGG 33 ---CTACTCTATTTTTCCCTGCAGA------------AAAATAGGGTTTTTTTGTATGA LLX TTTTTGTGCTATAATAAAAATATCTTCAGGGCACCGTG --- TCTGGATGGAAGAAGATGAA 66 -–TCAACTTCCTCGAAATTTGAAGAAT-TATTTTCTCATATTTGGAGGTTTTTTTATGT LO ATTGTAAGAAAATATTCGTTCATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGTTG 79 ---ATGCACAAACTCTCCCTCAACTTTTTTTA--------GTTGAGGTTTTTTATTTGC
Figure labels: the RFN element; antiterminator; terminator + RBS sequestor
Alternative RNA secondary structures upstream of riboflavin operons with RFN elements
Attenuation of transcription via antitermination mechanism
EC AATCCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGAGTAACG 59 ----------CTGCCCTGATTCTGGTAACCATAATTTTAGTGAGGTTTTT-------TACCATGAATCAGACGCTA TY AACCCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGGGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATGTTAATGAGGTTTTTT------TACCATGAATCAGACGCTA KP ATCTCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGAGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATTTTAATGAGGTTTTTT------TACCATGAATCAGACGCTC HI TTAGCTCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAAA 41 ----------CAGCCCTGATTCTGGTATTTAATTGAAATCTCAAAT-TAGGAAAT--TACTATGAATCAGTCAATT VK TATTTGCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAGC 76 ----------CAGCCCTGATTCTGGTATCTAAATATCTTTATATTTCAAGGAATT--TACTATGAATCAGTCTATT AB TAGGCGCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAAA 54 ----------CCGCCCTGATTCTGGTATAAATTCATCTTATTAAA—AAGGCATT---TACTATGAATCAGTCATTA YP ATGGGGCTTATTCTCAGGGCGGGGTG --- TCCGGATGGGAGAGAGTAACG 194 ----------CCGCCCTGATTCTGGTAATCCATAATTTTTTAATGAGGTTTCT---TTACCATGAATCAGACGCTT VC CACAACAATATTCTCAGGGCGGGGCG --- TCCGGATGAGAGAGAATGACA 83 ----------AAGCCCTGATTCTGGTCATTTTTT--------------GGAGTATT--ACCATGAATCAGTCCTCA Spu CTATCAACAATTCTCAGGGCGGGGTG --- TCCGGATGGAAGAGAATGTAA 145 ----------ACGCCCTGATTCTGGATATTCCCATGTCGTATTTTTGAAGGATATTAA-CCATGAATCAGTCTTTA MLO GACGTTAAAGTTCTCAGGGCGGGGTG --- TCCGGATGAAAGAGGACGAAA 44 -------CGTGCGTCCTGATTCTGGTTCGAAACGGA--------------AGGATGGACCCATGAATCAGCATTCC AC AAGCGACATCGCTTCAGGGCGGGGCG --- TCCGGATGAAAGAAGACGACG 51 ----------CAGTCCTGAAATGTTTAACCGTAATT-------------------TACGAGAGCATTTCATATGTC BP AAGCAGTACGTCTTCAGGGCGGGGTG --- TCCGGATGAGAGAAGATGTGC 62 ----------TAGCCCTGAAACGTTTTTCGCCATTTCCTTTTTT------------GCGAGAGCGTTTCAATGTCC BPS AGTCAGTGCGTCTTCAGGGCGGGGCG --- TCCGGATGAAAGAAGATGTGC 86 ----------GAGCCCTGAAACGTTTTTCGCCCATTCATGTTTC-----------GCGAGGAGCGTTTCACATCATG BU AATCAGTGCGTCTTCAGGGCGGGGTG --- GCCGGATGGAAGAAGATGTGC 99 ----------ATGCCCTGAAACGTTTTTCGCCCAACTTTT--------------GCGATGAGCGTTTCAACTATGT REU CATCGTTACGTCTTCAGGGCGGGGTG --- TCCGGATGAAAGAAGATGGGC 77 
----------ATCCCCTGAAACGCCCATCCATGGAAATCCACGCAC-------------GGAGCGTTTCAATGCTG RSO GCTTGGTACGTCTTCAGGGCGGGGTG --- TCCGGATGGAAGAAGATGTGC 80 ---------CGTGCCCTGGAACGTCTTGTCGCCCATTTCA---------------GCGAGGAGCGTTTCCATGTTG PP GGTCGGTCGGTCTTCAGGGCGGGGTG --- TCCGGATGAAAGAAGGCGTCA 50 ----------TCGCCCCGAGACGTTCATCGATCATTCA------------------CGAGGAGCGTTTCATGTTCA PY GCCGGTAACGTTCTCAGGGCGGGGTG --- CCGGATGAAGAGAGAGCGGGA 91 ----------ATGCCCTGTTTTTTCATTAAATT---------------------AAACAGGAGTCAGAACACGTGC PU CGGCGAAACGTTCTCAGGGCGGGGTG --- CCGGATGAAGAGAGAACGGGA 68 ----------ACGCCCTGTTTTTCACAC--------------------------AAACAGGAGTCAGAACATGCAA PA GGCCGTAACGTTCTCAGGGCGGGGTG --- CCGGATAAAGAGAGAACGGG 53 ---------AAAGCCCTGTTTTTCAC---------------------------GAAACAGGAGTTCGTCATATG-- BME CGCGGGCTTGTTCTCGGGGCGGGGTG --- TCCGGATGGAAGAGAGCGAAT 54 ----------GCGCCCTGATTCTAGTTTCGTG--------------------------AGGAACCTATGAACCAAA CAU AATCCGAAGACCTTCGGGGCAAGGTG --- TCCGGATGGGAGAAGGTCGGC 116 ------CGCGATGCCCCGAAGGTGTG-----------------------------TTCAGGGGTGTCGCGATGAAC TFU GTACACACGCGTGCTCCGGGGTCGGT --- GGATGGGAGGTAGTACGTGGT 58 -------GCCTTACCCCGGAGCCTGACCT-------------------------GGCTAGGGGGAAGGCTTCTCGCATG GLU TGAGTTTTGTTCTCAGGGCGGGGCG --- TCCGGATGCAAGAGAACCG 32 ---------AAGGCCCCGAGGATTACATGCTTTTAAATCCTTTGAAAAGGGGACAAGATCATGAATCCTATAACCG DR GAACCGACCTCTTTCGGGGCGGGGCG --- TCCGGACGAAAGAAGGAGGAG 1 GACGCTCAGCTTGCCCCCCA------------------------------------GCAGGCGGCGTCCGCGTATG SM GTCGCAAGCGTTCTCAGGGCGGGGTG --- TCCGGATGGAAGAGAGCAAGC 45 ATCATTGGAAAAATGCCAACCCTGAAA-------------------GGCTTGAGACCATGACCATACTT TQ TTCGGCACCTCCTTCGGGGCGGGGTG --- TCCGGATGGGAGAAGGAGGGCCACTTGCGC AMI CTTACTCACAGTTTCAGGGCGGGGTG --- TCCGGATGGAAGAAACGGAGCGCCTTATGG
Alternative RNA secondary structures upstream of riboflavin genes with RFN elements
Attenuation of translation by sequestering of the RBS
Figure labels: RBS-sequestor; the RFN element; antisequestor
Direct RBS sequestering
The predicted mechanism of the RFN-mediated regulation of riboflavin genes and operons
• Transcription attenuation
• Translation attenuation
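The transcription attenuation branch relies on a terminator, which in the alignments appears as a hairpin followed by a run of U (T) residues. A toy detector for that signature is sketched below. The hairpin-plus-T-run rule, the function name, and every threshold are assumptions chosen for illustration; this is not the procedure used to produce the predictions in this presentation.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def find_terminator_like(seq, stem=6, min_loop=3, max_loop=8, min_t=4, window=10):
    """Return (stem_start, loop_length) of the first terminator-like hairpin.

    A hit is a perfect Watson-Crick stem of `stem` bp, a loop of
    min_loop..max_loop nt, and a run of at least `min_t` T residues
    within `window` nt after the stem. Returns None if nothing is found.
    """
    seq = seq.upper().replace("U", "T")
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        if any(b not in COMPLEMENT for b in left):
            continue  # skip gaps or ambiguous letters
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            if len(right) < stem:
                break
            if right == reverse_complement(left):
                tail = seq[j + stem:j + stem + window]
                if "T" * min_t in tail:
                    return i, loop
    return None


if __name__ == "__main__":
    # Toy sequence: CGCGGC stem, TTCG loop, GCCGCG stem, then a T-run.
    print(find_terminator_like("TCCGCGGCTTCGGCCGCGTTTTTTGA"))
```

Deciding between the terminator and the antiterminator would additionally require comparing the competing structures, which this sketch does not attempt.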
Phylogenetic tree of RFN-elements
New candidate flavin transporters:
1. ImpX, found in Fusobacterium nucleatum and Desulfitobacterium hafniense: has 9 predicted transmembrane segments; no homology to any known genes.
2. PnuX, found in actinobacteria: has 6 predicted transmembrane segments; homologous to PnuC (N-ribosylnicotinamide transporter).
[Figure labels: Streptomyces coelicolor; Thermomonospora fusca; Corynebacterium glutamicum; pnuX; pnuX; impX]
Known Thi-box signal in diverse bacterial genomes
TTCGGGATCCGCGGAACCTGA-TCAGGCTAA-TACCTGCG-AAGGGAACAAGAGTTA THIC_EC
TTCGGGATCCGTTGAACCTGA-TCAGGTTAA-TACCTGCG-AAGGGAACAAGAGAAG THIC_VC
GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAAGC THIC_MLO
GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-CACTGGCG-TAGGGACGGTGCAGAC THIC_SM
AGAAATACCCTTTACACCCGA-TCGGGATAA-TACCTGCG-TGGGGAGTTTTCACGG THIC_NM
TTCTTAACCCTTTGGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGAAGTAGAGGAA thiC_BS
CCGTCGACCGTACGAACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG THIC_MT
GGATCGACCCTTTGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGAAATTATGTCG THIT2_TVO
TCCTCGACCCCAAGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGG thi1_TM
Notation: Red: conserved nucleotides; Green: purine- or pyrimidine-conserved nucleotides; Blue: non-conserved nucleotides
(Miranda-Rios et al., 1997)
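The three notation classes (conserved, purine- or pyrimidine-conserved, non-conserved) can be reproduced column by column from the alignment. The following is a minimal sketch, assuming a column containing any gap is reported separately as "gap". The names `classify_column` and `classify_alignment` are hypothetical, and only four of the nine Thi-box rows shown on the slide are included to keep the example short.

```python
PURINES = set("AG")
PYRIMIDINES = set("CTU")

# Four of the Thi-box rows from the slide (label -> aligned sequence).
THI_BOX_ROWS = {
    "THIC_EC":  "TTCGGGATCCGCGGAACCTGA-TCAGGCTAA-TACCTGCG-AAGGGAACAAGAGTTA",
    "THIC_MLO": "GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAAGC",
    "THIC_NM":  "AGAAATACCCTTTACACCCGA-TCGGGATAA-TACCTGCG-TGGGGAGTTTTCACGG",
    "thiC_BS":  "TTCTTAACCCTTTGGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGAAGTAGAGGAA",
}


def classify_column(column: str) -> str:
    """Classify one alignment column by its set of residues."""
    bases = set(column.upper())
    if "-" in bases:
        return "gap"
    if len(bases) == 1:
        return "conserved"
    if bases <= PURINES or bases <= PYRIMIDINES:
        return "purine/pyrimidine"
    return "non-conserved"


def classify_alignment(rows):
    """Return one class label per column of an equal-length alignment."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return [classify_column("".join(col)) for col in zip(*rows)]


labels = classify_alignment(list(THI_BOX_ROWS.values()))
for index, label in enumerate(labels):
    if label == "conserved":
        print(index, THI_BOX_ROWS["THIC_EC"][index])
```

With only four rows, more columns look conserved than in the full nine-row alignment, so the output is an upper bound on the conserved positions.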
1 2 3 3' FACULTATIVE STEM-LOOP 2' 4 5 5' 4' 1' ----====>===> -=====> <===== ========> <======= <=== ===> =====> <===== <=== <====---- BACILLUS/CLOSTRIDIUM GROUP BS_THIC TAGTTACTGGGGGTGCCCGCT----------------TTCcgGGCTGAGAGAGAAGGCA-------------AGCTTCTTAACCCTTT---GGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGA-AGTAGAGGA BS_TENA TAACCACTAGGGGTGTCCTTC----------------ATAAGGGCTGAGATAAAAGTGT-------------GACTTTTAGACCCTCA---TAACTTGA-ACAGGTTCA-GACCTGCG-TAGGGA-AGTGGAGCG BS_YLMB TTCATCCTAGGGGTGCTTTG-------------------CGAAGCTGAGAGAGACTT-----------------TGTCTCAACCCTTT---TGACCTGA-TCTGGATCA-TGCCAGCG-GAGGGA-AGCGGTGAA BS_YKOF AAAGCACTAGGGGTGCTGT--------------------TTTGGCTGAGATAAAGCGCGGAA-----GAAACGCGCTTTGATCCCTTA---TGACCCGA-TCTGGATAA-TACCAGCG-TGGGGA-AGTGCAGGT SA_TENA GAACTACTAGGGGAGCCTAAT----------------GATATGGCTGAGATGAATT-------------------GTTCAGACCCTTA---TGACCTGA-TTTGGTTAG-TACCAACG-TAGGAA-AGTAGTTAT SA_YKOE CACACACTAGGGGTGTTT----------------------TATACTGAGATGAGGCTT---------------GCCCTCAAACCCTTT---GAACCTGA-TCTAGCTTG-AACTAGCG-TAGGAA-AGTGTTACT LLX_YUAJ TTTGCACAATGGGTCTATTGACAAA---------ACTGTCAGTAGCGAGA----------------------------AATACCATC----TGACCTGA-TCTGGGTAA-TGCCAGCG-TAGGAA-TGTGTTAAG CA_THIS ATAGTTAACGGGGAGCCTGTA-----------------GACAGGCTGAGAGTGGAATG--------------TGATTCCAGACCCTCA---TAACCTGA-TTTGGATAA-TGCCAACG-TAGGGA-GTTAATGCA CA_YUAJ TATGTGCTAGGGGTGCCTT---------------------TAGGCTGAGAAACAGTTT--------------GTCACGTTAACCCTT-----AACCTGA-TCTGGATAA-TACCAGCG-TAGGGA-AGCAGTTTG ST_YUAJ TTTCACAAAGGAGTGCTT-----------------------TGGCTGAGATCGCAA------------------TTGCGAAATCCTGA---GGACCTGA-TCTTGTTAG-TACAAGCG-TAGGGA-TTGTGACCA DHA_THIC TAATCACTAGGGGGGCCGAATA---------------AGGTCGGCTGAGATAAAGGACCCA---------AGAATCCTTTGACCCTT-----AACCTGA-TCTGGGTAA-TGCCAGCG-TAGGGAAGGTGGATAA LMO_TENA GAAAAACTAGGGGGGCCGAT-------------------TCTGGCTGAGATAGGAAGGTAAT-----------GCTTTCTGACCCTTT---GAACCTGT-TT--GTTAG-TGCAAGCG-TAGGGA-AGTGAATGT LMO_YUAJ 
TTACCACAGGGGGGGCTTC---------------------TTAGCTGAGATTGAGTCCACGTGT-----TTTTGGATTCTGACCCTTT---GAACCTGT-TC--GTTAA-TACGAGCG-TAGGGA-TTGTGGCGA PROTEOBACTERIA EC_THIB GTTCTCAACGGGGTGCCACGCGT------------ACGCGTGCGCTGAGAAA---------------------------ATACCCGTCGA---ACCTGA-TCCGGATAA-CGCCGGCG-AAGGGATTTGAGGC EC_THIM AAACGACTCGGGGTGCCCTTCTGC-------------GTGAAGGCTGAGAAA----------------------------TACCCGTATC---ACCTGA-TCTGGATAA-TGCCAGCG-TAGGGA-AGTCACG EC_THIC TTTCTTGTCGGAGTGCCTTA-------------------ACTGGCTGAGACCGTTT------------------ATTCGGGATCCGCGGA---ACCTGA-TCAGGCTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THIC CCACTTGTCGGAGTGCCAT---------------------TGGGCTGAGACCGTTT------------------ATTCGGGATCCGTTGA---ACCTGA-TCAGGTTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THID CCTGTAGTCGGGGAGCCTGAGAG-- 66 5 71 -AATTAAAGGCTGAGATCGCGT-------------------AGCGAGACCCGTTGA---ACCTGA-TTCAGTTAG-GACTGACG-TAGGGA-ACTATCC VC_THIB CCCACTCACGGGGGGCCACCCATTCAT-------CCGAATGGCGCTGAGATCAAGCAC---------------TGCTTGGGACCCGCA 21 -ACCTGA-ACCAGATAA-TGCTGGCG-TAGGAATTGAGCTA XFA_THIC TTTGAAGCGGGGGTACCATAGCCA------------AGCTGCGGTTGAGAC----------------------------ACACCCTTCGA---ACCTGA-TCCGGTTTA-CACCGGCG-TAGGAAAGCTTCGT MLO_THIC CATTCACCAGGGGAGTCCCGG----------------CAAGGGGCTGAGATACTGCTGGCTTTC------GCGGCGCAGTGACCCGTTGA---ACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAA MLO_THIB CGCTCTAACGGGGTGCCGGA------ 5 3 5 -----GACCGGCTGAGAGGCAGT------------------CTCGCCAACCCGCTGA---ACCTGA-TCCGGTTTG-TACCGGCG-GAGGGA-TTAGACG MLO_YK GCCCATCCACAGGGGTGCTCCGTAC-------------GGTCGGGGCTGAGACGGGGGCGG-----------CAAGCCCACAGACCCTAGA----AGCTGA-TCTGGGTAA-TACCAGCG-GAGCGA-GGCGGGCG NX_CITX CTCCTTGTCGGAGTGCCGCCGC---------------CGGGCGGCTGAGATTGCGA------------------AAGCAGAATCCGTAGA---ACCTGT--CGGGGTAA-TGCCTGCG-TAGGAA-ACAAACC NX_THIC ATTGAAACAGGGGTGCTGCCTGAT----------GTTTAGGCGGCTGAGAA----------------------------ATACCCTTTAC---ACCCGA-TCGGGATAA-TACCTGCG-TGGGGA-GTTTTCA ACTINOBACTERIAE MT_THIO 
CTGTAGACACGGGAGTCCCGGG--------------AGCGGGGTCTGAGAGTGGGCGCGCCT-------------GCCCTTACCGTCAC----ACCTGA-TCCGGATCA-TGCCGGCG-AAGGGAGGTCAAGGATG MT_THIC GTACCCACGCGGGAGCGCACGC--------------CGAGTGCGCTGAGAGGACGGCTCGGG------------GCCGTCGACCGTACGA---ACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG CGL_THIC CAGTCCCCACGGGCGCCCGA-----------------GCACGGGCTGAGATCGCGCTGATT---------GCTGCGCGAGCACCGTTTGA---ACCTG--TCCGGTTAG-CACCGGCG-AAGGAAGAGAGGAATGGTGCAATG CGL_THID ACTAGGCACGGGGTGCCAACCGGATGG---AAAAATTCCGGAGGCTGAGAAA---------------------------ACACCCGTTGA---ACCTGC-TCTAGCTCG-TACTAGCG-AAGGGATGGCCTTAACGTG CGL_THIE CTTACCCCACGGGTGCCCAAT---------------GCATTGGGCTGAGATTGCGCGCTGT---------TGCTGCGCGGGACCGTTCGA---ACCTG--TCTGGTTAA-CACCAGCG-AAGGAAGCGAGGATTGATTGTCCCGTG CGL_YKOE TCATAGACACGGGTGCTCGGTGA------------AAATCCGGGCTGAGATCTGGCA----------------TAGCCACGACCGTCGA----ACCTG-ATCCGGATAA-TGCCGGCG-ATAGGGAGGAAAAATATG CGL_OARX TAGTGACACGGGGTGCAAAAGCACTTT----AAAAAAGCTTTCGCTGAGATT---------------------------ACACCCGTCGA---ACCTG-ATCCAGTTAG-TACTGGCG-AAGGGACTGTCGCAT CYANOBACTERIA NPU_THIC TCCATGCTAGGGGTGCCTACAT---------------AACCAGGCTGAGATC---------------------------ACACCCTTAAC---ACCTGAGTCTGGGTAA-TACCAGCG-GAGGGAAGCTGTTTATTG CY_THIC CCATAGCTAGGGGTGTCTAGAA---------------AGCTAGGCTGAGAA----------------------------AAACCCTTAGA---ACCTGAGACTGGGTAA-TACCAGCG-GAGGGAAGCTCACCATTC AN_THIC TCCATGCTAGGGGTGCTTGCAC---------------TAACAGGCTGAGATT---------------------------ACACCCTTAAC---ACCTGAGACTGGGTAA-TACCAGCG-AAGGGAAGCTGTTTATTG THERMUS/DEINOCOCCUS, THERMOTOGALES, Fusobacterium, CFB group DR_THIB CGCGTCACCGGGGGTGCCCTGCTT------------CGGCAGCGGCTGAGAAC---------------------------ACACCCCAGGA---ACCTGA-ACCGGGTCA-TTCCGGCG-GAGGGAGTGTGATGC DR_THIC ATCGTCAACAGGGGTGCCTCCGCATA--------TGGGCCGGAGGCTGAGAGGGCAACT---------------CGGGCCTAACCCTATGA---ACCTGA-ACTGGTTAG-CACCAGCG-GAGGGA-GTGTGACG TQ_THIBGGCCGTCACCGGGGGTGCCCCA------------------AAAGGGCTGAGAGC---------------------------ATACCCTTGGA---ACCTGA-TCCGGGTCA-TGCCGGCG-TAGGGAAGGTGACGGCC TM_THI1 
CCTTCCCCAGGGGGAGCTCCTAT---------------TCCGGGGCTGAGAGGAGGACGG-------------AAGTCCTCGACCCCAAGA---ACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGGA FN_THIC TATATGTACTGGGGAGCTT----------------------TGTGCTGAGATTAGAACCT------------TTTTTCTTAGACCCATAGT---ACCT-GA-TTTGGATAA-TGCCAACG-AAGGGA—GTACCA FN_THIX ACTAGTTACAAGGGAGTTAATA-----------------AATTGACTGAGAAAAGGATG--------------TGAGCCTTGACCTTTTG----ACCT-GA-TTTGGATAA-TGCCAACG-TAGGAA--GTAAA PG_THIS AGACCGCTACGGGGGTGCTTGCCG--- 4 3 4 -GATACGGCAGGCTGAGAT---------------------------AATACCCATAG---ACCT-GA-TCCGGATAA-TACCGGCG-GAGGGAT-GTAG PG_OMR ATTGGGAGAAGGGGTGCTTCCTGTA--- 3 7 3 --GTGGATGGCTGAGAAC---------------------------AAACCCTCATC---ACCT-GA-ACCGGATAA-TACCGGCG-TAGGAAA-CTCTC BX_THIS TAAAGACAAAGGGGTGCCACC------------------CGGTGGCTGAGATT---------------------------ATACCCTAAGA---ACCT-GA-TGCAGTTAG-TACTGCCG-AAGGGA—TTGTG ARCHAEA TAC_T1 GGTGTGGTGGGGGAGCTCCAT-----------------AAGGGGCTGAGAGGATCCGG---------------ATGGATCGATCCCTGGA---ACCTGA-TCCGGGTAA-TACCGGCG-GAGGGAAATTATG FAC_T1 AGTTATACCGGGGAGCTAA---------------------AATGCTGAGAGGATAA-------------------GGATCGACCCGTGCA---ACCTGA-TCCGGACAA-TACCGGCG-GAGGGAGATGGATA
Predicted regulatory THI-elements in bacterial genomes
Conserved RNA secondary structure of the regulatory THI element
Capitals: strongly conserved positions. Dashes and points: obligatory and facultative base pairs. Degenerate positions: R = A or G; Y = C or U; K = G or U; M = A or C; N = any nucleotide.
[Figure: conserved secondary structure of the THI element. Labels: THI element; Thi-box; helices 1 to 5; facultative stem-loop.]
Genomes                    Analyzed genomes   Genomes with THI elements   THI elements
α-proteobacteria                  7                      7                     15
β-proteobacteria                  6                      6                     12
γ-proteobacteria                 18                     17                     38
ε- and δ-proteobacteria           3                      1                      1
Bacillus/Clostridium             18                     18                     51
Actinomycetes                     9                      9                     25
Cyanobacteria                     5                      5                      5
Other eubacteria                 14                     11                     11
Archaea (Thermoplasma)           17                      3                      6
Total                            97                     77                    164
Distribution of THI elements in bacterial genomes
THI elements regulate thiamine biosynthetic genes and transporters.
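The distribution table can be checked with a few lines of arithmetic. This sketch re-enters the counts from the slide and recomputes the totals, the share of analyzed genomes carrying THI elements, and the mean number of elements per positive genome. The group prefixes a, b, g, e, d are read here as alpha, beta, gamma, epsilon, delta, which is an assumption about letters lost from a symbol font. The names `ROWS` and `column_totals` are hypothetical.

```python
# (group, genomes analyzed, genomes with THI elements, THI elements found)
ROWS = [
    ("alpha-proteobacteria", 7, 7, 15),
    ("beta-proteobacteria", 6, 6, 12),
    ("gamma-proteobacteria", 18, 17, 38),
    ("epsilon- and delta-proteobacteria", 3, 1, 1),
    ("Bacillus/Clostridium", 18, 18, 51),
    ("Actinomycetes", 9, 9, 25),
    ("Cyanobacteria", 5, 5, 5),
    ("Other eubacteria", 14, 11, 11),
    ("Archaea (Thermoplasma)", 17, 3, 6),
]
REPORTED_TOTAL = (97, 77, 164)  # "Total" row as given on the slide


def column_totals(rows):
    """Sum the three numeric columns of the table."""
    return tuple(sum(row[i] for row in rows) for i in (1, 2, 3))


totals = column_totals(ROWS)
print("computed totals:", totals, "match slide:", totals == REPORTED_TOTAL)

for name, analyzed, with_thi, elements in ROWS:
    share = with_thi / analyzed        # fraction of genomes with a THI element
    per_genome = elements / with_thi   # mean elements per positive genome
    print(f"{name:34s} {share:6.0%} {per_genome:5.2f}")
```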
A number of NEW candidate thiamine-related transporters were identified.
The predicted mechanism of THI-mediated regulation of thiamin genes
• Translation attenuation: Proteobacteria, Thermus/Deinococcus group, CFB group
• Transcription attenuation: Bacillus/Clostridium group, Thermotoga, Fusobacterium, Chloroflexus
• Direct RBS sequestering: Actinobacteria, Cyanobacteria, Archaea
(Gram-negative bacteria) Hydroxymethylpyrimidine transport
Hydroxyethylthiazole transport
(Gram-positive bacteria)
New functional predictions
Predicted THI-regulated genes (more enzymes)
• tenA: gene of unknown function associated with thiD. Found in most firmicutes, some proteobacteria, and archaea; ThiD-TenA gene fusions occur in some eukaryotes; forms clusters with thiD and other THI-element-regulated genes in most bacteria; a single tenA gene is also regulated by THI elements in some bacteria; not found in genomes without the thiamin pathway; always co-occurs with the thiD and thiE genes.
• tenI: gene of unknown function, a thiE paralog. Found in some unrelated bacteria; forms a separate branch in the phylogenetic tree for thiE; in most bacteria, located in clusters of THI-element-regulated genes.
• ylmB from Bacilli belongs to the ArgE/dapE/ACY1/CPG2/yscS family of metallopeptidases; regulated by THI elements in B. subtilis and B. halodurans, but not in B. cereus.
• thi-4 from Thermotoga maritima belongs to a family of putative thiamine biosynthetic enzymes from archaea and eukaryotes. Located in one operon with thiC and thiD.
• oarX from Methylobacillus and Staphylococcus is a single THI-element-regulated gene; belongs to the short-chain dehydrogenase/reductase (SDR) superfamily.
Regulation of cobalamin-related genes:
Experimentally known facts:
Extensive region of the mRNA leader is essential for regulation of the btuB gene by vitamin B12.
Involvement of highly conserved B12-box rAGYCMGgAgaCCkGCcd in regulation of the cobalamin biosynthetic genes (E. coli, S. typhimurium).
Post-transcriptional regulation: RBS-sequestering hairpin is essential for regulation of the btuB and cbiA genes.
Ado-CBL is an effector molecule involved in the regulation of the CBL genes.
Identification of other conserved sequence regions and prediction of a common RNA secondary structure of the B12 element.
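The B12-box consensus quoted above is written with degenerate nucleotide codes, so searching a leader sequence for it amounts to expanding each code into a character class. The sketch below is one way to do that, not the method used by the authors. The function name `consensus_to_regex` and the `relax_lowercase` flag are hypothetical, and treating lowercase letters as weakly conserved positions that may be relaxed to any base is an assumption. The two test sequences are the 3' ends of the MLO_METE and EC_BTUB rows of the B12-element alignment in this deck.

```python
import re

# Standard degenerate nucleotide codes, written over the DNA alphabet.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "D": "AGT", "H": "ACT", "N": "ACGT",
}

B12_BOX = "rAGYCMGgAgaCCkGCcd"  # consensus as quoted on the slide


def consensus_to_regex(consensus: str, relax_lowercase: bool = False) -> re.Pattern:
    """Expand a degenerate consensus into a compiled regular expression.

    If relax_lowercase is True, lowercase positions accept any base
    (an assumption: lowercase is read as 'weakly conserved').
    """
    parts = []
    for symbol in consensus:
        if relax_lowercase and symbol.islower():
            bases = IUPAC["N"]
        else:
            bases = IUPAC[symbol.upper()]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return re.compile("".join(parts))


leaders = {
    "MLO_METE": "TGACCCGCGAGCCAGGAGACCTGCCACGACGAACAAC",
    "EC_BTUB": "ACCCCTCCAAGCCCGAAGACCTGCCGGCCAACGTCGC",
}

for relax in (False, True):
    pattern = consensus_to_regex(B12_BOX, relax_lowercase=relax)
    for name, seq in leaders.items():
        hit = pattern.search(seq.upper().replace("U", "T"))
        print(relax, name, hit.group() if hit else None)
```

Strict matching finds the box in MLO_METE but misses EC_BTUB, whose box differs at a lowercase position; the relaxed pattern finds both, at the cost of more chance matches in long sequences.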
The B12 element: a regulator of the cobalamin pathway
[Figure: conserved secondary structure of the B12 element. Labels: B12-box; Part I; Part II; main helix; helices 0 to 6; additional hairpin I; additional hairpin II; facultative hairpin. Group annotations: Bacillus/Clostridium group; various taxonomic groups; "-proteobacteria".]
0 1 1' 2 AddI 2' 3 3' 4 VS 5 6 AddII 6' 5' VS 4' 0' ======> -===><===- ====> >< <==== ===> <== =====> -==> ======> >< <====== <==- <===== ---> <---<====== -proteobacteria hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra gcCACTG YGGGAAGgc rAGYCMGgAgaCCkGCcd MLO_METE -285 GCGCATGTCGTGGTTCT 22 AGC--TAAGAGGGAA--GCCGGTG 2 ATGCCGGCGCTG-CCCCCGCAACTGTTAGCGGCGAG 11 GGTGTCACTGAGGCGAA-----CGGCCTCGGGAAGACGGG 9 TGACCCGCGAGCCAGGAGACCTGCCACGACGAACAAC MLO_CFRX -237 CCGCTCCAGACGGTCCC 15 GGGGCTAAGAGGGAA--TGCGGTG 16 AATCCGCGGCTG-TCCCCGCAACTGTAAGCGAAGAG 9 AAAGCCACTGGGACG---------TTCCCGGGAAGGCGGC 11 TGACCCGCGAGCCAGGAGACCTGCCGTCTGCGACAAA MLO_BTUD -290 GGGTGCGTGATGGTCCC 16 GGGT-GAAAAGGGAA--CACGGTG 16 AGACCGTGGCTG-CCCCCGCAACTGTAAGCGGAGAG 10 CATGCCACTGGCCGGC-------AAGGCTGGGAAGGCAGG 9 AGACCCGCGAGCCAGGAGACCTGCCATCACTGAGTTG MLO_CBTAB -213 AGTCATGCAGTCGTCGG 13 CC----AAGAGGGAA--TGCGGTG 19 ATGCCGTGGCTG-CCCCCGCAACTGTGTGCGGTAGT 8 TATGCCACTGAAGATT------CGTCTTCGGGAAGGTGGG 9 TGATCCGTGAGCCAGGAGACCTGCCGACGACGGCAAA MLO_BLUB -233 CGCCACTGCCTGGTGCC 11 GGA--GAATCGGGAA--CACGGTT 2 ACTCCGTGGCGT--GCCCAACGCTGTAAGGGGGACC 9 AATGCCACTGTCGA-----------TGACGGGAAGGCACC 9 TTGATCCCGAGCCAGAAGACCGGCCTGGCAGGCATCG MLO_ARDX -308 ATGTCATCTCAGGTGCC 18 GGA--GAATTGGGAA--GCCGGTC 2 AGTCCGGCGCTG-CCCCCGCAACGGTGGTGGAGTTC 12 GAGACCACTGGGCAA--------AAGCCTGGGAAGGTGTC 16 ACACTCCAGAGCCCGGAAACCAGCCCGAGATTTTTGA SM_ARDX -310 AGGACACTCAAGGTGCC 16 GGA--GAATTGGGAA--GCCGGTC 2 ATCCCGGCGCTG-CCCCCGCAACGGTGGTGGAGCGA 13 AAGGCCACTGGACACC-------GCGTCCGGGAAGGCGCC 18 CGGCTCCAGAGCCCGGAAACCAGCCTTGAAGCAGAAA SM_BTUF -391 CTGGGACCGACGGTTCC 19 GGAT-TAATAGGGAA--CACGGTG 21 AAACCGTGGCTG-CCCCCGCAACTGTAAGCGGATCG 26 CCAGCCACTGCGCGCG-------TTGCGCGGGAAGGCAGA 9 CTGTCCGTGAGCCAGGAGACCTGCCGTCAAATCGATC SM_BLUB -251 TGCCGCCGTCAGGTGCC 11 GGG--GAATCGGGAA--GCCGGTG 2 GTTCCGGCACGT-GCCC---AACGCTGTGAAGGGGA 37 TTTGCCACTGAATATTGA---AGCTATTCGGGAAGGCGGC 8 ATGATCCGAAGTCAGAAGACCGGCCTGGCGAGATAGA SM_CBTC -255 GATCATGTGATGGTTCC 18 GGAT-GAAAAGGGAA--CACGGTG 21 
AAACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG 11 GATGCCATTGGCCATGA-----ATCGGCTGATAAGGCGGA 8 CGACCCGCAAGCCAGGAGACCTGCCATCACCTTGGGC SM_COBU -527 GCAGTATGGATGGTTCT 21 GGAG-TAAATGGGAA--TGCGAAG 23 TTATCGCAGCCG-ACCCCGCGACTGTAGAACGGTCA 76 AAAGCCACTGGCGT--- 69 ---ACGCCGGGAAGGCGAG 76 GACGCCGTGAGCCAGGAGACCTGCCATCCGTCAGGGC PD_COBU -586 AGGTGTTGGATGGTTCC 21 GGAA-TAATTGGGAA--TGTGACG 22 TTATCGCAGCCG-ACCCCGCGACTGTAGAACGGTCA 77 AAAGCCACTGGCGT-- 119 AAGACGCCGGGAAGGTGAG 64 GACGCCGTGAGCCAGGAGACCTGCCATCCGGCATGGG BME_BTUB -378 TTTCAGGAGACGGTTCC 11 GGAT-GAAAAGGGAA--CACGGTG 14 AAACCGAGACTG-CCCCCGCAACTGTAACCGGAGAG 28 AAAGCCACTGAAA---- 15 ----AATCGGGAAGGCGGA 10 AGACCCGGAAGTCAGGAGACCTGCCGTATCCGGTCAC BME_BTUF -398 ACCGTCATGACGGTTCC 17 GGAT-TAATAGGGAA--CACGGTG 22 AGACCGTGGCTG-CCCCCGCAACTGTAAGCGGATTG 28 CATGCCACTGTGCCCA-------CGGCACGGGAAGGCAGA 10 TTATCCGCAAGCCAGGAGACCTGCCGTCTTACGTAGT BME_NRDH -558 CTTGTGTTCGAGGTTCT 19 AGCT-AAGACGGGAA--TCCGGTG 23 ATGCCGGAGCTG-CCCCCGCAACTGTAAGCGGCGAG 10 CATGCCACTGGCGA----------AAGCCGGGAAGGCGGG 9 TGAGCCGTGAGCCAGGAGACCTGCCTTGAGCGTGAAC BME_CBTAB -281 ACCATGTGACAGGTTTT 19 AATACCAAAAGGGAA--TGCGACG 22 TTATCGCAGCCG-ACCCCGCGACTGTAGAGCGGAGA 24 GCAGCCACTGGAAATCAGA-TGGATTTCTGGGAAGGCGCT 10 AGACCCGCGAGCCAGGAGACCTGCCTGTTGCATGAGG AU_CFRX -329 AAGGGACTGACGGTCTT 16 AAGC-TAAGAGGGAA--CACGGTT 18 ATTCCGTGGCTG-CCCCCGCAACTGTAAGCGGTAAG 10 AAAGCCACTGAACCTTTA-TGATCGGTTCGGGAAGGCGGT 12 ATAGCCGCAAGCCAGGAGACCTGCCGTTTCAGGAAAA AU_NRDH -257 GTGGTGTTCAAGGTTCT 20 AGCT-AAGACGGGAA--TTCGGTG 23 AGGCCGAAACTG-CCCCCGCAACTGTGAGCGGCGAG 13 TGAGCCACTGGAGCCAA-----AAGCTCCGGGAAGGCTGG 11 TGACCCGCAAGTCAGGAGACCTGCCTTGAGCGCAAAT AU_CBTAB -382 ATGTCCGTGATGGTTCC 17 GGT--GAAAAGGGAA--CACGATA 12 CATTCGTGGCTG-CCCCCGCAACTGTGAGCGGAGAG 10 AATGCCACTGGCAA--- 29 --AATGCCGGGAAGGTGTT 8 TGACCCGTAAGCCAGGAGACCTGCCATCACGGAAATA AU_ACHX -299 TTAGCCATCGTGGT-TC 16 GAGC-TAAGAGGGAA--TTCGGTG 20 AATCCGAAGCTG-CCCCCGCAACTGTAAGCGACGAG 11 CATGTCACTGAGGCC--------GGCCTCGGGAAGACGGA 9 TGACCCGCAAGCCAGGAGACCTGCCGCGATAGATAAC AU_BTUF -386 GAGAAAGCGACGGTTCC 18 
GGAT-TAATAGGGAA--CATGGTG 20 ATGCCTTGGCTG-CCCCCGCAACTGTAAGCGGATTG 29 CATGCCACTGTTTTTTT----CGGAATGCGGGAAGGCAGA 10 AAATCCGTGAGCCAGGAGACCTGCCGTCAAAATGGAA AU_BLUB -272 TTCTCCGGTCAGGTGCC 9 GGC 4 AATCGGGAA--TCCGGTG 2 AGACCGGAACGT-GCCC-AACGCTGTAAGGCGGATG 10 CATGCCACTGAAGC----------AATTCGGGAAGGCGAA 9 -TGAAGCTTAGTCAGAAGACCGGCCTGGCAGGATAGA BJA_BTUB -321 TGATCGGTGACGGTTCT 9 GAT CAAAAGGGAA--CGTGGTG 30 ACGCCACGGCTG-CCCCCGCAACTGTAAGCGGTGAA 12 TATGCCACTGGGAATCT-----CGGTCCTGGGAAGGCGAC 9 CGACCCGCGAGCCAGGAGACCTGCCGTCAGCCGTGGT BJA_METE -296 CAAGTCGTCGAGGTTCT 12 GAT 8 AAGAGGGAA--GCCGGTG 3 ATGCCGGCTCTG-CCCCCGCAACTGTGAGCGGCGAG 14 GATGTCGCTGAAGCCTGC---ACGGCTTCGGGAAGGCCGG 10 TGACCAGCAAGCCAGGAGACCGGCCCCGACAATATAT BJA_CBTC -250 AGGACGGGCATGGTGCT 22 GCA--TAATCGGGAA--TGGGGAT 24 AAACCCCAGCCG-CCCCCGCGACTGTAAGCGGTGAA 11 ACCGCCACTGGGCCGCA------AGGTCCGGGAAGGCCGG 10 TGAACCGCGAGCCAGGAGACCGGCCGTGCATGTTTTG BJA_BTUB3 -308 ATGCTCGCGACGGTTTC 11 GAT--GAAAAGGGAA--TGCGGTG 16 ATGCCGCGGCTG-CCCCCGCAACTGTAAGCGGATAA 12 GAAGCCACTGGGTCCC-------GGTCCCGGGAAGGCGAC 10 CGACCCGCGAGCCAGGAGACCTGCCGTCAGCCGTGGT BJA_CFRX -308 GGCCCGGCGTTGGTTCC 12 GGC--GAAGAGGGAA--TGCGATA 27 AAAATGCAGCCG-CCCCCGCGACCGTGACCGGAGAG 8 GAGGCCACTGATCCCTG----ACGGGATCGGGAAGGCGGG 18 TGCTCCGCAAGCCGGGAGACCTGCCAGCGCGGACGAT RC_CBTF -327 AAGGCGGGATTGGTTCC 12 GGAT-GAAAAGGGAA--TGCGGTG 12 AACCCGCAGCTG-CCCCCGCAACTGTAAGCGGCGAG 11 GATGCCACTGGGGAT---------GCCCCGGGAAGGCCGA 9 GGACCCGCAAGCCAGGAGACCTGCCACCCCCCGGGCC RC_BTUB -313 TGTCCCGTCCAAGTTCC 12 GGAT-TGAAAGGGAA--CACGGAA 14 AGACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG 9 CATGCCACTGGGAT----------TTCCCGGGAAGGCCGA 8 AGACCCGTGAGCCAGGAGACCTGCTTGGACGATCACC RC_X-CBIP3 -264 GCCCGGGCCTTGGTTCC 14 GGAC-GAAGAGGGAA--GCCGGTG 2 AGTCCGGCGCTG-CCCCCGCAACTGTAAGCGGCAAG 8 ACGGCCACTGATCGC--------AGGATCGGGAAGGCGCA 9 CGAGCCGCAAGCCAGGAGACCTGCCAGGCCGAAACCA RC_BTUF -361 CCAGCGGCGTCGGTTTC 6 GAAT-TGAAAGGGAA--TCCGTTG 15 GAACCGGAACTG-CCCCCGCAACTGTAGGCGGCGAG 13 AAAGTCACTGTGGCGC------ATGCCATGGGAAGGCCGC 11 -GACCCGCCAGTCAGGAGACCTGCCGACACGTCGAAA RC_ARDX -246 
GAAGGCCTCAGGGTGCC 14 GGA--GAATTGGGAA--GCCGGTG 2 AGACCGGCGCTG-CCCCCGCAACGGTCAGCAATGAG 7 AAGGCCACTGGACCCC------GGGGTTCGGGAAGGCGCT 18 CGCATTGCAAGCCCGGAAACCAGCCCTGTGACCGCCG RC_X-CNOA -200 GGGGCGTCATCGGTCCC 24 GGGGGAAAGAGGGAA--TACGGTG 21 AATCCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG 10 GATGCCACTGGGCCTG------CCGGTCCGGGAAGGCCGG 8 AGACCCGCAAGCCAGGAGACCTGCCTGTGATGCGCCC RC_X-BTUD -240 TCGCGGCAGATGGTTCC 21 GGT--GAAAAGGGAA--TACGGTG 20 AATCCGTAACTG-CCCCCGCAACTGTAAGCGGCGAG 9 GCAACCACTGGCCCCGAC--CGCGGGGCCGGGAAGGTGGG 7 CGACCCGCAAGTCAGGAGACCTGCCATCAGCGTCATC RC_CFRX -295 GGGCGGGCGCTGGTTTC 13 GC---GAAGAGGGAA-----TGTG 31 CGACCGCAGCCG-CCCCCGCGACCGTGACCGGAGAG 8 GAGGCCACTGGCAC----------CAGCCGGGAAGGCGGG 24 GCATCCGCAAGCCGGGAGACCTGCCAGCGCATGGATT RC_CBIM -282 CAACAGGCGATGGTTCC 10 GGAT-TAATAGGGAA--CACGGTG 21 AATCCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG 10 AAGACCACTGGCCC--- 18 -AACGGCCGGGAAGGTGAC 10 CGAACCGCAAGTCAGGAGACCTGCCATCGCTCTGGCG RC_EXBB -322 TGACGTGTTCAAGTTCC 12 GGAT-TGAAAGGGAA--CACGGAA 14 AGACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG 9 CATGCCACTGGGCC----------CGCCCGGGAAGGCCGA 8 AGACCCGCGAGCCAGGAGACCTGCTTGGACATTCACC RC_CRDX -264 CAGCGGGCCTTGGT-CC 16 GGGG-TAATAGGGAA--GCCGGTG 2 ACTCCGGCGCTG-CCCCCGCAACTGTCAGCGGCAAG 9 AСACCCACTGGCCC----------CGGCCGGGAAGGGGCA 9 CGAGCCGCAAGCCAGGAGACCTGCCAGGCCAAAGACC RC_NRDD -272 GTGACGCTCTGGGT-CT 14 AGC--CAAGAGGGAA--GCCGGTG 2 ATTCCGGCGCTG-CCCCCGCAACTGTAAGCGGCGAG 10 CACGCCACTGGGCCC--------CGGCCCGGGAAGGCCAG 8 TGACCCGCAAAGCAGGAGACCTGCCCGAGCCTTGATG RC04759 -466 CTTGTGGCGATGGTGGC 17 GCCT-GAAAAGGGAA--TGCGGTG 14 AGGCCGCGGCTG-CCCCCGCAACTGTGAGCGACGAG 10 AATGCCACTGGGCCC--------GCGCCCGGGAAGGTCCG 10 -GACCCGCAAGCCAGGAGACCTGCCATCGCAGACGTT RS_BLUB -217 GGCAGGGGTCAGGTGCC 10 GGA--GAATCGGGAA--GCCGGTG 2 AATCCGGCGCGG-GCCC-GCCGCTGTGACGGGGATG 10 GAGGCCACCGGTT------------CGCCGGGAAGGCGCC 9 ATGAACCGGAGCCAGAAGACCGGCCTGACGCAGAGGT RS_BLUE -287 GTGCGGGCGACGGTTCC 14 GGC--GAAGAGGGAA--TGCGGTG 17 AAGCCGCGACTG-CCCCCGCAACTGTAGGCGGCGAG 11 ATGCCCACTGGCCCAGG-----ACAGGCCGGGAAGGGCGG 9 AGACCCGCGAGTCAGGAGACCTGCCGTCGACGGACCT 
RS_CFRX -286 TCCGGCGCGCTGGTTCC 14 GGC--GAAGAGGGAA--TGCCCCA 0 --GAGGCAGCCG-CCCCCGCGACCGTGACCGGAGAG 6 CAAGCCACTGGCCGCA--------AGGCCGGGAAGGCGGG 23 ACATCCGCAAGCCGGGAGACCTGCCAGCGCTGAGACT RS_CBTC -267 CGGGCTATGACGGTTCC 19 GGAT-GAAAAGGGAA--CGCGGTG 16 GTTCCGCGACTG-CCCCCGCAACTGTGAGCGGCGAG 11 GACGCCACTGGACCGAA-----AGGGCCCGGGAAGGTTCG 10 CGACCCGCGAGTCAGGAGACCTGCCGTCGAGCGCGCA RS_BTUB -320 GGAACGGCTTCGGTTCC 12 GGAT-GAAAAGGGAA--CGCGGTG 16 ACTCCGCGGCTG-CCCCCGCAACTGTAGGCGGCGAG 11 GGCGCCACTGGGAT----------GTCCCGGGAAGGCCGG 9 CGACCCGCAAGTCAGGAGACCTGCCGGAGCGATCACC RS_BTUF -365 CAATCCTCGTCGGTTTC 6 GAAT-TGAAAGGGAA--TCCGCCG 15 GAACCGGAACTG-CCCCCGCAACTGTAGGCGGCGAG 12 AAAGCCACTGTGGCCTC-----AAGCCATGGGAAGGCCGC 10 TGACCCGCCAGTCAGGAGACCTGCCGGCGTTCGATCT SAR_BTUB -400 TTGATCGCGCCGGTGCC 8 GGGCTTAATCGGGAA--TGCGGTG 16 AATCCGCGGCTG-TCTCTGCAACTGTAAGCGGATAG 29 GCAGCCACTGGGCCGG- 15 --AGGTCTGGGAAGGCGTG 10 TGACCCGTGAGCCAGGAGACCTGCCGGCGTCTGGTCG SAR_COBW -403 ATGATCGCGCCGGTGCC 8 GGCT-TAATCGGGAA--TGCGGTG 16 AATCCGTGGCTG-TCCCTGCAACTGTAAGCGGATAG 28 GCAGCCACTGGGCAG-- 10 -CAAGTCTGGGAAGGCGTG 10 TGACCCGCGAGCCAGGAGACCTGCCGGCGCACCGGTC SAR_BTUBF -297 CCGACGCCAGAGGTGCC 10 GGCT--AAGAGGGAA--GCCGGTT 2 ATTCCGGCGCTG-CCCCCGCAACTGTAACCGGATAG 13 CATGCCACTGGTGTCGG- 6 CCAGCACCGGGAAGGCGGG 10 TGACCCGGAAGCCAGGAAACCTGCCCTTGGTTGTCGT CO_METE -339 GCCGTTGTCGTGGT-CT 18 AGC--TAAGAGGGAA--GTCGGTG 16 AATCCGGCGCTG-CCCCCGCAACTGTGAGCGGCGAG 13 CGTGTCACTGACGCG-- 15 GATGCGTCGGGAAGGCCAG 11 CGACCCGTGAGCCAGGAGACCTGCCTCGACAGATAAC CO_BTUB -318 GCTTCGCGTCAGGTTCC 8 GGAT-GAAAAGGGAA--CGAGGTT 2 AGACCTCGGCTG-CCCCCGCAACTGTAAGCGGCGAG 10 CATGCCACTGGGCCCAA-----AAGGCCTGGGAAGGCGAC 12 TGACCCGTGAGCCAGGAGACCTGCCCGGCGCAGTCGT RPA_HOXN -281 GCGCCCGTTCAGGTGTG 15 CAC------AGGGAA--GCCGGTG 28 AATCCGGCGCTG-CGCCCGCAACTGTGAGCGGTGAG 11 TCGGCCACTGGGCAGCA-----CTTGCCCGGGAAGGCGAA 9 CGACCCGTGAGCCAGGAGACCGGCCTGAGTACGTCAT RPA_BTUB3 -448 TGACCAGCGACGGTTCC 6 GGAT-CAATAGGGAA--CGCGGTG 16 ATTCCGCGGCTG-CCCCCGCAACTGTAAGCGGCGAG 10 CACGCCACTGGGCTTT-------CGTCCTGGGAAGGCGGT 9 
CGACCCGCGAGCCAGGAGACCTGCCGTCAGTCGTGGT RPA_CFRX5 -383 TTGACGTCTTCGGTGCC 10 GGTG-AAACTGGGAA--TACGGTG 15 AATCCGTAGCTG-CCCCCGCAACTGTAGGCGGATCT 11 GTAGCCACTGACGTCCT-----CGGCGTCGGGAAGGCGGT 7 ATATCCGTGAGCCAGGAGACCGGCCGAAGACGGGAAG RPA_CRDX -364 TGCCAAGCGATGGTCCT 10 AGGT-GAAAAGGGAA--GCCGGTG 19 ATCCCGGAGCTG-CCCCCGCAACTGTAAGCGACGAG 8 GAGGCCACTGGGAA----------TTCCTGGGAAGGCGGC 9 CGACTCGCGAGCCAGGAGACCTGCCATCGCGTATTGT RPA_METE -297 ATCGCCGTCGAGGTTCT 19 AGCT--AAGAGGGAA--GCCGGTG 2 AGGCCGGCGCTG-CCCCCGCAACTGTTAGCGGTGAG 12 AAAGCCACTGGGAGC--------GATCCCGGGAAGGTCGA 10 TGACCCGCGAGCCAGGAGACCTGCCTCGTCGAACGAA RPA_COBT2 -412 CCGCTCGCTTCGGTGCC 12 GGTG--AAACGGGAA--TGCGGTG 16 AGTCCGCGGCTG-CCCCCGCAACTGTAAGCGGATCG 11 TCCGCCACTGAG----- 18 -----CTCGGGAAGGCGAC 7 ATGTCCGCGAGCCAGGAGACCGGCCGAAGTCCGCAAC RPA_BTUF2 -320 GAGGTTGTACCGGTGCC 13 GGTG--AAACGGGAA--TGCGGTG 15 ATGCCGCAGCTG-CCCTCGCAACTGTGGGCGGATCG 11 CATGCCACTGACCAGA-------TCGGTCGGGAAGGCGGA 8 ATATCCGCGAGCCAGGAGACCGGCCGGTACAAGGTGT RPA_BTUB -304 ATGGCGGTGACGGTTCC 5 GGGATGAAAAGGGAA--TACGGTG 24 AGGCCGTAGCTG-TTCCCGCAACTGTAAGCGGATCG 10 GATGCCACTGGGAACCT-----CGGTCCTGGGAAGGCGAC 6 TCAACCGCGAGCCAGGAGACCTGCCGTCATTCGTGGT RPA_CBIC CGCGCGCCGACGGTGTC 14 GACG--AAGAGGGAA-TATCGGAA 20 GCGCCGAAGCTG-CCCCCGCAACTGTAAACGGTGAG 9 TACGCCACTGGATCA---------TATCCGGGAAGGCCGC 8 CGACCCGTGAGCCAGGAGACCTGCCGTCGCCTGCTAT BPS_HOXN -591 GCTCGCGTTTCGGTGCT 23 AGT--CAAACGGGAA--ACAGGGA 22 CAACCTGTGCTG-CCCCCGCAACGGTAAGCGAAGGC 53 CCAACCACTGGACGCAT-----CGCGTCCGGGAAGGTGAA 5 GTTTTCGTCAGCCCGGATACCGGCCGAGACACGGGGC BPS_BTUB -329 GGCGCCGCCTCGGTGCT 16 GGT--TAAACGGGAA--GCAGGGC 22 CAACCTGCGCTG-CCCCCGCAACGGTAAGCGATCGC 70 GGTGCCACTGCGCTTC-------GCGCGCGGGAAGGCGAG 5 GACGTCGCGAGCCCGGATACCGGCCGAGGCGGGGAGG BPS_COBE -391 TGCGCGCGTTCGGTGCC 22 GCC---CAACGGGAA--ACAGGAA 17 CAACCTGTGCTGCCCCCCGCAACGGTAAGCCGCCTG 28 ACGGCCACTGTCCTC--------GCGGATGGGAAGGCGGC 7 CACGCGGCCAGCCCGGATACCGGCCGACGCACGGGGC BPS_COBG -303 GTCCGTCGACCGGCGCC 6 GGC---AAGAGGGAA--CGCAGGG 9 CCGCTGCGGCTG-CCCCCGCAACTGTGAGCAGCGAG 13 CACGCCACTGGCCAC-- 16 
----CGCCGGGAAGGCCCG 10 CGACCTGCCAGCCAGGAGACCTGCCGGGACGTTTCGT NE_BTUB -343 CCCTTGTTTGAGGTGTC 20 GAT--GAAACGGGAA--GCCGGTG 22 ATGCCGGCACTG-CCCCCGCAACGGTAAATGAGTCA 10 CACGCCACTGTGCTGT------ATGGCACGGGAAGGCGCA 20 CCGCTCATAAGTCCGGAGACCGGCCTGAAGCAATATC MFL_BTUB -327 CCAAGTTTTGAGGTGTC 22 GGTG-AAACTGGGAA--ACAGGTG 23 ATGCCTGTGCTG-CCCCCGCAACGGTAAGCAAGCCG 9 CATGCCACTGTGAAAGA-----CCTTCATGGGAAGGCGGC 9 AATCTTGCAAGCCCGGAGACCGGCCTGAAAACGATCA MFL_BTUB2 -411 ACCTCACTTACGGTTTT 19 AAAT--AATAGGGAA--TCCGGTG 16 AATCCGGAACTG-CCCCCGCAACTGTAATCGGTGAG 13 CTTGCCACTGGACTT---------GATCCGGGAAGGCCGC 11 TGACCCGAGAGTCAGGAGACCTGCCGCAAGTGAGCTA MFL_NRDA -365 ACACCATCTACGGTGTC 22 GA----AACAGGGAA--TGCGGTC 16 AAGCCGCAGCTG-CCCCCGCAACTGTGACCAGTGAG 15 AAGGTCACTGGGCCTGG- 5 TGAGGCCCGGGAAGACAGG 10 GGACCTGGGAGCCAGGAAACCTGCCGTAGATCATTTT REU_BTUB -252 CCCCCGTTCCAGGTGCT 24 AGTT--CAACGGGAA--ACAGGGA 34 CAACCTGTGCTG-CCCCCGCAACGGTAAGCGACCGC 35 AACGCCACTGAATC--- 17 -----ATCGGGAAGGCGGC 6 GATGTCGTCAGCCCGGATACCGGCCTGCAGAACGAGG RSO_HOXN -270 CTCACGATGATGGTGCC 7 GGTG--AAACGGGAA--CGCGGTG 2 ATGCCGCGGCTG-CCCCCGCAACTGTAAGCGACGAG 10 CCAGCCACCGCACG----------ATGCCGGGAAGGCGGC 9 TGACGCGCGAGCCAGGAGACCGGCCATCTCCTTCTGT RSO_BTUB -388 CGCCGCGTCCTGGTGCC 16 AGTT--AAACGGGAA--GCAGGGA 22 CAACCTGCGCTG-CCCCCGCAACGGTAAGCGAACGC 59 CATGCCACTGTTCCG--------CGGAACGGGAAGGCGGC 6 CGGTTCGCCAGCCCGGATACCGGCCAGGACAGTGGGT
Alignment of B12 elements: alpha- and beta-proteobacteria
0 1 1' 2 AddI 2' 3 3' 4 VS 5 6 AddII 6' 5' VS 4' 0' ======> -===><===- ====> >< <==== ===> <== =====> -==> ======> >< <====== <==- <===== ---> <---<====== hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra gcCACTG YGGGAAGgc rAGYCMGgAgaCCkGCcd EC_BTUB -248 ATCCACTTGCCGGT-CCTGTGAGTT--AATAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAAAGGT 16 GCGGACACTGCCAT----------TCGGTGGGAAGTCATC 19 ACCCCTCCAAGCCCGAAGACCTGCCGGCCAACGTCGC SY_BTUB -252 ATCCGTGGGCCGGT-CCTGTGAGTT--AATAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAAAGGT 15 GCAGACACTGCCTC-----------CGGCGGGAAGTCATC 24 AACCCTCCAAGCCCGAAGACCTGCCGGCTAACGTCGC SY_CBIA -265 GTAAACCAACAGGTTTG 12 T--------AGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 AAGACCACTGATCGC--------AAGATTGGGAAGGACGG 6 AGGACGCTAAGCCAGAAGACCTGCCTGTCGGTGATAA KP_CBIA -264 ACAAACCGACAGGTTCG 15 C--------AGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 AAAACCACTGATCGA--------AAGATTGGGAAGGGCGG 6 ACGAGGCTAAGCCAGAAGACCTGCCTGCCGGTAACTG KP_BTUB -245 ATTCGCCTACCGGT-CCTGTGAGTT--AAAAGGGAA--CCCAGTG 2 AATCTGGGGCTG--ACGCGCAGCGGTAAGGAAGGTG 19 GCAGACACTGCGGCT--------AGCCGTGGGAAGTCATT 11 CAGCCTCCAAGCCCGAAGACCTGCCGGAATACGTCGC YP_BTUB -324 CATTGTGGTCCGGC-CT 22 AGAGTTAAAAGGGAA--TCCGGTG 2 AATCCGGAGCTG--ACGCGCAGCGGTAAGGGGAAGT 18 ACAGACACTGTCCGC--------AAGGATGGGAAGTCATC 67 GAGATCCTAAGCCCGAAGACCTGCCGGTATTACGTCG YE_BTUB -288 CATTGCGGTCCGGC-CT 22 AGAGTTAAAAGGGAA--TCCGGTG 2 AATCCGGAGCTG--ACGCGCAGCGGTAAGGGGAAGT 18 CCAGACACTGTCCGT--------AAGGATGGGAAGTCATC 32 GAGATCCCAAGCCCGAAAACCTGCCGGTATACGTCGC YE_CBIA -282 ATACTGAAACAGGTATG 15 T--------TGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 GAGACCACTGATCCAT-------AGGATTGGGAAGGTAGC 8 GTGACGCTAAGCCAGAAGACCAGCCAAATCAGTAAAG EO_BTUB -360 GATGAGCGTCCGGC-CTT 7 AAGTC-AAAAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAATGCC 17 GCAGACACTGTTAT--- 80 ----CGATGGGAAGTCATC 45 CGGCATCCAAGCCCGAAGACCTGCCGGAATACGTCGC VC_BTUB -326 AGCGCCAAGCTGGTGCT 26 GGCT-GAAAAGGGAA--TCCGGTG 2 ACTCCGGAACTG--ACGCGCAGCGGTAAGAGAGAAC 9 
AACGACACTGCTTTT---------CGAGTGGGAAGTCGAG 14 GTGCTCTCAAGTCCGAAGACCTGCCAGCAACTGAGTT PA_BTUB -297 GCCTTGCGACAGGTGCC 8 GGTG-AAACAGGGAA--GCTGGTG 15 AGGCCAGCGCTG-CCCCCGCAACGGTAGGCGAATCA 12 ATGACCACTGTGCTC--------CGGCATGGGAAGGCGCG 19 TCGCTCGCGAGCCCGGAGACCGGCCTGACGCACCCAC PA_BTUB2 -297 GGCCCGTTCCAGGTGCC 18 GGTG--AAACGGGAA--GCCGGTG 16 AGTCCGGCGCTG-CCCCCGCAACGGTAAGCGAGCGA 9 CAGGCCACTGTGCTC--------CGGCATGGGAAGGCGAG 9 ACCCTCGCAAGCCCGGAGACCGGCCTGCAACGCCCTG PA_COBW -305 TGCCGGTTCGAGGTTCC 16 GGC--TAAGAGGGAA--CGCGGTC 1 ATGCCGCGGCTG-CCCCCGCAACTGTGAACGGCGAT 8 AATGCCACTGCGTG-----------ACGCGGGAAGGCGGG 16 CAGACCGTGAGCCAGGAGACCTGCCTCGTCGATCCCG PA_COBG -244 GCGCGTTCGTCGGTGCC 37 ------AAGAGGGAA--CACGGAG 25 TAGCCGTGGCTG-CCCCCGCAACTGTATGCAGCCTG 11 TTCGCCACTGGAT------------TACCGGGAAGGCGGC 33 CGGGCTGCGAGCCAGGAGACCTGCCGCCGAAACCAGT PA_CBTAB -245 GGGTTGTCCCAGGTGTC 17 AGGT-GAAACGGGAA--GCCGGTG 14 AGTCCGGCGCTG-CCCCCGCAACGGTAAGCGCATC---------------------------------------------------GCGCGCGAGCCCGGAGACCGGCCTGGAACCTTTCG PP_BTUR -334 GGCGTGTTTCAGGTGCC 21 GGTG-AAACTGGGAA--GCCGGTG 17 ATTCCGGCGCTG-CCCCCGCAACGGTGGATGAGTAA 10 AGGGCCACTGGATGCC------AGCATCCGGGAAGGCGCG 17 CCACTCACAAGCCCGGAGACCGGCCTGATACTGCCAA PP_BTUF -302 TGCGGGCCGCCGGTTTC 7 GAAC-TAACAGGGAA--TCCCAGG 15 CAATCGGAACTG-CCCCCGCAACTGTAGGTGCCGAG 11 GATGCCACTGGGCCTG-------CCGCCCGGGAAGGCCGG 11 -GACGCACCAGTCAGGAGACCTGCCGGCCTACATTCA PP_BTUB2 -319 CGCCAGTTTCAGGTGCC 18 GGTG--AAACGGGAA--ACCGGTG 19 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGCGA 9 GATACCACTGTGCTC--------AAGCATGGGAAGGTGAA 9 CCCCTCGCAAGCCCGGAGACCGGCCTGGAGCTTCACT PP_COBW -299 TGCCACTTCGAGGTTCT 13 AGCT-AAGACGGGAA--CGCGGTA 1 AAGCCGCGGCTG-CCCCCGCAACTGTAAGCACCGAC 11 ACAGCCACTGCGCCA--------ACGCGCGGGAAGGCGTC 27 AACGGTGCAAGCCAGGAGACCTGCCTCGTCACGTTTT PP_CBTAB -309 CCTCGCGTTCAGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 29 ATGCCGGTGCTG-CCCCCGCAACGGTAAGCGAGTGA 6 TGTACCACTGTGCCTCGT-AGTACGGCATGGGAAGGTGAC 20 TTCCTCGCAAGCCCGGAGACCGGCCTGGCGTTCATGA PU_BTUR -300 GGCTTGTTTCAGGTGCT 19 AGTG-AAACAGGGAA--GCCGGTG 30 
ATCCCGGCGCTG-CCCCCGCAACGGTAAATGAGTAA 11 GATGCCACTGCTTA----------ACAGCGGGAAGGCGCG 14 CCGCTCATGAGCCCGGAGACCGGCCTGATCCATCCAG PU_COBW -302 TGCGCTTTCGAGGTTCT 14 AGCT-AAGAAGGGAA--CGCGGTC 1 AAGCCGCGGCTG-CCCCCGCAACTGTGAACGGTGCT 9 CACGCCACTGCCAA--- 12 ---CCAGCGGGAAGGCGCA 22 AACACCGTCAGCCAGGAGACCTGCCTCGTCACAGATT PU_CBTAB -335 AACTTGTTACGGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 29 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGCGA 6 AGATCCACTGTGCCCA------CGGGCATGGGAAGGTGAC 23 CCCCTCGTGAGCCCGGAGACCGGCCCGCAACACACAG PY_COBW -331 TGCCGGTTCGAGGTTCT 25 AGCT-AAGACGGGAA--TGCGGTA 1 ATGCCGCAGCTG-CCCCCGCAACTGTAAACGGTCAT 9 ACAGCCACTGCTG------------CGGCGGGAAGGCGCG 39 GCTGCCGTGAGCCAGGAGACCTGCCTCGAACCGGGCT PY_BTUR -298 GGCTTGTTTCAGGTGCT 20 GGTG-AAACAGGGAA--GCCGGTG 16 ATCCCGGCGCTG-CCCCCGCAACGGTAAATGAGTCA 11 CGTGCCACTGTGTTT--------CGACACGGGAAGGCGCG 13 CCGCTCATGAGCCCGGAGACCGGCCTGAACCACTCAA PY_CBTAB -303 ACCTTGTTTCGGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 18 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGAGA 5 TGATCCACTGTGCTC--------TGGCATGGGAAGGTGAC 30 CCCCTCGCGAGCCCGGAGACCGGCCCGACATTTTTCC PY_BTUF -321 TGCCGGCCGTCGGTTTC 6 GAAC-TAACAGGGAA--TTCGCCA 17 AAAACGAAACTG-CCCCCGCAACTGTAGGCATCGAG 11 ACTGCCACTGGATTC--------AGATCCGGGAAGGCCGG 11 -GACATGCCAGTCAGGAGACCTGCCGACCCGATTCAA SON_BTUD -303 --------------------------TAATAGGGAA--TCGGGGC 13 CAGCCCGAACTG-TACCCGCAACTGTGAGTAGTTAA----------------------69------------------------TTTTCTACAAGTCAGGAGACCTGCCTATTGCTGTTTT SON_BTUB -332 CAACCTTCTGTGGTGCT 18 AGA--TAATCGGGAA--GCCAGTG 2 ATTCTGGCACTG-CCCCCGCAACGGTAAAAGGTGAG----------------------89------------------------TATAGCCTAAGTCCGGAGACCGGCCCTAAAGGTGTTT AV_BTUB -302 GCCTCGCTTCAGGTGCC 5 GGTG-AAACAGGGAA--GCCGGTG 24 AGGCCGGCGCTG-CCCCCGCAACGGTAGACGAGTCG 10 ATAGCCACTGTGTTGC-----TCGGACACGGGAAGGCGCG 25 TCGCTCGTGAGCCCGGAGACCGGCCTGTGGCGATCCA XAX_BTUB -327 CGCGCCCCTGAGGTGAC 16 GTTT--AAACGGGAA--TCCGGTG 24 ATTCCGGAGCTG-CCCCCGCAACGGTGGGCGAGGTC 11 TACGCCACTGTGCAG--------TCGCATGGGAAGGCGCG 19 CCACTCGCAAGCCCGGAGACCGGCCTGAGGGATTGAC BS_BTUF -237 AATGTCAAATAGGTGCC 18 
GGCT-TAAAAGGGAA--ACCGGTA 1 AAGCCGGTGCGG-T-CCCGCCACTGTAATTGGCCAA-------------------------------------------------GCGCCAAGAGCCAGGATACCTGCCTGTTTGATCAGC ZC_METE -309 AAAGGAAAATAGGTACA 16 TGTT-TAAAAGGGAAG-CTTGGTG 2 ACTCCAACACGG-T-CCCGCCACTGTAAATGCTGAG 9 TGGTGCCACTGTGA-----------AAACGGGAAGGTAAA 10 TGAAGCATAAGTCAGGAGACCTGCCTGTTTTAACAAC HD_ACHX -377 CTCAAGCATTAGGTGGT 16 ATCT-GAAAAGGGAA--GCTGGTG 2 AGTCCAGCACGG-T-CGCGCCACTGTAATAAGGAGC 10 GAAACCACTGTCCAA---------AGGATGGGAAGGTACA 9 -TTATCTTAAGTCAGGAGACCTGCCTAATGTATGCAC HD_BTUF -401 TCGCGCTGAAGGGTCGT 11 GCGT-GAAAAGGGAA--GTCGGTG 2 AATCCGACACGG-T-CCCGCCACTGTAAATGGGAGA 8 AGATCCACTGTCTA----------GCGACGGGAAGGGGGC 9 ATGAACATAAGTCAGGAGACCTGCCTTTCAGTTTGAG HD_METE -322 GTTTGGGAACAGGTACG 22 TGTT-TAAAAGGGAA--TCCGGTG 2 AATCCGGAGCGG-T-CCCGCCACTGTCATAGCTGAG 10 ATTGTCACTGACCGTTС-----ATTGGTTGGGAAGACTGT 8 TGACGCTAGAGCCAGGAGACCTGCCTGTTCTAACAGC HD_COBT -247 TAGGCTTCTTAGGTGCC 9 GGA--GAATAGGGAA---GTTCTG 2 A---CGACGCGG-AGCCCGCCACTGTAGTCGAGGAG 7 AATACCACTGGGA------------AACTGGGAAGGTGTA 8 -TGAATCGGAGCCAGGAGACCTGCCTAAGAAGATGCG HD_NRDA -345 GTGGACGGTAAGGTGCC 6 GGCT-TAAAAGGGAA--TCTGGTG 2 AATCCGGAGCTG-TCCCCGCAACTGTGAGTGCTACG 10 TTTGCCACTGTACATC- 14 AAATGTATGGGAAGGCTTC 8 TAAAGCACGAGTCAGGAGACCTGCCTTACTTCCACAA BE_NRDA -318 TGCCAAGCAATGGTGTC 6 GACT-TAATAGGGAA--TCCGGCG 2 AATCCGGAACTG-CCCCCGCAACTGTATGTGCGGAC 8 ATGGCCACTGGCGGCA- 14 -CGCCGCTGGGAAGGCCCC 9 CGATGCACGAGTCAGGAGACCTGCCTTGCTTGGAACG BE_BTUF -333 ATTCGCAGCAAGGTGCC 6 GGCT-TAATAGGGAA--TCCGGTG 2 AATCCGGAGCTG-TCCCCGCAACTGTCAATGCGGAC 8 ATCGCCACTGTACGGAC 18 -TCCGTACGGGAAGGCTTC 9 TGAAGCATGAGCCAGTAGACCTGCCTTGCTTGCCGCA BE_CBIW -346 AGCCTGCTTAAGGCTTGGGT-AG----AAAGGGGAAG-CCCGGTG 3 AATCCGGCACGG-TGCCCGCCACTGTGGTGGGGAGC 10 CAAGTCACTGAAGGA--------TGCTTCGGGAAGACGCC 8 ATGATCCTAAGTCAGGAGACCTGCCTTGTTTGGATCG BI_CBIW -566 GCAACAGTAAAGGTGCC 5 GGCT-TAATAGGGAA--ACTGGTG 2 AGACCAGTACTG-CCCCCGCAACTGTAAGTGTGGAC 8 ATAACCACTGTGAAAA-------AATCACGGGAAGGTTCT 9 TGATACACAAGTCAGGAGACCTGTCTTTATTGTGAAG LMO_X -318 GGTCTTATGTTGGTGGA 12 
TTCT-GAAAGAGGAA--TTCGGTG 2 ATGCCGAAACTG-CCCCCGCAACTGTAAGGTGGACA 9 ATAACCACTGTACGTTTT---TAGCGTATGGGAAGGTTCG 8 ATGAAGCCAAGTCAGGATACTCGCCAAATAAGACGGA LMO_CBIA -332 ACAACTAAATAGGTGAA 4 TTA---ATCCGGGAA--AGAGGTG 2 AATCCTCTACAGGCCCTAGCTACTGTAATACGGACG 11 TATGTCACTGGAAGC--------AATTCCGGGAAGACTGG 8 ATGATGTTAAGTCAGGAGACC-GCTTTTATATTCGAT CA_BTUF -332 ACCATATTTTAGGCACC 8 GGTT-TAATAGGGAA--ATTGGTG 2 AATCCAATGCAA-CCCCCGTTACTGTATACAGTTAC 7 ATGTCCACTGGAGTT--------TTCTCTGGGAAGGATGG 7 TAAACTGTGAGCCAGGAGACCTACCTAAAATATTATG CA_CBIM -307 TAAAATTTGTAGGTTCA 16 TGAT-TAAAAAGGAA--TCAGGTG 2 AAGCCTGAGCGG-T-CCCGCCACTGTAATAAAGGAG 11 TATGTCACTGGGA------------AACTGGGAAGGCGTA 10 -GATTTTTGAGCCAGGATACTTGCCATATTCTAGTAT CPE_CBIM -480 ATTTAGAAATAGGTTAA 20 ATAT-TAAAAGGGAAG-TTGGGTT 2 AATCCCACGCGG-T-CCCGCCGCTGTAATAGAGGAG 12 TAAGCCACTGGAATATA-----ATATTTTGGGAAGGCCAC 9 TGATACTTGAGCCAGAAGACCTGCCTATTTTTAAAAC CPE_CBIK -294 TTATATTTTTAGGTTTG 4 TAAT-TAAAAGGGAA--AGTGGTT 2 AGTCCACTACAG-CCCCCGCTACTGTGATAGGATAC 10 TTGACCACTGATTATA-------TAAATTGGGAAGGGAGA 8 TAAGCCTTAAGTCAGGATACCTGCCTAAAGATCATGA CPE_BTUF -482 AACTAATAATTGGTGTG 5 CGCT-TAATAGGGAA--TGAAGTT 2 AGTCTTCAACTA-CC------TCAGTAACCGTGAAG 15 TATGTCACTGCATTT-------TTTGTGTGGGAAGACGAG 7 AAGAAGCAAAGTCGGGATACCTGCCTTTTATTTAAGT CPE_CBLT -537 TAAGAGCATTAGGTGTT 4 AACT-TAATAGGGAA-----AGTT 2 AAACT---GCAG-CCCCCGCTACTGTTGATAAGGAC 8 AAAGCCACTGTGATAA-----ATAGTCATGGAAAGGATTG 9 -GATTTATTAGCCAGGAGACCTGCCTAGTATGCTATT CB_CBIP -317 AAAAAGATTTAGGTGCC 11 GG-T-GAAAAGGGAA--TGTGGTA 2 A-GCCACAGCAG-CCCCCGCTACTGTAATTGAGGAC 10 TAAACCACTCTTTA----------AAAAGGGGAAGGGAAA 8 TGAATCATGAGCCAGGAGACCTGCCTAGATTTTTATT DF_CBIM -367 ATAATATTATAGGTTCT 7 AGAT-TAATAGGGAA--AAAGGTT 2 ATTCCTTTACAG-CCCCCGCTACTGTGATGCAGACG 9 TTAGCCACTATGATG-- 13 ---CTCATGGGAAGGAAAA 8 ATGAAGCTAAGTCAGGAGACCTGCCTAAAATATTAAA DF_CBIP -287 GATTAAAATTAGGTTCT 5 AGA 4 AAAAGGGAA--AAAGGTT 2 ATGCCTTTGCAG-CCCCCGCTACTGTGAAACCAACG 8 AATACCACTGTCAGT---------TTGATGGGAAGGTTAT 9 ATGAAGTTAAGCCAGGATACCTGCCTAATTTAATTTA DF_BTUF -393 AATAATACTAGGGTACT 5 
AGTT-TAATAGGGAA--AGTGATG 2 AATTCACTACAG-CCCCCGCTACTGTATACGGATAC 7 AAATCCACTGAAATTTAT--AAAAATTTTGGGAAGGGTGA 7 AAAGCCGTGAGTCAGGAGACCTGCCCAGTATTATATA THT_BTUR -337 TAAAGCCTTATGGTCCC 5 GGGT-TAAAAGGGAAG-ACGGGTG 2 AATCCCGCGCAG-CCCCCGCTACTGTGAGGGAGGAC 10 TAAGCCACTGTCCGG-- 60 --CCGGGTGGGAAGGCAGG 8 TGAGTCCCGAGCCAGGAGACCTGCCATAAGGTTTTAG THT_BTUF -352 AAAAGCCTTATGGTCCC 5 GGGT-TAAAAGGGAAG-ACGGGTG 2 AATCCCGCGCAG-CCCCCGCTACTGTGAGGGAGGAC 10 TAAGCCACTGTCCGG-- 60 --CCGGATGGGAAGGCAGG 8 TGAGTCCCGAGCCAGGAGACCTGCCATAAGGTTTTTA EF_BTUF -340 TGTACTTATGAAGTGTC-------------AGGGAA--AGAGGTG 2 AATCCTCTACAG-ACCTACCTACTGTATGGTGGATG 8 AAGACCACAGATT------------ATTCTGGAAGGATTG 8 AAGAAGCTAAGTCAGGATACCGGCTTGATAAGTCTAA HMO_CBIM -396 TTGCTGGAACAGGTCGC 20 GCGTTAGAAAGGGAAG-TTCGGTG 2 AATCCGACGCGG-T-CCCGCCACTGTAAGGGGAATG 10 AATGTCACTGGCGTTT------AAGCGCTGGGAAGACGGA 8 ATGAACCCGAGCCAGGAGACCTGCCTGTTACCACGTC HMO01408 -271 CTACGGTTACAGGTGCC 6 GGA--GAATAGGGAA--CCGGGTG 2 AATCCTGGGCGG-T-CCCGCCGCTGTATGGTCGAGT 9 TGAGCCACTTCGT------------GTGAGGGAAGGCGCC 7 ATTAAGCCGAGCCAGAAGACCTGCCTGTACACTGTTC HMO_CBIQ -382 CGATGTCTGCAGGCGCC 5 GGCT-GAAAAGGGAA--TGAGGTG 2 AGACCTCAGCAG-CCCCCGCTACTGTATGGGAAGAC 12 GAATCCACTGGACTG--------CCGTCTGGGAAGGAAAC 9 TGATTCCTGAGCCAGGAGACCTGCCTGTCGCGACAAA HMO_CBLS -306 TAACCGTTTCAGGTGCC 8 GGA--GAATAGGGAA--CTGGGTG 2 AATCCCGGACGG-A-CCCACCACTGTAAGAGGAGCT 8 TTGGCCACTGGGA------------TTCTGGGAAGGCGTG 7 ATGATTCGGAGTCAGGAGACCTGCCTGTAACGCTCGG HMO_CBID -294 ATGATGCAAAGGGTGGC 34 GTC 6 ATTAGGGAA--GTCGGTG 2 ATTCCGACGCGG-T-GCCGCCACTGTGAAAGGGGAG 10 CAGGCCACCGGGT------------AACCGGGAAGGCGAA 8 ATGAACCTGAGCCAGGAAACCTGCCTGTCCCCGCACC DHA_BTUF -297 AACTATTGACAGGTTTA 6 TAAT-GAAAAGGGAA--TCAGGTG 2 AATCCTGAGCAA-CCCCCGTTACTGTAAGCGCCGTT 17 CATGCCACTGGCGA----------AGACTGGGAAGGCGAT 4 AAAGGCGCGAGCCAGGAGACCTGCCTGTTAATAAAAC DHA_CBIET -298 ATAGTATTCAAGGTTCC 8 GGAA-GAAAAGGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTGAACCACGAG 9 AGGCCCACTGGGATG---------AGCCTGGGAAGGGAAG 7 AAGACTGGAAGCCAGGAGACCTGCCTTGAACATTGCG DHA_CBLS -282 TAAGGATTTCAGGTGCC 
13 GGA--GAATAGGGAA--CCGGGTG 2 ATTCCCGGACGG-A-CCCGCCACTGTAAAGAGGAGT 10 AATGCCACTGGGT------------AACTGGGAAGGCAGC 9 ATGACTCGAAGTCAGGAGACCTGCCTGGATCCGGGGA DHA_CNOA -352 ACATAGCTTAAGGTGCC 5 GGA--GAATAGGGAA--ACCGGTA 2 AGTCCGGTGCGG-A-TCCGCCGCTGTAATCGGAGAC 10 AATGTCACTGTCTTTTT-----TAGAGATGGGAAGGCGTG 9 TGACACGAGAGCCAGAAGACCTGCCTTTTAGAAAGCT DHA_NRDD -316 GGAATCTCATAGGTGAC 12 GTT--GAAAAGGGAA--GCCGGTT 2 AGGCCGGCACGG-T-CCCGCCGCTGTAAGGGAAATA 11 ATTACCACTGAAAGG---------GTTTCGGGAAGGTAAG 8 ATGATCCTAAGTCAGAAGAC-TGCCTATGTGTATACC DHA05379 -325 AGGCGGAATAGGGTTGC 6 GCAT-TAATAGGGAAC-TCCGGTG 2 AAGCCGGGACAG-C-CCCGCTACTGTAAGAAGGACG 11 GGATCCACTGGTGA----------AAACCGGGAAGGTAAG 8 ATGAGTTCAAGTCAGGATACCTGCCCCATTCCGGAAA
Alignment of B12-elements (continued): Gamma-proteobacteria, the Bacillus/Clostridium group
0 1 1' 2 AddI 2' 3 3' 4 VS 5 6 AddII 6' 5' VS 4' 0' ======> -===><===- ====> >< <==== ===> <== =====> -==> ======> >< <====== <==- <===== ---> <---<====== hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra gcCACTG YGGGAAGgc rAGYCMGgAgaCCkGCcd DI_BTUC -246 CACATTGATTAGGTGCA 12 TGC-----ATGGGAA--TCTGGTG 2 AATCCAGAGCTG-A-CGCGCAGCGGTGAAGGTGCAA 14 GTAGCCACTGAGAGTATA--AAAACTCTTGGGAAGGTGAG 17 GAGCACCCCAGTCCGAAGACCGGCCTAATCAGAAACA MT_CBTG -309 TCAGGCGATGACGAT--------------GCAGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTCACCGGGGAG 9 TAAGCCACGGCCAC-----------AGGCTGGAAGGCGAG 8 CGATCCGGGAGCCAGGAGACTCGCGTCATCGCGTCCT MT_METE -362 ACCACGCAGCTGGTCTG-48-------GAGAGGGAA--CCTGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTATGCAGGAAC 20 ACAAGCACTGGTCTCA-------ACGACTGGGAAGCGACG 17 GAGCCTGCGAGTCCGAAGACCTGCCAGCCGTGCCGGA ML_CBTG -270 AAAGGCGATGACGATGC--------------AGGAA--GTCGGTG 2 AAGCCGGCGCGG-T-CCCGCCACTGTAATCGGGGAG 9 TAGGCCACGGCCAT-----------TGGCTGGAAGGCGAG 8 TGATCCGAGAGCCAGGAAACTCGCGTCATCGCGTCCT ML_METE -369 GCTGGTCTGCTGGTTCC 44 ------GAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTATGCAGGAAC 11 GGAAGCACTGGTCTTA-- 8 -CGAGACTGGGAAGCGATG 18 GCGCCTGCGAGTCCGAAGACCTGCCGGCTGTGTCGGG RK_CHLID -224 AAGACAATCGAGGTGCC 8 GGA--TAATCGGGAA--GCCGGTG 2 AATCCGGCACAG-G-CCCGCTGCGGTGACCCGGGAG 22 GCAGCCACTGGACCGG------CCGGTCCGGGAAGGCGAT 11 CGACCGGGAAGTCCGAATACCGGCCTCGATTTCAGCT RK_COBN -260 CCACCTGCCGTGGTGCT-------------CGGGAA--GCCGGTG 2 AGACCGGCGCGG-CCCTCGCCACTGTGAGCGGGTAG 35 GAGACCACTGGACGG--------AAGTCCGGGAAGGTCGG 11 TGATCCGTCAGCCAGGAGACCGGCCACGGCGCGGGAA RK_CBTE -137 CACACGTGCCGAGGTGC-------------AGGCAA--TCCGGTG 2 AGTCCGGAGCGG-T-CGCGCCACTGTGACCGGGCGA-----------------------1------------------------CCGCCCGGGAGTCAGAAAACTGTCTCGGCGCATGGAT RK_BTUF -153 GCTGACGCCCGTGC----------------AGGGAAAGTCCGGTC 2 AGTCCGGCGCTG-A-CCCGCAACGGTAGGCCGTCCA-----------------------1------------------------CCGACGGTGAGCCCGATCGCCTGCACGGGGGTGCGCG SX_CBIM -209 CGCCACGCCTTGGTG------------AACGGGAAA--TCCGGTG 2 ATGCCGGTGCGG-CCCTCGCCACTGTGAATCGGGAA 21 
GCAGCCACTGGATCGCT---TGCGGTCCGGGAAGGCGGA 12 GTACCCGTAAGCCAGGAGACCGGCCAAGGCGCGTCGT SX_METE -387 CCCGTGCAGCTGGTTCG 21 CGTCGCAAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-C-CCCGCAGCGGTGAGCGGGAAC 10 ATACGCACTGGGCCCG- 6 -CGGGCCCGGGAAGCGACG 29 GGGCCCGCGAGTCCGAAGACCTGCCACCTGCCCGCGC SX_PDUX -365 TGCCCGCAGTTGGTTCG 30 CGACGCAAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTGAGTGGGAAC 10 AACAGCACTGGGCC-- 13 ---AGCCCGGGAAGCGACG 40 GCGCCCACGAGTCCGAAGACCTGCCACTGCGCCCGTA SX_BTUF -190 TCGCCGCGACGGGAG--------------ACAGGAA--GCCGGTG 2 AATCCGGCACGG-T-CCCGCCACTGTGACCGGGGAG 10 CACGCCACTGCGCGC--------CGCGCGGGAAGGCCAG 10 CGATCCGGGAGTCAGGACACTGGCCTGTCGCGGGCCC SX_NRDA -271 ---TCGCTGTCGCCGC-------------AGGGGAA--TCCGGTG 2 AATCCGGAACTG-T-CCCGCAACGGTGTACTTGCGT----------------------38------------------------CGCCTGTCCAGTCCGAGGACCTGCCGACAGTGCGCCC SX_BTUC -311 CGAAGCGCCTCGTGG---------------GGGAA-GTCCGGTC 2 AGTCCGGCGCTG-A-CCCGCAACGGTAGGCGGAGCC-----------------------8------------------------GGTCCCGTGAGCCCGATTACCCGCGGTGGTGAAGCCC SX12454 -204 CAGGGCGACGACGGTC-------------CGAGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTGATCGGTGAG 11 GTTGCCACTGCCCCGG------AGGGGCGGGAAGGCCGG 9 TGACCCGGGAGCCAGGAAACTCACGTCGTCGCCTCCT TFU_COBN -299 TGCGCTATGGTGGTCGC 3 GTGGT-GAACGGGAA-GACCGGTG 2 AGACCGGCGCGG-CCCTCGCCACTGTGATCGAGGAG 30 GCGGCCACCGGGCAC-------CAGCCTGGGAAGGTCAG 11 TGACTCGTCAGCCAGGAGACCGGCCACGACGCGTCAT TFU_CHLID -225 GGAACCGCCGAGGTGCC 11 GGA--TAATCGGGAA--GCCGGTG 2 AATCCGGCACAG-G-CCCGCTGCGGTGACCTGGGAG 20 GCAGCCACTGGACGG-------CAGTCTGGGAAGGCGAT 10 TGATCAGGAAGTCCGAAGACCGGCCTCGGCATGGCTG TFU_CBTE -134 AAGAGCGTCGGGTGC----------------AGGCA-ATCCGGTC 2 AGTCCGGAGCGG-T-CGCGCCACTGTAGACGGGCTC------------------------------------------------AAGCCCGTGAGCCAGAAAACTCACCCGGCGTAGTGGT PI_CBIB -164 CGGCCAGCGCGCGTCCG------------CAGCGAA--GCCGGTG 2 AATCCGGCGCTG-T-CCCGCAACGGTGATGGGGCCC------------------------1---------------------- GCCCCG47CAGCCCCACGAGCTGCCTGCGCGTGCACC PI_CBIL -185 TCCCGGGCACCGGATGA 33 ---------GAGGAAT-GCCGGTG 23 
AGTCCGCGACGG-T-CCCGCCACTGTGAGCCGGTGA-------------------------------------------------AGCCGGCGAGTCAGACACTCCGCCGGTGCCGCTGAC PI_MUTA -253 CTAGTAGTGCTGGTTCG 16 CGTCGCAAGAGGGAA--TCCGGTG 3 ATTCCGGAACTG-T-CCCGCAGCGGTCAATGGGAAC 9 TAAGGCACTGGGCGGC------AACGCCTGGGAAGTAGTA 28 ATGCCCATGAGTCCGAAGACCTGCCAGCAGCGACAAC PG_BTUB4 -526 GAAAAGACTGAAGTAAC 19 GTGC----AAGGGAA--TCCGGTG 2 ATTCCGGAGCTG-AGCCCTCAGCTGTAATGCTTCGA 45 GATGGTCACTGTAGA--- 11 -CCCTATGGGAAGGCCGA 12 AAGAAGCTAAGCCAGAAGACCTGCTTTAGTAGATTTG PG00461 -556 GCCGTGTCAATGGTTTT 23 AAT--GAAAAGGGAA--CCCAGTG 2 ATTCTGGGACTG-TACCCTCAGCTGTAAGTTCAGAT 19 AAAGCCACTATACAGA------ATCGTATGGGAAGGCAGC 4 CCTTGAATAAGTCAGAAGACCTGCCATTACAAGCGTT PG_BTUF -354 CGGATATGTGCGGTTCA 39 GAT--TAAAAGAGAA--TTTGGTG 2 AAGCCAAAACTA-TCCCCGTAGCCGTATGGTCGTAC 15 GATGCCACTGCATAT--------CGATGTGGGAAGGCGTA 4 TTTAGGCCGAGTCGGAAGACCTGCCGCACATATCTAA PG_NRDD -342 CCCATCGTAGTGGTCCC 23 GGG 4 AAGAGGGAA--TCGGGTG 2 AATCCCGAGCAG-T-CCCGCTGCTGTAAGCTTTTAC 44 GATGCCACTGTTCATT-- 19 GCTGAATGGGAAGGCGCG 14 GATGAAGTAAGCCAGAATACCTGCCTCTACGAGTTGC PG_X_CBTD -228 TGTGCGGACTTTGTTCA 33 TGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGAGCAG-T-CCCGCTGCTGTGAACCTTGTT 15 ATATCCACTGTCCGTTCT---GTGCGGATGGGAAGGAGTC 5 TATGGGGTGAGCCAGAAGACCTGCAAAGTCTTTGTCT BX_BTUB -371 TGCAGTGCATTGGTTTG 22 CAAT-TAAAAGGGAA--TCAGGTG 2 AATCCTGAACAG-T-CCCGCTGCTGTAAGTTTCACA 24 CTTGCCACTGGGAAAC------GTTTCCTGGGAAGGCGCT 5 ACAGAAACGAGTCAGAAGACCTGCCTGTGCATCTTTT BX_PCCC -344 GTCGCCGAATTGGTTCG 18 CGA 5 AAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAACTCTGT 26 CGTACCACTGACAGAAA-- 7 CTCTGTCGGGAAGGTCCC 7 TGTAGAGTCAGTCAGAAGACCTGCCATTCGTGAATAA BX_BTUB4 -344 GCAGCCGCTTAGGTGAT 25 AT----AAAAGGGAA--TCGGGTG 2 AATCCCGAACAG-TGCCCGCTACTGTGATCCCCCTG 53 TATACCACTGTCATA--- 10 -CATGACGGGAAGGTAGC 6 AAAAGGGATAGTCAGGAAACCTGCCGAAGCAGACATA BX_NRDA -280 GCTCCCTGATCGGTTCC 20 GGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAGCTCCGT 20 GTTGCCACTGGGA----- 26 ---CACCGGGAAGGCGTС 5 CAAGGAGTCAGTCAGAAGACCTGCCGCTTATCAAAGG BX_CBTD -269 TGTCCCGAATTGGTTTC 21 
GGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAGCTTCAT 20 TTCGCCACTGACGT---- 15 ----GTCGGGAAGGCTTС 4 TTAGAAGTCAGTCAGAAGACCTGCCGTTCATCAAAGG BX_METE -264 GTCGGCAGATTGGTTCG 21 CGAT-TAAAAGGGAA--TCGGGTG 2 ACTCCCGGACAG-T-CCCGCTGCTGTGAAGTTTTAT 25 TTGGCCACTGACTCGT-------GTAGTCGGGAAGGCGTT 5 TGGAAGCTAAGTCAGAAGACCTGCCACTCTCGCTGAT BX_NRDD -210 TAGCAGATTTCAGTACT 12 AGT--CATAAGGGAA--CGCTGTG 2 AATCGGCGACAG-TACCCGCTGCTGTAATTCTCTGA 12 TATGCCACTGCGCCC---------AGCGTGGGAAGGCGTT 5 GGAGAGATAAGTCAGAAGACCTGCTGAAAAAGTAAAC CL_BTUB2 -231 TTACGGTTTCCGGTGCC 6 GGC 9 AAAAGGGAA--CCCGGTG 2 AATCCGGGACAG-TGCCCGCTGCTGTGATCCTCCCG 37 GAGGCCACTGGTTCGCGC--CCGCGAACCGGGAAGGCCGG 3 CGAGGGGAGAGTCAGAAGACCTGCCGTAATGCAGTAA CL_X_CBIM -227 TCCGATTATGTGGTGCC 17 GGCT-TAAAAGGGAA--TCCGGTG 2 AGTCCGGAACAG-TACCCGCTGCTGTAATTCCGCGC 32 AATGCCACTGTCCCGTT-----CAGGGATGGGAAGGCCGG 4 ATCCGGGAAAGTCAGAAGACCTGCCTCATATTTTTTG CL_X_FRD -498 TCGCCATGACAGGTGCC 12 GGA--GAATAGGGAA--GTACGTG 2 ATTCGTACACTG-TACCCGCAACTGTACAACGGTTA 47 CAGGTCACTGCCGGTT-- 13 -AACTGCGGGAAGGTTTG 11 TGCCGTGAAAGTCAGGAGACCTGCCAGTCATGCATTT CL_X_NRDJ -265 TTCAGCATTACGGTGCC 14 GGA--TAATAGGGAA--GTGCGTG 2 AATCGCACACTG-TGCCCGCAACTGTAAGATGGTAT 50 TGTATCCACTCCGCCA-- 20 --ATGCGGGGGAAGGCTG 29 AGCCATCGAAGTCAGGAGACCTGCCGTAGTGGTTGGC CL_BTUB -364 CATGATTAGCTGGTGCC 12 GGA--GAATAGGGAA--GTACGTG 2 ATTCGTACACTG-TACCCGCAACTGTACAACGGAAA 47 CACGTCACTGCCAG---- 15 ---GGGCGGGAAGGCTGC 8 AAGCCGTAAAGTCAGGAGACCTGCCAGTTACTCTTTG AN_X_CBIJ -153 AATATCAACTCGGTTCT 17 AGAGGTAAGGGGGAAAGTCCGGTG 2 AATCCGGCGCTG-T-CCCGCAACTGTAATGGGGCTT------------------------------------------------ATGCCTCAAAGTCAGAATGCCCGCCGAAAGTACAACA AN_CFRX -187 AATAAATATTCGGTTCT 17 AqAGGTAACGGGGAA26AACGGTG 2 AGTCCGGCGCTG-T-CCCGCAACTGTGAAGGAAAGA-----------------------10-----------------------AACTTTCCCAGTCAGAACGCCCGCCGAAATTGACGAT AN_COBG -152 TACTAGAACTTGGTGTT--------------GGGAAACTCCGGTG 2 ATTCCGGGGCTG-T-GCCGCAGCTGTGATGAAAAGT-----------------------18-----------------------AACTTTCCGAGTCAGAATGCCAATTCCAAGAGTTAGC TE_X_METE -160 
CTTAGTTGCTCGGTTCT 17 AGACGTAAGGGGGAAAGTGCAGTG 2 AATCTGCCGCTG-T-CCCGCAGCTGTGAGGAGAGA-------------------------3-----------------------CACTCTCTAAGTCAGAATGCCCGCCGAGTGGTCAACC TE_CBIX -141 TGTGAGAAGCAGCCTGT-------------AGGGAAAATCCAGTG 2 AGTCTGGTGCTG-T-GCCGCAGCTGTGATGGGAAT--------------------------------------------------CTTCCCTCAGCCAGAATGCCTACTTGCTGTGGTTCA CY_HUPE -160 TAAGTTTAGTTGGTTCC 17 GGAGGTAACGGGGAAAAGCTGGTG 2 AAGCCAATACTG-T-CCCGCAACTGTGATGGGCCC--------------------------------------------------AGGCCCTAAGTCAGGATGCCCGCCAACGATGGCCGA SN_HUPE -210 GGTTTGGGTCTGGTTTC 17 GATGGAAACGGGGAAAGAACGGTG 2 AATCCGTCGCTG-T-CCCGCAGCTGTAAAGCGTCCGGCCC-------------------------------------------CGCCGGCGTCAGTCAGAACGCCCGCCAGGAGCACTACC PMA_HUPE -232 ATCCATCAATCGGTTTC 17 GAAGGAAACGGGGAAAGTTCGGCG 2 AATCCGGCACTG-T-CCCGCAGCTGTAAAGCGCAAC-----------------------15-----------------------ACTTGCGCGAGTCAGAATACCCGCCGAATTTCCATCG DR_BTUFC -236 TCCTCGCAGCAGGCGC--------------AGGGAAAGTCCGGTT 2 AGTCCGGCACTG-T-CGCGCAACGGTTTT---------------------------------------------------------------CAGTCCGAACACCTCGCCTGCTCGCGCTG DR_BTUFR -312 TGAGGCCACCTGAGCC-------------AGGGGAA-GCCCGGTG 34 ATTCCGGCACTG-T-CGCGCAGCGGTGAATCGGCCT------------------------2-----------------------AGGGCCGTCAGTCCGAATGCCTCTCAGGGACGCGAAC DR_ACHX -270 AAGCCTCCCGAGGAAC------------AGAGGGAA-GTCCGGTC 20 AGTCCGGCACAG-T-CGCGCTACGGTTA----------------------------------------------------------------CAGTCCGAACGCTCGCCTCGTGGAGAACG LI_CBIX -365 ATGTTTCACATAGAT----------------AGGAA--GACGGTT 2 AATCCGTCACGG-TATCCGCCGCTGTAAGAAGGACG 10 TAAGCCACTGGGAC----------AACCTGGGAAGGCGTG 9 AAGATTTCAAGTCAGAATACGACCTATGAAAATTCCT LI_BTUB -279 ACGGAAAACTTGTTTAT 7 ATG--AGGAAGGGAA--TCCGGTT 2 AATCCGGAGCTG-AACCCGCAGCTGTAATCGCCGAA 16 CATGCCACTGCGTTAA-------ATACGCGGGAAGGCTGC 3 ATCGGCGAAAGCCAGAAGACCTAACAAGTAAAAAAAC FN_BTUF -276 CATGTCAATTATGTTCC 11 GGC--TAAGAGGGAA--TTTGGTG 2 ATACCAAAACGA-G-CCCGTCGCTGTAATTGAGTTT 10 TATACCACTGGATTT---------TATTTGGGAAGGTAAA 6 
TAAATCATAAGTCAGAAGACCTGCATAATTGAATTAC FN_BTUB -240 AGAAACAAATAGGTGCT 4 GGCTTAATAAAGGAA-GTTGGGTG 2 AATCCCACACAG-C-AATGCTACTGTATTGTGGACG 8 ATAGCCACTGGGA------------AACTGGGAAGGTGTA 8 TTGAAACTAAGTCAGGAGACTTACCATTATTTTATAT TM_BTUF -224 CCTCACCGTGCGGTACC 6 GGTT-CAAAGGGGAA--GCCGGTG 2 AATCCGGCGCGG-G-GCCGCCACCGTGACCGGGGAC 11 AACGCCACTGGGGCGA------TCACCCTGGGAAGGCGCG 10 TGATCCGGAAGCCGGGAAACCCGCCCGCGGTGAAGGG CAU_BTUR -268 TAGATCGTCGCGGTGAC 28 GTG-----GAAGGAA--GCTGGTG 2 AGTCCAGCACTG-T-GCCGCAACTGTAACCGGCTGT-----------------------2------------------------CAGGCCGGAAGTCAGGACGCCTGCCGCGATGTGTTGT CAU_BTUF -397 CATATCGTCGCGGTGAC 25 ------AAGGTGGAA--GCTGGTG 2 AGTCCAGCGCTG-T-GCCGCAACTGTAACCGGTTAG------------------------------------------------AAAGCCGGAAGCCAGGACGCCTGCCGCGATGTGATGA GME_COBU -290 TTTTACGTTCAGGTGCT 16 AGG--TAAAAGGGAA--AAGGGTG 2 ACTCCCTTGCTG--TCCCGCAACTGTGAACGGTGAT 14 GATGCCACTGATCT---- 18 -----CCGGGAAGGCGCG 10 TGATCCGTGAGCCAGGAAACCTGCCTGACCGTCAGCT TDE_CBTF -520 TAGACAAATAAGGTTCT 14 AGAT-TAAAAGGGAA--ACCGGTG 2 AAACCGGCACAGCC-CCCGCTACTGTAATTGAGTTT 34 TAAGCCACTGTTA------------ATATGGGAAGGCGAT 6 TAAATCATAAGTCAGGAGACCTGCCTATTTGTATTAC TDE_ROCG -490 GTTTCGGTCTTGGTGCT 14 AGTG--AAAAGGGAA--TCAGGTG 2 AGTCCTGAGCAG-T-CCGGCTGACGTAAGTGAGAGA 14 AATGCCACTGGTTT----------ATTCCGGGAAGGCGAA 9 TGACCTCCGAGCCGTAAGACCTGCCAATGACTATAAG TDE_BTUF -371 CAAACCATACAGGTGCC 7 GGTT--AAAAGGGAA-GCACGGTG 2 ATTCCGTCACGG-T-CCCGCCGCTGTAAGAGAATAG 13 TATGTCACTCGGGA----------AATCGGGGAAGGCTTA 10 GAAGCTCGAAGTCAGAATACCTGCCTGTAAAGGACTA
B2
Alignment of B12-elements (continued): Actinobacteria, Cyanobacteria, the CFB group, Thermotogales, the Thermus/Deinococcus group, and some others
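The consensus row of the alignment mixes upper-case and lower-case IUPAC codes (for example `YGGGAAGgc` and `gcCACTG`). A minimal sketch of how such a consensus block could be used to scan a sequence is given below. It assumes that upper-case letters are strictly conserved IUPAC codes and that lower-case letters may be any nucleotide; this reading of the case convention, and the function names `consensus_to_regex` and `find_motif`, are illustrative assumptions and not part of the original slides.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate a gap-free consensus block into a regular expression.

    Assumption: upper-case letters are strict IUPAC codes,
    lower-case letters are weakly conserved and match any nucleotide.
    """
    parts = []
    for ch in consensus:
        if ch.isupper():
            parts.append(IUPAC[ch])
        else:
            parts.append("[ACGT]")
    return "".join(parts)


def find_motif(sequence, consensus):
    """Return (start, matched_text) for each non-overlapping match."""
    pattern = re.compile(consensus_to_regex(consensus))
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Fragment of the DHA05379 row of the alignment, gaps removed.
    fragment = "AAACCGGGAAGGTAAG"
    print(find_motif(fragment, "YGGGAAGgc"))
```

A strict regular expression of this kind will miss elements that deviate at a conserved position (some rows of the alignment do), so it is only a first-pass filter; a profile or covariance-model search would be the more tolerant choice.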
Mesorhizobium loti MLO 6 ardX2<>&transp5-6-cobU-btuR><cbiB<>cobD-X; &G1-cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cbiA-cobS-cobT1; cobF; X-cbiP; cobT2<>gene4-transp8; &G1-btuDFC; &ardX-frdX; &metE; &cbiY
Bradyrhizobium japonicum BJA 4 &transp7-cobF-cobT] [cobS-cbiY] [cbiB-cobD><cbiP-btuR<>&G1-cobW-cobN-cobG] [cbiLH><cbiJ] [cbi(ET)-cobE-cbiF-cbiA-cobA]; &btuB; &metE [btuB3-transp4]
Pseudomonas denitrificans # PD 1 cbiP; &cobU-cobW-cobN-btuR-//-cobE-cobA-cbiA-cobD-cbiB; cobF-cobG-cbiCLH><cbiJ<>cbi(ET)-cbiF; cobT<>cobS-gene4-transp8 Sinorhizobium meliloti SM 5 cbiP; &cobU-cobW-cobN-btuR-//-cobE-cobA-cbiA-cobD-cbiB; cobF-cobG-cbiCLH><cbiJ<>cbi(ET)-cbiF; cobT<>cobS-gene4-transp8; &cbiY; &btuFCD; &ardX-
frdX; &transp7 Brucella melitensis BME 4 cbiP-&transp5-6-cobU-cobW-cobN-btuR-//-cobE-cbiF><cbiJ-cbiD<>cobA-cbiA-cobD-cbiB; cbi(ET)<>cobG-cbiCL-cbi(GH); cobT<>cobS-gene4-transp8; &btuFCD2;
&btuBFCD; &nrdHIEF; cbiY Agrobacterium tumefaciens AU 6 cobD<>cbiB><cbiP-btuR<>&G1-cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF><cbiD<>cobA-cbiA; cobU; cobT<>cobS-gene4-transp8; &btuFCD;
&nrdHIEF; &transp5-6; &cbiY; yxjH-ATU04068<>&ATU04066-metR Rhodopseudomonas
palustris RPA 9 cbiY-cobT1<>&G1-cobU-cobW-cobN-btuR-cbiP1><cbiB-btuF<>cobD><btuDC-hoxN&; &metE-ZUR~btuB-cobN-gene2-3;
cobF-cobP2-cobS-cobT2&<>ORF663-&cbiCLH>-<cbiJ<>cbi(ET)-cobE-cbiF-cbiA-cobA><btuF2&; &btuB; &btuB3-transp4; cobC-&gene5 Rhodobacter capsulatus # RC 12 cobSTU<>cobC-bluE-cbiB-cobD-cbiY><cbiP-btuR<>&G1-cobW-cobN-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cobF-&cbiMNQO-ORF663-cbiA;
&btuFCD; &btuD; frdX-ardX&<metH<cbiP2-cbiP3&<>&gene6&btuBFCD-cobX; &exbBD-tonB; &gene5; ?&<>&nrdDG; &oppABCD Rhodobacter sphaeroides # RS 6 [cobT; &X-bluE-cobD; &cbiY; cbiB; cbiP; btuR-X<>&G1-cobW-cobN-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cobF; &transp7-cbiA;
&btuBFCD-cobX; &btuFC] Sphing. aromaticivorans # SAR 3 hoxN-&cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cbiA-cobF-btuR-cbiY-G1-cobT-cobC-cobS-cobD-cbiB-cbiP-cobU; &btuBFCD; &~btuB Rickettsia prowazekii RP 0 no Caulobacter crescentus CO 2 cobT<>gene4-transp8; &btuB-cbiP1; cbiP2; X-btuFCD; &metE
Bordetella pertussis BP 0 metH<>btuBF; btuB3-transp4 Burkholderia pseudomallei BPS 4 cbi(GH)-cbiLC-cobG&<>cbi(ET)-cbiDJF; cbiA-btuR-cobE&<>&hoxN-cobW-cobN--chlID; &btuBCD-cobTSC><btuF-cobD-cbiB-cobU<>cbiP; cbiY Neisseria meningitidis NM 0 no Nitrosomonas europaea NE 1 &btuB-transp3 -btuR><gene2-3-cobN; ~btuB-cobN-gene2-3 Methylobacillus flagellatus # MFL 3 &btuB-transp3-btuR--cbiA-cobN-gene2-3-cbiY-btuF-cobU; cbiP<>cbiB-cobD><cobC-cobST; &btuB3-transp4; &nrdAB Ralstonia eutropha # REU 1 &btuBCD-btuR-cobTS-cobC><btuF-cobD-cbiB-cobU<>cbiP; cbiA; cbiY Ralstonia solanacearum RSO 2 &btuBCD-cobTS-cobC><btuF-cobD-cbiB-cobU<>cbiP; &hoxN-cbiW-cobN-chl(ID)--cbiZ-(cbiX-cbiC)-cbiDLFG-cbi(HJ)-cobW-btuR-cbiA-cbiY
Escherichia coli EC 1 &btuB; btuCDE; X-btuF-X; X-btuR; cobC; cobUST Salmonella typhimurium SY 2 &btuB; btuCDE; X-btuF-X; X-btuR; cobC<>cobD; pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUST Klebsiella pneumoniae # KP 2 &btuB; btuC-//-btuED; X-btuF-X; X-btuR; cobC; cobD; pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobU-cobT2; cobS-cobT1 Yersinia enterocolitica YE 2 &btuB; btuCED; X-btuF-X; X-btuR; pduX-cobD-cobA<-pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUS-cobC-cobT; cobA2-pduX2 Y. pestis;E. carotovora YP,EO 1 &btuB; btuCDE; X-btuF-X; X-btuR Vibrio cholerae VC 1 &btuB; btuCD; X-cbiB-btuF-X; btuR; cbiP-X; cobTSU-cobC Pasteurellaecae HI,VK,AB 0 no Pseudomonas aeruginosa PA 5 &btuB-btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS;chlD-I//-cobN-cobW&<>&transp(5-6)-cobE-cbiF; cbi(GH)-cbiLC-cobG&<>cbi(ET)-cbiDJ;
&bruB2-btuDFC; ZURbtuB3-cobN; -gene2-3--metE; btuB3-transp4 Pseudomonas putida PP 5 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; &cobW-cobN-chlID; &transp5-6-cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ; cobF;
&btuB2-X-&btuFCD; btuF><btuB Pseudomonas fluorescens # PU 3 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; chlDI-cobN-cobW&<>&transp5-6-cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ; cobC2-cobF;
btuB2-btuDFC; btuB3; btuF><btuB] Pseudomonas syringae # PY 4 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; chlDI-cobN-cobW&<>&transp5-6 -cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ; cobF;
&btuFCD; metXY-bruB2-btuDFC Shewanella oneidensis SH 2 &btuB-X-metR<>metE; metH<>cobC-&btuDC--cobTSU-cbiP-btuR; cbiB; X-btuF Azotobacter vinelandii # AV 1 &(btuB-transp3-btuR)-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; btuB; X-btuF Xanthomonas axonopodis XAX 1 &btuB-transp3-transp3-btuR-cbiB-cobD-cbiP-cobUT-cobC-cobS; btuB3-transp4 Xylella fastidiosa XFA 0 no H. pylori, C. jejuni HP,CJ 0 no Magnetococcus # MCO 0 [cbiO-cbiA-cbiF-cbiX-cbiCD-cbi(TE)-cbiLG-cobE-cbiH><cbiP<>cobD; cbiB; X-btuR; cbiY; cobC; cobTS; cobU Geobacter metallireducens # GME 1 &cobUTSC-cbiA-X-cbiMNQO -cbiX-cbiCD-X-cbi(ET)-cbiLFGH-cbiPB-cobD
T/D Deinococcus radiodurans DR 3 XX-cobTS-X; X-cobU; &btuFCD; &btuF-btuR-cbiA-cbiB-cobD-cbiP; X-cobC; &achX-nrdIEF Thermus thermophilus # TQ 3 cbiA-btuR-X-cobST<>cobC-&hoxN-cbiDC-cbi(ET)-cbiLFHG-cbiX-cobA-cbiY-cbiB-XX-cobD--(cbiP-cobU); &btuFCD; &achX-nrdBA Fusobacterium nucleatum FN 2 cbiP1-X-cbiB-X-cobD-cbiA-X-cbiC-//-cbiDE-X-cbiT-//-cbiL-X-cbiF-//-cbiGHJ; btuR; cobUS-cobC-cobT;cbiK; transp11; &btuFCD; &btuBFCD; btuB<>btuFCD
B12-regulon: identification of genes and regulatory elements
B/C
Bacillus subtilis BS 1 &btuFCD-pduO
Bacillus cereus ZC 1 pduO; &metE
Bacillus megaterium # BI 1 [&cbiW-cbi(H-?)-cbiX-cbiJCD-cbi(ET)-cbiLFGA-cobA-cbiY-btuR]
Bacillus halodurans HD 5 &btuFCD-cbiB-cobD-cobU-cbiP-cobS-cobC-~cobU-pduO; &cobT; cbiA; cblX; &nrdAB; &metE; &achX
Bacillus stearothermophilus # BE 3 &btuFCD-cbiB-cobD-cobUS-cobC-~cobU-pduO-cblT-cblX; [&cbiW-cbiMNQO-cbi(H-?)-cbiX-cbiJCD-cbi(ET)-cbiLFG-cbiAP--cobT-cobA-btuR; &nrdAB
Staphylococcus aureus SA 0 btuFC
Listeria monocytogenes LMO 2 pdu-cblX-cobUSC><pocR<>&pduABCDEGHKJL-eutJ-pduMNOPQFW-cobD-pduX; cblT-&cbiABCDETFGHJ-(cobA-hemD)-cbiKLMNQO-cbiP-pduO; btuF
Clostridium acetobutylicum CA 2 &cbiMNQO-cobD-cbiG-pduX-cobT-cbiK-cbiP-cbiACDTLFJH-cobUS-cobC; cbiB -cbiK2-cbiE; &btuFCD-gene5
Clostridium perfringens CPE 4 cobTSUC-cbiB-cobD-&btuFCD-gene7-cbiP; &cbiK-cbiCDETLFG--cbiHJ-btuFCD; btuR-X; &cbiMNQO; &cblT-cblX
Clostridium botulinum # CB 1 &cbiPB-cobD-cbiMNQ--hemC-(cobA-hemD)--hemB-cbiAC; cbiD-cbiE-X--cbiT; cbiLFGHJ-cblT-cbiK-pduX-btuR; cobC; cobU; cobS--cblX; btuFCD-1; btuFCD-2
Clostridium difficile # DF 3 cobTSU-cobC-&cbiP-cbiA-cbiB-cobD-pduX-cbiCDETFGHJ-cbi(LK)-hemC-(cobA-hemD)-hemB-cysGB; &cbiMNQO; &btuFCD--nrdEF
Thermoanaerobacter tengcongensis TTE 2 &btuR-&btuFCD-cbiA-cbiPB-cblT-cblX-cobD-cobUS
Enterococcus faecalis EF 1 &btuFCD-pduV-pduO-pdu cluster
Streptococci (ST, PN, MN, LL) 0 no
Heliobacillus mobilis # HMO 5 &cbiD-cbi(ET)-cbiLFGHJ-cbiX-cbiC-//-cobU-cbiPBA-btuR-cobTS-pduX-cobD-~cobC; cbiK; &cbiM]; &cbiQO; &cblX]; cblT-cbiO; &HMO01408
Desulfitobacterium hafniense DHA 6 cbiD-&cbi(ET)-cbiLFGHJ-cbiX-cbiC-cbiPB--cbiA-btuR-pduX-cobD]; [cobTUS]; cobC]; &btuFCD; &oppAB]; &cblX; [cbiMN]; [cbiQO]; &DHA05379; &nrdD]
Act Corynebacterium glutamicum CGL 0 gene8-cobUTS; pduO; btuCFD Corynebacterium diphtheriae DI 1 gene8-cobUTS; chlID-btuR-cbiA-//-cobA; cbiP; cobF; cbiB-//-cobD; cobN<>cobG-cbiC-cbi(LH)><cbiJ-cbiF-cbi(ET); &btuCDF Mycobacterium tuberculosis MT 2 gene8-cobTS; chl(ID)-btuR-cbiA--cysG; &transp12-cbiP-cobU; cbiB-//-cobD; cobN<>cobG-cbiC-cbi(LH)><metZ-X<>X><cbiJ-cbiF-cbi(ET); &metE Mycobacterium leprae ML 2 gene8-cobTS; btuR-cbiA; &transp12; &metE Thermobifida fusca # TFU 3 gene8-//-cobUS]; cbiB<>cobD-(cbiY-cobT)--cysG; cobF-cobN&<>cobG-cbiC-cbi(LH)><cbiF-cbi(ET)<>&chl(ID)-btuR-cbiA><cbiJ-cbiP-transp10& Rhodococcus str. # RK 4 gene8--cobT]; cobU; [cobD]; &transp10-cbiP]; [cobF];[cobN&<>cobG-cbiC-cbi(LH)];[cbiF];[cbi(ET)<>&chl(ID)];[btuR];[cbiA];[cbiJ><cbiB]; &btuFCD Streptomyces coelicolor SX 7 gene8-//-cobU--cobT-//-cobS; (cbiY-cobT)--cysG; cobF; cbiBP-cobN-chl(ID)-btuR-cbiA-cbiL><SX03279<>cbiF-cbi(ET)-cbi(GH)-cbiX-cobD; cbiC><cbiJ; &pduX-XX;
&cbiMNQO; &btuFCD; &btuCDF; &RSX12454; &nrdAB; &metE Propionicibacterium freudenreichii# PI 3 mutBA&<>&cbiLF-cbi(EGH)-(cbiX-cysGB)><cbiDCTJ; &cbiBP-btuR]; cbiMNQO-cobA
Cya (PMA,CY,SN) 1 cbiJ; cbiF; cbiC; cbiL; cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; cbiX; &hupE Anabaena sp. AN 3 &G1-cbiJ; cbiF; &cobG-cbiCL-X-cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; &G1-btuB-(genW-cbiW)-btuFCD; cbiMNQO T. elongatus TE 2 cbiJ; cbiF; cbiC; cbiL; cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; &cbiX; cbiMNQO; &metE Chloroflexus aurantiacus # CAU 2 &btuF-cbiW-btuCD-genW]; [X-cbiCD-cbi(ET)-cbiLF]; [cbiG]; [(cbiH-cysGB)-cobA-cbiA-cobUD-cbiP; &btuR-cobN-cobT-cbiMNQO-cbiB; cobCS; chl(ID) Chlorobium tepidum CL 4 &btuBF-cbiY-cbiP-cobD-cbiB-btuR-btuCD-X-cobUT-X-cobS;&cbiMNQO-cobA-cbiK-cbiL-cbi(HC)-(ET)-(GF)-(JD); chl(ID)-cobN-&btuB2;cbiA-btuF2; &nrdJ
CFB Porphyromonas gingivalis PG 5 cobUTSC; cbiA-pduO-cbiP-cobD-cbiB; &transp9; &btuB4-cbiK-btuFCD1; cbi(HC)-cbi(ET)-cbi(GF)-cbi(JD); cbiL; &btuFCD2; hmuY-hmuR(~btuB)-cobN-X-gene2-3; &nrdDG; &PG00461-62-63
Bacteroides fragilis # BX 7 cobUTSC><cbiB-cobD-cbiP; cbiA-pduO-//-transp9&<>&btuB4-cbiK-btuB3-transp4-cobN-X-gene2-3-cbi(HC)-cbi(ET)-cbi(GF)-cbi(JD); cbiL<>btuFCD &btuB; ~btuB-cobN-X-gene2-3-&nrdAB; btuFC; ~btuB-&metE; &nrdDG; &BX01357-58-59
Thermotoga maritima TM 1 &btuFCD; btuR Treponema denticola TDE 3 &btuFCD; cbiK-cbiLA; cbiG-cbiF; cbiHJ-btuD3; cbiET; X-cbiC; cbiD; cobUSC; btuR; cbiB; X-cbiP; cobD; chlID-cobN--btuFCD2; &transp11]; &rocG Leptospira interrogans LEP 2 &btuB; &(cbiX-cbiW)-X-frd-cbiDC-cbi(ET)-cbiLG-cbi(H?)-cbiF-btuR-cbiA-cobX-cobU-cbiPB; cobTSC Aquifex aeolicus AA 0 no
Arc Thermoplasma volcanicum TVO 0 cysGB-cobA-cbiCHDTLF-cbi(GE); cbiPB; X-cobT; X-cbiA-XX; cobDS; btuR; cobC; cobX cobY><btuD-X; btuF; btuC Methanosarcina acetivorans MAC 0 cbiTLFGHC; cbiDE; cbiMNQO; cbiA1; cbiA2; cbiP; cobD-cbiB--cobZ-cobS-cobY; btuR; cobT; transp2-gene2-3-cobN; opp-cobN-chlID; btuFC-X-btuD Halobacterium sp. HSL 0 cbiTLFGH-HSL00646-(cbiX-cbiW)-X><chl(ID)<>cobN-cbiC-cbiE><cbiD<>cobT-cbiA-btuR-cbiP-HSL01294-cbiB-cobSYD-cobX; btuF<>btuCD Archaeoglobus fulgidus AG 0 cbiT-cbiMNQO<>cbiLFG-cbi(HC)-cbiDE--cbiX-gene7; cbiB-X><cbiP; cbiA; cobY-X-cobS1; cobS2; ??-cobD-X; cobT; btuCD-XX-btuF Aeropyrum pernix AP 0 cobT-(cobX-cobZ)-cobY-cobD-cobS-cbiB-X-gene7; $~btuFCD btuF<$>mutBA-ygfD Methanopyrus kandleri MK 0 X-cbiC; X-cbiD; cbiT<>cbiL; cbiFGH; cbiE; cbiA1; cbiA2-X; cbiB; cbiP-X; ??-cobT; (cobX-cobZ)-cobS-cobY; ?-cobD; cobN<>cobN2-X-metE Methanococcus jannaschii MJ 0 cbiC; cbiD; cbiE; cbiF; cbiG; X-cbiH; cbiJ-X; cbiL-X; cbiT-X; cbiMNQO; cbiP; cbiA; X-cbiB-X; cobS-X; X-cobT; cobD; cobY; cobZ; cobN; btuF<>btuCD Methanobacterium thermoaut. TH 0 cbiC-X; cbiD; cbiE; cbiF; cbiB-cbiG; cbiH; X-cbiJ; cbiL; cbiT; cbiMNQO; cbiP; cbiA; (cobX-cobZ)-cobS-X; cobD; cobT-X; X-cobY;cobN-transp2-gene2 Pyrobaculum aerophilum PK 0 cbiCHDTLF; cbiGE; X-cobD-cobS-cbiB; cbiA; cobT-(cobX-cobZ)-btuR; Pyrococcus horikoshii PH 0 cobD-cbiB<$>gene7-cobZ--cobS-cobY; $cobT; btuR; $nrdDG; $mutB*-$ygfD-mmcE; $mutB^; $sucS; $btuF; $btuCD Pyrococcus abysii PO 0 cobD-cbiB<$>cobZ--cobS-XX-cobY; gene7; $cobT; btuR; $nrdDG; $mutB*-$ygfD-mmcE; $mutB^; $sucS; $btuF; $btuCD Pyrococcus furiosus PF 0 cobD-cobZ-gene7-cbiB$<//>cobS-cobY><cbiP-X<$>cobT; btuR; $cobS2; $nrdDG; $mutB*-$ygfD-mmcE; $mutB^-X; $sucS; $btuF; $btuCD Sulfolobus tokodaii STO 0 $cbiGECHDTLF; cbiP<>cobS-cbiB; $cobT; X-cobD; cobY; cobC; cbiA; $hoxN; btuF<$>btuCD
Distribution of B12-elements in bacterial genomes
The B12-element regulates cobalamin biosynthetic genes and transporters, cobalt transporters, and a number of other cobalamin-related genes.
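The genome table uses a compact operon notation. A minimal sketch of how one table entry could be parsed is shown below. It assumes that `&` marks a B12-element and that `;` separates independent loci; this reading is inferred from the fact that the number listed for each genome equals the number of `&` symbols in its entry (for example, 1 for Escherichia coli and 2 for Salmonella typhimurium). The helper names are hypothetical.

```python
def split_loci(entry):
    """Split a table entry into loci (assumed to be separated by ';')."""
    return [locus.strip() for locus in entry.split(";") if locus.strip()]


def count_b12_elements(entry):
    """Count B12-elements, assuming each '&' marks one element."""
    return entry.count("&")


def regulated_loci(entry):
    """Return the loci that carry at least one '&' marker."""
    return [locus for locus in split_loci(entry) if "&" in locus]


if __name__ == "__main__":
    # Entries copied from the table rows for E. coli and S. typhimurium.
    ecoli = "&btuB; btuCDE; X-btuF-X; X-btuR; cobC; cobUST"
    salmonella = (
        "&btuB; btuCDE; X-btuF-X; X-btuR; cobC<>cobD; "
        "pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUST"
    )
    for name, entry in [("EC", ecoli), ("SY", salmonella)]:
        print(name, count_b12_elements(entry), regulated_loci(entry))
```

Comparing the computed count with the number printed in the table is a quick consistency check for entries damaged by text extraction.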
[Figure: phylogenetic tree. Leaves are individual B12-elements labeled by genome abbreviation and gene name, e.g. SN_HUPE, AN_COBG, TE_CBIX, PMA_HUPE, RK_BTUF, PI_CBIB.]
Phylogenetic tree of B12-elements
Elements without the B2 domain are shown in gray squares.
The predicted mechanism of the B12-mediated regulation of cobalamin genes
[Figure, panel A: consensus B12-element (helices P0 to P6, pseudoknot) followed by alternative terminator and antiterminator structures, shown with (+Ado-CBL) and without Ado-CBL.]
[Figure, panel B: consensus B12-element (helices P0 to P6, pseudoknot) followed by alternative RBS-sequestor hairpin and antisequestor structures, shown with (+Ado-CBL) and without Ado-CBL.]
Gene cluster | Function | Taxonomic group
1. CBL biosynthesis:
cbi and cob | cobalamin biosynthesis | proteobacteria, the Bacillus/Clostridium group
cbt, hoxN, cbiMNQO, hupE | cobalt transporters | all CBL-synthesizing bacteria
orf1-cobW-cobN-chlID | cobalt chelation | -, -proteobacteria, Pseudomonadaceae, actinobacteria
bluB | cobalt reduction | -proteobacteria
btuR | CBL adenosyltransferase | -, -proteobacteria, Pseudomonadaceae
2. Vitamin B12 transport:
btuB | vitamin B12 receptor | proteobacteria
btuFCD | vitamin B12 transporter (ABC components) | -, -proteobacteria, Pseudomonadaceae, the Bacillus/Clostridium and CFB groups, Deinococcus radiodurans, actinobacteria, spirochetes, Fusobacteriaceae, Thermotogales, Chloroflexaceae
3. B12-dependent or alternative metabolic pathways:
metE | methionine synthase | various groups
nrd | ribonucleotide reductase | various groups
ardX-frdX | predicted enzymes | -proteobacteria
achX | predicted enzymes | Deinococcus radiodurans and some other species
Phylogenetic distribution of gene clusters regulated by B12-elements
B12-independent isozymes of methionine synthase and ribonucleotide reductase are regulated by the B12-elements in the genomes possessing both isozymes (this was not previously known).
B12-dependent and B12-independent isozymes
Ribonucleotide reductases: NrdJ (B12-dependent) | NrdAB/NrdDG (B12-independent)
+ | -
- | +
+ | +
Methionine synthases: MetH (B12-dependent) | MetE (B12-independent)
+ | -
- | +
+ | +
Figure labels: B12, B12, B12
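The pattern stated on this slide can be written as a small lookup. The sketch below is a simplification: it flags the B12-independent isozyme genes as candidates for B12-element regulation only when a genome also carries the B12-dependent counterpart. The slide reports an observed tendency, not a strict rule, and the names `ISOZYME_PAIRS` and `candidates_for_b12_regulation` are hypothetical.

```python
# (B12-dependent genes, B12-independent genes) for each enzyme class,
# as named on the slide.
ISOZYME_PAIRS = {
    "methionine synthase": ({"metH"}, {"metE"}),
    "ribonucleotide reductase": ({"nrdJ"}, {"nrdAB", "nrdDG"}),
}


def candidates_for_b12_regulation(genes):
    """Return B12-independent isozyme genes that co-occur with a
    B12-dependent isozyme of the same enzyme class."""
    genes = set(genes)
    result = {}
    for enzyme, (dependent, independent) in ISOZYME_PAIRS.items():
        present_independent = sorted(genes & independent)
        if genes & dependent and present_independent:
            result[enzyme] = present_independent
    return result


if __name__ == "__main__":
    # Hypothetical gene content, for illustration only.
    print(candidates_for_b12_regulation({"metH", "metE", "nrdAB"}))
```

A genome with only one isozyme of a class yields no candidates for that class, matching the first two rows of each table above.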
[Figure: consensus secondary structure drawn 5' to 3', with base stem P1 and helices P2, P3, P4 and P5.]
Conserved S-box structure
Genome AB Methionine biosynthetic genes MetK Transporters Other genes Bacillales: Bacillus subtilis BS metB; &metI-metC; &metF*; &metE;
&cysH-ylnABCDEF &metK &yusCBA; yusA2 mtnZYXW&mtnV<>mtnU&mtnKS; &yoaD;
yrrT-mtn-yrhAB; rhc; &yxjH1-&yxjH2 Bacillus cereus BC &metY-metB-hom; metC-metI&<>&metF*-metH;
&cysH-ylnBCADEF; metX; metE &metK &yusCBA1-yusA2; &yusCBA3; &yusACB4;
&metT; &mtnABC; &oppBCDFA; &mtsABC mtnZYXW&mtnV<>mtnU&mtnKS; yrrT-mtn-yrhAB; rhc; &mdh; &hmrA
Bacillus halodurans BH metY; metB; &hom; &metI-metC-metF*-metH; metE &metK &yusCBA &BH0835; mtn; rhc Bacillus stearothermophilus # BE [metY; [metB; [&metI-metC; [&metF*-metH &metK &yusCBA; mtnABC mtnZYXW&mtnV<>mtnU&mtnKS;
yrrT-mtn-yrhAB; rhc Oceanobacillus iheyensis OB &metY1; metB; &bhmT;
&X-metY2 &metK &yusCBA1; &X-yusACB2; &yusCBA3;
&yusCBA4-hmrA yrrT-rhc–yrhAB; mtn; &yxjH; &OB1276; OB3079&<>&OB3078; &OB2779-OB2778
Staphylococcus aureus SA &metX; $metI-metC-metF*-metE-mdh &metK &metT; &yusCBA; hcp-mtsABC yrhAB-yusCBA2; rhc<>hmrA; mtn
Listeria monocytogenes LMO &metY-metX; &metE-metI-metC-metF* &metK &yusCBA1; &yusACB2; &oppABC &yxjH; mtn; rhc
Lactobacillales:
Enterococcus faecalis EF no metK $yusCBA1;$$yusCBA2; yusCBA3; $opp; mtsABC $yxjH; rhc; mtn
Lactobacillus plantarum LP $metB-metY-hom; $metI; $metE-metF* metK $yusCBA1; $yusCBA2 $yxjH1; yxjH2; rhc; mtn
Lactobacillus gasseri # LGA no ? $yusACB; hcp-mtsABC $yxjH-rhc; mtn
Lactobacillus casei # LCA $metE-metF metK $yusA1; $yusA2CB $yxjH-rhc; mtn
Lactobacillus delbrueckii # LDB metY; metB-rhc1-yusA2; $rhc2-metE-metF metK $yusACB; hcp-mtsABC mtn
Lactobacillus brevis # LB no ? $yusACB $yxjH; rhc; mtn
Oenococcus oeni # OOE metY; yxjH2-metI-metC-metB metK $yusCBA1; $yusA2-yxjH1-hmrB-yusCB; mtsABC yxjH3; rhc; mtn
Leuconostoc mesenteroides # LME $metB-metI-metC-yxjH1--rhc-$yxjH2-metY; $metF-E metK [yusA1-yusA2-hmrB-yusCB; yusA3; $hcp-mtsABC mdh1; $mdh2 mtn
Pediococcus pentosaceus # PPE no metK $yusCBA $yxjH; rhc; mtn
Streptococcaceae:
Lactococcus lactis LL metY; metB-metI; #metE-metF metK yusA1-A2-A3-A4-yusCB-mtsABC yxjH; rhc; mtn
Streptococcus agalactiae SAG #metE-metF* metK #yusA-hmrB-yusCB-/-mtsABC; #yusA2; yusCBA3 rhc; mtn
Streptococcus mutans MN #metY-mdh; ##metB; metI; ##metE-metF* metK #yusA-hmrB-yusCB-X-#hcp-mtsABC #yxjH; rhc; mtn
Streptococcus pneumoniae PN #metY; #metB; #metI; #metE-metF metK #yusA-hmrB-yusCB; #hcp-mtsABC #fhs; #folD; rhc; mtn
Streptococcus pyogenes ST metC metK #yusACB rhc; mtn
Streptococcus suis # SSU #metY; metB; [metI; #metE-metF metK [yusA-hmrB-yusCB]; [hcp-mtsABC #yxjH; #mdh; rhc
Streptococcus thermophilus # STH metY; ##metB; #metI; ##metE-metF metK #yusA-hmrB-yusCB-X-hcp-mtsABC;yusA2 #yxjH; #mdh; rhc; mtn
Streptococcus uberis # SUB no metK #yusA-hmrB-yusCB; #hcp-mtsABC rhc
Clostridia:
Clostridium acetobutylicum CAC &metY; &metB; &metI-metC; metF*; &metH &metK &yusCBA Tcub iG-yrhBA><&; rhc; mtn
Clostridium perfringens CPE no &metK &metT; mtsABC ubiG-yrhBA-rhc; mtn
Clostridium botulinum CB &metY-hom-metB; &metF-msd-metH^ &metK &metT; &yusCBA1-yusA2; yusCB2 ubiG-yrhB-rhc-yrhA; mtn
Clostridium tetani CTC &metY-metB; -msd-metH^ &metK &metT; &yusCBA
Clostridium difficile # DF &metY-metB; &hom; folD-X-metF; rhc-msd-metH^ &metK &yusCBA1; &yusA2
Thermoanaerobacter tengcongensis TTE &hom-metY-metX; &metF-metH^ &metK
Streptomyces coelicolor SX metF; metE; metH metK yusCBA &SCD95A.26
Thermobifida fusca # TFU metY-metX; metF; metH metK yusCBA &SCD95A.26
Chlorobium tepidum CL &metY-metX; metF; metH metK
Chloroflexus aurantiacus CAU &metY-metX; metF; metH metK
Cytophaga hutchinsonii CHU &metY; metX; metE; metH--metF metK
Thermotogales (TM, PMI) metY-metB; metF-msd-metH^ metK &yusACB (only in Petrotoga miotherma)
Distribution of MET regulatory elements
Methionine metabolism pathway (diagram)
Metabolites: aspartate semialdehyde, homoserine, threonine, O-acetylhomoserine, sulfide, cystathionine, homocysteine, methionine, S-adenosyl-methionine (SAM), S-adenosyl-homocysteine (SAH), S-ribosyl-homocysteine (SRH), MTA, methylthioribose (MTR), THF, methylene-THF, CH3-THF (methyl-THF)
Genes: hom, metX, metY, metI, metC, metB (cysH-...metB), metE, metH, metF, metK, mtn, mtnKSUVWXYZ, yrhA, yrhB, yxjH*
Tree diagram: branches labeled LysT, MetT, TyrT and MleN; marked taxonomic groups: Archaea, clostridia, Pasteurellaceae
Phylogenetic tree of the NhaC Na+:H+ antiporter superfamily including predicted methionine-, lysine- and tyrosine-specific transporters
RFN. Function: riboflavin biosynthesis and transport. Ligand: FMN (flavin mononucleotide). Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, other bacteria.
THI. Function: biosynthesis and transport of thiamin and related compounds. Ligand: thiamin pyrophosphate. Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, other bacteria, archaea (thermoplasmas), plants, fungi.
B12. Function: biosynthesis of cobalamin, transport of cobalt, cobalamin-dependent enzymes. Ligand: adenosylcobalamin. Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, spirochaetes, other bacteria.
S-box. Function: metabolism of methionine and cysteine. Ligand: S-adenosylmethionine (SAM). Distribution: Bacillus/Clostridium group and some other bacteria.
LYS. Function: lysine metabolism. Ligand: lysine. Distribution: Bacillus/Clostridium group, enterobacteria, other bacteria.
G-box. Function: metabolism of purines. Ligand: guanine, adenine. Distribution: Bacillus/Clostridium group and some other bacteria.
Properties of riboswitches
• Direct binding of ligands
• Same structure, different mechanisms
• Distribution in all taxonomic groups (diverse bacteria; archaea: thermoplasmas; eukaryotes: plants and fungi)
• Correlation between the mechanism and taxonomy:
– Attenuation of transcription (anti-anti-terminator) is preferred in the Bacillus/Clostridium group
– Attenuation of translation (anti-anti-sequestor of translation initiation) is preferred in proteobacteria
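The anti-anti-terminator and anti-anti-sequestor logic named in the list above can be read as a chain of mutually exclusive RNA structures. The following is a minimal sketch, assuming a repressing riboswitch in which ligand binding stabilises the aptamer; the names `Riboswitch`, `is_expressed` and `blocking_structure` are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Riboswitch:
    """Hypothetical model of a repressing riboswitch (illustration only)."""
    element: str  # e.g. "RFN" or "THI"
    level: str    # "transcription" or "translation"


def blocking_structure(switch: Riboswitch) -> str:
    """Name the hairpin that switches the gene off at the given level."""
    if switch.level == "transcription":
        return "terminator"
    if switch.level == "translation":
        return "sequestor"
    raise ValueError(f"unknown regulation level: {switch.level!r}")


def is_expressed(switch: Riboswitch, ligand_bound: bool) -> bool:
    """Follow the chain of alternative structures (assumed repressing switch)."""
    # The ligand stabilises the aptamer, which acts as the
    # anti-anti-terminator (or anti-anti-sequestor).
    aptamer_folded = ligand_bound
    # The anti-terminator / anti-sequestor can form only if the aptamer does not.
    anti_hairpin_formed = not aptamer_folded
    # The terminator / sequestor forms only if the anti-hairpin does not.
    blocking_hairpin_formed = not anti_hairpin_formed
    return not blocking_hairpin_formed


if __name__ == "__main__":
    rfn = Riboswitch("RFN", "transcription")
    for ligand in (False, True):
        state = "on" if is_expressed(rfn, ligand) else "off"
        print(f"{rfn.element}: ligand bound={ligand} -> gene {state} "
              f"(blocking hairpin: {blocking_structure(rfn)})")
```

In this sketch the two mechanisms differ only in which hairpin does the blocking (terminator or sequestor of translation initiation), which mirrors the "same structure, different mechanisms" point.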
Some confirmed predictions of metabolite promoting RNA riboswitches.
Structure of RFN-element and RNA riboswitch mechanism. (regulation of riboflavin metabolism and transport genes)
Bacillus subtilis
Winkler et al., 2002b; Mironov et al., 2002
Structure of THI-element and RNA riboswitch mechanism. (regulation of thiamin metabolism and transport genes)
Bacillus subtilis
Escherichia coli
Mironov et al., 2002
Winkler et al., 2002a
Structure of B12-element (regulation of cobalamin metabolism, transport and other cobalamin-related genes)
Escherichia coli; Streptomyces coelicolor
Nahvi et al., 2002; Borovok et al., 2006
• Dmitry Rodionov • Andrei Mironov • Mikhail Gelfand