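Attribution-NonCommercial-NoDerivs 2.0 Korea
Users are free to copy, distribute, transmit, display, perform, and broadcast
this work, provided that they follow the conditions below:
Attribution. You must attribute the work to the original author.
NonCommercial. You may not use this work for commercial purposes.
NoDerivs. You may not alter, transform, or build upon this work.
For any reuse or distribution, you must make clear the license terms applied to this work.
These conditions do not apply if you obtain separate permission from the copyright holder.
Your rights under copyright law are not affected by the above.
This is an easy-to-understand summary of the Legal Code.
Disclaimer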
Thesis for the Degree of Doctor of Philosophy in Medicine
Analysis of the vaginal microbiome
by next-generation sequencing and
evaluation of its performance as a
clinical diagnostic tool in vaginitis
July 2016
Graduate School, Seoul National University
Department of Medicine, Laboratory Medicine Major
A thesis for the Degree of Doctor of Philosophy
Analysis of the vaginal microbiome
by next-generation sequencing and
evaluation of its performance as a
clinical diagnostic tool in vaginitis
July 2016
The Department of Laboratory Medicine
Seoul National University
College of Medicine
Ki Ho Hong
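Analysis of the vaginal microbiome
by next-generation sequencing and
evaluation of its performance as a
clinical diagnostic tool in vaginitis
Advisor
Submitted as a thesis for the Degree of Doctor of Philosophy in Medicine
April 2016
Graduate School, Seoul National University
Department of Medicine, Laboratory Medicine Major
This thesis for the Degree of Doctor of Philosophy in Medicine is approved
June 2016
Chairman (seal)
Vice chairman (seal)
Member (seal)
Member (seal)
Member (seal)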
Analysis of the vaginal microbiome
by next-generation sequencing and
evaluation of its performance as a
clinical diagnostic tool in vaginitis
by
Ki Ho Hong
A thesis submitted to the Department of Laboratory
Medicine in partial fulfillment of the requirements for the
Degree of Doctor of Philosophy in Medicine at Seoul
National University College of Medicine
June 2016
Approved by Thesis Committee:
Professor Chairman
Professor Vice chairman
Professor
Professor
Professor
ABSTRACT
Introduction: Changes in the vaginal microbiome are associated with vaginal
symptoms and diseases. These changes are usually identified through
microscopic examination and microbiological culture. Next-generation
sequencing (NGS) can detect many more microorganisms in the vaginal
microbiome than these traditional methods. Several studies have analyzed
vaginal microbiomes with NGS; however, short read lengths and the
exclusion of microorganisms other than bacteria are common limitations of
these studies. The aim of this study was to analyze the vaginal microbiomes of
Korean women using NGS with long read lengths and the inclusion of
bacteria as well as other microorganisms. This study also compared NGS with
other assays and evaluated its feasibility for vaginitis prediction.
Methods: In total, 89 vaginal swab specimens were collected. Of these, 67
specimens were examined by microscopy of Gram-stained smears and by
microbiological culture. A GS Junior (454 Life Sciences, Branford, CT, USA)
system was used for NGS. The 16S rRNA, internal transcribed spacer (ITS),
and Tvk genes were used to detect bacteria, fungi, and Trichomonas vaginalis,
respectively.
Data processing, operational taxonomic unit (OTU) table construction, and
chimeric sequence removal were performed with Usearch software.
Taxonomic assignment was performed using the Ribosomal Database Project
(RDP) website and Basic Local Alignment Search Tool (BLAST) database. A
DNA probe assay for Candida spp., Gardnerella vaginalis, and T. vaginalis
was performed.
Results: In total, 202,958 reads of the 16S rRNA gene and 7,600 reads of the
internal transcribed spacer (ITS) gene were obtained from NGS of 89
specimens. ITS sequences were obtained in the majority of specimens (56.2%).
The 16S rRNA sequences and ITS sequences were clustered into 3,259 and
112 OTUs, respectively. The compositions of the intermediate Nugent score
group and vaginitis Nugent score group differed from those of the normal
score group; however, they were similar to each other. Shannon diversity
indices, the number of species, and the fraction of Lactobacillus spp. were
significantly different among the three groups. Various diversity-based
predictors derived from the NGS data were evaluated for predicting vaginitis,
and the fraction of Lactobacillus spp. had the highest area under the curve
(AUC) value (0.8559). NGS and the DNA probe assay showed good agreement. NGS
and microbiological culture showed 73.1% agreement (range, 86.2–89.7%).
Conclusions: The intermediate Nugent score group and vaginitis group were
not significantly different in the microbiome analysis. ITS sequences were
common in normal specimens. NGS is a promising tool for examining vaginal
microbiomes and diagnosing vaginitis.
---------------------------------------------------------------------------------------------
Keywords: Vaginal microbiome, Next generation sequencing, Vaginitis
Student number: 2012-2172
CONTENTS
Abstract
Contents
List of tables
List of figures
List of abbreviations
1. Introduction
2. Materials and Methods
2.1. Specimens
2.2. Next generation sequencing
2.3. Processing sequence data, building an operational taxonomic unit (OTU) table and assigning taxonomies
2.4. DNA probe assay
2.5. Statistical analysis
3. Results
3.1. Sequence read statistics
3.2. Operational taxonomic unit (OTU) statistics and taxonomic allocation of 16S rRNA sequences
3.3. Operational taxonomic unit (OTU) statistics and taxonomic allocation of internal transcribed spacer (ITS) sequences
3.4. Diversity calculation
3.5. Cluster analysis of the microbiome compositions
3.6. Association between diversity predictors and Nugent scores
3.7. The comparison of various predictors of diversity as a diagnostic criterion of vaginitis
3.8. Comparison of next generation sequencing (NGS), DNA probe assay and microbiological culture in the detection of Candida spp. and Gardnerella vaginalis
3.9. Comparison of next generation sequencing (NGS) and microbiological culture
4. Discussion
References
Abstract in Korean
LIST OF TABLES
Table 1. Results of clustering, taxonomic scores and read lengths
Table 2. BLAST results in operational taxonomic units (OTU) with similarity scores of less than 0.5
Table 3. Distribution of taxonomic rank after taxonomic allocation
Table 4. Taxonomic results of internal transcribed spacer (ITS) gene sequencing
Table 5. Total reads and positive specimens in the internal transcribed spacer (ITS) gene taxonomy
Table 6. Results of clustering, the number of taxonomy and Shannon diversity index
Table 7. The most abundant taxa in the three Nugent score groups
Table 8. Comparison of next generation sequencing (NGS), DNA probe assay and microbiological culture for detection of Candida spp. and Gardnerella vaginalis
Table 9. Discordant results between next generation sequencing (NGS) and microbiological culture
LIST OF FIGURES
Figure 1. Summary of processing sequence data, building an operational taxonomic unit (OTU) table and assigning taxonomies
Figure 2. Heatmap of microbiome of 89 vaginal swab specimens
Figure 3. Association between diversity predictors and Nugent score groups
Figure 4. Receiver-operating characteristic (ROC) curves of eleven predictors of diversity and three vaginitis criteria
LIST OF ABBREVIATIONS
ANOVA analysis of variance
AUC area under curve
BLAST Basic Local Alignment Search Tool
bp base pair
CI confidence interval
κ Cohen’s kappa index
ITS internal transcribed spacer
NGS next generation sequencing
OTU operational taxonomic unit
RDP Ribosomal Database Project
ROC receiver operating characteristic
rRNA ribosomal RNA
SPA simple percent agreement
1. INTRODUCTION
The vaginal microbiome contains the largest number of microorganisms of
all the microbiomes in the human female reproductive system. The vaginal
microbiome of a healthy woman usually comprises 4–12 species, with
Lactobacillus being the most abundant genus. Under certain conditions, the
composition of the vaginal microbiome can change. The total number of species
may increase, and microorganisms other than Lactobacillus spp., such as
anaerobic bacteria, fungi, and protozoa, overgrow. This change is often
accompanied by symptoms in the host, such as pain, abnormal vaginal discharge,
and odor. These conditions are associated with infertility, preterm delivery,
and pelvic inflammatory disease (1-3).
Currently, changes in the vaginal microbiome are usually identified by
microscopic examination and culture of vaginal swabs. Polymerase chain reaction
(PCR) and nucleic acid hybridization methods are also used. Recently,
next-generation sequencing (NGS) has been applied to the study of microbiomes.
NGS is massively parallel sequencing that has enabled the study of the human
microbiome. It can identify many more microorganisms than microscopy or
culture.
Several studies have analyzed the vaginal microbiome using NGS (4-9).
However, many of these studies have limitations, including short read lengths
and the exclusion of fungi. In this study, the vaginal microbiomes of Korean
women were analyzed by NGS with long read lengths and the detection of
both bacteria and fungi. The potential for using NGS data as a clinical
diagnostic tool to predict vaginitis was estimated, and the NGS data were
compared with those of other assays.
2. MATERIALS AND METHODS
2.1. Specimens
Sixty-nine vaginal swab specimens were collected from sixty-five patients
who visited the gynecological clinics of Seoul National University Hospital,
Seoul, Republic of Korea, for vaginal symptoms between December 2011 and
March 2012. Twenty additional vaginal swab specimens were collected from a
healthy control group. Among the 69 specimens from patients, sixty-seven
specimens had results from microscopic examination of Gram-stained smears
and microbiological cultures. Nugent scores of these 67 specimens were
calculated on the basis of Gram stain results (10). The specimens were
grouped into three categories according to Nugent scores: normal (scores 0–3),
intermediate (scores 4–6), and vaginitis (scores 7–10). The specimens were
cultured at 37°C and 5% CO2 for two days. The cultured organisms were
identified using VITEK II ID Cards (bioMérieux, Marcy-l'Étoile, France) and
MicroScan Pos ID Panels (Beckman Coulter, Brea, CA, USA). Lactobacillus
sp. was reported as normal vaginal flora or Lactobacillus spp.; species names
were not reported. This study was approved by the Institutional Review Board
of Seoul National University Hospital (IRB Number: H-1510-073-711).
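The grouping rule described above can be expressed as a short function. The following is only an illustrative Python sketch; the thesis does not describe any code for this step, and the function name `nugent_group` is hypothetical.

```python
def nugent_group(score: int) -> str:
    """Map a Nugent score (0-10) to the three groups used in this study.

    Cut-offs follow the text: normal 0-3, intermediate 4-6, vaginitis 7-10.
    """
    if not 0 <= score <= 10:
        raise ValueError(f"Nugent score must be between 0 and 10, got {score}")
    if score <= 3:
        return "normal"
    if score <= 6:
        return "intermediate"
    return "vaginitis"
```

For example, `nugent_group(5)` places a specimen in the intermediate group, while a score outside 0–10 is rejected rather than silently assigned.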
2.2. Next generation sequencing
PrepMan Ultra Sample Preparation Reagent (Thermo Fisher Scientific,
Waltham, MA, USA) was used for nucleic acid extraction following the
manufacturer’s instructions. Sequencing of bacteria, fungi, and T. vaginalis
was performed on extracted nucleic acid. For bacteria, the V3–V5 region of
the 16S rRNA gene was the target. Primers 357F and 926R were used for
bacterial identification, with an expected amplicon size of 570 bp (12). For
fungi, the internal transcribed spacer (ITS) gene was the target. Primers ITS-5
and ITS-4 were used, yielding a 700-bp amplicon (13). For T. vaginalis, the
Tvk gene was the target. Primers TVK3 and TVK7 were used, with an expected
amplicon size of 263 bp (14). The GS Junior System (454 Life Sciences, Branford,
CT, USA) was used for NGS following the manufacturer’s instructions. The
minimum sequence length was 150 bp and the minimum exponential quality
score was 20. Only those sequences that fulfilled these two minimum criteria
were included in further analyses.
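The two inclusion criteria can be illustrated with a minimal Python sketch. It assumes that the quality criterion is applied to the mean per-base Phred score of a read, which the text does not specify; the names `MIN_LENGTH`, `MIN_QUALITY`, and `passes_filter` are hypothetical and do not represent the software actually used.

```python
from statistics import mean

MIN_LENGTH = 150   # minimum sequence length in bp, as stated in the text
MIN_QUALITY = 20   # minimum quality score, as stated in the text


def passes_filter(sequence: str, qualities: list[int]) -> bool:
    """Return True if a read meets both minimum criteria.

    Assumption: the quality criterion is applied to the mean per-base
    Phred score of the read (the thesis does not state this explicitly).
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    if not sequence:
        return False
    return len(sequence) >= MIN_LENGTH and mean(qualities) >= MIN_QUALITY
```

A per-base or sliding-window criterion would be an equally plausible reading; only the two thresholds themselves are taken from the text.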
2.3. Processing sequence data, building an operational
taxonomic unit (OTU) table and assigning taxonomies.
A schematic diagram of data processing and further analyses is shown in
Figure 1. Usearch for Windows software (version 6.0.203) was used for
sequence processing, clustering, and removing chimeric sequences (15, 16).
The similarity threshold for clustering of two sequences into the same
operational taxonomic unit (OTU) was 0.97. For the removal of chimeric
sequences, both de novo and reference modes were used. The de novo mode
identifies possible chimeric sequences from the initial sequences and
compares them with the given sequences again. The reference mode finds the
chimeric sequences from a database of previously reported chimeric
sequences. When a sequence was identified as a chimeric sequence in both the
de novo and reference modes of Usearch, the sequence was regarded as a true
chimeric sequence and was removed from the OTU. Two open source
chimeric sequence databases for bacteria and fungi were used in the reference
mode (16, 17). However, there is no known chimeric sequence database for T.
vaginalis.
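The rule that a sequence is removed only when both modes flag it amounts to taking the intersection of the two result sets. A minimal Python sketch follows; the function names and the idea of passing flagged sequence identifiers as collections are assumptions for illustration and are not part of Usearch.

```python
def confirmed_chimeras(de_novo_flagged, reference_flagged):
    """Return identifiers flagged as chimeric in BOTH modes."""
    return set(de_novo_flagged) & set(reference_flagged)


def remove_confirmed_chimeras(sequence_ids, de_novo_flagged, reference_flagged):
    """Drop only the sequences confirmed as chimeric by both modes.

    Sequences flagged by a single mode are kept, and input order is preserved.
    """
    chimeras = confirmed_chimeras(de_novo_flagged, reference_flagged)
    return [seq_id for seq_id in sequence_ids if seq_id not in chimeras]
```

Requiring agreement between the two modes is the more conservative choice: it removes fewer sequences than either mode alone, at the cost of possibly retaining some chimeras detected by only one of them.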
The online Ribosomal Database Project (RDP) database (version 10.3.2)
was used for matching and aligning the 16S rRNA sequences (18). Among the
RDP database sequences, near-full length (≥1,200 bp) sequences of good
quality were used for matching. After comparison with RDP sequences, each
OTU was matched with the single bacterial taxon with the highest similarity
score. If the similarity score was low (< 0.5), the sequence was analyzed again
using the nucleotide Basic Local Alignment Search Tool (BLAST) database
(19). The ITS and Tvk sequences were also analyzed by BLAST, since these
sequences are not included in the RDP database (20). PermutMatrix 1.9.3 was
used for drawing a heatmap to visualize the taxonomy (21). The Shannon
diversity index was calculated to estimate alpha diversity (22).
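For reference, the Shannon diversity index of a specimen is H' = −Σ p_i ln p_i, where p_i is the fraction of reads assigned to taxon i. The sketch below is a minimal Python illustration; it assumes the natural logarithm and read counts per taxon as input, neither of which is stated explicitly in the text, and the function name `shannon_index` is hypothetical.

```python
from math import log


def shannon_index(counts):
    """Shannon diversity index H' = -sum(p_i * ln(p_i)) over taxa.

    `counts` holds the number of reads per taxon for one specimen.
    Taxa with zero reads contribute nothing to the sum.
    Assumption: natural logarithm.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)
```

A specimen in which all reads belong to a single taxon has an index of zero, and the index increases as reads are spread more evenly over more taxa.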
Figure 1. Summary of processing sequence data, building an
operational taxonomic unit (OTU) table and assigning
taxonomies.
Sequencing output from GS Junior (FASTA format)
→ Clustering of sequences and producing OTUs (similarity between two sequences > 97%)
→ Filtering of chimeric sequences (de novo mode, reference mode)
→ OTU table for bacteria, fungi, and Trichomonas vaginalis
→ Assignment of taxonomy using databases (bacteria: RDP; fungi and Trichomonas: BLAST)
→ Heatmap generation, alpha diversity calculation
Abbreviations: BLAST, Basic Local Alignment Search Tool; NGS, next
generation sequencing; OTU, operational taxonomic unit; and RDP,
Ribosomal Database Project.
2.4. DNA probe assay
BD Affirm VPIII Microbial Identification Test (Becton Dickinson, NJ, USA)
is a direct specimen DNA probe-based diagnostic test for the detection of
Candida spp., Gardnerella vaginalis, and T. vaginalis. The VPIII test was performed
following the manufacturer’s instructions. In total, 87 specimens were tested
with this DNA probe assay. Two specimens could not be tested because of
small volumes.
2.5. Statistical analysis
Stata version 13.1 (StataCorp, TX, USA) was used for the statistical analysis.
A paired t-test and one-way analysis of variance (ANOVA) were performed to
compare the Shannon diversity indices among specimens. Pearson’s chi-
square and Cohen’s kappa index were calculated for a comparison of NGS,
DNA probe assay, and microbiological culture results. A receiver operating
characteristic (ROC) curve analysis was performed to evaluate the various
predictors of vaginitis. P values of less than 0.05 were considered statistically
significant.
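The agreement and ROC statistics named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Stata procedures used in the study: the function names are hypothetical, and the AUC is computed through its rank-based interpretation (the probability that a randomly chosen positive case has a higher score than a randomly chosen negative case, with ties counted as one half).

```python
from collections import Counter


def percent_agreement(a, b):
    """Simple percent agreement (SPA) between two paired result lists."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve from the scores of the two groups."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = sum((p > q) + 0.5 * (p == q)
               for p in positive_scores for q in negative_scores)
    return wins / (len(positive_scores) * len(negative_scores))
```

For a predictor such as the fraction of Lactobacillus spp., where lower values are expected in vaginitis, the score passed to `roc_auc` would be one minus the fraction so that higher scores indicate disease.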
3. RESULTS
3.1. Sequence read statistics
After the removal of chimeric sequences, 202,958 reads of the 16S rRNA gene
and 7,600 reads of the ITS gene were obtained from 89 specimens (Table 1).
No Tvk gene read was obtained in the specimens. NGS of the Tvk gene was
performed on cultured T. vaginalis, and NGS correctly identified samples
with T. vaginalis. Therefore, T. vaginalis was presumed not to be present in
any of the clinical specimens. 16S rRNA reads were obtained from all 89
specimens, and the average number of reads per specimen was 2,280 (range,
13–6,825). ITS reads were obtained from 50 specimens, and the average
number of ITS reads per positive specimen was 152 (range, 1–3,612). The
average size of reads was 364 bp (range, 151–580) for the 16S rRNA gene and
322 bp (range, 180–566) for the ITS gene.
Table 1. Results of clustering, taxonomic scores and read
lengths.
Gene       Taxonomic score     OTUs   Total reads   Read length (bp)
                                                    Minimum   Average   Maximum
16S rRNA   Similarity score
           0.000-0.299           10         3,612       265       385       541
           0.300-0.499           21         3,335       163       372       566
           0.500-0.699          343        18,019       164       380       580
           0.700-0.899        1,598        25,545       151       353       561
           0.900-1.000        1,287       152,447       154       374       537
ITS        Percent identity
           <95.0%                11           928       190       350       566
           95.0-96.9%            20           152       180       315       512
           97.0-98.9%            43         6,422       184       321       509
           99.0-100%             38            98       180       319       478
Sum                           3,371       210,558       151       363       580
Abbreviations: bp, base pair; ITS, internal transcribed spacer; OTU,
operational taxonomic unit; and rRNA, ribosomal RNA.
3.2. Operational taxonomic unit (OTU) statistics and
taxonomic allocation of 16S rRNA sequences
After clustering, 202,958 reads of 16S rRNA sequences were clustered into
3,259 OTUs (Table 1). After matching the 16S rRNA sequences with those in
the RDP database, each OTU was allocated to a single taxon with the highest
similarity. Fifty-one sequences with similarity scores of less than 0.5 were
re-matched using the nucleotide BLAST database. Twenty-one of these did not
match any sequence with more than 97% identity and were excluded from further
evaluation. BLAST results for the remaining 30
sequences are shown in Table 2. Of these, 20 sequences were identified as
human DNA by BLAST, and these sequences were excluded from further
evaluation. Ten sequences that were identified by BLAST were included in
later analyses. Finally, 645 OTUs were identified to the species level, and
2,451 OTUs were identified to the genus level (Table 3).
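The triage of low-scoring OTUs described in this section can be summarized as a decision function. The Python sketch below is illustrative only: the function name `triage_otu`, its parameters, and the returned labels are hypothetical, while the thresholds (similarity score of 0.5, more than 97% identity) and the exclusion of human sequences are taken from the text.

```python
def triage_otu(rdp_taxon, rdp_score, blast_taxon=None, blast_identity=None):
    """Decide how an OTU is handled after RDP matching.

    Returns ("keep", taxon) or ("exclude", None).
    """
    # OTUs with an adequate RDP similarity score keep their RDP taxon.
    if rdp_score >= 0.5:
        return ("keep", rdp_taxon)
    # Low-scoring OTUs need a BLAST match with more than 97% identity.
    if blast_taxon is None or blast_identity is None or blast_identity <= 97.0:
        return ("exclude", None)
    # Sequences identified as human DNA are excluded.
    if blast_taxon == "Homo sapiens":
        return ("exclude", None)
    return ("keep", blast_taxon)
```

Applied to rows of Table 2, an OTU matched by RDP to Bacteroides pyogenes with a score of 0.208 and by BLAST to Lactobacillus sp. at 97.51% identity is kept as Lactobacillus sp., whereas an OTU whose BLAST match is Homo sapiens is excluded.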
Table 2. BLAST results in operational taxonomic units
(OTU) with similarity scores of less than 0.5.
RDP result Similarity score* BLAST result Percent identity† Read length (bp)
Bacteroides pyogenes 0.208 Lactobacillus sp. 97.51 401
Euryarchaeota 0.188 Lactobacillus sp. 98.40 437
Prevotellaceae 0.215 Lactobacillus sp. 99.13 345
Prevotellaceae 0.234 Lactobacillus sp. 99.30 287
Porphyromonadaceae 0.228 Streptococcus anginosus 98.59 213
Gammaproteobacteria 0.241 Streptococcus anginosus 100.00 219
Gammaproteobacteria 0.213 Candida albicans 98.86 416
Clostridiales 0.245 Candida albicans 98.65 269
Planomicrobium 0.257 Candida albicans 98.05 333
Prevotellaceae 0.286 Glomus mosseae 97.53 263
Bacteroidetes 0.234 Homo sapiens 98.05 256
Methanobrevibacter 0.201 Homo sapiens 98.96 385
Bacteroidetes 0.238 Homo sapiens 99.59 244
Bacteroides pyogenes 0.205 Homo sapiens 98.68 379
Conexibacter 0.217 Homo sapiens 98.23 226
Bacteroides pyogenes 0.208 Homo sapiens 97.61 377
Thiomargarita namibiensis 0.186 Homo sapiens 99.74 379
Pelagibius 0.206 Homo sapiens 98.38 371
Bacteroides pyogenes 0.198 Homo sapiens 99.00 400
Porphyromonadaceae 0.236 Homo sapiens 98.66 224
Bacteroides coprocola 0.397 Homo sapiens 100.00 163
Bacteroides pyogenes 0.217 Homo sapiens 98.29 351
Bacteroides pyogenes 0.218 Homo sapiens 98.39 309
Porphyromonadaceae 0.229 Homo sapiens 98.05 257
Bacteroidetes 0.224 Homo sapiens 98.47 259
Archaeoglobus 0.249 Homo sapiens 97.16 176
Porphyromonadaceae 0.255 Homo sapiens 98.53 272
Bacteroides pyogenes 0.216 Homo sapiens 98.60 356
Gammaproteobacteria 0.274 Homo sapiens 97.97 196
Bacteroides pyogenes 0.195 Homo sapiens 98.48 396
BLAST matches with high similarity are expressed as percent identity.
*: score from RDP; †: percent identity from BLAST.
Abbreviations: bp, base pair; BLAST, Basic Local Alignment Search Tool;
OTU, operational taxonomic unit; and RDP, Ribosomal Database Project.
Table 3. Distribution of taxonomic rank after taxonomic allocation.
Gene       Taxonomic rank   Total reads    OTUs   Taxa   Read length (bp)
                                                         Average   Maximum   Minimum
16S rRNA   Domain                     5       4      1       399       416       374
           Class                      6       6      3       440       479       411
           Phylum                    14       5      3       403       485       337
           Order                     98      27      7       371       494       247
           Family                 5,451     121     18       376       525       160
           Genus                106,782   2,451    118       364       580       151
           Species               90,602     645    157       363       541       158
ITS        Kingdom                    2       1      1       217       217       217
           Family                    14       5      3       321       458       208
           Genus                    115      29      9       273       392       180
           Species                7,469      77     18       342       566       180
Total                           210,558   3,371    338       363       580       151
Abbreviations: bp, base pair; ITS, internal transcribed spacer; OTU, operational taxonomic unit; and rRNA, ribosomal RNA.
3.3. Operational taxonomic unit (OTU) statistics and
taxonomic allocation of internal transcribed spacer (ITS)
sequences
After clustering, 7,600 reads of ITS sequences were clustered into 112 OTUs.
After taxonomic allocation, 77 OTUs were identified to the species level and
29 OTUs were identified to the genus level (Table 3). Candida spp. accounted
for the highest total number of reads, while Phialemonium curvatum was
detected in the highest number of specimens (Table 4, 5).
In both the 16S rRNA and ITS taxonomies, the average read lengths associated
with taxonomic groups with a low similarity score (16S rRNA) and percent
identity (ITS) were similar to those of taxonomic groups with a high similarity
score and percent identity; therefore, similarity/identity appears to be
independent of read length (Table 1). Similarly, the read lengths did not differ
by taxonomic level (Table 3).
Table 4. Taxonomic results of internal transcribed spacer (ITS) gene sequencing.
Specimen Taxonomy Taxonomic rank Total reads Nugent score Fungi in culture Candida in DNA probe
101 Trichomeriaceae Family 7 1 Negative Negative
102 Cryptococcus mangaliensis Species 4 2 Negative Negative
Malassezia restricta Species 1
Nectriaceae Species 4
104 Malassezia Genus 2 0 Negative Negative
Malassezia restricta Species 2
105 Lophiostoma Genus 3 6 Negative Negative
Phialemonium curvatum Species 2
106 Cladosporium silenes Species 3 5 Negative Positive
Malassezia Genus 8
Phialemonium curvatum Species 6
Saccharomyces cerevisiae Species 13
Saccharomyces pastorianus Species 12
111 Epicoccum Genus 46 8 Negative Negative
Trichomeriaceae Family 4
114 Malassezia Genus 1 4 Negative Negative
Phialemonium curvatum Species 7
117 Candida glabrata Species 4 0 Negative Positive
Cladosporium Genus 2
120 Cryptococcus Genus 4 8 Negative Negative
Phialemonium curvatum Species 2
123 Candida albicans Species 7 0 Candida albicans Positive
Thanatephorus cucumeris Species 2
124 Malassezia Genus 1 8 Negative Negative
126 Malassezia restricta Species 1 5 Negative Negative
129 Phialemonium curvatum Species 5 4 Negative Negative
130 Phialemonium curvatum Species 4 0 Negative Negative
131 Candida albicans Species 13 0 Negative Negative
132 Phialemonium curvatum Species 11 0 Negative Negative
133 Candida Genus 1 0 Candida lusitaniae Positive
Candida albicans Species 3,611
134 Alternaria alternata Species 15 0 Negative Negative
Aspergillus ochraceus Species 3
Kazachstania telluris Species 2
Phoma Genus 33
136 Malassezia Genus 1 2 Negative Negative
Malassezia globosa Species 2
140 Phialemonium curvatum Species 1 8 Negative Negative
141 Malassezia Genus 3 8 Negative Negative
142 Phialemonium curvatum Species 1 2 Negative Negative
146 Rhodotorula Genus 1 0 Negative Negative
201 Malassezia Genus 1 0 Negative Negative
Phialemonium curvatum Species 1
202 Phialemonium curvatum Species 2 0 Negative Negative
205 Phialemonium curvatum Species 1 6 Negative Negative
206 Ambrosia artemisiifolia Species 1 0 Negative Negative
Ascomycota Family 1
Phialemonium curvatum Species 1
207 Dendronephthya Genus 1 3 Negative Negative
Erythrobasidiaceae Family 1
208 Candida albicans Species 40 2 Negative Negative
Malassezia Genus 5
Malassezia restricta Species 1
210 Candida albicans Species 2 4 Negative Negative
211 Malassezia restricta Species 1 0 Negative Negative
213 Candida albicans Species 11 8 Candida albicans Positive
217 Phialemonium curvatum Species 3 0 Negative Negative
220 Phialemonium curvatum Species 5 1 Negative Negative
224 Stereum ostrea Species 3 4 Negative Negative
225 Candida Genus 1 4 Negative Positive
Candida albicans Species 3,523
Phialemonium curvatum Species 3
Unclassified eukaryote Genus 1
226 Candida albicans Species 1 No data Negative
228 Phialemonium curvatum Species 3 No data Negative
229 Candida albicans Species 1 No data Negative
231 Fusarium Species 2 No data Negative
232 Candida albicans Species 1 No data Negative
233 Phialemonium curvatum Species 9 No data Negative
Phyllodistomum folium Species 2
Unclassified fungi Kingdom 2
234 Candida glabrata Species 5 No data Positive
236 Candida albicans Species 1 No data Negative
Phialemonium curvatum Species 2
237 Phialemonium curvatum Species 1 No data Negative
239 Malassezia Genus 1 No data Negative
Phialemonium curvatum Species 6
241 Malassezia Genus 2 No data Negative
242 Candida glabrata Species 1 No data Negative
243 Candida glabrata Species 94 No data Positive
Malassezia restricta Species 1
246 Unclassified fungi Kingdom 1 No data Negative
Table 5. Total reads and positive specimens in the internal
transcribed spacer (ITS) gene taxonomy.
*: genus or higher taxonomic rank.
Taxonomy* Total reads Number of positive specimens
Candida 7,214 13
Nakaseomyces 104 4
Phialemonium 76 21
Epicoccum 46 1
Malassezia 34 18
Phoma 33 1
Saccharomyces 25 2
Alternaria 15 1
Trichomeriaceae 11 2
Ascomycota 1 1
Cryptococcus 8 2
Cladosporium 5 2
Fusicolla 4 1
Aspergillus 3 1
Stereum 3 1
Lophiostoma 3 1
Fusarium 2 1
Kazachstania 2 1
Phyllodistomum 2 1
Thanatephorus 2 1
3.4. Diversity calculation
Two Shannon diversity indices were calculated for each specimen. First, both
16S rRNA and ITS sequences were included in the calculations. Second, only
16S rRNA sequences were included. The mean Shannon diversity index of the
16S rRNA and ITS genes combined was 1.4137 (95% CI: 1.2414–1.5859) and that of
the 16S rRNA gene alone was 1.3792 (95% CI: 1.2053–1.5530). The difference was
determined to be statistically significant by a paired t-test (p = 0.0005). Table
6 shows raw data for clustered OTUs, the numbers of taxa and Shannon
diversity indices.
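The paired comparison of the two indices rests on the per-specimen differences between them. As a minimal sketch, the test statistic of a paired t-test can be computed as below; this is not the Stata procedure used in the study, the function name `paired_t_statistic` is hypothetical, and the p value (which requires the t distribution) is deliberately left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(x, y):
    """Return (t, degrees of freedom) for a paired t-test of x versus y.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the paired differences.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1
```

As a usage example, the first values of the two index columns in Table 6 (0.8497, 1.8654, 2.0012 for 16S rRNA + ITS and 0.8395, 1.8469, 1.9870 for 16S rRNA alone) could be passed as `x` and `y`; the reported result is based on all specimens, not on such a subset.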
Table 6. Results of clustering, the number of taxonomy and Shannon diversity index.
Specimen | Total reads | OTU | Fraction of reads (16S rRNA, ITS, Lactobacillus) | Number of taxa, 16S rRNA + ITS (Total†, ≥0.1%, ≥1%, ≥5%) | Number of taxa, 16S rRNA (Total*, ≥0.1%, ≥1%, ≥5%) | Shannon diversity index (Total, 16S rRNA)
101 4,528 27 99.8% 0.2% 99.8% 9 8 3 1 8 7 3 1 0.8497 0.8395
102 2,856 64 99.7% 0.3% 77.0% 11 10 5 4 7 7 5 4 1.8654 1.8469
103 2,535 40 100.0% 0.0% 99.6% 11 3 2 1 11 3 2 1 1.1459 1.1459
104 3,154 19 99.9% 0.1% 99.8% 4 1 1 1 2 1 1 1 0.3832 0.3726
105 1,995 63 99.7% 0.3% 63.3% 26 22 4 2 24 20 4 2 2.0012 1.9870
106 2,354 29 98.2% 1.8% 0.0% 16 9 2 1 9 3 2 1 0.4112 0.2959
107 1,875 64 100.0% 0.0% 21.4% 21 16 6 5 21 16 6 5 2.0576 2.0576
108 3,860 26 100.0% 0.0% 99.8% 5 1 1 1 5 1 1 1 1.1134 1.1134
110 2,702 125 100.0% 0.0% 0.0% 24 15 8 2 24 15 8 2 2.7955 2.7955
111 767 42 93.5% 6.5% 0.0% 31 31 12 5 29 29 11 6 2.2287 2.0934
113 25 11 100.0% 0.0% 0.0% 11 11 11 5 11 11 11 5 1.9997 1.9997
114 255 28 96.9% 3.1% 1.2% 24 24 10 5 22 22 9 5 2.2934 2.2115
115 810 21 100.0% 0.0% 94.9% 18 18 4 2 18 18 4 2 1.1504 1.1504
116 2,717 23 100.0% 0.0% 100.0% 5 5 4 4 5 5 4 4 1.6336 1.6336
117 3,401 29 99.8% 0.2% 99.4% 13 3 1 1 11 2 1 1 0.4764 0.4631
118 3,243 33 100.0% 0.0% 99.6% 8 6 3 3 8 6 3 3 1.6656 1.6656
119 1,760 29 100.0% 0.0% 55.1% 14 8 5 2 14 8 5 2 1.6084 1.6084
120 2,069 23 99.7% 0.3% 96.9% 13 9 3 1 11 8 3 1 0.6622 0.6423
121 2,170 20 100.0% 0.0% 99.6% 11 5 3 1 11 5 3 1 0.4832 0.4832
122 19 1 100.0% 0.0% 100.0% 1 1 1 1 1 1 1 1 0.0000 0.0000
123 1,979 19 99.5% 0.5% 98.8% 8 6 2 2 6 4 2 2 1.0092 0.9822
124 1,711 77 99.9% 0.1% 0.0% 26 19 6 2 25 19 6 2 2.2817 2.2781
125 13 5 100.0% 0.0% 0.0% 5 5 5 5 5 5 5 5 1.0438 1.0438
126 889 28 99.9% 0.1% 0.4% 20 20 6 2 19 19 6 2 1.2896 1.2823
127 119 13 100.0% 0.0% 0.0% 12 12 9 3 12 12 9 3 1.6326 1.6326
128 3,308 39 100.0% 0.0% 0.0% 5 3 2 2 5 3 2 2 0.7471 0.7471
129 1,254 28 99.6% 0.4% 0.2% 19 11 4 1 18 10 4 1 0.7326 0.7094
130 1,936 17 99.8% 0.2% 98.7% 4 4 1 1 3 3 1 1 0.3720 0.3579
131 2,172 26 99.4% 0.6% 98.8% 9 8 2 1 8 7 2 1 0.8276 0.7957
132 2,197 15 99.5% 0.5% 99.3% 6 2 1 1 5 1 1 1 0.4549 0.4255
133 8,100 21 55.4% 44.6% 55.4% 5 3 2 2 3 2 1 1 1.0967 0.1069
133-1 3,612 10 100.0% 0.0% 0.0% 3 3 2 1 3 3 2 1 0.2516 0.2516
134 2,029 42 97.4% 2.6% 93.3% 22 13 4 1 18 12 3 1 1.0791 0.9563
135 43 8 100.0% 0.0% 9.3% 7 7 7 2 7 7 7 2 1.0921 1.0921
136 3,945 24 99.9% 0.1% 99.8% 7 1 1 1 5 1 1 1 1.2492 1.2435
137 2,978 84 100.0% 0.0% 0.5% 18 12 5 2 17 12 5 2 2.4477 2.4477
138 2,814 107 100.0% 0.0% 38.1% 17 14 7 4 17 14 7 4 2.5085 2.5085
139 2,048 32 100.0% 0.0% 23.8% 14 8 5 2 13 8 5 2 1.2601 1.2601
140 472 35 99.8% 0.2% 49.2% 29 29 10 3 28 28 10 3 2.0700 2.0592
141 816 18 99.6% 0.4% 0.1% 14 14 4 3 13 13 4 3 1.0679 1.0451
142 1,596 27 99.9% 0.1% 95.6% 14 9 4 3 13 9 4 3 1.5414 1.5371
143 2,437 14 100.0% 0.0% 99.9% 6 2 2 1 6 2 2 1 0.3804 0.3804
144 1,696 54 100.0% 0.0% 0.5% 20 17 6 1 20 17 6 1 2.1255 2.1255
145 92 21 100.0% 0.0% 1.1% 18 18 18 3 18 18 18 3 2.0922 2.0922
146 1,291 24 99.9% 0.1% 95.3% 15 13 5 1 14 13 5 1 1.2216 1.2162
201 5,816 33 100.0% 0.0% 99.7% 7 2 1 1 5 2 1 1 0.8882 0.8852
202 356 40 99.4% 0.6% 2.8% 34 34 12 4 33 33 12 4 2.3937 2.3723
203 1,725 29 100.0% 0.0% 93.4% 18 10 3 1 18 10 3 1 1.1830 1.1830
204 596 32 100.0% 0.0% 0.3% 24 24 9 2 24 24 9 2 1.7143 1.7143
205 1,614 32 99.9% 0.1% 87.5% 27 17 4 1 26 17 4 1 1.0058 1.0012
206 4,664 22 99.9% 0.1% 99.9% 8 4 3 2 5 4 3 2 0.8972 0.8917
207 891 52 99.8% 0.2% 1.1% 43 43 16 4 41 41 16 4 2.6013 2.5896
208 6,871 45 99.3% 0.7% 99.3% 13 4 3 1 10 3 3 1 0.9400 0.9019
210 108 17 98.1% 1.9% 7.4% 16 16 10 5 15 15 9 5 2.1318 2.0781
211 5,702 22 100.0% 0.0% 99.9% 9 3 3 2 8 3 3 2 0.7068 0.7053
213 1,823 40 99.4% 0.6% 5.5% 24 17 5 3 23 16 5 3 1.8405 1.8146
214 4,057 51 100.0% 0.0% 0.0% 29 4 2 1 28 4 2 1 0.7829 0.7829
215 2,335 45 100.0% 0.0% 82.6% 31 15 8 2 30 15 8 2 1.2923 1.2923
216 4,414 145 100.0% 0.0% 0.4% 19 10 4 3 19 10 4 3 2.8650 2.8650
217 3,757 50 99.9% 0.1% 1.5% 25 10 3 1 23 9 3 1 0.4839 0.4778
218 3,059 45 100.0% 0.0% 93.9% 23 12 3 2 23 12 3 2 1.1614 1.1614
219 3,082 29 100.0% 0.0% 0.1% 14 4 3 1 13 3 2 1 0.8521 0.8521
220 3,638 23 99.9% 0.1% 99.2% 13 5 2 1 12 4 2 1 0.2112 0.2011
221 2,720 26 100.0% 0.0% 98.5% 14 6 3 3 14 6 3 3 1.0906 1.0906
222 27 3 100.0% 0.0% 96.3% 3 3 3 2 3 3 3 2 0.8259 0.8259
223 3,354 22 100.0% 0.0% 99.3% 15 6 3 2 15 6 3 2 0.5059 0.5059
224 205 15 98.5% 1.5% 1.0% 15 15 8 3 14 14 7 3 1.6712 1.6186
225 10,126 55 65.2% 34.8% 64.4% 11.5 5 3 2 9 5 2 1 1.1699 0.7245
226 3,317 30 100.0% 0.0% 98.2% 16 8 2 2 15 8 2 2 0.6378 0.6353
227 2,755 18 100.0% 0.0% 98.7% 12 4 1 1 12 4 1 1 0.1629 0.1629
228 145 23 97.9% 2.1% 4.1% 20 20 15 4 19 19 14 4 2.3201 2.2663
229 440 53 99.8% 0.2% 1.4% 38 38 10 3 37 37 10 3 2.5863 2.5761
230 3,468 20 100.0% 0.0% 97.9% 10 6 3 2 10 6 3 2 0.7210 0.7210
231 99 21 98.0% 2.0% 0.0% 20 20 20 5 19 19 19 5 2.4053 2.3540
232 4,589 162 100.0% 0.0% 10.9% 24 13 5 4 23 13 5 4 2.9830 2.9816
233 197 30 93.4% 6.6% 0.5% 29 29 17 2 26 26 14 2 2.3682 2.2165
234 4,165 75 99.9% 0.1% 66.1% 26 12 6 3 24 11 6 3 2.0296 2.0228
235 3,253 122 100.0% 0.0% 10.6% 22 15 7 4 21 14 7 4 2.8136 2.8136
236 2,364 26 99.9% 0.1% 98.3% 20 5 2 1 18 5 2 1 0.2417 0.2314
237 752 32 99.9% 0.1% 1.3% 25 25 6 2 24 24 6 2 1.2990 1.2906
238 3,919 71 100.0% 0.0% 0.7% 17 9 3 2 17 9 3 2 1.7740 1.7740
239 3,557 21 99.8% 0.2% 98.7% 11 6 2 2 9 5 2 2 0.8520 0.8386
240 5,227 21 100.0% 0.0% 99.3% 9 4 2 1 9 4 2 1 0.2459 0.2459
241 68 18 97.1% 2.9% 7.4% 17 17 17 3 16 16 16 3 2.0842 2.0106
242 385 38 99.7% 0.3% 2.9% 33 33 14 5 32 32 14 5 2.6370 2.6258
243 4,350 20 97.8% 2.2% 97.7% 7 2 2 1 5 1 1 1 0.3351 0.2124
244 74 16 100.0% 0.0% 0.0% 16 16 16 5 16 16 16 5 2.1820 2.1820
245 125 22 100.0% 0.0% 0.0% 20 20 12 6 20 20 12 6 2.4690 2.4690
246 3,727 154 100.0% 0.0% 0.0% 19 10 5 2 16 10 5 2 3.7206 3.7191
Abbreviations: ITS, internal transcribed spacer; OTU, operational taxonomic unit; and rRNA, ribosomal RNA.
*: any fraction more than zero.
3.5. Cluster analysis of the microbiome compositions
The bacterial compositions of 89 specimens are shown as a heatmap in Figure
2. The figure includes only microorganisms that accounted for more than 0.1%
of the total reads from a specimen. In Figure 2, each row represents a taxon
at the order level and each column represents a specimen.
Sixty-seven specimens with Nugent scores were categorized into normal,
intermediate, and vaginitis groups. In Figure 2A, columns are sorted by
Nugent score category. The pattern of the normal group was distinct from
the patterns of the intermediate and vaginitis groups, whereas the patterns
of the intermediate group (yellow bar) and the vaginitis group (red bar) were
similar to each other. In Figure 2B, columns were clustered by Euclidean
distance using the complete-linkage rule, and their similarities were
visualized using a neighbor-joining tree. Specimens could be grouped into
four major clusters (Groups I-IV in Figure 2B), each of which contained a
heterogeneous mix of Nugent score groups.
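The column clustering described above can be illustrated with a minimal sketch of agglomerative clustering on Euclidean distances with the complete-linkage rule. This is not the software used in the study; the function names and the four abundance profiles are hypothetical, and the neighbor-joining visualization is not reproduced.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length abundance profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering in which the distance between two clusters
    is the largest pairwise distance between their members."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(euclidean(profiles[i], profiles[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge cluster b into a
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical read fractions per specimen for three orders
# (Lactobacillales, Bacteroidales, all others); not thesis data.
profiles = [
    [0.98, 0.01, 0.01],
    [0.95, 0.02, 0.03],
    [0.10, 0.70, 0.20],
    [0.05, 0.75, 0.20],
]
groups = complete_linkage(profiles, n_clusters=2)
print(groups)
```

In this toy input the two Lactobacillales-dominated profiles merge first, so the specimens separate into a Lactobacillales-dominated cluster and a mixed cluster.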
The most abundant taxa of the three groups are shown in Table 7. The major
taxon of the normal group was Lactobacillus spp., and other taxa were
relatively rare. The genera that were more common in the intermediate and
vaginitis groups than in the normal group included Prevotella, Sneathia,
Aerococcus, Atopobium, Megasphaera, and Cupriavidus. Notably, the average
Lactobacillus fraction was higher in the vaginitis group than in the
intermediate group (38.98% versus 25.19%).
Figure 2. Heatmap of the microbiome of 89 vaginal swab specimens.
Each row shows a taxon at the order level and each column shows a single
specimen. The bar above the first row shows the Nugent score of each
specimen. Green: normal group (Nugent score 0–3); Yellow: intermediate
group (Nugent score 4–6); Red: vaginitis group (Nugent score 7–10); Gray:
specimens without Nugent score data.
(A) Columns sorted by Nugent score.
(B) Columns clustered using the complete-linkage rule. Their similarity
was visualized using a neighbor-joining tree. The four groups identified
by clustering are shown in the bottom panel.
Table 7. The most abundant taxa in the three Nugent score groups.
Values are the mean fraction of reads assigned to each taxon within each
Nugent score group. Genus is the lowest taxonomic rank reported.
Normal group (n=30) Intermediate group (n=25) Vaginitis group (n=12)
Lactobacillus 83.41% Lactobacillus 25.19% Lactobacillus 38.98%
Streptococcus 4.90% Cupriavidus 13.67% Prevotella 27.80%
Diaphorobacter 2.50% Sneathia 8.65% Sneathia 7.48%
Enterobacteriaceae 1.97% Streptococcus 8.27% Aerococcus 5.62%
Candida 1.54% Prevotella 6.04% Atopobium 4.46%
Cupriavidus 1.36% Atopobium 5.87% Megasphaera 1.72%
Prevotella 0.80% Megasphaera 5.28% Diaphorobacter 1.67%
Cloacibacterium 0.43% Enterobacteriaceae 4.37% Gardnerella 1.36%
Veillonella 0.34% Haemophilus 3.99% Porphyromonas 1.29%
Chlamydia 0.22% Diaphorobacter 2.90% Dialister 1.05%
Comamonas 0.20% Aerococcus 2.17% Cupriavidus 1.03%
Novosphingobium 0.18% Gp4 1.60% Saccharofermentans 0.94%
Staphylococcus 0.16% Sphingomonas 1.48% Peptoniphilus 0.69%
Haemophilus 0.14% Candida 1.47% Mobiluncus 0.69%
Gemella 0.13% Cloacibacterium 1.23% Anaerococcus 0.55%
Pseudomonas 0.11% Saccharofermentans 0.73% Epicoccum 0.50%
Acinetobacter 0.10% Corynebacterium 0.55% Coriobacteriaceae 0.38%
Alishewanella 0.09% Novosphingobium 0.55% Mycoplasma 0.35%
Sphingobium 0.08% Alishewanella 0.38% Moryella 0.35%
Dechloromonas 0.08% Propionibacterium 0.37% Fusobacterium 0.30%
3.6. Association between diversity predictors and Nugent
scores.
Shannon diversity indices for the 16S rRNA and ITS sequences differed
significantly among the Nugent score groups in a one-way ANOVA (Figure 3A,
p = 0.0037). With Bonferroni correction, the difference between the normal
and vaginitis Nugent score groups was significant (p = 0.033). The numbers
of taxa representing more than 5% of the reads also differed significantly
among the Nugent score groups (Figure 3B, p = 0.0163); with Bonferroni
correction, the difference between the normal and vaginitis groups was
significant (p = 0.004). The proportions of Lactobacillus spp. were
significantly different among the Nugent score groups (p < 0.0001) by
one-way ANOVA, but they did not change monotonically with the grade of
vaginitis represented by the Nugent score group: the mean proportions of
Lactobacillus spp. in the normal, intermediate, and vaginitis groups were
83.4%, 25.2%, and 39.0%, respectively.
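As an illustration of the two diversity predictors compared above, the following sketch computes a Shannon diversity index and the number of taxa at or above a read-fraction threshold from per-taxon read counts. It assumes the natural logarithm and uses hypothetical counts; the function names are not from the study.

```python
import math

def shannon_index(read_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with at least one read.
    The natural logarithm is an assumption made for this sketch."""
    total = sum(read_counts)
    return -sum((c / total) * math.log(c / total)
                for c in read_counts if c > 0)

def taxa_at_or_above(read_counts, min_fraction):
    """Number of taxa whose share of the total reads is >= min_fraction."""
    total = sum(read_counts)
    return sum(1 for c in read_counts if c > 0 and c / total >= min_fraction)

# Hypothetical read counts for four taxa in one specimen; not thesis data.
counts = [900, 60, 35, 5]
h = shannon_index(counts)
n_taxa_5pct = taxa_at_or_above(counts, 0.05)
print(round(h, 4), n_taxa_5pct)
```

A specimen dominated by a single taxon gives an index near zero, while evenly distributed reads give the maximum value, ln(number of taxa).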
Figure 3. Association between diversity predictors and Nugent score groups
(A) Shannon index vs. Nugent score group. Nugent score groups: 0-3 (n=30),
4-6 (n=25), ≥7 (n=12). P = 0.033.
(B) Number of taxa (≥5%) vs. Nugent score group. Nugent score groups: 0-3
(n=30), 4-6 (n=25), ≥7 (n=12). P = 0.004.
A total of 67 specimens with Shannon diversity indices were divided into
three groups according to Nugent score: normal (score 0-3), intermediate
(score 4-6), and vaginitis (score 7-10).
(A) Tukey box plot of Shannon diversity indices by Nugent score group.
The Shannon diversity index, including both bacteria and fungi, was
calculated for each specimen.
(B) Tukey box plot of the number of taxa by Nugent score group.
Only taxa representing more than 5% of the total reads were counted.
3.7. The comparison of various predictors of diversity as
a diagnostic criterion of vaginitis.
An ROC curve analysis was performed for various predictors of diversity to
estimate their diagnostic value for vaginitis (Figure 4). Three criteria were
used to compare the diagnostic value of each method for vaginitis. First, a
Nugent score of ≥4 was considered vaginitis, including both the intermediate
and vaginitis groups in the original Nugent criteria (10). Second, a Nugent
score of ≥ 7 was considered vaginitis, similar to the original criterion (11).
Third, microbiological culture results other than those of normal vaginal flora
or Lactobacillus spp. were considered vaginitis. Since the Nugent criteria are
based on bacterial morphotypes and do not consider yeast morphotypes,
various other predictors were compared, including the Shannon diversity
index for 16S rRNA and the Shannon diversity indices for both 16S rRNA and
ITS sequences. Other parameters, such as the total number of taxa and the
fraction of sequences that were from Lactobacillus spp. were also compared.
All eleven predictors showed statistically significant associations with the first
criterion for vaginitis (Nugent score ≥ 4, Figure 4A). The highest AUC was
0.8559, which was obtained based on the fraction of sequences belonging to
lactobacilli and the first criterion for vaginitis. Applying this parameter and
criterion combination, with a 12.45% lactobacilli fraction as a cut-off, the
sensitivity of this algorithm was 83.78% (95% CI: 68.0–93.8%), and the
specificity was 80.00% (95% CI: 61.4–92.3%).
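The quantities reported above can be sketched as follows. The AUC is computed as the probability that a randomly chosen vaginitis specimen ranks above a randomly chosen normal specimen, and sensitivity and specificity are computed at a fixed cut-off. The Lactobacillus fractions are hypothetical, and treating a fraction below the cut-off as a positive result is an assumption of this sketch. The last two lines check that counts of 31/37 and 24/30, inferred here from the group sizes (25 + 12 specimens with a Nugent score ≥ 4, and 30 normal specimens), are consistent with the reported percentages.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the fraction of (positive, negative) pairs
    in which the positive specimen has the higher score; ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def sensitivity_specificity(pos_fractions, neg_fractions, cutoff):
    """Assumption: a Lactobacillus fraction below the cut-off is test-positive."""
    true_pos = sum(1 for f in pos_fractions if f < cutoff)
    true_neg = sum(1 for f in neg_fractions if f >= cutoff)
    return true_pos / len(pos_fractions), true_neg / len(neg_fractions)

# Hypothetical Lactobacillus fractions (%); not thesis data.
vaginitis = [0.5, 2.8, 10.9, 64.4, 1.1]
normal = [99.3, 98.7, 5.5, 93.4, 87.5]

# Lower Lactobacillus fraction suggests vaginitis, so the score is negated.
area = auc([-f for f in vaginitis], [-f for f in normal])
sens, spec = sensitivity_specificity(vaginitis, normal, cutoff=12.45)
print(area, sens, spec)

# Inferred counts that reproduce the reported sensitivity and specificity.
reported_sens = round(100 * 31 / 37, 2)
reported_spec = round(100 * 24 / 30, 2)
print(reported_sens, reported_spec)
```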
Figure 4. Receiver-operating characteristic (ROC) curves of eleven predictors of diversity and three vaginitis
criteria. Each panel plots sensitivity against 1-specificity, with a reference line.
(A) Vaginitis criterion 1
a, AUC: 0.7158* b, AUC: 0.7536*
c, AUC: 0.7770* d, AUC: 0.7176*
e, AUC: 0.7252* f, AUC: 0.7568*
g, AUC: 0.7685* h, AUC: 0.7158*
i, AUC: 0.8559* j, AUC: 0.7234*
k, AUC: 0.7243*
(B) Vaginitis criterion 2
a, AUC: 0.7409* b, AUC: 0.7992*
c, AUC: 0.7500* d, AUC: 0.6545
e, AUC: 0.7515* f, AUC: 0.8053*
g, AUC: 0.7568* h, AUC: 0.6674
i, AUC: 0.6455 j, AUC: 0.7485*
k, AUC: 0.7606*
(C) Vaginitis criterion 3
a, AUC: 0.5910 b, AUC: 0.6290
c, AUC: 0.6248 d, AUC: 0.6018
e, AUC: 0.5788 f, AUC: 0.6220
g, AUC: 0.6140 h, AUC: 0.5835
i, AUC: 0.6932* j, AUC: 0.6144
k, AUC: 0.5882
ROC curves of eleven predictors using (A) vaginitis criterion 1, (B) vaginitis
criterion 2 and (C) vaginitis criterion 3.
*: P < 0.05
Abbreviations: AUC, area under curve; CI, confidence interval; ITS, internal
transcribed spacer; and rRNA, ribosomal RNA.
(1) Vaginitis criteria
- Vaginitis criterion 1: When the Nugent score of a specimen was 4 or
more, the specimen was regarded as vaginitis.
- Vaginitis criterion 2: When the Nugent score of a specimen was 7 or
more, the specimen was regarded as vaginitis.
- Vaginitis criterion 3: When the microbiological culture of a
specimen showed results other than normal vaginal flora or
Lactobacillus spp., the specimen was regarded as vaginitis.
(2) Predictors
a–d: The total number of taxa in the specimen, including both the 16S rRNA
gene and the ITS gene, when the fraction of that taxon is more than (a)
zero, (b) 0.1%, (c) 1% and (d) 5%.
e–h: The total number of taxa in the specimen, including only the 16S rRNA
gene, when the fraction of that taxon is more than (e) zero, (f) 0.1%, (g)
1% and (h) 5%.
i: The fraction of Lactobacillus spp. in the specimen.
j–k: Shannon diversity index of the specimen, when the index was calculated
from both the 16S rRNA gene and the ITS gene (j) or was calculated from
the 16S rRNA gene only (k).
3.8. Comparison of next generation sequencing (NGS),
DNA probe assay and microbiological culture in the
detection of Candida spp. and Gardnerella vaginalis.
The DNA probe assay and culture showed 95.5% and 76.1% agreement in the
detection of Candida spp. and G. vaginalis, respectively. NGS and culture
showed 88.1% and 86.6% agreement in the detection of Candida spp. and G.
vaginalis, respectively (Table 8). NGS and the DNA probe assay showed
86.2% and 89.7% agreement in the detection of Candida spp. and G. vaginalis,
respectively. All pairwise comparisons showed significant associations by
Pearson’s chi-square analysis, and Cohen’s kappa values (0.335–0.776)
indicated fair to substantial agreement among the three test methods (23).
T. vaginalis was not found in any specimen by any method.
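For readers unfamiliar with the two agreement statistics, the following sketch computes simple percent agreement and Cohen's kappa from a 2 x 2 table of paired results. The cell counts are hypothetical: the thesis reports only summary values, and these counts were chosen because they reproduce the summary values of one comparison (N = 67, 95.5% agreement, κ = 0.776). The actual cell counts may differ.

```python
def percent_agreement_and_kappa(both_pos, only_first, only_second, both_neg):
    """Observed agreement and Cohen's kappa for two binary test methods.
    kappa = (observed - expected) / (1 - expected), where expected is the
    agreement predicted by chance from the marginal positive rates."""
    n = both_pos + only_first + only_second + both_neg
    observed = (both_pos + both_neg) / n
    first_pos = both_pos + only_first
    second_pos = both_pos + only_second
    expected = (first_pos * second_pos
                + (n - first_pos) * (n - second_pos)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical cell counts (not reported in the thesis): 6 specimens positive
# by both methods, 3 positive by the first method only, 58 negative by both.
agreement, kappa = percent_agreement_and_kappa(6, 3, 0, 58)
print(round(100 * agreement, 1), round(kappa, 3))
```

The example shows why high percent agreement can coexist with a modest kappa: when most specimens are negative by both methods, chance agreement is already high.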
Table 8. Comparison of next generation sequencing (NGS),
DNA probe assay and microbiological culture for detection
of Candida spp. and Gardnerella vaginalis
Method 1 Method 2 N Organism SPA P-value κ
NGS Culture 67 Candida spp. 88.1% 0.003 0.363
NGS Culture 67 Gardnerella vaginalis 86.6% 0.003 0.509
DNA probe Culture 67 Candida spp. 95.5% <0.001 0.776
DNA probe Culture 67 Gardnerella vaginalis 76.1% <0.001 0.335
NGS DNA probe 87 Candida spp. 86.2% <0.001 0.460
NGS DNA probe 87 Gardnerella vaginalis 89.7% <0.001 0.690
Abbreviations: κ, Cohen’s kappa index; SPA, simple percent agreement.
P-values of Pearson’s chi-square analysis and Cohen’s kappa index were
calculated for each comparison.
3.9. Comparison of next generation sequencing (NGS) and
microbiological culture.
The results of NGS and culture were difficult to compare directly, since NGS
detected many microorganisms and culture usually detected only a few. In our
study, NGS and culture were considered to be in complete agreement if the
NGS results included all of the cultured microorganisms. Culture results
reported as “normal vaginal flora” were treated as including Lactobacillus
spp. With this definition, NGS and culture were in complete agreement for
73.1% (49/67) of specimens. Two specimens were considered to be in partial
agreement: in one, Enterococcus faecalis and Candida albicans grew in
culture, while NGS detected only C. albicans; in the other, Escherichia coli
and G. vaginalis grew in culture, while NGS detected only G. vaginalis.
Table 9 shows the remaining 16 specimens, for which the two methods gave
discordant results.
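The agreement rule described above amounts to a set comparison, which can be sketched as follows. The function name is hypothetical, the taxa listed as detected by NGS are illustrative beyond those named in the text, and matching by exact name is a simplification: in practice culture and NGS may report organisms at different taxonomic ranks.

```python
def classify_agreement(cultured, ngs_detected):
    """Complete: NGS found every cultured organism. Partial: NGS found some
    but not all of them. Discordant: NGS found none of them."""
    cultured, ngs_detected = set(cultured), set(ngs_detected)
    if cultured <= ngs_detected:
        return "complete"
    if cultured & ngs_detected:
        return "partial"
    return "discordant"

# "Normal vaginal flora" is represented here as Lactobacillus.
complete = classify_agreement({"Lactobacillus"},
                              {"Lactobacillus", "Streptococcus"})
# The first partial-agreement specimen described in the text.
partial = classify_agreement({"Enterococcus faecalis", "Candida albicans"},
                             {"Candida albicans", "Lactobacillus"})
# Illustrative discordant case: the cultured organism is absent from NGS.
discordant = classify_agreement({"Escherichia coli"},
                                {"Lactobacillus", "Prevotella"})
print(complete, partial, discordant)

# Complete-agreement rate reported in the text: 49 of 67 specimens.
complete_rate = round(100 * 49 / 67, 1)
print(complete_rate)
```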
Table 9. Discordant results between next generation
sequencing (NGS) and microbiological culture.
In these sixteen specimens, NGS did not detect any of the microorganisms
that grew in culture. The 16S rRNA and ITS columns give the number of
sequence reads.
Specimen Culture result 16S rRNA sequences ITS sequences
103 Candida albicans 2,535 0
113 Streptococcus agalactiae 25 0
116 Yeast 2,717 0
122 Staphylococcus aureus 19 0
126 Streptococcus anginosus 888 1
128 Escherichia coli 3,308 0
142 Enterococcus faecalis 1,595 1
145 Escherichia coli 92 0
146 Enterococcus faecalis 1,290 1
201 Enterococcus faecalis 5,814 1,495
203 Enterococcus faecalis 1,725 0
206 Streptococcus agalactiae 4,661 3
210 Enterococcus faecalis 106 2
215 Staphylococcus epidermidis 2,335 0
221 Candida ciferrii 2,720 0
225 Escherichia coli 6,598 3,528
4. DISCUSSION
Most read lengths in our study were long enough for sequences to be
identified at the species and genus levels, although they were shorter than we
expected. Several studies have analyzed vaginal microbiomes with NGS
technology (4-9). Hummelen et al. studied the vaginal microbiota of 132 HIV-
positive Tanzanian women using an Illumina system (5). Ravel et al. studied
the vaginal microbiota of 396 women comprising four ethnic groups (7). This
group used a 454 FLX system, and the average read length of 16S rRNA
sequences was 240 bp. Martin et al. studied vaginal swabs of 92 American
women (ethnicity unknown) with a 454 system, and the average read length
was 480 bp (24), the longest average read length among these earlier
studies. The studies described above analyzed only bacteria in
vaginal microbiomes and did not analyze fungi. As many bacterial vaginitis
cases are accompanied by vulvovaginal candidiasis, an analysis of fungi can
provide more information about the vaginal microbiome. In our study, the ITS
gene, indicating the presence of fungi, was found in 56.2% of specimens,
although many specimens included only a few reads of this gene. Shannon
diversity indices also showed significant differences with the inclusion of
fungi. Fungi should be considered in investigations of the vaginal microbiome.
The microbiota of the normal Nugent score group mainly comprised
Lactobacillus, as expected based on previous reports (Figure 2). The Shannon
diversity index, number of species, and proportion of sequences belonging to
Lactobacillus spp. showed significant differences among the Nugent score
groups (Figures 3 and 4). However, the composition of the intermediate
Nugent score group and that of the vaginitis Nugent score group were similar.
One possibility is bias by the examiner who assigned the Nugent scores;
another is that these results were skewed by the relatively small number of
specimens in the vaginitis group. In our data, all 12 specimens comprising
the vaginitis group (Nugent score ≥ 7) had a score of 8, which suggests bias
in score assignment. Because the Nugent score is highly dependent on the
examiner, such bias is plausible.
Yet another possibility is that the intermediate group and the vaginitis
group are on the same clinical spectrum; thus, they may share similar
microbiome patterns. In Figure 2, the Lactobacillus fraction showed good
correlation with the Nugent score when results from the normal group were
compared with those from the other two groups (intermediate and vaginitis
groups combined). Similarly, all 11 parameters of diversity were significantly
associated with evidence for vaginitis, including both the intermediate and
vaginitis group (a Nugent score of ≥ 4, Figure 4A). In short, we could not
distinguish between the intermediate group (Nugent score 4–6) and the
vaginitis group (Nugent score 7–10) with the microbiome analysis in this
study. Although clinical factors such as symptoms and treatment were not
considered in this study, the current cutoffs for Nugent scores might need to
be changed based on the results of this study.
In this study, NGS, DNA probe assay, and microbiological culture were
compared for their abilities to detect vaginal microorganisms. There are few
studies comparing NGS and culture in the investigation of vaginal
microbiomes. Smidt et al. compared NGS, quantitative PCR, and culture-
based methods to identify Lactobacillus spp. (25). At present, this is the only
prior study that has compared culture and NGS using vaginal specimens.
These researchers compared agreement in the detection of species within the
Lactobacillus genus and reported general concordance among the three
methods in detecting L. crispatus, L. jensenii, and L. gasseri, but not L. iners.
In our study, the concordance between NGS and culture in the detection of
Lactobacillus spp. could not be determined, because cultured Lactobacillus
spp. were reported as normal vaginal flora without species identification.
Overall, NGS and microbiological culture showed only 73.1% agreement, while
NGS and the DNA probe assay showed good agreement.
Salipante et al. reported a similar phenomenon, based on sputum
specimens from cystic fibrosis patients (26). In their study, 17.3% of
pathogens were identified only by culture, and the total agreement rate
between NGS and culture was 56.7%. Some of the reads were lost during the
de-noising steps of NGS, and this was a reason for some discordance. They
also suggested that discrepancies between the results by the two methods
reflect various factors, including inefficient DNA extraction from particular
organisms, primer bias, or properties of the specimens themselves, including
internal sample heterogeneity. Toma et al. also reported discrepancies between
culture and NGS in endotracheal aspirates (27). Interestingly, they used
multiple databases, and the discrepancy rates differed according to the
database used. NGS and culture results coincided in 85% of samples using
three databases. They suggested that short microbial reads and amplification
bias caused by mismatches of universal primers in certain bacteria might have
caused such discrepancies. Although no studies reporting discrepancies
between the results of NGS and culture methods in an investigation of the
vaginal microbiome were identified, the factors mentioned above could have
had a similar effect in our study.
The RDP is a very convenient pipeline for analyzing 16S rRNA sequences
from NGS data. It provides a taxonomic ranking of sequences in a form that
can easily be converted into data for use with various platforms. However, our
results suggest that assignments with low similarity scores may be
substantially mismatched. Fettweis et al. noted similar mismatches (28). In our study,
several sequences showing RDP results with low similarity scores were
identified as human or Candida spp. DNA. Since vaginal swabs can include
nucleic acid from both humans and microorganisms other than bacteria,
sequences with poor similarity scores should be analyzed using other
databases.
This study had some limitations. First, we could not evaluate the various
analytical performance parameters of NGS; instead, we investigated the
possibility of using NGS as a clinical diagnostic tool by comparing its
results with those of culture and the DNA probe assay. Second, NGS results
and microbiological culture results could be compared only indirectly.
Although there are some remaining problems to solve and further
optimization will be required before using NGS as an independent diagnostic
tool (26), NGS detected many more microorganisms than traditional detection
methods. NGS might become a useful and powerful method to investigate
vaginal microbiomes that can provide information on clinical diseases
associated with specific vaginal microbiome profiles.
REFERENCES
1. Liversedge NH, Turner A, Horner PJ, Keay SD, Jenkins JM, Hull MG.
The influence of bacterial vaginosis on in-vitro fertilization and embryo
implantation during assisted reproduction treatment. Hum Reprod.
1999;14(9):2411-5.
2. Mead PB. Epidemiology of bacterial vaginosis. American journal of
obstetrics and gynecology. 1993;169(2 Pt 2):446-9.
3. Schmidt H, Hansen JG. Bacterial vaginosis in a family practice
population. Acta obstetricia et gynecologica Scandinavica.
2000;79(11):999-1005.
4. Fredricks DN, Fiedler TL, Marrazzo JM. Molecular identification of
bacteria associated with bacterial vaginosis. The New England journal of
medicine. 2005;353(18):1899-911.
5. Hummelen R, Fernandes AD, Macklaim JM, Dickson RJ, Changalucha J,
Gloor GB, et al. Deep sequencing of the vaginal microbiota of women
with HIV. PloS one. 2010;5(8):e12078.
6. Ling Z, Kong J, Liu F, Zhu H, Chen X, Wang Y, et al. Molecular analysis
of the diversity of vaginal microbiota associated with bacterial vaginosis.
BMC genomics. 2010;11:488.
7. Ravel J, Gajer P, Abdo Z, Schneider GM, Koenig SS, McCulle SL, et al.
Vaginal microbiome of reproductive-age women. Proceedings of the
National Academy of Sciences of the United States of America. 2011;108
Suppl 1:4680-7.
8. Schellenberg J, Links MG, Hill JE, Dumonceaux TJ, Peters GA, Tyler S,
et al. Pyrosequencing of the chaperonin-60 universal target as a tool for
determining microbial community composition. Applied and
environmental microbiology. 2009;75(9):2889-98.
9. Schellenberg JJ, Links MG, Hill JE, Dumonceaux TJ, Kimani J, Jaoko W,
et al. Molecular definition of vaginal microbiota in East African
commercial sex workers. Applied and environmental microbiology.
2011;77(12):4066-74.
10. Nugent RP, Krohn MA, Hillier SL. Reliability of diagnosing bacterial
vaginosis is improved by a standardized method of gram stain
interpretation. Journal of clinical microbiology. 1991;29(2):297-301.
11. Rodrigues FS, Peixoto S, Adami F, Alves BD, de Sousa Gehrke F,
Azzalis LA, et al. Proposal of a new cutoff for Nugent criteria in the
diagnosis of bacterial vaginosis. Journal of microbiological methods.
2015.
12. HMP Consortium 16S 454 Sequencing Protocol. [updated 2010/10/27].
Available from:
http://www.hmpdacc.org/doc/16S_Sequencing_SOP_4.2.2.pdf.
13. CLSI. Interpretive Criteria for Identification of Bacteria and Fungi by
DNA Target Sequencing; Approved Guideline. CLSI document MM18-A.
Wayne, PA: Clinical and Laboratory Standards Institute; 2008.
14. Lawing LF, Hedges SR, Schwebke JR. Detection of trichomonosis in
vaginal and urine specimens from women by culture and PCR. Journal of
clinical microbiology. 2000;38(10):3585-8.
15. Edgar RC. Search and clustering orders of magnitude faster than BLAST.
Bioinformatics. 2010;26(19):2460-1.
16. Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R. UCHIME
improves sensitivity and speed of chimera detection. Bioinformatics.
2011;27(16):2194-200.
17. Nilsson RH, Abarenkov K, Veldre V, Nylinder S, De Wit P, Brosche S, et
al. An open source chimera checker for the fungal ITS region. Molecular
ecology resources. 2010;10(6):1076-81.
18. Haas BJ, Gevers D, Earl AM, Feldgarden M, Ward DV, Giannoukos G, et
al. Chimeric 16S rRNA sequence formation and detection in Sanger and
454-pyrosequenced PCR amplicons. Genome research. 2011;21(3):494-
504.
19. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for
rapid assignment of rRNA sequences into the new bacterial taxonomy.
Applied and environmental microbiology. 2007;73(16):5261-7.
20. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local
alignment search tool. Journal of molecular biology. 1990;215(3):403-10.
21. Caraux G, Pinloche S. PermutMatrix: a graphical environment to arrange
gene expression profiles in optimal linear order. Bioinformatics.
2005;21(7):1280-1.
22. Shannon CE. A Mathematical Theory of Communication. Bell Sys Tech
J. 1948;27(3):379-423.
23. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa
statistic. Family medicine. 2005;37(5):360-3.
24. Martin DH, Zozaya M, Lillis R, Miller J, Ferris MJ. The microbiota of
the human genitourinary tract: trying to see the forest through the trees.
Transactions of the American Clinical and Climatological Association.
2012;123:242-56.
25. Smidt I, Kiiker R, Oopkaup H, Lapp E, Roop T, Truusalu K, et al.
Comparison of detection methods for vaginal lactobacilli. Beneficial
microbes. 2015;6(5):747-51.
26. Salipante SJ, Sengupta DJ, Rosenthal C, Costa G, Spangler J, Sims EH,
et al. Rapid 16S rRNA next-generation sequencing of polymicrobial
clinical samples for diagnosis of complex bacterial infections. PloS one.
2013;8(5):e65226.
27. Toma I, Siegel MO, Keiser J, Yakovleva A, Kim A, Davenport L, et al.
Single-molecule long-read 16S sequencing to characterize the lung
microbiome from mechanically ventilated patients with suspected
pneumonia. Journal of clinical microbiology. 2014;52(11):3913-21.
28. Fettweis JM, Serrano MG, Sheth NU, Mayer CM, Glascock AL, Brooks
JP, et al. Species-level classification of the vaginal microbiome. BMC
genomics. 2012;13 Suppl 8:S17.
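KOREAN ABSTRACT
Introduction: Changes in the vaginal microbiome are associated with various
symptoms and diseases and have traditionally been identified by microscopic
examination and microbiological culture. Next-generation sequencing (NGS)
can detect far more microorganisms than traditional methods. Previous
studies of the vaginal microbiome using NGS were limited by short product
lengths or by the exclusion of microorganisms other than bacteria. This
study aimed to analyze the vaginal microbiome of Korean women with
sufficient read length, including microorganisms other than bacteria. It
also compared NGS with other test methods and evaluated its potential as a
predictor of vaginitis.
Methods: Eighty-nine vaginal swab specimens were collected, of which 67 had
Gram-stain microscopy findings and microbiological culture results. The 16S
rRNA, internal transcribed spacer (ITS), and Tvk genes were targeted for the
analysis of bacteria, fungi, and Trichomonas vaginalis, and NGS was
performed on a GS Junior instrument (454, Branford, CT, USA). Usearch
software was used for data analysis, construction of operational taxonomic
units, and removal of chimeric sequences, and the Ribosomal Database Project
(RDP) and Basic Local Alignment Search Tool (BLAST) databases were used to
identify and classify microorganisms. Residual swab specimens were tested
for Candida, Gardnerella, and Trichomonas vaginalis with the BD Affirm VPIII
assay (Becton Dickinson, NJ, USA).
Results: A total of 202,958 16S rRNA sequences and 7,600 ITS sequences were
identified, and no Tvk sequences were detected. ITS sequences were detected
in 56.2% (50/89) of specimens. The 16S rRNA and ITS sequences were grouped
into 3,259 and 112 operational taxonomic units, respectively. The microbiota
of specimens with normal Nugent scores consisted mainly of the order
Lactobacillales. Specimens in the intermediate or vaginitis categories
contained Lactobacillales and a variety of other microorganisms, but there
was no clear difference in composition between the intermediate and
vaginitis groups. The Shannon diversity index, the number of species, and
the proportion of the genus Lactobacillus differed clearly among the groups.
When several diversity measures obtained from the NGS data were compared as
predictors of vaginitis, the proportion of the genus Lactobacillus showed
the highest area under the curve (0.8559). NGS and the DNA probe assay
showed good agreement in the detection of Gardnerella and Candida (range
86.2–89.7%).
Conclusions: The microbiomes of the intermediate and vaginitis groups were
not clearly distinguishable, and ITS sequences were also frequently detected
in normal specimens. NGS appears likely to become a useful tool for the
analysis of the vaginal microbiome and the diagnosis of vaginitis.
-------------------------------------
Keywords: vaginal microbiome, next-generation sequencing, vaginitis
Student number: 2012-21726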