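Attribution-NonCommercial-NoDerivs 2.0 Korea
Provided that the conditions below are followed, users are free to:
- copy, distribute, transmit, display, perform, and broadcast this work.
The following conditions must be observed:
- For any reuse or distribution of this work, you must make clear the license terms applied to it.
- These conditions do not apply if separate permission is obtained from the copyright holder.
Users' rights under copyright law are not affected by the above.
This is an easy-to-understand summary of the license (Legal Code).
Disclaimer
Attribution. You must attribute the original author.
NonCommercial. You may not use this work for commercial purposes.
NoDerivs. You may not alter, transform, or build upon this work.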
Thesis for the Degree of Doctor of Philosophy in Medical Science
Genomic analysis of
ovarian clear cell carcinomas and uterine
corpus endometrial carcinomas using next
generation sequencing
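August 2018
Seoul National University Graduate School
Department of Biomedical Sciences, Major in Biomedical Sciences
Ji Won Lee
Genomic analysis of ovarian clear cell carcinomas and uterine corpus
endometrial carcinomas using next generation sequencing
Advisor: Professor Jong-Il Kim (김종일)
Submitted as a thesis for the degree of Doctor of Philosophy in Medical Science
August 2018
Seoul National University Graduate School
Department of Biomedical Sciences, Major in Biomedical Sciences
Ji Won Lee
The doctoral thesis of Ji Won Lee is hereby approved
August 2018
Chair (seal)
Vice Chair (seal)
Member (seal)
Member (seal)
Member (seal)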
Genomic analysis of
ovarian clear cell carcinomas and
uterine corpus endometrial carcinomas
using next generation sequencing
by
Ji Won Lee
A thesis submitted to the Department of Medicine in partial
fulfillment of the requirements for the Degree of Doctor of
Philosophy in Biomedical Science at Seoul National
University College of Medicine
August 2018
Approved by Thesis Committee:
Professor Chairman
Professor Vice chairman
Professor
Professor
Professor
Abstract
Genomic analysis of
ovarian clear cell carcinomas and
uterine corpus endometrial
carcinomas using next generation
sequencing
Ji Won Lee
Major in Biomedical Science
Department of Biomedical Science
Seoul National University Graduate School
Gynecologic cancers, which arise in the female reproductive organs,
including the cervix, ovaries, uterus, fallopian tubes, vagina, and vulva,
are among the cancers that most often affect women. In particular, ovarian
cancer and uterine corpus cancer were among the 10 leading cancer types
for estimated new cases and deaths in 2017. In the era of precision
medicine, reliable genetic diagnosis is essential for providing
individualized treatment of these cancers. Recently, many groups,
including The Cancer Genome Atlas, have performed genetic profiling of
these cancers to broaden the understanding of gynecologic cancer.
However, for ovarian clear cell carcinoma (OCCC) and uterine corpus
endometrial carcinoma (UCEC), only small genomic studies have been
reported, and the genetics of these cancers have not yet been fully
elucidated.
In the first chapter, comprehensive genomic characterization of OCCCs
was performed via whole exome sequencing (WES) of blood samples and
fresh cancer tissues. The samples were collected from 15 patients with
OCCC from 2012 to 2016 and stored at the Seoul National University
Hospital Human Biobank. The sequencing data of the fresh OCCC tissues
were characterized by analyzing genomic alterations (somatic mutations
and somatic copy number variations). A median of 178 exonic mutations
(range, 111-25,798) and a median of 343 somatic copy number variations
(range, 43-1,820) were found per tumor sample. In all, 54 somatic
mutations, including mutations in PIK3CA, ARID1A, and KRAS, were found
in the 15 Korean OCCCs. Copy number amplifications in NTRK1, MYC, and
GNAS and copy number deletions in TET2, TSC1, BRCA2, and SMAD4 were
frequently detected in the 15 OCCCs. Somatic alterations affected
proliferation and survival pathways (including the PI3K/AKT, ERBB2, and
TP53 pathways) in 87% of OCCCs and the chromatin remodeling pathway in
47% of OCCCs. No significant differences in the frequencies of genetic
alterations were detected between OCCCs with and without endometriosis.
In the second chapter, clinical characteristics and the corresponding
genomic data of 370 patients with UCEC from The Cancer Genome Atlas
(TCGA) database were analyzed, and factors associated with survival
outcomes were investigated. LYL1 gene amplification was observed in 22
patients (5.9%) with UCEC. Patients with LYL1 gene amplification were
significantly older at the time of diagnosis and more often had
non-endometrioid, high-grade, or advanced disease. Multivariate
analyses, adjusted for tumor histologic type, grade, and stage, did not
clearly confirm LYL1 gene amplification as an independent prognostic
factor for either progression-free survival (PFS) or overall survival
(OS). However, in survival analyses, lower PFS and OS rates (3-year PFS:
34.4% vs. 79.9%, P=0.031; 5-year OS: 25.1% vs. 84.9%, P=0.014) were
observed in the amplification subset. Cancer-related genes (MYC, CDK6,
and ERBB2) were found to be upregulated in patients with LYL1
amplification. The MAPK, WNT, and cell cycle pathways were also
significantly enriched in patients with LYL1 gene amplification.
These results improve the current understanding of gynecologic cancer
and may allow more efficient diagnostic strategies for ovarian clear cell
carcinoma and uterine corpus endometrial carcinoma.
* The first chapter was published in Gynecologic Oncology [1], and the
second chapter was published in BMC Cancer [2].
----------------------------------------------------------------------------------------------
Keywords: Gynecologic cancer; Ovarian clear cell carcinoma; Uterine
corpus endometrial carcinoma; Whole exome sequencing; Copy
number variation; Somatic mutation; The Cancer Genome Atlas
Student number: 2015-30608
CONTENTS
Abstract
Contents
List of tables
List of figures
List of abbreviations
General introduction
    Ovarian cancer
    Uterine cancer
    Next generation sequencing
    Cancer genomics
    Genomic characteristics of ovarian clear cell carcinoma
    Genomic characteristics of uterine corpus endometrial carcinomas
    Objectives of this study
Chapter 1. Genomic landscape of ovarian clear cell carcinoma via next generation sequencing
    Introduction
    Materials and Methods
    Results
    Discussion
Chapter 2. Identification of LYL1 gene amplification as an independent prognostic factor of uterine corpus endometrial cancer via next generation sequencing data analysis of The Cancer Genome Atlas
    Introduction
    Materials and Methods
    Results
    Discussion
General discussion
General conclusion
References
Abstract in Korean
LIST OF TABLES
Table 1-1. Primer design for validation with Sanger sequencing
Table 1-2. Clinicopathologic characteristics of the patients
Table 1-3. Summary of somatic mutations and copy number variations
Table 1-4. Significant somatic mutations in 15 Korean patients with OCCC
Table 2-1. Clinicopathologic characteristics of patients
Table 2-2. Factors associated with survival outcomes in patients with uterine corpus endometrial carcinoma
Table 2-3. Factors associated with survival outcomes in patients with endometrioid histologic type of uterine cancer
LIST OF FIGURES
Figure 1-1. Landscape of somatic mutations in 15 Korean OCCCs
Figure 1-2. The validations of somatic mutations
Figure 1-3. Somatic mutations and CNVs pathways in OCCCs
Figure 1-4. Landscape of somatic copy number variations in 15 Korean OCCCs
Figure 1-5. Arm level CNVs and focal level CNVs
Figure 1-6. Copy number variation analysis for drug targetable genes
Figure 1-7. Somatic copy number variations in chromosome arm level
Figure 1-8. Comparison of mutation burden and CNV burden between endometriosis-associated and non-endometriosis-associated OCCCs
Figure 1-9. Comparison of somatic mutations between endometriosis-associated and non-endometriosis-associated OCCCs in frequently mutated genes
Figure 1-10. Comparison of somatic mutations between endometriosis-associated and non-endometriosis-associated OCCCs in somatically altered pathways
Figure 1-11. Comparisons of somatic copy number variations between endometriosis-associated and non-endometriosis-associated OCCCs in amplified and deleted chromosomes
Figure 1-12. Comparisons of somatic copy number variations between endometriosis-associated and non-endometriosis-associated OCCCs in oncogenes and tumor suppressor genes
Figure 1-13. Comparisons of somatic copy number variations between endometriosis-associated and non-endometriosis-associated OCCCs in somatically-altered pathways
Figure 1-14. The alteration frequency of RFX3 (Regulatory Factor X3) in TCGA
Figure 1-15. The alteration frequency of MST1R (Macrophage stimulating protein receptor) in TCGA
Figure 1-16. The alteration frequency of MED12 (Mediator Complex Subunit 12) in TCGA
Figure 1-17. The alteration frequency of GPC3 (Glypican 3) in TCGA
Figure 1-18. Associations between tumor stage and somatically-mutated genes
Figure 1-19. Associations between somatic genomic alterations and clinical features
Figure 1-20. Mutation signature difference between patient #1 and other patients
Figure 1-21. APOBEC signature in Patient #13
Figure 2-1. Frequencies of LYL1 gene amplification in various cancer types
Figure 2-2. Frequencies of copy number variations
Figure 2-3. Correlations between amplification frequencies and mortality
Figure 2-4. Correlations between deletion frequencies and mortality
Figure 2-5. Overall survival of patients with uterine corpus endometrial carcinoma
Figure 2-6. Progression-free survival of patients with uterine corpus endometrial carcinoma
Figure 2-7. Enrichment analysis by LYL1 gene status
Figure 2-8. Expression levels of enriched DEGs
Figure 2-9. Significant gene networks
Figure 2-10. Gene set enrichment analysis according to histologic types and TCGA classes
Figure 2-11. Enriched gene list of cancer-related and cell proliferation pathways
Figure 2-12. Gene expression between LYL1 amplified patients and non-LYL1 amplified patients
Figure 2-13. Pearson’s correlation between LYL1 amplification and oncogene expression
Figure 2-14. The genomic alteration frequency of LYL1 gene
Figure 2-15. Survival analysis in high-grade serous ovarian cancer
LIST OF ABBREVIATIONS
EOC: epithelial ovarian cancer
EMS: endometriosis
OCCC: ovarian clear cell carcinoma
UCEC: uterine corpus endometrial carcinoma
PDS: primary debulking surgery
BMI: body mass index
FIGO: International Federation of Gynecology and Obstetrics
CNV: copy number variation
TCGA: The Cancer Genome Atlas
NGS: next generation sequencing
PFS: progression-free survival
OS: overall survival
MSI: microsatellite instability
DEG: differentially expressed gene
GSEA: gene set enrichment analysis
KEGG: Kyoto Encyclopedia of Genes and Genomes
GATK: Genome Analysis Toolkit
WGS: whole-genome sequencing
WES: whole-exome sequencing
RNA-seq: RNA sequencing
Gb: gigabase
SNV: single nucleotide variant
Indel: short insertion/deletion
SV: structural variation
TSG: tumor suppressor gene
Mb: megabase
SCNA: somatic copy number alteration
PCR: polymerase chain reaction
rlog: regularized log
FPKM: fragments per kilobase of transcript per million mapped reads
General Introduction
1. Ovarian cancer
Ovarian cancer is the most lethal gynecologic cancer in the United States
and Korea [3, 4]. In 2017, ovarian cancer was the fifth leading cause of
estimated cancer deaths in the United States, and in Korea it ranked
tenth in estimated new cancer cases and eighth in estimated cancer deaths
[4-6]. Although carcinogenesis and cancer classification were previously
based on morphology, genetic profiling and molecular pathways have
recently become one of the main bases for classification [7, 8]. Ovarian
cancer can be subdivided into four types: epithelial, germ cell, sex
cord-stromal, and unspecified [9, 10]. The epithelial type accounts for
over 90% of ovarian cancers [9, 11]. Epithelial ovarian cancers are
further divided into type I and type II tumors. Type I tumors comprise
three groups: endometriosis-related tumors (clear cell carcinoma and
seromucinous carcinomas); low-grade serous carcinomas; and mucinous
carcinomas and malignant Brenner tumors [10]. Type II tumors comprise
high-grade serous carcinomas, undifferentiated carcinomas,
carcinosarcomas, and primary peritoneal carcinomas [10]. According to
previous studies, high-grade serous carcinoma, which accounts for most
type II tumors, is mainly deficient in homologous recombination DNA
damage repair and carries CCNE1 amplification [12]. The ERBB2, KRAS,
BRAF, and MEK pathways are activated in low-grade serous carcinoma
[13, 14]. Activation of PIK3CA and inactivation of the chromatin
remodeling gene ARID1A are thought to be the main causes of ovarian clear
cell carcinoma [15]. Endometrioid carcinomas are associated with
inactivation of PTEN and activation of the WNT-β-catenin pathway [16].
Mucinous carcinomas are characterized by KRAS pathway activation and
frequent TP53 mutation [17]. Additionally, type II carcinomas exhibit
high proliferative activity and are initially sensitive to
carboplatin/paclitaxel therapy, whereas most type I carcinomas are stable
and resistant to chemotherapy [10].
However, the understanding of ovarian cancer remains incomplete, with
several barriers: limited understanding of disease etiology, of the tumor
immune microenvironment, and of the mechanisms by which the disease
becomes refractory to treatment. Many strategies for detection and
treatment have been investigated and developed. In particular, recent
research on inhibitors of the PARP and VEGF pathways has shown remarkable
progress in improving progression-free survival in ovarian cancer
patients [18]. For example, PARP inhibitors, to which tumors with
inactivated BRCA1 or BRCA2 are sensitive, and EZH2 inhibitors, to which
tumors with inactivating somatic mutations of ARID1A are sensitive, are
major research topics in uterine and ovarian carcinomas [18, 19].
Moreover, many research groups, including The Cancer Genome Atlas (TCGA),
focus on genetic profiling for a broader understanding of gynecologic
cancer and better precision medicine [7, 20, 21]. However, despite these
recent efforts, the molecular characteristics of some ovarian cancers
have not yet been fully revealed.
2. Uterine cancer
Uterine cancer is the most common gynecologic cancer in developed
countries and the second most common in developing countries [22, 23].
In the United States, the estimated incidence was 26.0 new cases per
100,000 women per year and the estimated mortality was 4.6 deaths per
100,000 women per year from 2013 to 2015 [24, 25]. Uterine cancer can be
divided into two types: uterine sarcomas, which arise in the muscle layer
or connective tissue of the uterus, and endometrial carcinomas, which
arise in the endometrium of the uterus. The risk factors and symptoms of
the two types differ [26]. Risk factors for uterine sarcoma include a
history of radiation therapy, a history of retinoblastoma, and race, with
a reported twofold difference in diagnosis rates between racial groups
[27]. Endometrial carcinomas, which account for most uterine cancers, are
associated with risk factors such as age, obesity, a high number of
menstrual cycles, no prior pregnancy, a history of Lynch syndrome,
estrogen replacement therapy, and tamoxifen treatment [27]. Patients with
either endometrial cancer or uterine sarcoma may experience bleeding
unrelated to menstruation, postmenopausal bleeding, and unusual vaginal
discharge without visible blood [27]. Symptoms of endometrial cancer also
include painful urination, pain during intercourse, a mass in the pelvic
area, and unintentional weight loss. Symptoms of uterine sarcoma include
frequent urination, abdominal pain, a mass in the vagina, and a constant
feeling of fullness [27]. However, the treatment (surgery followed by
chemotherapy) is the same for both types because the cause is unknown.
Recently, to broaden the understanding of uterine cancer, TCGA analyzed
uterine corpus endometrial carcinomas (UCECs) and identified four
subtypes: 1) POLE ultramutated, 2) microsatellite instability
hypermutated, 3) copy number low, and 4) copy number high [20, 21]. The
UCECs were characterized by frequent mutations in TP53, extensive copy
number alterations, DNA methylation changes, and frequent mutations in
PTEN and KRAS [20]. Their analysis mainly covered the characterization of
genomic alterations and the definition of subtypes in UCEC. Therefore,
using their data to find genomic targets for novel molecular diagnostics
and effective precision medicine would be of great benefit in conquering
uterine cancer.
3. Next generation sequencing
Next generation sequencing (NGS) is one of the most advanced genome
analysis techniques to emerge in the biological sciences in the last 10
years [28]. Sanger sequencing, a first-generation sequencing method, is
considered the gold standard for human genetics research; however, it is
difficult to analyze the whole human genome with it because of the cost
and time required [29]. NGS has been under development since Dr. Sydney
Brenner proposed its basic concepts in 1992, and a human genome can now
be sequenced in about 3 days for about $1,000 [30]. Many research
projects, including HapMap, the 1000 Genomes Project, ENCODE, and
modENCODE, have analyzed genome sequencing data generated by NGS
[31, 32]. Additionally, NGS plays an important role in developing genetic
diagnosis and precision medicine for cancer [33]. There are various NGS
methods, and the method should be chosen according to the research
objective. To determine the gene sequences of a novel species, de novo
sequencing is usually performed. Over 10,000 species have been sequenced
de novo to date, and more are being studied through the Genome 10K
project [32]. However, de novo sequencing still needs improvement for
research use: the complete genome of an unknown species often cannot be
assembled because of technical limitations such as short read length and
difficulty in resolving repeated sequences. For cancer genetics,
re-sequencing technologies such as whole-genome sequencing (WGS),
whole-exome sequencing (WES), RNA sequencing (RNA-seq), whole-genome
bisulfite sequencing, and ChIP sequencing are used according to the
purpose of the analysis [34]. WGS generates sequencing data for the
entire genome and is used to identify the precise breakpoints of copy
number alterations (CNAs) and structural variations (SVs). However, WGS
is rarely used in clinical research because of its high cost and large
data size. WES covers the coding regions of genes (the exome),
approximately 1% of the genome, and is an accurate and cost-effective
method for identifying clinically relevant genomic alterations (SNVs and
indels) and copy number alterations in exonic regions [35, 36]. RNA
sequencing covers the whole transcriptome and is applied to the discovery
of novel splicing events, novel transcripts of unannotated
protein-coding genes, RNA editing, differentially expressed genes, and
fusion genes [37]. Whole-genome bisulfite sequencing was developed to
study the role of epigenetics in cancer and detects DNA methylation, an
epigenetic mark [38]. ChIP sequencing is a method for identifying DNA
enriched by chromatin immunoprecipitation [38]. Genomic analyses based on
NGS technology enable personalized management of patients according to
their molecular characteristics and broaden the genetic understanding of
cancers.
4. Cancer genomics
Cancer is a disease caused by certain changes to genes [39]. The genetic
changes in cancer can be divided into two types. The first type is
inherited from the parents. Such changes, called germline mutations, can
be found in every cell of the offspring [39]. The second type is acquired
through DNA damage caused by exposure to carcinogens such as tobacco
smoke and radiation. These changes, called somatic mutations, are not
passed on to offspring [39]. Cancers generally arise from somatic
mutations in oncogenes or tumor suppressor genes (TSGs). In the past,
researchers could examine only well-known sites, such as BRCA1 and TP53,
by Sanger sequencing. Recently, however, the development of next
generation sequencing technology has allowed researchers to
comprehensively discover frequent somatic mutations that can be main
causes of cancer [39]. They have also identified actionable mutations in
oncogenes via WGS and WES. Furthermore, using RNA-seq, one of the most
powerful NGS technologies in cancer genomics, they have discovered
molecular subtypes, differentially expressed genes (DEGs) between those
subtypes, and fusion genes that can act as drivers in cancers [40].
Additionally, researchers can study epigenetic alterations in the cancer
genome via bisulfite sequencing or ChIP sequencing. The Cancer Genome
Atlas (TCGA) has conducted many cancer genomics studies since 2005
through next generation sequencing and has broadened knowledge of the
mutational characteristics of cancer using WGS and WES. TCGA has also
shed light on the underlying mechanisms of cancer using RNA-seq and
microarrays. In general, cancer genomics studies using NGS identify
driver mutations in oncogenes and TSGs by filtering the somatic mutations
found in cancers, because cancers carry widely varying numbers of somatic
mutations and not all of them play a role in cancer [41]. Driver
alterations can be found by identifying single-nucleotide polymorphisms
(SNPs), copy number variations (CNVs), and other events such as fusion
genes, changes in transcriptome function, and gene methylation [41].
Synonymous SNPs, which are the most common in the human body, do not
contribute to cancer formation, whereas non-synonymous SNPs, which can be
driver mutations, play the most important roles in cancer [41]. Somatic
CNVs are also important genetic events in tumorigenesis and cancer
aggressiveness. They can be divided into two types: focal and arm-level
somatic copy number alterations (SCNAs) [42, 43]. Focal SCNAs usually
cause deletions or amplifications of gene-level regions, such as TSGs or
oncogenes, whereas arm-level SCNAs reflect aneuploidy of chromosome arms
and could serve as independent prognostic factors [43]. Additionally, it
is important to identify the different mutational processes underlying a
cancer through its mutational signature, which is measured with six
nucleotide substitution subtypes: C>A, C>G, C>T, T>A, T>C, and T>G.
Cancers arise through several exogenous and endogenous mutational
processes that generate somatic mutations, such as tobacco smoking,
ultraviolet light, and DNA replication deficiency [44]. Therefore,
mutational signature analysis is also one of the most powerful methods
for understanding the mechanisms of cancer genomics.
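The six substitution subtypes described above can be illustrated with a minimal Python sketch. It assumes that somatic single-base substitutions are supplied as (reference, alternate) base pairs; the function names and the example variants are hypothetical and are not taken from the analyses in this thesis. Substitutions whose reference base is a purine (A or G) are rewritten on the complementary strand, so that each of the 12 possible base changes falls into one of the six classes.

```python
from collections import Counter

# Watson-Crick complements, used to express every substitution
# relative to the pyrimidine (C or T) of the mutated base pair.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTION_CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref: str, alt: str) -> str:
    """Map a single-base substitution to one of the six classes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the change on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def substitution_spectrum(variants):
    """Return the proportion of each of the six classes among the variants."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in variants)
    total = sum(counts.values())
    return {
        cls: (counts[cls] / total if total else 0.0)
        for cls in SUBSTITUTION_CLASSES
    }


# Hypothetical somatic variants as (reference base, alternate base).
example_variants = [("G", "A"), ("C", "T"), ("A", "G"), ("T", "G"), ("G", "T")]
spectrum = substitution_spectrum(example_variants)
for cls in SUBSTITUTION_CLASSES:
    print(f"{cls}: {spectrum[cls]:.2f}")
```

The resulting proportions form a simple six-class spectrum for one tumor. Published signature analyses typically also take the bases flanking each substitution into account, which this sketch omits.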
5. Genomic characteristics of ovarian clear cell carcinoma
Ovarian clear cell carcinoma (OCCC), which is resistant to chemotherapy
and associated with endometriosis, is the fourth most common type of
epithelial ovarian carcinoma [45]. Although genomic studies of OCCC are
important for the development of precision medicine, only a few Japanese
groups have studied OCCC, and only by targeted sequencing of
formalin-fixed paraffin-embedded (FFPE) samples [45]. Previous studies
reported that PIK3CA, ARID1A, CTNNB1, CSMD3, LPHN3, LRP1B, TP53, KRAS,
PPP2R1A, and PTEN were frequently mutated in OCCC and that MLL3, ARID1B,
and PIK3R1 were altered less often [46]. MET gene amplification has also
been reported in OCCC [47]. By integrating genomic analyses, previous
studies showed that frequently mutated or amplified/deleted genes were
involved in the KRAS pathway, the MYC pathway, and the critical chromatin
remodeling complex pathway [47, 48]. However, previous studies could not
fully establish the genomic profile of OCCC because it is rarer than
other cancers and fresh tumor samples are difficult to obtain. All of the
previous studies analyzed FFPE samples of OCCC, in which artifactual
genomic alterations can be detected [49, 50]. Moreover, CNVs, another
important class of genomic alteration that can drive aggressive cancer,
were examined only in limited regions. To fully understand the genomic
events and tumorigenic mechanisms of OCCC, further studies are needed
using fresh tissues and more comprehensive next generation sequencing
methods such as WES, WGS, and RNA sequencing.
6. Genomic characteristics of uterine corpus endometrial
carcinomas
Uterine endometrial cancer is one of the most common cancers of the
female pelvis and arises in the cells of the inner lining of the uterus.
In the United States, 43,000 women have been diagnosed with uterine
corpus endometrial carcinoma, and about 8,000 of them have died of the
disease [25]. However, the risk factors for uterine cancer are not yet
clear. Therefore, TCGA analyzed 548 cases and characterized UCEC with
next generation sequencing data [20]. They identified four subtypes (POLE
ultramutated, microsatellite instability hypermutated, copy number low,
and copy number high) by analyzing the status of genomic alterations in
UCEC. TCGA found that PTEN, PIK3CA, TTN, ARID1A, PIK3R1, TP53, MUC4,
MUC16, KMT2D, and CSMD3 were frequently mutated in UCECs and
characterized the tumors by the frequency of TP53 mutations, somatic
CNVs, and DNA methylation [20]. Although, as TCGA suggested, the type and
burden of genomic alterations are important for characterizing UCECs,
only stage and grade have been reported as prognostic factors for
survival in UCEC. Investigating genetic factors in relation to prognosis
using TCGA sequencing data would therefore help in developing prognostic
markers for diagnosis and therapeutic agents for cancer therapy.
7. Objectives of this study
Although there have been considerable efforts to decipher the molecular
characteristics of OCCCs, obstacles such as the difficulty of obtaining
fresh tumor samples and the limitations of next generation sequencing
technology have made it difficult to understand the genomic alterations
and mechanisms of OCCC. In the first chapter, whole exome sequencing was
applied to 15 fresh OCCC tissues to discover genomic alterations (SNPs
and CNVs) and to establish the genomic landscape of OCCC. OCCC is also
well known to be associated with endometriosis, which can affect mutation
frequency. Therefore, the frequencies of somatic mutations and somatic
copy number variants were compared between patients with and without
endometriosis using various statistical methods.
The genomic profile of uterine corpus endometrial carcinoma has been
analyzed in depth by TCGA using next generation sequencing data. However,
TCGA focused on the genomic characterization of UCECs and the analysis of
genomic alterations, and examined only a small part of the relationship
between genomic risk factors and clinical features. Therefore, in the
second chapter, associations between clinicopathologic features and
genomic alterations were analyzed with various statistical methods to
identify an independent prognostic factor.
These studies are expected to deepen the understanding of the molecular
genetics of gynecologic cancer and to help discover genomic targets for
novel molecular diagnostics and effective precision medicine.
CHAPTER 1
Genomic landscape of ovarian clear cell
carcinoma via next generation
sequencing
Introduction
Ovarian cancer is the most lethal gynecologic cancer in the United
States, Korea, and even developing countries [23]. In Korea, the
incidence rate of ovarian cancer has been gradually increasing and was
expected to account for 2.5% (2,618) of new cancer cases and 3.8% (1,168)
of all cancer deaths among women in 2017 [5, 51]. The majority (90%) of
ovarian cancers are epithelial ovarian cancers (EOCs), which are further
grouped into different histologic subtypes [52].
Ovarian clear cell carcinoma (OCCC), the fourth most common type of
epithelial ovarian carcinoma, is resistant to chemotherapy and has a
poorer prognosis than other histologic EOC subtypes, such as serous or
endometrioid adenocarcinomas [53]. OCCC is associated with endometriosis
(EMS), which is a common benign condition in reproductive-age women
[54, 55]. Interestingly, OCCC is more common in East Asian women than in
Western women: it accounts for 24% of EOCs in Japan but only a small
portion of EOCs in Western countries [56]. In Korea, OCCC is the fourth
most common histologic subtype, accounting for 10.3% of EOCs, and the
incidence of OCCC has increased markedly across all age groups since
1999 [57].
Although carcinogenesis and cancer classification were previously
defined by morphology, molecular pathways that characterize
carcinogenesis are now also used for classification. Additionally, in
the era of precision medicine, reliable genetic diagnosis is essential
for providing individualized treatment for patients with OCCC. Although
genomic studies of OCCC are important for the development of precision
medicine, only a few Japanese groups have studied OCCC, using targeted
sequencing of formalin-fixed paraffin-embedded (FFPE) samples. Previous
studies reported that PIK3CA, ARID1A, CTNNB1, CSMD3, LPHN3, LRP1B, TP53,
KRAS, PPP2R1A, and PTEN were frequently mutated in OCCC and that MLL3,
ARID1B, and PIK3R1 were altered less often. MET gene amplification has
also been reported in OCCC. In OCCC, both a clinical approach that
considers the presence of underlying EMS and a genomic approach, such as
those conducted by The Cancer Genome Atlas (TCGA) group, may be necessary
[7]. However, the low incidence of OCCC hinders such integrative genomic
analyses. To date, only small genomic studies of OCCC have been reported
from some East Asian countries; the genomic landscape of Korean OCCC has
not yet been investigated.
The aim of this study was to obtain whole exome sequencing (WES) data of
Korean OCCC using next generation sequencing (NGS). Genomic profiles were
compared between EMS-associated OCCC (EMS-OCCC) and Non-EMS-OCCC.
MATERIALS AND METHODS
Study population
At our institution, Seoul National University Hospital (SNUH), patients
scheduled to undergo surgery for gynecologic cancer have been
routinely asked whether they will donate their biological samples (e.g.,
blood samples, cancer tissue samples) for research purposes since
June 2012. Blood samples and cancer tissues are obtained before
surgery and at the time of surgery, respectively, from those patients who
provide informed consent. The cancer tissues undergo gross
examination and frozen section procedures. In this step, the
pathologists ascertain necrotic portions of the tumor, which are ruled out
from banking. Only viable portions of the tumor are selected and cut in
the form of a 1 cm3 sized cube. These biospecimens are then stored at
the SNUH Human Biobank.
For the present study, we searched the SNUH Ovarian Cancer Cohort
to identify patients who met the following inclusion
criteria: 1) older than 18 years; 2) diagnosed with OCCC between June
2012 and December 2016; 3) underwent primary debulking surgery
(PDS); 4) agreed to donate their biological samples and provided
informed consent; and 5) blood and cancer tissue samples were stored
simultaneously at the SNUH Human Biobank. Patients with the following
conditions were excluded: 1) diagnosis of any malignancy other than
ovarian cancer; 2) neoadjuvant chemotherapy or targeted therapy
before surgery; 3) insufficient clinical data or lost to follow-up; and 4)
severe co-morbidities, such as end-stage renal disease, uncontrolled
diabetes mellitus, or long-term corticosteroid use.
Of the 15 patients with OCCC who met these criteria, 5 were
pathologically diagnosed with OCCC arising in EMS (EMS-OCCC group)
and the other 10 were pathologically diagnosed with OCCC that did not
arise in EMS (Non-EMS-OCCC group) [58]. By reviewing their medical
records, we collected information about clinicopathologic characteristics,
which we compared between the two groups.
Whole exome sequencing via next generation sequencing
technique
Both blood and cancer samples underwent preparation, DNA library
construction, and quality control analysis before sequencing. All blood
and cancer tissue samples passed quality control. The Illumina HiSeq
2000 system was used to obtain WES data. To identify somatic
mutations, 20% of 0.2 g of frozen cancer tissue samples were
sequenced at a depth of 150×. To identify germline mutations, 500-
600 μl of buffy coat blood samples were sequenced at a depth of 100×.
Both somatic and germline genomic data underwent mutation analysis,
including the identification of single nucleotide polymorphisms (SNPs)
and insertion or deletion mutations (INDELs), as well as copy number
variation (CNV) analysis, using MuTect [59], Indelocator, and ExomeCNV
[60]. The COSMIC, dbSNP, and ClinVar databases were searched to
determine whether the detected mutations had previously been assigned
ID numbers. We also performed in silico analyses with SIFT and PolyPhen2
to predict whether the observed mutations were likely to be deleterious:
SIFT (deleterious, score ≤0.05; tolerated, score >0.05); PolyPhen2
(probably damaging, score ≥0.957; possibly damaging, 0.453≤ score
<0.957; benign, score <0.453).
Validation with Sanger sequencing
Somatic mutations discovered by exome sequencing were
validated by Sanger sequencing. PCR amplification was performed under
the following conditions: 95°C for 5 minutes; 45 cycles of 95°C for 30
seconds, 60°C for 30 seconds, and 72°C for 1 minute; and 72°C for 7
minutes. Primers for the identified somatic mutations are presented in
Table 1-1. Sanger sequencing was run on ABI PRISM 3730XL Analyzer
(Applied Biosystems, Foster City, CA, USA) and variant sequences
were analyzed using Variant Reporter Software Version 1.1 (Applied
Biosystems).
Statistical Analysis
Statistical analyses were performed to evaluate differences between
groups with respect to patient characteristics. Student’s t-test and Mann-
Whitney U-test were used to compare continuous variables. Pearson’s
chi-squared test, Fisher’s exact test, and Kruskal-Wallis test were used
to compare categorical variables. Pearson correlation coefficients were
calculated between patient characteristics and somatically-mutated
genes. R statistical software (version 2.12; R Foundation for Statistical
Computing, Vienna, Austria; ISBN 3-900051-07-0; http://www.R-
project.org) was used for the statistical analyses. A P value < 0.05 was
considered statistically significant.
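
As an illustration of the categorical comparisons described above, the following is a minimal pure-Python sketch of a two-sided Fisher's exact test. It is an independent reimplementation for illustration, not the R code used in this study; the function name `fisher_exact_two_sided` is hypothetical, and the example counts are taken from the menopause and diabetes rows of Table 1-2.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value is the total probability of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x: int) -> int:
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(n, col1)


# Menopause: 4 of 5 EMS-OCCC vs. 6 of 10 Non-EMS-OCCC patients (Table 1-2).
p_menopause = fisher_exact_two_sided(4, 1, 6, 4)
# Diabetes: 0 of 5 EMS-OCCC vs. 2 of 10 Non-EMS-OCCC patients (Table 1-2).
p_diabetes = fisher_exact_two_sided(0, 5, 2, 8)
print(round(p_menopause, 3), round(p_diabetes, 3))  # 0.6 0.524
```

Integer weights are compared rather than floating-point probabilities, so ties between equally likely tables are handled exactly.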
Table 1-1. Primer design for validation with Sanger sequencing
Sample | Gene | Position | Ref seq | Alt seq | Forward primer | Reverse primer | Reverse primer sequence | Size
#9 | ARID1A | chr1:27106421 | T | A | 5_ARID1A_F | 5_ARID1A_R | CCGAGATGTTGGCGAGTGTA | 363
#1 | ARID1A | chr1:27023237 | A | C | 1_ARID1A_F | 1_ARID1A_R | TTGTTGTCCGCCATGTTGTT | 325
#1 | ARID1A | chr1:27023891 | G | A | 2_ARID1A_F | 2_ARID1A_R | CAGACAATGGCAGCTCCC | 647
#1 | ARID1A | chr1:27100129 | C | T | 4_ARID1A_F | 4_ARID1A_R | GCCTTGGGTGGAGAACTGAT | 386
#2 | ARID1A | chr1:27056178 | CCTCAGCCA | C | 3_ARID1A_F | 3_ARID1A_R | GTATAAGAGAGGCCGCCCAT | 386
#3 | ARID1A | chr1:27056181 | C | T | 3_ARID1A_F | 3_ARID1A_R | GTATAAGAGAGGCCGCCCAT | 386
#7 | ARID1A | chr1:27100207 | C | T | 4_ARID1A_F | 4_ARID1A_R | GCCTTGGGTGGAGAACTGAT | 386
#8 | ARID1A | chr1:27023743 | CG | C | 2_ARID1A_F | 2_ARID1A_R | CAGACAATGGCAGCTCCC | 647
#1 | ARID2 | chr12:46246393 | C | T | 11_ARID2_F | 11_ARID2_R | TGGCACAGCAACCATTGT | 380
#14 | ARID2 | chr12:46230701 | TAC | T | 10_ARID2_F | 10_ARID2_R | TTGAAATCAACAGGGTCCAGT | 559
#4 | ERBB2 | chr17:37868583 | G | T | 12_ERBB2_F | 12_ERBB2_R | TACCCATCAAAGCTCTCCGG | 337
#1 | ERBB2 | chr17:37882870 | C | A | 13_ERBB2_F | 13_ERBB2_R | TAGAAGGTGCTGTCCAAGGG | 399
#12 | GPC3 | chrX:132887723 | G | T | 41_GPC3_F | 41_GPC3_R | CCAGGTTTCCAAGTCACTGC | 316
#13 | GPC3 | chrX:132888189 | C | A | 42_GPC3_F | 42_GPC3_R | GGGACCTTAATCACCACAGC | 330
#11 | KRAS | chr12:25398284 | C | A | 9_KRAS_F | 9_KRAS_R | AAGCGTCGATGGAGGAGTTT | 544
#10 | KRAS | chr12:25398284 | C | A | 9_KRAS_F | 9_KRAS_R | AAGCGTCGATGGAGGAGTTT | 544
#1 | KRAS | chr12:25398240 | G | A | 9_KRAS_F | 9_KRAS_R | AAGCGTCGATGGAGGAGTTT | 544
#1 | LRP1B | chr2:141299405 | A | G | 16_LRP1B_F | 16_LRP1B_R | GGTGATAGTTAAATCTGGGCCAG | 381
#1 | LRP1B | chr2:141660725 | G | A | 17_LRP1B_F | 17_LRP1B_R | GGGCAAAGCAAACTATACTCCC | 496
#1 | LRP1B | chr2:142567941 | G | A | 19_LRP1B_F | 19_LRP1B_R | TGTGTGTTTCAGCTGAGTGG | 337
#9 | LRP1B | chr2:141806579 | C | A | 18_LRP1B_F | 18_LRP1B_R | TTGTGCCAGTTAAACGGTGG | 436
#13 | MED12 | chrX:70352744 | C | G | 44_MED12_F | 44_MED12_R | GTGTTTCCATCCCACAGCAG | 370
#3 | MED12 | chrX:70346843 | G | A | 43_MED12_F | 43_MED12_R | GAGTGTGAGGAAGTGCATGC | 386
#4 | MST1R | chr3:49940820 | G | A | 25_MST1R_2F | 25_MST1R_2R | CCTCTAGGGTCCCAGCTCG | 383
#14 | MST1R | chr3:49940343 | C | T | 24_MST1R_F | 24_MST1R_R | TTCCTGCATGACCTAGAGCC | 400
#4 | PIK3CA | chr3:178917478 | G | A | 21_PIK3CA_F | 21_PIK3CA_R | TGAGGTGAATTGAGGTCCCT | 366
#5 | PIK3CA | chr3:178936082 | G | A | 22_PIK3CA_F | 22_PIK3CA_R | CGTATCACCAACAGCAGGG | 730
#2 | PIK3CA | chr3:178936082 | G | A | 22_PIK3CA_F | 22_PIK3CA_R | CGTATCACCAACAGCAGGG | 730
#6 | PIK3CA | chr3:178916876 | G | A | 20_PIK3CA_F | 20_PIK3CA_R | AGAAAGGGACAACAGTTAAGCT | 422
#1 | PIK3CA | chr3:178952018 | A | G | 23_PIK3CA_2F | 23_PIK3CA_2R | TGCTGTTCATGGATTGTGCA | 484
#3 | PIK3CA | chr3:178936082 | G | A | 22_PIK3CA_F | 22_PIK3CA_R | CGTATCACCAACAGCAGGG | 730
#9 | PPP2R1A | chr19:52715983 | G | A | 15_PPP2R1A_F | 15_PPP2R1A_R | AGGGAGAGGAGAGGAACAGT | 462
#2 | PPP2R1A | chr19:52715983 | G | A | 15_PPP2R1A_F | 15_PPP2R1A_R | AGGGAGAGGAGAGGAACAGT | 462
#11 | PPP2R1A | chr19:52715982 | C | T | 15_PPP2R1A_F | 15_PPP2R1A_R | AGGGAGAGGAGAGGAACAGT | 462
#6 | PTEN | chr10:89692904 | C | G | 7_PTEN_F | 7_PTEN_R | AAATTCTCAGATCCAGGAAGAGG | 300
#1 | PTEN | chr10:89624271 | A | C | 6_PTEN_F | 6_PTEN_R | CCAGGCAAGAGTTCCGTCTA | 364
#1 | PTEN | chr10:89720853 | G | A | 8_PTEN_F | 8_PTEN_R | TTGACGCTGTGTACATTGGG | 375
#5 | RFX3 | chr9:3263078 | C | T | 39_RFX3_F | 39_RFX3_R | ATGCTACGCTCAGATGTCGA | 381
#13 | RFX3 | chr9:3330444 | G | C | 40_RFX3_F | 40_RFX3_R | ATGCGAAACTTGCCATGTTG | 335
#4 | SYNE1 | chr6:152646355 | T | A | 30_SYNE1_F | 30_SYNE1_R | CCCTTGTCTCCTCTCTTCCG | 254
#12 | SYNE1 | chr6:152779915 | C | T | 37_SYNE1_F | 37_SYNE1_R | GCTATGAACGTTCCCTGAGC | 493
#1 | SYNE1 | chr6:152454491 | G | A | 26_SYNE1_F | 26_SYNE1_R | CTCCACGTTTGATGCTCAGG | 478
#1 | SYNE1 | chr6:152510441 | A | C | 27_SYNE1_F | 27_SYNE1_R | TGCCATGATGTGCCTCTAGA | 406
#1 | SYNE1 | chr6:152539486 | G | A | 28_SYNE1_2F | 28_SYNE1_2R | CCACTTGCCCTTTTACCAGAC | 427
#1 | SYNE1 | chr6:152642987 | C | G | 29_SYNE1_F | 29_SYNE1_R | ACAGTGTTGAGGAAGTGTCTT | 340
#1 | SYNE1 | chr6:152675831 | T | G | 31_SYNE1_2F | 31_SYNE1_2R | GCTAACCCATGCAAGTGTGA | 953
#1 | SYNE1 | chr6:152683401 | G | T | 32_SYNE1_F | 32_SYNE1_R | CCCTTGCTTACTGGAGTGGA | 356
#1 | SYNE1 | chr6:152688444 | T | G | 33_SYNE1_F | 33_SYNE1_R | GACTTGCCTCGTATCTGTGC | 364
#1 | SYNE1 | chr6:152706918 | G | A | 34_SYNE1_F | 34_SYNE1_R | AGGCCTGTTGTCTTACCTGA | 376
#1 | SYNE1 | chr6:152711468 | C | A | 35_SYNE1_F | 35_SYNE1_R | ATTGCAATAGGGCCCAGAGT | 399
#1 | SYNE1 | chr6:152754967 | G | A | 36_SYNE1_F | 36_SYNE1_R | CCTTTGGACCCAGCAATGTT | 505
#1 | SYNE1 | chr6:152832214 | T | C | 38_SYNE1_F | 38_SYNE1_R | GTGCTTGTTTGTTTCGGTGC | 543
#14 | TP53 | chr17:7578260 | C | T | 14_TP53_F | 14_TP53_R | TTTCTTTGCTGCCGTCTTCC | 516
#15 | TP53 | chr17:7578427 | T | A | 14_TP53_F | 14_TP53_R | TTTCTTTGCTGCCGTCTTCC | 516
Alt seq, alternate sequence; Ref seq, reference sequence.
Forward primer sequences (not aligned to the rows above):
GCCACTGACAACCACCCTTA
GGTGGATGGAGACGGGAAAT
GTGTAGAATCAGTGAGGCATCA
CCAGGAGACCTCAGTACTGC
GGGACCGAACAGCATTTGAT
TTCCTGAGCTCCTTCTGCAA
GCCACTGACAACCACCCTTA
CAGGACTTGTAAGAGCAGTGG
AAGAATTATGGGCCGGGTCC
GGCGTGCCCCTATAATCCTA
TGATGGGTAAAGAGAATGAGGAG
GAGCTGAGACCGCACCAC
GCTTATGTGCACTGCCCTAG
GAGCCATTTCCATCCTGCAG
TACCAGGACCAGAGGAAACC
ACGGTTGAGGTCACTAAGCA
CCATTGAGTTGCCGATCAGA
TGTTAAAGCCCGTCCACTCA
TGAATGAGCCTGGTTTGCAC
CTTTGTCTACGAAAGCCTCTCT
TGGTTCTTTCCTGTCTCTGAAA
TAAACCTCATGCGGACCTGT
TAAACCTCATGCGGACCTGT
TAAACCTCATGCGGACCTGT
TTTACCACAGTTGCACAATATCC
GTGTCTGTGTCACCGGGAG
GCAGGGCACTAGGATCATCT
AGGAATGTTTGCTGCCTTTG
TGGTTCTTTCCTGTCTCTGAAA
TGGTTCTTTCCTGTCTCTGAAA
CAGAACAATGCCTCCACGAC
TGAACAGCCAAATCTTGAACACT
ACCGTCTACATCCAGCTTCC
CGAATGGCATGAGAACCTT
ACCAACCAGAAGAACTTGAGT
AGGTTGGAAGTTGACTCCCA
ACGAGCCCTTCTATCCTGTG
TCCCTTGGGACTGTCTAGAC
ACCAAGCAGTACGTTCTCCA
CCTGGGTTCATTAGCTGGGT
GGACCCTGACATACTCCCAA
GGACCCTGACATACTCCCAA
GGACCCTGACATACTCCCAA
AATTAGGGTTCTGAGGCGGG
CCACGACAGCACTATCCCTA
CCAACCACCAGTACAACTCC
TACAGGAGGCTTCAAATGCG
CGAAAGCTGGGCATTAACGA
CCACCCTGTCCTATCCTTCC
GCCACCGGAACATCAAGATC
GGGAAAGGAGCTGCAGGA
CCAACCACCAGTACAACTCC
CCACGACAGCACTATCCCTA
AATTAGGGTTCTGAGGCGGG
RESULTS
Characteristics of 15 Korean patients with OCCC
The patients’ clinicopathologic characteristics are depicted in Table 1-2.
Their median age was 51.1 years. The numbers of patients with stage I,
II, and III disease were 9, 2, and 4, respectively. All patients underwent
PDS, which was followed by adjuvant chemotherapy in all except two
patients. During the median observation period of 23.4 months, two
patients had a recurrence and received second-line chemotherapy. Of
these, one patient eventually died despite treatment at 19.0 months after
diagnosis. Clinicopathologic factors, such as age, body mass index
(BMI), parity, initial serum CA-125 levels, and International Federation
of Gynecology and Obstetrics (FIGO) stage, were not statistically
different between the EMS-OCCC and Non-EMS-OCCC groups.
Landscape of somatic mutations in Korean OCCCs
To detect somatic mutations, WES was performed on frozen tissues of
15 Korean OCCCs with paired blood samples. The 15 Korean OCCCs
in this study displayed a median of 178 somatic exonic mutations (range,
111-25,798) per tumor sample, as identified by NGS analysis (Figure
1-1A, Table 1-3). The number of somatic mutations was not associated
with accompanying endometriosis, tumor stage, patient age, or patient
BMI (Figure 1-1B). However, PIK3CA mutations and PTEN mutations
were positively associated with disease stage (P=0.004 and P=0.03,
respectively). The frequencies of significant somatic mutations were as
follows: PIK3CA (40%), ARID1A (40%), KRAS (20%), PPP2R1A (20%),
SYNE1 (20%), RFX3 (13%), MED12 (13%), GPC3 (13%), MST1R
(13%), TP53 (13%), ARID2 (13%), LRP1B (13%), PTEN (13%), and
ERBB2 (13%) (Figure 1-1C). C-to-T transitions were the most frequent
type of somatic mutation in each OCCC (proportion of C-to-T
transitions: 34%-57%) (Figure 1-1D).
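
The per-gene frequencies above are the proportion of the 15 patients carrying at least one mutation in a gene. A minimal sketch of that calculation follows; the patient sets are read from the patient column of Table 1-4 for six of the genes, and the names `mutated_patients` and `frequency_percent` are hypothetical helpers introduced only for illustration.

```python
N_PATIENTS = 15

# Patient numbers carrying at least one somatic mutation in each gene,
# as listed in Table 1-4 (only six genes are shown here).
mutated_patients = {
    "PIK3CA": {1, 2, 3, 4, 5, 6},
    "ARID1A": {1, 2, 3, 7, 8, 9},
    "KRAS": {1, 10, 11},
    "PPP2R1A": {2, 9, 11},
    "SYNE1": {1, 4, 12},
    "RFX3": {5, 13},
}


def frequency_percent(patients: set, n_patients: int = N_PATIENTS) -> int:
    """Percentage of patients with a mutated gene, rounded to an integer."""
    return round(100 * len(patients) / n_patients)


for gene, patients in mutated_patients.items():
    print(gene, f"{frequency_percent(patients)}%")
```

Using a set per gene means that a patient with several mutations in the same gene (for example, patient #1 in SYNE1) is counted once.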
Somatic mutations (26 known and 27 unpublished) in Korean
OCCCs
In total, 54 somatic mutations (3 nonsense variants and 51 missense
variants) were discovered across 14 genes in 15 Korean OCCCs by
NGS WES; the predicted functional impact, resulting from the
corresponding amino acid substitutions, was analyzed using SIFT and
PolyPhen2. 53 of 54 somatic mutations were verified by Sanger
sequencing: PIK3CA (6/6), ARID1A (8/8), KRAS (3/3), PPP2R1A (3/3),
SYNE1 (13/13), RFX3 (2/2), MED12 (2/2), GPC3 (1/2), MST1R (2/2),
TP53 (2/2), ARID2 (2/2), LRP1B (4/4), PTEN (3/3), and ERBB2 (2/2)
(Figure 1-2). Of the 53 validated mutations, 27 were not reported in the
COSMIC database, dbSNP, or ClinVar (Table 1-4). Of the newly
identified somatic mutations, 19 were designated “deleterious” by SIFT
and/or “probably damaging” or “possibly damaging” by PolyPhen2: ARID1A
(pLeu2011Gln, pAsn115His, pLeu1309Phe), ERBB2 (pGly344Cys,
pPhe976Leu), GPC3 (pPro273His), LRP1B (pSer2444Pro, pLeu38Phe),
MED12 (pLeu1489Val), MST1R (pAla234Thr), PTEN (pArg335Gln),
RFX3 (pAla488Thr, pGln97Glu), and SYNE1 (pAsn5174Ile,
pArg8641Trp, pAsn3630Thr, pPhe3401Leu, pAsp3294Ala, pThr112Ala)
(Table 1-4).
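
The SIFT and PolyPhen2 designations used here follow the score cut-offs stated in the Materials and Methods. A minimal sketch of those cut-offs is given below; the function names are hypothetical, and the sketch only maps a score to a label, it does not reproduce the SIFT or PolyPhen2 software.

```python
def classify_sift(score: float) -> str:
    """SIFT label: deleterious if score <= 0.05, otherwise tolerated."""
    return "deleterious" if score <= 0.05 else "tolerated"


def classify_polyphen2(score: float) -> str:
    """PolyPhen2 label using the 0.957 and 0.453 cut-offs from the Methods."""
    if score >= 0.957:
        return "probably damaging"
    if score >= 0.453:
        return "possibly damaging"
    return "benign"


# Example with made-up scores (not taken from the study data).
print(classify_sift(0.01), "/", classify_polyphen2(0.60))
# deleterious / possibly damaging
```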
Somatically-altered pathway in Korean OCCCs
Frequent somatic mutations in biological pathways play an important
role in tumorigenesis and tumor progression (15). To discover key
altered pathways in Korean OCCCs, we screened published papers on
the mutational spectrum for mutated genes and manually searched the
KEGG pathway database for frequently mutated and potentially
important genes. We found that significantly altered pathways were
associated with proliferation and survival (including the PI3K/AKT
pathway, TP53 pathway, and ERBB2 pathway) in 87% of the 15 Korean
OCCCs and with chromatin remodeling in 47% of OCCCs (Figure 1-3).
Landscape of somatic copy number variations
A median of 343 somatic CNVs (range, 43-1,820) were discovered per
tumor sample using the ExomeCNV tool (Table 1-3). CNV profiles were
variable across the 15 Korean OCCCs (Figure 1-4). Somatic CNVs were
also detected in the key pathways affected by somatic
mutations (TP53 pathway: patients #3, #11, and #13; PI3K/AKT
pathway: patients #13 and #15; MAPK pathway: patients #3, #9, #10,
#13, and #14) (Figure 1-5). Focal-level gains of oncogenes,
including NTRK1 (33%) on chromosome 1q, MYC (40%) on
chromosome 8q, and GNAS (47%) on chromosome 20q, were
frequently observed in the 15 Korean OCCCs (Figure 1-6). Focal-level
losses of TET2 (73%) on chromosome 4q, TSC1 (67%) on chromosome
9q, BRCA2 (60%) on chromosome 13q, and SMAD4 (47%)
on chromosome 18q were frequently detected (Figure 1-6). Arm-level
gains and losses most frequently involved chromosome 8q
(47%) and chromosome 4q (80%), respectively (Figure 1-5, Figure
1-7).
EMS-OCCC vs. Non-EMS-OCCC
To identify differences between EMS-OCCC and Non-EMS-OCCC,
somatic genetic alterations (somatic mutation: total burden and
frequently mutated genes; somatic CNVs: total burden, oncogenes and
tumor suppressor genes), as well as somatically-altered pathways
(somatic mutations: TP53 pathway, PI3K/AKT pathway, ERBB2
pathway, and chromatin remodeling pathway; somatic CNVs: TP53
pathway, PI3K/AKT pathway, and MAPK pathway) were compared. No
significant differences in the frequency of genetic alterations were
detected between the EMS-OCCC and Non-EMS-OCCC groups
(Figure 1-8, 1-9, 1-10, 1-11, 1-12, 1-13).
Table 1-2. Clinicopathologic characteristics of the patients.
Characteristics | All (n=15) | EMS-OCCC (n=5) | Non-EMS-OCCC (n=10) | P
Age at diagnosis, years
  Median (range) | 51.1 (28.9-71.4) | 51.3 (42.7-55.5) | 49.8 (28.9-71.4) | 0.953
BMI, kg/m2
  Median (range) | 21.5 (16.9-28.3) | 21.5 (19.5-23.1) | 21.6 (17.0-28.3) | 0.513
Parity
  Mean±SD | 1.3±0.9 | 1.2±0.8 | 1.4±1.0 | 0.594
Menopause | 10 (66.7) | 4 (80.0) | 6 (60.0) | 0.6
Comorbidities
  Hypertension | 1 (6.7) | 0 | 1 (10.0) | 1
  Diabetes | 2 (13.3) | 0 | 2 (20.0) | 0.524
  Dyslipidemia | 2 (13.3) | 0 | 2 (20.0) | 0.524
Alcohol intake | 3 (20.0) | 0 | 3 (30.0) | 0.505
Smoking | 0 | 0 | 0 | N/A
CA-125 at diagnosis, IU/ml
  Median (range) | 79.9 (7.7-1067.0) | 447.5 (15.5-1067.0) | 63.7 (7.7-269.8) | 0.165
FIGO stage | | | | 0.839
  I | 9 (60.0) | 3 (60.0) | 6 (60.0) |
  II | 2 (13.3) | 1 (20.0) | 1 (10.0) |
  III | 4 (26.7) | 1 (20.0) | 3 (30.0) |
Residual tumor at PDS | | | | 0.333
  None | 14 (93.3) | 4 (80.0) | 10 (100.0) |
  RT < 1 cm | 1 (6.7) | 1 (20.0) | 0 |
Adjuvant chemotherapy
  None | 2 (13.3) | 0 | 2 (20.0) | 0.524
  Paclitaxel-Carboplatin | 12 (80.0) | 5 (100.0) | 7 (70.0) |
  Irinotecan-Cisplatin | 1 (6.7) | 0 | 1 (10.0) |
Observation period, months | | | | 0.44
  Median | 23.4 | 48.9 | 20.3 |
Recurrence | 2 (13.3) | 1 (20.0) | 1 (10.0) | 1
Progression free survival, months
  Median | 23.4 | 30.1 | 19.4 | 0.808*
Death | 1 (6.7) | 0 | 1 (10.0) | 1
Values are n (%) unless otherwise specified.
EMS, endometriosis; EMS-OCCC, endometriosis associated ovarian clear cell carcinoma; BMI, body mass index; CA-125,
cancer antigen 125; FIGO, International Federation of Gynecology and Obstetrics 2014; N/A, not applicable; PDS, primary
debulking surgery; RT, residual tumor; SD, standard deviation.
* Log-rank test.
Table 1-3. Summary of somatic mutations and copy number variations
Characteristics | All (n=15) | EMS-OCCC (n=5) | Non-EMS-OCCC (n=10) | P
Somatic mutations
  Median (range) | 178 (111-25,798) | 168 (154-207) | 184 (111-25,798) | 0.307
Somatic CNVs
  Median (range) | 343 (43-1,820) | 221 (135-405) | 345 (43-1,820) | 0.19
Table 1-4. Significant somatic mutations in 15 Korean patients with OCCC
Patient | Gene | Position | Ref seq | Alt seq | Type of alteration | NM number | cDNA change | AA change | SIFT | PolyPhen2 | COSMIC ID | dbSNP ID | ClinVar ID | Validation by Sanger sequencing
#9 | ARID1A | chr1:27106421 | T | A | Missense | NM_006015.4 | c6032T>A | pLeu2011Gln | Deleterious | Probably Damaging | | | | Verified
#1 | ARID1A | chr1:27023237 | A | C | Missense | NM_006015.4 | c343A>C | pAsn115His | Deleterious | Probably Damaging | | | | Verified
#1 | ARID1A | chr1:27023891 | G | A | Missense | NM_006015.4 | c997G>A | pAla333Thr | Deleterious | Probably Damaging | COSM1602137 | | | Verified
#1 | ARID1A | chr1:27100129 | C | T | Missense | NM_006015.4 | c3925C>T | pLeu1309Phe | Deleterious | Possibly Damaging | | | | Verified
#2 | ARID1A | chr1:27056178 | CCTCAGCCA | C | Frameshift | NM_006015.4 | c1175_1182delCTCAGCCA | pPro392fs | | | | | | Verified
#3 | ARID1A | chr1:27056181 | C | T | Nonsense | NM_006015.4 | c1177C>T | pGln393* | | | COSM51446 | | | Verified
#7 | ARID1A | chr1:27100207 | C | T | Nonsense | NM_006015.4 | c4003C>T | pArg1335* | | | COSM907723 | rs387906846 | | Verified
#8 | ARID1A | chr1:27023743 | CG | C | Frameshift | NM_006015.4 | c854delG | pGly285fs | | | | | | Verified
#1 | ARID2 | chr12:46246393 | C | T | Missense | NM_152641.2 | c4487C>T | pSer1496Phe | Deleterious | Possibly Damaging | COSM3871644 | | | Verified
#14 | ARID2 | chr12:46230701 | TAC | T | Frameshift | NM_152641.2 | c951_952delAC | pLeu317fs | | | | | | Verified
#4 | ERBB2 | chr17:37868583 | G | T | Missense | NM_004448.3 | c1030G>T | pGly344Cys | Deleterious | Probably Damaging | | | | Verified
#1 | ERBB2 | chr17:37882870 | C | A | Missense | NM_004448.3 | c2928C>A | pPhe976Leu | Deleterious | Possibly Damaging | | | | Verified
#12 | GPC3 | chrX:132887723 | G | T | Missense | NM_001164617.1 | c818C>A | pPro273His | Deleterious | Probably Damaging | | | | Verified
#13 | GPC3 | chrX:132888189 | C | A | Missense | NM_001164617.1 | c352G>T | pVal118Phe | Deleterious | Benign | COSM6418500 | | | Not detected
#11 | KRAS | chr12:25398284 | C | A | Missense | NM_033360.3 | c35G>T | pGly12Val | Deleterious | Probably Damaging | COSM520 | rs121913529 | 27622 | Verified
#10 | KRAS | chr12:25398284 | C | A | Missense | NM_033360.3 | c35G>T | pGly12Val | Deleterious | Probably Damaging | COSM520 | rs121913529 | | Verified
#1 | KRAS | chr12:25398240 | G | A | Missense | NM_033360.3 | c79C>T | pHis27Tyr | Deleterious | Possibly Damaging | COSM5945259 | | | Verified
#1 | LRP1B | chr2:141299405 | A | G | Missense | NM_018557.2 | c7330T>C | pSer2444Pro | Tolerated | Possibly Damaging | | | | Verified
#1 | LRP1B | chr2:141660725 | G | A | Missense | NM_018557.2 | c3530C>T | pSer1177Leu | Tolerated | Benign | COSM3894883 | rs377518986 | | Verified
#1 | LRP1B | chr2:142567941 | G | A | Missense | NM_018557.2 | c112C>T | pLeu38Phe | Deleterious | Probably Damaging | | | | Verified
#9 | LRP1B | chr2:141806579 | C | A | Nonsense | NM_018557.2 | c1765G>T | pGlu589* | | | | | | Verified
#13 | MED12 | chrX:70352744 | C | G | Missense | NM_005120.2 | c4465C>G | pLeu1489Val | Deleterious | Probably Damaging | | | | Verified
#3 | MED12 | chrX:70346843 | G | A | Missense | NM_005120.2 | c2710G>A | pGlu904Lys | Deleterious | Probably Damaging | COSM1124671 | | | Verified
#4 | MST1R | chr3:49940820 | G | A | Missense | NM_002447.2 | c223C>T | pArg75Cys | Deleterious | Probably Damaging | | rs35887539 | | Verified
#14 | MST1R | chr3:49940343 | C | T | Missense | NM_002447.2 | c700G>A | pAla234Thr | Tolerated | Possibly Damaging | | | | Verified
#4 | PIK3CA | chr3:178917478 | G | A | Missense | NM_006218.2 | c353G>A | pGly118Asp | Deleterious | Probably Damaging | COSM751 | | | Verified
#5 | PIK3CA | chr3:178936082 | G | A | Missense | NM_006218.2 | c1624G>A | pGlu542Lys | Deleterious | Probably Damaging | COSM760 | rs121913273 | 40609 | Verified
#2 | PIK3CA | chr3:178936082 | G | A | Missense | NM_006218.2 | c1624G>A | pGlu542Lys | Deleterious | Probably Damaging | COSM760 | rs121913273 | 40609 | Verified
#6 | PIK3CA | chr3:178916876 | G | A | Missense | NM_006218.2 | c263G>A | pArg88Gln | Deleterious | Probably Damaging | COSM746 | rs121913287 | | Verified
#1 | PIK3CA | chr3:178952018 | A | G | Missense | NM_006218.2 | c3073A>G | pThr1025Ala | Tolerated | Probably Damaging | COSM771 | rs397517202 | 54634 | Verified
#3 | PIK3CA | chr3:178936082 | G | A | Missense | NM_006218.2 | c1624G>A | pGlu542Lys | Deleterious | Probably Damaging | COSM760 | rs121913273 | 40609 | Verified
#9 | PPP2R1A | chr19:52715983 | G | A | Missense | NM_014225.5 | c548G>A | pArg183Gln | Deleterious | Probably Damaging | COSM51253 | | | Verified
#2 | PPP2R1A | chr19:52715983 | G | A | Missense | NM_014225.5 | c548G>A | pArg183Gln | Deleterious | Probably Damaging | COSM51253 | | | Verified
#11 | PPP2R1A | chr19:52715982 | C | T | Missense | NM_014225.5 | c547C>T | pArg183Trp | Deleterious | Probably Damaging | COSM51211 | | | Verified
#6 | PTEN | chr10:89692904 | C | G | Missense | NM_000314.4 | c388C>G | pArg130Gly | Deleterious | Probably Damaging | COSM5219 | rs121909224 | 362837 | Verified
#1 | PTEN | chr10:89624271 | A | C | Missense | NM_000314.4 | c45A>C | pArg15Ser | Deleterious | Benign | COSM5275 | | | Verified
#1 | PTEN | chr10:89720853 | G | A | Missense | NM_000314.4 | c1004G>A | pArg335Gln | Deleterious | Possibly Damaging | | | | Verified
#5 | RFX3 | chr9:3263078 | C | T | Missense | NM_001282116.1 | c1462G>A | pAla488Thr | Deleterious | Possibly Damaging | | | | Verified
#13 | RFX3 | chr9:3330444 | G | C | Missense | NM_001282116.1 | c289C>G | pGln97Glu | Tolerated | Possibly Damaging | | | | Verified
#4 | SYNE1 | chr6:152646355 | T | A | Missense | NM_182961.3 | c15521A>T | pAsn5174Ile | Deleterious | Possibly Damaging | | | | Verified
#12 | SYNE1 | chr6:152779915 | C | T | Missense | NM_182961.3 | c2545G>A | pAla849Thr | Tolerated | Benign | | | | Verified
#1 | SYNE1 | chr6:152454491 | G | A | Missense | NM_182961.3 | c25921C>T | pArg8641Trp | Deleterious | Probably Damaging | | | | Verified
#1 | SYNE1 | chr6:152510441 | A | C | Missense | NM_182961.3 | c23247T>G | pIle7749Met | Deleterious | Benign | COSM5869529 | | | Verified
#1 | SYNE1 | chr6:152539486 | G | A | Missense | NM_182961.3 | c22097C>T | pAla7366Val | Tolerated | Benign | | | | Verified
#1 | SYNE1 | chr6:152642987 | C | G | Missense | NM_182961.3 | c15952G>C | pVal5318Leu | Tolerated | Benign | | | | Verified
#1 | SYNE1 | chr6:152675831 | T | G | Missense | NM_182961.3 | c10889A>C | pAsn3630Thr | Deleterious | Benign | | | | Verified
#1 | SYNE1 | chr6:152683401 | G | T | Missense | NM_182961.3 | c10203C>A | pPhe3401Leu | Deleterious | Probably Damaging | | | | Verified
#1 | SYNE1 | chr6:152688444 | T | G | Missense | NM_182961.3 | c9881A>C | pAsp3294Ala | Deleterious | Probably Damaging | | | | Verified
#1 | SYNE1 | chr6:152706918 | G | A | Missense | NM_182961.3 | c8543C>T | pAla2848Val | Tolerated | Probably Damaging | COSM277555 | rs368832347 | | Verified
#1 | SYNE1 | chr6:152711468 | C | A | Missense | NM_182961.3 | c8124G>T | pGlu2708Asp | Tolerated | Benign | | | | Verified
#1 | SYNE1 | chr6:152754967 | G | A | Missense | NM_182961.3 | c4424C>T | pSer1475Leu | Deleterious | Benign | COSM3021118 | | | Verified
#1 | SYNE1 | chr6:152832214 | T | C | Missense | NM_182961.3 | c334A>G | pThr112Ala | Deleterious | Probably Damaging | | | | Verified
#14 | TP53 | chr17:7578260 | C | T | Missense | NM_000546.5 | c589G>A | pVal197Met | Deleterious | Probably Damaging | COSM43779 | | | Verified
#15 | TP53 | chr17:7578427 | T | A | Missense | NM_000546.5 | c503A>T | pHis168Leu | Deleterious | Probably Damaging | COSM44801 | | | Verified
AA, amino acid; Alt seq, alternate sequence; Ref seq, reference sequence.
Figure 1-1. Landscape of somatic exonic mutations in 15 Korean
OCCCs. (A) The numbers of somatic mutations, for each patient with
OCCC, are shown. (B) Key clinical features (endometriosis, stage, age,
and body mass index) are presented. (C) Names of significantly mutated
genes (left), distribution of mutations across 15 OCCCs (middle), and
frequency of significantly mutated genes (right) are shown. (D) The
mutational signature of each OCCC is displayed.
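
The substitution proportions summarized in panel D can be tallied by collapsing each single-nucleotide change onto its pyrimidine reference base, so that G>A on one strand is counted together with C>T. The sketch below assumes this strand-collapsing convention, which the text does not state explicitly; the function names and the example variants are illustrative only.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Collapse a single-base change onto the pyrimidine (C or T) strand."""
    if ref in "AG":
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def class_proportions(snvs):
    """Proportion of each of the six substitution classes among the SNVs."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Four illustrative (ref, alt) pairs: two of them collapse to C>T.
example = [("G", "A"), ("C", "T"), ("A", "C"), ("G", "T")]
print(class_proportions(example))
# {'C>T': 0.5, 'T>G': 0.25, 'C>A': 0.25}
```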
Figure 1-2. Validation of somatic mutations. Of the 54 somatic
mutations, 53 were validated by Sanger sequencing.
Figure 1-3. Somatic mutations and CNVs pathways in OCCCs.
Somatically-altered pathways are plotted with both somatic mutations
and somatic CNVs in chromatin remodeling-related, and cell
proliferation and survival-related pathways.
Figure 1-4. Landscape of somatic copy number variations in 15
Korean OCCCs. A Circos plot is displayed, showing the somatic copy
number gains and losses in each OCCC, in order from the outermost ring
(patient #1 to patient #15).
Figure 1-5. Arm level CNVs and focal level CNVs. CNVs in key
biological pathways are plotted.
Figure 1-6. Copy number variation analysis for drug targetable
genes. Frequency of copy number gains (red) and copy number losses
(blue) are shown across the genome. Genes of interest are labeled with
frequency of CNVs.
Figure 1-7. Somatic copy number variations in chromosome arm
level. Arm-level somatic copy number variations are displayed by each
OCCC case (left, middle). Frequency of copy number gains (red) and
copy number losses (blue) are shown (right).
Figure 1-8. Comparisons of mutation burden and CNV burden
between endometriosis-associated and
non-endometriosis-associated OCCCs. Mutation burden and CNV burden
were plotted according to endometriosis status.
Figure 1-9. Comparisons of somatic mutations between
endometriosis-associated and non-endometriosis-associated
OCCCs in frequently mutated genes. Names of significantly mutated
genes (left), distribution of mutations across 15 OCCCs (middle), and
frequency of mutated genes, with P values for differences between the
two groups (right) are displayed.
Figure 1-10. Comparisons of somatic mutations between
endometriosis-associated and non-endometriosis-associated
OCCCs in somatically altered pathways. Names of significantly
mutated genes in somatically-altered pathways (left), distribution of
mutations across 15 OCCCs (middle), and frequency of mutated genes,
with P values for differences between the two groups (right) are
displayed.
Figure 1-11. Comparisons of somatic copy number variations
between endometriosis-associated and non-endometriosis-
associated OCCCs in amplified and deleted chromosomes. Names
of representative genes in amplified and deleted chromosomes (left),
distribution of mutations across 15 OCCCs (middle), and frequency of
mutated genes, with P values for differences between the two groups
(right) are displayed.
Figure 1-12. Comparisons of somatic copy number variations
between endometriosis-associated and non-endometriosis-
associated OCCCs in oncogenes and tumor suppressor genes.
Names of representative genes in oncogenes and tumor suppressor
genes (left), distribution of mutations across 15 OCCCs (middle), and
frequency of mutated genes, with P values for differences between the
two groups (right) are displayed.
Figure 1-13. Comparisons of somatic copy number variations
between endometriosis-associated and non-endometriosis-
associated OCCCs in somatically-altered pathways. Names of
amplified and deleted genes in somatically copy number-altered
pathways (left), distribution of mutations across 15 OCCCs (middle),
and frequency of mutated genes, with P values for differences between
the two groups (right), are displayed.
DISCUSSION
In the present study, we successfully characterized the genomic
landscape of 15 Korean patients with OCCC. This cancer featured
complex genomic alterations. To our knowledge, this is the first report
of an NGS WES study in Korean patients with OCCC, as well as the first
attempt to compare genomic profiles of OCCC according to the
presence or absence of EMS. As the TCGA Research Network emphasizes,
genomic analyses and identification of alterations will provide new
therapeutic approaches and allow subtype-stratified care, with the
anticipation that this will improve cancer outcomes [61].
It is well known that the different histologic types of EOC exhibit a distinct
mutation spectrum, reflecting their different etiology and lineage. In
contrast with high-grade serous carcinomas, OCCCs have few TP53
mutations, but they do contain recurrent ARID1A and PIK3CA mutations
[62, 63]. NGS WES data of the current study showed similar results, as
the proportions of ARID1A, PIK3CA, and TP53 mutations in the Korean
OCCCs were 40%, 40%, and 13%, respectively. Mutations in ARID1A,
a tumor suppressor gene, are found in many human cancers, including
OCCCs. A Japanese group also reported that ARID1A mutations
accounted for 17% of Japanese OCCCs [45]. Recently, ATR inhibitors
were suggested as candidates for synthetic lethal therapy in ARID1A-
deficient tumors [64]. In the current era of precision medicine, ATR
inhibitors may open new treatment choices for individuals with ARID1A-
mutated OCCCs.
Associations between EMS and OCCC have been reported in previous
studies. EMS, a common benign gynecologic disorder that occurs in
5%–10% of reproductive-age women, seems to increase the risk for
EOCs, especially OCCC or endometrioid carcinoma [65, 66]. The exact
molecular mechanisms underlying malignant transformation of EMS
have not been fully elucidated [66]. Nevertheless, mutations in ARID1A,
and PI3K/AKT pathway alterations, such as mutations in PIK3CA, are
considered an early event in the transformation of EMS into OCCC [67].
OCCCs arising in the presence of EMS, and those arising in the
absence of EMS, may develop from different processes of
carcinogenesis. However, previous comparisons between EMS-OCCC
and Non-EMS-OCCC have focused only on treatment response or
survival outcomes. Therefore, we compared genomic profiles between
the two groups in the current study, and the results showed no
significant differences in the frequency of genetic alterations between
EMS-OCCC and Non-EMS-OCCC groups.
Moreover, the present study represents the most precise and
comprehensive genomic analysis in Korean OCCCs in several ways.
First, fresh frozen tissue samples, rather than formalin-fixed paraffin-
embedded (FFPE) samples, were used in this study. Despite the use of
optimization methods, the yield rate of good quality DNA in FFPE
samples is known to be low, compared with the rate in fresh frozen
tissue samples [68]. The quality of DNA extraction in FFPE samples is
negatively affected by defects or artifacts that occur during sample
storage and preparation [69]. In the current study, the median storage
period of fresh frozen samples was 23.1 (range, 2.2-60.8) months, and
although some samples with relatively long storage periods were
included, NGS WES data were successfully obtained in all 15 patients.
Another advantage of using fresh frozen tissue is that a sufficient
amount of DNA, required for WES, is retained easily. However, fresh
frozen tissue samples may not always be available in some clinical
settings.
Second, we used the WES technique, rather than targeted sequencing,
for NGS DNA analyses. Compared with a designed cancer panel,
consisting of selected cancer-related genes, WES covers a wider area
of the genome and, thereby, more readily enables identification of novel
mutations. Of the 27 novel somatic mutations we identified in Korean
OCCCs, MST1R (p.Ala234Thr), GPC3 (p.Pro273His), MED12
(p.Leu1489Val), and RFX3 (p.Ala488Thr, p.Gln97Glu) could only be
discovered using WES because these genes are not usually included in
a cancer panel. MED12 (a component of the transcriptional Mediator
complex), MST1R, GPC3, and RFX3 were reported as cancer-related
genes in previous studies and in The Cancer Genome Atlas database
(Figures 1-14 to 1-17). In a recent exome sequencing study by a Japanese
group, the list of significantly mutated genes and their frequencies
differed from those in the current study. However, genetic alterations
related to cell proliferation and survival (MYC amplifications, RB1
deletions, ARID1A mutations, and PIK3CA mutations) were detected in
both studies [70].
Moreover, PIK3CA mutations and PTEN mutations were positively
associated with disease stage (P=0.004 and P=0.03, respectively)
(Figure 1-18). However, the frequencies of somatic genomic alterations
were not associated with clinical features (Figure 1-19).
Lastly, the WES technique, used in the current study, provided
additional analyses of copy number alterations. Because structural
variations, producing large genomic rearrangements in the human
genome, are known to play a key role in some diseases, confirmation of
somatic CNVs in the current study was meaningful [71]. Identification of
CNVs also helped to determine somatically-altered pathways more
precisely. CNV analyses identified amplification of oncogenes
(NTRK1, MYC, and GNAS) and deletion of tumor suppressor genes
(BRCA2 and ATM), which were not detected by somatic mutation
analysis.
Interestingly, somatic mutations in TP53 (13%; 2/15), and somatic CNVs
in BRCA2 (60%; 9/15) and TP53 (26%; 4/15), were more common in
this study than in previously reported studies. Both TP53 and BRCA2
genes are well-known tumor suppressor genes. As the TP53 pathway is
related to cancer cell proliferation and survival, somatic mutations in
TP53, although infrequent, are thought to play an important role in the
tumorigenesis of OCCC when they do occur.
Patient #1 showed a higher number of somatic mutations than the
other patients. We analyzed the relative proportions of the six
possible base-pair substitutions to identify the underlying mutational
processes. As a result, C to A transversions, known to be associated
with tobacco smoking [44, 72], were observed more frequently in
patient #1 (31%) than in the other patients (12%) (Figure 1-20).
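The six-class substitution spectrum described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: it assumes that each somatic single-nucleotide variant is available as a (reference base, alternate base) pair, and it follows the common convention of reporting every substitution from the pyrimidine strand. The function names and the toy input are hypothetical.

```python
from collections import Counter

# Complement table used to express every substitution from the pyrimidine (C or T) strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SIX_CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def substitution_class(ref, alt):
    """Collapse a single-base substitution into one of the six pyrimidine-based classes."""
    if ref in ("G", "A"):  # purine reference: report the complementary strand instead
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def class_proportions(snvs):
    """Return the fraction of SNVs falling in each of the six classes (zeros if empty)."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in SIX_CLASSES}


# Hypothetical toy input, not study data: (reference base, alternate base) pairs.
toy_snvs = [("C", "A"), ("G", "T"), ("C", "T"), ("A", "G")]
proportions = class_proportions(toy_snvs)
```

In this toy input, G>T is counted as C>A and A>G as T>C, so the C to A class reflects events on both strands.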
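Patient #13 showed an APOBEC-signature mutation pattern, which is
well known as a key player in mutagenesis in cervical, bladder, head
and neck, breast, and lung cancers (Figure 1-21). However, the
APOBEC-signature mutations should be verified with RNA sequencing
data, because APOBEC genes are functionally related to the C-to-U
RNA-editing cytidine deaminase.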
The current study has several limitations. First, the size of the study
population in this retrospective case-control study was small (n=15).
Second, expression of somatically-mutated genes was not investigated.
Whole transcriptome sequencing or microarray analyses to discover the
effects of genetic alterations would improve the accuracy and
completeness of genomic profiling. Despite these limitations, the current
study was the first to use NGS WES for the characterization of Korean
OCCCs. Frequently mutated genes, as well as novel
(unpublished/unreported) deleterious somatic mutations, were
discovered. Through the evaluation of somatic exonic mutations and
CNVs, we were able to plot somatically-altered pathways in OCCC. By
analyzing cancer tissue samples and matched blood samples from
individual OCCC patients, integrative somatic and genomic analyses
were performed.
In conclusion, the present study successfully characterized the genomic
landscape of 15 patients with OCCC. We identified potential therapeutic
targets in most tumors for the treatment of OCCC. Additional larger
studies, including whole transcriptome sequencing to determine the
effects of genetic alterations, are warranted.
Figure 1-14. The alteration frequency of RFX3 (Regulatory Factor
X3) in TCGA. The alteration frequencies of RFX3 are shown across
the cancer types studied in TCGA.
Figure 1-15. The alteration frequency of MST1R (Macrophage
stimulating protein receptor) in TCGA. The alteration frequencies of
MST1R are shown across the cancer types studied in TCGA.
Figure 1-16. The alteration frequency of MED12 in TCGA. The
alteration frequencies of MED12 are shown across the cancer types
studied in TCGA.
Figure 1-17. The alteration frequency of GPC3 (Glypican 3) in
TCGA. The alteration frequencies of GPC3 are shown across the
cancer types studied in TCGA.
Figure 1-18. Associations between tumor stage and somatically-
mutated genes. Pearson correlation coefficients (positive (red) /
negative (blue)) between tumor stage and somatically-mutated genes
are displayed (top). P values for differences between patients with and
without gene mutations are shown (bottom).
Figure 1-19. Association between somatic genomic alterations and
clinical features.
Figure 1-20. Mutation signature difference between patient #1 and
other patients. The relative proportions of the six possible
base-pair substitutions are plotted for patient #1 and the other patients.
Figure 1-21. APOBEC signature in patient #13. The mutational spectrum
is shown across substitution subtypes.
CHAPTER 2
Identification of LYL1 gene amplification as
an independent prognostic factor of uterine
corpus endometrial cancer via next
generation sequencing data analysis of The
Cancer Genome Atlas
INTRODUCTION
Uterine corpus endometrial cancer (UCEC) imposes a substantial burden
in both the United States and Korea [73]. In the United States, it is the
most common gynecologic cancer, and 61,380 new cases were identified in
2017 [74]. In Korea, the incidence of UCEC has been increasing, and
UCEC is expected to account for 2.5% (2,578) of new cancer cases in
2017 [5, 51]. However, the genomic basis of UCEC is not well understood
because its cause is unknown [16]. Recently, to broaden understanding of
uterine cancer, TCGA analyzed uterine corpus endometrial
carcinomas and identified four subtypes: POLE ultramutated,
microsatellite instability hypermutated, copy number low, and copy
number high [19]. These carcinomas were
characterized by frequent mutations in TP53, extensive copy number
alterations, DNA methylation changes, and frequent mutations in PTEN
and KRAS. However, that analysis focused mainly on characterizing
genomic alterations and defining subtypes of UCEC [19-21].
Therefore, using these data to find genomic targets for novel molecular
diagnostics and effective precision medicine would be of great benefit
in conquering uterine cancers. The Cancer Genome Atlas (TCGA) Research
Network reported integrated genomic and transcriptomic profiling of 373
UCEC patients [20]. Moreover, this group classified UCEC into four
categories, listed from good to poor prognosis: (1) POLE (ultramutated);
(2) microsatellite instability (MSI) (hypermutated); (3) copy-number low;
and (4) copy-number high. In particular, the copy-number high group
included most of the serous and serous-like endometrioid tumors and shares
genomic features with ovarian serous carcinomas. Since then,
researchers have designed clinical trials of post-surgical
adjuvant treatment for UCEC based on this molecular classification. In
accordance with the era of precision medicine, discovery of reliable
genetic changes is essential to provide individualized treatment for
patients with UCEC [20, 75]. Less widely known genes, such as LYL1,
may be identified as novel prognostic indicators or as potential
targets for therapeutic drugs. The LYL1 gene is located on the short (p) arm
of chromosome 19 at position 13 and encodes a protein possibly
associated with blood vessel maturation and hematopoiesis. As a
member of the basic helix-loop-helix transcription factor family, LYL1
is known to regulate cell proliferation and differentiation [76]. A
chromosomal aberration involving LYL1 causes a form of T-cell acute
lymphoblastic leukemia.
Interestingly, somatic amplifications of the LYL1 gene are observed more
frequently in UCEC than in most other cancers: UCEC ranks second among
the cancer types of The Cancer Genome Atlas (TCGA). However, their impact
on UCEC has not been evaluated in previous studies. Thus, in this study, we
aimed to elucidate whether genetic alterations in the LYL1 gene, such as
amplification status, affect survival outcomes in UCEC using data from
the TCGA database.
MATERIALS AND METHODS
Data acquisition
We downloaded the UCEC patients’ genomic alteration data and the
corresponding clinicopathologic information from the TCGA data portal
(https://tcgadata.nci.nih.gov/tcga/tcgaDownload.jsp) and the cBioPortal
for Cancer Genomics (http://www.cbioportal.org/). The Illumina Genome
Analyzer was utilized as the platform for DNA sequencing (Illumina Inc,
San Diego, CA). This study was conducted in compliance with the TCGA
publication guideline and publication policy
(http://cancergenome.nih.gov/publications/publicationguidelines).
Study population
In total, 370 UCEC patients were included in this study. We collected
patients’ clinicopathologic characteristics including age, underlying
comorbidities, International Federation of Gynecology and Obstetrics
(FIGO) stage, histologic type, tumor grade, and treatment of UCEC,
such as surgery, radiation, and chemotherapy. Information about
microsatellite instability (MSI) of the tumor was also collected. Patients
were assigned to the LYL1 amplification group and non-amplification
group according to the LYL1 gene alteration status.
Bioinformatic analysis
LYL1 gene alteration status, especially amplification status, was
examined using the cBioPortal for Cancer Genomics
(http://www.cbioportal.org/). The level 3 data of patients with UCEC and
raw reads (htseq counts) for differentially expressed gene analysis were
obtained from FireBrowse (http://firebrowse.org). The Kyoto
Encyclopedia of Genes and Genomes (KEGG) pathway analysis of
gene expression data [77] was performed through the Gene Set
Enrichment Analysis (GSEA) method [78], and the enriched pathways
were visualized with NetworkAnalyst (http://www.networkanalyst.ca)
[79]. For visualization of gene network analyses, the Search Tool for the
Retrieval of Interacting Genes/Proteins (STRING) database was used
with confidence scores of 400 to 1,000 [80]. Differentially expressed
genes were identified with the R package ‘DESeq2’ [81].
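The idea behind the enrichment score can be illustrated with a short sketch. It is a simplified, unweighted running-sum statistic (Kolmogorov-Smirnov style), given here only for orientation: the published GSEA method additionally weights each hit by its ranking metric and assesses significance by permutation, neither of which is reproduced below. The gene names and the function name are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a ranked gene list.

    Walk down the ranked list, stepping up by 1/n_hit at each gene in the
    set and down by 1/n_miss otherwise; return the largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, ordered from most up-regulated to most down-regulated.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})      # set concentrated at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})   # set concentrated at the bottom
```

A gene set concentrated at the top of the list yields a positive score, and one concentrated at the bottom yields a negative score.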
Statistical Analysis
Statistical analyses were performed to evaluate differences in
clinicopathologic characteristics between the two groups. Student’s t-
test and Mann-Whitney U-test were used to compare continuous
variables. Pearson’s chi-squared test and Fisher’s exact test were used
to compare categorical variables.
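For small groups, such as the 22 patients with LYL1 amplification, Fisher's exact test is often chosen over the chi-squared test. The sketch below shows how a two-sided P value for a 2x2 table can be computed from the hypergeometric distribution. It is an illustration only, not the software used in this study, and the counts are a hypothetical toy table rather than study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for floating-point ties
    )


# Hypothetical toy table: rows are groups, columns are feature present/absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

For this toy table the tables with a top-left cell of 0, 1, 3, or 4 are at least as extreme as the observed one, giving a P value of 34/70.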
We calculated progression-free survival (PFS) as the time interval
between the date of initial diagnosis and the date of disease
progression. Overall survival (OS) was calculated as the time interval
between the date of initial diagnosis and the date of cancer-related
death or end of the study. Survival analyses
were conducted using the Kaplan-Meier method with log-rank test.
Hazard ratios (HRs) and 95% confidence intervals (CIs) were calculated
using Cox proportional hazards regression models. We used SPSS
software version 21.0 (SPSS Inc., Chicago, IL, USA) for survival
analyses. All other statistical analyses were performed using the R
statistical software version 2.12.1 (R Foundation for Statistical
Computing, Vienna, Austria; ISBN 3-900051-07-0; http://www.R-
project.org). A P value <0.05 was considered statistically significant.
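The Kaplan-Meier estimate used for the survival curves can be sketched as follows. This is a minimal pure-Python illustration, not the SPSS procedure used in this study; it omits the log-rank test and confidence bands, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up in months; 1 = event observed, 0 = censored.
toy_times = [5, 8, 12, 12, 20]
toy_events = [1, 0, 1, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
```

In the toy data, one of five patients has an event at 5 months (survival 0.8), one is censored at 8 months, and two of the remaining three have events at 12 months (survival 0.8 x 1/3).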
RESULTS
Somatic copy number variations in uterine corpus endometrial
carcinoma
Frequencies of somatic amplifications of the LYL1 gene according to
cancer type in TCGA are depicted in Figure 2-1. UCEC ranked second
among cancer types in the frequency of LYL1 gene amplification. In genomic
alteration analyses, chromosomes 1q, 3q, 8q, 17q, and 19p were
frequently amplified among the 370 patients with UCEC (Figure 2-2). The
LYL1 gene on the 19p arm was amplified in 5.9% (22/370) of the UCEC
patients. Additionally, LYL1 was among the 15 most frequently
amplified oncogenes and deleted tumor suppressor genes filtered by
gene family analysis in Gene Set Enrichment Analysis (Figures 2-3 and
2-4).
Characteristics of the patients with uterine corpus endometrial
carcinoma
Patients’ clinicopathologic characteristics are presented in Table 2-1. The
mean age of patients was 63 years. Of the 370 UCEC patients, 304
(82.2%), 52 (14.1%), and 14 (3.8%) had endometrioid, serous, and
mixed adenocarcinoma types, respectively. Patients in the LYL1
amplification group were significantly older at diagnosis of UCEC and
showed more aggressive cancer features than those in the LYL1
non-amplification group: advanced-stage disease (FIGO stage III-IV)
(P=0.003), grade 3 disease (P<0.001), and serous histologic type
(P<0.001) were more common. The proportions of the four UCEC categories
of the TCGA classification also differed between the two groups:
72.7% of patients in the LYL1 amplification group belonged to the
copy-number high category, whereas only 12.1% of those in the LYL1
non-amplification group did (P<0.001). In terms of adjuvant treatment,
chemotherapy was more common in the LYL1 amplification group
than in the non-amplification group (50.0% vs. 28.4%, P=0.032) (Table 2-1).
Comparisons of survival outcomes between the two groups and
identification of prognostic factors
During a median observation period of 23.9 months (range, 0.5-191.7
months), 5 patients in the LYL1 amplification group and 34 in the LYL1
non-amplification group died of the disease. In survival analysis, the
LYL1 amplification group showed poorer PFS and OS: 3-year PFS rate, 34.4%
vs. 79.9% (P=0.031) and 5-year OS rate, 25.1% vs. 84.9% (P=0.014)
(Figures 2-5 and 2-6).
LYL1 amplification was significantly associated with poor OS in
univariate analysis (P=0.019) (Table 2-2). However, after adjusting for
variables including histologic type, grade, and FIGO stage, LYL1 status
was not confirmed as a prognostic factor for OS; only advanced-stage
disease (FIGO stage III-IV) was an independent poor prognostic factor
(adjusted HR, 3.509; 95% CI, 1.734-7.101; P<0.001). Table 2-2 also
presents factors associated with PFS. In univariate analysis, LYL1
amplification was associated with poor PFS (P=0.037). However, the
statistical significance of LYL1 status for survival outcome disappeared
in multivariate analysis. Advanced-stage disease (FIGO stage III-IV)
was identified as an independent poor prognostic factor for PFS
(adjusted HR, 3.581; 95% CI, 1.981-6.473; P<0.001).
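The relation between a Cox regression coefficient, its standard error, and the reported hazard ratio with its 95% confidence interval can be sketched as follows. This is only a consistency illustration under the usual Wald-interval assumption (the interval is symmetric on the log scale); the coefficient and standard error are back-calculated here from the reported values for advanced-stage disease and OS, and are not figures taken from the model output.

```python
from math import exp, log

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(beta, se, z=Z_95):
    """Return (HR, lower, upper) from a Cox coefficient and its standard error."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Back-calculate the standard error of log(HR) from a Wald-type interval."""
    return (log(upper) - log(lower)) / (2 * z)


# Reported for advanced-stage disease and OS: adjusted HR 3.509, 95% CI 1.734-7.101.
beta = log(3.509)
se = se_from_ci(1.734, 7.101)
hr, lower, upper = hazard_ratio_ci(beta, se)
```

Reconstructing the interval from the back-calculated values returns bounds that agree with the reported ones to rounding, which is what a Wald-type interval would produce.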
Next, we performed subgroup analyses for each histologic type. In
UCEC patients with the endometrioid histologic type (n=304), neither
the PFS nor the OS curves differed significantly between the LYL1
amplification and non-amplification groups (P=0.070 and P=0.323,
respectively). Multivariate analyses identified LYL1
amplification as an independent poor prognostic factor for PFS in this
subgroup with borderline statistical significance (adjusted HR, 4.093; 95%
CI, 0.926-18.012; P=0.063) (Table 2-3).
Differentially expressed genes in LYL1 amplified tumors
To discover the clinical significance of LYL1 gene amplification in UCEC
patients, we performed GSEA pathway analysis with the 993 genes showing
increased expression in the LYL1 amplification group. As a result, we
found that the MAPK signaling pathway, WNT signaling pathway, cell
cycle pathway, and cancer-related pathway were significantly
upregulated in this group (P<0.001, P=0.002, P=0.004, and P<0.001,
respectively) (Figures 2-7 and 2-8). From the 993 differentially expressed
genes, 384 cancer-related genes were filtered using the STRING
database; these were enriched in the MAPK signaling pathway, WNT
signaling pathway, cell cycle pathway, and cancer-related pathway. The
MYC, CDK6, PRKACA, and ERBB2 genes frequently interacted with other
cancer-related genes (Figure 2-9). We also conducted GSEA according to
histologic types and TCGA classes (Figure 2-10). Among the four TCGA
classes, only the copy-number high group showed LYL1 amplifications,
and the cell proliferation pathway was significantly enriched in this group.
Compared with the endometrioid type, cancer-related and cell proliferation
pathways and genes were more commonly enriched in the serous type
(Figure 2-11).
Figure 2-1. Frequencies of LYL1 gene amplification in various
cancer types. Frequencies of LYL1 genetic alterations are shown
across cancer types.
Figure 2-2. Frequencies of copy number variations. The
amplifications and deletions of copy number variations are shown
across chromosomes.
Figure 2-3. Correlations between amplification frequencies and
mortality. Correlations between amplification frequencies and mortality
are shown across the top 15 oncogenes.
Figure 2-4. Correlations between deletion frequencies and
mortality. Correlations between deletion frequencies and mortality are
shown across the top 15 tumor suppressor genes.
Table 2-1. Clinicopathologic characteristics of patients.
Figure 2-5. Overall survival of patients with uterine corpus
endometrial carcinoma. Overall survival of patients is shown by LYL1
gene status.
Figure 2-6. Progression-free survival of patients with uterine
corpus endometrial carcinoma. Progression-free survival of patients
is shown by LYL1 gene status.
Table 2-2. Factors associated with survival outcomes in patients with
uterine corpus endometrial carcinoma
Table 2-3. Factors associated with survival outcomes in patients with
endometrioid histologic type of uterine corpus endometrial carcinoma
Figure 2-7. Enrichment analysis by LYL1 gene status. Significantly
enriched pathways are shown for the 993 upregulated DEGs.
Figure 2-8. Expression levels of enriched DEGs. Expression levels
of enriched DEGs are shown by LYL1 amplification status.
Figure 2-9. Significant gene networks. Gene networks are shown with
simplified KEGG pathway annotations, grouped by process according to
the most common term in each network.
Figure 2-10. Gene set enrichment analysis according to histologic
types and TCGA classes. Gene set enrichment scores are shown by
histologic type.
Figure 2-11. Enriched gene list of cancer-related and cell
proliferation pathways. Cancer-related and cell proliferation genes are
shown according to the two histologic types: serous and endometrioid.
DISCUSSION
In this study, we investigated whether LYL1 gene amplification affects
survival outcomes in UCEC through analysis of the TCGA database.
Patients in the LYL1 amplification group showed poorer survival outcomes
than those in the non-amplification group. Previous studies
have sought novel biomarkers that predict survival outcomes in
various cancer types. For example, some researchers investigated the
impact of altered expression of specific genes, such as the homeobox gene
family, L1CAM, and MYC, on the prognosis of cancers using the TCGA
database [7, 82, 83]. The LYL1 gene, which encodes a basic helix-loop-helix
transcription factor, is known as an oncogene in human and mouse cancers
and shows many associations with cancer-related properties such as
angiogenesis [84-86]. Through genetic and epigenetic modulation, LYL1
acts as a regulator of cell proliferation and differentiation [76]. Both
in vivo and in vitro experiments demonstrated that LYL1 interacts with
several oncogenes, such as MYC, TAL1, TAL2, and LMO2 [87, 88]. To elucidate
the role of LYL1 gene amplification in UCEC, we analyzed TCGA
expression data and found that overexpressed cancer-related
genes are enriched in the MAPK signaling, WNT signaling, and cell cycle
pathways in UCEC patients with LYL1 gene amplification.
In particular, MYC, CDK6, PRKACA, and ERBB2, well-known
oncogenes and cancer markers, were overexpressed in the LYL1
amplification group, and MYC and ERBB2 have been reported to be
associated with uterine cancers in previous studies [89-93].
The current study failed to establish LYL1 gene amplification as an
independent prognostic marker for survival of UCEC patients in
multivariate analyses. Only advanced-stage disease was identified as a
poor prognostic marker with statistical significance. In this study, more
than three-fourths of patients had early-stage disease: 68.6% and 6.5%
were FIGO stage I and II, respectively. According to the Surveillance,
Epidemiology, and End Results data, the 5-year survival for disease with
distant metastasis was only 16.2%, whereas that for disease
confined to the primary site was 95.3% [94]. We believe that the effect of
stage on survival outcomes was considerable, making it difficult to
analyze the impact of LYL1 gene amplification in the current study
population. Nevertheless, we could infer the significance of LYL1
amplification status: LYL1 gene amplification can be a novel cancer
marker highlighting overexpression of accompanying oncogenes such as
MYC, PRKACA, ERBB2, and CDK6 from the MAPK signaling, WNT
signaling, and cell cycle pathways in UCEC (Figure 2-12), and there are
positive associations between these oncogenes and LYL1 amplification
in the 370 UCECs (Figure 2-13). Therefore, LYL1 gene amplification may
be a prognostic indicator in UCEC patients, with potential as a
novel target for therapeutic drugs.
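The Pearson correlation between amplification status and oncogene expression (Figure 2-13) can be sketched as follows. This is a minimal illustration under stated assumptions: amplification is coded as a binary indicator (1 = amplified, 0 = not amplified), which is an assumption of this sketch rather than a documented detail of the analysis, and the expression values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: binary amplification indicator and an oncogene's expression.
toy_amplified = [0, 0, 0, 1, 1]
toy_expression = [1.0, 2.0, 3.0, 6.0, 8.0]
r = pearson_r(toy_amplified, toy_expression)
```

With a binary indicator, this coefficient is the point-biserial correlation: a positive value means that expression tends to be higher in the amplified group.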
The current study has several limitations. First, validation of the
suggested role of the LYL1 gene and its association with other genes, as
well as of the underlying genetic mechanism, was not performed, and
protein-level expression was not investigated. Such proteogenomic
studies might reveal the effects of genetic alterations and improve the
accuracy and completeness of genomic profiling. In addition, further
studies to identify the genetic and epigenetic regulatory mechanisms of
the LYL1 gene and to evaluate its efficacy as a prognostic indicator and
therapeutic target are warranted.
In UCEC cell lines, the LYL1 gene could be overexpressed or inhibited by
siRNA, and the subsequent changes in cell differentiation, proliferation,
and death could be evaluated. Experiments in LYL1 knockout
patient-derived xenograft (PDX) animal models would be another
possible approach. Second, the sample size of the LYL1 gene amplification
group (n=22) was too small to support robust statistical analyses.
Additionally, given our interest in the LYL1 gene, we performed
similar analyses on other malignancies using the TCGA database. As a
result, high-grade serous ovarian cancer and UCEC were the top two
malignancies in which LYL1 gene amplifications were frequently detected
(Figure 2-14). Similarly, in high-grade serous ovarian cancer, patients
with LYL1 gene amplification showed significantly poorer OS rates
than those without LYL1 gene amplification (P=0.013) (Figure
2-15).
Figure 2-12. Gene expression in LYL1-amplified and
non-LYL1-amplified patients. Gene expression of oncogenes is
shown for LYL1-amplified and non-LYL1-amplified patients.
Figure 2-13. Pearson’s correlation between LYL1 amplification and
oncogene expression. The R values and P values are shown across
oncogenes.
Figure 2-14. The genomic alteration frequency of LYL1 gene. The
frequencies of genomic alterations in the LYL1 gene are shown across
various cancers.
Figure 2-15. Survival analysis in high-grade serous ovarian cancer.
Overall survival outcomes are shown by LYL1 gene status.
General discussion
For more exact diagnosis, more efficient management, and more
effective targeted therapeutic strategies against gynecological cancers,
the genomic profiles of OCCCs were analyzed via WES, and
genomic targets for novel molecular diagnostics and effective
precision medicine were sought by analyzing next generation
sequencing data of OCCC and UCEC.
However, in the first chapter, there were several idiosyncratic points.
First, patient #1 showed a higher number of somatic mutations than
the other patients. We analyzed the relative proportions of the six
possible base-pair substitutions to identify mutational processes [95,
96]. As a result, C to A transversions [97-99], known to be associated
with tobacco smoking, were observed more frequently in patient #1
(31%) than in the others (12%). Second, patient #13 showed an
APOBEC-signature mutation pattern, which is well known as a key player
in mutagenesis in cervical, bladder, head and neck, breast, and lung
cancers [100, 101]. However, the APOBEC-signature mutations
should be verified with RNA sequencing data because APOBEC genes
are functionally related to the C-to-U RNA-editing cytidine deaminase
[102]. Finally, the first study has several additional limitations: 1) the
size of the study population in this retrospective case-control study was
too small (n=15) to test associations between the groups with and
without endometriosis; 2) expression was not investigated via whole
transcriptome sequencing or microarray to discover the effects
of genetic alterations; and 3) the CNVs detected via WES have
not yet been validated.
In the second chapter, there were also several idiosyncratic points. The
cohort with LYL1 amplification was very small (n=22) among the
370 UCECs, and it is difficult to determine whether LYL1 amplification is
a driver or a passenger alteration. LYL1 gene amplification and
its association with the expression of other genes were nevertheless
revealed. To overcome these limitations, we followed the previously
published method of Raphael and colleagues to identify a driver mutation,
as follows: 1) recurrence of the gene; 2) prediction of the functional
impact of individual gene mutations; and 3) assessment of combinations of
mutations using pathways and interaction networks. This method has
been widely used to identify driver mutations from NGS data
[103]. In addition, we measured Pearson’s correlation coefficients to
evaluate associations between LYL1 amplification and differentially
expressed genes (DEGs). LYL1 was the 10th most frequently amplified
oncogene among the 255 oncogenes whose amplifications were
detected in UCEC, and DEG analyses revealed that the
expression of specific genes increased with LYL1 gene
amplification. Despite these limitations, the current study used the TCGA
database and discovered a novel gene, LYL1, that is associated with
the prognosis of UCEC by analyzing not only bioinformatics data but
also patients’ clinicopathologic data.
Taken together, the two studies in this thesis should improve the
accuracy and completeness of the genomic understanding of OCCC and UCEC
and lead to more efficient diagnosis, management, and targeted
therapeutic strategies against OCCC and UCEC.
General conclusion
The objective of this study was to establish a deeper
understanding of the molecular genetics of gynecologic cancers
(especially ovarian clear cell carcinoma and uterine corpus
endometrial carcinoma) and to discover genomic targets
for novel molecular diagnostics and effective precision medicine.
In the first chapter, a precise genomic landscape of 15 patients
with OCCC was successfully established from frequently mutated
genes (PIK3CA, ARID1A, KRAS, PPP2R1A, SYNE1, RFX3, MED12,
GPC3, MST1R, TP53, ARID2, LRP1B, PTEN and ERBB2) via whole
exome sequencing, and the genomic features of OCCC were
characterized by somatic mutations and somatic copy number
variations. Additionally, potential therapeutic targets, such as MYC and
BRCA1, in OCCCs were identified. Although endometriosis
is associated with OCCC, no significant differences in the frequency
of genetic alterations were detected between the EMS- and Non-EMS-
OCCC groups in this study. However, even with this study, the
mechanism of OCCC carcinogenesis remains unclear. Therefore, additional
larger studies, including whole transcriptome sequencing to determine
the effects of genetic alterations, will be necessary in the future.
In the second chapter, analysis of TCGA expression and clinical data
suggested that LYL1 gene amplification might be associated with
poor survival outcomes in UCEC patients, especially those with the
endometrioid histologic type
(3-year PFS: 34.4% vs. 79.9%, P = 0.031; 5-year OS: 25.1% vs. 84.9%,
P = 0.014). This study also suggests that expression of cancer-related
genes (MYC, CDK6, PRKACA and ERBB2) is increased in UCEC
patients with LYL1 gene amplification, and that the MAPK, WNT, and
cell cycle pathways are significantly enriched with LYL1 gene
amplification. Therefore, LYL1 gene amplification can be a prognostic
indicator in UCEC, and it may be a new potential target for
therapeutic drugs.
It is expected that the findings in this thesis will lead to broader
understanding, more exact diagnosis, more efficient management, and
more effective targeted therapeutic strategies for gynecological cancers.
References
1. Kim, S.I., et al., Genomic landscape of ovarian clear cell
carcinoma via whole exome sequencing. Gynecol Oncol, 2018. 148(2):
p. 375-382.
2. Kim, S.I., et al., LYL1 gene amplification predicts poor survival
of patients with uterine corpus endometrial carcinoma: analysis of the
Cancer genome atlas data. BMC Cancer, 2018. 18(1): p. 494.
3. Suh, D.H., et al., Practice guidelines for management of ovarian
cancer in Korea: a Korean Society of Gynecologic Oncology Consensus
Statement. J Gynecol Oncol, 2018. 29(4): p. e56.
4. Torre, L.A., et al., Ovarian cancer statistics, 2018. CA Cancer J
Clin, 2018.
5. Jung, K.W., et al., Prediction of Cancer Incidence and Mortality
in Korea, 2017. Cancer Res Treat, 2017. 49(2): p. 306-312.
6. Matz, M., et al., The histology of ovarian cancer: worldwide
distribution and implications for international survival comparison.
Gynecol Oncol, 2017. 144(2): p. 405-413.
7. Cancer Genome Atlas Research Network, Integrated genomic
analyses of ovarian carcinoma. Nature, 2011. 474(7353): p. 609-15.
8. Sfakianos, G.P., et al., Validation of ovarian cancer gene
expression signatures for survival and subtype in formalin fixed paraffin
embedded tissues. Gynecol Oncol, 2013. 129(1): p. 159-64.
9. Trabert, B., et al., Reported Incidence and Survival of Fallopian
Tube Carcinomas: A Population-Based Analysis From the North
American Association of Central Cancer Registries. J Natl Cancer Inst,
2017.
10. Kurman, R.J. and I.M. Shih, The Dualistic Model of Ovarian
Carcinogenesis: Revisited, Revised, and Expanded. Am J Pathol, 2016.
186(4): p. 733-47.
11. Haruta, S., et al., Molecular genetics and epidemiology of
epithelial ovarian cancer (Review). Oncol Rep, 2011. 26(6): p. 1347-56.
12. Patch, A.M., et al., Whole-genome characterization of
chemoresistant ovarian cancer. Nature, 2015. 521(7553): p. 489-94.
13. Tsang, Y.T., et al., KRAS (but not BRAF) mutations in ovarian
serous borderline tumour are associated with recurrent low-grade
serous carcinoma. J Pathol, 2013. 231(4): p. 449-56.
14. Grisham, R.N., et al., BRAF mutation is associated with early
stage disease and improved outcome in patients with low-grade serous
ovarian cancer. Cancer, 2013. 119(3): p. 548-554.
15. Jones, S., et al., Frequent mutations of chromatin remodeling
gene ARID1A in ovarian clear cell carcinoma. Science, 2010. 330(6001):
p. 228-31.
16. Catasus, L., et al., Molecular genetic alterations in endometrioid
carcinomas of the ovary: similar frequency of beta-catenin abnormalities
but lower rate of microsatellite instability and PTEN alterations than in
uterine endometrioid carcinomas. Hum Pathol, 2004. 35(11): p. 1360-8.
17. Mackenzie, R., et al., Targeted deep sequencing of mucinous
ovarian tumors reveals multiple overlapping RAS-pathway activating
mutations in borderline and cancerous neoplasms. BMC Cancer, 2015.
15: p. 415.
18. Shen, J., et al., ARID1A Deficiency Impairs the DNA Damage
Checkpoint and Sensitizes Cells to PARP Inhibitors. Cancer Discov,
2015. 5(7): p. 752-67.
19. Bitler, B.G., et al., Synthetic lethality by targeting EZH2
methyltransferase activity in ARID1A-mutated cancers. Nat Med, 2015.
21(3): p. 231-8.
20. Cancer Genome Atlas Research Network, et al., Integrated genomic
characterization of endometrial carcinoma. Nature, 2013. 497(7447): p.
67-73.
21. Cherniack, A.D., et al., Integrated Molecular Characterization of
Uterine Carcinosarcoma. Cancer Cell, 2017. 31(3): p. 411-423.
22. Jemal, A., et al., Global cancer statistics. CA Cancer J Clin,
2011. 61(2): p. 69-90.
23. Torre, L.A., et al., Global cancer statistics, 2012. CA Cancer J
Clin, 2015. 65(2): p. 87-108.
24. Lozano, R., et al., Global and regional mortality from 235
causes of death for 20 age groups in 1990 and 2010: a systematic
analysis for the Global Burden of Disease Study 2010. Lancet, 2012.
380(9859): p. 2095-128.
25. Siegel, R.L., K.D. Miller, and A. Jemal, Cancer statistics, 2016.
CA Cancer J Clin, 2016. 66(1): p. 7-30.
26. Carter, J. and S. Pather, An overview of uterine cancer and its
management. Expert Rev Anticancer Ther, 2006. 6(1): p. 33-42.
27. Singh, S.D., et al., Ovarian and uterine cancer incidence and
mortality in American Indian and Alaska Native women, United States,
1999-2009. Am J Public Health, 2014. 104: p. S423-31.
28. Salk, J.J., M.W. Schmitt, and L.A. Loeb, Enhancing the
accuracy of next-generation sequencing for detecting rare and
subclonal mutations. Nat Rev Genet, 2018. 19(5): p. 269-285.
29. Le Gallo, M., F. Lozy, and D.W. Bell, Next-Generation
Sequencing. Adv Exp Med Biol, 2017. 943: p. 119-148.
30. van Nimwegen, K.J., et al., Is the $1000 Genome as Near as
We Think? A Cost Analysis of Next-Generation Sequencing. Clin Chem,
2016. 62(11): p. 1458-1464.
31. Couzin, J., Human genome. HapMap launched with pledges of
$100 million. Science, 2002. 298(5595): p. 941-2.
32. International HapMap Consortium, The International HapMap Project.
Nature, 2003. 426(6968): p. 789-96.
33. Blumenthal, G.M., E. Mansfield, and R. Pazdur, Next-
Generation Sequencing in Oncology in the Era of Precision Medicine.
JAMA Oncol, 2016. 2(1): p. 13-4.
34. Metzker, M.L., Sequencing technologies - the next generation.
Nat Rev Genet, 2010. 11(1): p. 31-46.
35. Sokolenko, A.P., et al., Identification of novel hereditary cancer
genes by whole exome sequencing. Cancer Lett, 2015. 369(2): p. 274-
88.
36. Magi, A., et al., EXCAVATOR: detecting copy number variants
from whole-exome sequencing data. Genome Biol, 2013. 14(10): p.
R120.
37. Ozsolak, F. and P.M. Milos, RNA sequencing: advances,
challenges and opportunities. Nat Rev Genet, 2011. 12(2): p. 87-98.
38. Boerno, S.T., et al., Next-generation sequencing technologies
for DNA methylation analyses in cancer genomics. Epigenomics, 2010.
2(2): p. 199-207.
39. Martincorena, I. and P.J. Campbell, Somatic mutation in cancer
and normal cells. Science, 2015. 349(6255): p. 1483-9.
40. Seo, J.S., et al., The transcriptional landscape and mutational
profile of lung adenocarcinoma. Genome Res, 2012. 22(11): p. 2109-19.
41. Carr, T.H., et al., Defining actionable mutations for oncology
therapeutic development. Nat Rev Cancer, 2016. 16(5): p. 319-29.
42. Mermel, C.H., et al., GISTIC2.0 facilitates sensitive and
confident localization of the targets of focal somatic copy-number
alteration in human cancers. Genome Biol, 2011. 12(4): p. R41.
43. Beroukhim, R., et al., The landscape of somatic copy-number
alteration across human cancers. Nature, 2010. 463(7283): p. 899-905.
44. Lawrence, M.S., et al., Mutational heterogeneity in cancer and
the search for new cancer-associated genes. Nature, 2013. 499(7457):
p. 214-218.
45. Maru, Y., et al., Identification of novel mutations in Japanese
ovarian clear cell carcinoma patients using optimized targeted NGS for
clinical diagnosis. Gynecol Oncol, 2017. 144(2): p. 377-383.
46. Gadducci, A., N. Lanfredini, and R. Tana, Novel insights on the
malignant transformation of endometriosis into ovarian carcinoma.
Gynecol Endocrinol, 2014. 30(9): p. 612-7.
47. Yamashita, Y., et al., Met is the most frequently amplified gene
in endometriosis-associated ovarian clear cell adenocarcinoma and
correlates with worsened prognosis. PLoS One, 2013. 8(3): p. e57724.
48. Anglesio, M.S., et al., Multifocal endometriotic lesions
associated with cancer are clonal and carry a high mutation burden. J
Pathol, 2015. 236(2): p. 201-9.
49. Ivanov, M., et al., Towards standardization of next-generation
sequencing of FFPE samples for clinical oncology: intrinsic obstacles
and possible solutions. J Transl Med, 2017. 15(1): p. 22.
50. Ofner, R., et al., Non-reproducible sequence artifacts in FFPE
tissue: an experience report. J Cancer Res Clin Oncol, 2017. 143(7): p.
1199-1207.
51. Lim, M.C., et al., Incidence of cervical, endometrial, and ovarian
cancer in Korea, 1999-2010. J Gynecol Oncol, 2013. 24(4): p. 298-302.
52. Cho, K.R. and I.M. Shih, Ovarian cancer. Annu Rev Pathol,
2009. 4: p. 287-313.
53. Sugiyama, T., et al., Clinical characteristics of clear cell
carcinoma of the ovary: a distinct histologic type with poor prognosis and
resistance to platinum-based chemotherapy. Cancer, 2000. 88(11): p.
2584-9.
54. Munksgaard, P.S. and J. Blaakaer, The association between
endometriosis and ovarian cancer: a review of histological, genetic and
molecular alterations. Gynecol Oncol, 2012. 124(1): p. 164-9.
55. Pearce, C.L., et al., Association between endometriosis and risk
of histological subtypes of ovarian cancer: a pooled analysis of case-
control studies. Lancet Oncol, 2012. 13(4): p. 385-94.
56. Sugiyama, T., et al., Randomized Phase III Trial of Irinotecan
Plus Cisplatin Compared With Paclitaxel Plus Carboplatin As First-Line
Chemotherapy for Ovarian Clear Cell Carcinoma: JGOG3017/GCIG
Trial. J Clin Oncol, 2016. 34(24): p. 2881-7.
57. Kim, S.I., et al., Incidence of epithelial ovarian cancer according
to histologic subtypes in Korea, 1999 to 2012. J Gynecol Oncol, 2016.
27(1): p. e5.
58. Scott, R.B., Malignant changes in endometriosis. Obstet
Gynecol, 1953. 2(3): p. 283-9.
59. do Valle, I.F., et al., Optimized pipeline of MuTect and GATK
tools to improve the detection of somatic single nucleotide
polymorphisms in whole-exome sequencing data. BMC Bioinformatics,
2016. 17: p. 341.
60. Sathirapongsasuti, J.F., et al., Exome sequencing-based copy-
number variation and loss of heterozygosity detection: ExomeCNV.
Bioinformatics, 2011. 27(19): p. 2648-54.
61. Tomczak, K., P. Czerwinska, and M. Wiznerowicz, The Cancer
Genome Atlas (TCGA): an immeasurable source of knowledge.
Contemp Oncol, 2015. 19(1A): p. A68-77.
62. Morikawa, A., et al., PIK3CA and KRAS mutations in cell free
circulating DNA are useful markers for monitoring ovarian clear cell
carcinoma. Oncotarget, 2018. 9(20): p. 15266-15274.
63. Itamochi, H., et al., Whole-genome sequencing revealed novel
prognostic biomarkers and promising targets for therapy of ovarian clear
cell carcinoma. Br J Cancer, 2017. 117(5): p. 717-724.
64. Hall, A.B., et al., Potentiation of tumor responses to DNA
damaging therapy by the selective ATR inhibitor VX-970. Oncotarget,
2014. 5(14): p. 5674-85.
65. Gounaris, I. and J.D. Brenton, Molecular pathogenesis of
ovarian clear cell carcinoma. Future Oncol, 2015. 11(9): p. 1389-405.
66. Yamashita, Y., Ovarian cancer: new developments in clear cell
carcinoma and hopes for targeted therapy. Jpn J Clin Oncol, 2015. 45(5): p.
405-7.
67. Li, M., et al., Characterization of ovarian clear cell carcinoma
using target drug-based molecular biomarkers: implications for
personalized cancer therapy. J Ovarian Res, 2017. 10(1): p. 9.
68. Astolfi, A., et al., Whole exome sequencing (WES) on formalin-
fixed, paraffin-embedded (FFPE) tumor tissue in gastrointestinal stromal
tumors (GIST). BMC Genomics, 2015. 16: p. 892.
69. Einaga, N., et al., Assessment of the quality of DNA from
various formalin-fixed paraffin-embedded (FFPE) tissues and the use of
this DNA for next-generation sequencing (NGS) with no artifactual
mutation. PLoS One, 2017. 12(5): p. e0176280.
70. Murakami, R., et al., Exome Sequencing Landscape Analysis in
Ovarian Clear Cell Carcinoma Shed Light on Key Chromosomal
Regions and Mutation Gene Networks. Am J Pathol, 2017. 187(10): p.
2246-2258.
71. Freeman, J.L., et al., Copy number variation: new insights in
genome diversity. Genome Res, 2006. 16(8): p. 949-61.
72. Hainaut, P., M. Olivier, and G.P. Pfeifer, TP53 mutation
spectrum in lung cancers and mutagenic signature of components of
tobacco smoke: lessons from the IARC TP53 mutation database.
Mutagenesis, 2001. 16(6): p. 551-3; author reply 555-6.
73. Fitzmaurice, C., et al., The Global Burden of Cancer 2013.
JAMA Oncol, 2015. 1(4): p. 505-27.
74. Siegel, R.L., K.D. Miller, and A. Jemal, Cancer Statistics, 2017.
CA Cancer J Clin, 2017. 67(1): p. 7-30.
75. Hodson, R., Precision medicine. Nature, 2016. 537(7619): p.
S49.
76. San-Marina, S., et al., Lyl1 interacts with CREB1 and alters
expression of CREB1 target genes. Biochim Biophys Acta, 2008.
1783(3): p. 503-17.
77. Kanehisa, M., et al., KEGG: new perspectives on genomes,
pathways, diseases and drugs. Nucleic Acids Res, 2017. 45(D1): p.
D353-D361.
78. Subramanian, A., et al., Gene set enrichment analysis: a
knowledge-based approach for interpreting genome-wide expression
profiles. Proc Natl Acad Sci U S A, 2005. 102(43): p. 15545-50.
79. Xia, J., E.E. Gill, and R.E. Hancock, NetworkAnalyst for
statistical, visual and network-based meta-analysis of gene expression
data. Nat Protoc, 2015. 10(6): p. 823-44.
80. Szklarczyk, D., et al., STRING v10: protein-protein interaction
networks, integrated over the tree of life. Nucleic Acids Res, 2015. 43:
p. D447-52.
81. Varet, H., et al., SARTools: A DESeq2- and EdgeR-Based R
Pipeline for Comprehensive Differential Analysis of RNA-Seq Data.
PLoS One, 2016. 11(6): p. e0157022.
82. Dellinger, T.H., et al., L1CAM is an independent predictor of
poor survival in endometrial cancer - An analysis of The Cancer
Genome Atlas (TCGA). Gynecol Oncol, 2016. 141(2): p. 336-340.
83. Eoh, K.J., et al., Upregulation of homeobox gene is correlated
with poor survival outcomes in cervical cancer. Oncotarget, 2017. 8(48):
p. 84396-84402.
84. Meng, Y.S., et al., Oncogenic potential of the transcription factor
LYL1 in acute myeloblastic leukemia. Leukemia, 2005. 19(11): p. 1941-
7.
85. Pirot, N., et al., LYL1 activity is required for the maturation of
newly formed blood vessels in adulthood. Blood, 2010. 115(25): p. 5270-
9.
86. Orsulic, S., et al., Induction of ovarian cancer by defined
multiple genetic changes in a mouse model system. Cancer Cell, 2002.
1(1): p. 53-62.
87. Bain, G., et al., E2A deficiency leads to abnormalities in
alphabeta T-cell development and to rapid development of T-cell
lymphomas. Mol Cell Biol, 1997. 17(8): p. 4782-91.
88. Deleuze, V., et al., Angiopoietin-2 is a direct transcriptional
target of TAL1, LYL1 and LMO2 in endothelial cells. PLoS One, 2012.
7(7): p. e40484.
89. Li, L., et al., SIRT1 activation by a c-MYC oncogenic network
promotes the maintenance and drug resistance of human FLT3-ITD
acute myeloid leukemia stem cells. Cell Stem Cell, 2014. 15(4): p. 431-
446.
90. Tadesse, S., et al., Targeting CDK6 in cancer: State of the art
and new insights. Cell Cycle, 2015. 14(20): p. 3220-30.
91. Martinez-Ledesma, E., R.G. Verhaak, and V. Trevino,
Identification of a multi-cancer gene expression biomarker for cancer
clinical outcomes using a network-based algorithm. Sci Rep, 2015. 5: p.
11966.
92. Elsahwi, K.S. and A.D. Santin, erbB2 Overexpression in Uterine
Serous Cancer: A Molecular Target for Trastuzumab Therapy. Obstet
Gynecol Int, 2011. 2011: p. 128295.
93. Subramaniam, K.S., et al., Cancer-associated fibroblasts
promote endometrial cancer growth via activation of interleukin-6/STAT-
3/c-Myc pathway. Am J Cancer Res, 2016. 6(2): p. 200-13.
94. Kim, Y., et al., Integrative and comparative genomic analysis of
lung squamous cell carcinomas in East Asian patients. J Clin Oncol,
2014. 32(2): p. 121-8.
95. Nik-Zainal, S., et al., Mutational processes molding the
genomes of 21 breast cancers. Cell, 2012. 149(5): p. 979-93.
96. Helleday, T., S. Eshtad, and S. Nik-Zainal, Mechanisms
underlying mutational signatures in human cancers. Nat Rev Genet,
2014. 15(9): p. 585-98.
97. Alexandrov, L.B., et al., Mutational signatures associated with
tobacco smoking in human cancer. Science, 2016. 354(6312): p. 618-
622.
98. Tricker, A.R., Re: environmental tobacco smoke, genetic
susceptibility, and risk of lung cancer in never-smoking women. J Natl
Cancer Inst, 2000. 92(9): p. 760-1.
99. Cook, J.L., Tobacco smoke: chemical carcinogenesis and
genetic lesions. Ochsner J, 1999. 1(3): p. 130-5.
100. Li, Z., et al., APOBEC signature mutation generates an
oncogenic enhancer that drives LMO1 expression in T-ALL. Leukemia,
2017. 31(10): p. 2057-2064.
101. Wang, S., et al., APOBEC3B and APOBEC mutational
signature as potential predictive markers for immunotherapy response
in non-small cell lung cancer. Oncogene, 2018.
102. Yang, B., et al., APOBEC: From mutator to editor. J Genet
Genomics, 2017. 44(9): p. 423-437.
103. Raphael, B.J., et al., Identifying driver mutations in sequenced
cancer genomes: computational approaches to enable precision
medicine. Genome Med, 2014. 6(1): p. 5.
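Abstract in Korean
Genomic analysis of ovarian clear cell
carcinomas and uterine corpus endometrial
carcinomas using next generation sequencing
Major in Biomedical Science, Department of Biomedical Science,
Seoul National University Graduate School
Ji Won Lee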
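Gynecologic cancer is cancer that arises in the female reproductive
organs, including in and around the uterus, the ovaries, the fallopian
tubes, the vagina, and the vulva. Among gynecologic cancers, ovarian
cancer and uterine cancer (uterine corpus endometrial carcinoma) were
among the ten cancers with the highest incidence and mortality in 2017.
In the era of precision medicine, there is a growing demand for
diagnostic methods and therapeutics tailored to each patient and to
each cancer type. Over the past 10 years, The Cancer Genome Atlas
research group in particular has collected tumors from several
gynecologic cancers and has performed and reported genetic analyses of
them. However, ovarian clear cell carcinoma and uterine cancer still
require further analysis.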
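In particular, although the mortality of ovarian clear cell carcinoma
is rather high, its incidence is low, so tumor tissue that has not been
modified for storage remains difficult to obtain. In this study, tumor
tissues were collected from 15 patients diagnosed with ovarian clear
cell carcinoma at Seoul National University Hospital between 2012 and
2016 and were stored in the Seoul National University Hospital Human
Biobank. Whole exome sequencing (WES) was performed on the 15 tumor
tissues, and the resulting sequencing data were used to construct a
genomic map of ovarian clear cell carcinoma. Next generation sequencing
of ovarian clear cell carcinoma identified an average of 178
(111-25,798) somatic mutations and 343 (43-1,820) DNA copy number
variants (CNVs). Recurrent mutations were found in 14 genes, including
PIK3CA (40%), ARID1A (40%), and KRAS (20%), across the 15 patients.
Among the copy number variants, genes such as NTRK1 (33%), MYC (40%),
and GNAS (47%) were recurrently amplified, and genes such as TET2
(73%), TSC1 (67%), BRCA2 (60%), and SMAD4 (47%) were recurrently
deleted. In addition, 87% of the patients with ovarian clear cell
carcinoma carried genetic alterations in proliferation and survival
signaling pathways (PI3K/AKT, TP53, and ERBB2 pathways), and 47%
carried alterations in chromatin remodeling pathways. Differences in
genetic alterations according to the presence or absence of
endometriosis, which is known to be closely associated with ovarian
clear cell carcinoma, were also examined, but no statistically
significant difference was found.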
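Uterine cancer has been studied by TCGA, so basic genomic data are
available, and its incidence is high. However, genetic studies linked
to clinical data that can be used in practice to develop diagnostic
methods and therapeutics are still lacking. In the second study,
sequencing data from 370 uterine cancer patients were obtained from
TCGA and analyzed together with clinical information to identify
differences according to LYL1 gene status (amplification or
non-amplification). LYL1 gene amplification was identified in 22 of
the 370 patients. These 22 patients were older at diagnosis and had
non-endometrioid histology, high-grade tumors, and advanced-stage
disease. In multivariate analysis, LYL1 gene amplification could not be
established as an independent prognostic factor, but it was found to
affect survival in the progression-free survival and overall survival
analyses. Genes whose expression increased with LYL1 gene amplification
were identified through differentially expressed gene (DEG) analysis
and gene set enrichment (GSE) analysis, and survival analysis showed
that they affected survival. The genes upregulated in patients with
LYL1 gene amplification were statistically associated with the MAPK
(P<0.001), WNT (P=0.002), and cell cycle pathways (P=0.004). In
summary, this thesis improves the understanding of gynecologic cancer.
Its significance lies in constructing an accurate genomic map of
ovarian clear cell carcinoma through next generation sequencing, in
identifying genomic copy number variants for the first time, and in
proposing a marker that enables effective diagnosis of uterine cancer,
for which accurate diagnostic methods are still scarce.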
* The first study in this thesis has been published in Gynecologic
Oncology [1], and the second study has been published in BMC
Cancer [2].
----------------------------------------------------------------------------------------------
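Keywords: uterine cancer; ovarian clear cell carcinoma; next generation
sequencing; somatic mutation; DNA copy number variant; gene
amplification; independent prognostic factor
Student number: 2015-30608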