Improvement of biopharmaceutical-grade plasmid production by
targeted genome editing of Escherichia coli
Marisa Mariano Faustino
Thesis to obtain the Master of Science Degree in
Biotechnology
Supervisor: Prof. Gabriel António Amaro Monteiro
Examination Committee
Chairperson: Prof. Leonilde de Fátima Morais Moreira
Supervisor: Prof. Gabriel António Amaro Monteiro
Members of the Committee: Prof. Duarte Miguel de França Teixeira dos Prazeres
October 2017
Acknowledgments
First, I would like to thank my supervisor, Professor Gabriel Monteiro, for allowing me to take part in this project, and for the guidance, motivation and advice that helped me improve throughout the work.
I would also like to acknowledge Claudia Alves for her remarkable work in teaching me the laboratory procedures I needed to accomplish my work, for being available whenever I had doubts, and for her patience. I also thank Sofia Duarte, who helped me with some parts of the work, and the laboratory staff, Ricardo Pereira and Rosa Gonçalves, for organizing the lab material so well.
I am also grateful to all my friends and colleagues from the iBB laboratory group, from the master's degree in Biotechnology, from my village, and from my residence.
I would like to express my gratitude to my parents, for all the efforts they have made for me and for supporting me in achieving my goals. And last but not least, to my dear friends Sara Mariano and Pedro Alves, for all the meaningful conversations, motivation and emotional support through all these years.
Resumo
The growing interest in the use of plasmids in therapeutic fields has stimulated studies aimed at optimizing their production, one example being the modification of the genome of the host bacterium in which they are amplified. However, the effect of each modification on plasmid production depends on the genotype of that host. The aim of this work is therefore to create E. coli strains capable of producing plasmids at high yield through deletion of the pgi gene from the genome, which has already been shown to be beneficial in the GALG20 strain (MG1655∆endA∆recA∆pgi) constructed in Gonçalves et al. 2013. This deletion aims to direct the carbon flux to the pentose phosphate pathway, in order to increase the amount of nucleotides produced and consequently of plasmids. As GALG20 later exhibited genomic instability, the aim was to perform deletions in the wild-type strain MG1655 and in the highly mutagenized strain DH5α using the no-SCAR methodology, a system that leaves no “scars” and may therefore avoid genome instability, and then to compare the plasmid production of these two pgi-deleted strains with the two different genotypes. Deletion of this gene was also attempted in the strains BW27783 and BW2P, which are used in the production of minicircles, molecules derived from plasmids. Despite several attempts, deletions with the no-SCAR method were not achieved, and for that reason the Datsenko and Wanner methodology was employed, which, although it leaves a “scar” in the genome, has proven to be efficient. The BW2P∆pgi strain was successfully constructed using this latter method, producing 1.2 times more parental plasmid (161.3 ± 71.6 µg·L⁻¹·OD600⁻¹) than its parental strain, BW2P (133.8 ± 34.2 µg·L⁻¹·OD600⁻¹), before induction with L-arabinose. After induction, the mutated strain produced 1.9 times more plasmid (289.8 ± 44.0 µg·L⁻¹·OD600⁻¹) than the BW2P strain (156.5 ± 45.8 µg·L⁻¹·OD600⁻¹). In conclusion, the pgi deletion proved to be beneficial in increasing the specific yield of plasmid produced.
Keywords: gene therapy, metabolic engineering, pgi gene, plasmids, minicircles, pentose phosphate pathway.
Abstract
The growing interest in plasmids in the therapeutic field has stimulated new studies to optimize their production, such as the modification of the genome of the host bacteria in which they are amplified. However, the effect of each modification on plasmid production can depend on the host genotype. Here the aim was to create E. coli strains that produce high plasmid yields by deleting the pgi gene from their genome, since this deletion was shown to be beneficial in GALG20 (MG1655∆endA∆recA∆pgi) in Gonçalves et al. 2013. This deletion aims to direct the carbon flux through the pentose phosphate pathway, to increase the production of nucleotides and hence of plasmids. As GALG20 later presented a genome instability problem, the aim was to make deletions in the wild-type MG1655 and the highly mutagenized DH5α strains using the no-SCAR methodology, a scar-free system that may prevent host genome instability, and then to compare the plasmid production of the two pgi-deleted strains in two different background genotypes. The deletion of the pgi gene was also pursued in the BW27783 and BW2P strains, which are used to produce minicircles, plasmid-derived molecules. Despite several attempts, deletions with the no-SCAR method were not achieved, so the Datsenko and Wanner methodology was employed instead; although it leaves a “scar” in the genome, it has been shown to be efficient. BW2P∆pgi was successfully constructed with this latter method, producing 1.2-fold more parental plasmid (161.3 ± 71.6 µg·L⁻¹·OD600⁻¹) than its parental strain, BW2P (133.8 ± 34.2 µg·L⁻¹·OD600⁻¹), before induction with L-arabinose. After induction, the modified strain produced 1.9-fold more plasmid (289.8 ± 44.0 µg·L⁻¹·OD600⁻¹) than the BW2P strain (156.5 ± 45.8 µg·L⁻¹·OD600⁻¹). In conclusion, the pgi deletion was shown to be beneficial for increasing the specific plasmid yield.
Keywords: gene therapy, metabolic engineering, pgi gene, plasmids, minicircles, pentose
phosphate pathway (PPP)
Table of Contents
Acknowledgments
Resumo
Abstract
List of Figures
List of Tables
List of Abbreviations
1. Introduction
1.1. Gene therapy
1.1.1. Carrier vectors (viral and non-viral)
1.1.2. Clinical trials
1.2. Minicircles
1.2.1. Synthesis of minicircles
1.3. Metabolism
1.3.1. Glycolysis
1.3.2. Pentose phosphate pathway (PPP)
1.3.3. Biosynthetic pathways
1.3.4. The by-product acetate
1.3.5. Stress Responses
1.3.6. Plasmid Stability
1.3.6.1. Structural stability
1.3.6.2. Isoform stability
1.3.6.3. Segregational stability
1.4. Growth conditions
1.4.1. Medium composition
1.4.2. Temperature
1.5. Genetic engineering of bacteria
1.5.1. Recombinants selection methodology
1.5.1.1. Antibiotic resistance cassette
1.5.1.2. Engineered Cas9-gRNA complex
2. Thesis Objective
3. Materials and Methods
3.1. Media, chemicals and other reagents
3.2. Cell storage
3.3. Transformations
3.3.1. Preparation of competent cells
3.3.1.1. Chemical competent cells
3.3.1.2. Electrocompetent cells
3.3.2. Heat shock transformation
3.3.3. Electroporation
3.4. Genomic deletion with the no-SCAR system
3.4.1. Strains and plasmids
3.4.2. Oligonucleotide design
3.4.3. Procedure
3.5. Genomic deletion with the Datsenko and Wanner methodology
3.5.1. Strains and plasmids
3.5.2. Amplification of the pgi-kanamycin resistance cassette
3.5.3. Chromosomal integration of the pgi-kanamycin cassette
3.5.4. Colony PCR
3.5.5. Elimination of the antibiotic resistance cassette
3.5.6. Procedure
3.6. Genomic PCR analysis
3.7. Plasmid purification
3.8. Genomic DNA purification
3.9. PCR products purification
3.10. DNA quantification and quality assessment
3.11. Restriction enzyme digestion
3.12. Agarose gel electrophoresis
3.13. Minicircle production
3.14. Recombination analysis by densitometry
3.15. DNA sequencing
4. Results and discussion
4.1. Genomic deletion with the no-SCAR system
4.1.1. Purification of the plasmids
4.1.2. The pKDsgRNA-pgi construction
4.1.3. Recombination between the oligonucleotide and the upstream and downstream areas of the pgi gene
4.2. Deletion with the Datsenko and Wanner methodology
4.2.1. Purification of the plasmids
4.2.2. pgi kanamycin cassette production
4.2.3. Deletion of the pgi gene
4.2.4. Minicircle production
5. Conclusions and future work
6. References
Annexes
Annex A- sgRNA-pgi sequence
Annex B- Region between gam and araC of pKDsgRNA-xxx
Annex C- Alignment of the sequencing result obtained with a forward primer from a region that included the sgRNA (above) with its theoretical sequence (below) of the pKDsgRNA-pgi plasmid.
Annex D- Alignment of the sequencing result obtained with a forward primer from a region that included the overlapping region between gam and araC (320 nt) (above) with its theoretical sequence (below) of the pKDsgRNA-pgi plasmid.
Annex E- Alignment of the sequencing result obtained with a reverse primer from a region that included the overlapping region between gam and araC (320 nt) (above) with its theoretical sequence (below) of the pKDsgRNA-pgi plasmid.
Annex F- Alignment between the sequenced PCR product resulting from the amplification of the “scar” (above), obtained with a reverse primer, of the BW2P∆pgi strain and its theoretical sequence inserted in the genome (below).
List of Figures
Figure 1- Vectors used in gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).
Figure 2- Indications addressed by gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).
Figure 3- Phases of gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).
Figure 4- Schematic illustration of the recombination of a parental plasmid into a minicircle and miniplasmid. ORI- bacterial origin of replication; GOI- gene of interest; MRS- multimer resolution site (from doi:10.1128/microbiolspec.PLAS-0022-2014.f3).
Figure 5- Schematic representation of glycolysis and the pentose phosphate pathway (adapted from: http://schoolbag.info/chemistry/mcat_biochemistry/59.html).
Figure 6- Schematic representation of the homologous recombination carried out by the λ-Red system. The Exo enzyme has 5’ to 3’ exonuclease activity and, after binding to the linear DNA, creates single-stranded 3’ overhangs. The Beta enzyme binds to these overhangs and stimulates annealing to the target DNA by base-pair homology, creating the recombinant DNA (from Sharan et al. 2009).
Figure 7- Schematic representation of the main steps in the Datsenko and Wanner genetic engineering procedure. In a first step an antibiotic resistance cassette is amplified by PCR with homology arms (H1 and H2) complementary to the upstream and downstream sequences of the target gene. The cassette is then introduced into the genome through homologous recombination with the target gene, performed by the λ-Red system. In a last step the FLP recombinase is expressed to remove the cassette from the genome, leaving a “scar” with an FRT site (from Datsenko & Wanner 2000).
Figure 8- Schematic illustration of the engineered Cas9-gRNA complex cleaving a target DNA sequence. crRNA- CRISPR RNA, tracrRNA- trans-activating CRISPR RNA (from Sander & Joung 2014).
Figure 9- General scheme of the counterselection of recombinants using the Cas9-sgRNA complex (from Peters et al. 2015).
Figure 10- Illustration of the expression of multiple gRNAs from a “single construct” (from http://blog.addgene.org/crispr-101-multiplex-expression-of-grnas).
Figure 11- Schematic representation of the two plasmids used in the no-SCAR protocol. (A)- plasmid Cas9cr4 (6,770 bp), and (B)- plasmid pKDsgRNA-xxx (“xxx” represents the name of the target gene) (6,959 bp) (from Reisch & Prather 2015).
Figure 12- Schematic representation of the construction of pKDsgRNA-pgi using the pKDsgRNA-ack plasmid as template in 2 independent PCR reactions to produce fragments 1 and 2, which are then joined in a CPEC reaction to build pKDsgRNA-pgi. P1 and P2 are used to construct fragment 1, and P3 and P4 fragment 2. P1 represents the forward primer pgi-pfrag1fwd, P2 the reverse primer pgi-pfrag1rev, P3 the reverse primer pgi-pfrag2rev and P4 the forward primer pgi-pfrag2fwd. The orange region represents the sgRNA-ack, the red sequences the sgRNA-pgi, and the brown circles the nicks (adapted from Cipriano 2017).
Figure 13- Schematic representation of the designed oligonucleotide (73 bp) with sequences homologous to the regions upstream (blue) and downstream (green) of the pgi gene. The red cross represents the PAM site (adapted from Reisch & Prather 2015).
Figure 14- Schematic procedure of the no-SCAR (Scarless Cas9 Assisted Recombineering) system for genome editing described by Reisch & Prather 2015 (from Reisch & Prather 2015).
Figure 15- Schematic representation of the 3 plasmids used in the Datsenko and Wanner methodology: pKD46 (A), pCP20 (B) and pKD13 (C). ((A) from http://www.biofeng.com/zaiti/dachang/pKD46.html, (B) from http://www.youbio.cn/product/vt1693 and (C) from Cipriano 2017).
Figure 16- Schematic representation of the parental plasmid (PP) pMINILi-VEGF-GFP, and of the miniplasmid (MP) and minicircle (MC) originating from the recombination between the multimer resolution sites (MRS) by the ParA resolvase, induced by the addition of L-arabinose to the growth medium (from Alves et al. 2016).
Figure 17- Schematic illustration of the amplification of the pgi-kanamycin cassette by PCR using the forward primer kancassette_pgi_F (containing homology site 1 (H1) and priming site 1 (P1)) and the reverse primer kancassette_pgi_R (containing homology site 2 (H2) and priming site 2 (P2)); the homologous recombination performed by the λ Red system enzymes between the pgi-kan cassette and the target pgi gene then leads to the insertion of the cassette into the genome at the pgi locus. Also shown is the annealing of the primers (pgi_conf_F and pgi_conf_R) used to confirm the insertion of the cassette (adapted from Datsenko & Wanner 2000).
Figure 18- Agarose gel electrophoresis of (A)- purified pKDsgRNA-ack isolated from DH5α cells (lane 1), and after HindIII enzymatic digestion (lane 3); (B)- purified pCas9cr4 from donated DH5α cells (lane 1), and after BamHI digestion (lane 3); (C)- purified pCas9cr4 isolated from a transformed DH5α colony (lane 1), and after BamHI digestion (lane 3); (D)- purified pCas9cr4 isolated from a transformed MG1655 colony (lane 1), and after BamHI digestion (lane 3). NZYDNA Ladder III in lanes 2.
Figure 19- Agarose gel electrophoresis of the two fragments amplified by PCR from pKDsgRNA-ack, to be used for the construction of pKDsgRNA-pgi. Fragment 1 (lane 1), fragment 2 (lane 3). NZYDNA Ladder III (lane 2).
Figure 20- Agarose gel electrophoresis of the purified pKDsgRNA-pgi from five colonies (1, 2, 3, 4, 5) of DH5α, lanes 2, 3, 4, 5 and 6, respectively.
Figure 21- Alignment of the partial sequencing result obtained (with a forward primer) from a region that included the sgRNA (within the box) (above) with its theoretical sequence (below) of the pKDsgRNA-pgi plasmid.
Figure 22- Alignment of the partial sequencing result obtained (with a forward primer) from a region that included the overlapping region between gam and araC (320 nt) (within the box) (above) with its theoretical sequence (below) of the pKDsgRNA-pgi plasmid.
Figure 23- Agarose gel electrophoresis. Purified pCas9cr4 and pKDsgRNA-pgi extracted from DH5α (lane 2) and MG1655 (lane 3) after enzymatic digestion with BamHI. NZYDNA Ladder III molecular weight marker (lane 1).
Figure 24- Schematic representation of the deletion of the pgi gene from the genome by the no-SCAR system described by Reisch & Prather 2015. In a first step the supplied 73 bp oligonucleotide undergoes homologous recombination with the upstream (blue) and downstream (green) regions of the pgi gene, with the help of the λ Red system, giving rise to the chimeric genomic DNA. After replication of the chimeric DNA, one population is recombinant (with pgi deleted) and the other is wild-type. After the cells are spread on plates containing aTc (which induces the expression of the Cas9-sgRNA-pgi), the recombinants survive while the wild-type cells die. The PAM site is represented by a red cross (adapted from Reisch & Prather 2015 and Peters et al. 2015).
Figure 25- Agarose gel electrophoresis resulting from the amplification of an upstream and downstream region of the pgi gene by PCR with the pgi_conf_F and pgi_conf_R primers, on 12 colonies (1-12) of the MG1655/pCas9cr4/pKDsgRNA-pgi strain (A) and of the DH5α/pCas9cr4/pKDsgRNA-pgi strain (B), lanes 1-12, respectively, and on the strains MG1655 and DH5α (C), lanes 1 and 2, respectively. NZYDNA Ladder III in lane M.
Figure 26- Agarose gel electrophoresis of the fragment resulting from the amplification of an upstream and downstream region of the recA locus, with the primers recA_check_F and recA_check_R, on colonies 3, 4 and 12 of the MG1655/pCas9cr4/pKDsgRNA-pgi strain, lanes 2, 3 and 4, respectively, and on colonies 10, 11 and 2 of the DH5α/pCas9cr4/pKDsgRNA-pgi strain, lanes 5, 6 and 7, respectively. NZYDNA Ladder III in lane 1.
Figure 27- Agarose gel electrophoresis resulting from the PCR amplification of an upstream and downstream region of the pgi gene, or of an internal region of the pgi gene: on 3 colonies (3, 4 and 12) of MG1655/pCas9cr4/pKDsgRNA-pgi, lanes 2, 4 and 6 with the pgi_fwr1 and pgi_rev1 primers and lanes 3, 5 and 7 with the pgi_fwr2 and pgi_rev2 primers, respectively; and on 3 colonies (10, 11, 2) of DH5α/pCas9cr4/pKDsgRNA-pgi, lanes 8, 10 and 12 with the pgi_fwr1 and pgi_rev1 primers and lanes 9, 11 and 13 with the pgi_fwr2 and pgi_rev2 primers, respectively. Set of primers A- pgi_fwr1 and pgi_rev1, B- pgi_fwr2 and pgi_rev2.
Figure 28- Analysis of the pKD13, pKD46 and pCP20 plasmids by agarose gel electrophoresis. (A)- Purified pKD13 extracted from DH5α (lane 2), and after enzymatic digestion (lane 3). Purified pKD46 extracted from DH5α, BW27783 and BW2P (lanes 4, 6 and 8), and after enzymatic digestion (lanes 5, 7 and 9), respectively. (B)- Purified pCP20 extracted from DH5α after enzymatic digestion (lane 2). All digestions were performed with BamHI. NZYDNA Ladder III molecular weight marker (lanes 1).
Figure 29- Agarose gel electrophoresis of the amplified pgi kan-cassette after purification (lane 2). NZYDNA Ladder III (lane 1).
Figure 30- Agarose gel electrophoresis of the PCR product resulting from the amplification of the pgi kan-cassette introduced into the genome, on 6 colonies of the BW27783 (lanes 2-7) and BW2P (lanes 8-13) strains. NZYDNA Ladder I (lane 1).
Figure 31- Agarose gel electrophoresis of the PCR product for the analysis of the presence of the “scar” at the pgi locus in the genome of 4 colonies of BW2P (lanes 2-5). NZYDNA Ladder I (lane 1).
Figure 32- Alignment between the sequenced PCR product resulting from the amplification of the “scar” (above) (with a forward primer) of the BW2P strain and its theoretical sequence inserted in the genome (below).
Figure 33- Agarose gel electrophoresis of ~650 ng of the purified plasmids from cell cultures of
BW2P∆pgi and BW2P, respectively: before the induction with L-(+) arabinose, non-digested
(lanes 2 and 4) and digested with ApaLI enzyme (lanes 3 and 5); and 2h after the induction,
non-digested (lanes 6 and 8) and digested with ApaLI enzyme (lanes 7 and 9).
Marker NZYDNA Ladder III (lane 1). The DNA isoforms observed are indicated on the right of
the figure. Abbreviations: oc- open circular, lin- linear, sc- supercoiled, PP- parental plasmid,
MC- minicircle, MP- miniplasmid..................................................................................................... 57
Figure 34- (A) Growth curves exhibited by BW2P∆pgi (blue line) and BW2P (orange line),
Log (optical density (OD600nm)) over time (hours). L-arabinose was added to the culture at 3h.
The arrows indicate the times at which the samples of 2 mL were harvested from the culture (at
3 and 5 hours) (from 3 replicates). (B)-Specific plasmid DNA yield (µg.L-1.OD-1) exhibited by
BW2P∆pgi (blue) and BW2P (orange) before the induction with L-arabinose and after 2h of
induction (from 3 replicates). The specific plasmid yield (µg.L-1.OD-1) was determined through
the equation $\text{specific yield} = \dfrac{[\text{pDNA}] \times V_{\text{purified plasmid}} \times 10^{-3}}{\text{OD}_{600\,\text{nm}} \times V_{\text{culture sample}}}$,
with [pDNA] in ng.µL-1, Vpurified plasmid in µL and Vculture sample in L. The
Vpurified plasmid is 50 µL and Vculture sample is 2 mL. The SEM was calculated and represented with
error bars. ................................................................................................................................. 58
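The specific-yield equation quoted in the caption of Figure 34 can be sketched in Python as follows. This is a minimal illustration and not code from the thesis: the function name and the input values (100 ng.µL-1 and an OD600nm of 2.0) are hypothetical; only the formula and the default volumes (50 µL of purified plasmid, 2 mL of culture sample) come from the caption.

```python
def specific_plasmid_yield(pdna_ng_per_ul, od600, v_purified_ul=50.0, v_culture_l=2e-3):
    """Return the specific plasmid yield in ug.L-1.OD-1.

    Implements [pDNA] * V_purified * 1e-3 / (OD600 * V_culture), where the
    1e-3 factor converts the plasmid mass from ng to ug.
    """
    if od600 <= 0 or v_culture_l <= 0:
        raise ValueError("OD600 and culture volume must be positive")
    plasmid_mass_ug = pdna_ng_per_ul * v_purified_ul * 1e-3  # ng -> ug
    return plasmid_mass_ug / (od600 * v_culture_l)


# Hypothetical measurement, for illustration only (not thesis data).
example_yield = specific_plasmid_yield(pdna_ng_per_ul=100.0, od600=2.0)
print(f"{example_yield:.0f} ug.L-1.OD-1")
```

Keeping the two volumes as keyword arguments makes the unit assumptions explicit (µL for the eluate, L for the culture sample), which is where a conversion error would most easily slip in.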
List of Tables
Table 1- Characteristics of the PCR primers used for the amplification of the 2 fragments that
constitute the pKDsgRNA-pgi plasmid from the pKDsgRNA-ack, including its sequence (5’--
>3’), its melting temperature (Tm) in ºC, its GC (%) content, and the number of
oligonucleotides. ...................................................................................................................... 27
Table 2- PCR reaction setup (left table) and cycling conditions (right table) for the amplification
of 2 fragments of the pKDsgRNA-ack, containing the sgRNA-pgi. For the fragment 1 the
forward and reverse primers are the pgi-pfrag1fwd and primer pgi-pfrag1rev, respectively, and
for the fragment 2 the primer pgi-pfrag2fwd and primer pgi-pfrag2rev, respectively. V- volume.
.................................................................................................................................................. 27
Table 3- CPEC reaction setup (left table) and cycling conditions (right table) for the
amplification of pKDsgRNA-pgi by the junction of the fragment 1 and 2 constructed by PCR.
The overlapping regions between these 2 fragments will serve as primers for the DNA
polymerase. V-volume. ............................................................................................................ 27
Table 4- Characteristics of the PCR primers used for the amplification of the pgi kanamycin
resistance cassette, including its sequence (5’-->3’) (upper case letters- homology sites, lower
case letters- priming sites), its melting temperature (Tm) in ºC, its GC (%) content, and the
number of oligonucleotides. ..................................................................................................... 33
Table 5- PCR reaction setup (left table) and cycling conditions (right table) for the amplification
of the pgi kanamycin resistance cassette. V-volume ................................................................ 33
Table 6- Characteristics of the PCR primers used for the amplification of a region that contains
the pgi or the recA genes, including its sequence (5’-->3’), its melting temperature (Tm) (ºC),
its GC (%) content, and the number of oligonucleotides. ........................................................ 37
Table 7- PCR reaction setup (left table) and cycling conditions (right table) for the amplification
of a region containing the pgi or the recA gene.(*)- When cells were used as template the 1st
step was for 7 minutes in order to also allow the cells to lyse, to release the gDNA to the
medium, while if genomic DNA (gDNA) was used directly as sample, the 1st step occurred only
for 3 min. (**) annealing temperature: 47ºC when the primers pgi_conf_F and pgi_conf_R were
used; 53ºC when the primers pgi_fwr1 and pgi_rev1, pgi_fwr2 and pgi_rev2, recA_check_F
and recA_check_R were used. V- volume. .............................................................................. 37
Table 8- Summary of the attempts made when electrotransforming
MG1655/pCas9cr4/pKDsgRNA-pgi and DH5α/pCas9cr4/pKDsgRNA-pgi with the
oligonucleotide to delete the pgi gene. The alterations made to the preparation of
the electrocompetent cells, the electroporation, and the recovery steps are indicated, as well as
the concentration of aTc used in the plates, and the number of colonies tested. (*)- 100 ng.mL-1
aTc and 50 mM L-arabinose were added, (**)- The plates also contained LB + 34 µg.mL-1 Cm
+ 200 µg.mL-1 Spec. ......................................................................................................................................... 47
Table 9- Summary of the specific (µg.L-1.OD-1) and volumetric (mg.L-1) plasmid yields obtained
in the samples from BW2P∆pgi and BW2P, when non-induced (3h of growth) and 2h after
being induced (5h of growth) with L-arabinose. Average values ± SEM are exhibited. ......... 58
List of Abbreviations
ADA - Adenosine deaminase
Amp - Ampicillin
ARPs - Antibiotic resistance proteins
aTc- anhydrotetracycline
ATP - Adenosine triphosphate
bp - base pair
Cas9 - CRISPR associated protein 9
Cm - Chloramphenicol
CMV - Cytomegalovirus
CRISPR - Clustered Regularly Interspaced Short Palindromic Repeats
crRNA - CRISPR RNA
DBS - Double strand break
DCW - Dry cell weight
DNA - Deoxyribonucleic acid
dsDNA - double stranded DNA
ED - Entner-Doudoroff
FADH - Flavin adenine dinucleotide
FDA - Food and Drug Administration
FLP - Flippase recombinase
FTR - FLP recognition target
gDNA - genomic DNA
GFP - Green fluorescent protein
GOI - Gene of interest
HPLC - High performance liquid chromatography
Kan - Kanamycin
LB - Luria Bertani
lin - linear
LPL - Lipoprotein lipase
LPS - Lipopolysaccharides
MC - Minicircle
miRNA- microRNA
MP - Miniplasmid
mRNA- messenger RNA
MRS - Multimer resolution site
NADH - Nicotinamide Adenine Dinucleotide
NADPH - Nicotinamide Adenine Dinucleotide Phosphate
NHEJ - non-homologous end joining mechanism
no-SCAR - Scarless Cas9 Assisted Recombineering
oc - Open circular
OD - Optical density
Ori - origin of replication
ORT - Operator-repressor titration
PAM - protospacer adjacent motifs
PCR - Polymerase chain reaction
pDNA - plasmid DNA
PP - Parental plasmid
ppGpp - guanosine tetraphosphate
PPP - Pentose Phosphate Pathway
PTS - Phosphotransferase System
RBS - Ribosome binding site
Re - Recombination efficiency
RNA - Ribonucleic acid
ROS - Reactive oxygen species
rRNA - ribosomal RNA
sc - supercoiled
sgRNA - single guide RNA
siRNA- small interfering RNA
Spec - Spectinomycin
ssDNA - single stranded DNA
TALEN - Transcription activator like effector nucleases
Tc - Tetracycline
TCA - Tricarboxylic acid cycle
tracrRNA - trans-activating CRISPR RNA
tRNA - transfer RNA
VEGF - vascular endothelial growth factor
ZFN - Zinc finger nucleases
1. Introduction
1.1. Gene therapy
Genes are the units of heredity. They encode proteins or RNA molecules with functional
activity, which give organisms their characteristics. However, when a gene carries a mutation
in its sequence or is not correctly regulated, it may not function properly, in some cases
resulting in a genetic disease. One of the approaches to treat such a disease is gene therapy,
where the affected gene is corrected or modulated to recover its normal function. This can be
achieved by inserting the correct gene in a random location in the genome to replace the
defective one, by exchanging the defective gene for the correct one through homologous
recombination, by regulating its expression (turning it on or off), or by repairing the mutation
through reverse mutation (Misra 2013).
This is done with the help of supplied nucleic acids: DNA, messenger RNA (mRNA),
small interfering RNA (siRNA), microRNA (miRNA) or antisense oligonucleotides (Yin et al.
2014). However, after their production and purification, they have to be delivered to the patients
and reach the target cells, where they will exert their function. For that, they have to pass
through several extracellular and intracellular barriers (e.g. nucleases, blood clearance,
diffusion through the plasma membrane and nuclear envelope, lysosomal vesicles) (Gaspar et
al. 2015). To facilitate their delivery they are incorporated in a carrier/delivery vector, whose
design is of extreme importance to ensure the passage through all these barriers (Misra 2013).
1.1.1. Carrier vectors (viral and non-viral)
Carrier vectors are classified as viral (modified) or non-viral. Examples of viral vectors
include retroviruses, lentiviruses, adenoviruses and adeno-associated viruses, while
non-viral vectors comprise polymer-, peptide- and lipid-based vectors and naked plasmids
(Yin et al. 2014). Figure 1 shows the distribution of the vectors used in clinical trials.
Figure 1- Vectors used in gene therapy clinical trials (from
wiley.com//legacy/wileychi/genmed/clinical/).
Viral vectors exhibit a higher delivery efficiency than non-viral vectors; however, they
are less safe and more difficult to produce, which makes non-viral vectors a better
alternative (Yin et al. 2014), depending on the objective.
The most used non-viral vector, plasmid DNA (pDNA), is normally produced at large
scale using Escherichia coli as the host cell. It then needs to be extracted from the host and
purified in order to eliminate contaminants such as RNA, genomic DNA, proteins and other
cellular debris (Prazeres & Monteiro 2014), so that it fulfils the requirements for
administration to patients without causing harm.
1.1.2. Clinical trials
In 1990, a girl with adenosine deaminase (ADA) deficiency, a condition that
compromises the immune system, was the first person to receive a gene therapy product. Her
white blood cells were collected and the functional gene encoding ADA was inserted into them.
The girl was then infused with her own cells (https://history.nih.gov/exhibits/genetics/sect4.htm)
(https://www.news-medical.net/health/Gene-Therapy-History.aspx).
So far, most gene therapy clinical trials have addressed cancer (about 64.6%), followed
by monogenic diseases (10.5%) and infectious diseases (7.4%) (Figure 2). Some examples
include Alzheimer’s disease, cystic fibrosis, haemophilia, HIV, Huntington’s disease,
blindness diseases and others.
Figure 2- Indications addressed by gene therapy clinical trials (from
wiley.com//legacy/wileychi/genmed/clinical/).
Since the safety and efficacy of gene therapy are still being evaluated, it is currently used
to treat diseases for which no other treatments are available
(https://ghr.nlm.nih.gov/primer/therapy/genetherapy). Most gene therapy clinical trials are still
in the initial stages: about 95% are in phase I, I/II or II (Figure 3), and only 5% are in phase
II/III, III or IV (Hanna et al. 2017). Some of the gene therapies approved so far include
Gendicine to treat head and neck squamous cell carcinoma (Xiao-jie et al. 2015), Glybera to
restore lipoprotein lipase (LPL) activity
(http://www.ema.europa.eu/ema/index.jsp?curl=pages/medicines/human/medicines/002145/human_med_001480.jsp&mid=WC0b01ac058001d124),
Imlygic to treat melanoma (Fukuhara et al. 2016), and the cell gene therapy Strimvelis
to treat adenosine deaminase deficiency (Hanna et al. 2017).
Figure 3- Phases of gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).
1.2. Minicircles
Usually, plasmids for therapeutic applications are first produced in high amounts in host
bacteria and, after being purified, they are delivered to the patients, where they exert their
therapeutic effect. For that reason, these plasmid vectors contain the bacterial sequences
required for their production in the host, such as a bacterial origin of replication and, typically,
an antibiotic resistance gene (to guarantee that only the cells with the plasmid survive). They
also contain eukaryotic sequences, constituted by the gene of interest (GOI), which exerts the
therapeutic function, and its regulatory sequences (e.g. promoters, polyadenylation sequences),
which control the expression of the GOI in the eukaryotic target cells (Prazeres & Monteiro 2014).
The prokaryotic sequences are associated with a decrease in the stability, uptake and
efficacy of the plasmid vector. Some of them, like the CpG motifs, can also trigger immune
responses in the patients (Chen et al. 2008). Since those sequences are only needed during
bacterial growth and not for the therapeutic action, some researchers have focused on the
production of plasmids from which these sequences can be removed afterwards, when they are
no longer necessary. The removal of these prokaryotic sequences also helps to prevent the
integration of the antibiotic resistance genes in the genome of the eukaryotic cells, or in the
human microbiota (Gaspar et al. 2015), which would cause resistance to some antibiotics.
The plasmids from which these sequences can be removed are called parental plasmids
(PP). They contain both prokaryotic and eukaryotic sequences, as well as two recombinase
target sites (multimer resolution sites, MRS), on which specific recombinases act to originate
a miniplasmid (MP), which contains the bacterial sequences, and a minicircle (MC), which
contains the eukaryotic sequences (Prazeres & Monteiro 2014) (Figure 4).
Figure 4- Schematic illustration of the recombination of a parental plasmid into a minicircle and
miniplasmid. ORI-bacterial origin of replication; GOI-gene of interest; MRS-multimer resolution site
(from doi:10.1128/micro-biolspec.PLAS-0022-2014.f3).
The minicircle lacks the prokaryotic sequences and contains the GOI, so it is the molecule
used in therapeutic applications. After their production, minicircles need to be extracted from
the bacteria, for which the bacterial cells are lysed. To be purified, they then need to be
separated from the impurities present, such as other nucleic acids and proteins, the MP and the
remaining PP. This is difficult to achieve due to the similarity of their physico-chemical
properties (such as size, charge, isoform and molecular composition) (Gaspar et al. 2015). The
efficiency of recombination should therefore be high, to ensure a maximum MC/PP ratio and
avoid residual PP in the final formulation. To facilitate the purification procedure, some
researchers developed a strain capable of degrading the MP and residual PP through the action
of the I-SceI endonuclease, whose restriction site is located on the prokaryotic backbone
(Kay et al. 2010).
Since these prokaryotic sequences constitute ~50% of the parental plasmid, minicircles
have a reduced size. This makes them easier to transfect and leads to higher expression levels
of the therapeutic gene, due to their increased diffusion across the cytoplasm compared to the
PP (Lukacs et al. 2000) (Yin et al. 2014) (Gaspar et al. 2015).
Besides minicircles, there are other plasmid alternatives that lack the antibiotic
resistance gene (based on e.g. Operator-repressor titration (ORT) technology (pORT), or the
mini-intronic plasmid enclosing RNA/OUT antisense RNA), in which selection is based on
other parameters. However, unlike the MC, they still contain the bacterial origin, which
contributes to an increased size (Gaspar et al. 2015).
1.2.1. Synthesis of minicircles
To produce MC from PP with a higher productivity, the PP is generally first amplified to
high quantities in E. coli, used as host cell due to its advantageous characteristics (mentioned
further on). The cells then need to be induced to express the enzymes responsible for the
recombination process, which act on the specific recombination sites (MRS) present on the PP,
giving rise to the MP and MC (Gaspar et al. 2015) (Figure 4). The recombinases currently used
are divided into 2 main families, based on the amino acid residue that acts as the catalytic
nucleophile: serine and tyrosine (Colloms et al. 1997). The inducers normally used are heat
and L-arabinose. The induction response to the latter seems to depend on the concentration
used, due to its all-or-none response (Khlebnikov et al. 2001), as well as on the stage of
cellular growth at which it is added to the medium. Higher MC yields were observed when the
induction occurred at the end of the exponential phase, since at this point both the
concentration of healthy cells and the concentration of PP available to originate MC are higher
(Gaspar et al. 2014) (https://bitesizebio.com/13514/).
Currently, the use of minicircles seems to be a safer approach in therapeutic applications
than the use of plasmids. However, the quantities of MC obtained at laboratory scale
(milligrams) need to be increased to industrial scale (grams), while maintaining stability
and quality. The “metabolic burden” that the host bacteria can exhibit (a term explained
further on) also represents a drawback. The purification process also needs to be improved to
ensure that the final MC product is devoid of MP, PP and bacterial debris. For these reasons,
research is being carried out to overcome these problems by altering the growth/induction
conditions and the genotype of the host (Gaspar et al. 2015).
1.3. Metabolism
As mentioned previously, E. coli, a gram-negative bacterium, is normally the host
microorganism of choice, since it grows fast in a minimal medium and produces high plasmid
yields rapidly, at a lower cost than other microorganisms (Silva et al. 2012).
Despite this, E. coli produces an endotoxin that triggers immune responses, lipopolysaccharides
(LPS), which are released when the cells are lysed to extract the pDNA. It also shows some
genetic instability due to the presence of mobile DNA elements, which are capable of changing
their position in the genome (like transposases, integrases, specific site recombinases, insertion
sequences (IS) elements, defective phages), compromising the production of the plasmid and/or
the therapeutic product, and hence its safety (Bower et al. 2012) (Pósfai et al. 2016).
The E. coli strain selected is also fundamental, because its genotype may alter the
production of the plasmids. Since the goal is to produce large amounts of pDNA while
maintaining its biological activity, several strains such as DH5, DH5α, BL21, DH10B and others
have been used for this production. However, most of them were constructed for a different
purpose, such as the expression of recombinant proteins and/or the cloning of heterologous
genes, and so they may not provide the best conditions to produce plasmid DNA (Gonçalves et
al. 2013).
After the selection of the host strain, the cells are transformed with the desired plasmid.
To replicate and maintain it, they need to divert a certain amount of resources (like
metabolic energy, amino acids, nucleotides, etc.) to the metabolic processes related to its
production. Cells with pDNA (P+) have an increased demand for nucleotides for the replication
of the plasmid, besides those necessary for the host genomic DNA (gDNA). An additional
demand also exists for amino acids to synthesize the proteins encoded by the plasmid, such as
the ones that ensure the cells’ survival in a selective medium with antibiotics or without an
essential nutrient, or the ones that are themselves the therapeutic product. Besides, a large
amount of energy is expended in those reactions: e.g. the addition of each amino acid to a
peptide chain spends 2 GTP molecules, and the covalent binding of each amino acid to its
corresponding tRNA, to form aminoacyl-tRNA, spends one molecule of ATP (Silva et al. 2012)
(Glick 1995) (Nelson et al. 2008). This demand for resources imposed on the cells was termed
“metabolic burden” by Glick in 1995. It tends to affect normal cell functioning, and so P+ cells
tend to exhibit a longer period of adaptation to the medium (lag phase), followed by a slower
exponential growth compared with cells without plasmids (P-) (Ow et al. 2006). This causes
stress in the cells, since their metabolism and growth are affected, compromising plasmid
production (Silva et al. 2012).
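The energy figures quoted above can be turned into a rough per-protein estimate. The sketch below is only illustrative: it assumes, as a simplification, that every residue costs exactly 2 GTP (chain elongation) and 1 ATP (tRNA charging), it ignores initiation, termination and proofreading, and the protein length and number of copies per cell are hypothetical values, not data from the cited studies.

```python
from dataclasses import dataclass

GTP_PER_RESIDUE = 2  # elongation cost quoted in the text
ATP_PER_RESIDUE = 1  # aminoacyl-tRNA formation cost quoted in the text


@dataclass
class TranslationCost:
    atp: int
    gtp: int

    @property
    def total_ntp(self):
        """Total nucleoside triphosphates spent (ATP + GTP)."""
        return self.atp + self.gtp


def translation_cost(residues, copies=1):
    """Estimate the ATP/GTP spent to translate `copies` proteins of `residues` amino acids."""
    if residues < 0 or copies < 0:
        raise ValueError("residues and copies must be non-negative")
    return TranslationCost(
        atp=ATP_PER_RESIDUE * residues * copies,
        gtp=GTP_PER_RESIDUE * residues * copies,
    )


# Hypothetical plasmid-encoded protein: 300 residues, 1000 copies per cell.
cost = translation_cost(residues=300, copies=1000)
print(cost.atp, cost.gtp, cost.total_ntp)
```

Even under these simplified assumptions the cost scales linearly with both protein length and expression level, which is why strongly expressed plasmid-encoded proteins weigh on the metabolic burden.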
With the purpose of overcoming this metabolic burden, the plasmid instability and the
stress responses in P+ cells, and of increasing the plasmid yields, several studies have been
conducted. Different fermentation strategies have been tested, such as changing the
environmental conditions (the concentration of oxygen, medium components like the carbon and
nitrogen sources, and the temperature), as well as different feeding modes: batch and fed-batch
(Silva et al. 2012) (Yau et al. 2008). New E. coli strains have also been designed, by changing
their genotype in order to alter their metabolism and improve plasmid production and stability
(Bower et al. 2012). In the following sections some of the strategies studied to alter the cellular
metabolism are discussed, as well as their effect on plasmid production.
1.3.1. Glycolysis
Glycolysis is the main pathway through which glucose is metabolized to yield pyruvate.
Some energy is produced in the form of ATP, as well as reducing agents (NADH) and
intermediates for biosynthetic reactions (Ow et al. 2006) (Nelson et al. 2008) (Li et al. 2015).
Some researchers have compared the gene/protein expression profiles of this pathway in P+ and
P- cells, but the results obtained were not concordant. This may be due to the different
fermentation modes employed and to the composition of the media used in each study, which
probably influence the rate of glucose consumption (Silva et al. 2012). For instance, Ow et al.
(2006) described that most of the glycolytic genes were downregulated (except gapA, which
codes for glyceraldehyde-phosphate dehydrogenase), while Wang et al. (2006) demonstrated
the opposite, with the activities of most glycolytic enzymes increased, as also observed by
Birnbaum & Bailey (1991). Based on their own previous study, Ow et al. (2009) chose to
knock out fruR (“fructose repressor”) in DH5α. This gene encodes a transcriptional regulator
that stimulates the expression of enzymes involved in the gluconeogenic and tricarboxylic acid
cycle (TCA) pathways, while repressing genes involved in sugar cleavage reactions, including
the Entner-Doudoroff (ED), pentose phosphate (PP) and glycolysis pathways. The authors
employed fed-batch cultures with exponential feeding and reported an increase in the specific
plasmid yield in the mutant strain, which produced about 21% more than the wild type
(19.2 mg.g DCW-1 of pDNA versus 15.9 mg.g DCW-1).
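As a quick check of the figure above, the relative increase can be recomputed from the two specific yields quoted from Ow et al. (2009). The helper name below is an illustrative assumption; only the two yield values come from the text.

```python
def percent_increase(mutant, wild_type):
    """Relative increase of `mutant` over `wild_type`, in percent."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return (mutant - wild_type) / wild_type * 100.0


# Specific plasmid yields (mg.g DCW-1) quoted in the text for the fruR
# knockout and the wild-type DH5a strain.
fruR_gain = percent_increase(mutant=19.2, wild_type=15.9)
print(f"{fruR_gain:.1f}%")  # prints 20.8%, i.e. about 21%
```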
1.3.2. Pentose phosphate pathway (PPP)
The PPP is one of the pathways through which 6-P-glucose is metabolized. It produces
reductive power (NADPH) and the metabolite 5-P-ribose. NADPH is important for the
reductive biosynthesis of lipids and nucleic acids and to protect the cell from oxidative stress.
5-P-ribose is a precursor in the biosynthesis of nucleotides, which are the “building blocks” of
DNA, RNA and coenzymes like ATP, NADH, FADH2, and coenzyme A (Nelson et al. 2008).
In Ow et al. (2006) the flux through this pathway remained constant when comparing P+ with
P- cells, while Wang et al. (2006) described a downregulation of the zwf, gnd and rpi genes,
whose products catalyse the enzymatic reactions represented in Figure 5.
Figure 5- Schematic representation of the glycolysis and pentose phosphate pathway (adapted from:
http://schoolbag.info/chemistry/mcat_biochemistry/59.html)
Since the metabolic flux through the PPP does not seem to be enough to support the extra
demand for nucleotides imposed by the presence of the plasmid, some strategies rely on
directing the flux into the PPP, by overexpressing zwf, gnd or rpi or even by deleting pgi (Silva
et al. 2012). Flores et al. (2004) observed an increase in the growth rate of JM101 when
transformed with a plasmid carrying the zwf gene, from 0.46 h-1 (uninduced) to 0.64 h-1
(induced with IPTG to express zwf). Wang et al. (2006) overexpressed the rpiA gene and
observed an increase of around 3.3- and 7-fold in pDNA copy number when using ColE1-derived
plasmids with 2 and 1 origins of replication, respectively, in the BL21 strain. Gonçalves et al.
(2013) constructed a strain, GALG20 (MG1655∆endA∆recA∆pgi), that has a growth rate similar
to that of the parental strain and produces a 25-fold higher [pDNA] using 20 g.L-1 of
glucose. With the deletion of pgi, glycolysis continues, since the two end products of the PPP
(3-P-glyceraldehyde and 6-P-fructose) are further used in glycolysis. However, this modification
can create a redox imbalance inside the cell due to the excess of NADPH produced, which may
induce stress (Charusanti et al. 2010). Gonçalves et al. (2013) also tested the effect of the pgi
mutation with glycerol as the carbon source; however, the plasmid DNA yield obtained was
approximately the same for both strains (GALG20 and MG1655∆endA∆recA), since glycerol is
metabolized through a different route from glucose. The pgi mutation was therefore only
beneficial when glucose was used.
1.3.3. Biosynthetic pathways
In the pathways mentioned before, intermediates are produced for the synthesis of
amino acids, needed to produce proteins, and of nucleotides, the building blocks of nucleic
acids. A high demand for amino acids is therefore likely to exist in P+ cells, due to the
expression of antibiotic resistance genes or of recombinant proteins (depending on the
plasmid). However, many genes involved in amino acid synthesis were shown to be
downregulated in Ow et al. (2006), such as leuA, proB, asnB and ansA, aspC, hisC, cysK and
cysM, pheA, tyrB, asd, and metA and metB, which code for enzymes involved in the metabolism
of leucine, proline, asparagine, aspartate, histidine, cysteine, phenylalanine, tyrosine, lysine and
methionine, respectively (https://ecocyc.org/). Some genes involved in purine (purU, gmk, guaC)
and pyrimidine (pyrG, dut, deoA) synthesis were also found to be downregulated. One
hypothesis for these downregulations is the slower growth rate exhibited by P+ cells compared
with P- cells: the latter show a higher rate of gDNA replication and thus a higher demand for
nucleotides than P+ cells (Ow et al. 2006).
So, a possible strategy to increase the production of the plasmid could be to overexpress
some of the genes involved in amino acid and nucleotide synthesis that exhibited a relatively
low expression.
1.3.4. The by-product acetate
To produce plasmids in large amounts, fermentations can be carried out at high cell
concentrations, which can be achieved through the consumption of large amounts of substrate
(e.g. glucose).
However, this can lead to overflow metabolism, characterized by an increased glycolytic
flux that produces acetyl-CoA in excess. Since the TCA cycle is unable to oxidize acetyl-CoA
completely at the rate at which it is formed, the excess is converted into acetate (Wolfe 2005)
(Borja et al. 2012) (Lara et al. 2008). The acetate produced affects cellular growth and
represents a waste of the carbon source (Gonçalves et al. 2013). One strategy to overcome this
is to use glycerol as the carbon source (Carnes et al. 2006). Fed-batch cultures can also be used
instead of batch cultures, to decrease the rate at which glucose is consumed, but this was shown
to affect the growth rate (Lara et al. 2008). To prevent this, Borja et al. (2012) used a strain
(VH33) in which the principal transporter responsible for the internalization of glucose (PTS)
was deleted, so that glucose enters the cell at a slower rate through the galactose permease. The
VH33 strain produced more than twice as much pDNA (2.78 mg.L-1) as its parental strain
W3110 (1.14 mg.L-1), but still less than the DH5α strain (12.73 mg.L-1). VH33 originated only
0.14 g.L-1 of acetate, compared with the 0.32 g.L-1 and 0.62 g.L-1 produced by the parental and
DH5α strains, respectively. Despite the low production of pDNA by VH33 compared to DH5α,
the deletion of the PTS seems to be beneficial, and could be tested in DH5α.
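The comparison between the three strains can be made explicit by expressing each value as a ratio to a reference strain. The sketch below uses only the pDNA and acetate concentrations quoted above from Borja et al. (2012); the dictionary layout and the function name are illustrative assumptions.

```python
# Volumetric pDNA (mg.L-1) and acetate (g.L-1) values quoted in the text.
strains = {
    "VH33": {"pdna_mg_per_l": 2.78, "acetate_g_per_l": 0.14},
    "W3110": {"pdna_mg_per_l": 1.14, "acetate_g_per_l": 0.32},
    "DH5a": {"pdna_mg_per_l": 12.73, "acetate_g_per_l": 0.62},
}


def ratio_to_reference(strain, reference, key):
    """Ratio of one strain's value to that of a reference strain."""
    return strains[strain][key] / strains[reference][key]


pdna_vs_parent = ratio_to_reference("VH33", "W3110", "pdna_mg_per_l")
acetate_vs_parent = ratio_to_reference("VH33", "W3110", "acetate_g_per_l")
pdna_vs_dh5a = ratio_to_reference("VH33", "DH5a", "pdna_mg_per_l")

print(f"pDNA, VH33 vs W3110: {pdna_vs_parent:.2f}-fold")
print(f"acetate, VH33 vs W3110: {acetate_vs_parent:.2f}-fold")
print(f"pDNA, VH33 vs DH5a: {pdna_vs_dh5a:.2f}-fold")
```

With these numbers, VH33 reaches about 2.4 times the pDNA of its parent with less than half the acetate, while still producing only about a fifth of the pDNA obtained with DH5α.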
Gonçalves et al. (2013) performed the deletion of pykA and pykF in the JM101, MG1655
and DH5α strains. These deletions had already been performed in JM101 by Cunningham et al.
(2009). Both genes encode isoenzymes of pyruvate kinase, which catalyses the formation of
pyruvate and ATP (https://ecocyc.org/). Gonçalves et al. (2013) reported an increase in the
production of pDNA in JM101ΔpykFΔpykA (5.3 or 2.6 mg.g DCW-1) compared with
JM101 (2.5 or 1.3 mg.g DCW-1), grown with 5 or 20 g.L-1 of glucose. With GALG1011
(MG1655ΔendAΔrecAΔpykFΔpykA) (6.6 or 1.6 mg.g DCW-1) the yield was also higher than with
its parental strain, MG1655ΔendAΔrecA (3.6 or 0.8 mg.g DCW-1). For DH5αΔpykFΔpykA, in
contrast, the production decreased (0.3 or 0.4 mg.g DCW-1) compared with DH5α (1.8 or
0.8 mg.g DCW-1), when grown with 5+10 g.L-1 (2 pulses) or 20 g.L-1 of glucose, respectively.
In the same study, the effect of a single mutation in either pykF or pykA was also tested (in
JM101 only), and better pDNA yields were obtained than with the double deletion. This may be
explained by the low amount of acetate produced due to the reduced glycolytic flux when the
activity of pyruvate kinase is diminished instead of eliminated, which allows the continuous
production of ATP and an increased flux through the PPP. This study also highlights the
influence of the feeding strategy, since better plasmid yields were obtained when glucose was
supplied as 5+10 g.L-1 (2 pulses) instead of 20 g.L-1, and shows that the effect of each
modification on plasmid production depends on the host genotype.
1.3.5. Stress Responses
Stress responses may be triggered by changes in the composition of the growth medium,
like the depletion of nutrients such as the “building blocks” (amino acids), which may result
from the increased expression of antibiotic resistance proteins (ARPs) encoded in the plasmid.
This may induce the heat shock response, and so heat shock proteins (e.g. chaperones,
proteases) are expressed. They may degrade the ARPs or lead to their accumulation in
aggregates or inclusion bodies, in order to restrict the amount of amino acids spent (Glick
1995). In fact, some researchers observed an increased expression of these proteins, like IbpA,
IbpB, DnaK, HtgA, HtrE (Ow et al. 2006) (Birnbaum & Bailey 1991) (Haddadin et al. 2005).
Amino acid starvation also induces a stringent response (Wegrzyn 1999), mainly regulated by
guanosine tetraphosphate (ppGpp), a signalling molecule produced by RelA and capable of
binding to the RNA polymerase (Nelson et al. 2008) (Wegrzyn 1999). The increase in ppGpp
levels leads to a reduction in rRNA, tRNA and pDNA synthesis, while it activates the
expression of genes associated with amino acid synthesis and even of proteases, to overcome
the amino acid starvation (Silva et al. 2012).
Some studies reviewed by Wegrzyn (1999) evaluated the replication of different replicons
in amino acid starved cells. In most of those studies, the replication of plasmids was found to
be inhibited when cells presented a stringent response (wild-type cells), whereas significant
replication occurred when cells exhibited a relaxed response (relA mutants).
1.3.6. Plasmid Stability
The plasmid (in)stability can be divided into 3 main types: structural, isoform and
segregational. It is affected by the metabolic burden, the copy number of the plasmids, the
substrate used for growth, the medium composition and the culture conditions.
1.3.6.1. Structural stability
Structural instability is related to modifications in the plasmid sequence, such as
point mutations, deletions, insertions and even genomic rearrangements, which may affect the
final therapeutic product and its effectiveness (Silva et al. 2012).
This instability may arise from the environmental conditions, the presence of mutagenic
agents or reactive oxygen species (ROS), errors introduced during replication and improper
repair, and the existence of mobile genetic elements, inverted repeats or secondary
structures, which constitute targets for DNA repair systems (Pósfai et al. 2016) (Silva et al.
2012) (Friehs 2004). To overcome this problem, strains with a mutated recA gene have been
constructed to prevent homologous recombination (Silva et al. 2012). Mutants in the uvrC and
umuC genes, which encode proteins involved in the repair of DNA damage caused by UV light and
in the SOS pathway, respectively, have also been constructed. One example is the SURE strain,
which possesses these latter mutations: when harbouring a plasmid containing inverted repeats,
only 0.8-1.2 % of SURE cells had rearranged it, compared to 15-20 % of DH5α cells,
demonstrating the benefit of those mutations for the maintenance of plasmid stability (Alan L.
Greener & Del Mar 1996).
1.3.6.2. Isoform stability
Plasmids typically assume one of three isoforms: supercoiled (sc), open circular
(oc) and linear (lin). The first is a highly compact and condensed structure. However, if
plasmids are exposed to harsh chemical reagents, physical shear or nucleases, “nicks”/breaks
can be introduced in one or both strands, unfolding the plasmid and giving rise to the
relaxed/open circular or linear isoforms, respectively (Loftus, Bernard T. 1984).
The FDA recommends that more than 80% of the plasmid should be in the sc isoform for
DNA vaccine products, since the sc content represents a measurement of the quality of the
plasmid: when the plasmid is damaged, the sc form is converted into another isoform
(http://www.fda.gov/). Evidence even shows that the supercoiled isoform leads to higher
expression of an exogenous gene compared to the others (Silva et al. 2012). It is therefore
important to ensure that plasmid isoform stability is maintained throughout the whole process
of production and purification.
The two enzymes most involved in the control of supercoiling in E. coli are
topoisomerase I, which removes supercoils, and DNA gyrase (topoisomerase II), responsible for
the introduction of negative supercoils (Hassan et al. 2016). Overexpressing DNA gyrase could
represent one strategy; however, a study revealed only a small dependence of the supercoiling
level on the expression of DNA gyrase. In fact, the authors obtained a 1.3% change in the
level of supercoiling for a 10% change in the expression of DNA gyrase, and likewise for
topoisomerase I (Snoep et al. 2002). A mutation in endA inactivates endonuclease I, a
non-specific endonuclease located in the periplasm that is released when the cells are lysed,
and so its inactivation prevents the introduction of nicks in the plasmid
(Carnes et al. 2006).
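The weak dependence of supercoiling on gyrase expression reported by Snoep et al. (2002) can be made explicit as a ratio of relative changes. The sketch below is illustrative only: the function name is hypothetical, the only inputs are the two percentages quoted above, and summarising them as a single ratio is a simplifying assumption rather than the analysis performed by the authors.

```python
def response_coefficient(output_change_pct: float, input_change_pct: float) -> float:
    """Relative change in an output divided by the relative change in an input."""
    if input_change_pct == 0:
        raise ValueError("input change must be non-zero")
    return output_change_pct / input_change_pct


# Percentages quoted in the text from Snoep et al. 2002:
# a 10% change in gyrase expression gave a 1.3% change in supercoiling.
coefficient = response_coefficient(1.3, 10.0)
print(f"supercoiling response per unit change in expression: {coefficient:.2f}")
```

A ratio far below 1 means that even a large change in enzyme expression would be expected to shift the supercoiling level only slightly, which is why overexpression alone looks like a weak lever.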
1.3.6.3. Segregational stability
Segregational instability is characterized by the inefficient distribution of plasmids
when cells divide, which may happen due to the metabolic burden (Silva et al. 2012). It may
also be due to the accumulation of plasmids inside the cells, which can originate plasmid
multimers and further lead to plasmid loss (Silva et al. 2012) (Yau et al. 2008). To prevent
the presence of plasmid-free cells, plasmids normally carry an antibiotic resistance cassette
and an antibiotic is added to the medium, so that only the cells harbouring the plasmid
survive. Another alternative is to use auxotrophic host strains, which are transformed with a
plasmid that carries a gene to complement the auxotrophy (Friehs 2004). One example is the
E. coli JM83 strain, which is a proline auxotroph and can be complemented with a plasmid
carrying the proBA genes (Fiedler & Skerra 2001). Other selection alternatives are mentioned
in section 1.2.
1.4. Growth conditions
1.4.1. Medium composition
As mentioned before, the substrate sources affect plasmid production. Zheng et al.
(2007) evaluated the effect of carbon and nitrogen sources on plasmid production using DH5α.
They tested the addition of different carbon sources to LB medium. When D-glucose or glycerol
was added, DH5α produced 7.73 mg.L-1 or 7.97 mg.L-1, respectively, less than the 14.63 mg.L-1
obtained on LB alone. An increase was observed when D-fructose, D-lactose, mannitol or sucrose
was added, with 14.67 mg.L-1, 21.93 mg.L-1, 27.67 mg.L-1 and 28.27 mg.L-1 produced,
respectively. Regarding the nitrogen source, they evaluated the substitution of tryptone in
the LB medium by (NH4)2SO4, soya peptone, proteose peptone and casein peptone. A higher
plasmid concentration was obtained with casein peptone (28.67 mg.L-1) than with
tryptone (14.63 mg.L-1).
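To make the comparison between carbon sources easier to read, the titres quoted above can be expressed relative to the unsupplemented LB value. This is a minimal sketch: the numbers are those cited from Zheng et al. (2007) in the text, while the variable and function names are illustrative choices.

```python
LB_TITRE = 14.63  # mg/L of plasmid on unsupplemented LB, as quoted in the text

# Plasmid titres (mg/L) with each carbon source added to LB, as quoted in the text
carbon_supplements = {
    "D-glucose": 7.73,
    "glycerol": 7.97,
    "D-fructose": 14.67,
    "D-lactose": 21.93,
    "mannitol": 27.67,
    "sucrose": 28.27,
}


def fold_change(titre: float, baseline: float = LB_TITRE) -> float:
    """Titre expressed as a multiple of the baseline titre."""
    return titre / baseline


# Rank the supplements from highest to lowest titre
ranked = sorted(carbon_supplements.items(), key=lambda item: item[1], reverse=True)
for source, titre in ranked:
    print(f"{source:<11} {titre:6.2f} mg/L  {fold_change(titre):.2f}x LB")
```

Values below 1 (glucose, glycerol) correspond to a decrease relative to LB, and values above 1 to an increase.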
1.4.2. Temperature
Temperature influences bacterial growth and metabolism. Silva et al. (2009) reported
an increase in the specific yield and purity of the plasmids, associated with a slower
growth rate of the cultures, at higher temperatures (37ºC and 40ºC) compared with 30ºC and
32ºC. The increase in pDNA production leads to a higher metabolic burden, imposing stress on
the cells and affecting their growth.
1.5. Genetic engineering of bacteria
The act of directly modifying the genome of an organism by either adding or deleting
DNA segments is termed genetic engineering. It is done to change specific characteristics,
and there are several methods to perform it. One of them is homologous recombination, which
can be accomplished using the λ-Red recombinase system (comprising the exo, bet and gam
genes) of a bacteriophage. These genes encode the enzymes responsible for the recombination
between the supplied DNA fragment (which contains the desired mutation) and the target DNA
(to be modified) present in the organism’s genome. The exo gene encodes an exonuclease that
binds to linear DNA and degrades the 5’ ends, creating single-stranded (ss) 3’ overhangs.
The Beta protein binds to these overhangs and promotes annealing with the target DNA by
complementary base pairing through the homologous sequences, while the Gam protein prevents
the degradation of the supplied linear double-stranded DNA (dsDNA) fragment by the
endogenous RecBCD nuclease. Through this homologous recombination, a recombinant DNA
carrying the desired mutation is formed (Sharan et al. 2009) (Figure 6).
After the homologous recombination procedure in the desired microorganism, the
recombinants must be selected.
Figure 6- Schematic representation of the homologous recombination performed by the λ-Red
system. The Exo enzyme has 5’ to 3’ exonuclease activity and, after binding to the linear
DNA, creates single-stranded 3’ overhangs. The Beta enzyme binds to these overhangs and
stimulates annealing with the target DNA by base-pair homology, creating the recombinant DNA
(from Sharan et al. 2009).
1.5.1. Recombinants selection methodology
1.5.1.1. Antibiotic resistance cassette
One of the most well-known selection procedures relies on the use of an antibiotic
resistance cassette, such as the one described by Datsenko & Wanner (2000). This method is
based on the replacement of a target gene by an antibiotic resistance cassette, which is
constructed in a PCR reaction using two primers. The forward and reverse primers each contain
a homology arm (H1 and H2, respectively) complementary to the regions upstream and downstream
of the target gene. Each primer also has a sequence complementary to the antibiotic
resistance gene (P1 and P2 sites). After amplification, the cassette is used to transform
cells that express the λ-Red system, where recombination occurs through the flanking
homologies. The cells are then plated on a medium supplemented with the antibiotic to which
the cassette confers resistance. After the selection of the transformants, the resistance
gene can be eliminated using the Flippase recombinase (FLP), which promotes recombination
between the short Flippase recognition target (FRT) sites (Figure 7).
Figure 7- Schematic representation of the main steps in the Datsenko and Wanner genetic
engineering procedure. In a first step an antibiotic resistance cassette is amplified by PCR
with homology arms (H1 and H2), complementary to the upstream and downstream sequences of the
target gene. The cassette is then introduced in the genome through homologous recombination
with the target gene, performed by the λ-Red system. In a last step the FLP recombinase is
expressed to remove the cassette from the genome, leaving a “scar” with an FRT site (from
Datsenko & Wanner 2000).
With the elimination of the antibiotic resistance cassette, a “scar” containing an FRT site
is left in the chromosome. This may represent a problem, since it can generate chromosomal
instability or even undesired genomic rearrangements, due to possible recombination between
FRT sites located in other places of the genome (Datsenko & Wanner 2000). This problem may be
solved using the no-SCAR (Scarless Cas9 Assisted Recombineering) method described by Reisch &
Prather (2015), which does not leave any scar in the genome. In this procedure, after the
homologous recombination through the λ-Red system, the selection of the recombinants is
instead based on the use of an engineered Cas9-gRNA complex (explained in section 1.5.1.2).
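The primer layout described for the antibiotic resistance cassette (a homology arm followed by a priming site) can be sketched in a few lines. This is an illustration under stated assumptions, not the published protocol: the function names, the default arm length of 40 nt and the convention that all inputs are written 5’ to 3’ on the top strand are choices made here, and the sequences in the usage lines are short invented placeholders rather than real P1/P2 sites or real E. coli flanks.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_cassette_primers(upstream, downstream, p1, p2, arm_length=40):
    """Build forward and reverse primers as homology arm + priming site.

    Assumptions (illustrative): every argument is given 5'->3' on the top
    strand; `upstream` ends right before the target gene, `downstream`
    starts right after it; `p1` and `p2` are the first and last bases of
    the cassette template.
    """
    if len(upstream) < arm_length or len(downstream) < arm_length:
        raise ValueError("flanking sequences are shorter than the homology arm")
    h1 = upstream.upper()[-arm_length:]    # arm H1: bases just before the gene
    h2 = downstream.upper()[:arm_length]   # arm H2: bases just after the gene
    forward = h1 + p1.upper()
    # The reverse primer anneals to the top strand, so it is the reverse
    # complement of (P2 + H2): H2 ends up at its 5' end, P2 at its 3' end.
    reverse = reverse_complement(p2 + h2)
    return forward, reverse


# Toy placeholder sequences, not real priming sites or genomic flanks
fwd, rev = design_cassette_primers("AAAACCCC", "GGGGTTTT", "ATG", "TAA", arm_length=4)
print(fwd, rev)
```

Keeping the homology arm at the 5’ end of each primer leaves the 3’ end free to prime on the cassette template, which is what allows the PCR product to carry H1 and H2 at its two extremities.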
1.5.1.2. Engineered Cas9-gRNA complex
The engineered Cas9-gRNA complex (Figure 8) is based on the CRISPR/Cas system
(Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated proteins), the
“immune mechanism” of prokaryotes (Rath et al. 2015). It is composed of the endonuclease
Cas9, which has two lobes: one recognizes the target sequence to be cleaved (with the help
of the gRNA) and the other is the nuclease that cleaves it (Jinek et al. 2012). The guide RNA
(gRNA) results from the fusion of a CRISPR RNA (crRNA) and part of the trans-activating
CRISPR RNA (tracrRNA). The crRNA contains a sequence (~20 nt, complementary to the target
DNA) that guides the Cas9 to the target DNA by complementary base pairing, where both strands
are cleaved, introducing a double strand break (DSB). However, the Cas9 only cleaves if the
target DNA is adjacent to a short sequence termed the protospacer adjacent motif (PAM), which
is usually the triplet NGG (N represents any nucleotide and G a guanine) (Sander & Joung
2014). So, the target sequence needs to be in the form 5’-N20NGG-3’. The DSB is introduced
about 3 to 4 nt upstream of the PAM sequence (http://www.addgene.org/crispr/guide/). The PAM
requirement does not limit the targeting of the Cas9, since there are around 424,651 GG
doublets on both strands of the E. coli genome (Reisch & Prather 2015). Part of the crRNA
sequence also hybridizes with the tracrRNA, which is also important in guiding the Cas9 to
the target, since the mature crRNA is unable to do it alone (Jinek et al. 2012). By simply
altering the first 20 nt of the gRNA (corresponding to the protospacer in the crRNA), the
target DNA can be changed (Sander & Joung 2014), which is an advantage compared to other
systems.
Figure 8- Schematic illustration of the engineered Cas9-gRNA complex cleaving a target DNA
sequence. crRNA- CRISPR RNA, tracrRNA- trans-activating CRISPR RNA (from Sander & Joung 2014).
The complex has been used in diverse applications. In the no-SCAR methodology it is used in
the selection of the recombinants: the complex is induced to introduce a double strand break
(DSB) in the sequences that were not recombined (because the gRNA was designed to target that
sequence) and, because E. coli lacks the non-homologous end joining (NHEJ) mechanism and is
unable to repair that break, the cell dies. Conversely, if the homologous recombination
occurs properly, the target sequence is no longer there, the gRNA does not hybridize, the
Cas9 is unable to cleave the DNA and the cell survives (Figure 9) (Peters et al. 2015)
(Reisch & Prather 2015). By this negative selection (or counterselection) it is possible to
distinguish the cells that have been modified (the recombinants) from those that have not,
without leaving a “scar” in the genome as in the Datsenko and Wanner methodology.
Figure 9- General scheme of the counterselection of recombinants using the Cas9-sgRNA complex
(from Peters et al. 2015).
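The 5’-N20NGG target rule and the counterselection logic described above can be illustrated with a short script. It is a sketch under simplifying assumptions: sequences are plain strings of A, C, G and T, the cut position is fixed at 3 nt upstream of the PAM (the text gives 3 to 4 nt), only exact matches are considered, and a cell is taken to "survive" simply when no targetable site remains. The function names and the toy sequences are invented for illustration.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_cas9_sites(seq: str, protospacer_len: int = 20):
    """List (strand, start, protospacer, pam, cut_index) for every N20-NGG site.

    Positions are 0-based on the strand that was scanned. A lookahead is
    used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    sites = []
    top = seq.upper()
    for strand, s in (("+", top), ("-", reverse_complement(top))):
        for m in pattern.finditer(s):
            cut_index = m.start() + protospacer_len - 3  # assumed: 3 nt upstream of the PAM
            sites.append((strand, m.start(), m.group(1), m.group(2), cut_index))
    return sites


def survives_counterselection(genome: str, protospacer: str) -> bool:
    """True when no strand still carries the protospacer followed by an NGG PAM."""
    target = re.compile(re.escape(protospacer.upper()) + "[ACGT]GG")
    top = genome.upper()
    return not any(target.search(s) for s in (top, reverse_complement(top)))


# Toy example: the unedited locus keeps the target, the edited locus has lost it
protospacer = "GATTACAGATTACAGATTAC"
unedited = "TTTT" + protospacer + "TGG" + "AAAA"
edited = "TTTT" + "AAAA"
print(find_cas9_sites(unedited))
print(survives_counterselection(unedited, protospacer))
print(survives_counterselection(edited, protospacer))
```

In this toy case the unedited sequence is reported as cleavable and the edited one is not, mirroring the way non-recombinant cells are eliminated while recombinants are retained.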
1.5.1.2.1. Cas9-gRNA complex versus other alternatives
The introduction of a DSB in the genome can also be performed by other methods. Some
examples include DNA-binding proteins such as meganucleases, which are engineered restriction
enzymes derived from microbial mobile genetic elements, as well as zinc finger nucleases
(ZFN) and transcription activator-like effector nucleases (TALEN). The latter two are
composed of an engineered zinc finger or a TAL effector DNA-binding domain, respectively,
responsible for the recognition of and binding to the target DNA, and of a non-specific
nuclease domain from a restriction enzyme, FokI, that cleaves the target DNA (Xiao-jie et al.
2015).
However, whenever the target DNA sequence is to be changed using one of these
methods, the de novo synthesis of the protein that guides the ZFN or TALEN is required, a
more complex procedure than the design of a new complementary gRNA for the Cas9-gRNA complex.
Another advantage of the Cas9-gRNA system is that it is possible to introduce DSBs at
multiple sites in parallel (multiplexing), by expressing different gRNAs, each targeting a
distinct sequence, from a “single construct” (Xiao-jie et al. 2015) (Sander & Joung 2014)
(Figure 10).
Figure 10- Illustration of the expression of multiple gRNA, from a “single construct” (from
http://blog.addgene.org/crispr-101-multiplex-expression-of-grnas).
Although Cas9-gRNA represents a good alternative, one disadvantage is the possibility
of Cas9 cleaving at off-target sites, since any 20 nt target sequence adjacent to a PAM can
have hundreds of potential off-target positions. To overcome this, researchers have tested a
diversity of strategies. One example is to “modify the gRNA architecture”, either by
truncating the 3’ end or by adding two extra guanine nucleotides to the 5’ end; this resulted
in lower off-target effects, but also lowered the efficiency of on-target genome editing
(Sander & Joung 2014).
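The notion of a potential off-target position can be illustrated by listing PAM-adjacent sequences that differ from the guide by only a few bases. This is a deliberately crude sketch: it counts mismatches uniformly along the 20 nt, whereas real off-target activity depends on where the mismatches fall and on other factors not modelled here. The function names, the mismatch threshold and the toy sequences are assumptions made for illustration.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(genome: str, guide: str, max_mismatches: int = 3):
    """PAM-adjacent sites within `max_mismatches` of the guide, on both strands.

    Returns (strand, start, site_sequence, mismatch_count) tuples, with
    0-based positions on the strand that was scanned.
    """
    guide = guide.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{len(guide)}}})[ACGT]GG)")
    hits = []
    top = genome.upper()
    for strand, s in (("+", top), ("-", reverse_complement(top))):
        for m in pattern.finditer(s):
            distance = mismatches(m.group(1), guide)
            if distance <= max_mismatches:
                hits.append((strand, m.start(), m.group(1), distance))
    return hits


# Toy genome: one perfect site and one site with two mismatches at its 3' end
guide = "GATTACAGATTACAGATTAC"
near_match = "GATTACAGATTACAGATTTT"
toy_genome = guide + "AGG" + "TTTT" + near_match + "CGG"
hits = candidate_sites(toy_genome, guide)
for hit in hits:
    print(hit)
```

A site with zero mismatches is the intended target; the remaining entries are the kind of near-matches that strategies such as gRNA truncation aim to discriminate against.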
Other methodologies, like the TetA-SacB dual selection and SceI counterselection, are
also based on Cas9 counterselection; however, they do not show how the plasmids encoding Cas9
and the sgRNA are removed, whereas the no-SCAR protocol does, which makes it more
advantageous (Reisch & Prather 2015).
2. Thesis Objective
The increased demand for biopharmaceutical plasmids for therapeutic applications has
stimulated new studies to improve the yield and quality of the plasmids produced in the host
bacteria. One strategy is to construct optimized strains able to direct the metabolic flux
towards the production of the compounds necessary for the amplification and maintenance of
plasmids, while ensuring their stability. However, most studies used strains with a highly
mutagenized genotype, first developed to produce recombinant proteins and to clone exogenous
genes. So, the consequence of specific mutations for the production of plasmids is not clear,
and these strains may not ensure the best conditions for plasmid production (Gonçalves et al.
2013).
Here the aim was to create E. coli strains able to produce high plasmid yields, by deleting
the pgi gene from their genome, and to compare their production, since this deletion has
already been shown to be beneficial in GALG20 (MG1655∆endA∆recA∆pgi) by Gonçalves et al.
(2013). This gene encodes the key enzyme glucose-6-phosphate isomerase, involved in the
glycolysis/gluconeogenesis pathway and responsible for the interconversion of
glucose-6-phosphate and fructose-6-phosphate. This deletion aims to direct the carbon flux
through the pentose phosphate pathway, to increase the production of nucleotides and hence of
plasmids.
However, GALG20 has shown a genome instability problem, an unwanted deletion of a
20 kbp sequence. For that reason, the aim is to introduce the deletion in the wild-type
MG1655 and in the highly mutagenized DH5α strain. The deletion methodology is based on the
novel no-SCAR (Scarless Cas9 Assisted Recombineering) method described by Reisch & Prather
(2015), a scar-free system that may prevent host genome instability. The plasmid production
of both pgi-deleted strains, in two different background genotypes, will then be compared.
Additionally, the pgi deletion is intended in the BW27783 and BW2P strains, used to produce
minicircles, which are plasmid-derived molecules. Minicircles represent a promising non-viral
vector in therapeutic applications, because they contain only the eukaryotic sequences
needed for the therapeutic effect. Therefore, pgi-deleted strains will be compared before and
after the induction (with arabinose) of a recombinase able to convert a parental plasmid into
miniplasmid and minicircle molecules, to evaluate whether the pgi deletion is beneficial for
minicircle production.
3. Materials and Methods
3.1. Media, chemicals and other reagents
Bacterial growth media: LB medium (NZYTech) (10 g.L-1 of tryptone, 5 g.L-1 yeast
extract and 10 g.L-1 NaCl), LB Agar (NZYTech) (10 g.L-1 of tryptone, 5 g.L-1 yeast extract,
10 g.L-1 NaCl and 15 g.L-1 of agar), SOB medium (20 g.L-1 of Bacto Tryptone (BD), 5 g.L-1 of
Yeast Extract (Liofilchem), 10 mM of NaCl (Panreac), 2.5 mM of KCl (Merck), 10 mM of
MgCl2.6H2O (Fagron), 10 mM of MgSO4 (Merck),