Improvement of biopharmaceutical-grade plasmid production by

targeted genome editing of Escherichia coli

Marisa Mariano Faustino

Thesis to obtain the Master of Science Degree in

Biotechnology

Supervisor: Prof. Gabriel António Amaro Monteiro

Examination Committee

Chairperson: Prof. Leonilde de Fátima Morais Moreira

Supervisor: Prof. Gabriel António Amaro Monteiro

Members of the Committee: Prof. Duarte Miguel de França Teixeira dos Prazeres

October 2017

Acknowledgments

First, I would like to thank my supervisor, Professor Gabriel Monteiro, for allowing me to take part in this project, and for the guidance, motivation and advice that helped me improve throughout the work.

I would also like to acknowledge Claudia Alves for her remarkable work in teaching me the laboratory procedures I needed to accomplish my work, for being available whenever I had doubts, and for her patience. I also thank Sofia Duarte, who helped me in some parts of the work, and the laboratory staff, Ricardo Pereira and Rosa Gonçalves, for organizing the lab material so well.

I am also grateful to all my friends and colleagues from the iBB laboratory group, from the master's degree in Biotechnology, from my village, and from my residence.

I would like to express my gratitude to my parents for all the efforts they have made for me and for supporting me in achieving my goals. And last but not least, to my dear friends Sara Mariano and Pedro Alves, for all the meaningful conversations, motivation and emotional support through all these years.

Resumo

Growing interest in the therapeutic use of plasmids has stimulated studies aimed at optimizing their production, one example being the modification of the genome of the host bacterium in which they are amplified. However, the effect of each modification on plasmid production depends on the genotype of that host. The aim of this work is therefore to create E. coli strains capable of producing plasmids at high yield by deleting the pgi gene from the genome, a deletion already shown to be beneficial in the GALG20 strain (MG1655∆endA∆recA∆pgi) constructed in Gonçalves et al. 2013. This deletion is intended to direct the carbon flux towards the pentose phosphate pathway, so as to increase the amount of nucleotides produced and, consequently, of plasmids. Because GALG20 later exhibited genomic instability, the aim was to perform the deletions in the wild-type strain MG1655 and in the highly mutagenized strain DH5α using the no-SCAR methodology, a system that leaves no “scars” and may therefore avoid genome instability, and then to compare the plasmid production of these two pgi-deleted strains with their two different genotypes. Deletion of this gene was also attempted in the strains BW27783 and BW2P, which are used in the production of minicircles, molecules derived from plasmids. Despite several attempts, the deletions were not achieved with the no-SCAR method, and for that reason the Datsenko and Wanner methodology was employed, which, although it leaves a “scar” in the genome, has proved to be efficient. The BW2P∆pgi strain was successfully constructed with this latter method, producing 1.2-fold more parental plasmid (161.3 ± 71.6 µg·L⁻¹·OD₆₀₀⁻¹) than its parental strain, BW2P (133.8 ± 34.2 µg·L⁻¹·OD₆₀₀⁻¹), before induction with L-arabinose. After induction, the mutated strain produced 1.9-fold more plasmid (289.8 ± 44.0 µg·L⁻¹·OD₆₀₀⁻¹) than the BW2P strain (156.5 ± 45.8 µg·L⁻¹·OD₆₀₀⁻¹). In conclusion, the pgi deletion proved beneficial in increasing the specific yield of plasmid produced.

Keywords: gene therapy, metabolic engineering, pgi gene, plasmids, minicircles, pentose phosphate pathway.

Abstract

The growing interest in plasmids in the therapeutic field has stimulated new studies to optimize their production, such as modifying the genome of the host bacteria in which they are amplified. However, the effect of each modification on plasmid production can depend on the host genotype. Here the aim was to create E. coli strains that produce high plasmid yields by deleting the pgi gene from their genome, since this deletion was shown to be beneficial in GALG20 (MG1655∆endA∆recA∆pgi) in Gonçalves et al. 2013. The deletion is intended to direct the carbon flux through the pentose phosphate pathway, increasing the production of nucleotides and hence of plasmids. As GALG20 later presented a genome instability problem, the aim was to make the deletions in the wild-type MG1655 and the highly mutagenized DH5α strains using the no-SCAR methodology, a scar-free system that may prevent host genome instability, and then to compare the plasmid production of both pgi-deleted strains in the two different background genotypes. The deletion of the pgi gene was also pursued in the BW27783 and BW2P strains, which are used to produce minicircles, plasmid-derived molecules. Despite several attempts, deletions with the no-SCAR method were not achieved, so the Datsenko and Wanner methodology was employed instead; although it leaves a “scar” in the genome, it has been shown to be efficient. BW2P∆pgi was successfully constructed with this latter method, producing 1.2-fold more parental plasmid (161.3 ± 71.6 µg·L⁻¹·OD₆₀₀⁻¹) than its parental strain, BW2P (133.8 ± 34.2 µg·L⁻¹·OD₆₀₀⁻¹), before induction with L-arabinose. After induction, the modified strain produced 1.9-fold more plasmid (289.8 ± 44.0 µg·L⁻¹·OD₆₀₀⁻¹) than the BW2P strain (156.5 ± 45.8 µg·L⁻¹·OD₆₀₀⁻¹). In conclusion, the pgi deletion proved beneficial for increasing the specific plasmid yield.

Keywords: gene therapy, metabolic engineering, pgi gene, plasmids, minicircles, pentose

phosphate pathway (PPP)

Table of Contents

Acknowledgments
Resumo
Abstract
List of Figures
List of Tables
List of Abbreviations
1. Introduction
1.1. Gene therapy
1.1.1. Carrier vectors (viral and non-viral)
1.1.2. Clinical trials
1.2. Minicircles
1.2.1. Synthesis of minicircles
1.3. Metabolism
1.3.1. Glycolysis
1.3.2. Pentose phosphate pathway (PPP)
1.3.3. Biosynthetic pathways
1.3.4. The by-product acetate
1.3.5. Stress Responses
1.3.6. Plasmid Stability
1.3.6.1. Structural stability
1.3.6.2. Isoform stability
1.3.6.3. Segregational stability
1.4. Growth conditions
1.4.1. Medium composition
1.4.2. Temperature
1.5. Genetic engineering of bacteria
1.5.1. Recombinants selection methodology
1.5.1.1. Antibiotic resistance cassette
1.5.1.2. Engineered Cas9-gRNA complex
2. Thesis Objective
3. Materials and Methods
3.1. Media, chemicals and other reagents
3.2. Cell storage
3.3. Transformations
3.3.1. Preparation of competent cells
3.3.1.1. Chemical competent cells
3.3.1.2. Electrocompetent cells
3.3.2. Heat shock transformation
3.3.3. Electroporation
3.4. Genomic deletion with the no-SCAR system
3.4.1. Strains and plasmids
3.4.2. Oligonucleotide design
3.4.3. Procedure
3.5. Genomic deletion with the Datsenko and Wanner methodology
3.5.1. Strains and plasmids
3.5.2. Amplification of the pgi-kanamycin resistance cassette
3.5.3. Chromosomal integration of the pgi-kanamycin cassette
3.5.4. Colony PCR
3.5.5. Elimination of the antibiotic resistance cassette
3.5.6. Procedure
3.6. Genomic PCR analysis
3.7. Plasmid purification
3.8. Genomic DNA purification
3.9. PCR products purification
3.10. DNA quantification and quality assessment
3.11. Restriction enzyme digestion
3.12. Agarose gel electrophoresis
3.13. Minicircle production
3.14. Recombination analysis by densitometry
3.15. DNA sequencing
4. Results and discussion
4.1. Genomic deletion with the no-SCAR system
4.1.1. Purification of the plasmids
4.1.2. The pKDsgRNA-pgi construction
4.1.3. Recombination between the oligonucleotide and the upstream and downstream areas of the pgi gene
4.2. Deletion with the Datsenko and Wanner methodology
4.2.1. Purification of the plasmids
4.2.2. pgi kanamycin cassette production
4.2.3. Deletion of the pgi gene
4.2.4. Minicircle production
5. Conclusions and future work
6. References
Annexes
Annex A – sgRNA-pgi sequence
Annex B – Region between gam and araC of pKDsgRNA-xxx
Annex C – Alignment of the sequencing result obtained with a forward primer from a region that included the sgRNA (upper sequence) with its theoretical sequence (lower sequence) of the pKDsgRNA-pgi plasmid.
Annex D – Alignment of the sequencing result obtained with a forward primer from a region that included the overlapping region between gam and araC (320 nt) (upper sequence) with its theoretical sequence (lower sequence) of the pKDsgRNA-pgi plasmid.
Annex E – Alignment of the sequencing result obtained with a reverse primer from a region that included the overlapping region between gam and araC (320 nt) (upper sequence) with its theoretical sequence (lower sequence) of the pKDsgRNA-pgi plasmid.
Annex F – Alignment between the sequenced PCR product resulting from the amplification of the “scar” (above), obtained with a reverse primer, of the BW2P∆pgi strain with its theoretical sequence inserted in the genome (below).

List of Figures

Figure 1- Vectors used in gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).

Figure 2- Indications addressed by gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).

Figure 3- Phases of gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).

Figure 4- Schematic illustration of the recombination of a parental plasmid into a minicircle and miniplasmid. ORI- bacterial origin of replication; GOI- gene of interest; MRS- multimer resolution site (from doi:10.1128/microbiolspec.PLAS-0022-2014.f3).

Figure 5- Schematic representation of glycolysis and the pentose phosphate pathway (adapted from: http://schoolbag.info/chemistry/mcat_biochemistry/59.html).

Figure 6- Schematic representation of the homologous recombination carried out by the λ-Red system. The exo enzyme has 5’ to 3’ exonuclease activity and, after binding to the linear DNA, creates single-stranded 3’ overhangs. The beta enzyme binds to these overhangs and stimulates annealing to the target DNA by base pair homology, creating the recombinant DNA (from Sharan et al. 2009).

Figure 7- Schematic representation of the main steps in the Datsenko and Wanner genetic engineering procedure. In a first step, an antibiotic resistance cassette is amplified by PCR with homology arms (H1 and H2) complementary to the upstream and downstream sequences of the target gene. The cassette is then introduced into the genome through homologous recombination with the target gene, performed by the λ-Red system. In a last step, the FLP recombinase is expressed to remove the cassette from the genome, leaving a “scar” with an FRT site (from Datsenko & Wanner 2000).

Figure 8- Schematic illustration of the engineered Cas9-gRNA complex cleaving a target DNA sequence. crRNA- CRISPR RNA, tracrRNA- transactivating CRISPR RNA (from Sander & Joung 2014).

Figure 9- General scheme of the counterselection of recombinants using the Cas9-sgRNA complex (from Peters et al. 2015).

Figure 10- Illustration of the expression of multiple gRNAs from a “single construct” (from http://blog.addgene.org/crispr-101-multiplex-expression-of-grnas).

Figure 11- Schematic representation of the two plasmids used in the no-SCAR protocol. (A)- plasmid pCas9cr4 (6,770 bp), and (B)- plasmid pKDsgRNA-xxx (“xxx” represents the name of the target gene) (6,959 bp) (from Reisch & Prather 2015).

Figure 12- Schematic representation of the construction of pKDsgRNA-pgi, using the pKDsgRNA-ack plasmid as template in two independent PCR reactions to produce fragments 1 and 2, which are then joined in a CPEC reaction to build pKDsgRNA-pgi. P1 and P2 are used to construct fragment 1, and P3 and P4 fragment 2. P1 represents the forward primer pgi-pfrag1fwd, P2 the reverse primer pgi-pfrag1rev, P3 the reverse primer pgi-pfrag2rev and P4 the forward primer pgi-pfrag2fwd. The orange region represents the sgRNA-ack, the red sequences the sgRNA-pgi, and the brown circles the nicks (adapted from Cipriano 2017).

Figure 13- Schematic representation of the designed oligonucleotide (73 bp), with sequences homologous to the regions upstream (blue) and downstream (green) of the pgi gene. The red cross represents the PAM site (adapted from Reisch & Prather 2015).

Figure 14- Schematic procedure of the no-SCAR (Scarless Cas9 Assisted Recombineering) system for genome editing described by Reisch & Prather 2015 (from Reisch & Prather 2015).

Figure 15- Schematic representation of the 3 plasmids used in the Datsenko and Wanner methodology: pKD46 (A), pCP20 (B) and pKD13 (C). (A) from http://www.biofeng.com/zaiti/dachang/pKD46.html, (B) from http://www.youbio.cn/product/vt1693 and (C) from Cipriano 2017.

Figure 16- Schematic representation of the parental plasmid (PP) pMINILi-VEGF-GFP, and of the miniplasmid (MP) and minicircle (MC) generated by recombination between the multimer resolution sites (MRS) by the ParA resolvase, induced by the addition of L-arabinose to the growth medium (from Alves et al. 2016).

Figure 17- Schematic illustration of the amplification of the pgi-kanamycin cassette by PCR using the forward primer kancassette_pgi_F (containing homology site 1 (H1) and priming site 1 (P1)) and the reverse primer kancassette_pgi_R (containing homology site 2 (H2) and priming site 2 (P2)). Homologous recombination between the pgi-kan cassette and the target pgi gene, performed by the λ Red system enzymes, then inserts the cassette into the genome at the pgi locus. Also shown is the annealing of the primers (pgi_conf_F and pgi_conf_R) used to confirm the insertion of the cassette (adapted from Datsenko & Wanner 2000).

Figure 18- Agarose gel electrophoresis of (A)- purified pKDsgRNA-ack isolated from DH5α cells (lane 1), and after HindIII enzymatic digestion (lane 3); (B)- purified pCas9cr4 from provided DH5α cells (lane 1), and after BamHI digestion (lane 3); (C)- purified pCas9cr4 isolated from a transformed DH5α colony (lane 1), and after BamHI digestion (lane 3); (D)- purified pCas9cr4 isolated from a transformed MG1655 colony (lane 1), and after BamHI digestion (lane 3). NZYDNA Ladder III (lanes 2).

Figure 19- Agarose gel electrophoresis of the two fragments amplified by PCR from pKDsgRNA-ack, to be used for the construction of pKDsgRNA-pgi. Fragment 1 (lane 1), fragment 2 (lane 3). NZYDNA Ladder III (lane 2).

Figure 20- Agarose gel electrophoresis of the purified pKDsgRNA-pgi from five colonies (1, 2, 3, 4, 5) of DH5α, lanes 2, 3, 4, 5, 6, respectively.

Figure 21- Alignment of the partial sequencing result obtained (with a forward primer) from a region that included the sgRNA (within the box) (above) with its theoretical sequence (below) of the pKDsgRNA-pgi plasmid.

Figure 22- Alignment of the partial sequencing result obtained (with a forward primer) from a region that included the overlapping region between gam and araC (320 nt) (within the box) (above) with its theoretical sequence (below) of the pKDsgRNA-pgi plasmid.

Figure 23- Agarose gel electrophoresis. Purified pCas9cr4 and pKDsgRNA-pgi extracted from DH5α (lane 2) and MG1655 (lane 3) after enzymatic digestion with BamHI. NZYDNA Ladder III molecular weight marker (lane 1).

Figure 24- Schematic representation of the deletion of the pgi gene from the genome by the no-SCAR system described by Reisch & Prather 2015. In a first step, the supplied 73 bp oligonucleotide undergoes homologous recombination with the upstream (blue) and downstream (green) regions of the pgi gene, with the help of the λ Red system, giving rise to chimeric genomic DNA. After replication of the chimeric DNA, one population is recombinant (with pgi deleted) and another is wild-type. When the cells are spread on plates containing aTc (which induces the expression of the Cas9-sgRNA-pgi), the recombinants survive while the wild-type cells die. The PAM site is represented by a red cross (adapted from Reisch & Prather 2015 and Peters et al. 2015).

Figure 25- Agarose gel electrophoresis of the products of PCR amplification of an upstream and downstream region of the pgi gene with the pgi_conf_F and pgi_conf_R primers, on 12 colonies (1-12) of the MG1655/pCas9cr4/pKDsgRNA-pgi strain (A) and of the DH5α/pCas9cr4/pKDsgRNA-pgi strain (B), lanes 1-12, respectively, and on the MG1655 and DH5α strains (C), lanes 1 and 2, respectively. NZYDNA Ladder III (lane M).

Figure 26- Agarose gel electrophoresis of the fragment resultant from the amplification of an

upstream and downstream region of the recA locus, with the primers recA_check_F and

recA_check_R), on colonies 3, 4 and 12 of MG1655/pCas9cr4/pKDsgRNA-pgi strain, lanes 2,

3, 4, respectively. And on colonies 10, 11 and 2, of DH5α/pCas9cr4/pKDsgRNA-pgi strain,

lanes 5, 6 and 7, respectively. NZYDNA Ladder III lane 1. .................................................... 49

Figure 27- Agarose gel electrophoresis resultant from the amplification of an upstream and

downstream region of the pgi gene, or an internal region of the pgi gene by a PCR. On 3 colonies

(3, 4 and 12) of the MG1655/pCas9cr4/pKDsgRNA-pgi, lanes 2, 4 and 6, with the pgi_fwr1

and pgi_rev1 primers and lanes 3,5 and 7 with the pgi_fwr2 and pgi_rev2 primers, respectively.

And on 3 colonies (10, 11, 2) of DH5α/pCas9cr4/pKDsgRNA-pgi, lanes 8, 10 and 12, with the

pgi_fwr1 and pgi_rev1 primers and lanes 9, 11 and 13, with the pgi_fwr2 and pgi_rev2,

respectively. Set of primers A- pgi_fwr1 and pgi_rev1, B- pgi_fwr2 and pgi_rev2. .............. 50

Figure 28- Analysis of pKD13, pKD46 and pCP20 plasmids by agarose gel electrophoresis.

(A)-Purified pKD13 extracted from DH5α (lane 2), and after enzymatic digestion (lane 3).

Purified pKD46 extracted from DH5α, BW27783 and BW2P (lane 4, 6 and 8), and after

enzymatic digestion (lane 5, 7 and 9), respectively. (B)- Purified pCP20 extracted from DH5α

after enzymatic digestion (lane 2). All digestions were performed with BamHI. NZYDNA

Ladder III molecular weight marker (lanes 1). ........................................................................ 52

Figure 29 – Agarose gel electrophoresis of the amplified pgi kan-cassette after purification (lane

2). NZYDNA Ladder III (lane 1). ............................................................................................ 53

Figure 30- Agarose gel electrophoresis of the PCR product resultant from the pgi kan-cassette

introduced in the genome amplification of 6 colonies of BW27783 (lanes 2-7), and BW2P

(lanes 8-13) strains. NZYDNA Ladder (I) (lane 1). ................................................................. 54

Figure 31- Agarose gel electrophoresis of the PCR products used to analyse the presence of the “scar” in the pgi locus in the genome of 4 colonies of BW2P (lanes 2-5). NZYDNA Ladder I (lane 1).

Figure 32- Alignment between the sequenced PCR product resulting from the amplification of the “scar” of the BW2P strain (above; sequenced with a forward primer) and its theoretical sequence inserted in the genome (below).

Figure 33- Agarose gel electrophoresis of ~650 ng of the plasmids purified from cell cultures of BW2P∆pgi and BW2P. Before the induction with L-(+)-arabinose: non-digested (lanes 2 and 4) and digested with the ApaLI enzyme (lanes 3 and 5), respectively. After 2 h of induction: non-digested (lanes 6 and 8) and digested with the ApaLI enzyme (lanes 7 and 9), respectively. Marker: NZYDNA Ladder III (lane 1). The DNA isoforms observed are specified on the right of the figure. Abbreviations: oc- open circular, lin- linear, sc- supercoiled, PP- parental plasmid, MC- minicircle, MP- miniplasmid.

Figure 34- (A) Growth curves of BW2P∆pgi (blue line) and BW2P (orange line), shown as log(optical density (OD600nm)) over time (hours), from 3 replicates. L-arabinose was added to the culture at 3 h. The arrows indicate the times at which the 2 mL samples were harvested from the culture (3 and 5 hours). (B) Specific plasmid DNA yield (µg.L⁻¹.OD⁻¹) of BW2P∆pgi (blue) and BW2P (orange) before the induction with L-arabinose and after 2 h of induction (from 3 replicates). The specific plasmid yield (µg.L⁻¹.OD⁻¹) was determined through the equation $\text{specific yield} = \dfrac{[\text{pDNA}]\ (\text{ng}\cdot\mu\text{L}^{-1}) \times V_{\text{purified plasmid}}\ (\mu\text{L}) \times 10^{-3}}{\text{OD}_{600\,\text{nm}} \times V_{\text{culture sample}}\ (\text{L})}$, where V_purified plasmid is 50 µL and V_culture sample is 2 mL. The SEM was calculated and is represented by the error bars.


List of Tables

Table 1- Characteristics of the PCR primers used for the amplification of the 2 fragments that constitute the pKDsgRNA-pgi plasmid from pKDsgRNA-ack, including their sequence (5’-->3’), melting temperature (Tm) in ºC, GC content (%) and number of nucleotides.

Table 2- PCR reaction setup (left table) and cycling conditions (right table) for the amplification of the 2 fragments of pKDsgRNA-ack containing the sgRNA-pgi. For fragment 1, the forward and reverse primers are pgi-pfrag1fwd and pgi-pfrag1rev, respectively, and for fragment 2, pgi-pfrag2fwd and pgi-pfrag2rev, respectively. V- volume.

Table 3- CPEC reaction setup (left table) and cycling conditions (right table) for the amplification of pKDsgRNA-pgi by joining fragments 1 and 2, which were constructed by PCR. The overlapping regions between these 2 fragments serve as primers for the DNA polymerase. V- volume.

Table 4- Characteristics of the PCR primers used for the amplification of the pgi kanamycin resistance cassette, including their sequence (5’-->3’) (upper case letters- homology sites, lower case letters- priming sites), melting temperature (Tm) in ºC, GC content (%) and number of nucleotides.


of the pgi kanamycin resistance cassette. V- volume.

Table 6- Characteristics of the PCR primers used for the amplification of a region that contains the pgi or the recA gene, including their sequence (5’-->3’), melting temperature (Tm) in ºC, GC content (%) and number of nucleotides.


of a region containing the pgi or the recA gene. (*) When cells were used as template, the 1st step lasted 7 minutes to also allow the cells to lyse and release the gDNA into the medium, whereas when genomic DNA (gDNA) was used directly as template, the 1st step lasted only 3 min. (**) Annealing temperature: 47 ºC when the primers pgi_conf_F and pgi_conf_R were used; 53 ºC when the primers pgi_fwr1 and pgi_rev1, pgi_fwr2 and pgi_rev2, or recA_check_F and recA_check_R were used. V- volume.

Table 8- Summary of the attempts made to electrotransform MG1655/pCas9cr4/pKDsgRNA-pgi and DH5α/pCas9cr4/pKDsgRNA-pgi with the oligonucleotide designed to delete the pgi gene. The alterations made to the preparation of the electrocompetent cells, the electroporation and the recovery steps are indicated, as well as the concentration of aTc used in the plates and the number of colonies tested. (*) 100 ng.mL⁻¹ aTc and 50 mM L-arabinose were added. (**) The plates also contained LB + 34 µg.mL⁻¹ Cm + 200 µg.mL⁻¹ Spec.

Table 9- Summary of the specific (µg.L⁻¹.OD⁻¹) and volumetric (mg.L⁻¹) plasmid yields obtained in the samples from BW2P∆pgi and BW2P, when non-induced (3 h of growth) and 2 h after being induced with L-arabinose (5 h of growth). Average values ± SEM are shown.


List of Abbreviations

ADA - Adenosine deaminase

Amp - Ampicillin

ARPs - Antibiotic resistance proteins

aTc- anhydrotetracycline

ATP - Adenosine triphosphate

bp - base pair

Cas9 - CRISPR associated protein 9

Cm - Chloramphenicol

CMV - Cytomegalovirus

CRISPR - Clustered Regularly Interspaced Short Palindromic Repeats

crRNA - CRISPR RNA

DSB - Double strand break

DCW - Dry cell weight

DNA - Deoxyribonucleic acid

dsDNA - double stranded DNA

ED - Entner-Doudoroff

FADH - Flavin adenine dinucleotide

FDA - Food and Drug Administration

FLP - Flippase recombinase

FRT - FLP recognition target

gDNA - genomic DNA

GFP - Green fluorescent protein

GOI - Gene of interest

HPLC - High performance liquid chromatography

Kan - Kanamycin

LB - Luria Bertani

lin - linear

LPL - Lipoprotein lipase

LPS - Lipopolysaccharides

MC - Minicircle

miRNA- microRNA

MP - Miniplasmid

mRNA- messenger RNA

MRS - Multimer resolution site

NADH - Nicotinamide Adenine Dinucleotide

NADPH - Nicotinamide Adenine Dinucleotide Phosphate

NHEJ - non-homologous end joining mechanism

no-SCAR - Scarless Cas9 Assisted Recombineering

oc - Open circular

OD - Optical density

Ori - origin of replication

ORT - Operator-repressor titration

PAM - protospacer adjacent motif

PCR - Polymerase chain reaction

pDNA - plasmid DNA

PP - Parental plasmid

ppGpp - guanosine tetraphosphate

PPP - Pentose Phosphate Pathway

PTS - Phosphotransferase System

RBS - Ribosome binding site

Re - Recombination efficiency

RNA - Ribonucleic acid

ROS - Reactive oxygen species

rRNA - ribosomal RNA

sc - supercoiled

sgRNA - single guide RNA

siRNA- small interfering RNA


Spec - Spectinomycin

ssDNA - single stranded DNA

TALEN - Transcription activator like effector nucleases

Tc - Tetracycline

TCA - Tricarboxylic acid cycle

tracrRNA - trans-activating CRISPR RNA

tRNA - transfer RNA

VEGF - vascular endothelial growth factor

ZFN - Zinc finger nucleases


1. Introduction

1.1. Gene therapy

Genes are the units of heredity. They encode proteins or RNA molecules with functional activity, which give organisms their characteristics. However, when genes carry a mutation in their sequence or are not correctly regulated, they may not function properly, which in some cases results in a genetic disease. One approach to treating such diseases is gene therapy, in which the affected gene is corrected or modulated to recover its normal function. This can be achieved by inserting the correct gene at a random location in the genome to replace the defective one, by replacing the defective gene with the correct one through homologous recombination, by regulating its expression (turning it on or off), or by repairing the mutation through reverse mutation (Misra 2013).

This is done with the help of supplied nucleic acids: DNA, messenger RNA (mRNA), small interfering RNA (siRNA), microRNA (miRNA) or antisense oligonucleotides (Yin et al. 2014). However, after their production and purification, they have to be delivered to the patients and reach the target cells, where they will exert their function. To do so, they have to cross several extracellular and intracellular barriers (e.g. nucleases, blood clearance, diffusion through the plasma membrane and nuclear envelope, lysosomal vesicles) (Gaspar et al. 2015). To facilitate their delivery, they are incorporated in a carrier/delivery vector, whose design is of extreme importance to ensure passage through all these barriers (Misra 2013).

1.1.1. Carrier vectors (viral and non-viral)

Carrier vectors are classified as viral (modified) or non-viral. Examples of viral vectors include retroviruses, lentiviruses, adenoviruses and adeno-associated viruses, while non-viral vectors comprise polymer-, peptide- and lipid-based vectors and naked plasmids (Yin et al. 2014). Figure 1 shows the distribution of the vectors used in clinical trials.


Figure 1- Vectors used in gene therapy clinical trials (from

wiley.com//legacy/wileychi/genmed/clinical/).

Viral vectors exhibit a higher delivery efficiency than non-viral vectors; however, they are less safe and more difficult to produce, which makes non-viral vectors a better alternative (Yin et al. 2014), depending on the objective.

The most used non-viral vector, plasmid DNA (pDNA), is normally produced at large scale using Escherichia coli as the host cell. It first needs to be extracted from the host and purified to eliminate contaminants such as RNA, genomic DNA, proteins and other cellular debris (Prazeres & Monteiro 2014), so that it fulfils the requirements for administration to patients without causing harm.

1.1.2. Clinical trials

In 1990, a girl with adenosine deaminase (ADA) deficiency, which compromises the immune system, was the first person to receive a gene therapy product. Her white blood cells were collected and the functional gene encoding ADA was inserted into them. The girl was then infused with her own cells (https://history.nih.gov/exhibits/genetics/sect4.htm) (https://www.news-medical.net/health/Gene-Therapy-History.aspx).

Until now, most of the gene therapy clinical trials conducted have been related to cancer diseases (about 64.6%), followed by monogenic diseases (10.5%) and infectious diseases (7.4%) (Figure 2). Some examples include Alzheimer’s disease, cystic fibrosis, haemophilia, HIV, Huntington’s disease, blindness diseases and others.


Figure 2- Indications addressed by gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).

Since the safety and efficacy of gene therapy are still being evaluated, it is currently being used to treat diseases for which no other treatments are available (https://ghr.nlm.nih.gov/primer/therapy/genetherapy). A large percentage of gene therapy clinical trials are still in the initial stages: about 95% are in phase I, I/II or II (Figure 3), and only 5% are in phase II/III, III or IV (Hanna et al. 2017). Some of the gene therapies approved so far include Gendicine to treat head and neck squamous cell carcinoma (Xiao-jie et al. 2015), Glybera to restore lipoprotein lipase (LPL) activity (http://www.ema.europa.eu/ema/index.jsp?curl=pages/medicines/human/medicines/002145/human_med_001480.jsp&mid=WC0b01ac058001d124), Imlygic to treat melanoma (Fukuhara et al. 2016), and the cell gene therapy Strimvelis to treat adenosine deaminase deficiency (Hanna et al. 2017).

Figure 3- Phases of gene therapy clinical trials (from wiley.com//legacy/wileychi/genmed/clinical/).

1.2. Minicircles

Plasmids for therapeutic applications are usually first produced in high amounts in host bacteria and, after purification, delivered to the patients, where they exert their therapeutic effect. For this reason, these plasmid vectors contain bacterial sequences required for their production in the host, such as a bacterial origin of replication and, typically, an antibiotic resistance gene (to guarantee that only the cells carrying the plasmid survive). They also contain eukaryotic sequences, comprising the gene of interest (GOI) that exerts the therapeutic function and its regulatory sequences (e.g. promoters, polyadenylation sequences), which control the expression of the GOI in the eukaryotic target cells (Prazeres & Monteiro 2014).

The prokaryotic sequences are associated with a decrease in the stability, uptake and efficacy of the plasmid vector. Some of them, like the CpG motifs, can also trigger immune responses in patients (Chen et al. 2008). Since those sequences are only needed during bacterial growth and not for the therapeutic action, some researchers have focused on producing plasmids from which these sequences can be removed afterwards, when they are no longer necessary. Removing these prokaryotic sequences is also advantageous because it prevents the integration of the antibiotic resistance genes into the genome of the eukaryotic cells or of the human microbiota (Gaspar et al. 2015), which would cause resistance to some antibiotics.

The plasmids from which these sequences can be removed are called parental plasmids (PP). They contain both prokaryotic and eukaryotic sequences, as well as two recombinase target sites (multimer resolution sites, MRS), on which specific recombinases act to originate a miniplasmid (MP), which contains the bacterial sequences, and a minicircle (MC), which contains the eukaryotic sequences (Prazeres & Monteiro 2014) (Figure 4).

Figure 4- Schematic illustration of the recombination of a parental plasmid into a minicircle and a miniplasmid. ORI- bacterial origin of replication; GOI- gene of interest; MRS- multimer resolution site (from doi:10.1128/microbiolspec.PLAS-0022-2014.f3).
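The resolution of a parental plasmid into a minicircle and a miniplasmid can be illustrated with a toy model. The Python sketch below is not taken from the cited work: it treats the parental plasmid as a circular string, assumes that the two MRS copies are directly repeated, and uses placeholder labels (`[MRS]`, `GOI`, `ORI`, `AbR`) instead of real sequences. The function name `resolve_parental_plasmid` is hypothetical.

```python
from typing import Tuple


def resolve_parental_plasmid(pp: str, site: str) -> Tuple[str, str]:
    """Split a circular molecule at two directly repeated sites.

    Toy model (an assumption, not the authors' method): the circular
    parental plasmid is written as a linear string whose end joins its
    start. Each product circle keeps exactly one copy of the site.
    """
    first = pp.find(site)
    second = pp.find(site, first + len(site)) if first != -1 else -1
    if second == -1:
        raise ValueError("the parental plasmid needs two copies of the site")
    # First site plus everything up to (not including) the second site.
    circle_a = pp[first:second]
    # Second site plus the remainder, wrapping around the end of the string.
    circle_b = pp[second:] + pp[:first]
    return circle_a, circle_b


# Hypothetical parental plasmid built from placeholder labels, not real DNA.
MRS = "[MRS]"
parental = MRS + "GOI" + MRS + "ORI" + "AbR"

circle_a, circle_b = resolve_parental_plasmid(parental, MRS)
# The circle carrying the gene of interest is the minicircle.
minicircle = circle_a if "GOI" in circle_a else circle_b
miniplasmid = circle_b if minicircle is circle_a else circle_a

print(minicircle)   # [MRS]GOI
print(miniplasmid)  # [MRS]ORIAbR
```

In this model the two products together have the same length as the parental plasmid, and the origin of replication and the resistance marker end up on the miniplasmid only.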

The minicircle lacks the prokaryotic sequences and contains the GOI; it is the molecule used in therapeutic applications. After production, minicircles need to be extracted from the bacteria, which requires lysing the bacterial cells. They then need to be purified, that is, separated from the impurities present, such as other nucleic acids and proteins, the MP and the remaining PP. This is difficult to achieve because of the similarity of their physico-chemical properties (like size, charge, isoform and molecular composition) (Gaspar et al. 2015). The recombination efficiency should therefore be high to ensure a maximum MC/PP ratio, avoiding residual PP in the final formulation. To facilitate the purification procedure, some researchers developed a strain capable of degrading the MP and the residual PP through the action of the I-SceI endonuclease, whose restriction site is located on the prokaryotic backbone (Kay et al. 2010).

Since these prokaryotic sequences constitute ~50% of the parental plasmid, minicircles have a reduced size. This makes them easier to transfect and leads to higher expression levels of the therapeutic gene, due to their increased diffusion across the cytoplasm compared to the PP (Lukacs et al. 2000) (Yin et al. 2014) (Gaspar et al. 2015).

Besides minicircles, there are other plasmid alternatives that lack the antibiotic resistance gene (based on e.g. operator-repressor titration (ORT) technology (pORT), or the mini-intronic plasmid enclosing the RNA/OUT antisense RNA), in which selection relies on other parameters. However, they still contain the bacterial origin of replication, which contributes to an increased size (Gaspar et al. 2015) and which the MC does not possess.

1.2.1. Synthesis of minicircles

To produce MC from PP with high productivity, the PP is generally first amplified in E. coli, which is used as the host cell due to its advantageous characteristics (discussed further below). The cells then need to be induced to express the enzymes responsible for the recombination process, which act on the specific recombination sites (MRS) present on the PP, giving rise to the MP and the MC (Gaspar et al. 2015) (Figure 4). The recombinases currently used are divided into 2 main families according to the amino acid residue that acts as the catalytic nucleophile: serine and tyrosine (Colloms et al. 1997). The inducers normally used are heat and L-arabinose. The induction response to the latter seems to depend on the concentration used, due to the all-or-none response (Khlebnikov et al. 2001), as well as on the stage of cellular growth at which it is added to the medium. Higher MC yields were observed when the induction occurred at the end of the exponential phase, since at this stage the concentration of healthy cells is higher, as is the concentration of PP produced that can be used to originate MC (Gaspar et al. 2014) (https://bitesizebio.com/13514/).

Currently, the use of minicircles seems to be a safer approach in therapeutic applications than the use of plasmids. However, the quantities of MC obtained at laboratory scale (milligrams) need to be increased to an industrial scale (grams), while maintaining stability and quality. The “metabolic burden” imposed on the host bacteria (a term explained further below) also represents a drawback. In addition, the purification process needs to be improved to ensure that the final MC product is devoid of MP, PP and bacterial debris. For these reasons, research is being done to overcome these problems by altering the growth/induction conditions and the genotype of the host (Gaspar et al. 2015).

1.3. Metabolism

As mentioned previously, E. coli, a gram-negative bacterium, is normally the host microorganism of choice, since it grows fast in minimal medium and rapidly produces high plasmid yields at a lower cost than other microorganisms (Silva et al. 2012). Despite this, E. coli produces lipopolysaccharides (LPS), endotoxins that trigger immune responses and are released when the cells are lysed to extract the pDNA. It also shows some genetic instability due to the presence of mobile DNA elements, which are capable of changing their position in the genome (like transposases, integrases, site-specific recombinases, insertion sequence (IS) elements, defective phages), compromising the production of the plasmid and/or the therapeutic product, and hence its safety (Bower et al. 2012) (Pósfai et al. 2016).

The E. coli strain selected is also fundamental, because its genotype may alter plasmid production. Since the goal is to produce large amounts of pDNA while maintaining its biological activity, several strains such as DH5, DH5α, BL21, DH10B and others have been used for this production. However, most of them were constructed for a different purpose, like the expression of recombinant proteins and/or the cloning of heterologous genes, and so they may not provide the best conditions to produce plasmid DNA (Gonçalves et al. 2013).

After the host strain is selected, the cells are transformed with the desired plasmid. To replicate and maintain it, the cells need to divert a certain amount of resources (like metabolic energy, amino acids, nucleotides, etc.) to the metabolic processes related to its production. Cells with pDNA (P+) have an increased demand for nucleotides for the replication of the plasmid, in addition to those necessary for the host genomic DNA (gDNA). There is also an additional demand for amino acids to synthesize the proteins encoded by the plasmid, such as the ones that ensure the cells’ survival in a selective medium containing antibiotics or lacking an essential nutrient, or the ones that are themselves the therapeutic product. Besides, those reactions expend a large amount of energy: e.g. the addition of each amino acid to a peptide chain spends 2 GTP molecules, and the covalent binding of each amino acid to its corresponding tRNA to form an aminoacyl-tRNA expends one molecule of ATP (Silva et al. 2012) (Glick 1995) (Nelson et al. 2008). This drain of resources from the cells was termed “metabolic burden” by Glick in 1995. It tends to affect normal cell functioning, and so P+ cells tend to exhibit a longer period of adaptation to the medium (lag phase), followed by slower exponential growth, compared with cells without plasmids (P-) (Ow et al. 2006). This causes stress in the cells, since their metabolism and growth are affected, compromising plasmid production (Silva et al. 2012).
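To give a feeling for the energy figures quoted above, the following Python sketch adds them up for a plasmid-encoded protein. It is a rough illustration only: it counts every residue once for tRNA charging and once for chain elongation, and ignores initiation, termination and the synthesis of the amino acids themselves. The protein length, the copy number and the function name `translation_cost` are hypothetical.

```python
ATP_PER_TRNA_CHARGING = 1  # ATP per aminoacyl-tRNA formed (figure quoted in the text)
GTP_PER_RESIDUE_ADDED = 2  # GTP per amino acid added to the chain (figure quoted in the text)


def translation_cost(n_residues: int, n_copies: int = 1) -> dict:
    """Estimate the nucleoside triphosphates spent to translate a protein.

    Simplification (an assumption): every residue is counted once for
    tRNA charging and once for elongation; initiation, termination and
    amino acid biosynthesis are ignored.
    """
    if n_residues < 1 or n_copies < 1:
        raise ValueError("n_residues and n_copies must be positive")
    atp = ATP_PER_TRNA_CHARGING * n_residues * n_copies
    gtp = GTP_PER_RESIDUE_ADDED * n_residues * n_copies
    return {"ATP": atp, "GTP": gtp, "total": atp + gtp}


# Hypothetical plasmid-encoded protein: 300 residues, 1000 copies per cell.
print(translation_cost(300, 1000))
# {'ATP': 300000, 'GTP': 600000, 'total': 900000}
```

Under these assumptions the cost scales linearly with both the protein length and the number of copies made, which is one way to picture why the expression of plasmid-encoded proteins adds to the metabolic burden.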

With the purpose of overcoming this metabolic burden, the plasmid instability and the stress responses in P+ cells, and of increasing the plasmid yields, several studies have been conducted. Different fermentation strategies have been tested, like changing the environmental conditions (the concentration of oxygen, medium components like the carbon and nitrogen sources, and the temperature) and using different feeding modes: batch and fed-batch (Silva et al. 2012) (Yau et al. 2008). New E. coli strains have also been designed by changing their genotype, in order to alter their metabolism and improve plasmid production and stability (Bower et al. 2012). In the following sections, some of the strategies studied to alter the cellular metabolism are discussed, as well as their effect on plasmid production.

1.3.1. Glycolysis

Glycolysis is the main pathway through which glucose is metabolized to yield pyruvate. Some energy is produced in the form of ATP, as well as reducing agents (NADH) and intermediates for biosynthetic reactions (Ow et al. 2006) (Nelson et al. 2008) (Li et al. 2015). Some researchers have compared the gene/protein expression profiles of this pathway in P+ and P- cells. However, the results obtained were not concordant, which may be due to the different fermentation modes employed and the composition of the media used in each study, which probably influence the rate of glucose consumption (Silva et al. 2012). For instance, Ow et al. (2006) described that most of the glycolytic genes were downregulated (except gapA, which codes for glyceraldehyde-phosphate dehydrogenase), while Wang et al. (2006) demonstrated the opposite, with the activities of most glycolytic enzymes being increased, as also observed by Birnbaum & Bailey (1991). Based on their own previous study, Ow et al. (2009) chose to knock out fruR (“fructose repressor”) in DH5α. This gene encodes a transcriptional regulator that stimulates the expression of enzymes involved in the gluconeogenic and tricarboxylic acid cycle (TCA) pathways, while repressing genes involved in the cleavage of sugars, including the Entner-Doudoroff (ED), pentose phosphate (PPP) and glycolysis pathways. The authors employed fed-batch cultures with exponential feeding and reported an increase in the specific plasmid yield in the mutant strain, which produced about 21% more than the wild type (19.2 mg.g DCW⁻¹ of pDNA versus 15.9 mg.g DCW⁻¹).
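The percentage quoted for the fruR mutant can be checked directly from the two specific yields. The short Python sketch below does this; the helper name `relative_increase` is only illustrative.

```python
def relative_increase(value: float, reference: float) -> float:
    """Return the percentage increase of value over reference."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (value - reference) / reference * 100.0


# Specific plasmid yields quoted above (mg of pDNA per g of DCW).
frur_mutant = 19.2
wild_type = 15.9

print(f"{relative_increase(frur_mutant, wild_type):.1f}%")  # 20.8%
```

The result, 20.8%, agrees with the "about 21%" reported for the mutant strain.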


1.3.2. Pentose phosphate pathway (PPP)

The PPP is one of the pathways through which 6-P-glucose is metabolized. It produces reducing power (NADPH) and the metabolite 5-P-ribose. NADPH is important for the reductive biosynthesis of lipids and nucleic acids and for protecting the cell from oxidative stress. 5-P-ribose is a precursor in the biosynthesis of nucleotides, which are the “building blocks” of DNA, RNA and coenzymes like ATP, NADH, FADH2 and coenzyme A (Nelson et al. 2008). In Ow et al. (2006), the flux through this pathway remained constant when comparing P+ with P- cells, while Wang et al. (2006) described a downregulation of the zwf, gnd and rpi genes, which encode the enzymes that catalyse the reactions represented in Figure 5.

Figure 5- Schematic representation of the glycolysis and pentose phosphate pathway (adapted from:

http://schoolbag.info/chemistry/mcat_biochemistry/59.html)

Since the metabolic flux through the PPP does not seem to be enough to support the extra demand for nucleotides imposed by the presence of the plasmid, some strategies rely on directing the flux into the PPP by overexpressing zwf, gnd or rpi, or even by deleting pgi (Silva et al. 2012). Flores et al. (2004) observed an increase in the growth rate of JM101 when transformed with a plasmid carrying the zwf gene, from 0.46 h⁻¹ (uninduced) to 0.64 h⁻¹ (induced with IPTG to express zwf). Wang et al. (2006) overexpressed the rpiA gene and observed increases of around 3.3- and 7-fold in pDNA copy number when using ColE1-derived plasmids with 2 and 1 origins of replication, respectively, in the BL21 strain. Gonçalves et al. (2013) constructed a strain, GALG20 (MG1655∆endA∆recA∆pgi), that has a growth rate similar to that of the parental strain and produces 25-fold more pDNA using 20 g.L⁻¹ of glucose. With the deletion of pgi, glycolysis continues, since the two end products of the PPP (3-P-glyceraldehyde and 6-P-fructose) are further used in glycolysis. However, this modification can create a redox imbalance inside the cell due to the excess of NADPH produced, which may induce stress (Charusanti et al. 2010). Gonçalves et al. (2013) also tested the effect of the pgi mutation with glycerol as the carbon source; however, the plasmid DNA yield obtained was approximately the same for both strains (GALG20 and MG1655∆endA∆recA), since glycerol is metabolized through a route different from that of glucose. Thus, the pgi mutation was only beneficial when glucose was used.
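The origin of the NADPH excess can be made explicit with the textbook stoichiometry of the PPP. The balance below is a simplified sketch, not a result from the cited studies: it assumes that, without pgi, all 6-P-glucose (G6P) enters the oxidative branch, and it ignores the Entner-Doudoroff pathway and the withdrawal of 5-P-ribose for nucleotide synthesis. Ru5P stands for 5-P-ribulose, F6P for 6-P-fructose and GAP for 3-P-glyceraldehyde.

```latex
\begin{align*}
\text{Oxidative branch:}\quad
  & \text{G6P} + 2\,\text{NADP}^{+} + \text{H}_2\text{O}
    \rightarrow \text{Ru5P} + 2\,\text{NADPH} + 2\,\text{H}^{+} + \text{CO}_2 \\
\text{Non-oxidative branch:}\quad
  & 3\,\text{Ru5P} \rightleftharpoons 2\,\text{F6P} + \text{GAP} \\
\text{Net, per 3 G6P:}\quad
  & 3\,\text{G6P} + 6\,\text{NADP}^{+} + 3\,\text{H}_2\text{O}
    \rightarrow 2\,\text{F6P} + \text{GAP} + 6\,\text{NADPH} + 6\,\text{H}^{+} + 3\,\text{CO}_2
\end{align*}
```

Under these assumptions each glucose routed through the PPP yields 2 NADPH before its carbon re-enters glycolysis as 6-P-fructose and 3-P-glyceraldehyde, whereas the direct glycolytic route through pgi yields none.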

1.3.3. Biosynthetic pathways

The pathways mentioned above produce intermediates for the synthesis of amino acids, which are needed to produce proteins, and of nucleotides, the building blocks of nucleic acids. A high demand for amino acids is therefore likely in P+ cells, due to the expression of antibiotic resistance genes or of recombinant proteins (depending on the plasmid). However, Ow et al. (2006) showed that many genes involved in amino acid synthesis were downregulated, such as leuA, proB, asnB and ansA, aspC, hisC, cysK and cysM, pheA, tyrB, asd, and metA and metB, which code for enzymes involved in the metabolism of leucine, proline, asparagine, aspartate, histidine, cysteine, phenylalanine, tyrosine, lysine and methionine, respectively (https://ecocyc.org/). Some genes involved in purine (purU, gmk, guaC) and pyrimidine (pyrG, dut, deoA) synthesis were also found to be downregulated. One hypothesis for these downregulations is the slower growth rate of P+ cells compared with P- cells: P- cells show a higher rate of gDNA replication and therefore a higher demand for nucleotides than P+ cells (Ow et al. 2006).

So, a possible strategy to increase plasmid production could be to overexpress some of the genes involved in amino acid and nucleotide synthesis that exhibited relatively low expression.

1.3.4. The by-product acetate

To produce plasmids in large amounts, fermentations can be carried out at high cell concentrations, which are achieved through the consumption of large amounts of substrate (e.g. glucose).

However, this can lead to overflow metabolism, characterized by an increased flux through glycolysis that produces acetyl-CoA in excess; since the TCA cycle is unable to completely oxidize it at the rate at which it is formed, it is converted into acetate (Wolfe 2005) (Borja et al. 2012) (Lara et al. 2008). The acetate produced affects cellular growth and represents a waste of the carbon source (Gonçalves et al. 2013). One strategy to overcome this is to use glycerol as the carbon source (Carnes et al. 2006). Fed-batch cultures can also be used instead of batch cultures to decrease the rate at which glucose is consumed, but this was shown to affect the growth rate (Lara et al. 2008). To prevent this, Borja et al. (2012) used a strain (VH33) in which the main transporter responsible for glucose uptake (PTS) was deleted, so that glucose enters the cell at a slower rate through the galactose permease. The VH33 strain produced more than twice as much pDNA (2.78 mg.L⁻¹) as its parental strain W3110 (1.14 mg.L⁻¹), but still less than the DH5α strain (12.73 mg.L⁻¹). VH33 produced only 0.14 g.L⁻¹ of acetate, compared with the 0.32 g.L⁻¹ and 0.62 g.L⁻¹ produced by the parental and DH5α strains, respectively. Despite the low pDNA production of VH33 compared to DH5α, the deletion of the PTS seems to be beneficial and could be tested in DH5α.

Gonçalves et al. (2013) deleted pykA and pykF in the JM101, MG1655 and DH5α strains. These deletions had already been performed in JM101 by Cunningham et al. (2009). Both genes encode isoenzymes of pyruvate kinase, which catalyses the formation of pyruvate and ATP (https://ecocyc.org/). Gonçalves et al. (2013) reported an increase in pDNA production in JM101ΔpykFΔpykA (5.3 or 2.6 mg.g DCW⁻¹) compared with JM101 (2.5 or 1.3 mg.g DCW⁻¹), grown with 5 or 20 g.L⁻¹ of glucose. With GALG1011 (MG1655ΔendAΔrecAΔpykFΔpykA) (6.6 or 1.6 mg.g DCW⁻¹), the yield was also higher than with its parental strain, MG1655ΔendAΔrecA (3.6 or 0.8 mg.g DCW⁻¹). For DH5αΔpykFΔpykA, in contrast, the production decreased (0.3 or 0.4 mg.g DCW⁻¹) compared with DH5α (1.8 or 0.8 mg.g DCW⁻¹), when grown with 5+10 g.L⁻¹ (2 pulses) or 20 g.L⁻¹ of glucose, respectively. In the same study, the effect of a single mutation in either pykF or pykA was also tested (in JM101 only), and better pDNA yields were obtained than with the double deletion. This may be explained by the small amount of acetate produced due to the reduced glycolytic flux when the activity of pyruvate kinase is diminished rather than eliminated, which allows the continued production of ATP and an increased flux through the PPP. This study also highlights the influence of the feeding strategy, since better plasmid yields were obtained when glucose was supplied as 5+10 g.L⁻¹ (2 pulses) instead of 20 g.L⁻¹, and shows that the effect of each modification on plasmid production depends on the host genotype.

1.3.5. Stress Responses

Stress responses may be triggered by changes in the composition of the growth medium, like the depletion of nutrients such as the “building blocks” (amino acids), which may result from the increased expression of antibiotic resistance proteins (ARPs) encoded by the plasmid. This may induce the heat shock response, in which heat shock proteins (e.g. chaperones, proteases) are expressed. These may degrade the ARPs or cause them to form aggregates or inclusion bodies, in order to restrict the amount of amino acids spent (Glick 1995). In fact, some researchers observed an increased expression of these proteins, like IbpA, IbpB, DnaK, HtgA and HtrE (Ow et al. 2006) (Birnbaum & Bailey 1991) (Haddadin et al. 2005). Amino acid starvation also induces the stringent response (Wegrzyn 1999), which is mainly regulated by guanosine tetraphosphate (ppGpp), a signalling molecule produced by RelA and capable of binding to the RNA polymerase (Nelson et al. 2008) (Wegrzyn 1999). The increase in ppGpp levels leads to a reduction in rRNA, tRNA and pDNA synthesis, while it activates the expression of genes associated with amino acid synthesis and even of proteases, to overcome the amino acid starvation (Silva et al. 2012).

Some studies reviewed by Wegrzyn (1999) evaluated the replication of different replicons
in amino acid starved cells. In most of those studies, plasmid replication was found to be
inhibited when cells presented a stringent response (wild-type cells), but was significant when
cells exhibited a relaxed response (relA mutants).

1.3.6. Plasmid Stability

Plasmid (in)stability can be divided into three main types: structural, isoform and
segregational. It is affected by the metabolic burden, the plasmid copy number, the substrate
used for growth, the medium composition and the culture conditions.

1.3.6.1. Structural stability

Structural instability refers to modifications of the plasmid sequence, such as point
mutations, deletions, insertions and even genomic rearrangements, which may affect the final
therapeutic product and its effectiveness (Silva et al. 2012).

This instability may arise from the environmental conditions, the presence of mutagenic
agents or reactive oxygen species (ROS), errors introduced during replication and improper
repair, the existence of mobile genetic elements and even inverted repeats or secondary
structures, which constitute a target for DNA repair systems (Pósfai et al. 2016) (Silva et al.
2012) (Friehs 2004). To overcome this problem, strains with a mutated recA gene have been
constructed to prevent homologous recombination (Silva et al. 2012). Mutants in the uvrC and
umuC genes, which encode proteins involved in the repair of DNA damage by UV light and in
the SOS pathway, respectively, have also been constructed. One example is the SURE strain,
which possesses these latter mutations: when harbouring a plasmid containing inverted repeats,
only 0.8-1.2 % of SURE cells had rearranged it, compared to 15-20 % of DH5α cells. This
demonstrates the benefit of those mutations for the maintenance of plasmid stability (Alan L.
Greener & Del Mar 1996).

1.3.6.2. Isoform stability

Plasmids typically assume one of these three isoforms: supercoiled (sc), open circular
(oc) and linear (lin). The first one is a highly compact and condensed structure. However, if
plasmids are exposed to harsh chemical reagents, physical shear or nucleases, “nicks”/breaks
can be introduced in one or both strands of the plasmid, unfolding it and giving rise to the
relaxed/open circular or linear isoforms, respectively (Loftus, Bernard T. 1984).

The FDA recommends that more than 80% of the plasmid be in the sc isoform for DNA
vaccine products, since the sc content is a measure of plasmid quality: when the plasmid is
damaged, the sc isoform converts into one of the other isoforms (http://www.fda.gov/).
Evidence also shows that the supercoiled isoform leads to higher expression of an exogenous
gene compared to the others (Silva et al. 2012). It is therefore important to ensure that plasmid
isoform stability is maintained throughout the whole process of production and purification.

The two enzymes most involved in the control of supercoiling in E. coli are
topoisomerase I, which removes supercoils, and DNA gyrase (topoisomerase II), which is
responsible for the introduction of negative supercoils (Hassan et al. 2016). Overexpressing
DNA gyrase could represent one strategy. However, a study revealed that the supercoiling
level depends only weakly on the expression of DNA gyrase: the authors obtained a 1.3%
change in the level of supercoiling for a 10% change in the expression of DNA gyrase, and the
same held for topoisomerase I (Snoep et al. 2002). A mutation in endA inactivates
endonuclease I, a non-specific endonuclease located in the periplasm that is released when
the cells are lysed, and so its inactivation prevents the introduction of nicks in the plasmid
(Carnes et al. 2006).
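
The weak dependence of supercoiling on gyrase expression can be put into a single number. The sketch below is illustrative only: it treats the ratio of the two percentage changes quoted from Snoep et al. (2002) as a simple response coefficient. The function name is hypothetical and not taken from that study.

```python
def response_coefficient(pct_change_output: float, pct_change_input: float) -> float:
    """Ratio of the relative change in an output to the relative change in an input.

    A value well below 1 means the output responds only weakly to the input.
    """
    if pct_change_input == 0:
        raise ValueError("the input change must be non-zero")
    return pct_change_output / pct_change_input


# Values quoted in the text: 1.3% change in supercoiling for a 10% change
# in DNA gyrase expression.
coefficient = response_coefficient(1.3, 10.0)
print(f"{coefficient:.2f}")  # prints 0.13
```

A coefficient of about 0.13 is consistent with the statement that overexpressing DNA gyrase alone is expected to have a limited effect on supercoiling.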

1.3.6.3. Segregational stability

Segregational instability is characterized by the inefficient distribution of plasmids
when cells divide. It may result from the metabolic burden (Silva et al. 2012), or from the
accumulation of plasmids inside the cells, which can originate plasmid multimers and further
lead to plasmid loss (Silva et al. 2012) (Yau et al. 2008). To prevent the presence of cells
without the plasmid, plasmids normally carry an antibiotic resistance cassette and an antibiotic
is added to the medium, so that only the cells harbouring the plasmid survive. Another
alternative is to use auxotrophic host strains, which are transformed with a plasmid that carries
a gene complementing the auxotrophy (Friehs 2004). One example is the E. coli JM83 strain,
which is a proline auxotroph and can be complemented with a plasmid carrying the proBA
genes (Fiedler & Skerra 2001). Other selection alternatives are mentioned in section 1.2.

1.4. Growth conditions

1.4.1. Medium composition

As mentioned before, the substrate sources affect plasmid production. Zheng et al.
(2007) evaluated the effect of carbon and nitrogen sources on plasmid production using DH5α.
They tested the addition of different carbon sources to LB medium. When D-glucose or glycerol
was added, DH5α produced 7.73 mg.L-1 or 7.97 mg.L-1, respectively, less than the 14.63 mg.L-1
obtained on LB alone. An increase was observed when D-fructose, D-lactose, mannitol or
sucrose was added, with 14.67 mg.L-1, 21.93 mg.L-1, 27.67 mg.L-1 and 28.27 mg.L-1 produced,
respectively. Regarding the nitrogen source, they evaluated the substitution of tryptone in the
LB medium by (NH4)2SO4, soya peptone, proteose peptone and casein peptone. A higher plasmid
concentration was obtained with casein peptone (28.67 mg.L-1) than with tryptone
(14.63 mg.L-1).
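
To compare the carbon sources at a glance, the yields quoted above can be expressed as fold changes relative to unsupplemented LB. The following is a minimal sketch that uses only the values reported in the text from Zheng et al. (2007). The names `LB_BASELINE`, `CARBON_SOURCES`, `fold_change` and `rank_by_fold_change` are introduced here for illustration.

```python
# Plasmid concentrations (mg/L) quoted in the text for DH5α grown on LB
# supplemented with different carbon sources.
LB_BASELINE = 14.63  # LB alone

CARBON_SOURCES = {
    "D-glucose": 7.73,
    "glycerol": 7.97,
    "D-fructose": 14.67,
    "D-lactose": 21.93,
    "mannitol": 27.67,
    "sucrose": 28.27,
}


def fold_change(yield_mg_per_l: float, baseline: float = LB_BASELINE) -> float:
    """Yield relative to the baseline medium (1.0 means no change)."""
    return yield_mg_per_l / baseline


def rank_by_fold_change(yields: dict, baseline: float = LB_BASELINE) -> list:
    """Return (name, fold change) pairs sorted from best to worst."""
    pairs = [(name, fold_change(value, baseline)) for name, value in yields.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for name, fc in rank_by_fold_change(CARBON_SOURCES):
    print(f"{name:12s} {fc:.2f}x")
```

On these numbers, sucrose and mannitol roughly double the yield obtained on LB alone, whereas D-glucose and glycerol roughly halve it.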

1.4.2. Temperature

Temperature influences bacterial growth and metabolism. Silva et al. (2009) reported
an increase in the specific yield and purity of the plasmids, associated with a slower growth
rate of the cultures, at the higher temperatures of 37ºC and 40ºC compared with 30ºC and
32ºC. The increase in pDNA production leads to a higher metabolic burden, which imposes
stress on the cells and affects their growth.

1.5. Genetic engineering of bacteria

Directly modifying the genome of an organism, by either adding or deleting DNA
segments, is termed genetic engineering. It is done to change specific characteristics, and there
are several methods to perform it. One of them is homologous recombination, which can be
accomplished using the λ-Red recombinase system of bacteriophage λ (comprising the exo, bet
and gam genes). These genes encode the enzymes responsible for the recombination between
the supplied DNA fragment (which contains the desired mutation) and the target DNA (to be
modified) present in the organism’s genome. The exo gene encodes an exonuclease that binds
to linear DNA and degrades the 5’ ends, creating single stranded (ss) 3’ overhangs. The Beta
protein binds to these overhangs and promotes annealing with the target DNA, by
complementary base pairing through the homologous sequences. The Gam protein prevents the
degradation of the supplied linear double stranded DNA (dsDNA) fragment by the endogenous
RecBCD nuclease. Through this homologous recombination, a recombinant DNA carrying the
desired mutation is formed (Sharan et al. 2009) (Figure 6).

After the homologous recombination procedure in the desired microorganism, the
recombinants must be selected.

Figure 6- Schematic representation of the homologous recombination processed by the λ-Red
system. The Exo enzyme has 5’ to 3’ exonuclease activity and, after binding to the linear DNA,
creates single stranded 3’ overhangs. The Beta protein binds to these overhangs and stimulates
annealing with the target DNA by base pair homology, creating the recombinant DNA (from
Sharan et al. 2009).

1.5.1. Recombinants selection methodology

1.5.1.1. Antibiotic resistance cassette

One of the best-known selection procedures relies on the use of an antibiotic
resistance cassette, such as the one described by Datsenko & Wanner (2000). This method is
based on the replacement of a target gene by an antibiotic resistance cassette. The cassette is
constructed by PCR using two primers: the forward and reverse primers each contain a
homology arm (H1 and H2, respectively) matching the regions upstream and downstream of
the target gene. Each primer also has a sequence complementary to the antibiotic resistance
gene (P1 and P2 sites). After amplification, the cassette is used to transform cells that express
the λ-Red system, where recombination occurs through the flanking homologies. The cells are
then placed in a medium supplied with the antibiotic to which the cassette confers resistance.
After the selection of the transformants, the resistance gene of the cassette can be eliminated
using the Flippase recombinase (FLP) enzyme, which recombines short Flippase recognition
target (FRT) sites (Figure 7).

Figure 7- Schematic representation of the main steps in the Datsenko and Wanner genetic
engineering procedure. In a first step, an antibiotic resistance cassette is amplified by PCR
with homology arms (H1 and H2) matching the upstream and downstream sequences of the
target gene. The cassette is then introduced in the genome through homologous recombination
with the target gene, performed by the λ-Red system. In a last step, the FLP recombinase is
expressed to remove the cassette from the genome, leaving a “scar” with a FRT site (from
Datsenko & Wanner 2000).

The elimination of the antibiotic resistance cassette leaves a “scar” containing a FRT site in
the chromosome. This may represent a problem, since it can generate chromosomal instability
or even undesired genomic rearrangements through recombination between FRT sites located
in other places of the genome (Datsenko & Wanner 2000). This problem may be solved using
the no-SCAR (Scarless Cas9 Assisted Recombineering) method described by Reisch & Prather
(2015), which does not leave any scar in the genome. In this procedure, after the homologous
recombination through the λ-Red system, the selection of the recombinants is instead based on
the use of an engineered Cas9-gRNA complex (explained in the next section).
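
The primer construction used in the antibiotic resistance cassette method (homology arm followed by priming site) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published protocol: the function name, the default arm length, the toy genome and the `P1_SITE`/`P2_SITE` placeholder sequences are all hypothetical, and the real priming sequences depend on the template plasmid used.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_deletion_primers(genome, gene_start, gene_end, p1_site, p2_site, arm_len=40):
    """Build forward and reverse primers for replacing a gene with a cassette.

    genome     : top-strand sequence, written 5' to 3'
    gene_start : 0-based index of the first base of the target gene
    gene_end   : 0-based index one past the last base of the target gene
    p1_site    : priming sequence for the forward primer (assumed input)
    p2_site    : priming sequence for the reverse primer (assumed input)
    arm_len    : homology arm length in nt (assumed value)
    """
    if gene_start < arm_len or gene_end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    h1 = genome[gene_start - arm_len:gene_start]  # upstream homology arm
    h2 = genome[gene_end:gene_end + arm_len]      # downstream homology arm
    forward = h1 + p1_site
    # The reverse primer anneals to the top strand, so H2 is reverse-complemented.
    reverse = reverse_complement(h2) + p2_site
    return forward, reverse


# Toy example with placeholder sequences (NOT real priming sites).
P1_SITE = "GGGG"
P2_SITE = "CCCC"
toy_genome = "ACGTACGTAC" + "ATGAAATAG" + "GGCCTTAAGC"
fwd, rev = design_deletion_primers(toy_genome, 10, 19, P1_SITE, P2_SITE, arm_len=5)
print(fwd)
print(rev)
```

The cassette amplified with such primers carries H1 and H2 at its ends, which is what allows the λ-Red system to direct it to the target locus.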

1.5.1.2. Engineered Cas9-gRNA complex

The engineered Cas9-gRNA complex (Figure 8) is based on the CRISPR/Cas system
(Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated proteins), the
“immune mechanism” of prokaryotes (Rath et al. 2015). It is composed of the endonuclease
Cas9 and a guide RNA (gRNA). Cas9 has two lobes: one recognizes the target sequence to be
cleaved (with the help of the gRNA) and the other is the nuclease that cleaves it (Jinek et al.
2012). The gRNA results from the fusion of a CRISPR RNA (crRNA) and part of the
trans-activating CRISPR RNA (tracrRNA). The crRNA contains a sequence of ~20 nt
complementary to the target DNA, which guides Cas9 to the target by complementary base
pairing. Cas9 then cleaves both strands, introducing a double strand break (DSB). However,
Cas9 only cleaves if the target DNA is adjacent to a short sequence termed the protospacer
adjacent motif (PAM), which is usually the triplet NGG (N represents any nucleotide and G a
guanine) (Sander & Joung 2014). So, the target sequence needs to be in the form 5’-N20NGG.
The DSB is introduced about 3 to 4 nt upstream of the PAM sequence
(http://www.addgene.org/crispr/guide/). The PAM requirement does not limit the targeting of
Cas9, since there are around 424,651 GG doublets on both strands of the E. coli genome
(Reisch & Prather 2015). Part of the crRNA sequence also hybridizes with the tracrRNA, which
is also important in guiding Cas9 to the target, since the mature crRNA is unable to do it alone
(Jinek et al. 2012).

Figure 8- Schematic illustration of the engineered Cas9-gRNA complex cleaving a target DNA
sequence. crRNA- CRISPR RNA, tracrRNA- trans-activating CRISPR RNA (from Sander &
Joung 2014).

By simply altering the first 20 nt of the gRNA (corresponding to the protospacer in the crRNA),
the target DNA can be changed (Sander & Joung 2014), which is an advantage over other
systems. The complex has been used in diverse applications. In the no-SCAR methodology it is
used for the selection of the recombinants. The complex is induced to introduce a DSB in the
sequences that were not recombined (because the gRNA was designed to target the original
sequence). Since E. coli lacks the non-homologous end joining (NHEJ) mechanism, it is unable
to repair that break and dies. Conversely, if the homologous recombination occurs properly,
the target sequence is no longer present, the gRNA does not hybridize, Cas9 is unable to cleave
the DNA and the cell survives (Figure 9) (Peters et al. 2015) (Reisch & Prather 2015). This
negative selection (or counterselection) makes it possible to distinguish the cells that have been
modified (the recombinants) from those that have not, without leaving a “scar” in the genome
as in the Datsenko and Wanner methodology.
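
The targeting rule (a 20 nt protospacer followed by an NGG PAM, on either strand) and the logic of the counterselection can be illustrated with a short script. This is a simplified sketch that assumes exact matching between gRNA and target and ignores off-target cleavage. The function names and the toy sequences are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq: str, spacer_len: int = 20) -> list:
    """List every N(spacer_len)-NGG site on both strands.

    Each hit is (strand, position, protospacer, pam). For the "-" strand the
    position is counted along the reverse-complemented sequence.
    """
    hits = []
    top = seq.upper()
    for strand, s in (("+", top), ("-", reverse_complement(top))):
        # A lookahead is used so that overlapping sites are all reported.
        pattern = rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))"
        for match in re.finditer(pattern, s):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


def is_cleavable(seq: str, spacer: str) -> bool:
    """True if the spacer occurs next to an NGG PAM on either strand."""
    spacer = spacer.upper()
    return any(site == spacer for _, _, site, _ in find_protospacers(seq, len(spacer)))


# Toy locus: the target carries the protospacer and PAM; the edit removes them.
upstream, downstream = "ATATATATAT", "ATATATATAT"
spacer = "ACGTACGTACGTACGTACGT"
wild_type = upstream + spacer + "TGG" + downstream
recombinant = upstream + downstream

for name, locus in (("wild type", wild_type), ("recombinant", recombinant)):
    outcome = "cleaved, cell dies" if is_cleavable(locus, spacer) else "not cleaved, cell survives"
    print(f"{name}: {outcome}")
```

Only the unmodified locus still contains the protospacer and its PAM, so only non-recombinant cells are cut, which is the basis of the counterselection.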

Figure 9- General scheme of the counterselection of recombinants, using the Cas9-sgRNA
complex (from Peters et al. 2015).

1.5.1.2.1. Cas9-gRNA complex versus other alternatives

DSBs can also be introduced in the genome by other methods. Some examples include
DNA-binding proteins such as meganucleases, which are engineered restriction enzymes
derived from microbial mobile genetic elements, as well as zinc finger nucleases (ZFN) and
transcription activator like effector nucleases (TALEN). The latter two are composed of an
engineered zinc finger or a TAL effector DNA binding domain, respectively, responsible for
recognizing and binding the target DNA, and of a non-specific nuclease domain from the
restriction enzyme FokI, which cleaves the target DNA (Xiao-jie et al. 2015).

However, whenever the target DNA sequence is changed, these methods require the de
novo synthesis of the protein that guides the ZFNs and TALENs, a more complex procedure
than designing a new complementary gRNA for the Cas9-gRNA complex. Another advantage
of the Cas9-gRNA complex is that DSBs can be introduced at multiple sites in parallel
(multiplexing), by expressing different gRNAs from a “single construct”, each one targeting a
distinct sequence (Xiao-jie et al. 2015) (Sander & Joung 2014) (Figure 10).

Figure 10- Illustration of the expression of multiple gRNA, from a “single construct” (from

http://blog.addgene.org/crispr-101-multiplex-expression-of-grnas).

Although Cas9-gRNA represents a good alternative, one disadvantage is the possibility
that Cas9 cleaves at off-target sites, since any 20 nt target sequence adjacent to a PAM can
have hundreds of potential off-target positions. To overcome this, researchers have tested a
diversity of strategies. One example is to “modify the gRNA architecture” by truncating its
3’ end or by adding two extra guanine nucleotides to its 5’ end. This resulted in fewer
off-target effects, but it also lowered the efficiency of on-target genome editing (Sander &
Joung 2014).
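
The notion of an off-target site can be made concrete by scanning a sequence for PAM-adjacent sites that differ from the intended protospacer by only a few mismatches. The sketch below is a naive illustration under stated assumptions: it counts mismatches uniformly along the protospacer and uses an arbitrary mismatch threshold, whereas real off-target prediction tools weight positions differently. All names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(seq: str, spacer: str, max_mismatches: int = 3) -> list:
    """Find NGG-adjacent sites within max_mismatches of the spacer.

    Returns (strand, position, site, mismatches) tuples. A hit with zero
    mismatches is the intended target (or an exact duplicate of it).
    For the "-" strand the position is counted along the reverse complement.
    """
    spacer = spacer.upper()
    n = len(spacer)
    top = seq.upper()
    hits = []
    for strand, s in (("+", top), ("-", reverse_complement(top))):
        for i in range(len(s) - n - 2):
            # The PAM is N-G-G immediately 3' of the site; only the GG is checked.
            if s[i + n + 1:i + n + 3] != "GG":
                continue
            site = s[i:i + n]
            mismatches = hamming(site, spacer)
            if mismatches <= max_mismatches:
                hits.append((strand, i, site, mismatches))
    return hits


# Toy sequence with the intended target and one site carrying a single mismatch.
spacer = "ACGTACGTACGTACGTACGT"
near_match = "ACGTACGTACGTACGTACGA"
toy_seq = "TTTT" + spacer + "AGG" + "TTTT" + near_match + "CGG" + "TTTT"

for strand, pos, site, mm in find_candidate_sites(toy_seq, spacer):
    label = "on-target" if mm == 0 else "potential off-target"
    print(strand, pos, site, mm, label)
```

Lowering `max_mismatches` makes the search stricter, and raising it shows how quickly the number of candidate sites can grow in a large genome.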

Other methodologies, like the TetA-SacB Dual Selection and SceI counterselection, are
also based on the Cas9 counterselection. However, they do not demonstrate how the plasmids
encoding Cas9 and the sgRNA are removed, whereas the no-SCAR protocol does, which makes
it more advantageous (Reisch & Prather 2015).


2. Thesis Objective

The increased demand for biopharmaceutical plasmids for therapeutic applications has
stimulated new studies to improve the yield and quality of the plasmids produced in the host
bacteria. One strategy is to construct optimized strains able to direct the metabolic flux towards
the production of the compounds needed for the amplification and maintenance of plasmids,
while ensuring plasmid stability. However, most studies used strains with a highly mutagenized
genotype, first developed to produce recombinant proteins and to clone exogenous genes.
So, the consequence of specific mutations for the production of plasmids is not clear, and
these strains may not ensure the best conditions for plasmid production (Gonçalves et al. 2013).

Here the aim was to create E. coli strains able to produce high plasmid yields, by deleting
the pgi gene from their genome, and to compare their production, since this deletion has
already been shown to be beneficial in GALG20 (MG1655∆endA∆recA∆pgi) by Gonçalves et
al. (2013). This gene encodes glucose-6-phosphate isomerase, a key enzyme of the
glycolysis/gluconeogenesis pathway, responsible for the interconversion of glucose-6-phosphate
and fructose-6-phosphate. This deletion aims to direct the carbon flux through the pentose
phosphate pathway, to increase the production of nucleotides and hence of plasmids.

However, GALG20 has shown a genome instability problem, an unwanted deletion of a
20 kbp sequence. For that reason, the aim is to introduce the deletions in the wild-type MG1655
and in the highly mutagenized DH5α strain. The deletion methodology is based on the novel
no-SCAR (Scarless Cas9 Assisted Recombineering) method described by Reisch & Prather
(2015), a scar-free system that may prevent host genome instability. The plasmid production
of both pgi deleted strains, in two different background genotypes, is then compared.

Additionally, the pgi deletion is also intended in the BW27783 and BW2P strains, which
are used to produce minicircles, plasmid derived molecules. Minicircles represent a promising
non-viral vector tool in therapeutic applications, because they only contain the eukaryotic
sequences needed for the therapeutic effect. Therefore, pgi deleted strains will be compared
before and after induction (with arabinose) of a recombinase able to convert a parental plasmid
into miniplasmid and minicircle molecules, to evaluate whether the pgi deletion is beneficial
for minicircle production.


3. Materials and Methods

3.1. Media, chemicals and other reagents

Bacterial growth media: LB medium (NZYTech) (10 g.L-1 of tryptone, 5 g.L-1 yeast

extract and 10 g.L-1 NaCl), LB Agar (NZYTech) (10 g.L-1 of tryptone, 5 g.L-1 yeast extract,

10 g.L-1 NaCl and 15 g.L-1 of agar), SOB medium (20 g.L-1 of Bacto Tryptone (BD), 5 g.L-1 of

Yeast Extract (Liofilchem), 10 mM of NaCl (Panreac), 2.5 mM of KCl (Merck), 10 mM of

MgCl2.6H2O (Fagron), 10 mM of MgSO4 (Merck),
