lizbeth-hines
Gene Variations: single nucleotide polymorphism and copy number variation
Liu Gefei, Department of Genetics and Cell Biology, Shantou University Medical College
075488900497
13502932022
Genetic Variations
Chromosome numbers
Segmental duplications, Copy number variation
Translocations
Inversion
Sequence Repeats
Transposable Elements
Short deletions and insertions
Tandem Repeats
Nucleotide Insertions and Deletions (Indels)
Single Nucleotide Polymorphisms (SNPs)
Mutations
Sizable
Minor
Structural
Sequence
Genetic Markers
• Morphological markers
• Cytological markers
• Biochemical and physiological markers
• Molecular markers
• 1980, RFLPs (restriction fragment length polymorphisms)
• 1985, STRs (short tandem repeats, microsatellites)
• 1990s, SNPs (single nucleotide polymorphisms)
• 2000s, CNV (copy number variation)
Individual 1
A C G T G T C G G T C T T A A A Maternal chromosome
A C G T G T C C G T C T T A A A Paternal chromosome
Individual 2
A C G T G T C G G T C T T A A A Maternal chromosome
A C G T G T C G G T C T T A A A Paternal chromosome
Individual 3
A C G T G T C C G T C T T A A A Maternal chromosome
A C G T G T C C T A C T T A A A Paternal chromosome
SNP: the SNP is at the eighth base (boxed in the original figure). Individual 1 is heterozygous, while individuals 2 and 3 are homozygous.
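The genotype calls in the figure can be reproduced with a minimal Python sketch. The sequences are transcribed from the slide with spaces removed; the 0-based index `SNP_INDEX = 7` (the eighth base) and the function name `genotype` are assumptions made for illustration.

```python
# Chromosome pairs transcribed from the figure (maternal, paternal).
INDIVIDUALS = {
    "Individual 1": ("ACGTGTCGGTCTTAAA", "ACGTGTCCGTCTTAAA"),
    "Individual 2": ("ACGTGTCGGTCTTAAA", "ACGTGTCGGTCTTAAA"),
    "Individual 3": ("ACGTGTCCGTCTTAAA", "ACGTGTCCTACTTAAA"),
}

# Assumption: the SNP is the eighth base, i.e. index 7 when counting from 0.
SNP_INDEX = 7


def genotype(maternal, paternal, index=SNP_INDEX):
    """Return the allele pair at one site and whether it is homo- or heterozygous."""
    alleles = sorted((maternal[index], paternal[index]))
    zygosity = "homozygous" if alleles[0] == alleles[1] else "heterozygous"
    return "/".join(alleles), zygosity


calls = {name: genotype(*pair) for name, pair in INDIVIDUALS.items()}
for name, (alleles, zygosity) in calls.items():
    print(name, alleles, zygosity)
```

Only the SNP column is inspected here, so other differences between the two chromosomes of an individual do not affect the call.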
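Single nucleotide polymorphism (SNP)
A single-base difference between the DNA sequences of different individuals in the genome is called a single nucleotide polymorphism.
• About 1 base in 1,000 is estimated to differ between any two individuals (about 3 million differences).
• About 10 million SNPs exist across all human populations.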
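The "3 m" figure follows from simple arithmetic. The sketch below assumes a haploid human genome of roughly 3 billion base pairs, a rounded value that is not stated on the slide.

```python
# Assumption: the haploid human genome is about 3 billion base pairs (rounded).
GENOME_SIZE_BP = 3_000_000_000

# From the slide: about 1 base in 1,000 differs between any two individuals.
BASES_PER_DIFFERENCE = 1_000

expected_differences = GENOME_SIZE_BP // BASES_PER_DIFFERENCE
print(f"Expected differences between two individuals: {expected_differences:,}")
```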
SNP Effects
• SNPs in genes
– In coding regions (possible protein structure changes)
  • Synonymous substitutions
  • Missense substitutions
  • Nonsense substitutions
– In coding and non-coding regions
  • Change of gene expression (by altering the binding of various factors): yield, timing, alternative splicing
• SNPs in regulatory regions
– Change of gene expression
• SNPs in non-regulatory intergenic regions
– Can be used as genetic markers
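The three classes of coding-region substitution can be illustrated with a short sketch built on the standard genetic code. The function name `classify_substitution` and the example codons are chosen for illustration and do not come from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change as synonymous, missense, or nonsense."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        # Loss of a stop codon is outside the three classes on the slide.
        return "other"
    return "missense"


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("GAG", "GTG"))  # Glu -> Val
print(classify_substitution("CAG", "TAG"))  # Gln -> stop
```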
HapMap: the International Human Genome Haplotype Map Project
Towards genome variations
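• About 10 million SNP sites exist across all human populations, counting those whose rarer allele has a frequency of at least 1%.
• Alleles of neighboring SNPs tend to be passed on to offspring as a single unit. A set of associated SNP alleles in a region of a chromosome is called a haplotype.
• Most chromosome regions have only a few common haplotypes (each with a frequency of at least 5%), and these account for most of the person-to-person variation in a population. A chromosome region may contain many SNP sites, but a few tag SNPs are enough to capture most of the pattern of genetic variation in that region.
The HapMap is constructed in three steps: (a) single nucleotide polymorphisms (SNPs) are identified in DNA samples from multiple individuals; (b) adjacent SNPs that are inherited together and have a population frequency above 1% are combined into haplotypes; (c) tag SNPs that identify these haplotypes are found within them. By genotyping the three tag SNPs shown in the figure, researchers can determine which of the four haplotypes shown each individual carries.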
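Step (c), choosing tag SNPs, can be sketched as a brute-force search for the smallest set of sites that tells all haplotypes apart. The four haplotypes below are hypothetical (the figure's actual sequences are not in the text), and `minimal_tag_snps` is an illustrative name, not a HapMap tool.

```python
from itertools import combinations

# Hypothetical haplotypes over six SNP sites, for illustration only.
HAPLOTYPES = ["ACGTAC", "ATGTAC", "GCGCAC", "GCACTC"]


def minimal_tag_snps(haplotypes):
    """Return the smallest tuple of site indices whose alleles separate all haplotypes."""
    n_sites = len(haplotypes[0])
    for size in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), size):
            signatures = {tuple(h[i] for i in sites) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return sites
    return None  # haplotypes are not all distinct


tags = minimal_tag_snps(HAPLOTYPES)
# Map the alleles observed at the tag sites back to a full haplotype.
lookup = {tuple(h[i] for i in tags): h for h in HAPLOTYPES}
print(tags, lookup[("G", "C", "A")])
```

With these example haplotypes, three tag sites are enough to identify four haplotypes, which mirrors the situation described for the figure. Exhaustive search is only practical for small regions; it is used here because it makes the idea explicit.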
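We are so young!
• A limited number of ancestors: human evolution passed through a major bottleneck about 60,000 to 150,000 years ago, and the ancestral population that came through it was small (only some ten thousand people).
• Only a few thousand generations (about 3,000 to 5,600) separate modern humans from that population.
• Only a limited number of recombination events have occurred.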
The human genome is composed of “blocks”
The origin of haplotypes
Methods and technologies in SNP studies
• Discovery (find SNPs)
• Validation (is the SNP common or rare?)
• Genotyping (frequency in the population)
Considerations:
– Call rates
– Flexibility
– Throughput
– Cost
Fundamental approaches
• Large-scale sequencing based: genomic alignment (GA), reduced representation shotgun (RRS)
• PCR based: conventional PCR
• Hybridization based: DNA chips
Figure: sources of overlapping sequence for SNP discovery.
• Genomic DNA → BAC library → BAC overlap
• Genomic DNA → RRS (reduced representation shotgun) library or sampling → shotgun overlap
• mRNA → cDNA library → EST overlap
• Sequence overlap → SNP discovery
GTTTAAATAATACTGATCA
GTTTAAATAATACTGATCA
GTTTAAATAGTACTGATCA
GTTTAAATAGTACTGATCA
How to discover SNPs
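SNP discovery from overlapping sequences amounts to scanning aligned reads for columns that contain more than one base. A minimal sketch, using the four overlapping reads from the slide and assuming they are already aligned and of equal length (`find_snps` is an illustrative name):

```python
from collections import Counter

# Overlapping reads from the slide, assumed to be aligned base for base.
READS = [
    "GTTTAAATAATACTGATCA",
    "GTTTAAATAATACTGATCA",
    "GTTTAAATAGTACTGATCA",
    "GTTTAAATAGTACTGATCA",
]


def find_snps(reads):
    """Return (0-based position, allele counts) for every polymorphic column."""
    snps = []
    for position, column in enumerate(zip(*reads)):
        counts = Counter(column)
        if len(counts) > 1:
            snps.append((position, dict(counts)))
    return snps


snps = find_snps(READS)
print(snps)
```

Real pipelines also weigh base quality before accepting a difference as a SNP; this sketch treats every base as equally reliable.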
Discovering SNPs by Sequencing
Amplify DNA (5’ → 3’) → Sequence → Phred → Phrap → PolyPhred → Consed → Analysis
• Phred: base-calling, quality determination
• Phrap: contig assembly
• PolyPhred: polymorphism detection
• Consed: sequence viewing, polymorphism tagging
• Analysis: polymorphism reporting, individual genotyping, phylogenetic analysis
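The quality determination step rests on the Phred score, Q = -10 · log10(P), where P is the estimated probability that a base call is wrong. The helper names in this short sketch are illustrative and are not part of the Phred program itself.

```python
import math


def error_probability(quality):
    """Convert a Phred quality score Q into the base-call error probability P."""
    return 10 ** (-quality / 10)


def phred_quality(probability):
    """Convert a base-call error probability P into a Phred quality score Q."""
    return -10 * math.log10(probability)


for q in (10, 20, 30, 40):
    print(f"Q{q}: error probability {error_probability(q):.4f}")
```

A score of 20 therefore corresponds to about one wrong call in 100 bases, and a score of 30 to about one in 1,000.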
Figure: sequencing traces for the two homozygotes (ATAGACG and ATACACG) and the heterozygote (ATAGACG/ATACACG), which differ at a single G/C position.
SNP typing: Genotyping
Goals: sensitive, accurate, simple, high-throughput, low-cost
Platforms, arranged along the throughput axis:
• Invader (Third Wave), SNPlex (ABI), ParAllele, BeadArray (Illumina)
• Fluorescence Polarization (PE), MassArray (Sequenom)
• SNaPshot (ABI), SNuPe (GE), TaqMan (ABI), Pyrosequencing
SNP screening of certain genes
Regions screened: regulatory region (-1000 to -1), 5’ UTR, exons, 3’ UTR
Genes, samples, phenotypes
Primer design and PCR
Direct DNA sequencing
Statistical Analysis
• SNPs raise the resolution of genetic analysis
• Pharmacogenomics
• Personalized medicine
Nature Reviews Genetics, Volume 8, January 2007 (www.nature.com/reviews/genetics)
Science, 23 July 2004, 305:525
• Forty-three authors used the DNA from 270 individuals from the 4 HapMap populations.
• Overall, the authors found 1,447 discrete, heterogeneously distributed, copy number variable regions (CNVRs), which cover 12% of the human genome. They found that 24% of CNVRs are associated with segmental duplications.
• CNVRs contain different classes of functional elements.
– Many CNVs preferentially lie outside genes.
– Genes involved in cell adhesion, sensory perception of smell, and response to chemical stimuli are enriched within CNVs.
– Conversely, cell signalling and proliferation, as well as kinase- and phosphorylation-related categories, are underrepresented among CNVs.
– Interestingly, ultraconserved elements are strongly excluded from these regions.
• CNVs affect SNP genotype patterns, and SNPs can be used to identify linked CNVs.
• Both types of variation will need to be collected and analysed systematically if we are to understand the genetic basis of human disease.
• The authors call for standard assessment of CNV in all studies of the genetic basis of phenotypic variation, and for an international effort to continue to characterize and catalogue structural genomic variation.
26,628 clones; 534,500 SNPs
Effects of CNV
• Phenotype: modify drug response
• Predispose to or cause disease
• Polymorphism: population genetics
• Genome-wide variation in gene regulation
Methods to identify CNV
• Genome-wide
– Array-based
  • Array-CGH: clone-based (1 Mb), oligonucleotide-based (30 kb)
  • SNP array (signal intensity, genotyping)
– Sequence-assembly comparison
• Targeted
– PCR-based
  • MAPH, MLPA, QMPSF: multiplex, up to 40 regions at a time
  • Real-time qPCR
– Hybridization-based
  • FISH, Southern blotting
• Computational approaches
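Array-CGH, listed among the genome-wide methods, compares the signal of a test sample with that of a reference sample at each probe, usually as a log2 ratio. The sketch below is a simplified illustration: the intensities in the usage lines are made up, and the ±0.3 thresholds are an assumption rather than values from the slides.

```python
import math

# Assumed calling thresholds on the log2 ratio; real analyses tune these per platform.
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def log2_ratio(test_intensity, reference_intensity):
    """Log2 ratio of test to reference signal for one probe."""
    return math.log2(test_intensity / reference_intensity)


def call_probe(test_intensity, reference_intensity):
    """Call a probe as 'gain', 'loss', or 'normal' from its log2 ratio."""
    ratio = log2_ratio(test_intensity, reference_intensity)
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "normal"


# Hypothetical intensities: 3 copies vs 2, 1 copy vs 2, and 2 copies vs 2.
print(call_probe(1500, 1000), call_probe(500, 1000), call_probe(1000, 1000))
```

For a diploid reference, a single-copy gain gives a theoretical ratio of log2(3/2) ≈ 0.58 and a single-copy loss gives log2(1/2) = -1, which is why symmetric thresholds around zero are a common starting point.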
Methods to identify CNV: array-CGH
• Array-based CGH
• Representational oligonucleotide microarray analysis (ROMA)
Methods to identify CNV: targeted PCR-based
• Multiplex amplifiable probe hybridization (MAPH)
• Multiplex ligation-dependent probe amplification (MLPA)
• Quantitative multiplex PCR of short fluorescent fragments (QMPSF)
Methods to identify CNV: computational
Validation of CNV
• Mass spectrometry: MALDI-TOF
• Real-time quantitative PCR
• Southern blotting
• FISH
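For validation by real-time quantitative PCR, copy number is commonly estimated with the comparative Ct (2^-ΔΔCt) method. The sketch below assumes roughly 100% amplification efficiency, a reference gene present in two copies, and a calibrator sample known to carry two copies of the target; the Ct values in the usage lines are made up.

```python
def relative_copy_number(ct_target_sample, ct_reference_sample,
                         ct_target_calibrator, ct_reference_calibrator,
                         calibrator_copies=2):
    """Estimate target copy number with the comparative Ct (2^-ddCt) method.

    Assumes near-100% PCR efficiency and a calibrator with a known copy number.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)


# Hypothetical Ct values: one cycle later than the calibrator suggests a deletion,
# one cycle earlier suggests a duplication of both copies.
print(relative_copy_number(25.0, 24.0, 24.0, 24.0))
print(relative_copy_number(23.0, 24.0, 24.0, 24.0))
```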
submicroscopic
microscopic
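MassArray (1)
• DNA extraction
• Amplification of the target sequence (primers)
• First purification and SNP-site extension reaction
• Spotting
• MALDI-TOF mass spectrometry
• Automated sequence analysis and retrieval of allele information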
MassArray (2)
Reaction mix: + enzyme + ddATP + dCTP/dGTP/dTTP
• Allele 1: unlabeled primer (23-mer) → extended primer (24-mer)
• Allele 2: unlabeled primer (23-mer) → extended primer (26-mer)
Figure: mass spectrum with separate peaks for the Allele 1 and Allele 2 extension products.
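The length difference between the two extension products follows from the reaction mix: with ddATP as the only terminator, extension stops at the first A that is incorporated. A minimal sketch of that logic, in which the bases added after the primer ("AGA" and "TGA") are hypothetical, because the figure's sequence is not fully recoverable from the text:

```python
PRIMER_LENGTH = 23  # unlabeled primer length from the slide


def extended_length(primer_length, bases_to_add, terminators=("A",)):
    """Length of the primer after extension that stops at the first terminator base.

    bases_to_add lists, in order, the bases the polymerase would incorporate.
    """
    added = 0
    for base in bases_to_add:
        added += 1
        if base in terminators:  # a dideoxy nucleotide ends the extension
            break
    return primer_length + added


# Hypothetical bases incorporated after the primer for each allele.
allele_1 = extended_length(PRIMER_LENGTH, "AGA")  # ddA is added first
allele_2 = extended_length(PRIMER_LENGTH, "TGA")  # T and G are added, then ddA
print(allele_1, allele_2)
```

The two products differ by two nucleotides in mass, which is what lets MALDI-TOF resolve the alleles as separate peaks.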
Identify CNV of certain genes or regions
Methods, ordered from small to large sample numbers:
• Southern blotting (small samples)
• FISH (optional)
• Real-time qPCR
• QMPSF
• MAPH
• MLPA (large samples)