Sequence Alignment Method
By: Parwati Sihag, M.Sc. Biotechnology
SEQUENCE ALIGNMENT
It is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
Global Alignment
In global alignment, the two sequences to be aligned are assumed to be generally similar over their entire length.
Alignment is carried out from the beginning to the end of both sequences to find the best possible alignment across their entire length.
This method is most applicable to aligning two closely related sequences of roughly the same length.
Local Alignment
Local alignment, on the other hand, does not assume that the two sequences in question are similar over their entire length.
It only finds the local regions with the highest level of similarity between the two sequences and aligns these regions without regard for the alignment of the rest of the sequences.
This approach can be used for aligning more divergent sequences with the goal of searching for conserved patterns in DNA or protein sequences. The two sequences to be aligned can be of different lengths.
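The idea of scoring only the best-matching local region can be sketched in a few lines of Python. The slides do not name an algorithm for local alignment; the sketch below assumes the Smith-Waterman recurrence, and the scoring values (match=2, mismatch=-1, gap=-2) and the function name are illustrative choices, not taken from the slides.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, (i, j)) for the best local alignment of a and b.

    Assumed scoring scheme (illustrative only): match=2, mismatch=-1, gap=-2.
    (i, j) is the cell where the best-scoring local alignment ends.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 lets an alignment restart anywhere,
            # which is what makes the alignment local rather than global.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return best, best_pos


score, end = local_alignment_score("AAAGATTACATTT", "CCGATTACAGG")
print(score, end)
```

The two example sequences differ in length and share only the region GATTACA; the flanking residues do not lower the score because the alignment is not forced to cover them.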
Advantages of pairwise alignment
•It is the simplest method of alignment.
•In pairwise alignment, two sequences are aligned with each other.
•It is used in structural, functional, and evolutionary analysis of sequences.
•Pairwise alignment gives highly accurate results.
•It is also used to identify homologous sequences.
Disadvantages of pairwise alignment
•It is not useful when more than two sequences must be aligned.
•Pairwise alignment becomes difficult when long sequences are used.
DOT MATRIX METHOD
•It is also known as the dot plot method.
•It is a graphical way of comparing two sequences in a two-dimensional matrix.
•In a dot matrix, the two sequences to be compared are written along the horizontal and vertical axes of the matrix.
•The comparison is done by scanning each residue of one sequence for similarity with every residue in the other sequence.
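The residue-by-residue scan described above can be sketched as follows. This is a minimal illustration with no window size or similarity threshold, which practical dot plot tools usually add; the function names and the example sequences are hypothetical.

```python
def dot_matrix(seq_a, seq_b):
    """Return grid where grid[i][j] is True when seq_b[i] matches seq_a[j].

    seq_a runs along the horizontal axis, seq_b along the vertical axis.
    """
    return [[a == b for a in seq_a] for b in seq_b]


def render(seq_a, seq_b):
    """Draw the matrix as text: '.' marks a match, a space marks no match."""
    grid = dot_matrix(seq_a, seq_b)
    lines = ["  " + " ".join(seq_a)]
    for residue, row in zip(seq_b, grid):
        lines.append(residue + " " + " ".join("." if hit else " " for hit in row))
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

In the printed grid the main diagonal is unbroken except at the fourth position, where T in one sequence faces C in the other.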
DYNAMIC PROGRAMMING METHOD
It is a method that determines the optimal alignment by matching the two sequences for all possible pairs of characters between them.
It is similar to the dot matrix method, but it finds the alignment in a more quantitative way by converting the dot matrix into a scoring matrix.
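As a rough sketch of how a scoring matrix replaces the dot matrix, the Python below fills a global-alignment matrix and traces back one optimal alignment. It assumes the Needleman-Wunsch recurrence with a linear gap penalty; the scoring values (match=1, mismatch=-1, gap=-1) are illustrative assumptions, not values from the slides.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    Assumed scoring scheme (illustrative only): match=1, mismatch=-1, gap=-1.
    """
    n, m = len(a), len(b)
    # F[i][j] = best score for aligning a[:i] with b[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                          F[i - 1][j] + gap,    # gap in b
                          F[i][j - 1] + gap)    # gap in a

    # Trace back from the bottom-right corner to recover one alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return F[n][m], "".join(reversed(top)), "".join(reversed(bottom))


score, row_a, row_b = needleman_wunsch("GATTACA", "GATACA")
print(score)
print(row_a)
print(row_b)
```

When several alignments share the best score, the traceback returns only one of them; which one depends on the order in which the three moves are tested.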
MULTIPLE SEQUENCE ALIGNMENT
•It is a sequence alignment of three or more biological sequences, generally protein, DNA, or RNA.
•MSAs require more sophisticated methodologies than pairwise alignment because they are more computationally complex.
•Most multiple sequence alignment programs use heuristic methods rather than global optimization, because identifying the optimal alignment between more than a few sequences of moderate length is prohibitively computationally expensive.
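A quick calculation shows why exact optimization becomes impractical. Extending pairwise dynamic programming to N sequences means filling an N-dimensional scoring matrix; assuming every sequence has the same length L, that matrix has (L + 1)^N cells. The sketch below only counts cells and ignores the work done per cell; the function name and the length of 300 are illustrative.

```python
def dp_cells(num_sequences, length):
    """Number of cells in the scoring matrix for exact dynamic programming,
    assuming every sequence has the same length."""
    return (length + 1) ** num_sequences


for n in (2, 3, 5, 10):
    print(n, "sequences:", dp_cells(n, 300), "cells")
```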
Advantages of multiple sequence alignment
•MSA is used for comparing more than two sequences.
•It is used to identify homologous residues within the sequences.
•It is used to find identical sequences.
Disadvantages of multiple sequence alignment
•It is a more complex method compared to pairwise alignment.
•It is more time-consuming.
•Gaps within the sequences can introduce errors.
•It has lower accuracy compared to pairwise sequence alignment.
Online tools for sequence alignment
The following online tools are available for sequence alignment.
•BLAST
•FASTA
•CLUSTAL OMEGA
BASIC STEPS PERFORMED IN BLAST
Open the NCBI site
Select All Databases (choose Gene)
Enter the name of the gene (thyroid peroxidase)
Click on Search
Get the list of search results
Get the gene ID and location
Click on FASTA
Obtain the FASTA format and the NCBI reference sequence
Run BLAST
http://www.ncbi.nlm.nih.gov/FASTA
BASIC STEPS INVOLVED IN FASTA
Open NCBI and select All Databases
Select Nucleotide or Protein
Enter the name of the protein, gene, or nucleotide sequence
Open the selected file
Obtain the FASTA format
Copy and paste the FASTA-format sequence into the BLAST query box
Run BLAST
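Before pasting a FASTA record into a query box, it can help to see how the format is laid out: a header line starting with ">" followed by one or more lines of sequence. The sketch below is a minimal parser for illustration; the function name and the record shown are hypothetical and are not real NCBI data.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence.

    A record is a header line beginning with '>' followed by sequence lines,
    which are joined into one string. Blank lines are ignored.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Hypothetical record, made up for this example.
example = """>example_seq hypothetical record
ATGGCGTTA
GGCTAA
"""
print(parse_fasta(example))
```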
PROGRESSIVE ALIGNMENT
It is the most commonly used approach to multiple sequence alignment.
It speeds up the alignment of multiple sequences through a multistep process.
It first conducts a pairwise alignment for each possible pair of sequences using the Needleman-Wunsch algorithm and records the similarity scores from these pairwise comparisons.
•The scores are then converted into evolutionary distances to generate a distance matrix for all the sequences involved.
•A guide tree is then generated from the distance matrix using the neighbor-joining method.
•The sequences are aligned in the order given by the guide tree: the two closest sequences are aligned first and reduced to a consensus sequence, and in each following step the next closest sequence on the guide tree is aligned with the consensus sequence using dynamic programming.
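The first two stages (all-against-all pairwise scores, then a distance matrix) can be sketched as below. The score-to-distance conversion used here, 1 - score / longer length, is a simple illustrative assumption and not the formula used by any particular program. Building the neighbor-joining tree is omitted; the sketch only reports the closest pair, which is where the alignment would start. All names and sequences are hypothetical.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score (Needleman-Wunsch), keeping only two rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def distance_matrix(seqs):
    """Map each pair of names to a distance.

    Assumed conversion (illustrative): distance = 1 - score / longer length,
    so identical sequences get distance 0 under the default scoring.
    """
    dist = {}
    for x, y in combinations(list(seqs), 2):
        score = nw_score(seqs[x], seqs[y])
        dist[(x, y)] = 1 - score / max(len(seqs[x]), len(seqs[y]))
    return dist


def closest_pair(dist):
    """Pair with the smallest distance: the first pair to be aligned."""
    return min(dist, key=dist.get)


seqs = {"s1": "GATTACA", "s2": "GATTTCA", "s3": "CCGGCCGG"}
dist = distance_matrix(seqs)
print(closest_pair(dist))
```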
ITERATIVE ALIGNMENT
•It is based on the idea that an optimal solution can be found by repeatedly modifying existing suboptimal solutions.
•The procedure starts by producing a low-quality alignment and gradually improves it by iterative realignment through well-defined procedures until no more improvement in the alignment can be achieved.
It performs multiple alignment through two sets of iterations.
1. Outer iteration: an initial random alignment is generated and used to derive a UPGMA tree.
2. Inner iteration: the sequences are randomly divided into two groups.
The process is repeated over many cycles until there is no further improvement in the overall alignment score.
DOT MATRIX METHOD (continued)
•If a residue match is found, a dot is placed within the graph.
•Otherwise, the matrix position is left blank.
•When the two sequences have substantial regions of similarity, many dots line up to form contiguous diagonal lines, which reveal the sequence alignment.
•If there are interruptions in the middle of a diagonal line, they indicate insertions or deletions.
THANK YOU