Sequence Alignment Method BY:- Parwati Sihag M.Sc. Biotechnology


Page 1: Parwati sihag

Sequence Alignment Method

By: Parwati Sihag, M.Sc. Biotechnology

Page 2: Parwati sihag

SEQUENCE ALIGNMENT

It is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.

Page 3: Parwati sihag

Global Alignment

In global alignment, the two sequences to be aligned are assumed to be generally similar over their entire length.

Alignment is carried out from beginning to end of both sequences to find the best possible alignment across the entire length between the two sequences.

This method is more applicable for aligning two closely related sequences of roughly the same length.
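To make "best possible alignment across the entire length" concrete, here is a minimal sketch that scores one end-to-end alignment written as two gapped rows. The function name and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, not values given in the slides.

```python
def score_global_alignment(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score an end-to-end alignment given as two equal-length gapped rows.

    The scoring values are assumed for illustration only.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot contain two gaps")
        if x == "-" or y == "-":
            total += gap        # one residue aligned against a gap
        elif x == y:
            total += match      # identical residues
        else:
            total += mismatch   # different residues
    return total


# Six matching columns and one gap column: 6 * 1 + (-2) = 4
print(score_global_alignment("GATTACA", "GAT-ACA"))
```

A global alignment method searches over all such end-to-end arrangements and keeps the one with the highest score.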

Page 4: Parwati sihag
Page 5: Parwati sihag

Local Alignment

Local alignment, on the other hand, does not assume that the two sequences in question have similarity over the entire length.

It only finds local regions with the highest level of similarity between the two sequences and aligns these regions without regard for the alignment of the rest of the sequence regions.

This approach can be used for aligning more divergent sequences with the goal of searching for conserved patterns in DNA or protein sequences. The two sequences to be aligned can be of different lengths.
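The slides do not name an algorithm for local alignment. A common choice is the Smith-Waterman algorithm, sketched below under assumed scoring values (match +2, mismatch -1, gap -2). Scores are never allowed to drop below zero, so the best-scoring region can start and end anywhere in either sequence.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are assumptions chosen for illustration.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamp at zero so a poor region never penalizes a later good one.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Two sequences of different lengths that share only the region GATTACA.
print(smith_waterman("AAAGATTACACCC", "TTGATTACATT"))
```

Only the shared region is reported; the unrelated flanking residues are left out of the alignment.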

Page 6: Parwati sihag
Page 7: Parwati sihag

Advantages of pairwise alignment

•It is the simplest method of alignment.
•In pairwise alignment, two sequences are aligned with each other.
•It is used in structural, functional, and evolutionary analysis of sequences.
•Pairwise alignment gives highly accurate results.
•It is also used to identify homologous sequences.

Page 8: Parwati sihag

Disadvantages of pairwise alignment

•It is not useful when more than two sequences must be aligned.
•Pairwise alignment becomes difficult when long sequences are aligned.

Page 9: Parwati sihag

DOT MATRIX METHOD

•It is also known as the dot plot method.
•It is a graphical way of comparing two sequences in a two-dimensional matrix.
•In a dot matrix, the two sequences to be compared are written along the horizontal and vertical axes of the matrix.
•The comparison is done by scanning each residue of one sequence for similarity with all residues in the other sequence.
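As a minimal sketch of the dot matrix comparison, the code below compares every residue of one sequence with every residue of the other and draws the result as text. It assumes that "similarity" means an exact residue match; the function names and the `*` and `.` symbols are illustrative choices.

```python
def dot_matrix(seq_x, seq_y):
    """Build the matrix: columns follow seq_x (horizontal), rows follow seq_y (vertical)."""
    return [[rx == ry for rx in seq_x] for ry in seq_y]


def render(seq_x, seq_y):
    """Draw the matrix as text, using '*' for a match and '.' for an empty cell."""
    grid = dot_matrix(seq_x, seq_y)
    lines = ["  " + " ".join(seq_x)]
    for ry, row in zip(seq_y, grid):
        lines.append(ry + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("GATTACA", "GATCA"))
```

Every residue pair is compared once, so the work grows with the product of the two sequence lengths.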

Page 10: Parwati sihag

Dot matrix method

Page 11: Parwati sihag

DYNAMIC PROGRAMMING METHOD

It is the method that determines the optimal alignment by matching two sequences for all possible pairs of characters between the two sequences.

It is similar to the dot matrix method, but it finds the alignment in a more quantitative way by converting the dot matrix into a scoring matrix.
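The scoring-matrix idea can be sketched with the Needleman-Wunsch algorithm for global alignment. The scoring values below (match +1, mismatch -1, gap -1) are assumptions chosen for illustration. Each cell holds the best score for aligning the two prefixes that end at that cell, and the alignment is read back from the bottom-right corner.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    F = [[0] * cols for _ in range(rows)]
    # First row and column: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        F[i][0] = i * gap
    for j in range(1, cols):
        F[0][j] = j * gap
    # Fill the scoring matrix from the three neighboring cells.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Trace back from the bottom-right corner to the top-left corner.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return F[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("GATTACA", "GATACA"))
```

When several alignments share the best score, this sketch returns only one of them; which one depends on the order of the checks in the traceback.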

Page 12: Parwati sihag

Dynamic programming

Page 13: Parwati sihag

MULTIPLE SEQUENCE ALIGNMENT

•It is a sequence alignment of three or more biological sequences, generally protein, DNA, or RNA.
•MSAs require more sophisticated methodologies than pairwise alignment because they are more computationally complex.
•Most multiple sequence alignment programs use heuristic methods rather than global optimization, because identifying the optimal alignment between more than a few sequences of moderate length is prohibitively expensive computationally.
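A rough sketch of why exact optimization becomes too expensive: extending the pairwise dynamic programming table to several sequences needs one dimension per sequence, so the number of cells is the product of the sequence lengths plus one. The helper name `dp_cells` and the length of 100 residues are illustrative assumptions.

```python
from math import prod


def dp_cells(lengths):
    """Number of cells in a full dynamic-programming table, one dimension per sequence."""
    return prod(n + 1 for n in lengths)


# Table size for 2, 3, 5, and 10 sequences of 100 residues each.
for k in (2, 3, 5, 10):
    print(k, dp_cells([100] * k))
```

The count grows exponentially with the number of sequences, which is why heuristic methods are preferred in practice.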

Page 14: Parwati sihag
Page 15: Parwati sihag
Page 16: Parwati sihag

Advantages of multiple sequence alignment

•MSA is used for comparing more than two sequences.
•It is used to identify homologous residues within the sequences.
•It is used to find identical sequences.

Page 17: Parwati sihag

Disadvantages of multiple sequence alignment

•It is a more complex method compared with pairwise alignment.
•It is more time consuming.
•Gaps within the sequences can introduce errors.
•Its accuracy is lower than that of pairwise sequence alignment.

Page 18: Parwati sihag

Online tools for sequence alignment

The following online tools are available for sequence alignment:
•BLAST
•FASTA
•CLUSTAL OMEGA

Page 19: Parwati sihag

BASIC STEPS PERFORMED IN BLAST

Open the NCBI site

Select All Databases (Gene was chosen)

Enter the name of the gene (thyroid peroxidase)

Click on Search

Get the list of search results

Get the gene ID and location

Click on FASTA

Obtain the FASTA format and the NCBI Reference Sequence

Run BLAST

Page 20: Parwati sihag
Page 21: Parwati sihag
Page 22: Parwati sihag
Page 23: Parwati sihag
Page 24: Parwati sihag
Page 25: Parwati sihag
Page 26: Parwati sihag

http:/www.ncbi.nlm.nin.gov/FASTA

FASTA

Page 27: Parwati sihag

BASIC STEPS INVOLVED IN FASTA

Open NCBI and go to All Databases

Select Nucleotide or Protein

Enter the name of the protein, gene, or nucleotide

Open the selected file

Page 28: Parwati sihag

FASTA Format

Copy and paste the FASTA-format sequence into the BLAST query field

Run BLAST
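For readers unfamiliar with the FASTA format handled in these steps: each record is a header line beginning with `>` followed by one or more lines of sequence. The sketch below reads such text with the Python standard library only. The record shown is a made-up placeholder, not the thyroid peroxidase reference sequence, and `parse_fasta` is a hypothetical helper name.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            header = line[1:]             # header line, without the '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical record used only to show the layout of the format.
example = """>example_id made-up demonstration sequence
ATGCGTACGT
TTAGC
"""
print(parse_fasta(example))
```

The joined sequence string is what gets pasted into a query field when the record is submitted to an alignment tool.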

Page 29: Parwati sihag
Page 30: Parwati sihag
Page 31: Parwati sihag
Page 32: Parwati sihag
Page 33: Parwati sihag
Page 34: Parwati sihag

PROGRESSIVE ALIGNMENT

It is the most commonly used approach to multiple sequence alignment. It speeds up the alignment of multiple sequences through a multistep process. It first conducts pairwise alignment for each possible pair of sequences using the Needleman-Wunsch algorithm and records the similarity scores from the pairwise comparisons.

Page 35: Parwati sihag

•The scores are then converted into evolutionary distances to generate a distance matrix for all the sequences involved.
•From this distance matrix, a phylogenetic tree (the guide tree) is generated using the neighbor-joining method.
•In the next step, the two most closely related sequences in the guide tree are aligned first; the next closest sequence based on the guide tree is then aligned with the resulting consensus sequence using dynamic programming.
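The first two stages of progressive alignment (pairwise scores, then a distance matrix) can be sketched as follows. Two simplifications are assumed and are not taken from the slides: the pairwise score uses a toy scheme (match 1, mismatch 0, gap 0), and the distance is taken as one minus the score divided by the longer sequence length. Neighbor-joining itself is not implemented; the sketch only picks the closest pair, which is where the guide tree would start.

```python
from itertools import combinations


def similarity(a, b):
    """Global alignment score under a toy scheme: match=1, mismatch=0, gap=0."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(max(prev[j - 1] + (x == y), prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def distance_matrix(seqs):
    """Map each pair of sequence names to an assumed distance in [0, 1]."""
    dist = {}
    for p, q in combinations(list(seqs), 2):
        score = similarity(seqs[p], seqs[q])
        dist[(p, q)] = 1 - score / max(len(seqs[p]), len(seqs[q]))
    return dist


def closest_pair(dist):
    """Return the pair of names with the smallest distance."""
    return min(dist, key=dist.get)


# Made-up sequences for demonstration.
seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTTTACGA"}
dist = distance_matrix(seqs)
print(dist)
print(closest_pair(dist))
```

In a full progressive aligner the closest pair would be aligned first, and the remaining sequences would be added in the order given by the guide tree.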

Page 36: Parwati sihag
Page 37: Parwati sihag

ITERATIVE ALIGNMENT

•It is based on the idea that an optimal solution can be found by repeatedly modifying existing suboptimal solutions.
•The procedure starts by producing a low-quality alignment and gradually improves it by iterative realignment through well-defined procedures until no more improvement in the alignment can be achieved.
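The "modify, rescore, keep if better, stop when nothing improves" loop can be illustrated with a deliberately tiny toy problem: placing a single gap in a sequence that is one residue shorter than a reference. The helper names, the move set (shift the gap by one position), and the score (number of matching columns) are all assumptions made for this sketch; real iterative aligners use much richer moves and scoring.

```python
def place_gap(seq, pos):
    """Insert one gap character into seq at index pos."""
    return seq[:pos] + "-" + seq[pos:]


def column_matches(row_a, row_b):
    """Toy alignment score: number of columns with identical characters."""
    return sum(x == y for x, y in zip(row_a, row_b))


def refine(reference, seq, start_pos):
    """Shift the gap one step at a time while the score improves."""
    pos = start_pos
    best = column_matches(reference, place_gap(seq, pos))
    improved = True
    while improved:
        improved = False
        for cand in (pos - 1, pos + 1):
            if 0 <= cand <= len(seq):
                score = column_matches(reference, place_gap(seq, cand))
                if score > best:
                    pos, best, improved = cand, score, True
                    break
    return place_gap(seq, pos), best


# Start from a poor alignment (gap at the end) and improve it step by step.
print(refine("ACGT", "AGT", start_pos=3))
```

Because each step only accepts improvements, a loop of this kind can stop at a local optimum rather than the best possible alignment.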

Page 38: Parwati sihag
Page 39: Parwati sihag

It performs multiple alignment through two sets of iterations.
1. Outer iteration: an initial random alignment is generated and used to derive a UPGMA tree.
2. Inner iteration: the sequences are randomly divided into two groups.
The process is repeated over many cycles until there is no further improvement in the overall alignment score.
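The slide does not say how the overall alignment score is computed. One common choice is the sum-of-pairs score, sketched here as an assumption: every pair of rows is scored column by column and the results are added. The scoring values (match +1, mismatch -1, gap -1) are illustrative.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-1):
    """Sum-of-pairs score for a list of equal-length gapped rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            if x == "-" and y == "-":
                continue            # two gaps in a column contribute nothing
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


# Columns contribute 3, -1, 3, and -1 respectively, for a total of 4.
print(sum_of_pairs(["ACGT", "A-GT", "ACGA"]))
```

An iterative method would recompute a score like this after each realignment cycle and stop once it no longer increases.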

Page 40: Parwati sihag

•In a dot matrix, if a residue match is found, a dot is placed within the graph.
•Otherwise, the matrix position is left blank.
•When the two sequences have substantial regions of similarity, many dots line up to form contiguous diagonal lines, which reveal the sequence alignment.
•If there are interruptions in the middle of a diagonal line, they indicate insertions or deletions.
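The diagonal lines described above can also be found programmatically. The sketch below, with a hypothetical function name, reports the longest unbroken diagonal run of matches and where it starts in each sequence (0-based positions). It assumes a dot means an exact residue match.

```python
def longest_diagonal(seq_x, seq_y):
    """Return (length, start_x, start_y) of the longest unbroken diagonal of matches."""
    best = (0, 0, 0)
    # run[i][j] is the length of the diagonal run ending at seq_y[i-1], seq_x[j-1].
    run = [[0] * (len(seq_x) + 1) for _ in range(len(seq_y) + 1)]
    for i, ry in enumerate(seq_y, start=1):
        for j, rx in enumerate(seq_x, start=1):
            if rx == ry:
                run[i][j] = run[i - 1][j - 1] + 1
                if run[i][j] > best[0]:
                    best = (run[i][j], j - run[i][j], i - run[i][j])
    return best


# The shared region GATTACA starts at position 3 in the first sequence
# and at position 2 in the second.
print(longest_diagonal("AAAGATTACACCC", "TTGATTACATT"))
```

An insertion or deletion would break such a run and continue it on a neighboring diagonal, which is what an interrupted line in the plot shows.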

Page 41: Parwati sihag

THANK YOU