Parwati sihag

Sequence Alignment Method

By: Parwati Sihag, M.Sc. Biotechnology

SEQUENCE ALIGNMENT

Sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.

Global Alignment

In global alignment, the two sequences to be aligned are assumed to be generally similar over their entire length.

Alignment is carried out from beginning to end of both sequences to find the best possible alignment across the entire length between the two sequences.

This method is more applicable for aligning two closely related sequences of roughly the same length.

Local Alignment

Local alignment, on the other hand, does not assume that the two sequences in question are similar over their entire length.

It only finds local regions with the highest level of similarity between the two sequences and aligns these regions without regard for the alignment of the rest of the sequence regions.

This approach can be used for aligning more divergent sequences with the goal of searching for conserved patterns in DNA or protein sequences. The two sequences to be aligned can be of different lengths.
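A minimal sketch of local alignment, assuming the Smith-Waterman algorithm with a simple match/mismatch/gap scoring scheme. The function name and the score values are illustrative choices, not taken from the slides; real tools normally use substitution matrices and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below zero, so an alignment
    # can start fresh anywhere in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + pair,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        pair = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("AAAGATTACACCC", "TTTGATTACAGGG")
print(score)
print(top)
print(bottom)
```

Only the shared region is reported; the dissimilar flanking residues are ignored, which is the behaviour described for local alignment.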

Advantages of pairwise alignment

•It is the simplest method of alignment.
•In pairwise alignment, two sequences are aligned with each other.
•It is used in structural, functional, and evolutionary analysis of sequences.
•Pairwise alignment gives highly accurate results.
•It is also used to identify homologous sequences.

Disadvantages of pairwise alignment

•It is not useful when more than two sequences have to be aligned.
•Pairwise alignment becomes difficult when long sequences are used.

DOT MATRIX METHOD

•It is also known as the dot plot method.
•It is a graphical way of comparing two sequences in a two-dimensional matrix.
•In a dot matrix, the two sequences to be compared are written along the horizontal and vertical axes of the matrix.
•The comparison is done by scanning each residue of one sequence for similarity with all residues in the other sequence.
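The dot matrix described above can be sketched in a few lines. This is an illustrative example only: the function names are made up for the sketch, "*" stands for a dot, "." stands for an empty position, and no window or threshold filtering is applied.

```python
def dot_matrix(seq_x, seq_y):
    """Build the matrix: rows follow seq_y (vertical), columns follow seq_x (horizontal)."""
    return [[1 if ry == rx else 0 for rx in seq_x] for ry in seq_y]


def render(seq_x, seq_y):
    """Return the dot matrix as text, with both sequences written along the axes."""
    matrix = dot_matrix(seq_x, seq_y)
    lines = ["  " + " ".join(seq_x)]
    for residue, row in zip(seq_y, matrix):
        lines.append(residue + " " + " ".join("*" if cell else "." for cell in row))
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

Every residue of one sequence is compared with every residue of the other, so the matrix has len(seq_x) x len(seq_y) cells.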

DYNAMIC PROGRAMMING METHOD

It is a method that determines the optimal alignment by matching two sequences for all possible pairs of characters between the two sequences.

It is similar to the dot matrix method, but it finds the alignment in a more quantitative way by converting the dot matrix into a scoring matrix.
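As an illustration of the scoring-matrix idea, here is a minimal sketch of global alignment by dynamic programming, assuming the Needleman-Wunsch recurrence with a linear gap penalty. The scores (match=1, mismatch=-1, gap=-1) and the function name are assumptions made for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: match/mismatch, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("GATTACA", "GATACA")
print(total)
print(top)
print(bottom)
```

When several alignments share the best score, this sketch returns only one of them; which one depends on the order in which the traceback checks the three moves.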

MULTIPLE SEQUENCE ALIGNMENT

•It is a sequence alignment of three or more biological sequences, generally protein, DNA, or RNA.
•MSAs require more sophisticated methodologies than pairwise alignment because they are computationally more complex.
•Most multiple sequence alignment programs use heuristic methods rather than global optimization, because identifying the optimal alignment between more than a few sequences of moderate length is prohibitively expensive computationally.
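A rough count shows why exact optimization becomes impractical. Extending pairwise dynamic programming to N sequences means filling a lattice with one cell per combination of positions, that is (L1+1) x (L2+1) x ... x (LN+1) cells, and each cell can be reached from up to 2^N - 1 neighbouring cells. The sketch below only counts this work; the function names are illustrative.

```python
def dp_cells(lengths):
    """Number of cells in the full dynamic-programming lattice for these sequence lengths."""
    cells = 1
    for n in lengths:
        cells *= n + 1
    return cells


def moves_per_cell(n_sequences):
    """Number of predecessor cells to check: every non-empty subset of sequences can advance."""
    return 2 ** n_sequences - 1


for n in (2, 3, 5, 10):
    lengths = [300] * n
    print(n, "sequences:", dp_cells(lengths), "cells,", moves_per_cell(n), "moves per cell")
```

The cell count grows exponentially with the number of sequences, which is the reason heuristic methods are preferred.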

Advantage of multiple sequence alignment

•MSA is used for comparing more than two sequences.
•It is used to identify homologous residues within the sequences.
•It is used to find identical sequences.

Disadvantage of multiple sequence alignment

•It is a more complex method compared to pairwise alignment.
•It is more time consuming.
•Gaps within the sequences can introduce errors.
•It has lower accuracy compared to pairwise sequence alignment.

Online tools for sequence alignment

The following online tools are available for sequence alignment.
•BLAST
•FASTA
•CLUSTAL OMEGA

BASIC STEPS PERFORMED IN BLAST

Open the NCBI site

Select All Databases (choose Gene)

Enter the name of the gene (thyroid peroxidase)

Click on Search

Get the list of search results

Get the gene ID and location

Click on FASTA

Obtain the FASTA format and the NCBI Reference Sequence

Run BLAST

http:/www.ncbi.nlm.nin.gov/FASTA

BASIC STEPS INVOLVED IN FASTA

Open NCBI and enter All Databases

Select Nucleotide or Protein

Enter the name of the protein, gene, or nucleotide sequence

Select and open the file

Get the FASTA format

Copy and paste the FASTA-format sequence into the BLAST query box

Run BLAST
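The FASTA format used in the steps above is plain text: a header line beginning with ">" followed by one or more lines of sequence. A minimal sketch of reading such text is given below. The record names and sequences in the example are invented for illustration, and the parser does no validation of the residues.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its joined sequence."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical example input, not a real database entry.
example = """>example_seq_1 illustrative record
GATTACAGATTACA
GATTACA
>example_seq_2 illustrative record
CCTTGGAA
"""

for name, sequence in parse_fasta(example).items():
    print(name, len(sequence))
```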

PROGRESSIVE ALIGNMENT

It is the most commonly used approach to multiple sequence alignment. It speeds up the alignment of multiple sequences through a multistep process. It first conducts a pairwise alignment for each possible pair of sequences using the Needleman-Wunsch algorithm and records the similarity scores from these pairwise comparisons.

•The scores are then converted into evolutionary distances to generate a distance matrix for all the sequences involved.
•A guide tree is then built from the distance matrix using the neighbor-joining method.
•The two most closely related sequences are aligned first and reduced to a consensus sequence.
•In the next step, the next closest sequence according to the guide tree is aligned with the consensus sequence using dynamic programming, and this is repeated until all sequences have been added.
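The first stages of the procedure (all-against-all pairwise comparison, conversion to distances, and choosing which pair to align first) can be sketched as follows. This is a simplified illustration under stated assumptions: similarity is the largest number of identical positions any global alignment can have (match=1, mismatch=0, gap=0), distance is 1 minus that count divided by the longer sequence length, and the neighbor-joining tree itself is not built. The sequence names are hypothetical.

```python
from itertools import combinations


def max_identities(a, b):
    """Largest number of identical aligned positions over all global alignments."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(max(prev[j - 1] + (ca == cb), prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def distance_matrix(seqs):
    """Return a nested dict of pairwise distances (assumed: 1 - fractional identity)."""
    names = list(seqs)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        identity = max_identities(seqs[x], seqs[y]) / max(len(seqs[x]), len(seqs[y]))
        dist[x][y] = dist[y][x] = 1.0 - identity
    return dist


def closest_pair(dist):
    """The pair with the smallest distance, i.e. the pair to align first."""
    return min(combinations(list(dist), 2), key=lambda p: dist[p[0]][p[1]])


# Hypothetical toy sequences.
seqs = {"seqA": "GATTACA", "seqB": "GATTATA", "seqC": "CCTTGGA"}
dist = distance_matrix(seqs)
print(closest_pair(dist))
```

Published programs derive the distances from scored alignments with substitution matrices, so the numbers here are only meant to show the shape of the computation.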

ITERATIVE ALIGNMENT

•It is based on the idea that an optimal solution can be found by repeatedly modifying existing suboptimal solutions.
•The procedure starts by producing a low-quality alignment and gradually improves it by iterative realignment through well-defined procedures until no further improvement in the alignment can be achieved.

The alignment is performed through two sets of iterations.
1.Outer iteration: an initial random alignment is generated and used to derive a UPGMA tree.
2.Inner iteration: the sequences are randomly divided into two groups, which are then aligned to each other.
The process is repeated over many cycles until there is no further improvement in the overall alignment score.
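Deciding whether a cycle has improved the alignment requires an overall alignment score. One common choice, assumed here for illustration, is the sum-of-pairs score: every column is scored by adding up the scores of all residue pairs in it. The score values and the two toy alignments below are made up for the sketch.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-1):
    """Sum-of-pairs score of an alignment given as equal-length gapped strings."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of the alignment must have the same length")
    total = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            if x == "-" and y == "-":
                continue  # two gaps facing each other are not scored
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


before = ["GATTACA", "GAT-ACA", "GA-TACA"]
after = ["GATTACA", "GAT-ACA", "GAT-ACA"]

# Keep the realigned version only if the overall score went up.
if sum_of_pairs(after) > sum_of_pairs(before):
    print("realignment accepted")
else:
    print("realignment rejected")
```

An iterative method repeats this kind of comparison after each realignment and stops when the score no longer increases.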

DOT MATRIX METHOD (continued)

•If a residue match is found, a dot is placed within the graph.
•Otherwise, the matrix positions are left blank.
•When the two sequences have substantial regions of similarity, many dots line up to form contiguous diagonal lines, which reveal the sequence alignment.
•If there are interruptions in the middle of a diagonal line, they indicate insertions or deletions.
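The diagonal lines can also be located programmatically. The sketch below finds the longest unbroken diagonal run of matches between two sequences; the function name and the example sequences are illustrative, and shorter or interrupted diagonals are not reported.

```python
def longest_diagonal(seq_x, seq_y):
    """Return (length, start_x, start_y) of the longest unbroken diagonal of matches.

    start_x and start_y are 0-based positions in seq_x and seq_y.
    """
    best = (0, 0, 0)
    prev = [0] * (len(seq_x) + 1)
    for i, ry in enumerate(seq_y, start=1):
        cur = [0] * (len(seq_x) + 1)
        for j, rx in enumerate(seq_x, start=1):
            if rx == ry:
                # A dot here extends the diagonal ending at the upper-left neighbour.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], j - cur[j], i - cur[j])
        prev = cur
    return best


length, start_x, start_y = longest_diagonal("AAAGATTACACCC", "TTTGATTACAGGG")
print(length, start_x, start_y)
```

A break in such a run followed by a new run on a neighbouring diagonal is what an insertion or deletion looks like in the plot.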

THANK YOU
