

Plant Physiol. (1987) 85, 942-946
0032-0889/87/85/0942/05/$01.00/0

Classification and Structural Comparison of Full-Length cDNAs for Pathogenesis-Related Proteins¹

Received for publication April 21, 1987 and in revised form August 5, 1987

MAKOTO MATSUOKA*, NAOKI YAMAMOTO, YURIKO KANO-MURAKAMI, YOSHIYUKI TANAKA, YOSHIHIRO OZEKI, HISASHI HIRANO, HIROYUKI KAGAWA, MASAHIRO OSHIMA, AND YUKO OHASHI

National Institute of Agrobiological Resources (M.M., H.H., H.K., M.O., Y.Ohashi), Forestry and Forest Products Research Institute (N.Y.), Fruits Tree Research Station (Y.K-M.), National Institute of Agro-Environmental Science (Y.T.), Tsukuba Science City, Ibaraki 305, Japan; and Department of Biology, College of Arts and Sciences, The University of Tokyo, Komaba, Tokyo 153, Japan (Y.Ozeki)

ABSTRACT

Fourteen cDNA clones of pathogenesis-related (PR) proteins, PR1a and PR1b of tobacco, were obtained and classified into six groups based on restriction enzyme maps. To assign the groups to different classes of PR1 proteins, all the clones were partially sequenced and compared with amino acid sequences of PR1a and PR1b. Two of these groups corresponded to PR1a and four to PR1b. The results indicate that there are at least two kinds of PR1a mRNAs and four kinds of PR1b mRNAs. In fact, one cDNA insert hybridized to at least six to seven DNA fragments in restriction enzyme digests of Samsun NN genomic DNA, indicating that the PR1 protein genes exist as a multigene family in the tobacco genome. Two sequences of essentially full-length cDNAs for PR1a and PR1b were determined and compared. The coding sequences of the two cDNAs share 93% homology and the deduced amino acid sequences of the PR1a and PR1b precursors, which are synthesized as larger precursors containing signal peptides, are 91% homologous. The homology of the mature PR1a and PR1b regions is higher than that of the larger precursors, 94% in the nucleotide sequence and 93% in the amino acid sequence, whereas that of the signal peptide regions is 90% in the nucleotide sequence and 80% in the amino acid sequence. The hydropathy patterns and the secondary structures predicted by the Chou-Fasman rules are similar to those of a tomato PR protein in the C-terminal half, which suggests that the C-terminal half is important for the function of PR1 proteins.

Tobacco leaves carrying the N gene produce novel soluble proteins following infection with tobacco mosaic virus (18). The proteins, called PR proteins,² are host-coded proteins induced by viruses, bacteria, and fungi (1). The proteins are also induced by some chemicals, such as polyacrylic acid (4) and salicylic acid (19), and by plant hormones (2). In tobacco cultivar Samsun NN, there are three major PR proteins designated PR1a, PR1b, and PR1c in order of decreasing mobility in nondenaturing gels. The three proteins (PR1 proteins) have similar mol wt, amino acid compositions (1), and antigenicities (8). Furthermore, the PR1 proteins have unique properties such as high solubility at low pH (16), resistance to many kinds of proteases (14), and extracellular location (13). These unique properties are thought to be related to the antiviral function of the PR1 proteins.

We have cloned 14 PR1 protein cDNAs and classified the clones into six groups based on their restriction enzyme maps in order to study the heterogeneity in PR1 proteins. We also determined the sequences of two full-length cDNA clones corresponding to PR1a and PR1b. We discuss the homology between the two sequences and the protein structures of PR1a and PR1b deduced from the sequences.

¹ Supported in part by a grant from the Ministry of Agriculture, Forestry and Fisheries of Japan.

² Abbreviations: PR proteins, pathogenesis-related proteins; bp, base pair.

MATERIALS AND METHODS

Plant Materials. Tobacco plants (Nicotiana tabacum cv Samsun NN) were cultivated in 12.5 cm pots under natural light in a temperature-controlled greenhouse (20-32°C) for 2 to 3 months after sowing.

Materials. Restriction enzymes and other enzymes were obtained from Takara Shuzo Co., Kyoto. [α-32P]dCTP and [35S]methionine were obtained from Amersham Japan. Synthetic deoxyoligonucleotides were obtained from Nikkaki Co., Tokyo. Preparation of anti-PR1a serum was described previously (12).

Construction of cDNA Library. Poly(A+) RNA was prepared from TMV-infected or salicylic acid-treated tobacco leaves as previously described (9). The poly(A+) RNA was fractionated by 5 to 20% sucrose density gradient centrifugation. Fractions containing PR1 protein mRNAs, as revealed by in vitro translation, were used for the construction of a cDNA library. Double-stranded cDNAs were synthesized by the method of Gubler and Hoffman (5), cloned into the PstI site of pUC8 by the dG:dC homopolymer tailing method, and transformed into competent Escherichia coli cells by the method of Hanahan (6).

Sequencing. The cDNA inserts were sequenced by the dideoxynucleotide chain termination method (10). Restriction fragments of cDNA inserts were cloned into the appropriate sites of M13 mp18 or mp19, and purified single-stranded DNAs were sequenced using both the universal primer (Takara Shuzo Co., Kyoto) and synthetic oligonucleotides complementary to specific regions of the cDNA inserts. Sequence data were analyzed using the computer programs of Software Development Co., Japan. The purification of PR1a and PR1b was described previously (12). The amino acid sequences of purified PR1a and PR1b were determined with an automated gas phase chromatograph.

Southern Hybridization Analysis of Tobacco DNA. Leaf tissues from tobacco were homogenized in 20 mM Tris-HCl buffer (pH 8.0) containing 0.285 M sucrose, 2 mM MgCl2, and 0.1 M Na N,N-diethyldithiocarbamate. Nuclei were collected by centrifugation (12,000g) for 5 min and suspended in 50 mM Tris-HCl buffer (pH 8.0) containing 1.5% sarkosyl and 20 mM EDTA. Extracted DNAs were purified by CsCl density gradient centrifugation. DNAs were digested with restriction enzymes, separated by electrophoresis in a 0.7% agarose gel, and transferred to a nitrocellulose filter. Hybridization with the labeled pPR1183 cDNA insert was carried out in 50% formamide containing 5× SSC, 6× Denhardt's solution, 0.5% SDS, and 100 µg/ml calf thymus DNA at 42°C for 20 h. The filter was washed with 0.1× SSC, 0.5% SDS at 45°C for 3 h.

RESULTS AND DISCUSSION

Isolation and Classification of cDNA Clones for PR1 Proteins. Poly(A+) RNA from TMV-infected tobacco leaves was fractionated by sucrose gradient centrifugation to increase the proportion of PR1 protein cDNA clones in the cDNA library. PR1 protein mRNA content in each fraction was determined by in vitro translation. The PR1 protein mRNAs sedimented at approximately 10 S (data not shown). The partially purified mRNAs were used for the construction of a cDNA library as described in "Materials and Methods." We used synthetic oligonucleotides as hybridization probes to screen for clones containing PR1 protein cDNAs. The mixed probes were 16 kinds of 17-mers derived from the internal amino acid sequence of PR1b (Glu-Met-Trp-Val-Asp-Glu). Hybridization of 32P-labeled oligonucleotides to the cDNA library of about 1000 colonies led to the identification of 24 putative PR1 protein cDNA clones. The inserts were cut out and subjected to Southern hybridization to confirm their identification and determine the length of the cDNA inserts.
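The figure of 16 kinds of 17-mers follows from the codon degeneracy of the probe peptide. The Python sketch below enumerates the coding-strand candidates under two assumptions that the text does not state: the 17-mer is taken to cover the first five codons plus the first two bases of the last Glu codon, and the standard genetic code is used. The function names are illustrative only, and the probes actually synthesized may have been the complementary strand.

```python
from itertools import product

# Standard-code codons for the residues of Glu-Met-Trp-Val-Asp-Glu.
CODONS = {
    "Glu": ["GAA", "GAG"],
    "Met": ["ATG"],
    "Trp": ["TGG"],
    "Val": ["GTT", "GTC", "GTA", "GTG"],
    "Asp": ["GAT", "GAC"],
}

PEPTIDE = ["Glu", "Met", "Trp", "Val", "Asp", "Glu"]


def degenerate_17mers(peptide):
    """Enumerate coding-strand 17-mers for a six-residue peptide.

    Assumption: the wobble base of the last codon is dropped, so the
    last residue contributes only its first two bases.
    """
    full_codons = [CODONS[aa] for aa in peptide[:-1]]
    last_two = sorted({codon[:2] for codon in CODONS[peptide[-1]]})
    return sorted("".join(parts) for parts in product(*full_codons, last_two))


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


probes = degenerate_17mers(PEPTIDE)
print(len(probes), "candidate 17-mers, e.g.", probes[0])
```

The count is 2 x 1 x 1 x 4 x 2 = 16, and the candidate GAGATGTGGGTCGATGA is the one that matches the pPR1183 sequence at this position in Figure 3.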

Group   Clones
1       1001, 2066, 2166, 2189, 3074
2       2095, 2135
3       3111
4       3137
5       1172, 1183, 2180, 3048
6       3024

FIG. 1. Comparison of restriction enzyme sites in cDNAs for PR1 proteins. Abbreviations are used for the restriction enzymes: E, EcoRI; R, RsaI; and S, ScaI. The scale runs from 0 to 0.8 kb.

FIG. 2. Southern blot analysis of Samsun NN tobacco total DNA hybridized with PR1a cDNA insert. Total DNA, 5 µg, was digested with EcoRI (E), BglII (B), and HindIII (H), separated on a 0.7% agarose gel, and blotted onto a nitrocellulose filter. The filter was hybridized against the 32P-labeled cDNA insert of pPR1183. The numbers on the left side of the panel are size markers in kb (23.1, 9.4, 6.6, 4.4, 2.3, and 2.0).

Fourteen clones contained almost full-sized cDNA inserts coding for PR1 proteins (>600 bp).

The 14 clones were classified into six groups according to restriction enzyme maps (Fig. 1). The maps of groups 1 to 4 were similar to one another, but different from those of groups 5 and 6, which were similar to each other. These results indicate that PR1 protein cDNAs can be classified into two major classes according to their structures. One class is represented by the cDNAs in groups 1, 2, 3, and 4, and group 1 is the most abundant in that class. The other class is represented by the cDNAs in groups 5 and 6, and group 5 is more abundant than group 6.
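The clone counts behind this classification can be tallied directly. The short Python sketch below uses the clone numbers read from the group listing that accompanies Figure 1; the dictionary layout and the class labels "first" and "second" are illustrative choices, not part of the original analysis.

```python
# Clone numbers per restriction-map group, as listed with Figure 1.
GROUPS = {
    1: [1001, 2066, 2166, 2189, 3074],
    2: [2095, 2135],
    3: [3111],
    4: [3137],
    5: [1172, 1183, 2180, 3048],
    6: [3024],
}

# The two major classes described in the text.
CLASSES = {"first": (1, 2, 3, 4), "second": (5, 6)}


def clones_per_class(groups, classes):
    """Count how many clones fall into each class of groups."""
    return {
        name: sum(len(groups[g]) for g in members)
        for name, members in classes.items()
    }


counts = clones_per_class(GROUPS, CLASSES)
largest_group = max(GROUPS, key=lambda g: len(GROUPS[g]))
print(counts, "largest group:", largest_group)
```

The tally gives nine clones in the first class and five in the second, 14 in total, with group 1 the largest group.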

In order to estimate the number of genes encoding PR1 proteins in the Samsun NN tobacco genome, we hybridized the 32P-labeled pPR1183 cDNA insert (group 5) to tobacco genomic DNA digested with EcoRI, BglII, and HindIII under conditions where the 32P-probe should cross-hybridize to all other cDNA groups. The cDNA insert hybridized to six fragments in the BglII and HindIII digests and seven fragments in the EcoRI digest (Fig. 2). Some cDNA inserts (groups 1 and 2) are separated into two fragments by EcoRI digestion, but the fragment on the 3' side is too short to be detected in the Southern blot under these conditions. The results suggest that there are several genes encoding PR1 proteins in Samsun NN tobacco. This estimate may correspond to the presence of six cDNA groups, and each cDNA group may be encoded by a single gene.

Structure of PR1 Protein cDNAs. We determined the entire nucleotide sequence of the cDNA inserts of pPR1183 and pPR2095, which belonged to the second and first classes, respectively (Fig. 3). Both sequences contained a 507 bp open reading frame capable of coding for a polypeptide of 168 amino acids.
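As a quick check on the reading frame, the Python sketch below translates the first 90 nucleotides of the pPR1183 open reading frame (the signal peptide region in Figure 3) with the standard genetic code, and confirms that 168 codons plus a stop codon account for 507 bp. The helper names are illustrative, and the compact codon-table construction is a common idiom rather than anything taken from the paper.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Nucleotides 1-90 of the pPR1183 coding region, copied from Figure 3.
SIGNAL_CDS = (
    "ATG.GGA.TTT.GTT.CTC.TTT.TCA.CAA.TTG.CCT.TCA.TTT.CTT.CTT.GTC."
    "TCT.ACA.CTT.CTC.TTA.TTC.CTA.GTA.ATA.TCC.CAC.TCT.TGC.CGT.GCC"
).replace(".", "")


def translate(cds):
    """Translate a coding sequence codon by codon (one-letter code)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def orf_length(n_residues):
    """Length in bp of an open reading frame including its stop codon."""
    return 3 * (n_residues + 1)


signal_peptide = translate(SIGNAL_CDS)
print(signal_peptide, len(signal_peptide), orf_length(168))
```

The translated stretch is 30 residues long, which agrees with the signal peptide length discussed later in the text.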


5'-noncoding region
pPR1183 (PR1a)  TAGTC (-5)
pPR2095 (PR1b)  ACATTTCTCCTATAGTC (-17)

Coding region of pPR1183 (nucleotide numbers and amino acid residue numbers are given for each row; pPR2095 entries list, in order, the bases and residues that differ in that row)

nt 1-60, residues -30 to -11
ATG.GGA.TTT.GTT.CTC.TTT.TCA.CAA.TTG.CCT.TCA.TTT.CTT.CTT.GTC.TCT.ACA.CTT.CTC.TTA
MET-Gly-Phe-Val-Leu-Phe-Ser-Gln-Leu-Pro-Ser-Phe-Leu-Leu-Val-Ser-Thr-Leu-Leu-Leu
pPR2095: T A C T; Phe MET Phe

nt 61-120, residues -10 to 10
TTC.CTA.GTA.ATA.TCC.CAC.TCT.TGC.CGT.GCC.CAA.AAT.TCT.CAA.CAA.GAC.TAT.TTG.GAT.GCC
Phe-Leu-Val-Ile-Ser-His-Ser-Cys-Arg-Ala-Gln-Asn-Ser-Gln-Gln-Asp-Tyr-Leu-Asp-Ala
pPR2095: A T C A; Ile Ser His

nt 121-180, residues 11 to 30
CAT.AAC.ACA.GCT.CGT.GCA.GAT.GTA.GGT.GTA.GAA.CCT.TTG.ACC.TGG.GAC.GAC.CAG.GTA.GCA
His-Asn-Thr-Ala-Arg-Ala-Asp-Val-Gly-Val-Glu-Pro-Leu-Thr-Trp-Asp-Asp-Gln-Val-Ala
pPR2095: C G A A T A GG

nt 181-240, residues 31 to 50
GCC.TAT.GCG.CAA.AAT.TAT.GCT.TCC.CAA.TTG.GCT.GCA.GAT.TGT.AAC.CTC.GTA.CAT.TCT.CAT
Ala-Tyr-Ala-Gln-Asn-Tyr-Ala-Ser-Gln-Leu-Ala-Ala-Asp-Cys-Asn-Leu-Val-His-Ser-His
pPR2095: A T T C C

nt 241-300, residues 51 to 70
GGT.CAA.TAC.GGC.GAA.AAC.CTA.GCT.GAG.GGA.AGT.GGC.GAT.TTC.ATG.ACG.GCT.GCT.AAG.GCC
Gly-Gln-Tyr-Gly-Glu-Asn-Leu-Ala-Glu-Gly-Ser-Gly-Asp-Phe-MET-Thr-Ala-Ala-Lys-Ala
pPR2095: C C T

nt 301-360, residues 71 to 90
GTT.GAG.ATG.TGG.GTC.GAT.GAG.AAA.CAG.TAT.TAT.GAC.CAT.GAC.TCA.AAT.ACT.TGT.GCA.CAA
Val-Glu-MET-Trp-Val-Asp-Glu-Lys-Gln-Tyr-Tyr-Asp-His-Asp-Ser-Asn-Thr-Cys-Ala-Gln
pPR2095: C C

nt 361-420, residues 91 to 110
GGA.CAG.GTG.TGT.GGA.CAC.TAT.ACT.CAG.GTG.GTT.TGG.CGT.AAC.TCG.GTT.CGT.GTT.GGA.TGT
Gly-Gln-Val-Cys-Gly-His-Tyr-Thr-Gln-Val-Val-Trp-Arg-Asn-Ser-Val-Arg-Val-Gly-Cys

nt 421-480, residues 111 to 130
GCT.AGG.GTT.CAG.TGT.AAC.AAT.GGA.GGA.TAT.GTT.GTC.TCT.TGC.AAC.TAT.GAT.CCT.CCA.GGT
Ala-Arg-Val-Gln-Cys-Asn-Asn-Gly-Gly-Tyr-Val-Val-Ser-Cys-Asn-Tyr-Asp-Pro-Pro-Gly
pPR2095: A C; Lys

nt 481-507, residues 131 to 138
AAT.TAT.AGA.GGC.GAA.AGT.CCA.TAC.TAA
Asn-Tyr-Arg-Gly-Glu-Ser-Pro-Tyr
pPR2095: GTC T C; Val Ile Gln

3'-noncoding region (aligned; hyphens indicate gaps)
pPR1183  TTGAAACGACCTACGTCCATTTCACGTTA-ATATGTATGGATT
pPR2095  TTGAAATGA---ATGTCCATTTCACGTTATATATGTATGGAC-

pPR1183  GTTCTGCTTGATA----TCAAGAACTTAAATAATTGCTCTAAAAAGCAACTTAAAGTCAA--GTATATaGTAATAGTAC
pPR2095  -TTCTGCTTGATATATATAAACAACTTAAATAATTGCACTAAAAAGCAACTTATAGTTAAAAGTATATA----------

pPR1183  TATATATGTAATCCTCT---GAAGTGGATCTATAAAAAGACCAAGTGGTCATAATTAAGGGG---AAAA---ATATG4G
pPR2095  -ATATTTGTAATCCTCTG GAATGGATCTGTAAAAAGTCCAAGTGGTCTTAATTAAGG 2gAGGATATATATGAA

pPR1183  TTGATG4TCAGCTTGATGTATGATCTGATATTATTATGAACACTTTTGTACTCAT (poly A)
pPR2095  T--TCAGCTTGATGTATGATCTGAA (poly A)

FIG. 3. Comparison of the nucleotide sequences of the cDNA inserts of pPR1183 and pPR2095. The nucleotide sequence of the coding region of pPR2095 and its deduced amino acid sequence are shown only where they differ from those of pPR1183, and amino acid sequences determined directly with the automated gas phase chromatograph are shaded. The arrow indicates the end of the signal peptide. Hyphens in the 3'-noncoding region indicate gaps that are necessary to align the two sequences. The nucleotides conserved in both sequences are marked by asterisks. Short direct repeats or inverted repeats near the ends of the deletions are indicated by horizontal arrows.


FIG. 4. Plot of hydropathy index and predicted secondary structures of PR1a and PR1b. The hydrophobic domains of the molecules are graphed above the midpoint line, and the hydrophilic regions are below it. pPR1183 (PR1a) is indicated by a solid line and pPR2095 (PR1b) by a dotted line. The arrow indicates the N terminus of the mature proteins. The secondary structures of the PR1 proteins were predicted by the Chou and Fasman rule. Loops, α-helix; O, β-turn; and A, β-sheet. The amino acid residues are denoted in the single-letter code.
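A hydropathy plot of the kind described in this caption is a sliding-window average of per-residue hydropathy values. The Python sketch below is illustrative only: it assumes the Kyte-Doolittle scale and a window of 7 residues, neither of which is specified in the text, and it uses the PR1a signal peptide and the first 30 mature residues as translated from Figure 3.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the text).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# PR1a residues translated from the pPR1183 sequence in Figure 3.
SIGNAL_PEPTIDE = "MGFVLFSQLPSFLLVSTLLLFLVISHSCRA"
MATURE_N_TERMINUS = "QNSQQDYLDAHNTARADVGVEPLTWDDQVA"


def hydropathy_profile(seq, window=7):
    """Mean hydropathy over each window of consecutive residues."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def mean_hydropathy(seq):
    """Average hydropathy over a whole sequence."""
    return sum(KD[aa] for aa in seq) / len(seq)


signal_profile = hydropathy_profile(SIGNAL_PEPTIDE)
print(round(mean_hydropathy(SIGNAL_PEPTIDE), 2),
      round(mean_hydropathy(MATURE_N_TERMINUS), 2))
```

Under these assumptions the signal peptide averages well above zero and the mature N terminus below it, consistent with a hydrophobic leader followed by a more hydrophilic mature protein.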

The sequence of the pPR1183 cDNA also contained a 5 bp 5'-noncoding region and a 240 bp 3'-noncoding region, and that of pPR2095 contained a 17 bp 5'-noncoding region and a 222 bp 3'-noncoding region. The two sequences were very similar to each other; the coding regions of the two cDNAs shared 93% sequence homology and the amino acid sequences of the two PR1 protein precursors were 91% homologous (Fig. 3).

We compared the deduced amino acid sequences with PR1 protein amino acid sequence data to assign the two clones to the respective PR1 proteins. We determined partial internal amino acid sequences of PR1a and PR1b with the automated gas phase chromatograph (the directly determined sequences are shaded in Fig. 3). These partial sequences of PR1a and PR1b matched completely with the amino acid sequences deduced from the cDNAs of pPR1183 and pPR2095, respectively. The data strongly suggest that pPR1183 (group 5) is a cDNA for PR1a and pPR2095 (group 2) is a cDNA for PR1b. We determined partial nucleotide sequences of the other 12 cDNA clones by the dideoxynucleotide chain termination method using a synthetic oligonucleotide primer (5' CAACAAGACTATTTGGAT 3', base residues 100-117 in Fig. 3). All the clones in the first class (groups 1-4) had sequences between base residues 125 and 219 identical to that of pPR2095, and all the clones in the second class (groups 5 and 6) had sequences between base residues 123 and 210 identical to that of pPR1183. In these regions, several bases and a few amino acids differ between PR1a and PR1b. The results confirm that the first class (groups 1-4) corresponds to cDNAs for PR1b and the second class (groups 5 and 6) to cDNAs for PR1a.
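The position quoted for the sequencing primer can be verified against the pPR1183 coding sequence. The Python sketch below searches the first 120 nucleotides of the open reading frame, copied from Figure 3, for the primer and reports 1-based coordinates; the function name is illustrative.

```python
# Nucleotides 1-120 of the pPR1183 coding region, copied from Figure 3.
CODING_START = (
    "ATG.GGA.TTT.GTT.CTC.TTT.TCA.CAA.TTG.CCT.TCA.TTT.CTT.CTT.GTC."
    "TCT.ACA.CTT.CTC.TTA.TTC.CTA.GTA.ATA.TCC.CAC.TCT.TGC.CGT.GCC."
    "CAA.AAT.TCT.CAA.CAA.GAC.TAT.TTG.GAT.GCC"
).replace(".", "")

PRIMER = "CAACAAGACTATTTGGAT"


def locate(sequence, primer):
    """Return 1-based (start, end) of the first primer match, or None."""
    index = sequence.find(primer)
    if index < 0:
        return None
    return index + 1, index + len(primer)


position = locate(CODING_START, PRIMER)
print(position)
```

The primer matches at positions 100-117, just upstream of the 123-219 window in which the partial sequences were compared, so reads primed there would be expected to cover that window.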

Recently, Cornelissen et al. (3) reported the nucleotide sequence of PR1b and the partial amino acid sequences of PR1a and PR1c. Their sequence of PR1b cDNA was almost identical to the sequence of pPR2095 described above, including the 5'- and 3'-noncoding regions, and the PR1a partial sequence on the C-terminal side was also identical to that of pPR1183, including the 3'-noncoding region. However, the partial amino acid sequence of PR1a on the N-terminal side, reported by Lucas et al. (7), was different from our sequence. They reported the unique amino acid sequence of PR1a, SQVAAYAQNYAPS (amino acid residues 27-37), but we did not find their sequence in any of the five clones corresponding to the PR1a groups (the second class). We directly determined the partial amino acid sequence of purified PR1a (12) in the above region (YASQ, amino acid residues 34-37) with the automated gas phase chromatograph. The sequence obtained is the same as the deduced amino acid sequence but is not the same as their sequence. This disagreement may result from differences in tobacco cultivars.

PR1 proteins are synthesized as large precursors containing signal peptides which are composed of 30 amino acid residues (15). The mol wt of the precursors of PR1a and PR1b were calculated to be 18,600 and 18,500, respectively, and those of mature PR1a and PR1b were calculated to be 15,300 and 15,100, respectively. The homology of the mature protein region was 94% in amino acid sequence and 94% in nucleotide sequence. On the other hand, that of the signal peptide region was 80% in the amino acid sequence and 90% in the nucleotide sequence. These results show that the frequency of amino acid exchange caused by one nucleotide exchange is higher in the signal peptide region than in the mature protein region. In fact, an exchange of eight bases in the signal peptide region results in six amino acid exchanges (6/8 = 75%), but in the mature protein region, 26 base exchanges result in eight amino acid exchanges (8/26 = 31%). Interestingly, amino acid exchanges in the mature protein region mainly occur at acidic or basic amino acid residues. Three acidic amino acid residues, Asp, Glu, and Glu, in PR1a (amino acid residues 27, 59, and 135, respectively) are exchanged to Asn, Gln, and Gln in PR1b, respectively, and basic amino acid exchanges occur at residue 114 (Gln→Lys) and residue 133 (Arg→Ile). The amino acid exchanges should account for the difference in the isoelectric points of PR1a and PR1b, and PR1a should be more acidic than PR1b. In fact, the isoelectric points in 9 M urea of PR1a and PR1b are pH 4.3 and 4.8, respectively (8).

Comparison of the 3'-Noncoding Regions of PR1 Protein cDNAs. In contrast to the coding region, the sequences in the 3'-noncoding region of the cDNAs for PR1a and PR1b were more diverse. By allowing base insertions or deletions, the 3'-noncoding regions of the two cDNAs can be aligned to show maximum base matching (Fig. 3). In all deletions longer than 3 bp, except for one case, there are short direct repeats or inverted repeats at the deletion points in the other sequence. The deletion pattern at the direct repeat elements in the other sequence is similar to that observed between sporamin genes in sweet potato, and may be generated by slipped mispairing of tandem repeats during DNA replication (11).

Hydropathy Analysis and Secondary Structure of PR1 Proteins. Although the function of PR1 proteins has not been well established, the unique properties of PR1 proteins, such as their high solubility in low pH solution (pH 3) and resistance to many kinds of proteases, may be related to the physiological function of PR1 proteins (17). Therefore, we examined the hydropathy profiles and predicted the secondary structures of PR1a and PR1b (Fig. 4). The N-terminal leader sequences of PR1a and PR1b show the typical structure of signal peptides. For instance, they are composed of three highly hydrophobic core regions and a hydrophilic β-turn structure near the C terminus.

The mature proteins contain six regions (i.e. positions 58-68, 73-78, 93-96, 100-104, 119-129, and 147-153) in which hydrophobic amino acid residues predominate. Three of the hydrophobic regions (second, fifth, and sixth) had predominantly β-sheet structure. The third and fourth hydrophobic regions were calculated to have α-helix structures. The predictions are similar to the case of the tomato PR protein p14 (7). The tomato p14 protein has two highly hydrophobic domains predicted as β-sheet in the C-terminal region and one highly hydrophobic domain predicted as α-helix structure in the internal region. This similarity suggests that these hydrophobic regions may play some important role in the function of PR1 proteins.
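The substitution arithmetic quoted earlier in this section for the signal peptide and mature protein regions can be reproduced from the exchange counts. The Python sketch below is a back-of-the-envelope check, assuming region lengths of 30 and 138 residues (168 minus the 30-residue signal peptide) and three nucleotides per residue; the variable and function names are illustrative. The signal-peptide nucleotide value comes out near 91%, slightly above the 90% quoted, so the exact counting convention used in the paper is not fully recoverable from the text.

```python
# Exchange counts quoted in the text; region lengths are derived assumptions.
REGIONS = {
    "signal peptide": {"residues": 30, "aa_changes": 6, "base_changes": 8},
    "mature protein": {"residues": 138, "aa_changes": 8, "base_changes": 26},
}


def percent_identity(length, changes):
    """Percentage of positions that are unchanged."""
    return 100.0 * (length - changes) / length


def replacement_fraction(aa_changes, base_changes):
    """Amino acid exchanges per base exchange, as a percentage."""
    return 100.0 * aa_changes / base_changes


summary = {}
for name, r in REGIONS.items():
    summary[name] = {
        "aa_identity": percent_identity(r["residues"], r["aa_changes"]),
        "nt_identity": percent_identity(3 * r["residues"], r["base_changes"]),
        "aa_per_base": replacement_fraction(r["aa_changes"], r["base_changes"]),
    }

for name, values in summary.items():
    print(name, {key: round(value, 1) for key, value in values.items()})
```

The ratio of amino acid exchanges to base exchanges is what distinguishes the two regions: most base changes in the signal peptide alter the encoded residue, whereas most of those in the mature region are silent.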


Acknowledgment-We thank Dr. K. Nakamura of Nagoya University for the critical reading of this manuscript.

LITERATURE CITED

1. ANTONIW JF, CE RITTER, WS PIERPOINT, LC VAN LOON 1980 Comparison of three pathogenesis-related proteins from plants of two cultivars of tobacco infected with TMV. J Gen Virol 47: 79-87

2. ANTONIW JF, J KUEH, DGA WALKEY, RF WHITE 1981 The presence of pathogenesis-related proteins in callus of Xanthi-nc tobacco. Phytopathol Z 101: 179-184

3. CORNELISSEN BJC, RAMH VAN HUIJSDUIJNEN, LC VAN LOON, JF BOL 1986 Molecular characterization of messenger RNAs for pathogenesis-related proteins 1a, 1b and 1c, induced by TMV infection of tobacco. EMBO J 5: 37-40

4. GIANINAZZI S, B KASSANIS 1974 Virus resistance in plants by polyacrylic acid. J Gen Virol 23: 1-9

5. GUBLER U, BJ HOFFMAN 1983 A simple and very efficient method for generating cDNA libraries. Gene 25: 263-269

6. HANAHAN D 1985 Techniques for transformation of E. coli. In DM Glover, ed, DNA Cloning. IRL Press, Arlington, VA, pp 109-135

7. LUCAS J, AC HENRIQUEZ, F LOTTSPEICH, A HENSCHEN, HL SANGER 1985 Amino acid sequence of the 'pathogenesis-related' leaf protein p14 from viroid-infected tomato reveals a new type of structurally unfamiliar proteins. EMBO J 4: 2745-2749

8. MATSUOKA M, Y OHASHI 1984 Biochemical and serological studies of pathogenesis-related proteins of Nicotiana species. J Gen Virol 65: 2209-2215

9. MATSUOKA M, S ASOU, Y OHASHI 1985 Transcriptional step is necessary for induction of pathogenesis-related proteins. Proc Jpn Acad 61: 486-489

10. MESSING J 1983 New M13 vectors for cloning. Methods Enzymol 101: 20-78

11. MURAKAMI S, T HATTORI, K NAKAMURA 1986 Structural differences in full-length cDNAs for two classes of sporamin, the major soluble proteins of sweet potato tuberous roots. Plant Mol Biol 7: 343-355

12. OHASHI Y, M MATSUOKA 1985 Synthesis of stress proteins in tobacco leaves. Plant Cell Physiol 26: 473-480

13. PARENT J-G, A ASSELIN 1984 Detection of pathogenesis-related proteins (PR or b) and of other proteins in the intercellular fluid of hypersensitive plants infected with tobacco mosaic virus. Can J Bot 62: 564-569

14. PIERPOINT WS 1983 The major proteins in extracts of tobacco leaves that are responding hypersensitively to virus-infection. Phytochemistry 22: 2691-2697

15. VAN HUIJSDUIJNEN RAMH, BJC CORNELISSEN, LC VAN LOON, JH VAN BOOM, M TROMP, JF BOL 1985 Virus-induced synthesis of messenger RNAs for precursors of pathogenesis-related proteins in tobacco. EMBO J 4: 2167-2171

16. VAN LOON LC 1976 Specific soluble leaf proteins in virus-infected tobacco plants are not normal constituents. J Gen Virol 30: 375-379

17. VAN LOON LC 1985 Pathogenesis-related proteins. Plant Mol Biol 4: 111-116

18. VAN LOON LC, A VAN KAMMEN 1970 Polyacrylamide disc electrophoresis of the soluble leaf proteins from Nicotiana tabacum var. "Samsun" and "Samsun NN". Virology 40: 199-211

19. WHITE RF 1979 Acetylsalicylic acid (aspirin) induces resistance to tobacco mosaic virus in tobacco. Virology 99: 410-412

Downloaded from www.plantphysiol.org on February 29, 2020. Copyright © 1987 American Society of Plant Biologists. All rights reserved.