The distribution of α-helix propensity along the polypeptide chain is not conserved in proteins from the same family

Protein Science (1995), 4:1577-1586. Cambridge University Press. Printed in the USA. Copyright © 1995 The Protein Society

VICTOR MUÑOZ, FRANCISCO J. BLANCO, AND LUIS SERRANO
European Molecular Biology Laboratory, Postfach 10.2209, Meyerhofstrasse 1, 69012 Heidelberg, Germany

(RECEIVED February 16, 1995; ACCEPTED May 18, 1995)

Abstract

We address the question of whether the distribution of secondary structure propensities of the residues along the polypeptide chain (referred to here as secondary structure profiles) is conserved in proteins throughout evolution, for the particular case of α-helices. We have analyzed by CD the conformation of peptides corresponding to the five α-helices of two α/β parallel proteins (ComA and Ara). The large α-helical population of peptide ComA-4 detected by CD in aqueous solution has been confirmed by NMR. These proteins are members of the CheY and P21-ras families, respectively, which have been studied previously in the same way (Muñoz V, Jiménez MA, Rico M, Serrano L, 1995, J Mol Biol 245:275-296). Comparison of the helical content of equivalent peptides reveals that protein α-helix propensity profiles are not conserved. Some equivalent peptides show very different helical populations in solution, and this is especially evident in very divergent proteins (ComA and CheY). However, all the peptides analyzed so far adopted a substantial population of helical conformations in the presence of 30% trifluoroethanol, indicating that there could be a conserved minimal requirement for helical propensity.

Keywords: α-helix; protein folding; protein stability; molecular evolution; secondary structure propensities

Natural proteins, namely polypeptide chains with a specific biological function, are affected by two different phenomena in evolution: a random mutagenesis process that changes the amino acid sequence, and natural selection, which constrains the changes within the framework of conserving the functionality. Because protein function is clearly dependent upon the 3D structure of the protein, it is currently assumed that the tendency of natural selection is to maintain the overall structure, especially around the active site. However, a protein may accommodate a large number of mutations without changing its tertiary structure as long as the mutations compensate for their effects on the overall protein stability and the function is retained (Serrano et al., 1993). Only general features of protein structure, such as spacing patterns of hydrophobic and polar residues, seem to be shared by proteins with the same fold. As a result, proteins belonging to the same family could have little in common apart from a scaffold of core secondary structures (Russell & Barton, 1994). All of this indicates that the rules defining a specific 3D structure should not be identified with a strict code, but rather with a set of physicochemical properties common to all the protein sequences that are compatible with the given fold (Bowie et al., 1991). It is then interesting to study these properties in families of sequences to search for patterns that may allow these rules to be identified. Because secondary structure is conserved within protein families, we could ask whether the propensity to populate native-like structures by the secondary structure elements of proteins is also maintained along evolution. Secondary structure propensities are sequence specific, arise from short-range interactions, and contribute, together with tertiary contacts, to the final secondary structure of the protein. A common secondary structure propensity profile for protein families would indicate that there is not only conservation of the structure of the folded state, but also of some characteristics of the unfolded state under native conditions.

Reprint requests to: Victor Muñoz, EMBL, Postfach 10.2209, Meyerhofstrasse 1, 69012 Heidelberg, Germany; e-mail: munoz@embl-heidelberg.de.

This question can be investigated experimentally through the conformational study of peptides corresponding to secondary structure elements. The isolated peptides in aqueous solution will display the secondary structure propensities of these elements, in the absence of tertiary interactions (Dyson & Wright, 1991). Here, we address the particular case of the conservation of α-helix propensities in evolutionarily related proteins. To do so we have compared the α-helical propensities of peptides corresponding to equivalent α-helices of two divergent members from two different families of α/β parallel proteins: the CheY superfamily (Volz, 1993) of response regulator proteins and the ras family of eukaryotic G-proteins (Valencia et al., 1991) (Fig. 1). In each family there is at least one member with a known 3D structure: CheY in the CheY family (Stock et al., 1989; Volz & Matsumura, 1991; Bellsolell et al., 1994; Santoro et al., 1995) and H-ras p21 in the ras family (Milburn et al., 1990; Pai et al., 1990; Schlichting et al., 1990). Both structures have the typical α/β parallel open-sheet fold with five parallel β-strands and five α-helices intercalated, although in H-ras p21 there is an extra antiparallel β-strand. ComA has 16% sequence identity with CheY as calculated from the sequence alignment, and the functional residues are conserved (Stock et al., 1991). Ara is less divergent, sharing 32% sequence identity with H-ras p21 (Valencia et al., 1991). We have analyzed the helical propensities of the fragments corresponding to the putative helical segments of ComA and Ara, by CD. The large helical content found for the peptide ComA-4 has been further confirmed by NMR. The results obtained here are compared to those obtained previously from the analysis of the equivalent CheY and H-ras p21 (p21) peptides (Muñoz et al., 1995).

        TSSHHHHHHHHHH S SGGGHHHHHH SE
P21     KLWVGAGGVGKSALTIQLIQNHFVDEYDPI-EDSYRKQ-VVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLC
Ara     KIWIGDSAVGKSNLLSRYARNEFSANSKATIGVEFQTQSMEIEGKEVKAQIWOTAGQERFRAVTSAYYRGAVGALV
        RFRAVTSAYYRCAVC

        T HHHHHHHHHHHHHHHHHTT S HHHHHHHHHHTT TTHHHHH
P21     NVFAIHTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLAA-RTVESRQAQDLARSYGIPYIETSAKTRQGVEDAF
Ara     IWDTRRTTFESVGRWLDELKIHSDT-TVARMLVGNKCDLNEIRAVSVEEGKALAEEEGLFFVETSALDSTNVKTAF
        AVSYEECKALAEEECLF STNVKTAF

        HHHHHHHHH
P21     YTLVREIRQHK
Ara     EMVILDIYNNV
        EMVILDIYNNV

        S HHHHHHHHHHHHHHT SSHHHHHTTSSSS S HHHHHHHHHHSTTT
CheY    ELKFLWDDFSTMRRIVRNLLKELGFNN-VEEA-EDGVDALNKLQAGGFGFIISDWN~N-~GLELLKTIRADSAM
ComA    MKKILVIMWlPAVMECTKTILETDSNLSVDCLSPEPSEQFIKQHDFSSYDLILMDLNLGGEVNGMELSKQILQENPH
        DHPAYMECTKTILETDSNLS EPSEOFIKOHDFSSY VNGMELSKOILOENPH

        HHHHHHTTS SS HHHHHHHHHHHHHHTT
CheY    SALWLMVTAEAKKENIIAAAQAGASGYWKPFTAATLEEKLNKIFEKLGM
ComA    -HCKIIWTGYEVEDYFEEAIRAGLHGAISKTESKEKITQYIYHVLNGEIL
        EVEDYFEEAIRACLHTESKEKITOYIYHVLNGEIL

Fig. 1. Sequence alignment of Ara with H-ras p21 and of ComA with CheY. The secondary structure, using the method of Kabsch and Sander (1983), of the regions of CheY and H-ras p21 equivalent to the peptides analyzed here, is shown above the sequences. Ara and ComA peptides analyzed in this work are shown underlined below their respective sequences.
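The percent identities quoted for the two alignments (16% for ComA/CheY and 32% for Ara/H-ras p21) are computed from aligned sequences. The following is a minimal sketch of one common way to do this, assuming identity is counted over columns in which neither sequence has a gap; the text does not state which denominator was used, so that convention, the function name `percent_identity`, and the toy sequences are illustrative assumptions rather than the authors' procedure.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, not taken from Fig. 1.
toy_a = "MKK-LVIDD"
toy_b = "MRKALV-ED"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence) give somewhat different numbers for the same alignment, which is worth keeping in mind when comparing identities across studies.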

Results

Significance and estimation of the helical propensities

The nine peptides (Fig. 1) have been analyzed by CD in aqueous solution and in 30% (v/v) TFE mixtures. The analysis in aqueous solution provides information on the real α-helical propensities of the different peptides, but protein fragments usually do not populate the helical conformation to a large extent (typically less than 20% when they have no protected ends; Muñoz & Serrano, 1994). TFE considerably enhances the α-helix population of peptides having some α-helical tendency, as has been described previously (Dyson et al., 1992a; Jiménez et al., 1994), but not in peptides that populate other secondary structure conformations (Dyson et al., 1992b; Blanco et al., 1994). We have used 30% (v/v) TFE only to obtain a second dimension for the comparison, because it highlights low helical tendencies that cannot be detected experimentally in aqueous solution. Quantification of the helical content of natural peptides is not straightforward. The presence of aromatic residues (Chakrabartty et al., 1993) or other secondary structure conformations, such as β-turns (Wishart et al., 1991), can affect the far-UV CD spectrum of a peptide, especially in the region around 222 nm. To diminish these effects, we have used two different methods to estimate the helical content of the different peptides: the ellipticity at 222 nm (Chen et al., 1974) and the R1 ratio (see Materials and methods) (Bruch et al., 1991; Muñoz et al., 1995). Both methods render results that are consistent within the experimental error and that agree with the rank obtained from the position of the absolute ellipticity minimum (Table 1).
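As an illustration of the second method, R1 is defined in this paper as the ellipticity at 193 nm divided by the minimum ellipticity found between 195 and 210 nm. The sketch below computes that ratio from a sampled spectrum. The function name `r1_ratio`, the nearest-point lookup at 193 nm, and both toy spectra are assumptions made for illustration; they are not measured data from this work.

```python
def r1_ratio(wavelengths_nm, ellipticities):
    """R1 = ellipticity at 193 nm / minimum ellipticity within 195-210 nm.

    Assumption: the sampled point closest to 193 nm stands in for 193 nm.
    """
    points = list(zip(wavelengths_nm, ellipticities))
    if not points:
        raise ValueError("empty spectrum")
    theta_193 = min(points, key=lambda p: abs(p[0] - 193.0))[1]
    window = [theta for wl, theta in points if 195.0 <= wl <= 210.0]
    if not window:
        raise ValueError("no data points between 195 and 210 nm")
    return theta_193 / min(window)


# Hypothetical coil-like spectrum: negative at 193 nm, minimum near 198 nm.
coil_wl = [193, 195, 198, 200, 210, 222]
coil_theta = [-12000, -15000, -16000, -14000, -4000, -1000]

# Hypothetical helix-like spectrum: positive at 193 nm, minimum near 208 nm.
helix_wl = [193, 195, 200, 208, 210, 222]
helix_theta = [40000, 30000, 5000, -20000, -19000, -18000]

print(r1_ratio(coil_wl, coil_theta))    # positive for the coil-like toy spectrum
print(r1_ratio(helix_wl, helix_theta))  # negative for the helix-like toy spectrum
```

Because R1 is a ratio of two ellipticities from the same spectrum, an error in peptide concentration scales numerator and denominator equally and cancels, which is the property the Methods section cites as the reason for using it.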

Comparison of α-helical propensities from the ComA and CheY protein fragments

In Figure 2 we show the comparison of the CD spectra in aqueous solution of the peptides corresponding to equivalent helices in CheY and ComA. The same is shown in Figure 3, but in the presence of 30% TFE. The helical content of the ComA peptides in both conditions is indicated in Table 1 and, for comparative purposes, we show the helical content of the CheY peptides in Table 2 (Muñoz et al., 1995). ComA helix one (ComA-1)

Fig. 2. Far-UV CD spectra of the peptides corresponding to the putative α-helices of the ComA protein (closed circles) and the ones corresponding to the five α-helices of CheY (Muñoz et al., 1995) (open circles). A: Helix 1. B: Helix 2. C: Helix 3. D: Helix 4. E: Helix 5. All peptides were analyzed at 500 µM, in 5 mM phosphate buffer, pH 7.0, 5 °C, except for the peptides ComA-4 and ComA-5. The spectrum of peptide ComA-4 (putative helix 4) was obtained at pH 2.5. The spectrum of peptide ComA-5 was obtained at 10 µM.

Fig. 3. Far-UV CD spectra of the peptides corresponding to the putative α-helices of the ComA protein (closed circles) and the ones corresponding to the five α-helices of CheY (Muñoz et al., 1995) (open circles). A: Helix 1. B: Helix 2. C: Helix 3. D: Helix 4. E: Helix 5. All peptides were analyzed at 20 µM, in 5 mM phosphate buffer, pH 7.0, with 30% TFE, at 5 °C.

Table 1. CD data and helical contents of the peptides corresponding to the putative α-helices of ComA and Ara proteins in water and in 30% TFE (a)

Peptide   Solvent   R1 (b)   222 nm (c)   Min (nm) (d)   %Hel R1 (e)   %Hel 222 (f)
ComA-1    Water      0.80      -1,003       197.5           2.0            3.3
          TFE       -0.57      -9,262       -              21             28
ComA-2    Water      1.18         399       <190            0.0            0.0
          TFE       -2.58     -26,203       -              90             63
ComA-3    Water      1.00      -1,167       195.0           0.0            4.0
          TFE       -1.37     -15,683       -              44             48
ComA-4    Water      0.51      -4,435       201.5           8.0           14.0
          TFE       -2.38     -31,343       -              82             95
ComA-5    Water      0.65      -1,219       197.0           5.0            4.0
          TFE       -2.47     -18,786       -              86             55
Ara-2     Water      0.98         610       196.0           0.0            0.0
          TFE       -0.58      -9,262       -              29             28
Ara-3     Water      0.69      -2,455       200.0           4.0            8.0
          TFE       -1.37     -15,683       -              44             46
Ara-4     Water      0.87        -456       199.0           1.5            2.0
          TFE       -1.10     -11,011       -              32             33
Ara-5     Water      0.92      -2,082       198.0           1.0            7.0
          TFE       -0.89     -14,365       -              28             42

(a) First row, data in water. Second row, data in 30% TFE.
(b) R1 is the ratio between the ellipticity at 193 nm and the minimum ellipticity found within the range 195-210 nm.
(c) Mean residue ellipticity at 222 nm, in deg cm² dmol⁻¹.
(d) Position of the minimum. For high helical contents, this is not very meaningful and therefore it is not given for the peptides in 30% TFE.
(e) Helical content determined from the R1 ratio.
(f) Helical content determined from the ellipticity at 222 nm.

shows a CD spectrum typical of a random coil with a small α-helix contribution (Fig. 2A). This is in contrast with the spectrum of the first α-helix of CheY (CheY-A), which exhibits a very significant helical population (Muñoz et al., 1995). Their differences in α-helix propensity are reflected in the comparison of their spectra in 30% TFE. In both cases the α-helical population is enhanced, but to a very different extent (Fig. 3A). Helix two of ComA (ComA-2) presents a far-UV CD spectrum that is not typical of random coil or of a linear combination of random coil and α-helix (Fig. 2B). This spectrum is very similar to that found in the equivalent peptide of CheY (CheY-B), although they do not have any significant sequence identity. Upon addition of 30% TFE, both peptides adopt a well-populated α-helical conformation (see Fig. 3B). The peptides corresponding to the third helix of ComA and CheY also have similar spectra in aqueous solution, corresponding to random coil with a small helical population (Fig. 2C). In the presence of 30% TFE, they readily adopt the helical conformation to a similar extent (Fig. 3C; Table 2). In the case of α-helix four, we found large differences in the CD spectra of the fragments from both proteins (ComA-4 and CheY-D; Fig. 2D). ComA-4 exhibits a very significant α-helix population, whereas CheY-D is fully unstructured. Both peptides are, on the other hand, quite sensitive to 30% TFE (Fig. 3D). The ComA-5 fragment spectrum is characteristic of a random coil with a small amount of α-helix (see Fig. 2E), whereas CheY-E exhibits a significant helical population (Table 2). The analysis in 30% TFE indicates that the α-helix population of the two peptides is greatly enhanced, but to a different extent according to the population shown in water (Table 2).

Table 2. α-Helical content of the peptides corresponding to the α-helices of CheY, ComA, H-ras p21, Ara, and Fxn, in water and in 30% TFE (a)

Protein    Hel 1       Hel 2       Hel 3     Hel 4       Hel 5
CheY       14 (55)     0 (b) (43)  5 (40)    2 (50)      11 (76)
ComA       2.5 (25)    0 (b) (77)  2 (46)    11 (89)     5 (71)
P21-ras    - (c)       0 (31)      6 (58)    10 (41)     7 (60)
Ara        0 (11)      0 (28)      6 (45)    - (c) (32)  4 (45)
Fxn        16 (59)     (d)         2 (40)    13 (56)     8 (37)

(a) The average % helical content ([%R1 + %222 nm]/2) is shown for the peptides corresponding to the α-helices of the five parallel α/β proteins. In parentheses we show the helical content in 30% TFE.
(b) CD spectrum different from that of a random coil or a mixture of random coil and α-helix.
(c) The peptide strongly aggregated.
(d) This has not been analyzed because it is not an α-helix in the protein.
Data from Fxn, P21-ras and CheY have been extracted from Muñoz et al. (1995).
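Footnote a of Table 2 defines each entry as the mean of the two CD-based estimates, ([%R1 + %222 nm]/2). As a worked check, the sketch below recomputes the ComA row from the %Hel columns of Table 1. The dictionary layout and the helper name `mean_helix` are illustrative assumptions; the numbers are the Table 1 entries as they appear in this text.

```python
# (%Hel from R1, %Hel from the ellipticity at 222 nm), copied from Table 1.
COMA_TABLE1 = {
    "ComA-1": {"water": (2.0, 3.3), "TFE": (21.0, 28.0)},
    "ComA-2": {"water": (0.0, 0.0), "TFE": (90.0, 63.0)},
    "ComA-3": {"water": (0.0, 4.0), "TFE": (44.0, 48.0)},
    "ComA-4": {"water": (8.0, 14.0), "TFE": (82.0, 95.0)},
    "ComA-5": {"water": (5.0, 4.0), "TFE": (86.0, 55.0)},
}


def mean_helix(estimates):
    """Average of the R1-based and 222 nm-based helix percentages."""
    from_r1, from_222 = estimates
    return (from_r1 + from_222) / 2.0


for peptide, data in COMA_TABLE1.items():
    water = mean_helix(data["water"])
    tfe = mean_helix(data["TFE"])
    print(f"{peptide}: {water:.2f} in water, {tfe:.2f} in 30% TFE")
```

The recomputed means are close to the ComA entries of Table 2, which appear to be rounded; for ComA-4, for instance, the means are 11.0 in water and 88.5 in 30% TFE against tabulated values of 11 and 89.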

Comparison of α-helical propensities of the Ara and H-ras p21 protein fragments

The CD spectra in aqueous solution of the protein fragments from Ara and p21 are shown in Figure 4. Figure 5 shows the same peptides in 30% (v/v) TFE/water mixtures. The helical contents of the Ara peptides in aqueous solution and in 30% TFE are indicated in Table 1. In Table 2 we show the helical content of the p21 peptides for comparative purposes (Muñoz et al., 1995). The CD analysis of peptides corresponding to α-helix 1 of p21 and Ara has been described previously and we do not show their spectra here. Helix 1 of p21 aggregated, rendering a spectrum typical of a β-sheet structure. Helix 1 of Ara (Ara-1) had a CD spectrum typical of random coil (Muñoz et al., 1995). The CD spectra of the second helix from p21 (p21-B) and Ara (Ara-2) are also characteristic of a random coil (Fig. 4A). The α-helix population induced by TFE is not very high in either p21-B or Ara-2 (Fig. 5A; Table 2). CD spectra of peptides corresponding to the third α-helix in p21 (p21-C) and Ara (Ara-3) are quite similar and indicative of a mixture of random coil with a small helical population (Fig. 4B). The lower ellipticity at 222 nm for the Ara-3 peptide could be due to the presence of a tryptophan (Woody, 1978). Results in 30% TFE, in which the aromatic contribution is relatively smaller and the α-helix population higher, indicate that the peptides could have similar α-helical propensities (Fig. 5B; Table 2). Helix 4 of Ara (Ara-4) presents an unusual CD spectrum (Fig. 4C) that cannot be explained by the presence of aromatic residues because there is only a Phe residue at the C-terminus. Because this peptide aggregates, it is possible that the spectrum of the diluted peptide is not that of a monomeric form. The equivalent helix in p21 (p21-D) has clear signs of α-helical population in aqueous solution. Thus, 30% TFE induces similar α-helical populations in both peptides (Fig. 5C; Table 2). Helix 5 of both proteins (Ara-5 and p21-E) has similar CD spectra in aqueous solution (Fig. 4D) and in 30% TFE (Fig. 5D). The spectrum in water indicates the existence of a random coil population in equilibrium with a small helical population.

Fig. 4. Far-UV CD spectra of the peptides corresponding to the putative α-helices of the Ara protein (closed circles) and those corresponding to the last four α-helices of p21 (Muñoz et al., 1995) (open circles). A: Helix 2. B: Helix 3. C: Helix 4. D: Helix 5. All peptides were analyzed at 500 µM in 5 mM phosphate buffer, pH 7.0, 5 °C, except for peptide Ara-4. The spectrum of peptide Ara-4 (putative helix 4) was obtained at 10 µM.

NMR analysis of the ComA-4 helix

The CD analysis of ComA-4 indicates that its α-helical content is very high for a short protein fragment (15 residues) that does not have chemically protected ends (Muñoz & Serrano, 1994). We have carried out the NMR analysis of this peptide in aqueous solution to corroborate the CD results and to gather information about the distribution of the helical population along the peptide. The existence of a significant α-helix population localized in the central region of the peptide (residues 5-12) is indicated by the presence of several i,i+3 and i,i+4 NOEs and by the chemical shifts of the CαH protons, which are upfield shifted with respect to random coil values (Jiménez et al., 1987). The summary of the NMR results for this peptide is shown in Figure 6.
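The chemical-shift evidence described above rests on a simple difference: the observed CαH shift of each residue minus a random coil reference, with runs of negative (upfield) differences pointing to helical population. A minimal sketch of that bookkeeping follows. The cutoff of -0.1 ppm, the minimum run length of four residues, the function names, and all numerical values are assumptions chosen for illustration; they are not the data of Figure 6 nor a criterion stated by the authors.

```python
def conformational_shifts(observed_ppm, random_coil_ppm):
    """Per-residue difference: observed CaH shift minus random coil value (ppm)."""
    if len(observed_ppm) != len(random_coil_ppm):
        raise ValueError("shift lists must have the same length")
    return [round(obs - rc, 3) for obs, rc in zip(observed_ppm, random_coil_ppm)]


def helical_stretches(delta_ppm, threshold=-0.1, min_length=4):
    """Return 1-based inclusive (start, end) runs where delta <= threshold.

    Assumptions: threshold and min_length are illustrative defaults.
    """
    stretches = []
    start = None
    for i, delta in enumerate(delta_ppm, start=1):
        if delta <= threshold:
            if start is None:
                start = i  # a new upfield-shifted run begins here
        else:
            if start is not None and i - start >= min_length:
                stretches.append((start, i - 1))
            start = None
    if start is not None and len(delta_ppm) - start + 1 >= min_length:
        stretches.append((start, len(delta_ppm)))
    return stretches


# Hypothetical differences for a 15-residue peptide (invented values).
toy_delta = [0.02, -0.03, -0.05, -0.08, -0.15, -0.20, -0.25, -0.30,
             -0.28, -0.22, -0.18, -0.12, -0.05, 0.00, 0.03]
print(helical_stretches(toy_delta))
```

In practice such shift differences are read together with the NOE pattern, as done in the text, since a shift deviation alone can have other causes (for example, nearby aromatic rings).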

Discussion

The main limitation on inferring secondary structure propensities from the conformational analysis of peptides resides in the fact that their conformational properties may depend strongly upon the cutting point. This is especially important if the cutting point interrupts a secondary structure element. Cutting points chosen on the basis of sequence alignments are even more prone to this problem because secondary structures could be shifted in the sequence (Ryden & Hunt, 1993). However, in this case, because the alignments of the CheY (Stock et al., 1989; Volz, 1993) and P21-ras (Valencia et al., 1991) families are by all standards very good and have few insertions or deletions, these shifts should be minimal. Moreover, the peptides have been designed to be longer than the corresponding helices in CheY and P21-ras.

Fig. 5. Far-UV CD spectra of the peptides corresponding to the putative α-helices of the Ara protein (closed circles) and those corresponding to the four α-helices of p21 (Muñoz et al., 1995) (open circles). A: Helix 2. B: Helix 3. C: Helix 4. D: Helix 5. All peptides were analyzed at 20 µM, in 5 mM phosphate buffer, pH 7.0, with 30% TFE, at 5 °C.

The helical contents of the peptides analyzed in this study are in the range expected for fragments of regions corresponding to α-helices in protein 3D structures (from not experimentally detectable up to 25%, excluding those with protected ends that considerably enhance the α-helix population) (Muñoz & Serrano, 1994). By the three criteria utilized, the analysis is discriminative and detects different behaviors for the peptides in aqueous solution. One of the nine peptides, ComA-4, shows a very significant α-helix content, confirmed by NMR, whereas the rest have smaller or nondetectable helical populations. All the peptides become helical upon addition of 30% TFE. The differences observed in helical contents, although they may not seem relevant in terms of the whole protein, can indeed reflect important differences in free energy due to the cooperativity of the helix/coil transition.
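To give a rough sense of scale for the free-energy argument, the sketch below converts a helix fraction into a free energy using a two-state (all-helix versus all-coil) approximation, ΔG = -RT ln[f/(1 - f)]. This is a deliberate simplification and an assumption of this example: the helix/coil transition in peptides is not two-state, and the paper itself invokes cooperativity rather than this formula. The function name `two_state_dg`, the temperature of 278.15 K (the 5 °C used for the CD measurements), and the example fractions are illustrative choices.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def two_state_dg(helix_fraction, temperature_k=278.15):
    """Free energy of helix formation (kcal/mol) in a two-state approximation.

    Assumption: only a fully helical and a fully coiled state are populated.
    Positive values mean the coil is favored.
    """
    if not 0.0 < helix_fraction < 1.0:
        raise ValueError("helix_fraction must lie strictly between 0 and 1")
    equilibrium_constant = helix_fraction / (1.0 - helix_fraction)
    return -R_KCAL * temperature_k * math.log(equilibrium_constant)


# Example fractions of the order reported in Table 2 (2.5% and 14%).
for fraction in (0.025, 0.14):
    print(f"f = {fraction:.3f}: dG = {two_state_dg(fraction):.2f} kcal/mol")
```

Under this crude model, a change from a few percent to about 14% helix corresponds to roughly 1 kcal/mol, which illustrates why small measured populations can still matter energetically; a proper treatment would use a helix/coil transition model rather than two states.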

Comparison of the helical content of peptides from proteins belonging to the CheY superfamily (CheY and ComA) indicates that the helical propensity profile is not conserved in this family. The peptide with the highest helical content corresponds to helix 1 in CheY, whereas in ComA the maximal helical content corresponds to helix 4. The differences are even clearer because the reciprocal peptides of each protein are those with the least helical content (helix 4 of CheY and helix 1 of ComA). On the other hand, the helical behavior of peptides in the P21 family (P21-ras and Ara) is more closely related. All the equivalent peptides in these two proteins show similar helical contents in aqueous solution. These results are further supported by the data in 30% TFE. Thus, the main conclusion from this work is that formation of an α/β parallel protein does not require a specific α-helix propensity profile. Less divergent proteins (H-ras p21 and Ara; 32% sequence identity) could have more similar secondary structure patterns by virtue of their sequence similarity. Very divergent proteins (CheY and ComA; 16% sequence identity), although deriving from a common ancestor, have already lost their similarities in secondary structure propensity profiles.

Fig. 6. NMR analysis of the peptide ComA-4 (sequence EVEDYFEEAIRAGLH). A: NOE summary. Asterisks indicate NOEs that could not be detected because of signal overlap. B: Difference in chemical shifts of the CαH protons with respect to random coil values (Bundi & Wüthrich, 1979).

Analysis of how protein 3D structures are constructed provides an explanation for the results found in this work. Amino acids in secondary structure elements are involved in interactions both with residues within the element (short-range interactions) and with residues very far away in sequence (long-range interactions). The first type produces the secondary structure propensities, but the second type also contributes, by providing an extra energy input, to the definition of the final secondary structure in the protein. Maintaining the secondary structure propensities throughout evolution would imply that mutations affecting the propensity of a specific secondary structure element are compensated only by mutations in the same secondary structure element (close in sequence). However, what matters in terms of keeping the biological function is the 3D structure; thus, mutations that reduce secondary structure propensities may be compensated by mutations in other regions of the polypeptide chain. The result is that secondary structure propensities degenerate and are not conserved in proteins from the same family.

Although a specific α-helix propensity profile does not seem necessary for obtaining a definite 3D structure, we could ask whether there are some common characteristics in all the α/β parallel proteins analyzed in this work and in a previous one (CheY, ComA, P21-ras, Ara, and flavodoxin; Muñoz et al., 1995). Common characteristics should illustrate what the limits are on α-helix propensities that still permit the build up of an α/β parallel protein. In this regard, all the peptides we have studied showed significant helical populations upon addition of 30% TFE (Table 1). TFE enhances α-helix populations in peptides by displacing a preexisting equilibrium toward formation of α-helix (Nelson & Kallenbach, 1989) that is independent of tertiary interactions (some peptides may have a propensity to populate a secondary structure different from the one they adopt in the native protein; Muñoz & Serrano, 1994). Therefore, peptides without any significant helical propensity do not become helical even upon addition of very high amounts of TFE (Dyson et al., 1992b). This indicates that the peptides studied here have some non-negligible α-helix propensity, although in some cases below the experimentally observable limits. However, as we discussed above, these small helical propensities reflect large differences in free energy and could be quite important in determining the amount of energy that has to be provided by tertiary contacts in order to form an α-helix in the native structure. Another interesting observation is that, in all the proteins analyzed, there is at least one protein fragment that populates the helical conformation more than 10% (Table 2). Finally, it is also remarkable that the peptides corresponding to α-helix 2 always have the least α-helix propensity in the five α/β proteins studied so far (flavodoxin has a loop at the equivalent positions of helix 2 in CheY and p21) (Table 2). This supports the idea previously proposed that there could be some negative selection for this α-helix (Muñoz et al., 1995).

Summary

The specific α-helical propensity profile is not conserved in proteins from the same family, indicating that it is not required to determine the final 3D structure. However, comparison of the helical content of helical fragments from five different proteins reveals that there could be some fuzzy rules for the α-helix propensity patterns in order to build up an α/β parallel protein.

Materials and methods

Peptide design

To identify the secondary structure elements of ComA and Ara, we have used sequence alignments (Stock et al., 1989; Valencia et al., 1991) because their 3D structure is not known. The peptides have been designed to encompass the regions of ComA and Ara proteins equivalent to the α-helices of CheY and p21, respectively. These sequences have been elongated following the criteria defined previously for CheY and p21 proteins (Muñoz et al., 1995). The amino acid sequences of ComA aligned with CheY and Ara aligned with H-ras p21 are shown in Figure 1, together with the peptide sequences.

Peptide synthesis

Solid-phase synthesis of the peptides was performed on an Abimed AMS422 multiple peptide synthesizer using Fmoc chemistry and PyBOP activation at 0.025 mmol scale. After synthesis was completed, protecting groups were removed and the peptide chains were cleaved from the resin with a mixture of TFA/phenol/EDT/thioanisole/water (10 mL/0.75 g/0.25 mL/0.5 mL/0.5 mL) for 3 h. The peptides were purified on a Vydac C-18 reverse-phase column (20 × 250 mm, 0.01 mm particle) at a flow rate of 10 mL/min. Solvent A was water containing 0.1% TFA and solvent B was 70% acetonitrile, 0.1% TFA in water. Peptide homogeneity (>98%) was determined by HPLC using an acetonitrile gradient of 0.7% per min. Peptide composition was confirmed by amino acid analysis and the molecular weight checked by matrix-assisted laser desorption time-of-flight mass spectrometry.

CD analysis

Far-UV CD spectra were recorded on a Jobin-Yvon CD6 or on a Jasco 710 instrument as previously indicated (Jiménez et al., 1994). The concentrations of the different peptides were determined by amino acid analysis as described in Jiménez et al. (1994).

Determination of the helical percentage from far-UV CD spectra

To estimate the helical population of the different peptides, we have used the mean-residue ellipticity at 222 nm, as indicated in Equation 1 (Chen et al., 1974):

$$\%\,\mathrm{Helix} = \frac{\theta_{\mathrm{obs},222} \times 100}{\theta_{222} \times (1 - 2.57/l)} \qquad (1)$$

where $\theta_{\mathrm{obs},222}$ is the mean-residue ellipticity observed experimentally at 222 nm, $\theta_{222}$ is the ellipticity of a peptide of infinite length with 100% helix population, taken as -39,500 deg cm² dmol⁻¹, and $l$ is the peptide length or, more precisely, the number of peptide bonds. Because this method is extremely sensitive to the peptide concentration, we have considered the use of the parameter R1, which is independent of it (Bruch et al., 1991). R1 is the ratio between the ellipticity at 193 nm (typically the maximum of an α-helix) and the minimum ellipticity found within the range 195-210 nm. This parameter changes as a function of the helix content and correlates well with the helix estimation from the ellipticity at 222 nm (Muñoz et al., 1995). Quantitative far-UV CD estimations of helix populations of the different peptides analyzed in this paper have been made on the basis of both methods. The position of the absolute ellipticity minimum has also been used to qualitatively rank the helical content of the different peptides.
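As an illustration only, the two CD estimators described above can be sketched in Python. The function names, the dictionary layout of the spectrum, and the numerical inputs are hypothetical and are not taken from the paper; only the constants -39,500 deg cm² dmol⁻¹ and 2.57, and the wavelength ranges, come from the text.

```python
def helix_percent(theta_obs_222, n_peptide_bonds,
                  theta_inf_222=-39500.0, k=2.57):
    """Percent helix from the mean-residue ellipticity at 222 nm (Equation 1).

    theta_obs_222   : observed mean-residue ellipticity (deg cm^2 dmol^-1)
    n_peptide_bonds : peptide length l, counted as number of peptide bonds
    theta_inf_222   : ellipticity of an infinite, 100% helical peptide
    k               : chain-length correction constant at 222 nm
    """
    if n_peptide_bonds <= k:
        raise ValueError("peptide too short for the chain-length correction")
    # Ellipticity expected for a fully helical peptide of this length.
    theta_full = theta_inf_222 * (1.0 - k / n_peptide_bonds)
    return 100.0 * theta_obs_222 / theta_full


def r1_ratio(spectrum):
    """R1: ellipticity at 193 nm divided by the minimum ellipticity
    found between 195 and 210 nm.

    spectrum : dict mapping wavelength in nm (int) to ellipticity.
    Assumes (hypothetically) that 193 nm is one of the sampled wavelengths.
    """
    at_193 = spectrum[193]
    window = [v for wl, v in spectrum.items() if 195 <= wl <= 210]
    if not window:
        raise ValueError("no data points between 195 and 210 nm")
    return at_193 / min(window)


if __name__ == "__main__":
    # Hypothetical 20-bond peptide with an observed value of -10,000.
    print(round(helix_percent(-10000.0, 20), 2))
    # Hypothetical coarse spectrum; the values are made up for illustration.
    print(r1_ratio({193: 40000.0, 200: -5000.0, 208: -20000.0, 222: -18000.0}))
```

Because R1 is a ratio of two ellipticities from the same spectrum, any error in the peptide concentration cancels, which is consistent with the reason given above for using it alongside Equation 1.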

¹H NMR spectroscopy

All the two-dimensional spectra were acquired in the phase-sensitive mode using the time-proportional phase incrementation (TPPI) technique (Marion & Wüthrich, 1983) with presaturation of the water signal. COSY (Aue et al., 1976) and NOESY (Kumar et al., 1980) spectra were recorded using standard phase-cycling sequences. Short mixing times (200 ms) were used in the NOESY experiments to avoid spin diffusion effects. TOCSY (Bax & Davis, 1985) spectra were acquired using the standard MLEV-17 spinlock sequence and an 80-ms mixing time. The size of the acquisition data matrix was 2,048 × 512 words in f2 and f1, respectively, and before Fourier transformation the 2D data matrix was multiplied by a phase-shifted square-sine bell window function in both dimensions and zero-filled to 2K × 2K. The phase shift was optimized for every spectrum. The complete assignment of the ¹H-NMR spectra was performed by using standard 2D sequence-specific methods (Wüthrich et al., 1982).

Aggregation test

All the peptides were water soluble and did not massively aggregate. Solubility does not rule out the formation of dimers or bigger oligomers. The association of peptides to form dimers or oligomers can be detected by the dependence of the NMR, or far-UV CD, spectra on peptide concentration. The effect of the peptide concentration on the far-UV CD spectrum was determined by acquiring spectra at two different concentrations, 500 and 10 µM, in 0.1- and 5-mm cuvettes, respectively. In those cases for which we found a change in the spectral properties upon dilution, we have explored various pHs to find conditions where aggregation did not occur. For peptides Ara-4 and ComA-5, we have not found any conditions in which the far-UV CD spectrum was independent of the peptide concentration. For these two peptides, we have utilized their CD spectra at 10 µM. Peptide ComA-4 showed changes in its CD spectrum upon dilution at pH 7, but not at pH 2.5, which has been chosen for the CD and NMR analysis. (The spectra of the diluted peptide at pH 2.5 and 7.0 are very similar; data not shown.)
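A minimal Python sketch of the dilution test follows. It rests on two assumptions that the text does not state: that the two cuvettes were chosen so that concentration × path length is the same in both measurements, and that "independent of concentration" can be judged with a simple tolerance. The function names and the 10% tolerance are hypothetical.

```python
def optical_load(conc_uM, path_mm):
    """Concentration x path length (uM * mm), proportional to the raw CD signal."""
    return conc_uM * path_mm


def is_concentration_independent(theta_conc, theta_dil, rel_tol=0.10):
    """Compare mean-residue ellipticities measured at matched wavelengths.

    theta_conc : values for the concentrated sample
    theta_dil  : values for the diluted sample
    rel_tol    : hypothetical tolerance, as a fraction of the largest
                 absolute signal in the concentrated spectrum
    """
    if not theta_conc or len(theta_conc) != len(theta_dil):
        raise ValueError("spectra must be non-empty and of equal length")
    scale = max(abs(v) for v in theta_conc)
    if scale == 0:
        raise ValueError("concentrated spectrum has no signal")
    worst = max(abs(a - b) for a, b in zip(theta_conc, theta_dil))
    return worst <= rel_tol * scale


if __name__ == "__main__":
    # 500 uM in a 0.1-mm cell and 10 uM in a 5-mm cell give the same product.
    print(optical_load(500, 0.1), optical_load(10, 5))
    # Made-up ellipticities at two wavelengths, for illustration only.
    print(is_concentration_independent([-10000.0, -8000.0], [-9800.0, -8100.0]))
```

With both conditions giving the same product (about 50 µM·mm), the raw signal levels are comparable, so a difference between the normalized spectra is more plausibly due to association than to instrument sensitivity.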

References

Aue WP, Bartholdi E, Ernst RR. 1976. Two-dimensional spectroscopy. Application to nuclear magnetic resonance. J Chem Phys 64:2229-2246.

Bax A, Davis DG. 1985. MLEV-17 based two-dimensional homonuclear magnetization transfer spectroscopy. J Magn Reson 65:355-360.

Bellsolell L, Prieto J, Serrano L, Coll M. 1994. Magnesium binding to the bacterial chemotaxis protein CheY results in large conformational changes involving its functional surface. J Mol Biol 238:489-495.

Blanco F, Rivas G, Serrano L. 1994. A short linear peptide that folds into a native stable β-hairpin in aqueous solution. Nature Struct Biol 1:584-590.

Bowie JU, Lüthy R, Eisenberg D. 1991. A method to identify protein sequences that fold into a known three-dimensional structure. Science 253:164-170.

Bruch MD, Dhingra MM, Gierasch LM. 1991. Side chain backbone hydrogen bonding contributes to helix stability in peptides derived from an α-helical region of carboxypeptidase A. Proteins Struct Funct Genet 10:130-139.

Bundi A, Wüthrich K. 1979. ¹H-NMR parameters of the common amino acid residues measured in aqueous solution of the linear tetrapeptides H-Gly-Gly-X-Ala-OH. Biopolymers 18:285-298.

Chakrabartty A, Kortemme T, Padmanabhan S, Baldwin RL. 1993. Aromatic side chain contribution to far-ultraviolet circular dichroism of helical peptides and its effect on measurement of helix propensities. Biochemistry 32:5560-5565.

Chen YH, Yang JT, Chau KH. 1974. Determination of the helix and β form of proteins in aqueous solution by circular dichroism. Biochemistry 13:3350-3359.

Dyson HJ, Merutka G, Waltho JP, Lerner RA, Wright PE. 1992a. Folding of peptide fragments comprising the complete sequence of proteins. Models for initiation of protein folding. I. Myohemerythrin. J Mol Biol 226:795-818.

Dyson HJ, Sayre JR, Merutka G, Shin HC, Lerner RA, Wright PE. 1992b. Folding of peptide fragments comprising the complete sequence of proteins. Models for initiation of protein folding. II. Plastocyanin. J Mol Biol 226:819-835.

Dyson HJ, Wright PE. 1991. Defining solution conformations of small linear peptides. Annu Rev Biophys Chem 20:519-538.


Jiménez MA, Muñoz V, Rico M, Serrano L. 1994. Helix stop and start signals in peptides and proteins: The capping box does not necessarily prevent helix elongation. J Mol Biol 242:487-496.

Jiménez MA, Nieto JL, Herranz J, Rico M, Santoro J. 1987. ¹H NMR and CD evidence of the folding of the isolated ribonuclease 50-61 fragment. FEBS Lett 221:320.

Kabsch W, Sander C. 1983. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:2577-2637.

Kumar A, Ernst RR, Wüthrich K. 1980. A two-dimensional nuclear Overhauser enhancement (2D NOE) experiment for the elucidation of complete proton-proton cross-relaxation networks in biological macromolecules. Biochem Biophys Res Commun 95:1-6.

Marion D, Wüthrich K. 1983. Application of phase sensitive two-dimensional correlated spectroscopy (COSY) for measurements of ¹H-¹H spin-spin coupling constants in proteins. Biochem Biophys Res Commun 113:967-974.

Milburn MV, Tong L, De Vos AM, Brünger A, Yamaizumi Z, Nishimura S, Kim SH. 1990. Molecular switch for signal transduction: Structural differences between active and inactive form of protooncogenic ras proteins. Science 247:939-945.

Muñoz V, Jiménez MA, Rico M, Serrano L. 1995. Structural analysis of peptides encompassing all the α-helices of three α/β parallel proteins: Che-Y, flavodoxin & P21-ras: Implications for α-helix stability and the folding of α/β parallel proteins. J Mol Biol 245:275-296.

Muñoz V, Serrano L. 1994. Elucidating the folding problem of helical peptides in solution. Nature Struct Biol 1:399-409.

Nelson JW, Kallenbach NR. 1989. Persistence of the alpha-helix stop signal in the S-peptide in TFE solutions. Biochemistry 28:5256-5261.

Pai EF, Krengel U, Petsko GA, Goody RS, Kabsch W, Wittinghofer A. 1990. Refined crystal structure of the triphosphate conformation of H-ras p21 at 1.3 Å. EMBO J 9:2351-2359.

Russell RB, Barton GJ. 1994. Structural features can be unconserved in proteins with similar folds. An analysis of side-chain contacts secondary structure and accessibility. J Mol Biol 244:332-350.

Ryden LC, Hunt LT. 1993. Evolution of protein complexity: The blue copper-containing oxidases and related proteins. J Mol Evol 36:41-66.

Santoro J, Bruix M, Pascual J, Lopez E, Serrano L, Rico M. 1995. Three-dimensional structure of chemotactic CheY protein in aqueous solution by nuclear magnetic resonance methods. J Mol Biol 247:717-725.

Schlichting I, Almo SC, Rapp G, Wilson K, Petratos K, Lentfer A, Wittinghofer A, Kabsch W, Pai EF, Petsko GA. 1990. Time-resolved X-ray crystallographic studies of the conformational change in H-ras p21 protein on GTP hydrolysis. Nature 345:309-314.

Serrano L, Day T, Fersht AR. 1993. The step-wise mutation of barnase to binase and a procedure for engineering increased stability of proteins: An experimental analysis of the evolution of protein stability. J Mol Biol 233:305-312.

Stock AM, Mottonen JM, Stock JB, Schutt CE. 1989. Three-dimensional structure of CheY the response regulator of bacterial chemotaxis. Nature 344:745-749.

Valencia A, Chardin P, Wittinghofer A, Sander C. 1991. The ras protein family: Evolutionary tree and role of conserved amino acids. Biochemistry 30:4367-4648.

Volz K. 1993. Structural conservation in the CheY superfamily. Biochemistry 32:11741-11753.

Volz K, Matsumura P. 1991. Crystal structure of Escherichia coli CheY refined at 1.7 Å. J Biol Chem 266:15511-15519.

Wishart DS, Sykes BD, Richards FM. 1991. Relationship between nuclear magnetic resonance chemical shift and protein secondary structure. J Mol Biol 222:1423-1431.

Woody RW. 1978. Aromatic side-chain contributions to the far ultraviolet dichroism of peptides and proteins. Biopolymers 17:1471-1467.

Wüthrich K, Wieder G, Wagner G, Braun W. 1982. Sequential resonance assignments as a basis for determination of spatial protein structures by high resolution magnetic resonance. J Mol Biol 155:311-319.