Destabilization and stabilization of proteins

ESSAY

Destabilization and stabilization of proteins

John A. Schellman

Institute of Molecular Biology, University of Oregon, Eugene, OR, USA

Abstract. In Part I the history of progress in the stabilization and destabilization of protein conformations by means of cosolvents is outlined in terms of distinct conceptual steps. In Part II it is shown that a straightforward application of the Kirkwood–Buff theory of solutions leads to formulas for the preferential interaction and the free energy of unfolding, which confirm and generalize the results of Part I.

Introduction

Part I. A succession of concepts

1. Cooperativity

2. Cosolvent interaction

3. Linearity

4. Solvent exchange

5. Excluded volume

6. Summation

Part II

1. The Kirkwood–Buff approach

Acknowledgments

References

Address for correspondence: J. A. Schellman, Institute of Molecular Biology, University of Oregon, Eugene, OR 97405, USA.

Tel.: US-541-434-4231; Fax: US-541-346-5891; E-mail: [email protected]

Quarterly Reviews of Biophysics 38, 4 (2005), pp. 351–361. © 2006 Cambridge University Press. doi:10.1017/S0033583505004099. Printed in the United Kingdom.

First published online 9 March 2006

Introduction

Studies of protein transitions from the folded to the unfolded state and the reverse have been a basic part of protein physical chemistry for well over 50 years. Much of current work is concerned with the ‘protein folding problem’: the search for the mechanism by which proteins spontaneously find their characteristic 3D structures at the atomic level. This problem involves very sophisticated kinetic techniques on the experimental side and elaborate molecular mechanistic modeling on the theoretical side. Nevertheless, thermodynamic studies of the transition process continue to provide core information on the stability of folded proteins. Combined with the studies of mutant sequences they even provide information at the molecular level.

This paper will be limited to a discussion of the thermodynamic aspects of protein transitions. We also restrict our attention to the most ‘popular’ type of transition, i.e. that which involves changes in solvent conditions rather than changes in temperature or pressure, which require a different analysis.

The main purpose will be to introduce a new theoretical approach, while at the same time providing a detailed comparison with older results. Part I will introduce, one by one, the main concepts that have brought the field to its current state; Part II will follow with a brief derivation of a statistical thermodynamic theory, which can then be compared with the historical result.

Part I. A succession of concepts

1. Cooperativity

It was noted early that the unfolding and folding of proteins occurred without the apparent

presence of any intermediates. Clearly there must be kinetic intermediates on the reaction

pathway(s) but their population is so small as to be undetectable. (Transitions that involve the

molten globule as an intermediate will not be considered here.) Only in recent years have

special kinetic methods and traps been devised for perceiving intermediates. Although

cooperativity had been noted before in biological systems, e.g. oxygen binding to hemoglobin,

it was the high-level cooperativity of protein and helix-coil transitions that initiated the

appreciation of cooperativity as a ubiquitous element in biology. The two-state nature of most

transitions of monomeric proteins has been tested and demonstrated in many ways over a period

of more than 50 years.

The apparent total cooperativity of protein unfolding has resulted in an extraordinary simplicity in interpretation. The relative proportion of the two molecular states (folded and ordered, unfolded and disordered) is detectable by calorimetry and a variety of spectroscopic and other physical methods. Free-energy changes may be obtained from the equilibrium constants, and variation of T, P and component concentrations permits a complete thermodynamic analysis of one state relative to the other. A typical set of modern experimental curves is shown in Fig. 1.
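The two-state analysis described above can be sketched numerically. The snippet below is a minimal illustration, not taken from the paper: it assumes the standard two-state relations K = Fu/(1 − Fu) and ΔG = −RT ln K, a temperature of 298.15 K, and the hypothetical function name `unfolding_free_energy`.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def unfolding_free_energy(f_u, temperature_k=298.15):
    """Two-state model: K = f_u / (1 - f_u), dG_unf = -RT ln K (kcal/mol)."""
    if not 0.0 < f_u < 1.0:
        raise ValueError("f_u must lie strictly between 0 and 1")
    k_unf = f_u / (1.0 - f_u)
    return -R_KCAL * temperature_k * math.log(k_unf)


# Illustrative fractions of unfolded protein (hypothetical values)
for f_u in (0.1, 0.5, 0.9):
    print(f"F_u = {f_u:.1f}  ->  dG_unf = {unfolding_free_energy(f_u):+.2f} kcal/mol")
```

The free energy is positive while the folded state dominates, passes through zero at the transition midpoint, and becomes negative once most molecules are unfolded.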

2. Cosolvent interaction

Formulas for the free energy of multiple binding of denaturants, viewed as ligands, appeared very early (Schellman, 1955; Hermans & Scheraga, 1961; Aune & Tanford, 1969). The general formula, which includes cooperative interactions between bound ligands, heterogeneous sites and non-ideal components, is given in Schellman (1975),

$$\mu_2 = \mu_2^0 + RT \ln P - RT \ln \Sigma, \tag{1a}$$

where $\mu_2$ is the chemical potential of the protein, $P$ is its total concentration (summed over all species), $\mu_2^0$ its reference chemical potential and $\Sigma$ the binding polynomial for the ligand L on the protein. $P$ is assumed to be sufficiently small that the solution is ideal in protein concentration. By definition

$$\Sigma = 1 + K_1 L + K_2 L^2 + K_3 L^3 + \dots + K_N L^N.$$

The $K$'s are the standard phenomenological coefficients for the binding and $L$ is the activity of an added component such as a denaturant. Equation (1a) can also be expressed in a cogent and simple way. From the theory of binding polynomials, we know that the fraction of proteins with $n$ ligands is $K_n L^n/\Sigma$ with $K_0 = 1$. Hence the concentration of protein molecules with no ligands is $P_0 = P/\Sigma$. Substituting this in Eq. (1a):

$$\mu_2 = \mu_2^0 + RT \ln P_0. \tag{1b}$$

Although normally $P_0$ is too small to be measurable, it can be calculated by dividing the total protein concentration by $\Sigma$.
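As a small numerical illustration of the binding polynomial, the sketch below evaluates Σ, the fraction of protein carrying n ligands, and the concentration of ligand-free protein P/Σ. The binding constants, ligand activity and protein concentration are hypothetical values chosen only for the example, and the function names are not from the paper.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def binding_polynomial_terms(constants, ligand_activity):
    """Return [1, K1*L, K2*L**2, ...] for constants = [K1, K2, ...]."""
    terms = [1.0]
    for n, k_n in enumerate(constants, start=1):
        terms.append(k_n * ligand_activity ** n)
    return terms


def species_fractions(constants, ligand_activity):
    """Fraction of protein with n bound ligands, plus the polynomial value."""
    terms = binding_polynomial_terms(constants, ligand_activity)
    sigma = sum(terms)
    return [t / sigma for t in terms], sigma


# Hypothetical two-site example: K1 = 2, K2 = 1, ligand activity L = 0.5
fractions, sigma = species_fractions([2.0, 1.0], 0.5)
p_total = 1.0e-5                  # total protein concentration (M), assumed
p_free = p_total / sigma          # concentration of ligand-free protein
shift = -R_KCAL * 298.15 * math.log(sigma)  # -RT ln(polynomial), kcal/mol

print("polynomial value:", sigma)
print("species fractions:", fractions)
print("ligand-free protein (M):", p_free)
print("chemical potential shift (kcal/mol):", shift)
```

Because every term of the polynomial is positive, the shift −RT ln Σ is always negative: binding lowers the chemical potential of the protein.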

Experimentalists used this type of formalism for many years and generally found that the

use of activities rather than concentrations complicates the issue by making relations less linear.

We will find some justification for this at the end of the paper. A protein is usually folded

or unfolded under conditions of constant concentration. Hence the change in free energy of

unfolding is given by

$$\Delta G_{\mathrm{unf}} = \Delta G^0_{\mathrm{unf}} - RT\,(\ln \Sigma_u - \ln \Sigma_f).$$

We have switched from $\mu$ to $\Delta G$, which is preferred by experimentalists. $\Delta G^0_{\mathrm{unf}}$ is the free-energy change in the absence of ligand; u and f refer to the unfolded and folded forms of the protein.

At present there is no way of evaluating a general binding polynomial for such a complex case and it is usually represented by a binding polynomial for independent sites. In this case $\Sigma$ factors into the product

$$\Sigma = \prod_n (1 + K_n C)$$

and we have

$$\Delta G_{\mathrm{unf}} = \Delta G^0 - RT \left( \sum_{u\ \mathrm{sites}} \ln (1 + K_{ku} C) - \sum_{n\ \mathrm{sites}} \ln (1 + K_{kn} C) \right). \tag{2a}$$

The unfolded (u) and folded (f) states have many sites in common and the formula is usually represented as

$$\Delta G_{\mathrm{unf}} = \Delta G^0 - RT \sum_{\mathrm{exposed\ sites}} \ln (1 + K_k C) \tag{2b}$$

or

$$\Delta G_{\mathrm{unf}} = \Delta G^0_{\mathrm{unf}} - nRT \ln (1 + K_{\mathrm{av}} C), \tag{2c}$$

where $n$ is an estimated number of sites exposed by the unfolding and $K_{\mathrm{av}}$ is an average association constant. This formula was the ‘workhorse’ in this field for many years.

Fig. 1. Plots of the fraction of unfolded protein (Fu, vertical axis) as a function of urea concentration (M, horizontal axis). The filled circles are for native ribonuclease T1. The other curves are the results for single amino acid changes from the native sequence. See the original paper for details. (From Shirley et al. 1992, with the authors’ permission.)

3. Linearity

In 1974 Greene and Pace observed a linear relation between the unfolding free energy and denaturant concentration for urea and guanidine (Greene & Pace, 1974). They proposed the empirical relation

$$\Delta G_{\mathrm{unf}} = \Delta G^0 - mC,$$

in which ‘m’ is a measure of the change in interaction with solvent which occurs when the protein is unfolded and $\Delta G^0$ is an estimate of the free energy of unfolding in the absence of denaturant. This finding has been amply verified in many investigations ever since, as shown in Fig. 2. In terms of binding theory, Eq. (2a) can be linearized by expanding the logarithmic terms:

$$\Delta G_{\mathrm{unf}} = \Delta G^0_{\mathrm{unf}} - RT C \left( \sum_{u\ \mathrm{sites}} K_{ku} - \sum_{n\ \mathrm{sites}} K_{kn} \right). \tag{3a}$$

The equivalent to Eq. (2c) is

$$\Delta G_{\mathrm{unf}} = \Delta G^0_{\mathrm{unf}} - nRT K_{\mathrm{av}} C. \tag{3b}$$
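The linear relation of Greene and Pace is usually applied by fitting a straight line to free energies measured in the transition region and extrapolating to zero denaturant. The sketch below shows that procedure with a plain least-squares fit. The data are synthetic and noise-free (generated from an assumed ΔG⁰ of 6.0 kcal/mol and m of 1.2 kcal mol⁻¹ M⁻¹), and `fit_line` is a hypothetical helper, not a routine from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic transition-region data (hypothetical, noise-free)
urea_molar = [3.0, 4.0, 5.0, 6.0, 7.0]
dg_unf = [6.0 - 1.2 * c for c in urea_molar]  # kcal/mol

slope, dg0 = fit_line(urea_molar, dg_unf)
m_value = -slope            # the relation is dG = dG0 - m*C, so m = -slope
midpoint = dg0 / m_value    # concentration at which dG_unf = 0

print(f"dG0 = {dg0:.2f} kcal/mol, m = {m_value:.2f} kcal/(mol M), Cm = {midpoint:.2f} M")
```

With real data the intercept is an extrapolation well outside the measured range, which is why the text calls ΔG⁰ an estimate.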

4. Solvent exchange

The concentrations of cosolvents are quite large for denaturants, usually ranging from about 3 M to 7 M. The linear relation indicates that the $K$'s which describe the binding must be very small indeed to allow the approximation of $\ln (1 + KC)$ as $KC$ in Eq. (3a). In biochemistry, association constants usually vary between $10^3$ and $10^9$ l mol$^{-1}$ and molar concentrations are very small. Ordinary first-year equilibrium theory leaves us unprepared for high concentrations and very weak interactions. Viewed another way, in a 6 M solution urea is present as 0.27 in volume fraction compared to 0.73 for water. Any site on a protein would have a reasonably high probability of coming into contact with urea even if there were no preferential solvation at that site. Without preferential solvation there is no thermodynamic effect [see Eq. (4)].

We use the term preferential solvation to describe the accumulation or diminution of components

near or at the surface of the protein. Preferential interaction is a thermodynamic variable and

includes all effects including excluded volume.

What must be done is to allow a competition between the two solvent components by setting up a binding polynomial for all possibilities (Schellman, 1987, 2003).

$$\Sigma_{\mathrm{site}} = 1 + K_1 \Phi_1 + K_3 \Phi_3.$$

For the remainder of this paper, 1 will represent the principal solvent, 2 the protein, and 3 the cosolvent. The first term represents a bare site and the next two terms refer to the interaction with water and with denaturant respectively. The $\Phi$'s are volume fractions. The volume fraction of the protein is usually very small so that

$$\Phi_1 \cong 1 - \Phi_3.$$

A bare site, with neither water nor denaturant in contact, is completely unlikely so the first term may be ignored relative to the others. With these changes:

$$\begin{aligned} \Delta G^{\mathrm{ex}}_{\mathrm{site}} &= -RT \left[ \ln K_1 + \ln \left( 1 + (K_3/K_1 - 1)\Phi_3 \right) \right] \\ &= -RT \ln \left( 1 + (K' - 1)\Phi_3 \right). \end{aligned} \tag{4}$$

Fig. 2. Pace plots ($\Delta G_{\mathrm{unf}}$ in kcal mol$^{-1}$ versus urea concentration in M) for the set of proteins of Fig. 1. Symbols match in the two figures. (From Shirley et al. 1992, with the authors’ permission.)

$-RT \ln K_1$ is the free energy for hydrating a bare site and this term has been incorporated into the reference free energy in the second line to establish the hydrated site as the reference state.

$K' = K_3/K_1$ is the exchange constant for removing a molecule of 1 from the site and replacing it with a molecule of 3.

$$\mathrm{W{\cdot}site} + \mathrm{S} \rightleftharpoons \mathrm{S{\cdot}site} + \mathrm{W}.$$

We see that the association constants of Eqs (1a) and (2a) are in fact equilibrium constants corrected for the effect of ambient concentration. When $K' = 1$ the site is indifferent to whether it is occupied by principal solvent or cosolvent and the interchange has no thermodynamic effect. When $K' > 1$, we have preferential interaction in favor of cosolvent with a major reduction in the value of the apparent association constant. Normally $K' - 1 \ll K'$.

When $K' < 1$, we have an extension of the theory to the case of preferential solvation of the principal solvent, water. The result is the stabilization rather than the destabilization of the folded form. Cosolvents in this class are called osmolytes and have been thoroughly studied (Santoro et al. 1992; Timasheff, 1993; Courtenay et al. 2000). Whether folded structures are stabilized or destabilized by a cosolvent depends on the value of $K'$ relative to unity.

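Biochemists prefer to use molarity as a concentration unit. The transformation is quite simple, $\Phi_3 = \bar{V}_3 C_3$, with $\bar{V}_3$ the molar volume of the cosolvent. If we assume that $\bar{V}_3$ is relatively constant over the range of a transition, we can linearize the logarithm of Eq. (4) and transform the formula to molarity

$$\Delta G^{\mathrm{ex}}_{\mathrm{site}} = -RT C_3 \left( \sum_{u\ \mathrm{sites}} k_{ku} - \sum_{n\ \mathrm{sites}} k_{kn} \right), \tag{5}$$

where $k_{ku}$ and $k_{kn}$ both have the form $k = (K' - 1)\bar{V}_3$.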

Standard equilibrium theory is valid only for situations in which chance encounters are very

unlikely compared to attractive interaction.
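The solvent-exchange picture can be made concrete with a short calculation for a single site. The sketch below compares the logarithmic form of Eq. (4) with its linearized molarity form for three illustrative exchange constants. The molar volume of 0.045 l/mol is an assumption inferred from the volume fraction of 0.27 quoted above for 6 M urea, and the exchange constants 1.2, 1.0 and 0.8 are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
V3_ASSUMED = 0.045  # l/mol, inferred from volume fraction 0.27 at 6 M (assumption)


def site_free_energy_exact(k_exchange, c3, v3=V3_ASSUMED, temperature_k=298.15):
    """Eq. (4): -RT ln(1 + (K' - 1) * volume fraction), in kcal/mol."""
    phi3 = v3 * c3
    return -R_KCAL * temperature_k * math.log(1.0 + (k_exchange - 1.0) * phi3)


def site_free_energy_linear(k_exchange, c3, v3=V3_ASSUMED, temperature_k=298.15):
    """Linearized form: -RT * (K' - 1) * V3 * C3, in kcal/mol."""
    return -R_KCAL * temperature_k * (k_exchange - 1.0) * v3 * c3


for k_exchange in (1.2, 1.0, 0.8):  # hypothetical exchange constants
    exact = site_free_energy_exact(k_exchange, 6.0)
    linear = site_free_energy_linear(k_exchange, 6.0)
    print(f"K' = {k_exchange:.1f}: exact {exact:+.4f}, linear {linear:+.4f} kcal/mol")
```

For exchange constants this close to unity the two forms agree closely even at 6 M, which is consistent with the observed linearity; the sign of the result follows the sign of K′ − 1.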

5. Excluded volume

Excluded volume was ignored in unfolding studies for a long time, even though practitioners

in the field were well aware of its potential to influence bimolecular interactions. For a number

of reasons it was considered to make a small contribution to the unfolding reaction. For

example, the molar volumes of folded and unfolded proteins differ by only a very small

percentage. Also the similarity of results obtained with molarity and molality as concentration

units seemed to indicate that volume effects were not sizable. In addition, there were no known

ways to calculate the excluded volume of a small molecule relative to a complicated protein

structure or random chain.

Excluded volume effects were first suggested as the mechanism for protein stabilization by osmolytes and were later added to the scheme for unfolding or denaturation (Wills & Winzor, 1993; Saunders et al. 2000; Schellman, 2003). Fortunately, work on the surface area of folded and unfolded proteins has revealed methods of calculating excluded volumes of proteins for small ligands (Richards, 1985) and values can be obtained with available computer programs. The excluded volume of the unfolded form is a more complex problem involving calculations for a distribution of polypeptide structures (Creamer et al. 1997). For this case the volume problem has not been completely solved, but a reasonable approximate method has been suggested (Schellman, 2003). The contribution of the excluded volume to the free energy of unfolding is $RT(X_u - X_n)C_3$ and is positive and large for most ligands.

6. Summation

A synthesis of the five elements discussed in this part has been presented recently for several

proteins and several cosolvents. It turns out that proteins interact favorably with both

denaturants and osmolytes. This is probably necessary for the stability of the solutions.

The difference between the denaturant class and the osmolyte class lies in the fact that for denaturants the positive excluded volume contribution is overcome by a larger negative contribution from preferential solvation with the cosolvent, whereas for osmolytes the preferential solvation is not sufficient to overcome the positive excluded volume. Details are given in Schellman (2003).

Combining all the effects discussed in this part, which were unraveled one by one, gives the following formula for the free energy of a protein in a cosolvent:

$$G_s = G^0 + RT C_3 \left( X_s - \sum_j k_j \right). \tag{6}$$

Here ‘s’ is a label for the state of the protein (e.g. u or f) and $k_j = (K'_j - 1)\bar{V}_3$. For the unfolding reaction,

$$\begin{aligned} \Delta G_{\mathrm{unf}} &= \Delta G^0_{\mathrm{unf}} + RT C_3 \left\{ (X_u - X_n) - \left[ \sum_{u\ \mathrm{sites}} (K'_{ku} - 1)\bar{V}_3 - \sum_{n\ \mathrm{sites}} (K'_{kn} - 1)\bar{V}_3 \right] \right\} \\ &\cong \Delta G^0_{\mathrm{unf}} + RT C_3 \left\{ \Delta X - n_{\mathrm{exp}} \bar{k}_{\mathrm{exp}} \right\}. \end{aligned} \tag{7}$$

The second line is a simple empirical form where $n_{\mathrm{exp}}$ is an estimate of the number of exposed sites and $\bar{k}_{\mathrm{exp}}$ is an average of $k$ that fits the experimental data. Equation (7) is the proper relation for interpreting the effect of a cosolvent on the stability of a protein.
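The competition between excluded volume and preferential solvation in the empirical form of Eq. (7) can be illustrated with a few lines of code. Every number below is hypothetical and chosen only to show the two regimes: ΔX and the average site constant are taken in l/mol, and the function name `delta_g_unfolding` is not from the paper.

```python
R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_unfolding(c3, dg0, delta_x, n_exp, k_exp, temperature_k=298.15):
    """Empirical second line of Eq. (7):
    dG_unf = dG0 + RT * C3 * (delta_X - n_exp * k_exp), in kcal/mol.
    delta_x and k_exp are in l/mol, c3 in mol/l.
    """
    return dg0 + R_KCAL * temperature_k * c3 * (delta_x - n_exp * k_exp)


DG0 = 6.0        # kcal/mol, hypothetical stability in the absence of cosolvent
DELTA_X = 2.0    # l/mol, hypothetical excluded-volume change on unfolding
N_EXP = 50       # hypothetical number of sites exposed on unfolding

# Denaturant-like case: solvation term (50 * 0.08 = 4.0) exceeds delta_X
denaturant = delta_g_unfolding(5.0, DG0, DELTA_X, N_EXP, k_exp=0.08)
# Osmolyte-like case: solvation term (50 * 0.02 = 1.0) is smaller than delta_X
osmolyte = delta_g_unfolding(5.0, DG0, DELTA_X, N_EXP, k_exp=0.02)

print(f"denaturant-like: {denaturant:.2f} kcal/mol")
print(f"osmolyte-like:   {osmolyte:.2f} kcal/mol")
```

In both cases the site term is favorable; the cosolvent destabilizes the folded form only when that term outweighs the excluded-volume contribution, which mirrors the distinction drawn in the text between denaturants and osmolytes.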

Part II

1. The Kirkwood–Buff approach

Robert M. Mazo and I have been working with the Kirkwood–Buff (K–B) theory of solutions for some time. His results deal mainly with general proofs and transformations as well as work with the theory with P, T, and C as variables. Mine has dealt mainly with the variable set of $\mu_1$, T and C. Both these formulations are explicitly stated in the original article (Kirkwood & Buff, 1951). Our results are often obtained independently, but are always dependent upon our mutual progress with K–B theory.

The K–B theory is expressed in terms of the K–B integrals, $G_{\alpha\beta}$, where $\alpha$ and $\beta$ represent components of the solution. Although $G_{\alpha\beta} = G_{\beta\alpha}$ we will adopt the convention that the first index is the central molecule, which will mainly be the protein, and $\beta$ a cosolvent molecule in the neighborhood of the central molecule. From Kirkwood & Buff (1951):

$$G_{\alpha\beta} = \int_{V_0} \left( g(r_{\alpha\beta}) - 1 \right) dv,$$

where $r_{\alpha\beta}$ is the distance between the molecules of $\beta$ and $\alpha$; $g(r_{\alpha\beta}) = \rho_\beta(r)/\rho_\beta^0$ is the radial distribution function of $\beta$ at a distance $r_{\alpha\beta}$ from $\alpha$; $\rho_\beta^0$ is the number density of molecules of $\beta$ beyond the range of interaction of the two molecules; $V_0$ is a volume around $\alpha$ beyond which interaction between $\alpha$ and $\beta$ can be ignored and $\rho_\beta(r) = \rho_\beta^0$. All contributions to $G_{\alpha\beta}$ occur within $V_0$, which is otherwise unspecified in shape or size. Ionic systems require special treatment. Molecules that are not spherical are averaged over all orientations. Note that $\rho_\beta^0 G_{\alpha\beta} = \int_{V_0} \left( \rho_\beta(r) - \rho_\beta^0 \right) dv$ is the total excess of the number of molecules in the neighborhood of the central molecule $\alpha$. For a cosolvent molecule in the neighborhood of a protein, $G_{23} = \int_{V_0} \left( g(r_{23}) - 1 \right) dv$ and $\rho_3^0 G_{23}$ or $C_3^0 G_{23}$ is the excess of cosolvent near the protein.

Note that the number densities, $\rho$ (molecules per unit volume), are directly proportional to molarities. In the macroscopic view, molarities can be changed to number densities (Kirkwood & Buff, 1951). For example, a concentration in moles/l equals $1000/N_a$ times the number density in molecules/ml, where $N_a$ is Avogadro’s number. Changing $k$ to $R$ in formulas converts from molecules to moles.
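To make the K–B integral and the unit conversion tangible, the sketch below integrates a deliberately crude, hypothetical radial distribution function: zero inside a hard core of radius 20 Å, a constant 1.1 in a contact shell out to 24 Å, and 1 beyond. None of these numbers come from the paper. A spherically symmetric volume element 4πr² dr is assumed, and the midpoint-rule result is checked against the closed form for this step model.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def molarity_to_number_density(c_molar):
    """Convert mol/l to molecules/ml: rho = C * N_a / 1000."""
    return c_molar * AVOGADRO / 1000.0


def g_model(r, core=20.0, shell_outer=24.0, g_shell=1.1):
    """Hypothetical step-function g(r); distances in angstroms."""
    if r < core:
        return 0.0        # hard core: cosolvent excluded
    if r < shell_outer:
        return g_shell    # contact shell: mild accumulation
    return 1.0            # bulk: no correlation


def kb_integral(g, r_max=30.0, steps=300_000):
    """Midpoint-rule estimate of the integral of (g(r) - 1) 4 pi r^2 dr, in A^3."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += (g(r) - 1.0) * 4.0 * math.pi * r * r * dr
    return total


g23_A3 = kb_integral(g_model)
# Closed form for the step model: core exclusion plus shell excess
g23_exact = (4.0 / 3.0) * math.pi * (-(20.0 ** 3) + 0.1 * (24.0 ** 3 - 20.0 ** 3))
g23_l_per_mol = g23_A3 * 1e-27 * AVOGADRO  # 1 A^3 = 1e-27 l
excess_at_6M = 6.0 * g23_l_per_mol         # excess cosolvent molecules per protein

print("G23 (A^3, numerical):", g23_A3)
print("G23 (A^3, closed form):", g23_exact)
print("G23 (l/mol):", g23_l_per_mol)
print("excess cosolvent at 6 M:", excess_at_6M)
```

In this toy model the excluded core dominates, so the integral is negative and the cosolvent is depleted overall even though it accumulates in the contact shell; a larger shell value would reverse the sign.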

In this paper we make use of the K–B results for osmotic systems [K–B, eqs (15)–(16)], which have (T, $\mu_1$, C) as variables. This differs from most applications, which are based on the theory for (T, P, C) and [K–B, eqs (12)–(14)].

The K–B integrals are complex objects. They are functions of the concentrations of all components other than the principal solvent except at very low concentrations. For a three-component osmotic system there are only three integrals ($G_{22}$, $G_{23}$, $G_{33}$) and these perform the functions of the equivalent infinite hierarchy of McMillan–Mayer integrals, which are independent of concentration (Hill, 1960).

The K–B theory has direct application to the problem of the effect of a cosolvent on protein stability. The preferential interaction or thermodynamic binding of 3 to 2 is given by Eisenberg (1976)

$$\Gamma_C \equiv \left( \frac{\partial C_3}{\partial C_2} \right)_{\mu_1, \mu_3, T} = -\left( \frac{\partial \mu_2}{\partial \mu_3} \right)_{\mu_1, T, C_2} = -\frac{\left( \partial \mu_2 / \partial C_3 \right)_{\mu_1, T, C_2}}{\left( \partial \mu_3 / \partial C_3 \right)_{\mu_1, T, C_2}}, \tag{8}$$

with our set of variables. (The first equality is the definition of $\Gamma_C$; the second can be derived via a Maxwell relation for the thermodynamic function $A - n_1\mu_1 - n_3\mu_3$ with V constant.) $\Gamma$ is usually defined in terms of molality, but molarity, the concentration unit of K–B theory, is more appropriate here. Eisenberg (1976) gives formulas for (P, T, m) and (P, T, C). The two derivatives on the right-hand side of Eq. (8) appear explicitly in the K–B paper [eqs (15) and (16)], leading to

$$\Gamma_C = \frac{G_{23} C_3}{1 + C_2 G_{22}}. \tag{9}$$

In the usual limit of low protein concentration, $C_2 \to 0$, so

$$\Gamma_C \approx G_{23} C_3 = C_3 \int_{V_0} \left( g_{23}(r) - 1 \right) dv. \tag{10}$$

From the definition of $G_{23}$

$$C_3 G_{23} = \int_{V_0} \left( C_3(r) - C_3^0 \right) dv,$$

which is clearly the excess of 3 in the neighborhood of the protein and is in precise agreement with the intuitive concept of ‘binding’. As in Part I the excess can be positive or negative: the preferential solvation of cosolvent and the preferential solvation of water with the protein are both represented.

To compare Eq. (10) with the results of Part I, we need to convert the thermodynamic formula [Eq. (6)] to $\Gamma_C$ rather than $G_2^{\mathrm{ex}} = \mu_2^{\mathrm{ex}}$ using Eq. (8), which is general. With $\mu_2$ from Eq. (6) and assuming $\mu_3$ to be ideal

$$\left( \frac{\partial \mu_2}{\partial C_3} \right)_{\mu_1, T, C_2} = RT \left( X_s - \sum_j k_j \right), \tag{11a}$$

$$\left( \frac{\partial \mu_3}{\partial C_3} \right)_{\mu_1, T, C_2} = RT / C_3. \tag{11b}$$

The ideal assumption is implicit in all of Part I. The pragmatic reason for this is that the free energy is linear in C, but not in the activity. We return to this point shortly. From Eqs (11) and (8)

$$\Gamma_C = -C_3 \left( X_s - \sum_j k_j \right). \tag{12a}$$

For this simple linear case, $\Gamma_C = -G^{\mathrm{ex}}/RT$ [refer to Eq. (6)].
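The step from Eqs (11) and (8) to Eq. (12a) is a one-line substitution. It is written out below as a worked check, using only the symbols already defined in Eqs (6), (8), (11a) and (11b); the label $G^{\mathrm{ex}}$ for the cosolvent-dependent part of Eq. (6) is the only shorthand introduced here.

```latex
\begin{aligned}
\Gamma_C
  &= -\frac{\left(\partial \mu_2 / \partial C_3\right)_{\mu_1, T, C_2}}
           {\left(\partial \mu_3 / \partial C_3\right)_{\mu_1, T, C_2}}
   = -\frac{RT \left( X_s - \sum_j k_j \right)}{RT / C_3}
   = -C_3 \left( X_s - \sum_j k_j \right), \\[4pt]
G^{\mathrm{ex}}
  &= G_s - G^0 = RT\, C_3 \left( X_s - \sum_j k_j \right)
  \quad\Longrightarrow\quad
  \Gamma_C = -\frac{G^{\mathrm{ex}}}{RT}.
\end{aligned}
```

The second line holds only because the excess free energy is linear in $C_3$ and $\mu_3$ is treated as ideal; with a non-ideal cosolvent the denominator acquires the activity-coefficient term.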

We now compare Eq. (10) from K–B theory to Eq. (12a) from the thermodynamic model. In evaluating the integral in Eq. (10), we assume a hard shell model in which no interpenetration of the cosolvent is allowed. Within the boundary presented by the hard shell, $g(r)$ is zero and this part of the integral gives $-X_s$, the excluded volume#. We also write $g(r)$ as an equilibrium constant. It is the equilibrium constant for a transfer of a molecule of 3 from the external bulk concentration to $C_3(r)$. It is also related to the potential of average force (a free energy) by $-kT \ln g(r) = w(r)$. With these changes

$$\Gamma_C = C_3 \left\{ -X_{23} + \int_{r \notin X} \left( k(r_{23}) - 1 \right) dv \right\} \quad \text{(K–B theory)} \tag{13}$$

and

$$\Gamma_C = C_3 \left\{ -X_{23} + \sum_j (K'_j - 1)\bar{V}_3 \right\} \quad \text{(thermodynamic model)} \tag{12b}$$

The general forms of the two relations are identical. Equation (12b) refers to the specific model of site binding. Equation (13) is completely general. The interactions can be long-range rather than contact interactions. This could happen either as multilayer adsorption or via long-range attractions or repulsions. $\Gamma_C$ can include an ionic atmosphere but care must be taken in applying K–B theory to ionic systems because of the singularity of K–B matrices [ ] and the long-range nature of Coulombic forces. The volume in Eq. (13) is the integration volume $V_0 - X_{23}$. $V_0$ must be sufficiently large that the integral of $[k(r_{23}) - 1]$ vanishes outside it. The extent of $V_0$ depends on the nature of the interaction between the protein and the cosolvent. In general biochemists think of it as a short-range contact interaction, but this will require further study for ionic systems.

# In this step we are ignoring the longer range structure imposed even by the hard shell model, in which ripples are found in the distribution outside the central molecule corresponding to solvation packing layers. These would very likely be present for the principal solvent, water, but the assumption is that their contribution is small for cosolvents, which are about 30–40 times as dilute on a molar basis.

We take the K–B result as a generalization and verification of the semi-empirically derived thermodynamic

formula. Evidently, the inventory of effects outlined in Part I is complete.

Another interesting problem is why the effect of a cosolvent is linear in concentration and why concentration appears to be a better variable than the activity, which one would expect to be the proper thermodynamic variable. Part of the job is done already. We can rearrange Eq. (8) to

$$\left( \frac{\partial \Delta G_2^{\mathrm{ex}}}{\partial C_3} \right)_{\mu_1, T, C_2} = -\Delta\Gamma_C \left( \frac{\partial \mu_3}{\partial C_3} \right)_{\mu_1, T, C_2} = -RT\, \Delta G_{23}\, C_3 \left( \frac{1}{C_3} + \left( \frac{\partial \ln \gamma_3}{\partial C_3} \right)_{\mu_1, T, C_2} \right), \tag{14a}$$

$$-m = -RT\, \Delta G_{23} \left( 1 + C_3 \left( \frac{\partial \ln \gamma_3}{\partial C_3} \right)_{\mu_1, T, C_2} \right). \tag{14b}$$

The $\Delta$'s mean the change resulting from the unfolding of the protein. $\Delta G_{23}$ is a function of $C_3$ and the second factor of the final equation has been determined experimentally for most cosolvents as a function of concentration. The quantities in Eq. (14) are not quite equal to $-m$, the slope of the Pace plot, because the variables are wrong. We need (P, T, C). We have made some further progress with this problem but transformation to the correct variables leads to an overly technical discussion that would not be suitable here. This will appear in a later work.
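The factor in parentheses in Eq. (14) can be traced to the concentration dependence of the cosolvent chemical potential. The short derivation below is a sketch that assumes the usual molar-scale form $\mu_3 = \mu_3^0 + RT \ln(\gamma_3 C_3)$, which the text does not write out explicitly, together with the dilute-protein limit of Eq. (10) applied to both the folded and unfolded states.

```latex
\begin{aligned}
\mu_3 &= \mu_3^0 + RT \ln (\gamma_3 C_3)
  \quad\Longrightarrow\quad
  \left( \frac{\partial \mu_3}{\partial C_3} \right)_{\mu_1, T, C_2}
  = RT \left( \frac{1}{C_3}
      + \left( \frac{\partial \ln \gamma_3}{\partial C_3} \right)_{\mu_1, T, C_2} \right), \\[4pt]
\Delta\Gamma_C &\approx \Delta G_{23}\, C_3
  \quad\Longrightarrow\quad
  -\Delta\Gamma_C \left( \frac{\partial \mu_3}{\partial C_3} \right)_{\mu_1, T, C_2}
  = -RT\, \Delta G_{23}
    \left( 1 + C_3 \left( \frac{\partial \ln \gamma_3}{\partial C_3} \right)_{\mu_1, T, C_2} \right).
\end{aligned}
```

For an ideal cosolvent ($\gamma_3 = 1$) the bracket reduces to unity and the slope is simply $-RT\,\Delta G_{23}$, so a strictly linear Pace plot would require $\Delta G_{23}$ multiplied by the non-ideality factor to stay nearly constant over the transition range.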

Acknowledgments

This paper is dedicated to Bengt Norden. His energy, productivity, success and good humor

are well known to the attendees of this Nobel Symposium. It has been a joy for me to work in

his department for two extended stays and many visits. His group has a special enthusiasm

that envelops a visitor immediately. I have found myself instantly involved in the progress

and problems of a large number of his students and co-workers. It requires a very special talent

to engender both hard work and enjoyment in the scientific enterprise. Hopefully, he will now

be able to return to his research programs full time.

Note added in proof

P. Smith has recently and independently derived equation (8).

References

AUNE, K. & TANFORD, C. (1969). Thermodynamics of the denaturation of lysozyme by guanidine hydrochloride. II. Dependence on denaturant concentration. Biochemistry 8, 4586–4590.

COURTENAY, E. S., CAPP, M. W., ANDERSON, C. F. & RECORD, M. T. (2000). Vapor pressure osmometry studies of osmolyte-protein interactions: implications for the action of osmoprotectants in vivo and for the interpretation of ‘osmotic stress’ experiments in vitro. Biochemistry 39, 4455–4471.

CREAMER, T. P., SRINIVASAN, R. & ROSE, G. D. (1997). Modeling unfolded states of proteins and peptides. II. Backbone solvent accessibility. Biochemistry 36, 2832–2835.

EISENBERG, H. (1976). Biological Macromolecules and Polyelectrolytes in Solution. Oxford: Clarendon Press.

GREENE JR., R. F. & PACE, C. N. (1974). Urea and guanidine hydrochloride denaturation of ribonuclease, lysozyme, alpha-chymotrypsin, and beta-lactoglobulin. Journal of Biological Chemistry 249, 5388–5393.

HERMANS, J. & SCHERAGA, H. (1961). Structural studies of ribonuclease. V. Reversible change of conformation. Journal of the American Chemical Society 83, 3283–3292.

HILL, T. L. (1960). Introduction to Statistical Thermodynamics. Reading, MA: Addison-Wesley.

KIRKWOOD, J. G. & BUFF, F. P. (1951). The statistical mechanical theory of solutions. Journal of Chemical Physics 19, 774–777.

RICHARDS, F. M. (1985). Calculation of molecular volumes and areas for structures of known geometry. Methods in Enzymology 115, 440–464.

SANTORO, M. M., LIU, Y., KHAN, S. M. A., HOU, L.-X. & BOLEN, D. W. (1992). Increased thermal stability of proteins in the presence of naturally occurring osmolytes. Biochemistry 31, 5278–5283.

SAUNDERS, A. J., DAVIS-SEARLES, P. R., ALLEN, D. L., PIELAK, G. J. & ERIE, D. A. (2000). Osmolyte induced changes in protein conformational equilibria. Biopolymers 53, 293–307.

SCHELLMAN, J. A. (1955). The stability of hydrogen-bonded peptide structures in aqueous solution. Comptes-rendus des travaux du Laboratoire Carlsberg (Ser. Chim.) 29, 230–259.

SCHELLMAN, J. A. (1975). Macromolecular binding. Biopolymers 14, 999–1018.

SCHELLMAN, J. A. (1987). Selective binding and solvent denaturation. Biopolymers 26, 549–559.

SCHELLMAN, J. A. (2003). Protein stability in mixed solvents. Biophysical Journal 85, 108–125.

SHIRLEY, B. A., STANSSENS, P., HAHN, U. & PACE, C. N. (1992). Contribution of hydrogen bonding to the conformational stability of ribonuclease T1. Biochemistry 31, 725–732.

TIMASHEFF, S. N. (1993). The control of protein stability and association by weak interactions with water: how do solvents affect these processes? Annual Review of Biophysics and Biomolecular Structure 22, 67–97.

WILLS, P. R. & WINZOR, D. J. (1993). Thermodynamic analysis of ‘Preferential Solvation’ in protein solutions. Biopolymers 33, 1627–1629.

