PROTEINS: Structure, Function, and Bioinformatics
STRUCTURE NOTE
Solution structure of hypothetical protein HP1423 (Y1423_HELPY) reveals the presence of αL motif related to RNA binding
Ji-Hun Kim, Sung Jean Park, Ki-Young Lee, Woo-Sung Son, Na-Young Sohn, Ae-Ran Kwon, and Bong-Jin Lee*
Research Institute of Pharmaceutical Sciences, College of Pharmacy, Seoul National University, Kwanak-Gu, Seoul 151-742, Korea
Key words: Helicobacter pylori; NMR; HP1423; RNA-binding; αL motif.
INTRODUCTION
Helicobacter pylori is unusual in its ability to survive in the extremely acidic environment of the stomach. It is an important human bacterial pathogen, causing diverse gastric diseases such as peptic ulcers, chronic gastritis, and mucosa-associated lymphoid tissue lymphoma.1 In addition, it has been proposed that duodenal ulcers are also associated with H. pylori infection.2 H. pylori was isolated from human stomachs for the first time in 1979,3 and whole-genome analysis of H. pylori was completed in the United States in 1997.4 To date, complete genome sequences have been determined for H. pylori strains 26695, J99, and HPAG1.
Even though the genomic information is well known, the functions of many genes are still unknown. Functional studies are still required to expand our biological understanding of H. pylori. In this regard, determining the three-dimensional structures of uncharacterized proteins can lead to the elucidation of their biological functions and to the discovery of new drug target candidates for antibiotics.
In this study, we determined the solution structure of HP1423 (Y1423_HELPY), an 84-residue conserved hypothetical protein from H. pylori strain 26695. According to the Pfam database, HP1423 belongs to the S4 (PF01479) superfamily. The S4 domain is a small domain consisting of 60–65 amino acid residues that probably mediates binding to RNA.5 The structure of HP1423 revealed the presence of the αL RNA-binding motif in the protein, which is a general feature of several RNA-binding protein families.
METHODS
The gene of HP1423 was cloned into the NdeI/XhoI site of the pET-21a(+) vector (Novagen). The resulting construct contains eight extra residues at the C-terminus (LEHHHHHH) that assist protein purification. The recombinant protein was expressed in E. coli BL21 (DE3) (Stratagene) cells in M9 medium containing 15NH4Cl and 13C6-glucose as nitrogen and carbon sources. The
expression of this protein was induced with 0.5 mM IPTG at 37°C. This protein was purified using a Ni2+ agarose column (his-tag binding resin). The NMR sample was prepared at a concentration of about 1 mM in 90% H2O/10% D2O containing 20 mM NaH2PO4/Na2HPO4 (pH 5.8), 150 mM NaCl, 150 mM glycine, 1 mM DTT, and 1 mM EDTA.

Additional Supporting Information may be found in the online version of this article.
Grant sponsor: New Drug Target Discovery; Grant number: 370C-20070095, 2007; Grant sponsors: Korea government (MOST) (Innovative Drug Research Center for Metabolic and Inflammatory Disease), BK21 Project for Medicine, Dentistry, and Pharmacy, Korea Basic Science Institute (KBSI), National Center for Inter-university Research Facilities (NCIRF).
Ae-Ran Kwon’s current address is Department of Herbal Skin Care, Daegu Hanny University, 290, Yugok-dong, Gyengsan-si, Gyeongsangbuk-do.
*Correspondence to: Bong-Jin Lee, Research Institute of Pharmaceutical Sciences, College of Pharmacy, Seoul National University, San 56-1, Shillim-Dong, Kwanak-Gu, Seoul 151-742, Korea. E-mail: [email protected]
Received 1 September 2008; Revised 7 November 2008; Accepted 12 November 2008
Published online 24 November 2008 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.22335
© 2008 Wiley-Liss, Inc.

All NMR spectra were acquired at 298 K on a Bruker AVANCE DRX 600 spectrometer equipped with a cryogenic probe and on a Bruker AMX 500 spectrometer. Backbone and side-chain assignments were performed using 3D HNCO, HN(CA)CO, HNCA, HN(CO)CA, HNCACB, HN(CO)CACB, HBHA(CO)NH, 15N-TOCSY-HSQC, and C(CO)NH-TOCSY spectra. Aromatic ring resonances were assigned using 3D 15N-NOESY-HSQC (mixing time 120 ms) and 13C-NOESY-HSQC (mixing time 100 ms) spectra. Chemical shifts were referenced to DSS. NMR data were processed using NMRPipe6 and analyzed with NMRView.
Upper distance limit restraints were obtained from 3D 15N- and 13C-NOESY-HSQC spectra by manual and automated assignment of the NOESY spectra using the program CYANA7 (version 2.0). Dihedral angle restraints were calculated from chemical shifts using the program TALOS,8 and the overall secondary structure was predicted from the chemical shift index (CSI)9 and NOE patterns. Hydrogen-bond restraints were derived from an H-D exchange experiment, together with the regular secondary structure elements identified from the CSI and NOE patterns.
The initial structures were generated with CYANA 2.0 and then refined through standard annealing and torsion-angle dynamics using the program CNS10 (version 1.2). The programs MOLMOL11 and PYMOL12 were used to visualize the structures. The quality of the final structure was analyzed using PROCHECK-NMR13 and Aqua.14
RESULTS AND DISCUSSION
The structural statistics summarized in Table I indicate that a high-quality NMR structure was obtained for HP1423 (PDB entry: 2k6p). The overall structure of HP1423 is shown in Figure 1. The structure is composed of five β-strands and three α-helices. The β-strands correspond to residues 28–30, 33–34, 44–49, 56–60, and 78–80. The α-helices correspond to residues 3–10, 21–25, and 74–76. The topology can be described as α1α2β2β1β3β4α3β5. The 1H-15N HSQC spectrum of 15N-enriched HP1423 showed good signal dispersion (Supp. Info. Fig. 3). However, we could not detect the resonances from residues 15–19. The disappearance of peaks in this region seems to be related to intermediate conformational exchange.
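
The residue ranges quoted above can be tabulated in a few lines of Python. This is only an illustrative sketch: the dictionary layout and helper names are assumptions, and the strands and helices are numbered in the order in which the text lists them. It counts the residues in helices and strands and checks that the unobserved segment (residues 15–19) does not overlap any regular secondary-structure element.

```python
# Secondary-structure elements of HP1423 as inclusive residue ranges from the text.
# Assumption: elements are numbered in the order the text lists them.
ELEMENTS = {
    "α1": (3, 10), "α2": (21, 25), "α3": (74, 76),
    "β1": (28, 30), "β2": (33, 34), "β3": (44, 49),
    "β4": (56, 60), "β5": (78, 80),
}
UNOBSERVED = (15, 19)  # residues whose resonances were not detected
N_RESIDUES = 84        # native sequence length (without the His-tag)


def span_length(span):
    """Number of residues in an inclusive (start, end) range."""
    return span[1] - span[0] + 1


def overlaps(a, b):
    """True if two inclusive residue ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


helix_residues = sum(span_length(s) for n, s in ELEMENTS.items() if n.startswith("α"))
strand_residues = sum(span_length(s) for n, s in ELEMENTS.items() if n.startswith("β"))
hit_elements = [n for n, s in ELEMENTS.items() if overlaps(s, UNOBSERVED)]

print("helix residues:", helix_residues)
print("strand residues:", strand_residues)
print("fraction in regular elements: %.2f" % ((helix_residues + strand_residues) / N_RESIDUES))
print("elements overlapping residues 15-19:", hit_elements)
```

With these ranges the unobserved segment falls in the loop between α1 and α2, which is consistent with a flexible region undergoing conformational exchange.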
Helix α1 is followed by α2, four strands of antiparallel β-sheet, a short α3, and the β5 strand. Notably, the region extending from α1 through β3 forms an obvious structural motif, defined here as the αL motif because of the two α-helices and the loop between β2 and β3, which forms an L-shaped meander. This structural motif shows a high degree of conservation between different families within the S4 (PF01479) superfamily and may be important for interaction with RNA.5
We analyzed the distribution of charged residues in HP1423 to search for excess positive charge on the surface of the protein, which could be involved in interaction with the negatively charged phosphate backbone of RNA. Mapping the electrostatic potential onto the surface of HP1423 gave the expected result: the surface region of the αL motif has a strong concentration of positive charge (see Fig. 1). The side chains of R2 (just before α1), K5 (on α1), and K36, K39, and K42 (on the L-shaped loop) are clustered to form a positively charged surface in the αL motif. On the same face, the loop between β4 and α3 exposes another positively charged side chain, K67. This charge distribution raises the possibility that HP1423 is an RNA-binding protein.
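
As a minimal sketch of the bookkeeping behind this paragraph, the following Python snippet groups the six positively charged residues named above by whether they fall inside the αL motif region, taken here as residues 1–50 (the range quoted for HP1423 in the structural comparison). The variable and function names are illustrative assumptions, and the snippet does not compute any electrostatic potential.

```python
# Putative RNA-binding residues of HP1423 as listed in the text.
# Only the residue labels come from the paper; all names here are illustrative.
PUTATIVE_SITES = ["R2", "K5", "K36", "K39", "K42", "K67"]

# Assumption: the αL motif region spans residues 1-50, as stated in the text.
ALPHA_L_REGION = range(1, 51)


def residue_number(label: str) -> int:
    """Return the sequence position from a label such as 'K36'."""
    return int(label[1:])


def split_by_region(labels, region):
    """Split residue labels into those inside and outside a residue range."""
    inside = [lab for lab in labels if residue_number(lab) in region]
    outside = [lab for lab in labels if residue_number(lab) not in region]
    return inside, outside


inside, outside = split_by_region(PUTATIVE_SITES, ALPHA_L_REGION)
print("within αL region:", inside)
print("outside αL region:", outside)
```

Five of the six residues lie within the motif region; K67 is the exception, contributed by the loop between β4 and α3.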
The amino acid sequence of HP1423 from H. pylori was compared against sequences in several databases using the BLAST network service at the National Center for Biotechnology Information16 (Supp. Info. Fig. 1). Nineteen proteins were identified as homologues of HP1423 (score >80 bits), and the six residues mentioned above that constitute the positively charged surface are strongly conserved in the S4 superfamily.

Table I. Structural Statistics for the 20 Energy-Minimized Conformers of HP1423

Completeness of resonance assignments
  Backbone (%)                               93.1
  Side chain (%)                             77.4
Experimental constraints
  NOE constraints total                      988
    Intraresidue (i = j)                     214
    Sequential (|i − j| = 1)                 328
    Medium-range (1 < |i − j| < 5)           184
    Long-range (|i − j| ≥ 5)                 262
  Dihedral constraints                       φ 13, ψ 13
RMSD to the mean structure (Å) for residues 2–13, 22–82
  Backbone atoms (N, Cα, CO)                 0.44 ± 0.06
  All heavy atoms                            1.06 ± 0.06
Deviation from idealized geometry
  Bonds (Å)                                  0.000774 ± 0.0000452
  Angles (°)                                 0.298 ± 0.0011
CNS energy (kcal/mol)a
  E overall                                  46.14 ± 0.65
  E bond                                     0.82 ± 0.064
  E angle                                    33.90 ± 0.24
  E improper                                 1.44 ± 0.10
  E vdw                                      9.04 ± 0.55
  E noe                                      0.87 ± 0.45
  E cdih                                     0.08 ± 0.03
Violations per conformer
  Distance constraints (>0.1 Å)              0
  Dihedral angle constraints (>5°)           0
  van der Waals (<1.6 Å)                     0
Ramachandran plot (%)b for residues 2–13, 22–82
  Most favored region                        70.6
  Additionally allowed region                28.7
  Generously allowed region                  0.4
  Disallowed region                          0.3

a The default parameters and force constants of protein-allhdg.param and anneal.inp in CNS 1.2 were used for structure calculation.
b PROCHECK-NMR was used for calculation.
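
The restraint counts and Ramachandran percentages in Table I can be checked for internal consistency with simple arithmetic. The sketch below is illustrative only: the dictionary keys are assumed names, and the per-residue figure uses the 84-residue native sequence rather than the tagged construct.

```python
# NOE distance restraint counts by class, copied from Table I.
noe_by_class = {
    "intraresidue": 214,
    "sequential": 328,
    "medium_range": 184,
    "long_range": 262,
}
reported_total = 988
n_residues = 84  # native HP1423 length; the His-tagged construct is longer

# Ramachandran statistics (%) for residues 2-13 and 22-82, copied from Table I.
ramachandran = {
    "most_favored": 70.6,
    "additionally_allowed": 28.7,
    "generously_allowed": 0.4,
    "disallowed": 0.3,
}

noe_total = sum(noe_by_class.values())
rama_total = sum(ramachandran.values())

print("sum of NOE classes:", noe_total)
print("matches reported total:", noe_total == reported_total)
print("NOE restraints per residue: %.1f" % (noe_total / n_residues))
print("Ramachandran percentages sum to: %.1f" % rama_total)
```

The four NOE classes add up to the reported total, and the four Ramachandran regions account for all analyzed residues.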
A search for structural homologues within the Protein Data Bank18 using the program DALI17 also showed that HP1423 is structurally similar to proteins that belong to the S4 superfamily. These include the Hsp15 protein (PDB code: 1dm9-B, Z-score 9.6, rmsd 1.8, identity: 36%),19 ribosomal small subunit pseudouridine synthase A (PDB code: 1vio-A, Z-score 5.3, rmsd 2.9, identity: 34%),20 30S ribosomal protein S4 (RPS4; PDB code: 1fjg-D, Z-score 4.9, rmsd 2.2, identity: 26%),21 tyrosyl-tRNA synthetase (tyr-tRS; PDB code: 1h3e, Z-score 4.2, rmsd 2.1, identity: no significant sequence similarity found),22 and hypothetical protein YbcJ (PDB code: 1p9k, Z-score 2.7, rmsd 2.1, identity: no significant sequence similarity found).23
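
The DALI hits listed above can be held in a small table for ranking and filtering. The following Python sketch is not the DALI output format; the `DaliHit` record and its field names are assumptions made for illustration, and only the numbers are taken from the text. Hits without significant sequence similarity are stored with `None` as their identity.

```python
from typing import NamedTuple, Optional


class DaliHit(NamedTuple):
    """One structural homologue of HP1423 (hypothetical record layout)."""
    protein: str
    pdb_code: str
    z_score: float
    rmsd: float
    identity_pct: Optional[int]  # None: no significant sequence similarity


# Values copied from the text.
HITS = [
    DaliHit("Hsp15", "1dm9-B", 9.6, 1.8, 36),
    DaliHit("pseudouridine synthase A", "1vio-A", 5.3, 2.9, 34),
    DaliHit("30S ribosomal protein S4", "1fjg-D", 4.9, 2.2, 26),
    DaliHit("tyrosyl-tRNA synthetase", "1h3e", 4.2, 2.1, None),
    DaliHit("YbcJ", "1p9k", 2.7, 2.1, None),
]

# Rank by Z-score and pick out hits related by fold only (no sequence identity).
ranked = sorted(HITS, key=lambda h: h.z_score, reverse=True)
fold_only = [h.protein for h in HITS if h.identity_pct is None]

for hit in ranked:
    identity = "n.s." if hit.identity_pct is None else "%d%%" % hit.identity_pct
    print("%-26s %-7s Z=%4.1f rmsd=%3.1f identity=%s"
          % (hit.protein, hit.pdb_code, hit.z_score, hit.rmsd, identity))
print("similar fold without significant sequence similarity:", fold_only)
```

Separating the two groups makes the point of the comparison explicit: tyr-tRS and YbcJ resemble HP1423 in structure even though their sequences do not.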
It is notable that all the structural homologues have the αL motif. Moreover, it has already been shown that the αL motifs of RPS4 and tyr-tRS interact with double-stranded RNA.21,22 However, the surfaces involved in the RNA contact are somewhat different (Supp. Info. Fig. 2). The residues of tyr-tRS that interact with tRNA are on the helix α2 (S383, R389, N393, and R394) and the second β-hairpin (R420, K422, D423, and R424).22 In the case of RPS4, the α2 helix (S113, R115, Q116, R118, Q119, R122, and H123) and the L-shaped loop (R131, R132, D134, and R139) are involved in this kind of interaction.21
To characterize the structure of HP1423 in more detail, we compared it with the structures of Hsp15, ribosomal small subunit pseudouridine synthase A, RPS4, tyr-tRS, and YbcJ. Figure 2 highlights in color the αL motif in the six different proteins. The conformation of the αL motif region of HP1423 (residues 1–50) is similar to those of the other proteins. In addition, the sequences of the αL motifs are well conserved except in tyr-tRS and YbcJ (see Fig. 2). For tyr-tRS and YbcJ, despite their low sequence similarity to HP1423, the αL motif regions are folded similarly.
Figure 1. NMR solution structure of HP1423. A: The superposition of the final 20 structures over the energy-minimized average structure. B: Ribbon drawing of the representative conformer of HP1423. C, D: Electrostatic potential surface diagrams of HP1423 orientated to show the proposed RNA-binding αL motif facing outwards. The extreme ranges of red (negative) and blue (positive) represent electrostatic potentials of less than −9 to greater than +9 kbT, where kb is the Boltzmann constant and T is the temperature. The figure was calculated with APBS15 showing the accessible surface.
Figure 2. The αL motif in six different protein structures. The peptide backbones of six different structures are compared with HP1423. The region highlighted in green is the αL motif that is shared by all six proteins. A: For HP1423, surface-exposed and positively charged residues R2, K5, K36, K39, K42, and K67 are putative RNA binding sites. B: For Hsp15, surface-exposed and positively charged residues R10, K13, R24, R28, K44, and K47 are putative RNA binding sites. C: For pseudouridine synthase A, surface-exposed and positively charged residues R17, K22, R25, and K37 are putative RNA binding sites. D: For RPS4, S113, R115, Q116, R118, Q119, R122, and H123 (on α1 helix) and R131, R132, D134, and R139 (on L-shaped loop) are RNA binding regions. E: For tyr-tRS, S383, R389, N393, and R394 on the helix α2 are RNA binding regions. F: For YbcJ, surface-exposed and positively charged residues K40, R57, K58, R59, and K61 are putative RNA binding sites. G: Structure-based sequence alignment of the proposed αL RNA-binding motif in HP1423, Hsp15, ribosomal small subunit pseudouridine synthase A, RPS4, tyr-tRS, and YbcJ. The secondary structures of six different proteins are displayed. Sequence alignment was performed using ClustalW2 (Ref. 24) and Jalview (Ref. 25). [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]
In this αL motif, two arginine residues conserved in Hsp15 (R10, R28) have been proposed as the putative RNA-binding site.19,26 In addition, the structure of RPS4 bound to RNA reveals that an arginine residue on the α2 helix has an important role in RNA binding. However, unlike Hsp15 and RPS4, pseudouridine synthase A and HP1423 have only one arginine residue on the α1 helix, which is equivalent to R10 of Hsp15. Nevertheless, pseudouridine synthase A still maintains RNA-binding activity. Thus, the requirement for two arginines does not seem to be generally applicable to the classification of RNA-binding proteins.
We also compared the electrostatic potential maps of HP1423, Hsp15, pseudouridine synthase A, RPS4, tyr-tRS, and YbcJ to identify other putative RNA-binding residues of HP1423 (Supp. Info. Fig. 2). In HP1423, residues on α1 and the L-shaped loop, together with K67, constitute the positively charged surface, as noted earlier. In RPS4, pseudouridine synthase A, and YbcJ, the positive charges are notably clustered around the α2 helix and the loop. In tyr-tRS, the positively charged side chains extend toward the surface of the α2 helix and the turn of the second β-hairpin between β4 and β5. In Hsp15, the positive surface is on the α1 and α2 helices and the L-shaped loop. In the case of RPS4 and tyr-tRS, the RNA-binding sites correspond to surfaces with concentrated positive charge.21,22 Although the conformations of the αL motifs are similar to each other, the distribution of positively charged residues is somewhat different. These results strongly suggest that the positively charged residues on the αL motif play an important role in binding to RNA, but that the mode of RNA binding is not exactly the same across the S4 superfamily. Our analysis does not completely identify the RNA-binding residues, and further study is needed to determine the exact contact residues. However, it is very probable that the positively charged residues R2, K5, K36, K39, K42, and K67, which are strongly conserved in the S4 superfamily, have a crucial role in RNA binding.
In conclusion, the solution structure of HP1423 was successfully determined, which should be valuable for future RNA-binding studies and for elucidating the biological role of HP1423. Our structural analysis strongly suggests that the αL motif of HP1423 may bind to RNA in a manner similar, but not identical, to that of other S4 RNA-binding proteins.
ACKNOWLEDGMENTS
We thank Dr. H.S. Won (The Kunkuk University), Dr. K.D. Han, and Dr. M.D. Seo for their valuable comments.
REFERENCES
1. Cover TL, Blaser MJ. Helicobacter pylori infection, a paradigm for
chronic mucosal inflammation: pathogenesis and implications for
eradication and prevention. Adv Int Med 1996;41:85–117.
2. Pietroiusti A, Luzzi I, Gomez MJ, Magrini A, Bergamaschi A, For-
lini A, Galante A. Helicobacter pylori duodenal colonization is a
strong risk factor for the development of duodenal ulcer. Aliment
Pharmacol Ther 2005;21:909–915.
3. Warren JR, Marshall B. Unidentified curved bacilli on gastric epi-
thelium in active chronic gastritis. Lancet 1983;323:1311–1315.
4. Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, Nelson K, Quackenbush J, Zhou L, Kirkness EF, Peterson S, Loftus B, Richardson D, Dodson R, Khalak HG, Glodek A, McKenney K, Fitzgerald LM, Lee N, Adams MD, Hickey EK, Berg DE, Gocayne JD, Utterback TR, Peterson JD, Kelley JM, Cotton MD, Weidman JM, Fujii C, Bowman C, Watthey L, Wallin E, Hayes WS, Borodovsky M, Karp PD, Smith HO, Fraser CM, Venter JC. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 1997;388:539–547.
5. Aravind L, Koonin EV. Novel predicted RNA-binding domains
associated with the translation machinery. J Mol Evol 1999;48:291–
302.
6. Delaglio F, Grzesiek S, Vuister GW, Zhu G, Pfeifer J, Bax A. NMRPipe: a multidimensional spectral processing system based on Unix pipes. J Biomol NMR 1995;6:277–293.
7. Herrmann T, Guntert P, Wuthrich K. Protein NMR structure deter-
mination with automated NOE assignment using the new software
CANDID and the torsion angle dynamics algorithm DYANA. J Mol
Biol 2002;319:209–227.
8. Cornilescu G, Delaglio F, Bax A. Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J Biomol NMR 1999;13:289–301.
9. Wishart DS, Sykes BD. The 13C chemical-shift index: a simple
method for the identification of protein secondary structure using
13C chemical-shift data. J Biomol NMR 1994;4:171–180.
10. Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang JS, Kuszewski J, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL. Crystallography and NMR system (CNS): a new software system for macromolecular structure determination. Acta Cryst 1998;D54:905–921.
11. Koradi R, Billeter M, Wuthrich K. MOLMOL: a program for dis-
play and analysis of macromolecular structures. J Mol Graph 1996;
14:51–55.
12. DeLano WL. The PYMOL molecular graphics system. San Carlos,
CA: Delano Scientific; 2002.
13. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PRO-
CHECK: a program to check the stereochemical quality of protein
structures. J Appl Cryst 1993;26:283–291.
14. Laskowski RA, Rullmann JA, MacArthur MW, Kaptein R, Thornton
JM. AQUA and PROCHECK-NMR: programs for checking the
quality of protein structures solved by NMR. J Biomol NMR
1996;8:477–486.
15. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA. Electro-
statics of nanosystems: application to microtubules and the ribo-
some. Proc Natl Acad Sci USA 2001;98:10037–10041.
16. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local
alignment search tool. J Mol Biol 1990;215:403–410.
17. Holm L, Sander C. Dali: a network tool for protein structure com-
parison. Trends Biochem Sci 1995;20:478–480.
18. Berman HM, Battistuz T, Bhat TN, Bluhm WF, Bourne PE, Burkhardt K, Feng Z, Gilliland GL, Iype L, Jain S, Fagan P, Marvin J, Padilla D, Ravichandran V, Schneider B, Thanki N, Weissig H, Westbrook JD, Zardecki C. The protein data bank. Acta Crystallogr D Biol Crystallogr 2002;58(Pt 6):899–907.
19. Staker BL, Korber P, Bardwell JC, Saper MA. Structure of Hsp15
reveals a novel RNA-binding motif. EMBO J 2000;19:749–757.
20. Matte A, Louie GV, Sivaraman J, Cygler M, Burley SK. Structure of the pseudouridine synthase RsuA from Haemophilus influenzae. Acta Crystallogr Sect F Struct Biol Cryst Commun 2005;61(Pt 4):350–354.
21. Carter AP, Clemons WM, Jr, Brodersen DE, Morgan-Warren RJ,
Wimberly BT, Ramakrishnan V. Functional insights from the struc-
ture of the 30S ribosomal subunit and its interactions with antibi-
otics. Nature 2000;407:340–348.
22. Yaremchuk A, Kriklivyi I, Tukalo M, Cusack S. Class I tyrosyl-tRNA synthetase has a class II mode of cognate tRNA recognition. EMBO J 2002;21:3829–3840.
23. Volpon L, Lievre C, Osborne MJ, Gandhi S, Iannuzzi P, Larocque
R, Matte A, Cygler M, Gehring K, Ekiel I. The solution structure of
YbcJ from Escherichia coli reveals a recently discovered alphaL motif
involved in RNA binding. J Bacteriol 2003;185:4204–4210.
24. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving
the sensitivity of progressive multiple sequence alignment through
sequence weighting, position-specific gap penalties and weight ma-
trix choice. Nucleic Acids Res 1994;22:4673–4680.
25. Clamp M, Cuff J, Searle SM, Barton GJ. The Jalview Java alignment
editor. Bioinformatics 2004;20:426–427.
26. Davies C, Gerstner RB, Draper DE, Ramakrishnan V, White SW.
The crystal structure of ribosomal protein S4 reveals a two-domain
molecule with an extensive RNA-binding surface: one domain
shows structural homology to the ETS DNA-binding motif. EMBO
J 1998;17:4545–4558.