Topics
•! What is ChEMBL?
•! Contents
•! Other ChEMBL resources
•! Questions
Who are we?
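•! ChEMBL (pronounced "kenburu" in Japanese) is a team at EMBL-EBI, near Cambridge, UK, formed in 2008 and specialising in small-molecule data
•! It began when a database developed by a company (Galapagos) was transferred to the EBI with a Wellcome Trust grant (£4.7M)
•! Group leader: John Overington; the group has 13 members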
Research & Databases at the EBI
•! Genomes: Ensembl, Ensembl Genomes, EGA
•! Nucleotide sequence: ENA
•! Functional genomics: ArrayExpress, Expression Atlas
•! Protein sequences: UniProt
•! Protein families, motifs and domains: InterPro
•! Macromolecular: PDBe
•! Protein activity: IntAct, PRIDE
•! Pathways: Reactome
•! Systems: BioModels, BioSamples
•! Literature and ontologies: CiteXplore, GO
•! Chemical entities: ChEBI
•! Chemogenomics: ChEMBL
-! ChEMBL database
-! Curation
-! Interface
-! Research group
-! IMI eTox
-! Industrial collaborations
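What is ChEMBLdb?
•! A database of compounds from drug research and development (drug-like cmpds)
•! Free to use and download
•! Compound structures and properties
•! SAR information
•! Target (protein) information
•! MedChem papers from the past 30 years
•! External data such as PubChem
•! FDA-approved drugs
•! Secure access (HTTPS)
ChEMBL interface (Japanese)
ChEMBL on Wikipedia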
Drug Discovery Process
> 6.9 million bioactivities
> 1.1 million compounds
> 8,000 targets
~12,000 candidates
~2,000 approved drugs
Stages: Target Discovery, Lead Discovery, Lead Optimisation, Preclinical Development, Phase 1, Phase 2, Phase 3, Launch
•! Target identification
•! Microarray profiling
•! Target validation
•! Assay development
•! Biochemistry
•! Clinical/Animal disease models
•! High-throughput Screening (HTS)
•! Fragment-based screening
•! Focused libraries
•! Screening collection
•! Medicinal Chemistry
•! Structure-based drug design
•! Selectivity screens
•! ADMET screens
•! Cellular/Animal disease models
•! Pharmacokinetics
•! Toxicology
•! In vivo safety pharmacology
•! Formulation
•! Dose prediction
PK tolerability; Efficacy; Safety & Efficacy; Indication Discovery & expansion
Med. Chem. SAR; Clinical Candidates; Drugs
Discovery; Development; Use
Clinical Trials
ChEMBL database
SAR Data
Compound
Assay
Ki = 4.5 nM
>Thrombin MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRRANTFLEEVRKGNLERECVEETCS
YEEAFEALESSTATDVFWAKYTACETARTPRDKLAACLEGNCAEGLGTNYRGHVNITRSGIECQLW
RSRYPHKPEINSTTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPR
SEGSSVNLSPPLEQCVPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKHQDFNSAVQLVENFCRN
PDGDEEGVWCYVAGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFG
SGEADCGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLI
SDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRD
IALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNL
PIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGC
DRDGKYGFYTHVFRLKKWIQKVIDQFGE
APTT 11 min
Target
Compound
Bioactivity
What is ChEMBL data?
Extraction and Curation
•! Manually extracted from papers in MedChem journals (outsourced)
-! J Med Chem, Bioorg Med Chem Lett, J Nat Prod etc
•! Checked in-house by curators (specialists in chemistry and in biology)
-! Chemical: Incorrect structures, Duplications, Missing salts etc
-! Biological: Normalise gene names, Assign UniProt IDs, Target classification, Confidence scores
•! Regular updates (once every 3 to 4 months)
•! Data sharing with PubChem
-! Only confirmatory data (a single interaction value between a compound and a protein) are imported, on a regular basis.
-! PubChem also incorporates ChEMBL's literature data.
-! Compound overlap is 2% or less.
•! Neglected Disease Dataset (mainly malaria data)
-! GSK & Novartis Malaria Screening Data
-! Drugs for Neglected Diseases Initiative (DNDi) etc
External Data Import
[Figure: compound sets. ChEMBL Literature: 667,868; PubChem Assays: 410,112; 10,909 compounds in common]
FJ Gamo et al. Nature 465(7296) 305-310 (2010)
ChEMBL Schema
[Figure: schema diagram. Labels: Compounds, Drugs, Targets, Experiments (assays), Literature]
Compounds
•! Compound structures are stored as Molfiles.
•! Compounds are identified by Standard InChI (stereochemistry is taken into account).
•! Calculated properties (MW, PSA, logP, Ro5, Med_Chem_Friendly etc)
•! Molregno is the internal compound ID.
•! CHEMBL_ID is the official database ID. It is unique across Compounds, Assays and Targets.
-! The prefix "CHEMBL" precedes the number.
-! Example: CHEMBL123 (molregno=22942, chebi_id=122942)
•! Proteins and similar entries are also included, though in small numbers (mainly drugs; no structure information)
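As a small illustration of the CHEMBL_ID convention described above (the literal prefix "CHEMBL" followed by a number), the sketch below validates such an identifier and extracts its numeric part. The function names and the regular expression are illustrative assumptions based only on the slide, not part of ChEMBL itself. Note that the numeric part is not the molregno: in the slide's example, CHEMBL123 has molregno 22942.

```python
import re

# Assumed pattern, based only on the slide: "CHEMBL" followed by digits.
_CHEMBL_ID = re.compile(r"CHEMBL(\d+)")


def split_chembl_id(identifier: str) -> int:
    """Return the numeric part of a CHEMBL_ID such as 'CHEMBL123'.

    Raises ValueError when the prefix is missing or no digits follow it.
    """
    match = _CHEMBL_ID.fullmatch(identifier.strip())
    if match is None:
        raise ValueError(f"not a CHEMBL_ID: {identifier!r}")
    return int(match.group(1))


def is_chembl_id(identifier: str) -> bool:
    """True if the string has the form of a CHEMBL_ID."""
    return _CHEMBL_ID.fullmatch(identifier.strip()) is not None
```

Because the same ID space covers compounds, assays and targets, a check like this says nothing about which kind of entity an ID refers to.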
[Figure: compound report page with callouts for compound properties and other compound information]
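Targets
•! Mainly proteins; other target types such as cell lines are also covered.
•! Protein targets are identified by UniProt accession and are further classified hierarchically.
-! e.g. Enzyme > Kinase > Protein Kinase > TK > EGFR
•! Cases where target assignment is difficult
-! Compound known only to bind to receptor family
-! e.g. activity reported vs. 'Muscarinic receptors'
-! Compound binds to multi-complex
-! e.g. Ion channel
[Figure: target report page with callouts for target class and target information; target types shown: Protein, DNA, Organism, Cell Line]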
Experiments
•! Besides Binding assays, which measure the binding of a compound to a target, there are Functional and ADMET assays.
•! Activity data come with a variety of endpoints.
-! IC50 (half maximal inhibitory concentration)
-! Ki (binding affinity)
-! MIC (minimum inhibitory concentration)
-! % Inhibition (of activity)
•! Activity values are standardised (though not in every case).
-! Standard Values, Units and Types
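The standardisation mentioned above can be pictured as converting each reported value to a common unit. The following is a minimal sketch, assuming concentration-type endpoints (such as IC50 or Ki) and nM as the standard unit; the unit table and function names are hypothetical and do not reflect how ChEMBL implements this internally.

```python
import math

# Multipliers that convert a molar concentration unit to nM.
# Illustrative subset only; the units ChEMBL actually handles are not listed in the slides.
TO_NANOMOLAR = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}


def standardise(value: float, units: str) -> float:
    """Convert a concentration to nM; raise ValueError for unknown units."""
    try:
        return value * TO_NANOMOLAR[units]
    except KeyError:
        raise ValueError(f"unsupported units: {units!r}") from None


def p_value(value_nm: float) -> float:
    """Negative log10 of the molar concentration (e.g. pKi, pIC50)."""
    if value_nm <= 0:
        raise ValueError("concentration must be positive")
    # 1 nM = 1e-9 M, so -log10(value_nm * 1e-9) = 9 - log10(value_nm)
    return 9.0 - math.log10(value_nm)
```

With the Ki = 4.5 nM shown on the thrombin example slide, `p_value(4.5)` is about 8.35.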
[Figure: activity table with callouts for assay, activity data, assay and target, and standardised activity value (nM)]
Types of Functional Assays
Whole organism assays
(e.g., anti-infectives/parasitics)
Disease-derived cell-line
(e.g., human ovarian cancer cell line cytotoxicity)
Tissue or cell-based disease model
(e.g., glucose uptake by adipocytes)
Tissue or cell-based assay for target effect
(e.g., contraction of guinea-pig ileum)
Cell-based assay over-expressing target
(e.g., GPCR calcium mobilisation)
[Figure axes: Target association (target molecule) and Disease association (disease)]
Marketed Drugs
•! Drug class: small molecule, peptide, antibody etc.
•! Rule of Five compliant
•! First-in-class?
•! Oral delivery? Parenteral delivery? Topical delivery?
•! Single enantiomer? Prodrug?
•! Boxed warning?
•! Information on FDA-approved drugs is extracted from the Orange Book
•! The latest drug information (Recent Drug Approvals) is kept up to date on the blog
Web Services
•! Programmatic access is available through a REST API
•! Compound (similarity & substructure), Target, Bioactivity Search
•! Sample code is provided in Java, Perl and Python
•! XML and JSON output formats are supported
•! Example: compound lookup by CHEMBL_ID
•! https://www.ebi.ac.uk/chemblws/compounds/CHEMBL1
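The URL pattern in the example above can be assembled programmatically. This is a sketch only: the base URL and the `/compounds/<CHEMBL_ID>` path are taken from the slide, while the optional format suffix (such as `.json`) and the helper names are assumptions. The service described here dates from 2012 and may no longer respond at this address.

```python
import re
import urllib.request

BASE_URL = "https://www.ebi.ac.uk/chemblws"  # base address shown on the slide


def compound_url(chembl_id: str, fmt: str = "") -> str:
    """Build the lookup URL for one compound.

    `fmt` is an optional output format appended as a suffix (assumed convention).
    """
    if re.fullmatch(r"CHEMBL\d+", chembl_id) is None:
        raise ValueError(f"not a CHEMBL_ID: {chembl_id!r}")
    suffix = f".{fmt}" if fmt else ""
    return f"{BASE_URL}/compounds/{chembl_id}{suffix}"


def fetch_compound(chembl_id: str, fmt: str = "", timeout: float = 10.0) -> bytes:
    """Download the raw response body for one compound (needs network access)."""
    with urllib.request.urlopen(compound_url(chembl_id, fmt), timeout=timeout) as response:
        return response.read()
```

`compound_url("CHEMBL1")` reproduces the address shown on the slide; `fetch_compound` is kept separate so the URL logic can be checked without a network connection.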
ChEMBL for Drug Discovery
Physchem Property Space and Affinity
pIC50 binding data (… compounds)
[Figure: affinity (Good/Bad scale) by molecular weight (<300, 300-400, 400-500, 500-600, >600) and ALogP, with the RO5 region and Drugs marked]
- Potency generally increases with MWT
Physchem Property Space and ADMET
Rat bioavailability (…)
[Figure: bioavailability (Good/Bad scale) by molecular weight (<300, 300-400, 400-500, 500-600, >600) and ALogP, with the RO5 region and Drugs marked]
- High MWT tends to lead to poor ADMET properties
QED Drug-likeness Trend
Drug-likeness Trends
Most compounds from the literature satisfy the RO5 properties.
&:;<[\A6¶>�A=>�Qqr�r
[Figure: % of compounds per year; series: Lipinski RO5]
Rule of Five Trend
Bickerton et al. 2012
Comparison of Datasets
[Figure: median property (MW, ALOGP, AROM) for ChEMBL_13, PubChem, Drug and Drug>1999; values shown: 418, 369, 353, 401]
*chembl_13 data from literature
*cmpds with MWT >1000 not included
Ligand Efficiency
C. Abad-Zapatero DDT, 2005, 10, 464-469
[Figure: plot of BEI and SEI]
BEI = pXC50*1000/MWT
SEI = pXC50*100/PSA
•! Better to optimise compounds with highest efficiency not highest potency
•! Puts compounds with different MWTs on same scale
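The two indices can be computed directly from the formulas on this slide. The sketch below assumes the activity is given in nM and converts it to pXC50 first; the function names are illustrative, and the compounds in the note that follows are made-up values, not ChEMBL data.

```python
import math


def pxc50(activity_nm: float) -> float:
    """Negative log10 of a molar activity value supplied in nM."""
    if activity_nm <= 0:
        raise ValueError("activity must be positive")
    return 9.0 - math.log10(activity_nm)


def bei(pxc50_value: float, mwt: float) -> float:
    """Binding efficiency index: pXC50 * 1000 / MWT."""
    if mwt <= 0:
        raise ValueError("molecular weight must be positive")
    return pxc50_value * 1000.0 / mwt


def sei(pxc50_value: float, psa: float) -> float:
    """Surface efficiency index: pXC50 * 100 / PSA."""
    if psa <= 0:
        raise ValueError("polar surface area must be positive")
    return pxc50_value * 100.0 / psa
```

For example, a hypothetical 10 nM compound with MWT 500 has pXC50 = 8 and BEI = 16, while a hypothetical 100 nM compound with MWT 300 has pXC50 = 7 and a BEI of about 23.3: the less potent compound is the more efficient one, which is the point of the first bullet above.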
Colored by
Activity Value
Prevalent Rings in ChEMBL
%_<=1990 %_2001-2010
% - no. of times ring appears /total no of rings reported for that
year range
in pre 1990 top 12 but not in 2001-2010 top 12
in 2001-2010 top 12 but not in pre 1990 top 12
Very few rings in the majority of bioactive molecules
similar to P Ertl J Med Chem 2006, 49,4568-4573
How to search in ChEMBL
Searching ChEMBLdb
•! Identifying Compounds interacting with Specific
Targets
-! Text search for protein names/synonyms
-! Browse protein or organism tree
-! Sequence search using BLAST
•! Compound Searching
-! Search by substructure or similarity
-! Search by compound name
-! Search by lists (smiles, names, IDs)
ChEMBLdb Interface
Browse Targets
Research Results
Serotonin Receptor
Calc.
properties
Drug
Information
Clickable structure
Parent and Salt
Forms
Database links
Webinar (5/30)
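Other ChEMBL Resources
•! SARfari: SAR information database for Kinase/GPCR (protein family focused)
•! ChEMBL-NTD: database of drug candidate compounds for tropical diseases (malaria)
•! DrugEBIlity: database for drug target validation (druggability)
•! eTOX: toxicology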
ChEMBL Timeline
[Figure: number of bioactivities per year, 1980 to 2010 (y-axis 0 to 40,000), by protein family: GPCR, Kinase, Protease, Ion Channel, Nuclear Receptor, Transporter]
Protein Families in ChEMBL
% of Bioactivities in ChEMBLdb: GPCR 33%, Kinase 12%, Protease 9%, Ion Channel 5%
Kinase Protease Ion Channel GPCR Transcription
Factor
Top10 by Target Class
ChEMBLdb Contents (Targets)
% of Bioactivities
in ChEMBLdb
H3 CRTH2 P2Y12 EP3
Number of Bioactivities by Target (Top50)
Year: 2009 ~ Current
Number of Bioactivities by Compound (Top100)
Clozapine P2X(2) Antagonist
Year: 2009 ~ Current
Trends in GPCR Family
Kinase SARfari
Kinome Tree Target Browser
Kinase Domains & Compounds
Binding Site Similarity
3D Structure Analysis
ChEMBL DrugEBIlity Portal
Druggability and Tractability
[Figure: a Ligand is "Drug-like" (Drug-likeness, Rule of Five); a Protein Target is "Druggable" (Druggability)]
* Lipinski & Hopkins, Nature 2004
              MW           HBD    HBA    LogP   RotB
Druggability  100<MW<550   ≤ 5    ≤ 10   ≤ 5    ≤ 10
Tractability  200<MW<800   ≤ 8    ≤ 15   ≤ 8    ≤ 16
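The property windows in this table can be expressed as a simple filter. This is a sketch under a stated assumption: the comparison marks for HBD, HBA, LogP and RotB are read as upper bounds (≤), which the extracted slide text does not confirm. The class and constant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ligand:
    mw: float    # molecular weight
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors
    logp: float  # calculated logP
    rotb: int    # rotatable bonds


@dataclass(frozen=True)
class PropertyWindow:
    mw_min: float
    mw_max: float
    hbd_max: int
    hba_max: int
    logp_max: float
    rotb_max: int

    def contains(self, ligand: Ligand) -> bool:
        """True if every property of the ligand falls inside the window."""
        return (
            self.mw_min < ligand.mw < self.mw_max  # strict bounds, as in the table
            and ligand.hbd <= self.hbd_max         # assumed upper bound
            and ligand.hba <= self.hba_max         # assumed upper bound
            and ligand.logp <= self.logp_max       # assumed upper bound
            and ligand.rotb <= self.rotb_max       # assumed upper bound
        )


# Values copied from the table; the "<=" reading is the assumption noted above.
DRUGGABILITY = PropertyWindow(100, 550, 5, 10, 5, 10)
TRACTABILITY = PropertyWindow(200, 800, 8, 15, 8, 16)
```

Under this reading the tractability window is the wider of the two at the upper end, so a larger, more flexible ligand can fall outside the druggability window while still counting as tractable.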
[Figure: histograms of Frequency against Avg. Druggability for Protein Kinases, Protein Phosphatases and Proteases]
Druggability Result -Family-
Avg. Druggability
Transcription Factor
(# of domains per each protein > 3)
Druggability Result -Site Details-
Tyrosine protein kinase
[Figure: Frequency (0 to 0.35) against Buried Surface Area [%] (56 to 88) for Small Drug Sites and All Data, with Site1 and Site4 marked]
[Figure: Avg. Ensemble Score and Avg. Druggable Score. Circles: Known Drug Targets. Labelled points: Dihydrofolate Reductase (DHFR), Protein Kinase (…), Transcription Factor (…), Membrane Protein (…), Prothrombin, Galectin-…]
Druggability Plots
Publication
•! Bellis LJ, Akhtar R, Al-Lazikani B, Atkinson F, Bento AP, Chambers J, Davies M, Gaulton A, Hersey A, Ikeda K, Krüger FA, Light Y, McGlinchey S, Santos R, Stauch B, Overington JP. Biochem Soc Trans. 2011 Oct;39(5):1365-70.
•! Gaulton A, Bellis L, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Akhtar R, Atkinson F, Bento AP, Al-Lazikani B, Michalovich D, Overington JP. NAR. 2011 Database Issue.
The ChEMBL-og (latest news)
•! ChEMBL blog
-! http://chembl.blogspot.co.uk
Webinar
•! Online seminars (webinars)
-! 16-May-2012 3:30pm Schema and sql querying
-! 30-May-2012 9:00am Interface and Searching (in Japanese)
-! 13-Jun-2012 3:30pm Interface and Searching
-! 27-Jun-2012 3:30pm Schema and sql querying
-! 11-Jul-2012 3:30pm Interface and Searching
-! http://chembl.blogspot.co.uk/2012/02/chembl-webinars-for-2012.html
•! 5Ì30:RGSB:9*¡BHI5*{�
-! JChEMBL)>&%®¯(,�8KZ9rq
-!LMNAOPR:9°S
-! ��^\êQR��
-! Contact: [email protected]
Acknowledgment
Mark Davies*
Shaun McGlinchey
Yvonne Light* Louisa Bellis*
Ruth Akhtar
Francis Atkinson**
Patricia Bento
George Papadatos**
Jon Chambers** Anna Gaulton**
Anne Hersey**
John Overington* **
(* previously a researcher in industry; ** from a pharma background)