By Xianfeng (Jeff) Chen, Ph.D. Computational and Systems Biologist Plant Genomics, Bioinformatics, and Systems Biology Presented at:


Page 1:

By Xianfeng (Jeff) Chen, Ph.D.

Computational and Systems Biologist

Plant Genomics, Bioinformatics, and Systems Biology

Presented at:

Page 2:

Résumé of Dr. Xianfeng Chen

Page 3:

We have over 5,000 years of civilization, now with advanced technologies ...

However, do we have, now and in the future, a high-quality ecosystem, enough food and energy, and better, healthier living conditions to sustain life on this planet?

Page 4:

What am I doing at DOD ERDC? ---- Transcriptomics-Based Detection of Gene Signatures of Environmental Perturbations

Page 5:

Environmental Genomics on Sentinel Species

--- Ecotoxicogenomics

Example Project URLs:

(1) http://systemsbiology.usm.edu/BirdGenomics/

(2) http://systemsbiology.usm.edu/EGGT/

Page 6:

Mapping of the Modules onto Plant Drought Responsive Co-Expression Biological Networks

Software: (1) WGCNA: an R package for weighted correlation network analysis. (2) Cytoscape for display as well.
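
The core idea behind a weighted co-expression network can be sketched in a few lines. The sketch below is not the WGCNA package itself (which is written in R). It only illustrates the unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta. The gene names, expression values, and beta = 6 are illustrative assumptions, not data from this project.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor|^beta for every gene pair.

    expr maps a gene name to its expression values across samples.
    beta is the soft-thresholding power (an assumed value here).
    """
    genes = list(expr)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adjacency[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adjacency

# Hypothetical expression profiles across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
adj = soft_threshold_adjacency(expr)
for pair, weight in adj.items():
    print(pair, round(weight, 6))
```

Raising the correlation to a power keeps strongly correlated pairs close to 1 and pushes weak correlations toward 0, so modules can be detected without a hard cutoff. The resulting edge weights are the kind of table that a viewer such as Cytoscape can display.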

Page 7:

Agenda Today

(1) Cyber-infrastructure and Systems Biology --- Cutting Edge Science and Technology.

(2) Cowpea and common bean genomic gene space sequence clustering, assembly, annotation, and knowledgebase establishment ---- Genome Assembly and Analysis.

(3) Soybean (Glycine max), Medicago (Medicago truncatula), and Brachy (Brachypodium distachyon) WRKY transcription factor family classification and gene lead nomination for transgenic studies --- Data Mining and Lead Discovery.

(4) High level transgenic expression of promoters, isolated from soybean transcription factor genes, in lima bean and soybean for functional testing and validation --- Synthetic Biology.

Page 8:

Section One: Cyber-infrastructure and Systems Biology

Cutting Edge Science and Technology

Reductionist Approach: One Gene, One Protein

Systems Approach: Multiple Genes, Network Analysis

Page 9:

Systems Biology for Environment and Energy

Page 10:

Genomes Determine Ecosystem Behaviors

Page 11:

Status of Technologies in Systems Biology

Page 12:

First Genome Sequenced in 1995

Page 13:

First Human Genome Sequenced at a Cost of $3 Billion

Page 14:

Next Generation Sequencing (NGS) Platform

PacBio RS

Ion Proton

HiSeq 2500

Popular 2nd Generation Sequencers

NextSeq 500

Page 15:

HiSeq X Ten

• Cost: $10 million
• 125 bp read length
• $1,000 per genome

MinION

• Cost: $900
• 80 Kbp read length

Page 16:

World Distribution of Second-Generation Sequencers

China has become the third-largest region by number of second-generation sequencers, after North America and Europe. The numbers marked on the map are counts of second-generation sequencers. China's instruments are concentrated mainly in Shenzhen, Beijing, and Shanghai.

Page 17:

National Human Genome Research Institute (NHGRI)

Moore's Law, named after Intel co-founder Gordon Moore

At constant price, the number of transistors that fit on an integrated circuit doubles about every 18 months, and performance doubles as well.
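
As a quick worked example of the doubling rule stated on this slide, the growth factor after t months is 2^(t / 18). The function name below is hypothetical, and the 18-month period is the figure quoted on the slide.

```python
def transistor_growth_factor(months, doubling_period_months=18.0):
    """Growth factor implied by one doubling every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)

# Growth after one, two, and ten doubling periods.
for months in (18, 36, 180):
    print(months, "months ->", transistor_growth_factor(months), "x")
```

The same exponential form is a useful baseline when comparing how quickly sequencing cost has fallen relative to computing cost.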

Page 18:

Organisms with Sequenced Genomes

Bacteria: about 2300
Archaea: about 50
Protists: 13
Plants: 7 => Arabidopsis, rice, poplar, Chlamydomonas, soybean, medicago, brachy
Fungi: 14 => including Saccharomyces cerevisiae
Animals: 16 => including C. elegans, Drosophila, mouse, human

Total: about 2600 genomes completely sequenced so far

Page 19:

Developing from Genome Data to Full Cell Simulation

Our Current Effort

Page 20:

Bioinformatics Bottlenecks for Successful Implementation of Latest Systems Biology Technologies

Data Generation Capacity and Bioinformatics: No Marriage, Sad ...

Page 21:

Supporting Cyber-infrastructure and Systems Biology Workflow

Historic strong area

Supporting

2011 DOE Systems Biology Knowledge Base (Kbase) Initiative:

http://genomicscience.energy.gov/compbio/#page=news

Page 22:

Cyber-knowledge System to Enable Genomics-based Predictive Biology

Bioenergy, Biodefense, Environmental Quality Monitoring, and Bioremediation.

Page 23:
Page 24:

Cyber-infrastructure Component: High Performance Computing

--- Migration of Bio-Computing Capability for Large Sets of Data Analysis

Start point: PC, 1-2 CPUs Computing

Step 1: Unix Multiple CPUs Computing

Step 2: Cluster Computing or Supercomputing

Most Biology Labs

5-10 Biological Labs in US

Page 25:

Genomics Knowledgebase System Architecture ---- Legumes as Bioenergy Model Organisms

Project website: http://www.igece.org/LegumeSystemsBiology.html

Page 26:

Section Two: Genome Assembly and Analysis

--- CGKB: Cowpea Genomics Knowledge Base

Project website: http://cowpeagenomics.med.virginia.edu/CGKB/

We published in:

(1) BMC Bioinformatics 2007, 8:129.

(2) BMC Genomics 2008, 9:103.

Page 27:

Genome Annotation Strategy (1): Homology-based Annotation for Cowpea GSS

263,425 Total Cowpea Gene Space Sequences (GSS).

High level of coding regions detected!

We published in BMC Genomics 9:103, 2008.

Page 28:

Common Bean 415K Genomic Gene Spaces Clustering and Assembly for Unigene Production

Test Run   No. of Contigs   No. of Singlets   Parameter Setting
One        64,933           1,049             -c 4 -p 95 -l 60 -v 10
Two        62,140           219               -c 4 -p 98 -l 100 -v 5
Three      61,397           288               -c 4 -p 98 -l 40 -v 2

---- Used The Gene Indices Clustering Tools (TGICL), which uses megablast for homology-based clustering and CAP3 for assembly.

Diamond - SGI Altix ICE Supercomputer

Jade - Cray XT4 Supercomputer
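
To make the parameter settings on this slide easier to read, here is a minimal sketch of threshold-based single-linkage clustering. It is not TGICL, and it does no alignment or assembly. It assumes pairwise overlaps have already been computed, and it assumes that -p is the minimum percent identity, -l the minimum overlap length, and -v the maximum unmatched overhang (with -c being the CPU count). The sequence IDs and overlap values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Overlap:
    a: str                   # first sequence ID
    b: str                   # second sequence ID
    percent_identity: float  # compared against -p (assumed meaning)
    overlap_len: int         # compared against -l (assumed meaning)
    overhang: int            # compared against -v (assumed meaning)

def cluster(seq_ids, overlaps, min_pid, min_overlap, max_overhang):
    """Single-linkage clustering with union-find over accepted overlaps."""
    parent = {s: s for s in seq_ids}

    def find(s):
        # Follow parent links to the root, compressing the path on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for o in overlaps:
        accepted = (o.percent_identity >= min_pid
                    and o.overlap_len >= min_overlap
                    and o.overhang <= max_overhang)
        if accepted:
            parent[find(o.a)] = find(o.b)

    groups = {}
    for s in seq_ids:
        groups.setdefault(find(s), []).append(s)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    singlets = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singlets

# Hypothetical pairwise overlaps between four sequences.
ids = ["s1", "s2", "s3", "s4"]
hits = [
    Overlap("s1", "s2", 99.0, 150, 2),
    Overlap("s2", "s3", 96.0, 200, 1),
    Overlap("s3", "s4", 99.0, 50, 0),
]

# Thresholds copied from the three test runs in the table above.
for name, p, l, v in (("One", 95, 60, 10), ("Two", 98, 100, 5), ("Three", 98, 40, 2)):
    print(name, cluster(ids, hits, p, l, v))
```

In this toy data each setting yields a different grouping, which is why several test runs are worth comparing before settling on one assembly. A cluster here is only a candidate group for assembly, not a contig.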

Page 29:

Common Bean Genomics and Bioinformatics

Project website: http://systemsbiology.usm.edu/Legume/CommonBean/

Page 30:

Coding Potential Detection of 415K Common Bean Genomic Gene Spaces

Reference Proteins   No. of Proteins   Non-coding Detected   Coding Detected   Coding Potential
NR.aa                10 million        207,587               207,412           49.98%
Refseq               6.5 million       216,217               198,782           48.90%
SwissProt            0.5 million       291,412               123,587           29.78%
Uniref100            10 million        207,599               207,400           49.98%

Note: Sequenced at the Washington University Genome Center via Methylation Filtration Techniques
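
The coding potential column can be checked with simple arithmetic. The sketch below assumes that coding potential is the share of sequences with a detected coding region, that is coding / (coding + non-coding). That definition is inferred from the numbers on the slide rather than stated on it. Three of the reference sets are used for illustration.

```python
def coding_potential(non_coding, coding):
    """Percentage of sequences with a detected coding region (assumed definition)."""
    return 100.0 * coding / (non_coding + coding)

# (non-coding detected, coding detected) counts copied from the slide.
rows = {
    "NR.aa": (207_587, 207_412),
    "SwissProt": (291_412, 123_587),
    "Uniref100": (207_599, 207_400),
}
for name, (non_coding, coding) in rows.items():
    print(f"{name}: {coding_potential(non_coding, coding):.2f}%")
```

Each pair of counts sums to 414,999 sequences, which matches the 415K gene space sequences named in the slide title.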

Page 31:

Genome Annotation Strategy (2): Metabolic Pathway Integration for Cowpea GSS

We published in BMC Bioinformatics 2007, 8:129.

Page 32:

Genome Annotation Strategy (3): GO Integration with Distribution of Function Assignments for Cowpea GSS

We published in:

BMC Genomics 2008, 9:103.

BMC Bioinformatics 2007, 8:129.

Page 33:

Genome Annotation Strategy (4): Comparative Genomics at Genome-scale

---- Example of medicago vs cowpea

We published in BMC Genomics 2008, 9:103.

Page 34:

Genome Annotation Strategy (5): Comparison at Gene Family Level

--- WRKY and CONSTANS (CO) and CO-like Gene Families of Cowpea Transcription Factors

We published the methods in:

(1) BMC Genomics 9:103, 2008.
(2) Plant Physiology 147:280-295, 2008.
(3) BMC Plant Biology 10:237, 2010.
(4) BMC Bioinformatics 9:53, 2008.
(5) Plant Biotechnology Journal 1: 1-10, 2011.
(6) BMC Genomics 13:270, 2012.
(7) Journal of Agriculture Biotechnology 22 (5): 572-579, 2014.

Page 35:

Genome Annotation Strategies: (6) Repeat, (7) Domain, (8) Gene Model for Cowpea GSS

We published in: BMC Bioinformatics 8:129, 2007.

Repeat

Domain

Gene Model

Project website: http://cowpeagenomics.med.virginia.edu/CGKB/

Page 36:

Genome Annotation Strategy (9): Comparative Genomics on Network for Conserved Protein Complexes

Comparative genome analysis

Conserved networks

---- May not have been done in plants yet?

Page 37:

Genome Annotation Strategy (10): Functional Validation through Reverse Genetics Program

My name

2008

Page 38:

Section Three: Data Mining on Data Processed via Computational Approach in Kbase

Knowledge-based Discovery

Page 39:

Data Mining on Promoters and Transcription Factors

---- SURE, Soybean Upstream Regulatory Elements for Ongoing Regulatory Motif Annotation

http://systemsbiology.usm.edu/SystemsBiologyCenter/Soybean/Soybean_files/SURE.html

Page 40:

4,452 Predicted Transcription Factors in 76 Families

Project URL: http://www.igece.org/Soybean_TF/

We published in BMC Plant Biology 10:237, 2010.

Page 41:

Project website: http://compsysbio.achs.virginia.edu/tobfac/

We published in:

Plant Physiology 147:280-295, 2008.

BMC Bioinformatics 9:53, 2008.

Page 42:

1,134 Predicted Transcription Factors

Page 43:

Section Four: Nominating Transcription Factors Involved in Stress Response for Transgenic Studies

Group IX

Implicated in regulating wounding and jasmonate responses

Red Dot = Soybean ERF genes

Soybean Promoter: GmWRKYs, GmERFs, Gmubis, Gmcons, and more ...

10 promoters per month

Page 44:

Tested WRKY Promoters for Expression Assays

Page 45:

WRKYs Are the Key Proteins Regulating Plant Drought Tolerance

---- Climate Change Predicts Drought in the Near Future

2009 2039 2069

Page 46:

Project website: http://systemsbiology.usm.edu/PhytoTech/WRKY/

Page 47:

WRKYs across Plant Kingdom

Project website: http://systemsbiology.usm.edu/PhytoTech/WRKY/

Page 48:

Project URL: http://www.igece.org/WRKY/BrachyWRKY/

Page 49:

The WRKY Wide Web: The Database of WRKY Transcription Factors

MEME Analysis of Conserved Domains

We published in BMC Genomics 13:270, 2012.
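
As a small illustration of what a conserved-domain check looks like, the sketch below scans protein sequences for the WRKYGQK heptapeptide that gives the WRKY family its name. This is not the MEME analysis used for the database. MEME discovers motifs de novo, while this is only an exact-match scan. It also ignores the zinc-finger part of the domain and known variants of the heptapeptide. The example sequences are made up.

```python
import re

# Conserved heptapeptide at the core of the WRKY domain.
WRKY_MOTIF = re.compile(r"WRKYGQK")

def find_wrky_motifs(protein_seq):
    """Return 0-based start positions of WRKYGQK in a protein sequence."""
    return [m.start() for m in WRKY_MOTIF.finditer(protein_seq.upper())]

# Hypothetical protein fragments, not real gene models.
candidates = {
    "toy_protein_1": "MDDGYNWRKYGQKVVKGS",
    "toy_protein_2": "MASTKLLPEEGSS",
}
for name, seq in candidates.items():
    print(name, find_wrky_motifs(seq))
```

A scan like this is only a first-pass filter. Family classification still depends on the full domain alignment and the zinc-finger pattern.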

Page 50:

Soybean WRKY Promoter Transient Expressions at Peak

Lima Bean Cotyledons

Soybean Hairy Root

Video at: http://www.igece.org/WRKY/BrachyWRKY/SoySUREWRKY/WRKY.html

More details on methods: BMC Plant Biology 10:237, 2010

Page 51:

Where Are the Strong Promoters for WRKY Over-expression?

More details on the techniques, as we published: BMC Plant Biology 10:237, 2010.

Page 52:

Acknowledgement

DOD ERDC EL: Dr. Victor Medina, Ms. Xia Wang

DOD ERDC - ITL: Mr. Mark Cowan, Mr. Ken Lawrence, Mr. Dave Dumas, Mr. Phillip Bucci

UC, Berkeley & Davis: Dr. Chris Vulpe, Dr. Paul Gepts, Dr. Dawei Lin

University of Virginia: Dr. Mike Timko

The Ohio State University: Dr. John Finer

University of Michigan: Dr. Youqun (Oliver) He

Cornell University: Dr. Zhangjun Fei

University of Southern Mississippi: Dr. Chaoyang (Joe) Zhang, Dr. George Glover (System Admin)

IFXworks Corporation: Dr. Keith Steward, Dr. Vladimir Makarov, Dr. Rich Zhang

South Dakota State University: Dr. Paul Rushton

Institute of Green Energy & Clean Environment: Dr. Jason Li, Dr. Anna Ma, Dr. Joe Sung