Genetics Lecture Series (Cátedra de Genética)
Universidad Nacional Autónoma de Colombia, Sede Bogotá
Layla Michán
Departamento de Biología Evolutiva, Facultad de Ciencias, UNAM
The (r)evolution of information in biology: the case of genomics



BIOLOGICAL INFORMATION

TYPES
1. Biological
2. Bibliographic
3. Institutional

AREAS
1. Biodiversity
2. Biomedicine

APPLICATIONS
1. Obtaining new knowledge (biology)
2. Analysis of current science (bibliometrics, history, sociology)
3. Information science and biological documentation
4. Scientific planning, evaluation, management, and policy

PROBLEMS
1. The digital (r)evolution in the biological sciences
2. Characteristics of e-biology
3. Dynamics, structure, and relationships of recent biology
4. Scientific publishing
5. Web resources and cyberinfrastructure for biology
6. Data collections
7. Meta-analysis of the literature: bibliometrics, network analysis, text mining, semantics

APPROACHES
1. Biology
2. Information and documentation science
3. Information and communication technologies
4. Computer science

History of science: stages

• Antiquity (3rd century BC to 5th century AD)
• Middle Ages (5th to 14th centuries)
• Renaissance (15th to 16th centuries)
• Early Modern period (17th to 18th centuries)
• Modern period (1800-1950)
• Recent or Contemporary period (1950-2010)

History of biology (problems)

Natural history (in vivo)
– 3rd century BC to 19th century

Biology
– 20th century (in vitro): molecularization
– 21st century (in silico): computerization

The problems of heredity in the 19th century

• Biology is a branch of knowledge that arose from the synthesis of several scientific traditions. The most important was undoubtedly natural history, but during the 19th century there were others, perhaps less prolific in output yet equally relevant for the conceptual innovations they contributed to present-day biology. Among them:
• Heredity, which studied how traits are transmitted from parents to offspring, and which drew on two traditions:
– That of the breeders and improvers, who focused on improvement (especially of plants), that is, on the inheritance of traits important to humans (today's agronomy).
– That of the hybridizers, who studied the inheritance of traits and the nature of species (essentialist versus nominalist), a tradition begun with Linnaeus.
• There were two ways of studying the inheritance of traits:
– Family trees, or pedigrees
– Crosses between individuals with different traits

Gregor Mendel (1822-1884)

• In 1900 the work of Gregor Mendel was rediscovered. In 1865 he had presented the results of his studies on peas to the Brünn Society.
• The main postulates derived from his experiments were:
– Certain easily distinguishable traits show predominance of one over the other in the progeny of a cross between parents with opposite or contrasting traits.
– Two traits are transmitted as separate elements, one from the father and one from the mother, where one is dominant and the other recessive. The traits that are transmitted and appear in the first generation are the dominant ones; those that remain hidden or latent are the recessive ones.
• Mendel used the term "recessive" because these traits disappear in the first generation but reappear in subsequent ones.
• He recognized that these pairs act independently of other pairs of traits.
• From these two principles he derived some arithmetic rules that govern inheritance. For the progeny of hybrids that differ in one pair of traits (R and r), the ratio in the progeny will be three dominant to one recessive. When they differ in two pairs of traits, the ratio will be 9:3:3:1.
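The 3:1 and 9:3:3:1 ratios can be checked by enumerating every combination of parental gametes, as in a Punnett square. The following is a minimal sketch, assuming complete dominance and independent assortment; the function names and the second trait pair (Y and y) are illustrative assumptions, not part of the original slides.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """Return every gamete a parent can produce.

    `genotype` is a list of allele pairs, one pair per trait,
    e.g. [("R", "r"), ("Y", "y")] for a hybrid in two traits.
    """
    return list(product(*genotype))


def phenotype(child):
    """A trait shows the dominant form if at least one allele is uppercase."""
    return tuple(
        "dominant" if any(allele.isupper() for allele in pair) else "recessive"
        for pair in child
    )


def cross(parent1, parent2):
    """Count offspring phenotypes over all equally likely gamete pairings."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        child = tuple(zip(g1, g2))  # one allele per trait from each parent
        counts[phenotype(child)] += 1
    return counts


# One pair of traits (R, r): hybrid x hybrid
monohybrid = cross([("R", "r")], [("R", "r")])

# Two pairs of traits; the pair (Y, y) is a hypothetical second trait
dihybrid = cross([("R", "r"), ("Y", "y")], [("R", "r"), ("Y", "y")])

print(dict(monohybrid))
print(dict(dihybrid))
```

The monohybrid cross yields four equally likely combinations (three showing the dominant trait, one the recessive), and the dihybrid cross yields sixteen, which group into the 9:3:3:1 pattern.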

Rediscovery of Mendel's laws

In 1900, three researchers independently published the results of their work, each referring to Mendel's "precursor" work:

• The Dutch botanist Hugo de Vries (1848-1935)
• The German botanist Carl Correns (1864-1933)
• The Austrian Erich Tschermak von Seysenegg (1871-1962)

This event marked the beginning of the study of genetics; Mendelian knowledge was then extended and applied.

• Only Correns fully understood Mendel's work and its consequences. Neither De Vries nor Tschermak understood concepts such as dominance, and both merged Mendel's two laws into one. It is therefore clear that Mendel's work was not at first understood, either in its technical aspects or in its importance. In fact, its relevance was recognized before it was understood technically.

Thomas H. Morgan (1866-1945)

• Chromosome theory of heredity
• Chromosome maps
• Sex-linked inheritance
• Nobel Prize, 1933

The term "molecular biology" was coined in 1938 by Warren Weaver, Director of the Natural Sciences Division of the Rockefeller Foundation.

Warren Weaver (1894-1978)
Warren Weaver Hall, Washington

Historical milestones: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick
Nobel Prize, 1962

1990: ICT AND BIOLOGY

The knowledge society

• The term refers to the spectacular increase and unprecedented acceleration in the pace at which information and knowledge are created, accumulated, distributed, and put to use.
– Knowledge is created, accumulated, disseminated, and used because it guides decisions and makes it possible to intervene in the world in accordance with certain ends and values (Morales 2001).
• The model of the knowledge society is still under construction, as is society itself (Olivé 2005).

The (r)evolution of information (1990)

• Design of computers
• Emergence of the Internet
• Mass adoption of the web
• Digital format: low cost, little space
• Information explosion
• Large number of data collections
• Dynamic: both content and formats are in constant change
• Massive
• Meta-analysis applications

The (r)evolution of (scientific) information

• Adoption of the electronic format
• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet
• Use of the web as a means of communication
• The number of researchers and of their publications doubles approximately every twenty years
• Between 80% and 90% of all the scientists who have ever existed are alive today
• More than two million articles are published every year
• One million patents are granted
• Changes in the way information is produced, evaluated, and consulted:
– Electronic journals
– Electronic processing
– Open review
– E-prints, preprints, and postprints
– Immediacy
• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

• Observation
• Description
• Classification
• Experimentation
• Comparison

Scientific research (biology)

• In vivo
• In vitro (1900)
• In silico (1990)

[Figure: documents per year, 1900-2005. Legend: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year, Documents.]

[Figure: documents per year, 1900-2005. Legend: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year, Documents.]

Figure 21

[Figure: bar chart of documents by subject area. Categories: Genetics & Heredity; Biochemistry & Molecular Biology; Cell Biology; Developmental Biology; Multidisciplinary; Neurosciences; Biology; Zoology; Evolutionary Biology; Ecology; Entomology; Toxicology; Physiology; Biophysics; Biotechnology & Microbiology; Behavioral Sciences. Axes: Subject Area, Documents.]

Figure 22

[Figure: documents per year, 1905-2005. Series: Genetics & Heredity; Biochemistry & Molecular Biology; Cell Biology; Developmental Biology; Neuroscience. Axes: Year, Documents.]

Figure 23

[Figure: documents per year, 1905-2005. Series: Evolutionary Biology; Ecology; Zoology; Toxicology. Axes: Year, Documents.]

Gene sequence collections
E-science
Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations.
• National Center for Biotechnology Information (NCBI)
• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)
• DNA Data Bank of Japan (DDBJ)
• These centers receive sequences produced in laboratories all over the world, from more than 100,000 distinct organisms.
• It grows exponentially, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences.
• It is built from direct submissions by individual laboratories and by large-scale sequencing centers.

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434–1435.
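To make the GenBank "doubling every 10 months" claim concrete, here is a minimal sketch of the underlying arithmetic, assuming idealized exponential growth N(t) = N0 · 2^(t/T) with a constant doubling time T. The function names are illustrative, and the projection is only a back-of-the-envelope extrapolation from the Release 134 figure, not real release data.

```python
import math


def projected_size(initial, months, doubling_months=10.0):
    """Size after `months`, assuming a constant doubling time."""
    return initial * 2 ** (months / doubling_months)


def months_to_reach(initial, target, doubling_months=10.0):
    """Months needed to grow from `initial` to `target` under the same assumption."""
    return doubling_months * math.log2(target / initial)


# Starting point taken from the slide: about 29.3 billion bases (February 2003)
release_134_bases = 29.3e9

after_10_months = projected_size(release_134_bases, 10)
after_30_months = projected_size(release_134_bases, 30)
months_to_tenfold = months_to_reach(release_134_bases, 10 * release_134_bases)

print(f"{after_10_months:.3e} bases after 10 months")
print(f"{after_30_months:.3e} bases after 30 months")
print(f"{months_to_tenfold:.1f} months to grow tenfold")
```

Under this assumption the collection doubles after 10 months, grows eightfold after 30 months, and takes a little under three years to grow tenfold.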

[Figure: documents in PubMed (NIH) per year, 1865-2005. About 20 million documents (October 2010).]

e-science and cyberinfrastructure

• e-science (Europe)
– United Kingdom's Office of Science and Technology, 1999
– Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet
• cyberinfrastructure (USA)
– United States National Science Foundation (NSF) blue-ribbon committee, 2003
– Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007).
• It comprises an interoperable set of elements:
– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools
– 2) Data collections
– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from increased access to:
– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories
– Software and infrastructure services that ensure connectivity and interoperability
– Application tools, covering both discipline-specific tools and interaction tools
• It advances and augments, rather than replaces, traditional research methodologies.
• It will allow researchers to carry out their work more creatively, efficiently, and collaboratively over long distances, and to disseminate their results with greater effect.
• Collaboration
• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks.

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.
• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data – all three of which are still rapidly expanding – makes possible new applications that are orders of magnitude more potent than even a few years ago.
• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than to the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• It results from the use and application of cyberinfrastructure in scientific practice.
• It is characterized by inter- and multidisciplinarity.
• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009).

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: "Initial sequencing and analysis of the human genome"
• 79 authors
• 48 institutions
• 181 references
• All authors came from departments of genomic sciences (or genetics), except those from the Department of Cellular and Structural Biology, the Department of Molecular Genetics, and the Department of Molecular Biology.

Science: "The Sequence of the Human Genome"
• 276 authors
• 14 institutions
• 452 references
• All authors came from departments of genomic sciences (or genetics), except those from the Department of Biology and from Medical Informatics.

E-science: timeline

• 1865: Gregor Mendel discovers the laws of genetics
• 1953: James Watson and Francis Crick describe the double-helix structure of DNA
• 1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code
• 1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology
• 1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods
• 1982: The GenBank database is established
• 1985: The polymerase chain reaction (PCR) is invented
• 1990: The Human Genome Project (HGP) begins in the United States
• 1997: The genome of Escherichia coli is sequenced
• 2003: The sequence of the human genome is completed

Source: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• It is the systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped to explain polymorphisms within species, protein interactions, and evolution (Brent 2000).
• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).
• Thanks to this new discipline, our ability to study gene function is becoming more specific (Collins et al. 2003), and it promises to accelerate scientific discovery in every area of biological science (Burley 2000).

[Figure: number of publications per year, 1988-2009. Axes: Year, Publications.]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut.

[Figure: number of publications by institution. Axes: Institution, Publications.]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.
• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses.
Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research.
Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040
• REVIEW
• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
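The coverage figure quoted in the abstract can be approximated from the other numbers it gives: fold coverage is the total number of sequenced bases divided by the genome size, and mean read length is the total number of bases divided by the number of reads. A minimal sketch follows; the function names are illustrative assumptions, and the result lands slightly below the published 5.11-fold, presumably because of rounding in the quoted totals or a slightly different genome-size estimate.

```python
def fold_coverage(total_bases, genome_size):
    """Average number of times each genome position is sequenced."""
    return total_bases / genome_size


def mean_read_length(total_bases, n_reads):
    """Average number of bases per sequence read."""
    return total_bases / n_reads


# Figures as quoted in the abstract
total_bases = 14.8e9      # 14.8 billion bp of sequence
genome_size = 2.91e9      # 2.91 billion bp consensus sequence
n_reads = 27_271_853      # high-quality sequence reads

coverage = fold_coverage(total_bases, genome_size)
read_length = mean_read_length(total_bases, n_reads)

print(f"Approximate coverage: {coverage:.2f}-fold")
print(f"Approximate mean read length: {read_length:.0f} bp")
```

The same two divisions apply to any shotgun project, which makes them a quick sanity check on reported sequencing depth.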

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• “Omics” studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term “omics” derives from the Greek suffix “ome”, meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that try to measure thousands of parameters instead of just a few have become popular. This approach is what gave rise to modern “omics” experiments.

• The ultimate goal of “omics” approaches is to understand biological processes comprehensively and integrally by identifying and correlating several “players” (e.g. genes, RNA, proteins, metabolites) instead of studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the “omics”. TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
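The statistical problem mentioned in the OMICS slide (an enormous number of measurements tested on few samples) can be illustrated with a little arithmetic. The sketch below is not taken from the cited paper. It assumes that every test is independent and that no parameter is truly associated with the disease, and the values 1000 parameters and a 0.05 significance level are hypothetical choices made only for illustration.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of chance 'hits' when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Family-wise error rate for independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate near alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_parameters = 1000  # hypothetical number of measured analytes
    alpha = 0.05         # hypothetical per-test significance level
    print(expected_false_positives(n_parameters, alpha))
    print(prob_at_least_one_false_positive(n_parameters, alpha))
    print(bonferroni_threshold(n_parameters, alpha))
```

Under these assumptions, about 50 of the 1000 parameters would look associated with the disease by chance alone, which is one plausible reason why follow-up studies fail to confirm initial findings.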

Omics | Article | URL
Genomics | Genomics: buzzword or reality? | http://view.ncbi.nlm.nih.gov/pubmed/10343163
Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM
Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee
Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229
Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf
Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222
Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223
Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226
Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230
Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231
Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232
Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233
Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656
Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040
Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793
Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340
Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

“It has not escaped our notice that the more we explore the human genome, the more there is left to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time.”

Thomas Stearns Eliot, 2001


The problems of heredity: 19th century

• Biology is a branch of knowledge derived from the synthesis of several scientific traditions. The most important is undoubtedly Natural History, but during the 19th century there were others, perhaps less abundant in output but equally relevant for the conceptual innovations they contributed to present-day biology. Among them:

• Heredity, which studied how traits are transmitted from parents to offspring, based on:

– The tradition of breeders and improvers, focused on improvement (especially of plants), that is, on the inheritance of traits important to humans (present-day agronomy)

– The tradition of the hybridizers, who studied the inheritance of traits and the nature of species (essentialist and nominalist), begun with Linnaeus

• There were two ways of studying the inheritance of traits:

– Family trees or pedigrees

– Crosses between individuals with different traits

Gregor Mendel (1822-1884)

• In 1900 the work of Gregor Mendel was rediscovered. In 1865 he had announced the results of his studies with peas before the Brünn Society.

• The main postulates of his experiments were:

– Certain easily distinguishable traits show dominance of one over the other in the progeny of a cross between parents with opposite or differentiated traits.

– Two traits are transmitted as distinct elements, one from the father and one from the mother, of which one is dominant and the other recessive. The traits that are transmitted and appear in the first generation are the dominant ones, and those that remain hidden or latent in the process are the recessive ones.

• Mendel used the term recessive because these traits disappear in the first generation but reappear in subsequent ones.

• He recognized that these pairs act independently of other pairs of traits.

• From these two principles he derived some arithmetic rules that govern inheritance. For the progeny of hybrids with one pair of differentiating traits, R and r, the ratio in the progeny will be three dominants to one recessive. When the hybrids differ in two pairs of traits, the ratio will be 9:3:3:1.
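The two ratios can be checked by enumerating every equally likely pairing of gametes, which is what a Punnett square does. The following is a minimal sketch, not something from the slides. The allele letters R and r come from the text, while the second pair Y and y and all function names are hypothetical labels introduced for illustration. It assumes complete dominance and independent assortment of the two pairs.

```python
from __future__ import annotations

from collections import Counter
from itertools import product


def gametes(genotype: str) -> list[str]:
    """Return every gamete a parent can form.

    The genotype is written as consecutive allele pairs, one pair per
    trait, e.g. "Rr" (one trait) or "RrYy" (two traits).
    """
    pairs = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    # One allele is drawn from each pair, independently of the other pairs.
    return ["".join(choice) for choice in product(*pairs)]


def phenotype(gamete_a: str, gamete_b: str) -> str:
    """Uppercase (dominant) alleles mask lowercase (recessive) ones."""
    return "".join(a if a.isupper() else b for a, b in zip(gamete_a, gamete_b))


def phenotype_counts(parent_a: str, parent_b: str) -> Counter:
    """Count offspring phenotypes over all equally likely gamete pairings."""
    return Counter(
        phenotype(ga, gb)
        for ga, gb in product(gametes(parent_a), gametes(parent_b))
    )


if __name__ == "__main__":
    print(phenotype_counts("Rr", "Rr"))      # one pair of traits
    print(phenotype_counts("RrYy", "RrYy"))  # two pairs of traits
```

Crossing two Rr hybrids gives 3 dominant phenotypes to 1 recessive out of 4 pairings, and crossing two RrYy hybrids gives counts of 9, 3, 3, and 1 out of 16 pairings.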

Rediscovery of Mendel's laws

In 1900 the following researchers independently published the results of their work, referring to Mendel's “precursor” work:

• The Dutch botanist Hugo de Vries (1848-1935)

• The German botanist Carl Correns (1864-1933)

• The Austrian Erich Tschermak von Seysenegg (1871-1962)

This event marked the beginning of the study of genetics. Mendelian knowledge was extended and applied.

• Only Correns fully understood Mendel's work and its consequences. Both De Vries and Tschermak failed to understand concepts such as dominance and conflated Mendel's two laws into one. It is therefore clear that Mendel's work was not understood at first, either in its technical aspects or in its importance. In fact, its relevance was recognized before it was understood technically.

Thomas H. Morgan (1866-1945)

Chromosomal theory of inheritance

Chromosome maps

Sex-linked inheritance

Nobel Prize, 1933

The term molecular biology was coined in 1938 by Warren Weaver, Director of the Natural Sciences Division of the Rockefeller Foundation.

Warren Weaver (1894-1978)

Warren Weaver Hall, Washington

Historical events: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick

Nobel Prize, 1962

1990: ICT AND BIOLOGY

The knowledge society

• Refers to the spectacular increase and unprecedented acceleration in the pace of creation, accumulation, distribution, and use of information and knowledge

– Knowledge is created, accumulated, disseminated, and put to use because it guides decisions and makes it possible to intervene in the world in accordance with certain ends and values (Morales 2001)

• The model of the Knowledge Society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the Web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: both the content and the formats are constantly changing

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever lived are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated, and consulted

– Electronic journals

– Electronic processing

– Open evaluation

– E-prints, preprints, and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (Biology)

In vivo, in vitro (1900), in silico (1990)

[Figure: documents per year, 1900-2005, by genus: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year, Documents.]

[Figure: documents per year, 1900-2005, by genus: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year, Documents.]

Figure 21

[Figure: documents by subject area. Subject areas: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Scien. Axes: Subject Area, Documents.]

Figure 22

[Figure: documents per year, 1905-2005, by subject area: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience. Axes: Year, Documents.]

Figure 23

[Figure: documents per year, 1905-2005, by subject area: Evolutionary Biology, Ecology, Zoology, Toxicology. Axes: Year, Documents.]

Collection of gene sequences

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive sequences produced in laboratories throughout the world from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions from individual laboratories and from large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
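What "doubling every 10 months" implies can be made concrete with a short calculation. This is a minimal sketch under a pure exponential-growth assumption. The function names are hypothetical, the starting value is the Release 134 figure quoted in the slide (29,300 million bases), and the projected numbers are model outputs, not actual GenBank statistics.

```python
RELEASE_134_BASES = 29.3e9  # bases in Release 134 (February 2003), from the slide


def growth_factor(elapsed_months: float, doubling_months: float = 10.0) -> float:
    """Multiplicative growth after elapsed_months of steady doubling."""
    return 2.0 ** (elapsed_months / doubling_months)


def projected_bases(start_bases: float, elapsed_months: float,
                    doubling_months: float = 10.0) -> float:
    """Size of the collection if the doubling time stayed constant."""
    return start_bases * growth_factor(elapsed_months, doubling_months)


if __name__ == "__main__":
    # Growth over one year and over five years at a 10-month doubling time.
    print(growth_factor(12))
    print(growth_factor(60))
    # Model projection one doubling time after Release 134.
    print(projected_bases(RELEASE_134_BASES, 10))
```

At this rate the collection would grow by a factor of about 2.3 per year and by a factor of 64 in five years, which is why storage and search tools become as important as the sequencing itself.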

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science 1989; 245(4925): 1434–1435.

[Figure: documents in PubMed (NIH) by year, 1865-2005. About 20 million documents (October 2010).]

e-science, cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools that encompass discipline-specific tools and interaction tools

• It advances and augments traditional research methodologies rather than replacing them

• It will enable researchers to carry out their research more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

New research fields are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at: http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the human genome project. It was published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
• 79 authors
• 48 institutions
• 181 references
• All authors came from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome
• 276 authors
• 14 institutions
• 452 references
• All authors came from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: number of publications per year, 1988-2009. Axes: Year, Publications.]

Pathology and Immunopathology Research, 1988: “The genomics of human homeobox-containing loci”, written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Figure: number of publications by institution. Axes: Institution, Publications.]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062 http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

bull A 291-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method The 148-billion bp DNA sequence was generated over 9 months from 27271853 high-quality sequence reads (511-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used each combining sequence data from Celera and the publicly funded genome effort The public data were shredded into 550-bp segments to create a 29-fold coverage of those genome regions that had been sequenced without including biases inherent in the cloning and assembly procedure used by the publicly funded group This brought the effective coverage in the assemblies toeightfold reducing the number and size of gaps in the final assembly over what would be obtained with 511-fold coverage The two assembly strategies yielded very similar results that largely agree with independent mapping data The assemblies effectively cover the euchromatic regions of the human chromosomes More than 90 of the genome is in scaffold assemblies of 100000 bp or more and 25 of the genome is in scaffolds of 10 million bp or larger Analysis of the genome sequence revealed 26588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12000 computationally derived genes with mouse matches or other weak supporting evidence Although gene-dense clusters are obvious almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence Only 11 of the genome is spanned by exons whereas 24 is in introns with 75 of the genome being intergenic DNA Duplications of segmental blocks ranging in size up to chromosomal lengths are abundant throughout the genome and reveal a complex evolutionary history Comparative genomic analysis indicates vertebrate expansions of genes 
associated with neuronal function with tissue-specific developmental regulation and with the hemostasis and immune systems DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 21 million single-nucleotide polymorphisms (SNPs) A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average but there was marked heterogeneity in the level of polymorphism across the genome Less than 1 of all SNPs resulted in variation in proteins but the task of determining which SNPs have functional consequences remains an open challenge

J. C. Venter et al., Science 291, 1304-1351 (2001)
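The coverage figures in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not part of the original slides. It takes the abstract's figures (a 2.91-billion bp consensus, 14.8 billion bp of reads, 27,271,853 reads, plus 2.9-fold coverage from the shredded public data) and computes fold coverage as total read bases divided by genome size, which comes to about 5.09, close to the reported 5.11-fold; the small gap presumably reflects rounding of the published figures. The function names are hypothetical, and the gap estimate uses an idealized Poisson model (in the style of Lander and Waterman) that the abstract itself does not mention.

```python
import math


def fold_coverage(total_read_bases: float, genome_size: float) -> float:
    """Average number of times each base of the genome is sequenced."""
    return total_read_bases / genome_size


def expected_uncovered_fraction(coverage: float) -> float:
    """Idealized Poisson estimate of the fraction of bases hit by zero reads.

    Assumption: reads land uniformly at random, which real shotgun data
    only approximates (repeats and cloning bias are ignored).
    """
    return math.exp(-coverage)


# Figures taken from the abstract of Venter et al. (2001).
GENOME_BP = 2.91e9        # euchromatic consensus length
READ_BP = 14.8e9          # total bases in the Celera reads
N_READS = 27_271_853      # number of high-quality reads
PUBLIC_COVERAGE = 2.9     # coverage added by the shredded public data

celera = fold_coverage(READ_BP, GENOME_BP)
mean_read_len = READ_BP / N_READS
combined = 5.11 + PUBLIC_COVERAGE  # reported Celera coverage plus public data

print(f"Celera coverage from the raw figures: {celera:.2f}x")
print(f"Mean read length: {mean_read_len:.0f} bp")
print(f"Combined coverage: {combined:.2f}x")
for c in (5.11, combined):
    print(f"{c:.2f}x coverage -> {expected_uncovered_fraction(c):.4%} of bases uncovered")
```

Under this simplified model, going from roughly fivefold to roughly eightfold coverage lowers the expected uncovered fraction by more than an order of magnitude, which is consistent with the abstract's remark that the combined data reduced the number and size of gaps.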

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, typically proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in search of a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and in an integrated way by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases) that compared 370 studies revealed a high frequency of subsequent investigations that failed to confirm the initial findings. Because of the enormous number of measurements and the limited number of samples, problems arise relating to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
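The statistical problem of many measurements and few samples can be made concrete with a small calculation. The sketch below is illustrative only and does not come from the cited article. It assumes that every test is independent and that no measured parameter is truly associated with disease, and it uses the Bonferroni correction as one common remedy; the function names and the 0.05 significance level are assumptions chosen for the example.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of spurious 'hits' when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Family-wise error rate for independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise rate near alpha."""
    return alpha / n_tests


# From a single biomarker up to a genome-scale panel of parameters.
for m in (1, 20, 1000, 20000):
    print(
        f"{m:>6} tests: "
        f"expected false positives = {expected_false_positives(m):8.1f}, "
        f"P(at least one) = {prob_at_least_one_false_positive(m):.4f}, "
        f"Bonferroni threshold = {bonferroni_threshold(m):.2e}"
    )
```

With thousands of parameters tested at the usual 0.05 level, dozens or hundreds of chance associations are expected even when nothing real is present, which is one plausible reason why follow-up studies often fail to confirm initial "omics" findings.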

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality"

http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"

httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM

Proteomics: "Proteomics"

http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"

http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"

httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf

Oncogenomics: "Oncogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"

http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"

http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"

http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"

http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"

http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from

DGAPA, UNAM: PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, as quoted in the human genome paper in Nature (2001)

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
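
The figures quoted in this abstract can be cross-checked with simple arithmetic. The following Python sketch recomputes the mean read length, the fold coverage, and the number of differences expected between two haploid genomes. The constant and function names are hypothetical, chosen only for illustration. The small gap between the computed coverage and the reported 5.11-fold presumably comes from rounding or from the genome size the authors assumed.

```python
# Figures quoted in the abstract of Venter et al. (2001).
READS = 27_271_853        # high-quality sequence reads
RAW_BASES = 14.8e9        # bp of sequence generated
GENOME_SIZE = 2.91e9      # bp in the euchromatic consensus sequence
SNP_SPACING = 1250        # one differing bp per 1250 bp, on average


def fold_coverage(raw_bases: float, genome_size: float) -> float:
    """Average number of times each base of the genome is sequenced."""
    return raw_bases / genome_size


# Mean length of a single read, in bp.
mean_read_length = RAW_BASES / READS

# Raw coverage implied by the quoted totals.
coverage = fold_coverage(RAW_BASES, GENOME_SIZE)

# Differences expected between two randomly chosen haploid genomes.
expected_differences = GENOME_SIZE / SNP_SPACING

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"fold coverage: {coverage:.2f}x")
print(f"expected differences: {expected_differences:,.0f}")
```

Under these assumptions the reads average a little over 540 bp, the coverage comes out close to fivefold, and two haploid genomes are expected to differ at roughly 2.3 million positions, the same order of magnitude as the 2.1 million SNPs reported.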

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements rely on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes in a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and as a whole by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites), rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
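
The statistical problem raised in the last bullet (many measurements, few samples) can be illustrated with a short calculation. The Python sketch below shows how many false positives are expected when thousands of parameters are each tested at a conventional significance level, and how a Bonferroni correction tightens the per-test threshold. The Bonferroni correction is a standard technique brought in here for illustration, not a method discussed in the source. The study size of 10,000 analytes and all function names are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results when no real effect exists."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Family-wise error rate for independent tests, each run at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# Hypothetical study: 10,000 analytes measured, conventional alpha of 0.05.
N_ANALYTES = 10_000
ALPHA = 0.05

print("expected false positives:", expected_false_positives(N_ANALYTES, ALPHA))
print("P(at least one false positive):",
      prob_at_least_one_false_positive(N_ANALYTES, ALPHA))
print("Bonferroni per-test threshold:", bonferroni_threshold(N_ANALYTES, ALPHA))
```

With these illustrative numbers, hundreds of spurious associations are expected even when no real effect exists, which is one plausible reason follow-up studies often fail to confirm initial findings.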

OMICS | ARTICLE

Genomics | "Genomics: buzzword or reality"

http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | "Metabolomic"

http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics | "Proteomics"

http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | "Transcriptomics"

http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | "Vaccinomics"

http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics | "Oncogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | "Pharmacogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | "Epigenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | "Toxicogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | "Kinomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | "Physiomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | "Cytomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | "Tracking the shift to postgenomics"

http://dx.doi.org/10.1159/000092656

Glycomics | "Probing glycomics"

http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | "Lipidomics is emerging"

http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | "Cellunomics: the interaction analysis of cells"

http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | "Phylogenomics: evolution and genomics intersection"

http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from

DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we learn about the human genome, the more there is to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot (quoted in the Nature article, 2001)


History of Biology (Problems)

Natural History (in vivo)

– 3rd century BC to 19th century

Biology, 20th century (in vitro)

Molecularization

21st century (in silico): Computerization

The problems of heredity, 19th century

• Biology is a branch of knowledge derived from the synthesis of several scientific traditions. The most important is undoubtedly Natural History, but during the 19th century there were others, perhaps less prolific in output but equally relevant for the conceptual innovations they brought to present-day biology. Among these are:

• Heredity, which studied how characters are transmitted from parents to offspring, based on:

– The tradition of the breeders and improvers, focused on the study of improvement (especially of plants), that is, on the inheritance of characters important to humans (present-day agronomy)

– The tradition of the hybridists, who studied the inheritance of characters and the nature of species (essentialist and nominalist), begun by Linnaeus

• There were two ways of studying the inheritance of characters:

– Family trees or pedigrees

– Crosses between individuals with different characters

Gregor Mendel (1822-1884)

• In 1900 the work of Gregor Mendel was rediscovered. In 1865 he had announced the results of his studies with peas before the Natural History Society of Brünn.

• The main postulates drawn from his experiments were:

– Certain easily distinguishable characters show predominance of one over the other in the progeny of a cross between parents with opposite or differing characters.

– Two characters are transmitted as separate elements, one from the father and one from the mother, where one is dominant and the other recessive. The characters that are transmitted and appear in the first generation are the dominant ones, and those that remain hidden or latent in the process are the recessive ones.

• Mendel used the term recessive because these characters disappear in the first generation but reappear in subsequent ones.

• He recognized that these pairs act independently of other pairs of characters.

• From these two principles he derived some arithmetic rules that govern inheritance. For the progeny of hybrids with one pair of differing characters (R and r), the ratio in the progeny will be three dominant to one recessive. When they differ in two pairs of characters, the ratio will be 9:3:3:1.
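
The 3:1 and 9:3:3:1 ratios follow directly from enumerating every combination of parental gametes. The Python sketch below builds that enumeration (a Punnett square) and counts the resulting phenotypes. The function names and the second character pair (Y and y) are hypothetical, introduced only for illustration, and the sketch assumes complete dominance and independent assortment as described above.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes for a genotype given as allele pairs, e.g. ["Rr", "Yy"]."""
    return ["".join(alleles) for alleles in product(*genotype)]


def phenotype(gamete_a, gamete_b):
    """Phenotype of the offspring: a character is dominant (uppercase)
    if at least one of the two gametes carries the dominant allele."""
    return "".join(
        a.upper() if (a.isupper() or b.isupper()) else a.lower()
        for a, b in zip(gamete_a, gamete_b)
    )


def cross(parent_a, parent_b):
    """Count offspring phenotypes over the full Punnett square."""
    return Counter(
        phenotype(ga, gb)
        for ga in gametes(parent_a)
        for gb in gametes(parent_b)
    )


# One pair of differing characters: Rr x Rr
print(cross(["Rr"], ["Rr"]))

# Two pairs of differing characters: RrYy x RrYy
print(cross(["Rr", "Yy"], ["Rr", "Yy"]))
```

The first cross yields three dominant phenotypes for every recessive one, and the second splits its sixteen combinations into the four classes of the 9:3:3:1 ratio.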

Rediscovery of Mendel's laws

In 1900, three researchers independently published the results of their work, referring to Mendel's "precursor" work:

• The Dutch botanist Hugo de Vries (1848-1935)

• The German botanist Carl Correns (1864-1933)

• The Austrian Erich Tschermak von Seysenegg (1871-1962)

This event marked the beginning of the study of genetics; Mendelian knowledge was extended and applied.

• Only Correns fully understood Mendel's work and its consequences. Both De Vries and Tschermak failed to understand concepts such as dominance and merged Mendel's two laws into one. It is therefore very clear that Mendel's work was not understood, either in its technical aspects or in its importance. In fact, its relevance was recognized before it was understood technically.

Thomas H. Morgan (1866-1945)

Chromosome theory of inheritance

Chromosome maps

Sex-linked inheritance

Nobel Prize 1933

The term molecular biology was coined in 1938 by Warren Weaver, director of the Natural Sciences Division of the Rockefeller Foundation

Warren Weaver (1894-1978)

Warren Weaver Hall

Washington

Historical events: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick

Nobel Prize 1962

1990: ICT AND BIOLOGY

The knowledge society

• Refers to the spectacular increase and unprecedented acceleration in the pace of creation, accumulation, distribution, and use of information and knowledge

– Knowledge is created, accumulated, disseminated, and used because it guides decisions and allows intervention in the world in accordance with certain ends and values (Morales 2001)

• The model of the Knowledge Society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Computer design

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: it changes, with both content and formats in constant modification

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated, and consulted

– Electronic journals

– Electronic procedure

– Open evaluation

– E-prints, preprints, and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines

– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology

• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (Biology)

In vivo
In vitro (1900)
In silico (1990)

[Chart: documents per year, 1900-2005, by genus. Series: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year, Documents.]

[Chart: documents per year, 1900-2005, by genus. Series: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year, Documents.]

Figure 21

[Bar chart: documents by subject area. Subject areas: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Biology, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotechnology & Microbiology, Behavioral Sciences. Axes: Subject Area, Documents.]

Figure 22

[Chart: documents per year, 1905-2005, by subject area. Series: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience. Axes: Year, Documents.]

Figure 23

[Chart: documents per year, 1905-2005, by subject area. Series: Evolutionary Biology, Ecology, Zoology, Toxicology. Axes: Year, Documents.]

Collection of gene sequences

E-science

Cyberinfrastructure

GenBank

• It is an annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive the sequences produced in laboratories around the world, from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers
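
A doubling time of 10 months corresponds to the growth law size(t) = size(0) × 2^(t/10), with t in months. The Python sketch below applies that formula to the Release 134 figure quoted above. It is a naive extrapolation under the assumption of a constant doubling time, not a report of actual GenBank release statistics, and the names used are hypothetical.

```python
RELEASE_134_BASES = 29.3e9   # bases in Release 134 (February 2003), as quoted above
DOUBLING_MONTHS = 10         # doubling time quoted above


def projected_size(initial: float, months: float, doubling_months: float) -> float:
    """Size after `months`, assuming constant exponential growth."""
    return initial * 2 ** (months / doubling_months)


# Project forward in steps of one doubling time.
for months in (0, 10, 20, 30):
    size = projected_size(RELEASE_134_BASES, months, DOUBLING_MONTHS)
    print(f"{months:>2} months after Release 134: {size / 1e9:.1f} billion bases")
```

Each step of 10 months doubles the total, so three doubling periods multiply the starting size by eight.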

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989; 245(4925): 1434-1435.

[Chart: documents in PubMed (NIH) per year, 1865-2005.]

Documents in PubMed (NIH)

Close to 20 million (October 2010)

e-science, cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, in 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, in 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will enable researchers to carry out their research more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

New fields of research are emerging that use new techniques for data mining and analysis, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than to the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. It was published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of Genomic Sciences (or genetics), except for the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of Genomic Sciences (or genetics), except for the following: Department of Biology, and Medical Informatics

E-science

1865 Gregor Mendel discovers the laws of genetics

1953 James Watson and Francis Crick describe the double-helix structure of DNA

1966 Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977 Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982 The GenBank database is established

1985 The polymerase chain reaction (PCR) is invented

1990 The Human Genome Project (HGP) begins in the United States

1997 The genome of Escherichia coli is sequenced

2003 The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Chart: number of publications per year, 1988-2009. Axes: Year, Publications.]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Chart: number of publications by institution. Axes: Institution, Publications.]

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062. http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
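
The coverage figures in this abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not part of the paper or the lecture. It reads the figures with their decimal points (14.8 billion bp, 5.11-fold, 2.9-fold). The Poisson estimate of uncovered bases (the Lander-Waterman idealization, which assumes reads land uniformly at random) is an assumption introduced here to suggest why going from about 5-fold to about 8-fold coverage shrinks gaps. All function names are hypothetical.

```python
import math

# Figures quoted in the abstract of Venter et al. (2001).
READS = 27_271_853        # high-quality sequence reads
TOTAL_BP = 14.8e9         # bases sequenced by Celera
CELERA_FOLD = 5.11        # coverage from Celera reads
PUBLIC_FOLD = 2.9         # coverage from shredded public data


def mean_read_length(total_bp: float, reads: int) -> float:
    """Average read length implied by total bases and read count."""
    return total_bp / reads


def implied_genome_size(total_bp: float, fold: float) -> float:
    """Genome size implied by a stated fold coverage."""
    return total_bp / fold


def uncovered_fraction(fold: float) -> float:
    """Expected fraction of bases with no read, under an idealized
    Poisson model (assumption: reads fall uniformly at random)."""
    return math.exp(-fold)


if __name__ == "__main__":
    combined = CELERA_FOLD + PUBLIC_FOLD
    print(f"mean read length: {mean_read_length(TOTAL_BP, READS):.0f} bp")
    print(f"implied genome size: {implied_genome_size(TOTAL_BP, CELERA_FOLD):.3g} bp")
    print(f"combined coverage: {combined:.2f}-fold")
    for fold in (CELERA_FOLD, combined):
        print(f"{fold:.2f}-fold -> uncovered fraction {uncovered_fraction(fold):.2e}")
```

Under these assumptions the combined coverage comes to roughly eightfold, in line with the abstract. Real assemblies deviate from the Poisson idealization because of cloning bias and repeated sequence.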

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with diseases. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters instead of just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of study samples, problems arise relating to statistics, bias, methodology, and improper use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007

OMICS FIELD: ARTICLE

Genomics: "Genomics: buzzword or reality". http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic". httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM

Proteomics: "Proteomics". http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics". http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics". http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics". http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics". http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics". http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics". http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics". http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging". http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells". http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection". http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we learn about the human genome, the more there is to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

T. S. Eliot, as quoted at the close of the International Human Genome Sequencing Consortium paper (Nature, 2001)


The problems of heredity: 19th century

• Biology is a branch of knowledge derived from the synthesis of several scientific traditions. The most important is undoubtedly Natural History, but during the 19th century there were others, perhaps less prolific in output but equally relevant for the conceptual innovations they brought to present-day biology. Among these is:

• Heredity, which studied how characters are transmitted from parents to offspring, drawing on:

– The tradition of breeders and improvers, focused on improvement (especially of plants), that is, on the inheritance of characters important to humans (present-day agronomy)

– The tradition of the hybridists, who studied the inheritance of characters and the nature of species (essentialist and nominalist), begun with Linnaeus

• There were two ways of studying the inheritance of characters:

– Family trees or pedigrees

– Crosses between individuals with different characters

Gregor Mendel (1822-1884)

• In 1900 the work of Gregor Mendel was rediscovered. In 1865 he had presented the results of his studies with peas to the Natural History Society of Brünn.

• The main postulates from his experiments were:

– Certain easily distinguishable characters show predominance of one over another in the progeny of a cross between parents with opposite or contrasting characters.

– Two characters are transmitted as separate elements, one from the father and one from the mother, where one is dominant and the other recessive. The characters that are transmitted and appear in the first generation are the dominant ones, and those that remain hidden or latent in the process are the recessive ones.

• Mendel used the term recessive because these characters disappear in the first generation but reappear in subsequent ones.

• He recognized that these pairs act independently of other pairs of characters.

• From these two principles he derived some arithmetic rules that govern inheritance. For the progeny of hybrids with one pair of differentiating characters (R and r), the proportion in the progeny will be three dominants to one recessive (3:1). When they differ in two pairs of characters, the proportion will be 9:3:3:1.
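
The 3:1 and 9:3:3:1 proportions follow from counting equally likely gamete combinations. The following is a minimal sketch, not Mendel's own procedure: it assumes complete dominance and independent assortment, writes each locus as a two-letter string with the uppercase letter as the dominant allele, and uses hypothetical function names.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes of a parent; genotype is a list of loci such as ["Rr", "Yy"].
    Each gamete carries one allele per locus (independent assortment)."""
    return list(product(*genotype))


def phenotype_counts(parent1, parent2):
    """Count offspring phenotypes over every equally likely gamete pairing.
    A locus shows the dominant phenotype if either allele is uppercase."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        phenotype = tuple(
            "dominant" if (a.isupper() or b.isupper()) else "recessive"
            for a, b in zip(g1, g2)
        )
        counts[phenotype] += 1
    return counts


if __name__ == "__main__":
    # Monohybrid cross Rr x Rr: one pair of differentiating characters.
    print(phenotype_counts(["Rr"], ["Rr"]))
    # Dihybrid cross RrYy x RrYy: two pairs of characters.
    print(phenotype_counts(["Rr", "Yy"], ["Rr", "Yy"]))
```

The monohybrid cross gives 3 dominant to 1 recessive out of 4 combinations, and the dihybrid cross gives 9, 3, 3, and 1 out of 16, matching the rules stated above.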

Rediscovery of Mendel's laws

In 1900, three researchers independently published the results of their work, referring to Mendel's "precursor" work:

• The Dutch botanist Hugo de Vries (1848-1935)

• The German botanist Carl Correns (1864-1933)

• The Austrian Erich von Tschermak-Seysenegg (1871-1962)

This event marked the beginning of the study of genetics; Mendelian knowledge was extended and applied.

• Only Correns fully understood Mendel's work and its consequences. Both De Vries and Tschermak failed to grasp concepts such as dominance and conflated Mendel's two laws into one. It is therefore clear that Mendel's work was understood neither in its technical aspects nor in its importance. In fact, recognition of its relevance came before it was understood technically.

Thomas H. Morgan (1866-1945)

Chromosomal theory of inheritance

Chromosome maps

Sex-linked inheritance

Nobel Prize, 1933

The term molecular biology was coined in 1938 by Warren Weaver, Director of the Natural Sciences Division of the Rockefeller Foundation.

Warren Weaver (1894-1978)

Warren Weaver Hall, Washington

Historical milestones: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick

Nobel Prize, 1962

1990: ICT AND BIOLOGY

The knowledge society

• Refers to the spectacular increase and unprecedented acceleration in the pace of creation, accumulation, distribution, and use of information and knowledge

– Knowledge is created, accumulated, disseminated, and put to use because it guides decisions and allows intervention in the world according to certain ends and values (Morales 2001)

• The model of the Knowledge Society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: both content and formats change and are constantly being modified

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated, and consulted

– Electronic journals

– Electronic procedure

– Open evaluation

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (Biology)

In vivo
In vitro (1900)
In silico (1990)

Chart: documents per year, 1900-2005 (axes: Year, Documents). Series: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353)

Chart: documents per year, 1900-2005 (axes: Year, Documents). Series: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973)

Figure 21

Chart: documents by subject area (axes: Subject Area, Documents). Categories: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Scien

Figure 22

Chart: documents per year, 1905-2005 (axes: Year, Documents). Series: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience

Figure 23

Chart: documents per year, 1905-2005 (axes: Year, Documents). Series: Evolutionary Biology, Ecology, Zoology, Toxicology

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive sequences produced in laboratories around the world from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers
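
To give a sense of what "doubling every 10 months" means, the sketch below extrapolates from the Release 134 figure quoted above. It is an illustration under a constant-doubling assumption, not a record of actual GenBank release sizes, and the function names are hypothetical.

```python
RELEASE_134_BASES = 29.3e9   # bases in Release 134 (February 2003), from the text
DOUBLING_MONTHS = 10.0       # doubling time stated in the text


def projected_bases(months_elapsed: float,
                    start: float = RELEASE_134_BASES,
                    doubling: float = DOUBLING_MONTHS) -> float:
    """Size after `months_elapsed` months, assuming the doubling time stays constant."""
    return start * 2 ** (months_elapsed / doubling)


def annual_growth_factor(doubling: float = DOUBLING_MONTHS) -> float:
    """Multiplicative growth over 12 months for a given doubling time."""
    return 2 ** (12 / doubling)


if __name__ == "__main__":
    for months in (0, 10, 20, 30):
        print(f"after {months:2d} months: {projected_bases(months):.3g} bases")
    print(f"growth per year: x{annual_growth_factor():.2f}")
```

With a 10-month doubling time the collection would grow by a factor of about 2.3 per year, which is why storage and search tools had to scale along with sequencing itself.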

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435.

Chart: documents in PubMed (NIH) per year, 1865-2005. About 20 million (October 2010)

e-science and cyberinfrastructure

• e-Science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• "Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet"

• Cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-Research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from greater access to:

– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their work more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data, all three of which are still rapidly expanding, is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than to the technologies used to get them.

National Centre for e-Social Science (2008). Frequently Asked Questions. Available at httpwwwncessacukabout_eSSfaqq=General_1General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• It is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds) located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. It was published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except those from the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except those from the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• It is the systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

Chart: number of publications per year, 1988-2009 (axes: Year, Publications). Annotation: "The genomics of human homeobox-containing loci", written by CA Ferguson-Smith and FH Ruddle of the Department of Biology at Yale University, New Haven, Connecticut, in Pathology and Immunopathology Research, 1988

Chart: number of publications by institution (axes: Institution, Publications)

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062
httpwwwnaturecomnaturejournalv409n6822images409860ac2jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
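
As a rough check of the figures quoted in the abstract, the short Python sketch below recomputes fold coverage as total sequenced bases divided by genome length, together with the mean read length and the number of differences expected between two haploid genomes at 1 bp per 1250. The variable and function names are illustrative only, and the simple ratio ignores how the authors actually derived their values, so a small gap from the published 5.11-fold figure is to be expected.

```python
# Rough arithmetic check of figures quoted in the abstract (illustrative names).

GENOME_BP = 2.91e9        # consensus sequence length, bp
SEQUENCED_BP = 14.8e9     # total bases generated
READS = 27_271_853        # high-quality sequence reads
SNP_SPACING_BP = 1250     # one difference per 1250 bp on average


def fold_coverage(sequenced_bp: float, genome_bp: float) -> float:
    """Fold coverage = total sequenced bases / genome length."""
    return sequenced_bp / genome_bp


coverage = fold_coverage(SEQUENCED_BP, GENOME_BP)
mean_read_length = SEQUENCED_BP / READS
expected_differences = GENOME_BP / SNP_SPACING_BP

print(f"coverage: {coverage:.2f}-fold")
print(f"mean read length: {mean_read_length:.0f} bp")
print(f"expected differences between two haploid genomes: {expected_differences:,.0f}")
```

The same ratio applied to the shredded public data (2.9-fold) plus the Celera reads is what brings the combined effective coverage to roughly eightfold.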

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• “Omics” studies analyze large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term “omics” derives from the Greek suffix “ome”, meaning many or mass. Measurements rely on chemical markers that indicate some biological event (biomarkers). The values associated with the measured parameters are examined in search of a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that try to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern “omics” experiments.

• The ultimate goal of “omics” approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various “players” (e.g., genes, RNA, proteins, metabolites) rather than studying each one individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the “omics”. TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
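
The statistical problem named in the last bullet can be illustrated with simple arithmetic: when thousands of parameters are tested at a fixed significance level, some will pass by chance alone. The Python sketch below is a generic illustration, not the analysis from the cited study. The number of parameters (10,000) and the significance level (0.05) are assumed values, the tests are assumed independent, and the Bonferroni correction is included only as one common remedy that the slides themselves do not mention.

```python
# Illustration of the multiple-comparisons problem (assumed numbers, not from the source).

def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests that pass by chance when no real effect exists."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Family-wise error rate for independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


n_parameters = 10_000   # hypothetical number of measured analytes
alpha = 0.05            # hypothetical per-test significance level

print(expected_false_positives(n_parameters, alpha))
print(prob_at_least_one_false_positive(20, alpha))
print(bonferroni_threshold(n_parameters, alpha))
```

With few samples and many measured parameters, chance associations of this kind are one plausible reason why follow-up studies fail to confirm initial findings.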

OMICS: ARTICLE

Genomics: Genomics: buzzword or reality
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: Metabolomic
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: Proteomics
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: Transcriptomics
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: Vaccinomics
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: Oncogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: Pharmacogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: Epigenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: Toxicogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: Kinomics
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: Physiomics
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: Cytomics
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: Tracking the shift to postgenomics
http://dx.doi.org/10.1159/000092656

Glycomics: Probing glycomics
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: Lipidomics is emerging
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: Cellunomics: the interaction analysis of cells
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: Phylogenomics: evolution and genomics intersection
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out with funding from DGAPA, UNAM, PAPIME project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. “We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time.”

Thomas Stearns Eliot, 2001


Gregor Mendel (1822-1884)

• In 1900 the work of Gregor Mendel was rediscovered. In 1865 he had presented the results of his studies with peas to the Brünn Society.

• The main postulates of his experiments were:

– Certain easily distinguishable traits show dominance of some over others in the progeny of crosses between parents with opposite or contrasting traits.

– Two traits are transmitted as distinct elements, one from the father and one from the mother, where one is dominant and the other recessive. The traits that are transmitted and appear in the first generation are the dominant ones, and those that remain hidden or latent are the recessive ones.

• Mendel used the term recessive because these traits disappear in the first generation but reappear in subsequent ones.

• He recognized that these pairs act independently of other pairs of traits.

• From these two principles he derived some arithmetic rules that govern inheritance. For the progeny of hybrids with one pair of differing traits (R and r), the ratio will be three dominant to one recessive. When the hybrids differ in two pairs of traits, the ratio will be 9:3:3:1.
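
The two ratios in the last bullet can be recovered by enumerating all equally likely gamete pairings in a Punnett square. The Python sketch below is a minimal illustration: the genotype strings ("Rr", "RrYy"), the second trait pair Y/y, and the function names are assumptions made for the example, and it assumes complete dominance and independent assortment as described above.

```python
# Enumerate Punnett squares to recover Mendel's 3:1 and 9:3:3:1 ratios.
from collections import Counter
from itertools import product


def gametes(genotype: str) -> list[str]:
    """Gametes of a genotype written as allele pairs, e.g. 'RrYy' -> RY, Ry, rY, ry."""
    pairs = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    return ["".join(combo) for combo in product(*pairs)]


def phenotype(gamete_a: str, gamete_b: str) -> str:
    """Dominant (uppercase) allele masks the recessive one at each locus."""
    return "".join(
        a.upper() if (a.isupper() or b.isupper()) else a.lower()
        for a, b in zip(gamete_a, gamete_b)
    )


def cross(parent_a: str, parent_b: str) -> Counter:
    """Count offspring phenotypes over all equally likely gamete pairings."""
    return Counter(
        phenotype(ga, gb) for ga in gametes(parent_a) for gb in gametes(parent_b)
    )


print(cross("Rr", "Rr"))      # one pair of traits
print(cross("RrYy", "RrYy"))  # two independent pairs of traits
```

Each phenotype label uses an uppercase letter where the dominant form is expressed, so "Ry" means dominant for the first trait and recessive for the second.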

Rediscovery of Mendel's laws

In 1900 three researchers independently published the results of their work, citing Mendel's “precursor” work:

• The Dutch botanist Hugo de Vries (1848-1935)

• The German botanist Carl Correns (1864-1933)

• The Austrian Erich Tschermak von Seysenegg (1871-1962)

This event marked the beginning of the study of genetics; Mendelian knowledge was extended and applied.

• Only Correns fully understood Mendel's work and its consequences. Neither De Vries nor Tschermak understood concepts such as dominance, and both merged Mendel's two laws into one. It is therefore clear that Mendel's work was not understood at first, either in its technical aspects or in its importance. In fact, its relevance was recognized before it was understood technically.

Thomas H. Morgan (1866-1945)

Chromosomal theory of inheritance

Chromosome maps

Sex-linked inheritance

Nobel Prize, 1933

The term molecular biology was coined in 1938 by Warren Weaver, director of the Natural Sciences Division of the Rockefeller Foundation

Warren Weaver (1894-1978)

Warren Weaver Hall

Washington

Historical milestones: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick

Nobel Prize, 1962

1990: ICT AND BIOLOGY

The knowledge society

• Refers to the spectacular increase and unprecedented acceleration in the pace at which information and knowledge are created, accumulated, distributed, and used

- Knowledge is created, accumulated, disseminated, and used because it guides decisions and allows intervention in the world in accordance with certain ends and values (Morales 2001)

• The model of the Knowledge Society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: it changes, with both content and formats under constant modification

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated, and consulted

– Electronic journals

– Electronic processing

– Open review

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (biology)

In vivo
In vitro (1900)
In silico (1990)

[Chart: Documents per Year, 1900-2005, by genus. Series (total documents): Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year, Documents.]

[Chart: Documents per Year, 1900-2005, by genus. Series (total documents): Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year, Documents.]

Figure 21

[Chart: Documents by Subject Area. Categories: Genetics & Heredity; Biochem & Mol Bio; Cell Bio; Develop Bio; Multidisciplinary; Neurosciences; Biology; Zoology; Evolutionary Bio; Ecology; Entomology; Toxicology; Physiology; Biophysics; Biotech & Microbio; Behavioral Scien. Axes: Subject Area, Documents.]

Figure 22

[Chart: Documents per Year, 1905-2005. Series: Genetics & Heredity; Biochemistry & Molecular Biology; Cell Biology; Developmental Biology; Neuroscience. Axes: Year, Documents.]

Figure 23

[Chart: Documents per Year, 1905-2005. Series: Evolutionary Biology; Ecology; Zoology; Toxicology. Axes: Year, Documents.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive sequences produced in laboratories throughout the world from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions from individual laboratories and from large-scale sequencing centers
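
To make the "doubling every 10 months" claim concrete, the Python sketch below projects a size forward under a constant doubling time, N(t) = N0 · 2^(t/T). It is only an illustration of the arithmetic: the function name and the choice of time points are assumptions, and real database growth need not follow a constant doubling time.

```python
# Exponential growth with a fixed doubling time (illustrative projection only).

def project_size(initial: float, elapsed_months: float, doubling_months: float = 10.0) -> float:
    """Size after elapsed_months if it doubles every doubling_months."""
    return initial * 2 ** (elapsed_months / doubling_months)


bases_feb_2003 = 29.3e9  # nucleotide bases quoted for Release 134 (February 2003)

for months in (0, 10, 20, 30):
    print(months, f"{project_size(bases_feb_2003, months):.3g}")
```

Under this assumption the collection grows eightfold in 30 months, which is why storage and search tools have to scale along with the data.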

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435.

[Chart: Documents in PubMed (NIH) by year, 1865-2005; vertical axis from 0 to 800,000.]

About 20 million (October 2010)

e-science and cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communications networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that enable secure connectivity and interoperability

– Application tools that encompass discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their work more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science, 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds) located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the human genome project. It was published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except for the following:
Department of Cellular and Structural Biology
Department of Molecular Genetics
Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except for the following:
Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Chart: Number of publications per year, 1988-2009. Axes: Year, Publications.]

Chart annotation: Pathology and Immunopathology Research, 1988: “The genomics of human homeobox-containing loci”, written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Chart: Number of publications by institution. Axes: Institution, Publications.]

Nature 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

bull A 291-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method The 148-billion bp DNA sequence was generated over 9 months from 27271853 high-quality sequence reads (511-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used each combining sequence data from Celera and the publicly funded genome effort The public data were shredded into 550-bp segments to create a 29-fold coverage of those genome regions that had been sequenced without including biases inherent in the cloning and assembly procedure used by the publicly funded group This brought the effective coverage in the assemblies toeightfold reducing the number and size of gaps in the final assembly over what would be obtained with 511-fold coverage The two assembly strategies yielded very similar results that largely agree with independent mapping data The assemblies effectively cover the euchromatic regions of the human chromosomes More than 90 of the genome is in scaffold assemblies of 100000 bp or more and 25 of the genome is in scaffolds of 10 million bp or larger Analysis of the genome sequence revealed 26588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12000 computationally derived genes with mouse matches or other weak supporting evidence Although gene-dense clusters are obvious almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence Only 11 of the genome is spanned by exons whereas 24 is in introns with 75 of the genome being intergenic DNA Duplications of segmental blocks ranging in size up to chromosomal lengths are abundant throughout the genome and reveal a complex evolutionary history Comparative genomic analysis indicates vertebrate expansions of genes 
associated with neuronal function with tissue-specific developmental regulation and with the hemostasis and immune systems DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 21 million single-nucleotide polymorphisms (SNPs) A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average but there was marked heterogeneity in the level of polymorphism across the genome Less than 1 of all SNPs resulted in variation in proteins but the task of determining which SNPs have functional consequences remains an open challenge

J. C. Venter et al., Science 291, 1304-1351 (2001)
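The coverage figures in the Venter et al. (2001) abstract can be checked with simple arithmetic. The sketch below is only illustrative: it assumes that fold coverage is the total number of sequenced bases divided by the genome length, and the variable names are hypothetical. Dividing the read bases by the consensus length gives roughly 5.1, close to the reported 5.11-fold (the exact value depends on the genome-size estimate used), and adding the 2.9-fold from the shredded public data gives the "eightfold" effective coverage.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average number of times each base of the genome is sequenced."""
    return total_bases / genome_size


# Figures as reported by Venter et al. (2001).
celera_read_bases = 14.8e9      # bp of sequence generated by Celera
consensus_length = 2.91e9       # bp in the euchromatic consensus sequence
celera_fold_reported = 5.11     # coverage reported for the Celera reads
public_fold_reported = 2.9      # coverage from the shredded public data

# Rough estimate: read bases divided by consensus length.
celera_fold_estimate = fold_coverage(celera_read_bases, consensus_length)

# Effective coverage of the combined assemblies.
combined_fold = celera_fold_reported + public_fold_reported

print(f"Celera read bases / consensus length: {celera_fold_estimate:.2f}-fold")
print(f"Celera + shredded public data: {combined_fold:.2f}-fold")
```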

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that indicate some biological event (biomarkers). The values of the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes in a single sample, studies that try to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of samples, problems arise relating to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
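One way to see why measuring thousands of parameters on few samples causes statistical trouble is the multiple-comparisons problem. The sketch below is an illustration, not a method taken from the cited study: the number of parameters (10,000), the significance level (0.05), and the assumption that the tests are independent are all hypothetical, and the Bonferroni threshold is just one common correction.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Family-wise error rate, assuming the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise rate near alpha."""
    return alpha / n_tests


n_parameters = 10_000  # hypothetical number of measured analytes

print("Expected false positives:", expected_false_positives(n_parameters))
print("P(at least one false positive):",
      prob_at_least_one_false_positive(n_parameters))
print("Bonferroni per-test threshold:", bonferroni_threshold(n_parameters))
```

With these assumed numbers, hundreds of parameters would look "significant" by chance alone, which is consistent with the low rate of confirmation described above.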

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality"

http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomics"

http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"

http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"

http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"

http://www.gtmb.org/VOL12B/PDF/16._Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"

http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"

http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"

http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"

http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"

http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we learn about the human genome, the more there is to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot (quoted in Nature, 2001)


For the progeny of hybrids with one pair of differing characters (R and r), the ratio in the progeny will be three dominant to one recessive.

When they differ in two pairs of characters, the ratio will be 9:3:3:1.
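The 3:1 and 9:3:3:1 ratios can be reproduced by enumerating a Punnett square. The sketch below is a minimal illustration under stated assumptions: genes assort independently, an uppercase allele is fully dominant, and the second allele pair (Y and y) and all function names are hypothetical additions for the example.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes of a parent, given one allele pair per gene, e.g. ["Rr", "Yy"]."""
    return list(product(*genotype))


def phenotype_counts(parent1, parent2):
    """Count offspring phenotypes over every combination of parental gametes."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # A gene shows the dominant trait if either inherited allele is uppercase.
        phenotype = tuple(
            "dominant" if (a.isupper() or b.isupper()) else "recessive"
            for a, b in zip(g1, g2)
        )
        counts[phenotype] += 1
    return counts


# One pair of differing characters: Rr x Rr.
monohybrid = phenotype_counts(["Rr"], ["Rr"])
print(dict(monohybrid))

# Two pairs of differing characters: RrYy x RrYy.
dihybrid = phenotype_counts(["Rr", "Yy"], ["Rr", "Yy"])
print(dict(dihybrid))
```

Each parent contributes one allele per gene, so the monohybrid cross has 4 equally likely combinations and the dihybrid cross has 16, which is where the denominators of the two ratios come from.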

Rediscovery of Mendel's laws

In 1900, three researchers independently published the results of their work, referring to Mendel's "precursor" work:

• The Dutch botanist Hugo de Vries (1848-1935)

• The German botanist Carl Correns (1864-1933)

• The Austrian Erich Tschermak von Seysenegg (1871-1962)

This event marked the beginning of the study of genetics; Mendelian knowledge was extended and applied.

• Only Correns fully understood Mendel's work and its consequences. Both De Vries and Tschermak failed to understand concepts such as dominance and conflated Mendel's two laws into one. It is therefore clear that Mendel's work was understood neither in its technical aspects nor in its importance. In fact, its relevance was recognized before it was understood technically.

Thomas H. Morgan (1866-1945)

Chromosome theory of inheritance

Chromosome maps

Sex-linked inheritance

Nobel Prize, 1933

The term "molecular biology" was coined in 1938 by Warren Weaver, Director of the Natural Sciences Division of the Rockefeller Foundation.

Warren Weaver (1894-1978)

Warren Weaver Hall, Washington

Historical milestones: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick

Nobel Prize, 1962

1990: ICT AND BIOLOGY

The knowledge society

• It refers to the spectacular increase in, and unprecedented acceleration of, the pace at which information and knowledge are created, accumulated, distributed, and used.

- Knowledge is created, accumulated, disseminated, and used because it guides decisions and enables intervention in the world in accordance with certain aims and values (Morales 2001).

• The Knowledge Society model is under construction, as is society itself (Olivé 2005).

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: it changes, with both content and formats under constant modification

• Massive

• Applications of meta-analysis

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a communication medium

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated, and consulted

– Electronic journals

– Electronic processing

– Open review

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (Biology)

In vivo
In vitro (1900)
In silico (1990)

[Figure: documents per year, 1900-2005 (x-axis: Year; y-axis: Documents), for the genera Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), and Caenorhabditis (5353).]

[Figure: documents per year, 1900-2005 (x-axis: Year; y-axis: Documents), for the genera Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), and Danio (973).]

Figure 21

[Figure: bar chart of documents by subject area (x-axis: Subject Area; y-axis: Documents). Subject areas shown: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Biology, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotechnology & Microbiology, and Behavioral Sciences.]

Figure 22

[Figure: documents per year, 1905-2005 (x-axis: Year; y-axis: Documents), for the subject areas Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, and Neuroscience.]

Figure 23

[Figure: documents per year, 1905-2005 (x-axis: Year; y-axis: Documents), for the subject areas Evolutionary Biology, Ecology, Zoology, and Toxicology.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• It is an annotated collection of all publicly available nucleotide sequences and their protein translations.

• National Center for Biotechnology Information (NCBI)
• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)
• DNA Data Bank of Japan (DDBJ)
• They receive sequences produced in laboratories throughout the world, from more than 100,000 distinct organisms.
• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29,300 million nucleotide bases in more than 23.0 million sequences.

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers.

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
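A doubling time of 10 months corresponds to the growth law N(t) = N0 * 2^(t/10), with t in months. The sketch below only illustrates that formula using the Release 134 base count quoted above as the starting point; the function name is hypothetical, and the projection is an extrapolation of the stated rate, not a record of actual later releases.

```python
def projected_bases(initial_bases: float, months_elapsed: float,
                    doubling_months: float = 10.0) -> float:
    """Size after `months_elapsed` months of exponential growth with a fixed doubling time."""
    return initial_bases * 2 ** (months_elapsed / doubling_months)


release_134_bases = 29.3e9  # nucleotide bases in Release 134 (February 2003)

for months in (0, 10, 20, 30):
    size = projected_bases(release_134_bases, months)
    print(f"{months:>2} months later: {size / 1e9:.1f} billion bases")
```

Under this assumption the collection grows eightfold in 30 months, which is what "exponential rate" means in practice for storage and search infrastructure.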

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989; 245(4925): 1434-1435.

[Figure: documents in PubMed (NIH) per year, 1865-2005 (y-axis: 0 to 800,000). About 20 million documents in total (October 2010).]

e-science / cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, in 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, in 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A techno-social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and encompass new research methodologies arising from greater access to:

– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering both discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will enable researchers to carry out their work more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect

• Collaboration

• New research fields are emerging that use new data-mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• It results from the use and application of cyberinfrastructure in scientific practice

• It is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. It was published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except those from the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except those from the following: Department of Biology; Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• It is the systematic study of the complete DNA sequences of organisms (MeSH 2010); this study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000).

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

[Figure: number of publications per year, 1988-2009 (x-axis: Year; y-axis: Publications).]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Figure: number of publications by institution (x-axis: Institution; y-axis: Publications).]

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062. http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that indicate some biological event (biomarkers). The values of the measured parameters are examined in search of a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that try to measure thousands of parameters instead of just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is a comprehensive, integrated understanding of biological processes through the identification and correlation of various "players" (e.g., genes, RNA, proteins, metabolites), rather than studying each of them individually.

• Unfortunately, a study on the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
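The statistical problem mentioned in the last bullet (many measurements, few samples) can be illustrated with a small calculation. The sketch below is not taken from the cited article; it assumes, purely for illustration, that each measured parameter is tested independently at a significance level `alpha`, and the function names are hypothetical.

```python
def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results when no real effect exists."""
    return num_tests * alpha


def family_wise_error_rate(num_tests: int, alpha: float) -> float:
    """Probability of at least one false positive, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    # Illustrative numbers only: 10,000 measured parameters, alpha = 0.05.
    print(expected_false_positives(10_000, 0.05))
    print(family_wise_error_rate(20, 0.05))
    print(bonferroni_threshold(10_000, 0.05))
```

Under these assumptions, measuring thousands of parameters at an uncorrected threshold would be expected to yield hundreds of spurious associations, which is one plausible reason why follow-up studies often fail to confirm initial findings.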

OMICS: ARTICLE

Genomics: "Genomics buzzword or reality"
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"
http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics the interaction analysis of cells"
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics evolution and genomics intersection"
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya
• Lourdes Valencia
• Fernando Galicia
• Roberto Calderón
• Lyssania Macías

It has not escaped our notice that the more we learn about the human genome, the more there is to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot, as quoted in the 2001 Nature human genome paper


Rediscovery of Mendel's laws

In 1900, three researchers independently published the results of their work, referring to Mendel's "precursor" work:

• The Dutch botanist Hugo de Vries (1848-1935)
• The German botanist Carl Correns (1864-1933)
• The Austrian Erich Tschermak von Seysenegg (1871-1962)

This event marked the beginning of the study of genetics; Mendelian knowledge was extended and applied.

• Only Correns fully understood Mendel's work and its consequences. Both De Vries and Tschermak failed to grasp concepts such as dominance and conflated Mendel's two laws into one. It is therefore clear that Mendel's work was understood neither in its technical aspects nor in its importance. In fact, its relevance was recognized before it was understood technically.

Thomas H. Morgan (1866-1945)

Chromosome theory of inheritance

Chromosome maps

Sex-linked inheritance

Nobel Prize 1933

The term "molecular biology" was coined in 1938 by Warren Weaver, Director of the Natural Sciences Division of the Rockefeller Foundation.

Warren Weaver (1894-1978)

Warren Weaver Hall, Washington

Historical milestones: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick

Nobel Prize 1962

1990: ICT AND BIOLOGY

The knowledge society

• Refers to the spectacular increase and unprecedented acceleration in the pace at which information and knowledge are created, accumulated, distributed, and used

– Knowledge is created, accumulated, disseminated, and used because it guides decisions and allows intervention in the world in accordance with certain ends and values (Morales 2001)

• The model of the Knowledge Society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Design of computers
• Emergence of the Internet
• Mass adoption of the web
• Digital format: low cost, little space
• Information explosion
• Large number of data collections
• Dynamic: both content and formats are constantly changing
• Massive
• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format
• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet
• Use of the Web as a means of communication
• The number of researchers and of their publications doubles approximately every twenty years
• Between 80% and 90% of all the scientists who have ever lived are alive today
• More than two million articles are published each year
• One million patents are granted
• Changes in the way information is produced, evaluated, and consulted
– Electronic journals
– Electronic procedures
– Open evaluation
– E-prints, preprints, and postprints
– Immediacy
• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

Observation
Description
Classification
Experimentation
Comparison

Scientific research (Biology)

In vivo, in vitro (1900), in silico (1990)

[Chart: documents per year, 1900-2005 (x-axis: Year; y-axis: Documents). Legend: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353).]

[Chart: documents per year, 1900-2005 (x-axis: Year; y-axis: Documents). Legend: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973).]

Figure 21

[Bar chart: documents by subject area (x-axis: Subject Area; y-axis: Documents). Categories: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Scien.]

Figure 22

[Chart: documents per year, 1905-2005 (x-axis: Year; y-axis: Documents). Legend: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience.]

Figure 23

[Chart: documents per year, 1905-2005 (x-axis: Year; y-axis: Documents). Legend: Evolutionary Biology, Ecology, Zoology, Toxicology.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations
• National Center for Biotechnology Information (NCBI)
• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)
• DNA Data Bank of Japan (DDBJ)
• They receive sequences produced in laboratories around the world from more than 100,000 distinct organisms
• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29,300 million nucleotide bases in more than 23.0 million sequences
• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
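The growth claim on this slide (doubling every 10 months) corresponds to the model N(t) = N0 · 2^(t / 10), with t in months. The following is a minimal sketch of that arithmetic, assuming the doubling time stays constant, which the slide does not guarantee; the function names are hypothetical and the starting value is the Release 134 figure quoted above.

```python
import math


def projected_size(initial: float, months: float, doubling_months: float = 10.0) -> float:
    """Size after `months`, assuming a constant doubling time."""
    return initial * 2.0 ** (months / doubling_months)


def months_to_reach(initial: float, target: float, doubling_months: float = 10.0) -> float:
    """Months needed to grow from `initial` to `target` under the same assumption."""
    return doubling_months * math.log2(target / initial)


if __name__ == "__main__":
    bases_feb_2003 = 29.3e9  # Release 134 figure quoted on the slide
    # Extrapolated base count 30 months later (three doublings).
    print(projected_size(bases_feb_2003, 30))
    # Months needed for a tenfold increase.
    print(months_to_reach(bases_feb_2003, 10 * bases_feb_2003))
```

Under this assumption a tenfold increase takes a little under three years, which gives a sense of what "exponential" means for storage and analysis infrastructure.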

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435.

[Chart: documents in PubMed (NIH) per year, 1865-2005; y-axis from 0 to 800,000. About 20 million documents (October 2010).]

e-science, cyberinfrastructure

• e-science (Europe)
• United Kingdom's Office of Science and Technology, 1999
• Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)
• United States National Science Foundation (NSF) blue-ribbon committee, 2003
• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• Includes an interoperable set of diverse elements

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools
– 2) Data collections
– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:
– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories
– Software and infrastructure services that ensure connectivity and interoperability
– Application tools, covering discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their work more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research are emerging that use new techniques for data mining and analysis, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science (2008). Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice
• Characterized by inter- and multidisciplinarity
• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project, published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The Escherichia coli genome is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010); this study has helped to understand polymorphisms within species, protein interactions, and evolution (Brent 2000)

• Includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Chart: number of publications per year, 1988-2009 (x-axis: Year; y-axis: Publications, 0 to 3500).]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Chart: number of publications by institution (x-axis: Institution; y-axis: Publications, 0 to 600).]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
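The figures in this abstract can be cross-checked with simple arithmetic. The sketch below takes the rounded values as published in the Science abstract (14.8 billion bp of reads, 27,271,853 reads, a 2.91 billion bp consensus, one difference per 1250 bp); because the inputs are rounded, the results only approximate the reported 5.11-fold coverage. The function names are hypothetical and chosen for illustration.

```python
def fold_coverage(total_read_bases: float, genome_size: float) -> float:
    """Average number of times each base of the genome is covered by reads."""
    return total_read_bases / genome_size


def mean_read_length(total_read_bases: float, num_reads: int) -> float:
    """Average read length in base pairs."""
    return total_read_bases / num_reads


def expected_differences(genome_size: float, bp_per_difference: float) -> float:
    """Expected differing positions between two haploid genomes at a uniform rate."""
    return genome_size / bp_per_difference


if __name__ == "__main__":
    read_bases = 14.8e9      # total sequenced bases (rounded, from the abstract)
    reads = 27_271_853       # number of high-quality reads
    consensus = 2.91e9       # euchromatic consensus length (rounded)

    print(round(fold_coverage(read_bases, consensus), 2))      # close to the reported 5.11
    print(round(mean_read_length(read_bases, reads)))          # roughly 540 bp per read
    print(round(expected_differences(consensus, 1250)))        # about 2.3 million positions
```

The uniform-rate assumption in `expected_differences` is a simplification: the abstract itself notes marked heterogeneity in polymorphism levels across the genome.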

J C Venter et al Science 291 1304 -1351 (2001)

Fig 2 Flow diagram for sequencing pipeline Samples are received selected and processed in compliance with standard operating procedures with a focus on quality within and across departments Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines Manufacturing pipeline processes products quality control measures and responsible parties are indicated and are described further in the text

httpgenomeucsceducgi-binhgTracksorg=human

Registro de PubMed

httpwwwncbinlmnihgovSitemapsamplerecordhtml

OMICSbull Los estudios ldquoomicsrdquo estaacuten involucrados en el anaacutelisis de cantidades grandes de paraacutemetros

generalmente proteiacutenas (proteomics) liacutepidos (lipidomics) y metabolitos (metabolomics) El teacutermino ldquoomicsrdquo deriva del sufijo griego ldquoomerdquo que significa muchos o masa Las mediciones se hacen a base de marcadores quiacutemicos que son indicativos de alguacuten evento bioloacutegico (biomarkers) Los valores asociados con los paraacutemetros medidos son investigados con el fin de encontrar una correlacioacuten con enfermedades Cuando los objetos de investigacioacuten son proteinas genes y metabolitos las aproximaciones correspondientes son proteomics genomics y metabolommics

bull En antildeos recientes con el desarrollado de meacutetodos para medir y analizar un nuacutemero muy grande de analitos de una sola muestra se ha popularizado las investigaciones que intentan medir miles de paraacutemetros en vez de soacutelo unos cuantos Esta aproximacioacuten es la que dio pie a los experimentos modernos de ldquoomicsrdquo

bull La meta final de las aproximaciones ldquoomicsrdquo es comprender comprehensiva e integralmente los procesos bioloacutegicos mediante la identificacioacuten y correlacioacuten de varios ldquojugadoresrdquo (eg genes RNA proteiacutenas metabolitos) en vez de estudiar cada uno de ellos de manera individual

bull Lamentablemente un estudio sobre la probabilidad de reproducir investigaciones en genomics (asociar grupos de genes con enfermedades complejas) en el cual se compararon 370 estudios reveloacute una alta frecuencia de investigaciones subsecuentes que no confirmaban los descubrimientos iniciales Debido al enorme nuacutemero de mediciones y el nuacutemero limitado de muestras de investigacioacuten surgen problemas relacionados a la estadiacutestica el sesgo la metodologiacutea y el uso inadecuado del meacutetodo

bull

bull Layjr J Liyanage R Borgmann S amp Wilkins C (2006) Problems with the ldquoomicsrdquo TrAC Trendsin Analytical Chemistry 25(11) 1046-1056 doi 101016jtrac200610007

Omics | Article | URL

Genomics | Genomics: buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656

Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from

DGAPA UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we learn about the human genome, the more there is to explore. 'We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time.'"

T. S. Eliot, as quoted in the closing lines of the human genome paper in Nature (2001)


Thomas H. Morgan (1866-1945)

Chromosome theory of inheritance

Chromosome maps

Sex-linked inheritance

Nobel Prize, 1933

The term molecular biology was coined in 1938 by Warren Weaver, Director of the Natural Sciences Division of the Rockefeller Foundation.

Warren Weaver (1894-1978)

Warren Weaver Hall, Washington

Historical milestones: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

James Watson and Francis Crick

Nobel Prize, 1962

1990: ICT AND BIOLOGY

The knowledge society

• The term refers to the spectacular increase and unprecedented acceleration in the rate at which information and knowledge are created, accumulated, distributed, and used.

– Knowledge is created, accumulated, disseminated, and used because it guides decisions and makes it possible to intervene in the world in accordance with certain ends and values (Morales 2001).

• The model of the knowledge society is still under construction, as is society itself (Olivé 2005).

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: both content and formats are constantly changing

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet

• Use of the web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published every year

• One million patents are granted

• Changes in the way information is produced, evaluated, and consulted

– Electronic journals

– Electronic procedures

– Open evaluation

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization

• Collaboration

• Multidisciplinarity

• International projects

• Grids

• New models

• New disciplines

– Bioinformatics

– Medical informatics

– Neuroinformatics

– Systems biology

• Data collections

– Biological

– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (biology)

In vivo, in vitro (1900), in silico (1990)

[Figure: documents per year, 1900-2005, by model-organism genus: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year vs. Documents.]

[Figure: documents per year, 1900-2005, by model-organism genus: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year vs. Documents.]

Figure 21

[Figure: documents by subject area: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Biology, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotechnology & Microbiology, Behavioral Sciences. Axes: Subject Area vs. Documents.]

Figure 22

[Figure: documents per year, 1905-2005: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience. Axes: Year vs. Documents.]

Figure 23

[Figure: documents per year, 1905-2005: Evolutionary Biology, Ecology, Zoology, Toxicology. Axes: Year vs. Documents.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• These centers receive sequences produced in laboratories around the world from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science 1989; 245(4925): 1434-1435. [PubMed]
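
The doubling claim on this slide can be turned into a small projection. This is only a sketch under the slide's own assumption of a constant 10-month doubling time; the function names are hypothetical, and the starting value is the Release 134 base count quoted above (about 29.3 billion bases).

```python
import math


def projected_size(initial_size: float, months_elapsed: float,
                   doubling_months: float = 10.0) -> float:
    """Size after months_elapsed, assuming steady exponential doubling."""
    return initial_size * 2.0 ** (months_elapsed / doubling_months)


def months_to_reach(initial_size: float, target_size: float,
                    doubling_months: float = 10.0) -> float:
    """Months needed to grow from initial_size to target_size."""
    return doubling_months * math.log2(target_size / initial_size)


if __name__ == "__main__":
    release_134_bases = 29.3e9  # bases in February 2003, as quoted on the slide
    for months in (10, 20, 60):
        print(months, "months:", projected_size(release_134_bases, months))
    print("months to reach 1e12 bases:",
          months_to_reach(release_134_bases, 1e12))
```

A constant doubling time is a strong simplification, so the projection is best read as an order-of-magnitude guide rather than a forecast.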

[Figure: documents in PubMed (NIH) per year, 1865-2005. Vertical axis: 0 to 800,000 documents. About 20 million documents (October 2010).]

e-science and cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It comprises an interoperable set of elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communications networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools that encompass discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will enable researchers to carry out their work more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research emerging through new data-mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• It results from the use and application of cyberinfrastructure in scientific practice

• It is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project, which was published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome

79 authors, 48 institutions, 181 references

All authors come from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: The Sequence of the Human Genome

276 authors, 14 institutions, 452 references

All authors come from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped us understand polymorphisms within species, protein interactions, and evolution (Brent 2000).

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is growing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

[Figure: number of publications per year, 1988-2009. Axes: Year vs. Publications (0 to 3,500).]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Figure: number of publications per institution. Axes: Institution vs. Publications (0 to 600).]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
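
The headline numbers in this abstract are related by simple arithmetic, which the sketch below works through. It is an illustration only: the function names are hypothetical, and the inputs are the values of the published abstract (14.8 billion bp of sequence, 27,271,853 reads, a 2.91-billion bp consensus, one difference per 1250 bp).

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average number of times each genome position is sequenced."""
    return total_bases / genome_size


def mean_read_length(total_bases: float, n_reads: int) -> float:
    """Average length of a sequence read, in base pairs."""
    return total_bases / n_reads


def expected_differences(genome_size: float, bp_per_difference: float) -> float:
    """Expected differing positions between two haploid genomes."""
    return genome_size / bp_per_difference


if __name__ == "__main__":
    total_bases = 14.8e9   # sequenced bases reported in the abstract
    n_reads = 27_271_853   # high-quality reads reported in the abstract
    genome_size = 2.91e9   # consensus sequence length reported in the abstract
    print("fold coverage:", fold_coverage(total_bases, genome_size))
    print("mean read length (bp):", mean_read_length(total_bases, n_reads))
    print("expected differences:", expected_differences(genome_size, 1250))
```

The computed coverage comes out slightly below the reported 5.11-fold figure, presumably because the authors divided by a somewhat different genome-size estimate; that explanation is an assumption, not something stated on the slide.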

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", which means many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with diseases. When the objects of investigation are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.


Page 10: Genomica colombia

El teacutermino biologiacutea molecular fue acuntildeado por el Director de la

Divisioacuten de Ciencias Naturales de la fundacioacuten Rokefeller Warren

Weaver en 1938

Warren Weaver

(1894-1978)Warren Weaver Hall

Washington

Hechos histoacutericos httpwwwnaturecomnaturejournalv422n6934pdftimeline_01626pdf

James Watson y Francis Crick

Premio Nobel 1962

1990 TICS Y BIOLOGIacuteA

La sociedad del conocimiento

bull Se refiere al incremento espectacular y a la aceleracioacuten sin precedente del ritmo de creacioacuten acumulacioacuten distribucioacuten y aprovechamiento de la informacioacuten y el conocimiento

-El conocimiento se crea se acumula se difunde y se aprovecha pues orienta las decisiones y permite la intervencioacuten en el mundo de acuerdo con ciertos fines y valores (Morales 2001)

bull El modelo de la Sociedad del Conocimiento estaacute en construccioacuten al igual que la sociedad misma (Oliveacute 2005)

(Re)evolucioacuten de la informacioacuten (1990)

bull Disentildeo de las computadoras

bull Aparicioacuten del Internet

bull Masificacioacuten de la web

bull Formato digital bajo costo poco espacio

bull Explosioacuten de la informacioacuten

bull Gran cantidad de colecciones de datos

bull Dinaacutemica cambia estaacute en modificacioacuten constante tanto el contenido como los formatos

bull Masiva

bull Aplicaciones de mata-anaacutelisis

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever lived are alive today

• More than two million articles are published every year

• One million patents are granted

• Changes in the way information is produced, evaluated and consulted

– Electronic journals

– Electronic processing

– Open evaluation

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (biology)

In vivo, in vitro (1900), in silico (1990)

[Figure: documents per year, 1900-2005, by model-organism genus (document counts in parentheses). First panel: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Second panel: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year, Documents.]

Figure 21

[Figure: bar chart of documents by subject area. Categories: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Scien. Axes: Subject Area, Documents. Bar labels as extracted: 1258, 1258, 1231, 1200, 782, 706, 10907, 8137, 4310, 2519, 2482, 2365, 2340, 1811, 1465, 6328, 17900. Other labels as extracted: (2), (1), (20), (3), (15), (18), (6), (3), (2).]

Figure 22

[Figure: documents per year, 1905-2005, for Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology and Neuroscience. Axes: Year, Documents.]

Figure 23

[Figure: documents per year, 1905-2005, for Evolutionary Biology, Ecology, Zoology and Toxicology. Axes: Year, Documents.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library at the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• These centers receive sequences produced in laboratories all over the world, from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science 1989; 245(4925): 1434–1435.
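The doubling time quoted for GenBank can be turned into a yearly growth factor with a few lines of arithmetic. The sketch below is only an illustration under the assumption of perfectly exponential growth; the function names are hypothetical, and the only figures taken from the slide are the 10-month doubling time and the 29.3 billion bases of Release 134.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Factor by which an exponentially growing collection multiplies in 12 months."""
    return 2.0 ** (12.0 / doubling_months)


def projected_size(size_now: float, months_ahead: float, doubling_months: float) -> float:
    """Size after `months_ahead` months, assuming a constant doubling time."""
    return size_now * 2.0 ** (months_ahead / doubling_months)


# Figures quoted on the slide (Release 134, February 2003).
BASES_RELEASE_134 = 29.3e9
DOUBLING_MONTHS = 10.0

factor = annual_growth_factor(DOUBLING_MONTHS)
two_doublings = projected_size(BASES_RELEASE_134, 20, DOUBLING_MONTHS)

print(f"Growth factor per year: {factor:.2f}")
print(f"Projected bases after 20 months: {two_doublings:.3e}")
```

The same function applies to the earlier claim that publications double roughly every twenty years: `annual_growth_factor(240)` corresponds to growth of about 3.5% per year, far slower than the sequence collection.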

[Figure: documents in PubMed (NIH) per year, 1865-2005. About 20 million documents (October 2010).]

e-science and cyberinfrastructure

• e-science (Europe)

– United Kingdom's Office of Science and Technology, 1999

– Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

– United States National Science Foundation (NSF) blue-ribbon committee, 2003

– Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate and preserve data, information and knowledge through acquisition, storage, management, integration, computing, mining, visualization and other services over the Internet (NSF 2003, 2007)

• It comprises an interoperable set of elements:

– 1) Infrastructure: computing systems (hardware, software and networks), services, instruments and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from greater access to:

– Broadband communication networks, research instruments and facilities, sensor networks and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering both discipline-specific tools and interaction tools

• It advances and augments traditional research methodologies rather than replacing them

• It will allow researchers to carry out their work more creatively, efficiently and collaboratively over long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1
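The Grid idea described above, many modest machines each processing one piece of a large job, can be illustrated on a single computer. The sketch below is a toy stand-in and not Grid middleware: it uses Python's standard `concurrent.futures.ThreadPoolExecutor` in place of remote machines, and the task (counting G and C bases in a sequence) and all function names are assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_count(chunk: str) -> int:
    """Work done by one 'node': count G and C bases in its piece."""
    return sum(1 for base in chunk if base in "GC")


def split(sequence: str, n_parts: int) -> list[str]:
    """Cut the sequence into at most n_parts contiguous pieces."""
    size = -(-len(sequence) // n_parts)  # ceiling division
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def distributed_gc_fraction(sequence: str, n_workers: int = 4) -> float:
    """Scatter the pieces to workers, then gather and combine the partial counts."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    chunks = split(sequence, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(gc_count, chunks))
    return sum(partial_counts) / len(sequence)


if __name__ == "__main__":
    toy_sequence = "GGCCAATT" * 1000  # hypothetical data, half G/C by construction
    print(distributed_gc_fraction(toy_sequence, n_workers=4))
```

The scatter and gather steps are the part that carries over to real distributed settings; scheduling, data transfer and fault tolerance, which Grid standards address, are left out entirely.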

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project, published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome

• 79 authors
• 48 institutions
• 181 references
• All authors came from genomic sciences (or genetics) departments, except those from: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome

• 276 authors
• 14 institutions
• 452 references
• All authors came from genomic sciences (or genetics) departments, except those from: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010); it has helped to understand polymorphisms within species, protein interactions and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: number of publications per year, 1988-2009. Axes: Year, Publications.]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", by C. A. Ferguson-Smith and F. H. Ruddle, Department of Biology, Yale University, New Haven, Connecticut

[Figure: number of publications per institution. Axes: Institution, Publications.]

Nature 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute Center for Genome Research

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
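The figures in the abstract are linked by simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome size, and the mean read length is the total number of bases divided by the number of reads. The sketch below recomputes both from the numbers quoted above; the function names are hypothetical, and using the 2.91-billion bp consensus length as the genome size is an assumption, which is why the result comes out slightly below the 5.11-fold stated by the authors.

```python
def fold_coverage(total_read_bases: float, genome_size_bp: float) -> float:
    """Average number of times each genome position is covered by reads."""
    return total_read_bases / genome_size_bp


def mean_read_length(total_read_bases: float, n_reads: int) -> float:
    """Average length of a sequence read, in base pairs."""
    return total_read_bases / n_reads


def expected_pairwise_differences(genome_size_bp: float, bp_per_difference: float) -> float:
    """Expected differing positions between two haploid genomes at the quoted rate."""
    return genome_size_bp / bp_per_difference


# Values quoted in the abstract.
N_READS = 27_271_853
READ_BASES = 14.8e9   # 14.8 billion bp of raw sequence
GENOME_BP = 2.91e9    # consensus length, used here as the genome size (assumption)

print(f"Mean read length: {mean_read_length(READ_BASES, N_READS):.0f} bp")
print(f"Fold coverage: {fold_coverage(READ_BASES, GENOME_BP):.2f}x")
print(f"Expected differences between two haploid genomes: "
      f"{expected_pairwise_differences(GENOME_BP, 1250):.3e}")
```

Under these assumptions the reads average a little over 540 bp, and a rate of 1 difference per 1250 bp implies on the order of two million differing positions between any two haploid genomes, the same order of magnitude as the 2.1 million SNPs reported.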

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies analyze large numbers of parameters, usually proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements rely on chemical markers that indicate some biological event (biomarkers). The values of the measured parameters are examined for correlations with disease. When the objects of study are proteins, genes and metabolites, the corresponding approaches are proteomics, genomics and metabolomics.

• In recent years, as methods were developed to measure and analyze very large numbers of analytes in a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is a comprehensive, integrated understanding of biological processes, reached by identifying and correlating the various "players" (e.g. genes, RNA, proteins, metabolites) rather than studying each one individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, found that subsequent investigations frequently failed to confirm the initial findings. Because the number of measurements is enormous and the number of samples limited, problems arise with statistics, bias, methodology and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
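The statistical problem mentioned in the last bullet, many measurements and few samples, can be made concrete with the standard multiple-testing arithmetic. The sketch below assumes independent tests in which no parameter has any real effect; the parameter count of 10,000 and the function names are hypothetical, and the Bonferroni correction is shown as one common remedy, not as the method used in the cited study.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results when no real effect exists."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Family-wise error rate for independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


N_PARAMETERS = 10_000  # hypothetical number of analytes measured per sample
ALPHA = 0.05

print("Expected false positives:", expected_false_positives(N_PARAMETERS, ALPHA))
print("P(at least one false positive):",
      prob_at_least_one_false_positive(N_PARAMETERS, ALPHA))
print("Bonferroni per-test threshold:", bonferroni_threshold(N_PARAMETERS, ALPHA))
```

With thousands of parameters tested at the conventional 5% level, hundreds of spurious associations are expected by chance alone, which is consistent with follow-up studies often failing to confirm initial findings.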

OMICS | ARTICLE | URL

Genomics | Genomics: buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | Metabolomic | httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM

Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | Vaccinomics | httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf

Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656

Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is funded by DGAPA, UNAM (PAPIME project PE 201509)

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more remains to be explored. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot (quoted in Nature, 2001)

Page 11: Genomica colombia

Hechos histoacutericos httpwwwnaturecomnaturejournalv422n6934pdftimeline_01626pdf

James Watson y Francis Crick

Premio Nobel 1962

1990 TICS Y BIOLOGIacuteA

La sociedad del conocimiento

bull Se refiere al incremento espectacular y a la aceleracioacuten sin precedente del ritmo de creacioacuten acumulacioacuten distribucioacuten y aprovechamiento de la informacioacuten y el conocimiento

-El conocimiento se crea se acumula se difunde y se aprovecha pues orienta las decisiones y permite la intervencioacuten en el mundo de acuerdo con ciertos fines y valores (Morales 2001)

bull El modelo de la Sociedad del Conocimiento estaacute en construccioacuten al igual que la sociedad misma (Oliveacute 2005)

(Re)evolucioacuten de la informacioacuten (1990)

bull Disentildeo de las computadoras

bull Aparicioacuten del Internet

bull Masificacioacuten de la web

bull Formato digital bajo costo poco espacio

bull Explosioacuten de la informacioacuten

bull Gran cantidad de colecciones de datos

bull Dinaacutemica cambia estaacute en modificacioacuten constante tanto el contenido como los formatos

bull Masiva

bull Aplicaciones de mata-anaacutelisis

(Re)evolucioacuten de la informacioacuten (cientiacutefica)bull La adopcioacuten del formato electroacutenico

bull Produccioacuten acelerada de una gran variedad de programas aplicaciones herramientas utilidades recursos y servicios electroacutenicos para la praacutectica cientiacutefica muchos de ellos disponibles a traveacutes de la Internet

bull Uso de la Web como medio de comunicacioacuten

bull El nuacutemero de los investigadores y de sus publicaciones se duplica aproximadamente cada veinte antildeos

bull Viven actualmente entre un 80 y un 90 de los cientiacuteficos que han existido

bull Se publican cada antildeo mas de dos millones de artiacuteculos

bull Se conceden un milloacuten de patentes

bull Modificacioacuten en la forma de producir evaluar y consultar la informacioacuten

ndash Revistas electroacutenicas

ndash Procedimiento electroacutenico

ndash Evaluacioacuten libre

ndash E-prints Preprints y posprints

ndash Inmediatez

bull Se han producido nuevas disciplinas cientiacuteficas

La transformacioacuten de la biologiacutea

bull Molecularizacioacutenbull Colaboracioacutenbull Multidisciplinariedadbull Proyectos internacionalesbull Gridsbull Nuevos modelosbull Nuevas disciplinas

ndash Bioinformaacuteticandash Informaacutetica Meacutedicandash Neuroinformaacuteticandash Biologiacutea de Sistemas

bull Colecciones de datosndash Bioloacutegicasndash Bibliograacuteficas

Investigacioacuten bioloacutegica

Observacioacuten

Descripcioacuten

ClasificacioacutenExperimentacioacuten

Comparacioacuten

Investigacioacuten Cientiacutefica(Biologiacutea)

In vivoIn vitro (1900)In Silico (1990)

0

500

1000

1500

2000

2500

3000

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Escherichia (94873)

Drosophila (48989)

Saccharomyces (27549)

Arabidopsis (18094)

Caenorhabditis (5353)

Year

Docu

men

ts

0

50

100

150

200

250

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Zea (7636)

Neurospora (6640)

Dictyostelium (6191)

Chlamydomonas (5646)

Schizosaccharomyces (3183)

Danio (973)

Year

Docu

men

tos

Figure 21

0

5

10

15

20

25

30

35

40

Ge

ne

tics amp H

ere

dity

Bio

che

m amp

Mo

l Bio

Ce

ll Bio

De

velo

p B

io

Mu

ltidiscip

linary

Ne

uro

scien

ces

Bio

logy

Zoo

logy

Evolu

tion

ary Bio

Ecolo

gy

Ento

mo

logy

Toxico

logy

Ph

ysiolo

gy

Bio

ph

ysics

Bio

tech

amp M

icrob

io

Be

havio

ral Scien

Do

cum

en

ts

Subject Area

12581258 1231 1200 782 706

10907

8137

4310

2519 2482 2365 23401811

1465

6328

17900

(2)(1)

(20)

(3)

(15)

(18)

(6)

(3)(2)

Figure 22

0

100

200

300

400

500

600

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Genetics amp Heredity

Biochemistry amp Molecular Biology

Cell Biology

Developmental Biology

Neuroscience

Year

Docu

men

ts

Figure 23

0

10

20

30

40

50

60

70

80

90

100

19

05

19

10

19

15

19

20

19

25

19

30

19

35

19

40

19

45

19

50

19

55

19

60

19

65

19

70

19

75

19

80

19

85

19

90

19

95

20

00

20

05

Evolutionary Biology

Ecology

Zoology

Toxicology

Year

Docu

men

ts

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations
• National Center for Biotechnology Information (NCBI)
• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)
• DNA Data Bank of Japan (DDBJ)
• These centers receive sequences produced in laboratories around the world, from more than 100,000 distinct organisms
• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29,300 million nucleotide bases in more than 23.0 million sequences
• It is built from direct submissions by individual laboratories and by large-scale sequencing centers
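The doubling claim above can be made concrete with a small sketch. It assumes idealized exponential growth with a constant 10-month doubling time, starting from the Release 134 figure of 29.3 billion bases; the function name `projected_bases` is hypothetical, and actual GenBank growth need not follow this curve.

```python
def projected_bases(initial_bases: float, months_elapsed: float,
                    doubling_months: float = 10.0) -> float:
    """Database size after `months_elapsed`, assuming a constant doubling time."""
    return initial_bases * 2 ** (months_elapsed / doubling_months)


# Release 134 (February 2003) figure quoted on the slide.
RELEASE_134_BASES = 29.3e9

if __name__ == "__main__":
    # Illustrative projection only: one row per doubling period.
    for months in (0, 10, 20, 30):
        size = projected_bases(RELEASE_134_BASES, months)
        print(f"{months:2d} months: {size:.3e} bases")
```

Under this assumption the collection grows eightfold in 30 months, which is why storage and search tools have to be planned for growth rather than for the current size.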

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989; 245(4925): 1434–1435. [PubMed]

[Figure: documents in PubMed (NIH) per year, 1865–2005 (y-axis scale 0–800,000). Close to 20 million documents in total (October 2010).]

e-science, cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate and preserve data, information and knowledge through acquisition, storage, management, integration, computing, mining, visualization and other services over the Internet (NSF 2003, 2007)

• Includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software and networks), services, instruments and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from increased access to:

– Broadband communication networks, research instruments and facilities, sensor networks and data repositories

– Software and infrastructure services that make it possible to ensure connectivity and interoperability

– Application tools, covering both discipline-specific tools and interaction tools

• Advances and augments traditional research methodologies rather than replacing them

• Will allow researchers to carry out their work more creatively, efficiently and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research are emerging that use new data-mining and analysis techniques, advanced computational algorithms and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster or new research results rather than the technologies used to get them.

National Centre for e-Social Science (2008). Frequently Asked Questions. Available at httpwwwncessacukabout_eSSfaqq=General_1General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
– 79 authors
– 48 institutions
– 181 references
– All authors come from departments of genomic sciences (or genetics), except the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: The Sequence of the Human Genome
– 276 authors
– 14 institutions
– 452 references
– All authors come from departments of genomic sciences (or genetics), except the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics
1953: James Watson and Francis Crick describe the double-helix structure of DNA
1966: Marshall Nirenberg, Har Gobind Khorana and Robert Holley determine the genetic code
1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology
1977: Frederick Sanger, Allan Maxam and Walter Gilbert develop DNA sequencing methods
1982: The GenBank database is established
1985: The polymerase chain reaction (PCR) is invented
1990: The Human Genome Project (HGP) begins in the United States
1997: The genome of Escherichia coli is sequenced
2003: The sequence of the human genome is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010); this study has helped to understand polymorphisms within species, protein interactions and evolution (Brent 2000)

• Includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: Number of publications per year, 1988–2009 (x-axis: Year; y-axis: Publications, scale 0–3500). Callout: Pathology and Immunopathology Research, 1988: 'The genomics of human homeobox-containing loci', written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology, Yale University, New Haven, Connecticut.]

[Figure: Number of publications by institution (x-axis: Institution; y-axis: Publications, scale 0–600).]

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860–921 (15 February 2001), doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860–921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute Center for Genome Research

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304–1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J Craig Venter1, Mark D Adams1, Eugene W Myers1, Peter W Li1, Richard J Mural1, Granger G Sutton1, Hamilton O Smith1, Mark Yandell1, Cheryl A Evans1, Robert A Holt1, Jeannine D Gocayne1, Peter Amanatides1, Richard M Ballew1, Daniel H Huson1, Jennifer Russo Wortman1, Qing Zhang1, Chinnappa D Kodira1, Xiangqun H Zheng1, Lin Chen1, Marian Skupski1, Gangadharan Subramanian1, Paul D Thomas1, Jinghui Zhang1, George L Gabor Miklos2, Catherine Nelson3, Samuel Broder1, Andrew G Clark4, Joe Nadeau5, Victor A McKusick6, Norton Zinder7, Arnold J Levine7, Richard J Roberts8, Mel Simon9, Carolyn Slayman10, Michael Hunkapiller11, Randall Bolanos1, Arthur Delcher1, Ian Dew1, Daniel Fasulo1, Michael Flanigan1, Liliana Florea1, Aaron Halpern1, Sridhar Hannenhalli1, Saul Kravitz1, Samuel Levy1, Clark Mobarry1, Knut Reinert1, Karin Remington1, Jane Abu-Threideh1, Ellen Beasley1, Kendra Biddick1, Vivien Bonazzi1, Rhonda Brandon1, Michele Cargill1, Ishwar Chandramouliswaran1, Rosane Charlab1, Kabir Chaturvedi1, Zuoming Deng1, Valentina Di Francesco1, Patrick Dunn1, Karen Eilbeck1, Carlos Evangelista1, Andrei E Gabrielian1, Weiniu Gan1, Wangmao Ge1, Fangcheng Gong1, Zhiping Gu1, Ping Guan1, Thomas J Heiman1, Maureen E Higgins1, Rui-Ru Ji1, Zhaoxi Ke1, Karen A Ketchum1, Zhongwu Lai1, Yiding Lei1, Zhenya Li1, Jiayin Li1, Yong Liang1, Xiaoying Lin1, Fu Lu1, Gennady V Merkulov1, Natalia Milshina1, Helen M Moore1, Ashwinikumar K Naik1, Vaibhav A Narayan1, Beena Neelam1, Deborah Nusskern1, Douglas B Rusch1, Steven Salzberg12, Wei Shao1, Bixiong Shue1, Jingtao Sun1, Zhen Yuan Wang1, Aihui Wang1, Xin Wang1, Jian Wang1, Ming-Hui Wei1, Ron Wides13, Chunlin Xiao1, Chunhua Yan1, Alison Yao1, Jane Ye1, Ming Zhan1, Weiqing Zhang1, Hongyu Zhang1, Qi Zhao1, Liansheng Zheng1, Fei Zhong1, Wenyan Zhong1, Shiaoping C Zhu1, Shaying Zhao12, Dennis Gilbert1, Suzanna Baumhueter1, Gene Spier1, Christine Carter1, Anibal Cravchik1, Trevor Woodage1, Feroze Ali1, Huijin An1, Aderonke Awe1, Danita Baldwin1, Holly Baden1, Mary Barnstead1, Ian Barrow1, Karen Beeson1, Dana Busam1, Amy Carver1, Angela Center1, Ming Lai Cheng1, Liz Curry1, Steve Danaher1, Lionel Davenport1, Raymond Desilets1, Susanne Dietz1, Kristina Dodson1, Lisa Doup1, Steven Ferriera1, Neha Garg1, Andres Gluecksmann1, Brit Hart1, Jason Haynes1, Charles Haynes1, Cheryl Heiner1, Suzanne Hladun1, Damon Hostin1, Jarrett Houck1, Timothy Howland1, Chinyere Ibegwam1, Jeffery Johnson1, Francis Kalush1, Lesley Kline1, Shashi Koduru1, Amy Love1, Felecia Mann1, David May1, Steven McCawley1, Tina McIntosh1, Ivy McMullen1, Mee Moy1, Linda Moy1, Brian Murphy1, Keith Nelson1, Cynthia Pfannkoch1, Eric Pratts1, Vinita Puri1, Hina Qureshi1, Matthew Reardon1, Robert Rodriguez1, Yu-Hui Rogers1, Deanna Romblad1, Bob Ruhfel1, Richard Scott1, Cynthia Sitter1, Michelle Smallwood1, Erin Stewart1, Renee Strong1, Ellen Suh1, Reginald Thomas1, Ni Ni Tint1, Sukyee Tse1, Claire Vech1, Gary Wang1, Jeremy Wetter1, Sherita Williams1, Monica Williams1, Sandra Windsor1, Emily Winn-Deen1, Keriellen Wolfe1, Jayshree Zaveri1, Karena Zaveri1, Josep F Abril14, Roderic Guigó14, Michael J Campbell1, Kimmen V Sjolander1, Brian Karlak1, Anish Kejariwal1, Huaiyu Mi1, Betty Lazareva1, Thomas Hatton1, Apurva Narechania1, Karen Diemer1, Anushya Muruganujan1, Nan Guo1, Shinji Sato1, Vineet Bafna1, Sorin Istrail1, Ross Lippert1, Russell Schwartz1, Brian Walenz1, Shibu Yooseph1, David Allen1, Anand Basu1, James Baxendale1, Louis Blick1, Marcelo Caminha1, John Carnes-Stine1, Parris Caulk1, Yen-Hui Chiang1, My Coyne1, Carl Dahlke1, Anne Deslattes Mays1, Maria Dombroski1, Michael Donnelly1, Dale Ely1, Shiva Esparham1, Carl Fosler1, Harold Gire1, Stephen Glanowski1, Kenneth Glasser1, Anna Glodek1, Mark Gorokhov1, Ken Graham1, Barry Gropman1, Michael Harris1, Jeremy Heil1, Scott Henderson1, Jeffrey Hoover1, Donald Jennings1, Catherine Jordan1, James Jordan1, John Kasha1, Leonid Kagan1, Cheryl Kraft1, Alexander Levitsky1, Mark Lewis1, Xiangjun Liu1, John Lopez1, Daniel Ma1, William Majoros1, Joe McDaniel1, Sean Murphy1, Matthew Newman1, Trung Nguyen1, Ngoc Nguyen1, Marc Nodell1, Sue Pan1, Jim Peck1, Marshall Peterson1, William Rowe1, Robert Sanders1, John Scott1, Michael Simpson1, Thomas Smith1, Arlan Sprague1, Timothy Stockwell1, Russell Turner1, Eli Venter1, Mei Wang1, Meiyuan Wen1, David Wu1, Mitchell Wu1, Ashley Xia1, Ali Zandieh1, Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
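The coverage figures in the abstract can be checked with simple arithmetic. The sketch below assumes that fold coverage is total sequenced bases divided by genome size, which is the usual definition; the function and constant names are hypothetical. Dividing by the 2.91-billion bp consensus length gives roughly 5.09-fold, slightly below the reported 5.11-fold. The abstract does not say which genome-size estimate was used for its figure, so that gap is left unexplained here.

```python
def mean_read_length(total_bases: float, n_reads: int) -> float:
    """Average number of bases per sequence read."""
    return total_bases / n_reads


def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Fold coverage: how many times, on average, each base was sequenced."""
    return total_bases / genome_size


# Figures quoted in the abstract.
TOTAL_BASES = 14.8e9        # bp of raw sequence
N_READS = 27_271_853        # high-quality reads
CONSENSUS_SIZE = 2.91e9     # bp of consensus sequence

if __name__ == "__main__":
    print(f"mean read length: {mean_read_length(TOTAL_BASES, N_READS):.1f} bp")
    print(f"coverage vs consensus: {fold_coverage(TOTAL_BASES, CONSENSUS_SIZE):.2f}-fold")
```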

J. C. Venter et al., Science 291, 1304–1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• “Omics” studies involve the analysis of large numbers of parameters, typically proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics). The term “omics” derives from the Greek suffix “ome”, meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values of the measured parameters are examined in order to find a correlation with diseases. When the objects of study are proteins, genes and metabolites, the corresponding approaches are proteomics, genomics and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, research that attempts to measure thousands of parameters instead of just a few has become popular. This approach gave rise to modern “omics” experiments.

• The ultimate goal of “omics” approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various “players” (e.g., genes, RNA, proteins, metabolites) instead of studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology and improper use of the method.

• Lay Jr J, Liyanage R, Borgmann S, & Wilkins C (2006). Problems with the “omics”. TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
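The statistical problem named in the last bullet (many measurements, few samples) can be illustrated with a short sketch of the multiple-comparisons effect. It assumes independent tests of true null hypotheses at a per-test significance level `alpha`, which is a simplification because omics measurements are usually correlated. The function names and the numbers in the usage section are illustrative and do not come from the cited study.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Family-wise error rate for independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_tests, alpha = 10_000, 0.05  # hypothetical screen of 10,000 analytes
    print("expected false positives:", expected_false_positives(n_tests, alpha))
    print("P(at least one):", prob_at_least_one_false_positive(n_tests, alpha))
    print("Bonferroni per-test threshold:", bonferroni_threshold(n_tests, alpha))
```

Bonferroni is used here only because it is the simplest correction to state; it is conservative, and other procedures trade strictness for statistical power.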

OMICS | ARTICLE | URL
Genomics | Genomics buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163
Metabolomics | Metabolomic | httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM
Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee
Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229
Vaccinomics | Vaccinomics | httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf
Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222
Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223
Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226
Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230
Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231
Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232
Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233
Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656
Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040
Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793
Cellunomic | Cellunomics the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340
Phylogenomics | Phylogenomics evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out with funding from DGAPA UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

“It has not escaped our attention that the more we explore the human genome, the more remains for us to explore. We shall not cease from exploration. And at the end of all our exploring we will arrive where we started and know our place for the first time.”

Thomas Stearns Eliot, 2001


James Watson and Francis Crick

Nobel Prize, 1962

1990: ICT AND BIOLOGY

The knowledge society

• Refers to the spectacular increase and unprecedented acceleration in the rate at which information and knowledge are created, accumulated, distributed and used

– Knowledge is created, accumulated, disseminated and used because it guides decisions and makes it possible to intervene in the world in accordance with certain ends and values (Morales 2001)

• The model of the Knowledge Society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: both content and formats are constantly changing

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated and consulted

– Electronic journals

– Electronic procedure

– Open review

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Investigacioacuten bioloacutegica

Observacioacuten

Descripcioacuten

ClasificacioacutenExperimentacioacuten

Comparacioacuten

Investigacioacuten Cientiacutefica(Biologiacutea)

In vivoIn vitro (1900)In Silico (1990)

0

500

1000

1500

2000

2500

3000

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Escherichia (94873)

Drosophila (48989)

Saccharomyces (27549)

Arabidopsis (18094)

Caenorhabditis (5353)

Year

Docu

men

ts

0

50

100

150

200

250

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Zea (7636)

Neurospora (6640)

Dictyostelium (6191)

Chlamydomonas (5646)

Schizosaccharomyces (3183)

Danio (973)

Year

Docu

men

tos

Figure 21

0

5

10

15

20

25

30

35

40

Ge

ne

tics amp H

ere

dity

Bio

che

m amp

Mo

l Bio

Ce

ll Bio

De

velo

p B

io

Mu

ltidiscip

linary

Ne

uro

scien

ces

Bio

logy

Zoo

logy

Evolu

tion

ary Bio

Ecolo

gy

Ento

mo

logy

Toxico

logy

Ph

ysiolo

gy

Bio

ph

ysics

Bio

tech

amp M

icrob

io

Be

havio

ral Scien

Do

cum

en

ts

Subject Area

12581258 1231 1200 782 706

10907

8137

4310

2519 2482 2365 23401811

1465

6328

17900

(2)(1)

(20)

(3)

(15)

(18)

(6)

(3)(2)

Figure 22

0

100

200

300

400

500

600

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Genetics amp Heredity

Biochemistry amp Molecular Biology

Cell Biology

Developmental Biology

Neuroscience

Year

Docu

men

ts

Figure 23

0

10

20

30

40

50

60

70

80

90

100

19

05

19

10

19

15

19

20

19

25

19

30

19

35

19

40

19

45

19

50

19

55

19

60

19

65

19

70

19

75

19

80

19

85

19

90

19

95

20

00

20

05

Evolutionary Biology

Ecology

Zoology

Toxicology

Year

Docu

men

ts

Coleccioacuten de

secuencias de

genesE-ciencia

Ciberinfraestructura

Genbank

bull Es una coleccioacuten anotada de todas las secuencias de nucleoacutetidos a disposicioacuten del puacuteblico y su traduccioacuten de proteiacutenas

bull Centro Nacional de Informacioacuten Biotecnoloacutegica (NCBI)bull European Molecular Biology Laboratory (EMBL) de datos de

Bibliotecas del Instituto Europeo de Bioinformaacutetica (EBI)bull DNA Data Bank de Japoacuten (DDBJ)bull Reciben las secuencias producidas en laboratorios de todo el

mundo de maacutes de 100000 organismos distintosbull Crece a un ritmo exponencial duplicando cada 10 meses Suelte

134 producido en febrero de 2003 conteniacutea maacutes de 29300 millones de bases nucleotiacutedicas en maacutes de 230 millones de secuencias

bull Se construye mediante el enviacuteo directo de los distintos laboratorios y de los centros de secuenciacioacuten a gran escala

httpwwwncbinlmnihgovgenbankgenbankstatshtml

Olson M Hood L Cantor C Botstein D A common language for physical mapping of the human genome Science 1989 245(4925) 1434ndash1435 [PubMed]

0

100000

200000

300000

400000

500000

600000

700000

800000

1865

1870

1875

1880

1885

1890

1895

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Documentos en PubMed (NIH)

Cerca de 20 millones octubre 2010)

e-science cyberinfraestructure

bull e-science (europe)

bull United Kingdoms Office of Science and Technology in 1999

bull Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

bull cyberinfraestructure (USA)bull United States National Science

Foundation (NSF) blue-ribbon committee in 2003

bull Describes the new research environments that support advanced data acquisition data storage data management data integration data mining data visualization and other computing and information processing services over the Internet

Ciberinfraestructura

bullEntorno tecnoloacutegico-social que permite crear difundir y preservar los datos informacioacuten y conocimientos mediante la adquisicioacuten almacenamiento gestioacuten integracioacuten informaacutetica mineriacutea visualizacioacuten y otros servicios a traveacutes de Internet (NSF 2003 2007)

bullIncluye un conjunto interoperable de diversos elementos

ndash1) Infraestructura los sistemas computacionales (hardware software y redes) servicios instrumentos y herramientas

ndash2) Colecciones de datos

ndash3) Grupos virtuales de investigacioacuten (colaboratorios y observatorios)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from greater access to:

– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering both discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their work more creatively, efficiently, and collaboratively over long distances, and to disseminate their results with greater effect

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at httpwwwncessacukabout_eSSfaqq=General_1General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome

79 authors, 48 institutions, 181 references

All authors came from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome

276 authors, 14 institutions, 452 references

All authors came from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The Escherichia coli genome is sequenced

2003: The human genome sequence is completed

Source for all entries: httpwwwnaturecomnaturejournalv422n6934pdf timeline_01626pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: Number of publications by year, 1988-2009. Y-axis: Publications (0-3500); x-axis: Year]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology, Yale University, New Haven, Connecticut

[Figure: Number of publications by institution. Y-axis: Publications (0-600); x-axis: Institution]
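Yearly publication counts like those plotted above could be collected from PubMed through NCBI's E-utilities ESearch service. The sketch below only builds the request URLs, one per year, and sends nothing over the network. The search term `genomics` and the helper name `pubmed_count_url` are assumptions for illustration; the slides do not state which query produced the chart.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of NCBI E-utilities.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_count_url(term: str, year: int) -> str:
    """Build an ESearch URL that asks only for the number of PubMed
    records matching `term` with a publication date in `year`."""
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": term,          # hypothetical query; not given in the slides
        "rettype": "count",    # return the hit count, not the record IDs
        "datetype": "pdat",    # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
    }
    return f"{ESEARCH}?{urlencode(params)}"


# One URL per year on the chart's x-axis (1988-2009).
urls = {year: pubmed_count_url("genomics", year) for year in range(1988, 2010)}
```

Fetching each URL and reading the count from the response would give one data point per year. Counts obtained today would likely differ from those on the slide, since PubMed keeps being updated.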

Nature 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062

httpwwwnaturecomnaturejournalv409n6822fig_tab409860a0_F1html

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062 httpwwwnaturecomnaturejournalv409n6822images409860ac2jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351, DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
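The figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: it takes fold coverage as total sequenced bases divided by genome size, and the variable and function names are hypothetical, not taken from the paper.

```python
# Figures quoted in the abstract (Venter et al. 2001).
GENOME_BP = 2.91e9        # consensus euchromatic sequence, in base pairs
SEQUENCED_BP = 14.8e9     # total bases generated from the sequence reads
READS = 27_271_853        # number of high-quality sequence reads
BP_PER_DIFFERENCE = 1250  # one differing base per 1250 bp between two haploid genomes


def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average number of times each base of the genome was sequenced."""
    return total_bases / genome_size


coverage = fold_coverage(SEQUENCED_BP, GENOME_BP)     # about 5.1-fold
mean_read_length = SEQUENCED_BP / READS               # about 543 bp per read
expected_differences = GENOME_BP / BP_PER_DIFFERENCE  # about 2.3 million sites

print(f"coverage: {coverage:.2f}-fold")
print(f"mean read length: {mean_read_length:.0f} bp")
print(f"expected differences between two haploid genomes: {expected_differences:,.0f}")
```

The computed coverage comes out slightly below the reported 5.11-fold, presumably because the quoted inputs are rounded.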

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

httpgenomeucsceducgi-binhgTracksorg=human

PubMed record

httpwwwncbinlmnihgovSitemapsamplerecordhtml

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with diseases. When the objects of investigation are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters instead of only a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating various "players" (e.g., genes, RNA, proteins, metabolites) instead of studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
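One of the statistical problems mentioned above, many measurements tested on few samples, can be illustrated with a short calculation. The sketch below assumes independent tests in which no measured parameter has any real effect, which is a simplification; the function names are hypothetical. The Bonferroni correction is shown as one common remedy, not as the method used in the cited study.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of 'significant' results when no effect is real."""
    return n_tests * alpha


def family_wise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


for n in (1, 10, 1000):
    print(
        f"{n:>5} tests: "
        f"expected false positives = {expected_false_positives(n):.2f}, "
        f"P(at least one) = {family_wise_error_rate(n):.3f}, "
        f"Bonferroni threshold = {bonferroni_threshold(n):.2e}"
    )
```

With thousands of parameters measured per sample, some apparent associations are expected by chance alone, which is consistent with follow-up studies failing to confirm initial findings.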

OMICS: ARTICLE (URL)

Genomics: Genomics buzzword or reality (httpviewncbinlmnihgovpubmed10343163)

Metabolomics: Metabolomic (httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM)

Proteomics: Proteomics (httpdxdoiorg10109701mnh000007969189474ee)

Transcriptomics: Transcriptomics (httpviewncbinlmnihgovpubmed18336229)

Vaccinomics: Vaccinomics (httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf)

Oncogenomics: Oncogenomics (httpwwwncbinlmnihgovpubmed18336222)

Pharmacogenomics: Pharmacogenomics (httpwwwncbinlmnihgovpubmed18336223)

Epigenomics: Epigenomics (httpwwwncbinlmnihgovpubmed18336226)

Toxicogenomics: Toxicogenomics (httpwwwncbinlmnihgovpubmed18336230)

Kinomics: Kinomics (httpwwwncbinlmnihgovpubmed18336231)

Physiomics: Physiomics (httpwwwncbinlmnihgovpubmed18336232)

Cytomics: Cytomics (httpwwwncbinlmnihgovpubmed18336233)

Postgenomics: Tracking the shift to postgenomics (httpdxdoiorg101159000092656)

Glycomics: Probing glycomics (httpdxdoiorg101016jcbpa200611040)

Lipidomics: Lipidomics is emerging (httpviewncbinlmnihgovpubmed14643793)

Cellunomics: Cellunomics the interaction analysis of cells (httpviewncbinlmnihgovpubmed19887340)

Phylogenomics: Phylogenomics evolution and genomics intersection (httpviewncbinlmnihgovpubmed19778869)

httpbiiiogeekblogspotcom

httplaylamichanunamblogspotcom

• This project is carried out thanks to funding from DGAPA UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our attention that the more we explore the human genome, the more remains for us to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, 2001


1990: ICT AND BIOLOGY

The knowledge society

• Refers to the spectacular increase and unprecedented acceleration in the pace of creation, accumulation, distribution, and use of information and knowledge

– Knowledge is created, accumulated, disseminated, and used because it guides decisions and makes it possible to intervene in the world according to certain ends and values (Morales 2001)

• The model of the knowledge society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the Web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: it changes, with both content and formats under constant modification

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever lived are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated, and consulted

– Electronic journals

– Electronic procedures

– Open evaluation

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization

• Collaboration

• Multidisciplinarity

• International projects

• Grids

• New models

• New disciplines

– Bioinformatics

– Medical informatics

– Neuroinformatics

– Systems biology

• Data collections

– Biological

– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (Biology)

In vivo

In vitro (1900)

In silico (1990)

[Figure: Documents per year, 1900-2005, by genus: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Y-axis: Documents (0-3000); x-axis: Year]

[Figure: Documents per year, 1900-2005, by genus: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Y-axis: Documents (0-250); x-axis: Year]

Figure 21

[Figure: Documents by subject area. Y-axis: Documents (0-40); x-axis: Subject Area. Categories: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Scien. Data labels as extracted: 12581258 1231 1200 782 706, 10907, 8137, 4310, 2519 2482 2365 23401811, 1465, 6328, 17900; (2)(1), (20), (3), (15), (18), (6), (3)(2)]

Figure 22

[Figure: Documents per year, 1905-2005, by subject area: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience. Y-axis: Documents (0-600); x-axis: Year]

Figure 23

[Figure: Documents per year, 1905-2005, by subject area: Evolutionary Biology, Ecology, Zoology, Toxicology. Y-axis: Documents (0-100); x-axis: Year]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• These receive sequences produced in laboratories throughout the world, from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built through direct submissions from individual laboratories and from large-scale sequencing centers
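The growth claim above (doubling every 10 months, starting from about 29.3 billion bases in February 2003) can be turned into a small projection. This is a sketch that assumes the doubling time stays constant, which is an extrapolation and not a statement about GenBank's actual later size; the function names are hypothetical.

```python
import math

DOUBLING_MONTHS = 10     # doubling time quoted on the slide
BASES_FEB_2003 = 29.3e9  # nucleotide bases in the February 2003 release


def projected_bases(months_elapsed: float,
                    start: float = BASES_FEB_2003,
                    doubling: float = DOUBLING_MONTHS) -> float:
    """Size after `months_elapsed` months under constant exponential growth."""
    return start * 2 ** (months_elapsed / doubling)


def months_to_reach(target: float,
                    start: float = BASES_FEB_2003,
                    doubling: float = DOUBLING_MONTHS) -> float:
    """Months needed to grow from `start` to `target` at the same rate."""
    return doubling * math.log2(target / start)


# Growth factor over one year implied by a 10-month doubling time.
annual_factor = 2 ** (12 / DOUBLING_MONTHS)

print(f"annual growth factor: {annual_factor:.2f}x")
print(f"projected size after 20 months: {projected_bases(20):.3e} bases")
print(f"months to reach 1e12 bases: {months_to_reach(1e12):.1f}")
```

A 10-month doubling time corresponds to the collection a little more than doubling each year, which is why fixed storage and manual curation cannot keep pace with it.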

httpwwwncbinlmnihgovgenbankgenbankstatshtml

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science 1989; 245(4925): 1434-1435.

[Figure: Documents in PubMed (NIH) by year, 1865-2005. Y-axis: documents (0-800000); x-axis: year]

About 20 million documents (October 2010)

e-science cyberinfraestructure

bull e-science (europe)

bull United Kingdoms Office of Science and Technology in 1999

bull Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

bull cyberinfraestructure (USA)bull United States National Science

Foundation (NSF) blue-ribbon committee in 2003

bull Describes the new research environments that support advanced data acquisition data storage data management data integration data mining data visualization and other computing and information processing services over the Internet

Ciberinfraestructura

bullEntorno tecnoloacutegico-social que permite crear difundir y preservar los datos informacioacuten y conocimientos mediante la adquisicioacuten almacenamiento gestioacuten integracioacuten informaacutetica mineriacutea visualizacioacuten y otros servicios a traveacutes de Internet (NSF 2003 2007)

bullIncluye un conjunto interoperable de diversos elementos

ndash1) Infraestructura los sistemas computacionales (hardware software y redes) servicios instrumentos y herramientas

ndash2) Colecciones de datos

ndash3) Grupos virtuales de investigacioacuten (colaboratorios y observatorios)

e-investigacioacutenbull Actividades de investigacioacuten que utilizan una gama de capacidades avanzadas de las TIC y abarca

nuevas metodologiacuteas de investigacioacuten que salen de un mayor acceso a

Las comunicaciones de banda ancha de redes instrumentos de investigacioacuten y las instalaciones redes de sensores y repositorios de datos

Software y servicios de infraestructura que permitan garantizar la conectividad e interoperabilidad

Aplicacioacuten herramientas que abarcan la disciplina de instrumentos especiacuteficos y herramientas de interaccioacuten

avanzar y aumentar en lugar de reemplazar las tradicionales metodologiacuteas de investigacioacuten

bull permitiraacute a los investigadores para llevar a cabo su labor de investigacioacuten maacutes creativa eficiente y colaboracioacuten a larga distancia y difundir sus resultados de la investigacioacuten con un mayor efecto

bull Colaboracioacuten

Nuevos campos de investigacioacuten emergentes utilizando nuevas teacutecnicas de mineriacutea de datos y el anaacutelisis avanzados algoritmos computacionales y de redes de intercambio de recursos

e-Science bull Originally referred to experiments that connected together a few powerful

computers located at different sites and later a very large number of modest PCs across the world in order to undertake enormous calculations or process huge amounts of data The coordination of geographically dispersed computing and data resources has become known as the Grid This is shorthand for the emerging standards and technology ndash hardware and software ndash being developed to enable and simplify the sharing of resources The analogy is an electric power grid which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it

bull What is exciting about the Grid is that the combination of extensive connectivity massive computer power and vast quantities of digitized data ndash all three of which are still rapidly expanding ndash making possible new applications that are orders of magnitude more potent than even a few years ago

bull The term e-research is sometimes used instead of e-science with the advantage that gives more emphasis to the end result of better richer faster or new research results rather than the technologies used to get them

National Centre for e-Social Science 2008 Frequently Asked Questions Diponible en httpwwwncessacukabout_eSSfaqq=General_1General_1

E-ciencia (e-science)

bull Resulta del uso y aplicacioacuten de la Ciberinfraestructura en la praacutectica cientifica

bull Se caracteriza por la inter y multidisciplinariedad

bull Colaboracioacuten la participacioacuten de un gran nuacutemero de investigadores (en algunos casos cientos) localizados en diversas regiones y con diferentes especialidades que se forman grupos trabajo (Hey y Trefethen 2005 Barbera et al2009)

Uno de los primeros proyectos de e-ciencia fue el de el genoma humano se publicoacute en el 2001 en dos artiacuteculos con un diacutea de diferencia en las revistas Nature y Science

NatureInitial sequencing and analysis of the human genome79 Autores48 Instituciones

181 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientesDepartment of Cellular and Structural BiologyDepartment of Molecular GeneticsDepartment of Molecular Biology

Science The Sequence of the Human Genome276 Autores14 Instituciones452 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientes Department of Biology e Informaacutetica Meacutedica

E-science

1865 Gregor Mendel discovers the laws of genetics

1953 James Watson and Francis Crick describe the double-helix structure of DNA

1966 Marshall Nirenberg, Har Gobind Khorana and Robert Holley determine the genetic code

1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977 Frederick Sanger, Allan Maxam and Walter Gilbert develop DNA sequencing methods

1982 The GenBank database is established

1985 The polymerase chain reaction (PCR) is invented

1990 The Human Genome Project (HGP) begins in the United States

1997 The genome of Escherichia coli is sequenced

2003 The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Chart: number of publications per year, 1988-2009 (x-axis: year; y-axis: publications, 0-3500). Annotation: "The genomics of human homeobox-containing loci", written by C.A. Ferguson-Smith and F.H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut, appeared in Pathology and Immunopathology Research in 1988.]

[Chart: number of publications by institution (x-axis: institution; y-axis: publications, 0-600).]

Nature 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
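The coverage figure in the abstract above can be checked with simple arithmetic. The sketch below assumes the published values of 2.91 billion bp for the consensus sequence, 14.8 billion bp of generated sequence and 27,271,853 reads; the function names are illustrative, not taken from the paper. Fold coverage is the total sequenced bases divided by the genome size, so the result lands near the reported 5.11-fold, with the small difference plausibly due to rounding in the quoted totals.

```python
# Values as quoted in the abstract (Venter et al., 2001); decimal points assumed
# to follow the published abstract.
READS = 27_271_853        # high-quality sequence reads
TOTAL_BASES = 14.8e9      # bp of sequence generated
CONSENSUS = 2.91e9        # bp in the consensus sequence


def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average number of times each genome position was sequenced."""
    return total_bases / genome_size


def mean_read_length(total_bases: float, reads: int) -> float:
    """Average number of bases contributed by one read."""
    return total_bases / reads


coverage = fold_coverage(TOTAL_BASES, CONSENSUS)
read_len = mean_read_length(TOTAL_BASES, READS)
print(f"coverage: about {coverage:.2f}-fold")
print(f"mean read length: about {read_len:.0f} bp")
```

The same two ratios explain why shredding the public data into 550-bp segments is comparable to adding more reads of roughly the same length.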

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes and metabolites, the corresponding approaches are proteomics, genomics and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and as a whole by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. The enormous number of measurements and the limited number of samples give rise to problems of statistics, bias, methodology and inappropriate use of the method.

• Lay, J., Jr., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
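The statistical problem mentioned in the last bullet (many measurements, few samples) can be illustrated with a short calculation. This is a minimal sketch under the simplifying assumption that the tests are independent; the Bonferroni correction shown is one standard remedy chosen here for illustration, not a method named in the cited article.

```python
def familywise_error_rate(alpha: float, tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** tests


def bonferroni_threshold(alpha: float, tests: int) -> float:
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / tests


# Compare a single measurement with panels of 20 and 1000 measured parameters.
for m in (1, 20, 1000):
    fwer = familywise_error_rate(0.05, m)
    print(f"{m:>5} tests: P(any false positive) = {fwer:.3f}, "
          f"Bonferroni threshold = {bonferroni_threshold(0.05, m):.6f}")
```

With thousands of parameters tested at the usual 5% level, a spurious "hit" is close to certain, which is consistent with the high rate of unconfirmed findings described above.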

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality" (http://view.ncbi.nlm.nih.gov/pubmed/10343163)

Metabolomics: "Metabolomic" (httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM)

Proteomics: "Proteomics" (http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee)

Transcriptomics: "Transcriptomics" (http://view.ncbi.nlm.nih.gov/pubmed/18336229)

Vaccinomics: "Vaccinomics" (httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf)

Oncogenomics: "Oncogenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336222)

Pharmacogenomics: "Pharmacogenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336223)

Epigenomics: "Epigenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336226)

Toxicogenomics: "Toxicogenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336230)

Kinomics: "Kinomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336231)

Physiomics: "Physiomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336232)

Cytomics: "Cytomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336233)

Postgenomics: "Tracking the shift to postgenomics" (http://dx.doi.org/10.1159/000092656)

Glycomics: "Probing glycomics" (http://dx.doi.org/10.1016/j.cbpa.2006.11.040)

Lipidomics: "Lipidomics is emerging" (http://view.ncbi.nlm.nih.gov/pubmed/14643793)

Cellunomics: "Cellunomics: the interaction analysis of cells" (http://view.ncbi.nlm.nih.gov/pubmed/19887340)

Phylogenomics: "Phylogenomics: evolution and genomics intersection" (http://view.ncbi.nlm.nih.gov/pubmed/19778869)

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, quoted in Nature (2001)


The knowledge society

• Refers to the spectacular increase in, and unprecedented acceleration of, the pace at which information and knowledge are created, accumulated, distributed and used

- Knowledge is created, accumulated, disseminated and used because it guides decisions and makes it possible to intervene in the world in accordance with certain ends and values (Morales 2001)

• The model of the knowledge society is under construction, as is society itself (Olivé 2005)

(R)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: both content and formats change constantly

• Massive

• Meta-analysis applications

(R)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated and consulted

- Electronic journals

- Electronic processing

- Open evaluation

- E-prints, preprints and postprints

- Immediacy

• New scientific disciplines have emerged
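The doubling claim and the "80% to 90% alive today" claim in the list above can be connected with a simple growth model. The sketch below is an illustration under stated assumptions, not the source's own calculation: it assumes the cumulative number of scientists grows exponentially with a fixed doubling time, and that every scientist remains alive for the same number of years after entering science. Under that model, the fraction still alive is 1 - 2^(-L/T), where T is the doubling time and L is the assumed span in years.

```python
import math


def fraction_still_alive(doubling_years: float, span_years: float) -> float:
    """Share of all scientists ever counted who entered within the last span_years."""
    return 1.0 - 2.0 ** (-span_years / doubling_years)


def span_needed(doubling_years: float, fraction: float) -> float:
    """Span (years) for which the model yields the given living fraction."""
    return -doubling_years * math.log2(1.0 - fraction)


# Doubling time of twenty years, as stated in the slide; spans are hypothetical.
for span in (45, 60):
    print(f"span {span} y: {fraction_still_alive(20, span):.1%} alive")
for target in (0.80, 0.90):
    print(f"{target:.0%} alive requires a span of about {span_needed(20, target):.0f} y")
```

Under these assumptions, a twenty-year doubling time reproduces the 80% to 90% range for spans of roughly 46 to 66 years, so the two bullets are at least mutually consistent.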

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines

- Bioinformatics
- Medical informatics
- Neuroinformatics
- Systems biology

• Data collections
- Biological
- Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (biology)

In vivo
In vitro (1900)
In silico (1990)

[Chart: documents per year, 1900-2005 (x-axis: year; y-axis: documents, 0-3000). Legend: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353).]

[Chart: documents per year, 1900-2005 (x-axis: year; y-axis: documents, 0-250). Legend: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973).]

Figure 21

[Chart: documents by subject area (x-axis: subject area; y-axis: documents). Subject areas: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Scien. Bar labels: 1258, 1258, 1231, 1200, 782, 706, 10907, 8137, 4310, 2519, 2482, 2365, 2340, 1811, 1465, 6328, 17900.]

Figure 22

[Chart: documents per year, 1905-2005 (x-axis: year; y-axis: documents, 0-600). Legend: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience.]

Figure 23

[Chart: documents per year, 1905-2005 (x-axis: year; y-axis: documents, 0-100). Legend: Evolutionary Biology, Ecology, Zoology, Toxicology.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)
• European Molecular Biology Laboratory (EMBL) Data Library at the European Bioinformatics Institute (EBI)
• DNA Data Bank of Japan (DDBJ)
• They receive sequences produced in laboratories throughout the world, from more than 100,000 distinct organisms
• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science 1989; 245(4925): 1434-1435.
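The statement that GenBank doubles every 10 months can be turned into a few worked numbers. The sketch below assumes ideal exponential growth with a constant doubling time; the function names are hypothetical, and any projection from the February 2003 figure of 29.3 billion bases is an extrapolation, not a reported statistic.

```python
import math


def growth_factor(months: float, doubling_months: float = 10.0) -> float:
    """Multiplicative growth after the given number of months."""
    return 2.0 ** (months / doubling_months)


def months_to_multiply(factor: float, doubling_months: float = 10.0) -> float:
    """Months needed for the collection to grow by the given factor."""
    return doubling_months * math.log2(factor)


def doubling_time(n_start: float, n_end: float, months: float) -> float:
    """Doubling time implied by two size measurements taken `months` apart."""
    return months * math.log(2.0) / math.log(n_end / n_start)


BASES_FEB_2003 = 29.3e9  # nucleotide bases in Release 134, as quoted above

print(f"growth per year: x{growth_factor(12):.2f}")
print(f"tenfold growth takes about {months_to_multiply(10):.0f} months")
print(f"extrapolated bases one year later: {BASES_FEB_2003 * growth_factor(12):.3g}")
```

The last helper works in the other direction: given two release sizes and the time between them, it estimates the doubling time that the slide quotes as 10 months.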

[Chart: documents in PubMed (NIH), 1865-2005 (x-axis: year; y-axis: documents, 0-800,000). About 20 million documents (October 2010).]
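A yearly count series like the one charted above could be rebuilt from PubMed with the NCBI E-utilities ESearch service. The sketch below only constructs the request URLs and does not contact the server; the search term "genomics" and the year range are illustrative assumptions, and the original chart may have been produced differently.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of the NCBI E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_count_url(term: str, year: int) -> str:
    """URL asking ESearch for the number of PubMed records published in one year."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",    # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "rettype": "count",    # return only the record count
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# One request per year; the term is a hypothetical example.
urls = {year: pubmed_count_url("genomics", year) for year in range(1988, 2010)}
print(urls[1988])
```

Fetching each URL returns a small XML document containing a count element; spacing out the requests keeps usage within NCBI's published rate guidance.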

e-science and cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate and preserve data, information and knowledge through acquisition, storage, management, integration, computing, mining, visualization and other services over the Internet (NSF 2003, 2007)

• Includes an interoperable set of diverse elements:

- 1) Infrastructure: computing systems (hardware, software and networks), services, instruments and tools

- 2) Data collections

- 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from greater access to:

- Broadband communication networks, research instruments and facilities, sensor networks and data repositories

- Software and infrastructure services that ensure connectivity and interoperability

- Application tools, encompassing discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their research more creatively, efficiently and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

• New research fields are emerging that use new techniques for data mining and analysis, advanced computational algorithms, and resource-sharing networks

e-Science bull Originally referred to experiments that connected together a few powerful

computers located at different sites and later a very large number of modest PCs across the world in order to undertake enormous calculations or process huge amounts of data The coordination of geographically dispersed computing and data resources has become known as the Grid This is shorthand for the emerging standards and technology ndash hardware and software ndash being developed to enable and simplify the sharing of resources The analogy is an electric power grid which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it

bull What is exciting about the Grid is that the combination of extensive connectivity massive computer power and vast quantities of digitized data ndash all three of which are still rapidly expanding ndash making possible new applications that are orders of magnitude more potent than even a few years ago

bull The term e-research is sometimes used instead of e-science with the advantage that gives more emphasis to the end result of better richer faster or new research results rather than the technologies used to get them

National Centre for e-Social Science 2008 Frequently Asked Questions Diponible en httpwwwncessacukabout_eSSfaqq=General_1General_1

E-ciencia (e-science)

bull Resulta del uso y aplicacioacuten de la Ciberinfraestructura en la praacutectica cientifica

bull Se caracteriza por la inter y multidisciplinariedad

bull Colaboracioacuten la participacioacuten de un gran nuacutemero de investigadores (en algunos casos cientos) localizados en diversas regiones y con diferentes especialidades que se forman grupos trabajo (Hey y Trefethen 2005 Barbera et al2009)

Uno de los primeros proyectos de e-ciencia fue el de el genoma humano se publicoacute en el 2001 en dos artiacuteculos con un diacutea de diferencia en las revistas Nature y Science

NatureInitial sequencing and analysis of the human genome79 Autores48 Instituciones

181 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientesDepartment of Cellular and Structural BiologyDepartment of Molecular GeneticsDepartment of Molecular Biology

Science The Sequence of the Human Genome276 Autores14 Instituciones452 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientes Department of Biology e Informaacutetica Meacutedica

E-ciencia

1865 Gregor Mendel descubre las leyes de la Geneacutetica httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1953 James Watson y Francis Crick describen la estructura de la doble-heacutelice del ADN httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1966 Marshall Nirenberg Har Gobind Khorana y Robert Holley determinan el coacutedigo

geneacutetico

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1972 Stanley Cohen and Herbert Boyer desarrollan la tecnologiacutea del ADN recombinante httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1977 Frederick Sanger Allan Maxam y Walter Gilbert desarrollan meacutetodos de

secuenciacioacuten del ADN

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1982 La base de datos bdquoGenBank‟ es establecida httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1985 Es inventada la Reaccioacuten en Cadena de la Polimerasa (PCR) httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1990 El Proyecto Genoma Humano (HGP) inicia en Estados Unidos httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1997 El genoma de Escherichia coli es secuenciado httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

2003 La secuencia del genoma humano es finalizada httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

La Genoacutemicabull Es el estudio sistemaacutetico de la documentacioacuten completa de

las secuencias del ADN de los organismos (MeSH 2010) dicho estudio ha ayudado a comprender polimorfismos dentro de las especies la interaccioacuten de las proteiacutenas y la evolucioacuten (Brent 2000)

bull Incluye todos los meacutetodos que recopilan y analizan datos completos acerca de los genes incluida las secuencias la abundancia de los aacutecidos nucleicos y las propiedades de las proteiacutenas que codifican (Murray 2000)

bull Nuestra capacidad para estudiar la funcioacuten geacutenica estaacute aumentando en la especificidad gracias a esta nueva disciplina (Collins et aacutel 2003) que se compromete a acelerar el descubrimiento cientiacutefico en todos los aacutembitos de la ciencia bioloacutegica (Burley 2000)

0

500

1000

1500

2000

2500

3000

3500

1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009

Pu

blic

acio

ne

s

Antildeo

Nuacutemero de Publicaciones

Pathology and

immunopathology

research en 1988 The

genomics of human

homeobox-containing

loci‟ escrito por CA

Ferguson-Smith y FH

Ruddle del

Departamento de

Biologiacutea en la

Universidad de Yale

New Haven

Connecticut

0

100

200

300

400

500

600

Pu

blic

acio

ne

s

Institucioacuten

Nuacutemero de Publicaciones

Nature 2001 Feb 15409(6822)860-921Initial sequencing and analysis of the human genome

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses.

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research.

• Science 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J Craig Venter1, Mark D Adams1, Eugene W Myers1, Peter W Li1, Richard J Mural1, Granger G Sutton1, Hamilton O Smith1, Mark Yandell1, Cheryl A Evans1, Robert A Holt1, Jeannine D Gocayne1, Peter Amanatides1, Richard M Ballew1, Daniel H Huson1, Jennifer Russo Wortman1, Qing Zhang1, Chinnappa D Kodira1, Xiangqun H Zheng1, Lin Chen1, Marian Skupski1, Gangadharan Subramanian1, Paul D Thomas1, Jinghui Zhang1, George L Gabor Miklos2, Catherine Nelson3, Samuel Broder1, Andrew G Clark4, Joe Nadeau5, Victor A McKusick6, Norton Zinder7, Arnold J Levine7, Richard J Roberts8, Mel Simon9, Carolyn Slayman10, Michael Hunkapiller11, Randall Bolanos1, Arthur Delcher1, Ian Dew1, Daniel Fasulo1, Michael Flanigan1, Liliana Florea1, Aaron Halpern1, Sridhar Hannenhalli1, Saul Kravitz1, Samuel Levy1, Clark Mobarry1, Knut Reinert1, Karin Remington1, Jane Abu-Threideh1, Ellen Beasley1, Kendra Biddick1, Vivien Bonazzi1, Rhonda Brandon1, Michele Cargill1, Ishwar Chandramouliswaran1, Rosane Charlab1, Kabir Chaturvedi1, Zuoming Deng1, Valentina Di Francesco1, Patrick Dunn1, Karen Eilbeck1, Carlos Evangelista1, Andrei E Gabrielian1, Weiniu Gan1, Wangmao Ge1, Fangcheng Gong1, Zhiping Gu1, Ping Guan1, Thomas J Heiman1, Maureen E Higgins1, Rui-Ru Ji1, Zhaoxi Ke1, Karen A Ketchum1, Zhongwu Lai1, Yiding Lei1, Zhenya Li1, Jiayin Li1, Yong Liang1, Xiaoying Lin1, Fu Lu1, Gennady V Merkulov1, Natalia Milshina1, Helen M Moore1, Ashwinikumar K Naik1, Vaibhav A Narayan1, Beena Neelam1, Deborah Nusskern1, Douglas B Rusch1, Steven Salzberg12, Wei Shao1, Bixiong Shue1, Jingtao Sun1, Zhen Yuan Wang1, Aihui Wang1, Xin Wang1, Jian Wang1, Ming-Hui Wei1, Ron Wides13, Chunlin Xiao1, Chunhua Yan1, Alison Yao1, Jane Ye1, Ming Zhan1, Weiqing Zhang1, Hongyu Zhang1, Qi Zhao1, Liansheng Zheng1, Fei Zhong1, Wenyan Zhong1, Shiaoping C Zhu1, Shaying Zhao12, Dennis Gilbert1, Suzanna Baumhueter1, Gene Spier1, Christine Carter1, Anibal Cravchik1, Trevor Woodage1, Feroze Ali1, Huijin An1, Aderonke Awe1, Danita Baldwin1, Holly Baden1, Mary Barnstead1, Ian Barrow1, Karen Beeson1, Dana Busam1, Amy Carver1, Angela Center1, Ming Lai Cheng1, Liz Curry1, Steve Danaher1, Lionel Davenport1, Raymond Desilets1, Susanne Dietz1, Kristina Dodson1, Lisa Doup1, Steven Ferriera1, Neha Garg1, Andres Gluecksmann1, Brit Hart1, Jason Haynes1, Charles Haynes1, Cheryl Heiner1, Suzanne Hladun1, Damon Hostin1, Jarrett Houck1, Timothy Howland1, Chinyere Ibegwam1, Jeffery Johnson1, Francis Kalush1, Lesley Kline1, Shashi Koduru1, Amy Love1, Felecia Mann1, David May1, Steven McCawley1, Tina McIntosh1, Ivy McMullen1, Mee Moy1, Linda Moy1, Brian Murphy1, Keith Nelson1, Cynthia Pfannkoch1, Eric Pratts1, Vinita Puri1, Hina Qureshi1, Matthew Reardon1, Robert Rodriguez1, Yu-Hui Rogers1, Deanna Romblad1, Bob Ruhfel1, Richard Scott1, Cynthia Sitter1, Michelle Smallwood1, Erin Stewart1, Renee Strong1, Ellen Suh1, Reginald Thomas1, Ni Ni Tint1, Sukyee Tse1, Claire Vech1, Gary Wang1, Jeremy Wetter1, Sherita Williams1, Monica Williams1, Sandra Windsor1, Emily Winn-Deen1, Keriellen Wolfe1, Jayshree Zaveri1, Karena Zaveri1, Josep F Abril14, Roderic Guigó14, Michael J Campbell1, Kimmen V Sjolander1, Brian Karlak1, Anish Kejariwal1, Huaiyu Mi1, Betty Lazareva1, Thomas Hatton1, Apurva Narechania1, Karen Diemer1, Anushya Muruganujan1, Nan Guo1, Shinji Sato1, Vineet Bafna1, Sorin Istrail1, Ross Lippert1, Russell Schwartz1, Brian Walenz1, Shibu Yooseph1, David Allen1, Anand Basu1, James Baxendale1, Louis Blick1, Marcelo Caminha1, John Carnes-Stine1, Parris Caulk1, Yen-Hui Chiang1, My Coyne1, Carl Dahlke1, Anne Deslattes Mays1, Maria Dombroski1, Michael Donnelly1, Dale Ely1, Shiva Esparham1, Carl Fosler1, Harold Gire1, Stephen Glanowski1, Kenneth Glasser1, Anna Glodek1, Mark Gorokhov1, Ken Graham1, Barry Gropman1, Michael Harris1, Jeremy Heil1, Scott Henderson1, Jeffrey Hoover1, Donald Jennings1, Catherine Jordan1, James Jordan1, John Kasha1, Leonid Kagan1, Cheryl Kraft1, Alexander Levitsky1, Mark Lewis1, Xiangjun Liu1, John Lopez1, Daniel Ma1, William Majoros1, Joe McDaniel1, Sean Murphy1, Matthew Newman1, Trung Nguyen1, Ngoc Nguyen1, Marc Nodell1, Sue Pan1, Jim Peck1, Marshall Peterson1, William Rowe1, Robert Sanders1, John Scott1, Michael Simpson1, Thomas Smith1, Arlan Sprague1, Timothy Stockwell1, Russell Turner1, Eli Venter1, Mei Wang1, Meiyuan Wen1, David Wu1, Mitchell Wu1, Ashley Xia1, Ali Zandieh1, Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
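The coverage figures in the Venter et al. abstract can be checked with simple arithmetic. The sketch below is only an illustration: the constants are the values published in that abstract (2.91 billion bp consensus, 14.8 billion bp of reads, 27,271,853 reads, 5.11-fold and 2.9-fold coverage, 1 difference per 1250 bp), while the function and variable names are hypothetical. Dividing sequenced bases by the consensus length gives a value close to, but not exactly, the reported 5.11-fold; the abstract does not say which genome size was used for that figure.

```python
# Figures quoted from the Venter et al. (2001) abstract.
READS = 27_271_853              # high-quality sequence reads
SEQUENCED_BP = 14.8e9           # total bp contained in those reads
CONSENSUS_BP = 2.91e9           # length of the euchromatic consensus
CELERA_COVERAGE = 5.11          # fold coverage reported for the Celera reads
SHREDDED_PUBLIC_COVERAGE = 2.9  # fold coverage from the shredded public data
BP_PER_DIFFERENCE = 1250        # one differing bp per 1250 bp on average


def fold_coverage(sequenced_bp: float, genome_bp: float) -> float:
    """Average number of times each genome position was sequenced."""
    return sequenced_bp / genome_bp


# Mean read length implied by the totals (hypothetical helper values).
mean_read_length = SEQUENCED_BP / READS
# Coverage if the consensus length is taken as the genome size (an assumption).
naive_coverage = fold_coverage(SEQUENCED_BP, CONSENSUS_BP)
# Celera reads plus shredded public data: the "eightfold" effective coverage.
combined_coverage = CELERA_COVERAGE + SHREDDED_PUBLIC_COVERAGE
# Expected differences between two haploid genomes over the consensus length.
expected_differences = CONSENSUS_BP / BP_PER_DIFFERENCE

if __name__ == "__main__":
    print(f"mean read length: {mean_read_length:.1f} bp")
    print(f"naive coverage: {naive_coverage:.2f}x")
    print(f"combined coverage: {combined_coverage:.2f}x")
    print(f"expected differences: {expected_differences:,.0f}")
```

The implied mean read length is a little over 540 bp, which is consistent with the 550-bp segments into which the public data were shredded.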

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with diseases. When the objects of study are proteins, genes and metabolites, the corresponding approaches are proteomics, genomics and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and as an integrated whole by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent studies that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
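The statistical problem mentioned above, many measurements against few samples, can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: every test is independent, none of the measured parameters is truly associated with the disease, and the numbers (10,000 parameters, significance level 0.05) are illustrative values, not taken from the cited study. The function names are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results when no real effect exists."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Family-wise error rate for independent tests, all under the null."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate near alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_tests, alpha = 10_000, 0.05  # illustrative values only
    print(expected_false_positives(n_tests, alpha))
    print(prob_at_least_one_false_positive(n_tests, alpha))
    print(bonferroni_threshold(n_tests, alpha))
```

Under these assumptions, roughly 500 of the 10,000 parameters would appear associated with the disease by chance alone. That is one way initial findings can fail to be confirmed later. The Bonferroni correction shown here is one common remedy, used as an example rather than as the method discussed by the cited authors.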

OMICS | ARTICLE | URL

Genomics | Genomics buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656

Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | Cellunomics the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | Phylogenomics evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we learn about the human genome, the more there is to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot, as quoted in the closing lines of the 2001 Nature human genome paper

Page 15: Genomica colombia

(Re)evolution of information (1990)

• Design of computers

• Emergence of the Internet

• Mass adoption of the web

• Digital format: low cost, little space

• Information explosion

• Large number of data collections

• Dynamic: both content and formats change and are constantly being modified

• Massive

• Meta-analysis applications

(Re)evolution of (scientific) information

• Adoption of the electronic format

• Accelerated production of a wide variety of programs, applications, tools, utilities, resources and electronic services for scientific practice, many of them available over the Internet

• Use of the Web as a means of communication

• The number of researchers and of their publications doubles approximately every twenty years

• Between 80% and 90% of all the scientists who have ever existed are alive today

• More than two million articles are published each year

• One million patents are granted

• Changes in the way information is produced, evaluated and consulted

– Electronic journals

– Electronic processing

– Open evaluation

– E-prints: preprints and postprints

– Immediacy

• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines

– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology

• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (Biology)

In vivo
In vitro (1900)
In silico (1990)

[Chart: documents per year, 1900-2005. Series: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year; Documents.]

[Chart: documents per year, 1900-2005. Series: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year; Documents.]

Figure 21

[Bar chart: documents by subject area. Categories: Genetics & Heredity; Biochem & Mol Bio; Cell Bio; Develop Bio; Multidisciplinary; Neurosciences; Biology; Zoology; Evolutionary Bio; Ecology; Entomology; Toxicology; Physiology; Biophysics; Biotech & Microbio; Behavioral Scien. Axes: Subject Area; Documents.]

Figure 22

[Chart: documents per year, 1905-2005. Series: Genetics & Heredity; Biochemistry & Molecular Biology; Cell Biology; Developmental Biology; Neuroscience. Axes: Year; Documents.]

Figure 23

[Chart: documents per year, 1905-2005. Series: Evolutionary Biology; Ecology; Zoology; Toxicology. Axes: Year; Documents.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive sequences produced in laboratories around the world, from more than 100,000 distinct organisms

• It grows exponentially, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
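A doubling time translates directly into an exponential growth formula: after t months, a collection that doubles every d months has grown by a factor of 2^(t/d). The sketch below applies this to the GenBank figures on the slide (about 29.3 billion bases in February 2003, doubling every 10 months). The function names are hypothetical, and the projections are only what the stated doubling time implies, not actual later GenBank statistics.

```python
def projected_size(initial: float, months_elapsed: float,
                   doubling_months: float = 10.0) -> float:
    """Size after `months_elapsed` months of steady exponential growth."""
    return initial * 2.0 ** (months_elapsed / doubling_months)


def annual_growth_factor(doubling_months: float = 10.0) -> float:
    """Growth factor over 12 months implied by the doubling time."""
    return 2.0 ** (12.0 / doubling_months)


if __name__ == "__main__":
    bases_feb_2003 = 29.3e9  # Release 134, as stated on the slide
    for months in (10, 20, 30):
        size = projected_size(bases_feb_2003, months)
        print(f"after {months} months: {size / 1e9:.1f} billion bases")
    print(f"annual growth factor: {annual_growth_factor():.2f}")
```

A 10-month doubling time corresponds to growth by a factor of about 2.3 per year. The same function, with `doubling_months=240`, describes the twenty-year doubling of researchers and publications mentioned earlier in the deck.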

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435.

[Chart: documents in PubMed (NIH) per year, 1865-2005. About 20 million (October 2010).]
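Counts of PubMed documents per year, like those plotted in this deck, can be obtained programmatically. The sketch below is one possible approach, not the method used by the author, which the slides do not describe. It builds a query URL for the NCBI E-utilities ESearch service and parses the `<Count>` element of its XML response. The helper names are hypothetical, and the block makes no network request itself; the URL could be fetched with `urllib.request.urlopen`.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(term: str, year: int) -> str:
    """URL asking PubMed how many records match `term` in a given year."""
    params = {
        "db": "pubmed",
        "term": f"{term} AND {year}[pdat]",  # [pdat] = publication date field
        "rettype": "count",                  # return only the record count
    }
    return f"{ESEARCH}?{urlencode(params)}"


def parse_count(xml_text: str) -> int:
    """Extract the integer inside <Count> from an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("Count"))


if __name__ == "__main__":
    print(build_count_url("Escherichia", 2001))
    # Canned response with the same shape as a count-only ESearch reply.
    sample = "<eSearchResult><Count>42</Count></eSearchResult>"
    print(parse_count(sample))
```

Looping `build_count_url` over a range of years and a list of genus names would yield one series per organism, comparable in form to the per-genus charts above.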

e-science, cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology in 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee in 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate and preserve data, information and knowledge through acquisition, storage, management, integration, computing, mining, visualization and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software and networks), services, instruments and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-Research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communications networks, research instruments and facilities, sensor networks and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools that encompass discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will enable researchers to carry out their research more creatively, efficiently and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• It is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project, published in 2001 in two articles, one day apart, in the journals Nature and Science

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors come from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors come from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010); this study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Chart: Number of publications per year, 1988-2009. Axes: Year; Publications.]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Chart: Number of publications by institution. Axes: Institution; Publications.]

Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
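
As a rough check on the figures quoted in the abstract, the sketch below recomputes the implied mean read length, the fold coverage, and the expected number of differences between two haploid genomes. The function and constant names are illustrative, not from the paper. Dividing by the 2.91-billion bp consensus length is an assumption: the paper's 5.11-fold figure presumably rests on a slightly different genome size estimate, so the coverage computed here matches it only approximately.

```python
# Figures as quoted in the abstract of Venter et al. (2001).
CONSENSUS_BP = 2.91e9      # euchromatic consensus sequence, in bp
SEQUENCED_BP = 14.8e9      # total bp in high-quality reads
NUM_READS = 27_271_853     # number of high-quality reads
SNP_SPACING_BP = 1250      # one difference per this many bp, on average


def mean_read_length(total_bp: float, num_reads: int) -> float:
    """Average number of bases per read."""
    return total_bp / num_reads


def fold_coverage(total_bp: float, genome_bp: float) -> float:
    """How many times, on average, each genome position was sequenced.

    Assumption: the consensus length stands in for the genome size.
    """
    return total_bp / genome_bp


def expected_differences(genome_bp: float, spacing_bp: float) -> float:
    """Expected number of differing positions between two haploid genomes."""
    return genome_bp / spacing_bp


if __name__ == "__main__":
    print(f"mean read length: {mean_read_length(SEQUENCED_BP, NUM_READS):.0f} bp")
    print(f"fold coverage: {fold_coverage(SEQUENCED_BP, CONSENSUS_BP):.2f}")
    print(f"expected differences: {expected_differences(CONSENSUS_BP, SNP_SPACING_BP):,.0f}")
```

Under these assumptions the reads average roughly 543 bp, the coverage comes out near 5.1-fold, and two haploid genomes would be expected to differ at about 2.3 million positions, the same order of magnitude as the 2.1 million SNP locations reported.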

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters instead of only a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), in which 370 studies were compared, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
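
One statistical issue behind the reproducibility problem described above is multiple testing: when thousands of parameters are measured, some will look significant by chance alone. The sketch below illustrates this under simplifying assumptions that do not come from the cited study: the tests are treated as independent, and the significance level of 0.05 and the function names are chosen only for illustration.

```python
def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of chance 'hits' when no measured parameter has a real effect."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise error rate near alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (10, 1_000, 10_000):
        print(
            m,
            expected_false_positives(m),
            prob_at_least_one_false_positive(m),
            bonferroni_threshold(m),
        )
```

With 10,000 parameters tested at 0.05, about 500 spurious associations would be expected even if nothing were truly associated with the disease. The Bonferroni correction is only one of several possible remedies and is shown here because it is the simplest to state.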

OMICS | ARTICLE

Genomics | Genomics: buzzword or reality
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | Metabolomic
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics | Proteomics
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | Transcriptomics
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | Vaccinomics
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics | Oncogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | Pharmacogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | Epigenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | Toxicogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | Kinomics
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | Physiomics
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | Cytomics
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | Tracking the shift to postgenomics
http://dx.doi.org/10.1159/000092656

Glycomics | Probing glycomics
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | Lipidomics is emerging
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | Cellunomics: the interaction analysis of cells
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | Phylogenomics: evolution and genomics intersection
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya
• Lourdes Valencia
• Fernando Galicia
• Roberto Calderón
• Lyssania Macías

It has not escaped our attention that the more we explore the human genome, the more there remains to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, 2001


(Re)evolution of (scientific) information

• Adoption of the electronic format
• Accelerated production of a wide variety of programs, applications, tools, utilities, resources, and electronic services for scientific practice, many of them available over the Internet
• Use of the Web as a communication medium
• The number of researchers and of their publications doubles approximately every twenty years
• Between 80% and 90% of all the scientists who have ever lived are alive today
• More than two million articles are published each year
• One million patents are granted
• Changes in the way information is produced, evaluated, and consulted
- Electronic journals
- Electronic procedures
- Open review
- E-prints: preprints and postprints
- Immediacy
• New scientific disciplines have emerged

The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
- Bioinformatics
- Medical informatics
- Neuroinformatics
- Systems biology
• Data collections
- Biological
- Bibliographic

Biological research

Observation
Description
Classification
Experimentation
Comparison

Scientific research (Biology)

In vivo
In vitro (1900)
In silico (1990)

[Chart: documents per year, 1900-2005, by genus: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year, Documents.]

[Chart: documents per year, 1900-2005, by genus: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year, Documents.]

Figure 21

[Chart: documents by subject area: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Scien. Axes: Subject Area, Documents.]

Figure 22

[Chart: documents per year, 1905-2005, by subject area: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience. Axes: Year, Documents.]

Figure 23

[Chart: documents per year, 1905-2005, by subject area: Evolutionary Biology, Ecology, Zoology, Toxicology. Axes: Year, Documents.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations
• National Center for Biotechnology Information (NCBI)
• European Molecular Biology Laboratory (EMBL) Data Library, at the European Bioinformatics Institute (EBI)
• DNA Data Bank of Japan (DDBJ)
• These receive sequences produced in laboratories around the world, from more than 100,000 distinct organisms
• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29,300 million nucleotide bases in more than 23.0 million sequences
• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989; 245(4925): 1434-1435. [PubMed]
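
The growth claim above (doubling every 10 months) can be turned into a small projection. The sketch below assumes the doubling time stays constant, which the slide states but which real database growth need not follow. The function names are illustrative, and the starting value is the Release 134 figure quoted above.

```python
RELEASE_134_BASES = 29.3e9   # nucleotide bases in Release 134 (February 2003), as quoted above


def projected_size(initial: float, months: float, doubling_months: float = 10.0) -> float:
    """Size after `months`, assuming a constant doubling time (an idealization)."""
    return initial * 2 ** (months / doubling_months)


def annual_growth_factor(doubling_months: float = 10.0) -> float:
    """Multiplicative growth over 12 months implied by the doubling time."""
    return 2 ** (12 / doubling_months)


if __name__ == "__main__":
    for months in (0, 10, 12, 24):
        size = projected_size(RELEASE_134_BASES, months)
        print(f"{months:>2} months after Release 134: {size / 1e9:.1f} billion bases")
    print(f"implied growth per year: x{annual_growth_factor():.2f}")
```

A 10-month doubling time corresponds to growth by a factor of about 2.3 per year. The same function with `doubling_months=240` models a quantity that doubles every twenty years.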

[Chart: documents in PubMed (NIH) per year, 1865-2005; vertical axis from 0 to 800,000.]

Documents in PubMed (NIH)

About 20 million (October 2010)

e-science, cyberinfrastructure

• e-science (Europe)
• United Kingdom's Office of Science and Technology, in 1999
• Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet
• cyberinfrastructure (USA)
• United States National Science Foundation (NSF) blue-ribbon committee, in 2003
• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)
• It includes an interoperable set of diverse elements:
- 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools
- 2) Data collections
- 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from greater access to:
- Broadband communication networks, research instruments and facilities, sensor networks, and data repositories
- Software and infrastructure services that ensure connectivity and interoperability
- Application tools, comprising discipline-specific tools and interaction tools
• It advances and augments, rather than replaces, traditional research methodologies
• It will allow researchers to carry out their research more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect
• Collaboration
• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data, all three of which are still rapidly expanding, is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice
• Characterized by inter- and multidisciplinarity
• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. It was published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors come from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors come from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics
1953: James Watson and Francis Crick describe the double-helix structure of DNA
1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code
1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology
1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods
1982: The GenBank database is established
1985: The polymerase chain reaction (PCR) is invented
1990: The Human Genome Project (HGP) begins in the United States
1997: The genome of Escherichia coli is sequenced
2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Chart: number of publications per year, 1988-2009. Axes: Year, Publications.]

Chart annotation: Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Chart: number of publications by institution. Axes: Institution, Publications.]

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1: Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3: The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
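
The coverage figures in the abstract can be checked with simple arithmetic. The sketch below uses the numbers reported by Venter et al. (2.91 billion bp of consensus, 14.8 billion bp of reads, 27,271,853 reads). The function names are illustrative, and the gap estimate uses the Poisson (Lander-Waterman style) approximation, which is an assumption brought in here for illustration and not the assembly model used in the paper.

```python
import math

# Figures as reported in the Venter et al. (2001) abstract.
CONSENSUS_BP = 2.91e9   # length of the consensus sequence
READ_BP = 14.8e9        # total bases in high-quality reads
N_READS = 27_271_853    # number of high-quality reads


def fold_coverage(total_read_bp: float, genome_bp: float) -> float:
    """Average number of times each base of the genome is sequenced."""
    return total_read_bp / genome_bp


def expected_uncovered_fraction(coverage: float) -> float:
    """Poisson approximation: probability that a base is hit by no read."""
    return math.exp(-coverage)


mean_read_length = READ_BP / N_READS
celera_coverage = fold_coverage(READ_BP, CONSENSUS_BP)

if __name__ == "__main__":
    print(f"mean read length: {mean_read_length:.0f} bp")
    print(f"coverage from Celera reads: {celera_coverage:.2f}-fold")
    for c in (5.11, 8.0):
        print(f"{c}-fold -> expected uncovered fraction {expected_uncovered_fraction(c):.5f}")
```

Dividing by the consensus length gives a value close to, but not exactly, the 5.11-fold reported, presumably because the authors used a slightly different genome size. Under the Poisson approximation, going from about fivefold to eightfold coverage lowers the expected uncovered fraction by more than an order of magnitude, which is consistent with the abstract's remark about fewer and smaller gaps.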

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, typically proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements rely on chemical markers that indicate some biological event (biomarkers). The values of the measured parameters are examined for correlations with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, as methods were developed to measure and analyze a very large number of analytes from a single sample, studies that try to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, found that subsequent investigations frequently failed to confirm the initial findings. Because the number of measurements is enormous and the number of samples is limited, problems arise with statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
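
The statistical problem named in the last bullet (many measurements, few samples) can be illustrated with a small simulation. The sketch below is not taken from Lay et al.; it is a minimal illustration under stated assumptions: every "biomarker" is pure noise with known unit variance, so a two-sample z-test is valid, and the function name and default sizes are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def null_false_positives(n_params=5000, n_per_group=10, alpha=0.05, seed=0):
    """Count 'significant' parameters when no parameter has any real effect.

    Each parameter is measured in n_per_group cases and n_per_group controls,
    all drawn from the same standard normal distribution.
    Returns (hits at the nominal alpha, hits after Bonferroni correction).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference of two means with known variance 1.
    se = (2.0 / n_per_group) ** 0.5
    p_values = []
    for _ in range(n_params):
        cases = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (fmean(cases) - fmean(controls)) / se
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    naive_hits = sum(p < alpha for p in p_values)
    bonferroni_hits = sum(p < alpha / n_params for p in p_values)
    return naive_hits, bonferroni_hits


if __name__ == "__main__":
    naive, corrected = null_false_positives()
    print(f"nominal p < 0.05: {naive} of 5000 noise parameters")
    print(f"after Bonferroni correction: {corrected}")
```

With 5,000 unrelated parameters, roughly alpha × 5,000 (about 250) pass the nominal threshold by chance alone, which is one way initial "discoveries" can fail to replicate. Bonferroni is used here only as the simplest correction; other procedures trade strictness for power.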

OMICS | ARTICLE | URL

Genomics | Genomics: buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656

Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out with funding from DGAPA UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our attention that the more we explore the human genome, the more there remains to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, 2001


The transformation of biology

• Molecularization
• Collaboration
• Multidisciplinarity
• International projects
• Grids
• New models
• New disciplines
– Bioinformatics
– Medical informatics
– Neuroinformatics
– Systems biology
• Data collections
– Biological
– Bibliographic

Biological research

Observation

Description

Classification

Experimentation

Comparison

Scientific research (Biology)

In vivo

In vitro (1900)

In silico (1990)

[Chart: documents per year, 1900-2005, by genus. Series: Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Axes: Year, Documents.]

[Chart: documents per year, 1900-2005, by genus. Series: Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). Axes: Year, Documents.]

Figure 2.1

[Chart: documents by subject area. Categories: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Sciences. Axes: Subject Area, Documents.]

Figure 2.2

[Chart: documents per year, 1905-2005. Series: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience. Axes: Year, Documents.]

Figure 2.3

[Chart: documents per year, 1905-2005. Series: Evolutionary Biology, Ecology, Zoology, Toxicology. Axes: Year, Documents.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive sequences produced in laboratories throughout the world, from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
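
A doubling time of 10 months corresponds to the growth law N(t) = N0 · 2^(t/10), with t in months. The sketch below is a minimal illustration that starts from the Release 134 figure quoted on the slide (29.3 billion bases in February 2003). The function names are illustrative, and treating the doubling time as constant is an assumption, so the results are extrapolations and not GenBank statistics.

```python
import math

DOUBLING_MONTHS = 10.0    # doubling time quoted on the slide
BASES_FEB_2003 = 29.3e9   # Release 134, February 2003


def projected_bases(months_after, start=BASES_FEB_2003, doubling=DOUBLING_MONTHS):
    """Size after `months_after` months under constant exponential growth."""
    return start * 2.0 ** (months_after / doubling)


def months_to_reach(target, start=BASES_FEB_2003, doubling=DOUBLING_MONTHS):
    """Months needed to grow from `start` to `target` at the same rate."""
    return doubling * math.log2(target / start)


annual_growth_factor = 2.0 ** (12.0 / DOUBLING_MONTHS)

if __name__ == "__main__":
    print(f"growth per year: x{annual_growth_factor:.2f}")
    print(f"projected size after 5 years: {projected_bases(60):.3g} bases")
    print(f"months to reach 10x the 2003 size: {months_to_reach(10 * BASES_FEB_2003):.1f}")
```

Doubling every 10 months means the collection more than doubles each year (a factor of about 2.3) and grows tenfold in under three years.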

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435.

[Chart: documents in PubMed (NIH) per year, 1865-2005. About 20 million documents (October 2010).]

e-science and cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It comprises an interoperable set of elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-Research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communications networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that enable secure connectivity and interoperability

– Application tools, including discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will enable researchers to carry out their research more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• It is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. It was published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except the following:
Department of Cellular and Structural Biology
Department of Molecular Genetics
Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except the following:
Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The Escherichia coli genome is sequenced

2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). It has helped to understand polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Chart: number of publications per year, 1988-2009. Axes: Year, Publications.]

Chart annotation: Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by CA Ferguson-Smith and FH Ruddle of the Department of Biology, Yale University, New Haven, Connecticut.

[Chart: number of publications by institution. Axes: Institution, Publications.]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062. http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1


Page 18: Genomica colombia

Investigacioacuten bioloacutegica

Observacioacuten

Descripcioacuten

ClasificacioacutenExperimentacioacuten

Comparacioacuten

Investigacioacuten Cientiacutefica(Biologiacutea)

In vivoIn vitro (1900)In Silico (1990)

[Figure: documents per year, 1900-2005 (x axis: Year; y axis: Documents). First panel (scale 0 to 3,000): Escherichia (94873), Drosophila (48989), Saccharomyces (27549), Arabidopsis (18094), Caenorhabditis (5353). Second panel (scale 0 to 250): Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), Danio (973). The numbers in parentheses are reproduced from the figure legend.]

Figure 21

[Figure: bar chart of documents by subject area (x axis: Subject Area; y axis: Documents). Subject areas shown: Genetics & Heredity; Biochem & Mol Bio; Cell Bio; Develop Bio; Multidisciplinary; Neurosciences; Biology; Zoology; Evolutionary Bio; Ecology; Entomology; Toxicology; Physiology; Biophysics; Biotech & Microbio; Behavioral Scien.]

Figure 22

[Figure: documents per year, 1905-2005 (x axis: Year; y axis: Documents, scale 0 to 600). Series: Genetics & Heredity; Biochemistry & Molecular Biology; Cell Biology; Developmental Biology; Neuroscience.]

Figure 23

[Figure: documents per year, 1905-2005 (x axis: Year; y axis: Documents, scale 0 to 100). Series: Evolutionary Biology; Ecology; Zoology; Toxicology.]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations
• National Center for Biotechnology Information (NCBI)
• European Molecular Biology Laboratory (EMBL) Data Library, at the European Bioinformatics Institute (EBI)
• DNA Data Bank of Japan (DDBJ)
• These centers receive sequences produced in laboratories throughout the world, from more than 100,000 distinct organisms
• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences
• It is built from direct submissions by individual laboratories and by large-scale sequencing centers
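The doubling claim above can be turned into a small worked calculation. The following is a minimal sketch, assuming the 10-month doubling time stays constant; that assumption is an extrapolation made here for illustration, not a statement about actual later GenBank releases. The function name `projected_bases` is hypothetical.

```python
def projected_bases(initial_bases: float, months_elapsed: float,
                    doubling_months: float = 10.0) -> float:
    """Size after `months_elapsed` months under constant exponential doubling."""
    return initial_bases * 2 ** (months_elapsed / doubling_months)


# Starting point: the Release 134 figure quoted above (February 2003).
release_134_bases = 29.3e9

# Illustrative projection only: each 10-month step doubles the total.
for months in (0, 10, 20, 30):
    size = projected_bases(release_134_bases, months)
    print(f"{months:2d} months: {size:.3e} bases")
```

Under this assumption the collection would be four times its February 2003 size after 20 months, which is what "exponential" means in practice for storage and search infrastructure.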

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435.

[Figure: documents in PubMed (NIH) by year, 1865-2005 (y axis scale 0 to 800,000).]

Documents in PubMed (NIH): about 20 million (October 2010)

e-science and cyberinfrastructure

• e-science (Europe)
– United Kingdom's Office of Science and Technology, 1999
– Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)
– United States National Science Foundation (NSF) blue-ribbon committee, 2003
– Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies that emerge from greater access to:

– Broadband communications networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their research more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than to the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at httpwwwncessacukabout_eSSfaqq=General_1General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles that appeared one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
• 79 authors
• 48 institutions
• 181 references
• All authors come from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome
• 276 authors
• 14 institutions
• 452 references
• All authors come from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

• 1865: Gregor Mendel discovers the laws of genetics
• 1953: James Watson and Francis Crick describe the double-helix structure of DNA
• 1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code
• 1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology
• 1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods
• 1982: The GenBank database is established
• 1985: The polymerase chain reaction (PCR) is invented
• 1990: The Human Genome Project (HGP) begins in the United States
• 1997: The Escherichia coli genome is sequenced
• 2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: number of publications per year, 1988-2009 (x axis: Year; y axis: Publications, scale 0 to 3,500).]

Figure callout: Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology, Yale University, New Haven, Connecticut.

[Figure: number of publications by institution (x axis: Institution; y axis: Publications, scale 0 to 600).]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
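The figures in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: it uses the numbers quoted in the abstract, and the variable names are hypothetical. The coverage computed against the 2.91-billion bp consensus comes out slightly below the quoted 5.11-fold; presumably the paper divides by a different genome-size estimate, which is an assumption made here and not something stated in the abstract.

```python
# Values quoted in the abstract above.
total_sequenced_bp = 14.8e9      # raw sequence generated
high_quality_reads = 27_271_853  # number of sequence reads
consensus_bp = 2.91e9            # consensus euchromatic sequence

# Mean read length: total sequence divided by number of reads.
mean_read_length_bp = total_sequenced_bp / high_quality_reads

# Fold coverage relative to the consensus length (assumed denominator).
fold_coverage = total_sequenced_bp / consensus_bp

# Approximate amount of sequence spanned by exons (1.1% of the consensus).
exon_bp = 0.011 * consensus_bp

# Expected differences between two random haploid genomes (1 bp per 1250).
pairwise_differences = consensus_bp / 1250

print(f"mean read length: {mean_read_length_bp:.0f} bp")
print(f"fold coverage: {fold_coverage:.2f}")
print(f"exonic sequence: {exon_bp / 1e6:.1f} million bp")
print(f"pairwise differences: {pairwise_differences / 1e6:.2f} million")
```

The read length of roughly 540 bp is consistent with the 550-bp segments into which the public data were shredded.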

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters instead of only a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) instead of studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
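The statistical problem mentioned above (many measurements, few samples) can be made concrete with standard multiple-testing arithmetic. This is a minimal textbook sketch, not a method taken from the cited article: it assumes that every tested parameter is truly unrelated to the disease and that the tests are independent. All function names are hypothetical.

```python
def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of spurious 'hits' when every null hypothesis is true."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of one or more spurious hits, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise error near alpha."""
    return alpha / num_tests


# Illustrative case: an experiment that measures 10,000 parameters.
num_parameters = 10_000
print(expected_false_positives(num_parameters))
print(prob_at_least_one_false_positive(num_parameters))
print(bonferroni_threshold(num_parameters))
```

With thousands of parameters tested at the usual 0.05 level, hundreds of apparent associations are expected by chance alone, which is one plausible reason why follow-up studies often fail to confirm initial findings.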

OMICS and ARTICLE

• Genomics: "Genomics: buzzword or reality" (http://view.ncbi.nlm.nih.gov/pubmed/10343163)
• Metabolomics: "Metabolomic" (httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM)
• Proteomics: "Proteomics" (http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee)
• Transcriptomics: "Transcriptomics" (http://view.ncbi.nlm.nih.gov/pubmed/18336229)
• Vaccinomics: "Vaccinomics" (httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf)
• Oncogenomics: "Oncogenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336222)
• Pharmacogenomics: "Pharmacogenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336223)
• Epigenomics: "Epigenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336226)
• Toxicogenomics: "Toxicogenomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336230)
• Kinomics: "Kinomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336231)
• Physiomics: "Physiomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336232)
• Cytomics: "Cytomics" (http://www.ncbi.nlm.nih.gov/pubmed/18336233)
• Postgenomics: "Tracking the shift to postgenomics" (http://dx.doi.org/10.1159/000092656)
• Glycomics: "Probing glycomics" (http://dx.doi.org/10.1016/j.cbpa.2006.11.040)
• Lipidomics: "Lipidomics is emerging" (http://view.ncbi.nlm.nih.gov/pubmed/14643793)
• Cellunomics: "Cellunomics: the interaction analysis of cells" (http://view.ncbi.nlm.nih.gov/pubmed/19887340)
• Phylogenomics: "Phylogenomics: evolution and genomics intersection" (http://view.ncbi.nlm.nih.gov/pubmed/19778869)

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our attention that the more we explore the human genome, the more there remains to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, 2001

Page 19: Genomica colombia

0

500

1000

1500

2000

2500

3000

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Escherichia (94873)

Drosophila (48989)

Saccharomyces (27549)

Arabidopsis (18094)

Caenorhabditis (5353)

Year

Docu

men

ts

0

50

100

150

200

250

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Zea (7636)

Neurospora (6640)

Dictyostelium (6191)

Chlamydomonas (5646)

Schizosaccharomyces (3183)

Danio (973)

Year

Docu

men

tos

Figure 21

0

5

10

15

20

25

30

35

40

Ge

ne

tics amp H

ere

dity

Bio

che

m amp

Mo

l Bio

Ce

ll Bio

De

velo

p B

io

Mu

ltidiscip

linary

Ne

uro

scien

ces

Bio

logy

Zoo

logy

Evolu

tion

ary Bio

Ecolo

gy

Ento

mo

logy

Toxico

logy

Ph

ysiolo

gy

Bio

ph

ysics

Bio

tech

amp M

icrob

io

Be

havio

ral Scien

Do

cum

en

ts

Subject Area

12581258 1231 1200 782 706

10907

8137

4310

2519 2482 2365 23401811

1465

6328

17900

(2)(1)

(20)

(3)

(15)

(18)

(6)

(3)(2)

Figure 22

0

100

200

300

400

500

600

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Genetics amp Heredity

Biochemistry amp Molecular Biology

Cell Biology

Developmental Biology

Neuroscience

Year

Docu

men

ts

Figure 23

0

10

20

30

40

50

60

70

80

90

100

19

05

19

10

19

15

19

20

19

25

19

30

19

35

19

40

19

45

19

50

19

55

19

60

19

65

19

70

19

75

19

80

19

85

19

90

19

95

20

00

20

05

Evolutionary Biology

Ecology

Zoology

Toxicology

Year

Docu

men

ts

Coleccioacuten de

secuencias de

genesE-ciencia

Ciberinfraestructura

Genbank

bull Es una coleccioacuten anotada de todas las secuencias de nucleoacutetidos a disposicioacuten del puacuteblico y su traduccioacuten de proteiacutenas

bull Centro Nacional de Informacioacuten Biotecnoloacutegica (NCBI)bull European Molecular Biology Laboratory (EMBL) de datos de

Bibliotecas del Instituto Europeo de Bioinformaacutetica (EBI)bull DNA Data Bank de Japoacuten (DDBJ)bull Reciben las secuencias producidas en laboratorios de todo el

mundo de maacutes de 100000 organismos distintosbull Crece a un ritmo exponencial duplicando cada 10 meses Suelte

134 producido en febrero de 2003 conteniacutea maacutes de 29300 millones de bases nucleotiacutedicas en maacutes de 230 millones de secuencias

bull Se construye mediante el enviacuteo directo de los distintos laboratorios y de los centros de secuenciacioacuten a gran escala

httpwwwncbinlmnihgovgenbankgenbankstatshtml

Olson M Hood L Cantor C Botstein D A common language for physical mapping of the human genome Science 1989 245(4925) 1434ndash1435 [PubMed]

0

100000

200000

300000

400000

500000

600000

700000

800000

1865

1870

1875

1880

1885

1890

1895

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Documentos en PubMed (NIH)

Cerca de 20 millones octubre 2010)

e-science cyberinfraestructure

bull e-science (europe)

bull United Kingdoms Office of Science and Technology in 1999

bull Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

bull cyberinfraestructure (USA)bull United States National Science

Foundation (NSF) blue-ribbon committee in 2003

bull Describes the new research environments that support advanced data acquisition data storage data management data integration data mining data visualization and other computing and information processing services over the Internet

Ciberinfraestructura

bullEntorno tecnoloacutegico-social que permite crear difundir y preservar los datos informacioacuten y conocimientos mediante la adquisicioacuten almacenamiento gestioacuten integracioacuten informaacutetica mineriacutea visualizacioacuten y otros servicios a traveacutes de Internet (NSF 2003 2007)

bullIncluye un conjunto interoperable de diversos elementos

ndash1) Infraestructura los sistemas computacionales (hardware software y redes) servicios instrumentos y herramientas

ndash2) Colecciones de datos

ndash3) Grupos virtuales de investigacioacuten (colaboratorios y observatorios)

e-investigacioacutenbull Actividades de investigacioacuten que utilizan una gama de capacidades avanzadas de las TIC y abarca

nuevas metodologiacuteas de investigacioacuten que salen de un mayor acceso a

Las comunicaciones de banda ancha de redes instrumentos de investigacioacuten y las instalaciones redes de sensores y repositorios de datos

Software y servicios de infraestructura que permitan garantizar la conectividad e interoperabilidad

Aplicacioacuten herramientas que abarcan la disciplina de instrumentos especiacuteficos y herramientas de interaccioacuten

avanzar y aumentar en lugar de reemplazar las tradicionales metodologiacuteas de investigacioacuten

bull permitiraacute a los investigadores para llevar a cabo su labor de investigacioacuten maacutes creativa eficiente y colaboracioacuten a larga distancia y difundir sus resultados de la investigacioacuten con un mayor efecto

bull Colaboracioacuten

Nuevos campos de investigacioacuten emergentes utilizando nuevas teacutecnicas de mineriacutea de datos y el anaacutelisis avanzados algoritmos computacionales y de redes de intercambio de recursos

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science, 2008. Frequently Asked Questions. Available at httpwwwncessacukabout_eSSfaqq=General_1General_1
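The idea behind the Grid, many modest machines each processing a slice of a large job and contributing to a shared result, can be sketched in a few lines. This is only an illustration under strong assumptions: worker threads on a single machine stand in for Grid nodes, and the names `count_gc` and `distributed_gc_count`, the chunk size, and the toy sequence are all hypothetical and do not correspond to any Grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(chunk: str) -> int:
    """Count G and C bases in one chunk (the unit of work handed to a worker)."""
    return sum(1 for base in chunk if base in "GC")


def distributed_gc_count(sequence: str, chunk_size: int = 4, workers: int = 3) -> int:
    """Split a sequence into chunks, process them in parallel, pool the results."""
    # Hypothetical partitioning: fixed-size slices of the input sequence.
    chunks = [sequence[i:i + chunk_size] for i in range(0, len(sequence), chunk_size)]
    # Threads stand in for geographically dispersed nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_gc, chunks))
    # The "shared pool": partial results are combined into one answer.
    return sum(partial_counts)


if __name__ == "__main__":
    print(distributed_gc_count("ATGCGCATTAGGCC"))
```

A real Grid adds what this sketch leaves out: scheduling across sites, data transfer, authentication, and fault tolerance.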

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the human genome project; it was published in 2001 in two articles, one day apart, in the journals Nature and Science

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The sequence of the human genome is completed

Source: httpwwwnaturecomnaturejournalv422n6934pdf timeline_01626pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010); this study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: Number of publications per year, 1988-2009. x-axis: Year; y-axis: Publications]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C.A. Ferguson-Smith and F.H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Figure: Number of publications by institution. x-axis: Institution; y-axis: Publications]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

httpwwwnaturecomnaturejournalv409n6822fig_tab409860a0_F1html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

httpwwwnaturecomnaturejournalv409n6822images409860ac2jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
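The quantities in this abstract are linked by simple arithmetic, which a short sketch makes explicit. It assumes the figures read as 14.8 billion bp of read sequence, 27,271,853 reads, a 2.91-billion-bp consensus, and one difference per 1250 bp; the function names are hypothetical. Dividing by the consensus length gives about 5.09-fold rather than the quoted 5.11-fold, so the authors presumably used a slightly smaller genome-size denominator (an inference, not something the abstract states).

```python
TOTAL_READ_BP = 14.8e9      # total sequence in reads, as read from the abstract
N_READS = 27_271_853        # number of high-quality reads
CONSENSUS_BP = 2.91e9       # consensus length of the euchromatic genome
BP_PER_DIFFERENCE = 1250    # one differing base per 1250 bp between two haploid genomes


def fold_coverage(total_read_bp: float, genome_bp: float) -> float:
    """Average number of times each genome base is covered by read sequence."""
    return total_read_bp / genome_bp


def mean_read_length(total_read_bp: float, n_reads: int) -> float:
    """Average length of one read in base pairs."""
    return total_read_bp / n_reads


def expected_differences(genome_bp: float, bp_per_difference: float) -> float:
    """Expected number of differing bases between two random haploid genomes."""
    return genome_bp / bp_per_difference


if __name__ == "__main__":
    print(f"coverage: {fold_coverage(TOTAL_READ_BP, CONSENSUS_BP):.2f}-fold")
    print(f"mean read length: {mean_read_length(TOTAL_READ_BP, N_READS):.0f} bp")
    print(f"expected differences: {expected_differences(CONSENSUS_BP, BP_PER_DIFFERENCE):,.0f}")
```

The mean read length of roughly 540 bp is consistent with the 550-bp segments into which the public data were shredded.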

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

httpgenomeucsceducgi-binhgTracksorg=human

PubMed record

httpwwwncbinlmnihgovSitemapsamplerecordhtml

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
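The statistical problem noted above, very many measurements on few samples, can be made concrete with a multiple-testing calculation. This is a generic sketch, not the analysis from Lay et al.: the 10,000 parameters and the 0.05 significance level are hypothetical, the tests are assumed independent, and the Bonferroni correction is one standard remedy named here for illustration only.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


def family_wise_error_rate(n_tests: int, alpha: float) -> float:
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_parameters = 10_000   # hypothetical number of measured analytes
    alpha = 0.05            # conventional per-test significance level
    print(expected_false_positives(n_parameters, alpha))
    print(family_wise_error_rate(n_parameters, alpha))
    print(bonferroni_threshold(n_parameters, alpha))
```

Under these assumptions, hundreds of parameters would appear associated with disease by chance alone, which is one reason follow-up studies may fail to confirm initial findings.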

OMICS | ARTICLE

Genomics | Genomics buzzword or reality
httpviewncbinlmnihgovpubmed10343163

Metabolomics | Metabolomic
httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM

Proteomics | Proteomics
httpdxdoiorg10109701mnh000007969189474ee

Transcriptomics | Transcriptomics
httpviewncbinlmnihgovpubmed18336229

Vaccinomics | Vaccinomics
httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf

Oncogenomics | Oncogenomics
httpwwwncbinlmnihgovpubmed18336222

Pharmacogenomics | Pharmacogenomics
httpwwwncbinlmnihgovpubmed18336223

Epigenomics | Epigenomics
httpwwwncbinlmnihgovpubmed18336226

Toxicogenomics | Toxicogenomics
httpwwwncbinlmnihgovpubmed18336230

Kinomics | Kinomics
httpwwwncbinlmnihgovpubmed18336231

Physiomics | Physiomics
httpwwwncbinlmnihgovpubmed18336232

Cytomics | Cytomics
httpwwwncbinlmnihgovpubmed18336233

Postgenomics | Tracking the shift to postgenomics
httpdxdoiorg101159000092656

Glycomics | Probing glycomics
httpdxdoiorg101016jcbpa200611040

Lipidomics | Lipidomics is emerging
httpviewncbinlmnihgovpubmed14643793

Cellunomics | Cellunomics: the interaction analysis of cells
httpviewncbinlmnihgovpubmed19887340

Phylogenomics | Phylogenomics: evolution and genomics intersection
httpviewncbinlmnihgovpubmed19778869

httpbiiiogeekblogspotcom

httplaylamichanunamblogspotcom

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we explore the human genome, the more there is left to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, 2001

Page 20: Genomica colombia

[Figure 21: Documents per year, 1900-2005, for Zea (7636), Neurospora (6640), Dictyostelium (6191), Chlamydomonas (5646), Schizosaccharomyces (3183), and Danio (973). x-axis: Year; y-axis: Documents]

[Figure 22: Documents by subject area: Genetics & Heredity, Biochem & Mol Bio, Cell Bio, Develop Bio, Multidisciplinary, Neurosciences, Biology, Zoology, Evolutionary Bio, Ecology, Entomology, Toxicology, Physiology, Biophysics, Biotech & Microbio, Behavioral Sciences. x-axis: Subject Area; y-axis: Documents]

[Figure 23: Documents per year, 1905-2005, for Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, and Neuroscience. x-axis: Year; y-axis: Documents]

[Figure: Documents per year, 1905-2005, for Evolutionary Biology, Ecology, Zoology, and Toxicology. x-axis: Year; y-axis: Documents]

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library of the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive sequences produced in laboratories throughout the world, from more than 100,000 distinct organisms

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

httpwwwncbinlmnihgovgenbankgenbankstatshtml
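Exponential growth with a fixed doubling time follows N(t) = N0 · 2^(t / T), where T is the doubling time. The sketch below applies that formula to the figures quoted on this slide, taking 29.3 billion bases (Release 134, February 2003) as the starting point and 10 months as the doubling time. It is an extrapolation under those assumptions only, not a statement about GenBank's actual later size; the function names are hypothetical.

```python
import math

RELEASE_134_BASES = 29.3e9   # bases in Release 134, as quoted on the slide
DOUBLING_MONTHS = 10.0       # doubling time quoted on the slide


def projected_bases(initial_bases: float, months_elapsed: float,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Size after months_elapsed, assuming a constant doubling time."""
    return initial_bases * 2 ** (months_elapsed / doubling_months)


def months_to_reach(initial_bases: float, target_bases: float,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed to grow from initial_bases to target_bases."""
    return doubling_months * math.log2(target_bases / initial_bases)


if __name__ == "__main__":
    # One doubling period after Release 134.
    print(projected_bases(RELEASE_134_BASES, 10))
    # Time for a tenfold increase under the same assumption.
    print(months_to_reach(RELEASE_134_BASES, 10 * RELEASE_134_BASES))
```

A tenfold increase takes log2(10) ≈ 3.32 doubling periods, which is a little under three years at the quoted rate.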

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434–1435.

[Figure: Documents in PubMed (NIH) per year, 1865-2005. About 20 million documents (October 2010)]

e-science, cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It includes an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-Research

• Research activities that use a range of advanced ICT capabilities and encompass new research methodologies that emerge from greater access to:

• Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

• Software and infrastructure services that ensure connectivity and interoperability

• Application tools, including discipline-specific tools and interaction tools

Advancing and augmenting, rather than replacing, traditional research methodologies

• It will allow researchers to carry out their research more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect

• Collaboration

New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science bull Originally referred to experiments that connected together a few powerful

computers located at different sites and later a very large number of modest PCs across the world in order to undertake enormous calculations or process huge amounts of data The coordination of geographically dispersed computing and data resources has become known as the Grid This is shorthand for the emerging standards and technology ndash hardware and software ndash being developed to enable and simplify the sharing of resources The analogy is an electric power grid which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it

bull What is exciting about the Grid is that the combination of extensive connectivity massive computer power and vast quantities of digitized data ndash all three of which are still rapidly expanding ndash making possible new applications that are orders of magnitude more potent than even a few years ago

bull The term e-research is sometimes used instead of e-science with the advantage that gives more emphasis to the end result of better richer faster or new research results rather than the technologies used to get them

National Centre for e-Social Science 2008 Frequently Asked Questions Diponible en httpwwwncessacukabout_eSSfaqq=General_1General_1

E-ciencia (e-science)

bull Resulta del uso y aplicacioacuten de la Ciberinfraestructura en la praacutectica cientifica

bull Se caracteriza por la inter y multidisciplinariedad

bull Colaboracioacuten la participacioacuten de un gran nuacutemero de investigadores (en algunos casos cientos) localizados en diversas regiones y con diferentes especialidades que se forman grupos trabajo (Hey y Trefethen 2005 Barbera et al2009)

Uno de los primeros proyectos de e-ciencia fue el de el genoma humano se publicoacute en el 2001 en dos artiacuteculos con un diacutea de diferencia en las revistas Nature y Science

NatureInitial sequencing and analysis of the human genome79 Autores48 Instituciones

181 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientesDepartment of Cellular and Structural BiologyDepartment of Molecular GeneticsDepartment of Molecular Biology

Science The Sequence of the Human Genome276 Autores14 Instituciones452 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientes Department of Biology e Informaacutetica Meacutedica

E-ciencia

1865 Gregor Mendel descubre las leyes de la Geneacutetica httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1953 James Watson y Francis Crick describen la estructura de la doble-heacutelice del ADN httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1966 Marshall Nirenberg Har Gobind Khorana y Robert Holley determinan el coacutedigo

geneacutetico

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1972 Stanley Cohen and Herbert Boyer desarrollan la tecnologiacutea del ADN recombinante httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1977 Frederick Sanger Allan Maxam y Walter Gilbert desarrollan meacutetodos de

secuenciacioacuten del ADN

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1982 La base de datos bdquoGenBank‟ es establecida httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1985 Es inventada la Reaccioacuten en Cadena de la Polimerasa (PCR) httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1990 El Proyecto Genoma Humano (HGP) inicia en Estados Unidos httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1997 El genoma de Escherichia coli es secuenciado httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

2003 La secuencia del genoma humano es finalizada httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

La Genoacutemicabull Es el estudio sistemaacutetico de la documentacioacuten completa de

las secuencias del ADN de los organismos (MeSH 2010) dicho estudio ha ayudado a comprender polimorfismos dentro de las especies la interaccioacuten de las proteiacutenas y la evolucioacuten (Brent 2000)

bull Incluye todos los meacutetodos que recopilan y analizan datos completos acerca de los genes incluida las secuencias la abundancia de los aacutecidos nucleicos y las propiedades de las proteiacutenas que codifican (Murray 2000)

bull Nuestra capacidad para estudiar la funcioacuten geacutenica estaacute aumentando en la especificidad gracias a esta nueva disciplina (Collins et aacutel 2003) que se compromete a acelerar el descubrimiento cientiacutefico en todos los aacutembitos de la ciencia bioloacutegica (Burley 2000)

0

500

1000

1500

2000

2500

3000

3500

1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009

Pu

blic

acio

ne

s

Antildeo

Nuacutemero de Publicaciones

Pathology and

immunopathology

research en 1988 The

genomics of human

homeobox-containing

loci‟ escrito por CA

Ferguson-Smith y FH

Ruddle del

Departamento de

Biologiacutea en la

Universidad de Yale

New Haven

Connecticut

0

100

200

300

400

500

600

Pu

blic

acio

ne

s

Institucioacuten

Nuacutemero de Publicaciones

Nature 2001 Feb 15409(6822)860-921Initial sequencing and analysis of the human genome

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
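The figures in this abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration and is not taken from the paper: it reads the abstract's numbers as 14.8 billion bp of sequence, 27,271,853 reads, 5.11-fold coverage, a 2.91-billion bp consensus and 1 differing bp per 1250, and it assumes that fold coverage means total sequenced bases divided by genome size. The function names are hypothetical.

```python
# Cross-check of figures quoted in the Venter et al. (2001) abstract.
# Assumption (not stated in the source): fold coverage is taken to mean
# total sequenced bases divided by genome size.

TOTAL_BASES = 14.8e9       # bp of DNA sequence generated
NUM_READS = 27_271_853     # high-quality sequence reads
FOLD_COVERAGE = 5.11       # coverage reported for those reads
CONSENSUS_BP = 2.91e9      # size of the consensus sequence
BP_PER_DIFFERENCE = 1250   # one differing bp per 1250 bp on average


def mean_read_length(total_bases: float, num_reads: int) -> float:
    """Average number of base pairs per sequence read."""
    return total_bases / num_reads


def implied_genome_size(total_bases: float, fold_coverage: float) -> float:
    """Genome size implied by the stated fold coverage."""
    return total_bases / fold_coverage


def expected_differences(sequence_bp: float, bp_per_difference: float) -> float:
    """Expected number of differing positions between two haploid genomes."""
    return sequence_bp / bp_per_difference


print(f"mean read length:     {mean_read_length(TOTAL_BASES, NUM_READS):.0f} bp")
print(f"implied genome size:  {implied_genome_size(TOTAL_BASES, FOLD_COVERAGE):.3e} bp")
print(f"expected differences: {expected_differences(CONSENSUS_BP, BP_PER_DIFFERENCE):.3e}")
```

Under these assumptions the reads average a little over 540 bp, and the implied genome size is close to 2.9 billion bp, which is consistent with the size of the consensus sequence.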

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• “Omics” studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics). The term “omics” derives from the Greek suffix “ome”, meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes and metabolites, the corresponding approaches are proteomics, genomics and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that try to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern “omics” experiments.

• The ultimate goal of “omics” approaches is to understand biological processes comprehensively and as a whole by identifying and correlating the various “players” (e.g. genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent studies that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of samples, problems arise relating to statistics, bias, methodology and misuse of the method.

• Lay Jr, J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the “omics”. TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
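The statistical problem named in the last bullet, many measurements on few samples, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a method taken from Lay et al.: every parameter is pure noise with known unit variance, so a simple z-test applies, and the sample sizes, the significance threshold and the function name `simulate_null_study` are all hypothetical. It counts how many parameters look "significant" without correction and with a Bonferroni correction.

```python
import random
from statistics import NormalDist, mean


def simulate_null_study(n_params=1000, n_per_group=10, alpha=0.05, seed=0):
    """Count 'significant' parameters in a study where no real effect exists.

    Every parameter is drawn from the same standard normal distribution in
    both groups, so any hit is a false positive.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference of two group means (unit variance).
    se = (2.0 / n_per_group) ** 0.5
    p_values = []
    for _ in range(n_params):
        cases = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (mean(cases) - mean(controls)) / se
        # Two-sided p-value from the standard normal distribution.
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    naive_hits = sum(p < alpha for p in p_values)
    # Bonferroni: divide the threshold by the number of tests performed.
    bonferroni_hits = sum(p < alpha / n_params for p in p_values)
    return naive_hits, bonferroni_hits


naive, corrected = simulate_null_study()
print(f"uncorrected hits: {naive}, Bonferroni-corrected hits: {corrected}")
```

With 1000 noise parameters and a 0.05 threshold, roughly 50 uncorrected hits are expected by chance alone, which is one plausible reason why follow-up studies fail to confirm initial associations.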

OMICS | ARTICLE | LINK

Genomics | Genomics: buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163
Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM
Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee
Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229
Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf
Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222
Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223
Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226
Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230
Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231
Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232
Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233
Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656
Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040
Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793
Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340
Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from

DGAPA, UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

“It has not escaped our attention that the more we explore the human genome, the more remains for us to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time.”

Thomas Stearns Eliot, 2001


Figure 21. Bar chart of documents by subject area. Y-axis: Documents. X-axis: Subject Area (Genetics & Heredity; Biochem & Mol Bio; Cell Bio; Develop Bio; Multidisciplinary; Neurosciences; Biology; Zoology; Evolutionary Bio; Ecology; Entomology; Toxicology; Physiology; Biophysics; Biotech & Microbio; Behavioral Scien).

Figure 22. Documents per year, 1905 to 2005, for Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology and Neuroscience. X-axis: Year. Y-axis: Documents.

Figure 23. Documents per year, 1905 to 2005, for Evolutionary Biology, Ecology, Zoology and Toxicology. X-axis: Year. Y-axis: Documents.

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations.

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library at the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• These centers receive sequences produced in laboratories throughout the world, from more than 100,000 distinct organisms.

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences.

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers.

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
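The statement that GenBank doubles every 10 months corresponds to the growth law N(t) = N0 * 2^(t / 10), with t in months. The sketch below is only an illustration of that formula. It takes the Release 134 figure as 29.3 billion bases in February 2003 and assumes the doubling time stays constant, which real growth need not follow. The function names are hypothetical.

```python
import math

DOUBLING_MONTHS = 10.0     # doubling time stated on the slide
BASES_FEB_2003 = 29.3e9    # nucleotide bases in Release 134 (February 2003)


def projected_bases(months_elapsed, start=BASES_FEB_2003, doubling=DOUBLING_MONTHS):
    """Size after `months_elapsed` months, assuming a constant doubling time."""
    return start * 2.0 ** (months_elapsed / doubling)


def months_to_reach(target, start=BASES_FEB_2003, doubling=DOUBLING_MONTHS):
    """Months needed to grow from `start` to `target` at that doubling time."""
    return doubling * math.log2(target / start)


# Example: projected size after one and two doubling periods.
for months in (10, 20):
    print(f"after {months} months: {projected_bases(months):.3e} bases")
print(f"months to reach 1e12 bases: {months_to_reach(1e12):.1f}")
```

A fixed doubling time was chosen here only because it is the one parameter the slide provides. Any projection far beyond 2003 should be checked against the actual GenBank statistics page.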

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435. [PubMed]

Figure: Documents in PubMed (NIH) per year, 1865 to 2005 (y-axis from 0 to 800,000 documents). About 20 million documents in total (October 2010).

e-science and cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Will refer to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet.

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet.

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate and preserve data, information and knowledge through acquisition, storage, management, integration, computing, mining, visualization and other services over the Internet (NSF 2003, 2007).

• It comprises an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software and networks), services, instruments and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from greater access to:

– Broadband communication networks, research instruments and facilities, sensor networks and data repositories

– Software and infrastructure services that make it possible to guarantee connectivity and interoperability

– Application tools, covering discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies.

• It will enable researchers to carry out their research more creatively, efficiently and collaboratively over long distances, and to disseminate their research results with greater effect.

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms, and networks for sharing resources.

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology – hardware and software – being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power and vast quantities of digitized data – all three of which are still rapidly expanding – is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster or new research results rather than the technologies used to get them.

National Centre for e-Social Science, 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• It results from the use and application of cyberinfrastructure in scientific practice.

• It is characterized by inter- and multidisciplinarity.

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009).

One of the first e-science projects was the Human Genome Project, published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors come from departments of genomic sciences (or genetics), except for the following:
Department of Cellular and Structural Biology
Department of Molecular Genetics
Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors come from departments of genomic sciences (or genetics), except for the following: Department of Biology, and Medical Informatics

E-science

1865 Gregor Mendel discovers the laws of genetics
1953 James Watson and Francis Crick describe the double-helix structure of DNA
1966 Marshall Nirenberg, Har Gobind Khorana and Robert Holley determine the genetic code
1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology
1977 Frederick Sanger, Allan Maxam and Walter Gilbert develop DNA sequencing methods
1982 The GenBank database is established
1985 The polymerase chain reaction (PCR) is invented
1990 The Human Genome Project (HGP) begins in the United States
1997 The genome of Escherichia coli is sequenced
2003 The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions and evolution (Brent 2000).

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

Figure: Number of publications per year, 1988 to 2009. X-axis: Year. Y-axis: Publications.

Pathology and Immunopathology Research, 1988: “The genomics of human homeobox-containing loci”, written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut.

Figure: Number of publications by institution. X-axis: Institution. Y-axis: Publications.

Nature 2001 Feb 15409(6822)860-921Initial sequencing and analysis of the human genome

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research Center for Genome Research Cambridge Massachusetts 02142 USA landergenomewimitedu

bull The human genome holds an extraordinary trove of information about human development physiology medicine and evolution Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome We also present an initial analysis of the data describing some of the insights that can be gleaned from the sequence

bull Here we report the results of a collaboration involving 20 groups from the United States the United Kingdom Japan France Germany and China to produce a draft sequence of the human genome

bull Of course navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value

FIGURA 1 Liacutenea de tiempo de los anaacutelisis genoacutemicos a gran escala

Nature 409 860-921(15 February 2001)doi10103835057062

httpwwwnaturecomnaturejournalv409n6822fig_tab409860a0_F1html

Nature 409 860-921(15 February 2001)doi10103835057062httpwwwnaturecomnaturejournalv409n6822images409860ac2jpg

FIGURE 3 The automated production line for sample preparation at the Whitehead Institute Center for Genome Research

bull Science 16 February 2001Vol 291 no 5507 pp 1304 - 1351DOI 101126science1058040

bull REVIEW

bull The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
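
The coverage figures in the abstract can be checked with simple arithmetic. The sketch below is only an illustration, taking the consensus length as 2.91 billion bp and the sequenced total as 14.8 billion bp. The function names are hypothetical, and because the inputs are rounded the computed coverage lands near, not exactly on, the 5.11-fold quoted.

```python
# Figures quoted in the abstract (Venter et al. 2001); rounded values.
CONSENSUS_BP = 2.91e9      # euchromatic consensus length, in bp
SEQUENCED_BP = 14.8e9      # total bases in the sequence reads
READ_COUNT = 27_271_853    # high-quality sequence reads


def fold_coverage(sequenced_bp: float, genome_bp: float) -> float:
    """Average number of times each genome position was sequenced."""
    return sequenced_bp / genome_bp


def mean_read_length(sequenced_bp: float, read_count: int) -> float:
    """Average read length in bp."""
    return sequenced_bp / read_count


def expected_differences(genome_bp: float, bp_per_difference: float) -> float:
    """Expected number of differing positions between two haploid genomes."""
    return genome_bp / bp_per_difference


if __name__ == "__main__":
    print(f"coverage: {fold_coverage(SEQUENCED_BP, CONSENSUS_BP):.2f}-fold")
    print(f"mean read length: {mean_read_length(SEQUENCED_BP, READ_COUNT):.0f} bp")
    print(f"differences at 1 per 1250 bp: {expected_differences(CONSENSUS_BP, 1250):,.0f}")
```

The last function applies the "1 bp per 1250" rate to the consensus length, which gives a rough sense of how many positions separate any two haploid genomes.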

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, typically proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases) that compared 370 studies revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
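
One way to see why many measurements on few samples cause statistical trouble is to count how many tests pass a significance threshold purely by chance. The sketch below is an illustration only, not the analysis used in the cited study. The function names, the 0.05 threshold, and the choice of a Bonferroni correction are assumptions made for the example.

```python
import random


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of chance 'hits' when no measured parameter has a real effect."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def simulate_null_hits(n_tests: int, alpha: float, seed: int = 0) -> int:
    """Count chance hits in one simulated experiment.

    Under the null hypothesis a p-value is uniform on [0, 1), so each
    test passes with probability alpha.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_tests))


if __name__ == "__main__":
    n_parameters = 10_000  # e.g. thousands of analytes from one sample
    print(expected_false_positives(n_parameters, 0.05))
    print(bonferroni_threshold(n_parameters, 0.05))
    print(simulate_null_hits(n_parameters, 0.05))
```

With thousands of parameters, an uncorrected threshold yields hundreds of apparent associations even when none is real, which is consistent with follow-up studies failing to confirm initial findings.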

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality?"
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"
http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, as quoted in the 2001 Nature paper


Figure 22. Documents per year, 1905-2005 (y-axis: Documents, 0-600; x-axis: Year). Series: Genetics & Heredity, Biochemistry & Molecular Biology, Cell Biology, Developmental Biology, Neuroscience.

Figure 23. Documents per year, 1905-2005 (y-axis: Documents, 0-100; x-axis: Year). Series: Evolutionary Biology, Ecology, Zoology, Toxicology.

Gene sequence collection

E-science

Cyberinfrastructure

GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library at the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• They receive sequences produced in laboratories around the world, from more than 100,000 distinct organisms

• It grows exponentially, doubling every 10 months. Release 134, produced in February 2003, contained more than 29,300 million nucleotide bases in more than 23.0 million sequences

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
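
The "doubling every 10 months" figure can be turned into a rough projection. The sketch below is a back-of-the-envelope illustration that assumes the doubling time stays constant, which real growth need not do. The function names are hypothetical, and the starting value is the roughly 29.3 billion bases quoted for Release 134.

```python
import math


def projected_bases(initial_bases: float, months_elapsed: float,
                    doubling_months: float = 10.0) -> float:
    """Size after months_elapsed, assuming a constant doubling time."""
    return initial_bases * 2.0 ** (months_elapsed / doubling_months)


def months_to_reach(initial_bases: float, target_bases: float,
                    doubling_months: float = 10.0) -> float:
    """Months needed to grow from initial_bases to target_bases."""
    return doubling_months * math.log2(target_bases / initial_bases)


if __name__ == "__main__":
    release_134 = 29.3e9  # bases, February 2003 (figure quoted on the slide)
    for months in (10, 20, 30):
        print(months, f"{projected_bases(release_134, months):.3e}")
    print(f"{months_to_reach(release_134, 1e12):.1f} months to reach 10^12 bases")
```

Under this assumption the collection grows eightfold in 30 months, which is why storage and search tools have to scale along with the data.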

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989; 245(4925): 1434-1435.

Figure: Documents in PubMed (NIH) per year, 1865-2005 (y-axis: 0-800,000). About 20 million documents in total (October 2010).

e-science, cyberinfrastructure

• e-science (Europe)

• Introduced by the United Kingdom's Office of Science and Technology in 1999

• Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

• Used by a United States National Science Foundation (NSF) blue-ribbon committee in 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It comprises an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from greater access to:

– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, encompassing discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their research more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect

• Collaboration

• New research fields are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1
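
The Grid idea of pooling many modest resources for one large calculation can be hinted at on a single machine. The sketch below is a toy stand-in, not Grid middleware: a thread pool plays the role of the distributed computers, and counting G and C bases in a DNA string plays the role of the large calculation. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(sequence: str, chunk_size: int) -> list[str]:
    """Cut the work into independent pieces that any worker can process."""
    return [sequence[i:i + chunk_size] for i in range(0, len(sequence), chunk_size)]


def count_gc(chunk: str) -> int:
    """Work done by one worker: count G and C bases in its chunk."""
    return sum(base in "GC" for base in chunk.upper())


def pooled_gc_count(sequence: str, chunk_size: int = 1000, workers: int = 4) -> int:
    """Distribute the chunks to a pool of workers and merge the partial counts."""
    chunks = split_into_chunks(sequence, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_gc, chunks))


if __name__ == "__main__":
    genome_fragment = "ATGCGC" * 500
    print(pooled_gc_count(genome_fragment, chunk_size=100))
```

The split, process, and merge pattern is the part that carries over to a real Grid. Scheduling, data transfer, and security across sites are left out of this sketch.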

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• It is characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project, published in 2001 in two articles one day apart in the journals Nature and Science

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except the following: Department of Biology and Medical Informatics

E-science

1865 Gregor Mendel discovers the laws of genetics

1953 James Watson and Francis Crick describe the double-helix structure of DNA

1966 Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977 Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982 The GenBank database is established

1985 The polymerase chain reaction (PCR) is invented

1990 The Human Genome Project (HGP) begins in the United States

1997 The Escherichia coli genome is sequenced

2003 The human genome sequence is completed

Source: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

Figure: Number of publications per year, 1988-2009 (y-axis: Publications, 0-3,500; x-axis: Year).

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

Figure: Number of publications by institution (y-axis: Publications, 0-600; x-axis: Institution).

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1


Page 23: Genomica colombia

Figure 23

0

10

20

30

40

50

60

70

80

90

100

19

05

19

10

19

15

19

20

19

25

19

30

19

35

19

40

19

45

19

50

19

55

19

60

19

65

19

70

19

75

19

80

19

85

19

90

19

95

20

00

20

05

Evolutionary Biology

Ecology

Zoology

Toxicology

Year

Docu

men

ts

Coleccioacuten de

secuencias de

genesE-ciencia

Ciberinfraestructura

Genbank

bull Es una coleccioacuten anotada de todas las secuencias de nucleoacutetidos a disposicioacuten del puacuteblico y su traduccioacuten de proteiacutenas

bull Centro Nacional de Informacioacuten Biotecnoloacutegica (NCBI)bull European Molecular Biology Laboratory (EMBL) de datos de

Bibliotecas del Instituto Europeo de Bioinformaacutetica (EBI)bull DNA Data Bank de Japoacuten (DDBJ)bull Reciben las secuencias producidas en laboratorios de todo el

mundo de maacutes de 100000 organismos distintosbull Crece a un ritmo exponencial duplicando cada 10 meses Suelte

134 producido en febrero de 2003 conteniacutea maacutes de 29300 millones de bases nucleotiacutedicas en maacutes de 230 millones de secuencias

bull Se construye mediante el enviacuteo directo de los distintos laboratorios y de los centros de secuenciacioacuten a gran escala

httpwwwncbinlmnihgovgenbankgenbankstatshtml

Olson M Hood L Cantor C Botstein D A common language for physical mapping of the human genome Science 1989 245(4925) 1434ndash1435 [PubMed]

0

100000

200000

300000

400000

500000

600000

700000

800000

1865

1870

1875

1880

1885

1890

1895

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Documentos en PubMed (NIH)

Cerca de 20 millones octubre 2010)

e-science and cyberinfrastructure

• e-science (Europe)

– Introduced by the United Kingdom's Office of Science and Technology in 1999.

– Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet.

• cyberinfrastructure (USA)

– Introduced by a United States National Science Foundation (NSF) blue-ribbon committee in 2003.

– Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet.

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007).

• It comprises an interoperable set of elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-Research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communications networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering both discipline-specific tools and interaction tools

• It is meant to advance and augment, rather than replace, traditional research methodologies.

• It will enable researchers to carry out their work more creatively, efficiently, and collaboratively over long distances, and to disseminate their research results with greater effect.

• Collaboration: new fields of research are emerging that use new data-mining and analysis techniques, advanced computational algorithms, and resource-sharing networks.

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1
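The idea of splitting one enormous calculation across many machines and pooling the partial results can be illustrated on a single computer. The sketch below is only an analogy and is not Grid middleware: worker threads stand in for remote computers, and the task (counting G and C bases in a DNA string) as well as every function name are illustrative assumptions, not something taken from the source.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(chunk):
    """Work done by one 'site': count G and C bases in its share of the data."""
    return sum(1 for base in chunk if base in "GC")


def split(sequence, n_parts):
    """Cut the sequence into at most n_parts contiguous chunks of similar size."""
    if not sequence:
        return []
    size = -(-len(sequence) // n_parts)  # ceiling division
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def pooled_gc_count(sequence, n_workers=4):
    """Distribute the chunks over the workers and add up the partial counts."""
    chunks = split(sequence, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(count_gc, chunks))


if __name__ == "__main__":
    dna = "ATGCGC" * 1000
    print(pooled_gc_count(dna))
```

The design point mirrors the power-grid analogy: the caller asks the pool for a result and does not need to know which worker handled which chunk.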

E-science

• It results from the use and application of cyberinfrastructure in scientific practice.

• It is characterized by inter- and multidisciplinarity.

• Collaboration: it involves a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009).

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: "Initial sequencing and analysis of the human genome"

– 79 authors

– 48 institutions

– 181 references

– All authors came from departments of genomic sciences (or genetics), except those from: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: "The Sequence of the Human Genome"

– 276 authors

– 14 institutions

– 452 references

– All authors came from departments of genomic sciences (or genetics), except those from: Department of Biology; Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics.

1953: James Watson and Francis Crick describe the double-helix structure of DNA.

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code.

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology.

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods.

1982: The GenBank database is established.

1985: The polymerase chain reaction (PCR) is invented.

1990: The Human Genome Project (HGP) begins in the United States.

1997: The genome of Escherichia coli is sequenced.

2003: The human genome sequence is completed.

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000).

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

[Figure: Number of publications per year, 1988-2009. X-axis: Year. Y-axis: Publications (ticks 0 to 3,500). Annotation: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut, published in Pathology and Immunopathology Research in 1988.]

[Figure: Number of publications by institution. X-axis: Institution. Y-axis: Publications (ticks 0 to 600).]

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses.

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research.

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter¹, Mark D. Adams¹, Eugene W. Myers¹, Peter W. Li¹, Richard J. Mural¹, Granger G. Sutton¹, Hamilton O. Smith¹, Mark Yandell¹, Cheryl A. Evans¹, Robert A. Holt¹, Jeannine D. Gocayne¹, Peter Amanatides¹, Richard M. Ballew¹, Daniel H. Huson¹, Jennifer Russo Wortman¹, Qing Zhang¹, Chinnappa D. Kodira¹, Xiangqun H. Zheng¹, Lin Chen¹, Marian Skupski¹, Gangadharan Subramanian¹, Paul D. Thomas¹, Jinghui Zhang¹, George L. Gabor Miklos², Catherine Nelson³, Samuel Broder¹, Andrew G. Clark⁴, Joe Nadeau⁵, Victor A. McKusick⁶, Norton Zinder⁷, Arnold J. Levine⁷, Richard J. Roberts⁸, Mel Simon⁹, Carolyn Slayman¹⁰, Michael Hunkapiller¹¹, Randall Bolanos¹, Arthur Delcher¹, Ian Dew¹, Daniel Fasulo¹, Michael Flanigan¹, Liliana Florea¹, Aaron Halpern¹, Sridhar Hannenhalli¹, Saul Kravitz¹, Samuel Levy¹, Clark Mobarry¹, Knut Reinert¹, Karin Remington¹, Jane Abu-Threideh¹, Ellen Beasley¹, Kendra Biddick¹, Vivien Bonazzi¹, Rhonda Brandon¹, Michele Cargill¹, Ishwar Chandramouliswaran¹, Rosane Charlab¹, Kabir Chaturvedi¹, Zuoming Deng¹, Valentina Di Francesco¹, Patrick Dunn¹, Karen Eilbeck¹, Carlos Evangelista¹, Andrei E. Gabrielian¹, Weiniu Gan¹, Wangmao Ge¹, Fangcheng Gong¹, Zhiping Gu¹, Ping Guan¹, Thomas J. Heiman¹, Maureen E. Higgins¹, Rui-Ru Ji¹, Zhaoxi Ke¹, Karen A. Ketchum¹, Zhongwu Lai¹, Yiding Lei¹, Zhenya Li¹, Jiayin Li¹, Yong Liang¹, Xiaoying Lin¹, Fu Lu¹, Gennady V. Merkulov¹, Natalia Milshina¹, Helen M. Moore¹, Ashwinikumar K. Naik¹, Vaibhav A. Narayan¹, Beena Neelam¹, Deborah Nusskern¹, Douglas B. Rusch¹, Steven Salzberg¹², Wei Shao¹, Bixiong Shue¹, Jingtao Sun¹, Zhen Yuan Wang¹, Aihui Wang¹, Xin Wang¹, Jian Wang¹, Ming-Hui Wei¹, Ron Wides¹³, Chunlin Xiao¹, Chunhua Yan¹, Alison Yao¹, Jane Ye¹, Ming Zhan¹, Weiqing Zhang¹, Hongyu Zhang¹, Qi Zhao¹, Liansheng Zheng¹, Fei Zhong¹, Wenyan Zhong¹, Shiaoping C. Zhu¹, Shaying Zhao¹², Dennis Gilbert¹, Suzanna Baumhueter¹, Gene Spier¹, Christine Carter¹, Anibal Cravchik¹, Trevor Woodage¹, Feroze Ali¹, Huijin An¹, Aderonke Awe¹, Danita Baldwin¹, Holly Baden¹, Mary Barnstead¹, Ian Barrow¹, Karen Beeson¹, Dana Busam¹, Amy Carver¹, Angela Center¹, Ming Lai Cheng¹, Liz Curry¹, Steve Danaher¹, Lionel Davenport¹, Raymond Desilets¹, Susanne Dietz¹, Kristina Dodson¹, Lisa Doup¹, Steven Ferriera¹, Neha Garg¹, Andres Gluecksmann¹, Brit Hart¹, Jason Haynes¹, Charles Haynes¹, Cheryl Heiner¹, Suzanne Hladun¹, Damon Hostin¹, Jarrett Houck¹, Timothy Howland¹, Chinyere Ibegwam¹, Jeffery Johnson¹, Francis Kalush¹, Lesley Kline¹, Shashi Koduru¹, Amy Love¹, Felecia Mann¹, David May¹, Steven McCawley¹, Tina McIntosh¹, Ivy McMullen¹, Mee Moy¹, Linda Moy¹, Brian Murphy¹, Keith Nelson¹, Cynthia Pfannkoch¹, Eric Pratts¹, Vinita Puri¹, Hina Qureshi¹, Matthew Reardon¹, Robert Rodriguez¹, Yu-Hui Rogers¹, Deanna Romblad¹, Bob Ruhfel¹, Richard Scott¹, Cynthia Sitter¹, Michelle Smallwood¹, Erin Stewart¹, Renee Strong¹, Ellen Suh¹, Reginald Thomas¹, Ni Ni Tint¹, Sukyee Tse¹, Claire Vech¹, Gary Wang¹, Jeremy Wetter¹, Sherita Williams¹, Monica Williams¹, Sandra Windsor¹, Emily Winn-Deen¹, Keriellen Wolfe¹, Jayshree Zaveri¹, Karena Zaveri¹, Josep F. Abril¹⁴, Roderic Guigó¹⁴, Michael J. Campbell¹, Kimmen V. Sjolander¹, Brian Karlak¹, Anish Kejariwal¹, Huaiyu Mi¹, Betty Lazareva¹, Thomas Hatton¹, Apurva Narechania¹, Karen Diemer¹, Anushya Muruganujan¹, Nan Guo¹, Shinji Sato¹, Vineet Bafna¹, Sorin Istrail¹, Ross Lippert¹, Russell Schwartz¹, Brian Walenz¹, Shibu Yooseph¹, David Allen¹, Anand Basu¹, James Baxendale¹, Louis Blick¹, Marcelo Caminha¹, John Carnes-Stine¹, Parris Caulk¹, Yen-Hui Chiang¹, My Coyne¹, Carl Dahlke¹, Anne Deslattes Mays¹, Maria Dombroski¹, Michael Donnelly¹, Dale Ely¹, Shiva Esparham¹, Carl Fosler¹, Harold Gire¹, Stephen Glanowski¹, Kenneth Glasser¹, Anna Glodek¹, Mark Gorokhov¹, Ken Graham¹, Barry Gropman¹, Michael Harris¹, Jeremy Heil¹, Scott Henderson¹, Jeffrey Hoover¹, Donald Jennings¹, Catherine Jordan¹, James Jordan¹, John Kasha¹, Leonid Kagan¹, Cheryl Kraft¹, Alexander Levitsky¹, Mark Lewis¹, Xiangjun Liu¹, John Lopez¹, Daniel Ma¹, William Majoros¹, Joe McDaniel¹, Sean Murphy¹, Matthew Newman¹, Trung Nguyen¹, Ngoc Nguyen¹, Marc Nodell¹, Sue Pan¹, Jim Peck¹, Marshall Peterson¹, William Rowe¹, Robert Sanders¹, John Scott¹, Michael Simpson¹, Thomas Smith¹, Arlan Sprague¹, Timothy Stockwell¹, Russell Turner¹, Eli Venter¹, Mei Wang¹, Meiyuan Wen¹, David Wu¹, Mitchell Wu¹, Ashley Xia¹, Ali Zandieh¹, Xiaohong Zhu¹

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
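Several figures in the abstract can be checked against one another with simple arithmetic. The sketch below uses the values from the published abstract (a 2.91-billion bp consensus, 14.8 billion bp of reads, 27,271,853 reads, one difference per 1250 bp); the function names are illustrative assumptions. Dividing total read bases by the consensus length gives roughly 5.1, close to the 5.11-fold coverage quoted; the small gap presumably reflects rounding or the genome size used in the paper, which the abstract does not state.

```python
# Values taken from the abstract of Venter et al. (2001).
CONSENSUS_BP = 2.91e9   # length of the consensus sequence
TOTAL_READ_BP = 14.8e9  # total bases in all sequence reads
N_READS = 27_271_853    # number of high-quality reads
BP_PER_DIFFERENCE = 1250  # one differing base per 1250 bp between two haploid genomes


def fold_coverage(total_read_bp, genome_bp):
    """Average number of times each base of the genome is covered by a read."""
    return total_read_bp / genome_bp


def mean_read_length(total_read_bp, n_reads):
    """Average read length in base pairs."""
    return total_read_bp / n_reads


def expected_differences(genome_bp, bp_per_difference):
    """Expected number of differing bases between two haploid genomes."""
    return genome_bp / bp_per_difference


if __name__ == "__main__":
    print(f"Coverage: {fold_coverage(TOTAL_READ_BP, CONSENSUS_BP):.2f}-fold")
    print(f"Mean read length: {mean_read_length(TOTAL_READ_BP, N_READS):.0f} bp")
    print(f"Expected differences: {expected_differences(CONSENSUS_BP, BP_PER_DIFFERENCE):,.0f}")
    print(f"Exonic sequence at 1.1%: {0.011 * CONSENSUS_BP / 1e6:.1f} Mb")
```

The same division explains the "eightfold" figure: adding the 2.9-fold shredded public data to the 5.11-fold Celera reads gives about 8-fold effective coverage.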

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and as a whole by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. The enormous number of measurements combined with the limited number of samples leads to problems of statistics, bias, methodology, and improper use of the method.

• Lay, J. Jr., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
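One of the statistical problems that arises when thousands of parameters are measured on few samples is multiple testing: the more hypotheses are tested, the more false positives appear by chance alone. The sketch below is a simplified illustration under the assumption that all tests are independent and that no real effect exists; it is not the analysis performed in the cited review, and the function names are illustrative.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of tests that pass the threshold purely by chance."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the overall error rate at most alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for n in (1, 20, 10_000):
        print(
            f"{n:6d} tests: "
            f"expected false positives = {expected_false_positives(n):.1f}, "
            f"P(at least one) = {prob_at_least_one_false_positive(n):.3f}, "
            f"Bonferroni threshold = {bonferroni_threshold(n):.2e}"
        )
```

With 10,000 measured parameters and the usual 0.05 threshold, about 500 apparent "hits" are expected even when nothing is truly associated with the disease, which is one reason follow-up studies may fail to confirm initial findings.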

OMICS: ARTICLES

Genomics: "Genomics: buzzword or reality". http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic". http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics". http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics". http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics". http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics". http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics". http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics". http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics". http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics". http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics". http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging". http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells". http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection". http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509.

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

T. S. Eliot, as quoted in the closing lines of the human genome paper in Nature (2001)

Page 24: Genomica colombia

Coleccioacuten de

secuencias de

genesE-ciencia

Ciberinfraestructura

Genbank

bull Es una coleccioacuten anotada de todas las secuencias de nucleoacutetidos a disposicioacuten del puacuteblico y su traduccioacuten de proteiacutenas

bull Centro Nacional de Informacioacuten Biotecnoloacutegica (NCBI)bull European Molecular Biology Laboratory (EMBL) de datos de

Bibliotecas del Instituto Europeo de Bioinformaacutetica (EBI)bull DNA Data Bank de Japoacuten (DDBJ)bull Reciben las secuencias producidas en laboratorios de todo el

mundo de maacutes de 100000 organismos distintosbull Crece a un ritmo exponencial duplicando cada 10 meses Suelte

134 producido en febrero de 2003 conteniacutea maacutes de 29300 millones de bases nucleotiacutedicas en maacutes de 230 millones de secuencias

bull Se construye mediante el enviacuteo directo de los distintos laboratorios y de los centros de secuenciacioacuten a gran escala

httpwwwncbinlmnihgovgenbankgenbankstatshtml

Olson M Hood L Cantor C Botstein D A common language for physical mapping of the human genome Science 1989 245(4925) 1434ndash1435 [PubMed]

0

100000

200000

300000

400000

500000

600000

700000

800000

1865

1870

1875

1880

1885

1890

1895

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Documentos en PubMed (NIH)

Cerca de 20 millones octubre 2010)

e-science cyberinfraestructure

bull e-science (europe)

bull United Kingdoms Office of Science and Technology in 1999

bull Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

bull cyberinfraestructure (USA)bull United States National Science

Foundation (NSF) blue-ribbon committee in 2003

bull Describes the new research environments that support advanced data acquisition data storage data management data integration data mining data visualization and other computing and information processing services over the Internet

Ciberinfraestructura

bullEntorno tecnoloacutegico-social que permite crear difundir y preservar los datos informacioacuten y conocimientos mediante la adquisicioacuten almacenamiento gestioacuten integracioacuten informaacutetica mineriacutea visualizacioacuten y otros servicios a traveacutes de Internet (NSF 2003 2007)

bullIncluye un conjunto interoperable de diversos elementos

ndash1) Infraestructura los sistemas computacionales (hardware software y redes) servicios instrumentos y herramientas

ndash2) Colecciones de datos

ndash3) Grupos virtuales de investigacioacuten (colaboratorios y observatorios)

e-investigacioacutenbull Actividades de investigacioacuten que utilizan una gama de capacidades avanzadas de las TIC y abarca

nuevas metodologiacuteas de investigacioacuten que salen de un mayor acceso a

Las comunicaciones de banda ancha de redes instrumentos de investigacioacuten y las instalaciones redes de sensores y repositorios de datos

Software y servicios de infraestructura que permitan garantizar la conectividad e interoperabilidad

Aplicacioacuten herramientas que abarcan la disciplina de instrumentos especiacuteficos y herramientas de interaccioacuten

avanzar y aumentar en lugar de reemplazar las tradicionales metodologiacuteas de investigacioacuten

bull permitiraacute a los investigadores para llevar a cabo su labor de investigacioacuten maacutes creativa eficiente y colaboracioacuten a larga distancia y difundir sus resultados de la investigacioacuten con un mayor efecto

bull Colaboracioacuten

Nuevos campos de investigacioacuten emergentes utilizando nuevas teacutecnicas de mineriacutea de datos y el anaacutelisis avanzados algoritmos computacionales y de redes de intercambio de recursos

e-Science bull Originally referred to experiments that connected together a few powerful

computers located at different sites and later a very large number of modest PCs across the world in order to undertake enormous calculations or process huge amounts of data The coordination of geographically dispersed computing and data resources has become known as the Grid This is shorthand for the emerging standards and technology ndash hardware and software ndash being developed to enable and simplify the sharing of resources The analogy is an electric power grid which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it

bull What is exciting about the Grid is that the combination of extensive connectivity massive computer power and vast quantities of digitized data ndash all three of which are still rapidly expanding ndash making possible new applications that are orders of magnitude more potent than even a few years ago

bull The term e-research is sometimes used instead of e-science with the advantage that gives more emphasis to the end result of better richer faster or new research results rather than the technologies used to get them

National Centre for e-Social Science 2008 Frequently Asked Questions Diponible en httpwwwncessacukabout_eSSfaqq=General_1General_1

E-ciencia (e-science)

bull Resulta del uso y aplicacioacuten de la Ciberinfraestructura en la praacutectica cientifica

bull Se caracteriza por la inter y multidisciplinariedad

bull Colaboracioacuten la participacioacuten de un gran nuacutemero de investigadores (en algunos casos cientos) localizados en diversas regiones y con diferentes especialidades que se forman grupos trabajo (Hey y Trefethen 2005 Barbera et al2009)

Uno de los primeros proyectos de e-ciencia fue el de el genoma humano se publicoacute en el 2001 en dos artiacuteculos con un diacutea de diferencia en las revistas Nature y Science

NatureInitial sequencing and analysis of the human genome79 Autores48 Instituciones

181 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientesDepartment of Cellular and Structural BiologyDepartment of Molecular GeneticsDepartment of Molecular Biology

Science The Sequence of the Human Genome276 Autores14 Instituciones452 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientes Department of Biology e Informaacutetica Meacutedica

E-ciencia

1865 Gregor Mendel descubre las leyes de la Geneacutetica httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1953 James Watson y Francis Crick describen la estructura de la doble-heacutelice del ADN httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1966 Marshall Nirenberg Har Gobind Khorana y Robert Holley determinan el coacutedigo

geneacutetico

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1972 Stanley Cohen and Herbert Boyer desarrollan la tecnologiacutea del ADN recombinante httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1977 Frederick Sanger Allan Maxam y Walter Gilbert desarrollan meacutetodos de

secuenciacioacuten del ADN

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1982 La base de datos bdquoGenBank‟ es establecida httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1985 Es inventada la Reaccioacuten en Cadena de la Polimerasa (PCR) httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1990 El Proyecto Genoma Humano (HGP) inicia en Estados Unidos httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1997 El genoma de Escherichia coli es secuenciado httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

2003 La secuencia del genoma humano es finalizada httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

La Genoacutemicabull Es el estudio sistemaacutetico de la documentacioacuten completa de

las secuencias del ADN de los organismos (MeSH 2010) dicho estudio ha ayudado a comprender polimorfismos dentro de las especies la interaccioacuten de las proteiacutenas y la evolucioacuten (Brent 2000)

bull Incluye todos los meacutetodos que recopilan y analizan datos completos acerca de los genes incluida las secuencias la abundancia de los aacutecidos nucleicos y las propiedades de las proteiacutenas que codifican (Murray 2000)

bull Nuestra capacidad para estudiar la funcioacuten geacutenica estaacute aumentando en la especificidad gracias a esta nueva disciplina (Collins et aacutel 2003) que se compromete a acelerar el descubrimiento cientiacutefico en todos los aacutembitos de la ciencia bioloacutegica (Burley 2000)

0

500

1000

1500

2000

2500

3000

3500

1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009

Pu

blic

acio

ne

s

Antildeo

Nuacutemero de Publicaciones

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut.

[Figure: number of publications by institution. X-axis: institution; y-axis: publications (0-600).]

Nature 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses.

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062. http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research.

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
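The coverage figures in this abstract can be checked with simple arithmetic. The sketch below is only illustrative: the constants are the values quoted in the abstract, while the function names are hypothetical helpers introduced here, and the calculation assumes the simple relation coverage = total sequenced bases / genome size.

```python
# Figures quoted in the abstract (Venter et al., Science 2001).
TOTAL_READ_BASES = 14.8e9   # bp of sequence generated by Celera
NUM_READS = 27_271_853      # high-quality sequence reads
CELERA_COVERAGE = 5.11      # fold coverage from Celera's own reads
PUBLIC_COVERAGE = 2.9       # fold coverage from the shredded public data
CONSENSUS_LENGTH = 2.91e9   # bp in the euchromatic consensus sequence
BP_PER_DIFFERENCE = 1250    # one differing bp per 1250 bp, on average


def mean_read_length(total_bases: float, num_reads: int) -> float:
    """Average number of bases per sequence read."""
    return total_bases / num_reads


def implied_genome_size(total_bases: float, fold_coverage: float) -> float:
    """Genome size implied by coverage = total bases / genome size."""
    return total_bases / fold_coverage


def effective_coverage(*fold_coverages: float) -> float:
    """Combined fold coverage when independent data sets are pooled."""
    return sum(fold_coverages)


def expected_differences(genome_length: float, bp_per_difference: float) -> float:
    """Expected number of differing sites between two haploid genomes."""
    return genome_length / bp_per_difference


if __name__ == "__main__":
    print(f"Mean read length: {mean_read_length(TOTAL_READ_BASES, NUM_READS):.0f} bp")
    print(f"Implied genome size: {implied_genome_size(TOTAL_READ_BASES, CELERA_COVERAGE):.3e} bp")
    print(f"Effective coverage: {effective_coverage(CELERA_COVERAGE, PUBLIC_COVERAGE):.2f}-fold")
    print(f"Expected differences: {expected_differences(CONSENSUS_LENGTH, BP_PER_DIFFERENCE):,.0f}")
```

Under these assumptions the two coverage figures add up to about 8, which matches the "eightfold" effective coverage stated in the abstract, and the reads average a little over 540 bp each.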

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes and metabolites, the corresponding approaches are proteomics, genomics and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters instead of just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g. genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
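The statistical problem mentioned in the last bullet (many measurements, few samples) can be illustrated with a small multiple-testing calculation. This is a minimal sketch under strong assumptions that are not taken from the cited paper: every test is independent, no measured parameter has a real effect, and a conventional significance level of 0.05 is used. The function names are hypothetical, and the Bonferroni correction is shown only as one well-known remedy.

```python
def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests that look significant by chance alone."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of one or more spurious hits across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    for num_tests in (1, 20, 10_000):
        print(
            f"{num_tests:>6} tests: "
            f"expected false positives = {expected_false_positives(num_tests):.1f}, "
            f"P(at least one) = {prob_at_least_one_false_positive(num_tests):.3f}, "
            f"Bonferroni threshold = {bonferroni_threshold(num_tests):.2e}"
        )
```

With thousands of measured parameters, hundreds of apparent associations are expected by chance under these assumptions, which is one reason follow-up studies may fail to confirm initial findings.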

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality"
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"
http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME project PE 201509.

Acknowledgements

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we explore the human genome, the more there is left to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot, 2001


GenBank

• An annotated collection of all publicly available nucleotide sequences and their protein translations.

• National Center for Biotechnology Information (NCBI)

• European Molecular Biology Laboratory (EMBL) Data Library at the European Bioinformatics Institute (EBI)

• DNA Data Bank of Japan (DDBJ)

• These receive sequences produced in laboratories around the world, from more than 100,000 distinct organisms.

• It grows at an exponential rate, doubling every 10 months. Release 134, produced in February 2003, contained more than 29.3 billion nucleotide bases in more than 23.0 million sequences.

• It is built from direct submissions by individual laboratories and by large-scale sequencing centers.

http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html
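The growth claim above (doubling every 10 months) corresponds to the exponential model N(t) = N0 * 2^(t / 10), with t in months. The sketch below shows what that model implies when started from the Release 134 figure. It is an illustration of the stated doubling time only, not a reproduction of actual GenBank statistics, and the function names are hypothetical.

```python
import math

RELEASE_134_BASES = 29.3e9     # nucleotide bases in Release 134 (February 2003)
DOUBLING_TIME_MONTHS = 10.0    # doubling period stated on the slide


def projected_size(initial: float, months: float,
                   doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Size after `months` under constant exponential doubling."""
    return initial * 2.0 ** (months / doubling_months)


def months_to_grow(factor: float,
                   doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Months needed for the collection to grow by `factor`."""
    return doubling_months * math.log2(factor)


if __name__ == "__main__":
    for months in (0, 10, 20, 60):
        bases = projected_size(RELEASE_134_BASES, months)
        print(f"After {months:>2} months: {bases:.3e} bases")
    print(f"Months to grow tenfold: {months_to_grow(10):.1f}")
```

A constant doubling time is a simplifying assumption; real growth rates change over time, so any projection far from the starting date should be checked against the published release statistics.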

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science. 1989;245(4925):1434-1435.

[Figure: documents in PubMed (NIH) by year, 1865-2005; y-axis: documents (0-800,000). About 20 million documents (October 2010).]

e-science, cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet.

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet.

Cyberinfrastructure

• A technological and social environment that makes it possible to create, disseminate and preserve data, information and knowledge through acquisition, storage, management, integration, computing, mining, visualization and other services over the Internet (NSF 2003, 2007).

• It comprises an interoperable set of diverse elements:

- 1) Infrastructure: computing systems (hardware, software and networks), services, instruments and tools

- 2) Data collections

- 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

- Broadband communication networks, research instruments and facilities, sensor networks and data repositories

- Software and infrastructure services that ensure connectivity and interoperability

- Application tools that encompass discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies.

• It will enable researchers to carry out their research more creatively, efficiently and collaboratively across long distances, and to disseminate their research results with greater effect.

• Collaboration

• New fields of research are emerging that use new data mining and analysis techniques, advanced computational algorithms and resource-sharing networks.

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology - hardware and software - being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power and vast quantities of digitized data - all three of which are still rapidly expanding - is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster or new research results rather than the technologies used to get them.

National Centre for e-Social Science, 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• Results from the use and application of cyberinfrastructure in scientific practice.

• Characterized by inter- and multidisciplinarity.

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009).

One of the first e-science projects was the Human Genome Project, published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome. 79 authors, 48 institutions, 181 references. All authors come from departments of Genomic Sciences (or genetics), except the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology.

Science: The Sequence of the Human Genome. 276 authors, 14 institutions, 452 references. All authors come from departments of Genomic Sciences (or genetics), except the following: Department of Biology and Medical Informatics.

E-science

1865 Gregor Mendel discovers the laws of genetics.

1953 James Watson and Francis Crick describe the double-helix structure of DNA.

1966 Marshall Nirenberg, Har Gobind Khorana and Robert Holley determine the genetic code.

1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology.

1977 Frederick Sanger, Allan Maxam and Walter Gilbert develop DNA sequencing methods.

1982 The GenBank database is established.

1985 The polymerase chain reaction (PCR) is invented.

1990 The Human Genome Project (HGP) begins in the United States.

1997 The genome of Escherichia coli is sequenced.

2003 The human genome sequence is completed.

Source: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 


http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

Olson M, Hood L, Cantor C, Botstein D. A common language for physical mapping of the human genome. Science 1989; 245(4925): 1434-1435.

[Figure: Documents in PubMed (NIH) by year, 1865-2005; vertical axis from 0 to 800,000 documents. About 20 million documents in total (October 2010).]

e-science and cyberinfrastructure

• e-science (Europe)

– United Kingdom's Office of Science and Technology, 1999

– Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

• cyberinfrastructure (USA)

– United States National Science Foundation (NSF) blue-ribbon committee, 2003

– Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)

• It comprises an interoperable set of elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communications networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools, covering discipline-specific tools and interaction tools

• It advances and augments, rather than replaces, traditional research methodologies

• It will allow researchers to carry out their research more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration: new fields of research emerge from new data-mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than to the technologies used to get them.

National Centre for e-Social Science, 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1
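The idea of splitting one large calculation across many modest machines and pooling the partial results can be illustrated on a very small scale. The following is only a sketch under stated assumptions: worker threads on a single machine stand in for geographically dispersed computers, and the function names (`count_gc`, `split_into_chunks`, `gc_fraction`) and the choice of GC content as the task are hypothetical, not part of any Grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(chunk):
    """Count G and C bases in one chunk of a DNA sequence."""
    return sum(1 for base in chunk if base in "GC")


def split_into_chunks(sequence, size):
    """Cut the sequence into consecutive pieces of at most `size` bases."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def gc_fraction(sequence, chunk_size=4, workers=3):
    """Distribute the chunks to workers, then combine the partial counts."""
    chunks = split_into_chunks(sequence, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker processes its chunks independently (the "distributed" step).
        partial_counts = list(pool.map(count_gc, chunks))
    # The partial results are pooled into one answer (the "shared" step).
    return sum(partial_counts) / len(sequence)


print(gc_fraction("GGCCATATGC"))
```

The design mirrors the power-grid analogy in the text: no single worker sees the whole problem, and the caller only sees the pooled result.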

E-science

• Results from the use and application of cyberinfrastructure in scientific practice

• Characterized by inter- and multidisciplinarity

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. It was published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome

• 79 authors

• 48 institutions

• 181 references

• All authors come from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology, Department of Molecular Genetics, Department of Molecular Biology

Science: The Sequence of the Human Genome

• 276 authors

• 14 institutions

• 452 references

• All authors come from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865 Gregor Mendel discovers the laws of genetics

1953 James Watson and Francis Crick describe the double-helix structure of DNA

1966 Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977 Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982 The GenBank database is established

1985 The polymerase chain reaction (PCR) is invented

1990 The Human Genome Project (HGP) begins in the United States

1997 The genome of Escherichia coli is sequenced

2003 The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). It has helped to elucidate polymorphisms within species, protein interactions, and evolution (Brent 2000).

• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

[Figure: Number of publications by year, 1988-2009 (horizontal axis: year; vertical axis: publications, 0 to 3,500). Annotation: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology, Yale University, New Haven, Connecticut, in Pathology and Immunopathology Research, 1988.]

[Figure: Number of publications by institution (horizontal axis: institution; vertical axis: publications, 0 to 600).]

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses.

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062 http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute Center for Genome Research.

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
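The figures quoted in this abstract can be related to one another with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that fold coverage means total sequenced bases divided by genome size. Because it uses the rounded values from the abstract, the coverage it prints is close to, but not exactly, the reported 5.11-fold.

```python
# Illustrative arithmetic only; function names are hypothetical and the
# inputs are the rounded figures quoted in the abstract.

def mean_read_length(total_bases, n_reads):
    """Average number of bases per sequence read."""
    return total_bases / n_reads


def fold_coverage(total_bases, genome_size):
    """Fold coverage, assumed here to be sequenced bases / genome size."""
    return total_bases / genome_size


def expected_differences(genome_size, bp_per_difference):
    """Expected number of differing sites between two haploid genomes."""
    return genome_size / bp_per_difference


TOTAL_BASES = 14.8e9    # bp of sequence generated
N_READS = 27_271_853    # high-quality sequence reads
GENOME_SIZE = 2.91e9    # bp in the consensus sequence

read_len = mean_read_length(TOTAL_BASES, N_READS)
coverage = fold_coverage(TOTAL_BASES, GENOME_SIZE)
differences = expected_differences(GENOME_SIZE, 1250)

print(f"mean read length: {read_len:.0f} bp")
print(f"fold coverage: {coverage:.2f}")
print(f"expected pairwise differences: {differences:,.0f}")
```

With these inputs the mean read length comes out at roughly 543 bp, and a rate of 1 difference per 1250 bp over 2.91 billion bp corresponds to about 2.3 million differing sites between two haploid genomes.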

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, typically proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes in a single sample, studies that attempt to measure thousands of parameters instead of just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent studies that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and misuse of the method.

Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
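One statistical reason why many measurements combined with few samples lead to findings that fail to replicate is multiple testing. The sketch below is a minimal illustration, not a method taken from the cited article: the function names are hypothetical, the tests are assumed to be independent, and the Bonferroni correction is brought in here only as one common, conservative remedy.

```python
# Illustrative sketch of the multiple-testing problem; function names are
# hypothetical and the tests are assumed to be independent.

def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if no tested effect is real."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


n_parameters = 10_000   # e.g. thousands of analytes measured per sample
alpha = 0.05            # conventional per-test significance level

print(expected_false_positives(n_parameters, alpha))
print(prob_at_least_one_false_positive(20, alpha))
print(bonferroni_threshold(n_parameters, alpha))
```

Testing 10,000 parameters at the 0.05 level would be expected to flag about 500 of them by chance alone, which is why an uncorrected initial "hit" may not be confirmed by a later study.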

OMICS: ARTICLE (URL)

Genomics: Genomics: buzzword or reality (http://view.ncbi.nlm.nih.gov/pubmed/10343163)

Metabolomics: Metabolomic (http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM)

Proteomics: Proteomics (http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee)

Transcriptomics: Transcriptomics (http://view.ncbi.nlm.nih.gov/pubmed/18336229)

Vaccinomics: Vaccinomics (http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf)

Oncogenomics: Oncogenomics (http://www.ncbi.nlm.nih.gov/pubmed/18336222)

Pharmacogenomics: Pharmacogenomics (http://www.ncbi.nlm.nih.gov/pubmed/18336223)

Epigenomics: Epigenomics (http://www.ncbi.nlm.nih.gov/pubmed/18336226)

Toxicogenomics: Toxicogenomics (http://www.ncbi.nlm.nih.gov/pubmed/18336230)

Kinomics: Kinomics (http://www.ncbi.nlm.nih.gov/pubmed/18336231)

Physiomics: Physiomics (http://www.ncbi.nlm.nih.gov/pubmed/18336232)

Cytomics: Cytomics (http://www.ncbi.nlm.nih.gov/pubmed/18336233)

Postgenomics: Tracking the shift to postgenomics (http://dx.doi.org/10.1159/000092656)

Glycomics: Probing glycomics (http://dx.doi.org/10.1016/j.cbpa.2006.11.040)

Lipidomics: Lipidomics is emerging (http://view.ncbi.nlm.nih.gov/pubmed/14643793)

Cellunomics: Cellunomics: the interaction analysis of cells (http://view.ncbi.nlm.nih.gov/pubmed/19887340)

Phylogenomics: Phylogenomics: evolution and genomics intersection (http://view.ncbi.nlm.nih.gov/pubmed/19778869)

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot (quoted in Nature, 2001)

Page 27: Genomica colombia

Olson M Hood L Cantor C Botstein D A common language for physical mapping of the human genome Science 1989 245(4925) 1434ndash1435 [PubMed]

0

100000

200000

300000

400000

500000

600000

700000

800000

1865

1870

1875

1880

1885

1890

1895

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Documentos en PubMed (NIH)

Cerca de 20 millones octubre 2010)

e-science cyberinfraestructure

bull e-science (europe)

bull United Kingdoms Office of Science and Technology in 1999

bull Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

bull cyberinfraestructure (USA)bull United States National Science

Foundation (NSF) blue-ribbon committee in 2003

bull Describes the new research environments that support advanced data acquisition data storage data management data integration data mining data visualization and other computing and information processing services over the Internet

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007)
• Comprises an interoperable set of elements:
– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools
– 2) Data collections
– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:
– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories
– Software and infrastructure services that ensure connectivity and interoperability
– Application tools, covering discipline-specific tools and interaction tools
• Advances and augments, rather than replaces, traditional research methodologies
• Will enable researchers to carry out their work more creatively, efficiently, and collaboratively across long distances, and to disseminate their results with greater effect
• Collaboration
• New fields of research emerging from new data-mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.
• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) makes possible new applications that are orders of magnitude more potent than even a few years ago.
• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1
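The Grid idea, many modest machines each processing a share of a large dataset and contributing to a shared result, can be imitated on a single computer. The sketch below is only an analogy under stated assumptions: local threads stand in for geographically dispersed computers, and the function names and the toy DNA string are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(chunk):
    """Work unit done by one 'node': count G and C bases in its chunk."""
    return sum(1 for base in chunk if base in "GC")


def split(sequence, n_parts):
    """Cut a non-empty sequence into contiguous chunks of near-equal size."""
    size = -(-len(sequence) // n_parts)  # ceiling division
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def distributed_gc_fraction(sequence, n_workers=4):
    """Farm the chunks out to workers, then pool the partial results."""
    chunks = split(sequence, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_gc, chunks))
    return sum(partial_counts) / len(sequence)


if __name__ == "__main__":
    toy = "ATGCGC" * 1000  # hypothetical data, not a real genome
    print(distributed_gc_fraction(toy))
```

The pooled answer equals what a single pass over the whole string would give; the point is only that the work divides into independent pieces whose results are combined at the end.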

E-science

• Results from the use and application of cyberinfrastructure in scientific practice
• Is characterized by inter- and multidisciplinarity
• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: "Initial sequencing and analysis of the human genome"
– 79 authors
– 48 institutions
– 181 references
– All authors came from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: "The Sequence of the Human Genome"
– 276 authors
– 14 institutions
– 452 references
– All authors came from departments of genomic sciences (or genetics), except for the following: Department of Biology; Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics
1953: James Watson and Francis Crick describe the double-helix structure of DNA
1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code
1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology
1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods
1982: The GenBank database is established
1985: The polymerase chain reaction (PCR) is invented
1990: The Human Genome Project (HGP) begins in the United States
1997: The genome of Escherichia coli is sequenced
2003: The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)
• Includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)
• Our ability to study gene function is becoming increasingly specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: Number of publications per year, 1988 to 2009; vertical axis (Publications) from 0 to 3,500; horizontal axis: Year.]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology, Yale University, New Haven, Connecticut

[Figure: Number of publications by institution; vertical axis (Publications) from 0 to 600; horizontal axis: Institution.]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses
Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research
Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J Craig Venter1, Mark D Adams1, Eugene W Myers1, Peter W Li1, Richard J Mural1, Granger G Sutton1, Hamilton O Smith1, Mark Yandell1, Cheryl A Evans1, Robert A Holt1, Jeannine D Gocayne1, Peter Amanatides1, Richard M Ballew1, Daniel H Huson1, Jennifer Russo Wortman1, Qing Zhang1, Chinnappa D Kodira1, Xiangqun H Zheng1, Lin Chen1, Marian Skupski1, Gangadharan Subramanian1, Paul D Thomas1, Jinghui Zhang1, George L Gabor Miklos2, Catherine Nelson3, Samuel Broder1, Andrew G Clark4, Joe Nadeau5, Victor A McKusick6, Norton Zinder7, Arnold J Levine7, Richard J Roberts8, Mel Simon9, Carolyn Slayman10, Michael Hunkapiller11, Randall Bolanos1, Arthur Delcher1, Ian Dew1, Daniel Fasulo1, Michael Flanigan1, Liliana Florea1, Aaron Halpern1, Sridhar Hannenhalli1, Saul Kravitz1, Samuel Levy1, Clark Mobarry1, Knut Reinert1, Karin Remington1, Jane Abu-Threideh1, Ellen Beasley1, Kendra Biddick1, Vivien Bonazzi1, Rhonda Brandon1, Michele Cargill1, Ishwar Chandramouliswaran1, Rosane Charlab1, Kabir Chaturvedi1, Zuoming Deng1, Valentina Di Francesco1, Patrick Dunn1, Karen Eilbeck1, Carlos Evangelista1, Andrei E Gabrielian1, Weiniu Gan1, Wangmao Ge1, Fangcheng Gong1, Zhiping Gu1, Ping Guan1, Thomas J Heiman1, Maureen E Higgins1, Rui-Ru Ji1, Zhaoxi Ke1, Karen A Ketchum1, Zhongwu Lai1, Yiding Lei1, Zhenya Li1, Jiayin Li1, Yong Liang1, Xiaoying Lin1, Fu Lu1, Gennady V Merkulov1, Natalia Milshina1, Helen M Moore1, Ashwinikumar K Naik1, Vaibhav A Narayan1, Beena Neelam1, Deborah Nusskern1, Douglas B Rusch1, Steven Salzberg12, Wei Shao1, Bixiong Shue1, Jingtao Sun1, Zhen Yuan Wang1, Aihui Wang1, Xin Wang1, Jian Wang1, Ming-Hui Wei1, Ron Wides13, Chunlin Xiao1, Chunhua Yan1, Alison Yao1, Jane Ye1, Ming Zhan1, Weiqing Zhang1, Hongyu Zhang1, Qi Zhao1, Liansheng Zheng1, Fei Zhong1, Wenyan Zhong1, Shiaoping C Zhu1, Shaying Zhao12, Dennis Gilbert1, Suzanna Baumhueter1, Gene Spier1, Christine Carter1, Anibal Cravchik1, Trevor Woodage1, Feroze Ali1, Huijin An1, Aderonke Awe1, Danita Baldwin1, Holly Baden1, Mary Barnstead1, Ian Barrow1, Karen Beeson1, Dana Busam1, Amy Carver1, Angela Center1, Ming Lai Cheng1, Liz Curry1, Steve Danaher1, Lionel Davenport1, Raymond Desilets1, Susanne Dietz1, Kristina Dodson1, Lisa Doup1, Steven Ferriera1, Neha Garg1, Andres Gluecksmann1, Brit Hart1, Jason Haynes1, Charles Haynes1, Cheryl Heiner1, Suzanne Hladun1, Damon Hostin1, Jarrett Houck1, Timothy Howland1, Chinyere Ibegwam1, Jeffery Johnson1, Francis Kalush1, Lesley Kline1, Shashi Koduru1, Amy Love1, Felecia Mann1, David May1, Steven McCawley1, Tina McIntosh1, Ivy McMullen1, Mee Moy1, Linda Moy1, Brian Murphy1, Keith Nelson1, Cynthia Pfannkoch1, Eric Pratts1, Vinita Puri1, Hina Qureshi1, Matthew Reardon1, Robert Rodriguez1, Yu-Hui Rogers1, Deanna Romblad1, Bob Ruhfel1, Richard Scott1, Cynthia Sitter1, Michelle Smallwood1, Erin Stewart1, Renee Strong1, Ellen Suh1, Reginald Thomas1, Ni Ni Tint1, Sukyee Tse1, Claire Vech1, Gary Wang1, Jeremy Wetter1, Sherita Williams1, Monica Williams1, Sandra Windsor1, Emily Winn-Deen1, Keriellen Wolfe1, Jayshree Zaveri1, Karena Zaveri1, Josep F Abril14, Roderic Guigó14, Michael J Campbell1, Kimmen V Sjolander1, Brian Karlak1, Anish Kejariwal1, Huaiyu Mi1, Betty Lazareva1, Thomas Hatton1, Apurva Narechania1, Karen Diemer1, Anushya Muruganujan1, Nan Guo1, Shinji Sato1, Vineet Bafna1, Sorin Istrail1, Ross Lippert1, Russell Schwartz1, Brian Walenz1, Shibu Yooseph1, David Allen1, Anand Basu1, James Baxendale1, Louis Blick1, Marcelo Caminha1, John Carnes-Stine1, Parris Caulk1, Yen-Hui Chiang1, My Coyne1, Carl Dahlke1, Anne Deslattes Mays1, Maria Dombroski1, Michael Donnelly1, Dale Ely1, Shiva Esparham1, Carl Fosler1, Harold Gire1, Stephen Glanowski1, Kenneth Glasser1, Anna Glodek1, Mark Gorokhov1, Ken Graham1, Barry Gropman1, Michael Harris1, Jeremy Heil1, Scott Henderson1, Jeffrey Hoover1, Donald Jennings1, Catherine Jordan1, James Jordan1, John Kasha1, Leonid Kagan1, Cheryl Kraft1, Alexander Levitsky1, Mark Lewis1, Xiangjun Liu1, John Lopez1, Daniel Ma1, William Majoros1, Joe McDaniel1, Sean Murphy1, Matthew Newman1, Trung Nguyen1, Ngoc Nguyen1, Marc Nodell1, Sue Pan1, Jim Peck1, Marshall Peterson1, William Rowe1, Robert Sanders1, John Scott1, Michael Simpson1, Thomas Smith1, Arlan Sprague1, Timothy Stockwell1, Russell Turner1, Eli Venter1, Mei Wang1, Meiyuan Wen1, David Wu1, Mitchell Wu1, Ashley Xia1, Ali Zandieh1, Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
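The coverage figures in the abstract can be checked with simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome length. The sketch below uses the abstract's figures with their decimal points restored (2.91 billion bp, 14.8 billion bp, 5.11-fold, 2.9-fold). Treating the 2.91-billion-bp consensus as the genome length is an assumption made here for illustration, which is why the result (about 5.1) is close to, but not identical to, the reported 5.11.

```python
# Figures taken from the Science abstract (Venter et al. 2001).
TOTAL_BASES = 14.8e9      # bp of sequence generated by Celera
NUM_READS = 27_271_853    # high-quality sequence reads
CONSENSUS_BP = 2.91e9     # consensus length, used as genome size (assumption)
CELERA_COVERAGE = 5.11    # fold coverage reported for the Celera reads
PUBLIC_COVERAGE = 2.9     # fold coverage added by the shredded public data


def fold_coverage(total_bases, genome_bp):
    """Average number of times each base of the genome was sequenced."""
    return total_bases / genome_bp


def mean_read_length(total_bases, num_reads):
    """Average length of one sequence read, in bp."""
    return total_bases / num_reads


if __name__ == "__main__":
    print(round(fold_coverage(TOTAL_BASES, CONSENSUS_BP), 2))
    print(round(mean_read_length(TOTAL_BASES, NUM_READS)))
    print(round(CELERA_COVERAGE + PUBLIC_COVERAGE, 2))
```

The last line adds the two reported coverages, which gives the roughly eightfold effective coverage mentioned in the abstract.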

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with diseases. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes in a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and as a whole by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of how often genomics research (associating groups of genes with complex diseases) can be reproduced, which compared 370 studies, revealed a high frequency of subsequent studies that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of samples, problems arise relating to statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
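The statistical problem of many measurements on few samples can be made concrete with a textbook calculation. Assume m independent tests in which no true effect exists, each run at significance level alpha: the expected number of false positives is m times alpha, and a Bonferroni correction tests each parameter at alpha/m instead. The function names and the numbers below are illustrative assumptions and are not taken from Lay et al. (2006).

```python
def expected_false_positives(m, alpha=0.05):
    """Expected number of false positives among m true-null tests."""
    return m * alpha


def prob_at_least_one_false_positive(m, alpha=0.05):
    """Family-wise error rate for m independent true-null tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(m, alpha=0.05):
    """Per-test level that keeps the family-wise error rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    m = 10_000  # hypothetical number of measured parameters
    print(expected_false_positives(m))
    print(prob_at_least_one_false_positive(m))
    print(bonferroni_threshold(m))
```

With thousands of parameters and an uncorrected threshold, hundreds of spurious "hits" are expected even when nothing is there, which is one reason follow-up studies often fail to confirm initial findings.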

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality"
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"
http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com


Page 28: Genomica colombia

0

100000

200000

300000

400000

500000

600000

700000

800000

1865

1870

1875

1880

1885

1890

1895

1900

1905

1910

1915

1920

1925

1930

1935

1940

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

Documentos en PubMed (NIH)

Cerca de 20 millones octubre 2010)

e-science cyberinfraestructure

bull e-science (europe)

bull United Kingdoms Office of Science and Technology in 1999

bull Will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet

bull cyberinfraestructure (USA)bull United States National Science

Foundation (NSF) blue-ribbon committee in 2003

bull Describes the new research environments that support advanced data acquisition data storage data management data integration data mining data visualization and other computing and information processing services over the Internet

Ciberinfraestructura

bullEntorno tecnoloacutegico-social que permite crear difundir y preservar los datos informacioacuten y conocimientos mediante la adquisicioacuten almacenamiento gestioacuten integracioacuten informaacutetica mineriacutea visualizacioacuten y otros servicios a traveacutes de Internet (NSF 2003 2007)

bullIncluye un conjunto interoperable de diversos elementos

ndash1) Infraestructura los sistemas computacionales (hardware software y redes) servicios instrumentos y herramientas

ndash2) Colecciones de datos

ndash3) Grupos virtuales de investigacioacuten (colaboratorios y observatorios)

e-investigacioacutenbull Actividades de investigacioacuten que utilizan una gama de capacidades avanzadas de las TIC y abarca

nuevas metodologiacuteas de investigacioacuten que salen de un mayor acceso a

Las comunicaciones de banda ancha de redes instrumentos de investigacioacuten y las instalaciones redes de sensores y repositorios de datos

Software y servicios de infraestructura que permitan garantizar la conectividad e interoperabilidad

Aplicacioacuten herramientas que abarcan la disciplina de instrumentos especiacuteficos y herramientas de interaccioacuten

avanzar y aumentar en lugar de reemplazar las tradicionales metodologiacuteas de investigacioacuten

bull permitiraacute a los investigadores para llevar a cabo su labor de investigacioacuten maacutes creativa eficiente y colaboracioacuten a larga distancia y difundir sus resultados de la investigacioacuten con un mayor efecto

bull Colaboracioacuten

Nuevos campos de investigacioacuten emergentes utilizando nuevas teacutecnicas de mineriacutea de datos y el anaacutelisis avanzados algoritmos computacionales y de redes de intercambio de recursos

e-Science bull Originally referred to experiments that connected together a few powerful

computers located at different sites and later a very large number of modest PCs across the world in order to undertake enormous calculations or process huge amounts of data The coordination of geographically dispersed computing and data resources has become known as the Grid This is shorthand for the emerging standards and technology ndash hardware and software ndash being developed to enable and simplify the sharing of resources The analogy is an electric power grid which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it

bull What is exciting about the Grid is that the combination of extensive connectivity massive computer power and vast quantities of digitized data ndash all three of which are still rapidly expanding ndash making possible new applications that are orders of magnitude more potent than even a few years ago

bull The term e-research is sometimes used instead of e-science with the advantage that gives more emphasis to the end result of better richer faster or new research results rather than the technologies used to get them

National Centre for e-Social Science 2008 Frequently Asked Questions Diponible en httpwwwncessacukabout_eSSfaqq=General_1General_1

E-ciencia (e-science)

bull Resulta del uso y aplicacioacuten de la Ciberinfraestructura en la praacutectica cientifica

bull Se caracteriza por la inter y multidisciplinariedad

bull Colaboracioacuten la participacioacuten de un gran nuacutemero de investigadores (en algunos casos cientos) localizados en diversas regiones y con diferentes especialidades que se forman grupos trabajo (Hey y Trefethen 2005 Barbera et al2009)

Uno de los primeros proyectos de e-ciencia fue el de el genoma humano se publicoacute en el 2001 en dos artiacuteculos con un diacutea de diferencia en las revistas Nature y Science

NatureInitial sequencing and analysis of the human genome79 Autores48 Instituciones

181 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientesDepartment of Cellular and Structural BiologyDepartment of Molecular GeneticsDepartment of Molecular Biology

Science The Sequence of the Human Genome276 Autores14 Instituciones452 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientes Department of Biology e Informaacutetica Meacutedica

E-ciencia

1865 Gregor Mendel descubre las leyes de la Geneacutetica httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1953 James Watson y Francis Crick describen la estructura de la doble-heacutelice del ADN httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1966 Marshall Nirenberg Har Gobind Khorana y Robert Holley determinan el coacutedigo

geneacutetico

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1972 Stanley Cohen and Herbert Boyer desarrollan la tecnologiacutea del ADN recombinante httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1977 Frederick Sanger Allan Maxam y Walter Gilbert desarrollan meacutetodos de

secuenciacioacuten del ADN

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1982 La base de datos bdquoGenBank‟ es establecida httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1985 Es inventada la Reaccioacuten en Cadena de la Polimerasa (PCR) httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1990 El Proyecto Genoma Humano (HGP) inicia en Estados Unidos httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1997 El genoma de Escherichia coli es secuenciado httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

2003 La secuencia del genoma humano es finalizada httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

La Genoacutemicabull Es el estudio sistemaacutetico de la documentacioacuten completa de

las secuencias del ADN de los organismos (MeSH 2010) dicho estudio ha ayudado a comprender polimorfismos dentro de las especies la interaccioacuten de las proteiacutenas y la evolucioacuten (Brent 2000)

bull Incluye todos los meacutetodos que recopilan y analizan datos completos acerca de los genes incluida las secuencias la abundancia de los aacutecidos nucleicos y las propiedades de las proteiacutenas que codifican (Murray 2000)

bull Nuestra capacidad para estudiar la funcioacuten geacutenica estaacute aumentando en la especificidad gracias a esta nueva disciplina (Collins et aacutel 2003) que se compromete a acelerar el descubrimiento cientiacutefico en todos los aacutembitos de la ciencia bioloacutegica (Burley 2000)

0

500

1000

1500

2000

2500

3000

3500

1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009

Pu

blic

acio

ne

s

Antildeo

Nuacutemero de Publicaciones

Pathology and

immunopathology

research en 1988 The

genomics of human

homeobox-containing

loci‟ escrito por CA

Ferguson-Smith y FH

Ruddle del

Departamento de

Biologiacutea en la

Universidad de Yale

New Haven

Connecticut

0

100

200

300

400

500

600

Pu

blic

acio

ne

s

Institucioacuten

Nuacutemero de Publicaciones

Nature 2001 Feb 15409(6822)860-921Initial sequencing and analysis of the human genome

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001: Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
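The coverage figures in this abstract can be checked with simple arithmetic. The sketch below is only an illustration: the constants are the values quoted in the abstract (2.91 billion bp consensus, 14.8 billion bp of reads, 27,271,853 reads, 5.11-fold and 2.9-fold coverage, 1 difference per 1250 bp), while the function and variable names are invented here. Dividing the read total by the consensus length gives roughly 5.1-fold, slightly below the quoted 5.11-fold, presumably because of rounding or a different genome-size estimate.

```python
# Figures quoted in the Venter et al. (2001) abstract.
CONSENSUS_BP = 2.91e9    # euchromatic consensus sequence length (bp)
READ_BP = 14.8e9         # total bases in the high-quality Celera reads
N_READS = 27_271_853     # number of high-quality reads
CELERA_FOLD = 5.11       # coverage reported for Celera's own reads
PUBLIC_FOLD = 2.9        # coverage added by the shredded public data
SNP_SPACING_BP = 1250    # one difference per 1250 bp between two haploid genomes


def fold_coverage(total_read_bp: float, genome_bp: float) -> float:
    """Average number of times each base of the genome is sequenced."""
    return total_read_bp / genome_bp


def mean_read_length(total_read_bp: float, n_reads: int) -> float:
    """Average number of bases per read."""
    return total_read_bp / n_reads


coverage = fold_coverage(READ_BP, CONSENSUS_BP)    # roughly 5.1-fold
read_len = mean_read_length(READ_BP, N_READS)      # roughly 543 bp per read
effective = CELERA_FOLD + PUBLIC_FOLD              # roughly 8, the "eightfold" figure
pairwise_diffs = CONSENSUS_BP / SNP_SPACING_BP     # roughly 2.3 million positions

print(f"coverage ~ {coverage:.2f}x, mean read ~ {read_len:.0f} bp")
print(f"effective coverage ~ {effective:.2f}x, pairwise differences ~ {pairwise_diffs:,.0f}")
```

The sum of the two coverage values is what the abstract calls the effective "eightfold" coverage of the combined assemblies.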

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• “Omics” studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term “omics” derives from the Greek suffix “ome”, meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with diseases. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern “omics” experiments.

• The ultimate goal of “omics” approaches is to understand biological processes comprehensively and as a whole by identifying and correlating the various “players” (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study on the likelihood of reproducing research in genomics (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the “omics”. TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
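One statistical problem behind the reproducibility issue on this slide is multiple testing: when thousands of parameters are measured, some will look significant by chance alone. The sketch below is an illustration and not the analysis from the cited study. The number of tests (10,000), the threshold (0.05), and all function names are assumptions chosen for the example, and the Bonferroni correction is shown only as one common remedy.

```python
import random


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of chance 'hits' when no measured parameter has a real effect."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the overall false-positive rate near alpha."""
    return alpha / n_tests


def simulate_null_hits(n_tests: int, alpha: float, seed: int = 0) -> int:
    """Count 'significant' results when every null hypothesis is true.

    Under the null hypothesis a p-value is uniform on [0, 1], so each
    test is a chance hit with probability alpha.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_tests))


N_TESTS = 10_000   # hypothetical number of analytes measured in one sample set
ALPHA = 0.05       # conventional significance threshold

expected = expected_false_positives(N_TESTS, ALPHA)   # about 500 chance hits
simulated = simulate_null_hits(N_TESTS, ALPHA)        # close to the expected value
corrected = bonferroni_threshold(N_TESTS, ALPHA)      # much stricter per-test cutoff

print(f"expected chance hits: {expected:.0f}, simulated: {simulated}")
print(f"Bonferroni per-test threshold: {corrected:.1e}")
```

With few samples and many measurements, a follow-up study has little reason to reproduce such chance hits, which is consistent with the pattern the slide describes.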

OMICS | ARTICLE | URL

Genomics | Genomics: buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656

Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

“It has not escaped our notice that the more we learn about the human genome, the more there is to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time.”

Thomas Stearns Eliot, as quoted in Nature, 2001


e-science, cyberinfrastructure

• e-science (Europe)

• United Kingdom's Office of Science and Technology, 1999

• Refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet.

• cyberinfrastructure (USA)

• United States National Science Foundation (NSF) blue-ribbon committee, 2003

• Describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet.

Cyberinfrastructure

• A techno-social environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services over the Internet (NSF 2003, 2007).

• Comprises an interoperable set of diverse elements:

– 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools

– 2) Data collections

– 3) Virtual research groups (collaboratories and observatories)

e-research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies emerging from increased access to:

– Broadband communication networks, research instruments and facilities, sensor networks, and data repositories

– Software and infrastructure services that ensure connectivity and interoperability

– Application tools that encompass discipline-specific instruments and interaction tools

• Advances and augments, rather than replaces, traditional research methodologies

• Will enable researchers to carry out their work more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect

• Collaboration

New research fields are emerging that use new data mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science, 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1
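The idea of pooling many modest machines for one large calculation can be sketched on a single computer. The example below is a toy under stated assumptions and is not Grid software: local threads stand in for remote machines, and the task (counting G and C bases in a DNA string), the sequence, and the function names are all invented for illustration. It shows only the pattern of splitting a job, processing the pieces independently, and combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split(seq: str, n_parts: int) -> list[str]:
    """Cut a sequence into at most n_parts contiguous chunks of similar size."""
    size = max(1, -(-len(seq) // n_parts))  # ceiling division, at least 1
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def count_gc(chunk: str) -> int:
    """Work done by one 'node': count the G and C bases in its chunk."""
    return sum(base in "GC" for base in chunk)


def pooled_gc_count(seq: str, n_workers: int = 4) -> int:
    """Distribute the chunks over a pool of workers and add up their answers."""
    chunks = split(seq, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_gc, chunks))
    return sum(partial_counts)


sequence = "ATGCGC" * 1000          # hypothetical 6,000-base sequence
total_gc = pooled_gc_count(sequence)
print(total_gc)
```

Because each chunk is counted independently, the pooled result matches what a single worker would compute on the whole sequence; a real Grid adds the scheduling, data transfer, and security layers that this sketch leaves out.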

E-science

• Results from the use and application of cyberinfrastructure in scientific practice.

• Is characterized by inter- and multidisciplinarity.

• Collaboration: the participation of a large number of researchers (in some cases hundreds) located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009).

One of the first e-science projects was the Human Genome Project, published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome

79 authors

48 institutions

181 references

All authors came from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: The Sequence of the Human Genome

276 authors

14 institutions

452 references

All authors came from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

E-science

1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The Escherichia coli genome is sequenced

2003: The human genome sequence is completed

Source for all timeline entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010); this study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000).

• Includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

[Figure: Number of publications per year, 1988-2009. x-axis: Year; y-axis: Publications (0 to 3500). Annotation on the chart: Pathology and Immunopathology Research, 1988, “The genomics of human homeobox-containing loci”, written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut.]
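A per-year publication count like the one charted here could be gathered from PubMed. The sketch below is an assumption about method, since the slides do not say how the counts were obtained. It builds one NCBI E-utilities ESearch request per year and parses the returned count. The search term `genomics[Title/Abstract]` is a hypothetical choice, and the JSON shape used in `parse_count` (`esearchresult.count`) is assumed from the ESearch JSON output and should be checked against a live response. No network request is made in the example.

```python
import json
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def yearly_count_url(term: str, year: int) -> str:
    """Build an ESearch URL that asks only for the number of PubMed hits in one year."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",     # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "rettype": "count",     # return the hit count, not the record IDs
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


def parse_count(response_text: str) -> int:
    """Extract the hit count from an ESearch JSON response (assumed shape)."""
    return int(json.loads(response_text)["esearchresult"]["count"])


# Hypothetical query covering the years shown on the chart's x-axis.
TERM = "genomics[Title/Abstract]"
urls = {year: yearly_count_url(TERM, year) for year in range(1988, 2010)}

print(urls[1988])
```

Fetching each URL (for example with `urllib.request.urlopen`) and passing the body to `parse_count` would give one value per year; NCBI's usage guidelines on request rate apply to any real run.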

[Figure: Number of publications by institution. x-axis: Institution; y-axis: Publications (0 to 600).]


bull

bull Layjr J Liyanage R Borgmann S amp Wilkins C (2006) Problems with the ldquoomicsrdquo TrAC Trendsin Analytical Chemistry 25(11) 1046-1056 doi 101016jtrac200610007

OMICrsquoS ARTICULO

Genomics Genomics buzzword or reality

httpviewncbinlmnihgovpubmed10343163

Metabolomics Metabolomic

httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM

Proteomics Proteomics

httpdxdoiorg10109701mnh000007969189474ee

Transcriptomics Transcriptomics

httpviewncbinlmnihgovpubmed18336229

Vaccinomics Vaccinomics

httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf

Oncogenomics Oncogenomics

httpwwwncbinlmnihgovpubmed18336222

Pharmacogenomics Pharmacogenomics

httpwwwncbinlmnihgovpubmed18336223

Epigenomics Epigenomics

httpwwwncbinlmnihgovpubmed18336226

Toxicogenomics Toxicogenomics

httpwwwncbinlmnihgovpubmed18336230

Kinomics Kinomics

httpwwwncbinlmnihgovpubmed18336231

Physiomics Physiomics

httpwwwncbinlmnihgovpubmed18336232

Cytomics Cytomics

httpwwwncbinlmnihgovpubmed18336233

Postgenomics Tracking the shift to postgenomics

httpdxdoiorg101159000092656

Glycomics Probing glycomics

httpdxdoiorg101016jcbpa200611040

Lipidomics Lipidomics is emerging

httpviewncbinlmnihgovpubmed14643793

Cellunomic Cellunomics the interaction analysis of cells

httpviewncbinlmnihgovpubmed19887340

Phylogenomics Phylogenomics evolution and genomics intersection

httpviewncbinlmnihgovpubmed19778869

httpbiiiogeekblogspotcom

httplaylamichanunamblogspotcom

bull Este proyecto se lleva a cabo gracias al financiamiento de

DGAPA UNAMProyecto PAPIME PE 201509

Agradecimientos

bull Laura Montoya

bull Lourdes Valencia

bull Fernando Galicia

bull Roberto Calderoacuten

bull Lyssania Maciacuteas

No se nos ha escapado a nuestra atencioacuten que

cuaacutento maacutes exploramos el genoma humano

maacutes nos queda por explorar No cesaremos de

explorar Pues al final de toda exploracioacuten

llegaremos donde empezamos y conoceremos

cuaacutel es nuestro lugar por primera vezldquo

Thomas Stearns Eliot 2001

Page 30: Genomica colombia

Cyberinfrastructure

• A socio-technological environment that makes it possible to create, disseminate, and preserve data, information, and knowledge through acquisition, storage, management, integration, computing, mining, visualization, and other services delivered over the Internet (NSF 2003, 2007)
• It comprises an interoperable set of diverse elements:
  – 1) Infrastructure: computing systems (hardware, software, and networks), services, instruments, and tools
  – 2) Data collections
  – 3) Virtual research groups (collaboratories and observatories)

e-Research

• Research activities that use a range of advanced ICT capabilities and embrace new research methodologies arising from increased access to:
  – Broadband communication networks, research instruments and facilities, sensor networks, and data repositories
  – Software and infrastructure services that ensure connectivity and interoperability
  – Application tools, covering both discipline-specific tools and interaction tools
• It advances and augments, rather than replaces, traditional research methodologies
• It will allow researchers to carry out their work more creatively, efficiently, and collaboratively across long distances, and to disseminate their research results with greater effect
• Collaboration: new research fields are emerging that use new data-mining and analysis techniques, advanced computational algorithms, and resource-sharing networks

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.
• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power, and vast quantities of digitized data (all three of which are still rapidly expanding) is making possible new applications that are orders of magnitude more potent than even a few years ago.
• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster, or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

e-Science

• Results from the use and application of cyberinfrastructure in scientific practice
• Is characterized by inter- and multidisciplinarity
• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009)

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors came from departments of genomic sciences (or genetics), except for the following: Department of Cellular and Structural Biology; Department of Molecular Genetics; Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors came from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics

e-Science

1865 Gregor Mendel discovers the laws of genetics
1953 James Watson and Francis Crick describe the double-helix structure of DNA
1966 Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code
1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology
1977 Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods
1982 The GenBank database is established
1985 The polymerase chain reaction (PCR) is invented
1990 The Human Genome Project (HGP) begins in the United States
1997 The Escherichia coli genome is sequenced
2003 The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• The systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped to explain polymorphisms within species, protein interactions, and evolution (Brent 2000)
• It includes all methods that collect and analyze comprehensive data about genes, including sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)
• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: Number of publications per year, 1988 to 2009. x-axis: Year; y-axis: Publications (0 to 3500).]

Figure annotation: Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut.

[Figure: Number of publications by institution. x-axis: Institution; y-axis: Publications (0 to 600).]

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine, and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany, and China to produce a draft sequence of the human genome.
• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses
Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research
Nature 409, 860-921 (15 February 2001), doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040
• REVIEW
• The Sequence of the Human Genome
• J. Craig Venter¹, Mark D. Adams¹, Eugene W. Myers¹, Peter W. Li¹, Richard J. Mural¹, Granger G. Sutton¹, Hamilton O. Smith¹, Mark Yandell¹, Cheryl A. Evans¹, Robert A. Holt¹, Jeannine D. Gocayne¹, Peter Amanatides¹, Richard M. Ballew¹, Daniel H. Huson¹, Jennifer Russo Wortman¹, Qing Zhang¹, Chinnappa D. Kodira¹, Xiangqun H. Zheng¹, Lin Chen¹, Marian Skupski¹, Gangadharan Subramanian¹, Paul D. Thomas¹, Jinghui Zhang¹, George L. Gabor Miklos², Catherine Nelson³, Samuel Broder¹, Andrew G. Clark⁴, Joe Nadeau⁵, Victor A. McKusick⁶, Norton Zinder⁷, Arnold J. Levine⁷, Richard J. Roberts⁸, Mel Simon⁹, Carolyn Slayman¹⁰, Michael Hunkapiller¹¹, Randall Bolanos¹, Arthur Delcher¹, Ian Dew¹, Daniel Fasulo¹, Michael Flanigan¹, Liliana Florea¹, Aaron Halpern¹, Sridhar Hannenhalli¹, Saul Kravitz¹, Samuel Levy¹, Clark Mobarry¹, Knut Reinert¹, Karin Remington¹, Jane Abu-Threideh¹, Ellen Beasley¹, Kendra Biddick¹, Vivien Bonazzi¹, Rhonda Brandon¹, Michele Cargill¹, Ishwar Chandramouliswaran¹, Rosane Charlab¹, Kabir Chaturvedi¹, Zuoming Deng¹, Valentina Di Francesco¹, Patrick Dunn¹, Karen Eilbeck¹, Carlos Evangelista¹, Andrei E. Gabrielian¹, Weiniu Gan¹, Wangmao Ge¹, Fangcheng Gong¹, Zhiping Gu¹, Ping Guan¹, Thomas J. Heiman¹, Maureen E. Higgins¹, Rui-Ru Ji¹, Zhaoxi Ke¹, Karen A. Ketchum¹, Zhongwu Lai¹, Yiding Lei¹, Zhenya Li¹, Jiayin Li¹, Yong Liang¹, Xiaoying Lin¹, Fu Lu¹, Gennady V. Merkulov¹, Natalia Milshina¹, Helen M. Moore¹, Ashwinikumar K. Naik¹, Vaibhav A. Narayan¹, Beena Neelam¹, Deborah Nusskern¹, Douglas B. Rusch¹, Steven Salzberg¹², Wei Shao¹, Bixiong Shue¹, Jingtao Sun¹, Zhen Yuan Wang¹, Aihui Wang¹, Xin Wang¹, Jian Wang¹, Ming-Hui Wei¹, Ron Wides¹³, Chunlin Xiao¹, Chunhua Yan¹, Alison Yao¹, Jane Ye¹, Ming Zhan¹, Weiqing Zhang¹, Hongyu Zhang¹, Qi Zhao¹, Liansheng Zheng¹, Fei Zhong¹, Wenyan Zhong¹, Shiaoping C. Zhu¹, Shaying Zhao¹², Dennis Gilbert¹, Suzanna Baumhueter¹, Gene Spier¹, Christine Carter¹, Anibal Cravchik¹, Trevor Woodage¹, Feroze Ali¹, Huijin An¹, Aderonke Awe¹, Danita Baldwin¹, Holly Baden¹, Mary Barnstead¹, Ian Barrow¹, Karen Beeson¹, Dana Busam¹, Amy Carver¹, Angela Center¹, Ming Lai Cheng¹, Liz Curry¹, Steve Danaher¹, Lionel Davenport¹, Raymond Desilets¹, Susanne Dietz¹, Kristina Dodson¹, Lisa Doup¹, Steven Ferriera¹, Neha Garg¹, Andres Gluecksmann¹, Brit Hart¹, Jason Haynes¹, Charles Haynes¹, Cheryl Heiner¹, Suzanne Hladun¹, Damon Hostin¹, Jarrett Houck¹, Timothy Howland¹, Chinyere Ibegwam¹, Jeffery Johnson¹, Francis Kalush¹, Lesley Kline¹, Shashi Koduru¹, Amy Love¹, Felecia Mann¹, David May¹, Steven McCawley¹, Tina McIntosh¹, Ivy McMullen¹, Mee Moy¹, Linda Moy¹, Brian Murphy¹, Keith Nelson¹, Cynthia Pfannkoch¹, Eric Pratts¹, Vinita Puri¹, Hina Qureshi¹, Matthew Reardon¹, Robert Rodriguez¹, Yu-Hui Rogers¹, Deanna Romblad¹, Bob Ruhfel¹, Richard Scott¹, Cynthia Sitter¹, Michelle Smallwood¹, Erin Stewart¹, Renee Strong¹, Ellen Suh¹, Reginald Thomas¹, Ni Ni Tint¹, Sukyee Tse¹, Claire Vech¹, Gary Wang¹, Jeremy Wetter¹, Sherita Williams¹, Monica Williams¹, Sandra Windsor¹, Emily Winn-Deen¹, Keriellen Wolfe¹, Jayshree Zaveri¹, Karena Zaveri¹, Josep F. Abril¹⁴, Roderic Guigó¹⁴, Michael J. Campbell¹, Kimmen V. Sjolander¹, Brian Karlak¹, Anish Kejariwal¹, Huaiyu Mi¹, Betty Lazareva¹, Thomas Hatton¹, Apurva Narechania¹, Karen Diemer¹, Anushya Muruganujan¹, Nan Guo¹, Shinji Sato¹, Vineet Bafna¹, Sorin Istrail¹, Ross Lippert¹, Russell Schwartz¹, Brian Walenz¹, Shibu Yooseph¹, David Allen¹, Anand Basu¹, James Baxendale¹, Louis Blick¹, Marcelo Caminha¹, John Carnes-Stine¹, Parris Caulk¹, Yen-Hui Chiang¹, My Coyne¹, Carl Dahlke¹, Anne Deslattes Mays¹, Maria Dombroski¹, Michael Donnelly¹, Dale Ely¹, Shiva Esparham¹, Carl Fosler¹, Harold Gire¹, Stephen Glanowski¹, Kenneth Glasser¹, Anna Glodek¹, Mark Gorokhov¹, Ken Graham¹, Barry Gropman¹, Michael Harris¹, Jeremy Heil¹, Scott Henderson¹, Jeffrey Hoover¹, Donald Jennings¹, Catherine Jordan¹, James Jordan¹, John Kasha¹, Leonid Kagan¹, Cheryl Kraft¹, Alexander Levitsky¹, Mark Lewis¹, Xiangjun Liu¹, John Lopez¹, Daniel Ma¹, William Majoros¹, Joe McDaniel¹, Sean Murphy¹, Matthew Newman¹, Trung Nguyen¹, Ngoc Nguyen¹, Marc Nodell¹, Sue Pan¹, Jim Peck¹, Marshall Peterson¹, William Rowe¹, Robert Sanders¹, John Scott¹, Michael Simpson¹, Thomas Smith¹, Arlan Sprague¹, Timothy Stockwell¹, Russell Turner¹, Eli Venter¹, Mei Wang¹, Meiyuan Wen¹, David Wu¹, Mitchell Wu¹, Ashley Xia¹, Ali Zandieh¹, Xiaohong Zhu¹

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
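
The coverage figures quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the inputs are the rounded values from the abstract. Dividing the sequenced bases by the consensus length gives roughly 5.09-fold, close to the reported 5.11-fold; the small gap presumably reflects rounding or a slightly different genome-size denominator, which is an assumption here, not something the abstract states.

```python
# Rounded figures quoted in the abstract (Venter et al. 2001).
GENOME_BP = 2.91e9        # consensus euchromatic sequence, in base pairs
READ_BP = 14.8e9          # total bases in the high-quality reads
N_READS = 27_271_853      # number of high-quality reads
SNP_SPACING_BP = 1250     # one difference per 1250 bp between two haploid genomes


def fold_coverage(sequenced_bp: float, genome_bp: float) -> float:
    """Average number of times each genome position was sequenced."""
    return sequenced_bp / genome_bp


def mean_read_length(sequenced_bp: float, n_reads: int) -> float:
    """Average read length implied by the total bases and the read count."""
    return sequenced_bp / n_reads


def expected_pairwise_differences(genome_bp: float, spacing_bp: int) -> int:
    """Expected single-base differences between two haploid genomes."""
    return round(genome_bp / spacing_bp)


if __name__ == "__main__":
    print(f"coverage: {fold_coverage(READ_BP, GENOME_BP):.2f}-fold")
    print(f"mean read length: {mean_read_length(READ_BP, N_READS):.0f} bp")
    print(f"expected differences: {expected_pairwise_differences(GENOME_BP, SNP_SPACING_BP):,}")
```

The same division explains why adding the shredded public data (2.9-fold) to the Celera reads raises the effective coverage to about eightfold.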

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with diseases. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.
• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.
• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.
• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and inappropriate use of the method.
• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
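
One standard statistical problem behind "many measurements, few samples" is multiple comparisons. The sketch below is a generic illustration, not the analysis from the cited study: it assumes independent tests, and the function names and the example of 10,000 measured parameters are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def familywise_error_rate(n_tests: int, alpha: float) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_parameters = 10_000   # hypothetical number of analytes measured per sample
    alpha = 0.05
    print(expected_false_positives(n_parameters, alpha))
    print(familywise_error_rate(n_parameters, alpha))
    print(bonferroni_threshold(n_parameters, alpha))
```

With thousands of parameters tested at the usual 0.05 level, hundreds of spurious associations are expected by chance alone, which is one reason initial findings can fail to replicate.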

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality" http://view.ncbi.nlm.nih.gov/pubmed/10343163
Metabolomics: "Metabolomic" http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM
Proteomics: "Proteomics" http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee
Transcriptomics: "Transcriptomics" http://view.ncbi.nlm.nih.gov/pubmed/18336229
Vaccinomics: "Vaccinomics" http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf
Oncogenomics: "Oncogenomics" http://www.ncbi.nlm.nih.gov/pubmed/18336222
Pharmacogenomics: "Pharmacogenomics" http://www.ncbi.nlm.nih.gov/pubmed/18336223
Epigenomics: "Epigenomics" http://www.ncbi.nlm.nih.gov/pubmed/18336226
Toxicogenomics: "Toxicogenomics" http://www.ncbi.nlm.nih.gov/pubmed/18336230
Kinomics: "Kinomics" http://www.ncbi.nlm.nih.gov/pubmed/18336231
Physiomics: "Physiomics" http://www.ncbi.nlm.nih.gov/pubmed/18336232
Cytomics: "Cytomics" http://www.ncbi.nlm.nih.gov/pubmed/18336233
Postgenomics: "Tracking the shift to postgenomics" http://dx.doi.org/10.1159/000092656
Glycomics: "Probing glycomics" http://dx.doi.org/10.1016/j.cbpa.2006.11.040
Lipidomics: "Lipidomics is emerging" http://view.ncbi.nlm.nih.gov/pubmed/14643793
Cellunomics: "Cellunomics: the interaction analysis of cells" http://view.ncbi.nlm.nih.gov/pubmed/19887340
Phylogenomics: "Phylogenomics: evolution and genomics intersection" http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya
• Lourdes Valencia
• Fernando Galicia
• Roberto Calderón
• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot, 2001

Page 31: Genomica colombia

e-investigacioacutenbull Actividades de investigacioacuten que utilizan una gama de capacidades avanzadas de las TIC y abarca

nuevas metodologiacuteas de investigacioacuten que salen de un mayor acceso a

Las comunicaciones de banda ancha de redes instrumentos de investigacioacuten y las instalaciones redes de sensores y repositorios de datos

Software y servicios de infraestructura que permitan garantizar la conectividad e interoperabilidad

Aplicacioacuten herramientas que abarcan la disciplina de instrumentos especiacuteficos y herramientas de interaccioacuten

avanzar y aumentar en lugar de reemplazar las tradicionales metodologiacuteas de investigacioacuten

bull permitiraacute a los investigadores para llevar a cabo su labor de investigacioacuten maacutes creativa eficiente y colaboracioacuten a larga distancia y difundir sus resultados de la investigacioacuten con un mayor efecto

bull Colaboracioacuten

Nuevos campos de investigacioacuten emergentes utilizando nuevas teacutecnicas de mineriacutea de datos y el anaacutelisis avanzados algoritmos computacionales y de redes de intercambio de recursos

e-Science bull Originally referred to experiments that connected together a few powerful

computers located at different sites and later a very large number of modest PCs across the world in order to undertake enormous calculations or process huge amounts of data The coordination of geographically dispersed computing and data resources has become known as the Grid This is shorthand for the emerging standards and technology ndash hardware and software ndash being developed to enable and simplify the sharing of resources The analogy is an electric power grid which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it

bull What is exciting about the Grid is that the combination of extensive connectivity massive computer power and vast quantities of digitized data ndash all three of which are still rapidly expanding ndash making possible new applications that are orders of magnitude more potent than even a few years ago

bull The term e-research is sometimes used instead of e-science with the advantage that gives more emphasis to the end result of better richer faster or new research results rather than the technologies used to get them

National Centre for e-Social Science 2008 Frequently Asked Questions Diponible en httpwwwncessacukabout_eSSfaqq=General_1General_1

E-ciencia (e-science)

bull Resulta del uso y aplicacioacuten de la Ciberinfraestructura en la praacutectica cientifica

bull Se caracteriza por la inter y multidisciplinariedad

bull Colaboracioacuten la participacioacuten de un gran nuacutemero de investigadores (en algunos casos cientos) localizados en diversas regiones y con diferentes especialidades que se forman grupos trabajo (Hey y Trefethen 2005 Barbera et al2009)

Uno de los primeros proyectos de e-ciencia fue el de el genoma humano se publicoacute en el 2001 en dos artiacuteculos con un diacutea de diferencia en las revistas Nature y Science

NatureInitial sequencing and analysis of the human genome79 Autores48 Instituciones

181 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientesDepartment of Cellular and Structural BiologyDepartment of Molecular GeneticsDepartment of Molecular Biology

Science The Sequence of the Human Genome276 Autores14 Instituciones452 referenciasTodos los autores provenientes de departamentos de Ciencias Genoacutemicas (o geneacutetica) exceptuando los siguientes Department of Biology e Informaacutetica Meacutedica

E-ciencia

1865 Gregor Mendel descubre las leyes de la Geneacutetica httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1953 James Watson y Francis Crick describen la estructura de la doble-heacutelice del ADN httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1966 Marshall Nirenberg Har Gobind Khorana y Robert Holley determinan el coacutedigo

geneacutetico

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1972 Stanley Cohen and Herbert Boyer desarrollan la tecnologiacutea del ADN recombinante httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1977 Frederick Sanger Allan Maxam y Walter Gilbert desarrollan meacutetodos de

secuenciacioacuten del ADN

httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1982 La base de datos bdquoGenBank‟ es establecida httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1985 Es inventada la Reaccioacuten en Cadena de la Polimerasa (PCR) httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1990 El Proyecto Genoma Humano (HGP) inicia en Estados Unidos httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

1997 El genoma de Escherichia coli es secuenciado httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

2003 La secuencia del genoma humano es finalizada httpwwwnaturecomnaturejournalv42

2n6934pdf timeline_01626pdf

Genomics

• It is the systematic study of the complete record of the DNA sequences of organisms (MeSH 2010). This study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000).

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

Figure: Number of publications per year, 1988 to 2009 (x-axis: Year; y-axis: Publications, 0 to 3,500)

Callout: Pathology and Immunopathology Research, 1988, "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

Figure: Number of publications per institution (x-axis: Institution; y-axis: Publications, 0 to 600)
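Counts like those plotted in the two figures can be tallied from a list of bibliographic records. The following is only a minimal sketch: the records, the institution names, and the helper `count_by` are hypothetical and are not the data or the method behind the slides.

```python
from collections import Counter

# Hypothetical records for illustration only; these are NOT the data behind
# the figures. Each record is (publication_year, institution).
records = [
    (1988, "Institution A"),
    (2001, "Institution A"),
    (2001, "Institution B"),
    (2009, "Institution C"),
]


def count_by(rows, field_index):
    """Count how many records share each value of the chosen field."""
    return Counter(row[field_index] for row in rows)


per_year = count_by(records, 0)         # publications per year
per_institution = count_by(records, 1)  # publications per institution

# Print the yearly series in chronological order.
for year in sorted(per_year):
    print(year, per_year[year])

# Print institutions from most to least productive.
for institution, count in per_institution.most_common():
    print(institution, count)
```

With real data, the records would come from a literature database export; only the tallying step is shown here.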

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

bull J Craig Venter1 Mark D Adams1 Eugene W Myers1 Peter W Li1 Richard J Mural1 Granger G Sutton1 Hamilton O Smith1 Mark Yandell1 Cheryl A Evans1Robert A Holt1 Jeannine D Gocayne1 Peter Amanatides1 Richard M Ballew1 Daniel H Huson1 Jennifer Russo Wortman1 Qing Zhang1Chinnappa D Kodira1 Xiangqun H Zheng1 Lin Chen1 Marian Skupski1 Gangadharan Subramanian1 Paul D Thomas1 Jinghui Zhang1George L Gabor Miklos2 Catherine Nelson3 Samuel Broder1 Andrew G Clark4 Joe Nadeau5 Victor A McKusick6 Norton Zinder7 Arnold J Levine7Richard J Roberts8 MelSimon9 Carolyn Slayman10 Michael Hunkapiller11 Randall Bolanos1 Arthur Delcher1 Ian Dew1 Daniel Fasulo1 Michael Flanigan1Liliana Florea1 Aaron Halpern1 Sridhar Hannenhalli1 Saul Kravitz1 Samuel Levy1 Clark Mobarry1 KnutReinert1 Karin Remington1 Jane Abu-Threideh1Ellen Beasley1 Kendra Biddick1 Vivien Bonazzi1 Rhonda Brandon1 MicheleCargill1 Ishwar Chandramouliswaran1 Rosane Charlab1 Kabir Chaturvedi1Zuoming Deng1 Valentina Di Francesco1 Patrick Dunn1 Karen Eilbeck1 Carlos Evangelista1 Andrei E Gabrielian1 Weiniu Gan1 Wangmao Ge1Fangcheng Gong1 ZhipingGu1 Ping Guan1 Thomas J Heiman1 Maureen E Higgins1 Rui-Ru Ji1 Zhaoxi Ke1 Karen A Ketchum1 Zhongwu Lai1 YidingLei1Zhenya Li1 Jiayin Li1 Yong Liang1 Xiaoying Lin1 Fu Lu1 Gennady V Merkulov1 Natalia Milshina1 Helen M Moore1 Ashwinikumar K Naik1Vaibhav A Narayan1 Beena Neelam1 Deborah Nusskern1 Douglas B Rusch1 Steven Salzberg12 Wei Shao1 Bixiong Shue1 Jingtao Sun1 Zhen Yuan Wang1Aihui Wang1 Xin Wang1 Jian Wang1 Ming-Hui Wei1 Ron Wides13 Chunlin Xiao1 Chunhua Yan1 Alison Yao1 Jane Ye1 Ming Zhan1 Weiqing Zhang1Hongyu Zhang1 QiZhao1 Liansheng Zheng1 Fei Zhong1 Wenyan Zhong1 Shiaoping C Zhu1 Shaying Zhao12 Dennis Gilbert1 SuzannaBaumhueter1Gene Spier1 Christine Carter1 Anibal Cravchik1 Trevor Woodage1 Feroze Ali1 Huijin An1 Aderonke Awe1 Danita Baldwin1 Holly Baden1 Mary Barnstead1Ian Barrow1 Karen Beeson1 Dana Busam1 Amy Carver1 Angela Center1 Ming LaiCheng1 Liz Curry1 Steve Danaher1 Lionel 
Davenport1 Raymond Desilets1Susanne Dietz1 Kristina Dodson1 Lisa Doup1 Steven Ferriera1 Neha Garg1 Andres Gluecksmann1 Brit Hart1 Jason Haynes1 Charles Haynes1 CherylHeiner1Suzanne Hladun1 Damon Hostin1 Jarrett Houck1 Timothy Howland1 Chinyere Ibegwam1 Jeffery Johnson1 Francis Kalush1 Lesley Kline1 Shashi Koduru1Amy Love1 Felecia Mann1 David May1 Steven McCawley1 Tina McIntosh1 IvyMcMullen1 Mee Moy1 Linda Moy1 Brian Murphy1 Keith Nelson1Cynthia Pfannkoch1 Eric Pratts1 Vinita Puri1 HinaQureshi1 Matthew Reardon1 Robert Rodriguez1 Yu-Hui Rogers1 Deanna Romblad1 Bob Ruhfel1Richard Scott1 CynthiaSitter1 Michelle Smallwood1 Erin Stewart1 Renee Strong1 Ellen Suh1 Reginald Thomas1 Ni Ni Tint1 Sukyee Tse1 Claire Vech1Gary Wang1 Jeremy Wetter1 Sherita Williams1 Monica Williams1 Sandra Windsor1 Emily Winn-Deen1 KeriellenWolfe1 Jayshree Zaveri1 Karena Zaveri1Josep F Abril14 Roderic Guigoacute14 Michael J Campbell1 Kimmen V Sjolander1 Brian Karlak1 Anish Kejariwal1 Huaiyu Mi1 Betty Lazareva1 Thomas Hatton1Apurva Narechania1 Karen Diemer1 AnushyaMuruganujan1 Nan Guo1 Shinji Sato1 Vineet Bafna1 Sorin Istrail1 Ross Lippert1 Russell Schwartz1Brian Walenz1 ShibuYooseph1 David Allen1 Anand Basu1 James Baxendale1 Louis Blick1 Marcelo Caminha1 John Carnes-Stine1 ParrisCaulk1Yen-Hui Chiang1 My Coyne1 Carl Dahlke1 Anne Deslattes Mays1 Maria Dombroski1 Michael Donnelly1 Dale Ely1 Shiva Esparham1 Carl Fosler1 Harold Gire1Stephen Glanowski1 Kenneth Glasser1 Anna Glodek1 Mark Gorokhov1 Ken Graham1 Barry Gropman1 Michael Harris1 Jeremy Heil1 Scott Henderson1Jeffrey Hoover1 Donald Jennings1 Catherine Jordan1 James Jordan1 John Kasha1 Leonid Kagan1 Cheryl Kraft1 Alexander Levitsky1 Mark Lewis1Xiangjun Liu1 John Lopez1 Daniel Ma1 William Majoros1 Joe McDaniel1 Sean Murphy1 Matthew Newman1 Trung Nguyen1 Ngoc Nguyen1 Marc Nodell1Sue Pan1 Jim Peck1 Marshall Peterson1 William Rowe1 Robert Sanders1 John Scott1 Michael Simpson1 Thomas Smith1 Arlan Sprague1Timothy Stockwell1 Russell Turner1 Eli Venter1 
Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
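The arithmetic behind the coverage and polymorphism figures in the abstract can be checked with a short sketch. It assumes that fold coverage is simply total read bases divided by genome length, and it uses the consensus length as the genome length, so the result lands near, but not exactly on, the 5.11-fold figure quoted by the authors. The function names and the toy sequence are hypothetical, and `shred` only illustrates fixed-length segmentation, not the procedure actually used in the paper.

```python
# Figures quoted in the abstract (Venter et al., Science 2001).
TOTAL_READ_BASES = 14.8e9   # bp of sequence generated by Celera
CONSENSUS_LENGTH = 2.91e9   # bp in the euchromatic consensus sequence
SHRED_LENGTH = 550          # bp per shredded segment of public data


def fold_coverage(total_bases: float, genome_length: float) -> float:
    """Average number of times each genome position is covered by reads."""
    return total_bases / genome_length


def shred(sequence: str, segment_length: int = SHRED_LENGTH) -> list[str]:
    """Cut a sequence into consecutive, non-overlapping segments.

    Assumption: a simplified stand-in for the paper's shredding step.
    The last segment may be shorter than segment_length.
    """
    return [sequence[i:i + segment_length]
            for i in range(0, len(sequence), segment_length)]


celera_coverage = fold_coverage(TOTAL_READ_BASES, CONSENSUS_LENGTH)
print(f"Read coverage: about {celera_coverage:.2f}-fold")

# Expected differences between two haploid genomes at 1 bp per 1,250 bp.
expected_differences = CONSENSUS_LENGTH / 1250
print(f"Expected differing positions: about {expected_differences:,.0f}")

toy_sequence = "ACGT" * 300   # 1,200 bp toy sequence, not real data
segments = shred(toy_sequence)
print([len(segment) for segment in segments])
```

The same division applied to the shredded public data and the Celera reads together is what the abstract summarizes as an effective eightfold coverage.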

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent studies that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
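One reason an enormous number of measurements causes statistical trouble can be shown with simple arithmetic. This is a minimal sketch under stated assumptions: every measured parameter is tested independently at a significance level of 0.05 and none of them has a real effect. The function names and the numbers of tests are hypothetical, and the Bonferroni correction is shown as one common remedy, not as the method discussed in the cited article.

```python
def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests that pass by chance when no effect is real."""
    return num_tests * alpha


def chance_of_any_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate <= alpha."""
    return alpha / num_tests


# Hypothetical numbers of measured parameters (for example, genes).
for num_tests in (1, 20, 10_000):
    print(
        f"{num_tests:>6} tests: "
        f"expected false positives = {expected_false_positives(num_tests):.1f}, "
        f"P(at least one) = {chance_of_any_false_positive(num_tests):.3f}, "
        f"Bonferroni threshold = {bonferroni_threshold(num_tests):.2e}"
    )
```

With thousands of parameters and few samples, chance findings are therefore expected unless the threshold is tightened or the result is replicated in an independent sample.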

OMICS | ARTICLE | URL

Genomics | Genomics: buzzword or reality? | http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656

Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we learn about the human genome, the more there is to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot, as quoted in the closing of the Nature article, 2001

Page 32: Genomica colombia

e-Science

• Originally referred to experiments that connected together a few powerful computers located at different sites, and later a very large number of modest PCs across the world, in order to undertake enormous calculations or process huge amounts of data. The coordination of geographically dispersed computing and data resources has become known as the Grid. This is shorthand for the emerging standards and technology (hardware and software) being developed to enable and simplify the sharing of resources. The analogy is an electric power grid, which comprises numerous varied resources connected together to contribute power into a shared pool that users can easily access when they need it.

• What is exciting about the Grid is that the combination of extensive connectivity, massive computer power and vast quantities of digitized data, all three of which are still rapidly expanding, is making possible new applications that are orders of magnitude more potent than even a few years ago.

• The term e-research is sometimes used instead of e-science, with the advantage that it gives more emphasis to the end result of better, richer, faster or new research results rather than the technologies used to get them.

National Centre for e-Social Science. 2008. Frequently Asked Questions. Available at http://www.ncess.ac.uk/about_eSS/faq/?q=General_1#General_1

E-science

• It results from the use and application of cyberinfrastructure in scientific practice.

• It is characterized by inter- and multidisciplinarity.

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009).

One of the first e-science projects was the Human Genome Project, whose results were published in 2001 in two articles, one day apart, in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors come from departments of genomic sciences (or genetics), except for the following:
Department of Cellular and Structural Biology
Department of Molecular Genetics
Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors come from departments of genomic sciences (or genetics), except for the following: Department of Biology and Medical Informatics


OMICSbull Los estudios ldquoomicsrdquo estaacuten involucrados en el anaacutelisis de cantidades grandes de paraacutemetros

generalmente proteiacutenas (proteomics) liacutepidos (lipidomics) y metabolitos (metabolomics) El teacutermino ldquoomicsrdquo deriva del sufijo griego ldquoomerdquo que significa muchos o masa Las mediciones se hacen a base de marcadores quiacutemicos que son indicativos de alguacuten evento bioloacutegico (biomarkers) Los valores asociados con los paraacutemetros medidos son investigados con el fin de encontrar una correlacioacuten con enfermedades Cuando los objetos de investigacioacuten son proteinas genes y metabolitos las aproximaciones correspondientes son proteomics genomics y metabolommics

bull En antildeos recientes con el desarrollado de meacutetodos para medir y analizar un nuacutemero muy grande de analitos de una sola muestra se ha popularizado las investigaciones que intentan medir miles de paraacutemetros en vez de soacutelo unos cuantos Esta aproximacioacuten es la que dio pie a los experimentos modernos de ldquoomicsrdquo

bull La meta final de las aproximaciones ldquoomicsrdquo es comprender comprehensiva e integralmente los procesos bioloacutegicos mediante la identificacioacuten y correlacioacuten de varios ldquojugadoresrdquo (eg genes RNA proteiacutenas metabolitos) en vez de estudiar cada uno de ellos de manera individual

bull Lamentablemente un estudio sobre la probabilidad de reproducir investigaciones en genomics (asociar grupos de genes con enfermedades complejas) en el cual se compararon 370 estudios reveloacute una alta frecuencia de investigaciones subsecuentes que no confirmaban los descubrimientos iniciales Debido al enorme nuacutemero de mediciones y el nuacutemero limitado de muestras de investigacioacuten surgen problemas relacionados a la estadiacutestica el sesgo la metodologiacutea y el uso inadecuado del meacutetodo

bull

bull Layjr J Liyanage R Borgmann S amp Wilkins C (2006) Problems with the ldquoomicsrdquo TrAC Trendsin Analytical Chemistry 25(11) 1046-1056 doi 101016jtrac200610007

OMICrsquoS ARTICULO

Genomics Genomics buzzword or reality

httpviewncbinlmnihgovpubmed10343163

Metabolomics Metabolomic

httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM

Proteomics Proteomics

httpdxdoiorg10109701mnh000007969189474ee

Transcriptomics Transcriptomics

httpviewncbinlmnihgovpubmed18336229

Vaccinomics Vaccinomics

httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf

Oncogenomics Oncogenomics

httpwwwncbinlmnihgovpubmed18336222

Pharmacogenomics Pharmacogenomics

httpwwwncbinlmnihgovpubmed18336223

Epigenomics Epigenomics

httpwwwncbinlmnihgovpubmed18336226

Toxicogenomics Toxicogenomics

httpwwwncbinlmnihgovpubmed18336230

Kinomics Kinomics

httpwwwncbinlmnihgovpubmed18336231

Physiomics Physiomics

httpwwwncbinlmnihgovpubmed18336232

Cytomics Cytomics

httpwwwncbinlmnihgovpubmed18336233

Postgenomics Tracking the shift to postgenomics

httpdxdoiorg101159000092656

Glycomics Probing glycomics

httpdxdoiorg101016jcbpa200611040

Lipidomics Lipidomics is emerging

httpviewncbinlmnihgovpubmed14643793

Cellunomic Cellunomics the interaction analysis of cells

httpviewncbinlmnihgovpubmed19887340

Phylogenomics Phylogenomics evolution and genomics intersection

httpviewncbinlmnihgovpubmed19778869

httpbiiiogeekblogspotcom

httplaylamichanunamblogspotcom

bull Este proyecto se lleva a cabo gracias al financiamiento de

DGAPA UNAMProyecto PAPIME PE 201509

Agradecimientos

bull Laura Montoya

bull Lourdes Valencia

bull Fernando Galicia

bull Roberto Calderoacuten

bull Lyssania Maciacuteas

No se nos ha escapado a nuestra atencioacuten que

cuaacutento maacutes exploramos el genoma humano

maacutes nos queda por explorar No cesaremos de

explorar Pues al final de toda exploracioacuten

llegaremos donde empezamos y conoceremos

cuaacutel es nuestro lugar por primera vezldquo

Thomas Stearns Eliot 2001


E-science

• Results from the use and application of cyberinfrastructure in scientific practice.

• Is characterized by inter- and multidisciplinarity.

• Collaboration: the participation of a large number of researchers (in some cases hundreds), located in different regions and with different specialties, who form working groups (Hey and Trefethen 2005; Barbera et al. 2009).

One of the first e-science projects was the Human Genome Project. Its results were published in 2001 in two articles that appeared one day apart in the journals Nature and Science.

Nature: Initial sequencing and analysis of the human genome
79 authors
48 institutions
181 references
All authors come from departments of genomic sciences (or genetics), except for the following:
Department of Cellular and Structural Biology
Department of Molecular Genetics
Department of Molecular Biology

Science: The Sequence of the Human Genome
276 authors
14 institutions
452 references
All authors come from departments of genomic sciences (or genetics), except for the following:
Department of Biology and Medical Informatics

E-science

1865 Gregor Mendel discovers the laws of genetics

1953 James Watson and Francis Crick describe the double-helix structure of DNA

1966 Marshall Nirenberg, Har Gobind Khorana and Robert Holley determine the genetic code

1972 Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977 Frederick Sanger, Allan Maxam and Walter Gilbert develop DNA sequencing methods

1982 The GenBank database is established

1985 The polymerase chain reaction (PCR) is invented

1990 The Human Genome Project (HGP) begins in the United States

1997 The genome of Escherichia coli is sequenced

2003 The human genome sequence is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• It is the systematic study of the complete DNA sequences of organisms (MeSH 2010). This study has helped to understand polymorphisms within species, protein interactions, and evolution (Brent 2000).

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000).

• Our ability to study gene function is increasing in specificity thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000).

[Figure: Number of publications per year, 1988-2009. X-axis: Year. Y-axis: Publications (0 to 3500).]

Figure annotation: Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut.

[Figure: Number of publications by institution. X-axis: Institution. Y-axis: Publications (0 to 600).]

Nature 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses.

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062
http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research.

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
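The coverage figures in the Venter et al. (2001) abstract can be checked with simple arithmetic. The sketch below is only illustrative: the function names are hypothetical, and it assumes the published values of 14.8 billion bp of sequence, 27,271,853 reads, and a 2.91-billion bp consensus as the denominator. The authors may have used a slightly different genome-size estimate, so the result only approximates the reported 5.11-fold coverage.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average number of times each genome position is sequenced."""
    return total_bases / genome_size


def mean_read_length(total_bases: float, n_reads: int) -> float:
    """Average number of bases per sequence read."""
    return total_bases / n_reads


# Values as published in Venter et al. (2001).
TOTAL_BASES = 14.8e9      # bp of sequence generated
N_READS = 27_271_853      # high-quality sequence reads
GENOME_SIZE = 2.91e9      # bp in the consensus sequence (assumed denominator)

celera_coverage = fold_coverage(TOTAL_BASES, GENOME_SIZE)
read_length = mean_read_length(TOTAL_BASES, N_READS)

# Adding the 2.9-fold shredded public data to the reported 5.11-fold
# coverage gives the "eightfold" effective coverage cited in the abstract.
effective_coverage = 5.11 + 2.9

# Exon, intron and intergenic percentages should account for the whole genome
# (the published, rounded values add up to slightly more than 100).
genome_fractions = {"exons": 1.1, "introns": 24.0, "intergenic": 75.0}
total_percent = sum(genome_fractions.values())

if __name__ == "__main__":
    print(f"Coverage from reads: about {celera_coverage:.2f}-fold")
    print(f"Mean read length: about {read_length:.0f} bp")
    print(f"Effective coverage: about {effective_coverage:.1f}-fold")
    print(f"Genome fractions sum to {total_percent:.1f}%")
```

The same two functions apply to any shotgun project: coverage is total sequenced bases divided by genome size, so more reads or longer reads raise it proportionally.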

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, usually proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are made using chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with diseases. When the objects of investigation are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters instead of just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and as an integrated whole by identifying and correlating the various "players" (e.g. genes, RNA, proteins, metabolites), instead of studying each of them individually.

• Unfortunately, a study on the likelihood of reproducing research in genomics (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and improper use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
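The statistical problem raised in the OMICS section (a very large number of measurements and few samples) can be illustrated with a small calculation. This is a minimal sketch and is not taken from Lay et al.: the significance level of 0.05, the figure of 10,000 measured parameters, and the function names are assumptions chosen for illustration, and the Bonferroni correction shown is only one of several possible remedies.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Family-wise error rate for independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test significance level that keeps the family-wise rate <= alpha."""
    return alpha / n_tests


N_PARAMETERS = 10_000   # hypothetical number of analytes measured per sample
ALPHA = 0.05            # conventional significance level (assumption)

if __name__ == "__main__":
    print(expected_false_positives(N_PARAMETERS, ALPHA))
    print(prob_at_least_one_false_positive(20, ALPHA))
    print(bonferroni_threshold(N_PARAMETERS, ALPHA))
```

Under these assumptions, testing 10,000 parameters at 0.05 would be expected to yield about 500 spurious associations by chance alone, which is one plausible reason why follow-up studies often fail to confirm initial findings.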

OMICS | ARTICLE | LINK

Genomics | Genomics: buzzword or reality | http://view.ncbi.nlm.nih.gov/pubmed/10343163
Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM
Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee
Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229
Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf
Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222
Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223
Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226
Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230
Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231
Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232
Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233
Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656
Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040
Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793
Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340
Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, Project PAPIME PE 201509.

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we explore the human genome, the more there is left to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, as quoted in Nature 409, 860-921 (2001)


Mei Wang1 Meiyuan Wen1 David Wu1 Mitchell Wu1 Ashley Xia1 Ali Zandieh1 Xiaohong Zhu1

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1,250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
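
The coverage figures in the abstract above follow from simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome size, and mean read length is the total number of bases divided by the number of reads. The sketch below uses only numbers from the abstract. The function names are hypothetical, and dividing by the consensus length is an assumption; the paper's own coverage figure presumably rests on a slightly different genome-size estimate, so the result is close to it but not identical.

```python
# Back-of-the-envelope checks on figures quoted in the abstract.
# Function names are illustrative and do not come from the paper.

TOTAL_BASES = 14.8e9      # bp of DNA sequence generated
NUM_READS = 27_271_853    # high-quality sequence reads
CONSENSUS_BP = 2.91e9     # consensus length, used here as the genome size (assumption)
SNP_SPACING_BP = 1250     # two haploid genomes differ at about 1 bp per 1,250


def mean_read_length(total_bases: float, num_reads: int) -> float:
    """Average number of bases contributed by one read."""
    return total_bases / num_reads


def fold_coverage(total_bases: float, genome_size: float) -> float:
    """How many times, on average, each genome position was sequenced."""
    return total_bases / genome_size


def expected_differences(genome_size: float, spacing_bp: float) -> float:
    """Expected number of differing sites between two haploid genomes."""
    return genome_size / spacing_bp


if __name__ == "__main__":
    print(f"mean read length: {mean_read_length(TOTAL_BASES, NUM_READS):.0f} bp")
    print(f"fold coverage: {fold_coverage(TOTAL_BASES, CONSENSUS_BP):.2f}")
    print(f"expected differences: {expected_differences(CONSENSUS_BP, SNP_SPACING_BP):,.0f}")
```

Under these assumptions the reads average a little over 540 bp, coverage comes out at roughly fivefold, and two haploid genomes would be expected to differ at about 2.3 million sites.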

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of investigation are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters instead of only a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) instead of studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi: 10.1016/j.trac.2006.10.007
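
The statistical problem described above (an enormous number of measurements on a limited number of samples) can be made concrete with a little arithmetic. This is a minimal sketch, not taken from Lay et al.: it assumes independent tests at a fixed significance level with no true effects present, and the function names are hypothetical. The Bonferroni correction is included only as one common remedy.

```python
# Why measuring thousands of parameters invites spurious findings.
# Assumes independent tests and no real effects; names are illustrative.


def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Expected number of tests that pass the threshold by chance alone."""
    return num_tests * alpha


def familywise_error_rate(num_tests: int, alpha: float) -> float:
    """Probability that at least one test passes by chance alone."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    m, alpha = 1000, 0.05
    print("expected false positives:", expected_false_positives(m, alpha))
    print("chance of at least one:", familywise_error_rate(m, alpha))
    print("Bonferroni per-test threshold:", bonferroni_threshold(m, alpha))
```

With 1,000 parameters tested at the 0.05 level, about 50 would appear significant by chance, which is consistent with follow-up studies frequently failing to confirm initial associations.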

OMICS | ARTICLE | URL
Genomics | Genomics: buzzword or reality? | http://view.ncbi.nlm.nih.gov/pubmed/10343163
Metabolomics | Metabolomic | http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM
Proteomics | Proteomics | http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee
Transcriptomics | Transcriptomics | http://view.ncbi.nlm.nih.gov/pubmed/18336229
Vaccinomics | Vaccinomics | http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf
Oncogenomics | Oncogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336222
Pharmacogenomics | Pharmacogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336223
Epigenomics | Epigenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336226
Toxicogenomics | Toxicogenomics | http://www.ncbi.nlm.nih.gov/pubmed/18336230
Kinomics | Kinomics | http://www.ncbi.nlm.nih.gov/pubmed/18336231
Physiomics | Physiomics | http://www.ncbi.nlm.nih.gov/pubmed/18336232
Cytomics | Cytomics | http://www.ncbi.nlm.nih.gov/pubmed/18336233
Postgenomics | Tracking the shift to postgenomics | http://dx.doi.org/10.1159/000092656
Glycomics | Probing glycomics | http://dx.doi.org/10.1016/j.cbpa.2006.11.040
Lipidomics | Lipidomics is emerging | http://view.ncbi.nlm.nih.gov/pubmed/14643793
Cellunomics | Cellunomics: the interaction analysis of cells | http://view.ncbi.nlm.nih.gov/pubmed/19887340
Phylogenomics | Phylogenomics: evolution and genomics intersection | http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, Proyecto PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there remains to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, quoted in Lander et al., 2001


1865: Gregor Mendel discovers the laws of genetics

1953: James Watson and Francis Crick describe the double-helix structure of DNA

1966: Marshall Nirenberg, Har Gobind Khorana, and Robert Holley determine the genetic code

1972: Stanley Cohen and Herbert Boyer develop recombinant DNA technology

1977: Frederick Sanger, Allan Maxam, and Walter Gilbert develop DNA sequencing methods

1982: The GenBank database is established

1985: The polymerase chain reaction (PCR) is invented

1990: The Human Genome Project (HGP) begins in the United States

1997: The genome of Escherichia coli is sequenced

2003: The sequence of the human genome is completed

Source for all entries: http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf

Genomics

• It is the systematic study of the complete DNA sequences of organisms (MeSH 2010); this study has helped in understanding polymorphisms within species, protein interactions, and evolution (Brent 2000)

• It includes all methods that collect and analyze comprehensive data about genes, including their sequences, the abundance of nucleic acids, and the properties of the proteins they encode (Murray 2000)

• Our ability to study gene function is becoming more specific thanks to this new discipline (Collins et al. 2003), which promises to accelerate scientific discovery in all areas of biological science (Burley 2000)

[Figure: Number of publications per year, 1988 to 2009. x-axis: Year; y-axis: Publications (0 to 3,500).]

Pathology and Immunopathology Research, 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology, Yale University, New Haven, Connecticut

[Figure: Number of publications by institution. x-axis: Institution; y-axis: Publications (0 to 600).]
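
Yearly publication counts like those charted above can be retrieved from PubMed. The slides do not say how the counts were obtained, so the following is only a sketch under assumptions: it uses the `esearch` endpoint of NCBI's E-utilities, and the helper names and any search term passed in (for example `genomics[Title/Abstract]`) are hypothetical choices, not the author's method.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities esearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, year: int) -> str:
    """Build a PubMed query URL restricted to one publication year."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",    # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "retmax": "0",         # only the count is needed, not the ID list
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_count(payload: str) -> int:
    """Extract the hit count from an esearch JSON response."""
    return int(json.loads(payload)["esearchresult"]["count"])


def fetch_count(term: str, year: int) -> int:
    """Query PubMed over the network and return the hit count for one year."""
    with urlopen(build_esearch_url(term, year)) as response:
        return parse_count(response.read().decode("utf-8"))
```

Calling `fetch_count` once per year from 1988 to 2009 would give a series comparable to the chart. NCBI rate-limits E-utilities requests, so a real script should pause between calls.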

Nature. 2001 Feb 15;409(6822):860-921. Initial sequencing and analysis of the human genome.

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ; International Human Genome Sequencing Consortium.

Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. lander@genome.wi.mit.edu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040. Review: "The Sequence of the Human Genome" (J. C. Venter et al.)


Page 36: Genomica colombia

La Genoacutemicabull Es el estudio sistemaacutetico de la documentacioacuten completa de

las secuencias del ADN de los organismos (MeSH 2010) dicho estudio ha ayudado a comprender polimorfismos dentro de las especies la interaccioacuten de las proteiacutenas y la evolucioacuten (Brent 2000)

bull Incluye todos los meacutetodos que recopilan y analizan datos completos acerca de los genes incluida las secuencias la abundancia de los aacutecidos nucleicos y las propiedades de las proteiacutenas que codifican (Murray 2000)

bull Nuestra capacidad para estudiar la funcioacuten geacutenica estaacute aumentando en la especificidad gracias a esta nueva disciplina (Collins et aacutel 2003) que se compromete a acelerar el descubrimiento cientiacutefico en todos los aacutembitos de la ciencia bioloacutegica (Burley 2000)

0

500

1000

1500

2000

2500

3000

3500

1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009

Pu

blic

acio

ne

s

Antildeo

Nuacutemero de Publicaciones

Pathology and

immunopathology

research en 1988 The

genomics of human

homeobox-containing

loci‟ escrito por CA

Ferguson-Smith y FH

Ruddle del

Departamento de

Biologiacutea en la

Universidad de Yale

New Haven

Connecticut

0

100

200

300

400

500

600

Pu

blic

acio

ne

s

Institucioacuten

Nuacutemero de Publicaciones

Nature 2001 Feb 15409(6822)860-921Initial sequencing and analysis of the human genome

Lander ES Linton LM Birren B Nusbaum C Zody MC Baldwin J Devon K Dewar K Doyle M FitzHugh W Funke R GageD Harris K Heaford A Howland J Kann L Lehoczky J LeVine R McEwan P McKernan K Meldrim J Mesirov JP Miranda C Morris W Naylor J Raymond C Rosetti M Santos R Sheridan A Sougnez C Stange-Thomann NStojanovicN Subramanian A Wyman D Rogers J Sulston J Ainscough R Beck S Bentley D Burton J Clee C Carter N CoulsonA Deadman R Deloukas P Dunham ADunham I Durbin R French L Grafham D Gregory S Hubbard T HumphrayS Hunt A Jones M Lloyd C McMurray A Matthews L Mercer S Milne S Mullikin JC Mungall APlumb R Ross M Shownkeen R Sims S Waterston RH Wilson RK Hillier LW McPherson JD Marra MA Mardis ER FultonLA Chinwalla AT Pepin KH Gish WR Chissoe SL Wendl MC Delehaunty KD Miner TL Delehaunty A Kramer JB Cook LL Fulton RS Johnson DL Minx PJ Clifton SW Hawkins T Branscomb E Predki P Richardson PWenning S SlezakT Doggett N Cheng JF Olsen A Lucas S Elkin C Uberbacher E Frazier M Gibbs RA Muzny DM Scherer SE BouckJB Sodergren EJ Worley KC Rives CM Gorrell JH Metzker ML Naylor SL Kucherlapati RS Nelson DL WeinstockGM Sakaki Y Fujiyama A Hattori M Yada T Toyoda A Itoh T Kawagoe C Watanabe H Totoki YTaylor T WeissenbachJ Heilig R Saurin W Artiguenave F Brottier P Bruls T Pelletier E Robert C Wincker P Smith DR Doucette-StammL Rubenfield M Weinstock K Lee HM Dubois J Rosenthal A Platzer M Nyakatura G Taudien S Rump A Yang H YuJ Wang J Huang G Gu J Hood L Rowen L Madan A Qin S Davis RW Federspiel NAAbola AP Proctor MJ Myers RM Schmutz J Dickson M Grimwood J Cox DR Olson MV Kaul R Raymond C Shimizu N Kawasaki K MinoshimaS Evans GA Athanasiou MSchultz R Roe BA Chen F Pan H Ramser J Lehrach H Reinhardt R McCombie WR de la Bastide M Dedhia N Bloumlcker H Hornischer K Nordsiek G Agarwala R Aravind LBailey JA Bateman A BatzoglouS Birney E Bork P Brown DG Burge CB Cerutti L Chen HC Church D Clamp M Copley RR Doerks T Eddy SR EichlerEE Furey TSGalagan J Gilbert JG Harmon C Hayashizaki Y Haussler D 
Hermjakob H Hokamp K Jang W Johnson LS Jones TA Kasif S Kaspryzk A Kennedy S Kent WJ Kitts PKoonin EV Korf I Kulp D Lancet D Lowe TM McLysaghtA Mikkelsen T Moran JV Mulder N Pollara VJ Ponting CP Schuler G Schultz J Slater G Smit AF Stupka ESzustakowskiJ Thierry-Mieg D Thierry-Mieg J Wagner L Wallis J Wheeler R Williams A Wolf YI Wolfe KH Yang SP Yeh RF Collins F Guyer MS Peterson J Felsenfeld AWetterstrand KA Patrinos A Morgan MJ de Jong P Catanese JJ OsoegawaK Shizuya H Choi S Chen YJ International Human Genome Sequencing Consortium

Whitehead Institute for Biomedical Research Center for Genome Research Cambridge Massachusetts 02142 USA landergenomewimitedu

• The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

• Here we report the results of a collaboration involving 20 groups from the United States, the United Kingdom, Japan, France, Germany and China to produce a draft sequence of the human genome.

• Of course, navigating information spanning nearly ten orders of magnitude requires computational tools to extract the full value.

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062. http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research.

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
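
The coverage figures quoted in the abstract follow from simple arithmetic. The sketch below is illustrative only: the input numbers are the ones reported in the Venter et al. abstract (27,271,853 reads, 14.8 billion bp sequenced, a 2.91-billion bp consensus, 5.11-fold plus 2.9-fold coverage, 1 difference per 1250 bp), while the function and variable names are hypothetical and do not come from the paper or the slides.

```python
# Illustrative arithmetic only. Figures are taken from the abstract;
# all names below are hypothetical.


def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average number of times each base of the genome was sequenced."""
    return total_bases / genome_size


READS = 27_271_853      # high-quality sequence reads
SEQUENCED_BP = 14.8e9   # total bases sequenced by Celera
CONSENSUS_BP = 2.91e9   # length of the euchromatic consensus sequence

# Average read length implied by the totals (a little over 540 bp).
mean_read_length = SEQUENCED_BP / READS

# Coverage relative to the consensus length. This lands close to, but not
# exactly on, the published 5.11-fold value, presumably because the authors
# used a slightly different genome size as denominator.
celera_coverage = fold_coverage(SEQUENCED_BP, CONSENSUS_BP)

# Celera reads plus the shredded public data give the "eightfold" figure.
effective_coverage = 5.11 + 2.9

# Expected differences between two haploid genomes at 1 bp per 1250.
expected_differences = CONSENSUS_BP / 1250

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"coverage vs. consensus: {celera_coverage:.2f}-fold")
print(f"effective coverage: {effective_coverage:.2f}-fold")
print(f"expected differences: {expected_differences:,.0f}")
```

The same ratio (total bases divided by genome size) is the usual way to compare the sequencing depth of different projects.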

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that indicate some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes and metabolites, the corresponding approaches are proteomics, genomics and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and as a whole by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of study samples, problems arise relating to statistics, bias, methodology and inappropriate use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
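
The statistical problem described above (a very large number of measurements on a limited number of samples) can be illustrated with a back-of-the-envelope calculation. This is a minimal sketch and not the analysis from Lay et al.: it assumes independent tests in which no true effect exists and a conventional significance level of 0.05. The Bonferroni correction is used here only as one well-known remedy chosen for illustration, and all function names are hypothetical.

```python
# Minimal sketch of the multiple-testing problem in "omics" studies.
# Assumptions: independent tests, no true effects, alpha = 0.05 by default.
# Function names are hypothetical.


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests that look significant purely by chance."""
    return n_tests * alpha


def prob_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one test is a false positive."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for n in (1, 20, 10_000):
        print(
            f"{n:>6} tests: "
            f"expected false positives = {expected_false_positives(n):.1f}, "
            f"P(at least one) = {prob_any_false_positive(n):.3f}, "
            f"Bonferroni threshold = {bonferroni_threshold(n):.2e}"
        )
```

Measuring thousands of parameters at once therefore makes chance "discoveries" almost certain unless the significance threshold is tightened or the findings are replicated in independent samples.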

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality"

http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"

http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"

http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"

http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"

http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"

http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"

http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"

http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"

http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"

http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"

http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from

DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we learn about the human genome, the more there is to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."

Thomas Stearns Eliot, as quoted in Lander et al., 2001

Page 37: Genomica colombia

[Figure: Number of publications per year, 1988-2009. x-axis: Year; y-axis: Publications (0 to 3500)]

Pathology and Immunopathology Research in 1988: "The genomics of human homeobox-containing loci", written by C. A. Ferguson-Smith and F. H. Ruddle of the Department of Biology at Yale University, New Haven, Connecticut

[Figure: Number of publications by institution. x-axis: Institution; y-axis: Publications (0 to 600)]

Nature. 2001 Feb 15; 409(6822): 860-921. Initial sequencing and analysis of the human genome


Page 38: Genomica colombia



• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J Craig Venter, Mark D Adams, Eugene W Myers, Peter W Li, Richard J Mural, Granger G Sutton, Hamilton O Smith, Mark Yandell, Cheryl A Evans, Robert A Holt, Jeannine D Gocayne, Peter Amanatides, Richard M Ballew, Daniel H Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D Kodira, Xiangqun H Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D Thomas, Jinghui Zhang, George L Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G Clark, Joe Nadeau, Victor A McKusick, Norton Zinder, Arnold J Levine, Richard J Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J Heiman, Maureen E Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V Merkulov, Natalia Milshina, Helen M Moore, Ashwinikumar K Naik, Vaibhav A Narayan, Beena Neelam, Deborah Nusskern, Douglas B Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F Abril, Roderic Guigó, Michael J Campbell, Kimmen V Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
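The coverage figures quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that uses only the numbers stated in the abstract; the function and constant names are illustrative and do not come from the paper. Dividing the total sequenced bases by the consensus length gives roughly 5.09, slightly below the quoted 5.11-fold, presumably because the paper based that figure on a slightly different genome-size estimate.

```python
# Figures as quoted in the abstract of Venter et al. (2001).
CONSENSUS_BP = 2.91e9    # length of the consensus sequence, in bp
SEQUENCED_BP = 14.8e9    # total bases sequenced, in bp
READS = 27_271_853       # number of high-quality sequence reads
SNP_SPACING_BP = 1250    # one difference per 1250 bp between two haploid genomes


def fold_coverage(sequenced_bp, genome_bp):
    """Average number of times each genome position was sequenced."""
    return sequenced_bp / genome_bp


def mean_read_length(sequenced_bp, n_reads):
    """Average number of bases contributed by one read."""
    return sequenced_bp / n_reads


def expected_differences(genome_bp, spacing_bp):
    """Expected number of differing positions between two haploid genomes."""
    return genome_bp / spacing_bp


coverage = fold_coverage(SEQUENCED_BP, CONSENSUS_BP)            # about 5.09
read_length = mean_read_length(SEQUENCED_BP, READS)             # about 543 bp
differences = expected_differences(CONSENSUS_BP, SNP_SPACING_BP)  # 2,328,000

print(f"coverage: {coverage:.2f}-fold")
print(f"mean read length: {read_length:.0f} bp")
print(f"expected differences between two genomes: {differences:,.0f}")
```

Under these assumptions, a rate of 1 bp per 1250 across the 2.91-billion bp consensus corresponds to about 2.3 million differing positions between two haploid genomes, the same order of magnitude as the 2.1 million SNPs located in the study.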

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are examined in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and inappropriate use of the method.

• Lay Jr, J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
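The statistical problem described above (a very large number of measurements tested on few samples) can be illustrated with a small simulation. This is a minimal sketch, not a method taken from the cited article: it assumes that none of the measured parameters has a real association with disease, so each test's p-value is drawn uniformly between 0 and 1, and it uses the Bonferroni correction as one common (and conservative) remedy. The function name and the parameter values are hypothetical.

```python
import random


def count_false_positives(n_tests, alpha, seed=0):
    """Simulate n_tests independent tests in which no real effect exists.

    Under the null hypothesis a p-value is uniformly distributed on [0, 1),
    so every "significant" result counted here is a false positive.
    Returns (hits at the naive threshold, hits at the Bonferroni threshold).
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    naive_hits = sum(p < alpha for p in p_values)
    # Bonferroni: divide the significance level by the number of tests.
    bonferroni_hits = sum(p < alpha / n_tests for p in p_values)
    return naive_hits, bonferroni_hits


# Hypothetical study: 10,000 parameters measured, none truly associated.
naive, corrected = count_false_positives(n_tests=10_000, alpha=0.05)
print(f"'significant' at p < 0.05 without correction: {naive}")
print(f"'significant' after Bonferroni correction: {corrected}")
```

With 10,000 tests and a threshold of 0.05, about 500 parameters are expected to look significant by chance alone, which is one reason initial "omics" associations may fail to be confirmed by later studies.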

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality"
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"
http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, Project PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, as quoted in Nature (2001)

bull A 291-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method The 148-billion bp DNA sequence was generated over 9 months from 27271853 high-quality sequence reads (511-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used each combining sequence data from Celera and the publicly funded genome effort The public data were shredded into 550-bp segments to create a 29-fold coverage of those genome regions that had been sequenced without including biases inherent in the cloning and assembly procedure used by the publicly funded group This brought the effective coverage in the assemblies toeightfold reducing the number and size of gaps in the final assembly over what would be obtained with 511-fold coverage The two assembly strategies yielded very similar results that largely agree with independent mapping data The assemblies effectively cover the euchromatic regions of the human chromosomes More than 90 of the genome is in scaffold assemblies of 100000 bp or more and 25 of the genome is in scaffolds of 10 million bp or larger Analysis of the genome sequence revealed 26588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12000 computationally derived genes with mouse matches or other weak supporting evidence Although gene-dense clusters are obvious almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence Only 11 of the genome is spanned by exons whereas 24 is in introns with 75 of the genome being intergenic DNA Duplications of segmental blocks ranging in size up to chromosomal lengths are abundant throughout the genome and reveal a complex evolutionary history Comparative genomic analysis indicates vertebrate expansions of genes 
associated with neuronal function with tissue-specific developmental regulation and with the hemostasis and immune systems DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 21 million single-nucleotide polymorphisms (SNPs) A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average but there was marked heterogeneity in the level of polymorphism across the genome Less than 1 of all SNPs resulted in variation in proteins but the task of determining which SNPs have functional consequences remains an open challenge

J C Venter et al Science 291 1304 -1351 (2001)

Fig 2 Flow diagram for sequencing pipeline Samples are received selected and processed in compliance with standard operating procedures with a focus on quality within and across departments Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines Manufacturing pipeline processes products quality control measures and responsible parties are indicated and are described further in the text

httpgenomeucsceducgi-binhgTracksorg=human

Registro de PubMed

httpwwwncbinlmnihgovSitemapsamplerecordhtml

OMICSbull Los estudios ldquoomicsrdquo estaacuten involucrados en el anaacutelisis de cantidades grandes de paraacutemetros

generalmente proteiacutenas (proteomics) liacutepidos (lipidomics) y metabolitos (metabolomics) El teacutermino ldquoomicsrdquo deriva del sufijo griego ldquoomerdquo que significa muchos o masa Las mediciones se hacen a base de marcadores quiacutemicos que son indicativos de alguacuten evento bioloacutegico (biomarkers) Los valores asociados con los paraacutemetros medidos son investigados con el fin de encontrar una correlacioacuten con enfermedades Cuando los objetos de investigacioacuten son proteinas genes y metabolitos las aproximaciones correspondientes son proteomics genomics y metabolommics

bull En antildeos recientes con el desarrollado de meacutetodos para medir y analizar un nuacutemero muy grande de analitos de una sola muestra se ha popularizado las investigaciones que intentan medir miles de paraacutemetros en vez de soacutelo unos cuantos Esta aproximacioacuten es la que dio pie a los experimentos modernos de ldquoomicsrdquo

bull La meta final de las aproximaciones ldquoomicsrdquo es comprender comprehensiva e integralmente los procesos bioloacutegicos mediante la identificacioacuten y correlacioacuten de varios ldquojugadoresrdquo (eg genes RNA proteiacutenas metabolitos) en vez de estudiar cada uno de ellos de manera individual

bull Lamentablemente un estudio sobre la probabilidad de reproducir investigaciones en genomics (asociar grupos de genes con enfermedades complejas) en el cual se compararon 370 estudios reveloacute una alta frecuencia de investigaciones subsecuentes que no confirmaban los descubrimientos iniciales Debido al enorme nuacutemero de mediciones y el nuacutemero limitado de muestras de investigacioacuten surgen problemas relacionados a la estadiacutestica el sesgo la metodologiacutea y el uso inadecuado del meacutetodo

bull

bull Layjr J Liyanage R Borgmann S amp Wilkins C (2006) Problems with the ldquoomicsrdquo TrAC Trendsin Analytical Chemistry 25(11) 1046-1056 doi 101016jtrac200610007

OMICrsquoS ARTICULO

Genomics Genomics buzzword or reality

httpviewncbinlmnihgovpubmed10343163

Metabolomics Metabolomic

httpwwwbenthamdirectorgpagescontentphpCDM200800000009000000010011FSGM

Proteomics Proteomics

httpdxdoiorg10109701mnh000007969189474ee

Transcriptomics Transcriptomics

httpviewncbinlmnihgovpubmed18336229

Vaccinomics Vaccinomics

httpwwwgtmborgVOL12BPDF16_Gomase_amp_Tagore_141-146pdf

Oncogenomics Oncogenomics

httpwwwncbinlmnihgovpubmed18336222

Pharmacogenomics Pharmacogenomics

httpwwwncbinlmnihgovpubmed18336223

Epigenomics Epigenomics

httpwwwncbinlmnihgovpubmed18336226

Toxicogenomics Toxicogenomics

httpwwwncbinlmnihgovpubmed18336230

Kinomics Kinomics

httpwwwncbinlmnihgovpubmed18336231

Physiomics Physiomics

httpwwwncbinlmnihgovpubmed18336232

Cytomics Cytomics

httpwwwncbinlmnihgovpubmed18336233

Postgenomics Tracking the shift to postgenomics

httpdxdoiorg101159000092656

Glycomics Probing glycomics

httpdxdoiorg101016jcbpa200611040

Lipidomics Lipidomics is emerging

httpviewncbinlmnihgovpubmed14643793

Cellunomic Cellunomics the interaction analysis of cells

httpviewncbinlmnihgovpubmed19887340

Phylogenomics Phylogenomics evolution and genomics intersection

httpviewncbinlmnihgovpubmed19778869

httpbiiiogeekblogspotcom

httplaylamichanunamblogspotcom

bull Este proyecto se lleva a cabo gracias al financiamiento de

DGAPA UNAMProyecto PAPIME PE 201509

Agradecimientos

bull Laura Montoya

bull Lourdes Valencia

bull Fernando Galicia

bull Roberto Calderoacuten

bull Lyssania Maciacuteas

No se nos ha escapado a nuestra atencioacuten que

cuaacutento maacutes exploramos el genoma humano

maacutes nos queda por explorar No cesaremos de

explorar Pues al final de toda exploracioacuten

llegaremos donde empezamos y conoceremos

cuaacutel es nuestro lugar por primera vezldquo

Thomas Stearns Eliot 2001

FIGURE 1. Timeline of large-scale genomic analyses

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062

http://www.nature.com/nature/journal/v409/n6822/fig_tab/409860a0_F1.html

Nature 409, 860-921 (15 February 2001). doi:10.1038/35057062. http://www.nature.com/nature/journal/v409/n6822/images/409860ac.2.jpg

FIGURE 3. The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research

• Science, 16 February 2001, Vol. 291, no. 5507, pp. 1304-1351. DOI: 10.1126/science.1058040

• REVIEW

• The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies--a whole-genome assembly and a regional chromosome assembly--were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
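The coverage figures in this abstract can be checked with simple arithmetic. The sketch below is only an illustration, not part of the original paper or slides: it takes the values from the published abstract (27,271,853 reads, 14.8 billion bp sequenced, a 2.91 billion bp consensus, 5.11-fold Celera coverage, 2.9-fold public coverage), and the names `fold_coverage`, `READS` and so on are hypothetical. It assumes that fold coverage is total bases sequenced divided by genome length, and that the two data sets add up to the effective coverage.

```python
# Figures as given in the published abstract of Venter et al. (2001).
READS = 27_271_853        # high-quality sequence reads
SEQUENCED_BP = 14.8e9     # total bases sequenced by Celera
CONSENSUS_BP = 2.91e9     # length of the euchromatic consensus sequence
CELERA_COVERAGE = 5.11    # fold coverage quoted for the Celera reads
PUBLIC_COVERAGE = 2.9     # fold coverage from the shredded public data


def fold_coverage(total_bases: float, genome_bases: float) -> float:
    """Average number of times each genome position was sequenced."""
    return total_bases / genome_bases


# Average length of one read, in base pairs.
mean_read_length = SEQUENCED_BP / READS

# Coverage recomputed from the rounded totals (an approximation).
celera_estimate = fold_coverage(SEQUENCED_BP, CONSENSUS_BP)

# Assumption: coverage from the two data sets simply adds.
effective_coverage = CELERA_COVERAGE + PUBLIC_COVERAGE

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"coverage from totals: {celera_estimate:.2f}-fold")
print(f"effective coverage: {effective_coverage:.2f}-fold")
```

The recomputed coverage comes out slightly below the quoted 5.11-fold. This is presumably because the totals are rounded, or because the authors divided by a slightly different genome size. The sum of the two coverages is consistent with the "eightfold" effective coverage stated in the abstract.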

J. C. Venter et al., Science 291, 1304-1351 (2001)

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• "Omics" studies involve the analysis of large numbers of parameters, typically proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term "omics" derives from the Greek suffix "ome", meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods to measure and analyze a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern "omics" experiments.

• The ultimate goal of "omics" approaches is to understand biological processes comprehensively and integrally by identifying and correlating the various "players" (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise relating to statistics, bias, methodology, and improper use of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the "omics". TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
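One of the statistical problems mentioned above, an enormous number of measurements, can be illustrated with a small simulation. This is a minimal sketch and not the analysis of Lay et al.: it assumes every measured marker has no real effect, so each p-value is uniformly distributed between 0 and 1. It then counts how many markers appear "significant" by chance. The Bonferroni correction used for comparison is a standard technique chosen here for illustration and is not named in the source. The function name `count_false_positives` and the values of `n_tests` and `alpha` are hypothetical.

```python
import random


def count_false_positives(n_tests: int, alpha: float, seed: int = 0) -> tuple[int, int]:
    """Simulate n_tests markers that have no real effect.

    Under the null hypothesis a p-value is uniform on [0, 1], so each one
    is drawn directly with a uniform random number. Returns the number of
    markers called significant without correction and with a Bonferroni
    correction (threshold alpha / n_tests).
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    uncorrected = sum(p < alpha for p in p_values)
    bonferroni = sum(p < alpha / n_tests for p in p_values)
    return uncorrected, bonferroni


n_tests, alpha = 10_000, 0.05
uncorrected, bonferroni = count_false_positives(n_tests, alpha)
print(f"expected false positives without correction: {n_tests * alpha:.0f}")
print(f"observed without correction: {uncorrected}")
print(f"observed with Bonferroni correction: {bonferroni}")
```

With thousands of parameters measured, several hundred spurious "hits" are expected at the usual 0.05 threshold even when nothing is truly associated. This is one plausible reason why initial findings often fail to replicate.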

OMICS: ARTICLE

Genomics: "Genomics: buzzword or reality?"
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"
http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there is left to explore. "We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, quoted in Nature 409 (2001)


Page 43: Genomica colombia

bull Science 16 February 2001Vol 291 no 5507 pp 1304 - 1351DOI 101126science1058040

bull REVIEW

bull The Sequence of the Human Genome

• J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, Marshall Peterson, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

• A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ~12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

J. C. Venter et al., Science 291, 1304-1351 (2001)
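The headline figures in this abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constants are the rounded values printed in the abstract of Venter et al. (2001), and the function names are hypothetical helpers introduced here, not part of any published pipeline. Because the inputs are rounded, the computed coverage (about 5.09-fold) comes out close to, but not exactly equal to, the 5.11-fold reported by the authors.

```python
# Rounded figures as printed in the abstract of Venter et al. (2001).
CONSENSUS_BP = 2.91e9      # euchromatic consensus sequence, in base pairs
SEQUENCED_BP = 14.8e9      # total sequence generated, in base pairs
N_READS = 27_271_853       # high-quality sequence reads
BP_PER_DIFFERENCE = 1250   # one differing bp per 1250 bp between two haploid genomes

# Reported share of the genome in each compartment (percent).
COMPARTMENTS = {"exons": 1.1, "introns": 24.0, "intergenic": 75.0}


def fold_coverage(sequenced_bp, genome_bp):
    """Average number of times each base of the genome was sequenced."""
    return sequenced_bp / genome_bp


def mean_read_length(sequenced_bp, n_reads):
    """Average length of a read, in base pairs."""
    return sequenced_bp / n_reads


def expected_pairwise_differences(genome_bp, bp_per_difference):
    """Expected number of single-base differences between two haploid genomes."""
    return genome_bp / bp_per_difference


def compartment_sizes(genome_bp, percentages):
    """Convert percentage shares of the genome into base pairs."""
    return {name: genome_bp * pct / 100 for name, pct in percentages.items()}


if __name__ == "__main__":
    print(f"coverage: {fold_coverage(SEQUENCED_BP, CONSENSUS_BP):.2f}-fold")
    print(f"mean read length: {mean_read_length(SEQUENCED_BP, N_READS):.0f} bp")
    diffs = expected_pairwise_differences(CONSENSUS_BP, BP_PER_DIFFERENCE)
    print(f"expected pairwise differences: {diffs:,.0f}")
    for name, size in compartment_sizes(CONSENSUS_BP, COMPARTMENTS).items():
        print(f"{name}: {size / 1e6:,.0f} Mbp")
```

Run as a script, this prints the coverage, the mean read length, the expected number of differences between two haploid genomes, and the approximate size of each compartment in megabases.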

Fig. 2. Flow diagram for sequencing pipeline. Samples are received, selected, and processed in compliance with standard operating procedures, with a focus on quality within and across departments. Each process has defined inputs and outputs, with the capability to exchange samples and data with both internal and external entities according to defined quality guidelines. Manufacturing pipeline processes, products, quality control measures, and responsible parties are indicated and are described further in the text.

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human

PubMed record

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

OMICS

• “Omics” studies involve the analysis of large numbers of parameters, generally proteins (proteomics), lipids (lipidomics), and metabolites (metabolomics). The term “omics” derives from the Greek suffix “ome”, meaning many or mass. Measurements are based on chemical markers that are indicative of some biological event (biomarkers). The values associated with the measured parameters are investigated in order to find a correlation with disease. When the objects of study are proteins, genes, and metabolites, the corresponding approaches are proteomics, genomics, and metabolomics.

• In recent years, with the development of methods for measuring and analyzing a very large number of analytes from a single sample, studies that attempt to measure thousands of parameters rather than just a few have become popular. This approach is what gave rise to modern “omics” experiments.

• The ultimate goal of “omics” approaches is to understand biological processes comprehensively and as a whole by identifying and correlating various “players” (e.g., genes, RNA, proteins, metabolites) rather than studying each of them individually.

• Unfortunately, a study of the reproducibility of genomics research (associating groups of genes with complex diseases), which compared 370 studies, revealed a high frequency of subsequent investigations that did not confirm the initial findings. Because of the enormous number of measurements and the limited number of research samples, problems arise related to statistics, bias, methodology, and misuse of the method.

• Lay Jr., J., Liyanage, R., Borgmann, S., & Wilkins, C. (2006). Problems with the “omics”. TrAC Trends in Analytical Chemistry, 25(11), 1046-1056. doi:10.1016/j.trac.2006.10.007
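The statistical problem described above (very many measurements, few samples) can be made concrete with a small multiple-testing sketch. This is an illustration under stated assumptions, not the analysis used in the cited study: the 10,000 parameters, the 0.05 threshold, and the helper names are all hypothetical, and the simulation assumes that no parameter is truly associated with disease, so every "significant" result is a false positive. The Bonferroni correction appears only as one familiar remedy.

```python
import random


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of null parameters that pass the threshold by chance."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives across independent null tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def simulate_null_screen(n_tests, alpha=0.05, seed=0):
    """Simulate a screen in which no parameter has a real effect.

    Under the null hypothesis a p-value is uniformly distributed on [0, 1],
    so each test is modelled as one uniform random draw.
    Returns (hits at alpha, hits at the Bonferroni threshold).
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    raw_hits = sum(p < alpha for p in p_values)
    corrected_hits = sum(p < bonferroni_threshold(n_tests, alpha) for p in p_values)
    return raw_hits, corrected_hits


if __name__ == "__main__":
    n = 10_000  # hypothetical number of measured parameters
    print("expected false positives:", expected_false_positives(n))
    print("P(at least one):", prob_at_least_one_false_positive(n))
    print("Bonferroni threshold:", bonferroni_threshold(n))
    print("simulated (raw, corrected) hits:", simulate_null_screen(n))
```

With thousands of parameters and an uncorrected threshold, hundreds of spurious associations are expected even when nothing real is present, which is one reason follow-up studies may fail to confirm initial findings.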

OMICS: ARTICLE

Genomics: Genomics: buzzword or reality
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: Metabolomic
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: Proteomics
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: Transcriptomics
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: Vaccinomics
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: Oncogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: Pharmacogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: Epigenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: Toxicogenomics
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: Kinomics
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: Physiomics
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: Cytomics
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: Tracking the shift to postgenomics
http://dx.doi.org/10.1159/000092656

Glycomics: Probing glycomics
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: Lipidomics is emerging
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: Cellunomics: the interaction analysis of cells
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: Phylogenomics: evolution and genomics intersection
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from DGAPA, UNAM, PAPIME Project PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

It has not escaped our notice that the more we explore the human genome, the more there remains to explore. “We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time.”

Thomas Stearns Eliot, 2001

Page 48: Genomica colombia

OMICSbull Los estudios ldquoomicsrdquo estaacuten involucrados en el anaacutelisis de cantidades grandes de paraacutemetros

generalmente proteiacutenas (proteomics) liacutepidos (lipidomics) y metabolitos (metabolomics) El teacutermino ldquoomicsrdquo deriva del sufijo griego ldquoomerdquo que significa muchos o masa Las mediciones se hacen a base de marcadores quiacutemicos que son indicativos de alguacuten evento bioloacutegico (biomarkers) Los valores asociados con los paraacutemetros medidos son investigados con el fin de encontrar una correlacioacuten con enfermedades Cuando los objetos de investigacioacuten son proteinas genes y metabolitos las aproximaciones correspondientes son proteomics genomics y metabolommics

bull En antildeos recientes con el desarrollado de meacutetodos para medir y analizar un nuacutemero muy grande de analitos de una sola muestra se ha popularizado las investigaciones que intentan medir miles de paraacutemetros en vez de soacutelo unos cuantos Esta aproximacioacuten es la que dio pie a los experimentos modernos de ldquoomicsrdquo

bull La meta final de las aproximaciones ldquoomicsrdquo es comprender comprehensiva e integralmente los procesos bioloacutegicos mediante la identificacioacuten y correlacioacuten de varios ldquojugadoresrdquo (eg genes RNA proteiacutenas metabolitos) en vez de estudiar cada uno de ellos de manera individual

bull Lamentablemente un estudio sobre la probabilidad de reproducir investigaciones en genomics (asociar grupos de genes con enfermedades complejas) en el cual se compararon 370 estudios reveloacute una alta frecuencia de investigaciones subsecuentes que no confirmaban los descubrimientos iniciales Debido al enorme nuacutemero de mediciones y el nuacutemero limitado de muestras de investigacioacuten surgen problemas relacionados a la estadiacutestica el sesgo la metodologiacutea y el uso inadecuado del meacutetodo

bull

bull Layjr J Liyanage R Borgmann S amp Wilkins C (2006) Problems with the ldquoomicsrdquo TrAC Trendsin Analytical Chemistry 25(11) 1046-1056 doi 101016jtrac200610007

OMICS / ARTICLE

Genomics: "Genomics: buzzword or reality"
http://view.ncbi.nlm.nih.gov/pubmed/10343163

Metabolomics: "Metabolomic"
http://www.benthamdirect.org/pages/content.php?CDM/2008/00000009/00000001/0011F.SGM

Proteomics: "Proteomics"
http://dx.doi.org/10.1097/01.mnh.0000079691.89474.ee

Transcriptomics: "Transcriptomics"
http://view.ncbi.nlm.nih.gov/pubmed/18336229

Vaccinomics: "Vaccinomics"
http://www.gtmb.org/VOL12B/PDF/16_Gomase_&_Tagore_141-146.pdf

Oncogenomics: "Oncogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336222

Pharmacogenomics: "Pharmacogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336223

Epigenomics: "Epigenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336226

Toxicogenomics: "Toxicogenomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336230

Kinomics: "Kinomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336231

Physiomics: "Physiomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336232

Cytomics: "Cytomics"
http://www.ncbi.nlm.nih.gov/pubmed/18336233

Postgenomics: "Tracking the shift to postgenomics"
http://dx.doi.org/10.1159/000092656

Glycomics: "Probing glycomics"
http://dx.doi.org/10.1016/j.cbpa.2006.11.040

Lipidomics: "Lipidomics is emerging"
http://view.ncbi.nlm.nih.gov/pubmed/14643793

Cellunomics: "Cellunomics: the interaction analysis of cells"
http://view.ncbi.nlm.nih.gov/pubmed/19887340

Phylogenomics: "Phylogenomics: evolution and genomics intersection"
http://view.ncbi.nlm.nih.gov/pubmed/19778869

http://biiiogeek.blogspot.com

http://laylamichanunam.blogspot.com

• This project is carried out thanks to funding from

DGAPA, UNAM. Proyecto PAPIME PE 201509

Acknowledgments

• Laura Montoya

• Lourdes Valencia

• Fernando Galicia

• Roberto Calderón

• Lyssania Macías

"It has not escaped our notice that the more we explore the human genome, the more there is left to explore. We shall not cease from exploration. And the end of all our exploring will be to arrive where we started and know the place for the first time."

Thomas Stearns Eliot, 2001
