
Tae-Hyung Kim 1, Gil-Mi Ryu 1,2, InSong Koh 2, Jong Park 3

Kth2001@hyowon.pusan.ac.kr, gmryu@hyowon.pusan.ac.kr, insong@nih.go.kr, jong@mrc-dunn.cam.ac.uk

1 Department of Bioinformatics, Bioinformatics Cooperative Course, Pusan National University, Pusan, Korea
2 Section of Bioinformatics, Central Genome Center, National Institute of Health, Nokbun-Dong 5, Seoul, Korea
3 MRC-DUNN, Hills Road, Cambridge CB2 2XY, England, UK

Figure 3. Ontological classification based on methodology. The methodology for DNA sequence determination can be classified according to work procedure such as mapping, sequencing, assembly, and searching. RNA analysis is classified according to cDNA chip procedure resulting in expression analysis. Protein analysis methodology can be classified as comparative and predictive methods.

1 Introduction

One of the major obstacles in bioinformatics is the difficulty of computing with literature information. Unlike sequence and structure data, it is impossible to establish homology, similarity, interaction, and function criteria for literature information. To ease this problem, attempts to clarify ontological issues have themselves become bioinformatics projects. The idea of an ontology is to define terms and concepts as mechanical, computable units. The result is a clear classification and mapping of text elements that computers can process. We have applied this advantage of ontology, the classification of elements, to the field of bioinformatics itself. Such a project helps the knowledge of this fast-growing field to be understood and disseminated efficiently. An intuitive classification system for bioinformatics can also suggest valuable project ideas and future directions. The ontology of the bioinformatics field has three main components: 1) classification based on methodology, 2) knowledge based classification (database systems), and 3) classification based on biological data types. These components overlap, since they are different aspects of the same or similar information. However, depending on the user's interest, one view can be more relevant than the others for designing and organizing a bioinformatics project.

Figure 4. Classification of databases. Biological databases can be classified according to their data features. The popular databases used in the biological community are included in this schematic map.

Figure 5. Classification according to biological data types. According to this classification map, biological data can be identified through prediction of sequence structure and function. As information acquired from data flows from right to left, it becomes more and more clear.

References

[1] Patricia G. Baker, Carole A. Goble, Sean Bechhofer, Norman W. Paton, Robert Stevens, Andy Brass, An ontology for bioinformatics applications, Bioinformatics, vol. 15, no. 6, 510-520, 1999

[2] Robert Stevens, Patricia Baker, Sean Bechhofer, Gary Ng, Alex Jacoby, Norman W. Paton, Carole A. Goble, Andy Brass, TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources, Bioinformatics, vol. 16, no. 2, 184-185, 2000

[3] Andreas D. Baxevanis, The Molecular Biology Database Collection: an online compilation of relevant database resources, Nucleic Acids Research, vol. 28, no. 1, 2000

[4] The Gene Ontology Consortium, Gene ontology: tool for the unification of biology, Nature Genetics, vol. 25, 2000 (Nature America Inc., http://genetics.nature.com)

2 Method and Results

2.1 Classification based on methodology. We classified the bioinformatics field according to the methods used to analyze biological data (DNA, RNA, protein). In this way, bioinformatics can be understood intuitively through a schematic map.
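As a rough illustration of how such a methodology map could be held in a computable form, the sketch below encodes the categories named in the Figure 3 caption as a nested dictionary. The data layout and the helper names (`iter_paths`, `find_path`, `render`) are assumptions made for this example and are not taken from the authors' implementation.

```python
# Illustrative sketch only: node names follow the Figure 3 caption;
# the nested-dict layout and all helper names are assumptions.
METHODOLOGY_TREE = {
    "DNA": {
        "mapping": {},
        "sequencing": {},
        "assembly": {},
        "searching": {},
    },
    "RNA": {
        "cDNA chip": {"expression analysis": {}},
    },
    "Protein": {
        "comparative methods": {},
        "predictive methods": {},
    },
}


def iter_paths(tree, prefix=()):
    """Yield every root-to-node path in the tree as a tuple of names."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from iter_paths(children, path)


def find_path(tree, target):
    """Return the first path whose last element is `target`, or None."""
    for path in iter_paths(tree):
        if path[-1] == target:
            return path
    return None


def render(tree, depth=0):
    """Return the tree as indented text lines, one node per line."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(METHODOLOGY_TREE)))
    print(find_path(METHODOLOGY_TREE, "assembly"))
```

A nested dictionary is only one possible choice; it keeps the example short and makes the path from a data type down to a method easy to read off.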

Figure 1. Main window.

Figure 2. Sub windows.

2.3 Classification based on biological data types. We categorized the component fields of bioinformatics according to the kinds of implementation that biologists use after data acquisition. We differentiate them by the common procedures and tools applied to the biological knowledge, following the usual workflow of biologists.

3 Discussion

In this classification of the components of bioinformatics, we introduced our ontology schema for classifying and mapping the bioinformatics field itself. The ontological procedure was designed to represent methodology, database features, and data content. It therefore allows us to find projects and relate problem domains in bioinformatics in a much more systematic way. It can also be used to cluster biological sequence data by their bioinformatics ontology characteristics, and it supports computation on specific elements such as sequences and databases. In addition, schematic maps are drawn as visual trees, so that one can get a global picture of the bioinformatics field and obtain more precise information intuitively and efficiently. The lower levels of each classification criterion are linked to web pages (http://nihcgc.re.kr/BioinfoMap and http://interaction.mrc-dunn.cam.ac.uk/BioinfoMap/). The classification system is still being developed and will be stored in an SQL based database for more dynamic navigation between the component concepts of the bioinformatics field.
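The planned SQL storage is not described in detail, so the following is only a minimal sketch of one way a concept tree could be kept in a relational table and navigated. The table name `concept`, its columns, the root node "Methodology", and the example rows are all hypothetical; SQLite (through Python's standard `sqlite3` module) is used purely for convenience.

```python
import sqlite3

# Hypothetical schema: one row per concept, with a self-reference to its parent.
SCHEMA = """
CREATE TABLE concept (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES concept(id),
    url       TEXT
);
"""

# Recursive query: the named concept plus all of its descendants, with depth.
SUBTREE_QUERY = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM concept WHERE name = ?
    UNION ALL
    SELECT c.id, c.name, s.depth + 1
    FROM concept AS c JOIN subtree AS s ON c.parent_id = s.id
)
SELECT name, depth FROM subtree ORDER BY depth, name;
"""


def make_db(rows):
    """Create an in-memory database and load (id, name, parent_id, url) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def subtree(conn, root_name):
    """Return (name, depth) pairs for the root concept and its descendants."""
    return conn.execute(SUBTREE_QUERY, (root_name,)).fetchall()


# Example rows (assumed); URLs are left empty rather than invented.
ROWS = [
    (1, "Methodology", None, None),
    (2, "DNA", 1, None),
    (3, "sequencing", 2, None),
    (4, "assembly", 2, None),
    (5, "Protein", 1, None),
]

conn = make_db(ROWS)

if __name__ == "__main__":
    for name, depth in subtree(conn, "Methodology"):
        print("  " * depth + name)
```

Storing each concept with a pointer to its parent lets a single recursive query move from any node to its lower levels, which is the kind of navigation between component concepts that the text anticipates.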

Acknowledgement

We thank Mi-Ae Yoo and Heui-Soo Kim (Pusan National University) for support. This work was funded in part by the Bioinformatics Training Grant of the Ministry of Health & Welfare, Korea, and supported by Pusan National University, Korea, and the MRC, UK.

2.2 Knowledge based classification (database systems). Biological databases can be classified according to their data features into 1) sequence, 2) protein, 3) metabolic pathway, 4) organism, and 5) RNA groups.
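The five groups above can be treated as a fixed vocabulary against which database names are filed. The sketch below shows one possible way to do this. The function `build_catalog` and the three example assignments are illustrative assumptions and do not reproduce the contents of the authors' database map.

```python
from collections import defaultdict

# The five groups are taken from the text; everything else is hypothetical.
DATABASE_GROUPS = ("sequence", "protein", "metabolic pathway", "organism", "RNA")


def build_catalog(assignments):
    """Group database names by category, rejecting unknown categories."""
    catalog = defaultdict(list)
    for db_name, group in assignments:
        if group not in DATABASE_GROUPS:
            raise ValueError(f"unknown group {group!r} for {db_name!r}")
        catalog[group].append(db_name)
    # Return every group, including empty ones, in the order listed in the text.
    return {group: sorted(catalog[group]) for group in DATABASE_GROUPS}


# Illustrative assignments only, not the contents of the authors' map.
EXAMPLE_ASSIGNMENTS = [
    ("GenBank", "sequence"),
    ("SWISS-PROT", "protein"),
    ("KEGG", "metabolic pathway"),
]

catalog = build_catalog(EXAMPLE_ASSIGNMENTS)

if __name__ == "__main__":
    for group, names in catalog.items():
        print(f"{group}: {', '.join(names) or '(none listed)'}")
```

Validating each entry against the fixed group list keeps the catalog consistent with the classification, and empty groups stay visible so that gaps in the map are easy to spot.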