ThalInd, a β-thalassemia and hemoglobinopathies database for India: defining a model country-specific and disease-centric bioinformatics resource

    Human Mutation DATABASES

    Sujata Sinha,1,2 Michael L. Black,1 Sarita Agarwal,3 Reena Das,4 Alan H. Bittles,1,5 and Matthew Bellgard1

    1Centre for Comparative Genomics, Murdoch University, Perth, Australia; 2Thalassemia Working Group, Varanasi, India; 3Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow, India; 4Postgraduate Institute of Medical Education and Research, Chandigarh, India; 5Edith Cowan University, Perth, Australia

    Communicated by Richard G.H. Cotton

    Received 19 November 2010; accepted revised manuscript 29 March 2011.

    Published online 21 April 2011 in Wiley Online Library. DOI 10.1002/humu.21510

    ABSTRACT: Web-based informatics resources for genetic disorders have evolved from genome-wide databases like OMIM and HGMD to Locus Specific databases (LSDBs) and National and Ethnic Mutation Databases (NEMDBs). However, with the increasing amenability of genetic disorders to diagnosis and better management, many previously underreported conditions are emerging as disorders of public health significance. In turn, the greater emphasis on noncommunicable disorders has generated a demand for comprehensive and relevant disease-based information from end-users, including clinicians, patients, genetic epidemiologists, health administrators and policymakers. To accommodate these demands, country-specific and disease-centric resources are required to complement the existing LSDBs and NEMDBs. Currently available preconfigured Web-based software applications can be customized for this purpose. The present article describes the formulation and construction of a Web-based informatics resource for β-thalassemia and other hemoglobinopathies, initially for use in India, a multiethnic, multireligious country with a population approaching 1,200 million. The resource ThalInd has been created using the LOVD system, an open source platform-independent database system. The system has been customized to incorporate and accommodate data pertinent to molecular genetics, population genetics, genotype-phenotype correlations, disease burden, and infrastructural assessment. Importantly, the resource also has been aligned with the administrative health system and demographic resources of the country.

    Hum Mutat 32:887–893, 2011. © 2011 Wiley-Liss, Inc.

    KEY WORDS: bioinformatics resource; database; β-thalassemia; hemoglobinopathies; India; ThalInd


    The evolution of Web-based informatics resources for genetic disorders can be traced through three distinct phases: (1) genome-wide mutation databases, (2) locus-specific databases (LSDBs), and (3) national and ethnic mutation databases (NEMDBs). Genome-wide databases, such as OMIM, HGMD, and Ensembl, contain pooled information on all genes and incorporate advanced tools for gene analysis with a user interface. LSDBs were designed so that researchers dealing with a specific disease can retrieve current data from a single source and thus need to search no further than an LSDB [Cotton, 2009]. The majority of LSDBs incorporate tools for the analysis of gene expression and the phenotype in normal and disease conditions [Patrinos, 2006]. NEMDBs represent the third phase and were devised to provide information on disease-causing mutations and their frequencies in different population groups within a country. They can help in the optimization of molecular diagnostic services and the creation of appropriate awareness among clinicians, scientists, and the general public about genetic disorders that may be prevalent in different populations and communities [Patrinos, 2006]. The development of disease-specific national resources could therefore be considered the next critical phase in the evolution of Web-based informatics resources on genetic disorders.

    With the increasing amenability of many genetic disorders to prevention, early diagnosis, and better management through early intervention and increasing curative options, informatics resources also need to be extended and up-scaled to provide relevant information to a wide range of potential users, including clinicians, patients, genetic epidemiologists, health administrators, and policymakers. It also is desirable that data and issues of this nature can be accommodated within the purview of a national health system. Disease-specific national resources that accommodate the information requirements on disease burden, treatment options, and existing facilities for diagnosis, and that track utilization of these services, would greatly enhance their relevance to society.

    In this article we introduce and discuss the design of a country-specific bioinformatics resource for a genetic disorder of widespread, major public health significance. The highlight of the conceptual model is alignment with national demographic, administrative, and health systems, illustrated by its application to β-thalassemia and other hemoglobinopathies, autosomal recessive disorders adversely affecting the health of large numbers of people worldwide. The resource has been specifically designed for adoption in India, a large and demographically complex country, but it has the additional potential to serve as a prototype for other genetic disorders that impact on health in all low- and middle-income countries.

    Additional Supporting Information may be found in the online version of this article.

    Correspondence to: Matthew Bellgard, Centre for Comparative Genomics, Murdoch University, South Street, Perth, WA 6150, Australia.

    Bioinformatics Requirements of β-Thalassemia and Hemoglobinopathies in the Indian Context

    It has been suggested that in the near future β-thalassemia and related disorders are likely to emerge as the category of genetic disease that will have the most widespread impact on public health and health resources in India [Agarwal, 2005; Petrou, 2010; Weatherall, 2010]. The autosomal recessive disease β-thalassemia is the most complex disease among the larger group of inherited hemoglobin disorders. The β-globin gene itself is located on chromosome 11p15.5, with 242 mutations reported [Giardine et al., 2007]. However, expression of the β-globin gene is influenced by secondary and tertiary genetic modifiers of the disease phenotype, resulting in extensive phenotypic diversity [Weatherall, 2001]. The prevalence of symptomatic or clinically silent hemoglobin variants such as HbE, HbS, and HbD within the same population subsets further contributes to diverse phenotypes, resulting in thalassemic hemoglobinopathies and homozygous and compound hemoglobinopathies. Stem cell transplantation is the only curative option currently available to patients, but in low-income countries its adoption has been restricted because of limited donor availability, the high costs involved, and the small number of specialist centers. As a result, a large majority of patients remain reliant on chronic management regimens involving regular blood transfusions and iron chelation therapy, which places a huge burden on the national health resources and on the resources of patients, their families, and communities. Given these circumstances, and the large numbers of patients involved [Sinha et al., 2009], it is envisaged that a comprehensive national information resource on thalassemia could greatly aid healthcare delivery and control strategies [IUSSTF, 2007].

    The occurrence and prevalence of recessive mutations in a particular population may be dependent on marriage and reproductive practices [Sinha et al., 2009]. The complex, highly stratified structure of the Indian population, characterized by the unique, long-established caste system, has been further complicated by multiple waves of immigration and subdivisions based on six major religions and 22 major spoken languages [Black et al., 2010]. With a multifaceted population history of this nature, a thorough knowledge and understanding of local community structure is required in order to devise an effective and relevant informatics resource for β-thalassemia.

    Health policies in India are formulated at national level and implemented by individual states according to the directives of the national government, with each state organizing its own health infrastructure via a Department for Health and Family Welfare. The decennial Census of India is the major national demographic resource, and serves as a reference point for most national planning and policy decisions. The National Rural Health Mission (NRHM), a major initiative of the national government, focuses on improving healthcare delivery to the rural areas where >70% of the population reside.

    A national bioinformatics resource aligned with the health administrative system should therefore facilitate: (1) the implementation of prevention and control programs by public health authorities; (2) an improvement in the availability and accessibility of thalassemia care services in all areas, rural and urban; (3) clinical research to improve chronic management regimens; and (4) research in related disciplines, including population genetics, public health, and clinical medicine.

    Structuring the Web-Based National Resource for β-Thalassemia and Hemoglobinopathies

    Given the immense size of the population, the very significant level of regional diversity, and the current health administrative system, the creation of a pan-Indian Web-based resource by merging information generated at State level is a logical progression. The key components required of an effective β-thalassemia Web-based resource are outlined in Figure 1.
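    As a rough illustration of what merging State-level information into a pan-Indian view could involve, the sketch below pools per-state mutation counts into national allele frequencies. It is not the ThalInd or LOVD schema: the record fields, the function name, and the placeholder labels ("State A", "mutation_X") are hypothetical, the counts are invented for demonstration only, and the pooling assumes each state reports a single total of alleles tested that applies to all of its mutation records.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StateRecord:
    """One state-level submission (hypothetical schema, not ThalInd's)."""
    state: str
    mutation: str
    mutant_alleles: int   # alleles found to carry this mutation
    alleles_tested: int   # total alleles screened in that state


def merge_states(records):
    """Pool state-level counts into national per-mutation allele frequencies."""
    mutant = defaultdict(int)
    tested_by_state = {}
    for r in records:
        mutant[r.mutation] += r.mutant_alleles
        # Count each state's denominator once, however many mutations it reports.
        tested_by_state[r.state] = r.alleles_tested
    total_tested = sum(tested_by_state.values())
    if total_tested == 0:
        return {}
    return {m: n / total_tested for m, n in mutant.items()}


if __name__ == "__main__":
    # Placeholder values for demonstration only; not real screening data.
    demo = [
        StateRecord("State A", "mutation_X", 10, 200),
        StateRecord("State A", "mutation_Y", 5, 200),
        StateRecord("State B", "mutation_X", 30, 800),
    ]
    for mutation, freq in sorted(merge_states(demo).items()):
        print(f"{mutation}: {freq:.4f}")
```

    Pooling raw counts rather than averaging per-state frequencies keeps large and small states weighted by how many alleles each actually screened, which is one plausible design choice when state sample sizes differ widely.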

    The core component of this resourc