Ten Thousand Views of Bioinformatics: A Bibliome Perspective
07 March 2018 (online)
Objective Summarize the current state of bioinformatics research as reflected in the literature published in 2008.
Methods The entire corpus of publications indexed by the National Library of Medicine in the PubMed repository was reviewed for articles tagged as belonging to the discipline of bioinformatics by Medical Subject Heading or by term in the title or abstract of the article. Selected summary statistics of this corpus were then used to motivate additional exploration.
Results Over ten thousand articles published in 2008 populated the bioinformatics corpus. Significantly, there were at least as many publications in genomics and genetics that used computational techniques but were not identified as bioinformatics research. Genomics and proteomics continued to be the leading application domains of bioinformatics research, but despite the proliferation of human studies, the genes most studied in the corpus were from yeast rather than from humans. The growth in genomic studies of human disease was accompanied by a growing critical literature regarding the methods, results and impact of these studies. Concurrently, the availability of full genome sequences at commodity prices increased the computational challenges of human studies by several orders of magnitude. Further concerns were raised about the consequences of public disclosure of comprehensive or even aggregate genomic data.
Conclusion The impressive size of the bioinformatics bibliome is easily dwarfed by the challenges generated by the continued growth of high-throughput biological data sets. The demand for bioinformatics expertise and tools is therefore likely to continue to increase, at least in the near term.
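The selection criterion described under Methods (tagged by Medical Subject Heading, or matched by term in the title or abstract) can be illustrated with a PubMed search string. The sketch below is only an approximation: the abstract does not give the exact query used, so the MeSH heading "Computational Biology", the free-text term "bioinformatics", and the helper names are all assumptions made for illustration. It builds a request URL for the NCBI E-utilities ESearch service without sending it.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint for querying Entrez databases.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_bioinformatics_query(year: int) -> str:
    """Return a PubMed search string for one publication year.

    Assumption: the MeSH heading and the free-text term below are
    illustrative choices, not the query reported by the article.
    """
    mesh_clause = '"Computational Biology"[MeSH Terms]'
    text_clause = "bioinformatics[Title/Abstract]"
    date_clause = f"{year}[Date - Publication]"
    # Match either the MeSH tag or the title/abstract term, within the year.
    return f"({mesh_clause} OR {text_clause}) AND {date_clause}"


def build_count_url(year: int) -> str:
    """Return an ESearch URL that asks only for the number of matches."""
    params = {
        "db": "pubmed",
        "term": build_bioinformatics_query(year),
        "rettype": "count",  # return the hit count rather than a list of IDs
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_bioinformatics_query(2008))
    print(build_count_url(2008))
```

A count obtained this way today need not match the figure reported in the abstract, since PubMed indexing is updated retrospectively and the original search terms may have differed.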