Summary
Background:
The recent flood of data from genome sequences and functional genomics has given
rise to a new field, bioinformatics, which combines elements of biology and computer
science.
Objectives:
Here we propose a definition for this new field and review some of the research that
is being pursued, particularly in relation to transcriptional regulatory systems.
Methods:
Our definition is as follows: Bioinformatics is conceptualizing biology in terms
of macromolecules (in the sense of physical chemistry) and then applying “informatics”
techniques (derived from disciplines such as applied maths, computer science, and
statistics) to understand and organize the information associated with these molecules,
on a large scale.
Results and Conclusions:
Analyses in bioinformatics predominantly focus on three types of large datasets available
in molecular biology: macromolecular structures, genome sequences, and the results
of functional genomics experiments (e.g. expression data). Additional information includes
the text of scientific papers and “relationship data” from metabolic pathways, taxonomy
trees, and protein-protein interaction networks. Bioinformatics employs a wide range
of computational techniques including sequence and structural alignment, database
design and data mining, macromolecular geometry, phylogenetic tree construction, prediction
of protein structure and function, gene finding, and expression data clustering. The
emphasis is on approaches integrating a variety of computational methods and heterogeneous
data sources. Finally, bioinformatics is a practical discipline. We survey some representative
applications, such as finding homologues, designing drugs, and performing large-scale
censuses. Additional information pertinent to the review is available over the web
at http://bioinfo.mbb.yale.edu/what-is-it.
Keywords
Bioinformatics - Genomics - Introduction - Transcription Regulation