BIOINFORMATICS: New frontier for IT
Bioinformatics is the development and application of computer methods for analysis,
interpretation, and prediction, as well as for the design of experiments. It has
emerged as a strategic frontier between biology and computer science.
Computer technology is constantly changing: ten years ago, the term "bioinformatics"
did not exist; now it is commonly heard in the developing world. Some of the
field's role models had been building databases, developing algorithms and making
biological discoveries by sequence analysis since as far back as the sixties, long
before anyone thought to label this activity with a special term (if anything, it
was called "molecular evolution"). As yet there are very few books available on
the subject (for instance, Bioinformatics: The Machine Learning Approach by Pierre
Baldi and Søren Brunak, MIT Press, Feb 1998), though some other
material has been published in papers and journals, for instance the journal
Bioinformatics, which Oxford University Press publishes in collaboration
with Stanford University.
The science of bioinformatics, which is the melding of molecular biology with
computer science, is essential to the use of genomic information in
understanding human diseases, and in the identification of new molecular targets
for drug discovery. In recognition of this, many universities, government
institutes, and pharmaceutical firms are forming bioinformatics groups today,
consisting of computational biologists and bioinformatics computer scientists.
Such groups will prove to be the key to unraveling the mass of
information generated by large-scale sequencing efforts under way in
laboratories around the world.
We define computational biology in the broadest sense of the term. For us, a
computational biologist is someone who is expert in using modern software tools
to explore the function of living systems. He or she discovers answers to
significant biomedical questions by applying these tools to the analysis and
synthesis of biological, physical, and chemical data. The discipline is not new.
It has its roots in mathematical biology, and has attracted distinguished
scientific minds for nearly two centuries.
But with the advent of ever more extensive public and private databases, and
with the explosive development of the Internet, computational biology has
achieved a high-profile, rapid-growth position in the biomedical marketplace.
The great potential of computational biology is the ability to extract useful
knowledge from enormously complex biological data.
Computational biology subsumes many specialties, such as:
- Polynucleotide sequence analysis
- Polypeptide sequence analysis
- Combinatorial approaches to sequencing
- Protein homology modeling
- Computer modeling of dynamic biological systems
- PKPD modelling
- Structure-function analysis
- Biomedical database design, analysis, and applications
- Phylogenetic analysis
- Computational and combinatorial chemistry
- Optimization of experimental designs
- Neural nets
- Chaos theory
- Biostatistics
- Protein molecular dynamics
- Protein ab initio structure prediction
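To make the first specialty on this list concrete, here is a minimal sketch of two routine polynucleotide sequence calculations: GC content and the reverse complement. The sequence is made up and the function names are hypothetical; it assumes plain A/C/G/T input and is not taken from any particular package.

```python
# Illustrative sketch only: the example sequence is invented and the helper
# names are hypothetical, not the API of any existing package.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the opposite DNA strand, read in its own 5'-to-3' direction."""
    # Assumes only A, C, G and T; any other symbol raises KeyError.
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    dna = "ATGCGC"
    print(gc_content(dna))
    print(reverse_complement(dna))
```

Real sequence data also contains ambiguity codes such as N, which this sketch deliberately does not handle.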
No research group can expect to establish and maintain expertise in all of these
sub-disciplines. As bioinformatics service providers, Pakistani IT companies can
choose to focus on the one that, in their experience, is critical for the solution of
complex problems that arise in cell biology and in animal and human physiology
and pathophysiology.
Bioinformatics should be judged on how well it can advance biology. Computer
results must always be validated in the wet laboratory setting. The problem is
that, in some cases, the link between the computer and the lab is missing. A
successful bioinformatics company must go the extra step to show that its
computer results are biologically accurate.
It is also important to know that after the information is integrated and
organized, it must be analyzed by ever more sophisticated statistical and
analytical tools to quickly and accurately identify drug targets and lead
compounds.
Market Trends
Recent research by London-based consulting firm Silico Research indicates the
market for bioinformatics software and services is growing at 17 per cent
annually, and it is expected to reach $110 million by 2004. Research and
consulting firm Frost & Sullivan, for example, has predicted sales of $160
million in 2001, with the potential to grow to as much as $5 billion in the next
five years. In comparison, the global pharmaceutical industry is worth more than
$150 billion per year.
In the United States, Celera Genomics is a hybrid company, having sequenced the
fruit fly, human, mouse, and dog genomes. It has created a large database of
single-nucleotide polymorphisms (SNPs), which are the genetic variations between
individuals. Celera is augmenting its genetic data with more functional data
that indicate, for example, when a gene is expressed and what genes are
expressed in a given disease state. The public GenBank holds sequence data on
more than seven billion units of DNA, while Celera Genomics claims to have 50
terabytes of data in store, equivalent to 80,000 compact disks.
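As a rough illustration of what a SNP is in data terms, the sketch below compares two sequences that are assumed to be already aligned and of equal length, and reports each position where a single base differs. The sequences and the function name are hypothetical, and real SNP discovery involves alignment and quality filtering that are not shown here.

```python
from typing import List, Tuple


def find_snps(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Positions are 1-based. Both sequences are assumed to be pre-aligned,
    so they must have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(pairs, start=1)
        if ref_base != alt_base
    ]


if __name__ == "__main__":
    # Made-up sequences differing at one position.
    print(find_snps("ACGTAC", "ACGAAC"))
```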
It is impossible to access the data or to make any sense of the sequences
without special software. Some software is developed and made freely available
in the public domain, but the databases of private companies are provided to
paid-up subscribers only. Meanwhile, bioinformatics companies in the US offer
one-stop Internet shopping. These online companies, such as Double Twist, allow
users to access various types of databases and use software to manipulate the
data. In 1999, Compugen launched what it calls the first Internet life sciences
research engine, which takes advantage of the Internet's salient feature, speeding
and facilitating information transfer, to boost the value of laboratory experiments
and to broaden access to both public and proprietary knowledge.
Tim Littlejohn, CSO of e-bioinformatics, once explained at a conference how
Australia's four-year-old national bioinformatics resource has evolved into a
for-profit service called BioNavigator that is now run by e-bioinformatics.
BioNavigator is an easy-to-use interface for accessing multiple public data
sets, such as those offered by the National Center for Biotechnology Information
(NCBI). E-bioinformatics also lets customers store data on its servers. This
helps small and midsize biotechs avoid the costs of maintaining their own data
warehouses. The e-bioinformatics business is modeled on the software
outsourcing services called ASPs (application service providers), which allow
small businesses to rent essential software applications like payroll or benefits
administration over the Internet.
Methodology
Experimental proof is still the gold standard. Bioinformatics is used to help
focus the experiments of the bench-top biologist; one goal is to eliminate false
positives.
As Dan Levine, director of business development for Xpogen, says, "Our goal is
to make sense of the Human Genome Project. We want to bring things to the next
stage, finding out not just what the human genome is, but what it does. Without
bioinformatics, it would be impossible to find patterns in the vast sea of data
that is being generated."
"The availability of the genome sequence is just the beginning," says Arnold
Hagler, president of Structural Proteomics. He adds, "The goal is to identify
patterns in this information that can be used to develop more effective
therapeutics - drugs that work more quickly, are safer, are less toxic, and have
better bio-availability."
As a biologist, what skills do you need to make the transition to
bioinformatics?
In addition to extensive knowledge of the run-of-the-mill molecular biology
packages (GCG, BLAST, etc.), you will need to learn web and programming skills
including HTML, Perl, Java and C++, and be familiar with a variety of operating
systems (especially UNIX). Relational database skills are very much sought after,
so knowledge of SQL and a major database application such as Sybase or Oracle
will be highly advantageous. You will also need to learn about structural
biology and modeling, mathematical optimization, computer graphics theory and
linear algebra.
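To give a feel for the relational database skills mentioned above, here is a small sketch that stores a few gene records in a table and retrieves them with SQL. It uses Python's built-in SQLite module as a stand-in for Sybase or Oracle; the table layout, gene names and lengths are invented for illustration.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding a few made-up gene records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene ("
        "id INTEGER PRIMARY KEY, symbol TEXT, organism TEXT, length_bp INTEGER)"
    )
    conn.executemany(
        "INSERT INTO gene (symbol, organism, length_bp) VALUES (?, ?, ?)",
        [
            ("geneA", "human", 1200),  # placeholder names, not real genes
            ("geneB", "mouse", 950),
            ("geneC", "human", 3100),
        ],
    )
    return conn


def genes_for(conn: sqlite3.Connection, organism: str) -> List[Tuple[str, int]]:
    """Return (symbol, length_bp) for one organism, longest gene first."""
    cursor = conn.execute(
        "SELECT symbol, length_bp FROM gene "
        "WHERE organism = ? ORDER BY length_bp DESC",
        (organism,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    print(genes_for(connection, "human"))
    connection.close()
```

The same SELECT statement would carry over to a larger database server largely unchanged, which is why SQL itself is the skill worth learning first.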
As a computational/quantitative scientist, what skills do you need to make
the transition to bioinformatics?
The subjects you will need to study are molecular biology, protein (bio)chemistry,
structural biology, phylogenetics, genomics, and drug design.
Applications
Bioinformatics technology uses computational tools provided by the information
technology revolution, such as statistical software, graphics simulation and
database management, to organize and analyze information about biological
systems, which, for biotechnology, is information about cells and biological
molecules. Using another product of the information revolution, the Internet,
scientists broadcast this information around the world.
Bioinformatics technology helps us to:
- Map genomes and identify genes
- Determine protein structure and simulate protein interactions
- Discover new therapeutic targets and design medicines aimed at the targets
- Assess the effects of virtual mutations on gene function.
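The last item, assessing virtual mutations, can be sketched in a few lines: change one base in a coding sequence on the computer and compare the translated protein before and after. The sequence below is made up, the function names are hypothetical, and the sketch assumes the standard genetic code and a sequence that starts in reading frame.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence codon by codon, ignoring leftover bases."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def point_mutate(dna: str, position: int, new_base: str) -> str:
    """Return a copy of dna with the base at a 1-based position replaced."""
    if not 1 <= position <= len(dna):
        raise ValueError("position outside the sequence")
    return dna[:position - 1] + new_base.upper() + dna[position:]


if __name__ == "__main__":
    original = "ATGGAGTGA"  # made-up sequence: Met, Glu, stop
    mutant = point_mutate(original, 5, "T")  # GAG becomes GTG
    print(translate(original), translate(mutant))
```

A single changed base here swaps one amino acid for another; whether that alters the protein's function is the part that still has to be confirmed at the bench.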
DNA Chip Technology
DNA chip technology, a marriage of the semiconductor manufacturing industry and
molecular genetics, helps in converting the raw genetic data provided by the
Human Genome Project into useful products. Sequencing the human genome, while a
remarkable achievement, provides only the first milestone in the upcoming
medical revolution. The gene sequence and mapping data mean little until we
determine what those genes do. This field of study, known as functional
genomics, helps us translate gene identification and DNA sequence data into
biological functions.
Any study of gene function is, at its core, a study of proteins. Each cell
produces thousands of proteins, each with a specific function. This collection
of proteins in a cell is known as the proteome, and, unlike the genome, which is
constant irrespective of cell type, the proteome varies from one cell type to
the next.
The science of proteomics attempts to identify the protein profile of each cell
type, assess protein differences between healthy and diseased cells, and uncover
not only a protein's specific function but also how it interacts with other
proteins. Neither functional genomics nor proteomics is an end in itself. Their
medical value will be in identifying specific therapeutic targets and helping us
understand the complex biochemistry of disease processes.
The DNA chip technology is being used to:
- Detect mutations in disease-causing genes
- Monitor gene activity
- Diagnose infectious diseases and identify the best antibiotic treatment
- Identify genes important to crop productivity
- Improve screening for microbes used in bioremediation.
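Monitoring gene activity usually comes down to comparing signal intensities for the same gene in two samples. A minimal sketch, assuming made-up intensities for placeholder genes and an arbitrary two-fold cutoff, could look like this; the names and numbers are illustrative and do not come from any real chip.

```python
import math

# Hypothetical signal intensities: gene -> (healthy sample, diseased sample).
INTENSITIES = {
    "geneA": (200.0, 800.0),
    "geneB": (500.0, 500.0),
    "geneC": (900.0, 225.0),
}

# Assumed cutoff: a gene counts as changed when its signal at least
# doubles or halves, that is, when |log2 ratio| >= 1.
THRESHOLD = 1.0


def log2_ratio(healthy: float, diseased: float) -> float:
    """Return log2(diseased / healthy); positive means more active in disease."""
    if healthy <= 0 or diseased <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(diseased / healthy)


def classify(ratio: float, threshold: float = THRESHOLD) -> str:
    """Label a log2 ratio as 'up', 'down' or 'unchanged'."""
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    for gene, (healthy, diseased) in INTENSITIES.items():
        ratio = log2_ratio(healthy, diseased)
        print(gene, round(ratio, 2), classify(ratio))
```

Real chip data is noisy, so in practice such ratios are normalized and replicated before any gene is called up or down.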
The Future
Because it is a multidisciplinary field, bioinformatics is evolving at the rate
of the different disciplines that comprise it. The changes taking place in
computer science, molecular biology, structural chemistry and a variety of other
fields all impact on the growth of bioinformatics.
However, not all areas of bioinformatics are growing. Bioinformatics research
suffers from the same pressures found in other areas of science. This is
especially true in bioinformatics because many of the algorithms have been
developed to compensate for inadequate experimental technique. Biology is
becoming more and more an information science and bioinformatics tools are
finding their way into every corner of the biological sciences.
All of these changes raise a variety of questions. How will bioinformatics
change once instrumentation improves? How will fragment assembly programs change
once sequencing machines can read longer fragments? Will we still need
structural prediction programs if structural determination becomes easy for the
average scientist to perform? Where is bioinformatics heading? What happens when
we begin to solve current challenges? What are the future challenges and
opportunities?
As always, scientific knowledge is a double-edged sword. Bioinformatics can both
solve and create problems. The big question is: What are these potential
problems and how can they be addressed?