Systems Biology

In physics terms, systems biology is the ultimate many-body problem of living matter. New and exciting currents are entering the field, with conceptual roots in modern physical ideas. Systems biology is being propelled by the successes of molecular biology and genetics, which have made available genomic blueprints of numerous organisms, together with extensive experimental data covering most aspects of cell function. Problems of post-genomic biology focus on the mechanisms through which biological function emerges from the interaction of numerous molecular components. These problems can seldom be solved by the reductionist method; instead, they call for new and creative approaches. They also present an opportunity for a significant role of theory that can guide experiment by developing increasingly complex hypotheses, formed on the basis of modeling the phenomena and analyzing genomic and other experimental data. Beyond the cell level, systems biology addresses questions of how multi-cellular organisms develop and function, and how populations interact on the ecological scale. Systems biology necessarily includes the study of evolution; to quote Theodosius Dobzhansky: “Nothing in biology makes sense except in the light of evolution”.

Introduction

Condensed matter and materials physicists have long been motivated to explore problems with emergent features such as phase transitions, strong fluctuations, multiple scales of space and time, and spatially-extended dynamics in the context of materials science. A hallmark of the success of this endeavor has been the recognition that there are alternatives to reductionist modeling, which can still yield quantitative predictions for specific systems. This viewpoint was found to be productive in certain judiciously chosen arenas outside of conventional condensed matter, such as atomic and molecular physics, nuclear physics and astrophysics. In systems biology, condensed matter and materials theorists are presented with another compelling set of phenomena to comprehend, which could benefit from their viewpoint.

 

At the same time, progress in systems biology will require a deep and detailed understanding of biological systems, which is essential for identifying the “right” questions. It will require development of new physical concepts geared towards living matter, which is extremely heterogeneous, non-generic and highly specialized, far from equilibrium, and nontrivially coupled to the environment. Systems biology is an intrinsically interdisciplinary subject that requires not only close collaboration with biologists but also calls for merging diverse perspectives from physics, mathematics, engineering, and computer science. Ideas and concepts from these other fields will enrich physical science as it strives to describe the complexity of living matter. The new tools that will have to be invented on the way may well prove useful in other areas of physics and materials science. Systems biology is a natural arena for condensed matter and materials theory: this community can, and should, embrace the rich and fascinating problems that systems biology provides.

Past Successes

Systems biology is an emerging discipline, but there are precursors to its perspective on global properties of biological organization. An early example is the prebiological models of evolution initiated by Manfred Eigen (Nobel Prize 1967) and built on the concept of quasi-species, which have shaped almost all subsequent thinking about evolution. Another example is neuroscience, where physicists contributed to the development of neural network ideas and algorithms. Recent important and promising contributions (by physicists) to systems biology include seminal work on identification of network motifs and modules, successful models in ecology and immunology, fruitful applications of statistical mechanics to algorithm development for bioinformatics, and successful efforts to define and exploit general principles, such as “robustness”.

Figure 18.  Life’s Complexity Pyramid [Z. N. Oltvai and A.-L. Barabási, Science 298, 763 (2002)].

Current and Future Challenges

Systems biology is too vast a subject for us to attempt an exhaustive list. Some of the subjects studied by systems biology are:

Molecular networks

Networks of molecular interactions underlie information flow into and within the cell. External stimuli, such as nutrients (in bacterial chemotaxis), pheromones (in yeast mating response), and light (in vertebrate and invertebrate phototransduction), are sensed by receptors that change conformation and typically acquire enzymatic activity, which modifies a downstream component of a particular signal transduction pathway. Signal transduction pathways often involve multiple intermediate stages before reaching the “effector” or “output device” (such as the flagellar motor in chemotaxis or a transcription factor protein which controls gene expression, e.g. in mating response). Even in cases when most of the components of the transduction cascade are well understood, the fundamental questions of system-level function, such as sensitivity, noise and adaptation properties, are only beginning to be tackled. What are the roles and design principles of the numerous feedback loops that typically surround the transduction cascade? What are the dynamics of adaptation on different time scales? What is the effect of molecular stochasticity?
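
The question of molecular stochasticity can be made concrete with a stochastic simulation. The following is a minimal sketch, assuming the simplest possible case: a single molecular species produced at a constant rate and degraded at a rate proportional to its copy number, simulated with the Gillespie algorithm. The function name and all rate values are illustrative assumptions, not parameters of any real signaling pathway.

```python
import random

def simulate_birth_death(k_prod, k_deg, t_end, n0=0, seed=0):
    """Gillespie simulation of production (rate k_prod) and
    degradation (rate k_deg * n) of one molecular species.
    Returns the time-averaged mean and variance of the copy number n."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    sum_n = 0.0    # time integral of n
    sum_n2 = 0.0   # time integral of n squared
    while True:
        total_rate = k_prod + k_deg * n
        dt = rng.expovariate(total_rate)  # waiting time to the next reaction
        if t + dt >= t_end:
            dt = t_end - t
            sum_n += n * dt
            sum_n2 += n * n * dt
            break
        sum_n += n * dt
        sum_n2 += n * n * dt
        t += dt
        # Choose which reaction fires, in proportion to its rate.
        if rng.random() * total_rate < k_prod:
            n += 1
        else:
            n -= 1
    mean = sum_n / t_end
    variance = sum_n2 / t_end - mean * mean
    return mean, variance

if __name__ == "__main__":
    # Hypothetical rates: on average k_prod / k_deg = 50 molecules.
    mean, variance = simulate_birth_death(50.0, 1.0, 500.0, n0=50, seed=1)
    print("mean copy number:", mean)
    print("Fano factor (variance / mean):", variance / mean)
```

For this process the stationary distribution is Poisson, so the ratio of variance to mean should be close to one. Feedback loops of the kind discussed above would show up as departures from that baseline, which is one way a system-level noise property can be quantified.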

 

Another type of molecular network, the genetic network, is formed by interactions between transcription factor proteins and genes. Genetic networks govern gene expression and control the functional state (e.g. metabolism) and the developmental fate of the cell. For example, the regulatory network of the Drosophila melanogaster segment polarity genes maintains their periodic pattern and the segmented embryonic phenotype. How sensitive is the behavior of this system to various parameters characterizing molecular interactions? Can the molecular dynamics or chemical kinetics description of such a network be reduced to a Boolean network? What can and cannot be described by such reduced models?
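
As an illustration of what a Boolean reduction looks like, the sketch below treats each gene as ON or OFF and updates all genes synchronously. The two-gene mutual-repression circuit used here is a hypothetical toy example, not the segment polarity network, and the helper names are our own.

```python
from itertools import product

# Hypothetical toy circuit: genes A and B repress each other.
RULES = {
    "A": lambda s: not s["B"],  # A is ON unless B is ON
    "B": lambda s: not s["A"],  # B is ON unless A is ON
}
GENES = sorted(RULES)

def step(state):
    """Apply every rule once (synchronous update). A state is a tuple of booleans."""
    s = dict(zip(GENES, state))
    return tuple(bool(RULES[g](s)) for g in GENES)

def attractor_from(state):
    """Iterate until a state repeats; return the set of states on the final cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return frozenset(seen[seen.index(state):])

def all_attractors():
    """Collect the attractors reached from every possible initial state."""
    initial_states = product([False, True], repeat=len(GENES))
    return {attractor_from(s) for s in initial_states}

if __name__ == "__main__":
    for attractor in all_attractors():
        print(sorted(attractor))
```

In this toy circuit the two fixed points (A ON with B OFF, and the reverse) play the role of alternative cell fates, while the two-state cycle is an artifact of synchronous updating. Artifacts of this kind are an example of what a reduced Boolean description cannot be trusted to capture.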

Figure 19.  Bacterial Chemotactic System.

 

Understanding the function of molecular networks requires understanding their spatio-temporal dynamics. It is not enough to know that protein X can activate Y: we must also know “where” and “when”. Receptors tend to cluster, signaling components assemble into complexes, and common second messengers such as Ca2+ or cyclic nucleotides are diffusion limited; “activation” and “activity” of proteins often occur in different spatial locations (e.g. eukaryotic transcription factors may be “recruited” in the cytoplasm but bind DNA in the nucleus). Quantitative modeling will be an important component in the study of spatio-temporal dynamics of molecular networks. A recent example of such modeling is provided by the study of spatio-temporal oscillations of Min proteins in E. coli, which serve as a mechanism by which these bacteria find their “middle” in order to divide.

Statistical Mechanics in Bioinformatics

Progress in the systems biology of molecular networks depends on bioinformatics to extract knowledge from the abundant base of data, which includes genomes, transcriptomes, proteomes, etc. Most aspects of bioinformatics, however, relate naturally to statistical mechanics: e.g. problems of sequence alignment or motif discovery and analysis can be stated in terms of partition functions. For example, a common problem in bioinformatics is to look for a “signal” that is statistically significant given some null model. This requires understanding the statistical properties of null models; the latter can be justifiably taken to be random systems. Many years of work in the physics of disordered systems (e.g., spin glasses) provide statistical physicists with an arsenal of powerful techniques to characterize the properties of such random systems, including the tails of various distribution functions. Consequently, scientists with backgrounds in condensed matter and materials theory and in statistical mechanics have been at the forefront of this discipline. Statistical physics methods have already yielded several exact results about the probability distribution of sequence alignment scores. Some of the most challenging open problems include the development of “reverse engineering” methods that would allow statistical inference of the underlying networks based on heterogeneous experimental sequence and expression data.
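
The idea of testing a signal against a null model can be sketched numerically. The example below scores a local (Smith-Waterman style) alignment of two sequences and estimates its significance by comparing it with scores obtained after shuffling one sequence. The scoring parameters, the shuffle-based null model, and the function names are assumptions chosen for illustration; the analytical results mentioned above address the same question without brute-force sampling.

```python
import random

def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best

def empirical_p_value(a, b, n_shuffles=200, seed=0):
    """Fraction of shuffled versions of b that score at least as well as b itself.
    Shuffling preserves composition but destroys order (the assumed null model)."""
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if local_alignment_score(a, "".join(letters)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_shuffles + 1)

if __name__ == "__main__":
    seq = "ACGTTGCAAGCTTAGGCTAACGTTAGC"  # made-up sequence
    score, p = empirical_p_value(seq, seq)
    print("score:", score, "empirical p-value:", p)
```

The resolution of such an empirical estimate is limited by the number of shuffles, which is why the tails of the null score distribution are better handled analytically.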

Figure 20.  A gene regulatory network [U.S. Department of Energy Genomics: GTL Program, http://www.ornl.gov/sci/techresources/Human_Genome/graphics/slides/sciregulatory.shtml.]

Development and Morphogenesis

The complex spatio-temporal process of forming an adult organism from a fertilized egg is another area where theory can make a significant contribution. In the past two decades we have learned that stripes in early Drosophila melanogaster embryos, despite being a relatively simple developmental pattern, are not generated by a simple physical mechanism (Turing instability) as once expected. Instead, the spatial pattern of gene expression that underlies these stripes is directed through the sequence-specific interaction of transcription factor proteins with regulatory segments of DNA. This, however, does not spell doom for theoretical approaches. Genomic data for Drosophila melanogaster (and related species) are now readily available, and modeling and bioinformatic approaches guiding future experimental work will play an important role in understanding the gene regulatory network underlying patterning of the embryo. Yet other morphogenetic processes, such as patterning of the hexagonal array of ommatidia in a fly eye, involve competition of diffusible activators and inhibitors and are likely to involve a Turing-like, and hence physical, mechanism. Other physical mechanisms (e.g. mechanical interactions) yet to be discovered may be involved in regulation of tissue growth and its coordination with patterning. This offers another fruitful area for interdisciplinary research. Last but not least we mention the challenge of Evo-Devo (Evolutionary Developmental biology), which seeks to identify the mechanisms of evolution of regulatory interactions that control development and to understand the origin of the diversity of species.
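
The Turing mechanism referred to above can be checked with a short linear stability calculation. For a two-component activator-inhibitor system linearized about a uniform steady state, a perturbation with wavenumber k grows at a rate given by the largest real part of the eigenvalues of J - k^2 diag(D_u, D_v), where J is the reaction Jacobian. The sketch below evaluates this growth rate; the Jacobian entries and diffusion constants are hypothetical values chosen for illustration, not measurements from any biological system.

```python
import cmath

def growth_rate(k, jac, d_u, d_v):
    """Largest real part of the eigenvalues of J - k^2 * diag(d_u, d_v)."""
    (a, b), (c, d) = jac
    a_k = a - d_u * k * k
    d_k = d - d_v * k * k
    trace = a_k + d_k
    det = a_k * d_k - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return max(((trace + root) / 2).real, ((trace - root) / 2).real)

def is_turing_unstable(jac, d_u, d_v, k_values):
    """True if the uniform state is stable (k = 0) but some finite k grows."""
    uniform_stable = growth_rate(0.0, jac, d_u, d_v) < 0
    pattern_grows = any(growth_rate(k, jac, d_u, d_v) > 0 for k in k_values)
    return uniform_stable and pattern_grows

if __name__ == "__main__":
    # Hypothetical Jacobian: self-activating activator, self-limiting inhibitor.
    jac = ((1.0, -1.0), (2.0, -1.5))
    ks = [0.1 * i for i in range(1, 31)]
    print("fast inhibitor diffusion:", is_turing_unstable(jac, 1.0, 20.0, ks))
    print("equal diffusion:", is_turing_unstable(jac, 1.0, 1.0, ks))
```

With these numbers the instability appears only when the inhibitor diffuses much faster than the activator, which is the competition of diffusible activators and inhibitors that the text invokes for ommatidial patterning.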

Evolution: phylogenetics, comparative genomics, and network evolution


The advent of molecular phylogenetic methods based upon conserved sequences, such as that of the 16S ribosomal RNA molecule, has provided deep insights into evolutionary dynamics. The discovery of the Archaea as a third branch of the tree of life has revolutionized microbiology, while the relationships between the phylogenetic trees based upon the various amino-acyl-tRNA synthetases have highlighted the significant role of horizontal gene transfer in the distant past. It has become evident that in order to understand fully the emergence of modern day evolution based upon vertical descent, the competition between various mechanisms contributing to genome dynamics must be modeled, and any phase transitions identified. The consequences of this viewpoint are potentially profound, shedding light on the emergence of cellular organization, as well as providing insight into genome mosaicism and other structural features.

Figure 21.  Genetic information is stored in DNA, which is packaged in chromosomes and contains the genes that encode proteins.

Evolution poses many problems that require novel algorithms and approaches to solve. Finding the best phylogenetic tree, when many species are to be fit, requires Monte Carlo methods for sampling trees, and some reckoning of how to balance the most likely assignment of ancestors (given some metric for similarity) with the number of ways of achieving almost as good an assignment. Evolution is a noisy optimization problem for which we know only very vaguely the function (fitness) being optimized. But in certain cases for problems on a variety of scales, intelligent guesses can be made. Consider, for example, control of gene expression by transcription factor proteins. Specific collections of transcription factors need to bind to DNA to control a gene, and one knows the physical chemistry of binding: what is the most mutation-resistant arrangement of binding sites? Here, a quantitative definition of fitness may already be within reach. Many current genome-sequencing projects are intended to provide data with which to compare manifestly similar organisms with the ultimate goal of defining the map from the genome to the organism itself. There is a huge redundancy in how the genome controls gene expression, and tools from natural language processing, cryptography, and various constraint optimization problems (all of interest to the physics community) will be used to decipher what matters.
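
The binding-site question raised above can be given a quantitative form under simple assumptions. The sketch below assumes an additive energy model in which each mismatch to a consensus sequence costs a fixed energy (in units of kT) and occupancy follows a Fermi function of the binding energy. The model, the parameter values, and the robustness measure are all illustrative assumptions rather than an established definition of fitness.

```python
import math

def binding_energy(site, consensus, eps=2.0):
    """Energy in units of kT: eps per mismatch to the consensus (additive assumption)."""
    return eps * sum(s != c for s, c in zip(site, consensus))

def occupancy(site, consensus, mu=3.0, eps=2.0):
    """Equilibrium probability that the site is bound, for chemical potential mu (in kT)."""
    return 1.0 / (1.0 + math.exp(binding_energy(site, consensus, eps) - mu))

def mutational_robustness(site, consensus, alphabet="ACGT", **params):
    """Mean occupancy over all single-base substitutions, relative to the unmutated site."""
    base = occupancy(site, consensus, **params)
    mutants = [
        site[:i] + x + site[i + 1:]
        for i in range(len(site))
        for x in alphabet
        if x != site[i]
    ]
    mean_mutant = sum(occupancy(m, consensus, **params) for m in mutants) / len(mutants)
    return mean_mutant / base

if __name__ == "__main__":
    consensus = "ACGT"  # made-up four-base consensus
    print("occupancy of consensus site:", occupancy(consensus, consensus))
    print("robustness to point mutations:", mutational_robustness(consensus, consensus))
```

A robustness value near one means that a typical point mutation barely changes occupancy. Comparing this measure across alternative arrangements of sites is one hedged way to pose the question of which arrangement is most mutation-resistant.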

Neuronal Networks and the Brain

The study of the brain focuses on networks of neurons underlying perception, motor control and cognition and aims to understand cortical organization and processes, as well as to discover and abstract the information-processing algorithms involved in brain functions. This field poses numerous problems that call for statistical-mechanics and dynamical-systems approaches. Automated, high-throughput techniques for serial section and three-dimensional reconstruction of neural circuits are currently under development by many research groups. These techniques could revolutionize the study of neuro-anatomy at the micro scale, greatly accelerating the pace at which data is generated. It will be important for this data to be processed and interpreted by theorists.

 

Because the brain is a result of evolution by natural selection that favors optimal use of limited resources, it is natural to use constrained optimization as a tool to understand brain design and function. Some of the limited resources have a clear physical nature, such as space, time, and energy. Therefore, physics provides a natural starting point for developing a theoretical framework of brain design, which will eventually unify many disjoint experimental facts. An example of a successful optimization approach is “wiring economy” which calls for minimization of the physical distance between connected neurons. This approach has been able to explain multiple anatomical facts, such as the morphology of individual neurons and organization of cortical areas. In addition, this approach led to a theoretical proposal for a novel mechanism of long-term memory storage, which has been validated by recent experimental observations of structural plasticity.

 

Another fundamental problem involves understanding the neural code. A number of possibilities are being investigated which involve coding by spike frequency, or alternatively by precise spike timing, and involve either single neurons or correlated activity of populations of neurons. Experiments recording or imaging (at high spatiotemporal resolution) the activity of thousands of neurons are now routinely performed. This data naturally lends itself to analysis using information-theoretical techniques derived from statistical physics. Condensed-matter theory ideas concerning collective dynamics of strongly interacting systems and techniques based on dynamical-systems theory have proven useful in identifying specific patterns of neural activity.
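
One of the information-theoretical quantities used in such analyses is the mutual information between a stimulus and a neural response. The following is a minimal sketch that assumes discrete stimuli and spike counts and uses the simple plug-in estimator from observed frequencies; the function name and the toy data in the usage example are made up for illustration.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate (in bits) of I(X; Y) from a list of (x, y) samples,
    e.g. x = stimulus label and y = spike count in a time window."""
    n = len(pairs)
    joint = Counter(pairs)
    p_x = Counter(x for x, _ in pairs)
    p_y = Counter(y for _, y in pairs)
    return sum(
        (count / n) * math.log2(count * n / (p_x[x] * p_y[y]))
        for (x, y), count in joint.items()
    )

if __name__ == "__main__":
    # Toy data: the spike count identifies the stimulus perfectly.
    informative = [(0, 0), (1, 5)] * 10
    # Toy data: the spike count is independent of the stimulus.
    uninformative = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
    print(mutual_information(informative))
    print(mutual_information(uninformative))
```

The plug-in estimator is biased upward when the number of samples is small relative to the number of possible responses, which is a practical concern for recordings from large populations.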

 

Perceptual and cognitive functions are determined by collective neuronal dynamics. Whereas the original work on neural nets was based on relaxational dynamics and isolated fixed points, more recent work addresses more complex (e.g. line) attractors and recurrent (i.e. non-relaxational) dynamics. Theorists are now creating network models of cognitive functions such as decision-making and “rational” choices between alternatives of uncertain value.
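
The relaxational, fixed-point picture that served as the starting point can be sketched with a small Hopfield-type network, in which stored patterns become attractors of the dynamics. The example below assumes a Hebbian storage rule and asynchronous updates with binary (+1/-1) units; the patterns and function names are invented for illustration, and line attractors or recurrent dynamics would require a different construction.

```python
def train(patterns):
    """Hebbian weights for a list of +1/-1 patterns, with no self-coupling."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, sweeps=5):
    """Asynchronous updates: each unit aligns with its local field in turn."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s

if __name__ == "__main__":
    p1 = [1, 1, 1, 1, -1, -1, -1, -1]
    p2 = [1, -1, 1, -1, 1, -1, 1, -1]
    w = train([p1, p2])
    noisy = [-1] + p1[1:]  # corrupt the first unit of p1
    print(recall(w, noisy) == p1)
```

Starting from the corrupted state, the network relaxes back to the stored pattern, which is what it means for that pattern to be an isolated fixed point.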

 

Of great interest is the problem of relating synaptic plasticity to learning and memory. Recent discoveries about the role of spike timing in the induction of long-term synaptic modifications have been incorporated into models of neural development. Another important frontier is the study of how reinforcement learning could be related to the interaction between synaptic plasticity and reward signals in the brain.
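
The dependence of synaptic modification on spike timing is often modeled with an exponential window: a presynaptic spike shortly before a postsynaptic spike strengthens the synapse, and the reverse order weakens it. The sketch below assumes this commonly used functional form with pairwise additive contributions; the amplitudes and time constants are placeholder values, not fits to data.

```python
import math

def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, with dt_ms = t_post - t_pre (milliseconds)."""
    if dt_ms > 0:   # pre before post: potentiation
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:   # post before pre: depression
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0

def total_weight_change(pre_spikes, post_spikes, **params):
    """Sum the pairwise contributions over all pre/post spike pairs."""
    return sum(
        stdp_dw(t_post - t_pre, **params)
        for t_pre in pre_spikes
        for t_post in post_spikes
    )

if __name__ == "__main__":
    pre = [0.0, 50.0]   # hypothetical presynaptic spike times (ms)
    post = [5.0, 55.0]  # postsynaptic spikes follow 5 ms later
    print(total_weight_change(pre, post))
```

In this hypothetical spike train the postsynaptic neuron consistently fires just after the presynaptic one, so the net change is positive. A rule of this form is the kind of ingredient that models of neural development incorporate.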

Quantitative models in immunology

Better understanding of the function and limitations of the immune system is crucial for addressing the “Grand Challenges of Global Health” pertaining to vaccines. Any model must account for the dynamics of the repertoire of B cell and T cell sequences that exist within each individual and must incorporate the year-to-year evolution and mutation of the virus and the existence of multiple viral strains. Phenomena that are ripe for study using techniques from condensed matter and materials theory, such as spin glass theory, include original antigenic sin and immunodominance. In antigenic sin, a primary response of the immune system to an initial viral disease selects memory cells that are recruited in a secondary response to a related but distinct virus; as a result, individuals vaccinated against the original strain may become more susceptible to infection by mutated strains of the virus than individuals who received no vaccination. In immunodominance, dominant antigen fragments suppress generation of T cell activity toward other, non-dominant antigen fragments; this becomes especially significant when one virus, or vaccine, shares an antigen fragment with another virus. For example, exposure to influenza A may increase susceptibility to hepatitis C.

The growth of antibiotic resistance and the evasion of the immune system by the AIDS virus may seem to be purely medical topics, but they are in fact problems of high-speed evolution in a confined setting, with vast experimental resources available precisely because they are medical problems. What has to change for bacteria to evade a drug: genes or their regulation? Are there any principles in common with how new species arise in the environment?

Biodiversity and ecology

With increasing public awareness of the loss of biodiversity, it has become all the more important to understand quantitatively the spatial correlations and abundance distributions of species. These are found, empirically, to be characterized by striking power-law relationships (the species-area rule) and heavy-tailed distributions for species abundance, but a convincing theoretical explanation is still lacking. The universality of these findings, across a variety of biomes, suggests that there may be fruitful analogies with problems in condensed matter physics with strong fluctuations.
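
The species-area rule is usually written as a power law, S = c A^z, relating the number of species S to the sampled area A. A minimal sketch of how the exponent z could be estimated is given below, assuming an ordinary least-squares fit in log-log coordinates. The data in the usage example are synthetic and noise-free, generated from the power law itself, and are not field measurements.

```python
import math

def fit_power_law(areas, species_counts):
    """Least-squares fit of log S = log c + z log A; returns (c, z)."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in species_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    z = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    c = math.exp(mean_y - z * mean_x)
    return c, z

if __name__ == "__main__":
    # Synthetic data generated from S = 5 * A**0.25 (illustrative values only).
    areas = [1.0, 10.0, 100.0, 1000.0]
    counts = [5.0 * a ** 0.25 for a in areas]
    print(fit_power_law(areas, counts))
```

With real survey data, a straight-line fit on logarithmic axes is only a first check. Distinguishing a true power law from other heavy-tailed forms requires more careful statistics.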

 

Most of the central questions in the subject are yet to be answered. What are the key factors that control the patterns of species abundance? What are the principal causes of extinctions and how may they be averted? What is the network of interactions between species and what impact does this network have on the fragility of the community and its ability to recover from a disturbance? How does the metabolic activity of organisms, especially microbes, interact with geochemical processes and help to shape the environment? And what are the right ways to characterize the representative genome (sometimes called the metagenome) in highly diverse environments?

 

Fundamental Questions and Issues

The great challenge of systems biology is to identify and explore the fundamental principles underlying organization and evolution of biological systems. What are the common design principles of biological systems? What are the “laws” of evolution? Essentially all of the phenomena considered above are non-equilibrium. For example, molecular networks develop from proteins available in cells, but require a continuous energy supply to function; morphogenesis is based on a continuous energy flux; and evolution is inherently a process of adaptation to a changing environment. Thus non-equilibrium physics, and statistical mechanics in particular, should be the background against which the design principles and laws are analyzed.

Two important recurrent issues are discussed in the next two sub-sections.

Specificity, crosstalk, robustness and evolvability

The problem of specificity arises in numerous contexts. Transcription factors, which control gene expression, must recognize and bind to certain DNA sequences in order to function. Too little DNA binding specificity would have them bind to spurious DNA sites, while too much specificity would make regulation overly sensitive to mutations. In signal transduction (be it MAP kinase pathways in eukaryotes or two-component systems in prokaryotes, antigen recognition in the immune system, or odor perception in olfaction), there is a question of the extent to which these systems segregate different input/output streams and the extent of crosstalk between them. The great substrate specificity of enzymes is often emphasized, yet it is difficult to understand how a system with perfect specificity could evolve a new function.

A related notion is the idea of robustness, which asserts that biological systems must function without fine-tuning of parameters, so that they are not too sensitive to mutation. The property of robustness, and the aspects of the system which render it robust, have been explored in diverse contexts: adaptation circuits in chemotaxis, morphogen gradient formation, and transcription factor binding, as well as for genetic networks governing segment polarity in fly development and the molecular network of yeast cell cycle. “Robustness” is clearly a general notion that cuts across all systems. "Evolvability" (the ability of an organism to acquire a new heritable function or trait in response to a changing environment) and "adaptability" (the ability of an organism or system to respond with non-heritable functional changes to a change in its environment) are important complementary issues, which fit naturally with modeling approaches employed by condensed matter and materials physicists.

Modularity

Not only has life evolved, but the nature of evolution itself has evolved. Concomitant with the evolution of biological diversity there must have been the evolution of mechanisms that facilitate evolution. What are the features of living matter that facilitate evolution? Modularity is likely to be one of them and it appears to be implemented on all scales. Generation of protein diversity cannot be explained by base substitution alone. Instead protein domains are shuffled like modules by a variety of genomic rearrangement processes, which in addition to homologous recombination include deletions, horizontal transfers, transpositions, etc. Modularity is manifest in the organization of prokaryotic genomes into transcriptional units – operons – that strongly correlate with multi-protein functional units and are common units of horizontal transfer. Modularity is also evident in the organization of regulatory and metabolic networks; yet, systematic methods for module identification are still to be developed. Recent applications of statistical mechanics methods to network motif analysis provide a promising starting point. Understanding the constraints imposed by modularity and quantifying the role of modules in evolution may also benefit from an influx of ideas from statistical mechanics and condensed matter/materials theory.
 

Summary

What can physics contribute to systems biology?

Many problems in systems biology involve aspects of physical phenomena such as spatio-temporal oscillations, excitable media, disorder, stochastic processes and rare fluctuations. Problems involving pattern formation and dynamical systems are ubiquitous. Of particular relevance are recent attempts to create dynamical-systems models of cells and metabolic processes within organisms. Such attempts are rapidly “hitting the wall of complexity” with the huge number of degrees of freedom that are being modeled, and there is a clear and explicit need for systematic techniques that can identify and model only the important dynamical degrees of freedom, just as has been done in other spatially-extended dynamical systems studies within the purview of condensed matter and materials physics. Another direct contribution is through application of physics methods, such as statistical mechanics methods applied to bioinformatics and dynamical systems methods applied in the context of network analysis. Here, physics can become a source of high-tech analytic tools for biology.

 

A less direct but equally important contribution will be the application of condensed matter and materials theory methods to modeling complex systems, which emphasize finding correct “effective” models that describe the essential aspects of the phenomena with the fewest parameters, yet make quantitative and falsifiable predictions. Such phenomenological modeling is well known in biology: it can claim as a past success the Hodgkin-Huxley model (Nobel Prize 1963). Many other fields of application exist in systems biology.

What does systems biology bring to physics?

The immediate impact is to expand the scope of existing sub-fields of condensed matter and materials physics. Thus, the study of morphogenesis in animal development brings forth new problems in pattern formation: e.g. what are the biologically plausible “normal forms” which govern pattern formation in biological systems and what is their generic behavior? Novel problems in statistical mechanics arise from problems of sequence analysis and phylogenetics. The study of stability and evolution of biological networks gives rise to a new class of non-equilibrium system models: the “designed” systems. It is already leading to the establishment of a new subject: “the physics of networks”. Application of statistical physics methods to sequence alignment has also clarified some puzzling issues in the physics of directed polymers.

 

In the past, physics has dealt mainly with the properties of homogeneous matter. More recently, condensed matter and materials physics has turned to the study of inhomogeneous, random systems. These systems, however, are still homogeneous in the statistical sense. The natural next step is the investigation of inhomogeneous and non-random systems, defined (or “designed”) by some optimization process. Beyond that, we must learn to deal with systems with active feedback and degrees of freedom, rather than the passive complexity that is common in today’s condensed matter. Several aspects of systems biology fall into this category.

 

Looking into the future we foresee a major expansion of the range of physics. As Biological Physics takes root and grows into the “Physics of Living Matter” we anticipate theoretical and experimental study of evolving systems and insight into the origin of life. We anticipate the discovery of Laws of Evolution which, like the Laws of Thermodynamics that rule over the chaotic motion of molecules, will be found to govern the seemingly endless diversity of living forms. For now we must seek to ask the right questions and bring up the new generation of scientists who will be equal to this task.