Science

Publications

Neutral evolution of protein-protein interactions: a computational study using simple models

Background: Protein-protein interactions are central to cellular organization, and must have appeared at an early stage of evolution. To better understand their role, we consider a simple model of protein evolution and determine the effect of an explicit selection for protein-protein interactions. Results: In the model, viable sequences all have the same fitness, following the neutral evolution theory. A very simple, two-dimensional lattice representation of the protein structures is used, and the model only considers two kinds of amino acids: hydrophobic and polar. With these approximations, exact calculations are performed. The results do not depend too strongly on these assumptions, since a model using a 3D, off-lattice representation of the proteins gives results in qualitative agreement with the 2D one. With both models, the evolutionary dynamics lead to a steady-state population that is enriched in sequences that dimerize with a high affinity, well beyond the minimal level needed to survive. Correspondingly, sequences close to the viability threshold are less abundant in the steady state, being subject to a larger proportion of lethal mutations. The set of viable sequences has a "funnel" shape, consistent with earlier studies: sequences that are highly populated in the steady state are "close" to each other (with proximity being measured by the number of amino acids that differ). Conclusion: This bias in the steady-state sequences should lead to an increased resistance of the population to environmental change and an increased ability to evolve.

J. Noirel and T. Simonson.
BMC Structural Biology 2007 Nov 19;7:79.
Available from BioMed Central.
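
The depletion of sequences near the viability threshold described in the abstract above can be illustrated with a toy calculation. The following is a minimal sketch, not the model used in the paper: there is no lattice folding or dimerization energy, and viability is replaced by a hypothetical stand-in rule (a sequence is viable if it contains at least MIN_H hydrophobic residues). The names N_SITES, MIN_H and MU and their values are assumptions chosen for illustration. All viable sequences have the same fitness, offspring carry at most one point mutation, and lethal mutants are discarded.

```python
from itertools import product

N_SITES = 8   # sequence length (toy value, an assumption)
MIN_H = 4     # hypothetical viability rule: at least this many H residues
MU = 0.5      # probability that an offspring carries one point mutation


def is_viable(seq):
    """Stand-in for the viability test (NOT the paper's dimerization criterion)."""
    return seq.count("H") >= MIN_H


def neighbours(seq):
    """Yield every sequence that differs from seq at exactly one site."""
    for i, aa in enumerate(seq):
        yield seq[:i] + ("P" if aa == "H" else "H") + seq[i + 1:]


def steady_state(n_iter=2000):
    """Iterate neutral mutation-selection dynamics until the frequencies settle."""
    all_seqs = ("".join(s) for s in product("HP", repeat=N_SITES))
    viable = [s for s in all_seqs if is_viable(s)]
    freq = {s: 1.0 / len(viable) for s in viable}
    for _ in range(n_iter):
        # Unmutated offspring stay where they are.
        new = {s: (1.0 - MU) * f for s, f in freq.items()}
        # Mutated offspring move to a neighbour; non-viable ones are lost.
        for s, f in freq.items():
            for t in neighbours(s):
                if t in new:
                    new[t] += MU * f / N_SITES
        total = sum(new.values())
        freq = {s: f / total for s, f in new.items()}
    return freq


def mean_frequency_by_h_count(freq):
    """Average steady-state frequency of sequences grouped by number of H residues."""
    sums, counts = {}, {}
    for s, f in freq.items():
        h = s.count("H")
        sums[h] = sums.get(h, 0.0) + f
        counts[h] = counts.get(h, 0) + 1
    return {h: sums[h] / counts[h] for h in sorted(sums)}


if __name__ == "__main__":
    for h, f in mean_frequency_by_h_count(steady_state()).items():
        print(h, f)
```

In this toy setting, sequences sitting exactly at the threshold lose a fraction of their offspring to lethal mutations, so their steady-state frequency is lower than that of sequences one step further from the threshold, which is the qualitative effect the abstract reports.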

Automated extraction of meaningful pathways from quantitative proteomics data

Technological developments in the life sciences have resulted in an ever-accelerating pace of data production. Systems Biology tries to shed light upon these data by building complex models describing the interactions between biological components. However, extracting information from this morass of data requires the use of sophisticated computational techniques. Here, we propose a method suitable for integrating data drawn from quantitative proteomics into a metabolic scaffold and identifying the metabolic pathways which are collectively up-regulated or down-regulated. The availability of such a tool is highly desirable as the extracted information could then be taken as a starting point for in-depth analyses, in particular in fields like Synthetic Biology, where datasets need to be characterised routinely.

J. Noirel, S. Y. Ow, G. Sanguinetti, A. Jaramillo, P. C. Wright.
Briefings in Functional Genomics and Proteomics 2008.
Software: Browse, archive.

MMG: a probabilistic tool to identify submodules of metabolic pathways

Motivation: A fundamental task in systems biology is the identification of groups of genes that are involved in the cellular response to particular signals. At its simplest level, this often reduces to identifying biological quantities (mRNA abundance, enzyme concentrations, etc.) which are differentially expressed in two different conditions. Popular approaches involve using t-test statistics, based on modelling the data as arising from a mixture distribution. A common assumption of these approaches is that the data are independent and identically distributed; however, biological quantities are usually related through a complex (weighted) network of interactions, and often the more pertinent question is which subnetworks are differentially expressed, rather than which genes. Furthermore, in many interesting cases (such as high-throughput proteomics and metabolomics), only very partial observation is available, resulting in the need for efficient imputation techniques. Results: We introduce MMG (Mixture Model on Graphs), a novel probabilistic model to identify differentially expressed submodules of biological networks and pathways. The method can easily incorporate information about weights in the network, is robust against missing data and can be easily generalised to directed networks. We propose an efficient sampling strategy to infer posterior probabilities of differential expression, as well as posterior probabilities over the model parameters. We assess our method on artificial data, demonstrating significant improvements over standard mixture model clustering. Analysis of our model results on quantitative high-throughput proteomic data leads to the identification of biologically significant subnetworks, as well as the prediction of the expression level of a number of enzymes, some of which are then verified experimentally. Availability: MATLAB code is available from http://www.dcs.shef.ac.uk/~guido/software.html.

G. Sanguinetti, J. Noirel, P. C. Wright.
Bioinformatics 2008 Feb 21.
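
The general idea in the abstract above, a mixture model whose class labels are coupled along the edges of a network and inferred by sampling, can be sketched as follows. This is an illustrative toy and not the MMG model or the MATLAB code referenced above: it assumes two classes with fixed Gaussian emissions, a simple prior that rewards agreement between neighbouring nodes, and a plain Gibbs sampler. The function name gibbs_posterior and the parameters means, sd and coupling are hypothetical, and they are held fixed here, whereas the paper also infers posteriors over model parameters. Missing observations (None) contribute no likelihood term, so their labels are driven by their neighbours only.

```python
import math
import random


def gibbs_posterior(edges, obs, means=(0.0, 2.0), sd=0.5, coupling=1.0,
                    sweeps=2000, burn_in=200, seed=0):
    """Estimate P(label = 1) for every node by Gibbs sampling.

    edges : list of (i, j) pairs describing an undirected network
    obs   : list of floats, or None where the measurement is missing
    """
    rng = random.Random(seed)
    n = len(obs)
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)

    labels = [rng.randint(0, 1) for _ in range(n)]
    counts = [0] * n
    for sweep in range(sweeps):
        for i in range(n):
            logp = []
            for k in (0, 1):
                # Prior term: reward agreement with neighbouring labels.
                lp = coupling * sum(1 for j in nbrs[i] if labels[j] == k)
                # Likelihood term: skipped when the observation is missing.
                if obs[i] is not None:
                    lp += -0.5 * ((obs[i] - means[k]) / sd) ** 2
                logp.append(lp)
            p1 = 1.0 / (1.0 + math.exp(logp[0] - logp[1]))
            labels[i] = 1 if rng.random() < p1 else 0
        if sweep >= burn_in:
            for i in range(n):
                counts[i] += labels[i]
    return [c / (sweeps - burn_in) for c in counts]


# Toy chain of six nodes; node 1 has no measurement.
example_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
example_obs = [2.0, None, 2.2, 0.1, -0.1, 0.0]
example_posterior = gibbs_posterior(example_edges, example_obs)
```

In the toy chain, the unobserved node sits between two nodes with high measurements, so the neighbour coupling pulls its label towards the "differentially expressed" class even though it has no data of its own.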

Quantitative overview of N2 fixation in Nostoc punctiforme ATCC 29133 through cellular enrichments and iTRAQ shotgun proteomics

Nostoc punctiforme ATCC 29133 is a photoautotrophic cyanobacterium with the capacity to fix atmospheric N2. Its ability to mediate this process is similar to that described for Nostoc sp. PCC 7120, where vegetative cells differentiate into heterocysts. Quantitative proteomic investigations at both the filament level and the heterocyst level are presented using isobaric tagging technology (iTRAQ), with 721 proteins at the 95% confidence interval quantified across both studies. Observations from both experiments yielded findings confirmatory of both transcriptional studies and existing Nostoc sp. PCC 7120 iTRAQ experiments. N. punctiforme exhibits similar metabolic trends, though changes in a number of metabolic pathways are less pronounced compared to Nostoc sp. PCC 7120. Results also suggest a number of proteins that may benefit from future investigations. These include ATP dependent Zn-proteases, N-reserve degraders and also redox balance proteins. Complementary proteomic datasets from both organisms present key precursor knowledge that is important for future cyanobacterial biohydrogen research.

S. Y. Ow, J. Noirel, T. Cardona, A. Taton, P. Lindblad, K. Stensjö, and P. C. Wright.
Submitted to the Journal of Proteome Research 2008.

An R implementation of MMG