Focus

September 16, 2005

GENOMICS

Integrated Technology Predicts Functional Systems in Cell

Protein Interaction, Gene Expression, RNA Interference Combine to Rough Out ‘Molecular Machines’

In a study that combines the newest technologies for gathering data about protein function, a team including HMS researchers has created a global map of protein interactions in C. elegans during the first moments of life. The resulting model offers a wide-angle view of cellular activity and, from this vantage point, can be used to predict previously unknown protein interactions.

Marc Vidal
Photo by Farnsworth/Blalock

By bringing together data from different high-throughput technologies, Marc Vidal and his collaborators found they could create a more complete picture of cellular activity during the first stages of life.


The study, published in the Aug. 11 Nature, looks at early embryogenesis: the first two cell divisions when a single cell becomes four in a little more than an hour. During this short time, there is little new input from DNA, since the process does not rely on gene transcription. Instead, it is a performance carried out simply by what is already contained in the cell’s cytoplasm. But how do the cell’s contents perform such a feat?

Protein to Protein
Marc Vidal, HMS associate professor of genetics at the Dana–Farber Cancer Institute, said that the first step of the task was to ask, “What do we know today, globally, that would help us start to draw a different kind of model?” One approach is to look at which proteins physically touch each other. This is Vidal’s specialty. Using yeast two-hybrid screening, his team has been able to quickly characterize physical interactions between proteins. Hui Ge, then a graduate student in Vidal’s lab and now a fellow at the Whitehead Institute, was working on protein maps of early embryogenesis with Debra Goldberg, then a research fellow at HMS and now an assistant professor at the University of Colorado, Boulder. They wanted to find other datasets to make the maps more accurate and comprehensive.

Another way to characterize proteins is to remove each gene product in turn and observe the results, something that has been made feasible with RNA interference (RNAi). Fabio Piano, assistant professor of biology at NYU, has been working to find a way to systematically describe phenotypic change. As he explained, decades of work in genetics, and now with RNAi, have produced detailed descriptions of what happens when a gene is absent. However, it is difficult to use this kind of descriptive information quantitatively. Piano has developed what he calls a “phenotypic grammar” for the changes caused by perturbing genes. His team has developed a list of 47 possible aberrations in the steps of early embryogenesis. They collaborated with a German company, Cenix, which had used RNAi to systematically inhibit each of the 600-plus genes thought to be involved in early embryogenesis. The team made movies of the process under a light microscope for each of the 661 perturbations. For each movie, the researchers could assign a score, a series of ones and zeros, based on whether they saw a change in each of the 47 categories. “Now you basically turn all these complex phenotypes into a digital signal,” Piano said. With the phenotypes in this form, it is possible to quantify how closely the proteins match phenotypically.
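The idea of a phenotype as a “digital signal” can be illustrated with a short sketch. The article does not say how the team scored similarity between profiles, so the use of Pearson correlation here is an assumption, and the gene profiles and defect indices are made up for illustration. Only the figure of 47 categories comes from the study as described.

```python
from math import sqrt

N_CATEGORIES = 47  # number of defect categories reported in the article


def make_profile(observed_defects, n_categories=N_CATEGORIES):
    """Return a 0/1 list with a 1 at each observed defect index."""
    profile = [0] * n_categories
    for idx in observed_defects:
        profile[idx] = 1
    return profile


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant, since the
    correlation is undefined in that case.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


# Hypothetical profiles: the defect indices are illustrative, not real data.
gene_a = make_profile([0, 3, 7])
gene_b = make_profile([0, 3, 7])   # same defects as gene_a
gene_c = make_profile([10, 20])    # no defects in common with gene_a

same = pearson(gene_a, gene_b)       # identical profiles score close to 1
different = pearson(gene_a, gene_c)  # disjoint profiles score below 0
```

Two genes whose knockdowns produce the same set of defects score near 1, while genes with non-overlapping defects score slightly below zero. A pair could then be linked in a phenotype network when its score crosses a chosen threshold.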

In addition to the protein interaction and phenotypic information, the team brought in data from DNA microarrays, linking proteins to one another based on how closely their expression overlaps during early embryogenesis. It was not clear whether data from three disparate technologies would truly mesh. Vidal’s lab collaborated with Piano; Kristin Gunsalus, research assistant professor at NYU; and the lab of Fritz Roth, HMS assistant professor of biological chemistry and molecular pharmacology, to overlay the three types of data into one.

To Comb a Hairball
In the combined model, each node on the map is a gene and its protein product, and each edge or connection represents one of the three factors. What the researchers got was, not surprisingly, a “big mess,” as Roth put it. The resulting cloud of protein networks is difficult to decipher. But the researchers whittled some of the noise away by including only proteins connected by at least two of the three relationships. That filter transformed “a difficult-to-interpret hairball to something where you can really see dense clusters of genes,” said Roth, with few links between groups.
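The two-of-three filter described above can be sketched in a few lines. This is a minimal illustration, not the team's actual pipeline: the gene names and edges are hypothetical, and treating each connected group as a cluster is an assumption, since the article does not say how clusters were defined.

```python
from collections import Counter, defaultdict


def normalize(edges):
    """Treat edges as undirected pairs and drop self-pairs."""
    return {frozenset(pair) for pair in edges if pair[0] != pair[1]}


def multi_support_edges(edge_sets, min_support=2):
    """Keep edges that appear in at least `min_support` of the edge sets."""
    counts = Counter()
    for edges in edge_sets:
        counts.update(normalize(edges))
    return {edge for edge, n in counts.items() if n >= min_support}


def clusters(edges):
    """Connected components of an undirected graph given as a set of edges."""
    neighbors = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical edge lists, one per data type (illustrative only).
interaction = [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]
expression = [("g2", "g1"), ("g4", "g5"), ("g3", "g6")]
phenotype = [("g1", "g2"), ("g2", "g3"), ("g5", "g6")]

kept = multi_support_edges([interaction, expression, phenotype])
found = clusters(kept)
```

In this toy example, the links g3-g6 and g5-g6 each appear in only one data type and are dropped, which leaves two separate groups: g1, g2, g3 and g4, g5.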

Some of these islands of activity contained proteins involved in known molecular complexes like the ribosome and proteasome. Other clusters in the model contained proteins that are related phenotypically and have similar expression profiles, but fewer direct connections. These groups seem to correspond to what Piano calls “logical interactions” that may represent molecular pathways rather than physical complexes. The team called these clusters “molecular machines” of the cell: groups that work together to perform discrete tasks.


Three separate networks are combined into one. To identify functional networks in the cell, links are first drawn between proteins that physically interact in binding assays (top, center) as well as gene products that cross a certain threshold of correlation in both expression profiles using DNA microarrays (left) and phenotype profiles using RNA interference (right). The resulting network (bottom), a union of the three, shows stronger connections between proteins linked by at least two of the three techniques. Clusters of connected proteins correspond to known molecular machines of the cell.
(Image courtesy of Kris Gunsalus, adapted by Rachel Eastwood)


Though it was encouraging to see many familiar players in their model, its true power is in its ability to reveal new and verifiable information. Piano’s team picked out 10 genes of unknown function that the model connected to known genes, suggesting their potential roles. Postdoctoral fellow Aaron Schetter performed experiments using fluorescent tags to find where the proteins reside in the cell during early embryogenesis. In eight of the 10 tests, the proteins localized to a site consistent with the location of the molecular machines to which the model had predicted they belonged. In one interesting case, a protein was predicted to play a role in two separate clusters; the localization experiment showed it did, indeed, shuttle between the two sites during different stages of cell division.

Vidal concedes that this model is a crude first draft of a map of the vast territory of the cell, which until now has been studied largely in its details. An ideal map, he said, would contain both the big picture and the details, like the mapping tool from Google that now offers satellite images of every neighborhood. “We’re very far from being able to do Google Earth with the cell,” Vidal said. “We’re drawing relatively approximate cellular roads.”

Ge said that high-throughput technologies have endured criticism that the data they produce are not useful. Alone, each of these technologies suffers from incompleteness and inaccuracies. “For the first time, we showed that combining all these datasets really gives a lot of insight into development,” she said. “Biologists can take these systems-level models and test hypotheses.”

