PROTEOMICS

Pegging Protein Function to DNA Sequence

Researchers Use Worm Sex Organ to Demonstrate Meaning of Genome Letters

Lost amid the media fanfare greeting new DNA sequencing milestones—most recently that of human chromosome 22—is the less glamorous fact that eye-glazing streams of As, Cs, Ts, and Gs per se reveal little about how a creature works.

"How do you get from a complete genome sequence back to the biology of the organism? This is a big, unanswered question," says Marc Vidal, currently HMS instructor in medicine at the Cancer Center at Massachusetts General Hospital.

In the Jan. 7 Science, a research team led by Vidal takes a stab at this problem. In a report accompanied by a Perspectives article, the scientists describe how they have tested a way of discovering which proteins interact with one another. The method uses a high-throughput version of a standardized assay to generate data on hundreds of proteins in one experiment.

Marc Vidal (left) and Marian Walhout have developed a method to study how otherwise completely unknown gene products interact.


It is the first functional genomics study that attempts to reveal the protein–protein interactions in the tiny worm Caenorhabditis elegans, which last year became the first animal to have its genome fully sequenced. Previous work in the field interpreted the genomes of yeast or bacteria, and much of it used DNA chips to assess the expression of all the genes of these cells under a given condition.

But while expression studies indicate which proteins may be involved in the biological function at hand, it is the proteins that actually carry it out, making protein studies a more direct approach to understanding how things work.

The goal of this research is to find quick ways to "annotate" genes that the genome sequence predicts but that conventional genetics has not yet explored. In C. elegans, 20 years of molecular genetics, studying one gene at a time, has assigned a function to only 1,277 of the 19,293 genes thought to compose the worm's genome.

Yet quick often implies quick and dirty. Indeed, Vidal concedes that molecular biologists rightly fault functional genomics for the high rate of false-positive and false-negative results produced by the automated screening methods used to test large numbers of genes all at once.

That is why the authors devote much of the current paper to discussing ways of validating the potential protein–protein interactions their screen has generated.

Functional genomics as Vidal develops it is routinely used in genomics companies, but academia has been slower to embrace it. Beyond criticizing the artificial nature of its screens, some academic scientists question genomics because it is not driven by a specific hypothesis. Vidal, who is setting up his lab as an HMS assistant professor of genetics at the Dana–Farber Cancer Institute, says, "We do not have a hypothesis." But he adds, "We hope to generate sensible hypotheses that can be tested."

Seeing Who Does What to Whom

In this study, first author Marian Walhout, HMS research fellow in Vidal's lab, used an improved and automated version of the two-hybrid screen, in which C. elegans genes are cloned into yeast strains in such a way that an interaction between two expressed proteins will allow the yeast cells to grow.

To test their approach, the scientists chose to study which proteins interact to form the worm's vulva. This area has been studied extensively with conventional genetics, allowing Vidal's group to check the reliability of their screen.

First, Walhout ran a matrix experiment testing 29 proteins implicated in vulva development against each other. The screen picked up six of 11 known interactions and suggested two new ones. Then she tested each of the 29 vulva proteins individually against thousands of C. elegans proteins expressed from a cDNA library. That experiment generated 150 potential protein–protein interactions (not necessarily all connected to vulva development) involving 126 genes. Interestingly, 110 of those genes fall among the roughly 18,000 for which no information is available.

"So one small sampling of the genome with just these 29 proteins produced a first annotation for 110 genes," says Vidal.
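The proportions behind these figures are easy to check. The short Python sketch below is illustrative only: it uses the counts reported in this article, and the variable names are invented for the example.

```python
# Counts taken from the article; variable names are illustrative only.
known_interactions = 11          # previously known vulva interactions
recovered = 6                    # known interactions the matrix screen picked up
genes_in_hits = 126              # genes involved in the 150 potential interactions
unannotated_genes_in_hits = 110  # of those, genes with no prior information
genome_genes = 19_293            # genes thought to compose the worm genome
genes_with_function = 1_277      # genes with a function assigned so far

recall = recovered / known_interactions
novel_share = unannotated_genes_in_hits / genes_in_hits
annotated_share = genes_with_function / genome_genes
unannotated_total = genome_genes - genes_with_function

print(f"Known interactions recovered: {recall:.0%}")
print(f"Genes in hits with no prior annotation: {novel_share:.0%}")
print(f"Genome with an assigned function: {annotated_share:.1%}")
print(f"Genes still lacking a function: {unannotated_total}")
```

On these numbers the screen recovered a little over half of the known interactions, which gives a rough sense of its false-negative rate in this one test.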

But how can one tell the real interactions from the artifacts? "We are trying to develop a set of increasing heuristic values that gradually establish confidence in the data generated by the screens," says Vidal. For instance, they introduce the concept of the "interolog," meaning that a known interaction between homologous proteins in another species makes a potential interaction in the worm screens more plausible.
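As a rough illustration of the interolog idea, the following Python sketch flags candidate worm interactions whose counterparts in another species are already known to interact. The protein names, the ortholog table, and the function name find_interologs are all hypothetical. This is not the authors' software, only a minimal sketch that assumes a simple one-to-one mapping between homologous proteins.

```python
def find_interologs(candidate_pairs, ortholog_of, known_pairs_other_species):
    """Return candidate pairs whose mapped counterparts are known to interact.

    candidate_pairs: (worm_protein, worm_protein) tuples from a screen.
    ortholog_of: dict mapping a worm protein to its counterpart in another species.
    known_pairs_other_species: interactions already documented in that species.
    """
    # Interactions are treated as unordered, so (A, B) matches (B, A).
    known = {frozenset(pair) for pair in known_pairs_other_species}
    supported = []
    for a, b in candidate_pairs:
        mapped_a = ortholog_of.get(a)
        mapped_b = ortholog_of.get(b)
        if mapped_a is None or mapped_b is None:
            continue  # no counterpart known, so no support either way
        if frozenset((mapped_a, mapped_b)) in known:
            supported.append((a, b))
    return supported


# Made-up example data (hypothetical names, not from the study).
worm_candidates = [("wormA", "wormB"), ("wormA", "wormC"), ("wormD", "wormE")]
worm_to_yeast = {"wormA": "yeastA", "wormB": "yeastB", "wormC": "yeastC"}
yeast_known = [("yeastB", "yeastA")]

supported_pairs = find_interologs(worm_candidates, worm_to_yeast, yeast_known)
print(supported_pairs)
```

A match of this kind does not prove that the worm proteins interact; it only raises confidence in a candidate the screen has already produced.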

Such combing of the literature, however, enables only the transfer of already known protein–protein interactions back to C. elegans. To assess new interactions, the authors adapted a method of data analysis to extract patterns from the 150 interactions. The method essentially looks for snakes that bite their own tails—patterns where A interacts with B, B with C, C with D, and so on, until one protein harks back to A. The premise is that such clusters again increase the likelihood that the proteins actually work together in the worm.
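One simple way to picture this search is to treat each interaction as a link in a network and ask whether a link's two proteins stay connected when that link is removed. If they do, the link lies on a closed loop. The Python sketch below uses that generic graph technique purely for illustration; it is not the analysis method used in the paper, and the protein and function names are hypothetical.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected interaction network from protein pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        if a != b:  # a protein binding itself does not form a loop of partners
            graph[a].add(b)
            graph[b].add(a)
    return graph


def reachable(graph, start, goal, skip_edge):
    """Depth-first search from start to goal that never uses skip_edge."""
    skip = frozenset(skip_edge)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for neighbor in graph[node]:
            if neighbor in seen or frozenset((node, neighbor)) == skip:
                continue
            seen.add(neighbor)
            stack.append(neighbor)
    return False


def proteins_in_closed_loops(pairs):
    """Return proteins that sit on at least one closed loop of interactions."""
    graph = build_graph(pairs)
    in_loop = set()
    for a, b in pairs:
        # If a and b remain connected without their direct link,
        # the link is part of a loop that leads back to its start.
        if a != b and reachable(graph, a, b, (a, b)):
            in_loop.update((a, b))
    return in_loop


# Hypothetical example: A-B-C-D closes back on A; E dangles off D.
example_pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("D", "E")]
loop_members = proteins_in_closed_loops(example_pairs)
print(sorted(loop_members))
```

In this toy network, E is left out because its only link to D leads nowhere else, which is the kind of isolated hit that the loop criterion treats with less confidence.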

Finally, by integrating these patterns with whatever information is known about some of the involved proteins, the researchers suggest networks of protein interaction that geneticists can test (see image).

Even so, Vidal's comprehensive effort to establish protein–protein interactions for most of the C. elegans genes might be wasted if its data were to stand in isolation, he says. But they will not. His is only one part of a larger effort to connect several different genomics/proteomics approaches through a set of hyperlinked databases. A group at Stanford is gathering gene expression data on 12,000 worm genes in parallel. Other academic groups are working on systematically knocking out genes and on visualizing where proteins reside in the cell.

This nematode has developed multiple female genitals (arrows) thanks to mutations in two of the genes depicted on the right. The diagram exemplifies how scientists including Marc Vidal are beginning to extract from the worm's genome sequence information about its proteins. This work represents an early attempt at handling many proteins at the same time and placing them in functional relationships. In this case, screens testing many proteins in parallel generated large sets of raw data from which Vidal identified clusters of protein interactions that feed back on each other. The diagram represents such a cluster (protein abbreviations shown). Circled proteins were previously described; adjacent ones are known to interact physically; red ones probably act in linear pathways. Plain labels denote gene products merely predicted by the DNA sequence, for which no information is available. Vidal's top-down approach posits that these genes all interact, and he invites genetic experiments to test that notion.


In Silico Biology

The academic labs are expected to hyperlink their respective Web pages to a central database on worm genes maintained by Proteome Inc., a Beverly, Mass., company that curates published information about genes. Access for academic scientists is free. Vidal and Walhout expect that such databases will eventually be established for other model organisms—one for yeast already exists—and linked together into one powerful resource.

Taking the long view, Vidal hopes this resource will change the way biomedical scientists form research hypotheses a few years from now. Before starting the first experiment, a scientist could consult functional genomics databases to learn everything known about dozens of genes involved in a favorite problem, and then decide on a set of genes to study in detail with genetic tests tailored to the problem. In short, genetics and genomics complement—and need—each other, says Vidal.

—Gabrielle Strobel