Investigators develop new mathematical approach to distinguish health and disease states

Investigators have developed a new mathematical approach to analyze molecular data derived from complex mixtures of immune cells. This approach, when combined with well-established techniques, readily identifies changes in small samples of human whole blood, and has the potential to distinguish between health and disease states.

Led by Mark Davis, Ph.D., and Atul Butte, M.D., Ph.D., of Stanford University in California, the team of investigators received support from the National Institute of Allergy and Infectious Diseases (NIAID), as well as the National Heart, Lung, and Blood Institute and the National Cancer Institute, all part of the National Institutes of Health. Details about their work appear online in Nature Methods.

"Defining the status of the human immune system in health and disease is a major goal of human immunology research," says NIAID Director Anthony S. Fauci, M.D. "A method allowing clinicians to accurately and quickly characterize the many different immune cells in human blood would be a valuable research and diagnostic tool."

Over the past 15 years, the technology for gene expression microarrays, which allow investigators to identify many different genes and measure their relative expression in parallel, has advanced tremendously. Today researchers can measure the expression of nearly every gene in the human genome using very small amounts of blood. However, blood contains numerous types of immune cells, such as lymphocytes, basophils and monocytes, and when microarray analysis is performed on this mixture, the results become difficult to interpret.

"Current methods that examine gene expression differences in mixtures of immune cells in blood do not take into account that, even among healthy individuals, there is a wide range of variation in the proportion of each cell type," says Dr. Davis.

"This creates so-called noise that masks many differences in gene expression. Even when you do observe a difference, you do not know if this is due to a real difference or a reflection of the varying number of cell types in the mixture."

Until now, scientists had to separate out the cell types from a mixture prior to analysis to verify that actual changes in gene expression had occurred. But cell separation is time-consuming and costly, and requires large samples of blood, Dr. Davis adds.

To overcome such obstacles, the study team developed a computational approach called cell-specific significance analysis of microarrays (csSAM).

"What csSAM does is marry the concepts of cell separation with the ease of analyzing large families of genes on a microarray," explains Dr. Butte. "Using a mathematical approach, we can virtually separate out the different cell types found in blood, determine the gene expression patterns of these cell types, and identify which changes in gene expression are due to actual disease and which are simply due to variations in the cell proportions."

Investigators first tested the csSAM approach using liver, brain and lung cells from rats. They began by analyzing the gene expression patterns in the three separate cell populations. Then they mixed the cells together in different known ratios and used the new mathematical approach to pick out the individual gene expression patterns of each cell subset in each mixture. Once they had confirmed that this analytical approach correctly identified the gene expression patterns of each individual cell subset in the mixtures, they tested csSAM on blood from kidney transplant patients who were either undergoing kidney rejection or who were stable.
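The mixing experiment described above can be sketched in a few lines of code. The sketch below assumes that the expression measured in a mixture is the proportion-weighted sum of the expression in each cell type, and it recovers the per-cell-type values for a single gene by ordinary least squares. It is an illustration of that general idea only, not the published csSAM implementation; the function names, the mixing ratios and the expression values are all invented for the example.

```python
from typing import List

Matrix = List[List[float]]


def solve_linear(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("mixing ratios do not determine the cell types")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def deconvolve_gene(proportions: Matrix, mixed: List[float]) -> List[float]:
    """Least-squares estimate of one gene's expression in each cell type.

    proportions[s][k] is the fraction of cell type k in mixture s;
    mixed[s] is the expression measured in mixture s.
    """
    k = len(proportions[0])
    # Normal equations: (W^T W) h = W^T m
    wtw = [[sum(row[i] * row[j] for row in proportions) for j in range(k)]
           for i in range(k)]
    wtm = [sum(row[i] * m for row, m in zip(proportions, mixed))
           for i in range(k)]
    return solve_linear(wtw, wtm)


# Hypothetical mixing ratios (liver, brain, lung) for four mixtures.
proportions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.7, 0.2],
    [0.2, 0.1, 0.7],
    [0.4, 0.3, 0.3],
]
# Invented "true" expression of one gene in each pure cell population.
true_expression = [10.0, 2.0, 5.0]
# Simulated noise-free measurement of each mixture.
mixed = [sum(p * t for p, t in zip(row, true_expression))
         for row in proportions]

estimate = deconvolve_gene(proportions, mixed)
print(estimate)
```

With noise-free mixtures the estimate matches the values measured in the pure populations, which mirrors the check the investigators ran before moving on to patient samples. Real microarray data are noisy, so more mixtures than cell types are needed for a stable estimate.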

Using a traditional approach to analyze gene expression in blood, the investigators did not observe any differences between people undergoing transplant rejection and people with stable transplants. However, using the csSAM approach, they were able to extract and measure from the mixture the gene expression patterns of five specific subsets of immune cells: monocytes, basophils, neutrophils, eosinophils, and lymphocytes (T and B cells). The researchers were able to identify more than 300 differences in monocyte gene expression between the two groups, along with more than 100 genes whose expression was significantly increased. Because monocytes make up a smaller proportion of immune cells in the blood than neutrophils or lymphocytes, the traditional analysis approach could not distinguish these differences in monocyte gene expression among all the other gene signals.
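A small worked example shows why a change confined to monocytes can disappear in whole blood. It assumes the same proportion-weighted model of a mixed sample; the cell fractions and expression values are invented for illustration and are not taken from the study.

```python
def whole_blood_signal(proportions, expression):
    """Expression of one gene in the mixture: sum of fraction * per-cell-type expression."""
    return sum(proportions[cell] * expression[cell] for cell in proportions)


# Invented per-cell-type expression of one gene in a stable patient.
stable_expr = {"neutrophil": 300.0, "lymphocyte": 50.0, "monocyte": 100.0}
# Hypothetical rejection case: the gene doubles, but only in monocytes.
reject_expr = dict(stable_expr, monocyte=200.0)

# Two hypothetical people whose cell proportions differ within a plausible range.
person_a = {"neutrophil": 0.60, "lymphocyte": 0.35, "monocyte": 0.05}
person_b = {"neutrophil": 0.50, "lymphocyte": 0.45, "monocyte": 0.05}

# Shift in the whole-blood signal caused by the monocyte change alone.
disease_shift = (whole_blood_signal(person_a, reject_expr)
                 - whole_blood_signal(person_a, stable_expr))
# Shift caused only by different cell proportions, with no change in any cell type.
proportion_shift = abs(whole_blood_signal(person_a, stable_expr)
                       - whole_blood_signal(person_b, stable_expr))

print(disease_shift, proportion_shift)
```

In this toy case a doubling of expression in monocytes moves the whole-blood signal less than an ordinary difference in cell proportions between two people does, so the monocyte change is lost in the mixture unless the cell types are separated, physically or computationally.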

Methods that physically separate specific cell populations from blood can change the expression of genes for reasons that have nothing to do with disease and everything to do with the experimental manipulation. Because csSAM works on unseparated blood, it avoids these additional steps, which can add further noise to the samples.

Another advantage of the csSAM approach is that it could be applied to other high-throughput analyses of genes, proteins or other cellular products of any complex mixture of cells or tissues.

"In recent years, NIAID has placed increasing emphasis on supporting human immunology research, such as this study," says Daniel Rotrosen, M.D., of NIAID's Division of Allergy, Immunology, and Transplantation. "We are very encouraged that our investment has helped to develop this new analytical approach, which has the potential not only to define the parameters of the healthy human immune system, but also to help identify biomarkers of disease and develop more effective vaccines."