Integrative Biology

Our group embraces the data mining challenges that have resulted from high-throughput biology. Recent profiling technologies allow us to easily characterize the molecular profiles of biological samples on many levels, including genetic sequence, gene expression, protein abundance and activity, and epigenetic marks. Integrating these multidimensional data to generate biological models and testable hypotheses is a critical challenge in biomedical research. Our group uses the tools of machine learning, computer science, statistics, and network biology to integrate these disparate data types.

We have applied these tools of integrative biology to research projects that span a wide variety of disease areas and biological topics. We’ve grouped them into projects on the cutting edge and the bleeding edge.

The Cutting Edge

A representative sample of ongoing projects for which we have published papers:

Unfolded protein response: Together with the Wiseman and Kelly labs, we have performed a detailed dissection of the unfolded protein response (UPR). The UPR is a stress-activated pathway that is stimulated by the presence of unfolded or misfolded protein in the endoplasmic reticulum. We have combined genomic and proteomic analysis with a novel cell-based assay to dissect the overlapping and unique features of the three arms of the UPR. In collaboration with Peipei Ping, we are now building on this work in the context of the UPR’s role in cardiovascular disease (see below). (PubMed)

Osteoarthritis: In collaboration with Martin Lotz, we are studying the genomic basis of osteoarthritis, a degenerative joint disease that affects 27 million people in the United States alone. While OA is strongly associated with aging, advanced age alone does not cause disease. This collaboration attempts to characterize the genomic and epigenetic changes that differentiate healthy cartilage aging from diseased cartilage. By integrating data from massively parallel sequencing of mRNA, miRNA, and lncRNA together with methylation status, we have identified a new unified model of OA pathogenesis. This model serves as a framework for identifying key novel nodes in OA biology and points of therapeutic intervention. (PubMed)

The Bleeding Edge

A representative sample of ongoing projects that we’re too excited about to leave off this page, but for which journal articles have not yet been published. (And who knows, these projects may fail spectacularly…)

Cystic fibrosis: In collaboration with Bill Balch, we are elucidating the proteostasis networks that are perturbed in cystic fibrosis. This Mendelian disease is caused by the improper folding and export of the CFTR protein to the cell membrane, but the biological networks surrounding CFTR greatly influence the severity (and ultimately the reversal) of disease symptoms. Here, we have developed an approach called Network-Augmented Genomic Analysis (NAGA) that combines RNAi-based functional screening with protein interaction data from mass spectrometry. We have used this approach to identify many new regulators of CFTR processing.

Cardiovascular health and disease: In a new consortium co-led by Peipei Ping, we are building a Big Data center focused on proteomic analysis of cardiovascular disease. The goal of this multi-site, multi-investigator center is to generate an integrated community platform for data warehousing, analysis, sharing, and annotation. This effort will leverage several large, well-phenotyped populations that have been profiled using genomic and proteomic technologies.

N-of-1 cancer profiling: In collaboration with clinical oncologists at several sites, we have developed a highly optimized pipeline for suggesting personalized cancer therapeutics based on a tumor’s molecular profile. This pipeline was designed for an aggressive two-week turnaround from the time of biopsy to the generation of a simple genomic report for the treating physician. Importantly, in contrast to many commercial offerings that perform comparable analyses, all of the software and databases that we are developing are released under free, open-source licenses. We strongly believe that open community collaboration will be the most effective route to personalized genomic medicine.