Date of Award

2015

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Graduate Group

Genomics & Computational Biology

First Advisor

John I. Murray

Abstract

Development proceeds through many stages and requires genes to function at particular places and times. Knowing when and where a gene is expressed can help predict its function. Furthermore, tissue-specific gene expression is regulated by many factors, whose expression patterns often overlap. Understanding this regulation would be helped by finding examples of regulatory targets of these factors throughout the genome. The nematode C. elegans provides a model of how parts combine to form an organism. It develops into 558 cells during embryogenesis via an invariant lineage (pattern of divisions). Fluorescent markers are available for many well-defined groups of cells. Therefore, we asked how well we could “deconvolute” genome-wide expression in each individual cell, based on expression measurements in overlapping sets of cells. Using simulated data, we compared the performance of several different methods for solving this problem. We found that we could estimate the possible range of expression throughout the embryo using far fewer measurements than there are cells. Based on the performance simulations, we measured expression in eighteen populations of cells, flow-sorted by fluorescent markers expressed in the C. elegans embryo. Applying our deconvolution methods allowed us to estimate every gene’s expression in every cell, although the accuracy of these predictions at our current sample size is not yet high enough to make them broadly useful. We clustered this dataset and found that many genes known to be expressed in particular tissues cluster together. Comparison with existing annotation suggests that over a hundred of these clusters of genes are expressed in a tissue-specific manner. RNA-FISH confirms some of these expression predictions. Motifs corresponding to known C. elegans transcription factors were enriched upstream of the genes in many of these clusters.
By combining motif enrichment with coexpression, we obtain many novel predictions about gene regulation. We have validated several of these predictions using RT-PCR in a mutant background. Our data and analysis provide a resource for improving our knowledge of tissue-specific expression and its regulation throughout C. elegans development. Furthermore, our results suggest a framework for inferring changes in gene expression and cell type composition in complex tissues.
