Detecting Selection on Noncoding Nucleotide Variation: Methods and Applications

Ding, Yang

Files

0-supp_figure_S1_120speciesphylogeny.pdf (45.35 KB)

1-supplementary_tables.xlsx (161.19 KB)

Ding_upenngdas_0175C_12053.pdf (1.95 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Biology

Subject

Codon Usage Bias
Natural Selection
Protein Translation
RNA Structure
Bioinformatics
Biology
Evolution

Copyright date

2016-11-29T00:00:00-08:00

Abstract

Molecular evolution has a long tradition of studying selective pressures operating at the amino-acid level. But protein-coding variation is not the only level at which molecular adaptations occur, and the roles that noncoding variation has played in evolutionary history remain unclear because they have not yet been systematically explored. In this dissertation I explore several aspects of selective pressures on noncoding nucleotide variation.
The first project (Chapter 2) examines the determinants of eukaryotic translation dynamics, which include selection on noncoding aspects of DNA variation. Deep sequencing of ribosome-protected mRNA fragments and polysome gradients in various eukaryotic organisms have revealed an intriguing pattern: shorter mRNAs tend to have a greater overall density of ribosomes than longer mRNAs. The cause of this trend is debated. To resolve this open question, I systematically analysed 5’ mRNA structure and codon usage patterns in short versus long genes across 100 sequenced eukaryotic genomes. My results showed that, compared with longer genes, short genes initiate faster and also elongate faster. Thus the higher ribosome density in short eukaryotic genes cannot be explained by slower translation elongation. Rather, it is the translation initiation rate that sets the pace for eukaryotic protein translation. This work was followed by modelling studies of translation dynamics in a yeast cell.
Chapter 3 concerns detecting selective pressures on viral RNA structures. Most previous research on RNA viruses has focused on identifying amino-acid residues under positive or purifying selection, whereas selection on RNA structures has received less attention. I developed algorithms that scan along the viral genome and identify regions exhibiting signals of purifying or diversifying selection on RNA structure by comparing the structural distances among actual viral RNA sequences against an appropriate null distribution. Unlike other algorithms that identify structural constraints, my approach accounts for the phylogenetic relationships among viral sequences, as well as the observed variation in amino-acid sequences. Applying this approach to influenza viruses, I found that a significant portion of influenza viral genomes has experienced purifying selection for RNA structure, in both the positive- and negative-sense RNA forms, over the past few decades; I also found the first evidence of positive selection on RNA structure in specific regions of these viral genomes.
Overall, the projects presented in these chapters represent a systematic look at several novel aspects of selection on noncoding nucleotide variation. These projects should open up new directions in studying the molecular signatures of natural selection, including studies of interactions between the different layers at which selection may operate simultaneously (e.g. RNA structure and protein sequence).

Advisor

Joshua B. Plotkin

Date of degree

2015-01-01

Collection

Dissertations and Theses