Statistics Papers

Document Type

Journal Article

Date of this Version

2004

Publication Source

Statistical Science

Volume

19

Issue

1

Start Page

188

Last Page

204

DOI

10.1214/088342304000000107

Abstract

The Bayesian approach, together with Markov chain Monte Carlo techniques, has provided an attractive solution to many important bioinformatics problems, such as multiple sequence alignment, microarray analysis and the discovery of gene regulatory binding motifs. The use of such methods and, more broadly, of explicit statistical modeling has revolutionized the field of computational biology. After reviewing several heuristics-based computational methods, this article presents a systematic account of Bayesian formulations of, and solutions to, the motif discovery problem. Generalizations are made to further enhance the Bayesian approach. Motivated by the need for a fast algorithm, we also provide a perspective on the problem from the viewpoint of optimizing a scoring function. We observe that scoring functions derived from proper posterior distributions, or from approximations to such distributions, perform best and can be used to improve upon existing motif-finding programs. Simulation analyses and a real-data example support this observation.

Keywords

gene regulation, motif discovery, Bayesian models, scoring functions, optimization, Markov chain Monte Carlo

Date Posted: 27 November 2017

This document has been peer reviewed.