Progress in the development and application of computational methods for probabilistic protein design

Penn collection
Departmental Papers (CBE)
Subject
proteins
peptide sequences
computational analysis
Abstract

Proteins exhibit a wide range of physical and chemical properties, including highly selective molecular recognition and catalysis, and are also key components in biological metabolic, catabolic, and signaling pathways. Given that proteins are well-structured and can now be rapidly synthesized, they are excellent targets for engineering of both molecular structure and biological function. Computational analysis of the protein design problem allows scientists to explore sequence space and systematically discover novel protein molecules. Nonetheless, the complexity of proteins, the subtlety of the determinants of folding, and the exponentially large number of possible sequences impede the search for peptide sequences compatible with a desired structure and function. Directed search algorithms, which directly identify a small number of sequences, have achieved some success in identifying sequences with desired structures and functions. Alternatively, one can adopt a probabilistic approach. Instead of a finite number of sequences, such calculations yield a probabilistic description of the sequence ensemble. In particular, by casting the formalism in the language of statistical mechanics, the site-specific amino acid probabilities of sequences compatible with a target structure may be readily identified. The computed probabilities are well suited both for de novo protein design of particular sequences and for combinatorial, library-based protein engineering. The computed site-specific amino acid profile may be converted to a nucleotide base distribution to allow assembly of a partially randomized gene library. The ability to readily synthesize such degenerate oligonucleotide sequences according to the prescribed distribution is key to constructing a biased peptide library genuinely reflective of the computational design. Herein we illustrate how a standard DNA synthesizer can be used with only a slight modification to the synthesis protocol to generate a pool of degenerate DNA sequences, which encodes a predetermined amino acid distribution with high fidelity.
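
The link between a nucleotide base distribution and the amino acid distribution it encodes can be illustrated with a minimal Python sketch. It is not the method of the paper: it only computes the forward direction (base mixtures to amino acid frequencies) under the assumption that the three positions of a degenerate codon are drawn independently from their own base mixtures and are translated with the standard genetic code. The function name `encoded_distribution` and the example mixture are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO[i] for i, codon in enumerate(product(BASES, repeat=3))
}


def encoded_distribution(mix):
    """Amino acid frequencies encoded by one degenerate codon.

    mix is a list of three dicts mapping base -> fraction, one per codon
    position (assumed independent). '*' denotes a stop codon.
    """
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for pos, base in enumerate(codon):
            p *= mix[pos].get(base, 0.0)
        if p > 0.0:
            dist[aa] = dist.get(aa, 0.0) + p
    return dist


# Hypothetical mixture: G at position 1, equal parts C and A at position 2,
# T at position 3. Only the codons GCT (Ala) and GAT (Asp) are possible.
example = encoded_distribution([{"G": 1.0}, {"C": 0.5, "A": 0.5}, {"T": 1.0}])
print(example)
```

Choosing base mixtures so that the encoded frequencies approximate a computed site-specific profile is the inverse of this calculation and would require a fitting step that is not shown here.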

Publication date
2004-07-06
Comments
Postprint version. http://www.sciencedirect.com/science/journal/00981354 Computers & Chemical Engineering (in press)