Maximum Entropy Methods for Biological Sequence Modeling

Penn collection
Departmental Papers (CIS)
Subject
maximum entropy
amino acids
sequence analysis
Author
Buehler, Eugen C
Abstract

Many of the modeling methods used for natural languages, specifically Markov models and HMMs, have also been applied to biological sequence analysis. In recent years, natural language models have been improved by maximum entropy methods, which allow information from the entire history of a sequence to be considered. This contrasts with the Markov models that have been the standard for most biological sequence models, whose predictions are generally based on some fixed number of previous emissions. To test the utility of maximum entropy modeling for biological sequence analysis, we used these methods to model amino acid sequences. Our results show that there is significant long-distance information in amino acid sequences and suggest that maximum entropy techniques may be beneficial for a range of biological sequence analysis problems.
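
The contrast drawn in the abstract can be illustrated with a minimal sketch of a conditional maximum entropy (log-linear) model, in which the probability of the next residue is proportional to the exponential of a weighted sum of binary features. This is not the paper's implementation: the feature templates (`prev`, `seen`), the function names, and the weight values are hypothetical and chosen only to show how a Markov-like feature that looks at the previous residue and a long-distance feature that looks at the whole history can sit in the same model.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def features(history, candidate):
    """Return the binary features that fire for (history, candidate).

    Both feature templates are illustrative assumptions, not the paper's.
    """
    active = []
    if history:
        # Short-range feature: depends only on the previous residue,
        # which is all a first-order Markov model can see.
        active.append(("prev", history[-1], candidate))
    # Long-range features: fire if a residue occurred anywhere in the
    # history, regardless of how far back.
    for residue in set(history):
        active.append(("seen", residue, candidate))
    return active

def conditional_distribution(history, weights):
    """p(a | history) proportional to exp(sum of weights of active features)."""
    scores = {
        a: math.exp(sum(weights.get(f, 0.0) for f in features(history, a)))
        for a in AMINO_ACIDS
    }
    z = sum(scores.values())  # normalizing constant for this history
    return {a: s / z for a, s in scores.items()}

# Hypothetical weights, set by hand rather than fitted to data.
weights = {
    ("prev", "K", "E"): 0.5,  # E slightly favored right after K
    ("seen", "C", "C"): 1.2,  # C favored if a C appeared anywhere earlier
}

p = conditional_distribution("MCAAK", weights)
```

In this toy setting the cysteine at position 2 raises the probability of another cysteine even though it lies outside the one-residue window of the `prev` feature. In practice the weights would be estimated from training sequences rather than set by hand.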

Date of presentation
2001-08-26
Conference name
Departmental Papers (CIS)
Conference dates
2023-05-16T22:28:22.000
Comments
Presented at the Workshop on Data Mining in Bioinformatics 2001 (BIOKDD 2001).