Automated Recognition of Malignancy Mentions in Biomedical Literature

Jin, Yang; McDonald, Ryan T; Lerman, Kevin; Mandel, Mark A; Carroll, Steven; Liberman, Mark Y; Pereira, Fernando C.N.; Winters, Raymond S; White, Peter S

Files

pereira.pdf (355.37 KB)

Penn collection

Departmental Papers (CIS)

Permalink

https://repository.upenn.edu/handle/20.500.14332/6315

Abstract

The rapid proliferation of biomedical text makes it increasingly difficult for researchers to identify, synthesize, and use existing knowledge in their fields of interest. Automated information extraction procedures can assist in acquiring and managing this knowledge. Previous efforts in biomedical text mining have focused primarily on named entity recognition of well-defined molecular objects such as genes; less work has addressed disease-related objects and concepts. Furthermore, the promise of these methods has been tempered by an inability to scale them efficiently in ways that minimize manual effort while still performing with high accuracy. Here, we applied a machine-learning approach previously successful for identifying molecular entities to a disease concept, to determine whether the underlying probabilistic model generalizes effectively to unrelated concepts with minimal manual intervention for model retraining.

Publication date

2006-11-07

Comments

Reprinted from BMC Bioinformatics, Volume 7, Article 492, November 2006, 8 pages.

Collection

Articles