Using Machine Learning And Natural Language Processing To Improve Scientific Processes

Titipat Achakulvisut, University of Pennsylvania


Scientific information has been growing exponentially over the past decades. Arguably, traditional processes of doing science cannot keep up with this growth. This expansion puts pressure on scientific activities that must scale with it, such as funding, the review process, conferences, and exploring the literature. To improve on the traditional scientific processes, useful tools and a better understanding of these processes are crucial. This dissertation advances scientific processes by incorporating knowledge and tools from machine learning (ML) and natural language processing (NLP). We discuss applications in three areas of scientific endeavor: (1) improving on traditional conferences with data-driven approaches, (2) extracting scientific claims from the scientific literature, and (3) understanding the funding process using the content of applications. To complement our findings, we provide open-source software, tools, and real-world implementations for other researchers. In sum, this thesis serves as both a conceptual point of view and a proof-of-concept implementation of how we can improve science through the use of ML and NLP.