Toward Intelligent Assistance for a Data Mining Process: An Ontology-Based Approach for Cost-Sensitive Classification

Penn collection
Operations, Information and Decisions Papers
Discipline
Subject
Cost-sensitive learning
data mining
data mining process
intelligent assistants
knowledge discovery
knowledge discovery process
machine learning
metalearning
Databases and Information Systems
Other Computer Engineering
Other Education
Author
Bernstein, Abraham
Hill, Shawndra
Provost, Foster
Abstract

A data mining (DM) process involves multiple stages. A simple, but typical, process might include preprocessing data, applying a data mining algorithm, and postprocessing the mining results. There are many possible choices for each stage, and only some combinations are valid. Because of the large space and nontrivial interactions, both novices and data mining specialists need assistance in composing and selecting DM processes. Extending notions developed for statistical expert systems, we present a prototype intelligent discovery assistant (IDA), which provides users with 1) systematic enumerations of valid DM processes, so that important, potentially fruitful options are not overlooked, and 2) effective rankings of these valid processes by different criteria, to facilitate the choice of DM processes to execute. We use the prototype to show that an IDA can indeed provide useful enumerations and effective rankings in the context of simple classification processes. We discuss how an IDA could be an important tool for knowledge sharing among a team of data miners. Finally, we illustrate the claims with a demonstration of cost-sensitive classification using a more complicated process and data from the 1998 KDDCUP competition.
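
The two services named in the abstract, enumerating valid processes and ranking them by a chosen criterion, can be illustrated with a minimal Python sketch. This is not the authors' IDA or its ontology: the `Operator` class, the operator names, the preconditions and effects, and the `est_accuracy` and `est_speed` numbers are all hypothetical placeholders chosen only to show the idea that a step is valid when the current data state satisfies its preconditions.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Operator:
    """One step of a DM process (hypothetical, for illustration only)."""
    name: str
    stage: str                          # "pre", "mine", or "post"
    requires: frozenset                 # properties that must hold before the step
    adds: frozenset                     # properties the step establishes
    removes: frozenset = frozenset()    # properties the step destroys
    est_accuracy: float = 0.0           # made-up heuristic contribution
    est_speed: float = 0.0              # made-up heuristic contribution


def apply_op(state, op):
    """Return the state after op, or None if op's preconditions fail."""
    if not op.requires <= state:
        return None
    return (state - op.removes) | op.adds


def enumerate_valid(initial_state, operators, stages=("pre", "mine", "post")):
    """Try one operator per stage and keep only the valid combinations."""
    by_stage = [[op for op in operators if op.stage == s] for s in stages]
    valid = []
    for plan in product(*by_stage):
        state = frozenset(initial_state)
        for op in plan:
            state = apply_op(state, op)
            if state is None:
                break
        else:  # no precondition failed, so the whole plan is valid
            valid.append(plan)
    return valid


def rank(plans, criterion):
    """Order plans by the summed heuristic score for one criterion, best first."""
    return sorted(
        plans,
        key=lambda plan: sum(getattr(op, criterion) for op in plan),
        reverse=True,
    )


fs = frozenset
OPERATORS = [
    Operator("no_preprocessing", "pre", fs(), fs()),
    Operator("discretize", "pre", fs({"numeric"}), fs({"categorical"}),
             removes=fs({"numeric"}), est_accuracy=-0.01, est_speed=-1),
    Operator("naive_bayes", "mine", fs({"categorical"}), fs({"model"}),
             est_accuracy=0.80, est_speed=9),
    Operator("decision_tree", "mine", fs(), fs({"model", "tree_model"}),
             est_accuracy=0.85, est_speed=5),
    Operator("logistic_regression", "mine", fs({"numeric"}), fs({"model"}),
             est_accuracy=0.83, est_speed=7),
    Operator("no_postprocessing", "post", fs({"model"}), fs()),
    Operator("prune_tree", "post", fs({"tree_model"}), fs(),
             est_accuracy=0.02, est_speed=-1),
]

# The starting data set is assumed to contain only numeric attributes.
valid_plans = enumerate_valid({"numeric"}, OPERATORS)
for criterion in ("est_accuracy", "est_speed"):
    best = rank(valid_plans, criterion)[0]
    print(criterion, "->", " | ".join(op.name for op in best))
```

With these toy operators, half of the candidate combinations are rejected (for example, pruning a model that is not a tree), and the top-ranked process differs depending on whether accuracy or speed is the criterion. A real assistant would draw preconditions, effects, and ranking heuristics from a curated ontology rather than from hard-coded numbers.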

Publication date
2005-04-01
Journal title
IEEE Transactions on Knowledge and Data Engineering