Learning Visual Concepts

Degree type
Doctor of Philosophy (PhD)
Graduate group
Computer and Information Science
Discipline
Data Science
Computer Sciences
Electrical Engineering
Subject
Classification
Computer Vision
Machine Perception
Object Detection
Object Recognition
Copyright date
01/01/2024
Author
Shivakumar, Shreyas Skandan
Abstract

We propose a framework for extending off-the-shelf, pre-trained object detection models to unseen datasets with little to no modification of the original architecture, adding only a few components to the overall pipeline. Motivated by the role of attributes in zero-shot learning paradigms, we define conceptual groups retroactively using positive and negative exemplars, and evaluate the feasibility of recognizing a variety of these conceptual groups in a corpus of previously unseen data, including unseen categories. We conduct experiments with networks trained on the COCO dataset and use Open-Images-V7 as our held-out, unseen dataset. Our analysis suggests that existing off-the-shelf object detection networks such as Faster-RCNN can be leveraged to extract useful information beyond the scope of a straightforward category prediction framework. This information can be used to operationalize concept learning through a set of positive and negative exemplars and a simple linear SVM operating on the features produced by the deep network. We compare this approach to vision-enabled large language models such as LLaVA, CogVLM and GPT4V, and show strong baseline performance with lower resource requirements. Additionally, we show that the method scales to larger concept sets by validating it on a larger set of concepts in the LVIS dataset. We illustrate a few approaches to better understand the semantic topology of the feature space learned by these networks, and we measure the feasibility of using these features to identify the proposed conceptual groups. Finally, we propose strategies that leverage this information to predict these conceptual groups on previously unseen samples containing unseen class categories.
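
The exemplar-based concept learning described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the feature vectors below are synthetic stand-ins for the features a detection network would produce, the function names (`train_linear_svm`, `concept_score`, `make_exemplars`) are hypothetical, and a hand-rolled subgradient solver for the hinge loss is used in place of a library SVM so that the example has no dependencies.

```python
import random


def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Fit a linear SVM (w, b) by subgradient descent on the
    L2-regularised hinge loss. Labels are +1 (positive exemplar)
    or -1 (negative exemplar)."""
    rng = random.Random(seed)
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = features[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The regularisation term shrinks w on every step.
            w = [wj * (1.0 - lr * reg) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: push the boundary away from x.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def concept_score(w, b, x):
    """Signed score; a positive value means 'belongs to the concept'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def make_exemplars(center, n, rng, spread=0.3):
    """Synthetic stand-in for detector features (assumption: a real
    pipeline would take these from the pre-trained network)."""
    return [[c + rng.gauss(0.0, spread) for c in center] for _ in range(n)]


rng = random.Random(42)
pos_center = [1.0, 1.0, 0.0, 0.0]    # hypothetical "in concept" region
neg_center = [-1.0, -1.0, 0.0, 0.0]  # hypothetical "out of concept" region

positives = make_exemplars(pos_center, 20, rng)
negatives = make_exemplars(neg_center, 20, rng)

features = positives + negatives
labels = [1] * len(positives) + [-1] * len(negatives)

w, b = train_linear_svm(features, labels)

# Score a previously unseen sample drawn near the positive region.
unseen = make_exemplars(pos_center, 1, rng)[0]
print("concept member" if concept_score(w, b, unseen) > 0 else "not a member")
```

In this sketch, defining a new concept only requires a new set of positive and negative exemplars and a new `(w, b)` pair; the feature extractor itself is left untouched, which mirrors the "little to no modification" goal stated in the abstract.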

Advisor
Taylor, Camillo J.
Date of degree
2024