Learning Image Segmentation With Relation-Centric Loss And Representation

Degree type
Doctor of Philosophy (PhD)
Graduate group
Computer and Information Science
Discipline
Subject
Artificial Intelligence
Computer Vision
Deep Learning
Image Segmentation
Machine Learning
Metric Learning
Computer Sciences
Copyright date
2021-08-31T20:20:00-07:00
Author
Hwang, Jyh-Jing
Abstract

Equipping machines with the ability to understand an image and explain everything in it is a long-standing goal in computer vision, motivating tasks that range from image recognition, detection, and segmentation to holistic scene understanding. Researchers have approached this problem through an array of related structured prediction tasks, including low-level and high-level image segmentation, monocular depth estimation, and surface normal prediction. These tasks focus on different aspects of holistic image understanding yet intertwine with one another to unveil underlying image structures. This dissertation presents our efforts in this direction, with an emphasis on learning to model pixel relationships. We identify three major problems in image understanding and segmentation: (1) Pixel-wise classification approaches fundamentally lack a sense of object structure and spatial layout. (2) The lack of explainability prevents us from understanding mistakes and proposing remedies accordingly. (3) The definition of image segmentation is ambiguous and changes over time. The ongoing debate over this definition reveals the complexity of holistic image understanding and motivates us to propose unsupervised feature learning for image segmentation. To tackle these problems one by one, we first propose Adaptive Affinity Fields (AAF) for semantic segmentation, which capture and match the semantic relations between neighboring pixels in the label space. Furthermore, we generalize the affinity fields in AAF as another neural network and propose Adversarial Structure Matching (ASM) for structured prediction tasks. ASM extends the structure learning concept from segmentation to monocular depth estimation and surface normal prediction. To tackle the second and third problems, we turn our attention to metric learning and propose Segment Sorting (SegSort) for semantic segmentation. Most excitingly, SegSort is the first attempt to use deep learning for unsupervised semantic segmentation with interpretable results. Last but not least, we extend SegSort to image parsing as a first attempt to unify semantic and instance segmentation.

Advisor
Jianbo Shi
Stella X. Yu
Date of degree
2020-01-01