Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01b2773z77c
Title: 3D Representations for Learning to Reconstruct and Segment Shapes
Authors: Genova, Kyle Adam
Advisors: Funkhouser, Thomas
Contributors: Computer Science Department
Keywords: 3D Reconstruction; Computer Graphics; Computer Vision; Differentiable Rendering; Semantic Segmentation; Shape Representation
Subjects: Computer science
Issue Date: 2021
Publisher: Princeton, NJ : Princeton University
Abstract: The focus of this dissertation is the novel use of shape representations to empower 3D reasoning for reconstruction and segmentation. It is organized into three sections based on application: domain-specific shape reconstruction (Chapter 2), general shape reconstruction (Chapter 3), and semantic segmentation (Chapter 4). In each chapter, we outline the setting and related work, and then introduce one or two approaches with a novel use of shape representation. Our key contribution is to use shape representation to enable new types of supervision and improve generalization when learning 3D priors. Because current reconstruction and segmentation methods share the use of learned 3D encoder and decoder architectures, these contributions apply to both tasks. In Chapters 2-4, we demonstrate experimentally that reconstruction and segmentation algorithms benefit from our choices of shape representation. A primary benefit of our approaches is enabling new types of supervision that require some property of the representation to be effective. A domain-specific representation enables supervising 3D face reconstruction with a face recognition network for the first time, resulting in provably more recognizable reconstructions (Chapter 2). Our SIF representation learns shape correspondence from only reconstruction supervision (Chapter 3). Large, diverse image collections are already semantically labeled, making it possible to train 3D semantic segmentation models for datasets without point cloud annotations (Chapter 4). A secondary benefit is improved generalization, by deriving better priors from existing supervision. We propose a new shape representation, LDIF, which is trained on existing 3D reconstruction data. LDIF learns robust local priors, improving generalization to unseen classes and shapes (Chapter 3). The addition of image-based supervision in segmentation algorithms improves generalization to cities with no 3D supervision (Chapter 4).
We conclude that our choices of representation enable new supervision, better generalization, and learning useful 3D priors from readily available labels (e.g., labeled and unlabeled images, or unlabeled shape collections). We hypothesize that effective future representations will build on this trend by deriving higher-level semantic priors from unannotated datasets and other inexpensive sources of supervision (Chapter 5).
URI: http://arks.princeton.edu/ark:/88435/dsp01b2773z77c
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Computer Science

Files in This Item:
File: Genova_princeton_0181D_13648.pdf
Size: 33.35 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.