Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp0147429c74g
Title: Advances in decision-making under uncertainty: inference, finite-time analysis, and health applications
Authors: Wang, Yingfei
Advisors: Powell, Warren B
Contributors: Computer Science Department
Subjects: Computer science
Operations research
Issue Date: 2017
Publisher: Princeton, NJ : Princeton University
Abstract: This thesis considers the problem of sequentially making decisions under uncertainty, exploring the ways in which efficient information collection influences and improves decision-making strategies. Most previous optimal learning approaches are restricted to fully sequential settings with Gaussian noise models where exact analytic solutions are easily obtained. In this thesis, we bridge the gap between statistics, machine learning and optimal learning by providing a comprehensive set of techniques, from designing appropriate stochastic models to describe the uncertain environment, to proposing novel statistical models and inference methods, to establishing finite-time and asymptotic guarantees, with an emphasis on how efficient information collection can expand access, decrease costs and improve quality in health care. Specifically, we provide the first finite-time bound for the knowledge gradient policy. Since there are many situations where the outcomes are dichotomous, we consider the problem of sequentially making decisions that are rewarded by “successes” and “failures”. The binary outcome can be predicted through an unknown relationship that depends on partially controllable attributes of each instance. Adapting an online Bayesian linear classifier, we design a knowledge gradient (KG) policy to guide the experiment. Motivated by personalized medicine, where a treatment regime is a function that maps individual patient information to a recommended treatment and hence explicitly incorporates the heterogeneity in need for treatment across individuals, we further extend our knowledge gradient policy to a Bayesian contextual bandits setting. Since sparsity and the relatively small number of patients make learning more difficult, we design an ensemble optimal learning method in which multiple models are strategically generated and combined to minimize the risk of selecting a particularly poorly performing statistical model. Driven by numerous needs in the materials science community, we develop a KG policy for sequential experiments in which experiments can be conducted in parallel and/or there are multiple tunable parameters that are decided at different stages in the process. Finally, we present a new Modular, Optimal Learning Testing Environment (MOLTE) as a public-domain test environment to facilitate more comprehensive comparisons across a broader set of test problems and a broader set of policies.
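Note: For readers unfamiliar with the knowledge gradient idea referenced in the abstract, the following is a minimal illustrative sketch, not code from the thesis, of the standard KG policy for independent normal beliefs with Gaussian measurement noise (the classical setting the abstract contrasts with). All function and variable names here are assumptions made for illustration; the thesis itself extends this idea to binary outcomes, contextual bandits, and parallel experiments.

    # Illustrative sketch: knowledge gradient (KG) policy for independent
    # normal beliefs over a finite set of alternatives with Gaussian noise.
    import numpy as np
    from scipy.stats import norm

    def kg_factors(mu, sigma, noise_std):
        """KG value of measuring each alternative once.

        mu        : current posterior means, shape (K,)
        sigma     : current posterior standard deviations, shape (K,)
        noise_std : standard deviation of the measurement noise
        """
        # Predictive change in the posterior mean from one more measurement.
        sigma_tilde = sigma**2 / np.sqrt(sigma**2 + noise_std**2)
        # Gap to the best competing alternative, normalized by sigma_tilde.
        best_other = np.array([np.max(np.delete(mu, k)) for k in range(len(mu))])
        z = -np.abs(mu - best_other) / sigma_tilde
        # Expected improvement of the best posterior mean after measuring.
        return sigma_tilde * (z * norm.cdf(z) + norm.pdf(z))

    def kg_choose(mu, sigma, noise_std):
        """Pick the alternative with the largest knowledge gradient."""
        return int(np.argmax(kg_factors(mu, sigma, noise_std)))

    def bayes_update(mu, sigma, k, y, noise_std):
        """Conjugate normal update of belief k after observing outcome y."""
        prec = 1.0 / sigma[k]**2 + 1.0 / noise_std**2
        mu[k] = (mu[k] / sigma[k]**2 + y / noise_std**2) / prec
        sigma[k] = np.sqrt(1.0 / prec)
        return mu, sigma

A call such as kg_choose(mu, sigma, noise_std) returns the index of the alternative whose next measurement is expected to improve the best posterior mean the most; bayes_update then folds the observed outcome back into the belief before the next decision.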
URI: http://arks.princeton.edu/ark:/88435/dsp0147429c74g
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Computer Science

Files in This Item:
File: Wang_princeton_0181D_12083.pdf
Size: 6.86 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.