Lab Home | Phone | Search | ||||||||
|
||||||||
We consider sequential optimization decision making problems, which entail making optimal decisions at specified epochs of time to maximize or minimize the specific objective of the decision maker. The system in question evolves in a probabilistic manner through time. This means that the evolution of uncertainty is revealed in a sequential manner as and when the random events occur. The decision maker bases decisions as the uncertainty evolves, up to the point of the decision epoch, and known information about its probabilistic evolution through distributions. The class of problems we focus on in this research is multistage sequential decision making problems where the uncertainty is purely exogenous in nature. Specifically, we compare multistage stochastic programming (MSP) and Markov decision processes (MDP). We describe these approaches generically and draw a theoretical equivalence between these methods for a given underlying stochastic process while highlighting subtle differences in both these methodologies using a numerical example. We then extend this research to solve a large scale sequential routing problem for Unmanned Aerial Vehicles (UAV) for reconnaissance patrolling missions, formulated as a two stage stochastic integer program with a single recourse action and present the results. We then propose an extension of the above problem to include multiple recourse actions and attempt to use a standard heuristic technique, progressive hedging- a scenario-based decomposition technique well-suited to solving stochastic mixed-integer programs, to formulate and solve this sequential routing UAV problem with multiple recourse actions. In addition we also propose an alternative MDP formulation of the same problem posed as a multi-stage sequential decision making problem. We develop a novel policy approximation algorithm coupled with an epsilon-greedy approach to solve the multistage problem and present the results of the same. The eventual goal of this research is to investigate the cross applicability of algorithms used in each of these methodologies to more efficiently solve large scale sequential decision making problems. Host: Kaarthik Sundar |