Center for Nonlinear Studies

Monday, June 05, 2017
11:00 AM - 12:00 PM
CNLS Conference Room (TA-3, Bldg 1690)

Seminar

Hierarchical Decision Making for Power Systems under Uncertainty: Stochastic Optimization and Reinforcement Learning Solutions

Gal Dalal
Technion

Decision-making processes for power networks can be roughly categorized into four time-scales: real-time, short-term, mid-term, and long-term. Each category involves its own set of available information and considerations; examples include day-ahead generator scheduling, months-ahead asset management, and years-ahead system development. I will open this talk with our formulation of the mid-term asset management task, crafted with feedback from industry experts. It is a chance-constrained stochastic program, which proves highly challenging due to high-dimensional decision variables in the case of large networks; non-convex and even non-analytic mathematical forms; interdependent sub-problems that encapsulate the decision-making processes of shorter time-scales; and high uncertainty due to renewable energy sources.

I will then present our methodology for solving the resulting hierarchical formulation, which relies on distributed simulation-based optimization with efficient scenario generation. To tackle the computational burden of hierarchically simulating lower-level decision making, I introduce a novel concept: using machine learning to design 'proxies' that quickly approximate the outcomes of short-term decisions. These proxies are trained to predict, e.g., unit commitment and AC optimal power flow (ACOPF) solutions.

An additional, natural approach to such dynamic control problems is reinforcement learning (RL). To facilitate joint decision making among several stakeholders, I propose a new hierarchical RL model that serves as a proxy for real-time power grid reliability. Our matching algorithm alternates between slow time-scale policy improvement and fast time-scale value-function evaluation. Lastly, toward enabling the analysis of such hierarchical RL algorithms, I will present our recent convergence-rate results for several well-known and commonly used RL algorithms, obtained using tailor-made stochastic approximation tools. Our results include: i) the first reported concentration bound for an unaltered online Temporal Difference (TD) algorithm with function approximation; and ii) the first reported concentration bound for two-timescale stochastic approximation algorithms. In particular, we apply the latter result to the "gradient TD" family of two-timescale RL algorithms.
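For context, a chance-constrained stochastic program of the kind described above has the following generic shape (a schematic sketch in illustrative notation, not the exact formulation from the talk), where x collects the mid-term asset-management decisions, \xi models the uncertainty (e.g., renewable output), and \epsilon is the tolerated constraint-violation probability:

```latex
\min_{x \in \mathcal{X}} \; \mathbb{E}_{\xi}\left[ c(x, \xi) \right]
\quad \text{s.t.} \quad
\mathbb{P}_{\xi}\left( g(x, \xi) \le 0 \right) \ge 1 - \epsilon
```

The probabilistic constraint is a source of the non-convexity and non-analyticity noted above: the feasible set induced by requiring a probability of at least 1 - \epsilon rarely has a closed form, which is one reason simulation-based methods are attractive here.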
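The 'proxy' idea amounts to supervised regression on solutions of the expensive lower-level problems: solve the short-term problem offline for sampled scenarios, then fit a fast predictor to use inside the mid-term optimization loop. A minimal sketch, assuming a scikit-learn regressor and a hypothetical solve_acopf oracle (both illustrative, not from the talk):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def build_acopf_proxy(scenarios, solve_acopf):
    """Fit a fast proxy mapping a scenario (load/renewable profile)
    to the cost of the short-term ACOPF solution.

    scenarios   -- (n, d) array of sampled operating conditions
    solve_acopf -- hypothetical expensive oracle returning the optimal
                   dispatch cost for one scenario (assumed, for illustration)
    """
    X = np.asarray(scenarios)
    y = np.array([solve_acopf(s) for s in X])  # expensive: run once, offline
    proxy = RandomForestRegressor(n_estimators=200)
    proxy.fit(X, y)
    return proxy  # proxy.predict(...) is cheap inside the mid-term loop
```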
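The two-timescale structure mentioned for the hierarchical RL algorithm, fast value-function evaluation coupled with slow policy improvement, can be illustrated by a generic actor-critic-style pair of stochastic approximation updates with separated step sizes (a sketch under assumed environment and feature-map interfaces, not the algorithm from the talk):

```python
import numpy as np

def two_timescale_actor_critic(env, features, dim, T=100_000, gamma=0.99, seed=0):
    """Generic two-timescale stochastic approximation: critic weights w
    move on a fast step size a_t, actor parameters theta on a slow step
    size b_t, with b_t / a_t -> 0 so the critic tracks the current policy.
    The env API (reset/step), the linear feature map, and the Gaussian
    policy are illustrative assumptions only.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)   # slow timescale: policy parameters
    w = np.zeros(dim)       # fast timescale: value-function weights
    s = env.reset()
    for t in range(1, T + 1):
        a_t = t ** -0.6     # fast (critic) step size
        b_t = t ** -0.9     # slow (actor) step size, b_t = o(a_t)
        phi = features(s)
        action = theta @ phi + rng.normal()   # Gaussian policy, unit variance
        s_next, r, done = env.step(action)
        phi_next = features(s_next)
        # Fast timescale: TD(0) evaluation of the current policy.
        delta = r + gamma * (w @ phi_next) - (w @ phi)
        w = w + a_t * delta * phi
        # Slow timescale: policy-gradient-style step; delta approximates the
        # advantage, and (action - theta @ phi) * phi is the score function
        # of this Gaussian policy.
        theta = theta + b_t * delta * (action - theta @ phi) * phi
        s = env.reset() if done else s_next
    return theta, w
```

The step-size condition b_t = o(a_t) is exactly what the two-timescale stochastic approximation analysis mentioned at the end of the abstract formalizes: the fast iterate equilibrates relative to the slowly moving one.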

Host: Michael Chertkov