Date of Award: 2015
Doctor of Philosophy (PhD)
Electrical & Systems Engineering
Learning, prediction, and identification have been central topics in science and engineering for many years. Common to all of these problems is an agent that receives data and uses it to perform prediction and identification. The agent may process the data individually or interact with other agents in a network. The goal of this thesis is to address problems at the interface of statistical data processing, online learning, and network science, with a focus on developing distributed algorithms. These problems have widespread applications in several domains of systems engineering and computer science. Whether acting individually or in a group, the agent's main task is to understand how to treat data to infer the unknown parameters of the problem. To this end, the first part of this thesis addresses statistical processing of data. We start with the problem of distributed detection in multi-agent networks. In contrast to the existing literature, which focuses on asymptotic learning, we provide a finite-time analysis using a notion of Kullback-Leibler cost. We derive bounds on the cost in terms of the network size, the spectral gap, and the relative entropy of the data distribution. Next, we turn to an inverse-type problem in which the network structure is unknown and the outputs of a dynamical process (e.g., consensus dynamics) are given. We propose several network reconstruction algorithms that measure the network's response to inputs. Our algorithm reconstructs the Boolean structure (i.e., the existence and directions of links) of a directed network from a series of dynamical responses. The second part of the thesis centers on online learning, where data is received sequentially. As an example of collaborative learning, we consider the stochastic multi-armed bandit problem in a multi-player network. Players explore a pool of arms with payoffs generated from player-dependent distributions. When pulling an arm, each player observes only a noisy payoff of the chosen arm. The goal is to maximize a global welfare or to find the best global arm. Hence, players exchange information locally to benefit from side observations. We develop a distributed online algorithm with logarithmic regret with respect to the best global arm, and we generalize our results to the case in which the availability of arms varies over time. We then return to individual online learning, where one learner plays against an adversary. We develop a fully adaptive algorithm that takes advantage of regularity in the sequence of observations, retains worst-case performance guarantees, and performs well against complex benchmarks. Our method competes with dynamic benchmarks, with a regret guarantee that scales with the regularity of the sequence of cost functions and comparators. Notably, the regret bound adapts to the smaller complexity measure in the problem environment.
Shahrampour, Shahin, "Online and Statistical Learning in Networks" (2015). Publicly Accessible Penn Dissertations. 1999.