#### Title

Bayesian Network Games

#### Date of Award

2015

#### Degree Type

Dissertation

#### Degree Name

Doctor of Philosophy (PhD)

#### Graduate Group

Electrical & Systems Engineering

#### First Advisor

Alejandro Ribeiro

#### Second Advisor

Rakesh Vohra

#### Abstract

This thesis builds on the realization that Bayesian Nash equilibria are the natural definition of optimal behavior in a network of distributed autonomous agents. Game equilibria often serve as behavioral models of competing rational agents whose actions are strategic reactions to the predicted actions of other players. In autonomous systems, however, equilibria are used as models of optimal behavior for a different reason: agents are forced to play strategically against inherent uncertainty. While agents may have conflicting intentions, more often than not their goals are aligned. However, barring unreasonable accuracy of environmental information and unjustifiable levels of coordination, they still cannot be sure what the actions of other agents will be. Agents have to focus their strategic reasoning on what they believe the information available to other agents is and on how they think other agents will respond to this hypothetical information, and then choose what they deem to be their best response to these uncertain estimates. If agents model the behavior of each other as equally strategic, the optimal response of the network as a whole is a Bayesian Nash equilibrium. We say that the agents are playing a Bayesian network game when they repeatedly act according to a stage Bayesian Nash equilibrium and receive information from their neighbors in the network.

The first part of the thesis is concerned with the development and analysis of algorithms that agents can use to compute their equilibrium actions in a game of incomplete information with repeated interactions over a network. In general, the burden of computing a Bayesian Nash equilibrium in repeated games is overwhelming. This thesis shows that actions are computable in the particular case when the local information that agents receive follows a Gaussian distribution and the game's payoff is represented by a utility function that is quadratic in the actions of all agents and an unknown parameter. This solution comes in the form of the Quadratic Network Game filter, which agents can run locally, i.e., without access to all private signals, to compute their equilibrium actions. For the more generic payoff case of Bayesian potential games, i.e., payoffs represented by a potential function that depends on population actions and an unknown state of the world, distributed versions of fictitious play that converge to a Nash equilibrium with identical beliefs on the state are derived. These algorithms highlight the fact that, in order to determine optimal actions, two problems have to be solved: (i) construction of a belief on the state of the world and the actions of other agents, and (ii) determination of optimal responses to the acquired beliefs. In the case of symmetric and strictly supermodular games, i.e., games with coordination incentives, the thesis also derives qualitative properties of Bayesian network games played in the time limit. In particular, we ask whether agents that play and observe equilibrium actions are able to coordinate on an action and learn about others' behavior from only observing peers' actions. The analysis described here shows that agents eventually coordinate on a consensus action.
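
As an illustration of why the Gaussian-quadratic case is tractable, the following minimal sketch computes a linear Bayesian Nash equilibrium for a single stage of a simple quadratic game. It is not the Quadratic Network Game filter from the thesis; the specific payoff (a "beauty contest" utility with coordination weight `lam`), the zero-mean Gaussian prior, and all function names are assumptions made for this example. Each agent observes a private signal x_i = θ + noise and plays a_i = c · x_i; iterating the best response on the coefficient c converges to the equilibrium value.

```python
# Sketch only: a one-shot quadratic game with Gaussian signals.
# Assumed (hypothetical) payoff for agent i:
#   u_i = -(1 - lam) * (a_i - theta)**2 - lam * (a_i - mean of others' actions)**2
# Assumed information: theta ~ N(0, prior_var), x_i = theta + N(0, noise_var).

def posterior_weight(prior_var: float, noise_var: float) -> float:
    """Weight kappa such that E[theta | x_i] = kappa * x_i under the assumed prior."""
    return prior_var / (prior_var + noise_var)


def linear_bne_coefficient(lam: float, prior_var: float, noise_var: float,
                           iters: int = 200) -> float:
    """Iterate best responses over linear strategies a_j = c * x_j.

    If others play a_j = c * x_j, then E[a_j | x_i] = c * kappa * x_i, so the
    best response is a_i = ((1 - lam) * kappa + lam * kappa * c) * x_i.
    """
    kappa = posterior_weight(prior_var, noise_var)
    c = 0.0
    for _ in range(iters):
        c = (1.0 - lam) * kappa + lam * kappa * c
    return c


def closed_form_coefficient(lam: float, prior_var: float, noise_var: float) -> float:
    """Fixed point of the best-response map above."""
    kappa = posterior_weight(prior_var, noise_var)
    return (1.0 - lam) * kappa / (1.0 - lam * kappa)


if __name__ == "__main__":
    c_iter = linear_bne_coefficient(lam=0.5, prior_var=1.0, noise_var=1.0)
    c_exact = closed_form_coefficient(lam=0.5, prior_var=1.0, noise_var=1.0)
    print(c_iter, c_exact)
```

The two steps named in the abstract appear here in miniature: `posterior_weight` builds the belief, and the update inside `linear_bne_coefficient` is the best response to that belief. In a repeated network game the beliefs would also have to be updated from neighbors' observed actions, which this sketch omits.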

The second part of this thesis considers the application of the algorithms developed in the first part to the analysis of energy markets. Consumer demand profiles and fluctuating renewable power generation are two main sources of uncertainty in matching demand and supply in an energy market. We propose a model of the electricity market that captures the uncertainties on both the operator and the user sides. The system operator (SO) implements a temporal linear pricing strategy that depends on real-time demand and renewable generation in the considered period, combining Real-Time Pricing with Time-of-Use Pricing. The announced pricing strategy sets up a noncooperative game of incomplete information among users with heterogeneous but correlated consumption preferences. An explicit characterization of the optimal user behavior using the Bayesian Nash equilibrium solution concept is derived. This explicit characterization allows the SO to derive pricing policies that influence demand to serve practical objectives such as minimizing the peak-to-average ratio or attaining a desired rate of return. Numerical experiments show that the pricing policies yield close-to-optimal welfare values while improving these practical objectives. We then analyze the sensitivity of the proposed pricing schemes to user behavior and information exchange models. Selfish, altruistic, and welfare-maximizing user behavior models are considered. Furthermore, information exchange models in which users only have private information, communicate with each other, or receive broadcast information are considered. For each pair of behavior and information exchange models, the rational price-anticipating consumption strategy is characterized. In all of the information exchange models, equilibrium actions can be computed using the Quadratic Network Game filter. Further experiments reveal that the communication model is beneficial for the expected aggregate payoff, while it does not affect the expected net revenue of the system operator. Moreover, additional information to the users reduces the variance of total consumption among runs, increasing the accuracy of demand predictions.
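
To make the notion of price-anticipating consumption concrete, here is a minimal complete-information sketch. The abstract does not state the model's functional forms, so everything below is an assumption for illustration: a price that is linear in total consumption plus a renewable term, `price = gamma * (total + omega)`, and a user payoff `-price * l_i + g_i * l_i - alpha * l_i**2` with known preferences `g_i`. The incomplete-information equilibrium studied in the thesis would replace the known preferences with conditional expectations; this sketch only shows the benchmark in which all preferences are known.

```python
# Sketch only: Nash equilibrium consumption under an assumed linear price.
# Hypothetical model (not taken from the source):
#   price = gamma * (sum_j l_j + omega)
#   u_i   = -price * l_i + g_i * l_i - alpha * l_i**2
# First-order condition for user i:
#   (gamma + 2*alpha) * l_i + gamma * S = g_i - gamma * omega,  S = sum_j l_j
# Nonnegativity of consumption is not enforced here.
from typing import List


def equilibrium_consumption(g: List[float], gamma: float, alpha: float,
                            omega: float) -> List[float]:
    """Solve the first-order conditions in closed form via the total S."""
    n = len(g)
    total = (sum(g) - n * gamma * omega) / (gamma * (n + 1) + 2.0 * alpha)
    return [(g_i - gamma * omega - gamma * total) / (gamma + 2.0 * alpha)
            for g_i in g]


def best_response(i: int, loads: List[float], g: List[float], gamma: float,
                  alpha: float, omega: float) -> float:
    """Payoff-maximizing consumption of user i given the others' loads."""
    others = sum(loads) - loads[i]
    return (g[i] - gamma * omega - gamma * others) / (2.0 * (gamma + alpha))


if __name__ == "__main__":
    g = [10.0, 12.0, 8.0]
    loads = equilibrium_consumption(g, gamma=1.0, alpha=0.5, omega=0.0)
    # At equilibrium every user's load equals its own best response.
    gaps = [abs(loads[i] - best_response(i, loads, g, 1.0, 0.5, 0.0))
            for i in range(len(g))]
    print(loads, max(gaps))
```

Because each user accounts for the effect of its own load on the price, the equilibrium load shrinks as `gamma` grows, which is the lever a pricing policy of this assumed form would use to shape demand.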

#### Recommended Citation

Eksin, Ceyhun, "Bayesian Network Games" (2015). *Publicly Accessible Penn Dissertations*. 1052.

https://repository.upenn.edu/edissertations/1052