Playing Games With Approximation Algorithms

Loading...
Thumbnail Image
Penn collection
Statistics Papers
Degree type
Discipline
Subject
approximation algorithms
regret minimization
online linear optimization
Statistics and Probability
Theory and Algorithms
Funder
Grant number
License
Copyright date
Distributor
Related resources
Author
Kakade, Sham M
Kalai, Adam T
Ligett, Katrina
Contributor
Abstract

In an online linear optimization problem, on each period t, an online algorithm chooses st ∈ S from a fixed (possibly infinite) set S of feasible decisions. Nature (who may be adversarial) chooses a weight vector wt ∈ Rn, and the algorithm incurs cost c(st,wt), where c is a fixed cost function that is linear in the weight vector. In the full-information setting, the vector wt is then revealed to the algorithm, and in the bandit setting, only the cost experienced, c(st,wt), is revealed. The goal of the online algorithm is to perform nearly as well as the best fixed s ∈ S in hindsight. Many repeated decision-making problems with weights fit naturally into this framework, such as online shortest-path, online TSP, online clustering, and online weighted set cover. Previously, it was shown how to convert any efficient exact offline optimization algorithm for such a problem into an efficient online bandit algorithm in both the full-information and the bandit settings, with average cost nearly as good as that of the best fixed s ∈ S in hindsight. However, in the case where the offline algorithm is an approximation algorithm with ratio α > 1, the previous approach only worked for special types of approximation algorithms. We show how to convert any offline approximation algorithm for a linear optimization problem into a corresponding online approximation algorithm, with a polynomial blowup in runtime. If the offline algorithm has an α-approximation guarantee, then the expected cost of the online algorithm on any sequence is not much larger than α times that of the best s ∈ S, where the best is chosen with the benefit of hindsight. Our main innovation is combining Zinkevich's algorithm for convex optimization with a geometric transformation that can be applied to any approximation algorithm. Standard techniques generalize the above result to the bandit setting, except that a "Barycentric Spanner" for the problem is also (provably) necessary as input. Our algorithm can also be viewed as a method for playing largerepeated games, where one can only compute approximate best-responses, rather than best-responses.

Advisor
Date of presentation
2007-01-01
Conference name
Statistics Papers
Conference dates
2023-05-17T15:27:29.000
Conference location
Date Range for Data Collection (Start Date)
Date Range for Data Collection (End Date)
Digital Object Identifier
Series name and number
Volume number
Issue number
Publisher
Publisher DOI
Journal Issue
Comments
At the time of publication, author Sham M. Kakade was affiliated with Toyota Technological Institute, Chicago. Currently (August 2016), he is a faculty member at the Statistics Department at the University of Pennsylvania.
Recommended citation
Collection