Graph Convolutions For Teams Of Robots
Many robotics applications involve teams of robots operating in dynamic environments, which requires the design of complex communication and control schemes. The problem becomes easier if one assumes an oracle that has instantaneous access to the states of all entities in the environment and can communicate with all of them simultaneously and without loss. However, such an assumption is unrealistic, especially when the number of robots is large. In this work, we are interested in decentralized control policies for teams of robots that use only local communication and sensory information to achieve high-level team objectives. We first make the case for using distributed reinforcement learning to learn local behaviors by optimizing a sparse team-wide reward, as opposed to existing model-based methods. A central limitation of learning policies with model-free reinforcement learning is the lack of scalability. To scale to large teams, we introduce a novel paradigm in which the policies are parametrized by graph convolutions. We also develop new methodologies to train these policies and derive technical insights into their behaviors. Building upon these, we design perception-action loops for teams of robots that rely only on noisy visual sensors, a learned history state, and local information from nearby robots to achieve complex team-wide objectives. We demonstrate the effectiveness of our methods on several large-scale multi-robot tasks.
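As a rough illustration of what parametrizing a policy by graph convolutions can mean, the sketch below applies a polynomial graph filter, y = sum_k h_k S^k x, to a per-robot signal. The filter form, the function name `graph_convolution`, the four-robot line graph, and the tap values are all assumptions made for illustration; the text above does not specify the architecture. The point is that each power of S costs one round of exchange with immediate neighbors, so the same small set of taps can be reused on a team of any size.

```python
def graph_convolution(adjacency, signal, taps):
    """Return sum_k taps[k] * (S^k x) using repeated neighbor exchanges.

    adjacency: n x n list of lists (S), nonzero where two robots can communicate
    signal:    one scalar feature per robot (x)
    taps:      filter coefficients h_0..h_K, shared by every robot
    """
    n = len(signal)
    shifted = list(signal)                      # S^0 x: each robot's own value
    output = [taps[0] * v for v in shifted]
    for h in taps[1:]:
        # One communication round: each robot aggregates its neighbors' values.
        shifted = [
            sum(adjacency[i][j] * shifted[j] for j in range(n))
            for i in range(n)
        ]
        output = [o + h * s for o, s in zip(output, shifted)]
    return output


# Hypothetical team of four robots connected in a line: 0 - 1 - 2 - 3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
signal = [1.0, 0.0, 0.0, 0.0]   # only robot 0 observes something
taps = [0.5, 0.25, 0.125]       # K = 2, so information travels at most 2 hops

filtered = graph_convolution(adjacency, signal, taps)
print(filtered)
```

With two exchange rounds, robot 3 (three hops from robot 0) still receives nothing, which shows the locality of the operation. In a learned policy the taps would be trained parameters and the signal would typically be a feature vector per robot rather than a scalar.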