Date of Award
Doctor of Philosophy (PhD)
Electrical & Systems Engineering
With field-programmable gate arrays (FPGAs) being widely deployed in data centers to enhance the computing performance, an efficient virtualization support is required to fully unleash the potential of cloud FPGAs. However, the system support for FPGAs in the context of the cloud environment is still in its infancy, which leads to a low resource utilization due to the tight coupling between compilation and resource allocation. Moreover, the system support proposed in existing works is limited to a homogeneous FPGA cluster comprising identical FPGA devices, which is hard to be extended to a heterogeneous FPGA cluster that comprises multiple types of FPGAs. As the FPGA cloud is expected to become increasingly heterogeneous due to the hardware rolling upgrade strategy, it is necessary to provide efficient virtualization support for the heterogeneous FPGA cluster.
In this dissertation, we first identify three pairs of conflicting requirements from runtime management and offline compilation, which are related to the tradeoff between flexibility and efficiency. These conflicting requirements are the fundamental reason why the single-level abstraction proposed in prior works for the homogeneous FPGA cluster cannot be trivially extended to the heterogeneous cluster. To decouple these conflicting requirements, we provide a two-level system abstraction. Specifically, the high-level abstraction is FPGA-agnostic and provides a simple and homogeneous view of the FPGA resources to simplify the runtime management and maximize the flexibility. On the contrary, the low-level abstraction is FPGA-specific and exposes sufficient low-level hardware details to the compilation framework to ensure the mapping quality and maximize the efficiency. This generic two-level system abstraction can also be specialized to the homogeneous FPGA cluster and/or be extended to leverage application-specific information to further improve the efficiency. We also develop a compilation framework and a modular runtime system with a heuristic-based runtime management policy to support this two-level system abstraction. By enabling a dynamic FPGA sharing at the sub-FPGA granularity, the proposed virtualization solution can deploy 1.62x more applications using the same amount of FPGA resources and reduce the compilation time by 22.6% (perform as many compilation tasks in parallel as possible) with an acceptable virtualization overhead, i.e.,
Finally, we use Liquid Silicon as a case study to show that the proposed virtualization solution can be extended to other spatial reconfigurable architectures. Liquid Silicon is a homogeneous reconfigurable architecture enabled by the non-volatile memory technology (i.e., RRAM). It extends the configuration capability of existing FPGAs from computation to the whole spectrum ranging from computation to data storage. It allows users to better customize hardware by flexibly partitioning hardware resources between computation and memory based on the actual usage. Instead of naively applying the proposed virtualization solution onto Liquid Silicon, we co-optimize the system abstraction and Liquid Silicon architecture to improve the performance.
Zha, Yue, "Virtualizing Reconfigurable Architectures: From Fpgas To Beyond" (2022). Publicly Accessible Penn Dissertations. 5418.