Accelerating HLS Autotuning of Large, Highly-parameterized Reconfigurable SoC Mappings

Degree type
Doctor of Philosophy (PhD)
Graduate group
Electrical and Systems Engineering
Discipline
Electrical Engineering
Subject
Bayesian optimization
Design space exploration
High-level synthesis
Reconfigurable computing
Funder
Grant number
License
Copyright date
2023
Distributor
Related resources
Author
Giesen, Johannes
Contributor
Abstract

High-level synthesis has accelerated the adoption of autotuners to explore the design spaces of applications mapped on systems-on-chip with reconfigurable logic. Design-space size increases exponentially in the number of design parameters, and building a single configuration of a full application easily consumes hours, so existing autotuners are frequently demonstrated on small kernels and small design spaces to render the problem tractable. This dissertation shows that applications with more than 30 parameters mapped on reconfigurable SoCs with more than 200k LUTs can be explored in less than 12 build times on an 8-core host using the model-based approach we refine. We explore various techniques to reduce tuning time. At the heart of our tuner is an iterative refinement approach that builds a prediction model representing the design space. Our models are multi-fidelity models, which enable discontinuation of unpromising builds in multi-stage CAD flows. We organize build resources into a pipeline to improve tuning performance and increase resource utilization. Build failures are mitigated in several ways: invalid accelerator configurations are replaced with valid ones on-the-fly, and routing errors caused by congestion are mitigated through congestion models. Because the curse of dimensionality deteriorates performance quickly as the number of parameters increases, we apply dimensionality reduction to focus on the most important parameters. To validate our approach, we injected 32-46 parameters, varying from pragmas to CAD tool parameters, into the Rosetta benchmarks. Compared to OpenTuner, our tuner succeeds 71% more often at finding mappings onto the ZCU102 within 12 hours, and the mappings found are 3.5x faster. Alternatively, tuning runs are on average at least 8.8x shorter.
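The core loop the abstract describes, a surrogate model proposing configurations while a cheaper early-stage estimate gates expensive full builds, can be sketched roughly as follows. This is a minimal illustration only: the function names and the toy objective are hypothetical stand-ins, the surrogate here is a simple nearest-neighbor lookup rather than the Bayesian multi-fidelity models the dissertation uses, and the real tuner additionally pipelines builds and reduces dimensionality.

```python
import random

def build_stage1(cfg):
    # Cheap early-stage estimate (hypothetical stand-in for, e.g., HLS
    # synthesis estimates available before place-and-route).
    return sum((x - 3) ** 2 for x in cfg)

def build_full(cfg):
    # Expensive full-build result (hypothetical stand-in for a complete
    # CAD flow through bitstream generation).
    return build_stage1(cfg) + 0.1 * sum(cfg)

def predict(cfg, history):
    # Toy surrogate: score of the closest previously evaluated config.
    # The dissertation's tuner uses a learned multi-fidelity model instead.
    if not history:
        return 0.0
    nearest = min(history,
                  key=lambda h: sum((a - b) ** 2 for a, b in zip(h[0], cfg)))
    return nearest[1]

def tune(n_iters=30, n_candidates=50, dim=4, seed=0):
    rng = random.Random(seed)
    history = []                      # (config, full-build score) pairs
    best = (None, float("inf"))
    for _ in range(n_iters):
        # Propose random candidates; pick the one the surrogate likes best.
        cands = [tuple(rng.randint(0, 6) for _ in range(dim))
                 for _ in range(n_candidates)]
        cfg = min(cands, key=lambda c: predict(c, history))
        # Multi-fidelity gate: discontinue builds whose cheap early-stage
        # estimate looks unpromising relative to the best full build so far.
        if history and build_stage1(cfg) > 2 * best[1]:
            continue
        score = build_full(cfg)
        history.append((cfg, score))
        if score < best[1]:
            best = (cfg, score)
    return best
```

The gate is what makes the multi-stage structure pay off: unpromising configurations consume only the cheap first stage, so the budget of full builds is concentrated on the region of the design space the model currently believes is good.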

Advisor
DeHon, André M.
Date of degree
2023
Date Range for Data Collection (Start Date)
Date Range for Data Collection (End Date)
Digital Object Identifier
Series name and number
Volume number
Issue number
Publisher
Publisher DOI
Journal Issue
Comments
Recommended citation