Summary
We would like to add a new backend for the PandA-bambu high-level synthesis tool to hls4ml.
The idea is to start from the existing Vivado HLS backend and reuse the same structure and nnet_utils C++ templates, adapting the necessary pieces to generate PandA-bambu projects and to support its FPGA and ASIC flows.
In the long term, our goal is to provide feature parity (layers and functionality) with the Vivado/Vitis HLS backend for the new PandA-bambu backend.
Motivation and context
hls4ml already supports multiple HLS backends (Vivado/Vitis, Intel, Catapult, oneAPI), allowing users to target a range of FPGA and ASIC design flows from the same high-level ML description.
PandA-bambu is an open-source research framework for high-level synthesis of complex applications from C/C++ (and LLVM IR), providing flows for both FPGAs and ASICs.
It can interface with several logic synthesis and implementation toolchains, such as Quartus, Vivado/Vitis, Impulse from NX, and OpenROAD-based ASIC flows (e.g., Nangate45, ASAP7). Adding a PandA-bambu backend to hls4ml would therefore immediately unlock support for all FPGA families and ASIC flows that PandA-bambu currently supports.
Our aim is to:
- enable the standard hls4ml workflow (Keras / PyTorch / ONNX → hls4ml → C++/HLS → hardware) using PandA-bambu as the HLS engine;
- leverage PandA-bambu’s open-source nature and diverse FPGA/ASIC support in contexts where commercial tools are not ideal (e.g., academic, research, or open-source ASIC flows).
High-level design
We plan to base the new backend closely on the existing Vivado/Vitis backend and follow patterns already used in other backends derived from it (e.g., the Catapult backend).
Concretely:
- Backend structure
  - Introduce a new backend entry (e.g., backend name `bambu`) mirroring the structure of the Vivado HLS backend.
  - Reuse and adapt the existing `nnet_utils` templates so that the generated C++ is compatible with PandA-bambu’s expectations (C/C++/LLVM-based flow, top function, pragmas/options, etc.).
  - Implement a `Backend` subclass for PandA-bambu that:
    - configures the build directory and project structure;
    - maps hls4ml configuration options (e.g., clock period, reuse factor, precision types) to PandA-bambu command-line options and scripts;
    - handles device/flow selection (FPGA vs. ASIC) where appropriate.
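As a concrete illustration of the option mapping described above, the sketch below translates hls4ml-style settings into a `bambu` argument list. The flag names (`--top-fname`, `--clock-period`, `--device-name`, `--evaluation`) follow PandA-bambu’s documented CLI; the helper function itself and the example device string are illustrative assumptions, not existing hls4ml code.

```python
# Hypothetical helper: map hls4ml-style configuration values onto
# PandA-bambu command-line flags. Only the flag names are taken from
# bambu's documentation; everything else here is an illustration.

def bambu_flags(top_function, clock_period_ns, device=None, evaluate=True):
    """Build the argument list for a `bambu` invocation."""
    flags = [
        f'--top-fname={top_function}',
        f'--clock-period={clock_period_ns}',
    ]
    if device is not None:
        # Device/flow selection, e.g. an FPGA part known to bambu.
        flags.append(f'--device-name={device}')
    if evaluate:
        # Ask bambu to run its evaluation flow and report timing/area.
        flags.append('--evaluation')
    return flags

print(bambu_flags('myproject', 10, device='xc7z020-1clg484-VVD'))
```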
- Code generation
  - Start from the Vivado backend writer and adapt:
    - generated C++ signatures (top function, arguments) for PandA-bambu;
    - build scripts / Makefiles / helper scripts to call `bambu` with the required options;
    - any pragmas or compiler attributes that are Vivado-specific, replacing them with PandA-bambu-compatible constructs when needed.
  - The intent is to keep the same C++/template layer implementation wherever possible, to maximize shared maintenance with the Vivado/Vitis backend.
- Tool invocation and flows
  - Support at least:
    - a “project generation only” mode (generate C++ and scripts; the user runs PandA-bambu manually); and
    - a “build” mode where `hls_model.build()` drives the complete PandA-bambu flow, including:
      - high-level synthesis and RTL generation,
      - synthesis and out-of-context place-and-route for the selected FPGA/ASIC flow (where supported),
      - extraction and parsing of implementation reports.
  - Allow selecting the downstream flow (FPGA family, ASIC/OpenROAD flow, etc.) through hls4ml configuration options that are passed through to PandA-bambu.
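The two modes above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned implementation: the script name `build_prj.sh`, the source file name, and the exact flag set are placeholders.

```python
# Hypothetical sketch of the two proposed modes. "Project generation only"
# writes a build script the user can run by hand; "build" mode would
# additionally execute it to drive the full PandA-bambu flow.

import os
import subprocess

def generate_project(out_dir, top_function, clock_period_ns, run=False):
    """Write a build script for bambu; optionally execute it (build mode)."""
    os.makedirs(out_dir, exist_ok=True)
    cmd = (f'bambu myproject.cpp --top-fname={top_function} '
           f'--clock-period={clock_period_ns} --evaluation')
    script = os.path.join(out_dir, 'build_prj.sh')
    with open(script, 'w') as f:
        f.write('#!/bin/sh\n' + cmd + '\n')
    if run:
        # "build" mode: run HLS, synthesis, and (where supported) P&R.
        subprocess.run(['sh', script], check=True)
    return script
```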
Scope of the initial implementation
For the first iteration, our plan is:
- Target feature set
  - Layers and features: focus on the same layer set currently supported by the Vivado/Vitis HLS backend (e.g., Dense, convolutional layers, pooling, BatchNorm, activations), reusing the `nnet_utils` implementations as much as possible.
  - Models: MLPs and CNNs, with the aim of supporting the same model types that are already well supported by Vivado/Vitis in hls4ml.
- Supported flows
  - FPGA synthesis for boards/devices supported by PandA-bambu (via its existing FPGA flows).
  - ASIC-oriented flows through PandA-bambu’s integration with open-source toolchains (e.g., Yosys + OpenROAD), where users already have those environments configured.
  - In `build` mode, our goal already in the initial implementation is to run the flow up to post–place-and-route (out of context) and to base the reported performance on post-implementation results rather than purely on HLS-level estimates, whenever the selected PandA-bambu flow supports it.
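Report extraction for the post-implementation results mentioned above could look roughly like the sketch below. The log snippet is a made-up stand-in (PandA-bambu’s actual report format may differ); the point is that the backend would scan the tool’s reports and expose the figures through hls4ml’s usual report dictionary.

```python
# Hypothetical report parser. SAMPLE_LOG is an invented stand-in for a
# tool report, used only to show the intended regex-scan-and-expose shape.

import re

SAMPLE_LOG = """\
Total cycles: 128
Slack: 0.42 ns
Registers: 1536
LUTs: 2048
"""

def parse_report(text):
    """Extract timing/resource figures from a (hypothetical) report."""
    patterns = {
        'cycles': r'Total cycles:\s*(\d+)',
        'slack_ns': r'Slack:\s*([\d.]+)',
        'registers': r'Registers:\s*(\d+)',
        'luts': r'LUTs:\s*(\d+)',
    }
    report = {}
    for key, pat in patterns.items():
        m = re.search(pat, text)
        if m:
            report[key] = float(m.group(1))
    return report

print(parse_report(SAMPLE_LOG))
```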
- Out of scope for the first PR
  - Any advanced, backend-specific optimizations beyond what is already present in the Vivado/Vitis backend.
  - Potential PandA-bambu-specific extensions (e.g., graph-oriented optimizations, special flows) that go beyond the core hls4ml patterns.
  - Full coverage of all exotic/experimental layers; the priority is to reach parity with Vivado/Vitis for the commonly used layers.
Testing plan
At this stage, we plan to rely primarily on the existing regression tests in hls4ml:
- Start from the current regression tests used for the Vivado/Vitis backend and:
  - enable an analogous set for the PandA-bambu backend;
  - compare the generated C++/HLS code where appropriate;
  - verify that the full PandA-bambu flow driven by `hls_model.build()` runs to completion, including synthesis and out-of-context place-and-route for the selected flow, and that timing/resource reports are correctly parsed and exposed.
- As the backend stabilizes, we plan to extend test coverage to:
  - more layer types and model topologies;
  - basic checks on reported latency and resource usage (within reasonable tolerances between tools).
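The cross-backend tolerance checks mentioned above could be as simple as the following pure-Python sketch (the existing hls4ml test suite uses numpy for this; `outputs_match` is an illustrative helper, not an existing API):

```python
# Hypothetical check: compare predictions from two backends element-wise
# within a tolerance, as hls4ml's pytest suite does with numpy.testing.

import math

def outputs_match(reference, candidate, rel_tol=1e-2, abs_tol=1e-2):
    """True when every element agrees within the given tolerances."""
    return len(reference) == len(candidate) and all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )

# Small numerical differences between HLS tools are tolerated.
print(outputs_match([0.1, 0.7, 0.2], [0.101, 0.699, 0.2]))
```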
If there is interest later, we can explore adding a lightweight CI configuration for PandA-bambu (or document how to run the PandA-bambu-specific test suite externally, if tool installation in CI is not practical).
Roadmap
Our planned steps:
- Clone the Vivado/Vitis backend code and create a PandA-bambu variant, adjusting:
  - backend registration;
  - writer/project generator;
  - any Vivado-specific pragmas and scripts.
- Make the minimal changes in `nnet_utils` and related files required for PandA-bambu compatibility, while keeping common code shared as much as possible.
- Add a configuration path in hls4ml that allows:
  - selecting `bambu` as `backend` in the config/CLI;
  - passing PandA-bambu options (device/flow selection, top function, etc.).
- Integrate with the existing regression tests and enable a reasonable subset for the new backend.
- Iterate to extend layer coverage until the PandA-bambu backend reaches parity with the Vivado/Vitis backend in terms of supported layers and model types.
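For illustration, a user-facing configuration along the lines of the roadmap above might look like the dictionary below. The backend name `'Bambu'` and all `BambuOptions` keys are assumptions from this proposal, not an existing hls4ml API; today’s configs look similar but target `'Vivado'` or `'Vitis'`.

```python
# Hypothetical user-facing configuration for the proposed backend.
# Every Bambu-specific key below is an assumption for illustration.
config = {
    'Backend': 'Bambu',                   # proposed backend name
    'Model': {
        'Precision': 'ap_fixed<16,6>',    # standard hls4ml precision type
        'ReuseFactor': 1,
    },
    'BambuOptions': {                     # illustrative backend-specific section
        'Device': 'xc7z020-1clg484-VVD',  # FPGA part, or an ASIC flow such as Nangate45
        'ClockPeriod': 10,                # ns, mapped to --clock-period
        'Flow': 'FPGA',                   # 'FPGA' or 'ASIC'
    },
}
print(sorted(config))
```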
We already foresee a decomposition of the work into the following tasks, each backed by existing tests or template code in hls4ml:
1️⃣ Softmax
Based on test_softmax.py (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/b493ea7fd282ed969e250a94a0738f29f144ac7e/test/pytest/test_softmax.py#L38). See the first work done in #5.
2️⃣ Fixed-point quantizers (QKeras)
Based on test_qkeras.py (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/b493ea7fd282ed969e250a94a0738f29f144ac7e/test/pytest/test_qkeras.py#L286)
3️⃣ Activations
Based on test_activations.py (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/b493ea7fd282ed969e250a94a0738f29f144ac7e/test/pytest/test_activations.py#L38)
4️⃣ Dense layers
Based on the Keras API tests (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/b493ea7fd282ed969e250a94a0738f29f144ac7e/test/pytest/test_keras_api.py#L30)
5️⃣ Pooling
Based on the pooling tests (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/main/test/pytest/test_pooling.py)
6️⃣ Concat / Merge layers
Based on the merge tests (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/b493ea7fd282ed969e250a94a0738f29f144ac7e/test/pytest/test_merge.py#L83)
7️⃣ Compile, write, build, predict, trace
Based on the ModelGraph / HLSModel implementation and its methods (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/b493ea7fd282ed969e250a94a0738f29f144ac7e/hls4ml/model/graph.py#L797). See issue #4.
In this step, we plan to ensure that the PandA-bambu backend supports the full flow, including:
- script generation for synthesis and simulation;
- execution of the complete PandA-bambu flow (HLS + synthesis + out-of-context place-and-route, where supported);
- collection and exposure of post–place-and-route implementation reports.
8️⃣ Upsampling / image operations
Based on the existing Vivado templates (e.g.
https://github.com/fastmachinelearning/hls4ml/blob/ab1c4090b05bae9c1791975565eeca5495e98fb1/hls4ml/templates/vivado/nnet_utils/nnet_image.h#L19)
Feedback on this plan is welcome. If there are no major objections, we will start prototyping the backend based on the Vivado/Vitis implementation and open a draft PR once the first version is usable.