296 commits
cf5c9a1
snapshot
jmitrevs Feb 7, 2024
81f3e53
bug fixes from attempting to run
jmitrevs Feb 8, 2024
9a74e46
fix some bugs from qonnx pytest
jmitrevs Feb 12, 2024
60a74bb
fix assertion of not matching the number of inputs when replacing node
jmitrevs Feb 12, 2024
a032a5d
Merge remote-tracking branch 'vloncar/auto_precision' into qonnx-1p0
jmitrevs Feb 29, 2024
88a8d35
update some precisions inference
jmitrevs Feb 29, 2024
0379db2
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs Feb 29, 2024
10a3c50
extract bitwidth from size 1 array in quant node
jmitrevs Feb 29, 2024
ab8d67b
update automatic onnx configuration
jmitrevs Mar 2, 2024
0a863ad
standardize on merge operators
jmitrevs Mar 2, 2024
bfe6a3f
snapshot of current work
jmitrevs Mar 8, 2024
25849ef
Fix bug in FuseBatchNormalization
jmitrevs Mar 10, 2024
4485bf3
fix issue with configuration setup of test
jmitrevs Mar 11, 2024
52067c3
fix bug in FuseConsecutiveBatchNormalization
jmitrevs Mar 11, 2024
24d6245
add missing header
jmitrevs Mar 11, 2024
835af4e
attempt to make qonnx tests match better
jmitrevs Mar 11, 2024
4a41b63
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs Mar 12, 2024
2bcec04
fix pre-commit
jmitrevs Mar 12, 2024
b3facd2
remove count, become more selective on when True is returned
jmitrevs Apr 17, 2024
b580866
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs Apr 19, 2024
105b38a
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs Apr 19, 2024
0d8108e
fix optimizer issue when quantizer is None
jmitrevs Apr 19, 2024
229b44a
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs May 3, 2024
d509976
Merge branch 'main' into hls4ml-optimization-api-part-2
jmitrevs May 3, 2024
1fa59dc
update pytest image to 0.5.6
jmitrevs May 16, 2024
3d8912d
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
jmduarte May 28, 2024
65857a4
Merge branch 'main' into qonnx-1p0
jmitrevs May 30, 2024
b565067
Merge branch 'main' into qonnx-1p0
jmitrevs May 31, 2024
f1a238d
Merge remote-tracking branch 'upstream/main' into hw_opt_p2
vloncar Jun 4, 2024
8a48417
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
jmduarte Jun 4, 2024
a181d97
add vitis
jmduarte Jun 10, 2024
2a78f93
Merge branch 'main' into qonnx-1p0
jmitrevs Jun 25, 2024
c5841a2
seperate out parse_qonnx flow
jmitrevs Jun 25, 2024
de790ca
Again allow for None in target shape--for pytorch
jmitrevs Jun 26, 2024
0ea246c
Refactor matrix-multiplication kernel as a function pointer
vloncar Jul 15, 2024
6189953
Merge branch 'main' into qonnx-1p0
jmitrevs Jul 17, 2024
2909d15
Following what seems to be done in the main branch
jmitrevs Jul 18, 2024
c9693da
update infer_precision based on changes in keras-config-auto
jmitrevs Jul 19, 2024
aaaa2fc
loosen batchnorm merging restrictions, fix ternary handling
jmitrevs Jul 19, 2024
a2b88f4
remove some backends from slow qonnx test
jmitrevs Jul 19, 2024
169d9e5
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs Aug 21, 2024
ef02b4f
move multi_dense to conv above inferming precision types
jmitrevs Aug 21, 2024
c3ffa7b
fix the default reuse factor
jmitrevs Aug 21, 2024
2ed0865
Reorganize codegen of unrolled implementation
vloncar Aug 22, 2024
10f648c
Merge remote-tracking branch 'upstream/main' into hw_opt_p2
vloncar Aug 22, 2024
fbc4107
Remove mentions of dense_resource_implementation
vloncar Aug 25, 2024
ecda5c9
Default to 'auto' for pipeline style and move check to an optimizer
vloncar Aug 25, 2024
ce8431d
Pimp the docs a bit
vloncar Aug 25, 2024
3591ae5
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs Sep 3, 2024
cc7652d
Pre-commit fix
jmitrevs Sep 3, 2024
b36fe4f
fix qonnx review suggestions
jmitrevs Sep 4, 2024
c37d953
fix qonnx review suggestions (part 2)
jmitrevs Sep 4, 2024
23825de
fix error message
jmitrevs Sep 4, 2024
cad06fa
change order of qonnx optimizers
jmitrevs Sep 9, 2024
5e9f4d6
Merge branch 'main' into qonnx-1p0
jmitrevs Sep 11, 2024
51c80f9
make the optimizer oder be more similar to main branch
jmitrevs Sep 12, 2024
8e6dd58
Merge remote-tracking branch 'upstream/main' into qonnx-1p0
jmitrevs Sep 12, 2024
ce09665
Merge branch 'main' into qonnx-1p0
jmitrevs Sep 13, 2024
8eaf10a
fix dimensions when moving scales
jmitrevs Sep 19, 2024
d80dc3b
Added support and some missing parts for `Depthwise` and `Pointwise` …
jmitrevs Sep 20, 2024
fae647d
add seperable conv to test
jmitrevs Sep 23, 2024
56c85a4
fix pointwise with naming, quant_opt
jmitrevs Sep 24, 2024
b0efdd6
fix ConstantBatchNormFusion
jmitrevs Sep 24, 2024
14da6f5
update broadcasting for moving scales for conv
jmitrevs Sep 25, 2024
0333d36
snapshot of current development
jmitrevs Sep 26, 2024
80184d2
snapshot working through scale downs
jmitrevs Sep 26, 2024
6bb0817
finish making the various cases
jmitrevs Sep 26, 2024
766a14c
accidentally reverted the example models
jmitrevs Sep 26, 2024
5ff1373
some bug fixes
jmitrevs Sep 26, 2024
65e0127
Merge pull request #10 from jmitrevs/qonnx-1p0-sepconv-dev
jmitrevs Sep 29, 2024
da4f9e5
Merge branch 'main' into qonnx-1p0
jmitrevs Sep 29, 2024
86abdd2
update qonnx sepconv test
jmitrevs Sep 29, 2024
ac8d9fd
Merge branch 'main' into hw_opt_p2
vloncar Oct 1, 2024
eff80aa
Merge remote-tracking branch 'upstream/main' into hw_opt_p2
vloncar Oct 1, 2024
d30773f
update qkeras in Jenkinsfile
jmitrevs Oct 1, 2024
6363702
Merge branch 'main' into qonnx-1p0
jmitrevs Oct 1, 2024
09c5d5b
intial depthconv2d implementation
laurilaatu Oct 2, 2024
c92091b
intial depthconv2d implementation
laurilaatu Oct 2, 2024
8403348
Merge remote-tracking branch 'refs/remotes/origin/oneapi_separablecon…
laurilaatu Oct 2, 2024
15abf5b
Merge branch 'main' into update_jenkins
JanFSchulte Oct 2, 2024
afed23b
Merge pull request #1072 from jmitrevs/update_jenkins
JanFSchulte Oct 2, 2024
accadaf
Merge branch 'main' into qonnx-1p0
jmitrevs Oct 2, 2024
c4af46a
Rename "unrolled" -> "resource_unrolled"
vloncar Oct 7, 2024
97c5347
Move optimization API to "dsp_aware_pruning" module (new optimization…
vloncar Oct 7, 2024
5fbdae8
Hardcode weights loading (ensures weights loading works from any dir)
vloncar Oct 8, 2024
c596f30
Rename to depthconv, add strides and add tests
laurilaatu Oct 9, 2024
bcd8c70
Remove class for DepthwiseConv2D
laurilaatu Oct 9, 2024
c8d7fc6
merge
jmduarte Oct 9, 2024
a6a5c7f
add flow
jmduarte Oct 10, 2024
170999f
div roundup
jmduarte Oct 10, 2024
8981112
Remove Separable convolution template
laurilaatu Oct 10, 2024
5ad1188
Remove layer optimizer for sepconv
laurilaatu Oct 11, 2024
308af4e
[pre-commit.ci] pre-commit autoupdate
pre-commit-ci[bot] Oct 14, 2024
b4111c6
Merge pull request #1075 from fastmachinelearning/pre-commit-ci-updat…
jmitrevs Oct 16, 2024
12a2d1e
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
jmduarte Oct 17, 2024
4ec6387
update
jmduarte Oct 17, 2024
cfbad0b
Merge branch 'main' into hls4ml-optimization-api-part-2
JanFSchulte Oct 22, 2024
352c124
Merge pull request #809 from bo3z/hls4ml-optimization-api-part-2
JanFSchulte Oct 22, 2024
aaab34a
fix softmax parsing in pytorch and add test
JanFSchulte Oct 22, 2024
a9bfc6a
Merge branch 'main' into softmaxfix_torch
JanFSchulte Oct 22, 2024
655aef6
precommit
JanFSchulte Oct 22, 2024
61695b6
precommit v2
JanFSchulte Oct 22, 2024
a306e3f
add small tweak to fix issue 1054
JanFSchulte Oct 22, 2024
583a8c2
In softmax, max axis -1 if it's a positive index that's identical
jmitrevs Oct 23, 2024
9cbf0f1
add more onnx tests, optimize the handling of some attributes, update…
jmitrevs Oct 23, 2024
6ca1055
Figure out the weights dir automatically from the location of build_l…
vloncar Oct 23, 2024
12034d3
Merge remote-tracking branch 'upstream/main' into weight_txt_path
vloncar Oct 23, 2024
3ec6c5a
update qonnx documentation
jmitrevs Oct 24, 2024
39d0e91
Merge pull request #1089 from vloncar/weight_txt_path
JanFSchulte Oct 24, 2024
10eb161
Merge branch 'main' into qonnx-1p0
jmitrevs Oct 24, 2024
210f8c2
quote the to handle special characters
jmitrevs Oct 24, 2024
03096cf
Merge pull request #1091 from fastmachinelearning/path_special_char_esc
JanFSchulte Oct 24, 2024
fc0417b
Merge branch 'main' into qonnx-1p0
JanFSchulte Oct 24, 2024
4518537
Beginnings of the oneAPI backend (#955)
jmitrevs Oct 25, 2024
f9a2412
update keras activation parsing, especially leaky relu (#1085)
jmitrevs Oct 25, 2024
ab45708
Merge pull request #1086 from JanFSchulte/softmaxfix_torch
jmitrevs Oct 25, 2024
c75b29d
[pre-commit.ci] pre-commit autoupdate
pre-commit-ci[bot] Oct 28, 2024
25b08cf
Merge pull request #1098 from fastmachinelearning/pre-commit-ci-updat…
JanFSchulte Oct 28, 2024
f8beb3a
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
jmduarte Oct 29, 2024
6ca2f1b
roundup
jmduarte Oct 29, 2024
352772d
restore example-models
jmduarte Oct 29, 2024
cfad81f
Fix wrong note in README.md
bo3z Oct 29, 2024
82d059b
Change indexing in filling result for io_parallel convolutions, Vitis…
jmitrevs Oct 30, 2024
2c17f66
Merge pull request #979 from jmitrevs/qonnx-1p0
vloncar Oct 31, 2024
1a93246
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
jmduarte Nov 1, 2024
d37a843
remove pointwise conv implementation option; make it default
jmduarte Nov 1, 2024
f5629db
remove pointwise conv implementation option; make it default
jmduarte Nov 1, 2024
f4ae08f
Restore tab
jmduarte Nov 1, 2024
ecd6b04
Add back nnet_helpers.h
jmduarte Nov 1, 2024
6f5cbd9
format
jmduarte Nov 1, 2024
3c5b633
Merge branch 'main' into oneapi_separableconv
laurilaatu Nov 1, 2024
d422659
Merge branch 'main' into update-readme
jmitrevs Nov 5, 2024
fabcf8c
update the project status
jmitrevs Nov 5, 2024
b844acf
restructure of existing documentation
jmitrevs Nov 5, 2024
88e84f3
add an internal layers section, and auto precision
jmitrevs Nov 5, 2024
6abc8ad
pre-commit fixes
jmitrevs Nov 5, 2024
54657f9
make auto default precision for pytorch parser
JanFSchulte Nov 6, 2024
bd28050
add max_precision to onnx parser
jmitrevs Nov 6, 2024
6b9bf0c
Loop unroll
laurilaatu Nov 7, 2024
97d7186
remove incorrect input from Constant nodes
jmitrevs Nov 8, 2024
7612b4f
Merge pull request #1119 from fastmachinelearning/make_Constant_witho…
JanFSchulte Nov 11, 2024
e2fd8a5
Merge pull request #1113 from fastmachinelearning/onnx_parser_max_pre…
JanFSchulte Nov 11, 2024
dab7b85
Merge branch 'main' into pytorch_auto
JanFSchulte Nov 11, 2024
01d4f79
more default settings suggested by Jovan
JanFSchulte Nov 11, 2024
d947cdb
Merge branch 'pytorch_auto' of https://github.com/JanFSchulte/hls4ml …
JanFSchulte Nov 11, 2024
efb4379
Add RF to config templates for "Merge" layers
vloncar Nov 11, 2024
b44426d
Merge pull request #1121 from vloncar/merge_template_missing_rf
JanFSchulte Nov 11, 2024
364a0b7
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
jmitrevs Nov 12, 2024
6aeafdd
Merge branch 'main' into pytorch_auto
JanFSchulte Nov 12, 2024
8ebeefe
jovan comments
jmduarte Nov 13, 2024
4099c8d
p-clang-format
jmduarte Nov 13, 2024
d999ad8
p-clang-format
jmduarte Nov 13, 2024
5e5b81f
Introduce optional description to layer attributes
vloncar Nov 13, 2024
5616e5a
Add doc for HGQ (#1117)
calad0i Nov 13, 2024
1214b65
Pre-commit fix
vloncar Nov 13, 2024
daae96d
fix
jmduarte Nov 14, 2024
6d84b80
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
jmduarte Nov 14, 2024
677c738
multi output and flatten@streaming fix
calad0i Oct 29, 2024
04cbe83
relaxing remove node shape check cond
calad0i Oct 29, 2024
e73b3d3
fix regression errors
calad0i Oct 29, 2024
b055233
catapult and oneapi tests
calad0i Oct 29, 2024
26b4c54
add catapult 3-clone
calad0i Oct 29, 2024
702e4eb
rm ill-condition
calad0i Oct 29, 2024
d016612
allow removing nodes w/o i or o
calad0i Nov 8, 2024
d1a3b75
chore
calad0i Nov 9, 2024
bf6fe7a
cosmatic
calad0i Nov 10, 2024
7b58c1d
typo and docstring
calad0i Nov 13, 2024
ef2e8f4
allow io_stream if used as model output
calad0i Nov 13, 2024
51cb83c
Merge pull request #1112 from JanFSchulte/pytorch_auto
jmitrevs Nov 15, 2024
68f80ea
remove incorrect setting of result_t
jmitrevs Nov 15, 2024
18fc2b8
remove additional incorrect result_t settings
jmitrevs Nov 15, 2024
e778ed3
Merge pull request #1130 from fastmachinelearning/result_t_bug_fix
bo3z Nov 17, 2024
9e3fc8d
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
bo3z Nov 17, 2024
09013a1
Merge remote-tracking branch 'origin' into oneapi_separableconv
laurilaatu Nov 18, 2024
21f21fc
Pre-commit format
laurilaatu Nov 18, 2024
9536248
Fix spelling
laurilaatu Nov 18, 2024
4eb0746
Fixes the problem if scale is a tensor. scale[0] does not return a sc…
jurevreca12 Nov 19, 2024
c320f50
Merge pull request #1132 from jurevreca12/fix-FuseQuantWithConstant
jmitrevs Nov 19, 2024
8ebdf22
Merge branch 'fastmachinelearning:main' into oneapi_separableconv
laurilaatu Nov 20, 2024
0fb0997
depthconv1d, channel order in loop, product
laurilaatu Nov 20, 2024
4b91d49
Merge branch 'main' into attrs_desc
vloncar Nov 20, 2024
d34876d
Gather result to accum
laurilaatu Nov 20, 2024
e813d41
Tweak writing of all attributes, allow writing only configurable attr…
vloncar Nov 20, 2024
8505e78
Added support for QONNX `Resize` node ingestion and tested with tiny …
nghielme Nov 21, 2024
7d9ec3a
Merge branch 'main' into oneapi_separableconv
laurilaatu Nov 22, 2024
d56dc73
vladimir comments
jmduarte Nov 22, 2024
dd021ec
fix n_in/n_out
jmduarte Nov 22, 2024
93acaa6
pre-commit
jmduarte Nov 22, 2024
0268c2f
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
JanFSchulte Nov 22, 2024
e845e02
Update install_requires for 1.0.0
vloncar Nov 22, 2024
9852bf0
Merge branch 'main' into update_setup_reqs_100
bo3z Nov 23, 2024
22878ce
Merge pull request #1136 from vloncar/update_setup_reqs_100
bo3z Nov 23, 2024
d1c10ca
Merge branch 'main' into oneapi_separableconv
laurilaatu Nov 23, 2024
1867dfc
fix resource strategy
jmduarte Nov 25, 2024
7570c11
Merge remote-tracking branch 'upstream/main' into update-docs
vloncar Nov 26, 2024
09bbefb
Typo fixes
vloncar Nov 26, 2024
42cb368
Add video tutorial link
bo3z Dec 3, 2024
4a1c25a
Merge branch 'main' into split_pointwise_conv_by_rf_codegen
JanFSchulte Dec 4, 2024
76d06e7
add warning when moving scale fales
jmitrevs Dec 4, 2024
915d2e1
better handle cases when there is no previous node
jmitrevs Dec 4, 2024
2fc8941
Merge pull request #881 from jmduarte/split_pointwise_conv_by_rf_codegen
JanFSchulte Dec 4, 2024
26f4eb2
Merge branch 'main' into update-readme
jmitrevs Dec 4, 2024
88c1fe7
Minor doc improvements to attributes (#57)
bo3z Dec 4, 2024
fedf790
respond to some review comments and update some descriptions
jmitrevs Dec 4, 2024
cf91c3b
Merge branch 'main' into attrs_desc
bo3z Dec 5, 2024
ce7f1f1
Merge pull request #1127 from vloncar/attrs_desc
bo3z Dec 5, 2024
f28f364
fix documentation of channels_last conversion for pytorch
JanFSchulte Dec 5, 2024
e55b29c
slightly expand discussion of channels_last in pytorch
JanFSchulte Dec 5, 2024
c8e1857
Merge pull request #1142 from fastmachinelearning/qonnx_warnings
JanFSchulte Dec 5, 2024
6de4043
Merge branch 'main' into oneapi_separableconv
laurilaatu Dec 5, 2024
99e3be0
update requirements
jmduarte Dec 5, 2024
96b530f
add pointwise documentation
jmduarte Dec 5, 2024
a7b6f79
update pointwise description
jmduarte Dec 5, 2024
135eaa2
Merge remote-tracking branch 'upstream/main' into update-readme
vloncar Dec 6, 2024
6af7fef
Add FAQ to docs and readme
vloncar Dec 6, 2024
eac61dd
Nicer link to the tutorial
vloncar Dec 6, 2024
c65e915
add doc strings to pytorch-specific padding calculation functions
JanFSchulte Dec 6, 2024
7cf4134
Merge branch 'update-readme' of https://github.com/fastmachinelearnin…
JanFSchulte Dec 6, 2024
4fc1ea9
clarify default for channels last conversion in pytorch
JanFSchulte Dec 6, 2024
91ee88e
fixes to parsing of pytorch models when using torch functionals
JanFSchulte Dec 6, 2024
2afae66
fix quotation marks
JanFSchulte Dec 6, 2024
a0a573e
fix quotation marks
JanFSchulte Dec 6, 2024
f377fe0
Merge pull request #1143 from JanFSchulte/parsing_fixes
jmitrevs Dec 6, 2024
548c462
Restructure documentation
vloncar Dec 6, 2024
4da52a4
bump version to 1.0.0
jmduarte Dec 6, 2024
6959c71
remove obsolete file references
jmitrevs Dec 6, 2024
47d7435
add a touch of text on the backends
jmitrevs Dec 6, 2024
05f8a45
expand pytorch frontend documentation
JanFSchulte Dec 8, 2024
6f971eb
Merge branch 'main' into update-readme
JanFSchulte Dec 9, 2024
536c069
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] Dec 9, 2024
d9d09e0
typos in pytorch frontend documentation
JanFSchulte Dec 9, 2024
16c4055
Merge branch 'update-readme' of https://github.com/fastmachinelearnin…
JanFSchulte Dec 9, 2024
e69a392
improve description of brevtias -> QONNX -> hlsm4l workflow
JanFSchulte Dec 9, 2024
326b188
Merge branch 'main' into oneapi_separableconv
laurilaatu Dec 9, 2024
896951a
Add docs on BramFactor
vloncar Dec 9, 2024
cfcd46c
Merge pull request #1100 from fastmachinelearning/update-readme
JanFSchulte Dec 9, 2024
5dd7715
Temporary workaround for QKeras installation
vloncar Dec 9, 2024
cc4fbf9
Merge pull request #1145 from vloncar/qkeras_install_hook
JanFSchulte Dec 9, 2024
6617310
don't overwrite already set accum_t, fix pointwise output res
jmitrevs Dec 11, 2024
f211a0e
split hgq tests and isolate qkeras tests to make tests run in under 1h
JanFSchulte Dec 13, 2024
82ab6bf
pre-commit
JanFSchulte Dec 13, 2024
96da3fe
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] Dec 13, 2024
8a018f1
remove unnecessary import
JanFSchulte Dec 13, 2024
46bdacc
update example-model
jmitrevs Dec 13, 2024
1d0cf1e
change order of optimizers
jmitrevs Dec 13, 2024
eabb785
fix example-models setting for long running pytetss
JanFSchulte Dec 13, 2024
fb12040
add pytorch to long tests
JanFSchulte Dec 14, 2024
0dd372a
Merge pull request #1146 from fastmachinelearning/fix_pointwise_res_type
JanFSchulte Dec 14, 2024
3c63e27
Merge pull request #1153 from JanFSchulte/split_pytests
jmitrevs Dec 14, 2024
c58db99
Merge branch 'main' into oneapi_separableconv
laurilaatu Dec 16, 2024
10 changes: 5 additions & 5 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -2,15 +2,15 @@ exclude: (^hls4ml\/templates\/(vivado|quartus)\/(ap_types|ac_types)\/|^test/pyte

repos:
- repo: https://github.com/psf/black
- rev: 24.8.0
+ rev: 24.10.0
hooks:
- id: black
language_version: python3
args: ['--line-length=125',
'--skip-string-normalization']

- repo: https://github.com/pre-commit/pre-commit-hooks
- rev: v4.6.0
+ rev: v5.0.0
hooks:
- id: check-added-large-files
- id: check-case-conflict
@@ -30,13 +30,13 @@ repos:
args: ["--profile", "black", --line-length=125]

- repo: https://github.com/asottile/pyupgrade
- rev: v3.17.0
+ rev: v3.19.0
hooks:
- id: pyupgrade
args: ["--py36-plus"]

- repo: https://github.com/asottile/setup-cfg-fmt
- rev: v2.5.0
+ rev: v2.7.0
hooks:
- id: setup-cfg-fmt

@@ -50,7 +50,7 @@ repos:
'--extend-ignore=E203,T201'] # E203 is not PEP8 compliant

- repo: https://github.com/mgedmin/check-manifest
- rev: "0.49"
+ rev: "0.50"
hooks:
- id: check-manifest
stages: [manual]
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -4,7 +4,7 @@ type: software
authors:
- given-names: "FastML Team"
title: "hls4ml"
- version: "v0.8.1"
+ version: "v1.0.0"
doi: 10.5281/zenodo.1201549
repository-code: "https://github.com/fastmachinelearning/hls4ml"
url: "https://fastmachinelearning.org/hls4ml"
2 changes: 1 addition & 1 deletion Jenkinsfile
@@ -16,7 +16,7 @@ pipeline {
sh '''#!/bin/bash --login
conda activate hls4ml-py310
conda install -y jupyterhub pydot graphviz pytest pytest-cov
- pip install pytest-randomly jupyter onnx>=1.4.0 matplotlib pandas seaborn pydigitalwavetools==1.1 pyyaml tensorflow==2.14 qonnx torch git+https://github.com/google/qkeras.git pyparsing
+ pip install pytest-randomly jupyter onnx>=1.4.0 matplotlib pandas seaborn pydigitalwavetools==1.1 pyyaml tensorflow==2.14 qonnx torch git+https://github.com/jmitrevs/qkeras.git@qrecurrent_unstack pyparsing
pip install -U ../ --user
./convert-keras-models.sh -x -f keras-models.txt
pip uninstall hls4ml -y'''
16 changes: 11 additions & 5 deletions README.md
@@ -15,7 +15,9 @@ If you have any questions, comments, or ideas regarding hls4ml or just want to s

# Documentation & Tutorial

- For more information visit the webpage: [https://fastmachinelearning.org/hls4ml/](https://fastmachinelearning.org/hls4ml/)
+ For more information visit the webpage: [https://fastmachinelearning.org/hls4ml/](https://fastmachinelearning.org/hls4ml/).
+
+ For introductory material on FPGAs, HLS, and ML inference using hls4ml, check out the [video](https://www.youtube.com/watch?v=2y3GNY4tf7A&ab_channel=SystemsGroupatETHZ%C3%BCrich).

Detailed tutorials on how to use `hls4ml`'s various functionalities can be found [here](https://github.com/hls-fpga-machine-learning/hls4ml-tutorial).

@@ -49,8 +51,8 @@ hls_model = hls4ml.converters.keras_to_hls(config)
hls4ml.utils.fetch_example_list()
```

- ### Building a project with Xilinx Vivado HLS (after downloading and installing from [here](https://www.xilinx.com/products/design-tools/vivado/integration/esl-design.html))
- Note: Vitis HLS is not yet supported. Vivado HLS versions between 2018.2 and 2020.1 are recommended.
+ ### Building a project
+ We will build the project using Xilinx Vivado HLS, which can be downloaded and installed from [here](https://www.xilinx.com/products/design-tools/vivado/integration/esl-design.html). Alongside Vivado HLS, hls4ml also supports Vitis HLS, Intel HLS, and Catapult HLS, and has experimental support for Intel oneAPI. The target backend can be changed using the `backend` argument when building the model.

```Python
# Use Vivado HLS to synthesize the model
@@ -61,15 +63,19 @@ hls_model.build()
hls4ml.report.read_vivado_report('my-hls-test')
```

# FAQ

A list of frequently asked questions and common HLS synthesis issues can be found [here](https://fastmachinelearning.org/hls4ml/faq.html).

# Citation
If you use this software in a publication, please cite the software
```bibtex
@software{fastml_hls4ml,
author = {{FastML Team}},
title = {fastmachinelearning/hls4ml},
- year = 2023,
+ year = 2024,
publisher = {Zenodo},
- version = {v0.8.1},
+ version = {v1.0.0},
doi = {10.5281/zenodo.1201549},
url = {https://github.com/fastmachinelearning/hls4ml}
}
22 changes: 22 additions & 0 deletions docs/advanced/auto.rst
@@ -0,0 +1,22 @@
=============================
Automatic precision inference
=============================

The automatic precision inference (implemented in :py:class:`~hls4ml.model.optimizer.passes.infer_precision.InferPrecisionTypes`) attempts to infer the appropriate
widths for a given precision. It is initiated by setting a precision in the configuration as ``'auto'``. (Note, only layer-level precisions can be set to ``'auto'``,
not model-level.) Functions like :py:class:`~hls4ml.utils.config.config_from_keras_model`, :py:class:`~hls4ml.utils.config.config_from_onnx_model`,
and :py:class:`~hls4ml.utils.config.config_from_pytorch_model` automatically set most precisions to ``'auto'`` if the ``'name'`` granularity is used.

.. note::
It is recommended to pass the backend to the ``config_from_*`` functions so that they can properly extract all the configurable precisions.

The approach taken by the precision inference is to set the accumulator (the internal variable used to accumulate values in the matrix multiplications) and other precisions
so that they never truncate, using only the bitwidths of the inputs (not the values). This is quite conservative, especially in cases where post-training quantization is used, or
if the bit widths were set fairly loosely. The recommended action in that case is to edit the configuration and explicitly set some widths in it, potentially in an iterative process
after profiling the data. Another option is to pass a maximum precision using the ``max_precision`` parameter of the ``config_from_*`` functions. The automatic precision
inference will then never set a bitwidth larger than the bitwidth of ``max_precision``, or an integer part larger than the integer part of ``max_precision``.
(The bitwidth and integer parts of ``max_precision`` are treated separately.)

When manually setting bitwidths, the accumulator can overflow, and the precision may need to be reduced. For the accumulator, it is usually a bad idea to explicitly
enable rounding or saturation modes, since they dramatically increase the execution time. For other types (e.g. output types or weight types), however, rounding and saturation handling
can be enabled as needed.
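To see why the inferred widths grow quickly, consider the lossless accumulator width for a dense layer: the product and sum widths follow directly from the operand bitwidths. The helper below is an illustrative sketch of that bookkeeping, not the actual ``InferPrecisionTypes`` code:

```python
import math


def conservative_accum_precision(in_bits, in_int, w_bits, w_int, n_terms):
    # A product of fixed<in_bits, in_int> and fixed<w_bits, w_int> values
    # needs in_bits + w_bits total bits and in_int + w_int integer bits.
    # Summing n_terms such products grows the integer part (and hence the
    # total width) by a further ceil(log2(n_terms)) bits.
    growth = math.ceil(math.log2(n_terms)) if n_terms > 1 else 0
    return in_bits + w_bits + growth, in_int + w_int + growth


# A dense layer with 32 inputs and fixed<16,6> inputs and weights already
# requires a fixed<37,17> accumulator to never truncate.
print(conservative_accum_precision(16, 6, 16, 6, 32))
```

In practice such widths are often far larger than needed, which is why profiling and setting explicit widths, or capping the growth with ``max_precision``, is recommended.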
42 changes: 42 additions & 0 deletions docs/advanced/bramfactor.rst
@@ -0,0 +1,42 @@
==================================
Loading weights from external BRAM
==================================

.. note::
This feature is being evaluated for re-implementation. We welcome feedback from users on how to make the implementation more flexible.

``hls4ml`` can optionally store weights in BRAMs external to the design. This is supported in Vivado/Vitis and Catapult backends. It is the responsibility of the user to ensure the weights are properly loaded during the operation of the design.

The feature works as a threshold, exposed through the ``BramFactor`` config parameter. Layers whose number of weights exceeds this threshold will expose their weights through a BRAM interface. Consider the following code:

.. code-block:: Python

model = tf.keras.models.Sequential()
model.add(Dense(10, activation="relu", input_shape=(12,), name="dense_1"))
model.add(Dense(20, activation="relu", name="dense_2"))
model.add(Dense(5, activation="softmax", name="dense_3"))
model.compile(optimizer='adam', loss='mse')

config = hls4ml.utils.config_from_keras_model(model)
config["Model"]["Strategy"] = "Resource"
config["Model"]["BramFactor"] = 100

hls_model = hls4ml.converters.convert_from_keras_model(
model, hls_config=config, output_dir=output_dir, io_type=io_type, backend=backend
)

Having set ``BramFactor=100``, only layers with more than 100 weights will be exposed as external BRAM, in this case layers ``dense_1`` and ``dense_2``. ``BramFactor`` can currently only be set at the model level. The generated code will now have the weights as part of the interface.

.. code-block:: C++

void myproject(
hls::stream<input_t> &dense_1_input,
hls::stream<result_t> &layer7_out,
model_default_t w2[120],
model_default_t w4[200]
) {
#pragma HLS INTERFACE axis port=dense_1_input,layer7_out
#pragma HLS INTERFACE bram port=w2,w4
...

When integrating the design, users can use the exposed interface to implement a weight-reloading scheme.
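The threshold behavior can be sketched in a few lines of Python. The helper below is hypothetical (it is not hls4ml's internal code) and uses the weight counts of the three layers from the example above:

```python
def layers_exposed_as_bram(layer_weight_counts, bram_factor):
    # A layer's weights are moved to external BRAM only when the layer
    # has strictly more weights than the BramFactor threshold.
    return [name for name, count in layer_weight_counts.items()
            if count > bram_factor]


# Multiplicative weight counts for the model above: 12*10, 10*20, 20*5
counts = {'dense_1': 120, 'dense_2': 200, 'dense_3': 100}
print(layers_exposed_as_bram(counts, 100))  # ['dense_1', 'dense_2']
```

Note that ``dense_3``, with exactly 100 weights, stays internal, matching the "more than 100 weights" wording above.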
49 changes: 49 additions & 0 deletions docs/advanced/hgq.rst
@@ -0,0 +1,49 @@
===================================
High Granularity Quantization (HGQ)
===================================

.. image:: https://github.com/calad0i/HGQ/actions/workflows/sphinx-build.yml/badge.svg
:target: https://calad0i.github.io/HGQ/
.. image:: https://badge.fury.io/py/hgq.svg
:target: https://badge.fury.io/py/hgq
.. image:: https://img.shields.io/badge/arXiv-2405.00645-b31b1b.svg
:target: https://arxiv.org/abs/2405.00645

`High Granularity Quantization (HGQ) <https://github.com/calad0i/HGQ/>`_ is a library that implements a gradient-based, quantization-aware training algorithm with automatic bitwidth optimization for neural networks deployed on FPGAs. By leveraging gradients, it allows for bitwidth optimization at arbitrary granularity, up to the per-weight and per-activation level.

.. image:: https://calad0i.github.io/HGQ/_images/overview.svg
:alt: Overview of HGQ
:align: center

Conversion of models made with the HGQ library is fully supported. The HGQ models are first converted to the proxy model format, which hls4ml can then parse bit-accurately. Below is an example of how to create a model with HGQ and convert it to an hls4ml model.

.. code-block:: Python

import keras
from HGQ.layers import HDense, HDenseBatchNorm, HQuantize
from HGQ import ResetMinMax, FreeBOPs

model = keras.models.Sequential([
HQuantize(beta=1.e-5),
HDenseBatchNorm(32, beta=1.e-5, activation='relu'),
HDenseBatchNorm(32, beta=1.e-5, activation='relu'),
HDense(10, beta=1.e-5),
])

opt = keras.optimizers.Adam(learning_rate=0.001)
loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])
callbacks = [ResetMinMax(), FreeBOPs()]

model.fit(..., callbacks=callbacks)

from HGQ import trace_minmax, to_proxy_model
from hls4ml.converters import convert_from_keras_model

trace_minmax(model, x_train, cover_factor=1.0)
proxy = to_proxy_model(model, aggressive=True)

model_hls = convert_from_keras_model(proxy, backend='vivado', output_dir=..., part=...)


An interactive example of HGQ can be found in the `kaggle notebook <https://www.kaggle.com/code/calad0i/small-jet-tagger-with-hgq-1>`_. Full documentation can be found at `calad0i.github.io/HGQ <https://calad0i.github.io/HGQ/>`_.
22 changes: 11 additions & 11 deletions docs/advanced/model_optimization.rst
@@ -13,11 +13,11 @@ The code block below showcases three use cases of the hls4ml Optimization API -
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import CategoricalAccuracy
from tensorflow.keras.losses import CategoricalCrossentropy
- from hls4ml.optimization.keras import optimize_model
- from hls4ml.optimization.keras.utils import get_model_sparsity
- from hls4ml.optimization.attributes import get_attributes_from_keras_model
- from hls4ml.optimization.objectives import ParameterEstimator
- from hls4ml.optimization.scheduler import PolynomialScheduler
+ from hls4ml.optimization.dsp_aware_pruning.keras import optimize_model
+ from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_sparsity
+ from hls4ml.optimization.dsp_aware_pruning.attributes import get_attributes_from_keras_model
+ from hls4ml.optimization.dsp_aware_pruning.objectives import ParameterEstimator
+ from hls4ml.optimization.dsp_aware_pruning.scheduler import PolynomialScheduler
# Define baseline model and load data
# X_train, y_train = ...
# X_val, y_val = ...
@@ -75,7 +75,7 @@ To optimize GPU FLOPs, the code is similar to above:

.. code-block:: Python

from hls4ml.optimization.objectives.gpu_objectives import GPUFLOPEstimator
from hls4ml.optimization.dsp_aware_pruning.objectives.gpu_objectives import GPUFLOPEstimator

# Optimize model
# Note the change from ParameterEstimator to GPUFLOPEstimator
@@ -98,7 +98,7 @@ Finally, optimizing Vivado DSPs is possible, given a hls4ml config:
.. code-block:: Python

from hls4ml.utils.config import config_from_keras_model
- from hls4ml.optimization.objectives.vivado_objectives import VivadoDSPEstimator
+ from hls4ml.optimization.dsp_aware_pruning.objectives.vivado_objectives import VivadoDSPEstimator

# Note the change from optimize_model to optimize_keras_model_for_hls4ml
# The function optimize_keras_model_for_hls4ml acts as a wrapper for the function, parsing hls4ml config to model attributes
@@ -124,11 +124,11 @@ Finally, optimizing Vivado DSPs is possible, given a hls4ml config:
acc_optimized = accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_optimized, axis=1))
print(f'Optimized Keras accuracy: {acc_optimized}')

- There are two more Vivado "optimizers" - VivadoFFEstimator, aimed at reducing register utilisation and VivadoMultiObjectiveEstimator, aimed at optimising BRAM and DSP utilisation.
- Note, to ensure DSPs are optimized, "unrolled" Dense multiplication must be used before synthesing HLS, by modifying the config:
+ There are two more Vivado "optimizers" - VivadoFFEstimator, aimed at reducing register utilization and VivadoMultiObjectiveEstimator, aimed at optimizing BRAM and DSP utilization.
+ Note, to ensure DSPs are optimized, "unrolled" Dense multiplication must be used before synthesizing HLS, by modifying the config:

.. code-block:: Python

hls_config = config_from_keras_model(optimized_model)
- hls_config['Model']['DenseResourceImplementation'] = 'Unrolled'
- # Any addition hls4ml config, such as strategy, reuse factor etc...
+ hls_config['Model']['Strategy'] = 'Unrolled'
+ # Any addition hls4ml config, reuse factor etc...
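The ``PolynomialScheduler`` imported in the example above ramps the target sparsity during pruning. A minimal sketch of such a polynomial schedule (hypothetical function and parameter names, not the actual hls4ml class) is:

```python
def polynomial_sparsity(step, total_steps, initial=0.0, final=0.75, power=3):
    # Sparsity ramps from `initial` to `final`, moving quickly at first and
    # flattening out, following the usual polynomial-decay pruning schedule.
    frac = min(step / total_steps, 1.0)
    return final + (initial - final) * (1.0 - frac) ** power


print(polynomial_sparsity(0, 100))    # 0.0
print(polynomial_sparsity(100, 100))  # 0.75
```

Gradually increasing sparsity this way gives the network time to recover accuracy between pruning steps.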
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/command.rst → docs/api/command.rst
@@ -50,7 +50,7 @@ hls4ml config

hls4ml config [-h] [-m MODEL] [-w WEIGHTS] [-o OUTPUT]

- This creates a conversion configuration file. Visit the Configuration section of the :doc:`Setup <setup>` page for more details on how to write a configuration file.
+ This creates a conversion configuration file. Visit the Configuration section of the :doc:`Setup <../intro/setup>` page for more details on how to write a configuration file.

**Arguments**
