
Commit 5db6de3

Remove memory-aware node bindings (#123)
Remove memory-aware node bindings because it makes the great parser refactor easier. The memory-aware node bindings exist only so that Neureka can separate bindings that use the dedicated weight memory from those that don't. But those bindings can simply be rewritten to check whether the weights reside in weight memory and change behavior accordingly. By removing the memory-aware node bindings, we remove another dependency on having hoisted buffers in the middle of parsing. The RequantHelpers are a bonus that fixes the requantization mul and add hyperrectangles to keep the rank of the original tensors.

## Added

- RequantHelpers.py for Neureka's TileConstraints

## Changed

- Removed NodeMemoryLevelChecker and MemoryAwareNodeBinding
- Removed _parseNode from MemoryNetworkDeployer since we don't need the annotations before typeChecking anymore
- Removed Wmem variants of bindings and tile constraints from Neureka

## Fixed

- Keep mul/add rank of requantized Neureka tile constraints
1 parent f905830 commit 5db6de3
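The commit message argues that a binding can simply check whether its weights reside in weight memory and change behavior accordingly, instead of wrapping the binding in a memory-aware variant. A minimal sketch of that dispatch pattern (the `Buffer` class, `select_kernel`, and kernel names here are illustrative, not Deeploy's actual API; only the `_memoryLevel` attribute and the `"WeightMemory_SRAM"` level name appear in the diff below):

```python
# Hypothetical sketch of a binding-side memory-level check replacing
# MemoryAwareNodeBinding. Buffer and select_kernel are illustrative names.
from dataclasses import dataclass
from typing import Optional

WEIGHT_MEMORY = "WeightMemory_SRAM"


@dataclass
class Buffer:
    name: str
    _memoryLevel: Optional[str] = None


def uses_weight_memory(weight: Buffer) -> bool:
    # The annotation may be absent before the memory-level pass has run,
    # so read it defensively.
    return getattr(weight, "_memoryLevel", None) == WEIGHT_MEMORY


def select_kernel(weight: Buffer) -> str:
    # One binding dispatches on the annotation instead of requiring a
    # dedicated Wmem binding variant.
    return "conv2d_wmem" if uses_weight_memory(weight) else "conv2d_l1"
```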

File tree

11 files changed: +214 additions, −1140 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions

```diff
@@ -4,6 +4,7 @@ This file contains the changelog for the Deeploy project. The changelog is divid
 ## Unreleased (Planned Release Target: v0.2.1)
 
 ### List of Pull Requests
+- Remove memory-aware node bindings [#123](https://github.com/pulp-platform/Deeploy/pull/123)
 - Fix missing const's layout transformation and refactor NCHWtoNHWC passes [#122](https://github.com/pulp-platform/Deeploy/pull/122)
 - Fix aliasing [#125](https://github.com/pulp-platform/Deeploy/pull/125)
 - Support for 1D Autoencoder [#98](https://github.com/pulp-platform/Deeploy/pull/98)
@@ -49,6 +50,7 @@ This file contains the changelog for the Deeploy project. The changelog is divid
 - Buffer utilities: `checkNumLevels` validation and `sizeInBytes` method
 - Per–memory-level usage tracking and worst-case reporting in `NetworkContext`
 - Memory/I/O summaries and input/output logging in deployers
+- RequantHelpers.py for Neureka's TileConstraints
 
 ### Changed
 - Replaced platform-specific tags (`*-amd64`, `*-arm64`) with direct digest references in `Noelware/docker-manifest-action`.
@@ -80,6 +82,9 @@ This file contains the changelog for the Deeploy project. The changelog is divid
 - Refactored `hoistConstant`
 - Refactored TransientBuffer's `__init__`
 - Refactor of the NCHWtoNHWC passes
+- Removed NodeMemoryLevelChecker, MemoryAwareNodeBinding
+- Removed _parseNode from MemoryNetworkDeployer since we don't need the annotations before typeChecking anymore
+- Removed Wmem variants of bindings and tile constraints from Neureka
 
 ### Fixed
 - Prevent node duplication for graphs generated via GraphSurgeon
@@ -92,6 +97,7 @@ This file contains the changelog for the Deeploy project. The changelog is divid
 - Fixed `Unsqueeze` Op. when using ONNX opset 13 or higher (from attribute to input)
 - Fixed aliasing
 - Missing layout transformation of the const's (bias, mul, add, shift in Conv/RequantizedConv)
+- Keep mul/add rank of requantized Neureka tile constraints
 
 ### Removed
 - Delete outdated and unused `.gitlab-ci.yml` file
```

Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py

Lines changed: 1 addition & 6 deletions

```diff
@@ -2,7 +2,7 @@
 #
 # SPDX-License-Identifier: Apache-2.0
 
-from typing import Any, Tuple, Union
+from typing import Any, Union
 
 import onnx_graphsurgeon as gs
 
@@ -63,11 +63,6 @@ def lower(self, graph: gs.Graph) -> gs.Graph:
     def codeTransform(self, verbose: CodeGenVerbosity = _NoVerbosity):
         return self._innerObject.codeTransform(verbose)
 
-    # MemoryAwareDeployer augment
-    def _parseNode(self, node: ONNXLayer, ctxt: NetworkContext,
-                   default_channels_first: bool) -> Tuple[NetworkContext, bool]:
-        return self._innerObject._parseNode(node, ctxt, default_channels_first)
-
     # PULPDeployer augment
     def generateBufferAllocationCode(self) -> str:
         return self._innerObject.generateBufferAllocationCode()
```

Deeploy/MemoryLevelExtension/MemoryLevels.py

Lines changed: 1 addition & 61 deletions

```diff
@@ -2,12 +2,7 @@
 #
 # SPDX-License-Identifier: Apache-2.0
 
-from typing import Dict, List, Optional, Sequence, Tuple
-
-import onnx_graphsurgeon as gs
-
-from Deeploy.DeeployTypes import CodeTransformation, NetworkContext, NodeBinding, NodeTemplate, NodeTypeChecker, \
-    OperatorRepresentation
+from typing import Dict, List, Optional
 
 
 class MemoryLevel():
@@ -109,58 +104,3 @@ def getDefaultMemoryLevel(self):
         if self._defaultMemoryLevel is None:
             raise ValueError('defaultMemoryLevel level not set!')
         return self._defaultMemoryLevel
-
-
-class NodeMemoryLevelChecker():
-
-    def __init__(self, inputMemoryLevels: Sequence[Optional[str]], outputMemoryLevels: Sequence[Optional[str]]):
-        self.inputMemoryLevels = inputMemoryLevels
-        self.outputMemoryLevels = outputMemoryLevels
-
-    def _memEq(self, memoryLevel: str, annotatedMemoryLevel: str) -> bool:
-        if memoryLevel is None:
-            return True
-        else:
-            return memoryLevel == annotatedMemoryLevel
-
-    def _checkMemoryLevels(self, ctxt: NetworkContext, memoryLevels: Sequence[str],
-                           tensors: Sequence[gs.Tensor]) -> bool:
-        buffers = [ctxt.lookup(tensor.name) for tensor in tensors]
-        if not all(hasattr(buffer, "_memoryLevel") for buffer in buffers):
-            return False
-
-        annotatedMemoryLevels = [buffer._memoryLevel for buffer in buffers]
-        if all(
-                self._memEq(memoryLevel, annotatedMemoryLevel)
-                for memoryLevel, annotatedMemoryLevel in zip(memoryLevels, annotatedMemoryLevels)):
-            return True
-        else:
-            return False
-
-    def check(self, ctxt: NetworkContext, node: gs.Node, operatorRepresentation) -> Tuple[NetworkContext, bool]:
-        if self._checkMemoryLevels(ctxt, self.inputMemoryLevels, node.inputs) and self._checkMemoryLevels(
-                ctxt, self.outputMemoryLevels, node.outputs):
-            return ctxt, True
-        else:
-            return ctxt, False
-
-
-class MemoryAwareNodeBinding(NodeBinding):
-
-    def __init__(self, typeChecker: NodeTypeChecker, memoryLevelChecker: NodeMemoryLevelChecker, template: NodeTemplate,
-                 codeTransformer: CodeTransformation):
-        super().__init__(typeChecker, template, codeTransformer)
-        self.memoryLevelChecker = memoryLevelChecker
-
-    def typeCheck(self, ctxt: NetworkContext, node: gs.Node,
-                  operatorRepresentation: OperatorRepresentation) -> Tuple[NetworkContext, bool]:
-        newCtxt, ret = self.memoryLevelChecker.check(ctxt, node, operatorRepresentation)
-        if ret:
-            return super().typeCheck(newCtxt, node, operatorRepresentation)
-
-        return ctxt, False
-
-
-def memoryAwareNodeBindingExtension(binding: NodeBinding,
-                                    memoryLevelChecker: NodeMemoryLevelChecker) -> MemoryAwareNodeBinding:
-    return MemoryAwareNodeBinding(binding.typeChecker, memoryLevelChecker, binding.template, binding.codeTransformer)
```
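The removed `NodeMemoryLevelChecker` matched each tensor's annotated memory level against an expected list, with `None` acting as a wildcard. That matching rule, extracted as a self-contained sketch (function name `levels_match` is mine, not Deeploy's):

```python
# Standalone sketch of the wildcard matching rule from the removed
# NodeMemoryLevelChecker._memEq / _checkMemoryLevels pair.
from typing import Optional, Sequence


def levels_match(expected: Sequence[Optional[str]], annotated: Sequence[str]) -> bool:
    # A None entry in the expected list matches any annotated level;
    # anything else must compare equal position by position.
    return all(e is None or e == a for e, a in zip(expected, annotated))
```

This is why the Wmem bindings could pass `[None, "WeightMemory_SRAM", None, None]`: only the weight input's level was constrained.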

Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py

Lines changed: 13 additions & 48 deletions

```diff
@@ -11,7 +11,7 @@
 from Deeploy.CommonExtensions.NetworkDeployers.NetworkDeployerWrapper import NetworkDeployerWrapper
 from Deeploy.CommonExtensions.NetworkDeployers.SignPropDeployer import SignPropDeployer
 from Deeploy.DeeployTypes import CodeGenVerbosity, ConstantBuffer, DeploymentEngine, DeploymentPlatform, \
-    NetworkContext, NetworkDeployer, NetworkOptimizationPass, NetworkOptimizer, ONNXLayer, Schedule, StructBuffer, \
+    NetworkContext, NetworkDeployer, NetworkOptimizationPass, NetworkOptimizer, Schedule, StructBuffer, \
     TopologyOptimizer, TransientBuffer, VariableBuffer, _NoVerbosity
 from Deeploy.Logging import DEFAULT_LOGGER as log
 from Deeploy.MemoryLevelExtension.MemoryLevels import MemoryHierarchy, MemoryLevel
@@ -128,25 +128,16 @@ def getTargetMemoryLevelMapping(self) -> TargetMemoryLevelMapping:
             f"Platform should be a MemoryPlatform or MemoryPlatformWrapper! Got {type(self.Platform).__name__}"
         return TargetMemoryLevelMapping(self.graph, self.Platform, self.ctxt)
 
-    def _parseNode(self, node: ONNXLayer, ctxt: NetworkContext,
-                   default_channels_first: bool) -> Tuple[NetworkContext, bool]:
-
-        newCtxt, parsePass = super()._parseNode(node, ctxt, default_channels_first)
-
-        if not parsePass:
-            return ctxt, False
-
-        newCtxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(newCtxt, self.graph)
-
-        return newCtxt, parsePass
-
     def bind(self):
+        log.info("- Perform Memory Level Annotation")
+        # LMACAN: Annotate before bind because during binding (specifically alignToContext) templates
+        # may expect the memoryLevel annotation already.
+        self.ctxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(self.ctxt, self.graph)
 
         ret = super().bind()
         if not ret:
             return False
 
-        log.info("- Perform Memory Level Annotation")
         # SCHEREMO: There might be hoisting; reassign memoryLevel preferences
         self.ctxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(self.ctxt, self.graph)
 
@@ -181,29 +172,16 @@ def getTargetMemoryLevelMapping(self) -> TargetMemoryLevelMapping:
             f"Platform should be a MemoryPlatform or MemoryPlatformWrapper! Got {type(self.Platform).__name__}"
         return TargetMemoryLevelMapping(self.graph, self.Platform, self.ctxt)
 
-    def _parseNode(self, node: ONNXLayer, ctxt: NetworkContext,
-                   default_channels_first: bool) -> Tuple[NetworkContext, bool]:
-
-        newCtxt, parsePass = node.parse(ctxt.copy(), default_channels_first)
-
-        if not parsePass:
-            return ctxt, False
-
-        newCtxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(newCtxt, self.graph)
-        newCtxt, LayerBindSuccess = node.typeCheck(newCtxt)
-
-        if not LayerBindSuccess:
-            return ctxt, False
-
-        return newCtxt, True
-
     def bind(self):
+        log.info("- Perform Memory Level Annotation")
+        # LMACAN: Annotate before bind because during binding (specifically alignToContext) templates
+        # may expect the memoryLevel annotation already.
+        self.ctxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(self.ctxt, self.graph)
 
         ret = super().bind()
         if not ret:
             return False
 
-        log.info("- Perform Memory Level Annotation")
         # SCHEREMO: There might be hoisting; reassign memoryLevel preferences
         self.ctxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(self.ctxt, self.graph)
 
@@ -229,29 +207,16 @@ def getTargetMemoryLevelMapping(self) -> TargetMemoryLevelMapping:
             f"Platform should be a MemoryPlatform or MemoryPlatformWrapper! Got {type(self.Platform).__name__}"
         return TargetMemoryLevelMapping(self.graph, self.Platform, self.ctxt)
 
-    def _parseNode(self, node: ONNXLayer, ctxt: NetworkContext,
-                   default_channels_first: bool) -> Tuple[NetworkContext, bool]:
-
-        newCtxt, parsePass = node.parse(ctxt.copy(), default_channels_first)
-
-        if not parsePass:
-            return ctxt, False
-
-        newCtxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(newCtxt, self.graph)
-        newCtxt, LayerBindSuccess = node.typeCheck(newCtxt)
-
-        if not LayerBindSuccess:
-            return ctxt, False
-
-        return newCtxt, True
-
     def bind(self):
+        log.info("- Perform Memory Level Annotation")
+        # LMACAN: Annotate before bind because during binding (specifically alignToContext) templates
+        # may expect the memoryLevel annotation already.
+        self.ctxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(self.ctxt, self.graph)
 
         ret = super().bind()
         if not ret:
             return False
 
-        log.info("- Perform Memory Level Annotation")
         # SCHEREMO: There might be hoisting; reassign memoryLevel preferences
         self.ctxt, self.graph = self.memoryLevelAnnotationOptimizer.optimize(self.ctxt, self.graph)
 
```

Deeploy/Targets/Neureka/Bindings.py

Lines changed: 0 additions & 28 deletions

```diff
@@ -5,7 +5,6 @@
 from Deeploy.AbstractDataTypes import PointerClass
 from Deeploy.CommonExtensions.DataTypes import int8_t, int32_t, uint8_t
 from Deeploy.DeeployTypes import NodeBinding
-from Deeploy.MemoryLevelExtension.MemoryLevels import NodeMemoryLevelChecker, memoryAwareNodeBindingExtension
 from Deeploy.Targets.Generic.TypeCheckers import ConvChecker
 from Deeploy.Targets.Neureka.Templates.ConvTemplate import NeurekaDenseConv2D_Template, NeurekaDWConv2D_Template, \
     NeurekaPWConv2D_Template, NeurekaRqntDenseConv2D_Template, NeurekaRqntDWConv2D_Template, \
@@ -33,15 +32,6 @@
     for weight_type in [uint8_t, int8_t]
 ]
 
-NeurekaWmemRQSPWConv2DBindings = [
-    memoryAwareNodeBindingExtension(binding, NodeMemoryLevelChecker([None, "WeightMemory_SRAM", None, None], [None]))
-    for binding in NeurekaRQSPWConv2DBindings
-]
-NeurekaWmemPWConv2DBindings = [
-    memoryAwareNodeBindingExtension(binding, NodeMemoryLevelChecker([None, "WeightMemory_SRAM"], [None]))
-    for binding in NeurekaPWConv2DBindings
-]
-
 NeurekaRQSDWConv2DBindings = [
     NodeBinding(
         PULPConvChecker(
@@ -62,15 +52,6 @@
     for weight_type in [uint8_t, int8_t]
 ]
 
-NeurekaWmemRQSDWConv2DBindings = [
-    memoryAwareNodeBindingExtension(binding, NodeMemoryLevelChecker([None, "WeightMemory_SRAM", None, None], [None]))
-    for binding in NeurekaRQSDWConv2DBindings
-]
-NeurekaWmemDWConv2DBindings = [
-    memoryAwareNodeBindingExtension(binding, NodeMemoryLevelChecker([None, "WeightMemory_SRAM"], [None]))
-    for binding in NeurekaDWConv2DBindings
-]
-
 NeurekaRQSDenseConv2DBindings = [
     NodeBinding(
         PULPConvChecker(
@@ -91,12 +72,3 @@
     for data_in_type in [uint8_t, int8_t]
     for weight_type in [uint8_t, int8_t]
 ]
-
-NeurekaWmemRQSDenseConv2DBindings = [
-    memoryAwareNodeBindingExtension(binding, NodeMemoryLevelChecker([None, "WeightMemory_SRAM", None, None], [None]))
-    for binding in NeurekaRQSDenseConv2DBindings
-]
-NeurekaWmemDenseConv2DBindings = [
-    memoryAwareNodeBindingExtension(binding, NodeMemoryLevelChecker([None, "WeightMemory_SRAM"], [None]))
-    for binding in NeurekaDenseConv2DBindings
-]
```

Deeploy/Targets/Neureka/Engine.py

Lines changed: 9 additions & 19 deletions

```diff
@@ -12,27 +12,17 @@
     NeurekaRQSDenseConv2DParser, NeurekaRQSDWConv2DParser, NeurekaRQSPWConv2DParser
 from Deeploy.Targets.Neureka.Tiler import NeurekaDenseConv2DTilingReadyBindings, NeurekaDWConv2DTilingReadyBindings, \
     NeurekaPWConv2DTilingReadyBindings, NeurekaRQSDenseConv2DTilingReadyBindings, \
-    NeurekaRQSDWConv2DTilingReadyBindings, NeurekaRQSPWConv2DTilingReadyBindings, \
-    NeurekaWmemDenseConv2DTilingReadyBindings, NeurekaWmemDWConv2DTilingReadyBindings, \
-    NeurekaWmemPWConv2DTilingReadyBindings, NeurekaWmemRQSDenseConv2DTilingReadyBindings, \
-    NeurekaWmemRQSDWConv2DTilingReadyBindings, NeurekaWmemRQSPWConv2DTilingReadyBindings
+    NeurekaRQSDWConv2DTilingReadyBindings, NeurekaRQSPWConv2DTilingReadyBindings
 from Deeploy.Targets.PULPOpen.Layers import PULPRQSConvLayer
 
-NeurekaRqntPWConv2DMapper = NodeMapper(
-    NeurekaRQSPWConv2DParser(), NeurekaWmemRQSPWConv2DTilingReadyBindings + NeurekaRQSPWConv2DTilingReadyBindings)
-NeurekaPWConv2DMapper = NodeMapper(NeurekaPWConv2DParser(),
-                                   NeurekaWmemPWConv2DTilingReadyBindings + NeurekaPWConv2DTilingReadyBindings)
-
-NeurekaRqntDWConv2DMapper = NodeMapper(
-    NeurekaRQSDWConv2DParser(), NeurekaWmemRQSDWConv2DTilingReadyBindings + NeurekaRQSDWConv2DTilingReadyBindings)
-NeurekaDWConv2DMapper = NodeMapper(NeurekaDWConv2DParser(),
-                                   NeurekaWmemDWConv2DTilingReadyBindings + NeurekaDWConv2DTilingReadyBindings)
-
-NeurekaRqntDenseConv2DMapper = NodeMapper(
-    NeurekaRQSDenseConv2DParser(),
-    NeurekaWmemRQSDenseConv2DTilingReadyBindings + NeurekaRQSDenseConv2DTilingReadyBindings)
-NeurekaDenseConv2DMapper = NodeMapper(NeurekaDenseConv2DParser(),
-                                      NeurekaWmemDenseConv2DTilingReadyBindings + NeurekaDenseConv2DTilingReadyBindings)
+NeurekaRqntPWConv2DMapper = NodeMapper(NeurekaRQSPWConv2DParser(), NeurekaRQSPWConv2DTilingReadyBindings)
+NeurekaPWConv2DMapper = NodeMapper(NeurekaPWConv2DParser(), NeurekaPWConv2DTilingReadyBindings)
+
+NeurekaRqntDWConv2DMapper = NodeMapper(NeurekaRQSDWConv2DParser(), NeurekaRQSDWConv2DTilingReadyBindings)
+NeurekaDWConv2DMapper = NodeMapper(NeurekaDWConv2DParser(), NeurekaDWConv2DTilingReadyBindings)
+
+NeurekaRqntDenseConv2DMapper = NodeMapper(NeurekaRQSDenseConv2DParser(), NeurekaRQSDenseConv2DTilingReadyBindings)
+NeurekaDenseConv2DMapper = NodeMapper(NeurekaDenseConv2DParser(), NeurekaDenseConv2DTilingReadyBindings)
 
 NeurekaMapping = {
     'RequantizedConv':
```
