Merged
30 commits
da89fa9
Refactor mchan hal
lukamac Jul 8, 2025
863fe6c
Refactor IntrospectiveCodeTransformation
lukamac Jul 8, 2025
0cfb2b1
Refactor MemoryAllocation
lukamac Aug 8, 2025
404bb39
Add minimalIntegerType helper function
lukamac Aug 8, 2025
fc5ad70
Small refactor DeeployTypes
lukamac Aug 8, 2025
140881c
Change Neureka tile constraints to new TilingCodegen function
lukamac Aug 8, 2025
1e29f77
Small refactors
lukamac Aug 8, 2025
1fb0ffb
Permutation refactor
lukamac Aug 8, 2025
3faa05e
Refactor TransposeTileConstraint
lukamac Aug 8, 2025
bfe77b1
Remove manual name mangling from templates since it's automatically d…
lukamac Aug 8, 2025
10c16ef
Change serialize to produce same shape rank as original
lukamac Aug 8, 2025
c4b7343
Refactor TilingExtension
lukamac Aug 8, 2025
5847a07
Port PULPOpen
lukamac Aug 8, 2025
f6a9863
Port Snitch
lukamac Aug 8, 2025
2f3e2e0
DeeployTest: Extract generic tiling code into tilingUtils.py
lukamac Aug 8, 2025
2b0ce32
DeeployTest: Extract common test generation code
lukamac Aug 8, 2025
ba38eae
DeeployTest: Add Dma tests
lukamac Aug 8, 2025
d06d754
Apply Philip's comments
lukamac Aug 30, 2025
2625ab4
Add unravelReference doc comment and fix the dealiasBuffer's
lukamac Sep 1, 2025
ad2477c
Refactor type inference and minimal(Integer|Float)Type
lukamac Sep 2, 2025
072f62d
Revert extra inputs hack
lukamac Sep 2, 2025
f35377b
Add mchan check for both event- and poll-based event checking flags b…
lukamac Sep 2, 2025
c4cc70a
Fix HyperRectangle arg order
lukamac Sep 9, 2025
3f75b10
Fix mchan check whether size is representable within 17 bits
lukamac Sep 9, 2025
b0e0953
Fix init, deinit, wait on initialFuture in DoubleBuffering, rename ge…
lukamac Sep 9, 2025
ffb7316
Fix GEMM tile constraint serialization to check transA and transB
lukamac Sep 9, 2025
4bce88c
Fix inherit from ABC in AsyncDma and AsyncDmaWaitingStrategy
lukamac Sep 9, 2025
09f742a
Fix use tileSizeInBytes to check whether it fits in memory
lukamac Sep 9, 2025
2ea6a12
Update changelog
lukamac Sep 9, 2025
dc71c80
Add missing transferOpRepr abstract method from the BlockingAsyncDmaA…
lukamac Sep 9, 2025
18 changes: 18 additions & 0 deletions .github/workflows/CI.yml
@@ -986,6 +986,24 @@ jobs:
python testRegexMatching.py
shell: bash

deeploy-test-dmas:
runs-on: ${{ needs.select-docker-image-and-runner.outputs.runner }}
needs: select-docker-image-and-runner
container:
image: ${{ needs.select-docker-image-and-runner.outputs.image }}
steps:
- name: Checkout Repo
uses: actions/checkout@v4
with:
submodules: recursive
- name: Build Deeploy
run: pip install -e .
- name: Run Test
run: |
cd DeeployTest
python testDmas.py
shell: bash

linting:
runs-on: ${{ needs.select-docker-image-and-runner.outputs.runner }}
needs: select-docker-image-and-runner
33 changes: 33 additions & 0 deletions CHANGELOG.md
@@ -2,27 +2,60 @@
This file contains the changelog for the Deeploy project. The changelog is divided into sections based on the version of the project. Each section contains a list of pull requests, features, changes, fixes, and removals that were made in that version.

## Unreleased (Planned Release Target: v0.2.1)

### List of Pull Requests
- Change order of typeMatching entries [#68](https://github.com/pulp-platform/Deeploy/pull/68)
- Node Mangling to avoid duplication [#93](https://github.com/pulp-platform/Deeploy/pull/93)
- Prepare Post v0.2.0 Release [#104](https://github.com/pulp-platform/Deeploy/pull/104)
- Use Docker digests instead of arch-specific tags [#106](https://github.com/pulp-platform/Deeploy/pull/106)
- Refactor tiling code generation [#105](https://github.com/pulp-platform/Deeploy/pull/105)

### Added
- Add manual type inference feature (CLI: `--input-type-map`/`--input-offset-map`) to resolve ambiguities when test inputs are not representative enough
- Added a `testTypeInferenceDifferentTypes` test case to validate type inference for different input types
- Added `_mangleNodeNames` function to avoid duplicate node mappings
- Output Docker image digests per platform (`amd64`, `arm64`) after build, which are used to construct the multi-arch Docker manifest. This prevents registry clutter caused by unnecessary per-architecture Docker tags.
- AsyncDma abstraction of DMA's
- test runner per DMA and a script that tests all the DMA's
- generic Single/DoubleBufferingTilingCodeGeneration classes
- TilingVariableReplacementUpdate class that updates the variable replacement refs
- TilingHoistingMixIn class that encapsulates all the hoisting helper functions of tiling
- sorting of input memory allocations to allow references that live in the same memory level as the memory they are referencing
- a function that tests the tiling solution for correctness which currently only tests buffer allocation for byte alignment
- IntrospectiveCodeTransformation: `_indexPointer()`, `indexVars()`, `dereferenceVars()`. The `*Vars` functions index/dereference a list of variables (useful for tiling)
- NetworkContext: `unravelReference()` that unravels a `_ReferenceBuffer` until the base buffer
- NetworkContext: `is_object()` - helper function that determines whether the string represents a name of a local or global object
- NetworkContext: `is_buffer()` - helper function that determines whether the string represents a name of a buffer
- missing checks for environment variables
- `_permuteHyperRectangle` helper function
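
The `unravelReference()` entry above describes following a `_ReferenceBuffer` until the base buffer is reached. A minimal sketch of that idea, assuming references are stored by name in a dictionary; the `Buffer` and `ReferenceBuffer` classes and the `unravel_reference` function are hypothetical stand-ins for illustration, not Deeploy's actual types or signature:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Buffer:
    """Hypothetical stand-in for a base buffer (not Deeploy's VariableBuffer)."""
    name: str


@dataclass
class ReferenceBuffer(Buffer):
    """Hypothetical stand-in for `_ReferenceBuffer`: refers to another buffer by name."""
    referenceName: str = ""
    offset: Optional[int] = None


def unravel_reference(buffers: Dict[str, Buffer], name: str) -> Buffer:
    """Follow the chain of references until a non-reference buffer is reached."""
    buf = buffers[name]
    seen = {name}
    while isinstance(buf, ReferenceBuffer):
        # Guard against cycles so a malformed chain cannot loop forever.
        if buf.referenceName in seen:
            raise ValueError(f"Reference cycle detected at {buf.referenceName}")
        seen.add(buf.referenceName)
        buf = buffers[buf.referenceName]
    return buf


buffers = {
    "base": Buffer("base"),
    "ref1": ReferenceBuffer("ref1", referenceName = "base"),
    "ref2": ReferenceBuffer("ref2", referenceName = "ref1", offset = 4),
}
base = unravel_reference(buffers, "ref2")
```

Here `ref2` refers to `ref1`, which refers to `base`, so the lookup resolves to the `base` buffer. The cycle guard is an addition of this sketch.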

### Changed
- Replaced platform-specific tags (`*-amd64`, `*-arm64`) with direct digest references in `Noelware/docker-manifest-action`.
- mchan HAL is now reduced to bare-bones
- refactor of the IntrospectiveCodeTransformation to work on the Mako template
- refactor of memory allocation code transformation passes
- _ReferenceBuffer accepts an optional `offset` argument to offset the reference
- NetworkContext: `hoistReference` - accepts the actual buffer as reference instead of name, accepts shape, offset, and override_type arguments, and returns the actual buffer, not its name
- `_mangleNodeRep` -> `_mangleOpRepr` - the canonical name we use is `OperatorRepresentation`. `NodeRep` and `ParseDict` are old iterations of the name.
- rename of permutation functions to follow this convention: `permute` is an action that permutes something, `permutation` is a function that generates a permutation
- `_permuteList` to just `_permute`
- removed manual buffer name mangling since we do it in the ExecutionBlock generate() function, simplifies templates
- we now check that buffer shapes/hyperrectangles/tiling ranks match which required changing a few `serializeTilingSolution` functions to preserve the same shape rank
- big refactor of the code generation part of the TilingExtension and needed changes to PULPOpen and Snitch due to it
- PULPClusterTilingSB and PULPClusterTilingDB now allow for transfers of any rank (dimensionality)
- PULP's final output diff is now calculated as absolute error, instead of just subtraction
- common code generation code between testMVP/generateNetwork/... was extracted into a single `generateTestNetwork` function
- in some functions, instead of passing the name of a buffer, the actual buffer is just passed
- tile function allows overriding the optimizer with external tilingSolution and memoryMap
- refactor of the permutation functions for clarity
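
The naming convention above (`permute` applies a permutation, `permutation` generates one) can be illustrated with a small sketch. The function names `transpose_permutation` and `permute` below are hypothetical, and the indexing convention (`result[i] = values[perm[i]]`, as in NumPy's `transpose`) is an assumption rather than something the changelog states:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def transpose_permutation(rank: int) -> List[int]:
    """A `permutation` function: generates a permutation (here, swap the last two axes)."""
    perm = list(range(rank))
    if rank >= 2:
        perm[-1], perm[-2] = perm[-2], perm[-1]
    return perm


def permute(values: Sequence[T], perm: Sequence[int]) -> List[T]:
    """A `permute` function: applies a permutation to a sequence."""
    assert sorted(perm) == list(range(len(values))), "perm must be a permutation of the indices"
    return [values[i] for i in perm]


shape = [1, 3, 32, 64]
perm = transpose_permutation(len(shape))
permuted_shape = permute(shape, perm)
```

With this split, the generator stays independent of what is being permuted, so the same `perm` could be applied to a shape, a list of strides, or the dimensions of a hyperrectangle.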

### Fixed
- Prevent node duplication for graphs generated via GraphSurgeon
- Resolved issue with missing `id` in the `Build Cache for Docker` step, used in the `Inject build-cache` step.

### Removed
- Delete outdated and unused `.gitlab-ci.yml` file
- dory_dma.c and dory_dma.h

## Release v0.2.0 (2025-07-08) [#103](https://github.com/pulp-platform/Deeploy/pull/103)
This release contains major architectural changes, new platform support, enhanced simulation workflows, floating-point kernel support, training infrastructure for CCT models, memory allocation strategies, and documentation improvements.
@@ -109,7 +109,7 @@ def _generateClosureStruct(self, ctxt: NetworkContext, executionBlock: Execution
closureStruct: Dict[str, Union[Pointer, Immediate, Struct]] = {}
makoDynamicReferences = self.extractDynamicReferences(ctxt, executionBlock, True)

for arg in list(dict.fromkeys(makoDynamicReferences)):
for arg in makoDynamicReferences:
ref = ctxt.lookup(arg)
if isinstance(ref, TransientBuffer):
closureStructArgsType[ctxt._mangle(arg)] = PointerClass(VoidType)
@@ -202,7 +202,7 @@ def _generateClosureStruct(self, ctxt: NetworkContext, executionBlock: Execution
# Add closure struct info to operatorRepresentation
closureStructArgsType = {}
closureStruct = {}
makoDynamicReferences = self.extractDynamicReferences(ctxt, executionBlock, True)
makoDynamicReferences = self.extractDynamicReferences(ctxt, executionBlock, unrollStructs = True)

filteredMakoDynamicReferences = []

@@ -23,16 +23,16 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import copy
import types
from typing import Dict, List

import mako.codegen as codegen
from mako.lexer import Lexer
from mako.parsetree import Expression, TemplateNode
from mako.parsetree import Expression, TemplateNode, Text
from mako.template import Template

from Deeploy.AbstractDataTypes import Pointer, Struct
from Deeploy.DeeployTypes import ExecutionBlock, NetworkContext, NodeTemplate, OperatorRepresentation, VariableBuffer
from Deeploy.DeeployTypes import ExecutionBlock, NetworkContext, OperatorRepresentation, VariableBuffer

_NULL: str = "NULL"

@@ -42,65 +42,76 @@ class IntrospectiveCodeTransformationMixIn():
parseTreeDict: Dict[int, TemplateNode] = {}

@staticmethod
def _generateParseTree(template: NodeTemplate) -> TemplateNode:
return Lexer(template.template._source).parse()
def _generateParseTree(template: Template) -> TemplateNode:
return Lexer(template._source).parse()

@staticmethod
def _reconstructCode(template: NodeTemplate, node: TemplateNode):

def fixupParseTree(parseTree: TemplateNode) -> TemplateNode:
nodes = []
prevLine = 0
prevPos = 0
for node in parseTree.nodes:

newNode = copy.copy(node)
offset = len(node.source)

# Expression contain the actual expression + the symbols "${}", i.e. 3 offset symbols
if isinstance(newNode, Expression):
offset += 3
def _reconstructCode(template: Template, node: TemplateNode) -> Template:
lexer = Lexer(template._source)
source = codegen.compile(
node,
template.uri,
None,
default_filters = template.default_filters,
buffer_filters = template.buffer_filters,
imports = template.imports,
future_imports = template.future_imports,
source_encoding = lexer.encoding,
generate_magic_comment = True,
strict_undefined = template.strict_undefined,
enable_loop = template.enable_loop,
reserved_names = template.reserved_names,
)
module = types.ModuleType(template.module_id)
code = compile(source, template.module_id, "exec")
exec(code, module.__dict__, module.__dict__)

prevPos = prevPos + offset
template._code = code
template.module = module
template.callable_ = template.module.render_body
return template

if prevLine != node.lineno:
prevPos = node.pos
@staticmethod
def _indexPointer(parseTree: TemplateNode, ptrName: str, index: str) -> TemplateNode:
indexes = [i for i, node in enumerate(parseTree.nodes) if isinstance(node, Expression) and node.text == ptrName]

newNode.pos = prevPos
prevLine = node.lineno
for offset, idx in enumerate(indexes):
bracketOpen = Text("[", source = "[", lineno = 0, pos = 0, filename = None)
indexExpr = Expression(index, '', source = index, lineno = 0, pos = 0, filename = None)
bracketClose = Text("]", source = "]", lineno = 0, pos = 0, filename = None)
parseTree.nodes.insert(idx + 3 * offset + 1, bracketOpen)
parseTree.nodes.insert(idx + 3 * offset + 2, indexExpr)
parseTree.nodes.insert(idx + 3 * offset + 3, bracketClose)

nodes.append(newNode)
return parseTree

parseTree.nodes = nodes
@staticmethod
def indexVars(template: Template, varNames: List[str], index: str) -> None:
if len(varNames) == 0:
return
parseTree = IntrospectiveCodeTransformationMixIn._generateParseTree(template)
for name in varNames:
parseTree = IntrospectiveCodeTransformationMixIn._indexPointer(parseTree, name, index)
IntrospectiveCodeTransformationMixIn._reconstructCode(template, parseTree)

return parseTree
@staticmethod
def _dereferencePointer(parseTree: TemplateNode, ptrName: str) -> TemplateNode:
indexes = [i for i, node in enumerate(parseTree.nodes) if isinstance(node, Expression) and node.text == ptrName]

node = fixupParseTree(node)
for offset, idx in enumerate(indexes):
text = Text("*", source = "*", lineno = 0, pos = 0, filename = None)
parseTree.nodes.insert(idx + offset, text)

temp = template.template
lexer = Lexer(temp._source)
source = codegen.compile(
node,
temp.uri,
None,
default_filters = temp.default_filters,
buffer_filters = temp.buffer_filters,
imports = temp.imports,
future_imports = temp.future_imports,
source_encoding = lexer.encoding,
generate_magic_comment = True,
strict_undefined = temp.strict_undefined,
enable_loop = temp.enable_loop,
reserved_names = temp.reserved_names,
)
module = types.ModuleType(temp.module_id)
code = compile(source, temp.module_id, "exec")
exec(code, module.__dict__, module.__dict__)
return parseTree

temp._code = code
temp.module = module
temp.callable_ = temp.module.render_body
template.template = temp
@staticmethod
def dereferenceVars(template: Template, varNames: List[str]) -> None:
if len(varNames) == 0:
return
parseTree = IntrospectiveCodeTransformationMixIn._generateParseTree(template)
for name in varNames:
parseTree = IntrospectiveCodeTransformationMixIn._dereferencePointer(parseTree, name)
IntrospectiveCodeTransformationMixIn._reconstructCode(template, parseTree)
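
The `idx + 3 * offset + k` arithmetic in `_indexPointer` compensates for the three nodes inserted by each earlier match, since the match indexes were collected before any insertion. A minimal sketch of that bookkeeping, using plain `(kind, text)` tuples as a hypothetical stand-in for Mako parse-tree nodes so it runs without Mako; `index_pointer` and `render` are illustrative names, not part of Deeploy:

```python
from typing import List, Tuple

Node = Tuple[str, str]  # (kind, text); hypothetical stand-in for Mako Expression/Text nodes


def index_pointer(nodes: List[Node], ptr_name: str, index: str) -> List[Node]:
    """Insert "[", ${index}, "]" after every expression node whose text equals ptr_name."""
    hits = [i for i, (kind, text) in enumerate(nodes) if kind == "expr" and text == ptr_name]
    for offset, idx in enumerate(hits):
        # Each earlier match already added three nodes, so shift by 3 * offset.
        base = idx + 3 * offset
        nodes.insert(base + 1, ("text", "["))
        nodes.insert(base + 2, ("expr", index))
        nodes.insert(base + 3, ("text", "]"))
    return nodes


def render(nodes: List[Node]) -> str:
    """Render the node list back to template-like source for inspection."""
    return "".join("${" + text + "}" if kind == "expr" else text for kind, text in nodes)


nodes = [("expr", "out"), ("text", " = "), ("expr", "a"), ("text", " + "), ("expr", "out"), ("text", ";")]
result = render(index_pointer(nodes, "out", "i"))
```

In this example both occurrences of `${out}` become `${out}[${i}]`, while `${a}` is left untouched. `_dereferencePointer` follows the same pattern with a shift of one node per earlier match, because it inserts a single `*` before each occurrence.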

def extractDynamicReferences(self,
ctxt: NetworkContext,
@@ -112,7 +123,7 @@ def extractDynamicReferences(self,
for codeSnippet in executionBlock.codeSnippets:
template, operatorRepresentation = codeSnippet.template, codeSnippet.operatorRepresentation

newRefs = self._extractDynamicExpressions(ctxt, operatorRepresentation, template, unrollStructs,
newRefs = self._extractDynamicExpressions(ctxt, operatorRepresentation, template.template, unrollStructs,
includeGobalReferences)

makoDynamicReferences += newRefs
@@ -132,11 +143,10 @@ def _fixCtxtOrdering(ctxt: NetworkContext, nameList: List[str]) -> List[str]:
def _extractDynamicExpressions(self,
ctxt: NetworkContext,
operatorRepresentation: OperatorRepresentation,
template: NodeTemplate,
template: Template,
unrollStructs = False,
includeGobalReferences = False):

codeHash = hash(template.template._source)
codeHash = hash(template._source)

if codeHash in self.parseTreeDict.keys():
makoParseTree = self.parseTreeDict[codeHash]
@@ -146,60 +156,43 @@ def _extractDynamicExpressions(self,
self.parseTreeDict[codeHash] = makoParseTree

# Filter parsing tree for expressions
makoExpressions = [node.text for node in makoParseTree.nodes if type(node) == Expression]
makoExpressions = [node.text for node in makoParseTree.nodes if isinstance(node, Expression)]

# Filter expressions for local variables contained in operatorRepresentation
makoLocalReferences = [
node for node in makoExpressions
if ((node in operatorRepresentation) and type(operatorRepresentation[node]) == str and (
operatorRepresentation[node] in ctxt.localObjects.keys()))
# Filter represented expressions
representedExpressions = [
operatorRepresentation[expr] for expr in makoExpressions if expr in operatorRepresentation
]

# Filter expressions for global variables contained in operatorRepresentation
makoGlobalReferences = [
node for node in makoExpressions
if ((node in operatorRepresentation) and type(operatorRepresentation[node]) == str and (
operatorRepresentation[node] in ctxt.globalObjects.keys()))
]
# Filter buffers from expressions
references = [expr for expr in representedExpressions if ctxt.is_buffer(expr)]

if unrollStructs:

def _unrollStructReferences(val: Struct) -> List[str]:
assert isinstance(val, Struct)
# Recursively unroll struct references
structReferences = []
for field in val.value.values():
if isinstance(field, Struct):
structReferences += _unrollStructReferences(field)
elif isinstance(field, Pointer) and field.referenceName != _NULL:
structReferences.append(field.referenceName)
return structReferences

# Unroll local struct references
for ref in references:
if hasattr(ctxt.lookup(ref), "structDict"):
references += _unrollStructReferences(ctxt.lookup(ref).structDict)

def _unrollStructReferences(val) -> List[str]:
# Unroll struct references
structReferences = []
if isinstance(val, Struct):
for key, _type in val.value.items():
if isinstance(_type, Struct):
structReferences += _unrollStructReferences(val.value[key])
elif isinstance(_type, Pointer) and val.value[key].referenceName != _NULL:
structReferences.append(val.value[key].referenceName)
return structReferences

# Unroll local struct references
localReferences = []
localStructReferences = []
for ref in makoLocalReferences:
localReferences.append(operatorRepresentation[ref])
if unrollStructs:
if ctxt.is_local(operatorRepresentation[ref]) and hasattr(ctxt.lookup(operatorRepresentation[ref]),
"structDict"):
localStructReferences += _unrollStructReferences(
ctxt.lookup(operatorRepresentation[ref]).structDict)

# Unroll global struct references
globalReferences = []
globalStructReferences = []
for ref in makoGlobalReferences:
globalReferences.append(operatorRepresentation[ref])
if unrollStructs:
if ctxt.is_global(operatorRepresentation[ref]) and hasattr(ctxt.lookup(operatorRepresentation[ref]),
"structDict"):
globalStructReferences += _unrollStructReferences(
ctxt.lookup(operatorRepresentation[ref]).structDict)
# Filter expressions for local variables contained in operatorRepresentation
localReferences = [ref for ref in references if ctxt.is_local(ref)]

# Filter expressions for global variables contained in operatorRepresentation
globalReferences = [ref for ref in references if ctxt.is_global(ref)]

# Filter for dynamically allocated tensors
dynamicLocalReferences = [ref for ref in localReferences + localStructReferences if ctxt.lookup(ref)._deploy]
dynamicGlobalReferences = [
ref for ref in globalReferences + globalStructReferences if isinstance(ctxt.lookup(ref), VariableBuffer)
]
dynamicLocalReferences = [ref for ref in localReferences if ctxt.lookup(ref)._deploy]
dynamicGlobalReferences = [ref for ref in globalReferences if isinstance(ctxt.lookup(ref), VariableBuffer)]

if includeGobalReferences:
return dynamicLocalReferences + dynamicGlobalReferences
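
The struct unrolling in `_extractDynamicExpressions` walks nested structs and collects the names behind non-NULL pointers. A self-contained sketch of that recursion; the `Pointer` and `Struct` dataclasses here are hypothetical simplifications for illustration, not the classes from `Deeploy.AbstractDataTypes`:

```python
from dataclasses import dataclass
from typing import Dict, List, Union

NULL = "NULL"


@dataclass
class Pointer:
    """Hypothetical simplified pointer: only carries the name it refers to."""
    referenceName: str


@dataclass
class Struct:
    """Hypothetical simplified struct: maps field names to nested values."""
    value: Dict[str, Union["Struct", Pointer, int]]


def unroll_struct_references(val: Struct) -> List[str]:
    """Recursively collect the reference names of all non-NULL pointer fields."""
    refs: List[str] = []
    for field in val.value.values():
        if isinstance(field, Struct):
            refs += unroll_struct_references(field)
        elif isinstance(field, Pointer) and field.referenceName != NULL:
            refs.append(field.referenceName)
    return refs


closure = Struct({
    "input": Pointer("data_in"),
    "unused": Pointer(NULL),
    "count": 8,
    "inner": Struct({"weights": Pointer("weight_buf")}),
})
names = unroll_struct_references(closure)
```

Fields are visited in dictionary insertion order, so the collected names follow the order in which the struct fields were defined; NULL pointers and non-pointer fields are skipped.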