Commit 93cbafe

Integer-Ctrl authored and RivinHD committed
doc: project report iteration
Co-authored-by: Vincent Gerlach <[email protected]>
1 parent 61bfc5d commit 93cbafe

File tree

10 files changed: +795 -193 lines changed


README.md

Lines changed: 4 additions & 4 deletions
@@ -16,10 +16,10 @@ This repository includes:
 
 The weekly tasks from the lab can be found here: [scalable-analyses](https://github.com/scalable-analyses/pbtc/tree/main/lab)
 
-## Technical Documentation
+## CMake Library
 
-A detailed technical documentation of our implementation including the design decisions and solutions to the lab tasks, and explanations of the source code is available on our [project website](https://integer-ctrl.github.io/machine-learning-compilers/).
+To make the compiler easy to integrate into other projects, we structured it as a CMake library. This allows users to include and build upon our functionality directly in their own CMake-based projects. More details about the library and how to use it can be found in the [user-guide](https://github.com/Integer-Ctrl/machine-learning-compilers/blob/main/cmake-library/README.md).
 
-## CMake Library
+## Technical Documentation
 
-To make the compiler easy to integrate into other projects, we structured it as a CMake library. This allows users to include and build upon our functionality directly in their own CMake-based projects. More details about the library and how to use it can be found in the [user-guide.md](https://github.com/Integer-Ctrl/machine-learning-compilers/cmake-library/user-guide.md).
+A detailed technical documentation of our implementation including the design decisions and solutions to the lab tasks, and explanations of the source code is available on our [project website](https://integer-ctrl.github.io/machine-learning-compilers/).
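A consumer project might wire the library in roughly this way; note that the target name `MachineLearningCompiler` and the checkout path below are assumptions inferred from the header prefix, so the linked user-guide remains authoritative:

```cmake
# Hypothetical consumer CMakeLists.txt -- the target and path names are
# assumptions; consult cmake-library/README.md for the exported names.
cmake_minimum_required(VERSION 3.16)
project(my_consumer CXX)

# The compiler checked out (e.g. as a git submodule) next to this project.
add_subdirectory(machine-learning-compilers/cmake-library)

add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE MachineLearningCompiler)
```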
Lines changed: 1 addition & 1 deletion
@@ -154,7 +154,7 @@ mlc::Error error = mlc::contraction(in0, in1, out, "[0,1,2],[3,4,1]->[0,3,4,2]")
 
 In the example above, the contraction operation takes two input tensors `in0` and `in1`, and produces an output tensor `out`. The expression `"[0,1,2],[3,4,1]->[0,3,4,2]"` defines that the dimensions with IDs `0`, `2`, `3` and `4` are retained in the output tensor, while the dimension with ID `1` is contracted. The output tensor will have the dimensions `[5, 5, 2, 3]`.
 
-To further advance the contraction operation, a first touch primitive and a last touch primitive can be specified. The first touch primitive is applied to the output tensor before the contraction operation, while the last touch primitive is applied to the output tensor after the contraction operation. The supported primitives are `mlc::UnaryType::None`, `mlc::UnaryType::Zero`, `mlc::UnaryType::Identity` and `mlc::UnaryType::ReLu`.
+To further advance the contraction operation, a first touch primitive and a last touch primitive can be specified. The first touch primitive is applied to the output tensor before the contraction operation, while the last touch primitive is applied to the output tensor after the contraction operation. The supported primitives are `mlc::UnaryType::None`, `mlc::UnaryType::Zero`, `mlc::UnaryType::Identity` and `mlc::UnaryType::ReLU`.
 
 ```cpp
 #include <MachineLearningCompiler/Tensor.h>

docs_sphinx/Doxyfile.in

Lines changed: 1 addition & 1 deletion
@@ -864,7 +864,7 @@ WARN_LOGFILE =
 # spaces. See also FILE_PATTERNS and EXTENSION_MAPPING
 # Note: If this tag is empty the current directory is searched.
 
-INPUT = "../src/" "../includes/"
+INPUT = "../src/" "../include/"
 
 # This tag can be used to specify the character encoding of the source files
 # that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses

docs_sphinx/chapters/report_individual.rst

Lines changed: 672 additions & 85 deletions
Large diffs are not rendered by default.

docs_sphinx/chapters/tensor_operations.rst

Lines changed: 10 additions & 10 deletions
@@ -9,10 +9,10 @@ Backend
 -------
 
 User Interface
-""""""""""""""
+^^^^^^^^^^^^^^
 
 1. setup
-^^^^^^^^
+""""""""
 
 **Task**: Begin implementing the ``setup`` function of the class ``einsum::backend::TensorOperation`` for binary tensor contractions.
 Parse the configuration parameters passed to the function and generate the corresponding (BR)GEMM kernel at runtime.
@@ -246,10 +246,10 @@ primitives in combination with a naive version. The tests are located in the fol
 TEST_CASE("Test tensor operation with outer loop with first touch: unary (zero, relu, copy) & main kernel: brgemm & last touch: unary (zero, relu, copy)", "[tensor_operation][unary][brgemm][correctness]")
 
 Performance Benchmarking
------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^
 
 1. Performance
-^^^^^^^^^^^^^^
+""""""""""""""
 
 **Task**: Benchmark the performance of your implementation for the above examples. Report the measured performance in GFLOPS.

@@ -292,7 +292,7 @@ Tensor contraction using the Zero, BRGEMM and ReLU primitives:
 BM_tensor_Zero+BRGEMM+RELU/size_a:262144/size_b:262144/size_c:1048576/config:2/min_warmup_time:0.300_cv 0.32 % 0.32 % 10 0.32%
 
 2. Own Setups
-^^^^^^^^^^^^^
+"""""""""""""
 
 **Task**: Design your own setups. Which setups achieve a high performance and which setups are slow?

@@ -354,14 +354,14 @@ Tensor contraction using the Zero, BRGEMM and ReLU primitives:
 
 
 Shared Memory Parallelization
------------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 In the shared memory domain, loops can be parallelized at any point within the nested loop structure. However, to simplify the
 implementation, we only parallelize the outermost loops. In other words, we do not parallelize loops that are nested inside
 sequential loops.
 
 1. execute_iter_parallel
-^^^^^^^^^^^^^^^^^^^^^^^^
+""""""""""""""""""""""""
 
 **Task**: Implement the function ``execute_iter_parallel``, which parallelizes a binary tensor contraction in the shared memory domain.
367367

@@ -727,9 +727,9 @@ And validated with some additional tests: File: ``TensorOperation.test.cpp``.
727727
.. code-block:: cpp
728728
729729
bool mini_jit::TensorOperation::isValidPrimStrides(const std::span<const TensorConfig::dim_t> &dim,
730-
const std::span<const TensorConfig::exec_t> &exec,
731-
const std::span<const int64_t> &strides_in0, const std::span<const int64_t> &strides_out,
732-
const TensorConfig::prim_t main_prim)
730+
const std::span<const TensorConfig::exec_t> &exec,
731+
const std::span<const int64_t> &strides_in0, const std::span<const int64_t> &strides_out,
732+
const TensorConfig::prim_t main_prim)
733733
{
734734
// ...
735735

docs_sphinx/conf.py

Lines changed: 2 additions & 0 deletions
@@ -72,6 +72,8 @@
     "source_branch": "main",
     "source_directory": "docs_sphinx/",
 }
+html_title = "Machine Learning Compilers"
+language = "en"
 
 # html_theme = 'sphinx_rtd_theme'
 # html_theme_options = {

docs_sphinx/index.rst

Lines changed: 3 additions & 3 deletions
@@ -7,15 +7,15 @@ Machine Learning Compilers
 ==========================
 
 .. toctree::
-   :maxdepth: 4
+   :maxdepth: 1
    :caption: GETTING STARTED
    :glob:
 
    getting_started/building_project.rst
    getting_started/building_docs.rst
 
 .. toctree::
-   :maxdepth: 4
+   :maxdepth: 2
    :caption: CHAPTERS
    :glob:
 
@@ -29,7 +29,7 @@ Machine Learning Compilers
   chapters/report_individual.rst
 
 .. toctree::
-   :maxdepth: 4
+   :maxdepth: 2
    :caption: API
    :glob:

src/interface/Einsum.cpp

Lines changed: 2 additions & 2 deletions
@@ -42,7 +42,7 @@ mlc::Error mlc::EinsumOperation::execute(const std::vector<std::reference_wrappe
     return error;
   }
 
-  Error checkError = hasSameDimensions<std::reference_wrapper<const Tensor>>(inputs);
+  Error checkError = hasSameDimensions<std::reference_wrapper<const Tensor>>(inputs, output);
   if (checkError.type != ErrorType::None)
   {
     return checkError;
@@ -58,7 +58,7 @@ mlc::Error mlc::EinsumOperation::execute(const std::vector<const Tensor *> &inpu
     return error;
   }
 
-  Error checkError = hasSameDimensions<const Tensor *>(inputs);
+  Error checkError = hasSameDimensions<const Tensor *>(inputs, output);
   if (checkError.type != ErrorType::None)
  {
     return checkError;

src/interface/Einsum.h

Lines changed: 19 additions & 3 deletions
@@ -92,10 +92,26 @@ namespace mlc
     return {mlc::ErrorType::None, "Success"};
   }
 
-  template <typename T> inline Error EinsumOperation::hasSameDimensions(const std::vector<T> &inputs)
+  template <typename T> inline Error EinsumOperation::hasSameDimensions(const std::vector<T> &inputs, const Tensor &output)
   {
-    std::vector<mini_jit::EinsumTree::EinsumNode *> nodesToProcess = {einsumTree.get_root()};
     auto &sortedDimSizes = einsumTree.get_sorted_dim_sizes();
+    const mini_jit::EinsumTree::EinsumNode *root = einsumTree.get_root();
+
+    if (output.dim_sizes.size() != root->output_dim_ids.size())
+    {
+      return {ErrorType::ExecuteWrongDimension, "The count of dimensions does not match in the output tensor."};
+    }
+
+    for (size_t i = 0; i < root->output_dim_ids.size(); i++)
+    {
+      if (output.dim_sizes[i] != static_cast<uint64_t>(sortedDimSizes[root->output_dim_ids[i]]))
+      {
+        return {ErrorType::ExecuteWrongDimension,
+                "The output tensor dimension has a different size than the tensor it was set up with."};
+      }
+    }
+
+    std::vector<mini_jit::EinsumTree::EinsumNode *> nodesToProcess = {einsumTree.get_root()};
     uint32_t processedInputs = 0;
     while (nodesToProcess.size() > 0)
     {
@@ -113,7 +129,7 @@ namespace mlc
 
       if (tensor->dim_sizes.size() != node->output_dim_ids.size())
      {
-        return {ErrorType::ExecuteWrongDimension, "The count of dimensions does not match."};
+        return {ErrorType::ExecuteWrongDimension, "The count of dimensions does not match in an input tensor."};
       }
 
       for (size_t i = 0; i < node->output_dim_ids.size(); i++)
