[luci] Use static dimensions from new_shape attribute #14844
mbencer wants to merge 2 commits into Samsung:master
Conversation
| { | ||
| for (uint32_t axis = 0; axis < base_shape.rank(); ++axis) | ||
| { | ||
| if (!base_shape.dim(axis).known() && merged_shape.dim(axis).known()) |
This case handles when shape_by_input has an unknown dimension while shape_by_attr knows it. Could you give an example of this case?
Sure. In parallel with circle-resizer, I'm working on enabling LLama2 with dynamic input in circle-exir.
I have a case where the target shape input is non-const (not CircleConst), so at the moment of shape inference here we can only calculate the output rank from it. Via the attribute = {-1, 256} I can also include one known dimension here, which is very useful for further shape inference (before runtime execution).
The missing first dimension is calculated at runtime.
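The merge described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual luci C++ code; the function name `merge_shapes` and the `None`-as-unknown encoding are assumptions for the example:

```python
# Illustrative sketch: merge the partially-known shape from the (non-const)
# shape input with static dimensions from the new_shape attribute.
# None encodes an unknown dimension; this mirrors the idea, not the luci API.
def merge_shapes(base, merged):
    assert len(base) == len(merged), "ranks must match"
    # Prefer the dimension from the shape input; fall back to the attribute.
    return [b if b is not None else m for b, m in zip(base, merged)]

# Shape input is non-const: only the rank (2) is known.
shape_by_input = [None, None]
# new_shape attribute {-1, 256}: -1 (unknown) maps to None, 256 is static.
shape_by_attr = [None, 256]

print(merge_shapes(shape_by_input, shape_by_attr))  # [None, 256]
```

With this merge, one static dimension survives shape inference even though the shape input contributes only the rank.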
btw, I see that something similar is already done, but only for CircleOutputDummy, and I am not sure why
Thanks for the explanation. So, this happens because circle-resizer only updates the input shape but not the attributes.
I'm working on enabling LLama2 with dynamic input in circle-exir.
Could you explain it a little bit more? I imagined it could be used for changing the sequence length as below.
Torch -> static circle (seq len: 256) -> static circle (seq len: 512 or else)
But it seems that your model includes dynamic shapes. Could you describe your usage scenario?
Sure. In general I'm trying to resolve issue 293 from the circle-exir repo.
- I exported the LLama2 model with dynamic batch and sequence length using my circle-exir draft PR 871.
- I resized the model to static shapes (batch: 1, seq_len: 3) using my draft tool circle-resizer.
At the moment it fails during luci::CircleShapeInferencePass without this change.
Thanks. So, the whole process is Torch -> dynamic shape circle (batch and seq_len are unknown) -> static shape circle.
Dynamic shape circle would have additional operators, such as Shape, not executable on NPU. circle-resizer may only update the shapes while not removing those operators. Do you have a plan to remove them as well, e.g., using constant folding?
Dynamic shape circle would have additional operators, such as Shape, not executable on NPU. circle-resizer may only update the shapes while not removing those operators. Do you have a plan to remove them as well, e.g., using constant folding?

You are completely right. The next step should be calling optimization passes (like ConstantFold, especially for Shape->Gather patterns). In the ideal scenario, the model after running such passes should be the same as a static version produced by the frontend. However, as I understand it, this should rather be a job for circle2circle
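As a rough illustration of that folding step (a hypothetical Python sketch, not circle2circle code; the function names are invented for the example): once all shapes are static, a Shape op produces a compile-time constant, so a Shape -> Gather chain collapses to a constant.

```python
# Hypothetical sketch: fold a Shape -> Gather chain once the input is static.
def fold_shape(static_input_shape):
    # Shape(x) on a statically-shaped tensor is just a constant vector.
    return list(static_input_shape)

def fold_gather(const_vector, indices):
    # Gather on a constant vector is itself a constant.
    return [const_vector[i] for i in indices]

shape_const = fold_shape([1, 3, 256])    # Shape op folded to [1, 3, 256]
seq_len = fold_gather(shape_const, [1])  # Gather(indices=[1]) folded to [3]
print(seq_len)  # [3]
```

After such folding, the dynamic-shape helper operators disappear and the graph matches what a static-shape frontend export would produce.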
|
You may need to change the commit message. [luci-interpreter] -> [luci] |
|
btw, it's rather a topic for a separate PR, but I think that is_static_shape can depend only on the input data shape. At the moment, it's also set to false for dynamic dimensions in |
| } | ||
| if (unknown_dim_index != UINT32_MAX) | ||
| { | ||
| if (input_element_count % output_element_count != 0) |
For negative test purposes, should this be moved to a separate PR?
Before asking, how is this condition related to your issue?
I can't catch the problem.
It's not directly related to my change, but I don't have better ideas for new negative tests (especially because the Reshape shape inference implementation uses asserts rather than throwing exceptions).
But I see now that it's better to move it to a separate PR ;)
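For context, the condition under discussion guards the usual Reshape rule for inferring a single unknown dimension. A minimal sketch of that rule (plain Python; the function name is illustrative, not the luci implementation):

```python
def infer_reshape_shape(input_shape, target_shape):
    # target_shape may contain a single -1 marking the dimension to infer.
    input_count = 1
    for d in input_shape:
        input_count *= d

    known_count = 1
    unknown_idx = None
    for i, d in enumerate(target_shape):
        if d == -1:
            unknown_idx = i
        else:
            known_count *= d

    out = list(target_shape)
    if unknown_idx is not None:
        # This is the divisibility check from the diff above: the element
        # count must split evenly, otherwise no valid dimension exists.
        if input_count % known_count != 0:
            raise ValueError("unknown output dimension cannot be calculated")
        out[unknown_idx] = input_count // known_count
    return out

print(infer_reshape_shape([2, 3, 4], [-1, 4]))  # [6, 4]
```

Reshaping 24 elements to [-1, 5] would trip the check (24 % 5 != 0), which is exactly the kind of input a negative test would exercise.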
| namespace | ||
| { | ||
| loco::TensorShape merge_shapes(const loco::TensorShape &base_shape, |
|
@mbencer , I can't see what case you are solving with |
Unfortunately, I have a problem with preparing such a test model using a TensorFlow recipe. If I try to set a dynamic dimension (by … ). I've prepared such a (simplified) test model using CircleGen (dumped to a file). Without this PR, the shape inference pass fails during Squeeze inference because the dynamic dimension in the input shape is not checked. But even if we add checking of the dynamic dimension in Squeeze, the shape inference doesn't look good: I am not sure what happens in the version without the Reshape change from this PR. I'll try to debug it. |
|
Why are you struggling with |
| { | ||
| if (input_element_count % output_element_count != 0) | ||
| { | ||
| INTERNAL_EXN("Unknown output dimension cannot be calculated for inputs"); |
| INTERNAL_EXN("Unknown output dimension cannot be calculated for inputs"); | |
| INTERNAL_EXN("Reshape Op cannot infer unknown dimension from inputs."); |
| ASSERT_EQ(4, output_shape.dim(1).value()); | ||
| } | ||
|
|
||
| TEST(ShapeRuleTest, reshape_should_infer_incorrect_zero_NEG) |
| TEST(ShapeRuleTest, reshape_should_infer_incorrect_zero_NEG) | |
| TEST(ShapeRuleTest, reshape_zero_rank_mismatch_NEG) |
| ASSERT_THROW(shape_inf_rule.infer(node_reshape, output_shape), oops::InternalExn); | ||
| } | ||
|
|
||
| TEST(ShapeRuleTest, reshape_should_infer_incorrect_target_shape_NEG) |
| TEST(ShapeRuleTest, reshape_should_infer_incorrect_target_shape_NEG) | |
| TEST(ShapeRuleTest, reshape_wrong_target_shape_NEG) |
| ASSERT_TRUE(shape_inf_rule.infer(node_reshape, output_shape)); | ||
|
|
||
| ASSERT_EQ(2, output_shape.rank()); | ||
| ASSERT_FALSE(output_shape.dim(0).known()); |
Why is this unknown? I guess that Reshape (2,3,4) into (-1, 12) should have returned (2, 12)?
Hmm, good point. My intention here was to mark it as dynamic, but you are right that -1 also has a special meaning.
I debugged it more deeply and IMHO is_static_shape = false is not needed here. As a result, the -1 handling here is not called.
But it seems to be something separate from the main goal of my change (especially since even now it's not very clear 😉).
To avoid confusion I've changed the test data.
As I mentioned in #14844 (comment), without the Reshape fix from this PR, shape inference fails on the Squeeze op (which comes after Reshape). EDIT: I've found the reason for the strange shape inference results in the version without this PR + the fix in Squeeze. btw: IMO storing dynamic dimensions as |
|
I've tried to clean up the stuff by: IMHO there is still some work to consider: how to mark dynamic dimensions: |
This PR uses static dimensions from the new_shape attribute even if only some of them are static and the rest dynamic. ONE-DCO-1.0-Signed-off-by: Mateusz Bencer m.bencer@partner.samsung.com
|
Plz add a draft link to see #14844 (comment) for this too |
| auto rank = new_shape->rank(); | ||
| auto shape_dummy = dynamic_cast<luci::CircleOutputDummy *>(node->shape()); | ||
| if (shape_dummy && rank > 0) | ||
| if (rank > 0) |
Q) why did you remove the check for shape_dummy?
For my model converted from PyTorch I would like to use the dimensions stored in newShape, but models converted from PyTorch have no luci::CircleOutputDummy
Do you know what and how CircleOutputDummy works?
I only know (based on the spec) that it's a "Temporary DummyNode used with dangle CircleNode", but to be honest I don't know why it is used here ;)
If you don't know the purpose of this node, why did you delete it?
Just for clarification, I believe I understand the concept of CircleOutputDummy here -> we use values from the attribute only if the shape input is missing (the dummy node is assigned to avoid a dangling operand) -> the attribute here reflects the TF implementation of the Reshape node.
I just don't understand why we cannot relax the algorithm -> if based on the shape input we can only calculate the rank, let's check whether we can extract something valuable from the attribute (still treating the input shape with the higher priority)
| auto node_reshape = g->nodes()->create<luci::CircleReshape>(); | ||
| auto tensor_input = g->nodes()->create<luci::CircleInput>(); | ||
| auto target_shape = g->nodes()->create<luci::CircleInput>(); | ||
| ; |
thank you for catching it. removed
| if (node->newShape()->dim(axis) > 0) | ||
| { | ||
| shape_by_attr.dim(axis) = node->newShape()->dim(axis); | ||
| } | ||
| else | ||
| { | ||
| shape_by_attr.dim(axis).unset(); // unset means unknown dimension | ||
| } |
how is this change related to removing shape_dummy?
The purpose of the change is to align the representation of dynamic shapes: -1 for newShape and 0 for shape_by_attr.
But the relation to the shape_dummy removal is only to avoid showing an info log about a shape mismatch.
is only to not show info log about shape mismatch.
I don't understand this. Please explain your understanding of this.
Let's consider my case from the reshape_by_newShape_dynamic unit test:
shape_by_input here, after
ONE/compiler/luci/service/src/Nodes/CircleReshape.cpp
Lines 130 to 135 in e5e151d
is [0,2,0] while shape_by_attr is [-1,2,-1]. TensorShape::operator== doesn't handle -1 as a dynamic dimension
doesn't handle -1 as dynamic dimension
Not this. I'm asking about the relation with shape_dummy removal.
I understand about handling 0 to be handled as unknown.
And I really don't understand why you are mentioning about the log.
What is the problem with the log?
IMO incorrect information about a mismatch would be shown just because of the different representation of dynamic dimensions. It is related to this change because after calling
ONE/compiler/luci/service/src/Nodes/CircleReshape.cpp
Lines 130 to 135 in e5e151d
(with shape_dummy) the condition shape_by_input == shape_by_attr would incorrectly be negative
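The representation mismatch can be shown with a small sketch (illustrative Python; `None` stands in for an unset loco dimension, and the function name is invented): normalizing both encodings to the same "unknown" marker is what makes the equality comparison meaningful.

```python
def normalize_shape(dims):
    # Map non-positive values to None, a single representation of "unknown".
    # This covers both encodings: -1 (newShape attribute) and 0 (shape input).
    return [d if d > 0 else None for d in dims]

shape_by_input = normalize_shape([0, 2, 0])    # [None, 2, None]
shape_by_attr = normalize_shape([-1, 2, -1])   # [None, 2, None]

# Comparing the raw encodings ([0,2,0] vs [-1,2,-1]) reports a mismatch;
# after normalization the shapes compare equal.
print(shape_by_input == shape_by_attr)  # True
```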
| ASSERT_TRUE(output_shape.dim(1).known()); | ||
| ASSERT_EQ(4, output_shape.dim(1).value()); | ||
| ASSERT_FALSE(output_shape.dim(2).known()); | ||
| } |
IIUC, you are reshaping (2, 3, 4) to (-1, 4, -1).
I understand you are just preparing a test, but I really don't understand how (2, 3, 4) can be reshaped to (-1, 4, -1).
These values can make others reading this code confused about the Reshape op.
If you think I don't know about this, please explain.
It can probably be reshaped to (6, 4, 1), but I agree that such test data can be confusing. Changed to 2 (still with the assumption that -1 means a dynamic dimension)
Have you tried running with pytorch or TF with reshape from (2, 3, 4) to (6, 4, 1) ?
(and maybe with (-1, 4, -1))
import tensorflow as tf
input = tf.zeros([2, 3, 4])
print(tf.reshape(input, [6, 4, 1]).shape)

import torch
input = torch.zeros(2, 3, 4)
print(torch.reshape(input, (6, 4, 1)).shape)

Both print (6, 4, 1).
| // reshape to {dynamic, 4, dynamic} | ||
| node_reshape->newShape()->rank(3); | ||
| node_reshape->newShape()->dim(0) = -1; | ||
| node_reshape->newShape()->dim(1) = 2; |
Now you changed 4 to 2. What's the rationale behind this?
the idea was to use data that is easier to understand (especially after shape specialization)
I really think this is a VERY dangerous way of thinking. We also support tflite and TensorFlow files. |
At least for now, I cannot imagine a case where such a change could break something for another format. |
|
Due to the fact that I have no strong arguments that the change is completely safe for other frontends, I'm closing the PR for now. For my current purposes (LLama2 with dynamic batch and seq_len) it is enough to introduce #14857 (not all possible dimensions will be propagated during shape inference) but |




Issue: #14791