
Allocating shared buffers for intermediate tensors of layers #25

@Jzjerry

Description

Hi,

Thank you for creating such a convenient framework!

I've been using the framework for a while and it works pretty well, but I ran out of memory while deploying some larger models on my small RISC-V core with limited RAM.

I think the problem comes from the fact that the Baremetal-NN codegen allocates a separate buffer for each layer's/operator's inputs and outputs.
For example, in examples/mlp:

model->input_1.shape[0] = 1;
model->input_1.shape[1] = 48;
model->input_1.data = (float *)malloc(192);
model->actor_0.shape[0] = 1;
model->actor_0.shape[1] = 512;
model->actor_0.data = (float *)malloc(2048);
model->actor_0_weight.shape[0] = 512;
model->actor_0_weight.shape[1] = 48;
model->actor_0_weight.data = (float *)(model_weight_data + 0);
model->actor_0_bias.shape[0] = 512;
model->actor_0_bias.data = (float *)(model_weight_data + 98304);
model->actor_1.shape[0] = 1;
model->actor_1.shape[1] = 512;
model->actor_1.data = (float *)malloc(2048);
model->actor_2.shape[0] = 1;
model->actor_2.shape[1] = 256;
model->actor_2.data = (float *)malloc(1024);

The buffers for input_1 (192 bytes), actor_0 (2048 bytes), actor_1 (2048 bytes), and actor_2 (1024 bytes) are allocated separately. This is good for tracking the values of layer outputs, but it is not very memory-efficient, especially for larger models.

Therefore, I wonder if it is possible to add a codegen feature that creates shared buffers for these tensors, e.g. a shared 2048-byte buffer reused by actor_0, actor_1, and actor_2, which would save a lot of runtime memory.

Metadata

Labels: enhancement (New feature or request)
