[BUG] Copying array contents is slow and unnecessarily creates new array #726

@sfe-SparkFro

Describe the bug

When copying one array's contents into another (e.g. buf2[:] = buf1), the operation is slow and creates a new array in the process, unnecessarily increasing memory consumption.

ulab version: 6.8.0-2D-c

To Reproduce

Run the following (DMA part only works on RP2, tested with Pico 2):

from ulab import numpy as np
import time
import rp2
import gc

# Set buffer size in bytes
buffer_size = 100_000

# Log free memory
print("mem_free:", gc.mem_free())

# Create first buffer
t0 = time.ticks_us()
buf1 = np.zeros((buffer_size), dtype=np.uint8)
t1 = time.ticks_us()
print("create buf1:", t1 - t0, "microseconds")

# Log free memory
print("mem_free:", gc.mem_free())

# Create second buffer
t0 = time.ticks_us()
buf2 = np.zeros((buffer_size), dtype=np.uint8)
t1 = time.ticks_us()
print("create buf2:", t1 - t0, "microseconds")

# Log free memory
print("mem_free:", gc.mem_free())

# Create a DMA controller and configure it
dma = rp2.DMA()
bytes_per_transfer = 4 # 1, 2, or 4 bytes per transfer
dma_ctrl = dma.pack_ctrl(
    # 0 = 1 byte, 1 = 2 bytes, 2 = 4 bytes
    size = {1:0, 2:1, 4:2}[bytes_per_transfer],
    inc_write = True,
    inc_read = True
)
dma.config(
    read = buf1,
    write = buf2,
    count = buffer_size // bytes_per_transfer,
    ctrl = dma_ctrl
)

# Fill buf1 with data
t0 = time.ticks_us()
buf1[:] = 1
t1 = time.ticks_us()
print("filling buf1:", t1 - t0, "microseconds")

# Log free memory
print("mem_free:", gc.mem_free())

# Copy buf1 to buf2 using standard assignment
t0 = time.ticks_us()
buf2[:] = buf1
t1 = time.ticks_us()
print("copy buf1 to buf2:", t1 - t0, "microseconds")

# Log free memory
print("mem_free:", gc.mem_free())

# Copy buf1 to buf2 using DMA
t0 = time.ticks_us()
_ = dma.active(True)
while dma.active():
    pass
t1 = time.ticks_us()
print("copy buf1 to buf2 with dma:", t1 - t0, "microseconds")

# Log free memory
print("mem_free:", gc.mem_free())

Expected behavior

buf2[:] = buf1 should take about as long as filling one of the buffers, and should not create a new array in the process.

With a Pico 2, the following is printed:

mem_free: 490016
create buf1: 1395 microseconds
mem_free: 389920
create buf2: 1576 microseconds
mem_free: 289824
filling buf1: 23131 microseconds
mem_free: 289408
copy buf1 to buf2: 47145 microseconds
mem_free: 189216
copy buf1 to buf2 with dma: 193 microseconds
mem_free: 189216

With a Pico 2 and buffer_size = 100_000, filling one buffer takes ~23 ms, whereas copying takes ~47 ms and allocates a whole extra array (~100 kB), as the drop in mem_free across the copy shows. By contrast, the DMA does the same memory copy in just ~0.2 ms (~250x faster!).
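For reference, the ratios and the allocation size quoted above can be re-derived from the printed log. This is only arithmetic on the numbers already shown (the variable names are mine), and it runs on any Python:

```python
# Figures copied from the printed output above (Pico 2, buffer_size = 100_000).
fill_us = 23131            # "filling buf1"
copy_us = 47145            # "copy buf1 to buf2"
dma_us = 193               # "copy buf1 to buf2 with dma"
free_before_copy = 289408  # mem_free just before buf2[:] = buf1
free_after_copy = 189216   # mem_free just after buf2[:] = buf1

copy_vs_fill = copy_us / fill_us                        # slice copy vs. fill
dma_speedup = copy_us / dma_us                          # slice copy vs. DMA
copy_alloc_bytes = free_before_copy - free_after_copy   # heap used by the copy

print("copy vs fill: %.2fx" % copy_vs_fill)
print("dma speedup: %.0fx" % dma_speedup)
print("bytes allocated by copy:", copy_alloc_bytes)
```

So the copy takes roughly twice as long as the fill, the DMA is about 244x faster than the copy, and the copy costs 100192 bytes of heap, which is about the size of one 100_000-byte uint8 array.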

Additional context

I understand that a CPU copy will take longer than the DMA, but why does it take so much longer than simply filling the array, and why is a whole new array being allocated? The project I'm working on uses large arrays, and creating a whole new array for a simple copy operation is pretty wasteful of memory. Is that intended behavior? Is there a more efficient solution? I see copyto() isn't implemented; perhaps that should be a feature request? I'm just trying to better understand the problem before asking for that.
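One possible workaround I have not verified on ulab: since the arrays are accepted as DMA read/write buffers, they expose the buffer protocol, so a copy through memoryview might avoid the temporary array. The sketch below uses bytearray stand-ins so it runs without ulab; the helper name copy_in_place is hypothetical, and on MicroPython it assumes the port has memoryview slice assignment enabled. It only makes sense for arrays of the same dtype and size, such as the two uint8 buffers above.

```python
def copy_in_place(dst, src):
    """Copy src into dst through the buffer protocol (hypothetical helper)."""
    dst_view = memoryview(dst)
    src_view = memoryview(src)
    # Both buffers must hold the same number of items of the same type.
    if len(dst_view) != len(src_view):
        raise ValueError("buffers differ in length")
    # Slice assignment writes into dst's existing memory.
    dst_view[:] = src_view


# bytearray stand-ins for buf1 and buf2 (assumption: uint8 ndarrays behave
# the same way through the buffer protocol; untested on ulab).
buffer_size = 100_000
buf1 = bytearray(b"\x01" * buffer_size)
buf2 = bytearray(buffer_size)

copy_in_place(buf2, buf1)
print(buf2 == buf1)
```

If this works on the device, comparing gc.mem_free() before and after the call, as in the script above, would show whether the extra allocation is actually avoided.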
