Skip to content

Doc: Show stats comparing to numpy #879

@JonathanWoollett-Light

Description

@JonathanWoollett-Light

What type of report is this?

Improvement

Please describe the issue.

It would be good if a GitHub action ran a test that generated plots comparing performance to numpy, these could then be pushed to a GitHub page and viewable.

If you have a suggestion on how it should be, add it below.

An example is that:

Image

Which shows at which density sparse is more efficient for different numbers of dimensions.

Which I generated with:

import sparse
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm


def test_boolean():
    # Generate density values (11 points between 0.00-0.01)
    densities = np.linspace(0.00, 1.00, num=100)
    dims = range(1, 5)
    size = 2000

    sparse_mem: list[list[int]] = []
    numpy_mem: list[list[int]] = []
    for dim in dims:  # Dimensions 1-3 (0D removed)
        print(f"dim: {dim}")

        dim_size = int(float(size) ** (1 / float(dim)))
        sparse_mem_dim: list[int] = []
        numpy_mem_dim: list[int] = []
        for density in tqdm(densities):

            # Sparse array memory
            sparse_arr = sparse.random([dim_size for _ in range(dim)], density=density)
            sparse_mem_dim.append(sparse_arr.nbytes)

            # Dense array memory
            dense_arr = np.empty([dim_size for _ in range(dim)])
            numpy_mem_dim.append(dense_arr.nbytes)
        sparse_mem.append(sparse_mem_dim)
        numpy_mem.append(numpy_mem_dim)

    # Plotting
    plt.figure(figsize=(10, 6))
    for i, d in enumerate(dims):
        plt.plot(densities, sparse_mem[i], "o", alpha=0.5, label=f"Sparse {d}D ")
        plt.plot(densities, numpy_mem[i], "o", alpha=0.5, label=f"Numpy {d}D")
    plt.xlabel("Density")
    plt.ylabel("Memory Usage (bytes)")
    plt.title(f"Memory Usage vs Density for nD Arrays")
    plt.legend()
    plt.grid(True)
    plt.savefig(f"memory_usage.png")
    plt.close()

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions