26 commits
2023996
perf implement rust co-occurrence statistics
wenjie1991 Mar 18, 2025
e51f546
misc: change rust-py deps
wenjie1991 Mar 18, 2025
a5b8226
doc: improve the documentation
wenjie1991 Mar 18, 2025
f7ff293
add python re-implementation
MDLDan Mar 19, 2025
9af4252
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 19, 2025
9457767
Clean the tests and dependencies
wenjie1991 Mar 19, 2025
0ec6985
Merge branch 'numba-co-occurrence' of github.com:wenjie1991/squidpy i…
wenjie1991 Mar 19, 2025
92f3da5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 19, 2025
ad674ad
Merge branch 'main' into numba-co-occurrence
wenjie1991 Mar 19, 2025
057decc
Merge branch 'main' into numba-co-occurrence
timtreis Mar 28, 2025
26200d3
Merge branch 'main' into numba-co-occurrence
wenjie1991 Mar 28, 2025
c2a57ac
Merge branch 'scverse:main' into numba-co-occurrence
wenjie1991 Jun 12, 2025
52a5fae
Optimize memory access pattern & cache kernel
wenjie1991 Jun 12, 2025
cee646a
jit the outer function and parallelize
wenjie1991 Jun 12, 2025
05dd724
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 12, 2025
c490d45
Fix Mypy checking Typing error
wenjie1991 Jun 12, 2025
7b72292
Merge branch 'numba-co-occurrence' of github.com:wenjie1991/squidpy i…
wenjie1991 Jun 12, 2025
d0884ca
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 12, 2025
6e5d1df
Disable the cast typing in jit
wenjie1991 Jun 12, 2025
63702d1
merge
wenjie1991 Jun 12, 2025
a62128b
Try fix typing check error by mypy
wenjie1991 Jun 12, 2025
0aab856
Try: fix typing check error by mypy
wenjie1991 Jun 14, 2025
9ac2693
Try: fix typing check error by mypy
wenjie1991 Jun 14, 2025
5303de0
Merge branch 'main' into numba-co-occurrence
wenjie1991 Jun 21, 2025
482d0d8
init for benchmark
selmanozleyen Jul 10, 2025
18a767c
add the array order and stuff later. Also sparse vs dense later maybe
selmanozleyen Jul 10, 2025
62 changes: 62 additions & 0 deletions .github/workflows/benchmark.yml
@@ -0,0 +1,62 @@
name: Benchmark

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  FORCE_COLOR: "1"

defaults:
  run:
    shell: bash -e {0} # -e to fail on error

jobs:
  benchmark:
    runs-on: ${{ matrix.os }}

    strategy:
      fail-fast: false
      matrix:
        python: ["3.13"]
        os: [ubuntu-latest]

    env:
      OS: ${{ matrix.os }}
      PYTHON: ${{ matrix.python }}
      ASV_DIR: "./benchmarks"

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Fetch main branch for `asv run`’s hash
        run: git fetch origin main:main
        if: ${{ github.ref_name != 'main' }}

      - name: Set up Python ${{ matrix.python }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}
          cache: 'pip'

      - name: Cache datasets
        uses: actions/cache@v4
        with:
          path: |
            ~/.cache
          key: benchmark-state-${{ hashFiles('benchmarks/**') }}

      - name: Install dependencies
        run: pip install 'asv>=0.6.4'

      - name: Configure ASV
        working-directory: ${{ env.ASV_DIR }}
        run: asv machine --yes

      - name: Quick benchmark run
        working-directory: ${{ env.ASV_DIR }}
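        # `HEAD^!` is git rev-list syntax for "only the current commit"; combined
        # with --dry-run/--quick this smoke-tests the suite rather than recording results.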
        run: asv run --dry-run --quick --show-stderr --verbose HEAD^!
27 changes: 27 additions & 0 deletions benchmarks/.asv/results/benchmarks.json
@@ -0,0 +1,27 @@
{
"co_occurence.peakmem_co_occurrence": {
"code": "def peakmem_co_occurrence():\n sq.gr.co_occurrence(adata, cluster_key=\"cell type\")\n\ndef setup():\n global adata # noqa: PLW0603\n adata = sq.datasets.imc()",
"name": "co_occurence.peakmem_co_occurrence",
"param_names": [],
"params": [],
"type": "peakmemory",
"unit": "bytes",
"version": "8e588751fbcc15d56cfe73dbb3844752d6cac545c868bca1ce1045f0a54e301d"
},
"co_occurence.time_co_occurrence": {
"code": "def time_co_occurrence():\n sq.gr.co_occurrence(adata, cluster_key=\"cell type\")\n\ndef setup():\n global adata # noqa: PLW0603\n adata = sq.datasets.imc()",
"min_run_count": 2,
"name": "co_occurence.time_co_occurrence",
"number": 0,
"param_names": [],
"params": [],
"repeat": 0,
"rounds": 2,
"sample_time": 0.01,
"type": "time",
"unit": "seconds",
"version": "e764c1b999e315fffac8cac6708ede350ff254bd0b05a9be580a43ea3689d614",
"warmup_time": -1
},
"version": 2
}
9 changes: 9 additions & 0 deletions benchmarks/.asv/results/mac/machine.json
@@ -0,0 +1,9 @@
{
"arch": "arm64",
"cpu": "Apple M1",
"machine": "mac",
"num_cpu": "8",
"os": "Darwin 23.1.0",
"ram": "17179869184",
"version": 1
}
22 changes: 22 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,22 @@
# Squidpy Benchmarks

This directory contains code for benchmarking Squidpy using [asv][].

The functionality is checked using the [`benchmark.yml`][] workflow.
Benchmarks are run using the [benchmark bot][].

[asv]: https://asv.readthedocs.io/
[`benchmark.yml`]: ../.github/workflows/benchmark.yml
[benchmark bot]: https://github.com/apps/scverse-benchmark # TODO

## Data processing in benchmarks

# TODO
Each dataset is processed so it has

- `.layers['counts']` (containing data in C/row-major format) and `.layers['counts-off-axis']` (containing data in FORTRAN/column-major format)
- `.X` and `.layers['off-axis']` with log-transformed data (formats like above)
- a `.var['mt']` boolean column indicating mitochondrial genes

The benchmarks are set up so the `layer` parameter indicates the layer that will be moved into `.X` before the benchmark.
That way, we don’t need to add `layer=layer` everywhere.
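
A minimal sketch of how the `layer` convention above could look with asv's class-based parameterization, assuming a dataset whose `.layers` actually contains the listed entries (the layer names and dataset call here are illustrative only); the suite added in this PR, `benchmarks/benchmarks/co_occurence.py`, instead uses a single module-level `setup()` with no `layer` parameter.

```python
import squidpy as sq


class CoOccurrenceSuite:
    """Hypothetical layer-parameterized benchmark; names are illustrative only."""

    params = ["counts", "counts-off-axis", "off-axis"]
    param_names = ["layer"]

    def setup(self, layer):
        # assumes a dataset whose .layers actually contains these entries
        self.adata = sq.datasets.imc()
        self.adata.X = self.adata.layers[layer]

    def time_co_occurrence(self, layer):
        sq.gr.co_occurrence(self.adata, cluster_key="cell type")
```

With this pattern each layer shows up as a separate parameter value in the asv report, so different memory layouts can be compared without repeating `layer=layer` in every benchmark.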
172 changes: 172 additions & 0 deletions benchmarks/asv.conf.json
@@ -0,0 +1,172 @@
{
// The version of the config file format. Do not change, unless
// you know what you are doing.
"version": 1,

// The name of the project being benchmarked
"project": "squidpy",

// The project's homepage
"project_url": "https://squidpy.readthedocs.io/",

// The URL or local path of the source code repository for the
// project being benchmarked
"repo": "..",

// The Python project's subdirectory in your repo. If missing or
// the empty string, the project is assumed to be located at the root
// of the repository.
// "repo_subdir": "",

// Customizable commands for building, installing, and
// uninstalling the project. See asv.conf.json documentation.
//
// "install_command": ["python -mpip install {wheel_file}"],
// "uninstall_command": ["return-code=any python -mpip uninstall -y {project}"],
"build_command": [
"python -m pip install build",
"python -m build --wheel -o {build_cache_dir} {build_dir}",
],

// List of branches to benchmark. If not provided, defaults to "master"
// (for git) or "default" (for mercurial).
"branches": ["main"], // for git

// The DVCS being used. If not set, it will be automatically
// determined from "repo" by looking at the protocol in the URL
// (if remote), or by looking for special directories, such as
// ".git" (if local).
"dvcs": "git",

// The tool to use to create environments. May be "conda",
// "virtualenv" or other value depending on the plugins in use.
// If missing or the empty string, the tool will be automatically
// determined by looking for tools on the PATH environment
// variable.
"environment_type": "conda",

// timeout in seconds for installing any dependencies in environment
// defaults to 10 min
//"install_timeout": 600,

// the base URL to show a commit for the project.
"show_commit_url": "https://github.com/scverse/squidpy/commit/",

// The Pythons you'd like to test against. If not provided, defaults
// to the current version of Python used to run `asv`.
// "pythons": ["3.11", "3.13"],

// The list of conda channel names to be searched for benchmark
// dependency packages in the specified order
"conda_channels": ["conda-forge", "defaults"],

// The matrix of dependencies to test. Each key is the name of a
// package (in PyPI) and the values are version numbers. An empty
// list or empty string indicates to just test against the default
// (latest) version. null indicates that the package is to not be
// installed. If the package to be tested is only available from
// PyPi, and the 'environment_type' is conda, then you can preface
// the package name by 'pip+', and the package will be installed via
// pip (with all the conda available packages installed first,
// followed by the pip installed packages).
//
"matrix": {
"numpy": [""],
// "scipy": ["1.2", ""],
"scipy": [""],
"h5py": [""],
"natsort": [""],
"pandas": [""],
"memory_profiler": [""],
// "zarr": ["2.18.4"],
"pytest": [""],
// "scanpy": [""],
"squidpy": [""],
"python-igraph": [""],
// "psutil": [""]
"pooch": [""],
"scikit-image": [""],
// "scikit-misc": [""],
},

// Combinations of libraries/python versions can be excluded/included
// from the set to test. Each entry is a dictionary containing additional
// key-value pairs to include/exclude.
//
// An exclude entry excludes entries where all values match. The
// values are regexps that should match the whole string.
//
// An include entry adds an environment. Only the packages listed
// are installed. The 'python' key is required. The exclude rules
// do not apply to includes.
//
// In addition to package names, the following keys are available:
//
// - python
// Python version, as in the *pythons* variable above.
// - environment_type
// Environment type, as above.
// - sys_platform
// Platform, as in sys.platform. Possible values for the common
// cases: 'linux2', 'win32', 'cygwin', 'darwin'.
//
// "exclude": [
// {"python": "3.2", "sys_platform": "win32"}, // skip py3.2 on windows
// {"environment_type": "conda", "six": null}, // don't run without six on conda
// ],
//
// "include": [
// // additional env for python2.7
// {"python": "2.7", "numpy": "1.8"},
// // additional env if run on windows+conda
// {"platform": "win32", "environment_type": "conda", "python": "2.7", "libpython": ""},
// ],

// The directory (relative to the current directory) that benchmarks are
// stored in. If not provided, defaults to "benchmarks"
// "benchmark_dir": "benchmarks",

// The directory (relative to the current directory) to cache the Python
// environments in. If not provided, defaults to "env"
"env_dir": ".asv/env",

// The directory (relative to the current directory) that raw benchmark
// results are stored in. If not provided, defaults to "results".
"results_dir": ".asv/results",

// The directory (relative to the current directory) that the html tree
// should be written to. If not provided, defaults to "html".
"html_dir": ".asv/html",

// The number of characters to retain in the commit hashes.
// "hash_length": 8,

// `asv` will cache results of the recent builds in each
// environment, making them faster to install next time. This is
// the number of builds to keep, per environment.
// "build_cache_size": 2,

// The commits after which the regression search in `asv publish`
// should start looking for regressions. Dictionary whose keys are
// regexps matching to benchmark names, and values corresponding to
// the commit (exclusive) after which to start looking for
// regressions. The default is to start from the first commit
// with results. If the commit is `null`, regression detection is
// skipped for the matching benchmark.
//
// "regressions_first_commits": {
// "some_benchmark": "352cdf", // Consider regressions only after this commit
// "another_benchmark": null, // Skip regression detection altogether
// },

// The thresholds for relative change in results, after which `asv
// publish` starts reporting regressions. Dictionary of the same
// form as in ``regressions_first_commits``, with values
// indicating the thresholds. If multiple entries match, the
// maximum is taken. If no entry matches, the default is 5%.
//
// "regressions_thresholds": {
// "some_benchmark": 0.01, // Threshold of 1%
// "another_benchmark": 0.5, // Threshold of 50%
// },
}
1 change: 1 addition & 0 deletions benchmarks/benchmarks/__init__.py
@@ -0,0 +1 @@
"""ASV benchmark suite for sqidpy."""
32 changes: 32 additions & 0 deletions benchmarks/benchmarks/co_occurence.py
@@ -0,0 +1,32 @@
"""Benchmark tool operations in Squidpy.

API documentation: <https://squidpy.readthedocs.io/en/stable/api>.
"""

from __future__ import annotations

from typing import TYPE_CHECKING

import squidpy as sq


if TYPE_CHECKING:
from anndata import AnnData

# setup variables

adata: AnnData


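# asv runs this module-level setup() before each benchmark below,
# so `adata` is reloaded for every measurement.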
def setup():
    global adata  # noqa: PLW0603
    adata = sq.datasets.imc()


def time_co_occurrence():
    sq.gr.co_occurrence(adata, cluster_key="cell type")


def peakmem_co_occurrence():
    sq.gr.co_occurrence(adata, cluster_key="cell type")
