Add challenge 70: Merge Sorted Arrays (Medium) by claude[bot] · Pull Request #179 · AlphaGPU/leetgpu-challenges

claude · 2026-02-19T10:02:46Z

Summary

Adds challenge #70: Merge Sorted Arrays (Medium difficulty)
Given two sorted float32 arrays A (M elements) and B (N elements), produce a single sorted output array C of M+N elements
A sequential merge has a serial dependency chain; an efficient GPU solution uses the merge path (co-ranking) algorithm: each thread binary-searches along its cross-diagonal to find where the merge path intersects it, then merges a local tile with no inter-thread dependencies
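
The co-ranking idea above can be sketched on the CPU. This is a minimal Python illustration, not the challenge's reference solution: `co_rank`, `merge_path`, and `num_threads` are hypothetical names, and the loop over `t` stands in for independent GPU threads. Ties are assumed to be taken from A first.

```python
def co_rank(k, a, b):
    """Return (i, j) with i + j == k such that a[:i] and b[:j] together
    hold the first k outputs of the stable merge (ties taken from a)."""
    lo = max(0, k - len(b))
    hi = min(k, len(a))
    while lo < hi:
        i = (lo + hi) // 2
        j = k - i
        if a[i] <= b[j - 1]:
            lo = i + 1  # a[i] belongs before b[j - 1]: take more from a
        else:
            hi = i
    return lo, k - lo


def merge_path(a, b, num_threads=4):
    """Merge sorted lists a and b; each 'thread' fills one output tile."""
    total = len(a) + len(b)
    out = [None] * total
    chunk = -(-total // num_threads)  # ceiling division
    for t in range(num_threads):  # iterations do not depend on each other
        start = min(t * chunk, total)
        end = min(start + chunk, total)
        i, j = co_rank(start, a, b)  # where this tile starts in a and b
        for k in range(start, end):
            if j >= len(b) or (i < len(a) and a[i] <= b[j]):
                out[k] = a[i]
                i += 1
            else:
                out[k] = b[j]
                j += 1
    return out
```

Because each tile's starting pair `(i, j)` comes from its own binary search, no tile needs to know where the previous one finished.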

Why this is interesting

This challenge exercises GPU concepts that are underrepresented in the current set:

Independent work decomposition at the thread/block level for an inherently sequential algorithm
Binary search on GPU to compute co-ranks
Merge path algorithm (Odeh et al., 2012) — a classic GPGPU primitive
Memory access patterns for irregular work distribution

The problem is not element-wise: where each output element lands depends on its rank in both input arrays, not just on its own index.
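
To make that dependency concrete, here is a small Python sketch of a rank-based scatter merge. It is a simpler technique than merge path (one binary search per element rather than per tile) and is shown only for illustration; `merge_by_rank` is a hypothetical name, not something from this PR.

```python
from bisect import bisect_left, bisect_right


def merge_by_rank(a, b):
    """Place each element at (own index) + (its rank in the other array)."""
    out = [None] * (len(a) + len(b))
    for i, x in enumerate(a):
        # Elements of b strictly less than x precede it (ties go to a).
        out[i + bisect_left(b, x)] = x
    for j, y in enumerate(b):
        # Elements of a less than or equal to y precede it.
        out[j + bisect_right(a, y)] = y
    return out
```

Using `bisect_left` for one array and `bisect_right` for the other keeps equal values from colliding on the same output slot.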

Test plan

13 functional test cases covering: edge cases (M=1/N=1), all-zero inputs, all-negative inputs, interleaved ranges, non-overlapping ranges, power-of-2 sizes (16, 64/128, 256/512), non-power-of-2 sizes (30/25, 100/73, 255/200), realistic sizes (1K/500, 5K/3K)
Performance test: M = N = 10,000,000 (80 MB output, well within 16 GB VRAM × 5)
All 6 framework starters present (CUDA, PyTorch, Triton, JAX, CuTe, Mojo)
black, isort, flake8 --bugbear, clang-format all pass
Challenge loads via importlib (module name + signature verified)
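
The 80 MB figure for the performance test can be checked with a few lines of arithmetic. This sketch assumes 4 bytes per float32 element and decimal megabytes; `footprint_mb` is a hypothetical helper, not part of the challenge code.

```python
def footprint_mb(m, n, bytes_per_elem=4):
    """Return (input_mb, output_mb) for merging m and n float32 elements."""
    input_mb = (m + n) * bytes_per_elem / 1e6   # A and B together
    output_mb = (m + n) * bytes_per_elem / 1e6  # C has m + n elements
    return input_mb, output_mb


print(footprint_mb(10_000_000, 10_000_000))  # (80.0, 80.0)
```

Under these assumptions the two inputs add another 80 MB on top of the 80 MB output.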

🤖 Generated with Claude Code

Introduces a parallel merge challenge requiring solvers to merge two sorted float32 arrays into a single sorted output. Unlike a trivial sequential merge, an efficient GPU implementation must use the merge path (co-ranking) technique: each thread independently binary-searches to determine which slice of A and B it is responsible for, then merges locally without serial dependencies.

Key GPU concepts exercised: independent work decomposition, binary search on GPU, merge path algorithm, memory access patterns.

Includes 13 functional tests (edge cases, powers-of-2, non-powers, negative values, non-overlapping ranges, realistic sizes) and a performance test at M = N = 10,000,000.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

kunal-mansukhani · 2026-02-20T06:35:49Z

@claude Rebase and bump this pr to be challenge_id 71

claude bot requested review from ishaan-arya and kunal-mansukhani as code owners February 19, 2026 10:02

claude[bot] wants to merge 1 commit into main from challenge/70-merge-sorted-arrays
