Commit e71ccf5

feat: NumPy 2.x aligned Shape architecture and broadcast semantics (#538)
* fix: broadcast_to unilateral validation matching NumPy semantics (Bug 3)

NumPy's broadcast_to is unilateral: it only stretches source dimensions that are size 1 to match the target shape. If the source has a dimension larger than the target, or more dimensions than the target, it raises ValueError.

NumSharp's broadcast_to was delegating directly to the bilateral Broadcast(Shape, Shape) which allows both sides to stretch.

Added ValidateBroadcastTo() helper called from all 9 broadcast_to overloads before the bilateral Broadcast call. The check enforces:
- source ndim <= target ndim
- each source dimension (right-aligned) must be 1 or equal to target

This cannot live inside Broadcast() itself because arithmetic operations (a + b) require bilateral stretching of both operands.

Verified with dotnet_run scripts against NumPy:
  broadcast_to(ones(3), (1,)) → now throws (was accepted)
  broadcast_to(ones(1,2), (2,1)) → now throws (was accepted)
  broadcast_to(ones(1,3), (2,3)) → still works (valid unilateral)

* fix: remove IsBroadcasted guard, unify 2-arg/N-arg broadcast paths (Bug 4)

2-arg Broadcast(Shape, Shape) threw NotSupportedException when either input was already broadcast. This blocked legitimate operations like np.clip on broadcast arrays (which internally re-broadcasts) and explicit re-broadcasting via broadcast_to.

Changes to the 2-arg path:
- Removed the IsBroadcasted guard at line 299
- When an input IsBroadcasted, resolve BroadcastInfo.OriginalShape as the root original for chain tracking; stride=0 dims from prior broadcasts naturally propagate through the stride computation loop
- ViewInfo for sliced inputs now uses the resolved original shape

Changes to the N-arg Broadcast(Shape[]) path:
- Added ViewInfo handling for sliced inputs, matching the 2-arg path.
  Without this, N-arg broadcast_arrays with sliced inputs produced wrong values (GetOffset couldn't resolve slice strides)
- Added re-broadcast support via BroadcastInfo.OriginalShape
- Removed dead code: `it.size = tmp` (was immediately overwritten by ComputeHashcode which recalculates size from dimensions)

Verified both paths produce identical results for sliced inputs: arange(12).reshape(3,4)[:,1:2] broadcast to (3,3) now correctly returns [[1,1,1],[5,5,5],[9,9,9]] through both 2-arg and N-arg paths.

Re-broadcast chains tested up to triple depth: broadcast_to(broadcast_to(broadcast_to(x, s1), s2), s3) works.

* test: broadcast audit (unilateral validation, re-broadcast, path parity)

Updated and added tests for the broadcast system audit:
- BroadcastTo_UnilateralSemantics_RejectsInvalidCases: replaces the old BroadcastTo_BilateralBroadcast_KnownDiscrepancy test. Now verifies that (3,)→(1,), (1,2)→(2,1), and (1,1)→(1,) all throw, matching NumPy.
- ReBroadcast_2Arg_SameShape: broadcast → re-broadcast same shape
- ReBroadcast_2Arg_HigherDim: (3,1)→(3,3)→(2,3,3) chain
- ReBroadcast_2Arg_ClipOnBroadcast: np.clip on broadcast (Bug 4 variant)
- BroadcastArrays_NArg_SlicedInput_CorrectValues: N-arg path with sliced column input (was returning [0,0,0],[1,1,1],[2,2,2] instead of [1,1,1],[5,5,5],[9,9,9] due to missing ViewInfo)
- BroadcastPaths_2Arg_vs_NArg_SlicedInput_Identical: verifies 2-arg and N-arg paths produce identical results for the same sliced input

* test: add Bugs 23-24 discovered during broadcast stress testing

New bugs found by running 65 stress tests against the broadcast system after Phase 3/4 fixes. All are pre-existing, not regressions.

Bug 23a (reshape col-broadcast wrong element order): reshape(broadcast_to([[10],[20],[30]], (3,3)), (9,)) returns [10,20,30,10,20,30,...] instead of [10,10,10,20,20,20,...]. _reshapeBroadcast uses offset % OriginalShape.size modular arithmetic which walks original storage linearly instead of logical row-major.
Workaround: np.copy(a).reshape(...).

Bug 23b (np.abs on broadcast throws IncorrectShapeException): Cast creates UnmanagedStorage with mismatched shape size (broadcast size=6 vs storage size=0). The abs implementation doesn't handle broadcast arrays that have storage smaller than the broadcast shape.

Bug 24 (transpose col-broadcast returns wrong values): broadcast_to([[10],[20],[30]], (3,3)).T returns [[10,10,10],...×3] instead of [[10,20,30],...×3]. Transpose materializes via Clone and creates plain strides [3,1], losing the stride=0 broadcast semantics. Should swap strides to [0,1] (zero-copy, like NumPy). Row-broadcast .T works by coincidence.

* docs: add offset-model-rewrite investigation plan

Entry point for planning the rewrite of NumSharp's view/offset resolution to match NumPy's base_offset + strides architecture.

Current model: ViewInfo + BroadcastInfo chains with 6+ GetOffset code paths, recursive ParentShape resolution, and complex lazy-loaded UnreducedBroadcastedShape computation.

Target model: base_offset (int) + strides[] + dimensions[]. Offset computation becomes a single loop: sum(stride[i] * coord[i]). All slice/broadcast/transpose operations just adjust the base_offset and strides, with no chains and no special cases.

The plan covers:
- Investigation checklist (12 items): catalog all consumers, understand IArraySlice bounds checking, NDIterator relationship, reshape-after-slice interactions, generated template code, IsSliced/IsBroadcasted derivation from strides, memory management
- Risk assessment with mitigations
- Suggested incremental approach (prototype Shape2, verify parity, migrate)
- NumPy reference files in src/numpy/ for each subsystem

* Add GitHub Issues section to CLAUDE.md

Adds a 'GitHub Issues' section to .claude/CLAUDE.md that documents using `gh issue create` for SciSharp/NumSharp and notes GH_TOKEN availability via the env-tokens skill.
Provides structured templates for Feature/Enhancement and Bug Report issues (checklists and fields such as overview, problem/proposal, evidence, scope, benchmarks, breaking changes, reproduction, expected/actual behavior, workaround, root cause, and related issues) to standardize reporting.

* fix: broadcast infrastructure (GetCoordinates, flatten, cumsum, reshape_unsafe, re-broadcast)

Shape.GetCoordinates: use dimension-based decomposition for broadcast shapes instead of stride-based, which breaks on zero-stride dims. Matches NumPy's PyArray_ITER_GOTO1D factor-based approach.

NDArray.flatten (both overloads): guard broadcast arrays by delegating to np.ravel(); flat.copy() produced wrong element order, and non-clone path caused out-of-bounds reads on the small backing buffer.

Default.Reduction.CumAdd: strip broadcast metadata via shape.Clean() before allocating the result array, preventing slice writes from going to a detached clone (IsBroadcasted clone path in GetViewInternal).

NdArray.ReShape: fix reshape_unsafe to pass ref newshape instead of the instance's shape; it was silently ignoring the requested shape.

Tests: update re-broadcast test to expect success (not throw) after Bug 4 fix; fix GetCoordinates_Broadcasted to validate correct logical coordinates; clean up OpenBugs.cs (remove fixed bugs, keep reference comments).

* fix: rewrite np.roll using NumPy's slice-based algorithm

Replace broken type-switch + GetCoordinates/GetOffset implementation with NumPy's empty_like + slice-copy approach. Fixes 5 tracked bugs:
- Bug 27: np.roll returns int instead of NDArray
- Bug 45: no-axis roll returns null
- Bug 50: roll only supports Int32/Single/Double (now all 12 dtypes)
- Bug 14a/b: broadcast roll produces zeros
- Bug 19a/b: broadcast roll Data<T> reads garbage

NDArray.roll.cs: rewritten from 104-line type-switch to 2-line delegation to np.roll(this, shift, axis).
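The slice-copy idea this commit adopts for np.roll can be sketched in a few lines of plain Python. This is an illustration only, not the NumSharp code: `roll_1d` is a hypothetical name, a Python list stands in for the array, and `[None] * n` stands in for `empty_like`.

```python
def roll_1d(values, shift):
    """Roll a 1-D sequence using two slice copies (body shift + tail wrap)."""
    n = len(values)
    if n == 0:
        return []
    shift %= n                           # normalizes negative and oversized shifts
    result = [None] * n                  # stand-in for empty_like(values)
    result[shift:] = values[:n - shift]  # body: elements move right by `shift`
    result[:shift] = values[n - shift:]  # tail: the last `shift` elements wrap around
    return result


print(roll_1d([0, 1, 2, 3, 4], 2))   # [3, 4, 0, 1, 2]
print(roll_1d([0, 1, 2, 3, 4], -1))  # [1, 2, 3, 4, 0]
```

Because only slicing and assignment are involved, nothing in the sketch depends on the element type, which is consistent with the commit's note that the rewrite covers all dtypes via slicing.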
np.roll.cs: new 70-line static method. No axis: ravel→roll→reshape; with axis: empty_like + 2 slice-copy pairs (body shift + tail wrap). Handles negative axis, shift modulo, all dtypes via slicing.

np.array_manipulation.cs: removed broken static np.roll that returned int.

Add 110 roll tests (100 pass, 10 OpenBugs for multi-axis tuple shift API gap and empty 2D with axis=1). Add 52 ravel tests (50 pass, 2 OpenBugs for upstream Shape.IsContiguous too conservative on contiguous slices).

Document C-order-only as architectural constraint in CLAUDE.md Key Design Decisions table.

* fix: np.empty_like (clone shape, add shape override and NPTypeCode overload)

Fix aliasing bug: prototype.shape (raw int[]) was passed by reference to the new Shape, causing both arrays to share the same dimensions array. Now clones via (int[])prototype.shape.Clone(), matching full_like's existing pattern.

Add shape override parameter (Shape shape = default): when provided, overrides the prototype's shape while preserving its dtype. Matches NumPy's empty_like(a, shape=(4,5)) signature.

Add NPTypeCode overload: empty_like(NDArray, NPTypeCode, Shape) for callers that already have an NPTypeCode, avoiding Type→NPTypeCode conversion. Delegates to np.empty() for consistency.

Add 103 tests verified against NumPy 2.4.2 ground truth covering: shape/dtype preservation (1D–4D, scalar), dtype override (Type and NPTypeCode, all 12 types), shape override (2D→1D/2D/3D, with dtype, same/diff size, scalar, broadcast/slice sources), empty arrays (zero-dim), sliced/broadcast/transposed prototypes, memory independence, writeability, aliasing fix verification, sibling contract comparison (zeros_like/ones_like), chained operations, and integration with np.roll pattern.

* test: migrate FluentAssertions 5.10.3 to AwesomeAssertions 9.3.0, fix 5 assertion bugs

FluentAssertions went proprietary after v8. AwesomeAssertions is the Apache 2.0 community fork: permanently free, actively maintained.
Package upgrade:
- FluentAssertions 5.10.3 → AwesomeAssertions 9.3.0 in csproj
- Renamed `using FluentAssertions` → `using AwesomeAssertions` across 83 test files
- Adapted to AA 9.x API: ReferenceTypeAssertions now requires (subject, AssertionChain) constructor, Execute.Assertion replaced with AssertionChain field, Subject read-only

Bugs fixed in FluentExtension.cs:
- Bug 1: AllValuesBe error messages showed literal "0","1","2" instead of actual values due to unescaped {0}/{1}/{2} inside $"" strings; fixed to {{0}}/{{1}}/{{2}}
- Bug 2: BeOfValuesApproximately all 12 dtype branches said "(dtype: Boolean)"; fixed each branch to show correct dtype name (Byte, Int16, Double, etc.)
- Bug 3: NDArrayAssertions.Identifier returned "shape" (copy-paste); changed to "ndarray"
- Bug 4: BeShaped(ITuple)/BeEquivalentTo had no bounds check; added dimension count assertion before accessing dimensions[i] to prevent IndexOutOfRangeException
- Bug 5: BeShaped used order-insensitive BeEquivalentTo, so BeShaped(3,2) would pass on a (2,3) shape; changed to order-sensitive Equal().
  This correctly exposed 5 pre-existing NumSharp bugs (np.moveaxis, NewAxis indexing, SlicingWithNewAxis)

AA 9.x API compatibility fixes across test files:
- .Array.Should().ContainInOrder() → .Data<int>().Should().ContainInOrder() (typed)
- .Array.Should().BeEquivalentTo(.Array) → .Data<bool>().Should().Equal() (typed)
- .Should().BeInAscendingOrder() → .Data<double>().Should().BeInAscendingOrder()
- BeEquivalentTo(params) → BeEquivalentTo(new[]{}) for type inference
- BeLessOrEqualTo → BeLessThanOrEqualTo (renamed in AA 8.x)
- .And.HaveCount() → .Which.Should().HaveCount() (chain semantics)
- Cast<T>().Should().BeEquivalentTo(NDArray) → .Should().Be(NDArray)

Infrastructure:
- Added FluentExtensionTests.cs with 72 tests covering all custom assertion methods, error message quality (catches Bug 1/2 regressions), chaining, all 12 dtypes, edge cases (scalar, sliced, broadcast, 2D), UnmanagedStorage entry point
- Removed OpenBugs.DeprecationAudit.cs (duplicate method names conflicting with OpenBugs.cs)

Test results: 1644 passed, 5 failed (pre-existing), 34 skipped on both net8.0 and net10.0

* test: fix 3 assertion correctness issues, add 6 new capabilities, 16 new tests

Correctness fixes in FluentExtension.cs:
- Fix NotBe/NotBeShaped error messages: was "Expected shape to be X" when shapes ARE equal; now correctly says "Did not expect shape to be X"
- Fix UInt64 overflow in BeOfValuesApproximately: unsigned subtraction (expected - nextval) wraps on underflow; cast to double before subtraction
- Remove dead System.IO import

New assertion capabilities added to ShapeAssertions and NDArrayAssertions:
- BeContiguous() / NotBeContiguous(): asserts Shape.IsContiguous
- HaveStrides(params int[]): asserts exact stride values
- BeEmpty(): asserts size == 0 (NDArrayAssertions only)
- NotBe(NDArray): complement to Be(), uses np.array_equal negation
- NotBeOfType(NPTypeCode) / NotBeOfType<T>(): complement to BeOfType

New infrastructure tests (16, total now 88):
- Contiguous
  assertions: fresh array, sliced step, shape-level (4)
- HaveStrides: shape pass, shape fail, ndarray pass (3)
- BeEmpty: empty pass, non-empty fail (2)
- NotBeOfType: mismatch pass, match fail, generic form (3)
- NotBe: different pass, equal fail (2)
- Error message correctness: NotBe/NotBeShaped say "Did not expect" (2)
- UInt64 overflow regression: both directions (3UL vs 5UL) (1)

All 88 infrastructure tests pass on net8.0 and net10.0. Full suite: 1644 pass, 5 fail (pre-existing NumSharp bugs), 34 skipped.

* test: fix FluentAssertions → AwesomeAssertions namespace in 4 post-merge files

These 4 test files were added to broadcast-refactor after the tests branch diverged, so they weren't included in the AwesomeAssertions migration. The merge brought AwesomeAssertions as the package but these files still referenced `using FluentAssertions`; renamed to `using AwesomeAssertions`.

Files fixed:
- np.empty_like.Test.cs
- np.ravel.Test.cs
- np.reshape.Test.cs (new file, untracked)
- np.roll.Test.cs

* fix: broadcast reshape (always set ViewInfo in _reshapeBroadcast for correct offset resolution)

_reshapeBroadcast previously only set ViewInfo when the broadcast shape was also sliced (guarded by `if (IsSliced)`). Without ViewInfo, the reshaped shape's GetOffset fell through to the `offset % OriginalShape.size` modular arithmetic path, which happened to produce correct results for row broadcasts (where data is already laid out linearly) but produced wrong element ordering for column broadcasts and other non-trivial broadcast patterns.

The fix removes the `if (IsSliced)` guard so ViewInfo is always set. This forces offset resolution through the recursive GetOffset path, which walks up to the parent broadcast shape and uses its strides (with zeros for broadcast dimensions) to compute the correct physical offset via GetCoordinates → parent.GetOffset.
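A small plain-Python model may help show why the modular path only worked for row broadcasts. It is a sketch under stated assumptions: `unravel` and `physical_offset` are hypothetical helpers (not NumSharp APIs), strides are counted in elements, and a 3-element list plays the role of the backing buffer of the (3,1) column from Bug 23a.

```python
def unravel(index, dims):
    """Row-major flat index -> coordinates, using dimensions only (no strides)."""
    coords = []
    for dim in reversed(dims):
        index, rem = divmod(index, dim)
        coords.append(rem)
    return tuple(reversed(coords))


def physical_offset(coords, strides, base=0):
    """Offset into the backing buffer: base + sum(coord * stride)."""
    return base + sum(c * s for c, s in zip(coords, strides))


storage = [10, 20, 30]  # backing buffer of the (3, 1) column
dims = (3, 3)           # logical shape after broadcast_to
strides = (1, 0)        # stride 0 marks the stretched (broadcast) dimension

# Coordinates first, then strides: logical row-major order.
correct = [storage[physical_offset(unravel(i, dims), strides)] for i in range(9)]
# Modular shortcut: walks the buffer linearly and ignores the zero stride.
modular = [storage[i % len(storage)] for i in range(9)]

print(correct)  # [10, 10, 10, 20, 20, 20, 30, 30, 30]
print(modular)  # [10, 20, 30, 10, 20, 30, 10, 20, 30]
```

With strides `(0, 1)` (a row broadcast of the same buffer) both computations return the same list, which matches the commit's remark that row broadcasts happened to produce correct results.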
Validated against NumPy 2.4.2 output across 80+ individual checks:
- Column, row, scalar, 3D, 4D, 5D broadcast reshapes
- Slice→broadcast→reshape, broadcast→slice→reshape chains
- Step slices, reverse slices, non-contiguous sources
- Double/triple reshape chains, copy equivalence
- All access patterns: flat, ToString, ravel, multi-dim indexing, copy

Also adds comprehensive np.reshape test suite (61 tests) covering basic reshapes, -1 dimension inference, view semantics, scalar/empty arrays, sliced+reshape, broadcast+reshape, all 12 dtypes, large arrays, error cases, static vs instance API, transposed arrays.

* test: add swapaxes OpenBugs 66-69 with NumPy verification

Bug 66 (3 tests): swapaxes produces C-contiguous strides instead of permuted strides. For arange(24).reshape(2,3,4) with strides [12,4,1], swapaxes(0,2) should give [1,4,12] but gives [6,2,1]. Root cause: Default.Transpose.cs allocates new C-contiguous storage and copies data via MultiIterator.Assign, discarding the permuted strides. Direct consequence of Bug 64 (transpose copies instead of returning a view).

Bug 67 (1 test): swapaxes on 0D scalar succeeds instead of throwing. NumPy scalar has shape=(), ndim=0 so any axis is out of bounds. NumSharp represents scalars as shape=[1], ndim=1, so swapaxes(0,0) is valid.

Bug 68 (2 tests): swapaxes on empty arrays (shape with 0 dimension) crashes with InvalidOperationException from NDIterator. NumPy handles this correctly by just swapping dimensions. Resolves automatically when Bug 64 is fixed (no iteration needed for view).

Bug 69 (2 tests): Out-of-bounds axis throws IndexOutOfRangeException (accidental leak from array access) instead of descriptive AxisError. Root cause: check_and_adjust_axis only adjusts negative indices but never validates bounds.

* test: migrate test framework from MSTest to TUnit 1.13.11

Migrate the test suite (156 files, ~2,076 tests) from MSTest to TUnit, a modern .NET testing framework using source generators instead of reflection.
**csproj changes:**
- Add TUnit 1.13.11 as test framework (source-generated test discovery)
- Add OutputType=Exe (required by TUnit's Microsoft.Testing.Platform)
- Add TUnitAssertionsImplicitUsings=false (prevent TUnit.Assertions.Assert from conflicting with MSTest's Assert class)
- Remove MSTest.TestAdapter 2.1.1 (replaced by TUnit engine)
- Remove Microsoft.NET.Test.Sdk 16.7.1 (replaced by Microsoft.Testing.Platform)
- Remove coverlet.collector 1.3.0 (incompatible with TUnit)
- Keep MSTest.TestFramework 2.1.1 for Assert.* compatibility (1,252 calls across 85+ files; converting these risks argument-reorder bugs)
- Keep AwesomeAssertions 9.3.0 (~3,689 .Should() calls unchanged)

**New files:**
- global.json: Microsoft.Testing.Platform runner config, required for `dotnet test` on .NET 10 SDK (MTP mode replaces VSTest)
- AssemblyAttributes.cs: [assembly: NotInParallel] disables TUnit's default parallel execution for safety (MSTest ran sequentially)

**Attribute replacements across 156 test files:**
- [TestClass] → deleted (152 lines, TUnit doesn't need class-level markers)
- [TestMethod] → [Test] (2,056 occurrences)
- [DataTestMethod] → [Test] (11 files)
- [DataRow(] → [Arguments(] (195 parameterized test rows)
- [TestCategory(] → [Category(] (34 occurrences)
- [Ignore] / [Ignore("...")] → [Skip("...")] (12 occurrences)
- [ExpectedException(...)] → deleted (3 in np.any.Test.cs, all OpenBugs)
- [TestMethod, Ignore("...")] → [Test, Skip("...")] (combined attrs)
- [TestMethod, Timeout(10000)] → [Test, TUnit.Core.Timeout(10000)] with CancellationToken parameter (TUnit requirement)

**Compile-time fixes:**
- TestClass.cs: Fully qualify System.Reflection.Assembly in 3 methods to resolve ambiguity with TUnit's HookType.Assembly enum member
- Shape.Test.cs: Add CancellationToken parameter + System.Threading using for TUnit's [Timeout] attribute requirement

**Test results (both net8.0 and net10.0):**
total: 2,076 | passed: 2,040 | failed: 25 (all pre-existing) | skipped: 11

All
25 failures are pre-existing dead-code/known-bug tests (AND/OR operators, isnan/isfinite/isclose/allclose, memory allocation, broadcast/newaxis).

**Usage changes:**
- `dotnet test --project <path> --treenode-filter "/*/*/*/*[Category!=OpenBugs]"` replaces the old `--filter "TestCategory!=OpenBugs"` syntax
- `dotnet run --project <path> -- --treenode-filter ...` also works directly

* test: enable parallel execution, add WindowsOnly auto-skip, update CI for TUnit

Enable TUnit's default parallel test execution by removing the [assembly: NotInParallel] guard. Tests run ~43% faster in parallel (~8s vs ~14s sequential for 2,076 tests).

**Parallel race condition fixes:**
- np.load.Test.cs: Add [NotInParallel] on NumpyLoad class; tests share a read-only data file (data/1-dim-int32_4_comma_empty.npy) that np.Load opens with exclusive access
- np.tofromfile.Test.cs: Fix copy-paste bug in NumpyToFromFileTestUShort1 that used nameof(NumpyToFromFileTestByte1); both tests wrote to the same file "test.NumpyToFromFileTestByte1" causing race conditions

**WindowsOnly platform auto-skip:**
- Add WindowsOnlyAttribute (extends TUnit.Core.SkipAttribute) that auto-skips tests on non-Windows via OperatingSystem.IsWindows()
- Replace [Category("WindowsOnly")] with [WindowsOnly] on 3 bitmap test classes (BitmapExtensionsTests, BitmapWithAlphaTests, OpenBugsBitmap)
- Eliminates need for separate CI filter logic per OS

**CI workflow update (build-and-release.yml):**
- Switch from `dotnet test --filter "TestCategory!=OpenBugs"` (VSTest) to `dotnet run -- --treenode-filter "/*/*/*/*[Category!=OpenBugs]"` (MTP)
- Remove per-OS filter matrix (WindowsOnly now handled by runtime skip)
- Simplify matrix to just os: [windows-latest, ubuntu-latest, macos-latest]
- Add --report-trx for TRX artifact upload

**Stability:** 8 consecutive runs (5 net10.0 + 3 net8.0), all identical:
2,076 total | 2,040 passed | 25 failed (pre-existing) | 11 skipped

Closes #539

* ci: run tests against both net8.0 and
net10.0 in CI

The previous CI config used `dotnet run` without --framework, which only runs one TFM. Split into two explicit steps (net8.0 and net10.0) to ensure both target frameworks are tested on all 3 OS runners.

* test: optimize top 10 slowest tests (1.5s saved per run)

Targeted optimizations on the tests dominating wall-clock time:

**Allocate_1GB (1,113ms → 70ms, 16x faster):**
np.ones → np.empty. The test verifies large allocation succeeds, not that 4GB of memory is filled with ones

**GcDoesntCollectArraySliceAlone (361ms → 95ms, 3.8x faster):**
Reduce iterations from 100K+1M to 10K+100K. That is still 110K allocations with GC.Collect + sleep, more than sufficient to test GC correctness

**Dot product tests (removed redundant work):**
- Remove Console.WriteLine(np.dot(x,y).ToString(false)) calls that recomputed the entire dot product AND stringified the result array
- Dot2x2, Dot2222x2222, Dot3412x5621, Dot311x511: each was calling np.dot twice, once for debug output and once for assertion
- Dot30_300x30_300: remove Stopwatch + Console.WriteLine benchmark scaffolding; the test just verifies the operation completes

**Net effect on total suite (2,076 tests, Release, parallel):**
Before: ~8.0s wall clock
After: ~6.6s wall clock (18% faster)

* perf: optimize contiguous slices to use offset InternalArray alias

When GetView() produces a slice that describes a contiguous memory block, create an offset InternalArray alias instead of a ViewInfo-based alias. This makes IsContiguous=true for the result, enabling:
- Fast-path NDIterator (pointer increment vs GetOffset per element)
- Efficient ravel/flatten (can return view instead of copy)
- Proper copyto semantics

**Contiguity detection algorithm:**
Scan SliceDefs right-to-left. Trailing dimensions must be fully taken (Start=0, Step=1, Count=origDim). First partially-taken dimension must have Step=1 (or Count<=1). All dimensions left of that must have Count=1.
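The right-to-left scan described above can be written out as a short plain-Python sketch. The `SliceDef` tuple here is a hypothetical stand-in with only the three fields the description mentions (Start, Step, Count), not NumSharp's actual type, and zero-length slices are deliberately left out of scope.

```python
from typing import NamedTuple, Sequence


class SliceDef(NamedTuple):
    """Hypothetical per-dimension slice record: start, step, element count."""
    start: int
    step: int
    count: int


def is_contiguous_slice(defs: Sequence[SliceDef], orig_dims: Sequence[int]) -> bool:
    i = len(defs) - 1
    # 1. Skip trailing dimensions that are taken in full.
    while i >= 0 and defs[i] == SliceDef(0, 1, orig_dims[i]):
        i -= 1
    if i < 0:
        return True  # every dimension fully taken: the whole array
    # 2. The first partially-taken dimension needs Step=1 (or at most one element).
    if defs[i].step != 1 and defs[i].count > 1:
        return False
    # 3. Every dimension to its left must select exactly one element.
    return all(defs[j].count == 1 for j in range(i))


# arr[0, :] on a (3, 4) array: one row, full trailing dimension.
print(is_contiguous_slice([SliceDef(0, 1, 1), SliceDef(0, 1, 4)], (3, 4)))  # True
# arr[:, 0] on a (3, 4) array: partial trailing dimension, 3 rows to its left.
print(is_contiguous_slice([SliceDef(0, 1, 3), SliceDef(0, 1, 1)], (3, 4)))  # False
```

The two calls mirror the row-slice and column-slice cases this commit lists as optimized and unchanged, respectively.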
Examples of contiguous slices now optimized:
- arr[0, :]: first row of 2D (was ViewInfo, now offset alias)
- arr[:5]: prefix slice (was ViewInfo, now offset alias)
- arr[2:4, :, :]: row range of 3D (was ViewInfo, now offset alias)

Non-contiguous slices unchanged (still use ViewInfo):
- arr[::2]: stepped slice
- arr[:, 0]: column slice (non-trailing partial dim)

* fix: compute IsContiguous from strides using NumPy algorithm (Phase 1)

Replaces flag-based IsContiguous computation with stride-based analysis matching NumPy's C_CONTIGUOUS algorithm (flagsobject.c:116-160).

## Changes

### Shape.cs
- Add ComputeIsContiguousFromStrides() implementing NumPy algorithm: scan right-to-left, stride[-1]=1, stride[i]=shape[i+1]*stride[i+1], skip size-1 dimensions, empty arrays (dim=0) are contiguous
- IsContiguous property now calls stride-based computation
- GetCoordinates uses dimension-based decomposition for IsSliced shapes (strides may have gaps from step!=1 slices)
- Slice() computes actual memory strides: origin.strides[i] * step
  enabling correct contiguity detection for step-2, reversed slices
- TransformOffset checks ModifiedStrides (transposed shapes need GetOffset)

### Default.Transpose.cs
- Returns view instead of copy (NumPy semantics)
- Identity case (axis==start) returns array itself, not clone
- Empty arrays: just permute dimensions, no data copy
- Add axis bounds checking with AxisOutOfRangeException
- Broadcastable arrays can use view (zero strides preserved)
- Sliced/already-transposed arrays still need clone

### NdArray.ReShape.cs
- Non-contiguous arrays (transposed/sliced) copy before reshape
  matching NumPy behavior where reshape of non-contiguous returns copy

### NDArray.flatten.cs
- Add ModifiedStrides check for correct element ordering (transposed arrays must use ravel path)

### UnmanagedStorage.Slicing.cs
- Enhanced documentation for contiguous slice optimization
- Contiguous slices use InternalArray.Slice(offset, count) with clean shape enabling
  Address to point to correct location

### UnmanagedStorage.Cloning.cs
- CloneData uses IsContiguous instead of checking flags separately
  now correctly handles transposed arrays

### Shape.Reshaping.cs
- ViewInfo setup extended for ModifiedStrides (transposed shapes)
  ensures GetOffset correctly transforms through parent

### Tests
- NdArray.Transpose.Test: expect view semantics (shares memory)
- Add Shape.IsContiguous.Test.cs with comprehensive test cases

## Test Results
- Failures: 217 → 141 (-76)
- All IsContiguous behaviors verified against NumPy

## Architecture Note
Views (IsSliced || IsBroadcasted) return IsContiguous=false because Address doesn't account for view offset. Contiguous slice optimization creates offset InternalArray with clean shape, making Address correct. This bridges NumPy's offset+strides model with NumSharp's ViewInfo model.

* refactor: align Shape.GetOffset with NumPy architecture

NumPy-aligned offset calculation replaces complex ViewInfo traversal:
- Element access now uses simple formula: offset + sum(indices * strides)
- Offset computed at slice time, strides include step factor
- stride=0 handles broadcast repetition

Removed ~200 lines of legacy code:
- GetOffset_broadcasted, GetOffset_broadcasted_1D
- GetOffset_IgnoreViewInfo
- resolveUnreducedBroadcastedShape

Added:
- IsSimpleSlice property for fast-path documentation
- Offset preservation in DefaultEngine.Broadcast()
- 32 parity tests verifying NumPy behavior

Test results: 123 failures (6 fewer than baseline 129)

The removed recursive slice handling was always fragile; NumPy handles reshape-of-slice differently (copies if non-contiguous).

* fix: template to match new broadcast refactor

* feat(benchmark): add comprehensive benchmark suite for NumSharp vs NumPy

Add a complete benchmark infrastructure for comparing NumSharp performance against NumPy baselines using BenchmarkDotNet and Python.
## Structure
- benchmark/NumSharp.Benchmark.GraphEngine/ - C# BenchmarkDotNet project
- benchmark/NumSharp.Benchmark.Python/ - NumPy baseline benchmarks
- benchmark/scripts/ - Helper scripts for result merging
- benchmark/run-benchmarks.ps1 - Main runner with report generation

## Benchmark Suites (130+ operations)
- Arithmetic: +, -, *, /, % with element-wise and scalar variants
- Unary: sqrt, abs, exp, log, sin, cos, tan, etc.
- Reduction: sum, mean, var, std, min, max, argmin, argmax
- Broadcasting: scalar, row, column, 3D patterns
- Creation: zeros, ones, empty, full, copy, *_like
- Manipulation: reshape, transpose, ravel, flatten, stack
- Slicing: contiguous, strided, reversed views
- MultiDim: 1D vs 2D vs 3D performance comparison
- Dispatch: comparison of dispatch mechanisms (DynamicMethod, static, struct)
- Fusion: multi-pass vs fused kernel patterns

## Array Sizes
- Scalar (1): pure overhead measurement
- Tiny (100): common small collections
- Small (1K): L1 cache tier
- Medium (100K): L2/L3 cache tier
- Large (10M): memory-bound throughput

## Features
- Interactive menu for selecting benchmark suites
- Automated report generation (markdown, JSON, CSV)
- README.md auto-updates with latest results when present
- Matching methodology: same operations, sizes, seeds as NumPy
- All 12 NumSharp data types supported

* docs: add NEP reference documentation for NumPy 2.x compliance

Add comprehensive documentation for 24 NumPy Enhancement Proposals (NEPs) relevant to NumSharp's goal of 1-to-1 NumPy 2.x behavioral compatibility.
Documentation structure:
- README.md: Index with priority tiers, quick reference, implementation roadmap
- Individual NEP files: Detailed analysis of each proposal

Priority classifications:
- CRITICAL (NumPy 2.0 breaking): NEP 50 (type promotion), NEP 52 (API cleanup), NEP 56 (Array API standard)
- HIGH (significant impl): NEP 01 (.npy format), NEP 07 (datetime), NEP 19 (RNG), NEP 27 (zero-rank), NEP 38/54 (SIMD)
- MEDIUM (behavioral): NEP 05/20 (gufuncs), NEP 10 (iterator), NEP 21 (indexing), NEP 34 (ragged), NEP 42/43 (dtypes), NEP 51 (scalar repr)
- LOW (informational): NEP 13/18 (Python dispatch), NEP 32 (remove financial), NEP 49 (allocators), NEP 53 (C-API)

Includes .NET SIMD implementation patterns and NumPy 1.x vs 2.x quick reference.

Related: #547, #544, #545, #529 (NumPy 2.x Compliance milestone)

* fix(benchmark): remove duplicate broadcasting tests from AddBenchmarks

Broadcasting tests (row/column vector) were duplicated between AddBenchmarks and BroadcastBenchmarks. The Byte type failed on broadcasting operations, causing benchmark failures.

Changes:
- Remove _matrix, _rowVector, _colVector fields
- Remove Add_RowBroadcast and Add_ColBroadcast benchmark methods
- Update docstring to note that broadcasting is in BroadcastBenchmarks

BroadcastBenchmarks.cs already covers these scenarios with float64 only, avoiding the type compatibility issues. AddBenchmarks now focuses on element-wise and scalar operations across all ArithmeticTypes.

* feat(benchmark): add exploration benchmark source files

Adds source files for the NumSharp.Benchmark.Exploration project, a standalone benchmark suite for isolated performance experiments.
Structure:
- Infrastructure/: BenchFramework (timing), BenchResult (data model), OutputFormatters (CSV/JSON/MD), SimdImplementations (SIMD patterns)
- Isolated/: Self-contained micro-benchmarks for specific scenarios
  - SizeThresholds: Find N where SIMD overhead breaks even
  - BroadcastScenarios: Isolated broadcast pattern benchmarks
  - SimdStrategies: Compare Vector<T> vs AVX2 vs loop
  - DispatchOverhead: Measure call overhead
  - MemoryPatterns: Sequential vs strided access
  - CombinedOptimizations: Multi-optimization combinations
- Integration/: NumSharpBroadcast tests against real NumSharp
- BenchmarkDotNet/: BenchmarkDotNet-formatted broadcast tests
- Python/: NumPy baseline script for comparison
- Results/: Output directory (.gitkeep, ignore generated files)

Purpose: Exploration benchmarks help identify optimization opportunities before implementing them in the main NumSharp codebase. They provide isolated measurements without NumSharp's dispatch overhead.

* refactor(Shape): make readonly struct with ArrayFlags, remove ViewInfo/BroadcastInfo

Shape is now a `readonly struct` with immutable fields computed at construction. This aligns with NumPy's architecture where ndarray metadata is set once.
Key changes:
- Added ArrayFlags enum (C_CONTIGUOUS, OWNDATA, ALIGNED, WRITEABLE, BROADCASTED) matching numpy/core/include/numpy/ndarraytypes.h flag definitions
- Replaced mutable ViewInfo/BroadcastInfo reference types with value fields:
  - offset (int): starting position in underlying buffer
  - bufferSize (int): size of the original buffer for view tracking
  - _flags (int): cached ArrayFlags computed at construction
- IsContiguous is now a cached flag (was computed on every access)
- IsBroadcasted is now a cached flag (was BroadcastInfo != null)
- IsSliced computed from offset/bufferSize/ModifiedStrides (was ViewInfo != null)
- GetOffset simplified to pure `offset + sum(indices * strides)` formula, eliminating recursive ViewInfo coordinate resolution
- Reshape returns new Shape via constructor, preserving offset/bufferSize
- Deleted BroadcastInfo.cs and ViewInfo.cs; their data is now encoded directly in Shape's immutable fields

Breaking: ViewInfo, BroadcastInfo, ChangeTensorLayout, ComputeHashcode, IsRecursive, and mutable strides/dimensions are removed.

* refactor(storage): align slicing, getters, cloning, transpose, broadcasting with readonly Shape

Adapts all storage and backend code to work with the immutable Shape struct.
Slicing: - Removed 50-line contiguous slice optimization from GetViewInternal - All slices now return Alias(slicedShape) views using offset+strides - Shape.Slice() computes correct offset/strides, Alias shares InternalArray Getters: - Replaced BroadcastInfo.OriginalShape.size with Shape.bufferSize - Replaced IsSliced guard with !IsContiguous for memory slicing decisions Cloning: - CloneData now accounts for Shape.offset when cloning contiguous sliced views (previously would copy from buffer start, ignoring view offset) Reshaping: - Added copy-on-reshape for non-contiguous arrays (NumPy behavior): materializes data before reshaping stepped/transposed arrays ToArray: - Fixed contiguous Buffer.MemoryCopy to start from Address + offset Transpose: - Now creates immutable permuted Shape via constructor — always O(1) view - Removed data cloning path for sliced/transposed arrays Broadcasting: - Rewrote ResolveShapes to compute broadcast strides directly as int[] - Uses Shape.WithFlags() to set BROADCASTED flag on readonly Shape - Removed all BroadcastInfo/ViewInfo mutation Flatten: - Unified into Manipulation/NDArray.flatten.cs (deleted duplicate in Creation/) - Always copies via CloneData (matches NumPy: flatten always returns copy) NDArray constructors: - Removed ChangeTensorLayout calls (C-order only, parameter accepted but ignored) Indexing.Selection: - Removed ViewInfo mutation from fancy indexing flatten path * refactor(math): use IsContiguous instead of !IsSliced for linear access checks All 70 generated math operation files (Add, Subtract, Multiply, Divide, Mod x 14 dtypes each) plus the template now use: leftLinear = leftshape.IsContiguous && !leftshape.IsBroadcasted instead of the previous: leftLinear = !leftshape.IsBroadcasted && !leftshape.IsSliced This is semantically correct: IsContiguous (cached flag from strides) is the right predicate for "can we do linear pointer arithmetic", whereas IsSliced could be true for contiguous views (e.g. a[2:7] with stride=1). 
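The predicate change above rests on contiguity being a property of the strides, not of whether a view was sliced. The following is a minimal C# sketch of such a check, not NumSharp's actual implementation: `IsCContiguous` is a hypothetical helper, and strides are assumed to be counted in elements rather than bytes.

```csharp
using System;

// Hypothetical helper (not NumSharp's code): row-major contiguity derived
// purely from dimensions and strides, with strides counted in elements.
static bool IsCContiguous(int[] dims, int[] strides)
{
    int expected = 1;                         // stride a packed row-major layout would have
    for (int i = dims.Length - 1; i >= 0; i--)
    {
        if (dims[i] == 1) continue;           // size-1 axes cannot break contiguity
        if (strides[i] != expected) return false;
        expected *= dims[i];
    }
    return true;
}

// a = arange(10); a[2:7] has dims {5}, strides {1} (and offset 2): sliced but contiguous.
Console.WriteLine(IsCContiguous(new[] { 5 }, new[] { 1 }));         // True
// a[::2] has dims {5}, strides {2}: sliced and not contiguous.
Console.WriteLine(IsCContiguous(new[] { 5 }, new[] { 2 }));         // False
// Transpose of a (3,4) array has dims {4,3}, strides {1,4}.
Console.WriteLine(IsCContiguous(new[] { 4, 3 }, new[] { 1, 4 }));   // False
```

The step-1 slice passes the check even though it is a view at a non-zero offset, which is the case the old `!IsSliced` predicate sent down the slow path.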
* test: update tests for readonly Shape and removed ViewInfo/BroadcastInfo APIs

- Shape.Test.cs: Removed ChangeTensorLayout test (C-order only), updated HashcodeComputation for immutable struct (no ComputeHashcode call), fixed HashcodeScalars to use constructor instead of mutating readonly offset
- Shape.OffsetParity.Tests.cs: Updated for new Shape constructor API
- NDArray.View.Test.cs: Updated view tests for readonly Shape behavior
- UnmanagedStorage.ReshapeView.Tests.cs: Updated reshape view assertions
- FluentExtensionTests.cs: Updated BeSliced test to use step slice (::2) which is non-contiguous, matching new IsSliced semantics
- StringApiTests.cs: Replaced BeSliced() with IsContiguous assertions for column slices (offset=0 but stride!=1)
- GetData/SetData tests: Updated for readonly Shape
- NDArray.Indexing.Test.cs: Updated for ViewInfo removal
- NumSharp.UnitTest.csproj: Test project dependency updates

* docs(CLAUDE.md): update test framework refs from MSTest to TUnit

- Updated build & test commands for TUnit --reflection mode
- Changed TestCategory attribute references to Category (TUnit)
- Added treenode-filter syntax for OpenBugs/WindowsOnly categories
- Added output formatting recipes for grep-based result filtering
- Added OpenBugs.ApiAudit.cs to known OpenBugs files list
- Fixed test suite Q&A: MSTest -> TUnit framework

* feat(broadcast): add write protection for broadcast arrays (NumPy alignment)

Broadcast arrays in NumPy are read-only because multiple logical positions map to the same physical memory location (stride=0). Writing to them would corrupt shared data. This commit aligns NumSharp with that behavior.

Changes:
- Add NumSharpException.ThrowIfNotWriteable() matching PyArray_FailUnlessWriteable
- Add ThrowReadOnly() with standard NumPy error format "X is read-only"
- Add write protection checks to all NDArray indexer setters
- Add write protection to UnmanagedStorage setter methods
- Add write protection to MultiIterator.Assign() methods
- Add write protection to np.copyto() and np.clip() @out parameter

NumPy equivalent error:
    ValueError: assignment destination is read-only

NumSharp error:
    NumSharpException: assignment destination is read-only

* refactor(Shape): remove deprecated ModifiedStrides, use IsContiguous flag

Complete the readonly Shape architecture by removing the deprecated ModifiedStrides field. The IsContiguous cached flag (computed from strides at construction) provides the same information more efficiently.

Changes:
- Remove Shape.ModifiedStrides field entirely
- Update IsSliced property to use !IsContiguous instead of ModifiedStrides
- Add explicit scalar Shape() constructor with proper flag initialization
- Fix IsWriteable computation: broadcast shapes (stride=0) are now read-only
- Remove modifiedStrides parameter from internal Shape constructor
- Update all Shape constructor call sites (Transpose, Storage, Reshape)

This aligns with NumPy's architecture where:
- IsContiguous = C_CONTIGUOUS flag (stride pattern matches row-major)
- IsSliced = view into different buffer region (offset, size, or non-contiguous)
- IsWriteable = not a broadcast array (no stride=0 dimensions)

* fix(iterator): check Shape.offset for sliced view iteration paths

NDIterator must use the slower coordinate-based iteration path when the shape has a non-zero offset, not just when !IsContiguous. A contiguous slice with offset > 0 still needs proper offset handling.

Changes:
- Change condition from !Shape.IsContiguous to !Shape.IsContiguous || Shape.offset != 0
- Apply to NDIterator.cs base template
- Apply to all 12 type-specific NDIterator.Cast.*.cs generated files
- Minor cleanup in NDArray.String.cs for readonly Shape compatibility

This fixes iteration over sliced views like arr["2:5"] where the slice is contiguous but starts at offset 2 in the underlying buffer.

* test: update broadcast and indexing tests for readonly Shape architecture

Update test files to work with the readonly Shape refactoring and write protection for broadcast arrays.

Changes:
- NpBroadcastFromNumPyTests.cs: Fix test assertions and method imports
- NDArray.Indexing.Test.cs: Update tests for write protection behavior
- CLAUDE.md: Documentation updates for new architecture

* feat(NDArray): add `base` property for NumPy-compatible view tracking

Implements the NumPy-aligned `ndarray.base` property chain for tracking view ownership. All views chain to the ultimate owner (not intermediate views), matching NumPy semantics.

Storage-level:
- Add `_baseStorage` internal field to UnmanagedStorage
- Add `BaseStorage` public property (read-only by design)
- Add `IsView` convenience property (equivalent to BaseStorage != null)
- Update all three `Alias()` overloads to propagate base reference
- Update `CreateBroadcastedUnsafe(storage, shape)` for base tracking
- Update `GetData()` slicing to chain to ultimate owner

NDArray-level:
- Add `@base` property returning NDArray wrapper of BaseStorage
- Document semantic difference from NumPy: property returns new wrapper each call (not cached), but Storage reference equality holds

Affected operations that now track base:
- Slicing via indexer (a["2:5"])
- Selection getter (fancy indexing)
- Reshape (when returning view)
- Alias() for explicit view creation
- Broadcast operations

This enables:
- View detection: `arr.@base != null` or `arr.Storage.IsView`
- Memory debugging: trace which array owns shared data
- NumPy-compatible semantics for view chains

* docs(CLAUDE.md): document Shape architecture and ArrayFlags

Updates project documentation to reflect readonly struct Shape design:

Shape architecture section:
- Document internal fields (dimensions, strides, offset, bufferSize, _flags)
- Document ArrayFlags enum values matching NumPy's ndarraytypes.h
- Document key O(1) properties: IsContiguous, IsBroadcasted, IsWriteable, IsSliced, IsSimpleSlice

Key design decisions:
- Add Shape readonly struct entry
- Add broadcast write protection entry
- Update C-order description to reference ArrayFlags.C_CONTIGUOUS

Capability reference updates:
- Fix np.cumsum location (APIs/np.cumsum.cs, not NDArray.cumsum.cs duplicate)
- Add missing Math functions (add, subtract, multiply, divide, mod, etc.)
- Fix Sorting paths (Sorting_Searching_Counting/ not Sorting/)
- Update np.roll status: fully implemented (was partial)

Test filtering:
- Update treenode-filter examples for 4-level path pattern

* test: remove Option2Fix category from validated contiguity tests

The IsContiguous fix has been validated - these tests now pass and should run unconditionally as part of the normal test suite.

Tests promoted to regular execution:
- IsContiguous_Step1Slice1D
- IsContiguous_RowSlice2D
- IsContiguous_SingleRow2D
- IsContiguous_SingleRowPartialCol2D
- IsContiguous_SingleElement1D
- IsContiguous_3D_RowSlice
- IsContiguous_3D_SingleRowPartialCol
- IsContiguous_SliceOfContiguousSlice
- IsContiguous_SliceOfSteppedSlice_SingleElement
- ViewSemantics_Step1Slice1D_MutationPropagates
- ViewSemantics_RowSlice2D_MutationPropagates
- ViewSemantics_SingleRowPartialCol_MutationPropagates
- ViewSemantics_SliceOfContiguousSlice_MutationPropagates
- Ravel_ContiguousSlice1D_IsView
- Ravel_ContiguousRowSlice2D_IsView
- Copyto_ContiguousSlice_FastPath
- ContiguousSlice_Float64/Float32/Byte/Int64_Values
- FullSlice_IsContiguous
- ContiguousSlice_ThenReshape_Values

These verify NumPy-aligned behavior: step-1 slices are marked contiguous.

* test: migrate from [Category("OpenBugs")] to typed [OpenBugs] attribute

Replaces string-based category with typed attribute defined in TestCategory.cs for better IDE support and compile-time validation.

Files updated:
- Issues/448.cs
- Logic/np.any.Test.cs
- Logic/np_all_axis_Test.cs
- Manipulation/np.ravel.Test.cs
- Manipulation/np.reshape.Test.cs
- Manipulation/np.roll.Test.cs
- OpenBugs.Bitmap.cs
- OpenBugs.cs (class-level attribute)
- Selection/NDArray.Indexing.Test.cs

Both forms work with TUnit's --treenode-filter:
    /*/*/*/*[Category!=OpenBugs]

The typed attribute is preferred for:
- Compile-time typo detection
- IDE autocomplete and navigation
- Consistent usage across the codebase

* style: add braces to single-statement conditionals in reduction methods

Adds explicit braces to if statements in reduction axis-handling code for consistency with project coding conventions.

Files: Default.Reduction.{AMax,AMin,Add,Mean,Product,Std,Var}.cs

* test: add comprehensive .base property tests and typed category attributes

Add test coverage for the NumPy-compatible .base property:

NDArray.Base.Test.cs (35 tests):
- NumPy behavior: owned arrays have null base, views chain to owner
- View chaining: slice-of-slice chains to ultimate owner (not intermediate)
- Copy ownership: copy() creates owned array with null base
- Operations: reshape, transpose, broadcast_to, expand_dims
- Edge cases: scalar, 0-d, empty arrays
- All 12 dtypes verification

NDArray.Base.MemoryLeakTest.cs:
- Memory lifecycle: views keep base alive
- Concurrent access safety
- Finalization ordering
- [Misaligned] test for broadcast-then-slice materialization

TestCategory.cs:
- [OpenBugs] - known failing tests (excluded from CI)
- [Misaligned] - NumSharp differs from NumPy (runs, documents difference)
- [WindowsOnly] - platform-specific tests (GDI+/System.Drawing)

These typed attributes replace string-based [Category("...")] for better IDE support and compile-time checking.

* docs: add .base property storage-level implementation plan

Documents the design and implementation approach for NumPy-compatible .base property tracking at the UnmanagedStorage level.

Key decisions documented:
- Storage-level _baseStorage field chains to ultimate owner
- Memory safety via shared Disposer (not base reference)
- Read-only BaseStorage property (prevents ownership corruption)
- Known limitation: broadcast slicing materializes data

Code paths analyzed:
- Alias() overloads for view creation
- GetData() for contiguous and broadcast paths
- CreateBroadcastedUnsafe() for broadcast operations

This plan was executed in commit ea8fef5.

* ci: fix test step to only test net10.0 (matching test project)

The test project currently only targets net10.0 (net8.0 commented out with TODO note about TUnit compatibility). The CI was trying to test both frameworks, causing "No such file or directory" failures because the net8.0 executable doesn't exist.

Aligns CI with test project's actual target framework until net8.0 support is re-enabled in NumSharp.UnitTest.csproj.

* test: fix [Ignore] → [Skip] for TUnit compatibility

MSTest's [Ignore] attribute was not migrated to TUnit's [Skip] for StringArraySample1 test. TUnit doesn't recognize [Ignore], causing the test to run instead of being skipped.

Fixes CI test execution where this test was unexpectedly running.

* test: convert remaining [Ignore] → [OpenBugs] for TUnit compatibility

Complete the MSTest → TUnit migration by replacing all remaining [Ignore] attributes with [OpenBugs]. TUnit does not recognize MSTest's [Ignore] attribute, causing tests to run instead of being skipped.

Changes across 16 test files:
- AllocationTests.cs: 2GB/4GB/44GB allocation tests (Int32 limit)
- ReduceAddTests.cs: keepdims returns wrong shape
- np.dot.Test.cs: high-dimensional array bugs
- np.matmul.Test.cs: ArgumentOutOfRangeException crashes
- np.allclose.Test.cs: depends on unimplemented np.isclose
- np.isclose.Test.cs: returns null (dead code)
- np.isfinite.Test.cs: returns null (dead code)
- np.isnan.Test.cs: returns null (dead code)
- NDArray.flat.Test.cs: IsBroadcasted flag bug
- np.moveaxis.Test.cs: wrong shape returned
- NdArray.Convolve.Test.cs: returns null (dead code)
- NDArray.AND.Test.cs: returns null (dead code)
- NDArray.OR.Test.cs: returns null (dead code)
- NDArray.Indexing.Test.cs: slice/newaxis bugs
- NdArray.Mean.Test.cs: keepdims wrong shape
- Shape.OffsetParity.Tests.cs: contiguous slice optimization

All tests now properly excluded from CI via --treenode-filter "/*/*/*/*[Category!=OpenBugs]" instead of silently failing.

* ci: fix WindowsOnly test filtering on non-Windows platforms

Problem:
- Job names showed ugly filters: "test (ubuntu-latest, & TestCategory!=WindowsOnly)"
- WindowsOnly tests were running on Ubuntu/macOS and failing
- Two conflicting WindowsOnlyAttribute classes caused namespace shadowing

Root cause:
- Commit 3c8350b added WindowsOnlyAttribute : CategoryAttribute in TestCategory.cs
- This shadowed the existing Utilities/WindowsOnlyAttribute : SkipAttribute
- Tests resolved [WindowsOnly] to the CategoryAttribute version (no skip behavior)
- The CI workflow was simplified to remove the extra_filter matrix

Fix:
- Remove Utilities/WindowsOnlyAttribute.cs (eliminates namespace conflict)
- Compute filter dynamically in workflow step using $RUNNER_OS
- OpenBugs: excluded on all platforms (global)
- WindowsOnly: excluded only on non-Windows (conditional)

Result:
- Clean job names: "test (ubuntu-latest)", "test (windows-latest)", etc.
- WindowsOnly tests correctly skipped on Ubuntu/macOS
- Single [WindowsOnly] attribute with clear semantics

* ci: fix treenode-filter syntax for macOS (avoid & operator)

* ci: try && for filter conjunction (TUnit treenode-filter)

* ci: try separate bracket blocks for filter AND logic

* ci: switch back to dotnet test --filter (MSTest-style syntax)

* ci: fix dotnet test syntax for .NET 10 (--project flag)

* ci: try AND operator with parens for TUnit filter

* ci: fix WindowsOnly test skip with runtime SkipAttribute

TUnit's --treenode-filter doesn't support compound filters with & or AND operators reliably across platforms. Instead of CI filtering:

1. Add SkipOnNonWindowsAttribute (extends TUnit's SkipAttribute)
   - Runtime check: OperatingSystem.IsWindows()
   - Auto-skips WindowsOnly tests on non-Windows
2. Bitmap test classes now use both attributes:
   - [WindowsOnly] - CategoryAttribute for categorization/documentation
   - [SkipOnNonWindows] - SkipAttribute for runtime skip
3. Simplified CI workflow:
   - Single --treenode-filter for OpenBugs only (all platforms)
   - WindowsOnly handled at runtime by SkipOnNonWindows
   - Clean job names without platform-specific filters

* ci: restore shell: bash for cross-platform line continuation
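The write-protection entry in the commit message above argues that stride=0 makes several logical positions share one physical slot. A small self-contained C# sketch of that aliasing follows. It uses a plain `int[]` in place of NumSharp storage, and the buffer, strides and the unguarded write are illustrative assumptions, not library code.

```csharp
using System;

int[] buffer = { 10, 20, 30 };   // storage for the column [[10], [20], [30]]
int[] strides = { 1, 0 };        // broadcast to (3, 3): the column axis gets stride 0

// Logical [0, 0], [0, 1] and [0, 2] all resolve to the same physical slot.
int target = 0 * strides[0] + 1 * strides[1];   // physical index of logical [0, 1]
buffer[target] = 99;                              // what an unguarded write would do

for (int col = 0; col < 3; col++)
{
    int index = 0 * strides[0] + col * strides[1];
    Console.WriteLine($"[0, {col}] -> {buffer[index]}");   // every column of row 0 reads 99
}
```

A single assignment to `[0, 1]` changes all three elements of the first row. That is the corruption the read-only check is meant to rule out.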
1 parent 191c914 commit e71ccf5

File tree

367 files changed

+67264
-46186
lines changed


.claude/CLAUDE.md

Lines changed: 151 additions & 23 deletions
@@ -44,9 +44,44 @@ np Static API class (like `import numpy as np`)
 | Decision | Rationale |
 |----------|-----------|
 | Unmanaged memory | Benchmarked fastest ~5y ago; Span/Memory immature then |
+| C-order only | Only row-major (C-order) memory layout. Uses `ArrayFlags.C_CONTIGUOUS` flag. No F-order/column-major support. The `order` parameter on `ravel`, `flatten`, `copy`, `reshape` is accepted but ignored. |
 | Regen templating | ~200K lines generated for type-specific code |
 | TensorEngine abstract | Future GPU/SIMD backends possible |
 | View semantics | Slicing returns views (shared memory), not copies |
+| Shape readonly struct | Immutable after construction (NumPy-aligned). Contains `ArrayFlags` for cached O(1) property access |
+| Broadcast write protection | Broadcast views are read-only (`IsWriteable = false`), matching NumPy behavior |
+
+## Shape Architecture (NumPy-Aligned)
+
+Shape is a `readonly struct` with cached `ArrayFlags` computed at construction:
+
+```csharp
+public readonly partial struct Shape
+{
+    internal readonly int[] dimensions; // Dimension sizes
+    internal readonly int[] strides; // Stride values (0 = broadcast dimension)
+    internal readonly int offset; // Base offset into storage
+    internal readonly int bufferSize; // Size of underlying buffer
+    internal readonly int _flags; // Cached ArrayFlags bitmask
+}
+```
+
+**ArrayFlags enum** (matches NumPy's `ndarraytypes.h`):
+| Flag | Value | Meaning |
+|------|-------|---------|
+| `C_CONTIGUOUS` | 0x0001 | Data is row-major contiguous |
+| `F_CONTIGUOUS` | 0x0002 | Reserved (always false for NumSharp) |
+| `OWNDATA` | 0x0004 | Array owns its data buffer |
+| `ALIGNED` | 0x0100 | Always true for managed allocations |
+| `WRITEABLE` | 0x0400 | False for broadcast views |
+| `BROADCASTED` | 0x1000 | Has stride=0 with dim > 1 |
+
+**Key Shape properties:**
+- `IsContiguous` — O(1) check via `C_CONTIGUOUS` flag
+- `IsBroadcasted` — O(1) check via `BROADCASTED` flag
+- `IsWriteable` — False for broadcast views (prevents corruption)
+- `IsSliced` — True if offset != 0, different size, or non-contiguous
+- `IsSimpleSlice` — IsSliced && !IsBroadcasted (fast offset path)

 ## Critical: View Semantics
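As an illustration of the cached-flags idea in the hunk above, here is a hedged C# sketch that derives a bitmask once from dimensions and strides and then answers property checks with a single bitwise AND. The three constant values are copied from the ArrayFlags table. `ComputeFlags` is a hypothetical helper, and the remaining flags (`F_CONTIGUOUS`, `OWNDATA`, `ALIGNED`) are left out for brevity.

```csharp
using System;

// Flag values as listed in the ArrayFlags table.
const int C_CONTIGUOUS = 0x0001;
const int WRITEABLE    = 0x0400;
const int BROADCASTED  = 0x1000;

// Hypothetical helper: compute the bitmask once, as a constructor would.
int ComputeFlags(int[] dims, int[] strides)
{
    bool broadcasted = false;
    bool contiguous = true;
    int expected = 1;
    for (int i = dims.Length - 1; i >= 0; i--)
    {
        if (dims[i] > 1 && strides[i] == 0) broadcasted = true;          // stride=0 with dim > 1
        if (dims[i] != 1 && strides[i] != expected) contiguous = false;  // not packed row-major
        expected *= dims[i];
    }
    int bits = 0;
    if (contiguous) bits |= C_CONTIGUOUS;
    if (broadcasted) bits |= BROADCASTED; else bits |= WRITEABLE;         // broadcast views are read-only
    return bits;
}

int rowMajor = ComputeFlags(new[] { 3, 4 }, new[] { 4, 1 });
int broadcastView = ComputeFlags(new[] { 3, 3 }, new[] { 1, 0 });

Console.WriteLine((rowMajor & C_CONTIGUOUS) != 0);       // True
Console.WriteLine((broadcastView & BROADCASTED) != 0);   // True
Console.WriteLine((broadcastView & WRITEABLE) != 0);     // False
```

Because the struct is immutable, the loop runs only at construction. Every later `IsContiguous` or `IsWriteable` query reduces to the bit test on the last three lines.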

@@ -109,9 +144,12 @@ nd["..., -1"] // Ellipsis fills dimensions
 ### Math Functions (`Math/`)
 | Function | File |
 |----------|------|
+| `np.add`, `np.subtract`, `np.multiply`, `np.divide` | `np.math.cs` |
+| `np.mod`, `np.true_divide` | `np.math.cs` |
+| `np.positive`, `np.negative`, `np.convolve` | `np.math.cs` |
 | `np.sum` | `np.sum.cs` |
-| `np.prod` | `NDArray.prod.cs` |
-| `np.cumsum` | `NDArray.cumsum.cs` |
+| `np.prod`, `nd.prod()` | `np.math.cs`, `NDArray.prod.cs` |
+| `np.cumsum`, `nd.cumsum()` | `APIs/np.cumsum.cs`, `Math/NDArray.cumsum.cs` |
 | `np.power` | `np.power.cs` |
 | `np.sqrt` | `np.sqrt.cs` |
 | `np.abs`, `np.absolute` | `np.absolute.cs` |
@@ -131,9 +169,9 @@ nd["..., -1"] // Ellipsis fills dimensions
 | `np.mean`, `nd.mean()` | `np.mean.cs`, `NDArray.mean.cs` |
 | `np.std`, `nd.std()` | `np.std.cs`, `NDArray.std.cs` |
 | `np.var`, `nd.var()` | `np.var.cs`, `NDArray.var.cs` |
-| `np.amax`, `nd.amax()` | `Sorting/np.amax.cs`, `NDArray.amax.cs` |
-| `np.amin`, `nd.amin()` | `Sorting/np.min.cs`, `NDArray.amin.cs` |
-| `np.argmax`, `nd.argmax()` | `Sorting/np.argmax.cs`, `NDArray.argmax.cs` |
+| `np.amax`, `nd.amax()` | `Sorting_Searching_Counting/np.amax.cs`, `NDArray.amax.cs` |
+| `np.amin`, `nd.amin()` | `Sorting_Searching_Counting/np.min.cs`, `NDArray.amin.cs` |
+| `np.argmax`, `nd.argmax()` | `Sorting_Searching_Counting/np.argmax.cs`, `NDArray.argmax.cs` |
 | `np.argmin`, `nd.argmin()` | `Sorting_Searching_Counting/np.argmax.cs`, `NDArray.argmin.cs` |

 ### Sorting & Searching (`Sorting_Searching_Counting/`)
@@ -168,7 +206,7 @@ nd["..., -1"] // Ellipsis fills dimensions
 | `np.swapaxes` | `np.swapaxes.cs`, `NdArray.swapaxes.cs` |
 | `np.moveaxis` | `np.moveaxis.cs` |
 | `np.rollaxis` | `np.rollaxis.cs` |
-| `nd.roll()` | `NDArray.roll.cs` | Partial: only Int32/Single/Double with axis; no-axis returns null |
+| `np.roll`, `nd.roll()` | `np.roll.cs`, `NDArray.roll.cs` | Fully implemented (all dtypes, with/without axis) |
 | `np.atleast_1d/2d/3d` | `np.atleastd.cs` |
 | `np.unique`, `nd.unique()` | `np.unique.cs`, `NDArray.unique.cs` |
 | `np.repeat` | `np.repeat.cs` |
@@ -244,7 +282,6 @@ nd["..., -1"] // Ellipsis fills dimensions
 | Function | File |
 |----------|------|
 | `np.size` | `np.size.cs` |
-| `np.cumsum` | `np.cumsum.cs` |

 ---

@@ -334,22 +371,97 @@ Create issues on `SciSharp/NumSharp` via `gh issue create`. `GH_TOKEN` is availa
 ## Build & Test

 ```bash
+# Build (silent, errors only)
 dotnet build -v q --nologo "-clp:NoSummary;ErrorsOnly" -p:WarningLevel=0
-dotnet test -v q --nologo "-clp:ErrorsOnly" test/NumSharp.UnitTest/NumSharp.UnitTest.csproj
+```
+
+### Running Tests
+
+Tests use **TUnit** framework with source-generated test discovery.
+
+```bash
+# Run from test directory
+cd test/NumSharp.UnitTest
+
+# All tests (includes OpenBugs - expected failures)
+dotnet test --no-build
+
+# Exclude OpenBugs (CI-style - only real failures)
+dotnet test --no-build -- --treenode-filter "/*/*/*/*[Category!=OpenBugs]"
+
+# Run ONLY OpenBugs tests
+dotnet test --no-build -- --treenode-filter "/*/*/*/*[Category=OpenBugs]"
+```
+
+### Output Formatting
+
+```bash
+# Results only (no messages, no stack traces)
+dotnet test --no-build 2>&1 | grep -E "^(failed|skipped|Test run| total:| failed:| succeeded:| skipped:| duration:)"
+
+# Results with messages (no stack traces)
+dotnet test --no-build 2>&1 | grep -v "^ at " | grep -v "^ at " | grep -v "^ ---" | grep -v "^ from K:" | sed 's/TUnit.Engine.Exceptions.TestFailedException: //' | sed 's/AssertFailedException: //'
+
+# Detailed output (shows passed tests too)
+dotnet test --no-build -- --output Detailed
 ```

 ## Test Categories

-Tests are filtered by `[TestCategory]` attributes. Adding new bug reproductions or platform-specific tests only requires the right attribute — no CI workflow changes.
+Tests use typed category attributes defined in `TestCategory.cs`. Adding new bug reproductions or platform-specific tests only requires the right attribute — no CI workflow changes.
+
+| Category | Attribute | Purpose | CI Behavior |
+|----------|-----------|---------|-------------|
+| `OpenBugs` | `[OpenBugs]` | Known-failing bug reproductions. Remove when fixed. | **EXCLUDED** via filter |
+| `Misaligned` | `[Misaligned]` | Documents NumSharp vs NumPy behavioral differences. | Runs (tests pass) |
+| `WindowsOnly` | `[WindowsOnly]` | Requires GDI+/System.Drawing.Common | Runtime platform check |
+
+### How CI Excludes OpenBugs
+
+The CI pipeline (`.github/workflows/build-and-release.yml`) uses TUnit's `--treenode-filter` to exclude `OpenBugs`:

-| Category | Purpose | CI filter |
-|----------|---------|-----------|
-| `OpenBugs` | Known-failing bug reproductions. Remove category when fixed. | `TestCategory!=OpenBugs` (all platforms) |
-| `WindowsOnly` | Requires GDI+/System.Drawing.Common | `TestCategory!=WindowsOnly` (Linux/macOS) |
+```yaml
+env:
+  TEST_FILTER: '/*/*/*/*[Category!=OpenBugs]'

-Apply at class level (`[TestClass][TestCategory("OpenBugs")]`) or individual method level (`[TestMethod][TestCategory("OpenBugs")]`).
+- name: Test
+  run: dotnet run ... -- --treenode-filter "${{ env.TEST_FILTER }}"
+```
+
+This filter excludes all tests with `[OpenBugs]` or `[Category("OpenBugs")]` from CI runs. Tests pass locally when the bug is fixed — then remove the `[OpenBugs]` attribute.
+
+### Usage

-**OpenBugs files**: `OpenBugs.cs` (broadcast bugs), `OpenBugs.Bitmap.cs` (bitmap bugs). When a bug is fixed, the test starts passing — remove the `OpenBugs` category and move to a permanent test class.
+```csharp
+// Class-level (all tests in class)
+[OpenBugs]
+public class BroadcastBugTests { ... }
+
+// Method-level
+[Test]
+[OpenBugs]
+public async Task BroadcastWriteCorruptsData() { ... }
+
+// Documenting behavioral differences (NOT excluded from CI)
+[Test]
+[Misaligned]
+public void BroadcastSlice_MaterializesInNumSharp() { ... }
+```
+
+### Local Filtering
+
+```bash
+# Exclude OpenBugs (same as CI)
+dotnet test -- --treenode-filter "/*/*/*/*[Category!=OpenBugs]"
+
+# Run ONLY OpenBugs tests (to verify fixes)
+dotnet test -- --treenode-filter "/*/*/*/*[Category=OpenBugs]"
+
+# Run ONLY Misaligned tests
+dotnet test -- --treenode-filter "/*/*/*/*[Category=Misaligned]"
+```
+
+**OpenBugs files**: `OpenBugs.cs` (general bugs), `OpenBugs.Bitmap.cs` (bitmap bugs), `OpenBugs.ApiAudit.cs` (API audit failures).

 ## CI Pipeline

@@ -387,9 +499,14 @@ NumSharp uses unsafe in many places, hence include `#:property AllowUnsafeBlocks
 |--------|----------------|
 | `shape.dimensions` | Raw int[] of dimension sizes |
 | `shape.strides` | Raw int[] of stride values |
-| `shape.size` | Total element count |
-| `shape.ViewInfo` | Slice/view metadata (null if not a view) |
-| `shape.BroadcastInfo` | Broadcast metadata (null if not broadcast) |
+| `shape.size` | Internal field: total element count |
+| `shape.offset` | Base offset into storage (NumPy-aligned) |
+| `shape.bufferSize` | Size of underlying buffer |
+| `shape._flags` | Cached ArrayFlags bitmask |
+| `shape.IsWriteable` | False for broadcast views (NumPy behavior) |
+| `shape.IsBroadcasted` | Has any stride=0 with dimension > 1 |
+| `shape.IsSimpleSlice` | IsSliced && !IsBroadcasted |
+| `shape.OriginalSize` | Product of non-broadcast dimensions |
 | `arr.Storage` | Underlying `UnmanagedStorage` |
 | `arr.GetTypeCode` | `NPTypeCode` of the array |
 | `arr.Array` | `IArraySlice` — raw data access |
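To make the `shape.offset`, `shape.strides` and `shape.OriginalSize` rows above concrete, the sketch below applies the `offset + sum(indices * strides)` formula from the commit message to the sliced-then-broadcast case it mentions (`arange(12).reshape(3,4)[:,1:2]` broadcast to `(3,3)`). `GetOffset` and `OriginalSize` here are standalone stand-ins written for this example, with "non-broadcast dimension" assumed to mean "stride != 0". They are not the NumSharp members of the same name.

```csharp
using System;

// Stand-in for the formula: offset + sum(indices[i] * strides[i]).
static int GetOffset(int baseOffset, int[] strides, int[] indices)
{
    int result = baseOffset;
    for (int i = 0; i < indices.Length; i++)
        result += indices[i] * strides[i];
    return result;
}

// Assumed reading of "product of non-broadcast dimensions": skip stride-0 axes.
static int OriginalSize(int[] dims, int[] strides)
{
    int size = 1;
    for (int i = 0; i < dims.Length; i++)
        if (strides[i] != 0) size *= dims[i];
    return size;
}

int[] data = new int[12];
for (int k = 0; k < data.Length; k++) data[k] = k;    // arange(12), viewed as (3, 4)

// [:, 1:2] gives dims {3, 1}, strides {4, 1}, offset 1; broadcasting to (3, 3)
// keeps the offset and sets the stretched axis stride to 0.
int[] viewDims = { 3, 3 };
int[] viewStrides = { 4, 0 };
int viewOffset = 1;

for (int r = 0; r < viewDims[0]; r++)
{
    var row = new int[viewDims[1]];
    for (int c = 0; c < viewDims[1]; c++)
        row[c] = data[GetOffset(viewOffset, viewStrides, new[] { r, c })];
    Console.WriteLine(string.Join(", ", row));        // 1, 1, 1 then 5, 5, 5 then 9, 9, 9
}
Console.WriteLine(OriginalSize(viewDims, viewStrides));   // 3
```

The rows match the `[[1,1,1],[5,5,5],[9,9,9]]` result quoted in the commit message, with no recursive view resolution involved.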
@@ -399,7 +516,18 @@ NumSharp uses unsafe in many places, hence include `#:property AllowUnsafeBlocks
 | `NPTypeCode.GetPriority()` | Type priority for promotion |
 | `NPTypeCode.AsNumpyDtypeName()` | NumPy dtype name (e.g. "int32") |
 | `Shape.NewScalar()` | Create scalar shapes |
-| `Shape.ComputeHashcode()` | Recalculate shape hash |
+
+### Common Public NDArray Properties
+
+| Property | Description |
+|----------|-------------|
+| `nd.shape` | Dimensions as `int[]` |
+| `nd.ndim` | Number of dimensions |
+| `nd.size` | Total element count |
+| `nd.dtype` | Element type as `Type` |
+| `nd.typecode` | Element type as `NPTypeCode` |
+| `nd.T` | Transpose (swaps axes) |
+| `nd.flat` | 1D iterator over elements |

 ## Adding New Features

@@ -433,7 +561,7 @@ A: Yes, 1-to-1 matching.
 A: Anything that can use the capabilities - porting Python ML code, standalone .NET scientific computing, integration with TensorFlow.NET/ML.NET.

 **Q: Are there areas of known fragility?**
-A: Slicing/broadcasting system is complex with ViewInfo and BroadcastInfo interactions - fragile but working.
+A: Slicing/broadcasting system is complex — offset/stride calculations with contiguity detection require careful handling. The `readonly struct Shape` with `ArrayFlags` simplifies this but edge cases remain.

 **Q: How is NumPy compatibility validated?**
 A: Written by hand based on NumPy docs and original tests. Testing philosophy: run actual NumPy code, observe output, replicate 1-to-1 in C#.
@@ -455,7 +583,7 @@ A: Implementations that differ from original NumPy 2.x behavior. A comprehensive
 A: `NDArray` (user-facing API), `UnmanagedStorage` (raw memory management), and `Shape` (dimensions, strides, coordinate translation). They work together: NDArray wraps Storage which uses Shape for offset calculations.

 **Q: What is Shape responsible for?**
-A: Dimensions, strides, coordinate-to-offset translation, contiguity tracking, and slice/broadcast info. Key properties: `IsScalar`, `IsContiguous`, `IsSliced`, `IsBroadcasted`. Methods: `GetOffset(coords)`, `GetCoordinates(offset)`.
+A: Shape is a `readonly struct` containing dimensions, strides, offset, bufferSize, and cached `ArrayFlags`. Key properties: `IsScalar`, `IsContiguous`, `IsSliced`, `IsBroadcasted`, `IsWriteable`, `IsSimpleSlice`. Methods: `GetOffset(coords)`, `GetCoordinates(offset)`. NumPy-aligned: broadcast views are read-only (`IsWriteable = false`).

 **Q: How does slicing work internally?**
 A: The `Slice` class parses Python notation (e.g., "1:5:2") into `Start`, `Stop`, `Step`. It converts to `SliceDef` (absolute indices) for computation. `SliceDef.Merge()` handles recursive slicing (slice of a slice).
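The slicing answer above can be illustrated with a rough, self-contained C# sketch of the arithmetic involved: parsing `"start:stop:step"` and turning it into a new offset, stride and length for a 1-D view. `ParseSlice` and `ApplySlice` are hypothetical helpers written for this example. They assume the text contains at least one colon and a positive step, and they are not the `Slice` or `SliceDef` implementation.

```csharp
using System;

// Hypothetical parser: "start:stop:step" against an axis of the given length.
static (int start, int stop, int step) ParseSlice(string text, int length)
{
    string[] parts = text.Split(':');
    int step = parts.Length > 2 && parts[2].Length > 0 ? int.Parse(parts[2]) : 1;
    int start = parts[0].Length > 0 ? int.Parse(parts[0]) : 0;
    int stop = parts.Length > 1 && parts[1].Length > 0 ? int.Parse(parts[1]) : length;
    if (start < 0) start += length;     // negative indices count from the end
    if (stop < 0) stop += length;
    return (Math.Clamp(start, 0, length), Math.Clamp(stop, 0, length), step);
}

// Slice a 1-D view described by (offset, stride, length); the result is again a view.
static (int offset, int stride, int length) ApplySlice((int offset, int stride, int length) view, string text)
{
    var (start, stop, step) = ParseSlice(text, view.length);
    int count = Math.Max(0, (stop - start + step - 1) / step);
    return (view.offset + start * view.stride, view.stride * step, count);
}

Console.WriteLine(ParseSlice("1:5:2", 10));        // (1, 5, 2)

var whole = (offset: 0, stride: 1, length: 10);    // arange(10)
var outer = ApplySlice(whole, "1:9:2");            // elements 1, 3, 5, 7
var inner = ApplySlice(outer, "1:3");              // slice of a slice: elements 3, 5
Console.WriteLine(outer);                          // (1, 2, 4)
Console.WriteLine(inner);                          // (3, 2, 2)
```

Applying the second slice to the first view's offset and stride gives the same triple that a merged slice on the original buffer would have to produce. This is the kind of composition a slice-of-a-slice needs.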
@@ -502,10 +630,10 @@ A: Core ops (`dot`, `matmul`) in `LinearAlgebra/`. Advanced decompositions (`inv
 ## Q&A - Development

 **Q: What's in the test suite?**
-A: MSTest framework in `test/NumSharp.UnitTest/`. Many tests adapted from NumPy's own test suite. Decent coverage but gaps in edge cases.
+A: TUnit framework in `test/NumSharp.UnitTest/`. Many tests adapted from NumPy's own test suite. Decent coverage but gaps in edge cases. Uses source-generated test discovery (no special flags needed).

 **Q: What .NET version is targeted?**
-A: Library and tests multi-target `net8.0` and `net10.0`. Dropped `netstandard2.0` in the dotnet810 branch upgrade.
+A: Library multi-targets `net8.0` and `net10.0`. Tests currently target `net10.0` only (TUnit requires .NET 9+ runtime). Dropped `netstandard2.0` in the dotnet810 branch upgrade.

 **Q: What are the main dependencies?**
 A: No external runtime dependencies. `System.Memory` and `System.Runtime.CompilerServices.Unsafe` (previously NuGet packages) are built into the .NET 8+ runtime.
