
Add CatalogIterator Interface #1246

Open
dougbrn wants to merge 12 commits into main from data_iterator


dougbrn (Contributor) commented Feb 5, 2026

Closes #1042. I'm open to changes on most of the design decisions, naming, etc. here. I didn't attempt the PyTorch integration mentioned in the issue; happy to discuss that further!
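For readers without the issue context, the general idea behind a background-running data iterator can be sketched generically as below. This is only an illustration under stated assumptions and is not the code in this PR: `BackgroundIterator`, `loaders`, and `prefetch` are hypothetical names, and the actual `CatalogIterator` interface may differ in every respect. A worker thread runs the loader callables ahead of the consumer and hands results over through a bounded queue.

```python
import queue
import threading
from typing import Any, Callable, Iterable


class BackgroundIterator:
    """Hypothetical sketch; NOT the CatalogIterator added by this PR."""

    def __init__(self, loaders: Iterable[Callable[[], Any]], prefetch: int = 2):
        # Bounded queue: the worker blocks once `prefetch` results are waiting.
        self._queue: queue.Queue = queue.Queue(maxsize=prefetch)
        self._finished = False
        self._thread = threading.Thread(
            target=self._worker, args=(list(loaders),), daemon=True
        )
        self._thread.start()

    def _worker(self, loaders):
        # Run each loader in order and pass its result to the consumer.
        try:
            for load in loaders:
                self._queue.put(("item", load()))
            self._queue.put(("done", None))
        except Exception as exc:
            # Forward the failure so it is raised in the consuming thread.
            self._queue.put(("error", exc))

    def __iter__(self):
        return self

    def __next__(self):
        if self._finished:
            raise StopIteration
        kind, payload = self._queue.get()
        if kind == "item":
            return payload
        self._finished = True
        if kind == "error":
            raise payload
        raise StopIteration


# Usage: each loader stands in for "load one chunk of data".
squares = list(BackgroundIterator([lambda i=i: i * i for i in range(5)]))
```

The bounded queue caps how much is held in memory ahead of the consumer. The daemon thread keeps the sketch short; a real implementation would likely want explicit shutdown when the consumer stops early.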


codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 98.21429% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.69%. Comparing base (b1f2780) to head (56d6084).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/lsdb/iterator/catalog_iterator.py | 98.18% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1246      +/-   ##
==========================================
+ Coverage   96.66%   96.69%   +0.02%     
==========================================
  Files          46       47       +1     
  Lines        2877     2933      +56     
==========================================
+ Hits         2781     2836      +55     
- Misses         96       97       +1     



github-actions bot commented Feb 5, 2026

| Before [b1f2780] | After [fd12ad4] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 8.10±0.1s | 8.28±0.02s | 1.02 | benchmarks.time_lazy_crossmatch_many_columns_all_suffixes |
| 175±1ms | 177±0.9ms | 1.01 | benchmarks.time_open_many_columns_list |
| 50.2±0.4ms | 50.8±0.8ms | 1.01 | benchmarks.time_polygon_search |
| 7.08±0.07s | 7.08±0.01s | 1 | benchmarks.time_create_large_catalog |
| 1.06±0.01s | 1.06±0.01s | 1 | benchmarks.time_create_midsize_catalog |
| 109±2ms | 109±1ms | 1 | benchmarks.time_kdtree_crossmatch |
| 3.88±0.03s | 3.87±0.01s | 1 | benchmarks.time_open_many_columns_all |
| 388±3ms | 386±2ms | 0.99 | benchmarks.time_open_many_columns_default |
| 19.9±0.02s | 19.8±0.07s | 0.99 | benchmarks.time_save_big_catalog |
| 30.4±0.7ms | 29.7±0.2ms | 0.98 | benchmarks.time_box_filter_on_partition |


dougbrn marked this pull request as ready for review February 5, 2026 22:13
dougbrn requested a review from hombit February 5, 2026 22:45


Development

Successfully merging this pull request may close these issues.

Add a background running data iterator interface

1 participant