@tarang-jain
Contributor

@tarang-jain tarang-jain commented Dec 2, 2025

This PR adds a new parameter to ivf_pq that lets the user choose the layout of the IVF lists. The lists can be flat (no interleaving) or interleaved (the current default). Flat codes allow building the index in a CPU-compatible format.
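
As a rough illustration of what a flat (non-interleaved) list entry could look like, the sketch below packs the `pq_dim` codes of one vector back to back, `pq_bits` per code. The function names `flat_code_size` and `pack_flat` and the least-significant-bit-first order are assumptions made for this example, not the cuVS implementation; the resulting sizes do agree with the "Code Size" column in the benchmark for pq_dim = 32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed for one flat-packed vector: pq_dim codes of pq_bits each, rounded up.
inline std::size_t flat_code_size(std::size_t pq_dim, std::uint32_t pq_bits)
{
  return (pq_dim * pq_bits + 7) / 8;
}

// Pack the PQ codes of a single vector contiguously.
// Bit order (least significant bit first) is an assumption for illustration only.
inline std::vector<std::uint8_t> pack_flat(const std::vector<std::uint8_t>& codes,
                                           std::uint32_t pq_bits)
{
  assert(pq_bits >= 1 && pq_bits <= 8);
  std::vector<std::uint8_t> out(flat_code_size(codes.size(), pq_bits), 0);
  std::size_t bit_pos = 0;
  for (std::uint8_t code : codes) {
    assert(code < (1u << pq_bits));  // each code must fit in pq_bits
    for (std::uint32_t b = 0; b < pq_bits; ++b, ++bit_pos) {
      if ((code >> b) & 1u) {
        out[bit_pos / 8] |= static_cast<std::uint8_t>(1u << (bit_pos % 8));
      }
    }
  }
  return out;
}
```

With this sketch, `flat_code_size(32, 6)` gives 24 bytes, and packing the 4-bit codes `{1, 2, 3, 4}` yields the two bytes `0x21, 0x43`.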

[UPDATE as of 12/19/2025]:
After #1278 is merged, we can unify IVF-PQ and PQ API codepaths.

[UPDATE 01/08/2026]:
This PR can be merged before #1278. The flat code-writing can potentially be reverted once #1278 is merged (so we can later use the PQ preprocessing API directly). However, that will come naturally as part of a broader unification of the IVF-PQ and PQ codepaths.

[Benchmarks 01/15/2026]:

IVF-PQ Layout Benchmark Results

Dataset: 1,000,000 vectors × 128 dimensions | pq_dim: 32

| pq_bits | Code Size | Direct FLAT Build (ms) | INTERLEAVED Build (ms) | Convert INTERLEAVED to FLAT with Codepacker (ms) | Total time for INTERLEAVED build + Conversion to FLAT with Codepacker (unpack) (ms) | Overhead |
|---|---|---|---|---|---|---|
| 8 | 32 bytes | 372.46 | 385.86 | 985.28 | 1371.13 | 3.68× |
| 6 | 24 bytes | 298.83 | 300.99 | 961.82 | 1262.82 | 4.23× |
| 5 | 20 bytes | 283.25 | 281.95 | 795.43 | 1077.38 | 3.80× |
| 4 | 16 bytes | 270.63 | 271.01 | 489.73 | 760.75 | 2.81× |
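
The Overhead column is not defined in the post, but the numbers are consistent with reading it as the total INTERLEAVED build plus conversion time divided by the direct FLAT build time. A small sketch of that arithmetic, with `overhead` as a hypothetical helper name:

```cpp
#include <cassert>

// Ratio of (INTERLEAVED build + conversion to FLAT) to a direct FLAT build.
// This reading of the "Overhead" column is inferred from the reported numbers.
inline double overhead(double direct_flat_ms, double interleaved_ms, double convert_ms)
{
  assert(direct_flat_ms > 0.0);
  return (interleaved_ms + convert_ms) / direct_flat_ms;
}
```

For the pq_bits = 8 row, `overhead(372.46, 385.86, 985.28)` comes out at roughly 3.68, matching the table.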

@copy-pr-bot

copy-pr-bot bot commented Dec 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@tarang-jain tarang-jain self-assigned this Dec 2, 2025
@tarang-jain tarang-jain added the "feature request" (New feature or request) and "non-breaking" (Introduces a non-breaking change) labels Dec 2, 2025
@tarang-jain tarang-jain changed the title [FEA] IVF-PQ to Write Flat PQ Codes [WIP] [FEA] IVF-PQ to Write Flat PQ Codes Dec 2, 2025
@tarang-jain
Contributor Author

@lowener I believe #1278 can be used directly to do this, rather than writing a new kernel, correct? If so, we can wait for #1278 to be merged and then just call those APIs from IVF-PQ. I also saw the comments in #1648 related to this.

@cjnolet cjnolet moved this from Todo to In Progress in Vector Search, ML, & Data Mining Release Board Jan 5, 2026
@tarang-jain tarang-jain marked this pull request as ready for review January 9, 2026 01:30
@tarang-jain tarang-jain requested a review from a team as a code owner January 9, 2026 01:30
@tarang-jain tarang-jain changed the title [WIP] [FEA] IVF-PQ to Write Flat PQ Codes [FEA] IVF-PQ to Write Flat PQ Codes Jan 9, 2026
@tarang-jain
Contributor Author

/ok to test dc99bfb

@tarang-jain
Contributor Author

/ok to test f6cf637

@tarang-jain
Contributor Author

/ok to test d9c9b62

@cjnolet
Member

cjnolet commented Jan 15, 2026

/ok to test 02f4c76

@tarang-jain
Contributor Author

/ok to test a5e482b

@tarang-jain tarang-jain requested a review from a team as a code owner January 15, 2026 22:53
    uint8_t code = action(in_ix, j);
    if (lane_id == 0) { code_view[j] = code; }
  }
}
Contributor Author

Small optimization opportunity here: we can check lane_id == 0 at the very beginning and simply return if it does not hold, because the code is written only by the thread with lane_id == 0.
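
A host-side C++ sketch of the two loop shapes may help when weighing this suggestion. It is only a simulation of lanes in a plain loop, not the kernel from this PR, and `CodeAction`, `write_codes_guarded`, `write_codes_early_exit`, and `run_all_lanes` are hypothetical names. It assumes `action` is lane-independent. If `action` relies on warp-level cooperation such as shuffles, every lane has to keep calling it and an early return would not be safe.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for `action`; assumed not to depend on other lanes.
using CodeAction = std::function<std::uint8_t(std::uint32_t, std::uint32_t)>;

// Shape of the reviewed loop: every lane evaluates action, only lane 0 stores.
inline void write_codes_guarded(std::uint32_t lane_id,
                                std::uint32_t in_ix,
                                const CodeAction& action,
                                std::vector<std::uint8_t>& code_view)
{
  for (std::uint32_t j = 0; j < code_view.size(); ++j) {
    std::uint8_t code = action(in_ix, j);
    if (lane_id == 0) { code_view[j] = code; }
  }
}

// Suggested shape: lanes other than 0 leave before doing any work.
inline void write_codes_early_exit(std::uint32_t lane_id,
                                   std::uint32_t in_ix,
                                   const CodeAction& action,
                                   std::vector<std::uint8_t>& code_view)
{
  if (lane_id != 0) { return; }
  for (std::uint32_t j = 0; j < code_view.size(); ++j) {
    code_view[j] = action(in_ix, j);
  }
}

// Simulate a group of n_lanes lanes one after another on the host.
inline std::vector<std::uint8_t> run_all_lanes(bool early_exit,
                                               std::uint32_t n_lanes,
                                               std::uint32_t in_ix,
                                               std::uint32_t pq_dim,
                                               const CodeAction& action)
{
  assert(n_lanes > 0);
  std::vector<std::uint8_t> code_view(pq_dim, 0);
  for (std::uint32_t lane_id = 0; lane_id < n_lanes; ++lane_id) {
    if (early_exit) {
      write_codes_early_exit(lane_id, in_ix, action, code_view);
    } else {
      write_codes_guarded(lane_id, in_ix, action, code_view);
    }
  }
  return code_view;
}
```

Under the lane-independence assumption both variants produce the same `code_view`, and the early-exit variant skips the redundant `action` calls on the other lanes.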

@tarang-jain
Contributor Author

/ok to test 252cfbd

@cjnolet
Member

cjnolet commented Jan 16, 2026

/ok to test 252cfbd

@copy-pr-bot

copy-pr-bot bot commented Jan 16, 2026

/ok to test 252cfbd

@cjnolet, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@cjnolet
Member

cjnolet commented Jan 16, 2026

/ok to test 3bf9018

@tarang-jain
Contributor Author

/ok to test 3bf9018

@tarang-jain tarang-jain changed the base branch from main to release/26.02 January 16, 2026 20:02
@tarang-jain tarang-jain requested review from a team as code owners January 16, 2026 20:02
@tarang-jain tarang-jain requested a review from gforsyth January 16, 2026 20:02
@tarang-jain
Contributor Author

/ok to test b5cfc7e
