
MB-65170: Add filter support to BooleanQuery#2147

Closed
CascadingRadium wants to merge 24 commits into master from boolFilter

Conversation

@CascadingRadium
Member

@CascadingRadium CascadingRadium commented Feb 24, 2025

  • Add a Filter query for Boolean queries, which filters the document set returned by the
    Boolean query itself.
  • The filter query does not affect scores, and any document returned by the Boolean query
    must also satisfy the filter query.
  • The key difference between using a Filter query and placing an equivalent query in the
    Must clause is that queries in the Must clause contribute to the score. The Filter
    query is intended purely for filtering purposes without modifying the document scores
    set by the base Boolean query.
  • Requires the bug fix in #2146 ("BugFix: Fixed Advance() API implementation in Optimized
    Composite Searchers") so that the Advance() API can be used when the Filter query is an
    optimized composite searcher.
  • Resolves #2037 ("support Filter in searcher like elasticsearch").
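The difference between a Filter query and a Must clause described above can be sketched without Bleve at all. The following is a minimal, self-contained illustration; the `Doc`, `Hit`, and `searchWithFilter` names are hypothetical and are not part of the Bleve API, and the precomputed `MustScore` stands in for real scoring.

```go
package main

import "fmt"

// Doc is a hypothetical document. MustScore stands in for the score the
// Must clause would assign; zero means the Boolean query did not match.
type Doc struct {
	ID        string
	MustScore float64
	InStock   bool // attribute checked by the filter
}

// Hit is a scored search result.
type Hit struct {
	ID    string
	Score float64
}

// searchWithFilter returns documents that match the Boolean query and
// also satisfy the filter. The filter only decides membership in the
// result set; it never contributes to or changes the score.
func searchWithFilter(docs []Doc, filter func(Doc) bool) []Hit {
	var hits []Hit
	for _, d := range docs {
		if d.MustScore <= 0 {
			continue // not matched by the Boolean query itself
		}
		if !filter(d) {
			continue // matched, but rejected by the filter
		}
		hits = append(hits, Hit{ID: d.ID, Score: d.MustScore})
	}
	return hits
}

func main() {
	docs := []Doc{
		{ID: "a", MustScore: 1.5, InStock: true},
		{ID: "b", MustScore: 2.0, InStock: false},
		{ID: "c", MustScore: 0, InStock: true},
	}
	inStock := func(d Doc) bool { return d.InStock }
	for _, h := range searchWithFilter(docs, inStock) {
		fmt.Printf("%s %.1f\n", h.ID, h.Score)
	}
}
```

Had the `InStock` condition been expressed as an additional Must clause instead, it would have added to each hit's score; as a filter it leaves the score of `a` untouched.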

@abhinavdangeti abhinavdangeti modified the milestones: v2.5.2, v2.5.3 May 22, 2025
Base automatically changed from bugFIx to master May 23, 2025 05:58
@abhinavdangeti
Member

@CascadingRadium would you resolve conflicts that've popped up here - it seems we can get this in after.

@abhinavdangeti abhinavdangeti requested review from capemox and maneuvertomars and removed request for metonymic-smokey August 19, 2025 20:09
christiangda and others added 8 commits August 21, 2025 11:15
__NOTES:__

- Where the code can be covered by a `unit test`, I add one. The unit
tests I added were created by `GitHub Copilot Agent` using the model
`Gemini 2.5 Pro`, and I adapted most of them.
- Some of the improvements I made are not directly related to the
`golangci-lint` findings, but I believe they are important for the
project. All of them were proposed by `GitHub Copilot Agent` using the
model `Gemini 2.5 Pro`, but I modified most of them so that they
would not alter the code too much and would be easier to `code review`.


This change addresses two golangci-lint findings:

1. `S1023:` redundant break statement (staticcheck)
2. `QF1003:` could use tagged switch on count (staticcheck)

```bash
golangci-lint --version
golangci-lint has version v2.1.2 built with go1.24.2 from (unknown, modified: ?, mod sum: "h1:bcOB+jVr4EYEgOEIskQIhtdxOpIGl+iOCwliG/hNPXw=") on (unknown)
```
Fixed the following files and lines:

- minor formatting changes
- `golangci-lint run --enable-only staticcheck ./... | grep "S1023"`
  - analysis/char/asciifolding/asciifolding.go:3564:5: S1023: redundant break statement (staticcheck)
- `golangci-lint run --enable-only staticcheck ./... | grep "QF1003"`
  - analysis/datetime/iso/iso.go:176:6: QF1003: could use tagged switch on count (staticcheck)
  - analysis/datetime/percent/percent.go:87:3: QF1003: could use tagged switch on formatString[idx+2] (staticcheck)
  - analysis/lang/cjk/cjk_bigram.go:165:2: QF1003: could use tagged switch on *itemsInRing (staticcheck)
  - index/scorch/merge_test.go:45:3: QF1003: could use tagged switch on e.Kind (staticcheck)
  - index/scorch/snapshot_index.go:334:2: QF1003: could use tagged switch on fuzziness (staticcheck)
  - mapping/document.go:486:5: QF1003: could use tagged switch on fieldMapping.Type (staticcheck)
  - mapping/document.go:571:6: QF1003: could use tagged switch on fieldMapping.Type (staticcheck)
  - mapping/field.go:234:2: QF1003: could use tagged switch on fm.Type (staticcheck)
  - mapping/field.go:337:2: QF1003: could use tagged switch on shape (staticcheck)
  - search/sort.go:157:2: QF1003: could use tagged switch on input (staticcheck)
  - search/sort.go:429:2: QF1003: could use tagged switch on stype (staticcheck)
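The two staticcheck findings listed above can be illustrated with a small example. This is a generic sketch, not code taken from the files in the list; `describeBefore` and `describeAfter` are hypothetical names.

```go
package main

import "fmt"

// describeBefore shows the shape QF1003 reports: an if/else-if chain
// that repeatedly compares the same variable against constants.
func describeBefore(count int) string {
	if count == 1 {
		return "one"
	} else if count == 2 {
		return "two"
	}
	return "many"
}

// describeAfter is the tagged-switch form suggested by QF1003.
// Go cases do not fall through, so a trailing "break" at the end of a
// case body would be redundant (the pattern S1023 reports) and is omitted.
func describeAfter(count int) string {
	var s string
	switch count {
	case 1:
		s = "one"
	case 2:
		s = "two"
	default:
		s = "many"
	}
	return s
}

func main() {
	for _, n := range []int{1, 2, 3} {
		fmt.Println(n, describeBefore(n), describeAfter(n))
	}
}
```

Both functions return the same result for every input; only the control-flow style differs.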
…ex` option is `False` (#2190)

- If a field mapping for a vector or geoshape has the `Index` option set
to `false`, the field is still indexed because the current code
incorrectly always overrides this option to `true`.

- This behavior has been fixed by ensuring that the Index option is not
overridden and the field is only indexed if explicitly allowed by the
user-specified Index setting.
- Previously, filter queries were not validated during the KNN request
validation phase, which could lead to unexpected filter query behavior.
This has been fixed by adding proper validation.
- Also fixed an issue where errors during JSON unmarshalling of invalid
filter queries were being silently ignored.
…ment plugins (#2192)

- Fixed a bug where legacy segment plugins caused an infinite loop
during query execution; the synonym set is now correctly left empty,
triggering fallback to basic FTS search without synonyms
- Refactored related code for improved clarity and performance
- LineStrings containing Points and MultiPoints are now supported;
updated the related test cases
- Added more detailed documentation for geoshape support

Linked Geo PR - blevesearch/geo#27

---------

Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
CascadingRadium and others added 12 commits August 21, 2025 11:15
- Used a tool to identify and fix lint errors in all the markdown files.
- Fixed code examples to use only the exported bleve package (and not
internal ones)
- Formatted JSON and code examples throughout
- Added Hybrid Search example to vectors.md

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…rchers (#2146)

- When using the `Advance()` API on an `IndexSnapshotTermFieldReader`, a
special replacement mechanism is applied during a `Rewind`, that is, when
the requested document ID is behind the current document ID pointed to by
the reader. In such cases, the object itself gets replaced.
- However, this approach fails when the `IndexSnapshotTermFieldReader`
is created by the `optimizeCompositeSearcher` method, which is used to
construct optimized `conjunction` or `disjunction` searchers.
- The issue arises because the `unadornedTermFieldReader` is initialized
with the `term` set to `<optimization-type>` and `field` set to `*`. As
a result, calling `Advance()` renders the entire `TFR` unusable, as it
gets replaced by a dummy `TFR` that no longer functions.
- This problem occurs specifically when calling `Advance()` on the
`unadornedTermFieldReader` with an ID less than the current posting ID
(triggering the rewind mechanism).
- The issue is resolved by resetting the underlying
`unadornedPostingsIteratorBitmap`, which effectively achieves the same
result as the `TFR` replacement technique without rendering it unusable.
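The idea of rewinding by resetting the underlying iterator, rather than replacing the reader, can be sketched as follows. This is a simplified model under stated assumptions: `postingsIterator` and `reader` are hypothetical stand-ins (a sorted slice replaces the bitmap), and the rewind check here looks at the last skipped posting rather than reproducing the exact condition used in scorch.

```go
package main

import "fmt"

// postingsIterator is a hypothetical stand-in for a bitmap-backed
// postings iterator; it walks a sorted list of document numbers.
type postingsIterator struct {
	docs []uint64
	pos  int
}

// reset rewinds the iterator to the first posting.
func (it *postingsIterator) reset() { it.pos = 0 }

// reader is a hypothetical stand-in for a term field reader built on
// top of the iterator.
type reader struct {
	it *postingsIterator
}

// Advance returns the first document number >= target.
// If a posting that was already skipped would satisfy the target, the
// request is a rewind: the iterator is reset in place and scanned again,
// so the reader itself stays usable.
func (r *reader) Advance(target uint64) (uint64, bool) {
	if r.it.pos > 0 && r.it.docs[r.it.pos-1] >= target {
		r.it.reset()
	}
	for r.it.pos < len(r.it.docs) {
		d := r.it.docs[r.it.pos]
		if d >= target {
			return d, true
		}
		r.it.pos++ // skip postings below the target
	}
	return 0, false // exhausted
}

func main() {
	r := &reader{it: &postingsIterator{docs: []uint64{2, 5, 9}}}
	fmt.Println(r.Advance(6)) // forward scan
	fmt.Println(r.Advance(3)) // rewind, then scan again
}
```

Because the same `reader` value is reused after the reset, nothing depends on being able to reconstruct it from a `term` and `field`, which is what made the replacement approach fail for optimized composite searchers.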
- Added a new field to document match to store the decoded sort values
- Added implementations for decoding numeric, datetime and geo sort
values
- Added appropriate test cases for the same

Also, upgrade zapx/v16 for fix:
* 4e38ae4 Likith B | MB-59633: Support list of geo shapes for a single
field

---------

Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
- As an edge case, the segment layout can be such that the merge
planning generates tasks having only a single non-empty segment.
- When such a task is processed and introduced into the scorch system, the
root doesn't change. The same segment is still present in the system, so
the plan keeps generating a task with the same single segment. The result
is an infinite loop in which the index never converges to a steady state.
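The commit message above describes the failure mode but not the shape of the fix. One plausible guard, shown purely as an assumption and not as the change that was actually made, is to drop merge tasks that would not combine at least two non-empty segments. All names below (`segment`, `mergeTask`, `dropNoOpTasks`) are hypothetical.

```go
package main

import "fmt"

// segment is a hypothetical view of an index segment.
type segment struct {
	ID       uint64
	LiveDocs int // number of non-deleted documents
}

// mergeTask is a hypothetical merge task produced by a planner.
type mergeTask struct {
	Segments []segment
}

// nonEmptyCount reports how many segments in the task hold live documents.
func nonEmptyCount(t mergeTask) int {
	n := 0
	for _, s := range t.Segments {
		if s.LiveDocs > 0 {
			n++
		}
	}
	return n
}

// dropNoOpTasks keeps only tasks that merge two or more non-empty
// segments. A task with a single non-empty segment would reproduce the
// same segment, so the planner would emit it again on the next round.
func dropNoOpTasks(tasks []mergeTask) []mergeTask {
	var kept []mergeTask
	for _, t := range tasks {
		if nonEmptyCount(t) >= 2 {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	tasks := []mergeTask{
		{Segments: []segment{{ID: 1, LiveDocs: 10}, {ID: 2, LiveDocs: 0}}},
		{Segments: []segment{{ID: 3, LiveDocs: 4}, {ID: 4, LiveDocs: 7}}},
	}
	fmt.Println(len(dropNoOpTasks(tasks)))
}
```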

+ Generate updated `upsidedown.pb.go` file using the latest of
`protoc-gen-go`.
+ Add `index/upsidedown/protoc-README.md` to capture instructions on how
to generate the new stubs.
+ Updated dependency from `github.com/golang/protobuf` (deprecated) to
`google.golang.org/protobuf`.
…ver non textual content (#2208)

Add search after functionality for numeric, datetime, and geo fields in
TopNCollector

- Introduced new test cases for searching after numeric, datetime, and
geo fields.
- Implemented encoding for pagination on these fields to enhance search
capabilities.
- Updated the createSearchAfterDocument function to support encoded sort
values.
- Ensured proper handling of search results in the new test cases.

This enhances the search functionality by allowing users to paginate
through results based on specific criteria.
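Search-after pagination over a numeric sort key can be sketched independently of Bleve. The example below is a minimal model, assuming hits sorted by a numeric field with the document ID as tie-breaker; `hit` and `pageAfter` are hypothetical names, and the sort-value encoding mentioned in the commit is not reproduced here.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a hypothetical search result with a numeric sort field.
type hit struct {
	ID    string
	Price float64
}

// pageAfter returns up to size hits that sort strictly after the
// (afterPrice, afterID) key, ordering by Price and then by ID.
func pageAfter(hits []hit, afterPrice float64, afterID string, size int) []hit {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]hit(nil), hits...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Price != sorted[j].Price {
			return sorted[i].Price < sorted[j].Price
		}
		return sorted[i].ID < sorted[j].ID
	})
	// Find the first hit that sorts after the search-after key.
	start := sort.Search(len(sorted), func(i int) bool {
		h := sorted[i]
		return h.Price > afterPrice || (h.Price == afterPrice && h.ID > afterID)
	})
	end := start + size
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[start:end]
}

func main() {
	hits := []hit{{"d", 30}, {"a", 10}, {"c", 20}, {"b", 10}}
	// The last hit of the previous page was {"b", 10}.
	for _, h := range pageAfter(hits, 10, "b", 2) {
		fmt.Println(h.ID, h.Price)
	}
}
```

The key of the last hit on one page becomes the search-after key for the next, so no offset has to be recomputed as the result set grows.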
Go 1.21 added the `slices.Equal` function to the standard library; using
it makes the code more concise and readable. It can be found
[here](https://pkg.go.dev/slices@go1.21.1#Equal).
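As a small illustration of the kind of replacement this commit describes, the hand-written loop below and `slices.Equal` give the same answer. The helper names `equalManual` and `equalStd` are hypothetical and not taken from the Bleve code base.

```go
package main

import (
	"fmt"
	"slices"
)

// equalManual is the kind of hand-written comparison loop that
// slices.Equal can replace.
func equalManual(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// equalStd does the same comparison with the Go 1.21 standard library.
func equalStd(a, b []string) bool {
	return slices.Equal(a, b)
}

func main() {
	a := []string{"x", "y"}
	b := []string{"x", "y"}
	c := []string{"x", "z"}
	fmt.Println(equalManual(a, b), equalStd(a, b))
	fmt.Println(equalManual(a, c), equalStd(a, c))
}
```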

Signed-off-by: houpo-bob <houpocun@outlook.com>
- Fixed a test case that initialized an invalid geometry collection
object
- Pulled in geo v0.2.4
Added short documentation about pagination in Bleve
@CascadingRadium
Member Author

opened #2220 after rebase

@CascadingRadium CascadingRadium deleted the boolFilter branch August 21, 2025 05:56
@abhinavdangeti abhinavdangeti removed this from the v2.5.4 milestone Sep 12, 2025

Development

Successfully merging this pull request may close these issues.

support Filter in searcher like elasticsearch

8 participants