@Samiul-TheSoccerFan (Contributor) commented Apr 21, 2025

When a mapping defines a dense_vector field with a bbq_* type (bbq_hnsw or bbq_flat) in index_options and no rescore_vector is given, the oversample value now defaults to 3.0.

bbq_hnsw:

PUT my-image-index
{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 64,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "bbq_hnsw"
        }
      },
      "file-type": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      }
    }
  }
}


Response:
{
  "my-image-index": {
    "mappings": {
      "properties": {
        "file-type": {
          "type": "keyword"
        },
        "image-vector": {
          "type": "dense_vector",
          "dims": 64,
          "index": true,
          "similarity": "l2_norm",
          "index_options": {
            "type": "bbq_hnsw",
            "m": 16,
            "ef_construction": 100,
            "rescore_vector": {
              "oversample": 3
            }
          }
        },
        "title": {
          "type": "text"
        }
      }
    }
  }
}

bbq_flat:

PUT my-image-index2
{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 64,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "bbq_flat"
        }
      },
      "file-type": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      }
    }
  }
}

Response:
{
  "my-image-index2": {
    "mappings": {
      "properties": {
        "file-type": {
          "type": "keyword"
        },
        "image-vector": {
          "type": "dense_vector",
          "dims": 64,
          "index": true,
          "similarity": "l2_norm",
          "index_options": {
            "type": "bbq_flat",
            "rescore_vector": {
              "oversample": 3
            }
          }
        },
        "title": {
          "type": "text"
        }
      }
    }
  }
}

int8 (no index_options provided, so the default int8_hnsw applies and no oversample default is added):

PUT my-image-index
{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 3,
        "index": true,
        "similarity": "l2_norm"
      },
      "file-type": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      }
    }
  }
}

Response:
{
  "my-image-index": {
    "mappings": {
      "properties": {
        "file-type": {
          "type": "keyword"
        },
        "image-vector": {
          "type": "dense_vector",
          "dims": 3,
          "index": true,
          "similarity": "l2_norm",
          "index_options": {
            "type": "int8_hnsw",
            "m": 16,
            "ef_construction": 100
          }
        },
        "title": {
          "type": "text"
        }
      }
    }
  }
}

An explicitly provided oversample value is respected for bbq_*:

PUT my-image-index3
{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 64,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "bbq_hnsw",
          "rescore_vector": {"oversample": 2.0}
        }
      },
      "file-type": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      }
    }
  }
}

Response:
{
  "my-image-index3": {
    "mappings": {
      "properties": {
        "file-type": {
          "type": "keyword"
        },
        "image-vector": {
          "type": "dense_vector",
          "dims": 64,
          "index": true,
          "similarity": "l2_norm",
          "index_options": {
            "type": "bbq_hnsw",
            "m": 16,
            "ef_construction": 100,
            "rescore_vector": {
              "oversample": 2
            }
          }
        },
        "title": {
          "type": "text"
        }
      }
    }
  }
}
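
The four request/response pairs above suggest a simple rule: a bbq_* type with no rescore_vector gets an oversample of 3.0, an explicit value is kept, and other types are left alone. Here is a minimal Python sketch of that rule. It is not the Elasticsearch implementation (that lives in the Java mapper code). The names `effective_index_options` and `DEFAULT_BBQ_OVERSAMPLE` are hypothetical, and the other defaults seen in the responses (m, ef_construction) are deliberately not modelled.

```python
import copy

# Assumption: the default value is taken from the PR description (3.0 for bbq_* types).
DEFAULT_BBQ_OVERSAMPLE = 3.0


def effective_index_options(index_options):
    """Return a copy of index_options with the bbq oversample default applied.

    Hypothetical helper for illustration only. It models just the
    rescore_vector default, not m / ef_construction.
    """
    options = copy.deepcopy(index_options)  # never mutate the caller's mapping
    is_bbq = options.get("type", "").startswith("bbq_")
    if is_bbq and "rescore_vector" not in options:
        options["rescore_vector"] = {"oversample": DEFAULT_BBQ_OVERSAMPLE}
    return options


if __name__ == "__main__":
    for opts in (
        {"type": "bbq_hnsw"},
        {"type": "bbq_flat"},
        {"type": "int8_hnsw"},
        {"type": "bbq_hnsw", "rescore_vector": {"oversample": 2.0}},
    ):
        print(opts, "->", effective_index_options(opts))
```

The helper returns a copy so that the mapping as written by the user and the effective mapping can be compared side by side, as the request/response pairs above do.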

@Samiul-TheSoccerFan force-pushed the update_default_oversample_for_bbq branch from e1acdbe to 70db95c on April 22, 2025 22:07
@Samiul-TheSoccerFan
Contributor Author

@benwtrent @jimczi While I work on the YAML tests, could I get some quick feedback on the current changes?

RescoreVector rescoreVector = null;
if (indexVersion.onOrAfter(ADD_RESCORE_PARAMS_TO_QUANTIZED_VECTORS)) {
    rescoreVector = RescoreVector.fromIndexOptions(indexOptionsMap, indexVersion);
    if (rescoreVector == null) {
Member

This should only happen on new indices. Please add a new index version to bar changing this on existing indices.
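
The request above amounts to gating the new default on the version an index was created with, so that existing indices keep their previous behavior. Here is a rough Python sketch of that kind of gate. The version number is made up, and `DEFAULT_OVERSAMPLE_VERSION` and `resolve_oversample` are hypothetical names, not identifiers from the Elasticsearch code base. The real check is done in Java against an index version constant.

```python
from typing import Optional

# Hypothetical version id from which the new default applies (made-up number).
DEFAULT_OVERSAMPLE_VERSION = 9_020_000
# Default value taken from the PR description.
DEFAULT_BBQ_OVERSAMPLE = 3.0


def resolve_oversample(
    index_type: str,
    provided: Optional[float],
    index_created_version: int,
) -> Optional[float]:
    """Pick the oversample value for a dense_vector field (illustrative only)."""
    # An explicitly configured value is always kept.
    if provided is not None:
        return provided
    # The default is applied only to bbq_* types on sufficiently new indices.
    is_new_index = index_created_version >= DEFAULT_OVERSAMPLE_VERSION
    if index_type.startswith("bbq_") and is_new_index:
        return DEFAULT_BBQ_OVERSAMPLE
    # Older indices and non-bbq types keep having no default.
    return None


if __name__ == "__main__":
    print(resolve_oversample("bbq_hnsw", None, DEFAULT_OVERSAMPLE_VERSION))
    print(resolve_oversample("bbq_hnsw", None, DEFAULT_OVERSAMPLE_VERSION - 1))
```

Keying the decision on the creation version rather than the running node version means an existing index resolves to the same mapping before and after an upgrade.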

@elasticsearchmachine
Collaborator

Hi @Samiul-TheSoccerFan, I've created a changelog YAML for you.

@Samiul-TheSoccerFan marked this pull request as ready for review on April 25, 2025 14:49
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine added the Team:Search Relevance label (Meta label for the Search Relevance team in Elasticsearch) on Apr 25, 2025
@Samiul-TheSoccerFan
Contributor Author

@benwtrent Is this safe to merge, or should we wait for @jimczi's review?

@benwtrent
Member

We aren't in a hurry. Let's see what @jimczi says :)

@jimczi (Contributor) left a comment

LGTM, thanks @Samiul-TheSoccerFan

@jimczi
Contributor

jimczi commented Apr 28, 2025

Let's have a follow-up to update the documentation, @Samiul-TheSoccerFan?

@Samiul-TheSoccerFan merged commit cd4fcbf into elastic:main on Apr 28, 2025
17 checks passed
@Samiul-TheSoccerFan
Contributor Author

Added Documentation PR: elastic/docs-content#1290

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request May 5, 2025
elasticsearchmachine pushed a commit that referenced this pull request May 5, 2025
This adds backport index versions in preparation for backporting
#127134
@benwtrent
Member

💚 All backports created successfully

Branch: 8.19

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request May 6, 2025
This adds backport index versions in preparation for backporting
elastic#127134
elasticsearchmachine pushed a commit that referenced this pull request May 6, 2025
* Update Default value of Oversample for bbq (#127134)

* Unit test to validate default behavior

* adding default value to oversample for bbq

* Fix code style issue

* Update docs/changelog/127134.yaml

* Update changelog

* Adding index version to support only new indices

* Update index version name to better match

* Adding a simple yaml test to verify the yaml functionality for oversample value

* Refactor knn float to add rescore vector by default when index type is one of bbq

* adding yaml tests to verify oversample default value

* Fixing format issue for not_exists

(cherry picked from commit cd4fcbf)

* Adding backport index versions for PR #127134 (#127724)

This adds backport index versions in preparation for backporting
#127134

---------

Co-authored-by: Samiul Monir
ywangd pushed a commit to ywangd/elasticsearch that referenced this pull request May 9, 2025
This adds backport index versions in preparation for backporting
elastic#127134
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request May 12, 2025
This adds backport index versions in preparation for backporting
elastic#127134

Labels

backport pending, >enhancement, :Search Relevance/Vectors (Vector search), Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch), v8.19.0, v9.1.0

5 participants