Conversation

@jayhack
Contributor

@jayhack jayhack commented Feb 9, 2025

Introduces a very naive vector index as an "extension":

  • Get all embeddings from OpenAI
  • Store them in a numpy array
  • Ability to save/load the index on disk (a rough sketch follows this list)
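
A minimal sketch of that shape, assuming the openai Python client and numpy; the class and method names (NaiveVectorIndex, build, save, load) are illustrative, not the extension's actual API:

```python
# Hypothetical sketch of a naive file-level vector index; names are
# illustrative assumptions, not the actual extension's API.
import numpy as np
from openai import OpenAI


class NaiveVectorIndex:
    def __init__(self, model: str = "text-embedding-3-small"):
        self.client = OpenAI()
        self.model = model
        self.paths: list[str] = []  # file paths, aligned with rows of `embeddings`
        self.embeddings: np.ndarray | None = None

    def build(self, files: dict[str, str]) -> None:
        """Embed every file's contents and stack the vectors into one array."""
        self.paths = list(files)
        vectors = []
        for path in self.paths:
            # Naive: one request per file, contents truncated to stay under token limits.
            resp = self.client.embeddings.create(model=self.model, input=files[path][:8000])
            vectors.append(resp.data[0].embedding)
        self.embeddings = np.array(vectors, dtype=np.float32)

    def save(self, path: str) -> None:
        """Persist paths and embeddings to a single .npz file."""
        np.savez(path, paths=np.array(self.paths), embeddings=self.embeddings)

    def load(self, path: str) -> None:
        data = np.load(path)
        self.paths = data["paths"].tolist()
        self.embeddings = data["embeddings"]
```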

Initial investigations show that indexing all of PyTorch takes about 50 MB of memory and roughly 2.5 minutes.

Future iterations on this can show how to:

  • invalidate embeddings when a file's blob hash changes (sketched after this list)
  • store on a symbol level
  • compute symbol-level embeddings including their extended context

etc.
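
For the blob-hash item, here is a hedged sketch of how stale entries might be detected; file_blob_sha and stale_paths are hypothetical helpers, not part of this PR:

```python
# Hypothetical sketch of blob-hash-based invalidation (a possible future iteration);
# helper names and the index layout are illustrative assumptions.
import hashlib


def file_blob_sha(contents: str) -> str:
    """Mimic git's blob hashing: sha1 over 'blob <len>\\0' + contents."""
    data = contents.encode("utf-8")
    return hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()


def stale_paths(indexed_hashes: dict[str, str], current_files: dict[str, str]) -> list[str]:
    """Return files whose stored blob hash no longer matches the working tree."""
    return [
        path
        for path, contents in current_files.items()
        if indexed_hashes.get(path) != file_blob_sha(contents)
    ]
```

Only the stale files would then need to be re-embedded on the next build.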

This is designed to be used as input for other APIs, like the "semantic search" tool, LlamaIndex retrievers etc.
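
As a usage illustration, a downstream "semantic search"-style consumer could rank files by cosine similarity against the stored matrix. This query flow is an assumption about how such a tool might consume the index, not its actual implementation (NaiveVectorIndex refers to the sketch above):

```python
# Hypothetical consumer of the index sketched above; the query flow is an assumption.
import numpy as np


def semantic_search(index, query_vector: np.ndarray, top_k: int = 5) -> list[tuple[str, float]]:
    """Rank indexed files by cosine similarity to an already-embedded query vector."""
    emb = index.embeddings
    sims = (emb @ query_vector) / (
        np.linalg.norm(emb, axis=1) * np.linalg.norm(query_vector) + 1e-8
    )
    best = np.argsort(-sims)[:top_k]
    return [(index.paths[i], float(sims[i])) for i in best]
```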

@jayhack jayhack requested review from a team and codegen-team as code owners February 9, 2025 21:51
@codecov

codecov bot commented Feb 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

@jayhack jayhack merged commit e2b2da2 into develop Feb 9, 2025
23 of 24 checks passed
@jayhack jayhack deleted the jay/vector-index branch February 9, 2025 23:28
@github-actions
Contributor

🎉 This PR is included in version 0.6.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

tkfoss pushed a commit that referenced this pull request Feb 10, 2025