Skip to content

Conversation

@sobychacko
Copy link
Contributor

  • Deprecate CassandraVectorStoreConfig
  • Move its contents to a top level Builder class within the CassandraVectorStore

Restructuring packages for Cassandra vector store and chat memory classes

 - Deprecate CassandraVectorStoreConfig
 - Move its contents to a top level Builder class within the CassandraVectorStore

Restructuring packages for Cassandra vector store and chat memory classes
@sobychacko
Copy link
Contributor Author

@michaelsembwever We are trying to make the vector store builder pattern consistent. See the recently merged chroma vector store for a template. Cassandra vector store seems to be more involved. Since you are the original author of the cassandra vector store, can you take a look at this PR and see what you think? Thank you!


import org.springframework.ai.chat.memory.CassandraChatMemory;
import org.springframework.ai.chat.memory.CassandraChatMemoryConfig;
import org.springframework.ai.chat.memory.cassandra.CassandraChatMemory;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are all vectorstore modules moving their classes to subpackages ? (i don't see that yet with pgvector. has there been some discussion/agreement to this somewhere we can link to ?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are in the process of moving the vector store classes under that structure - in this case - org.springframework.ai.chat.memory.cassandra and springframework.ai.vectorstore.cassandra. The issue was that the vector store classes currently use the same package name in all the different modules - springframework.ai.vectorstore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was hoping that there was some link you could provide to where the discussion had been had and landed on that conclusion and action. but it's ok, i trust.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next time we will reach out to you, i wish we had a more centralized way to communicate changes to frequent contributors. we are working on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can't read the diff here (because the change is clobbering the move).
would it be possible to do the move in a separate commit, to make the reviewing simpler…?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see any diff there^ 😔

* https://github.com/apache/cassandra-java-driver/tree/4.x/manual/core/configuration
*
*/
public static class CassandraBuilder extends AbstractVectorStoreBuilder<CassandraBuilder> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does CassandraBuilder have to be an inner class ?
no strong opinion, but i gut instinct is it's big enough to warrant being a separate java file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can consider that - but wanted to make the usage consistent across all the vector stores.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would suggest that, given the lines of code and that the builder has a little more logic in it the a super simple builder. (i had previously intentionally kept the classes separate for the sake of encapsulation between startup/configuration versus runtime behaviour).

it's not a hill i will die on. if you take past PoVs into account, then i'm happy it's your call to make.

String[] parts = id.split("§¶");
String title = parts[0];
int chunk_no = 0 < parts.length ? Integer.parseInt(parts[1]) : 0;
int chunk_no = parts.length > 0 ? Integer.parseInt(parts[1]) : 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert that.

String[] parts = id.split("§¶");
String title = parts[0];
int chunk_no = 0 < parts.length ? Integer.parseInt(parts[1]) : 0;
int chunk_no = parts.length > 0 ? Integer.parseInt(parts[1]) : 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert that.

Comment on lines +73 to +74
.disallowSchemaChanges(!properties.isInitializeSchema())
.returnEmbeddings(properties.getReturnEmbeddings())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a behaviour change here, in that we're no longer respecting the builder's default, imposing the autoConfiguration's default over it. (the builder's default should be canonical)

i'm not sure how important this is… 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, we should be consistent. I'll create a new issue to capture this and discuss.

@michaelsembwever
Copy link
Contributor

@sobychacko the overall direction of the refactoring looks good to me, nice work.

Threw in a few questions. The reviewing need to separate config-build vs package renaming is the big one.
Otherwise, not having kept up with all project discussions and changes, I'm missing the project wide statement/agreement to move to this builder pattern and packaging naming. My concern would be that we're only applying this to Chroma and Cassandra. I assume that's not the case, but would still like an explicit reference link to this agreement :)

@sobychacko
Copy link
Contributor Author

@michaelsembwever Yes, we are moving to the builder pattern for all the vector store classes to make it consistent. Started with chroma,and there are a few PR's out there that's performing this migration on the other vector stores. Before the M5 release, we hope to complete this migration.

@markpollack
Copy link
Member

Thanks for your patience and understanding @michaelsembwever

I've created #1949 to track the remaining comment in the discussion.

I've merged this after updating ref doc and class javadoc.

Merged in 25123a5

@sobychacko
Copy link
Contributor Author

@michaelsembwever linking the original epic here: #1825

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants