@rjrudin (Contributor) commented Sep 26, 2025

No description provided.

Copilot AI review requested due to automatic review settings September 26, 2025 19:28

github-actions bot commented Sep 26, 2025

Copyright Validation Results
Total: 10 | Passed: 8 | Failed: 0 | Skipped: 2 | at: 2025-09-26 20:04:26 UTC | commit: 40b51c9

⏭️ Skipped (Excluded) Files

  • .copyrightconfig
  • test-app/src/main/ml-schemas-12/tde/xml-vector-chunks.json

✅ Valid Files

  • marklogic-spark-connector/src/main/java/com/marklogic/spark/Options.java
  • marklogic-spark-connector/src/main/java/com/marklogic/spark/Util.java
  • marklogic-spark-connector/src/main/java/com/marklogic/spark/core/embedding/XmlChunkConfig.java
  • marklogic-spark-connector/src/main/java/com/marklogic/spark/core/splitter/ChunkConfig.java
  • marklogic-spark-connector/src/main/java/com/marklogic/spark/dom/NamespaceContextFactory.java
  • marklogic-spark-connector/src/test/java/com/marklogic/spark/AbstractIntegrationTest.java
  • marklogic-spark-connector/src/test/java/com/marklogic/spark/writer/embedding/AddEmbeddingsFromTextTest.java
  • marklogic-spark-connector/src/test/java/com/marklogic/spark/writer/embedding/AddEmbeddingsToXmlTest.java

✅ All files have valid copyright headers!

Copilot AI left a comment

Pull Request Overview

This PR updates the default namespace for embedding elements in XML documents from the generic model namespace to a MarkLogic-specific vector namespace. The change aligns the connector with the server's default vector namespace behavior.

  • Updated the default embedding namespace from http://marklogic.com/appservices/model to http://marklogic.com/vector
  • Modified XML chunk configuration to use the new vector namespace by default
  • Updated test expectations to reflect the namespace change from model:embedding to vec:embedding
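A minimal sketch of what the namespace change means for XPath consumers, using only the JDK's XML APIs. The two namespace URIs and the `model` and `vec` prefixes come from the overview above; the class name, the sample document, and the `evaluate` helper are hypothetical and are not taken from the connector's code.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration; not part of the connector.
class VectorNamespaceXPathSketch {

    // Namespace URIs quoted in the PR overview.
    static final String MODEL_NS = "http://marklogic.com/appservices/model";
    static final String VECTOR_NS = "http://marklogic.com/vector";

    // Assumed sample document: chunks stay in the model namespace,
    // while the embedding element uses the vector namespace.
    static final String SAMPLE =
        "<root>"
        + "<model:chunks xmlns:model='" + MODEL_NS + "'>"
        + "<model:chunk>"
        + "<model:text>hello</model:text>"
        + "<vec:embedding xmlns:vec='" + VECTOR_NS + "'>[0.1, 0.2]</vec:embedding>"
        + "</model:chunk>"
        + "</model:chunks>"
        + "</root>";

    // Evaluates an XPath expression with the "model" and "vec" prefixes bound.
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    if ("model".equals(prefix)) return MODEL_NS;
                    if ("vec".equals(prefix)) return VECTOR_NS;
                    return XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String namespaceURI) {
                    return null; // reverse lookup is not needed for XPath evaluation
                }

                @Override
                public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.emptyIterator();
                }
            });
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Matches under the new default namespace.
        System.out.println(evaluate(SAMPLE, "/root/model:chunks/model:chunk/vec:embedding"));
        // The old path selects nothing, so XPath returns an empty string.
        System.out.println(evaluate(SAMPLE, "/root/model:chunks/model:chunk/model:embedding").isEmpty());
    }
}
```

Under these assumptions, an XPath expression written against `model:embedding` silently returns an empty result rather than failing, which is why the test expectations and the TDE template need updating alongside the default.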

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| Util.java | Introduces DEFAULT_VECTOR_NAMESPACE constant for the new MarkLogic vector namespace |
| Options.java | Updates documentation to clarify the new default namespace behavior as of version 3.0.0 |
| ChunkConfig.java | Simplifies embedding namespace logic to default to the vector namespace |
| XmlChunkConfig.java | Changes default embedding namespace from model to vector namespace |
| NamespaceContextFactory.java | Adds vector namespace prefix mapping for XPath operations |
| AbstractIntegrationTest.java | Registers the new vector namespace prefix for test XML parsing |
| AddEmbeddingsToXmlTest.java | Updates test assertions and comments to expect vec:embedding elements |
| xml-vector-chunks.json | Updates TDE template to use vector namespace for embedding extraction |
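The defaulting behavior summarized for ChunkConfig.java might look roughly like the sketch below. Only the `DEFAULT_VECTOR_NAMESPACE` constant name and its URI come from this PR; the class, the `resolveEmbeddingNamespace` and `newEmbeddingElement` methods, and the choice to treat a blank value as unset are assumptions made for illustration, not the connector's actual code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration; not the connector's ChunkConfig.
class EmbeddingNamespaceDefaultSketch {

    // Constant name from the Util.java summary; URI from the PR overview.
    static final String DEFAULT_VECTOR_NAMESPACE = "http://marklogic.com/vector";

    // Assumption: a null or blank configured namespace means "use the default".
    static String resolveEmbeddingNamespace(String configured) {
        boolean unset = configured == null || configured.trim().isEmpty();
        return unset ? DEFAULT_VECTOR_NAMESPACE : configured;
    }

    // Builds an "embedding" element in the resolved namespace.
    static Element newEmbeddingElement(String configuredNamespace, String vectorText) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element embedding = doc.createElementNS(
                resolveEmbeddingNamespace(configuredNamespace), "embedding");
            embedding.setTextContent(vectorText);
            return embedding;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create XML document", e);
        }
    }

    public static void main(String[] args) {
        // No namespace configured: falls back to the vector namespace.
        System.out.println(newEmbeddingElement(null, "[0.1, 0.2]").getNamespaceURI());
        // An explicitly configured namespace is used unchanged.
        System.out.println(newEmbeddingElement("urn:example:custom", "[0.1, 0.2]").getNamespaceURI());
    }
}
```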


for (XmlNode chunk : doc.getXmlNodes("/root/model:chunks/model:chunk")) {
    String embeddingValue = chunk.getElementValue("/model:chunk/model:embedding");
    System.out.println(chunk.getPrettyXml());

Copilot AI Sep 26, 2025

Remove this debug print statement before merging. It appears to be leftover debugging code and will clutter the test output.

Suggested change
System.out.println(chunk.getPrettyXml());

@rjrudin rjrudin merged commit 82020c0 into develop Sep 29, 2025
4 checks passed
@rjrudin rjrudin deleted the feature/MLE-24374 branch September 29, 2025 20:09

2 participants