Commit 1bb2de8

feat: add convenience methods and dual API approach to V2 client
Implements a hybrid API design that provides both simple convenience methods for common operations (80% use case) and builder patterns for complex scenarios (20% use case). This aligns the Java client with Chroma's official Python/TypeScript SDKs and follows patterns from successful Java libraries like OkHttp and Jedis.

Changes:
- Add convenience methods to Collection class for add, query, get, update, upsert, and delete operations
- Add queryTexts support to QueryRequest for text-based semantic search
- Add queryByText convenience methods to Collection class
- Update QuickStartExample to demonstrate both simple and advanced APIs
- Add comprehensive tests for new convenience methods (17 tests passing)
- Update V2_API.md with dual API documentation and usage guidelines

Design philosophy: "Simple things should be simple, complex things should be possible"
1 parent 405f1d7 commit 1bb2de8

5 files changed: +703 -32 lines changed

V2_API.md

Lines changed: 377 additions & 0 deletions
@@ -0,0 +1,377 @@
# ChromaDB V2 API Documentation

## Overview

The V2 API is an experimental implementation of the ChromaDB v2 client for Java. It is designed around radical simplicity, following successful Java libraries such as OkHttp, Retrofit, and Jedis.

**⚠️ Important:** The v2 API does not yet exist in ChromaDB. This implementation is based on the anticipated v2 API design and is provided for experimental/preview purposes only.

## Design Principles

### Radical Simplicity
- **Dual API Approach**: Convenience methods for common cases (80%), builders for complex operations (20%)
- **Chroma-Aligned**: API mirrors official Python/TypeScript SDKs for familiarity
- **Flat Package Structure**: All public API classes in `tech.amikos.chromadb.v2` package (no sub-packages)
- **Simple Things Simple**: Common operations in 1-2 lines, no builders required
- **Minimal Public API Surface**: ~20-25 classes total (following OkHttp's model)
- **Concrete Over Abstract**: Prefer concrete classes over interfaces where possible

## Architecture

```
Client (interface)
├── BaseClient (abstract)
│   ├── ServerClient (self-hosted)
│   └── CloudClient (cloud - future)

└── Collection (smart entity with operations)
    ├── query()
    ├── get()
    ├── add()
    ├── update()
    ├── upsert()
    ├── delete()
    └── count()
```

### Core Classes (~20 total)
- `ServerClient` / `CloudClient` - Client implementations
- `Collection` - Concrete collection class (not interface)
- `Metadata` - Strongly-typed metadata with builder
- Query builders: `QueryBuilder`, `AddBuilder`, etc.
- Model classes: `Where`, `WhereDocument`, `Include`
- Auth: `AuthProvider` interface with implementations
- Exceptions: Strongly-typed exception hierarchy

## Quick Start

### 1. Create a Client

```java
import tech.amikos.chromadb.v2.ChromaClient;
import tech.amikos.chromadb.v2.AuthProvider;

ChromaClient client = ChromaClient.builder()
    .serverUrl("http://localhost:8000")
    .auth(AuthProvider.none())
    .tenant("default_tenant")
    .database("default_database")
    .build();
```

### 2. Create a Collection

```java
import java.util.Map;

// Simple creation
Collection collection = client.createCollection("my-collection");

// With metadata (separate variable and name so both calls can sit in one scope)
Collection collectionWithMetadata = client.createCollection("my-collection-with-metadata",
    Map.of("description", "My collection"));
```

## Simple API (Convenience Methods)

For most use cases, use the simple, Chroma-aligned convenience methods:

### 3. Add Records

```java
// Simple add - mirrors Python/TypeScript Chroma API
collection.add(
    List.of("id1", "id2", "id3"),
    List.of(
        List.of(0.1f, 0.2f, 0.3f),
        List.of(0.4f, 0.5f, 0.6f),
        List.of(0.7f, 0.8f, 0.9f)
    ),
    List.of("Document 1", "Document 2", "Document 3"),
    List.of(
        Map.of("author", "John"),
        Map.of("author", "Jane"),
        Map.of("author", "Bob")
    )
);
```

### 4. Query Collection

```java
// Simple query by embeddings
QueryResponse results = collection.query(
    List.of(List.of(0.1f, 0.2f, 0.3f)),
    10 // number of results
);

// Query with filtering
results = collection.query(
    List.of(List.of(0.1f, 0.2f, 0.3f)),
    10,
    Where.eq("author", "John")
);

// Query by text (auto-embedded)
results = collection.queryByText(
    List.of("quantum computing"),
    5
);
```

### 5. Get Records

```java
// Simple get by IDs
GetResponse records = collection.get(List.of("id1", "id2"));

// Get with includes
records = collection.get(
    List.of("id1", "id2"),
    Include.DOCUMENTS, Include.METADATAS
);
```

### 6. Update/Upsert Records

```java
// Simple upsert
collection.upsert(
    List.of("id4"),
    List.of(List.of(0.2f, 0.3f, 0.4f)),
    List.of("New document")
);
```

### 7. Delete Records

```java
// Delete by IDs
collection.delete(List.of("id1", "id2"));

// Delete by filter
collection.delete(Where.eq("status", "archived"));
```

## Advanced API (Builder Pattern)

For complex operations with multiple options, use the builder pattern:

### Complex Query

```java
QueryResponse results = collection.query()
    .queryEmbeddings(List.of(List.of(0.1f, 0.2f, 0.3f)))
    .nResults(10)
    .where(Where.and(
        Where.eq("status", "published"),
        Where.gte("score", 8.0)
    ))
    .whereDocument(WhereDocument.contains("technology"))
    .include(Include.EMBEDDINGS, Include.METADATAS, Include.DISTANCES)
    .execute();
```

### Complex Get with Pagination

```java
GetResponse records = collection.get()
    .where(Where.eq("category", "tech"))
    .limit(100)
    .offset(0)
    .include(Include.DOCUMENTS, Include.METADATAS)
    .execute();
```

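The `limit`/`offset` pair above fetches a single page. Walking a whole result set means advancing the offset until a short page comes back. The following is a minimal sketch of that loop in plain Java, not client code: `PaginationSketch` and its `fetchPage` parameter are hypothetical, with `fetchPage.apply(limit, offset)` standing in for a call such as `collection.get().limit(limit).offset(offset).execute()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical helper, not part of the client: pages through any limit/offset source.
final class PaginationSketch {

    // fetchPage.apply(limit, offset) is assumed to return the IDs of one page.
    static List<String> fetchAll(BiFunction<Integer, Integer, List<String>> fetchPage, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            List<String> page = fetchPage.apply(pageSize, offset);
            all.addAll(page);
            if (page.size() < pageSize) {
                break; // a short or empty page means the source is exhausted
            }
            offset += pageSize;
        }
        return all;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a collection holding five records.
        List<String> stored = List.of("id1", "id2", "id3", "id4", "id5");
        BiFunction<Integer, Integer, List<String>> fakeGet = (limit, offset) ->
                stored.subList(Math.min(offset, stored.size()),
                               Math.min(offset + limit, stored.size()));
        System.out.println(fetchAll(fakeGet, 2)); // prints [id1, id2, id3, id4, id5]
    }
}
```

Stopping on a short page costs one extra request when the total is an exact multiple of the page size, but it avoids needing a separate count call.
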
### Complex Add

```java
// embeddings, documents, metadatas, and uris are lists prepared elsewhere, one entry per ID
collection.add()
    .ids(List.of("id1", "id2"))
    .embeddings(embeddings)
    .documents(documents)
    .metadatas(metadatas)
    .uris(uris)
    .execute();
```

## Advanced Features

### Authentication

```java
// Basic authentication
ServerClient client = ServerClient.builder()
    .baseUrl("http://localhost:8000")
    .auth(AuthProvider.basic("username", "password"))
    .build();

// Bearer token
client = ServerClient.builder()
    .baseUrl("http://localhost:8000")
    .auth(AuthProvider.bearerToken("your-api-token"))
    .build();

// X-Chroma-Token header
client = ServerClient.builder()
    .baseUrl("http://localhost:8000")
    .auth(AuthProvider.chromaToken("chroma-token"))
    .build();
```

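The three providers above differ mainly in the HTTP header they attach to each request. As a rough illustration, the sketch below builds those headers with the JDK only. `AuthHeaderSketch` is a hypothetical class, not the client's `AuthProvider` implementation; it assumes the standard `Authorization: Basic` and `Authorization: Bearer` schemes plus the `X-Chroma-Token` header named above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;

// Hypothetical sketch of the header each auth style would send; not the client's AuthProvider.
final class AuthHeaderSketch {

    // Basic auth: "Authorization: Basic base64(user:password)"
    static Map<String, String> basic(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return Map.of("Authorization", "Basic " + encoded);
    }

    // Bearer token: "Authorization: Bearer <token>"
    static Map<String, String> bearerToken(String token) {
        return Map.of("Authorization", "Bearer " + token);
    }

    // Chroma token: "X-Chroma-Token: <token>"
    static Map<String, String> chromaToken(String token) {
        return Map.of("X-Chroma-Token", token);
    }

    public static void main(String[] args) {
        System.out.println(basic("username", "password"));
        System.out.println(bearerToken("your-api-token"));
        System.out.println(chromaToken("chroma-token"));
    }
}
```

Basic credentials are only encoded, not encrypted, so any of these headers should travel over HTTPS outside local development.
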
### Embedding Functions

```java
// Default embedding (uses all-MiniLM-L6-v2)
EmbeddingFunction defaultEF = EmbeddingFunction.getDefault();

// OpenAI embeddings
EmbeddingFunction openAI = EmbeddingFunction.openAI("your-api-key");

// Custom embedding function
EmbeddingFunction custom = new EmbeddingFunction() {
    @Override
    public List<List<Float>> embed(List<String> texts) {
        // Your embedding logic goes here: return one vector per input text
        throw new UnsupportedOperationException("embedding logic not implemented yet");
    }
};

// Use with collection
Collection collection = client.createCollection(builder -> builder
    .name("documents")
    .embeddingFunction(openAI)
);
```

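To make the custom case concrete, here is a toy embedding function written against a stand-in interface. It is only a sketch: `TextEmbedder` is hypothetical and copies just the `embed(List<String>)` signature from the snippet above, and the hashing scheme (token hash codes counted into buckets, then L2-normalized) is an illustrative choice that is useful for tests, not for real semantic search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; TextEmbedder stands in for the client's EmbeddingFunction.
final class ToyEmbeddingSketch {

    interface TextEmbedder {
        List<List<Float>> embed(List<String> texts);
    }

    // Counts whitespace-separated tokens into `dimensions` hash buckets, then L2-normalizes.
    static TextEmbedder hashing(int dimensions) {
        return texts -> {
            List<List<Float>> vectors = new ArrayList<>();
            for (String text : texts) {
                float[] buckets = new float[dimensions];
                for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
                    if (token.isEmpty()) {
                        continue; // empty input produces a single empty token
                    }
                    buckets[Math.floorMod(token.hashCode(), dimensions)] += 1f;
                }
                double norm = 0;
                for (float b : buckets) {
                    norm += b * b;
                }
                norm = Math.sqrt(norm);
                List<Float> vector = new ArrayList<>(dimensions);
                for (float b : buckets) {
                    vector.add(norm == 0 ? 0f : (float) (b / norm));
                }
                vectors.add(vector);
            }
            return vectors;
        };
    }

    public static void main(String[] args) {
        TextEmbedder embedder = hashing(4);
        // One 4-dimensional vector per input text, in input order.
        System.out.println(embedder.embed(List.of("quantum computing", "machine learning")));
    }
}
```

The function is deterministic, so the same text always maps to the same vector, which is the property a real embedding function also needs for add and query to agree.
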
### Metadata Filtering (Where DSL)

```java
// Complex filter conditions
Where filter = Where.builder()
    .and(
        Where.eq("status", "published"),
        Where.gte("score", 8.0),
        Where.or(
            Where.eq("category", "tech"),
            Where.eq("category", "science")
        )
    )
    .build();

// Use in queries
QueryResponse results = collection.query(builder -> builder
    .queryTexts(Arrays.asList("search text"))
    .where(filter)
    .nResults(10)
);
```

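Chroma's filter syntax expresses conditions like these as nested objects keyed by operators such as `$and`, `$or`, `$eq`, and `$gte`. The sketch below shows how a filter of the shape above could map onto that nested structure using plain maps. `WhereSketch` is a hypothetical class, and whether the client's `Where` type serializes exactly this way is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: builds the nested-map form of a metadata filter.
// It is not the client's Where class.
final class WhereSketch {

    static Map<String, Object> eq(String field, Object value) {
        return comparison(field, "$eq", value);
    }

    static Map<String, Object> gte(String field, Object value) {
        return comparison(field, "$gte", value);
    }

    @SafeVarargs
    static Map<String, Object> and(Map<String, Object>... conditions) {
        return logical("$and", conditions);
    }

    @SafeVarargs
    static Map<String, Object> or(Map<String, Object>... conditions) {
        return logical("$or", conditions);
    }

    // Shape: {field: {operator: value}}
    private static Map<String, Object> comparison(String field, String operator, Object value) {
        Map<String, Object> condition = new LinkedHashMap<>();
        condition.put(operator, value);
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put(field, condition);
        return filter;
    }

    // Shape: {operator: [condition, condition, ...]}
    private static Map<String, Object> logical(String operator, Map<String, Object>[] conditions) {
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put(operator, Arrays.asList(conditions));
        return filter;
    }

    public static void main(String[] args) {
        Map<String, Object> filter = and(
                eq("status", "published"),
                gte("score", 8.0),
                or(eq("category", "tech"), eq("category", "science")));
        // Prints the nested map; a JSON library would turn it into the request body.
        System.out.println(filter);
    }
}
```
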
### Document Filtering

```java
// Filter by document content
WhereDocument docFilter = WhereDocument.contains("machine learning");

QueryResponse results = collection.query(builder -> builder
    .queryTexts(Arrays.asList("AI research"))
    .whereDocument(docFilter)
    .nResults(5)
);
```

## Implementation Status

### What's Implemented ✅
- Basic client structure (`ServerClient`, `CloudClient`)
- Authentication providers (Basic, Bearer token, ChromaToken)
- Model classes for v2 operations
- Collection operations interface
- Query builder pattern
- Fluent API for all operations
- Type-safe metadata and filtering

### Known Issues ⚠️
1. **API Endpoints:** Requests currently target the `/api/v1` endpoints as a temporary workaround
2. **Tenant/Database Support:** v2 expects multi-tenancy, which v1 doesn't fully support
3. **Response Models:** Field names and structure differ between v1 and v2
4. **Embedding Functions:** Integration needs refinement for the v2 API

### Coming Soon 🚀
- CloudClient implementation
- Advanced query capabilities
- Batch operations optimization
- Streaming results
- Async/reactive operations

304+
## API Design: Dual Approach
305+
306+
The V2 API offers **two complementary approaches**:
307+
308+
### 1. Convenience Methods (Simple API)
309+
- **For**: 80% of use cases
310+
- **Style**: Direct method calls with parameters
311+
- **Benefit**: Minimal boilerplate, Chroma-aligned
312+
- **Example**: `collection.add(ids, embeddings, documents)`
313+
314+
### 2. Builder Pattern (Advanced API)
315+
- **For**: 20% of complex use cases
316+
- **Style**: Fluent builders with `.execute()`
317+
- **Benefit**: Maximum flexibility, all options available
318+
- **Example**: `collection.query().queryEmbeddings(...).where(...).execute()`
319+
320+
### When to Use Which?
321+
322+
| Use Case | Recommended Approach | Example |
323+
|----------|---------------------|---------|
324+
| Simple add with all data | Convenience | `collection.add(ids, embeddings, documents, metadatas)` |
325+
| Add with URIs or complex options | Builder | `collection.add().ids(...).uris(...).execute()` |
326+
| Basic query | Convenience | `collection.query(embeddings, 10)` |
327+
| Query with whereDocument or complex filters | Builder | `collection.query().queryEmbeddings(...).whereDocument(...).execute()` |
328+
| Get by IDs | Convenience | `collection.get(List.of("id1", "id2"))` |
329+
| Get with pagination | Builder | `collection.get().limit(100).offset(0).execute()` |
330+
| Delete by IDs | Convenience | `collection.delete(ids)` |
331+
| Delete by complex filter | Builder | `collection.delete().where(...).whereDocument(...).execute()` |
332+
333+
### Design Philosophy
334+
335+
> **"Simple things should be simple, complex things should be possible."**
336+
337+
The dual API approach ensures:
338+
- New users can get started quickly with minimal code
339+
- Power users have full control when needed
340+
- API feels familiar to Chroma users from Python/TypeScript
341+
- Java best practices (type safety, clarity) are maintained
342+
343+
## Migration from V1
344+
345+
The V2 API is designed to coexist with V1. Key differences:
346+
347+
| V1 | V2 |
348+
|----|-----|
349+
| `Client` class | `ChromaClient` |
350+
| Swagger-generated models | Hand-crafted POJOs |
351+
| Builder-only patterns | Dual approach (convenience + builders) |
352+
| Multiple ways to configure | Flat, simple API surface |
353+
| Nested packages | Flat package structure |
354+
## Testing

The V2 API includes comprehensive test coverage:

```bash
# Run all V2 tests
mvn test -Dtest="tech.amikos.chromadb.v2.**"

# Run with specific ChromaDB version
export CHROMA_VERSION=1.1.0 && mvn test

# Run stress tests
mvn test -Dtest=V2StressTest
```

## Support

This is an experimental API. For production use, please use the stable V1 API.

For issues or questions:
- GitHub Issues: [chromadb-java-client/issues](https://github.com/amikos-tech/chromadb-java-client/issues)
- Documentation: This file
- Examples: See test files in `src/test/java/tech/amikos/chromadb/v2/`
