Commit ca5d722

Docs: Add more methods on the package usage
1 parent 953cc68 commit ca5d722

5 files changed

+431
-6
lines changed

README.md

Lines changed: 284 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -169,13 +169,16 @@ If the tenant or database doesn't exist, the package will automatically create t
169169

170170
### Creating a Collection
171171

172+
Creating a collection is as simple as calling the `createCollection` method on the client and passing in the name of
173+
the collection.
174+
172175
```php
173176

174177
$collection = $chroma->createCollection('test-collection');
175178

176179
```
177180

178-
If the collection already exists in the database, the package will automatically fetch it for you.
181+
If the collection already exists in the database, the package will throw an exception.
179182

180183
### Inserting Documents
181184

@@ -231,10 +234,23 @@ that you can use:
231234

232235
$collection = $chromaDB->createCollection('test-collection', embeddingFunction: $embeddingFunction);
233236
```
237+
You can get your OpenAI API key and organization id from your [OpenAI dashboard](https://beta.openai.com/), and you
238+
can omit the organization id if your API key doesn't belong to an organization. The model name is optional as well and
239+
defaults to `text-embedding-ada-002`.
234240

235-
- `HuggingFaceEmbeddingFunction`: This embedding function uses the HuggingFace API to compute the embeddings. You can
236-
use it like this:
241+
- `JinaEmbeddingFunction`: This is a wrapper for the Jina Embedding models. You can use it by passing your Jina API key and
242+
the desired model. The model defaults to `jina-embeddings-v2-base-en`.
243+
```php
244+
use Codewithkyrian\ChromaDB\Embeddings\JinaEmbeddingFunction;
245+
246+
$embeddingFunction = new JinaEmbeddingFunction('api-key');
247+
248+
$collection = $chromaDB->createCollection('test-collection', embeddingFunction: $embeddingFunction);
249+
```
237250

251+
- `HuggingFaceEmbeddingServerFunction`: This embedding function is a wrapper around the HuggingFace Text Embedding
252+
Server. Before using it, you need to have
253+
the [HuggingFace Embedding Server](https://github.com/huggingface/text-embeddings-inference) running locally. Here's how you can use it:
238254
```php
239255
use CodeWithKyrian\Chroma\EmbeddingFunction\HuggingFaceEmbeddingFunction;
240256

@@ -243,7 +259,8 @@ that you can use:
243259
$collection = $chromaDB->createCollection('test-collection', embeddingFunction: $embeddingFunction);
244260
```
245261

246-
You can also create your own embedding function by implementing the `EmbeddingFunctionInterface` interface.
262+
Besides the built-in embedding functions, you can also create your own embedding function by implementing
263+
the `EmbeddingFunction` interface (anonymous classes work too):
247264

248265
```php
249266
use CodeWithKyrian\ChromaDB\EmbeddingFunction\EmbeddingFunctionInterface;
@@ -258,6 +275,269 @@ $embeddingFunction = new class implements EmbeddingFunctionInterface {
258275
$collection = $chroma->createCollection('test-collection', embeddingFunction: $embeddingFunction);
259276
```
260277

278+
> The embedding function will be called for each batch of documents that are inserted into the collection, and must be
279+
> provided either when creating the collection or when querying the collection. If you don't provide an embedding
280+
> function, and you don't provide the embeddings, the package will throw an exception.
281+
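As a rough illustration of the note above, here is a minimal sketch of an embedding function that receives a batch of texts and returns one vector per text. The `SketchEmbeddingFunction` interface, its `generate` method, and the `CharBucketEmbedding` class are hypothetical stand-ins defined only so the snippet runs on its own; the package's actual interface may differ. The toy vectors carry no semantic meaning.

```php
<?php

// Hypothetical stand-in for the package's embedding interface; the real
// interface name and method signature may differ.
interface SketchEmbeddingFunction
{
    /**
     * @param string[] $texts
     * @return float[][] one embedding per input text
     */
    public function generate(array $texts): array;
}

// Toy embedding: counts characters into a fixed number of buckets.
// Deterministic and dependency-free, but not semantically meaningful.
final class CharBucketEmbedding implements SketchEmbeddingFunction
{
    public function __construct(private int $dimensions = 8)
    {
    }

    public function generate(array $texts): array
    {
        $embeddings = [];
        foreach ($texts as $text) {
            $vector = array_fill(0, $this->dimensions, 0.0);
            // Each character increments the bucket chosen by its byte value.
            foreach (str_split(strtolower($text)) as $char) {
                $vector[ord($char) % $this->dimensions] += 1.0;
            }
            $embeddings[] = $vector;
        }

        return $embeddings;
    }
}

// The whole batch is embedded in one call, one vector per document.
$embedder = new CharBucketEmbedding(4);
$batch = $embedder->generate(['abc', 'This is a test document']);
```

Every vector has the same length, which matters because a collection expects embeddings of a single dimensionality.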
282+
### Inserting Documents into a Collection with an Embedding Function
283+
284+
```php
285+
$ids = ['test1', 'test2', 'test3'];
286+
$documents = [
287+
'This is a test document',
288+
'This is another test document',
289+
'This is yet another test document',
290+
];
291+
$metadatas = [
292+
['url' => 'https://example.com/test1'],
293+
['url' => 'https://example.com/test2'],
294+
['url' => 'https://example.com/test3'],
295+
];
296+
297+
$collection->add(
298+
ids: $ids,
299+
documents: $documents,
300+
metadatas: $metadatas
301+
);
302+
```
303+
304+
### Getting a Collection
305+
306+
```php
307+
$collection = $chromaDB->getCollection('test-collection');
308+
```
309+
310+
Or with an embedding function:
311+
312+
```php
313+
$collection = $chromaDB->getCollection('test-collection', embeddingFunction: $embeddingFunction);
314+
```
315+
316+
> Make sure that the embedding function you provide is the same one that was used when creating the collection.
317+
318+
### Counting the items in a collection
319+
320+
```php
321+
$collection->count(); // 2
322+
```
323+
324+
### Updating a collection
325+
326+
```php
327+
$collection->update(
328+
ids: ['test1', 'test2', 'test3'],
329+
embeddings: [
330+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
331+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
332+
[10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
333+
],
334+
metadatas: [
335+
['url' => 'https://example.com/test1'],
336+
['url' => 'https://example.com/test2'],
337+
['url' => 'https://example.com/test3'],
338+
]
339+
);
340+
```
341+
342+
### Deleting Documents
343+
344+
```php
345+
$collection->delete(['test1', 'test2', 'test3']);
346+
```
347+
348+
### Querying a Collection
349+
350+
```php
351+
$queryResponse = $collection->query(
352+
queryEmbeddings: [
353+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
354+
],
355+
nResults: 2
356+
);
357+
358+
echo $queryResponse->ids[0][0]; // test1
359+
echo $queryResponse->ids[0][1]; // test2
360+
```
361+
362+
The `query` method accepts the following arguments:
363+
364+
- `queryEmbeddings` (optional): An array of query embeddings, each of which must be a 1D array of floats. You
365+
can compute the embeddings using any embedding model of your choice (just make sure it's the same one you use when inserting
366+
as
367+
well).
368+
- `nResults`: The number of results to return. Defaults to 10.
369+
- `queryTexts` (optional): An array of query texts. The texts must be strings. You can omit this if you provide the
370+
embeddings. Here's
371+
an example:
372+
```php
373+
$queryResponse = $collection->query(
374+
queryTexts: [
375+
'This is a test document'
376+
],
377+
nResults: 2
378+
);
379+
380+
echo $queryResponse->ids[0][0]; // test1
381+
echo $queryResponse->ids[0][1]; // test2
382+
```
383+
- `where` (optional): The where clause to use to filter items based on their metadata. Here's an example:
384+
```php
385+
$queryResponse = $collection->query(
386+
queryEmbeddings: [
387+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
388+
],
389+
nResults: 2,
390+
where: [
391+
'url' => 'https://example.com/test1'
392+
]
393+
);
394+
395+
echo $queryResponse->ids[0][0]; // test1
396+
```
397+
The where clause must be an array of key-value pairs. The key must be a string, and the value can be a string or
398+
an array mapping a filter operator to a value. Here are the valid filters (`$eq`, `$ne`, `$in`, `$nin`, `$gt`, `$gte`, `$lt`,
399+
`$lte`):
400+
- `$eq`: Equals
401+
- `$ne`: Not equals
402+
- `$gt`: Greater than
403+
- `$gte`: Greater than or equal to
404+
- `$lt`: Less than
405+
- `$lte`: Less than or equal to
- `$in`: In a list of values
- `$nin`: Not in a list of values
406+
407+
Here's an example:
408+
```php
409+
$queryResponse = $collection->query(
410+
queryEmbeddings: [
411+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
412+
],
413+
nResults: 2,
414+
where: [
415+
'url' => [
416+
'$eq' => 'https://example.com/test1'
417+
]
418+
]
419+
);
420+
```
421+
You can also use multiple filters:
422+
```php
423+
$queryResponse = $collection->query(
424+
queryEmbeddings: [
425+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
426+
],
427+
nResults: 2,
428+
where: [
429+
'url' => [
430+
'$eq' => 'https://example.com/test1'
431+
],
432+
'title' => [
433+
'$ne' => 'Test 1'
434+
]
435+
]
436+
);
437+
```
438+
- `whereDocument` (optional): The where clause to use to filter items based on their document. Here's an example:
439+
```php
440+
$queryResponse = $collection->query(
441+
queryEmbeddings: [
442+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
443+
],
444+
nResults: 2,
445+
whereDocument: [
446+
'text' => 'This is a test document'
447+
]
448+
);
449+
450+
echo $queryResponse->ids[0][0]; // test1
451+
```
452+
The where clause must be an array of key-value pairs. The key must be a string, and the value can be a string or
453+
an array mapping a filter operator to a value. In this case, only two operators are supported: `$contains`
454+
and `$not_contains`.
455+
456+
Here's an example:
457+
```php
458+
$queryResponse = $collection->query(
459+
queryEmbeddings: [
460+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
461+
],
462+
nResults: 2,
463+
whereDocument: [
464+
'text' => [
465+
'$contains' => 'test document'
466+
]
467+
]
468+
);
469+
```
470+
- `include` (optional): An array of fields to include in the response. Possible values
471+
are `embeddings`, `documents`, `metadatas` and `distances`. It defaults to `embeddings`
472+
and `metadatas` (`documents` are not included by default because they can be large).
473+
```php
474+
$queryResponse = $collection->query(
475+
queryEmbeddings: [
476+
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
477+
],
478+
nResults: 2,
479+
include: ['embeddings']
480+
);
481+
```
482+
`distances` is only valid for querying and not for getting. It returns the distances between the query embeddings
483+
and the embeddings of the results.
484+
485+
Other relevant information about querying and retrieving a collection can be found in
486+
the [ChromaDB Documentation](https://docs.trychroma.com/usage-guide).
487+
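To make the semantics of the metadata `where` operators listed above more concrete, here is a small in-memory sketch that checks a metadata array against a filter. It is not how the package or the Chroma server evaluates filters; `matchesWhere` and the `year` metadata key are hypothetical and exist only for illustration. The sketch also assumes that a missing metadata key never matches, which may not reflect the server's behaviour.

```php
<?php

/**
 * Returns true when $metadata satisfies every condition in $where.
 * A bare value is treated as shorthand for ['$eq' => value].
 * Sketch assumption: an item without the filtered key never matches.
 */
function matchesWhere(array $metadata, array $where): bool
{
    foreach ($where as $key => $condition) {
        if (!array_key_exists($key, $metadata)) {
            return false;
        }
        $value = $metadata[$key];
        // Normalise 'key' => 'x' into 'key' => ['$eq' => 'x'].
        $operators = is_array($condition) ? $condition : ['$eq' => $condition];
        foreach ($operators as $operator => $operand) {
            $ok = match ($operator) {
                '$eq' => $value === $operand,
                '$ne' => $value !== $operand,
                '$gt' => $value > $operand,
                '$gte' => $value >= $operand,
                '$lt' => $value < $operand,
                '$lte' => $value <= $operand,
                '$in' => in_array($value, $operand, true),
                '$nin' => !in_array($value, $operand, true),
                default => throw new InvalidArgumentException("Unknown operator: $operator"),
            };
            if (!$ok) {
                return false;
            }
        }
    }

    return true;
}

// Illustrative data: ids mapped to metadata ('year' is a made-up key).
$items = [
    'test1' => ['url' => 'https://example.com/test1', 'year' => 2021],
    'test2' => ['url' => 'https://example.com/test2', 'year' => 2023],
];

$where = ['year' => ['$gte' => 2022]];
$matching = array_keys(
    array_filter($items, fn (array $metadata): bool => matchesWhere($metadata, $where))
);
```

Multiple keys in the filter are combined with a logical AND in this sketch, mirroring the "multiple filters" example above.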
488+
### Deleting items in a collection
489+
490+
To delete the documents in a collection, pass in an array of the ids of the items:
491+
492+
```php
493+
$collection->delete(['test1', 'test2']);
494+
495+
$collection->count(); // 1
496+
```
497+
498+
Passing the ids is optional. You can also delete items from a collection using a where filter:
499+
500+
```php
501+
$collection->add(
502+
['test1', 'test2', 'test3'],
503+
[
504+
[1.0, 2.0, 3.0, 4.0, 5.0],
505+
[6.0, 7.0, 8.0, 9.0, 10.0],
506+
[11.0, 12.0, 13.0, 14.0, 15.0],
507+
],
508+
[
509+
['some' => 'metadata1'],
510+
['some' => 'metadata2'],
511+
['some' => 'metadata3'],
512+
]
513+
);
514+
515+
$collection->delete(
516+
where: [
517+
'some' => 'metadata1'
518+
]
519+
);
520+
521+
$collection->count(); // 2
522+
```
523+
524+
### Deleting a collection
525+
526+
Deleting a collection is as simple as passing in the name of the collection to be deleted.
527+
528+
```php
529+
$chroma->deleteCollection('test-collection');
530+
```
531+
532+
## Testing
533+
534+
```
535+
# Start Chroma using the docker compose file in the repo
536+
docker compose up -d
537+
538+
composer test
539+
```
540+
261541
## Contributors
262542

263543
- [Kyrian Obikwelu](https://github.com/CodeWithKyrian)

src/Generated/ChromaApiClient.php

Lines changed: 1 addition & 0 deletions
@@ -353,6 +353,7 @@ private function handleChromaApiException(\Exception|ClientExceptionInterface $e
353353
$message = $error['detail'];
354354
$error_type = ChromaException::inferTypeFromMessage($message);
355355

356+
356357
ChromaException::throwSpecific($message, $error_type, $e->getCode());
357358
}
358359

src/Generated/Exceptions/ChromaException.php

Lines changed: 2 additions & 0 deletions
@@ -19,6 +19,8 @@ public static function throwSpecific(string $message, string $type, int $code)
1919
throw new ChromaUniqueConstraintException($message, $code);
2020
case 'DimensionalityError':
2121
throw new ChromaDimensionalityException($message, $code);
22+
case 'TypeError':
23+
throw new ChromaTypeException($message, $code);
2224
default:
2325
throw new self($message, $code);
2426
}
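The hunk above maps an error type string to a dedicated exception class, which lets callers catch narrow failures before the generic one. The following is a self-contained sketch of that pattern, not the package's code: the `Sketch*` classes and the local `throwSpecific` function are stand-ins so the snippet runs without the package installed, and the error message is made up for illustration.

```php
<?php

// Local stand-ins so the sketch runs without the package installed.
class SketchChromaException extends Exception
{
}

class SketchDimensionalityException extends SketchChromaException
{
}

class SketchTypeException extends SketchChromaException
{
}

// Picks the most specific exception class for the given error type
// and falls back to the base exception for unknown types.
function throwSpecific(string $message, string $type, int $code): void
{
    throw match ($type) {
        'DimensionalityError' => new SketchDimensionalityException($message, $code),
        'TypeError' => new SketchTypeException($message, $code),
        default => new SketchChromaException($message, $code),
    };
}

// Specific catch blocks go first; the base class catches everything else.
try {
    throwSpecific('Expected a list of floats', 'TypeError', 400);
} catch (SketchTypeException $e) {
    $caught = 'type: ' . $e->getMessage();
} catch (SketchChromaException $e) {
    $caught = 'generic: ' . $e->getMessage();
}
```

Because the specific classes extend the base exception, existing code that catches only the base class keeps working when a new case is added.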
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
1+
<?php
2+
3+
declare(strict_types=1);
4+
5+
6+
namespace Codewithkyrian\ChromaDB\Generated\Exceptions;
7+
8+
class ChromaTypeException extends ChromaException
9+
{
10+
11+
}
