@@ -169,13 +169,16 @@ If the tenant or database doesn't exist, the package will automatically create t
169169
170170### Creating a Collection
171171
172+ Creating a collection is as simple as calling the `createCollection` method on the client and passing in the name of
173+ the collection.
174+
172175``` php
173176
174177$collection = $chroma->createCollection('test-collection');
175178
176179```
177180
178- If the collection already exists in the database, the package will automatically fetch it for you.
181+ If the collection already exists in the database, the package will throw an exception.
179182
180183### Inserting Documents
181184
@@ -231,10 +234,23 @@ that you can use:
231234
232235 $collection = $chromaDB->createCollection('test-collection', embeddingFunction: $embeddingFunction);
233236 ```
237+ You can get your OpenAI API key and organization id from your [OpenAI dashboard](https://beta.openai.com/), and you
238+ can omit the organization id if your API key doesn't belong to an organization. The model name is optional as well and
239+ defaults to `text-embedding-ada-002`.
234240
235- - `HuggingFaceEmbeddingFunction`: This embedding function uses the HuggingFace API to compute the embeddings. You can
236- use it like this:
241+ - `JinaEmbeddingFunction`: This is a wrapper for the Jina Embedding models. You can use it by passing your Jina API key and
242+ the desired model, which defaults to `jina-embeddings-v2-base-en`.
243+ ```php
244+ use Codewithkyrian\ChromaDB\Embeddings\JinaEmbeddingFunction;
245+
246+ $embeddingFunction = new JinaEmbeddingFunction('api-key');
247+
248+ $collection = $chromaDB->createCollection('test-collection', embeddingFunction: $embeddingFunction);
249+ ```
237250
251+ - `HuggingFaceEmbeddingServerFunction`: This embedding function is a wrapper around the HuggingFace Text Embedding
252+ Server. Before using it, you need to have
253+ the [HuggingFace Embedding Server](https://github.com/huggingface/text-embeddings-inference) running somewhere locally. Here's how you can use it:
238254 ```php
239255 use CodeWithKyrian\Chroma\EmbeddingFunction\HuggingFaceEmbeddingFunction;
240256
@@ -243,7 +259,8 @@ that you can use:
243259 $collection = $chromaDB->createCollection('test-collection', embeddingFunction: $embeddingFunction);
244260 ```
245261
246- You can also create your own embedding function by implementing the `EmbeddingFunctionInterface` interface.
262+ Besides the built-in embedding functions, you can also create your own embedding function by implementing
263+ the `EmbeddingFunction` interface (an anonymous class works too):
247264
248265```php
249266use CodeWithKyrian\ChromaDB\EmbeddingFunction\EmbeddingFunctionInterface;
@@ -258,6 +275,269 @@ $embeddingFunction = new class implements EmbeddingFunctionInterface {
258275$collection = $chroma->createCollection('test-collection', embeddingFunction: $embeddingFunction);
259276```
260277
278+ > The embedding function will be called for each batch of documents that are inserted into the collection, and must be
279+ > provided either when creating the collection or when querying the collection. If you don't provide an embedding
280+ > function, and you don't provide the embeddings, the package will throw an exception.
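
As a rough, self-contained illustration of what a custom embedding function has to do (take a batch of texts, return one vector per text), here is a minimal sketch. The `EmbeddingFunction` interface and its `generate` method are declared locally as hypothetical stand-ins so the snippet runs on its own; check the package source for the real interface name and signature. The letter-bucket "embedding" is a toy and not a real model.

```php
<?php

// Hypothetical stand-in for the package's interface, declared here only so
// that this sketch runs without the package installed.
interface EmbeddingFunction
{
    /**
     * @param string[] $texts
     * @return float[][] One vector per input text.
     */
    public function generate(array $texts): array;
}

// Toy embedding: counts letters into 8 buckets, then L2-normalizes the vector.
$embeddingFunction = new class implements EmbeddingFunction {
    public function generate(array $texts): array
    {
        return array_map(function (string $text): array {
            $vector = array_fill(0, 8, 0.0);

            foreach (str_split(strtolower($text)) as $char) {
                if (ctype_alpha($char)) {
                    $vector[ord($char) % 8] += 1.0;
                }
            }

            $norm = sqrt(array_sum(array_map(fn (float $v): float => $v * $v, $vector)));

            // Leave an all-zero vector untouched to avoid dividing by zero.
            return $norm > 0.0
                ? array_map(fn (float $v): float => $v / $norm, $vector)
                : $vector;
        }, $texts);
    }
};

// The whole batch is embedded in one call, one vector per document.
$embeddings = $embeddingFunction->generate([
    'This is a test document',
    'This is another test document',
]);
```

Whatever model you plug in, the important property is that the same function (and therefore the same vector dimension) is used for inserting and for querying.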
281+
282+ ### Inserting Documents into a Collection with an Embedding Function
283+
284+ ``` php
285+ $ids = ['test1', 'test2', 'test3'];
286+ $documents = [
287+ 'This is a test document',
288+ 'This is another test document',
289+ 'This is yet another test document',
290+ ];
291+ $metadatas = [
292+ ['url' => 'https://example.com/test1'],
293+ ['url' => 'https://example.com/test2'],
294+ ['url' => 'https://example.com/test3'],
295+ ];
296+
297+ $collection->add(
298+ ids: $ids,
299+ documents: $documents,
300+ metadatas: $metadatas
301+ );
302+ ```
303+
304+ ### Getting a Collection
305+
306+ ``` php
307+ $collection = $chromaDB->getCollection('test-collection');
308+ ```
309+
310+ Or with an embedding function:
311+
312+ ``` php
313+ $collection = $chromaDB->getCollection('test-collection', embeddingFunction: $embeddingFunction);
314+ ```
315+
316+ > Make sure that the embedding function you provide is the same one that was used when creating the collection.
317+
318+ ### Counting the items in a collection
319+
320+ ``` php
321+ $collection->count(); // 2
322+ ```
323+
324+ ### Updating a collection
325+
326+ ``` php
327+ $collection->update(
328+ ids: ['test1', 'test2', 'test3'],
329+ embeddings: [
330+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
331+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
332+ [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
333+ ],
334+ metadatas: [
335+ ['url' => 'https://example.com/test1'],
336+ ['url' => 'https://example.com/test2'],
337+ ['url' => 'https://example.com/test3'],
338+ ]
339+ );
340+ ```
341+
342+ ### Deleting Documents
343+
344+ ``` php
345+ $collection->delete(['test1', 'test2', 'test3']);
346+ ```
347+
348+ ### Querying a Collection
349+
350+ ``` php
351+ $queryResponse = $collection->query(
352+ queryEmbeddings: [
353+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
354+ ],
355+ nResults: 2
356+ );
357+
358+ echo $queryResponse->ids[0][0]; // test1
359+ echo $queryResponse->ids[0][1]; // test2
360+ ```
361+
362+ The `query` method accepts the following arguments:
363+
364+ - `queryEmbeddings` (optional): An array of query embeddings, where each embedding is a 1D array of floats. You
365+ can compute the embeddings using any embedding model of your choice (just make sure it is the same one you use when
366+ inserting).
368+ - `nResults`: The number of results to return. Defaults to 10.
369+ - `queryTexts` (optional): An array of query texts, each of which must be a string. You can omit this if you provide the
370+ embeddings. Here's an example:
372+ ``` php
373+ $queryResponse = $collection->query(
374+ queryTexts: [
375+ 'This is a test document'
376+ ],
377+ nResults: 2
378+ );
379+
380+ echo $queryResponse->ids[0][0]; // test1
381+ echo $queryResponse->ids[0][1]; // test2
382+ ```
383+ - `where` (optional): The where clause to use to filter items based on their metadata. Here's an example:
384+ ```php
385+ $queryResponse = $collection->query(
386+ queryEmbeddings: [
387+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
388+ ],
389+ nResults: 2,
390+ where: [
391+ 'url' => 'https://example.com/test1'
392+ ]
393+ );
394+
395+ echo $queryResponse->ids[0][0]; // test1
396+ ```
397+ The where clause must be an array of key-value pairs. The key must be a string, and the value can be a string or
398+ an array of valid filter values. Here are the valid filters (`$eq`, `$ne`, `$in`, `$nin`, `$gt`, `$gte`, `$lt`,
399+ `$lte`):
400+ - `$eq`: Equals
401+ - `$ne`: Not equals
402+ - `$gt`: Greater than
403+ - `$gte`: Greater than or equal to
404+ - `$lt`: Less than
405+ - `$lte`: Less than or equal to
406+
407+ Here's an example:
408+ ``` php
409+ $queryResponse = $collection->query(
410+ queryEmbeddings: [
411+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
412+ ],
413+ nResults: 2,
414+ where: [
415+ 'url' => [
416+ '$eq' => 'https://example.com/test1'
417+ ]
418+ ]
419+ );
420+ ```
421+ You can also use multiple filters:
422+ ``` php
423+ $queryResponse = $collection->query(
424+ queryEmbeddings: [
425+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
426+ ],
427+ nResults: 2,
428+ where: [
429+ 'url' => [
430+ '$eq' => 'https://example.com/test1'
431+ ],
432+ 'title' => [
433+ '$ne' => 'Test 1'
434+ ]
435+ ]
436+ );
437+ ```
438+ - `whereDocument` (optional): The where clause to use to filter items based on their document. Here's an example:
439+ ```php
440+ $queryResponse = $collection->query(
441+ queryEmbeddings: [
442+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
443+ ],
444+ nResults: 2,
445+ whereDocument: [
446+ 'text' => 'This is a test document'
447+ ]
448+ );
449+
450+ echo $queryResponse->ids[0][0]; // test1
451+ ```
452+ The where clause must be an array of key-value pairs. The key must be a string, and the value can be a string or
453+ an array of valid filter values. In this case, only two filtering keys are supported - `$contains`
454+ and `$not_contains`.
455+
456+ Here's an example:
457+ ```php
458+ $queryResponse = $collection->query(
459+ queryEmbeddings: [
460+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
461+ ],
462+ nResults: 2,
463+ whereDocument: [
464+ 'text' => [
465+ '$contains' => 'test document'
466+ ]
467+ ]
468+ );
469+ ```
470+ - `include` (optional): An array of fields to include in the response. Possible values
471+ are `embeddings`, `documents`, `metadatas` and `distances`. It defaults to `embeddings`
472+ and `metadatas` (`documents` are not included by default because they can be large).
473+ ```php
474+ $queryResponse = $collection->query(
475+ queryEmbeddings: [
476+ [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
477+ ],
478+ nResults: 2,
479+ include: ['embeddings']
480+ );
481+ ```
482+ `distances` is only valid for querying and not for getting. It returns the distances between the query embeddings
483+ and the embeddings of the results.
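
To make the meaning of `distances` concrete, the sketch below computes a distance between a query embedding and two stored embeddings and sorts the results, closest first. It assumes the collection uses squared Euclidean (L2) distance, which to my knowledge is Chroma's default; a collection configured for cosine or inner-product distance would produce different numbers. The `squaredL2Distance` helper is a hypothetical name introduced for illustration and is not part of the package.

```php
<?php

// Squared Euclidean (L2) distance between two equal-length vectors.
// Hypothetical helper for illustration; not part of the package.
function squaredL2Distance(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Embeddings must have the same dimension.');
    }

    $sum = 0.0;
    foreach ($a as $i => $value) {
        $diff = $value - $b[$i];
        $sum += $diff * $diff;
    }

    return $sum;
}

$queryEmbedding = [1.0, 2.0, 3.0];

$storedEmbeddings = [
    'test1' => [1.0, 2.0, 3.0],
    'test2' => [3.0, 2.0, 1.0],
];

// array_map over a single array keeps the string keys (the ids).
$distances = array_map(
    fn (array $embedding): float => squaredL2Distance($queryEmbedding, $embedding),
    $storedEmbeddings
);

// Smaller distance means a closer match, so sort ascending while keeping ids.
asort($distances);

// $distances is now ['test1' => 0.0, 'test2' => 8.0]
```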
484+
485+ Other relevant information about querying and retrieving a collection can be found in
486+ the [ChromaDB Documentation](https://docs.trychroma.com/usage-guide).
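
The `where` operators described above can be easier to reason about with a small local model of their semantics. The sketch below is only an illustration of how such a filter could be evaluated against one item's metadata; it is not how Chroma or this package implements filtering. The `matchesWhere` function and the `views` metadata field are hypothetical, and treating a missing metadata key as a non-match is a choice made by this sketch.

```php
<?php

// Illustrative only: checks one metadata array against a `where` array.
// All top-level keys must match (implicit AND), mirroring the examples above.
function matchesWhere(array $metadata, array $where): bool
{
    foreach ($where as $field => $condition) {
        // A bare value is treated as shorthand for ['$eq' => value].
        if (!is_array($condition)) {
            $condition = ['$eq' => $condition];
        }

        // Sketch-level choice: an item without the field never matches.
        if (!array_key_exists($field, $metadata)) {
            return false;
        }

        $actual = $metadata[$field];

        foreach ($condition as $operator => $expected) {
            $ok = match ($operator) {
                '$eq' => $actual === $expected,
                '$ne' => $actual !== $expected,
                '$gt' => $actual > $expected,
                '$gte' => $actual >= $expected,
                '$lt' => $actual < $expected,
                '$lte' => $actual <= $expected,
                '$in' => in_array($actual, (array) $expected, true),
                '$nin' => !in_array($actual, (array) $expected, true),
                default => throw new InvalidArgumentException("Unsupported operator: {$operator}"),
            };

            if (!$ok) {
                return false;
            }
        }
    }

    return true;
}

$items = [
    'test1' => ['url' => 'https://example.com/test1', 'views' => 10],
    'test2' => ['url' => 'https://example.com/test2', 'views' => 25],
    'test3' => ['url' => 'https://example.com/test3', 'views' => 40],
];

$where = [
    'views' => ['$gte' => 20],
    'url' => ['$ne' => 'https://example.com/test3'],
];

$matchingIds = array_keys(
    array_filter($items, fn (array $metadata): bool => matchesWhere($metadata, $where))
);

// $matchingIds is ['test2']
```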
487+
488+ ### Deleting items in a collection
489+
490+ To delete the documents in a collection, pass in an array of the ids of the items:
491+
492+ ```php
493+ $collection->delete(['test1', 'test2']);
494+
495+ $collection->count(); // 1
496+ ```
497+
498+ Passing the ids is optional. You can also delete items from a collection using a where filter instead:
499+
500+ ``` php
501+ $collection->add(
502+ ['test1', 'test2', 'test3'],
503+ [
504+ [1.0, 2.0, 3.0, 4.0, 5.0],
505+ [6.0, 7.0, 8.0, 9.0, 10.0],
506+ [11.0, 12.0, 13.0, 14.0, 15.0],
507+ ],
508+ [
509+ ['some' => 'metadata1'],
510+ ['some' => 'metadata2'],
511+ ['some' => 'metadata3'],
512+ ]
513+ );
514+
515+ $collection->delete(
516+ where: [
517+ 'some' => 'metadata1'
518+ ]
519+ );
520+
521+ $collection->count(); // 2
522+ ```
523+
524+ ### Deleting a collection
525+
526+ Deleting a collection is as simple as passing in the name of the collection to be deleted.
527+
528+ ``` php
529+ $chroma->deleteCollection('test-collection');
530+ ```
531+
532+ ## Testing
533+
534+ ```
535+ # Run chroma by running the docker compose file in the repo
536+ docker compose up -d
537+
538+ composer test
539+ ```
540+
261541## Contributors
262542
263543- [Kyrian Obikwelu](https://github.com/CodeWithKyrian)