[Store] Integrate Symfony's finder for more flexible loaders #1579

@aszenz

I currently use the code below to index all non-empty Markdown files. This use case seems common enough to warrant better DX by integrating the Finder directly, but I'm curious to hear alternative solutions for making the loader more flexible.

        // Loader and vectorizer handed to the indexer below.
        $documentsLoader = new TextFileLoader();
        $vectorizer = new Vectorizer(
            $this->aiPlatform->getPlatform(),
            AIPlatform::EMBEDDING_MODEL
        );
        // Collect every non-empty Markdown file below $folderPath.
        $docFiles = Finder::create()->files()->in($folderPath)->name('*.md')->size('> 0')->getIterator();
        // Resolve each file to its real path; these paths are passed as the indexer source.
        $docsArray = \array_map(
            fn(\SplFileInfo $file) => $file->getRealPath() ?: throw new \RuntimeException('File path not found'),
            iterator_to_array($docFiles)
        );
        $output->writeln(\sprintf('Indexing %d documents...', \count($docsArray)));
        $indexer = new Indexer(
            loader: $documentsLoader,
            vectorizer: $vectorizer,
            store: $this->vektor,
            source: $docsArray,
            transformers: [
                new RecursiveCharacterTextTransformer(
                    // "\n" needs double quotes; '\n' in single quotes is a literal backslash followed by "n".
                    separators: ['#', '##', "\n", ' '],
                    chunkSize: 1000,
                )
            ]
        );
        $indexer->index();
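
As a rough illustration of how the path-collection boilerplate above might be reduced without touching the loader, here is a minimal sketch of a helper that accepts any iterable of `\SplFileInfo` objects and returns a list of real paths. The function name `finderToSources()` is hypothetical and not part of symfony/ai. To keep the sketch runnable without extra dependencies, a `FilesystemIterator` wrapped in a `CallbackFilterIterator` stands in for the Finder.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (an assumption, not an existing symfony/ai API):
 * turns any iterable of \SplFileInfo objects into a list of real paths.
 *
 * @param iterable<\SplFileInfo> $files
 *
 * @return list<string>
 */
function finderToSources(iterable $files): array
{
    $paths = [];
    foreach ($files as $file) {
        $realPath = $file->getRealPath();
        if (false === $realPath) {
            throw new \RuntimeException(\sprintf('Cannot resolve path for "%s".', $file->getPathname()));
        }
        $paths[] = $realPath;
    }

    return $paths;
}

// Demo data: one non-empty and one empty Markdown file in a temp directory.
$dir = sys_get_temp_dir().'/finder_sources_'.uniqid();
mkdir($dir);
file_put_contents($dir.'/a.md', '# Title');
file_put_contents($dir.'/empty.md', '');

// Stand-in for Finder: keep only non-empty *.md files.
$files = new \CallbackFilterIterator(
    new \FilesystemIterator($dir),
    static fn (\SplFileInfo $f): bool => 'md' === $f->getExtension() && $f->getSize() > 0
);

$sources = finderToSources($files);
echo \count($sources), " source(s) collected\n";
```

Because `Finder` implements `IteratorAggregate` and yields objects that extend `\SplFileInfo`, the same helper should also accept `Finder::create()->files()->in($folderPath)->name('*.md')->size('> 0')` directly, and its result could then be passed as `source:` to the indexer.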

Labels: RFC (Request For Comments), Store (AI Store component)