Using IngestionPipeline for content not originating from the file system

I'm beginning to look at **Microsoft.Extensions.DataIngestion** pipelines. As a test, I considered using an `IngestionPipeline` to ingest content stored in a CMS SQL database and create a vector store for use with RAG. However, I'm unclear on how to implement it when the data to be ingested is stored in a database.

Currently, both overloads of the `ProcessAsync` method require file system objects.

https://github.com/dotnet/extensions/blob/15ffd76a9ed12213f9299c9b94ccf2f86eea1b62/src/Libraries/Microsoft.Extensions.DataIngestion/IngestionPipeline.cs#L80-L81

and

https://github.com/dotnet/extensions/blob/15ffd76a9ed12213f9299c9b94ccf2f86eea1b62/src/Libraries/Microsoft.Extensions.DataIngestion/IngestionPipeline.cs#L107-L108

Perhaps I misunderstand its purpose or how it's meant to be used, but it would appear that it can only ingest data originating from files. Is that the case?
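
With the signatures as they stand, one conceivable workaround is to stage the database content as temporary files and point the pipeline at that directory. The following is only a minimal sketch of that staging step, not something the library provides: `CmsPage` and `CmsStaging` are hypothetical names invented for illustration, and the one-Markdown-file-per-row layout is an assumption.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical shape of a CMS row; in practice this would come from a SQL query.
public sealed record CmsPage(int Id, string Title, string BodyMarkdown);

// Hypothetical helper, not part of Microsoft.Extensions.DataIngestion.
public static class CmsStaging
{
    // Writes each page to "<root>/<id>.md" and returns the directory,
    // so it can be handed to the DirectoryInfo overload of ProcessAsync.
    public static DirectoryInfo StageToDirectory(IEnumerable<CmsPage> pages, string root)
    {
        // CreateDirectory is a no-op if the directory already exists.
        DirectoryInfo directory = Directory.CreateDirectory(root);

        foreach (CmsPage page in pages)
        {
            string path = Path.Combine(directory.FullName, $"{page.Id}.md");
            string content = $"# {page.Title}\n\n{page.BodyMarkdown}\n";

            // The two-argument overload writes UTF-8 without a byte order mark.
            File.WriteAllText(path, content);
        }

        return directory;
    }
}
```

The returned `DirectoryInfo` could then be passed to the `ProcessAsync(DirectoryInfo directory, string searchPattern, ...)` overload with a pattern such as `"*.md"`, assuming the pipeline is configured with a reader that can parse Markdown. This round trip through the disk is exactly what I would like to avoid, which is why I am asking whether a non-file source is supported.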



Using IngestionPipeline for content not originating from the file system #7082


	public async IAsyncEnumerable<IngestionResult> ProcessAsync(DirectoryInfo directory, string searchPattern = "*.*",
		SearchOption searchOption = SearchOption.TopDirectoryOnly, [EnumeratorCancellation] CancellationToken cancellationToken = default)
