Summary
Implement batch embedding generation in `rag_ingest` to significantly reduce ingestion time by grouping multiple chunks into single HTTP API calls instead of making one call per chunk.
Problem
The current implementation of `rag_ingest` generates vector embeddings one chunk at a time.
Performance Impact:
- 1000 chunks = 1000 HTTP API calls
- At ~100ms per HTTP call (typical for embedding APIs)
- Total: ~100 seconds just for embeddings
Solution
Collect chunks into batches and process multiple embeddings per HTTP call.
Performance Improvement:
- 1000 chunks with `batch_size=16` = ~63 HTTP API calls
- At ~100ms per HTTP call
- Total: ~6.3 seconds for embeddings
- Speedup: ~16x faster
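The arithmetic above can be checked with a quick sketch. The 100 ms per-call latency is the estimate used in this issue, not a measured value:

```rust
// Number of HTTP calls needed when chunks are grouped into batches.
fn api_calls(chunks: usize, batch_size: usize) -> usize {
    (chunks + batch_size - 1) / batch_size // ceiling division
}

fn main() {
    let (chunks, batch_size, latency_ms) = (1000usize, 16usize, 100usize);
    let batched = api_calls(chunks, batch_size); // 63 calls
    let unbatched = api_calls(chunks, 1);        // 1000 calls
    println!("batched:   {} calls, ~{:.1} s", batched, (batched * latency_ms) as f64 / 1000.0);
    println!("unbatched: {} calls, ~{:.1} s", unbatched, (unbatched * latency_ms) as f64 / 1000.0);
    println!("speedup:  ~{:.0}x", unbatched as f64 / batched as f64);
}
```

This yields 63 batched calls (~6.3 s) versus 1000 unbatched calls (~100 s), matching the numbers above.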
Implementation Details
Changes Made
- Added `PendingEmbedding` struct - holds chunk metadata (`chunk_id`, `doc_id`, `source_id`, `input_text`) for batched processing
- Added `flush_embedding_batch()` function - generates embeddings for multiple chunks in a single API call and inserts all resulting vectors into the database
- Modified the ingestion loop - collects chunks into a pending batch, flushes when `batch_size` is reached, and flushes any remaining chunks at the end
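The changes above can be sketched as follows. `PendingEmbedding` and `flush_embedding_batch` are named in this issue; the field types and the stubbed `embed_batch` client call are illustrative assumptions, not the actual implementation:

```rust
// Sketch of the batching flow. Field types are assumed for illustration.
struct PendingEmbedding {
    chunk_id: u64,
    doc_id: u64,
    source_id: u64,
    input_text: String,
}

// Stand-in for the real embedding client: one HTTP request per batch,
// returning one vector per input text.
fn embed_batch(texts: &[String]) -> Vec<Vec<f32>> {
    texts.iter().map(|_| vec![0.0_f32; 4]).collect()
}

// Embeds every pending chunk in a single call; the real code would then
// insert all (chunk_id, vector) rows into rag_vec_chunks.
fn flush_embedding_batch(pending: &mut Vec<PendingEmbedding>) -> bool {
    if pending.is_empty() {
        return false; // empty-batch handling: no API call
    }
    let texts: Vec<String> = pending.iter().map(|p| p.input_text.clone()).collect();
    let vectors = embed_batch(&texts);
    assert_eq!(vectors.len(), pending.len()); // one vector per chunk
    pending.clear();
    true
}

// Drives the modified ingestion loop; returns how many flushes ran.
fn run_ingest(n_chunks: u64, batch_size: usize) -> usize {
    let mut pending: Vec<PendingEmbedding> = Vec::new();
    let mut flushes = 0;
    for chunk_id in 0..n_chunks {
        pending.push(PendingEmbedding {
            chunk_id,
            doc_id: 1,
            source_id: 1,
            input_text: format!("chunk {chunk_id}"),
        });
        if pending.len() >= batch_size && flush_embedding_batch(&mut pending) {
            flushes += 1;
        }
    }
    if flush_embedding_batch(&mut pending) {
        flushes += 1; // final flush for any remaining chunks
    }
    flushes
}

fn main() {
    // 33 chunks with batch_size = 16: flushes at 16, at 32, and a final flush of 1.
    println!("flushes = {}", run_ingest(33, 16));
}
```

The final flush outside the loop is what guarantees a partial last batch is never dropped.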
Configuration
Uses existing `batch_size` from `embedding_json` (default: 16):
```json
{
"enabled": true,
"model": "text-embedding-3-large",
"dim": 1536,
"batch_size": 16,
...
}
```
Testing Requirements
Unit Testing
- Batch Size Boundary Tests
- Test with exactly `batch_size` chunks (should flush exactly once)
- Test with `batch_size + 1` chunks (should flush twice)
- Test with < `batch_size` chunks (should flush once at end)
- Empty Batch Handling
- Test with 0 chunks (no embedding calls)
- Test with `enabled=false` (no pending embeddings)
- Document Boundary Behavior
- Multiple documents with varying chunk counts
- Verify pending embeddings carry over between documents
- Verify final flush processes all remaining embeddings
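Assuming one flush per full batch plus a final flush for any remainder, the expected flush counts for the boundary cases above reduce to a ceiling division. A sketch of the expectations (not the actual test code):

```rust
// Expected number of flush_embedding_batch() calls that perform work,
// assuming a flush per full batch plus a final flush for the remainder.
fn expected_flushes(chunks: usize, batch_size: usize) -> usize {
    if chunks == 0 {
        0 // empty batch: no embedding API calls at all
    } else {
        (chunks + batch_size - 1) / batch_size
    }
}

fn main() {
    assert_eq!(expected_flushes(16, 16), 1); // exactly batch_size: one flush
    assert_eq!(expected_flushes(17, 16), 2); // batch_size + 1: two flushes
    assert_eq!(expected_flushes(5, 16), 1);  // < batch_size: one final flush
    assert_eq!(expected_flushes(0, 16), 0);  // zero chunks: no calls
    println!("all boundary cases ok");
}
```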
Integration Testing
- End-to-End Ingestion
- Ingest from MySQL with embeddings enabled
- Verify all chunks have corresponding vectors in `rag_vec_chunks`
- Compare vector count with chunk count (should match)
- Performance Benchmarking
- Time ingestion with 100, 1000, 10000 chunks
- Compare before/after batching implementation
- Verify ~16x speedup with `batch_size=16`
- API Validation
- Verify batch requests use correct format (OpenAI API: array of inputs)
- Verify responses contain correct number of embeddings
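For reference, the OpenAI embeddings endpoint accepts `input` as an array of strings, so a batched request body looks like this (the chunk texts are placeholders):

```json
{
  "model": "text-embedding-3-large",
  "input": [
    "text of chunk 1",
    "text of chunk 2"
  ]
}
```

Each item in the response's `data` array carries an `index` field that maps the embedding back to its position in the `input` array, which is what allows the flushed vectors to be paired with the right `chunk_id`.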
Verification SQL
```sql
-- Verify all chunks have embeddings
SELECT COUNT(*) FROM rag_chunks c
LEFT JOIN rag_vec_chunks v ON c.chunk_id = v.chunk_id
WHERE v.chunk_id IS NULL;
-- Expected: 0
```
Acceptance Criteria
- Batching implementation compiles successfully
- Unit tests pass for batch boundary conditions
- Integration test with real MySQL source succeeds
- All chunks have corresponding vectors in `rag_vec_chunks`
- Performance improvement measured and documented
- Documentation updated with batching behavior
Related
- Design doc: `RAG_POC/embeddings-design.md` (Section 11.2)
- Original PR: Add RAG ingestion with vector embeddings (#5318)
- Branch: `v4.0_rag_ingest_2` (contains implementation)