Feature Discussion: P2P Federation, Realtime Ephemeral Keys, Reasoning Parser, and Distributed Cache

## Summary

I am planning to contribute several interconnected features to LocalAI that improve P2P federation, realtime API security, reasoning support for function calling, and distributed model caching. This issue is a discussion starter before I break the work into smaller, reviewable PRs.

## Proposed Changes

### 1. P2P Node Snapshot and Federation Routing (core/p2p/)
- NodeConfig struct for declarative node configuration
- discoveryTunnels full-node snapshot for cluster state synchronization
- ReplaceNodes for safe cluster membership updates
- HMAC-signed node advertisements for tamper-evident federation
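
To make the "tamper-evident" part concrete, here is a minimal sketch of signing and verifying an advertisement with HMAC-SHA256 from the Go standard library. The `NodeAdvertisement` and `SignedAdvertisement` types and their fields are hypothetical and only illustrate the idea; they are not the structs proposed for core/p2p/, and the choice of JSON as the signed encoding is an assumption.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// NodeAdvertisement is a hypothetical payload; the field names are illustrative only.
type NodeAdvertisement struct {
	ID      string `json:"id"`
	Address string `json:"address"`
}

// SignedAdvertisement carries the serialized payload and its hex-encoded MAC.
type SignedAdvertisement struct {
	Payload []byte
	MAC     string
}

// computeMAC returns the HMAC-SHA256 of payload under key.
func computeMAC(key, payload []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(payload)
	return m.Sum(nil)
}

// SignAdvertisement serializes the advertisement and attaches a MAC over the exact bytes.
func SignAdvertisement(key []byte, ad NodeAdvertisement) (SignedAdvertisement, error) {
	payload, err := json.Marshal(ad)
	if err != nil {
		return SignedAdvertisement{}, err
	}
	return SignedAdvertisement{
		Payload: payload,
		MAC:     hex.EncodeToString(computeMAC(key, payload)),
	}, nil
}

// VerifyAdvertisement checks the MAC in constant time before decoding the payload.
func VerifyAdvertisement(key []byte, s SignedAdvertisement) (NodeAdvertisement, bool) {
	got, err := hex.DecodeString(s.MAC)
	if err != nil {
		return NodeAdvertisement{}, false
	}
	if !hmac.Equal(got, computeMAC(key, s.Payload)) {
		return NodeAdvertisement{}, false
	}
	var ad NodeAdvertisement
	if err := json.Unmarshal(s.Payload, &ad); err != nil {
		return NodeAdvertisement{}, false
	}
	return ad, true
}
```

The MAC is computed over the serialized bytes rather than the struct, so the verifier never has to re-serialize and risk a mismatch in field ordering.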

### 2. Reasoning Content Parser (pkg/functions/)
- XMLToolCallFormat extended with three new fields for reasoning content
- ParseMsgWithXMLToolCalls now extracts thinking/reasoning content
- Enables models that emit chain-of-thought before tool calls
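
As a rough illustration of the extraction step, the sketch below separates a reasoning block from the rest of a model message so that the tool-call parser only sees the remainder. `ExtractReasoning` is a hypothetical helper, and the `<think>` tag pair in the usage note is an assumption; it does not reflect the actual fields added to XMLToolCallFormat or the behavior of ParseMsgWithXMLToolCalls.

```go
package main

import "strings"

// ExtractReasoning splits msg into the text between openTag and closeTag
// (the reasoning) and everything else (the rest). If no complete block is
// found, it returns an empty reasoning string and the message unchanged.
func ExtractReasoning(msg, openTag, closeTag string) (reasoning, rest string) {
	start := strings.Index(msg, openTag)
	if start < 0 {
		return "", msg
	}
	bodyStart := start + len(openTag)
	end := strings.Index(msg[bodyStart:], closeTag)
	if end < 0 {
		// Unterminated block: leave the message untouched rather than guess.
		return "", msg
	}
	reasoning = strings.TrimSpace(msg[bodyStart : bodyStart+end])
	rest = strings.TrimSpace(msg[:start] + msg[bodyStart+end+len(closeTag):])
	return reasoning, rest
}
```

For example, calling `ExtractReasoning(msg, "<think>", "</think>")` on a message that starts with a think block followed by a tool call returns the chain-of-thought text as `reasoning` and the tool-call markup as `rest`.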

### 3. Realtime API Ephemeral Keys (core/http/endpoints/openai/realtime.go)
- 60-second HMAC-signed ephemeral keys for secure client authentication
- Optional per-session transcription model
- Any-to-any modality detection for unified audio/text sessions

### 4. Distributed Replica Cache (pkg/model/loader.go)
- replicaCache layer to reduce redundant FindAndLockNodeWithModel DB calls
- Per-model-ID caching with configurable TTL

### 5. Config YAML Endpoints (core/http/endpoints/localai/config_meta.go)
- New REST endpoints for retrieving and validating model configuration
- Supports dynamic config reloading without restart

### 6. MCP HTTP API Improvements (pkg/mcp/localaitools/httpapi/)
- Enhanced client with better error handling
- Additional routes for tool discovery

### 7. Metrics and Monitoring Enhancements (core/services/monitoring/)
- Additional backend monitoring metrics
- Improved worker file staging telemetry

### 8. Template Context Pipeline (core/templates/)
- Extended template loader with context-aware evaluation
- Support for dynamic template parameters

## Breaking Down the Work

Per maintainer feedback, I will open a separate PR for each subset, starting with the least controversial:

1. PR 1 - Config YAML endpoints (non-controversial, self-contained)
2. PR 2 - Context pipeline and template loader improvements
3. PR 3 - Metrics and monitoring additions
4. PR 4 - MCP HTTP API routes and client hardening
5. PR 5 - Reasoning parser (with tests, proper implementation)
6. PR 6 - Distributed cache (with tests, document cache invalidation behavior)
7. PR 7 - P2P node snapshot and ReplaceNodes (with design doc, cancellation safety)
8. PR 8 - Realtime ephemeral key (with tests for HMAC round-trip)

## Blocker Items to Resolve

- [ ] Dead code: GetConfigEndpointShutdown, StartReplicaCache, LoadModel - remove or implement with tests
- [ ] Reasoning parser no-op branches: proper implementation needed
- [ ] P2P cancellation safety: needs test under churn
- [ ] HMAC ephemeral key details: sign userID, fix response fields
- [ ] Tests: HMAC handshake, P2P snapshot/cancel, flag/validation round-trips
- [ ] AI-assisted code review: add Assisted-by trailer per project guidelines

## Request for Feedback

I would appreciate early feedback on:

1. The overall feature set - are all of these wanted in LocalAI?
2. The P2P federation design - is a full-node snapshot the right approach?
3. The ephemeral key TTL (60s) - is this appropriate?
4. The distributed cache invalidation strategy - TTL vs event-driven?

Once we align on the design, I will open the individual PRs.

---

Related: This is a follow-up to the previously closed PR that combined all changes into one.

