
Commit b3bcdee

Sentinel LoRA training + RAG budget fix + fallback elimination
* LoRA training pipeline commands + adapter entity persistence
  - genome/dataset-prepare: extract chat history into JSONL training data
  - genome/train: PEFT LoRA training with Python subprocess, returns adapter
  - genome/training-pipeline: Sentinel pipeline orchestrating prepare→train→register→activate
  - AdapterPackage: manifest.json packaging with SHA-256 content hashing, size calculation
  - GenomeLayerEntity persistence: genome/train creates database record after training
  - genome/paging-adapter-register: accepts layerId to hydrate from persisted entity
  - LoRATrainingPipeline wires layerId via {{steps.1.0.data.layerId}} interpolation
  - 74 unit tests (11 new for AdapterPackage + pipeline layerId wiring)

* Academy Dojo: dual-sentinel teacher/student architecture
  - Entities: AcademySession, AcademyCurriculum, AcademyExamination
  - Commands: genome/dataset-synthesize (LLM data generation), genome/academy-session (orchestration)
  - Pipelines: TeacherPipeline (curriculum→synthesize→exam→grade), StudentPipeline (train→answer→report)
  - Extended PipelineStep bindings with emit/watch/parallel/sentinel step types
  - 101 unit tests passing, integration tests for both new commands

* Sentinel interpolation engine: multi-pass nesting, JSON traversal, loop-relative refs
  Three enhancements to the Rust interpolation engine that enable the Academy dual-sentinel pipeline to execute end-to-end:
  - Multi-pass nested interpolation: {{steps.0.output.topics.{{input.iteration}}.name}} resolves inner patterns first via regex matching innermost {{}} only
  - traverse_json_path(): array indexing (topics.0.name) and JSON string auto-parsing for structured LLM output traversal
  - {{loop.N.field}} syntax: stable intra-loop step referencing via _loop_base offset, so loop.0.data always means "first sub-step of current iteration"
  Also:
  - Pipeline command routing fix: sentinel command steps now route through TypeScript (execute_ts_json) instead of the Rust module registry, avoiding the data/ prefix collision where Rust DataModule intercepted commands meant for TypeScript context injection (dbPath, sessionId, userId)
  - ORMRustClient.store() fix: returns the Rust-generated entity ID instead of echoing back the original input data (which lacked the auto-generated UUID)
  - Pipeline template fixes: correct watch payload paths (data.payload.X), entity ID paths (data.data.id), LLM output traversal (output.X, not data.X, for parsed JSON), session-scoped adapter names, system default model for student exams
  - 106 Rust sentinel tests pass. Demonstrated 6 of 9 step types in live dual-sentinel orchestration: LLM, Command, Emit, Watch, Loop, Condition

* Sentinel lifecycle: entity persistence, persona ownership, escalation to inbox
  - SentinelEntity class with field decorators, registered in EntityRegistry
  - SentinelEscalationService: event-driven bridge routing sentinel lifecycle events (complete/error/cancelled) to the owning persona's inbox
  - Persona ownership: parentPersonaId on all sentinels, academy-session wires it
  - Execution tracking: handle→entity mapping, persistExecutionResult()
  - sentinel/save + sentinel/run extended with persona ownership params
  - TaskEntity: new 'sentinel' domain + 4 sentinel task types
  - Architecture docs updated with lessons learned + multi-modal roadmap
  - 111 unit tests passing (11 new for SentinelEntity + escalation rules)

* Sentinel memory integration, trigger service, and pattern recall
  - MemoryType.SENTINEL: sentinel executions stored as durable persona memories
  - SentinelTriggerService: auto-execute sentinels on event/cron/immediate triggers with debounce, concurrent execution guards, and dynamic registration
  - PersonaTaskExecutor: sentinel task handlers + recallSentinelPatterns() for querying past sentinel executions when processing similar tasks
  - InboxTask metadata: typed sentinel fields (sentinelName, entityId, handle, status)
  - 125 unit tests passing (14 new: memory types, cron parsing, trigger validation)

* Phenotype validation + quality gating for Academy student pipeline
  - genome/phenotype-validate command: LLM-as-judge scores pre/post training responses
  - Student pipeline pre-test baseline (loop.1) before training establishes comparison point
  - Quality gate condition (loop.10): only registers adapters with measurable improvement
  - inference:demo and quality:gate:failed event payloads in AcademyTypes
  - 138 tests passing (13 new covering phenotype scoring, quality gate, pipeline structure)

* Dynamic composition, LRU paging, and remediation loop
  Phase C complete:
  - genome/compose command: merges multiple LoRA layers into stacked genome
  - Student pipeline: paging-activate after registration (LRU eviction)
  - Student pipeline: post-loop genome/compose step merges all trained adapters
  - Fix GenomeAssemblyTypes Timestamp import (pre-existing tech debt)
  Phase D remediation:
  - Teacher pipeline restructured with inner exam retry loop
  - On failure: synthesizes targeted remedial data from weakAreas feedback
  - Re-emits dataset:ready for student re-training, up to maxTopicAttempts
  - TopicRemediatePayload and RemediationDatasetReadyPayload types
  - 153 tests passing (15 new covering composition, paging, remediation)

* Multi-persona competition + gap analysis for Academy Phase D
  - CompetitionTypes: CompetitorEntry, TopicGap, GapAnalysis, TournamentRound/Ranking, competition events
  - CompetitionEntity: academy_competitions collection with 2+ competitor validation
  - genome/academy-competition: spawns 1 shared teacher + N student sentinels per competitor
  - genome/gap-analysis: per-topic field stats, weakness identification, remediation priorities
  - 177 tests passing (24 new for competition types, entity, command types, gap analysis, events)

* Purge Ollama: complete removal from types, runtime, comments, and tests
  Candle is the ONLY local inference path.
  All 75+ files updated:
  - Type system: ollamaModelName→trainedModelName, InferenceRuntime.OLLAMA→CANDLE, ModelTier 'ollama-capable'→'local-capable', embedding provider 'ollama'→'fastembed'
  - Runtime: inference-worker routes 'candle'/'local' to CandleAdapter, VisionDescriptionService uses candle, PersonaModelConfigs deduplicated
  - Comments: all Ollama references replaced with Candle/PEFT/local equivalents
  - Tests: fixtures updated (219 affected unit tests pass, 0 regressions)
  - Wire format: TS maps trainedModelName→ollama_model_name for Rust compat (Rust-side rename deferred to separate cargo test cycle)

* Fix JTAGClient connection hang + Academy Dojo integration tests
  Two critical bugs causing external WebSocket clients to hang forever:
  1. JTAGRouter.handleIncomingRequest had no try/catch around routeToSubscriber — thrown errors propagated without sending a response, leaving clients waiting indefinitely.
  2. CommandDaemon.processMessage threw on missing sessionId instead of returning an error response, triggering the above silent hang.
  Also: ConnectionBroker ESM fix, vitest config with path aliases, raw WebSocket diagnostic script, and integration tests properly registering the client via JTAGClient.registerClient().
  All 10 sentinel-lora-training integration tests pass (50s).

* Eliminate Academy Dojo technical debt: type safety, imports, READMEs
  - Remove unused TEXT_LENGTH imports from AcademySessionEntity and AcademyCurriculumEntity
  - Use TEXT_LENGTH.UNLIMITED constant in AcademyExaminationEntity instead of raw maxLength: 0
  - Replace all 9 `as any` casts in GenomeAcademySessionServerCommand with proper DataCreate/DataUpdate/PipelineSentinelParams types
  - Replace all 15 `as any` casts in GenomeAcademyCompetitionServerCommand with the same proper types
  - Add missing browser command for academy-competition
  - Add READMEs for genome/dataset-synthesize, genome/academy-session, and genome/academy-competition
  177 unit tests pass, TypeScript compiles clean.
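The connection-hang fix above reduces to one invariant: every incoming request ID must receive exactly one response, even when routing throws. A minimal sketch of that pattern — the type names and the route/send callbacks are illustrative stand-ins, not the actual JTAGRouter API:

```typescript
interface JTAGRequest { id: string; command: string }
interface JTAGResponse { id: string; success: boolean; data?: unknown; error?: string }

type RouteFn = (req: JTAGRequest) => Promise<unknown>;

// Every code path below calls send() exactly once, so a client
// correlating responses by request ID can never wait forever.
async function handleIncomingRequest(
  req: JTAGRequest,
  route: RouteFn,
  send: (res: JTAGResponse) => void,
): Promise<void> {
  try {
    const data = await route(req);
    send({ id: req.id, success: true, data });
  } catch (err) {
    // Pre-fix behavior: a throw here propagated upward with no send(),
    // leaving the external WebSocket client blocked on req.id forever.
    send({ id: req.id, success: false, error: String(err) });
  }
}
```

The same shape covers the second bug: a missing sessionId should take the catch path (error response) rather than throwing past the responder.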
* Replace as-any casts with typed executors across all genome commands
  - genome/train: DataCreate.execute(), UUID layerId type
  - genome/compose: DataRead.execute<GenomeLayerEntity>(), DataCreate, typed GenomePagingAdapterRegister and GenomeActivate params
  - genome/gap-analysis: DataRead.execute<CompetitionEntity>(), DataList.execute<BaseEntity>() with readonly items
  - genome/paging-adapter-register: DataRead.execute<GenomeLayerEntity>()
  Only 1 `as any` remains across all genome server commands (enum check in job-create). All 10 integration tests pass live.

* Wire LoRA adapters into Candle inference + unified model config
  Five gaps prevented trained LoRA adapters from affecting inference:
  1. activeAdapters not on Rust wire type (TS-only)
  2. AIProviderRustClient stripped activeAdapters from IPC payload
  3. CandleAdapter.generate_text() never called load_lora/apply_lora
  4. Candle registered as quantized(), which rejects LoRA
  5. Model mismatch: training on SmolLM2, inference on Llama-3.1-8B
  Fixes:
  - Add ActiveAdapterRequest to Rust wire type (ts-rs generated)
  - Wire activeAdapters through AIProviderRustClient to Candle
  - Add ensure_adapters() to CandleAdapter for LoRA loading + stacking
  - Switch Candle to regular mode (BF16, LoRA-compatible)
  - Unify all model references to LOCAL_MODELS.DEFAULT (Llama-3.2-3B)
  - Eliminate duplicate model mapping table in PEFTLoRAAdapter
  - Add AdapterStore: filesystem-based single source of truth for adapter discovery (replaces hardcoded paths in LimbicSystem)
  - Add path validation in PersonaGenome.getActiveAdaptersForRequest()
  - Fix SystemPaths.genome.adapters to match actual directory
  - Fix lora.rs directory path resolution for adapter loading
  48 files changed across Rust, TypeScript, and Python. All native AIs (Helper, Teacher, CodeReview, Local Assistant) verified responding after deployment.
* Knowledge Synthesis system: source-agnostic learning + benchmarks
  - WP1: KnowledgeTypes.ts foundation — SourceKnowledge, ExtractedFact, DataSourceConfig (5 source types), BenchmarkDefinition/Result
  - WP2: groundingContext on genome/dataset-synthesize — grounded synthesis forces the LLM to trace all answers to verified facts
  - WP3: KnowledgeExplorationPipeline — builds sentinel pipelines that explore git repos, web pages, conversations, or documents, then extract structured facts via LLM
  - WP4: TeacherPipeline rewrite — dynamic step indexing, optional knowledge exploration, backward compatible
  - WP5: BenchmarkPipeline — auto-generates persistent test suites from extracted knowledge, plus a runner pipeline for scoring
  - WP6: SearchRateLimiter — Brave API quota tracking, 24hr LRU cache, in-flight request deduplication
  - WP7: Documentation updates — completion criteria table, Phase D.5, PRACTICAL-ROADMAP LoRA status correction
  4 E2E tests: knowledge-synthesis-repo, benchmark-generation, web-research-synthesis, sentinel-multi-step-pipeline

* Route Python training through Rust sentinel process management
  Replace Node.js spawn() in BaseServerLoRATrainer with RustCoreIPCClient.sentinelExecute() — the Python training subprocess now runs under Rust's SentinelModule, which provides:
  - kill_on_drop: automatic cleanup if the handle is dropped
  - Timeout enforcement at the Rust tokio level
  - Log capture to .sentinel-workspaces/{handle}/logs/
  - Handle-based tracking: cancellable, status-queryable
  - Concurrent execution limits (max_concurrent in Rust)
  The sentinel handle propagates through LoRATrainingResult → GenomeTrainResult so callers can inspect logs/status.
  Verified: lora-inference-improvement E2E test passes (0% → 100%)

* Async training architecture: SentinelEventBridge + dual-mode genome/train
  SentinelEventBridge polls Rust sentinel handles and emits TypeScript Events, bridging the IPC boundary for widgets and services.
  genome/train now supports async mode (returns a handle immediately) alongside sync mode (default, blocks). TrainingCompletionHandler processes async results on completion.
  E2E verified: 0% → 80% Nexaflux improvement with full pipeline.

* Fix integration tests: sync pipeline mode, collection routing, response types
  - sentinel/run sync mode (async=false): sentinelExecute polls until completion, returns output directly instead of unavailable stepResults
  - CLI timeout: sentinel commands added to the 300s category (LLM pipeline steps need minutes, not the 10s default)
  - sentinelExecute crash fix: pipeline-type sentinels don't produce log streams; added try/catch fallback to status.handle.error
  - BenchmarkPipeline runner: data/list+filter instead of data/read (academy_benchmarks is a dynamic collection without a registered entity)
  - BenchmarkPipeline: removed apostrophe from grading prompt that broke shell single-quote wrapping in the CLI test harness
  - recipe-load test: fixed response structure (client proxy returns flat payload, not wrapped in commandResult), collection → collectionName
  - genome-crud test: replaced undefined DATA_COMMANDS with literals, reduced embedding dims from 768→16, fixed nested result.data.id path
  - genome-fine-tuning-e2e: generates inline dataset if fixture missing
  - All 6 test suites: replaced references to unavailable stepResults/stepsCompleted with success+output fields from sync pipeline mode
  - CRUDTestUtils: added DATA_COMMANDS constant for shared test use
  Validated: sentinel-pipeline 4/4, genome-crud 4/4, recipe-load 4/4, benchmark-generation 4/4, lora-inference-improvement 0%→100%

* Add LearningScheduler: RTOS-style periodic training coordinator
  Monitors active personas, triggers training for those with enough accumulated data. Throttles to max 1 concurrent GPU training job. Integrates with PersonaUser serviceInbox loop.
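The LearningScheduler described above amounts to a periodic tick with a single-job throttle. A hypothetical sketch under those assumptions — PersonaStats, the minSamples threshold, and the startTraining callback are illustrative stand-ins, not the real interfaces:

```typescript
// Minimal view of a persona's training readiness (assumed shape).
interface PersonaStats {
  personaId: string;
  accumulatedSamples: number;
}

class LearningScheduler {
  private trainingInFlight = false;

  constructor(
    private readonly minSamples: number,
    private readonly startTraining: (personaId: string) => Promise<void>,
  ) {}

  // Called on each periodic tick (e.g. from the persona service loop).
  // Picks at most one persona with enough accumulated data and trains it;
  // returns the chosen personaId, or undefined if nothing was started.
  async tick(personas: PersonaStats[]): Promise<string | undefined> {
    if (this.trainingInFlight) return undefined; // max 1 concurrent GPU job
    const ready = personas.find((p) => p.accumulatedSamples >= this.minSamples);
    if (!ready) return undefined;
    this.trainingInFlight = true;
    try {
      await this.startTraining(ready.personaId);
      return ready.personaId;
    } finally {
      this.trainingInFlight = false; // release the slot even if training throws
    }
  }
}
```

The throttle lives in the scheduler rather than the trainer, so any future training entry point added to the loop inherits the one-GPU-job limit for free.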
* BenchmarkEntity + CodingChallengePipeline + sentinelExecute output fix
  - BenchmarkEntity and BenchmarkResultEntity: proper entities for academy benchmarks, replacing hardcoded collection strings with registered types
  - BenchmarkPipeline: uses entity .collection constants instead of raw strings
  - CodingChallengePipeline: deterministic coding challenge evaluation via sentinel — reads buggy source, runs tests, LLM fixes, re-runs tests, scores pass/fail with no LLM grading bias
  - sentinelExecute: fix empty output for pipeline-type sentinels by falling back to the last step output in the steps log when the combined log is empty
  - Integration tests: coding-challenge-benchmark (100% score on task-manager 3-bug challenge), benchmark-generation regression test updated

* claude forgot these

* Fix RAG token budget and eliminate fallback defaults across AI pipeline
  The RAG budget was using chars/4 estimation (250 tokens/msg), but the Llama tokenizer averages chars/3 — a 35% underestimate. Combined with hardcoded totalBudget=8000, a minMessages=5 floor, a Math.max(50,...) output floor, and an isSmallContext threshold too low at 1500, candle personas (2048 context) had prompts exceeding the context window and silently failing.
  Fixes:
  - totalBudget derived from contextWindow * 0.75 (not hardcoded 8000)
  - avgTokensPerMessage: 250 → 350 (chars/3 estimation)
  - Removed minMessages floor that forced 5 messages when the budget allowed 4
  - Removed Math.max(50,...) output token floor (0 = budget blown, not 50)
  - isSmallContext threshold: 1500 → 3000 (skips injections for small models)
  - calculateAdjustedMaxTokens uses actual content chars/3, not flat 250/msg
  Type strictness (compiler-enforced, no runtime fallbacks):
  - modelId and provider REQUIRED on RAGBuildOptions, AIGenerateParams, ThoughtStreamParams, RAGInspectParams
  - model and provider REQUIRED on ModelConfig (UserEntity)
  - getModelConfigForProvider() throws on unknown provider (no candle fallback)
  - PersonaUser validates merged modelConfig (entity + provider defaults)
  - Eliminated all || 'candle', ?? 'candle', || LOCAL_MODELS.DEFAULT fallbacks

* ModelBackend trait, academy pipelines, domain classifier, candle fixes
  Rust:
  - ModelBackend trait unifying safetensors and GGUF backends
  - backends/llama_safetensors.rs + llama_gguf.rs with BF16_PRACTICAL_CONTEXT
  - Vendored quantized_llama.rs for future GGUF context window fix
  - DomainClassifier for persona task routing
  - Self-task generator, genome paging, cognition module updates
  - Channel module and unified persona updates
  TypeScript:
  - Academy session command + types for coding challenges
  - CodingStudent/CodingTeacher/ProjectStudent/ProjectTeacher pipelines
  - CandleGrpcAdapter with correct model ID and IPC query
  - ModelContextWindows: Llama-3.2-3B at 2048 (BF16 practical limit)
  - ModelRegistry console.log cleanup
  - PersonaGenome, PersonaAutonomousLoop, RustCognitionBridge updates
  - TrainingDataAccumulator, MotorCortex, PersonaMemory fixes
  - QueueItemTypes, PersonaTaskExecutor updates
  - Project scaffolds (ecommerce-api, url-shortener)
  - Integration + unit tests
  All compiles clean (TypeScript + Rust).
1 parent 75a1bcb commit b3bcdee

File tree

288 files changed: +27875 additions, -2495 deletions


docs/MLX-LORA-RESEARCH.md

Lines changed: 721 additions & 0 deletions
Large diffs are not rendered by default.

src/debug/jtag/browser/generated.ts

Lines changed: 37 additions & 1 deletion
@@ -1,7 +1,7 @@
 /**
  * Browser Structure Registry - Auto-generated
  *
- * Contains 11 daemons and 205 commands and 2 adapters and 28 widgets.
+ * Contains 11 daemons and 211 commands and 2 adapters and 28 widgets.
  * Generated by scripts/generate-structure.ts - DO NOT EDIT MANUALLY
  */

@@ -124,9 +124,15 @@ import { FileAppendBrowserCommand } from './../commands/file/append/browser/File
 import { FileLoadBrowserCommand } from './../commands/file/load/browser/FileLoadBrowserCommand';
 import { FileMimeTypeBrowserCommand } from './../commands/file/mime-type/browser/FileMimeTypeBrowserCommand';
 import { FileSaveBrowserCommand } from './../commands/file/save/browser/FileSaveBrowserCommand';
+import { GenomeAcademyCompetitionBrowserCommand } from './../commands/genome/academy-competition/browser/GenomeAcademyCompetitionBrowserCommand';
+import { GenomeAcademySessionBrowserCommand } from './../commands/genome/academy-session/browser/GenomeAcademySessionBrowserCommand';
 import { GenomeBatchMicroTuneBrowserCommand } from './../commands/genome/batch-micro-tune/browser/GenomeBatchMicroTuneBrowserCommand';
+import { GenomeDatasetPrepareBrowserCommand } from './../commands/genome/dataset-prepare/browser/GenomeDatasetPrepareBrowserCommand';
+import { GenomeDatasetSynthesizeBrowserCommand } from './../commands/genome/dataset-synthesize/browser/GenomeDatasetSynthesizeBrowserCommand';
 import { GenomeJobCreateBrowserCommand } from './../commands/genome/job-create/browser/GenomeJobCreateBrowserCommand';
 import { GenomeJobStatusBrowserCommand } from './../commands/genome/job-status/browser/GenomeJobStatusBrowserCommand';
+import { GenomeTrainBrowserCommand } from './../commands/genome/train/browser/GenomeTrainBrowserCommand';
+import { GenomeTrainingPipelineBrowserCommand } from './../commands/genome/training-pipeline/browser/GenomeTrainingPipelineBrowserCommand';
 import { HelpBrowserCommand } from './../commands/help/browser/HelpBrowserCommand';
 import { IndicatorBrowserCommand } from './../commands/indicator/browser/IndicatorBrowserCommand';
 import { InferenceGenerateBrowserCommand } from './../commands/inference/generate/browser/InferenceGenerateBrowserCommand';

@@ -852,11 +858,31 @@ export const BROWSER_COMMANDS: CommandEntry[] = [
     className: 'FileSaveBrowserCommand',
     commandClass: FileSaveBrowserCommand
   },
+  {
+    name: 'genome/academy-competition',
+    className: 'GenomeAcademyCompetitionBrowserCommand',
+    commandClass: GenomeAcademyCompetitionBrowserCommand
+  },
+  {
+    name: 'genome/academy-session',
+    className: 'GenomeAcademySessionBrowserCommand',
+    commandClass: GenomeAcademySessionBrowserCommand
+  },
   {
     name: 'genome/batch-micro-tune',
     className: 'GenomeBatchMicroTuneBrowserCommand',
     commandClass: GenomeBatchMicroTuneBrowserCommand
   },
+  {
+    name: 'genome/dataset-prepare',
+    className: 'GenomeDatasetPrepareBrowserCommand',
+    commandClass: GenomeDatasetPrepareBrowserCommand
+  },
+  {
+    name: 'genome/dataset-synthesize',
+    className: 'GenomeDatasetSynthesizeBrowserCommand',
+    commandClass: GenomeDatasetSynthesizeBrowserCommand
+  },
   {
     name: 'genome/job-create',
     className: 'GenomeJobCreateBrowserCommand',

@@ -867,6 +893,16 @@
     className: 'GenomeJobStatusBrowserCommand',
     commandClass: GenomeJobStatusBrowserCommand
   },
+  {
+    name: 'genome/train',
+    className: 'GenomeTrainBrowserCommand',
+    commandClass: GenomeTrainBrowserCommand
+  },
+  {
+    name: 'genome/training-pipeline',
+    className: 'GenomeTrainingPipelineBrowserCommand',
+    commandClass: GenomeTrainingPipelineBrowserCommand
+  },
   {
     name: 'help',
     className: 'HelpBrowserCommand',
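The generated registry above is a flat array of name→class entries. A hypothetical sketch of how a consumer might resolve a command by name — the CommandEntry shape is simplified (the real type also carries the command constructor), and the index helper is an assumption, not shown in this commit:

```typescript
// Simplified registry entry; the generated file additionally holds the
// class itself under commandClass.
interface CommandEntry {
  name: string;
  className: string;
}

// Build a Map index once so repeated lookups are O(1) instead of
// scanning the 211-entry array on every dispatch.
function buildIndex(entries: CommandEntry[]): Map<string, CommandEntry> {
  return new Map(entries.map((e) => [e.name, e]));
}

// Two entries from the diff above, for illustration.
const SAMPLE_COMMANDS: CommandEntry[] = [
  { name: 'genome/train', className: 'GenomeTrainBrowserCommand' },
  { name: 'genome/training-pipeline', className: 'GenomeTrainingPipelineBrowserCommand' },
];

const commandIndex = buildIndex(SAMPLE_COMMANDS);
```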

src/debug/jtag/cli.ts

Lines changed: 10 additions & 7 deletions
@@ -378,10 +378,11 @@ async function main() {

   // Execute command with command-specific timeout
   try {
-    // AI commands need longer timeout due to queue + generation time
-    // Genome commands can take longer for training operations
-    // Interface commands (screenshot) may need to wait for html2canvas rendering
-    // Inference commands (inference/generate) need time for local model generation
+    // Extract --timeout from params (CLI-level override, not a command parameter)
+    const userTimeoutMs = params.timeout ? Number(params.timeout) : undefined;
+    delete params.timeout;
+
+    // Category-based default timeouts
     const isAICommand = command.startsWith('ai/');
     const isGenomeCommand = command.startsWith('genome/');
     const isInterfaceCommand = command.startsWith('interface/');
@@ -390,9 +391,11 @@ async function main() {
     const isCollaborationCommand = command.startsWith('collaboration/');
     const isChallengeCommand = command.startsWith('challenge/');
     const isCodeCommand = command.startsWith('code/');
-    const needsLongerTimeout = isAICommand || isInferenceCommand || isSocialCommand || isInterfaceCommand || isCollaborationCommand || isCodeCommand;
-    const needsLongTimeout = isGenomeCommand || isChallengeCommand;
-    const timeoutMs = needsLongTimeout ? 300000 : needsLongerTimeout ? 60000 : 10000; // 5min for genome/challenge, 60s for AI/inference/social/interface/collaboration/code, 10s for others
+    const isSentinelCommand = command.startsWith('sentinel/');
+    const needsLongerTimeout = isAICommand || isSocialCommand || isInterfaceCommand || isCollaborationCommand || isCodeCommand;
+    const needsLongTimeout = isGenomeCommand || isChallengeCommand || isInferenceCommand || isSentinelCommand;
+    const defaultTimeoutMs = needsLongTimeout ? 300000 : needsLongerTimeout ? 60000 : 10000; // 5min for genome/challenge/inference/sentinel, 60s for AI/social/interface/collaboration/code, 10s for others
+    const timeoutMs = userTimeoutMs ?? defaultTimeoutMs;
     const timeoutSeconds = timeoutMs / 1000;

     const commandTimeout = new Promise((_, reject) =>

src/debug/jtag/commands/ai/agent/server/AiAgentServerCommand.ts

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ export class AiAgentServerCommand extends AiAgentCommand {
     const provider = params.provider || 'anthropic';
     const model = params.model || (
       provider === 'anthropic' ? 'claude-sonnet-4-5-20250929' :
-      provider === 'candle' || provider === 'ollama' ? LOCAL_MODELS.DEFAULT :
+      provider === 'candle' ? LOCAL_MODELS.DEFAULT :
       'claude-sonnet-4-5-20250929'
     );
src/debug/jtag/commands/ai/agent/shared/AiAgentTypes.ts

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ export interface AiAgentParams extends CommandParams {
   /** Model ID (e.g., 'claude-sonnet-4-5-20250929', 'llama-3.1-8b') */
   model?: string;

-  /** Provider (e.g., 'anthropic', 'openai', 'together', 'ollama') */
+  /** Provider (e.g., 'anthropic', 'openai', 'together', 'candle') */
   provider?: string;

   /** Sampling temperature */

src/debug/jtag/commands/ai/generate/server/AIGenerateServerCommand.ts

Lines changed: 3 additions & 1 deletion
@@ -66,10 +66,12 @@ export class AIGenerateServerCommand extends AIGenerateCommand {
       params.roomId,
       targetPersonaId,
       {
+        modelId: params.model,
+        provider: params.provider,
         maxMessages: params.maxMessages || 20,
         includeArtifacts: params.includeArtifacts ?? true,
         includeMemories: params.includeMemories ?? true,
-        triggeringTimestamp: Date.now(), // Preview shows current state (no race filtering for manual preview)
+        triggeringTimestamp: Date.now(),
         maxTokens: params.maxTokens ?? 2000,
       }
     );

src/debug/jtag/commands/ai/generate/shared/AIGenerateTypes.ts

Lines changed: 3 additions & 6 deletions
@@ -34,14 +34,11 @@ export interface AIGenerateParams extends CommandParams {
   // Preview mode - returns request instead of calling LLM
   preview?: boolean;

-  // Model configuration
-  model?: string;
+  // Model configuration — required for RAG budget and inference routing
+  model: string;
+  provider: 'openai' | 'anthropic' | 'local' | 'candle' | 'groq' | 'deepseek';
   temperature?: number;
   maxTokens?: number;
-
-  // Provider selection
-  // 'local' and 'candle' route to native Rust inference (Candle)
-  provider?: 'openai' | 'anthropic' | 'local' | 'candle' | 'groq' | 'deepseek';
 }

 // AI Generate Result

src/debug/jtag/commands/ai/rag/inspect/server/RAGInspectServerCommand.ts

Lines changed: 2 additions & 0 deletions
@@ -29,6 +29,8 @@ export class RAGInspectServerCommand extends RAGInspectCommand {
       params.contextId,
       params.personaId,
       {
+        modelId: params.modelId,
+        provider: params.provider,
         maxMessages: params.maxMessages ?? 20,
         includeArtifacts: params.includeArtifacts ?? true,
         includeMemories: params.includeMemories ?? true,

src/debug/jtag/commands/ai/rag/inspect/shared/RAGInspectTypes.ts

Lines changed: 6 additions & 0 deletions
@@ -20,6 +20,12 @@ export interface RAGInspectParams extends CommandParams {
   /** Persona ID requesting context */
   personaId: UUID;

+  /** Model ID — drives context window budget */
+  modelId: string;
+
+  /** Provider — scopes model lookup */
+  provider: string;
+
   /** Optional: Limit number of messages */
   maxMessages?: number;

src/debug/jtag/commands/ai/thoughtstream/server/ThoughtStreamServerCommand.ts

Lines changed: 4 additions & 0 deletions
@@ -102,6 +102,8 @@ export class ThoughtStreamServerCommand extends ThoughtStreamCommand {
       stream.contextId,
       thought.personaId,
       {
+        modelId: params.modelId,
+        provider: params.provider,
         maxTokens: 2000,
         maxMessages: 20,
         maxMemories: 0,
@@ -394,6 +396,8 @@ export class ThoughtStreamServerCommand extends ThoughtStreamCommand {
       entry.roomId,
       personaId,
       {
+        modelId: params.modelId,
+        provider: params.provider,
         maxTokens: 2000,
         maxMessages: 20,
         maxMemories: 0,
