LoRA Genomic Evolution: GPU-aware genome paging for local expert models #280
Merged
Changes from 2 commits
6 commits
896a767 GPU Memory Manager: unified VRAM coordination for all GPU consumers (joelteply)
0efad03 gpu/stats command: TypeScript command scaffold + IPC mixin for GPU me… (joelteply)
e85b28c Wire GPU memory tracking into all consumers: inference, TTS, renderer… (joelteply)
a8e9608 PEFT training memory safety: gradient checkpointing, OOM handling, GP… (joelteply)
d66524a Sentinel real-time observability, LLM retry, academy pipeline resilience (joelteply)
43c9dcf Strip markdown fences in sentinel interpolation, doc refresh (joelteply)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # Development files | ||
| .eslintrc* | ||
| tsconfig*.json | ||
| vitest.config.ts | ||
|
|
||
| # Build artifacts | ||
| *.js.map | ||
| *.d.ts.map | ||
|
|
||
| # IDE | ||
| .vscode/ | ||
| .idea/ | ||
|
|
||
| # Logs | ||
| *.log | ||
| npm-debug.log* | ||
|
|
||
| # OS files | ||
| .DS_Store | ||
| Thumbs.db |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,169 @@ | ||
| # Gpu Stats Command | ||
|
|
||
| Query GPU memory manager stats including VRAM detection, per-subsystem budgets (inference, TTS, rendering), usage tracking, and memory pressure. Returns real hardware data from Metal (macOS) or CUDA APIs. | ||
|
|
||
| ## Table of Contents | ||
|
|
||
| - [Usage](#usage) | ||
| - [CLI Usage](#cli-usage) | ||
| - [Tool Usage](#tool-usage) | ||
| - [Parameters](#parameters) | ||
| - [Result](#result) | ||
| - [Examples](#examples) | ||
| - [Testing](#testing) | ||
| - [Unit Tests](#unit-tests) | ||
| - [Integration Tests](#integration-tests) | ||
| - [Getting Help](#getting-help) | ||
| - [Access Level](#access-level) | ||
| - [Implementation Notes](#implementation-notes) | ||
|
|
||
| ## Usage | ||
|
|
||
| ### CLI Usage | ||
|
|
||
| From the command line using the jtag CLI: | ||
|
|
||
| ```bash | ||
| ./jtag gpu/stats [options] | ||
| ``` | ||
|
|
||
| ### Tool Usage | ||
|
|
||
| From Persona tools or programmatic access using `Commands.execute()`: | ||
|
|
||
| ```typescript | ||
| import { Commands } from '@system/core/shared/Commands'; | ||
|
|
||
| const result = await Commands.execute('gpu/stats', { | ||
| // your parameters here | ||
| }); | ||
| ``` | ||
|
|
||
| ## Parameters | ||
|
|
||
| - **subsystem** (optional): `string` - Filter to specific subsystem: 'inference', 'tts', or 'rendering'. Omit for full stats. | ||
|
|
||
| ## Result | ||
|
|
||
| Returns `GpuStatsResult` with: | ||
|
|
||
| - **gpuName**: `string` - GPU hardware name (e.g., 'Apple M3 Max', 'NVIDIA RTX 5090') | ||
| - **totalVramMb**: `number` - Total detected VRAM in MB | ||
| - **totalUsedMb**: `number` - Total VRAM used across all subsystems in MB | ||
| - **pressure**: `number` - Memory pressure 0.0-1.0 (0=idle, 0.6=warning, 0.8=high, 0.95=critical) | ||
| - **reserveMb**: `number` - Reserved headroom in MB (5% of total, prevents OOM) | ||
| - **rendering**: `SubsystemInfo` - Rendering subsystem budget and usage | ||
| - **inference**: `SubsystemInfo` - Inference subsystem budget and usage (models, LoRA adapters) | ||
| - **tts**: `SubsystemInfo` - TTS subsystem budget and usage | ||
|
|
||
| ## Examples | ||
|
|
||
| ### Get full GPU stats | ||
|
|
||
| ```bash | ||
| ./jtag gpu/stats | ||
| ``` | ||
|
|
||
| **Expected result:** | ||
| { gpuName: 'Apple M3 Max', totalVramMb: 36864, pressure: 0.12, inference: { budgetMb: 25804, usedMb: 3200 }, ... } | ||
|
|
||
| ### Get inference subsystem only | ||
|
|
||
| ```bash | ||
| ./jtag gpu/stats --subsystem=inference | ||
| ``` | ||
|
|
||
| **Expected result:** | ||
| { gpuName: 'Apple M3 Max', totalVramMb: 36864, pressure: 0.12, inference: { budgetMb: 25804, usedMb: 3200 } } | ||
|
|
||
| ## Getting Help | ||
|
|
||
| ### Using the Help Tool | ||
|
|
||
| Get detailed usage information for this command: | ||
|
|
||
| **CLI:** | ||
| ```bash | ||
| ./jtag help gpu/stats | ||
| ``` | ||
|
|
||
| **Tool:** | ||
| ```typescript | ||
| // Use your help tool with command name 'gpu/stats' | ||
| ``` | ||
|
|
||
| ### Using the README Tool | ||
|
|
||
| Access this README programmatically: | ||
|
|
||
| **CLI:** | ||
| ```bash | ||
| ./jtag readme gpu/stats | ||
| ``` | ||
|
|
||
| **Tool:** | ||
| ```typescript | ||
| // Use your readme tool with command name 'gpu/stats' | ||
| ``` | ||
|
|
||
| ## Testing | ||
|
|
||
| ### Unit Tests | ||
|
|
||
| Test command logic in isolation using mock dependencies: | ||
|
|
||
| ```bash | ||
| # Run unit tests (no server required) | ||
| npx tsx "commands/Gpu Stats/test/unit/GpuStatsCommand.test.ts" | ||
| ``` | ||
|
|
||
| **What's tested:** | ||
| - Command structure and parameter validation | ||
| - Mock command execution patterns | ||
| - Required parameter validation (throws ValidationError) | ||
| - Optional parameter handling (sensible defaults) | ||
| - Performance requirements | ||
| - Assertion utility helpers | ||
|
|
||
| **TDD Workflow:** | ||
| 1. Write/modify unit test first (test-driven development) | ||
| 2. Run test, see it fail | ||
| 3. Implement feature | ||
| 4. Run test, see it pass | ||
| 5. Refactor if needed | ||
|
|
||
| ### Integration Tests | ||
|
|
||
| Test command with real client connections and system integration: | ||
|
|
||
| ```bash | ||
| # Prerequisites: Server must be running | ||
| npm start # Wait 90+ seconds for deployment | ||
|
|
||
| # Run integration tests | ||
| npx tsx "commands/Gpu Stats/test/integration/GpuStatsIntegration.test.ts" | ||
| ``` | ||
|
|
||
| **What's tested:** | ||
| - Client connection to live system | ||
| - Real command execution via WebSocket | ||
| - ValidationError handling for missing params | ||
| - Optional parameter defaults | ||
| - Performance under load | ||
| - Various parameter combinations | ||
|
|
||
| **Best Practice:** | ||
| Run unit tests frequently during development (fast feedback). Run integration tests before committing (verify system integration). | ||
|
|
||
| ## Access Level | ||
|
|
||
| **ai-safe** - Safe for AI personas to call autonomously | ||
|
|
||
| ## Implementation Notes | ||
|
|
||
| - **Shared Logic**: Core business logic in `shared/GpuStatsTypes.ts` | ||
| - **Browser**: Browser-specific implementation in `browser/GpuStatsBrowserCommand.ts` | ||
| - **Server**: Server-specific implementation in `server/GpuStatsServerCommand.ts` | ||
| - **Unit Tests**: Isolated testing in `test/unit/GpuStatsCommand.test.ts` | ||
| - **Integration Tests**: System testing in `test/integration/GpuStatsIntegration.test.ts` |
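
The README above documents `pressure` as a 0.0-1.0 value with thresholds at 0.6 (warning), 0.8 (high), and 0.95 (critical), and `reserveMb` as 5% of total VRAM. The following is a minimal sketch of how a caller might interpret those two fields. `classifyPressure`, `reserveFor`, and the `'normal'` label are hypothetical names that do not appear in this PR, and the choices to treat each threshold as inclusive and to round the reserve down to whole MB are assumptions.

```typescript
// Hypothetical helpers: the README states the thresholds and the 5% reserve,
// but does not show functions like these.
type PressureLevel = 'normal' | 'warning' | 'high' | 'critical';

// Map a pressure reading to the level named in the README.
// Assumption: a reading exactly at a threshold belongs to the higher level.
function classifyPressure(pressure: number): PressureLevel {
  if (!Number.isFinite(pressure) || pressure < 0 || pressure > 1) {
    throw new RangeError(`pressure must be within 0.0-1.0, got ${pressure}`);
  }
  if (pressure >= 0.95) return 'critical';
  if (pressure >= 0.8) return 'high';
  if (pressure >= 0.6) return 'warning';
  return 'normal';
}

// Assumption: the reserve is a flat 5% of total VRAM, rounded down to whole MB.
function reserveFor(totalVramMb: number): number {
  return Math.floor(totalVramMb * 0.05);
}
```

With the README's example values, a `pressure` of 0.12 falls below every threshold, and a 36864 MB device would hold back roughly 1843 MB under the rounding assumed here.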
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| /** | ||
| * Gpu Stats Command - Browser Implementation | ||
| * | ||
| * Query GPU memory manager stats including VRAM detection, per-subsystem budgets (inference, TTS, rendering), usage tracking, and memory pressure. Returns real hardware data from Metal (macOS) or CUDA APIs. | ||
| */ | ||
|
|
||
| import { CommandBase, type ICommandDaemon } from '@daemons/command-daemon/shared/CommandBase'; | ||
| import type { JTAGContext } from '@system/core/types/JTAGTypes'; | ||
| import type { GpuStatsParams, GpuStatsResult } from '../shared/GpuStatsTypes'; | ||
|
|
||
| export class GpuStatsBrowserCommand extends CommandBase<GpuStatsParams, GpuStatsResult> { | ||
|
|
||
| constructor(context: JTAGContext, subpath: string, commander: ICommandDaemon) { | ||
| super('gpu/stats', context, subpath, commander); | ||
| } | ||
|
|
||
| async execute(params: GpuStatsParams): Promise<GpuStatsResult> { | ||
| console.log('🌐 BROWSER: Delegating Gpu Stats to server'); | ||
| return await this.remoteExecute(params); | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| { | ||
| "name": "@jtag-commands/gpu/stats", | ||
| "version": "1.0.0", | ||
| "description": "Query GPU memory manager stats including VRAM detection, per-subsystem budgets (inference, TTS, rendering), usage tracking, and memory pressure. Returns real hardware data from Metal (macOS) or CUDA APIs.", | ||
| "main": "server/GpuStatsServerCommand.ts", | ||
| "types": "shared/GpuStatsTypes.ts", | ||
| "scripts": { | ||
| "test": "npm run test:unit && npm run test:integration", | ||
| "test:unit": "npx vitest run test/unit/*.test.ts", | ||
| "test:integration": "npx tsx test/integration/GpuStatsIntegration.test.ts", | ||
| "lint": "npx eslint **/*.ts", | ||
| "typecheck": "npx tsc --noEmit" | ||
| }, | ||
| "peerDependencies": { | ||
| "@jtag/core": "*" | ||
| }, | ||
| "files": [ | ||
| "shared/**/*.ts", | ||
| "browser/**/*.ts", | ||
| "server/**/*.ts", | ||
| "test/**/*.ts", | ||
| "README.md" | ||
| ], | ||
| "keywords": [ | ||
| "jtag", | ||
| "command", | ||
| "gpu/stats" | ||
| ], | ||
| "license": "MIT", | ||
| "author": "", | ||
| "repository": { | ||
| "type": "git", | ||
| "url": "" | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| /** | ||
| * Gpu Stats Command - Server Implementation | ||
| * | ||
| * Routes to Rust GpuModule via continuum-core IPC: | ||
| * - gpu/stats: Full GPU memory manager snapshot | ||
| * - gpu/pressure: Quick pressure query (0.0-1.0) | ||
| */ | ||
|
|
||
| import { CommandBase, type ICommandDaemon } from '@daemons/command-daemon/shared/CommandBase'; | ||
| import type { JTAGContext } from '@system/core/types/JTAGTypes'; | ||
| import type { GpuStatsParams, GpuStatsResult } from '../shared/GpuStatsTypes'; | ||
| import { createGpuStatsResultFromParams } from '../shared/GpuStatsTypes'; | ||
| import { RustCoreIPCClient, getContinuumCoreSocketPath } from '../../../../workers/continuum-core/bindings/RustCoreIPC'; | ||
|
|
||
| export class GpuStatsServerCommand extends CommandBase<GpuStatsParams, GpuStatsResult> { | ||
| private rustClient: RustCoreIPCClient; | ||
|
|
||
| constructor(context: JTAGContext, subpath: string, commander: ICommandDaemon) { | ||
| super('gpu/stats', context, subpath, commander); | ||
| this.rustClient = new RustCoreIPCClient(getContinuumCoreSocketPath()); | ||
| } | ||
|
|
||
| async execute(params: GpuStatsParams): Promise<GpuStatsResult> { | ||
| await this.rustClient.connect(); | ||
|
|
||
| try { | ||
| const stats = await this.rustClient.gpuStats(); | ||
|
|
||
| return createGpuStatsResultFromParams(params, { | ||
| success: true, | ||
| gpuName: stats.gpuName, | ||
| totalVramMb: stats.totalVramMb, | ||
| totalUsedMb: stats.totalUsedMb, | ||
| pressure: stats.pressure, | ||
| reserveMb: stats.reserveMb, | ||
| rendering: stats.rendering, | ||
| inference: stats.inference, | ||
| tts: stats.tts, | ||
| }); | ||
| } finally { | ||
| this.rustClient.disconnect(); | ||
| } | ||
| } | ||
| } | ||
`GpuStatsServerCommand.execute()` ignores `params.subsystem` even though the generated types/spec expose it. If the filter is intended to work, pass the param through (and/or filter the returned stats) so `./jtag gpu/stats --subsystem=...` behaves as documented.
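
One way to address the comment above is to filter the stats after they come back from the Rust side. The sketch below is not the fix that was merged. The interfaces are inferred from the README examples in this PR (the real definitions live in `shared/GpuStatsTypes.ts`, which is not shown here). `filterBySubsystem` and `FilteredGpuStats` are hypothetical names, and throwing a plain `Error` for an unknown subsystem is an assumption.

```typescript
// Shapes inferred from the README examples; the real types may differ.
interface SubsystemInfo {
  budgetMb: number;
  usedMb: number;
}

type SubsystemName = 'rendering' | 'inference' | 'tts';

interface GpuStats {
  gpuName: string;
  totalVramMb: number;
  totalUsedMb: number;
  pressure: number;
  reserveMb: number;
  rendering: SubsystemInfo;
  inference: SubsystemInfo;
  tts: SubsystemInfo;
}

// Device-wide fields are always present; subsystem blocks become optional.
type FilteredGpuStats = Omit<GpuStats, SubsystemName> &
  Partial<Pick<GpuStats, SubsystemName>>;

const SUBSYSTEMS: readonly SubsystemName[] = ['rendering', 'inference', 'tts'];

// Hypothetical helper: keep the device-wide fields and only the requested
// subsystem block. With no filter, the stats are returned unchanged.
function filterBySubsystem(stats: GpuStats, subsystem?: string): FilteredGpuStats {
  if (subsystem === undefined) {
    return stats;
  }
  const match = SUBSYSTEMS.find((name) => name === subsystem);
  if (match === undefined) {
    // Assumption: reject unknown names instead of silently returning everything.
    throw new Error(`Unknown subsystem '${subsystem}'; expected one of ${SUBSYSTEMS.join(', ')}`);
  }
  const { rendering, inference, tts, ...deviceWide } = stats;
  const result: FilteredGpuStats = { ...deviceWide };
  result[match] = stats[match];
  return result;
}
```

Under these assumptions, `execute()` could pass the value returned by `this.rustClient.gpuStats()` through `filterBySubsystem(stats, params.subsystem)` before building the result. That keeps the IPC call unchanged and matches the README's filtered example, which still includes `gpuName`, `totalVramMb`, and `pressure`.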