Commit bd2643e

fix: env override for single node process, mem limit env var issue (#561)
* fix: optimize readme
* fix: env override for single node process
* fix: update hard memory limiter environment variable for GPU configuration
1 parent cddb445 commit bd2643e

3 files changed, +20 -8 lines changed
README.md

Lines changed: 3 additions & 6 deletions
@@ -40,25 +40,22 @@ Tensor Fusion is a state-of-the-art **GPU virtualization and pooling solution**
 - [Run vGPU in VM Hypervisor](https://tensor-fusion.ai/guide/getting-started/deployment-vm)
 - [Learn Essential Concepts & Architecture](https://tensor-fusion.ai/guide/getting-started/architecture)
 
-<!-- (TODO: Asciinema) -->
-
 ### 💬 Discussion
 
 - Discord channel: [https://discord.gg/2bybv9yQNk](https://discord.gg/2bybv9yQNk)
 - Discuss anything about TensorFusion: [Github Discussions](https://github.com/NexusGPU/tensor-fusion/discussions)
-- Contact us with WeCom for Greater China region: [企业微信](https://work.weixin.qq.com/ca/cawcde42751d9f6a29)
+- Contact us with WeCom for Greater China region: [企业微信](https://work.weixin.qq.com/ca/cawcde42751d9f6a29)
 - Email us: [support@tensor-fusion.com](mailto:support@tensor-fusion.com)
 - Schedule [1:1 meeting with TensorFusion founders](https://tensor-fusion.ai/book-demo)
 
-
 ## 🔮 Features & Roadmap
 
 ### Core GPU Virtualization Features
 
 - [x] Fractional GPU and flexible oversubscription
 - [x] Remote GPU sharing with SOTA GPU-over-IP technology, less than 4% performance loss
 - [x] GPU VRAM expansion and hot/cold tiering
-- [x] None NVIDIA GPU/NPU vendor support
+- [x] Non-NVIDIA GPU/NPU vendor support
 
 ### Pooling & Scheduling & Management
 
@@ -67,7 +64,7 @@ Tensor Fusion is a state-of-the-art **GPU virtualization and pooling solution**
 - [x] GPU node auto provisioning/termination, Karpenter integration
 - [x] GPU compaction/bin-packing
 - [x] Take full control of GPU allocation with precision targeting by vendor, model, device index, and more
-- [x] Seamless onboarding experience for Pytorch, TensorFlow, llama.cpp, vLLM, Tensor-RT, SGlang and all popular AI training/serving frameworks
+- [x] Seamless onboarding experience for PyTorch, TensorFlow, llama.cpp, vLLM, TensorRT, SGLang and all popular AI training/serving frameworks
 - [x] Seamless migration from existing NVIDIA operator and device-plugin stack
 - [x] Centralized Dashboard & Control Plane
 - [x] GPU-first autoscaling policies, auto set requests/limits/replicas

pkg/constants/env.go

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ const (
 	// hard limiter (not open sourced) in megabytes, only take effect on worker container and
 	// when open source vgpu.rs gpu-limiter is disabled
 	// when use this mode, memory request can not autoscale dynamically
-	HardMemLimiterEnv = "TF_CUDA_MEMORY_LIMIT"
+	HardMemLimiterEnv = "TF_GPU_MEMORY_LIMIT"
 
 	TensorFusionRemoteWorkerPortNumber = 8000
 	TensorFusionRemoteWorkerPortName = "remote-vgpu"
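
Because the hard limiter is configured through this variable, a consumer has to look it up under the new name. The sketch below is illustrative only: `parseHardMemLimitMB` and the map-based lookup are hypothetical and not part of the commit, and it assumes the value is a plain decimal number of megabytes, as the comment in the hunk suggests. How the closed-source limiter actually parses the value is not shown in the diff.

```go
package main

import (
	"fmt"
	"strconv"
)

// hardMemLimiterEnv mirrors the renamed constant from pkg/constants/env.go.
const hardMemLimiterEnv = "TF_GPU_MEMORY_LIMIT"

// parseHardMemLimitMB is a hypothetical helper: it looks the variable up in env
// and interprets the value as a whole number of megabytes.
// The boolean result reports whether the variable was set to a non-empty value.
func parseHardMemLimitMB(env map[string]string) (uint64, bool, error) {
	raw, ok := env[hardMemLimiterEnv]
	if !ok || raw == "" {
		return 0, false, nil
	}
	mb, err := strconv.ParseUint(raw, 10, 64)
	if err != nil {
		return 0, true, fmt.Errorf("%s=%q is not a valid megabyte count: %w", hardMemLimiterEnv, raw, err)
	}
	return mb, true, nil
}

func main() {
	// Under this sketch, a container that only sets the old name has no hard limit.
	oldStyle := map[string]string{"TF_CUDA_MEMORY_LIMIT": "8192"}
	newStyle := map[string]string{"TF_GPU_MEMORY_LIMIT": "8192"}
	fmt.Println(parseHardMemLimitMB(oldStyle))
	fmt.Println(parseHardMemLimitMB(newStyle))
}
```

In this sketch, a deployment that still exports `TF_CUDA_MEMORY_LIMIT` is treated as having no limit configured, which is the kind of silent mismatch a rename like this can introduce if manifests are not updated together with the constant.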

pkg/hypervisor/backend/single_node/single_node_backend.go

Lines changed: 16 additions & 1 deletion
@@ -9,6 +9,7 @@ import (
 	"os"
 	"os/exec"
 	"path/filepath"
+	"strings"
 	"sync"
 	"syscall"
 	"time"
@@ -371,8 +372,22 @@ func (b *SingleNodeBackend) buildCmd(ps *processState) (*exec.Cmd, io.Closer, er
 	}
 
 	cmd := exec.Command(ps.executable, ps.args...)
-	cmd.Env = os.Environ()
+
+	// Build environment: start with current environment, then override with ps.env
+	envMap := make(map[string]string)
+	for _, env := range os.Environ() {
+		parts := strings.SplitN(env, "=", 2)
+		if len(parts) == 2 {
+			envMap[parts[0]] = parts[1]
+		}
+	}
+	// Override with custom environment variables
 	for k, v := range ps.env {
+		envMap[k] = v
+	}
+	// Convert back to []string format
+	cmd.Env = make([]string, 0, len(envMap))
+	for k, v := range envMap {
 		cmd.Env = append(cmd.Env, k+"="+v)
 	}
 	if ps.workingDir != "" {
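
The merge performed in this hunk can be isolated into a small function, which makes the override behavior easy to test. The following is a minimal sketch, not the commit's code: `mergeEnv` is a hypothetical name, it uses `strings.Cut` (Go 1.18+) in place of `strings.SplitN`, and it sorts the result because Go map iteration order is unspecified, whereas the hunk leaves the slice unsorted.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// mergeEnv returns base with every key in overrides replaced or added.
// Entries are "KEY=value" strings, the format returned by os.Environ.
// mergeEnv is a hypothetical helper, not a function from the commit.
func mergeEnv(base []string, overrides map[string]string) []string {
	merged := make(map[string]string, len(base)+len(overrides))
	for _, entry := range base {
		// Split on the first "=" only, so values that contain "=" survive intact.
		key, value, found := strings.Cut(entry, "=")
		if !found {
			continue // skip entries without "=", like the len(parts) == 2 check
		}
		merged[key] = value
	}
	for key, value := range overrides {
		merged[key] = value
	}
	out := make([]string, 0, len(merged))
	for key, value := range merged {
		out = append(out, key+"="+value)
	}
	// Sorting makes the result reproducible across runs.
	sort.Strings(out)
	return out
}

func main() {
	base := []string{"PATH=/usr/bin", "TF_GPU_MEMORY_LIMIT=1024", "OPTS=a=b"}
	overrides := map[string]string{"TF_GPU_MEMORY_LIMIT": "4096", "EXTRA": "1"}
	for _, entry := range mergeEnv(base, overrides) {
		fmt.Println(entry)
	}
}
```

Each key appears exactly once in the result, and the per-process value replaces the inherited one. The sorted order is a choice made for this sketch only; the child process does not depend on the order of its environment entries.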
