Skip to content

Commit d0e38e1

Browse files
author
Lasim
committed
feat(satellite): enhance nsjail resource limits and cache directory management
1 parent 22383da commit d0e38e1

File tree

5 files changed

+187
-42
lines changed

5 files changed

+187
-42
lines changed

services/satellite/.env.example

Lines changed: 20 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -61,15 +61,31 @@ EVENT_FLUSH_TIMEOUT_MS=5000
6161
# nsjail Resource Limits (Production Linux only)
6262
# These limits apply to each MCP server process when running with nsjail isolation
6363
# Only active when NODE_ENV=production and platform is Linux
64+
#
65+
# Defaults based on empirical testing with npx and Node.js V8:
66+
# - 2048MB memory: Absolute minimum for V8 initialization (cannot be reduced)
67+
# - 1000 processes: npm spawns many child processes during package installation
68+
# - 1024 file descriptors: Adequate for file I/O operations
69+
# - 50MB file size: Prevents oversized downloads while accommodating 99% of npm packages
70+
# - 100MB tmpfs: Sufficient for npm cache and temporary operations
6471

65-
# Memory limit per process in MB (default: 50)
66-
NSJAIL_MEMORY_LIMIT_MB=50
72+
# Memory limit per process in MB (default: 2048, V8 minimum requirement)
73+
NSJAIL_MEMORY_LIMIT_MB=2048
6774

6875
# CPU time limit per process in seconds (default: 60)
6976
NSJAIL_CPU_TIME_LIMIT_SECONDS=60
7077

71-
# Maximum number of processes per MCP server (default: 50)
72-
NSJAIL_MAX_PROCESSES=50
78+
# Maximum number of processes per MCP server (default: 1000, required for npm)
79+
NSJAIL_MAX_PROCESSES=1000
80+
81+
# Maximum number of open file descriptors (default: 1024)
82+
NSJAIL_RLIMIT_NOFILE=1024
83+
84+
# Maximum file size in MB (default: 50, prevents oversized npm downloads)
85+
NSJAIL_RLIMIT_FSIZE=50
86+
87+
# Tmpfs size for /tmp directory (default: 100M, sufficient for npm cache)
88+
NSJAIL_TMPFS_SIZE=100M
7389

7490
# Process Idle Timeout (stdio MCP servers only)
7591
# Idle stdio processes are automatically terminated after this duration to save resources

services/satellite/Dockerfile

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,9 @@
11
# Production image - Debian base required for nsjail
22
FROM node:24-bookworm-slim
33

4+
# Create deploystack user with home directory (simulating production setup)
5+
RUN useradd -m -d /opt/deploystack -s /bin/bash deploystack
6+
47
# Install build dependencies and runtime dependencies for nsjail
58
RUN apt-get update && \
69
apt-get install -y --no-install-recommends \
@@ -45,6 +48,10 @@ RUN apt-get remove -y \
4548
apt-get autoremove -y && \
4649
rm -rf /var/lib/apt/lists/*
4750

51+
# Create mcp-cache base directory with proper ownership
52+
RUN mkdir -p /opt/deploystack/mcp-cache && \
53+
chown -R deploystack:deploystack /opt/deploystack
54+
4855
WORKDIR /app
4956

5057
# Copy package files
@@ -63,7 +70,7 @@ RUN echo "NODE_ENV=production" > .env && \
6370

6471
EXPOSE 3001
6572

66-
# Run as non-root user for security
67-
USER node
73+
# Run as deploystack user (simulating production setup)
74+
USER deploystack
6875

6976
CMD ["node", "--env-file=.env", "dist/index.js"]

services/satellite/README.md

Lines changed: 33 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -107,9 +107,12 @@ EVENT_MAX_QUEUE_SIZE=10000 # Maximum events in memory queue (default: 1
107107
EVENT_FLUSH_TIMEOUT_MS=5000 # Graceful shutdown flush timeout in milliseconds (default: 5000)
108108

109109
# nsjail Resource Limits (Production Linux only)
110-
NSJAIL_MEMORY_LIMIT_MB=50 # Memory limit per MCP server process in MB (default: 50)
110+
NSJAIL_MEMORY_LIMIT_MB=2048 # Memory limit per MCP server process in MB (default: 2048)
111111
NSJAIL_CPU_TIME_LIMIT_SECONDS=60 # CPU time limit per MCP server process in seconds (default: 60)
112-
NSJAIL_MAX_PROCESSES=50 # Maximum number of processes per MCP server (default: 50)
112+
NSJAIL_MAX_PROCESSES=1000 # Maximum number of processes per MCP server (default: 1000)
113+
NSJAIL_RLIMIT_NOFILE=1024 # Maximum number of open file descriptors (default: 1024)
114+
NSJAIL_RLIMIT_FSIZE=50 # Maximum file size in MB (default: 50)
115+
NSJAIL_TMPFS_SIZE=100M # Tmpfs size for /tmp directory (default: 100M)
113116

114117
# Process Idle Timeout (stdio MCP servers only)
115118
MCP_PROCESS_IDLE_TIMEOUT_SECONDS=180 # Idle timeout in seconds before terminating stdio processes (default: 180, set to 0 to disable)
@@ -121,12 +124,13 @@ MCP_PROCESS_IDLE_TIMEOUT_SECONDS=180 # Idle timeout in seconds before terminatin
121124

122125
These limits control resource allocation for MCP server processes running in nsjail isolation:
123126

124-
**NSJAIL_MEMORY_LIMIT_MB** (Default: 50)
127+
**NSJAIL_MEMORY_LIMIT_MB** (Default: 2048)
125128

126129
- Memory limit per MCP server process in megabytes
127-
- Prevents memory exhaustion attacks
128-
- Recommended range: 50-500 MB depending on MCP server requirements
129-
- Example: Set to 100 for memory-intensive MCP servers
130+
- **Minimum 2048MB required for Node.js V8 initialization** (cannot be reduced)
131+
- Based on empirical testing with npx and Node.js environments
132+
- Prevents memory exhaustion while supporting npm package operations
133+
- Note: This is the absolute minimum - complex MCP servers may need more
130134

131135
**NSJAIL_CPU_TIME_LIMIT_SECONDS** (Default: 60)
132136

@@ -135,11 +139,31 @@ These limits control resource allocation for MCP server processes running in nsj
135139
- Recommended range: 30-300 seconds
136140
- Note: This is CPU time, not wall-clock time
137141

138-
**NSJAIL_MAX_PROCESSES** (Default: 50)
142+
**NSJAIL_MAX_PROCESSES** (Default: 1000)
139143

140144
- Maximum number of child processes per MCP server
141-
- Prevents fork bombs and process exhaustion
142-
- Recommended range: 10-100 depending on MCP server needs
145+
- **Required: 1000 for npm operations** which spawn many child processes
146+
- Prevents fork bombs while allowing normal npm package installations
147+
- Based on empirical testing with npx package management
148+
149+
**NSJAIL_RLIMIT_NOFILE** (Default: 1024)
150+
151+
- Maximum number of open file descriptors per MCP server
152+
- Adequate for typical file I/O operations
153+
- Prevents file descriptor exhaustion attacks
154+
155+
**NSJAIL_RLIMIT_FSIZE** (Default: 50)
156+
157+
- Maximum file size in megabytes per MCP server
158+
- Prevents oversized npm package downloads
159+
- Accommodates 99% of legitimate npm packages (average: 416KB, 99th percentile: <10MB)
160+
- Blocks abuse while allowing packages with binary assets (TensorFlow, Puppeteer)
161+
162+
**NSJAIL_TMPFS_SIZE** (Default: 100M)
163+
164+
- Size limit for /tmp directory tmpfs mount
165+
- Sufficient for npm cache and temporary file operations
166+
- Balances functionality with memory efficiency
143167

144168
**Development Mode:**
145169

Lines changed: 28 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,38 @@
11
/**
22
* nsjail Resource Limits Configuration
33
* These limits apply only in production on Linux platforms when nsjail isolation is enabled
4+
*
5+
* Defaults based on empirical testing with npx and Node.js V8 requirements:
6+
* - 2048MB memory: Absolute minimum for V8 initialization (cannot be reduced)
7+
* - 1000 processes: Sufficient for npm operations which spawn many child processes
8+
* - 1024 file descriptors: Adequate for file I/O operations
9+
* - 50MB file size: Prevents oversized downloads while accommodating 99% of npm packages
10+
* - 100MB tmpfs: Sufficient for npm cache operations
411
*/
512
export const nsjailConfig = {
6-
/** Memory limit per MCP server process in MB (default: 50) */
7-
memoryLimitMB: parseInt(process.env.NSJAIL_MEMORY_LIMIT_MB || '50', 10),
13+
/** Memory limit per MCP server process in MB (default: 2048, V8 minimum) */
14+
memoryLimitMB: parseInt(process.env.NSJAIL_MEMORY_LIMIT_MB || '2048', 10),
815

916
/** CPU time limit per MCP server process in seconds (default: 60) */
1017
cpuTimeLimitSeconds: parseInt(process.env.NSJAIL_CPU_TIME_LIMIT_SECONDS || '60', 10),
1118

12-
/** Maximum number of processes per MCP server (default: 50) */
13-
maxProcesses: parseInt(process.env.NSJAIL_MAX_PROCESSES || '50', 10)
19+
/** Maximum number of processes per MCP server (default: 1000, required for npm) */
20+
maxProcesses: parseInt(process.env.NSJAIL_MAX_PROCESSES || '1000', 10),
21+
22+
/** Maximum number of open file descriptors (default: 1024) */
23+
maxOpenFiles: parseInt(process.env.NSJAIL_RLIMIT_NOFILE || '1024', 10),
24+
25+
/** Maximum file size in MB (default: 50, prevents oversized npm downloads) */
26+
maxFileSizeMB: parseInt(process.env.NSJAIL_RLIMIT_FSIZE || '50', 10),
27+
28+
/** Tmpfs size for /tmp directory (default: 100M) */
29+
tmpfsSize: process.env.NSJAIL_TMPFS_SIZE || '100M'
1430
};
31+
32+
/**
33+
* MCP Cache Base Directory
34+
* Base directory for MCP server cache storage
35+
* In production: /opt/deploystack (deploystack user's home)
36+
* Falls back to /opt/deploystack if HOME is not set
37+
*/
38+
export const mcpCacheBaseDir = process.env.HOME || '/opt/deploystack';

services/satellite/src/process/manager.ts

Lines changed: 97 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -2,10 +2,12 @@ import { spawn } from 'child_process';
22
import { EventEmitter } from 'events';
33
import { v4 as uuidv4 } from 'uuid';
44
import { Logger } from 'pino';
5+
import { mkdir } from 'fs/promises';
6+
import { existsSync } from 'fs';
57
import { MCPServerConfig, ProcessInfo } from './types';
68
import type { EventBus } from '../services/event-bus';
79
import type { RuntimeState } from './runtime-state';
8-
import { nsjailConfig } from '../config/nsjail';
10+
import { nsjailConfig, mcpCacheBaseDir } from '../config/nsjail';
911

1012
/**
1113
* Process Manager for MCP server subprocesses
@@ -310,7 +312,7 @@ export class ProcessManager extends EventEmitter {
310312
// Determine isolation mode based on environment
311313
const useNsjail = this.shouldUseNsjail();
312314
const childProcess = useNsjail
313-
? this.spawnWithNsjail(config)
315+
? await this.spawnWithNsjail(config)
314316
: this.spawnDirect(config);
315317

316318
const processInfo: ProcessInfo = {
@@ -912,41 +914,113 @@ export class ProcessManager extends EventEmitter {
912914
});
913915
}
914916

917+
/**
918+
* Ensure team-specific cache directory exists
919+
*/
920+
private async ensureCacheDirectory(teamId: string): Promise<string> {
921+
const cacheDir = `${mcpCacheBaseDir}/mcp-cache/${teamId}`;
922+
923+
if (!existsSync(cacheDir)) {
924+
this.logger.info({
925+
operation: 'create_cache_directory',
926+
team_id: teamId,
927+
cache_dir: cacheDir
928+
}, `Creating team cache directory: ${cacheDir}`);
929+
930+
try {
931+
await mkdir(cacheDir, { recursive: true });
932+
933+
this.logger.info({
934+
operation: 'cache_directory_created',
935+
team_id: teamId,
936+
cache_dir: cacheDir
937+
}, `Team cache directory created successfully`);
938+
} catch (error) {
939+
const errorMessage = error instanceof Error ? error.message : String(error);
940+
this.logger.error({
941+
operation: 'cache_directory_creation_failed',
942+
team_id: teamId,
943+
cache_dir: cacheDir,
944+
error: errorMessage
945+
}, `Failed to create team cache directory`);
946+
throw new Error(`Failed to create cache directory: ${errorMessage}`);
947+
}
948+
}
949+
950+
return cacheDir;
951+
}
952+
915953
/**
916954
* Spawn process with nsjail isolation (production mode on Linux)
955+
*
956+
* Configuration based on empirical testing with npx and Node.js:
957+
* - Memory: 2048MB (V8 minimum requirement)
958+
* - Processes: 1000 (npm spawns many child processes)
959+
* - File descriptors: 1024 (adequate for I/O operations)
960+
* - File size: 50MB (prevents oversized downloads)
961+
* - /dev files: Required for Node.js crypto and I/O operations
962+
* - --proc_rw: Required for pthread_create and thread management
917963
*/
918-
private spawnWithNsjail(config: MCPServerConfig) {
964+
private async spawnWithNsjail(config: MCPServerConfig) {
965+
// Ensure team-specific cache directory exists before mounting
966+
const cacheDir = await this.ensureCacheDirectory(config.team_id);
967+
919968
this.logger.info({
920969
operation: 'spawn_nsjail',
921970
installation_name: config.installation_name,
922971
team_id: config.team_id,
972+
cache_dir: cacheDir,
923973
memory_limit_mb: nsjailConfig.memoryLimitMB,
924974
cpu_time_limit_seconds: nsjailConfig.cpuTimeLimitSeconds,
925-
max_processes: nsjailConfig.maxProcesses
975+
max_processes: nsjailConfig.maxProcesses,
976+
max_open_files: nsjailConfig.maxOpenFiles,
977+
max_file_size_mb: nsjailConfig.maxFileSizeMB,
978+
tmpfs_size: nsjailConfig.tmpfsSize
926979
}, 'Spawning process with nsjail isolation');
927980

928-
// Build nsjail arguments
981+
// Get current user UID and GID (deploystack user in production)
982+
const uid = process.getuid ? process.getuid() : 1000;
983+
const gid = process.getgid ? process.getgid() : 1000;
984+
985+
// Build nsjail arguments based on working production configuration
929986
const nsjailArgs = [
930-
'-Mo', // Mount mode: once, don't remount
931-
'--rlimit_as', String(nsjailConfig.memoryLimitMB), // Memory limit (MB)
987+
'-Mo', // Mount mode: once, don't remount
988+
'--proc_rw', // CRITICAL: Required for Node.js pthread_create
989+
'--user', String(uid), // Use current user (deploystack)
990+
'--group', String(gid), // Use current group (deploystack)
991+
'--rlimit_as', String(nsjailConfig.memoryLimitMB), // Memory limit (MB) - 2048 minimum for V8
932992
'--rlimit_cpu', String(nsjailConfig.cpuTimeLimitSeconds), // CPU time limit (seconds)
933-
'--rlimit_nproc', String(nsjailConfig.maxProcesses), // Max processes
934-
'--time_limit', '0', // No wall-clock time limit
935-
'--user', '99999', // Non-root user
936-
'--group', '99999', // Non-root group
937-
'-R', '/usr', // Read-only mount: /usr
938-
'-R', '/lib', // Read-only mount: /lib
939-
'-R', '/lib64', // Read-only mount: /lib64
940-
'-R', '/bin', // Read-only mount: /bin
941-
'-R', '/etc/resolv.conf', // DNS resolution
942-
'-T', '/tmp', // Writable temp directory
943-
'--disable_clone_newnet', // Allow network access
944-
'--hostname', `mcp-${config.team_id}`, // Team-specific hostname
945-
// Inject environment variables
993+
'--rlimit_nproc', String(nsjailConfig.maxProcesses), // Max processes - 1000 for npm
994+
'--rlimit_nofile', String(nsjailConfig.maxOpenFiles), // Max file descriptors
995+
'--rlimit_fsize', String(nsjailConfig.maxFileSizeMB), // Max file size (MB)
996+
'--time_limit', '0', // No wall-clock time limit
997+
'-R', '/usr', // Read-only mount: /usr
998+
'-R', '/lib', // Read-only mount: /lib
999+
'-R', '/lib64', // Read-only mount: /lib64
1000+
'-R', '/bin', // Read-only mount: /bin
1001+
'-R', '/sbin', // Read-only mount: /sbin
1002+
'-R', '/etc', // Read-only mount: /etc (includes resolv.conf)
1003+
'-T', `/tmp:size=${nsjailConfig.tmpfsSize}`, // Writable temp with size limit (100M)
1004+
'-B', `${cacheDir}:/home/npx`, // Team-specific cache directory mount
1005+
'--bindmount', '/dev/null:/dev/null', // Required for I/O redirection
1006+
'--bindmount', '/dev/urandom:/dev/urandom', // Required for crypto operations
1007+
'--bindmount', '/dev/zero:/dev/zero', // Required for memory allocation
1008+
'--symlink', '/proc/self/fd:/dev/fd', // Required for file descriptor management
1009+
'-E', 'HOME=/home/npx', // Set HOME for npx cache
1010+
'-E', 'PATH=/usr/bin:/bin:/usr/local/bin', // Set PATH
1011+
'-E', 'NPM_CONFIG_CACHE=/home/npx/.npm', // npm cache location
1012+
'-E', 'NPM_CONFIG_PREFIX=/home/npx/.npm-global', // npm global prefix
1013+
'-E', 'NPM_CONFIG_UPDATE_NOTIFIER=false', // Disable update notifier
1014+
'-E', 'NO_UPDATE_NOTIFIER=1', // Disable update notifier (alternative)
1015+
// Inject user-provided environment variables
9461016
...Object.entries(config.env).flatMap(([key, value]) => ['-E', `${key}=${value}`]),
947-
'--', // End of nsjail args
948-
config.command, // MCP server command
949-
...config.args // MCP server arguments
1017+
'--disable_clone_newnet', // Allow network access (required for npm downloads)
1018+
'--disable_clone_newcgroup', // Disable cgroup namespace (causes clone() errors on some kernels)
1019+
'--disable_no_new_privs', // May be needed for some packages
1020+
'--hostname', `mcp-${config.team_id}`, // Team-specific hostname
1021+
'--', // End of nsjail args
1022+
config.command, // MCP server command (e.g., /usr/bin/npx)
1023+
...config.args // MCP server arguments
9501024
];
9511025

9521026
return spawn('nsjail', nsjailArgs, {

0 commit comments

Comments
 (0)