JupyterLab Interface Becomes Unresponsive During Multi-Process CPU-Intensive Operations in Terminal/Notebook #1571

@WangChangsongGit

Description

When running CPU-intensive multi-process Python scripts (using 4, 8, or 16 processes) in JupyterLab's Terminal or notebook cells, the JupyterLab interface becomes extremely unresponsive even when the container's total CPU cores (64 cores) are not fully utilized. The websocket connections and backend API responses become significantly slower, with the lag worsening as more processes are used.

The same scripts run directly in the container (outside JupyterLab) do not exhibit this performance degradation, suggesting this is JupyterLab-specific behavior rather than a general resource contention issue.

Reproduce

  1. Start a JupyterLab instance in a K8S container with 64 cores and 128GB RAM
  2. Open either:
    • A Terminal session in JupyterLab, OR
    • A notebook (.ipynb) file
  3. Execute a CPU-intensive Python script using multiprocessing with:
    • 4 processes (should use ~600% CPU)
    • 8 processes
    • 16 processes
  4. Observe that the JupyterLab interface becomes extremely laggy:
    • Terminal input/output delays
    • Slow websocket responses
    • Delayed backend API calls
  5. Compare with running the same script directly via shell access to the container - no performance issues occur
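
For a reproduction that does not depend on TensorFlow, the steps above can be sketched with a stdlib-only script. This is a minimal sketch, not the script used during testing: `burn_cpu`, `run_workers`, and `DURATION_S` are hypothetical names, and the pure-Python busy loop is an assumption standing in for the real workload. It may or may not trigger the same lag.

```python
import multiprocessing as mp
import time

# Hypothetical duration per run; raise it to observe the interface for longer.
DURATION_S = 2.0


def burn_cpu(seconds: float) -> int:
    """Spin in a pure-Python loop for roughly `seconds`; return the iteration count."""
    deadline = time.perf_counter() + seconds
    iterations = 0
    while time.perf_counter() < deadline:
        iterations += 1
    return iterations


def run_workers(num_procs: int, seconds: float) -> list:
    """Run `num_procs` busy-loop workers in separate processes and collect their counts."""
    with mp.Pool(processes=num_procs) as pool:
        return pool.map(burn_cpu, [seconds] * num_procs)


if __name__ == "__main__":
    # Process counts taken from the reproduction steps: 4, 8, and 16.
    for num_procs in (4, 8, 16):
        start = time.perf_counter()
        counts = run_workers(num_procs, DURATION_S)
        elapsed = time.perf_counter() - start
        print(f"{num_procs} processes: {elapsed:.1f}s wall, {sum(counts)} iterations")
```

Saving this to a file and running it once from a JupyterLab Terminal and once from a plain shell in the same container gives a like-for-like comparison of interface responsiveness.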

Expected behavior

JupyterLab should remain responsive when running multi-process workloads that don't fully utilize the container's CPU resources. The main JupyterLab process should be able to use the remaining available cores without significant performance degradation.

Context

This is particularly problematic for AI/ML development workflows where distributed multi-process training is common in notebook environments.

  • Operating System and version: Linux (K8S container environment)
  • Browser and version: Any browser accessing JupyterLab (Chrome 142.0.7444.59)
  • JupyterLab version: jupyterlab 4.4.8
  • ipykernel : 6.30.1
  • jupyter_client : 8.6.3
  • jupyter_core : 5.8.1
  • jupyter_server : 2.16.0

Additional information about the script used during testing:

import tensorflow as tf
import concurrent.futures
import time
import os
import numpy as np

# Define a compute-intensive task that also performs disk I/O
def compute_task(thread_id, matrix_size, num_iterations, output_dir):
    print(f"Thread {thread_id} started")
    with tf.device('/CPU:0'):
        # Create two large matrices
        matrix1 = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)
        matrix2 = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)
        
        for i in range(num_iterations):
            # Perform the matrix multiplication
            result = tf.matmul(matrix1, matrix2)
            
            # Write the result to disk
            output_file = os.path.join(output_dir, f"thread_{thread_id}.npy")
            np.save(output_file, result.numpy())
            
            if i % 10 == 0:
                print(f"Thread {thread_id} iteration {i}")
    
    print(f"Thread {thread_id} finished")

# Get the number of CPU cores (minus 20) to use as the thread count
num_threads = os.cpu_count() - 20
# num_threads = 1
print(f'num_threads: {num_threads}')

# Define the matrix size and the iteration count
matrix_size = 1000  # Matrix size
iterations_per_thread = 100000  # Iterations per thread

# Define the output directory
output_dir = "/home/work/output"
os.makedirs(output_dir, exist_ok=True)

# Create the thread pool
with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as executor:
    # Submit the tasks to the thread pool
    futures = [executor.submit(compute_task, i, matrix_size, iterations_per_thread, output_dir) for i in range(num_threads)]

    # Wait for all tasks to finish
    for future in concurrent.futures.as_completed(futures):
        future.result()

print("All threads finished")
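
The script above derives its worker count from `os.cpu_count() - 20`. On a machine with 20 or fewer cores this is zero or negative, and `os.cpu_count()` can return `None`. It also reports the machine's CPU count, which is not necessarily what the container is allowed to use. The following is a minimal sketch of a guarded version; `visible_cpus` and `worker_count` are hypothetical helper names and are not part of the original script. Neither value accounts for a cgroup CPU quota, so a K8S CPU limit may be lower than both.

```python
import os


def visible_cpus() -> dict:
    """Report the CPU counts Python sees; these can differ inside a container."""
    info = {"cpu_count": os.cpu_count()}
    # sched_getaffinity is only available on some platforms (e.g. Linux); it
    # returns the set of CPUs the current process is allowed to run on.
    if hasattr(os, "sched_getaffinity"):
        info["affinity"] = len(os.sched_getaffinity(0))
    return info


def worker_count(reserve: int = 20) -> int:
    """Return a worker count that leaves `reserve` CPUs free, never below 1."""
    info = visible_cpus()
    # Prefer the affinity-based count when present; fall back to cpu_count or 1.
    available = info.get("affinity") or info["cpu_count"] or 1
    return max(1, available - reserve)


if __name__ == "__main__":
    print(visible_cpus())
    print(f"num_threads: {worker_count()}")
```

Including the printed values in a report like this one makes it easier to tell how many cores the workload can actually occupy.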
