Description
What's the problem this feature will solve?
Currently, pip downloads packages sequentially (one at a time), which significantly underutilizes available network bandwidth and increases total installation time, especially when:
Installing packages with many dependencies (e.g., data science stacks, web frameworks)
Working on high-bandwidth connections where per-request latency, rather than bandwidth, is the bottleneck
Installing in CI/CD pipelines where speed is critical
Using package mirrors with good concurrent connection support
For example, installing a typical data science environment with 50+ packages can spend most of its time waiting for sequential downloads rather than actively utilizing the available network capacity. Users on modern networks (100+ Mbps) see pip using only a fraction of their bandwidth.
Describe the solution you'd like
Add support for parallel downloads with a configurable worker pool, similar to how other package managers (npm, cargo, dnf) handle concurrent downloads.
Proposed implementation:
Configuration parameter in PipSession:
Add a parallel_downloads parameter (default: 1, which keeps the current sequential behavior)
Validate that the value is at least 1
Parallel batch processing in Downloader:
Enhance batch() method to detect parallel_downloads > 1
Use ThreadPoolExecutor for concurrent downloads
Maintain existing error handling and retry logic
User-facing configuration options (future work):
CLI flag: pip install --parallel-downloads N
Environment variable: PIP_PARALLEL_DOWNLOADS=N
Config file setting in pip.conf
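The Downloader change described above could be sketched roughly as follows. This is a minimal, standalone illustration, not pip's actual code: `validate_workers`, `batch_download`, and the `fetch` callable are hypothetical names, and `fetch` stands in for the existing single-file download routine with its own retry handling.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple


def validate_workers(parallel_downloads: int) -> int:
    """Reject worker counts below 1 (hypothetical helper)."""
    if parallel_downloads < 1:
        raise ValueError("parallel_downloads must be at least 1")
    return parallel_downloads


def batch_download(
    links: Iterable[str],
    fetch: Callable[[str], str],
    parallel_downloads: int = 1,
) -> List[Tuple[str, str]]:
    """Return (link, result) pairs in the same order as the input links.

    `fetch` is an assumed stand-in for the existing per-file download
    function; its error handling and retries are left untouched.
    """
    workers = validate_workers(parallel_downloads)
    links = list(links)

    if workers == 1:
        # Default path: same sequential behavior as today.
        return [(link, fetch(link)) for link in links]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order and re-raises a worker's
        # exception when that result is consumed.
        return list(zip(links, pool.map(fetch, links)))
```

With `parallel_downloads=1` the thread pool is never created, so the default code path stays identical to the sequential one. Threads fit here because the work is dominated by blocking network I/O.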
Benefits:
Potentially 2-5x faster installation on typical networks (an estimate based on benchmarks of other package managers, not measured on pip)
Better resource utilization on modern high-bandwidth connections
Minimal code changes, leverages existing download infrastructure
Backward compatible (defaults to sequential behavior)
Alternative Solutions
Process-based parallelism
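Use multiprocessing instead of threads. Downloads are I/O-bound, so this would add process start-up and inter-process communication overhead without an obvious benefit over threads.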
Additional context
Use 5 parallel downloads:
pip install --parallel-downloads 5 requests pandas numpy scipy
Environment variable:
export PIP_PARALLEL_DOWNLOADS=3
pip install -r requirements.txt
Configuration file (pip.conf):
[global]
parallel-downloads = 4
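When more than one of these sources is set, the effective value would presumably follow pip's usual precedence: command line over environment variable over config file. The sketch below only illustrates that ordering. `resolve_parallel_downloads` is a hypothetical helper, and in practice pip's option machinery would likely derive the environment variable and config key from the long option name rather than need hand-written code like this.

```python
import os
from typing import Mapping, Optional


def resolve_parallel_downloads(
    cli_value: Optional[int] = None,
    environ: Optional[Mapping[str, str]] = None,
    config: Optional[Mapping[str, str]] = None,
) -> int:
    """Pick the worker count: CLI flag, then env var, then config, then 1."""
    environ = os.environ if environ is None else environ
    config = {} if config is None else config

    if cli_value is not None:
        value = cli_value
    elif "PIP_PARALLEL_DOWNLOADS" in environ:
        value = int(environ["PIP_PARALLEL_DOWNLOADS"])
    elif "parallel-downloads" in config:
        # Key as it would appear under [global] in pip.conf.
        value = int(config["parallel-downloads"])
    else:
        # Nothing set: keep the sequential default.
        value = 1

    if value < 1:
        raise ValueError("parallel downloads must be at least 1")
    return value
```

For example, with all three sources from the snippets above set at once, the CLI value of 5 would win over the environment's 3 and the config file's 4.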
Code of Conduct
- I agree to follow the PSF Code of Conduct.