Date: 2024-11-18
Topic: Conda vs Pip, and Docker network optimization
Today I faced a decision point: should we use Conda or Pip for dependency management? This involved understanding the fundamental differences between these tools.
The core difference:
Conda is both an environment manager and a package manager, and it can install non-Python packages as well as Python ones. Pip is purely a package manager for Python packages.
Key considerations:
- Complexity of environment management
- Build process efficiency
- Deployment environment consistency
- Maintenance costs
Infrastructure setup:
# Pin the Debian release so the base image matches the bullseye apt source below
FROM python:3.9-slim-bullseye
WORKDIR /app
# Configure apt sources for China
RUN echo "deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bullseye main contrib non-free" \
    > /etc/apt/sources.list
# System dependency installation
RUN apt-get update && apt-get install -y \
    git \
    poppler-utils \
    libgl1-mesa-glx \
    libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*
Optimized Docker daemon settings:
{
  "registry-mirrors": [
    "https://hub-mirror.c.163.com",
    "https://mirror.baidubce.com",
    "https://registry.docker-cn.com"
  ],
  "dns": [
    "8.8.8.8",
    "8.8.4.4"
  ],
  "max-concurrent-downloads": 3,
  "max-concurrent-uploads": 3,
  "mtu": 1400
}
Historical progression:
Basic Package Management (pip) → Virtual Environments (virtualenv) → Integrated Environment Management (Conda)
Modern trends:
- Rise of Poetry
- Containerized deployment
- Improved dependency resolution algorithms
# requirements.txt
fastapi>=0.68.0
uvicorn>=0.15.0
python-multipart>=0.0.5
pillow>=8.3.1
pdf2image>=1.16.0
python-magic>=0.4.24
loguru>=0.5.3
pydantic<2.0.0  # Upper-bound constraint example: stays on the 1.x series
Best practices:
- Clear version constraints
- Grouped dependency management
- Environment isolation
- Regular updates
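To make "clear version constraints" checkable, here is a minimal sketch that sorts the lines of a requirements file by how tightly each package is constrained. The function name `classify_requirements` and the four category names are my own, and the parser is deliberately simplified: it assumes plain `name<specifier>` lines and ignores extras, environment markers, and URL requirements.

```python
import re

# Matches a plain requirement line such as "pydantic<2.0.0" or "fastapi>=0.68.0,<1.0".
# Simplified on purpose: no extras, environment markers, or URL requirements.
_REQ = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$")


def classify_requirements(text):
    """Group package names by how tightly their version is constrained."""
    report = {"pinned": [], "bounded": [], "open_ended": [], "unconstrained": []}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = _REQ.match(line)
        if match is None:
            continue
        name = match.group(1)
        spec = match.group(2).replace(" ", "")
        clauses = [c for c in spec.split(",") if c]
        if not clauses:
            report["unconstrained"].append(name)  # e.g. "requests"
        elif any(c.startswith("==") for c in clauses):
            report["pinned"].append(name)  # e.g. "loguru==0.5.3"
        elif any(c.startswith(("<", "~=")) for c in clauses):
            report["bounded"].append(name)  # has an upper bound
        else:
            report["open_ended"].append(name)  # e.g. ">=" with no upper bound
    return report
```

Run against the requirements.txt above, everything except pydantic would land in `open_ended`, which is a reminder that a `>=` constraint alone does not protect against a breaking major release.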
When choosing a technology stack, consider:
| Dimension | Considerations |
|---|---|
| Maturity | How stable is it? |
| Community | Is there active support? |
| Learning Curve | How long to get productive? |
| Maintenance Cost | Long-term overhead? |
| Ecosystem | Integration options? |
- Simplicity: Avoid unnecessary complexity
- Maintainability: Focus on long-term maintenance costs
- Scalability: Reserve space for future expansion
Today's main lesson was about making technology choices. The Conda vs Pip debate is a good example of "no best choice, only the most suitable choice."
Conda is powerful but adds complexity. It ships its own Python interpreter, which can shadow or conflict with the system Python when the wrong one comes first on PATH. For simple projects, this overhead isn't justified.
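When it is unclear which Python is actually running, a quick check helps. The sketch below is one way to do it, under these assumptions: a Conda environment is recognized by the `conda-meta` directory in its prefix, a venv or virtualenv by `sys.prefix` differing from `sys.base_prefix`, and anything else is labeled "global". The function name `describe_interpreter` and the label strings are my own.

```python
import os
import sys


def describe_interpreter(prefix=None, base_prefix=None, environ=None):
    """Return a dict describing which kind of Python environment is active."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    environ = os.environ if environ is None else environ

    # Conda environments keep their package metadata in a "conda-meta" directory.
    in_conda = os.path.isdir(os.path.join(prefix, "conda-meta"))
    # Inside a venv/virtualenv, sys.prefix differs from sys.base_prefix.
    in_venv = prefix != base_prefix

    if in_conda:
        kind = "conda"
    elif in_venv:
        kind = "venv"
    else:
        kind = "global"  # system Python or another non-isolated install
    return {
        "kind": kind,
        "prefix": prefix,
        # Set by "conda activate"; None when no Conda environment is active.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }
```

Calling `describe_interpreter()` with no arguments inspects the running interpreter; the parameters exist only so the logic can be exercised without switching environments.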
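Pip is simpler but installs only Python packages. If a package needs system libraries (OpenCV, for example, needs shared libraries such as the libgl1-mesa-glx and libglib2.0-0 packages installed in the Dockerfile above), you have to install them yourself.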
For our Docker-based deployment, we chose pip inside Docker. Docker handles system dependencies through apt-get, and pip handles Python packages. This separation keeps each tool doing what it does best.
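Because the two layers are installed by different tools, it is easy for one to be missing without the other noticing. A minimal sketch of a sanity check that could run inside the container: it looks for system binaries on PATH and for importable top-level Python modules. The function name `check_environment` is hypothetical, and the binary and module names in the usage note are my assumptions about what the packages above provide.

```python
import importlib.util
import shutil


def check_environment(binaries, modules):
    """Return the system binaries and Python modules that cannot be found.

    binaries: executable names expected on PATH (installed via apt-get).
    modules:  top-level import names expected to be importable (installed via pip).
    """
    missing_binaries = [b for b in binaries if shutil.which(b) is None]
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    return {"binaries": missing_binaries, "modules": missing_modules}
```

For this project a call might look like `check_environment(["git", "pdftoppm"], ["fastapi", "uvicorn", "PIL", "pdf2image", "loguru", "pydantic"])`, assuming `pdftoppm` is the poppler-utils binary that pdf2image relies on and `PIL` is the import name for pillow. An empty result in both lists means both layers are in place.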
Topics to explore next:
- Poetry for modern Python packaging
- Docker network modes (bridge, host, overlay)
- Dependency resolution algorithms
- Virtual environment internals