Note: This is a tutorial to start building an example of a distributed system and is not intended for a production deployment.
We welcome contributions! Please see the CONTRIBUTING.md file for guidelines on how to contribute to this project.
Part 3: Dockerizing the Blender Server and Orchestrator
- Introduction
- Why Docker?
- Limitations and Considerations
- Warning: Limited CPU/GPU Resources in Docker
- Step 1: Installing Docker
- Step 2: Changes Required in server.js for Supporting Docker
- Step 3: Docker Network
- Step 4: Dockerfile for the Blender Server
- Step 5: Changes Required in orchestrator.js for Supporting Docker Network
- Dockerfile for the Orchestrator
- Test the Orchestrator
- Conclusion
In this tutorial, we will create a web app that allows users to submit Blender jobs. The app will consist of a NodeJS and Express API with endpoints to submit and check the status of rendering jobs. We'll use a hardcoded Blender example file for demonstration purposes.
First, make sure you have NodeJS and NPM installed. You can download them from the official NodeJS website.
Ensure Blender is installed on your system and accessible via the command line. You can download Blender from the official Blender website.
The image used in this example is the Racing car sample.
NOTE ABOUT BLENDER: Blender rendering is a resource-intensive process. In this tutorial, some examples suggest rendering up to 5 frames or running multiple render processes in parallel on your machine. If your computer struggles with this, adjust these numbers to a reasonable value so you can follow the tutorial.
We are going to create our working folder for this tutorial:
mkdir distributed-system-tutorial
cd distributed-system-tutorial

This tutorial has been tested on macOS / Linux. If you are using Windows, please make sure your paths are escaped correctly and that spaces in them are handled properly.
Set up a new NodeJS project and install the necessary dependencies:
mkdir server
cd server
npm init -y
npm install express

Create a file named server.js and set up a basic Express server:
const express = require('express');
const app = express();
app.use(express.json());
const port = 3000;
app.listen(port, () => {
console.log(`Server running on port ${port}`);
});

You can run the following command to start the server:

node server.js

Define constants for blenderPath, blendFilePath, and outputDir:
const blenderPath = '{PUT HERE THE BLENDER PATH}'; // Populate with actual Blender path
const blendFilePath = '/blend/files/splash-pokedstudio.blend'; // Populate with actual blend file path
const outputDir = '{PUT HERE THE OUTPUT DIR}'; // Populate with desired output directory

Create a directory for storing the blend file to use:
mkdir blend
cd blend
mkdir files

This endpoint will accept a JSON payload with "from" and "to" properties, invoke Blender with the specified frame interval, and return the PID of the process.
const { exec } = require('child_process');
const express = require('express');
const app = express();
app.use(express.json());
const port = 3000;
const jobs = {}; // Store job processes by their PIDs
app.listen(port, () => {
console.log(`Server running on port ${port}`);
});
const blenderPath = '{PUT HERE THE BLENDER PATH}'; // Populate with actual Blender path
const blendFilePath = '/blend/files/splash-pokedstudio.blend'; // Populate with actual blend file path
const outputDir = '{PUT HERE THE OUTPUT DIR}'; // Populate with desired output directory
// POST /job endpoint with headers
app.post('/job', (req, res) => {
const { from, to } = req.body;
if (from === undefined || to === undefined) {
return res.status(400).send('Invalid input');
}
const command = `${blenderPath} -b ${blendFilePath} -o ${outputDir}/blender-render_#### -E \"CYCLES\" -s ${from} -e ${to} -t 0 -a`;
const jobProcess = exec(command);
const pid = jobProcess.pid;
jobs[pid] = jobProcess;
// Capture and log output
jobProcess.stdout.on('data', (data) => {
console.log(`Job ${pid} stdout: ${data}`);
});
jobProcess.stderr.on('data', (data) => {
console.error(`Job ${pid} stderr: ${data}`);
});
jobProcess.on('close', (code) => {
console.log(`Job ${pid} exited with code ${code}`);
});
// Calculate a suggested retry-after time (e.g., 5 seconds)
const retryAfter = 5; // seconds
res.status(202)
.header('Location', `/job/${pid}`)
.header('Retry-After', retryAfter)
.send({ pid });
});

This endpoint will check the status of the given PID and return the appropriate status code based on the job status.
const { spawnSync } = require('child_process');
app.get('/job/:jobId', (req, res) => {
const jobId = parseInt(req.params.jobId);
const jobProcess = jobs[jobId];
if (!jobProcess) {
return res.status(404).send('Job not found');
}
const result = spawnSync('ps', ['-p', jobId.toString()]);
if (result.status !== 0) {
res.status(200).send('Job completed');
} else {
const retryAfter = 5; // Suggest retry after 5 seconds
res.status(202)
.header('Location', `/job/${jobId}`)
.header('Retry-After', retryAfter)
.send('Job still running');
}
});

The API is designed to handle multiple jobs concurrently by storing PIDs in memory and allowing multiple simultaneous requests. For this tutorial the configuration is left as is, but it should be tuned to the machines where it will be executed.
The Blender command used in the API is structured as follows:
const command = `${blenderPath} -b ${blendFilePath} -o ${outputDir}/blender-render_#### -E \"CYCLES\" -s ${from} -e ${to} -t 0 -a`;

- -b: Runs Blender in background (no GUI).
- ${blendFilePath}: Path to the Blender file to be rendered.
- -o ${outputDir}/blender-render_####: Specifies the output directory and file pattern.
- -E "CYCLES": Sets the rendering engine to Cycles.
- -s ${from}: Start frame.
- -e ${to}: End frame.
- -t 0: Use all available threads.
- -a: Render animation.
For more details on Blender command-line arguments, refer to the Blender documentation.
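For example, with the constants filled in with typical paths (the values here are illustrative, assuming Blender at /usr/bin/blender), the command executed for frames 1 to 5 would expand to the following; Blender replaces #### with the zero-padded frame number, producing files like blender-render_0001.png:

/usr/bin/blender -b /app/blend/files/splash-pokedstudio.blend -o /app/output/blender-render_#### -E CYCLES -s 1 -e 5 -t 0 -a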
Here are some example curl commands to test the API functionality implemented in Part 1:
This command submits a request to render frames from 1 to 5.
curl -X POST http://localhost:3000/job -H "Content-Type: application/json" -d '{"from": 1, "to": 5}'

To kill a Blender process manually, use the kill command followed by the process ID.

For macOS/Linux:

kill {processId}

For Windows:

taskkill /PID {processId} /T /F

Replace {processId} with the actual PID of the Blender process you want to terminate.
- 202 Accepted: This status code indicates that the request has been accepted for processing, but the processing is not yet complete. According to RFC 7231, Section 6.3.3, it allows the server to provide information about when the client should check back for the status.
- 200 OK: This status code indicates that the request was successful and the processing is complete. According to RFC 7231, Section 6.3.1, it is the standard response for successful HTTP requests.
- Headers:
  - Location: Indicates the URL to check the status of the job.
  - Retry-After: Suggests how long (in seconds) the client should wait before retrying the request to check the job status.
Polling is a technique where the client repeatedly requests the status of a job at regular intervals until the job is complete. This is useful when the client needs to know the result of a long-running process without holding the connection open.
The Retry-After header guides the client on when to make the next request, balancing the load on the server and ensuring timely updates to the client.
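As an illustration of this pattern, here is a minimal polling client sketch. It assumes the Part 1 server is running on localhost:3000, the PID is passed as a command-line argument, and axios is installed (npm install axios):

const axios = require('axios');

async function pollJob(pid) {
  while (true) {
    const res = await axios.get(`http://localhost:3000/job/${pid}`);
    if (res.status === 200) {
      console.log('Job completed');
      return;
    }
    // Honor the Retry-After header; fall back to 5 seconds if it is missing.
    const retryAfter = parseInt(res.headers['retry-after'], 10) || 5;
    console.log(`Job still running, retrying in ${retryAfter}s`);
    await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
  }
}

pollJob(process.argv[2]).catch((err) => console.error(err.message));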
In this tutorial, we've created a web app to submit and monitor Blender rendering jobs. We have set up a NodeJS and Express API with endpoints for job submission and status checking. We've also included instructions on how to manage Blender processes manually and we have explained the use of HTTP status codes and headers.
In this part of the tutorial, we will create an orchestrator to manage and distribute Blender rendering jobs across multiple nodes. The orchestrator will be a NodeJS and Express API that communicates with the API from Part 1 to submit and monitor rendering jobs. This example is designed to demonstrate the basics of building a distributed application and is not intended for production deployment.
An orchestrator helps manage and distribute workloads across multiple nodes, ensuring efficient use of resources and parallel processing. It can split a large task into smaller batches, distribute these batches to different nodes, and monitor their progress. This approach improves scalability, fault tolerance, and resource utilization.
Set up a new NodeJS project for the orchestrator:
mkdir orchestrator
cd orchestrator
npm init -y
npm install express axios

Create a file named orchestrator.js and set up the orchestrator server:
const express = require('express');
const axios = require('axios'); // For making HTTP requests to the nodes
const app = express();
app.use(express.json());
const port = 4000;
const NODES = [
'http://localhost:3001',
'http://localhost:3002',
'http://localhost:3003',
]; // List of node endpoints
const BATCH_SIZE = 5; // Define batch size as a constant
let jobs = {}; // Store all jobs
// POST /render endpoint to start rendering a movie
app.post('/render', async (req, res) => {
const { from, to } = req.body;
if (from === undefined || to === undefined) {
return res.status(400).send('Invalid input');
}
const jobId = generateJobId();
jobs[jobId] = { status: 'pending', batches: [] };
const frameChunks = splitFramesIntoChunks(from, to, BATCH_SIZE);
const jobPromises = frameChunks.map((chunk, index) =>
assignJobToNode(NODES[index % NODES.length], chunk.from, chunk.to, jobId)
);
try {
const results = await Promise.all(jobPromises);
jobs[jobId].batches.push(...results); // Save batch info
res.status(202).header('Location', `/status/${jobId}`).send({ jobId });
} catch (error) {
console.error('Error assigning jobs:', error);
jobs[jobId].status = 'failed';
res.status(500).send('Failed to distribute jobs');
}
});
// GET /status/:jobId endpoint to check overall job status
app.get('/status/:jobId', async (req, res) => {
const jobId = req.params.jobId;
const job = jobs[jobId];
if (!job) {
return res.status(404).send('Job not found');
}
try {
const statusPromises = job.batches.map(batch =>
checkJobStatus(batch.node, batch.pid)
);
const statuses = await Promise.all(statusPromises);
const allCompleted = statuses.every(status => status === 'completed');
if (allCompleted) {
job.status = 'completed';
res.status(200).send('Job completed');
} else {
job.status = 'in-progress';
res.status(202).header('Location', `/status/${jobId}`).header('Retry-After', 5).send('Job still running');
}
} catch (error) {
console.error('Error checking job statuses:', error);
res.status(500).send('Failed to fetch statuses');
}
});
// Utility function to generate a unique job ID
function generateJobId() {
return Math.random().toString(36).substring(2, 15);
}
// Utility function to split frames into chunks
function splitFramesIntoChunks(from, to, batchSize) {
const chunks = [];
for (let i = from; i <= to; i += batchSize) {
chunks.push({ from: i, to: Math.min(i + batchSize - 1, to) });
}
return chunks;
}
// Utility function to assign a job to a node
async function assignJobToNode(nodeUrl, from, to, jobId) {
console.log(`Invoking URL: ${nodeUrl}/job with frames ${from} to ${to}`);
try {
const response = await axios.post(`${nodeUrl}/job`, { from, to });
console.log(`Job assigned to ${nodeUrl} with PID: ${response.data.pid}`);
return { node: nodeUrl, pid: response.data.pid };
} catch (error) {
console.error(`Failed to assign job to ${nodeUrl}:`, error.message);
throw error;
}
}
// Utility function to check job status
async function checkJobStatus(nodeUrl, pid) {
const response = await axios.get(`${nodeUrl}/job/${pid}`);
// The node responds with plain text, so compare the response body directly
return response.data === 'Job completed' ? 'completed' : 'running';
}
// Start the orchestrator server
app.listen(port, () => {
console.log(`Orchestrator running on port ${port}`);
});

The node URLs in the NODES array are based on Part 1 of this tutorial. They can be the same (e.g., http://localhost:3000 if you run multiple instances of the node API) or different, depending on your setup. This is a tutorial example, and the configuration may need to be adjusted for your specific environment.
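One hedged way to run several local node instances (so that the three URLs above each resolve to a live server) is to make the server's port configurable through an environment variable. This is an optional tweak, not part of the original server.js:

const port = process.env.PORT || 3000; // replaces the hardcoded port in server.js

You could then start three instances in separate terminals:

PORT=3001 node server.js
PORT=3002 node server.js
PORT=3003 node server.js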
Once the POST request is made, you can watch Blender running in the terminal of server.js (from Part 1).
Navigate to the orchestrator directory and start the server:
node orchestrator.js

You should see a message indicating that the server is running on port 4000:
Orchestrator running on port 4000
To submit a request to render frames 1 to 20, use the following curl command:
curl -X POST http://localhost:4000/render -H "Content-Type: application/json" -d '{"from": 1, "to": 20}' -i

You should receive a response similar to:

HTTP/1.1 202 Accepted
Location: /status/abcd1234
Content-Type: application/json
Content-Length: 20
{"jobId":"abcd1234"}
To check the status of the job, use the following curl command, replacing abcd1234 with the actual job ID returned by the POST request:
curl -X GET http://localhost:4000/status/abcd1234 -i

While the job is running, the response will look like this:

HTTP/1.1 202 Accepted
Location: /status/abcd1234
Retry-After: 5
Content-Type: text/plain
Content-Length: 17
Job still running
Once all batches are complete, the response will be:

HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 13
Job completed
If the job ID is unknown, the response will be:

HTTP/1.1 404 Not Found
Content-Type: text/plain
Content-Length: 13
Job not found
A promise in JavaScript represents the eventual completion (or failure) of an asynchronous operation and its resulting value. Promises are used to handle asynchronous tasks in a more manageable way compared to callbacks. In the orchestrator, we use promises to handle HTTP requests to nodes and check job statuses.
For more information on promises, refer to the MDN Web Docs on Promises.
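As a minimal, self-contained sketch of the pattern the orchestrator relies on (the URLs are illustrative):

const axios = require('axios');

const urls = ['http://localhost:3001/job/123', 'http://localhost:3002/job/456'];

async function checkAll() {
  // Promise.all resolves once every request has resolved,
  // and rejects as soon as any single request rejects.
  const responses = await Promise.all(urls.map((url) => axios.get(url)));
  responses.forEach((res, i) => console.log(`${urls[i]} -> ${res.status}`));
}

checkAll().catch((err) => console.error('One of the requests failed:', err.message));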
In Part 2 of this tutorial, we've created an orchestrator to manage and distribute Blender rendering jobs across multiple nodes. The orchestrator API splits the frame range into smaller batches, submits them to the nodes, and monitors their progress. We used NodeJS and Express to build the orchestrator and provided curl examples for submitting and checking the status of jobs.
+---------+
| Client |
+---------+
|
| Render Request (frames 1-20)
v
+-----------------+
| Orchestrator |
| (Split & Assign)|
+-----------------+
| \
Batch 1 | \ Batch 2,3,...
v \
+---------------+ +---------------+
| Blender | | Blender |
| Server | | Server |
+---------------+ +---------------+
^ \
| \
+--------+
|
(Status Updates, Aggregation)
|
v
+----------+
| Client |
+----------+
In this part of the tutorial, we'll create Docker images for both the Blender server (from Part 1) and the orchestrator (from Part 2). We'll use the linuxserver/blender Docker image as a base for the server and build custom Docker images for both components. This process will help ensure consistency, portability, and ease of deployment.
Docker provides a way to package and run applications in isolated environments called containers. This approach offers several benefits:
- Consistency: Ensures that the application runs the same way in any environment.
- Portability: Containers can be easily shared and deployed across different systems.
- Isolation: Keeps applications and their dependencies separate from the host system.
- Scalability: Simplifies scaling applications by running multiple containers.
- Image Size: Blender is a large application, so the Docker image may be substantial in size.
- Resource Management: Running Blender inside a container may require careful tuning of resource limits to avoid overwhelming the host system.
- Storage: Ensure that the Blender files and output directories are properly mounted to access them outside the container.
It's important to note that Docker has limited access to CPU and GPU resources, which may not be ideal for resource-intensive applications like Blender. For more information, refer to Docker's documentation on runtime resource constraints.
However, for the purpose of this tutorial, Docker is used to demonstrate the basics of containerization and orchestration.
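If rendering does overwhelm your host, Docker can cap a container's CPU and memory at run time. For example (the values are illustrative; tune them to your machine):

docker run --cpus="2" --memory="4g" blender-server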
To get started with Docker, you'll need to install it on your machine. Follow the installation instructions for your operating system on the official Docker website.
Update the server.js file with the following changes to support running inside a Docker container:
const port = 3000; // Port to match Dockerfile
const blenderPath = '/usr/bin/blender'; // Blender path inside the container
const blendFilePath = '/app/blend/files/splash-pokedstudio.blend'; // Adjust to your mounted volume path
const outputDir = '/app/output'; // Adjust to your mounted volume path (the command appends /blender-render_####)

To allow communication between the Blender server and the orchestrator, create a custom Docker network:
docker network create blender-network

Create a Dockerfile for the Blender server in the project directory:
# Use the linuxserver/blender Docker image as the base
FROM linuxserver/blender:latest
# Install NodeJS and npm
RUN apt-get update && apt-get install -y \
nodejs \
npm \
&& rm -rf /var/lib/apt/lists/*
# Set the working directory inside the container
WORKDIR /app
# Copy application files into the container
COPY . .
# Install NodeJS dependencies
RUN npm install
# Expose the port that your API will run on
EXPOSE 3000
# Override the entrypoint to bypass starting Xvnc and Openbox
ENTRYPOINT []
# Start the NodeJS application
CMD ["node", "server.js"]-
Build the Docker image:
docker build -t blender-server . -
Run the container, ensuring that the Blender files and output directories are properly mounted:
docker run -p 3000:3000 --network blender-network --name blender-server -v /path/to/blend/files:/app/blend/files -v /path/to/output:/app/output blender-server
Replace /path/to/blend/files and /path/to/output with the actual paths on your host system.
NOTE: As /path/to/blend/files is usually a folder, make sure the trailing path separator is correct for your platform.
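For example, if you run the command from the server directory created in Part 1, the mounts might look like this (a sketch; adjust the paths to your layout):

docker run -p 3000:3000 --network blender-network --name blender-server \
  -v "$(pwd)/blend/files:/app/blend/files" \
  -v "$(pwd)/output:/app/output" \
  blender-server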
Update the orchestrator.js file with the following changes to use the container name as the hostname:
const NODES = [
'http://blender-server:3000', // Use the container name as the hostname
];

Create a Dockerfile for the orchestrator in its project directory:
# Use the official NodeJS image as the base
FROM node:16
# Set the working directory inside the container
WORKDIR /app
# Copy application files into the container
COPY . .
# Install NodeJS dependencies
RUN npm install
# Expose the port that your API will run on
EXPOSE 4000
# Start the NodeJS application
CMD ["node", "orchestrator.js"]-
Build the Docker image:
docker build -t orchestrator . -
Run the container on the custom network:
docker run -p 4000:4000 --network blender-network orchestrator
Submit a render request for 20 frames (split into batches of 5 frames) using curl:
curl -X POST http://localhost:4000/render -H "Content-Type: application/json" -d '{"from": 1, "to": 20}' -i

This is the same curl command used in the non-Docker setup.
Dockerizing the Blender server and orchestrator ensures consistent, portable, and isolated environments for running these applications.
In this part of the tutorial, we'll use Docker Compose to manage and run multiple containers for the Blender server and the orchestrator.
Docker Compose simplifies the process of managing multi-container Docker applications. It allows you to define and run multiple services in a single file (docker-compose.yml), making it easier to manage dependencies, networking, and configuration.
Create a docker-compose.yml file in the project directory to define the three Blender server instances and the orchestrator:
version: '3.8'
services:
blender-server-1:
build:
context: ./server
ports:
- "3001:3000"
volumes:
- ./blend/files:/app/blend/files
- ./output:/app/output
networks:
- blender-network
blender-server-2:
build:
context: ./server
ports:
- "3002:3000"
volumes:
- ./blend/files:/app/blend/files
- ./output:/app/output
networks:
- blender-network
blender-server-3:
build:
context: ./server
ports:
- "3003:3000"
volumes:
- ./blend/files:/app/blend/files
- ./output:/app/output
networks:
- blender-network
orchestrator:
build:
context: ./orchestrator
ports:
- "4000:4000"
depends_on:
- blender-server-1
- blender-server-2
- blender-server-3
networks:
- blender-network
networks:
blender-network:
driver: bridge

Ensure your distributed-system-tutorial directory is structured as follows:
distributed-system-tutorial/
│
├── server/
│   ├── Dockerfile
│   ├── server.js
│   ├── package.json
│   └── blend/
│       └── files/
├── orchestrator/
│   ├── Dockerfile
│   ├── orchestrator.js
│   └── package.json
└── docker-compose.yml
Update the NODES array in the orchestrator.js file to reference the three Blender servers:
const NODES = [
'http://blender-server-1:3000', // First server
'http://blender-server-2:3000', // Second server
'http://blender-server-3:3000', // Third server
]; // List of node endpoints

Navigate to the project directory and run the following command to build and start the services:

docker-compose up --build

This command will build the Docker images for the three Blender servers and the orchestrator, mount the volumes, and start the containers.
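A few companion commands are handy while testing (run them from the directory containing docker-compose.yml):

docker-compose ps                     # list the running services
docker-compose logs -f orchestrator   # follow the orchestrator's logs
docker-compose down                   # stop and remove the containers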
Submit a render request for 20 frames (split into batches of 5 frames) using curl:
curl -X POST http://localhost:4000/render -H "Content-Type: application/json" -d '{"from": 1, "to": 20}' -i

This is the same curl command used in the non-Docker setup.
Using Docker Compose simplifies the management and deployment of multi-container applications like the Blender server and orchestrator. By setting up three Blender server instances, you can distribute rendering tasks more efficiently. This setup can be easily extended for deployment to other environments.
+---------+
| Client |
+---------+
|
| Render Request (frames 1-20)
v
+------------------+
| Orchestrator |
| (Split job into |
| batches) |
+------------------+
/ | \ ...
/ | \
+----------+ +---------+ +----------+
| Blender | | Blender | | Blender |
| Server 1 | | Server 2| | Server 3 |
+----------+ +---------+ +----------+
^ ^ ^
| | |
+----Status Updates-------+
|
v
+---------+
| Client |
+---------+
In this chapter, we'll explore potential enhancements and improvements to optimize the Blender server and orchestrator setup. These suggestions aim to improve performance, scalability, reliability, and ease of deployment in production environments.
To handle scenarios where too many requests are made to the server, we can implement rate limiting. When the rate limit is exceeded, the server should respond with a 429 Too Many Requests status code. Rate limiting can help prevent overloading the server and ensure fair usage.
Key Points:
- Implement Rate Limiting: Use libraries like express-rate-limit in NodeJS to limit the number of requests (see the sketch after this list).
- Return 429 Status Code: Configure the rate limiter to return a 429 Too Many Requests status code when the limit is exceeded.
- Documentation: RFC 6585 - Additional HTTP Status Codes
To ensure that the Blender server processes one job at a time, we can implement a check to reject new jobs if a process is already running. The orchestrator should handle this by queuing pending requests and assigning them to available servers when they become free.
Key Points:
- Single Job Processing: Modify the server to check if a job is already running and reject new jobs with an appropriate status code (e.g., 503 Service Unavailable); a sketch follows this list.
- Job Queueing in Orchestrator: Implement a job queue in the orchestrator to manage pending requests and assign them to available servers.
- Handling Job Rejections: The orchestrator should retry rejected jobs until they are successfully assigned.
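One possible sketch of the single-job check on the Blender server. This is not part of the original code; it reuses the constants and the Blender command from Part 1, and the Retry-After value is illustrative:

const { exec } = require('child_process');

let busy = false; // true while a render is in flight

app.post('/job', (req, res) => {
  if (busy) {
    // Tell the orchestrator when to try this node again.
    return res.status(503).header('Retry-After', 30).send('Node busy');
  }
  busy = true;
  // Build the Blender command exactly as in Part 1.
  const command = `${blenderPath} -b ${blendFilePath} -o ${outputDir}/blender-render_#### -E \"CYCLES\" -s ${req.body.from} -e ${req.body.to} -t 0 -a`;
  const jobProcess = exec(command);
  jobProcess.on('close', () => {
    busy = false; // free the node once the render finishes
  });
  res.status(202).send({ pid: jobProcess.pid });
});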
Deploying the Blender server and orchestrator in a production environment requires careful planning to ensure scalability, reliability, and security. Azure Kubernetes Service (AKS) is a robust platform for managing containerized applications in production.
Key Points:
- Cluster Autoscaler: Configure AKS to automatically scale the number of nodes based on resource demands.
- Horizontal Pod Autoscaler: Set up the Horizontal Pod Autoscaler to scale the number of pods based on CPU utilization or other metrics.
- Rolling Updates: Use rolling updates to deploy changes without downtime.
- Monitoring and Logging: Integrate with Azure Monitor and Azure Log Analytics to monitor and log application performance and issues.
- Documentation: Azure Kubernetes Service (AKS) Documentation
Implementing a discovery system can automatically detect how many nodes (Blender servers) are available. This can help dynamically adjust the load distribution and optimize resource utilization.
Key Points:
- Service Discovery: Use tools like Consul, etcd, or Kubernetes built-in service discovery to detect available nodes.
- AKS Integration: When using AKS, service discovery is automatically handled by Kubernetes, which maintains an updated list of available pods.
- Documentation: Kubernetes Service Discovery
Consider using message queues to decouple the orchestrator from the Blender servers. Queues can improve reliability and scalability by buffering requests and ensuring they are processed even if some servers are temporarily unavailable.
Key Points:
- Message Queues: Use systems like RabbitMQ, Kafka, or Azure Service Bus to manage job requests.
- Decoupling: Queues help decouple the orchestrator from the Blender servers, enabling asynchronous processing.
- Retry Mechanism: Queues can automatically retry failed jobs, improving reliability.
- Documentation: RabbitMQ Documentation, Apache Kafka Documentation, Azure Service Bus Documentation
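As one hedged illustration of this decoupling, here is a producer sketch using RabbitMQ via the amqplib package (npm install amqplib); the queue name and connection URL are illustrative:

const amqp = require('amqplib');

async function enqueueJob(from, to) {
  const connection = await amqp.connect('amqp://localhost');
  const channel = await connection.createChannel();
  await channel.assertQueue('render-jobs', { durable: true });
  // The orchestrator publishes jobs; each Blender node consumes at its own pace.
  channel.sendToQueue('render-jobs', Buffer.from(JSON.stringify({ from, to })), {
    persistent: true,
  });
  await channel.close();
  await connection.close();
}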
Implement comprehensive logging to ensure traceability of the entire process. Logs should capture key events, errors, and performance metrics to help monitor and debug the system.
Key Points:
- Structured Logging: Use structured logging to capture detailed information about each event.
- Centralized Logging: Aggregate logs in a centralized system like Elasticsearch, Logstash, and Kibana (ELK stack) or Azure Monitor.
- Tracing: Implement distributed tracing to follow the flow of requests across services and identify performance bottlenecks.
- Documentation: ELK Stack Documentation, Azure Monitor Documentation
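A minimal structured-logging sketch using the pino package (npm install pino); the field names are illustrative:

const pino = require('pino');
const logger = pino();

// Each call emits one JSON line, which is easy to ship to the ELK stack or Azure Monitor.
logger.info({ jobId: 'abcd1234', from: 1, to: 5 }, 'batch assigned');
logger.error({ jobId: 'abcd1234', node: 'blender-server-1' }, 'batch failed');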
- Load Balancing
  - Description: Implement load balancing to distribute incoming requests evenly across multiple Blender server instances.
  - Documentation: Kubernetes Services and Load Balancing
- Fault Tolerance and High Availability
  - Description: Ensure high availability by deploying multiple replicas of the Blender server and orchestrator, and configuring Kubernetes to handle pod failures and restarts.
  - Documentation: Kubernetes High Availability
- Security Enhancements
  - Description: Implement security best practices, such as network policies, role-based access control (RBAC), and secret management to protect sensitive data.
  - Documentation: Kubernetes Security Best Practices
- Performance Optimization
  - Description: Optimize the performance of the Blender server and orchestrator by tuning resource limits and requests, and profiling the application to identify bottlenecks.
  - Documentation: Kubernetes Resource Management
By implementing these enhancements and improvements, you can optimize the Blender server and orchestrator setup for better performance, scalability, reliability, and security in production environments. These suggestions provide a roadmap for future development and deployment strategies.
