A high-performance Python tool for processing large batches of inputs through an Ollama-hosted LLM, with concurrent workers, real-time progress tracking, and performance metrics.
- ✅ Concurrent Processing: Process multiple requests in parallel with configurable worker count
- ✅ Real-time Progress: Live progress bar with ETA, percentage, and average response time
- ✅ Line Preservation: Output file maintains exact line correspondence with input file
- ✅ Error Handling: Automatic JSON parsing error logging to separate error file
- ✅ Performance Metrics: Detailed statistics including throughput, avg response time, and total time
- ✅ Benchmark Mode: Test different worker counts to find optimal performance
- ✅ Robust JSON Extraction: Handles LLM responses with extra text around JSON
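As a rough illustration of the first two features, here is a minimal, hypothetical sketch of how parallel workers can still preserve input order. The `process_line` function and the use of `ThreadPoolExecutor` are assumptions for illustration, not a description of main.py's actual internals:

```python
from concurrent.futures import ThreadPoolExecutor

def process_line(line: str) -> str:
    # Stand-in for a real Ollama request (see the API sketch further below).
    return line.upper()

def run_batch(lines: list[str], workers: int = 5) -> list[str]:
    # executor.map yields results in submission order, so result N always
    # corresponds to input line N even though requests run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(process_line, lines))

if __name__ == "__main__":
    print(run_batch(["alpha", "beta", "gamma"], workers=2))
```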
Install Ollama (if not already installed):
- Visit https://ollama.ai and follow installation instructions
- Start Ollama service
Pull the model (example with gemma3:1b):
ollama pull gemma3:1b
Install Python dependencies:
pip install -r requirements.txt
Edit config.py to customize settings:
# Ollama Configuration
OLLAMA_MODEL = "gemma3:1b" # Change to your preferred model
OLLAMA_BASE_URL = "http://localhost:11434"
OLLAMA_CONTEXT = 4096
OLLAMA_KEEP_ALIVE = 30 # Minutes to keep model in memory
# File Paths
PROMPT_FILE = "prompt.txt"
INPUT_FILE = "input.txt"
OUTPUT_FILE = "output.jsonl"
ERROR_FILE = "errors.log"
# Performance Settings
PARALLEL_WORKERS = 5 # Adjust based on your system
REQUEST_TIMEOUT = 120 # Seconds per request
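Assuming the standard Ollama HTTP API and the `requests` library, these settings could map onto a single generate call roughly as follows. The `query_ollama` helper is illustrative, not the actual code in main.py:

```python
import requests

OLLAMA_MODEL = "gemma3:1b"
OLLAMA_BASE_URL = "http://localhost:11434"
OLLAMA_CONTEXT = 4096
OLLAMA_KEEP_ALIVE = 30   # minutes to keep the model in memory
REQUEST_TIMEOUT = 120    # seconds per request

def query_ollama(prompt: str) -> str:
    """Send one non-streaming /api/generate request and return the raw text."""
    payload = {
        "model": OLLAMA_MODEL,
        "prompt": prompt,
        "stream": False,
        "keep_alive": f"{OLLAMA_KEEP_ALIVE}m",
        "options": {"num_ctx": OLLAMA_CONTEXT},
    }
    resp = requests.post(f"{OLLAMA_BASE_URL}/api/generate",
                         json=payload, timeout=REQUEST_TIMEOUT)
    resp.raise_for_status()
    return resp.json()["response"]
```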
Prepare your input file (input.txt):
- One input item per line
- Each line will be processed separately
Customize your prompt (prompt.txt):
- Use {INPUT} as a placeholder for each line from the input file (a minimal templating sketch follows these steps)
- Example:
You are an expert classifier. Analyze this input: {INPUT} Return valid JSON with your analysis.
Run the processor:
python main.py
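The templating sketch referenced in step 2 above: a minimal, assumed implementation of the {INPUT} substitution (the `build_prompt` name is hypothetical; main.py may do this differently):

```python
def build_prompt(template: str, line: str) -> str:
    # Replace the {INPUT} placeholder with the current input line.
    return template.replace("{INPUT}", line)

with open("prompt.txt", encoding="utf-8") as f:
    template = f.read()

with open("input.txt", encoding="utf-8") as f:
    prompts = [build_prompt(template, line.rstrip("\n")) for line in f]
```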
- output.jsonl: JSONL file with one JSON object per line (matching input line numbers)
- errors.log: JSON log entries for any failed parsing attempts
While running, you'll see:
[12/17] 70.6% | Avg: 2.34s | ETA: 0:00:12 | Elapsed: 0:00:28
- [12/17]: Current item / Total items
- 70.6%: Completion percentage
- Avg: 2.34s: Average response time per item
- ETA: 0:00:12: Estimated time to completion
- Elapsed: 0:00:28: Total time elapsed
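For reference, a minimal sketch of how such a progress line can be computed from counts and elapsed time (illustrative only; the actual formatting in main.py may differ slightly):

```python
import datetime

def progress_line(done: int, total: int, elapsed_s: float) -> str:
    avg = elapsed_s / done if done else 0.0
    eta = datetime.timedelta(seconds=round(avg * (total - done)))
    elapsed = datetime.timedelta(seconds=round(elapsed_s))
    return (f"[{done}/{total}] {done / total:.1%} | "
            f"Avg: {avg:.2f}s | ETA: {eta} | Elapsed: {elapsed}")

print(progress_line(12, 17, 28.0))  # close to the example line above
```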
Use the benchmark script to test different worker counts:
python benchmark.py
This will:
- Test with multiple worker counts (1, 3, 5, 10, 15, 20)
- Measure throughput and response times
- Recommend the optimal worker count for your system
- Save detailed results to benchmark_results.json
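A hypothetical sketch of what such a benchmark loop might look like. The `benchmark` function and its `process_batch` parameter are assumptions; `process_batch` stands for any batch runner, such as the `run_batch` sketch near the top of this README:

```python
import json
import time

def benchmark(process_batch, sample_lines, worker_counts=(1, 3, 5, 10, 15, 20)):
    results = []
    for workers in worker_counts:
        start = time.perf_counter()
        process_batch(sample_lines, workers=workers)
        elapsed = time.perf_counter() - start
        results.append({
            "workers": workers,
            "seconds": round(elapsed, 2),
            "items_per_second": round(len(sample_lines) / elapsed, 2),
        })
    with open("benchmark_results.json", "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)
    # Recommend the configuration with the highest throughput.
    return max(results, key=lambda r: r["items_per_second"])
```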
System Resources:
- CPU cores available
- RAM (models need memory)
- Disk I/O speed
Model Size:
- Smaller models (1b-7b): Can handle more workers
- Larger models (13b+): Need fewer workers due to memory/CPU constraints
Ollama Configuration:
- Ensure Ollama has sufficient resources allocated
- Consider running multiple Ollama instances for extreme parallelism
- Start with 5 workers and adjust based on benchmark results
- Monitor system resources during processing
- Larger models may benefit from fewer workers (3-5)
- Smaller models can handle more workers (10-20+)
- SSD vs HDD: Faster storage helps with model loading
The included example classifies music file paths:
Input (input.txt):
C:\Users\Sam\Music\10cc 20th Anniversary\CD14\12 24 Hours (Edit).opus
Prompt (prompt.txt):
You are an expert music classifier.
Extract metadata from this file path: {INPUT}
Return JSON: {"artist": "", "album": "", "year": "", "track_number": "", "track_name": ""}
Output (output.jsonl):
{
"artist": "10cc",
"album": "20th Anniversary - CD14",
"year": "",
"track_number": "12",
"track_name": "24 Hours (Edit)"
}
Errors are logged to errors.log in JSON format:
{
"timestamp": "2025-11-13 14:23:45",
"line_number": 5,
"input": "problematic input line",
"error": "JSON parse error: Expecting value: line 1 column 1"
}
Failed items leave an empty line in output.jsonl to maintain line correspondence.
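A minimal sketch of that behaviour, assuming a handler roughly like the following (the `handle_result` name and signature are illustrative):

```python
import datetime
import json

def handle_result(raw_response: str, line_number: int, input_line: str,
                  out_file, err_file) -> None:
    try:
        parsed = json.loads(raw_response)
        out_file.write(json.dumps(parsed, ensure_ascii=False) + "\n")
    except json.JSONDecodeError as exc:
        err_file.write(json.dumps({
            "timestamp": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            "line_number": line_number,
            "input": input_line,
            "error": f"JSON parse error: {exc}",
        }) + "\n")
        out_file.write("\n")  # empty line keeps output aligned with input
```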
- Ensure Ollama is running: ollama serve
- Check Ollama URL in config matches your setup
- Verify model is pulled: ollama list
- Run benchmark.py to find optimal worker count
- Reduce PARALLEL_WORKERS if system is overloaded
- Increase REQUEST_TIMEOUT for slow responses
- Check system resources (CPU, RAM, disk)
- Check errors.log for specific failures
- Improve prompt to ensure LLM returns valid JSON
- The system automatically extracts JSON from surrounding text
The system automatically finds JSON in LLM responses:
- Searches for the first { and last }
- Extracts and parses the JSON portion
- Handles extra text before/after JSON
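That strategy is simple enough to show in a few lines; a minimal sketch (the `extract_json` name is illustrative):

```python
import json

def extract_json(text: str) -> dict:
    # Take the substring between the first '{' and the last '}' and parse it.
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError("No JSON object found in response")
    return json.loads(text[start:end + 1])

print(extract_json('Sure! Here is the result: {"artist": "10cc"} Hope that helps.'))
```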
The system guarantees:
- Output line N corresponds to input line N
- Failed items produce empty lines (errors logged separately)
- JSONL format for easy line-by-line processing
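Because of these guarantees, results can be joined back to their inputs with a simple line-by-line zip; an illustrative consumer-side snippet:

```python
import json

with open("input.txt", encoding="utf-8") as inp, \
     open("output.jsonl", encoding="utf-8") as out:
    for n, (src, result) in enumerate(zip(inp, out), start=1):
        result = result.strip()
        if not result:
            print(f"line {n}: failed (see errors.log)")
        else:
            print(f"line {n}: {src.strip()!r} -> {json.loads(result)}")
```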
With the example configuration (gemma3:1b, 5 workers):
- Throughput: 2-5 items/second (depending on system)
- Response time: 0.5-2 seconds per item
- Scaling: Near-linear up to CPU core count
This project is provided as-is for batch processing tasks with Ollama.
Feel free to submit issues or pull requests for improvements!