A vibe-coded web application that automates bulk CSV uploads to Epicollect5 with asynchronous, parallel processing. It uses Playwright to drive a headless Chromium browser, splitting large CSV files into chunks and uploading them in parallel through multiple browser contexts.
- You submit a CSV file through the web interface along with your Epicollect5 project name and email.
- The server splits the CSV into chunks of 150 rows each.
- A pool of up to 4 headless browser contexts uploads chunks in parallel, each automating the Epicollect5 "Upload BETA" workflow.
- The frontend polls for progress every 5 seconds and displays the latest job status.
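
The chunk-and-upload steps above can be sketched roughly as follows. This is an illustration, not the project's actual code: `split_rows`, `upload_all`, and the `upload_chunk` callback are hypothetical names, and the sketch assumes the 150-row limit counts data rows and that each chunk repeats the CSV header.

```python
import asyncio
import csv
import io

CHUNK_SIZE = 150  # rows per chunk, as described above
MAX_WORKERS = 4   # upper bound on parallel browser contexts


def split_rows(csv_text: str, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split CSV text into chunks, repeating the header row in each chunk.

    Assumption: the header is not counted toward the chunk size.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    chunks = []
    for start in range(0, len(data), chunk_size):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(data[start:start + chunk_size])
        chunks.append(buf.getvalue())
    return chunks


async def upload_all(chunks, upload_chunk, max_workers: int = MAX_WORKERS):
    """Run upload_chunk(index, chunk) for every chunk, at most max_workers at once."""
    semaphore = asyncio.Semaphore(max_workers)

    async def worker(index, chunk):
        # The semaphore caps how many uploads are in flight simultaneously.
        async with semaphore:
            return await upload_chunk(index, chunk)

    return await asyncio.gather(*(worker(i, c) for i, c in enumerate(chunks)))
```

In the real application, `upload_chunk` would be the coroutine that drives one browser context through the "Upload BETA" workflow; here it is left as a parameter so the concurrency limit can be seen in isolation.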
Browser session cookies are persisted to browser_data/storage_state.json so that subsequent uploads reuse the existing login without re-authentication.
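
Session reuse of this kind usually relies on Playwright's `storage_state` option for `browser.new_context()`. A minimal sketch, assuming the file path given above; the helper names `context_options` and `open_context` are hypothetical and not taken from the project:

```python
from pathlib import Path

STATE_PATH = Path("browser_data/storage_state.json")


def context_options(state_path: Path = STATE_PATH) -> dict:
    """Build keyword arguments for Playwright's browser.new_context().

    If a saved session file exists, it is passed as storage_state so the
    new context starts with the stored cookies; otherwise the context
    starts without a saved session.
    """
    if state_path.is_file():
        return {"storage_state": str(state_path)}
    return {}


async def open_context(browser, state_path: Path = STATE_PATH):
    """Create a context on an existing Playwright Browser, reusing the session if present."""
    return await browser.new_context(**context_options(state_path))
```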
- Python 3.14+
- A valid Epicollect5 account with access to the target project
- A pre-existing browser session (see Authentication)
# Create and activate a virtual environment
python3.14 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Install Chromium for Playwright
playwright install chromium

For development (code formatting):

pip install -r requirements_local.txt

# Development (auto-reload)
uvicorn src.main:app --reload --host 0.0.0.0 --port 5000
# Production
uvicorn src.main:app --host 0.0.0.0 --port 5000

Then open http://localhost:5000 in your browser.
docker build -t epicollect5-bulk-uploader .
docker run -p 5000:5000 epicollect5-bulk-uploader

The bot runs in headless mode and cannot perform interactive logins. You must provide a valid session before your first upload:
- Run the app in non-headless mode (temporarily change headless=True to headless=False in src/services.py), or manually create the session file.
- Log in to Epicollect5 through the browser window that opens.
- The session is saved to browser_data/storage_state.json and reused for all future uploads.
If the session expires, delete browser_data/ and repeat the login process.
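
As an alternative to editing the headless flag, the session file could be created with a small standalone script. The sketch below is an assumption-laden illustration, not part of the project: `reset_session` and `interactive_login` are hypothetical helpers, and the login URL is assumed to be the public Epicollect5 site.

```python
import shutil
from pathlib import Path

BROWSER_DATA = Path("browser_data")
LOGIN_URL = "https://five.epicollect.net"  # assumed entry point; adjust if needed


def reset_session(data_dir: Path = BROWSER_DATA) -> bool:
    """Delete the saved session directory. Returns True if something was removed."""
    if data_dir.exists():
        shutil.rmtree(data_dir)
        return True
    return False


def interactive_login(data_dir: Path = BROWSER_DATA, url: str = LOGIN_URL) -> Path:
    """Open a visible Chromium window, wait for a manual login, then save the session."""
    # Imported lazily so reset_session() works even without Playwright installed.
    from playwright.sync_api import sync_playwright

    state_path = data_dir / "storage_state.json"
    data_dir.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto(url)
        input("Log in through the browser window, then press Enter here... ")
        # Write cookies and local storage to disk for later headless runs.
        context.storage_state(path=str(state_path))
        browser.close()
    return state_path
```

Calling `reset_session()` followed by `interactive_login()` would mirror the "delete browser_data/ and repeat the login process" step described above.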
- Open the web interface at http://localhost:5000.
- Enter your Project Name (the slug from your Epicollect5 project URL, e.g., my-project).
- Enter your Project Email (the email associated with the Epicollect5 project).
- Select the number of Parallel Workers (1-4).
- Upload a CSV file (drag-and-drop or click to browse).
- Click Upload and Process and monitor progress in real time.
Only one upload job can run at a time. If a job is already in progress, new submissions are rejected until it finishes.
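
One common way to enforce a single-job rule like this is a small guarded flag that the upload endpoint checks before starting work. The sketch below only illustrates that pattern; `JobGuard` is a hypothetical name, and how the project rejects a request (for example, which HTTP status it returns) is not specified here.

```python
import threading


class JobGuard:
    """Allow at most one job to be marked as running at any time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running = False

    def try_start(self) -> bool:
        """Claim the job slot. Returns False if a job is already running."""
        with self._lock:
            if self._running:
                return False
            self._running = True
            return True

    def finish(self) -> None:
        """Release the job slot so a new submission can be accepted."""
        with self._lock:
            self._running = False

    @property
    def running(self) -> bool:
        with self._lock:
            return self._running
```

An endpoint would call `try_start()` on submission, reject the request when it returns `False`, and call `finish()` in a `finally` block once background processing ends.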
| Method | Path | Description |
|---|---|---|
| GET | / | Serves the web interface |
| POST | /upload | Accepts a CSV file and starts background processing |
| GET | /status | Returns the current job status as JSON |
Form fields for POST /upload:
- file: CSV file (required)
- project_name: Epicollect5 project slug (required)
- project_email: Account email (required)
- max_workers: Number of parallel browser contexts, 1-4 (default: 4)
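
The constraints on these fields can be expressed as a small validation step. This is a sketch of what such a check might look like, not the project's actual request validation: `UploadForm` and `validate_form` are hypothetical, and the `.csv` extension check and the basic `@` test for the email are assumptions added for illustration.

```python
from dataclasses import dataclass

MIN_WORKERS, MAX_WORKERS = 1, 4


@dataclass(frozen=True)
class UploadForm:
    filename: str
    project_name: str
    project_email: str
    max_workers: int = MAX_WORKERS


def validate_form(filename, project_name, project_email, max_workers=MAX_WORKERS) -> UploadForm:
    """Check the required fields and the 1-4 worker range; raise ValueError on bad input."""
    if not filename or not filename.lower().endswith(".csv"):
        raise ValueError("file must be a .csv upload")  # assumed extension check
    if not project_name or not project_name.strip():
        raise ValueError("project_name is required")
    if not project_email or "@" not in project_email:
        raise ValueError("project_email must be an email address")
    workers = int(max_workers)  # form values arrive as strings
    if not MIN_WORKERS <= workers <= MAX_WORKERS:
        raise ValueError("max_workers must be between 1 and 4")
    return UploadForm(filename, project_name.strip(), project_email.strip(), workers)
```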
GET /status returns JSON with the following fields: running, progress, percentage, total_records, processed_chunks, total_chunks, error, has_browser_profile, max_workers, start_time, end_time.
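
A script can follow a job the same way the frontend does, by polling the status endpoint. The sketch below uses only the standard library and assumes that `running` is a boolean and that `error` is empty or null when nothing went wrong; `fetch_status` and `wait_for_job` are hypothetical helper names.

```python
import json
import time
import urllib.request


def fetch_status(base_url: str = "http://localhost:5000") -> dict:
    """Fetch and decode the JSON document served at /status."""
    with urllib.request.urlopen(f"{base_url}/status") as resp:
        return json.load(resp)


def wait_for_job(fetch=fetch_status, interval: float = 5.0, sleep=time.sleep) -> dict:
    """Poll until the job stops running or reports an error; return the last status."""
    while True:
        status = fetch()
        print(
            f"{status.get('percentage', 0)}% "
            f"({status.get('processed_chunks')}/{status.get('total_chunks')} chunks)"
        )
        if status.get("error") or not status.get("running"):
            return status
        sleep(interval)
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server.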
src/
main.py # FastAPI app initialization, static file mounting
routes.py # HTTP endpoints and request validation
services.py # Job lifecycle, CSV chunking, progress tracking
epicollect5_bot.py # Playwright browser automation (Epicollect5Bot + ParallelUploader)
logger.py # Logging configuration (uses uvicorn's logger)
templates/
index.html # Single-page web interface (Tailwind CSS)
static/
favicon.ico # Favicon
Runtime directories (created automatically, not checked into git):
- files/: Temporary storage for uploaded CSVs and chunk files (cleaned up after processing)
- browser_data/: Playwright session persistence (storage_state.json)
This project uses Black for code formatting:
# Format all files
black .
# Check formatting without modifying files
black --check .

This project is licensed under the MIT License.