This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is a Python-based fast monitoring agent (swf-fastmon-agent) that is part of the ePIC streaming workflow testbed - a distributed scientific computing system for high-energy physics data processing. This agent is one of several optional agent repositories in the larger ecosystem:
Core Repositories (REQUIRED):
- swf-testbed - Infrastructure, CLI, and orchestration
- swf-monitor - Django web application for monitoring and REST API
- swf-common-lib - Shared utilities and common code
Optional Agent Repositories:
- swf-fastmon-agent - Fast monitoring agent (this repository)
- swf-daqsim-agent - Data acquisition simulation agent
- swf-data-agent - Data management agent
- swf-processing-agent - Processing workflow agent
Critical: The three core repositories must exist as siblings in the same parent directory. This agent repository should also be placed as a sibling for proper integration.
The project is designed to work with PostgreSQL databases and ActiveMQ messaging systems, communicating via loosely coupled message-based architecture.
- Python Version: 3.9+
- IDE: PyCharm or VS Code (with Black formatter configured)
- Code Formatter: Black
- License: Apache 2.0
- Environment Variable: SWF_HOME, automatically set to the parent directory containing all swf-* repos (via swf-testbed CLI)
- Architecture: Extends BaseAgent from swf-common-lib for standardized agent behavior
The agent now inherits from BaseAgent (swf-common-lib) providing:
- Automatic environment setup and .env loading
- REST logging to swf-monitor
- Sequential agent ID generation
- Enhanced heartbeat with workflow metadata
- Automatic subscriber registration
- Connection resilience with auto-reconnection
Integrated with swf-monitor's workflow tracking system:
- Creates workflow stages via /api/workflow-stages/
- Tracks statuses: fastmon_received, fastmon_processing, fastmon_complete
- Records input/output messages and processing times
- Enables end-to-end workflow visibility
Updated to use swf-common-lib's mq_comms module:
- Requires client_id parameter for durable subscriptions
- SSL support with MQ_CAFILE environment variable
- Standardized error handling and reconnection logic
The project has been converted to Django framework with modern packaging:
src/swf_fastmon_agent/ # Agent implementations
├── __init__.py # Package initialization
├── main.py # Main file monitoring agent
├── fastmon_utils.py # Utility functions for the agent
└── database/ # Django database configuration
    └── settings.py # Django settings
src/swf_fastmon_client/ # Lightweight monitoring client
├── __init__.py # Package initialization
├── main.py # Typer CLI client for TF monitoring
└── README.md # Client documentation
Additional project files:
├── manage.py # Django management script
├── requirements.txt # Python dependencies
├── pyproject.toml # Modern Python packaging configuration
├── setup_db.py # Database setup utility
├── test_client.py # Client functionality tests
└── demo_integration.py # Integration demonstration
main.py: Main file monitoring agent (FastMonitorAgent) that:
- Monitors specified directories for newly created STF files
- Applies time-based filtering (files created within X minutes)
- Randomly selects a configurable fraction of discovered files
- Records selected files in the database with metadata
- Broadcasts selected files to ActiveMQ message queues
- Designed for continuous operation under supervisord
- Supports environment variable configuration for deployment flexibility
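The time-based filtering and random selection steps listed above can be sketched as follows. This is a minimal illustration, not the agent's actual code: the function name `select_recent_files`, its parameters, the `*.stf` glob pattern, and the use of modification time as a stand-in for creation time are all assumptions.

```python
import random
import time
from pathlib import Path


def select_recent_files(directory, max_age_minutes, fraction, pattern="*.stf", rng=None):
    """Return a random sample of recently modified files (hypothetical helper)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    if rng is None:
        rng = random.Random()
    # Assumption: modification time approximates creation time.
    cutoff = time.time() - max_age_minutes * 60
    recent = sorted(
        path
        for path in Path(directory).glob(pattern)
        if path.is_file() and path.stat().st_mtime >= cutoff
    )
    # The sample size is rounded down, so very small batches may select nothing.
    sample_size = int(len(recent) * fraction)
    return rng.sample(recent, sample_size)
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible in tests, while the default gives a fresh random sample on each scan.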
fastmon_utils.py: Core utility functions including:
- File discovery and time-based filtering
- Random file selection algorithms
- Database operations for STF file recording via REST API
- Run number extraction from filenames
- Checksum calculation and validation
- ActiveMQ message broadcasting to client queues
- TF (Time Frame) file simulation and sampling from STF files
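Two of the utilities listed above, run number extraction and checksum calculation, might look roughly like this. The function names, the `run<digits>` filename pattern (inferred from example names such as run001_stf_001.stf), and the choice of SHA-256 are assumptions made for illustration; the real fastmon_utils.py may differ.

```python
import hashlib
import re
from typing import Optional

# Assumption: filenames embed the run number as "run" followed by digits.
RUN_NUMBER_PATTERN = re.compile(r"run(\d+)", re.IGNORECASE)


def extract_run_number(filename: str) -> Optional[int]:
    """Return the run number embedded in a filename, or None if absent."""
    match = RUN_NUMBER_PATTERN.search(filename)
    return int(match.group(1)) if match else None


def file_checksum(path, algorithm: str = "sha256", chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large files are never fully loaded."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat regardless of file size, which matters for a long-running agent.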
src/swf_fastmon_client/main.py: Lightweight monitoring client (FastMonitoringClient) that:
- Receives TF metadata from ActiveMQ using STOMP protocol
- Stores metadata in local SQLite database for remote monitoring
- Provides Typer-based CLI with start, status, and init-db commands
- Supports SSL connections and flexible ActiveMQ configuration
- Designed for minimal infrastructure requirements and portability
- Enables remote monitoring of ePIC data acquisition with local data persistence
- Future Development: Will become a standalone application separate from the agent repository
This project integrates with:
- PostgreSQL: Database operations using Django ORM (credentials in .pgpass, logs excluded)
- ActiveMQ: Message queuing system (logs and kahadb excluded)
- Agent framework: Secrets/credentials managed through secrets.yaml, credentials.json, config.ini
Core Dependencies:
- Django: Web framework with ORM for database operations (>=4.2, <5.0)
- psycopg: Modern PostgreSQL adapter for Python (>=3.2.0)
- psycopg2-binary: Legacy PostgreSQL adapter for Python (>=2.9.0)
- typer: Command-line interface framework (>=0.9.0)
- stomp.py: STOMP protocol client for ActiveMQ (>=8.1.0)
Development Dependencies:
- pytest: Testing framework (>=7.0.0)
- pytest-django: Django testing integration (>=4.5.0)
- pytest-cov: Test coverage reporting (>=4.0.0)
- black: Code formatter (>=22.0.0)
- flake8: Code linter (>=4.0.0)
- isort: Import sorting utility (>=5.10.0)
- mypy: Static type checking (>=1.0.0)
- django-stubs: Django type stubs (>=1.13.0)
The agent requires a .env file with the following variables:
Monitor Connection:
- SWF_MONITOR_URL - HTTPS URL for authenticated API calls (required)
- SWF_MONITOR_HTTP_URL - HTTP URL for REST logging (optional)
- SWF_API_TOKEN - Authentication token for swf-monitor API (required)
ActiveMQ Configuration:
- ACTIVEMQ_HOST - ActiveMQ broker host (default: localhost)
- ACTIVEMQ_PORT - STOMP port (default: 61612)
- ACTIVEMQ_USER - ActiveMQ username (required)
- ACTIVEMQ_PASSWORD - ActiveMQ password (required)
- ACTIVEMQ_USE_SSL - Enable SSL connections (true/false)
- ACTIVEMQ_SSL_CA_CERTS - Path to CA certificate file
MQ Communications (swf-common-lib):
- MQ_USER - Message queue username (required)
- MQ_PASSWD - Message queue password (required)
- MQ_HOST - Message queue host (required)
- MQ_PORT - Message queue port (required)
- MQ_CAFILE - SSL CA certificate path (required for SSL)
Logging:
- SWF_LOG_LEVEL - Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- SWF_STOMP_DEBUG - Enable STOMP protocol debugging (true/false)
- SWF_AGENT_QUIET - Minimal output mode (true/false)
Agent Configuration:
- FASTMON_MODE - Operation mode: message (default) or continuous
- FASTMON_SELECTION_FRACTION - STF sampling fraction (0.0-1.0, default: 0.1)
- FASTMON_TF_FILES_PER_STF - TF files per STF (default: 7)
See .env.example for a complete template with all available options.
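As an illustration of how the agent-configuration variables and their documented defaults could be read and validated, here is a minimal sketch. The `FastMonSettings` dataclass and `load_fastmon_settings` function are hypothetical names; the agent itself relies on BaseAgent's .env loading, whose internals are not shown here.

```python
import os
from dataclasses import dataclass

VALID_MODES = ("message", "continuous")


@dataclass(frozen=True)
class FastMonSettings:
    """Hypothetical container for the FASTMON_* variables."""
    mode: str
    selection_fraction: float
    tf_files_per_stf: int


def load_fastmon_settings(environ=None) -> FastMonSettings:
    """Read FASTMON_* variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    mode = env.get("FASTMON_MODE", "message")
    if mode not in VALID_MODES:
        raise ValueError(f"FASTMON_MODE must be one of {VALID_MODES}, got {mode!r}")
    fraction = float(env.get("FASTMON_SELECTION_FRACTION", "0.1"))
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("FASTMON_SELECTION_FRACTION must be within 0.0-1.0")
    tf_files = int(env.get("FASTMON_TF_FILES_PER_STF", "7"))
    if tf_files < 0:
        raise ValueError("FASTMON_TF_FILES_PER_STF must not be negative")
    return FastMonSettings(mode, fraction, tf_files)
```

Accepting an optional mapping instead of always reading `os.environ` keeps the function easy to test without modifying the process environment.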
Legacy Django settings (if needed for local development):
- POSTGRES_HOST (default: localhost)
- POSTGRES_PORT (default: 5432)
- POSTGRES_DB (default: epic_monitoring)
- POSTGRES_USER (default: postgres)
- POSTGRES_PASSWORD (default: empty)
- Configuration files containing secrets are gitignored: secrets.yaml, credentials.json, config.ini, *.session
- Database credentials (.pgpass) are excluded from version control
- Log files are excluded from commits
The agent integrates with swf-monitor REST API endpoints:
- POST /api/runs/ - Create/retrieve run records
- POST /api/stf-files/ - Register STF files (development mode only)
- POST /api/fastmon-files/ - Register TF files (primary endpoint)
- POST /api/workflow-stages/ - Create workflow stage tracking
- PATCH /api/workflow-stages/{id}/ - Update stage status and timestamps
- POST /api/subscribers/ - Auto-register as ActiveMQ subscriber (via BaseAgent)
{
"stf_file": "parent_stf_filename.stf",
"tf_filename": "tf_001.tf",
"file_size_bytes": 1234567,
"status": "registered",
"metadata": {
"simulation": true,
"created_from": "stf_filename.stf",
"agent_name": "swf-fastmon-agent-1"
}
}
The agent creates and updates workflow stages for each STF processed:
# Create stage
stage_data = {
'workflow': workflow_id,
'agent_name': 'swf-fastmon-agent-1',
'agent_type': 'fastmon',
'status': 'fastmon_received',
'input_message': {...}
}
# Update during processing
{'status': 'fastmon_processing', 'started_at': '2025-11-19T10:30:00Z'}
# Mark complete
{'status': 'fastmon_complete', 'completed_at': '2025-11-19T10:30:15Z', 'output_message': {...}}
cd $SWF_PARENT_DIR/swf-testbed
source .venv/bin/activate # or conda activate your_env_name
pip install -e $SWF_PARENT_DIR/swf-common-lib $SWF_PARENT_DIR/swf-monitor $SWF_PARENT_DIR/swf-fastmon-agent .
# CRITICAL: Set up environment configuration
cd $SWF_PARENT_DIR/swf-fastmon-agent
cp .env.example .env
# Edit .env with actual values for SWF_MONITOR_URL, SWF_API_TOKEN, ActiveMQ credentials, etc.
# Set up Django environment (swf-monitor)
cp $SWF_PARENT_DIR/swf-monitor/.env.example $SWF_PARENT_DIR/swf-monitor/.env
# Edit .env to set DB_PASSWORD='your_db_password' and SECRET_KEY
cd $SWF_PARENT_DIR/swf-monitor/src && python manage.py migrate
# Initialize testbed
cd $SWF_PARENT_DIR/swf-testbed && swf-testbed init
With Django framework in place, use these standard commands:
- python manage.py runserver - Start development server
- python manage.py makemigrations - Create database migrations
- python manage.py migrate - Apply database migrations
- python manage.py shell - Django interactive shell
- python manage.py dbshell - Database shell
- python manage.py test - Run Django tests
- python manage.py test swf_fastmon_agent - Run specific app tests
- pytest - Run all tests using pytest-django
- pytest src/swf_fastmon_agent/tests/test_fastmon_utils.py - Run specific test module
- pytest -vs -q src/swf_fastmon_agent/tests/test_fastmon_utils.py - Run with verbose output
- black . - Format code with Black
- flake8 . - Lint code with Flake8
- isort . - Sort imports
- mypy src/ - Static type checking
- python setup_db.py - Custom database setup utility
- python -m swf_fastmon_agent.main - Run file monitoring agent
- Use supervisord for deployment with appropriate configuration
Fast monitoring client commands (from src/swf_fastmon_client/):
- python -m swf_fastmon_client.main start - Start monitoring client with default settings
- python -m swf_fastmon_client.main start --host localhost --port 61612 - Start with custom ActiveMQ settings
- python -m swf_fastmon_client.main start --ssl --ca-certs /path/to/ca.pem - Start with SSL
- python -m swf_fastmon_client.main status - Show client configuration
- python -m swf_fastmon_client.main version - Show version information
- Client dependencies are included in project requirements (typer, stomp.py)
- GitHub Actions: Automated testing workflow configured in .github/workflows/test-fastmon-utils.yml
- Python Version: Tests run on Python 3.11 in CI environment
- Test Execution: pytest -vs -q src/swf_fastmon_agent/tests/test_fastmon_utils.py
- Environment: Uses PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 to avoid external plugin conflicts
The FastMon agent now sends real-time notifications to clients when TF files are registered:
- Agent Processing: When STF files are processed and TF subsamples created
- Database Recording: TF files are recorded in the FastMonFile table via REST API
- Message Broadcasting: Agent sends notifications to the /queue/fastmon_client queue
- Client Display: Client receives and displays TF file information in formatted terminal output
{
"msg_type": "tf_file_registered",
"tf_file_id": "uuid",
"tf_filename": "run001_stf_001_tf_001.tf",
"file_size_bytes": 15728640,
"stf_filename": "run001_stf_001.stf",
"run_number": 1,
"status": "registered",
"timestamp": "2025-08-21T10:30:00Z",
"agent_name": "swf-fastmon-agent"
}
- Real-time monitoring: Live display of TF file registrations
- Formatted output: Color-coded status, human-readable file sizes
- Statistics tracking: Per-run TF counts, total data processed
- Graceful shutdown: Ctrl+C handling with summary display
- Configurable connection: SSL support, custom ActiveMQ settings
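To make the client-side display concrete, the sketch below parses a notification body of the shape shown above and renders one summary line with a human-readable size. The function names `human_size` and `format_notification` and the output layout are assumptions for illustration; the real client's formatting (including color coding) is not reproduced here.

```python
import json
from typing import Optional


def human_size(num_bytes) -> str:
    """Convert a byte count to a short human-readable string (1024-based units)."""
    size = float(num_bytes)
    units = ("B", "KB", "MB", "GB", "TB")
    for unit in units[:-1]:
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} {units[-1]}"


def format_notification(body: str) -> Optional[str]:
    """Render a tf_file_registered message as one line; ignore other types."""
    message = json.loads(body)
    if message.get("msg_type") != "tf_file_registered":
        return None
    return " | ".join([
        f"run {message.get('run_number')}",
        str(message.get("tf_filename")),
        human_size(message.get("file_size_bytes", 0)),
        str(message.get("status")),
    ])
```

Returning None for unrecognized message types lets the caller skip them without raising, which suits a listener that may share a queue with other traffic.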
- python test_client.py - Basic functionality tests
- python demo_integration.py - Integration demonstration
- Both agent and client can run independently for testing
The project includes comprehensive test coverage:
src/swf_fastmon_agent/tests/
├── __init__.py # Test package initialization
├── README.md # Testing documentation
├── test_fastmon_utils.py # Core utility function tests
└── test_api_conversion.py # API conversion and integration tests
- test_fastmon_utils.py: Tests core FastMon utilities including file discovery, filtering, and database operations
- test_api_conversion.py: Tests REST API integration and data conversion between agent and monitor systems
- test_client.py: Tests client functionality and integration with ActiveMQ
- demo_integration.py: Demonstrates end-to-end integration between agent and client
Tests are integrated with both local development and CI/CD:
- Local execution: pytest src/swf_fastmon_agent/tests/
- CI execution: Automated via GitHub Actions on push/PR
- Specific tests: pytest -vs -q src/swf_fastmon_agent/tests/test_fastmon_utils.py
This agent is part of a multi-module scientific workflow system. Dependencies on swf-testbed, swf-monitor, and swf-daqsim-agent suggest coordination with other components in the ecosystem.
- Virtual Environment Persistence: The shell environment, including the activated virtual environment, does not persist between command calls. You MUST chain environment setup and the command that requires it in a single call.
  - Correct: cd $SWF_PARENT_DIR/swf-testbed && source .venv/bin/activate && python manage.py migrate
  - Incorrect: Running source .venv/bin/activate in one call and python manage.py migrate in another.
- Conda Environment Support: Scripts now support both virtual environments and Conda environments. The improved environment detection checks for both sys.prefix != sys.base_prefix (venv) and the CONDA_DEFAULT_ENV environment variable.
- Core repository structure: Ensure swf-testbed, swf-monitor, swf-common-lib, and swf-fastmon-agent are siblings
- Database connections: Verify PostgreSQL is running and accessible
- ActiveMQ connectivity: Check message broker is running on expected ports
The FastMon agent integrates with the swf-monitor Django REST API. Several issues were identified and resolved:
Issue: After Django migration 0016, the file_url field was renamed to stf_filename, but the agent was still using the old parameter.
Symptoms: API timeouts when querying for existing files
Solution: Updated fastmon_utils.py to use stf_filename parameter in API queries
Issue: The StfFileViewSet lacked proper filtering configuration for query parameters
Symptoms: API timeouts when filtering by stf_filename
Solution: Added DjangoFilterBackend and filterset_fields to the ViewSet in swf-monitor
Issue: Django model expects lowercase status values ("registered") but agent was sending uppercase ("REGISTERED")
Symptoms: HTTP 400 errors with "not a valid choice" messages
Solution: Updated FileStatus constants in fastmon_utils.py to match Django model choices:
class FileStatus:
    REGISTERED = 'registered' # was 'REGISTERED'
    PROCESSING = 'processing' # was 'PROCESSING'
    PROCESSED = 'processed' # was 'PROCESSED'
    FAILED = 'failed' # was 'ERROR'
    DONE = 'done' # was 'ARCHIVED'
Issue: Agent code assumed paginated API responses {"results": [...]} but API sometimes returns direct lists
Symptoms: 'list' object has no attribute 'get' errors
Solution: Added robust response handling for both formats in get_or_create_run() and record_file() functions
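The idea behind that fix can be sketched as a small normalization helper that accepts either response shape. The name `extract_results` is hypothetical, and treating a dict without a "results" key as empty is an assumption; the actual handling inside get_or_create_run() and record_file() may differ.

```python
def extract_results(payload):
    """Return the list of records from a paginated dict or a plain list."""
    if isinstance(payload, dict):
        # Paginated form: {"count": ..., "results": [...]}
        # Assumption: a dict without "results" carries no records.
        results = payload.get("results", [])
    elif isinstance(payload, list):
        # Unpaginated form: the records are returned directly.
        results = payload
    else:
        raise TypeError(f"unexpected response type: {type(payload).__name__}")
    return list(results)
```

Calling code can then index or iterate the returned list without caring which form the API produced, which avoids the 'list' object has no attribute 'get' failure described above.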
# Check if in proper environment (works for both venv and conda)
python -c "import sys, os; print('Virtual env:', sys.prefix != sys.base_prefix); print('Conda env:', 'CONDA_DEFAULT_ENV' in os.environ)"
# Verify core repository structure
ls -la $SWF_PARENT_DIR/swf-testbed $SWF_PARENT_DIR/swf-monitor $SWF_PARENT_DIR/swf-common-lib $SWF_PARENT_DIR/swf-fastmon-agent
Note to AI Assistant: The following guidelines ensure consistent, high-quality contributions aligned with the ePIC streaming workflow testbed project standards.
(Taken from the swf-testbed README)
- Do not delete anything added by a human without explicit approval!!
- Adhere to established standards and conventions. When implementing new features, prioritize the use of established standards, conventions, and naming schemes provided by the programming language, frameworks, or widely-used libraries. Avoid introducing custom terminology or patterns when a standard equivalent exists.
- Portability is paramount. All code must work across different platforms (macOS, Linux, Windows), Python installations (system, homebrew, pyenv, etc.), and deployment environments (Docker, local, cloud). Never hardcode absolute paths, assume specific installation directories, or rely on system-specific process names or command locations. Use relative paths, environment variables, and standard tools rather than platform-specific process detection. When in doubt, choose the more portable solution.
- Favor Simplicity and Maintainability. Strive for clean, simple, and maintainable solutions. When faced with multiple implementation options, recommend the one that is easiest to understand, modify, and debug. Avoid overly complex or clever code that might be difficult for others (or your future self) to comprehend. Adhere to the principle of "Keep It Simple, Stupid" (KISS).
- Follow Markdown Linting Rules. Ensure all markdown content adheres to the project's linting rules. This includes, but is not limited to, line length, list formatting, and spacing. Consistent formatting improves readability and maintainability.
- Maintain the prompts. Proactively suggest additions or modifications to these tips as the project evolves and new collaboration patterns emerge.
- Context Refresh. To regain context on the SWF Testbed project, follow these steps:
  - Review the high-level goals and architecture in swf-testbed/README.md and swf-testbed/docs/architecture_and_design_choices.md.
  - Examine the dependencies and structure by checking the pyproject.toml and requirements.txt files in each sub-project (swf-testbed, swf-monitor, swf-common-lib).
  - Use file and code exploration tools to investigate the existing codebase relevant to the current task. For data models, check models.py; for APIs, check urls.py and views.py.
  - Consult the conversation summary to understand recent changes and immediate task objectives.
- Verify and Propose Names. Before implementing new names for variables, functions, classes, context keys, or other identifiers, first check for consistency with existing names across the relevant context. Once verified, propose them for review. This practice ensures clarity and reduces rework.
Ensuring Robust and Future-Proof Tests:
- Write tests that assert on outcomes, structure, and status codes—not on exact output strings or UI text, unless absolutely required for correctness.
- For CLI and UI tests, check for valid output structure (e.g., presence of HTML tags, table rows, or any output) rather than specific phrases or case.
- For API and backend logic, assert on status codes, database state, and required keys/fields, not on full response text.
- This approach ensures your tests are resilient to minor UI or output changes, reducing maintenance and avoiding false failures.
- Always run tests using the provided scripts (./run_tests.sh or ./run_all_tests.sh) to guarantee the correct environment and configuration.
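A structure-oriented assertion in the spirit of these guidelines might look like the sketch below. The helper name `assert_valid_tf_record` and the chosen set of required keys are illustrative assumptions based on the TF payload fields shown earlier, not an existing test utility.

```python
REQUIRED_TF_KEYS = {"tf_filename", "file_size_bytes", "status"}


def assert_valid_tf_record(status_code, record):
    """Check status code and required fields, not exact response text."""
    assert status_code in (200, 201), f"unexpected status code {status_code}"
    missing = REQUIRED_TF_KEYS - set(record)
    assert not missing, f"missing required keys: {sorted(missing)}"
    assert isinstance(record["file_size_bytes"], int)


def test_tf_record_structure():
    # Extra or reworded fields do not break the test; only structure matters.
    record = {
        "tf_filename": "tf_001.tf",
        "file_size_bytes": 1234567,
        "status": "registered",
        "detail": "wording here can change freely",
    }
    assert_valid_tf_record(201, record)
```

Because the check ignores unrelated fields and message wording, it stays green when output text changes but still fails if a required field disappears.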
This agent repository participates in the coordinated multi-repository development workflow:
- Always use infrastructure branches: infra/baseline-v1, infra/baseline-v2, etc.
- Create coordinated branches with the same name across all affected repositories
- Document changes through descriptive commit messages, not branch names
- Never push directly to main - always use branches and pull requests
CURRENT STATUS: Core repositories are on coordinated infra/baseline-v3 branches with:
- Virtual environment documentation updates (CRITICAL warnings added)
- Top-level CLAUDE.md moved to swf-testbed/CLAUDE-toplevel.md with symlink
- Directory verification guidance added
Check for existing infrastructure branches:
# Check all repos for current infrastructure baseline
cd $SWF_PARENT_DIR
for repo in swf-testbed swf-monitor swf-common-lib swf-fastmon-agent; do
    echo "=== $repo ==="
    # git -C avoids cd, so a repo with no infra branch cannot leave the loop in the wrong directory
    git -C "$repo" branch -a | grep infra
done
# Create coordinated infrastructure branch across repos
cd $SWF_PARENT_DIR
for repo in swf-testbed swf-monitor swf-common-lib swf-fastmon-agent; do
    git -C "$repo" checkout -b infra/baseline-vN
done
# Run comprehensive tests across all repositories
cd swf-testbed && ./run_all_tests.sh
- Plan infrastructure phase: Identify all repositories that need changes
- Create coordinated branches: Same infra/baseline-vN across affected repos
- Work systematically: Make changes across repositories as needed
- Test integration: Run ./run_all_tests.sh from swf-testbed before merging
- Coordinate merges: Merge pull requests simultaneously across repositories
- ALWAYS use git push -u origin branch-name on first push - this sets up tracking
- After pushing, verify tracking with git branch -vv - should show [origin/branch-name]
- If tracking is missing, fix with: git branch --set-upstream-to=origin/branch-name branch-name
- VS Code "Publish branch" button indicates missing tracking - resolve immediately
Development Mode (Docker-managed infrastructure):
- Managed via swf-testbed start, stop, status commands
- PostgreSQL and ActiveMQ run in Docker containers
- Best for development and testing
System Mode (System-managed infrastructure):
- Managed via swf-testbed start-local, stop-local, status-local commands
- Uses system-level PostgreSQL and ActiveMQ services (e.g., on production servers)
- Best for production deployment
From swf-testbed repository:
# Docker mode
swf-testbed status
# System mode
swf-testbed status-local
# Comprehensive system readiness check
python report_system_status.py