Skip to content

Modernization Phase 4: Replace shell command substitution in monitoring with native Starlark #43

@avrabe

Description

@avrabe

Problem

The monitoring system in toolchains/monitoring.bzl uses shell command substitution within Starlark strings, creating unnecessary shell dependencies and inconsistent platform detection.

Current Implementation Issues

File: toolchains/monitoring.bzl (Lines 8, 14, 45, 67)

def log_build_metrics(ctx, tool_name, operation, duration_ms, success):
    # Shell command substitution in Starlark strings
    metrics_entry = {
        "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",  # Shell dependency!
        "tool": tool_name,
        "operation": operation,
        "duration_ms": duration_ms,
        "success": success,
        "platform": _detect_platform_simple(ctx),  # This one is already correct
        "bazel_version": "$(bazel version | head -1)",  # Shell dependency!
    }

def create_health_check(ctx, component_name):
    health_check_script = """#!/bin/bash
echo "Platform: $(uname -sm)"  # Shell command instead of repository_ctx
# ...
"""

def add_build_telemetry(ctx, tool_downloads):
    telemetry_script = """#!/bin/bash
echo "PLATFORM: $(uname -sm)"  # Shell command instead of repository_ctx
echo "BAZEL_VERSION: $(bazel version 2>/dev/null | head -1 || echo 'unknown')"
# ...
"""

Issues

  • Shell command substitution: Using $(date ...) and $(bazel version ...) in Starlark strings
  • Inconsistent platform detection: Some functions use repository_ctx, others use uname
  • Runtime shell dependencies: Scripts rely on system utilities instead of build-time data
  • Cross-platform inconsistency: uname behaves differently across platforms

Impact

  • Windows compatibility: uname and date commands may not be available
  • Hermetic violations: Depends on system shell and utilities at runtime
  • Inconsistent metrics: Platform detection varies between different functions
  • Debugging complexity: Shell substitution makes it harder to predict output

Solution

Replace shell command substitution with native Starlark and repository context functions.

Implementation Plan

  1. Replace shell timestamp generation:

    def log_build_metrics(ctx, tool_name, operation, duration_ms, success):
        # Use build-time metadata instead of runtime shell commands
        metrics_entry = {
            "tool": tool_name,
            "operation": operation, 
            "duration_ms": duration_ms,
            "success": success,
            "platform": _detect_platform_simple(ctx),  # Already correct
            "build_time": ctx.build_time if hasattr(ctx, 'build_time') else "unknown",
            # Remove shell substitution - use build metadata instead
        }
  2. Standardize platform detection:

    def create_health_check(ctx, component_name):
        platform = _detect_platform_simple(ctx)  # Use consistent detection
        
        health_check_script = """#!/bin/bash
    echo "🏥 Health Check: {component}"
    echo "Platform: {platform}"  # Use Starlark variable instead of shell command
    # ...
    """.format(component=component_name, platform=platform)
  3. Remove version detection shell commands:

    def add_build_telemetry(ctx, tool_downloads):
        platform = _detect_platform_simple(ctx)
        
        telemetry_script = """#!/bin/bash
    echo "PLATFORM: {platform}"  # Use Starlark variable
    echo "TOOL_DOWNLOADS: {download_count}"
    # Remove bazel version shell command - not needed for telemetry
    """.format(platform=platform, download_count=len(tool_downloads))
  4. Alternative: Use structured data instead of scripts:

    def log_build_metrics(ctx, tool_name, operation, duration_ms, success):
        # Write structured metrics to a file instead of shell script
        metrics_file = ctx.actions.declare_file("build_metrics.json")
        
        metrics_data = {
            "tool": tool_name,
            "operation": operation,
            "duration_ms": duration_ms,
            "success": success,
            "platform": _detect_platform_simple(ctx),
        }
        
        ctx.actions.write(
            output = metrics_file,
            content = json.encode(metrics_data)  # If available
        )

Files to Update

  • toolchains/monitoring.bzl - Replace shell command substitution
  • toolchains/monitoring.bzl - Standardize platform detection usage
  • Any files using monitoring functions - verify compatibility

Testing

  • Test monitoring functions on Windows, macOS, and Linux
  • Verify health check scripts work correctly
  • Check that telemetry collection functions properly
  • Ensure metrics output is consistent across platforms

Acceptance Criteria

  • No shell command substitution ($(command)) in Starlark strings
  • Consistent platform detection across all monitoring functions
  • All monitoring functionality preserved
  • Cross-platform compatibility verified
  • Health check and telemetry scripts work on all platforms

Priority

LOW - Cosmetic improvements with minimal impact, but good for consistency.

Dependencies

Independent issue. Can be implemented anytime.

Notes

This is primarily a code quality improvement. The current shell substitution patterns work but are inconsistent with the rest of the modernized codebase.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions