Cloud Run Observability Sandbox

A small Flask-based demo app to explore Cloud Run observability features (logging, tracing, metrics) and simple Firestore integration.

What this project is

  • A minimal Flask application demonstrating:
    • JSON structured logging (using python-json-logger)
    • Trace linkage via X-Cloud-Trace-Context header
    • Simulated latency and CPU load endpoints for tracing/metrics
    • Basic Firestore usage for sample user writes
  • Intended for running on Google Cloud Run (Dockerfile uses a builder + distroless runtime), but also runnable locally.

Key files

  • main.py — Flask app and all endpoints
  • requirements.txt — Python dependencies
  • Dockerfile — multi-stage build, runtime uses distroless and runs gunicorn as a module
  • .gitignore — ignores common Python artifacts

Environment

  • PROJECT_ID (optional) — Google Cloud project id used to initialize Firestore client
  • DATABASE_ID (optional) — Firestore database id (defaults to (default))

If PROJECT_ID is not provided, Firestore initialization fails and the app continues with db = None (you can still use the non-Firestore endpoints).
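
A minimal sketch of what this environment-driven initialization might look like (the actual code lives in main.py; the error handling here is illustrative, not copied from the repo):

    import os
    from google.cloud import firestore

    PROJECT_ID = os.environ.get("PROJECT_ID")
    DATABASE_ID = os.environ.get("DATABASE_ID", "(default)")

    try:
        # Needs a project id and valid credentials; raises otherwise.
        db = firestore.Client(project=PROJECT_ID, database=DATABASE_ID)
    except Exception:
        # Non-Firestore endpoints keep working with db set to None.
        db = None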

Quick start — local (dev)

  1. Create and activate a virtual environment:
    • python -m venv .venv
    • source .venv/bin/activate (or .venv\Scripts\activate on Windows)
  2. Install dependencies:
    • pip install -r requirements.txt
  3. Run:
    • For a simple dev run: python main.py (starts Flask's built-in server on port 8080)
    • To run with Gunicorn (recommended, matches production):
      • python -m gunicorn --bind :8080 --workers 1 --threads 8 main:app

Quick start — Docker (matches deployment)

  1. Build:
    • docker build -t cloud-run-observability-sandbox .
  2. Run (example):
    • docker run -p 8080:8080 -e PROJECT_ID=your-gcp-project cloud-run-observability-sandbox

The Dockerfile uses a builder stage to install packages into /app/site-packages and a distroless runtime image for a smaller surface area.

Endpoints

  • GET / — Health / welcome message
  • POST /users — Add a user to Firestore
    • Expects JSON with id and name (see the example calls after this list)
  • GET /slow-trace — Simulates a slow dependency (sleeps, touches Firestore) — useful to create longer traces
  • GET /cpu-heavy — Runs an arbitrary CPU-bound computation — useful to observe CPU metrics
  • GET /flaky — Randomly returns 500 to simulate intermittent errors
  • GET /crash — Demonstrates a runtime error and how Error Reporting records and tracks errors
  • GET /cached-config — Demonstrates cold vs warm start behaviour using a global cache
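
The exact response bodies are not documented here, but a hedged example of exercising the endpoints from Python might look like this (the base URL and field values are placeholders):

    import requests

    BASE = "http://localhost:8080"

    # Create a sample user (the endpoint expects JSON with "id" and "name").
    resp = requests.post(f"{BASE}/users", json={"id": "u1", "name": "Ada"})
    print(resp.status_code, resp.text)

    # Trigger a longer trace; the header links the request to a trace id.
    headers = {"X-Cloud-Trace-Context": "0123456789abcdef0123456789abcdef/1;o=1"}
    print(requests.get(f"{BASE}/slow-trace", headers=headers).status_code)

    # Generate CPU load and intermittent errors for metrics/alerting demos.
    requests.get(f"{BASE}/cpu-heavy")
    requests.get(f"{BASE}/flaky")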

Observability notes

  • Logging:
    • Uses python-json-logger to emit JSON logs to stdout.
    • Logs include a trace_id field extracted from the X-Cloud-Trace-Context header so logs can be correlated with Cloud Trace (a minimal setup sketch follows this list).
    • Logs also include a component field to filter by logical area (e.g., trace-demo, cpu-demo, cache, user-module).
  • Traces:
    • The app reads the X-Cloud-Trace-Context header and includes the trace id in log payloads to link logs and traces.
  • Metrics & Alerts:
    • Use cpu-heavy and slow-trace to generate observable signals for metrics dashboards and alerting rules.
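
A minimal sketch of the structured-logging setup described above, assuming python-json-logger and a small helper for pulling the trace id out of the request header (the helper name is illustrative, not the repo's):

    import logging
    from flask import request
    from pythonjsonlogger import jsonlogger

    handler = logging.StreamHandler()  # Cloud Run collects stdout/stderr
    handler.setFormatter(jsonlogger.JsonFormatter("%(asctime)s %(levelname)s %(message)s"))
    logging.basicConfig(level=logging.INFO, handlers=[handler])
    logger = logging.getLogger(__name__)

    def current_trace_id():
        # Header format: TRACE_ID/SPAN_ID;o=OPTIONS
        header = request.headers.get("X-Cloud-Trace-Context", "")
        return header.split("/")[0] if header else None

    # Inside a request handler:
    # logger.info("slow dependency finished",
    #             extra={"component": "trace-demo", "trace_id": current_trace_id()})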

Optional: Service-to-service tracing (service_b integration)

  • Purpose
    • If you want richer, linked traces for requests that span services (service A → service B) and more visibility into individual spans within each service, enable service-to-service trace context propagation and per-service instrumentation.
  • What this repo already provides (service_b example)
    • service_b.py demonstrates per-service instrumentation:
      • FlaskInstrumentor().instrument_app(app) to capture incoming HTTP spans.
      • RequestsInstrumentor().instrument() to automatically inject/extract trace propagation headers on outgoing requests.
    • setup_opentelemetry.py configures OTLP exporters for traces, logs, and metrics so spans and logs are exported to your collector/Cloud Trace.
  • How to enable better cross-service traces
    • Ensure every service:
      • Initializes OpenTelemetry SDK (tracer provider and OTLP exporter) like setup_opentelemetry.py.
      • Instruments the web framework (e.g., Flask) to capture incoming spans.
      • Instruments the HTTP client (requests, httpx, etc.) to propagate context on outgoing calls so the downstream service receives the same trace id.
    • In practice:
      • On the caller side, instrument the outgoing HTTP client (RequestsInstrumentor is used in service_b.py) — this will add the necessary headers for propagation.
      • On the callee side, instrument the web framework (FlaskInstrumentor) so incoming requests create child spans under the propagated trace.
    • Manual propagation (optional):
      • If you need explicit control, read the incoming header (e.g., X-Cloud-Trace-Context) and forward it on outgoing requests:
        • Read: trace_header = request.headers.get("X-Cloud-Trace-Context")
        • Forward: requests.post(url, json=payload, headers={"X-Cloud-Trace-Context": trace_header})
      • Automatic instrumentation (RequestsInstrumentor) usually handles this for you and is recommended.
  • Single-service detailed spans and visibility
    • To get more granular spans inside a service, use an explicit tracer and create spans around important operations:
      • Use the SDK tracer (via trace.get_tracer(__name__)) and wrap database calls, business logic, or outgoing requests in with tracer.start_as_current_span("operation-name"): blocks (see the sketch after this list).
      • This gives you fine-grained spans within a service so Cloud Trace shows not only the request-level span but also sub-operations.
  • When to use this
    • Enable service-to-service propagation when you want end-to-end traces across microservices.
    • Add extra internal spans when you need deeper visibility into bottlenecks or want to surface internal operations in Cloud Trace.
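
A condensed sketch of the pieces above in one place, assuming the OTLP endpoint is configured via the standard OTEL_EXPORTER_OTLP_* environment variables (this is not a copy of setup_opentelemetry.py or service_b.py):

    from flask import Flask
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.instrumentation.flask import FlaskInstrumentor
    from opentelemetry.instrumentation.requests import RequestsInstrumentor

    # Tracer provider + OTLP exporter (roughly what setup_opentelemetry.py configures).
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    trace.set_tracer_provider(provider)

    app = Flask(__name__)
    FlaskInstrumentor().instrument_app(app)   # incoming spans on the callee side
    RequestsInstrumentor().instrument()       # context propagation on outgoing calls

    tracer = trace.get_tracer(__name__)

    @app.route("/slow-trace")
    def slow_trace():
        # Extra internal span so Cloud Trace shows the sub-operation, not just the request.
        with tracer.start_as_current_span("firestore-write"):
            pass  # database call / business logic here
        return "ok"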

Firestore

  • The app initializes a Firestore client at startup using google-cloud-firestore.
  • If initialization fails, db falls back to None and DB-related endpoints will skip database operations (but still run).
  • POST /users writes to the sampleUsers collection (sketched below).
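
A hedged sketch of the write performed by POST /users, assuming db is the client created at startup (using the id as the document id is an assumption, not confirmed by the repo):

    def add_user(db, payload):
        # payload is the parsed JSON body, e.g. {"id": "u1", "name": "Ada"}.
        if db is None:
            return "firestore disabled", 503  # illustrative fallback, not the repo's actual response
        db.collection("sampleUsers").document(payload["id"]).set(payload)
        return "created", 201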

Notes for contributors

  • Keep changes small and focused: this repo is a demo to exercise observability features.
  • When adding endpoints, include structured logging with component and trace_id in extra for consistency.
  • Avoid adding secrets or service account keys to the repo. Use environment variables or Cloud Run/IAM service accounts for production credentials.

Troubleshooting

  • If Firestore calls fail locally, ensure the application has GCP credentials (e.g., gcloud auth application-default login) or unset PROJECT_ID to run without the DB.
  • For local tracing/log correlation, include a header like:
    • X-Cloud-Trace-Context: TRACE_ID/123;o=1 where TRACE_ID is a hex trace id.

License

  • MIT
