Config_Setup_Guide

2raymoori edited this page Dec 16, 2025 · 1 revision

Configuration & Setup Guide

This document explains the configuration system used by the project and how to correctly set it up for local development, experimentation, and production deployment.

The configuration is implemented using Pydantic Settings, which provides type-safe configuration management and seamless integration with environment variables.


1. Overview

The project uses a central Settings class to manage:

  • Application metadata
  • API routing configuration
  • Dataset (Zarr) locations
  • Cloud & local storage paths
  • Server runtime parameters
  • CORS configuration
  • Output paths

All settings:

  • Have safe defaults for development
  • Can be overridden via environment variables
  • Can be configured via a .env file

This design allows the same codebase to be used locally, in containers, and in production.


2. Configuration Mechanism

2.1 Technologies Used

  • pydantic-settings – typed configuration management
  • python-dotenv – loading environment variables from .env

from pydantic_settings import BaseSettings
from dotenv import load_dotenv

load_dotenv()

At runtime:

  1. load_dotenv() reads .env and adds its variables to the process environment; by default, variables that are already set are not overwritten

  2. Pydantic reads values from:

    • Environment variables
    • Defaults defined in the Settings class

Environment variables always take precedence over defaults.


3. Application Metadata

PROJECT_NAME: str = "Weather Visualization API"
VERSION: str = "1.0.0"
API_V1_STR: str = "/api/v1"

Variable      Description
PROJECT_NAME  Human-readable project name
VERSION       API version identifier
API_V1_STR    Base path for versioned API routes

These values are commonly used for:

  • OpenAPI / Swagger documentation
  • Logging
  • Health checks

4. Google Cloud Storage (GCS) Settings

GCS_PROJECT: str = "anon"

Variable     Description
GCS_PROJECT  GCP project ID used when accessing GCS-backed datasets

Notes

  • Required only if using datasets hosted on Google Cloud Storage
  • For local-only usage, this value can remain unchanged

5. Zarr Dataset Configuration

This project relies heavily on Zarr datasets for weather model data. Each model has its own configurable path.

5.1 GraphCast

GRAPHCAST_ZARR_PATH: str = "gs://weatherbench2/datasets/graphcast/...derived.zarr"

  • Points to a public GCS bucket
  • Requires GCS access (anonymous or authenticated)

GRAPHCAST_INTERPOLATED_ZARR_PATH: str = "/PATH/TO/.../graphcast-prediction-2021_wb2_jan_2021.zarr"

  • Local filesystem path
  • Used for preprocessed or interpolated outputs

5.2 CERRA / CERRORA Datasets

CERRORA_EXAMPLE_ZARR_PATH: str = "/PATH/TO/.../cerra_full_derived_jan2021.zarr"
CERRORA_GT_ZARR_PATH: str = "/PATH/TO/.../cerra_full_derived_jan2021.zarr"
CERRORA_ZARR_PATH: str = "/PATH/TO/.../forecast_cerrora_6h_rollout_step15000_10_07_2025_jan_2021.zarr"

Variable                   Purpose
CERRORA_EXAMPLE_ZARR_PATH  Example / sample dataset
CERRORA_GT_ZARR_PATH       Ground truth data
CERRORA_ZARR_PATH          Forecast or rollout output

⚠️ Important: These paths are machine-specific and must be overridden by users on their own systems.
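
A startup check can catch paths that were never overridden. The sketch below is illustrative only: is_remote, find_placeholder_paths, and the example values are hypothetical and not part of the project, and it assumes that un-overridden local paths still begin with the /PATH/TO/ placeholder shown above.

```python
from typing import Dict, List

# Prefix used by the placeholder defaults in this guide.
PLACEHOLDER_PREFIX = "/PATH/TO/"


def is_remote(path: str) -> bool:
    """Return True for GCS URLs such as gs://bucket/dataset.zarr."""
    return path.startswith("gs://")


def find_placeholder_paths(paths: Dict[str, str]) -> List[str]:
    """Return the names of settings that still hold a /PATH/TO/ placeholder."""
    return sorted(
        name
        for name, value in paths.items()
        if not is_remote(value) and value.startswith(PLACEHOLDER_PREFIX)
    )


# Hypothetical values: one remote, one left at its placeholder, one overridden.
example = {
    "GRAPHCAST_ZARR_PATH": "gs://example-bucket/graphcast.zarr",
    "CERRORA_GT_ZARR_PATH": "/PATH/TO/ground_truth.zarr",
    "CERRORA_ZARR_PATH": "/data/cerrora/forecast.zarr",
}

print(find_placeholder_paths(example))
```

Running a check like this before opening any dataset turns a confusing "file not found" deep in the data layer into an explicit configuration error.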


5.3 Experimental Datasets

EXPERIMENTAL_ZARR_PATH: str = "data/experimental/...derived.zarr"

Used for:

  • Testing new datasets
  • Rapid prototyping
  • Research experiments

6. Server Configuration

HOST: str = "0.0.0.0"
PORT: int = 8999
RELOAD: bool = False

Variable  Description
HOST      Interface to bind the server
PORT      Server port
RELOAD    Enables hot-reload (development only)

Recommended Values

Environment  HOST       PORT  RELOAD
Local Dev    127.0.0.1  8999  true
Docker       0.0.0.0    8999  false
Production   0.0.0.0    8999  false
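
The recommended values can be kept in one place as a lookup. This is only a sketch: the environment keys ("local", "docker", "production") and the server_kwargs helper are hypothetical, and how the resulting values are passed to the server launcher depends on the project's entry point, which this guide does not show.

```python
from typing import Any, Dict

# Recommended values from the table above, keyed by a hypothetical environment name.
RECOMMENDED: Dict[str, Dict[str, Any]] = {
    "local": {"host": "127.0.0.1", "port": 8999, "reload": True},
    "docker": {"host": "0.0.0.0", "port": 8999, "reload": False},
    "production": {"host": "0.0.0.0", "port": 8999, "reload": False},
}


def server_kwargs(environment: str) -> Dict[str, Any]:
    """Return a copy of the recommended server parameters for an environment."""
    key = environment.strip().lower()
    if key not in RECOMMENDED:
        raise ValueError(f"Unknown environment: {environment!r}")
    return dict(RECOMMENDED[key])


print(server_kwargs("local"))
print(server_kwargs("Production"))
```

Binding to 127.0.0.1 keeps a development server reachable only from the local machine, while 0.0.0.0 is needed inside a container so that the published port can reach the process.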

7. CORS Configuration

CORS_ORIGINS: List[str] = [
    "http://127.0.0.1",
    "http://localhost",
    "http://127.0.0.1:3000",
    "http://localhost:3000",
    "http://0.0.0.0:3000",
    "http://172.20.8.50:3000"
]

This list defines which frontend origins are allowed to access the API.

Important: These origins are machine-specific (the last entry, for example, is a private network address) and must be overridden by users on their own systems.

Notes

  • Add your frontend URL here
  • For production, restrict this list to known domains only

8. Image & Output Settings

IMAGE_OUTPUT_DIR: str = "streaming"

Variable          Description
IMAGE_OUTPUT_DIR  Directory where generated images/frames are stored

9. Base URL Configuration

BASE_URL: str = os.getenv(
    "API_BASE_URL",
    "http://127.0.0.1:8000/backend-fast-api/streaming",
)

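This value defines the public URL used to serve generated outputs. The default is read from the API_BASE_URL environment variable via os.getenv (which requires import os in the settings module), so set API_BASE_URL to change it.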

Environment Override Example

API_BASE_URL=https://api.my-domain.com/streaming
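
As an illustration of how the base URL might be combined with a generated file, the sketch below maps a file under the output directory to a public URL. It assumes that files in IMAGE_OUTPUT_DIR are served directly under BASE_URL, which this guide does not confirm; output_url and the file name are hypothetical.

```python
from pathlib import PurePosixPath


def output_url(base_url: str, image_output_dir: str, file_path: str) -> str:
    """Map a file under image_output_dir to its URL under base_url."""
    # Strip the output directory prefix to get the path relative to it.
    relative = PurePosixPath(file_path).relative_to(image_output_dir)
    # Avoid a double slash if base_url already ends with one.
    return f"{base_url.rstrip('/')}/{relative.as_posix()}"


url = output_url(
    "https://api.my-domain.com/streaming/",
    "streaming",
    "streaming/run_01/frame_0001.png",  # hypothetical file name
)
print(url)
```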

10. Using a .env File (Recommended)

Create a .env file in the project root:

# Server
HOST=127.0.0.1
PORT=8999
RELOAD=true

# Paths (example)
GRAPHCAST_INTERPOLATED_ZARR_PATH=/data/graphcast/interpolated.zarr
CERRORA_ZARR_PATH=/data/cerrora/forecast.zarr

# Base URL
API_BASE_URL=http://localhost:8999/streaming

Do not commit the .env file to version control; add it to .gitignore.

11. Summary

This configuration system provides:

  • Type safety
  • Clear defaults
  • Environment-based overrides
  • Reproducible deployments

It is designed to support:

  • Local development
  • Research experimentation
  • Cloud & production deployment

For questions or contributions, please refer to the project repository or open an issue.