Usage Guide

Usage instructions for the GEO Benchmark framework.

Workflow

  1. Generate mesh (global coordinate grid)
  2. Run LLM benchmark (query LLMs for climate data)
  3. Analysis pipeline (spatial RMSE, population, bathymetry)
  4. Visualization (temperature maps, clustering, statistical plots)

1. Mesh Generation

Create geographic coordinate grids with land/ocean detection.

Basic Usage

# Generate 10-degree resolution mesh
python geo_mesh_processor.py 10

# Generate 1-degree high-resolution mesh  
python geo_mesh_processor.py 1

# Generate 20-degree coarse mesh
python geo_mesh_processor.py 20

Output

  • meshes/mesh_data_{resolution}deg.json - Mesh data with land points
  • meshes/mesh_data_{resolution}deg.csv - CSV format for analysis
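To get a feel for how mesh size grows with resolution, the following is a minimal sketch of enumerating a regular global grid. It is not the code in geo_mesh_processor.py: the function name `grid_points` and the grid conventions (latitudes from -90 to 90 inclusive, longitudes from -180 up to but excluding 180) are assumptions for illustration, and land/ocean detection is left out.

```python
def grid_points(resolution_deg):
    """Return (lat, lon) pairs on a regular global grid.

    Assumption: latitudes span -90..90 inclusive, longitudes span -180..180
    with the eastern edge excluded so the antimeridian is not duplicated.
    """
    n_lat = int(round(180 / resolution_deg)) + 1
    n_lon = int(round(360 / resolution_deg))
    points = []
    for i in range(n_lat):
        lat = -90 + i * resolution_deg
        for j in range(n_lon):
            lon = -180 + j * resolution_deg
            points.append((lat, lon))
    return points


coarse = grid_points(20)
medium = grid_points(10)
print(len(coarse), len(medium))  # 180 684
```

Under these assumptions a 1-degree grid has 181 × 360 = 65,160 points before land filtering, which is why the coarse meshes are better suited to quick tests.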

Visualization

# Plot mesh with land boundaries
python plot_mesh.py meshes/mesh_data_10.0deg.json

2. LLM Climate Benchmarking

Query LLMs for temperature data using configuration files. Supported providers are OpenAI, Anthropic, Google, and Ollama.

Configuration Setup

Edit config.yaml to set up your benchmark parameters:

# Basic benchmark settings
benchmark:
  mesh_file: "meshes/mesh_data_10.0deg.json"
  num_repeats: 10
  simple_mode: true
  month: "July"
  use_batch: true
  disable_tracing: false
  resume: false

# Model configuration
model:
  provider: "openai"  # openai, anthropic, google, ollama
  name: "gpt-5-nano"
  temperature: 0
  max_tokens: 300
  max_retries: 3
  timeout: 30
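A long run is expensive to restart, so it can be worth sanity-checking the configuration first. The sketch below assumes the YAML file has already been parsed into a plain dictionary with the same `benchmark` and `model` sections shown above. `validate_config` is a hypothetical helper, not part of the framework, and the checks it performs are only examples.

```python
KNOWN_PROVIDERS = {"openai", "anthropic", "google", "ollama"}


def validate_config(cfg):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    benchmark = cfg.get("benchmark", {})
    model = cfg.get("model", {})

    if not benchmark.get("mesh_file"):
        problems.append("benchmark.mesh_file is missing")
    repeats = benchmark.get("num_repeats")
    if not isinstance(repeats, int) or repeats < 1:
        problems.append("benchmark.num_repeats must be a positive integer")
    if model.get("provider") not in KNOWN_PROVIDERS:
        problems.append(
            "model.provider must be one of " + ", ".join(sorted(KNOWN_PROVIDERS))
        )
    if not model.get("name"):
        problems.append("model.name is missing")
    return problems


example = {
    "benchmark": {"mesh_file": "meshes/mesh_data_10.0deg.json", "num_repeats": 10},
    "model": {"provider": "openai", "name": "gpt-5-nano"},
}
print(validate_config(example))  # an empty list means no problems were found
```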

Command Structure

# Use default config.yaml
python climate_llm_benchmark.py

# Use custom config file
python climate_llm_benchmark.py my_config.yaml

Provider Setup

OpenAI (GPT Models)

export OPENAI_API_KEY="your-api-key"

# In config.yaml
model:
  provider: "openai"
  name: "gpt-5-nano"  # or gpt-4o, gpt-4o-mini, gpt-3.5-turbo

Anthropic Claude

pip install langchain-anthropic
export ANTHROPIC_API_KEY="your-api-key"

# In config.yaml
model:
  provider: "anthropic"
  name: "claude-3-5-sonnet-20241022"  # or claude-3-5-haiku-20241022

Google Gemini

pip install langchain-google-genai
export GOOGLE_API_KEY="your-api-key"

# In config.yaml
model:
  provider: "google"
  name: "gemini-1.5-pro"  # or gemini-1.5-flash

Ollama (Local Models)

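pip install langchain-community
# Start the Ollama server (keep it running, e.g. in a separate terminal)
ollama serve
# Download the model before the first run
ollama pull llama3.1:8b

# In config.yaml
model:
  provider: "ollama"
  name: "llama3.1:8b"  # or mistral:7b, qwen2.5:14b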
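Missing API keys tend to surface only after the first request fails. A small pre-flight check along these lines can catch that earlier. The environment variable names come from the provider setup above. `missing_credential` is a hypothetical helper, and treating Ollama as needing no key is an assumption based on it running locally.

```python
import os

# Environment variable each provider expects; None means no key is needed.
REQUIRED_ENV = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "ollama": None,
}


def missing_credential(provider, env=None):
    """Return the name of the unset variable for `provider`, or None if all is well."""
    env = os.environ if env is None else env
    if provider not in REQUIRED_ENV:
        raise ValueError(f"unknown provider: {provider}")
    var = REQUIRED_ENV[provider]
    if var is None or env.get(var):
        return None
    return var


# Passing an explicit mapping makes the check easy to try without touching the shell.
print(missing_credential("openai", env={}))  # OPENAI_API_KEY
print(missing_credential("ollama", env={}))  # None
```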

Examples

Basic Benchmark

# Configure in config.yaml, then run
python climate_llm_benchmark.py

High-Throughput Processing

# In config.yaml
benchmark:
  use_batch: true
  disable_tracing: true
  num_repeats: 20

Resume Interrupted Run

# In config.yaml
benchmark:
  resume: true

Different Models and Months

# For January with Claude
model:
  provider: "anthropic"
  name: "claude-3-5-haiku-20241022"
benchmark:
  month: "January"

# For local Ollama model
model:
  provider: "ollama"
  name: "mistral:7b"
benchmark:
  use_batch: false  # Recommended for local models

Output Files

  • results/climate_results_{resolution}deg_r{repeats}_{model}_simple.json - Final results
  • results/climate_results_intermediate_{n}_{model}_simple.json - Intermediate saves
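Later steps take the final results file as an argument, so it helps to build its path from the same values used in the config. This sketch follows the naming pattern listed above. `final_results_path` is a hypothetical helper, and it assumes the resolution is written in float form (as in `10.0deg`) and that the `_simple` suffix applies to the run.

```python
def final_results_path(resolution_deg, repeats, model):
    """Build the final results path from the documented naming pattern."""
    # float() turns 10 into "10.0", matching names such as mesh_data_10.0deg.json.
    return (
        f"results/climate_results_{float(resolution_deg)}deg"
        f"_r{repeats}_{model}_simple.json"
    )


print(final_results_path(20, 3, "gpt-5-nano"))
# results/climate_results_20.0deg_r3_gpt-5-nano_simple.json
```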

3. ERA5 Climatology Processing

Process ERA5 NetCDF data to create a monthly climatology reference.

Usage

# Process ERA5 data to create climatology
python process_era5_climatology.py data/era5_raw_data.nc

Output

  • data/t2m_climatology_1991-2020.nc - Monthly climatology (1991-2020)

ERA5 Data Requirements

  • NetCDF format with 2m temperature (t2m) variable
  • Time series covering the 1991-2020 period
  • Global coverage with regular grid
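A monthly climatology is the average of each calendar month over the reference years. The sketch below shows only that arithmetic for a single grid point, using plain Python tuples. It is not the logic of process_era5_climatology.py, which works on NetCDF data, and `monthly_climatology` is a hypothetical name.

```python
from collections import defaultdict


def monthly_climatology(samples, first_year=1991, last_year=2020):
    """Average values per calendar month over the reference period.

    samples: iterable of (year, month, value) for one grid point.
    Returns {month: mean value}, using only years in [first_year, last_year].
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, value in samples:
        if first_year <= year <= last_year:
            sums[month] += value
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


samples = [(1990, 7, 30.0), (1991, 7, 20.0), (1992, 7, 22.0), (1991, 1, 2.0)]
print(monthly_climatology(samples))  # {1: 2.0, 7: 21.0}
```

The 1990 sample is ignored because it falls outside the reference period.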

4. Temperature Visualization

Create temperature maps from LLM results.

Usage

# Create temperature maps from LLM results
python plot_temperature_results.py meshes/mesh_data_10.0deg.json results/climate_results_10.0deg_r10_simple.json

Output Maps

  • png/temperature_map_{resolution}deg_mean.png - Mean temperature
  • png/temperature_map_{resolution}deg_series_{n}.png - Individual request series
  • png/temperature_map_{resolution}deg_std.png - Standard deviation

5. LLM vs ERA5 Comparison

Compare LLM predictions against the ERA5 climatology, producing maps, a scatter plot, and summary statistics.

Usage

# Full comparison with maps and statistics
python compare_llm_era5.py meshes/mesh_data_10.0deg.json results/climate_results_10.0deg_r10_simple.json data/t2m_climatology_1991-2020.nc

Output Files

  • results/climate_results_{resolution}deg_r{repeats}_simple_era5.json - Combined data
  • png/llm_era5_comparison_{resolution}deg.png - Scatter plot with error bars
  • png/llm_temperature_map_{resolution}deg.png - LLM temperature map
  • png/era5_temperature_map_{resolution}deg.png - ERA5 temperature map
  • png/temperature_difference_map_{resolution}deg.png - Difference map

Features

  • Error bars: ERA5 uncertainty (horizontal) + LLM variability (vertical)
  • Statistics: RMSE, MAE, bias, correlation, request counts
  • Consistent scaling: Both maps use ERA5 temperature range
  • Difference visualization: Blue=LLM<ERA5, red=LLM>ERA5
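For reference, the summary statistics listed above can be written out as follows. This is a sketch of the standard definitions, not the code in compare_llm_era5.py. The sign convention (difference = LLM minus ERA5) is taken from the difference map, where blue means the LLM value is below ERA5. `comparison_stats` is a hypothetical name.

```python
import math


def comparison_stats(llm, era5):
    """Return RMSE, MAE, bias and Pearson correlation for paired values."""
    if len(llm) != len(era5) or len(llm) < 2:
        raise ValueError("need two equally long series with at least two values")
    n = len(llm)
    diffs = [a - b for a, b in zip(llm, era5)]  # LLM minus ERA5
    bias = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_l = sum(llm) / n
    mean_e = sum(era5) / n
    cov = sum((a - mean_l) * (b - mean_e) for a, b in zip(llm, era5))
    var_l = sum((a - mean_l) ** 2 for a in llm)
    var_e = sum((b - mean_e) ** 2 for b in era5)
    denom = math.sqrt(var_l * var_e)
    # Correlation is undefined when either series is constant.
    corr = cov / denom if denom > 0 else float("nan")
    return {"rmse": rmse, "mae": mae, "bias": bias, "correlation": corr}


# The LLM values are uniformly 2 degrees colder, so bias is negative
# while the correlation stays at 1.
stats = comparison_stats([10.0, 20.0, 30.0], [12.0, 22.0, 32.0])
print(stats)
```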

Common Workflows

Quick Evaluation (Coarse Resolution)

# 1. Generate coarse mesh
python geo_mesh_processor.py 20

# 2. Configure for quick test
# Edit config.yaml:
# mesh_file: "meshes/mesh_data_20.0deg.json"
# num_repeats: 3

# 3. Run benchmark
python climate_llm_benchmark.py

# 4. Compare with ERA5
python compare_llm_era5.py meshes/mesh_data_20.0deg.json results/climate_results_20.0deg_r3_gpt-5-nano_simple.json data/t2m_climatology_1991-2020.nc

Production Run (High Resolution)

# 1. Generate fine mesh
python geo_mesh_processor.py 1

# 2. Configure for production
# Edit config.yaml:
# mesh_file: "meshes/mesh_data_1.0deg.json"
# num_repeats: 10
# disable_tracing: true
# resume: false

# 3. Run comprehensive benchmark
python climate_llm_benchmark.py

# 4. If interrupted, enable resume and re-run
# Edit config.yaml: resume: true
python climate_llm_benchmark.py

# 5. Run complete analysis pipeline
python run_complete_analysis_pipeline.py results/climate_results_1.0deg_r10_gpt-5-nano_simple.json

Seasonal Analysis

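# Run the benchmark once per month, each with its own copy of config.yaml
for month in January April July October; do
  # Write a copy of config.yaml with the month field replaced
  sed "s/^\( *month:\).*/\1 \"$month\"/" config.yaml > "config_$month.yaml"
  python climate_llm_benchmark.py "config_$month.yaml"
  # The documented results filename has no month component, so rename or copy outputs you want to keep
  python compare_llm_era5.py meshes/mesh_data_10.0deg.json results/climate_results_10.0deg_r10_gpt-5-nano_simple.json data/t2m_climatology_1991-2020.nc
done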

Multi-Provider Comparison

# 1. Test OpenAI GPT
# config.yaml: provider: "openai", name: "gpt-4o"
python climate_llm_benchmark.py

# 2. Test Anthropic Claude  
# config.yaml: provider: "anthropic", name: "claude-3-5-sonnet-20241022"
python climate_llm_benchmark.py

# 3. Test Google Gemini
# config.yaml: provider: "google", name: "gemini-1.5-pro"
python climate_llm_benchmark.py

# 4. Test local Ollama model
# config.yaml: provider: "ollama", name: "llama3.1:8b", use_batch: false
python climate_llm_benchmark.py

Performance Tips

Speed Optimization

  • Use batch processing (default)
  • Disable LangSmith tracing (disable_tracing: true)
  • Use coarser resolution for testing (20°)
  • Enable resume for long runs

Quality Improvement

  • Increase repeats per point (10-20)
  • Use higher resolution mesh (1-5°)
  • Validate with multiple months/seasons
  • Compare different LLM models

6. Complete Analysis Pipeline

Automated Pipeline

# Complete analysis from raw results to all plots
python run_complete_analysis_pipeline.py results/climate_results_1.0deg_r10_simple.json

Pipeline steps:

  1. Spatial RMSE calculations
  2. Bathymetry/elevation data integration
  3. Population density data integration
  4. Spatial analysis plots
  5. Temperature comparison plots
  6. Elevation clustering analysis
  7. Population clustering plots
  8. Bathymetry maps and comparisons
  9. Population maps and correlations
  10. Filtered spatial analysis (pop≥5/km², elev≤2000m)
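The thresholds in step 10 can be illustrated with a short filter. The field names `population_density` and `elevation` are assumptions made for this sketch and may not match the keys in the enhanced results files. `filter_points` is a hypothetical helper.

```python
def filter_points(points, min_population=5.0, max_elevation=2000.0):
    """Keep points with population >= 5 per km^2 and elevation <= 2000 m."""
    return [
        p
        for p in points
        if p["population_density"] >= min_population
        and p["elevation"] <= max_elevation
    ]


points = [
    {"name": "lowland city", "population_density": 350.0, "elevation": 40.0},
    {"name": "high plateau", "population_density": 12.0, "elevation": 3600.0},
    {"name": "desert", "population_density": 0.3, "elevation": 400.0},
]
kept = filter_points(points)
print([p["name"] for p in kept])  # ['lowland city']
```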

Individual Analysis Steps

Spatial RMSE Enhancement

# Add neighborhood analysis to existing results
python extend_results_with_spatial_rmse.py results/climate_results_1.0deg_r10_simple.json
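One way to picture a neighborhood RMSE is shown below. This is an illustration under stated assumptions, not the method used by extend_results_with_spatial_rmse.py: the neighborhood is taken to be a point plus whichever of its eight surrounding grid cells have data, the grid uses integer-degree spacing, and `neighborhood_rmse` is a hypothetical name.

```python
import math


def neighborhood_rmse(errors, resolution_deg):
    """Compute an RMSE per point over its 3x3 grid neighborhood.

    errors: {(lat, lon): llm_minus_era5}. Returns {(lat, lon): rmse}.
    Assumption: cells missing from `errors` (e.g. ocean) are simply skipped.
    """
    result = {}
    for lat, lon in errors:
        squared = []
        for dlat in (-resolution_deg, 0, resolution_deg):
            for dlon in (-resolution_deg, 0, resolution_deg):
                nlat = lat + dlat
                # Wrap longitude into [-180, 180) so cells across the
                # antimeridian count as neighbors.
                nlon = (lon + dlon + 180) % 360 - 180
                if (nlat, nlon) in errors:
                    squared.append(errors[(nlat, nlon)] ** 2)
        # The point itself is always present, so `squared` is never empty.
        result[(lat, lon)] = math.sqrt(sum(squared) / len(squared))
    return result


errors = {(0, 0): 3.0, (0, 10): 4.0, (50, 50): 1.0}
rmse = neighborhood_rmse(errors, 10)
print(rmse[(50, 50)])  # 1.0, an isolated point only sees its own error
```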

Population Integration

# Add population density data
python add_population_to_results.py results/climate_results_1.0deg_r10_simple_spatial_rmse.json

Bathymetry Integration

# 1. Aggregate GEBCO data to 1° grid (one-time setup)
python aggregate_bathymetry.py

# 2. Add elevation parameters  
python add_bathymetry_to_results.py results/climate_results_1.0deg_r10_simple_spatial_rmse_population.json

Individual Visualization Scripts

# Enhanced spatial analysis with density plots
python plot_spatial_analysis.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json

# Colored comparison plots (elevation, population, roughness)
python plot_temperature_comparison_colored.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json

# Elevation-based clustering (3×3 grid)
python plot_elevation_clusters.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json

# Population-based clustering (3×3 grid)  
python plot_population_clusters.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json

# Bathymetry maps and correlations
python plot_bathymetry_map.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json

# Population maps and comparisons
python plot_population_map.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json

# Filtered analysis (populated, low elevation areas)
python plot_spatial_analysis_filtered.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json

Multivariate Analysis

# Statistical modeling
python multivariate_rmse_analysis.py results/climate_results_1.0deg_r10_simple_spatial_rmse_bathymetry_population.json
# Outputs: distributions, correlations, GAM/XGBoost, spatial CV

File Structure

geo_benchmark/
├── data/
│   ├── land/                    # Natural Earth shapefiles
│   ├── t2m_climatology_*.nc    # ERA5 climatology
│   └── bathymetry_1deg_aggregated.nc  # GEBCO elevation data
├── meshes/
│   └── mesh_data_*.json        # Generated meshes
├── results/
│   ├── climate_results_*.json                              # Basic LLM results
│   ├── climate_results_*_spatial_rmse.json                # + spatial analysis
│   ├── climate_results_*_spatial_rmse_bathymetry.json     # + elevation data
│   └── climate_results_*_spatial_rmse_bathymetry_population.json  # Complete enhanced data
├── reports/
│   └── multivariate_rmse_report.txt    # Statistical analysis
└── png/
    └── {results_filename}/      # Organized by results file
        ├── spatial_analysis_*.png        # Spatial maps
        ├── llm_era5_comparison_*.png     # Comparison plots
        ├── temperature_comparison_*.png  # Colored scatter plots
        ├── elevation_clusters_*.png      # Elevation clustering
        ├── population_clusters_*.png     # Population clustering
        ├── bathymetry_*.png             # Elevation/roughness maps
        ├── population_*.png             # Population analysis
        ├── filtered_*.png               # Filtered analysis
        └── multivariate_*.png           # Statistical plots