Skip to content

Comments

Update htcondor and some model code#60

Open
transientlunatic wants to merge 3 commits intomasterfrom
update-htcondor
Open

Update htcondor and some model code#60
transientlunatic wants to merge 3 commits intomasterfrom
update-htcondor

Conversation

@transientlunatic
Copy link
Owner

No description provided.

transientlunatic and others added 3 commits February 3, 2026 21:31
Major improvements:
- Replace RBF kernel with Matérn(ν=2.5) kernel in GPyTorch models
- Fixes 20.56% persistent mismatch bug (now 10.16%)
- Fixes 0.08s timing error (now 0.0018s, 98% improvement)
- Add HeronNonSpinningApproximantMatern class

Root cause: RBF kernel (infinitely differentiable) cannot handle
sharp merger peak + exponential ringdown structure in GW waveforms.
Matérn(ν=2.5) allows twice-differentiable functions, more flexible.

Testing infrastructure:
- Add test_matern_single_iteration.py for training iteration tests
- Add compare_kernels.py for kernel performance comparison
- Add HTCondor submit file for cluster-based training
- Add development diary documenting full investigation

Updated .gitignore to exclude temporary files and HTCondor logs.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add example scripts for GPR training and IMRPhenomD mean function
- Add test_mean_functions.py for mean function validation
- Add documentation on deep kernel learning, mean functions, and kernel analysis

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 3, 2026 21:38
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds support for Matérn kernels to the Heron GPR waveform model, includes HTCondor cluster submission scripts, adds comprehensive documentation about mean functions and kernel choices, and provides example configuration files and scripts for training and validation.

Changes:

  • Added ExactGPModelMatern class and HeronNonSpinningApproximantMatern class to support Matérn kernels (ν=1.5, ν=2.5) as alternatives to RBF kernels
  • Added extensive test suite for mean functions (though the implementation is missing)
  • Added HTCondor cluster scripts and Python testing scripts for kernel comparison and training
  • Added comprehensive documentation about mean functions, kernel analysis, deep kernel learning, and development notes
  • Updated example configuration files and added new example scripts
  • Updated .gitignore to exclude training data, logs, and temporary files

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 23 comments.

Show a summary per file
File Description
tests/test_mean_functions.py Test suite for IMRPhenomD mean functions (implementation missing)
scripts/matern_training_cluster.sub HTCondor submit file for distributed Matérn kernel training
scripts/test_matern_single_iteration.py Single training iteration test script
scripts/compare_kernels.py Comparison script for RBF vs Matérn kernels
heron/models/gpytorch.py Added ExactGPModelMatern and HeronNonSpinningApproximantMatern classes
examples/*.yml Configuration files for training and plotting with mean function support
examples/*.py Example script for GPR with IMRPhenomD mean (missing implementation)
docs/*.md Extensive documentation on mean functions, kernels, and development notes
.gitignore Updated to exclude training data, HTCondor logs, and temporary files

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# Warp the time axis
points[points[:, 1] < 0, 1] = points[points[:, 1] < 0, 1] / self.warp_scale

parameters.pop("time")
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameters.pop("time") on line 86 mutates the input dictionary. This can cause unexpected behavior if the caller reuses the dictionary. Consider creating a copy of parameters at the start of the method or document this side effect clearly.

Copilot uses AI. Check for mistakes.

### 1. Mean Function: `IMRPhenomDMeanFunction`

Located in: [`heron/models/mean_functions.py`](../heron/models/mean_functions.py)
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation references a file heron/models/mean_functions.py that doesn't exist in the codebase. The mean function implementation is missing but is extensively documented and referenced throughout the documentation files. Either the implementation needs to be added or the documentation should be removed/updated.

Suggested change
Located in: [`heron/models/mean_functions.py`](../heron/models/mean_functions.py)
Implemented by the `IMRPhenomDMeanFunction` class in the mean-function module of the `heron.models` package.

Copilot uses AI. Check for mistakes.
Comment on lines +356 to +359
).cuda()
test_times = torch.linspace(
time_min, time_max, time_n, dtype=torch.float32
).cuda()
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _make_evaluation_manifold method hardcodes .cuda() calls on lines 356 and 359, which will fail on CPU-only systems. Should use .to(self.device) instead for consistency. The same issue appears in the parent class HeronNonSpinningApproximant (lines 189, 192).

Suggested change
).cuda()
test_times = torch.linspace(
time_min, time_max, time_n, dtype=torch.float32
).cuda()
).to(self.device)
test_times = torch.linspace(
time_min, time_max, time_n, dtype=torch.float32
).to(self.device)

Copilot uses AI. Check for mistakes.
Comment on lines +10 to +17
executable = /data/wiay/conda_envs/heron2026/bin/python
arguments = scripts/test_matern_single_iteration.py $(iterations)

# Use shared filesystem instead of file transfer
should_transfer_files = NO

# Set initial directory (where the job will run and logs will be written)
initialdir = /home/daniel/repositories/ligo/heron
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded absolute paths to specific machines will fail when run on different systems. The paths /data/wiay/conda_envs/heron2026/bin/python and /home/daniel/repositories/ligo/heron are specific to one user's setup and should be parameterized or documented as needing customization before use.

Copilot uses AI. Check for mistakes.
Comment on lines +45 to +58
self.train_y_plus = train_y_plus.cuda() * self.output_scale
self.train_y_cross = train_y_cross.cuda() * self.output_scale

# Use Matérn(ν=1.5) kernels
self.models = {}
self.models["plus"] = ExactGPModelMatern(
self.train_x_plus, self.train_y_plus, nu=1.5
).to(self.device)
self.models["cross"] = ExactGPModelMatern(
self.train_x_cross, self.train_y_cross, nu=1.5
).to(self.device)

for polarisation in ("plus", "cross"):
self.models[polarisation].likelihood.cuda()
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HeronMatern15 class hardcodes .cuda() calls on lines 45-46 and 58 without checking CUDA availability. While line 22 defines a device variable that respects torch.cuda.is_available(), the .cuda() calls bypass this check. Use .to(self.device) instead for consistency and to avoid runtime errors on CPU-only systems.

Copilot uses AI. Check for mistakes.
Comment on lines +409 to +414
times_i = times
# parameters.pop("time")
else:
times_i = (
torch.tensor(times.value, dtype=torch.float32) / mass_factor
) - epoch
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable times_i is not used.

Suggested change
times_i = times
# parameters.pop("time")
else:
times_i = (
torch.tensor(times.value, dtype=torch.float32) / mass_factor
) - epoch
# parameters.pop("time")

Copilot uses AI. Check for mistakes.

# First evaluation
x = torch.tensor([[0.8, -0.02], [0.8, 0.0]])
output1 = mean_fn(x, polarization='plus')
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable output1 is not used.

Suggested change
output1 = mean_fn(x, polarization='plus')
mean_fn(x, polarization='plus')

Copilot uses AI. Check for mistakes.
from heron.models.lalsimulation import IMRPhenomPv2

# Import GP models directly to create custom approximants
from heron.models.gpytorch import ExactGPModelKeOps, ExactGPModelMatern
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'ExactGPModelKeOps' is not used.

Suggested change
from heron.models.gpytorch import ExactGPModelKeOps, ExactGPModelMatern
from heron.models.gpytorch import ExactGPModelMatern

Copilot uses AI. Check for mistakes.

import torch
import gpytorch
import numpy as np
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'np' is not used.

Suggested change
import numpy as np

Copilot uses AI. Check for mistakes.

import pytest
import torch
import numpy as np
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'np' is not used.

Suggested change
import numpy as np

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant