Optimisation for Evaluation Pipeline #344

**Detailed Description**
The current evaluation pipeline relies on `iterrows()` and nested loops to generate forecast horizons and perform timestamp matching. For each test row, the implementation applies a boolean mask over the entire `pv_data` dataframe to find the closest timestamp within a ±5-minute window. Every lookup is therefore a full-dataframe scan driven by Python-level iteration.

**Context**
Evaluation runs may involve large datasets and 100k+ horizon expansions. Because the pipeline performs repeated filtering operations for each horizon and each test entry, runtime scales poorly as dataset size increases. The bottleneck appears to be algorithmic (row-wise iteration and repeated dataframe scans) rather than hardware-related.
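As an illustration of how the horizon expansion could avoid Python-level iteration, here is a minimal sketch that uses a cross join in place of nested loops. The function name `expand_horizons` and the column names `init_time`, `horizon_minutes` and `target_time` are assumptions made for this example, not names taken from the pipeline.

```python
import pandas as pd


def expand_horizons(test_df: pd.DataFrame, horizons_minutes: list[int]) -> pd.DataFrame:
    """Return one row per (test row, horizon) pair without explicit loops.

    Assumes `test_df` has a datetime column named `init_time` (hypothetical name).
    """
    horizons = pd.DataFrame({"horizon_minutes": horizons_minutes})

    # A cross join pairs every test row with every horizon in one vectorised step.
    expanded = test_df.merge(horizons, how="cross")

    # Target timestamp = initialisation time + horizon offset.
    expanded["target_time"] = expanded["init_time"] + pd.to_timedelta(
        expanded["horizon_minutes"], unit="min"
    )
    return expanded
```

The expanded frame has `len(test_df) * len(horizons_minutes)` rows, so memory use grows with the number of horizons; for very large runs the test rows could be processed in chunks.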

**Possible Implementation**
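Focus on algorithmic optimisation: replace the row-wise iteration and the repeated full-dataframe scans with vectorised operations.
Optimising the data-processing logic in this way should significantly improve performance and scalability.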
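One possible direction, sketched here under stated assumptions, is to replace the per-row boolean masking with a single sorted nearest-key join using `pandas.merge_asof`. The function name `match_nearest_pv` and the column names `target_time` and `timestamp` are hypothetical. The actual schema of `pv_data` is not shown in this issue, so the sketch would need adapting.

```python
import pandas as pd


def match_nearest_pv(
    expanded: pd.DataFrame,
    pv_data: pd.DataFrame,
    tolerance: str = "5min",
) -> pd.DataFrame:
    """Attach the nearest `pv_data` row to each target time in one pass.

    Assumes `expanded` has a datetime column `target_time` and `pv_data`
    has a datetime column `timestamp` (both names are hypothetical).
    """
    # merge_asof requires both frames to be sorted on their join keys.
    left = expanded.sort_values("target_time")
    right = pv_data.sort_values("timestamp")

    # direction="nearest" picks the closest timestamp on either side;
    # tolerance limits the match to the given window. Rows with no
    # match inside the window get NaN/NaT in the pv_data columns.
    return pd.merge_asof(
        left,
        right,
        left_on="target_time",
        right_on="timestamp",
        direction="nearest",
        tolerance=pd.Timedelta(tolerance),
    )
```

Because both inputs are sorted once and then joined in a single pass, this avoids scanning `pv_data` for every test row. If `pv_data` holds several sites or systems, the `by` parameter of `merge_asof` can restrict matches to rows sharing the same identifier.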
