Choosing threshold in ExtremeValues #245

@torbenschmith

Description

Choosing threshold in ExtremeValues

In xsdba, the ExtremeValues correction procedure fits GPD distributions, which are then blended with empirical distributions to give mixture distributions. These mixture distributions are then combined to obtain the bias-correction mapping. The procedure is as follows (as I understand it):
· For the ref and hist datasets, define clusters as runs of values exceeding 1 mm (or a user-defined threshold 'cluster_thresh'), surrounded by values below the threshold
· For the ref and hist datasets, take the q_thresh quantile (default = 0.95) of the above, giving one threshold for ref and one for hist
· Define the final threshold as the mean of these two thresholds:
thresh = (np.nanquantile(ref[ref >= cluster_thresh], q_thresh) + np.nanquantile(hist[hist >= cluster_thresh], q_thresh)) / 2

Typically the range, i.e. also the peak values, will differ between the hist and ref data. Defining the threshold for the GPD fits in this way can be inconvenient because it may select different numbers of data points for ref and hist. In the extreme case, no data may be selected for one of the datasets.
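To illustrate the imbalance, here is a minimal sketch on synthetic data. It applies the threshold formula quoted above directly to the raw values and skips the clustering step, so it is not the actual xsdba code path. The gamma-distributed series and the helper name `wet_quantile` are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily precipitation (mm); hist is deliberately biased low vs. ref.
ref = rng.gamma(shape=0.5, scale=8.0, size=5000)
hist = rng.gamma(shape=0.5, scale=4.0, size=5000)

cluster_thresh = 1.0  # mm
q_thresh = 0.95


def wet_quantile(x, cluster_thresh, q):
    """Quantile q of the values at or above cluster_thresh (hypothetical helper)."""
    return np.nanquantile(x[x >= cluster_thresh], q)


thresh_ref = wet_quantile(ref, cluster_thresh, q_thresh)
thresh_hist = wet_quantile(hist, cluster_thresh, q_thresh)

# Common threshold: mean of the two per-dataset thresholds.
thresh = (thresh_ref + thresh_hist) / 2

# Number of values each dataset contributes above the common threshold.
n_ref = int((ref > thresh).sum())
n_hist = int((hist > thresh).sum())

print(f"thresh_ref={thresh_ref:.2f}, thresh_hist={thresh_hist:.2f}, thresh={thresh:.2f}")
print(f"exceedances: ref={n_ref}, hist={n_hist}")
```

Because the common threshold lies between the two per-dataset thresholds, the dataset with the lower range (here hist) contributes far fewer points to its GPD fit than the other one.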

A more flexible setup

A more flexible setup would offer:
· the possibility to specify individual thresholds for ref, hist (and sim)
· the possibility to specify an individual number of peaks for ref, hist (and sim)
Such an implementation would cover most practical applications. I would therefore like to hear the community’s opinion on this.
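As a starting point for discussion, the two options could look roughly like the sketch below. The function `select_exceedances` and its parameters `thresh` and `n_peaks` are hypothetical and not part of the xsdba API. The sketch also ignores the clustering step and works on raw values.

```python
import numpy as np


def select_exceedances(x, thresh=None, n_peaks=None):
    """Select the values used for a GPD fit (hypothetical helper, not xsdba API).

    Exactly one of `thresh` (explicit threshold) or `n_peaks` (number of
    largest values to keep) must be given. Returns (threshold, exceedances).
    """
    if (thresh is None) == (n_peaks is None):
        raise ValueError("Specify exactly one of 'thresh' or 'n_peaks'.")

    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]

    if n_peaks is not None:
        if not 0 < n_peaks < x.size:
            raise ValueError("'n_peaks' must be between 1 and the sample size - 1.")
        # Use the (n_peaks + 1)-th largest value as threshold, so that
        # n_peaks values lie strictly above it (fewer if there are ties).
        thresh = np.sort(x)[-(n_peaks + 1)]

    return thresh, x[x > thresh]


rng = np.random.default_rng(0)
ref = rng.gamma(shape=0.5, scale=8.0, size=5000)
hist = rng.gamma(shape=0.5, scale=4.0, size=5000)

# Same number of peaks for both datasets, hence different thresholds.
t_ref, exc_ref = select_exceedances(ref, n_peaks=100)
t_hist, exc_hist = select_exceedances(hist, n_peaks=100)

# Alternatively, an explicit per-dataset threshold.
t_fixed, exc_fixed = select_exceedances(hist, thresh=10.0)
```

With `n_peaks`, both fits are based on the same sample size regardless of the bias between ref and hist. With `thresh`, the user keeps full control over where each tail starts.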

Additional context

No response

Contribution

  • I would be willing/able to open a Pull Request to contribute this feature.

Metadata

Labels: enhancement (New feature or request)