Implement method to calculate the Kullback-Leibler Divergence between two features #240

@hechth

Description

The current representation of a peak includes the mz, the rt, the sd1 and sd2 of the bi-Gaussian peak shape, and the peak area. Because the peak is modeled as a bi-Gaussian, we can construct a probability distribution from those parameters. The Kullback-Leibler divergence measures how much one probability distribution differs from another, so implementing it for bi-Gaussian peaks would give us a measure of peak (dis)similarity. If we normalize the peaks so that the overall abundance is not part of the comparison, we can compare the shapes of two peaks directly.
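To make the construction concrete, here is a minimal sketch of a bi-Gaussian (split-normal) density built from the peak parameters. It assumes that rt is the mode, sd1 is the standard deviation of the left half and sd2 that of the right half; the function names `bigaussian_pdf` and `integrate` are made up for this illustration and are not part of the existing code base.

```python
import math


def bigaussian_pdf(x, rt, sd1, sd2):
    """Split-normal density with mode rt.

    Assumption: sd1 describes the left half (x < rt), sd2 the right half.
    The common factor sqrt(2/pi) / (sd1 + sd2) makes both halves meet at
    the mode and the total area equal to 1.
    """
    norm = math.sqrt(2.0 / math.pi) / (sd1 + sd2)
    sd = sd1 if x < rt else sd2
    return norm * math.exp(-0.5 * ((x - rt) / sd) ** 2)


def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Sanity checks: unit area, and a fraction sd1 / (sd1 + sd2) of the mass
# lies to the left of the mode.
total_mass = integrate(lambda x: bigaussian_pdf(x, 100.0, 2.0, 3.0), 80.0, 130.0)
left_mass = integrate(lambda x: bigaussian_pdf(x, 100.0, 2.0, 3.0), 80.0, 100.0)
```

Multiplying this density by the peak area gives back an intensity profile, which is one way to bring the abundance into the comparison.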

The task is to implement this calculation in two ways: including the area (so that the peak intensity is part of the comparison) and using a normalized area (ignoring intensity and focusing only on peak shape). Note that ignoring the intensity does not mean rescaling all peaks to unit height, as this would change the actual standard deviation values of the peak itself.
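The two variants could be sketched as follows. The shape-only version integrates p(x) log(p(x)/q(x)) numerically for the unit-area densities. For the version that includes the area, this sketch uses the generalized KL divergence for unnormalized functions, the integral of P log(P/Q) - P + Q with P = area_p * p and Q = area_q * q; that is only one possible reading of "including the area", not something the issue prescribes. `Peak`, `log_pdf`, `kl_shape` and `kl_with_area` are hypothetical names, and sd1/sd2 are assumed to be the left/right standard deviations.

```python
import math
from typing import NamedTuple


class Peak(NamedTuple):
    rt: float
    sd1: float   # assumed: left-side standard deviation
    sd2: float   # assumed: right-side standard deviation
    area: float


def log_pdf(x, p):
    """Log of the unit-area split-normal density of peak p."""
    sd = p.sd1 if x < p.rt else p.sd2
    return (0.5 * math.log(2.0 / math.pi) - math.log(p.sd1 + p.sd2)
            - 0.5 * ((x - p.rt) / sd) ** 2)


def kl_shape(p, q, n=20000, width=12.0):
    """KL(p || q) for the normalized shapes, by the trapezoidal rule.

    The grid covers `width` standard deviations on each side of p's mode;
    outside that range p is negligible, so the integrand is too.
    """
    lo = p.rt - width * p.sd1
    hi = p.rt + width * p.sd2
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        lp = log_pdf(x, p)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(lp) * (lp - log_pdf(x, q))
    return total * h


def kl_with_area(p, q):
    """Generalized KL between the area-scaled profiles.

    Substituting P = area_p * p and Q = area_q * q into the generalized
    KL integral reduces it to the shape term plus area terms.
    """
    return (p.area * (kl_shape(p, q) + math.log(p.area / q.area))
            - p.area + q.area)


a = Peak(rt=100.0, sd1=2.0, sd2=3.0, area=5000.0)
b = Peak(rt=101.0, sd1=2.5, sd2=2.5, area=8000.0)
shape_only = kl_shape(a, b)
with_area = kl_with_area(a, b)
```

Both quantities are zero for identical peaks and are not symmetric in their arguments, so `kl_shape(a, b)` and `kl_shape(b, a)` generally differ.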

There are also some other divergence functions which could be used to measure the dissimilarity (see here for an overall summary):
Kullback-Leibler Divergence
Jensen-Shannon Divergence
Wasserstein Distance
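For comparison with the Kullback-Leibler divergence, the other two measures can also be evaluated numerically. The sketch below computes the Jensen-Shannon divergence from the densities and the 1-Wasserstein distance as the integral of |F_p(x) - F_q(x)|, using the closed-form split-normal CDF. A peak is passed as a hypothetical `(rt, sd1, sd2)` tuple with sd1 assumed to be the left-side value; all function names are illustrative.

```python
import math


def pdf(x, rt, sd1, sd2):
    """Unit-area split-normal density (sd1 assumed left, sd2 right)."""
    sd = sd1 if x < rt else sd2
    return math.sqrt(2.0 / math.pi) / (sd1 + sd2) * math.exp(-0.5 * ((x - rt) / sd) ** 2)


def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cdf(x, rt, sd1, sd2):
    """Split-normal CDF; the left half carries mass sd1 / (sd1 + sd2)."""
    total = sd1 + sd2
    if x < rt:
        return 2.0 * sd1 / total * phi((x - rt) / sd1)
    return sd1 / total + 2.0 * sd2 / total * (phi((x - rt) / sd2) - 0.5)


def _grid(p, q, n, width=12.0):
    """Evenly spaced grid covering both peaks out to `width` sds."""
    lo = min(p[0] - width * p[1], q[0] - width * q[1])
    hi = max(p[0] + width * p[2], q[0] + width * q[2])
    h = (hi - lo) / n
    return [lo + i * h for i in range(n + 1)], h


def _trapz(values, h):
    """Trapezoidal rule for equally spaced samples."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def js_divergence(p, q, n=20000):
    """Jensen-Shannon divergence (natural log), bounded by log(2)."""
    xs, h = _grid(p, q, n)
    vals = []
    for x in xs:
        a, b = pdf(x, *p), pdf(x, *q)
        m = 0.5 * (a + b)
        t = 0.0
        if a > 0.0 and m > 0.0:
            t += 0.5 * a * math.log(a / m)
        if b > 0.0 and m > 0.0:
            t += 0.5 * b * math.log(b / m)
        vals.append(t)
    return _trapz(vals, h)


def wasserstein1(p, q, n=20000):
    """1-Wasserstein distance in 1D: integral of |F_p - F_q|."""
    xs, h = _grid(p, q, n)
    return _trapz([abs(cdf(x, *p) - cdf(x, *q)) for x in xs], h)


p = (100.0, 2.0, 3.0)
q = (101.5, 2.0, 3.0)   # same shape, shifted by 1.5 in rt
js = js_divergence(p, q)
w1 = wasserstein1(p, q)
```

Unlike the Kullback-Leibler divergence, both measures are symmetric. For two peaks with identical shape, the 1-Wasserstein distance reduces to the retention time shift, which makes it easy to interpret in rt units.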

Note that the bi-Gaussian belongs to the family of split-normal distributions, for which these metrics may not be defined. This article gives a good introduction to the topic: https://projecteuclid.org/journalArticle/Download?urlId=10.1214%2F13-STS417

A good starting point would be to implement various probability density functions, including split-normal distributions, to get familiar with how they work, and then start implementing the distance measures. AI chatbots can be very useful for this but should be used with caution, since their claims on this topic are very difficult to verify.
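One way to verify an implementation independently is to test it on a case with a known answer. When sd1 equals sd2 the bi-Gaussian reduces to an ordinary normal distribution, and the Kullback-Leibler divergence between two normals has a well-known closed form. The sketch below compares a numerical integration against that formula; the function names and the `(rt, sd1, sd2)` tuple layout are assumptions made for this example.

```python
import math


def log_pdf(x, rt, sd1, sd2):
    """Log density of a unit-area split normal (sd1 left, sd2 right)."""
    sd = sd1 if x < rt else sd2
    return (0.5 * math.log(2.0 / math.pi) - math.log(sd1 + sd2)
            - 0.5 * ((x - rt) / sd) ** 2)


def kl_numeric(p, q, n=20000, width=12.0):
    """KL(p || q) by the trapezoidal rule over p's effective support."""
    lo = p[0] - width * p[1]
    hi = p[0] + width * p[2]
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        lp = log_pdf(x, *p)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(lp) * (lp - log_pdf(x, *q))
    return total * h


def kl_normal_closed_form(m1, s1, m2, s2):
    """KL(N(m1, s1^2) || N(m2, s2^2)), the textbook closed form."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5


# Symmetric peaks (sd1 == sd2) are plain Gaussians, so both must agree.
numeric = kl_numeric((0.0, 1.0, 1.0), (1.0, 2.0, 2.0))
exact = kl_normal_closed_form(0.0, 1.0, 1.0, 2.0)
```

Agreement in the symmetric case does not prove the asymmetric case is right, but it catches mistakes in the normalization constant and in the integration code.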
