Add thread-safe caching to similarity calculations

**Desired functionality**
`hpo` has a struct that caches the results of term-to-term similarity calculations. This cache should be safe to share across threads so that similarity can be calculated in parallel.
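
As a rough illustration of what "safe across threads" could look like, here is a minimal sketch using only the standard library. `SimilarityCache`, `get_or_insert_with`, `placeholder_score` and `fill_from_threads` are hypothetical names for this example, not the existing `hpo` API, and the sketch assumes that similarity is symmetric so that `(a, b)` and `(b, a)` can share one entry.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical cache type for illustration; not the struct `hpo` ships.
#[derive(Default)]
pub struct SimilarityCache {
    // Keys are raw 32-bit term IDs, ordered so that (a, b) == (b, a).
    scores: RwLock<HashMap<(u32, u32), f32>>,
}

impl SimilarityCache {
    /// Assumption: similarity is symmetric, so the smaller ID goes first.
    fn key(a: u32, b: u32) -> (u32, u32) {
        if a <= b { (a, b) } else { (b, a) }
    }

    /// Returns the cached score, or computes, stores and returns it.
    pub fn get_or_insert_with<F>(&self, a: u32, b: u32, compute: F) -> f32
    where
        F: FnOnce() -> f32,
    {
        let key = Self::key(a, b);
        // Fast path: many threads may hold the read lock at the same time.
        // The read guard is dropped at the end of this statement.
        let cached = self.scores.read().unwrap().get(&key).copied();
        if let Some(score) = cached {
            return score;
        }
        // Slow path: compute without holding any lock, then store.
        // If another thread stored a value in the meantime, that one is kept.
        let score = compute();
        *self.scores.write().unwrap().entry(key).or_insert(score)
    }

    pub fn len(&self) -> usize {
        self.scores.read().unwrap().len()
    }
}

/// Stand-in for a real similarity calculation (illustration only).
fn placeholder_score(a: u32, b: u32) -> f32 {
    if a == b { 1.0 } else { 0.5 }
}

/// Fills the cache for term IDs 1..=10 from several threads at once.
pub fn fill_from_threads(cache: Arc<SimilarityCache>, workers: u32) {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                for a in 1..=10u32 {
                    for b in a..=10u32 {
                        cache.get_or_insert_with(a, b, || placeholder_score(a, b));
                    }
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

A single `RwLock` keeps the sketch short, but every insert blocks all readers. Two threads may also compute the same pair at the same time; the second result is simply discarded. Sharding the map or using a concurrent map would be alternatives if contention turns out to matter.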

**Constraints**
With an ontology of ~13,000 terms, the number of unordered pairs of distinct terms (`n = 13,000`, `k = 2`) is
`n! / (k! * (n - k)!)`
--> `13,000! / (2! * (13,000 - 2)!)`
==> `84,493,500`
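
For `k = 2` the formula reduces to `n * (n - 1) / 2`, which is easy to check in code. The function names below are made up for this example. The second function covers the case where a term compared with itself is cached as well, which the figure above does not include.

```rust
/// Number of unordered pairs of distinct terms:
/// n! / (2! * (n - 2)!) == n * (n - 1) / 2
pub fn pair_count(n: u64) -> u64 {
    // saturating_sub avoids an underflow for n == 0
    n * n.saturating_sub(1) / 2
}

/// Same, but also counting each term paired with itself.
pub fn pair_count_with_self(n: u64) -> u64 {
    pair_count(n) + n
}
```

For example, `pair_count(13_000)` gives `84_493_500`, and `pair_count_with_self(13_000)` gives `84_506_500`.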

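For each pair we must store a 32-bit float similarity score plus a key (or hash) built from the two 32-bit `HpoTermId`s. Assuming an 8-byte key, that is at least 12 bytes per entry, or roughly 1 GB if every pair is cached, before any hash-table overhead. The cache could therefore become very large, and we might have to find a way to limit its overall size. We could, for example, keep one hash set that contains all comparisons that result in 1 and another one for all that result in 0, so that no score has to be stored for those pairs.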
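
The two-hash-set idea could look roughly like the sketch below. `CompactScores` and `pair_key` are hypothetical names, the IDs are treated as plain `u32` values, and the sketch assumes symmetric similarity. Pairs scoring exactly 0 or 1 are stored as a bare 8-byte key; all other pairs go into a regular map.

```rust
use std::collections::{HashMap, HashSet};

/// Packs two 32-bit term IDs into one 64-bit key, smaller ID first,
/// so that (a, b) and (b, a) map to the same key.
pub fn pair_key(a: u32, b: u32) -> u64 {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    (u64::from(lo) << 32) | u64::from(hi)
}

/// Hypothetical storage layout for illustration; not part of `hpo`.
#[derive(Default)]
pub struct CompactScores {
    zeros: HashSet<u64>,        // pairs with a score of exactly 0.0
    ones: HashSet<u64>,         // pairs with a score of exactly 1.0
    others: HashMap<u64, f32>,  // every other score
}

impl CompactScores {
    pub fn insert(&mut self, a: u32, b: u32, score: f32) {
        let key = pair_key(a, b);
        // Drop any earlier entry so a pair lives in exactly one container.
        self.zeros.remove(&key);
        self.ones.remove(&key);
        self.others.remove(&key);
        if score == 0.0 {
            self.zeros.insert(key);
        } else if score == 1.0 {
            self.ones.insert(key);
        } else {
            self.others.insert(key, score);
        }
    }

    pub fn get(&self, a: u32, b: u32) -> Option<f32> {
        let key = pair_key(a, b);
        if self.zeros.contains(&key) {
            Some(0.0)
        } else if self.ones.contains(&key) {
            Some(1.0)
        } else {
            self.others.get(&key).copied()
        }
    }
}
```

How much this saves depends on how many pairs actually score 0 or 1, which would need to be measured. A lookup may also probe up to three containers instead of one. As written the struct takes `&mut self` for inserts, so sharing it across threads would still require a lock such as `std::sync::RwLock` around it.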


Add thread-safe caching to similarity calculations #23
