Tempered posterior experiment (Temperature) with log scores #30
|
This method will be implemented after the first submission as an alternative if it ends up yielding better results. We agreed that @hansenp will implement log scores in BOQA and a prioritiser for his method in Exomiser [boqa branch]. Please leave this branch open as a draft; I will port it where appropriate after Xmas. |
|
After careful consideration, the tempering of the posterior has been placed inside the BOQA repository, namely in Still to do: update the Javadoc, which is probably no longer correct. |
|
The re-scaling of the BOQA scores is motivated by the fact that, in Exomiser, the BOQA scores cannot be combined well with the variant score (and presumably, in the future, the ACMG score), because in most cases the BOQA score for the gene/disease at rank 1 is almost 1 and the scores at all subsequent ranks are close to zero. Therefore, I believe that such re-scaling methods should be implemented in In this PR you created an additional parameter for BOQA ( As an alternative to all the changes to BOQA in this PR, one could implement the following function in Let's discuss this next year. |
|
Hi, I find myself once more somewhat in disagreement with what you say, sorry! In any case, thanks for the input. I think the more scientific questions and the question of "where should the code live" could be handled offline, but since we started, here is my take:
I agree partially. I think that in general, beyond what Exomiser does, there is some interest in having a "decent" distribution of scores. What that means is not easy to define. In this sense, tempering of posteriors is a standard Bayesian approach, though it is still discussed in the literature and I have not yet made up my mind about how mathematically solid it really is. My gut feeling is that it is a good tool to use.
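For readers following along, here is a minimal sketch of what tempering a posterior could look like when starting from log scores. It assumes the tempered distribution is the posterior raised to the power 1/T and renormalised, which is one common convention and not necessarily the exact form used in this PR. The class and method names (`TemperedPosterior`, `temper`) are hypothetical and are not part of the BOQA API.

```java
/** Hypothetical helper for illustration only; not part of the BOQA API. */
final class TemperedPosterior {

    private TemperedPosterior() {}

    /**
     * Returns probabilities proportional to exp(logScore / temperature),
     * i.e. the posterior raised to the power 1/T and renormalised.
     * Assumes at least one log score is finite.
     */
    static double[] temper(double[] logScores, double temperature) {
        if (logScores.length == 0) {
            throw new IllegalArgumentException("at least one log score is required");
        }
        if (!(temperature > 0.0)) {
            throw new IllegalArgumentException("temperature must be positive");
        }
        // Divide every log score by T and remember the largest scaled value.
        double[] scaled = new double[logScores.length];
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < logScores.length; i++) {
            scaled[i] = logScores[i] / temperature;
            max = Math.max(max, scaled[i]);
        }
        // Log-sum-exp: subtracting the maximum avoids underflow for very negative scores.
        double sum = 0.0;
        for (double s : scaled) {
            sum += Math.exp(s - max);
        }
        double logNormaliser = max + Math.log(sum);
        double[] probabilities = new double[scaled.length];
        for (int i = 0; i < scaled.length; i++) {
            probabilities[i] = Math.exp(scaled[i] - logNormaliser);
        }
        return probabilities;
    }
}
```

Under this convention, `temper(logScores, 1.0)` is just the ordinary normalised posterior, while a temperature above 1 flattens the distribution, so that the rank-1 candidate no longer takes almost all of the probability mass.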
Same answer as above, but I would like to stress further that this is not just a pure normalization, but a transformation of the distribution following some principles. It has been picked so that it can easily be merged with Exomiser, but if it turns out to work I would not want it to depend on Exomiser; it should be in BOQA, where it belongs.
I don't see an issue with embedding a parameter deep in the codebase, especially if we are in develop. This is git and we are in a development phase, and if it turns out that we do not want it at all, we can still go back. In this regard, I also believe having an extra parameter in
I strongly disagree and find this somewhat uncalled for. Temperature is an extension: with temperature equal to 1, it behaves exactly like the previous BOQA. Saying we have no evidence that the temperature re-scaling performs better seems a bit unfair. We are in a development phase, and the method that is already there is there simply because you put it there, while I did not have the time to put my proposal in until last week. I could even go as far as saying that, so far, we have no evidence that the re-scaling methods we already have perform better in the boqa/exomiser context than the temperature re-scaling. I have tried on multiple occasions to explain to you where I think there might be issues with the current method, and it seems to me that you did not even try to understand. I need to put the code somewhere in order to test it, and I would prefer to avoid having the two options (what is already there vs. temperature) in two different branches; otherwise, with the current way to run exomiser-boqa (recompilation, moving to lib, etc.), testing will be really impractical.
I pointed this out myself elsewhere. This is the usual conundrum: adding a parameter makes the model richer, but one then has to find a reasonably robust, detail-independent methodology to tune it. The static, frozen re-scaling we already have has unpredictable issues that depend on the set of diseases one uses and on whether some specific patient-disease pair happens to have a very low log-score (simply stated: whether a given patient happens to have a really low score somewhere among all of the existing diseases). I think that what we really need for solid, robust results is a tunable parameter that can be flexibly adapted if needed. If you don't understand this, I have already said I am happy to explain it to you once more, but please refrain from rejecting ideas without having taken the time to try to understand them. |
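To make the concern about a single very low log-score concrete, here is a small sketch. It assumes, purely for illustration, that the static re-scaling is a min-max normalisation of the log scores; the method actually in the codebase is not shown in this thread and may well differ. The temperature variant assumes probabilities proportional to exp(logScore / T). All names here (`RescalingSensitivity`, `minMax`, `temper`) are hypothetical.

```java
/** Hypothetical comparison for illustration only; not code from BOQA or Exomiser. */
final class RescalingSensitivity {

    private RescalingSensitivity() {}

    /** Assumed stand-in for a static re-scaling: maps log scores linearly onto [0, 1]. */
    static double[] minMax(double[] logScores) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        double[] out = new double[logScores.length];
        for (int i = 0; i < logScores.length; i++) {
            // If all scores are equal there is nothing to spread out; return 1 everywhere.
            out[i] = (max == min) ? 1.0 : (logScores[i] - min) / (max - min);
        }
        return out;
    }

    /** Temperature re-scaling: probabilities proportional to exp(logScore / temperature). */
    static double[] temper(double[] logScores, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) {
            max = Math.max(max, s / temperature);
        }
        double sum = 0.0;
        for (double s : logScores) {
            sum += Math.exp(s / temperature - max);
        }
        double[] out = new double[logScores.length];
        for (int i = 0; i < logScores.length; i++) {
            out[i] = Math.exp(logScores[i] / temperature - max) / sum;
        }
        return out;
    }
}
```

With log scores `{-5, -8, -12}`, adding one disease with a log score of `-1000` pushes the min-max value of the second candidate from about 0.57 to above 0.99, whereas the tempered probabilities of the original three candidates are practically unchanged. Whether this matches the behaviour of the existing re-scaling depends on how that re-scaling is actually defined.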
|
The two good points about embedding deep into the codebase, which I will fix next year, are:
I am not even sure why |
Base branch add parameters because I needed that function