Different results when Fisher in EWC is called multiple times? #808
DragonRed18 asked this question in Q&A (unanswered; 1 comment, 2 replies)
While we are looking more into this, can you try by adding |
Good morning.
I have a problem that I am not able to solve.
When using the EWC strategy, the "compute_importances" method is called to obtain the Fisher information matrix (FIM).
The method is equivalent to computing a metric, i.e. it should not leave any trace behind.
However, I found that if I call this method multiple times, the accuracy and FIM of subsequent tasks change compared with when compute_importances is called only once.
I provide below the code used to obtain the results.
I set the seeds for reproducibility:
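A minimal sketch of typical seeding for this kind of experiment, not the exact snippet from the post. The helper name set_seed and the seed value are assumptions; NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed: int) -> None:
    """Seed the global RNGs that commonly affect a PyTorch experiment."""
    random.seed(seed)  # Python's built-in RNG
    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's legacy global RNG
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)  # default CPU and CUDA generators
    except ImportError:
        pass


set_seed(0)  # hypothetical seed value
```

Seeding only fixes the starting point of each generator. Any extra draw from a global generator later in the run still shifts every random number that follows it.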
class MLP_Model(nn.Module):
While the following is the main code:
I observed that evaluating the Fisher multiple times changes the final results.
This can be reproduced by editing the source code:
in ewc.py, in the after_training_exp method, we call the multiple_calls_compute_importances method instead of compute_importances.
This method only calls compute_importances multiple times:
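A minimal sketch of what such a wrapper might look like, not the code from the post. It is written as a free function that takes the callable as an argument; that signature is an assumption, since the post describes a method added to ewc.py.

```python
from typing import Any, Callable


def multiple_calls_compute_importances(
    compute_importances: Callable[..., Any], n: int, *args: Any, **kwargs: Any
) -> Any:
    """Call `compute_importances` n times with the same arguments.

    Only the result of the last call is returned; earlier results are discarded.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    importances = None
    for _ in range(n):
        importances = compute_importances(*args, **kwargs)
    return importances
```

If compute_importances were a pure function of its arguments, the returned value would be the same for every n.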
Surprisingly, changing n=1 (the Fisher is computed only once, as in the normal EWC) to n=6 (the Fisher is computed 6 times) gives different final results.
This can be observed not only in the final accuracy but also in the trace of the FIM.
For this, a small edit to the compute_importances method is sufficient:
It is identical to the original one, except that the trace is now also computed in the sum_imp parameter.
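For a diagonal FIM, the trace is the sum of all importance entries over all parameters. A minimal sketch of that computation, assuming the importances are available as a mapping from parameter name to an array-like object with a .sum() method (for example a torch.Tensor); the actual data structure in ewc.py may differ, and the function name fisher_trace is hypothetical.

```python
from typing import Any, Mapping


def fisher_trace(importances: Mapping[str, Any]) -> float:
    """Sum every diagonal Fisher entry across all parameters."""
    sum_imp = 0.0
    for imp in importances.values():
        # imp.sum() adds up the entries belonging to one parameter tensor.
        sum_imp += float(imp.sum())
    return sum_imp
```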
Let's call:
From the trace of the Fisher for the second task, we can observe different results:
Moreover, the final accuracy in 1) is 0.3209 for Task 0 and 0.9750 for Task 1.
In contrast, the final accuracy in 2) is 0.3055 for Task 0 and 0.9671 for Task 1.
It would be very interesting and useful to understand why a call to a method that should not leave any trace behind makes such a difference.
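One way to test the "no trace" assumption is to snapshot the global random number generators before and after the call and see whether any of them moved. This is only a diagnostic sketch: nothing in this thread establishes that RNG state is the cause, the helper names are hypothetical, and CUDA and NumPy generators are not covered.

```python
import random
from typing import Callable, Dict, List


def snapshot_rng() -> Dict[str, object]:
    """Capture the state of the global RNGs that are available."""
    state: Dict[str, object] = {"python": random.getstate()}
    try:
        import torch
        # The CPU generator state is a uint8 tensor; store it as bytes
        # so that two snapshots can be compared with ==.
        state["torch_cpu"] = bytes(torch.get_rng_state().tolist())
    except ImportError:
        pass
    return state


def changed_rngs(fn: Callable[[], object]) -> List[str]:
    """Run fn() and return the names of the RNGs whose state differs afterwards."""
    before = snapshot_rng()
    fn()
    after = snapshot_rng()
    return [name for name in before if before[name] != after[name]]
```

Wrapping the real compute_importances call in a lambda and passing it to changed_rngs returns an empty list if none of the tracked generators were advanced. A non-empty list would mean the call consumes random numbers, which could change later shuffling or initialization.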