NeurIPS Research Analysis: Remember What You Want to Forget: Algorithms for Machine Unlearning #7
Replies: 5 comments
-
To put it in simple words, this algorithm makes unlearning more efficient and fast by adding noise rather than deleting the sensitive data and retraining from scratch, which has an effect similar to forgetting it. It relies on a few mathematical tools to make that happen.
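For intuition, here is a minimal Python sketch of that noise-based idea. The `correction` term, the noise scale `sigma`, and the function name are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import numpy as np

def gaussian_unlearn(theta, correction, sigma, rng=None):
    """Sketch of noise-based unlearning: nudge the trained parameters to
    approximately cancel the deleted samples' influence, then add Gaussian
    noise so the result is statistically close to a model trained without
    them. `correction` and `sigma` are illustrative placeholders; the paper
    derives the exact update and noise scale from its analysis."""
    rng = np.random.default_rng() if rng is None else rng
    theta_hat = theta + correction                       # hedged influence-removal step
    noise = rng.normal(0.0, sigma, size=theta.shape)     # Gaussian masking noise
    return theta_hat + noise

# Toy usage: perturb a trained weight vector instead of retraining from scratch.
theta = np.array([0.8, -1.2, 0.3])
forgotten = gaussian_unlearn(theta, correction=np.zeros_like(theta), sigma=0.05)
```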
-
Yes, adding noise is one of the common techniques used to mask the data present in a model. But nowadays, since privacy laws are getting stricter, the whole of a consumer's data has to be deleted on demand from the model and also from the company's database, so in that case I guess adding noise is not a suitable course of action.
-
Yes, to overcome that, check out the updated discussion and paper 3 in the research.md file. There are always exceptions, and the suitable course of action depends on the particular use case. For the use case mentioned in the comment, there is another model summarized in the discussion, and comprehensive details are available in the paper 3 summary in the research.md file.
-
Comparison of models
The overall difference between the two papers on the page is that the first paper (coded MUL) focuses on the problem of coded machine unlearning, a new framework for removing data from trained ML models using data encoding and ensemble learning, while the second paper (the NeurIPS paper) focuses on the problem of probabilistic machine unlearning, a relaxed definition of data deletion that requires the output model to be similar to one trained without the deleted data. The first paper proposes a coded learning and unlearning protocol that uses random linear coding to combine the training samples into smaller coded shards and updates the model accordingly. The second paper proposes a Gaussian mechanism for unlearning that adds Gaussian noise to the output of the learning algorithm and proves its differential-privacy and excess-risk guarantees. The first paper also presents synthetic-data experiments to demonstrate the performance-versus-unlearning-cost trade-off of the coded protocol. The second paper also derives a lower bound on the sample size required for unlearning and extends the results to the case of convex but not strongly convex loss functions.
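For a concrete picture of the coded approach described above, here is a minimal Python sketch: a least-squares learner, a random coding matrix per shard, and all function and parameter names are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def train_coded_ensemble(X, y, num_shards=4, code_rate=0.5, seed=0):
    """Illustrative sketch of coded learning: partition the data into shards,
    compress each shard into fewer coded samples via random linear coding,
    and fit one least-squares model per coded shard. The ensemble prediction
    would average the per-shard models. The learner, code rate, and names are
    assumptions for illustration, not the paper's protocol."""
    rng = np.random.default_rng(seed)
    assignment = rng.integers(0, num_shards, size=X.shape[0])  # shard owning each sample
    shards, models = [], []
    for s in range(num_shards):
        idx = np.where(assignment == s)[0]
        G = rng.standard_normal((max(1, int(code_rate * len(idx))), len(idx)))
        Xc, yc = G @ X[idx], G @ y[idx]                  # smaller coded shard
        w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)      # per-shard model
        shards.append((idx, G))
        models.append(w)
    return assignment, shards, models

def unlearn_sample(X, y, assignment, shards, models, delete_idx, seed=1):
    """Unlearning touches only the shard containing the deleted sample:
    drop the sample, re-encode the remaining samples, and refit that model."""
    rng = np.random.default_rng(seed)
    s = assignment[delete_idx]
    idx = shards[s][0]
    idx = idx[idx != delete_idx]
    G = rng.standard_normal((max(1, len(idx) // 2), len(idx)))
    Xc, yc = G @ X[idx], G @ y[idx]
    w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    shards[s], models[s] = (idx, G), w
    return shards, models

# Toy usage on random data: train, then "forget" sample 10 by updating one shard only.
X = np.random.default_rng(2).standard_normal((100, 5))
y = np.random.default_rng(3).standard_normal(100)
assignment, shards, models = train_coded_ensemble(X, y)
shards, models = unlearn_sample(X, y, assignment, shards, models, delete_idx=10)
```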
-
@veerasagar this is the argument the OP is making.
-
@ShiroganeMiyuki-0
This is a research paper about algorithms for machine unlearning, which is the problem of deleting data points from a trained machine learning model without retraining from scratch. The paper has the following main contributions:
The paper is related to our project, as we are also interested in developing efficient and accurate unlearning algorithms for machine learning models. The paper provides some useful insights and techniques that we can use or adapt for our own problem setting. However, the paper also has some limitations, such as:
Comparison of models
The overall difference between the two papers on the page is that the first paper (coded MUL) focuses on the problem of coded machine unlearning, a new framework for removing data from trained ML models using data encoding and ensemble learning, while the second paper (the NeurIPS paper) focuses on the problem of probabilistic machine unlearning, a relaxed definition of data deletion that requires the output model to be similar to one trained without the deleted data. The first paper proposes a coded learning and unlearning protocol that uses random linear coding to combine the training samples into smaller coded shards and updates the model accordingly. The second paper proposes a Gaussian mechanism for unlearning that adds Gaussian noise to the output of the learning algorithm and proves its differential-privacy and excess-risk guarantees. The first paper also presents synthetic-data experiments to demonstrate the performance-versus-unlearning-cost trade-off of the coded protocol. The second paper also derives a lower bound on the sample size required for unlearning and extends the results to the case of convex but not strongly convex loss functions.
Check out paper 3 in the research.md file.