NeurIPS Research Analysis: Remember What You Want to Forget: Algorithms for Machine Unlearning #7
Replies: 5 comments
-
To put it in simple words, this algorithm makes unlearning more efficient and fast by adding noise rather than deleting the sensitive data and retraining from scratch, which has an effect similar to forgetting it. It relies on a few mathematical tools to make that happen.
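For intuition, here is a minimal Python sketch of that noise-based idea. The `correction` term, the noise scale `sigma`, and the function name are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import numpy as np

def gaussian_unlearn(theta, correction, sigma, rng=None):
    """Sketch of noise-based unlearning: nudge the trained parameters to
    approximately cancel the deleted samples' influence, then add Gaussian
    noise so the result is statistically close to a model trained without
    them. `correction` and `sigma` are illustrative placeholders; the paper
    derives the exact update and noise scale from its analysis."""
    rng = np.random.default_rng() if rng is None else rng
    theta_hat = theta + correction                       # hedged influence-removal step
    noise = rng.normal(0.0, sigma, size=theta.shape)     # Gaussian masking noise
    return theta_hat + noise

# Toy usage: perturb a trained weight vector instead of retraining from scratch.
theta = np.array([0.8, -1.2, 0.3])
forgotten = gaussian_unlearn(theta, correction=np.zeros_like(theta), sigma=0.05)
```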
-
Yes, adding noise is one of the common techniques used to mask the data present in a model. But nowadays, since privacy laws are getting stricter, the whole of a consumer's data has to be deleted on demand from the model and also from the company's database, so in that case I guess adding noise is not a suitable course of action.
-
Yes, to overcome that, check out the updated discussion and paper 3 in the research.md file. There are always exceptions, and the suitable course of action depends on the particular use case. For the use case mentioned in the comment, there is another model summarized in the discussion, and comprehensive details are available in the paper 3 summary in the research.md file.
-
Comparison of models
The overall difference between the two papers on the page is that the first paper (coded MUL) focuses on the problem of coded machine unlearning, a new framework for removing data from trained ML models using data encoding and ensemble learning, while the second paper (the NeurIPS paper) focuses on the problem of probabilistic machine unlearning, a relaxed definition of data deletion that requires the output model to be similar to one trained without the deleted data. The first paper proposes a coded learning and unlearning protocol that uses random linear coding to combine the training samples into smaller coded shards and updates the model accordingly. The second paper proposes a Gaussian mechanism for unlearning that adds Gaussian noise to the output of the learning algorithm and proves its differential-privacy and excess-risk guarantees. The first paper also presents synthetic-data experiments to demonstrate the performance-versus-unlearning-cost trade-off of the coded protocol. The second paper also derives a lower bound on the sample size required for unlearning and extends the results to the case of convex but not strongly convex loss functions.
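For a concrete picture of the coded approach described above, here is a minimal Python sketch: a least-squares learner, a random coding matrix per shard, and all function and parameter names are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def train_coded_ensemble(X, y, num_shards=4, code_rate=0.5, seed=0):
    """Illustrative sketch of coded learning: partition the data into shards,
    compress each shard into fewer coded samples via random linear coding,
    and fit one least-squares model per coded shard. The ensemble prediction
    would average the per-shard models. The learner, code rate, and names are
    assumptions for illustration, not the paper's protocol."""
    rng = np.random.default_rng(seed)
    assignment = rng.integers(0, num_shards, size=X.shape[0])  # shard owning each sample
    shards, models = [], []
    for s in range(num_shards):
        idx = np.where(assignment == s)[0]
        G = rng.standard_normal((max(1, int(code_rate * len(idx))), len(idx)))
        Xc, yc = G @ X[idx], G @ y[idx]                  # smaller coded shard
        w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)      # per-shard model
        shards.append((idx, G))
        models.append(w)
    return assignment, shards, models

def unlearn_sample(X, y, assignment, shards, models, delete_idx, seed=1):
    """Unlearning touches only the shard containing the deleted sample:
    drop the sample, re-encode the remaining samples, and refit that model."""
    rng = np.random.default_rng(seed)
    s = assignment[delete_idx]
    idx = shards[s][0]
    idx = idx[idx != delete_idx]
    G = rng.standard_normal((max(1, len(idx) // 2), len(idx)))
    Xc, yc = G @ X[idx], G @ y[idx]
    w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    shards[s], models[s] = (idx, G), w
    return shards, models

# Toy usage on random data: train, then "forget" sample 10 by updating one shard only.
X = np.random.default_rng(2).standard_normal((100, 5))
y = np.random.default_rng(3).standard_normal(100)
assignment, shards, models = train_coded_ensemble(X, y)
shards, models = unlearn_sample(X, y, assignment, shards, models, delete_idx=10)
```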
-
@veerasagar this is the argument the OP is making.
-
@ShiroganeMiyuki-0
This is a research paper about algorithms for machine unlearning, which is the problem of deleting data points from a trained machine learning model without retraining from scratch. The paper has the following main contributions:
The paper is related to our project, as we are also interested in developing efficient and accurate unlearning algorithms for machine learning models. The paper provides some useful insights and techniques that we can use or adapt for our own problem setting. However, the paper also has some limitations, such as:
Comparison of models
The overall difference between the two papers on the page is that the first paper (coded MUL) focuses on the problem of coded machine unlearning, a new framework for removing data from trained ML models using data encoding and ensemble learning, while the second paper (the NeurIPS paper) focuses on the problem of probabilistic machine unlearning, a relaxed definition of data deletion that requires the output model to be similar to one trained without the deleted data. The first paper proposes a coded learning and unlearning protocol that uses random linear coding to combine the training samples into smaller coded shards and updates the model accordingly. The second paper proposes a Gaussian mechanism for unlearning that adds Gaussian noise to the output of the learning algorithm and proves its differential-privacy and excess-risk guarantees. The first paper also presents synthetic-data experiments to demonstrate the performance-versus-unlearning-cost trade-off of the coded protocol. The second paper also derives a lower bound on the sample size required for unlearning and extends the results to the case of convex but not strongly convex loss functions.
Check out paper 3 in the research.md file.