Conversation


liranc6 commented Jul 31, 2025

No description provided.

liranc6 added 6 commits July 28, 2025 10:56
… fast eval.

files changed: eval.py, metrics/knowmem.py, metrics/privleak.py, metrics/verbmem.py
The primary purpose is to improve evaluation robustness and flexibility when managing model outputs and debug workflows.

The primary changes are:

- Updated `eval_model` to ensure `forget_data`, `retain_data`, and `holdout_data` are initialized consistently before use.
- Replaced hardcoded paths with `os.path.join` using `MUSE_DIR` in `eval_model` for improved path handling.
- Added a `kwargs` parameter to both `eval_model` and `load_then_eval_models` to support dynamic control over file creation and loading.
- Implemented conditional logic in `eval_model` for managing `privleak` file generation based on `kwargs['create_new_files']`.
- Removed unused imports and dynamic import logic from `eval.py`, replacing `importlib` with `sys.path.append` to streamline module loading.
- Improved debug visibility in `eval_model` with additional `print` statements for key file paths and parameter values.
- Increased `debug_subset_len` from 2 to 50 in `eval_model` for broader test coverage during debug mode.
- Updated `exp.ipynb` to align with changes in model handling and evaluation behavior in `eval_model`.
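
A minimal sketch of the `kwargs`-driven flow described above (the signature and file names are illustrative, not the actual `eval.py` code):

```python
import os

MUSE_DIR = os.environ.get("MUSE_DIR", ".")  # assumed; the real constant lives in eval.py

def eval_model(model, forget_data=None, retain_data=None, holdout_data=None, **kwargs):
    # Initialize the three splits consistently before use.
    forget_data = forget_data or []
    retain_data = retain_data or []
    holdout_data = holdout_data or []

    # Build paths relative to MUSE_DIR instead of hardcoding them.
    privleak_path = os.path.join(MUSE_DIR, "out", "privleak.json")
    print(f"privleak_path={privleak_path}")  # debug visibility for key paths

    # Only regenerate the privleak file when explicitly requested.
    if kwargs.get("create_new_files", False) or not os.path.exists(privleak_path):
        ...  # compute and write new privleak results here
```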
liranc6 force-pushed the initial-checks-cosmetic-edits branch from 8f76d7d to aba8ff4 on August 11, 2025 11:34
liranc6 added 23 commits August 11, 2025 22:09
Purpose: Improve the clarity and depth of ILL evaluation, and introduce new tools for classifier-based analysis.

Changes:
- Updated  to clean outputs, improve ROC curves, and set .
- Improved structure and markdown clarity in , with added analysis on loss distributions and unlearning.
- Added  and  for classifier-based ILL feature exploration.
- Added  script for reproducible, scriptable Random Forest analysis.

These updates improve reproducibility, interpretability, and support deeper ILL feature analysis.
…ion in notebooks

The primary purpose is to fix broken imports and implement functional Input Loss Landscape feature computation for machine learning interpretability analysis.

The primary changes are:

- Enhanced import structure in  with additional sklearn modules and SHAP availability check.
- Replaced broken  function calls with working ILL feature computation pipeline.
- Added comprehensive logistic regression analysis with performance metrics, confusion matrix, and feature importance analysis.
- Integrated permutation importance computation and visualization for feature interpretability.
- Fixed execution flow by removing error-prone cells and replacing with successful feature extraction results.
- Updated notebook outputs to show successful ILL feature computation for forget/retain/holdout datasets.
- Added baseline logistic regression performance evaluation with 74% accuracy and detailed classification report.
- Modified  to align with the working implementation in .
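
For context, a classifier pipeline along these lines (data and feature names are stand-ins; only the `sklearn` calls and the kinds of metrics reported come from the commit) might look like:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.inspection import permutation_importance

# X: ILL feature matrix; y: 1 = forget, 0 = retain/holdout (random stand-ins here).
X, y = np.random.rand(600, 8), np.random.randint(0, 2, 600)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
print(confusion_matrix(y_te, clf.predict(X_te)))

# Permutation importance for feature interpretability.
imp = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
for i in np.argsort(imp.importances_mean)[::-1]:
    print(f"feature {i}: importance {imp.importances_mean[i]:.3f}")
```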
The primary purpose is to evaluate the loss landscape of first neighbor sentences to understand the impact of unlearning.

The primary changes are:

- Created a new notebook `MUSE/notebooks/1st_neighbor_classification.ipynb` to analyze the loss landscape of first neighbor sentences.

- Modified `loss_landscape.py` to extract logic into `new_ILL_eval`, `get_features`, and `normalize_features`.

- Replaced dynamic imports with `sys.path` appends in `utils.py`.

- Added `transformers` to `requirements.txt`.

- Increased UMAP dimensionality from 2 to 10 in `embedding.py`.

- Added AUC heatmap and bar chart of top features in `visualization.py`.

- Modified `plotting.py` to return `matplotlib` figure objects instead of file paths.

- Updated `plotting.py` to align with changes made in `visualization.py`.
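
A sketch of the refactored pieces under stated assumptions: the body of `normalize_features` is illustrative, and only the UMAP dimensionality change is taken from the commit.

```python
import numpy as np
import umap  # umap-learn

def normalize_features(features: np.ndarray) -> np.ndarray:
    # Z-score each feature column; epsilon guards against zero variance.
    mu, sigma = features.mean(axis=0), features.std(axis=0)
    return (features - mu) / (sigma + 1e-8)

# 10-dimensional embedding instead of the previous 2D projection.
reducer = umap.UMAP(n_components=10)
embedding = reducer.fit_transform(normalize_features(np.random.rand(200, 32)))
```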
The purpose of this change is to prevent errors when saving the statistical distances heatmap.

The changes include:

- Added `os.makedirs(plots_base_dir, exist_ok=True)` before saving the heatmap in `eval_with_ILL.py` to ensure the directory exists.
The primary purpose is to provide a reproducible workflow for evaluating Input Loss Landscape (ILL) features on the TOFU dataset using a Llama-2-7b model.

The primary changes are:

- Added `TOFU/notebooks/eval_with_ILL.ipynb` containing a step-by-step pipeline for:

  - Loading and preprocessing the TOFU dataset from Hugging Face.

  - Loading model and tokenizer with correct prompt formatting.

  - Running ILL evaluation using project utilities and saving results.

  - Extracting and normalizing ILL feature tensors for analysis.

  - Visualizing loss landscape features with matplotlib plots.

- The notebook demonstrates integration between the TOFU, MUSE, and project source directories.

- Example code for prompt formatting, model inference, and loss calculation is included for clarity.

- Notebook serves as a reference for future ILL experiments and analysis on TOFU.
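
A condensed sketch of the notebook's pipeline, assuming the public `locuslab/TOFU` dataset and a hypothetical prompt format (the real notebook's formatting may differ):

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tofu = load_dataset("locuslab/TOFU", "forget10")["train"]  # config name assumed

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16, device_map="auto"
)

# Prompt formatting and per-example loss; labels=input_ids yields the mean token NLL.
prompt = f"Question: {tofu[0]['question']}\nAnswer: {tofu[0]['answer']}"
inputs = tok(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(loss.item())
```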
…aluation

Detailed description:

- Introduced `notbooks/overall_comp.py` to provide a full pipeline for running, analyzing, and visualizing
  unlearning experiments across multiple models and benchmarks.
- Added argument parsing, configuration, and directory management for reproducible experiments.
- Implemented data loading utilities for TOFU, WMDP, and MUSE datasets.
- Integrated model loading, evaluation, and feature extraction using HuggingFace Transformers.
- Added baseline and custom metric computation (AUC, min-k, zlib, ROUGE-L, etc.).
- Created a `DataManager` class for robust saving/loading of results, tables, and visualizations.
- Automated table generation (aggregate, family, detailed) and summary statistics.
- Added plotting and visualization routines for performance comparison.
- Ensured compatibility with Weights & Biases logging.
- Updated  to return trained classifiers for downstream saving and analysis.
- Modified binary comparison training in  to return classifier objects.
- These changes enable end-to-end experiment management, result analysis, and reporting for the project.
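
The min-k and zlib baselines named above are standard membership-inference scores; minimal sketches (function names are ours, not the script's):

```python
import zlib
import numpy as np

def zlib_ratio(text: str, nll: float) -> float:
    # Model loss normalized by the zlib-compressed length of the text.
    return nll / len(zlib.compress(text.encode("utf-8")))

def min_k_percent(token_nlls: np.ndarray, k: float = 0.2) -> float:
    # Mean NLL over the k% of tokens the model finds least likely (Min-K% prob).
    n = max(1, int(len(token_nlls) * k))
    return float(np.sort(token_nlls)[-n:].mean())
```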
…ults aggregation

Detailed description:

- Added `notbooks/ablations_results/create_commands.ipynb` to generate experiment command-line arguments for ablation studies across parameters such as `n_tokens`, `max_new_tokens`, `neighbor_dist`, and `max_neighbors` (see the sketch below).
- Added `notbooks/ablations_results/max_new_tokens_read_results.ipynb` to load, aggregate, and visualize experiment results for different `max_new_tokens` values, including summary tables and TSV exports for further analysis.
- Both notebooks support reproducible experiment setup and results inspection, with code for classifier loading, dummy predictions, and formatted output for Google Sheets.

These changes enable systematic parameter sweeps and facilitate detailed ablation analysis of unlearning experiments.
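
The command generation presumably amounts to a Cartesian product over the parameter grids; a hedged sketch (grid values and flag names are hypothetical):

```python
from itertools import product

n_tokens = [16, 32]
max_new_tokens = [32, 64]
neighbor_dist = [1, 2]
max_neighbors = [5, 10]

commands = [
    f"python notbooks/overall_comp.py --n_tokens {nt} --max_new_tokens {mnt} "
    f"--neighbor_dist {nd} --max_neighbors {mn}"
    for nt, mnt, nd, mn in product(n_tokens, max_new_tokens, neighbor_dist, max_neighbors)
]
for i, cmd in enumerate(commands):
    print(i, cmd)  # the list index doubles as the job index used by the mapping files
```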
…ation and new result scraping tool

The primary purpose is to refine the setup for ablation experiments by adjusting parameter configurations in command generation and introducing a new utility to systematically extract and organize result file paths from terminal run outputs.

The key changes are:

- Modified `notbooks/ablations_results/create_commands.ipynb` to update experiment parameter lists, including adjustments to `n_tokens`, `max_new_tokens`, `neighbor_dist`, and `max_neighbors` values, and refined the command output structure for better indexing.
- Updated `notbooks/ablations_results/neighbor_dist_read_results.ipynb` to change the Python version from 3.11.13 to 3.11.12.
- Added new `notbooks/ablations_results/scrap_results_file_names.ipynb` to scrape result file paths from terminal output files, build a DataFrame mapping experiment parameters to job indices and paths, and enable querying specific experiments.
Purpose: Provide a reproducible notebook to extract experiment result file paths
from terminal outputs and synchronize experiment command lists across ablations notebooks.

What changed:
- Added `notbooks/ablations_results/scrap_results_file_names.ipynb` to build parameter lists,
  parse terminal outputs, extract JSON paths via regex, and create `results_df` with pandas (see the sketch below).
- Updated `notbooks/ablations_results/create_commands.ipynb` to clean command listings and fix job indices.
- Updated `notbooks/ablations_results/generic_read_results.ipynb` to generate plots examining performance with respect to each parameter, summarize the results in tables, and flag outlier runs.
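
The path-scraping step might look roughly like this (file name and regex are assumptions, not the notebook's exact code):

```python
import re
import pandas as pd

records = []
with open("terminal_output_67906792.txt") as f:  # hypothetical log file
    for line in f:
        # Pull any path ending in .json out of each log line.
        for path in re.findall(r"(/\S+?\.json)", line):
            records.append({"log_line": line.strip(), "result_path": path})

results_df = pd.DataFrame(records)
print(results_df.head())
```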
Purpose: Fix and standardize notebooks used to generate experiment commands and read/aggregate ablation results.

- notbooks/ablations_results/create_commands.ipynb:
  - Removed duplicated/incorrect command entries.
  - Normalized and corrected parameter lists (neighbor_dist, max_neighbors).
  - Added/cleaned parameter arrays (max_gen_tokens, subset_size, wb_exp_names).

- notbooks/ablations_results/generic_read_results.ipynb:
  - Standardized mapping path building using os.path.join.
  - Normalized job_ids list format and added `param_to_test` assignment.

- notbooks/ablations_results/max_neighbors_read_results.ipynb:
  - Added `param_to_test` and adjusted mapping path usage.
  - Cleaned formatting and clarified data-loading steps.

Notes:
- Keep notebooks consistent across ablation read scripts; consider centralizing path/mapping logic for future refactors.
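
In the spirit of the centralization note, a shared helper could be as small as this (name and layout hypothetical):

```python
import os

def mapping_path(base_dir: str, job_id: str) -> str:
    # One place to build results_file_mapping paths so every
    # read-results notebook resolves them identically.
    return os.path.join(base_dir, "notbooks", "ablations_results",
                        "results_file_mapping", f"{job_id}.json")
```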
… additional job IDs in ablation analysis

The purpose is to aggregate and analyze results from more experiments by incorporating data from job IDs 67903336 and 67906792 into the ablation notebooks.

Key changes include:
- Added timestamp argument to `overall_comp.py` for better experiment tracking.
- Updated job_ids lists in `max_neighbors_read_results.ipynb`, `n_tokens_read_results.ipynb`, and `neighbor_dist_read_results.ipynb` to include '67906792'.
- Modified code cells in `create_commands.ipynb` to process additional mapping data and generate commands for new job IDs.
- Added new entries in `results_file_mapping/67906792.json` for the additional job results.
- Updated `scrap_results_file_names.ipynb` with new experiment parameters and paths.
The primary purpose is to enhance the input loss landscape evaluation by adding HDBSCAN clustering and 2D/3D visualization capabilities to analyze data distributions across forget, retain, and holdout sets for improved interpretability.

The primary changes are:

- Added HDBSCAN clustering functionality in `notbooks/overall_comp.py` with new functions `perform_hdbscan_clustering` and `plot_hdbscan_results` (sketched below).
- Integrated clustering into the `run_ill_evaluation` function, including PCA and saving of results.
- Updated `evaluate_model_on_benchmark` to save HDBSCAN clusterers and visualizations.
- Modified the `DataManager` class to include a `save_plots` parameter.
- Added necessary imports for PCA, HDBSCAN, and 3D plotting.
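
A minimal sketch of the clustering-plus-projection flow, using the `hdbscan` package and synthetic stand-in features:

```python
import numpy as np
import hdbscan  # pip install hdbscan
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

features = np.random.rand(300, 10)  # stand-in for normalized ILL features

# Cluster in feature space; -1 labels mark noise points.
labels = hdbscan.HDBSCAN(min_cluster_size=10).fit_predict(features)

# Project to 2D for plotting; the commit's code also saves a 3D variant.
coords = PCA(n_components=2).fit_transform(features)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=8)
plt.title("HDBSCAN clusters over ILL features (PCA projection)")
plt.savefig("hdbscan_clusters.png")
```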
…y tests

The primary purpose is to add HDBSCAN clustering and improve data/asset management and transferability testing to enable classifier cross-evaluation and saving/loading of artifacts.

The primary changes are:
- Added HDBSCAN clustering and plotting and integrated into ILL flow:
  - `perform_hdbscan_clustering`, clustering results and 2D/3D figs in `notbooks/overall_comp.py`.
  - Integrated clustering saving into `evaluate_model_on_benchmark`.
- Enhanced `DataManager` in `notbooks/overall_comp.py`:
  - Added `save_tensors`, `load_tensors`, `save_metadata`, `save_classifier`, `load_classifier` (returns metadata), `set_subdirectory`, `return_one_subdir_up`, and CSV helpers.
  - Improved classifier listing and metadata management.
- Added transferability tooling in `notbooks/overall_comp.py`:
  - `test_classifier_transferability`, `analyze_transferability_results`, `get_available_classifiers_summary`, and example usage.
- Misc: saved `features_dict` post-processing, normalized tensors, and minor refactors for metadata consistency and plotting returns.
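
The cross-evaluation idea behind `test_classifier_transferability` can be sketched as follows (the dict shapes and the AUC choice are assumptions):

```python
from sklearn.metrics import roc_auc_score

def test_classifier_transferability(classifiers: dict, datasets: dict) -> dict:
    # Evaluate every saved classifier on every other benchmark's features;
    # a sketch of the cross-evaluation idea, not the actual implementation.
    results = {}
    for src, clf in classifiers.items():
        for tgt, (X, y) in datasets.items():
            if src == tgt:
                continue
            results[(src, tgt)] = roc_auc_score(y, clf.predict_proba(X)[:, 1])
    return results
```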
The primary purpose is to add an aggregated job results mapping and standardize ablations notebooks for reproducible aggregation and analysis.

The primary changes are:
- Added `notbooks/ablations_results/results_file_mapping/67936556.json` with job parameter-to-result mappings and timestamps.
- Updated notebooks in `notbooks/ablations_results/` (`create_commands.ipynb`, `max_neighbors_read_results.ipynb`, `neighbor_dist_read_results.ipynb`, `scrap_results_file_names.ipynb`) to include the new job id, align printed outputs and metadata, and standardize execution info.
- Minor metadata adjustments (Python version, execution counts) to ensure consistent notebook outputs.
The primary purpose is to introduce a new notebook for comparing ablation results between EPP and MLM models, along with updating related notebooks and mappings with the latest experiment data.

The primary changes are:

- Added `notbooks/ablations_results/EPP_MLM_comp_ablation.ipynb` as a new file for comparative analysis.
- Updated command lists, timestamps, and job indices in `notbooks/ablations_results/create_commands.ipynb` to reflect new experiment runs.
- Modified result reading logic and data in `notbooks/ablations_results/max_neighbors_read_results.ipynb` and `notbooks/ablations_results/neighbor_dist_read_results.ipynb` to incorporate updated paths and timestamps.
- Refreshed scraped results table in `notbooks/ablations_results/scrap_results_file_names.ipynb` with new entries and corrected indices.
- Added new JSON mapping file `notbooks/ablations_results/results_file_mapping/67936556.json` containing parameter values, job indices, and paths for the latest experiments.
The primary purpose is to incorporate new experimental results into the ablation analysis notebooks.

The primary changes are:

- Updated `create_commands.ipynb` with new command lists and timestamps.
- Modified `max_neighbors_read_results.ipynb` and `neighbor_dist_read_results.ipynb` with updated metadata.
- Added new results file mapping `67936556.json` in `results_file_mapping/`.
- Updated `scrap_results_file_names.ipynb` with new table data and paths.
The primary purpose is to include the outputs and visualizations from running the EPP and MLM ablation experiments for comparison.

The primary changes are:

- Added execution outputs to multiple cells in `EPP_MLM_comp_ablation.ipynb`, including progress bars, similarity calculations, and statistical results.
- Integrated interactive 3D plots and heatmaps showing mean cosine similarity across perturbation parameters (see the sketch below).
- Included comparison statistics, box plots, and example perturbations for EPP vs MLM methods.
- Saved generated figures and dataframes to the output directory via the `DataManager` class.
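
A rough sketch of the similarity comparison (embeddings here are random stand-ins; the notebook's plots are richer):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics.pairwise import cosine_similarity

# emb_orig / emb_pert: embeddings of original vs. perturbed sentences (stand-ins).
emb_orig, emb_pert = np.random.rand(100, 768), np.random.rand(100, 768)

# Cosine similarity between each original sentence and its own perturbation.
sims = np.diag(cosine_similarity(emb_orig, emb_pert))
print(f"mean cosine similarity: {sims.mean():.3f}")

plt.boxplot([sims])
plt.xticks([1], ["EPP"])  # the notebook shows EPP and MLM side by side
plt.ylabel("cosine similarity")
plt.savefig("epp_similarity_boxplot.png")
```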
… plotting and data saving

The primary purpose is to improve the ablation study by incorporating updated similarity metrics, interactive 3D plots, and automated saving of results for better analysis and reproducibility.

The primary changes are:

- Modified `notbooks/ablations_results/EPP_MLM_comp_ablation.ipynb` to update execution counts, make plotting function parameters more flexible, add data-saving calls, and remove redundant cells.
- Added new CSV file `notbooks/ablations_results/EPP_MLM_comp_ablation/2025_12_25-19_24_25/epp_ablation_results.csv.csv` with ablation results data.
- Added new PNG files `notbooks/ablations_results/EPP_MLM_comp_ablation/2025_12_25-19_24_25/epp_cosine_similarity_line_plot.png` and `notbooks/ablations_results/EPP_MLM_comp_ablation/2025_12_25-19_24_25/epp_similarity_heatmap.png` for visualization outputs.
…ults and neighbor file handling

The primary purpose is to refine the script's configuration for better usability in transferability experiments and neighbor generation.

The primary changes are:

- Modified the argument parser in `notbooks/overall_comp.py` to use boolean actions for `new_classifiers`, `transferability`, and `save_classifiers` instead of string types (see the sketch below).
- Added conditional logic to automatically set `SAVE_CLASSIFIERS` to True when `TRANSFERABILITY` is enabled.
- Updated the timestamp format to include seconds for more precise experiment tracking.
- Changed `create_new_neighbors_file` to True in `run_ill_evaluation` to enable neighbor file creation.
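
A sketch of the flag handling described above (variable names follow the commit; the surrounding code is assumed):

```python
import argparse
from datetime import datetime

parser = argparse.ArgumentParser()
# Boolean flags instead of parsing "True"/"False" strings.
parser.add_argument("--new_classifiers", action="store_true")
parser.add_argument("--transferability", action="store_true")
parser.add_argument("--save_classifiers", action="store_true")
args = parser.parse_args()

TRANSFERABILITY = args.transferability
# Transferability tests need saved classifiers, so force the flag on.
SAVE_CLASSIFIERS = args.save_classifiers or TRANSFERABILITY

# Second-level precision for experiment directories (e.g., 2025_12_25-19_24_25).
timestamp = datetime.now().strftime("%Y_%m_%d-%H_%M_%S")
```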
liranc6 force-pushed the initial-checks-cosmetic-edits branch from 5104151 to 4865e39 on December 27, 2025 11:33
The primary changes are:
- Modified `n_tokens_read_results.ipynb` to adjust the `num_runs` calculation to account for exclusions, and to output a summary of the aggregated results.
- Updated `scrap_results_file_names.ipynb` to make the regex-based path extraction more robust.
The primary purpose is to improve the consistency and comprehensiveness of results aggregation and table generation (a pandas sketch follows the list).

- Consolidated the numeric cleaning logic and expanded the set of numeric columns.
- Standardized column renaming and mapping.
- Made Table 2 construction robust to missing baseline keys and added summary statistics.
- Added model-family extraction and family-level aggregation for Table 3.
- Returned DataFrames for the tables and included detailed results and summary stats.
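
A pandas sketch of the cleaning/aggregation pattern described above (column names are hypothetical):

```python
import pandas as pd

df = pd.read_csv("detailed_results.csv")  # hypothetical input table

# Consolidated numeric cleaning: strip stray characters, coerce to float.
numeric_cols = [c for c in df.columns if c not in ("model", "family")]
for c in numeric_cols:
    df[c] = pd.to_numeric(df[c].astype(str).str.replace("%", ""), errors="coerce")

# Family-level aggregation in the style of Table 3, plus summary statistics.
table3 = df.groupby("family")[numeric_cols].agg(["mean", "std"])
summary_stats = df[numeric_cols].describe()
print(table3)
```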