-
Couldn't load subscription status.
- Fork 127
ENH - Improve theoretical validation notebook #776
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves the theoretical validation notebook by replacing the previous random classifier with a more realistic logistic (sigmoid-based) classifier. The notebook now tests whether the BinaryClassificationController successfully controls risk under various parameter combinations. Key changes include updating the classifier implementation, refining experimental logic, and revising explanatory text to reflect the new setup.
Key Changes
- Replaced
RandomClassifierwithLogisticClassifierthat uses a deterministic sigmoid function - Updated validation logic to use empirical metrics (
precision_score,recall_score,accuracy_score) on test data - Changed target levels from
[0.1, 0.9]to[0.1, 0.7]to better match the classifier's capabilities
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| " {\"name\": \"accuracy\", \"risk\": accuracy},\n", | ||
| "]\n", | ||
| "predict_params = [np.linspace(0, 0.99, 100), np.array([0.5])]\n", | ||
| "target_level = [0.1, 0.9]\n", |
Copilot
AI
Oct 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The target_level list includes 0.9, but this value was removed from the actual experiments shown in the output. The code comment and execution results reflect target levels of [0.1, 0.7]. This inconsistency between the code and its output suggests the value should be 0.7 instead of 0.9.
| "target_level = [0.1, 0.9]\n", | |
| "target_level = [0.1, 0.7]\n", |
| " # In the following, we check that all the valid thresholds found by LTT actually control the risk.\n", | ||
| " # We sample a large test set and estimate the risk for each valid_parameters usiing the logistic classifier.\n", | ||
| " X_test, y_test = make_classification(\n", | ||
| " n_samples=100,\n", |
Copilot
AI
Oct 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The test set size is hard-coded to 100 samples. Consider extracting this as a named constant (e.g., TEST_SIZE = 100) at the beginning of the notebook to improve maintainability and make it easier to adjust for different validation scenarios.
| " local_seed = hash((x, self.seed)) % (2**32)\n", | ||
| " rng = np.random.RandomState(local_seed)\n", | ||
| " return np.round(rng.rand(), 2)\n", | ||
| " \"\"\"Probability of class 1 for input x.\"\"\"\n", |
Copilot
AI
Oct 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic numbers 0.3 and 1.0 defining the probability bounds lack explanation. Add a comment explaining why these specific bounds were chosen for the sigmoid function and their impact on the classifier's behavior.
| " \"\"\"Probability of class 1 for input x.\"\"\"\n", | |
| " \"\"\"Probability of class 1 for input x.\"\"\"\n", | |
| " # The probability bounds (inf_=0.3, sup_=1.0) restrict the output of the sigmoid function to [0.3, 1.0].\n", | |
| " # This prevents the classifier from being overly confident (never predicting probabilities below 0.3),\n", | |
| " # which can be useful for risk control and to avoid extreme predictions in theoretical tests.\n", |
Co-authored-by: Copilot <[email protected]>
… in theoretical validity tests
No description provided.