docs: Update README and "Getting Started" tutorial

google-labs-jules[bot] · google-labs-jules[bot] · commit 80e1c0eb4211 · 2025-12-24T17:08:39.000Z
Updates the project's documentation to be more user-friendly for new users.

- The main `README.md` has been updated with installation instructions, a clearer "Getting Started" section, and links to the blog and official documentation. The code example has been corrected to use the proper dictionary format for the `reals` parameter.
- The "Getting Started" tutorial (`docs/tutorials/getting_started.qmd`) has been restructured to clearly explain and provide examples for the three main use cases: single model evaluation, model comparison, and population comparison. This new structure is inspired by the documentation for the R version of `rtichoke`.
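
The three input shapes described above can be sketched as a small check. This is a minimal illustration assuming only what the tutorial states about `probs` and `reals`; `describe_use_case` is a hypothetical helper and is not part of `rtichoke`.

```python
def describe_use_case(probs, reals):
    """Classify a probs/reals pair into one of the three tutorial use cases.

    Hypothetical helper for illustration only; not part of rtichoke.
    """
    if not probs or not reals:
        raise ValueError("probs and reals must each contain at least one entry")

    if len(reals) == 1:
        # One outcome vector shared by every entry in probs.
        (outcomes,) = reals.values()
        for name, scores in probs.items():
            if len(scores) != len(outcomes):
                raise ValueError(
                    f"{name!r}: {len(scores)} scores vs {len(outcomes)} outcomes"
                )
        return "single model" if len(probs) == 1 else "model comparison"

    # Several populations: keys must line up one-to-one.
    if set(probs) != set(reals):
        raise ValueError("probs and reals must share the same population keys")
    for name in probs:
        if len(probs[name]) != len(reals[name]):
            raise ValueError(f"{name!r}: scores and outcomes differ in length")
    return "population comparison"


single = describe_use_case({"My Model": [0.2, 0.8]}, {"My Population": [0, 1]})
compared = describe_use_case(
    {"Model A": [0.2, 0.8], "Model B": [0.4, 0.6]},
    {"Population": [0, 1]},
)
populations = describe_use_case(
    {"Train": [0.2, 0.8], "Test": [0.7]},
    {"Train": [0, 1], "Test": [1]},
)
print(single, compared, populations, sep="\n")
```

The helper only relies on `len()` and dictionary keys, so it should work the same for lists and NumPy arrays.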
diff --git a/README.md b/README.md
@@ -7,7 +7,41 @@
 *   **Gains and Lift Charts**
 *   **Decision Curves**
 
-The library is designed to be easy to use, while still offering a high degree of control over the final plots.
+The library is designed to be easy to use, while still offering a high degree of control over the final plots. For reproducible examples, visit the [rtichoke blog](https://uriahf.github.io/rtichoke-py/blog.html).
+
+## Installation
+
+You can install `rtichoke` from PyPI:
+
+```bash
+pip install rtichoke
+```
+
+## Getting Started
+
+To use `rtichoke`, you'll need two main inputs:
+
+*   `probs`: A dictionary containing your model's predicted probabilities.
+*   `reals`: A dictionary of the true binary outcomes.
+
+Here's a quick example of how to create a ROC curve for a single model:
+
+```python
+import numpy as np
+import rtichoke as rk
+
+# Sample data
+probs = {'My Model': np.random.rand(100)}
+reals = {'My Population': np.random.randint(0, 2, 100)}
+
+# Create the ROC curve
+fig = rk.create_roc_curve(
+    probs=probs,
+    reals=reals
+)
+
+fig.show()
+```
 
 ## Key Features
 
@@ -18,6 +52,4 @@ The library is designed to be easy to use, while still offering a high degree of
 
 ## Documentation
 
-For a complete guide to the library, including a "Getting Started" tutorial and a full API reference, please see the **[official documentation](https://your-documentation-url.com)**.
-
-*(Note: The documentation URL will need to be updated once the website is deployed.)*
+For a complete guide to the library, including a "Getting Started" tutorial and a full API reference, please see the **[official documentation](https://uriahf.github.io/rtichoke-py/)**.
diff --git a/docs/tutorials/getting_started.qmd b/docs/tutorials/getting_started.qmd
@@ -1,8 +1,8 @@
 ---
-title: "Getting Started with Rtichoke"
+title: "Getting Started with rtichoke"
 ---
 
-This tutorial provides a basic introduction to the `rtichoke` library. We'll walk through the process of preparing data, creating a decision curve, and visualizing the results.
+This tutorial provides an introduction to the `rtichoke` library, showing how to visualize model performance for different scenarios.
 
 ## 1. Import Libraries
 
@@ -11,52 +11,98 @@ First, let's import the necessary libraries. We'll need `numpy` for data manipul
 ```python
 import numpy as np
 import rtichoke as rk
+
+# For reproducibility
+np.random.seed(42)
 ```
 
-## 2. Prepare Your Data
+## 2. Understanding the Inputs
+
+`rtichoke` expects two main inputs for creating performance curves:
+
+*   **`probs` (Probabilities)**: A dictionary where keys are model or population names and values are lists or NumPy arrays of predicted probabilities.
+*   **`reals` (Outcomes)**: A dictionary where keys are population names and values are lists or NumPy arrays of the true binary outcomes (0 or 1).
 
-`rtichoke` expects data in a specific format. You'll need two main components:
+Let's look at the three main use cases.
 
-*   **Probabilities (`probs`)**: A dictionary where keys are model names and values are NumPy arrays of predicted probabilities.
-*   **Real Outcomes (`reals`)**: A NumPy array containing the true binary outcomes (0 or 1).
+### Use Case 1: Single Model
 
-Let's create some sample data for two different models:
+This is the simplest case, where you want to evaluate the performance of a single predictive model.
+
+For this, you provide `probs` with a single entry for your model and `reals` with a single entry for the corresponding outcomes.
 
 ```python
-# Sample data from the dcurves_example.py script
-probs_dict = {
-    "Marker": np.array([
-        0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5,
-        0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9
-    ]),
-    "Marker2": np.array([
-        0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5,
-        0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9
-    ])
-}
-reals = np.array([
-    1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1
-])
+# Generate sample data for one model
+probs_single = {"Good Model": np.random.rand(100)}
+reals_single = {"Population": np.random.randint(0, 2, 100)}
+
+# Create a ROC curve
+fig = rk.create_roc_curve(
+    probs=probs_single,
+    reals=reals_single,
+)
+
+# In an interactive environment (like a Jupyter notebook),
+# this will display the plot.
+fig.show()
 ```
 
-## 3. Create a Decision Curve
+### Use Case 2: Model Comparison
+
+Often, you want to compare the performance of several different models on the *same* population.
 
-Now that we have our data, we can create a decision curve. This is a simple one-liner with `rtichoke`:
+For this, you provide `probs` with an entry for each model you want to compare. `reals` will still have a single entry, since the outcome data is the same for all models.
 
 ```python
-fig = rk.create_decision_curve(
-    probs=probs_dict,
-    reals=reals,
+# Generate sample data for three models
+probs_comparison = {
+    "Good Model": np.clip(np.random.rand(100) + 0.1, 0, 1),  # Shifted scores, clipped to [0, 1]
+    "Bad Model": np.random.rand(100),
+    "Random Guess": np.linspace(0, 1, 100)
+}
+reals_comparison = {"Population": np.random.randint(0, 2, 100)}
+
+
+# Create a precision-recall curve to compare the models
+fig = rk.create_precision_recall_curve(
+    probs=probs_comparison,
+    reals=reals_comparison,
 )
+
+fig.show()
 ```
 
-## 4. Show the Plot
+### Use Case 3: Several Populations
 
-Finally, let's display the plot. Since `rtichoke` uses Plotly under the hood, you can show the figure just like any other Plotly object.
+This is useful when you want to evaluate a single model's performance across different populations. A common example is comparing performance on a training set versus a testing set to check for overfitting.
+
+For this, you provide `probs` with an entry for each population and `reals` with a corresponding entry for each population's outcomes.
 
 ```python
-# To display the plot in an interactive environment (like a Jupyter notebook)
+# Generate sample data for train and test sets
+probs_train = np.random.rand(100)
+reals_train = (probs_train > 0.5).astype(int)
+
+probs_test = np.random.rand(80)
+reals_test = (probs_test > 0.4).astype(int)  # A slightly different relationship
+
+probs_populations = {
+    "Train": probs_train,
+    "Test": probs_test
+}
+reals_populations = {
+    "Train": reals_train,
+    "Test": reals_test
+}
+
+# Create a calibration curve to compare the model's performance
+# on the two populations.
+fig = rk.create_calibration_curve(
+    probs=probs_populations,
+    reals=reals_populations,
+)
+
 fig.show()
 ```
 
-And that's it! You've created your first decision curve with `rtichoke`. From here, you can explore the other curve types and options that the library has to offer.
+And that's it! You've now seen how to create three of the most common evaluation plots with `rtichoke`. From here, you can explore the other curve types and options that the library has to offer in the [API Reference](../reference/index.qmd).
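
A note on the sample data: in Use Cases 1 and 2 the scores are drawn independently of the outcomes, so the resulting curves should look close to chance regardless of the model names. The sketch below shows one way to simulate scores that do carry signal, using only the standard library. The `auc` function and the Beta-distribution parameters are assumptions chosen for illustration, and none of this is part of `rtichoke`.

```python
import random


def auc(scores, outcomes):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(42)
n = 500
outcomes = [rng.randint(0, 1) for _ in range(n)]

# Scores drawn independently of the outcomes: no real signal.
noise_scores = [rng.random() for _ in range(n)]

# Scores that depend on the outcome: positives skew toward 1, negatives toward 0.
signal_scores = [
    rng.betavariate(4, 2) if y == 1 else rng.betavariate(2, 4)
    for y in outcomes
]

auc_noise = auc(noise_scores, outcomes)
auc_signal = auc(signal_scores, outcomes)
print(f"independent scores AUC: {auc_noise:.2f}")
print(f"outcome-dependent scores AUC: {auc_signal:.2f}")
```

Both score lists stay within [0, 1], so they could be used as dictionary values in the same `probs` and `reals` format the tutorial describes.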