slots is a Python library designed to allow the user to explore and use simple multi-armed bandit (MAB) strategies.
slots provides a hopefully simple API to allow you to explore, test, and use these strategies. Basic usage looks like this:
Using slots to determine the best of 3 variations on a live website:

```Python
import slots

mab = slots.MAB(3)
```

Make the first choice randomly, record the response, and input the reward (here, arm 2 was chosen and paid out). Run online trials (inputting the most recent result each time) until the test criterion is met.

```Python
mab.online_trial(bandit=2, payout=1)
```
The response of `mab.online_trial()` is a dict; its `best` entry is the current best estimate of the highest payout arm.

For "real world" (online) usage, test results can be sequentially fed into an `MAB` object. The tests will continue until a stopping criterion is met.
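The sequential feed-in pattern can be sketched without the library. The following is a minimal epsilon greedy simulation, not slots' actual implementation; the three conversion rates, the 10% exploration rate, and the trial count are invented for illustration:

```Python
import random

def epsilon_greedy_trials(true_rates, n_trials=5000, eps=0.1, seed=0):
    """Simulate an online test: pick an arm, observe a payout, update counts."""
    rng = random.Random(seed)
    n = len(true_rates)
    pulls = [0] * n   # times each arm was chosen
    wins = [0] * n    # payouts observed per arm

    def observed_rate(i):
        return wins[i] / pulls[i] if pulls[i] else 0.0

    for _ in range(n_trials):
        if rng.random() < eps:
            arm = rng.randrange(n)                       # explore: random arm
        else:
            arm = max(range(n), key=observed_rate)       # exploit: best so far
        payout = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += payout

    best = max(range(n), key=observed_rate)
    return best, pulls

# Simulated "website variations" with hidden conversion rates
best, pulls = epsilon_greedy_trials([0.1, 0.3, 0.2])
```

With enough trials, the strategy concentrates pulls on the arm with the highest hidden rate (arm 1 here).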
By default, slots uses the epsilon greedy strategy. Besides epsilon greedy, the softmax, upper confidence bound (UCB1), and Bayesian bandit strategies are also implemented.
#### Regret analysis
A common metric used to evaluate the relative success of a MAB strategy is "regret". This reflects the fraction of payouts (wins) that have been lost by using the chosen sequence of pulls instead of always pulling the currently best known arm. The current regret value can be calculated by calling the `mab.regret()` method.
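One way to compute that fraction directly from a record of pulls is sketched below. This helper is illustrative only; it follows the definition above and is not guaranteed to match `mab.regret()`'s exact formula:

```Python
def regret(choices, payouts):
    """Fraction of payout lost versus always pulling the empirically best arm.

    choices: list of arm indices pulled, in order
    payouts: list of payouts observed for those pulls
    """
    arms = set(choices)
    # observed payout rate for each arm
    rates = {a: sum(p for c, p in zip(choices, payouts) if c == a) / choices.count(a)
             for a in arms}
    best_rate = max(rates.values())
    expected_best = best_rate * len(choices)  # payout if best arm were always pulled
    return (expected_best - sum(payouts)) / expected_best

# Three pulls of arm 0 (one win) and one pull of arm 1 (a win):
# the best arm's observed rate is 1.0, so 4 pulls "should" have paid 4,
# but only 2 was collected, giving a regret of 0.5.
r = regret([0, 0, 0, 1], [1, 0, 0, 1])
```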