Commit a1d27e4

Adding support for Ruby's spaceship operator
1 parent 3e7fd45 commit a1d27e4

3 files changed

+546
-10
lines changed

README.md

Lines changed: 96 additions & 10 deletions
@@ -250,6 +250,64 @@ puts baseline_times.>(optimized_times, alpha: 0.01) # More stringent
250250
puts optimized_times.<(baseline_times, alpha: 0.10) # More lenient
251251
```
252252

253+
##### `#<=>(other, alpha: 0.05)` - The Spaceship Operator
254+
255+
The spaceship operator provides three-way statistical comparison, returning:
256+
- `1` if this collection's mean is significantly greater than the other
257+
- `-1` if this collection's mean is significantly less than the other
258+
- `0` if there's no significant statistical difference
259+
260+
This is useful for sorting collections by mean, with collections whose means are not significantly different comparing as equal, or for implementing custom comparison logic.
261+
262+
```ruby
263+
# Three-way statistical comparison
264+
high_performance = [200, 210, 205, 215, 220] # mean = 210
265+
medium_performance = [150, 160, 155, 165, 170] # mean = 160
266+
low_performance = [50, 60, 55, 65, 70] # mean = 60
267+
268+
puts high_performance <=> medium_performance # => 1 (significantly greater)
269+
puts medium_performance <=> high_performance # => -1 (significantly less)
270+
puts high_performance <=> high_performance # => 0 (no significant difference)
271+
272+
# Sorting datasets by statistical significance
273+
datasets = [
274+
[10, 15, 12, 18, 11], # mean = 13.2
275+
[30, 35, 32, 38, 31], # mean = 33.2
276+
[5, 8, 6, 9, 7], # mean = 7.0
277+
[20, 25, 22, 28, 21] # mean = 23.2
278+
]
279+
280+
# Sort datasets in ascending order of mean (statistically indistinguishable datasets compare as equal)
281+
sorted_datasets = datasets.sort { |a, b| a <=> b }
282+
puts sorted_datasets.map(&:mean) # => [7.0, 13.2, 23.2, 33.2] (ascending by mean)
283+
284+
# A/B testing - sort variants by conversion performance
285+
control = [0.12, 0.11, 0.13, 0.12, 0.10] # 11.6% conversion
286+
variant_a = [0.14, 0.13, 0.15, 0.14, 0.12] # 13.6% conversion
287+
variant_b = [0.16, 0.15, 0.17, 0.16, 0.14] # 15.6% conversion
288+
289+
variants = [control, variant_a, variant_b]
290+
best_to_worst = variants.sort { |a, b| b <=> a } # Descending order
291+
292+
puts "Performance ranking:"
293+
best_to_worst.each_with_index do |variant, index|
294+
puts "#{index + 1}. Mean conversion: #{variant.mean.round(3)}"
295+
end
296+
297+
# Custom alpha levels (default is 0.05)
298+
borderline_a = [100, 102, 104, 106, 108] # mean = 104
299+
borderline_b = [95, 97, 99, 101, 103] # mean = 99
300+
301+
# Standard significance test (95% confidence)
302+
result_standard = borderline_a <=> borderline_b
303+
puts "Standard test (α=0.05): #{result_standard}"
304+
305+
# More lenient test (90% confidence)
306+
# Note: the infix form cannot take keyword arguments, so call the method explicitly to pass alpha
307+
result_lenient = borderline_a.public_send(:<=>, borderline_b, alpha: 0.10)
308+
puts "Lenient test (α=0.10): #{result_lenient}"
309+
```
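
The note about method call syntax follows from how Ruby parses operators: the infix form `a <=> b` passes exactly one argument, so a keyword such as `alpha:` can only be supplied through an explicit call. A minimal sketch, assuming a hypothetical `AlphaEcho` class that is not part of the gem and simply returns the alpha it received instead of running a test:

```ruby
# Hypothetical class for illustration only; it performs no statistics.
class AlphaEcho
  # Returns the alpha that reached the method, to show which value is used.
  def <=>(_other, alpha: 0.05)
    alpha
  end
end

x = AlphaEcho.new
y = AlphaEcho.new

infix    = x <=> y                              # default alpha is used
explicit = x.<=>(y, alpha: 0.10)                # explicit call with a keyword
sent     = x.public_send(:<=>, y, alpha: 0.10)  # equivalent dynamic dispatch

puts infix     # => 0.05
puts explicit  # => 0.1
puts sent      # => 0.1
```

The explicit `x.<=>(y, alpha: 0.10)` form and `public_send` behave the same way here; the former matches the `.>(other, alpha: 0.01)` style already used in the README.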
310+
253311
### Comparison Methods
254312

255313
#### `#percentage_difference(other)`
@@ -434,20 +492,47 @@ puts " Sample size: #{clean_temps.size}/#{temperatures.size}"
434492
### A/B Test Analysis
435493

436494
```ruby
437-
# Conversion rates for two variants
438-
variant_a = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16, 0.12, 0.15]
439-
variant_b = [0.18, 0.19, 0.17, 0.20, 0.18, 0.21, 0.19, 0.18]
495+
# Conversion rates for multiple variants
496+
control = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16, 0.12, 0.15] # 13.5% avg conversion
497+
variant_a = [0.18, 0.19, 0.17, 0.20, 0.18, 0.21, 0.19, 0.18] # 18.75% avg conversion
498+
variant_b = [0.16, 0.17, 0.15, 0.18, 0.16, 0.19, 0.17, 0.16] # 16.75% avg conversion
499+
variant_c = [0.22, 0.24, 0.21, 0.25, 0.23, 0.26, 0.22, 0.24] # 23.4% avg conversion
500+
501+
variants = [
502+
{ name: "Control", data: control },
503+
{ name: "Variant A", data: variant_a },
504+
{ name: "Variant B", data: variant_b },
505+
{ name: "Variant C", data: variant_c }
506+
]
507+
508+
# Display individual performance
509+
variants.each do |variant|
510+
mean_pct = (variant[:data].mean * 100).round(1)
511+
std_pct = (variant[:data].standard_deviation * 100).round(1)
512+
puts "#{variant[:name]}: #{mean_pct}% ± #{std_pct}%"
513+
end
440514

441-
puts "Variant A: #{(variant_a.mean * 100).round(1)}% ± #{(variant_a.standard_deviation * 100).round(1)}%"
442-
puts "Variant B: #{(variant_b.mean * 100).round(1)}% ± #{(variant_b.standard_deviation * 100).round(1)}%"
515+
# Sort variants by statistical performance using spaceship operator
516+
sorted_variants = variants.sort { |a, b| b[:data] <=> a[:data] } # Descending order
443517

444-
# Calculate performance lift
445-
lift = variant_b.signed_percentage_difference(variant_a)
446-
puts "Variant B lift: #{lift.round(1)}%" # => "Variant B lift: 34.8%"
518+
puts "\nPerformance Ranking (statistically significant):"
519+
sorted_variants.each_with_index do |variant, index|
520+
conversion_rate = (variant[:data].mean * 100).round(1)
521+
puts "#{index + 1}. #{variant[:name]}: #{conversion_rate}%"
522+
523+
# Compare to control using statistical significance
524+
if variant[:name] != "Control"
525+
is_significantly_better = variant[:data] > control
526+
puts " #{is_significantly_better ? '✅ Significantly better' : '❌ Not significantly better'} than control"
527+
end
528+
end
447529

448530
# Check for outliers that might skew results
449-
puts "A outliers: #{variant_a.outlier_stats[:outliers_removed]}"
450-
puts "B outliers: #{variant_b.outlier_stats[:outliers_removed]}"
531+
puts "\nOutlier Analysis:"
532+
variants.each do |variant|
533+
outlier_count = variant[:data].outlier_stats[:outliers_removed]
534+
puts "#{variant[:name]} outliers: #{outlier_count}"
535+
end
451536
```
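
One design point worth keeping in mind when sorting with a comparator that returns `0` for "no significant difference": such a comparator is not guaranteed to be transitive, so the order of near-equal items can depend on the sort algorithm and the input order. The sketch below illustrates this with a hypothetical `tolerant_compare` helper that uses a plain numeric tolerance as a stand-in for a significance test; it is not the gem's logic.

```ruby
# Toy comparator: values closer than `tolerance` count as "no difference".
# This is an assumed stand-in for a significance test, not the gem's method.
def tolerant_compare(a, b, tolerance: 1.0)
  return 0 if (a - b).abs < tolerance

  a > b ? 1 : -1
end

a = 1.0
b = 1.6
c = 2.2

ab = tolerant_compare(a, b) # a and b are within the tolerance
bc = tolerant_compare(b, c) # b and c are within the tolerance
ac = tolerant_compare(a, c) # a and c are not

puts [ab, bc, ac].inspect # => [0, 0, -1]
```

Here `a` ties with `b` and `b` ties with `c`, yet `a` is less than `c`. If a strict, reproducible ranking is needed, sorting on the mean itself (for example `variants.sort_by { |v| -v[:data].mean }`) and using the statistical comparison only to annotate differences may be the safer choice.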
452537

453538
### Performance Comparison
@@ -616,6 +701,7 @@ end
616701
| `less_than?(other, alpha: 0.05)` | Test if mean is significantly less | Boolean | One-tailed t-test, customizable alpha level |
617702
| `>(other, alpha: 0.05)` | Alias for `greater_than?` | Boolean | Shorthand operator for statistical comparison |
618703
| `<(other, alpha: 0.05)` | Alias for `less_than?` | Boolean | Shorthand operator for statistical comparison |
704+
| `<=>(other, alpha: 0.05)` | Three-way statistical comparison | Integer (-1, 0, 1) | Returns 1 if significantly greater, -1 if significantly less, 0 if no significant difference |
619705
| `percentage_difference(other)` | Absolute percentage difference | Float | Always positive, symmetric comparison |
620706
| `signed_percentage_difference(other)` | Signed percentage difference | Float | Preserves direction, useful for A/B tests |
621707
| `remove_outliers(multiplier: 1.5)` | Remove outliers using IQR method | Array | Returns new array, original unchanged |

lib/enumerable_stats/enumerable_ext.rb

Lines changed: 26 additions & 0 deletions
@@ -193,6 +193,32 @@ def <(other, alpha: 0.05)
193193
less_than?(other, alpha: alpha)
194194
end
195195

196+
# Performs a three-way statistical comparison of this collection's mean against another
197+
# collection's mean using the one-tailed tests behind greater_than? and less_than?. Returns 1
198+
# if this collection's mean is significantly greater at the specified alpha level, -1 if it
199+
# is significantly less at the specified alpha level, and 0 if neither test indicates a
200+
# statistically significant difference.
201+
#
202+
# @param other [Enumerable] Another collection to compare against
203+
# @param alpha [Float] Significance level (default: 0.05 for 95% confidence)
204+
# @return [Integer] 1 if this collection's mean is significantly greater, -1 if this collection's mean is
205+
# significantly less, 0 if this collection's mean is not significantly different
206+
# @example
207+
# control = [10, 12, 11, 13, 12] # mean = 11.6
208+
# treatment = [15, 17, 16, 18, 14] # mean = 16.0
209+
# control <=> treatment # => -1 (control is significantly less than treatment)
210+
# treatment <=> control # => 1 (treatment is significantly greater than control)
211+
# control <=> control # => 0 (control is not significantly different from itself)
212+
def <=>(other, alpha: 0.05)
213+
if greater_than?(other, alpha: alpha)
214+
1
215+
elsif less_than?(other, alpha: alpha)
216+
-1
217+
else
218+
0
219+
end
220+
end
221+
196222
# Calculates the arithmetic mean (average) of the collection
197223
#
198224
# @return [Float] The arithmetic mean of all numeric values
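
The `<=>` method added in this diff composes two one-sided checks into a three-way result. The sketch below reproduces that shape in a self-contained form. It is only an illustration under stated assumptions: `ThreeWaySketch`, its helpers, and the fixed `CRITICAL_T` cutoff are hypothetical, and comparing Welch's t statistic against a constant is a simplification rather than the gem's actual `greater_than?` / `less_than?` implementation.

```ruby
# Sketch of a three-way comparison built from two one-sided checks.
# The cutoff-based checks are hypothetical stand-ins, not the gem's tests.
module ThreeWaySketch
  # One-tailed critical t value for alpha = 0.05 with 4 degrees of freedom.
  # Assumed because both samples below have 5 values; Welch's degrees of
  # freedom are then at least 4, which makes this cutoff conservative.
  CRITICAL_T = 2.132

  module_function

  def mean(values)
    values.sum.to_f / values.size
  end

  # Unbiased sample variance (divides by n - 1).
  def variance(values)
    m = mean(values)
    values.sum { |v| (v - m)**2 } / (values.size - 1)
  end

  # Welch's t statistic for the difference of means (a minus b).
  def welch_t(a, b)
    standard_error = Math.sqrt(variance(a) / a.size + variance(b) / b.size)
    (mean(a) - mean(b)) / standard_error
  end

  def greater?(a, b)
    welch_t(a, b) > CRITICAL_T
  end

  def less?(a, b)
    welch_t(a, b) < -CRITICAL_T
  end

  # Same if/elsif/else structure as the method in the diff.
  def compare(a, b)
    if greater?(a, b)
      1
    elsif less?(a, b)
      -1
    else
      0
    end
  end
end

control   = [10, 12, 11, 13, 12] # mean = 11.6
treatment = [15, 17, 16, 18, 14] # mean = 16.0

results = [
  ThreeWaySketch.compare(control, treatment),
  ThreeWaySketch.compare(treatment, control),
  ThreeWaySketch.compare(control, control)
]
puts results.inspect # => [-1, 1, 0]
```

Because the lower-mean collection is the receiver in the first call, the result is `-1`; swapping the operands flips the sign, and comparing a collection with itself yields `0`.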
