Overview
To calculate an average (mean) in Rust, normally something like this is used:
let a = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let avg = a.iter().sum::<f64>() / a.len() as f64;
That's fine for short vectors/iterators. However, as iterators get long, floating-point rounding error accumulates in the running sum. For example:
let a = vec![1.96; 1024];
let avg = a.iter().sum::<f64>() / a.len() as f64;
assert_eq!(avg, 1.96); // panic!
// assertion `left == right` failed
// left: 1.9600000000000248
// right: 1.96
An iterative (running) mean can reduce this error by updating the average one element at a time instead of accumulating the full sum first: http://www.heikohoffmann.de/htmlthesis/node134.html
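For reference, the running-mean update follows directly from the definition of the mean. This is a sketch in my own notation (with $\bar{x}_n$ assumed to denote the mean of the first $n$ elements $x_1, \dots, x_n$), not copied from the linked page:

```latex
\bar{x}_n
  = \frac{1}{n} \sum_{i=1}^{n} x_i
  = \frac{(n-1)\,\bar{x}_{n-1} + x_n}{n}
  = \bar{x}_{n-1} + \frac{x_n - \bar{x}_{n-1}}{n},
\qquad \bar{x}_1 = x_1 .
```

The last form is the per-element update applied in the example implementation further down: each step adds a small correction to the current mean rather than growing an ever-larger sum.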
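Additionally, averaging iterators that go through filtering operations is less trivial: the element count is no longer known up front, so the sum and the count have to be tracked together in a single pass.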
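As a rough sketch of what the sum-based approach looks like once a filter is involved: the length of the filtered sequence is not available, so `fold` has to carry both the sum and the count. `filtered_mean` is a hypothetical helper name used only for illustration, and this version still has the rounding behavior of a plain sum.

```rust
/// Hypothetical helper: mean of the values that satisfy `keep`,
/// or `None` if no value passes the filter.
fn filtered_mean<I, P>(values: I, keep: P) -> Option<f64>
where
    I: IntoIterator<Item = f64>,
    P: FnMut(&f64) -> bool,
{
    // Track the sum and the count together, since `len()` is not
    // available on a filtered iterator.
    let (sum, count) = values
        .into_iter()
        .filter(keep)
        .fold((0.0_f64, 0_usize), |(sum, count), x| (sum + x, count + 1));

    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}
```

With an `average()` adapter, the same call site would reduce to something like `values.into_iter().filter(keep).average()`.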
Other Languages
Both C# and Kotlin have this built into their standard library:
- C#: Generic over the iterator type, but returns f32 or f64
- Kotlin: Generic over the iterator type, but always returns f64
Example Rust Implementation
pub trait AverageExt<T>: Iterator<Item = T> {
    /// Returns the arithmetic mean as `f64`, or `None` for an empty iterator.
    fn average(self) -> Option<f64>;
}

impl<T, I> AverageExt<T> for I
where
    I: Iterator<Item = T>,
    T: Into<f64>,
{
    fn average(self) -> Option<f64> {
        let mut iter = self;
        if let Some(first) = iter.next() {
            let mut count: usize = 1;
            let mut avg: f64 = first.into();
            for num in iter {
                count += 1;
                // Running-mean update: move `avg` toward `num` by 1/count of the gap.
                let num: f64 = num.into();
                avg += (num - avg) / (count as f64);
            }
            Some(avg)
        } else {
            None
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn f() {
        let a = vec![1.0, 2.0, 3.0, 4.0, 5.0];
        let b = a.into_iter().average().unwrap();
        assert_eq!(b, 3.0);

        // Example 1 where average with sum does not work
        let a = vec![1e10 + 2.5; 1024 * 1024];
        let s1 = a.iter().sum::<f64>() / a.len() as f64;
        let s2 = a.into_iter().average().unwrap();
        assert_eq!(s1, s2);
        // thread 'tests::f' (3950386) panicked at src/lib.rs:47:9:
        // assertion `left == right` failed
        // left: 10000000002.214748
        // right: 10000000002.5

        // Example 2 where average with sum does not work
        let a = vec![1.96; 1024];
        let s1 = a.iter().sum::<f64>() / a.len() as f64;
        let s2 = a.into_iter().average().unwrap();
        assert_eq!(s1, s2);
        // thread 'tests::f' (3952567) panicked at src/lib.rs:53:9:
        // assertion `left == right` failed
        // left: 1.9600000000000248
        // right: 1.96

        // Implementation above is generic over input type
        let a: Vec<i32> = vec![1, 2, 3, 4, 5];
        let b = a.into_iter().average().unwrap();
        assert_eq!(b, 3.0);
    }
}

The only related issue I could find is #1030, but that one seems much more specific to statistics. A basic average method, by contrast, was considered general enough for other languages to include in their standard libraries.
I can open a pull request if there is interest in this method. If we go forward with it, one open question is whether to always return f64 or to make the return type generic over f32 and f64.
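To make the second option concrete, here is one possible shape for a generic return type. This is only a sketch under my own assumptions: `MeanFloat`, `from_count`, `AverageAsExt`, and `average_as` are hypothetical names invented for illustration, not an existing or agreed API.

```rust
use std::ops::{Add, Div, Sub};

/// Hypothetical helper trait: float types that can hold a running mean.
pub trait MeanFloat:
    Copy + Add<Output = Self> + Sub<Output = Self> + Div<Output = Self>
{
    /// Converts the element count into the float type.
    fn from_count(n: usize) -> Self;
}

impl MeanFloat for f32 {
    fn from_count(n: usize) -> Self {
        n as f32
    }
}

impl MeanFloat for f64 {
    fn from_count(n: usize) -> Self {
        n as f64
    }
}

/// Hypothetical extension trait: the caller picks the output float type `F`.
pub trait AverageAsExt: Iterator + Sized {
    fn average_as<F>(self) -> Option<F>
    where
        F: MeanFloat,
        Self::Item: Into<F>,
    {
        let mut iter = self;
        // An empty iterator has no mean.
        let mut avg: F = iter.next()?.into();
        let mut count: usize = 1;
        for x in iter {
            count += 1;
            let x: F = x.into();
            // Same running-mean update as above, in the chosen float type.
            avg = avg + (x - avg) / F::from_count(count);
        }
        Some(avg)
    }
}

impl<I: Iterator> AverageAsExt for I {}
```

With this shape the output type is chosen at the call site, e.g. `iter.average_as::<f32>()`. One trade-off is that std has no `From<i32> for f32`, so integer inputs wider than 16 bits would only work with `f64` here, and callers would usually need a turbofish or a type annotation.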