Commit c43338f

committed
update readme
1 parent 82062d2 commit c43338f

2 files changed: +43 -40 lines changed

README.md

Lines changed: 36 additions & 36 deletions
@@ -114,42 +114,42 @@ estimated weighted Jaccard: 0.004638671875
 
 Test for true weighted Jaccard from 0.005 to 0.98. Output for ERS:
 ```bash
-true weighted Jaccard: 0.9801980198019894
-estimated weighted Jaccard: 0.979248046875
-true weighted Jaccard: 0.923076923076927
-estimated weighted Jaccard: 0.916259765625
-true weighted Jaccard: 0.8691588785046818
-estimated weighted Jaccard: 0.870361328125
-true weighted Jaccard: 0.8181818181818254
-estimated weighted Jaccard: 0.815185546875
-true weighted Jaccard: 0.7391304347826214
-estimated weighted Jaccard: 0.728271484375
-true weighted Jaccard: 0.6666666666666645
-estimated weighted Jaccard: 0.661865234375
-true weighted Jaccard: 0.6000000000000002
-estimated weighted Jaccard: 0.602783203125
-true weighted Jaccard: 0.538461538461545
-estimated weighted Jaccard: 0.54296875
-true weighted Jaccard: 0.48148148148149517
-estimated weighted Jaccard: 0.4736328125
-true weighted Jaccard: 0.4285714285714275
-estimated weighted Jaccard: 0.42724609375
-true weighted Jaccard: 0.3793103448275884
-estimated weighted Jaccard: 0.373291015625
-true weighted Jaccard: 0.3333333333333343
-estimated weighted Jaccard: 0.320068359375
-true weighted Jaccard: 0.2500000000000003
-estimated weighted Jaccard: 0.241943359375
-true weighted Jaccard: 0.17647058823529319
-estimated weighted Jaccard: 0.187255859375
-true weighted Jaccard: 0.11111111111111192
-estimated weighted Jaccard: 0.110595703125
-true weighted Jaccard: 0.05263157894736851
-estimated weighted Jaccard: 0.04638671875
-true weighted Jaccard: 0.025641025641025394
-estimated weighted Jaccard: 0.026611328125
-true weighted Jaccard: 0.00502512562814068
-estimated weighted Jaccard: 0.00537109375
+true weighted Jaccard: 0.980198019801974
+estimated weighted Jaccard: 0.981689453125
+true weighted Jaccard: 0.9230769230769282
+estimated weighted Jaccard: 0.930419921875
+true weighted Jaccard: 0.869158878504667
+estimated weighted Jaccard: 0.873779296875
+true weighted Jaccard: 0.818181818181824
+estimated weighted Jaccard: 0.813232421875
+true weighted Jaccard: 0.7391304347826144
+estimated weighted Jaccard: 0.743408203125
+true weighted Jaccard: 0.6666666666666666
+estimated weighted Jaccard: 0.6650390625
+true weighted Jaccard: 0.6000000000000038
+estimated weighted Jaccard: 0.593994140625
+true weighted Jaccard: 0.5384615384615451
+estimated weighted Jaccard: 0.52978515625
+true weighted Jaccard: 0.4814814814814917
+estimated weighted Jaccard: 0.477783203125
+true weighted Jaccard: 0.42857142857142766
+estimated weighted Jaccard: 0.4443359375
+true weighted Jaccard: 0.3793103448275924
+estimated weighted Jaccard: 0.36767578125
+true weighted Jaccard: 0.33333333333333287
+estimated weighted Jaccard: 0.32275390625
+true weighted Jaccard: 0.24999999999999994
+estimated weighted Jaccard: 0.25390625
+true weighted Jaccard: 0.17647058823529382
+estimated weighted Jaccard: 0.1767578125
+true weighted Jaccard: 0.11111111111111145
+estimated weighted Jaccard: 0.112060546875
+true weighted Jaccard: 0.052631578947368515
+estimated weighted Jaccard: 0.05126953125
+true weighted Jaccard: 0.025641025641025373
+estimated weighted Jaccard: 0.030029296875
+true weighted Jaccard: 0.00502512562814075
+estimated weighted Jaccard: 0.005859375
 
 ```
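
For readers comparing the two columns in the README output, the "true" value is presumably the standard weighted Jaccard, the sum of per-component minima divided by the sum of per-component maxima. The sketch below computes it for the sparse `&[(u64, f64)]` input format named in `src/rejsmp.rs`. The function name `weighted_jaccard` is hypothetical and not taken from the repository, and the code assumes each id appears at most once per vector. The estimated values in the output all appear to be multiples of 1/4096, which suggests, but does not confirm, a sketch size of K = 4096.

```rust
use std::collections::HashMap;

/// Exact weighted Jaccard: sum_i min(a_i, b_i) / sum_i max(a_i, b_i).
/// Hypothetical helper; assumes non-negative weights and unique ids per vector.
pub fn weighted_jaccard(a: &[(u64, f64)], b: &[(u64, f64)]) -> f64 {
    let am: HashMap<u64, f64> = a.iter().copied().collect();
    let bm: HashMap<u64, f64> = b.iter().copied().collect();

    let mut num = 0.0;
    let mut den = 0.0;

    // Components present in `a` (and possibly in `b`).
    for (id, &wa) in &am {
        let wb = bm.get(id).copied().unwrap_or(0.0);
        num += wa.min(wb);
        den += wa.max(wb);
    }
    // Components present only in `b` contribute to the denominator alone.
    for (id, &wb) in &bm {
        if !am.contains_key(id) {
            den += wb;
        }
    }

    // Two all-zero vectors have no defined similarity; return 0.0 by convention.
    if den == 0.0 {
        0.0
    } else {
        num / den
    }
}
```

For example, with `a = [(0, 1.0), (1, 2.0)]` and `b = [(1, 1.0), (2, 3.0)]` the minima sum to 1 and the maxima sum to 6, so the result is 1/6.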

src/rejsmp.rs

Lines changed: 7 additions & 4 deletions
@@ -3,8 +3,10 @@
 //! Implements:
 //! - RS (Shrivastava 2016): Algorithm 1/3 with constant-time ISGREEN via
 //! integer-to-component and component-to-M maps.
-//! - ERS (Li & Li 2021): single shared stream r_t, early stopping, and a safe
-//! densification fallback using tabulation hashing.
+//! - ERS (Li & Li 2021): K independent fixed-length sequences
+//! r_{j,1..L} per hash position j; take the first green if any, otherwise mark
+//! empty; then densify empties by uniformly picking from the non-empty positions.
+//! All randomness is produced via tabulation hashing for consistency/speed.
 //!
 //! Inputs: sparse weighted vector `&[(u64, f64)]` where id ∈ [0, D) and weight ≥ 0.
 //! Randomness: purely via Tab32/Tab64 tabulation hashing (no stateful RNG required).
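
The ERS scheme described in the module comment (K fixed-length sequences, first green draw per position, then densification of empties) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `src/rejsmp.rs`: `mix64` is a SplitMix64-style mixer standing in for the Tab32/Tab64 tabulation hashing, `m` is an assumed dense array of per-component upper bounds with every weight at most `m[i]`, and the densification rule (hashed start position, then a forward scan) is one possible choice among several.

```rust
/// SplitMix64-style mixer; a stand-in here for tabulation hashing (assumption).
fn mix64(x: u64) -> u64 {
    let mut z = x.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Map 64 hash bits to a float in [0, 1) using the top 53 bits.
fn unit(h: u64) -> f64 {
    (h >> 11) as f64 / (1u64 << 53) as f64
}

/// Hypothetical ERS-style sketch: `k` positions, `l` draws per position.
/// Requires non-empty `m`, ids below `m.len()`, and weight <= m[id].
/// Each output entry is the accepted draw (j, t*) or a densified copy of one.
pub fn ers_sketch(v: &[(u64, f64)], m: &[f64], k: usize, l: usize) -> Vec<Option<(u32, u32)>> {
    // Prefix sums lay the components out as consecutive intervals of [0, total).
    let mut prefix = vec![0.0];
    let mut total = 0.0;
    for &mi in m {
        total += mi;
        prefix.push(total);
    }
    let mut weight = vec![0.0; m.len()];
    for &(id, w) in v {
        weight[id as usize] = w;
    }

    // Pass 1: for each position j, scan r_{j,1..L} and keep the first green draw.
    let mut out: Vec<Option<(u32, u32)>> = vec![None; k];
    for j in 0..k {
        for t in 0..l {
            let key = ((j as u64) << 32) ^ (t as u64);
            let r = unit(mix64(key)) * total;
            // Component i with prefix[i] <= r < prefix[i + 1] (clamped for rounding).
            let i = (prefix.partition_point(|&p| p <= r) - 1).min(m.len() - 1);
            // Green means the draw falls inside the first weight[i] of the interval.
            if r - prefix[i] < weight[i] {
                out[j] = Some((j as u32, t as u32));
                break;
            }
        }
    }

    // Pass 2: densify. Each empty position starts at a hashed donor index and
    // scans forward (wrapping) until it finds a position filled in pass 1.
    let filled = out.clone();
    for j in 0..k {
        if out[j].is_some() {
            continue;
        }
        let start = (mix64(j as u64 ^ 0xD0D0_D0D0) % k as u64) as usize;
        for step in 0..k {
            if let Some(d) = filled[(start + step) % k] {
                out[j] = Some(d);
                break;
            }
        }
    }
    out
}
```

Because the draws depend only on `(j, t)` and not on the input vector, two vectors see the same sequences, and a position collides when both accept the same draw. If no position finds a green draw at all, every entry stays `None`.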
@@ -157,10 +159,11 @@ impl RsWmh {
     }
 }
 
-/// ERS MinHash (AAAI Algorithm 2): K independent fixed-length random sequences.
+/// ERS (AAAI Algorithm 2): K independent fixed-length random sequences.
 /// For each j in 0..K, scan r_{j,1},...,r_{j,L}; take first green. If none, mark E.
 /// Then densify: replace each E by a deterministic donor via rotation.
-/// ID is derived from the ACCEPTING COMPONENT INDEX i (and j), not from t.
+/// ID is derived from the ACCEPTED DRAW (j, t*) in the fixed sequence, not just the component.
+/// Using a per-draw identity ensures two sets collide iff they accept the same r_{j,t*}.
 pub struct ErsWmh {
     index: RedGreenIndex,
     d: usize,
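
The per-draw identity introduced in the doc comment above can be illustrated with a small sketch. If the ID depended only on the accepting component `i`, two sets that accept different draws landing in the same component would match spuriously. Keying the ID on `(j, t*)` avoids that. The names `draw_id` and `estimate` are hypothetical and the 32/32-bit packing is an assumption; the repository may derive its IDs differently, for example by hashing.

```rust
/// Pack the accepted draw (j, t*) into one 64-bit ID.
/// Injective for u32 inputs, so equal IDs imply the same position and draw.
pub fn draw_id(j: u32, t_star: u32) -> u64 {
    ((j as u64) << 32) | t_star as u64
}

/// Estimated weighted Jaccard: the fraction of sketch positions whose IDs agree.
pub fn estimate(ids_a: &[u64], ids_b: &[u64]) -> f64 {
    assert_eq!(ids_a.len(), ids_b.len(), "sketches must use the same K");
    if ids_a.is_empty() {
        return 0.0;
    }
    let hits = ids_a.iter().zip(ids_b).filter(|(x, y)| x == y).count();
    hits as f64 / ids_a.len() as f64
}
```

With K positions the estimate is always a multiple of 1/K, which matches the granularity of the estimated values printed in the README output.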
