Commit 26b1e9a

Store the client ID in an additional plaintext file and report on regeneration
1 parent ca591c6 commit 26b1e9a

14 files changed

+1167
-5
lines changed

.dictionary

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,4 @@
-personal_ws-1.1 en 304 utf-8
+personal_ws-1.1 en 306 utf-8
 AAR
 AARs
 ABI
@@ -233,13 +233,15 @@ pdoc
 perrymcmanis
 pidcat
 pipenv
+plaintext
 polyfill
 pre
 prebuilt
 preinit
 profiler
 py
 pytest
+regenerations
 rethrow
 retransmission
 rfloor

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,9 @@
 
 [Full changelog](https://github.com/mozilla/glean/compare/v66.0.1...main)
 
+* General
+  * Store the client ID in an additional plaintext file and report on regeneration ([#3292](https://github.com/mozilla/glean/pull/3292))
+
 # v66.0.1 (2025-10-30)
 
 [Full changelog](https://github.com/mozilla/glean/compare/v66.0.0...v66.0.1)

docs/dev/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -40,4 +40,5 @@
 - [Debug Pings](core/internal/debug-pings.md)
 - [Upload mechanism](core/internal/upload.md)
 - [Implementations](core/internal/implementations.md)
+- [Client ID recovery](core/internal/client_id_recovery.md)
 - [API Documentation](api/index.md)

docs/dev/core/internal/client_id_recovery.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+# Client ID recovery
+
+Currently (2025-10-31, Glean v66) we see some unexplained Glean SDK database resets.
+These are noticeable in data as client ID regenerations:
+A client application with telemetry enabled, which has already sent data,
+regenerates its client ID on initialization and thus looks like a new client.
+
+That's undesirable and a bug.
+However, we have yet to track down the actual faulty code path.
+Until that bug is found and fixed, the Glean SDK provides an extra mitigation.
+
+From Glean v66.1.0 on, the SDK will store the client ID in a `client_id.txt` file in the provided data path.
+Any inconsistencies in that data compared to the database will be reported
+and, if applicable, the client ID restored.
+
+**Note:** Glean v66.1.0 will only report the inconsistency, but will not restore a recovered client ID.
+This allows us to measure the impact.
+The mitigation will be enabled in a later release ([bug 1996862](https://bugzilla.mozilla.org/show_bug.cgi?id=1996862)).
+
+The exact flow of decisions is depicted in the chart below.
+The implementation is in [`glean-core/src/core/mod.rs`](https://github.com/mozilla/glean/blob/HEAD/glean-core/src/core/mod.rs#L264).
+
+```mermaid
+flowchart TD
+A["Glean.init"] -->B
+B{client_id.txt exists?} -->|yes| C
+B -->|no| D
+C["(a) load file ID"] --> E
+D["load DB ID"] --> D3
+D3{DB ID empty} -->|yes| D4
+D3 -->|no| S
+D4["generate DB ID"] --> S
+E{valid file and ID?} -->|yes| H
+E -->|no| G
+G["(b) record file read error"] --> H
+H{"(c) DB size <= 0"} -->|yes| J
+H -->|no| F
+J["(d) record empty DB error
+report recovered ID: file ID"] --> Q
+F["load DB ID"] --> N
+L{file ID == DB ID} --> |yes| Z
+L -->|no| T
+N{DB ID empty?} -->|yes| O
+N -->|no| L
+O["(f) record regen error"] --> Q
+P["(g) record mismatch error
+report recovered ID: file ID"] --> S
+Q["(e) mitigation:
+set DB ID = file ID"] --> Z
+S["(h) write DB ID to file"] --> Z
+T{"DB ID == 'c0ffee'"}
+T -->|yes| U
+T -->|no| P
+U["(i) record c0ffee error
+report recovered ID: file ID"] --> Q
+
+Z(normal operation)
+```
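
The chart can also be read as a pure decision function. The Rust sketch below is an illustration only, not the code in `glean-core/src/core/mod.rs`. All type and function names are hypothetical, the file-read error branch (b) is left out, and the placeholder ID the chart calls 'c0ffee' is assumed to be `c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0`.

```rust
// Illustrative sketch only; every name in this block is hypothetical.

/// Assumed value of the placeholder ID the chart calls 'c0ffee'.
const PLACEHOLDER_ID: &str = "c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0";

/// The exception states named in the chart.
#[derive(Debug, PartialEq)]
enum Exception {
    EmptyDb,          // (d)
    RegenDb,          // (f)
    C0ffeeInDb,       // (i)
    ClientIdMismatch, // (g)
}

/// What happens to the two copies of the ID once the checks are done.
#[derive(Debug, PartialEq)]
enum Action {
    /// File and database agree: nothing to do.
    Nothing,
    /// (e) mitigation: set DB ID = file ID.
    RestoreFromFile,
    /// (h) write the DB ID to the file (generating it first if the DB has none).
    WriteDbIdToFile,
}

#[derive(Debug, PartialEq)]
struct Decision {
    exception: Option<Exception>,
    /// Whether the file ID is reported as the recovered client ID.
    report_recovered_id: bool,
    action: Action,
}

impl Decision {
    fn new(exception: Option<Exception>, report_recovered_id: bool, action: Action) -> Self {
        Decision { exception, report_recovered_id, action }
    }
}

/// `file_id` is `None` when `client_id.txt` does not exist;
/// `db_id` is `None` when the database holds no client ID.
fn decide(file_id: Option<&str>, db_size: u64, db_id: Option<&str>) -> Decision {
    let file_id = match file_id {
        Some(id) => id,
        // No file yet: keep (or generate) the DB ID and write it out.
        None => return Decision::new(None, false, Action::WriteDbIdToFile),
    };

    // (c) An empty database next to an existing file suggests a reset.
    if db_size == 0 {
        return Decision::new(Some(Exception::EmptyDb), true, Action::RestoreFromFile);
    }

    match db_id {
        // The database has data but no client ID.
        None => Decision::new(Some(Exception::RegenDb), false, Action::RestoreFromFile),
        Some(id) if id == file_id => Decision::new(None, false, Action::Nothing),
        Some(id) if id == PLACEHOLDER_ID => {
            Decision::new(Some(Exception::C0ffeeInDb), true, Action::RestoreFromFile)
        }
        Some(_) => Decision::new(Some(Exception::ClientIdMismatch), true, Action::WriteDbIdToFile),
    }
}
```

Under this reading, a release that only reports (as the note above describes for v66.1.0) would record `exception` and the recovered ID but skip `Action::RestoreFromFile`.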

docs/dev/core/internal/index.md

Lines changed: 1 addition & 0 deletions
@@ -21,3 +21,4 @@ This includes:
 * [Debug Pings](debug-pings.md)
 * [Upload mechanism](upload.md)
 * [Implementations](implementations.md)
+* [Client ID recovery](client_id_recovery.md)

glean-core/metrics.yaml

Lines changed: 89 additions & 0 deletions
@@ -1158,3 +1158,92 @@ glean.health:
     expires: never
     send_in_pings:
       - health
+
+  exception_state:
+    type: string
+    lifetime: ping
+    description: |
+      An exceptional state was detected upon trying to load the database.
+
+      Valid options are:
+      - empty-db
+      - regen-db
+      - c0ffee-in-db
+      - client-id-mismatch
+    notification_emails:
+
+
+    bugs:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757
+    data_reviews:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757#c2
+    data_sensitivity:
+      - technical
+    expires: never
+    send_in_pings:
+      - health
+
+  recovered_client_id:
+    type: uuid
+    lifetime: ping
+    description: |
+      A client_id recovered from a `client_id.txt` file on disk.
+      Only expected to have a value for the exception states `empty-db`, `c0ffee-in-db` and `client-id-mismatch`.
+      See `exception_state` for different exception states when this can happen.
+    notification_emails:
+
+
+    bugs:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757
+    data_reviews:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757#c2
+    data_sensitivity:
+      - technical
+    expires: never
+    send_in_pings:
+      - health
+
+  file_read_error:
+    type: labeled_counter
+    lifetime: ping
+    description: |
+      Count of different errors that happened when trying to read the `client_id.txt` file from disk.
+    notification_emails:
+
+
+    bugs:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757
+    data_reviews:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757#c2
+    data_sensitivity:
+      - technical
+    expires: never
+    send_in_pings:
+      - health
+    labels:
+      - parse
+      - permission-denied
+      - io
+      - c0ffee-in-file
+
+  file_write_error:
+    type: labeled_counter
+    lifetime: ping
+    description: |
+      Count of different errors that happened when trying to write the `client_id.txt` file to disk.
+    notification_emails:
+
+
+    bugs:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757
+    data_reviews:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1994757#c2
+    data_sensitivity:
+      - technical
+    expires: never
+    send_in_pings:
+      - health
+    labels:
+      - permission-denied
+      - io
+      - not-found
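
The labels of `file_read_error` and `file_write_error` suggest a classification of I/O failures and of the file's contents. This diff does not show how Glean performs that mapping, so the following Rust sketch is only one plausible reading. The function names, the whitespace trimming, the fallback to `io`, and the placeholder value `c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0` are all assumptions.

```rust
use std::io::ErrorKind;

/// Assumed value of the placeholder client ID ('c0ffee').
const PLACEHOLDER_ID: &str = "c0ffeec0-ffee-c0ff-eec0-ffeec0ffeec0";

/// Hypothetical mapping from an I/O error kind to a `file_read_error` label.
fn read_error_label(kind: ErrorKind) -> &'static str {
    match kind {
        ErrorKind::PermissionDenied => "permission-denied",
        _ => "io",
    }
}

/// Hypothetical mapping from an I/O error kind to a `file_write_error` label.
fn write_error_label(kind: ErrorKind) -> &'static str {
    match kind {
        ErrorKind::NotFound => "not-found",
        ErrorKind::PermissionDenied => "permission-denied",
        _ => "io",
    }
}

/// Checks the canonical 8-4-4-4-12 hexadecimal UUID layout.
fn looks_like_uuid(s: &str) -> bool {
    s.len() == 36
        && s.bytes().enumerate().all(|(i, b)| match i {
            8 | 13 | 18 | 23 => b == b'-',
            _ => b.is_ascii_hexdigit(),
        })
}

/// Returns the trimmed ID, or the `file_read_error` label to record.
fn check_file_contents(contents: &str) -> Result<&str, &'static str> {
    let id = contents.trim();
    if id == PLACEHOLDER_ID {
        // The placeholder is a well-formed UUID, so test for it first.
        Err("c0ffee-in-file")
    } else if !looks_like_uuid(id) {
        Err("parse")
    } else {
        Ok(id)
    }
}
```

A caller would pass `err.kind()` from a failed `std::fs::read_to_string` or `std::fs::write` to the matching function and increment the counter for the returned label.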

glean-core/rlb/src/net/mod.rs

Lines changed: 1 addition & 0 deletions
@@ -124,6 +124,7 @@ impl UploadManager {
             )
             .is_err()
         {
+            log::trace!("glean.upload thread running. Not starting another one.");
             return;
         }
 

glean-core/rlb/tests/health_ping.rs

Lines changed: 5 additions & 0 deletions
@@ -124,6 +124,11 @@ fn test_pre_post_init_health_pings_exist() {
             .count()
     );
 
+    let exception_state = &preinits[0].1["metrics"]["string"]["glean.health.exception_state"];
+    assert_eq!(&JsonValue::Null, exception_state);
+    let exception_uuid = &preinits[0].1["metrics"]["uuid"]["glean.health.recovered_client_id"];
+    assert_eq!(&JsonValue::Null, exception_uuid);
+
     // An initial preinit "health" ping will show no db file sizes
     let load_sizes = preinits[0].1["metrics"]["object"]["glean.database.load_sizes"]
         .as_object()
Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
+// This Source Code Form is subject to the terms of the Mozilla Public
+// License, v. 2.0. If a copy of the MPL was not distributed with this
+// file, You can obtain one at https://mozilla.org/MPL/2.0/.
+
+//! This integration test should model how the RLB is used when embedded in another Rust application
+//! (e.g. FOG/Firefox Desktop).
+//!
+//! We write a single test scenario per file to avoid any state keeping across runs
+//! (different files run as different processes).
+
+mod common;
+
+use std::{fs, io::Read};
+
+use crossbeam_channel::{bounded, Sender};
+use flate2::read::GzDecoder;
+use glean::{net, ConfigurationBuilder};
+use serde_json::Value as JsonValue;
+
+// Define a fake uploader that reports when and what it uploads.
+#[derive(Debug)]
+struct ReportingUploader {
+    sender: Sender<JsonValue>,
+}
+
+impl net::PingUploader for ReportingUploader {
+    fn upload(&self, upload_request: net::CapablePingUploadRequest) -> net::UploadResult {
+        let upload_request = upload_request.capable(|_| true).unwrap();
+        let body = upload_request.body;
+        let decode = |body: Vec<u8>| {
+            let mut gzip_decoder = GzDecoder::new(&body[..]);
+            let mut s = String::with_capacity(body.len());
+
+            gzip_decoder
+                .read_to_string(&mut s)
+                .ok()
+                .map(|_| &s[..])
+                .or_else(|| std::str::from_utf8(&body).ok())
+                .and_then(|payload| serde_json::from_str(payload).ok())
+                .unwrap()
+        };
+
+        self.sender.send(decode(body)).unwrap();
+        net::UploadResult::http_status(200)
+    }
+}
+
+/// Test scenario: Write a client ID to the backup file and check that the `empty-db` state is reported after initialization.
+#[test]
+fn test_pre_post_init_health_pings_exist() {
+    common::enable_test_logging();
+
+    // Create a custom configuration to use a reporting uploader.
+    let dir = tempfile::tempdir().unwrap();
+    let tmpname = dir.path().to_path_buf();
+
+    let client_id = "e03cc2de-bc8b-4f9c-862f-b474d910899e";
+
+    // We write a random but fixed client ID, without there being a Glean database.
+    let clientid_txt = tmpname.join("client_id.txt");
+    fs::write(&clientid_txt, client_id.as_bytes()).unwrap();
+
+    let (tx, rx) = bounded(1);
+    let cfg = ConfigurationBuilder::new(true, tmpname.clone(), "health-ping-test")
+        .with_server_endpoint("invalid-test-host")
+        .with_use_core_mps(false)
+        .with_uploader(ReportingUploader { sender: tx })
+        .build();
+    common::initialize(cfg);
+
+    glean_core::glean_test_destroy_glean(false, Some(tmpname.display().to_string()));
+
+    // Check for the initialization pings.
+    // Wait for the ping to arrive.
+    let payload = rx.recv().unwrap();
+
+    let exception_state = &payload["metrics"]["string"]["glean.health.exception_state"];
+    assert_eq!("empty-db", exception_state);
+    let exception_uuid = &payload["metrics"]["uuid"]["glean.health_recovered_client_id"];
+    assert_eq!(&JsonValue::Null, exception_uuid);
+
+    // TODO(bug 1996862): We don't run the mitigation yet.
+    //let ping_client_id = &payload["client_info"]["client_id"];
+    //assert_eq!(client_id, ping_client_id);
+}
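
The test above relies on a common pattern: the fake uploader hands each payload to a bounded channel, and the test body blocks on the receiving end until a ping arrives. The sketch below shows that pattern with the standard library only. It uses `std::sync::mpsc::sync_channel` in place of `crossbeam_channel::bounded`, and the `Uploader` trait and `upload_in_background` helper are hypothetical stand-ins, not part of Glean's `net` module.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Hypothetical stand-in for an uploader interface; not Glean's `net::PingUploader`.
trait Uploader: Send {
    /// Receives a payload and returns an HTTP-like status code.
    fn upload(&self, body: String) -> u16;
}

/// Forwards every uploaded payload to a channel so a test can inspect it.
struct ReportingUploader {
    sender: SyncSender<String>,
}

impl Uploader for ReportingUploader {
    fn upload(&self, body: String) -> u16 {
        self.sender.send(body).unwrap();
        200
    }
}

/// Simulates an SDK uploading on a background thread and
/// returns the receiving end of the channel.
fn upload_in_background(payload: &str) -> Receiver<String> {
    let (tx, rx) = sync_channel(1);
    let uploader: Box<dyn Uploader> = Box::new(ReportingUploader { sender: tx });
    let payload = payload.to_string();
    thread::spawn(move || {
        uploader.upload(payload);
    });
    rx
}
```

Calling `recv()` on the returned receiver blocks until the background thread has sent the payload, which is why the test needs no sleeps or polling.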
