Description
Update: the algorithm below describes background tasks, but after discussion on a PR for this, we think we can avoid that and instead trigger a retry when receive_keys_query_response runs. I updated #3546 to specify that we should do this during receive_keys_query_response as well as on a schedule.
When we receive a to-device message containing a megolm session, fetch and store the sender data for it.
Part of #3544 which is part of Invisible Crypto. Depends on #3542 because it holds the information in the data structures created there.
This task includes adding a SessionManager to prevent clashes between concurrent tasks.
Preventing clashes
Cross-process, we are protected by the cross-process lock: the entire OlmMachine will be reloaded if some other process takes the lock. But within a process, we need a way to prevent two tasks from updating the same session at the same time.
Possibly something like this (but could do with some validation from the Rust team):
struct SessionManager {
    // Sessions that some task is currently processing
    sessions_being_processed: HashSet<OwnedSessionId>,
}
impl SessionManager {
    // If None, another task holds this session: give up on it
    fn try_lock(&self, session_id: &SessionId) -> Option<SessionGuard>;
}
Pass the SessionGuard in to any async tasks you spawn, i.e. you keep hold of the lock even across the async boundary.
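One way the SessionManager idea could look in practice is sketched below. This is an illustration only, not the SDK's implementation: it assumes session IDs can be represented as plain strings (the `OwnedSessionId` alias is a stand-in), wraps the set in an `Arc<Mutex<..>>` so that the guard can be moved into a spawned task, and releases the session when the guard is dropped.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the `OwnedSessionId` type named in the issue.
type OwnedSessionId = String;

/// Tracks which sessions are currently being processed by some task.
#[derive(Clone, Default)]
pub struct SessionManager {
    sessions_being_processed: Arc<Mutex<HashSet<OwnedSessionId>>>,
}

/// Held while a task works on a session; releases the session on drop.
pub struct SessionGuard {
    session_id: OwnedSessionId,
    sessions_being_processed: Arc<Mutex<HashSet<OwnedSessionId>>>,
}

impl SessionManager {
    /// Returns `None` if another task already holds this session.
    pub fn try_lock(&self, session_id: &str) -> Option<SessionGuard> {
        let mut sessions = self.sessions_being_processed.lock().unwrap();
        // `insert` returns false when the ID was already in the set.
        if sessions.insert(session_id.to_owned()) {
            Some(SessionGuard {
                session_id: session_id.to_owned(),
                sessions_being_processed: Arc::clone(&self.sessions_being_processed),
            })
        } else {
            None
        }
    }
}

impl Drop for SessionGuard {
    fn drop(&mut self) {
        // Release the session so that a later task can take it. Recover from
        // a poisoned mutex instead of panicking inside `drop`.
        let mut sessions = self
            .sessions_being_processed
            .lock()
            .unwrap_or_else(|e| e.into_inner());
        sessions.remove(&self.session_id);
    }
}
```

Because the guard owns everything it needs, it can be moved into a spawned task and the session stays locked until that task finishes or drops the guard. The inner mutex is only held for the duration of a set insert or remove, never across an await point.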
Algorithm
When we receive a to-device message establishing a megolm session:
A (start)
- [take the lock] TODO: if we fail to get it, can we just bail out here, dropping the information in this to-device message? What if it contained device info that we need? How will this work if someone maliciously sent us a duplicate of someone else's session ID?
- Does the to-device message contain the device_keys property from MSC4147? Yes -> D, No -> B
B (no device info in to-device message)
We need to find the device details. If we have them in the store, we should use them immediately (rather than waiting for a background task to pick up the session for further processing).
- Does the locally-cached (in the store) devices list contain a device with the curve key of the sender of the to-device message? Yes->D No->C
C (no device info locally)
- Save this session into the store with no device info, marked as not-legacy, next_retry_time_ms = now (in case the app gets killed) and retry_count = 0.
- ↗️ Return, and kick off an async task [keep the lock]: run OlmMachine::get_user_devices (which waits for /keys/query to complete, then fetches all device info for the user.) Then it should find a device with the curve key we know we used to decrypt the to-device message (same as in get_verification_state. Probably we want to move the impl of get_verification_state into another function we call now, and get_verification_state will look up what we stored instead of calculating it at the time it is called).
  - If the device is there, -> D
  - If we still don’t have the device info, -> 😴 Wait to see whether we get device info later. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
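The issue does not say which backoff algorithm is meant. As an illustration only, the retry bookkeeping could be sketched as capped exponential backoff. The field names retry_count and next_retry_time_ms come from the text above; the `RetryDetails` type, its methods, and the base delay and cap are assumptions made for this example.

```rust
/// Retry bookkeeping stored alongside a session. The two field names come
/// from the issue; the type itself is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RetryDetails {
    pub retry_count: u32,
    pub next_retry_time_ms: u64,
}

/// Assumed parameters: the issue does not specify them.
const BASE_DELAY_MS: u64 = 1_000;
const MAX_DELAY_MS: u64 = 60 * 60 * 1_000;

impl RetryDetails {
    /// State for a freshly stored session: retry_count = 0 and
    /// next_retry_time_ms = now, so it is picked up even if the app is killed.
    pub fn new(now_ms: u64) -> Self {
        Self { retry_count: 0, next_retry_time_ms: now_ms }
    }

    /// Record a failed attempt and schedule the next one.
    pub fn record_failure(&mut self, now_ms: u64) {
        self.retry_count = self.retry_count.saturating_add(1);
        // The delay doubles with each failure, up to MAX_DELAY_MS.
        let exponent = (self.retry_count - 1).min(32);
        let delay = BASE_DELAY_MS
            .saturating_mul(1u64 << exponent)
            .min(MAX_DELAY_MS);
        self.next_retry_time_ms = now_ms.saturating_add(delay);
    }

    /// Whether a background job should process this session now.
    pub fn is_due(&self, now_ms: u64) -> bool {
        now_ms >= self.next_retry_time_ms
    }
}
```

With these assumed parameters, the first failure schedules a retry one second later, the second two seconds later, and so on, with at most an hour between attempts.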
D (we have device info)
- Is the device info cross-signed?
No -> 😴 Wait to see if the device becomes cross-signed soon. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
Yes -> E
E (we have cross-signed device info)
- Do we have the cross-signing key for this user? Yes -> G No -> F
F (we have cross-signed device info, but no cross-signing keys)
- Upsert the session with the (cross-signed) device info we have, still marked as not-legacy. Set next_retry_time_ms = now and retry_count = 0.
- ↗️ Return, and kick off an async task [keep the lock]: run OlmMachine::get_identity (which waits for /keys/query to complete, then fetches this user's cross-signing key from the store.) If we still don’t have a cross-signing key -> 😴 Wait to see if we get one soon. Do nothing; let the background job pick it up [drop the lock]
G (we have cross-signing key)
- Does the cross-signing key match that used to sign the device info?
Yes -> H
No -> 😴 Wait to see if the cross-signing key is updated soon. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
H (cross-signing key matches that used to sign the device info!)
- Is the signature in the device info (ed25519:<ssk_id>) valid (SelfSigningPubKey::verify_device_keys)?
Yes -> J
No -> ❗Session is invalid: drop it from the store and forget it (also the device???)
J (device info is verified by matching cross-signing key)
- Look up the MXID and MSK for the user sending the to-device message.
- Decide the MSK trust level based on whether we have verified this user (matrix_sdk_crypto::identities::user::UserIdentity::is_verified).
- Upsert the session including the MXID, MSK and trust level. Remove the device info and retries since we don't need them.
- Add this information to the sender_data.
- [drop the lock]
Note: the sender data may become out-of-date if we later verify the user. We have no plans to update it if so.
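Steps A to J can be read as a chain of checks in which the first failing check decides what happens next. The sketch below expresses that chain as a pure function, which may be easier to review and unit-test than the prose. All type, field, and variant names are hypothetical; in the real code, each boolean would come from the store lookups and signature checks described above.

```rust
/// What we know at the time we process the session. All names here are
/// hypothetical and only mirror the questions asked in steps A to H.
#[derive(Debug, Clone, Copy, Default)]
pub struct Knowledge {
    /// A/B: device info came with the message or was found in the store.
    pub device_info_known: bool,
    /// D: the device info is cross-signed.
    pub device_cross_signed: bool,
    /// E: we have the cross-signing key for this user.
    pub have_cross_signing_key: bool,
    /// G: the cross-signing key matches the one that signed the device info.
    pub key_matches_signer: bool,
    /// H: the signature in the device info is valid.
    pub signature_valid: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// C: store the session without device info and fetch the user's devices.
    FetchDeviceInfo,
    /// D or G: bump retry_count, set next_retry_time_ms, and wait.
    RetryLater,
    /// F: store the device info and fetch the user's cross-signing key.
    FetchCrossSigningKey,
    /// H: the session is invalid; drop it from the store.
    DropSession,
    /// J: store the MXID, MSK and trust level as sender data.
    StoreSenderData,
}

/// Decide the next step; the first failing check wins.
pub fn next_step(k: Knowledge) -> Outcome {
    if !k.device_info_known {
        return Outcome::FetchDeviceInfo;
    }
    if !k.device_cross_signed {
        return Outcome::RetryLater;
    }
    if !k.have_cross_signing_key {
        return Outcome::FetchCrossSigningKey;
    }
    if !k.key_matches_signer {
        return Outcome::RetryLater;
    }
    if !k.signature_valid {
        return Outcome::DropSession;
    }
    Outcome::StoreSenderData
}
```

Keeping the decision separate from the I/O means that a retry, whether triggered by a schedule or by a /keys/query response, could re-run the same function with fresh inputs.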