simplex-chat
diff --git a/spec/TOPICS.md b/spec/TOPICS.md
Lines changed: 6 additions & 0 deletions
diff --git a/spec/modules/Simplex/Messaging/Client.md b/spec/modules/Simplex/Messaging/Client.md
Lines changed: 117 additions & 0 deletions
diff --git a/spec/modules/Simplex/Messaging/Client/Agent.md b/spec/modules/Simplex/Messaging/Client/Agent.md
Lines changed: 67 additions & 0 deletions
@@ -12,4 +12,10 @@
 
 - **Certificate chain trust model**: ChainCertificates (Shared.hs) defines 0–4 cert chain semantics, used by both Client.hs (validateCertificateChain) and Server.hs (validateClientCertificate, SNI credential switching). The 4-length case skipping index 2 (operator cert) and the FQHN-disabled x509validate are decisions that span the entire transport security model.
 
+- **SMP proxy protocol flow**: The PRXY/PFWD/RFWD proxy protocol involves Client.hs (proxySMPCommand with 10 scenarios: one success and 9 errors; forwardSMPTransmission with sessionSecret encryption), Protocol.hs (command types, version-dependent encoding), Transport.hs (proxiedSMPRelayVersion cap, proxyServer flag disabling block encryption). The double encryption (client-relay via PFWD + proxy-relay via RFWD), combined timeout (tcpConnect + tcpTimeout), nonce/reverseNonce pairing, and version downgrade logic are not visible from any single module.
+
+- **Service certificate subscription model**: Service subscriptions (SUBS/NSUBS) and per-queue subscriptions (SUB/NSUB) coexist with complex state transitions. Client/Agent.hs manages dual active/pending subscription maps with session-aware cleanup. Protocol.hs defines useServiceAuth (only NEW/SUB/NSUB). Client.hs implements authTransmission with dual signing (entity key over cert hash + transmission, service key over transmission only). Transport.hs handles the service certificate handshake extension (v16+). The full subscription lifecycle — from DBService credentials through handshake to service subscription to disconnect/reconnect — spans all four modules.
+
+- **Two agent layers**: Client/Agent.hs ("small agent") is used only in servers — SMP proxy and notification server — to manage client connections to other SMP servers. Agent.hs + Agent/Client.hs ("big agent") is used in client applications. Both manage SMP client connections with subscription tracking and reconnection, but the big agent adds the full messaging agent layer (connections, double ratchet, file transfer). When documenting Agent/Client.hs, Client/Agent.hs should be reviewed for shared patterns and differences.
+
 - **Handshake protocol family**: SMP (Transport.hs), NTF (Notifications/Transport.hs), and XFTP (FileTransfer/Transport.hs) all have handshake protocols with the same structure (version negotiation + session binding + key exchange) but different feature sets. NTF is a strict subset. XFTP doesn't use the TLS handshake at all (HTTP2 layer). The shared types (THandle, THandleParams, THandleAuth) mean changes to the handshake infrastructure affect all three protocols.
@@ -0,0 +1,117 @@
+# Simplex.Messaging.Client
+
+> Generic protocol client: connection management, command sending/receiving, batching, proxy protocol, reconnection.
+
+**Source**: [`Client.hs`](../../../../src/Simplex/Messaging/Client.hs)
+
+**Protocol spec**: [`protocol/simplex-messaging.md`](../../../../protocol/simplex-messaging.md) — SimpleX Messaging Protocol.
+
+## Overview
+
+This module implements the client side of the `Protocol` typeclass — connecting to servers, sending commands, receiving responses, and managing connection lifecycle. It is generic over `Protocol v err msg`, instantiated for SMP as `SMPClient` (= `ProtocolClient SMPVersion ErrorType BrokerMsg`). The SMP proxy protocol (PRXY/PFWD/RFWD) is also implemented here.
+
+## Four concurrent threads per connection
+
+`getProtocolClient` launches four threads via `raceAny_`:
+- `send`: reads from `sndQ` (TBQueue) and writes to TLS
+- `receive`: reads from TLS and writes to `rcvQ` (TBQueue), updates `lastReceived` timestamp
+- `process`: reads from `rcvQ` and dispatches to response variables or `msgQ`
+- `monitor`: periodic ping loop (only when `smpPingInterval > 0`)
+
+When any thread terminates (via `raceAny_`), the `disconnected` callback fires.
+
+## clientCorrId — random correlation IDs
+
+`clientCorrId` is a `TVar ChaChaDRG` used to generate random `CbNonce` values that serve as correlation IDs. The `CbNonce` is also used as the nonce for proxy encryption. When a nonce is explicitly passed (e.g., by `createSMPQueue`), it is used instead of generating a random one.
+
+## nonBlockingWriteTBQueue — fork on full queue
+
+If `tryWriteTBQueue` returns `False` (queue full), a new thread is forked to perform the blocking write. This prevents the caller from blocking when the send queue is full.
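+
+The fork-on-full pattern can be sketched as follows. This is a minimal illustration, not the module's code: `tryWriteTBQueue'` and `nonBlockingWrite` are hypothetical names, and the helper is assumed to behave as described above (write if there is room, report whether it did).
+
+```haskell
+import Control.Concurrent (forkIO)
+import Control.Concurrent.STM
+import Control.Monad (unless, void)
+
+-- Assumed helper: write only if the queue has room, and report success.
+tryWriteTBQueue' :: TBQueue a -> a -> STM Bool
+tryWriteTBQueue' q x = do
+  full <- isFullTBQueue q
+  unless full $ writeTBQueue q x
+  pure (not full)
+
+-- Try a non-blocking write first; if the queue is full,
+-- hand the blocking write to a new thread so the caller returns at once.
+nonBlockingWrite :: TBQueue a -> a -> IO ()
+nonBlockingWrite q x = do
+  written <- atomically $ tryWriteTBQueue' q x
+  unless written $ void $ forkIO $ atomically $ writeTBQueue q x
+```
+
+In this sketch, writes handed to forked threads are not ordered relative to each other, which is the price of never blocking the caller.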
+
+## getResponse — double-check after timeout
+
+Regardless of whether a response arrived or the timeout fired, `getResponse` sets `pending` to `False` and then tries to read the response variable again. The source comment states: "Try to read response again in case it arrived after timeout expired but before `pending` was set to False above. See `processMsg`." This handles the race between the timeout and a response arriving.
+
+`timeoutErrorCount` is incremented on each timeout and reset to 0 on each received response (in `getResponse`). The `receive` thread also resets it to 0 on every TLS read. The monitor uses this count to decide when to drop the connection.
+
+## processMsg — empty corrId means server event
+
+When `corrId` is empty (`B.null $ bs corrId`), the response is treated as a server-initiated event (`STEvent`). When non-empty, it is looked up in `sentCommands`. If the command was already expired (`wasPending` is `False`), the response is forwarded to `msgQ` as `STResponse` rather than being put into the `responseVar`.
+
+Entity ID mismatch (response entity ID differs from request entity ID) is treated as `unexpectedResponse`.
+
+## monitor — adaptive ping with connection drop
+
+The ping loop sleeps for `smpPingInterval`, then checks how long ago `lastReceived` was updated. If significant time remains in the interval, it re-sleeps for just that remaining time (avoiding early pings). Pings are sent only when `sendPings` is `True`. That flag is set by `enablePings`, which is called by `subscribeSMPQueue`, `subscribeSMPQueues`, `subscribeSMPQueueNotifications`, `subscribeSMPQueuesNtfs`, and `subscribeService`, so pinging starts with the first subscription rather than on connection establishment.
+
+The source code drops the client when `maxCnt` commands have timed out in sequence **and** at least `recoverWindow` (15 minutes) has passed since the last received response.
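+
+The drop condition is a conjunction of two checks, which can be written as a small predicate. This is a sketch only: `shouldDropClient` is a hypothetical name, times are assumed to be in seconds, and the use of `>=` for both comparisons is an assumption about the boundary cases.
+
+```haskell
+-- 15 minutes, as stated above (expressed in seconds for this sketch).
+recoverWindow :: Int
+recoverWindow = 15 * 60
+
+-- Drop only when enough consecutive timeouts have accumulated
+-- AND nothing has been received for at least the recovery window.
+shouldDropClient :: Int -> Int -> Int -> Bool
+shouldDropClient maxCnt timeoutCnt secondsSinceLastReceived =
+  timeoutCnt >= maxCnt && secondsSinceLastReceived >= recoverWindow
+```
+
+Requiring both conditions means a burst of timeouts shortly after a successful read does not drop the connection.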
+
+## Batch commands do not expire
+
+The source comment states: "Currently there is coupling - batch commands do not expire, and individually sent commands do. This is to reflect the fact that we send subscriptions only as batches, and also because we do not track a separate timeout for the whole batch, so it is not obvious when should we expire it."
+
+When using `sendBatch`, requests are written to `sndQ` with `Nothing` as the request parameter (vs `Just r` for individual sends), which means the send thread won't check the `pending` flag.
+
+## chooseTransportHost
+
+Selects onion or public host based on `hostMode` and `socksProxy` configuration:
+- `HMOnionViaSocks`: use onion only if SOCKS proxy is configured
+- `HMOnion`: always prefer onion
+- `HMPublic`: always prefer public
+
+When `requiredHostMode` is `True`, the function returns `Left PCEIncompatibleHost` if no matching host exists. When `False`, it falls back to the first host in the list.
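+
+The selection rules above can be sketched with simplified types. `Host`, `HostError`, `chooseHost`, and the boolean `socksConfigured` are illustrative stand-ins rather than the module's definitions, and the sketch assumes that `HMOnionViaSocks` without a SOCKS proxy prefers a public host.
+
+```haskell
+import Data.List (find)
+import Data.List.NonEmpty (NonEmpty (..))
+import qualified Data.List.NonEmpty as NE
+
+data HostMode = HMOnionViaSocks | HMOnion | HMPublic
+
+-- Simplified host: a name plus a flag marking onion addresses.
+data Host = Host {hostName :: String, isOnion :: Bool} deriving (Eq, Show)
+
+data HostError = PCEIncompatibleHost deriving (Eq, Show)
+
+chooseHost :: HostMode -> Bool -> Bool -> NonEmpty Host -> Either HostError Host
+chooseHost mode requiredHostMode socksConfigured hosts =
+  case find matches (NE.toList hosts) of
+    Just h -> Right h
+    Nothing
+      | requiredHostMode -> Left PCEIncompatibleHost
+      | otherwise -> Right (NE.head hosts) -- fall back to the first host
+  where
+    -- Which kind of host the mode asks for.
+    wantOnion = case mode of
+      HMOnionViaSocks -> socksConfigured
+      HMOnion -> True
+      HMPublic -> False
+    matches h = isOnion h == wantOnion
+```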
+
+## SocksIsolateByAuth — SOCKS credential generation
+
+When `SocksIsolateByAuth` is the SOCKS auth mode, `clientSocksCredentials` generates SOCKS credentials as `SocksCredentials sessionUsername ""` where `sessionUsername` is `B64.encode $ C.sha256Hash $ bshow userId <> ...`. The suffix varies by `sessionMode`:
+- `TSMUser`: `""`
+- `TSMSession`: `":" <> bshow proxySessTs`
+- `TSMServer`: `":" <> bshow proxySessTs <> "@" <> strEncode srv`
+- `TSMEntity`: `":" <> bshow proxySessTs <> "@" <> strEncode srv <> "/" <> entityId`
+
+## useWebPort — preset domain suffix matching
+
+`useWebPort` decides whether to use port 443 (HTTPS) transport. `SWPPreset` mode matches when the server's first host is a domain that has any of `presetDomains` as a suffix (via `isSuffixOf`).
+
+## connectSMPProxiedRelay — combined timeout
+
+The timeout for the `PRXY` command is `netTimeoutInt tcpConnectTimeout nm + netTimeoutInt tcpTimeout nm` — both timeouts are transformed by `netTimeoutInt` (which selects background/interactive and applies exponential scaling) before being summed.
+
+## ProxiedRelay — stored auth for reconnection
+
+The source comment on `prBasicAuth` states: "auth is included here to allow reconnecting via the same proxy after NO_SESSION error."
+
+## proxySMPCommand — 9 error scenarios
+
+The source comment states: "there may be one successful scenario and 9 error scenarios" and documents all 10 (scenarios 0-9), mapping each combination of success/error at the client-proxy and proxy-relay boundaries to specific error types. Errors from the destination relay wrapped in `PRES` are thrown as `ExceptT` errors (transparent proxy). Errors from the proxy itself are returned as `Left ProxyClientError`.
+
+## forwardSMPTransmission — proxy-side forwarding
+
+Used by the proxy server to forward `RFWD` to the destination relay. Uses `cbEncryptNoPad`/`cbDecryptNoPad` (no padding) with the session secret from the proxy-relay connection. Response nonce is `reverseNonce` of the request nonce.
+
+## action — weak thread reference
+
+`action` stores a `Weak ThreadId` (via `mkWeakThreadId`) to the main client thread. `closeProtocolClient` dereferences and kills it.
+
+## netTimeoutInt — exponential backoff for interactive retries
+
+`netTimeoutInt` applies `(3/2)^n` scaling to `interactiveTimeout` for interactive request modes:
+- n=0: `interactiveTimeout`
+- n=1: `interactiveTimeout * 3/2`
+- n=2: `interactiveTimeout * 9/4`
+- n=3+: `interactiveTimeout * 27/8`
+
+Background mode always uses `backgroundTimeout` regardless of retry count.
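+
+The scaling table above can be sketched as a pure function. `RequestMode` and `netTimeout` are hypothetical names, and integer rounding via `div` is an assumption of this sketch.
+
+```haskell
+-- Interactive carries the retry count n.
+data RequestMode = Background | Interactive Int
+
+-- Both timeouts are plain integers in the same (assumed) unit.
+netTimeout :: Int -> Int -> RequestMode -> Int
+netTimeout backgroundTimeout interactiveTimeout mode = case mode of
+  Background -> backgroundTimeout -- retry count is irrelevant here
+  Interactive n -> case max 0 n of
+    0 -> interactiveTimeout
+    1 -> interactiveTimeout * 3 `div` 2
+    2 -> interactiveTimeout * 9 `div` 4
+    _ -> interactiveTimeout * 27 `div` 8 -- capped at (3/2)^3
+```
+
+With an interactive timeout of 8 units, retries 0 through 3 give 8, 12, 18, and 27, and further retries stay at 27.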
+
+## authTransmission — dual auth with service signature
+
+When a command uses service auth (`useServiceAuth` returns `True`) and a service certificate is present, the entity key signs over the concatenation of `serviceCertHash <> transmission` (not just the transmission). The source comment states: "entity key must sign over both transmission and service certificate hash, to prevent any service substitution via MITM inside TLS." The service key only signs the transmission itself.
+
+For X25519 keys, `cbAuthenticate` produces a `TAAuthenticator`. For Ed25519/Ed448 keys, `C.sign'` produces a `TASignature`.
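+
+What each key covers can be illustrated without any cryptography. In this sketch, `signedBytes` is a hypothetical helper that only computes the byte strings to be signed; the actual signing and authenticator construction are omitted.
+
+```haskell
+import qualified Data.ByteString.Char8 as B
+
+-- Returns (bytes covered by the entity key, bytes covered by the service key, if any).
+signedBytes :: Bool -> Maybe B.ByteString -> B.ByteString -> (B.ByteString, Maybe B.ByteString)
+signedBytes usesServiceAuth serviceCertHash t = case serviceCertHash of
+  -- Service auth: the entity key also covers the certificate hash,
+  -- while the service key covers the transmission only.
+  Just certHash | usesServiceAuth -> (certHash <> t, Just t)
+  -- No service certificate, or the command does not use service auth.
+  _ -> (t, Nothing)
+```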
+
+## subscribeSMPQueue / getSMPMessage — request mode comments
+
+Several commands have explicit request mode comments:
+- `subscribeSMPQueue`: "This command is always sent in background request mode"
+- `getSMPMessage`: "This command is always sent in interactive request mode, as NSE has limited time"
+- `ackSMPMessage`: "This command is always sent in background request mode"
@@ -0,0 +1,67 @@
+# Simplex.Messaging.Client.Agent
+
+> SMP client connections with subscription management, reconnection, and service certificate support.
+
+**Source**: [`Client/Agent.hs`](../../../../../src/Simplex/Messaging/Client/Agent.hs)
+
+## Overview
+
+`SMPClientAgent` manages `SMPClient` connections via `smpClients :: TMap SMPServer SMPClientVar` (one per SMP server), tracks active and pending subscriptions, and handles automatic reconnection. It is parameterized by `Party` (`p`) and uses the `ServiceParty` constraint to support both `RecipientService` and `NotifierService` modes.
+
+## Dual subscription model
+
+Four TMap fields track subscriptions in two dimensions:
+
+| | Active | Pending |
+|---|---|---|
+| **Service** | `activeServiceSubs :: TMap SMPServer (TVar (Maybe (ServiceSub, SessionId)))` | `pendingServiceSubs :: TMap SMPServer (TVar (Maybe ServiceSub))` |
+| **Queue** | `activeQueueSubs :: TMap SMPServer (TMap QueueId (SessionId, C.APrivateAuthKey))` | `pendingQueueSubs :: TMap SMPServer (TMap QueueId C.APrivateAuthKey)` |
+
+The source comment states: "Only one service subscription can exist per server with this agent. With correctly functioning SMP server, queue and service subscriptions can't be active at the same time." And: "Pending service subscriptions can co-exist with pending queue subscriptions on the same SMP server during subscriptions being transitioned from per-queue to service."
+
+Active subscriptions store the `SessionId` of the connection that established them. On disconnect, only subscriptions matching the disconnected session's `SessionId` are moved to pending.
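+
+The session-aware move can be sketched over plain maps. The type synonyms and `moveToPending` are illustrative only (the module uses `TMap`s inside STM), and keeping an existing pending entry on a key collision is an arbitrary choice made for this sketch.
+
+```haskell
+import qualified Data.Map.Strict as M
+
+type SessionId = String
+type QueueId = String
+type Key = String
+
+-- Move only the subscriptions made by the disconnected session to pending;
+-- subscriptions owned by other sessions stay active.
+moveToPending ::
+  SessionId ->
+  M.Map QueueId (SessionId, Key) ->
+  M.Map QueueId Key ->
+  (M.Map QueueId (SessionId, Key), M.Map QueueId Key)
+moveToPending sessId active pending =
+  let (gone, kept) = M.partition ((== sessId) . fst) active
+   in (kept, M.union pending (M.map snd gone))
+```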
+
+## persistErrorInterval — delayed error cleanup
+
+When `newSMPClient` (which calls `connectClient`) fails, the error is stored with an expiry timestamp (`addUTCTime ei`) rather than being removed immediately. `waitForSMPClient` checks if the timestamp has expired before retrying. When `persistErrorInterval` is 0, the error is removed from the map immediately.
+
+## removeClientAndSubs — subscription vars looked up outside STM
+
+The source comment states: "Looking up subscription vars outside of STM transaction to reduce re-evaluation. It is possible because these vars are never removed, they are only added."
+
+Within the STM transaction, only subscriptions matching the disconnected session's `SessionId` are moved to pending. The source comment on `updateServiceSub` states: "We don't change active subscription in case session ID is different from disconnected client." And: "We don't reset pending subscription to Nothing here to avoid any race conditions with subsequent client sessions that might have set pending already."
+
+## updateActiveServiceSub
+
+When the service ID and session ID match the existing active subscription, `updateActiveServiceSub` adds the queue count (`n + n'`) and XOR-merges the IdsHash (`idsHash <> idsHash'`) rather than replacing. When they don't match, the subscription is replaced entirely.
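+
+The merge-or-replace rule can be sketched with toy types. Here `IdsHash` is a stand-in modelled as a byte list whose `Semigroup` instance XORs bytes pairwise, and `ServiceSub` and `updateActive` are simplified, hypothetical versions of the real definitions.
+
+```haskell
+import Data.Bits (xor)
+import Data.Word (Word8)
+
+-- Toy stand-in: equal-length byte lists combined by XOR.
+newtype IdsHash = IdsHash [Word8] deriving (Eq, Show)
+
+instance Semigroup IdsHash where
+  IdsHash a <> IdsHash b = IdsHash (zipWith xor a b)
+
+data ServiceSub = ServiceSub {serviceId :: String, queueCount :: Int, idsHash :: IdsHash}
+  deriving (Eq, Show)
+
+-- Merge when both service ID and session ID match; otherwise replace.
+updateActive :: String -> ServiceSub -> Maybe (ServiceSub, String) -> (ServiceSub, String)
+updateActive sessId new@(ServiceSub sId n' h') current = case current of
+  Just (ServiceSub sId0 n h, sessId0)
+    | sId0 == sId && sessId0 == sessId -> (ServiceSub sId (n + n') (h <> h'), sessId)
+  _ -> (new, sessId)
+```
+
+Because XOR is commutative and associative, the merged hash in this sketch does not depend on the order in which batches arrive.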
+
+## CAServiceUnavailable event
+
+The source comment states: "CAServiceUnavailable is used when service ID in pending subscription is different from the current service in connection. This will require resubscribing to all queues associated with this service ID individually, creating new associations. It may happen if, for example, SMP server deletes service information (e.g. via downgrade and upgrade) and assigns different service ID to the service certificate."
+
+## serviceAvailable check
+
+`smpSubscribeService` checks both that `serviceId` matches and that `partyServiceRole` matches the connection's `serviceRole`. If either doesn't match, `CAServiceUnavailable` is notified.
+
+## groupSub — subscription response classification
+
+When processing subscription responses, each queue is classified:
+- If the response includes a `serviceId` matching the connection's service ID: counted as a service-subscribed queue (added to `sQs`)
+- Otherwise: counted as a queue-only subscription (added to `qOks` with `SessionId` and key)
+- Queues not found in `pending` map are skipped (accumulator unchanged)
+
+## Reconnect worker cleanup race
+
+The source comment on `cleanup` states: "Here we wait until TMVar is not empty to prevent worker cleanup happening before worker is added to TMVar. Not waiting may result in terminated worker remaining in the map."
+
+## DBService abstraction
+
+`DBService` provides `getCredentials :: SMPServer -> IO (Either SMPClientError ServiceCredentials)` and `updateServiceId :: SMPServer -> Maybe ServiceId -> IO (Either SMPClientError ())`. When `dbService` is `Nothing`, connections are made without service credentials.
+
+## isOwnServer — domain suffix matching
+
+`isOwnServer` checks if the server's first host exactly matches any `ownServerDomains` entry, or if any entry is a suffix preceded by `.`. This is used to set the `OwnServer` flag on client connections.
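+
+The matching rule can be sketched over plain strings. `isOwnHost` is a hypothetical name, and the real function works on the server's host type rather than `String`.
+
+```haskell
+import Data.List (isSuffixOf)
+
+-- A host is "own" if it equals a configured domain
+-- or ends with "." followed by that domain.
+isOwnHost :: [String] -> String -> Bool
+isOwnHost ownDomains host = any matches ownDomains
+  where
+    matches d = host == d || ("." <> d) `isSuffixOf` host
+```
+
+Requiring the leading `.` means that, for the domain `example.com`, the host `smp.example.com` matches while `badexample.com` does not.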
+
+## smpSessions — proxy session lookup
+
+`smpSessions` maps `SessionId` (from `thParams`) to `(OwnServer, SMPClient)`. `lookupSMPServerClient` looks up a client in this map by its `SessionId`.