📝 Walkthrough
Adds workspace and document analytics: new DB tables and Prisma models, a WorkspaceAnalytics model, GraphQL types and queries for admin dashboard and doc analytics, backend recording of doc views and an active-user flush, frontend admin and document analytics UIs, tests, and related routing/i18n updates.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Server as AppServer
    participant Gateway as SyncGateway
    participant Cache as Redis
    participant DB as Postgres
    Client->>Server: HTTP GET /doc/:id (or shared page)
    Server->>Server: buildVisitorId(req, workspaceId, docId)
    Server->>Gateway: (optional) socket connect / presence
    Server->>DB: workspaceAnalytics.recordDocView(workspaceId, docId, visitorId, isGuest, userId?)
    DB->>DB: upsert workspace_doc_view_daily / workspace_member_last_access
    Server->>Cache: markDailyUniqueVisitor(docId, visitorId)
    Cache-->>DB: (unique visitor check/expiry)
    Client->>Server: GraphQL adminDashboard() / getDocPageAnalytics()
    Server->>DB: workspaceAnalytics.adminGetDashboard / getDocPageAnalytics queries
    DB-->>Server: stats, series, top links
    Gateway->>Gateway: every 60s flushActiveUsersMinute()
    Gateway->>DB: upsert into sync_active_users_minutely(active_users, minute_ts)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed
❌ Failed checks: 1 warning, 1 inconclusive
✅ Passed checks: 1 passed
Codecov Report

❌ Patch coverage is … Additional details and impacted files:

```
@@            Coverage Diff             @@
##           canary   #14426      +/-  ##
==========================================
- Coverage   53.58%   53.49%   -0.09%
==========================================
  Files        2828     2831       +3
  Lines      151202   153049    +1847
  Branches    22806    22897      +91
==========================================
+ Hits        81018    81877     +859
- Misses      66973    67997    +1024
+ Partials     3211     3175      -36
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@packages/backend/server/migrations/20260212053401_workspace_analytics/migration.sql`:
- Around line 34-36: Remove the redundant explicit index creation for
"workspace_doc_view_daily_workspace_id_doc_id_date_idx" because the table's
primary key (workspace_id, doc_id, date) already creates an equivalent unique
index; delete the CREATE INDEX IF NOT EXISTS line that defines
workspace_doc_view_daily_workspace_id_doc_id_date_idx and keep the other index
for ("workspace_id", "date") intact.
In `@packages/backend/server/src/core/sync/gateway.ts`:
- Around line 348-366: The method flushActiveUsersMinute is misleading because
this.socketRedis.scard(this.activeConnectionSetKey) counts socket connections
(client IDs), not unique users; either change the implementation to record
unique user IDs (use the auth handshake user identifier when adding/removing
from the Redis set and call a new upsert method like
upsertSyncActiveUniqueUsersMinute) or explicitly rename symbols to reflect
"connections" (rename flushActiveUsersMinute to flushActiveConnectionsMinute,
activeConnectionSetKey to activeConnectionSetKeyConnections or similar, and
update the model call from upsertSyncActiveUsersMinute to
upsertSyncActiveConnectionsMinute) and update all call sites accordingly (refer
to method flushActiveUsersMinute, property activeConnectionSetKey,
this.socketRedis.scard usage, and
this.models.workspaceAnalytics.upsertSyncActiveUsersMinute).
- Around line 212-213: The shared Redis set activeConnectionSetKey is causing
stale entries after crashes; change from a plain set to a sorted set (ZSET) and
update the periodic flush (triggered by flushTimer) to ZADD each client.id with
score = current timestamp and then ZREMRANGEBYSCORE to remove entries older than
your staleness threshold (e.g., now - 2–3 minutes) before reading counts; update
any reads that used SCARD to use ZCARD (or ZCOUNT with the recent range) and
ensure the flushTimer handler performs the trim and heartbeat ZADD so crashed
instances' ids age out automatically.
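The trim-then-count behavior the ZSET suggestion implies can be sketched in memory before wiring it to Redis. This is a minimal illustration, not the gateway's actual code: the class name and threshold are hypothetical, and with real Redis each method maps to `ZADD`, `ZREMRANGEBYSCORE`, and `ZCARD` respectively.

```typescript
// In-memory sketch of the ZSET heartbeat/trim cycle (member -> score timestamp).
// STALE_MS is an illustrative staleness threshold, not a value from the PR.
const STALE_MS = 2 * 60 * 1000;

class ActiveConnectionTracker {
  private scores = new Map<string, number>();

  // ZADD equivalent: refresh this client's score on each periodic flush
  heartbeat(clientId: string, now: number): void {
    this.scores.set(clientId, now);
  }

  // ZREMRANGEBYSCORE equivalent: drop entries older than the threshold,
  // so ids from crashed instances age out without explicit cleanup
  trim(now: number): void {
    for (const [id, ts] of this.scores) {
      if (ts < now - STALE_MS) this.scores.delete(id);
    }
  }

  // ZCARD equivalent after trimming: only recently-heartbeated ids remain
  count(now: number): number {
    this.trim(now);
    return this.scores.size;
  }
}
```

Each instance's flush timer would heartbeat only its own client ids, so a crashed instance simply stops refreshing scores and its entries expire on the next trim.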
In `@packages/backend/server/src/models/workspace-analytics.ts`:
- Around line 1037-1079: In buildSharedLinkCursorCondition, validate the
incoming cursor values before converting/using them: for ViewsDesc ensure
cursor.sortValue is a finite number (use Number and Number.isFinite) and for
date-based orders (PublishedAtDesc/UpdatedAtDesc) parse cursor.sortValue into a
Date and check !isNaN(date.valueOf()); if validation fails throw a BadRequest
error (or propagate a validation error) instead of building SQL with invalid
params; apply the same guard logic in paginateDocLastAccessedMembers for
MemberCursor to validate member cursor sortValue before using it.
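The validation described above can be sketched as a small standalone helper. The type names (`SharedLinkOrder`, `BadRequest`) are stand-ins for the real ones in `workspace-analytics.ts`, and the function is an assumption about how such a guard might look, not the PR's implementation:

```typescript
// Hypothetical cursor-value validator per the suggestion above.
type SharedLinkOrder = 'ViewsDesc' | 'PublishedAtDesc' | 'UpdatedAtDesc';

class BadRequest extends Error {}

function parseCursorSortValue(
  order: SharedLinkOrder,
  sortValue: string
): number | Date {
  if (order === 'ViewsDesc') {
    const views = Number(sortValue);
    // reject NaN/Infinity before the value reaches SQL parameters
    if (!Number.isFinite(views)) {
      throw new BadRequest(`invalid cursor sortValue: ${sortValue}`);
    }
    return views;
  }
  // date-based orders: reject unparseable dates
  const date = new Date(sortValue);
  if (isNaN(date.valueOf())) {
    throw new BadRequest(`invalid cursor sortValue: ${sortValue}`);
  }
  return date;
}
```

Validating at the boundary keeps malformed cursors from ever becoming SQL parameters, so the same helper can serve both the shared-link and member-cursor paths.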
In `@packages/frontend/admin/src/modules/dashboard/index.tsx`:
- Around line 48-77: The dashboard date/time formatters (formatDateTime and
formatDate) currently use the environment locale which applies the local
timezone; update these functions to call toLocaleString / toLocaleDateString
with an explicit timeZone: 'UTC' (and appropriate format options if needed) so
displayed timestamps align with the UTC sampling window described in the UI.
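A sketch of the UTC-pinned formatters, assuming nothing about the admin UI beyond what the comment states — the `'en-US'` locale and the specific format options here are illustrative choices, not the dashboard's actual ones:

```typescript
// Pin both formatters to UTC so displayed timestamps match the UTC sampling window.
function formatDateUTC(date: Date): string {
  return date.toLocaleDateString('en-US', {
    timeZone: 'UTC', // the key fix: ignore the environment's local timezone
    year: 'numeric',
    month: 'short',
    day: 'numeric',
  });
}

function formatDateTimeUTC(date: Date): string {
  return date.toLocaleString('en-US', {
    timeZone: 'UTC',
    year: 'numeric',
    month: 'short',
    day: 'numeric',
    hour: '2-digit',
    minute: '2-digit',
    hourCycle: 'h23', // avoid the hour12:false "24:00" quirk at midnight
  });
}
```

Without `timeZone: 'UTC'`, a view recorded at 23:30 UTC renders as the next calendar day for users east of UTC, misaligning the chart buckets.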
In `@packages/frontend/i18n/src/i18n.gen.ts`:
- Around line 4577-4644: The generated i18n type defines new analytics keys
(e.g., "com.affine.doc.analytics.title",
"com.affine.doc.analytics.summary.total",
"com.affine.doc.analytics.window.last-days",
"com.affine.doc.analytics.metric.total",
"com.affine.doc.analytics.metric.unique",
"com.affine.doc.analytics.metric.guest",
"com.affine.doc.analytics.chart.total-views",
"com.affine.doc.analytics.chart.unique-views",
"com.affine.doc.analytics.error.load-analytics",
"com.affine.doc.analytics.error.load-viewers",
"com.affine.doc.analytics.empty.no-page-views",
"com.affine.doc.analytics.empty.no-viewers",
"com.affine.doc.analytics.viewers.title",
"com.affine.doc.analytics.viewers.show-all",
"com.affine.doc.analytics.paywall.open-pricing",
"com.affine.doc.analytics.paywall.toast") but those keys are missing from the 24
non-English locale resource files; add the same keys to every locale JSON (ar,
ca, da, de, el-GR, es-AR, es-CL, es, fa, fr, hi, it-IT, it, ja, ko, nb-NO, pl,
pt-BR, ru, sv-SE, uk, ur, zh-Hans, zh-Hant) populating values either with the
proper translations or, temporarily, the English strings from en.json; after
updating the locale files, run the i18n build/generation step so types and
bundles reflect the new keys and verify no fallback to raw keys occurs at
runtime.
🧹 Nitpick comments (8)
packages/backend/server/src/core/sync/gateway.ts (2)
302-326: Inconsistent floating-promise handling between `handleConnection` and `handleDisconnect`.

`handleDisconnect` (line 320) prefixes the call with `void` to suppress the no-floating-promises lint rule, but `handleConnection` (line 306) does not. Both are fire-and-forget. Add `void` on line 306 for consistency and to avoid lint warnings.

Proposed fix

```diff
 handleConnection(client: Socket) {
   this.connectionCount++;
   this.logger.debug(`New connection, total: ${this.connectionCount}`);
   metrics.socketio.gauge('connections').record(this.connectionCount);
-  this.onConnectionPresenceChanged(client, 'add').catch(error => {
+  void this.onConnectionPresenceChanged(client, 'add').catch(error => {
     this.logger.warn(
       'Failed to update connection presence on add',
       error as Error
     );
   });
 }
```
345-346: `flushActiveUsersMinute` is called on every connect/disconnect — may be noisy under churn.

Each invocation performs a Redis `scard` + a DB upsert. Under high connection churn (e.g., many short-lived connections), this can produce a significant volume of DB writes per minute, all to the same minute-truncated row. Consider debouncing or throttling so the method runs at most once every few seconds, relying on the 60 s interval as the baseline. Alternatively, remove the call from `onConnectionPresenceChanged` entirely and let the timer be the sole flush trigger.

Also applies to: 362-365
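The throttling idea can be sketched with a leading-edge throttle wrapper; the function name, the 5 s interval, and the injected clock are illustrative assumptions, not the gateway's code:

```typescript
// Minimal leading-edge throttle: fn runs at most once per intervalMs.
// The clock is injected so the behavior is testable without real timers.
function makeThrottled(
  fn: () => void,
  intervalMs: number,
  now: () => number = Date.now
): () => void {
  let last = -Infinity;
  return () => {
    const t = now();
    if (t - last >= intervalMs) {
      last = t;
      fn(); // extra calls within the window are simply dropped
    }
  };
}
```

Wrapping the flush as `makeThrottled(() => this.flushActiveUsersMinute(), 5000)` would cap connect/disconnect-driven DB writes while the 60 s timer still guarantees a fresh row each minute.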
packages/frontend/admin/src/components/ui/chart.tsx (1)
38-71: `dangerouslySetInnerHTML` used for CSS injection — acceptable but consider hardening config keys.

The CSS is built from developer-supplied `ChartConfig` keys and color values, so XSS risk is low in practice. However, if a config key ever contained characters like `"`, `}`, or `<`, it could break out of the CSS/style context. A lightweight safeguard would be to sanitize keys (e.g., restrict to `[a-zA-Z0-9_-]`).

This is a well-known pattern (e.g., shadcn/ui charts), so flagging as optional.
🛡️ Optional: sanitize config keys
```diff
 const colorEntries = Object.entries(config).filter(
-  ([, item]) => item.color || item.theme
+  ([key, item]) => /^[a-zA-Z0-9_-]+$/.test(key) && (item.color || item.theme)
 );
```

packages/backend/server/migrations/20260212053401_workspace_analytics/migration.sql (1)
67-75: Legacy cleanup `DROP CONSTRAINT` / `DROP INDEX` lack `IF EXISTS` guards.

Unlike the `CREATE TABLE IF NOT EXISTS` and `DROP TRIGGER IF EXISTS` / `DROP FUNCTION IF EXISTS` statements earlier in this migration, the `ALTER TABLE ... DROP CONSTRAINT` and `DROP INDEX` statements on lines 67-75 will fail if the constraints or indexes don't exist. If this migration needs to be idempotent or re-runnable (e.g., during development), consider adding guards.

🛡️ Suggested: add IF EXISTS guards
```diff
 ALTER TABLE
-  "user_features" DROP CONSTRAINT "user_features_feature_id_fkey";
+  "user_features" DROP CONSTRAINT IF EXISTS "user_features_feature_id_fkey";
 ALTER TABLE
-  "workspace_features" DROP CONSTRAINT "workspace_features_feature_id_fkey";
+  "workspace_features" DROP CONSTRAINT IF EXISTS "workspace_features_feature_id_fkey";
-DROP INDEX "user_features_feature_id_idx";
+DROP INDEX IF EXISTS "user_features_feature_id_idx";
-DROP INDEX "workspace_features_feature_id_idx";
+DROP INDEX IF EXISTS "workspace_features_feature_id_idx";
```

packages/backend/server/src/core/workspaces/stats.job.ts (1)
323-348: `CURRENT_DATE` is server-timezone-dependent — worth documenting.

`CURRENT_DATE` uses the PostgreSQL server's `timezone` setting. If you ever need consistent date bucketing across environments (e.g., UTC), consider using `(NOW() AT TIME ZONE 'UTC')::date`. This is fine if the server timezone is already set to UTC (common in cloud deployments), but worth a brief inline comment for future maintainers.
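An alternative is to compute the bucket on the application side, which sidesteps the DB session timezone entirely. A minimal sketch (the helper name is hypothetical):

```typescript
// Compute a 'YYYY-MM-DD' date bucket in UTC regardless of process or DB timezone.
// toISOString() always renders in UTC, so slicing the date part is sufficient.
function utcDateBucket(at: Date): string {
  return at.toISOString().slice(0, 10);
}
```

Passing this string as the date parameter instead of relying on `CURRENT_DATE` makes the bucketing identical across environments, at the cost of trusting the app servers' clocks rather than the database's.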
149-164: Consider merging the duplicate `!docId.isWorkspace` guard.

Lines 149 and 166 both check `!docId.isWorkspace`. While they serve different purposes (analytics vs. publish-mode header), combining them into a single block would avoid the small overhead of a redundant condition and make the flow easier to follow.

♻️ Suggested refactor
```diff
-    if (!docId.isWorkspace) {
-      void this.models.workspaceAnalytics
-        .recordDocView({
-          workspaceId: docId.workspace,
-          docId: docId.guid,
-          userId: user?.id,
-          visitorId: this.buildVisitorId(req, docId.workspace, docId.guid),
-          isGuest: !user,
-        })
-        .catch(error => {
-          this.logger.warn(
-            `Failed to record doc view: ${docId.workspace}/${docId.guid}`,
-            error as Error
-          );
-        });
-    }
-
-    if (!docId.isWorkspace) {
+    if (!docId.isWorkspace) {
+      void this.models.workspaceAnalytics
+        .recordDocView({
+          workspaceId: docId.workspace,
+          docId: docId.guid,
+          userId: user?.id,
+          visitorId: this.buildVisitorId(req, docId.workspace, docId.guid),
+          isGuest: !user,
+        })
+        .catch(error => {
+          this.logger.warn(
+            `Failed to record doc view: ${docId.workspace}/${docId.guid}`,
+            error as Error
+          );
+        });
+      // fetch the publish page mode for publish page
```

packages/backend/server/schema.prisma (1)
328-383: Consider dropping the redundant PK index.

`WorkspaceDocViewDaily` already has `@@id([workspaceId, docId, date])`, so `@@index([workspaceId, docId, date])` duplicates the PK index and adds write overhead.

♻️ Suggested cleanup
```diff
   @@id([workspaceId, docId, date])
   @@index([workspaceId, date])
-  @@index([workspaceId, docId, date])
   @@map("workspace_doc_view_daily")
```

packages/backend/server/src/models/workspace-analytics.ts (1)
442-453: Prefer either cursor or offset pagination, not both.

Using `offset` alongside `after` can skip/duplicate rows. Consider normalizing `offset` to `0` when a cursor is present (and doing the same in `paginateDocLastAccessedMembers`).

♻️ Suggested normalization
```diff
-    const pagination: PaginationInput = {
-      ...options.pagination,
-      first: Math.min(Math.max(options.pagination.first ?? 10, 1), 100),
-      offset: Math.max(options.pagination.offset ?? 0, 0),
-    };
+    const pagination: PaginationInput = {
+      ...options.pagination,
+      first: Math.min(Math.max(options.pagination.first ?? 10, 1), 100),
+      offset: options.pagination.after
+        ? 0
+        : Math.max(options.pagination.offset ?? 0, 0),
+    };
```
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/backend/server/src/core/sync/gateway.ts`:
- Around line 348-390: The upsert in
models.workspaceAnalytics.upsertSyncActiveUsersMinute allows last-write-wins
causing smaller values to overwrite larger ones within the same minute; update
the upsert SQL to use GREATEST(active_users, EXCLUDED.active_users) so the
stored value only increases within that minute, or alternately ensure only the
periodic aggregator (flushActiveUsersMinute with aggregateAcrossCluster=true)
performs DB writes and stop calling upsert from connect/disconnect paths—locate
calls to flushActiveUsersMinute and the upsert implementation in
workspaceAnalytics to apply the chosen change.
🧹 Nitpick comments (9)
packages/backend/server/src/__tests__/e2e/workspace/admin-analytics.spec.ts (1)
13-58: Prefer `$executeRaw` (tagged template) over `$executeRawUnsafe` for static SQL.

All four DDL statements are static string literals with no interpolation. Using the tagged template form (`` $executeRaw`...` ``) is more consistent with the rest of this file and avoids accidentally introducing injection if the strings are ever parameterized later.

♻️ Example for the first statement
```diff
 async function ensureAnalyticsTables(db: PrismaClient) {
-  await db.$executeRawUnsafe(`
+  await db.$executeRaw`
     CREATE TABLE IF NOT EXISTS workspace_admin_stats_daily (
       workspace_id VARCHAR NOT NULL,
       date DATE NOT NULL,
       snapshot_size BIGINT NOT NULL DEFAULT 0,
       blob_size BIGINT NOT NULL DEFAULT 0,
       member_count BIGINT NOT NULL DEFAULT 0,
       updated_at TIMESTAMPTZ(3) NOT NULL DEFAULT NOW(),
       PRIMARY KEY (workspace_id, date)
-    );
-  `);
+    );
+  `;
```

Apply the same pattern to the remaining three statements.
packages/backend/server/src/__tests__/sync/gateway.spec.ts (2)
504-520: Test is timing-sensitive and lacks an explicit value assertion.

Two observations:

- The test's correctness depends on the server flushing active-user counts to the DB within the 5 s `WS_TIMEOUT_MS` window. If the flush interval is increased or delayed under CI load, this test becomes flaky. Consider either triggering a flush explicitly or making the timeout configurable/generous for this specific case.
- `t.pass()` only proves `waitForActiveUsers` didn't throw. A direct assertion (e.g., `t.is(await latestActiveUsers(db), 1)`) after the wait would make the test output more informative on failure.

Minor: add explicit assertion after wait
```diff
   await Promise.all([waitForConnect(first), waitForConnect(second)]);
   await waitForActiveUsers(db, 1);
-  t.pass();
+  t.is(await latestActiveUsers(db), 1, 'expected exactly 1 active user after dedup');
```
150-158: Remove redundant `ensureSyncActiveUsersTable` — table is already created by migrations.

The hardcoded DDL matches the migration schema exactly, but the call at line 505 is unnecessary since `initTestingDB()` only truncates tables and preserves existing schema created by app initialization. If this defensive pattern is preferred, document why it's needed; otherwise, rely on migrations to set up the table structure.

packages/backend/server/src/core/sync/gateway.ts (1)
311-322: Disconnect flush uses local connection count, potentially underreporting.

With `aggregateAcrossCluster: false`, the disconnect handler writes `this.connectionCount` (local to this instance) as the active user count for the minute. In a multi-instance deployment, this will drastically underreport. Since the 60s interval corrects this, the impact is limited to a brief inaccuracy, but the DB write on disconnect is arguably misleading.

Consider skipping the DB write on disconnect entirely and relying solely on the periodic timer for accuracy, or always using `aggregateAcrossCluster: true`.

packages/backend/server/src/models/workspace-analytics.ts (3)
509-515: ILIKE wildcard characters in user-supplied keyword are not escaped.

Characters `%` and `_` are LIKE wildcards. A keyword like `%` would match everything; `_` would match any single character. Since this is an admin endpoint the risk is low, but for correctness you should escape these metacharacters.

♻️ Suggested helper
```diff
+function escapeLikePattern(pattern: string): string {
+  return pattern.replace(/[%_\\]/g, '\\$&');
+}
+
 // then in the query:
-      ILIKE ${`%${keyword}%`}
+      ILIKE ${`%${escapeLikePattern(keyword)}%`}
```
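To illustrate what the proposed helper emits, here it is standalone with its output on a sample keyword (the helper body matches the suggestion above; the sample inputs are illustrative):

```typescript
// Prefix %, _ and \ with a backslash so ILIKE treats them literally.
// Backslash is Postgres's default ESCAPE character for LIKE/ILIKE.
function escapeLikePattern(pattern: string): string {
  return pattern.replace(/[%_\\]/g, '\\$&');
}
```

With this in place, a keyword of `50%_off` becomes the pattern `50\%\_off`, so a search for a literal `%` no longer matches every row.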
1120-1137: Redis failure silently overcounts unique visitors.

When `sadd` throws (line 1134), the catch returns `true`, meaning the view is counted as unique even if we couldn't verify. This is a reasonable choice for availability-over-accuracy, but the failure is completely silent — no log, no metric.

Consider adding a warning log or incrementing an error counter so operational dashboards can detect Redis connectivity issues affecting analytics accuracy.
📝 Proposed change
```diff
-    } catch {
+    } catch (error) {
+      this.logger.warn?.(
+        'Failed to check unique visitor in Redis, assuming unique',
+        error
+      );
       return true;
     }
```

Note: `BaseModel` may not have a logger — if not, use a static logger or inject one.
991-1028: Duplicated filter logic between `countAdminSharedLinks` and `adminPaginateAllSharedLinks`.

The keyword, workspace, and updatedAfter conditions are reimplemented in both the pagination query and this count query. If filters diverge, the count will become inconsistent with the paginated results. Consider extracting the shared `WHERE` clause into a helper that both methods can reuse.

packages/backend/server/schema.prisma (2)
343-349: `SyncActiveUsersMinutely` will grow unbounded — consider a retention strategy.

At one row per minute, this table accumulates ~525K rows/year. The admin dashboard only queries the last 48–72 hours. Without periodic cleanup (e.g., a cron job, pg_cron, or a DB-level partitioning/retention policy), this table will grow indefinitely.
Consider adding a periodic job to prune rows older than a configurable threshold (e.g., 7 or 30 days).
368-382: Consider a composite index to cover the member-access query's `ORDER BY`.

The `paginateDocLastAccessedMembers` query filters on `(workspace_id, last_doc_id)` and orders by `(last_accessed_at DESC, user_id ASC)`. The current `@@index([workspaceId, lastDocId])` supports the filter, but Postgres will need a separate sort step for the `ORDER BY`. For large member lists, a composite index could eliminate this:

`@@index([workspaceId, lastDocId, lastAccessedAt(sort: Desc)])`

This is unlikely to matter with the `LIMIT 50` and for typical workspace sizes, so flagging this as optional.