new importer #214
Actionable comments posted: 28
🧹 Nitpick comments (35)
apps/start/src/modals/index.tsx (1)
8-8: Consider lazy-loading AddImport to keep the initial bundle slim

`AddImport` pulls in TRPC/React Query; load it on demand.

Apply:

```diff
+import { lazy } from 'react';
-import AddImport from './add-import';
+const AddImport = lazy(() => import('./add-import'));
```

Ensure `ModalProvider` or the modal surface renders within a `<Suspense>` boundary.
Also applies to: 41-41
packages/importer/src/base-provider.ts (1)
7-8: Make core hooks async-friendly and identifiers immutable

Allow providers to perform async transforms/validation; mark identifiers readonly. (Note: the `MaybePromise` alias must be declared at module scope — `export type` is not valid inside a class body.)

```diff
+// Utility type to allow sync or async implementations
+export type MaybePromise<T> = T | Promise<T>;
+
 export abstract class BaseImportProvider<
   TRawEvent extends BaseRawEvent = BaseRawEvent,
 > {
-  abstract provider: string;
-  abstract version: string;
+  abstract readonly provider: string;
+  abstract readonly version: string;

   /**
    * Stream-read and parse source (file/API) → yields raw events
    * This should be implemented as an async generator to handle large files efficiently
    */
   abstract parseSource(): AsyncGenerator<TRawEvent, void, unknown>;

   /**
    * Convert provider format → IClickhouseEvent
    */
-  abstract transformEvent(rawEvent: TRawEvent): IClickhouseEvent;
+  abstract transformEvent(rawEvent: TRawEvent): MaybePromise<IClickhouseEvent>;

   /**
    * Validate raw event structure
    */
-  abstract validate(rawEvent: TRawEvent): boolean;
+  abstract validate(rawEvent: TRawEvent): MaybePromise<boolean>;
```

Also applies to: 19-24
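To illustrate the suggestion, here is a minimal standalone sketch of the `MaybePromise` pattern (the class and field names below are hypothetical, not the package's API): sync implementations keep returning plain values, async ones return promises, and callers simply `await` either.

```typescript
// Hypothetical mini-provider hierarchy illustrating MaybePromise<T>.
type MaybePromise<T> = T | Promise<T>;

abstract class MiniProvider<TRaw> {
  // Implementations may be sync or async; callers always `await`.
  abstract transformEvent(raw: TRaw): MaybePromise<{ name: string }>;
}

class SyncProvider extends MiniProvider<{ n: string }> {
  transformEvent(raw: { n: string }) {
    return { name: raw.n }; // plain value
  }
}

class AsyncProvider extends MiniProvider<{ n: string }> {
  async transformEvent(raw: { n: string }) {
    return { name: raw.n.toUpperCase() }; // promise
  }
}

async function demo() {
  const a = await new SyncProvider().transformEvent({ n: 'x' });
  const b = await new AsyncProvider().transformEvent({ n: 'x' });
  console.log(a.name, b.name); // x X
}
demo();
```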
packages/importer/src/utils/referrer-parser.ts (3)
5-35: Type the map and normalize hosts; reduce duplicates

Strong types + normalization avoid listing www./m. variants twice and improve coverage.

```diff
 import { stripTrailingSlash } from '@openpanel/common';

+type ReferrerType = 'search' | 'social' | 'referral';
+type ReferrerEntry = { type: ReferrerType; name: string };
+
+const normalizeHost = (h: string) =>
+  h.replace(/^(www\.|m\.)/i, '').toLowerCase();
+
-const referrers: Record<string, { type: string; name: string }> = {
+const referrers: Record<string, ReferrerEntry> = {
   // Search engines
-  'google.com': { type: 'search', name: 'Google' },
-  'www.google.com': { type: 'search', name: 'Google' },
+  'google.com': { type: 'search', name: 'Google' },
   'bing.com': { type: 'search', name: 'Bing' },
-  'www.bing.com': { type: 'search', name: 'Bing' },
   'yahoo.com': { type: 'search', name: 'Yahoo!' },
-  'www.yahoo.com': { type: 'search', name: 'Yahoo!' },
   'duckduckgo.com': { type: 'search', name: 'DuckDuckGo' },
-  'www.duckduckgo.com': { type: 'search', name: 'DuckDuckGo' },
   'baidu.com': { type: 'search', name: 'Baidu' },
-  'www.baidu.com': { type: 'search', name: 'Baidu' },
   // Social media
-  'facebook.com': { type: 'social', name: 'Facebook' },
-  'www.facebook.com': { type: 'social', name: 'Facebook' },
+  'facebook.com': { type: 'social', name: 'Facebook' },
   'twitter.com': { type: 'social', name: 'Twitter' },
-  'www.twitter.com': { type: 'social', name: 'Twitter' },
   'x.com': { type: 'social', name: 'X' },
-  'www.x.com': { type: 'social', name: 'X' },
   'linkedin.com': { type: 'social', name: 'LinkedIn' },
-  'www.linkedin.com': { type: 'social', name: 'LinkedIn' },
   'instagram.com': { type: 'social', name: 'Instagram' },
-  'www.instagram.com': { type: 'social', name: 'Instagram' },
   'tiktok.com': { type: 'social', name: 'TikTok' },
-  'www.tiktok.com': { type: 'social', name: 'TikTok' },
   'youtube.com': { type: 'social', name: 'YouTube' },
-  'www.youtube.com': { type: 'social', name: 'YouTube' },
   'reddit.com': { type: 'social', name: 'Reddit' },
-  'www.reddit.com': { type: 'social', name: 'Reddit' },
+  't.co': { type: 'social', name: 'Twitter' },
+  'lnkd.in': { type: 'social', name: 'LinkedIn' },
 };

 export function parseReferrer(url: string | undefined) {
-  const hostname = getHostname(url);
-  const match = referrers[hostname] ?? referrers[hostname.replace('www.', '')];
+  const hostname = normalizeHost(getHostname(url));
+  const match = referrers[hostname];
```

Also applies to: 49-52
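The `normalizeHost` helper from the diff can be sanity-checked in isolation (a sketch; the real lookup would still go through `getHostname` first):

```typescript
// Strip a leading "www." or "m." and lower-case, so one map entry
// covers the common host variants.
const normalizeHost = (h: string) => h.replace(/^(www\.|m\.)/i, '').toLowerCase();

console.log(normalizeHost('www.Google.com')); // google.com
console.log(normalizeHost('m.facebook.com')); // facebook.com
console.log(normalizeHost('t.co'));           // t.co
```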
49-57: Guard output URL to http(s) only

Avoid echoing non-web schemes (e.g., javascript:) back to callers.

```diff
 export function parseReferrer(url: string | undefined) {
   const hostname = getHostname(url);
   const match = referrers[hostname] ?? referrers[hostname.replace('www.', '')];
-  return {
+  let safeUrl = '';
+  try {
+    if (url) {
+      const u = new URL(url);
+      if (u.protocol === 'http:' || u.protocol === 'https:') {
+        safeUrl = stripTrailingSlash(url);
+      }
+    }
+  } catch {
+    // ignore
+  }
+
+  return {
     name: match?.name ?? '',
     type: match?.type ?? 'referral',
-    url: stripTrailingSlash(url ?? ''),
+    url: safeUrl,
   };
 }
```
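The guard can also be expressed as a small standalone helper — a sketch under assumed names (`safeReferrerUrl` is hypothetical; the real code would keep using `stripTrailingSlash` from `@openpanel/common`):

```typescript
// Return the referrer URL only when it is a parseable http(s) URL;
// anything else (javascript:, data:, unparseable input) collapses to ''.
function safeReferrerUrl(url: string | undefined): string {
  if (!url) return '';
  try {
    const u = new URL(url);
    if (u.protocol === 'http:' || u.protocol === 'https:') {
      // inline stand-in for stripTrailingSlash
      return url.endsWith('/') ? url.slice(0, -1) : url;
    }
  } catch {
    // not a URL at all
  }
  return '';
}

console.log(safeReferrerUrl('https://google.com/'));  // https://google.com
console.log(safeReferrerUrl('javascript:alert(1)')); // prints an empty string
```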
60-91: Minor: normalize source before lookup and keep types consistent

Lower-case once; reuse normalizeHost for the direct key match; return a typed result.

```diff
-export function getReferrerWithQuery(
-  query: Record<string, string> | undefined,
-) {
+export function getReferrerWithQuery(
+  query: Record<string, string> | undefined,
+): { name: string; type: ReferrerType; url: string } | null {
   if (!query) {
     return null;
   }

-  const source = query.utm_source ?? query.ref ?? query.utm_referrer ?? '';
+  const source = (query.utm_source ?? query.ref ?? query.utm_referrer ?? '').trim();
   if (source === '') {
     return null;
   }

-  const match =
-    Object.values(referrers).find(
-      (referrer) => referrer.name.toLowerCase() === source.toLowerCase(),
-    ) || referrers[source];
+  const lower = source.toLowerCase();
+  const match =
+    Object.values(referrers).find((r) => r.name.toLowerCase() === lower) ||
+    referrers[normalizeHost(lower)];

   if (match) {
     return {
       name: match.name,
       type: match.type,
       url: '',
     };
   }
```

packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
1-13: Consider more descriptive migration naming.

The migration name "fix" is not descriptive. Consider renaming it to something like "remove_redundant_import_columns" or "migrate_import_config_to_json" to better indicate the purpose of this schema change.
packages/db/code-migrations/5-add-imports-table.sql (1)
32-32: Consider using an Enum for import_status.

The `import_status` field uses `LowCardinality(String)` with a default of 'pending'. For better type safety and query performance, consider using an Enum type instead:

```sql
`import_status` Enum8('pending' = 1, 'processing' = 2, 'completed' = 3, 'failed' = 4) DEFAULT 'pending'
```

This ensures only valid status values can be stored and provides better compression. (Enum8 is already dictionary-encoded, so a LowCardinality wrapper is unnecessary.)
packages/db/prisma/schema.prisma (1)
470-486: Consider using an enum for the status field.

The `status` field is defined as a `String` with a comment indicating it should be one of `'pending' | 'processing' | 'completed' | 'failed'`. For better type safety and database constraints, define a Prisma enum:

```prisma
enum ImportStatus {
  pending
  processing
  completed
  failed
}

model Import {
  // ...
  status ImportStatus // instead of String
  // ...
}
```

This provides compile-time type checking and database-level validation.
packages/trpc/src/root.ts (1)
7-7: Router key named "import" (reserved keyword) — consider renaming for DX.

Valid as a property name in JS/TS, but it can confuse tooling and usages (e.g., `appRouter.import` vs dynamic `import()`). Consider renaming to `imports` or `importer` for clarity.
Also applies to: 44-44
packages/queue/src/queues.ts (1)
163-176: Harden import jobs: add attempts/backoff; optionally expose QueueEvents.

Imports call external providers, so transient failures are expected. Provide sensible defaults and a metrics hook. (BullMQ has no per-job `timeout` option, so attempts/backoff are the main levers.)
Apply:
```diff
 export const importQueue = new Queue<ImportQueuePayload>('import', {
   connection: getRedisQueue(),
   defaultJobOptions: {
     removeOnComplete: 10,
     removeOnFail: 50,
+    attempts: 3,
+    backoff: { type: 'exponential', delay: 30_000 },
   },
 });
+
+// Optional: enable events for monitoring dashboards/Bull Board progress
+export const importQueueEvents = new QueueEvents('import', {
+  connection: getRedisQueue(),
+});
```

apps/worker/src/boot-workers.ts (3)
138-142: Also drain the import queue before shutdown to avoid cutting imports mid-run.

```diff
-  await waitForQueueToEmpty(cronQueue);
+  await waitForQueueToEmpty(cronQueue);
+  await waitForQueueToEmpty(importQueue, 10 * 60_000); // 10 min cap; tune as needed
```
155-167: Dev path skips graceful shutdown — can leave jobs in an inconsistent state.

Unify behavior and call the same exit handler in dev.

```diff
-      process.on(evt, (code) => {
-        if (process.env.NODE_ENV === 'production') {
-          exitHandler(evt, code);
-        } else {
-          logger.info('Shutting down for development', {
-            event: evt,
-            code,
-          });
-          process.exit(0);
-        }
-      });
+      process.on(evt, (code) => {
+        exitHandler(evt, code ?? 0);
+      });
```
4-13: Remove unused imports and dead/commented code for clarity.
`eventsGroupQueue`, `GroupWorker`, `eventsJob`, and `incomingEventPure` are no longer used; the commented-out worker block can be dropped.

Also applies to: 18-23, 33-48
packages/trpc/src/routers/import.ts (3)
121-131: Guard job removal on delete with error handling (active/locked jobs can throw).

```diff
 const job = await importQueue.getJob(importRecord.jobId);
-if (job) {
-  await job.remove();
-}
+if (job) {
+  try {
+    await job.remove();
+  } catch (e) {
+    // best-effort removal; proceed with deletion
+  }
+}
```
165-171: Retry should set a stable jobId and attempts/backoff like create.

```diff
-    const job = await importQueue.add('import', {
-      type: 'import',
-      payload: {
-        importId: importRecord.id,
-      },
-    });
+    const job = await importQueue.add(
+      'import',
+      { type: 'import', payload: { importId: importRecord.id } },
+      { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+    );
```
3-6: Unused type import (IImportConfig).
`IImportConfig` is not referenced. Remove it to satisfy lint.
1-14: Public API surface looks good; minor polish

Consider also re-exporting provider config/types (e.g., `IUmamiImportConfig`) from providers so consumers stay on a single import path. Optional.
IMPORT_REVIEW.md (1)
320-337: Move this review doc to /docs or PR notes to avoid shipping it in package tarballs

Keeps the repo clean and avoids confusing consumers.
apps/worker/src/jobs/import.ts (4)
1-11: Avoid deep imports; prefer package barrel exports

Importing services via '@openpanel/db/src/services/import.service' couples to internal paths. Re-export these from '@openpanel/db' and import from there to reduce breakage.
75-81: Progress totalBatches fluctuates; compute once or report a percentage

As totalEvents grows, totalBatches changes mid-run. Either omit totalBatches here or compute a percentage for the UI.

```diff
-        totalBatches: Math.ceil(totalEvents / BATCH_SIZE),
+        progressPct: totalEvents ? Math.round((processedEvents / totalEvents) * 100) : 0,
```
17-17: Make BATCH_SIZE configurable

Read it from config/env to tune per deployment.

```diff
-const BATCH_SIZE = 5000;
+const BATCH_SIZE = Number(process.env.IMPORT_BATCH_SIZE ?? 5000);
```
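If BATCH_SIZE becomes env-driven, it is worth parsing defensively. A sketch (the fallback mirrors the current constant; the clamp bounds are illustrative assumptions, not values from the PR):

```typescript
// Fall back to 5000 for missing/invalid values and clamp to a sane
// range so a typo'd env var can't make batches tiny or enormous.
function resolveBatchSize(raw: string | undefined, fallback = 5000): number {
  const n = Number(raw);
  if (!Number.isFinite(n) || n <= 0) return fallback;
  return Math.min(Math.max(Math.trunc(n), 100), 50_000);
}

const BATCH_SIZE = resolveBatchSize(process.env.IMPORT_BATCH_SIZE);
console.log(BATCH_SIZE);
```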
200-209: Validate provider-specific config with a schema before instantiation

Avoid passing untyped Prisma JSON into providers. Parse with the provider schema (zod) and fail early.

```diff
 function createProvider(
   record: Prisma.ImportGetPayload<{ include: { project: true } }>,
 ) {
   switch (record.config.provider) {
     case 'umami':
-      return new UmamiProvider(record.projectId, record.config);
+      // Example: zUmamiImportConfig.safeParse(record.config)
+      return new UmamiProvider(
+        record.projectId,
+        (record.config as any) /* TODO: parse/validate */,
+      );
     default:
       throw new Error(`Unknown provider: ${record.config.provider}`);
   }
 }
```

I can wire in zod parsing if you expose the provider config schemas from @openpanel/importer.
packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (2)
9-16: Constrain status values

Add a CHECK constraint (or enum) to prevent invalid states.

```diff
   "status" TEXT NOT NULL,
 @@
 );
+
+-- Restrict status values
+ALTER TABLE "public"."imports"
+  ADD CONSTRAINT "imports_status_check"
+  CHECK ("status" IN ('pending','processing','completed','failed'));
```
18-22: Add indexes for common queries

List/filter views will hit projectId/status/createdAt frequently.

```diff
 -- AddForeignKey
 ALTER TABLE "public"."imports" ADD CONSTRAINT "imports_projectId_fkey" FOREIGN KEY ("projectId") REFERENCES "public"."projects"("id") ON DELETE CASCADE ON UPDATE CASCADE;
+
+-- Indexes for performance
+CREATE INDEX IF NOT EXISTS "idx_imports_project_createdAt"
+  ON "public"."imports" ("projectId", "createdAt" DESC);
+CREATE INDEX IF NOT EXISTS "idx_imports_status_createdAt"
+  ON "public"."imports" ("status", "createdAt" DESC);
```

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (5)
5-10: Remove unused icon import

`CheckCircle` is unused.

```diff
-import {
-  CheckCircle,
-  CheckCircleIcon,
-  Download,
-  Loader2Icon,
-  XCircleIcon,
-} from 'lucide-react';
+import { CheckCircleIcon, Download, Loader2Icon, XCircleIcon } from 'lucide-react';
```
83-104: Add onError to mutations for user feedback

Surface errors with toasts; also consider logging.

```diff
 const deleteImport = useMutation(
   trpc.import.delete.mutationOptions({
     onSuccess: () => {
       toast.success('Import deleted', {
         description: 'The import has been successfully deleted.',
       });
       queryClient.invalidateQueries(trpc.import.list.pathFilter());
     },
+    onError: (err) => {
+      toast.error('Delete failed', { description: String(err) });
+    },
   }),
 );
 @@
 const retryImport = useMutation(
   trpc.import.retry.mutationOptions({
     onSuccess: () => {
       toast.success('Import retried', {
         description: 'The import has been queued for processing again.',
       });
       queryClient.invalidateQueries(trpc.import.list.pathFilter());
     },
+    onError: (err) => {
+      toast.error('Retry failed', { description: String(err) });
+    },
   }),
 );
```
244-261: Disable action buttons while pending to avoid double submits

Tie the disabled state to the mutation pending flags.

```diff
-              {imp.status === 'failed' && (
+              {imp.status === 'failed' && (
                 <Button
                   variant="outline"
                   size="sm"
-                  onClick={() => retryImport.mutate({ id: imp.id })}
+                  disabled={retryImport.isPending}
+                  onClick={() => retryImport.mutate({ id: imp.id })}
                 >
                   Retry
                 </Button>
               )}
               <Button
                 variant="ghost"
                 size="sm"
-                onClick={() => deleteImport.mutate({ id: imp.id })}
+                disabled={deleteImport.isPending}
+                onClick={() => deleteImport.mutate({ id: imp.id })}
               >
                 Delete
               </Button>
```
41-66: Consider serving logos locally or via an asset proxy

External Brandfetch URLs can break and may violate CSP. If feasible, mirror the assets or use a proxy.
141-161: Consider extending IntegrationCardLogoImage to accept a customizable alt prop for improved accessibility

The `alt` attribute is currently hardcoded to the generic "Integration Logo" at line 66 of `apps/start/src/components/integrations/integration-card.tsx`. Extending the component to accept an optional `alt` prop and passing `provider.name` would provide more descriptive alternative text for each integration, improving accessibility and SEO.
444-445: Local file support vs schema
`fileUrl` is `z.string().url()`, but the Umami provider supports local files. Either (a) allow `file://` URLs or absolute paths, or (b) drop local-path support in the provider. Today this schema blocks local imports.

```diff
-  fileUrl: z.string().url(),
+  fileUrl: z.string().refine(
+    (v) => /^https?:\/\//i.test(v) || /^file:\/\//i.test(v) || v.startsWith('/'),
+    'Must be an http(s) URL, file:// URL, or absolute path',
+  ),
```
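The refine predicate can be checked on its own before wiring it into zod — a plain-function sketch of the suggested rule (the helper name is hypothetical):

```typescript
// Accept http(s) URLs, file:// URLs, and absolute POSIX paths;
// reject everything else (relative paths, other schemes).
const isImportableFileRef = (v: string): boolean =>
  /^https?:\/\//i.test(v) || /^file:\/\//i.test(v) || v.startsWith('/');

console.log(isImportableFileRef('https://example.com/export.csv')); // true
console.log(isImportableFileRef('file:///tmp/export.csv'));         // true
console.log(isImportableFileRef('/tmp/export.csv'));                // true
console.log(isImportableFileRef('ftp://example.com/export.csv'));   // false
```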
178-195: Byte counting via `body.on('data')` competes with the pipeline

Attach a lightweight Transform to count bytes between `body` and `decompress` to preserve backpressure and avoid dual consumers.

```diff
-      if (maxBytes) {
-        body.on('data', (chunk: Buffer) => {
-          seenBytes += chunk.length;
-          if (seenBytes > maxBytes) {
-            controller.abort();
-            body.destroy(new Error(`Stream exceeded size limit (${seenBytes} > ${maxBytes})`));
-          }
-        });
-      }
+      const byteCounter = new (await import('node:stream')).Transform({
+        transform(chunk, _enc, cb) {
+          seenBytes += chunk.length;
+          if (maxBytes && seenBytes > maxBytes) {
+            cb(new Error(`Stream exceeded size limit (${seenBytes} > ${maxBytes})`));
+            return;
+          }
+          cb(null, chunk);
+        },
+      });
 @@
-      if (decompress) {
-        await pipeline(body, decompress, parser, { signal: controller.signal });
-      } else {
-        await pipeline(body, parser, { signal: controller.signal });
-      }
+      if (decompress) {
+        await pipeline(body, byteCounter, decompress, parser, { signal: controller.signal });
+      } else {
+        await pipeline(body, byteCounter, parser, { signal: controller.signal });
+      }
```

Also applies to: 211-227
91-99: Optional: support local Brotli (.br) like the remote path

Mirror the remote handling and support `.br` locally if needed.

```diff
-    const isGzipped = filePath.endsWith('.gz');
+    const isGzipped = filePath.endsWith('.gz');
+    const isBrotli = filePath.endsWith('.br');
 @@
-    if (isGzipped) {
+    if (isGzipped) {
       await pipeline(fileStream, createGunzip(), parser);
-    } else {
+    } else if (isBrotli) {
+      await pipeline(fileStream, createBrotliDecompress(), parser);
+    } else {
       await pipeline(fileStream, parser);
     }
```

packages/db/src/services/import.service.ts (3)
78-92: Consider using CTEs or subqueries for better maintainability.

The `argMin`/`argMax` tuple approach works but is fragile—the 17-element tuple order must match the destructuring on lines 147-173. If fields are added, removed, or reordered, this becomes error-prone and difficult to debug.

A clearer alternative would be to use CTEs with `ROW_NUMBER()`, or separate queries for first/last events, making field selection explicit:

```sql
WITH ranked_events AS (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY device_id, session_id, project_id, profile_id ORDER BY created_at ASC) AS first_rank,
    ROW_NUMBER() OVER (PARTITION BY device_id, session_id, project_id, profile_id ORDER BY created_at DESC) AS last_rank
  FROM ${TABLE_NAMES.events_imports}
  WHERE ${rangeWhere}
    AND name NOT IN ('session_start', 'session_end')
)
SELECT ... FROM ranked_events WHERE first_rank = 1 OR last_rank = 1
```
26-525: Consider adding explicit error handling and logging.

None of the functions include try-catch blocks or error enrichment. While throwing errors to the caller is acceptable, adding contextual error handling would improve debuggability and allow for better error classification (transient vs. permanent failures).
Consider wrapping critical operations:

```typescript
export async function insertImportBatch(
  events: IClickhouseEvent[],
  importId: string,
): Promise<ImportStageResult> {
  try {
    if (events.length === 0) {
      return { importId, totalEvents: 0, insertedEvents: 0 };
    }

    const eventsWithMetadata = events.map((event) => ({
      ...event,
      import_id: importId,
      import_status: 'pending',
      imported_at_meta: new Date(),
    }));

    await ch.insert({
      table: TABLE_NAMES.events_imports,
      values: eventsWithMetadata,
      format: 'JSONEachRow',
    });

    return {
      importId,
      totalEvents: events.length,
      insertedEvents: events.length,
    };
  } catch (error) {
    // Add context and rethrow or log
    throw new Error(
      `Failed to insert import batch for ${importId}: ${error instanceof Error ? error.message : String(error)}`,
    );
  }
}
```

This pattern would help with debugging import failures in production.
278-356: Consider batch migration for very large imports.

The `INSERT INTO ... SELECT` approach is efficient for most cases, but for imports with millions of events, it might:
- Block for extended periods
- Consume significant memory
- Provide no progress feedback during the migration
For very large imports, consider implementing batch migration:
```typescript
export async function migrateImportToProduction(
  importId: string,
  from?: string,
  to?: string,
  batchSize = 100000, // configurable batch size
): Promise<void> {
  let offset = 0;
  let migrated = 0;

  while (true) {
    const migrationQuery = `
      INSERT INTO ${TABLE_NAMES.events} (...)
      SELECT ...
      FROM ${TABLE_NAMES.events_imports}
      WHERE import_id = {importId:String}
        ${from ? 'AND created_at >= {from:String}' : ''}
        ${to ? 'AND created_at < {to:String}' : ''}
      ORDER BY created_at ASC
      LIMIT {batchSize:UInt32} OFFSET {offset:UInt32}
    `;

    const result = await ch.command({
      query: migrationQuery,
      query_params: { importId, from, to, batchSize, offset },
    });

    // If no rows affected, we're done
    if (result.query_id && migrated === 0) break;

    offset += batchSize;
    // Could call updateImportProgress here
  }
}
```

This is optional and depends on expected import sizes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `packages/importer/a70ff57d-f632-4292-a98b-658d7734fbec.csv` is excluded by `!**/*.csv`
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (35)
- `IMPORT_REVIEW.md` (1 hunks)
- `apps/start/package.json` (1 hunks)
- `apps/start/src/components/integrations/integration-card.tsx` (1 hunks)
- `apps/start/src/components/ui/badge.tsx` (1 hunks)
- `apps/start/src/modals/add-import.tsx` (1 hunks)
- `apps/start/src/modals/index.tsx` (2 hunks)
- `apps/start/src/routeTree.gen.ts` (11 hunks)
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx` (1 hunks)
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx` (1 hunks)
- `apps/worker/package.json` (1 hunks)
- `apps/worker/src/boot-workers.ts` (5 hunks)
- `apps/worker/src/index.ts` (2 hunks)
- `apps/worker/src/jobs/import.ts` (1 hunks)
- `packages/db/code-migrations/5-add-imports-table.sql` (1 hunks)
- `packages/db/code-migrations/5-add-imports-table.ts` (1 hunks)
- `packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251018214030_fix/migration.sql` (1 hunks)
- `packages/db/prisma/schema.prisma` (2 hunks)
- `packages/db/src/clickhouse/client.ts` (1 hunks)
- `packages/db/src/services/event.service.ts` (1 hunks)
- `packages/db/src/services/import.service.ts` (1 hunks)
- `packages/db/src/types.ts` (2 hunks)
- `packages/importer/package.json` (1 hunks)
- `packages/importer/src/base-provider.ts` (1 hunks)
- `packages/importer/src/index.ts` (1 hunks)
- `packages/importer/src/providers/umami.ts` (1 hunks)
- `packages/importer/src/types.ts` (1 hunks)
- `packages/importer/src/utils/referrer-parser.ts` (1 hunks)
- `packages/importer/tsconfig.json` (1 hunks)
- `packages/importer/vitest.config.ts` (1 hunks)
- `packages/queue/src/queues.ts` (1 hunks)
- `packages/trpc/package.json` (1 hunks)
- `packages/trpc/src/root.ts` (2 hunks)
- `packages/trpc/src/routers/import.ts` (1 hunks)
- `packages/validation/src/index.ts` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
- `apps/start/src/components/integrations/integration-card.tsx`
- `apps/start/src/modals/index.tsx`
- `apps/start/src/modals/add-import.tsx`
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx`
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx`
- `apps/start/src/components/ui/badge.tsx`
- `apps/start/src/routeTree.gen.ts`
🧬 Code graph analysis (17)
packages/db/src/types.ts (1)
- packages/validation/src/index.ts (1): `IImportConfig` (464-467)

packages/importer/src/utils/referrer-parser.ts (1)
- packages/common/src/string.ts (1): `stripTrailingSlash` (1-3)

packages/queue/src/queues.ts (1)
- packages/redis/redis.ts (1): `getRedisQueue` (93-105)

apps/worker/src/jobs/import.ts (3)
- packages/queue/src/queues.ts (1): `ImportQueuePayload` (163-168)
- packages/importer/src/providers/umami.ts (2): `UmamiRawEvent` (55-55), `UmamiProvider` (57-426)
- packages/db/src/services/import.service.ts (8): `insertImportBatch` (26-53), `updateImportProgress` (387-400), `getImportDateBounds` (508-525), `reconstructSessions` (61-272), `migrateImportToProduction` (278-356), `markImportComplete` (361-382), `getImportProgress` (434-488), `markImportFailed` (405-429)

apps/start/src/modals/index.tsx (1)
- apps/start/src/modals/add-import.tsx (1): `AddImport` (29-164)

packages/importer/vitest.config.ts (1)
- vitest.shared.ts (1): `getSharedVitestConfig` (4-27)

apps/worker/src/boot-workers.ts (2)
- packages/queue/src/queues.ts (1): `importQueue` (170-176)
- apps/worker/src/jobs/import.ts (1): `importJob` (19-198)

packages/trpc/src/root.ts (1)
- packages/trpc/src/routers/import.ts (1): `importRouter` (11-183)

packages/db/code-migrations/5-add-imports-table.ts (2)
- packages/db/code-migrations/helpers.ts (1): `getIsCluster` (17-24)
- packages/db/src/clickhouse/migration.ts (2): `createTable` (68-116), `runClickhouseMigrationCommands` (322-454)

apps/start/src/modals/add-import.tsx (2)
- apps/start/src/integrations/tanstack-query/root-provider.tsx (1): `Provider` (68-86)
- packages/importer/src/types.ts (1): `ImportConfig` (7-12)

packages/trpc/src/routers/import.ts (6)
- packages/trpc/src/trpc.ts (2): `createTRPCRouter` (138-138), `protectedProcedure` (161-164)
- packages/trpc/src/access.ts (1): `getProjectAccessCached` (4-4)
- packages/trpc/src/errors.ts (1): `TRPCAccessError` (3-7)
- packages/db/src/prisma-client.ts (1): `db` (270-270)
- packages/validation/src/index.ts (1): `zCreateImport` (469-477)
- packages/queue/src/queues.ts (1): `importQueue` (170-176)

packages/db/src/services/import.service.ts (3)
- packages/db/src/services/event.service.ts (1): `IClickhouseEvent` (68-102)
- packages/db/src/clickhouse/client.ts (4): `ch` (131-141), `TABLE_NAMES` (48-60), `convertClickhouseDateToJs` (218-220), `formatClickhouseDate` (188-199)
- packages/db/src/prisma-client.ts (1): `db` (270-270)

apps/worker/src/index.ts (1)
- packages/queue/src/queues.ts (1): `importQueue` (170-176)

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1): `Route` (12-34)
- apps/start/src/components/integrations/integration-card.tsx (3): `IntegrationCard` (91-112), `IntegrationCardLogoImage` (50-69), `IntegrationCardFooter` (3-15)

packages/importer/src/base-provider.ts (2)
- packages/importer/src/types.ts (3): `BaseRawEvent` (40-42), `ImportJobMetadata` (69-73), `ErrorContext` (45-51)
- packages/db/src/services/event.service.ts (1): `IClickhouseEvent` (68-102)

packages/importer/src/providers/umami.ts (4)
- packages/validation/src/index.ts (1): `IUmamiImportConfig` (450-450)
- packages/db/src/services/event.service.ts (2): `IClickhouseEvent` (68-102), `query` (701-847)
- packages/common/src/url.ts (2): `parsePath` (11-49), `isSameDomain` (51-63)
- packages/importer/src/utils/referrer-parser.ts (2): `parseReferrer` (49-58), `getReferrerWithQuery` (60-91)

packages/importer/src/types.ts (2)
- packages/db/src/services/event.service.ts (1): `IServiceImportedEventPayload` (144-147)
- packages/db/src/services/import.service.ts (2): `ImportProgress` (16-21), `ImportStageResult` (10-14)
🪛 Gitleaks (8.28.0)
apps/worker/src/index.ts
[high] 60-60: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
🪛 LanguageTool
IMPORT_REVIEW.md
[uncategorized] ~35-~35: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... batchNumber: number ) ``` --- ## 🟡 Medium Priority Issues ### 2. **Missing UUID Dependenc...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[grammar] ~156-~156: Ensure spelling is correct
Context: ...d for EVERY event ``` Improvement: Cleanup periodically (e.g., every 1000 events) ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-and-push-api
- GitHub Check: build-and-push-worker
🔇 Additional comments (24)
apps/start/package.json (1)
6-6: Clarify the motivation for renaming "testing" → "testing2".

The script was renamed from `"testing"` to `"testing2"`, but the name `"testing2"` is not descriptive and appears to be a placeholder. Additionally, this change seems orthogonal to the PR objective ("new importer").

Please clarify:
- What is the intended purpose of renaming this script?
- Is `"testing2"` a temporary name, or was a different name intended?
- Was the original `"testing"` script referenced in CI/CD pipelines, documentation, or developer workflows? If so, those references need to be updated to avoid breaking them.

If this change is unintended, consider reverting it.
apps/start/src/components/integrations/integration-card.tsx (1)
50-69: LGTM! Consistent API enhancement.

The addition of the optional `className` prop aligns `IntegrationCardLogoImage` with the other components in this file (`IntegrationCardFooter`, `IntegrationCardHeader`, `IntegrationCardHeaderButtons`) that already accept `className`. The prop is properly forwarded to `IntegrationCardLogo`, which correctly merges it using the `cn()` utility.

apps/start/src/components/ui/badge.tsx (2)
12-13: LGTM! Success variant follows good practices.

The emerald-700 background with white text provides good contrast and is semantically appropriate for success states.
16-16: LGTM! Info variant is well-implemented.

The blue-500 background with white text provides adequate contrast and follows the established pattern.
packages/importer/tsconfig.json (1)
1-9: LGTM! Clean TypeScript configuration.

The configuration is properly structured for the new importer package, extending the shared base config and setting appropriate compiler options for a source-in-src, output-to-dist layout.
packages/importer/package.json (1)
27-27: `bullmq` dependency placement is correct.

Verification shows `bullmq` is exclusively in `devDependencies` with no runtime imports or usage in the importer source code. The package does not directly enqueue or process jobs—job operations are handled through the `@openpanel/queue` workspace dependency. The current placement is appropriate.
2-2: Alignment with validation types looks good

The Prisma JSON alias cleanly mirrors the validation union. No issues spotted.
Also applies to: 16-16
packages/trpc/package.json (1)
19-19: No circular dependency detected—safe to proceed.

Verification confirms @openpanel/queue does not depend on @openpanel/trpc. The dependency is unidirectional: @openpanel/trpc (line 19) → @openpanel/queue. No cycle risk.
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
45-45: LGTM! Clean addition of the Imports tab.

The new imports tab follows the existing pattern and integrates well with the settings navigation structure.
apps/worker/package.json (1)
23-23: LGTM! Necessary dependency for import functionality.

The addition of the `@openpanel/importer` workspace dependency enables the worker to process import jobs.
59-59: LGTM! Table name constant added correctly.

The `events_imports` table name addition follows the existing pattern and aligns with the new ClickHouse migration.
8-8: LGTM! Import queue integration added correctly.

The `importQueue` is properly imported and registered with Bull Board for monitoring.

Also applies to: 42-42
packages/db/src/services/event.service.ts (1)
144-147: LGTM! Type definition is clear and purposeful.

The `IServiceImportedEventPayload` type correctly omits `profile` and `meta` fields for imported events, which makes sense as these would need to be resolved separately during import processing.
42-43: Verify the 7-day TTL is intentional for staging data.

The TTL of 7 days on `imported_at_meta` means all staging import data will be automatically deleted after one week. Confirm this aligns with your data retention requirements and import workflow expectations. If imports might take longer than 7 days to validate, or if you need to keep historical import records, consider increasing this value or making it configurable.
1-38: LGTM! Well-designed ClickHouse table schema. The table structure uses appropriate data types, codecs for compression, and partitioning strategy. The use of LowCardinality for categorical fields and specialized codecs (ZSTD, Delta, Gorilla) shows good understanding of ClickHouse optimization patterns.
packages/db/prisma/schema.prisma (1)
470-486: LGTM! Import model is well-structured. The Import model has appropriate fields for tracking import progress, includes proper relations to Project with cascade delete, and uses sensible defaults. The use of Json for config provides flexibility, while the comment /// [IPrismaImportConfig] indicates type documentation exists elsewhere.
Also applies to: 197-197
IMPORT_REVIEW.md (2)
37-59: Remove the action item; the importer uses node:crypto, not the uuid package. The script results show that packages/importer uses randomUUID() imported from node:crypto (in packages/importer/src/providers/umami.ts and packages/cli/src/importer/importer.ts), not the external uuid package. The uuid package is used elsewhere in the codebase (specifically in packages/db/src/services/event.service.ts), but not in the importer package. The review comment's recommendation to add uuid to the importer's dependencies is unnecessary.
Likely an incorrect or invalid review comment.
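The node:crypto approach referenced above fits in a few lines; the helper names here are mine for illustration, not the importer's actual API:

```typescript
// Stdlib-only ID generation, as the importer already does via node:crypto.
import { randomUUID } from 'node:crypto';

export function newImportEventId(): string {
  return randomUUID();
}

// RFC 4122 v4 shape: 8-4-4-4-12 hex digits with a '4' version nibble.
export function looksLikeUuidV4(id: string): boolean {
  return /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i.test(
    id,
  );
}
```

Because randomUUID ships with Node, no extra uuid dependency is needed.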
10-31: Review comment is based on outdated or incorrect code state; the actual implementation already uses proper types. The review comment cites code that does not match the current implementation:
- processBatch function: does not exist in the codebase (rg search found zero matches)
- any type usage: no matches found in apps/worker/src/jobs/ (contradicts the claim of "extensive use")
- Actual typing: line 46 of apps/worker/src/jobs/import.ts already uses const eventBatch: UmamiRawEvent[] = []; (properly typed, not any[])
- BaseImportProvider: confirmed generic with <TRawEvent extends BaseRawEvent = BaseRawEvent>, and UmamiProvider correctly extends it as BaseImportProvider<UmamiRawEvent>
The current implementation already has the proper type safety that the review comment suggests. The code is not coupled via weak typing but properly typed with UmamiRawEvent. No refactoring is needed.
Likely an incorrect or invalid review comment.
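A miniature of the generic provider pattern the verification describes; the DemoProvider and its event shape are hypothetical, and only the BaseImportProvider generic mirrors the real design:

```typescript
// The raw-event type parameter flows from the provider into validate/transform,
// so batches are typed as TRawEvent[] rather than any[].
interface BaseRawEvent {
  created_at: string;
}

abstract class BaseImportProvider<TRawEvent extends BaseRawEvent> {
  abstract validate(raw: TRawEvent): boolean;
  abstract transformEvent(raw: TRawEvent): { createdAt: string; name: string };
}

// Hypothetical provider used only to demonstrate the generic contract.
interface DemoRawEvent extends BaseRawEvent {
  event_name: string;
}

class DemoProvider extends BaseImportProvider<DemoRawEvent> {
  validate(raw: DemoRawEvent): boolean {
    return raw.event_name.length > 0;
  }
  transformEvent(raw: DemoRawEvent) {
    return { createdAt: raw.created_at, name: raw.event_name };
  }
}

export const demoProvider = new DemoProvider();
```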
apps/start/src/routeTree.gen.ts (1)
57-57: Generated file: no manual edits. Looks consistent with the new Imports tab. No action; keep excluded from lint/format.
Also applies to: 373-378, 511-511, 564-564, 800-801, 1170-1176, 1504-1517
packages/importer/src/types.ts (1)
20-25: Naming duplication exists but no actual import collision risk; the review comment is incorrect. The analysis shows that while ImportProgress and ImportStageResult are defined in both packages, the importer package does not import these collision types from @openpanel/db. The importer only imports unrelated types like IClickhouseEvent and IServiceImportedEventPayload.
Additionally, the ImportProgress types serve different purposes: the importer's version tracks batch processing state (totalEvents, processedEvents, currentBatch, totalBatches), while the db version tracks job metadata (importId, totalEvents, insertedEvents, status). This is intentional, not a collision.
Likely an incorrect or invalid review comment.
packages/db/src/services/import.service.ts (4)
1-21: LGTM! The imports and type definitions are well-structured. The interfaces clearly define the data contracts for staging results and progress tracking.
26-53: LGTM! The batch insert function properly handles empty arrays, adds appropriate metadata, and leverages the retry-wrapped ClickHouse insert. The implementation is clean and defensive.
493-503: Verify cleanup scope: should the Postgres import record also be deleted? The cleanupFailedImport function only deletes from ClickHouse's events_imports table. The corresponding Postgres import record remains with status 'failed'. Is this intentional (to preserve audit history), or should the Postgres record also be deleted for consistency?
If Postgres records should also be cleaned up, add:
 export async function cleanupFailedImport(importId: string): Promise<void> {
   const deleteQuery = `
     DELETE FROM ${TABLE_NAMES.events_imports}
     WHERE import_id = {importId:String}
       AND import_status = 'failed'
   `;

   await ch.command({
     query: deleteQuery,
     query_params: { importId },
   });
+
+  // Also remove from Postgres if desired
+  await db.import.delete({
+    where: { id: importId, status: 'failed' },
+  });
 }
508-525: LGTM! The getImportDateBounds utility correctly handles the nullable case and provides a clean interface for date range queries. The defensive null check on line 524 prevents runtime errors.
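A sketch of the nullable-bounds pattern that comment praises, under the assumption that ClickHouse returns min/max timestamps as nullable strings; the names are illustrative, not the service's real API:

```typescript
// When no rows are staged for an import, min/max come back empty and the
// helper returns null instead of producing Invalid Date objects.
export interface DateBounds {
  earliest: Date;
  latest: Date;
}

export function toDateBounds(
  minStr: string | null | undefined,
  maxStr: string | null | undefined,
): DateBounds | null {
  if (!minStr || !maxStr) {
    return null; // defensive: nothing staged for this import
  }
  return { earliest: new Date(minStr), latest: new Date(maxStr) };
}
```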
warning:
  'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
Fix color contrast for accessibility.
White text on a yellow-500 background likely fails WCAG AA contrast requirements (4.5:1). Yellow backgrounds conventionally require dark text for proper readability.
Apply this diff to fix the contrast issue:
- warning:
- 'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+ warning:
+ 'border-transparent bg-yellow-500 text-yellow-950 [a&]:hover:bg-yellow-500/90',
Alternatively, use a darker yellow background with white text:
- warning:
- 'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+ warning:
+ 'border-transparent bg-yellow-600 text-white [a&]:hover:bg-yellow-600/90',
🤖 Prompt for AI Agents
In apps/start/src/components/ui/badge.tsx around lines 14-15, the warning
variant uses 'bg-yellow-500 text-white' which likely fails WCAG contrast; change
either the text color to a dark color (e.g., replace text-white with
text-gray-900 or text-black) so dark text sits on yellow-500, or keep white text
and use a darker yellow background (e.g., replace bg-yellow-500 with
bg-yellow-600 or bg-yellow-700) and adjust the hover class accordingly to
maintain accessible contrast.
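The 4.5:1 claim can be checked numerically with the WCAG 2.x relative-luminance formula; the hex value for Tailwind's yellow-500 (#eab308) is an assumption about the Tailwind version in use:

```typescript
// WCAG 2.x contrast ratio between two hex colors.
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
}

function relativeLuminance(hex: string): number {
  const n = Number.parseInt(hex.replace('#', ''), 16);
  const r = channelToLinear((n >> 16) & 0xff);
  const g = channelToLinear((n >> 8) & 0xff);
  const b = channelToLinear(n & 0xff);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

export function contrastRatio(fg: string, bg: string): number {
  const l1 = Math.max(relativeLuminance(fg), relativeLuminance(bg));
  const l2 = Math.min(relativeLuminance(fg), relativeLuminance(bg));
  return (l1 + 0.05) / (l2 + 0.05);
}
```

With these values, white on yellow-500 lands around 1.9:1 (failing AA), while dark text on the same background clears the threshold comfortably, which supports the suggested fix.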
      toast.error('Import failed', {
        description: error.message,
      });
    },
  }),
);
Avoid leaking secrets in error paths/logs
error.message may include upstream details. Ensure server never echoes API secrets. Consider a generic toast and log details server-side only.
🤖 Prompt for AI Agents
In apps/start/src/modals/add-import.tsx around lines 55 to 60, the current toast
displays error.message which can leak upstream secrets; replace the client-side
toast with a generic user-facing message (e.g., "Import failed, try again
later") and remove inclusion of error.message, and instead ensure the server
logs the full error details securely (or returns a sanitized error code/message)
so secrets are never echoed to the client; update the toast usage accordingly
and confirm server-side logging/response sanitization is in place.
apps/worker/src/jobs/import.ts
Outdated
import { UmamiProvider, type UmamiRawEvent } from '@openpanel/importer';
import type { ImportQueuePayload } from '@openpanel/queue';
import type { Job } from 'bullmq';
🛠️ Refactor suggestion | 🟠 Major
Remove UmamiRawEvent coupling; use BaseImportProvider generics
This job should be provider-agnostic. Swap UmamiRawEvent for BaseImportProvider.
-import { UmamiProvider, type UmamiRawEvent } from '@openpanel/importer';
+import { UmamiProvider, type BaseImportProvider } from '@openpanel/importer';
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 12 to 14, remove the direct
coupling to UmamiRawEvent and instead use the generic BaseImportProvider<T> from
the importer package: drop the UmamiRawEvent import, import BaseImportProvider
(and its type) from '@openpanel/importer', and replace any references to
UmamiRawEvent in the job/payload types with the generic BaseImportProvider<T>
(e.g., ImportQueuePayload<BaseImportProvider<any>> or a concrete T where
appropriate). Ensure all imports and Job generics are updated to use the
BaseImportProvider generic so the job becomes provider-agnostic.
apps/worker/src/jobs/import.ts
Outdated
// Phase 1: Load & Transform - Process events in batches
const eventBatch: UmamiRawEvent[] = [];

for await (const rawEvent of providerInstance.parseSource()) {
  // Validate event
  if (!providerInstance.validate(rawEvent)) {
    jobLogger.warn('Skipping invalid event', { rawEvent });
    continue;
  }

  eventBatch.push(rawEvent);
  totalEvents++;

  // Process batch when it reaches the batch size
  if (eventBatch.length >= BATCH_SIZE) {
    jobLogger.info('Processing batch', { batchSize: eventBatch.length });

    // Transform events
    const transformedEvents = eventBatch.map((event) =>
      providerInstance.transformEvent(event),
    );

    // Phase 2: Insert into staging table
    await insertImportBatch(transformedEvents, importId);
🛠️ Refactor suggestion | 🟠 Major
Make batch buffer provider-agnostic and harden transform stage
- Use unknown[] for the raw batch.
- Guard transformEvent with try/catch to skip bad rows without failing the whole job.
- const eventBatch: UmamiRawEvent[] = [];
+ const eventBatch: unknown[] = [];
@@
- const transformedEvents = eventBatch.map((event) =>
- providerInstance.transformEvent(event),
- );
+ const transformedEvents: any[] = [];
+ for (const event of eventBatch) {
+ try {
+ transformedEvents.push(providerInstance.transformEvent(event as any));
+ } catch (e) {
+ jobLogger.warn('Skipping event on transform failure', {
+ provider: providerInstance.provider,
+ reason: e instanceof Error ? e.message : String(e),
+ });
+ }
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 45 to 69, the batch buffer and
transform stage need hardening: change eventBatch from UmamiRawEvent[] to
unknown[] so the buffer is provider-agnostic, and wrap each
providerInstance.transformEvent call in a try/catch so a thrown error on one row
is caught, logged (jobLogger.warn with the rawEvent and error), and that row is
skipped without failing the whole job; ensure you collect only successful
transformed events (filter out failed/undefined results) before calling
insertImportBatch(importId).
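The skip-on-failure idea above can be factored into a small provider-agnostic helper; this is a sketch with illustrative names, not the job's actual code:

```typescript
// Transforms a batch, logging and skipping rows whose transform throws,
// so one malformed event cannot fail the whole import job.
export function transformBatch<TRaw, TOut>(
  batch: TRaw[],
  transform: (raw: TRaw) => TOut,
  onSkip: (reason: string) => void = () => {},
): TOut[] {
  const out: TOut[] = [];
  for (const raw of batch) {
    try {
      out.push(transform(raw));
    } catch (e) {
      onSkip(e instanceof Error ? e.message : String(e));
    }
  }
  return out;
}
```

The worker loop would then call transformBatch(eventBatch, (e) => providerInstance.transformEvent(e), reason => jobLogger.warn('Skipping event', { reason })) before inserting the surviving rows.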
let properties: Record<string, any> = {};

if (query) {
  properties.__query = query;
}

// Add useful properties from Umami data
if (rawEvent.page_title) properties.__title = rawEvent.page_title;
if (rawEvent.screen) properties.__screen = rawEvent.screen;
if (rawEvent.language) properties.__language = rawEvent.language;
if (rawEvent.utm_source)
  properties = assocPath(
    ['__query', 'utm_source'],
    rawEvent.utm_source,
    properties,
  );
if (rawEvent.utm_medium)
  properties = assocPath(
    ['__query', 'utm_medium'],
    rawEvent.utm_medium,
    properties,
  );
if (rawEvent.utm_campaign)
  properties = assocPath(
    ['__query', 'utm_campaign'],
    rawEvent.utm_campaign,
    properties,
  );
if (rawEvent.utm_content)
  properties = assocPath(
    ['__query', 'utm_content'],
    rawEvent.utm_content,
    properties,
  );
if (rawEvent.utm_term)
  properties = assocPath(
    ['__query', 'utm_term'],
    rawEvent.utm_term,
    properties,
  );
properties contains nested objects; ClickHouse expects primitives
properties.__query = query and nested __query.* entries violate IClickhouseEvent.properties type. Flatten or stringify.
- let properties: Record<string, any> = {};
-
- if (query) {
- properties.__query = query;
- }
+ const properties: Record<string, string | number | boolean | null | undefined> = {};
+ if (query) {
+ for (const [k, v] of Object.entries(query)) {
+ properties[`query.${k}`] = v as string;
+ }
+ }
@@
- if (rawEvent.utm_source)
- properties = assocPath(['__query', 'utm_source'], rawEvent.utm_source, properties);
- if (rawEvent.utm_medium)
- properties = assocPath(['__query', 'utm_medium'], rawEvent.utm_medium, properties);
- if (rawEvent.utm_campaign)
- properties = assocPath(['__query', 'utm_campaign'], rawEvent.utm_campaign, properties);
- if (rawEvent.utm_content)
- properties = assocPath(['__query', 'utm_content'], rawEvent.utm_content, properties);
- if (rawEvent.utm_term)
- properties = assocPath(['__query', 'utm_term'], rawEvent.utm_term, properties);
+ if (rawEvent.utm_source) properties.utm_source = rawEvent.utm_source;
+ if (rawEvent.utm_medium) properties.utm_medium = rawEvent.utm_medium;
+ if (rawEvent.utm_campaign) properties.utm_campaign = rawEvent.utm_campaign;
+ if (rawEvent.utm_content) properties.utm_content = rawEvent.utm_content;
+ if (rawEvent.utm_term) properties.utm_term = rawEvent.utm_term;
🤖 Prompt for AI Agents
In packages/importer/src/providers/umami.ts around lines 301 to 341, properties
is being built with nested objects (properties.__query = query and
assocPath(['__query', ...])) which violates IClickhouseEvent.properties that
expects primitives; replace the nested structure by either flattening the query
into primitive top-level keys (e.g. __query_<key>=value) or by storing the whole
query as a JSON string (e.g. __query = JSON.stringify(query)) and update the
assocPath usage to set primitive keys accordingly so every properties entry is a
primitive value compatible with ClickHouse.
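A minimal sketch of the flattening option: nested query params become primitive query.<key> entries, keeping every property value ClickHouse-compatible (the function name is illustrative):

```typescript
// Flattens a query-string object into primitive-valued property keys,
// e.g. { utm_source: 'x' } -> { 'query.utm_source': 'x' }.
export function flattenQueryParams(
  query: Record<string, string>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [k, v] of Object.entries(query)) {
    out[`query.${k}`] = v;
  }
  return out;
}
```

The alternative mentioned in the prompt, storing the whole object as one JSON string (e.g. __query = JSON.stringify(query)), trades queryability for a single opaque value.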
// Properties type for events - more specific than Record<string, any>
export interface EventProperties {
  [key: string]:
    | string
    | number
    | boolean
    | null
    | undefined
    | Record<string, unknown>;
  __query?: Record<string, unknown>;
  __title?: string;
  __screen?: string;
  __language?: string;
}
EventProperties allows nested objects, but ClickHouse properties is primitives-only
IClickhouseEvent.properties is Record<string, string|number|boolean|undefined|null>. Storing objects (e.g., __query) will break ingestion unless stringified.
- Constrain EventProperties values to primitives.
- Or require JSON.stringify for nested shapes (e.g., __queryJson: string).
🤖 Prompt for AI Agents
In packages/importer/src/types.ts around lines 53 to 66, EventProperties
currently allows nested objects which will break ClickHouse ingestion because
IClickhouseEvent.properties must be primitives-only; update the EventProperties
interface so values are limited to string | number | boolean | null | undefined
(remove Record<string, unknown>) or replace nested object fields with explicit
JSON string fields (e.g., __queryJson: string) and rename/remove existing
__query, ensuring all nested shapes are JSON.stringified before being assigned
to EventProperties.
@@ -0,0 +1,3 @@
import { getSharedVitestConfig } from '../../vitest.shared';

export default getSharedVitestConfig({ __dirname });
__dirname is unavailable in ESM modules.
Since package.json declares "type": "module", this file runs as ESM. The __dirname global doesn't exist in ESM and will throw a ReferenceError.
Apply this diff to compute the directory path in ESM:
+import { dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
import { getSharedVitestConfig } from '../../vitest.shared';
-export default getSharedVitestConfig({ __dirname });
+export default getSharedVitestConfig({
+ __dirname: dirname(fileURLToPath(import.meta.url))
+});
🤖 Prompt for AI Agents
In packages/importer/vitest.config.ts around line 3, the code passes the
non-existent __dirname to getSharedVitestConfig (ESM modules don't have
__dirname); compute the directory from import.meta.url instead and pass that
value: import fileURLToPath from 'url' and path from 'path' (or use their named
functions), derive const __dirname =
path.dirname(fileURLToPath(import.meta.url)), then call getSharedVitestConfig({
__dirname }) so the config receives the correct directory path in ESM.
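The ESM-safe replacement for __dirname boils down to two stdlib calls; here it is wrapped in a hypothetical helper so it can be reused across config files (the path in the test assumes a POSIX filesystem):

```typescript
// In ESM there is no __dirname global; derive it from a file:// URL such as
// import.meta.url using only Node's stdlib.
import { dirname } from 'node:path';
import { fileURLToPath } from 'node:url';

export function dirFromModuleUrl(moduleUrl: string): string {
  return dirname(fileURLToPath(moduleUrl));
}
```

Inside vitest.config.ts the call site would be dirFromModuleUrl(import.meta.url), passed as the __dirname option.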
list: protectedProcedure
  .input(z.object({ projectId: z.string() }))
  .query(async ({ input, ctx }) => {
    const access = await getProjectAccessCached({
      projectId: input.projectId,
      userId: ctx.session.userId,
    });

    if (!access) {
      throw TRPCAccessError('You do not have access to this project');
    }

    return db.import.findMany({
      where: {
        projectId: input.projectId,
      },
      orderBy: {
        createdAt: 'desc',
      },
    });
  }),
Sensitive config exposure in list/get — restrict or redact.
config likely contains provider credentials. Current list/get return it to any project member with read access. Redact or gate by role.
Option A (redact for read-level):
list: protectedProcedure
.input(z.object({ projectId: z.string() }))
.query(async ({ input, ctx }) => {
const access = await getProjectAccessCached({
projectId: input.projectId,
userId: ctx.session.userId,
});
if (!access) {
throw TRPCAccessError('You do not have access to this project');
}
- return db.import.findMany({
+ const rows = await db.import.findMany({
where: { projectId: input.projectId },
orderBy: { createdAt: 'desc' },
});
+ const canSeeSecrets =
+ typeof access !== 'boolean' && access.level !== 'read';
+ return canSeeSecrets
+ ? rows
+ : rows.map((r) => ({ ...r, config: null }));
}),

get: protectedProcedure
.input(z.object({ id: z.string() }))
.query(async ({ input, ctx }) => {
const importRecord = await db.import.findUniqueOrThrow({
where: { id: input.id },
include: { project: true },
});
const access = await getProjectAccessCached({
projectId: importRecord.projectId,
userId: ctx.session.userId,
});
if (!access) {
throw TRPCAccessError('You do not have access to this import');
}
- return importRecord;
+ const canSeeSecrets =
+ typeof access !== 'boolean' && access.level !== 'read';
+ return canSeeSecrets ? importRecord : { ...importRecord, config: null };
}),Option B: Use Prisma select to exclude config in list and only include it for admin/write roles.
Also applies to: 34-57
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 12 to 32 (and likewise apply
the change to lines 34 to 57), the current list/get resolvers return the full
import records including the sensitive config field to any project member;
change the DB queries to explicitly exclude the config field by using a Prisma
select (e.g., select: { config: false, ...otherFields }) for read-level access,
and only fetch/include config when the caller has a higher role (admin/write)
after re-checking the caller's role from ctx.session or the access object;
implement a conditional: if role allows, run a second query or include config in
select, otherwise omit/redact config from the returned objects and ensure no
accidental passthrough in list results.
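Option A's redaction step can be isolated as a pure helper; the row shape below is illustrative, not the actual Prisma type:

```typescript
// Read-level members get rows with config nulled out; higher roles see
// secrets. Returns new objects so the originals are never mutated.
interface ImportRow {
  id: string;
  config: Record<string, unknown> | null;
}

export function redactConfigs<T extends ImportRow>(
  rows: T[],
  canSeeSecrets: boolean,
): T[] {
  return canSeeSecrets ? rows : rows.map((r) => ({ ...r, config: null }));
}
```

A TRPC resolver would compute canSeeSecrets from the access object's role and pass the query result through this helper before returning it.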
export const zCreateImport = z.object({
  projectId: z.string().min(1),
  provider: z.enum(['umami', 'plausible', 'mixpanel']),
  config: z.union([
    zUmamiImportConfig,
    zPlausibleImportConfig,
    zMixpanelImportConfig,
  ]),
});
Discriminate by provider to prevent mismatched payloads
provider and config can disagree (e.g., provider='umami' with a Mixpanel config). Make zCreateImport a discriminated union on provider so config shape is tied to the selected provider.
-export const zCreateImport = z.object({
- projectId: z.string().min(1),
- provider: z.enum(['umami', 'plausible', 'mixpanel']),
- config: z.union([
- zUmamiImportConfig,
- zPlausibleImportConfig,
- zMixpanelImportConfig,
- ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+ z.object({
+ projectId: z.string().min(1),
+ provider: z.literal('umami'),
+ config: zUmamiImportConfig,
+ }),
+ z.object({
+ projectId: z.string().min(1),
+ provider: z.literal('plausible'),
+ config: zPlausibleImportConfig,
+ }),
+ z.object({
+ projectId: z.string().min(1),
+ provider: z.literal('mixpanel'),
+ config: zMixpanelImportConfig,
+ }),
+]);
🤖 Prompt for AI Agents
In packages/validation/src/index.ts around lines 469 to 477, zCreateImport
currently allows provider and config to disagree; change it to a discriminated
union so each provider literal is paired with the correct config shape. Replace
the single z.object with either z.discriminatedUnion('provider', [...]) or a
z.union of three z.object schemas where provider is a literal ('umami' |
'plausible' | 'mixpanel') and config is the matching zUmamiImportConfig,
zPlausibleImportConfig, or zMixpanelImportConfig respectively; keep the same
exported name and types so callers get compile-time validation that provider and
config match.
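The discriminated-union idea can be illustrated without zod: the provider literal picks the config validator, so mismatched payloads fail validation. All names below are hypothetical, and the real schemas carry more fields:

```typescript
// Each provider literal maps to exactly one config check, so a 'umami'
// import can never pass validation while carrying a Mixpanel config.
type CreateImport =
  | { provider: 'umami'; config: { fileUrl: string } }
  | { provider: 'mixpanel'; config: { apiSecret: string } };

const configChecks: Record<string, (c: any) => boolean> = {
  umami: (c) => typeof c?.fileUrl === 'string',
  mixpanel: (c) => typeof c?.apiSecret === 'string',
};

export function isValidCreateImport(input: {
  provider: string;
  config: unknown;
}): input is CreateImport {
  const check = configChecks[input.provider];
  return check !== undefined && check(input.config);
}
```

zod's discriminatedUnion gives the same guarantee declaratively, plus static inference of the paired types.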
d71712e to b60dd0e (Compare)
Walkthrough
Adds an importer package, provider implementations (Umami, Mixpanel), DB migrations and ClickHouse support, a TRPC import router, UI for imports, job queue and worker import job, import service & CSV utilities, refactors common exports, and removes the CLI package.
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant UI as Start UI
participant TRPC
participant Queue as Job Queue
participant Worker
participant Provider as Import Provider
participant CH as ClickHouse
participant PG as Postgres
User->>UI: Configure & submit import
UI->>TRPC: create import
TRPC->>PG: INSERT import (status: pending)
TRPC->>Queue: enqueue import job
TRPC-->>UI: return created import
Queue->>Worker: deliver job
Worker->>PG: load import record
Worker->>Provider: parseSource() (stream events)
Provider-->>Worker: yield raw events
Worker->>CH: insertImportBatch (staging CSV)
CH-->>Worker: ack + progress
alt provider.shouldGenerateSessionIds
Worker->>CH: generateSessionIds()
end
Worker->>CH: createSessionsStartEndEvents()
Worker->>CH: moveImportsToProduction()
Worker->>CH: backfillSessionsToProduction()
Worker->>PG: mark import complete
Worker-->>TRPC: (status updated)
TRPC-->>UI: UI observes progress/completion
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Areas requiring extra attention:
Possibly related PRs
Poem
Pre-merge checks and finishing touches
❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
🔇 Additional comments (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/common/src/object.ts (1)
22-29: Inconsistent handling of whitespace-only strings. The guard on line 22 now filters out explicit empty strings, but whitespace-only strings (e.g., ' ') will pass this check, get trimmed to an empty string on line 28, and still be added to the accumulator. This creates inconsistent behavior.
Consider moving the empty string check after the cleaning step to ensure all empty values (including trimmed whitespace) are filtered consistently:
-      if (value === undefined || value === null || value === '') {
+      if (value === undefined || value === null) {
         return acc;
       }

       const cleanedValue =
         typeof value === 'string'
           ? removeInvalidSurrogates(value).trim()
           : String(value);
+
+      if (cleanedValue === '') {
+        return acc;
+      }

       return {
         ...acc,
         [`${path}${key}`]: cleanedValue,
       };
♻️ Duplicate comments (18)
apps/start/src/components/ui/badge.tsx (1)
12-16: New badge variants added for status indication. The three new variants (success, warning, info) extend the badge component to support import status displays and other UI feedback needs.
Note: The warning variant's color contrast issue (white text on yellow-500) has already been flagged in a previous review.
packages/db/code-migrations/5-add-imports-table.ts (1)
65-69: TTL must be defined at CREATE time (and on the local replicated table in cluster mode). ALTERing TTL on events_imports will hit the Distributed table in cluster mode and be a no-op. Bake the TTL into createTable so it applies to the replicated local table and the single-node path. Remove the ALTER.
Apply these diffs:
In this file:
const sqls: string[] = [
  ...createTable({
    name: 'events_imports',
    columns: [
      // ...
    ],
    orderBy: ['import_id', 'created_at'],
    partitionBy: 'toYYYYMM(imported_at_meta)',
    settings: {
      index_granularity: 8192,
    },
    distributionHash: 'cityHash64(import_id)',
    replicatedVersion: '1',
    isClustered,
+   ttl: "imported_at_meta + INTERVAL 7 DAY",
  }),
];

- // Add TTL policy for auto-cleanup after 7 days
- sqls.push(`
-   ALTER TABLE events_imports
-   MODIFY TTL imported_at_meta + INTERVAL 7 DAY
- `);
createTable(packages/db/src/clickhouse/migration.ts):export function createTable({ name: tableName, columns, indices = [], engine = 'MergeTree()', orderBy = ['tuple()'], partitionBy, settings = {}, distributionHash, replicatedVersion, isClustered, + ttl, }: CreateTableOptions): string[] { const columnDefinitions = [...columns, ...indices].join(',\n '); const settingsClause = Object.entries(settings).length ? `SETTINGS ${Object.entries(settings).map(([k, v]) => `${k} = ${v}`).join(', ')}` : ''; const partitionByClause = partitionBy ? `PARTITION BY ${partitionBy}` : ''; + const ttlClause = ttl ? `\nTTL ${ttl}` : ''; if (!isClustered) { return [ `CREATE TABLE IF NOT EXISTS ${tableName} ( ${columnDefinitions} ) ENGINE = ${engine} ${partitionByClause} ORDER BY (${orderBy.join(', ')}) -${settingsClause}`.trim(), +${ttlClause} +${settingsClause}`.trim(), ]; } return [ `CREATE TABLE IF NOT EXISTS ${replicated(tableName)} ON CLUSTER '{cluster}' ( ${columnDefinitions} ) ENGINE = Replicated${engine.replace(/^(.+?)\((.+?)?\)/, `$1('${CLUSTER_REPLICA_PATH.replace('{replicatedVersion}', replicatedVersion)}', '{replica}', $2)`).replace(/, \)$/, ')')} ${partitionByClause} ORDER BY (${orderBy.join(', ')}) -${settingsClause}`.trim(), +${ttlClause} +${settingsClause}`.trim(), `CREATE TABLE IF NOT EXISTS ${tableName} ON CLUSTER '{cluster}' AS ${replicated(tableName)} ENGINE = Distributed('{cluster}', currentDatabase(), ${replicated(tableName)}, ${distributionHash})`, ]; }Also applies to: 15-63
packages/validation/src/index.ts (1)
478-486: Make zCreateImport a discriminated union to bind provider ⇔ config.As-is,
provider='umami'can carry a Mixpanel config. Use a discriminated union onprovider.-export const zCreateImport = z.object({ - projectId: z.string().min(1), - provider: z.enum(['umami', 'plausible', 'mixpanel']), - config: z.union([ - zUmamiImportConfig, - zPlausibleImportConfig, - zMixpanelImportConfig, - ]), -}); +export const zCreateImport = z.discriminatedUnion('provider', [ + z.object({ + projectId: z.string().min(1), + provider: z.literal('umami'), + config: zUmamiImportConfig, + }), + z.object({ + projectId: z.string().min(1), + provider: z.literal('plausible'), + config: zPlausibleImportConfig, + }), + z.object({ + projectId: z.string().min(1), + provider: z.literal('mixpanel'), + config: zMixpanelImportConfig, + }), +]);packages/trpc/src/routers/import.ts (3)
12-32: Sensitive config exposure in list — restrict or redact.
configcontains provider credentials (API keys, secrets). Current implementation returns it to any project member with read access. Redact or gate by role.Option A (redact for read-level):
list: protectedProcedure .input(z.object({ projectId: z.string() })) .query(async ({ input, ctx }) => { const access = await getProjectAccessCached({ projectId: input.projectId, userId: ctx.session.userId, }); if (!access) { throw TRPCAccessError('You do not have access to this project'); } - return db.import.findMany({ + const rows = await db.import.findMany({ where: { projectId: input.projectId }, orderBy: { createdAt: 'desc' }, }); + const canSeeSecrets = + typeof access !== 'boolean' && access.level !== 'read'; + return canSeeSecrets + ? rows + : rows.map((r) => ({ ...r, config: null })); }),
34-56: Sensitive config exposure in get — restrict or redact.
configcontains provider credentials. Apply the same access-level check as inlistto redact secrets for read-only users.get: protectedProcedure .input(z.object({ id: z.string() })) .query(async ({ input, ctx }) => { const importRecord = await db.import.findUniqueOrThrow({ where: { id: input.id }, include: { project: true }, }); const access = await getProjectAccessCached({ projectId: importRecord.projectId, userId: ctx.session.userId, }); if (!access) { throw TRPCAccessError('You do not have access to this import'); } - return importRecord; + const canSeeSecrets = + typeof access !== 'boolean' && access.level !== 'read'; + return canSeeSecrets ? importRecord : { ...importRecord, config: null }; }),
72-99: Queue enqueue not wrapped — record can be left pending with no job.If
importQueue.addfails, DB keeps a dangling 'pending' import. Wrap in try/catch, set failed status, and pass stable jobId/attempts/backoff.// Create import record const importRecord = await db.import.create({ data: { projectId: input.projectId, config: input.config, status: 'pending', }, }); - // Add job to queue - const job = await importQueue.add('import', { - type: 'import', - payload: { - importId: importRecord.id, - }, - }); - - // Update import record with job ID - await db.import.update({ - where: { id: importRecord.id }, - data: { jobId: job.id }, - }); + // Add job to queue + try { + const job = await importQueue.add( + 'import', + { type: 'import', payload: { importId: importRecord.id } }, + { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } }, + ); + await db.import.update({ + where: { id: importRecord.id }, + data: { jobId: job.id }, + }); + } catch (err) { + await db.import.update({ + where: { id: importRecord.id }, + data: { status: 'failed', errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed' }, + }); + throw err; + }apps/worker/src/jobs/import.ts (3)
130-139: Redact raw event logging to avoid PII leakage.Line 137 logs the full
rawEventwhich may contain PII (emails, user identifiers, etc.). Log only minimal, non-PII identifiers.if ( !providerInstance.validate( // @ts-expect-error rawEvent, ) ) { - jobLogger.warn('Skipping invalid event', { rawEvent }); + jobLogger.warn('Skipping invalid event', { + provider: providerInstance.provider, + reason: 'validation_failed', + }); continue; }
126-169: Harden batch buffer and transform stage.Batch uses
anytype (Line 126) and transform stage (Lines 147-152) lacks error handling. A single transform failure fails the entire batch. Useunknown[]and wrap transforms in try/catch to skip invalid events.- const eventBatch: any = []; + const eventBatch: unknown[] = []; for await (const rawEvent of providerInstance.parseSource( resumeLoadingFrom, )) { // ... validation ... // Process batch when it reaches the batch size if (eventBatch.length >= BATCH_SIZE) { jobLogger.info('Processing batch', { batchSize: eventBatch.length }); - const transformedEvents: IClickhouseEvent[] = eventBatch.map( - ( - // @ts-expect-error - event, - ) => providerInstance!.transformEvent(event), - ); + const transformedEvents: IClickhouseEvent[] = []; + for (const event of eventBatch) { + try { + transformedEvents.push(providerInstance.transformEvent(event as any)); + } catch (e) { + jobLogger.warn('Skipping event on transform failure', { + provider: providerInstance.provider, + reason: e instanceof Error ? e.message : String(e), + }); + } + }
246-265: Sync final Postgres counters with ClickHouse progress.UI reads
processedEvents/totalEventsfrom Postgres. After completion, Postgres isn't updated with final ClickHouse counts, so progress may show stale values. Update both tofinalProgressvalues.await markImportComplete(importId); await updateImportStatus(jobLogger, job, importId, { step: 'completed', }); jobLogger.info('Import marked as complete'); // Get final progress const finalProgress = await getImportProgress(importId); jobLogger.info('Import job completed successfully', { totalEvents: finalProgress.totalEvents, insertedEvents: finalProgress.insertedEvents, status: finalProgress.status, }); + + // Keep Postgres in sync for UI progress + await db.import.update({ + where: { id: importId }, + data: { + totalEvents: finalProgress.totalEvents, + processedEvents: finalProgress.insertedEvents, + }, + });apps/start/src/modals/add-import.tsx (1)
245-249: Avoid leaking secrets in error messages.Line 247 displays
error.messagewhich may include upstream API details or secrets. Use a generic client-facing message and log details server-side only.onError: (error) => { toast.error('Import failed', { - description: error.message, + description: 'Failed to start import. Please check your configuration and try again.', }); },packages/importer/src/providers/umami.ts (5)
79-198: SSRF exposure persists when fetching arbitrary URLs. The fetch operation on line 97 accepts user-provided URLs without validating that they don't resolve to internal networks or restricted IP ranges. This remains a critical security risk.
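A partial mitigation is a URL pre-check before fetching. The sketch below is an assumption-laden illustration (the helper name `isLikelySafeUrl` is hypothetical): it only catches obvious scheme and IP-literal abuse, and full protection also requires resolving DNS and validating the resolved address, since a public hostname can point at an internal IP.

```typescript
// Hypothetical pre-fetch guard: reject non-HTTP(S) schemes and obvious
// private/loopback/link-local hosts. DNS-resolution checks are still
// needed, because a public hostname may resolve to an internal address.
function isLikelySafeUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return false;
  }
  const host = url.hostname.toLowerCase();
  if (host === 'localhost' || host === '0.0.0.0' || host === '[::1]') {
    return false;
  }
  // Obvious private / loopback / link-local IPv4 literals (not exhaustive).
  const privateV4 = /^(10\.|127\.|169\.254\.|192\.168\.|172\.(1[6-9]|2\d|3[01])\.)/;
  return !privateV4.test(host);
}
```

A guard like this would run before the fetch on line 97 and reject the request early instead of following it into an internal network.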
213-220: Device ID fallback remains incorrect and unsafe. Lines 216-217 still use `rawEvent.visit_id!` for both the `ip` and `ua` parameters with non-null assertions. The `visit_id` is semantically unrelated to IP address or user agent, and the non-null assertion will throw if `visit_id` is undefined.
224-228: URL construction issues remain unresolved. Line 225 concatenates hostname, path, and query without ensuring proper formatting. The query parameter needs a leading '?', and the hostname should be required for absolute URLs.
231-233: Referrer query parameter is still ignored. Line 232 constructs the referrerUrl but omits `rawEvent.referrer_query`, losing query-string information from the referrer.
257-296: Properties structure violates ClickHouse type expectations. Lines 260 and 267-296 build nested objects (`properties.__query` and `properties.__query.utm_*`) which conflict with `IClickhouseEvent.properties: Record<string, string | number | boolean | undefined | null>`. ClickHouse expects flat primitive values.
256-256: Timestamp collision risk persists. Lines 256 and 293 adjust timestamps by exactly ±1000ms (1 second). This can collide with real events at those timestamps, causing non-deterministic ordering in time-based queries.
Also applies to: 293-293
559-559: Non-deterministic status aggregation remains. Line 559 uses `any(import_status)`, which returns an arbitrary status when events have mixed statuses. This makes progress reporting unpredictable.
603-603: Type assertion bypasses runtime validation. Line 603 uses a type assertion for `row.status` without runtime validation. If ClickHouse returns an unexpected value (due to the `any()` issue on line 559 or data corruption), this will silently pass invalid data.
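A small validating parser is one way to replace the assertion with a runtime check. The status names below are assumptions drawn from this review, not confirmed against the actual schema:

```typescript
// Hypothetical runtime guard for values coming back from ClickHouse.
// Status names are illustrative; align them with the real schema.
const IMPORT_STATUSES = ['pending', 'processing', 'completed', 'failed'] as const;
type ImportStatus = (typeof IMPORT_STATUSES)[number];

function parseImportStatus(value: unknown): ImportStatus {
  if (
    typeof value === 'string' &&
    (IMPORT_STATUSES as readonly string[]).includes(value)
  ) {
    return value as ImportStatus;
  }
  throw new Error(`Unexpected import status from ClickHouse: ${String(value)}`);
}
```

Failing loudly here surfaces the `any()` aggregation problem instead of silently propagating a bad status to the UI.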
🧹 Nitpick comments (10)
apps/start/src/components/project-mapper.tsx (1)
50-50: Consider validating for duplicate source project IDs. The component allows users to add multiple mappings with the same "from" (source project ID), which could lead to ambiguous or conflicting import behavior. Consider adding validation to prevent duplicate source IDs, either here or in the parent form's validation schema.
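The validation could be as simple as collecting duplicate source IDs and blocking submission when any exist. A sketch with a hypothetical `ProjectMapping` shape (the real component's types may differ):

```typescript
// Hypothetical mapping shape; adapt to the component's actual props.
type ProjectMapping = { from: string; to: string };

// Return the source project IDs that appear more than once.
function findDuplicateSources(mappings: ProjectMapping[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const { from } of mappings) {
    if (seen.has(from)) {
      duplicates.add(from);
    }
    seen.add(from);
  }
  return [...duplicates];
}
```

In a form schema this could back a `refine`-style check that rejects the mapping list whenever `findDuplicateSources` returns a non-empty array.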
packages/db/src/clickhouse/client.ts (1)
136-156: Consider immutability in the insert proxy. The proxy mutates `args[0]` directly when injecting `clickhouse_settings`. While functional, this modifies the caller's object. Consider creating a new object to avoid mutation:
return (...args: any[]) => withRetry(() => { - args[0].clickhouse_settings = { + const settings = { // Allow bigger HTTP payloads/time to stream rows async_insert: 1, wait_for_async_insert: 1, // Increase insert timeouts and buffer sizes for large batches max_execution_time: 300, max_insert_block_size: '500000', max_http_get_redirects: '0', // Ensure JSONEachRow stays efficient input_format_parallel_parsing: 1, // Keep long-running inserts/queries from idling out at proxies by sending progress headers send_progress_in_http_headers: 1, http_headers_progress_interval_ms: '50000', // Ensure server holds the connection until the query is finished wait_end_of_query: 1, ...args[0].clickhouse_settings, }; + return value.apply(target, [{ ...args[0], clickhouse_settings: settings }]); - return value.apply(target, args); });packages/common/scripts/get-referrers.ts (2)
91-92: Improve error handling in the main function. The current error handling only logs to the console without providing actionable information or propagating the failure.
Apply this diff to improve error handling:
} catch (e) { - console.log(e); + console.error('Failed to generate referrers:', e); + process.exit(1); }This ensures the script fails with a non-zero exit code when referrer generation fails, allowing CI/build systems to detect the failure.
69-71: Implement robust error handling and a fallback strategy for the external data source. The Snowplow referrer data source is currently available and returns valid JSON (HTTP 200), but the script lacks fallback mechanisms. If the fetch fails during the build, the error is only logged to the console, so the build can silently proceed with stale or missing data. Consider adding retry logic, a timeout, or a fallback to a cached/bundled version to ensure build reliability when the external source becomes temporarily unavailable.
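Retry with exponential backoff can be wrapped around the fetch in a few lines. A generic sketch — the `withRetry` helper here is illustrative and not tied to this script's actual code:

```typescript
// Illustrative retry helper with exponential backoff. Wrap the referrer
// fetch in this, and fall back to a bundled copy if all attempts fail.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e;
      if (i < attempts - 1) {
        // 200ms, 400ms, 800ms, ... between attempts
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Usage would look like `await withRetry(() => fetch(url).then((r) => r.json()))`, with a catch that loads the bundled fallback.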
packages/importer/src/providers/metadata.ts (2)
13-30: Keep provider metadata in sync with validation (Plausible is present there). Validation accepts 'plausible', but it's missing here; either add Plausible to IMPORT_PROVIDERS or drop it from validation until it's ready.
Would you like a patch to add Plausible with a neutral placeholder logo?
18-19: Avoid hot-linking external logos; serve them from our assets/CDN. Brandfetch links can break or change. Prefer checked-in assets or our CDN.
Also applies to: 26-28
packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1)
1-3: Consider indexes if filtering/sorting by progress fields. If
`currentBatch`/`currentStep` appear in WHERE/ORDER BY, add indexes (even partial ones) to avoid sequential scans. Otherwise this change looks fine.
apps/worker/src/index.ts (1)
35-45: Minor improvement: cleaner Bull Board initialization. Refactoring from a wrapped object literal to a direct
`createBullBoard()` call improves code clarity. This is a good incidental cleanup.
packages/common/server/parser-user-agent.ts (1)
84-112: Consider simplifying the redundant type checks. Each override field uses a pattern like:
typeof overrides?.__os === 'string' && overrides?.__os ? overrides?.__os : res.os.nameThe condition
typeof overrides?.__os === 'string' && overrides?.__osis redundant. If the typeof check passes and the value is truthy, the second check adds no value unless you're explicitly filtering out empty strings. If that's the intent, a clearer pattern would be:- os: - typeof overrides?.__os === 'string' && overrides?.__os - ? overrides?.__os - : res.os.name, + os: + (typeof overrides?.__os === 'string' && overrides?.__os !== '') + ? overrides?.__os + : res.os.name,Or if empty strings should be accepted:
- os: - typeof overrides?.__os === 'string' && overrides?.__os - ? overrides?.__os - : res.os.name, + os: typeof overrides?.__os === 'string' ? overrides?.__os : res.os.name,apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)
46-76: Polling interval appropriate for progress tracking. The 5-second refetch interval on Line 50 provides real-time import progress updates without excessive load. Consider adding a condition to stop polling when all imports are completed/failed.
Optional optimization:
const importsQuery = useQuery( trpc.import.list.queryOptions( { projectId }, { - refetchInterval: 5000, + refetchInterval: (query) => { + const hasActiveImports = query.state.data?.some( + (imp) => imp.status === 'pending' || imp.status === 'processing' + ); + return hasActiveImports ? 5000 : false; + }, }, ), );
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (63)
apps/api/package.json(1 hunks)apps/start/package.json(2 hunks)apps/start/src/components/integrations/integration-card.tsx(1 hunks)apps/start/src/components/project-mapper.tsx(1 hunks)apps/start/src/components/ui/badge.tsx(1 hunks)apps/start/src/modals/add-import.tsx(1 hunks)apps/start/src/modals/date-ranger-picker.tsx(1 hunks)apps/start/src/modals/index.tsx(2 hunks)apps/start/src/routeTree.gen.ts(11 hunks)apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx(1 hunks)apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx(1 hunks)apps/worker/package.json(3 hunks)apps/worker/src/boot-workers.ts(4 hunks)apps/worker/src/index.ts(2 hunks)apps/worker/src/jobs/events.incoming-event.ts(3 hunks)apps/worker/src/jobs/events.incoming-events.test.ts(2 hunks)apps/worker/src/jobs/import.ts(1 hunks)package.json(1 hunks)packages/cli/package.json(0 hunks)packages/cli/src/cli.ts(0 hunks)packages/cli/src/importer/importer.ts(0 hunks)packages/cli/src/importer/index.ts(0 hunks)packages/cli/tsconfig.json(0 hunks)packages/cli/tsup.config.ts(0 hunks)packages/common/package.json(1 hunks)packages/common/scripts/get-referrers.ts(1 hunks)packages/common/server/index.ts(1 hunks)packages/common/server/parse-referrer.test.ts(5 hunks)packages/common/server/parse-referrer.ts(4 hunks)packages/common/server/parser-user-agent.ts(2 hunks)packages/common/src/object.ts(1 hunks)packages/db/code-migrations/5-add-imports-table.sql(1 hunks)packages/db/code-migrations/5-add-imports-table.ts(1 hunks)packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql(1 hunks)packages/db/prisma/migrations/20251018214030_fix/migration.sql(1 hunks)packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql(1 hunks)packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql(1 hunks)packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql(1 
hunks)packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql(1 hunks)packages/db/prisma/migrations/20251028152531_imports_again/migration.sql(1 hunks)packages/db/prisma/schema.prisma(2 hunks)packages/db/src/buffers/event-buffer.test.ts(8 hunks)packages/db/src/clickhouse/client.ts(2 hunks)packages/db/src/services/event.service.ts(1 hunks)packages/db/src/services/import.service.ts(1 hunks)packages/db/src/services/session.service.ts(1 hunks)packages/db/src/types.ts(2 hunks)packages/importer/package.json(1 hunks)packages/importer/src/base-provider.ts(1 hunks)packages/importer/src/index.ts(1 hunks)packages/importer/src/providers/metadata.ts(1 hunks)packages/importer/src/providers/mixpanel.test.ts(1 hunks)packages/importer/src/providers/mixpanel.ts(1 hunks)packages/importer/src/providers/umami.ts(1 hunks)packages/importer/src/types.ts(1 hunks)packages/importer/tsconfig.json(1 hunks)packages/importer/vitest.config.ts(1 hunks)packages/queue/src/queues.ts(1 hunks)packages/redis/package.json(1 hunks)packages/trpc/package.json(1 hunks)packages/trpc/src/root.ts(2 hunks)packages/trpc/src/routers/import.ts(1 hunks)packages/validation/src/index.ts(1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/tsup.config.ts
- packages/cli/src/importer/index.ts
- packages/cli/package.json
- packages/cli/src/cli.ts
- packages/cli/src/importer/importer.ts
- packages/cli/tsconfig.json
✅ Files skipped from review due to trivial changes (2)
- packages/importer/src/providers/mixpanel.test.ts
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql
🚧 Files skipped from review as they are similar to previous changes (12)
- packages/db/src/services/event.service.ts
- packages/db/code-migrations/5-add-imports-table.sql
- packages/importer/vitest.config.ts
- apps/worker/package.json
- apps/start/src/components/integrations/integration-card.tsx
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
- packages/db/prisma/migrations/20251018214030_fix/migration.sql
- packages/db/src/types.ts
- packages/importer/tsconfig.json
- packages/importer/package.json
- packages/importer/src/types.ts
- apps/start/src/routeTree.gen.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
apps/start/src/modals/index.tsxapps/start/src/modals/date-ranger-picker.tsxapps/start/src/components/ui/badge.tsxapps/start/src/components/project-mapper.tsxapps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsxapps/start/src/modals/add-import.tsxapps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
🧬 Code graph analysis (17)
apps/start/src/modals/index.tsx (1)
apps/start/src/modals/add-import.tsx (1)
AddImport(231-282)
apps/worker/src/index.ts (1)
packages/queue/src/queues.ts (1)
importQueue(175-181)
packages/trpc/src/routers/import.ts (6)
packages/trpc/src/trpc.ts (2)
createTRPCRouter(138-138)protectedProcedure(161-164)packages/trpc/src/access.ts (1)
getProjectAccessCached(4-4)packages/trpc/src/errors.ts (1)
TRPCAccessError(3-7)packages/db/src/prisma-client.ts (1)
db(270-270)packages/validation/src/index.ts (1)
zCreateImport(478-486)packages/queue/src/queues.ts (1)
importQueue(175-181)
packages/trpc/src/root.ts (1)
packages/trpc/src/routers/import.ts (1)
importRouter(11-178)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
getIsCluster(17-24)packages/db/src/clickhouse/migration.ts (2)
createTable(68-116)runClickhouseMigrationCommands(322-454)
packages/queue/src/queues.ts (1)
packages/redis/redis.ts (1)
getRedisQueue(93-105)
apps/worker/src/boot-workers.ts (2)
packages/queue/src/queues.ts (1)
importQueue(175-181)apps/worker/src/jobs/import.ts (1)
importJob(27-283)
apps/start/src/components/project-mapper.tsx (2)
packages/db/src/services/project.service.ts (1)
IServiceProject(7-7)apps/start/src/components/ui/select.tsx (5)
Select(173-173)SelectTrigger(181-181)SelectValue(182-182)SelectContent(174-174)SelectItem(176-176)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
BaseRawEvent(40-42)ImportJobMetadata(69-73)ErrorContext(45-51)packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
Route(12-34)packages/importer/src/providers/metadata.ts (1)
IMPORT_PROVIDERS(13-30)apps/public/components/ui/tooltip.tsx (1)
Tooltiper(36-51)apps/start/src/components/integrations/integration-card.tsx (3)
IntegrationCard(91-112)IntegrationCardLogoImage(50-69)IntegrationCardFooter(3-15)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
zUmamiImportConfig(452-454)zMixpanelImportConfig(461-470)IUmamiImportConfig(456-456)IMixpanelImportConfig(471-471)IImportConfig(473-476)apps/start/src/components/project-mapper.tsx (1)
ProjectMapper(33-116)apps/public/lib/utils.ts (1)
cn(4-6)
packages/common/server/parse-referrer.test.ts (1)
packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
IMixpanelImportConfig(471-471)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/services/event.service.ts (2)
IClickhouseEvent(68-102)query(701-847)packages/common/src/url.ts (2)
parsePath(11-49)isSameDomain(51-63)packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)packages/common/server/parser-user-agent.ts (2)
parseUserAgent(72-115)UserAgentInfo(71-71)packages/common/src/object.ts (2)
toDots(3-36)strip(48-48)
packages/importer/src/providers/umami.ts (5)
packages/validation/src/index.ts (1)
IUmamiImportConfig(456-456)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/services/event.service.ts (2)
IClickhouseEvent(68-102)query(701-847)packages/common/src/url.ts (2)
parsePath(11-49)isSameDomain(51-63)packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)
packages/db/src/services/import.service.ts (4)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)packages/db/src/clickhouse/client.ts (4)
ch(131-161)TABLE_NAMES(48-60)convertClickhouseDateToJs(238-240)formatClickhouseDate(208-219)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/prisma-client.ts (1)
db(270-270)
apps/worker/src/jobs/import.ts (9)
packages/queue/src/queues.ts (1)
ImportQueuePayload(168-173)packages/db/src/prisma-client.ts (1)
db(270-270)packages/db/src/services/import.service.ts (10)
ImportSteps(671-671)updateImportStatus(673-736)getImportDateBounds(610-634)insertImportBatch(28-55)generateSessionIds(62-98)createSessionsStartEndEvents(106-316)moveImportsToProduction(322-426)backfillSessionsToProduction(428-521)markImportComplete(526-545)getImportProgress(550-605)packages/db/src/clickhouse/client.ts (2)
formatClickhouseDate(208-219)error(40-45)packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)packages/logger/index.ts (1)
ILogger(6-6)packages/importer/src/index.ts (2)
UmamiProvider(1-1)MixpanelProvider(2-2)packages/importer/src/providers/umami.ts (1)
UmamiProvider(59-382)packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider(18-441)
packages/common/scripts/get-referrers.ts (1)
apps/worker/scripts/get-referrers.ts (1)
main(66-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-worker
- GitHub Check: build-and-push-dashboard
- GitHub Check: build-and-push-api
🔇 Additional comments (43)
packages/db/src/services/session.service.ts (1)
195-199: Incomplete scope: the performance concern applies to both session and event query services. The `dateIntervalInDays` change (7 → 360 days) was applied to both `packages/db/src/services/session.service.ts` (lines 195-199) and `packages/db/src/services/event.service.ts` (lines 418-432) with identical logic. The original review only identifies the session service impact but misses the event service, effectively doubling the performance concern.
Both services use the same binary threshold (1 day for >1M events, 360 days otherwise) to filter ClickHouse queries, meaning most organizations will experience a 51x increase in the lookback window across both query surfaces.
Recommendations remain the same for both services:
- Introduce intermediate tiers based on organization size
- Make the interval configurable per organization
- Add UI controls for explicit date range selection
Likely an incorrect or invalid review comment.
apps/start/src/modals/date-ranger-picker.tsx (1)
30-30: LGTM! Good UX improvement. Adding `captionLayout="dropdown"` provides a more compact date navigation interface, especially useful in modal contexts where space is limited.
apps/api/package.json (1)
67-67: LGTM! Reasonable pinning for a build tool. Pinning the build tool (tsdown) to an exact version ensures consistent builds across environments. Since this is a devDependency, the trade-off of missing automatic updates is acceptable.
packages/db/src/buffers/event-buffer.test.ts (2)
13-23: LGTM! Test data aligned with production settings. The `clickhouseSettings` constant mirrors the settings added to the production ClickHouse client, ensuring tests validate the actual insert behavior, including async operations, timeouts, and performance optimizations.
142-142: LGTM! Consistent test expectations. All insert expectations now include the `clickhouse_settings` field, ensuring tests verify that the proxy-wrapped insert method properly injects these settings.
Also applies to: 187-187, 519-519, 570-570, 615-615, 640-640, 688-688
packages/common/package.json (1)
8-9: LGTM! Script for generating referrer mappings.The new
gen:referrersscript generates up-to-date referrer mappings from an external source, with automatic formatting applied to the output.packages/db/src/clickhouse/client.ts (2)
59-59: LGTM! Table name for import tracking. Adding `events_imports` to the table names registry supports the new import system introduced in this PR.
64-67: LGTM! Extended timeouts for large import operations. The increased timeouts accommodate large data imports: `request_timeout` 60s → 300s (5 minutes) and `idle_socket_ttl` 8s → 60s. These align with the `max_execution_time: 300` setting in the insert configuration.
packages/redis/package.json (1)
11-11: Accept the ioredis 5.8.2 upgrade with exact pinning. ioredis 5.8.2 is the latest version, and the upgrade is safe. v5.8.2 (released 2025-10-21) contains only bug fixes: default IP family selection and CLIENT SETINFO command improvements—no breaking changes. No security vulnerabilities were detected. The exact version pin (removing the caret) trades automatic patch-level security updates for build reproducibility, which is a valid tradeoff for production stability.
apps/worker/src/jobs/events.incoming-events.test.ts (1)
100-103: Referrer type normalization aligns with existing code patterns — no breaking query filters found.
The codebase already normalizes `undefined`/`null` referrer fields to empty strings via nullish coalescing (`?? ''`) in event.service.ts. No queries or dashboards filtering on `referrerType='unknown'` were detected in the search. The change to normalize `referrerType` to `''` in the test is consistent with the production code's existing behavior.
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
45-45: LGTM! Clean addition of the Imports tab. The new tab entry follows the existing pattern and integrates well with the settings navigation structure.
packages/trpc/package.json (1)
19-19: LGTM! Appropriate dependency addition.Adding
@openpanel/queueenables TRPC routes to interact with the import queue system, aligning with the new import workflow introduced in this PR.packages/common/server/index.ts (1)
4-4: LGTM! Consistent re-export pattern. The new re-export follows the existing pattern and appropriately expands the public API surface of the common/server module.
apps/worker/src/index.ts (1)
8-8: LGTM! ImportQueue properly integrated into Bull Board monitoring. The addition of `importQueue` to the Bull Board UI enables visibility and monitoring of import jobs, which is essential for debugging and tracking the new import workflow.
Also applies to: 42-42
apps/start/src/modals/index.tsx (1)
8-8: LGTM! Modal registration follows the established pattern. The `AddImport` modal is properly registered in the modals registry, enabling the import configuration UI flow described in the PR objectives.
Also applies to: 41-41
package.json (1)
12-12: Verified: the `common` package correctly defines the `gen:referrers` script. The script exists in `packages/common/package.json` and runs the referrer generation logic. The filter change from `worker` to `common` is valid and properly configured.
1-2: Migration correctly implements nullable `currentStep` for import workflow resumption. The codebase explicitly depends on `currentStep` being nullable: apps/worker/src/jobs/import.ts:56 checks `if (record.currentStep === null)` to identify new imports versus resumed imports. This null state allows the import job to distinguish between first-time executions (null) and resumptions from a previous step. The migration is correct as-is.
apps/start/package.json (1)
34-34: LGTM! Dependency addition aligns with the import feature. The new `@openpanel/importer` workspace dependency is correctly added to support the import functionality introduced in this PR.
packages/trpc/src/root.ts (1)
7-7: LGTM! Import router correctly integrated. The `importRouter` is properly imported and wired into the `appRouter`, following the established pattern for other routers in the application.
Also applies to: 44-44
packages/common/server/parser-user-agent.ts (1)
71-71: LGTM! Useful type export. The new `UserAgentInfo` type export provides a clean type definition for consumers of `parseUserAgent`.
packages/common/server/parse-referrer.test.ts (1)
8-8: LGTM! Test expectations correctly updated. The test expectations have been updated from `'unknown'` to `''` (empty string) to match the implementation changes in parse-referrer.ts. This is consistent with the new default behavior for unknown referrers.
Also applies to: 14-14, 44-44, 52-52, 67-67, 91-91, 99-99, 113-113
packages/importer/src/index.ts (1)
1-13: LGTM! Clean barrel export for importer package. The index file properly exports the provider implementations and type definitions, establishing a clean public API for the importer package.
packages/queue/src/queues.ts (1)
168-181: LGTM! Import queue properly configured. The new `importQueue` follows the established pattern for other queues in the codebase, with appropriate job retention settings (`removeOnComplete: 10`, `removeOnFail: 50`).
apps/worker/src/boot-workers.ts (1)
8-8: LGTM! Import worker correctly configured. The `importWorker` is properly set up with:
- Correct queue and job handler imports
- Configurable concurrency via the `IMPORT_JOB_CONCURRENCY` environment variable
- Standard worker options
Also applies to: 23-23, 61-64
packages/common/server/parse-referrer.ts (3)
35-51: Improved referrer matching logic. The updated matching logic is cleaner and more maintainable:
- Source is now normalized to lowercase upfront
- Matching checks the direct key, the `.com` suffix, and a case-insensitive name lookup
- Removes the redundant fallback
The logic now properly handles case-insensitive matching for query parameters like `utm_source=GoOgLe`.
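The lookup order described above can be sketched as follows (a minimal sketch; the table shape and names are assumptions, not the actual `referrers` module):

```typescript
// Hypothetical referrer-table entry shape; the real table lives in server/referrers.
type ReferrerEntry = { name: string; type: string };

// Mirrors the described order: direct key, `.com` suffix, then
// case-insensitive name lookup. The source is lowercased upfront.
function matchReferrer(
  source: string,
  referrers: Record<string, ReferrerEntry>,
): ReferrerEntry | undefined {
  const key = source.toLowerCase();
  return (
    referrers[key] ??
    referrers[`${key}.com`] ??
    Object.values(referrers).find((r) => r.name.toLowerCase() === key)
  );
}
```

With a table entry keyed `google.com` and named `Google`, a query parameter like `utm_source=GoOgLe` resolves through the `.com` suffix branch after lowercasing.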
23-23: No breaking change detected. The code already defaults to empty strings for unknown referrers. Tests confirm this is the expected behavior, and no consumers in the codebase check for `type === 'unknown'`. The alleged change from `'unknown'` to `''` either does not exist or was already in place prior to this PR.
Likely an incorrect or invalid review comment.
1-1: Import paths are correct and resolve properly. Both imports in packages/common/server/parse-referrer.ts successfully resolve:
- Line 1: `../src/string` → packages/common/src/string.ts ✓
- Line 3: `./referrers` → packages/common/server/referrers/index.ts ✓
packages/trpc/src/routers/import.ts (1)
101-133: LGTM! Delete procedure correctly checks write permissions and safely removes the queue job before deleting the record.
apps/worker/src/jobs/events.incoming-event.ts (2)
apps/worker/src/jobs/events.incoming-event.ts (2)
8-12: LGTM!Import reorganization consolidates referrer utilities from
@openpanel/common/serverand correctly uses type-only import forJob.Also applies to: 21-21, 23-23
120-122: Referrer priority adjusted — verify behavior.
baseEvent.referrernow exclusively usesreferrer?.url(removing fallback toutmReferrer?.url), andreferrerTypeprioritizesreferrer?.typebeforeutmReferrer?.type. This aligns with centralized referrer parsing, but ensure existing analytics flows handle the priority change correctly.packages/db/prisma/schema.prisma (2)
470-475: LGTM!
ImportStatusenum defines a clear import lifecycle: pending → processing → completed/failed.
197-197: LGTM!Project relation to
importsis correctly defined and matches the Import model's foreign key.apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)
88-122: LGTM!
getStatusBadgeprovides clear visual feedback with appropriate icons and tooltips for failed imports showing error details.
229-263: LGTM!Progress display handles multiple scenarios (estimated counts, N/A, processing) with clear user communication via tooltips and progress bars.
apps/worker/src/jobs/import.ts (1)
285-298: LGTM!
createProviderfactory cleanly instantiates provider-specific implementations and throws descriptive errors for unknown providers.packages/importer/src/base-provider.ts (3)
4-31: LGTM!Abstract base class defines a clear provider contract with streaming support (
AsyncGenerator), transformation, validation, and event counting. Well-documented and type-safe.
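That contract can be illustrated with a minimal sketch (names mirror the review summary; the real class also carries provider/version metadata and richer event types not shown here):

```typescript
// Illustrative raw-event shape; real providers define their own.
interface RawEvent {
  name: string;
  timestamp: string;
}

abstract class BaseImportProvider<T extends RawEvent = RawEvent> {
  // Stream-read the source so large files never load fully into memory.
  abstract parseSource(): AsyncGenerator<T, void, unknown>;
  // Convert the provider format into the internal event shape.
  abstract transformEvent(raw: T): { name: string; created_at: string };
  // Reject malformed raw events before transformation.
  abstract validate(raw: T): boolean;
}

// Toy provider that streams from an in-memory array.
class ArrayProvider extends BaseImportProvider {
  constructor(private events: RawEvent[]) {
    super();
  }
  async *parseSource() {
    for (const e of this.events) yield e;
  }
  transformEvent(raw: RawEvent) {
    return { name: raw.name, created_at: raw.timestamp };
  }
  validate(raw: RawEvent) {
    return typeof raw.name === 'string' && raw.name.length > 0;
  }
}
```

The `AsyncGenerator` return type is what lets the worker `for await` over events one at a time, which is the streaming property the review highlights.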
33-58: LGTM! Optional hooks provide extension points for providers without forcing implementation. Default behaviors (re-throw errors, return 0 for estimated total) are sensible.
78-120: LGTM! The `getDateChunks` utility correctly splits date ranges for batched processing, handles edge cases (same start/end date, chunks exceeding the end date), and is well-documented.
apps/start/src/modals/add-import.tsx (3)
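The date-chunking behavior described in that comment can be sketched like this (hypothetical signature; the actual implementation lives in `base-provider.ts` and may differ):

```typescript
interface DateChunk {
  start: Date;
  end: Date;
}

// Splits [start, end] into consecutive chunks of at most `days` days,
// clamping the final chunk so it never exceeds the overall end date.
function getDateChunks(start: Date, end: Date, days: number): DateChunk[] {
  const chunkMs = days * 24 * 60 * 60 * 1000;
  const chunks: DateChunk[] = [];
  let cursor = start.getTime();
  while (cursor <= end.getTime()) {
    const chunkEnd = Math.min(cursor + chunkMs - 1, end.getTime());
    chunks.push({ start: new Date(cursor), end: new Date(chunkEnd) });
    cursor = chunkEnd + 1;
  }
  return chunks;
}
```

A same-day range yields a single chunk, matching the edge case the review calls out.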
44-107: LGTM! The UmamiImport form correctly uses the `zUmamiImportConfig` schema, includes required fields (provider, type, fileUrl), and integrates ProjectMapper for optional mappings.
115-229: LGTM! The MixpanelImport form correctly implements required fields with appropriate input types (password for secret) and integrates a date range picker for selecting the import window.
253-259: LGTM! Payload construction correctly passes `config` from forms, which already include the required `provider` and `type` fields set in form defaults (Lines 59-60 for Umami, Lines 124-125 for Mixpanel).
packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1)
1-14: The migration concerns are valid and already identified in the codebase. The migration under review makes `currentStep` NOT NULL without backfilling data, and the very next migration (20251028150655_fix_imports_again) immediately reverts this change, indicating the problem was encountered. Additionally, another follow-up migration (20251028152531_imports_again) modifies `currentBatch` again. Since these are part of an active development cycle on a newly created table (10 days old as of the migration date), consider consolidating these migrations:
- Combine 20251028150123, 20251028150655, and 20251028152531 into a single migration that correctly handles the schema without intermediate reversals, or
- If these are across separate PRs, ensure `currentStep` remains nullable (or add a backfill to populate existing NULLs before setting NOT NULL).
The current state requires the next migration to fix the previous one, which suggests the schema design should be finalized before deployment.
packages/db/src/services/import.service.ts (1)
682-691: Verify terminal state handling. Lines 684-691 unconditionally set `status: 'processing'` when the step is 'loading'. If an import has already reached a terminal state ('completed' or 'failed'), this would revert it. Confirm whether this is intentional for retry scenarios, or if a state check is needed. If terminal states should be protected, add a guard:
const current = await db.import.findUnique({
  where: { id: importId },
  select: { status: true },
});
if (current?.status === 'completed' || current?.status === 'failed') {
  // Skip or throw depending on requirements
  return;
}
apps/start/package.json
Outdated
| "type": "module", | ||
| "scripts": { | ||
| "testing": "pnpm dev", | ||
| "testing2": "pnpm dev", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify the script rename from "testing" to "testing2".
The script key was renamed from "testing" to "testing2" without clear justification. This appears to be a development artifact rather than an intentional change for this PR.
If this rename is intentional, please clarify the purpose. Otherwise, consider reverting it or using a more descriptive name.
🤖 Prompt for AI Agents
In apps/start/package.json around line 6, the script key was renamed from
"testing" to "testing2" which looks like an accidental dev artifact; either
revert the key back to "testing" to preserve original functionality or replace
it with a descriptive name (e.g., "dev" or "start:testing") if the rename is
intentional, and update any references in documentation or CI that call the
script accordingly; if intentional, add a short comment in the PR description
explaining the reason for the new name.
| interface ProjectMapperProps { | ||
| fields: FieldArrayWithId<any, 'projectMapper', 'id'>[]; | ||
| append: UseFieldArrayAppend<any, 'projectMapper'>; | ||
| remove: UseFieldArrayRemove; | ||
| projects: IServiceProject[]; | ||
| register: UseFormRegister<any>; | ||
| watch: UseFormWatch<any>; | ||
| setValue: UseFormSetValue<any>; | ||
| } |
🛠️ Refactor suggestion | 🟠 Major
Define a proper form type instead of using any.
The interface uses any for all generic type parameters, which defeats TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.
Apply this diff to add proper typing:
+interface ProjectMapperFormData {
+ projectMapper: Array<{ from: string; to: string }>;
+}
+
interface ProjectMapperProps {
- fields: FieldArrayWithId<any, 'projectMapper', 'id'>[];
- append: UseFieldArrayAppend<any, 'projectMapper'>;
+ fields: FieldArrayWithId<ProjectMapperFormData, 'projectMapper', 'id'>[];
+ append: UseFieldArrayAppend<ProjectMapperFormData, 'projectMapper'>;
remove: UseFieldArrayRemove;
projects: IServiceProject[];
- register: UseFormRegister<any>;
- watch: UseFormWatch<any>;
- setValue: UseFormSetValue<any>;
+ register: UseFormRegister<ProjectMapperFormData>;
+ watch: UseFormWatch<ProjectMapperFormData>;
+ setValue: UseFormSetValue<ProjectMapperFormData>;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| interface ProjectMapperProps { | |
| fields: FieldArrayWithId<any, 'projectMapper', 'id'>[]; | |
| append: UseFieldArrayAppend<any, 'projectMapper'>; | |
| remove: UseFieldArrayRemove; | |
| projects: IServiceProject[]; | |
| register: UseFormRegister<any>; | |
| watch: UseFormWatch<any>; | |
| setValue: UseFormSetValue<any>; | |
| } | |
| interface ProjectMapperFormData { | |
| projectMapper: Array<{ from: string; to: string }>; | |
| } | |
| interface ProjectMapperProps { | |
| fields: FieldArrayWithId<ProjectMapperFormData, 'projectMapper', 'id'>[]; | |
| append: UseFieldArrayAppend<ProjectMapperFormData, 'projectMapper'>; | |
| remove: UseFieldArrayRemove; | |
| projects: IServiceProject[]; | |
| register: UseFormRegister<ProjectMapperFormData>; | |
| watch: UseFormWatch<ProjectMapperFormData>; | |
| setValue: UseFormSetValue<ProjectMapperFormData>; | |
| } |
🤖 Prompt for AI Agents
In apps/start/src/components/project-mapper.tsx around lines 23 to 31, the
ProjectMapperProps interface uses `any` for all react-hook-form generics;
replace these with a concrete form data type (e.g., define an exported
FormValues or ProjectForm type that describes fields used by the component,
including projectMapper: { id: string; <other fields> }[] and any nested
fields); update the interface to use that type in FieldArrayWithId<FormValues,
'projectMapper', 'id'>, UseFieldArrayAppend<FormValues, 'projectMapper'>,
UseFieldArrayRemove<FormValues>, UseFormRegister<FormValues>,
UseFormWatch<FormValues>, and UseFormSetValue<FormValues>; import or declare the
FormValues type near the top of the file (or from a shared types file) and
adjust any call sites to match the new type.
| <Select | ||
| value={watch(`projectMapper.${index}.to`)} | ||
| onValueChange={(value) => | ||
| setValue(`projectMapper.${index}.to`, value) | ||
| } | ||
| > | ||
| <SelectTrigger className="mt-1 w-full" size="sm"> | ||
| <SelectValue placeholder="Select project" /> | ||
| </SelectTrigger> | ||
| <SelectContent> | ||
| {projects.slice(0, 10).map((project) => ( | ||
| <SelectItem key={project.id} value={project.id}> | ||
| {project.name} | ||
| </SelectItem> | ||
| ))} | ||
| </SelectContent> | ||
| </Select> |
Remove the arbitrary 10-project limit.
Line 92 limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling, or implement search/pagination if performance is a concern.
Apply this diff to show all projects:
<SelectContent>
- {projects.slice(0, 10).map((project) => (
+ {projects.map((project) => (
<SelectItem key={project.id} value={project.id}>
{project.name}
</SelectItem>If performance is a concern with many projects, consider adding a search input or ComboBox component instead of a plain Select.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <Select | |
| value={watch(`projectMapper.${index}.to`)} | |
| onValueChange={(value) => | |
| setValue(`projectMapper.${index}.to`, value) | |
| } | |
| > | |
| <SelectTrigger className="mt-1 w-full" size="sm"> | |
| <SelectValue placeholder="Select project" /> | |
| </SelectTrigger> | |
| <SelectContent> | |
| {projects.slice(0, 10).map((project) => ( | |
| <SelectItem key={project.id} value={project.id}> | |
| {project.name} | |
| </SelectItem> | |
| ))} | |
| </SelectContent> | |
| </Select> | |
| <Select | |
| value={watch(`projectMapper.${index}.to`)} | |
| onValueChange={(value) => | |
| setValue(`projectMapper.${index}.to`, value) | |
| } | |
| > | |
| <SelectTrigger className="mt-1 w-full" size="sm"> | |
| <SelectValue placeholder="Select project" /> | |
| </SelectTrigger> | |
| <SelectContent> | |
| {projects.map((project) => ( | |
| <SelectItem key={project.id} value={project.id}> | |
| {project.name} | |
| </SelectItem> | |
| ))} | |
| </SelectContent> | |
| </Select> |
🤖 Prompt for AI Agents
In apps/start/src/components/project-mapper.tsx around lines 82 to 98 the Select
is artificially limited by projects.slice(0, 10) which hides projects beyond the
first 10; remove the slice so the SelectContent maps over the entire projects
array (or replace with a virtualized list / searchable ComboBox if performance
is a concern) and ensure the SelectContent is allowed to scroll by keeping the
UI container styles intact.
| if (process.env.NODE_ENV === 'production') { | ||
| exitHandler(evt, code); | ||
| } else { | ||
| logger.info('Shutting down for development', { | ||
| event: evt, | ||
| code, | ||
| }); | ||
| process.exit(0); | ||
| } |
Development exit handler bypasses graceful shutdown.
The modified exit handler now exits immediately with code 0 in development mode, bypassing the graceful shutdown logic (closing workers, waiting for queues). This could lead to:
- Jobs left in inconsistent states
- Incomplete database transactions
- Resource leaks (unclosed Redis connections)
Consider one of the following approaches:
- Keep graceful shutdown for development but with a shorter timeout
- Document why immediate exit is acceptable in development
- At minimum, still close workers even if you skip waiting for queue completion
['uncaughtException', 'unhandledRejection', 'SIGTERM', 'SIGINT'].forEach(
(evt) => {
process.on(evt, (code) => {
- if (process.env.NODE_ENV === 'production') {
- exitHandler(evt, code);
- } else {
- logger.info('Shutting down for development', {
- event: evt,
- code,
- });
- process.exit(0);
- }
+ exitHandler(evt, code);
});
},
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (process.env.NODE_ENV === 'production') { | |
| exitHandler(evt, code); | |
| } else { | |
| logger.info('Shutting down for development', { | |
| event: evt, | |
| code, | |
| }); | |
| process.exit(0); | |
| } | |
| exitHandler(evt, code); |
| -- AlterTable | ||
| ALTER TABLE "public"."imports" ALTER COLUMN "currentBatch" DROP NOT NULL, | ||
| ALTER COLUMN "currentBatch" DROP DEFAULT, | ||
| ALTER COLUMN "currentBatch" SET DATA TYPE TEXT; |
🧩 Analysis chain
Verify the relaxed constraints on currentBatch align with batch tracking requirements.
This migration makes three significant changes to currentBatch:
- Removes NOT NULL constraint (allows imports without batches)
- Removes DEFAULT value (no automatic batch assignment)
- Changes type to TEXT (allows flexible batch identifiers)
Please confirm that:
- Imports are expected to exist without a `currentBatch` (nullable is correct)
- The workflow handles missing batch values appropriately
- TEXT is the appropriate type for your batch identifier scheme (e.g., UUIDs, timestamps, custom strings)
🏁 Script executed:
#!/bin/bash
# Description: Check how currentBatch is used in import processing logic
# Search for currentBatch usage and null checks
rg -nP --type=ts -C3 'currentBatch' -g '!*.gen.ts' -g '!node_modules'
Length of output: 4902
🏁 Script executed:
#!/bin/bash
# Find the Prisma schema and check currentBatch field definition
fd -name 'schema.prisma' -o -name '*.prisma'
Length of output: 230
🏁 Script executed:
#!/bin/bash
# List migrations in order to find the one before this one
ls -1 packages/db/prisma/migrations/ | sort | tail -10
Length of output: 295
🏁 Script executed:
#!/bin/bash
# Check the imports table definition in the latest schema or migrations
fd -name '*.sql' -path '*/migrations/*' -exec grep -l 'currentBatch' {} \;
Length of output: 230
🏁 Script executed:
#!/bin/bash
# View the previous migration to understand what currentBatch type was
cat packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql
Length of output: 86
🏁 Script executed:
#!/bin/bash
# View the migration that added the import table initially
cat packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
Length of output: 840
🏁 Script executed:
#!/bin/bash
# Search for schema.prisma in common locations
find packages/db -name 'schema.prisma' 2>/dev/null
Length of output: 32
🏁 Script executed:
#!/bin/bash
# Check what type currentBatch has in recent migrations
rg 'currentBatch' packages/db/prisma/migrations/ -A1 -B1
Length of output: 859
🏁 Script executed:
#!/bin/bash
# View the Prisma schema definition
cat packages/db/prisma/schema.prisma | grep -A 50 'model imports'
Length of output: 0
🏁 Script executed:
#!/bin/bash
# Check if there's a type generation process or codegen in Prisma
rg -A5 'generator|prisma' packages/db/prisma/schema.prisma | head -30
Length of output: 432
🏁 Script executed:
#!/bin/bash
# Check how batch parameter is used - what values are passed
rg -nP 'options\.batch|batch:' packages/db/src/ apps/worker/src/ -B2 -A2
Length of output: 4984
🏁 Script executed:
#!/bin/bash
# View full imports model from schema
rg -A 30 'model imports' packages/db/prisma/schema.prisma
Length of output: 0
🏁 Script executed:
#!/bin/bash
# Check generated Prisma types for imports
cat packages/db/src/generated/prisma/index.d.ts 2>/dev/null | grep -A 30 'interface imports' | head -40
Length of output: 0
🏁 Script executed:
#!/bin/bash
# View the imports.service.ts to see the UpdateImportStatusOptions type
rg -B5 -A10 'interface.*UpdateImportStatusOptions|type UpdateImportStatusOptions' packages/db/src/services/import.service.ts
Length of output: 357
🏁 Script executed:
#!/bin/bash
# Find and view the ImportProgress interface definition
rg -B3 -A8 'interface ImportProgress' packages/importer/src/types.ts
Length of output: 232
🏁 Script executed:
#!/bin/bash
# Check if there are any type errors or if the types are reconciled elsewhere
rg -C5 'currentBatch.*number|number.*currentBatch' packages/
Length of output: 530
🏁 Script executed:
#!/bin/bash
# Check where ImportProgress is used and how currentBatch is accessed
rg -C3 'ImportProgress|currentBatch' packages/importer/src/ apps/
Length of output: 3534
Fix TypeScript type definition for currentBatch in ImportProgress interface.
The migration correctly changes currentBatch from INTEGER NOT NULL DEFAULT 0 to TEXT nullable with no default. However, the TypeScript type definition needs to be updated:
- File: packages/importer/src/types.ts, line 23
currentBatch: number; - Should be:
currentBatch: string | null;
Reason: The migration stores TEXT values (timestamps, batch identifiers), and the code already handles null batches. The type should reflect this reality.
🤖 Prompt for AI Agents
In packages/importer/src/types.ts around line 23, the ImportProgress interface
currently declares currentBatch: number; but the DB migration changes
currentBatch to nullable TEXT; update the TypeScript definition to currentBatch:
string | null; so the type matches stored values (text/batch IDs or null) and
adjust any places expecting a number accordingly.
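Under that change, the interface would read roughly as follows (a sketch; only `currentBatch` comes from the review, the sibling fields are illustrative):

```typescript
// Sketch of the adjusted ImportProgress shape: currentBatch is now a
// nullable string date like '2020-01-01' rather than a number.
interface ImportProgress {
  totalEvents: number;
  processedEvents: number;
  currentBatch: string | null;
}

const progress: ImportProgress = {
  totalEvents: 100,
  processedEvents: 40,
  currentBatch: '2020-01-01',
};

const fresh: ImportProgress = {
  totalEvents: 0,
  processedEvents: 0,
  currentBatch: null, // a brand-new import has no batch yet
};
```

Callers that previously did arithmetic on `currentBatch` would need to treat it as an opaque identifier instead.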
| model Import { | ||
| id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid | ||
| projectId String | ||
| project Project @relation(fields: [projectId], references: [id], onDelete: Cascade) | ||
| jobId String? // BullMQ job ID | ||
| status ImportStatus | ||
| statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs" | ||
| errorMessage String? | ||
| /// [IPrismaImportConfig] | ||
| config Json | ||
| totalEvents Int @default(0) | ||
| processedEvents Int @default(0) | ||
| currentStep String? | ||
| currentBatch String? // String date 2020-01-01 | ||
| createdAt DateTime @default(now()) | ||
| completedAt DateTime? | ||
| updatedAt DateTime @default(now()) @updatedAt | ||
|
|
||
| @@map("imports") | ||
| } |
Inconsistency: currentStep nullable in schema but NOT NULL in migration.
Line 489 defines currentStep as String? (optional), but migration 20251028150123_fix_imports_table/migration.sql sets it to NOT NULL. Align schema and migration:
- If imports always have a step, make it required:
currentStep String - If it's optional during creation, remove NOT NULL constraint from migration
- currentStep String?
+ currentStep StringOr revert the migration's SET NOT NULL constraint.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| model Import { | |
| id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid | |
| projectId String | |
| project Project @relation(fields: [projectId], references: [id], onDelete: Cascade) | |
| jobId String? // BullMQ job ID | |
| status ImportStatus | |
| statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs" | |
| errorMessage String? | |
| /// [IPrismaImportConfig] | |
| config Json | |
| totalEvents Int @default(0) | |
| processedEvents Int @default(0) | |
| currentStep String? | |
| currentBatch String? // String date 2020-01-01 | |
| createdAt DateTime @default(now()) | |
| completedAt DateTime? | |
| updatedAt DateTime @default(now()) @updatedAt | |
| @@map("imports") | |
| } | |
| model Import { | |
| id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid | |
| projectId String | |
| project Project @relation(fields: [projectId], references: [id], onDelete: Cascade) | |
| jobId String? // BullMQ job ID | |
| status ImportStatus | |
| statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs" | |
| errorMessage String? | |
| /// [IPrismaImportConfig] | |
| config Json | |
| totalEvents Int @default(0) | |
| processedEvents Int @default(0) | |
| currentStep String | |
| currentBatch String? // String date 2020-01-01 | |
| createdAt DateTime @default(now()) | |
| completedAt DateTime? | |
| updatedAt DateTime @default(now()) @updatedAt | |
| @@map("imports") | |
| } |
🤖 Prompt for AI Agents
In packages/db/prisma/schema.prisma around lines 477 to 496, the model Import
declares currentStep as nullable (String?) while the migration
20251028150123_fix_imports_table/migration.sql sets current_step to NOT NULL;
pick one and make them consistent: if imports must always have a step, change
schema currentStep to non-nullable (currentStep String) and run prisma migrate
to regenerate the SQL; otherwise, edit the migration SQL to remove the SET NOT
NULL on current_step (or create a new migration that drops the NOT NULL
constraint) so the column remains nullable, then reapply migrations.
| this.logger?.info('Fetching events from Mixpanel', { | ||
| url: `${url}?${params}`, | ||
| from, | ||
| to, | ||
| projectId, | ||
| serviceAccount, | ||
| }); |
Sensitive credentials logged.
Line 90 logs the serviceAccount credential. Even though the secret is not logged, exposing account identifiers in logs can aid attackers in credential stuffing or enumeration attacks.
Consider removing sensitive fields from logs:
this.logger?.info('Fetching events from Mixpanel', {
url: `${url}?${params}`,
from,
to,
projectId,
- serviceAccount,
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| this.logger?.info('Fetching events from Mixpanel', { | |
| url: `${url}?${params}`, | |
| from, | |
| to, | |
| projectId, | |
| serviceAccount, | |
| }); | |
| this.logger?.info('Fetching events from Mixpanel', { | |
| url: `${url}?${params}`, | |
| from, | |
| to, | |
| projectId, | |
| }); |
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 85 to 91, the logger
call includes the serviceAccount identifier which is considered sensitive;
remove serviceAccount from the logged metadata and only log non-sensitive
context (url, from, to, projectId) or replace serviceAccount with a
redacted/boolean flag (e.g., serviceAccountPresent: true) if presence needs to
be recorded; update the logger invocation accordingly and run a quick search to
ensure no other logs emit service account identifiers.
| // Add new job to queue | ||
| const job = await importQueue.add('import', { | ||
| type: 'import', | ||
| payload: { | ||
| importId: importRecord.id, | ||
| }, | ||
| }); | ||
|
|
||
| // Update import record | ||
| return db.import.update({ | ||
| where: { id: importRecord.id }, | ||
| data: { | ||
| jobId: job.id, | ||
| status: 'pending', | ||
| errorMessage: null, | ||
| }, | ||
| }); |
Wrap retry enqueue in try/catch.
Same issue as create: if importQueue.add fails, the record is updated to 'pending' but no job exists. Wrap in try/catch and revert status on failure.
// Add new job to queue
- const job = await importQueue.add('import', {
- type: 'import',
- payload: {
- importId: importRecord.id,
- },
- });
-
- // Update import record
- return db.import.update({
- where: { id: importRecord.id },
- data: {
- jobId: job.id,
- status: 'pending',
- errorMessage: null,
- },
- });
+ try {
+ const job = await importQueue.add(
+ 'import',
+ { type: 'import', payload: { importId: importRecord.id } },
+ { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+ );
+ return db.import.update({
+ where: { id: importRecord.id },
+ data: {
+ jobId: job.id,
+ status: 'pending',
+ errorMessage: null,
+ },
+ });
+ } catch (err) {
+ throw new Error('Failed to enqueue retry job: ' + (err instanceof Error ? err.message : String(err)));
+ }
}),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Add new job to queue | |
| const job = await importQueue.add('import', { | |
| type: 'import', | |
| payload: { | |
| importId: importRecord.id, | |
| }, | |
| }); | |
| // Update import record | |
| return db.import.update({ | |
| where: { id: importRecord.id }, | |
| data: { | |
| jobId: job.id, | |
| status: 'pending', | |
| errorMessage: null, | |
| }, | |
| }); | |
| // Add new job to queue | |
| try { | |
| const job = await importQueue.add( | |
| 'import', | |
| { type: 'import', payload: { importId: importRecord.id } }, | |
| { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } }, | |
| ); | |
| return db.import.update({ | |
| where: { id: importRecord.id }, | |
| data: { | |
| jobId: job.id, | |
| status: 'pending', | |
| errorMessage: null, | |
| }, | |
| }); | |
| } catch (err) { | |
| throw new Error('Failed to enqueue retry job: ' + (err instanceof Error ? err.message : String(err))); | |
| } |
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 160 to 176, the code calls
importQueue.add and then unconditionally updates the DB record to status
'pending', which can leave the DB pointing to a non‑existent job if add fails;
wrap the call to importQueue.add in a try/catch, only update the import record
to 'pending' and set jobId after a successful add, and in the catch update the
import record to a terminal state (e.g., 'failed') with errorMessage set to the
caught error (or revert to the previous status if tracked); ensure the DB
updates are awaited and consider doing the job add and record update in a
transaction or in the correct sequence so no pending state is recorded when
enqueueing fails.
b60dd0e to 1a11373 Compare
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
packages/trpc/src/routers/auth.ts (2)
227-229: Remove sensitive console logs from auth flow. Logging session objects and cookie setters risks leaking tokens/PII.
Apply this diff:
- console.log('session', session);
  setSessionTokenCookie(ctx.setCookie, token, session.expiresAt);
- console.log('ctx.setCookie', ctx.setCookie);
201-203: Avoid credential enumeration; return a generic auth error. “User does not exists” reveals account presence. Use a single generic message.
Apply this diff:
-import { TRPCAccessError, TRPCNotFoundError } from '../errors';
+import { TRPCAccessError, TRPCNotFoundError } from '../errors';
@@
-      if (!user) {
-        throw TRPCNotFoundError('User does not exists');
-      }
+      if (!user) {
+        throw TRPCAccessError('Incorrect email or password');
+      }

Note: If you prefer stricter semantics, you can also replace the other branch’s incorrect-password error with the same message to keep responses uniform.

Also applies to: 31-31
apps/worker/src/jobs/events.incoming-event.ts (1)
39-44: Do not log full event payload; redact to avoid PII leakage. Logging the entire payload and headers can include emails, IDs, and custom properties. Log only identifiers and sizes.
-  logger.info('Creating event', { event: payload, jobData });
+  logger.info('Creating event', {
+    name: payload.name,
+    projectId: payload.projectId,
+    profileId: payload.profileId,
+    sessionId: (payload as any).sessionId,
+    deviceId: (payload as any).deviceId,
+    propertiesSize: Object.keys(payload.properties ?? {}).length,
+  });
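For reuse across jobs, the same redaction idea can be captured in a small helper. This is a sketch with assumed field names (`name`, `projectId`, `profileId`), not the worker's actual API:

```typescript
// Hypothetical helper: log only a whitelisted, size-bounded summary of an
// event payload instead of the full object.
function summarizeEvent(payload: Record<string, unknown>) {
  const { name, projectId, profileId } = payload as {
    name?: string;
    projectId?: string;
    profileId?: string;
  };
  return {
    name,
    projectId,
    profileId,
    // Report only how many properties there were, never their values.
    propertiesSize: Object.keys(
      (payload.properties as Record<string, unknown>) ?? {},
    ).length,
  };
}

const summary = summarizeEvent({
  name: 'screen_view',
  projectId: 'p1',
  properties: { email: 'a@b.c', path: '/home' },
});
// summary carries identifiers and a count, but no property values.
```

Passing `summarizeEvent(payload)` to the logger keeps log lines useful for debugging while keeping custom properties out of log storage.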
♻️ Duplicate comments (31)
apps/start/src/components/project-mapper.tsx (2)
23-31: Define a proper form type instead of using `any`. The interface uses `any` for all generic type parameters, which defeats TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.
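A minimal sketch of what such a schema type could look like — the field names here are assumptions for illustration, not the component's real contract:

```typescript
// Hypothetical form value types for the project mapper; with these in place,
// react-hook-form generics like useForm<ImportMappingForm>() replace `any`
// and give typed field paths.
interface ProjectMapping {
  sourceProject: string; // project name/id in the source provider's export
  targetProjectId: string; // OpenPanel project to map it onto
}

interface ImportMappingForm {
  provider: 'umami' | 'plausible' | 'mixpanel';
  mappings: ProjectMapping[];
}

const example: ImportMappingForm = {
  provider: 'umami',
  mappings: [{ sourceProject: 'marketing-site', targetProjectId: 'proj_1' }],
};
```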
92-92: Remove the arbitrary 10-project limit. The `slice(0, 10)` limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling, or implement search/pagination if performance is a concern.
packages/common/scripts/get-referrers.ts (1)
74-74: Fix incorrect output path. The script writes to `../../worker/src/referrers/index.ts`, but from `packages/common/scripts/`, this resolves to an unexpected location outside the package. The corresponding format command in `package.json` expects `./server/referrers/index.ts` relative to the package root. Apply this diff to fix the output path:
-  path.resolve(__dirname, '../../worker/src/referrers/index.ts'),
+  path.resolve(__dirname, '../server/referrers/index.ts'),
packages/db/code-migrations/5-add-imports-table.ts (2)
1-2: Fix `__filename` usage in ESM module. The package is configured as ESM (`"type": "module"`), and using `__filename` directly will fail at runtime. Import `fileURLToPath` from `node:url` and derive `__filename` properly. Apply this diff:
 import fs from 'node:fs';
 import path from 'node:path';
+import { fileURLToPath } from 'node:url';
 import { TABLE_NAMES } from '../src/clickhouse/client';
 import {
   chMigrationClient,
   createTable,
   runClickhouseMigrationCommands,
 } from '../src/clickhouse/migration';
 import { getIsCluster } from './helpers';
+
+const __filename = fileURLToPath(import.meta.url);
+
 export async function up() {

Also applies to: 72-72
65-69: TTL on Distributed table will fail in clustered mode. `ALTER TABLE events_imports MODIFY TTL` will fail or be a no-op on the Distributed table in clustered deployments. TTL must be defined on the local Replicated MergeTree table, preferably in the CREATE TABLE statement. Consider extending `createTable` to support a TTL parameter and include it in the CREATE TABLE statement, or target the replicated table explicitly with `ON CLUSTER '{cluster}'` if the ALTER approach is retained.
packages/validation/src/index.ts (1)
478-486: Discriminate by provider to prevent mismatched payloads. The current schema allows `provider` and `config` to disagree (e.g., `provider='umami'` with a Mixpanel config). This was flagged in a previous review and remains unresolved. Apply a discriminated union on `provider` so the config shape is tied to the selected provider:

-export const zCreateImport = z.object({
-  projectId: z.string().min(1),
-  provider: z.enum(['umami', 'plausible', 'mixpanel']),
-  config: z.union([
-    zUmamiImportConfig,
-    zPlausibleImportConfig,
-    zMixpanelImportConfig,
-  ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('umami'),
+    config: zUmamiImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('plausible'),
+    config: zPlausibleImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('mixpanel'),
+    config: zMixpanelImportConfig,
+  }),
+]);
packages/importer/src/types.ts (1)
53-66: EventProperties allows nested objects, but ClickHouse requires primitives only. The `EventProperties` interface (line 61) allows `Record<string, unknown>` and defines `__query` as `Record<string, unknown>` (line 62). This conflicts with ClickHouse's `IClickhouseEvent.properties`, which must be `Record<string, string|number|boolean|undefined|null>`. Storing nested objects will break ingestion unless stringified. This was flagged in a previous review and remains unresolved. Either constrain EventProperties to primitives only:

 export interface EventProperties {
   [key: string]:
     | string
     | number
     | boolean
     | null
-    | undefined
-    | Record<string, unknown>;
-  __query?: Record<string, unknown>;
+    | undefined;
+  __query?: string; // JSON.stringify before storing
   __title?: string;
   __screen?: string;
   __language?: string;
 }

Or explicitly handle JSON stringification for nested fields before they reach ClickHouse.
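As an illustration of the second option, a coercion step of this shape (a hypothetical helper, not part of the package) could run before events reach ClickHouse:

```typescript
// Sketch: coerce arbitrary property values into the primitive-only shape
// ClickHouse accepts, JSON-stringifying anything nested.
type Primitive = string | number | boolean | null | undefined;

function toPrimitiveProps(
  input: Record<string, unknown>,
): Record<string, Primitive> {
  const out: Record<string, Primitive> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value !== null && typeof value === 'object') {
      // Nested objects/arrays become JSON strings instead of breaking ingestion.
      out[key] = JSON.stringify(value);
    } else {
      out[key] = value as Primitive;
    }
  }
  return out;
}

const props = toPrimitiveProps({ __query: { utm_source: 'x' }, __title: 'Home' });
// → { __query: '{"utm_source":"x"}', __title: 'Home' }
```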
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
10-13: Data loss risk: backfill config before dropping columns. Migrate provider/source fields into config first, then DROP. Example:
+-- Backfill config from legacy columns before dropping them
+UPDATE "public"."imports" i
+SET "config" = COALESCE(
+  jsonb_build_object(
+    'provider', i."provider",
+    'sourceType', i."sourceType",
+    'sourceLocation', i."sourceLocation"
+  ) || COALESCE(i."config", '{}'::jsonb),
+  '{}'::jsonb
+)
+WHERE i."provider" IS NOT NULL
+  OR i."sourceType" IS NOT NULL
+  OR i."sourceLocation" IS NOT NULL;
+
 -- AlterTable
 ALTER TABLE "public"."imports"
   DROP COLUMN "provider",
   DROP COLUMN "sourceLocation",
   DROP COLUMN "sourceType",
   ALTER COLUMN "config" DROP DEFAULT;
packages/importer/vitest.config.ts (1)
3-3: ESM: `__dirname` is undefined. Compute it from `import.meta.url`. Apply:
+import { dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
 import { getSharedVitestConfig } from '../../vitest.shared';

-export default getSharedVitestConfig({ __dirname });
+export default getSharedVitestConfig({
+  __dirname: dirname(fileURLToPath(import.meta.url)),
+});
apps/worker/src/boot-workers.ts (2)
66-73: eventsGroupWorker is unmanaged; either add it back to the lifecycle or remove it entirely. Currently it runs but is excluded from handlers/close, risking leaks and unobserved failures. Choose one path and apply consistently.
Option A — manage it with the others (recommended):
 const workers = [
   sessionsWorker,
   cronWorker,
   notificationWorker,
   miscWorker,
   importWorker,
-  // eventsGroupWorker,
+  // GroupMQ worker participates in lifecycle
+  // (we still keep eventsGroupWorker.run(); above)
+  eventsGroupWorker as unknown as Worker,
 ];

Option B — if intentionally disabled, remove it completely:

-  const eventsGroupWorker = new GroupWorker<...>({ ... });
-  eventsGroupWorker.run();
+  // eventsGroupWorker intentionally disabled; remove instantiation to avoid orphaned worker.
155-166: Always use graceful shutdown; remove the dev fast-exit path. Avoid skipping worker close/queue drain in dev.
-  if (process.env.NODE_ENV === 'production') {
-    exitHandler(evt, code);
-  } else {
-    logger.info('Shutting down for development', {
-      event: evt,
-      code,
-    });
-    process.exit(0);
-  }
+  exitHandler(evt, code);
packages/db/prisma/schema.prisma (1)
489-489: Schema/migration mismatch on currentStep. Schema has `currentStep String?` but the migration sets it NOT NULL. Keep it nullable or update both consistently. This also aligns with import.ts, where `currentStep === null` denotes a new import.
packages/trpc/src/routers/import.ts (4)
24-31: Sensitive config exposure in list — redact for read-level access. Return imports without leaking config to read-only members. Gate by role and null-out config when needed.
Apply:
-      return db.import.findMany({
-        where: {
-          projectId: input.projectId,
-        },
-        orderBy: {
-          createdAt: 'desc',
-        },
-      });
+      const rows = await db.import.findMany({
+        where: { projectId: input.projectId },
+        orderBy: { createdAt: 'desc' },
+      });
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets ? rows : rows.map((r) => ({ ...r, config: null }));
55-56: Sensitive config exposure in get — redact for read-level access. Avoid returning secrets to read-only members.
Apply:
-      return importRecord;
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets ? importRecord : { ...importRecord, config: null };
81-99: Enqueue not guarded — pending record can orphan without a job. Wrap queue add in try/catch, use stable jobId, attempts, and mark DB failed on error.
Apply:
-      // Add job to queue
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
-
-      // Update import record with job ID
-      await db.import.update({
-        where: { id: importRecord.id },
-        data: { jobId: job.id },
-      });
+      // Add job to queue
+      try {
+        const job = await importQueue.add(
+          'import',
+          { type: 'import', payload: { importId: importRecord.id } },
+          { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+        );
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: { jobId: job.id },
+        });
+      } catch (err) {
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: {
+            status: 'failed',
+            errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed',
+          },
+        });
+        throw err;
+      }
160-176: Retry enqueue not guarded — can set 'pending' without job. Same reliability issue: wrap in try/catch and only update status on success.
Apply:
-      // Add new job to queue
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
-
-      // Update import record
-      return db.import.update({
-        where: { id: importRecord.id },
-        data: {
-          jobId: job.id,
-          status: 'pending',
-          errorMessage: null,
-        },
-      });
+      // Add new job to queue
+      try {
+        const job = await importQueue.add(
+          'import',
+          { type: 'import', payload: { importId: importRecord.id } },
+          { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+        );
+        return db.import.update({
+          where: { id: importRecord.id },
+          data: { jobId: job.id, status: 'pending', errorMessage: null },
+        });
+      } catch (err) {
+        throw new Error(
+          'Failed to enqueue retry job: ' +
+            (err instanceof Error ? err.message : String(err)),
+        );
+      }
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)
265-276: Config tooltip exposes credentials — sanitize/redact before display. Redact known secret keys and handle null config.
Apply:
+const redactConfig = (cfg: any) => {
+  if (!cfg) return null;
+  const secretKeys = new Set([
+    'apiKey','apiSecret','serviceAccount','serviceSecret','token','password',
+    'secret','privateKey','accessKey','secretAccessKey',
+  ]);
+  const walk = (obj: any): any =>
+    Object.fromEntries(
+      Object.entries(obj).map(([k, v]) => [
+        k,
+        v && typeof v === 'object' ? walk(v) : secretKeys.has(k) ? '***' : v,
+      ]),
+    );
+  return walk(cfg);
+};
...
           <Tooltiper
             content={
               <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
-                {JSON.stringify(imp.config, null, 2)}
+                {JSON.stringify(redactConfig(imp.config), null, 2)}
               </pre>
             }
             tooltipClassName="max-w-xs"
           >
             <Badge>Config</Badge>
           </Tooltiper>
apps/worker/src/jobs/import.ts (2)
129-137: Avoid logging full raw events (PII leak). Log minimal reason/context only.
Apply:
-      ) {
-        jobLogger.warn('Skipping invalid event', { rawEvent });
+      ) {
+        jobLogger.warn('Skipping invalid event', {
+          provider: providerInstance.provider,
+          reason: 'validation_failed',
+        });
         continue;
       }
251-258: Sync final Postgres counters with ClickHouse progress for UI. Update totalEvents/processedEvents on completion.
Apply:
       const finalProgress = await getImportProgress(importId);
@@
       jobLogger.info('Import job completed successfully', {
         totalEvents: finalProgress.totalEvents,
         insertedEvents: finalProgress.insertedEvents,
         status: finalProgress.status,
       });
+
+      await db.import.update({
+        where: { id: importId },
+        data: {
+          totalEvents: finalProgress.totalEvents,
+          processedEvents: finalProgress.insertedEvents,
+        },
+      });
packages/importer/src/providers/umami.ts (6)
1-15: Add SSRF guard before fetching remote files. Validate scheme and resolve host to block private/loopback ranges.
Apply:
 import { randomUUID } from 'node:crypto';
 import { Readable } from 'node:stream';
 import { pipeline } from 'node:stream/promises';
 import { createBrotliDecompress, createGunzip } from 'node:zlib';
+import { lookup } from 'node:dns/promises';
+import { isIP } from 'node:net';
 import { isSameDomain, parsePath } from '@openpanel/common';
 import { generateDeviceId } from '@openpanel/common/server';
 import { getReferrerWithQuery, parseReferrer } from '@openpanel/common/server';
 import type { IClickhouseEvent } from '@openpanel/db';
 import type { ILogger } from '@openpanel/logger';
 import type { IUmamiImportConfig } from '@openpanel/validation';
 import { parse } from 'csv-parse';
-import { assocPath } from 'ramda';
 import { z } from 'zod';
 import { BaseImportProvider } from '../base-provider';
+
+async function assertRemoteUrlAllowed(target: string) {
+  const u = new URL(target);
+  if (!/^https?:$/i.test(u.protocol)) {
+    throw new Error('Only http/https URLs are allowed');
+  }
+  const addrs = await lookup(u.hostname, { all: true, verbatim: true });
+  const isPrivate = (addr: string) =>
+    addr === '127.0.0.1' ||
+    addr === '::1' ||
+    addr.startsWith('10.') ||
+    /^172\.(1[6-9]|2\d|3[0-1])\./.test(addr) ||
+    addr.startsWith('192.168.') ||
+    addr.startsWith('fc') ||
+    addr.startsWith('fd') ||
+    addr.startsWith('fe80:');
+  for (const { address } of addrs) {
+    if (isIP(address) && isPrivate(address)) {
+      throw new Error('Blocked private address');
+    }
+  }
+}
79-87: Call SSRF guard before fetch. Block unsafe targets early.
Apply:
   private async *parseRemoteFile(
     url: string,
@@
   ): AsyncGenerator<UmamiRawEvent, void, unknown> {
     const { signal, maxBytes, maxRows } = opts;
     const controller = new AbortController();
@@
-    const res = await fetch(url, { signal: controller.signal });
+    await assertRemoteUrlAllowed(url);
+    const res = await fetch(url, { signal: controller.signal });

Also applies to: 97-103
213-221: Device ID fallback is invalid; use stable available identifiers. Stop passing visit_id as ip/ua.
Apply:
-      const deviceId =
-        rawEvent.visit_id ||
-        generateDeviceId({
-          ip: rawEvent.visit_id!,
-          ua: rawEvent.visit_id!,
-          origin: projectId,
-          salt: 'xxx',
-        });
+      const deviceId =
+        rawEvent.visit_id ||
+        rawEvent.distinct_id ||
+        rawEvent.session_id ||
+        rawEvent.event_id ||
+        randomUUID();
224-229: Fix URL construction (require hostname, prefix query with '?'). Avoid invalid URLs.
Apply:
-      const url = rawEvent.url_path
-        ? `https://${[rawEvent.hostname, rawEvent.url_path, rawEvent.url_query]
-            .filter(Boolean)
-            .join('')}`
-        : '';
+      let url = '';
+      if (rawEvent.hostname && rawEvent.url_path) {
+        const q = rawEvent.url_query
+          ? rawEvent.url_query.startsWith('?')
+            ? rawEvent.url_query
+            : `?${rawEvent.url_query}`
+          : '';
+        url = `https://${rawEvent.hostname}${rawEvent.url_path}${q}`;
+      }
231-234: Include referrer_query in referrer URL. Preserve query parameters.
Apply:
-      const referrerUrl = rawEvent.referrer_domain
-        ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}`
-        : '';
+      const refQuery = rawEvent.referrer_query
+        ? rawEvent.referrer_query.startsWith('?')
+          ? rawEvent.referrer_query
+          : `?${rawEvent.referrer_query}`
+        : '';
+      const referrerUrl = rawEvent.referrer_domain
+        ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}${refQuery}`
+        : '';
257-297: Flatten properties to primitives; ClickHouse rejects nested objects. Replace nested __query and assocPath usage with flat primitive keys.
Apply:
-      let properties: Record<string, any> = {};
-
-      if (query) {
-        properties.__query = query;
-      }
+      const properties: Record<
+        string,
+        string | number | boolean | null | undefined
+      > = {};
+      if (query) {
+        for (const [k, v] of Object.entries(query)) {
+          properties[`query.${k}`] = v as string;
+        }
+      }
@@
-      if (rawEvent.utm_source)
-        properties = assocPath(
-          ['__query', 'utm_source'],
-          rawEvent.utm_source,
-          properties,
-        );
-      if (rawEvent.utm_medium)
-        properties = assocPath(
-          ['__query', 'utm_medium'],
-          rawEvent.utm_medium,
-          properties,
-        );
-      if (rawEvent.utm_campaign)
-        properties = assocPath(
-          ['__query', 'utm_campaign'],
-          rawEvent.utm_campaign,
-          properties,
-        );
-      if (rawEvent.utm_content)
-        properties = assocPath(
-          ['__query', 'utm_content'],
-          rawEvent.utm_content,
-          properties,
-        );
-      if (rawEvent.utm_term)
-        properties = assocPath(
-          ['__query', 'utm_term'],
-          rawEvent.utm_term,
-          properties,
-        );
+      if (rawEvent.utm_source) properties.utm_source = rawEvent.utm_source;
+      if (rawEvent.utm_medium) properties.utm_medium = rawEvent.utm_medium;
+      if (rawEvent.utm_campaign) properties.utm_campaign = rawEvent.utm_campaign;
+      if (rawEvent.utm_content) properties.utm_content = rawEvent.utm_content;
+      if (rawEvent.utm_term) properties.utm_term = rawEvent.utm_term;
apps/start/src/modals/add-import.tsx (1)
245-249: Avoid leaking upstream error details to end users. Use a generic error toast; log details server-side.
Apply:
-    onError: (error) => {
-      toast.error('Import failed', {
-        description: error.message,
-      });
-    },
+    onError: () => {
+      toast.error('Import failed', {
+        description: 'We could not start the import. Please try again later.',
+      });
+    },
packages/importer/src/providers/mixpanel.ts (1)
85-91: Sensitive credentials logged. Line 90 logs the `serviceAccount` credential, which can aid attackers in credential stuffing or enumeration attacks.
packages/db/src/services/import.service.ts (4)
256-256: Risk of timestamp collisions with existing events. Lines 256 and 293 adjust timestamps by exactly ±1000ms (1 second), which could create timestamp conflicts with real events at those exact timestamps, leading to non-deterministic event ordering.
Also applies to: 293-293
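One way to avoid a single shared ±1000 ms offset — sketched here with assumed semantics, not the service's actual code — is to give each synthetic event a unique per-index offset:

```typescript
// Sketch only: spreads synthetic boundary timestamps (e.g. session_start /
// session_end) by a unique per-event offset instead of a fixed ±1000 ms.
function offsetTimestamp(base: Date, index: number, direction: -1 | 1): Date {
  // 1000 ms base offset plus a small unique per-index component keeps
  // synthetic events ordered without piling up on one fixed timestamp.
  const offsetMs = direction * (1000 + index);
  return new Date(base.getTime() + offsetMs);
}

const base = new Date('2025-01-01T00:00:00.000Z');
const starts = [0, 1, 2].map((i) => offsetTimestamp(base, i, -1).getTime());
// → three distinct timestamps, all strictly before the base event.
```

This does not eliminate collisions with unrelated real events, but it removes the guaranteed clash between synthetic events that all land at exactly base ± 1 second.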
505-506: Inconsistent parameter handling. Lines 505-506 use `sqlstring.escape()` for inline parameter substitution, while the rest of the codebase uses ClickHouse's `query_params` mechanism. This inconsistency reduces maintainability.
559-559: Use deterministic status aggregation instead of `any(import_status)`. Line 559 uses `any(import_status)`, which returns an arbitrary status from the group, making the result non-deterministic when events have mixed statuses.
603-603: Validate status value instead of using type assertion. Line 603 uses a type assertion which bypasses type safety. If ClickHouse returns an unexpected status value, the type system won't catch it.
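A runtime guard of this shape would replace the `as` assertion; the status names below are assumptions based on the import lifecycle described in this review:

```typescript
// Sketch: narrow an untrusted string to a known status at runtime instead of
// asserting it. Unexpected values fail loudly rather than flowing downstream.
const IMPORT_STATUSES = ['pending', 'processing', 'completed', 'failed'] as const;
type ImportStatus = (typeof IMPORT_STATUSES)[number];

function parseImportStatus(value: string): ImportStatus {
  if ((IMPORT_STATUSES as readonly string[]).includes(value)) {
    return value as ImportStatus;
  }
  throw new Error(`Unexpected import status from ClickHouse: ${value}`);
}
```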
🧹 Nitpick comments (15)
packages/common/server/parser-user-agent.ts (1)
84-112: Consider reducing code duplication. The pattern `typeof overrides?.__x === 'string' && overrides?.__x ? overrides?.__x : res.x` is repeated seven times for different fields. This increases maintenance overhead—if the override logic needs adjustment, you'd need to update it in seven places. Consider extracting this logic into a helper function:
+function getOverride<T>(
+  override: unknown,
+  fallback: T | undefined
+): T | undefined | string {
+  return typeof override === 'string' && override !== '' ? override : fallback;
+}
+
 export function parseUserAgent(
   ua?: string | null,
   overrides?: Record<string, unknown>,
 ) {
   if (!ua) return parsedServerUa;
   const res = parse(ua);
   if (isServer(res)) {
     return parsedServerUa;
   }
   return {
-    os:
-      typeof overrides?.__os === 'string' && overrides?.__os
-        ? overrides?.__os
-        : res.os.name,
+    os: getOverride(overrides?.__os, res.os.name),
-    osVersion:
-      typeof overrides?.__osVersion === 'string' && overrides?.__osVersion
-        ? overrides?.__osVersion
-        : res.os.version,
+    osVersion: getOverride(overrides?.__osVersion, res.os.version),
-    browser:
-      typeof overrides?.__browser === 'string' && overrides?.__browser
-        ? overrides?.__browser
-        : res.browser.name,
+    browser: getOverride(overrides?.__browser, res.browser.name),
-    browserVersion:
-      typeof overrides?.__browserVersion === 'string' &&
-      overrides?.__browserVersion
-        ? overrides?.__browserVersion
-        : res.browser.version,
+    browserVersion: getOverride(overrides?.__browserVersion, res.browser.version),
-    device:
-      typeof overrides?.__device === 'string' && overrides?.__device
-        ? overrides?.__device
-        : res.device.type || getDevice(ua),
+    device: getOverride(overrides?.__device, res.device.type || getDevice(ua)),
-    brand:
-      typeof overrides?.__brand === 'string' && overrides?.__brand
-        ? overrides?.__brand
-        : res.device.vendor,
+    brand: getOverride(overrides?.__brand, res.device.vendor),
-    model:
-      typeof overrides?.__model === 'string' && overrides?.__model
-        ? overrides?.__model
-        : res.device.model,
+    model: getOverride(overrides?.__model, res.device.model),
     isServer: false,
   } as const;
 }
packages/db/src/buffers/base-buffer.ts (1)
92-126: Avoid lock expiry during long flushes (possible double processing). If onFlush exceeds lockTimeout (60s), the Redis lock can expire mid-run and another worker may acquire it. Add a heartbeat to extend TTL only when the lockId still matches.
Apply this diff:
@@
   private async releaseLock(lockId: string): Promise<void> {
@@
   }
+  private async refreshLock(lockId: string): Promise<void> {
+    const script = `
+      if redis.call("get", KEYS[1]) == ARGV[1] then
+        return redis.call("expire", KEYS[1], ARGV[2])
+      else
+        return 0
+      end
+    `;
+    await getRedisCache().eval(
+      script,
+      1,
+      this.lockKey,
+      lockId,
+      this.lockTimeout.toString(),
+    );
+  }
+
   async tryFlush() {
@@
-    if (acquired === 'OK') {
+    if (acquired === 'OK') {
+      const intervalMs =
+        Math.max(10, Math.floor(this.lockTimeout / 2)) * 1000;
+      const heartbeat = setInterval(() => {
+        void this.refreshLock(lockId).catch(() => {});
+      }, intervalMs);
       try {
@@
       } catch (error) {
@@
       } finally {
+        clearInterval(heartbeat);
         await this.releaseLock(lockId);
@@
       }
     }
   }
packages/trpc/src/routers/auth.ts (1)
152-154: Use the correct error type for “User already exists”. This is a conflict, not a not-found case. Prefer a 400/409-style error.
Apply this diff:
-import { TRPCAccessError, TRPCNotFoundError } from '../errors';
+import {
+  TRPCAccessError,
+  TRPCNotFoundError,
+  TRPCBadRequestError,
+} from '../errors';
@@
-      if (user) {
-        throw TRPCNotFoundError('User already exists');
-      }
+      if (user) {
+        throw TRPCBadRequestError('User already exists');
+      }

Also applies to: 31-31
packages/importer/src/providers/mixpanel.test.ts (1)
5-23: Granularity mismatch: daily vs monthly chunks. Test expects daily chunks, but Mixpanel provider comments mention monthly chunking for reliability. Align expectation and implementation (or update the comment) to avoid future drift.
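Whichever granularity the provider settles on, the windowing can be made explicit and testable. A sketch for daily windows — `chunkByDay` is a hypothetical helper, not the provider's actual API:

```typescript
// Splits an inclusive date range into one-day export windows. Keeping this
// pure makes the granularity easy to assert in tests, so the test and the
// provider cannot silently drift apart.
function chunkByDay(from: Date, to: Date): Array<{ from: string; to: string }> {
  const chunks: Array<{ from: string; to: string }> = [];
  const cursor = new Date(from);
  while (cursor <= to) {
    const day = cursor.toISOString().slice(0, 10);
    chunks.push({ from: day, to: day });
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return chunks;
}

const days = chunkByDay(new Date('2025-01-01'), new Date('2025-01-03'));
// → 3 one-day chunks: 2025-01-01, 2025-01-02, 2025-01-03
```

A monthly variant would differ only in the cursor step, so the chosen granularity lives in exactly one place.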
packages/importer/src/index.ts (1)
1-13: LGTM: coherent public surface. Exports for Umami/Mixpanel and core types look consistent. Consider exporting BaseImportProvider if external providers are intended later.
apps/worker/src/boot-workers.ts (2)
61-64: Clamp concurrency to ≥1 to avoid invalid worker configs. ENV may be "0" or non-numeric; guard with fallback.
- concurrency: Number.parseInt(process.env.IMPORT_JOB_CONCURRENCY || '1', 10), + concurrency: + Math.max(1, Number.parseInt(process.env.IMPORT_JOB_CONCURRENCY ?? '', 10) || 1),- concurrency: Number.parseInt(process.env.EVENT_JOB_CONCURRENCY || '1', 10), + concurrency: + Math.max(1, Number.parseInt(process.env.EVENT_JOB_CONCURRENCY ?? '', 10) || 1),Also applies to: 37-38
137-143: Graceful shutdown should consider all active queues (include import). Only waiting on cron can exit while imports are active. Wait for key queues or rely solely on worker.close.
-    await waitForQueueToEmpty(cronQueue);
+    // Wait for critical queues to drain before closing workers
+    await Promise.all([
+      waitForQueueToEmpty(cronQueue),
+      waitForQueueToEmpty(sessionsQueue),
+      waitForQueueToEmpty(notificationQueue),
+      waitForQueueToEmpty(miscQueue),
+      waitForQueueToEmpty(importQueue),
+    ]);
168-181: Add retries/backoff for import jobs. Imports are long-running and transiently fail (rate limits, net hiccups). Configure sane defaults.
 export const importQueue = new Queue<ImportQueuePayload>('import', {
   connection: getRedisQueue(),
   defaultJobOptions: {
     removeOnComplete: 10,
     removeOnFail: 50,
+    attempts: 3,
+    backoff: { type: 'exponential', delay: 5000 },
+    // prevent duplicates if enqueued idempotently from API
+    // jobId can be importId if desired at call-site
   },
 });
477-496: Import model polish: set defaults and add indexes; keep currentStep nullable.
- Set status default to 'pending' to match lifecycle.
- Add indexes for common lookups (by projectId, status, recency).
- Keep currentStep optional; worker logic uses NULL for “new import”.
 model Import {
   id        String  @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid
   projectId String
   project   Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
   jobId     String? // BullMQ job ID
-  status        ImportStatus
+  status        ImportStatus @default(pending)
   statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs"
   errorMessage  String?
   /// [IPrismaImportConfig]
   config          Json
   totalEvents     Int     @default(0)
   processedEvents Int     @default(0)
   currentStep     String?
   currentBatch    String? // String date 2020-01-01
   createdAt   DateTime  @default(now())
   completedAt DateTime?
   updatedAt   DateTime  @default(now()) @updatedAt
   @@map("imports")
+  @@index([projectId, createdAt(sort: Desc)])
+  @@index([status])
+  @@index([projectId, status])
 }
121-126: Make job removal resilient. Wrap job.remove() in try/catch so delete isn’t blocked by queue errors; optionally skip removal if status is processing.
Apply:
-      if (importRecord.jobId) {
-        const job = await importQueue.getJob(importRecord.jobId);
-        if (job) {
-          await job.remove();
-        }
-      }
+      if (importRecord.jobId) {
+        try {
+          const job = await importQueue.getJob(importRecord.jobId);
+          if (job) await job.remove();
+        } catch (e) {
+          // Best-effort cleanup; continue with delete
+        }
+      }
24-31: Paginate list to avoid unbounded result sets. Add take/skip or cursor-based pagination; default e.g. last 50, with client-driven paging.
Apply:
-      const rows = await db.import.findMany({
-        where: { projectId: input.projectId },
-        orderBy: { createdAt: 'desc' },
-      });
+      const rows = await db.import.findMany({
+        where: { projectId: input.projectId },
+        orderBy: { createdAt: 'desc' },
+        take: 50,
+      });
206-213: Null-safe provider/type display to work with server-side redaction.Avoid crashes when config is null.
Apply:
-              <div>{imp.config.provider}</div>
+              <div>{imp.config?.provider ?? 'unknown'}</div>
 ...
-              {imp.config.type}
+              {imp.config?.type ?? 'n/a'}
102-111: Don’t surface full internal error details in UI. Truncate error text to reduce leak risk.
Apply:
-          <Tooltiper
-            content={errorMessage}
+          <Tooltiper
+            content={errorMessage ? String(errorMessage).slice(0, 300) : 'Error'}
             tooltipClassName="max-w-xs break-words"
           >
201-207: Use job logger instead of console.log. Align logging and ensure capture.
Apply:
-    console.log('Generating session IDs', { from });
+    jobLogger.info('Generating session IDs', { from });
309-314: Address TODO or document why it's deferred. The TODO comment indicates incomplete screen view detection logic. This heuristic-based approach may cause incorrect event name mapping for production data.
If this is a temporary workaround, please document when and how it should be removed. Otherwise, do you want me to help implement a more robust solution or open an issue to track this task?
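If the heuristic stays, isolating it in one named predicate at least makes the assumption explicit and testable. A sketch — the event name and property here are assumptions about Mixpanel's export format, not confirmed by this PR:

```typescript
// Hypothetical predicate for the screen-view heuristic under discussion:
// treat a Mixpanel event as a screen view when it is the SDK's page-view
// event or carries a screen-ish property. Property names are assumptions.
function isLikelyScreenView(event: {
  name: string;
  properties: Record<string, unknown>;
}): boolean {
  return (
    event.name === '$mp_web_page_view' ||
    typeof event.properties.$screen_name === 'string'
  );
}
```

Centralizing the rule means removing the workaround later is a one-line change rather than a hunt through the transform.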
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (72)
- .github/workflows/docker-build.yml (3 hunks)
- apps/api/package.json (1 hunks)
- apps/start/Dockerfile (2 hunks)
- apps/start/package.json (2 hunks)
- apps/start/src/components/integrations/integration-card.tsx (1 hunks)
- apps/start/src/components/project-mapper.tsx (1 hunks)
- apps/start/src/components/ui/badge.tsx (1 hunks)
- apps/start/src/modals/add-import.tsx (1 hunks)
- apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
- apps/start/src/modals/index.tsx (2 hunks)
- apps/start/src/routeTree.gen.ts (11 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
- apps/start/src/utils/math.ts (1 hunks)
- apps/start/src/utils/slug.ts (1 hunks)
- apps/worker/Dockerfile (2 hunks)
- apps/worker/package.json (3 hunks)
- apps/worker/src/boot-workers.ts (4 hunks)
- apps/worker/src/index.ts (2 hunks)
- apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
- apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
- apps/worker/src/jobs/import.ts (1 hunks)
- package.json (1 hunks)
- packages/cli/package.json (0 hunks)
- packages/cli/src/cli.ts (0 hunks)
- packages/cli/src/importer/importer.ts (0 hunks)
- packages/cli/src/importer/index.ts (0 hunks)
- packages/cli/tsconfig.json (0 hunks)
- packages/cli/tsup.config.ts (0 hunks)
- packages/common/package.json (1 hunks)
- packages/common/scripts/get-referrers.ts (1 hunks)
- packages/common/server/index.ts (1 hunks)
- packages/common/server/parse-referrer.test.ts (5 hunks)
- packages/common/server/parse-referrer.ts (4 hunks)
- packages/common/server/parser-user-agent.ts (2 hunks)
- packages/common/src/object.ts (1 hunks)
- packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
- packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
- packages/db/index.ts (1 hunks)
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
- packages/db/prisma/schema.prisma (2 hunks)
- packages/db/src/buffers/base-buffer.ts (1 hunks)
- packages/db/src/buffers/event-buffer.test.ts (8 hunks)
- packages/db/src/clickhouse/client.ts (2 hunks)
- packages/db/src/services/event.service.ts (1 hunks)
- packages/db/src/services/import.service.ts (1 hunks)
- packages/db/src/services/session.service.ts (1 hunks)
- packages/db/src/types.ts (2 hunks)
- packages/importer/package.json (1 hunks)
- packages/importer/src/base-provider.ts (1 hunks)
- packages/importer/src/index.ts (1 hunks)
- packages/importer/src/providers/metadata.ts (1 hunks)
- packages/importer/src/providers/mixpanel.test.ts (1 hunks)
- packages/importer/src/providers/mixpanel.ts (1 hunks)
- packages/importer/src/providers/umami.ts (1 hunks)
- packages/importer/src/types.ts (1 hunks)
- packages/importer/tsconfig.json (1 hunks)
- packages/importer/vitest.config.ts (1 hunks)
- packages/queue/src/queues.ts (1 hunks)
- packages/redis/package.json (1 hunks)
- packages/trpc/package.json (1 hunks)
- packages/trpc/src/root.ts (2 hunks)
- packages/trpc/src/routers/auth.ts (1 hunks)
- packages/trpc/src/routers/import.ts (1 hunks)
- packages/trpc/src/routers/organization.ts (1 hunks)
- packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/src/cli.ts
- packages/cli/src/importer/importer.ts
- packages/cli/tsconfig.json
- packages/cli/package.json
- packages/cli/tsup.config.ts
- packages/cli/src/importer/index.ts
🚧 Files skipped from review as they are similar to previous changes (26)
- apps/start/src/modals/date-ranger-picker.tsx
- packages/common/server/index.ts
- apps/api/package.json
- apps/start/src/modals/index.tsx
- packages/db/src/services/event.service.ts
- packages/trpc/package.json
- packages/db/src/types.ts
- packages/common/server/parse-referrer.test.ts
- apps/start/src/components/ui/badge.tsx
- apps/start/src/components/integrations/integration-card.tsx
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
- packages/db/src/buffers/event-buffer.test.ts
- packages/common/package.json
- packages/importer/tsconfig.json
- packages/db/src/services/session.service.ts
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql
- packages/common/src/object.ts
- apps/start/package.json
- apps/start/src/routeTree.gen.ts
- packages/db/src/clickhouse/client.ts
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql
- packages/importer/src/base-provider.ts
- packages/importer/package.json
- package.json
- packages/trpc/src/root.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
- apps/start/src/utils/slug.ts
- apps/start/src/components/project-mapper.tsx
- apps/start/src/modals/add-import.tsx
- apps/start/src/utils/math.ts
🧬 Code graph analysis (15)
apps/worker/src/boot-workers.ts (2)
- packages/queue/src/queues.ts (1): `importQueue` (175-181)
- apps/worker/src/jobs/import.ts (1): `importJob` (25-281)

packages/db/code-migrations/5-add-imports-table.ts (2)
- packages/db/code-migrations/helpers.ts (1): `getIsCluster` (17-24)
- packages/db/src/clickhouse/migration.ts (2): `createTable` (68-116), `runClickhouseMigrationCommands` (322-454)

packages/queue/src/queues.ts (1)
- packages/redis/redis.ts (1): `getRedisQueue` (93-105)

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1): `Route` (12-34)
- packages/importer/src/providers/metadata.ts (1): `IMPORT_PROVIDERS` (13-30)
- apps/public/components/ui/tooltip.tsx (1): `Tooltiper` (36-51)
- apps/start/src/components/integrations/integration-card.tsx (3): `IntegrationCard` (91-112), `IntegrationCardLogoImage` (50-69), `IntegrationCardFooter` (3-15)

packages/trpc/src/routers/import.ts (6)
- packages/trpc/src/trpc.ts (2): `createTRPCRouter` (138-138), `protectedProcedure` (161-164)
- packages/trpc/src/access.ts (1): `getProjectAccessCached` (4-4)
- packages/trpc/src/errors.ts (1): `TRPCAccessError` (3-7)
- packages/db/src/prisma-client.ts (1): `db` (270-270)
- packages/validation/src/index.ts (1): `zCreateImport` (478-486)
- packages/queue/src/queues.ts (1): `importQueue` (175-181)

apps/start/src/components/project-mapper.tsx (2)
- packages/db/src/services/project.service.ts (1): `IServiceProject` (7-7)
- apps/start/src/components/ui/select.tsx (5): `Select` (173-173), `SelectTrigger` (181-181), `SelectValue` (182-182), `SelectContent` (174-174), `SelectItem` (176-176)

packages/importer/vitest.config.ts (1)
- vitest.shared.ts (1): `getSharedVitestConfig` (4-27)

apps/worker/src/jobs/import.ts (6)
- packages/queue/src/queues.ts (1): `ImportQueuePayload` (168-173)
- packages/db/src/prisma-client.ts (1): `db` (270-270)
- packages/db/src/services/import.service.ts (10): `ImportSteps` (671-671), `updateImportStatus` (673-736), `getImportDateBounds` (610-634), `insertImportBatch` (28-55), `generateSessionIds` (62-98), `createSessionsStartEndEvents` (106-316), `moveImportsToProduction` (322-426), `backfillSessionsToProduction` (428-521), `markImportComplete` (526-545), `getImportProgress` (550-605)
- packages/db/src/clickhouse/client.ts (2): `formatClickhouseDate` (208-219), `error` (40-45)
- packages/db/src/services/event.service.ts (1): `IClickhouseEvent` (68-102)
- packages/logger/index.ts (1): `ILogger` (6-6)

packages/importer/src/types.ts (2)
- packages/db/src/services/event.service.ts (1): `IServiceImportedEventPayload` (144-147)
- packages/db/src/services/import.service.ts (2): `ImportProgress` (18-23), `ImportStageResult` (12-16)

apps/worker/src/index.ts (1)
- packages/queue/src/queues.ts (1): `importQueue` (175-181)

packages/importer/src/providers/umami.ts (5)
- packages/validation/src/index.ts (1): `IUmamiImportConfig` (456-456)
- packages/logger/index.ts (1): `ILogger` (6-6)
- packages/db/src/services/event.service.ts (2): `IClickhouseEvent` (68-102), `query` (701-847)
- packages/common/src/url.ts (2): `parsePath` (11-49), `isSameDomain` (51-63)
- packages/common/server/parse-referrer.ts (2): `parseReferrer` (17-26), `getReferrerWithQuery` (28-66)

apps/start/src/modals/add-import.tsx (3)
- packages/validation/src/index.ts (5): `zUmamiImportConfig` (452-454), `zMixpanelImportConfig` (461-470), `IUmamiImportConfig` (456-456), `IMixpanelImportConfig` (471-471), `IImportConfig` (473-476)
- apps/start/src/components/project-mapper.tsx (1): `ProjectMapper` (33-116)
- apps/public/lib/utils.ts (1): `cn` (4-6)

packages/importer/src/providers/mixpanel.test.ts (1)
- packages/importer/src/providers/mixpanel.ts (1): `MixpanelProvider` (18-441)

packages/db/src/services/import.service.ts (4)
- packages/db/src/services/event.service.ts (1): `IClickhouseEvent` (68-102)
- packages/db/src/clickhouse/client.ts (4): `ch` (131-161), `TABLE_NAMES` (48-60), `convertClickhouseDateToJs` (238-240), `formatClickhouseDate` (208-219)
- packages/logger/index.ts (1): `ILogger` (6-6)
- packages/db/src/prisma-client.ts (1): `db` (270-270)

packages/importer/src/providers/mixpanel.ts (7)
- packages/validation/src/index.ts (1): `IMixpanelImportConfig` (471-471)
- packages/logger/index.ts (1): `ILogger` (6-6)
- packages/db/src/services/event.service.ts (2): `IClickhouseEvent` (68-102), `query` (701-847)
- packages/common/src/url.ts (2): `parsePath` (11-49), `isSameDomain` (51-63)
- packages/common/server/parse-referrer.ts (2): `parseReferrer` (17-26), `getReferrerWithQuery` (28-66)
- packages/common/server/parser-user-agent.ts (2): `parseUserAgent` (72-115), `UserAgentInfo` (71-71)
- packages/common/src/object.ts (2): `toDots` (3-36), `strip` (48-48)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-dashboard
- GitHub Check: build-and-push-worker
- GitHub Check: build-and-push-api
🔇 Additional comments (29)
.github/workflows/docker-build.yml (1)
149-149: Guard tag creation to main branch only: LGTM. The conditional guards on all three tag-creation steps (`if: github.ref == 'refs/heads/main'`) correctly restrict tag creation to the main branch. This is good practice for release/deployment management, preventing accidental tags from feature branches while still allowing Docker images to be built for all branches. The syntax and consistency across all three jobs are correct.

Also applies to: 210-210, 271-271
packages/common/server/parser-user-agent.ts (1)
71-71: Good addition of type export. Exporting the return type as `UserAgentInfo` is a best practice that improves type safety and developer experience for consumers of this function.

packages/redis/package.json (1)
11-11: Verify the intent behind pinning ioredis to an exact version. Version 5.8.2 is confirmed as valid, latest, and published by the official Redis Labs maintainer. However, this is the only production dependency in the package.json using exact pinning; all others use caret constraints (`^`) or monorepo references (`workspace:*`, `catalog:`). Ensure this pinning is intentional for queue/job-processing stability rather than an accidental divergence from project conventions. If stability is not a critical concern here, consider aligning with the project's versioning strategy.

packages/db/src/buffers/base-buffer.ts (1)
1-1: Import-path update is consistent with the monorepo re-exports. For safety, run the same re-export check script shared in organization.ts to ensure no accidental duplicate exports.
packages/trpc/src/routers/auth.ts (1)
15-15: Import-path change LGTM.packages/trpc/src/routers/organization.ts (1)
14-14: No issues found; import change is correct. Re-export verified: `generateSecureId` is defined once in `packages/common/server/id.ts` and properly re-exported. The import-path consolidation to `@openpanel/common/server` is consistent with existing patterns across the codebase (already used in auth.ts, db/buffers, and other files). Tree-shaking will work correctly.

apps/worker/src/jobs/events.incoming-events.test.ts (2)
102-102: LGTM! Test expectations correctly updated. The referrerType expectations now correctly use an empty string instead of 'unknown', aligning with the updated behavior in packages/common/server/parse-referrer.ts where the default type for non-matches changed from 'unknown' to ''.
Also applies to: 210-210
apps/start/src/utils/math.ts (1)
1-1: LGTM! Re-export now uses package root. The change from '@openpanel/common/src/math' to '@openpanel/common' aligns with the pattern of consuming the public API surface from package roots rather than internal paths.
packages/db/code-migrations/5-add-imports-table.sql (1)
1-43: LGTM! Table schema is well-structured. The events_imports table schema provides comprehensive event tracking with import-specific metadata. The 7-day TTL on imported_at_meta provides automatic cleanup for staging data. Note that in clustered deployments, the TTL should be applied to the replicated table rather than the distributed table (see the related comment on the .ts migration file).
apps/start/src/utils/slug.ts (1)
1-1: LGTM! Re-export now uses package root. Consistent with the broader pattern of consuming the public API surface from '@openpanel/common' rather than internal paths.
packages/common/server/parse-referrer.ts (1)
1-3: LGTM! Referrer parsing logic improved. The changes improve consistency and efficiency:
- Default type changed from 'unknown' to empty string for unmatched referrers
- Source is now lowercased once during computation rather than repeatedly during comparisons
- Updated matching logic is cleaner and more maintainable
Test files correctly reflect these behavioral changes.
Also applies to: 23-23, 35-51, 63-63
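The described behavior can be illustrated with a minimal sketch; the matcher table, helper name, and return shape here are assumptions for illustration, not the actual parse-referrer implementation:

```typescript
// Hypothetical sketch: lowercase the hostname once, and return an empty
// type (not 'unknown') when the referrer matches nothing.
const KNOWN_REFERRERS: Record<string, string> = {
  'google.com': 'search',
  'facebook.com': 'social',
};

function parseReferrerType(referrer: string): { type: string; name: string } {
  let host: string;
  try {
    // Lowercase once here instead of on every comparison.
    host = new URL(referrer).hostname.toLowerCase().replace(/^www\./, '');
  } catch {
    return { type: '', name: '' };
  }
  return { type: KNOWN_REFERRERS[host] ?? '', name: host };
}

console.log(parseReferrerType('https://www.google.com/search?q=x')); // { type: 'search', name: 'google.com' }
console.log(parseReferrerType('https://example.org/').type === ''); // true
```

The empty-string default means downstream code can treat "no referrer" and "unmatched referrer" uniformly with simple truthiness checks.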
packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1)
1-2: LGTM! The migration correctly adds a nullable `statusMessage` column to track detailed import status information.

packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1)
1-2: LGTM! The migration correctly makes `currentStep` nullable, which is appropriate for representing imports that haven't started or tracking optional step progression.

apps/worker/package.json (2)
41-41: Verify intentional removal of version-range flexibility. The `tsdown` dependency was changed from `^0.14.2` to `0.14.2` (caret removed), which prevents automatic minor/patch updates. This appears unrelated to the importer feature. Is this pinning intentional for reproducibility, or was it accidental? If unintentional, consider restoring the caret:

```diff
- "tsdown": "0.14.2",
+ "tsdown": "^0.14.2",
```
22-22: LGTM! Adding the `@openpanel/importer` dependency correctly wires the worker to process import jobs.

packages/db/index.ts (1)
25-25: LGTM! The export correctly surfaces the new import service, consistent with other service exports in this file.
apps/worker/src/index.ts (1)
8-8: LGTM! The `importQueue` is correctly imported and added to the Bull Board UI, enabling monitoring of import jobs alongside existing queues.

Also applies to: 42-42
apps/worker/Dockerfile (2)
38-38: LGTM: workspace caching for importer. Including packages/importer/package.json improves pnpm cache hits.
85-87: LGTM: importer copied into runner image. Runtime availability is ensured; ordering with logger/common is fine.
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
13-13: Migration is safe; config validation requires the field. The `config` field in `zCreateImport` is mandatory (a union of three required config types with no `.optional()`), and the single import-creation path always provides it from validated input. Since the column remains `NOT NULL`, dropping the DEFAULT poses no risk of introducing NULLs.

apps/worker/src/jobs/events.incoming-event.ts (3)
8-12: Consolidation to @openpanel/common/server looks good. Shared parser usage reduces drift.
120-123: Referrer precedence change: confirm downstream expectations. Now referrer/url comes only from the parsed referrer (not UTM), and type prioritizes referrer over UTM. Validate dashboards/segments relying on the previous precedence (tests are adjusted, but product semantics matter).
packages/importer/src/providers/mixpanel.ts (1)
401-440: LGTM! The property stripping and string coercion logic correctly handles edge cases including null/undefined values and nested objects. The regex pattern effectively filters Mixpanel-specific properties.
packages/db/src/services/import.service.ts (6)
28-55: LGTM! The batch insertion logic correctly handles empty arrays, adds required import metadata, and leverages ClickHouse's JSONEachRow format for efficient bulk inserts.
62-98: LGTM! The deterministic session ID generation using MD5 hashing of device_id and 30-minute time windows ensures consistent results across imports. The use of `mutations_sync: '2'` properly ensures replication before proceeding.
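The deterministic derivation can be sketched as follows; the helper name and exact hash input are assumptions (the real `generateSessionIds` runs inside ClickHouse), but the bucketing idea is the same:

```typescript
import { createHash } from 'node:crypto';

// Sketch: hash device_id together with a 30-minute time bucket so that
// re-running an import reproduces identical session ids.
const WINDOW_MS = 30 * 60 * 1000;

function deterministicSessionId(deviceId: string, createdAt: Date): string {
  const bucket = Math.floor(createdAt.getTime() / WINDOW_MS);
  return createHash('md5').update(`${deviceId}:${bucket}`).digest('hex');
}

const a = deterministicSessionId('device-1', new Date('2025-01-01T10:00:00Z'));
const b = deterministicSessionId('device-1', new Date('2025-01-01T10:10:00Z'));
const c = deterministicSessionId('device-1', new Date('2025-01-01T11:00:00Z'));
console.log(a === b, a === c); // true false
```

Because the id depends only on the event data, a retried or resumed import assigns the same sessions without coordinating any state.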
322-426: LGTM! The migration logic correctly handles session events with adjusted timestamps by including them via session_id lookup rather than strict date filtering. The progress headers configuration prevents connection timeouts for large imports.
526-545: LGTM! The completion marking correctly uses `mutations_sync: '2'` to ensure all replicas are updated before the function returns, preventing race conditions.
610-634: LGTM! The date bounds query correctly handles the optional `fromCreatedAt` parameter and provides sensible defaults for empty result sets.
673-736: Well-structured status update flow. The step-based union type provides excellent type safety, ensuring each step can only include valid fields. The status message generation is clear and user-friendly.
Note: The Postgres update at the end (lines 732-735) occurs after the job progress update. If the database update fails, the job progress will be out of sync with the database state. Consider whether this ordering is intentional or if database-first updates would be more appropriate.
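The praised pattern can be illustrated with a minimal discriminated union; the step names and fields below are illustrative, not the actual `ImportSteps` definition:

```typescript
// Sketch: each step only admits the fields that make sense for it,
// so an invalid combination is a compile-time error.
type ImportStatusUpdate =
  | { step: 'loading'; batch: string; totalEvents?: number; processedEvents?: number }
  | { step: 'sessions'; progress: number }
  | { step: 'completed' };

function statusMessage(update: ImportStatusUpdate): string {
  switch (update.step) {
    case 'loading':
      return `Loading batch ${update.batch}`;
    case 'sessions':
      return `Creating sessions (${update.progress}%)`;
    case 'completed':
      return 'Import completed';
  }
}

console.log(statusMessage({ step: 'loading', batch: '2025-01-01' })); // Loading batch 2025-01-01
```

Passing `{ step: 'completed', batch: '...' }` would be rejected by the compiler, which is exactly the safety property the review highlights.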
```dockerfile
COPY packages/trpc/package.json packages/trpc/
COPY packages/json/package.json packages/json/
COPY packages/common/package.json packages/common/
COPY packages/importer/package.json packages/importer/
```
🧩 Analysis chain
Ensure @openpanel/importer is either available at runtime or not linked.
You add packages/importer/package.json for install caching, but the runner image does not copy packages/importer. If apps/start does any value import (not type-only) of @openpanel/importer, PNPM’s workspace symlink will point to a missing path at runtime.
- Option A: If only type imports are used, keep as-is.
- Option B: If runtime imports exist, also copy the importer sources into runner:
```diff
+# In runner stage, alongside other packages
+COPY --from=build /app/packages/importer ./packages/importer
```

Run to confirm no value imports from apps/start:
Also applies to: 89-89
🏁 Script executed:
#!/bin/bash
# Find imports of @openpanel/importer in Start; flag non-type imports
rg -nP --type=ts --type=tsx -C2 "from ['\"]@openpanel/importer['\"]" apps/start \
  | grep -v "import type" || echo "No value imports found."

Length of output: 51
I need to re-run the verification with a corrected command, as the file type flag issue makes the result inconclusive.
---
🏁 Script executed:
```shell
#!/bin/bash
# Broader search: find any imports of @openpanel/importer in apps/start
# without file type restrictions to avoid flag parsing issues
echo "=== Searching for @openpanel/importer imports ==="
rg -n "@openpanel/importer" apps/start || echo "No matches found."
echo ""
echo "=== Directory structure of apps/start ==="
find apps/start -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \) | head -20
```

Length of output: 1250
Need to verify if this is the only import and search more thoroughly:
---
🏁 Script executed:
```shell
#!/bin/bash
# Find all imports/requires of @openpanel/importer in apps/start
echo "=== All references to @openpanel/importer ==="
rg -n "@openpanel/importer" apps/start -A2 -B2
echo ""
echo "=== Count of files with @openpanel/importer imports ==="
rg -l "@openpanel/importer" apps/start | wc -l
```

Length of output: 1172
Copy @openpanel/importer sources to runner stage to fix broken workspace symlink at runtime.
Apps/start has a value import (import { IMPORT_PROVIDERS } from '@openpanel/importer/providers') in apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx:22. The Dockerfile copies packages/importer/package.json for caching but omits the source files in the runner stage, causing the workspace symlink to point to a missing path at runtime.
Copy the importer sources in the runner stage:
```diff
+# In runner stage, alongside other packages
+COPY --from=build /app/packages/importer ./packages/importer
```

Also applies to: line 89
🤖 Prompt for AI Agents
In apps/start/Dockerfile around lines 32 and 89, the Dockerfile only copies
packages/importer/package.json which leaves the @openpanel/importer source files
absent in the runner stage and breaks the workspace symlink at runtime; fix by
copying the importer sources into the runner stage (either COPY
packages/importer/ packages/importer/ in the runner stage or COPY --from=builder
/app/packages/importer/ /app/packages/importer/ if the builder stage contains
the sources/artifacts) so the import path @openpanel/importer resolves at
runtime.
```ts
const jobLogger = logger.child({
  importId,
  config: record.config,
});
```
Logger context leaks config secrets.
Avoid attaching full config to logs; include non-sensitive fields only.
Apply:
- const jobLogger = logger.child({
- importId,
- config: record.config,
- });
+ const jobLogger = logger.child({
+ importId,
+ provider: record.config?.provider,
+ });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const jobLogger = logger.child({
  importId,
  provider: record.config?.provider,
});
```
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 35 to 38, the logger child is
attaching the entire record.config which may contain secrets; replace that with
a whitelist of non-sensitive fields (e.g., config.name, config.type,
config.source, config.region, config.bucket or other known safe metadata) or
explicitly redact sensitive keys before passing to logger.child — do not log the
full config object; update the jobLogger creation to only include importId plus
the selected safe config fields (or a redact/mask function) so secrets are never
emitted to logs.
```ts
const eventBatch: any = [];
for await (const rawEvent of providerInstance.parseSource(
  resumeLoadingFrom,
)) {
  // Validate event
  if (
    !providerInstance.validate(
      // @ts-expect-error
      rawEvent,
    )
  ) {
    jobLogger.warn('Skipping invalid event', { rawEvent });
    continue;
  }

  eventBatch.push(rawEvent);

  // Process batch when it reaches the batch size
  if (eventBatch.length >= BATCH_SIZE) {
    jobLogger.info('Processing batch', { batchSize: eventBatch.length });

    const transformedEvents: IClickhouseEvent[] = eventBatch.map(
      (
        // @ts-expect-error
        event,
      ) => providerInstance!.transformEvent(event),
    );

    await insertImportBatch(transformedEvents, importId);

    processedEvents += eventBatch.length;
    eventBatch.length = 0;

    const createdAt = new Date(transformedEvents[0]?.created_at || '')
      .toISOString()
      .split('T')[0];

    await updateImportStatus(jobLogger, job, importId, {
      step: 'loading',
      batch: createdAt,
      totalEvents,
      processedEvents,
    });
  }
```
Harden batch transform; avoid crashes on empty/failed transforms.
- Use unknown[] buffer.
- Guard per-event transform with try/catch.
- Safe createdAt derivation.
Apply:
- const eventBatch: any = [];
+ const eventBatch: unknown[] = [];
@@
- if (eventBatch.length >= BATCH_SIZE) {
+ if (eventBatch.length >= BATCH_SIZE) {
jobLogger.info('Processing batch', { batchSize: eventBatch.length });
-
- const transformedEvents: IClickhouseEvent[] = eventBatch.map(
- (
- // @ts-expect-error
- event,
- ) => providerInstance!.transformEvent(event),
- );
+ const transformedEvents: IClickhouseEvent[] = [];
+ for (const ev of eventBatch) {
+ try {
+ transformedEvents.push(
+ providerInstance!.transformEvent(ev as any),
+ );
+ } catch (e) {
+ jobLogger.warn('Skipping event on transform failure', {
+ provider: providerInstance.provider,
+ reason: e instanceof Error ? e.message : String(e),
+ });
+ }
+ }
await insertImportBatch(transformedEvents, importId);
processedEvents += eventBatch.length;
eventBatch.length = 0;
- const createdAt = new Date(transformedEvents[0]?.created_at || '')
- .toISOString()
- .split('T')[0];
+ const firstCreatedAt = transformedEvents[0]?.created_at;
+ const createdAt = firstCreatedAt
+ ? firstCreatedAt.split('T')[0]!
+ : formatClickhouseDate(new Date(), true);
await updateImportStatus(jobLogger, job, importId, {
step: 'loading',
batch: createdAt,
totalEvents,
processedEvents,
});Committable suggestion skipped: line range outside the PR's diff.
```ts
// Process remaining events in the last batch
if (eventBatch.length > 0) {
  const transformedEvents = eventBatch.map(
    (
      // @ts-expect-error
      event,
    ) => providerInstance!.transformEvent(event),
  );

  await insertImportBatch(transformedEvents, importId);

  processedEvents += eventBatch.length;
  eventBatch.length = 0;

  const createdAt = new Date(transformedEvents[0]?.created_at || '')
    .toISOString()
    .split('T')[0];

  await updateImportStatus(jobLogger, job, importId, {
    step: 'loading',
    batch: createdAt,
  });
}
```
Repeat: harden final partial batch and persist counters.
Mirror guarded transform and include totals in status update.
Apply:
- if (eventBatch.length > 0) {
- const transformedEvents = eventBatch.map(
- (
- // @ts-expect-error
- event,
- ) => providerInstance!.transformEvent(event),
- );
+ if (eventBatch.length > 0) {
+ const transformedEvents: IClickhouseEvent[] = [];
+ for (const ev of eventBatch) {
+ try {
+ transformedEvents.push(providerInstance!.transformEvent(ev as any));
+ } catch (e) {
+ jobLogger.warn('Skipping event on transform failure', {
+ provider: providerInstance.provider,
+ reason: e instanceof Error ? e.message : String(e),
+ });
+ }
+ }
await insertImportBatch(transformedEvents, importId);
processedEvents += eventBatch.length;
eventBatch.length = 0;
- const createdAt = new Date(transformedEvents[0]?.created_at || '')
- .toISOString()
- .split('T')[0];
+ const firstCreatedAt = transformedEvents[0]?.created_at;
+ const createdAt = firstCreatedAt
+ ? firstCreatedAt.split('T')[0]!
+ : formatClickhouseDate(new Date(), true);
await updateImportStatus(jobLogger, job, importId, {
step: 'loading',
batch: createdAt,
+ totalEvents,
+ processedEvents,
});
}🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 170 to 192, the final partial
batch is processed without the same guards and error handling used earlier and
the status update omits processed/total counters; update the block to perform
the guarded transform exactly like the earlier batch processing (check
providerInstance exists, handle per-event transform errors or wrap transform in
try/catch and skip/collect failures), ensure transformedEvents is computed only
if providerInstance is present and has valid created_at before deriving
createdAt, and include the processedEvents and total/expected total (e.g.,
totalEvents) in the call to updateImportStatus so counters are persisted
alongside step and batch.
```sql
-- CreateEnum
CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');

-- AlterTable
ALTER TABLE "public"."imports" DROP COLUMN "status",
ADD COLUMN "status" "public"."ImportStatus" NOT NULL,
ALTER COLUMN "currentStep" SET NOT NULL;
```
Destructive migration: status drop/recreate will lose data; currentStep NOT NULL conflicts with code.
- Map TEXT → enum safely; do not drop without backfill.
- Keep currentStep nullable; worker import.ts uses null to detect new imports.
Safe migration example:
--- a/packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
+++ b/packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');
-ALTER TABLE "public"."imports" DROP COLUMN "status",
-ADD COLUMN "status" "public"."ImportStatus" NOT NULL,
-ALTER COLUMN "currentStep" SET NOT NULL;
+-- 1) Add new enum column with default to preserve existing rows
+ALTER TABLE "public"."imports"
+ ADD COLUMN "status_v2" "public"."ImportStatus" NOT NULL DEFAULT 'pending';
+
+-- 2) Map old text values to enum; fall back to 'pending' for unknowns
+UPDATE "public"."imports"
+SET "status_v2" = CASE lower(coalesce("status", 'pending'))
+ WHEN 'pending' THEN 'pending'::"public"."ImportStatus"
+ WHEN 'processing' THEN 'processing'::"public"."ImportStatus"
+ WHEN 'completed' THEN 'completed'::"public"."ImportStatus"
+ WHEN 'failed' THEN 'failed'::"public"."ImportStatus"
+ ELSE 'pending'::"public"."ImportStatus"
+END;
+
+-- 3) Drop old column and rename
+ALTER TABLE "public"."imports" DROP COLUMN "status";
+ALTER TABLE "public"."imports" RENAME COLUMN "status_v2" TO "status";
+ALTER TABLE "public"."imports" ALTER COLUMN "status" DROP DEFAULT;
+
+-- 4) Keep currentStep nullable (code relies on NULL to denote new import)
+-- (Remove the SET NOT NULL)Also, consider wrapping in a transaction if your migration runner allows it.
📝 Committable suggestion

```sql
-- CreateEnum
CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');

-- AlterTable
-- 1) Add new enum column with default to preserve existing rows
ALTER TABLE "public"."imports"
  ADD COLUMN "status_v2" "public"."ImportStatus" NOT NULL DEFAULT 'pending';

-- 2) Map old text values to enum; fall back to 'pending' for unknowns
UPDATE "public"."imports"
SET "status_v2" = CASE lower(coalesce("status", 'pending'))
  WHEN 'pending' THEN 'pending'::"public"."ImportStatus"
  WHEN 'processing' THEN 'processing'::"public"."ImportStatus"
  WHEN 'completed' THEN 'completed'::"public"."ImportStatus"
  WHEN 'failed' THEN 'failed'::"public"."ImportStatus"
  ELSE 'pending'::"public"."ImportStatus"
END;

-- 3) Drop old column and rename
ALTER TABLE "public"."imports" DROP COLUMN "status";
ALTER TABLE "public"."imports" RENAME COLUMN "status_v2" TO "status";
ALTER TABLE "public"."imports" ALTER COLUMN "status" DROP DEFAULT;

-- 4) Keep currentStep nullable (code relies on NULL to denote new import)
-- (Remove the SET NOT NULL)
```
🤖 Prompt for AI Agents
In packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
around lines 8 to 14, the migration currently drops and recreates the "status"
column (which will lose data) and forces "currentStep" to NOT NULL (which
conflicts with import worker logic); instead, create the new enum type, add a
new nullable enum column (e.g. status_new), backfill status_new from the
existing text status using a safe mapping, then drop the old text column and
rename status_new to status (or use ALTER COLUMN TYPE after backfill), and leave
currentStep nullable (do not set NOT NULL). Wrap the whole sequence in a
transaction if your runner supports it and ensure any NOT NULL constraints are
only applied after backfilling.
```ts
const event = JSON.parse(line);
yield event;
} catch (error) {
  console.warn('Failed to parse Mixpanel event:', line);
```
🛠️ Refactor suggestion | 🟠 Major
Use configured logger instead of console.warn.
Lines 134 and 146 use console.warn, bypassing the structured logger passed to the constructor. This creates inconsistent logging and prevents proper log aggregation and filtering.
Apply this diff to use the configured logger:
```diff
 for (const line of lines) {
   if (line.trim()) {
     try {
       const event = JSON.parse(line);
       yield event;
     } catch (error) {
-      console.warn('Failed to parse Mixpanel event:', line);
+      this.logger?.warn('Failed to parse Mixpanel event', { line, error });
     }
   }
 }

 if (buffer.trim()) {
   try {
     const event = JSON.parse(buffer);
     yield event;
   } catch (error) {
-    console.warn('Failed to parse final Mixpanel event:', buffer);
+    this.logger?.warn('Failed to parse final Mixpanel event', { buffer, error });
   }
 }
```

Also applies to: 146-146
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 134 and 146, replace
the direct console.warn calls with the instance's configured logger (e.g.,
this.logger.warn or the logger variable passed into the constructor) so logging
uses the structured logger; update both occurrences to call the logger with the
same message and include the event/line and any parse error details, preserving
message clarity and log level.
```ts
const userAgent = props.osVersion || '';
const uaInfo = this.isWebEvent(props.mp_lib)
  ? parseUserAgent(userAgent, props)
  : this.parseServerDeviceInfo(props);
```
Incorrect user agent assignment.
Line 202 assigns props.osVersion to the userAgent variable, but OS version is not the same as the user agent string. The user agent should be extracted from a property like props.$user_agent or similar.
Verify the correct Mixpanel property name for the user agent string and apply the fix:
- const userAgent = props.osVersion || '';
+ const userAgent = props.$user_agent || '';
const uaInfo = this.isWebEvent(props.mp_lib)
? parseUserAgent(userAgent, props)
  : this.parseServerDeviceInfo(props);

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 202 to 205, the code
incorrectly assigns props.osVersion to userAgent; replace this with the Mixpanel
user agent property (e.g., props.$user_agent or the correct property name from
Mixpanel events), fall back to an empty string or props.osVersion if user agent
is not present, and pass that userAgent into parseUserAgent; ensure the
isWebEvent branch uses the actual UA string and not the OS version.
os: uaInfo.os || props.$os,
os_version: uaInfo.osVersion || props.$osVersion,
browser: uaInfo.browser || props.$browser,
browser_version:
  uaInfo.browserVersion || props.$browserVersion
    ? String(props.$browser_version)
    : '',
🧩 Analysis chain
Inconsistent property naming may cause data loss.
Lines 289 and 292-293 use props.$osVersion and props.$browser_version respectively, but Mixpanel typically uses underscores (e.g., $os_version). This inconsistency may result in missing data if the wrong property names are used.
Verify the correct Mixpanel property naming convention:
🌐 Web query:
What are the correct Mixpanel event property names for OS version and browser version?
💡 Result:
Use Mixpanel's default property names:
- OS name: $os — OS version: $os_version. [1]
- Browser name: $browser — Browser version: $browser_version. [1]
Source: Mixpanel Default/Property reference. [1]
Fix inconsistent Mixpanel property names to prevent data loss.
The code uses camelCase property names (props.$osVersion, props.$browserVersion) but Mixpanel's standard property names are $os_version and $browser_version (snake_case). Additionally, lines 292-294 check props.$browserVersion in the condition but access props.$browser_version in the ternary, creating a logic inconsistency. Update both lines 289 and 292 to use the correct snake_case property names to ensure data is properly captured.
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 288 to 294, the
Mixpanel property names are inconsistent (using camelCase props.$osVersion and
props.$browserVersion) and the ternary condition checks a different prop than it
accesses, which can drop data; change references to use Mixpanel's snake_case
names (props.$os_version and props.$browser_version) and make the ternary check
and the value access use the same snake_case prop, converting to string as
needed (e.g., String(props.$browser_version)) so os_version and browser_version
are populated correctly.
1a11373 to c9ebec6 (Compare)
Actionable comments posted: 3
♻️ Duplicate comments (29)
packages/validation/src/index.ts (1)
478-486: This issue has already been flagged in a previous review. The concern about provider and config mismatches remains unaddressed. Please refer to the previous review comment, which provides a discriminated-union solution to enforce that each provider literal is paired with its correct config shape.
packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1)
1-4: SQL migration is correct. The migration correctly alters the `currentBatch` column to TEXT nullable without a default. The TypeScript type mismatch identified in the past review comment (lines 1-4) should be addressed separately in `packages/importer/src/types.ts`.
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
10-13: Critical: Data migration missing before dropping columns. As identified in the past review comment (lines 10-12), this migration drops the `provider`, `sourceLocation`, and `sourceType` columns without first migrating their data to the `config` JSON field. This will cause data loss for existing import records and break the worker's `createProvider` function, which expects `config.provider`.
packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1)
8-14: Destructive migration: status drop/recreate will lose data; currentStep NOT NULL conflicts with code. This migration drops and recreates the status column, which will lose existing data. Additionally, forcing currentStep to NOT NULL conflicts with the schema definition (line 489 in schema.prisma shows `currentStep String?`) and with worker code that relies on null values to detect new imports.
The safe migration approach is to:
- Add a new enum column with a default value
- Backfill from the existing text column with proper mapping
- Drop the old column and rename
- Keep currentStep nullable
Refer to the detailed migration example in the existing past review comment.
apps/start/src/components/project-mapper.tsx (2)
23-31: Define a proper form type instead of using `any`. The interface uses `any` for all generic type parameters, defeating TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.
Apply the diff suggested in the past review comment to add proper typing with a `ProjectMapperFormData` interface.
92-92: Remove the arbitrary 10-project limit. Line 92 limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling.
Remove `.slice(0, 10)` to show all projects, or implement search/pagination if performance is a concern.
packages/importer/package.json (2)
5-5: Main entry point references source instead of build output. The `main` field points to `src/index.ts`, but consumers will need the compiled JavaScript output. It should point to `./dist/index.js`. Update to:
- "main": "src/index.ts",
+ "main": "./dist/index.js",
Also ensure the "types" field points to the compiled declaration file.
21-22: Review CSV library usage and align Ramda versions. Two concerns:
- Only `csv-parse` is listed (csv-parser is not present), so this may be fine
- `ramda` is at `^0.29.1` but `@types/ramda` is at `^0.31.1`; the version mismatch may cause type errors
Update ramda to match its types:
- "ramda": "^0.29.1",
+ "ramda": "^0.31.1",
packages/db/prisma/schema.prisma (1)
479-498: Inconsistency: `currentStep` is nullable in the schema but NOT NULL in the migration. Line 489 defines `currentStep` as `String?` (optional), but migration `20251028150123_fix_imports_table/migration.sql` sets it to NOT NULL. This mismatch will cause runtime issues.
Align schema and migration: either make it required in the schema (`currentStep String`) or remove the NOT NULL constraint from the migration. Since the worker code needs null to detect new imports, keeping it nullable is recommended.
packages/db/code-migrations/5-add-imports-table.ts (1)
74-85: Fix `__filename` usage in ESM context. The package is configured as ESM (`"type": "module"`), and `__filename` is undefined in ESM. Other migration files (2-accounts.ts, 3-init-ch.ts) already use the correct pattern. Import and use `fileURLToPath`:
+import { fileURLToPath } from 'node:url';
+
+const __filename = fileURLToPath(import.meta.url);
+
 export async function up() {
packages/importer/src/providers/mixpanel.test.ts (1)
39-127: Test masks mapping bugs in the provider (browser/os version). The test expects `browser_version: ''` and `os_version: undefined` even though the rawEvent contains `$browser_version: 135` and `$os: 'Android'`. This masks a bug in the provider's field mapping logic. Tighten the test assertions to expect the actual values:
- os_version: undefined,
- browser_version: '',
+ os_version: 'Android', // or appropriate parsed version
+ browser_version: '135',
Then fix the provider (packages/importer/src/providers/mixpanel.ts) to correctly map the `$browser_version` and `$os_version` fields, handling both snake_case and camelCase variants.
packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1)
1-3: Ensure gen_random_uuid() is available. The migration uses `gen_random_uuid()` without ensuring the pgcrypto extension is enabled, which will cause it to fail on databases where pgcrypto isn't already available. Add the extension check before the table creation:
+-- Enable required extensions
+CREATE EXTENSION IF NOT EXISTS pgcrypto;
+
 -- CreateTable
 CREATE TABLE "public"."imports" (
packages/importer/src/providers/mixpanel.ts (4)
128-147: Route parse errors through the provider logger. `console.warn` bypasses our structured logger and can disappear in production. Use `this.logger?.warn` (with the error and raw line) instead. Apply this diff:
- console.warn('Failed to parse Mixpanel event:', line);
+ this.logger?.warn('Failed to parse Mixpanel event', {
+   line,
+   error,
+ });
@@
- console.warn('Failed to parse final Mixpanel event:', buffer);
+ this.logger?.warn('Failed to parse final Mixpanel event', {
+   buffer,
+   error,
+ });
85-91: Remove service account from logs. Logging `serviceAccount` discloses a credential identifier. Drop it (or mask it) from the metadata. Apply this diff:
 this.logger?.info('Fetching events from Mixpanel', {
   url: `${url}?${params}`,
   from,
   to,
   projectId,
-  serviceAccount,
 });
201-205: Parse UA from `$user_agent`, not osVersion. `props.osVersion` is not a user-agent string, so every web event ends up with blank UA/device data. Prefer `$user_agent` (falling back to an empty string) before feeding `parseUserAgent`. Apply this diff:
- const userAgent = props.osVersion || '';
+ const userAgent =
+   typeof props.$user_agent === 'string' && props.$user_agent
+     ? props.$user_agent
+     : '';
273-279: Fix Mixpanel property names and browser version fallback. Mixpanel uses snake_case (`$os_version`, `$browser_version`), and the current ternary always stringifies `$browser_version` even when undefined, yielding `'undefined'`. Use the correct keys and direct fallbacks. Apply this diff:
- os: uaInfo.os || props.$os,
- os_version: uaInfo.osVersion || props.$osVersion,
- browser: uaInfo.browser || props.$browser,
- browser_version:
-   uaInfo.browserVersion || props.$browserVersion
-     ? String(props.$browser_version)
-     : '',
+ os: uaInfo.os || props.$os || '',
+ os_version: uaInfo.osVersion || props.$os_version || '',
+ browser: uaInfo.browser || props.$browser || '',
+ browser_version:
+   uaInfo.browserVersion || props.$browser_version || '',
apps/worker/src/jobs/import.ts (5)
35-38: Redact import configuration in logs. `record.config` can carry API keys and secrets; logging it verbatim leaks credentials. Log only non-sensitive identifiers such as `provider` (and explicitly redacted fields) before fanning out to child logs. Apply this diff:
- const jobLogger = logger.child({
-   importId,
-   config: record.config,
- });
+ const jobLogger = logger.child({
+   importId,
+   provider: record.config?.provider,
+ });
134-142: Do not log raw provider events. Invalid events are currently dumped to the logger, exposing full payloads (PII, tokens, etc.). Emit only minimal context (provider + reason) so we stay compliant while still highlighting the failure.
Apply this diff:
- if (
-   !providerInstance.validate(
-     // @ts-expect-error
-     rawEvent,
-   )
- ) {
-   jobLogger.warn('Skipping invalid event', { rawEvent });
+ if (!providerInstance.validate(rawEvent as never)) {
+   jobLogger.warn('Skipping invalid event', {
+     provider: providerInstance.provider,
+     reason: 'validation_failed',
+   });
    continue;
  }
128-198: Harden batch transforms to avoid import aborts. A single `transformEvent` exception currently tears down the job, and we over-count processed rows even when transforms fail. Buffer the batch as `unknown[]`, wrap each transform in try/catch, log failures, and increment `processedEvents` based on the successfully transformed rows only (both for full and tail batches). Apply this diff:
- const eventBatch: any = [];
+ const eventBatch: unknown[] = [];
@@
- const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-   (
-     // @ts-expect-error
-     event,
-   ) => providerInstance!.transformEvent(event),
- );
+ const transformedEvents: IClickhouseEvent[] = [];
+ for (const raw of eventBatch) {
+   try {
+     transformedEvents.push(
+       providerInstance.transformEvent(
+         raw as Parameters<typeof providerInstance.transformEvent>[0],
+       ),
+     );
+   } catch (error) {
+     jobLogger.warn('Skipping event on transform failure', {
+       provider: providerInstance.provider,
+       reason: error instanceof Error ? error.message : String(error),
+     });
+   }
+ }
+ if (transformedEvents.length === 0) {
+   eventBatch.length = 0;
+   continue;
+ }
  await insertImportBatch(transformedEvents, importId);
- processedEvents += eventBatch.length;
+ processedEvents += transformedEvents.length;
@@
- if (eventBatch.length > 0) {
-   const transformedEvents = eventBatch.map(
-     (
-       // @ts-expect-error
-       event,
-     ) => providerInstance!.transformEvent(event),
-   );
+ if (eventBatch.length > 0) {
+   const transformedEvents: IClickhouseEvent[] = [];
+   for (const raw of eventBatch) {
+     try {
+       transformedEvents.push(
+         providerInstance.transformEvent(
+           raw as Parameters<typeof providerInstance.transformEvent>[0],
+         ),
+       );
+     } catch (error) {
+       jobLogger.warn('Skipping event on transform failure', {
+         provider: providerInstance.provider,
+         reason: error instanceof Error ? error.message : String(error),
+       });
+     }
+   }
+   if (transformedEvents.length === 0) {
+     eventBatch.length = 0;
+     continue;
+   }
    await insertImportBatch(transformedEvents, importId);
-   processedEvents += eventBatch.length;
+   processedEvents += transformedEvents.length;
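The per-event try/catch pattern above generalizes to a small helper; this is an illustrative sketch, not the worker's actual API:

```typescript
// Transform each item, keep the successes, and surface failures through a
// caller-supplied hook so one bad event cannot abort the whole batch.
function safeTransform<T, R>(
  items: T[],
  transform: (item: T) => R,
  onError: (reason: string) => void,
): R[] {
  const out: R[] = [];
  for (const item of items) {
    try {
      out.push(transform(item));
    } catch (error) {
      onError(error instanceof Error ? error.message : String(error));
    }
  }
  return out;
}
```

Counting `out.length` (not `items.length`) is what keeps `processedEvents` honest.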
194-197: Include counters in the final loading update. The tail-batch status update omits `totalEvents` and `processedEvents`, leaving Postgres stale for the UI. Pass the counters just like the earlier batches. Apply this diff:
  await updateImportStatus(jobLogger, job, importId, {
    step: 'loading',
    batch: createdAt,
+   totalEvents,
+   processedEvents,
  });
250-269: Persist final totals after completion. After `markImportComplete`, Postgres still holds outdated `processedEvents`/`totalEvents`. Fetch the final ClickHouse progress and write those numbers back (either via `updateImportStatus` or a direct `db.import.update`) so the UI reflects 100% completion. Apply this diff:
- await updateImportStatus(jobLogger, job, importId, {
-   step: 'completed',
- });
+ await updateImportStatus(jobLogger, job, importId, {
+   step: 'completed',
+   totalEvents,
+   processedEvents,
+ });
@@
- const finalProgress = await getImportProgress(importId);
+ const finalProgress = await getImportProgress(importId);
+ await db.import.update({
+   where: { id: importId },
+   data: {
+     totalEvents: finalProgress.totalEvents,
+     processedEvents: finalProgress.insertedEvents,
+   },
+ });
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)
265-275: Redact sensitive config before displaying
`imp.config` frequently contains API keys and secrets; dumping it straight into the tooltip leaks credentials. Sanitize (or redact) known secret fields before stringifying, and/or restrict the tooltip to privileged roles. Apply this diff (or similar):
- content={
-   <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
-     {JSON.stringify(imp.config, null, 2)}
-   </pre>
- }
+ content={
+   <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
+     {JSON.stringify(redactImportConfig(imp.config), null, 2)}
+   </pre>
+ }
Add a helper such as:
function redactImportConfig(config: Record<string, unknown>) {
  const masked = { ...config };
  for (const key of [
    'apiKey',
    'apiSecret',
    'serviceAccount',
    'serviceSecret',
    'token',
    'secret',
  ]) {
    if (masked[key]) masked[key] = '***';
  }
  return masked;
}
apps/start/src/modals/add-import.tsx (1)
246-248: Use a generic failure message. Surfacing `error.message` can leak upstream credentials or debug details. Replace it with a neutral user-facing string and keep the sensitive message server-side. Apply this diff:
- onError: (error) => {
-   toast.error('Import failed', {
-     description: error.message,
-   });
+ onError: () => {
+   toast.error('Import failed', {
+     description: 'We could not queue the import. Please try again or contact support.',
+   });
53-66: EventProperties allows nested objects, but ClickHouse `properties` is primitives-only. `EventProperties` permits `Record<string, unknown>` (line 61) and defines `__query` as a nested object (line 62). However, `IClickhouseEvent.properties` expects `Record<string, string | number | boolean | undefined | null>`. Storing objects will break ingestion unless stringified. Consider:
- Constraining values to primitives: `string | number | boolean | null | undefined`
- Or requiring JSON stringification for nested data (e.g., `__queryJson: string`)
Apply this diff to constrain to primitives:
 export interface EventProperties {
   [key: string]:
     | string
     | number
     | boolean
     | null
-    | undefined
-    | Record<string, unknown>;
-  __query?: Record<string, unknown>;
+    | undefined;
   __title?: string;
   __screen?: string;
   __language?: string;
 }
packages/db/src/services/import.service.ts (5)
1-12: Missing import for `crypto.randomUUID()`. Lines 272 and 309 call `crypto.randomUUID()`, but `randomUUID` is not imported from `node:crypto`. Add the import:
 import type { ILogger } from '@openpanel/logger';
+import { randomUUID } from 'node:crypto';
 import sqlstring from 'sqlstring';
Then replace usage:
- id: crypto.randomUUID(),
+ id: randomUUID(),
262-288: Risk of timestamp collisions with existing events. Adjusting session boundaries by up to a second (line 288: -1000 ms, line 325: +500 ms) could create timestamp conflicts if real events exist at those exact timestamps, leading to non-deterministic ordering. Consider:
- Using smaller offsets (±1 ms) to reduce collision probability
- Adding an `is_synthetic` flag and using `(created_at, is_synthetic)` for ordering
- Using ClickHouse's native session reconstruction capabilities
- created_at: adjustTimestamp(session.first_timestamp, -1000),
+ created_at: adjustTimestamp(session.first_timestamp, -1),
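Assuming `adjustTimestamp` shifts an ISO timestamp by a millisecond offset (the real signature in the service may differ), the ±1 ms variant behaves like this sketch:

```typescript
// Shift an ISO-8601 timestamp by a small millisecond offset. A ±1 ms shift
// keeps synthetic session boundary events adjacent to the real events while
// avoiding collisions on whole-second timestamps.
function adjustTimestamp(iso: string, offsetMs: number): string {
  return new Date(new Date(iso).getTime() + offsetMs).toISOString();
}
```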
553-554: Inconsistent parameter handling. Lines 553-554 use `sqlstring.escape()` for inline substitution, while the rest of the file uses ClickHouse's `query_params` mechanism (e.g., lines 122, 449). This inconsistency reduces maintainability. Refactor to use query_params:
 WHERE
-  e.import_id = ${sqlstring.escape(importId)}
-  AND toDate(e.created_at) = ${sqlstring.escape(from)}
+  e.import_id = {importId:String}
+  AND toDate(e.created_at) = {from:String}
   AND e.session_id != ''
 GROUP BY e.session_id
 `;
 await ch.command({
   query: sessionsInsertQuery,
+  query_params: { importId, from },
   clickhouse_settings: {
601-613: Use deterministic status aggregation instead of `any(import_status)`. `any(import_status)` (line 607) returns an arbitrary status from the group, making results non-deterministic when events have mixed statuses (e.g., some 'pending', some 'processed'). Use deterministic aggregation:
 SELECT
   import_id,
   COUNT(*) as total_events,
   COUNTIf(import_status = 'pending') as pending_events,
   COUNTIf(import_status = 'processed') as processed_events,
-  any(import_status) as status
+  CASE
+    WHEN COUNTIf(import_status = 'failed') > 0 THEN 'failed'
+    WHEN COUNTIf(import_status = 'processed') = COUNT(*) THEN 'processed'
+    WHEN COUNTIf(import_status = 'pending') > 0 THEN 'processing'
+    ELSE 'pending'
+  END as status
 FROM ${TABLE_NAMES.events_imports}
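The precedence encoded in that CASE expression can be mirrored in TypeScript to make the intent explicit; this is an illustrative sketch, not code from the service:

```typescript
// Deterministic status: any failure wins, then full completion, then
// in-progress, falling back to 'pending'. Mirrors the SQL CASE above.
type ImportStatus = 'pending' | 'processing' | 'processed' | 'failed';

function deriveStatus(counts: {
  total: number;
  pending: number;
  processed: number;
  failed: number;
}): ImportStatus {
  if (counts.failed > 0) return 'failed';
  if (counts.total > 0 && counts.processed === counts.total) return 'processed';
  if (counts.pending > 0) return 'processing';
  return 'pending';
}
```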
647-652: Validate the status value instead of using a type assertion. The type assertion on line 651 bypasses type safety. If ClickHouse returns an unexpected status value, the type system won't catch it. Add runtime validation:
+ const validStatuses = ['pending', 'processing', 'processed', 'failed'] as const;
+ const status = validStatuses.includes(row.status as any)
+   ? (row.status as typeof validStatuses[number])
+   : 'pending';
+
  return {
    importId,
    totalEvents: row.total_events,
    insertedEvents: row.processed_events,
-   status: row.status as 'pending' | 'processing' | 'processed' | 'failed',
+   status,
  };
🧹 Nitpick comments (5)
packages/common/src/object.test.ts (1)
5-32: Strong test coverage for main scenarios. The test thoroughly validates the core functionality including primitives, arrays, nested objects, and JSON string parsing. The expectations correctly reflect the implementation's behavior of stringifying values, flattening structures, and filtering out null/undefined/empty strings.
Consider adding test cases for additional edge cases:
- Boolean values (`true`/`false`)
- Falsy number (`0`)
- Empty arrays and objects
- Arrays containing null/undefined elements
These would further strengthen the test suite but are not critical for the current functionality.
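For reference, a minimal sketch of the behavior these tests describe (flatten nested objects into dot-keys, stringify values, drop null/undefined/empty strings); the real helper in `packages/common/src/object.ts` may differ in name and details:

```typescript
// Illustrative flattener: nested objects become dot-separated keys, values
// are stringified, and null/undefined/'' entries are dropped.
function flatten(
  obj: Record<string, unknown>,
  prefix = '',
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value === null || value === undefined || value === '') continue;
    if (typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, flatten(value as Record<string, unknown>, path));
    } else {
      out[path] = String(value);
    }
  }
  return out;
}
```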
packages/validation/src/index.ts (1)
461-471: Consider date validation for temporal fields. The `from` and `to` fields likely represent date ranges but are only validated as non-empty strings. Invalid date formats could cause runtime errors in the import processor. Consider adding date format validation:
 export const zMixpanelImportConfig = z.object({
   provider: z.literal('mixpanel'),
   type: z.literal('api'),
   serviceAccount: z.string().min(1),
   serviceSecret: z.string().min(1),
   projectId: z.string().min(1),
-  from: z.string().min(1),
-  to: z.string().min(1),
+  from: z.string().datetime(),
+  to: z.string().datetime(),
   mapScreenViewProperty: z.string().optional(),
 });
Note: Use `.datetime()` for ISO 8601 timestamps or `.date()` for YYYY-MM-DD format, depending on the expected input format.
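For clarity on what the two refinements roughly accept, here is a plain-TypeScript sketch (zod's exact grammars differ slightly; by default `.datetime()` requires a UTC `Z` suffix, so treat the regexes below as an approximation):

```typescript
// Approximation of zod's string refinements:
// .date()      -> YYYY-MM-DD
// .datetime()  -> full ISO 8601 timestamp ending in 'Z'
function isDateOnly(s: string): boolean {
  return /^\d{4}-\d{2}-\d{2}$/.test(s) && !Number.isNaN(Date.parse(s));
}

function isIsoDateTime(s: string): boolean {
  return (
    /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/.test(s) &&
    !Number.isNaN(Date.parse(s))
  );
}
```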
11-11: Consider using caret range for consistency.The change from
"^5.7.0"to"5.8.2"removes semver flexibility and pins to an exact version. This is inconsistent with dependency management patterns elsewhere in the monorepo and prevents automatic patch updates.Consider reverting to a caret range for consistency:
- "ioredis": "5.8.2" + "ioredis": "^5.8.2"packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1)
1-3: Migration creates type churn.The migration adds
currentBatchasINTEGER NOT NULL DEFAULT 0, but a subsequent migration (20251028152531_imports_again) changes it toTEXTnullable. While both migrations are technically valid, this indicates design iteration. For future migrations, consider finalizing type decisions before creating schema changes to reduce churn.packages/db/src/services/import.service.ts (1)
721-784: Consider coordinating Postgres and ClickHouse state updates.
updateImportStatusupdates Postgres only, whilemarkImportComplete(lines 574-593) updates ClickHouse only. If one system's update fails, import state becomes inconsistent across the two databases.Consider one of these approaches:
- Update Postgres first (as source of truth), then ClickHouse, with retry logic
- Use an idempotent reconciliation job that periodically syncs state
- At minimum, add try-catch with logging for manual reconciliation
export async function updateImportStatus( jobLogger: ILogger, job: { updateProgress: (progress: Record<string, any>) => void }, importId: string, options: UpdateImportStatusOptions, ): Promise<void> { + try { const data: Prisma.ImportUpdateInput = {}; // ... build data object ... await job.updateProgress(data); await db.import.update({ where: { id: importId }, data, }); + } catch (error) { + jobLogger.error('Failed to update import status', { importId, error }); + throw error; + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (76)
- apps/api/package.json (1 hunks)
- apps/start/Dockerfile (2 hunks)
- apps/start/package.json (2 hunks)
- apps/start/src/components/integrations/integration-card.tsx (1 hunks)
- apps/start/src/components/project-mapper.tsx (1 hunks)
- apps/start/src/components/ui/badge.tsx (1 hunks)
- apps/start/src/modals/add-import.tsx (1 hunks)
- apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
- apps/start/src/modals/index.tsx (2 hunks)
- apps/start/src/routeTree.gen.ts (11 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
- apps/start/src/utils/math.ts (1 hunks)
- apps/start/src/utils/slug.ts (1 hunks)
- apps/worker/Dockerfile (2 hunks)
- apps/worker/package.json (3 hunks)
- apps/worker/src/boot-workers.ts (4 hunks)
- apps/worker/src/index.ts (2 hunks)
- apps/worker/src/jobs/cron.delete-projects.ts (1 hunks)
- apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
- apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
- apps/worker/src/jobs/import.ts (1 hunks)
- package.json (1 hunks)
- packages/cli/package.json (0 hunks)
- packages/cli/src/cli.ts (0 hunks)
- packages/cli/src/importer/importer.ts (0 hunks)
- packages/cli/src/importer/index.ts (0 hunks)
- packages/cli/tsconfig.json (0 hunks)
- packages/cli/tsup.config.ts (0 hunks)
- packages/common/package.json (1 hunks)
- packages/common/scripts/get-referrers.ts (1 hunks)
- packages/common/server/index.ts (1 hunks)
- packages/common/server/parse-referrer.test.ts (5 hunks)
- packages/common/server/parse-referrer.ts (4 hunks)
- packages/common/server/parser-user-agent.ts (2 hunks)
- packages/common/src/object.test.ts (1 hunks)
- packages/common/src/object.ts (2 hunks)
- packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
- packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
- packages/db/index.ts (2 hunks)
- packages/db/package.json (1 hunks)
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
- packages/db/prisma/schema.prisma (2 hunks)
- packages/db/src/buffers/base-buffer.ts (1 hunks)
- packages/db/src/buffers/event-buffer.test.ts (8 hunks)
- packages/db/src/clickhouse/client.ts (4 hunks)
- packages/db/src/clickhouse/csv.ts (1 hunks)
- packages/db/src/clickhouse/migration.ts (1 hunks)
- packages/db/src/services/event.service.ts (1 hunks)
- packages/db/src/services/import.service.ts (1 hunks)
- packages/db/src/services/session.service.ts (1 hunks)
- packages/db/src/types.ts (2 hunks)
- packages/importer/package.json (1 hunks)
- packages/importer/src/base-provider.ts (1 hunks)
- packages/importer/src/index.ts (1 hunks)
- packages/importer/src/providers/metadata.ts (1 hunks)
- packages/importer/src/providers/mixpanel.test.ts (1 hunks)
- packages/importer/src/providers/mixpanel.ts (1 hunks)
- packages/importer/src/providers/umami.ts (1 hunks)
- packages/importer/src/types.ts (1 hunks)
- packages/importer/tsconfig.json (1 hunks)
- packages/importer/vitest.config.ts (1 hunks)
- packages/queue/src/queues.ts (1 hunks)
- packages/redis/package.json (1 hunks)
- packages/trpc/package.json (1 hunks)
- packages/trpc/src/root.ts (2 hunks)
- packages/trpc/src/routers/auth.ts (1 hunks)
- packages/trpc/src/routers/import.ts (1 hunks)
- packages/trpc/src/routers/organization.ts (1 hunks)
- packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/package.json
- packages/cli/tsup.config.ts
- packages/cli/src/cli.ts
- packages/cli/src/importer/importer.ts
- packages/cli/src/importer/index.ts
- packages/cli/tsconfig.json
🚧 Files skipped from review as they are similar to previous changes (33)
- packages/common/server/index.ts
- apps/start/src/modals/date-ranger-picker.tsx
- apps/worker/src/index.ts
- packages/common/server/parse-referrer.ts
- packages/trpc/src/root.ts
- packages/db/src/types.ts
- apps/start/src/modals/index.tsx
- packages/common/src/object.ts
- apps/worker/Dockerfile
- packages/db/src/buffers/event-buffer.test.ts
- packages/db/index.ts
- packages/trpc/src/routers/import.ts
- apps/start/src/components/integrations/integration-card.tsx
- packages/trpc/src/routers/organization.ts
- apps/start/src/utils/math.ts
- packages/importer/src/providers/metadata.ts
- packages/common/scripts/get-referrers.ts
- apps/start/package.json
- apps/start/src/components/ui/badge.tsx
- package.json
- packages/importer/src/providers/umami.ts
- apps/worker/package.json
- packages/importer/vitest.config.ts
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql
- packages/db/src/services/event.service.ts
- apps/start/Dockerfile
- packages/importer/src/index.ts
- apps/worker/src/boot-workers.ts
- packages/queue/src/queues.ts
- packages/importer/tsconfig.json
- packages/trpc/src/routers/auth.ts
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
- packages/db/src/services/session.service.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/modals/add-import.tsx
apps/start/src/utils/slug.ts
apps/start/src/routeTree.gen.ts
apps/start/src/components/project-mapper.tsx
🧠 Learnings (9)
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, use the helper functions in ./packages/db/src/clickhouse/query-functions.ts rather than crafting raw queries
Applied to files:
packages/db/src/clickhouse/csv.ts
packages/db/code-migrations/5-add-imports-table.ts
packages/db/src/clickhouse/client.ts
packages/db/src/services/import.service.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*.tsx : Define React route files with createFileRoute and let the bundler/CLI manage the path string
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/__root.tsx : Declare the root route in routes/__root.tsx using createRootRoute or createRootRouteWithContext
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/{index,*.index}.tsx : For index routes, use index.tsx or *.index.tsx files and ensure the createFileRoute path uses a trailing slash (e.g., '/posts/')
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*$*.tsx : Name dynamic route files with $param (e.g., posts.$postId.tsx) to capture path parameters
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/$.tsx : Use $.tsx for splat (catch-all) routes to capture the remaining path into params._splat
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*_.tsx : Use a trailing underscore in a segment filename to create a non-nested (un-nested) route (e.g., posts_.tsx)
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/_*.tsx : Use leading underscore filenames for pathless layout routes that wrap children without matching a URL segment
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, always construct them using the custom query builder in ./packages/db/src/clickhouse/query-builder.ts
Applied to files:
packages/db/src/clickhouse/client.ts
packages/db/src/services/import.service.ts
🧬 Code graph analysis (12)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
getIsCluster (17-24)
packages/db/src/clickhouse/migration.ts (3)
createTable (68-116), modifyTTL (118-132), runClickhouseMigrationCommands (338-470)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
BaseRawEvent (40-42), ImportJobMetadata (69-73), ErrorContext (45-51)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)
packages/importer/src/providers/mixpanel.test.ts (1)
packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider(18-452)
packages/common/server/parse-referrer.test.ts (1)
packages/common/server/parse-referrer.ts (2)
parseReferrer (17-26), getReferrerWithQuery (28-66)
apps/worker/src/jobs/import.ts (8)
packages/queue/src/queues.ts (1)
ImportQueuePayload (168-173)
packages/db/src/prisma-client.ts (1)
db (270-270)
packages/db/src/services/import.service.ts (10)
ImportSteps (719-719), updateImportStatus (721-784), getImportDateBounds (658-682), insertImportBatch (30-87), generateSessionIds (94-130), createSessionsStartEndEvents (138-348), moveImportsToProduction (354-458), backfillSessionsToProduction (460-569), markImportComplete (574-593), getImportProgress (598-653)
packages/db/src/clickhouse/client.ts (2)
formatClickhouseDate (237-248), error (41-46)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent (68-102)
packages/logger/index.ts (1)
ILogger (6-6)
packages/importer/src/providers/umami.ts (1)
UmamiProvider (59-382)
packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider(18-452)
packages/common/src/object.test.ts (1)
packages/common/src/object.ts (1)
toDots(16-67)
packages/importer/src/types.ts (2)
packages/db/src/services/event.service.ts (1)
IServiceImportedEventPayload (144-147)
packages/db/src/services/import.service.ts (2)
ImportProgress (20-25), ImportStageResult (14-18)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
Route (12-34)
packages/importer/src/providers/metadata.ts (1)
IMPORT_PROVIDERS (13-30)
apps/public/components/ui/tooltip.tsx (1)
Tooltiper (36-51)
apps/start/src/components/integrations/integration-card.tsx (3)
IntegrationCard (91-112), IntegrationCardLogoImage (50-69), IntegrationCardFooter (3-15)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
zUmamiImportConfig (452-454), zMixpanelImportConfig (461-470), IUmamiImportConfig (456-456), IMixpanelImportConfig (471-471), IImportConfig (473-476)
apps/start/src/components/project-mapper.tsx (1)
ProjectMapper (33-116)
apps/public/lib/utils.ts (1)
cn(4-6)
apps/start/src/components/project-mapper.tsx (2)
packages/db/src/services/project.service.ts (1)
IServiceProject (7-7)
apps/start/src/components/ui/select.tsx (5)
Select (173-173), SelectTrigger (181-181), SelectValue (182-182), SelectContent (174-174), SelectItem (176-176)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
IMixpanelImportConfig (471-471)
packages/logger/index.ts (1)
ILogger (6-6)
packages/db/src/services/event.service.ts (2)
IClickhouseEvent (68-102), query (701-847)
packages/common/src/url.ts (2)
parsePath (11-49), isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
parseReferrer (17-26), getReferrerWithQuery (28-66)
packages/common/server/parser-user-agent.ts (1)
parseUserAgent (72-115)
packages/common/src/object.ts (2)
toDots (16-67), strip (79-79)
packages/db/src/services/import.service.ts (5)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent (68-102)
packages/db/src/clickhouse/csv.ts (2)
csvEscapeField (42-53), csvEscapeJson (15-38)
packages/db/src/clickhouse/client.ts (5)
formatClickhouseDate (237-248), chInsertCSV (202-228), TABLE_NAMES (49-61), ch (132-162), convertClickhouseDateToJs (267-269)
packages/logger/index.ts (1)
packages/db/src/prisma-client.ts (1)
db(270-270)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-worker
- GitHub Check: build-and-push-api
- GitHub Check: build-and-push-dashboard
🔇 Additional comments (30)
apps/api/package.json (1)
67-67: Verify the rationale for pinning tsdown to an exact version. The caret operator has been removed from tsdown, changing it from
"^0.14.2"to"0.14.2"(exact version pin). The AI summary indicates this change is coordinated across the monorepo. Ensure this is intentional for reproducible builds or a known stability requirement with this specific version.Is this pinning part of a broader monorepo strategy, or is there a specific reason (e.g., bug fix, stability) for locking to 0.14.2? You may want to verify consistency across other workspace packages to confirm alignment.
packages/db/src/buffers/base-buffer.ts (1)
1-1: LGTM! Import path consolidation. The import path change from
`@openpanel/common/server/id` to `@openpanel/common/server` is part of a coordinated refactor to centralize server utilities, with similar updates across auth.ts and organization.ts.
packages/common/src/object.test.ts (2)
1-2: LGTM!The imports are correct and appropriate for a vitest test file.
34-47: Excellent validation of malformed JSON handling.This test effectively validates that the function correctly:
- Parses valid JSON strings and expands them into dot notation
- Skips strings that appear to be JSON but are malformed
- Preserves regular strings as-is
The test expectations align well with the implementation's defensive approach to JSON parsing.
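The flattening-plus-defensive-parsing behavior validated above can be sketched as follows (a hypothetical simplification; the real `toDots` in packages/common/src/object.ts handles more cases):

```typescript
// Sketch: flatten nested objects into dot notation, expanding strings that
// parse as JSON and keeping malformed JSON-looking strings as-is.
function toDots(
  obj: Record<string, unknown>,
  prefix = '',
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === 'string' && value.trim().startsWith('{')) {
      try {
        // Expand valid JSON strings into nested dot paths.
        Object.assign(
          out,
          toDots(JSON.parse(value) as Record<string, unknown>, path),
        );
        continue;
      } catch {
        // Malformed JSON: fall through and keep the plain string.
      }
    }
    if (value !== null && typeof value === 'object') {
      Object.assign(out, toDots(value as Record<string, unknown>, path));
    } else {
      out[path] = String(value);
    }
  }
  return out;
}
```

This mirrors the three cases the test covers: valid JSON strings expand, malformed ones stay untouched, and plain values stringify.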
packages/common/package.json (1)
6-9: Public exports field enables dual entry points. The new
`exports` field correctly exposes the main entry point and a server-specific export, aligning with the referrer utilities being made part of the public API surface (parse-referrer, id).
439-459: Well-structured file-based import configs.The
`createFileImportConfig` helper elegantly reduces duplication while maintaining type safety with provider literals. The `zProjectMapper` schema and its integration into `zUmamiImportConfig` provides clear validation for project mapping requirements.
473-476: Clean union type export.The
`IImportConfig` union type appropriately combines the three provider-specific config types for type safety in consuming code.
19-19: LGTM!The addition of the
@openpanel/queueworkspace dependency aligns with the new import queue functionality introduced in this PR.packages/db/src/clickhouse/csv.ts (3)
5-13: LGTM! The escape order is correct (backslashes first), and the function properly handles all necessary control characters for ClickHouse Map format.
15-38: LGTM! The function correctly handles null values, normalizes to strings for Map(String,String), and properly delegates to
`csvEscapeField` for final CSV escaping.
42-53: LGTM! The function correctly implements CSV RFC 4180 standard by escaping double quotes through doubling and wrapping fields containing special characters.
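For illustration, RFC 4180-style field escaping can be sketched like this (a hypothetical minimal version, not the repo's exact `csvEscapeField`):

```typescript
// Sketch: double any embedded quotes, then wrap the field in quotes when it
// contains a delimiter, quote, or newline (RFC 4180).
function csvEscapeField(field: string): string {
  if (/[",\n\r]/.test(field)) {
    return `"${field.replace(/"/g, '""')}"`;
  }
  return field;
}
```

Plain fields pass through unchanged, so escaping is only paid for fields that actually need it.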
packages/db/src/clickhouse/client.ts (4)
1-1: LGTM!The
`Readable` import is correctly used in the new `chInsertCSV` function, and the `events_imports` table addition aligns with the import system introduced in this PR. Also applies to: 60-60
65-65: Verify timeout increases are appropriate for the import workload.The request timeout increased 5x (60s → 300s) and idle socket TTL increased 7.5x (8s → 60s). While these increases support long-running import operations, consider monitoring these values in production to ensure they don't mask underlying performance issues or cause resource exhaustion.
Also applies to: 68-68
137-157: LGTM!The retry wrapper and comprehensive ClickHouse settings properly support large batch inserts. The settings spread order (line 154) correctly allows user settings to override defaults, and the progress headers prevent proxy timeouts during long-running operations.
202-228: LGTM!The CSV streaming implementation is correct, and the format settings properly align with the CSV escaping utilities (double quotes for CSV fields, single quotes for Map format internally). Error handling and logging are appropriate.
apps/worker/src/jobs/cron.delete-projects.ts (1)
57-57: LGTM!The type change from numeric
`0` to string `'0'` aligns with ClickHouse setting conventions and likely reflects updated type definitions in the newer @clickhouse/client version.
4-31: LGTM! The abstract class design is clean with appropriate type constraints. Using
`AsyncGenerator` for `parseSource` enables memory-efficient streaming for large imports, which is essential for this use case.
36-68: LGTM! The optional hooks are well-designed with sensible defaults. The
`shouldGenerateSessionIds()` default of `false` appropriately assumes providers handle session ID generation during streaming.
78-120: LGTM! The date chunking implementation correctly splits date ranges while handling edge cases (same start/end date) and ensuring chunks don't exceed the end date. The break condition prevents infinite loops.
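The chunking behavior described above can be sketched roughly like this (function name and chunk size are illustrative, not the provider's actual API):

```typescript
// Sketch: split [start, end] into inclusive chunks of `days` days, clamping
// the last chunk to the end date and breaking once the end is reached.
function chunkDateRange(
  start: Date,
  end: Date,
  days: number,
): Array<{ from: Date; to: Date }> {
  const chunks: Array<{ from: Date; to: Date }> = [];
  let cursor = new Date(start);
  while (cursor <= end) {
    const next = new Date(cursor);
    next.setUTCDate(next.getUTCDate() + days - 1);
    // Clamp so a chunk never exceeds the end date.
    const to = next > end ? new Date(end) : next;
    chunks.push({ from: new Date(cursor), to });
    if (to.getTime() === end.getTime()) break; // prevents an infinite loop
    cursor = new Date(to);
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return chunks;
}
```

When start equals end, the clamp plus the break condition yields a single one-day chunk, matching the edge case the review calls out.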
packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1)
1-2: LGTM! The migration correctly adds an optional
`statusMessage` column to support detailed import status tracking, aligning with the PR's import system objectives.
packages/db/package.json (1)
16-16: Verify compatibility with @clickhouse/client 1.12.1. The upgrade from
`^1.2.0` to `^1.12.1` spans 10 minor versions. While maintaining semver compatibility, ensure there are no behavioral changes affecting existing ClickHouse operations.
apps/worker/src/jobs/events.incoming-events.test.ts (1)
102-102: LGTM! Test expectations align with updated referrer parsing behavior. The expectations correctly reflect that
`referrerType` now returns an empty string (`''`) instead of `'unknown'` for non-derived referrers, consistent with the changes in packages/common/server/parse-referrer.ts.
Also applies to: 210-210
packages/db/src/clickhouse/migration.ts (1)
118-132: LGTM! Clean implementation following established patterns. The
`modifyTTL` function correctly handles both clustered and non-clustered scenarios, using the `replicated()` helper consistently with other migration utilities in this file. The implementation is type-safe and follows the existing code structure.
8-8: LGTM! Comprehensive test coverage for the referrer type change. All test expectations have been consistently updated to reflect that unknown or empty referrer types now return
`''` instead of `'unknown'`. The changes cover both `parseReferrer` and `getReferrerWithQuery` functions across various scenarios (undefined, empty, invalid, and unknown inputs). Also applies to: 14-14, 44-44, 52-52, 67-67, 91-91, 99-99, 113-113
packages/db/code-migrations/5-add-imports-table.sql (1)
1-43: LGTM! Well-designed import staging table. The
`events_imports` table schema is well-structured:
- Mirrors the main events table structure with additional import tracking fields (
`import_id`, `import_status`, `imported_at_meta`)
- Partitioning by month on
imported_at_metaaligns with the 7-day TTL for efficient partition dropsORDER BY (import_id, created_at)supports efficient queries by import batchThe separate TTL
ALTERstatement follows correct ClickHouse patterns.packages/common/server/parser-user-agent.ts (2)
71-71: Good API expansion with exported type.The
UserAgentInfotype export provides a clean public API for consumers to type the return value ofparseUserAgent, using TypeScript'sReturnTypeutility type.
84-112: LGTM! More defensive override handling.The updated override logic adds explicit type guards and truthiness checks, preventing empty strings or non-string values from being used as overrides. This is more defensive than the previous
`??` operator approach and ensures only valid, non-empty string overrides are applied. The pattern is consistently applied across all fields.
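The guarded-override pattern described here can be sketched as follows (helper name is hypothetical; the real code applies the check inline per field):

```typescript
// Sketch: only apply an override when it is a non-empty string; otherwise
// keep the value derived from parsing. This is stricter than `??`, which
// would still accept empty strings.
function applyOverride(parsed: string, override: unknown): string {
  return typeof override === 'string' && override.length > 0
    ? override
    : parsed;
}
```

The key difference from `parsed ?? override` style logic is that empty strings and non-string values never win over the parsed result.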
65-72: TTL implementation via modifyTTL helper is correct.The code uses the
`modifyTTL` helper function which correctly targets the replicated table in cluster mode (see packages/db/src/clickhouse/migration.ts lines 117-131). This addresses the concern about TTL on distributed tables. The helper properly constructs:
- `ALTER TABLE ${replicated(tableName)} ON CLUSTER '{cluster}' MODIFY TTL ${ttl}` for clustered
- `ALTER TABLE ${tableName} MODIFY TTL ${ttl}` for non-clustered
apps/worker/src/jobs/events.incoming-event.ts (2)
8-23: Import reorganization aligns with common package refactoring.The imports have been updated to use public exports from
`@openpanel/common/server`, and Job is now correctly imported as a type-only import. The separate import of `omit` from ramda alongside the namespace import is acceptable.
120-122: Referrer logic improvements are correct.The changes improve referrer handling:
- Line 120:
`baseEvent.referrer` now uses only `referrer?.url` (removed `utmReferrer?.url`), which is correct since UTM parameters shouldn't override the actual referrer URL
- Line 122:
`referrerType` now prioritizes `referrer?.type` before `utmReferrer?.type`, which properly favors detected referrer types (like 'search') over UTM-based types
```diff
@@ -1 +1 @@
-export * from '@openpanel/common/src/slug';
+export * from '@openpanel/common';
```
🧩 Analysis chain
Verify export scope: file exports entire common package instead of slug utilities.
This file is named slug.ts but now exports everything from @openpanel/common, not just slug-related functions. This introduces namespace pollution and breaks the single-responsibility principle for utility modules.
Consider using named exports to maintain the focused scope:
```diff
-export * from '@openpanel/common';
+export { slugify, /* other slug-specific exports */ } from '@openpanel/common';
```
Run the following script to verify what slug-related symbols should be exported:
🏁 Script executed:
#!/bin/bash
# Find usages of this slug utility module to determine required exports
rg -n --type=ts --type=tsx "from ['\"]@/utils/slug['\"]" -A 1 -B 1
# Find the original slug exports from common package
fd -e ts -e tsx --full-path "packages/common" | xargs rg -n "export.*slug" -i
Length of output: 157
🏁 Script executed:
#!/bin/bash
# Check all exports in the common package's index
echo "=== Contents of packages/common/index.ts ==="
cat packages/common/index.ts
echo ""
echo "=== Search for any imports of apps/start/src/utils/slug.ts ==="
rg -n --type=ts "utils/slug" -C 2
echo ""
echo "=== Check if apps/start/src/utils/slug.ts file exists and its content ==="
cat apps/start/src/utils/slug.ts 2>/dev/null || echo "File not found"
Length of output: 921
🏁 Script executed:
#!/bin/bash
# Check full content of packages/common/src/slug.ts to see all exports
echo "=== Full content of packages/common/src/slug.ts ==="
cat packages/common/src/slug.ts
echo ""
echo "=== Check if there are any other references to slug utilities ==="
rg -n "slug" apps/start/src/utils/ --type=tsLength of output: 616
Use named export to maintain module scope: export { slug } from '@openpanel/common';
The file is named slug.ts but re-exports the entire @openpanel/common package (11 modules), when it should only export the slug function. While named imports in consuming files (import { slug }) will tree-shake correctly, this creates a confusing public interface that doesn't match the module's name and breaks the expectation of focused utility exports.
```diff
-export * from '@openpanel/common';
+export { slug } from '@openpanel/common';
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-export * from '@openpanel/common';
+export { slug } from '@openpanel/common';
```
🤖 Prompt for AI Agents
In apps/start/src/utils/slug.ts around line 1, the file currently re-exports the
entire @openpanel/common package which exposes many unrelated modules; change it
to a focused named re-export so this module only exposes the slug utility.
Replace the bulk export with a named export that re-exports just slug from
@openpanel/common (maintain TypeScript types and default/namespace
considerations), run a quick type-check/build to ensure no other modules rely on
the previous broad export surface.
```ts
export interface SessionInfo {
  id: string;
  lastTimestamp: number;
  lastEvent: IServiceImportedEventPayload;
}

export interface ImportProgress {
  totalEvents: number;
  processedEvents: number;
  currentBatch: number;
  totalBatches: number;
}
```
🛠️ Refactor suggestion | 🟠 Major
Remove duplicate type definitions.
ImportProgress (lines 20-25) and ImportStageResult (lines 75-80) are redefined in packages/db/src/services/import.service.ts. This duplication risks type drift if one definition is updated without the other.
The service file should import these types from this module:
In packages/db/src/services/import.service.ts, remove the duplicate definitions and add:
```ts
import type { ImportProgress, ImportStageResult } from '@openpanel/importer';
```
Also applies to: 75-80
🤖 Prompt for AI Agents
In packages/importer/src/types.ts around lines 14-25 and 75-80, ImportProgress
and ImportStageResult are defined here but duplicated in
packages/db/src/services/import.service.ts; remove the duplicate type
definitions from packages/db/src/services/import.service.ts and replace them
with a type-only import from this module: add "import type { ImportProgress,
ImportStageResult } from '@openpanel/importer';" at the top of the service file
and update any local references to use the imported types.
Actionable comments posted: 1
♻️ Duplicate comments (4)
apps/worker/src/jobs/import.ts (4)
44-47: Logger context leaks config secrets. The full
`record.config` object may contain sensitive credentials (API keys, database passwords, access tokens). Logging it exposes secrets in logs. Apply this diff to log only safe metadata:
```diff
 const jobLogger = logger.child({
   importId,
-  config: record.config,
+  provider: record.config?.provider,
 });
```
142-188: Harden batch processing: type safety, PII redaction, transform guards, safe createdAt. This segment has multiple issues flagged in prior reviews:
- Line 142: `any` type loses safety; use `unknown[]`
- Line 153: Logs full `rawEvent` which may contain PII
- Lines 163-168: `transformEvent` can throw and crash the job
- Lines 175-177: Unsafe `createdAt` derivation if `transformedEvents[0]?.created_at` is missing or invalid

Apply this diff to address all issues:
```diff
-      const eventBatch: any = [];
+      const eventBatch: unknown[] = [];
       for await (const rawEvent of providerInstance.parseSource(
         resumeLoadingFrom,
       )) {
         // Validate event
         if (
           !providerInstance.validate(
-            // @ts-expect-error
-            rawEvent,
+            rawEvent as any,
           )
         ) {
-          jobLogger.warn('Skipping invalid event', { rawEvent });
+          jobLogger.warn('Skipping invalid event', {
+            provider: providerInstance.provider,
+            reason: 'validation_failed',
+          });
           continue;
         }

         eventBatch.push(rawEvent);

         // Process batch when it reaches the batch size
         if (eventBatch.length >= BATCH_SIZE) {
           jobLogger.info('Processing batch', { batchSize: eventBatch.length });

-          const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-            (
-              // @ts-expect-error
-              event,
-            ) => providerInstance!.transformEvent(event),
-          );
+          const transformedEvents: IClickhouseEvent[] = [];
+          for (const ev of eventBatch) {
+            try {
+              transformedEvents.push(providerInstance!.transformEvent(ev as any));
+            } catch (e) {
+              jobLogger.warn('Skipping event on transform failure', {
+                provider: providerInstance.provider,
+                reason: e instanceof Error ? e.message : String(e),
+              });
+            }
+          }

           await insertImportBatch(transformedEvents, importId);
           processedEvents += eventBatch.length;
           eventBatch.length = 0;

-          const createdAt = new Date(transformedEvents[0]?.created_at || '')
-            .toISOString()
-            .split('T')[0];
+          const firstCreatedAt = transformedEvents[0]?.created_at;
+          const createdAt = firstCreatedAt
+            ? firstCreatedAt.split('T')[0]!
+            : formatClickhouseDate(new Date(), true);

           await updateImportStatus(jobLogger, job, importId, {
             step: 'loading',
             batch: createdAt,
             totalEvents,
             processedEvents,
           });
```
192-216: Harden final batch: guard transforms, safe createdAt, include counters. This final batch suffers the same issues as the main batch plus missing progress counters:
- Lines 193-198: No try-catch around `transformEvent`
- Lines 205-207: Unsafe `createdAt` derivation
- Lines 209-212: Missing `totalEvents` and `processedEvents` in status update

Apply this diff:
```diff
   // Process remaining events in the last batch
   if (eventBatch.length > 0) {
-    const transformedEvents = eventBatch.map(
-      (
-        // @ts-expect-error
-        event,
-      ) => providerInstance!.transformEvent(event),
-    );
+    const transformedEvents: IClickhouseEvent[] = [];
+    for (const ev of eventBatch) {
+      try {
+        transformedEvents.push(providerInstance!.transformEvent(ev as any));
+      } catch (e) {
+        jobLogger.warn('Skipping event on transform failure', {
+          provider: providerInstance.provider,
+          reason: e instanceof Error ? e.message : String(e),
+        });
+      }
+    }

     await insertImportBatch(transformedEvents, importId);
     processedEvents += eventBatch.length;
     eventBatch.length = 0;

-    const createdAt = new Date(transformedEvents[0]?.created_at || '')
-      .toISOString()
-      .split('T')[0];
+    const firstCreatedAt = transformedEvents[0]?.created_at;
+    const createdAt = firstCreatedAt
+      ? firstCreatedAt.split('T')[0]!
+      : formatClickhouseDate(new Date(), true);

     await updateImportStatus(jobLogger, job, importId, {
       step: 'loading',
       batch: createdAt,
+      totalEvents,
+      processedEvents,
     });
```
280-299: Sync final ClickHouse counters back to Postgres. The UI reads
`processedEvents` and `totalEvents` from the Postgres `import` table. After completion, `finalProgress` from ClickHouse contains the actual counts, but these are never written back to Postgres. Apply this diff to persist final counts:
```diff
 await markImportComplete(importId);

 await updateImportStatus(jobLogger, job, importId, {
   step: 'completed',
 });
 jobLogger.info('Import marked as complete');

 // Get final progress
 const finalProgress = await getImportProgress(importId);

+// Sync final counts to Postgres for UI
+await db.import.update({
+  where: { id: importId },
+  data: {
+    totalEvents: finalProgress.totalEvents,
+    processedEvents: finalProgress.insertedEvents,
+  },
+});
+
 jobLogger.info('Import job completed successfully', {
   totalEvents: finalProgress.totalEvents,
   insertedEvents: finalProgress.insertedEvents,
   status: finalProgress.status,
 });
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/worker/src/jobs/import.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
apps/worker/src/jobs/import.ts (6)
packages/queue/src/queues.ts (1)
ImportQueuePayload (168-173)
packages/db/src/prisma-client.ts (1)
db (270-270)
packages/db/src/services/import.service.ts (10)
ImportSteps (719-719), updateImportStatus (721-784), getImportDateBounds (658-682), insertImportBatch (30-87), generateSessionIds (94-130), createSessionsStartEndEvents (138-348), moveImportsToProduction (354-458), backfillSessionsToProduction (460-569), markImportComplete (574-593), getImportProgress (598-653)
packages/db/src/clickhouse/client.ts (2)
formatClickhouseDate (237-248), error (41-46)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent (68-102)
packages/logger/index.ts (1)
ILogger(6-6)
🔇 Additional comments (1)
apps/worker/src/jobs/import.ts (1)
300-332: LGTM! Error handling and provider factory are well-structured. The error handling properly marks the import as failed with a nested try-catch to ensure marking failures are logged. The
`createProvider` factory cleanly delegates to the appropriate provider implementation.
9211e62 to
2808bdb
Compare
Actionable comments posted: 14
♻️ Duplicate comments (14)
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
1-13: 🔴 CRITICAL: Data migration missing before dropping columns (unresolved from prior review). This migration drops
`provider`, `sourceLocation`, and `sourceType` columns without migrating their data to the `config` JSON field. The import job handler (apps/worker/src/jobs/import.ts) expects `config.provider` to exist when processing imports. Existing import records will lose this metadata and the handler will fail. Add a data migration step before dropping the columns to preserve the values:
```diff
+-- AlterTable
+UPDATE "public"."imports"
+SET "config" = jsonb_build_object(
+  'provider', provider,
+  'sourceType', "sourceType",
+  'sourceLocation', "sourceLocation"
+) || "config";
+
 -- AlterTable
 ALTER TABLE "public"."imports"
   DROP COLUMN "provider",
   DROP COLUMN "sourceLocation",
   DROP COLUMN "sourceType",
   ALTER COLUMN "config" DROP DEFAULT;
```
packages/importer/vitest.config.ts (1)
3-3: `__dirname` is unavailable in ESM; this issue was previously flagged. This critical issue was already identified in a prior review:
`__dirname` does not exist in ESM modules and will throw a `ReferenceError` at runtime. Apply this diff to fix the issue:
```diff
+import { dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
 import { getSharedVitestConfig } from '../../vitest.shared';

-export default getSharedVitestConfig({ __dirname });
+export default getSharedVitestConfig({
+  __dirname: dirname(fileURLToPath(import.meta.url)),
+});
```
packages/validation/src/index.ts (1)
478-486: Address the unresolved provider/config type safety issue. This was flagged in a previous review and remains unresolved. The current schema allows
`provider` and `config` to contain mismatched provider values (e.g., `provider: 'umami'` with `config.provider: 'mixpanel'`), which breaks type safety. The suggested fix from the previous review uses a discriminated union to enforce that the provider literal matches the config shape. This is a critical type safety improvement that should be implemented.
packages/common/scripts/get-referrers.ts (1)
74-74: Fix incorrect output path. The script writes to
`../../worker/src/referrers/index.ts`, but this is incorrect based on the past review comment. The path should be `../server/referrers/index.ts` to match the package-local structure.
44-47: Stop logging full provider config. Including the entire `record.config` in the logger context risks leaking credentials and secrets to log storage. Strip it down to non-sensitive identifiers (e.g., provider name) or explicitly mask sensitive keys before calling `logger.child`.
286-299: Persist final counters back to Postgres. After `getImportProgress`, only the ClickHouse view reflects the final totals; Postgres `import.totalEvents`/`processedEvents` stays at the previous batch values. The UI reads Postgres, so progress appears stale. Update the import row with `finalProgress.totalEvents` and `finalProgress.insertedEvents` before returning success. For example:

```diff
 jobLogger.info('Import job completed successfully', {
   totalEvents: finalProgress.totalEvents,
   insertedEvents: finalProgress.insertedEvents,
   status: finalProgress.status,
 });
+
+await db.import.update({
+  where: { id: importId },
+  data: {
+    totalEvents: finalProgress.totalEvents,
+    processedEvents: finalProgress.insertedEvents,
+  },
+});
```
85-91: Remove service account from logs. Line 90 logs `serviceAccount`, exposing credential identifiers in centralized logs. Drop the field or replace it with a boolean flag (e.g., `serviceAccountPresent: true`).
83-126: Tighten expectations for OS/browser version. The provider should emit `$os_version` and `$browser_version`, but this test expects them empty, masking the bug. Update the assertion to expect real values (e.g., `'Android'`, `'135'`) after fixing the transformer.
12-32: Sensitive config exposure in list — restrict or redact. The
`list` procedure returns import records including the `config` field to any project member with read access. This config likely contains provider credentials (API keys, secrets, service accounts). Redact sensitive fields or gate full config access by role. Consider one of these approaches:
Option A: Redact for read-level users
```diff
 list: protectedProcedure
   .input(z.object({ projectId: z.string() }))
   .query(async ({ input, ctx }) => {
     const access = await getProjectAccessCached({
       projectId: input.projectId,
       userId: ctx.session.userId,
     });
     if (!access) {
       throw TRPCAccessError('You do not have access to this project');
     }
-    return db.import.findMany({
+    const rows = await db.import.findMany({
       where: { projectId: input.projectId },
       orderBy: { createdAt: 'desc' },
     });
+    const canSeeSecrets = typeof access !== 'boolean' && access.level !== 'read';
+    return canSeeSecrets ? rows : rows.map((r) => ({ ...r, config: null }));
   }),
```
Option B: Exclude config from query for read users
Use Prismaselectto omit the config field for read-level access.
34-56: Sensitive config exposure in get — restrict or redact. The
`get` procedure returns the full import record including sensitive `config` to any project member. Apply the same redaction/role-based gating as in `list`.

```diff
 get: protectedProcedure
   .input(z.object({ id: z.string() }))
   .query(async ({ input, ctx }) => {
     const importRecord = await db.import.findUniqueOrThrow({
       where: { id: input.id },
       include: { project: true },
     });
     const access = await getProjectAccessCached({
       projectId: importRecord.projectId,
       userId: ctx.session.userId,
     });
     if (!access) {
       throw TRPCAccessError('You do not have access to this import');
     }
-    return importRecord;
+    const canSeeSecrets = typeof access !== 'boolean' && access.level !== 'read';
+    return canSeeSecrets ? importRecord : { ...importRecord, config: null };
   }),
```
72-99: Queue enqueue not wrapped — record can be left pending with no job. If
`importQueue.add` fails, the database keeps a dangling 'pending' import with no corresponding job. Wrap the enqueue in try/catch, mark the record as failed on error, and configure stable jobId with retry options.

```diff
 // Create import record
 const importRecord = await db.import.create({
   data: {
     projectId: input.projectId,
     config: input.config,
     status: 'pending',
   },
 });

-// Add job to queue
-const job = await importQueue.add('import', {
-  type: 'import',
-  payload: {
-    importId: importRecord.id,
-  },
-});
-
-// Update import record with job ID
-await db.import.update({
-  where: { id: importRecord.id },
-  data: { jobId: job.id },
-});
+// Add job to queue
+try {
+  const job = await importQueue.add(
+    'import',
+    { type: 'import', payload: { importId: importRecord.id } },
+    { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+  );
+  await db.import.update({
+    where: { id: importRecord.id },
+    data: { jobId: job.id },
+  });
+} catch (err) {
+  await db.import.update({
+    where: { id: importRecord.id },
+    data: {
+      status: 'failed',
+      errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed',
+    },
+  });
+  throw err;
+}
```
160-176: Wrap retry enqueue in try/catch.
Same issue as create: if `importQueue.add` fails, the record is updated to 'pending' but no job exists. Wrap in try/catch and handle failures.

```diff
 // Add new job to queue
-const job = await importQueue.add('import', {
-  type: 'import',
-  payload: {
-    importId: importRecord.id,
-  },
-});
-
-// Update import record
-return db.import.update({
-  where: { id: importRecord.id },
-  data: {
-    jobId: job.id,
-    status: 'pending',
-    errorMessage: null,
-  },
-});
+try {
+  const job = await importQueue.add(
+    'import',
+    { type: 'import', payload: { importId: importRecord.id } },
+    { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+  );
+  return db.import.update({
+    where: { id: importRecord.id },
+    data: {
+      jobId: job.id,
+      status: 'pending',
+      errorMessage: null,
+    },
+  });
+} catch (err) {
+  throw new Error('Failed to enqueue retry job: ' + (err instanceof Error ? err.message : String(err)));
+}
 }),
```

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)
265-276: Config tooltip exposes sensitive credentials.
The tooltip at Line 269 renders the raw `imp.config` JSON, which can expose API keys, secrets, and service credentials to anyone viewing the imports page. Redact sensitive fields before display.

```diff
 <Tooltiper
   content={
     <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
-      {JSON.stringify(imp.config, null, 2)}
+      {JSON.stringify(
+        {
+          ...imp.config,
+          apiKey: imp.config.apiKey ? '***' : undefined,
+          apiSecret: imp.config.apiSecret ? '***' : undefined,
+          serviceAccount: imp.config.serviceAccount ? '***' : undefined,
+          serviceSecret: imp.config.serviceSecret ? '***' : undefined,
+        },
+        null,
+        2
+      )}
     </pre>
   }
   tooltipClassName="max-w-xs"
 >
```

Or implement a utility function to sanitize known sensitive keys (apiKey, api_secret, secret, password, token, credentials, etc.) before rendering.
apps/start/src/modals/add-import.tsx (1)
245-249: Avoid leaking secrets in error paths.
The error toast at Line 247 displays `error.message`, which may include upstream details or API secrets. Use a generic user-facing message and log full details server-side only.

```diff
 onError: (error) => {
   toast.error('Import failed', {
-    description: error.message,
+    description: 'Unable to start import. Please try again or contact support.',
   });
 },
```

Ensure the server logs the full error details securely without echoing them to the client.
🧹 Nitpick comments (9)
packages/db/src/clickhouse/migration.ts (1)
118-132: Add validation or documentation for TTL format.
The function correctly handles clustered and non-clustered scenarios. However, the `ttl` parameter is directly interpolated into the SQL statement without validation. ClickHouse TTL expressions must follow a specific syntax (e.g., `created + INTERVAL 30 DAY`), and invalid formats will only fail at runtime.

Consider adding:

- Input validation to check TTL syntax before generating SQL
- JSDoc documentation specifying the expected TTL format

```diff
+/**
+ * Generates ALTER TABLE statement to modify TTL
+ * @param tableName - Name of the table
+ * @param isClustered - Whether the deployment is clustered
+ * @param ttl - TTL expression in ClickHouse format (e.g., 'created + INTERVAL 30 DAY')
+ * @returns SQL statement to modify TTL
+ */
 export const modifyTTL = ({
   tableName,
   isClustered,
   ttl,
 }: {
   tableName: string;
   isClustered: boolean;
   ttl: string;
 }) => {
+  // Basic validation - TTL should not be empty
+  if (!ttl || ttl.trim().length === 0) {
+    throw new Error('TTL expression cannot be empty');
+  }
+
   if (isClustered) {
     return `ALTER TABLE ${replicated(tableName)} ON CLUSTER '{cluster}' MODIFY TTL ${ttl}`;
   }
   return `ALTER TABLE ${tableName} MODIFY TTL ${ttl}`;
 };
```

packages/validation/src/index.ts (2)
439-442: Consider exporting `zProjectMapper` if used externally.
The schema is well-defined, but verify whether external packages (e.g., `@openpanel/importer`) need access to this schema for validation or type inference. If so, add the `export` keyword.

444-449: Good abstraction for file-based import configs.
The helper reduces duplication effectively. Similar to `zProjectMapper`, consider exporting this if external packages need to create compatible config schemas.

packages/common/src/object.ts (1)
44-55: Good use of try-catch for safe JSON parsing.
The try-catch properly handles parse errors and prevents crashes from malformed JSON. However, consider simplifying by removing the pre-check in `isValidJsonString` (see previous comment), since the try-catch already provides the necessary safety.

apps/worker/package.json (1)
41-41: Version pinning removes automatic patch updates.
Removing the caret (^) from `tsdown` pins it to exactly `0.14.2`, preventing automatic patch updates that might include bug fixes. Unless there's a specific compatibility reason, consider keeping the caret for patch-level updates.

packages/db/code-migrations/5-add-imports-table.sql (1)
42-43: Consider extending TTL for debugging.A 7-day TTL may be too aggressive for debugging failed imports or auditing historical import issues. Consider extending to 30-90 days depending on your operational needs and storage constraints.
packages/common/scripts/get-referrers.ts (1)
81-86: Consider formatting the generated output.
The `JSON.stringify` output is not formatted, which will produce a single-line file that's difficult to read and review. Consider adding indentation or using a code formatter.

```diff
-`const referrers: Record<string, { type: string, name: string }> = ${JSON.stringify(
+`const referrers: Record<string, { type: string, name: string }> = ${JSON.stringify(
   {
     ...transform(data),
     ...extraReferrers,
   },
+  null,
+  2,
 )} as const;`,
```

packages/db/src/clickhouse/client.ts (1)
206-232: Validate CSV row format before insertion.
The `chInsertCSV` function joins rows with newlines without validating that each row is properly formatted CSV. Consider adding validation or documenting the expected row format to prevent silent data corruption.

```diff
 export async function chInsertCSV(tableName: string, rows: string[]) {
+  if (rows.length === 0) {
+    logger.warn('chInsertCSV called with empty rows array');
+    return;
+  }
+
   try {
     const now = performance.now();
```

packages/db/src/services/import.service.ts (1)
468-568: Use ClickHouse query_params instead of `sqlstring.escape`. In `backfillSessionsToProduction`, inline `sqlstring.escape(importId)` / `sqlstring.escape(from)` breaks consistency with the rest of the file and forfeits ClickHouse parameter handling. Replace them with `{importId:String}` / `{from:String}` placeholders and pass `query_params: { importId, from }`. Based on learnings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (76)
- apps/api/package.json (1 hunks)
- apps/start/Dockerfile (2 hunks)
- apps/start/package.json (2 hunks)
- apps/start/src/components/integrations/integration-card.tsx (1 hunks)
- apps/start/src/components/project-mapper.tsx (1 hunks)
- apps/start/src/components/ui/badge.tsx (1 hunks)
- apps/start/src/modals/add-import.tsx (1 hunks)
- apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
- apps/start/src/modals/index.tsx (2 hunks)
- apps/start/src/routeTree.gen.ts (11 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
- apps/start/src/utils/math.ts (1 hunks)
- apps/start/src/utils/slug.ts (1 hunks)
- apps/worker/Dockerfile (2 hunks)
- apps/worker/package.json (3 hunks)
- apps/worker/src/boot-workers.ts (4 hunks)
- apps/worker/src/index.ts (2 hunks)
- apps/worker/src/jobs/cron.delete-projects.ts (1 hunks)
- apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
- apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
- apps/worker/src/jobs/import.ts (1 hunks)
- package.json (1 hunks)
- packages/cli/package.json (0 hunks)
- packages/cli/src/cli.ts (0 hunks)
- packages/cli/src/importer/importer.ts (0 hunks)
- packages/cli/src/importer/index.ts (0 hunks)
- packages/cli/tsconfig.json (0 hunks)
- packages/cli/tsup.config.ts (0 hunks)
- packages/common/package.json (1 hunks)
- packages/common/scripts/get-referrers.ts (1 hunks)
- packages/common/server/index.ts (1 hunks)
- packages/common/server/parse-referrer.test.ts (5 hunks)
- packages/common/server/parse-referrer.ts (4 hunks)
- packages/common/server/parser-user-agent.ts (2 hunks)
- packages/common/src/object.test.ts (1 hunks)
- packages/common/src/object.ts (2 hunks)
- packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
- packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
- packages/db/index.ts (2 hunks)
- packages/db/package.json (1 hunks)
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
- packages/db/prisma/schema.prisma (2 hunks)
- packages/db/src/buffers/base-buffer.ts (1 hunks)
- packages/db/src/buffers/event-buffer.test.ts (8 hunks)
- packages/db/src/clickhouse/client.ts (6 hunks)
- packages/db/src/clickhouse/csv.ts (1 hunks)
- packages/db/src/clickhouse/migration.ts (1 hunks)
- packages/db/src/services/event.service.ts (1 hunks)
- packages/db/src/services/import.service.ts (1 hunks)
- packages/db/src/services/session.service.ts (1 hunks)
- packages/db/src/types.ts (2 hunks)
- packages/importer/package.json (1 hunks)
- packages/importer/src/base-provider.ts (1 hunks)
- packages/importer/src/index.ts (1 hunks)
- packages/importer/src/providers/metadata.ts (1 hunks)
- packages/importer/src/providers/mixpanel.test.ts (1 hunks)
- packages/importer/src/providers/mixpanel.ts (1 hunks)
- packages/importer/src/providers/umami.ts (1 hunks)
- packages/importer/src/types.ts (1 hunks)
- packages/importer/tsconfig.json (1 hunks)
- packages/importer/vitest.config.ts (1 hunks)
- packages/queue/src/queues.ts (1 hunks)
- packages/redis/package.json (1 hunks)
- packages/trpc/package.json (1 hunks)
- packages/trpc/src/root.ts (2 hunks)
- packages/trpc/src/routers/auth.ts (1 hunks)
- packages/trpc/src/routers/import.ts (1 hunks)
- packages/trpc/src/routers/organization.ts (1 hunks)
- packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/tsup.config.ts
- packages/cli/tsconfig.json
- packages/cli/src/importer/index.ts
- packages/cli/package.json
- packages/cli/src/importer/importer.ts
- packages/cli/src/cli.ts
🚧 Files skipped from review as they are similar to previous changes (43)
- packages/db/src/buffers/event-buffer.test.ts
- apps/start/src/components/integrations/integration-card.tsx
- packages/common/src/object.test.ts
- apps/start/src/modals/date-ranger-picker.tsx
- packages/db/src/services/event.service.ts
- packages/importer/tsconfig.json
- packages/db/src/services/session.service.ts
- packages/trpc/package.json
- packages/db/index.ts
- apps/worker/src/jobs/events.incoming-events.test.ts
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql
- packages/queue/src/queues.ts
- apps/worker/Dockerfile
- apps/worker/src/index.ts
- packages/db/src/types.ts
- packages/common/server/index.ts
- packages/importer/src/index.ts
- apps/worker/src/jobs/cron.delete-projects.ts
- apps/start/package.json
- apps/start/src/modals/index.tsx
- apps/start/src/components/project-mapper.tsx
- packages/db/prisma/schema.prisma
- apps/api/package.json
- packages/db/package.json
- apps/start/src/components/ui/badge.tsx
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql
- packages/common/server/parse-referrer.test.ts
- apps/start/Dockerfile
- packages/common/server/parser-user-agent.ts
- packages/importer/src/base-provider.ts
- packages/redis/package.json
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
- apps/worker/src/boot-workers.ts
- apps/worker/src/jobs/events.incoming-event.ts
- packages/importer/src/types.ts
- packages/db/code-migrations/5-add-imports-table.ts
- packages/importer/src/providers/metadata.ts
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql
- packages/common/package.json
- packages/importer/package.json
- packages/db/src/buffers/base-buffer.ts
- apps/start/src/utils/slug.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsxapps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsxapps/start/src/utils/math.tsapps/start/src/routeTree.gen.tsapps/start/src/modals/add-import.tsx
🧠 Learnings (9)
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/__root.tsx : Declare the root route in routes/__root.tsx using createRootRoute or createRootRouteWithContext
Applied to files:
packages/trpc/src/root.tsapps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsxapps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*.tsx : Define React route files with createFileRoute and let the bundler/CLI manage the path string
Applied to files:
packages/trpc/src/root.tsapps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsxapps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/_*.tsx : Use leading underscore filenames for pathless layout routes that wrap children without matching a URL segment
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsxapps/start/src/routeTree.gen.ts
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, use the helper functions in ./packages/db/src/clickhouse/query-functions.ts rather than crafting raw queries
Applied to files:
packages/db/src/clickhouse/csv.tspackages/db/src/clickhouse/migration.tspackages/db/src/clickhouse/client.tspackages/importer/src/providers/umami.tspackages/db/src/services/import.service.ts
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, always construct them using the custom query builder in ./packages/db/src/clickhouse/query-builder.ts
Applied to files:
packages/db/src/clickhouse/client.tspackages/importer/src/providers/umami.tspackages/db/src/services/import.service.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/{index,*.index}.tsx : For index routes, use index.tsx or *.index.tsx files and ensure the createFileRoute path uses a trailing slash (e.g., '/posts/')
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*$*.tsx : Name dynamic route files with $param (e.g., posts.$postId.tsx) to capture path parameters
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/$.tsx : Use $.tsx for splat (catch-all) routes to capture the remaining path into params._splat
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*_.tsx : Use a trailing underscore in a segment filename to create a non-nested (un-nested) route (e.g., posts_.tsx)
Applied to files:
apps/start/src/routeTree.gen.ts
🧬 Code graph analysis (10)
apps/worker/src/jobs/import.ts (5)
packages/queue/src/queues.ts (1)
ImportQueuePayload(168-173)packages/db/src/services/import.service.ts (10)
ImportSteps(719-719)updateImportStatus(721-784)getImportDateBounds(658-682)insertImportBatch(30-87)generateSessionIds(94-130)createSessionsStartEndEvents(138-348)moveImportsToProduction(354-458)backfillSessionsToProduction(460-569)markImportComplete(574-593)getImportProgress(598-653)packages/db/src/clickhouse/client.ts (2)
formatClickhouseDate(241-252)error(38-43)packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)packages/logger/index.ts (1)
ILogger(6-6)
packages/trpc/src/root.ts (1)
packages/trpc/src/routers/import.ts (1)
importRouter(11-178)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (3)
packages/importer/src/providers/metadata.ts (1)
IMPORT_PROVIDERS(13-30)apps/public/components/ui/tooltip.tsx (1)
Tooltiper(36-51)apps/start/src/components/integrations/integration-card.tsx (3)
IntegrationCard(91-112)IntegrationCardLogoImage(50-69)IntegrationCardFooter(3-15)
packages/trpc/src/routers/import.ts (5)
packages/trpc/src/trpc.ts (2)
createTRPCRouter(134-134)protectedProcedure(166-170)packages/trpc/src/errors.ts (1)
TRPCAccessError(3-7)packages/db/src/prisma-client.ts (1)
db(276-276)packages/validation/src/index.ts (1)
zCreateImport(478-486)packages/queue/src/queues.ts (1)
importQueue(175-181)
packages/importer/src/providers/mixpanel.test.ts (1)
packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider(18-452)
packages/importer/src/providers/umami.ts (5)
packages/validation/src/index.ts (1)
IUmamiImportConfig(456-456)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/services/event.service.ts (2)
IClickhouseEvent(68-102)query(701-847)packages/common/src/url.ts (2)
parsePath(11-49)isSameDomain(51-63)packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
IMixpanelImportConfig(471-471)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/services/event.service.ts (2)
IClickhouseEvent(68-102)query(701-847)packages/common/src/url.ts (2)
parsePath(11-49)isSameDomain(51-63)packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)packages/common/server/parser-user-agent.ts (2)
parseUserAgent(72-115)UserAgentInfo(71-71)packages/common/src/object.ts (2)
toDots(16-67)strip(79-79)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
zUmamiImportConfig(452-454)zMixpanelImportConfig(461-470)IUmamiImportConfig(456-456)IMixpanelImportConfig(471-471)IImportConfig(473-476)apps/start/src/components/project-mapper.tsx (1)
ProjectMapper(33-116)apps/public/lib/utils.ts (1)
cn(4-6)
packages/db/src/services/import.service.ts (5)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)packages/db/src/clickhouse/csv.ts (2)
csvEscapeField(42-53)csvEscapeJson(15-38)packages/db/src/clickhouse/client.ts (5)
formatClickhouseDate(241-252)chInsertCSV(206-232)TABLE_NAMES(46-58)ch(129-166)convertClickhouseDateToJs(271-273)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/prisma-client.ts (1)
db(276-276)
packages/importer/vitest.config.ts (1)
vitest.shared.ts (1)
getSharedVitestConfig(4-27)
🔇 Additional comments (22)
packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1)
1-2: Schema modification is safe and straightforward.
Dropping the NOT NULL constraint on `currentStep` aligns with flexible import lifecycle handling as part of the broader imports table normalization.

packages/validation/src/index.ts (2)
452-471: Provider config schemas are well-structured.
Each provider config correctly captures its specific requirements:
- Umami: file-based with project mapping
- Plausible: simple file-based
- Mixpanel: API-based with credentials and date range
The schemas properly use literals for provider/type discrimination.
473-476: LGTM! The union type provides proper TypeScript inference for all import config variants.
packages/common/src/object.ts (1)
35-37: Empty string filtering is intentional and tested behavior—no issues found.
The test file explicitly validates that empty strings are filtered out. In `packages/common/src/object.test.ts`, the test input includes `empty: ''`, but the expected output does not contain an `empty` key, confirming this filtering is the designed behavior. This matches the code at lines 35-37, which filters `undefined`, `null`, and empty strings. No breaking changes are present—the behavior is tested and consistent across all usages in the codebase.

apps/worker/package.json (1)
22-22: LGTM! The addition of the `@openpanel/importer` workspace dependency aligns with the new import subsystem introduced in this PR.

packages/trpc/src/routers/organization.ts (1)
14-14: LGTM! The import path consolidation to `@openpanel/common/server` improves module organization without changing functionality.

package.json (1)
12-12: LGTM! The script update correctly reflects the migration of referrers generation from the worker scope to the common scope, aligning with the new `packages/common/scripts/get-referrers.ts` script.

packages/db/code-migrations/5-add-imports-table.sql (2)
30-30: Verify nullable imported_at field.
The `imported_at` field is nullable, which makes sense for pending imports. Ensure that queries filtering by this field handle NULL values appropriately, or consider adding a non-null default for completed imports.

37-37: Verify ordering key matches query patterns.
The table is ordered by `(import_id, created_at)`. Ensure this ordering supports your primary query patterns. If you frequently query by `project_id` or `import_status`, consider including those in the ordering key. Based on learnings.
packages/db/src/clickhouse/csv.ts (3)
5-13: LGTM! The escape order is correct (backslashes first), and all necessary control characters are properly escaped for ClickHouse Map format.

40-53: LGTM! The CSV field escaping correctly follows the RFC 4180 CSV standard by doubling internal quotes and wrapping fields that contain special characters.
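The RFC 4180 behavior described above can be sketched as follows. This is a minimal illustration; `escapeCsvField` is a hypothetical name, not the project's actual `csvEscapeField` implementation:

```typescript
// Quote a field only when it contains a character that would break CSV
// structure (comma, quote, or newline), and double any internal quotes.
function escapeCsvField(field: string): string {
  const needsQuoting = /[",\n\r]/.test(field);
  if (!needsQuoting) return field;
  // Per RFC 4180: internal quotes are doubled, then the field is wrapped.
  return `"${field.replace(/"/g, '""')}"`;
}
```

Fields without special characters pass through unchanged, which keeps the generated CSV compact.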
29-29: Code is correct — ClickHouse CSV supports `{}` as the empty Map representation.
The function correctly returns `'{}'`, as ClickHouse CSV accepts `{}` (optionally quoted as `"{}"`) as the valid empty `Map(String, String)` representation.

packages/db/src/clickhouse/client.ts (4)
57-57: LGTM! Adding the `events_imports` table name aligns with the new import subsystem introduced in this PR.
62-62: Timeout increases support long-running imports.The timeout changes (request_timeout: 300s, idle_socket_ttl: 60s) are appropriate for supporting large CSV import operations.
Also applies to: 65-65
89-127: LGTM! Exporting `withRetry` enables reuse across the codebase. The retry logic correctly handles connection errors with exponential backoff while immediately throwing non-retriable errors.
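The retry pattern described above can be sketched roughly like this. This is an illustrative stand-in, not the actual `client.ts` implementation; the retriability predicate, attempt count, and delays are assumptions:

```typescript
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry only errors the caller flags as retriable (e.g. connection errors),
// with exponential backoff between attempts; non-retriable errors fail fast.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetriable: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRetriable(err)) throw err; // fail fast on non-retriable errors
      lastError = err;
      await sleep(baseDelayMs * 2 ** attempt); // 100ms, 200ms, 400ms, ...
    }
  }
  throw lastError;
}
```

The key design point is the early `throw` for non-retriable errors: retrying a validation or authorization failure only delays the inevitable.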
142-143: String values for numeric settings are correct and explicitly supported.
The ClickHouse client documentation confirms that numeric settings like `max_insert_block_size` and `max_http_get_redirects` can be specified as strings. The docs show an explicit example passing a UInt64 setting as a string (`http_headers_progress_interval_ms: '110000'`), and the client handles this appropriately. No changes are needed.

Likely an incorrect or invalid review comment.
packages/common/server/parse-referrer.ts (2)
35-51: Case-insensitive matching is explicitly tested and working as intended.The test suite includes a dedicated case-insensitive test that validates mixed-case inputs (e.g., 'GoOgLe') correctly resolve to the proper referrer (e.g., 'Google'). The implementation consistently lowercases the source value and falls back gracefully for unknown sources. All three current usages in the codebase (event processing, Umami importer, Mixpanel importer) are compatible with case-insensitive matching for query parameters.
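The case-insensitive lookup behavior described above can be sketched as follows. The table contents and helper name here are illustrative, not the actual `parse-referrer.ts` code:

```typescript
// A tiny stand-in for the generated referrers table.
const referrers: Record<string, { type: string; name: string }> = {
  google: { type: 'search', name: 'Google' },
  duckduckgo: { type: 'search', name: 'DuckDuckGo' },
};

function lookupReferrer(source: string) {
  // Lowercase the incoming value so 'GoOgLe' and 'google' hit the same entry.
  const match = referrers[source.toLowerCase()];
  return {
    // Fall back gracefully for unknown sources: empty type, original value as name.
    type: match?.type ?? '',
    name: match?.name ?? source,
  };
}
```

Lowercasing at lookup time keeps the table itself in one canonical form while accepting mixed-case input from query parameters.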
23-23: Review comment is incorrect and should be disregarded.
The parse-referrer.ts code already defaults to an empty string (`''`) when no referrer match is found. This is not a new change—line 23 shows `type: match?.type ?? ''`, which means if there's no match, the type defaults to an empty string.

The 'unknown' value that exists in the referrers data is NOT a default; it's a specific type value assigned to certain known referrers (e.g., Google support pages, Yahoo subdomains). No downstream code checks for `'unknown'` as a sentinel value to trigger special handling. The codebase contains zero matches for conditional checks like `type === 'unknown'`.

Likely an incorrect or invalid review comment.
packages/trpc/src/routers/auth.ts (1)
15-15: LGTM! Import path consolidation.
The import path change from `@openpanel/common/server/id` to `@openpanel/common/server` consolidates module imports without affecting behavior.

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
45-45: LGTM! New Imports tab added.
The new tab entry correctly wires to the imports route under project settings.
packages/trpc/src/root.ts (1)
7-7: LGTM! Import router wired correctly.
The new import router is properly integrated into the root TRPC router, following established patterns.
Also applies to: 44-44
apps/start/src/routeTree.gen.ts (1)
59-59: Auto-generated route tree — no review required.
This file is automatically generated by TanStack Router and correctly includes the new imports route declarations.
Also applies to: 385-390, 525-525, 580-580, 645-645, 705-705, 760-760, 824-824, 1210-1216, 1544-1544, 1556-1557
```diff
@@ -1 +1 @@
-export * from '@openpanel/common/src/math';
+export * from '@openpanel/common';
```
🧩 Analysis chain
Export only math utilities, not the entire common package.
This file is named math.ts, suggesting it should provide math-related utilities. Exporting everything from @openpanel/common introduces namespace pollution and makes it unclear what this module provides. This could lead to unintended exports and maintenance issues.
Consider one of these approaches:
Option 1: If the package structure supports it, use a specific export path:

```diff
-export * from '@openpanel/common';
+export * from '@openpanel/common/math';
```

Option 2: Explicitly re-export only math-related utilities:

```diff
-export * from '@openpanel/common';
+export { round, clamp, percentage, /* other math utilities */ } from '@openpanel/common';
```

Verify that consumers of this module still have access to all expected math utilities:
🏁 Script executed:
```bash
#!/bin/bash
# Find all files importing from this math utility module
rg -n --type=ts --type=tsx "from ['\"].*utils/math['\"]" -A 2 -B 1

# Check what @openpanel/common exports at the root
fd -e ts -e tsx "index.ts" packages/common/ --exec cat {}
```

Length of output: 162843
Change removes math utilities; will break consumers relying on this module.
The change from '@openpanel/common/src/math' to '@openpanel/common' is a breaking change. The @openpanel/common package index does not export math utilities—it exports only: crypto, profileId, parser-user-agent, parse-referrer, and id. This will break any code importing math functions from apps/start/src/utils/math.ts.
Either restore the specific path or ensure math utilities are added to @openpanel/common's root exports.
🤖 Prompt for AI Agents
In apps/start/src/utils/math.ts around line 1, the export was changed from
'@openpanel/common/src/math' to '@openpanel/common', removing math utilities and
breaking consumers; restore compatibility by reverting to export the specific
math module path (e.g., re-export from '@openpanel/common/src/math') or,
alternatively, add the math utilities to the root exports of @openpanel/common
and update its index so this file can safely export from '@openpanel/common';
ensure the chosen fix is applied consistently and run tests/usages to confirm
imports resolve.
```ts
if (
  !providerInstance.validate(
    // @ts-expect-error
    rawEvent,
  )
) {
  jobLogger.warn('Skipping invalid event', { rawEvent });
  continue;
```
Redact invalid events in logs. jobLogger.warn('Skipping invalid event', { rawEvent }) dumps the full raw payload, which can contain PII or customer data. Log only safe metadata (provider, event ID/type, reason) and drop the payload.
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 147 to 154, the current warning
logs include the full rawEvent payload which may contain PII; change the log to
redact the payload by extracting and logging only safe metadata (provider name,
event id/type if available, and a short reason like "validation failed"). Remove
or omit rawEvent from the logged object, and ensure any fields you include are
explicitly accessed (e.g., providerInstance.name, rawEvent.id, rawEvent.type)
with safe guards so undefined values are handled; keep the log message and
continue behavior unchanged.
```ts
const transformedEvents: IClickhouseEvent[] = eventBatch.map(
  (
    // @ts-expect-error
    event,
  ) => providerInstance!.transformEvent(event),
);

await insertImportBatch(transformedEvents, importId);
```
Guard batch transform errors. A single transformEvent throw terminates the whole job because the map runs outside any try/catch. Wrap each transform in a guarded loop, skip failures, and only insert successful events; also use a provider-agnostic unknown[] buffer. For example:
```diff
-const eventBatch: any = [];
+const eventBatch: unknown[] = [];
 …
-const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-  (event) => providerInstance!.transformEvent(event),
-);
+const transformedEvents: IClickhouseEvent[] = [];
+for (const event of eventBatch) {
+  try {
+    transformedEvents.push(
+      providerInstance!.transformEvent(event as never),
+    );
+  } catch (error) {
+    jobLogger.warn('Skipping event on transform failure', {
+      provider: providerInstance.provider,
+      reason: error instanceof Error ? error.message : String(error),
+    });
+  }
+}
```

Apply the same guard for the final partial batch.
Also applies to: 191-198
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 163-170 (and similarly 191-198),
the current code maps eventBatch through providerInstance.transformEvent outside
any try/catch so a single throw aborts the whole job; change this to a guarded
per-event loop using an intermediary provider-agnostic buffer (unknown[]),
iterate events with a try/catch around each transform, push only successful
transformed events into the buffer, optionally log or count failures, and call
insertImportBatch(buffer, importId) only when the buffer is non-empty; repeat
the same guarded logic for the final partial batch at the later lines.
```ts
  providerInstance.shouldGenerateSessionIds()
) {
  await whileBounds(resumeGeneratingSessionIdsFrom, async (from) => {
    console.log('Generating session IDs', { from });
```
Use jobLogger instead of console.log. Switch this message to jobLogger.info('Generating session IDs', { from }) to retain context (importId, provider) and stay consistent with structured logging.
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around line 225, replace the console.log call
with a structured jobLogger.info call: change console.log('Generating session
IDs', { from }) to jobLogger.info('Generating session IDs', { from, importId,
provider }) so the message uses the jobLogger and includes contextual fields
(importId, provider) for consistent structured logging.
```ts
} catch (e) {
  console.log(e);
}
```
Error handling silently swallows failures.
The catch block only logs errors without throwing or exiting with a non-zero code. This may cause silent failures in CI/CD pipelines.
```diff
 } catch (e) {
   console.log(e);
+  process.exit(1);
 }
```
🤖 Prompt for AI Agents
In packages/common/scripts/get-referrers.ts around lines 91 to 93, the catch
block only logs the error with console.log which silently swallows failures;
change it to log to console.error and either rethrow the error or call
process.exit(1) so the process fails in CI; ensure any async contexts properly
propagate the error (throw or reject) and include the caught error message/stack
in the log for debugging.
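One way to keep that logic testable is to separate "describe the failure" from "exit the process". This is a sketch under the prompt's assumptions; `describeFailure` and the `main` wrapper are illustrative names, not the script's actual structure:

```typescript
// Sketch: format the caught error and pick a non-zero exit code, kept pure
// so the CI-failure behavior is easy to verify in isolation.
function describeFailure(e: unknown): { message: string; exitCode: number } {
  const message = e instanceof Error ? (e.stack ?? e.message) : String(e);
  return { message, exitCode: 1 };
}

// Usage at the script's top level (illustrative):
// main().catch((e) => {
//   const { message, exitCode } = describeFailure(e);
//   console.error(message);
//   process.exit(exitCode); // fail CI instead of swallowing the error
// });
```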
```ts
const userAgent = props.osVersion || '';
const uaInfo = this.isWebEvent(props.mp_lib)
  ? parseUserAgent(userAgent, props)
  : this.parseServerDeviceInfo(props);

// Map event name - $mp_web_page_view should be screen_view
let eventName = rawEvent.event;
if (eventName === '$mp_web_page_view') {
  eventName = 'screen_view';
}

// Build properties object - strip Mixpanel-specific properties
const properties = this.stripMixpanelProperties(props, query);

if (props.$insert_id) {
  properties.__source_insert_id = String(props.$insert_id);
}
// Add useful properties
if (props.$screen_width && props.$screen_height) {
  properties.__screen = `${props.$screen_width}x${props.$screen_height}`;
}
if (props.$screen_dpi) {
  properties.__dpi = props.$screen_dpi;
}
if (props.$language) {
  properties.__language = props.$language;
}
if (props.$timezone) {
  properties.__timezone = props.$timezone;
}
if (props.$app_version) {
  properties.__version = props.$app_version;
}
if (props.$app_build_number) {
  properties.__buildNumber = props.$app_build_number;
}
if (props.$lib_version) {
  properties.__lib_version = props.$lib_version;
}

if (hash) {
  properties.__hash = hash;
}

if (Object.keys(query).length > 0) {
  properties.__query = query;
}

if (props.current_page_title) {
  properties.__title = props.current_page_title;
}

if (userAgent) {
  properties.__userAgent = userAgent;
}

// Always use UUID for id to match ClickHouse UUID column
const event = {
  id: randomUUID(),
  name: eventName,
  device_id: deviceId,
  profile_id: profileId,
  project_id: projectId,
  session_id: '', // Will be generated in SQL after import
  properties: toDots(properties), // Flatten nested objects/arrays to Map(String, String)
  created_at: new Date(props.time * 1000).toISOString(),
  country,
  city,
  region,
  longitude: null,
  latitude: null,
  os: uaInfo.os || props.$os,
  os_version: uaInfo.osVersion || props.$osVersion,
  browser: uaInfo.browser || props.$browser,
  browser_version:
    uaInfo.browserVersion || props.$browserVersion
      ? String(props.$browser_version)
      : '',
```
Pull user agent and version fields from the correct properties. Using `props.osVersion` for `userAgent`, `props.$osVersion` (camelCase) for OS version, and the ternary `uaInfo.browserVersion || props.$browserVersion ? String(props.$browser_version) : ''` means:

- UA parsing runs on the OS version string.
- `$os_version` (Mixpanel default) is ignored, so OS version stays empty.
- Browser version resolves to `'undefined'` or `''` even when `$browser_version` exists.

Read from `$user_agent`, `$os_version`, and `$browser_version`, and order the fallbacks explicitly:

```ts
const userAgent = props.$user_agent || '';
const osVersion =
  uaInfo.osVersion || props.$os_version || props.$osVersion || '';
const browserVersion =
  uaInfo.browserVersion ||
  props.$browser_version ||
  props.$browserVersion ||
  '';
```

Coerce the final strings with `String(...)`. This keeps device metadata accurate.
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 202 to 279, the code
uses the wrong properties for user agent and version fields (props.osVersion,
props.$osVersion, and a faulty ternary for browser_version), causing UA parsing
and version values to be incorrect; update to read userAgent from
props.$user_agent, compute osVersion and browserVersion with explicit fallbacks
like: uaInfo.osVersion || props.$os_version || props.$osVersion || '' and
uaInfo.browserVersion || props.$browser_version || props.$browserVersion || '',
coerce final values with String(...), pass the correct userAgent into the UA
parsing call (so parsing runs on $user_agent), and assign the computed
os_version and browser_version values into the event object instead of the
current faulty expressions.
```ts
private async *parseRemoteFile(
  url: string,
  opts: {
    signal?: AbortSignal;
    maxBytes?: number;
    maxRows?: number;
  } = {},
): AsyncGenerator<UmamiRawEvent, void, unknown> {
  const { signal, maxBytes, maxRows } = opts;
  const controller = new AbortController();

  // Link to caller's signal for cancellation
  if (signal) {
    signal.addEventListener('abort', () => controller.abort(), {
      once: true,
    });
  }

  const res = await fetch(url, { signal: controller.signal });
  if (!res.ok || !res.body) {
    throw new Error(
      `Failed to fetch remote file: ${res.status} ${res.statusText}`,
    );
  }

  const contentType = res.headers.get('content-type') || '';
  const contentEnc = res.headers.get('content-encoding') || '';
  const contentLen = Number(res.headers.get('content-length') ?? 0);

  if (
    contentType &&
    !/text\/csv|text\/plain|application\/gzip|application\/octet-stream/i.test(
      contentType,
    )
  ) {
    console.warn(`Warning: Content-Type is ${contentType}, expected CSV-ish`);
  }

  if (maxBytes && contentLen && contentLen > maxBytes) {
    throw new Error(
      `Remote file exceeds size limit (${contentLen} > ${maxBytes})`,
    );
  }

  const looksGzip =
    /\.gz($|\?)/i.test(url) ||
    /gzip/i.test(contentEnc) ||
    /application\/gzip/i.test(contentType);
  const looksBr = /br/i.test(contentEnc) || /\.br($|\?)/i.test(url);

  // WHATWG -> Node stream
  const body = Readable.fromWeb(res.body as any);

  // Optional size guard during stream
  let seenBytes = 0;
  if (maxBytes) {
    body.on('data', (chunk: Buffer) => {
      seenBytes += chunk.length;
      if (seenBytes > maxBytes) {
        controller.abort();
        body.destroy(
          new Error(
            `Stream exceeded size limit (${seenBytes} > ${maxBytes})`,
          ),
        );
      }
    });
  }

  // Build decode chain (gzip/brotli -> CSV parser)
  const decompress = looksGzip
    ? createGunzip()
    : looksBr
      ? createBrotliDecompress()
      : null;

  const parser = parse({
    columns: true, // objects per row
    bom: true, // handle UTF-8 BOM
    relax_column_count: true,
    skip_empty_lines: true,
  });

  // Wire the pipeline for proper backpressure & error propagation
  (async () => {
    try {
      if (decompress) {
        await pipeline(body, decompress, parser, {
          signal: controller.signal,
        });
      } else {
        await pipeline(body, parser, { signal: controller.signal });
      }
    } catch (e) {
      parser.destroy(e as Error);
    }
  })().catch(() => {
    /* handled by iterator */
  });

  let rows = 0;
  try {
    for await (const record of parser) {
      rows++;
      if (maxRows && rows > maxRows) {
        controller.abort();
        throw new Error(`Row limit exceeded (${rows} > ${maxRows})`);
      }
      yield record as UmamiRawEvent;
    }
  } catch (err) {
    throw new Error(
      `Failed to parse remote file from ${url}: ${
        err instanceof Error ? err.message : String(err)
      }`,
    );
  } finally {
    controller.abort(); // ensure fetch stream is torn down
  }
```
Add SSRF defenses before fetching user URLs. parseRemoteFile accepts arbitrary fileUrl and streams it without DNS/IP validation. Attackers can pivot to internal networks or metadata services. Resolve the hostname, block private/link-local/loopback ranges (10/8, 172.16/12, 192.168/16, 127/8, ::1, fc00::/7, fe80::/10), and recheck after redirects—alternatively require signed/allowlisted URLs. Abort the fetch if any resolved address is disallowed.
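The range check can be sketched as a small classifier over already-resolved addresses. This is illustrative only, assuming hostnames are resolved elsewhere (e.g. via `dns.lookup`) before the fetch; `isPrivateAddress` is a hypothetical helper, not part of the importer:

```typescript
// Sketch: reject resolved IPs in private/loopback/link-local ranges before
// allowing an outbound fetch. Not exhaustive (no IPv4-mapped IPv6, no CIDR
// parsing); a production check should cover those too.
function isPrivateAddress(ip: string): boolean {
  const lower = ip.toLowerCase();
  // IPv6: loopback, unique-local (fc00::/7), link-local (fe80::/10)
  if (lower === '::1') return true;
  if (lower.startsWith('fc') || lower.startsWith('fd')) return true;
  if (
    lower.startsWith('fe8') ||
    lower.startsWith('fe9') ||
    lower.startsWith('fea') ||
    lower.startsWith('feb')
  ) {
    return true;
  }

  const octets = ip.split('.').map(Number);
  if (octets.length !== 4 || octets.some((o) => !Number.isInteger(o))) {
    return false; // not a dotted-quad IPv4 address
  }
  const [a = -1, b = -1] = octets;
  return (
    a === 10 || // 10.0.0.0/8
    a === 127 || // 127.0.0.0/8 (loopback)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // 169.254.0.0/16 (link-local / cloud metadata)
  );
}
```

Rechecking after each redirect matters because a public hostname can redirect to a private one.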
```ts
const deviceId =
  rawEvent.visit_id ||
  generateDeviceId({
    ip: rawEvent.visit_id!,
    ua: rawEvent.visit_id!,
    origin: projectId,
    salt: 'xxx',
  });
```
Fix device ID fallback. Falling back to generateDeviceId with rawEvent.visit_id! for both ip and ua breaks when visit_id is absent (runtime undefined) and produces unstable identifiers even when present. Prefer existing identifiers: visit_id, distinct_id, session_id, or event_id, and only hash real IP/UA values if you have them.
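The preference order above can be sketched as a tiny pure helper; `pickDeviceId` is an illustrative name, not part of the Umami provider, and it assumes the raw event fields are optional strings as in the CSV export:

```typescript
// Sketch: prefer the most stable identifier that actually exists on the row,
// instead of hashing a possibly-undefined visit_id as both "ip" and "ua".
function pickDeviceId(raw: {
  visit_id?: string;
  distinct_id?: string;
  session_id?: string;
  event_id?: string;
}): string | null {
  return (
    raw.visit_id || raw.distinct_id || raw.session_id || raw.event_id || null
  );
}
```

A `null` result signals that the caller should fall back to hashing real IP/UA values, if available, rather than fabricating one.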
```ts
// Parse URL if available - use same logic as real-time events
const url = rawEvent.url_path
  ? `https://${[rawEvent.hostname, rawEvent.url_path, rawEvent.url_query]
      .filter(Boolean)
      .join('')}`
  : '';
const { path, hash, query, origin } = parsePath(url);
// Extract referrer information - use same logic as real-time events
const referrerUrl = rawEvent.referrer_domain
  ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}`
  : '';
```
Harden URL assembly. Building url via https://${[hostname, path, query].join('')} omits the leading ? and produces invalid strings when hostname is missing. Require hostname before creating an absolute URL, preserve the leading slash on url_path, and normalize url_query so it starts with ?. Otherwise hand parsePath the relative path.
🤖 Prompt for AI Agents
In packages/importer/src/providers/umami.ts around lines 223–233, the URL
assembly is fragile: only build an absolute URL when rawEvent.hostname exists;
preserve a leading slash on rawEvent.url_path (prepend '/' if non-empty and
missing), normalize rawEvent.url_query so it either is empty or starts with '?'
(strip any leading '?' then prefix one), then assemble origin as
`https://${hostname}` and join origin + path + query to pass to parsePath; if
hostname is missing, pass a relative path to parsePath by combining the
normalized path and query (no origin). Apply the same hostname-check and path
normalization for the referrer (require rawEvent.referrer_domain, ensure
referrer_path starts with '/', normalize any referrer query similarly) so
parsePath always receives a well-formed absolute URL or a correctly formed
relative path.
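The normalization described above can be sketched as one helper; `buildPageUrl` is an illustrative name (not the provider's API), assuming `hostname`, `url_path`, and `url_query` arrive as raw CSV strings that may lack their leading `/` and `?`:

```typescript
// Sketch: emit an absolute URL only when a hostname exists; otherwise a
// well-formed relative path. Query is normalized to start with '?'.
function buildPageUrl(
  hostname: string | undefined,
  urlPath: string | undefined,
  urlQuery: string | undefined,
): string {
  if (!urlPath) return '';
  const path = urlPath.startsWith('/') ? urlPath : `/${urlPath}`;
  const query = urlQuery ? `?${urlQuery.replace(/^\?/, '')}` : '';
  return hostname ? `https://${hostname}${path}${query}` : `${path}${query}`;
}
```

The same shape applies to the referrer: require `referrer_domain`, ensure `referrer_path` starts with `/`, and normalize any referrer query the same way.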
```ts
let properties: Record<string, any> = {};

if (query) {
  properties.__query = query;
}

// Add useful properties from Umami data
if (rawEvent.page_title) properties.__title = rawEvent.page_title;
if (rawEvent.screen) properties.__screen = rawEvent.screen;
if (rawEvent.language) properties.__language = rawEvent.language;
if (rawEvent.utm_source)
  properties = assocPath(
    ['__query', 'utm_source'],
    rawEvent.utm_source,
    properties,
  );
if (rawEvent.utm_medium)
  properties = assocPath(
    ['__query', 'utm_medium'],
    rawEvent.utm_medium,
    properties,
  );
if (rawEvent.utm_campaign)
  properties = assocPath(
    ['__query', 'utm_campaign'],
    rawEvent.utm_campaign,
    properties,
  );
if (rawEvent.utm_content)
  properties = assocPath(
    ['__query', 'utm_content'],
    rawEvent.utm_content,
    properties,
  );
if (rawEvent.utm_term)
  properties = assocPath(
    ['__query', 'utm_term'],
    rawEvent.utm_term,
    properties,
  );
```
Flatten properties instead of nesting objects. Storing `properties.__query = query` plus `assocPath(['__query', …])` leaves nested objects. They serialize to `[object Object]` in ClickHouse, losing UTM data. Flatten the fields (e.g., ``properties[`__query.${key}`] = value``) and keep all values primitive so `insertImportBatch` emits valid Map(String,String) rows.
🤖 Prompt for AI Agents
In packages/importer/src/providers/umami.ts around lines 257 to 296, the code
currently sets properties.__query = query and uses assocPath(['__query', ...])
which creates nested objects that serialize poorly for ClickHouse; replace this
by flattening query and UTM fields into primitive map entries (e.g. for each key
in query set properties[`__query.${key}`] = String(value)), remove assocPath and
any assignment that creates an object at properties.__query, and ensure all
assigned values are primitives (strings or numbers) so insertImportBatch emits
valid Map(String,String) rows.
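The flattening above can be sketched as a small helper; `flattenQuery` is an illustrative name, not the codebase's `toDots`, and it assumes query/UTM values are primitives:

```typescript
// Sketch: turn every query/UTM field into a dot-keyed string entry so no
// nested object ever reaches the Map(String, String) column.
function flattenQuery(
  query: Record<string, string | number | undefined>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(query)) {
    if (value !== undefined) {
      out[`__query.${key}`] = String(value);
    }
  }
  return out;
}
```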
Actionable comments posted: 1
♻️ Duplicate comments (4)
packages/trpc/src/routers/import.ts (4)
12-32: Config field contains sensitive credentials — restrict access. The `list` procedure returns the full import records, including the `config` field, which likely contains provider credentials. This exposes secrets to any project member with read access. The past review comment at lines 12-32 provides detailed remediation options (redact for read-level users or use Prisma select to exclude the field conditionally).
34-56: Config field exposure in `get` — apply same restrictions as `list`. The `get` procedure returns the full import record, including sensitive `config` data, to any user with project access. Refer to the past review comment at lines 34-57 for detailed remediation (conditional select/redaction based on role).
72-99: Wrap queue enqueue in try/catch to avoid orphaned pending records. If `importQueue.add` fails, the import record remains in 'pending' status but no job exists in the queue. This leaves a dangling record that will never complete. The past review comment at lines 72-99 provides a detailed fix with error handling, stable jobId, and retry configuration.
160-176: Wrap retry enqueue in try/catch. Same issue as `create`: if `importQueue.add` fails, the import record will be updated to 'pending' but no job will exist, leaving it in a stuck state. The past review comment at lines 160-176 provides a detailed fix with error handling.
🧹 Nitpick comments (2)
packages/trpc/src/routers/import.ts (2)
121-126: Consider wrapping job removal in try/catch. If `job.remove()` fails (e.g., job not found or already completed), the entire delete operation will fail and the import record will remain in the database. Depending on requirements, you may want to proceed with record deletion even if job cleanup fails.

```diff
 if (importRecord.jobId) {
-  const job = await importQueue.getJob(importRecord.jobId);
-  if (job) {
-    await job.remove();
-  }
+  try {
+    const job = await importQueue.getJob(importRecord.jobId);
+    if (job) {
+      await job.remove();
+    }
+  } catch (err) {
+    // Log but don't block deletion if job cleanup fails
+    console.error('Failed to remove job:', err);
+  }
 }
```
66-66: Consider extracting repeated write-access check. The pattern `!access || (typeof access !== 'boolean' && access.level === 'read')` is repeated in the `create`, `delete`, and `retry` procedures (lines 66, 115, 149). Consider extracting to a helper function:

```ts
function canWrite(access: Awaited<ReturnType<typeof getProjectAccess>>): boolean {
  return !!access && (typeof access === 'boolean' || access.level !== 'read');
}
```

Then use:

```diff
- if (!access || (typeof access !== 'boolean' && access.level === 'read')) {
+ if (!canWrite(access)) {
    throw TRPCAccessError(
      'You do not have permission to create imports for this project',
    );
  }
```

Also applies to: 115-115, 149-149
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/trpc/src/routers/import.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
packages/trpc/src/routers/import.ts (5)
- packages/trpc/src/trpc.ts (2): `createTRPCRouter` (134-134), `protectedProcedure` (166-170)
- packages/trpc/src/errors.ts (1): `TRPCAccessError` (3-7)
- packages/db/src/prisma-client.ts (1): `db` (276-276)
- packages/validation/src/index.ts (1): `zCreateImport` (478-486)
- packages/queue/src/queues.ts (1): `importQueue` (175-181)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-dashboard
- GitHub Check: build-and-push-api
- GitHub Check: build-and-push-worker
🔇 Additional comments (1)
packages/trpc/src/routers/import.ts (1)
15-15: Review comment is based on incorrect assumptions. The `getProjectAccess` function is already cached. It's defined as `export const getProjectAccess = cacheable('getProjectAccess', ...)` in `packages/db/src/services/access.service.ts`. No separate `getProjectAccessCached` variant exists. The code is already using the cached version. Likely an incorrect or invalid review comment.
```ts
// Only allow retry for failed imports
if (importRecord.status !== 'failed') {
  throw new Error('Only failed imports can be retried');
}
```
Use TRPCError for consistency.
The error thrown here uses a generic Error, while other access-related errors in this file use TRPCError. Use TRPCError with an appropriate code for consistency.
```diff
  // Only allow retry for failed imports
  if (importRecord.status !== 'failed') {
-   throw new Error('Only failed imports can be retried');
+   throw new TRPCError({
+     code: 'BAD_REQUEST',
+     message: 'Only failed imports can be retried',
+   });
  }
```

You'll need to add TRPCError to the imports:

```diff
-import { TRPCAccessError } from '../errors';
+import { TRPCError } from '@trpc/server';
+import { TRPCAccessError } from '../errors';
```
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 155 to 158, replace the
generic throw new Error('Only failed imports can be retried') with a TRPCError
so it matches other access-related errors; add an import for TRPCError from
'@trpc/server' at the top and throw new TRPCError({ code: 'BAD_REQUEST',
message: 'Only failed imports can be retried' }) instead.
2f71c57 to b86c7a4 (Compare)
Actionable comments posted: 6
♻️ Duplicate comments (27)
packages/common/scripts/get-referrers.ts (2)
74-74: Fix incorrect output path. This issue was already identified in a previous review. The path `../../worker/src/referrers/index.ts` from `packages/common/scripts/` doesn't match the expected location referenced in package.json scripts.
91-93: Error handling silently swallows failures. This issue was already identified in a previous review. The catch block only logs errors without throwing or exiting with a non-zero code, which may cause silent failures in CI/CD pipelines.
packages/common/src/object.ts (1)
3-14: Previous review concern: misleading function names. As flagged in the earlier review, these function names suggest full JSON validation but only check bracket/brace matching. A string like `"{not valid json"` would incorrectly pass `isValidJsonString` despite being invalid JSON.

packages/validation/src/index.ts (1)
478-486: Discriminate by provider to prevent mismatched payloads. This issue was previously flagged but remains unresolved. The current schema allows `provider` and `config` to disagree (e.g., `provider: 'umami'` with a Mixpanel config), which can lead to runtime errors. Use a discriminated union on `provider` to ensure the config shape matches the selected provider. Apply the suggested fix from the previous review:
-export const zCreateImport = z.object({ - projectId: z.string().min(1), - provider: z.enum(['umami', 'plausible', 'mixpanel']), - config: z.union([ - zUmamiImportConfig, - zPlausibleImportConfig, - zMixpanelImportConfig, - ]), -}); +export const zCreateImport = z.discriminatedUnion('provider', [ + z.object({ + projectId: z.string().min(1), + provider: z.literal('umami'), + config: zUmamiImportConfig, + }), + z.object({ + projectId: z.string().min(1), + provider: z.literal('plausible'), + config: zPlausibleImportConfig, + }), + z.object({ + projectId: z.string().min(1), + provider: z.literal('mixpanel'), + config: zMixpanelImportConfig, + }), +]);apps/start/src/utils/math.ts (1)
1-1: Duplicate: Breaking change already flagged.This issue was already identified in a previous review. The export path change from
'@openpanel/common/src/math'to'@openpanel/common'removes math utilities and breaks consumers.apps/start/src/utils/slug.ts (1)
1-1: Duplicate: Namespace pollution already flagged.This issue was already identified in a previous review. The file exports the entire
@openpanel/commonpackage when it should only export theslugutility.packages/db/prisma/schema.prisma (1)
472-498: Duplicate: Schema-migration inconsistency already flagged.The inconsistency between
currentStepbeing nullable in the schema (String?) but NOT NULL in the migration was already identified in a previous review.packages/importer/package.json (2)
5-5: Duplicate: Main entry point issue already flagged.The issue of
mainpointing to source TypeScript instead of compiled output was already identified in a previous review.
21-22: Duplicate: Ramda version mismatch already flagged.The version mismatch between
ramdaruntime (^0.29.1) and@types/ramda(^0.31.1) was already identified in a previous review.packages/db/src/services/import.service.ts (3)
272-272: Missing explicit import forrandomUUID.Lines 272 and 309 use
crypto.randomUUID()without importing fromnode:crypto. While Node provides a globalcryptoobject, explicit imports are clearer and match the Umami provider's pattern.Add at the top:
import type { ILogger } from '@openpanel/logger'; +import { randomUUID } from 'node:crypto'; import sqlstring from 'sqlstring';Then update usage:
- id: crypto.randomUUID(), + id: randomUUID(),Also applies to: Line 309
269-341: Avoid whole-second shifts for synthetic events.Shifting
created_atby 1000 ms (session_start) or 500 ms (session_end) risks colliding with real events, scrambling chronological order. Also note: Line 307 comment says "1 second after" but code uses 500ms—inconsistent.Use millisecond offsets (±1 ms) to reduce collision probability:
- created_at: adjustTimestamp(session.first_timestamp, -1000), // 1 second before first event + created_at: adjustTimestamp(session.first_timestamp, -1), // 1ms before first event- created_at: adjustTimestamp(session.last_timestamp, 500), // 1 second after last event + created_at: adjustTimestamp(session.last_timestamp, 1), // 1ms after last event
602-652: Derive status deterministically.Using
any(import_status)returns an arbitrary value when rows contain mixed statuses (e.g., some pending, some processed), making progress reporting unreliable.Compute status explicitly using COUNTIf:
SELECT import_id, COUNT(*) as total_events, COUNTIf(import_status = 'pending') as pending_events, COUNTIf(import_status = 'processed') as processed_events, - any(import_status) as status + CASE + WHEN COUNTIf(import_status = 'failed') > 0 THEN 'failed' + WHEN COUNTIf(import_status = 'processed') = COUNT(*) THEN 'processed' + WHEN COUNTIf(import_status = 'pending') > 0 THEN 'processing' + ELSE 'pending' + END as status FROM ${TABLE_NAMES.events_imports} WHERE import_id = {importId:String} AND name NOT IN ('session_start', 'session_end') GROUP BY import_idpackages/importer/src/providers/umami.ts (4)
79-198: SSRF vulnerability: arbitrary URLs fetched without validation.The
parseRemoteFilemethod fetches user-provided URLs without DNS/IP validation, allowing attackers to probe internal networks, metadata services (e.g., 169.254.169.254), or localhost endpoints.Before calling
fetch, resolve the hostname to its IP addresses and reject private/loopback/link-local ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8, ::1, fc00::/7, fe80::/10). Revalidate after redirects. Alternatively, require signed/allowlisted URLs or proxy downloads through a controlled service.
213-221: Fix device ID fallback logic.Using
rawEvent.visit_id!for bothipanduaparameters is incorrect and will throw whenvisit_idis undefined. This produces unstable/invalid device IDs.Prefer a stable fallback hierarchy:
const deviceId = rawEvent.visit_id || - generateDeviceId({ - ip: rawEvent.visit_id!, - ua: rawEvent.visit_id!, - origin: projectId, - salt: 'xxx', - }); + rawEvent.distinct_id || + rawEvent.session_id || + rawEvent.event_id || + randomUUID();
224-233: Harden URL and referrer construction.URL assembly omits the leading
?for queries and can produce invalid strings whenhostnameis missing. Referrer construction ignoresreferrer_queryentirely.Apply proper URL normalization:
-const url = rawEvent.url_path - ? `https://${[rawEvent.hostname, rawEvent.url_path, rawEvent.url_query] - .filter(Boolean) - .join('')}` - : ''; +let url = ''; +if (rawEvent.hostname && rawEvent.url_path) { + const q = rawEvent.url_query + ? rawEvent.url_query.startsWith('?') + ? rawEvent.url_query + : `?${rawEvent.url_query}` + : ''; + url = `https://${rawEvent.hostname}${rawEvent.url_path}${q}`; +} + -const referrerUrl = rawEvent.referrer_domain - ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}` - : ''; +const refQuery = rawEvent.referrer_query + ? rawEvent.referrer_query.startsWith('?') + ? rawEvent.referrer_query + : `?${rawEvent.referrer_query}` + : ''; +const referrerUrl = rawEvent.referrer_domain + ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}${refQuery}` + : '';
257-297: Flatten properties instead of nesting objects.Assigning
properties.__query = queryand usingassocPath(['__query', ...])creates nested objects that serialize to[object Object]in ClickHouse, losing all UTM and query data.Flatten all fields to primitives:
-let properties: Record<string, any> = {}; - -if (query) { - properties.__query = query; -} +const properties: Record<string, string | number | boolean | null | undefined> = {}; +if (query) { + for (const [k, v] of Object.entries(query)) { + properties[`__query.${k}`] = v as string; + } +} if (rawEvent.page_title) properties.__title = rawEvent.page_title; if (rawEvent.screen) properties.__screen = rawEvent.screen; if (rawEvent.language) properties.__language = rawEvent.language; -if (rawEvent.utm_source) - properties = assocPath(['__query', 'utm_source'], rawEvent.utm_source, properties); -if (rawEvent.utm_medium) - properties = assocPath(['__query', 'utm_medium'], rawEvent.utm_medium, properties); -if (rawEvent.utm_campaign) - properties = assocPath(['__query', 'utm_campaign'], rawEvent.utm_campaign, properties); -if (rawEvent.utm_content) - properties = assocPath(['__query', 'utm_content'], rawEvent.utm_content, properties); -if (rawEvent.utm_term) - properties = assocPath(['__query', 'utm_term'], rawEvent.utm_term, properties); +if (rawEvent.utm_source) properties['__query.utm_source'] = rawEvent.utm_source; +if (rawEvent.utm_medium) properties['__query.utm_medium'] = rawEvent.utm_medium; +if (rawEvent.utm_campaign) properties['__query.utm_campaign'] = rawEvent.utm_campaign; +if (rawEvent.utm_content) properties['__query.utm_content'] = rawEvent.utm_content; +if (rawEvent.utm_term) properties['__query.utm_term'] = rawEvent.utm_term;packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
10-12: Migration drops columns without data migration.The migration drops
provider,sourceLocation, andsourceTypewithout migrating their data to theconfigJSON field. The import job handler (apps/worker/src/jobs/import.ts) expectsconfig.providerto exist when processing imports, so existing records will fail.Add a data migration step before dropping:
-- Migrate data to config field UPDATE "public"."imports" SET "config" = jsonb_build_object( 'provider', provider, 'sourceType', "sourceType", 'sourceLocation', "sourceLocation" ) || COALESCE("config", '{}'::jsonb) WHERE provider IS NOT NULL OR "sourceType" IS NOT NULL OR "sourceLocation" IS NOT NULL; -- Then drop the columns ALTER TABLE "public"."imports" DROP COLUMN "provider", DROP COLUMN "sourceLocation", DROP COLUMN "sourceType", ALTER COLUMN "config" DROP DEFAULT;packages/importer/src/providers/mixpanel.ts (4)
134-134: Duplicate: Use the configured logger instead of `console.warn`. Lines 134 and 146 use `console.warn`, bypassing the structured logger. Replace with `this.logger?.warn('Failed to parse Mixpanel event', { line, error })` for consistent logging. Also applies to: 146-146
85-91: Duplicate: Sensitive credentials logged. Line 90 logs the `serviceAccount` identifier. Remove it from the log metadata or replace it with a boolean flag (e.g., `serviceAccountPresent: true`).
202-205: Duplicate: Incorrect user agent assignment. Line 202 assigns `props.osVersion` to `userAgent`, but the OS version is not the user agent string. Use `props.$user_agent` or a similar property that contains the actual UA string.
273-279: Duplicate: Inconsistent Mixpanel property names cause data loss. Lines 274 and 276-279 use camelCase (`props.$osVersion`, `props.$browserVersion`) but Mixpanel's standard properties are snake_case (`$os_version`, `$browser_version`). Update to use the correct snake_case names and ensure the ternary condition matches the accessed property.
245-249: Duplicate: Avoid leaking secrets in error paths/logs. The error toast displays `error.message`, which may expose upstream API details or secrets. Use a generic user-facing message and log the full error server-side only.
44-47: Duplicate: Logger context leaks config secrets. The jobLogger child attaches the entire `record.config`, which may contain sensitive credentials like serviceAccount and serviceSecret. Include only non-sensitive fields (e.g., provider, type).
142-188: Duplicate: Harden batch transform; avoid crashes on empty/failed transforms. The batch buffer uses `any` and `transformEvent` is called without per-event error handling. A single transform failure will crash the entire job. Use a provider-agnostic `unknown[]` buffer and wrap each transform in try/catch.
147-154: Duplicate: Redact invalid events in logs. Logging the full `rawEvent` object may expose PII or customer data. Log only safe metadata (provider, event ID/type, validation reason) instead of the complete payload.
225-225: Duplicate: Use `jobLogger` instead of `console.log`. Switch to `jobLogger.info('Generating session IDs', { from, importId, provider })` to maintain structured logging with context.
191-216: Duplicate: Harden the final partial batch as well. The final partial batch also needs guarded per-event transforms and safe createdAt derivation to mirror the main batch processing logic.
🧹 Nitpick comments (5)
packages/common/scripts/get-referrers.ts (2)
10-47: Consider adding an explicit type annotation.Adding a type annotation to
extraReferrerswould improve type safety and maintainability.Apply this diff to add a type annotation:
+type ReferrerEntry = { type: string; name: string }; + // extras -const extraReferrers = { +const extraReferrers: Record<string, ReferrerEntry> = { 'zoom.us': { type: 'social', name: 'Zoom' },
49-64: Replace `any` with `unknown` for better type safety. The `data` parameter uses `any`, which bypasses type checking. Using `unknown` would maintain safety while acknowledging the dynamic nature of the fetched data.

Apply this diff:
```diff
-function transform(data: any) {
+function transform(data: unknown) {
   const obj: Record<string, unknown> = {};
+  if (typeof data !== 'object' || data === null) {
+    return obj;
+  }
   for (const type in data) {
```
apps/start/src/utils/op.ts (1)
5-18: Consider adding debug logging for a missing clientId. The factory pattern with a Proxy fallback is a solid approach. However, when the clientId is missing or invalid, there's no indication in logs, which could make debugging difficult in production environments.

Additionally, the check for `clientId === 'undefined'` (string comparison) suggests that environment variables might be serialized as the string "undefined" rather than being truly undefined. This is a minor code smell that might warrant investigation at the environment configuration level.

Consider adding logging:
```diff
 const createOpInstance = () => {
   if (!clientId || clientId === 'undefined') {
+    console.warn('OpenPanel clientId is missing or invalid. Using no-op implementation.');
     return new Proxy({} as OpenPanel, {
       get: () => () => {},
     });
   }
```
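For context, the Proxy-based no-op fallback discussed here can be sketched end-to-end as follows. This is an illustrative standalone version, not the app's actual module: `OpenPanel` is reduced to a minimal stand-in interface, and the "real client" branch is stubbed out.

```typescript
// Minimal stand-in for the real OpenPanel client type (assumption for this sketch).
interface OpenPanel {
  track: (name: string, props?: Record<string, unknown>) => void;
  identify: (profileId: string) => void;
}

function createOpInstance(clientId: string | undefined): OpenPanel {
  if (!clientId || clientId === 'undefined') {
    // Every property access yields a function that silently does nothing,
    // so call sites never need to null-check the client.
    return new Proxy({} as OpenPanel, {
      get: () => () => {},
    });
  }
  // In the real app this branch would construct the actual SDK client.
  return {
    track: () => {},
    identify: () => {},
  };
}

const op = createOpInstance(undefined);
op.track('page_view'); // no-op, does not throw
```

The appeal of the pattern is that analytics calls stay unconditional throughout the codebase; only the factory knows whether a real client exists.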
553-554: Inconsistent parameter handling with `sqlstring.escape`. Lines 553-554 use `sqlstring.escape()` for inline substitution, while the rest of the file uses ClickHouse's `query_params` mechanism (e.g., lines 122, 449, 583). This inconsistency reduces maintainability.

Refactor to use query_params for consistency:
```diff
 WHERE
-  e.import_id = ${sqlstring.escape(importId)}
-  AND toDate(e.created_at) = ${sqlstring.escape(from)}
+  e.import_id = {importId:String}
+  AND toDate(e.created_at) = {from:String}
   AND e.session_id != ''
 GROUP BY e.session_id
 `;
 await ch.command({
   query: sessionsInsertQuery,
+  query_params: { importId, from },
   clickhouse_settings: {
```

Based on learnings (ClickHouse queries should use query builder/helper functions).
packages/trpc/src/routers/import.ts (1)
155-177: Retry flow should use TRPCError and catch enqueue failures.

`retry` should return a structured TRPC error (e.g., `TRPCError({ code: 'BAD_REQUEST', ... })`) instead of `new Error`, and it needs the same enqueue error handling as create to avoid leaving the UI thinking the retry succeeded when the queue add actually failed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (76)
- apps/api/package.json (1 hunks)
- apps/start/Dockerfile (2 hunks)
- apps/start/package.json (2 hunks)
- apps/start/src/components/integrations/integration-card.tsx (1 hunks)
- apps/start/src/components/project-mapper.tsx (1 hunks)
- apps/start/src/components/ui/badge.tsx (1 hunks)
- apps/start/src/modals/add-import.tsx (1 hunks)
- apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
- apps/start/src/modals/index.tsx (2 hunks)
- apps/start/src/routeTree.gen.ts (11 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
- apps/start/src/utils/math.ts (1 hunks)
- apps/start/src/utils/op.ts (1 hunks)
- apps/start/src/utils/slug.ts (1 hunks)
- apps/worker/Dockerfile (2 hunks)
- apps/worker/package.json (3 hunks)
- apps/worker/src/boot-workers.ts (4 hunks)
- apps/worker/src/index.ts (2 hunks)
- apps/worker/src/jobs/cron.delete-projects.ts (1 hunks)
- apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
- apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
- apps/worker/src/jobs/import.ts (1 hunks)
- package.json (1 hunks)
- packages/cli/package.json (0 hunks)
- packages/cli/src/cli.ts (0 hunks)
- packages/cli/src/importer/importer.ts (0 hunks)
- packages/cli/src/importer/index.ts (0 hunks)
- packages/cli/tsconfig.json (0 hunks)
- packages/cli/tsup.config.ts (0 hunks)
- packages/common/package.json (1 hunks)
- packages/common/scripts/get-referrers.ts (1 hunks)
- packages/common/server/index.ts (1 hunks)
- packages/common/server/parse-referrer.test.ts (5 hunks)
- packages/common/server/parse-referrer.ts (4 hunks)
- packages/common/server/parser-user-agent.ts (2 hunks)
- packages/common/src/object.test.ts (1 hunks)
- packages/common/src/object.ts (2 hunks)
- packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
- packages/db/index.ts (2 hunks)
- packages/db/package.json (1 hunks)
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
- packages/db/prisma/schema.prisma (2 hunks)
- packages/db/src/buffers/base-buffer.ts (1 hunks)
- packages/db/src/buffers/event-buffer.test.ts (8 hunks)
- packages/db/src/clickhouse/client.ts (6 hunks)
- packages/db/src/clickhouse/csv.ts (1 hunks)
- packages/db/src/clickhouse/migration.ts (1 hunks)
- packages/db/src/services/event.service.ts (1 hunks)
- packages/db/src/services/import.service.ts (1 hunks)
- packages/db/src/services/session.service.ts (1 hunks)
- packages/db/src/types.ts (2 hunks)
- packages/importer/package.json (1 hunks)
- packages/importer/src/base-provider.ts (1 hunks)
- packages/importer/src/index.ts (1 hunks)
- packages/importer/src/providers/metadata.ts (1 hunks)
- packages/importer/src/providers/mixpanel.test.ts (1 hunks)
- packages/importer/src/providers/mixpanel.ts (1 hunks)
- packages/importer/src/providers/umami.ts (1 hunks)
- packages/importer/src/types.ts (1 hunks)
- packages/importer/tsconfig.json (1 hunks)
- packages/importer/vitest.config.ts (1 hunks)
- packages/queue/src/queues.ts (1 hunks)
- packages/redis/package.json (1 hunks)
- packages/trpc/package.json (1 hunks)
- packages/trpc/src/root.ts (2 hunks)
- packages/trpc/src/routers/auth.ts (1 hunks)
- packages/trpc/src/routers/import.ts (1 hunks)
- packages/trpc/src/routers/organization.ts (1 hunks)
- packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/tsup.config.ts
- packages/cli/src/cli.ts
- packages/cli/src/importer/index.ts
- packages/cli/src/importer/importer.ts
- packages/cli/tsconfig.json
- packages/cli/package.json
🚧 Files skipped from review as they are similar to previous changes (37)
- packages/db/src/services/event.service.ts
- packages/common/package.json
- packages/importer/tsconfig.json
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql
- packages/trpc/src/routers/auth.ts
- packages/db/src/services/session.service.ts
- apps/worker/src/jobs/events.incoming-events.test.ts
- apps/start/src/components/integrations/integration-card.tsx
- packages/common/server/index.ts
- package.json
- packages/db/src/buffers/event-buffer.test.ts
- apps/start/src/components/project-mapper.tsx
- packages/common/server/parser-user-agent.ts
- packages/db/src/clickhouse/csv.ts
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql
- packages/db/index.ts
- apps/start/src/modals/date-ranger-picker.tsx
- packages/db/code-migrations/5-add-imports-table.ts
- apps/start/src/routes/app.$organizationId.$projectId.settings._tabs.tsx
- apps/start/src/components/ui/badge.tsx
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
- apps/worker/package.json
- packages/trpc/src/routers/organization.ts
- apps/start/src/modals/index.tsx
- packages/redis/package.json
- apps/worker/src/jobs/cron.delete-projects.ts
- packages/common/src/object.test.ts
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
- apps/start/Dockerfile
- packages/importer/src/providers/metadata.ts
- packages/trpc/src/root.ts
- packages/importer/vitest.config.ts
- apps/api/package.json
- packages/importer/src/index.ts
- packages/db/src/types.ts
- packages/importer/src/types.ts
- packages/common/server/parse-referrer.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/utils/slug.ts
apps/start/src/utils/op.ts
apps/start/src/routeTree.gen.ts
apps/start/src/utils/math.ts
apps/start/src/modals/add-import.tsx
🧠 Learnings (9)
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*.tsx : Define React route files with createFileRoute and let the bundler/CLI manage the path string
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/__root.tsx : Declare the root route in routes/__root.tsx using createRootRoute or createRootRouteWithContext
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/_*.tsx : Use leading underscore filenames for pathless layout routes that wrap children without matching a URL segment
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, use the helper functions in ./packages/db/src/clickhouse/query-functions.ts rather than crafting raw queries
Applied to files:
packages/db/src/clickhouse/migration.ts
packages/db/src/clickhouse/client.ts
packages/db/src/services/import.service.ts
packages/importer/src/providers/umami.ts
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, always construct them using the custom query builder in ./packages/db/src/clickhouse/query-builder.ts
Applied to files:
packages/db/src/clickhouse/client.ts
packages/db/src/services/import.service.ts
packages/importer/src/providers/umami.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/{index,*.index}.tsx : For index routes, use index.tsx or *.index.tsx files and ensure the createFileRoute path uses a trailing slash (e.g., '/posts/')
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*$*.tsx : Name dynamic route files with $param (e.g., posts.$postId.tsx) to capture path parameters
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/$.tsx : Use $.tsx for splat (catch-all) routes to capture the remaining path into params._splat
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
Repo: Openpanel-dev/openpanel PR: 0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*_.tsx : Use a trailing underscore in a segment filename to create a non-nested (un-nested) route (e.g., posts_.tsx)
Applied to files:
apps/start/src/routeTree.gen.ts
🧬 Code graph analysis (13)
apps/worker/src/boot-workers.ts (2)
packages/queue/src/queues.ts (1)
importQueue (175-181)
apps/worker/src/jobs/import.ts (1)
importJob(34-317)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (3)
packages/importer/src/providers/metadata.ts (1)
IMPORT_PROVIDERS (13-30)
apps/public/components/ui/tooltip.tsx (1)
Tooltiper (36-51)
apps/start/src/components/integrations/integration-card.tsx (3)
IntegrationCard (91-112)
IntegrationCardLogoImage (50-69)
IntegrationCardFooter (3-15)
packages/queue/src/queues.ts (1)
packages/redis/redis.ts (1)
getRedisQueue(93-105)
packages/trpc/src/routers/import.ts (5)
packages/trpc/src/trpc.ts (2)
createTRPCRouter (134-134)
protectedProcedure (166-170)
packages/trpc/src/errors.ts (1)
TRPCAccessError (3-7)
packages/db/src/prisma-client.ts (1)
db (276-276)
packages/validation/src/index.ts (1)
zCreateImport (478-486)
packages/queue/src/queues.ts (1)
importQueue (175-181)
apps/worker/src/jobs/import.ts (7)
packages/queue/src/queues.ts (1)
ImportQueuePayload (168-173)
packages/db/src/services/import.service.ts (10)
ImportSteps (719-719)
updateImportStatus (721-784)
getImportDateBounds (658-682)
insertImportBatch (30-87)
generateSessionIds (94-130)
createSessionsStartEndEvents (138-348)
moveImportsToProduction (354-458)
backfillSessionsToProduction (460-569)
markImportComplete (574-593)
getImportProgress (598-653)
packages/db/src/clickhouse/client.ts (2)
formatClickhouseDate (241-252)
error (38-43)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent (68-102)
packages/logger/index.ts (1)
ILogger (6-6)
packages/importer/src/providers/umami.ts (1)
UmamiProvider (59-382)
packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider (18-452)
apps/worker/src/index.ts (1)
packages/queue/src/queues.ts (1)
importQueue(175-181)
packages/common/server/parse-referrer.test.ts (1)
packages/common/server/parse-referrer.ts (2)
parseReferrer (17-26)
getReferrerWithQuery (28-66)
packages/importer/src/providers/mixpanel.test.ts (1)
packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider(18-452)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
IMixpanelImportConfig (471-471)
packages/logger/index.ts (1)
ILogger (6-6)
packages/db/src/services/event.service.ts (2)
IClickhouseEvent (68-102)
query (719-879)
packages/common/src/url.ts (2)
parsePath (11-49)
isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
parseReferrer (17-26)
getReferrerWithQuery (28-66)
packages/common/server/parser-user-agent.ts (2)
parseUserAgent (72-115)
UserAgentInfo (71-71)
packages/common/src/object.ts (2)
toDots (16-67)
strip (79-79)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
BaseRawEvent (40-42)
ImportJobMetadata (69-73)
ErrorContext (45-51)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)
packages/db/src/services/import.service.ts (5)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent (68-102)
packages/db/src/clickhouse/csv.ts (2)
csvEscapeField (42-53)
csvEscapeJson (15-38)
packages/db/src/clickhouse/client.ts (5)
formatClickhouseDate (241-252)
chInsertCSV (206-232)
TABLE_NAMES (46-58)
ch (129-166)
convertClickhouseDateToJs (271-273)
packages/logger/index.ts (1)
ILogger (6-6)
packages/db/src/prisma-client.ts (1)
db(276-276)
apps/start/src/modals/add-import.tsx (4)
apps/start/src/integrations/tanstack-query/root-provider.tsx (1)
Provider (66-84)
packages/validation/src/index.ts (5)
zUmamiImportConfig (452-454)
zMixpanelImportConfig (461-470)
IUmamiImportConfig (456-456)
IMixpanelImportConfig (471-471)
IImportConfig (473-476)
apps/start/src/components/project-mapper.tsx (1)
ProjectMapper (33-116)
apps/public/lib/utils.ts (1)
cn(4-6)
packages/importer/src/providers/umami.ts (5)
packages/validation/src/index.ts (1)
IUmamiImportConfig (456-456)
packages/logger/index.ts (1)
ILogger (6-6)
packages/db/src/services/event.service.ts (2)
IClickhouseEvent (68-102)
query (719-879)
packages/common/src/url.ts (2)
parsePath (11-49)
isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
parseReferrer (17-26)
getReferrerWithQuery (28-66)
🔇 Additional comments (20)
packages/db/src/buffers/base-buffer.ts (1)
1-1: LGTM! Import path refactor looks good.The consolidation of
generateSecureIdimport from a specific subpath to the main@openpanel/common/servermodule is a clean refactor that aligns with the broader import path restructuring mentioned in the PR.packages/common/scripts/get-referrers.ts (1)
1-7: LGTM!The ESM pattern for deriving
__dirnameis correct and the imports are appropriate for the script's needs.apps/start/src/utils/op.ts (2)
3-4: LGTM!Clean extraction of the environment variable into a local constant.
20-20: LGTM!Module-level initialization is appropriate for this singleton analytics instance.
packages/trpc/package.json (1)
19-19: LGTM!The addition of
@openpanel/queuedependency is correct and follows the monorepo workspace pattern. This enables the TRPC import router to interact with the import queue.packages/db/src/clickhouse/migration.ts (1)
118-132: LGTM!The
modifyTTLfunction correctly follows the established pattern for ClickHouse migration helpers. It properly handles both clustered and non-clustered scenarios, uses thereplicated()helper appropriately, and maintains consistency with other migration functions in this file.apps/worker/Dockerfile (1)
38-38: LGTM!The importer package is correctly integrated into the Docker build process. The changes follow the established pattern: copying
package.jsonfor dependency resolution (line 38) and the full package in the final image (line 87).Also applies to: 87-87
packages/importer/src/base-provider.ts (1)
1-121: LGTM! Well-designed base class for import providers.The abstract class provides a clean, extensible foundation for import providers:
- Clear separation between required abstract methods and optional hooks
- Async generator pattern for
parseSourceenables efficient streaming of large datasets- Comprehensive lifecycle hooks for customization
getDateChunksutility correctly handles date range splitting with proper boundary logic- Excellent documentation throughout
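The date-range splitting idea behind that utility can be illustrated with the sketch below. This is not the package's actual implementation — the half-open `[start, end)` boundary convention is an assumption made for the example.

```typescript
interface DateChunk {
  start: Date;
  end: Date;
}

// Split [from, to) into consecutive chunks of at most `chunkDays` days.
function getDateChunks(from: Date, to: Date, chunkDays: number): DateChunk[] {
  const msPerDay = 24 * 60 * 60 * 1000;
  const chunks: DateChunk[] = [];
  let start = new Date(from);
  while (start.getTime() < to.getTime()) {
    // The last chunk is clamped so it never overshoots `to`.
    const end = new Date(
      Math.min(start.getTime() + chunkDays * msPerDay, to.getTime()),
    );
    chunks.push({ start, end });
    start = end;
  }
  return chunks;
}

const chunks = getDateChunks(
  new Date('2024-01-01T00:00:00Z'),
  new Date('2024-01-10T00:00:00Z'),
  3,
);
// 9 days split into 3-day chunks → 3 chunks
```

Chunking like this lets a provider fetch or stream one bounded date window at a time instead of the whole import range at once.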
packages/db/package.json (1)
16-16: LGTM: ClickHouse client version updated to support new features.The update from ^1.2.0 to ^1.12.1 aligns with enhanced ClickHouse integration (events_imports table, improved insert settings, etc.) introduced in this PR.
packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1)
1-2: LGTM: Makes currentStep nullable for flexible import state tracking.Dropping the NOT NULL constraint on
currentStepallows imports to exist without an explicit step, supporting initialization states in the import lifecycle.apps/worker/src/index.ts (1)
8-8: LGTM: Import queue integrated into Bull Board.The
importQueueis properly imported and wired into the Bull Board UI for monitoring import jobs.Also applies to: Line 42
apps/start/package.json (2)
11-11: LGTM: Build script now loads environment variables.Updating the build script to use
pnpm with-envensures environment variables are available during the build process, consistent with other scripts in the file.
33-33: LGTM: Importer package added to support new UI.The
@openpanel/importerworkspace dependency enables the new Imports UI and modal implementations in the start app.packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1)
1-3: LGTM: Adds batch and step tracking for import lifecycle.The
currentBatchandcurrentStepcolumns support multi-stage import processing, enabling progress tracking and debugging across import batches.packages/importer/src/providers/mixpanel.test.ts (1)
5-318: LGTM! Comprehensive test coverage.The test suite thoroughly covers Mixpanel provider functionality including date chunking, event transformation, property flattening, and React Native referrer handling. The test expectations are well-structured and validate the provider's behavior across different scenarios.
packages/db/src/clickhouse/client.ts (4)
1-1: LGTM! Appropriate updates for import operations.The increased timeouts (300s request timeout, 60s idle socket TTL) and addition of the
events_importstable support long-running import jobs without connection issues.Also applies to: 57-57, 62-62, 65-65
89-127: LGTM! Solid retry logic.Exporting
withRetryand wrapping operations with exponential backoff provides resilience against transient connection failures. The retry conditions (Connect errors, socket hang up, Timeout) are well-chosen.
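A backoff wrapper in this spirit might look like the sketch below. It is illustrative only — the retry predicate and timings are assumptions for the example, not the exported `withRetry`'s exact behavior.

```typescript
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Retry only transient connection-style failures, up to `retries` times.
      const retryable = /Connect|socket hang up|Timeout/i.test(String(error));
      if (!retryable || attempt >= retries) throw error;
      // Exponential backoff: 100ms, 200ms, 400ms, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}
```

Errors that do not match the predicate are rethrown immediately, so genuine query errors surface on the first attempt rather than being retried pointlessly.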
134-162: LGTM! Proper settings for large inserts.The retry wrapper and comprehensive clickhouse_settings (async_insert, increased timeouts, progress headers) ensure large CSV/row streaming completes successfully even under load.
206-232: LGTM! Clean CSV insert implementation.The
chInsertCSVfunction correctly uses a Readable stream in binary mode with appropriate CSV format settings, mirrors the existing EventBuffer pattern, and includes proper logging.apps/start/src/routeTree.gen.ts (1)
1-1700: Auto-generated route tree file.This file is automatically generated by TanStack Router and should not be manually edited. The additions correctly wire the new imports route into the settings tabs structure.
```tsx
<TableCell>
  <Tooltiper
    content={
      <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
        {JSON.stringify(imp.config, null, 2)}
      </pre>
    }
    tooltipClassName="max-w-xs"
  >
    <Badge>Config</Badge>
  </Tooltiper>
</TableCell>
```
Redact import configs before rendering.
The tooltip currently stringifies imp.config verbatim, exposing stored provider credentials to anyone with page access. Sanitize/redact known secret keys (API keys, tokens, passwords) or limit visibility based on role before displaying.
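One way to implement such a sanitizer is sketched below. The key list is illustrative — extend it to whatever secret fields your providers actually store — and the recursion assumes configs are plain JSON objects.

```typescript
const SECRET_KEYS = new Set([
  'apikey', 'api_key', 'token', 'access_token', 'secret',
  'clientsecret', 'servicesecret', 'password', 'passwd', 'auth', 'private_key',
]);

function redactConfig(config: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(config)) {
    if (SECRET_KEYS.has(key.toLowerCase())) {
      out[key] = 'REDACTED';
    } else if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      // Recurse so nested credentials are caught too.
      out[key] = redactConfig(value as Record<string, unknown>);
    } else {
      out[key] = value;
    }
  }
  return out;
}

redactConfig({ provider: 'mixpanel', serviceSecret: 'shh', nested: { token: 't' } });
// → { provider: 'mixpanel', serviceSecret: 'REDACTED', nested: { token: 'REDACTED' } }
```

The sanitizer is synchronous and deterministic, so its output can be passed straight to `JSON.stringify` before rendering the tooltip.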
🤖 Prompt for AI Agents
In
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
around lines 265 to 276, the tooltip directly stringifies imp.config and may
expose secrets; fix by sanitizing the config before rendering: implement or call
a small sanitizer that scans known secret keys (e.g., apiKey, api_key, token,
access_token, secret, clientSecret, password, passwd, auth, private_key) and
replaces their values with a placeholder like "REDACTED" (or strips them), then
JSON.stringify that sanitized object for the tooltip; additionally gate full
config visibility by user role/permission — if the current user lacks an
elevated role, always show the sanitized version. Ensure the sanitizer is
deterministic and runs synchronously before passing content to Tooltiper.
```ts
if (value === undefined || value === null || value === '') {
  return acc;
}
```
Filtering empty strings may cause unintended data loss.
The new check at line 35 filters out empty strings (value === ''), which could silently drop properties where an empty string is a legitimate value. This might be a breaking change if previous behavior preserved empty strings as valid data.
For example:
- Input: `{name: "", age: 25}`
- Result: Only `age` is preserved; `name` is lost entirely
If this filtering is intentional for the import use case, consider documenting this behavior or making it configurable.
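A configurable variant could look like the simplified sketch below. It is a stand-in for the real `toDots` (which also handles nested JSON strings); the `keepEmptyStrings` option name is a suggestion, not existing API.

```typescript
interface ToDotsOptions {
  keepEmptyStrings?: boolean;
}

function toDots(
  obj: Record<string, unknown>,
  path = '',
  opts: ToDotsOptions = {},
): Record<string, unknown> {
  return Object.entries(obj).reduce<Record<string, unknown>>((acc, [key, value]) => {
    if (value === undefined || value === null) return acc;
    // Empty strings are dropped unless the caller opts in to keeping them.
    if (value === '' && !opts.keepEmptyStrings) return acc;
    if (typeof value === 'object' && !Array.isArray(value)) {
      return { ...acc, ...toDots(value as Record<string, unknown>, `${path}${key}.`, opts) };
    }
    acc[`${path}${key}`] = value;
    return acc;
  }, {});
}

toDots({ name: '', age: 25 });
// → { age: 25 } — the empty string is dropped by default
toDots({ name: '', age: 25 }, '', { keepEmptyStrings: true });
// → { name: '', age: 25 }
```

Making the behavior an explicit option documents the trade-off at every call site instead of hiding it inside the helper.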
🤖 Prompt for AI Agents
In packages/common/src/object.ts around lines 35 to 37, the added condition
checking value === '' will drop empty-string properties which may be valid data;
either remove the value === '' check to preserve empty strings, or make this
behavior explicit/configurable (e.g., add an options flag keepEmptyStrings or a
separate filter function) and document the change in code comments and relevant
docs so callers opt into stripping empty strings; update unit tests to cover
both behaviors.
```ts
if (typeof value === 'string' && isMalformedJsonString(value)) {
  // Skip it
  return acc;
}

// Fix nested json strings - but catch parse errors for malformed JSON
if (typeof value === 'string' && isValidJsonString(value)) {
  try {
    return {
      ...acc,
      ...toDots(JSON.parse(value), `${path}${key}.`),
    };
  } catch {
    // Skip it
    return acc;
  }
}
```
Silent error handling makes debugging difficult.
While the earlier review flagged the data loss risk from skipping malformed JSON (lines 39-42), there's an additional concern: the catch block at lines 51-54 silently drops values that fail JSON parsing without any logging or warning.
When legitimate data is skipped due to parse failures, users have no visibility into what was lost or why. This makes troubleshooting import issues very difficult.
Consider adding logging to provide visibility:
```diff
 } catch {
-  // Skip it
+  // Log parse failure for visibility
+  console.warn(`Failed to parse JSON string for key "${path}${key}": ${value.substring(0, 50)}...`);
   return acc;
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
if (typeof value === 'string' && isMalformedJsonString(value)) {
  // Skip it
  return acc;
}

// Fix nested json strings - but catch parse errors for malformed JSON
if (typeof value === 'string' && isValidJsonString(value)) {
  try {
    return {
      ...acc,
      ...toDots(JSON.parse(value), `${path}${key}.`),
    };
  } catch {
    // Log parse failure for visibility
    console.warn(`Failed to parse JSON string for key "${path}${key}": ${value.substring(0, 50)}...`);
    return acc;
  }
}
```
🤖 Prompt for AI Agents
In packages/common/src/object.ts around lines 39 to 55, the catch block silently
swallows JSON.parse errors when attempting to fix nested JSON strings; update
the catch to log the error and the offending value/path using the module's
logger (or accept a logger param) before returning acc so failures are visible,
include contextual info (error message, value, and path/key) and avoid changing
existing control flow.
```ts
return db.import.findMany({
  where: {
    projectId: input.projectId,
  },
  orderBy: {
    createdAt: 'desc',
  },
});
}),

get: protectedProcedure
  .input(z.object({ id: z.string() }))
  .query(async ({ input, ctx }) => {
    const importRecord = await db.import.findUniqueOrThrow({
      where: {
        id: input.id,
      },
      include: {
        project: true,
      },
    });

    const access = await getProjectAccess({
      projectId: importRecord.projectId,
      userId: ctx.session.userId,
    });

    if (!access) {
      throw TRPCAccessError('You do not have access to this import');
    }

    return importRecord;
  }),
```
Restrict config exposure in list/get.
Both list and get return the full config payload to any member with read access, which leaks provider credentials (API keys, secrets, etc.). Gate secrets by role or redact sensitive fields before returning the response.
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 24 to 56, both the list and
get handlers currently return the full import.config payload which may contain
provider credentials; restrict exposure by either checking the caller's project
role and only returning full config for privileged roles (owner/admin), or by
redacting sensitive fields (apiKey, secret, token, password, privateKey, etc.)
from the config before returning for non‑privileged readers; apply the same
redaction/role check in both the findMany and findUnique responses (or transform
the returned objects) and update any return types to reflect the reduced config
shape.
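One way to implement the redaction described above is a small recursive helper applied to the config before it is returned to non-privileged readers. This is a sketch, not the project's actual implementation; the key list and the `redactConfig` name are assumptions to be adjusted to the real provider config shapes:

```typescript
// Keys assumed to hold credentials (assumption; extend per provider).
const SENSITIVE_KEYS = ['apiKey', 'secret', 'token', 'password', 'privateKey'];

// Return a copy of the config with sensitive values masked, recursing into
// nested objects so provider sub-configs are covered too.
function redactConfig(config: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(config)) {
    if (SENSITIVE_KEYS.includes(key)) {
      out[key] = '[redacted]';
    } else if (value && typeof value === 'object' && !Array.isArray(value)) {
      out[key] = redactConfig(value as Record<string, unknown>);
    } else {
      out[key] = value;
    }
  }
  return out;
}
```

In the router, the role check would decide whether to return `importRecord.config` as-is (owner/admin) or `redactConfig(importRecord.config)` for everyone else, in both the `list` and `get` handlers.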
```ts
      // Create import record
      const importRecord = await db.import.create({
        data: {
          projectId: input.projectId,
          config: input.config,
          status: 'pending',
        },
      });

      // Add job to queue
      const job = await importQueue.add('import', {
        type: 'import',
        payload: {
          importId: importRecord.id,
        },
      });

      // Update import record with job ID
      await db.import.update({
        where: { id: importRecord.id },
        data: { jobId: job.id },
      });

      return {
        ...importRecord,
        jobId: job.id,
      };
    }),
```
Handle queue enqueue failures.
In create, if importQueue.add throws (Redis outage, connection drop), the record stays pending with no job attached. Wrap the enqueue in try/catch, update the record to failed (with errorMessage), and only persist pending/jobId after a successful add.
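The suggested shape can be sketched with the enqueue wrapped in try/catch. The four callbacks here are hypothetical stand-ins for `db.import.create`, `importQueue.add`, the `jobId` update, and the failure update; the real handler would inline them:

```typescript
// Create a record, enqueue its job, and attach the job ID; if enqueueing
// fails, mark the record as failed (with the error message) and rethrow so
// the caller sees the error instead of a record stuck in "pending".
async function createImportWithJob<R extends { id: string }>(
  createRecord: () => Promise<R>,
  enqueue: (importId: string) => Promise<{ id: string }>,
  attachJob: (recordId: string, jobId: string) => Promise<void>,
  markFailed: (recordId: string, errorMessage: string) => Promise<void>,
): Promise<R & { jobId: string }> {
  const record = await createRecord();
  try {
    const job = await enqueue(record.id);
    await attachJob(record.id, job.id);
    return { ...record, jobId: job.id };
  } catch (err) {
    const errorMessage = err instanceof Error ? err.message : String(err);
    await markFailed(record.id, errorMessage);
    throw err;
  }
}
```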
```ts
// Import configs
export const zUmamiImportConfig = createFileImportConfig('umami').extend({
  projectMapper: z.array(zProjectMapper),
```
projectMapper should likely be optional.
The AI summary indicates that projectMapper should be optional, but the code doesn't include .optional(). Project mapping is typically not required for all imports.
Apply this diff to make it optional:
```diff
- projectMapper: z.array(zProjectMapper),
+ projectMapper: z.array(zProjectMapper).optional(),
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```suggestion
  projectMapper: z.array(zProjectMapper).optional(),
```
🤖 Prompt for AI Agents
In packages/validation/src/index.ts around line 453, the schema currently
declares projectMapper as a required array (projectMapper:
z.array(zProjectMapper)), but it should be optional; change the schema to make
projectMapper optional by appending .optional() (i.e.,
z.array(zProjectMapper).optional()) so the field can be omitted during
validation.
Summary by CodeRabbit
New Features
UI
Chores