Skip to content

refactor(dal): update analitycs repository to use zod for schema generation#10581

Draft
djabarovgeorge wants to merge 2 commits intonextfrom
refactor-clickhouse-schema
Draft

refactor(dal): update analitycs repository to use zod for schema generation#10581
djabarovgeorge wants to merge 2 commits intonextfrom
refactor-clickhouse-schema

Conversation

@djabarovgeorge
Copy link
Copy Markdown
Contributor

@djabarovgeorge djabarovgeorge commented Apr 5, 2026

What changed? Why was the change needed?

What changed

The analytics repository's data access layer was refactored to migrate from a ClickHouse schema builder pattern to Zod-based schema validation with integrated ClickHouse type descriptors. A new LogRepositoryV2 abstract class was introduced as the base for analytics repositories, centralizing tenant/environment enforcement, feature-flag-driven log expiration, and type-safe query condition building with runtime validation. The RequestLogRepository now extends LogRepositoryV2 instead of LogRepository, and its schema definition was migrated to use Zod validators paired with ClickHouse type metadata.

Affected areas

  • dal: Migration of RequestLogRepository from LogRepository to LogRepositoryV2 with schema validation now driven by Zod validators instead of ClickHouse schema builders. New v2 module introduces reusable typed query conditions, tenant enforcement, and batch insert capabilities.

Key technical decisions

  • Zod + ClickHouse type mapping: Fields now pair Zod validators with ClickHouse type descriptors ({ zod, ch }) for runtime safety and type inference, replacing the previous ClickhouseSchema wrapper class.
  • LogRepositoryV2 abstraction: Centralizes tenant/environment enforcement (with __unsafe escape hatch), feature-flag-driven log expiration, and query construction with automatic column name and operator validation.
  • Type-safe query builder: QueryBuilder<T> provides fluent API for constructing WHERE clauses with type-checked columns, operators (equals, in, like, between), and array/null handling.
  • Batch insertion: Supports configurable batch inserts controlled by feature flags and environment variables, with automatic id generation and expires_at computation.

Testing

No test files were included in the summary. Verification of schema validation behavior, tenant enforcement, and query builder correctness through existing or new unit tests is recommended.

Screenshots

Expand for optional sections

Related enterprise PR

Special notes for your reviewer

…ositoryV2 and modify requestLogSchema to use Zod for validation
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 5, 2026

Deploy Preview for dashboard-v2-novu-staging canceled.

Name Link
🔨 Latest commit fa28887
🔍 Latest deploy log https://app.netlify.com/projects/dashboard-v2-novu-staging/deploys/69d3988c6ea193000828adb1

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 5, 2026

Hey there and thank you for opening this pull request! 👋

We require pull request titles to follow specific formatting rules and it looks like your proposed title needs to be adjusted.

Your PR title is: refactor(dal): update analitycs repository to use zod for schema generation

Requirements:

  1. Follow the Conventional Commits specification
  2. As a team member, include Linear ticket ID at the end: fixes TICKET-ID or include it in your branch name

Expected format: feat(scope): Add fancy new feature fixes NOV-123

Details:

PR title must end with 'fixes TICKET-ID' (e.g., 'fixes NOV-123') or include ticket ID in branch name

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 5, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: deb4fce7-fdbf-401c-a5f5-3f75e03b37fa

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

A new LogRepositoryV2 architecture is introduced alongside a Zod-ClickHouse type system for schema definition and query building. RequestLogRepository is migrated to extend the new base class with updated schema format, while the v2 repository provides typed query conditions, automatic tenant/environment enforcement, feature-flag-driven batching, and expiration calculation.

Changes

Cohort / File(s) Summary
Repository Base Class Migration
libs/application-generic/src/services/analytic-logs/request-log/request-log.repository.ts
RequestLogRepository now extends LogRepositoryV2<typeof requestLogSchema> instead of LogRepository, updating generic parameters to match the new base class API while maintaining constructor call signature.
Schema Definition Refactoring
libs/application-generic/src/services/analytic-logs/request-log/request-log.schema.ts
Schema replaced with Zod + ClickHouse type mapping object using as const. Field definitions now pair { zod: ..., ch: ... } per field. Type inference changed from InferClickhouseSchemaType to InferZodChSchema, with ORDER_BY and TTL re-derived from the new schema structure.
New V2 Type System
libs/application-generic/src/services/analytic-logs/v2/clickhouse-types.ts, libs/application-generic/src/services/analytic-logs/v2/index.ts
New module introducing ZodChField, ZodChSchema, InferZodChSchema types and builder functions for ClickHouse primitives (chString, chUInt*, chFloat64, chBoolean, chDate, chDateTime64) and composite types (chLowCardinality, chNullable, chArray, chTuple). Barrel export in v2/index.ts.
New V2 Repository Implementation
libs/application-generic/src/services/analytic-logs/v2/log.repository.ts
New abstract LogRepositoryV2<TSchema> with typed query conditions, QueryBuilder fluent interface, automatic tenant/environment enforcement (with __unsafe escape hatch), feature-flag-driven batch insertion configuration, expiration date calculation, and find/findOne/count methods supporting column selection, ordering, and limit/offset.

Sequence Diagram

sequenceDiagram
    participant App as Application
    participant QB as QueryBuilder
    participant LogRepV2 as LogRepositoryV2
    participant CH as ClickHouse
    
    App->>QB: new QueryBuilder()
    App->>QB: whereEquals('status', 'active')<br/>.whereGreaterThan('count', 10)
    QB->>QB: Store typed conditions
    App->>LogRepV2: find(qb.build())
    LogRepV2->>LogRepV2: Validate conditions<br/>(operators, columns)
    LogRepV2->>LogRepV2: Add environment_id<br/>enforcement
    LogRepV2->>LogRepV2: Build WHERE clause<br/>with parameter map
    LogRepV2->>CH: Execute: SELECT ... WHERE<br/>environment_id = ? AND ...<br/>ORDER BY ... FINAL
    CH-->>LogRepV2: Result rows
    LogRepV2-->>App: Typed result[]
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested Labels

@novu/dal

Suggested Reviewers

  • scopsy
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title uses 'dal' scope which is not in the valid scopes list (api, dashboard, worker, ws, webhook, shared, dal, application-generic, js, react, react-native, nextjs, providers, framework, root). The valid scope for this change should be 'application-generic' since the changes are in libs/application-generic. Update the title to use 'application-generic' scope instead of 'dal': refactor(application-generic): update analytics repository to use zod for schema generation. Also note the typo 'analitycs' should be 'analytics'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
libs/application-generic/src/services/analytic-logs/request-log/request-log.repository.ts (1)

15-24: ⚠️ Potential issue | 🟠 Major

Pass the batch writer through this V2 migration.

After switching this repository to LogRepositoryV2, the constructor still only wires ClickHouseService. shouldUseBatching() will now always short-circuit on !this.batchService, so createMany() never reaches the feature-flagged batch path.

♻️ Suggested wiring
 import { ClickHouseService, InsertOptions } from '../clickhouse.service';
+import { ClickHouseBatchService } from '../clickhouse-batch.service';
 ...
   constructor(
     protected readonly clickhouseService: ClickHouseService,
+    protected readonly clickhouseBatchService: ClickHouseBatchService,
     protected readonly logger: PinoLogger,
     protected readonly featureFlagsService: FeatureFlagsService
   ) {
-    super(clickhouseService, logger, requestLogSchema, ORDER_BY, featureFlagsService);
+    super(clickhouseService, logger, requestLogSchema, ORDER_BY, featureFlagsService, clickhouseBatchService);

As per coding guidelines, "Use ClickHouseService for single queries/inserts and ClickHouseBatchService for high-throughput writes".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@libs/application-generic/src/services/analytic-logs/request-log/request-log.repository.ts`
around lines 15 - 24, The constructor of RequestLogRepository was not wired with
a ClickHouseBatchService so LogRepositoryV2's shouldUseBatching() short-circuits
(this.batchService is undefined) and createMany() never uses the batch path;
update the constructor signature to accept a protected readonly batchService:
ClickHouseBatchService (alongside ClickHouseService, PinoLogger,
FeatureFlagsService) and ensure you pass or assign that batchService into the
base class (LogRepositoryV2) / this.batchService so shouldUseBatching() and
createMany() can use batching; look for symbols RequestLogRepository,
constructor, LogRepositoryV2, shouldUseBatching(), createMany(), and
ClickHouseBatchService when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/application-generic/src/services/analytic-logs/v2/log.repository.ts`:
- Around line 51-53: EnforcedContext lost organization scoping; add
organizationId: string back to the EnforcedContext type and update
buildEnforcedConditions() to inject both environment_id and organization_id (map
to the expected ClickHouse parameters _environmentId and _organizationId or the
existing parameter naming used elsewhere) so all V2 safe-read queries include
both tenant boundaries; update any call sites that construct EnforcedContext to
supply organizationId accordingly.
- Around line 92-103: The new abstract class LogRepositoryV2 is defined as a
separate base instead of extending the shared LogRepository contract; update the
class so it either extends LogRepository or move the Zod-specific behavior into
the existing LogRepository. Concretely, change the declaration of
LogRepositoryV2 to extend LogRepository and keep its constructor signature
(clickhouseService, logger, schema: ZodChSchema, schemaOrderBy,
featureFlagsService, batchService) while delegating common fields/methods to the
parent; alternatively, fold schema/schemaOrderBy handling (ZodChSchema and
SchemaKeys) into the existing LogRepository implementation so all ClickHouse
repositories continue to inherit from LogRepository per guidelines. Ensure
references to LogRepositoryV2, LogRepository, schema, and schemaOrderBy are
updated accordingly.
- Around line 464-489: The findOne overloads (the generic version and the '*'
version) currently return result.data[0], which can be undefined on no-match and
violates their promised non-null return types; update both implementations (the
findOne method and the similar block around lines referenced for 492-523) to
explicitly handle empty results by either returning a nullable type (e.g.,
Promise<... | null>) or throwing a NotFound/Error when result.data.length === 0,
and update the function signatures to reflect the chosen contract; ensure you
adjust callers/types accordingly so the behaviors of findOne and the analogous
method are consistent and no longer return { data: undefined, rows: 0 } while
claiming a non-null row.

---

Outside diff comments:
In
`@libs/application-generic/src/services/analytic-logs/request-log/request-log.repository.ts`:
- Around line 15-24: The constructor of RequestLogRepository was not wired with
a ClickHouseBatchService so LogRepositoryV2's shouldUseBatching() short-circuits
(this.batchService is undefined) and createMany() never uses the batch path;
update the constructor signature to accept a protected readonly batchService:
ClickHouseBatchService (alongside ClickHouseService, PinoLogger,
FeatureFlagsService) and ensure you pass or assign that batchService into the
base class (LogRepositoryV2) / this.batchService so shouldUseBatching() and
createMany() can use batching; look for symbols RequestLogRepository,
constructor, LogRepositoryV2, shouldUseBatching(), createMany(), and
ClickHouseBatchService when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7fafc5b0-a2c1-48f1-97f9-db89b6397910

📥 Commits

Reviewing files that changed from the base of the PR and between ac5998c and 034a6b3.

📒 Files selected for processing (5)
  • libs/application-generic/src/services/analytic-logs/request-log/request-log.repository.ts
  • libs/application-generic/src/services/analytic-logs/request-log/request-log.schema.ts
  • libs/application-generic/src/services/analytic-logs/v2/clickhouse-types.ts
  • libs/application-generic/src/services/analytic-logs/v2/index.ts
  • libs/application-generic/src/services/analytic-logs/v2/log.repository.ts

Comment on lines +51 to +53
export type EnforcedContext = {
environmentId: string;
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Restore organization scoping in enforced queries.

EnforcedContext no longer carries organizationId, and buildEnforcedConditions() only injects environment_id. That means every safe read path in V2 is missing one tenant boundary.

As per coding guidelines, "Always include _environmentId and _organizationId in ClickHouse queries for tenant isolation".

Also applies to: 181-188

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/application-generic/src/services/analytic-logs/v2/log.repository.ts`
around lines 51 - 53, EnforcedContext lost organization scoping; add
organizationId: string back to the EnforcedContext type and update
buildEnforcedConditions() to inject both environment_id and organization_id (map
to the expected ClickHouse parameters _environmentId and _organizationId or the
existing parameter naming used elsewhere) so all V2 safe-read queries include
both tenant boundaries; update any call sites that construct EnforcedContext to
supply organizationId accordingly.

Comment on lines +92 to +103
export abstract class LogRepositoryV2<TSchema extends ZodChSchema> {
readonly table: string;
readonly identifierPrefix: string;

constructor(
protected readonly clickhouseService: ClickHouseService,
protected readonly logger: PinoLogger,
protected readonly schema: TSchema,
protected readonly schemaOrderBy: SchemaKeys<TSchema>[],
protected readonly featureFlagsService: FeatureFlagsService,
protected readonly batchService?: ClickHouseBatchService
) {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Keep the new base chained to LogRepository.

This starts a parallel analytic-logs repository hierarchy instead of extending the shared base contract the rest of the layer is expected to use. Please fold the Zod-specific behavior into LogRepository, or make LogRepositoryV2 extend it.

As per coding guidelines, "All ClickHouse repositories must extend LogRepository from libs/application-generic/src/services/analytic-logs/".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/application-generic/src/services/analytic-logs/v2/log.repository.ts`
around lines 92 - 103, The new abstract class LogRepositoryV2 is defined as a
separate base instead of extending the shared LogRepository contract; update the
class so it either extends LogRepository or move the Zod-specific behavior into
the existing LogRepository. Concretely, change the declaration of
LogRepositoryV2 to extend LogRepository and keep its constructor signature
(clickhouseService, logger, schema: ZodChSchema, schemaOrderBy,
featureFlagsService, batchService) while delegating common fields/methods to the
parent; alternatively, fold schema/schemaOrderBy handling (ZodChSchema and
SchemaKeys) into the existing LogRepository implementation so all ClickHouse
repositories continue to inherit from LogRepository per guidelines. Ensure
references to LogRepositoryV2, LogRepository, schema, and schemaOrderBy are
updated accordingly.

Comment on lines +464 to +489
async findOne<T extends readonly (keyof InferZodChSchema<TSchema>)[]>(options: {
where: Where<InferZodChSchema<TSchema>>;
limit?: number;
offset?: number;
orderBy?: SchemaKeys<TSchema>;
orderDirection?: 'ASC' | 'DESC';
useFinal?: boolean;
select: T;
}): Promise<{
data: Pick<InferZodChSchema<TSchema>, T[number]>;
rows: number;
}>;

// Overload for "*" all columns selection
async findOne(options: {
where: Where<InferZodChSchema<TSchema>>;
limit?: number;
offset?: number;
orderBy?: SchemaKeys<TSchema>;
orderDirection?: 'ASC' | 'DESC';
useFinal?: boolean;
select: '*';
}): Promise<{
data: InferZodChSchema<TSchema>;
rows: number;
}>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle the empty-result path in findOne.

Both branches return result.data[0], so a miss becomes { data: undefined, rows: 0 } even though the overloads promise a row. Make the contract explicit (null/undefined) or throw when nothing matches.

Also applies to: 492-523

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/application-generic/src/services/analytic-logs/v2/log.repository.ts`
around lines 464 - 489, The findOne overloads (the generic version and the '*'
version) currently return result.data[0], which can be undefined on no-match and
violates their promised non-null return types; update both implementations (the
findOne method and the similar block around lines referenced for 492-523) to
explicitly handle empty results by either returning a nullable type (e.g.,
Promise<... | null>) or throwing a NotFound/Error when result.data.length === 0,
and update the function signatures to reflect the chosen contract; ensure you
adjust callers/types accordingly so the behaviors of findOne and the analogous
method are consistent and no longer return { data: undefined, rows: 0 } while
claiming a non-null row.

@djabarovgeorge djabarovgeorge marked this pull request as draft April 5, 2026 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant