[CLEAN] Synthetic Benchmark PR #5893 - Refactor building the Query struct #85

tomerqodo · 2025-12-04T20:02:16Z

User description

Benchmark PR plausible#5893

Type: Clean (correct implementation)

Original PR Title: Refactor building the Query struct
Original PR Description: ### Changes

This PR is a refactor and doesn't introduce any new behaviour.

In the upcoming dashboard migration from React to LiveView, we want to be able to create Query structs from native data structures. E.g.:

pass the site struct itself into the Query.build function, not the site_id
pass DateTime structs instead of iso8601 strings
pass metrics as atoms instead of strings
etc...

The biggest refactor in this PR is changing the current Query.build function into Query.parse_and_build which calls two separate modules (QueryParser and QueryBuilder (new) for the respective actions).

It also introduces an intermediate data structure that represents the parsed state between those two phases - ParsedQueryParams.

There was a lot of query "building" logic and validations incorporated into QueryParser. The test file (with more than 3k lines of code) got turned into QueryParseAndBuildTest, now asserting on a Query struct outcome rather than an arbitrary map.

This PR doesn't tackle the date_range input aspect yet. For now, QueryBuilder will expect a clean utc_time_range input (via the ParsedQueryParams struct). It shouldn't block this PR though. I can tackle it later. E.g.: we'll likely want to build queries like:

query1 = QueryBuilder.build(site, %{metrics: [:visitors], date_range: :month, date: ~D[2025-01-01]})
query2 = QueryBuilder.build(site, %{metrics: [:visitors], date_range: [~D[2025-01-01], ~D[2025-01-31]]})

With that, we can also migrate any existing ad-hoc Query construction (e.g. in email reports or query_24h_stats) to use this new version of Query.build.

Tests

Automated tests have been added

Changelog

This PR does not make a user-facing change

Documentation

This change does not need a documentation update

Dark mode

This PR does not change the UI

Original PR URL: plausible#5893

PR Type

Enhancement, Tests

Description

Refactored Query.build to Query.parse_and_build implementing a two-phase architecture (parsing and building) for better separation of concerns
Created new QueryBuilder module to handle Query struct construction with comprehensive validation logic for metrics, dimensions, filters, and special constraints
Created new ParsedQueryParams struct as an intermediate data structure between parsing and building phases
Refactored QueryParser module (relocated from Plausible.Stats.Filters) to focus solely on parsing, removing validation and query building logic
Updated 20+ files across the codebase to use the new Query.parse_and_build API and relocated QueryParser
Added comprehensive test suite (3000+ lines) for Query.parse_and_build functionality covering metrics, filters, dimensions, date ranges, pagination, and special cases
Reorganized test modules under Plausible.Stats.Query namespace for better structure
Renamed StatsAPIFilterParser to LegacyStatsAPIFilterParser to clarify legacy API support

Diagram Walkthrough

flowchart LR
  Input["Input Parameters"]
  Parser["QueryParser.parse"]
  ParsedParams["ParsedQueryParams"]
  Builder["QueryBuilder.build"]
  Query["Query Struct"]
  
  Input -- "parse" --> Parser
  Parser -- "returns" --> ParsedParams
  ParsedParams -- "build" --> Builder
  Builder -- "returns" --> Query

File Walkthrough

Relevant files

Tests

6 files

query_parse_and_build_test.exs `Comprehensive test suite for Query parse and build` test/plausible/stats/query/query_parse_and_build_test.exs Added comprehensive test suite with 3000+ lines covering `Query.parse_and_build` functionality Tests validate metrics, filters, dimensions, date ranges, pagination, and include parameters Tests cover goal preloading, revenue metrics, behavioral filters, segments, and special metric validations Tests verify both public and internal API schemas with appropriate error handling	+3056/-0
query_optimizer_test.exs `Update QueryOptimizerTest module path and references` test/plausible/stats/query/query_optimizer_test.exs Renamed module from `Plausible.Stats.QueryOptimizerTest` to `Plausible.Stats.Query.QueryOptimizerTest` Updated imports to include `ParsedQueryParams` alias Replaced all `QueryParser.default_include()` calls with `ParsedQueryParams.default_include()` Removed unused `QueryParser` alias from test describe block	+14/-16
query_result_test.exs `Update QueryResultTest module path and function names` test/plausible/stats/query/query_result_test.exs Renamed module from `Plausible.Stats.QueryResultTest` to `Plausible.Stats.Query.QueryResultTest` Updated all `Query.build!` calls to `Query.parse_and_build!`	+4/-4
query_parser_test.exs `New QueryParserTest module with basic tests` test/plausible/stats/query/query_parser_test.exs New test module for `QueryParser` functionality Tests basic parsing validation including empty map rejection and invalid metric detection	+22/-0
query_from_test.exs `Update QueryFromTest module path and cleanup` test/plausible/stats/query/query_from_test.exs Renamed module from `Plausible.Stats.QueryTest` to `Plausible.Stats.Query.QueryFromTest` Removed unused `QueryParser` alias import	+1/-2
comparisons_test.exs `Update comparisons test function call` test/plausible/stats/comparisons_test.exs Updated `Query.build!` call to `Query.parse_and_build!` to match renamed function	+1/-1

Refactoring

1 files

query_parser.ex `Refactor QueryParser to focus on parsing only` lib/plausible/stats/query_parser.ex Refactored module name from `Plausible.Stats.Filters.QueryParser` to `Plausible.Stats.QueryParser` Simplified `parse` function to return `ParsedQueryParams` struct instead of full query map Removed query validation logic (moved to `QueryBuilder`) Removed goal preloading and consolidated site ID logic (moved to builder phase) Updated to use `ParsedQueryParams` defaults instead of local constants	+33/-339

Enhancement

17 files

send_email_report.ex `Update email report worker to use new Query API` lib/workers/send_email_report.ex Updated four function calls from `Query.build!` to `Query.parse_and_build!` Changes affect `stats_aggregates`, `pages`, `sources`, and `goals` functions Maintains same parameter structure and internal API usage	+4/-4
query_builder.ex `New QueryBuilder module for Query struct construction` lib/plausible/stats/query_builder.ex New module created to handle building `Query` structs from already-parsed parameters Implements comprehensive validation logic for metrics, dimensions, filters, and special query constraints Handles segment resolution, goal preloading, revenue metrics, and comparison time range setup Provides helper functions for time-on-page data and consolidated site ID retrieval	+363/-0
parsed_query_params.ex `New ParsedQueryParams struct for parsed query state` lib/plausible/stats/parsed_query_params.ex New module defining `ParsedQueryParams` struct as intermediate data structure between parsing and building phases Contains default configurations for `include` and `pagination` options Provides `new!/1` constructor for creating validated instances with required fields	+55/-0
query.ex `Refactor Query.build to parse_and_build with delegation` lib/plausible/stats/query.ex Renamed `build/4` function to `parse_and_build/4` to reflect two-phase architecture Renamed `build!/4` function to `parse_and_build!/4` for consistency Refactored to delegate to `QueryParser.parse` and `QueryBuilder.build` instead of inline logic Updated default `include` reference from `QueryParser.default_include()` to `ParsedQueryParams.default_include()` Updated documentation to reference new function names	+18/-39
legacy_query_builder.ex `Update legacy builder to use new QueryBuilder and QueryParser` lib/plausible/stats/legacy/legacy_query_builder.ex Updated function calls from `Query.put_comparison_utc_time_range()` to `QueryBuilder.put_comparison_utc_time_range()` Updated function calls from `Query.set_time_on_page_data()` to `QueryBuilder.set_time_on_page_data()` Updated `preload_goals_and_revenue` call to use `QueryBuilder` instead of `QueryParser` Refactored `parse_order_by` and `parse_include` to use new `QueryParser` module with improved error handling Simplified helper functions by removing `json_decode` and `unwrap` utilities	+24/-23
filters.ex `Update filter module aliases and legacy parser naming` lib/plausible/stats/filters/filters.ex Updated alias from `Plausible.Stats.Filters.QueryParser` to `Plausible.Stats.QueryParser` Renamed `StatsAPIFilterParser` to `LegacyStatsAPIFilterParser` in alias and usage	+3/-3
legacy_stats_api_filter_parser.ex `Rename and document legacy stats API filter parser` lib/plausible/stats/filters/legacy_stats_api_filter_parser.ex Renamed module from `Plausible.Stats.Filters.StatsAPIFilterParser` to `Plausible.Stats.Filters.LegacyStatsAPIFilterParser` Enhanced moduledoc to clarify this is for legacy Stats API v1 filter format	+4/-2
filters.ex `Update segment filters to use relocated QueryParser` lib/plausible/segments/filters.ex Updated alias to import `QueryParser` from `Plausible.Stats` instead of nested under `Filters` Updated function call from `Filters.QueryParser.parse_filters` to `QueryParser.parse_filters`	+2/-2
external_query_api_controller.ex `Update external query API controller function call` lib/plausible_web/controllers/api/external_query_api_controller.ex Updated `Query.build` call to `Query.parse_and_build` to match renamed function	+1/-1
segment.ex `Update segment module function call` lib/plausible/segments/segment.ex Updated `Query.build` call to `Query.parse_and_build` to match renamed function	+1/-1
page_rules.ex `Update page rules shield function call` lib/plausible_web/live/shields/page_rules.ex Updated `Query.build!` call to `Query.parse_and_build!` to match renamed function	+1/-1
hostname_rules.ex `Update hostname rules shield function call` lib/plausible_web/live/shields/hostname_rules.ex Updated `Query.build!` call to `Query.parse_and_build!` to match renamed function	+1/-1
form.ex `Update goal settings form function call` lib/plausible_web/live/goal_settings/form.ex Updated `Query.build!` call to `Query.parse_and_build!` to match renamed function	+1/-1
goal_suggestions.ex `Update goal suggestions function call` lib/plausible/stats/goal_suggestions.ex Updated `Query.build!` call to `Query.parse_and_build!` to match renamed function	+1/-1
consolidated_view.ex `Update consolidated view function calls` extra/lib/plausible/stats/consolidated_view.ex Updated two `Query.build!` calls to `Query.parse_and_build!` to match renamed function	+2/-2
form.ex `Update funnel settings form function call` extra/lib/plausible_web/live/funnel_settings/form.ex Updated `Query.build!` call to `Query.parse_and_build!` to match renamed function	+1/-1
traffic_change_notifier.ex `Update traffic change notifier function calls` lib/workers/traffic_change_notifier.ex Updated two `Query.build!` calls to `Query.parse_and_build!` to match renamed function	+2/-2

Additional files

1 files

query_parser_test.exs	+0/-2740

qodo-code-review · 2025-12-04T20:03:02Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢	No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
⚪	🎫 No ticket provided Create ticket/issue
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Meaningful Naming and Self-Documenting Code Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Robust Error Handling and Edge Case Management Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Security-First Input Validation and Data Handling Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
⚪	Generic: Comprehensive Audit Trails Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance. Status: No audit logs: The new parsing/building flow adds validation and building logic without any audit logging of critical actions or failures, and the diff does not show existing logging that would satisfy audit trail requirements. Referred Code def build(site, parsed_query_params, params, debug_metadata \\ %{}) do with {:ok, parsed_query_params} <- resolve_segments_in_filters(parsed_query_params, site), query = do_build(parsed_query_params, site, params, debug_metadata), :ok <- validate_order_by(query), :ok <- validate_custom_props_access(site, query), :ok <- validate_toplevel_only_filter_dimension(query), :ok <- validate_special_metrics_filters(query), :ok <- validate_behavioral_filters(query), :ok <- validate_filtered_goals_exist(query), :ok <- validate_revenue_metrics_access(site, query), :ok <- validate_metrics(query), :ok <- validate_include(query) do query = query \|> set_time_on_page_data(site) \|> put_comparison_utc_time_range() \|> Query.put_imported_opts(site) on_ee do # NOTE: The Query API schema does not allow the sample_threshold param # and it looks like it's not used as a parameter anymore. We might want ... (clipped 6 lines) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Error Handling Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. Status: Error detail surface: The builder and parser return detailed validation error strings (e.g., listing invalid dimensions or metrics) which might propagate to user-facing contexts; it is unclear from the diff whether these are sanitized at the API boundary. Referred Code defp validate_order_by(query) do if query.order_by do valid_values = query.metrics ++ query.dimensions invalid_entry = Enum.find(query.order_by, fn {value, _direction} -> not Enum.member?(valid_values, value) end) case invalid_entry do nil -> :ok _ -> {:error, "Invalid order_by entry '#{i(invalid_entry)}'. Entry is not a queried metric or dimension."} end else :ok end end ... (clipped 196 lines) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Logging Practices Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. Status: No log controls: The new code introduces parsing and building paths without adding or showing structured logging controls or scrubbing for potentially sensitive filter values, and the diff does not demonstrate existing logging safeguards. Referred Code def build(site, parsed_query_params, params, debug_metadata \\ %{}) do with {:ok, parsed_query_params} <- resolve_segments_in_filters(parsed_query_params, site), query = do_build(parsed_query_params, site, params, debug_metadata), :ok <- validate_order_by(query), :ok <- validate_custom_props_access(site, query), :ok <- validate_toplevel_only_filter_dimension(query), :ok <- validate_special_metrics_filters(query), :ok <- validate_behavioral_filters(query), :ok <- validate_filtered_goals_exist(query), :ok <- validate_revenue_metrics_access(site, query), :ok <- validate_metrics(query), :ok <- validate_include(query) do query = query \|> set_time_on_page_data(site) \|> put_comparison_utc_time_range() \|> Query.put_imported_opts(site) on_ee do # NOTE: The Query API schema does not allow the sample_threshold param # and it looks like it's not used as a parameter anymore. We might want ... (clipped 6 lines) Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review · 2025-12-04T20:04:16Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
High-level	Refine the new query building API Refine the `QueryBuilder.build` function to remove its dependency on raw, untyped parameter maps. Instead, it should rely solely on the new typed `ParsedQueryParams` struct for its inputs. Examples: lib/plausible/stats/query_builder.ex [10-36] def build(site, parsed_query_params, params, debug_metadata \\ %{}) do with {:ok, parsed_query_params} <- resolve_segments_in_filters(parsed_query_params, site), query = do_build(parsed_query_params, site, params, debug_metadata), :ok <- validate_order_by(query), :ok <- validate_custom_props_access(site, query), :ok <- validate_toplevel_only_filter_dimension(query), :ok <- validate_special_metrics_filters(query), :ok <- validate_behavioral_filters(query), :ok <- validate_filtered_goals_exist(query), :ok <- validate_revenue_metrics_access(site, query), ... (clipped 17 lines) lib/plausible/stats/query.ex [50-59] def parse_and_build( %Plausible.Site{domain: domain} = site, schema_type, %{"site_id" => domain} = params, debug_metadata \\ %{} ) do with {:ok, %ParsedQueryParams{} = parsed_query_params} <- QueryParser.parse(site, schema_type, params) do QueryBuilder.build(site, parsed_query_params, params, debug_metadata) end Solution Walkthrough: Before: # lib/plausible/stats/query.ex def parse_and_build(site, schema_type, params, ...) do with {:ok, parsed_query_params} <- QueryParser.parse(site, schema_type, params) do # The raw `params` map is passed to the builder QueryBuilder.build(site, parsed_query_params, params, ...) end end # lib/plausible/stats/query_builder.ex def build(site, parsed_query_params, params, ...) do # ... # The builder uses the raw `params` map query = do_build(parsed_query_params, site, params, ...) # ... end After: # lib/plausible/stats/parsed_query_params.ex defstruct [ # ... :input_date_range ] # lib/plausible/stats/query_parser.ex def parse(site, schema_type, params, ...) do # ... # The parser now populates `input_date_range` Plausible.Stats.ParsedQueryParams.new!(%{ # ... input_date_range: Map.get(params, "date_range") }) end # lib/plausible/stats/query.ex def parse_and_build(site, schema_type, params, ...) do with {:ok, parsed_query_params} <- QueryParser.parse(site, schema_type, params) do # The raw `params` map is no longer needed QueryBuilder.build(site, parsed_query_params, ...) end end Suggestion importance[1-10]: 7 __ Why: The suggestion correctly identifies a design flaw where `QueryBuilder.build` still depends on the raw `params` map, which contradicts the PR's goal of separating parsing and building logic.	Medium
Possible issue	Return empty list for nil filters Modify `parse_filters/1` to return `{:ok, []}` instead of `{:ok, nil}` when the input is `nil`. lib/plausible/stats/query_parser.ex [60] -defp parse_filters(nil), do: {:ok, nil} +defp parse_filters(nil), do: {:ok, []} Apply / Chat Suggestion importance[1-10]: 7 __ Why: The suggestion correctly points out that returning `[]` instead of `nil` for missing filters aligns better with the expected data structure and test assertions, which expect `filters: []`. This improves consistency and robustness by ensuring the `filters` field is always a list.	Medium
	Return empty list for nil dimensions Modify `parse_dimensions/1` to return `{:ok, []}` instead of `{:ok, nil}` when the input is `nil`. lib/plausible/stats/query_parser.ex [258] -defp parse_dimensions(nil), do: {:ok, nil} +defp parse_dimensions(nil), do: {:ok, []} Apply / Chat Suggestion importance[1-10]: 7 __ Why: The suggestion correctly identifies that returning `[]` instead of `nil` for missing dimensions is more consistent with the expected data type and test cases, which assert `dimensions: []`. This change improves code robustness by ensuring the `dimensions` field is always a list.	Medium
	Return default pagination for nil input Modify `parse_pagination/1` to return the default pagination map, `Plausible.Stats.ParsedQueryParams.default_pagination()`, instead of `nil` when the input is `nil`. lib/plausible/stats/query_parser.ex [345] -defp parse_pagination(nil), do: {:ok, nil} +defp parse_pagination(nil), do: {:ok, Plausible.Stats.ParsedQueryParams.default_pagination()} Apply / Chat Suggestion importance[1-10]: 7 __ Why: The suggestion correctly observes that returning a default pagination map instead of `nil` aligns with the test expectations and ensures the `pagination` field is always a map. This improves consistency and prevents potential `nil` value issues downstream.	Medium
General	Improve error message for invalid filters Improve the error message for invalid filter dimensions to list all invalid dimensions at once, rather than just the first one found. lib/plausible/stats/query_builder.ex [164-169] -if Enum.count(not_toplevel) > 0 do +if Enum.any?(not_toplevel) do + invalid_dims = Enum.join(not_toplevel, ", ") + {:error, - "Invalid filters. Dimension `#{List.first(not_toplevel)}` can only be filtered at the top level."} + "Invalid filters. Dimensions `#{invalid_dims}` can only be filtered at the top level."} else :ok end Apply / Chat Suggestion importance[1-10]: 5 __ Why: The suggestion improves the user experience by providing a more comprehensive error message, which is a valuable quality-of-life enhancement.	Low
General	Improve validation for required parameters Refactor the `metrics` validation in `new!/1` to use an `if` condition and an explicit `raise` for an empty list, instead of pattern matching. lib/plausible/stats/parsed_query_params.ex [40-54] def new!(params) when is_map(params) do %DateTimeRange{} = utc_time_range = Map.fetch!(params, :utc_time_range) - [_ \| _] = metrics = Map.fetch!(params, :metrics) + metrics = Map.fetch!(params, :metrics) + + if metrics == [] do + raise "ParsedQueryParams `metrics` cannot be empty" + end %__MODULE__{ now: params[:now], utc_time_range: utc_time_range, metrics: metrics, filters: params[:filters] \|\| [], dimensions: params[:dimensions] \|\| [], order_by: params[:order_by], pagination: Map.merge(@default_pagination, params[:pagination] \|\| %{}), include: Map.merge(@default_include, params[:include] \|\| %{}) } end Apply / Chat Suggestion importance[1-10]: 2 __ Why: The suggestion proposes a stylistic change that is subjective; the existing code is a common and valid Elixir idiom for asserting a non-empty list.	Low
More

Apply changes for benchmark PR

e7544ce

qodo-code-review bot added the Review effort 3/5 label Dec 4, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[CLEAN] Synthetic Benchmark PR #5893 - Refactor building the Query struct #85

[CLEAN] Synthetic Benchmark PR #5893 - Refactor building the Query struct #85

Uh oh!

tomerqodo commented Dec 4, 2025 •

edited by qodo-code-review bot

Loading

Uh oh!

qodo-code-review bot commented Dec 4, 2025

Uh oh!

qodo-code-review bot commented Dec 4, 2025

Examples:

Solution Walkthrough:

Before:

After:

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

[CLEAN] Synthetic Benchmark PR #5893 - Refactor building the Query struct #85

Are you sure you want to change the base?

[CLEAN] Synthetic Benchmark PR #5893 - Refactor building the Query struct #85

Uh oh!

Conversation

tomerqodo commented Dec 4, 2025 • edited by qodo-code-review bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

Benchmark PR plausible#5893

Tests

Changelog

Documentation

Dark mode

PR Type

Description

Diagram Walkthrough

File Walkthrough

Uh oh!

qodo-code-review bot commented Dec 4, 2025

PR Compliance Guide 🔍

Uh oh!

qodo-code-review bot commented Dec 4, 2025

PR Code Suggestions ✨

Examples:

Solution Walkthrough:

Before:

After:

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tomerqodo commented Dec 4, 2025 •

edited by qodo-code-review bot

Loading