Skip to content

Conversation

@tomerqodo
Copy link

@tomerqodo tomerqodo commented Dec 4, 2025

User description

Benchmark PR plausible#5893

Type: Corrupted (contains bugs)

Original PR Title: Refactor building the Query struct
Original PR Description: ### Changes

This PR is a refactor and doesn't introduce any new behaviour.

In the upcoming dashboard migration from React to LiveView, we want to be able to create Query structs from native data structures. E.g.:

  • pass the site struct itself into the Query.build function, not the site_id
  • pass DateTime structs instead of iso8601 strings
  • pass metrics as atoms instead of strings
  • etc...

The biggest refactor in this PR is changing the current Query.build function into Query.parse_and_build which calls two separate modules (QueryParser and QueryBuilder (new) for the respective actions).

It also introduces an intermediate data structure that represents the parsed state between those two phases - ParsedQueryParams.

There was a lot of query "building" logic and validations incorporated into QueryParser. The test file (with more than 3k lines of code) got turned into QueryParseAndBuildTest, now asserting on a Query struct outcome rather than an arbitrary map.

This PR doesn't tackle the date_range input aspect yet. For now, QueryBuilder will expect a clean utc_time_range input (via the ParsedQueryParams struct). It shouldn't block this PR though. I can tackle it later. E.g.: we'll likely want to build queries like:

query1 = QueryBuilder.build(site, %{metrics: [:visitors], date_range: :month, date: ~D[2025-01-01]})
query2 = QueryBuilder.build(site, %{metrics: [:visitors], date_range: [~D[2025-01-01], ~D[2025-01-31]]})

With that, we can also migrate any existing ad-hoc Query construction (e.g. in email reports or query_24h_stats) to use this new version of Query.build.

Tests

  • Automated tests have been added

Changelog

  • This PR does not make a user-facing change

Documentation

  • This change does not need a documentation update

Dark mode

  • This PR does not change the UI

Original PR URL: plausible#5893


PR Type

Enhancement, Tests


Description

  • Refactored Query.build into a two-phase architecture with Query.parse_and_build, separating parsing and building concerns

  • Created new QueryParser module (moved from Plausible.Stats.Filters.QueryParser) focused solely on parsing and basic validation

  • Created new QueryBuilder module to handle Query struct construction with comprehensive validation logic, goal preloading, and revenue metrics handling

  • Introduced ParsedQueryParams struct as an intermediate data structure representing the parsed state between parsing and building phases

  • Added comprehensive test suite (3000+ lines) in QueryParseAndBuildTest validating metrics, filters, dimensions, date ranges, pagination, and special metrics

  • Updated 20+ call sites across the codebase to use the new Query.parse_and_build API instead of Query.build

  • Reorganized test files into test/plausible/stats/query/ directory structure for better organization

  • Renamed StatsAPIFilterParser to LegacyStatsAPIFilterParser for clarity on legacy API support


Diagram Walkthrough

flowchart LR
  Input["Input Parameters<br/>site, params, etc."] -->|parse| QueryParser["QueryParser<br/>parse/1"]
  QueryParser -->|returns| ParsedQueryParams["ParsedQueryParams<br/>struct"]
  ParsedQueryParams -->|build| QueryBuilder["QueryBuilder<br/>build/1"]
  QueryBuilder -->|returns| Query["Query<br/>struct"]
  Query -->|used by| Consumers["Stats API<br/>Email Reports<br/>Segments<br/>etc."]
Loading

File Walkthrough

Relevant files
Tests
6 files
query_parse_and_build_test.exs
Comprehensive test suite for Query.parse_and_build refactor

test/plausible/stats/query/query_parse_and_build_test.exs

  • Added comprehensive test suite with 3000+ lines covering the new
    Query.parse_and_build function
  • Tests validate metrics, filters, dimensions, date ranges, pagination,
    and include parameters
  • Tests cover special metrics like conversion_rate, exit_rate,
    scroll_depth, time_on_page, and revenue metrics
  • Tests verify goal preloading, segment resolution, and access control
    for custom properties and revenue features
+3056/-0
query_optimizer_test.exs
Update QueryOptimizerTest module path and references         

test/plausible/stats/query/query_optimizer_test.exs

  • Updated module name from Plausible.Stats.QueryOptimizerTest to
    Plausible.Stats.Query.QueryOptimizerTest
  • Added ParsedQueryParams to module aliases
  • Replaced all QueryParser.default_include() calls with
    ParsedQueryParams.default_include()
  • Removed unused QueryParser alias from test describe block
+14/-16 
query_result_test.exs
Update QueryResultTest module path and function calls       

test/plausible/stats/query/query_result_test.exs

  • Updated module name from Plausible.Stats.QueryResultTest to
    Plausible.Stats.Query.QueryResultTest
  • Updated all Query.build!() calls to Query.parse_and_build!()
+4/-4     
query_parser_test.exs
New QueryParserTest for query parsing validation                 

test/plausible/stats/query/query_parser_test.exs

  • New test file created for QueryParser module
  • Tests basic parsing validation including empty map and invalid metric
    scenarios
  • Imports QueryParser functions for testing
+22/-0   
query_from_test.exs
Update QueryFromTest module path and cleanup                         

test/plausible/stats/query/query_from_test.exs

  • Updated module name from Plausible.Stats.QueryTest to
    Plausible.Stats.Query.QueryFromTest
  • Removed unused QueryParser alias import
+1/-2     
comparisons_test.exs
Update comparisons test function call                                       

test/plausible/stats/comparisons_test.exs

  • Updated Query.build!() call to Query.parse_and_build!() in
    build_comparison_query/2 helper function
+1/-1     
Enhancement
18 files
query_parser.ex
Refactor QueryParser to return ParsedQueryParams struct   

lib/plausible/stats/query_parser.ex

  • Renamed module from Plausible.Stats.Filters.QueryParser to
    Plausible.Stats.QueryParser
  • Refactored parse function to return ParsedQueryParams struct instead
    of a full query map
  • Removed query validation logic and goal/revenue preloading from parser
    (moved to builder)
  • Simplified parser to focus only on parsing and basic validation of
    input parameters
  • Updated parse_include and parse_pagination to use defaults from
    ParsedQueryParams
+33/-339
send_email_report.ex
Update email report worker to use new parse_and_build API

lib/workers/send_email_report.ex

  • Updated function calls from Query.build! to Query.parse_and_build! in
    four locations
  • Changes affect stats_aggregates, pages, sources, and goals functions
  • Maintains same parameter structure and internal API usage
+4/-4     
query_builder.ex
New QueryBuilder module for Query struct construction       

lib/plausible/stats/query_builder.ex

  • New module created to handle building Query structs from
    already-parsed ParsedQueryParams
  • Implements comprehensive validation logic for metrics, dimensions,
    filters, and special query constraints
  • Handles goal preloading, revenue metrics, and comparison time range
    calculations
  • Provides helper functions for time-on-page data and consolidated site
    ID resolution
+363/-0 
parsed_query_params.ex
New ParsedQueryParams struct for parsed query state           

lib/plausible/stats/parsed_query_params.ex

  • New module defining ParsedQueryParams struct as intermediate data
    structure between parsing and building phases
  • Contains default configurations for include and pagination options
  • Provides new!/1 constructor for creating validated ParsedQueryParams
    instances
+55/-0   
query.ex
Refactor Query.build to parse_and_build with new architecture

lib/plausible/stats/query.ex

  • Renamed build/4 function to parse_and_build/4 to reflect two-phase
    architecture
  • Renamed build!/4 function to parse_and_build!/4 for consistency
  • Updated to use new QueryParser and QueryBuilder modules instead of
    inline logic
  • Updated default include reference from QueryParser.default_include()
    to ParsedQueryParams.default_include()
  • Added new module aliases for QueryParser, ParsedQueryParams, and
    QueryBuilder
+18/-39 
legacy_query_builder.ex
Update legacy query builder to use new modules                     

lib/plausible/stats/legacy/legacy_query_builder.ex

  • Updated to use QueryBuilder.put_comparison_utc_time_range() instead of
    Query.put_comparison_utc_time_range()
  • Updated to use QueryBuilder.set_time_on_page_data() instead of
    Query.set_time_on_page_data()
  • Updated to use QueryBuilder.preload_goals_and_revenue() instead of
    QueryParser.preload_goals_and_revenue()
  • Refactored parse_order_by/1 and parse_include/2 functions to use new
    QueryParser module with improved error handling
+24/-23 
filters.ex
Update filter module aliases and naming                                   

lib/plausible/stats/filters/filters.ex

  • Updated alias from Plausible.Stats.Filters.QueryParser to
    Plausible.Stats.QueryParser
  • Renamed StatsAPIFilterParser alias to LegacyStatsAPIFilterParser for
    clarity
  • Updated filter parsing error handling to use renamed module
+3/-3     
legacy_stats_api_filter_parser.ex
Rename StatsAPIFilterParser to LegacyStatsAPIFilterParser

lib/plausible/stats/filters/legacy_stats_api_filter_parser.ex

  • Renamed module from Plausible.Stats.Filters.StatsAPIFilterParser to
    Plausible.Stats.Filters.LegacyStatsAPIFilterParser
  • Updated moduledoc to clarify this is for legacy Stats API v1 filter
    format
+4/-2     
external_query_api_controller.ex
Update external query API controller function call             

lib/plausible_web/controllers/api/external_query_api_controller.ex

  • Updated Query.build() call to Query.parse_and_build() to use renamed
    function
+1/-1     
filters.ex
Update segment filters to use QueryParser module                 

lib/plausible/segments/filters.ex

  • Added QueryParser to module aliases
  • Updated Filters.QueryParser.parse_filters() call to
    QueryParser.parse_filters()
+2/-2     
segment.ex
Update segment module function call                                           

lib/plausible/segments/segment.ex

  • Updated Query.build() call to Query.parse_and_build() to use renamed
    function
+1/-1     
goal_suggestions.ex
Update goal suggestions function call                                       

lib/plausible/stats/goal_suggestions.ex

  • Updated Query.build!() call to Query.parse_and_build!() to use renamed
    function
+1/-1     
traffic_change_notifier.ex
Update traffic change notifier function calls                       

lib/workers/traffic_change_notifier.ex

  • Updated two Query.build!() calls to Query.parse_and_build!() in
    put_sources/2 and put_pages/2 functions
+2/-2     
page_rules.ex
Update page rules shield function call                                     

lib/plausible_web/live/shields/page_rules.ex

  • Updated Query.build!() call to Query.parse_and_build!() in
    suggest_page_paths/4 function
+1/-1     
hostname_rules.ex
Update hostname rules shield function call                             

lib/plausible_web/live/shields/hostname_rules.ex

  • Updated Query.build!() call to Query.parse_and_build!() in
    suggest_hostnames/3 function
+1/-1     
form.ex
Update goal settings form function call                                   

lib/plausible_web/live/goal_settings/form.ex

  • Updated Query.build!() call to Query.parse_and_build!() in
    suggest_page_paths/2 function
+1/-1     
consolidated_view.ex
Update consolidated view function calls                                   

extra/lib/plausible/stats/consolidated_view.ex

  • Updated two Query.build!() calls to Query.parse_and_build!() in
    consolidated view query functions
+2/-2     
form.ex
Update funnel settings form function call                               

extra/lib/plausible_web/live/funnel_settings/form.ex

  • Updated Query.build!() call to Query.parse_and_build!() in funnel
    settings form
+1/-1     
Additional files
1 files
query_parser_test.exs +0/-2740

@qodo-code-review
Copy link

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logs: The new parsing/building flow handles external query requests without adding or modifying
any audit logging for critical actions, but the diff does not show whether such logging
exists elsewhere.

Referred Code
def query(conn, params) do
  site = Repo.preload(conn.assigns.site, :owners)

  case Query.parse_and_build(site, conn.assigns.schema_type, params, debug_metadata(conn)) do
    {:ok, query} ->
      results = Plausible.Stats.query(site, query)
      json(conn, results)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Error propagation: The builder returns generic {:error, reason} tuples from multiple validations but does not
add contextual logging or mapping for user-facing vs internal errors; adequacy cannot be
fully determined from the diff.

Referred Code
def build(site, parsed_query_params, params, debug_metadata \\ %{}) do
  with {:ok, parsed_query_params} <- resolve_segments_in_filters(parsed_query_params, site),
       query = do_build(parsed_query_params, site, params, debug_metadata),
       :ok <- validate_order_by(query),
       :ok <- validate_custom_props_access(site, query),
       :ok <- validate_toplevel_only_filter_dimension(query),
       :ok <- validate_special_metrics_filters(query),
       :ok <- validate_behavioral_filters(query),
       :ok <- validate_filtered_goals_exist(query),
       :ok <- validate_revenue_metrics_access(site, query),
       :ok <- validate_metrics(query),
       :ok <- validate_include(query) do
    query =
      query
      |> set_time_on_page_data(site)
      |> put_comparison_utc_time_range()
      |> Query.put_imported_opts(site)

    on_ee do
      # NOTE: The Query API schema does not allow the sample_threshold param
      # and it looks like it's not used as a parameter anymore. We might want


 ... (clipped 7 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Raised errors content: The parse_and_build!/4 raises with "Failed to build query: #{inspect(reason)}"
which may expose internal validation messages to callers; whether these reach end users
depends on upstream handling.

Referred Code
def parse_and_build!(site, schema_type, params, debug_metadata \\ %{}) do
  case parse_and_build(site, schema_type, params, debug_metadata) do
    {:ok, query} -> query
    {:error, reason} -> raise "Failed to build query: #{inspect(reason)}"
  end

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Refactoring is incomplete for its goal

The new QueryBuilder module should be updated to accept native Elixir types
directly, instead of only accepting a pre-parsed ParsedQueryParams struct. This
change would align the implementation with the PR's main goal and simplify its
use for internal callers.

Examples:

lib/plausible/stats/query_builder.ex [10-36]
  def build(site, parsed_query_params, params, debug_metadata \\ %{}) do
    with {:ok, parsed_query_params} <- resolve_segments_in_filters(parsed_query_params, site),
         query = do_build(parsed_query_params, site, params, debug_metadata),
         :ok <- validate_order_by(query),
         :ok <- validate_custom_props_access(site, query),
         :ok <- validate_toplevel_only_filter_dimension(query),
         :ok <- validate_special_metrics_filters(query),
         :ok <- validate_behavioral_filters(query),
         :ok <- validate_filtered_goals_exist(query),
         :ok <- validate_revenue_metrics_access(site, query),

 ... (clipped 17 lines)

Solution Walkthrough:

Before:

# lib/plausible/stats/query_builder.ex
defmodule Plausible.Stats.QueryBuilder do
  # The public interface only accepts ParsedQueryParams, not native types.
  # It expects a pre-calculated `utc_time_range`.
  def build(site, %ParsedQueryParams{} = parsed_query_params, params) do
    # ... logic to build Query from parsed params
  end
end

# An internal caller cannot do this:
# QueryBuilder.build(site, %{metrics: [:visitors], date_range: :month})

# They must use the full parse-and-build flow with raw params:
# Query.parse_and_build(site, :internal, %{"metrics" => ["visitors"], "date_range" => "month"})

After:

# lib/plausible/stats/query_builder.ex
defmodule Plausible.Stats.QueryBuilder do
  # New public function that accepts native Elixir types
  def build(site, native_params) when is_map(native_params) do
    # 1. Parse native params (e.g., date_range: :month) into a ParsedQueryParams struct
    parsed_params = parse_native_params(site, native_params)

    # 2. Call the existing build logic
    build(site, parsed_params, %{})
  end

  # Existing function becomes private or handles both cases
  def build(site, %ParsedQueryParams{} = parsed_query_params, params) do
    # ... existing logic
  end
end
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical design gap where the new QueryBuilder does not fulfill the PR's primary goal of accepting native Elixir types, forcing callers to perform parsing logic themselves.

High
Possible issue
Fix incorrect goal filter parsing

Wrap the clauses variable with List.wrap/1 to correctly handle both
single-string and list-of-strings goal filters during validation.

lib/plausible/stats/query_builder.ex [222-228]

 goal_filter_clauses =
   query.filters
   |> Filters.all_leaf_filters()
   |> Enum.flat_map(fn
-    [:is, "event:goal", clauses] -> clauses
+    [:is, "event:goal", clauses] -> List.wrap(clauses)
     _ -> []
   end)
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a bug where a single goal filter provided as a string would be incorrectly processed as a list of characters, causing validation to fail for valid queries.

Medium
General
Return nil on parsing failure

In parse_order_by/1, return nil instead of an empty list ([]) in the else block
to handle parsing failures consistently.

lib/plausible/stats/legacy/legacy_query_builder.ex [271-279]

 def parse_order_by(order_by) do
   with true <- is_binary(order_by),
        {:ok, order_by} <- JSON.decode(order_by),
        {:ok, order_by} <- QueryParser.parse_order_by(order_by) do
     order_by
   else
-    _ -> []
+    _ -> nil
   end
 end
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly points out that returning nil instead of [] on parsing failure is more consistent with the Query struct's defaults, improving code clarity and predictability.

Low
Explicitly handle optional parameter access

In new!/1, use Map.get(params, :order_by) instead of params[:order_by] to
explicitly fetch the optional :order_by parameter for improved code clarity.

lib/plausible/stats/parsed_query_params.ex [40-54]

 def new!(params) when is_map(params) do
   %DateTimeRange{} = utc_time_range = Map.fetch!(params, :utc_time_range)
   [_ | _] = metrics = Map.fetch!(params, :metrics)
 
   %__MODULE__{
     now: params[:now],
     utc_time_range: utc_time_range,
     metrics: metrics,
     filters: params[:filters] || [],
     dimensions: params[:dimensions] || [],
-    order_by: params[:order_by],
+    order_by: Map.get(params, :order_by),
     pagination: Map.merge(@default_pagination, params[:pagination] || %{}),
     include: Map.merge(@default_include, params[:include] || %{})
   }
 end
  • Apply / Chat
Suggestion importance[1-10]: 2

__

Why: The suggestion proposes using Map.get/2 for clarity, which is a minor improvement in code style but does not change the logic, as params[:order_by] already correctly returns nil if the key is absent.

Low
  • More

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants