Merged
Changes from 28 commits
@@ -871,15 +871,20 @@ def start_webui(
# Read, update, and write back client's and server's settings.json
clp_db_connection_params = clp_config.database.get_clp_connection_params_and_type(True)
table_prefix = clp_db_connection_params["table_prefix"]
dataset: Optional[str] = None
if StorageEngine.CLP_S == clp_config.package.storage_engine:
dataset = CLP_DEFAULT_DATASET_NAME
archives_table_name = ""
files_table_name = ""
else:
Comment on lines 874 to +877
Contributor:

We may want to pass null values in the future, but this would require changes in settings.json's typing and proper null value handling, instead of relying on the implicit conversion ${null} == "null".

@coderabbitai can you create an issue on this?

Contributor:
🧩 Analysis chain

I'll create a GitHub issue to track this improvement for proper null value handling in settings.json.


🏁 Script executed:

#!/bin/bash
# Create GitHub issue for null value handling improvement
gh issue create \
  --title "Improve null value handling in settings.json instead of implicit string conversion" \
  --body "## Problem

Currently, the codebase relies on implicit conversion where \`\${null} == \"null\"\` when handling null values in settings.json. This approach is not robust and may cause issues in the future.

## Context

In PR #1042, the code was changed to set \`archives_table_name\` and \`files_table_name\` to empty strings for CLP_S storage engine instead of using null values. While this works for the current implementation, it highlights the need for proper null value handling.

## Proposed Solution

- Update settings.json typing to properly handle null values
- Implement proper null value handling instead of relying on implicit string conversion
- Consider how null values should be properly serialized and deserialized

## References

- PR: https://github.com/y-scope/clp/pull/1042
- Comment: https://github.com/y-scope/clp/pull/1042#discussion_r2195280156
- Requested by: hoophalab" \
  --assignee hoophalab



@hoophalab I've created issue #1087 to track this improvement: #1087

The issue documents the concern about the current implicit conversion approach where ${null} == "null" and proposes implementing proper null value handling in settings.json typing instead of relying on empty strings as a workaround. This will help ensure more robust handling of null values in the future.
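One possible direction for the typing change discussed above is to make nullable fields explicit in the settings' TypeScript types and to normalize the legacy sentinel values at parse time. The sketch below is illustrative only, not CLP's actual implementation; `parseTableName` and the narrowed `ServerSettings` interface are hypothetical names.

```typescript
// Sketch: explicit nullable typing for settings.json fields, instead of
// relying on the implicit string conversion `${null} === "null"`.
// The field names mirror the PR's settings keys; the types are hypothetical.
interface ServerSettings {
    SqlDbClpArchivesTableName: string | null;
    SqlDbClpFilesTableName: string | null;
}

// Parse a raw settings value, normalizing the legacy "null" sentinel string
// (and the empty-string workaround from this PR) to a real null.
function parseTableName(raw: unknown): string | null {
    if (raw === null || raw === "null" || raw === "") {
        return null;
    }
    if (typeof raw !== "string") {
        throw new TypeError(`Expected string or null, got ${typeof raw}`);
    }
    return raw;
}

const settings: ServerSettings = {
    SqlDbClpArchivesTableName: parseTableName("clp_archives"),
    SqlDbClpFilesTableName: parseTableName("null"),
};
```

With this shape, downstream code can branch on a genuine `null` rather than comparing against the string `"null"` or an empty string.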


🧬 Code Graph Analysis Results

Relevant snippets from components/clp-package-utils/clp_package_utils/scripts/start_clp.py (entire file)

  • The file defines a CLI script to start various CLP components as Docker containers.
  • It uses a CLPConfig object loaded from a YAML config file and validates credentials and configuration.
  • Components include DB, Queue, Redis, Results Cache, Compression Scheduler, Query Scheduler, Workers, Reducer, and WebUI.
  • Each component has a dedicated start_<component> function that:
    • Checks if the container already exists or is running.
    • Validates configuration and directories.
    • Sets up Docker mounts and environment variables.
    • Constructs and runs a docker run command to start the container.
  • The start_webui function reads and updates two settings.json files (client and server) with configuration values before starting the WebUI container.
  • The update_settings_object function recursively updates a nested dictionary with new values, raising an error if keys are invalid.
  • The read_and_update_settings_json function reads a JSON file, applies updates via update_settings_object, and returns the updated dictionary.
  • The main function parses CLI arguments, loads and validates the config, generates container config and mounts, and starts the requested components.
  • The script handles environment variables and mounts for AWS authentication, including special handling for S3 storage and AWS credentials.
  • The user comment about passing null values and handling them properly in settings.json relates to the start_webui function where settings JSON files are updated with values that may currently rely on implicit conversion of null to string "null".

Relevant snippet from components/clp-package-utils/clp_package_utils/general.py (lines 216-285)

def generate_container_config(
    clp_config: CLPConfig, clp_home: pathlib.Path
) -> Tuple[CLPConfig, CLPDockerMounts]:
    """
    Copies the given config and sets up mounts mapping the relevant host paths into the container

    :param clp_config:
    :param clp_home:
    :return: The container config and the mounts.
    """
    container_clp_config = clp_config.copy(deep=True)

    docker_mounts = CLPDockerMounts(clp_home, CONTAINER_CLP_HOME)

    if StorageType.FS == clp_config.logs_input.type:
        input_logs_dir = clp_config.logs_input.directory.resolve()
        container_clp_config.logs_input.directory = (
            CONTAINER_INPUT_LOGS_ROOT_DIR / input_logs_dir.relative_to(input_logs_dir.anchor)
        )
        docker_mounts.input_logs_dir = DockerMount(
            DockerMountType.BIND, input_logs_dir, container_clp_config.logs_input.directory, True
        )

    if not is_path_already_mounted(
        clp_home, CONTAINER_CLP_HOME, clp_config.data_directory, container_clp_config.data_directory
    ):
        docker_mounts.data_dir = DockerMount(
            DockerMountType.BIND, clp_config.data_directory, container_clp_config.data_directory
        )

    if not is_path_already_mounted(
        clp_home, CONTAINER_CLP_HOME, clp_config.logs_directory, container_clp_config.logs_directory
    ):
        docker_mounts.logs_dir = DockerMount(
            DockerMountType.BIND, clp_config.logs_directory, container_clp_config.logs_directory
        )

    if not is_path_already_mounted(
        clp_home,
        CONTAINER_CLP_HOME,
        clp_config.archive_output.get_directory(),
        container_clp_config.archive_output.get_directory(),
    ):
        docker_mounts.archives_output_dir = DockerMount(
            DockerMountType.BIND,
            clp_config.archive_output.get_directory(),
            container_clp_config.archive_output.get_directory(),
        )

    if not is_path_already_mounted(
        clp_home,
        CONTAINER_CLP_HOME,
        clp_config.stream_output.get_directory(),
        container_clp_config.stream_output.get_directory(),
    ):
        docker_mounts.stream_output_dir = DockerMount(
            DockerMountType.BIND,
            clp_config.stream_output.get_directory(),
            container_clp_config.stream_output.get_directory(),
        )

    # Only create the mount if the directory exists
    if clp_config.aws_config_directory is not None:
        container_clp_config.aws_config_directory = CONTAINER_AWS_CONFIG_DIRECTORY
        docker_mounts.aws_config_dir = DockerMount(
            DockerMountType.BIND,
            clp_config.aws_config_directory,
            container_clp_config.aws_config_directory,
        )
    return container_clp_config, docker_mounts
  • This function creates a deep copy of the CLP config and adjusts paths for container mounts.
  • It returns the container config and a CLPDockerMounts object with Docker mount specifications.
  • This is relevant because the container config is used in start_webui and other start functions.

Relevant snippet from components/clp-package-utils/clp_package_utils/general.py (lines 90-100)

class CLPDockerMounts:
    def __init__(self, clp_home: pathlib.Path, docker_clp_home: pathlib.Path):
        self.input_logs_dir: typing.Optional[DockerMount] = None
        self.clp_home: typing.Optional[DockerMount] = DockerMount(
            DockerMountType.BIND, clp_home, docker_clp_home
        )
        self.data_dir: typing.Optional[DockerMount] = None
        self.logs_dir: typing.Optional[DockerMount] = None
        self.archives_output_dir: typing.Optional[DockerMount] = None
        self.stream_output_dir: typing.Optional[DockerMount] = None
        self.aws_config_dir: typing.Optional[DockerMount] = None
  • Defines a container for Docker mount points used in the start scripts.

Relevant snippet from components/clp-package-utils/clp_package_utils/general.py (lines 68-87)

class DockerMount:
    def __init__(
        self,
        type: DockerMountType,
        src: pathlib.Path,
        dst: pathlib.Path,
        is_read_only: bool = False,
    ):
        self.__type = type
        self.__src = src
        self.__dst = dst
        self.__is_read_only = is_read_only

    def __str__(self):
        mount_str = (
            f"type={DOCKER_MOUNT_TYPE_STRINGS[self.__type]},src={self.__src},dst={self.__dst}"
        )
        if self.__is_read_only:
            mount_str += ",readonly"
        return mount_str
  • Represents a Docker mount option, used to generate --mount arguments for docker run.

Relevant snippet from components/clp-py-utils/clp_py_utils/s3_utils.py (lines 97-170)

def generate_container_auth_options(
    clp_config: CLPConfig, component_type: str
) -> Tuple[bool, List[str]]:
    """
    Generates Docker container authentication options for AWS S3 access based on the given type.
    Handles authentication methods that require extra configuration (profile, env_vars).

    :param clp_config: CLPConfig containing storage configurations.
    :param component_type: Type of calling container (compression, log_viewer, or query).
    :return: Tuple of (whether aws config mount is needed, credential env_vars to set).
    :raises: ValueError if environment variables are not set correctly.
    """
    output_storages_by_component_type: List[Union[S3Storage, FsStorage]] = []
    input_storage_needed = False

    if component_type in (
        COMPRESSION_SCHEDULER_COMPONENT_NAME,
        COMPRESSION_WORKER_COMPONENT_NAME,
    ):
        output_storages_by_component_type = [clp_config.archive_output.storage]
        input_storage_needed = True
    elif component_type in (WEBUI_COMPONENT_NAME,):
        output_storages_by_component_type = [clp_config.stream_output.storage]
    elif component_type in (
        QUERY_SCHEDULER_COMPONENT_NAME,
        QUERY_WORKER_COMPONENT_NAME,
    ):
        output_storages_by_component_type = [
            clp_config.archive_output.storage,
            clp_config.stream_output.storage,
        ]
    else:
        raise ValueError(f"Component type {component_type} is not valid.")
    config_mount = False
    add_env_vars = False

    for storage in output_storages_by_component_type:
        if StorageType.S3 == storage.type:
            auth = storage.s3_config.aws_authentication
            if AwsAuthType.profile == auth.type:
                config_mount = True
            elif AwsAuthType.env_vars == auth.type:
                add_env_vars = True

    if input_storage_needed and StorageType.S3 == clp_config.logs_input.type:
        auth = clp_config.logs_input.aws_authentication
        if AwsAuthType.profile == auth.type:
            config_mount = True
        elif AwsAuthType.env_vars == auth.type:
            add_env_vars = True

    credentials_env_vars = []

    if add_env_vars:
        access_key_id = os.getenv(AWS_ENV_VAR_ACCESS_KEY_ID)
        secret_access_key = os.getenv(AWS_ENV_VAR_SECRET_ACCESS_KEY)
        if access_key_id and secret_access_key:
            credentials_env_vars.extend(
                (
                    f"{AWS_ENV_VAR_ACCESS_KEY_ID}={access_key_id}",
                    f"{AWS_ENV_VAR_SECRET_ACCESS_KEY}={secret_access_key}",
                )
            )
        else:
            raise ValueError(
                f"{AWS_ENV_VAR_ACCESS_KEY_ID} and {AWS_ENV_VAR_SECRET_ACCESS_KEY} "
                "environment variables not set"
            )
        if os.getenv(AWS_ENV_VAR_SESSION_TOKEN):
            raise ValueError(
                f"{AWS_ENV_VAR_SESSION_TOKEN} not supported for environmental variable credentials."
            )

    return (config_mount, credentials_env_vars)
  • Generates Docker environment variables and mounts for AWS S3 authentication based on the component type and config.
  • Used in start_webui and other start functions to properly configure AWS credentials inside containers.

Relevant snippet from components/clp-package-utils/clp_package_utils/general.py (lines 524-533)

def validate_webui_config(
    clp_config: CLPConfig,
    client_settings_json_path: pathlib.Path,
    server_settings_json_path: pathlib.Path,
):
    for path in [client_settings_json_path, server_settings_json_path]:
        if not path.exists():
            raise ValueError(f"{WEBUI_COMPONENT_NAME} {path} is not a valid path to settings.json")

    validate_port(f"{WEBUI_COMPONENT_NAME}.port", clp_config.webui.host, clp_config.webui.port)
  • Validates that the WebUI settings JSON files exist and that the configured port is valid.
  • Called in start_webui before updating settings JSON files.

Relevant snippet from components/clp-py-utils/clp_py_utils/clp_config.py (lines 588-773)

  • The CLPConfig class is a Pydantic model representing the entire CLP configuration.
  • It includes nested configs for database, queue, redis, reducer, results cache, compression scheduler, query scheduler, workers, webui, archive output, stream output, data and logs directories, and AWS config directory.
  • It has methods to:
    • Make paths absolute relative to CLP_HOME.
    • Validate logs input, archive output, stream output, data directory, logs directory, and AWS config directory.
    • Load credentials from a credentials file.
    • Dump the config to a primitive dictionary suitable for serialization.
  • The webui config is part of this model and is used in start_webui.

Summary of relevance to user comment

  • The user comment concerns passing null values in settings.json files used by the WebUI component.
  • The start_webui function in start_clp.py reads, updates, and writes back these JSON files (client/settings.json and server/dist/server/settings.json).
  • The update is done via update_settings_object which replaces values in the JSON dict.
  • Currently, some values may be set to Python None, which serializes to JSON null; the comment notes that the implicit conversion ${null} == "null" is relied upon and that proper handling or typing changes in settings.json may be needed.
  • The start_webui function also handles AWS authentication environment variables and mounts, which may affect how null or missing values are handled.
  • The CLPConfig model and validation functions ensure the config is valid before starting the WebUI.
  • The generate_container_config function creates a container-specific config with paths adjusted for Docker mounts, which is used in start_webui.
  • The generate_container_auth_options function manages AWS auth environment variables and mounts for the container.

These snippets provide the full context to understand how settings.json is updated and how null values might be passed or handled, as well as the configuration and container setup relevant to the WebUI component.


[End of relevant snippets.]


🧠 Learnings used
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1050
File: components/clp-package-utils/clp_package_utils/scripts/search.py:100-106
Timestamp: 2025-07-03T20:10:43.789Z
Learning: In the current CLP codebase implementation, dataset validation using validate_dataset() is performed within the native scripts (like clp_package_utils/scripts/native/search.py) rather than at the wrapper script level, where the native scripts handle their own parameter validation.

Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:204-211
Timestamp: 2025-07-03T12:58:18.407Z
Learning: In the CLP codebase, the validate_and_cache_dataset function in components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py uses in-place updates of the existing_datasets set parameter rather than returning a new set, as preferred by the development team.

Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-package-utils/clp_package_utils/general.py:564-579
Timestamp: 2025-06-28T07:10:47.295Z
Learning: The validate_dataset function in components/clp-package-utils/clp_package_utils/general.py is designed to be called once upon function startup for dataset validation, not repeatedly during execution, making caching optimizations unnecessary.

Learnt from: Bill-hbrhbr
PR: y-scope/clp#868
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:141-144
Timestamp: 2025-05-05T16:32:55.163Z
Learning: The column metadata table (created by `_create_column_metadata_table`) is only needed for dataset-specific workflows in `CLP_S` and is obsolete for non-dataset workflows.

Learnt from: Bill-hbrhbr
PR: y-scope/clp#1004
File: components/clp-package-utils/clp_package_utils/scripts/native/decompress.py:139-144
Timestamp: 2025-06-24T08:54:14.438Z
Learning: In the CLP codebase, the get_orig_file_id function signature was changed after a recent merge to no longer accept a dataset parameter, making previous suggestions that reference this parameter invalid.

Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:90-96
Timestamp: 2024-11-15T16:22:33.635Z
Learning: In the CLP codebase, `table_prefix` is a trusted configuration value and can be safely embedded into SQL queries.

Learnt from: gibber9809
PR: y-scope/clp#504
File: components/core/src/clp_s/search/kql/CMakeLists.txt:29-29
Timestamp: 2024-10-22T15:36:04.655Z
Learning: When reviewing pull requests, focus on the changes within the PR and avoid commenting on issues outside the scope of the PR.

archives_table_name = get_archives_table_name(table_prefix)
files_table_name = get_files_table_name(table_prefix)

client_settings_json_updates = {
"ClpStorageEngine": clp_config.package.storage_engine,
"MongoDbSearchResultsMetadataCollectionName": clp_config.webui.results_metadata_collection_name,
"SqlDbClpArchivesTableName": get_archives_table_name(table_prefix, dataset),
"SqlDbClpArchivesTableName": archives_table_name,
"SqlDbClpDatasetsTableName": get_datasets_table_name(table_prefix),
"SqlDbClpFilesTableName": get_files_table_name(table_prefix, dataset),
"SqlDbClpFilesTableName": files_table_name,
"SqlDbClpTablePrefix": table_prefix,
"SqlDbCompressionJobsTableName": COMPRESSION_JOBS_TABLE_NAME,
Comment on lines +884 to 888

This comment was marked as off-topic.

}
client_settings_json = read_and_update_settings_json(
1 change: 1 addition & 0 deletions components/webui/client/public/settings.json
@@ -4,5 +4,6 @@
"SqlDbClpArchivesTableName": "clp_archives",
"SqlDbClpDatasetsTableName": "clp_datasets",
"SqlDbClpFilesTableName": "clp_files",
"SqlDbClpTablePrefix": "clp_",
"SqlDbCompressionJobsTableName": "compression_jobs"
}
8 changes: 2 additions & 6 deletions components/webui/client/src/App.tsx
@@ -1,20 +1,16 @@
import {RouterProvider} from "react-router";

import {
QueryClient,
QueryClientProvider,
} from "@tanstack/react-query";
import {QueryClientProvider} from "@tanstack/react-query";
import {ReactQueryDevtools} from "@tanstack/react-query-devtools";
import {ConfigProvider} from "antd";

import queryClient from "./config/queryClient";
import router from "./router";
import THEME_CONFIG from "./theme";

import "@ant-design/v5-patch-for-react-19";


const queryClient = new QueryClient();

/**
* Renders Web UI app.
*
@@ -6,7 +6,7 @@ import axios, {
import {
ExtractStreamResp,
QUERY_JOB_TYPE,
} from "../typings/query";
} from "../../typings/query";


interface SubmitExtractStreamJobProps {
@@ -1,6 +1,6 @@
import axios, {AxiosResponse} from "axios";

import {Nullable} from "../typings/common";
import {Nullable} from "../../typings/common";


// eslint-disable-next-line no-warning-comments
11 changes: 10 additions & 1 deletion components/webui/client/src/components/DashboardCard/index.tsx
@@ -13,6 +13,7 @@ interface DashboardCardProps {
titleColor?: string;
backgroundColor?: string;
children?: React.ReactNode;
isLoading?: boolean;
}

/**
@@ -23,13 +24,21 @@ interface DashboardCardProps {
* @param props.titleColor
* @param props.backgroundColor
* @param props.children
* @param props.isLoading
* @return
*/
const DashboardCard = ({title, titleColor, backgroundColor, children}: DashboardCardProps) => {
const DashboardCard = ({
title,
titleColor,
backgroundColor,
children,
isLoading = false,
}: DashboardCardProps) => {
return (
<Card
className={styles["card"] || ""}
hoverable={true}
loading={isLoading}
style={{backgroundColor}}
>
<div className={styles["cardContent"]}>
4 changes: 4 additions & 0 deletions components/webui/client/src/components/StatCard/index.tsx
@@ -15,6 +15,7 @@ interface StatCardProps {
backgroundColor?: string;
statSize?: string;
statColor?: string;
isLoading?: boolean;
}

/**
@@ -27,6 +28,7 @@ interface StatCardProps {
* @param props.backgroundColor
* @param props.statSize
* @param props.statColor
* @param props.isLoading
* @return
*/
const StatCard = ({
@@ -36,9 +38,11 @@ const StatCard = ({
backgroundColor,
statSize,
statColor,
isLoading = false,
}: StatCardProps) => {
const props: DashboardCardProps = {
title,
isLoading,
...(titleColor ?
{titleColor} :
{}),
14 changes: 14 additions & 0 deletions components/webui/client/src/config/queryClient.ts
@@ -0,0 +1,14 @@
import {QueryClient} from "@tanstack/react-query";


const DEFAULT_REFETCH_INTERVAL_MILLIS = 10_000;
Contributor:
🧹 Nitpick (assertive)

Consider if 10-second refetch interval is appropriate for all queries

The constant is well-named and formatted correctly. However, a 10-second global refetch interval might be aggressive for all query types. Consider if different query types might benefit from different intervals.

🤖 Prompt for AI Agents
In components/webui/client/src/config/queryClient.ts at line 4, the constant
DEFAULT_REFETCH_INTERVAL_MILLIS is set to 10,000 milliseconds (10 seconds) for
all queries, which may be too frequent for some query types. Refactor the code
to allow different refetch intervals based on query type or context by defining
multiple interval constants or configuring intervals per query, and update the
query client setup to use these differentiated intervals instead of a single
global value.


const queryClient = new QueryClient({
defaultOptions: {
queries: {
refetchInterval: DEFAULT_REFETCH_INTERVAL_MILLIS,
},
},
});
Comment on lines +6 to +12
Contributor:

🧹 Nitpick (assertive)

Well-structured QueryClient configuration

The centralized configuration approach is excellent for maintaining consistency across the application. The current minimal configuration focusing on refetch interval is appropriate for the current refactoring.

Future enhancements could include additional default options such as retry logic, cache time, or stale time if needed:

 const queryClient = new QueryClient({
     defaultOptions: {
         queries: {
             refetchInterval: DEFAULT_REFETCH_INTERVAL_MILLIS,
+            retry: 3,
+            staleTime: 5_000,
         },
     },
 });
🤖 Prompt for AI Agents
In components/webui/client/src/config/queryClient.ts around lines 6 to 12, the
QueryClient is correctly configured with a centralized refetchInterval. No
changes are needed now, but consider adding future enhancements like retry
logic, cache time, or stale time as additional defaultOptions to improve query
behavior consistency.


export default queryClient;
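Following the nitpick above, individual queries can override the client-wide default, so a single global interval need not be aggressive for every query type. The sketch below is illustrative: the 60-second interval and the `useDatasetNames` hook are assumptions, not the project's chosen values, and `fetchDatasetNames` is stubbed for self-containment (the PR imports the real one from the SearchPage module).

```typescript
import {QueryClient, useQuery} from "@tanstack/react-query";

// Client-wide default, as in the PR's queryClient.ts.
const DEFAULT_REFETCH_INTERVAL_MILLIS = 10_000;

// Illustrative slower interval for data that rarely changes.
const DATASET_LIST_REFETCH_INTERVAL_MILLIS = 60_000;

const queryClient = new QueryClient({
    defaultOptions: {
        queries: {
            refetchInterval: DEFAULT_REFETCH_INTERVAL_MILLIS,
        },
    },
});

// Placeholder for the real fetchDatasetNames imported in this PR.
const fetchDatasetNames = async (): Promise<string[]> => [];

// A per-query option overrides the client default for that query only.
const useDatasetNames = () => useQuery({
    queryKey: ["datasetNames"],
    queryFn: fetchDatasetNames,
    refetchInterval: DATASET_LIST_REFETCH_INTERVAL_MILLIS,
});

export default queryClient;
```

Fast-changing ingest stats can keep the 10-second default while slower-moving lists opt into a longer interval, without touching the centralized configuration.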
7 changes: 7 additions & 0 deletions components/webui/client/src/config/sql-table-suffix.ts
@@ -0,0 +1,7 @@
/**
* Hardcoded table suffixes for CLP database tables
*/
export enum SqlTableSuffix {
ARCHIVES = "archives",
FILES = "files",
}
Comment on lines +4 to +7
Contributor:

🧹 Nitpick (assertive)

Prefer const enum or a literal-union object to drop the runtime payload

A const enum (or an as const object + type alias) offers the same type-safety without emitting the enum lookup object in the compiled bundle, shaving a few bytes off every load and matching our “config is compile-time only” philosophy.

-export enum SqlTableSuffix {
+export const enum SqlTableSuffix {
     ARCHIVES = "archives",
     FILES = "files",
 }

If isolatedModules is enabled, fall back to a frozen object:

export const SQL_TABLE_SUFFIX = {
  ARCHIVES: "archives",
  FILES: "files",
} as const;
export type SqlTableSuffix = typeof SQL_TABLE_SUFFIX[keyof typeof SQL_TABLE_SUFFIX];
🤖 Prompt for AI Agents
In components/webui/client/src/config/sql-table-suffix.ts around lines 4 to 7,
replace the current enum SqlTableSuffix with either a const enum or a frozen
object with an as const assertion to avoid emitting the enum object at runtime.
This reduces bundle size and aligns with compile-time only config philosophy. If
isolatedModules is enabled, use the frozen object approach with a type alias for
SqlTableSuffix to maintain type safety without runtime overhead.
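If the frozen-object variant is adopted, table names can be composed with the same type safety. A small sketch: `getTableName` is a hypothetical helper, and the `"clp_"` prefix matches the `SqlDbClpTablePrefix` default added in this PR.

```typescript
// Frozen object + type alias: same type safety as the enum, no runtime
// enum lookup object emitted in the bundle.
export const SQL_TABLE_SUFFIX = {
    ARCHIVES: "archives",
    FILES: "files",
} as const;
export type SqlTableSuffix =
    typeof SQL_TABLE_SUFFIX[keyof typeof SQL_TABLE_SUFFIX];

// Hypothetical helper composing a full table name from the settings' prefix.
const getTableName = (prefix: string, suffix: SqlTableSuffix): string =>
    `${prefix}${suffix}`;

// "clp_" matches the SqlDbClpTablePrefix default in client/public/settings.json.
const archivesTable = getTableName("clp_", SQL_TABLE_SUFFIX.ARCHIVES);
```

Passing any string other than `"archives"` or `"files"` as the suffix fails to compile, which is the guarantee the enum provided.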

@@ -6,6 +6,7 @@ import StatCard from "../../../components/StatCard";
interface DetailsCardProps {
title: string;
stat: string;
isLoading: boolean;
}

/**
@@ -14,12 +15,14 @@
* @param props
* @param props.title
* @param props.stat
* @param props.isLoading
* @return
*/
const DetailsCard = ({title, stat}: DetailsCardProps) => {
const DetailsCard = ({title, stat, isLoading}: DetailsCardProps) => {
hoophalab (Contributor), Jul 7, 2025:

This inheritance-style dependency chain is too convoluted.
We have
DetailsCard -> StatCard -> DashboardCard
You are passing isLoading all the way down to DashboardCard, which is a typical problem with inheritance.

Suggestion:

  1. Combine all three into one DetailsCard, since all cards require a title and a stat/content.
  2. DetailsCard accepts title as a property.
  3. DetailsCard accepts its content text (e.g., "No timestamp data" and "0.00%") as a React node via children, so that, for example, the space-savings card can use a different font size.
  4. Two options:
    1. All other cards use DetailsCard directly, and isLoading becomes page state in the zustand store.
    2. Each concrete card (Files, Messages, ...) takes a ...props parameter that is forwarded to DetailsCard.

Contributor Author:

I also thought these cards could be refactored while working on the code, but that's probably for a different PR.

Contributor:

Can we create a quick refactor PR to address this before merging? It shouldn't take long; I'm happy to submit one and you can review it. However, we shouldn't let the current isLoading solution make it into the main branch.

Contributor Author:
Okay, this is reasonable, but you will have to merge it into this PR.

const {token} = theme.useToken();
return (
<StatCard
isLoading={isLoading}
stat={stat}
statColor={token.colorTextSecondary}
statSize={"1.4rem"}
@@ -5,18 +5,21 @@ import DetailsCard from "./DetailsCard";

interface FilesProps {
numFiles: Nullable<number>;
isLoading: boolean;
}

/**
* Renders the files statistic.
*
* @param props
* @param props.numFiles
* @param props.isLoading
* @return
*/
const Files = ({numFiles}: FilesProps) => {
const Files = ({numFiles, isLoading}: FilesProps) => {
return (
<DetailsCard
isLoading={isLoading}
stat={(numFiles ?? 0).toString()}
title={"Files"}/>
);
@@ -5,18 +5,21 @@ import DetailsCard from "./DetailsCard";

interface MessagesProps {
numMessages: Nullable<number>;
isLoading: boolean;
}

/**
* Renders the messages statistic.
*
* @param props
* @param props.numMessages
* @param props.isLoading
* @return
*/
const Messages = ({numMessages}: MessagesProps) => {
const Messages = ({numMessages, isLoading}: MessagesProps) => {
return (
<DetailsCard
isLoading={isLoading}
stat={(numMessages ?? 0).toString()}
title={"Messages"}/>
);
@@ -8,6 +8,7 @@ const DATE_FORMAT = "MMMM D, YYYY";
interface TimeRangeProps {
beginDate: Dayjs;
endDate: Dayjs;
isLoading: boolean;
}

/**
@@ -16,9 +17,10 @@ interface TimeRangeProps {
* @param props
* @param props.beginDate
* @param props.endDate
* @param props.isLoading
* @return
*/
const TimeRange = ({beginDate, endDate}: TimeRangeProps) => {
const TimeRange = ({beginDate, endDate, isLoading}: TimeRangeProps) => {
let stat;
if (beginDate.isValid() && endDate.isValid()) {
stat = `${beginDate.format(DATE_FORMAT)} - ${endDate.format(DATE_FORMAT)}`;
@@ -28,6 +30,7 @@ const TimeRange = ({beginDate, endDate}: TimeRangeProps) => {

return (
<DetailsCard
isLoading={isLoading}
stat={stat}
title={"Time Range"}/>
);
100 changes: 35 additions & 65 deletions components/webui/client/src/pages/IngestPage/Details/index.tsx
@@ -1,104 +1,74 @@
import {
useCallback,
useEffect,
useState,
} from "react";

import dayjs, {Dayjs} from "dayjs";
import {Nullable} from "src/typings/common";
import {useQuery} from "@tanstack/react-query";
import dayjs from "dayjs";

import {querySql} from "../../../api/sql";
import {
CLP_STORAGE_ENGINES,
SETTINGS_STORAGE_ENGINE,
} from "../../../config";
import useIngestStatsStore from "../ingestStatsStore";
import {fetchDatasetNames} from "../../SearchPage/SearchControls/Dataset/sql";
Fix the unresolved import path.

The ESLint error indicates that the import path '../../SearchPage/SearchControls/Dataset/sql' cannot be resolved. Verify that this file exists and the import path is correct relative to the current file.


Fix broken import path for fetchDatasetNames

The import

import {fetchDatasetNames} from "../../SearchPage/SearchControls/Dataset/sql";

does not resolve because there is no pages/SearchPage/SearchControls/Dataset/sql.ts in the webui client. Update to the actual module that exports fetchDatasetNames. For example, if it lives in your shared API:

• File: components/webui/client/src/pages/IngestPage/Details/index.tsx (line 8)
• File: components/webui/client/src/pages/IngestPage/SpaceSavings/index.tsx (line 9)

Suggested diff:

- import {fetchDatasetNames} from "../../SearchPage/SearchControls/Dataset/sql";
+ import {fetchDatasetNames} from "../../../api/sql";

Adjust the path according to where fetchDatasetNames is truly exported.
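To see why the linter rejects this specifier, the relative path can be resolved by hand from the importing file's directory. A minimal sketch using Node's path module (the two string constants simply restate paths already quoted in this thread):

```typescript
import * as path from "path";

// Directory of the importing file, per the diff above.
const importerDir = "components/webui/client/src/pages/IngestPage/Details";

// The specifier flagged by import/no-unresolved.
const specifier = "../../SearchPage/SearchControls/Dataset/sql";

// join() collapses the ".." segments, producing the module path the bundler
// would actually look up (before trying extensions such as .ts).
const resolved = path.posix.join(importerDir, specifier);

console.log(resolved);
// → components/webui/client/src/pages/SearchPage/SearchControls/Dataset/sql
```

No file exists at that resolved location, hence the error; repointing the specifier at the module that really exports fetchDatasetNames (for example `../../../api/sql`, which resolves to `src/api/sql`) clears it.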

🪛 GitHub Actions: clp-lint

[error] 8-8: ESLint: Unable to resolve path to module '../../SearchPage/SearchControls/Dataset/sql' (import/no-unresolved)


import Files from "./Files";
import styles from "./index.module.css";
import Messages from "./Messages";
import {
DetailsResp,
getDetailsSql,
DETAILS_DEFAULT,
fetchClpDetails,
fetchClpsDetails,
} from "./sql";
import TimeRange from "./TimeRange";


/**
* Default state for details.
*/
const DETAILS_DEFAULT = Object.freeze({
beginDate: dayjs(null),
endDate: dayjs(null),

numFiles: 0,
numMessages: 0,
});

/**
* Renders grid with compression details.
*
* @return
*/
const Details = () => {
const {refreshInterval} = useIngestStatsStore();
const [beginDate, setBeginDate] = useState<Dayjs>(DETAILS_DEFAULT.beginDate);
const [endDate, setEndDate] = useState<Dayjs>(DETAILS_DEFAULT.endDate);
const [numFiles, setNumFiles] = useState<Nullable<number>>(DETAILS_DEFAULT.numFiles);
const [numMessages, setNumMessages] = useState<Nullable<number>>(DETAILS_DEFAULT.numMessages);
const {data: datasetNames = [], isSuccess: isSuccessDatasetNames} = useQuery({
queryKey: ["datasets"],
queryFn: fetchDatasetNames,

Address the unsafe assignment warning.

The TypeScript ESLint error suggests there's an unsafe assignment of an error-typed value. Ensure that the fetchDatasetNames function returns the expected type and handle potential errors appropriately.




Specify useQuery generics to avoid unsafe unknown assignment

The @typescript-eslint/no-unsafe-assignment error is due to useQuery inferring data as unknown by default. Since fetchDatasetNames returns Promise<string[]>, you should explicitly type the hook so that data is string[].

• In components/webui/client/src/pages/IngestPage/Details/index.tsx (around line 28) update the hook:

- const { data: datasetNames = [], isSuccess: isSuccessDatasetNames } = useQuery({
+ const { data: datasetNames = [], isSuccess: isSuccessDatasetNames } = useQuery<string[], Error>({
    queryKey: ["datasets"],
    queryFn: fetchDatasetNames,
    enabled: CLP_STORAGE_ENGINES.CLP_S === SETTINGS_STORAGE_ENGINE,
});

Optionally, apply the same change in components/webui/client/src/pages/IngestPage/SpaceSavings/index.tsx.

This ensures datasetNames is typed correctly and prevents unsafe assignment.


Suggested change
queryFn: fetchDatasetNames,
const { data: datasetNames = [], isSuccess: isSuccessDatasetNames } = useQuery<string[], Error>({
queryKey: ["datasets"],
queryFn: fetchDatasetNames,
enabled: CLP_STORAGE_ENGINES.CLP_S === SETTINGS_STORAGE_ENGINE,
});
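Why the explicit generic helps can be sketched without the library. The following uses a hypothetical stand-in for useQuery's result typing (not the real @tanstack/react-query API, just a minimal shape that reproduces the inference):

```typescript
// Minimal stand-in for useQuery's result shape (hypothetical sketch, not the
// real @tanstack/react-query API).
interface QueryResult<TData> {
    data: TData | undefined;
    isSuccess: boolean;
}

// Stand-in hook: before the first fetch resolves, data is undefined.
const useQueryStandIn = <TData>(queryFn: () => Promise<TData>): QueryResult<TData> => ({
    data: undefined,
    isSuccess: false,
});

// Same Promise<string[]> shape as fetchDatasetNames in the diff above.
const fetchDatasetNames = async (): Promise<string[]> => ["default"];

// With TData pinned to string[], the destructuring default keeps datasetNames
// typed as string[] rather than unknown, so no unsafe assignment is reported.
const {data: datasetNames = []} = useQueryStandIn<string[]>(fetchDatasetNames);
console.log(datasetNames.length);
```

With an untyped `queryFn` whose result degrades to `unknown` (as happens when inference fails), the same destructuring would trip @typescript-eslint/no-unsafe-assignment; pinning the generic restores the `string[]` typing end to end.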
🪛 GitHub Actions: clp-lint

[error] 28-28: ESLint: Unsafe assignment of an error typed value (@typescript-eslint/no-unsafe-assignment)


enabled: CLP_STORAGE_ENGINES.CLP_S === SETTINGS_STORAGE_ENGINE,
});

/**
* Fetches details stats from the server.
*
* @throws {Error} If the response is undefined.
*/
const fetchDetailsStats = useCallback(() => {
querySql<DetailsResp>(getDetailsSql()).then((resp) => {
const [details] = resp.data;
if ("undefined" === typeof details) {
throw new Error("Details response is undefined");
const {data: details = DETAILS_DEFAULT, isPending} = useQuery({
queryKey: [
"details",
datasetNames,
],
queryFn: async () => {
if (CLP_STORAGE_ENGINES.CLP === SETTINGS_STORAGE_ENGINE) {
return fetchClpDetails();
}
setBeginDate(dayjs.utc(details.begin_timestamp));
setEndDate(dayjs.utc(details.end_timestamp));
setNumFiles(details.num_files);
setNumMessages(details.num_messages);
})
.catch((e: unknown) => {
console.error("Failed to fetch details stats", e);
});
}, []);


useEffect(() => {
fetchDetailsStats();
const intervalId = setInterval(fetchDetailsStats, refreshInterval);

return () => {
clearInterval(intervalId);
};
}, [
refreshInterval,
fetchDetailsStats,
]);
return fetchClpsDetails(datasetNames);
},
enabled: CLP_STORAGE_ENGINES.CLP === SETTINGS_STORAGE_ENGINE || isSuccessDatasetNames,
});

if (CLP_STORAGE_ENGINES.CLP === SETTINGS_STORAGE_ENGINE) {
return (
<div className={styles["detailsGrid"]}>
<div className={styles["timeRange"]}>
<TimeRange
beginDate={beginDate}
endDate={endDate}/>
beginDate={dayjs.utc(details.begin_timestamp)}
endDate={dayjs.utc(details.end_timestamp)}
isLoading={isPending}/>
</div>
<Messages numMessages={numMessages}/>
<Files numFiles={numFiles}/>
<Messages
isLoading={isPending}
numMessages={details.num_messages}/>
<Files
isLoading={isPending}
numFiles={details.num_files}/>
</div>
);
}

return (
<div>
<TimeRange
beginDate={beginDate}
endDate={endDate}/>
beginDate={dayjs.utc(details.begin_timestamp)}
endDate={dayjs.utc(details.end_timestamp)}
isLoading={isPending}/>
</div>
);
Comment on lines 67 to 73
Contributor

@hoophalab hoophalab Jul 8, 2025


Quick question: I might be missing some context here. Why don’t we show Messages and Files for CLP-s? @davemarco

Member


The clp-s storage engine doesn't currently keep track of the number of log events and files in the archives.

};