Pre-aggregations deleted before touch timeout is reached #9870

@darian-heede

Describe the bug
Pre-aggregations seem to get deleted long before CUBEJS_TOUCH_PRE_AGG_TIMEOUT is reached, e.g. ~20 minutes after the cluster is started and the pre-aggregations are built. Once this happens, the log fills up with errors like the following:

{"message":"Error querying db","error":"Error: Pre-aggregation table is not found for prod_pre_aggregations.orders20250202 after it was successfully created
    at mostRecentResult (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregationLoader.ts:262:15)
    at processTicksAndRejections (node:internal/process/task_queues:105:5)
    at PreAggregationLoader.loadPreAggregation (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregationLoader.ts:128:22)
    at preAggregationPromise (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregations.ts:484:30)
    at QueryOrchestrator.fetchQuery (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryOrchestrator.ts:218:9)
    at OrchestratorApi.executeQuery (/cube/node_modules/@cubejs-backend/server-core/src/core/OrchestratorApi.ts:98:20)
    at /cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:620:13
    at async Promise.all (index 10)","requestId":"scheduler-ee358451-6fcd-4b3b-beb6-a33c6dfc808b"}

Changing the timeout value doesn't seem to affect this behavior. The only workaround found so far is to set CUBEJS_DROP_PRE_AGG_WITHOUT_TOUCH to false.
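
As a compose fragment, that workaround might look like the sketch below. It assumes the variable is set on the refresh worker service (it may be needed on the API service as well). The explicit timeout line is illustrative only and is not part of the setup shown further down; the value is in seconds, and 86400 is, to my knowledge, the documented default.

```yaml
services:
  cube_refresh_worker_1:
    environment:
      # Workaround: do not drop pre-aggregations that were not touched in time
      - CUBEJS_DROP_PRE_AGG_WITHOUT_TOUCH=false
      # Illustrative only: touch timeout in seconds (86400 = 24 hours)
      - CUBEJS_TOUCH_PRE_AGG_TIMEOUT=86400
```

With the drop disabled, the timeout value should no longer matter, which matches the observation that changing it has no visible effect.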

To Reproduce
Steps to reproduce the behavior:

  1. Spin up docker compose
  2. Connect to Cube Store through its MySQL-compatible interface to track the pre-aggregations, e.g. by executing SELECT COUNT(*) FROM information_schema.tables;
  3. The table count will constantly increase while the pre-aggregations are being built
  4. At some point (e.g. ~20 minutes in) the table count goes back to 0 and the error mentioned above starts to appear when executing requests.

Expected behavior
The pre-aggregations should not be deleted until the timeout is reached.

Docker compose setup

services:
  cube_api_1:
    image: cubejs/cube:v1.3.41
    container_name: cube-api-1
    ports:
      - 4000:4000
    environment:
      - CUBEJS_REFRESH_WORKER=false
      - CUBEJS_PRE_AGGREGATIONS_BUILDER=false
      - CUBEJS_API_SECRET=<secret>
      - CUBEJS_WEB_SOCKETS=true
      - CUBEJS_DEV_MODE=false
      - CUBEJS_CUBESTORE_HOST=cubestore_router
      - CUBEJS_CUBESTORE_PORT=3030
      - CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Berlin
      - CUBEJS_SCHEDULED_REFRESH_INTERVAL=600000
    depends_on:
      - cube_refresh_worker_1

  cube_refresh_worker_1:
    restart: always
    image: cubejs/cube:v1.3.41
    container_name: cube-refresh-worker-1
    environment:
      - CUBEJS_REFRESH_WORKER=true
      - CUBEJS_PRE_AGGREGATIONS_BUILDER=true
      - NODE_OPTIONS=--max-old-space-size=4096
      - CUBEJS_API_SECRET=<secret>
      - CUBEJS_WEB_SOCKETS=true
      - CUBEJS_DEV_MODE=false
      - CUBEJS_CUBESTORE_HOST=cubestore_router
      - CUBEJS_CUBESTORE_PORT=3030
      - CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Berlin
      - CUBEJS_SCHEDULED_REFRESH_INTERVAL=600000
    depends_on:
      - cubestore_worker_1
      - cubestore_worker_2

  cubestore_router:
    restart: always
    image: cubejs/cubestore:v1.3.41
    container_name: cubestore-router
    environment:
      - CUBESTORE_SERVER_NAME=cubestore_router:9999
      - CUBESTORE_META_PORT=9999
      - CUBESTORE_WORKERS=cubestore_worker_1:9001,cubestore_worker_2:9002
      - CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Berlin
      - CUBESTORE_MINIO_ACCESS_KEY_ID=...
      - CUBESTORE_MINIO_SECRET_ACCESS_KEY=...
      - CUBESTORE_MINIO_BUCKET=...
      - CUBESTORE_MINIO_REGION=...
      - CUBESTORE_MINIO_SERVER_ENDPOINT=...
      - CUBESTORE_MINIO_SUB_PATH=...
    volumes:
      - .cubestore_data:/cube/.cubestore/data
    ports:
      - 3030:3030
      - 9999:9999
      - 3306:3306 #MySql port

  cubestore_worker_1:
    restart: always
    image: cubejs/cubestore:v1.3.41
    container_name: cubestore-worker-1
    environment:
      - CUBESTORE_SERVER_NAME=cubestore_worker_1:9001
      - CUBESTORE_WORKER_PORT=9001
      - CUBESTORE_META_ADDR=cubestore_router:9999
      - CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Berlin
      - CUBESTORE_WORKERS=cubestore_worker_1:9001,cubestore_worker_2:9002
      - CUBESTORE_MINIO_ACCESS_KEY_ID=...
      - CUBESTORE_MINIO_SECRET_ACCESS_KEY=...
      - CUBESTORE_MINIO_BUCKET=...
      - CUBESTORE_MINIO_REGION=...
      - CUBESTORE_MINIO_SERVER_ENDPOINT=...
      - CUBESTORE_MINIO_SUB_PATH=...
    ports:
      - 9001:9001
    depends_on:
      - cubestore_router

  cubestore_worker_2:
    restart: always
    image: cubejs/cubestore:v1.3.41
    container_name: cubestore-worker-2
    environment:
      - CUBESTORE_SERVER_NAME=cubestore_worker_2:9002
      - CUBESTORE_WORKER_PORT=9002
      - CUBESTORE_META_ADDR=cubestore_router:9999
      - CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Berlin
      - CUBESTORE_WORKERS=cubestore_worker_1:9001,cubestore_worker_2:9002
      - CUBESTORE_MINIO_ACCESS_KEY_ID=...
      - CUBESTORE_MINIO_SECRET_ACCESS_KEY=...
      - CUBESTORE_MINIO_BUCKET=...
      - CUBESTORE_MINIO_REGION=...
      - CUBESTORE_MINIO_SERVER_ENDPOINT=...
      - CUBESTORE_MINIO_SUB_PATH=...
    ports:
      - 9002:9002
    depends_on:
      - cubestore_router

Version:
1.3.41

Additional context
We have a multitenancy setup with Postgres databases as data sources. The same cube and pre-aggregation definitions are used across all tenants. We use Wasabi S3 to store the pre-aggregations and pass the connection details through the CUBESTORE_MINIO... environment variables.


Labels: question
