skipOffsetFromLatest in auto-compaction spec is not respected. #18780

@nibinqtl

Affected Version

35.0.0

Description

Please include as much detailed information about the problem as possible.

  • Cluster size: Single server small
  • Any debugging that you have already done
    I'm ingesting data using the local input source, in JSON format, with segmentGranularity=hour.
    Each batch results in one segment partition for the hour: about 10-30 partitions per hour, with about 60k rows in each.
    I created an auto-compaction config to consolidate these into segments with a 4-hour interval.
    Here is the config JSON:
    {
      "dataSource": "data",
      "skipOffsetFromLatest": "PT4H",
      "tuningConfig": {
        "partitionsSpec": {
          "type": "single_dim",
          "partitionDimension": "hostHeader",
          "targetRowsPerSegment": 3000000,
          "assumeGrouped": false
        },
        "type": "index_parallel"
      },
      "granularitySpec": {
        "segmentGranularity": {
          "type": "period",
          "period": "PT4H"
        },
        "rollup": false
      },
      "ioConfig": {
        "dropExisting": true
      }
    }
    What I found is that skipOffsetFromLatest seems to be ignored. The compaction task always starts right after each 4-hour boundary, for the previous 4-hour interval. For example, the new segment 04:00:00/08:00:00 is generated at 08:06:00, with no wait time. I tried setting skipOffsetFromLatest to different values, but it made no difference. I'm very sure that there are no rogue timestamps in the data. At 08:06:00, the latest segment has an end time of 09:00:00.

However, the datasource tab of the console correctly shows: "Fully compacted (except the last PT4H of data, 12 segments skipped)"
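
The expected behavior can be sketched with the numbers from this report. This is a minimal Python illustration, assuming the offset is measured back from the end time of the latest segment and that any interval overlapping that window should be left alone. It is not Druid's actual implementation; the function name `is_skipped` and the calendar date are hypothetical placeholders, since the report only gives times of day.

```python
from datetime import datetime, timedelta


def is_skipped(interval_end, latest_end, skip_offset):
    """Return True if an interval reaches into the skip window.

    Assumption (not taken from Druid's source): the skip window is
    [latest_end - skip_offset, latest_end), and an interval is skipped
    when its end falls after the start of that window.
    """
    cutoff = latest_end - skip_offset
    return interval_end > cutoff


# Placeholder date; only the times of day come from the report.
day = datetime(2000, 1, 1)

latest_end = day + timedelta(hours=9)   # latest segment ends at 09:00:00
skip_offset = timedelta(hours=4)        # skipOffsetFromLatest = PT4H
cutoff = latest_end - skip_offset

compacted_end = day + timedelta(hours=8)   # interval 04:00:00/08:00:00
earlier_end = day + timedelta(hours=4)     # interval 00:00:00/04:00:00

print(cutoff.time())                                        # 05:00:00
print(is_skipped(compacted_end, latest_end, skip_offset))   # True
print(is_skipped(earlier_end, latest_end, skip_offset))     # False
```

Under this reading, the 04:00:00/08:00:00 interval overlaps the last 4 hours of data at 08:06:00 and should not have been compacted yet, while 00:00:00/04:00:00 would be eligible.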

Any idea why?
