[Feature] [core] Support arbitrary time granularity for chain table delta computation #8184

@juntaozhang

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Users need chain tables with minute-level granularity for scenarios like:

  • IoT data ingestion with minute-level partitions
  • Real-time monitoring systems
  • Incremental processing with fine-grained time windows

Example configuration:

  PARTITIONED BY (`dt` STRING, `hr_min` STRING)
  TBLPROPERTIES (
    'partition.timestamp-pattern' = '$dt $hr_min:00',
    'partition.timestamp-formatter' = 'yyyyMMdd HH:mm:ss',
    'chain-table.chain-partition-keys' = 'dt,hr_min'
  )

or

  PARTITIONED BY (`dt` STRING, `hr` STRING, `min` STRING)
  TBLPROPERTIES (
    'partition.timestamp-pattern' = '$dt $hr:$min:00',
    'partition.timestamp-formatter' = 'yyyyMMdd HH:mm:ss',
    'chain-table.chain-partition-keys' = 'dt,hr,min'
  )
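To make the two layouts concrete, here is a minimal sketch of how a partition's values could be resolved to a timestamp from the pattern and formatter. The class `PartitionTimestampSketch`, its `resolve` method, and the sample values are assumptions made for illustration; this is not Paimon's actual resolution code.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Paimon.
final class PartitionTimestampSketch {

    // Substitutes each "$key" in the pattern with the partition value,
    // then parses the result with the configured formatter.
    static LocalDateTime resolve(
            String timestampPattern, String formatter, Map<String, String> partition) {
        // Replace longer keys first so "$hr" cannot clobber part of "$hr_min".
        List<String> keys = new ArrayList<>(partition.keySet());
        keys.sort(Comparator.comparingInt(String::length).reversed());

        String text = timestampPattern;
        for (String key : keys) {
            text = text.replace("$" + key, partition.get(key));
        }
        return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(formatter));
    }

    public static void main(String[] args) {
        // Sample values are assumed: hr_min is taken to hold "HH:mm".
        LocalDateTime combined = resolve(
                "$dt $hr_min:00", "yyyyMMdd HH:mm:ss",
                Map.of("dt", "20250101", "hr_min", "10:05"));
        LocalDateTime split = resolve(
                "$dt $hr:$min:00", "yyyyMMdd HH:mm:ss",
                Map.of("dt", "20250101", "hr", "10", "min", "05"));
        System.out.println(combined.equals(split));
        // With a one-minute step, the preceding partition is one minute earlier.
        System.out.println(combined.minusMinutes(1));
    }
}
```

Under these assumptions both layouts resolve to the same instant, so the preceding partition in the chain can be found by subtracting one step from the resolved timestamp.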

Solution

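Introduce ChainPartitionStepExtractor, which dynamically infers the time step between consecutive chain partitions from the partition.timestamp-pattern and partition.timestamp-formatter configuration, so that delta computation is not tied to a fixed granularity.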
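One possible way to infer the step, sketched under stated assumptions: expand the pattern with a sample partition, mark which output characters come from partition values, and pick the finest formatter field that those characters cover. The class `StepInferenceSketch` and the method `inferStep` are hypothetical names, not the proposed ChainPartitionStepExtractor. The sketch also assumes a fixed-width formatter (one output character per pattern letter, no quoted literals) and a step of exactly one unit.

```java
import java.time.temporal.ChronoUnit;
import java.util.Map;

// Hypothetical sketch; the real ChainPartitionStepExtractor may work differently.
final class StepInferenceSketch {

    static ChronoUnit inferStep(
            String timestampPattern, String formatter, Map<String, String> sample) {
        // Build a mask over the expanded pattern:
        // 'V' = character supplied by a partition value, 'L' = literal from the pattern.
        StringBuilder mask = new StringBuilder();
        int i = 0;
        while (i < timestampPattern.length()) {
            if (timestampPattern.charAt(i) == '$') {
                // Pick the longest key matching at this position ("hr_min" over "hr").
                String matched = null;
                for (String key : sample.keySet()) {
                    if (timestampPattern.startsWith(key, i + 1)
                            && (matched == null || key.length() > matched.length())) {
                        matched = key;
                    }
                }
                if (matched == null) {
                    throw new IllegalArgumentException("Unknown placeholder at index " + i);
                }
                for (int k = 0; k < sample.get(matched).length(); k++) {
                    mask.append('V');
                }
                i += 1 + matched.length();
            } else {
                mask.append('L');
                i++;
            }
        }
        if (mask.length() != formatter.length()) {
            throw new IllegalArgumentException("Pattern and formatter widths differ");
        }

        // Check formatter fields from finest to coarsest.
        String letters = "smHdMy";
        ChronoUnit[] units = {
            ChronoUnit.SECONDS, ChronoUnit.MINUTES, ChronoUnit.HOURS,
            ChronoUnit.DAYS, ChronoUnit.MONTHS, ChronoUnit.YEARS
        };
        for (int u = 0; u < units.length; u++) {
            int pos = formatter.indexOf(letters.charAt(u));
            if (pos >= 0 && mask.charAt(pos) == 'V') {
                return units[u];
            }
        }
        throw new IllegalArgumentException("No time field is driven by a partition key");
    }

    public static void main(String[] args) {
        System.out.println(inferStep("$dt $hr_min:00", "yyyyMMdd HH:mm:ss",
                Map.of("dt", "20250101", "hr_min", "10:05")));
    }
}
```

In this sketch the seconds field is always the literal `00`, so it is skipped and the minutes field, which is filled from a partition key, determines the step. A step larger than one unit (for example, five-minute partitions) cannot be recovered from the pattern alone and would need extra configuration or sampling of existing partitions.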

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Labels: enhancement (New feature or request)