
[Feature][CDCSOURCE] Supports filter data #2388

@stdnt-xiao


Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

I expect CDCSOURCE to support data filtering so that bidirectional synchronization between multi-center databases becomes possible. Filtering would prevent data loops, where each center endlessly re-syncs changes that originated from the other, as described in:
https://www.confluent.io/blog/sync-databases-and-remove-silos-with-kafka-cdc/

My current idea is to extend the parameters of the 'flink-kafka-connector' and filter records against the 'sink.filter.pattern' rule inside the 'write' function.

Are there any better suggestions?
Thanks.

EXECUTE CDCSOURCE jobname WITH (
'connector' = 'mysql-cdc',
'hostname' = '127.0.0.1',
'port' = '3306',
'username' = 'dlink',
'password' = 'dlink',
'checkpoint' = '3000',
'scan.startup.mode' = 'initial',
'parallelism' = '1',
'table-name' = 'test.student,test.score',
'sink.connector'='datastream-kafka',
'sink.brokers'='127.0.0.1:9092',
'sink.filter.pattern' = '$[?(@.SRC == "SQLSRV")]',
'sink.filter.model' = 'exclude'
)
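To illustrate the intended semantics, here is a minimal sketch of how the proposed 'sink.filter.model' = 'exclude' behavior could work before records are written to the sink. The JSONPath expression '$[?(@.SRC == "SQLSRV")]' is approximated by a plain predicate on the record; all function and parameter names here are hypothetical, not part of any existing connector API.

```python
# Sketch of the proposed filter step applied before writing CDC records to the sink.
# 'matches_pattern' stands in for evaluating 'sink.filter.pattern' against one record;
# a real implementation would use a JSONPath engine. Names are hypothetical.

def matches_pattern(record: dict) -> bool:
    # Approximates the JSONPath filter $[?(@.SRC == "SQLSRV")].
    return record.get("SRC") == "SQLSRV"

def apply_filter(records, mode: str):
    """Keep or drop matching records according to 'sink.filter.model'."""
    if mode == "exclude":
        return [r for r in records if not matches_pattern(r)]
    if mode == "include":
        return [r for r in records if matches_pattern(r)]
    return list(records)

records = [
    {"id": 1, "SRC": "SQLSRV", "name": "a"},  # originated from the other center
    {"id": 2, "SRC": "MYSQL", "name": "b"},   # local change, should be synced
]

# Exclude mode drops records tagged with the remote source, breaking the sync loop.
filtered = apply_filter(records, "exclude")
print(filtered)  # → [{'id': 2, 'SRC': 'MYSQL', 'name': 'b'}]
```

In exclude mode, records that already came from the other data center are dropped before reaching Kafka, which is exactly what prevents the bidirectional loop.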

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct
