From ea0aa2977b10d3c0c982610c815c4817fa72d1c1 Mon Sep 17 00:00:00 2001 From: shiyuhang <1136742008@qq.com> Date: Wed, 24 Dec 2025 18:26:12 +0800 Subject: [PATCH 1/8] add ovewview --- tidb-cloud/essential-changefeed-overview.md | 91 +++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100644 tidb-cloud/essential-changefeed-overview.md diff --git a/tidb-cloud/essential-changefeed-overview.md b/tidb-cloud/essential-changefeed-overview.md new file mode 100644 index 0000000000000..69140aba95a9f --- /dev/null +++ b/tidb-cloud/essential-changefeed-overview.md @@ -0,0 +1,91 @@ +--- +title: Changefeed +summary: TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. +--- + +# Changefeed + +TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. Currently, TiDB Cloud supports streaming data to Apache Kafka, MySQL, TiDB Cloud and cloud storage. + +> **Note:** +> +> - Currently, TiDB Cloud only allows up to 10 changefeeds per {{{ .essential }}} cluster. +> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) clusters, the changefeed feature is unavailable. + +## View the Changefeed page + +To access the changefeed feature, take the following steps: + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. + +2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Changefeed** in the left navigation pane. The changefeed page is displayed. + +On the **Changefeed** page, you can create a changefeed, view a list of existing changefeeds, and operate the existing changefeeds (such as scaling, pausing, resuming, editing, and deleting a changefeed). + +## Create a changefeed + +To create a changefeed, refer to the tutorials: + +- [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md) +- [Sink to MySQL](/tidb-cloud/changefeed-sink-to-mysql.md) + +## View a changefeed + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +2. Locate the corresponding changefeed you want to view, and click **...** > **View** in the **Action** column. +3. You can see the details of a changefeed, including its configuration, status, and metrics. + +## Pause or resume a changefeed + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB ckuster. +2. Locate the corresponding changefeed you want to pause or resume, and click **...** > **Pause/Resume** in the **Action** column. + +## Edit a changefeed + +> **Note:** +> +> TiDB Cloud currently only allows editing changefeeds in the paused status. + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +2. Locate the changefeed you want to pause, and click **...** > **Pause** in the **Action** column. +3. When the changefeed status changes to `Paused`, click **...** > **Edit** to edit the corresponding changefeed. + + TiDB Cloud populates the changefeed configuration by default. You can modify the following configurations: + + - Apache Kafka sink: all configurations except **Destination**, **Connection** and **Start Position** + - MySQL sink: all configurations except **Destination**, **Connection** and **Start Position** + +4. 
After editing the configuration, click **...** > **Resume** to resume the corresponding changefeed. + +## Duplicate a changefeed + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +2. Locate the changefeed that you want to duplicate. In the **Action** column, click **...** > **Duplicate**. +3. TiDB Cloud automatically populates the new changefeed configuration with the original settings. You can review and modify the configuration as needed. +4. After confirming the configuration, click **Submit** to create and start the new changefeed. + +## Delete a changefeed + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +2. Locate the corresponding changefeed you want to delete, and click **...** > **Delete** in the **Action** column. + +## Changefeed billing + +Free of charge during the beta period. + +## Changefeed states + +During the running process, changefeeds might fail with errors, or be manually paused or resumed. These behaviors can lead to changes of the changefeed state. + +The states are described as follows: + +- `CREATING`: the changefeed is being created. +- `CREATE_FAILED`: the changefeed creation fails. You need to delete the changefeed and create a new one. +- `RUNNING`: the changefeed runs normally and the checkpoint-ts proceeds normally. +- `PAUSED`: the changefeed is paused. +- `WARNING`: the changefeed returns a warning. The changefeed cannot continue due to some recoverable errors. The changefeed in this state keeps trying to resume until the state transfers to `RUNNING`. The changefeed in this state blocks [GC operations](https://docs.pingcap.com/tidb/stable/garbage-collection-overview). +- `RUNNING_FAILED`: the changefeed fails. Due to some errors, the changefeed cannot resume and cannot be recovered automatically. If the issues are resolved before the garbage collection (GC) of the incremental data, you can manually resume the failed changefeed. The default Time-To-Live (TTL) duration for incremental data is 24 hours, which means that the GC mechanism does not delete any data within 24 hours after the changefeed is interrupted. From 162e5fea444940166a6cd9d090bcb8ae3cca5f64 Mon Sep 17 00:00:00 2001 From: shiyuhang <1136742008@qq.com> Date: Thu, 25 Dec 2025 12:29:13 +0800 Subject: [PATCH 2/8] add changefeed --- tidb-cloud/essential-changefeed-overview.md | 77 ++++++++++- .../essential-changefeed-sink-to-mysql.md | 126 ++++++++++++++++++ 2 files changed, 201 insertions(+), 2 deletions(-) create mode 100644 tidb-cloud/essential-changefeed-sink-to-mysql.md diff --git a/tidb-cloud/essential-changefeed-overview.md b/tidb-cloud/essential-changefeed-overview.md index 69140aba95a9f..a3963968e4b1f 100644 --- a/tidb-cloud/essential-changefeed-overview.md +++ b/tidb-cloud/essential-changefeed-overview.md @@ -30,26 +30,66 @@ On the **Changefeed** page, you can create a changefeed, view a list of existing To create a changefeed, refer to the tutorials: -- [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md) -- [Sink to MySQL](/tidb-cloud/changefeed-sink-to-mysql.md) +- [Sink to Apache Kafka](/tidb-cloud/essential-changefeed-sink-to-apache-kafka.md) +- [Sink to MySQL](/tidb-cloud/essential-changefeed-sink-to-mysql.md) ## View a changefeed + +
+ 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. 2. Locate the corresponding changefeed you want to view, and click **...** > **View** in the **Action** column. 3. You can see the details of a changefeed, including its configuration, status, and metrics. +
+ +
+ +``` +ticloud serverless changefeed get -c --changefeed-id +``` + +
+
+ ## Pause or resume a changefeed + +
+ 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB ckuster. 2. Locate the corresponding changefeed you want to pause or resume, and click **...** > **Pause/Resume** in the **Action** column. +
+ +
+ +To pause a changefeed: + +``` +ticloud serverless changefeed pause -c --changefeed-id +``` + +To resume a changefeed: + +``` +ticloud serverless changefeed resume -c --changefeed-id +``` + +
+
+ + ## Edit a changefeed > **Note:** > > TiDB Cloud currently only allows editing changefeeds in the paused status. + +
+ 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. 2. Locate the changefeed you want to pause, and click **...** > **Pause** in the **Action** column. 3. When the changefeed status changes to `Paused`, click **...** > **Edit** to edit the corresponding changefeed. @@ -61,6 +101,25 @@ To create a changefeed, refer to the tutorials: 4. After editing the configuration, click **...** > **Resume** to resume the corresponding changefeed. +
+ +
+ +Edit a changefeed with Apache Kafka sink: + +``` +ticloud serverless changefeed edit -c --changefeed-id --name --kafka --filter +``` + +Edit a changefeed with MySQL sink: + +``` +ticloud serverless changefeed edit -c --changefeed-id --name --mysql --filter +``` + +
+
+ ## Duplicate a changefeed 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. @@ -70,9 +129,23 @@ To create a changefeed, refer to the tutorials: ## Delete a changefeed + +
+ 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. 2. Locate the corresponding changefeed you want to delete, and click **...** > **Delete** in the **Action** column. +
+ +
+ +``` +ticloud serverless changefeed delete -c --changefeed-id +``` + +
+
+ ## Changefeed billing Free of charge during the beta period. diff --git a/tidb-cloud/essential-changefeed-sink-to-mysql.md b/tidb-cloud/essential-changefeed-sink-to-mysql.md new file mode 100644 index 0000000000000..30f69c4eab429 --- /dev/null +++ b/tidb-cloud/essential-changefeed-sink-to-mysql.md @@ -0,0 +1,126 @@ +--- +title: Sink to MySQL +summary: This document explains how to stream data from TiDB Cloud to MySQL using the Sink to MySQL changefeed. It includes restrictions, prerequisites, and steps to create a MySQL sink for data replication. The process involves setting up network connections, loading existing data to MySQL, and creating target tables in MySQL. After completing the prerequisites, users can create a MySQL sink to replicate data to MySQL. +--- + +# Sink to MySQL + +This document describes how to stream data from TiDB Cloud to MySQL using the **Sink to MySQL** changefeed. + +## Restrictions + +- For each TiDB Cloud cluster, you can create up to 10 changefeeds. +- Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). +- If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. + +## Prerequisites + +Before creating a changefeed, you need to complete the following prerequisites: + +- Set up your network connection +- Export and load the existing data to MySQL (optional) +- Create corresponding target tables in MySQL if you do not load the existing data and only want to replicate incremental data to MySQL + +### Network + +Make sure that your TiDB Cloud cluster can connect to the MySQL service. + + +
+ +If your MySQL service can be accessed over the public network, you can choose to connect to MySQL through a public IP or domain name. + +
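+Before you create the changefeed, you can optionally confirm that the public endpoint is reachable from a machine that has network access to your MySQL service. The following is a minimal check only; the host name and user are placeholders, not values from this guide:
+
+```shell
+# Quick connectivity check against the downstream MySQL endpoint
+mysql -h mysql.example.com -P 3306 -u cdc_user -p -e "SELECT VERSION();"
+```
+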
+ +
+ +Private link connection leverage **Private Link** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. + +You can connect your TiDB Cloud cluster to your MySQL service securely through a private link connection. If the private link connection is not available for your MySQL service, follow [Connect to Amazon RDS via a Private Link Connection](/tidbcloud/serverless-private-link-connection-to-aws-rds.md) or [Connect to Alibaba Cloud ApsaraDB RDS for MySQL via a Private Link Connection](/tidbcloud/serverless-private-link-connection-to-alicloud-rds.md) to create one. + +
+ +
+ +### Load existing data (optional) + +The **Sink to MySQL** connector can only sink incremental data from your TiDB Cloud cluster to MySQL after a certain timestamp. If you already have data in your TiDB Cloud cluster, you can export and load the existing data of your TiDB Cloud cluster into MySQL before enabling **Sink to MySQL**. + +To load the existing data: + +1. Extend the [tidb_gc_life_time](https://docs.pingcap.com/tidb/stable/system-variables#tidb_gc_life_time-new-in-v50) to be longer than the total time of the following two operations, so that historical data during the time is not garbage collected by TiDB. + + - The time to export and import the existing data + - The time to create **Sink to MySQL** + + For example: + + {{< copyable "sql" >}} + + ```sql + SET GLOBAL tidb_gc_life_time = '72h'; + ``` + +2. Use [Export](/tidb-cloud/serverless-export.md) to export data from your TiDB Cloud cluster, then use community tools such as [mydumper/myloader](https://centminmod.com/mydumper.html) to load data to the MySQL service. + +3. Use the snapshot time of [Export](/tidb-cloud/serverless-export.md) as the start position of MySQL sink. + +### Create target tables in MySQL + +If you do not load the existing data, you need to create corresponding target tables in MySQL manually to store the incremental data from TiDB. Otherwise, the data will not be replicated. + +## Create a MySQL sink + +After completing the prerequisites, you can sink your data to MySQL. + +1. Navigate to the overview page of the target TiDB Cloud cluster, and then click **Data** > **Changefeed** in the left navigation pane. + +2. Click **Create Changefeed**, and select **MySQL** as **Destination**. + +3. In **Connectivity Method**, choose the method to connect to your MySQL service. + + - If you choose **Public**, fill in your MySQL endpoint. + - If you choose **Private Link**, select the private link connection that you created in the [Network](#network) section, and then fill in the MySQL port for your MySQL service. + +4. In **Authentication**, fill in the MySQL user name, password and TLS Encryption of your MySQL service. TiDB Cloud does not support self-signed certificates for MySQL TLS connections currently. + +5. Click **Next** to test whether TiDB can connect to MySQL successfully: + + - If yes, you are directed to the next step of configuration. + - If not, a connectivity error is displayed, and you need to handle the error. After the error is resolved, click **Next** again. + +6. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](https://docs.pingcap.com/tidb/stable/table-filter/#syntax). + + - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. + - **Replication Scope**: you can choose to only replicate tables with valid keys or replicate all selected tables. + - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click `apply`, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under the `Filter results`. + - **Filter results with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. + - **Filter results without valid keys**: this column shows tables that lack primary keys or unique keys. 
These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. + +7. Customize **Event Filter** to filter the events that you want to replicate. + + - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. + - **Event Filter**: you can choose the events you want to ingnore. + +8. In **Start Replication Position**, configure the starting position for your MySQL sink. + + - If you have [loaded the existing data](#load-existing-data-optional) using Export, select **From Time** and fill in the snapshot time that you get from Export. Pay attention the time zone. + - If you do not have any data in the upstream TiDB cluster, select **Start replication from now on**. + +9. Click **Next** to configure your changefeed specification. + + - In the **Changefeed Name** area, specify a name for the changefeed. + +10. If you confirm that all configurations are correct, click **Submit**. If you want to modify some configurations, click **Previous** to go back to the previous configuration page. + +11. The sink starts soon, and you can see the status of the sink changes from **Creating** to **Running**. + + Click the changefeed name, and you can see more details about the changefeed, such as the checkpoint, replication latency, and other metrics. + +12. If you have [loaded the existing data](#load-existing-data-optional) using Export, you need to restore the GC time to its original value (the default value is `10m`) after the sink is created: + +{{< copyable "sql" >}} + +```sql +SET GLOBAL tidb_gc_life_time = '10m'; +``` From 89486509a3870fc2712ccce935181ea1bd449c2a Mon Sep 17 00:00:00 2001 From: shiyuhang <1136742008@qq.com> Date: Thu, 25 Dec 2025 18:22:59 +0800 Subject: [PATCH 3/8] add kafka --- .../essential-changefeed-sink-to-kafka.md | 353 ++++++++++++++++++ .../essential-changefeed-sink-to-mysql.md | 2 +- 2 files changed, 354 insertions(+), 1 deletion(-) create mode 100644 tidb-cloud/essential-changefeed-sink-to-kafka.md diff --git a/tidb-cloud/essential-changefeed-sink-to-kafka.md b/tidb-cloud/essential-changefeed-sink-to-kafka.md new file mode 100644 index 0000000000000..a5fdb5a283f14 --- /dev/null +++ b/tidb-cloud/essential-changefeed-sink-to-kafka.md @@ -0,0 +1,353 @@ +--- +title: Sink to Apache Kafka +summary: This document explains how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. It includes restrictions, prerequisites, and steps to configure the changefeed for Apache Kafka. The process involves setting up network connections, adding permissions for Kafka ACL authorization, and configuring the changefeed specification. +--- + +# Sink to Apache Kafka + +This document describes how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. + + + +> **Note:** +> +> - To use the changefeed feature, make sure that your TiDB Cloud Dedicated cluster version is v6.1.3 or later. 
+> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. + + + + +> **Note:** +> +> For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. + + + +## Restrictions + +- For each TiDB Cloud clusterinstance, you can create up to 100 changefeeds. +- Currently, TiDB Cloud does not support uploading self-signed TLS certificates to connect to Kafka brokers. +- Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). +- If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. + + + +- If you choose Private Link or Private Service Connect as the network connectivity method, ensure that your TiDB cluster version meets the following requirements: + + - For v6.5.x: version v6.5.9 or later + - For v7.1.x: version v7.1.4 or later + - For v7.5.x: version v7.5.1 or later + - For v8.1.x: all versions of v8.1.x and later are supported +- If you want to use Debezium as your data format, make sure the version of your TiDB cluster is v8.1.0 or later. +- For the partition distribution of Kafka messages, note the following: + + - If you want to distribute changelogs by primary key or index value to Kafka partition with a specified index name, make sure the version of your TiDB cluster is v7.5.0 or later. + - If you want to distribute changelogs by column value to Kafka partition, make sure the version of your TiDB cluster is v7.5.0 or later. + + + +## Prerequisites + +Before creating a changefeed to stream data to Apache Kafka, you need to complete the following prerequisites: + +- Set up your network connection +- Add permissions for Kafka ACL authorization + +### Network + +Ensure that your TiDB clusterinstance can connect to the Apache Kafka service. You can choose one of the following connection methods: + +- Private Connect: ideal for avoiding VPC CIDR conflicts and meeting security compliance, but incurs additional [Private Data Link Cost](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md#private-data-link-cost). +- VPC Peering: suitable as a cost-effective option, but requires managing potential VPC CIDR conflicts and security considerations. +- Public IP: suitable for a quick setup. + + + + +
+ +Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. + +TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. For a detailed example, see [Set Up Self-Hosted Kafka Private Service Connect by Kafka-proxy in Google Cloud](/tidb-cloud/setup-self-hosted-kafka-private-service-connect.md#set-up-self-hosted-kafka-private-service-connect-by-kafka-proxy). This setup is similar across all Kafka SaaS services. + +- If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md) to create a private endpoint. +- If your Apache Kafka service is hosted on Google Cloud, follow [Set Up Self-Hosted Kafka Private Service Connect in Google Cloud](/tidb-cloud/setup-self-hosted-kafka-private-service-connect.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md) to create a private endpoint. +- If your Apache Kafka service is hosted on Azure, follow [Set Up Self-Hosted Kafka Private Link Service in Azure](/tidb-cloud/setup-azure-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md) to create a private endpoint. + +
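+For the kafka-proxy approach mentioned above, the following is a minimal sketch that exposes one broker through a local port. The broker address and ports are placeholders, and a real deployment typically needs one mapping per broker plus the cloud provider's private endpoint in front of the proxy:
+
+```shell
+# Map a remote Kafka bootstrap broker to a local listener that the private endpoint can expose
+kafka-proxy server --bootstrap-server-mapping "kafka-broker-1.internal:9092,0.0.0.0:9092"
+```
+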
+
+ +If your Apache Kafka service is in an AWS VPC that has no internet access, take the following steps: + +1. [Set up a VPC peering connection](/tidb-cloud/set-up-vpc-peering-connections.md) between the VPC of the Apache Kafka service and your TiDB cluster. +2. Modify the inbound rules of the security group that the Apache Kafka service is associated with. + + You must add the CIDR of the region where your TiDB Cloud cluster is located to the inbound rules. The CIDR can be found on the **VPC Peering** page. Doing so allows the traffic to flow from your TiDB cluster to the Kafka brokers. + +3. If the Apache Kafka URL contains hostnames, you need to allow TiDB Cloud to be able to resolve the DNS hostnames of the Apache Kafka brokers. + + 1. Follow the steps in [Enable DNS resolution for a VPC peering connection](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-dns.html). + 2. Enable the **Accepter DNS resolution** option. + +If your Apache Kafka service is in a Google Cloud VPC that has no internet access, take the following steps: + +1. [Set up a VPC peering connection](/tidb-cloud/set-up-vpc-peering-connections.md) between the VPC of the Apache Kafka service and your TiDB cluster. +2. Modify the ingress firewall rules of the VPC where Apache Kafka is located. + + You must add the CIDR of the region where your TiDB Cloud cluster is located to the ingress firewall rules. The CIDR can be found on the **VPC Peering** page. Doing so allows the traffic to flow from your TiDB cluster to the Kafka brokers. + +
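+As an illustration of the AWS inbound rule described above, you can allow the TiDB Cloud region CIDR to reach the Kafka broker port with the AWS CLI. The security group ID, port, and CIDR below are placeholders; use the CIDR shown on the **VPC Peering** page and the port that your brokers actually listen on:
+
+```shell
+# Allow inbound Kafka traffic from the TiDB Cloud region CIDR
+aws ec2 authorize-security-group-ingress \
+    --group-id sg-0123456789abcdef0 \
+    --protocol tcp \
+    --port 9092 \
+    --cidr 10.250.0.0/16
+```
+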
+
+ +If you want to provide Public IP access to your Apache Kafka service, assign Public IP addresses to all your Kafka brokers. + +It is **NOT** recommended to use Public IP in a production environment. + +
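+When brokers are exposed publicly, each broker also needs to advertise an address that is reachable from outside your VPC. The following `server.properties` snippet is a minimal sketch with placeholder values; adjust the listener name, address, and port to match your deployment:
+
+```properties
+# Listen on all interfaces and advertise the broker's public address
+listeners=PLAINTEXT://0.0.0.0:9092
+advertised.listeners=PLAINTEXT://kafka-broker-1.example.com:9092
+```
+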
+
+
+ + + + +
+ +Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. + +To create a private endpoint for changefeeds in your {{{ .premium }}} instances, follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md). + +TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. + +If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md) to create a private endpoint. + +
+
+ +If you want to provide Public IP access to your Apache Kafka service, assign Public IP addresses to all your Kafka brokers. + +It is **NOT** recommended to use Public IP in a production environment. + +
+ +
+ +Currently, the VPC Peering feature for {{{ .premium }}} instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for {{{ .premium }}} instance" in the **Description** field and click **Submit**. + +
+
+
+ +### Kafka ACL authorization + +To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka topics automatically, ensure that the following permissions are added in Kafka: + +- The `Create` and `Write` permissions are added for the topic resource type in Kafka. +- The `DescribeConfigs` permission is added for the cluster resource type in Kafka. + +For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in Confluent documentation for more information. + +## Step 1. Open the Changefeed page for Apache Kafka + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com). +2. Navigate to the overview page of the target TiDB clusterinstance, and then click **Data** > **Changefeed** in the left navigation pane. +3. Click **Create Changefeed**, and select **Kafka** as **Destination**. + +## Step 2. Configure the changefeed target + +The steps vary depending on the connectivity method you select. + + +
+ +1. In **Connectivity Method**, select **VPC Peering** or **Public IP**, fill in your Kafka brokers endpoints. You can use commas `,` to separate multiple endpoints. +2. Select an **Authentication** option according to your Kafka authentication configuration. + + - If your Kafka does not require authentication, keep the default option **Disable**. + - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. + +3. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. +4. Select a **Compression** type for the data in this changefeed. +5. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. +6. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. + +
+
+ +1. In **Connectivity Method**, select **Private Link**. +2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. Make sure the AZs of the private endpoint match the AZs of the Kafka deployment. +3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports. +4. Select an **Authentication** option according to your Kafka authentication configuration. + + - If your Kafka does not require authentication, keep the default option **Disable**. + - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. +5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. +6. Select a **Compression** type for the data in this changefeed. +7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. +8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. + +
+ + +
+ +1. In **Connectivity Method**, select **Private Link**. +2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. Make sure the AZs of the private endpoint match the AZs of the Kafka deployment. +3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports. +4. Select an **Authentication** option according to your Kafka authentication configuration. + + - If your Kafka does not require authentication, keep the default option **Disable**. + - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. +5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. +6. Select a **Compression** type for the data in this changefeed. +7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. +8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. + +
+
+ + +
+ +1. In **Connectivity Method**, select **Private Service Connect**. +2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. +3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you provide more than one port. You can use commas `,` to separate multiple ports. +4. Select an **Authentication** option according to your Kafka authentication configuration. + + - If your Kafka does not require authentication, keep the default option **Disable**. + - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. +5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. +6. Select a **Compression** type for the data in this changefeed. +7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. +8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. +9. TiDB Cloud creates the endpoint for **Private Service Connect**, which might take several minutes. +10. Once the endpoint is created, log in to your cloud provider console and accept the connection request. +11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds. + +
+
+ + +
+ +1. In **Connectivity Method**, select **Private Link**. +2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. +3. Fill in the **Bootstrap Ports** that you obtained in the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports. +4. Select an **Authentication** option according to your Kafka authentication configuration. + + - If your Kafka does not require authentication, keep the default option **Disable**. + - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. +5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. +6. Select a **Compression** type for the data in this changefeed. +7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. +8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. +9. TiDB Cloud creates the endpoint for **Private Link**, which might take several minutes. +10. Once the endpoint is created, log in to the [Azure portal](https://portal.azure.com/) and accept the connection request. +11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds. + +
+
+
+ +## Step 3. Set the changefeed + +1. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](/table-filter.md). + + - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. + - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules in the box on the right. You can add up to 100 filter rules. + - **Tables with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. + - **Tables without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. + +2. Customize **Event Filter** to filter the events that you want to replicate. + + - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. You can add up to 10 event filter rules per changefeed. + - **Event Filter**: you can use the following event filters to exclude specific events from the changefeed: + - **Ignore event**: excludes specified event types. + - **Ignore SQL**: excludes DDL events that match specified expressions. For example, `^drop` excludes statements starting with `DROP`, and `add column` excludes statements containing `ADD COLUMN`. + - **Ignore insert value expression**: excludes `INSERT` statements that meet specific conditions. For example, `id >= 100` excludes `INSERT` statements where `id` is greater than or equal to 100. + - **Ignore update new value expression**: excludes `UPDATE` statements where the new value matches a specified condition. For example, `gender = 'male'` excludes updates that result in `gender` being `male`. + - **Ignore update old value expression**: excludes `UPDATE` statements where the old value matches a specified condition. For example, `age < 18` excludes updates where the old value of `age` is less than 18. + - **Ignore delete value expression**: excludes `DELETE` statements that meet a specified condition. For example, `name = 'john'` excludes `DELETE` statements where `name` is `'john'`. + +3. Customize **Column Selector** to select columns from events and send only the data changes related to those columns to the downstream. + + - **Tables matching**: specify which tables the column selector applies to. For tables that do not match any rule, all columns are sent. + - **Column Selector**: specify which columns of the matched tables will be sent to the downstream. + + For more information about the matching rules, see [Column selectors](https://docs.pingcap.com/tidb/stable/ticdc-sink-to-kafka/#column-selectors). + +4. In the **Data Format** area, select your desired format of Kafka messages. 
+ + - Avro is a compact, fast, and binary data format with rich data structures, which is widely used in various flow systems. For more information, see [Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol). + - Canal-JSON is a plain JSON text format, which is easy to parse. For more information, see [Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json). + - Open Protocol is a row-level data change notification protocol that provides data sources for monitoring, caching, full-text indexing, analysis engines, and primary-secondary replication between different databases. For more information, see [Open Protocol data format](https://docs.pingcap.com/tidb/stable/ticdc-open-protocol). + - Debezium is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. For more information, see [Debezium data format](https://docs.pingcap.com/tidb/stable/ticdc-debezium). + +5. Enable the **TiDB Extension** option if you want to add TiDB-extension fields to the Kafka message body. + + For more information about TiDB-extension fields, see [TiDB extension fields in Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol#tidb-extension-fields) and [TiDB extension fields in Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json#tidb-extension-field). + +6. If you select **Avro** as your data format, you will see some Avro-specific configurations on the page. You can fill in these configurations as follows: + + - In the **Decimal** and **Unsigned BigInt** configurations, specify how TiDB Cloud handles the decimal and unsigned bigint data types in Kafka messages. + - In the **Schema Registry** area, fill in your schema registry endpoint. If you enable **HTTP Authentication**, the fields for user name and password are displayed and automatically filled in with your TiDB clusterinstance endpoint and password. + +7. In the **Topic Distribution** area, select a distribution mode, and then fill in the topic name configurations according to the mode. + + If you select **Avro** as your data format, you can only choose the **Distribute changelogs by table to Kafka Topics** mode in the **Distribution Mode** drop-down list. + + The distribution mode controls how the changefeed creates Kafka topics, by table, by database, or creating one topic for all changelogs. + + - **Distribute changelogs by table to Kafka Topics** + + If you want the changefeed to create a dedicated Kafka topic for each table, choose this mode. Then, all Kafka messages of a table are sent to a dedicated Kafka topic. You can customize topic names for tables by setting a topic prefix, a separator between a database name and table name, and a suffix. For example, if you set the separator as `_`, the topic names are in the format of `_`. + + For changelogs of non-row events, such as Create Schema Event, you can specify a topic name in the **Default Topic Name** field. The changefeed will create a topic accordingly to collect such changelogs. + + - **Distribute changelogs by database to Kafka Topics** + + If you want the changefeed to create a dedicated Kafka topic for each database, choose this mode. Then, all Kafka messages of a database are sent to a dedicated Kafka topic. You can customize topic names of databases by setting a topic prefix and a suffix. 
+ + For changelogs of non-row events, such as Resolved Ts Event, you can specify a topic name in the **Default Topic Name** field. The changefeed will create a topic accordingly to collect such changelogs. + + - **Send all changelogs to one specified Kafka Topic** + + If you want the changefeed to create one Kafka topic for all changelogs, choose this mode. Then, all Kafka messages in the changefeed will be sent to one Kafka topic. You can define the topic name in the **Topic Name** field. + +8. In the **Partition Distribution** area, you can decide which partition a Kafka message will be sent to. You can define **a single partition dispatcher for all tables**, or **different partition dispatchers for different tables**. TiDB Cloud provides four types of dispatchers: + + - **Distribute changelogs by primary key or index value to Kafka partition** + + If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The primary key or index value of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures row-level orderliness. + + - **Distribute changelogs by table to Kafka partition** + + If you want the changefeed to send Kafka messages of a table to one Kafka partition, choose this distribution method. The table name of a row changelog will determine which partition the changelog is sent to. This distribution method ensures table orderliness but might cause unbalanced partitions. + + - **Distribute changelogs by timestamp to Kafka partition** + + If you want the changefeed to send Kafka messages to different Kafka partitions randomly, choose this distribution method. The commitTs of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures orderliness in each partition. However, multiple changes of a data item might be sent to different partitions and the consumer progress of different consumers might be different, which might cause data inconsistency. Therefore, the consumer needs to sort the data from multiple partitions by commitTs before consuming. + + - **Distribute changelogs by column value to Kafka partition** + + If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The specified column values of a row changelog will determine which partition the changelog is sent to. This distribution method ensures orderliness in each partition and guarantees that the changelog with the same column values is send to the same partition. + +9. In the **Topic Configuration** area, configure the following numbers. The changefeed will automatically create the Kafka topics according to the numbers. + + - **Replication Factor**: controls how many Kafka servers each Kafka message is replicated to. The valid value ranges from [`min.insync.replicas`](https://kafka.apache.org/33/documentation.html#brokerconfigs_min.insync.replicas) to the number of Kafka brokers. + - **Partition Number**: controls how many partitions exist in a topic. The valid value range is `[1, 10 * the number of Kafka brokers]`. + +10. In the **Split Event** area, choose whether to split `UPDATE` events into separate `DELETE` and `INSERT` events or keep as raw `UPDATE` events. 
For more information, see [Split primary or unique key UPDATE events for non-MySQL sinks](https://docs.pingcap.com/tidb/stable/ticdc-split-update-behavior/#split-primary-or-unique-key-update-events-for-non-mysql-sinks). + +11. Click **Next**. + +## Step 4. Configure your changefeed specification + +1. In the **Changefeed Specification** area, specify the number of Replication Capacity Units (RCUs)Changefeed Capacity Units (CCUs) to be used by the changefeed. +2. In the **Changefeed Name** area, specify a name for the changefeed. +3. Click **Next** to check the configurations you set and go to the next page. + +## Step 5. Review the configurations + +On this page, you can review all the changefeed configurations that you set. + +If you find any error, you can go back to fix the error. If there is no error, you can click the check box at the bottom, and then click **Create** to create the changefeed. diff --git a/tidb-cloud/essential-changefeed-sink-to-mysql.md b/tidb-cloud/essential-changefeed-sink-to-mysql.md index 30f69c4eab429..9d1e32791686f 100644 --- a/tidb-cloud/essential-changefeed-sink-to-mysql.md +++ b/tidb-cloud/essential-changefeed-sink-to-mysql.md @@ -91,9 +91,9 @@ After completing the prerequisites, you can sink your data to MySQL. 6. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](https://docs.pingcap.com/tidb/stable/table-filter/#syntax). - - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. - **Replication Scope**: you can choose to only replicate tables with valid keys or replicate all selected tables. - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click `apply`, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under the `Filter results`. + - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. - **Filter results with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. - **Filter results without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. 
From 8fbe9ce6ddea620141672da7dae117d67bc9c6d5 Mon Sep 17 00:00:00 2001 From: shiyuhang <1136742008@qq.com> Date: Thu, 25 Dec 2025 18:32:21 +0800 Subject: [PATCH 4/8] add toc --- TOC-tidb-cloud-essential.md | 4 +++ .../essential-changefeed-sink-to-kafka.md | 36 ++----------------- 2 files changed, 6 insertions(+), 34 deletions(-) diff --git a/TOC-tidb-cloud-essential.md b/TOC-tidb-cloud-essential.md index b808ca5be5e24..7cc0d54fc2051 100644 --- a/TOC-tidb-cloud-essential.md +++ b/TOC-tidb-cloud-essential.md @@ -232,6 +232,10 @@ - [CSV Configurations for Importing Data](/tidb-cloud/csv-config-for-import-data.md) - [Troubleshoot Access Denied Errors during Data Import from Amazon S3](/tidb-cloud/troubleshoot-import-access-denied-error.md) - [Connect AWS DMS to TiDB Cloud clusters](/tidb-cloud/tidb-cloud-connect-aws-dms.md) +- Stream Data + - [Changefeed Overview](/tidb-cloud/essential-changefeed-overview.md) + - [To MySQL Sink](/tidb-cloud/essential-changefeed-sink-to-mysql.md) + - [To Kafka Sink](/tidb-cloud/essential-changefeed-sink-to-kafka.md) - Vector Search ![BETA](/media/tidb-cloud/blank_transparent_placeholder.png) - [Overview](/vector-search/vector-search-overview.md) - Get Started diff --git a/tidb-cloud/essential-changefeed-sink-to-kafka.md b/tidb-cloud/essential-changefeed-sink-to-kafka.md index a5fdb5a283f14..86762f9743859 100644 --- a/tidb-cloud/essential-changefeed-sink-to-kafka.md +++ b/tidb-cloud/essential-changefeed-sink-to-kafka.md @@ -7,45 +7,13 @@ summary: This document explains how to create a changefeed to stream data from T This document describes how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. - - -> **Note:** -> -> - To use the changefeed feature, make sure that your TiDB Cloud Dedicated cluster version is v6.1.3 or later. -> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. - - - - -> **Note:** -> -> For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. - - - ## Restrictions -- For each TiDB Cloud clusterinstance, you can create up to 100 changefeeds. +- For each TiDB Cloud cluster, you can create up to 10 changefeeds. - Currently, TiDB Cloud does not support uploading self-signed TLS certificates to connect to Kafka brokers. - Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). - If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. - - -- If you choose Private Link or Private Service Connect as the network connectivity method, ensure that your TiDB cluster version meets the following requirements: - - - For v6.5.x: version v6.5.9 or later - - For v7.1.x: version v7.1.4 or later - - For v7.5.x: version v7.5.1 or later - - For v8.1.x: all versions of v8.1.x and later are supported -- If you want to use Debezium as your data format, make sure the version of your TiDB cluster is v8.1.0 or later. 
-- For the partition distribution of Kafka messages, note the following: - - - If you want to distribute changelogs by primary key or index value to Kafka partition with a specified index name, make sure the version of your TiDB cluster is v7.5.0 or later. - - If you want to distribute changelogs by column value to Kafka partition, make sure the version of your TiDB cluster is v7.5.0 or later. - - - ## Prerequisites Before creating a changefeed to stream data to Apache Kafka, you need to complete the following prerequisites: @@ -55,7 +23,7 @@ Before creating a changefeed to stream data to Apache Kafka, you need to complet ### Network -Ensure that your TiDB clusterinstance can connect to the Apache Kafka service. You can choose one of the following connection methods: +Ensure that your TiDB Cloud cluster can connect to the Apache Kafka service. You can choose one of the following connection methods: - Private Connect: ideal for avoiding VPC CIDR conflicts and meeting security compliance, but incurs additional [Private Data Link Cost](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md#private-data-link-cost). - VPC Peering: suitable as a cost-effective option, but requires managing potential VPC CIDR conflicts and security considerations. From a3ecf3ea82acc9a18965390d4d9409d052adaa02 Mon Sep 17 00:00:00 2001 From: shiyuhang <1136742008@qq.com> Date: Fri, 26 Dec 2025 15:18:02 +0800 Subject: [PATCH 5/8] add kafka --- tidb-cloud/essential-changefeed-overview.md | 2 +- .../essential-changefeed-sink-to-kafka.md | 188 +++--------------- 2 files changed, 33 insertions(+), 157 deletions(-) diff --git a/tidb-cloud/essential-changefeed-overview.md b/tidb-cloud/essential-changefeed-overview.md index a3963968e4b1f..912186477d584 100644 --- a/tidb-cloud/essential-changefeed-overview.md +++ b/tidb-cloud/essential-changefeed-overview.md @@ -5,7 +5,7 @@ summary: TiDB Cloud changefeed helps you stream data from TiDB Cloud to other da # Changefeed -TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. Currently, TiDB Cloud supports streaming data to Apache Kafka, MySQL, TiDB Cloud and cloud storage. +TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. Currently, TiDB Cloud supports streaming data to Apache Kafka and MySQL. > **Note:** > diff --git a/tidb-cloud/essential-changefeed-sink-to-kafka.md b/tidb-cloud/essential-changefeed-sink-to-kafka.md index 86762f9743859..5f548e4b3d595 100644 --- a/tidb-cloud/essential-changefeed-sink-to-kafka.md +++ b/tidb-cloud/essential-changefeed-sink-to-kafka.md @@ -25,85 +25,32 @@ Before creating a changefeed to stream data to Apache Kafka, you need to complet Ensure that your TiDB Cloud cluster can connect to the Apache Kafka service. You can choose one of the following connection methods: -- Private Connect: ideal for avoiding VPC CIDR conflicts and meeting security compliance, but incurs additional [Private Data Link Cost](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md#private-data-link-cost). -- VPC Peering: suitable as a cost-effective option, but requires managing potential VPC CIDR conflicts and security considerations. -- Public IP: suitable for a quick setup. - - +- Public Access: suitable for a quick setup. +- Private Link Connection: meeting security compliance and ensuring network quality. -
- -Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. - -TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. For a detailed example, see [Set Up Self-Hosted Kafka Private Service Connect by Kafka-proxy in Google Cloud](/tidb-cloud/setup-self-hosted-kafka-private-service-connect.md#set-up-self-hosted-kafka-private-service-connect-by-kafka-proxy). This setup is similar across all Kafka SaaS services. - -- If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md) to create a private endpoint. -- If your Apache Kafka service is hosted on Google Cloud, follow [Set Up Self-Hosted Kafka Private Service Connect in Google Cloud](/tidb-cloud/setup-self-hosted-kafka-private-service-connect.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md) to create a private endpoint. -- If your Apache Kafka service is hosted on Azure, follow [Set Up Self-Hosted Kafka Private Link Service in Azure](/tidb-cloud/setup-azure-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md) to create a private endpoint. - -
-
- -If your Apache Kafka service is in an AWS VPC that has no internet access, take the following steps: - -1. [Set up a VPC peering connection](/tidb-cloud/set-up-vpc-peering-connections.md) between the VPC of the Apache Kafka service and your TiDB cluster. -2. Modify the inbound rules of the security group that the Apache Kafka service is associated with. - - You must add the CIDR of the region where your TiDB Cloud cluster is located to the inbound rules. The CIDR can be found on the **VPC Peering** page. Doing so allows the traffic to flow from your TiDB cluster to the Kafka brokers. +
-3. If the Apache Kafka URL contains hostnames, you need to allow TiDB Cloud to be able to resolve the DNS hostnames of the Apache Kafka brokers. +Private Link Connection leverages **Private Link** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. - 1. Follow the steps in [Enable DNS resolution for a VPC peering connection](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-dns.html). - 2. Enable the **Accepter DNS resolution** option. +TiDB Cloud currently supports Private Link Connection only for self-hosted Kafka and Confluent Cloud dedicated cluster. It does not support direct integration with MSK, or other Kafka SaaS services. -If your Apache Kafka service is in a Google Cloud VPC that has no internet access, take the following steps: +See the following instructions to set up a Private Link connection according to your Kafka deployment and cloud provider: -1. [Set up a VPC peering connection](/tidb-cloud/set-up-vpc-peering-connections.md) between the VPC of the Apache Kafka service and your TiDB cluster. -2. Modify the ingress firewall rules of the VPC where Apache Kafka is located. - - You must add the CIDR of the region where your TiDB Cloud cluster is located to the ingress firewall rules. The CIDR can be found on the **VPC Peering** page. Doing so allows the traffic to flow from your TiDB cluster to the Kafka brokers. - -
-
- -If you want to provide Public IP access to your Apache Kafka service, assign Public IP addresses to all your Kafka brokers. - -It is **NOT** recommended to use Public IP in a production environment. +- [Connect to Confluent Cloud via a Private Link Connection](/tidbcloud/serverless-private-link-connection-to-aws-confluent.md) +- [Connect to AWS Self-Hosted Kafka via Private Link Connection](/tidbcloud/serverless-private-link-connection-to-self-hosted-kafka-in-aws.md) +- [Connect to Alibaba Cloud Self-Hosted Kafka via a Private Link Connection](/tidbcloud/serverless-private-link-connection-to-self-hosted-kafka-in-alicloud.md)
- - - - - -
+
-Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. +If you want to provide Public access to your Apache Kafka service, assign Public IP addresses or domain names to all your Kafka brokers. -To create a private endpoint for changefeeds in your {{{ .premium }}} instances, follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint.md). - -TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. - -If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md) to create a private endpoint. - -
-
- -If you want to provide Public IP access to your Apache Kafka service, assign Public IP addresses to all your Kafka brokers. - -It is **NOT** recommended to use Public IP in a production environment. - -
- -
- -Currently, the VPC Peering feature for {{{ .premium }}} instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for {{{ .premium }}} instance" in the **Description** field and click **Submit**. +It is **NOT** recommended to use Public access in a production environment.
- ### Kafka ACL authorization @@ -117,7 +64,7 @@ For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources ## Step 1. Open the Changefeed page for Apache Kafka 1. Log in to the [TiDB Cloud console](https://tidbcloud.com). -2. Navigate to the overview page of the target TiDB clusterinstance, and then click **Data** > **Changefeed** in the left navigation pane. +2. Navigate to the overview page of the target TiDB Cloud cluster, and then click **Data** > **Changefeed** in the left navigation pane. 3. Click **Create Changefeed**, and select **Kafka** as **Destination**. ## Step 2. Configure the changefeed target @@ -125,9 +72,9 @@ For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources The steps vary depending on the connectivity method you select. -
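Before configuring the target, you may also want to grant the Kafka ACLs described in the preceding **Kafka ACL authorization** section. For self-hosted Kafka, a minimal sketch with the standard `kafka-acls.sh` tool is shown below; the bootstrap address, the `client.properties` file, and the `<changefeed-user>` principal are placeholders, and your cluster might use a different authorizer or principal format:

```shell
# Allow the changefeed account to create and write to topics.
kafka-acls.sh --bootstrap-server <broker-host>:9092 \
  --command-config client.properties \
  --add --allow-principal User:<changefeed-user> \
  --operation Create --operation Write \
  --topic '*'

# Allow the changefeed account to read cluster configuration.
kafka-acls.sh --bootstrap-server <broker-host>:9092 \
  --command-config client.properties \
  --add --allow-principal User:<changefeed-user> \
  --operation DescribeConfigs \
  --cluster
```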
+
-1. In **Connectivity Method**, select **VPC Peering** or **Public IP**, fill in your Kafka brokers endpoints. You can use commas `,` to separate multiple endpoints. +1. In **Connectivity Method**, select **Public**, fill in your Kafka brokers endpoints. You can use commas `,` to separate multiple endpoints. 2. Select an **Authentication** option according to your Kafka authentication configuration. - If your Kafka does not require authentication, keep the default option **Disable**. @@ -139,11 +86,11 @@ The steps vary depending on the connectivity method you select. 6. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page.
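Before clicking **Next**, you can optionally verify the broker endpoints and credentials from any host that can reach them, for example with `kcat` (formerly `kafkacat`). This is only a sketch; the endpoints, user name, and password are placeholders, and the `-X` options apply only if your Kafka requires SASL over TLS:

```shell
# List broker and topic metadata to confirm that the endpoints and credentials work.
kcat -L \
  -b broker1.example.com:9092,broker2.example.com:9092 \
  -X security.protocol=SASL_SSL \
  -X sasl.mechanisms=PLAIN \
  -X sasl.username=<user-name> \
  -X sasl.password=<password>
```

If your Kafka does not require authentication, keep only the `-L` and `-b` options.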
-
+

1. In **Connectivity Method**, select **Private Link**.
-2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. Make sure the AZs of the private endpoint match the AZs of the Kafka deployment.
-3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports.
+2. In **Private Link Connection**, select the private link connection that you created in the [Network](#network) section. Make sure the AZs of the private link connection match the AZs of the Kafka deployment.
+3. Fill in the **Bootstrap Port** that you obtained from the [Network](#network) section.
4. Select an **Authentication** option according to your Kafka authentication configuration.

    - If your Kafka does not require authentication, keep the default option **Disable**.
    - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication.
5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**.
6. Select a **Compression** type for the data in this changefeed.
7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection.
-8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page.
+8. Enter the **TLS Server Name** if your Kafka requires TLS SNI verification (for example, for Confluent Cloud Dedicated clusters).
+9. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page.

</div>
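If you are unsure which **TLS Server Name** to fill in, you can inspect the certificate that a broker presents for a given SNI name from a host that can reach the Kafka endpoint. This is a sketch only; the endpoint, port, and server name are placeholders:

```shell
# Print the subject of the certificate returned for the specified SNI name.
openssl s_client -connect <bootstrap-endpoint>:<port> \
  -servername <tls-server-name> </dev/null 2>/dev/null | openssl x509 -noout -subject
```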
- - -
- -1. In **Connectivity Method**, select **Private Link**. -2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. Make sure the AZs of the private endpoint match the AZs of the Kafka deployment. -3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports. -4. Select an **Authentication** option according to your Kafka authentication configuration. - - - If your Kafka does not require authentication, keep the default option **Disable**. - - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. -5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. -6. Select a **Compression** type for the data in this changefeed. -7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. -8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. - -
-
- - -
- -1. In **Connectivity Method**, select **Private Service Connect**. -2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. -3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you provide more than one port. You can use commas `,` to separate multiple ports. -4. Select an **Authentication** option according to your Kafka authentication configuration. - - - If your Kafka does not require authentication, keep the default option **Disable**. - - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. -5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. -6. Select a **Compression** type for the data in this changefeed. -7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. -8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. -9. TiDB Cloud creates the endpoint for **Private Service Connect**, which might take several minutes. -10. Once the endpoint is created, log in to your cloud provider console and accept the connection request. -11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds. - -
-
- - -
- -1. In **Connectivity Method**, select **Private Link**. -2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. -3. Fill in the **Bootstrap Ports** that you obtained in the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports. -4. Select an **Authentication** option according to your Kafka authentication configuration. - - - If your Kafka does not require authentication, keep the default option **Disable**. - - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. -5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. -6. Select a **Compression** type for the data in this changefeed. -7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. -8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. -9. TiDB Cloud creates the endpoint for **Private Link**, which might take several minutes. -10. Once the endpoint is created, log in to the [Azure portal](https://portal.azure.com/) and accept the connection request. -11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds. - -
-
## Step 3. Set the changefeed -1. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](/table-filter.md). +1. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](https://docs.pingcap.com/tidb/stable/table-filter/#syntax). + - **Replication Scope**: you can choose to only replicate tables with valid keys or replicate all selected tables. + - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click `apply`, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under the `Filter results`. - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. - - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules in the box on the right. You can add up to 100 filter rules. - - **Tables with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. - - **Tables without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. + - **Filter results with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. + - **Filter results without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. 2. Customize **Event Filter** to filter the events that you want to replicate. - - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. You can add up to 10 event filter rules per changefeed. - - **Event Filter**: you can use the following event filters to exclude specific events from the changefeed: - - **Ignore event**: excludes specified event types. - - **Ignore SQL**: excludes DDL events that match specified expressions. For example, `^drop` excludes statements starting with `DROP`, and `add column` excludes statements containing `ADD COLUMN`. - - **Ignore insert value expression**: excludes `INSERT` statements that meet specific conditions. For example, `id >= 100` excludes `INSERT` statements where `id` is greater than or equal to 100. 
- - **Ignore update new value expression**: excludes `UPDATE` statements where the new value matches a specified condition. For example, `gender = 'male'` excludes updates that result in `gender` being `male`. - - **Ignore update old value expression**: excludes `UPDATE` statements where the old value matches a specified condition. For example, `age < 18` excludes updates where the old value of `age` is less than 18. - - **Ignore delete value expression**: excludes `DELETE` statements that meet a specified condition. For example, `name = 'john'` excludes `DELETE` statements where `name` is `'john'`. + - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. + - **Event Filter**: you can choose the events you want to ingnore. 3. Customize **Column Selector** to select columns from events and send only the data changes related to those columns to the downstream. @@ -257,7 +140,7 @@ The steps vary depending on the connectivity method you select. 6. If you select **Avro** as your data format, you will see some Avro-specific configurations on the page. You can fill in these configurations as follows: - In the **Decimal** and **Unsigned BigInt** configurations, specify how TiDB Cloud handles the decimal and unsigned bigint data types in Kafka messages. - - In the **Schema Registry** area, fill in your schema registry endpoint. If you enable **HTTP Authentication**, the fields for user name and password are displayed and automatically filled in with your TiDB clusterinstance endpoint and password. + - In the **Schema Registry** area, fill in your schema registry endpoint. If you enable **HTTP Authentication**, the fields for user name and password are displayed to fill in. 7. In the **Topic Distribution** area, select a distribution mode, and then fill in the topic name configurations according to the mode. @@ -285,7 +168,7 @@ The steps vary depending on the connectivity method you select. - **Distribute changelogs by primary key or index value to Kafka partition** - If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The primary key or index value of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures row-level orderliness. + If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The primary key or index value of a row changelog will determine which partition the changelog is sent to. Keep the **Index Name** field empty if you want to use the primary key. This distribution method provides a better partition balance and ensures row-level orderliness. - **Distribute changelogs by table to Kafka partition** @@ -308,14 +191,7 @@ The steps vary depending on the connectivity method you select. 11. Click **Next**. -## Step 4. Configure your changefeed specification - -1. In the **Changefeed Specification** area, specify the number of Replication Capacity Units (RCUs)Changefeed Capacity Units (CCUs) to be used by the changefeed. -2. In the **Changefeed Name** area, specify a name for the changefeed. -3. Click **Next** to check the configurations you set and go to the next page. - -## Step 5. Review the configurations - -On this page, you can review all the changefeed configurations that you set. +## Step 4. 
Review and create your changefeed specification -If you find any error, you can go back to fix the error. If there is no error, you can click the check box at the bottom, and then click **Create** to create the changefeed. +1. In the **Changefeed Name** area, specify a name for the changefeed. +2. Review all the changefeed configurations that you set. Click **Previous** to go back to the previous configuration pages if you want to modify some configurations. Click **Submit** if all configurations are correct to create the changefeed. \ No newline at end of file From 90c8a4f904002ee0dc4ead1264f999b6c78517d4 Mon Sep 17 00:00:00 2001 From: shi yuhang <52435083+shiyuhang0@users.noreply.github.com> Date: Fri, 26 Dec 2025 15:44:13 +0800 Subject: [PATCH 6/8] Apply suggestions from code review Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- TOC-tidb-cloud-essential.md | 4 ++-- tidb-cloud/essential-changefeed-overview.md | 4 ++-- .../essential-changefeed-sink-to-kafka.md | 18 +++++++++--------- .../essential-changefeed-sink-to-mysql.md | 16 ++++++++-------- 4 files changed, 21 insertions(+), 21 deletions(-) diff --git a/TOC-tidb-cloud-essential.md b/TOC-tidb-cloud-essential.md index 7cc0d54fc2051..93a9d7893325e 100644 --- a/TOC-tidb-cloud-essential.md +++ b/TOC-tidb-cloud-essential.md @@ -234,8 +234,8 @@ - [Connect AWS DMS to TiDB Cloud clusters](/tidb-cloud/tidb-cloud-connect-aws-dms.md) - Stream Data - [Changefeed Overview](/tidb-cloud/essential-changefeed-overview.md) - - [To MySQL Sink](/tidb-cloud/essential-changefeed-sink-to-mysql.md) - - [To Kafka Sink](/tidb-cloud/essential-changefeed-sink-to-kafka.md) + - [Sink to MySQL](/tidb-cloud/essential-changefeed-sink-to-mysql.md) + - [Sink to Apache Kafka](/tidb-cloud/essential-changefeed-sink-to-kafka.md) - Vector Search ![BETA](/media/tidb-cloud/blank_transparent_placeholder.png) - [Overview](/vector-search/vector-search-overview.md) - Get Started diff --git a/tidb-cloud/essential-changefeed-overview.md b/tidb-cloud/essential-changefeed-overview.md index 912186477d584..9ed3b05fc242c 100644 --- a/tidb-cloud/essential-changefeed-overview.md +++ b/tidb-cloud/essential-changefeed-overview.md @@ -58,7 +58,7 @@ ticloud serverless changefeed get -c --changefeed-id
-1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB ckuster. +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. 2. Locate the corresponding changefeed you want to pause or resume, and click **...** > **Pause/Resume** in the **Action** column.
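The same operations are also available through the `ticloud serverless changefeed` CLI commands that this document references; a typical check-then-resume sequence is sketched below, where `<cluster-id>` and `<changefeed-id>` are placeholders:

```shell
# Check the current state of a changefeed.
ticloud serverless changefeed get -c <cluster-id> --changefeed-id <changefeed-id>

# Resume a paused changefeed.
ticloud serverless changefeed resume -c <cluster-id> --changefeed-id <changefeed-id>
```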
@@ -85,7 +85,7 @@ ticloud serverless changefeed resume -c --changefeed-id **Note:** > -> TiDB Cloud currently only allows editing changefeeds in the paused status. +> TiDB Cloud currently only allows editing changefeeds that are in the `Paused` state.
diff --git a/tidb-cloud/essential-changefeed-sink-to-kafka.md b/tidb-cloud/essential-changefeed-sink-to-kafka.md index 5f548e4b3d595..f578de8ce79e9 100644 --- a/tidb-cloud/essential-changefeed-sink-to-kafka.md +++ b/tidb-cloud/essential-changefeed-sink-to-kafka.md @@ -33,7 +33,7 @@ Ensure that your TiDB Cloud cluster can connect to the Apache Kafka service. You Private Link Connection leverages **Private Link** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. -TiDB Cloud currently supports Private Link Connection only for self-hosted Kafka and Confluent Cloud dedicated cluster. It does not support direct integration with MSK, or other Kafka SaaS services. +TiDB Cloud currently supports Private Link Connection only for self-hosted Kafka and Confluent Cloud Dedicated Cluster. It does not support direct integration with MSK, or other Kafka SaaS services. See the following instructions to set up a Private Link connection according to your Kafka deployment and cloud provider: @@ -45,9 +45,9 @@ See the following instructions to set up a Private Link connection according to
-If you want to provide Public access to your Apache Kafka service, assign Public IP addresses or domain names to all your Kafka brokers. +If you want to provide public access to your Apache Kafka service, assign public IP addresses or domain names to all your Kafka brokers. -It is **NOT** recommended to use Public access in a production environment. +It is not recommended to use public access in a production environment.
@@ -59,7 +59,7 @@ To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka - The `Create` and `Write` permissions are added for the topic resource type in Kafka. - The `DescribeConfigs` permission is added for the cluster resource type in Kafka. -For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in Confluent documentation for more information. +For example, if your Kafka cluster is in Confluent Cloud, refer to [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in the Confluent documentation for more information. ## Step 1. Open the Changefeed page for Apache Kafka @@ -74,7 +74,7 @@ The steps vary depending on the connectivity method you select.
-1. In **Connectivity Method**, select **Public**, fill in your Kafka brokers endpoints. You can use commas `,` to separate multiple endpoints. +1. In **Connectivity Method**, select **Public**, and fill in your Kafka broker endpoints. You can use commas `,` to separate multiple endpoints. 2. Select an **Authentication** option according to your Kafka authentication configuration. - If your Kafka does not require authentication, keep the default option **Disable**. @@ -109,7 +109,7 @@ The steps vary depending on the connectivity method you select. 1. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](https://docs.pingcap.com/tidb/stable/table-filter/#syntax). - **Replication Scope**: you can choose to only replicate tables with valid keys or replicate all selected tables. - - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click `apply`, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under the `Filter results`. + - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click **Apply**, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under **Filter results**. - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. - **Filter results with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. - **Filter results without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. @@ -117,7 +117,7 @@ The steps vary depending on the connectivity method you select. 2. Customize **Event Filter** to filter the events that you want to replicate. - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. - - **Event Filter**: you can choose the events you want to ingnore. + - **Event Filter**: you can choose the events you want to ignore. 3. Customize **Column Selector** to select columns from events and send only the data changes related to those columns to the downstream. @@ -130,7 +130,7 @@ The steps vary depending on the connectivity method you select. - Avro is a compact, fast, and binary data format with rich data structures, which is widely used in various flow systems. For more information, see [Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol). - Canal-JSON is a plain JSON text format, which is easy to parse. For more information, see [Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json). 
- - Open Protocol is a row-level data change notification protocol that provides data sources for monitoring, caching, full-text indexing, analysis engines, and primary-secondary replication between different databases. For more information, see [Open Protocol data format](https://docs.pingcap.com/tidb/stable/ticdc-open-protocol). + - Open Protocol is a row-level data change notification protocol that provides data sources for monitoring, caching, full-text indexing, analysis engines, and primary-secondary replication between different databases. For more information, see [Open Protocol data format](https://docs.pingcap.com/tidb/stable/ticdc-open-protocol). - Debezium is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. For more information, see [Debezium data format](https://docs.pingcap.com/tidb/stable/ticdc-debezium). 5. Enable the **TiDB Extension** option if you want to add TiDB-extension fields to the Kafka message body. @@ -180,7 +180,7 @@ The steps vary depending on the connectivity method you select. - **Distribute changelogs by column value to Kafka partition** - If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The specified column values of a row changelog will determine which partition the changelog is sent to. This distribution method ensures orderliness in each partition and guarantees that the changelog with the same column values is send to the same partition. + If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The specified column values of a row changelog will determine which partition the changelog is sent to. This distribution method ensures orderliness in each partition and guarantees that the changelog with the same column values is sent to the same partition. 9. In the **Topic Configuration** area, configure the following numbers. The changefeed will automatically create the Kafka topics according to the numbers. diff --git a/tidb-cloud/essential-changefeed-sink-to-mysql.md b/tidb-cloud/essential-changefeed-sink-to-mysql.md index 9d1e32791686f..286bef181be75 100644 --- a/tidb-cloud/essential-changefeed-sink-to-mysql.md +++ b/tidb-cloud/essential-changefeed-sink-to-mysql.md @@ -34,7 +34,7 @@ If your MySQL service can be accessed over the public network, you can choose to
-Private link connection leverage **Private Link** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. +Private link connections leverage **Private Link** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. You can connect your TiDB Cloud cluster to your MySQL service securely through a private link connection. If the private link connection is not available for your MySQL service, follow [Connect to Amazon RDS via a Private Link Connection](/tidbcloud/serverless-private-link-connection-to-aws-rds.md) or [Connect to Alibaba Cloud ApsaraDB RDS for MySQL via a Private Link Connection](/tidbcloud/serverless-private-link-connection-to-alicloud-rds.md) to create one. @@ -48,7 +48,7 @@ The **Sink to MySQL** connector can only sink incremental data from your TiDB Cl To load the existing data: -1. Extend the [tidb_gc_life_time](https://docs.pingcap.com/tidb/stable/system-variables#tidb_gc_life_time-new-in-v50) to be longer than the total time of the following two operations, so that historical data during the time is not garbage collected by TiDB. +1. Extend the [tidb_gc_life_time](https://docs.pingcap.com/tidb/stable/system-variables#tidb_gc_life_time-new-in-v50) to be longer than the total time of the following two operations, so that historical data during this period is not garbage collected by TiDB. - The time to export and import the existing data - The time to create **Sink to MySQL** @@ -82,7 +82,7 @@ After completing the prerequisites, you can sink your data to MySQL. - If you choose **Public**, fill in your MySQL endpoint. - If you choose **Private Link**, select the private link connection that you created in the [Network](#network) section, and then fill in the MySQL port for your MySQL service. -4. In **Authentication**, fill in the MySQL user name, password and TLS Encryption of your MySQL service. TiDB Cloud does not support self-signed certificates for MySQL TLS connections currently. +4. In **Authentication**, fill in the MySQL user name and password, and configure TLS encryption for your MySQL service. Currently, TiDB Cloud does not support self-signed certificates for MySQL TLS connections. 5. Click **Next** to test whether TiDB can connect to MySQL successfully: @@ -92,7 +92,7 @@ After completing the prerequisites, you can sink your data to MySQL. 6. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](https://docs.pingcap.com/tidb/stable/table-filter/#syntax). - **Replication Scope**: you can choose to only replicate tables with valid keys or replicate all selected tables. - - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click `apply`, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under the `Filter results`. + - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule and click **Apply**, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules under **Filter results**. 
- **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. - **Filter results with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. - **Filter results without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. @@ -100,20 +100,20 @@ After completing the prerequisites, you can sink your data to MySQL. 7. Customize **Event Filter** to filter the events that you want to replicate. - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. - - **Event Filter**: you can choose the events you want to ingnore. + - **Event Filter**: you can choose the events you want to ignore. 8. In **Start Replication Position**, configure the starting position for your MySQL sink. - - If you have [loaded the existing data](#load-existing-data-optional) using Export, select **From Time** and fill in the snapshot time that you get from Export. Pay attention the time zone. + - If you have [loaded the existing data](#load-existing-data-optional) using Export, select **From Time** and fill in the snapshot time that you get from Export. Pay attention to the time zone. - If you do not have any data in the upstream TiDB cluster, select **Start replication from now on**. 9. Click **Next** to configure your changefeed specification. - In the **Changefeed Name** area, specify a name for the changefeed. -10. If you confirm that all configurations are correct, click **Submit**. If you want to modify some configurations, click **Previous** to go back to the previous configuration page. +10. If you confirm that all configurations are correct, click **Submit**. If you want to modify some configurations, click **Previous** to go back to the previous configuration page. -11. The sink starts soon, and you can see the status of the sink changes from **Creating** to **Running**. +11. The sink starts soon, and you can see the sink status change from **Creating** to **Running**. Click the changefeed name, and you can see more details about the changefeed, such as the checkpoint, replication latency, and other metrics. 
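As a reference for the `tidb_gc_life_time` adjustment described in the **Load existing data** prerequisite above, a typical sequence is sketched below. The `72h` value is only an assumption; it should cover the time needed to export the existing data and create the changefeed, and the default value restored at the end is `10m`:

```sql
-- Check the current GC life time.
SHOW VARIABLES LIKE 'tidb_gc_life_time';

-- Extend it so that the snapshot used for the initial export is not garbage collected.
SET GLOBAL tidb_gc_life_time = '72h';

-- After the changefeed is created and running, restore the default.
SET GLOBAL tidb_gc_life_time = '10m';
```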
From c7f2950145a4fa4eb753525195d7fc2ef01c200d Mon Sep 17 00:00:00 2001 From: shiyuhang <1136742008@qq.com> Date: Fri, 26 Dec 2025 16:00:10 +0800 Subject: [PATCH 7/8] fix changefeed --- tidb-cloud/essential-changefeed-overview.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/tidb-cloud/essential-changefeed-overview.md b/tidb-cloud/essential-changefeed-overview.md index 9ed3b05fc242c..136df8c168697 100644 --- a/tidb-cloud/essential-changefeed-overview.md +++ b/tidb-cloud/essential-changefeed-overview.md @@ -30,7 +30,7 @@ On the **Changefeed** page, you can create a changefeed, view a list of existing To create a changefeed, refer to the tutorials: -- [Sink to Apache Kafka](/tidb-cloud/essential-changefeed-sink-to-apache-kafka.md) +- [Sink to Apache Kafka](/tidb-cloud/essential-changefeed-sink-to-kafka.md) - [Sink to MySQL](/tidb-cloud/essential-changefeed-sink-to-mysql.md) ## View a changefeed @@ -80,7 +80,6 @@ ticloud serverless changefeed resume -c --changefeed-id - ## Edit a changefeed > **Note:** From 3623e4a17fcdaf4f899909374275dd80f757a43a Mon Sep 17 00:00:00 2001 From: shiyuhang <1136742008@qq.com> Date: Fri, 26 Dec 2025 16:46:29 +0800 Subject: [PATCH 8/8] fix link --- tidb-cloud/essential-changefeed-sink-to-kafka.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/essential-changefeed-sink-to-kafka.md b/tidb-cloud/essential-changefeed-sink-to-kafka.md index f578de8ce79e9..5a0b2efeec95e 100644 --- a/tidb-cloud/essential-changefeed-sink-to-kafka.md +++ b/tidb-cloud/essential-changefeed-sink-to-kafka.md @@ -59,7 +59,7 @@ To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka - The `Create` and `Write` permissions are added for the topic resource type in Kafka. - The `DescribeConfigs` permission is added for the cluster resource type in Kafka. -For example, if your Kafka cluster is in Confluent Cloud, refer to [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in the Confluent documentation for more information. +For example, if your Kafka cluster is in Confluent Cloud, refer to [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/security/authorization/acls/manage-acls.html#add-acls) in the Confluent documentation for more information. ## Step 1. Open the Changefeed page for Apache Kafka