[clickpipes] update FAQ guidance for initial load PG<=13 #5271
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -173,11 +173,7 @@ | |
| You cannot speed up an already running initial load. However, you can optimize future initial loads by adjusting certain settings. By default, the settings are configured with 4 parallel threads and a snapshot number of rows per partition set to 100,000. These are advanced settings and are generally sufficient for most use cases. | ||
| For Postgres versions 13 or lower, CTID range scans are slower, and these settings become more critical. In such cases, consider the following process to improve performance: | ||
| 1. **Drop the existing pipe**: This is necessary to apply new settings. | ||
| 2. **Delete destination tables on ClickHouse**: Ensure that the tables created by the previous pipe are removed. | ||
| 3. **Create a new pipe with optimized settings**: Typically, increase the snapshot number of rows per partition to between 1 million and 10 million, depending on your specific requirements and the load your Postgres instance can handle. | ||
| For Postgres versions 13 or lower, CTID range scans are very slow and therefore ClickPipes does not use them. Instead we read the entire table as a single partition, essentially making it single-threaded (therefore ignoring both number of rows per partition and parallel threads settings). It is critical to adjust these settings to instead move multiple tables in parallel for fast initial loads. | ||
|
Check notice on line 176 in docs/integrations/data-ingestion/clickpipes/postgres/faq.md
|
||
| These adjustments should significantly enhance the performance of the initial load, especially for older Postgres versions. If you are using Postgres 14 or later, these settings are less impactful due to improved support for CTID range scans. | ||
|
Member
This now refers to the deleted lines |
||
|
|
||
Do we want to mention custom partitioning column for PG13?
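
For reviewers weighing the new wording, here is a minimal sketch of the behaviour the added paragraph describes. It is not ClickPipes code: `plan_partitions` and `readers_per_table` are hypothetical helpers, and the assumption that a table never has more concurrent readers than partitions is mine. The defaults (4 parallel threads, 100,000 rows per partition) are taken from the FAQ text in the diff.

```python
import math

# Hypothetical model of the behaviour described in the diff; not ClickPipes' implementation.

def plan_partitions(pg_major_version, table_rows, rows_per_partition=100_000):
    """Return {table: partition_count} for an initial load.

    Assumption: on Postgres 13 or lower each table is read as a single
    partition, so rows_per_partition is ignored, as the new FAQ text states.
    """
    plan = {}
    for table, rows in table_rows.items():
        if pg_major_version <= 13:
            plan[table] = 1  # whole table in one partition
        else:
            plan[table] = max(1, math.ceil(rows / rows_per_partition))
    return plan


def readers_per_table(plan, parallel_threads=4):
    """Assumed upper bound on concurrent readers per table: threads cannot exceed partitions."""
    return {table: min(parallel_threads, parts) for table, parts in plan.items()}


tables = {"orders": 2_500_000, "users": 40_000}
pg13 = plan_partitions(13, tables)
pg14 = plan_partitions(14, tables)
print(pg13, readers_per_table(pg13))
print(pg14, readers_per_table(pg14))
```

Under this model every table on Postgres 13 or lower gets exactly one reader regardless of the two settings, which is why the new text points to loading several tables in parallel as the remaining lever.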