diff --git a/docs/sys-utilities/index.md b/docs/sys-utilities/index.md
index 47122062600..a85f5e45f93 100644
--- a/docs/sys-utilities/index.md
+++ b/docs/sys-utilities/index.md
@@ -33,6 +33,8 @@ For more information about a utility, see the corresponding topic listed in the
 
 - [pg_checksums](./pg-checksums.md) — enable, disable or check data checksums in a database cluster
 
+- [pg_verifybackup](./pg-verifybackup.md) — verify the integrity and consistency of a database backup
+
 ## Reference for Administrator
 
 - [analyzedb](./analyzedb.md) - performs `ANALYZE` operations on tables incrementally and concurrently
diff --git a/docs/sys-utilities/pg-verifybackup.md b/docs/sys-utilities/pg-verifybackup.md
new file mode 100644
index 00000000000..87cddc43724
--- /dev/null
+++ b/docs/sys-utilities/pg-verifybackup.md
@@ -0,0 +1,149 @@
+---
+title: pg_verifybackup
+---
+
+`pg_verifybackup` verifies the integrity of database cluster backups. It reads the `backup_manifest` file that the server generates during the backup and checks the backup files against it for completeness and consistency.
+
+Since Cloudberry Database is a distributed database system, `pg_verifybackup` is particularly well-suited for verifying full cluster backups that include a Coordinator and multiple Datanodes.
+
+## Use Cases
+
+- **Backup Verification**: Verify a backup immediately after it completes to confirm that the backup files are not corrupted.
+- **Pre-Restore Check**: Validate that the backup data is usable before performing a recovery.
+- **Periodic Verification**: Regularly check backups in long-term storage to confirm that they remain usable.
+- **Distributed Consistency**: Verify the consistency of backups across multiple nodes in a distributed environment.
+
+## Syntax Overview
+
+```bash
+pg_verifybackup [options...] backup_directory
+```
+
+## Command Line Options
+
+The following options control the verification process:
+
+### Basic Options
+
+| Option | Description |
+|---------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `-e`, `--exit-on-error` | Exit immediately upon detecting an issue. By default, the tool continues checking and reports all issues found. |
+| `-i path`, `--ignore=path` | Ignore the specified file or directory (using a relative path). If a directory is specified, all files under that directory are ignored. Can be used multiple times. |
+| `-m path`, `--manifest-path=path` | Use the manifest file located at the specified path. By default, the tool looks for the manifest file in the root of the backup directory. |
+| `-n`, `--no-parse-wal` | Do not parse write-ahead log (WAL) data. Suitable when only the data files need to be verified. |
+| `-q`, `--quiet` | Suppress output when verification is successful. Suitable for use in automated scripts. |
+| `-s`, `--skip-checksums` | Skip checksum verification of data files. File existence and size are still checked. This option significantly speeds up the verification process. |
+| `-w path`, `--wal-directory=path` | Look for WAL files in the specified directory. Useful when WAL files are stored separately from the backup. |
+
+### Other Options
+
+| Option | Description |
+|-------------------------------|------------------------------------------------------|
+| `-V`, `--version` | Display version information and exit. |
+| `-?`, `--help` | Display help information and exit. |
+
+## Usage Examples
+
+### Basic Verification
+
+Verify the backup for a Coordinator node:
+
+```bash
+pg_verifybackup /backup/cloudberry/coordinator
+```
+
+### Using an External Manifest File
+
+```bash
+# Create a backup
+pg_basebackup -h mydbserver -D /backup/cloudberry/node1
+
+# Move the manifest file to a secure location
+mv /backup/cloudberry/node1/backup_manifest /secure/manifest.node1
+
+# Verify the backup using the external manifest
+pg_verifybackup -m /secure/manifest.node1 /backup/cloudberry/node1
+```
+
+### Fast Verification Mode
+
+Skip checksum verification and check only that each file exists and has the expected size:
+
+```bash
+pg_verifybackup --skip-checksums /backup/cloudberry/node1
+```
+
+### Ignoring Specific Files
+
+```bash
+# Ignore custom configuration files and temporary files
+pg_verifybackup \
+  --ignore=postgresql.auto.conf \
+  --ignore=temp \
+  /backup/cloudberry/node1
+```
+
+## Verification Process
+
+The tool performs verification in four stages:
+
+**1. Manifest File Verification**
+- Read and validate the `backup_manifest` file.
+- Check the file format and its internal checksums.
+- Verify the overall integrity of the manifest file.
+
+**2. File Set Verification**
+- Check for the existence of all required files.
+- Validate file sizes.
+- Identify extra or missing files.
+
+**3. Data Integrity Verification**
+- Calculate checksums for all data files.
+- Compare the calculated checksums with those recorded in the manifest.
+- Confirm that the files have not been modified.
+
+**4. WAL Verification**
+- Check for the presence of WAL files required for recovery.
+- Verify that the WAL files can be parsed.
+- Ensure that point-in-time recovery is possible.
+
+## FAQ
+
+### What if the manifest file is missing?
+
+- Symptom: The `backup_manifest` file cannot be found.
+- Cause: The manifest was not generated during the backup or has been moved.
+
+Solutions:
+
+1. Ensure that you are using the correct backup directory.
+2. Use the `-m` option to specify the location of the manifest file.
+3. Re-run the backup with manifest generation enabled.
+
+### How to resolve checksum mismatches?
+
+Symptom: File checksum verification fails.
+
+Cause: The file was modified during transfer or storage.
+
+Solutions:
+
+1. Check the integrity of your storage system.
+2. Verify that the data transfer process is reliable.
+3. Consider re-running the backup.
+
+### How to handle issues with WAL files?
+- Symptom: Required WAL files are missing.
+- Cause: Incomplete WAL archiving or incorrect WAL file location.
+
+Solutions:
+1. Check your WAL archiving configuration.
+2. Use the `-w` option to specify the correct WAL directory.
+3. Ensure that your WAL retention policy is adequate.
+
+## Related Commands
+- pg_basebackup
+- pg_waldump
+- pg_controldata
+
+By using `pg_verifybackup`, Cloudberry Database users can confirm the integrity and usability of their backups. Integrate this tool into your regular backup process so that damaged backups are detected early.
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/sys-utilities/pg-verifybackup.md b/i18n/zh/docusaurus-plugin-content-docs/current/sys-utilities/pg-verifybackup.md
new file mode 100644
index 00000000000..ffb4fc86002
--- /dev/null
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/sys-utilities/pg-verifybackup.md
@@ -0,0 +1,171 @@
+---
+title: pg_verifybackup
+---
+
+`pg_verifybackup` 用于验证数据库集群备份的完整性。它会检查备份期间服务器生成的 `backup_manifest` 文件,并验证备份文件的完整性和一致性。
+
+由于 Cloudberry Database 是分布式数据库系统,`pg_verifybackup` 特别适合验证包含 Coordinator 和多个 Datanode 的完整集群备份。
+
+## 使用场景
+
+- **备份验证**:在备份完成后立即验证备份的完整性,确保备份文件未被损坏。
+- **恢复前检查**:在执行数据恢复前,验证备份数据的可用性。
+- **定期验证**:定期检查长期存储的备份,确保其仍然可用。
+- **分布式一致性**:验证分布式环境下多个节点的备份一致性。
+
+## 语法概要
+
+```bash
+pg_verifybackup [选项...] 备份目录
+```
+
+## 命令选项说明
+
+以下选项可用于控制验证过程:
+
+### 基本选项
+
+- **-e, --exit-on-error**
+  - 检测到问题时立即退出
+  - 默认情况下会继续检查并报告所有问题
+
+- **-i path, --ignore=path**
+  - 忽略指定的文件或目录(使用相对路径)
+  - 如果指定目录,将影响该目录下的所有文件
+  - 可多次使用此选项
+
+- **-m path, --manifest-path=path**
+  - 使用指定路径的清单文件
+  - 默认在备份目录根目录中查找
+
+- **-n, --no-parse-wal**
+  - 不解析预写日志数据
+  - 适用于只需验证数据文件的场景
+
+- **-q, --quiet**
+  - 验证成功时不输出信息
+  - 适合集成到自动化脚本中
+
+- **-s, --skip-checksums**
+  - 跳过数据文件校验和验证
+  - 仍会检查文件存在性和大小
+  - 显著提高验证速度
+
+- **-w path, --wal-directory=path**
+  - 在指定目录中查找 WAL 文件
+  - 适用于 WAL 归档与备份分开存储的情况
+
+### 其他选项
+
+- **-V, --version**
+  - 显示版本信息并退出
+
+- **-?, --help**
+  - 显示帮助信息并退出
+
+## 使用示例
+
+### 基本验证
+
+验证 Coordinator 节点备份:
+
+```bash
+pg_verifybackup /backup/cloudberry/coordinator
+```
+
+### 使用外部清单文件
+
+```bash
+# 创建备份
+pg_basebackup -h mydbserver -D /backup/cloudberry/node1
+
+# 移动清单文件到安全位置
+mv /backup/cloudberry/node1/backup_manifest /secure/manifest.node1
+
+# 验证备份
+pg_verifybackup -m /secure/manifest.node1 /backup/cloudberry/node1
+```
+
+### 快速验证模式
+
+```bash
+# 跳过校验和验证,仅检查文件是否存在及其大小
+pg_verifybackup --skip-checksums /backup/cloudberry/node1
+```
+
+### 忽略特定文件
+
+```bash
+# 忽略自定义配置文件和临时文件
+pg_verifybackup \
+  --ignore=postgresql.auto.conf \
+  --ignore=temp \
+  /backup/cloudberry/node1
+```
+
+## 验证过程
+
+工具执行验证的四个阶段:
+
+### 1. 清单文件验证
+- 读取并验证 backup_manifest 文件
+- 检查文件格式和内部校验和
+- 验证清单文件的完整性
+
+### 2. 文件集验证
+- 检查所有必需文件的存在性
+- 验证文件大小
+- 标识多余或缺失的文件
+- 自动忽略以下文件的变化:
+  - postgresql.auto.conf
+  - standby.signal
+  - recovery.signal
+  - backup_manifest
+  - pg_wal 目录内容
+
+### 3. 数据完整性验证
+- 计算所有数据文件的校验和
+- 与清单中记录的校验和比对
+- 验证文件未被修改
+
+### 4. WAL 验证
+- 检查恢复所需的 WAL 文件
+- 验证 WAL 文件可以被解析
+- 确保可以执行时间点恢复
+
+## 常见问题
+
+### 清单文件缺失如何处理?
+
+- **现象**:找不到 backup_manifest 文件
+- **原因**:备份过程未生成清单或清单被移动
+- **解决**:
+  1. 确保使用正确的备份目录
+  2. 使用 -m 选项指定清单文件位置
+  3. 重新执行带清单生成的备份
+
+### 校验和不匹配如何解决?
+
+- **现象**:文件校验和验证失败
+- **原因**:文件在传输或存储过程中被修改
+- **解决**:
+  1. 检查存储系统完整性
+  2. 验证传输过程是否可靠
+  3. 考虑重新执行备份
+
+### WAL 文件问题如何处理?
+
+- **现象**:缺少必要的 WAL 文件
+- **原因**:WAL 归档不完整或位置错误
+- **解决**:
+  1. 检查 WAL 归档设置
+  2. 使用 -w 选项指定正确的 WAL 目录
+  3. 确保 WAL 保留策略合适
+
+## 相关命令
+
+- pg_basebackup
+- pg_waldump
+- pg_controldata
+
+通过 `pg_verifybackup`,Cloudberry Database 用户可以确保其备份的完整性和可用性。建议将此工具集成到常规备份流程中,确保数据安全。
\ No newline at end of file
diff --git a/scripts/auto_verify.sh b/scripts/auto_verify.sh
new file mode 100644
index 00000000000..80b301c9efa
--- /dev/null
+++ b/scripts/auto_verify.sh
@@ -0,0 +1,117 @@
+#!/bin/bash
+# auto_verify.sh
+#
+# This script performs two tasks:
+# 1. It checks all changed Markdown files (between origin/main and HEAD)
+#    using markdownlint.
+# 2. It connects to a remote host (2.56.166.146) via SSH (as gpadmin)
+#    and performs Cloudberry backup verification.
+#
+# Usage:
+#   ./auto_verify.sh [--docker] [remote_backup_directory]
+#
+# Examples:
+#   ./auto_verify.sh
+#     => Checks Markdown files and then verifies the default remote backup directory (/backup/cloudberry/coordinator)
+#
+#   ./auto_verify.sh /my/remote_backup
+#     => Checks Markdown files and then verifies the specified remote backup directory.
+#
+#   ./auto_verify.sh --docker
+#     => Checks Markdown files and then verifies using Docker mode on the default remote backup directory.
+#
+#   ./auto_verify.sh --docker /my/remote_backup
+#     => Checks Markdown files and then verifies using Docker mode on the specified directory.
+
+###########################
+# Step 1: Markdown Lint Check (Local)
+###########################
+echo "========================================"
+echo "Starting Markdown lint check..."
+# Get the list of changed Markdown files between origin/main and HEAD
+md_files=$(git diff --name-only origin/main...HEAD | grep '\.md$')
+
+if [ -z "$md_files" ]; then
+  echo "No changed Markdown files found between origin/main and HEAD."
+else
+  echo "Checking the following Markdown files:"
+  echo "$md_files"
+  # Read one path per line so that file names containing spaces are not split.
+  while IFS= read -r file; do
+    echo "Linting: $file"
+    markdownlint "$file"
+    if [ $? -ne 0 ]; then
+      echo "Formatting issues found in file: $file"
+    fi
+  done <<< "$md_files"
+fi
+echo "Markdown lint check complete."
+echo "========================================"
+
+###########################
+# Step 2: Remote Backup Verification via SSH
+###########################
+# Default remote backup directory (adjust as needed)
+REMOTE_BACKUP_DIR="/backup/cloudberry/coordinator"
+# Default: do not use Docker mode
+USE_DOCKER=0
+
+# Parse command line arguments for remote backup verification.
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --docker)
+      USE_DOCKER=1
+      shift
+      ;;
+    *)
+      REMOTE_BACKUP_DIR="$1"
+      shift
+      ;;
+  esac
+done
+
+# SSH connection details
+REMOTE_HOST="2.56.166.146"
+REMOTE_USER="gpadmin"
+# Determine the current script directory to locate the SSH key file
+SCRIPT_DIR=$(cd "$(dirname "$0")" && pwd)
+# SSH key file: `ssh -i` expects the private key (e.g., ./id_rsa), not the .pub file.
+# Here, we assume the key is in the same directory as this script.
+SSH_KEY="$SCRIPT_DIR/id_rsa"
+
+echo "========================================"
+echo "Connecting to remote host: $REMOTE_HOST as $REMOTE_USER"
+echo "Remote backup directory: $REMOTE_BACKUP_DIR"
+if [ "$USE_DOCKER" -eq 1 ]; then
+  echo "Verification mode: Docker"
+else
+  echo "Verification mode: Local pg_verifybackup command"
+fi
+echo "Using SSH key from: $SSH_KEY"
+echo "========================================"
+
+# Execute remote commands via SSH
+ssh -i "$SSH_KEY" ${REMOTE_USER}@${REMOTE_HOST} <