diff --git a/modules/ROOT/content-nav.adoc b/modules/ROOT/content-nav.adoc index e59741670..1df99b433 100644 --- a/modules/ROOT/content-nav.adoc +++ b/modules/ROOT/content-nav.adoc @@ -201,6 +201,7 @@ ** xref:backup-restore/online-backup.adoc[] ** xref:backup-restore/aggregate.adoc[] ** xref:backup-restore/inspect.adoc[] +** xref:backup-restore/validate.adoc[] ** xref:backup-restore/consistency-checker.adoc[] ** xref:backup-restore/restore-backup.adoc[] ** xref:backup-restore/offline-backup.adoc[] diff --git a/modules/ROOT/pages/backup-restore/index.adoc b/modules/ROOT/pages/backup-restore/index.adoc index 87470ae88..44c969752 100644 --- a/modules/ROOT/pages/backup-restore/index.adoc +++ b/modules/ROOT/pages/backup-restore/index.adoc @@ -9,6 +9,7 @@ This chapter describes the following: * xref:backup-restore/online-backup.adoc[Back up an online database] -- How to back up an online database. * xref:backup-restore/aggregate.adoc[Aggregate a database backup chain] - How to aggregate a backup chain into a single backup. * xref:backup-restore/inspect.adoc[Inspect the metadata of a database backup file] -- How to inspect the metadata of a database backup file. +* xref:backup-restore/validate.adoc[Validate a sharded property database backup] -- How to validate a sharded property database backup using the `neo4j-admin backup validate` command. * xref:backup-restore/consistency-checker.adoc[Check database consistency] -- How to check the consistency of a database, backup, or a dump. * xref:backup-restore/restore-backup.adoc[Restore a database backup] -- How to restore a database backup in a live Neo4j deployment. * xref:backup-restore/offline-backup.adoc[Back up an offline database] -- How to back up an offline database. 
diff --git a/modules/ROOT/pages/backup-restore/validate.adoc b/modules/ROOT/pages/backup-restore/validate.adoc new file mode 100644 index 000000000..8ed623a6b --- /dev/null +++ b/modules/ROOT/pages/backup-restore/validate.adoc @@ -0,0 +1,97 @@ +:page-role: new-2025.12 enterprise-edition not-on-aura +[[validate-backup]] += Validate a sharded property database backup +:description: This section describes how to validate a sharded property database backup using the neo4j-admin backup validate command. +:keywords: neo4j-admin, backup, validate, sharded property databases, sharding + +When you back up a sharded property database, you create multiple backup chains, one for each shard. +The backup chain of the graph shard has one full backup and zero or more differential backups. +Each property shard backup chain has only one full backup, because differential backups of property shards are not supported. +See xref:scalability/sharded-property-databases/admin-operations.adoc#backup-and-restore[Backup and restore] for more information on backing up sharded property databases. + +To ensure that the backup chains are valid and can be used for restoration, you can use the `neo4j-admin backup validate` command. + +[[validate-backup-command]] +== Command + +The `neo4j-admin backup validate` command checks the integrity and consistency of the backup artifacts for a specified sharded property database. + +[[validate-backup-syntax]] +=== Syntax + +[source,role=noheader] +---- +neo4j-admin backup validate [-h] [--expand-commands] [--verbose] + [--additional-config=] --database= + [--format=] --from-path= +---- + + +=== Description + +Command to validate a collection of backups. 
+ +[[validate-backup-command-options]] +=== Options + +.`neo4j-admin backup validate` options +[options="header", cols="5m,6a,4m"] +|=== +| Option +| Description +| Default + +|--additional-config=footnote:[See xref:neo4j-admin-neo4j-cli.adoc#_configuration[Neo4j Admin and Neo4j CLI -> Configuration] for details.] +|Configuration file with additional configuration. +| + +|--database= +|Name of the database to validate. +| + +| --expand-commands +| Allow command expansion in config value evaluation. +| + +|--format= +|Format of the output of the command. Possible values are: `JSON`, `TABULAR`. +|TABULAR + +|--from-path= +|Path to the directory where the backups are stored. +| + +|-h, --help +|Show this help message and exit. +| + +|--verbose +|Enable verbose output. +| + +|=== + +[[validate-backup-example]] +== Example + +To validate a backup for the database `foo` located at _s3://bucket/backups_, use the following command: + +[source,shell] +---- +bin/neo4j-admin backup validate "foo" --from-path=s3://bucket/backups +---- + +The output will indicate whether the backups are valid. +For example: + +[result] +---- +| DATABASE | PATH | STATUS | +| foo-g000 | /bucket/backups/foo-g000-2025-06-11T21-04-42.backup | OK | +| foo-p000 | /bucket/backups/foo-p000-2025-06-11T21-04-37.backup | OK | +| foo-p001 | /bucket/backups/foo-p001-2025-06-11T21-04-40.backup | OK | +---- + +If valid, the backups can be used to xref:scalability/sharded-property-databases/data-ingestion.adoc#creating-sharded-db-from-uri[seed a sharded property database]. + +For more examples and details, see xref:scalability/sharded-property-databases/admin-operations.adoc#backup-and-restore[Backup and restore]. 
\ No newline at end of file diff --git a/modules/ROOT/pages/neo4j-admin-neo4j-cli.adoc b/modules/ROOT/pages/neo4j-admin-neo4j-cli.adoc index 1e907468b..1607ef27a 100644 --- a/modules/ROOT/pages/neo4j-admin-neo4j-cli.adoc +++ b/modules/ROOT/pages/neo4j-admin-neo4j-cli.adoc @@ -174,7 +174,7 @@ For details, see xref:backup-restore/restore-backup.adoc[]. For details, see xref:database-administration/standard-databases/upload-to-aura.adoc[]. -.2+| `backup` +.3+| `backup` |`inspect` | Lists the metadata stored in the header of backup files. @@ -185,6 +185,11 @@ For details, see xref:backup-restore/inspect.adoc[]. |Aggregates a chain of backup artifacts into a single artifact. For details, see xref:backup-restore/aggregate.adoc[]. + +|`validate` +|Validates a collection of sharded property database backups. + +For details, see xref:backup-restore/validate.adoc[]. |=== == The `neo4j` tool diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/admin-operations.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/admin-operations.adoc index 0f04cfc27..79372b522 100644 --- a/modules/ROOT/pages/scalability/sharded-property-databases/admin-operations.adoc +++ b/modules/ROOT/pages/scalability/sharded-property-databases/admin-operations.adoc @@ -48,32 +48,34 @@ See xref:scalability/sharded-property-databases/data-ingestion.adoc#splitting-ex A sharded property database is a database made up of multiple databases. This means that when you want to back up a database, you must back up all the shards individually, resulting in a sharded property database backup that is composed of multiple smaller backup chains. -Backup chains for each shard are produced using the neo4j-admin database backup. +Backup chains for each shard are produced using the `neo4j-admin database backup` command. For the graph shard, its backup chain must contain one full artefact and 0+ differential artefacts. 
Each property shard’s backup chain must contain only one full backup and no differential backups. In practical terms, this means that to back up a sharded property database, you start with a full backup of the graph shard and then all of the property shards; any subsequent differential backups would only need to be of the graph shard. This is because the transaction log of the property shards is the same as the graph shard log and is simply filtered when applied, so only the graph shard log is required for a restore. +=== Validating sharded property database backups + For example, assume there is a sharded property database called `foo` with a graph shard and 2 property shards. -A backup must be taken of each shard, for example: +. Back up each shard, for example: ++ [source,shell] ---- -bin/neo4j-admin database backup "foo*" --to-path=/backups --from=localhost:6361 --remote-address-resolution +bin/neo4j-admin database backup "foo*" --to-path=/backups --from=localhost:6361 --remote-address-resolution ---- -The `--remote-address-resolution` option requires `internal.dbms.cluster.experimental_protocol_version.dbms_enabled=true` to be set in both the _neo4j.conf_ and _neo4j-admin.conf_ files. - -You can then check the validity of the resulting backups using: - +. Check the validity of the resulting backups. +For details on command syntax and options, see xref:backup-restore/validate.adoc[Validate a sharded property database backup]. ++ [source,shell] ---- -bin/neo4j-admin database backup validate "foo" --from-path=s3://bucket/backups +bin/neo4j-admin backup validate "foo" --from-path=s3://bucket/backups ---- - ++ The output will indicate whether the backups are valid. For example: - ++ [result] ---- | DATABASE | PATH | STATUS | @@ -82,8 +84,8 @@ For example: | foo-p001 | /bucket/backups/foo-p001-2025-06-11T21-04-40.backup | OK | ---- -If valid, the backups can be used to seed a sharded property database: - +. 
If valid, the backups can be used to seed a sharded property database: ++ [source,cypher] ---- CYPHER 25 CREATE DATABASE baz SET GRAPH SHARD { TOPOLOGY 3 PRIMARIES 0 SECONDARIES } @@ -91,12 +93,15 @@ SET PROPERTY SHARDS { COUNT 2 TOPOLOGY 1 REPLICA } OPTIONS {seedUri:"s3://bucket/backups/"}; ---- +=== Understanding backup validation + Due to potential synchronization issues that might occur when shard backups are not on the exact same transaction IDs (since backups can be taken in parallel or sequentially), the restore process is designed to be very lenient to different shards at different transaction IDs. As a result, a sharded property database backup is considered valid if the store files of each property shard are within the range of transactions recorded in the graph shard’s transaction log. For example, assume the graph shard’s store files are at tx 10 and it has transaction logs from tx 11-36, and property shard 1’s store files are at 13 and property shard 2’s store files are at 30, then at restore time, all databases can be recovered and made consistent up to transaction 36. You can use the command `neo4j-admin backup validate` to check whether a collection of backup chains for a database is valid. +For details on command syntax and options, see xref:backup-restore/validate.adoc[Validate a sharded property database backup]. Additional actions may be required to create a validated backup if a property shard is ahead or behind the range of transactions in the graph shard backup chain. @@ -113,8 +118,8 @@ To form a validated backup, you must ensure that each property shard’s store f In the example above, property shard `foo-p000` is behind the graph shard backup chain, and property shard `foo-p001` is ahead of the graph shard backup chain. To form a valid sharded property database backup, you need to: -* Take a full backup of the property shard `foo-p000` so that its store at least includes transaction 5. 
-* Take a differential backup of the graph shard so that at least transaction 12 is included in its transaction log, so `foo-p001` is included in its range. +. Take a full backup of the property shard `foo-p000` so that its store includes at least transaction 5. +. Take a differential backup of the graph shard so that its transaction log includes at least transaction 12, which brings `foo-p001` within its range. Once a valid sharded properties database backup is created, differential backups can be performed by taking differential backups of the graph shard, extending the range of the graph shard chain. Continuing with the example, the graph chain contains transactions from 11 to 36, property shard 1’s store files are at 13, and property shard 2’s store files are at 30. diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc index 35f2876cb..9127ab1e3 100644 --- a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc +++ b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc @@ -159,6 +159,7 @@ CYPHER 25 CREATE DATABASE `foo-sharded` SET PROPERTY SHARDS { COUNT 3 TOPOLOGY 2 REPLICAS }; ---- +[[creating-sharded-db-from-uri]] === Creating a sharded database from a URI You can create a new sharded property database from an existing database with seeding from one or more URIs.
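Note on the validity rule described in `admin-operations.adoc`: the rule that each property shard's store files must fall within the range of transactions covered by the graph shard's backup chain can be illustrated with a small sketch. This is not how `neo4j-admin backup validate` is implemented. The `GraphChain` class, the `classify_property_shard` function, and the `BEHIND` and `AHEAD` labels are hypothetical names introduced only for illustration, and treating a property shard at exactly the graph shard's store transaction as valid is an assumption about the boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphChain:
    """Hypothetical model of the transactions covered by a graph shard backup chain."""
    store_tx: int     # transaction ID the graph shard's store files are at
    last_log_tx: int  # last transaction recorded in the graph shard's transaction log


def classify_property_shard(graph: GraphChain, property_store_tx: int) -> str:
    """Classify one property shard's store files against the graph shard chain.

    Assumption: a property shard at exactly graph.store_tx counts as valid.
    """
    if property_store_tx < graph.store_tx:
        return "BEHIND"  # hypothetical label: take a newer full backup of this property shard
    if property_store_tx > graph.last_log_tx:
        return "AHEAD"   # hypothetical label: take a differential backup of the graph shard
    return "OK"


# Numbers from the documented example: graph store at tx 10, logs for tx 11-36,
# property shard store files at tx 13 and tx 30.
graph = GraphChain(store_tx=10, last_log_tx=36)
for name, tx in {"property shard 1": 13, "property shard 2": 30}.items():
    print(name, classify_property_shard(graph, tx))
```

With the numbers from the documented example, both property shards fall inside the graph shard's range, which matches the statement that all databases can be recovered up to transaction 36.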