From 9b5f3966bdff23eb7c4d2d5240fb0912dcdedeb5 Mon Sep 17 00:00:00 2001 From: Conor Watson Date: Mon, 8 Dec 2025 12:14:10 +0000 Subject: [PATCH 1/5] Document new NO_CHECK for SPD offline seeding --- .../data-ingestion.adoc | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc index 35f2876cb..e61ed04c1 100644 --- a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc +++ b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc @@ -50,6 +50,32 @@ OPTIONS { }; ---- +It is also possible to perform the import on a cluster that has no access to any cloud. + +. Using the `neo4j-admin database import` command, import data into the `foo-sharded` database, creating one graph shard and three property shards. +If the process is running on the same server as another Neo4j DBMS process, the latter must be stopped. ++ +[source, shell] +---- +neo4j-admin database import full foo-sharded --nodes=nodes.csv --nodes=movies.csv --relationships=relationships.csv --input-type=csv --property-shard-count=3 --schema=schema.cypher +---- + +. using allow and deny database allocate a single shard to each server in the cluster. +See xref:clustering/databases.adoc#cluster-allow-deny-db[allow and deny database] +. Move the produced backups from the local file system on the machine used to run the import onto the server that is hosting each of the shards so that each server has 1 backup and they reside in the same path on each server. +. On each server, update the neo4j.conf to include the correct settings for file seeding as outlined in xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI]. +. 
Create the database foo-sharded as a sharded property database by seeding it from your backups in the servers file systems: ++ +[source, cypher] +---- +CREATE DATABASE `foo-sharded` +DEFAULT LANGUAGE CYPHER 25 +PROPERTY SHARDS { COUNT 3 } +OPTIONS { + seedUri: 'file:/backups/', seedOptions: 'NO_CHECK' +}; +---- + The cluster automatically distributes the data across its servers. For more information on seed providers, see xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI]. From 140b21a1b7a0a6b0d6e31b8384ceeaa457d36406 Mon Sep 17 00:00:00 2001 From: Conor Watson Date: Mon, 8 Dec 2025 14:38:46 +0000 Subject: [PATCH 2/5] PR Comments --- .../sharded-property-databases/data-ingestion.adoc | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc index e61ed04c1..60ea79cdb 100644 --- a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc +++ b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc @@ -27,10 +27,11 @@ This will help distribute the data across multiple servers in a Neo4j cluster. If you are creating the property shards on a self-managed server, the server that executes the `neo4j-admin database import` command must have sufficient storage space available for all of the property shards that will be created. ==== +==== Import using S3 The following example shows how to import a set of CSV files, back them up to S3 using the `--target-location` and `--target-format` options, and then create a database using those seeds in S3. . Using the `neo4j-admin database import` command, import data into the `foo-sharded` database, creating one graph shard and three property shards. -If the process is running on the same server as another Neo4j DBMS process, the latter must be stopped.
+If the process is running on the same server as another Neo4j DBMS process, the Neo4j DBMS process must be stopped. The `--target-location` and `--target-format` options take the outputs of the import, turn them into uncompressed backups, and upload them to a location ready to be seeded from. + [source, shell] @@ -50,10 +51,11 @@ OPTIONS { }; ---- +==== Import using local file system It is also possible to perform the import on a cluster that has no access to any cloud. . Using the `neo4j-admin database import` command, import data into the `foo-sharded` database, creating one graph shard and three property shards. -If the process is running on the same server as another Neo4j DBMS process, the latter must be stopped. +If the process is running on the same server as another Neo4j DBMS process, the Neo4j DBMS process must be stopped. + [source, shell] ---- From 806891b61c3b0fc39b4a16a0740d1cbcd6f6bdb3 Mon Sep 17 00:00:00 2001 From: Conor Watson Date: Thu, 11 Dec 2025 11:35:09 +0000 Subject: [PATCH 3/5] PR Fixes --- .../data-ingestion.adoc | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc index 60ea79cdb..71951e56e 100644 --- a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc +++ b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc @@ -28,10 +28,11 @@ If you are creating the property shards on a self-managed server, the server tha ==== ==== Import using S3 + The following example shows how to import a set of CSV files, back them up to S3 using the `--target-location` and `--target-format` options, and then create a database using those seeds in S3. . Using the `neo4j-admin database import` command, import data into the `foo-sharded` database, creating one graph shard and three property shards. 
-If the process is running on the same server as another Neo4j DBMS process, the Neo4j DBMS process must be stopped. +If the `neo4j-admin` process is running on the same server as a Neo4j DBMS process, the Neo4j DBMS process must be stopped. The `--target-location` and `--target-format` options take the outputs of the import, turn them into uncompressed backups, and upload them to a location ready to be seeded from. + [source, shell] @@ -51,22 +52,24 @@ OPTIONS { }; ---- +[role=label--new-2025.12] ==== Import using local file system -It is also possible to perform the import on a cluster that has no access to any cloud. +You can import data into a Neo4j cluster that has no access to any cloud. . Using the `neo4j-admin database import` command, import data into the `foo-sharded` database, creating one graph shard and three property shards. -If the process is running on the same server as another Neo4j DBMS process, the Neo4j DBMS process must be stopped. +If the `neo4j-admin` process is running on the same server as a Neo4j DBMS process, the Neo4j DBMS process must be stopped. + [source, shell] ---- neo4j-admin database import full foo-sharded --nodes=nodes.csv --nodes=movies.csv --relationships=relationships.csv --input-type=csv --property-shard-count=3 --schema=schema.cypher ---- -. using allow and deny database allocate a single shard to each server in the cluster. -See xref:clustering/databases.adoc#cluster-allow-deny-db[allow and deny database] -. Move the produced backups from the local file system on the machine used to run the import onto the server that is hosting each of the shards so that each server has 1 backup and they reside in the same path on each server. -. On each server, update the neo4j.conf to include the correct settings for file seeding as outlined in xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI]. -. 
Create the database foo-sharded as a sharded property database by seeding it from your backups in the servers file systems: +. Using allowed and denied databases, allocate a single shard to each server in the cluster. +See xref:clustering/databases.adoc#cluster-allow-deny-db[Controlling locations with allowed/denied databases]. +. Move the produced backups from the local file system of the machine used for the import to the servers hosting each of the shards. +Each server should have one backup, and the backups must reside in the same path on each server. +. On each server, update the _neo4j.conf_ file to include the correct settings for file seeding as outlined in xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI]. +. Create the database `foo-sharded` as a sharded property database by seeding it from your backups in the servers' file systems: + [source, cypher] ---- From d07b1433037fa7b32fd062cab1bb7672b66415d9 Mon Sep 17 00:00:00 2001 From: Conor Watson Date: Fri, 12 Dec 2025 11:19:56 +0000 Subject: [PATCH 4/5] Add some information on no_check --- .../scalability/sharded-property-databases/data-ingestion.adoc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc index 71951e56e..618d95750 100644 --- a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc +++ b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc @@ -81,6 +81,8 @@ OPTIONS { }; ---- +In this context `NO_CHECK` stops the seeding process from checking if all backups are present on all servers. + The cluster automatically distributes the data across its servers. For more information on seed providers, see xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI].
From d3fe71ce94fea2a3721c9ca413c8632b9fd1f868 Mon Sep 17 00:00:00 2001 From: Reneta Popova Date: Fri, 12 Dec 2025 14:37:45 +0000 Subject: [PATCH 5/5] Apply suggestion from @renetapopova --- .../scalability/sharded-property-databases/data-ingestion.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc index 618d95750..05f0db466 100644 --- a/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc +++ b/modules/ROOT/pages/scalability/sharded-property-databases/data-ingestion.adoc @@ -81,7 +81,7 @@ OPTIONS { }; ---- -In this context `NO_CHECK` stops the seeding process from checking if all backups are present on all servers. +In this context, `NO_CHECK` prevents the seeding process from verifying that all backups are present on all servers. The cluster automatically distributes the data across its servers. For more information on seed providers, see xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI].
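Review note: the "move the produced backups" step documented in this series (one backup per server, same path on every server) could be illustrated with a dry-run sketch along the following lines. This is only an assumption-laden example: the function name `plan_copies`, the host names `server1` to `server4`, the backup file names, and the `/backups` directory are hypothetical placeholders and are not taken from the Neo4j documentation. The script prints `scp` commands instead of running them.

```shell
#!/usr/bin/env bash
# Dry-run sketch: pair each backup with exactly one server and print the copy
# command that would place it in the same directory on every server.
# All names used here (hosts, files, directory) are hypothetical placeholders.

# Usage: plan_copies TARGET_DIR COUNT BACKUP... SERVER...
# Expects COUNT backup files followed by COUNT server names.
plan_copies() {
  local target_dir=$1 count=$2
  shift 2
  if (( $# != 2 * count )); then
    echo "expected $count backups followed by $count servers" >&2
    return 1
  fi
  # First COUNT arguments are backups, the next COUNT are servers.
  local backups=("${@:1:count}")
  local servers=("${@:count+1:count}")
  local i
  for (( i = 0; i < count; i++ )); do
    # One backup per server, always into the same target directory.
    printf 'scp %s %s:%s/\n' "${backups[i]}" "${servers[i]}" "$target_dir"
  done
}

# One graph shard plus three property shards gives four backups and four servers.
plan_copies /backups 4 \
  graph-shard.backup property-shard-1.backup property-shard-2.backup property-shard-3.backup \
  server1 server2 server3 server4
```

Printing the commands first makes it easy to confirm that every server receives exactly one backup before anything is copied; the output can then be piped to `sh` once the pairing looks right.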