@NataliaIvakina (Collaborator) commented Dec 2, 2025

Rewrite the Clustering introductory page to include examples of database topologies.
Remove terminology inconsistencies across a few pages.

@NataliaIvakina removed the WIP label Dec 3, 2025
@NataliaIvakina force-pushed the dev-update-clustering-intro branch from 6b095c3 to 87a6450 December 3, 2025 13:32
@jackwaudby self-assigned this Dec 15, 2025
Servers and databases are decoupled: servers provide computation and storage power for databases to use.
Each database relies on its own cluster architecture, organized into primaries (with a minimum of three for high availability) and secondaries (for read scaling).
. *Fault tolerance:* Primary database allocations provide a fault tolerant platform for transaction processing.
A database remains available for reads and writes as long as a simple majority of its primary copies are functioning.
Contributor:

copies -> allocations

Contributor:

Hmm, it depends, actually: are we using copy/allocation interchangeably elsewhere? In which case, ignore this suggestion.

Contributor:

It might be good to use database allocation everywhere though.

Collaborator Author:

In fact, we're using copies/allocations interchangeably. In Disaster recovery, however, we decided to use allocations only.

Contributor:

I personally like the term allocation, but I will leave it at your discretion :)

@NataliaIvakina force-pushed the dev-update-clustering-intro branch from 87a6450 to af0d43a December 17, 2025 12:30
@neo4j-docops-agent
Collaborator

. *Safety:* Servers hosting databases in primary mode provide a fault tolerant platform for transaction processing which remains available while a simple majority of those Primary Servers are functioning.
. *Scale:* Servers hosting databases in secondary mode provide a massively scalable platform for graph queries that enables very large graph workloads to be executed in a widely distributed topology.
. *Causal consistency:* When invoked, a client application is guaranteed to read at least its own writes.
. *Scalability:* A Neo4j cluster is a highly available cluster with multi-database support.
Contributor:

Saying a cluster is a cluster feels a bit redundant. What do you think?

. *Scalability:* A Neo4j cluster is a highly available cluster with multi-database support.
It is a set of servers running a number of databases.
Servers and databases are decoupled: servers provide computation and storage power for databases to use.
Each database has its own independent topology, organized into primaries (with a minimum of three for high availability) and secondaries (for read scaling).
Contributor:

"with a minimum of three for high availability" could read as if you must always have at least 3 primaries. I think this is important information, but it could be moved elsewhere.

Servers and databases are decoupled: servers provide computation and storage power for databases to use.
Each database has its own independent topology, organized into primaries (with a minimum of three for high availability) and secondaries (for read scaling).
. *Fault tolerance:* Primary database allocations provide a fault tolerant platform for transaction processing.
A database remains available for reads and writes as long as a simple majority of its primary allocations are functioning.
Contributor:

"simple majority of its primary allocations are functioning" is true for writes, but reads do not require this.

Comment on lines +85 to +87
Generally speaking, fault tolerance is the number of primary database copies you can lose without affecting a certain operation.
For instance, with one primary copy, you have no fault tolerance, because if it goes offline, nothing is available.
If you have two servers, each with a primary copy of the database, and one goes offline, the other will still have some copy of the data, so read availability would be preserved.
Contributor:

I think this block could go earlier, before we enumerate the types.
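
The fault-tolerance definition in the quoted block, combined with the earlier point that only writes need a majority, can be checked with a few lines of arithmetic. This is a minimal sketch, not Neo4j code: the function names are hypothetical, it assumes writes need a strict majority of primaries online and reads need at least one online primary, and it ignores secondaries, which may also serve reads.

```python
def write_available(total_primaries: int, online_primaries: int) -> bool:
    """Writes need a simple (strict) majority of primary allocations online."""
    return online_primaries > total_primaries / 2


def read_available(online_primaries: int) -> bool:
    """Assumption: reads need only one online primary (secondaries ignored)."""
    return online_primaries >= 1


def write_fault_tolerance(total_primaries: int) -> int:
    """Primaries that can be lost while a majority is still online."""
    return (total_primaries - 1) // 2


def read_fault_tolerance(total_primaries: int) -> int:
    """Primaries that can be lost while at least one is still online."""
    return max(total_primaries - 1, 0)
```

Under these assumptions, two primaries keep reads available after one failure but not writes, which matches the two-server example in the quoted text. Three primaries are the smallest number that tolerates one failure for writes.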


While secondaries serve as a copy of your database, providing some level of durability (what is committed cannot be lost), they do not guarantee it completely.
Secondaries pull updates from a selected upstream member (primary) on their own schedule.
Due to their asynchronous nature, secondaries may not provide all transactions committed on the primary allocation(s).
Contributor:

We could rework this to say there can be a lag between when a transaction is committed on the primaries and when it becomes available on the secondaries.
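
The lag described here can be pictured with a toy model. The sketch below is not how Neo4j implements replication: `Primary`, `Secondary`, `pull`, and `lag` are hypothetical names, and transactions are plain strings. It shows only that a secondary pulling on its own schedule trails its upstream primary until the next pull.

```python
class Primary:
    """Toy primary: holds an ordered log of committed transactions."""

    def __init__(self) -> None:
        self.committed: list[str] = []

    def commit(self, tx: str) -> None:
        self.committed.append(tx)


class Secondary:
    """Toy secondary: copies the upstream log only when pull() is called."""

    def __init__(self, upstream: Primary) -> None:
        self.upstream = upstream
        self.applied: list[str] = []

    def pull(self) -> None:
        # Fetch every transaction committed upstream since the last pull.
        self.applied.extend(self.upstream.committed[len(self.applied):])

    def lag(self) -> int:
        # Number of committed transactions not yet visible on this secondary.
        return len(self.upstream.committed) - len(self.applied)


primary = Primary()
secondary = Secondary(primary)
for tx in ("tx1", "tx2", "tx3"):
    primary.commit(tx)
lag_before_pull = secondary.lag()  # transactions committed but not yet pulled
secondary.pull()
lag_after_pull = secondary.lag()
```

Between pulls, a read served by the secondary would not see the newest commits, which is the gap the suggested rewording describes.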

[[recommended-number-of-secondaries]]
=== How many secondaries you should have

Secondaries typically provide read scale out, i.e. if you have more read queries happening than your primaries can handle, you can add secondaries to share the load; or even configure the query routing so that reads preferentially target secondaries to leave the primaries free to handle just the write workload.
Contributor:

read scale out -> read scaling?
