Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 39 additions & 0 deletions docs/client-api/configuration/content/_conventions-csharp.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ import CodeBlock from '@theme/CodeBlock';
[FindPropertyNameForDynamicIndex](../../../client-api/configuration/conventions.mdx#findpropertynamefordynamicindex)
[FindPropertyNameForIndex](../../../client-api/configuration/conventions.mdx#findpropertynameforindex)
[FirstBroadcastAttemptTimeout](../../../client-api/configuration/conventions.mdx#firstbroadcastattempttimeout)
[GlobalHttpClientTimeout](../../../client-api/configuration/conventions.mdx#globalhttpclienttimeout)
[HttpClientType](../../../client-api/configuration/conventions.mdx#httpclienttype)
[HttpPooledConnectionIdleTimeout](../../../client-api/configuration/conventions.mdx#httppooledconnectionidletimeout)
[HttpPooledConnectionLifetime](../../../client-api/configuration/conventions.mdx#httppooledconnectionlifetime)
Expand Down Expand Up @@ -704,6 +705,44 @@ public TimeSpan FirstBroadcastAttemptTimeout \{ get; set; \}
</TabItem>

</Admonition>

<Admonition type="note" title="">

#### GlobalHttpClientTimeout

* Use the `GlobalHttpClientTimeout` convention to set the maximum time the underlying `HttpClient` can spend on a single HTTP request.
If a request takes longer than this value, the request times out and the client reports an error.
This setting does not control how long an idle TCP connection remains open; it limits how long the client waits for an HTTP request to complete.

* This value also serves as the maximum allowed timeout for request-executor timeouts,
such as [RequestTimeout](../../../client-api/configuration/conventions.mdx#requesttimeout),
[FirstBroadcastAttemptTimeout](../../../client-api/configuration/conventions.mdx#firstbroadcastattempttimeout),
[SecondBroadcastAttemptTimeout](../../../client-api/configuration/conventions.mdx#secondbroadcastattempttimeout), and per-request command timeouts.
Setting any of these to a value greater than `GlobalHttpClientTimeout` will throw an exception.

* In client code, set `GlobalHttpClientTimeout` to be greater than or equal to the longest request timeout you plan to use.
For example, if you set `RequestTimeout` or a per-request timeout to 24 hours, `GlobalHttpClientTimeout` must also be at least 24 hours.

* Set the value to `Timeout.InfiniteTimeSpan` to remove this upper-bound restriction.

* DEFAULT: `12 hours`

<TabItem value="GlobalHttpClientTimeout" label="GlobalHttpClientTimeout">
<CodeBlock language="csharp">
{`GlobalHttpClientTimeout = TimeSpan.FromHours(6)
`}
</CodeBlock>
</TabItem>
<TabItem value="GlobalHttpClientTimeoutSyntax" label="GlobalHttpClientTimeoutSyntax">
<CodeBlock language="csharp">
{`// Syntax:
public TimeSpan GlobalHttpClientTimeout \{ get; set; \}
`}
</CodeBlock>
</TabItem>

</Admonition>

<Admonition type="note" title="">

#### HttpClientType
Expand Down
77 changes: 77 additions & 0 deletions docs/sharding/configuration.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
---
title: "Sharding: Configuration Options"
description: "Configure RavenDB sharding options for sharded import stream compression, orchestrator request timeouts, and shard executor HTTP compression."
sidebar_label: Configuration
sidebar_position: 13
---

import Admonition from '@theme/Admonition';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';
import LanguageSwitcher from "@site/src/components/LanguageSwitcher";
import LanguageContent from "@site/src/components/LanguageContent";
import Panel from "@site/src/components/Panel";
import ContentFrame from "@site/src/components/ContentFrame";

<Admonition type="note" title="">

* The following configuration options control sharded import stream compression and HTTP communication between the orchestrator and the shards.

Learn how to apply configuration options in [Configuration Overview](../server/configuration/configuration-options.mdx).

* In this article:
* [Sharding.Import.CompressionLevel](../sharding/configuration.mdx#shardingimportcompressionlevel)
* [Sharding.OrchestratorTimeoutInMin](../sharding/configuration.mdx#shardingorchestratortimeoutinmin)
* [Sharding.ShardExecutor.UseHttpCompression](../sharding/configuration.mdx#shardingshardexecutorusehttpcompression)
* [Sharding.ShardExecutor.UseHttpDecompression](../sharding/configuration.mdx#shardingshardexecutorusehttpdecompression)

</Admonition>

<ContentFrame>

## Sharding.Import.CompressionLevel

The compression level used when the orchestrator sends imported data to the shards during a smuggler import.

- **Type**: `CompressionLevel` (`NoCompression`, `Optimal`, `Fastest`, `SmallestSize`)
- **Default**: `NoCompression`
- **Scope**: Server-wide or per database

</ContentFrame>
<ContentFrame>

## Sharding.OrchestratorTimeoutInMin

* Sets the maximum timeout, in minutes, for HTTP requests that the orchestrator sends to shards.
* By default, this timeout is infinite, so no timeout is enforced by this setting.

---

- **Type**: `int` (minutes)
- **Default**: `Infinite` (no timeout)
- **Scope**: Server-wide only

</ContentFrame>
<ContentFrame>

## Sharding.ShardExecutor.UseHttpCompression

Compress HTTP request content sent by the orchestrator to the shards.

- **Type**: `bool`
- **Default**: `false`
- **Scope**: Server-wide only

</ContentFrame>
<ContentFrame>

## Sharding.ShardExecutor.UseHttpDecompression

Allow the orchestrator to accept and decompress compressed HTTP responses from the shards.

- **Type**: `bool`
- **Default**: `false`
- **Scope**: Server-wide only

</ContentFrame>
23 changes: 8 additions & 15 deletions docs/sharding/import-and-export.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@ import CodeBlock from '@theme/CodeBlock';
import LanguageSwitcher from "@site/src/components/LanguageSwitcher";
import LanguageContent from "@site/src/components/LanguageContent";

# Sharding: Import and Export
<Admonition type="note" title="">

* Smuggler is a RavenDB interface with which data can be
Expand All @@ -34,12 +33,13 @@ import LanguageContent from "@site/src/components/LanguageContent";
done with a non-sharded database.
Behind the scenes, Studio uses Smuggler to perform these operations.

* In this page:
* In this article:
* [Export](../sharding/import-and-export.mdx#export)
* [Import](../sharding/import-and-export.mdx#import)
* [Export and Import Options Summary](../sharding/import-and-export.mdx#export-and-import-options-summary)

</Admonition>

## Export

When smuggler is called to
Expand All @@ -63,18 +63,17 @@ in a `ShardedSmugglerResult` type. This type is specific to a sharded database,
and casting it using a non-sharded type [will fail](../migration/client-api/previous-versions-client-breaking-changes.mdx#casting-smuggler-results).
</Admonition>



## Import

Smuggler can be used to [import](../client-api/smuggler/what-is-smuggler.mdx#import)
data into a database from either a [.ravendbdump](../sharding/import-and-export.mdx#export)
file or from backup files (full or incremental).

When data is imported into a sharded database one of the shard nodes
is appointed **orchestrator**; the orchestrator retrieves items from
the `.ravendbdump` or backup files, gathers the imported items into
batches, and distributes them among the shards.
When data is imported into a sharded database one of the shard nodes is appointed **orchestrator**; the orchestrator retrieves items from
the `.ravendbdump` or backup files, gathers the imported items into batches, and distributes them among the shards.

The compression level applied to the import streams the orchestrator sends to the shards can be set with the
[Sharding.Import.CompressionLevel](../sharding/configuration.mdx#shardingimportcompressionlevel) configuration key.

## Importing data from a `.ravendbdump` file

Expand Down Expand Up @@ -125,8 +124,6 @@ contain only the changes that have been made in the database since the last full
Import the full backup first, and then the incremental backups that complement it.
</Admonition>



## Export and Import Options Summary

| Option | Available on a Sharded Database | Comment |
Expand All @@ -139,8 +136,4 @@ Import the full backup first, and then the incremental backups that complement i
| Import from a `.ravendbdump` file | **Yes** | An orchestrator is appointed to distribute the data among the shards. |
| Import from **Backup files** | **Yes** | **Importing** data from a backup file does **not** create a new database like running the [restore process](../sharding/backup-and-restore/restore.mdx) over the backup file would, but **adds the data** to the existing database by distributing it among the shards. |
| Import from **Full backup files** | **Yes** | |
| Import from **Incremental backup files** | **Yes** | |




| Import from **Incremental backup files** | **Yes** | |
41 changes: 22 additions & 19 deletions docs/sharding/overview.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@ import CodeBlock from '@theme/CodeBlock';
import LanguageSwitcher from "@site/src/components/LanguageSwitcher";
import LanguageContent from "@site/src/components/LanguageContent";

# Sharding Overview
<Admonition type="note" title="">

* Sharding, supported by RavenDB from version 6.0 onward, is the distribution of a database's content among autonomous **Shards**,
Expand All @@ -24,7 +23,7 @@ import LanguageContent from "@site/src/components/LanguageContent";
* In most cases, sharding allows the efficient usage and management of exceptionally large databases
(e.g., a 10-terabyte DB).

* In this page:
* In this article:
* [Sharding](../sharding/overview.mdx#sharding)
* [Licensing](../sharding/overview.mdx#licensing)
* [Client compatibility](../sharding/overview.mdx#client-compatibility)
Expand All @@ -42,6 +41,7 @@ import LanguageContent from "@site/src/components/LanguageContent";
* [Creating a sharded database](../sharding/overview.mdx#creating-a-sharded-database)

</Admonition>

## Sharding

As a database grows [very large](https://en.wikipedia.org/wiki/Very_large_database), storing and managing it may become too demanding for any single node.
Expand All @@ -52,6 +52,7 @@ With sharding, as the volume of stored data grows, the database can be scaled ou
This allows the database to be managed by multiple nodes and effectively removes most limits on its growth.
In this manner, the size of the overall database, comprised of all shards, can reach dozens of terabytes and more,
while keeping the resources of each shard in check and maintaining high performance and throughput.

#### Licensing

<Admonition type="info" title="">
Expand All @@ -61,6 +62,7 @@ Sharding is fully available with the **Enterprise** license.
* On a **Developer** license, the replication factor is restricted to 1.
* On **Community** and **Professional** licenses, all shards must be on the same node.
* Learn more about licensing [here](../licensing/overview.mdx).

#### Client compatibility

Sharding is managed by the RavenDB server;
Expand All @@ -72,14 +74,21 @@ clients require no special adaptation when accessing a sharded database:

Specific modifications to RavenDB features in a sharded environment are documented in detail
in feature-specific articles.

#### Client-Server communication

When a client connects to a sharded database, it is appointed a RavenDB server that functions as an **orchestrator**,
mediating all the communication between the client and the database shards.
The client remains unaware of this process and uses the same API used by non-sharded databases to load documents, query, and perform other operations.

Note that this additional communication between the client and the orchestrator, as well as between the orchestrator and the shards,
introduces some overhead compared to using a non-sharded database.
introduces some overhead compared to using a non-sharded database.

The HTTP traffic between the orchestrator and the shards can be compressed
(see [Sharding.ShardExecutor.UseHttpCompression](../sharding/configuration.mdx#shardingshardexecutorusehttpcompression)
and [Sharding.ShardExecutor.UseHttpDecompression](../sharding/configuration.mdx#shardingshardexecutorusehttpdecompression)),
and a timeout can be set on the orchestrator's requests to the shards (see [Sharding.OrchestratorTimeoutInMin](../sharding/configuration.mdx#shardingorchestratortimeoutinmin)).

#### When should sharding be used?

While sharding solves many issues related to the storage and management of high-volume databases,
Expand All @@ -103,13 +112,13 @@ is in the vicinity of 250 GB, so the transition is already well established when

</Admonition>


## Shards

While each cluster node of a non-sharded database handles a full replica of the entire database,
each **shard** is assigned a **subset** of the entire database content.

<Admonition type="note" title="">

For example:

Take a 3-shard database, in which shard **1** is populated with documents `Users/1`..`Users/2000`,
Expand All @@ -119,6 +128,7 @@ A client that connects to this database to retrieve `Users/3000` and `Users/5000
automatically-appointed [orchestrator node](../sharding/overview.mdx#client-server-communication)
that would seamlessly retrieve `Users/3000` from shard **2**
and `Users/5000` from shard **3** and hand them to the client.

</Admonition>

As far as clients are concerned, a sharded database is still a single entity:
Expand All @@ -131,6 +141,7 @@ a client can, for example, track the shard where a document is stored and query
Studio can be used to relocate ([reshard](../sharding/resharding.mdx)) documents from one shard to another.

!["Studio Document View"](./assets/overview_document-view.png)

#### Shard replication

Similar to non-sharded databases, shards can be **replicated** across cluster nodes to ensure the continuous availability
Expand All @@ -145,20 +156,21 @@ The number of nodes a shard is replicated to is determined by the **Shard Replic

* The Shard Replication Factor is set to 2, maintaining two replicas of each shard.



## How documents are distributed among shards

#### Buckets

Documents in a sharded database are stored within virtual containers called **buckets**.
The number of documents and the amount of data stored in each bucket may vary.
The number of documents and the amount of data stored in each bucket may vary.

#### Buckets allocation

Upon creating a sharded database, the cluster reserves **1,048,576** (1024 x 1024) buckets for the entire database.
Each shard is assigned a range of buckets from this overall set, where documents can be stored.
(Note: This default reservation method differs when using prefixed sharding. Learn more in [Bucket management](../sharding/administration/sharding-by-prefix.mdx#bucket-management)).

!["Buckets Allocation"](./assets/overview_buckets-allocation.png)

#### Buckets population

The cluster automatically populates the buckets with documents in the following way:
Expand All @@ -185,13 +197,12 @@ the bucket number assigned to a document also determines which shard the documen
Learn more in [Sharding by prefix](../sharding/administration/sharding-by-prefix.mdx).

</Admonition>

#### Document extensions storage

Document extensions (i.e. Attachments, Time series, Counters, and Revisions) are stored in the same bucket as the document they belong to.
To achieve this, the bucket number (hash code) for these extensions is calculated using the ID of the document that owns them.



## Resharding

[Resharding](../sharding/resharding.mdx) is the relocation of data from one shard to another to maintain a balanced database,
Expand All @@ -210,10 +221,9 @@ For example:
* Move all the data that belongs to this bucket to shard **2**.
* Associate bucket `100,000` with shard **2**.
* From now on, any data added to this bucket will be stored in shard **2**.

</Admonition>



## Paging

From the client's perspective, [paging](../indexes/querying/paging.mdx) is conducted similarly in both sharded and non-sharded databases,
Expand All @@ -224,8 +234,6 @@ and sort the retrieved results before handing the selected page to the client.

Read more about paging [here](../sharding/querying.mdx#paging).



## Using local IP addresses

The local IP address of a cluster node can be exposed, allowing other cluster nodes to prioritize it over the public IP address when accessing the node.
Expand All @@ -236,16 +244,11 @@ that may need to communicate with all other shards to process the request and it

Use [this configuration option](../server/configuration/core-configuration.mdx#serverurlcluster) to expose a node's local IP address to other nodes.



## Creating a sharded database

* When a database is created, the user can choose whether it will be sharded or not.
RavenDB (version 6.0 and later) provides this option by default, with no further steps required to enable the feature.

* A sharded database can be created via the [Studio](../sharding/administration/studio-admin.mdx#creating-a-sharded-database) or the [Client API](../sharding/administration/api-admin.mdx).

* A RavenDB cluster can run both sharded and non-sharded databases in parallel.



* A RavenDB cluster can run both sharded and non-sharded databases in parallel.
Loading
Loading