12 changes: 12 additions & 0 deletions config/quickwit.yaml
@@ -103,6 +103,13 @@ version: 0.8
# metastore_uri: s3://your-bucket/indexes
# metastore_uri: postgres://username:password@host:port/db
#
# Optional PostgreSQL read replica URI. Nodes started with the
# `metastore_read_replica` service connect to it over a read-only connection and
# serve stale-tolerant read-only metastore requests. Searchers use those nodes
# only when `searcher.use_metastore_read_replica` is enabled. Defaults to unset.
#
# metastore_read_replica_uri: postgres://username:password@read-replica-host:port/db
#
# When using a file-backed metastore, the state of the metastore will be cached forever.
# If you are indexing and searching from different processes, it is possible to periodically
# refresh the state of the metastore on the searcher using the `polling_interval` hashtag.
@@ -168,6 +175,11 @@ indexer:
# https://quickwit.io/docs/configuration/node-config#searcher-configuration
#
# searcher:
#   # If true, routes read-only metastore requests from searchers, including
#   # DataFusion when enabled, to nodes running the `metastore_read_replica`
#   # service. Searchers require at least one `metastore_read_replica` node at
#   # startup and do not fall back to the primary metastore.
#   use_metastore_read_replica: false

Collaborator Author

We need this flag in addition to `metastore_read_replica_uri` because users can set the URI via the `QW_METASTORE_READ_REPLICA_URI` environment variable. Without this flag, searchers would need access to that same environment variable even though they don't need the secret itself.
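
To make the separation in this comment concrete, here is a minimal sketch, not Quickwit's actual code. `MetastoreTarget`, `searcher_metastore_target`, and `validate_read_replica_node` are hypothetical names, and the rule that a `metastore_read_replica` node must have the URI set is an assumption. The searcher-side decision reads only the boolean flag, so a searcher never has to see the replica URI or its credentials.

```rust
/// Where a searcher sends read-only metastore requests (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum MetastoreTarget {
    /// Nodes running the regular `metastore` service.
    Primary,
    /// Nodes running the `metastore_read_replica` service.
    ReadReplicaNodes,
}

/// Searcher side: the decision depends only on
/// `searcher.use_metastore_read_replica`. The replica URI is not an input,
/// so searchers do not need access to the secret.
pub fn searcher_metastore_target(use_metastore_read_replica: bool) -> MetastoreTarget {
    if use_metastore_read_replica {
        MetastoreTarget::ReadReplicaNodes
    } else {
        MetastoreTarget::Primary
    }
}

/// Replica-node side (assumed rule, not confirmed by the diff): a node that
/// enables the `metastore_read_replica` service needs
/// `metastore_read_replica_uri` to be set.
pub fn validate_read_replica_node(
    runs_read_replica_service: bool,
    metastore_read_replica_uri: Option<&str>,
) -> Result<(), String> {
    if runs_read_replica_service && metastore_read_replica_uri.is_none() {
        return Err(
            "`metastore_read_replica_uri` must be set on `metastore_read_replica` nodes"
                .to_string(),
        );
    }
    Ok(())
}
```

With this split, only the nodes that actually open the PostgreSQL connection carry `QW_METASTORE_READ_REPLICA_URI`.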

#   fast_field_cache_capacity: 1G
#   split_footer_cache_capacity: 500M
#   partial_request_cache_capacity: 64M
5 changes: 4 additions & 1 deletion docs/configuration/node-config.md
@@ -22,14 +22,15 @@ A commented example is available here: [quickwit.yaml](https://github.com/quickw
| `version` | Config file version. `0.7` is the only available value with a retro compatibility on `0.5` and `0.4`. | | |
| `cluster_id` | Unique identifier of the cluster the node will be joining. Clusters sharing the same network should use distinct cluster IDs.| `QW_CLUSTER_ID` | `quickwit-default-cluster` |
| `node_id` | Unique identifier of the node. It must be distinct from the node IDs of its cluster peers. Defaults to the instance's short hostname if not set. | `QW_NODE_ID` | short hostname |
- | `enabled_services` | Enabled services (control_plane, indexer, janitor, metastore, searcher) | `QW_ENABLED_SERVICES` | all services |
+ | `enabled_services` | Enabled services (control_plane, indexer, janitor, metastore, metastore_read_replica, searcher) | `QW_ENABLED_SERVICES` | all services except metastore_read_replica |
| `listen_address` | The IP address or hostname that Quickwit service binds to for starting REST and GRPC server and connecting this node to other nodes. By default, Quickwit binds itself to 127.0.0.1 (localhost). This default is not valid when trying to form a cluster. | `QW_LISTEN_ADDRESS` | `127.0.0.1` |
| `advertise_address` | IP address advertised by the node, i.e. the IP address that peer nodes should use to connect to the node for RPCs. | `QW_ADVERTISE_ADDRESS` | `listen_address` |
| `gossip_listen_port` | The port which to listen for the Gossip cluster membership service (UDP). | `QW_GOSSIP_LISTEN_PORT` | `rest.listen_port` |
| `grpc_listen_port` | The port on which gRPC services listen for traffic. | `QW_GRPC_LISTEN_PORT` | `rest.listen_port + 1` |
| `peer_seeds` | List of IP addresses or hostnames used to bootstrap the cluster and discover the complete set of nodes. This list may contain the current node address and does not need to be exhaustive. If the list of peer seeds contains a host name, Quickwit will resolve it by querying the DNS every minute. On kubernetes for instance, it is a good practise to set it to a [headless service](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services). | `QW_PEER_SEEDS` | |
| `data_dir` | Path to directory where data (tmp data, splits kept for caching purpose) is persisted. This is mostly used in indexing. | `QW_DATA_DIR` | `./qwdata` |
| `metastore_uri` | Metastore URI. Can be a local directory or `s3://my-bucket/indexes` or `postgres://username:password@localhost:5432/metastore`. [Learn more about the metastore configuration](metastore-config.md). | `QW_METASTORE_URI` | `{data_dir}/indexes` |
| `metastore_read_replica_uri` | Optional PostgreSQL read replica URI. Nodes running the `metastore_read_replica` service connect to it over a read-only connection and serve stale-tolerant read-only metastore requests. Searchers use those nodes only when `searcher.use_metastore_read_replica` is enabled. | `QW_METASTORE_READ_REPLICA_URI` | |
| `default_index_root_uri` | Default index root URI that defines the location where index data (splits) is stored. The index URI is built following the scheme: `{default_index_root_uri}/{index-id}` | `QW_DEFAULT_INDEX_ROOT_URI` | `{data_dir}/indexes` |
| environment variable only | Log level of Quickwit. Can be a direct log level, or a comma separated list of `module_name=level` | `RUST_LOG` | `info` |
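
The new default for `enabled_services` can be sketched as follows. This is an illustration only: the `Service` enum, `ALL_SERVICES`, and `default_enabled_services` are hypothetical names, not Quickwit's actual types. Every service is enabled by default except `metastore_read_replica`, which has to be listed explicitly.

```rust
/// Services a node can run (hypothetical enum mirroring the documented names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Service {
    ControlPlane,
    Indexer,
    Janitor,
    Metastore,
    MetastoreReadReplica,
    Searcher,
}

/// Every service listed in the `enabled_services` row.
pub const ALL_SERVICES: [Service; 6] = [
    Service::ControlPlane,
    Service::Indexer,
    Service::Janitor,
    Service::Metastore,
    Service::MetastoreReadReplica,
    Service::Searcher,
];

/// Default set when `enabled_services` is not configured:
/// all services except `metastore_read_replica`.
pub fn default_enabled_services() -> Vec<Service> {
    ALL_SERVICES
        .iter()
        .copied()
        .filter(|service| *service != Service::MetastoreReadReplica)
        .collect()
}
```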

@@ -285,6 +286,7 @@ This section contains the configuration options for a Searcher.
| `max_num_concurrent_split_searches` | Maximum number of concurrent split search requests running on a Searcher. | `100` |
| `split_cache` | Searcher split cache configuration options defined in the section below. Cache disabled if unspecified. | |
| `request_timeout_secs` | The time before a search request is cancelled. This should match the timeout of the stack calling into quickwit if there is one set. | `30` |
| `use_metastore_read_replica` | If true, routes read-only metastore requests from searchers, including DataFusion when enabled, to nodes running the `metastore_read_replica` service. Searchers require at least one `metastore_read_replica` node at startup and do not fall back to the primary metastore. | `false` |
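
The no-fallback behaviour described in the `use_metastore_read_replica` row can be sketched like this. It is a simplified illustration, not Quickwit's implementation: `ClusterNode` and `metastore_nodes_for_searcher` are hypothetical, and services are plain strings here. When the flag is on and no `metastore_read_replica` node is known, the function returns an error instead of substituting primary `metastore` nodes.

```rust
/// Minimal view of a cluster member (hypothetical type).
pub struct ClusterNode {
    pub node_id: String,
    pub enabled_services: Vec<String>,
}

/// Returns the IDs of the nodes a searcher would use for read-only
/// metastore requests.
pub fn metastore_nodes_for_searcher(
    use_metastore_read_replica: bool,
    nodes: &[ClusterNode],
) -> Result<Vec<&str>, String> {
    // The flag alone selects which service the searcher looks for.
    let service = if use_metastore_read_replica {
        "metastore_read_replica"
    } else {
        "metastore"
    };
    let node_ids: Vec<&str> = nodes
        .iter()
        .filter(|node| node.enabled_services.iter().any(|s| s.as_str() == service))
        .map(|node| node.node_id.as_str())
        .collect();
    // No fallback: with the flag on, an empty replica set is an error
    // rather than a reason to use the primary metastore.
    if use_metastore_read_replica && node_ids.is_empty() {
        return Err("no `metastore_read_replica` node available at startup".to_string());
    }
    Ok(node_ids)
}
```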

### Searcher split cache configuration

@@ -301,6 +303,7 @@ Example:

```yaml
searcher:
  use_metastore_read_replica: false
  fast_field_cache_capacity: 1G
  split_footer_cache_capacity: 500M
  partial_request_cache_capacity: 64M
@@ -18,6 +18,7 @@
],
"data_dir": "/opt/quickwit/data",
"metastore_uri": "postgres://username:password@host:port/db",
"metastore_read_replica_uri": "postgres://username:replica-password@replica-host:port/db",
"default_index_root_uri": "s3://quickwit-indexes",
"rest": {
"listen_port": 1111,
@@ -64,6 +65,7 @@
"replication_factor": 2
},
"searcher": {
"use_metastore_read_replica": true,
"aggregation_memory_limit": "1G",
"aggregation_bucket_limit": 500000,
"fast_field_cache_capacity": "10G",
@@ -11,6 +11,7 @@ grpc_listen_port = 3333
peer_seeds = [ "quickwit-searcher-0.local", "quickwit-searcher-1.local" ]
data_dir = "/opt/quickwit/data"
metastore_uri = "postgres://username:password@host:port/db"
metastore_read_replica_uri = "postgres://username:replica-password@replica-host:port/db"
default_index_root_uri = "s3://quickwit-indexes"

[rest]
@@ -54,6 +55,7 @@ parquet_merge_use_streaming_engine = true
replication_factor = 2

[searcher]
use_metastore_read_replica = true
aggregation_memory_limit = "1G"
aggregation_bucket_limit = 500_000
fast_field_cache_capacity = "10G"
@@ -15,6 +15,7 @@ peer_seeds:
  - quickwit-searcher-1.local
data_dir: /opt/quickwit/data
metastore_uri: postgres://username:password@host:port/db
metastore_read_replica_uri: postgres://username:replica-password@replica-host:port/db
default_index_root_uri: s3://quickwit-indexes

rest:
@@ -58,6 +59,7 @@ ingest_api:
  replication_factor: 2

searcher:
  use_metastore_read_replica: true
  aggregation_memory_limit: 1G
  aggregation_bucket_limit: 500000
  fast_field_cache_capacity: 10G
31 changes: 31 additions & 0 deletions quickwit/quickwit-config/src/node_config/mod.rs
@@ -343,6 +343,9 @@ pub struct SearcherConfig {
    #[serde(default)]
    #[serde(skip_serializing_if = "Option::is_none")]
    pub storage_timeout_policy: Option<StorageTimeoutPolicy>,
    /// Routes read-only metastore requests from searchers, including DataFusion when enabled, to
    /// nodes running the `metastore_read_replica` service.
    pub use_metastore_read_replica: bool,
    pub warmup_memory_budget: ByteSize,
    pub warmup_single_split_initial_allocation: ByteSize,
    /// Lambda configuration for serverless leaf search execution.
@@ -566,6 +569,7 @@ impl Default for SearcherConfig {
            request_timeout_secs: Self::default_request_timeout_secs(),
            leaf_request_timeout_secs: Self::default_request_timeout_secs(),
            storage_timeout_policy: None,
            use_metastore_read_replica: false,
            warmup_memory_budget: ByteSize::gb(100),
            warmup_single_split_initial_allocation: ByteSize::mb(300),
            lambda: None,
@@ -812,6 +816,10 @@ pub struct NodeConfig {
    pub peer_seeds: Vec<String>,
    pub data_dir_path: PathBuf,
    pub metastore_uri: Uri,
    /// Optional PostgreSQL read replica URI. It is used as the connection URI by nodes running the
    /// [`QuickwitService::MetastoreReadReplica`] role.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub metastore_read_replica_uri: Option<Uri>,
    pub default_index_root_uri: Uri,
    pub rest_config: RestConfig,
    #[serde(skip_serializing_if = "Option::is_none")]
@@ -872,6 +880,9 @@ impl NodeConfig {
    pub fn redact(&mut self) {
        self.metastore_configs.redact();
        self.metastore_uri.redact();
        if let Some(metastore_read_replica_uri) = &mut self.metastore_read_replica_uri {
            metastore_read_replica_uri.redact();
        }
        self.storage_configs.redact();
    }

@@ -896,6 +907,26 @@ mod tests {
    use super::*;
    use crate::IndexerConfig;

    #[test]
    fn test_node_config_redact_metastore_uris() {
        let mut config = NodeConfig::for_test();
        config.metastore_uri = Uri::for_test("postgresql://username:password@host:5432/db");
        config.metastore_read_replica_uri = Some(Uri::for_test(
            "postgresql://replica-user:replica-password@replica-host:5432/db",
        ));

        config.redact();

        assert_eq!(
            config.metastore_uri,
            "postgresql://username:***redacted***@host:5432/db"
        );
        assert_eq!(
            config.metastore_read_replica_uri.unwrap(),
            "postgresql://replica-user:***redacted***@replica-host:5432/db"
        );
    }
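
For readers unfamiliar with what the test above expects, the following is a standalone sketch of password redaction on a URI string. It is not Quickwit's `Uri::redact`, whose implementation is not shown in this diff; `redact_password` is a hypothetical helper that assumes the password is percent-encoded (so it contains no raw `/`, `?`, or `#`) and reuses the `***redacted***` placeholder from the assertions.

```rust
/// Replaces the password in the `user:password@host` part of a URI with a
/// placeholder. Hypothetical helper for illustration only.
pub fn redact_password(uri: &str) -> String {
    // Split off the scheme; inputs without `://` are returned unchanged.
    let Some((scheme, rest)) = uri.split_once("://") else {
        return uri.to_string();
    };
    // The authority component ends at the first `/`, `?`, or `#`.
    let authority_end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(authority_end);
    // Userinfo is everything before the last `@` in the authority.
    let Some((userinfo, host)) = authority.rsplit_once('@') else {
        return uri.to_string();
    };
    // The password, if any, follows the first `:` in the userinfo.
    match userinfo.split_once(':') {
        Some((user, _password)) => format!("{scheme}://{user}:***redacted***@{host}{tail}"),
        None => uri.to_string(),
    }
}
```

URIs without credentials, such as `s3://quickwit-indexes`, pass through this helper unchanged.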

    #[test]
    fn test_indexer_config_serialization() {
        {