feat(vector sink): add multiple endpoint strategies #25662

Open
fpytloun wants to merge 1 commit into vectordotdev:master from fpytloun:fpytloun/vector-sink-multiple-backends

@fpytloun

Contributor

Summary

Adds multi-endpoint support to the vector sink:

  • addresses = [...] configures multiple downstream Vector endpoints
  • endpoint_strategy = "load_balance" is the default and uses the existing distributed service / endpoint-health path
  • endpoint_strategy = "failover" uses ordered, non-preemptive failover for stateful downstream Vector aggregators
  • keeps existing address = "..." behavior unchanged
  • updates generated component docs and adds a changelog fragment

The failover strategy starts with the first configured endpoint, moves to the next endpoint on request failure or per-endpoint timeout, and keeps using the successful endpoint until it fails. Hosts can use different address ordering to spread primary ownership without requiring random failover semantics.
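
The ordered, non-preemptive behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Failover` type, its `send` method, and the closure-based transport are hypothetical, and wrapping back to the start of the list after the last endpoint fails is an assumption the description does not confirm.

```rust
/// Hypothetical ordered, non-preemptive endpoint selector (illustration only).
pub struct Failover {
    endpoints: Vec<String>,
    /// Index of the endpoint currently in use.
    active: usize,
}

impl Failover {
    /// Starts on the first configured endpoint. Panics on an empty list.
    pub fn new(endpoints: Vec<String>) -> Self {
        assert!(!endpoints.is_empty(), "at least one endpoint is required");
        Self { endpoints, active: 0 }
    }

    /// Returns the endpoint that the next request will try first.
    pub fn active(&self) -> &str {
        &self.endpoints[self.active]
    }

    /// Tries each endpoint at most once, starting from the active one.
    /// `send` is a caller-supplied closure standing in for a real request.
    /// Assumption: after the last endpoint the search wraps to the first.
    pub fn send<F>(&mut self, mut send: F) -> Result<String, String>
    where
        F: FnMut(&str) -> Result<(), String>,
    {
        let n = self.endpoints.len();
        let mut last_err = String::from("no endpoint was tried");
        for offset in 0..n {
            let idx = (self.active + offset) % n;
            match send(&self.endpoints[idx]) {
                Ok(()) => {
                    // Non-preemptive: stay on whichever endpoint succeeded,
                    // even if an earlier endpoint has recovered since.
                    self.active = idx;
                    return Ok(self.endpoints[idx].clone());
                }
                Err(e) => last_err = e,
            }
        }
        Err(last_err)
    }
}
```

In this sketch, a selector built from two endpoints keeps using the second one after the first fails once, and does not switch back when the first recovers. That is the property that matters for stateful downstream aggregators.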

Validation

  • make generate-component-docs
  • cargo fmt --all -- --check
  • cargo test --no-default-features --features sinks-vector --lib sinks::vector::tests::
  • cargo test --no-default-features --features sinks-vector --lib sinks::vector::test::generate_config
  • cargo clippy --no-default-features --features sinks-vector --lib -- -D warnings -A clippy::manual_option_zip
  • ./scripts/check_changelog_fragments.sh
  • Docker Compose E2E:
    • load_balance delivered events to both downstream Vector servers
    • failover delivered only to the first endpoint while it was healthy, then moved to the second endpoint after the first was stopped, with a bounded request.timeout_secs configured

@fpytloun fpytloun requested review from a team as code owners June 22, 2026 14:47
@github-actions github-actions Bot added labels on Jun 22, 2026:

  • domain: sinks (anything related to Vector's sinks)
  • domain: external docs (anything related to Vector's external, public documentation)
  • docs review on hold (the documentation team reviews PRs only after a PR is approved by the COSE team)

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d7499cf772

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/sinks/vector/config.rs
Comment thread src/sinks/vector/config.rs
Comment thread src/sinks/vector/config.rs Outdated
Comment thread src/sinks/vector/config.rs
@fpytloun fpytloun force-pushed the fpytloun/vector-sink-multiple-backends branch from d7499cf to 25ceca6 Compare June 22, 2026 15:12
@fpytloun

Contributor Author

Addressed the review comments in the latest push (25ceca6c6f):

  • Preserved the old single-address service path so unchanged single-endpoint configs do not go through endpoint-health distributed_service.
  • Made ordered failover advance only on retriable Vector errors; non-retriable gRPC statuses such as DataLoss now bubble immediately and are not resent to later endpoints.
  • Guarded active endpoint updates so stale concurrent successes cannot preempt a newer failover target.
  • Added timeout slack around the internal per-endpoint failover loop so the final endpoint still gets its full per-endpoint timeout budget.
  • Added a regression test proving non-retriable primary rejection is not resent to secondary.
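
The rule that failover advances only on retriable errors can be sketched as below. This is an assumption-laden illustration rather than the PR's implementation: `SendError`, `is_retriable`, and `send_ordered` are hypothetical names, and treating `Unavailable` and `Timeout` as retriable is an assumption. Only `DataLoss` being non-retriable comes from the comment above.

```rust
/// Hypothetical error type for this sketch (not Vector's real error enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SendError {
    /// Assumed retriable for this sketch.
    Unavailable,
    /// Per-endpoint timeout, assumed retriable for this sketch.
    Timeout,
    /// Named as non-retriable in the comment above.
    DataLoss,
}

/// Returns true if the error should move the request to the next endpoint.
pub fn is_retriable(err: SendError) -> bool {
    matches!(err, SendError::Unavailable | SendError::Timeout)
}

/// Tries endpoints in order starting at `start`, identified by index.
/// Retriable errors advance to the next endpoint; a non-retriable error is
/// returned immediately and the request is not resent anywhere else.
pub fn send_ordered<F>(
    endpoint_count: usize,
    start: usize,
    mut send: F,
) -> Result<usize, SendError>
where
    F: FnMut(usize) -> Result<(), SendError>,
{
    assert!(endpoint_count > 0, "at least one endpoint is required");
    let mut last = SendError::Unavailable;
    for offset in 0..endpoint_count {
        let idx = (start + offset) % endpoint_count;
        match send(idx) {
            Ok(()) => return Ok(idx),
            Err(e) if is_retriable(e) => last = e,
            // Bubble up immediately without trying later endpoints.
            Err(e) => return Err(e),
        }
    }
    // Every endpoint failed with a retriable error.
    Err(last)
}
```

Under these assumptions, a primary that rejects a request with `DataLoss` causes exactly one attempt, which mirrors the regression test described in the last bullet.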

Validation rerun:

  • cargo fmt --all
  • cargo test --no-default-features --features sinks-vector --lib sinks::vector::tests::
  • cargo clippy --no-default-features --features sinks-vector --lib -- -D warnings -A clippy::manual_option_zip
  • delegated code review: APPROVE

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 25ceca6c6f

Comment thread src/sinks/vector/config.rs Outdated
Comment thread src/sinks/vector/config.rs Outdated
Comment thread src/sinks/vector/config.rs
@fpytloun fpytloun force-pushed the fpytloun/vector-sink-multiple-backends branch from 25ceca6 to 75fcbe3 Compare June 22, 2026 15:51
@fpytloun

Contributor Author

Updated the branch to address the latest review comments:

  • Endpoint health now uses the same retriable-error classification as the Vector sink retry logic, so non-retriable Vector responses such as DataLoss do not mark a reachable endpoint unhealthy.
  • HealthConfig::default() now matches the documented/serde defaults instead of zero backoff values.
  • Failover active-state advancement now uses a generation-aware CAS loop and reloads stale observed state before deciding whether to advance.
  • Added focused regressions for health classification, documented health defaults, stale failover generation handling, and stale mismatched-state reload behavior.
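
One way a generation-aware CAS loop can reject stale failover reports is sketched below. The layout is an assumption made for illustration: `ActiveEndpoint`, `advance_from`, and packing a 32-bit generation with a 32-bit endpoint index into one `AtomicU64` are hypothetical and may differ from what the PR actually does.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical shared failover state: (generation, active index) in one word.
pub struct ActiveEndpoint {
    state: AtomicU64,
}

fn pack(generation: u32, index: u32) -> u64 {
    ((generation as u64) << 32) | index as u64
}

fn unpack(state: u64) -> (u32, u32) {
    ((state >> 32) as u32, state as u32)
}

impl ActiveEndpoint {
    /// Starts at generation 0 on endpoint 0.
    pub fn new() -> Self {
        Self { state: AtomicU64::new(pack(0, 0)) }
    }

    /// Returns the current (generation, active index) pair.
    pub fn load(&self) -> (u32, u32) {
        unpack(self.state.load(Ordering::Acquire))
    }

    /// Advances past `failed_index`, which the caller observed at
    /// `observed_generation`. Returns true only if this call moved the state.
    /// A report from an older generation is stale and changes nothing.
    pub fn advance_from(
        &self,
        observed_generation: u32,
        failed_index: u32,
        endpoint_count: u32,
    ) -> bool {
        assert!(endpoint_count > 0, "at least one endpoint is required");
        let mut current = self.state.load(Ordering::Acquire);
        loop {
            let (generation, index) = unpack(current);
            if generation != observed_generation || index != failed_index {
                // Another request already advanced; do not advance twice.
                return false;
            }
            let next = pack(generation.wrapping_add(1), (index + 1) % endpoint_count);
            match self.state.compare_exchange_weak(
                current,
                next,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return true,
                // Lost a race or failed spuriously: reload and re-check.
                Err(actual) => current = actual,
            }
        }
    }
}
```

With this layout, two concurrent requests that both saw endpoint 0 fail at generation 0 advance the active endpoint only once. The second report no longer matches the stored generation and is dropped.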

Validation:

  • cargo fmt --all -- --check
  • cargo test --no-default-features --features sinks-vector --lib
  • cargo clippy --no-default-features --features sinks-vector --lib -- -D warnings -A clippy::manual_option_zip
  • Delegated code review: APPROVE
