Background
The S2S Proxy enables server-to-server communication between Temporal Servers - where each server could have differing infrastructure, security, and application configurations.
This PR adds compatibility tests for the S2S Proxy, with the intent to validate that Temporal clusters with different specifications work together when fronted by the proxy. This type of test is crucial: Temporal only guarantees compatibility between adjacent versions, but users of the proxy connect all sorts of server versions and configurations!
Primarily, this PR is concerned with testing compatibility of differing Temporal Server versions - but the framework is extensible to support differences across the entire cluster specification - such as database type, proxy setup, and application settings.
This PR is large - I'm sorry. I'll give a summary of my mental model, and I'll comment code in-line to make it easier to grok. Thanks!
Mental Model
When fronting a Temporal Server instance with the S2S Proxy, we end up with a high-level structure where the grouping of the `server` and `database` is called a `cluster`, and it's fronted via a `proxy`.

    graph LR
        P[S2S Proxy] -->|fronts| TS[Temporal Server]
        TS --> DB[(Database)]
        subgraph Cluster
            TS
            DB
        end

When we connect multiple of these clusters together we create a `topology`. This topology has `n` clusters paired with `n` proxies across a shared `network`. The proxies are connected to each other in a full-mesh structure, where along each `edge`, one cluster acts as the mux "server" and the other as the mux "client". It's a bidirectional proxy - it doesn't matter which is denoted server/client.
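To make the full-mesh arrangement concrete, here is a minimal Go sketch that enumerates one edge per unordered pair of clusters. The `Edge` type and `FullMesh` function are hypothetical names used for illustration only; they are not taken from this PR's code.

```go
package main

import "fmt"

// Edge is a hypothetical representation of one mux connection between two proxies.
// Which side is labelled Server or Client is arbitrary, since the proxy is bidirectional.
type Edge struct {
	Server string
	Client string
}

// FullMesh returns one edge for every unordered pair of clusters,
// so n clusters yield n*(n-1)/2 edges.
func FullMesh(clusters []string) []Edge {
	var edges []Edge
	for i := 0; i < len(clusters); i++ {
		for j := i + 1; j < len(clusters); j++ {
			edges = append(edges, Edge{Server: clusters[i], Client: clusters[j]})
		}
	}
	return edges
}

func main() {
	for _, e := range FullMesh([]string{"A", "B", "C"}) {
		fmt.Printf("%s (mux server) <-> %s (mux client)\n", e.Server, e.Client)
	}
}
```

For three clusters this produces the pairs A-B, A-C, and B-C, so every proxy has a mux connection to every other proxy.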
As an example, here's a 3-cluster topology:
    graph TB
        subgraph A["Cluster A"]
            PA[Proxy A] --> SA[Temporal Server A]
            SA --> DBA[(Database A)]
        end
        subgraph B["Cluster B"]
            PB[Proxy B] --> SB[Temporal Server B]
            SB --> DBB[(Database B)]
        end
        subgraph C["Cluster C"]
            PC[Proxy C] --> SC[Temporal Server C]
            SC --> DBC[(Database C)]
        end
        PA <-->|mux| PB
        PB <-->|mux| PC
        PA <-->|mux| PC

This multi-cluster connectivity is possible without the S2S Proxy, of course! But the proxy serves as a front that hides configuration differences between clusters - Temporal Server version, security, infrastructure. I call these configuration differences a cluster's specification - the exact configuration that defines it.
Here is an example topology where each cluster has a different specification:
    graph TB
        subgraph A["Cluster A - temporal:1.29.2 + postgres:15"]
            PA[Proxy A] --> SA[Temporal 1.29.2]
            SA --> DBA[(Postgres 15)]
        end
        subgraph B["Cluster B - temporal:1.27.4 + postgres:12"]
            PB[Proxy B] --> SB[Temporal 1.27.4]
            SB --> DBB[(Postgres 12)]
        end
        subgraph C["Cluster C - temporal:1.29.2 + postgres:15"]
            PC[Proxy C] --> SC[Temporal 1.29.2]
            SC --> DBC[(Postgres 15)]
        end
        PA <-->|mux| PB
        PB <-->|mux| PC
        PA <-->|mux| PC

As such, my semantics are as follows:
- `cluster`: A Temporal Server instance paired with its backing database, representing a single logical deployment unit.
- `specification`: The exact configuration that defines a `cluster` - server version, Docker image, database type/version, config templates, and schema setup scripts.
- `topology`: The complete test environment - N clusters each fronted by an S2S Proxy, all joined to a shared Docker network, with proxies interconnected in a full-mesh arrangement.

Code Structure
The code in this PR is structured around the mental model outlined above.
The Compatibility Test Suite defines three things:
As a result, this PR introduces the following structure:
Specifications
Each specification is structured in a self-contained package:
Each package exports a `New() ClusterSpec` function and bundles the server config template, schema setup script, and Docker image reference for that exact server+database combination. Adding a new specification is as simple as adding a new package here.
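As a rough illustration of what such a package might look like, here is a sketch of a `ClusterSpec` value and its `New()` constructor. Only `New` and `ClusterSpec` come from the description above; the field names, file names, and image tags are assumptions, and the real type in this PR may differ. The sketch uses `package main` so it runs standalone, whereas a real specification would live in its own package.

```go
package main

import "fmt"

// ClusterSpec is a hypothetical shape for a cluster specification.
// The fields below are assumptions based on the prose description,
// not the actual type defined in this PR.
type ClusterSpec struct {
	Name           string // label for the server+database combination
	ServerImage    string // Docker image reference for the Temporal Server (assumed tag)
	DatabaseImage  string // Docker image reference for the database (assumed tag)
	ConfigTemplate string // server config template bundled with the package (assumed file name)
	SchemaScript   string // schema setup script bundled with the package (assumed file name)
}

// New returns the specification for one exact server+database combination.
func New() ClusterSpec {
	return ClusterSpec{
		Name:           "temporal-1.29.2-postgres15",
		ServerImage:    "temporalio/server:1.29.2",
		DatabaseImage:  "postgres:15",
		ConfigTemplate: "config.yaml.tmpl",
		SchemaScript:   "setup-schema.sh",
	}
}

func main() {
	spec := New()
	fmt.Printf("%s: server=%s database=%s\n", spec.Name, spec.ServerImage, spec.DatabaseImage)
}
```

Keeping everything a cluster needs behind a single constructor means a topology can be described as a plain list of specifications, without the topology code knowing anything about individual versions.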
Topology
The topology package contains the logic to build a running topology from a list of ClusterSpec values. It:
Understanding this PR
Sorry - it's large. Here's where I'd start:
`tests/compatibility/matrix/matrix_test.go`
Here is where we define the entry point for a topology.
For example, this is testing two 1.29.2 clusters:

`tests/compatibility/specifications/temporal_1_29_2_postgres15/config.go`
Here we define the cluster specification for this 1.29.2 cluster:

`tests/compatibility/matrix/run.go`
Here we define the exact tests being run on each topology:

`tests/compatibility/suite/suite_replication.go`
Here is an example of a test that would be run against a topology.

`tests/compatibility/topology/topology.go`
Here is where we actually create the topology.