feat(dgw): route KDC traffic through agent tunnel (DGW-384)#1781

Draft
irvingouj@Devolutions (irvingoujAtDevolution) wants to merge 29 commits into master from feat/DGW-384-kdc-agent-tunnel

@irvingoujAtDevolution irvingouj@Devolutions (irvingoujAtDevolution) commented May 12, 2026

Closes DGW-384.

Routes KDC traffic through the agent tunnel for the two remaining paths after #1741:

  • HTTP /jet/KdcProxy endpoint
  • RDP CredSSP/NLA (rdp_proxy.rs::send_network_request)

send_krb_message gains (agent_tunnel_handle, session_id: Uuid). RDP callers pass claims.jet_aid so agent-side logs correlate KDC sub-traffic with the parent RDP session; the HTTP handler mints a fresh UUID since its token has no parent association.

Depends on #1741 — must merge first (uses agent_tunnel::routing::try_route).

Builds on #1738 (core infrastructure). Follow-up PRs will add the
Windows/Linux installer integration, gateway webapp agent
management UI, Docker deployment, and Playwright E2E harness.

Transparent routing:

- `crates/agent-tunnel/src/routing.rs`: `RoutingDecision` pipeline —
  explicit `jet_agent_id` from the JWT → subnet match → domain
  suffix match (longest wins) → direct connect. Single `try_route`
  entry point consumed by all gateway proxy paths.
- `crates/agent-tunnel/src/registry.rs`: `find_agents_for(host)` +
  `RouteAdvertisementState::matches_target()` do the lookup in one
  spot; offline agents are skipped.
- Gateway proxy integration: `api/fwd.rs`, `api/kdc_proxy.rs`,
  `api/rdp.rs`, `rd_clean_path.rs`, `generic_client.rs`, `rdp_proxy.rs`
  all call `try_route` before falling through to direct TCP.
- Tests: `agent-tunnel/src/integration_test.rs` (2 full-stack QUIC
  E2E), `tests/agent_tunnel_registry.rs` (13),
  `tests/agent_tunnel_routing.rs` (8).
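
The precedence described above (explicit agent, then subnet match, then longest domain suffix, then direct connect) can be sketched roughly as follows. This is an illustration only, not the code in `routing.rs`: `AgentRoutes`, `Decision`, and `decide` are hypothetical names, and subnets are modelled as plain `(network, prefix)` pairs.

```rust
use std::net::Ipv4Addr;

/// Hypothetical, simplified view of one enrolled agent's advertisements.
#[derive(Debug, Clone)]
pub struct AgentRoutes {
    pub id: u32,
    pub online: bool,
    /// (network address, prefix length) pairs.
    pub subnets: Vec<(Ipv4Addr, u8)>,
    pub domains: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    ViaAgent(u32),
    Direct,
}

fn in_subnet(ip: Ipv4Addr, net: Ipv4Addr, prefix: u8) -> bool {
    let prefix = u32::from(prefix.min(32));
    let mask = if prefix == 0 { 0 } else { u32::MAX << (32 - prefix) };
    (u32::from(ip) & mask) == (u32::from(net) & mask)
}

/// Length of `suffix` if `host` equals it or ends with `.suffix`.
fn suffix_match_len(host: &str, suffix: &str) -> Option<usize> {
    let host = host.to_ascii_lowercase();
    let suffix = suffix.to_ascii_lowercase();
    let dotted = format!(".{suffix}");
    (host == suffix || host.ends_with(&dotted)).then(|| suffix.len())
}

pub fn decide(explicit_agent: Option<u32>, host: &str, agents: &[AgentRoutes]) -> Decision {
    // 1. An explicit agent id always wins.
    if let Some(id) = explicit_agent {
        return Decision::ViaAgent(id);
    }
    // 2. Subnet match, online agents only.
    if let Ok(ip) = host.parse::<Ipv4Addr>() {
        let hit = agents.iter().filter(|a| a.online).find(|a| {
            a.subnets.iter().any(|&(net, prefix)| in_subnet(ip, net, prefix))
        });
        if let Some(agent) = hit {
            return Decision::ViaAgent(agent.id);
        }
    }
    // 3. Domain suffix match, longest suffix wins.
    let best = agents
        .iter()
        .filter(|a| a.online)
        .filter_map(|a| {
            let longest = a.domains.iter().filter_map(|d| suffix_match_len(host, d)).max()?;
            Some((longest, a.id))
        })
        .max_by_key(|&(len, _)| len);
    // 4. Otherwise connect directly.
    match best {
        Some((_, id)) => Decision::ViaAgent(id),
        None => Decision::Direct,
    }
}
```

With agents advertising `example.com` and `corp.example.com`, a target of `dc.corp.example.com` resolves to the second agent because its suffix is longer.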

Agent-side certificate renewal:

- `enrollment.rs`: `is_cert_expiring(cert_path, threshold_days)` and
  `generate_csr_from_existing_key(key_path, agent_name)` — the key
  never changes across renewals, the gateway just signs a new cert
  with the same public key.
- `tunnel.rs`: on connect, if the cert is within 15 days of expiry,
  the agent sends a `CertRenewalRequest` control message with a new
  CSR, waits for `CertRenewalResponse::Success`, writes the renewed
  cert and CA, and reconnects.
- `agent-tunnel/src/listener.rs`: gateway-side handler signs the
  CSR via `CaManager::sign_agent_csr` and returns the new cert chain.
  (Stub replaced: master's handler emitted a debug log and dropped
  the message.)
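
A minimal sketch of the renewal-window test, assuming the certificate's `notAfter` has already been extracted as a `SystemTime`. The real `is_cert_expiring` takes a certificate path; the helper name and signature here are hypothetical.

```rust
use std::time::{Duration, SystemTime};

const SECS_PER_DAY: u64 = 24 * 60 * 60;

/// Returns true when `not_after` is within `threshold_days` of `now`,
/// or has already passed.
pub fn is_within_renewal_window(
    not_after: SystemTime,
    now: SystemTime,
    threshold_days: u64,
) -> bool {
    let threshold = Duration::from_secs(threshold_days.saturating_mul(SECS_PER_DAY));
    match not_after.duration_since(now) {
        Ok(remaining) => remaining <= threshold,
        // `not_after` is earlier than `now`: the cert is already expired.
        Err(_) => true,
    }
}
```

Passing `now` explicitly keeps the check deterministic in tests; a caller would normally supply `SystemTime::now()` and a threshold of 15.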

QUIC endpoint override:

- `enrollment.rs`: new `quic_endpoint_override: Option<String>`
  parameter on `enroll_agent` — if set, overrides the endpoint
  returned by the enroll API. Needed because the gateway's
  `quic_endpoint` is derived from `conf.hostname`, which in a
  containerized deployment is often the container ID (not routable
  from outside).
- `main.rs`: new `--quic-endpoint` CLI flag and `jet_quic_endpoint`
  JWT claim; precedence is CLI flag > JWT claim > enroll API
  response.
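
The precedence rule (CLI flag > JWT claim > enroll API response) reduces to a short `Option` chain. The function below is a hypothetical illustration, not the code in `main.rs`.

```rust
/// Picks the QUIC endpoint: CLI flag first, then the JWT claim,
/// then the value reported by the enroll API.
pub fn resolve_quic_endpoint(
    cli_flag: Option<&str>,
    jwt_claim: Option<&str>,
    enroll_response: &str,
) -> String {
    cli_flag.or(jwt_claim).unwrap_or(enroll_response).to_owned()
}
```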

Agent-side routing primitives:

- `tunnel_helpers.rs`: `Target::Ip` / `Target::Domain` enum parsed
  from the gateway's `ConnectRequest::target`, `resolve_target`
  (domain → DNS), `connect_to_target` (happy-eyeballs).
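
A rough sketch of parsing a bare `host:port` target into an IP or domain variant. The variant payloads and the `parse_target` name are assumptions; the actual `Target` in `tunnel_helpers.rs` may be shaped differently.

```rust
use std::net::{IpAddr, Ipv6Addr};

#[derive(Debug, PartialEq, Eq)]
pub enum Target {
    Ip(IpAddr, u16),
    Domain(String, u16),
}

/// Parses `host:port`, `a.b.c.d:port`, or `[v6]:port`.
pub fn parse_target(s: &str) -> Option<Target> {
    let (host, port) = s.rsplit_once(':')?;
    let port: u16 = port.parse().ok()?;

    // Bracketed form is only valid for IPv6 literals.
    if let Some(inner) = host.strip_prefix('[').and_then(|h| h.strip_suffix(']')) {
        let ip = inner.parse::<Ipv6Addr>().ok()?;
        return Some(Target::Ip(IpAddr::V6(ip), port));
    }

    // A leftover ':' means an unbracketed IPv6 literal or a URL scheme.
    if host.is_empty() || host.contains(':') {
        return None;
    }

    match host.parse::<IpAddr>() {
        Ok(ip) => Some(Target::Ip(ip, port)),
        Err(_) => Some(Target::Domain(host.to_owned(), port)),
    }
}
```

A `Target::Domain` would then go through DNS resolution on the agent, while a `Target::Ip` can be dialed as-is.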

Tests: 22 agent-tunnel lib + 3 proto version + 24 proto control +
11 proto session + 13 registry + 8 routing integration + 64 gateway
lib, all green. Zero clippy warnings; nightly fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`run_single_connection` previously returned `Ok(())` on both graceful
shutdown and successful cert renewal. The outer reconnect loop treated
`Ok(())` as "task done forever", so after a renewal the agent exited
and never reconnected with the new cert.

Split the return with `ConnectionOutcome::{Shutdown, CertRenewed}`;
renewal now reconnects immediately (bypassing backoff), shutdown still
exits the task. Also wrap the `CertRenewalResponse` recv in a 30s
timeout so a stalled gateway cannot hang the agent indefinitely.
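
The split return value can be illustrated with a synchronous stand-in for the reconnect loop. `run_reconnect_loop` and its closure parameters are hypothetical; the real loop is async and also applies the 30s timeout, which this sketch leaves out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionOutcome {
    Shutdown,
    CertRenewed,
}

/// Runs `connect_once` until it reports a graceful shutdown and
/// returns the number of connection attempts made.
pub fn run_reconnect_loop<F, B>(mut connect_once: F, mut backoff: B) -> usize
where
    F: FnMut() -> Result<ConnectionOutcome, String>,
    B: FnMut(),
{
    let mut attempts = 0;
    loop {
        attempts += 1;
        match connect_once() {
            // Graceful shutdown: the task is done.
            Ok(ConnectionOutcome::Shutdown) => return attempts,
            // Renewal succeeded: reconnect right away with the new cert.
            Ok(ConnectionOutcome::CertRenewed) => continue,
            // Transport error: wait before retrying.
            Err(_) => backoff(),
        }
    }
}
```

Only the error arm goes through the backoff; a renewal loops straight back into a fresh connection.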
- routing.rs: when `explicit_agent_id` is set but the gateway has no
  tunnel handle, return `Err` instead of silently falling back to a
  direct connect. A token that names a specific `jet_agent_id` is
  declaring a required network boundary; silent fallback would bypass
  it.
- api/fwd.rs, generic_client.rs, rd_clean_path.rs, api/kdc_proxy.rs:
  use `TargetAddr::as_addr()` (which brackets IPv6) instead of
  `format!("{host}:{port}")` or `to_string()` (which includes scheme).
  Fixes two real bugs: IPv6 targets were malformed (`::1:443` vs
  `[::1]:443`), and kdc_proxy was passing `tcp://host:88` to the
  tunnel target parser — which only accepts bare `host:port`.
- rdp_proxy.rs: add a `TODO(agent-tunnel)` documenting that CredSSP
  Kerberos network requests cannot currently traverse the agent
  tunnel because `send_network_request` hardcodes `None` for the
  handle. Edge case (KDC behind a NAT'd site only reachable via an
  enrolled agent); plumbing the handle through `RdpProxy` is a
  follow-up.
- tests/agent_tunnel_routing.rs: replace a flaky `thread::sleep(10ms)`
  (Windows timer resolution is ~16 ms) with an explicit
  `set_received_at_for_test` helper. Adds two new tests for the new
  explicit-agent-without-handle error path.
- registry.rs: expose `set_received_at_for_test` for the above.
- agent-tunnel-proto/control.rs: fix a stale doc comment that claimed
  `subnets` is IPv4+IPv6 (it is IPv4-only; `Vec<Ipv4Network>`).
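
The IPv6 bracketing issue fixed above is easy to reproduce. The helper below is a hypothetical stand-in for what `TargetAddr::as_addr()` is described as doing, not its actual implementation.

```rust
use std::net::IpAddr;

/// Formats `host:port`, bracketing the host when it is an IPv6 literal.
pub fn host_port(host: &str, port: u16) -> String {
    match host.parse::<IpAddr>() {
        Ok(IpAddr::V6(v6)) => format!("[{v6}]:{port}"),
        // IPv4 literals and DNS names need no brackets.
        _ => format!("{host}:{port}"),
    }
}
```

A plain `format!("{host}:{port}")` with `host = "::1"` produces `::1:443`, which a `host:port` parser cannot split unambiguously.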
…nale

CI fix:
- enrollment.rs: replace `time::OffsetDateTime::now_utc()` with
  `std::time::SystemTime::now()` — the `time` crate is Windows-only in
  devolutions-agent's Cargo.toml, so the previous code broke the Linux
  lint and test jobs. No dependency added; one fewer path too.

Test relocation:
- `crates/agent-tunnel/src/integration_test.rs` (QUIC E2E +
  domain-routing E2E) moved to
  `devolutions-gateway/tests/agent_tunnel_integration.rs`. Integration
  tests belong in the gateway's dedicated `tests/` folder, not inside
  a library crate's `src/`. Imports swapped from `super::cert` /
  `super::listener` to `agent_tunnel::cert` / `agent_tunnel::`.
- `devolutions-agent/src/main.rs` inline `mod tests { ... }` block
  extracted to `devolutions-agent/src/cli_tests.rs`. Kept as a child
  module (`#[cfg(test)] mod cli_tests;`) because the tests use the
  private `UpCommand` / `parse_up_command_args` — the alternative
  (move to `tests/` folder) would require exposing binary internals
  or a lib+bin split.

Dev-dep cleanup on agent-tunnel:
- Drop `rustls-pemfile` and `tempfile` from `[dev-dependencies]` —
  both were only used by the integration test that just moved out.
- Trim `tokio` dev-dep to `["macros"]` (drop `"net"` — no TcpListener
  usage in the remaining src tests).
- `rustls` stays in `[dependencies]` — it is unavoidable: Quinn's
  QUIC server TLS config is built on `rustls::ServerConfig`. PR1's
  cleanup dropped `rustls-pemfile` and `x509-parser` (replaced with
  picky); `rustls` itself is the TLS stack, not something we pull in
  for PEM parsing.

`jet_quic_endpoint` rationale:
- Expand the terse "often a container ID in Docker" comment into a
  full explanation on `EnrollmentJwtClaims::jet_quic_endpoint` of why
  the override exists at all: a running process has no way to
  self-discover its externally-reachable address, so the enroll API's
  self-reported `conf.hostname:port` is routinely wrong in
  Docker/K8s, NAT, split-horizon DNS, and HA-behind-LB deployments.
  Only the operator who designed the network knows the right value,
  so it is encoded into the JWT at mint time. The CLI flag takes
  precedence for last-minute corrections without re-issuing a JWT.
- `enroll_agent`'s docstring and the inline comment in the override
  branch now defer to the claim doc instead of re-stating the Docker
  example.

Verification:
- `cargo check --workspace --all-targets` ✅
- `cargo clippy -p agent-tunnel -p agent-tunnel-proto -p devolutions-agent -p devolutions-gateway --all-targets` ✅ 0 warn
- Tests: 55 lib (agent-tunnel 20 / proto 3 / agent 27 / agent bin 5)
  + 25 gateway integration (e2e 2 / registry 13 / routing 10) = 80
  total, all green.
- `cargo +nightly fmt --check` ✅
The enroll API previously returned `quic_endpoint = conf.hostname:listen_port`,
which the agent used as a fallback. That value is the gateway's self-report — a
running process cannot know the address its clients actually route to (Docker
bridge NAT, K8s service FQDN, split-horizon DNS, LB VIP). In anything past a
trivial single-host setup, the self-report is silently wrong, and the agent
tunnel just times out with no clear signal pointing at the root cause.

Require the operator to supply the QUIC endpoint explicitly. They are the only
party who knows the correct externally-reachable address anyway.

Gateway (`api/tunnel.rs`):
- Drop `quic_endpoint` from `EnrollResponse`. No replacement — the gateway no
  longer tries to guess.

Agent (`enrollment.rs`, `main.rs`):
- `enroll_agent` now takes `quic_endpoint: String` (was
  `quic_endpoint_override: Option<String>`).
- `UpCommand.quic_endpoint_override` → `UpCommand.quic_endpoint`. The
  enroll-time check rejects the run if neither `--quic-endpoint` nor the
  JWT's `jet_quic_endpoint` claim is present, with a clear error message.
- Positional `enroll` subcommand gains a required `quic_endpoint` at
  position 5.
- `EnrollmentJwtClaims::jet_quic_endpoint` doc rewritten: no longer framed
  as "optional override", now "operator must supply this, here is why".

Tests (`cli_tests.rs`):
- Happy-path tests pass `--quic-endpoint` / `jet_quic_endpoint` explicitly.
- Two new tests lock in the new contract: rejection when neither source is
  given (plain `--token` mode; `--enrollment-string` mode with no claim).
- CLI-wins-over-JWT test retained.
Moves the duplicated upstream connection machinery (RoutePlan,
UpstreamLeg, UpstreamSession, connect_upstream, prepare_upstream) out
of api/fwd.rs into a new crate::upstream module, and consumes it from
fwd.rs, generic_client.rs, and rd_clean_path.rs. Net effect:

- Eliminates three copies of the same `resolve route → connect →
  optional TLS wrap` sequence. All three call sites now share one
  implementation, so future fixes (e.g. alternate-target iteration,
  IPv6 bracketing, TLS over agent tunnel) happen in one place.
- Fixes TLS-over-agent-tunnel silently not working in fwd.rs: the new
  `UpstreamSession::Tls(Box<TlsStream<UpstreamLeg>>)` wraps either TCP
  or tunnel legs, where previously the TLS path only handled TCP.
- RDP credential injection in generic_client.rs now works over agent
  tunnel too (RdpProxy<_, S> is generic over S; UpstreamLeg satisfies
  its bounds).

rd_clean_path.rs's local `ServerTransport` enum is removed in favour
of the shared `UpstreamLeg`; the comment explaining why this must be
an enum (not Box<dyn>) moves to the shared module.

devolutions-agent/src/main.rs: nightly rustfmt reflow of a long
`.context(...)` chain, no behavioural change.

- fwd.rs: 942 → 650 LOC (−292)
- generic_client.rs: 240 → 205 LOC (−35)
- rd_clean_path.rs: ~900 → ~860 LOC (−40)
- upstream.rs: +344 new (shared)

Verified: cargo check --workspace --all-targets clean (video-streamer
bench failure is pre-existing). cargo clippy -p agent-tunnel -p
agent-tunnel-proto -p devolutions-gateway -p devolutions-agent --all-targets: 0 warnings.
cargo +nightly fmt applied. Tests: 52 lib + 23 gateway integration
(routing/registry) all green.
Adds the path that DVLS (and any other authenticated admin UI) uses to
bootstrap new agents: POST a JSON body → receive a
`devolutions-agent up --enrollment-string "dgw-enroll:v1:…"` command
ready to paste on the target machine.

- Gateway mints a one-time enrollment token stored server-side, then
  encodes `{ api_base_url, quic_endpoint, enrollment_token, name }`
  into a base64url payload prefixed with `dgw-enroll:v1:`. The agent
  decodes this string and posts the token as a Bearer on
  `/jet/tunnel/enroll`.
- The endpoint derives the QUIC endpoint from the caller-supplied
  api_base_url (the operator knows the externally reachable host),
  falling back to conf.hostname. A running gateway cannot self-discover its
  externally reachable address — see `EnrollmentJwtClaims::jet_quic_endpoint`.

Adds two new canonical `AccessScope` variants that callers should
prefer for admin-tunnel operations:

- `AccessScope::AgentEnroll` (serde `gateway.agent.enroll`) — for
  minting enrollment strings and other write operations on the tunnel.
- `AccessScope::AgentRead` (serde `gateway.agent.read`) — for reading
  the connected agents list and status.

`AgentManagementWriteAccess` now accepts `AgentEnroll | ConfigWrite |
Wildcard`; `AgentManagementReadAccess` accepts `AgentRead |
DiagnosticsRead | ConfigWrite | Wildcard`. The broader existing scopes
are retained for back-compat with any caller that predates the
dedicated agent scopes.
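
The accepted-scope unions can be pictured as two `matches!` checks. The enum below lists only the variants named in this commit, and the two free functions are hypothetical; the gateway implements these as extractors.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AccessScope {
    Wildcard,
    ConfigWrite,
    DiagnosticsRead,
    AgentEnroll,
    AgentRead,
}

/// Write access: dedicated scope, plus the broader legacy scopes.
pub fn allows_agent_write(scope: AccessScope) -> bool {
    matches!(
        scope,
        AccessScope::AgentEnroll | AccessScope::ConfigWrite | AccessScope::Wildcard
    )
}

/// Read access: dedicated scope, plus the broader legacy scopes.
pub fn allows_agent_read(scope: AccessScope) -> bool {
    matches!(
        scope,
        AccessScope::AgentRead
            | AccessScope::DiagnosticsRead
            | AccessScope::ConfigWrite
            | AccessScope::Wildcard
    )
}
```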
…version

Mirrors the two new `AccessScope` variants (`gateway.agent.enroll` /
`gateway.agent.read`) on the .NET side so DVLS's `ScopeClaims` can sign
scope tokens with the canonical scope names instead of the broader
`ConfigWrite` / `DiagnosticsRead` fallbacks.

Bumps the package version to `2025.10.2-pr2-alpha1` (prerelease) so
downstream consumers can pin to this build for dev / PR validation
without clashing with the published 2025.10.1.

Adds JSON round-trip tests confirming the two new scopes serialize to
their expected `scope` field strings.
* `upstream::ConnectedUpstream::server_addr` for tunneled targets now
  reports the target IP:port (or `0.0.0.0:<port>` for a hostname
  target) instead of `0.0.0.0:0`. Logs / PCAP filenames / session info
  surface a meaningful peer address again.
* `RoutePlan` and its methods are now `pub(crate)`; they were never
  meant to leak outside the upstream module.
* `RoutingDecision::ExplicitAgentNotFound` is logged and degraded to
  Direct rather than panicking via `unreachable!` — the routing crate
  could grow new branches and we should not crash the gateway on a
  contract drift.
* `/jet/tunnel/enrollment-string` now returns `400` when neither
  `quic_host` is provided nor a parseable host can be extracted from
  `api_base_url`. The previous silent fallback to `conf.hostname` was
  a Docker/K8s footgun (often a container ID the agent cannot dial),
  and the token store insert is now performed only after that check
  so a 400 leaves no orphan token.
* `AgentManagementReadAccess` and `AgentManagementWriteAccess` reject
  with an error message that names the accepted scopes, easing
  integration debugging.
Earlier, when an RDP fwd request did not match the registry the only
visible breadcrumb was the eventual `Connected to destination server`
line — which fires for both Direct and ViaAgent paths and gave no way
to tell whether the registry was even consulted. Adding a single
debug! at the resolution call site lets an operator distinguish
"target_host did not match any agent" from "registry never asked",
which was already the difference between two real bug reports during
smoke testing.
* `quic_host` field doc updated to match the implementation, which
  rejects with 400 instead of falling back to `conf.hostname`.
* Reject overflowing `lifetime` values in
  `/jet/tunnel/enrollment-string`: with the prior `now_secs +
  lifetime_secs`, an attacker-controlled lifetime could wrap and emit
  a token whose stored expiry is in the past. The token would look
  valid to the caller while already being expired at redemption time.
  Validate up-front so the in-memory store is never poisoned.
* Mirror the gateway's dual-stack fallback on the agent's QUIC client
  socket: try `[::]:0` first, fall back to `0.0.0.0:0` with a warn
  when the host has IPv6 disabled (typical of stripped-down Linux
  containers). Prevents the agent from being stranded on hosts where
  the v6 bind itself fails.
* Renewal CSR's CommonName was being filled with
  `tunnel_conf.gateway_endpoint` (a `host:port`), which only worked
  because the gateway ignores the CSR subject and trusts the
  mTLS-authenticated identity. Read the agent's CommonName from the
  existing certificate (the authoritative source for the registered
  name) and use it for the renewal CSR.
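
The lifetime overflow check from the list above amounts to replacing the plain addition with `checked_add`. This sketch assumes both values are `u64` seconds; `compute_expiry` is a hypothetical name.

```rust
/// Computes the expiry timestamp, rejecting lifetimes that would
/// overflow instead of silently wrapping.
pub fn compute_expiry(now_secs: u64, lifetime_secs: u64) -> Result<u64, &'static str> {
    now_secs
        .checked_add(lifetime_secs)
        .ok_or("lifetime overflows the expiry timestamp")
}
```

For comparison, `now_secs.wrapping_add(u64::MAX)` yields `now_secs - 1`, an expiry one second in the past.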
Refactoring the upstream connect path collapsed the two distinct
"WebSocket-TCP forwarding" / "WebSocket-TLS forwarding" messages into
a single "WebSocket forwarding" with a `mode` structured field. The
TLS-anchoring integration test (and operators' existing log greps)
key on the literal pre-refactor strings, so the test
`cli::dgw::tls_anchoring::test::case_1_self_signed_correct_thumb`
hung waiting for them and CI failed. Keep the structured field for
new telemetry but emit the original message text per mode.
The `POST /jet/tunnel/enrollment-string` endpoint had the gateway
generating, storing, and returning a UUID enrollment token wrapped in a
`dgw-enroll:v1:<base64-json>` envelope. That put token issuance and
state where it does not belong: in Devolutions' architecture, DVLS is
the only authority for tokens (it holds the provisioner private key),
the gateway is a stateless verifier (it holds the public key). The
in-memory `EnrollmentTokenStore` also broke HA — an agent could redeem
its token only against the specific gateway node that minted it, and a
gateway restart silently invalidated all unredeemed tokens.

The same `/jet/tunnel/enroll` handler already accepts a JWT scope
token (`TunnelEnroll` / `Wildcard`) signed by the provisioner key,
which is the correct path: DVLS signs, the agent presents the JWT, the
gateway verifies statelessly. With the redundant path removed, the
gateway no longer mints tokens at all.

Removes the route, the request/response types, the `EnrollmentTokenStore`
itself, and the corresponding branch from the `enroll` handler. The
`gateway.agent.enroll` scope is kept (DVLS signs `TunnelEnroll` JWTs
with that scope on its scope-token path) and so is the static
`enrollment_secret` fallback for environments without DVLS.
Two scopes had grown for the same concept now that DVLS mints the
enrollment JWT itself: `gateway.tunnel.enroll` was the scope on the
JWT presented by the agent at `/jet/tunnel/enroll`, and
`gateway.agent.enroll` was the scope DVLS used when calling the
removed `/jet/tunnel/enrollment-string` endpoint. With that endpoint
gone, the second meaning is dead and the first is the only one that
actually authorizes anything on the wire.

Drop `AccessScope::TunnelEnroll` and have the gateway-side validator
accept `AgentEnroll | Wildcard`. DVLS signs with `agent.enroll`. The
.NET side gets a new `EnrollmentClaims` class that mirrors the Rust
`EnrollmentTokenClaims` shape (scope + jet_gw_url + optional
jet_agent_name + optional jet_quic_endpoint) so `TokenUtils.Sign` can
emit the JWT directly.

NuGet bumped to 2025.10.3.
* The cert renewal check used to run only once, immediately after the
  QUIC connection was established. With a 30-day cert and a 15-day
  renewal threshold an agent that stays connected long enough never
  reaches the check again, and once the cert expires the next mTLS
  reconnect fails — the renewal request can no longer be sent because
  the tunnel cannot come up at all. Add an hourly tick in the main
  select! loop: when it detects the cert has entered the renewal
  window, close the connection and surface ConnectionOutcome::CertRenewed,
  routing the agent back through the existing pre-loop renewal block on
  reconnect (where the control stream is still un-split, so the
  request/response handshake is straightforward).

* KDC routing in send_krb_message walked the agent-tunnel pipeline for
  any matching subnet/domain, but the agent only speaks
  ConnectRequest::tcp. A `udp://` KDC token whose host happened to
  match an enrolled agent would therefore be delivered to the agent as
  a TCP target — wrong protocol semantics that silently breaks UDP
  Kerberos deployments. Skip agent routing for non-tcp KDC schemes;
  fall through to the direct path, which honors udp/tcp correctly.
* `read_kdc_reply_message` no longer trusts the 4-byte length prefix
  blindly. A misbehaving (or malicious) tunnel peer that announces
  `u32::MAX` would otherwise cause us to pre-allocate ~4 GiB and OOM.
  Cap the announced length at 64 KiB (well above any realistic
  Kerberos reply), use checked arithmetic on the header, and surface
  oversize/overflow as `io::Error` instead of panicking.

* `route_and_connect` returns `Err` on an empty candidate slice rather
  than `assert!`-ing. The function is a public API in a library crate;
  mis-calling it should not crash the gateway process.

* `set_last_seen_for_test` and `set_received_at_for_test` are explicitly
  named test-only by the `_for_test` suffix. Add `#[doc(hidden)]` and a
  comment explaining why they remain `pub` (cross-crate integration
  tests need them; `cfg(test)` only fires in the declaring crate's
  own test build) so production callers do not pick them up by accident.

* Fixed the misleading "skip hostname verification" comment in the
  agent-tunnel integration test — the test does not skip hostname
  validation, it just narrows the trusted-roots set to the test CA.

* Added round-trip JSON tests for the new `EnrollmentClaims` (full
  claims set + null-omission of optional fields).
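
A minimal synchronous sketch of the capped length-prefix read described in the list above, using `std::io::Read`. The gateway's `read_kdc_reply_message` is presumably async and may return the payload in a different shape; here the 4-byte big-endian header is kept in the returned buffer as an assumption.

```rust
use std::io::{self, Read};

/// Upper bound on the announced reply length (64 KiB).
pub const MAX_KDC_REPLY_MESSAGE_LEN: u32 = 64 * 1024;

pub fn read_kdc_reply_message<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    // Kerberos over TCP prefixes each message with a 4-byte big-endian length.
    let mut header = [0u8; 4];
    reader.read_exact(&mut header)?;
    let announced = u32::from_be_bytes(header);

    // Refuse to allocate for an absurd announced length.
    if announced > MAX_KDC_REPLY_MESSAGE_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "KDC reply exceeds the maximum allowed length",
        ));
    }

    // Checked arithmetic for header + body size.
    let total = usize::try_from(announced)
        .ok()
        .and_then(|len| len.checked_add(header.len()))
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "KDC reply length overflow"))?;

    let mut message = vec![0u8; total];
    message[..header.len()].copy_from_slice(&header);
    reader.read_exact(&mut message[header.len()..])?;
    Ok(message)
}
```

A peer announcing `u32::MAX` gets an `InvalidData` error before any large buffer is allocated.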
The agent's `up` subcommand could only persist subnets via CLI; domain
advertisements still required hand-editing `agent.json` after
enrollment, which made demos and one-off setups awkward. Mirror the
existing `--advertise-subnets` shape: accept a comma-separated list
and persist it on enrollment, falling back to the on-disk value when
the flag is omitted (so an existing config is not silently wiped).
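
The flag-or-on-disk fallback might look like the following. The function name and the whitespace trimming are assumptions, not taken from `main.rs`.

```rust
/// Uses the comma-separated CLI list when given, otherwise keeps the
/// value already stored in the agent config.
pub fn resolve_advertised_domains(flag: Option<&str>, on_disk: Vec<String>) -> Vec<String> {
    match flag {
        Some(list) => list
            .split(',')
            .map(str::trim)
            .filter(|d| !d.is_empty())
            .map(str::to_owned)
            .collect(),
        // Flag omitted: do not wipe the existing configuration.
        None => on_disk,
    }
}
```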

`enroll.nu` at the project root wraps the full demo flow into one
command — wipe previous state, run `up` with a JWT and the
project's smoke-test advertise lists, then start the agent service in
the foreground.
JWTs minted from the DVLS UI without an explicit name in the dialog
do not carry a `jet_agent_name` claim, and the agent's `up`
subcommand requires `--name`. Pass a sensible default so the
demo helper works on any JWT, and let the caller override:

  nu enroll.nu "<JWT>"
  nu enroll.nu "<JWT>" my-agent
* Drop `enroll.nu` — it was a developer convenience for local smoke
  tests, not part of the shipped product.
* Document `EnrollmentClaims.JetAgentName` end-to-end: explain that
  the gateway never reads it (auth is by signature/scope), the
  authoritative name is sent in the agent's enrollment request body,
  and the JWT claim is read agent-side as the default for the
  `--name` CLI flag — letting DVLS pre-fill the name typed in the
  "Generate Enrollment String" dialog.
* Move the `agent_tunnel_*` integration tests out of
  `devolutions-gateway/tests/` and into the `testsuite` crate's
  central test binary (`testsuite/tests/agent_tunnel/{integration,
  registry, routing}.rs`), where the rest of the cross-crate
  integration tests already live. Drop the now-ineffective
  `#![allow(unused_crate_dependencies)]` inner attributes (the lint
  is crate-level only) and add the agent-tunnel-related dev deps to
  `testsuite/Cargo.toml`.
PR #1741 was reviewed as too large. Reduce its scope to A+B (refactor +
transparent routing) by backing out the cert-renewal additions (C) and
the JWT-based enrollment pivot (D). Both will be opened as their own
PRs against master.

Cert renewal (C) removed:

- Agent-side: drop the pre-loop expiry check, periodic cert_expiry_tick
  in the main select! loop, ConnectionOutcome enum, and the
  `is_cert_expiring` / `read_agent_name_from_cert` /
  `generate_csr_from_existing_key` helpers from enrollment.rs.
- Gateway-side: drop the agent's ability to drive renewal; the
  CertRenewal proto messages stay (they exist on master from #1738) and
  the listener keeps the stub debug-and-drop arm. AGENT_CERT_VALIDITY_DAYS
  reverts to 365.

JWT enrollment refactor (D) removed:

- Gateway: revert token.rs (TunnelEnroll only, no AgentEnroll/AgentRead),
  extract.rs (no AgentManagement scope unions), and api/tunnel.rs to
  master (EnrollmentTokenStore-backed enroll handler with
  quic_endpoint in the response).
- Agent-tunnel crate: restore enrollment_store module + handle getter +
  registration in bind().
- Agent CLI: revert main.rs and cli_tests.rs to before --advertise-domains
  (config-side advertise_domains support stays, only the CLI flag goes).
  Test JWTs go back to gateway.tunnel.enroll scope.
- NuGet: delete EnrollmentClaims.cs, drop GatewayAgentEnroll/Read from
  AccessScope.cs, revert csproj version, drop the new
  JsonSerializationTests cases.
Trim missed agent-side D content — the JWT enrollment refactor lives in
its own follow-up PR, so PR2 should not carry any of it:

- enrollment.rs: restore EnrollResponse::quic_endpoint and the original
  enroll_agent / persist_enrollment_response signatures (no extra
  quic_endpoint or advertise_domains parameters). Drop the
  EnrollmentJwtClaims::jet_quic_endpoint claim — the enrollment JWT
  carries gw_url / agent_name only.
- main.rs: drop --quic-endpoint CLI flag, drop UpCommand::quic_endpoint,
  drop the JWT jet_quic_endpoint extraction, restore the two-arg-shorter
  service-mode signature, restore the inline cli tests module.
- cli_tests.rs: removed (the tests are back inline in main.rs at the
  master state).
… PRs

Tests (testsuite/tests/agent_tunnel/{integration,registry,routing}.rs) and
the read_cert_chain rewrite are not part of the routing/upstream feature
itself — they ship as their own PRs so this one stays focused on the
feature code:

- Cert PEM parsing fix → #1771
- Agent-tunnel test suite → follow-up PR (stacked on this one)

After this trim, PR2's diff is purely:
- Routing: agent-tunnel/{routing,registry,listener}.rs
- Upstream refactor: devolutions-gateway/upstream.rs and the proxy paths
  (fwd, kdc_proxy, rdp, rdp_proxy, rd_clean_path, generic_client)
- Agent client: devolutions-agent/tunnel_helpers.rs (TargetAddr widening
  to handle IPv6 alongside IPv4)
Revert kdc_proxy.rs to master. KDC tunnel routing covers two callers
(the /jet/KdcProxy HTTP handler and the CredSSP/NLA path in
rdp_proxy.rs::send_network_request), and both need to agree on how
send_krb_message takes a session_id. Doing only the HTTP handler here
forced a Uuid::new_v4() at the routing site for a "session" the KDC
token has no notion of -- meaningless on the wire and on the agent log.

Move the whole KDC-via-tunnel story (HTTP path + CredSSP path, plus
the read_kdc_reply_message DoS cap) into DGW-384 where the API can be
designed once with both call sites in view.

Also drop the kdc_proxy.rs reference from routing.rs's module doc
since this crate's caller is now only the upstream module family.
When an agent advertises the KDC's subnet or DNS domain, route Kerberos
traffic through the QUIC tunnel just like every other proxy path. This
closes the last gap left after the transparent routing PR (#1741):

- `/jet/KdcProxy` HTTP endpoint — `send_krb_message` now consults the
  routing pipeline before falling back to direct TCP. The HTTP handler
  has no parent association, so it mints a fresh session_id purely for
  agent-side log correlation.

- RDP CredSSP/NLA — `rdp_proxy.rs::send_network_request` previously
  hard-coded `None` for the agent handle. Plumb `agent_tunnel_handle`
  and `session_id` from `RdpProxy` down through `perform_credssp_with_*`
  → `resolve_*_generator` → `send_network_request`. The same change
  reaches the credential-injection clean path (`rd_clean_path.rs`).
  `session_id` here is `session_info.id` / `claims.jet_aid` so the
  agent log ties KDC sub-traffic to its parent RDP session.

Stack: based on #1741. Picks up `agent_tunnel::routing::try_route`.

`send_krb_message` signature gains `(agent_tunnel_handle, session_id)`
in that order — required `Uuid`, no `Option<>` — so the call site is
honest about which UUID it's logging. The UDP scheme guard (KDC over
UDP keeps going direct because the agent protocol only carries TCP)
and the 64 KiB `MAX_KDC_REPLY_MESSAGE_LEN` DoS cap (and the matching
generic `read_kdc_reply_message`) come along since they live in the
same file and serve the same end.
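
The UDP scheme guard can be sketched as a small pre-check ahead of the routing pipeline. `KdcPath` and `kdc_path` are hypothetical names; the real `send_krb_message` takes the tunnel handle and session id and performs the actual I/O.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum KdcPath {
    /// The routing pipeline may be consulted for an agent.
    TunnelEligible,
    /// Go straight to the direct path.
    Direct,
}

/// Only `tcp://` KDC addresses are eligible for the agent tunnel,
/// since the agent protocol only carries TCP.
pub fn kdc_path(kdc_url: &str, has_tunnel_handle: bool) -> KdcPath {
    let is_tcp = kdc_url
        .split_once("://")
        .map(|(scheme, _)| scheme.eq_ignore_ascii_case("tcp"))
        .unwrap_or(false);

    if is_tcp && has_tunnel_handle {
        KdcPath::TunnelEligible
    } else {
        KdcPath::Direct
    }
}
```

A `udp://` KDC address stays on the direct path even when an enrolled agent advertises its host.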