
WIP: hdfs: add Kerberos authentication support #4419

Open
Khalid-Nowaf wants to merge 1 commit into redpanda-data:main from Khalid-Nowaf:hdfs/upgrade-v2

@Khalid-Nowaf


Adds Kerberos authentication support for the HDFS connector by upgrading to github.com/colinmarc/hdfs/v2 and introducing shared HDFS auth configuration for input and output.

Changes

  • Adds auth.kerberos config for HDFS output.
  • Adds limited Kerberos support for HDFS input.
  • Supports keytab-based Kerberos auth with:
    • krb5_conf
    • keytab
    • principal
    • service_principal
    • data_transfer_protection
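As a rough illustration of how the keytab-based settings above might be carried in Go, here is a minimal sketch. The struct name `hdfsKerberosConfig` appears later in this thread; the Go field names, the `missing` helper, and the choice of which fields count as required are assumptions made for the example, not the PR's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// hdfsKerberosConfig holds the keytab-based settings listed in the PR
// description. Field names here are assumed for illustration.
type hdfsKerberosConfig struct {
	Krb5Conf               string // krb5_conf
	Keytab                 string // keytab
	Principal              string // principal
	ServicePrincipal       string // service_principal
	DataTransferProtection string // data_transfer_protection
}

// missing returns the config keys that this sketch treats as required
// (an assumption) and that are empty or whitespace-only.
func (c hdfsKerberosConfig) missing() []string {
	var out []string
	for _, f := range []struct{ key, val string }{
		{"krb5_conf", c.Krb5Conf},
		{"keytab", c.Keytab},
		{"principal", c.Principal},
	} {
		if strings.TrimSpace(f.val) == "" {
			out = append(out, f.key)
		}
	}
	return out
}

func main() {
	// Example values are placeholders.
	cfg := hdfsKerberosConfig{Krb5Conf: "/etc/krb5.conf", Principal: "connect@EXAMPLE.COM"}
	fmt.Println("missing keys:", cfg.missing())
}
```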

Related Issues
#1347

@CLAassistant

CLAassistant commented May 11, 2026


CLA assistant check
All committers have signed the CLA.

@Khalid-Nowaf

Author

I would like to know how I trigger the review process.

@Jeffail Jeffail left a comment

Contributor


Thank you for the contribution, @Khalid-Nowaf — Kerberos support here has been long awaited. I spent some time tracing through the diff and the upstream libraries; the notes below are intended as collaborative feedback.

Kerberos client lifecycle (internal/impl/hdfs/config.go + input.go/output.go Close)

Login() inside hdfsKerberosConfig.client() spawns a background renewal goroutine via gokrb5's enableAutoSessionRenewal (client/session.go). Because the client is constructed with NewWithKeytab, the TGT will refresh indefinitely for the lifetime of the *krbclient.Client — so long-running pipelines should authenticate without issue, which is great.

However, neither hdfsReader.Close nor hdfsWriter.Close currently calls Destroy() on the *krbclient.Client (which stops the renewal goroutine, drops the session, and zeroes credential material), nor closes the underlying *hdfs.Client. For a process that brings the component up once and runs forever this is harmless, but on component rebuild (config reload, fatal-error retry) the previous instance's renewal goroutine retains a reference to the Client and the in-memory keytab, preventing GC. Stashing the Kerberos client on the reader/writer struct and calling Destroy() (plus closing the HDFS client) in Close would resolve both.

Dead branch in validateInput (config.go)

hdfsConfigFromParsed already invokes c.kerberos.validate() before validateInput runs, and validate() rejects any data_transfer_protection value outside {"", "authentication", "integrity", "privacy"}. The default: arm of the validateInput switch is therefore unreachable and could be removed for clarity.
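To make the reachability argument concrete, here is a simplified sketch of the two checks as free functions. The names `validateProtection` and the error wording are assumptions; only the set of accepted values and the call order come from the diff as described above.

```go
package main

import "fmt"

// validateProtection models the shared validate() step: only the four
// listed values are accepted.
func validateProtection(p string) error {
	switch p {
	case "", "authentication", "integrity", "privacy":
		return nil
	default:
		return fmt.Errorf("unsupported data_transfer_protection %q", p)
	}
}

// validateInput runs only after validateProtection has passed, so it needs
// to name just the modes the input rejects. No default arm is required.
func validateInput(p string) error {
	switch p {
	case "integrity", "privacy":
		return fmt.Errorf("data_transfer_protection %q is not supported by the hdfs input", p)
	}
	return nil
}

func main() {
	for _, p := range []string{"", "authentication", "integrity", "privacy", "bogus"} {
		if err := validateProtection(p); err != nil {
			fmt.Println("config error:", err)
			continue // validateInput is never reached for unknown values
		}
		fmt.Printf("%q -> input check: %v\n", p, validateInput(p))
	}
}
```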

`user` field semantics under Kerberos

In hdfs/v2, when KerberosClient is set, User becomes the effective/proxy user rather than the authenticating identity (which is derived from the principal). The current field description — "A user ID to connect as." — is accurate for the non-Kerberos case but potentially misleading once Kerberos is enabled. A brief note in the description would help operators understand the interaction.

Origin of the input integrity/privacy restriction

Would you mind clarifying the source of the "integrity"/"privacy" limitation for the HDFS input? Knowing whether this stems from colinmarc/hdfs/v2, a deliberate Connect-side decision, or a gap to be closed later would be useful for operators — and a short reference in the validation error message itself would aid debugging in the field.

@Khalid-Nowaf

Khalid-Nowaf commented Jun 3, 2026

Author

Thanks @Jeffail for the comprehensive review.

I’d like to provide more context on the origin of the input integrity/privacy restriction, before jumping to refactor the code based on your review.

Before working on this, I set up a local HDFS cluster with Kerberos enabled and created local integration tests (I'm using macOS, and the current integration tests do not support my OS): Local Integration Test Setup, Reproduce Failure Script.

Once the environment was ready, the following test cases failed:

  • Kerberos with data_transfer_protection: integrity
    • The Hadoop CLI can read and write files successfully, so the cluster itself appears healthy.
    • ReadDir works.
    • ReadFile fails with: proto: cannot parse invalid wire-format data.
  • Kerberos with data_transfer_protection: privacy
    • The Hadoop CLI can read and write files successfully.
    • ReadDir works.
    • ReadFile fails with the same error.

This does not appear to be a Kerberos authentication issue or a problem with the HDFS setup. Instead, it looks like an issue in the upstream Go HDFS client’s DataNode read path when the stream is wrapped with either integrity or privacy protection.

I started debugging the upstream HDFS client, but it required more time than I initially expected, so I treated this as an upstream bug for now.

For that reason, I block these two modes only for the HDFS input path, to avoid exposing a configuration that I know will fail. I still allow them for HDFS output, because output operations were tested locally and worked correctly, including with privacy.

@Khalid-Nowaf

Author

I think the issue is with my environment or local test setup. I’ll try using the current integration tests on a Linux machine and add Kerberos (krb5) integration tests as well.

I’ll get back once I’ve tested it again.

@Khalid-Nowaf Khalid-Nowaf changed the title hdfs: add Kerberos authentication support WIP: hdfs: add Kerberos authentication support Jun 3, 2026