feature - Technical implementation discussion of kafka to iggy bridge #3252

ryerraguntla · 2026-05-14T00:58:21Z


This discussion branches out from #3081 and covers the architecture and design of the kafka_iggy_bridge. Design details will be added here for community discussion and to share the implementation approach. All implementation Jira tickets will also be added here.

ryerraguntla · 2026-05-14T01:27:54Z


Below is the proposed architecture for version 0.01 of the kafka_2_iggy bridge. The goal is to provide a bridge that eases the transition to Iggy using Iggy's current capabilities. As Iggy's new server and SDK become available, the known by-design gaps will be addressed.

Architecture Overview

Approach: Standalone TCP Proxy (referenced as Option 1 in discussion #3081, comment #3044)

The bridge runs as a separate process that:

Listens on port 9093 (Kafka wire protocol)
Decodes Kafka binary requests (ApiVersions, Produce, Fetch, etc.)
Translates to Iggy SDK calls via IggyBridge
Encodes Kafka-compliant responses and sends them back to the client
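To make the first two steps concrete: every Kafka request on the wire is prefixed with a 4-byte big-endian length. The sketch below shows one way the listener could split a frame off a receive buffer before decoding it. The names `split_frame` and `FrameError`, and the idea of passing a maximum frame size as a parameter, are assumptions for illustration and not the bridge's actual code.

```rust
/// Errors a length-prefixed frame reader can report (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum FrameError {
    NegativeLength(i32),
    TooLarge { len: usize, max: usize },
}

/// Try to split one Kafka frame off the front of `buf`.
///
/// Returns `Ok(None)` when more bytes are needed, or
/// `Ok(Some((payload, rest)))` where `payload` is the request body
/// (header + API-specific fields) and `rest` is whatever follows it.
pub fn split_frame(
    buf: &[u8],
    max_len: usize,
) -> Result<Option<(&[u8], &[u8])>, FrameError> {
    if buf.len() < 4 {
        return Ok(None); // the INT32 size prefix is not complete yet
    }
    let len = i32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if len < 0 {
        return Err(FrameError::NegativeLength(len));
    }
    let len = len as usize;
    if len > max_len {
        // Reject oversized frames before buffering them.
        return Err(FrameError::TooLarge { len, max: max_len });
    }
    if buf.len() < 4 + len {
        return Ok(None); // payload still incomplete
    }
    Ok(Some((&buf[4..4 + len], &buf[4 + len..])))
}
```

Both slices borrow from the input buffer, so no payload bytes are copied at this stage.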

Key Technical Achievements

1. Complete Protocol Foundation

Files: src/protocol/codec.rs, src/protocol/header.rs

The codec implements a zero-copy encoder/decoder for all Kafka primitives.
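As an illustration of what zero-copy decoding of a few primitives could look like, here is a minimal sketch covering UNSIGNED_VARINT, NULLABLE_STRING, and COMPACT_NULLABLE_STRING. The `Decoder` and `DecodeError` types and their method names are assumptions made for this example; the real codec.rs may be organized differently.

```rust
/// Cursor over a borrowed request buffer. Decoded strings borrow from
/// the buffer instead of being copied (hypothetical type).
pub struct Decoder<'a> {
    buf: &'a [u8],
    pos: usize,
}

#[derive(Debug, PartialEq)]
pub enum DecodeError {
    UnexpectedEof,
    InvalidLength,
    InvalidUtf8,
    VarintTooLong,
}

impl<'a> Decoder<'a> {
    pub fn new(buf: &'a [u8]) -> Self {
        Decoder { buf, pos: 0 }
    }

    /// Borrow the next `n` bytes and advance the cursor.
    fn take(&mut self, n: usize) -> Result<&'a [u8], DecodeError> {
        let buf: &'a [u8] = self.buf;
        let end = self.pos.checked_add(n).ok_or(DecodeError::UnexpectedEof)?;
        let slice = buf.get(self.pos..end).ok_or(DecodeError::UnexpectedEof)?;
        self.pos = end;
        Ok(slice)
    }

    pub fn read_i16(&mut self) -> Result<i16, DecodeError> {
        let b = self.take(2)?;
        Ok(i16::from_be_bytes([b[0], b[1]]))
    }

    /// UNSIGNED_VARINT: 7 bits per byte, lowest group first,
    /// high bit set on every byte except the last. At most 5 bytes.
    pub fn read_unsigned_varint(&mut self) -> Result<u32, DecodeError> {
        let mut value: u32 = 0;
        for shift in (0..35).step_by(7) {
            let byte = self.take(1)?[0];
            value |= u32::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                return Ok(value);
            }
        }
        Err(DecodeError::VarintTooLong)
    }

    /// NULLABLE_STRING (legacy): INT16 length, where -1 means null.
    pub fn read_nullable_string(&mut self) -> Result<Option<&'a str>, DecodeError> {
        let len = self.read_i16()?;
        if len == -1 {
            return Ok(None);
        }
        if len < 0 {
            return Err(DecodeError::InvalidLength);
        }
        let bytes = self.take(len as usize)?;
        std::str::from_utf8(bytes).map(Some).map_err(|_| DecodeError::InvalidUtf8)
    }

    /// COMPACT_NULLABLE_STRING (flexible): UNSIGNED_VARINT holding
    /// length + 1, where 0 means null.
    pub fn read_compact_nullable_string(&mut self) -> Result<Option<&'a str>, DecodeError> {
        let n = self.read_unsigned_varint()?;
        if n == 0 {
            return Ok(None);
        }
        let bytes = self.take((n - 1) as usize)?;
        std::str::from_utf8(bytes).map(Some).map_err(|_| DecodeError::InvalidUtf8)
    }
}
```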

2. Request/Response Header Auto-detection

File: src/protocol/header.rs

Kafka has two incompatible header formats:

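v1 (non-flexible): api_key | api_version | correlation_id | client_id (NULLABLE_STRING)
v2 (flexible): api_key | api_version | correlation_id | client_id (NULLABLE_STRING) | tagged_fields

Note that client_id remains a legacy NULLABLE_STRING in v2; the only addition is the trailing tagged_fields section.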

The bridge auto-detects which format to use from a lookup table covering all 65 API keys.

Impact: Clients using old versions (e.g., Kafka 2.x) and new versions (Kafka 3.x+) work seamlessly against the same bridge.
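A lookup of this kind could be sketched as follows, restricted to the six Phase 1 API keys. The function names are hypothetical; the thresholds are the first flexible version of each API as given in the Kafka protocol message definitions. One detail worth encoding explicitly is that ApiVersions responses always use response header v0, even for flexible request versions, so a client can still parse an error from a broker that does not support the requested version.

```rust
/// First flexible version for each Phase 1 API key.
/// Only the six keys implemented so far are listed in this sketch.
pub fn first_flexible_version(api_key: i16) -> Option<i16> {
    match api_key {
        0 => Some(9),   // Produce
        1 => Some(12),  // Fetch
        2 => Some(6),   // ListOffsets
        3 => Some(9),   // Metadata
        18 => Some(3),  // ApiVersions
        19 => Some(5),  // CreateTopics
        _ => None,
    }
}

pub fn is_flexible(api_key: i16, api_version: i16) -> bool {
    match first_flexible_version(api_key) {
        Some(threshold) => api_version >= threshold,
        None => false,
    }
}

/// Request header: v2 for flexible versions, v1 otherwise.
pub fn request_header_version(api_key: i16, api_version: i16) -> i16 {
    if is_flexible(api_key, api_version) { 2 } else { 1 }
}

/// Response header: v1 (with tagged fields) for flexible versions,
/// v0 otherwise. ApiVersions (key 18) is the exception and always
/// answers with header v0.
pub fn response_header_version(api_key: i16, api_version: i16) -> i16 {
    if api_key == 18 {
        0
    } else if is_flexible(api_key, api_version) {
        1
    } else {
        0
    }
}
```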

3. Kafka-to-Iggy Mapping Layer

File: src/iggy_bridge.rs

The IggyBridge struct wraps the Iggy SDK and provides idempotent operations:

| Kafka Concept | Iggy Mapping | Method |
|---------------|--------------|--------|
| Topic | Stream + Topic (same name) | `ensure_stream_and_topic()` |
| Partition | Balanced partitioning (Iggy decides) | `produce()` |
| Offset | Message offset (0-based, same as Kafka) | `fetch()` |
| High watermark | Current end offset | `high_watermark()` |

Key Design Decision: Kafka partition IDs are ignored during produce. The bridge uses Partitioning::balanced() so that Iggy handles partition assignment. This avoids 0-vs-1-based index confusion and leverages Iggy's native load balancing. The decision will be revisited once Iggy partition mapping is implemented; for now it will be the same partition ID.

4. Implemented API Keys (Phase 1)

| API Key | Name | Versions | Iggy Integration |
|---------|------|----------|------------------|
| 18 | ApiVersions | v0–v3 | N/A (handshake only) |
| 3 | Metadata | v0–v9 | Returns stub (single broker at 127.0.0.1:9093) |
| 0 | Produce | v3–v9 | send_messages() with balanced partitioning |
| 1 | Fetch | v4–v12 | poll_messages() with offset-based polling |
| 2 | ListOffsets | v1–v6 | high_watermark() for latest offset |
| 19 | CreateTopics | v2–v5 | create_stream() + create_topic() |
Code Locations:

Decoders: src/protocol/requests.rs
Encoders: src/protocol/responses.rs
Iggy integration: src/protocol/api.rs (handle_request_with_iggy())
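The table above can also be expressed as a small dispatch guard that runs before any decoding. The sketch below is an assumption about how such a check might look, not the contents of api.rs; `ApiKey`, `from_i16`, `supported_versions`, and `check_request` are hypothetical names. The error code 35 is Kafka's UNSUPPORTED_VERSION.

```rust
use std::ops::RangeInclusive;

/// Phase 1 API keys, numbered as in the Kafka protocol.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiKey {
    Produce = 0,
    Fetch = 1,
    ListOffsets = 2,
    Metadata = 3,
    ApiVersions = 18,
    CreateTopics = 19,
}

/// Kafka error code for UNSUPPORTED_VERSION.
pub const UNSUPPORTED_VERSION: i16 = 35;

impl ApiKey {
    pub fn from_i16(value: i16) -> Option<ApiKey> {
        match value {
            0 => Some(ApiKey::Produce),
            1 => Some(ApiKey::Fetch),
            2 => Some(ApiKey::ListOffsets),
            3 => Some(ApiKey::Metadata),
            18 => Some(ApiKey::ApiVersions),
            19 => Some(ApiKey::CreateTopics),
            _ => None,
        }
    }

    /// Version ranges taken from the Phase 1 table.
    pub fn supported_versions(self) -> RangeInclusive<i16> {
        match self {
            ApiKey::Produce => 3..=9,
            ApiKey::Fetch => 4..=12,
            ApiKey::ListOffsets => 1..=6,
            ApiKey::Metadata => 0..=9,
            ApiKey::ApiVersions => 0..=3,
            ApiKey::CreateTopics => 2..=5,
        }
    }
}

/// Accept a request only if both the key and the version are implemented;
/// otherwise return the error code to place in the response.
pub fn check_request(api_key: i16, api_version: i16) -> Result<ApiKey, i16> {
    match ApiKey::from_i16(api_key) {
        Some(key) if key.supported_versions().contains(&api_version) => Ok(key),
        _ => Err(UNSUPPORTED_VERSION),
    }
}
```

The same ranges could also feed the ApiVersions response, so that clients negotiate down to a version the bridge actually handles.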

5. Comprehensive Testing Infrastructure

Unit Tests

Codec tests (tests/codec_tests.rs): Varint encoding, nullable strings, compact arrays
Header tests (tests/header_tests.rs): Header v1/v2 round-trip, correlation ID preservation
API handler tests (tests/api_handler_tests.rs): ApiVersions flexible/legacy encoding

Integration Tests

Server lifecycle (tests/server_integration_tests.rs): Connection handling, frame size limits, graceful shutdown
Golden wire fixtures (tests/golden_wire_fixtures_tests.rs): Byte-exact validation against known-good Kafka messages

End-to-End Testing Tool

tools/kafka-tool: A CLI that generates 280+ valid Kafka wire messages (every API key × every version) using the kafka-protocol crate.

Purpose: Ensures the bridge doesn't crash or produce malformed responses when faced with any valid Kafka request.

Technical Challenges To Solve

Challenge 1: Flexible vs. Legacy Encoding

Kafka introduced "flexible" encoding (compact arrays, tagged fields) gradually across versions. For example:

Produce v9+ is flexible
Produce v3–v8 is legacy
ApiVersions v3+ is flexible
Metadata v9+ is flexible (but v0–v8 is legacy)

Solution: Version-aware encoder/decoder that checks api_version >= flexible_threshold and calls the appropriate codec methods (write_compact_nullable_string() vs. write_nullable_string()).
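A minimal sketch of that check on the encoding side, assuming free functions over a `Vec<u8>`. The names `write_nullable_string()` and `write_compact_nullable_string()` come from the description above; their signatures, the `EncodeError` type, and the `write_versioned_nullable_string` wrapper are assumptions for illustration.

```rust
#[derive(Debug, PartialEq)]
pub enum EncodeError {
    StringTooLong(usize),
}

/// UNSIGNED_VARINT: 7 bits per byte, lowest group first.
pub fn write_unsigned_varint(out: &mut Vec<u8>, mut value: u32) {
    while value >= 0x80 {
        out.push(((value & 0x7f) as u8) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Legacy encoding: INT16 length (-1 for null) followed by UTF-8 bytes.
pub fn write_nullable_string(out: &mut Vec<u8>, s: Option<&str>) -> Result<(), EncodeError> {
    match s {
        None => out.extend_from_slice(&(-1i16).to_be_bytes()),
        Some(s) => {
            if s.len() > i16::MAX as usize {
                return Err(EncodeError::StringTooLong(s.len()));
            }
            out.extend_from_slice(&(s.len() as i16).to_be_bytes());
            out.extend_from_slice(s.as_bytes());
        }
    }
    Ok(())
}

/// Flexible encoding: UNSIGNED_VARINT of length + 1 (0 for null),
/// followed by UTF-8 bytes.
pub fn write_compact_nullable_string(out: &mut Vec<u8>, s: Option<&str>) -> Result<(), EncodeError> {
    match s {
        None => write_unsigned_varint(out, 0),
        Some(s) => {
            if s.len() >= u32::MAX as usize {
                return Err(EncodeError::StringTooLong(s.len()));
            }
            write_unsigned_varint(out, s.len() as u32 + 1);
            out.extend_from_slice(s.as_bytes());
        }
    }
    Ok(())
}

/// Pick the codec method from the request version and the API's
/// first flexible version.
pub fn write_versioned_nullable_string(
    out: &mut Vec<u8>,
    s: Option<&str>,
    api_version: i16,
    flexible_threshold: i16,
) -> Result<(), EncodeError> {
    if api_version >= flexible_threshold {
        write_compact_nullable_string(out, s)
    } else {
        write_nullable_string(out, s)
    }
}
```

For a Produce response (threshold 9), the string "hi" encodes as four bytes at v8 and three bytes at v9, which is why the two formats cannot be mixed within one message.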

Challenge 2: RecordBatch Format

Kafka Produce requests contain a RecordBatch (Kafka's internal message format) with:

Compression (gzip, snappy, lz4, zstd)
Timestamps (millisecond precision)
Headers (key-value metadata)
Checksums (CRC32C)

Proposed Solution: The bridge treats each RecordBatch as opaque bytes and stores it directly in Iggy. This preserves the original format, with the following consequences:

Iggy can't filter on Kafka message headers
Iggy can't decompress Kafka messages for storage optimization
Kafka consumers can decode the RecordBatch natively

Future Enhancement: Parse RecordBatch and extract individual messages, storing them as Iggy messages with metadata preserved in headers.
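Even while the payload is stored as opaque bytes, the fixed 61-byte RecordBatch header (magic v2) can be inspected without touching the records. The sketch below reads the base offset, compression codec, and record count from that header; it could be a first step toward the enhancement above. `BatchSummary` and `peek_record_batch` are hypothetical names, and the sketch does not validate the CRC32C checksum.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Compression {
    None,
    Gzip,
    Snappy,
    Lz4,
    Zstd,
}

#[derive(Debug, PartialEq)]
pub struct BatchSummary {
    pub base_offset: i64,
    pub compression: Compression,
    pub record_count: i32,
}

/// Size of the fixed RecordBatch v2 header that precedes the records.
pub const RECORD_BATCH_HEADER_LEN: usize = 61;

fn be_i64(b: &[u8]) -> i64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[..8]);
    i64::from_be_bytes(a)
}

fn be_i32(b: &[u8]) -> i32 {
    let mut a = [0u8; 4];
    a.copy_from_slice(&b[..4]);
    i32::from_be_bytes(a)
}

/// Read header fields without decoding or decompressing any record.
/// Layout (byte offsets): baseOffset 0, batchLength 8,
/// partitionLeaderEpoch 12, magic 16, crc 17, attributes 21,
/// lastOffsetDelta 23, baseTimestamp 27, maxTimestamp 35,
/// producerId 43, producerEpoch 51, baseSequence 53, recordsCount 57.
pub fn peek_record_batch(buf: &[u8]) -> Option<BatchSummary> {
    if buf.len() < RECORD_BATCH_HEADER_LEN {
        return None;
    }
    if buf[16] != 2 {
        return None; // only the v2 batch format has this layout
    }
    let attributes = i16::from_be_bytes([buf[21], buf[22]]);
    // The low three bits of the attributes select the codec.
    let compression = match attributes & 0x07 {
        0 => Compression::None,
        1 => Compression::Gzip,
        2 => Compression::Snappy,
        3 => Compression::Lz4,
        4 => Compression::Zstd,
        _ => return None,
    };
    Some(BatchSummary {
        base_offset: be_i64(&buf[0..8]),
        compression,
        record_count: be_i32(&buf[57..61]),
    })
}
```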

Challenge 3: Partition Mapping

Problem: Kafka partitions are 0-based (0, 1, 2, ...), while Iggy partitions are 1-based (1, 2, 3, ...).

Solution: The bridge ignores Kafka partition IDs during produce and uses Partitioning::balanced(). For fetch, it polls across all partitions and returns messages in offset order. This avoids off-by-one bugs and leverages Iggy's native balancing.
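For reference, if a later version does honor Kafka partition IDs, the off-by-one translation itself is small. The sketch below is one possible shape for it, assuming the 0-based/1-based numbering described above; it is not part of the current design, and both function names are hypothetical.

```rust
/// Map a 0-based Kafka partition index to a 1-based Iggy partition ID.
/// Negative Kafka indexes are invalid and yield `None`.
pub fn kafka_to_iggy_partition(kafka_partition: i32) -> Option<u32> {
    if kafka_partition < 0 {
        None
    } else {
        (kafka_partition as u32).checked_add(1)
    }
}

/// Map a 1-based Iggy partition ID back to a 0-based Kafka index.
/// ID 0 does not exist on the Iggy side, and IDs that do not fit
/// in an i32 cannot be represented on the Kafka side.
pub fn iggy_to_kafka_partition(iggy_partition: u32) -> Option<i32> {
    if iggy_partition == 0 {
        return None;
    }
    let zero_based = iggy_partition - 1;
    if zero_based > i32::MAX as u32 {
        None
    } else {
        Some(zero_based as i32)
    }
}
```

Keeping both directions in one place means produce and fetch paths cannot drift apart on the convention.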

Trade-off: Kafka clients lose control over which partition receives a message. Sticky-key partitioning (e.g., all messages for user_id=123 go to partition 0) would need a future enhancement.

TODO: The design for this will be revisited after server-ng and the in-progress features are available.

Comparison to "Built-in Listener" Approach

Discussion #3081 mentions integrating Kafka protocol directly into Iggy server. Here's why the standalone proxy is superior:

| Concern | Standalone Proxy | Built-in Listener |
|---------|------------------|-------------------|
| Iggy codebase impact | Zero (pure client) | High (new TCP listener + dispatcher) |
| Testing complexity | Isolated unit tests | Requires Iggy server in test environment |
| Deployment flexibility | Optional (run only if Kafka compat needed) | Always-on (port 9093 always open) |
| Version evolution | Independent releases | Tied to Iggy release cycle |
| Security boundary | Proxy can enforce Kafka ACLs separately | ACLs mixed with Iggy auth |
| Latency | +1-2ms | ~0ms |

Verdict: Standalone proxy is the pragmatic choice for MVP. Built-in listener could be considered for v2 if latency becomes a bottleneck (unlikely for most workloads).

Next Steps (Phase 2 API Keys)

To achieve full producer/consumer compatibility, the following API keys are required:

Critical (Producer)

Produce
ApiVersions
Metadata
InitProducerId (API key 22) — needed for idempotent/transactional producers
AddPartitionsToTxn (API key 24) — transactional writes
EndTxn (API key 26) — commit/abort transactions

Critical (Consumer)

Fetch
ListOffsets
FindCoordinator (API key 10) — locate consumer group coordinator
JoinGroup (API key 11) — join consumer group
SyncGroup (API key 14) — receive partition assignment
Heartbeat (API key 12) — keep group membership alive
OffsetCommit (API key 8) — commit consumer offsets
OffsetFetch (API key 9) — fetch last committed offsets
LeaveGroup (API key 13) — graceful consumer shutdown

Known Limitations

Consumer Groups: Not implemented. Each Kafka consumer currently acts as an independent Iggy consumer. No rebalancing or offset tracking.
Transactions: Not supported. InitProducerId, AddPartitionsToTxn, and EndTxn return UNSUPPORTED_VERSION.
Compression: RecordBatch compression is preserved (opaque bytes) but not decompressed. Iggy stores the compressed payload as-is.
ACLs/Auth: The bridge authenticates to Iggy as root. Kafka client auth (SASL, mTLS) is not mapped to Iggy users.
Partition Assignment: Kafka clients cannot control which partition receives a message (balanced partitioning only).
