I've been thinking about it for a while and concluded that DLQ is a feature that can be trivially implemented In the future we would like to implement an In short, I think DLQ as a functionality is very simple to implement and relies heavily on user-defined business logic. Supporting it would require changes to our connector infrastructure that I don't see the justification for.
Overview
When a connector fails to process one or more messages from a batch, those messages are currently lost forever. The following is a proposal for a Dead Letter Queue (DLQ) that routes failed messages to a dedicated Iggy topic so they can be inspected, debugged, and replayed.
Outline
Design
DLQ Semantics
1. DLQ to contain failing messages only, not the entire batch
When a batch partially fails, the connector must separate successful messages from failed ones and only send the failed messages to the DLQ. The successful messages must be written to the target.
Writing the entire batch to DLQ on any single failure wastes storage, makes debugging and replay harder, and obscures the actual failure rate.
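The split described above can be sketched as a small helper. This is only an illustration: `split_batch`, `FailedMessage`, and the `process` closure are hypothetical names that do not come from the proposal or the Iggy SDK.

```rust
/// Hypothetical record of one failed message plus why it failed.
#[derive(Debug, PartialEq)]
pub struct FailedMessage<M> {
    pub message: M,
    pub reason: String,
}

/// Splits a batch into successfully processed outputs and failed inputs.
/// `process` stands in for whatever per-message work the connector does.
pub fn split_batch<M, T, F>(batch: Vec<M>, mut process: F) -> (Vec<T>, Vec<FailedMessage<M>>)
where
    F: FnMut(&M) -> Result<T, String>,
{
    let mut succeeded = Vec::new();
    let mut failed = Vec::new();
    for message in batch {
        match process(&message) {
            // Successful outputs are collected for a single write to the target.
            Ok(converted) => succeeded.push(converted),
            // Only the failing message, with its reason, is set aside for the DLQ.
            Err(reason) => failed.push(FailedMessage { message, reason }),
        }
    }
    (succeeded, failed)
}
```

With this shape, one bad message never drags the rest of the batch into the DLQ, and the length of the failed list reflects the actual failure rate.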
2. DLQ is an Iggy topic
Failed messages are published to an Iggy topic that the user defines in the sink config. The runtime reads this configuration at startup and sets up the DLQ producer accordingly.
High-level Design
The design of the DLQ follows directly from the principles above.
Since DLQ operates at message granularity and only the connector has visibility into message-level failures, it is the connector's responsibility to route messages to the DLQ. This also means that DLQ is opt-in per connector. If a connector does not implement DLQ, failed messages are silently lost.
DLQ writes are bridged via a DlqCallback function pointer injected into the plugin at open() time, following the same pattern as LogCallback. This is needed because connectors do not have access to the runtime's IggyClient.
Connector to DLQ Message Path
This section describes the flow of a failed message from detection in the sink/source to its eventual persistence in the DLQ.
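As a rough sketch of the bridge this path relies on, the following shows a writer that wraps a callback, builds an envelope, serializes it, and hands the bytes across the FFI boundary. The proposal names `DlqCallback`, `DlqWriter`, and `DlqEnvelope` but gives no signatures or layouts, so the callback signature, the envelope fields, the byte encoding, and the use of a plain topic name in place of `topic_metadata` are all assumptions made for illustration.

```rust
/// Assumed callback shape: returns 0 on success, non-zero on failure.
pub type DlqCallback = extern "C" fn(plugin_id: u32, data: *const u8, len: usize) -> i32;

/// Hypothetical envelope fields; the real layout is not specified in the proposal.
pub struct DlqEnvelope<'a> {
    pub topic: &'a str,
    pub partition_id: u32,
    pub reason: &'a str,
    pub payload: &'a [u8],
}

impl DlqEnvelope<'_> {
    /// Illustrative length-prefixed encoding, not a proposed wire format.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        out.extend_from_slice(&self.partition_id.to_le_bytes());
        for field in [self.topic.as_bytes(), self.reason.as_bytes(), self.payload] {
            out.extend_from_slice(&(field.len() as u32).to_le_bytes());
            out.extend_from_slice(field);
        }
        out
    }
}

/// Handle given to the connector; it hides the raw callback.
pub struct DlqWriter {
    plugin_id: u32,
    callback: DlqCallback,
}

impl DlqWriter {
    pub fn new(plugin_id: u32, callback: DlqCallback) -> Self {
        Self { plugin_id, callback }
    }

    /// Builds the envelope, serializes it, and invokes the runtime callback.
    pub fn send(&self, topic: &str, partition_id: u32, msg: &[u8], reason: &str) -> Result<(), i32> {
        let bytes = DlqEnvelope { topic, partition_id, reason, payload: msg }.to_bytes();
        match (self.callback)(self.plugin_id, bytes.as_ptr(), bytes.len()) {
            0 => Ok(()),
            code => Err(code),
        }
    }
}
```

Keeping serialization inside the writer means the connector only supplies the message and a reason, and the callback only ever sees an opaque byte buffer.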
Sink
- A DlqWriter is injected into the sink at open() time via a callback
- In consume(), the sink segregates messages into two structures, successful and failed, as it processes each message
- In consume(), successful messages are written to the target in a single batch write
- The sink calls dlq.send(topic_metadata, partition_id, msg, reason) for each failed message; the SDK constructs the DlqEnvelope, serializes it to bytes, and invokes the callback
Source
- A DlqWriter is injected into the source at open() time via a callback
- In poll(), the source fetches a batch from the external system and converts each record into a ProducedMessage
- Records that fail are passed to dlq.send() with the failure reason
- The source returns the successful ProducedMessages; the runtime writes them to the target Iggy topic
ABI Impact
Introducing DlqCallback requires changing the iggy_sink_open (and iggy_source_open) FFI signature, the entrypoint symbol every plugin must export. This is a breaking ABI change: any plugin compiled against the old signature and loaded by a new runtime will misinterpret the call arguments, regardless of whether the plugin uses DLQ. A forced rebuild of all plugins is required.
Three approaches to handle this:
Option 1: New versioned entrypoint symbol
Add iggy_sink_open_v2 with the new signature alongside the existing iggy_sink_open. The runtime checks for iggy_sink_open_v2 first (dlopen symbol lookup), and falls back to iggy_sink_open for older plugins. Old plugins continue to work without recompile; new plugins opt in by exporting the v2 symbol.
Option 2: Separate iggy_sink_set_dlq_callback symbol (recommended)
Keep iggy_sink_open unchanged. Add a new optional symbol iggy_sink_set_dlq_callback(id, dlq_callback) that the runtime calls after open() if DLQ is configured for that sink. Plugins that don't export this symbol simply don't get DLQ wired up.
Option 3: Accept the forced rebuild
Since all official sink and source plugins live in the same repository, a coordinated rebuild across all of them is low friction. The only plugins affected without a rebuild would be third-party plugins compiled outside the repo.
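To make Option 2 concrete, here is a minimal sketch of the runtime-side wiring. A real runtime would resolve symbols through dlopen-style lookup; in this sketch a `HashMap` stands in for the plugin's symbol table so it stays self-contained. `Plugin`, `Symbol`, `open_sink`, and every function signature are hypothetical, and only the two symbol names come from the proposal.

```rust
use std::collections::HashMap;

/// Assumed callback shape; the real signature is not given in the proposal.
pub type DlqCallback = extern "C" fn(plugin_id: u32, data: *const u8, len: usize) -> i32;

/// Stand-in for symbols resolved from a loaded plugin (simplified signatures).
pub enum Symbol {
    Open(fn(id: u32) -> i32),
    SetDlqCallback(fn(id: u32, callback: DlqCallback) -> i32),
}

/// Stand-in for a loaded plugin: symbol name mapped to resolved symbol.
pub struct Plugin {
    pub symbols: HashMap<&'static str, Symbol>,
}

/// Opens the sink, then wires DLQ only if it is configured and the plugin
/// exports the optional symbol. Returns whether DLQ ended up wired.
pub fn open_sink(plugin: &Plugin, id: u32, dlq: Option<DlqCallback>) -> Result<bool, String> {
    // The mandatory entrypoint keeps its existing signature.
    let open = match plugin.symbols.get("iggy_sink_open") {
        Some(Symbol::Open(f)) => *f,
        _ => return Err("plugin does not export iggy_sink_open".to_string()),
    };
    if open(id) != 0 {
        return Err("iggy_sink_open failed".to_string());
    }
    // The DLQ symbol is optional: older plugins simply skip this step.
    match (dlq, plugin.symbols.get("iggy_sink_set_dlq_callback")) {
        (Some(callback), Some(Symbol::SetDlqCallback(set))) => {
            if set(id, callback) != 0 {
                return Err("iggy_sink_set_dlq_callback failed".to_string());
            }
            Ok(true)
        }
        _ => Ok(false),
    }
}
```

Option 1 would follow the same lookup-then-fallback shape, with the runtime probing for iggy_sink_open_v2 before iggy_sink_open instead of probing for a separate setter.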