Kafka Idempotent Consumers: Achieving Exactly-Once Semantics
Beyond the Producer: The Consumer's Crucial Role in Exactly-Once Semantics
For senior engineers building critical event-driven systems, Kafka's promise of Exactly-Once Semantics (EOS) is both a powerful feature and a source of significant implementation complexity. While much of the documentation focuses on the transactional producer (transactional.id, enable.idempotence=true), this is only half the story. True end-to-end EOS is a property of the entire system, and the most challenging part often lies in the consumer's interaction with external, non-Kafka systems like a PostgreSQL database or a Redis cache.
Standard at-least-once delivery guarantees that messages aren't lost, but it opens the door to duplicate processing during failures and consumer group rebalances. For financial transactions, order processing, or stateful computations, processing a message twice can be catastrophic. This article bypasses introductory concepts and dives directly into production-grade architectural patterns for building truly idempotent consumers that form the bedrock of a reliable EOS pipeline.
We will analyze the atomic consume-process-produce loop and its inherent failure modes. We will then architect solutions for two distinct, real-world scenarios:
- A Kafka-to-Kafka pipeline (consume from one topic, transform, produce to another) using Kafka's Transactional API.
- A consumer that must atomically update an external system such as PostgreSQL, using the Transactional Outbox and Idempotency Key patterns.
This is not a theoretical overview. We will provide complete Java examples, discuss schema design, performance optimization for batch processing, and tackle the difficult edge cases: poison pill handling without breaking atomicity and ensuring correctness during consumer rebalances.
The Atomicity Challenge: The Consume-Process-Commit Minefield
At its core, the challenge is making three distinct operations atomic: reading from Kafka, executing business logic (e.g., updating a database), and committing the read offset. A failure at any point can violate EOS.
Consider this naive consumer logic:
// DO NOT USE IN PRODUCTION - DEMONSTRATES FAILURE MODES
while (true) {
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
// 1. Process the message (e.g., database update)
updateDatabase(record.value());
// <<< CRASH! Failure here means DB is updated, but offset is not committed.
// On restart, the message will be re-processed, causing a duplicate update.
// 2. Commit the offset
consumer.commitSync(); // Or async
}
}
Conversely, if we commit the offset before the database update, a crash after the commit but before the DB write results in data loss (at-most-once semantics). The fundamental problem is the lack of a distributed transaction that spans both Kafka and the external database.
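For contrast, here is a minimal sketch of the commit-first ordering, reusing the hypothetical updateDatabase helper from the naive example above:
// DO NOT USE IN PRODUCTION - commit-before-process yields at-most-once
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
// 1. Commit the offsets for the whole polled batch up front
consumer.commitSync();
// <<< CRASH! On restart, the consumer resumes after the committed batch...
for (ConsumerRecord<String, String> record : records) {
    // 2. Process the message
    updateDatabase(record.value()); // ...so these updates are never applied: the messages are lost.
}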
Our goal is to architect patterns that make the combination of process + commit an atomic operation, ensuring that even if a message is delivered multiple times by Kafka, it is processed exactly once.
Pattern 1: Kafka-Native Transactions for Stream Processing
When your entire workflow is contained within Kafka (consuming from topic A, producing to topic B), you can leverage Kafka's Transactional API to achieve EOS with relative ease. This pattern is ideal for data enrichment, filtering, or transformation pipelines.
The Mechanism: The key is to bind the consumer's offset commit to the producer's transaction. The sendOffsetsToTransaction() method allows the consumer to send its offsets to the transaction coordinator, which commits them atomically with the producer's outbound messages. Consumers configured with isolation.level=read_committed will only ever read messages from committed transactions.
Configuration
Producer Configuration:
# Guarantees messages are written in order without duplicates.
# Requires acks=all, retries > 0, and max.in.flight.requests.per.connection <= 5.
# (Default since Kafka 3.0; must be set explicitly on older clients.)
enable.idempotence=true
# Required by the idempotent producer.
acks=all
# A unique, stable ID for this producer instance. Enables transactional semantics
# and is critical for fencing zombie producers.
transactional.id=my-transactional-stream-processor-v1
Consumer Configuration:
# Must disable auto-commit to manage offsets manually within the transaction.
enable.auto.commit=false
# Ensures the consumer only reads messages that are part of a committed transaction.
isolation.level=read_committed
Implementation
Here is a complete, production-ready example demonstrating the transactional loop.
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.ProducerFencedException;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
public class KafkaTransactionalStreamProcessor {
public static void main(String[] args) {
// Common properties
Properties commonProps = new Properties();
commonProps.put("bootstrap.servers", "localhost:9092");
// Producer setup
Properties producerProps = new Properties();
// Copy the values explicitly; entries passed as Properties defaults can be invisible to the Kafka clients.
producerProps.putAll(commonProps);
producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
producerProps.put("enable.idempotence", "true");
producerProps.put("transactional.id", "order-processor-01");
producerProps.put("acks", "all");
// Consumer setup
Properties consumerProps = new Properties();
consumerProps.putAll(commonProps);
consumerProps.put("group.id", "order-processor-group");
consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
consumerProps.put("enable.auto.commit", "false");
consumerProps.put("isolation.level", "read_committed");
KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
consumer.subscribe(Collections.singletonList("input-topic"));
// Initialize transactions. This registers the transactional.id with the coordinator
// and acquires an epoch, fencing any lingering producers with the same ID.
producer.initTransactions();
try {
while (true) {
ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
if (records.isEmpty()) {
continue;
}
try {
producer.beginTransaction();
for (ConsumerRecord<String, String> record : records) {
// 1. Process the message
String processedValue = record.value().toUpperCase();
System.out.printf("Processing record: key=%s, value=%s, partition=%d, offset=%d\n",
record.key(), record.value(), record.partition(), record.offset());
// 2. Produce the result to the output topic
producer.send(new ProducerRecord<>("output-topic", record.key(), processedValue));
}
// 3. Send consumed offsets to the transaction
Map<TopicPartition, OffsetAndMetadata> offsetsToCommit = new HashMap<>();
for (TopicPartition partition : records.partitions()) {
long lastOffset = records.records(partition).get(records.records(partition).size() - 1).offset();
offsetsToCommit.put(partition, new OffsetAndMetadata(lastOffset + 1));
}
producer.sendOffsetsToTransaction(offsetsToCommit, consumer.groupMetadata());
// 4. Commit the transaction
producer.commitTransaction();
} catch (ProducerFencedException e) {
throw e; // Fatal: another instance with this transactional.id has taken over; do not abort.
} catch (KafkaException e) {
System.err.println("Aborting transaction due to: " + e.getMessage());
producer.abortTransaction();
// poll() has already advanced the in-memory position, so rewind each partition
// to the start of the aborted batch; otherwise these records would be skipped.
for (TopicPartition tp : records.partitions()) {
consumer.seek(tp, records.records(tp).get(0).offset());
}
}
}
} finally {
consumer.close();
producer.close();
}
}
}
Edge Case: Fencing Zombie Instances
The transactional.id is paramount. If an instance of our processor crashes and a new one starts, Kafka uses the transactional.id and an incrementing producer epoch to "fence off" the old producer. Any lingering transactions from the old, "zombie" instance will be rejected by the broker, preventing split-brain scenarios. This is handled automatically by the client library as long as transactional.id is stable and unique per logical processor.
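As an illustration only (not part of the processor above), this is roughly what the fenced instance observes; oldProducer stands for a previously initialized transactional producer whose transactional.id has since been re-registered by a newer instance:
try {
    oldProducer.beginTransaction();
    oldProducer.send(new ProducerRecord<>("output-topic", "key", "value"));
    oldProducer.commitTransaction();
} catch (ProducerFencedException e) {
    // The broker rejects the stale producer epoch. The only safe action is to
    // close this producer; do not call abortTransaction() on a fenced producer.
    oldProducer.close();
}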
Pattern 2: Idempotency with External Systems
This is the real challenge. How do you atomically update a PostgreSQL table and commit a Kafka offset? Since a 2PC (Two-Phase Commit) across Kafka and most databases is not feasible or is prohibitively complex, we rely on making the business logic itself idempotent.
Sub-Pattern 2a: The Transactional Outbox
This pattern provides maximum safety and is ideal for systems where a consumed event triggers both a state change and the production of new events.
The Concept: Instead of writing to a database AND sending a message to Kafka (a dangerous dual-write), we do everything within a single local database transaction. The consumed message's offset is treated as just another piece of application data to be committed atomically with the business state.
Within that one transaction, the consumer:
- Performs the business state change (e.g., UPDATE accounts SET balance = balance - 100 WHERE id = 'abc').
- Inserts the outbound event into an outbox table within the same transaction (e.g., INSERT INTO outbox (payload) VALUES ('{"type": "DEBIT_SUCCESSFUL", ...}')).
- Records the consumed offset in a processed_offsets table, also within the same transaction.
If the commit succeeds, all three pieces of data are durable. A separate, asynchronous process then reads from the outbox table and reliably publishes those events to Kafka.
Schema Design
-- Stores the last successfully processed offset for each partition
CREATE TABLE kafka_consumer_offsets (
group_id VARCHAR(255) NOT NULL,
topic VARCHAR(255) NOT NULL,
partition INT NOT NULL,
current_offset BIGINT NOT NULL,
PRIMARY KEY (group_id, topic, partition)
);
-- The outbox table for reliably publishing events
CREATE TABLE outbox (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
topic VARCHAR(255) NOT NULL,
key VARCHAR(255),
payload JSONB NOT NULL,
headers JSONB,
published_at TIMESTAMPTZ -- Null until published
);
CREATE INDEX idx_outbox_unpublished ON outbox (created_at) WHERE published_at IS NULL;
Implementation: The Consumer
The consumer's job is to update the business state and the offset table atomically. It does not talk to a Kafka producer directly.
// Using Spring Boot, JPA, and Spring Kafka for conciseness
@Service
public class OrderProcessorService {
@Autowired
private DataSource dataSource;
@Transactional(propagation = Propagation.REQUIRES_NEW, rollbackFor = Exception.class)
public void processOrder(ConsumerRecord<String, Order> record) {
// This entire method runs in a single Spring-managed database transaction.
// The connection must be obtained via DataSourceUtils (not dataSource.getConnection())
// so that it participates in that transaction.
Connection conn = DataSourceUtils.getConnection(dataSource);
try {
// Step 1: Check if this message has already been processed.
// This is the idempotency check.
long lastProcessedOffset = getLastProcessedOffset(conn, record.topic(), record.partition());
if (record.offset() <= lastProcessedOffset) {
// Already processed this or an earlier message in a previous batch, skip.
return;
}
// Step 2: Perform business logic
// e.g., update an order table, inventory, etc.
updateOrderStateInDb(conn, record.value());
// Step 3: Create an outbox event
createOutboxEvent(conn, "order_processed_events", record.key(), createProcessedEvent(record.value()));
// Step 4: Update the offset in the SAME transaction
updateProcessedOffset(conn, record.topic(), record.partition(), record.offset());
// The transaction is committed by Spring upon method exit.
// If any step fails, the entire transaction is rolled back.
} catch (SQLException e) {
throw new RuntimeException("Database processing failed", e);
} finally {
DataSourceUtils.releaseConnection(conn, dataSource);
}
}
// ... DAO methods for getLastProcessedOffset, updateOrderStateInDb, etc. ...
}
// The Kafka Listener
@Component
public class OrderConsumer {
@Autowired
private OrderProcessorService processorService;
@KafkaListener(topics = "orders", containerFactory = "kafkaListenerContainerFactory")
public void listen(ConsumerRecord<String, Order> record) {
// The container factory should use manual ACK mode. We do not ACK here:
// the kafka_consumer_offsets table is the source of truth, and a rebalance
// listener should seek to the stored offset when partitions are assigned.
processorService.processOrder(record);
}
}
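The DAO methods are elided in the listing above. One possible shape for the offset helpers, written against the kafka_consumer_offsets schema (the hard-coded group ID and the upsert style are illustrative choices, not prescribed by the pattern):
// Sketch of the offset DAO methods referenced in OrderProcessorService.
private long getLastProcessedOffset(Connection conn, String topic, int partition) throws SQLException {
    String sql = "SELECT current_offset FROM kafka_consumer_offsets "
               + "WHERE group_id = ? AND topic = ? AND partition = ?";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setString(1, "order-processor-group");
        ps.setString(2, topic);
        ps.setInt(3, partition);
        try (ResultSet rs = ps.executeQuery()) {
            return rs.next() ? rs.getLong(1) : -1L; // -1 => nothing processed yet
        }
    }
}

private void updateProcessedOffset(Connection conn, String topic, int partition, long offset) throws SQLException {
    // Upsert keeps exactly one row per (group, topic, partition).
    String sql = "INSERT INTO kafka_consumer_offsets (group_id, topic, partition, current_offset) "
               + "VALUES (?, ?, ?, ?) "
               + "ON CONFLICT (group_id, topic, partition) DO UPDATE SET current_offset = EXCLUDED.current_offset";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setString(1, "order-processor-group");
        ps.setString(2, topic);
        ps.setInt(3, partition);
        ps.setLong(4, offset);
        ps.executeUpdate();
    }
}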
Implementation: The Outbox Poller
A simple, separate service polls the outbox table, sends messages to Kafka, and marks them as published. This can be a scheduled job or a dedicated microservice. Note that this leg is at-least-once: a crash between the send and the published_at update causes a re-send, so downstream consumers should deduplicate, for example on the outbox event's id.
@Service
public class OutboxPublisher {
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;
@Autowired
private DataSource dataSource;
@Scheduled(fixedDelay = 1000)
public void publishEvents() {
// Select unpublished events and lock the rows to prevent concurrent processing
String selectSql = "SELECT * FROM outbox WHERE published_at IS NULL ORDER BY created_at LIMIT 100 FOR UPDATE SKIP LOCKED";
try (Connection conn = dataSource.getConnection();
PreparedStatement selectStmt = conn.prepareStatement(selectSql)) {
conn.setAutoCommit(false);
List<UUID> publishedIds = new ArrayList<>();
ResultSet rs = selectStmt.executeQuery();
while (rs.next()) {
UUID id = (UUID) rs.getObject("id");
String topic = rs.getString("topic");
String key = rs.getString("key");
String payload = rs.getString("payload"); // JSONB is read back as its text form
ProducerRecord<String, String> record = new ProducerRecord<>(topic, key, payload);
try {
// Send to Kafka. Use a producer with idempotence enabled.
kafkaTemplate.send(record).get(); // Synchronous send for simplicity
publishedIds.add(id);
} catch (Exception e) {
// Log error, maybe implement a retry mechanism.
// The transaction will roll back, and we'll try again later.
conn.rollback();
return;
}
}
if (!publishedIds.isEmpty()) {
// Mark events as published
String updateSql = "UPDATE outbox SET published_at = NOW() WHERE id = ANY(?)";
try (PreparedStatement updateStmt = conn.prepareStatement(updateSql)) {
Array idsArray = conn.createArrayOf("uuid", publishedIds.toArray());
updateStmt.setArray(1, idsArray);
updateStmt.executeUpdate();
}
}
conn.commit();
} catch (SQLException e) {
// Log and move on; unpublished rows remain in the outbox and will be retried on the next run.
System.err.println("Outbox publishing failed: " + e.getMessage());
}
}
}
Sub-Pattern 2b: Idempotency Key Persistence
For simpler cases where the consumer only needs to update a database and doesn't produce new Kafka messages, the full outbox pattern can be overkill. A more direct approach is to use a unique identifier from the message as an idempotency key.
The Concept:
- Extract a unique key from the Kafka message (e.g., a UUID in the payload or a header).
- Before processing, check a dedicated table (or a Redis set) to see if this key has been processed.
- Wrap the business logic and the insertion of the idempotency key into a single database transaction.
Schema Design
CREATE TABLE processed_message_keys (
message_key UUID PRIMARY KEY,
processed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
consumer_group VARCHAR(255) NOT NULL
);
-- Add a TTL mechanism if keys don't need to be stored forever.
-- For example, a background job that deletes keys older than 30 days.
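One way to implement that 30-day cleanup, sketched as a Spring scheduled job; the JdbcTemplate wiring and the cron expression are assumptions for illustration:
// Sketch: periodic purge of old idempotency keys (wiring and schedule are illustrative).
@Component
public class ProcessedKeyJanitor {
    @Autowired
    private JdbcTemplate jdbcTemplate;

    @Scheduled(cron = "0 0 3 * * *") // once a day at 03:00
    public void purgeOldKeys() {
        int deleted = jdbcTemplate.update(
            "DELETE FROM processed_message_keys WHERE processed_at < NOW() - INTERVAL '30 days'");
        System.out.printf("Purged %d old idempotency keys%n", deleted);
    }
}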
Implementation
@Service
public class PaymentProcessorService {
@Autowired
private DataSource dataSource;
@Autowired
private KafkaConsumer<String, Payment> consumer;
public void processPayments() {
// Manual poll loop with manual offset management
while (true) {
ConsumerRecords<String, Payment> records = consumer.poll(Duration.ofSeconds(1));
if (records.isEmpty()) continue;
// Process entire batch in one transaction for efficiency
try (Connection conn = dataSource.getConnection()) {
conn.setAutoCommit(false);
for (ConsumerRecord<String, Payment> record : records) {
UUID idempotencyKey = record.value().getEventId();
if (hasBeenProcessed(conn, idempotencyKey)) {
System.out.printf("Skipping duplicate message with key %s\n", idempotencyKey);
continue;
}
// Business logic: update account balances, etc.
executePaymentLogic(conn, record.value());
// Mark as processed
markAsProcessed(conn, idempotencyKey);
}
conn.commit();
consumer.commitSync(); // Commit Kafka offset only after DB commit succeeds
} catch (SQLException e) {
// On DB error, the transaction rolls back and the Kafka offset is NOT committed.
// poll() has already advanced the in-memory position, so seek back to the start
// of the failed batch; otherwise these records would only be retried after a rebalance.
System.err.println("DB transaction failed. Seeking back to the start of the failed batch.");
for (TopicPartition tp : records.partitions()) {
consumer.seek(tp, records.records(tp).get(0).offset());
}
}
}
}
private boolean hasBeenProcessed(Connection conn, UUID key) throws SQLException {
// SELECT 1 FROM processed_message_keys WHERE message_key = ?
// ... implementation ...
}
private void markAsProcessed(Connection conn, UUID key) throws SQLException {
// INSERT INTO processed_message_keys (message_key, consumer_group) VALUES (?, ?)
// ... implementation ...
}
}
Performance and Batching Considerations
A naive implementation of the idempotency key check (one query per message) will create significant database load. To optimize for batches:
- Extract all idempotency keys from the ConsumerRecords batch.
- Issue a single SELECT message_key FROM processed_message_keys WHERE message_key IN (...) to find all keys that have already been processed.
- Filter out the duplicate messages in memory.
- Execute the business logic and use batched INSERT statements for the new messages.
This transforms N round-trips into a constant number of queries per batch, dramatically improving throughput; a sketch of the batched helpers follows.
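The helpers below are one possible shape for that batched check, written against the processed_message_keys table defined earlier; the method names and the ANY(?) array binding (PostgreSQL JDBC) are illustrative choices, not part of the original service.
// Sketch: batched idempotency lookup and insert (PostgreSQL JDBC assumed).
private Set<UUID> findAlreadyProcessed(Connection conn, List<UUID> keys) throws SQLException {
    Set<UUID> processed = new HashSet<>();
    try (PreparedStatement ps = conn.prepareStatement(
            "SELECT message_key FROM processed_message_keys WHERE message_key = ANY(?)")) {
        ps.setArray(1, conn.createArrayOf("uuid", keys.toArray()));
        try (ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                processed.add((UUID) rs.getObject(1));
            }
        }
    }
    return processed;
}

private void markBatchProcessed(Connection conn, List<UUID> newKeys, String consumerGroup) throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement(
            "INSERT INTO processed_message_keys (message_key, consumer_group) VALUES (?, ?)")) {
        for (UUID key : newKeys) {
            ps.setObject(1, key);
            ps.setString(2, consumerGroup);
            ps.addBatch();
        }
        ps.executeBatch(); // a single round-trip for all new keys
    }
}
In the poll loop, collect the event IDs, call findAlreadyProcessed once, skip those records, run the business logic, and call markBatchProcessed before conn.commit() and consumer.commitSync().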
Advanced Edge Cases: Ensuring Robustness
Handling Poison Pill Messages
What happens if a message is malformed or causes an unrecoverable business logic exception? In an idempotent system, it will be retried indefinitely, blocking the partition. The solution is a Dead Letter Queue (DLQ).
Implementing a DLQ within these transactional patterns requires care:
- In the Transactional Outbox pattern, the processOrder method should have a try-catch block. On a fatal exception, instead of creating a business event in the outbox, it should create a DLQ event (INSERT INTO outbox (topic, payload) VALUES ('orders_dlq', '...')). The offset is still updated atomically, allowing the consumer to move on.
- In the Idempotency Key pattern, a try-catch block around executePaymentLogic would, on failure, send the message to a DLQ topic using a separate, non-transactional Kafka producer, then commit the database transaction (including the idempotency key to prevent retries) and the Kafka offset.
Surviving Consumer Rebalancing
A rebalance can occur at any time. A consumer might process a batch and update the database, but crash before it can commit its Kafka offset. When the partition is assigned to a new consumer, it will re-read the same batch of messages.
This is precisely the scenario our idempotent patterns are designed to handle.
- The new consumer will execute the same logic.
- Its idempotency check will consult the processed_offsets table or the processed_message_keys table.
- It will find that the messages have already been processed and will simply skip them, committing the offset and moving on.
This demonstrates that idempotency is the ultimate defense against the duplicate deliveries inherent in at-least-once systems, especially during rebalances.
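To make the handover seamless in the outbox pattern, the consumer can also resume from the database-stored offsets when partitions are assigned. A minimal sketch, assuming a hypothetical loadOffsetFromDb helper that reads the kafka_consumer_offsets table and returns null when no row exists:
// Sketch: resume from DB-stored offsets on assignment (loadOffsetFromDb is hypothetical).
consumer.subscribe(Collections.singletonList("orders"), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        for (TopicPartition tp : partitions) {
            Long lastProcessed = loadOffsetFromDb("order-processor-group", tp.topic(), tp.partition());
            if (lastProcessed != null) {
                consumer.seek(tp, lastProcessed + 1); // next record to read
            }
        }
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Nothing to flush here: the database transaction is the unit of progress.
    }
});
Even without the seek, duplicate deliveries are caught by the idempotency check; the listener simply avoids re-reading work that is already recorded in the database.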
Conclusion: A System-Level Approach to EOS
Achieving Exactly-Once Semantics in a distributed system with Kafka is not a feature you simply turn on; it's an architectural discipline. While Kafka's transactional API provides a powerful foundation for Kafka-native stream processing, integrating with external systems requires shifting the focus from delivery guarantees to idempotent processing.
The Transactional Outbox pattern offers the highest level of data integrity for complex workflows, ensuring that state changes and outbound events are perfectly synchronized. For simpler state updates, the Idempotency Key pattern provides a lower-overhead solution.
By understanding the failure modes of the consume-process-commit cycle and implementing these robust, transaction-aware patterns, senior engineers can build resilient, reliable systems that correctly handle data exactly once, even in the face of failures, retries, and rebalances.