Mastering DynamoDB Concurrency with Conditional Writes & Versioning

Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Inevitability of Race Conditions in Distributed Systems

In any distributed system operating at scale, the classic read-modify-write pattern is a ticking time bomb for data integrity. When multiple concurrent processes read the same piece of data, modify it in memory, and attempt to write it back, a "lost update" anomaly is almost guaranteed. The last writer wins, silently overwriting the changes made by others.

Consider a high-traffic inventory management system for an e-commerce platform during a flash sale. The DynamoDB table for products might look like this:

json
{
  "sku": "TSHIRT-BLK-L",
  "description": "Large Black T-Shirt",
  "stock_count": 100,
  "price": 19.99
}

A naive implementation for processing an order might be:

  • Read: Fetch the item for TSHIRT-BLK-L. stock_count is 100.
  • Modify: In the application logic, calculate the new stock: new_stock = 100 - 1 = 99.
  • Write: Update the item in DynamoDB, setting stock_count to 99.

Now, imagine two application servers, Server A and Server B, receive an order for the same SKU at virtually the same millisecond:

  • T1: Server A reads the item. stock_count is 100.
  • T2: Server B reads the item. stock_count is 100.
  • T3: Server A calculates its new stock (99) and updates DynamoDB. The stock_count is now 99.
  • T4: Server B, completely unaware of Server A's update, calculates its new stock based on the stale data it read at T2 (100 - 1 = 99) and updates DynamoDB. The stock_count is now... 99.

Two shirts were sold, but the inventory was only decremented by one. This is a catastrophic failure of data consistency that leads to overselling, customer dissatisfaction, and financial loss.

Traditional relational databases solve this with row-level locking or complex transaction isolation levels, but these mechanisms often come at a significant performance cost and are antithetical to the design philosophy of a hyperscale NoSQL database like DynamoDB.
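
The T1 to T4 timeline can be reproduced without DynamoDB at all. The following is a minimal in-memory sketch, not SDK code: the `inventory` object and the `readItem`/`writeItem` helpers are hypothetical stand-ins for a table and its unconditional read and write calls.

```typescript
// In-memory stand-in for the product table; not a real DynamoDB client.
type StockItem = { sku: string; stock_count: number };

const inventory: Record<string, StockItem> = {
  "TSHIRT-BLK-L": { sku: "TSHIRT-BLK-L", stock_count: 100 },
};

// A read returns a copy, like a client holding its own snapshot of the item.
function readItem(sku: string): StockItem {
  const item = inventory[sku];
  if (!item) throw new Error(`Product with SKU ${sku} not found.`);
  return { ...item };
}

// Unconditional write: whoever writes last wins.
function writeItem(item: StockItem): void {
  inventory[item.sku] = { ...item };
}

// T1 and T2: both servers read before either one writes.
const seenByA = readItem("TSHIRT-BLK-L");
const seenByB = readItem("TSHIRT-BLK-L");

// T3: Server A writes 99.
writeItem({ ...seenByA, stock_count: seenByA.stock_count - 1 });
// T4: Server B writes 99 again, computed from its stale snapshot.
writeItem({ ...seenByB, stock_count: seenByB.stock_count - 1 });

const finalStock = readItem("TSHIRT-BLK-L").stock_count;
console.log(finalStock); // 99, although two units were sold
```

Nothing in this sequence raises an error, which is what makes the lost update so hard to spot in production.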

    The Optimistic Concurrency Control (OCC) Pattern

    Instead of preventing concurrent access through pessimistic locking (i.e., assuming a conflict will happen and locking the data), Optimistic Concurrency Control (OCC) operates on a different principle: assume conflicts are rare, but verify that no conflict occurred before committing the write.

    This is a perfect fit for DynamoDB's architecture. It avoids the overhead and complexity of managing distributed locks, allowing for maximum throughput while still guaranteeing data correctness. The implementation relies on two key components:

  • A Version Attribute: Every item in your DynamoDB table includes a numeric attribute that acts as a version counter (e.g., version, revision, etag). This number is atomically incremented with every successful write.
  • A Conditional Write: When you perform an update, you tell DynamoDB: "Only apply this update if the version number of the item on the server is still the same as the one I originally read."

    If another process has modified the item in the interim, its version number will have changed, your conditional write will fail, and DynamoDB will inform you of the conflict. This failure is not an error to be logged and ignored; it is a critical signal to your application that it needs to handle the contention.

    Our item schema is now updated:

    json
    {
      "sku": "TSHIRT-BLK-L",
      "description": "Large Black T-Shirt",
      "stock_count": 100,
      "price": 19.99,
      "version": 1
    }

    Deep Dive into DynamoDB Conditional Writes

    DynamoDB's PutItem, UpdateItem, and DeleteItem API calls all accept a ConditionExpression parameter. This is a string that contains a boolean expression that must evaluate to true for the operation to succeed. If it evaluates to false, DynamoDB rejects the write and returns a ConditionalCheckFailedException.
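
The parameter has the same shape on all three calls. As an illustration, here is a sketch of a version-guarded delete request. `buildConditionalDelete` is a hypothetical helper, and the object it returns uses the low-level attribute-value format that `DeleteItemCommand` accepts as input, so no SDK import is needed just to build it.

```typescript
// Low-level DynamoDB attribute values: strings use S, numbers use N (sent as strings).
type AttrValue = { S: string } | { N: string };

interface ConditionalDeleteInput {
  TableName: string;
  Key: Record<string, AttrValue>;
  ConditionExpression: string;
  ExpressionAttributeValues: Record<string, AttrValue>;
}

// Hypothetical helper: delete the item only if its version is still the one we read.
function buildConditionalDelete(
  tableName: string,
  sku: string,
  expectedVersion: number
): ConditionalDeleteInput {
  return {
    TableName: tableName,
    Key: { sku: { S: sku } },
    ConditionExpression: "version = :expectedVersion",
    ExpressionAttributeValues: {
      ":expectedVersion": { N: String(expectedVersion) },
    },
  };
}

const deleteInput = buildConditionalDelete("Inventory", "TSHIRT-BLK-L", 2);
console.log(JSON.stringify(deleteInput, null, 2));
```

If the stored version is no longer 2 when the request arrives, the delete is rejected in the same way as a failed conditional update.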

    To implement OCC, our ConditionExpression will check the version attribute.

    Let's walk through the corrected flow from our previous example:

  • T1: Server A reads the item. It receives stock_count: 100 and version: 1.
  • T2: Server B reads the item. It also receives stock_count: 100 and version: 1.
  • T3: Server A calculates its update. It will attempt to set stock_count to 99 and increment version to 2. Its UpdateItem call includes a crucial condition: ConditionExpression: "version = :expectedVersion", where :expectedVersion is 1.
  • T4: The update from Server A succeeds because the condition is met. The item in DynamoDB is now { stock_count: 99, version: 2 }.
  • T5: Server B now calculates its update. It will also attempt to set stock_count to 99 and version to 2. Its UpdateItem call also includes the condition: ConditionExpression: "version = :expectedVersion", where :expectedVersion is 1.
  • T6: DynamoDB evaluates Server B's condition. It checks the item's current version, which is 2. Since 2 != 1, the condition fails. DynamoDB rejects the update and returns a ConditionalCheckFailedException to Server B.

    Server B is now aware of the conflict. The lost update has been prevented. The responsibility now shifts to Server B's application logic to handle this exception gracefully.
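
The same timeline can be traced in a small in-memory model. This is only a sketch of the version check itself, not DynamoDB code: `storedItem` and `conditionalUpdate` are hypothetical, and a `false` return value stands in for ConditionalCheckFailedException.

```typescript
// In-memory stand-in for one DynamoDB item; not a real table.
type VersionedStock = { stock_count: number; version: number };

let storedItem: VersionedStock = { stock_count: 100, version: 1 };

// Applies the write only if the stored version still matches the expected one.
// Returning false stands in for a ConditionalCheckFailedException.
function conditionalUpdate(newStock: number, expectedVersion: number): boolean {
  if (storedItem.version !== expectedVersion) return false;
  storedItem = { stock_count: newStock, version: expectedVersion + 1 };
  return true;
}

// T1, T2: both servers read version 1.
const snapshotA = { ...storedItem };
const snapshotB = { ...storedItem };

// T3, T4: Server A's write is accepted; the item becomes { stock_count: 99, version: 2 }.
const serverAAccepted = conditionalUpdate(snapshotA.stock_count - 1, snapshotA.version);

// T5, T6: Server B still expects version 1, so its write is rejected.
const serverBAccepted = conditionalUpdate(snapshotB.stock_count - 1, snapshotB.version);

// Server B re-reads and tries again from fresh data.
const freshSnapshot = { ...storedItem };
const serverBRetryAccepted = conditionalUpdate(freshSnapshot.stock_count - 1, freshSnapshot.version);

console.log(serverAAccepted, serverBAccepted, serverBRetryAccepted, storedItem);
```

After the retry, both sales are reflected in the stored item: the stock has dropped by two and the version has advanced twice.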

    Production-Grade Implementation Pattern

    Let's implement this pattern in TypeScript using the AWS SDK for JavaScript v3. We'll create a function to decrement stock for a given SKU.

    typescript
    import { 
        DynamoDBClient, 
        GetItemCommand, 
        UpdateItemCommand, 
        ConditionalCheckFailedException 
    } from "@aws-sdk/client-dynamodb";
    import { marshall, unmarshall } from "@aws-sdk/util-dynamodb";
    
    const dynamoDBClient = new DynamoDBClient({});
    const TABLE_NAME = process.env.TABLE_NAME || 'Inventory';
    
    interface Product {
        sku: string;
        stock_count: number;
        version: number;
        // ... other attributes
    }
    
    async function decrementStock(sku: string, quantity: number): Promise<void> {
        // 1. READ the current state of the item
        const getItemResponse = await dynamoDBClient.send(new GetItemCommand({
            TableName: TABLE_NAME,
            Key: marshall({ sku }),
            // Use consistent reads for time-sensitive operations
            ConsistentRead: true, 
        }));
    
        if (!getItemResponse.Item) {
            throw new Error(`Product with SKU ${sku} not found.`);
        }
    
        const currentProduct = unmarshall(getItemResponse.Item) as Product;
    
        // 2. MODIFY the state in memory
        if (currentProduct.stock_count < quantity) {
            throw new Error(`Insufficient stock for SKU ${sku}.`);
        }
    
        const newStockCount = currentProduct.stock_count - quantity;
        const expectedVersion = currentProduct.version;
        const newVersion = expectedVersion + 1;
    
        // 3. Perform the CONDITIONAL WRITE
        try {
            const updateParams = {
                TableName: TABLE_NAME,
                Key: marshall({ sku }),
                UpdateExpression: "SET stock_count = :newStock, version = :newVersion",
                ConditionExpression: "version = :expectedVersion",
                ExpressionAttributeValues: marshall({
                    ":newStock": newStockCount,
                    ":newVersion": newVersion,
                    ":expectedVersion": expectedVersion
                }),
                ReturnValues: "NONE",
            };
    
            await dynamoDBClient.send(new UpdateItemCommand(updateParams));
            console.log(`Successfully decremented stock for ${sku}. New stock: ${newStockCount}`);
    
        } catch (error) {
            if (error instanceof ConditionalCheckFailedException) {
                // The race condition was detected!
                console.warn(`Conflict detected for SKU ${sku}. Another process updated the item.`);
                // In a real application, you would now trigger a retry.
                // For this example, we'll just re-throw to indicate failure.
                throw new Error(`Conflict detected while updating SKU ${sku}. Please retry.`);
            } else {
                // Handle other potential errors (e.g., throttling, network issues)
                console.error("An unexpected error occurred:", error);
                throw error;
            }
        }
    }

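    This code correctly implements the core Read-Modify-ConditionalWrite cycle. A key detail is ConsistentRead: true. For a sensitive workflow like this, you want the read to return the latest committed value, which DynamoDB serves from the partition's leader replica. A strongly consistent read consumes twice the read capacity of an eventually consistent one and can add some latency, but it minimizes the chance of starting your logic with stale data. Note that the read only reduces wasted attempts; it is the conditional write that guarantees correctness, because the item can still change between the read and the write.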

    Advanced Error Handling & Retry Logic

    Simply catching the ConditionalCheckFailedException is not enough. A production system must react to it intelligently. The correct response is to retry the entire operation, starting from the read.

    A naive retry loop is dangerous. If contention is high, multiple clients could enter a tight retry loop, hammering the database and creating a thundering herd problem. The standard solution is to implement a retry mechanism with exponential backoff and jitter.

  • Exponential Backoff: Increase the delay between retries exponentially (e.g., 50ms, 100ms, 200ms, 400ms...). This gives the competing processes time to complete their work.
  • Jitter: Add a random amount of time to each backoff delay. This prevents multiple competing clients from retrying in lockstep, spreading out the load on the database.

    Let's create a higher-order function to encapsulate this complex retry logic, making our business logic cleaner.

    typescript
    // (imports from previous example)
    
    // Utility for async sleep
    const sleep = (ms: number) => new Promise(res => setTimeout(res, ms));
    
    // Configuration for retry logic
    const RETRY_CONFIG = {
        maxRetries: 5,
        baseDelayMs: 50,
        maxDelayMs: 1000,
    };
    
    /**
     * A higher-order function that wraps a read-modify-write operation
     * with robust OCC retry logic.
     */
    async function withOccRetry<T>(
        operation: () => Promise<T>
    ): Promise<T> {
        let attempts = 0;
        while (true) {
            try {
                return await operation();
            } catch (error) {
                if (error instanceof ConditionalCheckFailedException) {
                    attempts++;
                    if (attempts > RETRY_CONFIG.maxRetries) {
                        console.error(`Operation failed after ${RETRY_CONFIG.maxRetries} attempts due to persistent conflicts.`);
                        throw new Error('Failed to update item due to high contention.');
                    }
    
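                    // Exponential backoff with jitter.
                    // attempts is 1 on the first retry, so the ceiling starts at
                    // baseDelayMs (50ms, 100ms, 200ms, ...) and is capped at maxDelayMs.
                    const backoff = Math.min(
                        RETRY_CONFIG.maxDelayMs,
                        RETRY_CONFIG.baseDelayMs * Math.pow(2, attempts - 1)
                    );
                    // Randomize only half of the ceiling so the delay stays
                    // within [backoff / 2, backoff) and never exceeds maxDelayMs.
                    const jitter = (backoff / 2) * Math.random();
                    const delay = backoff / 2 + jitter;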
    
                    console.warn(`Conflict detected. Retrying attempt ${attempts} in ${Math.round(delay)}ms...`);
                    await sleep(delay);
    
                    // The loop will continue, re-invoking the operation
                } else {
                    // Re-throw non-conflict errors immediately
                    throw error;
                }
            }
        }
    }
    
    // Refactored business logic using the retry wrapper
    async function robustDecrementStock(sku: string, quantity: number): Promise<void> {
        const operation = async () => {
            // The entire read-modify-write cycle is now inside the operation lambda
            const getItemResponse = await dynamoDBClient.send(new GetItemCommand({
                TableName: TABLE_NAME,
                Key: marshall({ sku }),
                ConsistentRead: true,
            }));
    
            if (!getItemResponse.Item) throw new Error(`Product with SKU ${sku} not found.`);
            
            const currentProduct = unmarshall(getItemResponse.Item) as Product;
    
            if (currentProduct.stock_count < quantity) {
                throw new Error(`Insufficient stock for SKU ${sku}.`);
            }
    
            const newStockCount = currentProduct.stock_count - quantity;
            const expectedVersion = currentProduct.version;
    
            const updateParams = {
                TableName: TABLE_NAME,
                Key: marshall({ sku }),
                // We can combine the version increment and check in one expression
                UpdateExpression: "SET stock_count = :newStock, version = version + :inc",
                ConditionExpression: "version = :expectedVersion",
                ExpressionAttributeValues: marshall({
                    ":newStock": newStockCount,
                    ":inc": 1,
                    ":expectedVersion": expectedVersion
                }),
            };
            await dynamoDBClient.send(new UpdateItemCommand(updateParams));
        };
    
        await withOccRetry(operation);
        console.log(`Successfully decremented stock for ${sku}.`);
    }
    
    // Example usage:
    // await robustDecrementStock('TSHIRT-BLK-L', 1);

    This refactored version is far more robust. The business logic (robustDecrementStock) is clean and focuses on the what, while the withOccRetry wrapper handles the complex how of contention management. Notice the UpdateExpression: SET ..., version = version + :inc. This has DynamoDB compute the new version on the server, so the application no longer calculates and sends it. Because the ConditionExpression already pins the version to the value we read, the stored result is the same as setting it explicitly; the benefit is simpler application code rather than a measurable performance gain.
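
The delay calculation is easier to reason about when pulled out on its own. The sketch below shows one common variant, often called "equal jitter", in which half of the capped delay is fixed and the other half is random. `backoffDelayMs` is a hypothetical helper, its constants are assumed to match RETRY_CONFIG, and the random source is injectable so the schedule can be checked deterministically.

```typescript
// Assumed to mirror RETRY_CONFIG.baseDelayMs and RETRY_CONFIG.maxDelayMs.
const BASE_DELAY_MS = 50;
const MAX_DELAY_MS = 1000;

// attempt is 1-based. random defaults to Math.random but can be replaced in tests.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  // Exponential ceiling: 50, 100, 200, ... capped at MAX_DELAY_MS.
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt - 1));
  // Half the ceiling is fixed, the other half is scaled by the random source.
  return ceiling / 2 + (ceiling / 2) * random();
}

// Upper bound of each delay (random source pinned to 1).
const upperBounds = [1, 2, 3, 4, 5, 6].map((attempt) => backoffDelayMs(attempt, () => 1));
// Lower bound of each delay (random source pinned to 0).
const lowerBounds = [1, 2, 3, 4, 5, 6].map((attempt) => backoffDelayMs(attempt, () => 0));

console.log(upperBounds); // [50, 100, 200, 400, 800, 1000]
console.log(lowerBounds); // [25, 50, 100, 200, 400, 500]
```

Keeping a fixed floor guarantees some minimum spacing between retries, while the random half stops clients that collided once from colliding again on the same schedule.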

    Edge Cases and Performance Considerations

    Senior engineers must consider the boundaries where a pattern excels and where it breaks down.

    High Contention Scenarios

    OCC is based on the premise that conflicts are the exception, not the rule. If an item is extremely "hot" and dozens of processes are trying to update it every second, the ConditionalCheckFailedException rate will skyrocket. This has two negative effects:

  • Cost: Every failed conditional write still consumes write capacity for the attempt (a minimum of 1 Write Capacity Unit, and more for larger items). High contention leads to wasted WCUs and increased costs.
  • Latency: The application's perceived latency will increase as it spends more time in retry loops. If maxRetries is exceeded, the operation fails entirely.

    Solution: If you identify a hot item, it's a sign your data model may need refinement. Consider breaking the data apart. For example, instead of a single stock_count, you could shard the counter across multiple items (e.g., inventory_shard_1, inventory_shard_2) and have workers pull from different shards. This distributes the write load, but adds significant complexity to reads (which now must aggregate across shards).
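
To make the trade-off concrete, here is a minimal in-memory sketch of a sharded counter. It models only the bookkeeping: in a real table each array slot would be its own item updated with its own conditional write, and the shard count, the even initial split, and the helper names are all assumptions for illustration.

```typescript
// Splits a total across shardCount counters as evenly as possible.
function createShards(total: number, shardCount: number): number[] {
  const shards: number[] = [];
  const base = Math.floor(total / shardCount);
  for (let i = 0; i < shardCount; i++) {
    // The first (total % shardCount) shards absorb the remainder.
    shards.push(base + (i < total % shardCount ? 1 : 0));
  }
  return shards;
}

// Tries shards starting at a preferred index and falls through to the next
// one when a shard is empty. Returns the shard used, or -1 if all are empty.
function takeOne(shards: number[], preferred: number): number {
  for (let offset = 0; offset < shards.length; offset++) {
    const index = (preferred + offset) % shards.length;
    if (shards[index] > 0) {
      shards[index] -= 1;
      return index;
    }
  }
  return -1;
}

// Reads must aggregate across every shard.
function totalStock(shards: number[]): number {
  return shards.reduce((sum, n) => sum + n, 0);
}

const stockShards = createShards(100, 4); // [25, 25, 25, 25]
const firstShardUsed = takeOne(stockShards, 0);
const secondShardUsed = takeOne(stockShards, 3);
console.log(firstShardUsed, secondShardUsed, totalStock(stockShards)); // 0 3 98
```

Two workers that prefer different shards never touch the same counter, so they cannot conflict with each other; the price is that every stock check now reads all shards.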

    First Write (Item Creation)

    How do you apply this pattern when an item doesn't exist yet? You need to ensure your write is a true creation, not an accidental overwrite of an item with the same key that was created by another process moments before. The ConditionExpression for this is attribute_not_exists(partitionKey).

    typescript
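    // PutItemCommand is needed in addition to the imports from the earlier example
    import { PutItemCommand } from "@aws-sdk/client-dynamodb";

    async function createProduct(product: Omit<Product, 'version'>): Promise<void> {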
        const initialVersion = 1;
        const itemToCreate = { ...product, version: initialVersion };
    
        const putParams = {
            TableName: TABLE_NAME,
            Item: marshall(itemToCreate),
            // This ensures we only succeed if the item does not already exist
            ConditionExpression: "attribute_not_exists(sku)",
        };
    
        try {
            await dynamoDBClient.send(new PutItemCommand(putParams));
        } catch (error) {
            if (error instanceof ConditionalCheckFailedException) {
                throw new Error(`Product with SKU ${product.sku} already exists.`);
            }
            throw error;
        }
    }

    Comparison with DynamoDB Transactions

    DynamoDB offers TransactWriteItems, which allows for atomic updates across multiple items in one or more tables. You can include condition checks within a transaction. When should you use OCC versus a transaction?

  • Use OCC (this pattern): When you are modifying a single item. It is significantly cheaper (consumes half the WCUs of a transactional write), faster, and simpler.
  • Use Transactions: When you need to update multiple items atomically. The classic example is a financial transfer: you must debit one account and credit another. If either operation fails, both must be rolled back. This is impossible to guarantee with single-item OCC. Transactions provide the all-or-nothing guarantee but at double the write cost and with more limitations.

    For our inventory example, since we only ever modify one product item at a time, OCC is the superior choice.
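
For contrast, this is roughly what the request for the transfer example looks like. It is a sketch under assumptions: the Accounts table, the accountId key, and the balance attribute are invented for illustration, and `buildTransfer` is a hypothetical helper that only builds the input object in the shape `TransactWriteItemsCommand` accepts.

```typescript
// Low-level DynamoDB attribute values: strings use S, numbers use N (sent as strings).
type TxAttrValue = { S: string } | { N: string };

interface TransactUpdate {
  Update: {
    TableName: string;
    Key: Record<string, TxAttrValue>;
    UpdateExpression: string;
    ConditionExpression?: string;
    ExpressionAttributeValues: Record<string, TxAttrValue>;
  };
}

// Hypothetical builder: debit one account and credit another in a single transaction.
function buildTransfer(
  fromId: string,
  toId: string,
  amount: number
): { TransactItems: TransactUpdate[] } {
  const amountValue: TxAttrValue = { N: String(amount) };
  return {
    TransactItems: [
      {
        Update: {
          TableName: "Accounts",
          Key: { accountId: { S: fromId } },
          UpdateExpression: "SET balance = balance - :amount",
          // If the balance cannot cover the debit, the whole transaction is cancelled.
          ConditionExpression: "balance >= :amount",
          ExpressionAttributeValues: { ":amount": amountValue },
        },
      },
      {
        Update: {
          TableName: "Accounts",
          Key: { accountId: { S: toId } },
          UpdateExpression: "SET balance = balance + :amount",
          ExpressionAttributeValues: { ":amount": amountValue },
        },
      },
    ],
  };
}

const transferInput = buildTransfer("acct-1", "acct-2", 250);
console.log(JSON.stringify(transferInput, null, 2));
```

Both updates are applied together or not at all, which is the guarantee a single-item conditional write cannot provide.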

    A Complex Real-World Scenario: A Bidding System

    Let's model a real-time auction system where high-frequency bidding is expected.

    Item Structure:

    json
    {
      "itemId": "ART-VANGOGH-1889",
      "auctionStatus": "OPEN",
      "highestBid": 150000.00,
      "highestBidder": "user-123",
      "bidCount": 42,
      "version": 17
    }

    The placeBid function must atomically verify the new bid is higher than the current highestBid and then update the item. This is a perfect use case for a conditional update incorporating both the version check and the business logic.

    typescript
    interface Bid {
        itemId: string;
        bidAmount: number;
        bidderId: string;
    }
    
    async function placeBid(bid: Bid): Promise<void> {
        const operation = async () => {
            // 1. READ
            const getItemResponse = await dynamoDBClient.send(new GetItemCommand({
                TableName: 'Auctions',
                Key: marshall({ itemId: bid.itemId }),
                ConsistentRead: true,
            }));
    
            if (!getItemResponse.Item) throw new Error('Auction not found.');
            const auction = unmarshall(getItemResponse.Item);
    
            if (auction.auctionStatus !== 'OPEN') throw new Error('Auction is closed.');
    
            // The business logic check is now part of the condition expression,
            // but we can still pre-check to fail fast and avoid a wasted WCU.
            if (bid.bidAmount <= auction.highestBid) {
                 throw new Error('Bid must be higher than the current highest bid.');
            }
    
            const expectedVersion = auction.version;
    
            // 2. CONDITIONAL WRITE
            // The condition now checks BOTH version and business logic.
            const updateParams = {
                TableName: 'Auctions',
                Key: marshall({ itemId: bid.itemId }),
                UpdateExpression: "SET highestBid = :bidAmount, highestBidder = :bidderId, bidCount = bidCount + :inc, version = version + :inc",
                ConditionExpression: "version = :expectedVersion AND (highestBid < :bidAmount OR attribute_not_exists(highestBid))",
                ExpressionAttributeValues: marshall({
                    ":bidAmount": bid.bidAmount,
                    ":bidderId": bid.bidderId,
                    ":inc": 1,
                    ":expectedVersion": expectedVersion
                }),
            };
    
            await dynamoDBClient.send(new UpdateItemCommand(updateParams));
        };
    
        await withOccRetry(operation);
        console.log(`Successfully placed bid of ${bid.bidAmount} for item ${bid.itemId} by ${bid.bidderId}`);
    }

    In this advanced example, the ConditionExpression is doing double duty. It checks version = :expectedVersion to prevent the lost update problem, and it also enforces the core business rule highestBid < :bidAmount. Pushing the final business check to the database means it is evaluated atomically as part of the write. As long as every writer increments version, the version check alone already catches any change made between our read and write; the bid comparison is a second line of defense that still holds if some other code path updates highestBid without touching version. When the condition fails, withOccRetry re-runs the operation, and the fresh read either confirms the bid is still the highest or rejects it with a clear error.
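
It can help to spell the combined condition out as ordinary code. The function below is a plain TypeScript mirror of the expression, written only to illustrate how it evaluates; `bidConditionHolds` and `AuctionSnapshot` are hypothetical names, and DynamoDB performs the real evaluation on the server.

```typescript
interface AuctionSnapshot {
  version: number;
  highestBid?: number; // absent until the first bid arrives
}

// Mirrors: version = :expectedVersion
//          AND (highestBid < :bidAmount OR attribute_not_exists(highestBid))
function bidConditionHolds(
  current: AuctionSnapshot,
  expectedVersion: number,
  bidAmount: number
): boolean {
  const versionMatches = current.version === expectedVersion;
  const outbids = current.highestBid === undefined || current.highestBid < bidAmount;
  return versionMatches && outbids;
}

const currentAuction: AuctionSnapshot = { version: 17, highestBid: 150000 };

console.log(bidConditionHolds(currentAuction, 17, 151000)); // true: fresh version, higher bid
console.log(bidConditionHolds(currentAuction, 16, 151000)); // false: the item changed since the read
console.log(bidConditionHolds(currentAuction, 17, 150000)); // false: an equal bid does not win
console.log(bidConditionHolds({ version: 1 }, 1, 100));     // true: first bid on the item
```

The second case is the one the retry wrapper handles: the bid may still be valid, but it has to be re-checked against the item's current state.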

    Conclusion

    Optimistic Concurrency Control with conditional writes and a versioning attribute is not an optional feature; it is a foundational pattern for building correct, scalable, and resilient applications on DynamoDB. By understanding the read-modify-conditional-write cycle, implementing robust retry logic with exponential backoff and jitter, and pushing business logic into condition expressions where possible, you can confidently handle concurrent operations without resorting to costly and complex locking mechanisms. Mastering this pattern is a significant step in moving from casual DynamoDB usage to designing truly professional, production-grade distributed systems.
