Lesson 3 of 3 17 min

Distributed Transactions Part 7: Case Study - The Global Fintech Ledger

Applying the Saga, Outbox, and Idempotency patterns to design a high-scale, 100% accurate financial ledger.

Reading Mode

Hide the curriculum rail and keep the lesson centered for focused reading.

Key Takeaways

  • Double-entry bookkeeping requires that every financial transaction consists of equal debits and credits, ensuring a zero-sum balance.
  • Fintech ledgers must be designed as append-only logs, where historical records are never mutated; corrections require explicit reversal entries.
  • To scale ledger throughput under high write-concurrency, shard data by account identifier and process cross-shard transfers using two-phase locking.

Designing a global financial ledger is one of the most demanding challenges in systems engineering. The core requirements of a ledger are simple but absolute: never lose money, never create money, and maintain a 100% auditable record of every balance change. If an e-commerce catalog fails, users see stale product descriptions; if a ledger fails, the business suffers direct financial loss, compliance breaches, and loss of customer trust.

In a high-throughput distributed system, maintaining ledger accuracy is difficult due to network latency, partial execution failures, and concurrent write contentions. Standard relational database transactions (ACID) scale poorly across distributed microservices. Consequently, modern fintech platforms combine double-entry accounting models with distributed consistency patterns like Sagas, Transactional Outboxes, and Idempotent Request Ingestion.

This case study analyzes the architecture of a global, high-scale fintech ledger designed to process 5,000 transactions per second with absolute correctness and auditability.


System Requirements

An enterprise financial ledger requires strict boundaries to guarantee accounting integrity. We divide our system requirements into functional accounting goals, non-functional operational boundaries, and scale metrics.

Functional Requirements

  • Double-Entry Accounting Model: Record every financial movement as a transaction consisting of at least one debit (asset reduction or liability increase) and one credit (asset increase or liability reduction), ensuring the sum of debits and credits equals zero: $$\sum \text{Debits} + \sum \text{Credits} = 0$$
  • Append-Only Immutability: Enforce that ledger records are strictly append-only. Once written, a ledger entry must never be modified or deleted. Corrections must be written as explicit reversal transactions.
  • Idempotency Guarantee: Guarantee that a transfer request is executed exactly once, even if the client retries the request multiple times due to timeout errors.
  • Multi-Phase Transfers: Support two-phase money movements (such as authorizing/reserving funds first, and capturing/settling them later).
  • Asynchronous downstream Propagation: Publish balance change events atomically to downstream services (like notification gateways and analytics dashboards) using a Transactional Outbox.

Non-Functional Requirements

  • Strict Data Consistency: Ensure that an account balance can never drop below its configured credit limit (e.g., preventing overdrafts).
  • Sub-Second Transaction Latency: Complete ledger writes in less than 50 milliseconds to prevent checkout delays.
  • Continuous Auditability: Provide execution timelines that allow auditors to reconstruct the balance of any account at any point in time.
  • Horizontal Scalability: Scale write throughput to support high concurrent sales without blocking database connections.

Scale Assumptions

  • Peak Write Rate: 5,000 completed transfer transactions per second.
  • Total Ledger Accounts: 50,000,000 active accounts globally.
  • Data Retention: Indefinite storage of all historical ledger entries for regulatory compliance.

API Design and Interface Contracts

The ledger platform exposes REST APIs for transfer initiation and gRPC interfaces for high-speed queries.

1. Initiate Account Transfer (HTTP POST /v1/transfers)

Invoked by checkout or payout services to execute a double-entry transaction.

Request Headers:

Idempotency-Key: txn_idemp_77162_ac
Content-Type: application/json

Request Payload:

{
  "tenantId": "fin_tenant_usa_42",
  "description": "Order checkout payout to merchant",
  "postings": [
    {
      "accountId": "acc_cust_88192ab",
      "amount": -4900,
      "currency": "USD"
    },
    {
      "accountId": "acc_merchant_33190ac",
      "amount": 4900,
      "currency": "USD"
    }
  ]
}

Response Payload (201 Created):

{
  "transactionId": "tx_uuid_55162a88b",
  "status": "POSTED",
  "balanceCheckpoint": {
    "acc_cust_88192ab": 5100,
    "acc_merchant_33190ac": 14900
  },
  "createdAt": "2026-06-07T12:23:00Z"
}

2. Ledger gRPC Query Interface

High-speed internal microservices query account balances and transaction histories using gRPC.

syntax = "proto3";

package codesprintpro.ledger.v1;

service LedgerService {
  rpc GetAccountBalance (BalanceRequest) returns (BalanceResponse);
  rpc GetTransactionDetails (TransactionRequest) returns (TransactionResponse);
}

message BalanceRequest {
  string account_id = 1;
  int64 timestamp_ms = 2; -- Query balance at a specific historical point
}

message BalanceResponse {
  string account_id = 1;
  int64 balance_cents = 2;
  string currency = 3;
  int64 pending_holds_cents = 4;
}

message TransactionRequest {
  string transaction_id = 1;
}

message Posting {
  string account_id = 1;
  int64 amount_cents = 2;
}

message TransactionResponse {
  string transaction_id = 1;
  string description = 2;
  string status = 3; -- PENDING, POSTED, REVERSED
  repeated Posting postings = 4;
  int64 created_at_ms = 5;
}

High-Level Architecture

The ledger architecture decouples synchronous API ingestion from database persistence and downstream event propagation.

Double-Entry Ledger State Machine (Reserve / Post Sequence)

To prevent overdrafts and ensure consistency, we use a two-phase money movement flow (Hold and Capture) coordinated by a Saga orchestrator.

sequenceDiagram
    autonumber
    participant Client as Payout Service
    participant Saga as Saga Orchestrator
    participant Ledger as Ledger Service
    participant Locks as Hold Lock Manager
    participant DB as Postgres Ledger DB

    Client->>Saga: POST /v1/transfers (With Postings)
    Saga->>Saga: Create Saga Instance & set status=STARTED
    
    note over Saga, Locks: Phase 1: Reserve Funds
    Saga->>Locks: Acquire Hold Lock (acc_cust_88192ab, Amount = 4900)
    Locks->>DB: Check balance & Insert Hold record (PENDING)
    DB-->>Locks: Hold Acknowledged
    Locks-->>Saga: Hold Confirmed
    
    note over Saga, Ledger: Phase 2: Post Transfer
    Saga->>Ledger: Commit Transfer (tx_uuid_55162a88b)
    Ledger->>DB: Begin DB Transaction
    Ledger->>DB: Append Debit Entry (-4900)
    Ledger->>DB: Append Credit Entry (+4900)
    Ledger->>DB: Update Hold Lock status to RELEASED
    Ledger->>DB: Write Event to Transactional Outbox
    DB-->>Ledger: Commit DB Transaction
    Ledger-->>Saga: Transfer Committed
    
    Saga-->>Client: Return HTTP 201 Created (Success)

Ledger Outbox Transaction CDC Propagation Flow

To guarantee that downstream services are updated when a transaction commits, we write events to an Outbox table in the same database transaction. A Change Data Capture (CDC) engine tails the transaction log to publish events.

graph TD
    LedgerSvc[Ledger Service] -->|1. Write Posting Entries| DB[(Postgres Ledger DB)]
    LedgerSvc -->|2. Write Outbox Event| Outbox[(Outbox Table)]
    
    subgraph Database Transaction Boundary
        DB
        Outbox
    end
    
    CDC[Debezium / CDC Pipeline] -->|3. Read WAL Logs| Outbox
    CDC -->|4. Publish Event| Kafka[Kafka Event Broker]
    
    subgraph Downstream Consumers
        Kafka --> Notifications[Notification Service]
        Kafka --> Analytics[Analytics Dashboard]
        Kafka --> RecWorker[Reconciliation Worker]
    end
    
    RecWorker -->|5. Verify ledger vs bank statement| RecWorker

Low-Level Design and Schema

Fintech ledgers rely on strict relational structures to prevent rounding errors and guarantee that debits and credits align.

-- Represents a financial account balance record
CREATE TABLE ledger_accounts (
    account_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    customer_id UUID NOT NULL,
    currency VARCHAR(3) NOT NULL DEFAULT 'USD',
    balance_cents BIGINT NOT NULL DEFAULT 0, -- Store balances in cents to avoid floats
    account_status VARCHAR(32) NOT NULL DEFAULT 'ACTIVE', -- ACTIVE, FROZEN, CLOSED
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_accounts_customer ON ledger_accounts (customer_id);

-- Represents a transaction wrapper mapping a single business intent
CREATE TABLE ledger_transactions (
    transaction_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    description VARCHAR(256) NOT NULL,
    transaction_status VARCHAR(32) NOT NULL DEFAULT 'PENDING', -- PENDING, POSTED, REVERSED
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMPTZ
);

-- Represents double-entry posting lines (Immutable Log)
CREATE TABLE ledger_entries (
    entry_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    transaction_id UUID NOT NULL REFERENCES ledger_transactions(transaction_id) ON DELETE RESTRICT,
    account_id UUID NOT NULL REFERENCES ledger_accounts(account_id) ON DELETE RESTRICT,
    amount_cents BIGINT NOT NULL, -- Negative for debits, positive for credits
    entry_sequence BIGSERIAL NOT NULL, -- Monotonic sequencing key
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE UNIQUE INDEX uk_entries_sequence ON ledger_entries (entry_sequence);
CREATE INDEX idx_entries_lookup ON ledger_entries (account_id, created_at DESC);

-- Tracks active holds (reserved funds) before capture
CREATE TABLE ledger_holds (
    hold_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id UUID NOT NULL REFERENCES ledger_accounts(account_id),
    transaction_id UUID NOT NULL REFERENCES ledger_transactions(transaction_id),
    hold_amount_cents BIGINT NOT NULL,
    hold_status VARCHAR(32) NOT NULL DEFAULT 'ACTIVE', -- ACTIVE, RELEASED, EXPIRED
    expires_at TIMESTAMPTZ NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_holds_lookup ON ledger_holds (account_id, hold_status) WHERE hold_status = 'ACTIVE';
CREATE INDEX idx_holds_expiry ON ledger_holds (expires_at) WHERE hold_status = 'ACTIVE';

-- Idempotency verification ledger
CREATE TABLE idempotent_requests (
    request_key VARCHAR(256) PRIMARY KEY, -- Hash of tenant_id + client_idempotency_key
    response_status_code INT NOT NULL,
    response_body JSONB NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Schema Rationale & Index Optimization

  1. ledger_entries (Immutable Constraint): The ledger_entries table is configured with no update privileges at the database permission level. Once written, rows cannot be updated or deleted, guaranteeing data integrity.
  2. entry_sequence (Monotonic Sequence): An auto-incrementing bigserial key that provides a strict order for all entries. This is critical for audits and ledger reconciliation, allowing the system to verify that no records were deleted or inserted out of order.
  3. idx_holds_lookup: A partial index restricted to active holds. The Hold Lock Manager uses this index to verify an account's available balance (calculated as balance_cents - sum(hold_amount_cents)) before executing a transaction.

Scaling Challenges and Capacity Estimation

A global ledger processing 5,000 transactions per second must manage database write capacity, indexing overhead, and storage footprint growth.

1. Database Write and Storage Footprint Calculations

  • Assumptions:

    • Peak Transaction Rate ($T$) = $5,000$ completed transfers/second
    • Each transfer writes 1 transaction row and 2 entry rows (1 debit, 1 credit).
    • Average storage footprint per row = $150$ bytes
    • Indexing and Transaction log overhead = 1.5x multiplier
  • Calculations: $$\text{Rows Written Per Second} = 5,000\text{ transfers/s} \times (1 + 2) = 15,000\text{ rows/second}$$ $$\text{Storage Rate} = 15,000\text{ rows/s} \times 150\text{ bytes} = 2,250,000\text{ bytes/second} = 2.25\text{ MB/second}$$ $$\text{Daily Storage Growth} = 2.25\text{ MB/s} \times 86,400\text{ seconds} \approx 194.4\text{ GB/day}$$ $$\text{Yearly Storage Footprint} = 194.4\text{ GB/day} \times 365\text{ days} \approx 71\text{ TB/year}$$

A single relational database instance cannot support $15,000$ row writes per second continuously over a 71 TB dataset.

To scale write throughput, we shard the database across 16 database shards using a consistent hash of the account_id. Cross-shard transactions (transfers between accounts on different shards) are coordinated using a two-phase commit protocol or a distributed Saga orchestrator.

2. Transactional Outbox CDC Ingress Bandwidth

  • Assumptions:

    • Transfer Rate = $5,000$ transactions/second
    • Average JSON Outbox event payload = $1$ KB
  • Calculations: $$\text{Ingress Volume} = 5,000\text{ events/s} \times 1\text{ KB} = 5\text{ MB/second} \approx 40\text{ Mbps}$$

Stateless CDC connectors (like Debezium) poll the database Write-Ahead Log (WAL) and publish these events to Kafka. A dedicated Kafka cluster with 3 broker nodes and 24 partitions handles this traffic, providing low-latency event propagation to downstream consumers.


Failure Scenarios and Resilience

Fintech ledgers must operate reliably across partial network failures and database replicas.

1. Payment Rail Timeout after Debit Hold

The Saga orchestrator reserves funds on a user's account and calls a third-party payment gateway (e.g., Visa Network) to settle the transaction, but the call times out.

  • The Threat: The user's funds remain locked in a PENDING state, blocking their credit line and preventing further purchases.
  • Resilience Design:
    • We assign a strict expiration lease on all holds: expires_at: NOW() + INTERVAL '10 minutes'.
    • A background cleanup worker scans the idx_holds_expiry index every minute.
    • If a hold passes its expiration time without receiving a capture command, the worker releases the hold and appends a HOLD_EXPIRED transaction record to the ledger, restoring the user's available balance.
  1. Database Replica Lag Discrepancies (Read-After-Write Consistency) A user completes a transfer on the primary database, and the client application immediately redirects them to their account dashboard, which reads from a read replica.
  • The Threat: If the read replica experiences 800ms of replication lag, the user's dashboard displays their old balance. This triggers confusion, causing the user to click submit again or contact support.
  • Resilience Design:
    • We use Session Consistency (Read-After-Write Consistency).
    • When the client executes a write, the API returns a database sequence checkpoint token (e.g., entry_sequence value) in the response.
    • The client includes this token in subsequent GET requests.
    • The API router checks the replica's replication status. If the replica has not caught up to the sequence token, the request is routed to the primary database or delayed by a brief polling sleep until the replica catches up, ensuring the user always sees their latest transaction state.

3. Double-Spending Concurrent Transactions

A customer attempts to send two concurrent transfer requests (e.g., 50 USD each) at the exact same millisecond when their account balance is only 60 USD.

  • The Threat: If both threads read the balance simultaneously, both see 60 USD as available, authorize the transfers, and deduct the funds. This results in a negative balance of -40 USD, violating the system's overdraft invariant.
  • Resilience Design:
    • We use Advisory Locks and Available Balance Verifications.
    • Before verifying a balance, the database transaction acquires an advisory lock on the target account ID: SELECT pg_advisory_xact_lock(hash_to_int(account_id)).
    • This serializes updates to that account. The second transaction blocks until the first commits, ensuring the second transaction reads the updated balance (10 USD) and rejects the overdraft attempt.

4. Downstream CDC Outbox Processing Outages

The outbox processor crashes, stopping event propagation to Kafka.

  • The Threat: Customers do not receive email receipts or payment notifications, even though their payments succeeded.
  • Resilience Design:
    • We use At-Least-Once Delivery Guarantees.
    • The outbox records are written to the database transactionally. If the processor crashes, the records remain in the database.
    • When the processor recovers, it reads the WAL logs starting from the last confirmed sequence token, resuming propagation without losing events. The consuming services must deduplicate events using the transaction_id header.

Architectural Trade-offs

Choosing the locking model and ledger storage pattern requires balancing throughput against transactional safety.

Trade-off 1: Serializable Isolation Level vs. Application-Level Advisory Locks

Serializable isolation ensures database-level transaction ordering; advisory locks serialize writes at the application layer.

Feature / Metric Serializable Isolation (SQL) Advisory Locks (Application)
Write Throughput Low. High conflict rates trigger frequent transaction rollbacks. High. Requests queue sequentially without rolling back.
Deadlock Risk High. Complex multi-row updates can easily trigger database deadlocks. Low. Locks are acquired in a strict, alphabetical order.
Complexity Low. Enforced by database configurations. Medium. Requires lock management logic in the application.
Consistency Guarantee Maximum. Handled by the database engine. High. Relies on developers acquiring locks correctly on all write paths.

Trade-off 2: Relational Schema Ledger vs. Key-Value Append-Only Log

A relational ledger uses standard tables with constraints; a key-value append-only log uses a NoSQL document store or Kafka.

Feature / Metric Relational Schema Ledger (SQL) Key-Value Append-Only Log (NoSQL)
Storage Capacity Low. Limited by table indexing sizes and sharding limits. High. Scales horizontally across disk clusters.
Consistency Enforcement High. Database constraints prevent orphan entries. Low. Consistency must be verified in the application layer.
Query Flexibility High. Supports ad-hoc queries, joins, and audits. Low. Fetching historical states requires processing the log sequentially.

Staff Engineer Perspective

Operating a high-scale financial ledger requires implementing strict safety barriers at the database layer.


Verbal Script

Interviewer: "How do you guarantee that a ledger balance never drops below zero when processing concurrent transfers?"

Candidate: "We guarantee balance integrity by combining advisory locking at the database layer with a two-phase balance validation sequence.

If we simply executed a check and write without locks, two concurrent threads could both read an account's balance as 100 USD, approve two separate 80 USD transfers, and write them, leaving the balance at -60 USD.

To prevent this:

  • When a transfer request is received, the database transaction acquires an advisory lock on the sender's account ID: SELECT pg_advisory_xact_lock(hash_to_int(sender_id)).
  • Postgres advisory locks are transactional; they block other transactions on that account ID and are released automatically when the transaction commits or rolls back.
  • Once the lock is acquired, we calculate the account's available balance by querying the current balance and subtracting active holds from the ledger_holds table.
  • If the available balance is sufficient, we write the transaction and entry records. If it is insufficient, we abort the transaction and release the lock, ensuring the balance never drops below zero."

Interviewer: "How would you design the ledger to support cross-shard transfers when the sender and receiver accounts are on different database instances?"

Candidate: "For cross-shard transfers, we use a two-phase reservation and commitment pattern coordinated by a Saga workflow engine.

Sharding the database by a hash of the account_id distributes the load, but it means a transfer between two users can span two separate database instances.

To process this safely:

  • The Saga orchestrator initiates the transfer by sending a command to the sender's shard to reserve the funds. The sender's database validates the available balance, locks the funds by writing a record to ledger_holds with a status of ACTIVE, and returns success.
  • Once the hold is confirmed, the orchestrator sends a command to the receiver's shard to append the credit entry. This shard writes the credit posting and returns success.
  • If the credit succeeds, the orchestrator sends a confirmation command to the sender's shard. This shard releases the hold and writes the debit entry in a local transaction.
  • If the credit fails (e.g., the receiver's account is frozen), the orchestrator triggers a compensation step, releasing the hold on the sender's shard and restoring their balance.

By decoupling the transaction into local steps coordinated by a durable Saga state machine, we avoid blocking distributed database locks."


Interviewer: "Why is the Transactional Outbox pattern preferred over publishing events directly to Kafka within the application code?"

Candidate: "The Transactional Outbox pattern is preferred because it avoids the dual-write problem, ensuring consistency between our database state and our messaging queue.

If we attempted to write to the database and publish to Kafka in the application code, we would encounter consistency failures:

  • If the database write succeeds but the Kafka publish fails (due to network issues or broker timeout), the transfer is processed, but downstream services are never notified.
  • If we reverse the order and publish to Kafka first, and the database write fails, downstream services process a transaction that never occurred.

The Transactional Outbox pattern solves this by writing the event to an outbox table in the database within the same database transaction as the ledger entries.

Because both writes share a transaction boundary, they either both succeed or both fail.

A background Change Data Capture (CDC) engine (like Debezium) tails the database transaction log (WAL) and publishes the outbox events to Kafka.

This guarantees at-least-once delivery to Kafka, ensuring downstream systems eventually sync with the ledger state."


Want to track your progress?

Sign in to save your progress, track completed lessons, and pick up where you left off.