"Design a payment system" is a system design question where correctness beats cleverness. Nobody is impressed by raw throughput if a retry double-charges a customer or a crash leaves a dollar unaccounted for. The interviewer is really testing three ideas: idempotency (don't charge twice), a double-entry ledger (money is always accounted for), and exactly-once effects under failure. Get those right and the rest follows.
Here's the full walkthrough with a diagram, covering the auth/capture flow, idempotency keys, the ledger, PSP integration, retries, reconciliation, and security.
+----------+    POST /charge (idempotency-key)    +------------------+
|  Client  | -----------------------------------> |   Payment API    |
+----------+                                      | (dedupe on key)  |
                                                  +--------+---------+
                                                           |
                                             write PENDING v
                                                  +------------------+
                                                  | Payment service  |
                                                  +----+--------+----+
                                           1. authorize|        | 2. append entries
                                                       v        v
                            +----------------------+   |   +------------------+
                            |   Payment Service    |<--+   |   Double-entry   |
                            | Provider (PSP) / bank|       | LEDGER (append)  |
                            +----------+-----------+       +------------------+
                                       | settlement report          ^
                                       v                            | 3. compare
                            +----------------------+                |
                            |  Reconciliation job  |----------------+
                            |   (ledger vs PSP)    |
                            +----------------------+
1. Clarify the requirements
Functional requirements
- Charge a customer (authorize, then capture)
- Refund and partial refund
- Record every money movement in a ledger
- Integrate with one or more payment service providers (PSPs)
- Reconcile internal records against PSP settlement
Non-functional requirements
- Correctness first: no double charges, no lost money
- Strong consistency for balances (not eventual)
- Durable and fully auditable
- Available and resilient to PSP/network failures
Back-of-the-envelope scale: even a large processor handles "only" thousands of transactions per second — tiny next to a social feed. That's the point: this is a low-volume, high-stakes system. Optimize for correctness and auditability, not QPS.
2. API design
The charge endpoint takes an idempotency key supplied by the client. That single header is what makes the whole system safe to retry.
# Idempotency-Key is a client-generated unique ID
POST /v1/charges
Idempotency-Key: 6f1c...ab
{ "amount": 4999, "currency": "USD", "source": "tok_..." }
-> 201 { "id": "ch_1", "status": "authorized" }
# any retry with the SAME key returns this SAME response
POST /v1/charges/{id}/capture
POST /v1/charges/{id}/refund { "amount": 4999 }
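The example amount of 4999 reads naturally as integer minor units (cents), matching the $49.99 charge used later in this walkthrough. Assuming that convention, a minimal sketch of converting a decimal string to minor units without floating-point error could look like this. The `MINOR_UNITS` table and `to_minor` helper are hypothetical names for illustration, and the table covers only a few currencies.

```python
from decimal import Decimal

# Number of decimal places per currency (illustrative subset only).
MINOR_UNITS = {"USD": 2, "EUR": 2, "JPY": 0}


def to_minor(amount: str, currency: str) -> int:
    """Convert a decimal string such as "49.99" to integer minor units."""
    exponent = MINOR_UNITS[currency]
    scaled = Decimal(amount).scaleb(exponent)  # shift the decimal point
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has too many decimal places for {currency}")
    return int(scaled)
```

Keeping amounts as integers end to end avoids rounding drift when entries are summed later.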
3. The payment flow: authorize and capture
A charge is two steps, and separating them is a deliberate design choice:
- Authorization asks the card network to check and hold the funds — no money moves yet. This validates the card and reserves the amount.
- Capture actually settles the held funds, often later (e.g. when the order ships). An uncaptured authorization can simply be voided.
The service writes a PENDING record before calling the PSP, authorizes, and only then moves the record to AUTHORIZED. Writing intent first means that if the process crashes mid-call, a recovery job can query the PSP by the idempotency key and finish the job — nothing is lost in the gap.
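The write-intent-first flow can be sketched as a small state machine. This is a minimal illustration, not a production design: the `Status`, `Charge`, and `authorize` names are hypothetical, `psp_authorize` is an assumed callable standing in for the real PSP client, the FAILED state is an addition for declined authorizations, and refunds are left out for brevity.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    AUTHORIZED = "authorized"
    CAPTURED = "captured"
    VOIDED = "voided"
    FAILED = "failed"


# Allowed transitions; anything else is rejected.
TRANSITIONS = {
    Status.PENDING: {Status.AUTHORIZED, Status.FAILED},
    Status.AUTHORIZED: {Status.CAPTURED, Status.VOIDED},
    Status.CAPTURED: set(),
    Status.VOIDED: set(),
    Status.FAILED: set(),
}


class Charge:
    def __init__(self, charge_id, amount_minor, currency):
        self.id = charge_id
        self.amount_minor = amount_minor
        self.currency = currency
        self.status = Status.PENDING  # intent is recorded before any PSP call

    def transition(self, new_status):
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"illegal transition {self.status.name} -> {new_status.name}"
            )
        self.status = new_status


def authorize(charge, psp_authorize):
    """Call the PSP only after the PENDING record exists."""
    approved = psp_authorize(charge.id, charge.amount_minor, charge.currency)
    charge.transition(Status.AUTHORIZED if approved else Status.FAILED)
    return charge
```

Because a crash between creating the `Charge` and calling the PSP leaves a PENDING record behind, a recovery job has something concrete to look up and resolve.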
4. Idempotency keys
Networks time out and clients retry, so the same charge can hit your API more than once. The fix: the client generates a unique idempotency key per logical operation. On the first request the server does the work and stores the key with the result; any later request carrying the same key returns the stored result without re-executing.
Crucially, you propagate that key downstream too — PSPs like Stripe accept idempotency keys — so a retry to the provider also collapses to a single charge. This is the mechanism behind "don't charge the customer twice."
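A minimal in-memory sketch of the store-and-replay behavior described above. `IdempotencyStore` and `IdempotencyConflict` are hypothetical names. The request-fingerprint check, which rejects a reused key with a different body, is an extra safeguard assumed for this sketch rather than something stated above. A real service would back this with a durable store and a unique constraint on the key, and would also handle two requests with the same key arriving concurrently.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Same key reused with a different request body."""


class IdempotencyStore:
    def __init__(self):
        self._records = {}  # key -> (request fingerprint, stored response)

    @staticmethod
    def _fingerprint(payload):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def execute(self, key, payload, handler):
        fingerprint = self._fingerprint(payload)
        if key in self._records:
            stored_fingerprint, response = self._records[key]
            if stored_fingerprint != fingerprint:
                raise IdempotencyConflict(key)
            return response  # replay: the handler is not run again
        response = handler(payload)
        self._records[key] = (fingerprint, response)
        return response
```

The handler runs once per key; every later call with the same key and body gets the stored response back.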
5. The ledger and double-entry accounting
Never store money as a mutable balance column you overwrite. Use a double-entry ledger: each transaction is recorded as balanced debit and credit entries across accounts so the books always sum to zero. Entries are append-only and immutable — a correction is a new reversing entry, never an edit.
A $49.99 charge might append: credit the merchant's receivable, debit the customer's payment source. Because every movement is two balanced halves, a running sum that ever drifts from zero is an instant red flag. This is exactly the audit trail regulators and auditors require.
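To make the balanced-entries idea concrete, here is a minimal append-only ledger sketch. The `Entry` and `Ledger` names and the account labels are hypothetical, and representing debits as positive and credits as negative is a convention chosen for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # entries are immutable once created
class Entry:
    txn_id: str
    account: str
    amount_minor: int  # signed: debit > 0, credit < 0 (sketch convention)


class Ledger:
    def __init__(self):
        self._entries = []  # append-only; nothing is ever updated in place

    def post(self, txn_id, lines):
        """lines: iterable of (account, signed amount in minor units)."""
        lines = list(lines)
        if sum(amount for _, amount in lines) != 0:
            raise ValueError("unbalanced transaction")
        self._entries.extend(Entry(txn_id, acct, amt) for acct, amt in lines)

    def reverse(self, txn_id, reversal_id):
        """Correct a transaction by appending mirror entries."""
        original = [e for e in self._entries if e.txn_id == txn_id]
        self.post(reversal_id, [(e.account, -e.amount_minor) for e in original])

    def balance(self, account):
        return sum(e.amount_minor for e in self._entries if e.account == account)

    def total(self):
        # Should always be zero if every posted transaction balanced.
        return sum(e.amount_minor for e in self._entries)
```

Posting the $49.99 charge would be `ledger.post("ch_1", [("customer_source", 4999), ("merchant_receivable", -4999)])`. Balances are derived by summing entries rather than stored, so there is no mutable balance to overwrite.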
The one-liner interviewers want: an idempotent API in front of a double-entry, append-only ledger, with the ledger write and the PSP call reconciled after the fact. Say that and you've shown you understand payments, not just CRUD.
6. Exactly-once and consistency
You cannot get true exactly-once delivery in a distributed system — so you build exactly-once effects instead: at-least-once delivery plus idempotency. Every operation is keyed, so duplicate attempts are recognized and collapse to a single ledger effect. The customer is charged once even if the message is processed several times.
Balance-affecting writes need strong consistency, not eventual: the ledger entry and the charge status should commit together (a single transaction, or an outbox pattern that atomically records the event). Money is the one place where "it'll converge eventually" is not acceptable.
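One way to sketch "commit together" plus duplicate suppression is a single database transaction that records the event ID, the status change, and the ledger entries at once. The example below uses Python's built-in `sqlite3` with an in-memory database purely for illustration. The table layout, the `processed_events` dedupe table, and the `apply_capture` function are assumptions of this sketch, not a prescribed schema.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE charges (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE ledger_entries (
            charge_id TEXT NOT NULL,
            account TEXT NOT NULL,
            amount_minor INTEGER NOT NULL
        );
        CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
        """
    )
    return conn


def apply_capture(conn, event_id, charge_id, amount_minor):
    """Apply a capture event at most once; returns True if it took effect."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            # The primary key rejects an event ID that was already applied.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
            )
            conn.execute(
                "UPDATE charges SET status = 'captured' WHERE id = ?", (charge_id,)
            )
            conn.executemany(
                "INSERT INTO ledger_entries (charge_id, account, amount_minor) "
                "VALUES (?, ?, ?)",
                [
                    (charge_id, "customer_source", amount_minor),
                    (charge_id, "merchant_receivable", -amount_minor),
                ],
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: nothing is written twice
```

A message delivered twice hits the primary-key constraint on the second attempt, so the whole transaction rolls back and the ledger keeps a single effect.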
7. PSP integration, retries & failures
You rarely touch card networks directly; you integrate a PSP (Stripe, Adyen, Braintree). The tricky failure is the ambiguous timeout: you called the PSP and got no answer — did the charge go through or not?
- Retry with the same idempotency key so a duplicate is impossible; the PSP returns the original result if it already processed it.
- Reconcile on unknowns: if you still can't tell, mark the charge PENDING and let a background job query the PSP by key to resolve it. Never guess.
- Handle async webhooks: settlement and disputes arrive later via webhooks; verify their signatures and process them idempotently too.
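The ambiguous-timeout case can be simulated with a stand-in provider. `FakePSP` below is entirely hypothetical: it processes the charge but "loses" the first reply, which is exactly the situation where the caller cannot tell what happened. `charge_with_retry` is likewise an illustrative helper, not any real PSP client API.

```python
import uuid


class FakePSP:
    """Stand-in provider: processes the charge, but the first reply is lost."""

    def __init__(self):
        self.charges = {}  # idempotency key -> result
        self.lose_next_reply = True

    def charge(self, key, amount_minor):
        if key not in self.charges:  # the same key never creates a second charge
            self.charges[key] = {
                "psp_id": f"psp_{len(self.charges) + 1}",
                "amount": amount_minor,
            }
        if self.lose_next_reply:
            self.lose_next_reply = False
            raise TimeoutError("no response from PSP")
        return self.charges[key]


def charge_with_retry(psp, key, amount_minor, max_attempts=3):
    """Retry with the SAME key; return None if the outcome is still unknown."""
    for _ in range(max_attempts):
        try:
            return psp.charge(key, amount_minor)
        except TimeoutError:
            continue  # real code would back off between attempts
    return None  # caller leaves the charge PENDING for a background resolver


psp = FakePSP()
key = str(uuid.uuid4())  # generated once per logical operation, reused on retry
result = charge_with_retry(psp, key, 4999)
```

The first attempt times out after the provider has already recorded the charge; the retry with the same key returns that original result instead of creating a second one.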
8. Reconciliation
The real-time path can still drift, so a batch reconciliation job compares your internal ledger against the PSP and bank settlement reports, line by line. It catches missing, duplicated, or mismatched transactions and flags them for investigation. Reconciliation is the safety net that guarantees your recorded balances match the money that actually moved — every serious payment system has one.
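A line-by-line comparison can be sketched as a set difference over transaction IDs plus an amount check. The `reconcile` function and its `(transaction_id, amount_minor)` row shape are assumptions for illustration; real settlement reports carry more fields such as fees, currency, and settlement dates.

```python
def reconcile(ledger_rows, psp_rows):
    """Each row is (transaction_id, amount_minor). Returns discrepancies."""

    def index(rows):
        seen, dupes = {}, set()
        for txn_id, amount in rows:
            if txn_id in seen:
                dupes.add(txn_id)  # same ID appears more than once
            seen[txn_id] = amount
        return seen, dupes

    ledger, ledger_dupes = index(ledger_rows)
    psp, psp_dupes = index(psp_rows)
    return {
        "missing_in_ledger": sorted(psp.keys() - ledger.keys()),
        "missing_in_psp": sorted(ledger.keys() - psp.keys()),
        "amount_mismatch": sorted(
            t for t in ledger.keys() & psp.keys() if ledger[t] != psp[t]
        ),
        "duplicated": sorted(ledger_dupes | psp_dupes),
    }
```

Each non-empty bucket in the result would be flagged for investigation rather than corrected automatically.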
9. Security and PCI
Handling raw card numbers (PAN) drags you into heavy PCI DSS scope. The standard move is to avoid touching them: the client tokenizes the card directly with the PSP, and your system only ever stores an opaque token. Combine that with TLS everywhere, encryption at rest, least-privilege access to the ledger, and a full audit log of who did what.
Key trade-offs the interviewer probes
- Strong vs eventual consistency. Feeds tolerate eventual consistency; balances do not. Pay the latency cost of strong consistency on the money path.
- Sync vs async processing. Authorize synchronously so the customer gets an answer; do settlement, payouts, and emails asynchronously off a queue.
- Where idempotency lives. Deduping at the API edge is simple but you must also propagate the key to the PSP, or a retry still double-charges downstream.
- Build vs buy the ledger. A correct, audited ledger is hard; many teams lean on a battle-tested ledger service rather than rolling their own.
Framework reminder: every system design answer follows the same arc — requirements → estimates → API → high-level design → data model → scale → trade-offs. Keep the system design cheat sheet in mind and narrate which stage you're in.
FAQ
Why are idempotency keys essential in a payment system?
Networks fail and clients retry, so the same charge request can arrive more than once. The client attaches a unique idempotency key; the server stores it with the result on the first attempt and returns that same stored result for any retry with the same key. This is how you avoid charging a customer twice.
What is a double-entry ledger and why use it?
Every transaction is recorded as balanced debit and credit entries across accounts, so the books always sum to zero. It gives you a complete, immutable audit trail and makes it easy to detect discrepancies. Money is never mutated in place; you append new entries, which is what auditors and regulators require.
How do you achieve exactly-once payment processing?
True exactly-once delivery is impossible in a distributed system, so you build exactly-once effects instead: at-least-once delivery plus idempotency. Each operation is keyed so that duplicate attempts are recognized and collapse to a single ledger effect, giving the customer exactly one charge even if the message is processed multiple times.
What is the difference between authorization and capture?
Authorization checks and holds funds on the customer's card without moving money; capture actually settles the held funds later, often after the order ships. Splitting the two lets you reserve money at checkout and only collect it when you fulfill, and lets you void an uncaptured authorization cleanly.
What is reconciliation in payments?
Reconciliation is the batch process that compares your internal ledger against the settlement reports from the payment service provider and the bank. It catches missing, duplicated, or mismatched transactions so your recorded balances match the money that actually moved. It is the safety net behind the real-time path.