UPI integration — the spec quirks no one mentions

The basics

UPI (Unified Payments Interface) is a real-time payment rail built on IMPS. A user authenticates with a UPI PIN; the payment moves between bank accounts in seconds. NPCI runs the central switch.

Integration: your service is a TPAP (third-party application provider) connected to a sponsoring PSP (payment service provider) bank. The PSP bank handles the NPCI connectivity; you handle the user experience.

The quirks

Collect requests aren’t push notifications. A “collect” is your request to the user’s app for them to authorise a payment. The user has 30 seconds to act (configurable, max 5 minutes). If they don’t, the request expires and your app gets a failure notification, sometimes immediately, sometimes minutes later. That lag is part of the spec, not a bug.
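A rough way to model this locally is to keep "the deadline has passed" separate from "the failure has been confirmed". The sketch below is illustrative only: the 30-second default and 5-minute cap come from the paragraph above, while the function names and the three status strings are made up for the example.

```python
from datetime import datetime, timedelta

DEFAULT_EXPIRY = timedelta(seconds=30)  # default window mentioned above
MAX_EXPIRY = timedelta(minutes=5)       # ceiling mentioned above


def expiry_deadline(created_at: datetime, requested: timedelta = DEFAULT_EXPIRY) -> datetime:
    """Return when the collect request lapses, capping the window at MAX_EXPIRY."""
    if requested <= timedelta(0):
        raise ValueError("expiry window must be positive")
    return created_at + min(requested, MAX_EXPIRY)


def local_view(now: datetime, deadline: datetime, failure_notified: bool) -> str:
    """Hypothetical local status: the notification, not the clock, confirms expiry."""
    if failure_notified:
        return "EXPIRED"
    if now >= deadline:
        # The deadline has passed but the failure notification may lag by minutes.
        return "AWAITING_EXPIRY_NOTICE"
    return "AWAITING_USER"
```

Keeping the intermediate status means a late notification is an expected event rather than a surprise that needs special handling.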

Status codes don’t always match the actual outcome. An RC=ZA (deemed approved) means NPCI thinks the payment succeeded; the destination bank might still reject it later. Don’t treat ZA as final until you see the subsequent credit confirmation.
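One way to encode this is to give "deemed approved" its own non-final state and let only the credit result settle the payment. This is a minimal sketch under assumptions: the state names and the two handler functions are hypothetical, and only the ZA code is taken from the text above.

```python
from dataclasses import dataclass
from enum import Enum


class PaymentState(Enum):
    PENDING = "PENDING"
    DEEMED_APPROVED = "DEEMED_APPROVED"  # RC=ZA seen, credit not yet confirmed
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


@dataclass
class Payment:
    txn_id: str
    state: PaymentState = PaymentState.PENDING


def on_response_code(payment: Payment, rc: str) -> None:
    """Record a ZA response without treating it as final."""
    if rc == "ZA" and payment.state is PaymentState.PENDING:
        payment.state = PaymentState.DEEMED_APPROVED


def on_credit_result(payment: Payment, credited: bool) -> None:
    """Only the credit confirmation (or rejection) settles the payment."""
    if payment.state in (PaymentState.SUCCESS, PaymentState.FAILED):
        return  # already final; ignore late or repeated signals
    payment.state = PaymentState.SUCCESS if credited else PaymentState.FAILED


def is_final_success(payment: Payment) -> bool:
    """Gate for releasing goods or funds: DEEMED_APPROVED does not qualify."""
    return payment.state is PaymentState.SUCCESS
```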

Refunds aren’t reversals. A “refund” via UPI is a new credit transaction from your bank to the user. It uses a different VPA (your business one) and a separate RRN. You can’t refund a UPI debit by sending the same transaction reversed.
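In data-model terms, that suggests storing a refund as its own record that points back at the original debit rather than as a flag on it. The sketch below assumes amounts are held in paise as integers; the class and field names are hypothetical, and the over-refund guard is an extra safety check that the text above does not prescribe.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Refund:
    rrn: str           # the refund's own RRN, never the original's
    original_rrn: str  # link back to the debit being refunded
    payer_vpa: str     # the business VPA that sends the credit
    payee_vpa: str     # the user's VPA that receives it
    amount_paise: int


@dataclass
class Debit:
    rrn: str
    amount_paise: int
    refunds: List[Refund] = field(default_factory=list)

    def refundable_paise(self) -> int:
        """Amount still available to refund against this debit."""
        return self.amount_paise - sum(r.amount_paise for r in self.refunds)

    def add_refund(self, refund: Refund) -> None:
        if refund.original_rrn != self.rrn:
            raise ValueError("refund does not reference this debit")
        if refund.rrn == self.rrn:
            raise ValueError("a refund needs its own RRN")
        if not 0 < refund.amount_paise <= self.refundable_paise():
            raise ValueError("refund amount out of range")
        self.refunds.append(refund)
```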

The VPA (UPI ID) has rules. Some PSPs allow @upi, @paytm, @phonepe. Some don’t. The integration spec says alphanumeric plus dot and hyphen; in practice PSPs differ, and some reject underscores that others accept. Always validate against your PSP’s accepted character set.
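A small validator with the character set and handle list passed in as configuration keeps the PSP-specific rules out of the logic. This is a sketch, not a reference implementation: the default pattern mirrors the "alphanumeric, dot, hyphen" rule quoted above, and the default handle set is just the three examples from the text. Both are placeholders to replace with whatever your PSP documents.

```python
import re
from typing import FrozenSet, Pattern

# Placeholder defaults for illustration; substitute your PSP's documented rules.
DEFAULT_LOCAL_PART: Pattern[str] = re.compile(r"[A-Za-z0-9.\-]+")
DEFAULT_HANDLES: FrozenSet[str] = frozenset({"upi", "paytm", "phonepe"})


def is_valid_vpa(
    vpa: str,
    local_part: Pattern[str] = DEFAULT_LOCAL_PART,
    handles: FrozenSet[str] = DEFAULT_HANDLES,
) -> bool:
    """Return True if `vpa` has the form local@handle and passes the configured rules."""
    local, sep, handle = vpa.strip().partition("@")
    if not sep or not local or not handle:
        return False
    if local_part.fullmatch(local) is None:
        return False
    # A second "@" ends up inside `handle` and fails the allowlist check.
    return handle.lower() in handles
```

A PSP that does accept underscores only needs a different `local_part` pattern, not a code change.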

Per-day per-user limits vary by bank. ₹1 lakh is the spec ceiling; some banks default to ₹25K for new customers. Your app has to handle “amount exceeds daily limit” gracefully — show the user what their bank’s limit is if you can.

The webhook surface

Your service receives webhooks for:

TXN_INITIATED — request created on NPCI’s side
TXN_SUCCESS — payment completed
TXN_FAILED — payment failed (with reason code)
TXN_PENDING — NPCI doesn’t know yet (state could change)
REFUND_* — same set for refunds

The webhook delivery is at-least-once. Idempotent handlers are mandatory; the same notification can arrive 3-5 times for one transaction.
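One way to get idempotency is to make the state transition itself safe to repeat, so a redelivered or out-of-order event changes nothing. The sketch below is an assumption-laden illustration: the event names come from the list above, but the ranking, the return strings, and the plain dict standing in for a database row are all hypothetical.

```python
from typing import Dict

# Higher rank = further along; both terminal outcomes share the top rank.
_RANK = {"TXN_INITIATED": 0, "TXN_PENDING": 1, "TXN_SUCCESS": 2, "TXN_FAILED": 2}
_TERMINAL = 2


def apply_event(states: Dict[str, str], txn_id: str, event: str) -> str:
    """Apply a webhook event; return 'applied', 'duplicate', 'stale' or 'conflict'.

    Callers should run side effects (ledger writes, emails) only on 'applied'.
    """
    if event not in _RANK:
        raise ValueError(f"unknown event type: {event}")
    current = states.get(txn_id)
    if current == event:
        return "duplicate"  # same notification delivered again
    if current is not None:
        if _RANK[current] == _TERMINAL:
            # A different terminal outcome needs a human; anything else is old news.
            return "conflict" if _RANK[event] == _TERMINAL else "stale"
        if _RANK[event] < _RANK[current]:
            return "stale"  # e.g. INITIATED arriving after PENDING
    states[txn_id] = event
    return "applied"
```

Returning "conflict" instead of overwriting leaves a SUCCESS versus FAILED disagreement visible for review.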

What broke first

For a P2P lending platform integration:

Treating PENDING as final. The first implementation marked a loan as disbursed on PENDING. NPCI later moved it to FAILED. The borrower never received funds, yet the lender’s books showed the loan as outstanding. This caused three operations bugs in week 1.
Webhook deduplication via an in-memory map. The map cleared on restart, so duplicates after a deploy created double debits. The fix: move dedup to Redis with a 24h TTL.
No reconciliation against the daily NPCI settlement file. NPCI publishes a daily file of all transactions for your PSP. Reconciling our local view against this file caught 4 missed transactions in the first month.
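The Redis-backed dedup from the second item can be as small as one atomic "set if absent, with expiry" call. The sketch below assumes `client` behaves like redis-py's `redis.Redis`, whose `set(name, value, nx=True, ex=seconds)` returns a truthy value when the key was created and `None` when it already existed. The function name, the key prefix, and the idea that each notification carries a stable id are assumptions made for illustration.

```python
DEDUP_TTL_SECONDS = 24 * 60 * 60  # the 24h TTL mentioned above


def first_delivery(client, notification_id: str, ttl: int = DEDUP_TTL_SECONDS) -> bool:
    """Return True only for the first delivery of `notification_id` within the TTL.

    `client` is any object with a redis-py style `set(name, value, nx=..., ex=...)`.
    SET with NX and EX is a single atomic command, so two workers racing on the
    same notification cannot both see True.
    """
    key = f"upi:webhook:{notification_id}"  # hypothetical key prefix
    return bool(client.set(key, "1", nx=True, ex=ttl))
```

In a handler this would look like `if not first_delivery(redis_client, payload_id): return`, where `redis_client` and `payload_id` are whatever your service already has. Because the key lives outside the process, a restart or deploy no longer clears the dedup history.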

The reconciliation pattern

Every day:

Fetch the NPCI settlement file (via the PSP).
For each transaction: compare our local state to the file.
Flag discrepancies for operator review.

Most discrepancies are timing (we showed PENDING; file shows SUCCESS). A few are real (we showed SUCCESS; file shows FAILED). The reconciliation is the only way to catch the few.
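The comparison step can be sketched as a pure function over two mappings. The assumptions here are that both sides have already been reduced to `{rrn: status}` dictionaries with statuses PENDING, SUCCESS, or FAILED, and that the category names are free to choose. Parsing the actual settlement file is PSP-specific and not shown.

```python
from typing import Dict, List, NamedTuple, Optional


class Discrepancy(NamedTuple):
    rrn: str
    local: Optional[str]    # our status, or None if we never recorded it
    settled: Optional[str]  # status in the settlement file, or None if absent
    kind: str


def reconcile(local: Dict[str, str], settled: Dict[str, str]) -> List[Discrepancy]:
    """Compare local state to the settlement file and classify each mismatch."""
    out: List[Discrepancy] = []
    for rrn in sorted(set(local) | set(settled)):
        ours, theirs = local.get(rrn), settled.get(rrn)
        if ours == theirs:
            continue
        if ours is None:
            kind = "missing_locally"   # a transaction we missed entirely
        elif theirs is None:
            kind = "missing_in_file"   # may be noise if the file omits some statuses
        elif ours == "PENDING":
            kind = "timing"            # we lagged behind the final outcome
        else:
            kind = "real"              # e.g. we showed SUCCESS, file shows FAILED
        out.append(Discrepancy(rrn, ours, theirs, kind))
    return out
```

Sorting the "timing" rows away from the "real" and "missing_locally" rows gives operators a short list to review.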

What I’d carry forward

For UPI specifically:

Treat PENDING as PENDING. Never mark a payment as successful until you see SUCCESS.
Idempotent webhooks. Always.
Daily reconciliation against the PSP settlement file. Non-negotiable.
A status-code table embedded in the codebase, not in a wiki.
A test mode that walks the failure paths, not just the happy path. Most bugs are in the unhappy paths.

The spec is precise once you’ve read it three times. The integration partner’s specific quirks are what take the time. Build for the reconciliation case first; the happy path will figure itself out.