P2P lending — onboarding and disbursement governance
A P2P lending platform processed 5K+ loans/month with low fraud rates. The onboarding architecture and the disbursement-approval workflow were the controls that bought the low rate. Here is what shipped.
Onboarding — the deterministic pipeline
Every borrower onboarding ran through the same deterministic pipeline:
PAN check ─► Aadhaar OKYC ─► Bureau pulls ─► Income proof ─► Fraud signals ─► Decision
    │             │               │               │               │
    │             │               │               │               └─ velocity, geo, device
    │             │               │               └─ bank statement parse
    │             │               └─ parallel: 3 bureaus, take majority + log dissent
    │             └─ Aadhaar offline KYC, signature verification
    └─ PAN check-digit, format validation
Every step was deterministic Go code, not an LLM. The LLM’s involvement was narration only — turning the verdict into a borrower-friendly message (“we couldn’t verify your address; please re-upload your utility bill”).
The decision step combined the inputs through a scoring function the risk team owned. The function was a one-page table mapping inputs to scores, version-controlled, with every change reviewed by the risk team.
Why three bureaus, not one
Indian P2P lenders pull from multiple credit bureaus because:
- Coverage varies. Bureau A might have rich data on borrower X; bureau B might have nothing. Bureau C might have stale data.
- Score variance. Different bureaus weight signals differently. A 720 from one and a 640 from another tells you something different than a 680 average.
- Disputes. Borrowers sometimes have disputed entries on one bureau; checking the others gives you a fuller picture.
The pull was parallel: three goroutines, each fetching one bureau under a timeout. If at least two of the three responded, the decision ran on what we had. If only one responded, the decision deferred to human review. The on-call playbook covered bureau outages, which happened every few months.
The signal fusion
Combining bureau data is harder than picking the highest score. We:
- Took the median score across responding bureaus.
- Flagged any bureau with > 80-point divergence from the median for human review.
- Treated certain hard signals (sanctions hit, bureau-reported fraud) as overriding regardless of score.
The 80-point threshold was the risk team’s number. It got tuned based on actual fraud outcomes; the deterministic pipeline made the tuning a configuration change, not a code change.
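The three fusion rules above can be sketched as a single function. This is an illustration under stated assumptions: `BureauReport`, `Fusion`, and `fuse` are hypothetical names, the hard-signal flag is collapsed into one boolean, and for an even number of reports the median is taken as the integer mean of the two middle scores. The divergence limit is a parameter, in keeping with it being configuration rather than code.

```go
package main

import "sort"

// BureauReport is a hypothetical, minimal view of one bureau's response.
type BureauReport struct {
	Bureau     string
	Score      int
	HardSignal bool // e.g. sanctions hit or bureau-reported fraud
}

// Fusion is the combined view handed to the decision step.
type Fusion struct {
	Median       int
	Divergent    []string // bureaus further than maxDivergence from the median
	HardOverride bool     // true if any bureau reported a hard signal
}

func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// fuse takes the median across responding bureaus, flags outliers for human
// review, and records whether any hard signal overrides the score.
func fuse(reports []BureauReport, maxDivergence int) Fusion {
	var out Fusion
	if len(reports) == 0 {
		return out
	}
	scores := make([]int, 0, len(reports))
	for _, r := range reports {
		scores = append(scores, r.Score)
		if r.HardSignal {
			out.HardOverride = true
		}
	}
	sort.Ints(scores)
	mid := len(scores) / 2
	if len(scores)%2 == 1 {
		out.Median = scores[mid]
	} else {
		out.Median = (scores[mid-1] + scores[mid]) / 2
	}
	for _, r := range reports {
		if abs(r.Score-out.Median) > maxDivergence {
			out.Divergent = append(out.Divergent, r.Bureau)
		}
	}
	return out
}
```

With scores of 730, 640, and 600 and a limit of 80, the median is 640 and only the 730 bureau is flagged, since it sits 90 points away while the 600 bureau sits 40 points away.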
Real-time fraud detection
In parallel with the bureau pulls, a fraud-signal scorer ran:
- Velocity: how many applications from this PAN / phone / device / IP in the last 24 hours?
- Geo: does the IP geolocate consistent with the address?
- Device: is this device tied to other applicants? (We used a third-party device fingerprint.)
- Behavioural: did the borrower fill the form too fast / paste fields / show signs of automation?
Each signal was scored from 0 to 1; the combined fraud score was a weighted sum. Above a threshold, the application went to manual review. The threshold was tuned against outcomes: every approved loan that later defaulted fed back into the calibration.
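As a rough sketch of the weighted sum, assuming one weight per signal and weights that sum to 1 so the combined score also stays between 0 and 1. The struct and function names are hypothetical, and no real weights or thresholds from the platform are implied.

```go
package main

// Signals holds the four fraud signals, each expected to be in [0, 1].
type Signals struct {
	Velocity    float64
	Geo         float64
	Device      float64
	Behavioural float64
}

// Weights holds one weight per signal. Assumed to sum to 1.
type Weights struct {
	Velocity    float64
	Geo         float64
	Device      float64
	Behavioural float64
}

// fraudScore is the weighted sum of the individual signals.
func fraudScore(s Signals, w Weights) float64 {
	return w.Velocity*s.Velocity +
		w.Geo*s.Geo +
		w.Device*s.Device +
		w.Behavioural*s.Behavioural
}

// needsManualReview reports whether the score is above the tunable threshold.
func needsManualReview(score, threshold float64) bool {
	return score > threshold
}
```

For example, with illustrative weights of 0.5, 0.125, 0.25, and 0.125, an application that maxes out the velocity and device signals scores 0.75 and would go to manual review under a threshold of 0.6.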
The single highest-impact signal was velocity. A real borrower applies once. Synthetic-identity fraud rings apply dozens of times from different PANs but the same device or IP. Catching the device or IP overlap caught the bulk of attempted fraud.
Maker-checker for disbursement
After approval, disbursement was not automatic. It went through maker-checker (the four-eyes pattern from banking):
- Maker — the onboarding officer (or the system, for low-risk approved loans below a threshold) creates a disbursement request.
- Checker — a different person reviews and approves.
- Only after both can the payment-gateway call fire.
The threshold for system-vs-human maker varied by amount and risk score. Below ₹50,000 and a clean approval — system maker. Above — human maker. Always human checker.
The implementation:
type DisbursementRequest struct {
	ID           string
	LoanID       string
	Amount       Money
	MakerID      string
	MakerAt      time.Time
	CheckerID    string    // empty until approved
	CheckerAt    time.Time // zero until approved
	Status       Status    // pending | approved | rejected | disbursed
	RejectReason string
}

// Approve records the checker's sign-off on a disbursement request.
func (s *Service) Approve(ctx context.Context, reqID string, checker auth.Claims, reason string) error {
	// ... validation (elided; req is assumed to be loaded here) ...

	// Four-eyes rule: whoever raised the request cannot approve it.
	if req.MakerID == checker.Subject {
		return errors.New("maker cannot be checker")
	}
	// The approver must hold the dedicated checker role.
	if !checker.HasRole(auth.RoleDisbursementChecker) {
		return errors.New("not authorised to approve disbursement")
	}

	req.CheckerID = checker.Subject
	req.CheckerAt = time.Now().UTC()
	req.Status = StatusApproved

	// ... audit log ...
	// trigger the payment gateway call
	return nil
}
The maker-cannot-be-checker rule was enforced in code AND in the RBAC roles — the role that gave you “maker” was distinct from the one that gave you “checker”, and the role assignment process required HR sign-off.
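The system-versus-human maker routing described earlier sits upstream of `Approve`. A minimal sketch of that rule, assuming amounts are held in paise and that an amount of exactly ₹50,000 goes to a human maker (the text only specifies "below" and "above"). `chooseMaker` and `MakerKind` are hypothetical names.

```go
package main

// MakerKind says who raises the disbursement request.
type MakerKind string

const (
	SystemMaker MakerKind = "system"
	HumanMaker  MakerKind = "human"
)

// systemMakerLimitPaise is the ₹50,000 cut-off from the text, in paise.
const systemMakerLimitPaise int64 = 50_000 * 100

// chooseMaker lets the system act as maker only for clean approvals strictly
// below the limit. The checker is always a human, so it is not modelled here.
func chooseMaker(amountPaise int64, cleanApproval bool) MakerKind {
	if cleanApproval && amountPaise < systemMakerLimitPaise {
		return SystemMaker
	}
	return HumanMaker
}
```

Keeping the limit as a named constant (or a configuration value) makes the cut-off easy to review alongside the risk team's other thresholds.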
The audit chain
Every onboarding and every disbursement decision wrote to a hash-chained audit table. The hash chain made the audit tamper-evident; the per-row schema captured:
- Who acted (maker, checker)
- When
- What the inputs were (bureau scores, fraud score, manual notes)
- What the decision was
- The hash of the previous audit row
A monthly verification job walked the chain and failed if any row was inconsistent. The job never found tampering, but it did catch two operator bugs, including a script that updated rows in place during a data migration and so broke the chain. We restored from backup and patched the script.
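To make the mechanism concrete, here is a small sketch of a hash chain and the walk that verifies it. It assumes SHA-256 and a simplified row with string fields; the original does not say which hash function or column types were used, and `AuditRow`, `appendRow`, and `verify` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// AuditRow is a simplified audit record. Hash covers every other field,
// including PrevHash, which links the row to its predecessor.
type AuditRow struct {
	Actor    string
	At       string // timestamp kept as text so the hashed bytes are stable
	Inputs   string
	Decision string
	PrevHash string
	Hash     string
}

// rowHash hashes the row's content. Each field is length-prefixed so that
// shifting characters between adjacent fields changes the hash.
func rowHash(r AuditRow) string {
	h := sha256.New()
	for _, f := range []string{r.Actor, r.At, r.Inputs, r.Decision, r.PrevHash} {
		fmt.Fprintf(h, "%d:%s|", len(f), f)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// appendRow links r to the current tail of the chain and appends it.
func appendRow(chain []AuditRow, r AuditRow) []AuditRow {
	r.PrevHash = ""
	if len(chain) > 0 {
		r.PrevHash = chain[len(chain)-1].Hash
	}
	r.Hash = rowHash(r)
	return append(chain, r)
}

// verify walks the chain and returns the index of the first inconsistent
// row, or -1 if every row matches its stored hash and its predecessor.
func verify(chain []AuditRow) int {
	prev := ""
	for i, r := range chain {
		if r.PrevHash != prev || r.Hash != rowHash(r) {
			return i
		}
		prev = r.Hash
	}
	return -1
}
```

An in-place update of the kind the migration script made changes a row's content without recomputing its hash, so `verify` reports that row's index instead of -1.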
What I’d carry forward
Three patterns the platform proved out:
- Deterministic core, LLM narration. The decisions that matter for fraud and money lived in code the risk team could read. The LLM made the borrower-facing messaging warm but never made decisions.
- Parallel bureau pulls with majority decision. Single-bureau reliance creates a single point of failure (and a single gameable signal). Three sources with fusion is dramatically more robust.
- Maker-checker for money movement, enforced at multiple layers. Code, roles, and HR process all reinforce each other. Any single layer can be bypassed by a determined insider; three layers create dramatically more friction.
The platform’s fraud rate stayed well below the industry benchmark. The number was the visible outcome; the discipline above was what generated it.