API security strategy — independent review

Generated 2026-05-12. Independent review of the umbrella issue #423 Rate-limit / DoS defence-in-depth and its six sub-issues (#424–#429). Scope: API security only — not edge DDoS, not network-layer defence, not bot management.

TL;DR — what to do today

30-minute foundation fix: restrict Render's origin to Cloudflare IP ranges. Without it, every other control in the stack is bypassable by hitting Render directly. Works on Cloudflare Free tier. Currently buried inside #427 — should be a precursor issue.
Re-sequence: idempotency before rack-attack. Sequenced the other way, flaky-network retries get throttled. Sequenced this way, retries are absorbed correctly and the rate-limit posture doesn't ship a UX regression.
Pin four open decisions in the ADR before any wave-1 work starts: Cloudflare plan level, breaker storage, Idempotency-Key required-from-day-one, lockout escalation. Each has a clear right answer; leaving them open turns implementation into re-litigation.
Defer #427 Cloudflare Pro tier until pilot tenants justify the $25/mo. Free tier + origin allowlist + rack-attack covers the realistic threat surface pre-pilot.

Highest-leverage fix: origin allowlist (Render → Cloudflare IPs only). Zero code, no plan upgrade, single biggest impact in the whole stack. Do it today.

The defence-in-depth stack

Six sub-issues address six layers. Layering is canonical — each layer addresses what the others can't see.

Layers: Edge (Cloudflare) · Rack middleware · Identity (Rodauth) · Ops (Puma / DB) · Correctness (idempotency) · Domain (financial DoS)

#427 · Cloudflare WAF + RL + origin allowlist

Open

Edge · Perimeter

WAF managed rules on auth paths, per-path rate-limit rules (Pro tier), bot scoring, and (critically) Render origin IP allowlist restricting traffic to Cloudflare's IP ranges.

Open question: Cloudflare plan level — Pro ($25/mo) buys per-path rate-limit rules. Suggest deferring until pilot tenants exist.
Highest leverage: The origin allowlist is the load-bearing one and works on Free tier. Split out as a precursor.
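
The allowlist itself is dashboard configuration rather than code, but the membership test it applies can be sketched. This is a minimal illustration, assuming a hard-coded sample of Cloudflare's published IPv4 ranges; `CLOUDFLARE_RANGES` and `from_cloudflare?` are hypothetical names, and a real list would need refreshing from the published ranges.

```ruby
require "ipaddr"

# Illustrative subset of Cloudflare's published IPv4 ranges (assumption:
# a real deployment loads the full, current list rather than hard-coding it).
CLOUDFLARE_RANGES = %w[
  173.245.48.0/20
  103.21.244.0/22
  104.16.0.0/13
  172.64.0.0/13
].map { |cidr| IPAddr.new(cidr) }.freeze

# Hypothetical helper: true when the connecting peer is inside a listed range.
def from_cloudflare?(remote_ip)
  ip = IPAddr.new(remote_ip)
  CLOUDFLARE_RANGES.any? { |range| range.include?(ip) }
rescue IPAddr::InvalidAddressError
  false # unparseable peer address: treat as not allowed
end
```

The check only means something when it is applied to the real connecting address; a client-supplied header such as X-Forwarded-For can be spoofed, which is why the review puts this control at the platform level.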

#424 · rack-attack baseline

Open

Rack · Solid Cache backed

Per-IP and per-email throttles on /login / /create-account / /reset-password-request; per-account cap on POST /property_reports; catch-all backstop on state-changing /api/v1/*.

Open question: Rack-only vs also Guards::RateLimit in the pipeline — answer is "both" (cheap IP rejection at rack; account-scoped throttle at guard).
Concern: Stripe webhook strategy ("allowlist Stripe IPs") is fragile — see push-back 4.

#425 · Rodauth `:lockout` + enumeration close-out

Open

Identity · Per-account

Enable Rodauth's :lockout feature (5 invalid logins → lock). Generic error messages on login + reset-password to close the account-enumeration oracle (currently distinguishes "no such account" from "wrong password").

Open question: Lockout-by-account vs per-IP-per-account challenge. :lockout alone hands an attacker a DoS vector against any known tenant email.
Recommendation: Ship :lockout for v1 + flag the DoS vector in the ADR; revisit with CAPTCHA when a real abuse signal arrives.

#426 · Puma + body size + DB pool hardening

Open

Ops · Infrastructure

Puma worker_timeout 15, threads tuned to DB pool, max-body-size middleware (1MB JSON / 50MB multipart), explicit open_timeout + read_timeout on Sources::* outbound HTTP.

Open question: Render plan DB-connection capacity must be confirmed before sizing Puma threads.
Strong fit: Closes the slow-loris + oversized-body gaps that no other layer addresses.
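
As a rough illustration of the max-body-size piece, here is a minimal Rack-style middleware sketch. The class name `MaxBodySize` and the JSON error body are assumptions; the two limits are the ones quoted above. It only inspects the declared Content-Length, so chunked uploads without that header would need a stream-counting wrapper on top.

```ruby
# Minimal Rack-style middleware sketch: rejects requests whose declared
# Content-Length exceeds a per-content-type cap. Limits mirror the values
# quoted above (1 MB JSON, 50 MB multipart); the class name is hypothetical.
class MaxBodySize
  JSON_LIMIT      = 1 * 1024 * 1024
  MULTIPART_LIMIT = 50 * 1024 * 1024

  def initialize(app)
    @app = app
  end

  def call(env)
    declared = env["CONTENT_LENGTH"].to_i # nil or missing header counts as 0
    if declared > limit_for(env["CONTENT_TYPE"].to_s)
      return [413, { "content-type" => "application/json" },
              ['{"error":"payload_too_large"}']]
    end
    @app.call(env)
  end

  private

  # Multipart uploads get the larger cap; everything else gets the JSON cap.
  def limit_for(content_type)
    content_type.start_with?("multipart/form-data") ? MULTIPART_LIMIT : JSON_LIMIT
  end
end
```

Rejecting on the declared length is cheap because it happens before the body is read, which is the point of putting this check in middleware rather than in a controller.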

#429 · Idempotency-Key on mutating endpoints

Gates rack-attack

Correctness · Solves retry × throttle

Required Idempotency-Key header on POST/PATCH/DELETE; Solid Cache stores response under (key, method, path, account) hash for 24h; conflict detection on key-reuse with different body. Guards::Idempotency in pipeline.

Open question: required: true from day one or required: false with soft-deprecation? Recommend false-then-true to keep partner / B2B clients integratable.
Sequencing: Must land before rack-attack so retries don't trip throttles.
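
The store-and-replay behaviour described above can be sketched in plain Ruby. This is an in-memory stand-in, not the Solid Cache implementation; `IdempotencyStore`, its `fetch` signature, and the injectable clock are all assumptions made for illustration. It does not handle two first-time requests racing each other, which a real guard would need to lock around.

```ruby
require "digest"

# In-memory stand-in for the cache-backed store described above (assumption:
# the real guard would use the app cache with a 24-hour expiry).
class IdempotencyStore
  TTL_SECONDS = 24 * 60 * 60
  Conflict = Class.new(StandardError)

  def initialize(clock: -> { Time.now.to_f })
    @entries = {}
    @clock = clock
  end

  # Returns the stored response on a replay, or runs the block and stores
  # its result on first sight. Raises Conflict on key reuse with a new body.
  def fetch(key:, http_method:, path:, account_id:, body:)
    cache_key   = Digest::SHA256.hexdigest([key, http_method, path, account_id].join("\n"))
    fingerprint = Digest::SHA256.hexdigest(body)
    entry = @entries[cache_key]
    entry = nil if entry && @clock.call - entry[:stored_at] > TTL_SECONDS

    if entry
      raise Conflict, "Idempotency-Key reused with a different body" unless entry[:fingerprint] == fingerprint
      return entry[:response]
    end

    response = yield
    @entries[cache_key] = { fingerprint: fingerprint, response: response, stored_at: @clock.call }
    response
  end
end
```

A retried request with the same key and body gets the stored response without re-running the handler, which is what lets a later throttle treat retries as harmless.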

#428 · Per-account upstream-source budget breaker

Open

Domain · Financial DoS

Guards::UpstreamBudget caps cumulative outbound source calls per account in a rolling window. Solid Cache for hot-path check; ProviderBudgetLedger table for audit on trip events.

Open question: Arithmetic doesn't line up — 30 reports × 5 sources = 150 calls vs the 100-call breaker. See push-back 5.
Unique to TenantMate: Other apps don't have a per-tenant upstream-source dollar cost; this app does. Right framing.
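
A rolling-window budget of this kind can be sketched as follows. The class below is an in-memory illustration, not the Guards::UpstreamBudget implementation; the 100-call limit is the breaker figure quoted above, while the one-hour window, the `spend` method, and the injectable clock are assumptions.

```ruby
# Sketch of a per-account rolling-window call budget. The 100-call limit is
# the breaker figure quoted above; the one-hour window is an assumption.
class UpstreamBudget
  def initialize(limit: 100, window_seconds: 3600, clock: -> { Time.now.to_f })
    @limit = limit
    @window = window_seconds
    @clock = clock
    @calls = Hash.new { |hash, account_id| hash[account_id] = [] }
  end

  # Records `count` outbound calls and returns true if the account stays within
  # budget; returns false (breaker tripped, nothing recorded) otherwise.
  def spend(account_id, count = 1)
    now = @clock.call
    timestamps = @calls[account_id]
    timestamps.reject! { |t| t <= now - @window } # drop calls outside the window
    return false if timestamps.size + count > @limit

    count.times { timestamps << now }
    true
  end
end
```

Checking the whole report's cost up front (`spend(account_id, 5)`) avoids starting a report that would trip the breaker halfway through its source calls.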

What's strong

Layering

Edge / rack / identity / ops / correctness / domain is the canonical defence-in-depth model. Each layer addresses what the others can't see.

Guard-pipeline integration

Guards::RateLimit, Guards::Idempotency, Guards::UpstreamBudget slot into #353's pipeline and inherit the #358 conformance check for free. Stops posture rot on every new endpoint.

Solid Cache as backing

Avoids a Redis dependency. One less moving part for a solo founder.

Financial-DoS as its own lane (#428)

The unique-to-TenantMate concern. Failure mode is bill shock, not unavailability. Worth framing separately from velocity rate-limits.

Conformance check extension

Per the umbrella's DoD: new mutating endpoints must pass through Guards::RateLimit + Guards::Idempotency or carry a tagged skip_guard. The mechanism stops the posture rotting silently.

Idempotency framed right

Solves three problems at once: retries-vs-rate-limits UX, duplicate-record correctness, alignment with the existing offline-first UUID pattern (architecture v1.2 §262–263).

Where I'd push back

1. Origin allowlist should be Step 1, not a sub-bullet

Without restricting Render to Cloudflare IP ranges, every other rate-limit is bypassable by hitting Render directly. Currently buried inside #427 as "Origin protection." Works on Cloudflare Free tier — no plan upgrade needed. Single highest-leverage / lowest-cost fix in the whole stack.

Critical

2. Idempotency before rack-attack, not after

The umbrella sequences #429 in the second parallel wave, behind #424. That ships rack-attack as a regression first, then fixes the UX problem. Flip the order: idempotency makes the rate-limit safe.

Critical

3. Rate-limit response leaks account existence

#424's "per-email 5/5min" throttle is a second account-enumeration oracle: only registered emails accumulate a per-email counter, so a throttled response confirms that the account exists. #425 closes the content oracle ("Invalid login or password") but not this behavioural one. Mitigation: keep a per-email counter for any submitted email, including unknown ones.

High
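
The mitigation can be sketched as a throttle keyed purely on the submitted string. This is a minimal fixed-window illustration, assuming an in-memory counter in place of the cache store; `EmailThrottle` and `allow?` are hypothetical names, and the 5-per-300-seconds figures follow the "5/5min" quoted above.

```ruby
# Fixed-window throttle keyed on the submitted email, whether or not an
# account exists for it. Limit and period follow the "5/5min" figure above.
# Old windows are never evicted here; a cache store with a TTL would do that.
class EmailThrottle
  def initialize(limit: 5, period_seconds: 300, clock: -> { Time.now.to_i })
    @limit = limit
    @period = period_seconds
    @clock = clock
    @counts = Hash.new(0)
  end

  # Returns true when the attempt is allowed. No account lookup happens here,
  # so known and unknown emails hit the limit after the same number of tries.
  def allow?(submitted_email)
    window = @clock.call / @period
    key = [submitted_email.to_s.strip.downcase, window]
    @counts[key] += 1
    @counts[key] <= @limit
  end
end
```

Because the counter never consults the accounts table, the throttled response carries no information about whether the email is registered.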

4. Stripe webhook IP allowlist is fragile

"Allowlist Stripe IPs" is brittle: Stripe's source IPs are broad and rotate, so the allowlist goes stale. The actual gate is Stripe-Signature verification plus WebhookEvent.event_id dedup. Rate-limit only signature-invalid requests; verified ones pass.

Medium
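
The order of gates can be sketched in plain Ruby. This hand-rolled version follows Stripe's documented scheme (HMAC-SHA256 over `timestamp.payload`, carried in a `t=...,v1=...` header) but is only an illustration; in production the official Stripe library would normally do the signature check. `WebhookGate`, the in-memory set standing in for WebhookEvent.event_id uniqueness, and the 300-second tolerance are assumptions.

```ruby
require "openssl"
require "set"

# Sketch of webhook gating: verify the signature first, then dedup on event
# id. Hand-rolled for illustration only; names and tolerance are assumptions.
class WebhookGate
  TOLERANCE_SECONDS = 300

  def initialize(secret)
    @secret = secret
    @seen_event_ids = Set.new # stand-in for a unique index on event_id
  end

  # Returns :invalid_signature, :duplicate, or :accepted.
  def check(payload:, signature_header:, event_id:, now: Time.now.to_i)
    parts = signature_header.to_s.split(",").map { |pair| pair.split("=", 2) }
    timestamp = parts.find { |k, _| k == "t" }&.last
    candidates = parts.select { |k, _| k == "v1" }.map(&:last)
    return :invalid_signature if timestamp.nil? || (now - timestamp.to_i).abs > TOLERANCE_SECONDS

    expected = OpenSSL::HMAC.hexdigest("SHA256", @secret, "#{timestamp}.#{payload}")
    valid = candidates.any? { |sig| OpenSSL.secure_compare(sig, expected) }
    return :invalid_signature unless valid

    @seen_event_ids.add?(event_id) ? :accepted : :duplicate
  end
end
```

Under this shape, only `:invalid_signature` results would count toward a throttle, so verified Stripe traffic is never rate-limited regardless of which IP it arrives from.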

5. #428 arithmetic doesn't add up

"30 reports / hour" (velocity #424) × "5 sources per report" = 150 calls — but #428's breaker is 100 calls. Either the breaker counts reports not calls, or 30 is too many. Pin in the ADR.

High
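
The mismatch is easy to make concrete. The snippet below just restates the figures quoted above as constants; treating the velocity cap and the breaker as sharing the same one-hour window is an assumption, since the breaker's window is not pinned.

```ruby
# Figures quoted in the review; the shared one-hour window is an assumption.
REPORTS_PER_HOUR_CAP = 30   # velocity throttle from #424
SOURCES_PER_REPORT   = 5
BREAKER_CALL_LIMIT   = 100  # budget breaker from #428

# Calls an account could generate while staying under the velocity cap.
WORST_CASE_CALLS = REPORTS_PER_HOUR_CAP * SOURCES_PER_REPORT

# Reports an account can actually complete before the breaker trips.
REPORTS_BEFORE_TRIP = BREAKER_CALL_LIMIT / SOURCES_PER_REPORT

puts "worst-case calls under velocity cap: #{WORST_CASE_CALLS}"
puts "reports allowed before breaker trips: #{REPORTS_BEFORE_TRIP}"
```

If both limits stay as quoted, the breaker is the binding constraint and the 30-report velocity cap is never reached, which is why the ADR needs to say what the breaker counts.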

6. `Idempotency-Key: required: true` from day one is over-strict

For the PWA, auto-generating the key is invisible to users, and native clients are fine. But requiring the header from day one forecloses partner / B2B integrations later, because naive integrators see 400s. Recommend required: false with a 6-month soft-deprecation before flipping.

Medium
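
One way the soft-deprecation period could behave is sketched below. `idempotency_header_decision` is a hypothetical helper, and the warning header name is an illustrative assumption rather than anything the issue specifies.

```ruby
# Hypothetical decision helper for an idempotency guard. `required` is the
# flag under discussion; the warning header name is an assumption.
def idempotency_header_decision(headers, required:)
  key = headers["Idempotency-Key"].to_s.strip
  return { action: :proceed, warnings: {} } unless key.empty?
  return { action: :reject, status: 400 } if required

  # Soft-deprecation path: let the request through but tell the client.
  { action: :proceed,
    warnings: { "X-Idempotency-Key-Missing" => "will be required in a future release" } }
end
```

With `required: false`, integrators who omit the header keep working and get a machine-readable nudge; flipping the flag later turns the same branch into a 400.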

7. `:lockout` hands an attacker a tenant-DoS vector

Account-level lockout (Rodauth default) means an attacker who knows a tenant's email can lock them out with 5 wrong passwords. Standard mitigation is per-IP-per-account CAPTCHA before account-level lockout. Worth flagging in the ADR even if you keep :lockout for v1.

High
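
The escalation idea can be sketched as two counters with different thresholds. This is an illustration of the mitigation, not Rodauth's API; `LoginFailureTracker` and both thresholds are assumptions.

```ruby
# Sketch of two-tier login failure tracking: a challenge (e.g. CAPTCHA) for a
# noisy (ip, account) pair, and account-level lockout only as a later backstop.
# Both thresholds are illustrative assumptions.
class LoginFailureTracker
  def initialize(pair_threshold: 5, account_threshold: 50)
    @pair_threshold = pair_threshold
    @account_threshold = account_threshold
    @pair_failures = Hash.new(0)
    @account_failures = Hash.new(0)
  end

  def record_failure(ip, account_id)
    @pair_failures[[ip, account_id]] += 1
    @account_failures[account_id] += 1
  end

  # :locked if the account as a whole is under sustained attack,
  # :challenge if this IP has failed repeatedly on this account, else :allow.
  def status(ip, account_id)
    return :locked if @account_failures[account_id] >= @account_threshold
    return :challenge if @pair_failures[[ip, account_id]] >= @pair_threshold

    :allow
  end
end
```

A single attacker IP gets challenged after five failures while the real tenant, logging in from elsewhere, is unaffected. A distributed attacker can still reach the account-level threshold, so this narrows the DoS vector rather than removing it.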

8. Audit-table growth as a DoS surface

An authenticated attacker pounding mutating endpoints generates audits rows linearly, which over time becomes a DB-size DoS. The audit retention policy (probably in the data_governance pack) is the right home for a cap.

Medium

9. "Simulated abuse run or pen test" DoD is hand-wavy

Solo founder will convince themselves the stack works without proving it. Pin a tool (hey, k6, or wrk) and write the exact commands in the ADR. External pen test is a post-pilot task.

Medium

10. #427 Pro tier ($25/mo + Terraform) is over-spec pre-launch

Free tier + global "Under Attack" mode + origin allowlist + rack-attack covers the realistic threat surface pre-pilot. Pro buys per-path rate-limit rules — a tightening, not a foundation. Defer until pilot tenants exist.

Medium

Suggested sequencing

The umbrella's "three-then-three parallel" plan is roughly right but mis-orders the foundations. Suggested re-sequence:

Wave	Issues	Why	Working days
0 — prereq	Cloudflare origin allowlist (split from #427)	Without this, nothing else matters. Render dashboard + Cloudflare IP list, ~30 min.	0.5 d
1 — parallel	#429 Idempotency · #426 Puma/body/pool · #425 Rodauth lockout + enumeration	Idempotency before rack-attack; Puma + lockout don't conflict; all three touch different files.	~3 d
2 — parallel	#424 rack-attack · #428 upstream-budget breaker	Both build on Wave 1. Different layers, no file overlap.	~3 d
3 — later	#427 Cloudflare WAF + RL rules (remainder, Pro tier)	Defer until pilot tenants make Pro spend justifiable.	0.5–3 d
4 — DoD	ADR `docs/05-design/api-rate-limit-policy.md` + pinned-tool abuse run	Closes the umbrella; commits the trade-offs.	1 d

MVP-for-pilot slice: Wave 0 + Wave 1 + Wave 2 + Wave 4 — ~7–8 working days. Defers Wave 3 entirely until traffic justifies it. Realistic in 2 calendar weeks alongside other work.

What to do today

30-minute fix: Render dashboard → IP allowlist → Cloudflare published IP ranges (https://www.cloudflare.com/ips/). Zero code, no plan upgrade.
Open precursor issue "[Rate limit] Cloudflare origin IP allowlist" split out of #427. Mark it as Wave 0 in #423.
Re-sequence #423 open-questions section to put Idempotency (#429) in Wave 1, not Wave 2.
Pin four ADR decisions before any wave-1 work starts:
- Cloudflare plan — Free + Wave-0 allowlist now; Pro deferred to pilot.
- Breaker storage — Solid Cache for hot-path, ProviderBudgetLedger table for audit. (Issue already suggests both.)
- Idempotency-Key: required — false with 6-month soft-deprecation.
- Lockout escalation — fires account.lockout notification via #348; account-level lockout for v1; per-IP-CAPTCHA flagged as future tightening.

Independent review artefact for #423. Self-contained: dark/light auto, inline CSS, no external assets, no JS. Auto-deploys via Cloudflare Pages from docs/artifacts/ — find sibling artefacts at the hub.
