Multi-Tenancy Isolation Architecture Models

SaaS applications implement tenant isolation at different granularities, each with its own security and operational trade-offs. Database-per-tenant provides the strongest isolation: each tenant's data lives in a separate database with separate credentials, so compromise of one tenant's database does not directly expose the others. The operational overhead is high. Schema-per-tenant (one PostgreSQL database, separate schemas) provides logical isolation on shared infrastructure. Shared database with row-level isolation is the most common model: all tenants' data shares the same tables, with a tenant_id column and application-level or database-level filtering enforcing isolation.

The shared-database model is where most isolation failures occur. The entire isolation model depends on every single query including the correct tenant filter. One missing filter, one code path that does not apply the tenant context, one internal API that bypasses the normal request context: any of these creates a cross-tenant access path that is exploitable by any authenticated user who can trigger that code path.

The consistency problem: In a shared-database multi-tenant system, isolation is maintained by adding WHERE tenant_id = :current_tenant to every query. The security of the entire system depends on never forgetting this filter: across every query, in every service, written by every developer, forever. That is a fragile guarantee.
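One way to make the filter harder to forget is to route queries through a builder that cannot produce SQL without it. The following is a minimal sketch, not a definitive implementation: the function name buildTenantQuery, the table allow-list, and the { text, values } return shape are all assumptions introduced for illustration.

```javascript
// Hypothetical allow-list of tenant-scoped tables. SQL identifiers cannot be
// bound as parameters, so table and column names are validated before use.
const TENANT_SCOPED_TABLES = new Set(["reports", "invoices"]);
const SAFE_IDENTIFIER = /^[a-z_][a-z0-9_]*$/;

// Builds a SELECT that always carries the tenant filter as its first clause.
// `filters` is a { column: value } map ANDed onto the tenant filter.
function buildTenantQuery(table, tenantId, filters = {}) {
  if (typeof tenantId !== "string" || tenantId.length === 0) {
    throw new Error("tenantId is required");
  }
  if (!TENANT_SCOPED_TABLES.has(table)) {
    throw new Error(`unknown tenant-scoped table: ${table}`);
  }
  const clauses = ["tenant_id = $1"];
  const values = [tenantId];
  for (const [column, value] of Object.entries(filters)) {
    // Callers may not override or duplicate the tenant filter.
    if (!SAFE_IDENTIFIER.test(column) || column === "tenant_id") {
      throw new Error(`invalid filter column: ${column}`);
    }
    values.push(value);
    clauses.push(`${column} = $${values.length}`);
  }
  return {
    text: `SELECT * FROM ${table} WHERE ${clauses.join(" AND ")}`,
    values,
  };
}
```

A handler would then call something like db.query(q.text, q.values) with q built from req.user.tenantId. The point of the design is that omitting the tenant becomes a thrown error rather than a silently unfiltered query.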

Tenant ID Injection

Applications that derive the tenant context from a user-supplied identifier (a subdomain, a URL path parameter, a request header, or a JWT claim) must validate that the supplied tenant ID matches the authenticated user's actual tenant. If the validation is missing or bypassed, an attacker can supply any tenant ID and access that tenant's data.

    // Vulnerable: tenant ID from URL without authorisation check
    app.get("/api/reports/:tenantId", authenticate, async (req, res) => {
      const { tenantId } = req.params; // attacker-controlled
      // Missing: verify req.user.tenantId === tenantId
      const reports = await db.query(
        `SELECT * FROM reports WHERE tenant_id = $1`,
        [tenantId]
      );
      res.json(reports);
    });

    // Secure: derive tenant from authenticated session only
    app.get("/api/reports", authenticate, async (req, res) => {
      const tenantId = req.user.tenantId; // from verified JWT, not user input
      const reports = await db.query(
        `SELECT * FROM reports WHERE tenant_id = $1`,
        [tenantId]
      );
      res.json(reports);
    });

Shared Cache Cross-Tenant Contamination

Caching layers shared across tenants (Redis, Memcached, CDN caches) must include the tenant identifier in every cache key. A cache key derived from the resource identifier alone creates a cross-tenant contamination risk: the first tenant to request a resource populates the cache, and any other tenant that later requests the same resource ID receives the first tenant's cached response.

This is a particularly insidious bug because it is intermittent and load-dependent. Under low traffic, each tenant's requests miss the cache and hit the database (which applies the correct tenant filter). Under high traffic, cache hit rates increase and cross-tenant contamination becomes more frequent. The bug may not manifest in testing or low-traffic environments at all, appearing only at production scale.

Cache key discipline: Every cache key in a multi-tenant system must include the tenant identifier as its first component: tenant:{tenantId}:resource:{resourceId}. Audit all cache key construction in code review. A single cache layer that uses resource:{resourceId} creates a cross-tenant data leak that is invisible in single-tenant testing.
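The key format above can be enforced by a single helper that every cache call goes through. This is a sketch under stated assumptions: tenantCacheKey and TenantCache are hypothetical names, and a Map stands in for Redis or Memcached so the example is self-contained.

```javascript
// Builds keys in the tenant:{tenantId}:resource:{resourceId} format.
function tenantCacheKey(tenantId, resourceId) {
  for (const [name, part] of Object.entries({ tenantId, resourceId })) {
    if (typeof part !== "string" || part.length === 0) {
      throw new Error(`${name} is required`);
    }
    // Reject the separator so two different (tenant, resource) pairs
    // can never serialise to the same key.
    if (part.includes(":")) {
      throw new Error(`${name} must not contain ':'`);
    }
  }
  return `tenant:${tenantId}:resource:${resourceId}`;
}

// Thin wrapper whose API has no way to read or write without a tenant.
class TenantCache {
  constructor(store = new Map()) {
    this.store = store; // stand-in for a real cache client
  }
  get(tenantId, resourceId) {
    return this.store.get(tenantCacheKey(tenantId, resourceId));
  }
  set(tenantId, resourceId, value) {
    this.store.set(tenantCacheKey(tenantId, resourceId), value);
  }
}
```

With this wrapper, two tenants that happen to use the same resource ID populate separate entries, and a call that omits the tenant fails loudly instead of falling back to a shared key.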

Background Job Tenant Isolation

Background job processors (workers consuming from queues) frequently lose their tenant context. When an HTTP request enqueues a job, the request's tenant context exists. When the worker later processes the job, it can reconstruct that context only if the job payload carries it, and the payload will not carry it if the original developer did not think to include it.

Jobs that operate on shared resources without tenant context (for example, a "generate monthly report" job that queries all records modified in the past month) will query across all tenants if the tenant filter is not applied. Scheduled jobs (cron-based) are particularly vulnerable because they do not originate from a tenant-scoped request and have no natural tenant context to inherit.
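A sketch of how tenant context might be made mandatory on both sides of the queue follows. The function names, the job shape { type, tenantId, data }, and the plain array used as a queue are assumptions for illustration; a real queue library will have its own API.

```javascript
// Producer side: a job cannot be enqueued without a tenant.
function enqueueTenantJob(queue, type, tenantId, data = {}) {
  if (typeof tenantId !== "string" || tenantId.length === 0) {
    throw new Error(`refusing to enqueue ${type}: missing tenantId`);
  }
  queue.push({ type, tenantId, data });
}

// Scheduled jobs have no request to inherit from, so fan out explicitly:
// one job per tenant instead of one job that spans all tenants.
function fanOutScheduledJob(queue, type, tenantIds, data = {}) {
  for (const tenantId of tenantIds) {
    enqueueTenantJob(queue, type, tenantId, data);
  }
}

// Worker side: reject payloads without a tenant instead of defaulting
// to an unscoped query.
function processJob(job, handlers) {
  if (typeof job.tenantId !== "string" || job.tenantId.length === 0) {
    throw new Error(`job ${job.type} rejected: missing tenantId`);
  }
  const handler = handlers[job.type];
  if (!handler) {
    throw new Error(`no handler for job type ${job.type}`);
  }
  // Handlers receive the tenant context as their first argument.
  return handler({ tenantId: job.tenantId }, job.data);
}
```

The worker-side check matters even when the producer validates, because jobs can also be enqueued by older code, scripts, or other services.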

Row-Level Security Bypass

PostgreSQL Row-Level Security (RLS) can enforce tenant isolation at the database level, eliminating the "developer forgets to add WHERE clause" failure mode. However, RLS has its own bypass patterns. Superusers and roles with the BYPASSRLS attribute always bypass RLS: if your application occasionally uses such a connection (for migrations, monitoring, or admin operations), those operations are not protected by RLS policies. Similarly, the table owner bypasses RLS unless the owner is explicitly made subject to the policies with ALTER TABLE ... FORCE ROW LEVEL SECURITY.

RLS as defence in depth: PostgreSQL RLS is an excellent additional isolation layer, but it should not replace application-level tenant filtering; it should supplement it. RLS policies can have edge cases and bypass conditions. Defence in depth means a bug in RLS does not automatically mean a cross-tenant breach if the application also applies tenant filters.
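As a rough illustration of RLS used as a backstop, the sketch below pairs one-time policy DDL with a helper that sets the tenant for the duration of a transaction. The names are assumptions: a table called reports with a text tenant_id column, a custom setting app.current_tenant, a policy named tenant_isolation, and a client exposing query(text, values) in the style of node-postgres. set_config with its third argument true is used because it accepts a bound parameter and behaves like SET LOCAL.

```javascript
// One-time DDL for a hypothetical tenant-scoped table named "reports".
const RLS_SETUP_SQL = [
  "ALTER TABLE reports ENABLE ROW LEVEL SECURITY",
  // FORCE makes the policy apply to the table owner as well.
  "ALTER TABLE reports FORCE ROW LEVEL SECURITY",
  "CREATE POLICY tenant_isolation ON reports " +
    "USING (tenant_id = current_setting('app.current_tenant'))",
];

// Runs `work` inside a transaction whose tenant setting is scoped to that
// transaction, so a pooled connection does not carry it to the next caller.
async function withTenant(client, tenantId, work) {
  if (typeof tenantId !== "string" || tenantId.length === 0) {
    throw new Error("tenantId is required");
  }
  await client.query("BEGIN");
  try {
    // Third argument `true` means transaction-local, like SET LOCAL.
    await client.query(
      "SELECT set_config('app.current_tenant', $1, true)",
      [tenantId]
    );
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

A handler could call withTenant(client, req.user.tenantId, (c) => c.query("SELECT * FROM reports WHERE tenant_id = $1", [req.user.tenantId])), keeping the application-level filter in place while the policy acts as the second layer. The application role should be neither a superuser nor a BYPASSRLS role, or the policy has no effect.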

Multi-Tenancy Defence Patterns

  1. Always derive tenant context from authenticated session, never from user input: The tenant identifier should flow from the verified JWT or session cookie, not from URL parameters or request headers that users control.
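  2. Apply RLS as a database-level backstop: Implement PostgreSQL RLS with FORCE ROW LEVEL SECURITY on all tenant-scoped tables. Set the tenant context at the start of each transaction via SET LOCAL app.current_tenant = ... (SET LOCAL lasts only until the end of the current transaction, so a pooled connection does not carry the value over to the next request) and build policies that read this setting with current_setting('app.current_tenant').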
  3. Include tenant ID in every cache key, without exception: Implement cache key construction as a central utility function that always prepends the tenant ID. Code-review every direct cache key string for missing tenant scope.
  4. Explicitly pass tenant context to all background jobs: Include the tenant ID as a required field in every job payload. Reject jobs in workers that do not include a tenant ID. Treat tenant-scoped operations without a tenant ID as a programming error, not a default behaviour.
  5. Run cross-tenant access tests as part of your test suite: Write integration tests that create data in tenant A and verify that authenticated users from tenant B cannot access it through every API endpoint. These tests should run in CI on every PR.
  6. Alert on cross-tenant query patterns: Instrument your database layer to detect and alert on queries against tenant-scoped tables that do not include a tenant filter. This may be a logging or monitoring layer that flags the calling code for review.
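The monitoring idea in item 6 could start as a thin wrapper around the database client. The sketch below is a heuristic only, under stated assumptions: the table list, the function names, and the onViolation callback are hypothetical, and the text match covers only statements that begin with SELECT, UPDATE, or DELETE and filter with tenant_id = ...; it will miss CTEs and other query shapes that a real SQL parser would catch.

```javascript
// Hypothetical list of tables that must never be queried without a tenant filter.
const MONITORED_TABLES = ["reports", "invoices"];

// Heuristic: true when a SELECT/UPDATE/DELETE mentions a monitored table
// but contains no "tenant_id =" comparison.
function isUnscopedTenantQuery(sql) {
  const normalized = sql.toLowerCase().replace(/\s+/g, " ").trim();
  if (!/^(select|update|delete)\b/.test(normalized)) {
    return false;
  }
  const touchesTenantTable = MONITORED_TABLES.some((table) =>
    new RegExp(`\\b${table}\\b`).test(normalized)
  );
  const hasTenantFilter = /\btenant_id\s*=/.test(normalized);
  return touchesTenantTable && !hasTenantFilter;
}

// Wraps a client exposing query(text, values). Suspicious queries are
// reported with a stack trace so the calling code can be reviewed;
// the query itself is still executed unchanged.
function withTenantFilterMonitoring(db, onViolation) {
  return {
    query(text, values) {
      if (isUnscopedTenantQuery(text)) {
        onViolation({ text, stack: new Error("unscoped tenant query").stack });
      }
      return db.query(text, values);
    },
  };
}
```

Reporting rather than blocking keeps the wrapper safe to deploy while false positives are being tuned. Legitimate cross-tenant jobs would need an explicit, reviewed exemption.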