Why row-level security and application-layer RBAC are necessary but not sufficient for multi-tenant clinical AI, and the isolation architecture that health system security teams actually audit.
The Pen Tester Got Cross-Tenant Data Anyway.
Standard SaaS controls protect structured data, not LLM context windows, embeddings, or inference logs.
Healthcare AI introduces three new PHI exposure surfaces that can leak data across tenants without proper isolation.
Shared retrieval indexes or cached contexts can put one tenant's PHI into another tenant's prompt. Standard access logs do not catch it because no database query crossed tenants.
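One way to picture the cached-context case is a cache whose key always includes the tenant. The following is a minimal sketch under stated assumptions, not a description of any particular product: the class `TenantScopedCache`, its methods, and the tenant names are hypothetical, and the cached strings are synthetic.

```python
import hashlib


class TenantScopedCache:
    """Hypothetical context cache whose keys always include the tenant ID."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(tenant_id: str, prompt: str) -> str:
        # Length-prefix the tenant ID so ("ab", "c") and ("a", "bc") cannot collide.
        material = f"{len(tenant_id)}:{tenant_id}|{prompt}".encode("utf-8")
        return hashlib.sha256(material).hexdigest()

    def get(self, tenant_id: str, prompt: str):
        # Returns None on a miss, including a miss caused by a different tenant.
        return self._store.get(self._key(tenant_id, prompt))

    def put(self, tenant_id: str, prompt: str, context: str) -> None:
        self._store[self._key(tenant_id, prompt)] = context


cache = TenantScopedCache()
cache.put("hospital_a", "summarize latest labs", "synthetic context for tenant A")

# The identical prompt from another tenant misses the cache instead of
# receiving tenant A's context.
hit = cache.get("hospital_a", "summarize latest labs")
miss = cache.get("hospital_b", "summarize latest labs")
```

A cache keyed on the prompt alone would have served tenant A's context to tenant B here, and no database query would have crossed tenants.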
Without tenant metadata filtering at query time, semantic search returns nearest-neighbor chunks from any tenant whose data was embedded into the index.
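The effect of a missing filter can be shown with a toy in-memory index. This is an illustrative sketch only: the `Chunk` type, the `search` function, the two-dimensional vectors, and the tenant names are assumptions made for the example, and the data is synthetic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str
    vector: tuple


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vec, k, tenant_id=None):
    # tenant_id=None reproduces the unsafe behavior: every tenant's chunks compete.
    candidates = [c for c in index if tenant_id is None or c.tenant_id == tenant_id]
    ranked = sorted(candidates, key=lambda c: cosine(c.vector, query_vec), reverse=True)
    return ranked[:k]


# Synthetic shared index holding chunks from two tenants.
INDEX = [
    Chunk("hospital_a", "A: synthetic note about metformin", (0.9, 0.1)),
    Chunk("hospital_b", "B: synthetic note about metformin", (0.95, 0.05)),
    Chunk("hospital_a", "A: synthetic note about imaging", (0.1, 0.9)),
]

query = (1.0, 0.0)  # a query issued on behalf of hospital_a
unfiltered = search(INDEX, query, k=1)
filtered = search(INDEX, query, k=1, tenant_id="hospital_a")
```

In this toy data the unfiltered search ranks hospital_b's chunk first because it is the nearest neighbor, while the filtered search only ever considers hospital_a's chunks.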
Shared logging infrastructure without tenant partitioning returns PHI from multiple tenants on unscoped queries. The same telemetry stack that worked for non-PHI SaaS becomes a HIPAA liability.
Use per-tenant vector indexes, or strict metadata filters enforced below the application layer, and audit-log every retrieval context. With a separate index per tenant, cross-tenant retrieval is ruled out by architecture rather than by configuration.
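A rough sketch of the per-tenant index pattern with retrieval auditing follows. Everything named here (`TenantIndex`, `IsolatedRetriever`, the audit record fields) is hypothetical, and a simple keyword match stands in for vector similarity to keep the example short.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TenantIndex:
    tenant_id: str
    chunks: dict = field(default_factory=dict)  # chunk_id -> text


class IsolatedRetriever:
    """Hypothetical retriever holding one physically separate index per tenant."""

    def __init__(self):
        self._indexes = {}
        self.audit_log = []

    def add(self, tenant_id, chunk_id, text):
        index = self._indexes.setdefault(tenant_id, TenantIndex(tenant_id))
        index.chunks[chunk_id] = text

    def retrieve(self, tenant_id, user, term):
        # Only the caller's own index is ever opened; there is no shared
        # index in which a filter could be forgotten.
        index = self._indexes.get(tenant_id)
        if index is None:
            raise KeyError(f"no index for tenant {tenant_id!r}")
        # Keyword match is a stand-in for nearest-neighbor search.
        hits = {cid: t for cid, t in index.chunks.items() if term.lower() in t.lower()}
        # Record which chunks entered the prompt context, not their text.
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "tenant_id": tenant_id,
            "user": user,
            "chunk_ids": sorted(hits),
        })
        return list(hits.values())


retriever = IsolatedRetriever()
retriever.add("hospital_a", "a-1", "synthetic discharge summary A")
retriever.add("hospital_b", "b-1", "synthetic discharge summary B")
results = retriever.retrieve("hospital_a", "dr_smith", "discharge")
```

Logging chunk IDs rather than chunk text keeps the audit trail itself from becoming another store of PHI.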
Vector queries enforce tenant filters at the database layer. Application bugs cannot bypass the filter; the database refuses to return cross-tenant embeddings.
Partition inference traces by tenant and scrub PHI before storage. Use per-tenant fine-tuning or differential privacy so that data the model memorized from one tenant cannot leak into another tenant's queries.
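As a sketch of the logging side, the snippet below scrubs traces before they are stored and keeps a separate partition per tenant. The names `scrub`, `PartitionedTraceLog`, and `PHI_PATTERNS` are hypothetical, and the two regular expressions are illustrative assumptions only; real PHI detection needs far more than pattern matching.

```python
import re
from collections import defaultdict

# Illustrative patterns only; these do not amount to de-identification.
PHI_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE), "[MRN]"),
]


def scrub(text: str) -> str:
    """Replace anything matching a known pattern with a placeholder token."""
    for pattern, token in PHI_PATTERNS:
        text = pattern.sub(token, text)
    return text


class PartitionedTraceLog:
    """Hypothetical inference log with one partition per tenant."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def write(self, tenant_id: str, prompt: str, completion: str) -> None:
        # Scrubbing happens before storage, so raw PHI never reaches the log.
        self._partitions[tenant_id].append(
            {"prompt": scrub(prompt), "completion": scrub(completion)}
        )

    def read(self, tenant_id: str) -> list:
        # Reads require a tenant ID; no method returns records across tenants.
        return list(self._partitions.get(tenant_id, []))


trace_log = PartitionedTraceLog()
trace_log.write("hospital_a", "Patient MRN: 12345, SSN 123-45-6789", "Summary generated.")
tenant_a_traces = trace_log.read("hospital_a")
tenant_b_traces = trace_log.read("hospital_b")
```

Because `read` has no unscoped form, the unscoped query that returns PHI from multiple tenants cannot be expressed against this interface.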
Tenant-scoped retrieval and partitioned inference logs make cross-tenant PHI access impossible by architecture, not merely unlikely.
RBAC and row-level security protect structured data. They do not isolate LLM context windows, vector embeddings, or inference logs, three new surfaces that can carry PHI in any RAG or fine-tuned-model architecture.
PHI leaked from an inference log is treated as the same violation as PHI leaked from a database. Intent and likelihood of access do not reduce liability. The architecture must make cross-tenant access impossible, not unlikely.
LLMs can memorize training data, including PHI, and reproduce it under certain queries. Mitigation requires per-tenant fine-tuning, differential privacy, or federated learning approaches.