A scalability playbook for VPs of Engineering whose platform is hitting limits — caching, async, partitioning, and the targeted database surgery that buys orders of magnitude of headroom without a year-long rebuild.
Replatforming would take a year you don't have.
Scaling a real estate platform is not just about handling more requests. Sunday open-house traffic, listing-launch spikes, and tour-booking peaks each stress a different layer — and the layer that breaks is rarely the one the team is watching.
The wrong answer is to size everything for the peak. The platform becomes expensive at idle, brittle under change, and still fails on the workloads that actually need a different architecture.
The trap is shipping scalability changes without testing them under realistic load. Optimizations that look right on a dev box behave differently under production traffic shapes, and the wrong one moves the bottleneck instead of removing it.
In most read-heavy workloads, 60 to 80 percent of traffic could be served from cache. Listing pages, search results, and comparable property analyses are the usual candidates: the largest load reductions land here, with the lowest risk.
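As a rough illustration of the caching idea above, here is a minimal cache-aside sketch with a time-to-live. Everything named here (`TTLCache`, `ListingDB`, `get_listing`) is hypothetical, and the in-process dictionary only stands in for a shared cache such as Redis or Memcached. The arithmetic is the point: if 70 percent of reads hit the cache, the database sees only the remaining 30 percent.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process cache-aside helper; a stand-in for a shared cache."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, or call loader() and cache its result."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key, for example after the listing is edited."""
        self._entries.pop(key, None)


class ListingDB:
    """Hypothetical slow data source that counts how often it is read."""

    def __init__(self) -> None:
        self.reads = 0

    def fetch(self, listing_id: str) -> dict:
        self.reads += 1
        return {"id": listing_id, "status": "active"}


def get_listing(cache: TTLCache, db: ListingDB, listing_id: str) -> dict:
    # Cache-aside: check the cache first, fall back to the database.
    return cache.get_or_load(f"listing:{listing_id}",
                             lambda: db.fetch(listing_id))
```

The TTL bounds how stale a listing page can be; explicit invalidation on write is the usual complement when staleness matters more than simplicity.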
Slow operations on a synchronous request path are scalability killers. We move them off the request path (email sending, search indexing, analytics events, third-party API calls) so user-facing latency is no longer set by the slowest dependency in the chain.
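A minimal sketch of moving slow work off the request path, assuming an in-memory queue and one worker thread. `BackgroundWorker` and `book_tour` are hypothetical names, and a production system would more likely use a durable queue or message broker so that jobs survive a restart.

```python
import queue
import threading
from typing import Callable, List, Tuple


class BackgroundWorker:
    """Runs queued jobs on a separate thread, off the request path."""

    def __init__(self, handler: Callable[[str, dict], None]) -> None:
        self._jobs: queue.Queue = queue.Queue()
        self._handler = handler
        self.errors: List[Tuple[str, Exception]] = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def enqueue(self, kind: str, payload: dict) -> None:
        """Cheap call made from the request handler."""
        self._jobs.put((kind, payload))

    def _run(self) -> None:
        while True:
            job = self._jobs.get()
            try:
                if job is None:  # sentinel: shut down
                    return
                kind, payload = job
                self._handler(kind, payload)
            except Exception as exc:  # record failures, keep the worker alive
                self.errors.append((job[0], exc))
            finally:
                self._jobs.task_done()

    def drain(self) -> None:
        """Block until every queued job has been processed."""
        self._jobs.join()

    def stop(self) -> None:
        self._jobs.put(None)
        self._thread.join()


def book_tour(worker: BackgroundWorker, tour_id: str, email: str) -> dict:
    """Hypothetical request handler: respond first, do slow work later."""
    worker.enqueue("send_email", {"to": email, "tour_id": tour_id})
    worker.enqueue("index_search", {"tour_id": tour_id})
    return {"status": "confirmed", "tour_id": tour_id}
```

The handler returns as soon as the jobs are queued, so the response time no longer includes the email provider or the search indexer.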
Single-instance databases at scale eventually hit single-machine limits. Partitioning by tenant, by region, or by entity ID buys orders of magnitude of headroom and clears the path for the next 5x without a rebuild.
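One way to picture partitioning by tenant is a router that maps each tenant ID to a shard with a stable hash. This is a sketch under assumptions: `shard_for`, `ShardRouter`, and the connection strings are hypothetical, and hash-modulo routing is only one option (a lookup directory or consistent hashing moves fewer tenants when the shard count changes).

```python
import hashlib
from typing import List


def shard_for(key: str, shard_count: int) -> int:
    """Map a partition key to a shard index using a stable hash."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to keep routing stable.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardRouter:
    """Chooses the database connection string for a given tenant."""

    def __init__(self, dsns: List[str]) -> None:
        if not dsns:
            raise ValueError("at least one shard is required")
        self.dsns = list(dsns)

    def dsn_for_tenant(self, tenant_id: str) -> str:
        return self.dsns[shard_for(tenant_id, len(self.dsns))]


if __name__ == "__main__":
    # Hypothetical shard addresses, for illustration only.
    router = ShardRouter([f"postgres://shard{i}.internal/app" for i in range(4)])
    print(router.dsn_for_tenant("brokerage-123"))
```

Every query for a tenant lands on the same shard, so no single machine has to hold the whole data set.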
Capture production traffic profiles and replay them at multiples of normal volume. Run live migrations using logical replication or dual-write patterns so the partitioning lands without downtime.
If your platform is hitting limits and replatforming is not an option this year, the answer is a scalability program targeting caching, async, partitioning, and database surgery.
Caching helps less with write-heavy workloads; partitioning helps more. The mix of techniques shifts, but the framework still applies.
Partitioning is a live migration. We use logical replication or dual-write patterns to avoid downtime.
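The dual-write pattern mentioned above can be sketched roughly as follows, with plain dictionaries standing in for the old database and the new partitioned store. `DualWriteStore` and its methods are hypothetical, and a real migration also has to handle a failed second write, concurrent updates during backfill, and rollback.

```python
from typing import Any, Dict, List


class DualWriteStore:
    """Writes go to both stores; reads stay on the old one until cutover."""

    def __init__(self, old: Dict[str, Any], new: Dict[str, Any]) -> None:
        self.old = old
        self.new = new
        self.read_from_new = False

    def put(self, key: str, value: Any) -> None:
        # The old store remains the source of truth during the migration.
        self.old[key] = value
        self.new[key] = value

    def get(self, key: str) -> Any:
        primary = self.new if self.read_from_new else self.old
        return primary[key]

    def backfill(self) -> int:
        """Copy rows written before dual-writing began. Returns rows copied."""
        copied = 0
        for key, value in self.old.items():
            if key not in self.new:
                self.new[key] = value
                copied += 1
        return copied

    def verify(self) -> List[str]:
        """Return keys whose values differ or are missing in the new store."""
        return sorted(k for k, v in self.old.items()
                      if k not in self.new or self.new[k] != v)

    def cut_over(self) -> None:
        """Switch reads to the new store only once the two stores agree."""
        mismatches = self.verify()
        if mismatches:
            raise RuntimeError(f"stores differ on {len(mismatches)} keys")
        self.read_from_new = True
```

The sequence is: start dual-writing, backfill history, verify, then cut reads over. The application keeps serving traffic at every step.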
Replatforming swaps the architecture. This program targets specific bottlenecks in the current architecture, which still has capacity it has not exhausted. It buys two to three years of headroom and defers a replatform until the business case actually justifies it.
Synthetic traffic shaped like production. We capture production traffic profiles and replay them at multiples of normal volume.
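To make "replay at multiples of normal volume" concrete, here is one possible sketch: each captured request is emitted `multiplier` times with a small random time offset, so the replay keeps the shape of the original traffic while carrying more of it. `RecordedRequest`, `scale_profile`, and the jitter approach are assumptions for illustration, not a description of any particular load-testing tool.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecordedRequest:
    offset_s: float  # seconds since the start of the capture window
    method: str
    path: str


def scale_profile(records: List[RecordedRequest], multiplier: int,
                  jitter_s: float = 0.5, seed: int = 0) -> List[RecordedRequest]:
    """Return a replay schedule with `multiplier` times the captured volume.

    Copies are jittered by up to +/- jitter_s so they do not all fire at
    the same instant; peaks stay where they were in the capture.
    """
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    rng = random.Random(seed)  # seeded so a test run is repeatable
    scaled: List[RecordedRequest] = []
    for rec in records:
        scaled.append(rec)  # keep the original request as captured
        for _ in range(multiplier - 1):
            offset = max(0.0, rec.offset_s + rng.uniform(-jitter_s, jitter_s))
            scaled.append(RecordedRequest(offset, rec.method, rec.path))
    scaled.sort(key=lambda r: r.offset_s)
    return scaled


if __name__ == "__main__":
    captured = [
        RecordedRequest(0.0, "GET", "/listings/42"),
        RecordedRequest(10.0, "GET", "/search?q=3br"),
        RecordedRequest(10.2, "POST", "/tours"),
    ]
    for req in scale_profile(captured, multiplier=3):
        print(f"{req.offset_s:6.2f}s {req.method} {req.path}")
```

Replaying writes such as `POST /tours` against a test environment needs care (idempotency, test accounts), which this sketch leaves out.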
Net cost typically goes down at peak and stays flat at idle. Caching reduces the database tier; async smooths the compute curve; partitioning replaces a vertically scaled box with several smaller ones.