Warehouse-grade governance and transactions on cheap object storage, with no duplication. This guide lays out the architecture, the open table formats, the medallion pattern, the economics, and a clean migration path.
Run a warehouse and a data lake side by side, copy data between them, pay for storage and compute twice, and reconcile two versions of the truth.
Layer open table formats and a catalog over object storage so one governed copy serves every engine.
Open table formats add ACID transactions, schema evolution, and time travel on top of plain files in object storage.
A central catalog handles schema, permissions, discovery, and lineage across engines.
Object storage is cheap, durable, and effectively unlimited in scale, and it keeps storage decoupled from compute.
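To make the table-format guarantees concrete, here is a minimal sketch of the idea behind them: an append-only log of immutable snapshots, each listing the data files and columns that make up one version of the table. This is a toy in-memory model, not the Iceberg or Delta Lake API; `ToyTable`, `Snapshot`, and the file names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """One immutable table version: its columns and the files it points at."""
    snapshot_id: int
    columns: tuple[str, ...]
    data_files: tuple[str, ...]


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table format's metadata log."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def head_id(self) -> int:
        # 0 means "no snapshot committed yet".
        return self.snapshots[-1].snapshot_id if self.snapshots else 0

    def commit(self, expected_head, columns, data_files) -> Snapshot:
        """Publish a new version only if nobody else committed since we read."""
        if expected_head != self.head_id():
            raise RuntimeError("commit conflict: table changed since it was read")
        snap = Snapshot(self.head_id() + 1, tuple(columns), tuple(data_files))
        self.snapshots.append(snap)
        return snap

    def as_of(self, snapshot_id) -> Snapshot:
        """Time travel: return the table exactly as it was at a past version."""
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap
        raise KeyError(snapshot_id)


table = ToyTable()
v1 = table.commit(0, ["order_id", "amount"], ["part-0001.parquet"])

# Schema evolution: the next version adds a column without rewriting old files.
v2 = table.commit(
    v1.snapshot_id,
    ["order_id", "amount", "currency"],
    ["part-0001.parquet", "part-0002.parquet"],
)

# A writer still working from version 1 is rejected instead of clobbering v2.
try:
    table.commit(v1.snapshot_id, ["order_id"], ["part-0003.parquet"])
    conflict = False
except RuntimeError:
    conflict = True
```

The data files never change in place; only the pointer to the current snapshot moves. That is what lets plain files behave like a transactional table, and why older versions stay queryable through `as_of`.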
For most enterprises, choose Iceberg for vendor neutrality and broad engine support across Snowflake, AWS, BigQuery, and Databricks. Decide once; it is the foundation everything else sits on.
Object storage plus the table format plus a governance catalog is the lakehouse core.
Bronze preserves raw, append-only source data as the system of record and replay point.
Move analytics onto the lakehouse one use case at a time, stop copying data into a separate warehouse, then retire the redundant warehouse-plus-lake copies.
The pattern is settled: Iceberg, the medallion layers, and decoupled compute. The market has voted. The work now is migrating cleanly and governing well.
The warehouse is not dead, but it is increasingly redundant. The lakehouse gives warehouse governance and transactions without warehouse cost or data duplication, which is why 70% expect most analytics to move to it within three years.
The medallion pattern refines data in three stages: Bronze (raw), Silver (cleaned), Gold (business-ready). It preserves raw truth, enforces quality, and shapes data for consumers, so the lake never becomes a swamp.
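The three layers can be sketched in a few lines of plain Python. This is an illustration only: the order records, the field names, and the `to_silver` / `to_gold` helpers are hypothetical, and a real pipeline would run on an engine against tables in the lakehouse rather than on in-memory lists.

```python
import json

# Bronze: raw source payloads kept exactly as received, append-only.
bronze = [
    '{"order_id": 1, "amount": "19.99", "region": "EU"}',
    '{"order_id": 2, "amount": "oops", "region": "EU"}',   # bad amount
    '{"order_id": 1, "amount": "19.99", "region": "EU"}',  # duplicate
    '{"order_id": 3, "amount": "5.00", "region": "US"}',
]


def to_silver(raw_rows):
    """Silver: parse, validate, and deduplicate; bad rows are dropped here."""
    seen_ids = set()
    clean = []
    for raw in raw_rows:
        try:
            record = json.loads(raw)
            order_id = record["order_id"]
            amount = float(record["amount"])
            region = record["region"]
        except (KeyError, TypeError, ValueError):
            continue  # fails the quality gate; the raw row remains in Bronze
        if order_id in seen_ids:
            continue
        seen_ids.add(order_id)
        clean.append({"order_id": order_id, "amount": amount, "region": region})
    return clean


def to_gold(silver_rows):
    """Gold: shape cleaned data for consumers, here revenue per region."""
    totals = {}
    for row in silver_rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
```

Because Bronze is never modified, a bug in `to_silver` can be fixed and the downstream layers rebuilt by replaying the raw rows.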
Start with the format and the core: pick Iceberg, stand up object storage plus a catalog, then build the medallion layers before migrating workloads off the warehouse-plus-lake split.
For most enterprises starting now, choose Iceberg, because it is engine-neutral with broad vendor alignment. Delta Lake is excellent if you are committed to the Databricks ecosystem.
Storage runs $30-50 per TB versus $500-2000 for a warehouse, and 56% of adopters report saving more than 50% on analytics overall, with nearly 30% of large enterprises expecting savings above 75%.
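The storage figures above can be turned into a quick back-of-the-envelope comparison. The sketch below uses only the per-TB ranges quoted in this guide; the 100 TB footprint is an assumed example, the billing period is whatever period those quoted prices apply to, and the result covers storage only, not compute.

```python
# Per-TB price ranges as quoted in this guide (low, high), in dollars.
LAKEHOUSE_PER_TB = (30, 50)
WAREHOUSE_PER_TB = (500, 2000)


def storage_cost_range(terabytes, per_tb_range):
    """Return the (low, high) storage cost for a given footprint."""
    low, high = per_tb_range
    return terabytes * low, terabytes * high


footprint_tb = 100  # assumed example footprint, not a figure from the guide

lake_low, lake_high = storage_cost_range(footprint_tb, LAKEHOUSE_PER_TB)
wh_low, wh_high = storage_cost_range(footprint_tb, WAREHOUSE_PER_TB)

# Most conservative saving: priciest lakehouse versus cheapest warehouse.
min_saving = 1 - lake_high / wh_low
# Most optimistic saving: cheapest lakehouse versus priciest warehouse.
max_saving = 1 - lake_low / wh_high

print(f"Lakehouse storage: ${lake_low:,} to ${lake_high:,}")
print(f"Warehouse storage: ${wh_low:,} to ${wh_high:,}")
print(f"Storage saving: {min_saving:.1%} to {max_saving:.1%}")
```

Even at the conservative end of these ranges the storage line shrinks by roughly an order of magnitude. Overall analytics savings are lower than that because compute and operations are not included here.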