LS LOGICIEL SOLUTIONS

ELT: Real Examples & Use Cases

Definition

ELT stands for Extract, Load, Transform: the pattern of pulling data from sources, loading it raw into a target (typically a cloud warehouse or lakehouse), and then transforming it in place using the target's own compute. Inverting the order of the last two steps relative to ETL is more than a sequencing change. It moves the transformation work onto the same engine that serves queries, eliminates a separate transformation infrastructure layer, and changes who owns and writes transformations. Real examples reveal what production ELT stacks actually look like, which workloads benefit from the pattern, and where ELT's strengths become weaknesses.

The pattern became practical when cloud warehouses gave teams elastic, cheap compute on the warehouse itself. With compute that scales on demand, there was little reason to maintain a separate transformation server: transformations could run as SQL inside the warehouse. The raw source data could land first; transformations could be developed iteratively against the loaded data; and the same engine that produced the modeled tables also served queries against them.
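The ordering described above can be sketched in a few lines. This is a minimal illustration, assuming an in-memory SQLite database as a stand-in for the warehouse; the table names (`raw_orders`, `orders`) and the sample rows are hypothetical. The point is only that the data lands untouched first and the cleanup runs afterwards as SQL on the same engine.

```python
import sqlite3

# Hypothetical source rows, standing in for what an ingestion tool would extract.
source_rows = [
    ("1001", "2026-01-05", "49.90", "shipped"),
    ("1002", "2026-01-05", "15.00", "CANCELLED"),
    ("1003", "2026-01-06", "120.00", "Shipped"),
]

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the warehouse

# Extract + Load: land the data as-is, every column as text, no cleanup yet.
conn.execute(
    "CREATE TABLE raw_orders (order_id TEXT, order_date TEXT, amount TEXT, status TEXT)"
)
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", source_rows)

# Transform: plain SQL, executed by the same engine that later serves queries.
conn.execute(
    """
    CREATE TABLE orders AS
    SELECT
        CAST(order_id AS INTEGER) AS order_id,
        order_date,
        CAST(amount AS REAL)      AS amount,
        LOWER(status)             AS status
    FROM raw_orders
    """
)

# Query the modeled table; the raw table is still there for later re-modeling.
shipped_total = conn.execute(
    "SELECT ROUND(SUM(amount), 2) FROM orders WHERE status = 'shipped'"
).fetchone()[0]
print(shipped_total)
```

Because `raw_orders` is kept, the transformation can be rewritten and rerun without going back to the source system, which is what makes iterative development against loaded data possible.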

The category in 2026 is dominated by a recognizable stack pattern. An ingestion tool (Fivetran, Airbyte, Stitch, Hevo, or custom) handles the extract-and-load. dbt or a similar tool handles the transform. A cloud warehouse or lakehouse hosts everything. An orchestrator (Airflow, Dagster, Prefect, or the ingestion tool's built-in scheduler) coordinates. A BI tool sits on top. The pattern is so common that it functions as the default architecture for analytics.

What separates ELT from a hypothetical "load everything and figure it out later" approach is the discipline around the transformation layer. dbt brought software engineering practices to SQL transformations: version control, code review, testing, documentation, dependency tracking. Without that discipline, ELT becomes the source of the data swamp problems that lakes were criticized for. With it, ELT produces maintainable, reliable analytics infrastructure.

This page surveys real ELT implementations across analytics-engineering-led teams, the operational realities of the pattern at scale, and where the pattern's limitations show up. The tooling has consolidated significantly; the architectural pattern is mature and widely understood.

Key Takeaways

  • ELT loads raw data into the target first, then transforms it using the target's own compute.
  • The pattern became practical when cloud warehouses provided elastic compute that could absorb both transformation and query workloads.
  • The dominant 2026 stack is Fivetran or Airbyte for ingestion, dbt for transformation, cloud warehouse for hosting, BI tool for consumption.
  • ELT lets analytics engineers own transformations in SQL with software engineering practices, removing the need for specialized ETL skills.
  • The pattern's limitations show up at scale, in regulated contexts, and when transformations need capabilities beyond what SQL provides easily.

The Dominant ELT Stack in Production

The Snowflake plus Fivetran plus dbt combination shows up at thousands of companies. Fivetran handles the ingestion connectors for SaaS sources, operational databases, and event streams. Data lands raw in Snowflake. dbt transforms the raw data into modeled tables that BI tools consume. The pattern is so common that vendor marketing collateral and conference talks usually assume it as the baseline.

The BigQuery variant swaps Snowflake for Google's warehouse, often with Fivetran or with GCP-native ingestion services. The transformation layer is still typically dbt. The BI layer is often Looker (also Google-owned). The pattern is functionally identical to the Snowflake version with different vendors filling the same roles.

The Redshift variant swaps in AWS's warehouse. AWS-native ingestion options (DMS, Glue, AppFlow) compete with Fivetran in this stack. The transformation layer is dbt or AWS-native services like DataBrew. The pattern fits AWS-committed shops.

The Databricks SQL variant runs the same pattern on the Databricks platform with Delta tables. Ingestion uses the same connector tools; transformation can use dbt or Databricks-native tools. The pattern fits companies that wanted unified data engineering and analytics on one platform.

Smaller teams sometimes pick simpler combinations: Hevo or Airbyte for ingestion, the cloud warehouse's free tier or MotherDuck for hosting, dbt Cloud's free plan for transformations, Metabase for BI. The pattern has the same shape at smaller scale.

Companies Running ELT at Scale

GitLab publishes its entire data stack documentation publicly. The pattern is Fivetran for ingestion into Snowflake, dbt for transformations, Sisense for BI. The documentation reads like a textbook ELT implementation; many companies have adopted similar patterns based on the GitLab handbook.

DoorDash, Wayfair, Shopify, and Cazoo (since defunct as a public company, though its data team carried the patterns forward) have all run variations of warehouse-plus-dbt ELT stacks. The pattern scales from startup to large enterprise without architectural change; the volumes grow and the team grows around the same structure.

Casper, Glossier, and similar direct-to-consumer e-commerce companies adopted ELT stacks heavily during 2020-2023. The combination of fast iteration needs and limited dedicated data engineering capacity made ELT's analytics-engineering-led model fit well. The pattern persists at the survivors.

Many companies emerging from the 2020-2022 SaaS boom defaulted to ELT stacks. The pattern was the obvious choice for new builds, the tooling was mature enough to be productive quickly, and the skills required (SQL, dbt, BI) were broadly available in the analytics community.

Larger enterprises have adopted ELT for newer initiatives even while maintaining legacy ETL elsewhere. The split stacks coexist for years; new projects use ELT; old projects continue on legacy platforms. The migration of old projects is multi-year if it happens at all.

What ELT Changes Operationally

Who owns transformations shifts. In an ETL world, data engineers in a central team wrote transformations in specialized tools. In an ELT world, analytics engineers (or analysts with engineering practices) write transformations in SQL. The accessibility shifts ownership closer to the people who understand the business meaning of the data.

How transformations get developed shifts. dbt-style development uses CI, version control, and code review. Transformations get developed iteratively against the loaded data. Changes are reviewed in pull requests. Tests run on every change. The practice borrows from software engineering and produces more maintainable transformations than the GUI-based development typical of legacy ETL.

Where transformations run shifts. The cloud warehouse handles both the transformation execution and the downstream query execution. There is no separate transformation infrastructure to provision, monitor, scale, or operate. The simplification reduces the operational footprint significantly.

How transformations are documented shifts. dbt generates documentation from the model definitions and tests. The documentation is part of the code rather than a separate artifact that drifts. The pattern produces documentation that actually reflects current reality.

How lineage is tracked shifts. dbt derives lineage automatically from the references between models declared in the SQL. Tools layer on top to extend the lineage upstream into ingestion and downstream into BI. The lineage is a free byproduct of the development pattern rather than a separate effort.
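To make the "free byproduct" idea concrete, here is a rough sketch of how a dependency graph can fall out of the model SQL itself. It is not how dbt is implemented: dbt renders Jinja templates properly, whereas this sketch uses a regular expression, and the model names and SQL are hypothetical. Only the `ref(...)` and `source(...)` call syntax is borrowed from dbt.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical models keyed by name; the SQL bodies are illustrative only.
models = {
    "stg_orders": "select * from {{ source('shop', 'orders') }}",
    "stg_customers": "select * from {{ source('shop', 'customers') }}",
    "int_orders_enriched": """
        select o.*, c.region
        from {{ ref('stg_orders') }} o
        join {{ ref('stg_customers') }} c using (customer_id)
    """,
    "fct_revenue": """
        select region, sum(amount) as revenue
        from {{ ref('int_orders_enriched') }}
        group by 1
    """,
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def build_lineage(model_sql):
    """Map each model to the set of models it references via ref()."""
    return {name: set(REF_PATTERN.findall(sql)) for name, sql in model_sql.items()}


lineage = build_lineage(models)

# TopologicalSorter takes node -> predecessors, so upstream models come first.
build_order = list(TopologicalSorter(lineage).static_order())
print(build_order)
```

The same graph answers both questions a team usually asks of lineage: what order to build models in, and which downstream models are affected when an upstream one changes.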

Where ELT's Limitations Show Up

Pre-load transformation requirements break the pattern. If data needs to be validated, de-identified, or transformed before landing in the target for governance reasons, the ELT ordering does not work. The pattern requires that raw source data be acceptable to land in the target, which is not always true.

Very large transformation workloads can hit warehouse cost ceilings. Transformations that scan petabytes daily produce large warehouse bills. ETL on dedicated transformation infrastructure can sometimes be cheaper at these scales because the cost model is different. The trade-off depends on workload patterns and pricing specifics.

Complex transformations that SQL handles awkwardly are another limit. Some transformations are easier in Python or Scala than in SQL. Modern dbt supports Python models, but the experience is less mature than pure SQL development. For very complex algorithmic transformations, a separate processing engine still wins.

Real-time requirements push beyond what most ELT stacks handle. ELT is typically batch-oriented with intervals from minutes to hours. Sub-minute freshness usually requires streaming patterns that look more like ETL or specialized streaming architectures. ELT can be made fast, but it pushes against the pattern's natural fit.

Regulated data with strict access controls requires careful handling. Raw data in the warehouse needs access controls so that analysts who can see modeled data cannot necessarily see raw PII. The pattern is workable but adds complexity that ETL with pre-load de-identification avoids.

Patterns That Make ELT Work Well

Staging conventions that separate raw landings from cleaned and modeled data. Most dbt projects use staging, intermediate, and mart layers. Raw data stays in dedicated schemas. Cleaned and standardized data lives in staging models. Business-modeled data lives in marts. The convention is so widely adopted it functions as a standard.

Tests on every model that catch the common failure modes. dbt's built-in tests (unique, not_null, relationships, accepted_values) catch a lot of issues. Custom tests catch business-specific invariants. The tests run on every pipeline execution; failures alert and can block deployment.
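The four built-in test types can each be thought of as a query that returns the offending rows, where zero rows means the test passes. The sketch below illustrates that idea with hand-written SQL against an in-memory SQLite database. It is an approximation of the concept, not dbt's generated SQL; the tables, the sample data, and the `run_checks` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the warehouse
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);

    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, status TEXT);
    INSERT INTO orders VALUES
        (10, 1, 'shipped'),
        (11, 2, 'pending'),
        (11, 2, 'shipped'),    -- duplicate order_id
        (12, NULL, 'shipped'), -- missing customer_id
        (13, 9, 'refunded');   -- unknown customer and unexpected status
    """
)

# Each check selects the rows that violate the constraint.
checks = {
    "unique_order_id": """
        SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1
    """,
    "not_null_customer_id": """
        SELECT order_id FROM orders WHERE customer_id IS NULL
    """,
    "accepted_values_status": """
        SELECT order_id FROM orders
        WHERE status NOT IN ('shipped', 'pending', 'cancelled')
    """,
    "relationships_customer_id": """
        SELECT o.order_id
        FROM orders o
        LEFT JOIN customers c ON o.customer_id = c.customer_id
        WHERE o.customer_id IS NOT NULL AND c.customer_id IS NULL
    """,
}


def run_checks(connection, named_queries):
    """Return the number of failing rows per check; 0 means the check passed."""
    return {
        name: len(connection.execute(sql).fetchall())
        for name, sql in named_queries.items()
    }


failures = run_checks(conn, checks)
failed = [name for name, count in failures.items() if count > 0]
print(failures)
```

A pipeline runner would typically stop or alert when `failed` is non-empty, which is what "can block deployment" amounts to in practice.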

Documentation as part of the model definition. Models include descriptions; columns include descriptions; tests document constraints. dbt generates browsable documentation from this metadata. The documentation stays current because it lives with the code.

Macros that DRY up repeated logic. Date dimensions. Common pivots. Standard incrementality patterns. Macros let teams write reusable transformations and apply them consistently across many models.
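As a loose analogy for what a pivot macro does, the sketch below uses a Python function to generate the repetitive part of a SQL statement. dbt macros are written in Jinja rather than Python, so this shows the idea of templated SQL, not dbt's syntax; `pivot_sum`, the `orders` table, and the channel values are hypothetical.

```python
import sqlite3


def pivot_sum(column, values, measure):
    """Return one 'SUM(CASE ...)' select expression per pivot value.

    Values are interpolated directly into SQL, so they must come from
    trusted code, never from user input.
    """
    return ",\n    ".join(
        f"SUM(CASE WHEN {column} = '{value}' THEN {measure} ELSE 0 END) "
        f"AS {measure}_{value}"
        for value in values
    )


# The generated expressions drop into an otherwise ordinary query.
sql = f"""
SELECT order_date,
    {pivot_sum('channel', ['web', 'store'], 'amount')}
FROM orders
GROUP BY order_date
ORDER BY order_date
"""

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the warehouse
conn.execute("CREATE TABLE orders (order_date TEXT, channel TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2026-01-05", "web", 10),
        ("2026-01-05", "store", 5),
        ("2026-01-05", "web", 3),
        ("2026-01-06", "store", 7),
    ],
)

pivoted = conn.execute(sql).fetchall()
print(pivoted)
```

Adding a third channel means changing one list rather than copying another `CASE` expression into every model that pivots on channel, which is the consistency benefit the paragraph above describes.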

Incremental materialization for large tables. Daily rebuilds of large tables waste compute; incremental processing only handles new and changed rows. The pattern requires more careful design but pays back significantly at scale.
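One common way to implement incremental processing is a high-water mark: each run reads the latest timestamp already present in the target and processes only raw rows newer than it. The sketch below assumes an append-only raw table with a `loaded_at` column and an `event_id` key; the names, the `run_incremental` helper, and the SQLite stand-in are hypothetical, and real warehouses would typically use a merge statement instead of `INSERT OR REPLACE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the warehouse
conn.executescript(
    """
    CREATE TABLE raw_events (event_id INTEGER, loaded_at TEXT, payload TEXT);
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, loaded_at TEXT, payload TEXT);
    """
)


def run_incremental(connection):
    """Process only raw rows newer than the target's high-water mark.

    Returns the number of raw rows processed in this run. The strict '>'
    comparison would miss late rows that share the mark's timestamp; real
    pipelines often add a lookback window to cover that case.
    """
    (high_water,) = connection.execute(
        "SELECT COALESCE(MAX(loaded_at), '') FROM events"
    ).fetchone()
    (to_process,) = connection.execute(
        "SELECT COUNT(*) FROM raw_events WHERE loaded_at > ?", (high_water,)
    ).fetchone()
    # New keys are inserted; existing keys are overwritten with the newer version.
    connection.execute(
        """
        INSERT OR REPLACE INTO events (event_id, loaded_at, payload)
        SELECT event_id, loaded_at, UPPER(payload)
        FROM raw_events
        WHERE loaded_at > ?
        """,
        (high_water,),
    )
    return to_process


# First batch: two new events.
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?)",
    [(1, "2026-01-01T00:00:00", "signup"), (2, "2026-01-01T00:05:00", "login")],
)
first_run = run_incremental(conn)

# Second batch: one changed event (id 1) and one new event (id 3).
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?)",
    [(1, "2026-01-02T09:00:00", "signup_corrected"), (3, "2026-01-02T09:10:00", "purchase")],
)
second_run = run_incremental(conn)
print(first_run, second_run)
```

The second run touches two rows rather than all four in the raw table. The "more careful design" mentioned above shows up in the details: choosing a reliable timestamp column, handling late-arriving rows, and deciding how changed rows replace older versions.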

Common Failure Modes

Raw data quality problems that propagate through the pipeline. Source data was bad; the pipeline processed it without catching the issue; downstream models compute on bad input. The fix is source-level tests and quality monitoring at the staging boundary.

Sprawling model graphs that no one fully understands. Years of additions produce thousands of models with unclear relationships. The fix is periodic refactoring, naming conventions enforced from the start, and explicit ownership for each model area.

Warehouse cost runaway from inefficient transformations. Models that recompute from scratch every run; queries that scan everything when they could scan partitions; materialization choices that produce more rebuilds than necessary. The fix is performance review on the most expensive models and budgets that force conversations about cost.

Test coverage that exists in theory but does not catch real problems. Tests on unique IDs but not on business invariants that actually matter. The fix is testing what could actually go wrong, not just what is easy to test.

ELT-as-a-rebrand where teams switched from ETL platforms to dbt but kept the same operational and design problems. The pattern only delivers benefits if the team adopts the practices that come with it, not just the tooling.

Best Practices

  • Adopt staging conventions (raw, staging, intermediate, marts) that separate concerns through naming and layering.
  • Test every model with at least basic constraints (uniqueness, not-null, referential integrity) and add business-specific tests for important invariants.
  • Document models and columns as part of the model definition; documentation lives with the code, not in a separate wiki.
  • Use incremental materialization for tables large enough to justify the design overhead.
  • Monitor transformation costs and review the most expensive models periodically.

Common Misconceptions

  • "ELT is always better than ETL." The right choice depends on the workload, governance requirements, and target capabilities.
  • "ELT means no transformations." Transformation happens in the target rather than before the load, but it still happens.
  • "dbt is ELT." dbt is a transformation tool that fits ELT patterns well, but ELT is a broader architectural pattern.
  • "ELT removes the need for data engineering." The patterns and skills shift, but transformation, ingestion, and platform work remain.
  • "ELT scales infinitely because the warehouse scales." Cost growth tracks transformation volume and can become a serious problem without controls.

Frequently Asked Questions (FAQs)

When should I pick ELT over ETL?

When the target is a cloud warehouse or lakehouse with elastic compute, when transformations can be expressed in SQL or the warehouse's other languages, and when there is no governance requirement that data be transformed before loading. Most modern analytics workloads fit ELT well.

What is the role of dbt in an ELT stack?

dbt is the transformation layer. It manages the SQL models, their dependencies, tests, and documentation. The tool compiles templated SQL and submits it to the warehouse, where the transformations execute. Almost every modern ELT stack uses dbt or a close competitor (SQLMesh, Coalesce).

Do I need a separate ingestion tool?

For most teams, yes. Fivetran, Airbyte, Hevo, Stitch, or similar tools maintain hundreds of connectors that would otherwise need custom development. The exception is teams with simple source landscapes where custom Python scripts or CDC patterns cover the needs.

How does ELT handle large transformations cost-wise?

It depends on warehouse pricing and transformation efficiency. Inefficient ELT can produce large warehouse bills. Efficient ELT (incremental materialization, partition pruning, sensible query design) keeps costs manageable. Cost monitoring and periodic optimization are part of operating ELT well.

How do I handle PII in an ELT stack?

With access controls on the raw schemas plus de-identification in the staging layer. Raw PII lives in restricted schemas accessible to a small group. The staging layer applies masking or tokenization before producing data that broader audiences can access. The pattern works but requires explicit design.
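A minimal sketch of the tokenization step, assuming keyed hashing (HMAC-SHA256) as the technique: the staging table carries a stable token instead of the email address, so joins and distinct counts still work without exposing the raw value. The key, the `tokenize` function, and the table names are hypothetical, and SQLite stands in for the warehouse. Access control on the raw table is not shown; in a real warehouse it would be handled with grants on the raw schema.

```python
import hashlib
import hmac
import sqlite3

# Assumption: in practice the key would come from a secret manager, not source code.
SECRET_KEY = b"hypothetical-key-load-from-a-secret-manager"


def tokenize(value):
    """Return a stable, non-reversible token for a PII value (NULL stays NULL)."""
    if value is None:
        return None
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


conn = sqlite3.connect(":memory:")
conn.create_function("tokenize", 1, tokenize)  # expose the function to SQL

conn.executescript(
    """
    -- Raw landing table: contains PII, should be readable by a small group only.
    CREATE TABLE raw_customers (customer_id INTEGER, email TEXT, country TEXT);
    INSERT INTO raw_customers VALUES
        (1, 'ana@example.com', 'PT'),
        (2, 'bob@example.com', 'US');
    """
)

# Staging table: the email is replaced by its token before wider exposure.
conn.execute(
    """
    CREATE TABLE stg_customers AS
    SELECT customer_id, tokenize(email) AS email_token, country
    FROM raw_customers
    """
)

staged = conn.execute(
    "SELECT customer_id, email_token, country FROM stg_customers ORDER BY customer_id"
).fetchall()
print(staged)
```

Using a keyed hash rather than a plain hash means someone with access to the staging data cannot confirm a guessed email by hashing it themselves, provided the key stays restricted.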

What about streaming or real-time requirements?

ELT is typically batch-oriented. For real-time requirements, options include warehouse-native streaming ingest (Snowpipe Streaming, BigQuery streaming inserts), specialized streaming platforms (Materialize, Tinybird), or hybrid architectures that combine batch ELT with streaming pipelines for fresh data. The pure ELT pattern does not naturally serve sub-minute freshness.

How does ELT fit with ML workloads?

The transformed analytical tables serve as training data sources. Feature engineering can happen in the ELT layer or in dedicated ML feature stores that read from it. Inference is usually not done from the warehouse for latency reasons; ELT produces the training data, online stores serve the inference layer.

What skills does an ELT team need?

SQL fluency, dbt experience, warehouse-specific knowledge (Snowflake, BigQuery, etc.), some Python for orchestration glue, and BI tool familiarity. The skill set is broader than that of a legacy ETL specialist, but each individual skill is more accessible. Hiring analytics engineers who fit this profile is easier than hiring legacy ETL specialists in 2026.

Where is ELT heading?

Toward continued consolidation around the established stack pattern. Toward AI assistance in dbt model development, debugging, and documentation. Toward more lakehouse-based variants as the warehouse-lakehouse boundary blurs. The pattern itself is mature; the tooling continues to refine the experience of working within it.