
Data Lakehouse Platform Without Lock-In to a Single Compute Engine

Iceberg. Delta. Hudi. Storage you control. Compute you choose.

The lakehouse promise was open formats and engine choice. The reality, for many teams, is single-vendor lock-in. Logiciel's lakehouse platform is built on Iceberg, Delta, and Hudi - with multiple compute engines (Spark, Trino, Snowflake, Databricks) on top - so storage and compute decisions stay independent.

See Logiciel in Action

Your 'open' lakehouse is a single-vendor lakehouse

Symptoms most teams don't articulate:

  • Storage is in one vendor's flavor of Iceberg or Delta. With the catalog and optimizer vendor-locked, the format is 'open' in name only and switching engines stays theoretical.
  • Schema evolution and compaction are tied to a specific engine's catalog, so engine portability is structurally limited regardless of marketing claims.
  • Query performance depends on a single vendor's optimizer, which leaves it captive to that vendor's roadmap rather than your engineering choices.

If you're shopping lakehouse platforms, openness should be real, not branded

Teams here typically need:

Open table formats with real engine portability - Spark, Trino, Snowflake, Databricks, and Athena all working against the same tables. That requires neutral catalog management, not a vendor-specific catalog.

Catalog independence - stay neutral and switch engines without reloading data. This is the structural feature that prevents single-engine lock-in; without it, 'open lakehouse' is just marketing. A minimal sketch of what that neutrality looks like follows.
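For concreteness, here is a minimal sketch of a Spark session wired to a vendor-neutral Iceberg REST catalog (Polaris and Nessie both speak this protocol). The catalog URI, warehouse bucket, table names, and runtime version are placeholders, not a prescribed setup.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("portable-lakehouse")
    # Load Iceberg's Spark runtime and SQL extensions (version illustrative).
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # Register a catalog named 'lake' backed by a REST catalog. Any engine
    # that speaks the Iceberg REST spec sees the same tables - that is the
    # neutrality described above.
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "rest")
    .config("spark.sql.catalog.lake.uri", "https://catalog.example.com/api")
    .config("spark.sql.catalog.lake.warehouse", "s3://your-bucket/warehouse")
    .getOrCreate()
)

# Tables created here are plain Iceberg tables in your bucket; Trino,
# Snowflake, or Athena can be pointed at the same catalog and storage.
spark.sql("SELECT * FROM lake.analytics.orders LIMIT 10").show()
```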

Workload-grounded performance - benchmarks run on your own queries, not vendor decks. Trust your workloads over their slides.

What you get with Logiciel

Open lakehouse, engine-portable, workload-tuned.

  • Open formats - Iceberg, Delta, and Hudi as first-class storage, so storage decisions don't lock you into a single compute vendor.
  • Engine portability - Spark, Trino, Snowflake, Databricks, and Athena work on the same tables, so you can pick the right engine per workload without re-platforming.
  • Catalog independence - Glue, Polaris, Unity, Nessie, and Hive Metastore all supported, so you stay neutral on the catalog decision too.
  • Performance tuning - partitioning, compaction, and sorting tuned to your queries, capturing workload-specific gains that vendor defaults leave on the table (see the sketch below).
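To illustrate the tuning bullet above, here is what partitioning and write-order decisions look like as Iceberg DDL issued from Spark. Table and column names are invented; it assumes a session like the earlier sketch, with Iceberg's SQL extensions enabled.

```python
# Partition by day so time-range queries prune whole data files.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.analytics.events (
        account_id BIGINT,
        event_type STRING,
        event_ts   TIMESTAMP
    )
    USING iceberg
    PARTITIONED BY (days(event_ts))
""")

# Declare a write order so new files are clustered on the dominant filter
# columns; writers and compaction jobs honor it from then on.
spark.sql("""
    ALTER TABLE lake.analytics.events
    WRITE ORDERED BY (account_id, event_ts)
""")
```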

Where this fits - industries we serve in the US

FinTech & Financial Services

Trading data, risk models, regulatory reporting - sub-second SLAs and audit-ready governance.

PropTech & Real Estate

Listing data, transaction pipelines, geospatial analytics - multi-source consolidation.

Healthcare & Life Sciences

EHR integration, claims pipelines, clinical analytics - HIPAA-aware infrastructure.

B2B SaaS

Product analytics, customer 360, usage-based billing - embedded and operational data.

eCommerce & Marketplaces

Inventory, pricing, order, and customer pipelines - real-time and high-throughput.

Construction & Industrial Tech

IoT, project, and supply-chain data - operational analytics on hybrid stacks.

Engagement models that fit your stage

Dedicated Pod

Embedded data engineering pod aligned to your sprint cadence - typically 3–6 engineers + a US lead.

Staff Augmentation

Senior data engineers, architects, and SMEs slotted into your team to unblock specific work.

Project-Based Delivery

Fixed-scope, milestone-driven engagements with clear deliverables and outcomes.

From first call to first production pipeline

Discover

We map your stack, workloads, team, and constraints in a working session - not an RFP response.

Architect

Reference architecture grounded in your reality, with capacity, cost, and migration plans.

Build

Iterative implementation with weekly demos, code reviews, and your team in the loop.

Operate

Managed operations or knowledge transfer - your choice. Both with US-aligned coverage.

Optimize

Continuous tuning of cost, performance, and reliability against measurable SLAs.

Lakehouse capabilities

Open Table Formats

Iceberg, Delta, Hudi natively supported.

Multi-Engine Access

Spark, Trino, Snowflake, Databricks, Athena on the same tables.

Time Travel

Query historical snapshots, branch tables for testing.
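A quick sketch of what this looks like from Spark, with Iceberg as the example format; the snapshot ID, timestamp, and table names below are placeholders.

```python
# Read the table as of a point in time (milliseconds since epoch).
df_as_of = (
    spark.read.format("iceberg")
    .option("as-of-timestamp", "1735689600000")
    .load("lake.analytics.orders")
)

# Or pin an exact snapshot ID taken from the table's history.
df_pinned = (
    spark.read.format("iceberg")
    .option("snapshot-id", "5781947118336215154")
    .load("lake.analytics.orders")
)

# Spark 3.3+ SQL form, plus branch creation for isolated testing
# (branching requires Iceberg's SQL extensions).
spark.sql("SELECT * FROM lake.analytics.orders VERSION AS OF 5781947118336215154")
spark.sql("ALTER TABLE lake.analytics.orders CREATE BRANCH qa_test")
```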

Catalog Federation

Glue, Polaris, Unity, Nessie, Hive - federated.

Compaction & Maintenance

Automated compaction, snapshot expiration, optimization.
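In Iceberg terms, routine maintenance boils down to a couple of stored procedures run on a schedule. This is a sketch; the table names and retention values are illustrative, not recommendations.

```python
# Expire old snapshots to reclaim storage, keeping a safety window
# for time travel and in-flight readers.
spark.sql("""
    CALL lake.system.expire_snapshots(
        table => 'analytics.orders',
        older_than => TIMESTAMP '2026-01-01 00:00:00',
        retain_last => 20
    )
""")

# Delete files in the table location that no snapshot references anymore.
spark.sql("CALL lake.system.remove_orphan_files(table => 'analytics.orders')")
```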

Streaming Tables

CDC and streaming writes into open tables.
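A minimal sketch of a streaming append into an Iceberg table with Spark Structured Streaming; the topic, brokers, checkpoint path, and table name are placeholders, and the Kafka connector package is assumed to be on the classpath.

```python
# Consume a CDC/event topic and append it to an open-format table.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders-cdc")
    .load()
)

query = (
    events.selectExpr("CAST(value AS STRING) AS payload")
    .writeStream.format("iceberg")
    .outputMode("append")
    .trigger(processingTime="1 minute")
    .option("checkpointLocation", "s3://your-bucket/checkpoints/orders")
    .toTable("lake.analytics.orders_raw")
)
query.awaitTermination()
```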

Extended FAQs

How does Logiciel compare to Databricks?

Different posture. Databricks is engine-first - proprietary compute (Photon, ML runtime) with Delta Lake as the storage format. Logiciel is open-format-first: we manage Iceberg, Delta, and Hudi tables and federate access across multiple compute engines (Spark, Trino, Snowflake, Databricks, Athena). For customers committed to Databricks as the primary engine, we complement rather than replace. For customers who want engine portability and need to avoid lock-in to a single compute vendor, we replace the proprietary lakehouse layer while preserving Spark/Photon for workloads that benefit from it. Most US customers we serve run multi-engine architectures - Snowflake for SQL analytics, Databricks for ML, Trino for federated query - on shared open-format storage.


Can Snowflake query Iceberg tables that Logiciel manages?

Natively - Snowflake can read your Iceberg tables (managed by Logiciel or external) with full read performance. We handle table maintenance (compaction, snapshot expiration, schema evolution); Snowflake handles SQL execution. This pattern is increasingly common for customers who want Snowflake's SQL ergonomics for analysts plus open-format storage for engineering flexibility. The architecture decouples storage decisions (where data lives, in what format) from compute decisions (what engine queries it), which is structurally valuable for long-term flexibility. For customers running both Snowflake and Databricks, the same Iceberg tables serve both engines without data duplication.
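For concreteness, a hedged sketch of the pattern from the Snowflake side using snowflake-connector-python. The external volume and catalog integration names are placeholders you'd have created beforehand, and the exact DDL options depend on your catalog type.

```python
import snowflake.connector

# Connection details are placeholders.
conn = snowflake.connector.connect(
    account="your_account", user="your_user", password="...",
    warehouse="ANALYTICS_WH", database="LAKE", schema="ANALYTICS",
)
cur = conn.cursor()

# Register an externally managed Iceberg table: data and metadata stay in
# your bucket; Snowflake reads via an external volume + catalog integration.
cur.execute("""
    CREATE ICEBERG TABLE IF NOT EXISTS orders
      EXTERNAL_VOLUME = 'lake_s3_volume'
      CATALOG = 'lake_catalog_integration'
      CATALOG_TABLE_NAME = 'orders'
""")

# From here it's ordinary Snowflake SQL over the shared open-format table.
cur.execute("SELECT COUNT(*) FROM orders")
print(cur.fetchone())
```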


Can you migrate an existing data lake to open table formats?

Yes - we provide migration tooling for Hive, raw Parquet, ORC, and other lake formats into Iceberg, Delta, or Hudi. Migration includes catalog conversion (Hive Metastore to Iceberg-compatible catalogs like Glue, Polaris, Nessie), file metadata generation (manifest files, snapshots), partition translation, and parity testing. Migration runs in parallel - the legacy lake stays queryable while the new lakehouse is populated and validated; cutover happens only after parity is signed off. For Fortune 500 lakes (petabyte scale), migration typically takes 3-9 months including parity validation. We've migrated lakes from Cloudera, EMR-based legacy stacks, and on-prem Hadoop to cloud lakehouses.
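Iceberg itself ships Spark procedures that support this parallel pattern; the sketch below uses placeholder names and assumes the Spark session catalog wraps the existing Hive Metastore.

```python
# snapshot() builds an Iceberg table over the Hive table's existing files
# without modifying them - the legacy lake stays queryable while parity
# tests run against the new table.
spark.sql("""
    CALL lake.system.snapshot(
        source_table => 'hive_db.orders',
        table => 'analytics.orders_iceberg'
    )
""")

# After parity sign-off, migrate() converts the source table in place
# (run via a SparkSessionCatalog wrapping the Hive Metastore).
spark.sql("CALL spark_catalog.system.migrate(table => 'hive_db.orders')")
```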


How is the platform priced?

Per managed table plus storage volume - compute stays on your bill (we don't mark it up). Mid-market customers (50-200 managed tables, 1-50TB) typically pay $30-90K ARR. Enterprise tiers (1,000+ tables, multi-petabyte, multi-engine federation, dedicated TAM, US-citizen support) start at $200K ARR. Storage is your S3/ADLS/GCS bill at standard rates; we provide tiering recommendations and snapshot-expiration policies that typically save 20-40% on storage cost. For customers comparing against Databricks Unity Catalog or vendor-managed lakehouses, we benchmark TCO at evaluation; Logiciel typically saves 30-50% at equivalent capability with an engine-portable architecture.


Should we choose Iceberg, Delta, or Hudi?

Workload-dependent, and we'll run a workload-grounded comparison. Iceberg wins for engine portability and catalog flexibility (it works equally well with Spark, Trino, Snowflake, and Athena); choose it if engine independence matters. Delta wins if you're Databricks-heavy and committed to that ecosystem; choose it if Photon-optimized performance is the priority. Hudi wins for streaming-heavy mutation patterns (high-volume CDC, incremental upserts); choose it for those workloads specifically. Most US customers in 2026 default to Iceberg for new builds because portability has become structurally important; existing Delta deployments are migrated only when there's a strategic reason. We support all three formats equally well at the platform layer.


How do you handle compaction and table maintenance?

Automated, workload-aware compaction with cost controls. Compaction strategies are configurable per table: bin-packing for write-heavy workloads, sort-based for query optimization, tiered for hot/cold separation. Compaction triggers are configurable (file-count thresholds, time-based, size-based), and execution is bounded by budget caps so a runaway compaction job can't surprise your finance team. Snapshot-expiration policies are integrated for storage cost management. For customers running large Iceberg tables (multi-terabyte fact tables), compaction strategy materially affects query performance and storage cost; we provide reference compaction patterns and tune per customer based on workload.
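As a sketch, the two main strategies expressed as Iceberg's rewrite_data_files procedure from Spark; tables, thresholds, and sort columns are illustrative.

```python
# Bin-packing for a write-heavy table: coalesce small files, committing
# partial progress so long rewrites don't hold one giant transaction.
spark.sql("""
    CALL lake.system.rewrite_data_files(
        table => 'analytics.orders_raw',
        strategy => 'binpack',
        options => map(
            'min-input-files', '32',
            'target-file-size-bytes', '536870912',
            'partial-progress.enabled', 'true'
        )
    )
""")

# Sort-based rewrite for a query-heavy table: cluster files on the
# dominant filter columns to improve pruning.
spark.sql("""
    CALL lake.system.rewrite_data_files(
        table => 'analytics.orders',
        strategy => 'sort',
        sort_order => 'account_id, event_ts'
    )
""")
```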


Can we run ML workloads directly on the lakehouse?

Yes - open tables work for ML. Spark, Ray, Daft, Polars, and direct Arrow access all read from Iceberg/Delta/Hudi without intermediate materialization. ML feature pipelines integrate with the lakehouse for training-data assembly with point-in-time correctness; vector indexes for RAG live alongside warehouse tables under unified governance. For US AI-native customers, the lakehouse-as-ML-foundation pattern is increasingly standard because it eliminates the typical 'copy data to the ML platform, lose lineage' antipattern. Compute engines specialized for ML (Ray, Spark MLlib, distributed Polars) read open-format tables directly, so there's no engine lock-in for ML workloads either.
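To make the no-intermediate-copy point concrete, here's a minimal sketch using pyiceberg to scan a table straight into Arrow with no Spark cluster in the path; the catalog properties, filter, and names are placeholders.

```python
from pyiceberg.catalog import load_catalog

# Point at the same REST catalog the query engines use.
catalog = load_catalog(
    "lake",
    type="rest",
    uri="https://catalog.example.com/api",
    warehouse="s3://your-bucket/warehouse",
)
table = catalog.load_table("analytics.orders")

# Projection and predicate pushdown, straight to an Arrow table -
# ready for Polars, Daft, Ray datasets, or a training pipeline.
arrow_table = (
    table.scan(
        row_filter="amount >= 100",
        selected_fields=("account_id", "amount"),
    )
    .to_arrow()
)
print(arrow_table.num_rows)
```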


See an open lakehouse on your own data

Bring 5 of your largest tables. We'll convert them to Iceberg in your S3 and connect Spark, Trino, Snowflake, and Athena. You'll see open-format reality, not slideware.