Data Lakehouse

What Does Data Lakehouse Mean?

A data lakehouse is a unified storage architecture that combines the cost benefits of a data lake with the analytic benefits of a data warehouse.


An important purpose of a data lakehouse is to make it easier for machine learning engineers (MLEs) to use the same large data sets for different types of artificial intelligence (AI) workloads.

A data lakehouse architecture has five layers:

  • Ingestion layer – pulls structured and unstructured data from a variety of sources.
  • Storage layer – stores data at rest as objects in a single, shared storage tier.
  • Metadata layer – used to locate specific storage objects and assign schema on read.
  • Application Programming Interface (API) layer – helps applications understand what data items are required to complete a particular task and how to retrieve them.
  • Consumption layer – provides support for analytics and reporting.
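
To make the five layers more concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not how any particular lakehouse product is built: a dictionary stands in for the object store, and the names object_store, catalog, ingest, read_table and total_by are hypothetical and chosen only for illustration.

```python
import json

# Storage layer: a dict standing in for an object store (key -> raw bytes).
object_store = {}

# Metadata layer: maps a table name to its object keys and a schema applied on read.
catalog = {}

def ingest(table, records, schema):
    """Ingestion layer: write each raw record as an object and register it in the catalog."""
    entry = catalog.setdefault(table, {"keys": [], "schema": schema})
    for record in records:
        key = f"{table}/part-{len(entry['keys']):05d}.json"
        object_store[key] = json.dumps(record).encode("utf-8")
        entry["keys"].append(key)

def read_table(table):
    """API layer: locate objects through the catalog and apply the schema on read."""
    entry = catalog[table]
    rows = []
    for key in entry["keys"]:
        raw = json.loads(object_store[key].decode("utf-8"))
        # Schema on read: cast each declared column; fields not in the schema are dropped.
        rows.append({col: cast(raw[col]) for col, cast in entry["schema"].items()})
    return rows

def total_by(table, group_col, value_col):
    """Consumption layer: a simple report built on top of the API layer."""
    totals = {}
    for row in read_table(table):
        totals[row[group_col]] = totals.get(row[group_col], 0) + row[value_col]
    return totals

# Raw records are stored as-is; types are only enforced when the data is read.
ingest(
    "sales",
    [
        {"region": "east", "amount": "10.5", "note": "free text"},
        {"region": "west", "amount": "4"},
        {"region": "east", "amount": "2.5"},
    ],
    schema={"region": str, "amount": float},
)
report = total_by("sales", "region", "amount")
print(report)  # {'east': 13.0, 'west': 4.0}
```

In this sketch the stored objects keep their original, loosely typed form, and the schema lives only in the catalog. That separation is what lets the metadata layer assign a schema at read time rather than at write time.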

Techopedia Explains Data Lakehouse

A data lakehouse allows the same unified storage layer to serve multiple purposes, including predictive analytics, prescriptive analytics, deep learning and reporting.
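
As a rough illustration of one storage layer serving more than one purpose, the Python sketch below runs a reporting calculation and a simple predictive calculation over the same rows. The data and the function names report_total and fit_line are made up for this example, and an in-memory list stands in for the shared storage.

```python
# One shared set of stored rows (stands in for the unified storage layer).
rows = [
    {"month": 1, "units": 10},
    {"month": 2, "units": 12},
    {"month": 3, "units": 14},
    {"month": 4, "units": 16},
]

def report_total(data):
    """Reporting workload: total units sold."""
    return sum(r["units"] for r in data)

def fit_line(data):
    """Predictive workload: ordinary least-squares fit of units against month."""
    n = len(data)
    mean_x = sum(r["month"] for r in data) / n
    mean_y = sum(r["units"] for r in data) / n
    sxy = sum((r["month"] - mean_x) * (r["units"] - mean_y) for r in data)
    sxx = sum((r["month"] - mean_x) ** 2 for r in data)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

total = report_total(rows)            # reporting reads the shared rows
slope, intercept = fit_line(rows)     # so does the predictive model
forecast = slope * 5 + intercept      # projected units for month 5
print(total, forecast)                # 52 18.0
```

Neither workload needs its own copy of the data, which is the point the definition is making.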

This emerging architecture uses metadata to combine the flexibility of a data lake with the benefits of a data warehouse. Popular data lakehouse vendors include:

Cloudera – this open source, open standards-based data lakehouse is built on Apache Iceberg’s open table format.

Databricks – the Databricks Lakehouse Platform can be delivered and managed as a service on AWS, Microsoft Azure and Google Cloud.

Dremio – provides fully managed services designed to help customers experiment with a lakehouse architecture at a lower total cost of ownership (TCO).

Snowflake – integrates subject-specific data marts, data warehouses and data lakes into a single source of truth (SSOT) that can be used to power different types of workloads.


Margaret Rouse

Margaret is an award-winning technical writer and teacher known for her ability to explain complex technical subjects to a non-technical business audience. Over the past twenty years, her IT definitions have been published by Que in an encyclopedia of technology terms and cited in articles by the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. She joined Techopedia in 2011. Margaret's idea of a fun day is helping IT and business professionals learn to speak each other’s highly specialized languages.