Architecture, Data Lake and Data Warehouse

How Apache Iceberg Is Changing the Face of Data Lakes

Snowflake

APRIL 2, 2025

Data storage has been evolving, from databases to data warehouses and expansive data lakes, with each architecture responding to different business and data needs. Now you dont have to choose. This is why Snowflake is fully embracing this open table format.

Data Lake

Data Lake Metadata Cloud Storage Data Warehouse

Data Integrity for AI: What’s Old is New Again

Precisely

JANUARY 9, 2025

The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the technology underpinning. In the beginning, there was a data warehouse The data warehouse (DW) was an approach to data architecture and structured data management that really hit its stride in the early 1990s.

Data Integration

Data Integration Hadoop Data Warehouse Data Lake

Data Warehouses vs. Data Lakes vs. Data Marts: Need Help Deciding?

KDnuggets

OCTOBER 30, 2023

A comparative overview of data warehouses, data lakes, and data marts to help you make informed decisions on data storage solutions for your data architecture.

Data Lake

Data Lake Data Warehouse Data Storage Data

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud

Snowflake

NOVEMBER 2, 2023

Over the years, the technology landscape for data management has given rise to various architecture patterns, each thoughtfully designed to cater to specific use cases and requirements. Each of these architectures has its own unique strengths and tradeoffs.

Data Lake

Data Lake Data Warehouse Cloud Unstructured Data

Keep Your Data Lake Fresh With Real Time Streams Using Estuary

Data Engineering Podcast

MAY 21, 2023

In this episode David Yaffe and Johnny Graettinger share the story behind the business and technology and how you can start using it today to build a real-time data lake without all of the headache. What is the impact of continuous data flows on dags/orchestration of transforms? RudderStack also supports real-time use cases.

Data Lake

Data Lake Machine Learning Kafka Data Warehouse

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

FEBRUARY 18, 2024

Summary A data lakehouse is intended to combine the benefits of data lakes (cost effective, scalable storage and compute) and data warehouses (user friendly SQL interface). Data lakes are notoriously complex. To start, can you share your definition of what constitutes a "Data Lakehouse"?

Data Lake

Data Lake High Quality Data Data Warehouse Google Cloud

Why Open Table Format Architecture is Essential for Modern Data Systems

phData: Data Engineering

NOVEMBER 8, 2024

The world we live in today presents larger datasets, more complex data, and diverse needs, all of which call for efficient, scalable data systems. Open Table Format (OTF) architecture now provides a solution for efficient data storage, management, and processing while ensuring compatibility across different platforms.

Architecture

Architecture Systems Data Lake Google Cloud

Charting A Path For Streaming Data To Fill Your Data Lake With Hudi

Data Engineering Podcast

AUGUST 3, 2021

Summary Data lake architectures have largely been biased toward batch processing workflows due to the volume of data that they are designed for. With more real-time requirements and the increasing use of streaming data there has been a struggle to merge fast, incremental updates with large, historical analysis.

Data Lake

Data Lake Data Warehouse Hadoop Architecture

Toward a Data Mesh (part 2) : Architecture & Technologies

François Nguyen

MARCH 22, 2021

With this 3rd platform generation, you have more real time data analytics and a cost reduction because it is easier to manage this infrastructure in the cloud thanks to managed services. We are Data Teams versus we have to patch the server with the latest version and do the tests. The number of subjects to automatize is not short.

Technology

Technology Architecture Google Cloud Metadata

Maintaining Your Data Lake At Scale With Spark

Data Engineering Podcast

JUNE 16, 2019

Summary Building and maintaining a data lake is a choose your own adventure of tools, services, and evolving best practices. The flexibility and freedom that data lakes provide allows for generating significant value, but it can also lead to anti-patterns and inconsistent quality in your analytics.

Data Lake

Data Lake Lambda Architecture Data Warehouse Hadoop

How Marriott Modernized Their Data Architecture with Snowflake

Snowflake

SEPTEMBER 14, 2023

More than 50% of data leaders recently surveyed by BCG said the complexity of their data architecture is a significant pain point in their enterprise. As a result,” says BCG, “many companies find themselves at a tipping point, at risk of drowning in a deluge of data, overburdened with complexity and costs.”

Data Architecture

Data Architecture Architecture Hadoop Data Warehouse

Straining Your Data Lake Through A Data Mesh

Data Engineering Podcast

JULY 22, 2019

Summary The current trend in data management is to centralize the responsibilities of storing and curating the organization’s information to a data engineering team. This organizational pattern is reinforced by the architectural pattern of data lakes as a solution for managing storage and access.

Data Lake

Data Lake Hadoop Data Architecture

Microsoft Fabric vs. Snowflake: Key Differences You Need to Know

Edureka

APRIL 22, 2025

It incorporates elements from several Microsoft products working together, like Power BI, Azure Synapse Analytics, Data Factory, and OneLake, into a single SaaS experience. Fabric is meant for organizations looking for a single pane of glass across their data estate with seamless integration and a low learning curve for Microsoft users.

BI

BI Pipeline-centric Data Lake Google Cloud

Building A Data Lake For The Database Administrator At Upsolver

Data Engineering Podcast

JUNE 1, 2020

Summary Data lakes offer a great deal of flexibility and the potential for reduced cost for your analytics, but they also introduce a great deal of complexity. In order to bring the DBA into the new era of data management the team at Upsolver added a SQL interface to their data lake platform.

Data Lake

Data Lake Database Building Lambda Architecture

Laying the Foundation for Modern Data Architecture

Cloudera

MAY 28, 2024

It’s not enough for businesses to implement and maintain a data architecture. The unpredictability of market shifts and the evolving use of new technologies means businesses need more data they can trust than ever to stay agile and make the right decisions.

Data Architecture

Data Architecture Architecture Data Lake Data Warehouse

Exploring Processing Patterns For Streaming Data Integration In Your Data Lake

Data Engineering Podcast

NOVEMBER 20, 2021

Summary One of the perennial challenges posed by data lakes is how to keep them up to date as new data is collected. In this episode Ori Rafael shares his experiences from Upsolver and building scalable stream processing for integrating and analyzing data, and what the tradeoffs are when coming from a batch oriented mindset.

Data Lake

Data Lake Data Integration Lambda Architecture Process

Data Lake vs. Data Warehouse vs. Data Lakehouse

Sync Computing

NOVEMBER 7, 2024

Despite these limitations, data warehouses, introduced in the late 1980s based on ideas developed even earlier, remain in widespread use today for certain business intelligence and data analysis applications. While data warehouses are still in use, they are limited in use-cases as they only support structured data.

Data Lake

Data Lake Data Warehouse Business Intelligence Unstructured Data

How Upsolver Is Building A Data Lake Platform In The Cloud with Yoni Iny - Episode 56

Data Engineering Podcast

NOVEMBER 11, 2018

Summary A data lake can be a highly valuable resource, as long as it is well built and well managed. In this episode Yoni Iny, CTO of Upsolver, discusses the various components that are necessary for a successful data lake project, how the Upsolver platform is architected, and how modern data lakes can benefit your organization.

Data Lake

Data Lake Building Kafka Cloud

Building A Better Data Warehouse For The Cloud At Firebolt

Data Engineering Podcast

AUGUST 31, 2020

Summary Data warehouse technology has been around for decades and has gone through several generational shifts in that time. The current trends in data warehousing are oriented around cloud native architectures that take advantage of dynamic scaling and the separation of compute and storage.

Data Warehouse

Data Warehouse Cloud Building Data Lake

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

FEBRUARY 4, 2024

In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex.

SQL

SQL Data Lake High Quality Data Machine Learning

Cloudera Data Warehouse Demonstrates Best-in-Class Cloud-Native Price-Performance

Cloudera

JANUARY 15, 2021

Cloud data warehouses allow users to run analytic workloads with greater agility, better isolation and scale, and lower administrative overhead than ever before. The results demonstrate superior price performance of Cloudera Data Warehouse on the full set of 99 queries from the TPC-DS benchmark. Introduction.

Data Warehouse

Data Warehouse Cloud Consulting SQL

Insights And Advice On Building A Data Lake Platform From Someone Who Learned The Hard Way

Data Engineering Podcast

MAY 15, 2022

Summary Designing a data platform is a complex and iterative undertaking which requires accounting for many conflicting needs. Designing a platform that relies on a data lake as its central architectural tenet adds additional layers of difficulty. Can you describe your current platform architecture?

Data Lake

Data Lake Building Architecture BI

Modern Customer Data Platform Principles

Data Engineering Podcast

JANUARY 21, 2024

Summary Databases and analytics architectures have gone through several generational shifts. A substantial amount of the data that is being managed in these systems is related to customers and their interactions with an organization. How has that changed the architectural approach to CDPs? Want to see Starburst in action?

Data Lake

Data Lake High Quality Data NoSQL Data Warehouse

Scale Your Analytics On The Clickhouse Data Warehouse

Data Engineering Podcast

JULY 8, 2019

Summary The market for data warehouse platforms is large and varied, with options for every use case. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit.

Data Warehouse

Data Warehouse MySQL Hadoop Data Lake

Data Lake vs. Data Warehouse: Differences and Similarities

U-Next

SEPTEMBER 7, 2022

The terms “ Data Warehouse ” and “ Data Lake ” may have confused you, and you have some questions. Essentially, this is the difference between a lake and a warehouse. On the other hand, a data warehouse contains historical data that has been cleaned and arranged. .

Data Lake

Data Lake Data Warehouse Unstructured Data Amazon Web Services

X-Ray Vision For Your Flink Stream Processing With Datorios

Data Engineering Podcast

JUNE 9, 2024

Data lakes are notoriously complex. Data observability has been gaining adoption for a number of years now, with a large focus on data warehouses. How much of the complexity is due to the nature of streaming data vs. the architectural realities of Flink? Data lakes are notoriously complex.

Process

Process Data Lake High Quality Data Machine Learning

Reflecting On The Past 6 Years Of Data Engineering

Data Engineering Podcast

FEBRUARY 5, 2023

Sign up now for early access to Materialize and get started with the power of streaming data with the same simplicity and low implementation cost as batch cloud data warehouses. Go to [dataengineeringpodcast.com/materialize]([link] Support Data Engineering Podcast

Data Engineering

Data Engineering Data Engineer Engineering PostgreSQL

The Modern Data Lakehouse: An Architectural Innovation

Cloudera

SEPTEMBER 9, 2022

The promise of a modern data lakehouse architecture. Imagine having self-service access to all business data, anywhere it may be, and being able to explore it all at once. Imagine quickly answering burning business questions nearly instantly, without waiting for data to be found, shared, and ingested.

Architecture

Architecture Metadata Machine Learning Unstructured Data

Do Away With Data Integration Through A Dataware Architecture With Cinchy

Data Engineering Podcast

AUGUST 27, 2021

By making the software be the owner of the data that it generates, we have to go through the trouble of extracting the information to then be used elsewhere. The team at Cinchy are working to bring about a new paradigm of software architecture that puts the data as the central element. No more scripts, just SQL.

Data Integration

Data Integration Architecture Data Warehouse Data Lake

What Happens When The Abstractions Leak On Your Data

Data Engineering Podcast

MAY 14, 2023

In this episode the host Tobias Macey shares his reflections on recent experiences where the abstractions leaked and some observances on how to deal with that situation in a data platform architecture. You can collect, transform, and route data across your entire stack with its event streaming, ETL, and reverse ETL pipelines.

Data Lake

Data Lake Machine Learning Data Warehouse AWS

Data Warehouse vs Data Lake vs Data Lakehouse: Definitions, Similarities, and Differences

Monte Carlo

AUGUST 25, 2023

Data teams need to balance the need for robust, powerful data platforms with increasing scrutiny on costs. That’s why it’s essential for teams to choose the right architecture for the storage layer of their data stack. But, the options for data storage are evolving quickly. Let’s dive in.

Data Lake

Data Lake Data Warehouse Unstructured Data Raw Data

Evaluating Change Data Capture Tools: A Comprehensive Guide

Data Engineering Weekly

AUGUST 6, 2024

CDC tools fuel analytical apps and mission-critical data feeds in banking and regulated industries, with use cases ranging from data synchronization, managing risk, and preventing fraud to driving personalization. Unlike data lakes, which are predominantly append-only, lakehouses support data mutation natively.

Data Lake

Data Lake Data Warehouse Database Data Architecture

Data Engineering Weekly #206

Data Engineering Weekly

FEBRUARY 2, 2025

[link] Adam Bellemare & Thomas Betts: The End of the Bronze Age: Rethinking the Medallion Architecture I’m always a bit uncomfortable with medallion architecture since it is a glorified term for the traditional ETL process. link] All rights reserved ProtoGrowth Inc, India.

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

Breaking State and Local Data Silos with Modern Data Architectures

Cloudera

AUGUST 30, 2022

Modern data architectures. To eliminate or integrate these silos, the public sector needs to adopt robust data management solutions that support modern data architectures (MDAs). Deploying modern data architectures. Lack of sharing hinders the elimination of fraud, waste, and abuse. Forrester ).

Data Architecture

Data Architecture Architecture Data Lake NoSQL

Data Lake Explained: A Comprehensive Guide to Its Architecture and Use Cases

AltexSoft

AUGUST 29, 2023

In 2010, a transformative concept took root in the realm of data storage and analytics — a data lake. The term was coined by James Dixon , Back-End Java, Data, and Business Intelligence Engineer, and it started a new era in how organizations could store, manage, and analyze their data. What is a data lake?

Data Lake

Data Lake Architecture IT Amazon Web Services

Reduce The Overhead In Your Pipelines With Agile Data Engine's DataOps Service

Data Engineering Podcast

JUNE 4, 2023

Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management RudderStack helps you build a customer data platform on your warehouse or data lake. What are some of the tools and architectures that an organization might be able to replace with Agile Data Engine?

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

How Column-Aware Development Tooling Yields Better Data Models

Data Engineering Podcast

JUNE 17, 2023

Summary Architectural decisions are all based on certain constraints and a desire to optimize for different outcomes. In data systems one of the core architectural exercises is data modeling, which can have significant impacts on what is and is not possible for downstream use cases.

Data Lake

Data Lake Machine Learning Metadata Data Architecture

Snowflake Data Warehouse 101: A Comprehensive Guide

Hevo

MAY 17, 2024

Snowflake Data Warehouse delivers essential infrastructure for handling a Data Lake, and Data Warehouse needs. It can store semi-structured and structured data in one place due to its multi-clusters architecture that allows users to independently query data using SQL.

Data Warehouse

Data Warehouse Data Lake Structured Data SQL

The Future of the Data Lakehouse – Open

Cloudera

JUNE 18, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.

Data Lake

Data Lake Data Warehouse BI SQL

Enabling Self-Service Business Insights with Cloudera Data Warehouse

Cloudera

JANUARY 11, 2021

How self-service data warehousing frees IT resources. Cloudera Data Warehouse (CDW) is a cloud service and an integral part of the newly released Cloudera Data Platform (CDP). Key features are: Highly scalable and performant open-source engines for BI and data warehousing workloads. Modern architecture.

Data Warehouse

Data Warehouse Pharmaceutical Data Lake BI

A Primer On Enterprise Data Curation with Todd Walter - Episode 49

Data Engineering Podcast

SEPTEMBER 23, 2018

Using the metaphor of a museum curator carefully managing the precious resources on display and in the vaults, he discusses the various layers of an enterprise data strategy. Can you walk through the stages of an ideal lifecycle for data within the context of an organizations uses for it?

Data Lake

Data Lake Data Warehouse Data Architecture Architecture

Data Engineering: A Formula 1-inspired Guide for Beginners

Towards Data Science

DECEMBER 4, 2023

Anyways, I wasn’t paying enough attention during university classes, and today I’ll walk you through data layers using — guess what — an example. Business Scenario & Data Architecture Imagine this: next year, a new team on the grid, Red Thunder Racing, will call us (yes, me and you) to set up their new data infrastructure.

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

Snowflake Migration Success Stories: Core Digital Media and NAVEX

Snowflake

OCTOBER 16, 2024

Many of our customers — from Marriott to AT&T — start their journey with the Snowflake AI Data Cloud by migrating their data warehousing workloads to the platform. Today we’re focusing on customers who migrated from a legacy data warehouse to Snowflake and some of the benefits they saw.

Digital Media

Digital Media Media Data Lake Data Warehouse

8 Essential Data Pipeline Design Patterns You Should Know

Monte Carlo

NOVEMBER 21, 2024

Whether it’s customer transactions, IoT sensor readings, or just an endless stream of social media hot takes, you need a reliable way to get that data from point A to point B while doing something clever with it along the way. That’s where data pipeline design patterns come in. Lambda Architecture Pattern 4.

Data Pipeline

Data Pipeline Designing Lambda Architecture Kafka

How Apache Iceberg Is Changing the Face of Data Lakes

Data Integrity for AI: What’s Old is New Again

Webinars

Trending Sources

Data Warehouses vs. Data Lakes vs. Data Marts: Need Help Deciding?

Webinars

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud

Keep Your Data Lake Fresh With Real Time Streams Using Estuary

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Why Open Table Format Architecture is Essential for Modern Data Systems

Charting A Path For Streaming Data To Fill Your Data Lake With Hudi

Toward a Data Mesh (part 2) : Architecture & Technologies

Maintaining Your Data Lake At Scale With Spark

How Marriott Modernized Their Data Architecture with Snowflake

Straining Your Data Lake Through A Data Mesh

Microsoft Fabric vs. Snowflake: Key Differences You Need to Know

Building A Data Lake For The Database Administrator At Upsolver

Laying the Foundation for Modern Data Architecture

Exploring Processing Patterns For Streaming Data Integration In Your Data Lake

Data Lake vs. Data Warehouse vs. Data Lakehouse

How Upsolver Is Building A Data Lake Platform In The Cloud with Yoni Iny - Episode 56

Building A Better Data Warehouse For The Cloud At Firebolt

Tackling Real Time Streaming Data With SQL Using RisingWave

Cloudera Data Warehouse Demonstrates Best-in-Class Cloud-Native Price-Performance

Insights And Advice On Building A Data Lake Platform From Someone Who Learned The Hard Way

Modern Customer Data Platform Principles

Scale Your Analytics On The Clickhouse Data Warehouse

Data Lake vs. Data Warehouse: Differences and Similarities

X-Ray Vision For Your Flink Stream Processing With Datorios

Reflecting On The Past 6 Years Of Data Engineering

The Modern Data Lakehouse: An Architectural Innovation

Do Away With Data Integration Through A Dataware Architecture With Cinchy

What Happens When The Abstractions Leak On Your Data

Data Warehouse vs Data Lake vs Data Lakehouse: Definitions, Similarities, and Differences

Evaluating Change Data Capture Tools: A Comprehensive Guide

Data Engineering Weekly #206

Breaking State and Local Data Silos with Modern Data Architectures

Data Lake Explained: A Comprehensive Guide to Its Architecture and Use Cases

Reduce The Overhead In Your Pipelines With Agile Data Engine's DataOps Service

How Column-Aware Development Tooling Yields Better Data Models

Snowflake Data Warehouse 101: A Comprehensive Guide

The Future of the Data Lakehouse – Open

Enabling Self-Service Business Insights with Cloudera Data Warehouse

A Primer On Enterprise Data Curation with Todd Walter - Episode 49

Data Engineering: A Formula 1-inspired Guide for Beginners

Snowflake Migration Success Stories: Core Digital Media and NAVEX

8 Essential Data Pipeline Design Patterns You Should Know

Stay Connected