Data Architecture, Data Lake and Data Warehouse

How Apache Iceberg Is Changing the Face of Data Lakes

Snowflake

APRIL 2, 2025

Data storage has been evolving, from databases to data warehouses and expansive data lakes, with each architecture responding to different business and data needs. Now you don't have to choose. This is why Snowflake is fully embracing this open table format.

Data Lake

Data Lake Cloud Storage Metadata Data Warehouse

Data Warehouses vs. Data Lakes vs. Data Marts: Need Help Deciding?

KDnuggets

OCTOBER 30, 2023

A comparative overview of data warehouses, data lakes, and data marts to help you make informed decisions on data storage solutions for your data architecture.

Data Lake

Data Lake Data Warehouse Data Storage Data

Data Integrity for AI: What’s Old is New Again

Precisely

JANUARY 9, 2025

The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the technology underpinning. In the beginning, there was a data warehouse. The data warehouse (DW) was an approach to data architecture and structured data management that really hit its stride in the early 1990s.

Data Integration

Data Integration Hadoop Data Warehouse Data Lake

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud

Snowflake

NOVEMBER 2, 2023

Over the years, the technology landscape for data management has given rise to various architecture patterns, each thoughtfully designed to cater to specific use cases and requirements. Each of these architectures has its own unique strengths and tradeoffs. The schema of semi-structured data tends to evolve over time.

Data Lake

Data Lake Data Warehouse Cloud Unstructured Data

How Marriott Modernized Their Data Architecture with Snowflake

Snowflake

SEPTEMBER 14, 2023

More than 50% of data leaders recently surveyed by BCG said the complexity of their data architecture is a significant pain point in their enterprise. “As a result,” says BCG, “many companies find themselves at a tipping point, at risk of drowning in a deluge of data, overburdened with complexity and costs.”

Data Architecture

Data Architecture Architecture Hadoop Data Warehouse

Laying the Foundation for Modern Data Architecture

Cloudera

MAY 28, 2024

It’s not enough for businesses to implement and maintain a data architecture. The unpredictability of market shifts and the evolving use of new technologies mean businesses need more data they can trust than ever to stay agile and make the right decisions.

Data Architecture

Data Architecture Architecture Data Lake Data Warehouse

Maintaining Your Data Lake At Scale With Spark

Data Engineering Podcast

JUNE 16, 2019

Summary: Building and maintaining a data lake is a choose-your-own-adventure of tools, services, and evolving best practices. The flexibility and freedom that data lakes provide allow for generating significant value, but they can also lead to anti-patterns and inconsistent quality in your analytics.

Data Lake

Data Lake Lambda Architecture Data Warehouse Hadoop

Straining Your Data Lake Through A Data Mesh

Data Engineering Podcast

JULY 22, 2019

Summary: The current trend in data management is to centralize the responsibilities of storing and curating the organization’s information to a data engineering team. This organizational pattern is reinforced by the architectural pattern of data lakes as a solution for managing storage and access.

Data Lake

Data Lake Hadoop Data Architecture

Scale Your Analytics On The Clickhouse Data Warehouse

Data Engineering Podcast

JULY 8, 2019

Summary: The market for data warehouse platforms is large and varied, with options for every use case. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall are the combined events of Graphorum and the Data Architecture Summit.

Data Warehouse

Data Warehouse MySQL Hadoop Data Lake

Breaking State and Local Data Silos with Modern Data Architectures

Cloudera

AUGUST 30, 2022

Modern data architectures. To eliminate or integrate these silos, the public sector needs to adopt robust data management solutions that support modern data architectures (MDAs). Deploying modern data architectures. Lack of sharing hinders the elimination of fraud, waste, and abuse.

Data Architecture

Data Architecture Architecture Data Lake NoSQL

Why Open Table Format Architecture is Essential for Modern Data Systems

phData: Data Engineering

NOVEMBER 8, 2024

Versioning also ensures a safer experimentation environment, where data scientists can test new models or hypotheses on historical data snapshots without impacting live data. Note: Cloud data warehouses like Snowflake and BigQuery already have a default time travel feature.

Architecture

Architecture Systems Data Lake Google Cloud

Data Pitfalls to Avoid with Data Warehouses & Data Lakes

Acceldata

DECEMBER 19, 2022

Avoid these three data pitfalls when attempting to modernize your data architecture with a data lake or data warehouse.

Data Lake

Data Lake Data Warehouse Data Architecture Data

Evaluating Change Data Capture Tools: A Comprehensive Guide

Data Engineering Weekly

AUGUST 6, 2024

CDC tools fuel analytical apps and mission-critical data feeds in banking and regulated industries, with use cases ranging from data synchronization, managing risk, and preventing fraud to driving personalization. Unlike data lakes, which are predominantly append-only, lakehouses support data mutation natively.

Data Lake

Data Lake Data Warehouse Database Data Architecture

How Column-Aware Development Tooling Yields Better Data Models

Data Engineering Podcast

JUNE 17, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. RudderStack helps you build a customer data platform on your warehouse or data lake. How has the move to the cloud for data warehousing/data platforms influenced the practice of data modeling?

Data Lake

Data Lake Machine Learning Metadata Data Architecture

A Primer On Enterprise Data Curation with Todd Walter - Episode 49

Data Engineering Podcast

SEPTEMBER 23, 2018

Using the metaphor of a museum curator carefully managing the precious resources on display and in the vaults, he discusses the various layers of an enterprise data strategy. Can you walk through the stages of an ideal lifecycle for data within the context of an organization's uses for it?

Data Lake

Data Lake Data Warehouse Data Architecture Architecture

Data Engineering: A Formula 1-inspired Guide for Beginners

Towards Data Science

DECEMBER 4, 2023

Anyways, I wasn’t paying enough attention during university classes, and today I’ll walk you through data layers using — guess what — an example. Business Scenario & Data Architecture Imagine this: next year, a new team on the grid, Red Thunder Racing, will call us (yes, me and you) to set up their new data infrastructure.

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

What is an AI Data Engineer? 4 Important Skills, Responsibilities, & Tools

Monte Carlo

OCTOBER 31, 2024

Key Differences Between AI Data Engineers and Traditional Data Engineers: While traditional data engineers and AI data engineers have similar responsibilities, they ultimately differ in where they focus their efforts. Data Storage Solutions: As we all know, data can be stored in a variety of ways.

Data Engineering

Data Engineering Data Engineer Engineering Unstructured Data

A Prequel to Data Mesh

Towards Data Science

JANUARY 16, 2024

When I heard the words ‘decentralised data architecture’, I was left utterly confused at first! In my then limited experience as a Data Engineer, I had only come across centralised data architectures and they seemed to be working very well. Result: Data warehouse was born. So what was missing?

Data Warehouse

Data Warehouse Data Architecture Relational Database NoSQL

The Future of the Data Lakehouse – Open

Cloudera

JUNE 18, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.

Data Lake

Data Lake Data Warehouse BI SQL

Hands-On Introduction to Delta Lake with (py)Spark

Towards Data Science

FEBRUARY 15, 2023

In this context, data management in an organization is a key point for the success of its projects involving data. One of the main aspects of correct data management is the definition of a data architecture. The data became useless. The Lakehouse architecture was one of them.

Data Lake

Data Lake Data Warehouse Hadoop Architecture

A Guide to Data Pipelines (And How to Design One From Scratch)

Striim

SEPTEMBER 11, 2024

Data pipelines are the backbone of your business’s data architecture. Implementing a robust and scalable pipeline ensures you can effectively manage, analyze, and organize your growing data. Understanding the essential components of data pipelines is crucial for designing efficient and effective data architectures.

Data Pipeline

Data Pipeline Designing Data Lake Data Warehouse

Let Your Analysts Build A Data Lakehouse With Cuelake

Data Engineering Podcast

AUGUST 20, 2021

Summary: Data lakes have been gaining popularity alongside an increase in their sophistication and usability. Despite improvements in performance and data architecture, they still require significant knowledge and experience to deploy and manage. No more scripts, just SQL. How are you using Cuelake in your work at Cuebook?

Building

Building Data Lake Data Warehouse SQL

Fivetran Supports the Automation of the Modern Data Lake on Amazon S3

phData: Data Engineering

APRIL 4, 2023

Today we want to introduce Fivetran’s support for Amazon S3 with Apache Iceberg, investigate some of the implications of this feature, and learn how it fits into the modern data architecture as a whole. Fivetran today announced support for Amazon Simple Storage Service (Amazon S3) with Apache Iceberg data lake format.

Data Lake

Data Lake Amazon Web Services Data Cleanse Data Warehouse

5 Reasons Data Discovery Platforms Are Best For Data Lakes

Monte Carlo

APRIL 1, 2021

Over the past few years, data lakes have emerged as a must-have for the modern data stack. But while the technologies powering our access and analysis of data have matured, the mechanics behind understanding this data in a distributed environment have lagged behind. Data discovery tools and platforms can help.

Data Lake

Data Lake Data Warehouse Unstructured Data Government

What is a Data Mesh?

DataKitchen

AUGUST 3, 2021

The data mesh design pattern breaks giant, monolithic enterprise data architectures into subsystems or domains, each managed by a dedicated team. First-generation – expensive, proprietary enterprise data warehouse and business intelligence platforms maintained by a specialized team drowning in technical debt.

Pharmaceutical

Pharmaceutical Data Lake Data Architecture Architecture

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

Cloudera

JUNE 30, 2022

Over the past decade, Cloudera has enabled multi-function analytics on data lakes through the introduction of the Hive table format and Hive ACID. Companies, on the other hand, have continued to demand highly scalable and flexible analytic engines and services on the data lake, without vendor lock-in.

Data Lake

Data Lake Business Intelligence Metadata Data Warehouse

Data Lake Explained: A Comprehensive Guide to Its Architecture and Use Cases

AltexSoft

AUGUST 29, 2023

In 2010, a transformative concept took root in the realm of data storage and analytics — a data lake. The term was coined by James Dixon, Back-End Java, Data, and Business Intelligence Engineer, and it started a new era in how organizations could store, manage, and analyze their data. What is a data lake?

Data Lake

Data Lake Architecture IT Amazon Web Services

Data Mesh vs Data Lake: Pros, Cons, & How to Decide

Monte Carlo

JANUARY 23, 2023

When it comes to the data community, there’s always a debate broiling about something, and right now “data mesh vs data lake” is right at the top of that list. In this post we compare and contrast the data mesh vs data lake to illustrate the benefits of each and help discover what’s right for your data platform.

Data Lake

Data Lake Architecture Business Intelligence Unstructured Data

Escaping Analysis Paralysis For Your Data Platform With Data Virtualization

Data Engineering Podcast

NOVEMBER 18, 2019

Summary: With the constant evolution of technology for data management, it can seem impossible to make an informed decision about whether to build a data warehouse, or a data lake, or just leave your data wherever it currently rests. How does it influence the relevancy of data warehouses or data lakes?

Data Lake

Data Lake Scala Data Warehouse Hadoop

Evaluating Data Observability Tools: A Comprehensive Guide

Data Engineering Weekly

SEPTEMBER 18, 2024

The Rise of Data Observability: Data observability has become increasingly critical as companies seek greater visibility into their data processes. This growing demand has found a natural synergy with the rise of the data lake. As a result, monitoring data in real time was often an afterthought.

Data Lake

Data Lake Data Pipeline Unstructured Data Data

Data Mesh vs Data Warehouse: 3 Key Differences

Monte Carlo

APRIL 4, 2023

Data mesh vs data warehouse is an interesting framing because it is not necessarily a binary choice depending on what exactly you mean by data warehouse (more on that later). Despite their differences, however, both approaches require high-quality, reliable data in order to function. What is a Data Mesh?

Data Warehouse

Data Warehouse Data Governance Data Architecture

Understanding Modern Data Architecture

Hevo

SEPTEMBER 17, 2024

Organizations have begun to build data warehouses and lakes to analyze large amounts of data for insights and business reports. Often they bring data from multiple data silos into their data lake and also have data stored in particular data stores like NoSQL databases to support different use cases.

Data Architecture

Data Architecture Architecture NoSQL Data Lake

Demystifying Modern Data Platforms

Cloudera

SEPTEMBER 15, 2022

Mark: The first element in the process is the link between the source data and the entry point into the data platform. At Ramsey International (RI), we refer to that layer in the architecture as the foundation, but others call it a staging area, raw zone, or even a source data lake. What is a data fabric?

Data Lake

Data Lake Analytics Application Cloud Storage Architecture

AI Challenges and How Cloudera Can Help

Cloudera

AUGUST 20, 2024

Trusted data is what makes the outputs of AI not just accurate, but impactful in decision making. Ensuring data is trustworthy comes with its own complications. Cloudera’s State of Enterprise AI and Modern Data Architecture survey identified several challenges when it comes to data.

Government

Government Data Lake Data Governance Data Architecture

Open Source Object Storage For All Of Your Data

Data Engineering Podcast

SEPTEMBER 22, 2019

Summary: Object storage is quickly becoming the unifying layer for data-intensive applications and analytics. Modern, cloud-oriented data warehouses and data lakes both rely on the durability and ease of use that it provides.

AWS

AWS Google Cloud Cloud Storage Data Lake

A Better Way to Plan the Payoff of Technical Debt

The Modern Data Company

MARCH 23, 2023

However, to unlock the maximum power of corporate data, it is necessary to mix data from different systems and allow each data source to enhance the others. Various architectures, from data warehouses to data lakes, have attempted to help solve this problem over the years.

Data Lake

Data Lake Data Warehouse Architecture Systems

Centralize Your Data Processes With a DataOps Process Hub

DataKitchen

NOVEMBER 4, 2021

Data organizations often have a mix of centralized and decentralized activity. DataOps concerns itself with the complex flow of data across teams, data centers and organizational boundaries. It expands beyond tools and data architecture and views the data organization from the perspective of its processes and workflows.

Process

Process Data Process Pharmaceutical Data Lake

Connecting the Data Lifecycle

Cloudera

NOVEMBER 29, 2021

Carrefour Spain, a branch of the larger company (with 1,250 stores), processes over 3 million transactions every day, giving rise to challenges like creating and managing a data lake and honing down key demographic information. Working with Cloudera, Carrefour Spain was able to create a unified data lake for ease of data handling.

Data Lake

Data Lake Telecommunication Retail Data

Data Orchestration For Hybrid Cloud Analytics

Data Engineering Podcast

OCTOBER 21, 2019

We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC.

Cloud

Cloud Hadoop Data Lake Programming Language

An Introduction to Disaster Recovery with the Cloudera Data Platform

Cloudera

AUGUST 9, 2022

Data platforms are no longer skunkworks projects or science experiments. As customers import their mainframe and legacy data warehouse workloads, there is an expectation on the platform that it can meet, if not exceed, the resilience of the prior system and its associated dependencies. Conclusion.

Data Lake

Data Lake Data Warehouse Architecture Professional Services

Choose Both: Data Fabric and Data Lakehouse

Cloudera

SEPTEMBER 12, 2022

Combining and analyzing both structured and unstructured data is a whole new challenge to come to grips with, let alone doing so across different infrastructures. Both obstacles can be overcome using modern data architectures, specifically data fabric and data lakehouse. Unified data fabric.

Unstructured Data

Unstructured Data Data Lake Data Architecture Data

Beyond the Hype: Are Data Mesh and Data Fabric just Marchitecture? by Colin Eberhardt

Scott Logic

APRIL 18, 2024

In this episode, Oliver Cronk, Andrew Carr and David Hope talk about the ever-changing world of data, with conversations moving from data warehouse to data lake, and data mesh to data fabric.

Data Lake

Data Lake Data Warehouse Data Architecture Architecture

Educating ChatGPT on Data Lakehouse

Cloudera

MARCH 17, 2023

As the use of ChatGPT becomes more prevalent, I frequently encounter customers and data users citing ChatGPT’s responses in their discussions. I love the enthusiasm surrounding ChatGPT and the eagerness to learn about modern data architectures such as data lakehouses, data meshes, and data fabrics.

Education

Education Unstructured Data Data Lake Data Warehouse

Getting the Most From Your Modern Data Platform: A Three-Phase Approach

Snowflake

JULY 22, 2024

Phase 1 – Migrate: Move your legacy data system to Snowflake. In the Migrate phase, you’ll move your data and workloads to Snowflake, and thereby resolve the cost concerns and performance bottlenecks associated with your legacy data warehouse.

Government

Government Cloud Data Hadoop
