The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the underlying technology. In the beginning, there was the data warehouse. The data warehouse (DW) was an approach to data architecture and structured data management that really hit its stride in the early 1990s.
A comparative overview of data warehouses, data lakes, and data marts to help you make informed decisions on data storage solutions for your data architecture.
More than 50% of data leaders recently surveyed by BCG said the complexity of their data architecture is a significant pain point in their enterprise. “As a result,” says BCG, “many companies find themselves at a tipping point, at risk of drowning in a deluge of data, overburdened with complexity and costs.”
It’s not enough for businesses to implement and maintain a data architecture. The unpredictability of market shifts and the evolving use of new technologies mean businesses need more data they can trust than ever to stay agile and make the right decisions.
What used to be bespoke and complex enterprise data integration has evolved into a modern data architecture that orchestrates all the disparate data sources intelligently and securely, even in a self-service manner: a data fabric. Cloudera data fabric and analyst acclaim. Next steps.
Summary Data warehouses have gone through many transformations, from standard relational databases on powerful hardware, to column-oriented storage engines, to the current generation of cloud-native analytical engines. If you are evaluating your options for building or migrating a data platform, then this is definitely worth a listen.
Data storage has been evolving, from databases to data warehouses and expansive data lakes, with each architecture responding to different business and data needs. Traditional databases excelled at structured data and transactional workloads but struggled with performance at scale as data volumes grew.
Over the years, the technology landscape for data management has given rise to various architecture patterns, each thoughtfully designed to cater to specific use cases and requirements. Each of these architectures has its own unique strengths and tradeoffs. Want to see these features in action?
Summary Managing a data warehouse can be challenging, especially when trying to maintain a common set of patterns. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council.
Rethinking data warehousing: Why redefinition is necessary even beyond Modern Data Warehouse (MDW) and Lakehouse Models. Continue reading on Towards Data Science »
Summary The market for data warehouse platforms is large and varied, with options for every use case. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall are the combined events of Graphorum and the Data Architecture Summit.
A fundamental challenge with today’s “data explosion” is finding the best answer to the question, “So where do I put my data?” while avoiding the longer-term problem of data warehouses, […].
Modern data architectures. To eliminate or integrate these silos, the public sector needs to adopt robust data management solutions that support modern data architectures (MDAs). Deploying modern data architectures. Lack of sharing hinders the elimination of fraud, waste, and abuse. (Forrester).
Nowadays, when it comes to data management, every business has to make one critical decision: whether to use a Data Mesh or a Data Warehouse. Both are strong data management architectures, but they are designed to support different needs and various organizational structures.
However, this is still not common in the Data Warehouse (DWH) field. In my recent blog, I researched OLAP technologies; for this post, I chose some open-source technologies and used them together to build a full data architecture for a Data Warehouse system. These days, everyone talks about open source.
Each of these trends claims to be a complete model for data architectures that solves the “everything everywhere all at once” problem. Data teams are confused as to whether they should get on the bandwagon of just one of these trends or pick a combination. First, we describe how data mesh and data fabric could be related.
With instant elasticity, high performance, and secure data sharing across multiple clouds, Snowflake has become highly in demand for its cloud-based data warehouse offering. As organizations adopt Snowflake for business-critical workloads, they also need to look for a modern data integration approach.
CDC tools fuel analytical apps and mission-critical data feeds in banking and regulated industries, with use cases ranging from data synchronization, managing risk, and preventing fraud to driving personalization. This approach simplifies data architecture and enhances performance by reducing data movement and latency.
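To make the change data capture idea concrete, here is a minimal sketch of how downstream code might apply a stream of change events to a target store. The event shape loosely follows the Debezium convention of op/before/after fields; the sample events and the accounts table are hypothetical, and a real CDC feed would arrive from a log-based connector rather than an in-memory list.

```python
# Minimal sketch of applying CDC-style change events to a target store.
# The event shape loosely follows the Debezium convention ("op", "before",
# "after"); the events below are hypothetical examples, not real feed output.

from typing import Any, Dict, List

def apply_change_events(target: Dict[int, Dict[str, Any]],
                        events: List[Dict[str, Any]]) -> None:
    """Apply insert/update/delete events keyed by primary key 'id'."""
    for event in events:
        op = event["op"]
        if op in ("c", "u"):          # create or update: upsert the new row image
            row = event["after"]
            target[row["id"]] = row
        elif op == "d":               # delete: remove the old row image
            target.pop(event["before"]["id"], None)

if __name__ == "__main__":
    accounts: Dict[int, Dict[str, Any]] = {}
    sample_events = [
        {"op": "c", "before": None, "after": {"id": 1, "balance": 100}},
        {"op": "u", "before": {"id": 1, "balance": 100}, "after": {"id": 1, "balance": 250}},
        {"op": "d", "before": {"id": 1, "balance": 250}, "after": None},
    ]
    apply_change_events(accounts, sample_events)
    print(accounts)  # {} -- the row was created, updated, then deleted
```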
In this post, we will be particularly interested in the impact that cloud computing has had on the modern data warehouse. We will explore the different options for data warehousing and how you can leverage this information to make the right decisions for your organization. Understanding the Basics: What is a Data Warehouse?
When I heard the words ‘decentralised data architecture’, I was left utterly confused at first! In my then limited experience as a Data Engineer, I had only come across centralised data architectures, and they seemed to be working very well. Result: the data warehouse was born. So what was missing?
Key Differences Between AI Data Engineers and Traditional Data Engineers: While traditional data engineers and AI data engineers have similar responsibilities, they ultimately differ in where they focus their efforts. Data Storage Solutions: As we all know, data can be stored in a variety of ways.
Data and AI architecture matter. “Before focusing on AI/ML use cases such as hyper-personalization and fraud prevention, it is important that the data and data architecture are organized and structured in a way which meets the requirements and standards of the local regulators around the world.”
This includes modeling the lifecycle of your information as a pipeline from the raw, messy, loosely structured records in your data lake, through a series of transformations, and ultimately to your data warehouse. Can you walk through the stages of an ideal lifecycle for data within the context of an organization’s uses for it?
The data warehouse is the foundation of the modern data stack, so it caught our attention when we saw Convoy head of data Chad Sanderson declare, “the data warehouse is broken” on LinkedIn. Treating data like an API. Immutable data warehouses have challenges too.
Data mesh vs. data warehouse is an interesting framing because it is not necessarily a binary choice, depending on what exactly you mean by data warehouse (more on that later). Despite their differences, however, both approaches require high-quality, reliable data in order to function. What is a Data Mesh?
Sign up free at dataengineeringpodcast.com/rudderstack - Your host is Tobias Macey, and today I'm interviewing Satish Jayanthi about the practice and promise of building a column-aware data architecture through intentional modeling. Interview Introduction: How did you get involved in the area of data management?
Establishing a secure, organized and privacy-enabled data warehouse is foundational for successful data collaboration through clean rooms. And that 10% uplift in shop conversion should result in about a 20% uplift in revenue in 2025. Accelerate Advertising, Media & Entertainment 1. Missed the events?
Versioning also ensures a safer experimentation environment, where data scientists can test new models or hypotheses on historical data snapshots without impacting live data. Note: Cloud data warehouses like Snowflake and BigQuery already have a default time travel feature. FAQs: What is a Data Lakehouse?
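As a minimal sketch of what that time travel feature can look like in practice, the snippet below issues a Snowflake AT(OFFSET => ...) query through the snowflake-connector-python package. The connection parameters and the orders table are placeholders, not real resources; BigQuery offers an analogous FOR SYSTEM_TIME AS OF clause.

```python
# Minimal sketch of querying a historical snapshot with Snowflake's time
# travel feature via snowflake-connector-python. The connection parameters
# and the "orders" table are placeholders for illustration only.

import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",      # placeholder -- fill in real credentials
    user="your_user",
    password="your_password",
    warehouse="analytics_wh",
    database="analytics",
    schema="public",
)

try:
    cur = conn.cursor()
    # Read the table as it looked one hour ago, without touching live data.
    cur.execute("SELECT COUNT(*) FROM orders AT(OFFSET => -3600)")
    print("row count one hour ago:", cur.fetchone()[0])
finally:
    conn.close()
```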
Anyways, I wasn’t paying enough attention during university classes, and today I’ll walk you through data layers using — guess what — an example. Business Scenario & Data Architecture. Imagine this: next year, a new team on the grid, Red Thunder Racing, will call us (yes, me and you) to set up their new data infrastructure.
Over the course of this journey, HomeToGo’s data needs have evolved considerably. It also came with other advantages, such as independence from cloud infrastructure providers, data recovery features such as Time Travel, and zero-copy cloning, which made setting up several environments — such as dev, stage or production — way more efficient.
Summary The flexibility of software-oriented data workflows is useful for fulfilling complex requirements, but for simple and repetitious use cases it adds significant complexity. Coalesce is a platform designed to reduce repetitive work for common workflows by adopting a visual pipeline builder to support your data warehouse transformations.
The data mesh design pattern breaks giant, monolithic enterprise data architectures into subsystems or domains, each managed by a dedicated team. The past decades of enterprise data platform architectures can be summarized in 69 words. Introduction to Data Mesh. Source: Thoughtworks.
Data pipelines are the backbone of your business’s data architecture. Implementing a robust and scalable pipeline ensures you can effectively manage, analyze, and organize your growing data. Understanding the essential components of data pipelines is crucial for designing efficient and effective data architectures.
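As an illustration of those essential components, here is a deliberately small extract/transform/load sketch in plain Python. The CSV file names and the order_id/amount columns are made up for the example; a production pipeline would add orchestration, monitoring, and incremental loading on top of this skeleton.

```python
# Illustrative sketch of the three core pipeline stages (extract, transform,
# load) using plain Python and CSV files. File names and columns are made up.

import csv
from typing import Dict, Iterable, Iterator

def extract(path: str) -> Iterator[Dict[str, str]]:
    """Read raw records from a source file."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows: Iterable[Dict[str, str]]) -> Iterator[Dict[str, object]]:
    """Clean and normalize each record; drop malformed rows."""
    for row in rows:
        if not row.get("order_id"):
            continue                      # skip records with no primary key
        yield {"order_id": row["order_id"],
               "amount": float(row.get("amount", 0) or 0)}

def load(rows: Iterable[Dict[str, object]], path: str) -> None:
    """Write the cleaned records to the destination."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["order_id", "amount"])
        writer.writeheader()
        writer.writerows(rows)

if __name__ == "__main__":
    # Assumes a raw_orders.csv file with order_id and amount columns exists.
    load(transform(extract("raw_orders.csv")), "clean_orders.csv")
```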
Organizations have begun to build data warehouses and lakes to analyze large amounts of data for insights and business reports. Oftentimes they bring data from multiple data silos into their data lake, and they also have data stored in particular data stores like NoSQL databases to support different use cases.
Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values, before it gets merged to production. Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows.
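The core idea behind that kind of data diff can be illustrated with a toy sketch: compare two versions of a table at the summary level (row counts) and at the row level (added, removed, and changed keys). This is only a simplified illustration of the concept, not Datafold's actual implementation, and the example tables are hypothetical.

```python
# Toy illustration of the "data diff" idea: compare two versions of a table
# both at the summary level and row by row, keyed by primary key.

from typing import Any, Dict

def diff_tables(before: Dict[int, Dict[str, Any]],
                after: Dict[int, Dict[str, Any]]) -> Dict[str, Any]:
    added = [k for k in after if k not in before]
    removed = [k for k in before if k not in after]
    changed = [k for k in before if k in after and before[k] != after[k]]
    return {
        "row_count_before": len(before),
        "row_count_after": len(after),
        "added_keys": added,
        "removed_keys": removed,
        "changed_keys": changed,
    }

if __name__ == "__main__":
    before = {1: {"amount": 10.0}, 2: {"amount": 20.0}}
    after = {1: {"amount": 10.0}, 2: {"amount": 25.0}, 3: {"amount": 5.0}}
    print(diff_tables(before, after))
    # {'row_count_before': 2, 'row_count_after': 3, 'added_keys': [3],
    #  'removed_keys': [], 'changed_keys': [2]}
```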
These lakes power mission-critical, large-scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. In recent years, the term “data lakehouse” was coined to describe this architectural pattern of tabular analytics over data in the data lake.
This conversation was useful for getting a better idea of the challenges that exist in large-scale data analytics, and the current state of the tradeoffs between data lakes and data warehouses in the cloud. Coming up this fall are the combined events of Graphorum and the Data Architecture Summit.
In this context, data management in an organization is key to the success of its projects involving data. One of the main aspects of correct data management is the definition of a data architecture. The Lakehouse architecture was one of them.
Summary Data lakes have been gaining popularity alongside an increase in their sophistication and usability. Despite improvements in performance and data architecture, they still require significant knowledge and experience to deploy and manage. The data you’re looking for is already in your data warehouse and BI tools.
By running data warehouse and data engineering workloads on Snowflake’s Data Cloud, Ramp improves performance and user experience while delivering powerful insights to customers quickly. How do you scale seamlessly, without worrying about keeping the lights on?
Datafold also helps automate regression testing of ETL code with its Data Diff feature that instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. What are the driving factors for building a real-time data platform?
Companies, on the other hand, have continued to demand highly scalable and flexible analytic engines and services on the data lake, without vendor lock-in. Organizations want modern data architectures that evolve at the speed of their business, and we are happy to support them with the first open data lakehouse.
Fivetran is a platform that does the hard work for you and replicates information from your source systems into whichever data warehouse you use. Upcoming events include the O’Reilly AI Conference, the Strata Data Conference, and the combined events of the Data Architecture Summit and Graphorum.