
Data Integrity for AI: What’s Old is New Again

Precisely

The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the underlying technology. In the beginning, there was a data warehouse. The data warehouse (DW) was an approach to data architecture and structured data management that really hit its stride in the early 1990s.


Keep Your Data Lake Fresh With Real Time Streams Using Estuary

Data Engineering Podcast

Summary: Batch vs. streaming is a long-running debate in the world of data integration and transformation. In this episode, David Yaffe and Johnny Graettinger share the story behind Estuary's business and technology, and how you can start using it today to build a real-time data lake without all of the headaches.


Trending Sources


Exploring Processing Patterns For Streaming Data Integration In Your Data Lake

Data Engineering Podcast

Summary: One of the perennial challenges posed by data lakes is how to keep them up to date as new data is collected. In this episode, Ori Rafael shares his experience at Upsolver building scalable stream processing for integrating and analyzing data, and the tradeoffs involved when coming from a batch-oriented mindset.


Straining Your Data Lake Through A Data Mesh

Data Engineering Podcast

Summary: The current trend in data management is to centralize the responsibilities of storing and curating the organization's information within a data engineering team. This organizational pattern is reinforced by the architectural pattern of data lakes as a solution for managing storage and access.


Data Warehouse vs. Data Lake

Precisely

Data warehouse vs. data lake: each has its own unique advantages and disadvantages, and it's helpful to understand their similarities and differences. In this article, we'll focus on the data lake vs. the data warehouse.


Self Service Real Time Data Integration Without The Headaches With Meroxa

Data Engineering Podcast

Summary: Analytical workloads require a well-engineered and well-maintained data integration process to ensure that your information is reliable and up to date. Building a real-time pipeline for your data lakes and data warehouses is a non-trivial effort, requiring a substantial investment of time and energy.


Do Away With Data Integration Through A Dataware Architecture With Cinchy

Data Engineering Podcast

Summary: The reason so much time and energy is spent on data integration is that our applications are designed to own the data they generate, which means we have to go through the trouble of extracting that information before it can be used elsewhere. What is Zero-Copy Integration?