Designing A Non-Relational Database Engine

Data Engineering Podcast

Summary Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication.

Reconciling The Data In Your Databases With Datafold

Data Engineering Podcast

Summary A significant portion of data workflows involve storing and processing information in database engines. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data. Data lakes are notoriously complex.

Database 147

Trending Sources

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

Summary Building a database engine requires a substantial amount of engineering effort and time investment. In this episode he explains how he used the combination of Apache Arrow, Flight, DataFusion, and Parquet to lay the foundation of the newest version of his time-series database. Data lakes are notoriously complex.

Database 162

Troubleshooting Kafka In Production

Data Engineering Podcast

Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack You shouldn't have to throw away the database to build with fast-changing data. It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products. With Materialize, you can!

Kafka 245

Monte Carlo Recognized as the #1 Leader in Data Observability and Data Quality by G2

Monte Carlo

We've always focused on delivering exceptional customer success and improving data quality across the entire data stack, and it's rewarding to know that hard work continues to translate to meaningful outcomes for our customers.

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.

SQL 173

How Meta discovers data flows via lineage at scale

Engineering at Meta

In order to build high-quality data lineage, we developed different techniques to collect data flow signals across different technology stacks, including static code analysis for different languages, runtime instrumentation, and input and output data matching.