Exploring Processing Patterns For Streaming Data Integration In Your Data Lake

Data Engineering Podcast

In this episode, Ori Rafael shares his experience at Upsolver building scalable stream processing for integrating and analyzing data, and the tradeoffs to expect when coming from a batch-oriented mindset. Can you start by giving an overview of the state of the market for data lakes today?

Data Lake 100

Simplifying Continuous Data Processing Using Stream Native Storage In Pravega with Tom Kaitchuck - Episode 63

Data Engineering Podcast

Preamble: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode.

Writing The Book That Offers A Single Reference For The Fundamentals Of Data Engineering

Data Engineering Podcast

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. What are your goals with this book?

StreamNative Brings Streaming Data To The Cloud Native Landscape With Pulsar

Data Engineering Podcast

Pulsar is a well-engineered and robust platform for building the core of any system that relies on durable access to easily scalable streams of data. What is Pulsar’s role in the lifecycle of data and where does it fit in the overall ecosystem of data tools? Can you start by giving an overview of what Pulsar is?

Cloud 100

An Exploration Of The Expectations, Ecosystem, and Realities Of Real-Time Data Applications

Data Engineering Podcast

In this episode, Shruti Bhat gives her view on the state of the ecosystem for real-time data and the work that she and her team at Rockset are doing to make it easier for engineers to build those experiences. Can you describe what is driving the adoption of real-time analytics?

Beyond Kafka: Conversation with Jark Wu on Fluss - Streaming Storage for Real-Time Analytics

Data Engineering Weekly

Confluent Tableflow can bridge Kafka and Iceberg data, but that is just data movement, which data integration tools like Fivetran or Airbyte can also achieve. On the other hand, Fluss is a Kappa Architecture; it stores one copy of data and presents it as a stream or a table, depending on the use case.

Kafka 74

Building A Data Lake For The Database Administrator At Upsolver

Data Engineering Podcast

What used to be entirely managed by the database engine is now a composition of multiple systems that need to be properly configured to work in concert. In order to bring the DBA into the new era of data management, the team at Upsolver added a SQL interface to their data lake platform. We talked last in November of 2018.

Data Lake 100