Kafka - Data Engineering Digest

Troubleshooting Kafka In Production

Data Engineering Podcast

DECEMBER 24, 2023

Summary: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Can you describe your experiences with Kafka? What are the operational challenges that you have had to overcome while working with Kafka? When is Kafka the wrong choice?

Kafka

Kafka Data Lake High Quality Data SQL

Kafka to MongoDB: Building a Streamlined Data Pipeline

Analytics Vidhya

FEBRUARY 28, 2024

We know that streaming data is data that is emitted at high volume […] IT industries rely heavily on real-time insights derived from streaming data sources.

MongoDB

MongoDB Data Pipeline Kafka Building

A Detailed Guide of Interview Questions on Apache Kafka

Analytics Vidhya

APRIL 28, 2023

Introduction: Apache Kafka is an open-source publish-subscribe messaging application initially developed at LinkedIn and open-sourced in early 2011. It is a message broker application and a logging service that is distributed, segmented, and […]

Kafka

Kafka Scala Coding Data Process

Beyond Kafka: Conversation with Jark Wu on Fluss - Streaming Storage for Real-Time Analytics

Data Engineering Weekly

FEBRUARY 18, 2025

Fluss addresses many of Kafka's challenges in analytical infrastructure. The combination of Kafka and Flink is not a perfect fit for real-time analytics; the integration of Kafka and Lakehouse is very shallow. How do you compare Fluss with Apache Kafka? Fluss and Kafka differ fundamentally in design principles.

Kafka

Kafka Lambda Architecture SQL Architecture

Kafka Summit Americas 2021 Recap

Confluent

SEPTEMBER 16, 2021

The full inventory of three online Kafka Summits in 2021 is now complete. Kafka Summit Americas wrapped just yesterday. Being a part of the event team and the Program Committee, […].

Kafka

Kafka Programming

Introducing Apache Kafka 3.7

Confluent

FEBRUARY 27, 2024

Apache Kafka 3.7 introduces updates to the Consumer rebalance protocol, an official Apache Kafka Docker image, JBOD support in KRaft-based clusters, and more!

Kafka

Kafka Streams Fundamentals

Confluent

OCTOBER 28, 2021

Kafka Streams is an abstraction over Apache Kafka® producers and consumers that lets you forget about low-level details and focus on processing your Kafka data. You could of course write […].

Kafka

Kafka Process Data

Introducing Apache Kafka® 3.8

Confluent

JULY 29, 2024

Apache Kafka 3.8 adds 17 new KIPs (13 for Core, 3 for Streams & 1 for Connect). Highlights include 2 new Docker images, the ability to set task assignors, and more!

Kafka

Apache Kafka Made Simple: A First Glimpse of a Kafka Without ZooKeeper

Confluent

MARCH 30, 2021

At the heart of Apache Kafka® sits the log—a simple data structure that uses sequential operations that work symbiotically with the underlying hardware. Efficient disk buffering and CPU cache usage, […].

Kafka

Kafka Data

What’s New in Apache Kafka 3.0.0

Confluent

SEPTEMBER 21, 2021

I’m pleased to announce the release of Apache Kafka 3.0 on behalf of the Apache Kafka® community. Apache Kafka 3.0 is a major release in more ways than one. Apache […].

Kafka

What’s New in Apache Kafka 3.2.0

Confluent

MAY 17, 2022

I’m proud to announce the release of Apache Kafka 3.2.0 on behalf of the Apache Kafka® community. This release contains many new features and improvements. This blog will highlight […].

Kafka

Top 5 Things Every Kafka Developer Should Know

Confluent

OCTOBER 16, 2020

Apache Kafka® is an event streaming platform used by more than 30% of the Fortune 500 today. There are numerous features of Kafka that make it the de-facto standard for […].

Kafka

Kafka IT

Introducing Apache Kafka® 4.0

Confluent

MARCH 18, 2025

Major milestone release Apache Kafka 4.0 removes ZooKeeper entirely, provides early access to Queues for Kafka, and enables faster rebalances, in addition to many other new KIPs.

Kafka

Kafka Accessibility Accessible

Amazon Kinesis is not Apache Kafka

Waitingforcode

MAY 4, 2023

The managed cloud services often share the same fundamentals as their open source alternatives. However, there is always something different. Today I'll focus on these differences for the Amazon Kinesis service and the Apache Kafka ecosystem.

Kafka

Kafka Cloud Management

How to Better Manage Apache Kafka by Creating Kafka Messages from within Control Center

Confluent

JUNE 11, 2021

Managing Apache Kafka® clusters can be tricky sometimes. To solve this problem, Confluent Control Center helps you easily manage and monitor your clusters and interact with other Confluent components, such […].

Kafka

Kafka Management

Lessons Learned from Running Apache Kafka at Scale at Pinterest

Confluent

FEBRUARY 22, 2021

Apache Kafka® is at the heart of the data transportation layer at Pinterest. The amount of data that runs through Kafka has constantly grown over the years. This growth sometimes […].

Kafka

Kafka Transportation Data

What’s New in Apache Kafka 2.8

Confluent

APRIL 19, 2021

I’m proud to announce the release of Apache Kafka 2.8.0 on behalf of the Apache Kafka® community. This release contains many new features and improvements. This blog post highlights […].

Kafka

How Producers Work: Kafka Producer and Consumer Internals, Part 1

Confluent

SEPTEMBER 5, 2024

Dive into Kafka internals with a four-part series examining client requests and brokers. Part 1 covers what a producer does to prepare raw event data for the broker.

Kafka

Kafka Data

Best Tutorials for Getting Started with Apache Kafka

Confluent

DECEMBER 3, 2021

Each one of the more than 50 tutorials for Apache Kafka® on Confluent Developer answers a question that you might ask a knowledgeable friend or colleague about Kafka and its […].

Kafka

Kafka IT

Queues in Apache Kafka®: Enhancing Message Processing and Scalability

Confluent

DECEMBER 19, 2024

Queue support in Apache Kafka 4.0, enabled by share groups, lets you accommodate traditional queue-type workloads through cooperative consumption.

Kafka

Kafka Process

5 Common Pitfalls When Using Apache Kafka

Confluent

JANUARY 11, 2022

Whether you’re a seasoned Apache Kafka® developer or just getting started, you’re likely to hit a snag at some point or another—either in configuring and understanding your clients or setting […].

Kafka

Kora: The Cloud Native Engine for Apache Kafka

Confluent

MAY 15, 2023

Take a tour of the internals of Confluent’s Apache Kafka® service, powered by Kora: the next-generation, cloud-native streaming engine.

Kafka

Kafka Cloud Engineering

42 Things You Can Stop Doing Once ZooKeeper Is Gone from Apache Kafka

Confluent

FEBRUARY 18, 2021

Soon, Apache Kafka® will no longer need ZooKeeper! With KIP-500, Kafka will include its own built-in consensus layer, removing the ZooKeeper dependency altogether. The next big milestone in this effort […].

Kafka

Kafka IT

Min rate limits for Apache Kafka

Waitingforcode

FEBRUARY 20, 2024

You can limit the max throughput for Apache Spark Structured Streaming jobs for popular data sources such as Apache Kafka, Delta Lake, or raw files. Did you know that you can also control the lower limit, at least for Apache Kafka? I bet you know it already.

Kafka

Kafka IT Data

Stream Rows and Kafka Topics Directly into Snowflake with Snowpipe Streaming

Snowflake

MARCH 2, 2023

As part of this, we are also supporting Snowpipe Streaming as an ingestion method for our Snowflake Connector for Kafka. Now we are able to ingest our data in near real time directly from Kafka topics to a Snowflake table, drastically reducing the cost of ingestion and improving our SLA from 15 minutes to within 60 seconds.

Kafka

Kafka Data Ingestion Data Pipeline Cloud Storage

Spring for Apache Kafka 101

Confluent

OCTOBER 19, 2021

Extensive out-of-the-box functionality, a large user community, and up-to-date, cloud-native features make Spring and its libraries a strong option for anchoring your Apache Kafka® and Confluent Cloud based microservices architecture. […].

Kafka

Kafka Architecture Cloud IT

4 Must-Have Tests for Your Apache Kafka CI/CD with GitHub Actions

Confluent

JULY 26, 2022

Explore GitHub Actions for your Kafka CI/CD pipeline, automate Schema Registry, and transform the development and testing of Kafka client applications.

Kafka

Confluent + Immerok: Cloud Native Kafka Meets Cloud Native Flink

Confluent

JANUARY 6, 2023

Introducing fully managed Apache Kafka® + Flink for the most robust, cloud-native data streaming platform with stream processing, integration, and streaming analytics in one.

Kafka

Kafka Cloud Management Process

Node.js ❤️ Apache Kafka – Getting Started with KafkaJS

Confluent

FEBRUARY 8, 2021

One of the great things about using an Apache Kafka® based architecture is that it naturally decouples systems and allows you to use the best tool for the job. While […].

Kafka

Kafka Architecture Systems IT

What’s New in Apache Kafka 3.3

Confluent

OCTOBER 3, 2022

Apache Kafka 3.3 includes KRaft mode, improves partition scalability and resiliency while simplifying Kafka deployment, as well as updates to Kafka Streams, Connect, and more.

Kafka

Change Data Capture Using Debezium Kafka and Pg

Start Data Engineering

MAY 9, 2020

Change data capture is a software design pattern used to capture changes to data and take corresponding action based on that change. The change to data is usually one of read, update or delete. The corresponding action usually is supposed to occur in another system in response to the change that was made in the source system.

Kafka

Kafka Data Designing Systems

Building a Telegram Bot Powered by Apache Kafka and ksqlDB

Confluent

DECEMBER 2, 2022

ksqlDB use case: see how apps can use ksqlDB to ingest, filter, enrich, aggregate, and query data directly with Kafka—no complex architectures or data stores needed.

Kafka

Kafka Building Architecture Data

How Apache Kafka Works: An Introduction to Kafka’s Internals

Confluent

APRIL 14, 2022

It’s not difficult to get started with Apache Kafka®. If you are new to Kafka, […]. Learning resources can be found all over the internet, especially on the Confluent Developer site.

Kafka

Confluent Will Beat Your Cost of Running Kafka (or $100 on us)

Confluent

MAY 11, 2023

Running Kafka is costly, but Confluent has created a far more efficient product to lower your costs. Join the Cost Savings challenge to see for yourself.

Kafka

Deep Dive into Handling Consumer Fetch Requests: Kafka Producer and Consumer Internals, Part 4

Confluent

NOVEMBER 25, 2024

In the final article of this four-part series on Kafka producer and consumer internals, observe the inner workings of brokers as they attempt to serve data up to consumers.

Kafka

Kafka Data

From Eager to Smarter in Apache Kafka Consumer Rebalances

Confluent

DECEMBER 2, 2022

Major improvements to the Kafka consumer, Streams, and ksqlDB for incremental cooperative rebalancing while maintaining at-least-once and exactly-once guarantees.

Kafka

Kafka Process

How to Run Apache Kafka on Windows

Confluent

SEPTEMBER 7, 2023

Kafka-on-Windows tutorials are everywhere, but most run Kafka directly on Windows. Here's how to use Kafka on Windows in a Linux environment backed by WSL 2, maximizing performance and stability.

Kafka

Getting Started with Apache Kafka in Python

Confluent

DECEMBER 8, 2021

Welcome Pythonistas to the streaming data world centered around Apache Kafka®! If you’re using Python and ready to get hands-on with Kafka, then you’re in the right place. This blog […].

Kafka

Kafka Python Data

Transactional Machine Learning at Scale with MAADS-VIPER and Apache Kafka

Confluent

DECEMBER 11, 2020

This blog post shows how transactional machine learning (TML) integrates data streams with automated machine learning (AutoML), using Apache Kafka® as the data backbone, to create a frictionless machine learning […].

Machine Learning

Machine Learning Kafka Data Programming

Building a Formula 1 Streaming Data Pipeline With Kafka and Risingwave

KDnuggets

SEPTEMBER 5, 2023

Build a streaming data pipeline using Formula 1 data, Python, Kafka, RisingWave as the streaming database, and visualize all the real-time data in Grafana.

Data Pipeline

Data Pipeline Kafka Building Python

Setting Up Kafka Multi-Tenancy

DoorDash Engineering

MARCH 27, 2024

At DoorDash, we rely on message queue systems based on Kafka to handle billions of real-time events. We will delve here into how we set up multi-tenancy with a messaging queue system based on Kafka. In Kafka, a test tenant processing production events can cause data inconsistencies as well as outages and other incidents.

Kafka

Kafka Architecture Algorithm Process

Real-Time Wildlife Monitoring with Apache Kafka

Confluent

AUGUST 17, 2022

Confluent Hackathon ‘22: Using Apache Kafka, a Raspberry Pi, and a camera, Simon Aubury builds a detection and monitoring system to better understand wildlife population trends over time.

Kafka

Kafka Systems Building

Dynamic vs. Static Consumer Membership in Apache Kafka

Confluent

FEBRUARY 15, 2023

There are two main consumer group memberships in Apache Kafka®. Here’s how static and dynamic consumer groups work, how they affect rebalancing, and which to choose for your application.

Kafka

Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot

Uber Engineering

SEPTEMBER 23, 2021

This article focuses on how we … With this new ability came new challenges that needed to be solved at Uber, such as systems for ad auctions, bidding, attribution, reporting, and more.

Kafka

Kafka Process Systems Engineering
