Java and Kafka - Data Engineering Digest

Kafka vs RabbitMQ - A Head-to-Head Comparison for 2025

ProjectPro

JUNE 6, 2025

As a big data architect or developer working with microservices-based systems, you might often face a dilemma over whether to use Apache Kafka or RabbitMQ for messaging. RabbitMQ vs. Kafka - which one is the better message broker? Table of Contents: Kafka vs. RabbitMQ - An Overview; What is RabbitMQ?; What is Kafka?

Kafka

Kafka Java Big Data Architecture

Java vs Python for Data Science in 2025 - What's your choice?

ProjectPro

JUNE 6, 2025

Why do data scientists prefer Python over Java? Java vs Python for Data Science - which is better? Which has a better future: Python or Java in 2023? This blog aims to answer all questions on how Java and Python compare for data science and which should be the programming language of your choice for doing data science in 2023.

Java

Java Data Science Python Programming Language

100+ Kafka Interview Questions and Answers for 2025

ProjectPro

JUNE 6, 2025

Your search for Apache Kafka interview questions ends right here! Let us now dive directly into the Apache Kafka interview questions and answers and help you get started with your Big Data interview preparation! What are topics in Apache Kafka? A stream of messages that belong to a particular category is called a topic in Kafka.

Kafka

Kafka Bytes Big Data Java

How To Learn Apache Kafka By Doing in 2025

ProjectPro

JUNE 6, 2025

Looking for the ultimate guide on mastering Apache Kafka in 2024? The ultimate hands-on learning guide with secrets on how you can learn Kafka by doing. Discover the key resources to help you master the art of real-time data streaming and building robust data pipelines with Apache Kafka. How Difficult Is It To Learn Kafka?

Kafka

Kafka Java Big Data Data Pipeline

How to Tune RocksDB for Your Kafka Streams Application

Confluent

MARCH 10, 2021

Apache Kafka ships with Kafka Streams, a powerful yet lightweight client library for Java and Scala to implement highly scalable and elastic applications and microservices that process and analyze data […].

Kafka

Kafka Scala Java Process

How to Get Started with Kafka Topics: A Beginner's Guide

ProjectPro

JUNE 6, 2025

Kafka Topics are your trusty companions. Learn how Kafka Topics simplify the complex world of big data processing in this comprehensive blog. More than 80% of all Fortune 100 companies trust and use Kafka. The meteoric rise of Apache Kafka's popularity is no accident, as it plays a crucial role in data engineering.

Kafka

Kafka Big Data Python Java

Stream Rows and Kafka Topics Directly into Snowflake with Snowpipe Streaming

Snowflake

MARCH 2, 2023

As part of this, we are also supporting Snowpipe Streaming as an ingestion method for our Snowflake Connector for Kafka. Now we are able to ingest our data in near real time directly from Kafka topics to a Snowflake table, drastically reducing the cost of ingestion and improving our SLA from 15 minutes to within 60 seconds.

Kafka

Kafka Data Ingestion Data Pipeline Cloud Storage

Top Apache Kafka Certifications for Data Professionals

ProjectPro

JUNE 6, 2025

Today, Kafka is used by thousands of companies, including over 80% of the Fortune 100. Kafka's popularity is skyrocketing, and for good reason—it helps organizations manage real-time data streams and build scalable data architectures. As a result, there's a growing demand for highly skilled professionals in Kafka.

Kafka

Kafka Certification AWS Retail

How to Use Apache Kafka for Real-Time Data Streaming?

ProjectPro

JUNE 6, 2025

If you’re looking for everything a beginner needs to know about using Apache Kafka for real-time data streaming, you’ve come to the right place. This blog post explores the basics about Apache Kafka and its uses, the benefits of utilizing real-time data streaming, and how to set up your data pipeline. Let's dive in.

Kafka

Kafka Hadoop Big Data Data Warehouse

My Python/Java/Spring/Go/Whatever Client Won’t Connect to My Apache Kafka Cluster in Docker/AWS/My Brother’s Laptop. Please Help!

Confluent

JUNE 9, 2020

tl;dr When a client wants to send or receive a message from Apache Kafka®, there are two types of connection that must succeed: The initial connection to a broker (the […].

Kafka

Kafka Java Python AWS

Bust the Burglars – Machine Learning with TensorFlow and Apache Kafka

Confluent

JULY 16, 2019

How cool would it be to build your own burglar alarm system that can alert you before the actual event takes place simply by using a few network-connected cameras and analyzing the camera images with Apache Kafka®, Kafka Streams, and TensorFlow? Uploading your images into Kafka. Setting up your burglar alarm.

Kafka

Kafka Machine Learning Java Scala

The Rise of Managed Services for Apache Kafka

Confluent

SEPTEMBER 20, 2019

As a distributed system for collecting, storing, and processing data at scale, Apache Kafka® comes with its own deployment complexities. To simplify all of this, different providers have emerged to offer Apache Kafka as a managed service. Before Confluent Cloud was announced, a managed service for Apache Kafka did not exist.

Kafka

Kafka Management Cloud AWS

12 Programming Languages Walk into a Kafka Cluster…

Confluent

APRIL 23, 2019

When it was first created, Apache Kafka® had a client API for just Scala and Java. Since then, the Kafka client API has been developed for many other programming languages, which enables you to pick the language you want. At Confluent, we have an engineering team dedicated to the development of these Kafka clients.

Programming Language

Programming Language Kafka Programming Scala

Getting Started with Rust and Apache Kafka

Confluent

OCTOBER 24, 2019

I’ve written an event sourcing bank simulation in Clojure (a Lisp built for Java virtual machines or JVMs) called open-bank-mark, which you are welcome to read about in my previous blog post explaining the story behind this open source example. The schemas are also useful for generating specific Java classes. The bank application.

Kafka

Kafka Java Bytes Banking

Apache Kafka Vs Apache Spark: Know the Differences

Knowledge Hut

MAY 3, 2024

Spark Streaming vs. Kafka Streams: Now that we understand at a high level what these tools are, it is natural to be curious about the differences between them. Spark Streaming: data received from live input data streams is divided into micro-batches for processing. Spark Streaming is a standalone framework.

Kafka

Kafka Scala Java Amazon Web Services

The Importance of Distributed Tracing for Apache-Kafka-Based Applications

Confluent

MARCH 26, 2019

Apache-Kafka®-based applications stand out for their ability to decouple producers and consumers using an event log as an intermediate layer. This article describes how to instrument Kafka-based applications with distributed tracing capabilities in order to make dataflows between event-based components more visible.

Kafka

Kafka Transportation Metadata Consulting

Real-Time Analytics and Monitoring Dashboards with Apache Kafka and Rockset

Confluent

SEPTEMBER 26, 2019

In the early days, many companies simply used Apache Kafka® for data ingestion into Hadoop or another data lake. However, Apache Kafka is more than just messaging. Some Kafka and Rockset users have also built real-time e-commerce applications, for example, using Rockset’s Java, Node.js

Kafka

Kafka BI SQL Hadoop

Designing the .NET API for Apache Kafka

Confluent

JUNE 27, 2019

Confluent’s clients for Apache Kafka® recently passed a major milestone: the release of version 1.0. Magnus Edenhill first started developing librdkafka about seven years ago, later joining Confluent in the very early days to help foster the community of Kafka users outside the Java ecosystem. Leading up to the 1.0

Kafka

Kafka Designing Java Coding

Kafka Connect Deep Dive – JDBC Source Connector

Confluent

FEBRUARY 12, 2019

One of the most common integrations that people want to do with Apache Kafka® is getting data in from a database. The existing data in a database, and any changes to that data, can be streamed into a Kafka topic. Here, I’m going to dig into one of the options available: the JDBC connector for Kafka Connect. Introduction.

Kafka

Kafka MySQL Bytes Java

Getting started with the MongoDB Connector for Apache Kafka and MongoDB

Confluent

JULY 17, 2019

Together, MongoDB and Apache Kafka® make up the heart of many modern data architectures today. Integrating Kafka with external systems like MongoDB is best done through the use of Kafka Connect. The official MongoDB Connector for Apache Kafka is developed and supported by MongoDB engineers. Getting started.

MongoDB

MongoDB Kafka Database Medical

Deploying Kafka Streams and KSQL with Gradle – Part 3: KSQL User-Defined Functions and Kafka Streams

Confluent

JULY 10, 2019

As discussed in part 2, I created a GitHub repository with Docker Compose functionality for starting a Kafka and Confluent Platform environment, as well as the code samples mentioned below. We used Groovy instead of Java to write our UDFs, so we’ve applied the groovy plugin. gradlew composeUp. Note: When executing ./gradlew

Kafka

Kafka Java Bytes SQL

A Beginners Guide to Spark Streaming Architecture with Example

ProjectPro

JUNE 6, 2025

Table of Contents: What is Spark Streaming?; Apache Spark Streaming Use Cases; Spark Streaming Architecture: Discretized Streams; Spark Streaming Example in Java; Spark Streaming vs. Structured Streaming; What is Kafka Streaming?; Kafka Stream vs. Spark Streaming.

Architecture

Architecture Kafka Java Scala

Fault Tolerance in Distributed Systems: Tracing with Apache Kafka and Jaeger

Confluent

JULY 24, 2019

Using Jaeger tracing, I’ve been able to answer an important question that nearly every Apache Kafka® project that I’ve worked on posed: how is data flowing through my distributed system? Distributed tracing with Apache Kafka and Jaeger. Example of a Kafka project with Jaeger tracing. What does this all mean?

Kafka

Kafka Systems Bytes Project

Easier Stream Processing On Kafka With ksqlDB

Data Engineering Podcast

MARCH 2, 2020

The ksqlDB project was created to address this state of affairs by building a unified layer on top of the Kafka ecosystem for stream processing. Developers can work with the SQL constructs that they are familiar with while automatically getting the durability and reliability that Kafka offers. How is ksqlDB architected?

Kafka

Kafka Process PostgreSQL MySQL

How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka

Confluent

NOVEMBER 29, 2022

Apache Kafka’s Streams API embeds Machine Learning into any app or microservice (Java, Docker, Kubernetes, etc.) to add business value.

Machine Learning

Machine Learning Kafka Java Building

Dawn of Kafka DevOps: Managing Multi-Cluster Kafka Connect and KSQL with Confluent Control Center

Confluent

MAY 8, 2019

In anything but the smallest deployment of Apache Kafka®, there are often going to be multiple clusters of Kafka Connect and KSQL. Kafka Connect rebalances when connectors are added/removed, and this can impact the performance of other connectors on the same cluster. Streaming data into Kafka with Kafka Connect.

Kafka

Kafka Management Hadoop Java

All About the Kafka Connect Neo4j Sink Plugin

Confluent

FEBRUARY 28, 2019

Only a little more than one month after the first release, we are happy to announce another milestone for our Kafka integration. Today, you can grab the Kafka Connect Neo4j Sink from Confluent Hub. Neo4j extension – Kafka sink refresher. Testing the Kafka Connect Neo4j Sink. curl -X POST [link]. jar -f AVRO -e 100000.

Kafka

Kafka Java Programming Language Data Pipeline

Building Shared State Microservices for Distributed Systems Using Kafka Streams

Confluent

AUGUST 1, 2019

The Kafka Streams API boasts a number of capabilities that make it well suited for maintaining the global state of a distributed system. At Imperva, we took advantage of Kafka Streams to build shared state microservices that serve as fault-tolerant, highly available single sources of truth about the state of objects in our system.

Kafka

Kafka Systems Building Metadata

Internet of Things (IoT) and Event Streaming at Scale with Apache Kafka and MQTT

Confluent

OCTOBER 10, 2019

Apache Kafka® and its surrounding ecosystem, which includes Kafka Connect, Kafka Streams, and KSQL, have become the technology of choice for integrating and processing these kinds of datasets. Microservices, Apache Kafka, and Domain-Driven Design (DDD) covers this in more detail. Example: Severstal. High throughput.

Kafka

Kafka Google Cloud Architecture Java

Spring for Apache Kafka Deep Dive – Part 3: Apache Kafka and Spring Cloud Data Flow

Confluent

MAY 30, 2019

Following part 1 and part 2 of the Spring for Apache Kafka Deep Dive blog series, here in part 3 we will discuss another project from the Spring team: Spring Cloud Data Flow, which focuses on enabling developers to easily develop, deploy, and orchestrate event streaming pipelines based on Apache Kafka®. Command Line Shell.

Kafka

Kafka Cloud Data Pipeline PostgreSQL

Monitoring Data Replication in Multi-Datacenter Apache Kafka Deployments

Confluent

APRIL 10, 2019

Previously in 3 Ways to Prepare for Disaster Recovery in Multi-Datacenter Apache Kafka Deployments, we provided resources for multi-datacenter designs, centralized schema management, prevention of cyclic repetition of messages, and automatic consumer offset translation to automatically resume applications.

Kafka

Kafka Java Metadata Cloud

The Good and the Bad of Apache Kafka Streaming Platform

AltexSoft

OCTOBER 21, 2022

Kafka can continue the list of brand names that became generic terms for the entire type of technology. In this article, we’ll explain why businesses choose Kafka and what problems they face when using it. What is Kafka?

Kafka

Kafka Hadoop ETL Tools Java

How to Run Apache Kafka with Spring Boot on Pivotal Application Service (PAS)

Confluent

OCTOBER 7, 2019

This tutorial describes how to set up a sample Spring Boot application in Pivotal Application Service (PAS), which consumes and produces events to an Apache Kafka® cluster running in Pivotal Container Service (PKS). With this tutorial, you can set up your PAS and PKS configurations so that they work with Kafka. Methodology.

Kafka

Kafka Java Coding Accessible

Data News — Week 24.11

Christophe Blefari

MARCH 15, 2024

Obviously Benoit prefers Kestra, at the expense of writing YAML and running a Java application. Unlocking Kafka's potential: tackling tail latency with eBPF. New Apache Arrow engines: Arrow has become one of the most used libraries when it comes to building in-memory engines.

Metadata

Metadata Software Engineer Software Engineering Data Warehouse

50 PySpark Interview Questions and Answers For 2025

ProjectPro

JUNE 6, 2025

The distributed execution engine in the Spark core provides APIs in Java, Python, and Scala for constructing distributed ETL applications. For input streams receiving data over the network from sources such as Kafka, Flume, and others, the default persistence level replicates the data on two nodes for fault tolerance.

Hadoop

Hadoop Metadata Java Datasets

On Track with Apache Kafka – Building a Streaming ETL Solution with Rail Data

Confluent

OCTOBER 16, 2019

Using this data, Apache Kafka® and Confluent Platform can provide the foundations for both event-driven applications and an analytical platform. With tools like KSQL and Kafka Connect, the concept of streaming ETL is made accessible to a much wider audience of developers and data engineers. Ingesting the data.

Kafka

Kafka Building Data PostgreSQL

Dawn of DevOps: Managing Apache Kafka Clusters at Scale with Confluent Control Center

Confluent

MAY 2, 2019

When managing Apache Kafka® clusters at scale, tasks that are simple on small clusters turn into significant burdens. Relatedly, KIP-226 enabled dynamic broker reconfiguration since Apache Kafka 1.1. See the documentation (or, if you please, the Apache Kafka wiki) for a complete list of which parameters this applies to.

Kafka

Kafka Management Food Consulting

30+ Data Engineering Projects for Beginners in 2025

ProjectPro

JUNE 6, 2025

Use Kafka for real-time data ingestion, preprocess with Apache Spark, and store data in Snowflake. This architecture shows that simulated sensor data is ingested from MQTT to Kafka. The data in Kafka is analyzed with Spark Streaming API and stored in a column store called HBase.

Data Engineer

Data Engineer Data Engineering Project Engineering

Data Engineering Weekly #219

Data Engineering Weekly

MAY 4, 2025

[link] Uber: Fixrleak - Fixing Java Resource Leaks with GenAI. Another interesting article from Uber demonstrates how AI significantly accelerates reliability efforts. The blog highlights how emerging AI tools automate otherwise cognitively intensive manual tasks to bring reliability to software engineering.

Data Engineer

Data Engineer Data Engineering Engineering Java

Spring for Apache Kafka Deep Dive – Part 4: Continuous Delivery of Event Streaming Pipelines

Confluent

JUNE 11, 2019

Here in part 4 of the Spring for Apache Kafka Deep Dive blog series, we will cover: Common event streaming topology patterns supported in Spring Cloud Data Flow. Create and manage event streaming pipelines, including a Kafka Streams application using Spring Cloud Data Flow. java -jar spring-cloud-dataflow-shell-2.1.0.RELEASE.jar.

Kafka

Kafka Cloud Java MongoDB

How to Use Schema Registry and Avro in Spring Boot Applications

Confluent

SEPTEMBER 5, 2019

Following on from How to Work with Apache Kafka in Your Spring Boot Application, which shows how to get started with Spring Boot and Apache Kafka®, here I will demonstrate how to enable usage of Confluent Schema Registry and Avro serialization format in your Spring Boot applications. Initial revision. Prerequisites. Avro SerDes.

Kafka

Kafka Java Food Cloud

Dataflow Programming with Apache Flink and Apache Kafka

Confluent

SEPTEMBER 14, 2023

Learn how to use Apache Flink to build a Java pipeline that consumes clickstream data from Apache Kafka.

Kafka

Kafka Programming Java Building

Machine Learning with Python, Jupyter, KSQL and TensorFlow

Confluent

FEBRUARY 6, 2019

The blog posts How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka and Using Apache Kafka to Drive Cutting-Edge Machine Learning describe the benefits of leveraging the Apache Kafka® ecosystem as a central, scalable and mission-critical nervous system. For now, we’ll focus on Kafka.

Machine Learning

Machine Learning Python Kafka Java

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable

Data Engineering Podcast

OCTOBER 15, 2023

What was the process for adding full Java support in addition to SQL? What are the problems that customers are trying to solve when they come to Decodable? When you launched, your focus was on SQL transformations of streaming data.

Process

Process Building SQL BI
