Cloud and Kafka - Data Engineering Digest

Troubleshooting Kafka In Production

Data Engineering Podcast

DECEMBER 24, 2023

Summary: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Can you describe your experiences with Kafka? What are the operational challenges that you have had to overcome while working with Kafka?

Kafka

Kafka Data Lake High Quality Data SQL

Confluent + Immerok: Cloud Native Kafka Meets Cloud Native Flink

Confluent

JANUARY 6, 2023

Introducing fully managed Apache Kafka® + Flink for the most robust, cloud-native data streaming platform with stream processing, integration, and streaming analytics in one.

Kafka

Kafka Cloud Management Process

Kora: The Cloud Native Engine for Apache Kafka

Confluent

MAY 15, 2023

Take a tour of the internals of Confluent’s Apache Kafka® service, powered by Kora: the next-generation, cloud-native streaming engine.

Kafka

Kafka Cloud Engineering

Inside Agoda’s Private Cloud - Exclusive

The Pragmatic Engineer

JUNE 13, 2023

In a previous two-part series, we dived into Uber’s multi-year project to move onto the cloud, away from operating its own data centers. But there’s no “one size fits all” strategy when it comes to deciding the right balance between utilizing the cloud and operating your infrastructure on-premises.

Cloud

Cloud Database Utilities BI

Beyond Kafka: Conversation with Jark Wu on Fluss - Streaming Storage for Real-Time Analytics

Data Engineering Weekly

FEBRUARY 18, 2025

I spoke with Jark Wu, who leads the Fluss and Flink SQL team at Alibaba Cloud, to understand Fluss's origins and potential. Fluss addresses many of Kafka's challenges in analytical infrastructure. The combination of Kafka and Flink is not a perfect fit for real-time analytics; the integration of Kafka and Lakehouse is very shallow.

Kafka

Kafka Lambda Architecture SQL Architecture

Making Confluent Cloud 10x More Elastic Than Apache Kafka

Confluent

MAY 30, 2022

Kafka is horizontally scalable, but it's not enough. So we made Confluent Cloud 10x more elastic - 10x faster to scale up to GB/s or down to zero, easier to use, and cost-effective.

Kafka

Kafka Cloud IT

Amazon Kinesis is not Apache Kafka

Waitingforcode

MAY 4, 2023

Open source tools helped me a lot when switching to the cloud world. Managed cloud services often share the same fundamentals as their open source alternatives. However, there is always something different. Today I'll focus on these differences between the Amazon Kinesis service and the Apache Kafka ecosystem.

Kafka

Kafka Cloud Management

Stream Rows and Kafka Topics Directly into Snowflake with Snowpipe Streaming

Snowflake

MARCH 2, 2023

As part of this, we are also supporting Snowpipe Streaming as an ingestion method for our Snowflake Connector for Kafka. Now we are able to ingest our data in near real time directly from Kafka topics to a Snowflake table, drastically reducing the cost of ingestion and improving our SLA from 15 minutes to within 60 seconds.

Kafka

Kafka Data Ingestion Data Pipeline Cloud Storage

Announcing the Snowflake Sink Connector for Apache Kafka in Confluent Cloud

Confluent

JUNE 26, 2020

We are excited to announce the preview release of the fully managed Snowflake sink connector in Confluent Cloud, our fully managed event streaming service based on Apache Kafka®. Our managed […].

Kafka

Kafka Cloud Management

Confluent Cloud: Making an Apache Kafka Service 10x Better

Confluent

MAY 30, 2022

What we’ve done to evolve from cloud Kafka to Confluent Cloud, a data streaming platform that’s 10X better than Kafka in elasticity, storage, resiliency, and more.

Kafka

Kafka Cloud Data

Spring for Apache Kafka 101

Confluent

OCTOBER 19, 2021

Extensive out-of-the-box functionality, a large user community, and up-to-date, cloud-native features make Spring and its libraries a strong option for anchoring your Apache Kafka® and Confluent Cloud based microservices architecture. […].

Kafka

Kafka Architecture Cloud IT

How to make Apache Kafka clients go fast(er) on Confluent Cloud

Confluent

MARCH 9, 2022

Within a few minutes, you provision a fully managed Apache Kafka® cluster […]. Imagine your team wants to design a data streaming architecture and you’re in charge of creating the prototype.

Kafka

Kafka Cloud Architecture Designing

Apache Kafka as a Service with Confluent Cloud Now Available on Azure Marketplace

Confluent

FEBRUARY 18, 2020

Less than six months ago, we announced support for Microsoft Azure in Confluent Cloud, which allows developers using Azure as a public cloud to build event streaming applications with Apache […].

Cloud

Cloud Kafka Building Programming

Upgrade Journey: The Path from CDH to CDP Private Cloud

Cloudera

SEPTEMBER 28, 2020

Cloudera delivers an enterprise data cloud that enables companies to build end-to-end data pipelines for hybrid cloud, spanning edge devices to public or private cloud, with integrated security and governance underpinning it to protect customers' data. Support Kafka connectivity to HDFS, AWS S3 and Kafka Streams.

Cloud

Cloud Kafka Professional Services Metadata

How Michelin Cut Kafka Costs by 35% with Confluent Cloud

Confluent

MAY 23, 2023

Learn how Confluent Cloud helped Michelin streamline Apache Kafka® operations, reduce costs, and go to market 8-9 months faster.

Kafka

Kafka Cloud

Implementing mTLS and Securing Apache Kafka at Zendesk

Confluent

JANUARY 7, 2021

At Zendesk, Apache Kafka® is one of our foundational services for distributing events among different internal systems. We have pods, which can be thought of as isolated cloud environments where […].

Kafka

Kafka Cloud Systems

Creating a Serverless Environment for Testing Your Apache Kafka Applications

Confluent

SEPTEMBER 1, 2020

If you are taking your first steps with Apache Kafka®, looking at a test environment for your client application, or building a Kafka demo, there are two “easy button” paths […].

Kafka

Kafka Building Cloud

Tackling the Hidden and Unhidden Costs of Kafka

Confluent

MAY 9, 2023

Low utilization and operational complexity dramatically increase Kafka costs, so we reinvented Kafka as a cloud-native and complete service to reduce costs for thousands of businesses at any scale.

Kafka

Kafka Utilities Cloud

Confluent + WarpStream = Large-Scale Streaming in your Cloud

Confluent

SEPTEMBER 9, 2024

Confluent has acquired WarpStream, an innovative Kafka-compatible streaming solution. Read the full statement by Jay Kreps, co-founder and CEO of Confluent.

Cloud

Cloud Kafka

Cloudera acquires Eventador to accelerate Stream Processing in Public & Hybrid Clouds

Cloudera

OCTOBER 12, 2020

We are thrilled to announce that Cloudera has acquired Eventador, a provider of cloud-native services for enterprise-grade stream processing. The DataFlow platform has established a leading position in the data streaming market by unlocking the combined value and synergies of Apache NiFi, Apache Kafka and Apache Flink.

Cloud

Cloud Process Scala Kafka

Setting Up Kafka Multi-Tenancy

DoorDash Engineering

MARCH 27, 2024

At DoorDash, we rely on message queue systems based on Kafka to handle billions of real-time events. We will delve here into how we set up multi-tenancy with a messaging queue system based on Kafka. In Kafka, a test tenant processing production events can cause data inconsistencies, including outages and other incidents.

Kafka

Kafka Architecture Algorithm Process

Project Metamorphosis Month 3: Infinite Storage in Confluent Cloud for Apache Kafka

Confluent

JULY 1, 2020

This is the third month of Project Metamorphosis, where we discuss new features in Confluent’s offerings that bring together event streams and the best characteristics of modern cloud data systems. […].

Project

Project Cloud Kafka Systems

Project Metamorphosis Month 6: Secure Apache Kafka in Confluent Cloud

Confluent

OCTOBER 7, 2020

The cloud opens up exciting new opportunities for information gathering, analysis, and sharing that can make every organization’s products and services better. Thanks to the cloud and its decentralized nature, […].

Cloud

Cloud Project Kafka IT

Making Apache Kafka Serverless: Lessons From Confluent Cloud

Confluent

JULY 27, 2021

Serverless offerings in the cloud are a favorite among software engineers—a prime example are object stores such as AWS S3. For the system designer, however, it is an engineering challenge […].

Cloud

Cloud Kafka Software Engineering Software Engineer

Turbo-Charging Confluent Cloud To Be 10x Faster Than Apache Kafka®

Confluent

JANUARY 10, 2024

Confluent Cloud is now 10x faster than Apache Kafka. Read our latency benchmarking results, the innovations behind-the-scenes, and the lessons we learned.

Kafka

Kafka Cloud

Speed, Scale, Storage: Our Journey from Apache Kafka to Performance in Confluent Cloud

Confluent

JULY 28, 2021

At Confluent, we focus on the holy trinity of performance, price, and availability, with the goal of delivering a similar performance envelope for all workloads across all supported cloud providers. […].

Cloud

Cloud Kafka Architecture

How to Turn a REST API Into a Data Stream with Kafka and Flink

Confluent

JUNE 17, 2024

Improve REST API response data w/Kafka and Flink SQL in Confluent Cloud; Automatic connector retriability combats REST flakiness; Demo w/OpenSky data.

Kafka

Kafka SQL Cloud Data

Trigger AWS Lambda Functions Directly from a Confluent Cloud Apache Kafka Topic

Confluent

SEPTEMBER 27, 2021

The distributed architecture of Apache Kafka® can cause the operational burden of managing it to quickly become a limiting factor for adoption and developer agility. For this reason, it is […].

Kafka

Kafka AWS Cloud Architecture

Apache Kafka Vs Apache Spark: Know the Differences

Knowledge Hut

MAY 3, 2024

Spark Streaming vs. Kafka Streams: Now that we understand at a high level what these tools do, it is natural to be curious about the differences between them. In Spark Streaming, data received from live input data streams is divided into micro-batches for processing. Spark Streaming is a standalone framework.

Kafka

Kafka Scala Java Amazon Web Services

Simplifying Data Architecture and Security to Accelerate Value

Snowflake

NOVEMBER 11, 2024

With Snowpipe for Apache Kafka (public preview soon in AWS and Microsoft Azure), a “pull” mechanism, rather than the existing “push” connector, allows you to extract and ingest Apache Kafka events into your Snowflake account directly without hosting your own Kafka Connect cluster.

Data Architecture

Data Architecture Architecture Data Lake Kafka

Unpacking the Latest Streaming Announcements: A Comprehensive Analysis

Jesse Anderson

JUNE 12, 2024

We also discuss the various systems using Kafka’s protocol. Confluent has never shied away from saying Kafka is “easy,” and I disagree. During the Kafka Summit London Keynote, the speakers said “easy” 17 times; in the Kafka Summit Bangalore Keynote, they said it 18 times. Using Confluent Cloud?

Kafka

Kafka Data Lake Architecture Cloud

Apache Kafka Native MQTT at Scale with Confluent Cloud and Waterstream

Confluent

JULY 15, 2020

With billions of Internet of Things (IoT) devices, achieving real-time interoperability has become a major challenge. Together, Confluent, Waterstream, and MQTT are accelerating Industry 4.0 with new Industrial IoT (IIoT) […].

Kafka

Kafka Cloud Programming

Stream Processing with Python, Kafka & Faust

Towards Data Science

FEBRUARY 18, 2024

Although the Faust library aims to bring Kafka Streaming ideas into the Python ecosystem, it may pose challenges in terms of ease of use. Lastly, I share my experience implementing a similar pipeline on the Google Cloud Platform. A table is a distributed in-memory dictionary, backed by a Kafka changelog topic.

Kafka

Kafka Python Process Google Cloud

API-First Approach to Kafka Topic Creation

DoorDash Engineering

DECEMBER 5, 2023

DoorDash’s Engineering teams revamped Kafka Topic creation by replacing a Terraform/Atlantis based approach with an in-house API, Infra Service. DoorDash’s Real-Time Streaming Platform, or RTSP, team is under the Data Platform organization and manages over 2,500 Kafka Topics across five clusters. Built atop Prometheus.

Kafka

Kafka Programming Language Metadata Architecture

New Features in Cloudera Streams Messaging for CDP Public Cloud 7.2.14

Cloudera

MARCH 11, 2022

With the launch of CDP Public Cloud 7.2.14, Cloudera Streams Messaging for Data Hub deployments has gotten some powerful new features! In this release, the Streams Messaging templates in Data Hub will come with Apache Kafka 2.8. Kafka & Cruise Control Updates. Kafka Updates: Deployments with Kafka 2.5

Cloud

Cloud Kafka Utilities Database

Pipeline to the Cloud – Streaming On-Premises Data for Cloud Analytics

Confluent

JANUARY 8, 2020

This article shows how you can offload data from on-premises transactional (OLTP) databases to cloud-based datastores, including Snowflake and Amazon S3 with Athena. I’m also going to take the opportunity […].

Cloud

Cloud Database Data Kafka

Announcing the GA of Cloudera DataFlow for the Public Cloud on Microsoft Azure

Cloudera

FEBRUARY 10, 2022

After the launch of Cloudera DataFlow for the Public Cloud (CDF-PC) on AWS a few months ago, we are thrilled to announce that CDF-PC is now generally available on Microsoft Azure, allowing NiFi users on Azure to run their data flows in a cloud-native runtime. The need for a cloud-native Apache NiFi service on Microsoft Azure.

Cloud

Cloud Kafka AWS Data Ingestion

Learning with LiveStreams: Cloud-Native Apache Kafka and Serverless Stream Processing

Confluent

MARCH 19, 2021

LiveStreams is a YouTube show about Confluent, real-time data streaming, and related technologies that help you maximize data in motion on any cloud. Every episode of LiveStreams will teach you […].

Cloud

Cloud Process Kafka Technology

Protecting Data Integrity in Confluent Cloud: Over 8 Trillion Messages Per Day

Confluent

JULY 30, 2021

Last year, Confluent announced support for Infinite Storage, which fundamentally changes data retention in Apache Kafka® by allowing […]. It’s about maintaining the right data even when no one is watching.

Data Integration

Data Integration Cloud Kafka Data

What’s New in CDP Private Cloud Base 7.1.7?

Cloudera

AUGUST 10, 2021

With the release of CDP Private Cloud (PvC) Base 7.1.7, Atlas / Kafka integration provides metadata collection for Kafka producers/consumers so that consumers can manage, govern, and monitor Kafka metadata and metadata lineage in the Atlas UI. Deep Dive 2: Atlas / Kafka integration. Deep Dive 1: Impala Row filtering.

Cloud

Cloud Kafka Metadata SQL

Project Metamorphosis Part 1: Elastic Apache Kafka Clusters in Confluent Cloud

Confluent

MAY 6, 2020

A few weeks ago when we talked about our new fundraising, we also announced we’d be kicking off Project Metamorphosis. What is Project Metamorphosis? Let me try to explain. I […].

Project

Project Kafka Cloud

How Netflix Accurately Attributes eBPF Flow Logs

Netflix Tech

APRIL 8, 2025

In cloud environments, IP addresses are reassigned to different workloads as workload instances are created and terminated, so IP addresses alone cannot provide insights on which workloads are communicating. We implemented a broadcasting mechanism using Kafka, where each node publishes learned time ranges to all other nodes.

AWS

AWS Kafka Cloud Programming

How to Elastically Scale Apache Kafka Clusters on Confluent Cloud

Confluent

JUNE 7, 2022

How to elastically scale Kafka clusters from 0 to 100 MB/s and back with automatic cluster resizing, data rebalancing, real-time consumption optimization, and monitoring in seconds.

Kafka

Kafka Cloud Data

Announcing the GA of Cloudera DataFlow for the Public Cloud

Cloudera

AUGUST 16, 2021

Are you ready to turbo-charge your data flows on the cloud for maximum speed and efficiency? Deploy, manage and monitor your standard NiFi flows running on-premises or on CDP Data Hub into cloud-native flows running on Kubernetes clusters on AWS. NEW Cloudera DataFlow for the Public Cloud.

Cloud

Cloud AWS Kafka Utilities
