We know that streaming data is data that is emitted at high volume […] Handling and processing streaming data is some of the hardest work in data analysis. The post Kafka to MongoDB: Building a Streamlined Data Pipeline appeared first on Analytics Vidhya.
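To make that pattern concrete, here is a minimal Python sketch of a Kafka-to-MongoDB pipeline using the confluent-kafka and pymongo client libraries; the broker address, the "orders" topic, the connection URI, and the assumption of JSON-encoded messages are all placeholders, not details from the post.

```python
import json

from confluent_kafka import Consumer
from pymongo import MongoClient

# Hypothetical broker, topic, and MongoDB URI; adjust for your environment.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "mongo-pipeline-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])

collection = MongoClient("mongodb://localhost:27017")["demo"]["orders"]

try:
    while True:
        msg = consumer.poll(1.0)  # wait up to 1s for a message
        if msg is None:
            continue
        if msg.error():
            print(f"Kafka error: {msg.error()}")
            continue
        # Each Kafka record becomes one MongoDB document.
        collection.insert_one(json.loads(msg.value()))
finally:
    consumer.close()
```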
Together, MongoDB and Apache Kafka® make up the heart of many modern data architectures today. Integrating Kafka with external systems like MongoDB is best done through the use of Kafka Connect. The official MongoDB Connector for Apache Kafka is developed and supported by MongoDB engineers.
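As a sketch of what that looks like in practice, the snippet below registers the official MongoDB sink connector through the Kafka Connect REST API; the connector name, topic, database and collection names, and connection URI are placeholders, and the Connect worker is assumed to be listening on localhost:8083.

```python
import requests

# Placeholder names and URI; the connector class is the official
# com.mongodb.kafka.connect.MongoSinkConnector shipped by MongoDB.
connector = {
    "name": "mongo-sink-demo",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
        "topics": "orders",
        "connection.uri": "mongodb://localhost:27017",
        "database": "demo",
        "collection": "orders",
        "tasks.max": "1",
    },
}

resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
print(resp.json())  # Connect echoes back the created connector definition
```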
Since the MongoDB Atlas source and sink became available in Confluent Cloud, we’ve received many questions around how to set up these connectors in a secure environment. By default, MongoDB […].
As a distributed system for collecting, storing, and processing data at scale, Apache Kafka® comes with its own deployment complexities. To simplify all of this, different providers have emerged to offer Apache Kafka as a managed service, integrating with data systems (e.g., BigQuery, Amazon Redshift, and MongoDB Atlas) and caches.
I’m excited to announce that we’re partnering with Google Cloud to make Confluent Cloud, our fully managed offering of Apache Kafka®, available as a native offering on Google Cloud Platform (GCP). Confluent’s founders didn’t just write the original code of Apache Kafka; we also ran it as a service at massive scale.
MongoDB has grown from a basic JSON key-value store into one of the most popular NoSQL database solutions in use today. These qualities have led MongoDB to be widely adopted, especially alongside JavaScript web applications.
Change Data Capture (CDC) is an excellent way to introduce streaming analytics into your existing database, and using Debezium enables you to send your change data through Apache Kafka®. Although […].
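For a rough sense of what those change events look like on the consumer side, here is a minimal Python sketch that reads a Debezium topic and prints the operation type and new document state; the broker address is made up, and the topic name follows Debezium's <prefix>.<database>.<collection> convention but is hypothetical here.

```python
import json

from confluent_kafka import Consumer

# Hypothetical broker and Debezium topic name.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "cdc-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["dbserver1.inventory.customers"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    # Debezium wraps the change in a "payload" envelope when schemas are enabled.
    payload = event.get("payload", event)
    # "op" is c (create), u (update), d (delete), or r (snapshot read);
    # "after" holds the document state after the change.
    print(payload.get("op"), payload.get("after"))
```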
Over the past few years, MongoDB has become a popular choice for NoSQL databases. With the rise of modern data tools, real-time data processing is no longer a dream. Catering to real-time processing requirements, MongoDB introduced a powerful feature to track data […]
Use Kafka for real-time data ingestion, preprocess with Apache Spark, and store data in Snowflake. In this architecture, simulated sensor data is ingested from MQTT into Kafka; the data in Kafka is then analyzed with the Spark Streaming API and stored in HBase, a column store.
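A minimal PySpark sketch of the Kafka-to-Spark leg of such a pipeline might look like the following; the broker address, the "sensors" topic, and the two-field sensor schema are assumptions, and the console sink stands in for the Snowflake or HBase writer.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("sensor-pipeline").getOrCreate()

# Hypothetical schema for the JSON sensor payloads.
schema = StructType([
    StructField("sensor_id", StringType()),
    StructField("temperature", DoubleType()),
])

# Read the raw Kafka stream; the value arrives as bytes, so cast and parse it.
# (Requires the spark-sql-kafka package on the classpath.)
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")
       .option("subscribe", "sensors")
       .load())

parsed = (raw.select(from_json(col("value").cast("string"), schema).alias("m"))
          .select("m.*"))

# Console sink for demonstration; swap in a Snowflake/HBase writer in practice.
parsed.writeStream.format("console").start().awaitTermination()
```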
We are excited to announce the preview release of the fully managed MongoDB Atlas source and sink connectors in Confluent Cloud, our fully managed event streaming service based on Apache […].
Kafka has joined the list of brand names that have become generic terms for an entire class of technology. In this article, we’ll explain why businesses choose Kafka and what problems they face when using it. What is Kafka?
According to over 40,000 developers, MongoDB is the most popular NoSQL database in use right now. From a developer perspective, MongoDB is a great solution for supporting modern data applications. There are several ways to track and stream its data; this blog post will look at three of them: tailing the MongoDB oplog, using MongoDB change streams, and using a Kafka connector.
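Of those three approaches, change streams are the simplest to demonstrate. Below is a minimal pymongo sketch; the local replica-set URI and collection names are hypothetical, and note that change streams require a replica set or sharded cluster.

```python
from pymongo import MongoClient

# Hypothetical replica-set URI; change streams are unavailable on standalone servers.
client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
orders = client["demo"]["orders"]

# watch() yields change events; updateLookup also fetches the full document on updates.
with orders.watch(full_document="updateLookup") as stream:
    for change in stream:
        print(change["operationType"], change.get("fullDocument"))
```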
Here in part 4 of the Spring for Apache Kafka Deep Dive blog series, we will cover common event streaming topology patterns supported in Spring Cloud Data Flow, and how to create and manage event streaming pipelines, including a Kafka Streams application, using Spring Cloud Data Flow.
Most organisations maintain fleets, collections of vehicles put to use for day-to-day operations. Telcos use a variety of vehicles, including cars, vans, and trucks, for service, delivery, and maintenance. […].
In the course of implementing the Rockset connector to MongoDB, we did a fair amount of research on the MongoDB user experience, both online and through user interviews. Sharding was a recurring theme we heard when speaking with MongoDB users.
MongoDB.live took place last week, and Rockset had the opportunity to participate alongside members of the MongoDB community and share our work to make MongoDB data accessible via real-time external indexing. We would be responsible for building and maintaining pipelines from these sources to MongoDB.
The data sources available include:
- users (MongoDB): Core customer data such as name, age, gender, address.
- online_orders (MongoDB): Online purchase data including product details and delivery addresses.
- instore_orders (MongoDB): In-store purchase data, again including product details and store location.
SELECT users.id
With their new managed database service you can launch a production-ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high-throughput SSDs. Email hosts@dataengineeringpodcast.com with your story.
Data Engineering Tools: Data engineers need to be comfortable using essential tools for data pipeline management and workflow orchestration, including Apache Kafka, Apache Spark, Airflow, Dagster, dbt, and many more. Get familiar with data warehouses, data lakes, and data lakehouses, along with systems such as MongoDB, Cassandra, BigQuery, Redshift, and more.
Imagine you’ve got a stream of data; it’s not “big data,” but it’s certainly a lot. Within the data, you’ve got some bits you’re interested in, and of those bits, […].
In Data Science projects, we distinguish between descriptive analytics and statistical models running in production. Overall, these can be seen as one process. You start by analyzing historical data to […].
Connect any database to MongoDB using Confluent's cloud-native data streaming platform. Modernize any database, build streaming data pipelines, and empower real-time data in minutes.
In addition, to extract data from an eCommerce website, you need experts familiar with databases like MongoDB that store customer reviews. However, it is not straightforward to create data pipelines; you can use big-data processing tools like Apache Spark, Kafka, and more to build them.
Today, Confluent is announcing the general availability (GA) of the fully managed MongoDB Atlas Source and MongoDB Atlas Sink Connectors within Confluent Cloud. Now, with just a few simple clicks, […].
Contact Info: Ajay (@acoustik on Twitter, LinkedIn); Mike (LinkedIn, website, @michaelfreedman on Twitter); Timescale (website, documentation, careers, timescaledb on GitHub, @timescaledb on Twitter). Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?
How did the constraint of supporting the Kafka API influence your implementation strategy for transaction semantics? What are the elements of streaming systems that make atomic transactions a complex problem?
Step 2: Master Big Data Tools and Technologies. Familiarize yourself with the core big data technologies and frameworks, such as Hadoop, Apache Spark, and Apache Kafka. Apache Kafka is a distributed event streaming platform. Learning Scala is valuable when you focus on real-time data processing and analytics.
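To give a flavor of the event-streaming model, here is a minimal producer sketch in Python using the confluent-kafka client; the broker address, topic, key, and payload are all made up for illustration.

```python
from confluent_kafka import Producer

# Hypothetical local broker.
producer = Producer({"bootstrap.servers": "localhost:9092"})

def on_delivery(err, msg):
    # Called once the broker acknowledges (or rejects) the message.
    if err is not None:
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()}[{msg.partition()}]@{msg.offset()}")

# Events are appended to a topic; the key controls partition assignment.
producer.produce("events", key="user-42", value='{"action": "click"}',
                 callback=on_delivery)
producer.flush()  # block until outstanding messages are delivered
```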
Links: Timescale, PostgreSQL, Citus, Timescale design blog post, MIT, NYU, Stanford, SDN, Princeton, machine data, time-series data, list of time-series databases, NoSQL, online transaction processing (OLTP), object-relational mapper (ORM), Grafana, Tableau, Kafka, When Boring Is Awesome, PostgreSQL RDS, Google Cloud SQL, Azure DB, Docker, continuous aggregates, streaming replication (…)
To help other people find the show, please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat. Links: Molecula, Pilosa (podcast episode), The Social Dilemma, feature store, Cassandra, Elasticsearch (podcast episode), Druid, MongoDB, SwimOS (podcast episode), Kafka (…)
Prepare for Your Next Big Data Job Interview with Kafka Interview Questions and Answers 5. When should you utilize BigQuery in place of more established databases like MongoDB or MySQL? BigQuery SQL Interview Questions Here are some GCP BigQuery SQL interview questions to help you master your BigQuery skills.
Source Code: Build a Similar Image Finder. Top 3 Open-Source Big Data Tools: This section covers three leading open-source big data tools: Apache Spark, Apache Hadoop, and Apache Kafka. Additionally, you will learn how to integrate Spark with Kafka and MongoDB. This also boosts Kafka's resilience and helps it withstand server failures.
The intro and outro music is from The Hug by The Freak Fandango Orchestra (CC BY-SA). Summary: Modern applications frequently require access to real-time data, but building and maintaining the systems that make that possible is a complex and time-consuming endeavor.
Hevo ([link]): Are you sick of repetitive, time-consuming ELT work?
Rockset’s native connector for Amazon Managed Streaming for Apache Kafka (MSK) makes it simpler and faster to ingest streaming data for real-time analytics. Amazon MSK is a fully managed AWS service that gives users the ability to build and run applications using Apache Kafka.
In Part One, we discussed how to first identify slow queries on MongoDB using the database profiler, and then investigated the strategies the database used while executing those queries, to understand why they were taking the time and resources they were.
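As a quick illustration of that first step, the pymongo sketch below turns on the database profiler and lists the slowest recorded operations; the database name and the 100 ms threshold are assumptions.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical deployment
db = client["demo"]

# Level 1 records operations slower than slowms (level 2 records everything).
db.command("profile", 1, slowms=100)

# Profiled operations accumulate in the capped system.profile collection.
for op in db["system.profile"].find().sort("millis", -1).limit(5):
    print(op.get("op"), op.get("ns"), f"{op.get('millis')} ms")
```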
We’re introducing a new Rockset Integration for Apache Kafka that offers native support for Confluent Cloud and Apache Kafka, making it simpler and faster to ingest streaming data for real-time analytics. With the Kafka Integration, users no longer need to build, deploy or operate any infrastructure component on the Kafka side.
Over the years, we’ve seen wide adoption of Kafka Connect. The document points to best practices for anyone writing Kafka Connect connectors; in a nutshell, it states that sources and sinks are verified as Gold if they’re functionally equivalent to Kafka Connect connectors.
Prepare for Your Next Big Data Job Interview with Kafka Interview Questions and Answers 2. Is MongoDB better than PostgreSQL in terms of performance? It's difficult to determine whether MongoDB is significantly faster than PostgreSQL since database performance depends on numerous parameters.