Sat. May 09, 2020 - Fri. May 15, 2020


Change Data Capture Using Debezium Kafka and Pg

Start Data Engineering

Change data capture is a software design pattern used to capture changes to data and take a corresponding action based on that change. The change to data is usually an insert, update, or delete. The corresponding action usually occurs in another system in response to the change that was made in the source system.
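To make the pattern concrete, here is a minimal sketch of registering a Debezium Postgres source connector with Kafka Connect, which is one common way to implement CDC. The host, credentials, connector name, and table below are placeholders, and config keys vary slightly across Debezium versions.

    # Register a Debezium Postgres source connector with a Kafka Connect worker.
    # Assumes Kafka Connect on localhost:8083 with the Debezium Postgres plugin
    # installed; all names and credentials below are placeholders.
    import requests

    connector = {
        "name": "shop-db-cdc",  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "plugin.name": "pgoutput",               # Postgres logical decoding plugin
            "database.hostname": "localhost",
            "database.port": "5432",
            "database.user": "cdc_user",
            "database.password": "cdc_password",
            "database.dbname": "shop",
            "database.server.name": "shop",          # prefix for change-event topics
            "table.include.list": "public.orders",   # capture inserts/updates/deletes here
        },
    }

    resp = requests.post("http://localhost:8083/connectors", json=connector)
    resp.raise_for_status()
    print("created connector:", resp.json()["name"])

Each committed insert, update, or delete on the captured table then shows up as an event on a Kafka topic (here shop.public.orders), which downstream systems can consume and react to.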


Apache Kafka Needs No Keeper: Removing the Apache ZooKeeper Dependency

Confluent

Currently, Apache Kafka® uses Apache ZooKeeper™ to store its metadata. Data such as the location of partitions and the configuration of topics are stored outside of Kafka itself, in a […].
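To see what "stored outside of Kafka itself" means in practice, here is a small sketch that reads Kafka's metadata znodes straight out of ZooKeeper, assuming a pre-KIP-500 cluster with ZooKeeper on localhost:2181 and the kazoo client library:

    # Peek at Kafka metadata that lives in ZooKeeper rather than in Kafka itself.
    # Assumes a ZooKeeper-backed cluster with ZooKeeper on localhost:2181.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="localhost:2181")
    zk.start()

    print("broker ids:", zk.get_children("/brokers/ids"))        # registered brokers
    print("topics:", zk.get_children("/brokers/topics"))         # partition assignments live under these
    print("topic configs:", zk.get_children("/config/topics"))   # per-topic config overrides

    zk.stop()

The KIP-500 work described in the post moves this metadata into Kafka itself, managed by an internal quorum, so a separate ZooKeeper ensemble is no longer required.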

Trending Sources


Drafting Your Data Pipelines

Team Data Science

With careful consideration and learning about your market, the choices you need to make become narrower and clearer. I can now begin drafting my data ingestion/streaming pipeline without being overwhelmed. For a quick recap: you can find the first blog post here, where I learned which tech is most in demand in Toronto: [link]. The second blog post is here, where I learned which Toronto industries need data engineers the most: [link]. The Pipeline Proposal: I'll be creating several pipelines in […].


COVID-19: The Perfect Storm

Teradata

The COVID-19 pandemic has brought with it a Perfect Storm of disruption that impacts all of us, from our health to the economy to the supply chain. Read more.


A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You'll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; and distinguish between Airflow-related […].
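As a flavor of what a standardized, debuggable setup can look like, here is a hedged sketch (not taken from the guide) of a DAG whose tasks retry and report failures through a callback, assuming Airflow 2.4 or later; all names are illustrative:

    # A small Airflow DAG with retries and a failure callback, a common pattern
    # for making task errors easier to diagnose. Names here are illustrative.
    import logging
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    log = logging.getLogger(__name__)

    def notify_failure(context):
        # Called when a task fails; the context carries the task instance and exception.
        ti = context["task_instance"]
        log.error("Task %s failed on %s: %s", ti.task_id, context["ds"], context.get("exception"))

    def extract():
        log.info("extracting...")
        # raise ValueError("boom")  # uncomment to exercise the failure path

    with DAG(
        dag_id="debug_demo",
        start_date=datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
        default_args={
            "retries": 2,
            "retry_delay": timedelta(minutes=1),
            "on_failure_callback": notify_failure,
        },
    ) as dag:
        PythonOperator(task_id="extract", python_callable=extract)

A single task can also be exercised in isolation with "airflow tasks test debug_demo extract 2024-01-01", which runs it locally without involving the scheduler.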


StreamNative Brings Streaming Data To The Cloud Native Landscape With Pulsar

Data Engineering Podcast

Summary: There have been several generations of platforms for managing streaming data, each with their own strengths, weaknesses, and areas of focus. Pulsar is one of the more recent entrants and has quickly gained adoption and an impressive set of capabilities. In this episode, Sijie Guo discusses his motivations for spending so much of his time and energy on contributing to the project and growing the community.
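For orientation, a minimal produce-and-consume round trip against a local Pulsar broker might look like the following sketch, assuming the pulsar-client Python package and a broker on localhost:6650; the topic and subscription names are made up:

    # Minimal Pulsar produce/consume round trip. Assumes a broker on localhost:6650.
    import pulsar

    client = pulsar.Client("pulsar://localhost:6650")

    producer = client.create_producer("persistent://public/default/events")
    producer.send(b"hello pulsar")

    consumer = client.subscribe(
        "persistent://public/default/events",
        subscription_name="demo-sub",  # Pulsar tracks the consumer's position per named subscription
    )
    msg = consumer.receive()
    print("received:", msg.data())
    consumer.acknowledge(msg)

    client.close()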


Building a Telegram Bot Powered by Apache Kafka and ksqlDB

Confluent

Imagine you’ve got a stream of data; it’s not “big data,” but it’s certainly a lot. Within the data, you’ve got some bits you’re interested in, and of those bits, […].
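The "stream of data" side of that setup can be as simple as a producer writing JSON events to a Kafka topic, which ksqlDB then filters down to the bits the bot cares about. A rough sketch, assuming the kafka-python package and a broker on localhost:9092; the topic and fields are invented:

    # Write JSON events to a Kafka topic for ksqlDB to filter downstream.
    # Assumes a broker on localhost:9092; topic name and fields are invented.
    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    for reading in [{"sensor": "garage", "temp_c": 3.2}, {"sensor": "kitchen", "temp_c": 21.5}]:
        producer.send("sensor_readings", reading)

    producer.flush()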


More Trending


How China is Using Advanced Analytics During the COVID-19 Pandemic

Teradata

Learn how advanced analytics are being used in China amidst the COVID-19 pandemic to help combat the spread of coronavirus now and in the future.


Getting Started - Installing Apache Superset

Preset

Setting up Apache Superset for the first time can be difficult. Here's an easy way to accomplish the task!
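The gist of a first-time setup from PyPI, written as a runnable script; the exact commands and flags vary by Superset version, so treat this as a sketch rather than the article's recipe:

    # Sketch of a first-time Apache Superset install from PyPI.
    # CLI steps mirror the standard pip-based setup; flags vary by version.
    import subprocess

    steps = [
        ["pip", "install", "apache-superset"],
        ["superset", "db", "upgrade"],                        # initialize the metadata database
        ["superset", "fab", "create-admin",                   # create the first admin user
         "--username", "admin", "--firstname", "Admin", "--lastname", "User",
         "--email", "admin@example.com", "--password", "admin"],
        ["superset", "init"],                                 # load default roles and permissions
        ["superset", "run", "-p", "8088", "--with-threads"],  # start the development server
    ]

    for cmd in steps:
        subprocess.run(cmd, check=True)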


From Eager to Smarter in Apache Kafka Consumer Rebalances

Confluent

Everyone wants their infrastructure to be highly available, and ksqlDB is no different. But crucial properties like high availability don’t come without a thoughtful, rigorous design. We thought hard about […].
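The rebalance behavior in the title is observable from any Kafka consumer. Here is a hedged sketch with the confluent-kafka package showing the assign/revoke callbacks that fire during a rebalance; the group, topic, and broker address are made up:

    # Observe partition assignment changes during consumer-group rebalances.
    # Assumes the confluent-kafka package and a broker on localhost:9092.
    from confluent_kafka import Consumer

    def on_assign(consumer, partitions):
        print("assigned:", [p.partition for p in partitions])

    def on_revoke(consumer, partitions):
        # Under the eager protocol every partition is revoked before reassignment;
        # cooperative ("smarter") rebalancing only moves partitions that change owners.
        print("revoked:", [p.partition for p in partitions])

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "rebalance-demo",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["sensor_readings"], on_assign=on_assign, on_revoke=on_revoke)

    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is not None and msg.error() is None:
                print(msg.value())
    finally:
        consumer.close()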


How Akka Typed Incentivizes Writing Good Code

Rock the JVM

Explore how Akka Typed integrates good practices directly into the API.


Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.


Emulate Your Heroes with Data… and Vantage on AWS

Teradata

Teradata's top-ranked analytic software capabilities, which market-leading companies have been using for years, are available right now on Amazon Web Services (AWS). Learn more.


Announcing ksqlDB 0.9.0

Confluent

We’re pleased to announce the release of ksqlDB 0.9.0! This version includes support for multi-join statements, enhanced LIKE expressions, and a host of usability improvements. We’ll go through a few […].
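To give a feel for the two headline features, here is a hedged sketch that submits a multi-join statement with a LIKE filter to ksqlDB's REST endpoint; the streams, columns, and cluster layout are invented, and it assumes ksqlDB 0.9+ on localhost:8088:

    # Submit a ksqlDB statement exercising a multi-way join and a LIKE filter.
    # Assumes ksqlDB 0.9+ on localhost:8088; stream/table names and columns are invented.
    import requests

    statement = """
        CREATE STREAM enriched_orders AS
          SELECT o.id, c.name, ca.carrier
          FROM orders o
            JOIN customers c ON o.customer_id = c.id
            JOIN carriers ca ON o.carrier_id = ca.id
          WHERE c.name LIKE 'A%'
          EMIT CHANGES;
    """

    resp = requests.post(
        "http://localhost:8088/ksql",
        headers={"Accept": "application/vnd.ksql.v1+json"},
        json={"ksql": statement, "streamsProperties": {}},
    )
    resp.raise_for_status()
    print(resp.json())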
