2020

article thumbnail

Change Data Capture Using Debezium Kafka and Pg

Start Data Engineering

Change data capture is a software design pattern used to capture changes to data and take corresponding action based on that change. The change to data is usually one of read, update or delete. The corresponding action usually is supposed to occur in another system in response to the change that was made in the source system.

Kafka 246
article thumbnail

Top 10 Technology Trends for 2020

KDnuggets

With integrations of multiple emerging technologies just in the past year, AI development continues at a fast pace. Following the blueprint of science and technology advancements in 2019, we predict 10 trends we expect to see in 2020 and beyond.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

12 Days of Apache Kafka

Confluent

Before you say it: Yes, we are right now three days past Christmas, but technically the 12 days of Christmas refer to the days between Christmas and Epiphany, which is—I […].

Kafka 145
article thumbnail

Introducing Domain-Oriented Microservice Architecture

Uber Engineering

Introduction. Recently there has been substantial discussion around the downsides of service oriented architectures and microservice architectures in particular. While only a few years ago, many people readily adopted microservice architectures due to the numerous benefits they provide such as … The post Introducing Domain-Oriented Microservice Architecture appeared first on Uber Engineering Blog.

article thumbnail

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

article thumbnail

How Netflix Scales its API with GraphQL Federation (Part 1)

Netflix Tech

Netflix is known for its loosely coupled and highly scalable microservice architecture. Independent services allow for evolving at different paces and scaling independently. Yet they add complexity for use cases that span multiple services. Rather than exposing 100s of microservices to UI developers, Netflix offers a unified API aggregation layer at the edge.

IT 144
article thumbnail

Keeping Small Queries Fast – Short query optimizations in Apache Impala

Cloudera

This is part of our series of blog posts on recent enhancements to Impala. The entire collection is available here. Apache Impala is synonymous with high-performance processing of extremely large datasets, but what if our data isn’t huge? What if our queries are very selective? The reality is that data warehousing contains a large variety of queries both small and large; there are many circumstances where Impala queries small amounts of data; when end users are iterating on a use case, filterin

Metadata 143

More Trending

article thumbnail

Top 5 must-have Data Science skills for 2020

KDnuggets

The standard job description for a Data Scientist has long highlighted skills in R, Python, SQL, and Machine Learning. With the field evolving, these core competencies are no longer enough to stay competitive in the job market.

article thumbnail

20+ Machine Learning Datasets & Project Ideas

KDnuggets

Upgrading your machine learning, AI, and Data Science skills requires practice. To practice, you need to develop models with a large amount of data. Finding good datasets to work with can be challenging, so this article discusses more than 20 great datasets along with machine learning project ideas for you to tackle today.

article thumbnail

A Comprehensive Guide to Natural Language Generation

KDnuggets

Follow this overview of Natural Language Generation covering its applications in theory and practice. The evolution of NLG architecture is also described from simple gap-filling to dynamic document creation along with a summary of the most popular NLG models.

article thumbnail

Top 5 Things Every Kafka Developer Should Know

Confluent

Apache Kafka® is an event streaming platform used by more than 30% of the Fortune 500 today. There are numerous features of Kafka that make it the de-facto standard for […].

Kafka 145
article thumbnail

Prepare Now: 2025s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

article thumbnail

The Book to Start You on Machine Learning

KDnuggets

This book is thought for beginners in Machine Learning, that are looking for a practical approach to learning by building projects and studying the different Machine Learning algorithms within a specific context.

article thumbnail

7 Resources to Becoming a Data Engineer

KDnuggets

An estimated 8,650% growth of the volume of Data to 175 zetabytes from 2010 to 2025 has created an enormous need for Data Engineers to build an organization's big data platform to be fast, efficient and scalable.

article thumbnail

Introducing the Confluent Parallel Message Processing Client

Confluent

Consuming messages in parallel is what Apache Kafka® is all about, so you may well wonder, why would we want anything else? It turns out that, in practice, there are […].

Process 144
article thumbnail

Benchmarking Apache Kafka, Apache Pulsar, and RabbitMQ: Which is the fastest?

Confluent

Apache Kafka® is one of the most popular event streaming systems. There are many ways to compare systems in this space, but one thing everyone cares about is performance. Kafka […].

Kafka 145
article thumbnail

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

article thumbnail

How Real-Time Stream Processing Works with ksqlDB, Animated

Confluent

ksqlDB, the event streaming database, is becoming one of the most popular ways to work with Apache Kafka®. Every day, we answer many questions about the project, but here’s a […].

Process 145
article thumbnail

I wanna be a data scientist, but… how?

KDnuggets

It’s easy to say "I wanna be a data scientist," but. where do you start? How much time is needed to be desired by companies? Do you need a Master’s degree? Do you need to know every mathematical concept ever derived? The journey might be long, but follow this plan to help you keep moving forward toward your career goal.

Data 150
article thumbnail

Apache Kafka Needs No Keeper: Removing the Apache ZooKeeper Dependency

Confluent

Currently, Apache Kafka® uses Apache ZooKeeper™ to store its metadata. Data such as the location of partitions and the configuration of topics are stored outside of Kafka itself, in a […].

Kafka 145
article thumbnail

Preventing Fraud and Fighting Account Takeovers with Kafka Streams

Confluent

Many companies have recently started to take cybersecurity and data protection even more seriously, particularly driven by the recent General Data Protection Regulation (GDPR) legislation. They are increasing their investment […].

Kafka 145
article thumbnail

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

article thumbnail

Designing Edge Gateway, Uber’s API Lifecycle Management Platform

Uber Engineering

The making of Edge Gateway, the highly-available and scalable self-serve gateway to configure, manage, and monitor APIs of every business domain at Uber. Evolution of Uber’s API gateway. In October 2014, Uber had started its journey of scale in what … The post Designing Edge Gateway, Uber’s API Lifecycle Management Platform appeared first on Uber Engineering Blog.

Designing 144
article thumbnail

99th Percentile Latency at Scale with Apache Kafka

Confluent

Fraud detection, payment systems, and stock trading platforms are only a few of many Apache Kafka® use cases that require both fast and predictable delivery of data. For example, detecting […].

Kafka 145
article thumbnail

Apache Kafka as a Service with Confluent Cloud Now Available on Azure Marketplace

Confluent

Less than six months ago, we announced support for Microsoft Azure in Confluent Cloud, which allows developers using Azure as a public cloud to build event streaming applications with Apache […].

Cloud 145
article thumbnail

Why We Leverage Multi-tenancy in Uber’s Microservice Architecture

Uber Engineering

The performance of Uber’s services relies on our ability to quickly and stably launch new features on our platform , regardless of where the corresponding service lives in our tech stack. Foundational to our platform’s power is its microservice-based architecture … The post Why We Leverage Multi-tenancy in Uber’s Microservice Architecture appeared first on Uber Engineering Blog.

article thumbnail

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

article thumbnail

What’s New in Apache Kafka 2.5

Confluent

On behalf of the Apache Kafka® community, it is my pleasure to announce the release of Apache Kafka 2.5.0. The community has created another exciting release. We are making progress […].

Kafka 144
article thumbnail

Apache Kafka DevOps with Kubernetes and GitOps

Confluent

Operating critical Apache Kafka® event streaming applications in production requires sound automation and engineering practices. Streaming applications are often at the center of your transaction processing and data systems, requiring […].

Kafka 143
article thumbnail

Intrusion Detection with ksqlDB

Confluent

Apache Kafka® is a distributed real-time processing platform that allows for the ingestion of huge volumes of data. ksqlDB is part of the Kafka ecosystem and offers a SQL-like language […].

Kafka 143
article thumbnail

Introducing Confluent Platform 6.0

Confluent

Each month, we’ve announced a set of Confluent features organized around what we think are the key foundational traits of cloud-native data systems as part of Project Metamorphosis. Data systems […].

Project 143
article thumbnail

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.

article thumbnail

Life of a Netflix Partner Engineer?—?The case of extra 40 ms

Netflix Tech

Life of a Netflix Partner Engineer?—?The case of the extra 40 ms By: John Blair , Netflix Partner Engineering The Netflix application runs on hundreds of smart TVs, streaming sticks and pay TV set top boxes. The role of a Partner Engineer at Netflix is to help device manufacturers launch the Netflix application on their devices. In this article we talk about one particularly difficult issue that blocked the launch of a device in Europe.

Bytes 142
article thumbnail

Coronavirus Data and Poll Analysis – yes, there is hope, if we act now

KDnuggets

We examine the growth of coronavirus daily cases in most affected countries, and show evidence that social distancing works in reducing the rate of spread. We also analyze KDnuggets Poll results - the scale of change to online and how Data Science work is likely to increase or drop in different regions. Stay Healthy and practice social distancing!

article thumbnail

Transactional Machine Learning at Scale with MAADS-VIPER and Apache Kafka

Confluent

This blog post shows how transactional machine learning (TML) integrates data streams with automated machine learning (AutoML), using Apache Kafka® as the data backbone, to create a frictionless machine learning […].

article thumbnail

Confluent Raises $250M and Kicks Off Project Metamorphosis

Confluent

Confluent Raises $250M and Kicks Off Project Metamorphosis It’s an exciting day for Confluent, in the middle of a very unusual and difficult time in the larger world. Nonetheless, I […].

Project 142
article thumbnail

What Is Entity Resolution? How It Works & Why It Matters

Entity Resolution Sometimes referred to as data matching or fuzzy matching, entity resolution, is critical for data quality, analytics, graph visualization and AI. Learn what entity resolution is, why it matters, how it works and its benefits. Advanced entity resolution using AI is crucial because it efficiently and easily solves many of today’s data quality and analytics problems.