Sat. Jun 29, 2019 - Fri. Jul 05, 2019

KSQL in Football: FIFA Women’s World Cup Data Analysis

Confluent

One of the football (as per European terminology) highlights of the summer is the FIFA Women’s World Cup. France, Brazil, and the USA are the favourites, and this year Italy is present at the event for the first time in 20 years. From a data perspective, the World Cup represents an interesting source of information. There’s a lot of dedicated press coverage, as well as the standard social media excitement following any kind of big event.

Stress Testing Kafka And Cassandra For Real-Time Anomaly Detection

Data Engineering Podcast

Summary: Anomaly detection is a capability that is useful in a variety of problem domains, including finance, internet of things, and systems monitoring. Scaling the volume of events that can be processed in real time can be challenging, so Paul Brebner from Instaclustr set out to see how far he could push Kafka and Cassandra for this use case. In this interview he explains the system design that he tested, his findings for how these tools were able to work together, and how they behaved at diffe

Why Vantage Is Our Most Popular Release Ever

Teradata

Teradata Vantage is busting through analytic silos and raising the bar. Find out what drove these innovations and led to Vantage becoming our most popular release yet.

Bringing Rich Experiences to Memory-constrained TV Devices

Netflix Tech

By Jason Munning, Archana Kumar, Kris Range. Netflix has over 148M paid members streaming on more than half a billion devices spanning over 1,900 different types. In the TV space alone, there are hundreds of device types that run the Netflix app. We need to support the same rich Netflix experience on not only high-end devices like the PS4 but also memory- and processor-constrained consumer electronic devices that run a similar chipset as w

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Kafka Listeners – Explained

Confluent

This question comes up a lot on Stack Overflow and similar places, so here’s something to try and help. tl;dr: You need to set advertised.listeners (or KAFKA_ADVERTISED_LISTENERS if you’re using Docker images) to the external address (host/IP) so that clients can correctly connect to it. Otherwise, they’ll try to connect to the internal host address, and if that’s not reachable, problems ensue.
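As a rough illustration of the tl;dr above, the sketch below parses an advertised.listeners value and shows which address a client would be handed back, depending on the listener it connected through. The listener names (INTERNAL, EXTERNAL), hostnames, and ports are hypothetical examples rather than values from the article, and the helper functions are illustrative only, not part of any Kafka client library.

```python
def parse_listeners(value: str) -> dict[str, tuple[str, int]]:
    """Parse a 'NAME://host:port,NAME2://host2:port2' string into {NAME: (host, port)}.

    Simplified sketch: it does not handle bracketed IPv6 hosts.
    """
    result = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        result[name] = (host, int(port))
    return result


# Hypothetical value for advertised.listeners (or KAFKA_ADVERTISED_LISTENERS in Docker).
# INTERNAL is only resolvable inside the broker's own network; EXTERNAL is the
# host/IP that outside clients can actually reach.
ADVERTISED_LISTENERS = "INTERNAL://kafka:29092,EXTERNAL://broker.example.com:9092"


def address_for_client(advertised: str, listener_name: str) -> tuple[str, int]:
    """Return the address the broker hands back to a client on the given listener."""
    return parse_listeners(advertised)[listener_name]


if __name__ == "__main__":
    # An outside client must be given the external address; if it were handed
    # ('kafka', 29092) instead, the follow-up connection would fail.
    print(address_for_client(ADVERTISED_LISTENERS, "EXTERNAL"))
```

If only the internal address were advertised, an external client could reach the bootstrap server but would then be redirected to a host it cannot resolve, which is the failure the teaser describes.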

Has the Data Engineer replaced the Business Intelligence Developer?

Advancing Analytics: Data Engineering

It seems these days that every person I talk to is either a scientist, an engineer, or an architect. We’re fairly obsessed with aligning our technical roles to respected professions that denote the amount of education & training that goes into them, and that’s fair given how much time & effort goes into attaining these roles… but it really doesn’t help us define them.