Sun. Dec 24, 2023

Troubleshooting Kafka In Production

Data Engineering Podcast

Summary: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. Elad Eldor has experienced these challenges first-hand, leading to his work writing the book "Kafka: Troubleshooting in Production". In this episode he highlights the sources of complexity that contribute to Kafka's operational difficulties, and some of the main ways to identify and mitigate […]

Kafka 245

SparkSQL is Destroying your Pipelines

Confessions of a Data Guy

It’s true, even if you don’t want it to be. SparkSQL is destroying your data pipelines and possibly wreaking havoc on your entire data team, infrastructure, and life. In your heart of hearts, you’ve probably known it for years. With great power comes great responsibility. We all know that even us Data Engineers are human […]

1.5 Years of Spark Knowledge in 8 Tips

Towards Data Science

My learnings from Databricks customer engagements. [Figure 1: a technical diagram of how to write Apache Spark. Image by author.] After working with ~15 of the largest retail organizations for the past 18 months, here are the Spark tips I commonly repeat. Throughout this post, we assume a general working knowledge of Spark and its structure, but this post should be accessible to all levels of Spark experience.

Scala 79

Data Engineering Weekly #154

Data Engineering Weekly

RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. Visit rudderstack.com to learn more.

Sanjeev Mohan: Unveiling the Crystal Ball: 2024 Data and AI Trends. Sanjeev & Rajesh, as usual, share their excellent observations about data & AI industry trends.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?