Sat. Oct 17, 2020 - Fri. Oct 23, 2020

Intrusion Detection with ksqlDB

Confluent

Apache Kafka® is a distributed real-time processing platform that allows for the ingestion of huge volumes of data. ksqlDB is part of the Kafka ecosystem and offers a SQL-like language […].

Kafka 143

#ClouderaLife Spotlight: Otho Lyon, Director of Support, Public Sector 

Cloudera

There are some people who will tell you, “you can do anything you want to do in life”, and then there are some people who can show you that’s true. Otho Lyon – Director of Support, Public Sector – is one of those people. For over four years, Otho has been Cloudera’s Director of Support – Public Sector. He leads the charge in the support strategy and execution of our most sensitive and secure government agency clients.

Better Data Quality Through Observability With Monte Carlo

Data Engineering Podcast

Summary: In order for analytics and machine learning projects to be useful, they require a high degree of data quality. To ensure that your pipelines are healthy, you need a way to make them observable. In this episode Barr Moses and Lior Gavish, co-founders of Monte Carlo, share the leading causes of what they refer to as data downtime and how it manifests.

Building Netflix’s Distributed Tracing Infrastructure

Netflix Tech

By Maulik Pandey. Our team: Kevin Lew, Narayanan Arunachalam, Elizabeth Carretto, Dustin Haffner, Andrei Ushakov, Seth Katz, Greg Burrell, Ram Vaithilingam, Mike Smith and Maulik Pandey. “@Netflixhelps Why doesn’t Tiger King play on my phone?” (a Netflix member via Twitter). This is an example of a question our on-call engineers need to answer to help resolve a member issue […].

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Lessons Learned from Evolving a Risk Management Platform to Event Streaming

Confluent

Every organization that exposes its services online is subject to the interest of malicious actors. The ongoing struggle with botnets, crawlers, script kiddies, and bounty hunters is challenging and requires […].

New Multithreading Model for Apache Impala

Cloudera

Introduction. Today we are introducing a new series of blog posts that will take a look at recent enhancements to Apache Impala. Many of these are performance improvements, such as the feature described below which will give anywhere from a 2x to 7x performance improvement by taking better advantage of all the CPU cores. In addition, a lot of work has also been put into ensuring that Impala runs optimally in decoupled compute scenarios, where the data lives in object storage or remote HDFS.

Utilities 105

More Trending

Modern Architecture and Analytics Need Each Other To Succeed

Teradata

Two major forces impacting IT are on a collision course: the desire for a “modern data architecture” and businesses' need for data analytics. Find out more.

Bounding ksqlDB Memory Usage

Confluent

It’s important to be able to limit memory consumption of a running service. Use too much, and your application will crash and need to restart. If this happens too often, […].

Process 109

Re-thinking The Insurance Industry In Real-Time To Cope With Pandemic-scale Disruption

Cloudera

The insurance industry is in uncharted waters and COVID-19 has taken us where no algorithm has gone before. Today’s models, norms, and averages are being re-written on the fly, with insurers forced to cope with the inevitable conflict between old standards and the new normal. Take the restaurants of New York City suing their insurers for hundreds of millions of dollars.

Building a More Diverse and Inclusive Cloudera: Our First Steps

Cloudera

Cloudera recently appointed its first-ever Chief Diversity Officer, Sarah Shin. To maintain transparency about our diversity and inclusion (D&I) progress, we’ve asked her to share her vision for Cloudera’s future, and the initial steps she’s taking to help us get there. Q1: What drew you to work in the D&I space? As a biracial woman, diversity and inclusion was a large part of my life before I even knew what it meant.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

HBase Clusters Data Synchronization with HashTable/SyncTable tool

Cloudera

Replication (covered in this previous blog article) has been released for a while and is among the most used features of Apache HBase. Having clusters replicating data with different peers is a very common deployment, whether as a DR strategy or simply as a seamless way of replicating data between production/staging/development environments. Although it is an efficient way of keeping different HBase databases in sync within a sub-second latency, replication only operates over data ingested aft

Bytes 65

Modern Architecture…Why is This so Difficult?

Teradata

There is a lot of debate about what Modern Architecture means & what it actually comprises. However, the two terms that come up the most often are: simplicity & flexibility.

Demystifying the Business Continuity Space: Part 2

Teradata

In part 2 of this series, we will cover techniques for defining & gathering availability requirements and making sure your business continuity solution provides the most potential ROI.


Exit Here? The Big Banks' Battle for Survival

Teradata

Marketing is essential. It drives revenue and is a big investment for banks. But marketers cannot grow the size of the pie – only their slice of it. Read more.

Banking 52

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.