Sat. Dec 12, 2020 - Fri. Dec 18, 2020

Introducing the Confluent Parallel Message Processing Client

Confluent

Consuming messages in parallel is what Apache Kafka® is all about, so you may well wonder, why would we want anything else? It turns out that, in practice, there are […].

Process 144

Life of a Netflix Partner Engineer: The case of the extra 40 ms

Netflix Tech

By John Blair, Netflix Partner Engineering. The Netflix application runs on hundreds of smart TVs, streaming sticks and pay TV set top boxes. The role of a Partner Engineer at Netflix is to help device manufacturers launch the Netflix application on their devices. In this article we talk about one particularly difficult issue that blocked the launch of a device in Europe.

Bytes 142

Trending Sources

Is Data Engineering a must for Data Scientists?

Team Data Science

Organizations in industries such as banking, healthcare, and automobiles now acknowledge the value of data science in their operations. An effective data science team is therefore expected to manage a large volume of tasks. Even so, building a team that can successfully manage AI tasks is essential to tackling the data challenges organizations face.

Fostering inclusion with servant leadership

Cloudera

It is crucial for organizations to focus on supporting the new way of work, enhancing productivity, and improving cost efficiency to ensure business survival in the post-pandemic world. However, those that are overly focused on these short-term goals risk losing sight of what’s truly important. As shared in my previous post, diverse teams can help organizations unlock innovations that allow them to adapt to market changes quickly and drive business growth.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Announcing ksqlDB 0.14.0

Confluent

We’re pleased to announce ksqlDB 0.14, one of the most feature-packed releases of the year. This version includes expanded query support over materialized views, incremental schema alteration, variable substitution, additional […].

Process 126

Building A Self Service Data Platform For Alternative Data Analytics At YipitData

Data Engineering Podcast

Summary As a data engineer you’re familiar with the process of collecting data from databases, customer data platforms, APIs, etc. At YipitData they rely on a variety of alternative data sources to inform investment decisions by hedge funds and businesses. In this episode Andrew Gross, Bobby Muldoon, and Anup Segu describe the self service data platform that they have built to allow data analysts to own the end-to-end delivery of data projects and how that has allowed them to scale their o

More Trending

How does Apache Spark 3.0 increase the performance of your SQL workloads

Cloudera

Across nearly every sector working with complex data, Spark has quickly become the de facto distributed computing framework for teams across the data and analytics lifecycle. One of the most awaited features of Spark 3.0 is the new Adaptive Query Execution framework (AQE), which fixes issues that have plagued many Spark SQL workloads. Those issues were documented in early 2018 in this blog from a mixed Intel and Baidu team.

SQL 99

Spring Your Microservices into Production with Kubernetes and GitOps

Confluent

Microservice architectures continue to grow within engineering organizations as teams strive to increase development velocity. Microservices promote the idea of modularity as a first-class citizen in a distributed architecture, enabling […].

Netflix at MIT CODE 2020

Netflix Tech

By Martin Tingley. In November, Netflix was a proud sponsor of the 2020 Conference on Digital Experimentation (CODE), hosted by the MIT Initiative on the Digital Economy. As well as providing sponsorship, Netflix data scientists were active participants, with three contributions. Eskil Forsell and colleagues presented a poster describing success stories from a democratized experimentation platform.

Coding 64

Top Tech Predictions for 2021

Teradata

From COVID-19 to AI in industry, our Teradata experts offer their best predictions for the state of technology and business in 2021 and beyond. Read more.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Enabling The Full ML Lifecycle For Scaling AI Use Cases

Cloudera

When it comes to machine learning (ML) in the enterprise, there are many misconceptions about what it actually takes to effectively employ machine learning models and scale AI use cases. When many businesses start their journey into ML and AI, it’s common to place a lot of energy and focus on the coding and data science algorithms themselves. While it’s important to have the in-house data science expertise and the ML experts on-hand to build and test models, the reality is that the actual data s

4 Incredible ksqlDB Techniques (#2 Will Make You Cry)

Confluent

Building event streaming applications has never been simpler with ksqlDB. But what is it? ksqlDB is an event streaming database for building stream processing applications. Unlike Kafka Streams, ksqlDB programs […].

Kafka 106

Mythbusting the Analytics Journey

Netflix Tech

Part of our series on who works in Analytics at Netflix, and what the role entails. By Alex Diamond. This Q&A aims to mythbust some common misconceptions about succeeding in analytics at a big tech company. This isn’t your typical recruiting story. I wasn’t actively looking for a new job and Netflix was the only place I applied. I didn’t know anyone who worked there and just submitted my resume through the Jobs page.

Meet Magpie: The End-to-End Data Engineering Platform (VIDEO)

Silectis

If you’ve been following along with Silectis over the past couple of years, you are familiar with our data engineering platform, Magpie. You’re aware of the many outcomes it puts at the fingertips of data engineers, and teams of data practitioners more largely. If you’re new around here, not to worry. We can catch you up quickly. We are excited to share our brand new explainer video with you!

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Cloudera Replication Plugin enables x-platform replication for Apache HBase

Cloudera

The Cloudera Data Platform (CDP) is the latest Big Data offering from Cloudera. It includes Apache HBase and Phoenix as part of the platform. These two components are provided in 3 form-factors: For on-prem deployments, they are available in a manner similar to CDH & HDP (within the CDP Private Cloud offering). For customers that want to manage the database on their own in AWS & Azure, it is available as part of the CDP Public Cloud DataHub offering (with the Operational Database templa

AWS 80

Avoid Making the Same Mistake Twice

Teradata

Data & analytics only have real value when they are used to improve performance by reducing costs, increasing customer satisfaction or driving new growth. Read more.

Data 52

Build Internal Apps in Minutes with Retool and Rockset: A Customer 360 Example

Rockset

Rockset and Retool are teaming up to help you build internal apps in minutes. Rockset allows developers to turn complex analytics into data APIs simply, while Retool delivers the UI building blocks to quickly launch high-performance internal apps. Together, they empower developers to build performant internal tools, such as customer 360 and logistics monitoring apps, by solely using data APIs and pre-built UI components.

The Ultimate Guide to Apache Airflow DAGS

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: Understand the building blocks DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to Write DAGs that adapt to your data at runtime and set up alerts and notifications Scale you

Top 4 Reasons Why You Should Upgrade Your Stream Processing Workloads To CDP

Cloudera

If there’s one thing enterprises have learned in 2020, it’s how to navigate through uncertain times, and in 2021, organizations will likely have to continue navigating through a shifting landscape. One trend that we’ve seen this year is that enterprises are leveraging streaming data to get through unplanned disruptions and to make the best business decisions for their stakeholders.

Process 71

HBase Performance testing using YCSB

Cloudera

When running any performance benchmarking tool on your cluster, a critical decision is always what data set size to use for the test. Here we demonstrate why it is important to select a “good fit” data set size when running an HBase performance test on your cluster. Both the HBase cluster configuration and the size of the data set can change the performance of your workload and the test results on the same cluster.

Improving your Customer Centric Merchandising with Location based in-Store Merchandising

Cloudera

With any transformation in industry or marketplace, there are leaders and losers. The winners know the fundamental pillars that are hidden to some and evident to others that drive and enable success. In 2020, where connected consumers and the turmoil with the pandemic driven supply chains are driving more and more of retail’s response, at Cloudera we believe that the underlying foundation to retail’s success is based upon real-time and streaming data from retail’s edge – the retail s

Retail 59

The role of data in COVID-19 vaccination record keeping

Cloudera

Now that the Pfizer vaccine has been approved by the FDA for use in the US, and the Moderna vaccine likely isn’t far behind, we are on the verge of being able to emerge from the social distancing world that began earlier in 2020. Recent news has talked about distributing a vaccination record card to everyone who gets a COVID-19 vaccine.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Path to Profitability with More Agile Pricing

Teradata

Faced with slow data processes, tedious rate filing reviews and the need to gain consensus among stakeholders, R&D actuaries & data scientists require a new level of pricing agility to be competitive.

Bringing transaction support to Cloudera Operational Database

Cloudera

We’re excited to share that after adding ANSI SQL, secondary indices, star schema, and view capabilities to Cloudera’s Operational Database, we will be introducing distributed transaction support in the coming months. What is ACID? The ACID model of database design is one of the most important concepts in databases. ACID stands for atomicity, consistency, isolation, and durability.
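
To make the "atomicity" part of ACID concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not Cloudera Operational Database code; the `accounts` table, the `transfer` helper, and the sample balances are hypothetical and chosen only for illustration. The transfer credits one account and then debits another, and if the debit violates a constraint, the earlier credit is rolled back as well.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one all-or-nothing unit (hypothetical helper)."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit
        # made earlier in the same transaction has been undone too.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds and commits
failed = transfer(conn, "alice", "bob", 500)  # debit fails, whole transfer rolls back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

After the second, failing transfer the balances are the same as after the first one, which is the behavior atomicity promises: either every statement in the transaction takes effect or none does. A distributed database has to provide the same guarantee across multiple nodes, which this single-process sketch does not attempt to show.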