by Aditya Mavlankar, Liwei Guo, Anush Moorthy and Anne Aaron. Netflix has an ever-expanding collection of titles that customers can enjoy in 4K resolution with a suitable device and subscription plan. Netflix creates premium bitstreams for those titles in addition to the catalog-wide 8-bit stream profiles¹. Premium features comprise a title-dependent combination of 10-bit bit-depth, 4K resolution, high frame rate (HFR) and high dynamic range (HDR), and pave the way for an extraordinary viewing…
There is a huge number of tools and platforms for data engineers, and it is this enormous selection that makes it difficult for newcomers to pick out the tools that really matter. During my Data Engineer Coaching sessions I gained valuable experience in this regard, and today I'd like to share the most important tools on that basis. Throughout the coaching, a handful of tools kept coming up again and again: Kafka, Spark and AWS.
This post presents a simulation framework that leverages several mathematical models to simulate the spread of diseases such as COVID-19 in urban environments.
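As a minimal sketch of one such mathematical model, here is the classic SIR compartmental model integrated with a simple Euler step. The framework in the post is more elaborate; the parameter values and function name below are illustrative assumptions, not taken from the post.

```python
# Minimal SIR (Susceptible-Infected-Recovered) sketch with Euler integration.
# beta (transmission rate) and gamma (recovery rate) are illustrative values.

def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    n = s0 + i0 + r0  # total population stays constant
    s, i, r = float(s0), float(i0), float(r0)
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return s, i, r

# Example: a city of 1,000,000 people with 100 initial infections.
s, i, r = simulate_sir(999_900, 100, 0, beta=0.3, gamma=0.1, days=160)
```

Urban simulation frameworks like the one described typically layer mobility and contact-network structure on top of compartmental dynamics such as these.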
In part 2 of the series focusing on the impact of evolving technology on the telecom industry, we sat down with Vijay Raja, Director of Industry & Solutions Marketing at Cloudera, to get his views on how the sector is changing and where it goes next. Hi Vijay, thank you so much for joining us again. To continue where we left off, as industry players continue to shift toward a more 5G-centric network, how is 5G impacting the industry from a data perspective?
In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide, with best practices and examples, to debugging Airflow DAGs. You'll learn how to:
- Create a standardized debugging process to quickly diagnose errors in your DAGs
- Identify common issues with DAGs, tasks, and connections
- Distinguish between Airflow-related…
Whether you are a developer working on a cool new real-time application or an architect formulating the plan to reap the benefits of event streaming for the organisation, the subject […].
Summary In order to scale the use of data across an organization there are a number of challenges related to discovery, governance, and integration that need to be solved. The key to those solutions is a robust and flexible metadata management system. LinkedIn has gone through several iterations on the most maintainable and scalable approach to metadata, leading them to their current work on DataHub.
No operator ever made, or ever will make, a single cent or penny from purely digitizing and then storing data – they need to do something with it! Find out how.
There’s no doubt that cloud has become ubiquitous, and thank goodness for that in 2020. We wouldn’t have survived the challenges of this year without cloud. It’s supported everything, from the sudden changes in the way we work to the way we access healthcare and even shop for vital goods. While cloud is the vehicle, it’s what sits on it that makes it so valuable — data.
If you know me, you know two things: first, that I am committed to remote work as an effective way to build a company; I’ve been a remote employee for […].
Explore the intriguing world of eta-expansion: discover how methods and functions interact in Scala, revealing insights that can elevate your coding game.
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
On August 18, we completed our Enterprise Data Cloud vision of bringing a truly hybrid cloud experience with the general availability of Cloudera Data Platform Private Cloud (CDP Private Cloud). CDP Private Cloud, which is based on Kubernetes (Red Hat OpenShift), extends cloud-native speed, simplicity and economics for the connected data lifecycle to the on-prem world, enabling IT to respond to business needs faster and deliver rock-solid service levels so people can be more productive with data.
This is the fifth month of Project Metamorphosis: an initiative that addresses the manual toil of running Apache Kafka® by bringing the best characteristics of modern cloud-native data systems to […].
Data is the new battleground for banks; yet for all the talk about digitalization, most banks still do not have a coherent enterprise-wide strategy for data.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
Live data-streaming offers businesses exciting new opportunities to transform the way they operate, leveraging real-time insights to drive better decision making and enhance operational efficiency. To find out more about how live-streaming data might impact the financial services sector, I sat down for a chat with Dinesh Chandrasekhar, Head of Product Marketing in Cloudera’s data-in-motion Business Unit.
Here is what happened on day one of the event—spoiler alert: My first Summit was awesome. This year’s Kafka Summit is my first and I’ve been lucky to have a […].
The effects of climate change and inequality are threatening societies across the world, but there is still an annual funding gap of US$2.5 trillion to achieve the UN Sustainable Development Goals by 2030. A substantial amount of that money is expected to come from private sources like pension funds, but institutional investors often struggle to efficiently incorporate sustainability into their investment decisions.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-driven…
In the last blog with Deloitte's Marc Beierschoder, we talked about what the hybrid cloud is, why it can benefit a business, and what the key blockers often are in implementation. You can read it here. Today we are continuing our discussion with Martin Mannion, EMEA Big Data Community lead at Deloitte, and Paul Mackay, the EMEA Cloud Lead at Cloudera, to look at why security and governance requirements must be tackled in the early stages of data-led use case development, thereby mitigating more…
In the process of integrating Grouparoo with Zendesk, I searched the documentation for the right way to format tags, but was unable to find it. I thought I'd write up a guide to help others on the same journey. In case you are "that person" and just want the answer, here it is: tags need to be lowercase and must not contain any spaces. Underscores are fine.
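Based on that rule (lowercase, no spaces, underscores allowed), a small helper along these lines can normalize tags before sending them to Zendesk. The function name is my own illustration, not part of Grouparoo's or Zendesk's API:

```python
import re

def normalize_zendesk_tag(tag: str) -> str:
    """Lowercase the tag and replace runs of whitespace with underscores,
    per the rule above: lowercase, no spaces, underscores are fine."""
    return re.sub(r"\s+", "_", tag.strip().lower())

print(normalize_zendesk_tag("High Value Customer"))  # high_value_customer
```

Normalizing tags on your side before syncing avoids silent mismatches when Zendesk lowercases them on ingest.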
Introducing Model Monitoring & Metrics Store in Cloudera Data Science Workbench. With only about 35% of machine learning models making it into production in the enterprise (IDC), it's no wonder that production machine learning has become one of the most important focus areas for data scientists and ML engineers alike. As you may remember, we recently announced a full set of MLOps capabilities in Cloudera Machine Learning, our cloud-native machine learning tool for the cloud.
Evaluating anomalies and unpredicted events like pandemics and ESG concerns. In part II of the series, we sat down for an interview with Dr. Richard Harmon, Managing Director of Financial Services at Cloudera, to find out more about how the industry is adopting new technology. You can catch up and read part 1 of the series here. Thank you for joining us for part two of our discussion around data, analytics and machine learning within the Financial Services sector, Dr…
With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG-writing features with plenty of example code. You'll learn how to:
- Understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to
- Write DAGs that adapt to your data at runtime and set up alerts and notifications
- Scale your…
In Part One, we discussed how to first identify slow queries on MongoDB using the database profiler, and then investigated the strategies the database used during the execution of those queries to understand why they were taking the time and resources they were. In this blog post, we'll discuss several other targeted strategies we can use to speed up those problematic queries when the right circumstances are present.