Top Data Engineering Digest Metadata AWS Content for Week of Dec 01

Apache Zookeeper As A Building Block For Distributed Systems with Patrick Hunt - Episode 59

Data Engineering Podcast

DECEMBER 2, 2018

Summary: Distributed systems are complex to build and operate, and certain primitives are common to the majority of them. Rather than re-implement the same capabilities every time, many projects build on top of Apache Zookeeper. In this episode, Patrick Hunt explains how the Apache Zookeeper project was started, how it functions, and how it is used as a building block for other distributed systems.

Systems

Systems Building Kafka Java

Cache warming: Agility for a stateful service

Netflix Tech

DECEMBER 4, 2018

By Deva Jayaraman, Shashi Madappa, Sridhar Enugula, and Ioannis Papapanagiotou. EVCache has been a fundamental part of the Netflix platform (we call it Tier-1), holding petabytes of data. Our caching layer serves multiple use cases, including signup, personalization, searching, playback, and more. It comprises thousands of nodes in production and hundreds of clusters, all of which must routinely scale up as our membership grows.

AWS

AWS Architecture Kafka Metadata

One Audio Sequencer to Rule Them All

Pandora Engineering

DECEMBER 5, 2018

Last month Pandora announced a public podcast beta in conjunction with the Podcast Genome Project. This rollout introduced many exciting features to our current mobile application offerings, including fully integrated and native podcast support. Ironically, one of the most interesting features and perhaps our biggest engineering win with this iteration is something that’s transparent to our end users: the inclusion of a new audio playback sequencer used exclusively for

Media

Media Algorithm Coding Data Science

Open Source: November Review - Maintainer training, new releases and more

Zalando Engineering

DECEMBER 5, 2018

Project Highlights: ExternalDNS version 0.5.9 is ready for testing. This project allows you to control DNS records dynamically via Kubernetes resources in a DNS provider-agnostic way. ExternalDNS also successfully made its way to the Kubernetes Incubator. Check out the list of changes in this new release. Zalando-Incubator welcomed two brand new open source projects: 1) Darty - a data dependency manager for data science projects.

PostgreSQL

PostgreSQL Java Machine Learning Deep Learning

Announcing my session at #SQLBits - Azure Databricks

Advancing Analytics: Data Engineering

DECEMBER 3, 2018

Simon Whiteley and I will be back at #SQLBits 2019 talking about #DataEngineering and #DataScience in Databricks. We will look at #ApacheSpark #Python #Engineering & #MachineLearning in this full-day training session. Have you looked at Azure Databricks yet? No? Then you need to. Why, you ask? There are many reasons. Number one: knowing how to use Apache Spark will earn you more money.

Data Science

Data Science Machine Learning Python Data Pipeline

Running SQL on Nested JSON

Rockset

DECEMBER 7, 2018

When we surveyed the market, we saw the need for a solution that could perform fast SQL queries on fluid JSON data, including arrays and nested objects: "Best architecture to convert JSON to SQL?" "What are the ways to run SQL on JSON data without predefining schemas?" "I need database to take JSON and execute SQL. What are my options?" The Challenge of SQL on JSON: Some form of ETL to transform JSON to tables in SQL databases may be workable for basic JSON data with fixed fields that are known up fro

SQL

SQL MySQL Relational Database Database

Front-End Micro Services

Zalando Engineering

DECEMBER 5, 2018

The “micro frontends” idea has been around for a while now, with great resources such as this Tom Söderlund article, which includes a list of existing implementations. In this article, I would like to take an in-depth look at the reference implementation using fragments: explain what it tries to achieve, where it falls short, and possible solutions to those limitations.

Architecture

Architecture Engineering Technology Systems

Data Engineering Digest

Sat. Dec 01, 2018 - Fri. Dec 07, 2018