Sat. Feb 10, 2018 - Fri. Feb 16, 2018

Building Reliable Reprocessing and Dead Letter Queues with Apache Kafka

Uber Engineering

In distributed systems, retries are inevitable. From network errors to replication issues and even outages in downstream dependencies, services operating at a massive scale must be prepared to encounter, identify, and handle failure as gracefully as possible. Given the scope …

TimescaleDB: Fast And Scalable Timeseries with Ajay Kulkarni and Mike Freedman - Episode 18

Data Engineering Podcast

Summary: As communications between machines become more commonplace, the need to store the generated data in a time-oriented manner increases. The market for timeseries data stores has many contenders, but they are not all built to solve the same problems or to scale in the same manner. In this episode the founders of TimescaleDB, Ajay Kulkarni and Mike Freedman, discuss how Timescale was started, the problems that it solves, and how it works under the covers.

2017 – Another Award-Winning Year for Cloudera!

Cloudera

In many ways, 2017 was a singular year for Cloudera, not least because we staged a successful IPO and joined the ranks of the world’s fastest-growing, publicly traded companies. We deeply appreciate the vote of confidence and trust our customers have placed in us and are proud of the hard work of our 1,600-plus employees. These are some of the year’s highlights.

Five Minutes from Machine Learning to RESTful API

Zalando Engineering

The benefits of Connexion: Zalando’s open source API-First framework. In this article, I will show how quick and simple it can be to create a RESTful API for a machine learning model using Zalando’s open source Swagger/OpenAPI First framework called Connexion. The official documentation describes Connexion as follows: “Connexion is a framework on top of Flask that automagically handles HTTP requests based on OpenAPI 2.0 Specification (formerly known as Swagger Spec) of your API described in YAM

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Cloudera on Cloudera: Our Journey to Becoming more Data-driven

Cloudera

I’ve spent the last four years here at Cloudera talking with our customers about how to run their businesses better using their data and Cloudera’s products and services. Now I get to put my money where my mouth is – and turn my focus internally on how we at Cloudera can become more data-driven. We aspire to and are on the journey to be the best-run company on data, and to be our own best reference.

Dave Shuman Talks IoT and Big Data on Federal News Radio

Cloudera

What exactly can we expect for IoT in 2018, and how can you improve your organization with connected devices? That was the question Dave Shuman set out to answer when he sat down last month with John Gilroy at the Federal News Radio headquarters in Washington, D.C. Federal Tech Talk looks at the world of high technology in the federal government and, as its host, John speaks the language of federal CISOs, CIOs, and CTOs.