June, 2017

Rebuilding Yelp's Data Pipeline with Justin Cunningham - Episode 5

Data Engineering Podcast

Summary: Yelp needs to be able to consume and process all of the user interactions that happen on their platform as close to real time as possible. To achieve that goal they embarked on a journey to refactor their monolithic architecture to be more modular and modern, and then they open sourced it! In this episode Justin Cunningham joins me to discuss the decisions they made and the lessons they learned in the process, including what worked, what didn’t, and what he would do differently

Re-Architecting Cash and Digital Wallet Payments for India with Uber Engineering

Uber Engineering

Uber is developing a payment platform for India that enables operations teams to more seamlessly collect and distribute cash and digital wallet payments to drivers. In this article, San Francisco-based software engineer Yijun Liu reflects on his experiences working with …

Hadoop Cluster Overview: What it is and how to setup one?

ProjectPro

What is a Hadoop Cluster? In general, a computer cluster is a collection of computers that work collectively as a single system. “A Hadoop cluster is a collection of independent components connected through a dedicated network to work as a single centralized data processing resource.” “A Hadoop cluster can be referred to as a computational computer cluster for storing and analysing big data (structured, semi-structured and unstructured) in a distributed environment.”

Signalling Your Jenkins Build Status with a Mini USB Traffic Light

Zalando Engineering

As part of an effort to increase developer awareness of quality, we wanted to draw attention to the fact that you should have healthy CI builds. The normal procedure revolved around emails sent to the individuals who broke the build with their last commit. With almost all of us used to receiving a lot of email noise throughout the day, this is not a channel where you can expect an immediate reaction.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Engineering On-Demand Transportation for Business with Uber Central

Uber Engineering

When Uber launched in 2009, our mission was simple: make transportation as reliable as running water everywhere, for everyone. While our mission remains the same today, the number of Uber use cases has grown dramatically, motivating our engineers to think …

Getting to Know Hadoop 3.0 -Features and Enhancements

ProjectPro

Hadoop was first made publicly available as open source in 2011; since then it has undergone major changes across three different versions. Apache Hadoop 3 is around the corner, with members of the Hadoop community at the Apache Software Foundation still testing it. The major release of Hadoop 3.x is anticipated to roll out sometime in mid-2017. What could be more exciting for the big data community than waiting for the release of a major new version of the tiny toy elephant?

More Trending

The Modern Architecture of Search

Zalando Engineering

Information Retrieval (IR) systems are a vital component at the core of successful modern web platforms, and Zalando understands their importance incredibly well. The main goal of IR systems is to provide a communication layer that enables customers to establish a retrieval dialogue with underlying data. The immense explosion of unstructured data drives modern search applications to go beyond just fuzzy string matching, to invest in deep understanding of user queries through interpretation of use

PostgreSQL in a time of Kubernetes

Zalando Engineering

A lot of time has passed at Zalando since the first services were started backed by PostgreSQL 9.0-rc1. Despite the adoption of other technologies, PostgreSQL remains the preferred relational database for most engineers around. You can follow some of the developments around PostgreSQL on the blog and also on GitHub where we share most of our PostgreSQL-related tooling.

Quantitative UX Research – How Can it Complement our Customer Insights?

Zalando Engineering

Most people associate the term UX research with qualitative methods, for example, interviews with a small number of participants. These interviews are used to discover things such as customer problems, usability issues with a product, and customer journeys. Often, we concentrate on observable behavior by watching and interviewing customers while they actually use the product or prototype.
