Sat.Oct 13, 2018 - Fri.Oct 19, 2018

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Uber Engineering

Uber is committed to delivering safer and more reliable transportation across our global markets. To accomplish this, Uber relies heavily on making data-driven decisions at every level, from forecasting rider demand during high-traffic events to identifying and addressing bottlenecks …

Big Data 109

Improving The Performance Of Cloud-Native Big Data At Netflix Using The Iceberg Table Format with Ryan Blue - Episode 52

Data Engineering Podcast

Summary: With the growth of the Hadoop ecosystem came a proliferation of implementations for the Hive table format. Unfortunately, with no formal specification, each project works slightly differently, which increases the difficulty of integration across systems. The Hive format is also built on the assumptions of a local filesystem, which results in painful edge cases when leveraging cloud object storage for a data lake.

Data Lake 100

Trending Sources

The Road Ahead: From Open Source to Open Services

Rockset

I love open source, but open-source software for data infrastructure is on the way out. There, I said it. And you might think I've got a screw loose, given the broad adoption of open source today, but hear me out. Yes, open source is ubiquitous in data management today, but the era of open-source innovation is all but over. In the age of public cloud, there is no longer a reason to build or use open source for data infrastructure, and a new category of software I'm labeling open services will ren

MongoDB 40

Growing a Product Area at Zalando

Zalando Engineering

The six-month journey of the customer inbox multi-disciplinary team. The customer inbox multi-disciplinary area operates in the Fashion Store pillar of the Zalando platform organization. The purpose of the Customer Inbox Unit is to serve customers personal and practical fashion messages through multiple channels, i.e. “Target the customers at the right time, at the right place.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.