Sat. Jan 05, 2019 - Fri. Jan 11, 2019

Performing Fast Data Analytics Using Apache Kudu - Episode 64

Data Engineering Podcast

Summary: The Hadoop platform is purpose-built for processing large, slow-moving data in long-running batch jobs. As the ecosystem around it has grown, so has the need for fast analytics on fast-moving data. To fill this need, the Kudu project was created with a column-oriented table format tuned for high volumes of writes and rapid query execution across those tables.

Who Was Smarter, Karl Benz or Sigmund Freud?

Teradata

David Socha compares Karl Benz and Sigmund Freud, two people who fundamentally and indisputably influenced how we live today.

Keeping Pace with New iOS Releases

Pandora Engineering

How We Updated Pandora on iOS 12 Launch Day. Photo credit: Stavros Constantinou. The Story That Shook the Press: The Pandora app was among the very few enterprise apps that successfully released an update for iOS 12 on Apple’s day-one September 21 launch date, supporting the exciting new Siri Shortcuts feature. Here are some notable quotes: Engadget, “Music app Pandora is taking advantage of Shortcuts at iOS 12’s launch.

How to Do Data Science Using SQL on Raw JSON

Rockset

This post outlines how to use SQL to query and join raw data sets such as nested JSON and CSV, enabling fast, interactive data science. Data scientists and analysts deal with complex data. Much of what they analyze could be third-party data, over which there is little control. To make use of this data, significant effort is spent on data engineering.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

The Magic Coaching Wand

Zalando Engineering

How the Zalando Personalization Unit improved with a diagnostic. In our coaching work, doing diagnostics can already create huge improvements without a lot of action on our part. Because we work at scale (Zalando has around 150 tech teams), this helps create an impact on the whole organisation. In this blog post, I will share the story of a diagnostic done in a unit of seven machine learning and data science (ML/DS) teams in Berlin, Helsinki and Dublin.

Enterprise Opportunities to Apply Reinforcement Learning & AI

Teradata

Reinforcement learning is the machine learning approach behind some of the most talked-about advances in AI.

More Trending

Building a Serverless Microservice Using Rockset and AWS Lambda

Rockset

Rockset makes it easy to develop serverless microservices, data APIs, and data-driven applications. This video demo shows an example of what's possible with Rockset. For this exercise, we will build a serverless microservice to discover the stock symbols with the most mentions on Twitter. Ingest: Our Twitter stream comes from Amazon Kinesis and is continuously ingested into Rockset.

Open Source: December Review - Patroni, Machine Learning Meetup and more

Zalando Engineering

Project Highlights: Patroni, one of Zalando's best-known open source projects, is now deployed as the Postgres Failover Manager on GitLab.com. Patroni was created a few years back when we needed automatic failover to manage hundreds of in-house clusters. The project started as a fork of Compose Governor; Patroni quickly overtook the original version and has become one of the most widely used templates for PostgreSQL High Availability.