Sat. May 05, 2018 - Fri. May 11, 2018

Brief Conversations From The Open Data Science Conference: Part 1 - Episode 30

Data Engineering Podcast

Summary: The Open Data Science Conference brings together a variety of data professionals each year in Boston. This week’s episode consists of a pair of brief interviews conducted on-site at the conference. First up, you’ll hear from Alan Anders, the CTO of Applecart, about their challenges with getting Spark to scale for constructing an entity graph from multiple data sources.

AI at the Forefront of Digital Transformation Process in 2018

InData Labs

Digital Transformation Definition: Digital transformation has been a big topic for a few years now, and it has many definitions. From a business perspective, digital transformation is about leveraging digital technologies to improve processes, competencies, and business models. It is also about changing the culture of the company because it requires letting go of old.


Nominations Now Open for the Sixth Annual Cloudera Data Impact Awards

Cloudera

We are excited to kick off the 2018 Data Impact Awards! Nominations are now open and will remain open until July 20th. Since 2012, the Data Impact Awards have showcased how organizations are using Cloudera and the power of data to transform themselves and achieve dramatic results.

How to Make Product Management for Enterprise Systems Work

Zalando Engineering

Moving from a more traditional internal IT setup to a product-driven culture. I love building enterprise systems because you get to work with your customers and users every day and literally see their lives change as you release new features. In my case, at Zalando, these are systems for fashion buying, supply chain management, inventory management, and procure-to-pay processes (e.g. paying our suppliers for merchandise we bought from them).


Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Many-to-Many Relationships Using Kafka

Zalando Engineering

Real-time joins in event-driven microservices. As discussed in my previous blog post, Kafka is one of the key components of our event-driven microservice architecture in Zalando’s Smart Product Platform. We use it for sequencing events and building an aggregated view of data hierarchies. This post expands on what I previously wrote about the one-to-many data model and introduces more complex many-to-many relationships.
