Sat. Feb 24, 2018 - Fri. Mar 02, 2018

Honeycomb Data Infrastructure with Sam Stokes - Episode 20

Data Engineering Podcast

Summary: One of the sources of data that often gets overlooked is the systems that we use to run our businesses. This data is not used to directly provide value to customers or understand the functioning of the business, but it is still a critical component of a successful system. Sam Stokes is an engineer at Honeycomb where he helps to build a platform that is able to capture all of the events and context that occur in our production environments and use them to answer all of your questions abou

Kafka 100

Recap of Hadoop News for February 2018

ProjectPro

News on Hadoop - February 2018. Kyvos Insights to Host Webinar on Accelerating Business Intelligence with Native Hadoop BI Platforms. PRNewswire.com, February 1, 2018. The leading big data analytics company Kyvos Insights is hosting a webinar titled “Accelerate Business Intelligence with Native Hadoop BI platforms” on February 7, 2018, at 10 AM PST.

Hadoop 52

Disrupt and Innovate in a Data-Driven World

Cloudera

If you do an internet search for ‘data-driven disruption’ you can find articles about almost every industry being disrupted by digitalisation and new applications of data. Banking, transportation, healthcare, retail, and real estate have all seen the emergence of new business models fundamentally changing how customers use their services. While there are instances of data-driven efforts in the nonprofit sector, they are not as widespread as they could be.

Data Analysis with Spark

Zalando Engineering

Apache’s lightning-fast engine for data analysis and machine learning. In recent years, there has been a massive shift in the industry towards data-oriented decision making backed by enormously large data sets. This means that we can serve our customers with more relevant, personalized content. We in the Digital Experience team are tasked with analysing Big Data in order to gather insights and support the product team with the decision-making process.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.