July, 2018

Mobile Data Collection And Analysis Using Ona And Canopy With Peter Lubell-Doughtie - Episode 41

Data Engineering Podcast

Summary: With the attention being paid to the systems that power large volumes of high-velocity data, it is easy to forget about the value of data collection at human scales. Ona is a company that is building technologies to support mobile data collection, analysis of the aggregated information, and user-friendly presentations. In this episode CTO Peter Lubell-Doughtie describes the architecture of the platform, the types of environments and use cases where it is being employed, and the value of s

Machine Learning and AI Underpin Predictive Analytics to Achieve Clinical Breakthroughs

Cloudera

The practice of medicine is not only a science, it is also an art. With it, difficult situations will arise requiring insightful judgments made by well-trained physicians who can tailor their approach to the needs of patients. As such, we are witnessing a revolution in the healthcare industry, in which there is now an opportunity to employ a new model of improved, personalized, evidence and data-driven clinical care.

Recap of Hadoop News for June 2018

ProjectPro

News on Hadoop - June 2018. RightShip uses big data to find reliable vessels. HoustonChronicle.com, June 15, 2018. RightShip is using IBM’s predictive big data analytics platform to calculate the likelihood of compliance or mechanical troubles that an individual merchant ship will experience within the next year. It also leverages big data to analyse carbon emissions and vessel efficiency.

Agile in People Operations

Zalando Engineering

Applying agile frameworks to HR processes. At Zalando we set up multi-disciplinary teams to develop our products. We do not have a central tech unit; instead, tech is distributed everywhere. This means that the way our techies work together has also spread across the company. Everywhere in the organization people have touch points with agile frameworks and practices.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Ceph: A Reliable And Scalable Distributed Filesystem with Sage Weil - Episode 40

Data Engineering Podcast

Summary: When working with large volumes of data that you need to access in parallel across multiple instances, you need a distributed filesystem that will scale with your workload. Even better is when that same system provides multiple paradigms for interacting with the underlying storage. Ceph is a highly available, highly scalable, and performant system that has support for object storage, block storage, and native filesystem access.

Building Data Flows In Apache NiFi With Kevin Doran and Andy LoPresto - Episode 39

Data Engineering Podcast

Summary: Data integration and routing is a constantly evolving problem and one that is fraught with edge cases and complicated requirements. The Apache NiFi project models this problem as a collection of data flows that are created through a self-service graphical interface. This framework provides a flexible platform for building a wide variety of integrations that can be managed and scaled easily to fit your particular needs.

More Trending

The Data Science Iron Triangle – Modern BI and Machine Learning

Cloudera

The New Iron Triangle. It is cliché to discuss IT/business solutions as people, process, and technology. Some call it the “golden triangle,” but in this blog, we refer to it as the iron triangle. Since the 1960s, technology has disrupted business through the advent of computing and information management. These systems replaced highly manual operations such as record keeping, finance, and reporting.

Lean Testing, or Why Unit Tests are Worse than You Think

Zalando Engineering

An economic perspective on testing. Testing is a controversial topic. People have strong convictions about testing approaches. Test Driven Development is the most prominent example. Clear empirical evidence is missing, which invites strong claims. I advocate for an economic perspective towards testing. Secondly, I claim that focussing too much on unit tests is not the most economic approach.

Styling-API Reinvented

Zalando Engineering

Decoupled styling in UI components. Styling isolation: Styling isolation achieved via CSS Modules, various CSS-in-JS solutions, or Shadow DOM simulation is already a commonly used and embraced pattern. This important step in CSS evolution was really necessary for UI components to be used with more confidence. No more global scope causing name conflicts and CSS leaking in and out!

Dortmund Turns Six

Zalando Engineering

Zalando’s maiden tech hub celebrates in style. With our 10th anniversary celebrations coming up, 2018 is a very special year in the Zalando universe. But while the company celebrates 10 years, we in Dortmund are excited to celebrate our own birthday as we turn six. Every year in July, we stop for a moment in Dortmund to reflect on our past journey together and celebrate the opening of our Dortmund Tech Hub in 2012.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Utilizing the Finite State Machine

Zalando Engineering

How using a State Machine saved our apps & flows from refactoring. There is a lot to learn about a "Finite State Machine" (FSM). A little intro: what is an FSM? A Finite State Machine is an abstract model of computation that can be in exactly one of a finite number of states at any given moment. Finite State Machines are used to model problems in different domains such as AI, games, application flows, etc.
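
To make the FSM definition concrete, here is a minimal sketch of a state machine for an application flow. The states, events, and the `FlowMachine` class are hypothetical and chosen only for illustration; they are not taken from the Zalando article.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states of a checkout-style application flow.
    CART = auto()
    PAYMENT = auto()
    CONFIRMED = auto()


# Transition table: (current state, event) -> next state.
# Any pair not listed here is an invalid transition.
TRANSITIONS = {
    (State.CART, "checkout"): State.PAYMENT,
    (State.PAYMENT, "back"): State.CART,
    (State.PAYMENT, "pay"): State.CONFIRMED,
}


class FlowMachine:
    """Holds exactly one state at a time and changes it only via the table."""

    def __init__(self, initial: State = State.CART) -> None:
        self.state = initial

    def fire(self, event: str) -> State:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    machine = FlowMachine()
    machine.fire("checkout")
    machine.fire("pay")
    print(machine.state.name)
```

Keeping the transitions in a single table means that adding a step to the flow is a data change, not a rewrite of branching logic, which is one plausible reading of how an FSM can reduce refactoring.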