August, 2018


Maximizing Process Performance with Maze, Uber’s Funnel Visualization Platform

Uber Engineering

At Uber, we spend a considerable amount of resources making the driver sign-up experience as easy as possible. At Uber’s scale, even a one percent increase in the rate of sign-ups to first trips (the driver conversion rate) carries a …


Protecting Your Data In Use At Enveil with Ellison Anne Williams - Episode 45

Data Engineering Podcast

Summary There are myriad reasons why data should be protected, and just as many ways to enforce it in transit or at rest. Unfortunately, there is still a weak point where attackers can gain access to your unencrypted information. In this episode Ellison Anne Williams, CEO of Enveil, describes how her company uses homomorphic encryption to ensure that your analytical queries can be executed without ever having to decrypt your data.


Bringing AIOps to Machine Learning & Analytics

Cloudera

Two years ago I founded Hyperpilot with the mission to enable autopilot for container infrastructure. We learned a lot about data center automation based on real-time application and diagnostic feedback using applied machine learning. Last month, I joined Cloudera along with former team members Xiaoyun Zhu and Che-Yuan Liang to bring our expertise in intelligent automation to Cloudera’s modern platform for machine learning and analytics.


Making slow queries fast with composite indexes in MySQL

nodeSWAT

Making slow queries fast using composite indexes in MySQL. This post expects some basic knowledge of SQL. Examples were made using MySQL 5.7.18 and run on my mid-2014 MacBook Pro. Query execution times are based on multiple executions so index caching can kick in. The use case came from a real application and the solution is used in production. So you have inserted preliminary data into your database and run a simple COUNT(*) query against it with a simple WHERE clause and… the spinner is still run
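As a rough illustration of the idea in this teaser, the sketch below shows how a composite index changes the plan for a COUNT(*) query with a two-column WHERE clause. It is not the post's own example: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, and the `events` table, its columns, and the index name are all hypothetical.

```python
import sqlite3

# Hypothetical schema and data for illustration only; the original post's
# tables are not reproduced here. SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, status) VALUES (?, ?)",
    [(i % 100, "done" if i % 3 == 0 else "open") for i in range(10_000)],
)

query = "SELECT COUNT(*) FROM events WHERE user_id = ? AND status = ?"
params = (7, "done")


def plan(sql, args):
    """Return the planner's description of how it would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Without an index the planner has to scan the whole table.
plan_before = plan(query, params)

# A composite index over both columns used in the WHERE clause.
conn.execute("CREATE INDEX idx_events_user_status ON events (user_id, status)")

# With the index the planner can search it instead of scanning the table.
plan_after = plan(query, params)
count = conn.execute(query, params).fetchone()[0]

print("before:", plan_before)
print("after: ", plan_after)
print("rows matched:", count)
```

In MySQL the equivalent check would be to run `EXPLAIN` on the query before and after adding the index; the column order in a composite index matters, so the leading column should be one the query actually filters on.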


Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.


Recap of Hadoop News for July 2018

ProjectPro

News on Hadoop - July 2018. Hadoop data governance services surface in wake of GDPR. TechTarget.com, July 2, 2018. GDPR has turned out to be a strong motivator that would bring greater governance to big data. At the recent DataWorks Summit 2018, though most of the attention was focussed on how Hadoop pioneer Hortonworks is all set to expand its service in the cloud, there was great interest and importance put on managing data privacy as well.


Three Years of our Helsinki Tech Hub

Zalando Engineering

Getting to know our Finnish tech hub as it turns three In early 2015, Zalando decided to expand its tech expertise and open tech hubs around Europe. First up was Dublin in April, and not far behind, the Helsinki Tech Hub was launched in August 2015. The Helsinki hub has had an exciting journey so far; from scaling to over 60 employees and designing a custom office to fit our community in our first year, to continuing to grow to over 100 employees with over 30 nationalities by our second annivers

More Trending


Graph Databases In Production At Scale Using DGraph with Manish Jain - Episode 44

Data Engineering Podcast

Summary The way that you store your data can have a huge impact on the ways that it can be practically used. For a substantial number of use cases, the optimal format for storing and querying that information is as a graph; however, databases architected around that use case have historically been difficult to use at scale or for serving fast, distributed queries.


A New Era in Data Warehousing

Cloudera

How do you know when your Data Warehousing solution is working well? Surprisingly, when you fail to notice it. Here are some interesting observations that are often taken for granted: Credit card transactions are handled safely. True – millions of credit card transactions are processed within minutes for consistency, fraud and compliance, using petabytes of historical transactions as reference data.


M3: Uber’s Open Source, Large-scale Metrics Platform for Prometheus

Uber Engineering

To facilitate the growth of Uber’s global operations, we need to be able to quickly store and access billions of metrics on our back-end systems at any given time. As part of our robust and scalable metrics infrastructure, we built …


Putting Airflow Into Production With James Meickle - Episode 43

Data Engineering Podcast

Summary The theory behind how a tool is supposed to work and the realities of putting it into practice are often at odds with each other. Learning the pitfalls and best practices from someone who has gained that knowledge the hard way can save you from wasted time and frustration. In this episode James Meickle discusses his recent experience building a new installation of Airflow.


Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!


Taking A Tour Of PostgreSQL with Jonathan Katz - Episode 42

Data Engineering Podcast

Summary One of the longest running and most popular open source database projects is PostgreSQL. Because of its extensibility and a community focus on stability it has stayed relevant as the ecosystem of development environments and data requirements have changed and evolved over its lifetime. It is difficult to capture any single facet of this database in a single conversation, let alone the entire surface area, but in this episode Jonathan Katz does an admirable job of it.


The Value of Data for Philanthropy

Cloudera

The field of philanthropy is in constant search of the next big thing. And rightly so – our organizations must use limited resources as wisely as possible to try to tackle some of society’s most challenging problems. In recent years, we have heard a great deal about how new and sophisticated understanding of how to interpret the onslaught of data produced in the modern age could help us turn the corner on major social and environmental problems.


AWS vs Azure-Who is the big winner in the cloud war?

ProjectPro

Research firm Gartner published a document stating that Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and IBM Cloud are innovative tech giants that provide highly cost-competitive alternatives to conventional on-premises hosting infrastructures. If you are new to cloud computing and cloud providers, you will want to select a cloud platform that can help you quickly start learning cloud computing.


Azure Marketplace features Cloudera Customer 360 offering

Cloudera

Cloudera’s diverse and expansive partner ecosystem includes major tech companies constantly redefining the industry, consultancies guiding some of the most comprehensive digital transformations, fast-emerging ISVs challenging the status quo, and cloud companies providing unparalleled flexibility and scalability. Individually, these companies deliver great value to customers, so imagine the business outcomes and customer benefits made possible when two or more of these companies develop a joint offer


Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.


Introducing Cloudera Enterprise 6.0

Cloudera

Digital technologies are changing business models, reshaping how companies go to market, win new customers and drive new revenue-producing opportunities. Consider the following practices that, until recently, were relegated to the R&D department: Data-driven decision making – the collection and analysis of data to guide decisions that improve success.


Zalando at the DatSci Awards 2018

Zalando Engineering

Building data science products in multidisciplinary teams. For the last three years, I have been working on different data science projects at Zalando, helping our more than 24 million customers find the most relevant items in the assortment we have. Along the way, I have learned how to scale data science, or how to build a new personalization product from scratch.


Top 5 Reasons to Learn AWS

ProjectPro

“Cloud is now what we call the new normal. It’s no longer an experiment, it’s no longer an after-thought,” said Vincent Quah, regional head of education, research, healthcare and non-profit organizations of AWS. Why should I learn AWS? Cloud computing is taking the tech world by storm and so is the need to learn cloud computing.


Battle of the Frameworks

Zalando Engineering

How to choose a JavaScript framework? Developers are often biased about their technology choices. At the beginning of the year, I was about to start working on a new product and my team could choose any tech stack. I did not want to be one of these biased developers who chose the framework they liked. I wanted to make an informed and educated decision.


15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?


Top 100 AWS Interview Questions and Answers for 2023

ProjectPro

Land your dream job with these AWS interview questions and answers suitable for multiple AWS Cloud computing roles starting from beginner to advanced levels. “I would like to become an AWS Solution Architect. What do you think are the most commonly asked AWS Architect Interview questions that I will have to answer during my interview?” We often get asked this question by beginners and professionals looking to land a top cloud computing gig.


The Future of Data Science

Zalando Engineering

Debunking the myth of the data science bubble. We’ve all read articles indicating the looming decline of data science. Some coined the term ‘data science bubble,’ some even went so far as to set a date for the ‘death of data-science’ (they give it five years before the bubble implodes). This reached a point where anyone working in the field needed to start paying attention to these signals.


Agile Principles Over Frameworks

Zalando Engineering

Embracing the diverse in working agile. Very often I get asked what agile working looks like at Zalando. Do we use Scrum? Do we use Kanban? Do we work with LeSS? Do we use SAFe? The answer to all of these is, “Yes”. As Agile Coaches we value principles more than frameworks. The principles are derived out of these diverse frameworks and they evolve over time.


Meet the newest Data Superheros: The Sixth Annual Data Impact Awards Finalists Are…

Cloudera

Drum roll… Starting from well over 100 nominations, we are excited to announce the finalists for this year’s Data Impact Awards! Each year, nominees have raised the bar, and this year is no exception. The level of impact that organizations have shown and the variety of use cases are inspiring. From AI models that power retail customer decision engines to utility meter analysis that disables underperforming gas turbines, these finalists demonstrate how machine learning and analytics have become


Prepare Now: 2025’s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.