March, 2019

The Importance of Distributed Tracing for Apache-Kafka-Based Applications

Confluent

Apache Kafka®-based applications stand out for their ability to decouple producers and consumers using an event log as an intermediate layer. One result of this is that producers and consumers don’t know about each other, as there is no direct communication between them. This enables choreographed service collaborations, where many components can subscribe to events stored in the event log and react to them asynchronously.

Open Sourcing Peloton, Uber’s Unified Resource Scheduler

Uber Engineering

First introduced by Uber in November 2018, Peloton, a unified resource scheduler, manages resources across distinct workloads, combining separate compute clusters. Peloton is designed for web-scale companies like Uber with millions of containers and tens of thousands of nodes. …

Why Analytics Projects Fail And What To Do About It

Data Engineering Podcast

Summary: Analytics projects fail all the time, resulting in lost opportunities and wasted resources. There are a number of factors that contribute to that failure, and not all of them are under our control. However, many of them are, and as data engineers we can help to keep our projects on the path to success. Eugene Khazin is the CEO of PrimeTSR, where he is tasked with rescuing floundering analytics efforts and ensuring that they provide value to the business.

Teradata Has Been Named One of the World's Most Ethical Companies 2019

Teradata

Teradata is thrilled to be named one of the World’s Most Ethical Companies for the tenth consecutive year.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

Netflix Tech

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency. By Di Lin, Girish Lingappa, and Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard, about to make a critical business decision but pausing to ask a question: “Can I run a check myself to understand what data is behind this metric?

Cloudera Altus Director on AWS Marketplace makes cloud deployment and billing easy

Cloudera

Roughly a quarter of Cloudera’s customers have clusters on public cloud, with a majority of them on AWS. These customers often look for cloud infrastructure best practices guidance as they venture into AWS cloud resources for the first time. Some of the questions asked include: How many AMIs do I need? Should I use EBS or S3 for storage? Many of these questions are answered in the Cloudera on AWS reference architecture guide.

More Trending

Using Machine Learning to Ensure the Capacity Safety of Individual Microservices

Uber Engineering

Reliability engineering teams at Uber build the tools, libraries, and infrastructure that enable engineers to operate our thousands of microservices reliably at scale. At its essence, reliability engineering boils down to actively preventing outages that affect the mean time between …

Building An Enterprise Data Fabric At CluedIn

Data Engineering Podcast

Summary: Data integration is one of the most challenging aspects of any data platform, especially as the variety of data sources and formats grows. Enterprise organizations feel this acutely due to the silos that occur naturally across business units. The CluedIn team experienced this issue first-hand in their previous roles, leading them to start a business that builds a managed data fabric for the enterprise.

What Is Pervasive Data Intelligence?

Teradata

Chris Twogood explains Pervasive Data Intelligence and why it's important to large-scale enterprise businesses.

Spinnaker Sets Sail to the Continuous Delivery Foundation

Netflix Tech

Author: Andy Glover. Since releasing Spinnaker to the open source community in 2015, the platform has flourished with the addition of new cloud providers, triggers, pipeline stages, and much more. Myriad new features, improvements, and innovations have been added by an ever-growing, actively engaged community. Each new innovation has been a step towards an even better Continuous Delivery platform that facilitates rapid, reliable, safe delivery of flexible assets to pluggable deployment targets.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Managing mortgage risk in an uncertain world

Cloudera

Picture the scene: a hopeful homebuyer sits in the almost deserted lobby of a high street bank, waiting for the appointment she booked with the mortgage consultant a week ago – a week ago! It annoys her that she has had to come to a branch she has not visited for years, all because she could not work out how to apply for a home loan on the bank’s website.

SOA vs. EDA: Is Not Life Simply a Series of Events?

Confluent

When should you use an API? When should you use an event? Most contemporary software architectures are some mix of these two approaches. I will attempt to articulate in layman’s terms what an event-driven architecture (EDA) is and contrast it with service-oriented architecture (SOA). In essence, this is an attempt to differentiate and/or associate APIs with events.

Improving the User Experience with Uber’s Customer Obsession Ticket Routing Workflow and Orchestration Engine

Uber Engineering

Every day, Uber users around the world initiate customer support tickets through our Customer Obsession Platform. To ensure a seamless user experience, each of those tickets must be matched with an agent who speaks the user’s language and who …

A DataOps vs DevOps Cookoff In The Data Kitchen

Data Engineering Podcast

Summary: Delivering a data analytics project on time and with accurate information is critical to the success of any business. DataOps is a set of practices to increase the probability of success by creating value early and often, and using feedback loops to keep your project on course. In this episode Chris Bergh, head chef of Data Kitchen, explains how DataOps differs from DevOps, how the industry has begun adopting DataOps, and how to adopt an agile approach to building your data platform.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

How to Use Analytics to Avoid Problems, Before They Impact Your Business

Teradata

Rob Armstrong uses the metaphor of turbulence when flying to explain how businesses can prepare themselves for, and respond to, previously unforeseen challenges.

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

Netflix Tech

Netflix’s engineering culture is predicated on Freedom & Responsibility, the idea that everyone (and every team) at Netflix is entrusted with a core responsibility and they are free to operate with freedom to satisfy their mission. This freedom allows teams and individuals to move fast to deliver on innovation and feel responsible for quality and robustness of their delivery.

Introducing the 2019 Data Heroes – EMEA!

Cloudera

The Data Heroes initiative is one of the ways that we recognize customers who achieve outstanding results with Cloudera technologies. The Data Visionary, Data Scientist, Data Architect, and HCC Community Champion awards are given out to organizations transforming their businesses through Big Data. Data Heroes design modern data architectures that work across hybrid and multi-cloud, and solve complex data management and analytic use cases that span from the Edge to AI.

Kafka Streams’ Take on Watermarks and Triggers

Confluent

Back in May 2017, we laid out why we believe that Kafka Streams is better off without a concept of watermarks or triggers, and instead opts for a continuous refinement model. This article explains how we are fundamentally sticking with this model, while also opening the door for use cases that are incompatible with continuous refinement. By continuous refinement, I mean that Kafka Streams emits new results whenever records are updated.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Serverless Data Management: A SQL Search and Analytics Engine

Rockset

When we started Rockset, we envisioned building a powerful cloud data management system that was really easy to use. Making the data stack simpler is fundamental to making data usable by developers and data scientists. Simplifying the Data Stack: To that end, we incorporated user-friendly features that alleviate the pain we personally experienced as data practitioners.

Customer Analytics At Scale With Segment

Data Engineering Podcast

Summary: Customer analytics is a problem domain that has given rise to its own industry. In order to gain a full understanding of what your users are doing and how best to serve them, you may need to send data to multiple services, each with their own tracking code or APIs. To simplify this process and allow your non-engineering employees to gain access to the information they need to do their jobs, Segment provides a single interface for capturing data and routing it to all of the places that you

Adding Cloud to Your Analytic Ecosystem

Teradata

Brian Wood explains what should be considered when adding cloud to your analytic ecosystem.

Netflix Public Bug Bounty, 1 year later

Netflix Tech

By Astha Singhal (Netflix Application Security). As Netflix continues to create entertainment people love, the security team continues to keep our members, partners, and employees secure. The security research community has partnered with us to improve the security of the Netflix service for the past few years through our responsible disclosure and bug bounty programs.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Introducing Cloudera Edge Management and Cloudera Flow Management

Cloudera

Cloudera’s vision of delivering Edge to AI solutions using the Enterprise Data Cloud will enable enterprises to transform dramatically. In today’s digitally connected enterprises, data originates from the edge, streams into the data center, lands in an Enterprise Data Cloud for downstream processing including Machine Learning and then serves back to the edge for real-time prediction and action.

CloudBank’s Journey from Mainframe to Streaming with Confluent Cloud

Confluent

Cloud is one of the key drivers for innovation. Innovative companies experiment with data to come up with something useful. It usually starts with the opening of a firehose that continuously broadcasts tons of events before they start mining it to create music out of simply noise. Today, companies from all around the world are witnessing an explosion of event generation coming from everywhere, including their own internal systems.

Brand Identity Issues: How Does Logo Detection Work for Effective Marketing Campaign?

InData Labs

Social media has evolved into the main method of communicating ideas, sharing experiences and brand stories, and building communities. User engagement with ads on Facebook has tripled in the last 2 years, as Hootsuite reports. So far, more than 60% of users discover brands and goods on Instagram, employ such apps as Like2Buy that allows.

Wake up to Pandora with the Clock app from Google

Pandora Engineering

It’s 7 o’clock. I tug my blanket tightly over my face, hoping to whisk the morning away. The sound of my favorite playlist soothes me to a gentle rise. I smile. The alarm worked. Months of hard work collaborating with Google™, including many days at each other’s offices, came to life in that moment. Just a week earlier, we announced the release of the Pandora integration with the Clock app from Google.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Building a Diverse and Inclusive Teradata

Teradata

Teradata CEO Oliver Ratzesberger celebrates International Women's Day.

Advancing Analytics at #SQLBits 2019

Advancing Analytics: Data Engineering

Today is my first day back in the office after attending SQLBits in Manchester last week. SQLBits is the UK's largest Microsoft Data Platform conference. What makes this event special is that it is organised for the community, by the community, and is not for profit: all the proceeds go into funding the event and, in particular, the awesome Friday night party.

Learning with Limited Labeled Data

Cloudera

This post was originally published on the Cloudera Fast Forward Labs blog. In recent years, machine learning technologies, especially deep learning, have made breakthroughs which have turned science fiction into reality. Autonomous cars are almost possible, and machines can comprehend language. These technical advances are unprecedented, but they hinge on the availability of vast amounts of data.

Consuming Messages Out of Apache Kafka in a Browser

Confluent

Imagine a fire hose that spews out trillions of gallons of water every day, and part of your job is to withstand every drop coming out of it. This is what it is like to visualize the message throughput of Apache Kafka®. At Confluent, we want to help developers understand how to think about event streaming and the opportunities it can create. Educating people on what an event stream looks like is a daunting task.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases, the number of use cases and corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.