Sat. Jul 24, 2021 - Fri. Jul 30, 2021

Uber’s Fulfillment Platform: Ground-up Re-architecture to Accelerate Uber’s Go/Get Strategy

Uber Engineering

Introduction to Fulfillment at Uber. Uber’s mission is to help our consumers effortlessly go anywhere and get anything in thousands of cities worldwide. At its core, we capture a consumer’s intent and fulfill it by matching it with the right …

Protecting Data Integrity in Confluent Cloud: Over 8 Trillion Messages Per Day

Confluent

It’s about maintaining the right data even when no one is watching. Last year, Confluent announced support for Infinite Storage, which fundamentally changes data retention in Apache Kafka® by allowing […].

#ClouderaLife Spotlight: Vinicius Cardoso, Sr Solutions Engineering

Cloudera

Meet Vinicius Cardoso, better known as Vini. He is a Sr. Solutions Engineer (SE) working in Australia. In his role, customers are at the center of everything he does. Wearing the hat of Enterprise Architect, he dives deep to understand customers’ organizational goals, initiatives and requirements in order to identify the key capabilities that need to be delivered.

Data Movement in Netflix Studio via Data Mesh

Netflix Tech

By Andrew Nguonly, Armando Magalhães, Obi-Ike Nwoke, Shervin Afshar, Sreyashi Das, Tongliang Liu, Wei Liu, Yucheng Zeng. Background: Over the next few years, most content on Netflix will come from Netflix’s own Studio. From the moment a Netflix film or series is pitched and long before it becomes available on Netflix, it goes through many phases.

Data 103

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Adding Context And Comprehension To Your Analytics Through Data Discovery With SelectStar

Data Engineering Podcast

Summary: Companies of all sizes and industries are trying to use the data that they and their customers generate to survive and thrive in the modern economy. As a result, they are relying on a constantly growing number of data sources being accessed by an increasingly varied set of users. In order to help data consumers find and understand the data that is available, and help the data producers understand how to prioritize their work, SelectStar has built a data discovery platform that brings everyone

BI 100

Speed, Scale, Storage: Our Journey from Apache Kafka to Performance in Confluent Cloud

Confluent

At Confluent, we focus on the holy trinity of performance, price, and availability, with the goal of delivering a similar performance envelope for all workloads across all supported cloud providers. […].

Cloud 122

More Trending

Recommender Systems: Behind the Scenes of Machine-Learning-Based Personalization

AltexSoft

Steve Jobs once said, “People don’t know what they want until you show it to them”. Well, try arguing that considering that we all watch videos suggested by YouTube, buy goods suggested by Amazon, and watch TV shows suggested by Netflix. People like being guided and given relevant offers and recommendations. They like being treated in a personal manner.

Building a Multi-Tenant Managed Platform For Streaming Data With Pulsar at Datastax

Data Engineering Podcast

Summary: Everyone expects data to be transmitted, processed, and updated instantly as more and more products integrate streaming data. The technology to make that possible has been around for a number of years, but the barriers to adoption have still been high due to the level of technical understanding and operational capacity that have been required to run at scale.

Building 100

Design Considerations for Cloud-Native Data Systems

Confluent

Twenty years ago, the data warehouses of choice were Oracle and Teradata. Since then, growth and innovation have shifted to the cloud, and a new generation of data systems have […].

Systems 116

Five Strategies to Accelerate Data Product Development

Cloudera

Introduction. With this first article of the two-part series on data product strategies, I am presenting some of the emerging themes in data product development and how they inform the prerequisites and foundational capabilities of an Enterprise data platform that would serve as the backbone for developing successful data product strategies. Once we have identified those capabilities, the second article explores how the Cloudera Data Platform delivers those prerequisite capabilities and has enab

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

What Is Referential Transparency and Why Should You Care?

Rock the JVM

Discover how referential transparency boosts your productivity as a functional programmer in Scala and why it's crucial

Scala 52

Data cleaning for nulls with SQL vs. code

Grouparoo

When preparing your data set for analysis, it is crucial to ensure that your data set is both complete and accurate. One step in this process is deciding how to handle null values. Depending on how your data is going to be used, you may not want null values at all! Let's clean some data: We're going to take a look at calculating Lifetime Value (LTV) of a customer.

SQL 52

From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud

Confluent

Multi-tenancy brings cost-efficiency to infrastructure, and when done correctly, creates an economy of scale. Done incorrectly and you degrade the user experience and create maintenance nightmares for operators. This is […].

Cloud 111

Learn How Cloudera Drives Healthcare Data Insights at HIMSS 21

Cloudera

HIMSS21 is just a few days away, and we hope you will join us to talk about how we can all achieve better health outcomes by working together. Health organizations across the world are evaluating safety precautions as COVID-19 cases continue to wax and wane and they consider universal questions such as, when is it safe to allow our administrative staff to return to the office, and how can we reassure our patients that we are committed to their health and safety?

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Knowledge Graph Technologies Accelerate and Improve the Data Model Definition for Master Data

Zalando Engineering

The Master Data Management Challenge: Master data management (MDM) is a technology-enabled discipline in which business and Information Technology work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets. At Zalando we are at an early phase of realising MDM for our internal data assets and we have chosen to do it in a consolidated style.

Kafka Summit APAC 2021 Recap

Confluent

The second of this year’s three online Kafka Summits is now complete! We hope you were able to join us for Kafka Summit APAC 2021 yesterday. We had over 13,000 […].

Kafka 105

Driving Standards & Collaboration in Telco with Data & AI

Cloudera

I’m thrilled to report that Cloudera today announced its membership of the TM Forum, the leading industry standards and collaboration group for the telecommunications industry. This is an important step for our company and for our telecommunications and media customers and partners, adding significant momentum and acceleration to our development of solutions for the industry.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

DataKitchen Wins Data & Analytics Vendor of the Year Award – OnConferences

DataKitchen

The post DataKitchen Wins Data & Analytics Vendor of the Year Award – OnConferences first appeared on DataKitchen.

Categorizing user-uploaded documents

Scribd Technology

Scribd offers a variety of publisher and user-uploaded content to our users and while the publisher content is rich in metadata, user-uploaded content typically is not. Documents uploaded by the users have varied subjects and content types which can make it challenging to link them together. One way to connect content can be through a taxonomy - an important type of structured information widely used in various domains.

Making Apache Kafka Serverless: Lessons From Confluent Cloud

Confluent

Serverless offerings in the cloud are a favorite among software engineers; prime examples are object stores such as AWS S3. For the system designer, however, it is an engineering challenge […].

Cloud 105

Ransomware is Becoming the Most Prevalent Malware Attack - Don’t Become the Next Victim

Teradata

Ransomware attacks can be devastating. That’s why it’s important to stay informed about what ransomware is, how it works and the types of ransomware there are.

IT 52

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Maximize Your Data Lakehouse Using Dremio Architecture

Preset

Apache Superset™ and Dremio provide a powerful accelerated lakehouse platform.

50 Cloud Computing Interview Questions and Answers for 2023

ProjectPro

Why Learn Cloud Computing Skills? The job market in cloud computing is growing rapidly every day. It is among the top skills that people want to upgrade. A quick search on LinkedIn shows there are over 30,000 jobs for freshers in cloud computing and over 60,000 senior-level cloud computing roles. As an increasing number of companies switch to the cloud after seeing its benefits and ease of use, job growth in the cloud market is burgeoning.

Reduce Your Data Infrastructure TCO with Confluent’s New Splunk S2S Source Premium Connector

Confluent

Data is at the center of our world today, especially with the ever-increasing amount of machine-generated log data collected from applications, devices, and sensors from almost every modern technology. The […].

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

How The Farmer’s Dog Achieved Rapid ML-Based Anomaly Detection with Monte Carlo

Monte Carlo

Companies across all industries are striving to become data-driven: making decisions based on data and building a culture of data trust and transparency. But data downtime (periods of time where data is missing, broken or otherwise erroneous) undermines those efforts and can cost companies upwards of $15 million annually. And very often, the ability to achieve more reliable data is both time-intensive and intensely manual.

Food 40

Google Data Scientist Interview Questions To Get You Hired

ProjectPro

Google data science interviews are challenging. The data scientist interview questions are tricky, specific to Google’s data products, and cover a wide range of data science and machine learning concepts. The good news is that the right preparation can make a big difference and get you hired at one of the FANG companies. If you’re interviewing for a data scientist role at Google or you’re just curious about what a data scientist interview at Google looks like - we’ve brok

20x Faster Ingestion with Rockset's New DynamoDB Connector

Rockset

Since its introduction in 2012, Amazon DynamoDB has been one of the most popular NoSQL databases in the cloud. DynamoDB, unlike a traditional RDBMS, scales horizontally, obviating the need for careful capacity planning, resharding, and database maintenance. As a result, DynamoDB is the database of choice for companies building event-driven architectures and user-friendly, performant applications at scale.

NoSQL 40

Building a Roadmap for Enterprise Data and Analytics – A Framework

Teradata

Building a data analytics roadmap for a large, complex enterprise can be daunting. Breaking it down into essentials helps manage complexity, avoid pitfalls, & set the program in the right direction.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.