Sat. Aug 28, 2021 - Fri. Sep 03, 2021

Understand & Deliver on Your Data Engineering Task

Start Data Engineering

Contents: 1. Introduction; 2. Understanding your data engineering task (2.1. Data infrastructure overview, 2.2. What exactly, 2.3. Why exactly, 2.4. Current state, 2.5. Downstream impact); 3. Delivering your data engineering task (3.1. How, 3.2. Breakdown into sub-tasks, 3.3. Delivering the finished task); 4. Conclusion; 5. Further reading.

1. Introduction: Congratulations! You are given a quick overview of the business and data architecture and are assigned your very first data engineering task.

Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework

Uber Engineering

Introduction. Uber’s GSS (Global Scaled Solutions) team runs scaled programs for diverse products and businesses, including but not limited to Eats, Rides, and Freight. The team transforms Uber’s ideas into agile, global solutions by designing and implementing scalable solutions. One …

Designing And Building Data Platforms As A Product

Data Engineering Podcast

Summary The term "data platform" gets thrown around a lot, but have you stopped to think about what it actually means for you and your organization? In this episode Lior Gavish, Lior Solomon, and Atul Gupte share their view of what it means to have a data platform, discuss their experiences building them at various companies, and provide advice on how to treat them like a software product.

Announcing Elastic Data Streams Support for Confluent’s Elasticsearch Sink Connector

Confluent

Today, as part of our expanded partnership with Elastic, we are announcing an update to the fully managed Elasticsearch Sink Connector in Confluent Cloud. This update allows you to take […].

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

When Data Redefines Companies

Cloudera

The more an enterprise wants to know about itself and its business prospects, the more data it needs to collect and analyze. Additionally, the more data it collects and stores, the better its ability to know customers, to find new ones, and to provide more of what they want to buy. Sounds simple, but a surprising majority of U.S. companies (about two-thirds, according to CIO.com) are only now getting tuned in to become fully functioning data-driven enterprises by starting new initiatives, scali

Practical API Design at Netflix, Part 1: Using Protobuf FieldMask

Netflix Tech

By Alex Borysov, Ricky Gardiner. Background: At Netflix, we use gRPC heavily for backend-to-backend communication. When we process a request, it is often beneficial to know which fields the caller is interested in and which ones they ignore. Some response fields can be expensive to compute, and some can require remote calls to other services.
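The idea of letting the caller name the fields it wants can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Netflix's code and not the protobuf `FieldMask` API: the resolver functions, the `get_production` name, and the sample data are all hypothetical. It also assumes the common convention that an empty mask means "return every field".

```python
from typing import Any, Callable, Dict, Iterable, List

# Records which resolvers actually ran, so the saved work is visible.
resolver_calls: List[str] = []


def _title(production_id: int) -> str:
    # Cheap field (hypothetical).
    resolver_calls.append("title")
    return f"Production {production_id}"


def _schedule(production_id: int) -> Dict[str, str]:
    # Stands in for an expensive remote call to another service (hypothetical).
    resolver_calls.append("schedule")
    return {"start": "2021-09-01", "end": "2021-12-01"}


# One resolver per response field; a field is computed only when requested.
RESOLVERS: Dict[str, Callable[[int], Any]] = {
    "title": _title,
    "schedule": _schedule,
}


def get_production(production_id: int, field_mask: Iterable[str] = ()) -> Dict[str, Any]:
    """Build a response containing only the fields named in field_mask."""
    paths = list(field_mask)
    # Assumption: an empty mask means the caller wants all fields.
    wanted = paths if paths else list(RESOLVERS)
    unknown = [path for path in wanted if path not in RESOLVERS]
    if unknown:
        raise ValueError(f"unknown field(s) in mask: {unknown}")
    return {name: RESOLVERS[name](production_id) for name in wanted}
```

Calling `get_production(7, ["title"])` runs only the `title` resolver, so the stand-in for the remote call is skipped entirely; calling it with no mask runs both.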

More Trending

Chugai Pharmaceutical

Teradata

Accelerating drug discovery and development with Teradata Vantage on AWS.

Optimizing Cloudera Data Engineering Autoscaling Performance

Cloudera

The shift to cloud has been accelerating, and with it, a push to modernize the data pipelines that fuel key applications. That is why cloud-native solutions that take advantage of capabilities such as disaggregated storage and compute, elasticity, and containerization are more important than ever. At Cloudera, we introduced Cloudera Data Engineering (CDE) as part of our Enterprise Data Cloud product, Cloudera Data Platform (CDP), to meet these challenges.

Towards a Reliable Device Management Platform

Netflix Tech

By Benson Ma, Alok Ahuja. Introduction: At Netflix, hundreds of different device types, from streaming sticks to smart TVs, are tested every day through automation to ensure that new software releases continue to deliver the quality of the Netflix experience that our customers enjoy. In addition, Netflix continuously works with its partners (such as Roku, Samsung, LG, Amazon) to port the Netflix SDK to their new and upcoming devices (TVs, smart boxes, etc), to ensure the quality bar is reached be

Faster Results and a Better Experience with New Pagination in Rockset

Rockset

Summary: Pagination is a technique used to divide a result set into smaller, more manageable chunks. Historically, Rockset used the Limit-Offset method to implement pagination, but query results can be slow and inconsistent when dealing with very large data sets in real time. Rockset has now implemented a cursor-based approach for pagination, making queries faster, more consistent, and potentially cheaper for large data sets. This is available today for all customers. Pagination is a familiar techni
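The difference between the two approaches can be shown with a small in-memory sketch. It uses keyset pagination (resuming after the last key seen) as one simple form of a cursor; Rockset's actual implementation is not described in the excerpt and may differ. The `ROWS` data and both function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]

# Hypothetical result set, sorted by a unique "id" key.
ROWS: List[Row] = [{"id": i, "value": f"row-{i}"} for i in range(1, 11)]


def page_by_offset(rows: List[Row], limit: int, offset: int) -> List[Row]:
    """Limit-offset: skip `offset` rows, then return the next `limit` rows."""
    return rows[offset:offset + limit]


def page_by_cursor(
    rows: List[Row], limit: int, after_id: Optional[int] = None
) -> Tuple[List[Row], Optional[int]]:
    """Keyset cursor: return rows whose id is strictly greater than after_id.

    The second return value is the cursor for the next page, or None when
    no rows remain after this page.
    """
    remaining = [r for r in rows if after_id is None or r["id"] > after_id]
    page = remaining[:limit]
    next_cursor = page[-1]["id"] if len(remaining) > limit else None
    return page, next_cursor
```

If a new row is inserted at the front of the data between two page requests, the offset-based second page starts one position too early and repeats a row the client has already seen, while the cursor-based second page resumes strictly after the last id returned and is unaffected.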

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Grouparoo v0.6 release

Grouparoo

The newest release of Grouparoo has a few updates that make working with data easier. Staying in sync with your data warehouse: if rows are deleted in your data warehouse, then Grouparoo profiles get deleted. Combine or use logic to make profile properties: use code to re-mix your data and get the perfect formats. New destinations: Mixpanel, Mailjet. Profile deletion: Data systems are often quite good at ingesting new data, but things get complicated when it gets deleted.

Accenture’s Smart Data Transition Toolkit Now Available for Cloudera Data Platform

Cloudera

Cloudera and Accenture demonstrate strength in their relationship with an accelerator called the Smart Data Transition Toolkit for migration of legacy data warehouses into Cloudera Data Platform. Accenture’s Smart Data Transition Toolkit: Data warehousing is the backbone of every data-driven organization, providing mission-critical analytics. Today, modern data warehousing has evolved to meet the intensive demands of the newest analytics required for a business to be data driven.

Learner Spotlight: Gino Parages

Dataquest

Meet Gino Parages, a former sales and IT business analyst with no coding skills who decided it was time to learn coding to give his career a boost. He chose Dataquest to help him achieve his learning goals and land the job he wanted. Here’s his story… Q: First, what are your preferred pronouns? A: He/him Q: All right, Gino! What’s your current job title?

Data Quality + Data Lineage = ???

Datakin

Written by Peter Hicks on Sep 2, 2021. In a prior life, I dwelled in the day-to-day cycles of an e-commerce platform. I worked with a quite generalized system with orders, products, variants, SKUs, and customers that pined for every discount they could come by. The system built around the core business schema was the kind of chaos that data engineers are all too familiar with: large volumes of clickstream data, etl_warehouses, read replicas, and machine learning

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Terraform Databricks Labs

Advancing Analytics: Data Engineering

In late 2020, Databricks introduced Databricks Labs, a collection of Terraform Providers that gives you the ability to deploy nearly all Databricks resources onto Azure and Amazon Web Services (AWS) cloud platforms. This means you can deploy Databricks workspaces, clusters, secrets, libraries, notebooks, and automated jobs (and many more) at the time of provisioning the infrastructure, making it easy to manage and configure Databricks.

What is a Data Incident Commander?

Monte Carlo

Incident management isn’t just for software engineers. With the rise of data platforms and the data-as-a-product mentality, building more reliable processes and workflows to handle data quality has emerged as a top concern for data engineers. In a previous post, we discussed how to set up automatic detection and alerting for bad data; now, guest author Glen Willis shares how the best data teams handle triaging and severity assessment for your broken data pipelines with the help of an emerging r

How Rockset Enables SQL-Based Rollups for Streaming Data

Rockset

Until Now: The Slow Crawl from Batch to Real-Time Analytics. The world is moving from batch to real-time analytics, but it's been at a crawl. Apache Kafka has made acquiring real-time data more mainstream, but only a small sliver are turning batch analytics, run nightly, into real-time analytical dashboards with alerts and automatic anomaly detection.

Acquiring is Dead. Long Live Acquiring.

Teradata

Data-driven services can help merchant acquirers add value to their core capabilities. However, to succeed, they need to be armed with the necessary data governance capabilities & know-how.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

The Data Janitor Letters - July 2021

Pipeline Data Engineering

Data engineering salon. News and interesting reads about the world of data. Building a data team at a mid-stage startup: a short story (Erik Bernhardsson, Working on something, "Bernco"). The data culture is driven both from above (the CEO pushing for it) as well as from below (people in the trenches). It's OK to fail if at least you learned something from it.

50 ML Projects To Strengthen Your Portfolio and Get You Hired

ProjectPro

The most trusted way to learn and master the art of machine learning is to practice with hands-on projects. Projects help you build a strong foundation in various machine learning algorithms and strengthen your resume. As the saying goes, a journey of a thousand miles begins with a single step, so we present a guide to 50 first steps on your machine learning journey.

A Fresh Perspective on Monads: Generalizing Chained Computations

Rock the JVM

Explore a fresh perspective on monads: Discover new angles on this familiar concept with Rock the JVM

Build Your CFO Analytics Foundation

Teradata

A core finance foundation, supported by the right data management tools, creates a trusted, auditable, and traceable source of all things financial. Read more.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

A day in the life of a Technical Fellow

Eventbrite Engineering

In my two most recent blog posts, I talked about how to write a Long-Term Technical Vision and a Golden Path. These are future-looking and high-level artifacts, so the question I keep hearing is: do I need to give up coding to grow in my career and become a Technical Fellow? In this post I will …

Using Internal Mobility For Growth

Zalando Engineering

Long-time readers of this blog will remember that back in 2019, we published a feature on the benefits of rotating engineers between teams. For those of you who have not seen it, the article described an initiative that aimed to establish cross-functional knowledge sharing, encourage cross-team collaboration, and bring greater product awareness by providing engineers with an opportunity to work on different teams within our Developer Productivity department.

Replacing Segment Computed & SQL Traits With dbt & RudderStack Reverse ETL

RudderStack

Learn to use dbt & RudderStack Reverse ETL to leverage the power of your data warehouse to sync enriched users, audiences, and other data to downstream tools.

Send Form Data From Marketo to Multiple Destinations Using RudderStack

RudderStack

See how you can leverage RudderStack to easily track Marketo form submissions without disrupting Marketo or your marketing team.

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

Why Your Data Warehouse Should Be the Foundation of Your CDP

RudderStack

This post explores how RudderStack’s warehouse-first approach separates it from the traditional marketing CDP.

Using RudderStack To Power Your Machine Learning Models

RudderStack

This post explores three interesting ways you can use RudderStack to unlock the power of machine learning.