Sat. Mar 13, 2021 - Fri. Mar 19, 2021

Managing The DoorDash Data Platform

Data Engineering Podcast

Summary: The team at DoorDash has a complex set of optimization challenges to deal with, using data that they collect from a multi-sided marketplace. To handle the volume and variety of information that they use to run and improve the business, the data team has to build a platform that analysts and data scientists can use in a self-service manner.

Learning with LiveStreams: Cloud-Native Apache Kafka and Serverless Stream Processing

Confluent

LiveStreams is a YouTube show about Confluent, real-time data streaming, and related technologies that help you maximize data in motion on any cloud. Every episode of LiveStreams will teach you […].

Cloud 98

FRTB: Will 2023 Finally be the Year?

Cloudera

The Fundamental Review of the Trading Book (FRTB), introduced by the Basel Committee on Banking Supervision (BCBS), will transform how banks measure risk. FRTB is designed to address some fundamental weaknesses that did not get addressed in the post-2008 financial crisis regulatory reforms. In order to help make banks more resilient to drastic market changes, it will impose capital requirements that are more closely aligned with the market’s actual risk factors.

Banking 56

Semigroups and Monoids in Scala

Rock the JVM

This article provides a gentle introduction to Monoids and Semigroups, exploring functional abstractions and the workings of the Cats library.

Scala 52

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Data-driven performance improvements: Distance running and data

Retail Insight

As a former distance runner, I have seen first-hand how investment in elite sport is accelerating athletic performance. Just as the world has developed, our physical capabilities have too. New lifestyles, technologies, science, and world-class facilities all help to enable athletes to go ‘Faster – Higher – Stronger’, as the Olympic motto states.

Data 52

Securing the Infrastructure of Confluent with HashiCorp Vault

Confluent

In order for a technology like Confluent Cloud to make it easy to set data in motion, many different software systems are required to interact with each other using API […].

Cloud 92

More Trending

CFO Analytics – Build Your Foundation

Teradata

Without a strong foundation, any house will crumble. The same holds true for CFO Analytics – without a core foundational model, a strong analytics platform will be difficult to build.

Deep Learning vs Machine Learning - What's the Difference?

ProjectPro

“Machine Learning” and “Deep Learning” are two of the most often confused and conflated terms in the AI world, and they are frequently used interchangeably. However, one fact is undeniable: both machine learning and deep learning are undergoing skyrocketing growth. According to Forbes, the global machine learning market will be worth $30.6 billion by 2024 and the deep learning market size is expected to reach $10.2 billion by 2025, expanding at a CAGR of 42.8% and 52.1

Integrating Azure and Confluent: Ingesting Data to Azure Cosmos DB through Apache Kafka

Confluent

Building cross-platform solutions enables organizations to leverage technology driven by real-time data and enabled with both highly available services and low-latency databases hosted on Microsoft Azure. Azure Cosmos DB is […].

Kafka 75

Dogfooding your product

Grouparoo

“Eating your own dogfood” or “dogfooding” is a term that always felt a bit odd to me, but the principles underlying it are incredibly important to product teams small and large. In short, dogfooding means using your own product to better empathize with your users. When you build more empathy for your users, you build a better product. I’ll be sharing some thoughts on why dogfooding is important and some pointers on how to dogfood well.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Streaming Data Into Teradata Vantage Using AWS Glue Streaming ETL

Teradata

This guide describes the procedure to stream data into Teradata Vantage on AWS with AWS Glue Streaming ETL jobs and Amazon Kinesis, and to visualize the data with Amazon QuickSight.

AWS 52

Flexbox Layout Behavior in Jetpack Compose

Zalando Engineering

Introduction: The CSS Flexible Box Layout specification (AKA flexbox) is a useful abstraction for describing layouts in a platform-agnostic way. For this reason, it is widely used on the web and even on mobile. Readers familiar with ConstraintLayout can think of flexbox as conceptually similar to the Flow virtual layout it supports. This type of layout is ideal for grids or other groups of views with varying sizes.

Coding 52

N-Queens in Scala: How to Approach Algorithm Questions

Rock the JVM

Learn how to tackle a Google-style algorithm interview question in Scala using pure functional programming.

Scala 52

Don’t use underscores in your HTTP Headers

Grouparoo

Don’t use underscores in your HTTP headers, at least according to AWS and Nginx!

curl --HEADER "AUTH_TOKEN: abc" example.com # is bad
curl --HEADER "AUTH-TOKEN: abc" example.com # is OK

Grouparoo is a self-hosted application, and we are always helping folks run and deploy our service in new ways. Recently, we’ve been working on an example application for Amazon Web Services’ (AWS) Elastic Beanstalk service.
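
The underscore rule can be checked mechanically before a request is sent. The following is a minimal Python sketch and is not taken from the Grouparoo article: the helper names `find_underscore_headers` and `hyphenate_headers` are hypothetical, and the sketch simply assumes that any underscore in a header name is risky because some proxies and load balancers may drop such headers.

```python
from typing import Dict, List


def find_underscore_headers(headers: Dict[str, str]) -> List[str]:
    """Return the header names that contain an underscore."""
    return [name for name in headers if "_" in name]


def hyphenate_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the headers with underscores in names replaced by hyphens."""
    return {name.replace("_", "-"): value for name, value in headers.items()}


# Hypothetical request headers, mirroring the curl example.
headers = {"AUTH_TOKEN": "abc", "Content-Type": "application/json"}
risky = find_underscore_headers(headers)    # names that may be dropped in transit
safe_headers = hyphenate_headers(headers)   # same values, hyphenated names
```

Renaming a header only helps if the receiving application reads the hyphenated name, so a check like this is probably most useful as a lint step rather than as a silent rewrite.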

AWS 52

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Texas Health Resources

Teradata

Healthcare provider uses Teradata Vantage for Patient 360 and Connected Identity efforts, improving the health of their 7M community members.

10 NLP Techniques Every Data Scientist Should Know

ProjectPro

Natural language processing is a subfield of artificial intelligence that aims at making machines understand natural languages just like humans do. The power of NLP techniques dates back to the 1950s Turing Test (also known as the Imitation Game), a test to determine if a machine can be considered intelligent.

“A computer would deserve to be called intelligent if it could deceive a human into believing that it was human.” - Alan Turing

The Turing test serves as a milestone fo

[VIDEO] How Resident Drives Data Observability with Monte Carlo

Monte Carlo

For Resident, a collection of premium direct-to-consumer brands, Data Observability has become a part of the Data Engineering team’s tech stack, helping them prevent broken data pipelines and achieve highly reliable data — at scale. “I only use 3 tabs at work: Gmail, BigQuery, and Monte Carlo,” says Resident’s Head of Data Engineering, Daniel Rimon.

Unused Variables in Destructured Arrays

Grouparoo

Last week, when working on Grouparoo's OneSignal integration, I wrote a quick CLI script to create users in their system that we could use for testing. This script would take in a single argument, which would be the external_user_id on OneSignal:

$ ./create_user <external_user_id>

Parsing command-line arguments

To allow parsing command-line arguments, Node provides process.argv.

Coding 52

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Is There a Better Way to Drive Faster Business Value Without Creating More Technical Debt?

Teradata

Often, departments and project teams strive to solve their analytic needs independently. At first blush, it seems quicker that way. But inadvertently these efforts create technical debt.

Project 52

Optimize GraphQL Server with Lookaheads

Zalando Engineering

In our first post about How we use GraphQL at Zalando, we briefly shared performance optimizations using GraphQL-JIT. GraphQL-JIT allowed us to scale our implementation without performance degradations. In this post, we share another optimization we use: Lookaheads.

Same Model; Different Views

In our GraphQL service, we do not have resolvers for every single field in the schema.

Bytes 40

Data Observability in Practice: Data Monitoring at Scale with SQL and Machine Learning

Monte Carlo

Data pipelines can break for a million different reasons, but how can we have awesome data monitoring in place to ensure data quality issues are identified and addressed in real time, at scale? Sometimes, all it takes is a bit of SQL, some precision and recall, and a holistic approach to data observability. In this article, we walk through how you can create your own data monitors from scratch and leverage basic principles of machine learning to apply them at scale across your data pipelines.
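
Precision and recall are the two numbers typically used to judge whether a data monitor flags the right incidents. As a rough illustration (a sketch, not the method from the Monte Carlo article), the Python below scores a hypothetical monitor; the function name `precision_recall` and all dates are invented for the example.

```python
from typing import Set, Tuple


def precision_recall(flagged: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Score a monitor: precision = TP / flagged, recall = TP / actual incidents."""
    true_positives = len(flagged & actual)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical days the monitor alerted on, and days with a real data incident.
flagged_days = {"2021-03-02", "2021-03-09", "2021-03-15", "2021-03-17"}
incident_days = {"2021-03-09", "2021-03-15", "2021-03-18"}

precision, recall = precision_recall(flagged_days, incident_days)
```

Here two of the four alerts were real incidents (precision 0.5) and two of the three incidents were caught (recall about 0.67). Tightening a monitor's threshold usually raises precision at the cost of recall.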

Batching API requests

Grouparoo

One thing we can observe from a table of CRM rate limits is an inverse correlation between the "Enterprise-ness" of systems and their normalized limits. Systems like Pardot, Marketo, Eloqua, and Salesforce have daily limits, as opposed to the per-minute or per-second limits of the others. This encourages the use of their batching APIs. Batching in this context means performing API operations related to multiple people with one API request.
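
To make the effect of batching concrete, here is a minimal Python sketch that groups people into fixed-size batches and counts the requests needed. It is an illustration under stated assumptions, not Grouparoo's implementation: the helper names `batches` and `count_requests` are hypothetical, and the batch size of 100 is an assumed value, since real limits differ per CRM.

```python
from typing import Iterator, List, Sequence


def batches(people: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size people."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(people), batch_size):
        yield list(people[start:start + batch_size])


def count_requests(people: Sequence[str], batch_size: int) -> int:
    """Number of API requests needed when each request carries one batch."""
    return sum(1 for _ in batches(people, batch_size))


people = [f"person-{i}" for i in range(250)]
unbatched_requests = count_requests(people, 1)    # one request per person
batched_requests = count_requests(people, 100)    # assumed limit of 100 per request
```

With these assumed numbers, 250 people cost 250 requests unbatched but only 3 when batched, which matters most when the quota is counted per day.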

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Time Series Analysis Using Apache Druid and Superset

Preset

In this tutorial, we'll showcase how to ingest data on Wikipedia edits into Apache Druid. Then, we'll use a time-series chart to visualize the data in Superset.

Data 40

Elasticsearch or Rockset for Real-Time Analytics: Real-Time Ingestion and Indexing

Rockset

When working with a real-time analytics system you need your database to meet very specific requirements. This includes making the data available for query as soon as it is ingested, creating proper indexes on the data so that the query latency is very low, and much more. Before it can be ingested, there’s usually a data pipeline for transforming incoming data.

MongoDB 40

Women in the workplace: What’s possible now?

Cloudera

In celebration of International Women’s Day, on Wednesday, March 10, women from across Cloudera in America and EMEA came together with an external audience to debate and discuss the state of women. Our transatlantic panel, featuring Bianka Racz, Software Engineer; Christine Sherry, Director Critical Incident Response; Dominique Hollins, Director of Culture and Inclusion; and Tahmeena Tariq, Senior Corporate Counsel, came together to discuss how COVID has changed our expectations of employers, the wo

Git prune to remove reference to master

Grouparoo

TLDR: This command will fully get rid of the master branch after you've switched to using main.

git remote prune origin

What I was seeing was that git checkout master still referenced the existence of the old branch I had used. I needed to get rid of that old one so that it would be an error.

The switch

We have switched to using the main branch instead of the master branch on all our GitHub projects.

Project 52

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

You’re Not Realizing the Full Value of Your Company’s Data

Monte Carlo

You’re on Snowflake and Looker? Great. But for most companies, having a cloud data stack is just the tip of the iceberg when it comes to operationalizing their data and analytics at scale. We share five non-obvious roadblocks businesses face when becoming data driven and what some of the industry’s leading data engineering and analytics teams are doing to overcome them.