Fri. Jul 26, 2024

How to implement data quality checks with greatexpectations

Start Data Engineering

1. Introduction 2. Project overview 3. Check your data before making it available to end-users; Write-Audit-Publish (WAP) pattern 4. TL;DR: How the greatexpectations library works 4.1. greatexpectations quick setup 5. From an implementation perspective, there are four types of tests 5.1. Running checks on one dataset 5.2. Checks involving the current dataset and its historical data 5.3.

Bayesian Thinking in Modern Data Science

KDnuggets

Discover how Bayesian thinking transforms decision-making with its unique approach to updating initial beliefs with new evidence.

Data News — Week 24.30

Christophe Blefari

Dear members, it's Summer Data News, the only news you can consume by the pool, at the beach, or at the office if you're not lucky. This week, I'm writing from the Baltics, nomading a bit in Eastern and Northern Europe. I'm pleased to announce that we have successfully closed the CfP for Forward Data Conf. We received nearly 100 submissions, and the program committee is currently reviewing them.

A Framework for Multi-Model Forecasting on Databricks

databricks

Time series forecasting serves as the foundation for inventory and demand management in most enterprises. Using data from past periods along with.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Why the Newest LLMs use a MoE (Mixture of Experts) Architecture

KDnuggets

When it comes to AI, every expert in an MoE model specializes in one part of a much larger problem, just as every doctor specializes in their own medical field. This improves efficiency and increases system efficacy and accuracy.

Odin: Uber’s Stateful Platform

Uber Engineering

Explore Odin, Uber’s stateful platform for managing all types of databases. It is a technology-agnostic, intent-based system that has dramatically improved the operational throughput of underlying hosts and databases company-wide.

More Trending

Pickup in 3 minutes: Uber’s implementation of Live Activity on iOS

Uber Engineering

From WWDC reveal to delivery, discover how we tackled new tech, design challenges, and tight timelines to enhance rider & driver experiences with Live Activity® from Apple.

Mainframe History: How Mainframe Computers Have Changed Over the Years

Precisely

Mainframes have one of the longest histories of any kind of computing technology that is still used today. In the fast-evolving landscape of technology, few innovations have left as profound a mark as mainframe computers. From their inception to the sophisticated systems of today, mainframes have continuously adapted to meet the ever-growing demands of business operations.

The Engineering Behind Booking.com’s Ranking Platform | A System Overview

Booking.com Engineering

Booking.com employs sophisticated ranking to optimize search results for each user. The system uses advanced machine learning algorithms and leverages extensive data, including user behavior, preferences, and past interactions, to tailor hotel listings and travel recommendations.
