Sat.Jul 25, 2020 - Fri.Jul 31, 2020

Ensuring Data Quality, With Great Expectations

Start Data Engineering

What is data quality? As the name suggests, it refers to the quality of our data. Quality should be defined based on your project requirements. It can range from simple checks, such as ensuring that a column contains only allowed values or falls within a given range, to more complex cases, such as requiring that a column match a specific regex pattern or fall within a standard deviation range.

Data 130
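
The four kinds of checks named in the data quality teaser above (allowed values, value range, regex match, standard deviation range) can be sketched in plain Python. This is a minimal illustration using only the standard library, not the Great Expectations API; the function names and the choice to return the offending values are assumptions made for this example.

```python
import re
import statistics


def not_in_allowed_set(values, allowed):
    """Return the values that are not members of the allowed set."""
    return [v for v in values if v not in allowed]


def outside_range(values, low, high):
    """Return the values that fall outside the inclusive range [low, high]."""
    return [v for v in values if not (low <= v <= high)]


def not_matching_regex(values, pattern):
    """Return the string values that do not fully match the regex pattern."""
    compiled = re.compile(pattern)
    return [v for v in values if compiled.fullmatch(v) is None]


def outside_std_range(values, max_std):
    """Return the values more than max_std population standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mean) > max_std * std]
```

Each function returns the list of offending values, so an empty list means the column passed that check. A dedicated tool would typically add reporting and configuration on top of checks like these.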
Build More Reliable Distributed Systems By Breaking Them With Jepsen

Data Engineering Podcast

Summary: A majority of the scalable data processing platforms that we rely on are built as distributed systems. This brings with it a vast number of subtle ways that errors can creep in. Kyle Kingsbury created the Jepsen framework for testing the guarantees of distributed data processing systems and identifying when and why they break. In this episode he shares his approach to testing complex systems, the common challenges that are faced by engineers who build them, and why it is important to und

Systems 100
I’ve Got the Key, I’ve Got the Secret. Here’s How Keys Work in ksqlDB 0.10.

Confluent

ksqlDB 0.10 includes significant changes and improvements to how keys are handled. This is part of a series of enhancements that began with support for non-VARCHAR keys and will ultimately […].

Process 122
Advancing the Telecom Industry through Network Experience Analytics

Teradata

For today's Telco providers, new products and services are all driven by the end consumer's experience. That's where Teradata's Network Experience Analytics comes into play.

76
15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Unbundling Data Science Workflows with Metaflow and AWS Step Functions

Netflix Tech

by David Berg, Ravi Kiran Chirravuri, Romain Cledat, Jason Ge, Savin Goyal, Ferras Hamad, Ville Tuulos Continue reading on Netflix TechBlog ».

AWS 60
Data Pipelines in the Healthcare Industry

DareData

The Challenges of Medical Data: In recent times, there have been several developments in applications of machine learning to the medical industry. We have heard news of machine learning systems outperforming seasoned physicians on diagnosis accuracy, chatbots that present recommendations depending on your symptoms, or algorithms that can identify body parts from transversal image slices, just to name a few.

More Trending

Streaming Data Into Teradata Vantage Using Amazon Managed Kafka (MSK) Data Streams and AWS Glue Streaming ETL

Teradata

In this post, we provide step-by-step instructions on how to set up Vantage & author AWS Glue Streaming ETL jobs to stream data into Vantage from Amazon MSK and visualize the data.

AWS 52
Performance Isolation for Your Primary MongoDB Cluster

Rockset

Database performance is a critical aspect of ensuring a web application or service remains fast and stable. As the service scales up, there are often challenges with scaling the primary database along with it. While MongoDB is often used as a primary online database and can meet the demands of very large scale web applications, it does often become the bottleneck as well.

MongoDB 40
The Differences Between Null, Nothing, Nil, None, and Unit in Scala

Rock the JVM

Discover the different flavors of 'nothing-ness' in Scala and how they impact your code

Scala 52
How PushOwl Uses ksqlDB to Scale Their Analytics and Reporting Use Cases

Confluent

Using a declarative SQL-like interface, ksqlDB makes it easy to integrate event streaming applications into any tech stack. This article illustrates how ksqlDB was added to PushOwl’s Python tech stack, […].

SQL 99
Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Why Teradata Has Never Been Afraid of High Demand

Teradata

Teradata's Advanced SQL Engine can take on more work concurrently than competitors, all while continuing to deliver high throughput under high stress. Learn more.

SQL 52
How To Build A Live-Updating COVID Dashboard Using Google Sheets and Apache Superset

Preset

The powerful combination of Google Sheets and Apache Superset

Quick Reports: Xero to Power BI

FreshBI

The objective of this blog: To give you the tools and the skills to connect to Xero Accounting from the Power BI Desktop and to have immediate access to the categorized data that drives each of the built-in reports in Xero. What you need to get started: To get immediate access to the data that drives the Xero Reports and push it into Power BI, you'll need 3 tools: Power BI Desktop, the 'Quick Reports' Power BI Custom Connector for Xero, AND Power BI Quick Reports Templ

BI 52
What is a Data Mesh — and How Not to Mesh it Up

Monte Carlo

Updated: January 2023. Ask anyone in the data industry what's hot these days and chances are "data mesh" will rise to the top of the list. But what is a data mesh and why should you build one? Inquiring minds want to know. In the age of self-service business intelligence, nearly every company considers itself a data-first company, but not every company is treating its data architecture with the level of democratization and scalability it deserves.

IT 45
How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.