article thumbnail

Interesting startup idea: benchmarking cloud platform pricing

The Pragmatic Engineer

The current database includes 2,000 server types in 130 regions and 340 zones. Storing data: data collected is stored to allow for historical comparisons. Results are stored in git and their database, together with benchmarking metadata. Visualizing the data: the frontend that allows querying of live and historic data.

Cloud 326
article thumbnail

The Journey of a Senior Data Scientist and Machine Learning Engineer at Spice Money

Analytics Vidhya

Introduction Meet Tajinder, a seasoned Senior Data Scientist and ML Engineer who has excelled in the rapidly evolving field of data science. Tajinder’s passion for unraveling hidden patterns in complex datasets has driven impactful outcomes, transforming raw data into actionable intelligence.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

The Race For Data Quality in a Medallion Architecture

DataKitchen

It sounds great, but how do you prove the data is correct at each layer? How do you ensure data quality in every layer ? Bronze, Silver, and Gold – The Data Architecture Olympics? The Bronze layer is the initial landing zone for all incoming raw data, capturing it in its unprocessed, original form.

article thumbnail

How to get started with dbt

Christophe Blefari

In the ELT, the load is done before the transform part without any alteration of the data leaving the raw data ready to be transformed in the data warehouse. In a simple words dbt sits on top of your raw data to organise all your SQL queries that are defining your data assets.

article thumbnail

Data Integrity for AI: What’s Old is New Again

Precisely

The demand for higher data velocity, faster access and analysis of data as its created and modified without waiting for slow, time-consuming bulk movement, became critical to business agility. Which turned into data lakes and data lakehouses Poor data quality turned Hadoop into a data swamp, and what sounds better than a data swamp?

article thumbnail

Strobelight: A profiling service built on open source technology

Engineering at Meta

However, Strobelight has several safeguards in place to prevent users from causing performance degradation for the targeted workloads and retention issues for the databases Strobelight writes to. Strobelight also delays symbolization until after profiling and stores raw data to disk to prevent memory thrash on the host.

article thumbnail

The Journey of a Senior Data Scientist and Machine Learning Engineer in Fintech Domain

Analytics Vidhya

Introduction Meet Tajinder, a seasoned Senior Data Scientist and ML Engineer who has excelled in the rapidly evolving field of data science. Tajinder’s passion for unraveling hidden patterns in complex datasets has driven impactful outcomes, transforming raw data into actionable intelligence.