article thumbnail

Interesting startup idea: benchmarking cloud platform pricing

The Pragmatic Engineer

Python, Angular, SSR, SQLite, DuckDB, Cockroach DB, and many others. Results are stored in git and their database, together with benchmarking metadata. Benchmarking results for each instance type are stored in sc-inspector-data repo, together with the benchmarking task hash and other metadata.  There Tech stack.

Cloud 326
article thumbnail

Level Up Your Data Platform With Active Metadata

Data Engineering Podcast

Summary Metadata is the lifeblood of your data platform, providing information about what is happening in your systems. In order to level up their value a new trend of active metadata is being implemented, allowing use cases like keeping BI reports up to date, auto-scaling your warehouses, and automated data governance.

Metadata 130
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

How Meta understands data at scale

Engineering at Meta

Understanding DataSchema requires grasping schematization , which defines the logical structure and relationships of data assets, specifying field names, types, metadata, and policies. JSON) into fields and sub-fields, and extracting features using APIs available in multiple languages (C++, Python, Hack).

article thumbnail

Metadata Management And Integration At LinkedIn With DataHub

Data Engineering Podcast

The key to those solutions is a robust and flexible metadata management system. LinkedIn has gone through several iterations on the most maintainable and scalable approach to metadata, leading them to their current work on DataHub. What were you using at LinkedIn for metadata management prior to the introduction of DataHub?

Metadata 100
article thumbnail

How to Create a Vocabulary for NLP Tasks in Python

KDnuggets

This post will walkthrough a Python implementation of a vocabulary class for storing processed text data and related metadata in a manner useful for subsequently performing NLP tasks.

Python 123
article thumbnail

Eliminate Friction In Your Data Platform Through Unified Metadata Using OpenMetadata

Data Engineering Podcast

Summary A significant source of friction and wasted effort in building and integrating data management systems is the fragmentation of metadata across various tools. After experiencing the impacts of fragmented metadata and previous attempts at building a solution Suresh Srinivas and Sriharsha Chintalapani created the OpenMetadata project.

Metadata 100
article thumbnail

Data News — Week 24.11

Christophe Blefari

yato, is a small Python library that I've developed, yato stands for yet another transformation orchestrator. Attributing Snowflake cost to whom it belongs — Fernando gives ideas about metadata management to attribute better Snowflake cost. This is Croissant.

Metadata 272