Remove Blog Remove Bytes Remove Metadata
article thumbnail

Introducing Netflix’s Key-Value Data Abstraction Layer

Netflix Tech

The first level is a hashed string ID (the primary key), and the second level is a sorted map of a key-value pair of bytes. Chunked data can be written by staging chunks and then committing them with appropriate metadata (e.g. This model supports both simple and complex data models, balancing flexibility and efficiency.

Bytes 99
article thumbnail

Open-Sourcing AvroTensorDataset: A Performant TensorFlow Dataset For Processing Avro Data

LinkedIn Engineering

In this blog post, we will discuss the AvroTensorDataset API, techniques we used to improve data processing speeds by up to 162x over existing solutions (thereby decreasing overall training time by up to 66%), and performance results from benchmarks and production. an array within a map, within a union, etc…). Default is 128 * 1024 (128KB).

Datasets 102
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Launching the Engineering Blog

Zalando Engineering

Our Engineering Blog was launched in June 2020 after a long break of the previous tech blog. What customizations we applied to design the blog and the publishing process. Static Site Generator Our previous tech blog used a CMS which only a limited number of people had access to. So which static site generator to choose?

article thumbnail

Improving Efficiency Of Goku Time Series Database at Pinterest (Part?—?1)

Pinterest Engineering

In the first blog, we will share a short summary on the GokuS and GokuL architecture, data format for Goku Long Term, and how we improved the bootstrap time for our storage and serving components. More information about the architecture can be found in the GokuL blog and the cost reduction blog.

Database 108
article thumbnail

Introducing Netflix TimeSeries Data Abstraction Layer

Netflix Tech

In previous blog posts, we introduced the Key-Value Data Abstraction Layer and the Data Gateway Platform , both of which are integral to Netflix’s data architecture. Metadata table : This table stores information about how each time slice is configured per namespace.

Bytes 94
article thumbnail

Netflix Cloud Packaging in the Terabyte Era

Netflix Tech

Our previous tech blog Packaging award-winning shows with award-winning technology detailed our packaging technology deployed on the streaming side. The inspection stage examines the input media for compliance with Netflix’s delivery specifications and generates rich metadata.

Cloud 96
article thumbnail

Optimization Strategies for Iceberg Tables

Cloudera

This blog discusses a few problems that you might encounter with Iceberg tables and offers strategies on how to optimize them in each of those scenarios. A bloated metadata.json file could increase both read/write times because a large metadata file needs to be read/written every time. Iceberg doesn’t delete the old data files.

Bytes 59