Indexing code at scale with Glean

Engineering at Meta

We're sharing details about Glean, Meta's open source system for collecting, deriving and working with facts about source code. In August 2021 we open-sourced our code indexing system Glean. Glean collects information about source code and provides it to developer tools through an efficient and flexible query language.

Coding 55
Implementing the Netflix Media Database

Netflix Tech

In the previous blog posts in this series, we introduced the Netflix Media DataBase (NMDB) and its salient “Media Document” data model. A fundamental requirement for any lasting data system is that it should scale along with the growth of the business applications it wishes to serve. This is depicted in Figure 1.

Media 97
Improving Meta’s global maps

Engineering at Meta

We want our maps to be living documents that adapt to the needs of the people who use our apps, all while keeping up to date with data sources and trends in cartographic design. This new data schema was born partly out of our cartographic tiling logic, and it includes everything necessary to make a map of the world.

AWS Glue-Unleashing the Power of Serverless ETL Effortlessly

ProjectPro

Application programming interfaces (APIs) are used to modify the retrieved data set for integration and to support users in keeping track of all the jobs. When Glue receives a trigger, it collects the data, transforms it using code that Glue generates automatically, and then loads it into Amazon S3 or Amazon Redshift.

AWS 98
Open-sourcing Polynote: an IDE-inspired polyglot notebook

Netflix Tech

Visibility: The Polynote UI provides at-a-glance insights into the state of the kernel by showing kernel status, highlighting currently running cell code, and showing currently executing tasks. A notebook execution is a record of a particular piece of code, run at a particular point in time, in a particular environment.

Scala 94
Snowflake Startup Spotlight: TDAA!

Snowflake

Processing complex, schema-less, semistructured, hierarchical data can be extremely time-consuming, costly and error-prone, particularly if the data source has polymorphic attributes. For many data sources, the schema of the data source can change without warning.

Streaming Data from the Universe with Apache Kafka

Confluent

Much of the code used by modern astronomers is written in Python, so the ZTF alert distribution system endpoints need to at least support Python. We built our alert distribution code in Python, based around Confluent’s Python client for Apache Kafka. Alert data pipeline and system design.

Kafka 102