Build an Open Data Lakehouse with Iceberg Tables, Now in Public Preview

Snowflake

With this public preview, those external catalog options are either “GLUE”, where Snowflake can retrieve table metadata snapshots from AWS Glue Data Catalog, or “OBJECT_STORE”, where Snowflake retrieves metadata snapshots directly from the specified cloud storage location. Now, Snowflake can make changes to the table.

Why Open Table Format Architecture is Essential for Modern Data Systems

phData: Data Engineering

Open Table Format (OTF) architecture now provides a solution for efficient data storage, management, and processing while ensuring compatibility across different platforms. In this blog, we will discuss: what the Open Table Format (OTF) is, why we should use it, a brief history of OTF, and a comparative study of the major OTFs.

Cloudera Data Warehouse outperforms Azure HDInsight in TPC-DS benchmark

Cloudera

Performance is one of the key criteria, if not the most important one, in choosing a Cloud Data Warehouse service. In this blog post, we compare Cloudera Data Warehouse (CDW) on Cloudera Data Platform (CDP) using Apache Hive-LLAP to Microsoft HDInsight (also powered by Apache Hive-LLAP) on Azure using the TPC-DS 2.9

Netflix Cloud Packaging in the Terabyte Era

Netflix Tech

Our previous tech blog, "Packaging award-winning shows with award-winning technology", detailed our packaging technology deployed on the streaming side. (Table 1: Movie and File Size Examples.) Initial Architecture: a simplified view of our initial cloud video processing pipeline is illustrated in the following diagram.

Cost Conscious Data Warehousing with Cloudera Data Platform

Cloudera

With the separation of compute and storage, CDW engines leverage newer techniques such as compute-only scaling and efficient caching of shared data. These techniques range from distributing concurrent queries for overall throughput to metadata caching, data caching, and results caching. 2,300 / month for the cloud hardware costs.

Carbon Hack 24: Leveraging the Impact Framework to Estimate the Carbon Cost of Cloud Storage by Matt Griffin

Scott Logic

This blog post serves as a dev diary of the process, covering our challenges, contributions made and attempts to validate them. We started to consider breaking the components down into different plugins, which could be used for more than just cloud storage.

Cloudera Data Platform extends Hybrid Cloud vision support by supporting Google Cloud

Cloudera

Customers who have chosen Google Cloud as their cloud platform can now use CDP Public Cloud to create secure governed data lakes in their own cloud accounts and deliver security, compliance and metadata management across multiple compute clusters. A provisioning Service Account with these roles assigned.