Remove Blog Remove Cloud Storage Remove Metadata
article thumbnail

Build an Open Data Lakehouse with Iceberg Tables, Now in Public Preview

Snowflake

With this public preview, those external catalog options are either “GLUE”, where Snowflake can retrieve table metadata snapshots from AWS Glue Data Catalog, or “OBJECT_STORE”, where Snowflake retrieves metadata snapshots directly from the specified cloud storage location. Now, Snowflake can make changes to the table.

Building 100
article thumbnail

Cloudera Data Warehouse outperforms Azure HDInsight in TPC-DS benchmark

Cloudera

Performance is one of the key, if not the most important deciding criterion, in choosing a Cloud Data Warehouse service. In this blog post, we compare Cloudera Data Warehouse (CDW) on Cloudera Data Platform (CDP) using Apache Hive-LLAP to Microsoft HDInsight (also powered by Apache Hive-LLAP) on Azure using the TPC-DS 2.9

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Netflix Cloud Packaging in the Terabyte Era

Netflix Tech

Our previous tech blog Packaging award-winning shows with award-winning technology detailed our packaging technology deployed on the streaming side. Table 1: Movie and File Size Examples Initial Architecture A simplified view of our initial cloud video processing pipeline is illustrated in the following diagram.

Cloud 96
article thumbnail

Cost Conscious Data Warehousing with Cloudera Data Platform

Cloudera

With the separation of compute and storage, CDW engines leverage newer techniques such as compute-only scaling and efficient caching of shared data. These techniques range from distributing concurrent query for overall throughput to metadata caching, data caching, and results caching. 2,300 / month for the cloud hardware costs.

article thumbnail

Carbon Hack 24: Leveraging the Impact Framework to Estimate the Carbon Cost of Cloud Storage by Matt Griffin

Scott Logic

This blog post serves as a dev diary of the process, covering our challenges, contributions made and attempts to validate them. We started to consider breaking the components down into different plugins, which could be used for more than just cloud storage.

article thumbnail

Cloudera Data Platform extends Hybrid Cloud vision support by supporting Google Cloud

Cloudera

Customers who have chosen Google Cloud as their cloud platform can now use CDP Public Cloud to create secure governed data lakes in their own cloud accounts and deliver security, compliance and metadata management across multiple compute clusters. A provisioning Service Account with these roles assigned.

article thumbnail

Migrate Hive data from CDH to CDP public cloud

Cloudera

Replication Manager can be used to migrate Apache Hive, Apache Impala, and HDFS objects from CDH clusters to CDP Public Cloud clusters. This blog post outlines detailed step by step instructions to perform Hive Replication from an on-prem CDH cluster to a CDP Public Cloud Data Lake. This blog post is not a substitute for that.

Cloud 73