Introducing the New SQL Editor
databricks
OCTOBER 14, 2024
Over the last few years, we've seen tremendous growth and adoption of Databricks SQL , our intelligent data warehouse purpose-built on the Data.
This site uses cookies to improve your experience. By viewing our content, you are accepting the use of cookies. To help us insure we adhere to various privacy regulations, please select your country/region of residence. If you do not select a country we will assume you are from the United States. View our privacy policy and terms of use.
databricks
OCTOBER 14, 2024
Over the last few years, we've seen tremendous growth and adoption of Databricks SQL , our intelligent data warehouse purpose-built on the Data.
Cloudera
JULY 16, 2021
Did you know Cloudera customers, such as SMG and Geisinger , offloaded their legacy DW environment to Cloudera Data Warehouse (CDW) to take advantage of CDW’s modern architecture and best-in-class performance? Today, we are pleased to announce the general availability of HPL/SQL integration in CDW public cloud.
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Data Engineering Podcast
FEBRUARY 4, 2024
Summary Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.
Striim
NOVEMBER 19, 2024
SQL2Fabric Mirroring is a new fully managed service offered by Striim to mirror on premise SQL Databases. It’s a collaborative service between Striim and Microsoft based on Fabric Open Mirroring that enables real-time data replication from on-premise SQL Server databases to Azure Fabric OneLake. Striim automates the rest.
Hevo
MAY 17, 2024
Snowflake Data Warehouse delivers essential infrastructure for handling a Data Lake, and Data Warehouse needs. It can store semi-structured and structured data in one place due to its multi-clusters architecture that allows users to independently query data using SQL.
Snowflake
NOVEMBER 2, 2023
Over the years, the technology landscape for data management has given rise to various architecture patterns, each thoughtfully designed to cater to specific use cases and requirements. These patterns include both centralized storage patterns like data warehouse , data lake and data lakehouse , and distributed patterns such as data mesh.
Knowledge Hut
APRIL 23, 2024
In the modern data-driven landscape, organizations continuously explore avenues to derive meaningful insights from the immense volume of information available. Two popular approaches that have emerged in recent years are data warehouse and big data. Data warehousing offers several advantages.
dbt Developer Hub
NOVEMBER 14, 2021
I’ve used the dateadd SQL function thousands of times. I’ve googled the syntax of the dateadd SQL function all of those times except one, when I decided to hit the "are you feeling lucky" button and go for it. What is the DATEADD SQL Function? This allows you to add or subtract a certain period of time from a given start date.
Towards Data Science
JUNE 6, 2023
Fitbit activity analysis with DuckDB Photo by Jake Hills on Unsplash Wearable fitness trackers have become an integral part of our lives, collecting and tracking data about our daily activities, sleep patterns, location, heart rate, and much more. What insights are buried within my archive of personal fitness activity data?
databricks
JANUARY 31, 2024
Welcome to the blog series covering product advancements in 2023 for Databricks SQL, the serverless data warehouse from Databricks. This is part 2.
Data Engineering Podcast
MAY 13, 2021
Summary There is a lot of attention on the database market and cloud data warehouses. While they provide a measure of convenience, they also require you to sacrifice a certain amount of control over your data. Firebolt is the fastest cloud data warehouse. Visit dataengineeringpodcast.com/firebolt to get started.
Analytics Vidhya
MARCH 14, 2023
This results in the generation of so much data daily. This generated data is stored in the database and will maintain it. SQL is a structured query language used to read and write these databases. Introduction In today’s world, technology has increased tremendously, and many people are using the internet.
Analytics Vidhya
FEBRUARY 27, 2023
Introduction Data is the new oil in this century. The database is the major element of a data science project. So, we are […] The post How to Normalize Relational Databases With SQL Code? So, we are […] The post How to Normalize Relational Databases With SQL Code? appeared first on Analytics Vidhya.
Cloudera
JULY 11, 2023
Apache Impala and Apache Kudu make a great combination for real-time analytics on streaming data for time series and real-time data warehousing use cases. Apache Impala is a distributed C++ backed SQL engine that integrates with Kudu to serve BI results over millions of rows meeting sub-second service-level agreements.
Data Engineering Podcast
FEBRUARY 8, 2021
Summary With all of the tools and services available for building a data platform it can be difficult to separate the signal from the noise. In this episode Zeeshan Qureshi and Michelle Ark share their experiences using DBT to manage the data warehouse for Shopify.
Christophe Blefari
MARCH 15, 2024
With yato you give a folder with SQL queries and it guesses the DAG and runs the queries in the right order. Saying mainly that " Sora is a tool to extend creativity " Last point Mira has been mocked and criticised online because as a CTO she wasn't able to say on which public / licensed data Sora has been trained on.
Data Engineering Podcast
JANUARY 1, 2022
Summary Communication and shared context are the hardest part of any data system. In recent years the focus has been on data catalogs as the means for documenting data assets, but those introduce a secondary system of record in order to find the necessary information.
Data Engineering Podcast
MAY 1, 2022
Summary The predominant pattern for data integration in the cloud has become extract, load, and then transform or ELT. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform.
Data Engineering Podcast
JUNE 25, 2023
Summary Data transformation is a key activity for all of the organizational roles that interact with data. Because of its importance and outsized impact on what is possible for downstream data consumers it is critical that everyone is able to collaborate seamlessly. Can you describe what SQLMesh is and the story behind it?
Data Engineering Podcast
FEBRUARY 18, 2024
Summary A data lakehouse is intended to combine the benefits of data lakes (cost effective, scalable storage and compute) and data warehouses (user friendly SQL interface). Data lakes are notoriously complex. Join in with the event for the global data community, Data Council Austin.
Cloudera
SEPTEMBER 24, 2020
Some of the most powerful results come from combining complementary superpowers, and the “dynamic duo” of Apache Hive LLAP and Apache Impala, both included in Cloudera Data Warehouse , is further evidence of this. Both Impala and Hive can operate at an unprecedented and massive scale, with many petabytes of data.
Monte Carlo
AUGUST 25, 2023
Modern companies are ingesting, storing, transforming, and leveraging more data to drive more decision-making than ever before. Data teams need to balance the need for robust, powerful data platforms with increasing scrutiny on costs. But, the options for data storage are evolving quickly. Let’s dive in.
Data Engineering Podcast
OCTOBER 14, 2019
Summary Managing a data warehouse can be challenging, especially when trying to maintain a common set of patterns. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform.
Data Science Blog: Data Engineering
SEPTEMBER 19, 2023
In the contemporary age of Big Data, Data Warehouse Systems and Data Science Analytics Infrastructures have become an essential component for organizations to store, analyze, and make data-driven decisions. So why using IaC for Cloud Data Infrastructures?
Cloudera
JANUARY 15, 2021
Cloud data warehouses allow users to run analytic workloads with greater agility, better isolation and scale, and lower administrative overhead than ever before. The results demonstrate superior price performance of Cloudera Data Warehouse on the full set of 99 queries from the TPC-DS benchmark. Introduction. higher cost.
Data Engineering Podcast
DECEMBER 8, 2019
Summary Data warehouses have gone through many transformations, from standard relational databases on powerful hardware, to column oriented storage engines, to the current generation of cloud-native analytical engines. If you are evaluating your options for building or migrating a data platform, then this is definitely worth a listen.
Monte Carlo
JANUARY 25, 2023
Over the past year, data contracts have taken the data world by storm as a novel approach to ensuring data quality at scale in production services. In the last couple of posts , I’ve focused on implementing data contracts in production services. Where to start implementing data contracts?
Data Engineering Podcast
OCTOBER 28, 2021
Summary Streaming data systems have been growing more capable and flexible over the past few years. He also explains why he started Decodable to address that limitation and the work that he and his team have done to let data engineers build streaming pipelines entirely in SQL. Missing data? Stale dashboards?
Data Engineering Podcast
AUGUST 31, 2020
Summary Data warehouse technology has been around for decades and has gone through several generational shifts in that time. The current trends in data warehousing are oriented around cloud native architectures that take advantage of dynamic scaling and the separation of compute and storage.
Cloudera
APRIL 3, 2023
Cloudera Contributors: Ayush Saxena, Tamas Mate, Simhadri Govindappa Since we announced the general availability of Apache Iceberg in Cloudera Data Platform (CDP), we are excited to see customers testing their analytic workloads on Iceberg. We will publish follow up blogs for other data services. To compact the tables use CDE Spark.
databricks
AUGUST 10, 2023
At this year's Data+AI Summit, Databricks SQL continued to push the boundaries of what a data warehouse can be, leveraging AI across the.
Monte Carlo
FEBRUARY 6, 2023
So, you’re planning a cloud data warehouse migration. But be warned, a warehouse migration isn’t for the faint of heart. As you probably already know if you’re reading this, a data warehouse migration is the process of moving data from one warehouse to another. A worthy quest to be sure.
Data Engineering Podcast
MAY 27, 2021
Summary The data warehouse has become the focal point of the modern data platform. With increased usage of data across businesses, and a diversity of locations and environments where data needs to be managed, the warehouse engine needs to be fast and easy to manage.
Start Data Engineering
JANUARY 16, 2021
Introduction Setup Code Conditional logic to read from mock input Custom macro to test for equality Setup environment specific test Run ELT using dbt Conclusion Further reading Introduction With the recent advancements in data warehouses and tools like dbt most transformations(T of ELT) are being done directly in the data warehouse.
Seattle Data Guy
MAY 26, 2024
Many data engineers and analysts don’t realize how valuable the knowledge they have is. They’ve spent hours upon hours learning SQL, Python, how to properly analyze data, build data warehouses, and understand the differences between eight different ETL solutions.
Data Engineering Podcast
JULY 8, 2019
Summary The market for data warehouse platforms is large and varied, with options for every use case. It was interesting to learn about some of the custom data types and performance optimizations that are included. Integrating data across the enterprise has been around for decades – so have the techniques to do it.
Netflix Tech
OCTOBER 27, 2020
By Tianlong Chen and Ioannis Papapanagiotou Netflix has more than 195 million subscribers that generate petabytes of data everyday. Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy.
Christophe Blefari
MARCH 1, 2023
dbt Core is an open-source framework that helps you organise data warehouse SQL transformation. dbt was born out of the analysis that more and more companies were switching from on-premise Hadoop data infrastructure to cloud data warehouses. This switch has been lead by modern data stack vision.
Monte Carlo
APRIL 16, 2022
The data warehouse is the foundation of the modern data stack, so it caught our attention when we saw Convoy head of data Chad Sanderson declare, “ the data warehouse is broken ” on LinkedIn. Treating data like an API. Immutable data warehouses have challenges too.
Hevo
MAY 9, 2024
Apache Hive is a Data Warehouse system that facilitates writing, reading, and manipulating large datasets residing across distributed storage using SQL. SQL (Structured Query Language) is a querying language that is used to perform various operations on the records stored in a database.
Seattle Data Guy
FEBRUARY 10, 2024
Photo by Tiger Lily Data warehouses and data lakes play a crucial role for many businesses. It gives businesses access to the data from all of their various systems. As well as often integrating data so that end-users can answer business critical questions.
Cloudera
MARCH 2, 2022
Similarly, with better usage of available memory more users can query the data at any given time, so more people can use the warehouse at the same time. This post explains the novel technique for how Impala, offered within the Cloudera Data Platform (CDP), is now able to get much more mileage out of the memory at its disposal.
Monte Carlo
AUGUST 30, 2024
It’s our goal at Monte Carlo to provide data observability and quality across the enterprise by monitoring every system vital in the delivery of data from source to consumption. We started with popular modern data warehouses and quickly expanded our support as data lakes became data lakehouses.
Data Engineering Podcast
JANUARY 21, 2024
A substantial amount of the data that is being managed in these systems is related to customers and their interactions with an organization. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. Data projects are notoriously complex.
Expert insights. Personalized for you.
We have resent the email to
Are you sure you want to cancel your subscriptions?
Let's personalize your content