Tue. Nov 05, 2024

Announcing the General Availability of Materialized Views and Streaming Tables for Databricks SQL

databricks

We’re excited to announce that materialized views (MVs) and streaming tables (STs) are now Generally Available in Databricks SQL on AWS and Azure.

7 Python Projects to Boost Your Data Science Portfolio

KDnuggets

Enhance your data science portfolio with these seven engaging Python projects that demonstrate essential programming and software engineering skills.

The “Gold-Rush Paradox” in Data: Why Your KPIs Need a Rethink

Towards Data Science

You’re not doing as good a job as you think you are.

Learn Python and get Certified as a Data Analyst for Free this Week!

KDnuggets

From the 4th of November to the 10th of November, the entire DataCamp platform is free.

Apache Airflow® 101 Essential Tips for Beginners

Apache Airflow® is the open-source standard to manage workflows as code. It is a versatile tool used in companies across the world from agile startups to tech giants to flagship enterprises across all industries. Due to its widespread adoption, Airflow knowledge is paramount to success in the field of data engineering.

The Race For Data Quality in a Medallion Architecture

DataKitchen

The Medallion architecture pattern is gaining traction among data teams. It is a layered design pattern that helps data teams organize data processing and storage into three distinct layers, often called Bronze, Silver, and Gold.

More Trending

BI-as-Code and the New Era of GenBI

Simon Späti

Imagine creating business dashboards by simply describing what you want to see. No more clicking through complex interfaces or writing SQL queries: just have a conversation with AI about your data needs. This is the promise of Generative Business Intelligence (GenBI). At its core, GenBI delivers an unreasonably effective human interface, where we iterate quickly, based on BI-as-Code.

Turbocharging Atlas: How we reduced server initialization time to less than 2 minutes

ThoughtSpot

ThoughtSpot prioritizes the high availability and minimal downtime of our systems to ensure a seamless user experience. In the realm of modern analytics platforms, where rapid and efficient processing of large datasets is essential, swift metadata access and management are critical for optimal system performance. Any delays in metadata retrieval can negatively impact user experience, resulting in decreased productivity and satisfaction.

Discover the Future of Data Streaming with Confluent at AWS re:Invent 2024

Confluent

Join Confluent at AWS re:Invent 2024 to learn how to stream, connect, process, and govern data, unlocking its full potential. Visit our booth for demos, sessions, and more.

2025 Planning Insights: Data Quality Remains the Top Data Integrity Challenge and Priority

Precisely

Key Takeaways: Data quality is the top challenge impacting data integrity – cited as such by 64% of organizations. Data trust is impacted by data quality issues, with 67% of organizations saying they don’t completely trust their data used for decision-making. Data quality is the top data integrity priority in 2024, cited by 60% of respondents. The 2025 Outlook: Data Integrity Trends and Insights report is here!

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Splitting Large CSV Files in Snowflake Using Snowpark

Cloudyard

In data engineering, we often encounter large files that need to be processed in chunks. Using Snowflake’s Snowpark, you can split a large CSV file into smaller parts and handle each as needed. However, while Snowpark provides powerful in-database processing capabilities, splitting files this way may not be the most efficient method in production environments.

Loading data into Redshift with DBT

Yelp Engineering

At Yelp, we embrace innovation and thrive on exploring new possibilities. With our consumers’ ever-growing appetite for data, we recently revisited how we could load data into Redshift more efficiently. In this blog post, we explore how DBT can be used seamlessly with Redshift Spectrum to read data from the Data Lake into Redshift to significantly reduce runtime, resolve data quality issues, and improve developer productivity.

What’s New in ArcGIS Knowledge 11.4 (Q4 2024)

ArcGIS

Learn about exciting new features in ArcGIS Knowledge with the 11.4 release of ArcGIS Enterprise, such as timeline layouts and greater web app support.

9 Must-Watch Videos for Aspiring Data Leaders: Bridging Tech and Business for Data Team Success

Seattle Data Guy

Leading data teams can be challenging. You’ve got management and non-technical teams constantly reaching out with ad-hoc data requests; you’re likely trying to figure out what tools will work best and not break the bank. Not to mention, you’ve got to bridge the gap between business and technology. All while trying to grow your data…

Apache Airflow® Crash Course: From 0 to Running your Pipeline in the Cloud

With over 30 million monthly downloads, Apache Airflow is the tool of choice for programmatically authoring, scheduling, and monitoring data pipelines. Airflow enables you to define workflows as Python code, allowing for dynamic and scalable pipelines suitable for any use case, from ETL/ELT to running ML/AI operations in production. This introductory tutorial provides a crash course for writing and deploying your first Airflow pipeline.

What’s New in ArcGIS Knowledge 11.4 (Q4 2024)

ArcGIS

Learn about the exciting new changes that are happening with ArcGIS Knowledge with the 11.4 release of ArcGIS Enterprise.

Meet Michelle Hoover, Cloudera’s new SVP of Global Alliances and Channels

Cloudera

Cloudera’s partner ecosystem delivers best-of-breed technology solutions to joint customers from the biggest names in the industry and is a core pillar of the company’s growth strategy. Cloudera is committed to fostering collaboration with partners, growing relationships, and innovating for the future. To elevate Cloudera’s partner ecosystem, the company recently announced the promotion of Michelle Hoover to Senior Vice President of Global Alliances & Channels.

Managing Data in Salesforce CRM

RandomTrees

Salesforce CRM is one of the more effective tools on the market for keeping tabs on customer relationships and sales in a business. Good data management in Salesforce provides insight into improving interactions with customers and makes business operations easier. In this blog, we will take you through the basic steps that help in managing data effectively in Salesforce.