Tue. Jan 07, 2025

Building a Fast, Light, and CHEAP Lake House with DuckDB, Delta Lake, and AWS Lambda

Confessions of a Data Guy

Building fun things is a real part of Data Engineering. Using your creative side when building a Lake House is possible, and using tools that are outside the normal box can sometimes be preferable. Check out this video where I dive into how I build just such a Lake House using Modern Data Stack tools like […]

AWS 130

Getting to Know the SAR Analysis Toolset

ArcGIS

A must-read article that introduces the SAR analysis toolset in ArcGIS Pro, which helps users extract valuable insights from processed SAR data.

Process 92

5 Free Courses to Master Data Wrangling with Python

KDnuggets

Do you want to learn data wrangling with Python on a budget? No worries, there are (at least) five free courses that'll provide you with solid knowledge.

Python 80

Databricks on Databricks - Transforming the Sales Experience using GenAI Agents

databricks

At Databricks, our automation vision is to automate all aspects of the business, making it better, faster, and cheaper. For the sales teams.

IT 78

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Composable CDPs in Financial Services: Empowering Marketing

Snowflake

Marketers at financial services companies have their work cut out for them. Their companies have a wealth of data, but that data is often fragmented among different systems and divisions, and protected-class data has a wide range of restrictions on how it can be used for different product lines. Some of the most effective companies in the financial sector are preparing their strategy for long-term success by centralizing first-party data in the Snowflake AI Data Cloud for Financial Services.

Banking 76

Mastering Multi-Cloud with Cloudera: Strategic Data & AI Deployments Across Clouds

Cloudera

In today's dynamic digital landscape, multi-cloud strategies have become vital for organizations aiming to leverage the best of both cloud and on-premises environments. As enterprises navigate complex data-driven transformations, hybrid and multi-cloud models offer unmatched flexibility and resilience. Here's a deep dive into why and how enterprises master multi-cloud deployments to enhance their data and AI initiatives.

Cloud 64

More Trending

Understanding Change Data Capture (CDC) in MySQL and PostgreSQL: BinLog vs. WAL + Logical Decoding

Towards Data Science

How CDC tools use MySQL Binlog and PostgreSQL WAL with logical decoding for real-time data streaming. CDC (Change Data Capture) is a term that has been gaining significant attention over the past few years. You might already be familiar with it (if not, don't worry, there's a quick introduction below). One question that puzzled me, though, was how tools like the Debezium CDC connectors can read changes from MySQL and PostgreSQL databases.

How Aon Reinsurance Solutions is Leveraging Databricks Delta Sharing to Help Make Better Decisions

databricks

Aon plc is a leading global firm providing risk, reinsurance, retirement, and health solutions. Focusing on data-driven insights, Aon operates in over 120.

Data 58

Multi-Select Filtering with Streamlit and Snowflake

Cloudyard

Streamlit, coupled with Snowpark, provides an excellent platform for creating interactive and user-friendly data applications. This blog highlights the significance of “Multi-Select Filtering with Streamlit and Snowflake”, demonstrating how multi-select filters can simplify data exploration by allowing users to apply multiple filters dynamically.

Announcing egress control for your Databricks serverless and Mosaic AI Model Serving workloads

databricks

We are excited to announce that egress control for Databricks serverless and Mosaic AI Model Serving workloads is available in Public Preview on.

57

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Monte Carlo Now Surfaces Hex Projects and AI Apps Impacted By Poor Data Quality

Monte Carlo

Hex is one of the fastest-growing collaborative data exploration solutions for a reason: it allows both technical and non-technical users to explore data, derive insights, and build powerful data applications for further visualization and exploration. Hex is a powerful way to bring analytics, data science, and business users together to answer tough business questions, but only if the data populating the project can be trusted.

Project 52

Airflow Data Intervals: A Deep Dive

Towards Data Science

Building idempotent and re-playable data pipelines.

Predictive Models Are Nothing Without Trust

Cloudera

Airports are an interconnected system where one unforeseen event can tip the scale into chaos. For a smaller airport in Canada, data has grown to be its North Star in an industry full of surprises. In order for data to bring true value to operations, and ultimately customer experiences, those data insights must be grounded in trust. Ryan Garnett, Senior Manager Business Solutions of Halifax International Airport Authority, joined The AI Forecast to share how the airport revamped its approach to data

Finance 61