Fri. Jun 14, 2024

Data Engineering Projects

Start Data Engineering

1. Introduction
2. Run Data Pipelines
   2.1. Run on codespaces
   2.2. Run locally
3. Projects
   3.1. Projects from least to most complex
   3.2. Batch pipelines
   3.3. Stream pipelines
   3.4. Event-driven pipelines
   3.5. LLM RAG pipelines
4. Conclusion

1. Introduction

Whether you are new to data engineering or have been in the data field for a few years, one of the most challenging parts of learning new frameworks is setting them up!

Unlocking Data Insights: Key Pandas Functions for Effective Analysis

KDnuggets

This article aims to cover some of the Pandas functions essential for data analysis. You can seamlessly handle missing values, remove duplicates, replace specific values, and perform several other data manipulation tasks by mastering these tools.
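As a rough illustration of the three tasks named above (missing values, duplicates, and value replacement), here is a minimal sketch. The sample data, the column names, and the choice of the median as the fill value are assumptions made for this example and are not taken from the KDnuggets article.

```python
import pandas as pd

# Hypothetical sample data: a missing age, one fully duplicated row,
# and a placeholder string in the city column.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Ben", "Cy"],
        "age": [30.0, None, None, 41.0],
        "city": ["Lima", "N/A", "N/A", "Oslo"],
    }
)

cleaned = (
    df.drop_duplicates()                        # drop exact duplicate rows
      .replace({"city": {"N/A": "Unknown"}})    # replace a specific value in one column
      .reset_index(drop=True)                   # renumber rows after dropping
)

# Fill the remaining missing ages with the median of the observed ages.
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())

print(cleaned)
```

Whether to fill, drop, or flag missing values depends on the analysis; the median is used here only because it is a common, outlier-resistant default.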

Where Does Data Governance Fit Into Hybrid Cloud?

Cloudera

At a time when artificial intelligence (AI) and tools like generative AI (GenAI) and large language models (LLMs) have exploded in popularity, getting the most out of organizational data is critical to driving business value and carving out a competitive market advantage. To reach that goal, more businesses are turning toward hybrid cloud infrastructure – with data on-premises, in the cloud, or both – as a means to tap into valuable data.

Spatial data management at the Esri 2024 User Conference

ArcGIS

Spatial data management track at the Esri 2024 User Conference. This blog covers all the geodatabase management related sessions.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Snowflake Data Governance: Ensure Data Security and Compliance

Hevo

The constant increase in the data produced by modern technologies has given rise to significant challenges, such as data complexity, inconsistencies, and data breaches. You need a structured approach to address these challenges and mitigate the risk of compromising sensitive data. What is Data Governance in Snowflake?

How to Calculate and Optimize Your Snowflake Compute Costs: A Comprehensive Guide

Hevo

Snowflake is a cloud-based platform that manages large data workloads in virtual warehouses. It is known for its unique architecture and pricing model. Snowflake charges you for the compute resources, storage, and data transfer services you utilize.

More Trending

Snowflake Data Masking: An Efficient Column-Level Security Feature

Hevo

Your organization might store sensitive data such as identification numbers, date of birth, or account numbers in Snowflake data warehouse tables. To ensure this information is accessible only to authorized people with appropriate roles, Snowflake supports column-level security through dynamic data masking policies.
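The idea behind a role-based masking policy can be sketched in plain Python. This is a conceptual illustration only, not Snowflake's API or syntax: the role name `PII_ADMIN`, the function `mask_value`, and the keep-last-four rule are all hypothetical choices made for this example.

```python
# Hypothetical set of roles allowed to see unmasked values.
ALLOWED_ROLES = frozenset({"PII_ADMIN"})


def mask_value(value: str, role: str, keep_last: int = 4) -> str:
    """Return the value unchanged for authorized roles, masked otherwise."""
    if role in ALLOWED_ROLES:
        return value
    if len(value) <= keep_last:
        # Too short to reveal a suffix safely: mask everything.
        return "*" * len(value)
    # Mask all but the last `keep_last` characters.
    return "*" * (len(value) - keep_last) + value[-keep_last:]


print(mask_value("1234567890", "ANALYST"))
print(mask_value("1234567890", "PII_ADMIN"))
```

The key property is that masking is decided at read time from the caller's role, so the stored value never changes and the same column can serve both authorized and unauthorized readers.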

A Comprehensive Guide to Snowflake Migration: Best Practices and Strategies

Hevo

Snowflake is a cloud data warehousing solution that has become popular for companies with large data volumes. However, moving databases from an existing data platform to Snowflake can be complicated. You may face challenges in adapting existing pipelines that require custom code or integrating data from legacy systems to Snowflake’s environment.

Snowflake Data Visualization- Guide 101

Hevo

How do you visualize your Snowflake data? Snowsight, the visual interface of Snowflake, offers two easy ways to visualize your data within Snowflake: charts and dashboards.

Enhance Database Management with Schema Change Snowflake

Hevo

Schema management is crucial for ensuring data quality and consistency in a database. One prominent feature it enables is version control and change management. Version control helps maintain the history of schema versions, allowing an efficient way to track the changes made to the schema.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Snowflake Salesforce Integration: Efficient Ways

Hevo

Organizations often struggle with data silos and inconsistencies because customer data is dispersed across multiple systems. Such scattered data can hinder the ability to make informed, data-driven decisions. Platforms like Salesforce and Snowflake help address these challenges by unifying customer data and providing robust analytics.

A Guide to Streamline Azure Data Factory Snowflake Integration

Hevo

Azure Data Factory (ADF) is a Microsoft-managed data integration solution that facilitates the creation of cloud-based data workflows. It is a fully managed service that can be used to build data pipelines by orchestrating data movement. Snowflake is a fully managed SaaS (Software-as-a-Service) tool that offers cloud-based data warehouse services.

Achieve Data Consistency with Snowflake Convert Date to YYYY-MM-DD

Hevo

Different systems and databases use various date formats. Converting dates into a consistent format ensures accuracy across systems. For instance, suppose you are collecting sales data from several regions that use different formats. Combining and analyzing the sales data would be time-consuming and error-prone if the dates were not standardized.
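To make the standardization step concrete, here is a minimal sketch that normalizes date strings to YYYY-MM-DD in Python before the data is loaded. It does not use Snowflake's own conversion functions. The list of accepted input formats and the helper name `to_iso_date` are assumptions for this example, and slash-separated dates are assumed to be day-first.

```python
from datetime import datetime

# Hypothetical set of input formats seen across regions.
# Assumption: slash-separated dates are day/month/year.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y", "%Y%m%d")


def to_iso_date(raw: str) -> str:
    """Return the date as YYYY-MM-DD, trying each known format in order."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"Unrecognized date format: {raw!r}")


for sample in ["14/06/2024", "14.06.2024", "20240614", "2024-06-14"]:
    print(sample, "->", to_iso_date(sample))
```

Raising an error on unrecognized input, rather than guessing, keeps ambiguous values such as month-first dates from being silently misread.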

Foster Data-Driven Decisions Through Snowflake Data Analytics

Hevo

Data analytics helps to derive valuable insights from your raw data. It helps you align your business processes for better outcomes by identifying trends and patterns in the data that would otherwise be lost.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.