My Obsidian Note-Taking Workflow
Simon Späti
JULY 28, 2024
A Vim-Inspired Approach to Efficient Note Management with Obsidian and Markdown
KDnuggets
JULY 29, 2024
Learn to build end-to-end data science pipelines, from data ingestion to data visualization, using the pandas pipe method.
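As a rough illustration of the idea (not taken from the linked tutorial), a pipeline built with `DataFrame.pipe` might look like the sketch below. The helper functions `drop_missing` and `add_revenue`, the column names, and the sample data are all hypothetical.

```python
import pandas as pd


def drop_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: remove rows that contain missing values."""
    return df.dropna()


def add_revenue(df: pd.DataFrame, price_col: str, qty_col: str) -> pd.DataFrame:
    """Hypothetical feature step: add a revenue column without mutating the input."""
    return df.assign(revenue=df[price_col] * df[qty_col])


# Made-up sample data standing in for an ingestion step.
raw = pd.DataFrame({"price": [2.0, None, 5.0], "qty": [3, 4, 2]})

# Each .pipe() call passes the frame to the next function, so the steps read top to bottom.
result = (
    raw
    .pipe(drop_missing)
    .pipe(add_revenue, price_col="price", qty_col="qty")
)
print(result)
```

Extra arguments after the function name are forwarded to it, which is what lets `add_revenue` stay reusable across column names.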
databricks
AUGUST 1, 2024
Today, we are excited to announce that Lakehouse Federation in Unity Catalog is now Generally Available (GA) across AWS, Azure, and GCP! Lakehouse.
Confluent
JULY 29, 2024
Apache Kafka 3.8 adds 17 new KIPs (13 for Core, 3 for Streams & 1 for Connect). Highlights include 2 new Docker images, the ability to set task assignors, and more!
Seattle Data Guy
AUGUST 1, 2024
How To Run A Data Team As A New Head Of Data: What would you do if you became the head or director of data for a 1,000-person company? Yesterday, you were plugging along as an analyst, and now, suddenly, you have all these new responsibilities. Figuring out where to start is part of the job. You’d probably feel a strong temptation to freak out. Who wouldn’t?
KDnuggets
JULY 31, 2024
If you work in data, you’ll write SQL queries all the time. So how do you write efficient SQL queries that are optimized for performance? This tutorial will help you with just that.
Waitingforcode
AUGUST 1, 2024
Welcome to the second blog post dedicated to the previous Data+AI Summit. This time I'm going to share with you a summary of Apache Spark talks.
ArcGIS
JULY 31, 2024
Catch eyes and imaginations with this fun technique that draws attention to your area of interest with a bit of style!
KDnuggets
JULY 30, 2024
Follow this 7-step recipe to master effective insight and information dissemination through compelling data story crafting.
databricks
JULY 31, 2024
We’re excited to announce the Public Preview of LakeFlow Connect for SQL Server, Salesforce, and Workday. These ingestion connectors enable simple and efficient.
Speaker: Tamara Fingerlin, Developer Advocate
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
Snowflake
AUGUST 1, 2024
Since introducing Snowpark Container Services, we’ve seen overwhelming adoption across industries from customers and partners, including Landing.AI, Relational.AI, H2O.ai, SailPoint, AIR MILES, Spark NZ, and Eutelsat OneWeb. These organizations and many more are using Snowpark Container Services capabilities to easily and securely deploy everything from custom front-ends and large-scale ML training and inference to open source and homegrown models, all securely within Snowflake.
Confluent
JULY 30, 2024
Confluent Platform 7.
KDnuggets
JULY 29, 2024
Let's learn how to perform memory-efficient operations in pandas with large datasets.
databricks
AUGUST 1, 2024
At Data and AI Summit, we announced the general availability of Databricks Lakehouse Monitoring. Our unified approach to monitoring data and AI.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
ArcGIS
AUGUST 2, 2024
Essential Data Models in the Utility Network Foundations
Snowflake
AUGUST 1, 2024
Retrieval Augmented Generation (RAG) allows enterprises to ground responses from Large Language Models in their specific organization’s data. This helps ensure that AI-powered applications provide responses that are not only accurate, relevant, and consistent, but also aligned with business needs. At Snowflake, we make it simple for our customers to implement RAG, while also enabling the strict governance and privacy controls that businesses require.
KDnuggets
AUGUST 2, 2024
This tutorial will teach you how to simplify your file management tasks, from organization to backup, using Python’s pathlib module.
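A minimal sketch of what such tasks can look like with `pathlib` (not taken from the linked tutorial): files are grouped into subfolders by extension, then the folder is copied as a backup. The function names `organize_by_extension` and `backup`, the folder names, and the sample files are hypothetical, and the copy step relies on `shutil.copytree` from the standard library.

```python
import shutil
import tempfile
from pathlib import Path


def organize_by_extension(folder: Path) -> None:
    """Move each file in `folder` into a subfolder named after its extension."""
    # Snapshot the listing first, because the loop creates new subfolders.
    for item in list(folder.iterdir()):
        if item.is_file():
            target_dir = folder / (item.suffix.lstrip(".") or "no_extension")
            target_dir.mkdir(exist_ok=True)
            item.rename(target_dir / item.name)


def backup(folder: Path, destination: Path) -> Path:
    """Copy the whole folder tree to `destination` (which must not exist yet)."""
    return Path(shutil.copytree(folder, destination))


# Demo in a throwaway directory so nothing on disk is touched permanently.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp) / "workspace"
    workspace.mkdir()
    (workspace / "notes.txt").write_text("hello")
    (workspace / "data.csv").write_text("a,b\n1,2\n")

    organize_by_extension(workspace)
    backup_dir = backup(workspace, Path(tmp) / "workspace_backup")

    organized = sorted(
        p.relative_to(workspace).as_posix() for p in workspace.rglob("*") if p.is_file()
    )
    backed_up = sorted(
        p.relative_to(backup_dir).as_posix() for p in backup_dir.rglob("*") if p.is_file()
    )
    print(organized)
```

The `/` operator builds paths portably, which is most of the appeal over string concatenation with `os.path`.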
databricks
JULY 30, 2024
Introduction An organization adopting new technologies or on a modernization journey typically focuses on upcoming tools, their features and potential performance/cost improvements under.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Data Engineering Weekly
JULY 28, 2024
Meta: Introducing Llama 3.1: Our most capable models to date. Probably one of the hottest announcements this week is the Llama 3.1 release, the first-ever open-sourced frontier AI model competitive with leading foundation models across a range of tasks, including GPT-4, GPT-4o, and Claude 3.5 Sonnet. "The Llama 3 Herd of Models" is an insightful paper that helps one deeply understand the foundation model.
ArcGIS
JULY 29, 2024
More small-scale projected basemaps to add to the set I built in 2023
KDnuggets
AUGUST 1, 2024
Unlock your potential with these 6 crafted ChatGPT prompts designed to boost your productivity and streamline your operational workflows.
databricks
JULY 28, 2024
The transformative potential of artificial intelligence (AI) is undeniable. From productivity efficiency, to cost savings, and improved decision-making across all industries, AI is.
Confessions of a Data Guy
AUGUST 1, 2024
Daft: Distributed Dataframes with Python.
Snowflake
JULY 31, 2024
Academic medical centers (AMCs) are a critical keystone of healthcare systems worldwide. They serve as major hubs of medical research, pioneering new treatments that advance and set the standard of care throughout medicine. They also educate and train the next generation of healthcare professionals, ensuring that the medical field continues to advance.
KDnuggets
AUGUST 1, 2024
Let's learn how to use a MultiIndex in pandas for hierarchical data operations.
databricks
JULY 29, 2024
Financial Valuations & Comparative Analysis Financial institutions specialized in capital markets such as hedge funds, market makers and pension funds have long been.
Speaker: Tamara Fingerlin, Developer Advocate
In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
Confessions of a Data Guy
AUGUST 1, 2024
CI/CD for Data Engineers.
Monte Carlo
JULY 30, 2024
It’s clear that data quality is becoming more of a focus for more data teams. So why are there still so many questions like these: A quick search on subreddits for data engineers, data analysts, data scientists, and more can yield a plethora of users seeking data quality advice. And while the comment below may seem like the accepted way of doing data quality management… … there’s actually a much better way.
KDnuggets
JULY 30, 2024
Learn how to perform several of the most basic matrix operations with NumPy.
databricks
JULY 31, 2024
Companies across all industries want to share data with each other to enable collaboration and accelerate innovation. However, these organizations often use different.
Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage
When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m