Sat. Feb 17, 2024 - Fri. Feb 23, 2024

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Multiple open source projects and vendors have been working together to make this vision a reality. In this episode Dain Sundstrom, CTO of Starburst, explains how the combination of the Trino query engine and the Iceberg table format offers the ease of use and execution speed of data warehouses with the infinite storage and sc

Data Lake 262

5 Airflow Alternatives for Data Orchestration

KDnuggets

Top list of open-source tools for building and managing workflows.

Data 153

ArcGIS Pro 3.3 Moves to .NET 8

ArcGIS

ArcGIS Pro 3.3 is planned to be available in May 2024. Install .NET 8 before attempting to install ArcGIS Pro 3.3 for the best user experience!

143

Data News — Week 24.08

Christophe Blefari

My ideas these days (credits). Hey, fresh Data News edition. This week I participated in a round table about data and gave a cool presentation about engines. The idea was to depict the history of engines over the last 40 years and what led to Polars and DuckDB. Obviously, I forgot a few things and I'll do a more complete v2 soon. This is my third presentation about DuckDB in the last 3 months and I think I'll slow down a bit until I get new crazy things to share.

Data Lake 130

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Data Engineering Best Practices - #2. Metadata & Logging

Start Data Engineering

1. Introduction 2. Setup & Logging architecture 3. Data Pipeline Logging Best Practices 3.1. Metadata: Information about pipeline runs, & data flowing through your pipeline 3.2. Obtain visibility into the code’s execution sequence using text logs 3.3. Understand resource usage by tracking Metrics 3.4. Monitoring UI & Traceability 3.5.

Metadata 130

7 Free Kaggle Micro-Courses for Data Science Beginners

KDnuggets

Interested in learning data science? Check out these free micro-courses from Kaggle to learn essential data science skills.

More Trending

Announcing the General Availability of Azure Private Link and Azure Storage firewall support for Databricks SQL Serverless

databricks

We are excited to announce the upcoming general availability of Azure Private Link support for Databricks SQL (DBSQL) Serverless, planned in April 2024.

SQL 126

Location Referencing Guide to Esri Partner Conference and Esri Developer Summit

ArcGIS

Join us for an exciting Partner Conference and Developer Summit! Discover the latest in ArcGIS Location Referencing and connect with experts.

3 Inspirational Stories of Leaders in AI

KDnuggets

Every leader has their origin story, and here are some that might inspire you.

149

Aligning Velox and Apache Arrow: Towards composable data management

Engineering at Meta

We’ve partnered with Voltron Data and the Arrow community to align and converge Apache Arrow with Velox, Meta’s open source execution engine. Apache Arrow 15 includes three new format layouts developed through this partnership: StringView, ListView, and Run-End-Encoding (REE). This new convergence helps Meta and the larger community build data management systems that are unified, more efficient, and composable.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Strengthening Cyber Resilience through Efficient Data Management: A Response to M-21-31

databricks

In today's environment, proactive cybersecurity is crucial to any public sector agency. For many organizations, log data that security professionals need for effective.

The Abstraction Problem – A Great Evil

Confessions of a Data Guy

There is a great evil Spirit that is haunting the streets of code in the land of programmers. It’s a Spirit of obfuscation and twisting things into what they are not. The Spirit wanders around on the loose looking for someone, and it finds ready victims among the ranks of new programmers and the innocent […] The post The Abstraction Problem – A Great Evil appeared first on Confessions of a Data Guy.

Coding 113

A Roadmap For Your Data Career

KDnuggets

As you design your career in data, you’ve got to avoid getting stuck in your comfort zone or allowing your manager or current situation to determine your path.

Data 149

5 minutes to make a map!

ArcGIS

Create a cool looking landscape map, in record time. Start the clock!

109

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Announcing the General Availability of Unity Catalog Volumes

databricks

Today, we are excited to announce that Unity Catalog Volumes is now generally available on AWS, Azure, and GCP. Unity Catalog provides a.

AWS 122

Simplify Application Development With Hybrid Tables

Snowflake

We previously announced Snowflake’s Unistore workload, which continues Snowflake’s legacy of breaking down data silos by uniting transactional and analytical data in a consistent and governed platform. Today, we are pleased to announce that Hybrid Tables, the core feature powering Unistore, is in public preview in select AWS regions. Hybrid Tables is a new table type that enables transactional use cases within Snowflake with fast, high-concurrency point operations.

Python in Finance: Real Time Data Streaming within Jupyter Notebook

KDnuggets

Learn a modern approach to stream real-time data in Jupyter Notebook. This guide covers dynamic visualizations, a Python for quant finance use case, and Bollinger Bands analysis with live data.

Finance 148

New SQL Practice Problems

Confessions of a Data Guy

I’m trying something new. I get a lot of questions from folks about getting into the Data Engineering space, how to get better, grow, learn, etc. So I came up with a solution. SQL Practice Problems. Some moons ago I wrote a Data Engineering Practice repo on GitHub for free, and some 1.2K stars later […] The post New SQL Practice Problems appeared first on Confessions of a Data Guy.

SQL 100

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Unapologetically Technical Episode 9 – Gunnar Morling

Jesse Anderson

This week on Unapologetically Technical, I had the wonderful pleasure of interviewing Gunnar Morling, the creator of the Billion Row Challenge and Senior Staff Software Engineer at Decodable. In this episode, we talk about why it is so important to stay in a position long enough to gain experience and see the success or failure of decisions. He also shares his experiences at RedHat and working on Debezium.

Is the modern data stack disappearing?

Christophe Blefari

No. This question generated a lot of content last week, and a lot of words were written. I wanted to keep my answer short so as not to burden you with a few thousand more words to read. "Modern data stack" was coined by US companies and VCs, mainly Fivetran and dbt Labs, as a term to quickly emphasize a way to build a data stack in the cloud related to ELT.

Data 100

Navigating the Data Revolution: Exploring the Booming Trends in Data Science and Machine Learning

KDnuggets

Dive into transformative trends in data science, encompassing AI-powered automation, NLP, ethical considerations, decentralized computing, and interdisciplinary collaboration.

8 Tips for Managing Stakeholder Expectations

Knowledge Hut

Why Stakeholder Management? One of the most critical aspects of project management is doing what’s necessary to develop and control relationships with all individuals that the project impacts. In this article, you will learn techniques for identifying stakeholders, analyzing their influence on the project, and developing strategies to communicate, set boundaries, and manage competing expectations.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Top digital trends for 2024: Predictions and insights

InData Labs

Top digital trends for 2024 will be unprecedented technological advancements that will reshape the way businesses operate. Introducing them into corporate structures is a strategic move for all companies that want to stay ahead of the curve. The tech and digital marketing industry trends we discuss below will change the way organizations handle customer service, […] The post Top digital trends for 2024: Predictions and insights first appeared on InData Labs.

Beyond the Buzz: Braze Equips Modern Marketers with Powerful AI Tools

Snowflake

A lot of the buzz around AI focuses on its future potential. And we get it — we’re talking about a transformative technology that presents seemingly limitless possibilities. But an important aspect of this world-changing tech story that gets lost in the hype is understanding exactly what AI solutions are available for you and your team to employ right now, today.

Free Mastery Course: Become a Large Language Model Expert

KDnuggets

It is a self-paced course that covers fundamental and advanced concepts of LLMs and teaches how to deploy them in production.

IT 147

Advantages of Agile Testing Methodology

Knowledge Hut

What is Agile Testing? As the name implies, agile projects are executed very quickly and with flexibility. Agile methods involve tasks executed in short iterations or sprints. Agile Testing is also iterative and takes place after each sprint, rather than towards the end of the project. Testing iteratively helps to validate the client requirements and adapt better to changing conditions.

Project 98

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Unlocking AI Assisted Development Safely: From Idea to GA

Pinterest Engineering

Sam Wang | Sr. Technical Program Manager; Joe Gordon | Sr. Staff Software Engineer. At Pinterest, we are continuously looking for ways to improve our developer experience, and we have recently shipped AI-assisted development for everyone while balancing safety, security, and cost. In this blog post, we share our journey of unlocking AI-assisted development, from the initial idea to the General Availability (GA) stage.

Scala 94

WebSockets in Http4s

Rock the JVM

By Herbert Kateu. 1. Introduction. The WebSocket protocol enables persistent two-way communication between a client and a server, where packets can be passed in both directions without the need for additional HTTP requests. The specification for this protocol is outlined in RFC 6455. WebSockets are used in applications such as instant messaging, gaming, simultaneous editing, and stock tickers, to mention but a few.

Scala 94

6 YouTube Channels to Learn about AI

KDnuggets

Are you looking into learning about AI? YouTube is your first stop.

147

Delivering Telecom Sustainability Targets Using Autonomous Networks

Snowflake

As the world grapples with the escalating climate crisis, many industries are re-examining their operations to identify and implement sustainable practices. The telecommunications industry is no exception. Telecom companies face growing pressure from consumers, investors and regulators to reduce their carbon footprint and achieve net-zero emissions.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.