Sat. Sep 23, 2023 - Fri. Sep 29, 2023

Why are Cloud Development Environments Spiking in Popularity, Now?

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover a fresh industry trend: Cloud Development Environments. Full subscribers received this analysis 3 weeks ago.

Cloud 258

5 Free Books to Help You Master Python

KDnuggets

From the basics of Python to clean architecture and more, here are five free books to level up your Python skills.

Python 157

DuckDB + Delta Lake (the new lake house?)

Confessions of a Data Guy

I always leave it to my dear readers and followers to give me pokes in the right direction. Nothing like the teeming masses to set you straight. Recently I was working on my Substack Newsletter, on the topic of Polars + Delta Lake, reading remote files from s3 … I left a question open on […]

Data 147

Upgrade your Modern Data Stack

Christophe Blefari

Make your data stack take-off ( credits ) Hello, another edition of Data News. This week, we're going to take a step back and look at the current state of data platforms. What are the current trends, and why are people fighting over the concept of the modern data stack? Early September is usually conference season. All over the world, people gather in huge venues to attend conferences.

Big Data 147

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Working at a Startup vs in Big Tech

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of four topics in today’s subscriber-only The Pulse issue. To get full newsletters twice a week, subscribe here. Willem Spruijt is a software engineer with whom I worked on the same team at Uber in Amsterdam, building payments systems.

Top 7 Free Cloud Notebooks for Data Science

KDnuggets

Cloud notebooks are game-changers for data science, providing free access to computing, pre-built environments, collaboration features, and third-party integrations - everything you need to enhance your workflow.

More Trending

Powering Vector Search With Real Time And Incremental Vector Indexes

Data Engineering Podcast

Summary: The rapid growth of machine learning, especially large language models, has led to a commensurate growth in the need to store and compare vectors. In this episode Louis Brandy discusses the applications for vector search capabilities both in and outside of AI, as well as the challenges of maintaining real-time indexes of vector data. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles.

SQL 147

Deploy Private LLMs using Databricks Model Serving

databricks

We are excited to announce public preview of GPU and LLM optimization support for Databricks Model Serving! With this launch, you can deploy.

Introduction to Deep Learning Libraries: PyTorch and Lightning AI

KDnuggets

Simple explanation of PyTorch and Lightning AI.

Getting started with Airflow in 10 mins

Marc Lamberti

At the end of this introduction to Airflow, you will be all set for getting started with Airflow. You will start with the basics, such as what Airflow is and the essential concepts. Then you will set up and run your local development environment using the Astro CLI to create your first data pipeline. I hope you’re getting excited. Fasten your seatbelt, take a deep breath, and let’s go! For a complete hands-on introduction to Apache Airflow, here is a 6-hour course at a discount.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Data News — Week 23.38 (late)

Christophe Blefari

Early like my run ( credits ) Hey. This is a super late Data News, I wanted to send it earlier but I was travelling then enjoying time with friends and family. I'm still struggling a bit to write as fast as I would like, but 🤷‍♂️ So, sorry for the late edition and enjoy. Gen AI 🤖 Announcing Microsoft Copilot — Having everything under a common brand is great and Copilot is a great name.

Data 130

Announcing the Public Preview of Lakeview Dashboards!

databricks

We are excited to announce the public preview of the next generation of Databricks SQL dashboards, dubbed Lakeview dashboards. Available today, this new.

SQL 126

Deploying Your First Machine Learning Model

KDnuggets

With just 3 simple steps, you can build & deploy a glass classification model faster than you can say... glass classification model!

Airflow TaskGroup: All you need to know!

Marc Lamberti

An Airflow TaskGroup helps make a complex DAG easier to organize and read. Airflow TaskGroups are meant to replace SubDAGs, the historical way of grouping your tasks. Indeed, SubDAGs are too complicated for merely grouping tasks. They bring a lot of complexity as you must create a DAG in a DAG, import the SubDagOperator (which is a sensor), define the parameters correctly, and so on.

Coding 130

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Old School: Adapting Esri Basemaps for Printed Products

ArcGIS

Esri basemaps are designed to be used at multiple scales, but a static map needs everything in one view. How do we get around that?

Designing 124

Generative Agent Research Papers You Should Read

KDnuggets

Research papers in this exciting field that you don’t want to miss.

149

easyJet bets on Databricks Lakehouse and Generative AI to be an Innovation Leader in Aviation

databricks

This blog is authored by Ben Dias, Director of Data Science and Analytics and Ioannis Mesionis, Lead Data Scientist at easyJet Introduction to.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Lessons from debugging a tricky direct memory leak

Pinterest Engineering

Sanchay Javeria | Software Engineer, Ads Data Infrastructure To support metrics reporting for ads from external advertisers and real-time ad budget calculations at Pinterest, we run streaming pipelines using Apache Flink. These jobs have guaranteed an overall 99th percentile availability to our users; however, every once in a while some tasks get hit with nasty direct out-of-memory (OOM) errors on multiple operators that look something like this: As is the case with most failures in a distribute

Utilities 110

Working with Esri Vector Basemaps in ArcGIS Pro

ArcGIS

Esri Vector Basemaps are available for use in ArcGIS Pro, and that opens up some new possibilities for you.

Designing 119

A Comparative Overview of the Top 10 Open Source Data Science Tools in 2023

KDnuggets

Are you looking for the open source tools to help you in your data science journey? Look no further. Discover these game-changers that will elevate your data-driven decisions.

Ballard Power Systems RDU (Remote Diagnostics Unit) Visualization Platform for Interactive At-Scale Industrial IoT Streaming Analytics

databricks

This article represents a collaborative effort between Plotly, Ballard Power Systems, and Databricks. Fleets of buses worldwide run on hydrogen fuel cells made.

Systems 113

Prepare Now: 2025’s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Training Foundation Improvements for Closeup Recommendation Ranker

Pinterest Engineering

Fan Jiang | Software Engineer, Closeup Candidate Retrieval; Liyao Lu | Software Engineer, Closeup Ranking & Blending; Laksh Bhasin | Software Engineer, Core ML Foundations; Chen Yang | Software Engineer, Core ML Foundations; Shivin Thukral | Software Engineer, Closeup Ranking & Blending; Travis Ebesu | Software Engineer, Closeup Ranking & Blending; Kent Jiang | Software Engineer, Core Serving Infra; Yan Sun | Engineering Manager, Closeup Ranking & Blending; Huizhong Duan | Engine

IBM Technology Chooses Cloudera as its Preferred Partner for Addressing Real Time Data Movement Using Kafka

Cloudera

Organizations increasingly rely on streaming data sources not only to bring data into the enterprise but also to perform streaming analytics that accelerate the process of being able to get value from the data early in its lifecycle. As lakehouse architectures (including offerings from Cloudera and IBM) become the norm for data processing and building AI applications, a robust streaming service becomes a critical building block for modern data architectures.

Kafka 100

The Data Maturity Pyramid: From Reporting to a Proactive Intelligent Data Platform

KDnuggets

This article describes the data maturity pyramid and its various levels, from simple reporting to AI-ready data platforms. It emphasizes the importance of data for business and illustrates how data platforms serve as the driving force behind AI.

Data 149

Using Images and Metadata for Product Fuzzy Matching with Zingg

databricks

Product matching is an essential function in many retail and consumer goods organizations. Incoming products are compared to items in the existing product.

Metadata 110

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Fueling Data-Driven Decision-Making with Data Validation and Enrichment Processes

Precisely

77% of data and analytics professionals say data-driven decision-making is the top goal for their data programs. Data-driven decision-making and initiatives are certainly in demand, but their success hinges on … well, the data that supports them. More specifically, the quality and integrity of that data. It seems obvious enough, but checking that your data is up to the task and taking any necessary steps to improve and maintain its quality can be easier said than done.

How Snowflake Native Apps Deliver Security for App Builders and Consumers

Snowflake

The Snowflake Native App Framework, which leverages Snowflake’s advanced architecture, allows for a new level of security for applications. This security spans not just the application consumer, but also the application providers. Controlling all software and infrastructure in the Snowflake Data Cloud, Snowflake can secure the application code to protect the intellectual property (IP) of builders.

Python 98

Unify Batch and ML Systems with Feature/Training/Inference Pipelines

KDnuggets

A new way to do MLOps for your Data-ML-Product Teams.

Systems 148

Governing cybersecurity data across multiple clouds and regions using Unity Catalog & Delta Sharing

databricks

According to a 2023 report from Enterprise Strategy Group, 85% of organizations indicated they deploy applications on two or more IaaS providers, attesting.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.