Sat. Sep 16, 2023 - Fri. Sep 22, 2023

Top 20 Data Engineering Project Ideas [With Source Code]

Analytics Vidhya

Data engineering plays a pivotal role in the vast data ecosystem by collecting, transforming, and delivering data essential for analytics, reporting, and machine learning. Aspiring data engineers often seek real-world projects to gain hands-on experience and showcase their expertise. This article presents the top 20 data engineering project ideas with their source code.

Bun: lessons from disrupting a tech ecosystem

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of four topics in yesterday’s subscriber-only The Pulse issue. To get full newsletters twice a week, subscribe here. Two weeks ago, a JavaScript runtime and toolkit called Bun was released and took the Node.js world by storm. Bun was mostly built by Jarred Sumner, a former Stripe engineer, and recipient of the Thiel Fellowship (a grant of $100,000 for young people to drop out of s

Airflow XCOM: The Ultimate Guide

Marc Lamberti

Wondering how to share data between tasks? What are XComs in Apache Airflow? Well, you are in the right place. In this tutorial, you will learn about XComs in Airflow: what they are, how they work, how you can define them, how to get them, and more. If you checked my course “Apache Airflow: The Hands-On Guide”, Airflow XCom should not sound unfamiliar.

Building Linked Data Products With JSON-LD

Data Engineering Podcast

Summary: A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. In this episode Brian Platz explains how JSON-LD can be used as a shared representation of linked data for building semantic data products.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Python in Excel: This Will Change Data Science Forever

KDnuggets

You can now run Python code in Excel to analyze data, build machine learning models, and create visualizations.

Scala as a Junior Developer

Rock the JVM

By Lucas Nouguier. Hey everyone, Daniel here. Lucas’ story is shared by lots of beginner Scala developers, which is why I wanted to post it here on the blog. I’ve watched thousands of developers learn Scala from scratch, and, like Lucas, they love it! If you want to learn Scala well and fast, take a look at my Scala Essentials course at Rock the JVM.

More Trending

What is Apache Airflow?

Marc Lamberti

What is Apache Airflow? Perhaps your colleagues or YouTube videos have mentioned it. Maybe your job requires you to use it, but you’re unsure what it is. In this article, you will learn everything about what Airflow is, what it isn’t, and its core concepts and components. But, before answering this question, we need a proper understanding of what an “orchestrator” is.

Ensemble Learning Techniques: A Walkthrough with Random Forests in Python

KDnuggets

A practical walkthrough for random forests in Python.

What's new on the cloud for data engineers - part 11 (06-09.2023)

Waitingforcode

It's time for another part of "What's new on the cloud for data engineers". Let's see what happened in the last 4 months.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Airflow DAG: Create your first DAG in 5 minutes

Marc Lamberti

Looking to create your first Airflow DAG? Wondering how to process data in Airflow? What are the steps to code your data pipelines? You’ve come to the right place! At the end of this short tutorial, you will have your first Airflow DAG! You might think starting with Apache Airflow is hard, but it is not. The truth is Airflow has so many features that it can be overwhelming.

Getting Started with Scikit-learn in 5 Steps

KDnuggets

This tutorial offers a comprehensive hands-on walkthrough of machine learning with Scikit-learn. Readers will learn key concepts and techniques including data preprocessing, model training and evaluation, hyperparameter tuning, and compiling ensemble models for enhanced performance.

How Edmunds builds a blueprint for generative AI

databricks

This blog post is in collaboration with Greg Rokita, AVP of Technology at Edmunds. Long envisioned as a key milestone in computing, we've.

ArcGIS for Nature-Related Assessments

ArcGIS

This Climate Week renews focus on nature. Learn more about how ArcGIS supports nature-related assessments to run sustainable organizations.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Machine Learning Made Easy: Q&A with Snowflake Head of Artificial Intelligence and Machine Learning Strategy Ahmad Khan

Snowflake

Why AI has everyone’s attention, what it means for different data roles, and how Alteryx and Snowflake are bringing AI to data use cases. There’s a llama on the loose! Well, more specifically, LLaMA (Large Language Model Meta AI), along with other large language models (LLMs) that have suddenly become more open and accessible for everyday applications.

10 ChatGPT Projects Cheat Sheet

KDnuggets

KDnuggets' latest cheat sheet covers 10 curated hands-on projects to boost data science workflows with ChatGPT across ML, NLP, and full stack dev, including links to full project details.

A Costa Rica journey with a Twist of Pura Vida

databricks

Costa Rica is known for several things, both culturally and ecologically. Among those are biodiversity, coffee, Pura Vida, and most recently a rapidly.

Locked by another application using ArcPy and a File geodatabase

ArcGIS

Data management tips and tricks for managing locks in a temporary file geodatabase with automated workflows.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

How to Easily Connect Airbyte with Snowflake for Unleashing Data’s Power?

Workfall

Reading Time: 9 minutes. Imagine your data as pieces of a complex puzzle scattered across different platforms and formats. Making sense of this scattered information often feels like solving a gigantic puzzle blindfolded. This is where the power of data integration comes into play. If you’ve ever wished for a simplified way to seamlessly connect these puzzle pieces, then you’re in for a treat.

Hands-On with Supervised Learning: Linear Regression

KDnuggets

If you're looking for a hands-on experience with a detailed yet beginner-friendly tutorial on implementing Linear Regression using Scikit-learn, you're in for an engaging journey.

Introducing the Support of Lateral Column Alias

databricks

We are thrilled to introduce the support of a new SQL feature in Apache Spark and Databricks: Lateral Column Alias (LCA). This feature.

How Leaders of the Modern Marketing Data Stack Differentiate Themselves in a Crowded Market

Snowflake

The marketing technology landscape has exploded in the last decade. With over 11,000 available solutions, an increase of 7,258% over the last 12 years, marketing organizations have never had more tool options to choose from. In this post, we’ll take a look at how leading vendors in the 2023 Modern Marketing Data Stack are differentiating their products in a crowded market. 360-degree customer view broken into 120 data silos: As of 2019, the average enterprise used 120 marketing applications.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

How DoorDash Fosters Meaningful Engineering Career Development

DoorDash Engineering

As a tech company, it’s our products and platform – and the engineers that build them – that power what DoorDash is able to offer our Consumers, Dashers, and Merchants every day. We thrive on tackling challenging technical problems and creating opportunities for customers, and take a lot of pride in what we do (and, as regular readers know, periodically share details about our work on this blog).

Hands-On with Unsupervised Learning: K-Means Clustering

KDnuggets

This tutorial provides hands-on experience with the key concepts and implementation of K-Means clustering, a popular unsupervised learning algorithm, for customer segmentation and targeted advertising applications.

Orchestrating Data Analytics with Databricks Workflows

databricks

For data-driven enterprises, data analysts play a crucial role in extracting insights from data and presenting it in a meaningful way. However, many.

ADP Enables Dynamic Benchmarking of Human Capital Management Metrics with Snowflake

Snowflake

ADP provides products, services and experiences that simplify work for more than 1 million clients in 140 countries. Large and small organizations across virtually every industry rely on ADP’s cloud-based human capital management (HCM) solutions to streamline HR, payroll, time, tax and benefits administration. Self-service HCM analytics help ADP’s clients understand workforce trends and benchmark their metrics against aggregated, anonymized data from over 30 million employee records.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Career stories: Influencing engineering growth at LinkedIn

LinkedIn Engineering

Since learning frontend and backend skills, Rishika’s passion for engineering has expanded beyond her team at LinkedIn to grow into her own digital community. As she develops as an engineer, giving back has become the most rewarding part of her role. From intern to engineer: life at LinkedIn. My career with LinkedIn began with a college internship, where I got to dive into all things engineering.

Top 5 Free Alternatives to GPT-4

KDnuggets

Think GPT-4 is a big deal? These Generative AI newbies are already stealing the show!

Unexpected Tools in the Databricks Marketplace to Supercharge Manufacturing Supply Chains

databricks

“Supply chains compete, not companies” (Martin Christopher). No two supply chains are identical - the unique combination of products, industries, and geographic locat.

Think Your Company Doesn’t Need a Chief Data Officer? Here Are 7 Reasons Why It Does

Cloudera

Perhaps your C-suite is already a bit crowded. The typical hierarchy will include a CEO, COO, CFO, CTO, CMO, CIO, and a few more. Adding another position may not be terribly appealing, but there is one C-suite role every company should consider—chief data and analytics officer (CDO or CDAO). The CDO is the point person for your data strategy: the leader who oversees how data is collected, managed, and put to use to improve the organization; the person who ensures that wherever there are opportun

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.