Sat. Sep 16, 2023 - Fri. Sep 22, 2023

Top 20 Data Engineering Project Ideas [With Source Code]

Analytics Vidhya

Data engineering plays a pivotal role in the vast data ecosystem by collecting, transforming, and delivering data essential for analytics, reporting, and machine learning. Aspiring data engineers often seek real-world projects to gain hands-on experience and showcase their expertise. This article presents the top 20 data engineering project ideas with their source code.

Bun: lessons from disrupting a tech ecosystem

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of four topics in yesterday’s subscriber-only The Pulse issue. To get full newsletters twice a week, subscribe here. Two weeks ago, a JavaScript runtime and toolkit called Bun was released and took the Node.js world by storm. Bun was mostly built by Jarred Sumner, a former Stripe engineer, and recipient of the Thiel Fellowship (a grant of $100,000 for young people to drop out of s

Airflow XCOM: The Ultimate Guide

Marc Lamberti

Wondering how to share data between tasks? What are XComs in Apache Airflow? Well, you are in the right place. In this tutorial, you will learn about XComs in Airflow: what they are, how they work, how you can define them, how to get them, and more. If you checked my course “Apache Airflow: The Hands-On Guide”, Airflow XCom should not sound unfamiliar.

Building Linked Data Products With JSON-LD

Data Engineering Podcast

Summary A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. In this episode Brian Platz explains how JSON-LD can be used as a shared representation of linked data for building semantic data products.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Getting Started with Scikit-learn in 5 Steps

KDnuggets

This tutorial offers a comprehensive hands-on walkthrough of machine learning with Scikit-learn. Readers will learn key concepts and techniques including data preprocessing, model training and evaluation, hyperparameter tuning, and compiling ensemble models for enhanced performance.

More Trending

What's new on the cloud for data engineers - part 11 (06-09.2023)

Waitingforcode

It's time for another part of "What's new on the cloud for data engineers". Let's see what happened in the last 4 months.

Scala as a Junior Developer

Rock the JVM

By Lucas Nouguier. Hey everyone, Daniel here. Lucas’ story is shared by lots of beginner Scala developers, which is why I wanted to post it here on the blog. I’ve watched thousands of developers learn Scala from scratch, and, like Lucas, they love it! If you want to learn Scala well and fast, take a look at my Scala Essentials course at Rock the JVM.

10 ChatGPT Projects Cheat Sheet

KDnuggets

KDnuggets' latest cheat sheet covers 10 curated hands-on projects to boost data science workflows with ChatGPT across ML, NLP, and full stack dev, including links to full project details.

Airflow DAG: Create your first DAG in 5 minutes

Marc Lamberti

Looking to create your first Airflow DAG? Wondering how to process data in Airflow? What are the steps to code your data pipelines? You’ve come to the right place! At the end of this short tutorial, you will have your first Airflow DAG! You might think starting with Apache Airflow is hard, but it is not. The truth is Airflow has so many features that it can be overwhelming.

Prepare Now: 2025’s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Predicting Snow Crab Habitat Using Machine Learning

ArcGIS

In collaboration with NOAA, we used the Presence-Only Prediction (Maxent) tool to predict snow crab habitat under changing climate conditions.

How Edmunds builds a blueprint for generative AI

databricks

This blog post is in collaboration with Greg Rokita, AVP of Technology at Edmunds. Long envisioned as a key milestone in computing, we've.

Feature Store Summit 2023: Practical Strategies for Deploying ML Models in Production Environments

KDnuggets

On October 11th, 2023 the Feature Store Summit will bring together leading ML companies, such as Uber, WeChat and more, for in-depth discussions about data and AI.

Machine Learning Made Easy: Q&A with Snowflake Head of Artificial Intelligence and Machine Learning Strategy Ahmad Khan

Snowflake

Why AI has everyone’s attention, what it means for different data roles, and how Alteryx and Snowflake are bringing AI to data use cases. There’s a llama on the loose! Well, more specifically, LLaMA (Large Language Model Meta AI), along with other large language models (LLMs) that have suddenly become more open and accessible for everyday applications.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

ArcGIS for Nature-Related Assessments

ArcGIS

This Climate Week renews focus on nature. Learn more about how ArcGIS supports nature-related assessments to run sustainable organizations.

Apache Spark ❤️ Apache DataSketches: New Sketch-Based Approximate Distinct Counting

databricks

Introduction In this blog post, we'll explore a set of advanced SQL functions available within Apache Spark that leverage the HyperLogLog algorithm, enabling.
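
To give a feel for how HyperLogLog estimates a distinct count, here is a minimal pure-Python sketch. It is an illustration only, not Spark's SQL functions or the Apache DataSketches implementation: the class name `HyperLogLog`, the precision parameter `p`, and the use of SHA-1 as the hash are all assumptions chosen for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog counter (illustrative, not a production sketch)."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits (assumed default)
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant from the HyperLogLog paper, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Take a 64-bit hash of the item.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # The first p bits select a register.
        idx = h >> (64 - self.p)
        # In the remaining bits, find the position of the leftmost 1-bit.
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        # Each register keeps the largest rank it has seen.
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Harmonic-mean based raw estimate.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return int(round(estimate))


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(10000):
        hll.add(f"user-{i}")
    print("estimated distinct values:", hll.count())
```

Because each register only stores a maximum, adding the same value again never changes the estimate, and two sketches built with the same `p` could be merged by taking the element-wise maximum of their registers.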

Hands-On with Unsupervised Learning: K-Means Clustering

KDnuggets

This tutorial provides hands-on experience with the key concepts and implementation of K-Means clustering, a popular unsupervised learning algorithm, for customer segmentation and targeted advertising applications.
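
The core loop of K-Means (Lloyd's algorithm) can be sketched in a few lines of plain Python. This is a simplified illustration rather than the tutorial's own code: the function name `kmeans`, the choice of the first `k` points as starting centroids, and the tiny customer dataset are assumptions made for the example. Libraries typically use smarter initialisation such as k-means++.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster points with Lloyd's algorithm; returns (centroids, labels)."""
    # Deterministic start: the first k points (an illustrative assumption).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        # Stop once neither the labels nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical customers as (monthly visits, monthly orders).
    customers = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
    centroids, labels = kmeans(customers, k=2)
    print(centroids)
    print(labels)
```

For customer segmentation, each label would correspond to one segment, and the centroid describes that segment's typical customer.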

How to Easily Connect Airbyte with Snowflake for Unleashing Data’s Power?

Workfall

Imagine your data as pieces of a complex puzzle scattered across different platforms and formats. Making sense of this scattered information often feels like solving a gigantic puzzle blindfolded. This is where the power of data integration comes into play. If you’ve ever wished for a simplified way to seamlessly connect these puzzle pieces, then you’re in for a treat.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

How DoorDash Fosters Meaningful Engineering Career Development

DoorDash Engineering

As a tech company, it’s our products and platform – and the engineers that build them – that power what DoorDash is able to offer our Consumers, Dashers, and Merchants every day. We thrive on tackling challenging technical problems and creating opportunities for customers, and take a lot of pride in what we do (and, as regular readers know, periodically share details about our work on this blog).

A Costa Rica journey with a Twist of Pura Vida

databricks

Costa Rica is known for several things, both culturally and ecologically. Among those are biodiversity, coffee, Pura Vida, and most recently a rapidly.

KDnuggets News, September 20: Python in Excel: This Will Change Data Science Forever • New KDnuggets Survey!

KDnuggets

Python in Excel: This Will Change Data Science Forever • KDnuggets Survey: Benchmark With Your Peers On Data Science Spend & Trends 2023 H2 • 5 Best AI Tools For Maximizing Productivity • And much more!

Career stories: Influencing engineering growth at LinkedIn

LinkedIn Engineering

Since learning frontend and backend skills, Rishika’s passion for engineering has expanded beyond her team at LinkedIn to grow into her own digital community. As she develops as an engineer, giving back has become the most rewarding part of her role. From intern to engineer—life at LinkedIn My career with LinkedIn began with a college internship, where I got to dive into all things engineering.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

How Leaders of the Modern Marketing Data Stack Differentiate Themselves in a Crowded Market

Snowflake

The marketing technology landscape has exploded in the last decade. With over 11,000 available solutions, an increase of 7,258% over the last 12 years, marketing organizations have never had more tool options to choose from. In this post, we’ll take a look at how leading vendors in the 2023 Modern Marketing Data Stack are differentiating their products in a crowded market. 360-degree customer view broken into 120 data silos. As of 2019, the average enterprise used 120 marketing applications.

Unexpected Tools in the Databricks Marketplace to Supercharge Manufacturing Supply Chains

databricks

“Supply chains compete, not companies” — Martin Christopher No two supply chains are identical - the unique combination of products, industries, and geographic locat.

Hands-On with Supervised Learning: Linear Regression

KDnuggets

If you're looking for a hands-on experience with a detailed yet beginner-friendly tutorial on implementing Linear Regression using Scikit-learn, you're in for an engaging journey.

How DoorDash Defines Great Engineering Management

DoorDash Engineering

As an Engineering org, we are tremendously proud of our accomplishments throughout the history of DoorDash, particularly in recent years as we’ve grown and scaled in service of our customers. This success has been heavily influenced by the strength and leadership of our Eng management team; great companies are built by great people who have great managers.

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

ADP Enables Dynamic Benchmarking of Human Capital Management Metrics with Snowflake

Snowflake

ADP provides products, services and experiences that simplify work for more than 1 million clients in 140 countries. Large and small organizations across virtually every industry rely on ADP’s cloud-based human capital management (HCM) solutions to streamline HR, payroll, time, tax and benefits administration. Self-service HCM analytics help ADP’s clients understand workforce trends and benchmark their metrics against aggregated, anonymized data from over 30 million employee records.

Introducing the Support of Lateral Column Alias

databricks

We are thrilled to introduce the support of a new SQL feature in Apache Spark and Databricks: Lateral Column Alias (LCA). This feature.

Python in Excel: This Will Change Data Science Forever

KDnuggets

You can now run Python code in Excel to analyze data, build machine learning models, and create visualizations.

Think Your Company Doesn’t Need a Chief Data Officer? Here Are 7 Reasons Why It Does

Cloudera

Perhaps your C-suite is already a bit crowded. The typical hierarchy will include a CEO, COO, CFO, CTO, CMO, CIO, and a few more. Adding another position may not be terribly appealing, but there is one C-suite role every company should consider—chief data and analytics officer (CDO or CDAO). The CDO is the point person for your data strategy: the leader who oversees how data is collected, managed, and put to use to improve the organization; the person who ensures that wherever there are opportun

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.