Project - Data Engineering Digest

An educational side project

The Pragmatic Engineer

JUNE 1, 2023

I’d like to share a story about an educational side project which could prove fruitful for a software engineer who’s seeking a new job. Juraj created a systems design explainer on how he built this project, and the technologies used. Figure: the systems design diagram for the Rides application. The app uses: Node.js Persistence.

Education

Education Project PostgreSQL Software Engineer

Data Engineering Projects

Start Data Engineering

JUNE 14, 2024

1. Introduction 2. Run Data Pipelines 2.1. Run on codespaces 2.2. Run locally 3. Projects 3.1. Projects from least to most complex 3.2. Batch pipelines 3.3. Stream pipelines 3.4. Event-driven pipelines 3.5. LLM RAG pipelines 4. Conclusion

Data Engineer

Data Engineer Data Engineering Project Engineering

dbt multi-project collaboration

Christophe Blefari

OCTOBER 19, 2023

Over the last few years, dbt has become a de facto standard enabling companies to collaborate easily on data transformations. Whatever the number, there will be a critical point at which a single project no longer scales. Cross-project references are a key enabler of data team decentralisation.

Project

Project Finance SQL Government


Top 20 Data Engineering Project Ideas [With Source Code]

Analytics Vidhya

SEPTEMBER 20, 2023

Aspiring data engineers often seek real-world projects to gain hands-on experience and showcase their expertise. This article presents the top 20 data engineering project ideas with their source code. Whether you’re […]

Data Engineer

Data Engineer Data Engineering Coding Project

5 Early Indicators Your Embedded Analytics Will Fail

In this White Paper, Logi Analytics has identified 5 tell-tale signs your project is moving from “nice to have” to “needed yesterday.” Many application teams leave embedded analytics to languish until something (an unhappy customer, plummeting revenue, a spike in customer churn) demands change. But by then, it may be too late.

Project

Data Engineering Project for Beginners - Batch edition

Start Data Engineering

MAY 11, 2022

Introduction 2. Objective 3. Prerequisite 4.2 AWS Infrastructure costs 4.3 Data lake structure 5. Code walkthrough 5.1 Loading user purchase data into the data warehouse 5.2 Loading classified movie review data into the data warehouse 5.3 Generating user behavior metric 5.4. Checking results 6. Tear down infra 7. Design considerations 8.

Data Engineer

Data Engineer Data Engineering Project Data Lake

5 Quirky Data Science Projects to Impress

KDnuggets

SEPTEMBER 12, 2024

Develop unique, standout data science projects to improve your data portfolio.

Data Science

Data Science Project Portfolio Data

Introducing Accelerator for Machine Learning (ML) Projects: Summarization with Gemini from Vertex AI

Cloudera

DECEMBER 9, 2024

We’re thrilled to announce the release of a new Cloudera Accelerator for Machine Learning (ML) Projects (AMP): Summarization with Gemini from Vertex AI.

Machine Learning

Machine Learning Project Banking Accessibility

10 GitHub Repositories for Advanced Machine Learning Projects

KDnuggets

OCTOBER 16, 2024

Where can you find projects dealing with advanced ML topics? GitHub is a perfect source with its many repositories. I’ve selected ten to talk about in this article.

Machine Learning

Machine Learning Project IT

3 Challenges of Building Complex Dashboards with Open Source Components

Speaker: Ryan MacCarrigan, Founding Principal, LeanStudio

Watch this webinar with Ryan MacCarrigan, Founding Principal of LeanStudio, to learn about key considerations for launching your next analytics project. But what happens when you have a growing user base and additional feature requests?

Coding

7 Computer Vision Projects for All Levels

KDnuggets

OCTOBER 30, 2024

Each project, from beginner tasks like Image Classification to advanced ones like Anomaly Detection, includes a link to the dataset and source code for easy access and implementation.

Project

Project Datasets Coding Accessibility

How to build a data project with step-by-step instructions

Start Data Engineering

SEPTEMBER 18, 2024

Introduction 2. Parts of data engineering 3.1. Requirements 3.1.1. Understand input datasets available 3.1.2. Define what the output dataset will look like 3.1.3. Define SLAs so stakeholders know what to expect 3.1.4. Define checks to ensure the output dataset is usable 3.2. Identify what tool to use to process data 3.3. Data flow architecture 3.

Project

Project Building Datasets Architecture

7 Cool Data Science Project Ideas for Beginners

KDnuggets

OCTOBER 8, 2024

Start working on these projects today. Are you a data science beginner looking to build your portfolio?

Data Science

Data Science Project Portfolio Data

Top 16 Technical Data Sources for Advanced Data Science Projects

KDnuggets

JANUARY 23, 2024

Here are data repositories that will up your data science game and improve your data projects.

Data Science

Data Science Project Data Machine Learning

Why “Build or Buy?” Is the Wrong Question for Analytics

Every time an application team gets caught up in the “build vs buy” debate, it stalls projects and delays time to revenue. There is a third option. Partnering with an analytics development platform gives you the freedom to customize a solution without the risks and long-term costs of building your own.

Building

Data Science Side Quests: 4 Uncommon Projects to Elevate Your Skills

KDnuggets

APRIL 7, 2025

Doing data science projects can be demanding, but it doesn’t mean it has to be boring. Here are four projects to introduce more fun to your learning and stand out from the masses.

Data Science

Data Science Project Data IT

Project Ideas to Master Data Engineering

KDnuggets

AUGUST 30, 2024

Data engineering is best learned by doing projects. Here are six projects focusing on different data engineering skills to ensure you have it all covered. But which ones?

Data Engineer

Data Engineer Data Engineering Project Engineering

Data Security with Snowflake: Row Access, Masking, and Projection Policies

Cloudyard

NOVEMBER 1, 2024

To solve this, we’ll apply Projection Policies to ensure that only certain roles can see sensitive columns like Customer numbers. Snowflake provides several layers of data security, including Projection Policies, Masking Policies, and Row Access Policies, that work together to restrict access based on roles.

Data Security

Data Security Accessible Accessibility Project
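
The three policy types named in the Snowflake entry above can be illustrated with a toy, in-memory model. This is only a sketch of the idea, not Snowflake's API or SQL syntax: the table contents, role names (`ADMIN`, `EU_ANALYST`), and the `query` helper are all hypothetical and exist only for illustration.

```python
# Toy model of role-based access controls. NOT Snowflake's API:
# every name below (roles, columns, helpers) is a hypothetical example.

ROWS = [
    {"customer_number": "C-1001", "region": "EU", "email": "ana@example.com"},
    {"customer_number": "C-1002", "region": "US", "email": "bob@example.com"},
]

# Row access: which regions each role may see rows for.
ROW_ACCESS = {"ADMIN": {"EU", "US"}, "EU_ANALYST": {"EU"}}
# Masking: roles that see the email column unmasked.
UNMASKED_EMAIL_ROLES = {"ADMIN"}
# Projection: roles allowed to select the customer_number column at all.
CUSTOMER_NUMBER_ROLES = {"ADMIN"}


def mask_email(email: str) -> str:
    """Hide the local part of an email address, keeping the domain."""
    _, _, domain = email.partition("@")
    return "***@" + domain


def query(role: str, columns: list[str]) -> list[dict]:
    """Return the requested columns as the given role is allowed to see them."""
    # Projection check: refuse the whole query if a restricted column is selected.
    if "customer_number" in columns and role not in CUSTOMER_NUMBER_ROLES:
        raise PermissionError(f"role {role} may not select customer_number")

    allowed_regions = ROW_ACCESS.get(role, set())
    result = []
    for row in ROWS:
        # Row access check: skip rows outside the role's regions.
        if row["region"] not in allowed_regions:
            continue
        out = {}
        for col in columns:
            value = row[col]
            # Masking check: the column is visible, but its value is obscured.
            if col == "email" and role not in UNMASKED_EMAIL_ROLES:
                value = mask_email(value)
            out[col] = value
        result.append(out)
    return result


if __name__ == "__main__":
    print(query("ADMIN", ["customer_number", "email"]))
    print(query("EU_ANALYST", ["region", "email"]))
```

In this sketch the three checks act at different levels: the projection check rejects the query outright, the row access check silently filters rows, and the masking check returns the column with an obscured value.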

3 Ways of Building Python Projects using GPT-4o

KDnuggets

AUGUST 7, 2024

Learn about essential AI tools that can help you develop Python projects faster and with fewer bugs using natural language.

Python

Python Project Building

How to Find and Test Assumptions in Product Development

You'll learn: Why every product leader goes into a new project with untested, hidden assumptions. Watch this webinar with Laura Klein, product manager and author of Build Better Products, to learn how to spot the unconscious assumptions which you’re basing decisions on and guidelines for validating (or invalidating) your ideas.

Project

Top 10 Kaggle Machine Learning Projects to Become Data Scientist in 2024

KDnuggets

DECEMBER 7, 2023

Master Data Science with Top 10 Kaggle ML Projects to become a Data Scientist.

Machine Learning

Machine Learning Project Data Science Data

5 Tips for Structuring Your Data Science Projects

KDnuggets

JANUARY 6, 2025

Learn how to structure your data science projects to make them more organized and minimize chaos!

Data Science

Data Science Project Data

How to reference a seed from a different dbt project?

Start Data Engineering

DECEMBER 18, 2024

Ways to reuse seed data across multiple dbt projects 2.1. Setup project environment 2.2. Store your package for other dbt projects to reference 2.3. Use project dependencies (dbt enterprise only) 2.4. Introduction 2. Code setup 2.1.1. Prerequisites 2.1.2. Turn the source repo into a dbt package 2.2.1.

Project

Project Coding Data

7 Python Projects to Boost Your Data Science Portfolio

KDnuggets

NOVEMBER 5, 2024

Enhance your data science portfolio with these seven engaging Python projects that demonstrate essential programming and software engineering skills.

Portfolio

Portfolio Data Science Python Project

The Essential Guide to Building Analytic Applications

What should product managers keep in mind when adding an analytics project to their roadmap? What are best practices when designing the UI and UX of embedded dashboards, reports, and analytics? What should software teams know about implementing security that works with the rest of their products?

Analytics Application

Data contracts and Bitol project

Waitingforcode

MARCH 25, 2025

Data contracts were a hot topic in the data space before LLMs and GenAI came out. They promised a better world with fewer communication issues between teams, leading to more reliable and trustworthy data. Unfortunately, the promise has been too hard to put into practice. Has been, or should I write "was"?

Project

Project Data

How to Set Up Your First BigQuery Project

KDnuggets

SEPTEMBER 17, 2024

In this post, you'll learn what BigQuery is, understand its capabilities, and set up a project in Google Cloud which we will later use to practice using BigQuery for loading, querying, and analyzing data.

Project

Project Google Cloud Cloud Management

5 Unconventional Sources of Data for Your Next Project

KDnuggets

NOVEMBER 28, 2024

When working on a project, think beyond traditional data sources. Explore unconventional options like social media and user-generated content for fresh insights.

Project

Project Media Data

7 Projects to Master Data Engineering

KDnuggets

DECEMBER 4, 2024

Learn to build, run, and manage data engineering pipelines both locally and in the cloud using popular tools.

Data Engineer

Data Engineer Data Engineering Engineering Project

Modern Data Architecture for Embedded Analytics

Every data-driven project calls for a review of your data architecture—and that includes embedded analytics. Before you add new dashboards and reports to your application, you need to evaluate your data architecture with analytics in mind.

Data Architecture

How to decide on a data project for your portfolio

Start Data Engineering

SEPTEMBER 23, 2024

1. Introduction 2. Steps to decide on a data project to build 2.1. Objective 2.2. Research 2.2.1. Job description 2.2.2. Potential referral/hiring manager research 2.2.3. Company research 2.3. Data 2.3.1. Dataset Search 2.3.2. Generate fake data 2.4. Outcome 2.4.1. Visualization 2.5. Presentation 3. Conclusion 4. Read these

Portfolio

Portfolio Project Datasets Data

Best Practices for Version Control in Data Science Projects

KDnuggets

SEPTEMBER 18, 2024

Versioning Best Practices for Data Science Projects. As I have mentioned, this article assumes you have basic versioning knowledge. You don’t necessarily need to be adept at it, but you should at least have Git installed in your environment. If you haven’t, please follow the installation instructions on the Git website.

Data Science

Data Science Project Data IT

7 GitHub Projects to Master Machine Learning

KDnuggets

MARCH 28, 2025

Learn model serving, CI/CD, ML orchestration, model deployment, local AI, and Docker to streamline ML workflows, automate pipelines, and deploy scalable, portable AI solutions effectively.

Machine Learning

Machine Learning Project

Life Cycle of Data Science Project

WeCloudData

MARCH 1, 2025

Data is shaping our decisions, from personalized shopping experiences to checking weather forecasts before leaving home. All of these data science applications have a life cycle to follow. The data science lifecycle of a project outlines the […]

Data Science

Data Science Project Bytes Data

The Definitive Guide to Dashboard Design

Great dashboards lead to richer user experiences and significant return on investment (ROI), while poorly designed dashboards distract users, suppress adoption, and can even tarnish your project or brand.

Designing

7 MLOPs Projects for Beginners

KDnuggets

FEBRUARY 18, 2025

Develop AI applications, test them, and deploy on the cloud using user-friendly MLOps tools and straightforward methods.

Project

Project Cloud

NLP Project Life Cycle: A Case Study on Automated Resume Screening

WeCloudData

MARCH 20, 2025

Natural Language Processing (NLP) has transformed technology by allowing machines to understand, decode, and generate human language. NLP plays a crucial role in multiple domains, with projects ranging from automating customer service and improving search engines to analyzing social media sentiment.

Project

Project Media Technology Engineering

7 Data Science Projects to Land a 6 Figure Job

KDnuggets

JANUARY 8, 2025

In this article, I’m going to share data science project ideas that will actually help you stand out. These are creative projects that solve problems with data, and I’ve included source code and tutorials to help you replicate them.

Data Science

Data Science Project Data Coding

Make Map Icons with an Orthographic Projection

ArcGIS

APRIL 2, 2025

Create custom projections with only two coordinates and then turn them into icons for endless possibilities.

Project

The Definitive Guide to Embedded Analytics

Inside you will learn: how embedded analytics has become essential to business applications; when to buy an embedded analytics solution and when to build one; how to go to market, from pricing and packaging to external promotion; how to build a business case and sell the project internally; the future of embedded analytics; plus so much more.

Building

Paying down tech debt: further learnings

The Pragmatic Engineer

SEPTEMBER 19, 2024

This project helped onboard me to the software, its structure, its build, and our issue tracking and version control workflows. My first project was supporting i18n (internationalization) in the app. They hired a manager who had done this kind of project before, and set a target date of nearly two years.

Recruitment

Recruitment Java Coding Project

Introducing the dbt MCP Server – Bringing Structured Data to AI Workflows and Agents

dbt Developer Hub

APRIL 20, 2025

In particular, we expect both Business Intelligence and Data Engineering will be driven by AI operating on top of the context defined in your dbt Projects. We’ve known for a while that the combination of structured data from your dbt project + LLMs is a potent combo (particularly when using the dbt Semantic Layer).

Structured Data

Structured Data SQL BI Project

Open source business model struggles at WordPress

The Pragmatic Engineer

OCTOBER 10, 2024

Heavy development investment: Automattic – a VC-funded company founded by Matt Mullenweg – is the largest contributor to WordPress, paying more than 100 staff to work full-time on the project. HashiCorp is facing similar challenges with Terraform / OpenTofu.

Consulting

Consulting AWS Engineering Software Engineer

Bun: lessons from disrupting a tech ecosystem

The Pragmatic Engineer

SEPTEMBER 22, 2023

Bun has no such constraint. It begins with a clean state, and can ship something that works for, say, 90% of existing Node projects, and break the remaining 10%. I tip my hat to all volunteer open source contributors and maintainers, both for Node and for other projects. If you are one of these people: thank you!

Programming Language

Programming Language Project Coding Engineering

The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data and AI

Speaker: Aindra Misra, Sr. Staff Product Manager of Data & AI at BILL (Previously PM Lead at Twitter/X)

Anticipated future use cases as we project into 2024 and beyond. Examine real world use cases, both internal and external, where data analytics is applied, and understand its evolution with the introduction of Gen AI.

Data
