Database - Data Engineering Digest

How to Normalize Relational Databases With SQL Code?

Analytics Vidhya

FEBRUARY 27, 2023

The database is the major element of a data science project. To generate actionable insights, the database must be centralized and organized efficiently. If a corrupted, unorganized, or redundant database is used, the results of the analysis may become inconsistent and highly misleading.

Relational Database

Relational Database Database SQL Coding

Scaling Beyond Postgres: How to Choose a Real-Time Analytical Database

Simon Späti

MARCH 11, 2025

It’s the Swiss Army knife of databases, and for many applications, it’s more than sufficient. Therefore, you’ve probably come across terms like OLAP (Online Analytical Processing) systems, data warehouses, and, more recently, real-time analytical databases.

Database

Database Data Warehouse Data Engineering Data Engineer

Understanding the Basics of Database Normalization

Analytics Vidhya

MARCH 2, 2023

Introduction Data normalization is the process of building a database according to what is known as a canonical form, where the final product is a relational database with no data redundancy. More specifically, normalization involves organizing data according to attributes assigned as part of a larger data model.

Database

Database Relational Database Building Process

MSSQL vs MySQL: Comparing Powerhouses of Databases

Analytics Vidhya

AUGUST 30, 2023

Introduction In the bustling arena of database management systems, two heavyweight contenders emerge, each carrying its arsenal of features and capabilities. In one corner, we have the suave and sophisticated Microsoft SQL Server (MSSQL), donned in the elegance of enterprise-level prowess.

MySQL

MySQL Database SQL Systems

New Study: 2018 State of Embedded Analytics Report

Why do some embedded analytics projects succeed while others fail? We surveyed 500+ application teams embedding analytics to find out which analytics features actually move the needle. Read the 6th annual State of Embedded Analytics Report to discover new best practices. Brought to you by Logi Analytics.

Project

Neo4j vs. Amazon Neptune: Graph Databases in Data Engineering

Analytics Vidhya

AUGUST 4, 2024

Traditional databases, while still valuable, often falter when it comes to handling highly connected data. Enter the unsung heroes of the data world: graph databases. This article discusses […]

Database

Database Data Engineering Data Engineer Engineering

SQL Injection: The Cyber Attack Hiding in Your Database

Analytics Vidhya

FEBRUARY 2, 2023

Introduction SQL injection is an attack in which a malicious user can insert arbitrary SQL code into a web application’s query, allowing them to gain unauthorized access to a database. Attackers can use this to steal sensitive information or make unauthorized changes to the data stored in the database.

Database

Database SQL Coding Accessible

Surveying The Market Of Database Products

Data Engineering Podcast

OCTOBER 29, 2023

Summary Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. What are the aspects of the database market that keep you interested as a VP of product?

Database

Database SQL BI Machine Learning

Designing A Non-Relational Database Engine

Data Engineering Podcast

APRIL 14, 2024

Summary Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. Can you describe what constitutes a NoSQL database? If you were to start from scratch today, what database would you build?

Non-relational Database

Non-relational Database Relational Database Database Designing

Monetizing Analytics Features: Why Data Visualizations Will Never Be Enough

Think your customers will pay more for data visualizations in your application? Five years ago they may have. But today, dashboards and visualizations have become table stakes. Discover which features will differentiate your application and maximize the ROI of your embedded analytics. Brought to you by Logi Analytics.

Data

7 Best Cloud Database Platforms

KDnuggets

OCTOBER 18, 2023

Cloud databases have made it easier and cheaper to develop enterprise-level applications, offering flexibility, convenience, and standard database functionality. See what KDnuggets recommends.

Database

Database Cloud IT Data Engineering

Reconciling The Data In Your Databases With Datafold

Data Engineering Podcast

MARCH 17, 2024

Summary A significant portion of data workflows involve storing and processing information in database engines. Your host is Tobias Macey and today I'm welcoming back Gleb Mezhanskiy to talk about how to reconcile data in database environments. Interview Introduction How did you get involved in the area of data management?

Database

Database Data Lake High Quality Data Data Workflow

Mirroring SQL Server Database to Microsoft Fabric

Striim

NOVEMBER 19, 2024

SQL2Fabric Mirroring is a new fully managed service offered by Striim to mirror on-premise SQL databases. It’s a collaborative service between Striim and Microsoft based on Fabric Open Mirroring that enables real-time data replication from on-premise SQL Server databases to Azure Fabric OneLake.

SQL

SQL Database Data Warehouse Data Pipeline

A Beginner’s Guide to ClickHouse Database

KDnuggets

SEPTEMBER 13, 2024

Learn how to install ClickHouse DBMS, create a database, and run SQL queries using native and Python clients.

Database

Database SQL Python Data Engineering

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Database

7 Modern SQL Database you Must Know in 2024

KDnuggets

JUNE 28, 2024

Explore the world of modern databases that are fast, secure, and cost-efficient, designed to tackle large-scale and diverse data challenges.

Database

Database SQL Designing Data

5 Free University Courses to Learn Databases and SQL

KDnuggets

MARCH 5, 2024

Looking to learn SQL and databases to level up your data science skills? Learn SQL, database internals, and much more with these free university courses.

SQL

SQL Database Data Science Data

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

FEBRUARY 25, 2024

Summary Building a database engine requires a substantial amount of engineering effort and time investment. In this episode he explains how he used the combination of Apache Arrow, Flight, Datafusion, and Parquet to lay the foundation of the newest version of his time-series database.

Database

Database Technology Data Lake High Quality Data

An Honest Comparison of Open Source Vector Databases

KDnuggets

NOVEMBER 2, 2023

We will explore their use cases, key features, performance metrics, supported programming languages, and more to provide a comprehensive and unbiased overview of each database.

Database

Database Programming Language Programming

Modern Data Architecture for Embedded Analytics

You’ll learn: 7 approaches to data architecture for embedded analytics, from a transactional database to a columnar or in-memory database. Discover the pros and cons of each approach, plus how to choose the right architecture for your business priorities, timeline, and customers.

Data Architecture

The 5 Best Vector Databases You Must Try in 2024

KDnuggets

NOVEMBER 17, 2023

The top vector databases are known for their versatility, performance, scalability, consistency, and efficient algorithms in storing, indexing, and querying vector embeddings for AI applications.

Database

Database Algorithm

Vector Database for LLMs, Generative AI, and Deep Learning

KDnuggets

FEBRUARY 28, 2024

Exploring the limitless possibilities of AI and making it context-aware.

Deep Learning

Deep Learning Database IT

Vector Technologies for AI: Extending Your Existing Data Stack

Simon Späti

MARCH 28, 2025

The database landscape has reached 394 ranked systems across multiple categories: relational, document, key-value, graph, search engine, time series, and the rapidly emerging vector databases. What fundamental differences exist between AI-focused vector databases and analytical vector engines like DuckDB or DataFusion?

Technology

Technology PostgreSQL MySQL Database

A Deep Dive into Data Replication: Most Effective Way to Protect Your Data

Analytics Vidhya

FEBRUARY 22, 2023

Introduction Data replication, also known as database replication, is the copying of data to ensure that all information remains consistent across all data resources in real time. Data replication is like a safety net that keeps your information safe from disappearing or falling through the cracks. In most cases, data alters.

Database

Database Data NoSQL Datasets

Unapologetically Technical Episode 17 – Semih Salihoglu

Jesse Anderson

FEBRUARY 11, 2025

Semih is a researcher and entrepreneur with a background in distributed systems and databases. He then pursued his doctoral studies at Stanford University, delving into the complexities of database systems. Don’t forget to subscribe to my YouTube channel to get the latest on Unapologetically Technical!

Computer Science

Computer Science Database Design Software Engineer Software Engineering

Interesting startup idea: benchmarking cloud platform pricing

The Pragmatic Engineer

OCTOBER 17, 2024

The current database includes 2,000 server types in 130 regions and 340 zones. Results are stored in git and their database, together with benchmarking metadata. Databases: SQLite files used to publish data; DuckDB to query these files in the public APIs; CockroachDB to collect and store historical data.

Cloud

Cloud AWS Metadata Cloud Computing

How To Set Up Your Data Infrastructure In 2025 – Part 1

Seattle Data Guy

APRIL 15, 2025

Everyone is talking about AI, chatbots, LLMs, vector databases, and whether your data stack is “AI-ready.” Planning out your data infrastructure in 2025 can feel wildly different than it did even five years ago. The ecosystem is louder, flashier, and more fragmented.

Database

Database Data IT Big Data

The Roots of Today's Modern Backend Engineering Practices

The Pragmatic Engineer

NOVEMBER 21, 2023

We didn’t build our applications in neat containers, but in bulky monoliths which commingled business, database, backend, and frontend logic. We dabbled in network engineering, database management, and system administration. Our deployments were initially manual. […] were in English only.

Engineering

Engineering Bytes Cloud Computing AWS

How to Develop Serverless Code Using Azure Functions?

Analytics Vidhya

JANUARY 30, 2023

Whether we are analyzing IoT data streams, managing scheduled events, processing document uploads, responding to database changes, etc. Azure Functions allow developers […]

Coding

Coding Database Management Process

A Beginner’s Guide to Geospatial with DuckDB

Simon Späti

FEBRUARY 26, 2025

Traditionally, answering this question would require expensive GIS (Geographic Information Systems) software or complex database setups. This seemingly simple question requires analyzing competitor locations, population density, traffic patterns, and demographics: all spatial data.

Database

Database Data Engineering Data Engineer Accessible

How Apache Iceberg Is Changing the Face of Data Lakes

Snowflake

APRIL 2, 2025

Data storage has been evolving, from databases to data warehouses and expansive data lakes, with each architecture responding to different business and data needs. Traditional databases excelled at structured data and transactional workloads but struggled with performance at scale as data volumes grew.

Data Lake

Data Lake Metadata Cloud Storage Data Warehouse

What are Data Access Object and Data Transfer Object in Python?

Analytics Vidhya

FEBRUARY 6, 2023

Especially while working with databases, it is often considered a good practice to follow a design pattern. Introduction A design pattern is simply a repeatable solution for problems that keep on reoccurring. The pattern is not an actual code but a template that can be used to solve problems in different situations.

Accessible

Accessible Accessibility Python Database

What Is AWS DMS And Why You Shouldn’t Use It As An ELT

Seattle Data Guy

NOVEMBER 8, 2024

Whether it was moving data from a local database instance to S3 or some other data storage layer. Recently, I’ve encountered a few projects that used AWS DMS, which is almost like an ELT solution. It was interesting to see AWS DMS used in this manner. But it’s not what DMS was built for.

AWS

AWS IT Data Storage Database

Databricks Named a Leader in 2023 Gartner® Magic Quadrant™ for Cloud Database Management Systems

databricks

DECEMBER 21, 2023

We are excited to announce that Gartner has recognized Databricks as a Leader for a third consecutive year in the 2023 Gartner® Magic.

Database

Database Systems Cloud Management

Data News — Week 25.02

Christophe Blefari

JANUARY 11, 2025

Materialization of data warehouse layers: What are the considerations for each materialization you should pick in your data warehouse layer (views, tables, schemas vs. databases, etc.)? The best code is the code you never wrote: Every line of code is a form of debt, a liability that must be maintained and understood.

Data

Data Data Warehouse Coding Programming Language

Top 10 Powerful Data Modeling Tools to Know in 2023

Analytics Vidhya

JUNE 24, 2023

As a new developer, a robust data modeling foundation is crucial for effectively working with databases. Introduction In the era of data-driven decision-making, having accurate data modeling tools is essential for businesses aiming to stay competitive.

Database

Database Utilities Data Data Science

Monte Carlo Recognized as the #1 Leader in Data Observability and Data Quality by G2

Monte Carlo

DECEMBER 18, 2024

In the Winter 2025 awards, we were recognized as #1 across the entire Data Observability, Data Quality, and Database Monitoring categories, as well as #1 in the Enterprise Relationship Index. All told, we’re featured in 71 reports, including the #1 spot in 31 reports, and we received 44 badges, including eight Leader badges!

Database

Database High Quality Data Data Software Engineer

Step-by-Step Roadmap to Learn SQL in 2023

Analytics Vidhya

FEBRUARY 28, 2023

Introduction Structured Query Language is a powerful language to manage and manipulate data stored in databases. After being introduced in the 70s, it has become the standard querying language for relational databases. […]

SQL

SQL Relational Database Data Science Database

Change Data Capture at Pinterest

Pinterest Engineering

NOVEMBER 18, 2024

Change Data Capture (CDC) is a crucial technology that enables organizations to efficiently track and capture changes in their databases. In this blog post, we’ll explore what CDC is, why it’s important, and our journey of implementing Generic CDC solutions for all online databases at Pinterest. What is Change Data Capture?

Kafka

Kafka MySQL Database Software Engineer

Microsoft’s Drasi: An Open-Source Tool for Efficient Change Management Systems

Analytics Vidhya

OCTOBER 7, 2024

Microsoft’s open-source tool, Drasi, addresses this need by effortlessly detecting, monitoring, and responding to data changes across platforms, including relational and graph databases.

Systems

Systems Management Database Data
