Sun. Dec 17, 2023

Adding An Easy Mode For The Modern Data Stack With 5X

Data Engineering Podcast

Summary: The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tool for every job. In reality, it left data teams spending all of their engineering effort on integrating systems that weren't designed with compatible user experiences. The team at 5X understands the pain involved and the barriers to productivity, and set out to solve the problem by pre-integrating the best tools from each layer of the s

Data Lake 130

Implementing cost-effective Test-Driven Development in an LLM application by Fanis Vlachos

Scott Logic

Introduction: In our continuous pursuit of refining Scottbot, our LLM-powered chatbot, we have encountered and overcome various challenges. Implementing Test-Driven Development practices and comprehensive testing, a topic previously explored in a dedicated blog, was particularly difficult because of the non-deterministic nature of Large Language Models (LLMs).

A guide to UDP in Scala with FS2

Rock the JVM

By Herbert Kateu. 1. Introduction: UDP stands for User Datagram Protocol. It is a transport-layer communication protocol built on top of IP that is used to transmit data across a network. It works by bundling data into a UDP packet, adding header information, and sending these packets to the target destination. The UDP header is fixed at 8 bytes and contains a source port, a destination port, the checksum used by the receiving device to verify packet integrity, and the length of the packet which equ
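
The 8-byte header described in this teaser can be illustrated with a minimal sketch. It is written in Python rather than the article's Scala/FS2, and the names `UdpHeader`, `parse_udp_header`, and `build_udp_datagram` are hypothetical helpers invented for illustration, not taken from the article. It assumes the standard field order on the wire (source port, destination port, length, checksum), each field 16 bits wide in network byte order.

```python
import struct
from typing import NamedTuple

# Four unsigned 16-bit fields in network (big-endian) byte order.
UDP_HEADER_FORMAT = "!HHHH"
UDP_HEADER_SIZE = struct.calcsize(UDP_HEADER_FORMAT)  # 8 bytes


class UdpHeader(NamedTuple):
    """Hypothetical container for the four UDP header fields."""
    source_port: int
    destination_port: int
    length: int    # header plus payload, in bytes
    checksum: int


def parse_udp_header(datagram: bytes) -> UdpHeader:
    """Decode the first 8 bytes of a raw UDP datagram."""
    if len(datagram) < UDP_HEADER_SIZE:
        raise ValueError("datagram is shorter than the 8-byte UDP header")
    return UdpHeader(*struct.unpack(UDP_HEADER_FORMAT, datagram[:UDP_HEADER_SIZE]))


def build_udp_datagram(source_port: int, destination_port: int,
                       payload: bytes, checksum: int = 0) -> bytes:
    """Prefix a payload with a UDP header.

    The checksum is left at 0 here for simplicity (assumption: the
    caller or the network stack fills in a real value).
    """
    length = UDP_HEADER_SIZE + len(payload)
    header = struct.pack(UDP_HEADER_FORMAT, source_port,
                         destination_port, length, checksum)
    return header + payload
```

Building a datagram with `build_udp_datagram(5000, 53, b"hello")` and passing the result to `parse_udp_header` round-trips the ports and reports a length of the header size plus the five payload bytes. In practice the operating system's socket layer writes and reads this header, so code like this is only needed when inspecting raw packets.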

Scala 90

Amazon S3 to PostgreSQL: 2 Easy Methods to Replicate Data

Hevo

As a data engineer, you hold all the cards to make data easily accessible to your business teams. Your team just requested an Amazon S3 to PostgreSQL connection as a priority. We know you don’t want to keep your data scientists and business analysts waiting to get critical business insights.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

UDP with Scala and FS2 Explained

Rock the JVM

Learn how UDP works and discover how to transition from an old Java NIO implementation to a more modern and expressive API like FS2 in Scala

Scala 52

Examples of PostgreSQL Stored Procedures: The Ultimate Guide

Hevo

Procedural language elements extend a database’s functionality through User-Defined Functions (UDFs) and Stored Procedures. On a broader level, PostgreSQL supports several procedural languages that help sustain data models that repeat previous values.

If software development were a race, AI wins every time by Colin Eberhardt

Scott Logic

An exploration of the quantitative and qualitative impacts of Generative AI on software development. We’ve undertaken multiple experiments to better understand the impact of Generative AI tools (ChatGPT, Copilot) on developer productivity. Our quantitative results show a 37% improvement in productivity (speed). However, this result misrepresents what it means to be productive as a developer.