Fri. Oct 25, 2024

10 Essential Python Libraries for Data Science in 2024

KDnuggets

The richness of Python’s ecosystem has one downside: it makes it difficult to decide which libraries are the best for your needs. This article is an attempt to amend this by suggesting ten (and some more, as a bonus) libraries that are an absolute must in data science.

Tales from the Pipeline: 4 Data Horror Stories To Keep You Up at Night

Monte Carlo

“As he lay awake in his Bay Area apartment, the data leader couldn’t shake the feeling that something wasn’t right. He tried to shut his eyes—to force them closed—but the more the data engineer tried, the more convinced he became. Suddenly, a light appeared from the darkness. It was a Slack from the CEO. She was working late. And the data…it couldn’t be…it looked wrong.

Building Interactive Data Science Applications with Python

KDnuggets

Use Python to build engaging, interactive applications where users can pass in an input, get feedback, and make use of multimedia elements such as images, videos, and audio.

Unlocking FHIR for Data and AI in a Meaningful Way

databricks

Discover how the Databricks and XponentL partnership is allowing customers to unlock their FHIR needs. Learn more about dbignite. Imagine you’re feeling.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Diff Authoring Time: Measuring developer productivity at Meta

Engineering at Meta

At Meta, we’re always looking for ways to enhance the productivity of our engineers and developers. But how exactly do you measure developer productivity? On this episode of the Meta Tech Podcast, Pascal Hartig (@passy) sits down with Sarita and Moritz, two engineers at Meta who have been working on Diff Authoring Time (DAT), a method for measuring how long it takes to submit changes to a codebase.

Shift Left: Headless Data Architecture, Part 2

Confluent

Proceed further by establishing your own headless data architecture—formalizing a data access layer at the center of your org, accessible by both analytics and operations.

The Curse of Conway and the Data Space

Towards Data Science

How modern trends can be traced back to Conway’s Law. Image by the author (generated by Midjourney, touched up with Krita). This article was originally posted on my blog [link]. The article was triggered by and riffs on the “Beware of silo specialisation” section of Bernd Wessely’s post Data Architecture: Lessons Learned. It brings together a few trends I am seeing plus my own opinions after twenty years’ experience working on both sides of the software/data team divide.