Fri. Mar 08, 2024

Getting Started With Claude 3 Opus That Just Destroyed GPT-4 and Gemini

KDnuggets

Anthropic has released a new series of large language models and an updated Python API to access them.

Python 156

Never Put Databricks Notebooks in Production

Confessions of a Data Guy

Recently an Architect at Databricks recommended that people use Notebooks for production workloads. This is a very bad and horrible idea: the compute (All Purpose Clusters) is very expensive for most people, and it leads to horrible development practices. It set off a firestorm on LinkedIn when I commented that people SHOULD NOT follow this advice.

WTF is Regularization and What is it For?

KDnuggets

This article explains the concept of regularization and its significance in machine learning and deep learning. It discusses how regularization can enhance the performance of linear models, as well as how it can be applied to improve the performance of deep learning models.
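As a rough illustration of the idea for linear models, here is a minimal sketch of L2 (ridge) regularization on a one-feature regression without an intercept. The function name `fit_slope`, the sample data, and the penalty value are hypothetical and are not taken from the article; the sketch only shows how adding a penalty shrinks the fitted weight toward zero.

```python
def fit_slope(xs, ys, l2_penalty=0.0):
    """Closed-form slope w for y ≈ w * x (no intercept).

    Minimizes sum((y - w*x)^2) + l2_penalty * w^2, which gives
    w = sum(x*y) / (sum(x^2) + l2_penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + l2_penalty)


# Hypothetical toy data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

unregularized = fit_slope(xs, ys)                  # ordinary least squares
regularized = fit_slope(xs, ys, l2_penalty=10.0)   # penalized weight is smaller

print(unregularized, regularized)
```

A larger `l2_penalty` pulls the weight further toward zero, trading some fit on the training data for a simpler model.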

IT 141

DuckDB has MAJOR Problems! OOM Errors.

Confessions of a Data Guy

I recently ran a challenge, and the results were clear: DuckDB CANNOT handle larger-than-memory datasets. OOM errors. See link below for more details. … DuckDB vs Polars – Thunderdome. 16GB on 4GB machine Challenge.

Datasets 130

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Data News — Recommendations

Christophe Blefari

We all need recommendations. When I started writing this newsletter nearly three years ago, I never imagined that the words I write on my keyboard would take such an important place in my life. All the interactions I have with you, whether online or offline, are always amazing and give me wings. Today I want to introduce a new feature in the Data News galaxy.

Data 130

5 Redshift SQL Functions You Need to Know

Towards Data Science

With code examples on how to use them.

SQL 73

Snowflake Performance Optimization: Unleashing the Power of Table Clustering for Superior Results

RandomTrees

Snowflake, the cloud data platform, has gained immense popularity for its flexibility, scalability, and ease of use. As organizations increasingly rely on Snowflake for their data warehousing needs, optimizing performance becomes paramount. One powerful feature that often goes underutilized is table clustering. In this article, we’ll explore the concept of table clustering and how it can significantly enhance the performance of your Snowflake data warehouse.