Thu. Oct 24, 2024

Skip Lines of CSV files with DuckDB and Polars

Confessions of a Data Guy

There are some things you don’t need until you need them. I ran into that situation recently when I needed to process some CSV / flat files on short notice. At first, it appeared to be easy, but then I realized, as usual, there was a little monkey wrench thrown into the middle of it. It is […]

5 Free Courses to Understand Machine Learning Algorithms

KDnuggets

To help you navigate this complex subject, we’ve compiled five free online courses that will give you a solid foundation in machine learning algorithms.

Open Source Security at Databricks

databricks

The Databricks Product Security team is deeply committed to ensuring the security and integrity of its products, which are built on top of.

How to Use Hugging Face Transformers for Text-to-Speech Applications

KDnuggets

To use Hugging Face Transformers for Text-to-Speech, load a pre-trained TTS model and input the text you want to convert to speech. The model will generate audio, which you can save or play directly.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Bringing Together Data Intelligence and Evaluation Intelligence: Databricks Ventures Invests in Galileo

databricks

Our customers say their biggest challenge in getting Generative AI from pilot to production is the "measurement problem." It's hard to.

Fine-tuning Llama 3.2 Using Unsloth

KDnuggets

Unsloth is a user-friendly framework, offering fast inference and fine-tuning for large language models. It also supports saving models in multiple formats, including vLLM and GGUF.

More Trending

Building a Cost-Optimized Chatbot with Semantic Caching

databricks

Chatbots are becoming valuable tools for businesses, helping to improve efficiency and support employees. By sifting through troves of company data and.

Free AI-Powered Interview Preparation Tool by 365 Data Science

KDnuggets

InterviewAce is now available for free to all 365 Data Science users, marking a significant addition to the platform's suite of career development tools.

Bloor InBrief Report

Striim

Learn why Striim was awarded Bloor's 2024 Gold Mutable Award for the data in motion category.

Building an Assignment Algorithm - Episode 1 / 3 by Josh Warren

Scott Logic

Last year, our team was working on an app that organised conferences. Our most interesting mission, in my opinion, was to design and build an algorithm that assigned talks to attendees according to their choices. This algorithm would save organisers the time, human error and brain power required to ensure all attendees are fairly allocated. After having built and run our algorithm, we achieved results that improved the fairness of previously time-costly hand-calculated assignments by 30% (accord

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

From Creativity to Analytics: Gen AI’s Future in Adtech and Martech

Snowflake

Adtech and martech companies are engaged in a fierce battle for audience attention. Customers are bombarded with thousands of ads and marketing messages every day, and the average attention span is plummeting, so it’s no wonder they tune out — or turn on ad blockers. But it’s not all doom and gloom. The global adtech market is expected to grow at a rate of 22.4% through 2030, and martech’s projected growth rate is 18.5% through 2032.

Unlock the Value of Sensitive Data with Differential Privacy

Snowflake

The Snowflake AI Data Cloud has democratized data for thousands of customers, removing data silos and powering data sharing and collaboration use cases. Many customers have been able to unlock enormous value from their data with Snowflake, including safely collaborating on sensitive data using Snowflake Data Clean Rooms and Data Governance features.