Sun. Apr 21, 2024

How does ChatGPT work? As explained by the ChatGPT team.

The Pragmatic Engineer

See a longer version of this article here: Scaling ChatGPT: Five Real-World Engineering Challenges. Sometimes the best explanations of how a technology solution works come from the software engineers who built it. To explain how ChatGPT (and other large language models) operate, I turned to the ChatGPT engineering team. "How does ChatGPT work, under the hood?

Making Email Better With AI At Shortwave

Data Engineering Podcast

Summary: Generative AI has rapidly transformed everything in the technology sector. When Andrew Lee started work on Shortwave, he was focused on making email more productive. When AI started gaining adoption, he realized that there was even more potential for a transformative experience. In this episode he shares the technical challenges that he and his team have overcome in integrating AI into their product, as well as the benefits and features that it provides to their customers.


Apache Spark - What does going from 2.4 to 3.5 get you? by Steve Conway

Scott Logic

Apache Spark has now reached version 3.5.1, but what if you are still using a 2.4.x version? 2.4.8 went out of support in May 2021, so upgrading is strongly advised. If you go through the pain of updating to the latest version, what do you gain? Apache Spark SQL has gone through a major evolution: it now supports ANSI SQL, adds many new features, and makes many performance improvements.


Data Engineering Weekly #168

Data Engineering Weekly

Meta: Introducing Meta Llama 3 - The most capable openly available LLM to date. Meta is taking an interesting approach in the growing LLM market, with an open-source release and distribution across all the leading cloud providers and data platforms. It is exciting to see Llama 3 with 70B parameters on par with GPT-3.5, which I believe has 175B parameters.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.