Sat. Mar 23, 2024 - Fri. Mar 29, 2024

Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+

Data Engineering Podcast

Summary: A core differentiator of Dagster in the ecosystem of data orchestration is its focus on software-defined assets as a means of building declarative workflows. With the launch of Dagster+ as the redesigned commercial companion to the open source project, the team is investing in that capability with a suite of new features. In this episode Pete Hunt, CEO of Dagster Labs, outlines these new capabilities, how they reduce the burden on data teams, and the increased collaboration that they ena

A Collection Of Free Data Science Courses From Harvard, Stanford, MIT, Cornell, and Berkeley

KDnuggets

Learn everything about data science by exploring our curated collection of free courses from top universities, covering essential topics from math and programming to machine learning, and mastering the nine steps to become a job-ready data scientist.

Announcing DBRX: A new standard for efficient open source LLMs

databricks

Databricks’ mission is to deliver data intelligence to every enterprise by allowing organizations to understand and use their unique data to build their.

Schema tracking in Delta Lake

Waitingforcode

Streaming Delta tables is slightly different from streaming native streaming sources, such as Apache Kafka topics. One of the significant differences is schema enforcement: the job fails when the schema of the streamed table changes.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Snowflake Invests in Observe to Expand Observability in the Data Cloud

Snowflake

As organizations seek to drive more value from their data, observability plays a vital role in ensuring the performance, security and reliability of applications and pipelines while helping to reduce costs. At Snowflake, we aim to provide developers and engineers with the best possible observability experience to monitor and manage their Snowflake environment.

10 GitHub Repositories to Master MLOps

KDnuggets

Begin your MLOps journey with these comprehensive free resources available on GitHub.

More Trending

Moderating Inappropriate Video Content at Yelp

Yelp Engineering

One of Yelp’s top priorities is the trust and safety of our users. Yelp’s platform is most well-known for its reviews, and its moderation practices have been recognised in academic research for mitigating misinformation and building consumer trust. In addition to reviews, Yelp’s Trust and Safety team takes significant measures when it comes to protecting its users from inappropriate material posted through other content types.

Bringing HDR photo support to Instagram and Threads

Engineering at Meta

Meta’s family of apps serves trillions of image download requests every day. And if you’re into high-quality images, you’ve probably noticed that Instagram and Threads have added support for high dynamic range (HDR) photos. Now people on Threads and Instagram can upload and share images that are more true-to-life, with the full color and range their device is capable of capturing.

5 Free Google Courses to Become a Software Engineer

KDnuggets

Want to become a software engineer? Make it happen with these free courses and guides from Google.

Delivering the Next Generation of Consumer Experiences: Databricks and Adobe Announce Strategic Partnership

databricks

By Steve Sobel - Global Industry Leader; Communications, Media & Entertainment. Today, Databricks and Adobe are excited to announce a strategic partnership focused.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Top UI UX Trends to Know in 2024

Knowledge Hut

The process of developing digital assets that are both aesthetically pleasing and simple to use is known as user interface/user experience design, or UI/UX design. While UX designers concentrate on the user's journey and how they engage with the product, UI designers are more concerned with the appearance and feel of a product. Because of digital innovation and the dynamic needs of consumers, the field of UI/UX design is always developing.

Phone Number Masking for Yelp Services Projects

Yelp Engineering

In this blog post, we highlight how phone number masking helps build consumer trust in the services marketplace at Yelp, decreases the friction in communication with service professionals, and allows for seamless switching between the Yelp app and a user’s phone. We present a high level overview of our in-house phone masking system and dive into the details of the engineering challenge of optimizing the usage of proxy phone number resources at Yelp’s scale.

The Promise of Edge AI and Approaches for Effective Adoption

KDnuggets

Organizations are adopting edge AI for real-time decision-making using efficient and cost-effective methods such as model quantization, multimodal databases, and distributed inferencing.

Announcing the State Reader API: The New "Statestore" Data Source

databricks

Databricks Runtime 14.3 includes a new capability that allows users to access and analyze Structured Streaming's internal state data: the State Reader.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Setting Up Kafka Multi-Tenancy 

DoorDash Engineering

Real-time event processing is a critical component of a distributed system’s scalability. At DoorDash, we rely on message queue systems based on Kafka to handle billions of real-time events. One of the challenges we face, however, is how to properly validate the system before going live. Traditionally, an isolated environment such as staging is used to validate new features.

Data Architecture and Strategy in the AI Era

Cloudera

At a time when AI is exploding in popularity and finding its way into nearly every facet of business operations, data has arguably never been more valuable. More recently, that value has been made clear by the emergence of AI-powered technologies like generative AI (GenAI) and the use of Large Language Models (LLMs). But, even with the backdrop of an AI-dominated future, many organizations still find themselves struggling with everything from managing data volumes and complexity to security conc

7 Steps to Mastering Large Language Model Fine-tuning

KDnuggets

From theory to practice, learn how to enhance your NLP projects with these 7 simple steps.

Announcing the General Availability of Databricks Notebooks on SQL Warehouses

databricks

Today, we are excited to announce the general availability of Databricks Notebooks on SQL warehouses. Databricks SQL warehouses are SQL-optimized compute that provide.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Building Databricks Data Pipelines 101

Confessions of a Data Guy

Have you ever wondered at a high level what it’s like to build production-level data pipelines on Databricks? What does it look like, and what tools do you use?

Four Data Engineering Projects That Look Great on your CV

Towards Data Science

Data pipelines that would turn you into a decorated data professional.

Become a Business Intelligence Analyst in Less Than 6 Months

KDnuggets

Ready to become a business intelligence analyst right here, right now?

Managed Sportlogiq to Databricks Data Ingestion Pipelines for NHL Teams: A Game-Changing Alliance

databricks

Overview: In the competitive world of professional hockey, NHL teams are always seeking to optimize their performance. Advanced analytics has become increasingly important.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

How To Build and Open Source PYPI Python Package

Confessions of a Data Guy

Ever wondered how to build an end-to-end project for an open source Python package that gets published to PyPI? I built out lakescum, an open-source package to help with querying Databricks Unity Catalog Delta Lake tables with Polars, DuckDB, or PyArrow.

Snowflake Data Clean Rooms: Securely Collaborate to Unlock Insights and Value

Snowflake

In December 2023, Snowflake announced its acquisition of data clean room technology provider Samooha. Samooha’s intuitive UI and focus on reducing the complexity of sharing data led to it being named one of the most innovative data science companies of 2024 by Fast Company. Now, Samooha’s offering is integrated into Snowflake and launched as Snowflake Data Clean Rooms, a Snowflake Native App on Snowflake Marketplace, generally available to customers in AWS East, AWS West and Azure West.

Mastering Python for Data Science: Beyond the Basics

KDnuggets

This article serves as a detailed guide on how to master advanced Python techniques for data science. It covers topics such as efficient data manipulation with Pandas, parallel processing with Python, and how to turn models into web services.

PySpark in 2023: A Year in Review

databricks

With the releases of Apache Spark 3.4 and 3.5 in 2023, we focused heavily on improving PySpark performance, flexibility, and ease of use.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

#ClouderaLife Employee Spotlight: Jess Hohn-Cabana

Cloudera

Meet Cloudera’s new Senior Vice President of Global Communications, Jess Hohn-Cabana. In this Employee Spotlight, we’ll get to know more about Jess, her new role, and her recent award win at the 2024 Ragan Top Women in Communications Awards. Get to Know Jess: A Seasoned Leader in Tech Communications and Branding. Coming to Cloudera with nearly three decades of experience in tech communications and branding, Jess is a leader and a visionary on all things storytelling.

How Advertising, Media & Entertainment and Manufacturing Companies Are Accelerating Data, Apps and AI Strategy in the Data Cloud

Snowflake

In 2023, we held our first Accelerate event to explore industry trends, track data and technology innovations in financial services, and lay out data strategy case studies for the industry. This year, we are expanding to five industry events featuring leaders sharing insights relevant to advertising, media and entertainment; manufacturing; healthcare and life sciences; financial services; and retail and consumer goods.

The Art of Effective Prompt Engineering with Free Courses and Certifications

KDnuggets

Have you ever asked yourself ‘Am I using these generative AI tools correctly?’

Deloitte Data as a Service for Banking: A Modern Data Solution for Banks and Capital Markets Institutions

databricks

As new Generative AI capabilities continue to emerge with heightened customer expectations, data modernization and migration to the cloud have become critical success.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.