Tue. Dec 19, 2023

5 Cheap Books to Master Data Science

KDnuggets

Much data science learning material is locked up in expensive books. These cheap books can bolster your skills without draining your savings.

How Meta built the infrastructure for Threads

Engineering at Meta

On July 5, 2023, Meta launched Threads, the newest product in our family of apps, to unprecedented success: it garnered over 100 million sign-ups in its first five days. A small, nimble team of engineers built Threads over the course of only five months of technical work. While the app’s production launch had been under consideration for some time, the business finally made the decision and told the infrastructure teams to prepare for launch with only two days’ advance notice.

AI-Automated Cybersecurity: What to Automate?

KDnuggets

Soon AI will become embedded into daily business processes, including cybersecurity controls. The author explains how to assess which processes make sense to automate.

Make this AI-inspired topo landscape please

ArcGIS

Here's how to fake an isometric 3D topo terrain in 2D! And stuff.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Order is king for the performance

Waitingforcode

Even though today's data processing frameworks and data stores have smart query planners, they don't take away our responsibility to design the job logic correctly.

AI debugging at Meta with HawkEye

Engineering at Meta

HawkEye is the powerful toolkit used internally at Meta for monitoring, observability, and debuggability of the end-to-end machine learning (ML) workflow that powers ML-based products. HawkEye supports recommendation and ranking models across several products at Meta. Over the past two years, it has facilitated order-of-magnitude improvements in the time spent debugging production issues.

More Trending

Databricks Data Intelligence Platform for Retail comes to NRF 2024

databricks

Request a meeting with Databricks executives/thought leaders at NRF! Each January, thousands of leaders from retailers around the globe gather at Javits Center.

Top 6 Episodes of The Data Chief Podcast: 2023

ThoughtSpot

2023 has been a year of breakthrough innovation for many, and a deer-in-headlights moment for others. I keep flashing back to the 90s when the Internet created new businesses and destroyed others—LLMs are doing the same, only with more velocity. From CDAOs to VCs alike, the rate of creative destruction is faster, but there is also an intense focus on value.

Tips for training data preparation for object detection models

ArcGIS

We will dive into our best practices for preparing and using training samples for object detection models.

React Architecture Pattern: Implementation + Best Practices

Knowledge Hut

React comes with so many features that help you develop fantastic User Interfaces. One cool thing with React is that it doesn't force you to use a predefined architectural pattern like other JavaScript frameworks do. That is to say, React allows you to determine the structure of your application by yourself. An architectural pattern in front-end development is the blueprint of the User Interface.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Optimizing the Value of AI Solutions for the Public Sector

Cloudera

Without a doubt, 2023 has shaped up to be generative AI’s breakout year. Less than 12 months after the introduction of generative AI large language models such as ChatGPT and PaLM, image generators like DALL-E, Midjourney, and Stable Diffusion, and code generation tools like OpenAI Codex and GitHub Copilot, organizations across every industry, including government, are beginning to use generative AI regularly to increase creativity and productivity.

Deployment of Exabyte-Backed Big Data Components

LinkedIn Engineering

Co-authors: Arjun Mohnot, Jenchang Ho, Anthony Quigley, Xing Lin, Anil Alluri, Michael Kuchenbecker

LinkedIn operates one of the world’s largest Apache Hadoop big data clusters. These clusters are the backbone for storing and processing extensive data volumes, empowering us to deliver essential features and services to members, such as personalized recommendations, enhanced search functionality, and valuable insights.

LiveRamp Customers Build ‘Foundation of Identity’ With Snowflake Native Apps

Snowflake

The best marketing is truly data-driven, creating powerful product promotions and offers through an understanding of customer needs and preferences. But for many organizations, building this understanding is more akin to solving an ever-growing jigsaw puzzle (with no easy edge pieces!) than reading data insights from a beautiful dashboard. Every customer store interaction, online transaction, form fill, event participation, chatbot response, text request, like, review, complaint, and click creat

Build AI Apps with Amazon PartyRock and Amazon Bedrock

Workfall

Reading time: 16 minutes. Introducing Amazon PartyRock, an innovative platform that redefines the landscape of app exploration and creation. For Part 1 of this blog, refer here. In this transformative hands-on implementation, we will guide you through the PartyRock playground, an exciting journey that encompasses navigating its free features, signing in to unlock personalized experiences, experimenting with suggested apps, exploring a myriad of pre-built applications, and culminating in the creat

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Running Airflow DAG Only If Another DAG Is Successful

Towards Data Science

Using Airflow sensors to control the execution of DAGs on a different schedule.

How to integrate with dbt

dbt Developer Hub

Over the course of my three years running the Partner Engineering team at dbt Labs, the most common question I've been asked is, "How do we integrate with dbt?" Because those conversations often start out at the same place, I decided to create this guide so I’m no longer the blocker to fundamental information. This also allows us to skip the intro and get to the fun conversations so much faster, like what a joint solution for our customers would look like.

A Blueprint for a Real-World Recommendation System

Rockset

Overview: In this guide, we will understand the blueprint of any modern recommendation system; dive into a detailed analysis of each stage within the blueprint; discuss infrastructure challenges associated with each stage; cover special cases within the stages of the recommendation system blueprint; get introduced to some storage considerations for recommendation systems; and finally, end with what the future holds for recommendation systems. Introduction: In a recent insightful talk at Index confe

Toronto’s Data Science Renaissance: A Tale of Two Markets

WeCloudData

The Recap: We continue with the unfolding saga of the data science jobs landscape, this time for the month of November. In my previous blog, I continued to compare Toronto’s data science jobs market against the rest of North America, and after a strong September, it didn’t look especially good for Toronto. It looked as […]

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

ETL for Snowflake: Why You Need It and How to Get Started

Ascend.io

If you’re working with Snowflake or just starting to explore its capabilities, you might be wondering: Do I really need ETL for Snowflake? Is it possible to rely solely on Snowflake’s own features, or is there a strong case for bringing ETL into the mix? If so, where do I get started? In this article, we’re diving into these questions to clear up any confusion.

Digitizing Customer Experience in the Travel Industry

Confluent

Legacy data systems often power travel experiences, such as on cruise lines, but modern customers want real-time experiences online. Here's how to think about data integration with data streaming for travelers.

Conscientious Computing - Podcasts: What we are listening to right now! by Charlotte Hayes

Scott Logic

Many of us love a good podcast, so I reached out to our project team to see what they were listening to in the tech and sustainability space. Here are their recommendations. Environment Variables: If I were to pick one podcast to start with, then this would be the one. Published by the Green Software Foundation, each episode aims to bring listeners the latest news regarding how to reduce the emissions of software and how the industry is dealing with its own environmental impact.

2023 in a nutshell —ride along!

Picnic Engineering

With operations in full swing to pull us through the busiest time of the year, the code slush we apply in some of our teams allows us to take a step back and reflect on another exciting year in the crazy little groceries roller coaster we call Picnic. In this blog, we’d like to give you a glimpse into some of the major developments in Picnic Tech in 2023.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.

15 Essential Java Full Stack Developer Skills in 2024

Knowledge Hut

Java, as the language of digital technology, is one of the most popular and robust of all software programming languages. It is ideal for cross-platform applications because it is a compiled language with object code that can work across more than one machine or processor. All programming is done using coding languages. Java, like Python or JavaScript, is a coding language that is highly in demand.
