Sat. Jan 21, 2023 - Fri. Jan 27, 2023

Apple: The only big tech giant going against the job cuts tide

The Pragmatic Engineer

The ChatGPT Cheat Sheet

KDnuggets

Impress your friends and loved ones by perfecting your ChatGPT prompt engineering game with this incredibly useful resource.

Watch Meta’s engineers discuss optimizing large-scale networks

Engineering at Meta

Managing network solutions amidst a growing scale inherently brings challenges around performance, deployment, and operational complexities. At Meta, we’ve found that these challenges broadly fall into three themes: 1.) Data center networking: Over the past decade, on the physical front, we have seen a rise in vendor-specific hardware that comes with heterogeneous feature and architecture sets (e.g., non-blocking architecture).

Building a Life Sciences Knowledge Graph with a Data Lake

databricks

This is a collaborative post from Databricks and wisecube.ai. We thank Vishnu Vettrivel, Founder, and Alex Thomas, Principal Data Scientist, for their contributions.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Data News — Week 23.04

Christophe Blefari

Dear Data News readers, it's a joy every week to write this newsletter, and we are slowly approaching its second birthday. To celebrate this together, I'd love to receive your stories about data, short or long, anonymous or not. This is an open box: just write to me with what you have on your mind and I'll bundle an edition with it.

5 Ways to Deal with the Lack of Data in Machine Learning

KDnuggets

Effective solutions exist when you don't have enough data for your models. While there is no perfect approach, five proven ways will get your model to production.

More Trending

Safely Test Your Applications And Analytics With Production Quality Data Using Tonic AI

Data Engineering Podcast

The most interesting and challenging bugs always happen in production, but recreating them is a constant challenge due to differences in the data that you are working with. Building your own scripts to replicate data from production is time-consuming and error-prone. Tonic is a platform designed to solve the problem of having reliable, production-like data available for developing and testing your software, analytics, and machine learning projects.

Scalable Annotation Service - Marken

Netflix Tech

By Varun Sekhri and Meenakshi Jindal. At Netflix, we have hundreds of microservices, each with its own data models or entities. For example, we have a service that stores a movie entity’s metadata or a service that stores metadata about images. All of these services at a later point want to annotate their objects or entities.

5 Free Data Science Books You Must Read in 2023

KDnuggets

Get your hands on these gems to learn Python, data analytics, machine learning, and deep learning.

Tulip: Modernizing Meta’s data platform

Engineering at Meta

The technical journey discusses the motivations, challenges, and technical solutions employed for warehouse schematization, especially a change to the wire serialization format employed in Meta’s data platform for data interchange related to Warehouse Analytics Logging. Here, we discuss the engineering, scaling, and nontechnical challenges of modernizing Meta’s exabyte-scale data platform by migrating to the new Tulip format.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Customer Engagement Trends for 2023

Precisely

In today’s hypercompetitive business environment, companies must deliver a standout experience for their target audience. Companies that excel at customer experience (CX) are better at building brand loyalty, increasing total customer lifetime value, and turning occasional customers into brand evangelists. This compelling drive for outstanding CX coincides with an intensive shift toward digitization, personalization, and omnichannel alignment.

Introduction to Synthetic Aperture Radar

ArcGIS

This blog will answer questions such as “What is SAR?”, “What can SAR be used for?”, and “How is SAR beneficial?”.

From Data Collection to Model Deployment: 6 Stages of a Data Science Project

KDnuggets

Here are 6 stages of a novel data science project, from data collection to model in production, backed by research and examples.

A Gousto use case: how Databricks helps create personalized recipe recommendations for customers at scale

databricks

“This blog is authored by Hai Nguyen, Senior Data Scientist at Gousto” Gousto is the UK's best value recipe box, serving up more rec.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Improving the customer’s experience via ML-driven payment routing

LinkedIn Engineering

Co-Authors: Xianyun Mao, Stan Xu, Rachit Kumar, Vikas R, Xia Hong, and Divyakumar Menghani. As a LinkedIn member, you can subscribe to LinkedIn Premium on a monthly or annual basis. For our customers, we offer the same option for our Talent Solutions and/or Sales Navigator products. For each, LinkedIn offers subscription renewal payments. These subscription renewal payments used to go through a rule-based routing engine to selected payment gateways, which often resulted in a less-than-o

Understanding and managing ArcGIS Online credits

ArcGIS

ArcGIS Online users and administrators - learn best practices for managing ArcGIS Online credits and get answers to frequently asked questions.

7 Best Libraries for Machine Learning Explained

KDnuggets

Learn about machine learning libraries for building and deploying machine learning models.

Work With Large Monorepos With Sparse Checkout Support in Databricks Repos

databricks

For your data-centered workloads, Databricks offers the best-in-class development experience and gives you the tools you need to adhere to code development best.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Enforcing Device AuthN & Compliance at Pinterest

Pinterest Engineering

Armen Tashjian | Security Engineer, Corporate Security. Pinterest has enforced the use of managed and compliant devices in our Okta authentication flow, using a passwordless implementation, so that access to our tools always requires a healthy Pinterest device. Following the phishing-based attacks against our peers in the tech industry, Pinterest decided to take a two-pronged approach to defend against similar attacks.

One Minute Map Hacks: 71-75

ArcGIS

Another five hacks in an endless stream of one-minute how-to videos.

Multi-modal deep learning in less than 15 lines of code

KDnuggets

Learn how to easily build, iterate and deploy a state-of-the-art deep learning model to predict customer ratings with a declarative approach to machine learning.

Best Practices and Guidance for Cloud Engineers to Deploy Databricks on AWS: Part 2

databricks

This is part two of a three-part series in Best Practices and Guidance for Cloud Engineers to deploy Databricks on AWS. You can.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

United Bank Limited optimizes its data analytics with the Cloudera Data Platform (CDP)

Cloudera

United Bank Limited (UBL), a Pakistani banking and financial services leader, serves over 11 million customers nationwide and operates 1,338 branches and 1,445 ATMs, along with its branchless banking proposition (combination ATM and online banking). In 2022, UBL was awarded Best Bank for Digital Solutions by Asiamoney and Market Leader of Digital Banking in Pakistan by Euromoney, a testament to its track record as the best in digital banking.

Why Column-Aware Metadata Is Key to Automating Data Transformations

Snowflake

Data, data, data. It does seem we are not only surrounded by talk about data, but by the actual data itself. We are collecting data from every nook and cranny of the universe (literally!). IoT devices in every industry; geolocation information on our phones, watches, cars, and every other mobile device; every website or app we access—all are collecting data.

7 SMOTE Variations for Oversampling

KDnuggets

The best oversampling techniques for imbalanced data.

Bringing Models and Data Closer Together

databricks

We are excited to announce a new AutoML capability to quickly and easily use Feature Store data to improve model outcomes. AutoML users.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Five tips for developing native applications using web maps

ArcGIS

Find out how including web maps in your development workflows using the ArcGIS Maps SDKs for Native Apps can increase your productivity!

Containerizing the Beast – Hadoop NameNodes in Uber’s Infrastructure

Uber Engineering

We recently containerized Hadoop NameNodes and upgraded hardware, improving NameNode RPC queue time from ~200 ms to ~20 ms, a 10x improvement! With this radical change, Uber’s Hadoop customers are happier and admins rest more at night.

Top 8 Data Science Slack Communities to Join in 2023

KDnuggets

Take your Data Science journey to the next level by joining these Slack communities in 2023.

Enabling Operational Analytics on the Databricks Lakehouse Platform With Census Reverse ETL

databricks

This is a collaborative post from Databricks and Census. We thank Parker Rogers, Data Community Advocate, at Census for his contributions. In this.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.