Sat. Jan 21, 2023 - Fri. Jan 27, 2023

Apple: The only big tech giant going against the job cuts tide

The Pragmatic Engineer

Data News — Week 23.04

Christophe Blefari

My view from the train window (credits). Dear Data News readers, it's a joy every week to write this newsletter, and we are slowly approaching its second birthday. To celebrate this together, I'd love to receive your stories about data; they can be short or long, anonymous or not. This is an open box: just write to me with what you have on your mind and I'll bundle an edition with it.

Do You Need A Modern Data Stack Consultant

Seattle Data Guy

Modern data stack consultants play an important role in companies looking to become data-driven. They help companies design and deploy centralized data sets that are easy to use and reliable. They do so by using cloud-based solutions that help automate data pipelines and processes with less code than in the past. But in order…

Safely Test Your Applications And Analytics With Production Quality Data Using Tonic AI

Data Engineering Podcast

Summary: The most interesting and challenging bugs always happen in production, but recreating them is a constant challenge due to differences in the data that you are working with. Building your own scripts to replicate data from production is time-consuming and error-prone. Tonic is a platform designed to solve the problem of having reliable, production-like data available for developing and testing your software, analytics, and machine learning projects.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

5 Ways to Deal with the Lack of Data in Machine Learning

KDnuggets

Effective solutions exist when you don't have enough data for your models. While there is no perfect approach, five proven ways will get your model to production.

Watch Meta’s engineers discuss optimizing large-scale networks

Engineering at Meta

Managing network solutions amidst a growing scale inherently brings challenges around performance, deployment, and operational complexities. At Meta, we’ve found that these challenges broadly fall into three themes: 1.) Data center networking: Over the past decade, on the physical front, we have seen a rise in vendor-specific hardware that comes with heterogeneous feature and architecture sets (e.g., non-blocking architecture).

More Trending

Building a Life Sciences Knowledge Graph with a Data Lake

databricks

This is a collaborative post from Databricks and wisecube.ai. We thank Vishnu Vettrivel, Founder, and Alex Thomas, Principal Data Scientist, for their contributions.

From Data Collection to Model Deployment: 6 Stages of a Data Science Project

KDnuggets

Here are 6 stages of a novel data science project, from data collection to model in production, backed by research and examples.

Customer Engagement Trends for 2023

Precisely

In today’s hypercompetitive business environment, companies must deliver a standout experience for their target audience. Companies that excel at customer experience (CX) are better at building brand loyalty, increasing total customer lifetime value, and turning occasional customers into brand evangelists. This compelling drive for outstanding CX coincides with an intensive shift toward digitization, personalization, and omnichannel alignment.

Tulip: Modernizing Meta’s data platform

Engineering at Meta

The technical journey discusses the motivations, challenges, and technical solutions employed for warehouse schematization, especially a change to the wire serialization format employed in Meta’s data platform for data interchange related to Warehouse Analytics Logging. Here, we discuss the engineering, scaling, and nontechnical challenges of modernizing Meta’s exabyte-scale data platform by migrating to the new Tulip format.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Improving the customer’s experience via ML-driven payment routing

LinkedIn Engineering

Co-Authors: Xianyun Mao, Stan Xu, Rachit Kumar, Vikas R, Xia Hong, and Divyakumar Menghani. As a LinkedIn member, you can subscribe to LinkedIn Premium on a monthly or annual basis. For our customers, we offer the same option for our Talent Solutions and/or Sales Navigator products. For each, LinkedIn offers subscription renewal payments. These subscription renewal payments used to go through a rule-based routing engine to selected payment gateways, which often resulted in a less-than-o

The ChatGPT Cheat Sheet

KDnuggets

Impress your friends and loved ones by perfecting your ChatGPT prompt engineering game with this incredibly useful resource.

Enforcing Device AuthN & Compliance at Pinterest

Pinterest Engineering

Armen Tashjian | Security Engineer, Corporate Security. Intro: Pinterest has enforced the use of managed and compliant devices in our Okta authentication flow, using a passwordless implementation, so that access to our tools always requires a healthy Pinterest device. Following the phishing-based attacks against our peers in the tech industry, Pinterest decided to take a two-pronged approach to defend against similar attacks.

Work With Large Monorepos With Sparse Checkout Support in Databricks Repos

databricks

For your data-centered workloads, Databricks offers the best-in-class development experience and gives you the tools you need to adhere to code development best.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Understanding and managing ArcGIS Online credits

ArcGIS

ArcGIS Online users and administrators - learn best practices for managing ArcGIS Online credits and get answers to frequently asked questions.

5 Free Data Science Books You Must Read in 2023

KDnuggets

Get your hands on these gems to learn Python, data analytics, machine learning, and deep learning.

Why Column-Aware Metadata Is Key to Automating Data Transformations

Snowflake

Data, data, data. It does seem we are not only surrounded by talk about data, but by the actual data itself. We are collecting data from every nook and cranny of the universe (literally!). IoT devices in every industry; geolocation information on our phones, watches, cars, and every other mobile device; every website or app we access—all are collecting data.

Enabling Operational Analytics on the Databricks Lakehouse Platform With Census Reverse ETL

databricks

This is a collaborative post from Databricks and Census. We thank Parker Rogers, Data Community Advocate, at Census for his contributions. In this.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

Containerizing the Beast – Hadoop NameNodes in Uber’s Infrastructure

Uber Engineering

We recently containerized Hadoop NameNodes and upgraded hardware, improving NameNode RPC queue time from ~200 ms to ~20 ms, a 10x improvement! With this radical change, Uber’s Hadoop customers are happier and admins rest more at night.

Genetic Programming in Python: The Knapsack Problem

KDnuggets

This article explores the knapsack problem. We will discuss why it is difficult to solve traditionally and how genetic programming can help find a "good enough" solution. We will then look at a Python implementation of this solution to test out for ourselves.

United Bank Limited optimizes its data analytics with the Cloudera Data Platform (CDP)

Cloudera

United Bank Limited (UBL), a Pakistani banking and financial services leader, serves over 11 million customers nationwide and operates 1,338 branches and 1,445 ATMs, along with its branchless banking proposition (combination ATM and online banking). In 2022, UBL was awarded Best Bank for Digital Solutions by Asiamoney and Market Leader of Digital Banking in Pakistan by Euromoney, a testament to its track record as the best in digital banking.

Bringing Models and Data Closer Together

databricks

We are excited to announce a new AutoML capability to quickly and easily use Feature Store data to improve model outcomes. AutoML users.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

How to Build a Flexible Customer Support Platform with Kotlin

DoorDash Engineering

As DoorDash’s business has grown with increasing order volumes and through emerging businesses including grocery delivery, our customer support experience also needed to scale up efficiently. The legacy support application that DoorDash had built to issue credits and refunds was created only to address the original food delivery service. It couldn’t handle the needs of our new verticals.

Setup and use JupyterHub (TLJH) on AWS EC2

KDnuggets

JupyterHub is a multi-user, container-friendly version of the Jupyter Notebook. However, it can be difficult to set up. This blog post will make you less likely to run into issues in this 15+ step process.

Streaming Big Data Files from Cloud Storage

Towards Data Science

Methods for efficient consumption of large files. (Photo by Aron Visuals on Unsplash.) Working with very large files can pose challenges to application developers related to efficient resource management and runtime performance. Text file editors, for example, can be divided into those that can handle large files, and those that make your CPU choke, make your PC freeze, and make you want to scream.

Best Practices and Guidance for Cloud Engineers to Deploy Databricks on AWS: Part 2

databricks

This is part two of a three-part series on Best Practices and Guidance for Cloud Engineers to Deploy Databricks on AWS. You can.

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

Introduction to Synthetic Aperture Radar

ArcGIS

This blog will answer questions such as “What is SAR?”, “What can SAR be used for?”, and “How is SAR beneficial?”.

Top Posts January 16-22: ChatGPT as a Python Programming Assistant

KDnuggets

ChatGPT as a Python Programming Assistant • ChatGPT: Everything You Need to Know • Explainable AI: 10 Python Libraries for Demystifying Your Model’s …

Optimizing Kafka Clients: A Hands-On Guide

Rock the JVM

This article is brought to you by Giannis Polyzos. Giannis is a proud alumnus of Rock the JVM, working as a Solutions Architect with a focus on Event Streaming and Stream Processing Systems. Enter Giannis: 1. Introduction. Apache Kafka is a well-known event streaming platform used in many organizations worldwide. It is used as the backbone of many data infrastructures, so it’s important to understand how to use it efficiently.

A Gousto use case: how Databricks helps create personalized recipe recommendations for customers at scale

databricks

“This blog is authored by Hai Nguyen, Senior Data Scientist at Gousto” Gousto is the UK's best value recipe box, serving up more rec.

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.