Sat.Mar 09, 2024 - Fri.Mar 15, 2024

Data News — Week 24.11

Christophe Blefari

I hope this e-mail finds you well, wherever you are. I'd like to thank you for the excellent comments you sent me last week after the publication of the first version of the Recommendations. This is just the beginning! This week I've added a subscribe button to the Recommendations page so you can opt in to the weekly recommendation email, sent every Tuesday.

Metadata 272

The “10x engineer:” 50 years ago and now

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover one out of five topics from today’s subscriber-only article What Changed in 50 Years of Computing.

Version Your Data Lakehouse Like Your Software With Nessie

Data Engineering Podcast

Summary: Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges the gap between data lake and warehouse capabilities is the catalog. The primary purpose of the catalog is to inform the query engine of what data exists and where, but the Nessie project aims to go beyond that simple utility.

Data Lake 147

Processing time trigger, to be or not to be?

Waitingforcode

That's the question. Without a processing time trigger, micro-batches are triggered more reactively, but that cannot be considered the single best practice. Let's see why.

Process 130

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Building Meta’s GenAI Infrastructure

Engineering at Meta

Marking a major investment in Meta’s AI future, we are announcing two 24k GPU clusters. We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads. We use this cluster design for Llama 3 training. We are strongly committed to open compute and open source.

Building 145

Apache Druid’s Architecture – How Druid Processes Data In Real Time At Scale

Seattle Data Guy

Recently, I wrote an article diving into what Druid is and which companies are using it. Now I wanted to do a deeper dive into Apache Druid’s architecture. Apache Druid has several unique features that allow it to be used as a real-time OLAP. Everything from its various nodes and processes that each have unique…

More Trending

Statistics for Machine Learning: What you need to know to become a certified expert

KDnuggets

Ready to become a SAS Certified Specialist in Statistics for Machine Learning? Here’s everything you need to know about the recently released certification from SAS.

Announcing {arcgis}, an R package for ArcGIS Location Services

ArcGIS

A new R package created by the R-ArcGIS Bridge team enables integration with ArcGIS location services, enhancing their combined powers.

Don’t Be So Smart

Confessions of a Data Guy

Most Software Engineers think of themselves as too smart. They think they are the best and brightest coder alive or that has ever lived. Doing so, they stunt themselves from becoming Senior Engineers and become hard to work with, the nightmare of the PR process. You don’t need to be the smartest person in the […]

Keeping track of engineering-wide goals and migrations

Yelp Engineering

What is Engineering Effectiveness Metrics (EE Metrics)? EE Metrics was envisioned as a hub that helps teams manage their technical debt. EE Metrics provides every team with a detailed web page that contains information about technical debt that needs to be addressed. It also serves as a platform to highlight top engineering initiatives at the organization level.

Prepare Now: 2025’s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

AI Regulation is Rolling Out…And the Data Intelligence Platform is Here to Help

databricks

Policymakers around the world are paying increased attention to artificial intelligence. The world’s most comprehensive AI regulation to date was just passed by.

Data 122

In the spotlight with Rahul Mani, ThoughtSpot’s Selfless Excellence champion

ThoughtSpot

This is part of our ongoing spotlight series which highlights ThoughtSpot’s quarterly Selfless Excellence champion. ThoughtSpot's culture is rooted in our core value of Selfless Excellence. This means we consider our teammates, customers, and society at large ahead of our own personal wins without the distraction of office politics. Our common ground ensures that we are moving together with intention and integrity in everything we do—when we run the business, plan our go-to-market strategy,

Data Modeling Is Easy

Confessions of a Data Guy

When you’ve been data modeling as long as I have, it gets to be the same old … same old. People make data modeling harder than it has to be. There is a lot of jargon that gets thrown around … third-normal-form, OLAP, OLTP … I give you the 3-4 basics that are at the […]

Data 100

5 Ways To Use LLMs On Your Laptop

KDnuggets

Run large language models on your local PC for customized AI capabilities with more control, privacy, and personalization.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Implementing LLM Guardrails for Safe and Responsible Generative AI Deployment on Databricks

databricks

Introduction: Let’s explore a common scenario – your team is eager to leverage open source LLMs to build chatbots for customer support interactions.

Building 124

SNP Unlocks SAP Data for Advanced Analytics with Its Snowflake Native App

Snowflake

As a cohesive ERP solution, SAP is often one of the largest data resources in an organization, containing everything from financial and transactional data to master information about customers, vendors, materials, facilities, planning and even HR. But SAP has limited analytics capabilities, and directly ingesting SAP data into Snowflake can present a challenge.

IT 95

Is Devin Going To Take My Software Engineering Job?

Confessions of a Data Guy

Unless you’ve been hiding under a rock, you’ve probably heard the hubbub over Devin, the new AI Software Engineer that is going to take your job. While this is a genius piece of marketing … it’s a bunch of crud. Never fear, you are in no more danger of losing your job in Software than when […]

Boost Your Data Science Skills: The Essential SQL Certifications You Need

KDnuggets

If you are a data scientist who works with large amounts of data and hasn’t learned SQL yet - now might be the time.

SQL 122

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

Announcing the General Availability of Databricks Feature Serving

databricks

Today, we are excited to announce the general availability of Feature Serving. Features play a pivotal role in AI Applications, typically requiring considerable.

Improving ETAs with Multi-Task Models, Deep Learning, and Probabilistic Forecasts

DoorDash Engineering

The DoorDash ETA team is committed to providing an accurate and reliable estimated time of arrival (ETA) as a cornerstone DoorDash consumer experience. We want to ensure that every customer can trust our ETAs, ensuring a high-quality experience in which their food arrives on time every time. With more than 2 billion orders annually, our dynamic engineering challenge is to improve and maintain accuracy at scale while managing a variety of conditions within diverse delivery and merchant scenarios.

Developer Summit 2024: A tour of the ArcGIS Well-Architected Framework

ArcGIS

The ArcGIS Well-Architected Framework and ArcGIS Architecture Center provide guidance for implementing systems with ArcGIS.

5 Free University Courses to Learn Computer Science

KDnuggets

Want to switch to a tech career? Make it happen with these free computer science courses.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

Building an AI-Ready Retail Organization with Improved Data Governance

databricks

Artificial Intelligence is top-of-mind with every C-suite in Retail & Consumer Goods. Companies see the potential to deliver better customer service, derive faster.

Retail 99

How Financial Services and Retail Companies Are Accelerating their Data, Apps and AI Strategy in the Data Cloud

Snowflake

Last year, we held our first Accelerate event to explore industry trends, data and technology innovations, and data strategy case studies in financial services. This year, we are expanding to five industry events, featuring leaders in financial services; retail and consumer goods; manufacturing; media, advertising and entertainment; and healthcare and life sciences.

Retail 85

Postgres Aurora DB major version upgrade with minimal downtime

Lyft Engineering

Introduction: Our payment platform team had the unique challenge of upgrading our Aurora Postgres DB from v10 to v13. This DB was responsible for storing transactions within Lyft and contains ~400 tables (with partitions) and ~30TB of data. Upgrading the database in-place would have resulted in ~30 mins of downtime.

Bytes 79

Build An AI Application with Python in 10 Easy Steps

KDnuggets

Explore the fundamental steps for creating a successful AI Application with Python and other tools.

Python 128

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

Yelp’s AI pipeline for inappropriate language detection in reviews

Yelp Engineering

Yelp’s mission is to connect consumers with great local businesses by giving them access to reliable and useful information. Consumer trust is one of our top priorities, which is why we make significant investments in technology and human moderation to protect the integrity and quality of content on Yelp. As a platform for user-generated content, we rely on our community of users and business owners to help report reviews that they believe may violate our Terms of Service and Content Guidelines.

Real-Time Data Streaming for Smart Warehouses

Confluent

Leverage Confluent Data Streaming Platform to bring real time to your smart warehouse, powering greater IoT automation, efficiency and cost savings.

Data 84

What Separates the Winners and Losers in the Connected Vehicle Data Revolution

databricks

"Building vehicles that are more like smartphones is the future. We're about to change the ride just like Apple and all the smartphone.

5 Essential Skills Every Data Scientist Needs in 2024

KDnuggets

Want to move into the data science field? Or advance your career in data? Don’t miss these must-have skills.

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.