Sat. Nov 19, 2022 - Fri. Nov 25, 2022

Data News — Week 22.47

Christophe Blefari

Hello you, I hope this data news finds you well. Time flies, to be honest. I've launched an Advent of Data in a rush. The goal is simple: in December, 24 data people will produce 24 data gems. Every day a new piece of content will be released on a dedicated website. If you wanna join the initiative, please reply; we are still looking to fill a few slots.

Data 130

Twitter’s ongoing cruel treatment of software engineers

The Pragmatic Engineer

Originally published on 24 November 2022. 👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of five topics in today’s subscriber-only The Scoop issue. To get this newsletter every week, subscribe here. I was really hoping to not report anything more about Twitter, and that software engineers at the company would get space to heal after the traumatic events, and to focus on building the product.

DuckDB: Getting started for Beginners

Marc Lamberti

DuckDB is an in-process OLAP DBMS written in C++ blah blah blah, too complicated. Let’s start simple, shall we? DuckDB is the SQLite for Analytics. It has no dependencies, is extremely easy to set up, and is optimized to perform queries on data. In this hands-on tutorial, you will learn what DuckDB is, how to use it, and why it is essential for you.

Datasets 130

Tame The Entropy In Your Data Stack And Prevent Failures With Sifflet

Data Engineering Podcast

Summary The problems that are easiest to fix are the ones that you prevent from happening in the first place. Sifflet is a platform that brings your entire data stack into focus to improve the reliability of your data assets and empower collaboration across your teams. In this episode CEO and founder Salma Bakouk shares her views on the causes and impacts of "data entropy" and how you can tame it before it leads to failures.

Data Lake 130

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

How Much Math Do You Need in Data Science?

KDnuggets

There exist so many great computational tools available for Data Scientists to perform their work. However, mathematical skills are still essential in data science and machine learning because these tools will only be black-boxes for which you will not be able to ask core analytical questions without a theoretical foundation.

Key Benefits of HR Digital Transformation

U-Next

Introduction to Digital HR. Digital HR refers to using technology, including software and apps, to improve how a company manages its employees. There are many digital transformation benefits associated with HR. The goal is to make it easier for businesses and their employees to connect, collaborate, share information and make decisions. These are some of the top digital transformation statistics for HR and L&D: 71% spend about a quarter of their time on social media to share human re

More Trending

A Look At The Data Systems Behind The Gameplay For League Of Legends

Data Engineering Podcast

Summary The majority of blog posts and presentations about data engineering and analytics assume that the consumers of those efforts are internal business users accessing an environment controlled by the business. In this episode Ian Schweer shares his experiences at Riot Games supporting player-focused features such as machine learning models and recommender systems that are deployed as part of the game binary.

Systems 130

The Inescapable Conclusion: Machine Learning Is Not Like Your Brain

KDnuggets

The final article in this nine-part series summarizes the many reasons why Machine Learning is not like your brain - along with a few similarities.

Impact of Digitization on HR Services and Processes

U-Next

Introduction to Digitization in Human Resources. Digitization in HR services is of utmost importance to an organization. It is a critical and strategic function that aims to optimize the workforce to meet business goals. The HR functions and processes have been evolving with advances in technology, changing consumer behavior patterns, and increasing globalization of markets.

Process 72

Leveraging CockroachDB’s Change Feed for Real-Time Inventory Data Processing

DoorDash Engineering

Managing inventory levels is one of the biggest challenges for any convenience and grocery retailer on DoorDash. Maintaining accurate inventory levels in a timely manner becomes especially challenging when there are many constantly moving variables that may be changing on-hand inventory count. Situations that may affect inventory levels include, but are not limited to: items expiring; items having to be removed due to damage; the items vendors sent being different than what was ordered; After

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

How Precision Time Protocol is being deployed at Meta

Engineering at Meta

Implementing Precision Time Protocol (PTP) at Meta allows us to synchronize the systems that drive our products and services down to nanosecond precision. PTP’s predecessor, Network Time Protocol (NTP), provided us with millisecond precision, but as we scale to more advanced systems on our way to building the next computing platform, the metaverse and AI, we need to ensure that our servers are keeping time as accurately and precisely as possible.
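
For readers who want a feel for how a PTP-style exchange yields a clock correction, here is a minimal sketch of the textbook delay request-response arithmetic. It is not Meta's implementation: the function name and the timestamps are made up for illustration, and the calculation assumes the network delay is the same in both directions.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and one-way path delay from four timestamps (ns).

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay_Req      (slave clock)
    t4: master receives Delay_Req  (master clock)
    Assumes a symmetric path: the delay is equal in both directions.
    """
    forward = t2 - t1    # path delay + offset
    backward = t4 - t3   # path delay - offset
    offset = (forward - backward) / 2   # how far the slave clock is ahead
    delay = (forward + backward) / 2    # mean one-way path delay
    return offset, delay


# Hypothetical exchange: slave clock is 500 ns ahead, one-way delay is 1000 ns.
offset_ns, delay_ns = ptp_offset_and_delay(
    t1=1_000_000, t2=1_001_500, t3=1_010_000, t4=1_010_500
)
print(f"offset: {offset_ns} ns, path delay: {delay_ns} ns")
```

The slave would subtract the estimated offset from its clock. Any asymmetry between the two directions shows up directly as offset error, which is one reason hardware timestamping matters at nanosecond scale.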

What is Chebychev’s Theorem and How Does it Apply to Data Science?

KDnuggets

Chebyshev’s Theorem applies to every data set and is heavily used by Statisticians, Data Scientists, and Machine Learning Engineers.
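
As a quick illustration of the theorem mentioned above: for any data set and any k > 1, at least 1 - 1/k² of the values lie within k standard deviations of the mean. The sketch below checks that bound on a small sample; the sample values and function names are invented for illustration and do not come from the KDnuggets article.

```python
import statistics


def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def fraction_within(data, k):
    """Observed fraction of values within k population standard deviations."""
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * std)
    return inside / len(data)


# Made-up, deliberately skewed sample: the bound holds for any distribution.
sample = [2, 3, 3, 4, 5, 5, 6, 7, 9, 40]
for k in (1.5, 2, 3):
    print(f"k={k}: bound={chebyshev_bound(k):.3f}, "
          f"observed={fraction_within(sample, k):.3f}")
```

The observed fraction is usually well above the bound; the guarantee is loose because it makes no assumption about the shape of the distribution.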

What’s the Relationship Between Big Data and Machine Learning?

U-Next

Introduction to Machine Learning and Big Data. Big Data and Machine Learning are among the most crucial and irreplaceable technologies today. Machine Learning allows computers to learn from data automatically without being explicitly programmed. This is done by providing the computer with training data, which it can use to improve its performance on future tasks.

How to move data from spreadsheets into your data warehouse

dbt Developer Hub

Once your data warehouse is built out, the vast majority of your data will have come from other SaaS tools, internal databases, or customer data platforms (CDPs). But there’s another unsung hero of the analytics engineering toolkit: the humble spreadsheet. Spreadsheets are the Swiss army knife of data processing. They can add extra context to otherwise inscrutable application identifiers, be the only source of truth for bespoke processes from other divisions of the business, or act as the transl

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Retrofitting null-safety onto Java at Meta

Engineering at Meta

We developed a new static analysis tool called Nullsafe that is used at Meta to detect NullPointerException (NPE) errors in Java code. Interoperability with legacy code and a gradual deployment model were key to Nullsafe’s wide adoption and allowed us to recover some null-safety properties in the context of an otherwise null-unsafe language in a multimillion-line codebase.

Java 55

10 Amazing Machine Learning Visualizations You Should Know in 2023

KDnuggets

Yellowbrick for creating machine learning plots with less code.

Organizational Health: What It Is and How HR Manages It?

U-Next

Introduction To Organizational Health. Organizational health is a measure of the effectiveness of the organization. It’s a holistic approach to organizational success that focuses on creating an environment where employees can fulfill their potential and achieve the company’s goals. Organizational health measures the well-being of employees and whether they are engaged and performing at their best.

IT 52

Snowflake: Provisioning in AAD to synch Users

Cloudyard

In the last post we discussed how to configure Snowflake SSO login with Azure Active Directory. We created the user ‘Darsh’ in Azure Active Directory and assigned the required permission. To enable SSO login on the Snowflake side, we also created the user manually as follows: CREATE USER "DMITTAL" PASSWORD = 'xxx' LOGIN_NAME = 'darsh@sachinmittal2904outlook.onmicrosoft.com' But assume the scenario where we have a number of users available in Azure Active Direc

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

PTP: Timing accuracy and precision for the future of computing

Engineering at Meta

Meta is deploying a timing protocol, Precision Time Protocol (PTP), that will offer new levels of accuracy and precision to our networks and data centers. We believe PTP will become the global standard for keeping time in computer networks. PTP will benefit today’s products and services and will be a foundational technology behind the development of the metaverse.

Systems 53

10 Most Common Data Quality Issues and How to Fix Them

KDnuggets

Ensuring data quality guarantees more data-informed decisions. Hence, this article highlights the common data quality issues and ways to overcome them.

Data 106

A Brief Overview of the Unix File System

U-Next

Introduction. The Unix File System is a framework for organizing and storing large amounts of data in a manageable manner. It includes components like files, a group of connected data that can be conceptualized as a stream of bytes (or characters). In the Unix File System, a file is also the smallest storage unit. In other words, the Unix File System is a method for organizing and logically analyzing massive amounts of data so that it is simple to manage.

Systems 52

How Data and Finance Teams Can Be Friends (And Stop Being Frenemies)

Monte Carlo

Recently I wrote an article about data silos that form across the organization, often due to lack of alignment with partners. This alignment can be difficult to come by, but is crucial to a data leader’s success. With the range of internal customers to support, it can be tempting for data teams to inhabit the principles of an assembly line or even a fry cook at McDonald’s.

Finance 52

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

A journey through the Foundry: Becoming an analytics engineer at dbt Labs

dbt Developer Hub

Data is an industry of sidesteppers. Most folks in the field stumble into it, look around, and if they like what they see, they’ll build a career here. This is particularly true in the analytics engineering space. Every AE I’ve talked to had envisioned themselves doing something different before finding this work in a moment of serendipity. This raises the question, how can someone become an analytics engineer intentionally?

Top Posts November 14-20: Git for Data Science Cheatsheet

KDnuggets

Git for Data Science Cheatsheet • 6 Best Free Online Courses to Learn Python and Boost Your Career • How to Select Rows and Columns in Pandas Using [ ], .loc, .iloc, .at and .iat • How LinkedIn Uses Machine Learning To Rank Your Feed • 7 SQL Concepts You Should Know For Data Science.

7 Crucial Steps Involved in A Sales Process

U-Next

Introduction. Whether you work in B2B sales or direct-to-consumer sales, the sales process aims to develop an interest in your product and trust in you as a salesperson. You must be meticulous in your efforts to get a potential customer closer to making a purchase since you won’t close a sale without both. A sales process that we see now includes the following:

Process 52

How SeatGeek Reduced Data Incidents to Zero with Data Observability

Monte Carlo

Data downtime, unknown unknowns, and the specter of schema changes loom large for data teams of all stripes, and the team at SeatGeek was no exception. As the only mobile ticketing marketplace built for fan experience, SeatGeek made its name on efficient customer experiences. So, when SeatGeek’s data leaders realized they were losing too much time root-causing data issues in their BI reports, they began looking for tools to help them discover their data problems faster.

Data 52

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

Data Engineering Weekly #108

Data Engineering Weekly

Data Engineering Weekly Is Brought to You by RudderStack. RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. Sign up free to test out the tool today. Google AI: The Data Cards Playbook: A Toolkit for Transparency in Dataset Documentation. Google published Data Cards, a dataset documentation framework aimed at increasing transparency across dataset lifecycles.

How to Use Graph Theory to Scout Soccer

KDnuggets

Take Soccer Analytics to the Next Level with Graph Theory: Here’s What to Know and How to Do It.

IT 126

Introduction to Unix Operating System : Everything You Need To Know

U-Next

Introduction. Unix, one of the most powerful operating systems, is the forefather of operating systems such as Ubuntu, Solaris, and POSIX. Ken Thompson and Dennis Ritchie created it in the 1960s, and since then, it has been continuously improved. Due to AT&T’s discovery of the Unix operating system and distribution of the C to governmental and academic institutions, both operating systems have been ported to more machine families than any other.

Systems 52

AWS re:Invent 2022: Rockset Will Be There…Will You?

Rockset

Rockset is heading to Vegas for AWS re:Invent. Will you be there? We have several opportunities for you and your team to learn more about real-time analytics and how companies like Klarna, Meta and Seesaw have made the move from batch to real time. Come by the Rockset Booth (#130) in the expo hall, November 28-December 1. See a demo and try your hand at winning a PlayStation 5 in our re:Invent prize giveaway.

AWS 52

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.