Sat. Apr 30, 2022 - Fri. May 06, 2022

Hypothesis Testing Explained

KDnuggets

This brief overview of hypothesis testing covers its classification into parametric and non-parametric tests, and when to use the most popular ones, including tests of means, correlation, and distribution, for one sample and two samples.

IT 160

AI-First Benefits: 5 Real-World Outcomes

Cloudera

Artificial intelligence (AI) has been a focus of research for decades, but has only recently become truly viable. The availability and maturity of automated data collection and analysis systems are making it possible for businesses to implement AI across their entire operations to boost efficiency and agility. AI has the potential to transform operations by improving three fundamental business requirements: process automation, decision-making based on data insights, and customer interaction.

Insurance 134

DataKitchen In The insideBIGDATA IMPACT 50 List

DataKitchen

111

Evolving And Scaling The Data Platform at Yotpo

Data Engineering Podcast

Summary: Building a data platform is an iterative and evolutionary process that requires collaboration with internal stakeholders to ensure that their needs are being met. Yotpo has been on a journey to evolve and scale their data platform to continue serving the needs of their organization as it increases the scale and sophistication of data usage. In this episode Doron Porat and Liran Yogev explain how they arrived at their current architecture, the capabilities that they are optimizing for, an

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Machine Learning Is Not Like Your Brain Part One: Neurons Are Slow, Slow, Slow

KDnuggets

Artificial intelligence is not all that intelligent. While today’s AI can do some extraordinary things, the functionality underlying its accomplishments has very little to do with the way in which a human brain works to achieve the same tasks.

Choose Compliance, Choose Hybrid Cloud

Cloudera

As digital transformation accelerates, and digital commerce increasingly becomes the dominant form of all commerce, regulators and governments around the world are recognizing the increased need for consumer protections and data protection measures. The European Union has been at the vanguard for some time (most recently having reached provisional agreement on the Digital Services Act) but from Australia to Brazil, from South Africa to California (the rest of the US hasn’t quite caught on yet!

Cloud 109

More Trending

Leading The Charge For The ELT Data Integration Pattern For Cloud Data Warehouses At Matillion

Data Engineering Podcast

Summary: The predominant pattern for data integration in the cloud has become extract, load, and then transform or ELT. Matillion was an early innovator of that approach and in this episode CTO Ed Thompson explains how they have evolved the platform to keep pace with the rapidly changing ecosystem. He describes how the platform is architected, the challenges related to selling cloud technologies into enterprise organizations, and how you can adopt Matillion for your own workflows to reduce the ma

How To Structure a Data Science Project: A Step-by-Step Guide

KDnuggets

Check out all the necessary steps to successfully structure your data science projects leveraging data science templates.

Winning With Data in the Fight Against Fraud, Waste, and Abuse

Cloudera

Fraud, waste, and abuse (FWA) in government is a constant, multi-billion dollar issue that challenges agency leaders at all levels and across all sectors, from healthcare to education to taxation to Social Security. The scope and scale of public spending — federal outlays alone were approximately $6.6 trillion in fiscal year 2020 according to the Congressional Budget Office — make FWA an inherently difficult problem to solve.

How to Remove Apache Kafka Brokers the Easy Way

Confluent

The recent release of Confluent Cloud and Confluent Platform 7.0 introduced the ability to easily remove Apache Kafka® brokers and shrink your Confluent Server cluster with just a single command. […].

Kafka 84

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Podcast: Storytime for DataOps

DataKitchen

The post Podcast: Storytime for DataOps first appeared on DataKitchen.

72

Image Classification with Convolutional Neural Networks (CNNs)

KDnuggets

In this article, we’ll look at what Convolutional Neural Networks are and how they work.

#Clouderalife Volunteer Spotlight: Lynne Montalbo!

Cloudera

This month we are proud to spotlight Lynne Montalbo, senior business systems analyst from Santa Clara, California, who volunteers as a professional development mentor with Braven. Braven’s mission is to empower promising, underrepresented young people—first-generation college students, students from low-income backgrounds, and students of color—with the skills, confidence, experiences, and networks necessary to transition from college to strong first jobs, which lead to meaningful careers and li

From the Cellar to the Cloud – How Aedifion is Driving Next-Generation Building Automation with Confluent

Confluent

It is no exaggeration to say that a lot is going wrong in commercial buildings today. The building and construction sector consumes 36% of global final energy and accounts for almost 40% […].

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Monte Carlo Named One of the Best Places to Work in the Bay Area for 2022

Monte Carlo

I’m honored to share that Monte Carlo was just named a Best Place to Work in the Bay Area for 2022 by the San Francisco Business Times and the Silicon Valley Business Journal, placing 6th in the small business category. This recognition is especially meaningful to our leadership team because the results are based directly on employee feedback, collected anonymously by a third-party researcher.

9 Free Harvard Courses to Learn Data Science in 2022

KDnuggets

Learn Python programming, statistics, and machine learning online from one of the world’s top universities.

A Real-Time Rockset Intern Experience

Rockset

I spent the spring of my junior year interning at Rockset, and it couldn’t have been a better decision. When I first arrived at the office on a sunny day in San Mateo, I had no idea that I was about to meet so many systems engineering gurus, or that I was about to consume immensely good food from the festive neighboring streets. Working with my talented and resourceful mentor, Ben (Software Engineer, Systems), I’ve been able to learn more than I ever thought I could in three months!

Food 52

Slim CI/CD with Bitbucket Pipelines

dbt Developer Hub

Continuous Integration (CI) sets the system up to test everyone’s pull request before merging. Continuous Deployment (CD) deploys each approved change to production. “Slim CI” refers to running/testing only the changed code, thereby saving compute. In summary, CI/CD automates dbt pipeline testing and deployment. dbt Cloud, a much beloved method of dbt deployment, supports GitHub- and GitLab-based CI/CD out of the box.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Seven Benefits of a Powerful Data Fabric

Teradata

The value provided by a powerful data fabric is key for a successful digital transformation. Find out why.

Data 52

SQL Notes for Professionals: The Free eBook Review

KDnuggets

The free book is a combination of SQL cheat sheets and practical database examples. It provides bite-size information about every SQL function and attribute with coding samples.

SQL 159

How Rockset Handles Data Deduplication

Rockset

There are two major problems with distributed data systems. The second is out-of-order messages, the first is duplicate messages, the third is off-by-one errors, and the first is duplicate messages. This joke inspired Rockset to confront the data duplication issue through a process we call deduplication. As data systems become more complex and the number of systems in a stack increases, data deduplication becomes more challenging.

Kafka 52

Packaging generated code from protobuf files for gRPC Services

Eventbrite Engineering

Background: At Eventbrite, we identified in our 3-year technical vision that one of our goals is to enable autonomous dev teams to own their code and architecture so as to be able to deliver reliable, high-quality and cost-effective solutions to our customers. However, this autonomy does not mean that our team has to …

Coding 52

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Why Does Elder Research Need a Chief Scientist Committee?

Elder Research

The post Why Does Elder Research Need a Chief Scientist Committee? appeared first on Elder Research.

52

6 Highest Paying Companies for Data Scientists

KDnuggets

These are the six top paying companies for data scientists. I’ve looked at absolute salary, but I’ll fill you in on other factors you should consider as well when it comes to picking a data science job for money.

Meet The Graduates: Guoda Paulikaite

Pipeline Data Engineering

In this interview series we’ll share some of the stories that Daniel and I get to watch unfold at Pipeline Academy. Check out what our graduates have to say about the course, how they’ve tackled its challenges and what they are doing now with their new data engineering superpowers. Peter: Can I ask you to please introduce yourself to the readers of Pipeline Academy’s blog?

Making dbt Cloud API calls using dbt-cloud-cli

dbt Developer Hub

dbt Cloud is a hosted service that many organizations use for their dbt deployments. Among other things, it provides an interface for creating and managing deployment jobs. When triggered (e.g., cron schedule, API trigger), the jobs generate various artifacts that contain valuable metadata related to the dbt project and the run results. dbt Cloud provides a REST API for managing jobs, run artifacts and other dbt Cloud resources.

Cloud 52

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Building Ripple: Engineering Spotlight Pt. 2

Ripple Engineering

In part one of our two-part series, we heard from RippleX engineers that are ideating, creating and executing on new applications using cutting-edge blockchain and crypto technology. Now, we’ll explore how the RippleNet engineering team is building the foundational payments infrastructure on the XRP Ledger that will allow value to move as easily as information moves today.

How to Build Strong Data Science Portfolio as a Beginner

KDnuggets

After learning the basics of data science, you can start to work on real-world problems. But how do you showcase your work? In this article, we are going to learn a unique way to create a data science portfolio.

Portfolio 123

Mind the (Sustainability) Gap

Teradata

Fewer than 20% of retailers are on track to meet sustainability pledges. Granular, integrated data is the key to moving from reporting to action. Read about our framework for profitable sustainability.

Retail 52

DataKitchen Noted For DataOps Thought Leadership

DataKitchen

The post DataKitchen Noted For DataOps Thought Leadership first appeared on DataKitchen.

52

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.