Sat.Aug 27, 2022 - Fri.Sep 02, 2022

article thumbnail

How to Select Rows and Columns in Pandas Using [ ],loc, iloc,at and.iat

KDnuggets

Subset selection is one of the most frequently performed tasks while manipulating data. Pandas provides different ways to efficiently select subsets of data from your DataFrame.

Data 160
article thumbnail

An Exploration Of What Data Automation Can Provide To Data Engineers And Ascend's Journey To Make It A Reality

Data Engineering Podcast

Summary The dream of every engineer is to automate all of their tasks. For data engineers, this is a monumental undertaking. Orchestration engines are one step in that direction, but they are not a complete solution. In this episode Sean Knapp shares his views on what constitutes proper automation and the work that he and his team at Ascend are doing to help make it a reality.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Incremental Strategies to Move Your Data Strategy Forward Remove Obstacles to Unlock Possibilities in Financial Services

Cloudera

Firms are burdened with tech debt and endless regulatory compliance, often leaving innovation last to receive the necessary budgets. Data-fuelled innovation requires a pragmatic strategy. This blog lays out some steps to help you incrementally advance efforts to be a more data-driven, customer-centric organization. Embrace incremental progress. The financial sector’s evolution is unleashing myriad demands on firms operating in the market.

article thumbnail

Teradata VantageCloud Lake and ClearScape Analytics: Empowering Enterprise Analytical Innovation

Teradata

Teradata's new offerings, VantageCloud Lake and ClearScape Analytics, make it the complete cloud analytics & data platform, with cloud-native deployment and expanded analytics capabilities.

Cloud 98
article thumbnail

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

article thumbnail

The Difference Between Training and Testing Data in Machine Learning

KDnuggets

When building a predictive model, the quality of the results depends on the data you use. In order to do so, you need to understand the difference between training and testing data in machine learning.

article thumbnail

Alumni Of AirBnB's Early Years Reflect On What They Learned About Building Data Driven Organizations

Data Engineering Podcast

Summary AirBnB pioneered a number of the organizational practices that have become the goal of modern data teams. Out of that culture a number of successful businesses were created to provide the tools and methods to a broader audience. In this episode several almuni of AirBnB’s formative years who have gone on to found their own companies join the show to reflect on their shared successes, missed opportunities, and lessons learned.

Building 100

More Trending

article thumbnail

What Do You Want to be Famous for?

Teradata

Financial services organizations that exhibit true data literacy avoid bottlenecks and instead choose to build best in class solutions that meet current and future needs. Find out more.

article thumbnail

Decision Tree Pruning: The Hows and Whys

KDnuggets

Decision trees are a machine learning algorithm that is susceptible to overfitting. One of the techniques you can use to reduce overfitting in decision trees is pruning.

article thumbnail

Getting Started with the KRaft Protocol

Confluent

Kafka Raft lets you use Apache Kafka without ZooKeeper by consolidating metadata management. Here’s how you can learn and do more with KRaft.

Kafka 78
article thumbnail

Breaking State and Local Data Silos with Modern Data Architectures

Cloudera

Data is the fuel that drives government, enables transparency, and powers citizen services. But while state and local governments seek to improve policies, decision making, and the services constituents rely upon, data silos create accessibility and sharing challenges that hinder public sector agencies from transforming their data into a strategic asset and leveraging it for the common good. .

article thumbnail

Prepare Now: 2025s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

article thumbnail

Expert Roundtable: How to Build Real-Time Personalization and Recommendation Systems

Rockset

I recently had the good fortune to host a small-group discussion on personalization and recommendation systems with two technical experts with years of experience at FAANG and other web-scale companies. Raghavendra Prabhu (RVP) is Head of Engineering and Research at Covariant , a Series C startup building an universal AI platform for robotics starting in the logistics industry.

Systems 52
article thumbnail

Build a Reproducible and Maintainable Data Science Project: A Free Online Book

KDnuggets

This free online book is a fantastic resource on how to structure, manage, and maintain your real-world data science projects.

article thumbnail

Celebrate Back-to-School Season With Data Streaming Basics

Confluent

All the best data streaming resources, tips, and guides to help you learn introductory concepts, streaming architecture basics, common tools and technologies, and more.

article thumbnail

5 Ways To Ensure High Functioning Data Engineering Teams 

Monte Carlo

Data engineering is a relatively young profession, even for the tech space. To put it in perspective, front-end engineering has twice the number of years in industry maturity. While the role itself is rapidly evolving, the tooling, processes, and team structure are fragmented and amorphous at best. As a result, the day-to-day responsibilities of a data engineer can look radically different from one company to another, depending on the needs of the business and the data that drives it.

article thumbnail

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

article thumbnail

August 2022 dbt Update: v1.3 beta, Tech Partner Program, and Coalesce!

dbt Developer Hub

Semantic layer, Python model support, the new dbt Cloud UI and IDE… there’s a lot our product team is excited to share with you at Coalesce in a few weeks. But how these things fit together—because of where dbt Labs is headed—is what I’m most excited to discuss. You’ll hear more in Tristan’s keynote , but this feels like a good time to remind you that Coalesce isn’t just for answering tough questions… it’s for surfacing them.

article thumbnail

Machine Learning Metadata Store

KDnuggets

In this article, we will learn about metadata stores, the need for them, their components, and metadata store management.

Metadata 158
article thumbnail

Loan Prediction using Machine Learning Project Source Code

ProjectPro

This article will walk you through how one can start by exploring a loan prediction system as a data science and machine learning problem and build a system/application for loan prediction using your own machine learning project. Loan sanctioning and credit scoring forms a multi-billion dollar industry -- in the US alone. With everyone from young students, entrepreneurs, and multi-million dollar companies turning to banks to seek financial support for their ventures, processing these application

article thumbnail

Data Quality Monitoring – You’re Doing It Wrong

Monte Carlo

Occasionally, we’ll talk with data teams interested in applying data quality monitoring narrowly across only a specific set of key tables. The argument goes something like: “You may have hundreds or thousands of tables in your environment, but most of your business value derives from only a few that really matter. That’s where you really want to focus your efforts.

IT 52
article thumbnail

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

article thumbnail

MarkLogic And Machine Learning: Easy way of ML

Knoldus

Reading Time: 6 minutes Introduction Machine learning is a subfield of computer science. Used to deal with the construction of artificial intelligence systems that can learn without being explicitly programmed. It has been applied in many areas such as data analysis, pattern recognition, and understanding human behavior. MarkLogic combines database internals, search-style indexing, and application server behavior into a unified system.

article thumbnail

3 Ways to Append Rows to Pandas DataFrames

KDnuggets

Learn a simple way to append rows in the form of arrays, dictionaries, series, and dataframes to another dataframe.

Python 158
article thumbnail

AI in Drug Discovery and Repurposing: Benefits, Approaches, and Use Cases

AltexSoft

According to McKesson, a company with a two-hundred-year history delivering a third of all drugs across North America, you need around six months to start a pharmacy and another seven to nine months to see any revenue. If this seems too long and complex, just make a comparison with a drug development process. It takes at least ten years and $2.6 billion to get a new medicine to the market.

Medical 52
article thumbnail

Getting Started with Scala sbt

Rock the JVM

Discover sbt: The popular Scala build tool that simplifies project management and enhances productivity

Scala 52
article thumbnail

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

article thumbnail

Declarative Connectors with Confluent for Kubernetes

Confluent

Manage connectors declaratively with Confluent for Kubernetes.

article thumbnail

Machine Learning in the Enterprise: Use Cases & Challenges

KDnuggets

This article provides insights into how leading data scientists are embracing machine learning in their organizations and covers some of the major ML challenges and trends in the enterprise.

article thumbnail

What is Data Discovery: Definitions & Overview

Monte Carlo

In the world of data engineering, data discovery refers to the ability to find relevant data sets across your data platform and understand their context. Data discovery makes data engineering and analytical engineering tasks more efficient and can enable self-service access for other types of data consumers. Just like knowledge workers need to tap into a shared repository to discover and combine relevant information across documents or slide decks, data professionals need to do the same with dat

article thumbnail

The Benefits of Natural Language AI for Content Creators

KDnuggets

In this article, we will discuss the benefits of natural language AI for content creators, highlighting the key reasons why you should consider using it to improve your content output.

IT 112
article thumbnail

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

article thumbnail

The Complete Data Science Study Roadmap

KDnuggets

This article will map out the things you need to do to become a data scientist.

article thumbnail

Combining Pandas DataFrames Made Simple

KDnuggets

For this tutorial, we will work through examples to understand how different mehtods for combining Pandas DataFrames work.

Python 132
article thumbnail

Top Posts August 22-28: Free Python Project Coding Course

KDnuggets

Free Python Project Coding Course • 5 Tricky SQL Queries Solved • Decision Tree Algorithm, Explained • Free AI for Beginners Course • The Complete Collection of Data Science Projects & Part 2.

Coding 98
article thumbnail

KDnuggets News, August 31: The Complete Data Science Study Roadmap • 7 Techniques to Handle Imbalanced Data

KDnuggets

The Complete Data Science Study Roadmap • 7 Techniques to Handle Imbalanced Data • 3 Ways to Append Rows to Pandas DataFrames • The Bias-Variance Trade-off • How to Package and Distribute Machine Learning Models with MLFlow.

article thumbnail

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.