Sat. Mar 26, 2022 - Fri. Apr 01, 2022

Building A Data Governance Bridge Between Cloud And Datacenters For The Enterprise At Privacera

Data Engineering Podcast

Summary: Data governance is a practice that requires a high degree of flexibility and collaboration at the organizational and technical levels. The growing prominence of cloud and hybrid environments in data management adds stress to an already complex endeavor. Privacera is an enterprise-grade solution for cloud and hybrid data governance built on top of the robust and battle-tested Apache Ranger project.

Machine Learning Pipeline Optimization with TPOT

KDnuggets

Let's revisit the automated machine learning project TPOT, and get back up to speed on using open source AutoML tools on our way to building a fully-automated prediction pipeline.

The Soldiers, Rogues, and Mages of Data Teams

Jesse Anderson

Data teams are like role-playing games (RPGs). If you’re not familiar with RPGs, there is a person or group of characters all working together for a common goal. A crucial part of each character is their levels, skills, and stats. In many games, higher levels are required to unlock specific skills. Likewise, stats show how well a character can utilize their skills.

Do Data Companies Need Chief Ethics Officers?

Cloudera

Sometimes it takes a billion-dollar mistake to bring the murkier side of data ethics into sharp focus. Equifax found this out to their own cost in 2017 when they failed to protect the data of almost 150 million users globally. The catastrophic breach was bad enough on its own — but Equifax waited three months to go public with the news. As the public furore rose to a crescendo, the credit organization dragged its feet on disclosing exactly what kind of information had been leaked.

Data 119

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Eliminate The Bottlenecks In Your Key/Value Storage With SpeeDB

Data Engineering Podcast

Summary: At the foundational layer, many databases and data processing engines rely on key/value storage for managing the layout of information on the disk. RocksDB is one of the most popular choices for this component and has been incorporated into popular systems such as ksqlDB. As these systems are scaled to larger volumes of data and higher throughputs, the RocksDB engine can become a bottleneck for performance.

Top 13 Skills That Every Data Scientist Should Have

KDnuggets

Let me walk you through the top 13 data science skills that you should have to become a successful data scientist. Following this outline, you’ll have a great path of digestible steps to educate yourself and be prepared to apply for data scientist positions.

Education 151

More Trending

Hybrid Data Cloud Success for State and Local Governments

Cloudera

State and local governments generate and store enormous amounts of data essential to their ability to deliver citizen services. But how can they capitalize on all of their data to become engines of growth and innovation, empowering and enhancing their ability to provide services and better serve their communities? Data doesn’t arrive on the doorsteps of government offices as a neatly packaged asset.

Introducing Stream Processing Use Case Recipes Powered by ksqlDB

Confluent

From fraud detection and predictive analytics, to real-time customer experiences and cyber security, stream processing has countless benefits for use cases big and small. By unlocking the power of continuous […].

Process 87

8 Free MIT Courses to Learn Data Science Online

KDnuggets

Create a data science learning path with courses from the world’s most prestigious university.

Closing the Gap Left by Third Party Cookie Deprecation

Teradata

Consumers expect personalized experiences when they interact with a brand. But organizations are losing the ability to listen to their customers via digital channels. Fixing this is critical.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Monte Carlo Announces Release of Observability Platform for Locally Sourced, Small-Batch Data

Monte Carlo

Today, on April 1st, Monte Carlo announced the release of Data Observability Small Batch, a next-generation platform for locally-sourced, small-batch data. The solution was painstakingly crafted by artisan developers to serve a new wave of data engineers who are nostalgic for data platforms the way they used to be. “The world is tired of over-processed data, mass-marketed to them in over-hyped dashboards,” says Barr Moses, CEO, Monte Carlo.

Software Engineer Sanjana Kaundinya on Moving from the Classroom to Confluent

Confluent

When Sanjana Kaundinya chose Confluent for her first job out of college, she was eager to learn as much as possible—and in the two years since, that’s exactly what she’s […].

A Bug That Can Make You a Data Science Hero

KDnuggets

What if I told you that there is a bug that can take you on a ride through the world of data science? Yes, if you have the bug of curiosity, consider yourself the best fit for the data science profession.

Case Study: How Rockset Made Me a Day Three Hero at Sounding Board

Rockset

I’ve been working as a data and software engineer for more than 20 years. Not long after I joined my current employer, Sounding Board, I had to normalize nested JSON arrays in a complex document schema so that I could join the child records to other collections and then denormalize data into a single result set, and I had to do it fast. On top of that, I had to make that data available to our custom-built application via a secure RESTful endpoint with a less than one second response time.

MongoDB 52

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

You Have More Data Quality Issues Than You Think 

Monte Carlo

Say it with me: your data will never be perfect. Any team striving for completely accurate data will be sorely disappointed. Data testing, anomaly detection, and cataloging are important steps, but technology alone will not solve your data quality problem. Like any entropic system, data breaks. And as we’ve learned building solutions to curb the causes and downstream impact of data issues, it happens more often than you think.

This 6-Month Product Management Program Is The Ultimate Choice For Next-Gen Product Experts!

U-Next

Let’s face it! Product Management CAN BE TOUGH, but only if you haven’t laid your hands on the best training experience for Product enthusiasts in all its glory: the PG Certificate Program in Product Management by IIM Indore & Jigsaw. Several present-day Product Experts started their journeys with this exclusive 6-month program & found multiple doors of opportunities, wide open to welcome them.

Time Series Forecasting with Ploomber, Arima, Python, and Slurm

KDnuggets

In this blog you will see how the authors took a raw .ipynb notebook that does time series forecasting with Arima, modularized it into a Ploomber pipeline, and ran parallel jobs on Slurm.

Python 108

Locate your Data and Boost it with Geo-Processing

DareData

Many times, a data developer is constrained by the data they were given. In Data Science / Engineering projects, it is not unusual that extra data is added from other sources - even ones that are outside of the organization. But extra data can be available in non-standard ways that require new processing techniques. For example, when it comes to geographical data, many governments provide open spatial data about their territory.

Process 52

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases, the number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

Which Google Cloud certification is best for me?

A Cloud Guru: Data Engineering

Considering your options when it comes to Google Cloud (GCP) certification paths? This post will talk about the various GCP cloud certifications, what each cert covers, what it could mean for your career, and how you can set (and achieve) your own personal goals.

These Sales Enthusiasts Mastered Strategic Sales In Just 4 Months With The Executive Program in Strategic Sales Management

U-Next

With the onset of the 5th industrial revolution, the world is moving closer towards embracing newer technologies in almost every walk of life. In the business ecosphere, those who upskill & transform into the best professional versions of themselves are bound to be at the forefront of this revolution. The Sales domain, too, cannot be home to traditional sales methods for too long.

Connect With the Data Science Community at Rev 3 in NYC, the #1 MLOps Conference

KDnuggets

The most ambitious Enterprise MLOps conference is coming to New York City on May 5-6, bringing the data science community together in-person for a one-of-a-kind event. This year, you’ll hear from 30+ speakers across dozens of industries. Save 50% with the promo code KDN. Register now!

Object equality in Java and Kotlin

Booking.com Engineering

Introduction: We are going to review the subtleties and complications of trying to compare objects for equality in Java, where the problem originates, why it is important, Kotlin’s approach to the problem and some recommendations on the topic. Determining if two entities are the same is a fundamental operation in mathematics and we implement this operation in programming by the weaker notion of equivalency; the difference being that we are content with equality across a specific subset of propert

Java 52

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

Case Study: How Dimona Built a Real-Time Inventory Management System on Rockset

Rockset

At Dimona, a leading Latin American apparel company founded 55 years ago in Brazil, our business is t-shirts. We design them, manufacture them, and sell them to consumers online and through our five retail stores in Rio de Janeiro. We also supply B2B companies for their customers in Brazil and the United States. We’ve come a long way since 2011 when I joined Dimona to launch our first website.

Systems 52

Short-Term and Vacation Rental Data: Sources and Analysis

AltexSoft

Vacation and short-term rentals are experiencing a post-COVID renaissance. The data clearly shows the stable, worldwide increase in demand for alternative accommodations, from apartments to farm stays to igloos. The data also indicates that more and more companies in the sector tie their bright future with… data. According to the Global Vacation Rental Report 2022, 40 percent of property managers rely on market business intelligence (BI) or analytics services, a big leap compared to just 13 per

Data Science at the Command Line: The Free eBook

KDnuggets

If you are familiar with Python & R, then improve your current data science workflow by integrating Unix power tools.

A Scala Project with Akka, Cats, and Cassandra

Rock the JVM

Akka, Cats, and Cassandra in a larger Scala project integrating multiple pieces of the Scala ecosystem

Scala 52

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

Previewing the MYOB Podcast

Elder Research

The post Previewing the MYOB Podcast appeared first on Elder Research.

IS DATAOPS MORE THAN DEVOPS FOR DATA?

DataKitchen

Data 52

Machine Learning Textbook: Stochastic Processes and Simulations

KDnuggets

A 100-page book on stochastic processes, published in 2022. This off-the-beaten-path machine learning tutorial is designed for busy professionals, researchers and students eager to learn and apply methods ranging from simple to advanced, in a minimum amount of time. Offered with data sets, source code, videos, spreadsheets and solved exercises.

The Cost of Bad Data Has Gone Up. Here Are 8 Reasons Why.

Monte Carlo

You may not have heard the term data downtime, but I’m willing to bet you’ve experienced it and the cost of bad data firsthand. Urgent ping from your CEO about “missing data” in a critical report? Duplicate tables wreaking havoc in your Snowflake warehouse, all titled some variation of “Mikes_Table_GOOD-V3.”? Or, perhaps you’ve unintentionally made a decision based on bad data from last year’s forecasts?

Media 52

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.