January, 2020

Top 10 Technology Trends for 2020

KDnuggets

With integrations of multiple emerging technologies just in the past year, AI development continues at a fast pace. Following the blueprint of science and technology advancements in 2019, we predict 10 trends we expect to see in 2020 and beyond.

Pay Down Technical Debt In Your Data Pipeline With Great Expectations

Data Engineering Podcast

Summary Data pipelines are complicated and business critical pieces of technical infrastructure. Unfortunately they are also complex and difficult to test, leading to a significant amount of technical debt which contributes to slower iteration cycles. In this episode James Campbell describes how he helped create the Great Expectations framework to help you gain control and confidence in your data delivery workflows, the challenges of validating and monitoring the quality and accuracy of your dat

Infinite Storage in Confluent Platform

Confluent

A preview of Confluent Tiered Storage is now available in Confluent Platform 5.4, enabling operators to add an additional storage tier for data in Confluent Platform. If you’re curious about […].

Data Privacy and Why it Matters to Our Customers

Teradata

People want control over their personal data, but are also willing to trade it away for convenience. When does the exploitation of our data become unethical? Read more!

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Engineering SQL Support on Apache Pinot at Uber

Uber Engineering

Uber leverages real-time analytics on aggregate data to improve the user experience across our products, from fighting fraudulent behavior on Uber Eats to forecasting demand on our platform. As Uber’s operations became more complex and we offered additional features and …

Simulating Cohorts

Grouparoo

In the last post, I made a case that the way to make the biggest difference in a metric like retention is to increase how many tests you can run each month. It turns out, going from 1 to 4 tests a month makes a huge difference, especially as those cohorts build on each other over time. To prove this out, I built a spreadsheet. Because I learned even more from creating the spreadsheet itself than writing the blog post, I thought I'd give those learnings some airtime, too.

Replatforming Production Dataflows

Data Engineering Podcast

Summary Building a reliable data platform is a neverending task. Even if you have a process that works for you and your business there can be unexpected events that require a change in your platform architecture. In this episode the head of data for Mayvenn shares their experience migrating an existing set of streaming workflows onto the Ascend platform after their previous vendor was acquired and changed their offering.

Introducing Confluent Platform 5.4

Confluent

I am pleased to announce the release of Confluent Platform 5.4. Like any new release of Confluent Platform, it’s packed with features. To make them easier to digest, I want […].

Analytics in the Hybrid Cloud – An Architect’s Perspective

Teradata

The hybrid cloud is not just a consideration, but for many of our customers, already a reality. Read more to learn best practices when considering a hybrid or multi-cloud environment.

Case Study: Standard Cognition Uses Rockset to Deliver Data APIs and Real-Time Metrics for Vision AI

Rockset

Walk into a store, grab the items you want, and walk out without having to interact with a cashier or even use a self-checkout system. That’s the no-hassle shopping experience of the future you’ll get at the Standard Store, a demonstration store showcasing the AI-powered checkout pioneered by Standard Cognition. The company makes use of computer vision to remove the need for checkout lines of any sort in physical retail locations.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

The Shots You Get to Take

Grouparoo

At Grouparoo, we have been interviewing a lot of marketers. The overall learning is that it's a hard job. The biggest reason is that they need data to make their campaigns work and do not have the means to get that data. Basically, they need engineers to prioritize writing code to get the data into the tool they are using. That rarely happens.

A Comprehensive Guide to Natural Language Generation

KDnuggets

Follow this overview of Natural Language Generation covering its applications in theory and practice. The evolution of NLG architecture is also described from simple gap-filling to dynamic document creation along with a summary of the most popular NLG models.

Planet Scale SQL For The New Generation Of Applications With YugabyteDB

Data Engineering Podcast

Summary The modern era of software development is identified by ubiquitous access to elastic infrastructure for computation and easy automation of deployment. This has led to a class of applications that can quickly scale to serve users worldwide. This requires a new class of data storage which can accommodate that demand without having to rearchitect your system at each level of growth.

Streams and Tables in Apache Kafka: Topics, Partitions, and Storage Fundamentals

Confluent

Part 1 of this series discussed the basic elements of an event streaming platform: events, streams, and tables. We also introduced the stream-table duality and learned why it is a […].

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Not Just SQL Anymore! Using R and Python with Vantage

Teradata

Learn about the different ways to use R and Python with Vantage and the pros and cons of each option. Read more from our Teradata expert.

RocksDB Is Eating the Database World

Rockset

A Brief History of Distributed Databases: The era of Web 2.0 brought with it a renewed interest in database design. While traditional RDBMS databases served the data storage and data processing needs of the enterprise world well from their commercial inception in the late 1970s until the dotcom era, the large amounts of data processed by the new applications, and the speed at which this data needs to be processed, required a new approach.

The Book to Start You on Machine Learning

KDnuggets

This book is intended for beginners in Machine Learning who are looking for a practical approach to learning by building projects and studying the different Machine Learning algorithms within a specific context.

7 Resources to Becoming a Data Engineer

KDnuggets

An estimated 8,650% growth in the volume of data to 175 zettabytes from 2010 to 2025 has created an enormous need for Data Engineers to build an organization's big data platform to be fast, efficient and scalable.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

I wanna be a data scientist, but… how?

KDnuggets

It’s easy to say "I wanna be a data scientist," but where do you start? How much time does it take to become desirable to companies? Do you need a Master’s degree? Do you need to know every mathematical concept ever derived? The journey might be long, but follow this plan to help you keep moving forward toward your career goal.

Predict Electricity Consumption Using Time Series Analysis

KDnuggets

Time series forecasting is a technique for predicting events through a sequence of time. In this post, we take a small forecasting problem and work through it to the end, learning time series forecasting along the way.

Top 9 Mobile Apps for Learning and Practicing Data Science

KDnuggets

This article covers the top 9 mobile apps that help users learn and practice data science and thereby improve their productivity.

7 Steps to a Job-winning Data Science Resume

KDnuggets

A resume plays a key role in bagging that dream data science job. We break down the nuances of a job-winning data science resume so that you can go ahead and transform your own resume.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

Why Python is One of the Most Preferred Languages for Data Science?

KDnuggets

Why do most data scientists love Python? Learn more about how so many well-developed Python packages can help you accomplish your crucial data science tasks.

Change Data Capture For All Of Your Databases With Debezium

Data Engineering Podcast

Summary Databases are useful for inspecting the current state of your application, but inspecting the history of that data can get messy without a way to track changes as they happen. Debezium is an open source platform for reliable change data capture that you can use to build supplemental systems for everything from maintaining audit trails to real-time updates of your data warehouse.

The Data Science Interview Study Guide

KDnuggets

Preparing for a job interview can be a full-time job, and Data Science interviews are no different. Here are 121 resources that can help you study and quiz your way to landing your dream data science job.

The Future of Machine Learning

KDnuggets

This summary overviews the keynote at TensorFlow World by Jeff Dean, Head of AI at Google, that considered the advancements of computer vision and language models and predicted the direction machine learning model building should follow for the future.

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.

Top 5 AI trends for 2020

KDnuggets

We are all witnessing a staggering growth of AI technology with so many new benefits for people while also changing the way we live and work. As AI continues to grow, which applications will have a significant impact in 2020?

An Introductory Guide to NLP for Data Scientists with 7 Common Techniques

KDnuggets

Data Scientists work with tons of data, and many times that data includes natural language text. This guide reviews 7 common techniques with code examples to introduce you to the essentials of NLP, so you can begin performing analysis and building models from textual data.

Deepfakes Security Risks

KDnuggets

Deepfakes have instilled panic in experts since they first emerged in 2017. Microsoft and Facebook have recently announced a contest to identify deepfakes more efficiently.

Python String Processing Primer

KDnuggets

Pursuing a text analytics path but don't know where to start? Try this string processing primer to first gain an understanding of using Python to manipulate and process strings at a basic level.

What Is Entity Resolution? How It Works & Why It Matters

Entity resolution, sometimes referred to as data matching or fuzzy matching, is critical for data quality, analytics, graph visualization and AI. Learn what entity resolution is, why it matters, how it works and its benefits. Advanced entity resolution using AI is crucial because it efficiently and easily solves many of today’s data quality and analytics problems.