April, 2020

Review: Building a Real Time Data Warehouse

Start Data Engineering

Many data engineers coming from traditional batch processing frameworks have questions about real-time data processing systems, like “What kind of data model did you implement for real-time processing?”

Preventing Fraud and Fighting Account Takeovers with Kafka Streams

Confluent

Many companies have recently started to take cybersecurity and data protection even more seriously, particularly driven by the recent General Data Protection Regulation (GDPR) legislation. They are increasing their investment […].

Taming Complexity In Your Data Driven Organization With DataOps

Data Engineering Podcast

Summary: Data is a critical element to every role in an organization, which is also what makes managing it so challenging. With so many different opinions about which pieces of information are most important, how it needs to be accessed, and what to do with it, many data projects are doomed to failure. In this episode Chris Bergh explains how taking an agile approach to delivering value can drive down the complexity that grows out of the varied needs of the business.

Teradata Supports China’s Fight Against COVID-19

Teradata

By fully utilizing the data for telco operators in China, Teradata helped communities battle the COVID-19 epidemic through ongoing public health communication, travel updates and inquiries.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

How Netflix brings safer and faster streaming experience to the living room on crowded networks…

Netflix Tech

How Netflix brings safer and faster streaming experience to the living room on crowded networks using TLS 1.3. By Sekwon Choi. At Netflix, we are obsessed with the best streaming experiences. We want playback to start instantly and to never stop unexpectedly in any network environment. We are also committed to protecting users’ privacy and service security without sacrificing any part of the playback experience.

5 Must Know Workforce Analytics as an HR Manager

U-Next

Workforce analytics – what sounds like a complex technical term is relatively easy and inevitable when you understand its importance and how to leverage its full potential to your benefit. Some of the elite HR analysts out there have implemented workforce analytics in their businesses and have seen results that their competitors couldn’t dream of. As an HR manager, you need to be aware of these analytics concepts and theories to pave the way for organizational and departmental success and curb any b

More Trending

What’s New in Apache Kafka 2.5

Confluent

On behalf of the Apache Kafka® community, it is my pleasure to announce the release of Apache Kafka 2.5.0. The community has created another exciting release. We are making progress […].

Building Real Time Applications On Streaming Data With Eventador

Data Engineering Podcast

Summary: Modern applications frequently require access to real-time data, but building and maintaining the systems that make that possible is a complex and time-consuming endeavor. Eventador is a managed platform designed to let you focus on using the data that you collect, without worrying about how to make it reliable. In this episode Eventador Founder and CEO Kenny Gorman describes how the platform is architected, the challenges inherent to managing reliable streams of data, the simplicity off

All Models Are Wrong (But Some Are Useful)

Teradata

Lots of smart people have created many predictive analytics models to help us manage the COVID-19 pandemic. But many of these models use different inputs, different heuristics, and come to different conclusions.

Bringing 4K and HDR to Anime at Netflix with Sol Levante

Netflix Tech

By Haruka Miyagawa & Kylee Peña

Prepare Now: 2025’s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Scala For Big Data Engineering – Why should you care?

Advancing Analytics: Data Engineering

The thought of learning Scala fills many with fear; its very name often causes feelings of terror. This suggests it’s either doing something very good, or very bad! The truth is Scala can be used for many things, from a simple web application to complex ML (machine learning). Moreover, unusually, it fully incorporates two programming paradigms: OOP (object-oriented programming) and FP (functional programming).

Advantages of Using dbt (Data Build Tool)

Start Data Engineering

In this article we go over the reasons why someone might want to use dbt. If you are interested in learning dbt, check out this article.

Confluent Raises $250M and Kicks Off Project Metamorphosis

Confluent

It’s an exciting day for Confluent, in the middle of a very unusual and difficult time in the larger world. Nonetheless, I […].

Building A Knowledge Graph Of Commercial Real Estate At Cherre

Data Engineering Podcast

Summary: Knowledge graphs are a data resource that can answer questions beyond the scope of traditional data analytics. By organizing and storing data to emphasize the relationship between entities, we can discover the complex connections between multiple sources of information. In this episode John Maiden talks about how Cherre builds knowledge graphs that provide powerful insights for their customers and the engineering challenges of building a scalable graph.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Breaking the COVID-19 Chain with Data Analytics

Teradata

How can Teradata's data analytics platform help communities stop the spread of COVID-19? Find out more.

AI and Automation Quick Wins that HR Teams Should Focus On

U-Next

Is your HR department failing to embrace the digital transformation (especially AI) revolution? If so, you are not alone. Human resources departments are notorious for lagging behind in adopting new technologies. It comes as no surprise that the use of automation and artificial intelligence in HR is still relatively rare compared to other departments in organizations across different industries.

Lessons Gleaned from Attending and Speaking at the World Economic Forum for Africa 2019 Gathering

Hepta Analytics

Last year in September, Hepta Analytics was among the few startup companies invited to participate in the World Economic Forum for Africa in Cape Town, South Africa. This was a rare opportunity for a young company like ours, given the high-profile individuals, such as heads of state, invited to attend these types of events. It also included a great mix of local, regional and international company executives, academic and civil society leaders, all coming together to discuss one thing: Shaping inclusi

Apache Airflow Review: the good, the bad

Start Data Engineering

When getting started with Apache Airflow, data engineers have questions similar to the two below: “What are people’s opinions of Airflow?”

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

Introducing Confluent Platform 5.5

Confluent

We are pleased to announce the release of Confluent Platform 5.5. With this release, Confluent makes event streaming more broadly accessible to developers of all backgrounds, enhancing three categories of […].

Google Sheets Source

Grouparoo

Grouparoo is the Reverse ETL platform to connect Google Sheets data to your SaaS tools. This enables all of those crazy sheets out there to be the source of truth for your profiles and be fed into your marketing tools. Don't forget: with great power comes great responsibility! Google setup: In Grouparoo, apps make the connection to facilitate data movement in the form of sources and destinations.

I’m Sorry CXOs, but You’re Mostly Doing Analytics All Wrong

Teradata

There is no ROI in technology - specifically in data analytics, AI & Machine Learning - until we deploy in production and change the way we do business.

Predict Attrition in a Company by Help of Analytics

U-Next

There’s always a sense of apprehension when someone walks down to the HR desk to put down their papers. More so if it is a key employee whose loss is going to be a definite setback. Then people wonder – the upper management, the line manager, the HR department – how it is that they never saw this coming. There used to be a time when employee retention processes would kick in only after an employee resigned.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

4 Handy Ways to Read Files in Scala

Rock the JVM

Master file reading in Scala with ease: compare it to other languages and discover how our simple API approach is almost as straightforward as Python's read().

Index Scan: Using Rockset's Search Index to Speed up Range Scans Over a Specific Field

Rockset

Recently, InfoWorld’s Martin Heller described Rockset as a "one-of-a-kind database for operational analytics." After testing Rockset with a variety of queries on a large collection, Heller rated Rockset 4.5 out of 5 stars. Heller’s review of Rockset can be found here. Only one of the test queries timed out:

SELECT * FROM commons."twitter-firehose" ORDER BY "twitter-firehose".favorite_count DESC LIMIT 10

For context, twitter-firehose is one of Rockset’s demo collections.

Confluent Platform Now Supports Protobuf, JSON Schema, and Custom Formats

Confluent

When Confluent Schema Registry was first introduced, Apache Avro™ was initially chosen as the default format. While Avro has worked well for many users, over the years, we’ve received many […].

Open Sourcing a GitHub Engagement Dashboard

Preset

This post details the process of building a GitHub community dashboard by extracting data out of the GitHub API, loading it into a database, and building a Superset dashboard on top of it.

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.

Teradata and the MIT COVID Challenge Hackathon

Teradata

Teradata participated in the MIT COVID Challenge Hackathon to design approaches and mentor teams focused on stamping out the pandemic. Learn more.

How to Make the Most of HR Analytics?

U-Next

It’s time organizations realized that great skillsets, like HR Analytics, are the key to bigger businesses. Companies now find it fashionable to promote contests around “Best place to work”, and LinkedIn and Employee Relations departments have gained significant importance in the last few years. “Human Resources isn’t a thing we do.

Understanding Spark Query Plans

Rock the JVM

In this article, you'll learn one of the most important Spark skills: reading how your job will run, which is foundational for any further Spark optimization.

Can I Do SQL-Style Joins in Elasticsearch?

Rockset

Elasticsearch is an open-source, distributed JSON-based search and analytics engine built using Apache Lucene with the purpose of providing fast real-time search functionality. It is a NoSQL data store that is document-oriented, scalable, and schemaless by default. Elasticsearch is designed to work at scale with large data sets. As a search engine, it provides fast indexing and search capabilities that can be horizontally scaled across multiple nodes.

What Is Entity Resolution? How It Works & Why It Matters

Sometimes referred to as data matching or fuzzy matching, entity resolution is critical for data quality, analytics, graph visualization and AI. Learn what entity resolution is, why it matters, how it works and its benefits. Advanced entity resolution using AI is crucial because it efficiently and easily solves many of today’s data quality and analytics problems.