Sat. Apr 25, 2020 - Fri. May 01, 2020

What Does It Mean for a Column to Be Indexed

Start Data Engineering

When optimizing queries on a database table, most developers tend to just create an index on the field to be queried.

IT 130
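
To make the teaser above concrete, here is a minimal sketch of what indexing a column changes, using Python's built-in sqlite3 module rather than whatever database the linked article covers. The `users` table, its columns, and the index name `idx_users_email` are hypothetical and chosen only for illustration. The query plan is inspected before and after the index is created.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's plan for `sql` as one string (the last column holds the detail text)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


# Hypothetical table with 1000 rows, held in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (email, age) VALUES (?, ?)",
    [(f"user{i}@example.com", i % 90) for i in range(1000)],
)

lookup = "SELECT id FROM users WHERE email = 'user42@example.com'"

# Without an index on email, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup)

# With an index on email, SQLite can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a scan of `users` and the second should mention `idx_users_email`.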

ksqlDB Execution Plans: Move Fast But Don’t Break Things

Confluent

The ksqlDB Engineering Team has been hard at work preparing ksqlDB for production availability in Confluent Cloud. This is the first in a series of posts that deep dives into […].

Cloud 103

Taming Complexity In Your Data Driven Organization With DataOps

Data Engineering Podcast

Summary: Data is a critical element to every role in an organization, which is also what makes managing it so challenging. With so many different opinions about which pieces of information are most important, how it needs to be accessed, and what to do with it, many data projects are doomed to failure. In this episode Chris Bergh explains how taking an agile approach to delivering value can drive down the complexity that grows out of the varied needs of the business.

Hadoop 100

All Models Are Wrong (But Some Are Useful)

Teradata

Lots of smart people have created many predictive analytics models to help us manage the COVID-19 pandemic. But many of these models use different inputs, different heuristics, and come to different conclusions.

4 Handy Ways to Read Files in Scala

Rock the JVM

Master file reading in Scala with ease: compare it to other languages and discover how our simple API approach is almost as straightforward as Python's read()

Scala 52

Confluent Platform Now Supports Protobuf, JSON Schema, and Custom Formats

Confluent

When Confluent Schema Registry was first introduced, Apache Avro™ was initially chosen as the default format. While Avro has worked well for many users, over the years, we’ve received many […].

Data 103

More Trending

How to Be Most Productive When Working from Home

Teradata

Some tips and tricks to successfully and productively work from home.

85

3 Fun Call-by-Name Tricks in Scala

Rock the JVM

Maximize call-by-name semantics in Scala: control when your results are evaluated

Scala 52

Broadcom Modernizes Machine Learning and Anomaly Detection with ksqlDB

Confluent

Mainframes are still ubiquitous, used for almost every financial transaction around the world—credit card transactions, billing, payroll, etc. You might think that working on mainframe software would be dull, requiring […].

Index Scan: Using Rockset's Search Index to Speed up Range Scans Over a Specific Field

Rockset

Recently, InfoWorld’s Martin Heller described Rockset as a "one-of-a-kind database for operational analytics." After testing Rockset with a variety of queries on a large collection, Heller rated Rockset 4.5 out of 5 stars. Heller’s review of Rockset can be found here. Only one of the test queries timed out:

SELECT * FROM commons."twitter-firehose" ORDER BY "twitter-firehose".favorite_count DESC LIMIT 10

For context, twitter-firehose is one of Rockset’s demo collections.

It’s Your Data, Set it Free…

Teradata

Complex data siloes in industrial settings can limit the ability of scientists & engineers to develop valuable analytics. Learn how Vantage helps unleash their potential.

IT 59

Kafka Summit Austin 2020 is Going Virtual

Confluent

As you know, we were looking forward to welcoming the Apache Kafka® community to Austin, TX, for Kafka Summit in August. Meeting together in person is always the best way […].

Kafka 99

Measuring Code Coverage of Golang Binaries with Bincover

Confluent

Measuring coverage of Go code is an easy task with the built-in go test tool, but for tests that run a binary, like end-to-end tests, there’s no obvious way to […].

Coding 94

Teradata – Lowest Cost for Enterprise-Scale Analytics

Teradata

Teradata provides the lowest cost per query for enterprise-scale analytics.

52
