How to Host a Virtual Global Data Science Hackathon
Teradata
MARCH 25, 2021
Learn how best to host a virtual hackathon, or any virtual event, with these tips and tricks from our Teradata team. Read more.
François Nguyen
MARCH 22, 2021
Just an illustration: not the truth, and you can certainly do it with other technologies. TL;DR: After setting up and organizing the teams, we describe 4 topics to make data mesh a reality: the self-serve platform based on a serverless philosophy (life is too short to do provisioning); the building of data products (as code), where we build data workflows, not data pipelines; the promotion of data domains, where the metadata on the data life cycle is as important as your data. The old dat
Cloudera
MARCH 23, 2021
At our Partner Sales Kickoff, we announced the winners of the 2021 Cloudera Partner Awards. These six awards recognize Cloudera partners who are dedicated to enabling customers to do more with their data by leveraging the power of an enterprise data cloud. Thank you to this year’s winners for their partnership in helping our joint customers drive value from their data in the hybrid cloud.
Data Engineering Podcast
MARCH 22, 2021
Summary: The world of business is becoming increasingly dependent on information that is accurate up to the minute. For analytical systems, the only way to provide this reliably is by implementing change data capture (CDC). Unfortunately, this is a non-trivial undertaking, particularly for teams that don’t have extensive experience working with streaming data and complex distributed systems.
Confluent
MARCH 26, 2021
I previously showed how to install and set up Apache Kafka® on Windows in minutes by using the Windows Subsystem for Linux 2 (WSL 2). From there, it’s only a […].
Netflix Tech
MARCH 22, 2021
Written by Nick Tomlin, Michael Possumato, and Rahul Pilani. This post shares how the Revenue & Growth Tools (RGT) team approaches creating full-stack tools for the teams that are the financial backbone of Netflix. Our primary partners are the teams of Revenue and Growth Engineering (RGE): Growth, Membership, Billing, Payments, and Partner Subscription.
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Teradata
MARCH 23, 2021
As the auto sector transforms, vehicle data is becoming one of the most important sources of insight. But if it is left in fragmented silos, it quickly becomes a cost & delivers little value.
DataKitchen
MARCH 26, 2021
The post How the DataKitchen Platform Delivers End-to-End Data Observability first appeared on DataKitchen.
Data Council
MARCH 25, 2021
Here's our March 2021 roundup of links from across the web that we selected for you: 1. How to Build a Community (Fishtown Analytics) Claire Carroll's first personal blog post on community-building is a must-read. As Fishtown Analytics' community manager for the last 2.5 years, she's arguably behind the success of the dbt community and its best-in-class practices, so we expected good advice… but she really hit the ball out of the park with this one!
Cloudera
MARCH 24, 2021
Introduction. One of the most effective ways to improve performance and minimize cost in database systems today is by avoiding unnecessary work, such as data reads from the storage layer (e.g., disks, remote storage), transfers over the network, or even data materialization during query execution. Since its early days, Apache Hive has improved distributed query execution by pushing down column filter predicates to storage handlers like HBase or columnar data format readers such as Apache ORC.
Ripple Engineering
MARCH 25, 2021
Amidst the work-from-home environment, Vidya Mani joined Ripple in early 2020 as the Vice President of Engineering for RippleNet. A year into her role, she focuses on improving Ripple’s infrastructure and strengthening her team to further the company’s vision for a more inclusive financial system. RippleNet is an enterprise solution which helps banks and other financial institutions streamline global payments and reach new customers.
Confluent
MARCH 25, 2021
Self-managing a distributed system like Apache Kafka®, along with building and operating Kafka connectors, is complex and resource intensive. It requires significant Kafka skills and expertise in the development and […].
Grouparoo
MARCH 24, 2021
The Grouparoo application is written in JavaScript (Node). It uses the modern promise-based pattern (async/await) for reading and writing data asynchronously. And we do this a lot: we are a data sync tool! Every once in a while we'll come across a JavaScript library that is written around the old callback-based pattern, where the error object is the first parameter in the callback function, followed by the result.
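To make the two patterns concrete, here is a minimal sketch of wrapping an error-first callback function so it can be used with async/await. The names `legacyDouble`, `NodeCallback`, and `promisifyOne` are hypothetical and chosen only for illustration; the teaser does not say how Grouparoo itself handles this.

```typescript
// Error-first callback shape: the error (or null) comes first, then the result.
type NodeCallback<T> = (err: Error | null, result?: T) => void;

// Hypothetical stand-in for an older, callback-based library function.
function legacyDouble(n: number, callback: NodeCallback<number>): void {
  if (!Number.isFinite(n)) {
    callback(new Error("n must be a finite number"));
    return;
  }
  callback(null, n * 2);
}

// Wrap a one-argument, error-first function so that it returns a Promise.
function promisifyOne<A, T>(
  fn: (arg: A, callback: NodeCallback<T>) => void
): (arg: A) => Promise<T> {
  return (arg: A) =>
    new Promise<T>((resolve, reject) => {
      fn(arg, (err, result) => {
        if (err) {
          reject(err); // the error argument becomes a rejected promise
        } else {
          resolve(result as T); // the result argument becomes the resolved value
        }
      });
    });
}

const doubleAsync = promisifyOne(legacyDouble);

async function main(): Promise<void> {
  // The wrapped function can now be awaited like any promise-based API.
  const value = await doubleAsync(21);
  console.log(value);
}

main().catch(console.error);
```

In Node itself, the built-in `util.promisify` performs this kind of wrapping for functions that follow the error-first convention, so a hand-written wrapper like the one above is mainly useful for seeing what happens under the hood.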
Cloudera
MARCH 24, 2021
For over 8 years, Christine Sherry has brought her hard-work ethic and skills to Cloudera. In her time here, she’s climbed up the ladder and currently sits as our Director, Global Critical Incident Management. In her role, she leads Cloudera’s Global Support Critical Incident team, which is responsible for managing our customers’ most critical technical issues to resolution.
DataKitchen
MARCH 23, 2021
The post DataOps and the Cloud: A Match Made in Heaven first appeared on DataKitchen.
Preset
MARCH 21, 2021
How to contribute to the Apache Superset™ Project. Help others, advocate for Superset and code development.
Teradata
MARCH 21, 2021
Working to optimize Retail & CPG Supply Chains often feels like a life-sized game of Whack-a-Mole -- making a change here creates an issue there. Find out how integrated, real-time data can help.
Cloudera
MARCH 25, 2021
Data privacy is an increasingly complex and contentious topic. The appropriate use of data and transparency about its potential uses are at the center of debate among the largest Big Tech companies. The protection and controls around data become increasingly complex when used in the context of banking and insurance activities. Personal and confidential information carries heightened sensitivity in the light of financial, health and insurance activities.
DataKitchen
MARCH 25, 2021
The post DataKitchen Wins DataOps Company of the Year, 2021, from the Data Breakthrough Awards first appeared on DataKitchen.
Monte Carlo
MARCH 24, 2021
As a new or aspiring data engineer, there are some essential technologies and frameworks you should know. How to build a data pipeline? Check. How to clean, transform, and model your data? Check. How to prevent broken data workflows before you get that frantic call from your CEO about her missing data? Maybe not. By leveraging best practices from our friends in software engineering and developer operations (DevOps), we can think more strategically about tackling the “good pipelines, bad data” pr
Rockset
MARCH 23, 2021
Introduction: 502 and 504 errors can be a nuisance for Rockset and our users. For many users running customer-facing applications on Rockset, availability and uptime are very important, so even a single 5xx error is cause for concern. As a cloud service, Rockset deploys code to our production clusters multiple times a week, which means that any component of our distributed system has to stop and restart with new code in an error-free way.
Cloudera
MARCH 25, 2021
I used to work for a female CEO who said that in her organization, “the powder room is the power room”. It’s been a while since I heard that statement, yet such an environment is still far from the truth for many companies. Women are still underrepresented in the science, technology, engineering, and mathematics (STEM) fields, and even more so in leadership positions.
Preset
MARCH 21, 2021
Prophet is a popular time-series forecasting library created by Facebook. Learn how to use Prophet and Apache Druid for forecasting.