Data Scientist, Data Engineer & Other Data Careers, Explained
KDnuggets
APRIL 27, 2022
In this article, we will have a look at five distinct data careers, and hopefully provide some advice on how to get one's feet wet in this convoluted field.
Confluent
APRIL 25, 2022
Data streaming is a new category of technology that is reshaping the way businesses operate, but there hasn’t been a place for everyone in the ecosystem to come together and […].
Cloudera
APRIL 27, 2022
This week I participated in an informative event that Cloudera hosted with TechCrunch: Data and the Culture Transformation. The event was moderated by tech industry analyst Maribel Lopez, and we were joined by Shirley Collie, chief health analytics actuary at Discovery Health in South Africa. The conversations focused on how company data cultures are rapidly evolving and delivering new levels of value to businesses with the emergence of data ecosystems.
Data Engineering Podcast
APRIL 24, 2022
Summary: There are very few tools which are equally useful for data engineers, data scientists, and machine learning engineers. WhyLogs is a powerful library for flexibly instrumenting all of your data systems to understand the entire lifecycle of your data from source to productionized model. In this episode Andy Dang explains why the project was created, how you can apply it to your existing data systems, and how it functions to provide detailed context for being able to gain insight into all o
Speaker: Tamara Fingerlin, Developer Advocate
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
KDnuggets
APRIL 27, 2022
Solving Python coding interview questions is the best way to get ready for an interview. That’s why we’ll lead you through 15 examples and the five concepts these questions cover.
Confluent
APRIL 27, 2022
It’s official: Kafka Summit is back! Technically, it never went away—it just went online. But this week in London, Kafka Summit returned in all its glory to welcome over 1,200 […].
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Data Engineering Podcast
APRIL 24, 2022
Summary: A huge amount of effort goes into modeling and shaping data to make it available for analytical purposes. This is often due to the need to simplify the final queries so that they are performant for visualization or limited exploration. In order to cut down the level of effort involved in making data usable, Matthew Halliday and his co-founders created Incorta as an end-to-end, in-memory analytical engine that removes barriers to insights on your data.
KDnuggets
APRIL 25, 2022
Create and collaborate on data science projects or train machine learning models using free cloud Jupyter notebook platforms. You get a hassle-free IDE experience and free compute resources.
Teradata
APRIL 29, 2022
Find out why data analytics and connectivity will be the difference between retailing taking off and being grounded.
Cloudera
APRIL 29, 2022
April is Autism Awareness Month, and as we close out the month I sat down with Clouderan Susana López Huertas, who shared her story of raising a son with autism and the work she is doing to promote an environment where autistic adults can thrive in the workforce. Meet Susana López Huertas. Susana, who has been a part of Cloudera for about a year, works out of the Madrid office as a senior account manager for the country’s Telecom, Media, and Central Public Sector accounts.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
Netflix Tech
APRIL 26, 2022
By Vivek Kaushal. At Netflix, we aim to provide recommendations that match our members’ interests. To achieve this, we rely on Machine Learning (ML) algorithms. ML algorithms can be only as good as the data that we provide to them. This post will focus on the large volume of high-quality data stored in Axion, our fact store that is leveraged to compute ML features offline.
KDnuggets
APRIL 25, 2022
Metadata is the data providing context about the data, more than what you see in the rows and columns. By managing your metadata, you're effectively creating an encyclopedia of your data assets.
Teradata
APRIL 26, 2022
Managing the new class of emerging risks requires infusing the principles of resiliency and efficient risk analytics into traditional risk management frameworks.
Monte Carlo
APRIL 26, 2022
What is DataOps? DataOps is a discipline that merges data engineering and data science teams to support an organization’s data needs, in a similar way to how DevOps helped scale software engineering. Similar to how DevOps applies CI/CD to software development and operations, DataOps entails a CI/CD-like, automation-first approach to building and scaling data products.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Rockset
APRIL 28, 2022
It is often said time flies when you are having fun, and I couldn't agree more. I have been at Rockset for almost three years now and it is still so interesting to me. On one hand, I am just getting started and have so much more to do, and on the other, I am so proud of the distance we have covered in the last few years! Our customers tell us that the work we are doing matters to them: "Rockset made me a hero on day three of my new job."
KDnuggets
APRIL 26, 2022
Create simple, effective machine learning plots with Yellowbrick.
Zalando Engineering
APRIL 27, 2022
Anyone who has been following the topic of Site Reliability Engineering (SRE) has likely heard of Service Level Objectives (SLOs) and Service Level Indicators (SLIs). SLIs and SLOs are at the core of SRE practices. They are fundamental to establishing the balance between building new features on a product, shipping fast, and working on the reliability of that product.
Monte Carlo
APRIL 28, 2022
Most data pros know Snowflake’s pricing model is consumption-based: you pay for what you use. What many don’t know is that Snowflake actually WANTS you to optimize your costs and has provided helpful features to rightsize your consumption. Waste isn’t good for anyone. Instead of spinning cycles on deteriorated SQL queries, the data cloud provider would rather have you focus those Snowflake credits toward projects like building data apps.
Yelp Engineering
APRIL 24, 2022
Yelp’s mission is to connect people with great local businesses. On the Recommendations & Discovery team, we sift through billions of user-business interactions to learn user preferences. Our solutions power several products across Yelp such as personalized push notifications, email engagement campaigns, the home feed, Collections and more. Here we discuss the generalized user-to-business recommendation model, which is crucial to a lot of these applications.
KDnuggets
APRIL 25, 2022
Check out this list of data science project ideas that you can use to boost your skills, organized by level of expertise.
Scribd Technology
APRIL 27, 2022
We are very excited to be presenting and attending this year’s Data and AI Summit which will be hosted virtually and physically in San Francisco from June 27th-30th. Throughout the course of 2021 we completed a number of really interesting projects built around delta-rs and the Databricks platform which we are thrilled to share with a broader audience.
Rockset
APRIL 26, 2022
As Kafka Summit is in full swing in London this week and the topic of event streaming is all over my LinkedIn feed, I saw a post asking "Is streaming dead?" referring to CNN+ being shut down. In the last few days, Netflix took a once-in-a-lifetime beating in the stock market, and CNN redefined fail fast (pioneered by Silicon Valley) when it announced the breaking news that it will shut down CNN+ just weeks after a very splashy debut.
Speaker: Tamara Fingerlin, Developer Advocate
In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
KDnuggets
APRIL 29, 2022
Top-rated data science tracks consist of multiple project-based courses covering all aspects of data. They include an introduction to Python/R, data ingestion & manipulation, data visualization, machine learning, and reporting.
KDnuggets
APRIL 29, 2022
Extract, profile, and manage your customer data in a flash with customer data management solutions, and achieve a customer-centric culture.
KDnuggets
APRIL 28, 2022
Here are some of the best practices for implementing MLOps successfully.
KDnuggets
APRIL 28, 2022
We’re proud to announce that the 4th annual Knowledge Graph Conference is taking place on May 2-6 at Cornell Tech, NYC and virtually on Airmeet.
Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage
When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m
KDnuggets
APRIL 28, 2022
Need high availability for SQL Server Docker containers in Kubernetes? Here’s how to get it.
KDnuggets
APRIL 28, 2022
If you don’t already know a programming language, or if you’re deciding to choose another language, have a read and see if Python is for you.
KDnuggets
APRIL 25, 2022
SQL is a must-know for anyone working in the data industry. Here’s how you can learn it from scratch.
KDnuggets
APRIL 27, 2022
How much impact does sparsity have on model performance?