15 Free Machine Learning and Deep Learning Books
KDnuggets
OCTOBER 31, 2022
Check out this list of 15 FREE ebooks for learning machine learning and deep learning.
Start Data Engineering
OCTOBER 22, 2022
1. Introduction 2. Data project template 2.1. Prerequisites 2.2. Setup infra 2.3. Tear down infra 3. Set up data infrastructure 3.1. Run data infra on your laptop with containers 3.2. Manage cloud infrastructure with code 4. Set up development workflow 4.1. CI: Automated tests & checks before the merge with GitHub Actions 4.2. CD: Deploy to production servers with GitHub Actions 4.3.
Data Engineering Podcast
OCTOBER 30, 2022
Summary Business intelligence has grown beyond its initial manifestation as dashboards and reports. In its current incarnation it has become a ubiquitous need for analytics and opportunities to answer questions with data. In this episode Amir Orad discusses the Sisense platform and how it facilitates the embedding of analytics and data insights in every aspect of organizational and end-user experiences.
The Pragmatic Engineer
OCTOBER 27, 2022
This issue was written in October 2022 and sent to all subscribers of The Pragmatic Engineer Newsletter that month. The observations on how Big Tech hiring will slow down have since been validated, with Meta not only laying off staff in November but also rescinding offers in January 2023, and Amazon doing the same. If you want to get the pulse of the industry in your inbox, subscribe.
Simon Späti
OCTOBER 19, 2022
Will Rust kill Python for Data Engineers? If you only came here to know this, my answer is no. Betteridge’s Law strikes again! But then again, you have to ask: was Python made for Data Engineering in the first place? Rust may not replace Python outright, but it has consumed more and more of JavaScript tooling and there are increasingly many projects trying to do the same with Python/Data Engineering.
Jesse Anderson
OCTOBER 19, 2022
I have a calendar reminder that tells me when I founded Big Data Institute. It just told me I founded the company eight years ago. The reminder is called “Independent Anniversary.” It’s the day I split off and executed my vision for an independent, big data consulting company. Independence has all sorts of manifestations. For you, it’s an independent look at technology and vendors from someone who’s worked at a vendor (Cloudera) and worked in distributed systems for even longer.
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Confluent
OCTOBER 3, 2022
Apache Kafka 3.3 includes KRaft mode, which improves partition scalability and resiliency while simplifying Kafka deployment, as well as updates to Kafka Streams, Connect, and more.
Data Engineering Podcast
OCTOBER 23, 2022
Summary Agile methodologies have been adopted by a majority of teams for building software applications. Applying those same practices to data can prove challenging due to the number of systems that need to be included to implement a complete feature. In this episode Shane Gibson shares practical advice and insights from his years of experience as a consultant and engineer working in data about how to adopt agile principles in your data work so that you can move faster and provide more value to
The Pragmatic Engineer
OCTOBER 20, 2022
👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of five topics in today’s subscriber-only The Scoop issue. To get this newsletter every week, subscribe. Pollen, the events festival tech startup, went bankrupt in August after raising more than $200M in venture funding. In an exclusive investigative article, I covered the events and details leading up to this bankruptcy.
Jesse Anderson
OCTOBER 5, 2022
My guest this week is Brent Dykes, Founder and Chief Data Storyteller at Analytics Hero. Before he founded his own company, he was at Omniture, Adobe, and Domo. Analytics Hero is a consulting business based around data storytelling. Data storytelling was a new concept to me. Brent defines it as “a structured approach for communicating insights to a targeted audience using narrative elements and explanatory visuals.
KDnuggets
OCTOBER 24, 2022
Preprocessing data for machine learning models is a core general skill for any Data Scientist or Machine Learning Engineer. Follow this guide using Pandas and Scikit-learn to improve your techniques and make sure your data leads to the best possible outcome.
Confluent
OCTOBER 4, 2022
Confluent’s new Stream Designer is the industry’s first visual interface for rapidly building, testing, and deploying streaming data pipelines natively on Apache Kafka.
Teradata
OCTOBER 12, 2022
ClearScape Analytics provides robust functionality giving people across the organization the ability to efficiently execute their roles in the analytics process on a common platform.
The Pragmatic Engineer
OCTOBER 10, 2022
Part of this article was originally published in The Scoop #27, for subscribers of The Pragmatic Engineer Newsletter last week. I decided to publish this section for everyone to read after the Business Insider article claiming that 15% of Facebook employees - 12,000 people - may lose their jobs started to spread within the media. The Business Insider article was not specific to software engineers but still spread heavily within tech circles.
Data Engineering Podcast
OCTOBER 30, 2022
Summary One of the most impactful technologies for data analytics in recent years has been dbt. It’s hard to have a conversation about data engineering or analysis without mentioning it. Despite its widespread adoption there are still rough edges in its workflow that cause friction for data analysts. To help simplify the adoption and management of dbt projects Nandam Karthik helped create Optimus.
DataKitchen
OCTOBER 14, 2022
The post Podcast: Scaling DataOps first appeared on DataKitchen.
KDnuggets
OCTOBER 25, 2022
As more businesses experiment with data, they realize that developing a machine learning (ML) model is only one of many steps in the ML lifecycle.
Confluent
OCTOBER 6, 2022
Current 2022 is a wrap! Here are some of the top keynote speeches, exciting new data streaming technologies, popular sessions, and where to find videos online.
Teradata
OCTOBER 13, 2022
Implementing a cookieless identity solution will help businesses maintain advertising efforts amid the phaseout of third-party cookies.
Cloudera
OCTOBER 17, 2022
As we wrap up Hispanic Heritage Month, this #ClouderaLife Spotlight features Elias Avila, senior staff proactive support engineer for Cloudera. In this spotlight, we talk about his career in technology and his philosophy for getting the most out of work in terms of satisfaction and advancement. We also talk about his upbringing in the primarily Mexican American community of Salinas, California, and the important role Hispanics play in California’s Central Valley.
Data Engineering Podcast
OCTOBER 23, 2022
Summary The database market has seen unprecedented activity in recent years, with new options addressing a variety of needs being introduced on a nearly constant basis. Despite that, there are a handful of databases that continue to be adopted due to their proven reliability and robust features. MariaDB is one of those default options that has continued to grow and innovate while offering a familiar and stable experience.
AltexSoft
OCTOBER 21, 2022
We say ‘xerox’ speaking of any photocopy, whether or not it was created by a machine from the Xerox corporation. We describe information search on the Internet with just one word — ‘google’. We ‘photoshop pictures’ instead of editing them on the computer. And COVID-19 made ‘zoom’ a synonym for a videoconference. Kafka can continue the list of brand names that became generic terms for the entire type of technology.
KDnuggets
OCTOBER 19, 2022
This post is a summary of 2 distinct frameworks for approaching machine learning tasks, followed by a distilled third. Do they differ considerably (or at all) from each other, or from other such processes available?
Netflix Tech
OCTOBER 18, 2022
By Jun He, Akash Dwivedi, Natallia Dzenisenka, Snehal Chennuru, Praneeth Yenugutala, Pawan Dixit. At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations. A large number of batch workflows run daily to serve various business needs.
Teradata
OCTOBER 25, 2022
Developing an IT sustainability strategy can bring major positive change across the enterprise, lowering costs and optimizing resource use.
Cloudera
OCTOBER 11, 2022
A recent VentureBeat article, “4 AI trends: It’s all about scale in 2022 (so far),” highlighted the importance of scalability. I recommend you read the entire piece, but to me the key takeaway – AI at scale isn’t magic, it’s data – is reminiscent of the 1992 presidential election, when political consultant James Carville succinctly summarized the key to winning – “it’s the economy”.
Data Engineering Podcast
OCTOBER 16, 2022
Summary The "data lakehouse" architecture balances the scalability and flexibility of data lakes with the ease of use and transaction support of data warehouses. Dremio is one of the companies leading the development of products and services that support the open lakehouse. In this episode Jason Hughes explains what it means for a lakehouse to be "open" and describes the different components that the Dremio team build and contribute to.
AltexSoft
OCTOBER 13, 2022
Take a look at the featured image above. Beautiful, isn’t it? The interesting thing is, it isn’t a painting drawn by some famous artist, nor is it a photo taken by a satellite. The image you see has been generated with the help of Midjourney — a proprietary artificial intelligence program that creates pictures from textual descriptions. Neural nets can create images, video, and audio content that not every person can.
KDnuggets
OCTOBER 17, 2022
Sparse features can cause problems like overfitting and suboptimal results in learning models, and understanding why this happens is crucial when developing models. Multiple methods, including dimensionality reduction, are available to overcome issues due to sparse features.