How to Engineer Date Features in Python
KDnuggets
MARCH 15, 2022
This article discusses and demonstrates how to quickly engineer some common date features using Python.
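As a rough illustration of what "common date features" can mean, here is a minimal sketch using only Python's standard library. The function name `date_features` and the choice of features are assumptions made for this example; they are not taken from the linked article, which may use a different library or feature set.

```python
from datetime import date


def date_features(d: date) -> dict:
    """Derive a few common calendar features from a single date (illustrative only)."""
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),            # Monday == 0 ... Sunday == 6
        "day_of_year": d.timetuple().tm_yday,  # 1-based day within the year
        "quarter": (d.month - 1) // 3 + 1,     # 1..4
        "is_weekend": d.weekday() >= 5,        # Saturday or Sunday
        "iso_week": d.isocalendar()[1],        # ISO 8601 week number
    }


# Hypothetical input: the publication date shown in this digest entry.
features = date_features(date(2022, 3, 15))
print(features)
```

Each value is a plain integer or boolean, so the resulting dictionary can be added as columns to a tabular dataset.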
Start Data Engineering
MARCH 18, 2022
It can be difficult to know where to begin when starting a data engineering side project. If you have wondered What data to use for your data project?
Confluent
MARCH 15, 2022
We are excited to announce ksqlDB 0.24! It comes with a slew of improvements and new features. Access to Apache Kafka® record headers will enable a whole host of new […].
Data Engineering Podcast
MARCH 13, 2022
Summary Data observability is a term that has been co-opted by numerous vendors with varying ideas of what it should mean. At Acceldata, they view it as a holistic approach to understanding the computational and logical elements that power your analytical capabilities. In this episode Tristan Spaulding, head of product at Acceldata, explains the multi-dimensional nature of gaining visibility into your running data platform and how they have architected their platform to assist in that endeavor.
Speaker: Tamara Fingerlin, Developer Advocate
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
KDnuggets
MARCH 14, 2022
In this article, we will be going through the algorithms that can be used for classification tasks.
Teradata
MARCH 16, 2022
Neither crystal balls nor black boxes will provide the agility needed for accurate demand forecasting in today’s retail & CPG environment. Learn more about new approaches to FDP.
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Data Engineering Podcast
MARCH 13, 2022
Summary The modern data stack is a constantly moving target which makes it difficult to adopt without prior experience. In order to accelerate the time to deliver useful insights at organizations of all sizes that are looking to take advantage of these new and evolving architectures Tarush Aggarwal founded 5X Data. In this episode he explains how he works with these companies to deploy the technology stack and pairs them with an experienced engineer who assists with the implementation and traini
KDnuggets
MARCH 15, 2022
Also: Decision Tree Algorithm, Explained; The Complete Collection of Data Science Cheat Sheets – Part 2; Top Programming Languages and Their Uses; The Complete Collection of Data Science Cheat Sheets – Part 1.
Confluent
MARCH 16, 2022
Gemma Singleton joined Confluent three years ago as an associate level solutions architect. While the job at Confluent appeared less “senior” than the position she was leaving, moving into a […].
Cloudera
MARCH 17, 2022
Data is the true currency of the digital age, and it plays an indispensable role in defining and accelerating the mission of Government agencies. Every level of government is awash in data (both structured and unstructured) that is perpetually in motion. It is constantly generated – and always growing in volume – by an ever-growing range of sources, from IoT sensors and other connected devices at the edge to web and social media to video and more.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
Monte Carlo
MARCH 14, 2022
The companies we talk to are diligently building their data product or platform. This includes migrating to Snowflake, integrating with Databricks, moving towards a data mesh, or generally investing in their data stack. Increasingly, we are seeing data departments modernize their team structure with data product managers at the helm of such projects.
KDnuggets
MARCH 16, 2022
The best knowledge is still found in libraries, within books. In this article, discover some of the top recommended Data Science books catering to beginners.
AltexSoft
MARCH 18, 2022
As it sometimes happens, when one approach doesn’t work to solve a particular problem, you try a different one. When that approach doesn’t work either, it may be a good idea to combine the best parts of both. At least that’s often the case with technology tasks. Machine learning is no exception. You’ve probably heard of two main ML techniques — supervised and unsupervised learning.
Cloudera
MARCH 15, 2022
Each year, various departments and teams across the organization welcome early talent to embark on internships which allow them to kickstart their careers within the technology and big data industries. One of those early talent interns is Trang Luong, who worked within the APAC Inside Sales team earlier this year for six months, supporting the team in connecting with prospects and customers to guide them in their data-driven journey.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Rockset
MARCH 17, 2022
If you’re thinking about implementing real-time analytics, you've probably realized that you're going to need real-time updates. Real-time updates give you the power to insert, delete and update data in place. To do that, you'll need something more: a mutable database. In this post we'll discuss the three main reasons why a mutable database is required for real-time updates. 1) Late Arriving Data in Time-Based Window Rollups ⌛️ Let's say you have a rollup that's counting events for each hour.
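To make the late-arriving-data case concrete, here is a minimal sketch of an hourly rollup that is updated in place. It assumes an in-memory `Counter` as a stand-in for a mutable store; the names `hour_bucket`, `rollup`, and `ingest` are hypothetical and do not reflect how Rockset or any particular database implements rollups.

```python
from collections import Counter
from datetime import datetime


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


# Mutable rollup: event count per hour, keyed by the start of the hour.
rollup = Counter()


def ingest(event_time: datetime) -> None:
    """Increment the count for the hour the event belongs to, even if that hour has passed."""
    rollup[hour_bucket(event_time)] += 1


# Events arriving in order.
ingest(datetime(2022, 3, 17, 9, 5))
ingest(datetime(2022, 3, 17, 9, 40))
ingest(datetime(2022, 3, 17, 10, 2))

# A late event for the 09:00 hour arrives after the 10:00 hour has started.
# Because the store is mutable, the existing 09:00 count is simply updated.
ingest(datetime(2022, 3, 17, 9, 59))

nine = rollup[datetime(2022, 3, 17, 9, 0)]
ten = rollup[datetime(2022, 3, 17, 10, 0)]
print(nine, ten)
```

With an append-only store, the late event would instead require rewriting or recomputing the 09:00 aggregate, which is the cost the blurb alludes to.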
KDnuggets
MARCH 18, 2022
Real-time AI/ML is on the rise and feature stores are key to successfully deploying them. Read on to see how the choice of online store and the feature store architecture play important roles in determining its performance and cost.
Palantir
MARCH 16, 2022
Engineering for Impact: Problem solving with purpose at Palantir Editor’s note: In this blog post, we sit down with UK Health Lead Joanna Peller, recipient of Data IQ’s 2022 100 award, to discuss what she’s learned leading Palantir’s UK Health work during the pandemic. Describe your path to Palantir. After growing up in Boston, Massachusetts, I moved to the UK to study Mathematics at UCL.
Cloudera
MARCH 14, 2022
The world cannot ignore the horrific invasion of Ukraine and the plight of the Ukrainian people, who are facing death and devastation in the defense of their country. Our Cloudera team members and their families in Ukraine have been impacted in ways we cannot imagine, and their safety is our top priority. Cloudera – and many of our employees individually – are engaged in multiple activities to help Ukrainians, including donating supplies, running a donation matching program, and providing acco
Rockset
MARCH 15, 2022
Growth of the Internet of Things (IoT) hasn’t matched the hype due to numerous pain points: limited, unreliable network coverage, high connectivity and device maintenance costs, and the uncertainty created by diverse, constantly-evolving cellular standards (4G versus 5G, LTE-M versus NB-IoT, etc.). 1NCE was founded in 2017 as a pure-play IoT connectivity provider to jumpstart IoT deployments by solving every one of those pain points.
KDnuggets
MARCH 17, 2022
Check out this article on using CTGANs to create synthetic datasets for reducing privacy risks, training and testing machine learning models, and developing data-centric AI products.
KDnuggets
MARCH 17, 2022
How will AI and data science impact the world of business in the next decade? Find out what trends to look out for in 2022 and beyond at NVIDIA GTC.
KDnuggets
MARCH 17, 2022
In this short blog, we’ll review the process of taking a POC data science pipeline (ML/Deep learning/NLP) that was built on Google Colab and transforming it into a pipeline that can run in parallel at scale and works with Git so the team can collaborate on it.
Speaker: Tamara Fingerlin, Developer Advocate
In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
KDnuggets
MARCH 18, 2022
In this blog, you will see how to build a web-application with Greppo and Google Earth using Python.
KDnuggets
MARCH 15, 2022
Have you spent hours taking coding bootcamps, online courses, and tutorials, only to feel like you aren’t getting anywhere?
KDnuggets
MARCH 16, 2022
In this guide, we'll learn how to use multiple inheritance in Python and make it sustainable.
KDnuggets
MARCH 17, 2022
Still running your artificial intelligence workloads in the cloud? If your applications depend on techniques like person detection and pose estimation to name a few, then it’s time you looked into on-device AI.
Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage
When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m
KDnuggets
MARCH 16, 2022
Competition for viewers’ attention is not over after the players leave the field. Now, anyone who can put up a highlight compilation or a game summarization first gets the edge. So, let’s talk about how media companies do just that — with the help of Artificial Intelligence.
KDnuggets
MARCH 14, 2022
If your new year's resolution was to start a career in data science but you have stalled, simply follow these easy steps to acquire professional certification within a year.
KDnuggets
MARCH 16, 2022
Deep dive into Kubernetes, the essential tool for anyone deploying and managing cloud-native applications!
KDnuggets
MARCH 18, 2022
Join this webinar and learn how competitive companies utilize third-party data to enhance mobile customer experiences through personalization and localization.