
Data Pipeline Design Patterns - #2. Coding patterns in Python

Start Data Engineering

Introduction; Sample project; Code design patterns; Singleton & Object pool patterns; Python helpers; Functional design; Factory pattern; Strategy pattern; Dataclass; Context Managers; Testing with pytest.

Designing 148

How to test PySpark code with pytest

Start Data Engineering

Introduction; Ensure the code’s logic is working as expected with tests; Test types for data pipelines; pytest: A powerful Python library for testing; Set context, run code, check results & clean up; Tests are identified by their name; Use fixture to create fake data for testing.

Coding 208

Python Essentials for Data Engineers

Start Data Engineering

Introduction; Data is stored on disk and processed in memory; Running the code; Run on Codespaces; Run on your laptop; Using Python REPL; Python basics; Python is used for extracting data from sources, transforming it, & loading it into a destination; [Extract & Load] Read and write data to any system; [Transform] Process data in Python or instruct (..)

Python 148

Announcing FawltyDeps - a dependency checker for your Python code

Tweag

It is a truth universally acknowledged that the Python packaging ecosystem is in need of a good dependency checker. If you work with Python, and care about keeping your projects lean and repeatable, then this is for you. The dependency is now installed in your Python virtual environment or on your system.

Python 145

Speeding Up Your Python Code with NumPy

KDnuggets

Why NumPy is significantly faster than standard Python code execution.

Python 100

How To Write Efficient Python Code: A Tutorial for Beginners

KDnuggets

Are you a programmer looking to get better at Python? Learn some of Python’s features that’ll help you write more elegant and Pythonic code.

Python 124

Enhance Your Python Coding Style with Ruff

KDnuggets

Ruff's 700+ built-in lint rules, reimplemented in Rust for speed, provide comprehensive linting and formatting to enforce clean and consistent Python code.

Coding 111