Data Engineering: Fast Spatial Joins Across ~2 Billion Rows on a Single Old GPU
Towards Data Science
MAY 30, 2023
I have spent many years in Data Engineering on Big Data solutions, and one of the tasks that we had to do regularly was to perform spatial joins of human movement data through multiple polygons. ORC is often overlooked in favour of Parquet but offers features that let it outperform Parquet on certain systems.