Wednesday, January 17th, 2024
The CuPy-Xarray project makes mixing GPU acceleration with Xarray workflows very convenient! Explore the new documentation and tutorials to see how CuPy-Xarray enables GPU acceleration on large multidimensional datasets. 🎉 🥳 🚀
CuPy is a GPU-accelerated library for numerical computations. CuPy provides a NumPy-like array object -- a duck array -- that follows various standard array protocols and executes computations on CUDA-capable devices. Xarray can wrap duck array objects (i.e. NumPy-like arrays) that follow specific protocols.
Thus Xarray can handle CuPy arrays, and cupy-xarray provides a number of useful methods under the xarray_object.cupy namespace, allowing seamless transition between CPU and GPU computations in your data pipeline.
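As a rough illustration of the CPU-to-GPU round trip described above, the sketch below moves a small DataArray to the GPU, reduces it, and brings the result back to host memory. It assumes cupy and cupy-xarray are installed and a CUDA device is present; when they are not, it falls back to plain NumPy so the snippet still runs. The to_gpu_if_possible helper and the sample data are made up for this example.

```python
import numpy as np
import xarray as xr

try:
    import cupy as cp
    import cupy_xarray  # noqa: F401  (importing registers the .cupy accessor)

    HAS_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # cupy not installed, or no usable CUDA driver/device
    HAS_GPU = False


def to_gpu_if_possible(obj):
    """Hypothetical helper: move data to the GPU when one is available."""
    return obj.cupy.as_cupy() if HAS_GPU else obj


# Small illustrative array: 3 time steps by 4 points
da = xr.DataArray(
    np.arange(12, dtype="float64").reshape(3, 4),
    dims=("time", "x"),
    coords={"time": [0, 1, 2]},
    name="demo",
)

gpu_da = to_gpu_if_possible(da)
if HAS_GPU:
    print(gpu_da.cupy.is_cupy)  # True when the data is backed by CuPy

# The Xarray code is the same wherever the data lives
time_mean = gpu_da.mean("time")

# as_numpy() converts the underlying duck array back to NumPy on the host
result = time_mean.as_numpy()
print(type(result.data).__name__, result.values)
```

Keeping the device choice in one small helper means the rest of the pipeline does not need to know whether it is running on the CPU or the GPU.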
GPU acceleration is becoming increasingly important in scientific research, data analysis, and AI/ML techniques due to its ability to perform massively parallel computations. GPUs can greatly accelerate the processing of array datasets, allowing for faster analysis and modeling of large datasets. By leveraging the power of GPUs with tools such as CuPy and CuPy-Xarray, Xarray users can gain significant performance improvements and unlock new opportunities for scientific discovery.
We have recently created detailed documentation with examples to help users get started with CuPy-Xarray. Check it out at this link.
The new documentation covers the following topics:
Applying groupby, resample, rolling, and apply_ufunc to Xarray objects.
apply_ufunc: custom CUDA kernels for apply_ufunc, and how to use apply_ufunc with groupby and resample.
If you have any questions, encounter issues, or want to contribute, the community forum is a great place to start.
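To give a flavor of the apply_ufunc topic covered in the documentation, here is a minimal sketch that applies a custom function along one dimension. It is not taken from the tutorials: the standardize function and the sample data are hypothetical, and the function relies only on array methods that NumPy and CuPy share, so the same code should run on either backend. As before, it assumes cupy and cupy-xarray plus a CUDA device for the GPU path and otherwise stays on the CPU.

```python
import numpy as np
import xarray as xr

try:
    import cupy as cp
    import cupy_xarray  # noqa: F401  (importing registers the .cupy accessor)

    HAS_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # cupy not installed, or no usable CUDA driver/device
    HAS_GPU = False


def standardize(block):
    """Hypothetical function: zero mean, unit variance along the last axis.

    Uses only array methods, so it accepts NumPy and CuPy arrays alike.
    """
    mean = block.mean(axis=-1, keepdims=True)
    std = block.std(axis=-1, keepdims=True)
    return (block - mean) / std


da = xr.DataArray(
    np.array([[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]),
    dims=("x", "time"),
)
if HAS_GPU:
    da = da.cupy.as_cupy()

# apply_ufunc moves the core dimension ("time") to the last axis
# and hands the underlying array (NumPy or CuPy) to standardize
out = xr.apply_ufunc(
    standardize,
    da,
    input_core_dims=[["time"]],
    output_core_dims=[["time"]],
)

host = out.as_numpy()  # bring the result back to host memory
print(host.values)
```

A hand-written CUDA kernel could take the place of standardize on the GPU path; the apply_ufunc call itself would stay the same.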
We also worked to improve upstream support for the primitives that Xarray needs. For example, this pull request enabled the use of Xarray's .rolling methods. An open pull request, when merged, will make it clearer when Xarray objects are wrapping CuPy arrays.
CuPy-Xarray is a Python library that helps you combine CuPy, a GPU array library, with Xarray, a library for multi-dimensional labeled array computations, to enable fast and friendly data processing on GPUs. With the new documentation and tutorials, users can quickly adapt to this integration and optimize their data science workflows. 🚀
A special thanks to the Xarray, CuPy, and Pangeo communities for making this integration possible. Collaborations like these are a testament to the power of open-source and community-driven development. 💪
Many thanks to the NVIDIA RAPIDS team (specifically Jacob Tomlinson and John Kirkham) for initiating the cupy-xarray project and guiding us along the way.
This work was partly funded by NSF Earthcube award "Jupyter Meets the Earth" (1928374); and NASA's Open Source Tools, Frameworks, and Libraries award "Enhancing analysis of NASA data with the open-source Python Xarray Library" (80NSSC22K0345).
From anaconda:
conda install cupy-xarray -c conda-forge
From PyPI:
python -m pip install cupy-xarray