Dask is a flexible library for parallel computing in Python.
Dask is composed of two parts:
- Dynamic task scheduling optimized for computation. This is similar to Airflow,
Luigi, Celery, or Make, but optimized for interactive computational workloads.
- "Big Data" collections like parallel arrays, dataframes, and lists that extend
common interfaces like NumPy, Pandas, or Python iterators to
larger-than-memory or distributed environments. These parallel collections run
on top of dynamic task schedulers.
WWW: https://dask.org/
WWW: https://github.com/dask/dask