La información en esta página no se encuentra completamente disponible en tu idioma de preferencia. Muy pronto esperamos tenerla completamente disponible en otros idiomas. Para obtener información en tu idioma de preferencia, por favor descarga el PDF aquí.
Data scientists and engineers often use libraries such as pandas to perform ad hoc data analysis. Although expressive and powerful, these libraries have one critical limitation: they only work on a single CPU and don't provide horizontal scalability for large data sets. Dask, however, includes a lightweight, high-performance scheduler that can scale from a laptop to a cluster of machines. And because it works with NumPy, pandas and Scikit-learn, Dask looks promising for further assessment.