
Upgrade to Pandas ≥ 2 in Conda-Analytics
Open, Medium, Public

Description

Pandas 2.0 removes support for casting to the unit-less datetime64 dtype. PySpark attempts this cast when collecting a datetime field into a Pandas dataframe, which raises TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
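
The cast that fails can be reproduced without Spark. The following is a minimal sketch, assuming only that pandas is installed; the sample dates and variable names are illustrative and do not come from PySpark's code. On Pandas ≥ 2 the unit-less cast should raise the TypeError quoted above, while older versions are expected to accept it. The cast with an explicit unit works either way.

```python
from datetime import datetime

import pandas as pd

# Illustrative data; stands in for a collected timestamp column.
timestamps = pd.Series([datetime(2023, 1, 1), datetime(2023, 6, 15)])

# Unit-less cast: the pattern described in this task.
try:
    timestamps.astype("datetime64")
    unitless_cast_ok = True
except TypeError as exc:
    unitless_cast_ok = False
    print(f"pandas {pd.__version__} rejected the unit-less cast: {exc}")

# Cast with an explicit unit, as the error message suggests.
converted = timestamps.astype("datetime64[ns]")
print(converted.dtype)
```

This only demonstrates the Pandas behaviour; it is not a patch for PySpark, whose conversion happens inside its own collection code.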

PySpark fixes this in either 4.0 (according to the bug tracker) or 3.5 (according to this StackOverflow answer).