Describe the bug
Due to (I believe) a downstream bug in pyarrow, when ParquetToArrowDecodingHandler tries to resolve a path to a parquet file, it fails with a FileNotFoundError.
It seems that the s3:// prefix is not passed to the final step of reading the parquet file, which results in pyarrow trying to load the file from a local path which does not exist.
apache/arrow#31812
The issue contains a description of where in pyarrow the issue occurs.
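To illustrate the suspected failure mode, here is a minimal sketch using only the Python standard library. It is not pyarrow's or flytekit's actual code: the helper names `strip_scheme` and `open_like_local_reader` and the bucket path are hypothetical, and the sketch only assumes that the scheme is dropped somewhere before the final read.

```python
from urllib.parse import urlparse


def strip_scheme(uri: str) -> str:
    """Hypothetical helper: drop the 's3://' prefix, as the bug appears to do."""
    parsed = urlparse(uri)
    return f"{parsed.netloc}{parsed.path}"


def open_like_local_reader(path: str):
    """Hypothetical stand-in for a reader that only knows the local filesystem."""
    # Without the scheme, the string is just a relative local path.
    return open(path, "rb")


# Hypothetical S3 location, chosen so that it does not exist locally.
uri = "s3://my-bucket-that-does-not-exist/data/part-0.parquet"
stripped = strip_scheme(uri)

try:
    open_like_local_reader(stripped).close()
    outcome = "opened"
except FileNotFoundError as exc:
    # The path in the error has no 's3://' prefix, as in the report.
    outcome = "FileNotFoundError"
    failing_path = exc.filename

print(stripped)
print(outcome)
```

If this matches what happens inside pyarrow, it would explain why the path in the error message has no s3:// prefix.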
User workaround:
Use another type that interfaces well with parquet e.g. Polars and its flyte plugin.
Expected behavior
Parquet files can be decoded successfully with ParquetToArrowDecodingHandler even from s3.
Additional context to reproduce
Dependencies:
pyarrow.Table object
FileNotFoundError occurs.
Remark: In the error message, the s3:// prefix is missing from the erroneous path.
Screenshots
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?