parquet_writer

Classes:

Name Description
ParquetWriter

Classes

ParquetWriter

Bases: FileWriter


Writes the dataset to a parquet file.

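Converts an xr.Dataset object to a pandas DataFrame and saves the result to a parquet file using pd.DataFrame.to_parquet(). The optional dim_order parameter is passed to xr.Dataset.to_dataframe() and sets the order of the dimensions in the resulting index. Entries in the to_parquet_kwargs parameter are passed to pd.DataFrame.to_parquet() as keyword arguments.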
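The two steps described above can be sketched outside of tsdat as follows. This is a minimal illustration, not the library's implementation: `dataset_to_parquet` is a hypothetical helper, and the dataset contents are made up for the example.

```python
from pathlib import Path
from typing import Any, Dict, List, Optional

import numpy as np
import pandas as pd
import xarray as xr


def dataset_to_parquet(
    dataset: xr.Dataset,
    filepath: Path,
    dim_order: Optional[List[str]] = None,
    to_parquet_kwargs: Optional[Dict[str, Any]] = None,
) -> pd.DataFrame:
    """Hypothetical helper mirroring the two steps described above."""
    # Step 1: flatten the dataset; dim_order sets the order of the index levels.
    df = dataset.to_dataframe(dim_order=dim_order)
    # Step 2: forward any extra keyword arguments to pandas unchanged.
    df.to_parquet(filepath, **(to_parquet_kwargs or {}))
    return df


# Small made-up dataset with two dimensions.
ds = xr.Dataset(
    {"temperature": (("time", "height"), np.arange(6.0).reshape(3, 2))},
    coords={
        "time": pd.date_range("2024-01-01", periods=3, freq="D"),
        "height": [10, 20],
    },
)

# Without dim_order, xarray picks the order of the index levels itself.
default_df = ds.to_dataframe()
# With dim_order, "height" becomes the outer index level.
reordered_df = ds.to_dataframe(dim_order=["height", "time"])
```

Calling `dataset_to_parquet(ds, Path("out.parquet"), to_parquet_kwargs={"compression": "gzip"})` would then write the file, provided a parquet engine such as pyarrow or fastparquet is installed.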


Classes:

Name Description
Parameters

Methods:

Name Description
write

Attributes:

Name Type Description
file_extension str
parameters Parameters

Attributes

file_extension class-attribute instance-attribute
file_extension: str = 'parquet'
parameters class-attribute instance-attribute
parameters: Parameters = Field(default_factory=Parameters)

Classes

Parameters

Bases: BaseModel

Attributes:

Name Type Description
dim_order Optional[List[str]]
to_parquet_kwargs Dict[str, Any]

Attributes

dim_order class-attribute instance-attribute
dim_order: Optional[List[str]] = None
to_parquet_kwargs class-attribute instance-attribute
to_parquet_kwargs: Dict[str, Any] = {}

Functions

write
write(
    dataset: xr.Dataset,
    filepath: Optional[Path] = None,
    **kwargs: Any
) -> None
Source code in tsdat/io/writers/parquet_writer.py
def write(
    self,
    dataset: xr.Dataset,
    filepath: Optional[Path] = None,
    **kwargs: Any,
) -> None:
    # QUESTION: Can we reliably write the dataset metadata to a separate file such
    # that it can always be retrieved? If not, should we declare this as a format
    # incapable of "round-tripping" (i.e., ds != read(write(ds)) for parquet format)?
    df = dataset.to_dataframe(self.parameters.dim_order)  # type: ignore
    df.to_parquet(filepath, **self.parameters.to_parquet_kwargs)  # type: ignore
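The round-tripping question raised in the source comment can be illustrated without writing a file at all, since the metadata is already dropped at the DataFrame conversion step. The sketch below uses a made-up dataset and only xarray and pandas calls; it is an illustration of the concern, not a statement about how tsdat handles it.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Made-up dataset carrying both global and variable-level metadata.
ds = xr.Dataset(
    {"temperature": ("time", np.array([1.0, 2.0, 3.0]), {"units": "degC"})},
    coords={"time": pd.date_range("2024-01-01", periods=3, freq="D")},
    attrs={"title": "example dataset"},
)

# Same conversion the writer performs before calling to_parquet().
df = ds.to_dataframe()

# Convert straight back to xarray to see what survives the tabular form.
restored = df.to_xarray()

# The data values come back intact...
values_match = bool(
    np.array_equal(restored["temperature"].values, ds["temperature"].values)
)
# ...but the attributes were not carried through the DataFrame.
global_attrs_kept = restored.attrs == ds.attrs
variable_attrs_kept = restored["temperature"].attrs == ds["temperature"].attrs

print(values_match, global_attrs_kept, variable_attrs_kept)
```

If the metadata needs to be recoverable, it would have to be stored separately from the tabular data, which is the open question the comment poses.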