tsdat.io.filehandlers

This module contains the File Handlers that come packaged with tsdat in addition to methods for registering new File Handler objects.

Package Contents

Classes

AbstractFileHandler

Abstract class to define methods required by all FileHandlers.

FileHandler

Class to provide methods to read and write files with a variety of extensions.

CsvHandler

FileHandler to read from and write to CSV files.

NetCdfHandler

FileHandler to read from and write to netCDF files.

Functions

register_filehandler(patterns: Union[str, List[str]]) → AbstractFileHandler

Python decorator to register an AbstractFileHandler in the FileHandler object.

class tsdat.io.filehandlers.AbstractFileHandler(parameters: Union[Dict, None] = None)[source]

Abstract class to define methods required by all FileHandlers. Classes derived from AbstractFileHandler should implement one or more of the following methods:

write(ds: xr.Dataset, filename: str, config: Config, **kwargs)

read(filename: str, **kwargs) -> xr.Dataset

Parameters

parameters (Dict, optional) – Parameters that were passed to the FileHandler when it was registered in the storage config file, defaults to {}.

write(self, ds: xarray.Dataset, filename: str, config: tsdat.config.Config = None, **kwargs) → None

Saves the given dataset to a file.

Parameters
  • ds (xr.Dataset) – The dataset to save.

  • filename (str) – The path where the file should be written.

  • config (Config, optional) – Optional Config object. Defaults to None.

read(self, filename: str, **kwargs) → xarray.Dataset

Reads in the given file and converts it into an Xarray dataset for use in the pipeline.

Parameters

filename (str) – The path to the file to read in.

Returns

An xr.Dataset object.

Return type

xr.Dataset

class tsdat.io.filehandlers.FileHandler[source]

Class to provide methods to read and write files with a variety of extensions.

FILEREADERS: Dict[str, AbstractFileHandler]

FILEWRITERS: Dict[str, AbstractFileHandler]

static _get_handler(filename: str, method: Literal[read, write]) → AbstractFileHandler

Given the filepath of the file to read or write and the FileHandler method to apply to the filepath, this method determines which previously-registered FileHandler should be used on the provided filepath.

Parameters
  • filename (str) – The path to the file to read or write to.

  • method (Literal["read", "write"]) – The method to apply to the file. Must be one of: “read”, “write”.

Returns

The FileHandler that should be applied.

Return type

AbstractFileHandler
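
The lookup described above can be illustrated with a minimal, dependency-free sketch. It is not tsdat's implementation: the registry below maps regex patterns to plain strings instead of AbstractFileHandler objects, and the name find_handler and the first-match-wins rule are assumptions made for illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical stand-in for FileHandler.FILEREADERS: regex pattern -> handler.
# Strings are used in place of AbstractFileHandler objects to keep the sketch
# free of third-party dependencies.
FILEREADERS: Dict[str, str] = {
    r".*\.csv$": "csv-handler",
    r".*\.(nc|cdf)$": "netcdf-handler",
}


def find_handler(filename: str, registry: Dict[str, str]) -> Optional[str]:
    """Return the first registered handler whose pattern matches the filename."""
    for pattern, handler in registry.items():
        if re.match(pattern, filename):
            return handler
    return None  # no registered pattern matched


print(find_handler("data/buoy.20210101.nc", FILEREADERS))  # netcdf-handler
print(find_handler("notes.txt", FILEREADERS))              # None
```

Keeping separate registries for reading and writing, as FILEREADERS and FILEWRITERS do, lets a file extension be readable without also being writable.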

static write(ds: xarray.Dataset, filename: str, config: tsdat.config.Config = None, **kwargs) → None

Calls the appropriate FileHandler to write the dataset to the provided filename.

Parameters
  • ds (xr.Dataset) – The dataset to save.

  • filename (str) – The path to the file where the dataset should be written.

  • config (Config, optional) – Optional Config object. Defaults to None.

static read(filename: str, **kwargs) → xarray.Dataset

Reads in the given file and converts it into an xarray dataset object using the registered FileHandler for the provided filepath.

Parameters

filename (str) – The path to the file to read in.

Returns

The raw file as an xr.Dataset object.

Return type

xr.Dataset

static register_file_handler(method: Literal[read, write], patterns: Union[str, List[str]], handler: AbstractFileHandler)

Method to register a FileHandler for reading from or writing to files matching one or more provided file patterns.

Parameters
  • method (Literal["read", "write"]) – The method the FileHandler should call if the pattern is matched. Must be one of: “read”, “write”.

  • patterns (Union[str, List[str]]) – The file pattern(s) that determine if this FileHandler should be run on a given filepath.

  • handler (AbstractFileHandler) – The AbstractFileHandler to register.

tsdat.io.filehandlers.register_filehandler(patterns: Union[str, List[str]]) → AbstractFileHandler[source]

Python decorator to register an AbstractFileHandler in the FileHandler object. The FileHandler object will be used by tsdat pipelines to read and write raw, intermediate, and processed data.

This decorator can be used to work with a specific AbstractFileHandler without having to specify a config file. This is useful when using an AbstractFileHandler for analysis or for tests outside of a pipeline. For tsdat pipelines, handlers should always be specified via the storage config file.

Example Usage:

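import xarray as xr
from tsdat.config import Config
from tsdat.io import register_filehandler, AbstractFileHandler

# Patterns are regular expressions matched against the file path.
@register_filehandler([r".*\.nc", r".*\.cdf"])
class NetCdfHandler(AbstractFileHandler):
    def write(self, ds: xr.Dataset, filename: str, config: Config = None, **kwargs) -> None:
        # Save the dataset to a netCDF file.
        ds.to_netcdf(filename)

    def read(self, filename: str, **kwargs) -> xr.Dataset:
        # Load the file into memory and return it as a dataset.
        return xr.load_dataset(filename)

Parameters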

patterns (Union[str, List[str]]) – The patterns (regex) that should be used to match a filepath to the AbstractFileHandler provided.

Returns

The original AbstractFileHandler class, after it has been registered for use in tsdat pipelines.

Return type

AbstractFileHandler
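
For readers unfamiliar with registering decorators, the following sketch shows the general mechanism using only the standard library. It is an illustration under stated assumptions, not tsdat's code: REGISTRY, register, and TextHandler are hypothetical names, and storing an instance (rather than the class itself) is a choice made for this sketch.

```python
from typing import Callable, Dict, List, Type, Union

# Hypothetical stand-in for the tables kept by FileHandler.
REGISTRY: Dict[str, object] = {}


def register(patterns: Union[str, List[str]]) -> Callable[[Type], Type]:
    """Class decorator factory: store one handler instance under each pattern."""
    # Accept a single pattern or a list of patterns.
    pattern_list = [patterns] if isinstance(patterns, str) else list(patterns)

    def decorator(cls: Type) -> Type:
        instance = cls()  # assumption: the handler needs no constructor arguments
        for pattern in pattern_list:
            REGISTRY[pattern] = instance
        return cls  # return the class unchanged so it can still be used directly

    return decorator


@register([r".*\.txt", r".*\.log"])
class TextHandler:
    def read(self, filename: str) -> str:
        with open(filename) as f:
            return f.read()


print(sorted(REGISTRY))
```

Because the decorator returns the original class, TextHandler remains importable and usable on its own, which mirrors the return value described above.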

class tsdat.io.filehandlers.CsvHandler(parameters: Union[Dict, None] = None)[source]

Bases: tsdat.io.filehandlers.file_handlers.AbstractFileHandler

FileHandler to read from and write to CSV files. Takes a number of parameters that are passed in from the storage config file. Parameters specified in the config file should follow this example:

parameters:
  write:
    to_dataframe:
      # Parameters here will be passed to xr.Dataset.to_dataframe()
    to_csv:
      # Parameters here will be passed to pd.DataFrame.to_csv()
  read:
    read_csv:
      # Parameters here will be passed to pd.read_csv()
    to_xarray:
      # Parameters here will be passed to pd.DataFrame.to_xarray()

Parameters

parameters (Dict, optional) – Parameters that were passed to the FileHandler when it was registered in the storage config file, defaults to {}.
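
The nested layout of the parameters block suggests a two-level lookup: first by method (read or write), then by the name of the underlying call. The sketch below shows one way such a lookup could work. The helper kwargs_for and the sample values are hypothetical and are not part of tsdat; a YAML key whose body contains only comments parses to None, so the helper treats None as an empty mapping.

```python
from typing import Any, Dict, Optional


def kwargs_for(parameters: Optional[Dict[str, Any]], method: str, call: str) -> Dict[str, Any]:
    """Return parameters[method][call], treating missing or None levels as {}."""
    section = (parameters or {}).get(method) or {}
    return dict(section.get(call) or {})


# Illustrative equivalent of a parsed 'parameters' block from a storage config.
parameters = {
    "read": {
        "read_csv": {"sep": ";", "index_col": "time"},
        "to_xarray": None,  # key present in the YAML but left empty
    },
    "write": {"to_csv": {"index": False}},
}

print(kwargs_for(parameters, "read", "read_csv"))   # {'sep': ';', 'index_col': 'time'}
print(kwargs_for(parameters, "read", "to_xarray"))  # {}
print(kwargs_for(None, "write", "to_csv"))          # {}
```

A handler could then unpack the result into the matching call, for example passing the read_csv entry as keyword arguments to pd.read_csv().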

write(self, ds: xarray.Dataset, filename: str, config: tsdat.config.Config = None, **kwargs) → None

Saves the given dataset to a CSV file.

Parameters
  • ds (xr.Dataset) – The dataset to save.

  • filename (str) – The path where the file should be written.

  • config (Config, optional) – Optional Config object. Defaults to None.

read(self, filename: str, **kwargs) → xarray.Dataset

Reads in the given file and converts it into an Xarray dataset for use in the pipeline.

Parameters

filename (str) – The path to the file to read in.

Returns

An xr.Dataset object.

Return type

xr.Dataset

class tsdat.io.filehandlers.NetCdfHandler(parameters: Union[Dict, None] = None)[source]

Bases: tsdat.io.filehandlers.file_handlers.AbstractFileHandler

FileHandler to read from and write to netCDF files. Takes a number of parameters that are passed in from the storage config file. Parameters specified in the config file should follow this example:

parameters:
  write:
    to_netcdf:
      # Parameters here will be passed to xr.Dataset.to_netcdf()
  read:
    load_dataset:
      # Parameters here will be passed to xr.load_dataset()

Parameters

parameters (Dict, optional) – Parameters that were passed to the FileHandler when it was registered in the storage config file, defaults to {}.

write(self, ds: xarray.Dataset, filename: str, config: tsdat.config.Config = None, **kwargs) → None

Saves the given dataset to a netCDF file.

Parameters
  • ds (xr.Dataset) – The dataset to save.

  • filename (str) – The path where the file should be written.

  • config (Config, optional) – Optional Config object. Defaults to None.

read(self, filename: str, **kwargs) → xarray.Dataset

Reads in the given file and converts it into an Xarray dataset for use in the pipeline.

Parameters

filename (str) – The path to the file to read in.

Returns

An xr.Dataset object.

Return type

xr.Dataset