File HandlersΒΆ

File Handlers declare the classes that should be used to read raw input and write final output files.

For input files, you can specify a Python regular expression to match any specific file name pattern that should be read by that File Handler. A custom filehandler can contain any level of pre-analysis that the user desires; the only requirement is that it returns an xarray Dataset.

For output files, you can specify one or more formats. Tsdat will write processed data files using all the output formats specified.

Custom file handlers are stored in (typically) ingest/<ingest_name>/pipeline/filehandler.py. Once written, they must be specified in the storage_config.yml file like shown:

file_handlers:
  input:
    custom:   # Label to identify your file handler
      file_pattern: ".*.ext"   # Use a Python regex to identify files this handler should process
      classname: ingest.<ingest_name>.pipeline.filehandlers.CustomHandler   # Declare the fully qualified name of the handler class
      parameters:   # Parameters provided to filehandler function
        threshold: 50   # Parameter name and value (accessed in filehandler function via `self.parameters.get(<param_name>)`)

    # Tsdat built-in csv file handler and parameter keywords
    csv:
      file_pattern: ".*.csv"
      classname: tsdat.io.handers.CsvHandler
      parameters:
        read:
          read_csv: # pandas.read_csv arguments
            sep: ","
            header: 0
            index_col: False

    # Tsdat built-in netcdf file handler and parameter keywords
    netcdf:
      file_pattern: ".*.nc"
      classname: tsdat.io.handers.NetCdfHandler
      parameters:
        read:
          load_dataset: # xarray.load_dataset arguments
            engine: "netcdf4"


  # Tsdat built-in output filetypes
  output:
    netcdf:
      file_extension: ".nc"  # Declare the file extension to use when writing output files
      classname: tsdat.io.handlers.NetCdfHandler

    csv:
      file_extension: ".csv"
      classname: tsdat.io.handers.CsvHandler

Tsdat natively handles csv and netcdf file formats: