Converters

Converters are classes that convert raw input data into a standardized format, typically by changing its units. Each converter should extend the Converter base class, which defines one method, run, that converts a numpy ndarray of variable data from the input units to the output units.

Currently tsdat provides two converters for working with time data. StringTimeConverter converts time values in a variety of string formats, and TimestampTimeConverter converts time values in long integer format.

In addition, tsdat provides a DefaultConverter, which converts data from one UDUNITS-2 supported unit to another. This converter is used when a variable’s output units (e.g., “m”) differ from its input units (e.g., “mm”) in the pipeline configuration file.

Converters are specified in the pipeline_config_<ingest_name>.yml file within variable definitions:

variables:
  time:
    input:
      name: time
      converter:
        classname: "tsdat.utils.converters.TimestampTimeConverter" # Converter name
        parameters:
          timezone: "US/Pacific"
          unit: "s"
    dims: [time]
    type: float
    attrs:
      long_name: Time (UTC) # automatically converts this without tz based on local computer
      standard_name: time
      units: "seconds since 1970-01-01T00:00:00"

  displacement:
    input:
      name: displacement
      units: "mm" # Units the input variable was measured in (DefaultConverter)
    dims:
      [dir, time]
    type: float
    attrs:
      units: "m" # Units the variable should be output in (DefaultConverter)
      comment: "Translational motion as measured by the buoy"

DefaultConverter

Default class for converting units on data arrays.

StringTimeConverter

Convert a time string to a np.datetime64, which is needed for xarray.

TimestampTimeConverter

Convert a numeric UTC timestamp to a np.datetime64, which is needed for xarray.

class tsdat.utils.converters.Converter(parameters: Optional[Dict] = None)

Bases: abc.ABC

Base class for converting data arrays from one set of units to another. Users can extend this class if their input data needs a special units conversion that the default converter classes cannot handle.

Parameters

parameters (dict, optional) – A dictionary of converter-specific parameters which get passed from the pipeline config file. Defaults to {}.

abstract run(data: numpy.ndarray, in_units: str, out_units: str) → numpy.ndarray

Convert the input data from in_units to out_units.

Parameters
  • data (np.ndarray) – Data array to be modified.

  • in_units (str) – Current units of the data array.

  • out_units (str) – Units to be converted to.

Returns

Data array converted into the new units.

Return type

np.ndarray
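
As an illustration of extending the base class, the sketch below defines a hypothetical CountsToVoltsConverter. To stay self-contained it declares a stand-in Converter with the signature documented above rather than importing tsdat; the parameters attribute name, the subclass, and the volts_per_count parameter are assumptions made for this example only.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional

import numpy as np


class Converter(ABC):
    """Stand-in mirroring the documented tsdat Converter signature."""

    def __init__(self, parameters: Optional[Dict] = None):
        # Assumption: parameters are stored on the instance for use in run().
        self.parameters = parameters if parameters is not None else {}

    @abstractmethod
    def run(self, data: np.ndarray, in_units: str, out_units: str) -> np.ndarray:
        """Convert data from in_units to out_units."""


class CountsToVoltsConverter(Converter):
    """Hypothetical converter that scales raw sensor counts to volts."""

    def run(self, data: np.ndarray, in_units: str, out_units: str) -> np.ndarray:
        # 'volts_per_count' is an illustrative parameter name, not part of tsdat.
        volts_per_count = self.parameters.get("volts_per_count", 1.0)
        return np.asarray(data, dtype=float) * volts_per_count


converter = CountsToVoltsConverter(parameters={"volts_per_count": 0.005})
volts = converter.run(np.array([0, 100, 200]), in_units="counts", out_units="V")
```

In a real pipeline, a class like this would presumably be referenced through the converter classname entry of the variable definition, with volts_per_count supplied under parameters.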

class tsdat.utils.converters.DefaultConverter(parameters: Optional[Dict] = None)

Bases: tsdat.utils.converters.Converter

Default class for converting units on data arrays. This class uses ACT.utils.data_utils.convert_units and should work for most variables except time (see StringTimeConverter and TimestampTimeConverter).

run(data: numpy.ndarray, in_units: str, out_units: str) → numpy.ndarray

Convert the input data from in_units to out_units.

Parameters
  • data (np.ndarray) – Data array to be modified.

  • in_units (str) – Current units of the data array.

  • out_units (str) – Units to be converted to.

Returns

Data array converted into the new units.

Return type

np.ndarray
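
To make the effect of a units conversion concrete, the following sketch converts a displacement array from “mm” to “m” with a hand-written table of scale factors. It is not how DefaultConverter is implemented: the real class delegates to a UDUNITS-2 based routine, whereas the METRES_PER_UNIT table and the convert_length helper here are hypothetical and cover only a few length units.

```python
import numpy as np

# Illustrative scale factors to metres (assumption; not the UDUNITS-2 database).
METRES_PER_UNIT = {"mm": 1e-3, "cm": 1e-2, "m": 1.0, "km": 1e3}


def convert_length(data: np.ndarray, in_units: str, out_units: str) -> np.ndarray:
    """Rescale length data from in_units to out_units using the table above."""
    factor = METRES_PER_UNIT[in_units] / METRES_PER_UNIT[out_units]
    return np.asarray(data, dtype=float) * factor


# A small (dir, time) array of displacements measured in millimetres.
displacement_mm = np.array([[1500.0, 250.0], [0.0, -40.0]])
displacement_m = convert_length(displacement_mm, "mm", "m")
```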

class tsdat.utils.converters.StringTimeConverter(parameters: Optional[Dict] = None)

Bases: tsdat.utils.converters.Converter

Convert a time string to a np.datetime64, which is needed for xarray. This class utilizes pd.to_datetime to perform the conversion.

One of the parameters should be ‘time_format’, which is the strftime format string used to parse the time, e.g., “%d/%m/%Y”. Note that “%f” will parse all the way up to nanoseconds. See the strftime documentation for more information on the available directives.

Parameters

parameters (dict, optional) – dictionary of converter-specific parameters. Defaults to {}.

run(data: numpy.ndarray, in_units: str, out_units: str) → numpy.ndarray

Convert the input data from in_units to out_units.

Parameters
  • data (np.ndarray) – Data array to be modified.

  • in_units (str) – Current units of the data array.

  • out_units (str) – Units to be converted to.

Returns

Data array converted into the new units.

Return type

np.ndarray
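
The underlying pd.to_datetime call can be tried on its own. The sketch below parses day-first date strings with the “%d/%m/%Y” format mentioned above; the sample values are made up, and the snippet calls pandas directly rather than going through StringTimeConverter.

```python
import numpy as np
import pandas as pd

# Made-up sample data: day/month/year strings as they might appear in a raw file.
raw = np.array(["31/12/2020", "01/01/2021"])

# Parse with an explicit strftime-style format, then get a datetime64 array.
times = pd.to_datetime(raw, format="%d/%m/%Y").to_numpy()
```

Passing the format explicitly avoids ambiguity between day-first and month-first dates such as “01/02/2021”.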

class tsdat.utils.converters.TimestampTimeConverter(parameters: Optional[Dict] = None)

Bases: tsdat.utils.converters.Converter

Convert a numeric UTC timestamp to a np.datetime64, which is needed for xarray. This class utilizes pd.to_datetime to perform the conversion.

One of the parameters should be ‘unit’. This parameter denotes the time unit (e.g., D, s, ms, us, ns) of the timestamp values, which are integers or floats. Timestamps are interpreted relative to the start of the Unix epoch.

Parameters

parameters (dict, optional) – A dictionary of converter-specific parameters which get passed from the pipeline config file. Defaults to {}.

run(data: numpy.ndarray, in_units: str, out_units: str) → numpy.ndarray

Convert the input data from in_units to out_units.

Parameters
  • data (np.ndarray) – Data array to be modified.

  • in_units (str) – Current units of the data array.

  • out_units (str) – Units to be converted to.

Returns

Data array converted into the new units.

Return type

np.ndarray
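
The effect of the ‘unit’ parameter can be seen by calling pd.to_datetime directly. The sketch below assumes timestamps in seconds since the Unix epoch; the sample values are made up, and the snippet does not go through TimestampTimeConverter itself.

```python
import numpy as np
import pandas as pd

# Made-up sample data: seconds elapsed since 1970-01-01T00:00:00.
raw = np.array([0, 86400, 1609459200])

# unit="s" tells pandas how to interpret the numbers; the result is datetime64.
times = pd.to_datetime(raw, unit="s").to_numpy()
```

With unit set to “ms” instead, the same numbers would be read as milliseconds, so choosing the wrong unit shifts every timestamp by a factor of 1000.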