tsdat.config

Module that wraps objects defined in pipeline and yaml configuration files.

Classes

Config

Wrapper for the pipeline configuration file.

DatasetDefinition

Wrapper for the dataset_definition portion of the pipeline config file.

DimensionDefinition

Class to represent dimensions defined in the pipeline config file.

Keys

Class that provides a handle for keys in the pipeline config file.

PipelineDefinition

Wrapper for the pipeline portion of the pipeline config file.

QualityManagerDefinition

Wrapper for the quality_management portion of the pipeline config file.

VariableDefinition

Class to encode variable definitions from the config file. Also provides a few utility methods.

class tsdat.config.Config(dictionary: Dict)

Wrapper for the pipeline configuration file.

Note: in most cases, Config.load(filepath) should be used to instantiate the Config class.

Parameters

dictionary (Dict) – The pipeline configuration file as a dictionary.

Class Methods

lint_yaml

Lints a yaml file and raises an exception if an error is found.

load

Load one or more yaml pipeline configuration files. Multiple files should only be passed as input if the pipeline configuration file is split across multiple files.

Method Descriptions

static lint_yaml(filename: str)

Lints a yaml file and raises an exception if an error is found.

Parameters

filename (str) – The path to the file to lint.

Raises

Exception – Raises an exception if an error is found.

classmethod load(self, filepaths: List[str])

Load one or more yaml pipeline configuration files. Multiple files should only be passed as input if the pipeline configuration file is split across multiple files.

Parameters

filepaths (List[str]) – The path(s) to yaml configuration files to load.

Returns

A Config object wrapping the yaml configuration file(s).

Return type

Config

class tsdat.config.DatasetDefinition(dictionary: Dict, datastream_name: str)

Wrapper for the dataset_definition portion of the pipeline config file.

Parameters
  • dictionary (Dict) – The portion of the config file corresponding with the dataset definition.

  • datastream_name (str) – The name of the datastream that the config file is for.

Class Methods

get_attr

Retrieves the value of the attribute requested, or None if it does not exist.

get_coordinates

Returns the coordinate VariableDefinition object(s) that dimension the requested VariableDefinition.

get_static_variables

Retrieves a list of static VariableDefinition objects. A variable is defined as static if it has a “data” section in the config file.

get_variable

Attempts to retrieve the requested variable. First searches the data variables, then searches the coordinate variables.

get_variable_names

Retrieves the list of variable names. Note that this excludes coordinate variables.

Method Descriptions

get_attr(self, attribute_name) → Any

Retrieves the value of the attribute requested, or None if it does not exist.

Parameters

attribute_name (str) – The name of the attribute to retrieve.

Returns

The value of the attribute, or None.

Return type

Any

get_coordinates(self, variable: tsdat.config.variable_definition.VariableDefinition) → List[tsdat.config.variable_definition.VariableDefinition]

Returns the coordinate VariableDefinition object(s) that dimension the requested VariableDefinition.

Parameters

variable (VariableDefinition) – The VariableDefinition whose coordinate variables should be retrieved.

Returns

A list of VariableDefinition coordinate variables that dimension the provided VariableDefinition.

Return type

List[VariableDefinition]

get_static_variables(self) → List[tsdat.config.variable_definition.VariableDefinition]

Retrieves a list of static VariableDefinition objects. A variable is defined as static if it has a “data” section in the config file, which would mean that the variable’s data is defined statically. For example, in the config file snippet below, “depth” is a static variable:

depth:
  data: [4, 8, 12]
  dims: [depth]
  type: int
  attrs:
    long_name: Depth
    units: m

Returns

The list of static VariableDefinition objects.

Return type

List[VariableDefinition]
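
The static-variable rule can be illustrated without tsdat itself. The sketch below is a stand-in that works on plain dictionaries shaped like the config snippet above. The helper name `static_variable_names` and the `temperature` entry are hypothetical and are not part of tsdat.

```python
from typing import Any, Dict, List

# Hypothetical "variables" section, written as the dictionary a YAML loader
# would produce. Only "depth" carries a "data" section.
variables: Dict[str, Dict[str, Any]] = {
    "depth": {
        "data": [4, 8, 12],
        "dims": ["depth"],
        "type": "int",
        "attrs": {"long_name": "Depth", "units": "m"},
    },
    "temperature": {
        "dims": ["time", "depth"],
        "type": "float",
        "attrs": {"units": "degC"},
    },
}


def static_variable_names(definitions: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the names of variables whose definition has a "data" section."""
    return [name for name, definition in definitions.items() if "data" in definition]


print(static_variable_names(variables))  # ['depth']
```

The real method returns VariableDefinition objects rather than names; names are used here only to keep the sketch self-contained.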

get_variable(self, variable_name: str) → tsdat.config.variable_definition.VariableDefinition

Attempts to retrieve the requested variable. First searches the data variables, then searches the coordinate variables. Returns None if no data or coordinate variables have been defined with the requested variable name.

Parameters

variable_name (str) – The name of the variable to retrieve.

Returns

Returns the VariableDefinition for the variable, or None if the variable could not be found.

Return type

VariableDefinition
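
Because the lookup may return None, callers need to handle a missing variable. The following is a minimal sketch of the documented search order using plain dictionaries. The function `find_variable` and the two example mappings are hypothetical stand-ins, not tsdat code.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-ins for the data variables and coordinate variables
# held by a dataset definition.
data_variables: Dict[str, Dict[str, Any]] = {"temperature": {"dims": ["time"]}}
coordinate_variables: Dict[str, Dict[str, Any]] = {"time": {"dims": ["time"]}}


def find_variable(
    name: str,
    data_vars: Dict[str, Dict[str, Any]],
    coords: Dict[str, Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Search data variables first, then coordinate variables; None if absent."""
    if name in data_vars:
        return data_vars[name]
    return coords.get(name)


found = find_variable("time", data_variables, coordinate_variables)
missing = find_variable("pressure", data_variables, coordinate_variables)
print(found is not None, missing is None)  # True True
```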

get_variable_names(self) → List[str]

Retrieves the list of variable names. Note that this excludes coordinate variables.

Returns

The list of variable names.

Return type

List[str]

class tsdat.config.DimensionDefinition(name: str, length: Union[str, int])

Class to represent dimensions defined in the pipeline config file.

Parameters
  • name (str) – The name of the dimension.

  • length (Union[str, int]) – The length of the dimension. This should be one of: "unlimited", "variable", or a positive int. The ‘time’ dimension should always have length of "unlimited".

Class Methods

is_unlimited

Returns True if the dimension has unlimited length. Represented by setting the length attribute to "unlimited".

is_variable_length

Returns True if the dimension has variable length, meaning that the dimension’s length is set at runtime.

Method Descriptions

is_unlimited(self) → bool

Returns True if the dimension has unlimited length. Represented by setting the length attribute to "unlimited".

Returns

True if the dimension has unlimited length.

Return type

bool

is_variable_length(self) → bool

Returns True if the dimension has variable length, meaning that the dimension’s length is set at runtime. Represented by setting the length to "variable".

Returns

True if the dimension has variable length, False otherwise.

Return type

bool
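
The three permitted length values map onto the two predicates as shown in the sketch below. `DimensionSketch` is a hypothetical stand-in written for illustration. It mirrors the documented behaviour but is not the tsdat class and performs no validation.

```python
from typing import Union


class DimensionSketch:
    """Illustrative stand-in for a dimension with a name and a length."""

    def __init__(self, name: str, length: Union[str, int]) -> None:
        self.name = name
        self.length = length

    def is_unlimited(self) -> bool:
        # Unlimited dimensions are marked with the string "unlimited".
        return self.length == "unlimited"

    def is_variable_length(self) -> bool:
        # Variable-length dimensions get their length at runtime.
        return self.length == "variable"


time = DimensionSketch("time", "unlimited")
depth = DimensionSketch("depth", 3)
print(time.is_unlimited(), depth.is_unlimited(), depth.is_variable_length())
# True False False
```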

class tsdat.config.Keys

Class that provides a handle for keys in the pipeline config file.

ALL = 'ALL'
ATTRIBUTES = 'attributes'
DATASET_DEFINITION = 'dataset_definition'
DEFAULTS = 'variable_defaults'
DIMENSIONS = 'dimensions'
PIPELINE = 'pipeline'
QUALITY_MANAGEMENT = 'quality_management'
VARIABLES = 'variables'
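
Named keys like these let code index a loaded config dictionary without repeating string literals. The sketch below assumes the values are plain strings and uses a hypothetical stand-in class, `KeysSketch`, together with a heavily trimmed, made-up config dictionary. The nesting shown (attributes, dimensions, and variables under the dataset definition) is an assumption for illustration.

```python
class KeysSketch:
    """Stand-in mirroring the constants listed above; not imported from tsdat."""

    ALL = "ALL"
    ATTRIBUTES = "attributes"
    DATASET_DEFINITION = "dataset_definition"
    DEFAULTS = "variable_defaults"
    DIMENSIONS = "dimensions"
    PIPELINE = "pipeline"
    QUALITY_MANAGEMENT = "quality_management"
    VARIABLES = "variables"


# Hypothetical config dictionary, as a YAML loader might return it.
config = {
    "pipeline": {"type": "Ingest"},
    "dataset_definition": {
        "attributes": {"title": "Example dataset"},
        "dimensions": {"time": {"length": "unlimited"}},
        "variables": {"time": {"dims": ["time"], "type": "long"}},
    },
}

# Look up sections through the named keys instead of raw strings.
dataset_section = config[KeysSketch.DATASET_DEFINITION]
dimension_names = list(dataset_section[KeysSketch.DIMENSIONS])
print(dimension_names)  # ['time']
```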

class tsdat.config.PipelineDefinition(dictionary: Dict[str, Dict])

Wrapper for the pipeline portion of the pipeline config file.

Parameters

dictionary (Dict[str, Dict]) – The pipeline component of the pipeline config file.

Raises

DefinitionError – Raises DefinitionError if one of the file naming components contains an illegal character.

Class Methods

check_file_name_components

Performs sanity checks on the config properties used in naming files output by tsdat pipelines.

Method Descriptions

check_file_name_components(self)

Performs sanity checks on the config properties used in naming files output by tsdat pipelines.

Raises

DefinitionError – Raises DefinitionError if a component has been set improperly.

class tsdat.config.QualityManagerDefinition(name: str, dictionary: Dict)

Wrapper for the quality_management portion of the pipeline config file.

Parameters
  • name (str) – The name of the quality manager in the config file.

  • dictionary (Dict) – The dictionary contents of the quality manager from the config file.

class tsdat.config.VariableDefinition(name: str, dictionary: Dict, available_dimensions: Dict[str, tsdat.config.dimension_definition.DimensionDefinition], defaults: Union[Dict, None] = None)

Class to encode variable definitions from the config file. Also provides a few utility methods.

Parameters
  • name (str) – The name of the variable in the output file.

  • dictionary (Dict) – The dictionary entry corresponding with this variable in the config file.

  • available_dimensions (Dict[str, DimensionDefinition]) – A mapping of dimension name to DimensionDefinition objects.

  • defaults (Dict, optional) – The defaults to use when instantiating this VariableDefinition object, defaults to {}.

Class Methods

add_fillvalue_if_none

Adds the _FillValue attribute to the provided attributes dictionary if the _FillValue attribute has not already been defined and returns the modified attributes dictionary.

get_FillValue

Retrieves the variable’s _FillValue attribute, using -9999 as a default if it has not been defined.

get_coordinate_names

Returns the names of the coordinate VariableDefinition(s) that this VariableDefinition is dimensioned by.

get_data_type

Retrieves the variable’s data type.

get_input_name

Returns the name of the variable in the input if defined, otherwise returns None.

get_input_units

If the variable has input, returns the units of the input variable or the output units if no input units are defined.

get_output_units

Returns the units of the output data or None if no units attribute has been defined.

get_shape

Returns the shape of the data attribute on the VariableDefinition.

has_converter

Returns True if the variable has an input converter defined, False otherwise.

has_input

Returns True if the variable is copied from an input dataset, regardless of whether or not unit and/or naming conversions should be applied.

is_constant

Returns True if the variable is a constant. A variable is constant if it does not have any dimensions.

is_coordinate

Returns True if the variable is a coordinate variable. A variable is defined as a coordinate variable if it is dimensioned by itself.

is_derived

Returns True if the variable is derived. A variable is derived if it does not have an input and it is not predefined.

is_predefined

Returns True if the variable’s data was predefined in the config yaml file.

is_required

Returns True if the variable has the ‘required’ property defined and the ‘required’ property evaluates to True.

run_converter

If the variable has an input converter, runs the input converter for the input/output units on the provided data.

to_dict

Returns the Variable as a dictionary to be used to initialize an empty xarray Dataset or DataArray.

Method Descriptions

add_fillvalue_if_none(self, attributes: Dict[str, Any]) → Dict[str, Any]

Adds the _FillValue attribute to the provided attributes dictionary if the _FillValue attribute has not already been defined and returns the modified attributes dictionary.

Parameters

attributes (Dict[str, Any]) – The dictionary containing user-defined variable attributes.

Returns

The dictionary containing user-defined variable attributes. Is guaranteed to have a _FillValue attribute.

Return type

Dict[str, Any]

get_FillValue(self) → int

Retrieves the variable’s _FillValue attribute, using -9999 as a default if it has not been defined.

Returns

Returns the variable’s _FillValue.

Return type

int
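
The fill-value handling described by add_fillvalue_if_none and get_FillValue can be sketched with plain dictionaries. The functions below are hypothetical stand-ins rather than the tsdat methods, and they assume that an explicit None counts as "not defined".

```python
from typing import Any, Dict

DEFAULT_FILL_VALUE = -9999  # the documented default


def add_fillvalue_if_none(attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Ensure the attributes dictionary has a _FillValue entry and return it."""
    # Assumption: a missing key and an explicit None are treated the same way.
    if attributes.get("_FillValue") is None:
        attributes["_FillValue"] = DEFAULT_FILL_VALUE
    return attributes


def get_fill_value(attributes: Dict[str, Any]) -> int:
    """Return the _FillValue, falling back to the default when it is absent."""
    value = attributes.get("_FillValue")
    return DEFAULT_FILL_VALUE if value is None else value


print(add_fillvalue_if_none({"units": "m"}))  # {'units': 'm', '_FillValue': -9999}
print(get_fill_value({"_FillValue": -1}))     # -1
```

Note that the sketch modifies the dictionary in place and also returns it, matching the "returns the modified attributes dictionary" wording above.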

get_coordinate_names(self) → List[str]

Returns the names of the coordinate VariableDefinition(s) that this VariableDefinition is dimensioned by.

Returns

A list of dimension/coordinate variable names.

Return type

List[str]

get_data_type(self) → numpy.dtype

Retrieves the variable’s data type.

Returns

Returns the data type of the variable’s data as a numpy dtype.

Return type

np.dtype

get_input_name(self) → str

Returns the name of the variable in the input if defined, otherwise returns None.

Returns

The name of the variable in the input, or None.

Return type

str

get_input_units(self) → str

If the variable has input, returns the units of the input variable or the output units if no input units are defined.

Returns

The units of the input variable data.

Return type

str
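
The fallback from input units to output units can be sketched as follows. This is an illustration only: the dictionary layout (an `input` section with an optional `units` entry, and output units stored under `attrs`) and the choice to return None when there is no input are assumptions, and both function names are hypothetical.

```python
from typing import Any, Dict, Optional


def output_units(definition: Dict[str, Any]) -> Optional[str]:
    """Units of the output data, or None if no units attribute is defined."""
    return definition.get("attrs", {}).get("units")


def input_units(definition: Dict[str, Any]) -> Optional[str]:
    """Units of the input data, falling back to the output units."""
    source = definition.get("input")
    if source is None:
        # Assumption: no input section means there are no input units.
        return None
    return source.get("units", output_units(definition))


converted = {"input": {"name": "T", "units": "degF"}, "attrs": {"units": "degC"}}
unchanged = {"input": {"name": "T"}, "attrs": {"units": "degC"}}
print(input_units(converted), input_units(unchanged))  # degF degC
```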

get_output_units(self) → str

Returns the units of the output data or None if no units attribute has been defined.

Returns

The units of the output variable data.

Return type

str

get_shape(self) → Tuple[int]

Returns the shape of the data attribute on the VariableDefinition.

Raises

KeyError – Raises a KeyError if the data attribute has not been set yet.

Returns

The shape of the VariableDefinition’s data, or None.

Return type

Tuple[int]

has_converter(self) → bool

Returns True if the variable has an input converter defined, False otherwise.

Returns

True if the Variable has a converter defined, False otherwise.

Return type

bool

has_input(self) → bool

Returns True if the variable is copied from an input dataset, regardless of whether or not unit and/or naming conversions should be applied.

Returns

True if the Variable has an input defined, False otherwise.

Return type

bool

is_constant(self) → bool

Returns True if the variable is a constant. A variable is constant if it does not have any dimensions.

Returns

True if the variable is constant, False otherwise.

Return type

bool

is_coordinate(self) → bool

Returns True if the variable is a coordinate variable. A variable is defined as a coordinate variable if it is dimensioned by itself.

Returns

True if the variable is a coordinate variable, False otherwise.

Return type

bool
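
The definitions of constant and coordinate variables both depend only on a variable's name and its dimension list. The sketch below uses hypothetical free functions rather than the tsdat methods, and it assumes that "dimensioned by itself" means the dimension list consists of exactly the variable's own name.

```python
from typing import Sequence


def is_constant(dims: Sequence[str]) -> bool:
    """A variable with no dimensions is a constant."""
    return len(dims) == 0


def is_coordinate(name: str, dims: Sequence[str]) -> bool:
    """Assumption: a coordinate variable's only dimension is its own name."""
    return list(dims) == [name]


print(is_coordinate("time", ["time"]))         # True
print(is_coordinate("temperature", ["time"]))  # False
print(is_constant([]))                         # True
```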

is_derived(self) → bool

Returns True if the variable is derived. A variable is derived if it does not have an input and it is not predefined.

Returns

True if the Variable is derived, False otherwise.

Return type

bool

is_predefined(self) → bool

Returns True if the variable’s data was predefined in the config yaml file.

Returns

True if the variable is predefined, False otherwise.

Return type

bool
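
Taken together, has_input, is_predefined, and is_derived describe where a variable's data comes from. The sketch below classifies plain dictionary definitions accordingly. The function `data_source` is hypothetical, and treating an `input` key as "has input" and a `data` key as "predefined" is an assumption about the config layout.

```python
from typing import Any, Dict


def data_source(definition: Dict[str, Any]) -> str:
    """Classify a variable definition by where its data comes from."""
    if "input" in definition:
        return "input"       # copied from an input dataset
    if "data" in definition:
        return "predefined"  # data written directly in the config file
    return "derived"         # neither input nor predefined


print(data_source({"input": {"name": "T"}, "dims": ["time"]}))  # input
print(data_source({"data": [4, 8, 12], "dims": ["depth"]}))     # predefined
print(data_source({"dims": ["time"]}))                          # derived
```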

is_required(self) → bool

Returns True if the variable has the ‘required’ property defined and the ‘required’ property evaluates to True. A required variable is a variable which must be retrieved from the input dataset. If a required variable is not in the input dataset, the process should crash.

Returns

True if the variable is required, False otherwise.

Return type

bool
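
A pipeline could enforce the "should crash" behaviour with a check along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, the required flags are passed in as a simple name-to-bool mapping, and KeyError is an arbitrary choice of exception, not necessarily what tsdat raises.

```python
from typing import Dict, Iterable, List


def missing_required(required_flags: Dict[str, bool], input_names: Iterable[str]) -> List[str]:
    """List required variables that are absent from the input dataset."""
    available = set(input_names)
    return [
        name
        for name, required in required_flags.items()
        if required and name not in available
    ]


def check_required(required_flags: Dict[str, bool], input_names: Iterable[str]) -> None:
    """Raise if any required variable is missing from the input."""
    missing = missing_required(required_flags, input_names)
    if missing:
        raise KeyError(f"Required variables missing from input: {missing}")


flags = {"temperature": True, "humidity": False}
print(missing_required(flags, ["pressure"]))  # ['temperature']
```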

run_converter(self, data: numpy.ndarray) → numpy.ndarray

If the variable has an input converter, runs the input converter for the input/output units on the provided data.

Parameters

data (np.ndarray) – The data to be converted.

Returns

Returns the data after it has been run through the variable’s converter.

Return type

np.ndarray

to_dict(self) → Dict

Returns the Variable as a dictionary to be used to initialize an empty xarray Dataset or DataArray.

Returns a dictionary like (Example is for temperature):

{
    "dims": ["time"],
    "data": [],
    "attrs": {"units": "degC"}
}

Returns

A dictionary representation of the variable.

Return type

Dict
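
A dictionary of the shape shown above can be assembled from a variable's dimensions and attributes. The helper below is a hypothetical stand-in, not the tsdat implementation. It only demonstrates the structure, with an empty data list as a placeholder.

```python
from typing import Any, Dict, Sequence


def variable_to_dict(dims: Sequence[str], attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Build a dims/data/attrs dictionary with an empty data placeholder."""
    return {"dims": list(dims), "data": [], "attrs": dict(attrs)}


temperature = variable_to_dict(["time"], {"units": "degC"})
print(temperature)
# {'dims': ['time'], 'data': [], 'attrs': {'units': 'degC'}}
```

xarray's `DataArray.from_dict` accepts dictionaries with these `dims`, `data`, and `attrs` keys, which is why this layout is convenient for initializing an empty array.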