tsdat.config
Module that wraps objects defined in pipeline and yaml configuration files.
Submodules¶
Classes¶
Config | Wrapper for the pipeline configuration file.
DatasetDefinition | Wrapper for the dataset_definition portion of the pipeline config file.
DimensionDefinition | Class to represent dimensions defined in the pipeline config file.
Keys | Class that provides a handle for keys in the pipeline config file.
PipelineDefinition | Wrapper for the pipeline portion of the pipeline config file.
QualityManagerDefinition | Wrapper for the quality_management portion of the pipeline config file.
VariableDefinition | Class to encode variable definitions from the config file. Also provides a few utility methods.
-
class
tsdat.config.
Config
(dictionary: Dict)[source]¶ Wrapper for the pipeline configuration file.
Note: in most cases, Config.load(filepath) should be used to instantiate the Config class.
- Parameters
dictionary (Dict) – The pipeline configuration file as a dictionary.
Class Methods
lint_yaml | Lints a yaml file and raises an exception if an error is found.
load | Load one or more yaml pipeline configuration files.
Method Descriptions
-
static
lint_yaml
(filename: str)¶ Lints a yaml file and raises an exception if an error is found.
- Parameters
filename (str) – The path to the file to lint.
- Raises
Exception – Raises an exception if an error is found.
-
classmethod
load
(self, filepaths: List[str])¶ Load one or more yaml pipeline configuration files. Multiple files should only be passed as input if the pipeline configuration file is split across multiple files.
- Parameters
filepaths (List[str]) – The path(s) to yaml configuration files to load.
- Returns
A Config object wrapping the yaml configuration file(s).
- Return type
Config
-
class
tsdat.config.
DatasetDefinition
(dictionary: Dict, datastream_name: str)[source]¶ Wrapper for the dataset_definition portion of the pipeline config file.
- Parameters
dictionary (Dict) – The portion of the config file corresponding with the dataset definition.
datastream_name (str) – The name of the datastream that the config file is for.
Class Methods
get_attr | Retrieves the value of the attribute requested, or None if it does not exist.
get_coordinates | Returns the coordinate VariableDefinition object(s) that dimension the requested VariableDefinition.
get_static_variables | Retrieves a list of static VariableDefinition objects.
get_variable | Attempts to retrieve the requested variable.
get_variable_names | Retrieves the list of variable names.
Method Descriptions
-
get_attr
(self, attribute_name) → Any¶ Retrieves the value of the attribute requested, or None if it does not exist.
- Parameters
attribute_name (str) – The name of the attribute to retrieve.
- Returns
The value of the attribute, or None.
- Return type
Any
-
get_coordinates
(self, variable: tsdat.config.variable_definition.VariableDefinition) → List[tsdat.config.variable_definition.VariableDefinition]¶ Returns the coordinate VariableDefinition object(s) that dimension the requested VariableDefinition.
- Parameters
variable (VariableDefinition) – The VariableDefinition whose coordinate variables should be retrieved.
- Returns
A list of VariableDefinition coordinate variables that dimension the provided VariableDefinition.
- Return type
List[VariableDefinition]
-
get_static_variables
(self) → List[tsdat.config.variable_definition.VariableDefinition]¶ Retrieves a list of static VariableDefinition objects. A variable is defined as static if it has a “data” section in the config file, which would mean that the variable’s data is defined statically. For example, in the config file snippet below, “depth” is a static variable:
    depth:
      data: [4, 8, 12]
      dims: [depth]
      type: int
      attrs:
        long_name: Depth
        units: m
- Returns
The list of static VariableDefinition objects.
- Return type
List[VariableDefinition]
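The "has a data section" rule can be sketched with plain dictionaries. This is a minimal illustration, not tsdat's implementation: `variables` and `static_variable_names` are hypothetical names, and the snippet assumes the variables section of the config file has already been parsed into a dict.

```python
# Hypothetical, already-parsed "variables" section of a pipeline config file.
variables = {
    "depth": {
        "data": [4, 8, 12],
        "dims": ["depth"],
        "type": "int",
        "attrs": {"long_name": "Depth", "units": "m"},
    },
    "temperature": {
        "dims": ["time", "depth"],
        "type": "float",
        "attrs": {"units": "degC"},
    },
}


def static_variable_names(variables):
    """Return the names of variables that define a "data" section."""
    return [name for name, definition in variables.items() if "data" in definition]


print(static_variable_names(variables))
```

Only "depth" carries a "data" entry here, so it is the only variable the sketch treats as static.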
-
get_variable
(self, variable_name: str) → tsdat.config.variable_definition.VariableDefinition¶ Attempts to retrieve the requested variable. First searches the data variables, then searches the coordinate variables. Returns None if no data or coordinate variables have been defined with the requested variable name.
- Parameters
variable_name (str) – The name of the variable to retrieve.
- Returns
The VariableDefinition for the variable, or None if the variable could not be found.
- Return type
VariableDefinition
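The lookup order described for get_variable (data variables first, then coordinate variables, otherwise None) can be sketched as follows. The function `find_variable` and the two dictionaries are hypothetical stand-ins for illustration; the real method operates on VariableDefinition objects held by the DatasetDefinition.

```python
from typing import Dict, Optional


def find_variable(
    name: str, data_vars: Dict[str, dict], coords: Dict[str, dict]
) -> Optional[dict]:
    """Search the data variables first, then the coordinate variables."""
    if name in data_vars:
        return data_vars[name]
    # dict.get returns None when the name is missing from both mappings.
    return coords.get(name)


# Hypothetical variable collections.
data_vars = {"temperature": {"dims": ["time"]}}
coords = {"time": {"dims": ["time"]}}

print(find_variable("temperature", data_vars, coords))
print(find_variable("time", data_vars, coords))
print(find_variable("pressure", data_vars, coords))
```

Callers should be prepared for the None case when a name is not defined in either collection.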
-
get_variable_names
(self) → List[str]¶ Retrieves the list of variable names. Note that this excludes coordinate variables.
- Returns
The list of variable names.
- Return type
List[str]
-
class
tsdat.config.
DimensionDefinition
(name: str, length: Union[str, int])[source]¶ Class to represent dimensions defined in the pipeline config file.
- Parameters
name (str) – The name of the dimension.
length (Union[str, int]) – The length of the dimension. This should be one of: "unlimited", "variable", or a positive int. The ‘time’ dimension should always have a length of "unlimited".
Class Methods
is_unlimited | Returns True if the dimension has unlimited length.
is_variable_length | Returns True if the dimension has variable length, meaning that the dimension’s length is set at runtime.
Method Descriptions
-
is_unlimited
(self) → bool¶ Returns True if the dimension has unlimited length. Represented by setting the length attribute to "unlimited".
- Returns
True if the dimension has unlimited length.
- Return type
bool
-
is_variable_length
(self) → bool¶ Returns True if the dimension has variable length, meaning that the dimension’s length is set at runtime. Represented by setting the length to "variable".
- Returns
True if the dimension has variable length, False otherwise.
- Return type
bool
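The two length checks can be illustrated with a small stand-in class. `DimensionSketch` is a hypothetical name used only for this sketch; it mirrors the documented constructor arguments and the "unlimited" / "variable" sentinel strings, and is not the tsdat source.

```python
from typing import Union


class DimensionSketch:
    """Hypothetical stand-in for DimensionDefinition."""

    def __init__(self, name: str, length: Union[str, int]):
        self.name = name
        self.length = length

    def is_unlimited(self) -> bool:
        # Unlimited dimensions are marked with the string "unlimited".
        return self.length == "unlimited"

    def is_variable_length(self) -> bool:
        # Variable-length dimensions are marked with the string "variable".
        return self.length == "variable"


time = DimensionSketch("time", "unlimited")
depth = DimensionSketch("depth", 3)

print(time.is_unlimited(), time.is_variable_length())
print(depth.is_unlimited(), depth.is_variable_length())
```

A dimension with a positive int length, such as "depth" here, is neither unlimited nor variable.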
-
class
tsdat.config.
Keys
[source]¶ Class that provides a handle for keys in the pipeline config file.
-
ALL
= ALL¶
-
ATTRIBUTES
= attributes¶
-
DATASET_DEFINITION
= dataset_definition¶
-
DEFAULTS
= variable_defaults¶
-
DIMENSIONS
= dimensions¶
-
PIPELINE
= pipeline¶
-
QUALITY_MANAGEMENT
= quality_management¶
-
VARIABLES
= variables¶
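As a rough illustration of how such key constants can be used, the sketch below indexes a parsed config dictionary by constant rather than by string literal. Everything here is an assumption made for the example: the stand-in `KeysSketch` class treats the listed values as plain strings, and the nesting and contents of `config` are hypothetical.

```python
class KeysSketch:
    """Hypothetical stand-in mirroring some of the constants listed above."""

    PIPELINE = "pipeline"
    DATASET_DEFINITION = "dataset_definition"
    ATTRIBUTES = "attributes"
    DIMENSIONS = "dimensions"
    VARIABLES = "variables"
    QUALITY_MANAGEMENT = "quality_management"


# Hypothetical parsed config; the layout is assumed for illustration only.
config = {
    "pipeline": {},
    "dataset_definition": {
        "attributes": {"title": "Example dataset"},
        "dimensions": {"time": {"length": "unlimited"}},
        "variables": {"time": {"dims": ["time"]}},
    },
    "quality_management": {},
}

dataset_definition = config[KeysSketch.DATASET_DEFINITION]
dimensions = dataset_definition[KeysSketch.DIMENSIONS]
print(dimensions)
```

Referring to keys through one class keeps the string literals in a single place, so a renamed section only needs to change there.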
-
class
tsdat.config.
PipelineDefinition
(dictionary: Dict[str, Dict])[source]¶ Wrapper for the pipeline portion of the pipeline config file.
- Parameters
dictionary (Dict[str, Dict]) – The pipeline component of the pipeline config file.
- Raises
DefinitionError – Raises DefinitionError if one of the file naming components contains an illegal character.
Class Methods
check_file_name_components | Performs sanity checks on the config properties used in naming files output by tsdat pipelines.
Method Descriptions
-
check_file_name_components
(self)¶ Performs sanity checks on the config properties used in naming files output by tsdat pipelines.
- Raises
DefinitionError – Raises DefinitionError if a component has been set improperly.
-
class
tsdat.config.
QualityManagerDefinition
(name: str, dictionary: Dict)[source]¶ Wrapper for the quality_management portion of the pipeline config file.
- Parameters
name (str) – The name of the quality manager in the config file.
dictionary (Dict) – The dictionary contents of the quality manager from the config file.
-
class
tsdat.config.
VariableDefinition
(name: str, dictionary: Dict, available_dimensions: Dict[str, tsdat.config.dimension_definition.DimensionDefinition], defaults: Union[Dict, None] = None)[source]¶ Class to encode variable definitions from the config file. Also provides a few utility methods.
- Parameters
name (str) – The name of the variable in the output file.
dictionary (Dict) – The dictionary entry corresponding with this variable in the config file.
available_dimensions (Dict[str, DimensionDefinition]) – A mapping of dimension name to DimensionDefinition objects.
defaults (Dict, optional) – The defaults to use when instantiating this VariableDefinition object, defaults to {}.
Class Methods
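add_fillvalue_if_none | Adds the _FillValue attribute to the provided attributes dictionary if the _FillValue attribute has not already been defined and returns the modified attributes dictionary.
get_FillValue | Retrieves the variable’s _FillValue attribute, using -9999 as a default if it has not been defined.
get_coordinate_names | Returns the names of the coordinate VariableDefinition(s) that this VariableDefinition is dimensioned by.
get_data_type | Retrieves the variable’s data type.
get_input_name | Returns the name of the variable in the input if defined, otherwise returns None.
get_input_units | If the variable has input, returns the units of the input variable or the output units if no input units are defined.
get_output_units | Returns the units of the output data or None if no units attribute has been defined.
get_shape | Returns the shape of the data attribute on the VariableDefinition.
has_converter | Returns True if the variable has an input converter defined, False otherwise.
has_input | Returns True if the variable is copied from an input dataset, regardless of whether or not unit and/or naming conversions should be applied.
is_constant | Returns True if the variable is a constant.
is_coordinate | Returns True if the variable is a coordinate variable.
is_derived | Returns True if the variable is derived.
is_predefined | Returns True if the variable’s data was predefined in the config yaml file.
is_required | Returns True if the variable has the ‘required’ property defined and the ‘required’ property evaluates to True.
run_converter | If the variable has an input converter, runs the input converter for the input/output units on the provided data.
to_dict | Returns the Variable as a dictionary to be used to initialize an empty xarray Dataset or DataArray.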
Method Descriptions
-
add_fillvalue_if_none
(self, attributes: Dict[str, Any]) → Dict[str, Any]¶ Adds the _FillValue attribute to the provided attributes dictionary if the _FillValue attribute has not already been defined and returns the modified attributes dictionary.
- Parameters
attributes (Dict[str, Any]) – The dictionary containing user-defined variable attributes.
- Returns
The dictionary containing user-defined variable attributes. Is guaranteed to have a _FillValue attribute.
- Return type
Dict[str, Any]
-
get_FillValue
(self) → int¶ Retrieves the variable’s _FillValue attribute, using -9999 as a default if it has not been defined.
- Returns
Returns the variable’s _FillValue.
- Return type
int
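The default fill value behaviour described for add_fillvalue_if_none and get_FillValue can be sketched as a standalone function. This is an illustration under assumptions, not the tsdat source: it treats "not defined" as "key missing from the dictionary" and updates the dictionary in place.

```python
from typing import Any, Dict

# Default stated in the get_FillValue description.
DEFAULT_FILL_VALUE = -9999


def add_fillvalue_if_none(attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Set _FillValue only when the user has not already defined one."""
    # Assumption: a missing key means the attribute was not defined.
    attributes.setdefault("_FillValue", DEFAULT_FILL_VALUE)
    return attributes


print(add_fillvalue_if_none({"units": "degC"}))
print(add_fillvalue_if_none({"units": "m", "_FillValue": -1}))
```

A user-supplied _FillValue is left untouched; only attribute dictionaries without one receive the default.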
-
get_coordinate_names
(self) → List[str]¶ Returns the names of the coordinate VariableDefinition(s) that this VariableDefinition is dimensioned by.
- Returns
A list of dimension/coordinate variable names.
- Return type
List[str]
-
get_data_type
(self) → numpy.dtype¶ Retrieves the variable’s data type.
- Returns
Returns the data type of the variable’s data as a numpy dtype.
- Return type
np.dtype
-
get_input_name
(self) → str¶ Returns the name of the variable in the input if defined, otherwise returns None.
- Returns
The name of the variable in the input, or None.
- Return type
str
-
get_input_units
(self) → str¶ If the variable has input, returns the units of the input variable or the output units if no input units are defined.
- Returns
The units of the input variable data.
- Return type
str
-
get_output_units
(self) → str¶ Returns the units of the output data or None if no units attribute has been defined.
- Returns
The units of the output variable data.
- Return type
str
-
get_shape
(self) → Tuple[int]¶ Returns the shape of the data attribute on the VariableDefinition.
- Raises
KeyError – Raises a KeyError if the data attribute has not been set yet.
- Returns
The shape of the VariableDefinition’s data, or None.
- Return type
Tuple[int]
-
has_converter
(self) → bool¶ Returns True if the variable has an input converter defined, False otherwise.
- Returns
True if the Variable has a converter defined, False otherwise.
- Return type
bool
-
has_input
(self) → bool¶ Returns True if the variable is copied from an input dataset, regardless of whether or not unit and/or naming conversions should be applied.
- Returns
True if the Variable has an input defined, False otherwise.
- Return type
bool
-
is_constant
(self) → bool¶ Returns True if the variable is a constant. A variable is constant if it does not have any dimensions.
- Returns
True if the variable is constant, False otherwise.
- Return type
bool
-
is_coordinate
(self) → bool¶ Returns True if the variable is a coordinate variable. A variable is defined as a coordinate variable if it is dimensioned by itself.
- Returns
True if the variable is a coordinate variable, False otherwise.
- Return type
bool
-
is_derived
(self) → bool¶ Returns True if the variable is derived. A variable is derived if it does not have an input and it is not predefined.
- Returns
True if the Variable is derived, False otherwise.
- Return type
bool
-
is_predefined
(self) → bool¶ Returns True if the variable’s data was predefined in the config yaml file.
- Returns
True if the variable is predefined, False otherwise.
- Return type
bool
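The classification rules for constant, coordinate, derived, and predefined variables can be summarized in a small stand-in class. `VariableSketch` and its constructor arguments are hypothetical; in particular, the sketch assumes "dimensioned by itself" means the variable's only dimension is its own name, and that predefined data is signalled by a non-None `data` value.

```python
from typing import Any, List, Optional


class VariableSketch:
    """Hypothetical stand-in for the VariableDefinition predicates."""

    def __init__(
        self,
        name: str,
        dims: List[str],
        input_name: Optional[str] = None,
        data: Any = None,
    ):
        self.name = name
        self.dims = dims
        self.input_name = input_name
        self.data = data

    def has_input(self) -> bool:
        return self.input_name is not None

    def is_predefined(self) -> bool:
        # Predefined: the data was given directly in the config file.
        return self.data is not None

    def is_constant(self) -> bool:
        # Constant: the variable has no dimensions.
        return len(self.dims) == 0

    def is_coordinate(self) -> bool:
        # Coordinate: the variable is dimensioned by itself (assumed rule).
        return self.dims == [self.name]

    def is_derived(self) -> bool:
        # Derived: no input and not predefined.
        return not self.has_input() and not self.is_predefined()


time = VariableSketch("time", ["time"], input_name="timestamp")
depth = VariableSketch("depth", ["depth"], data=[4, 8, 12])
density = VariableSketch("density", ["time", "depth"])
site_elevation = VariableSketch("site_elevation", [], data=120)

for var in (time, depth, density, site_elevation):
    print(var.name, var.is_coordinate(), var.is_constant(), var.is_derived())
```

Note that the categories are not mutually exclusive: "depth" in this sketch is both a coordinate and predefined.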
-
is_required
(self) → bool¶ Returns True if the variable has the ‘required’ property defined and the ‘required’ property evaluates to True. A required variable is a variable which must be retrieved from the input dataset. If a required variable is not in the input dataset, the process should crash.
- Returns
True if the variable is required, False otherwise.
- Return type
bool
-
run_converter
(self, data: numpy.ndarray) → numpy.ndarray¶ If the variable has an input converter, runs the input converter for the input/output units on the provided data.
- Parameters
data (np.ndarray) – The data to be converted.
- Returns
Returns the data after it has been run through the variable’s converter.
- Return type
np.ndarray
-
to_dict
(self) → Dict¶ Returns the Variable as a dictionary to be used to initialize an empty xarray Dataset or DataArray.
Returns a dictionary like (Example is for temperature):
{ "dims": ["time"], "data": [], "attrs": {"units": "degC"} }
- Returns
A dictionary representation of the variable.
- Return type
Dict
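The dictionary shape shown for to_dict can be reproduced with a short helper. `variable_to_dict` is a hypothetical function written for this sketch; it only demonstrates the "dims" / "data" / "attrs" layout with an empty data list and does not call tsdat or xarray.

```python
from typing import Any, Dict, List


def variable_to_dict(dims: List[str], attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Build a dims/data/attrs dictionary with no data yet."""
    # Copy the inputs so later edits to the result do not affect the caller.
    return {"dims": list(dims), "data": [], "attrs": dict(attrs)}


temperature = variable_to_dict(["time"], {"units": "degC"})
print(temperature)
# {'dims': ['time'], 'data': [], 'attrs': {'units': 'degC'}}
```

A dictionary with "dims", "data", and "attrs" keys is the layout that xarray's DataArray.from_dict accepts, which is presumably why an empty structure of this form is useful as a starting point.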