pipeline
Modules:
Name | Description |
---|---|
model_to_dict | |
Classes:
Name | Description |
---|---|
Pipeline | Base class for tsdat data pipelines. |
Classes
Pipeline
Bases: ParameterizedClass, ABC
Base class for tsdat data pipelines.
Methods:
Name | Description |
---|---|
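prepare_retrieved_dataset | Modifies the retrieved dataset to match the structure and metadata declared in the DatasetConfig. |
run | Runs the data pipeline on the provided inputs. |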
Attributes:
Name | Type | Description |
---|---|---|
cfg_filepath | Optional[Path] | The pipeline.yaml file containing the parameters used to instantiate this object. |
dataset_config | DatasetConfig | Describes the structure and metadata of the output dataset. |
quality | QualityManagement | Manages the dataset quality through checks and corrections. |
retriever | Retriever | Retrieves data from input keys. |
settings | Any | |
storage | Storage | Stores the dataset so it can be retrieved later. |
triggers | List[Pattern] | Regex patterns matching input keys to determine when the pipeline should run. |
Attributes
cfg_filepath (class-attribute, instance-attribute)
The pipeline.yaml file containing the parameters used to instantiate this object.
dataset_config (class-attribute, instance-attribute)
Describes the structure and metadata of the output dataset.
quality (instance-attribute)
Manages the dataset quality through checks and corrections.
triggers (class-attribute, instance-attribute)
Regex patterns matching input keys to determine when the pipeline should run.
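To illustrate how trigger patterns can decide whether a pipeline should run for a given input key, here is a minimal sketch using only the standard library. The pattern strings and the `should_run` helper are hypothetical, and anchoring with `Pattern.match` (rather than `search` or `fullmatch`) is an assumption, not a statement about tsdat's internals.

```python
import re
from typing import List

# Hypothetical trigger patterns; real ones come from the pipeline's configuration.
triggers: List[re.Pattern] = [
    re.compile(r".*buoy.*\.csv"),
    re.compile(r".*/lidar/.*\.nc"),
]


def should_run(input_key: str, patterns: List[re.Pattern]) -> bool:
    """Return True if any pattern matches from the start of the input key.

    Assumption: matching uses Pattern.match, which anchors at the start only.
    """
    return any(pattern.match(input_key) for pattern in patterns)


# Keep only the input keys that would trigger this (hypothetical) pipeline.
candidate_keys = ["data/buoy_z05.csv", "data/readme.txt", "site/lidar/2024.nc"]
matched = [key for key in candidate_keys if should_run(key, triggers)]
```

Because `match` anchors only at the start, the leading `.*` in each pattern lets the key carry an arbitrary directory prefix.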
Functions
prepare_retrieved_dataset
Modifies the retrieved dataset by dropping variables not declared in the DatasetConfig, adding static variables, initializing non-retrieved variables, and importing global and variable-level attributes from the DatasetConfig.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | The retrieved dataset. | required |
Returns:
Type | Description |
---|---|
Dataset | xr.Dataset: The dataset with structure and metadata matching the DatasetConfig. |
Source code in tsdat/pipeline/base/pipeline.py
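The four steps described for prepare_retrieved_dataset can be sketched with a toy model. This is not the tsdat implementation: plain dictionaries stand in for both xr.Dataset and DatasetConfig, the `prepare` function is hypothetical, and two behaviours are assumptions made for illustration (non-retrieved variables are initialized to `None`, and config attributes take precedence over retrieved ones).

```python
from typing import Any, Dict

# Toy shapes (assumptions): a dataset is
#   {"attrs": {...}, "variables": {name: {"data": ..., "attrs": {...}}}}
# and the config has the same shape, where a variable with "data" is static.


def prepare(dataset: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dataset whose structure and metadata follow the config."""
    prepared: Dict[str, Any] = {
        # Import global attributes; config values win (assumption).
        "attrs": {**dataset.get("attrs", {}), **config.get("attrs", {})},
        "variables": {},
    }
    # Iterating over the config drops any variable it does not declare.
    for name, spec in config["variables"].items():
        if "data" in spec:
            data = spec["data"]  # static variable defined in the config
        elif name in dataset["variables"]:
            data = dataset["variables"][name]["data"]  # retrieved variable
        else:
            data = None  # placeholder for a declared but non-retrieved variable
        # Import variable-level attributes from the config.
        prepared["variables"][name] = {"data": data, "attrs": dict(spec.get("attrs", {}))}
    return prepared


retrieved = {
    "attrs": {"source": "raw.csv"},
    "variables": {
        "temp": {"data": [1.0, 2.0], "attrs": {}},
        "junk": {"data": [0, 0], "attrs": {}},
    },
}
config = {
    "attrs": {"title": "Example"},
    "variables": {
        "temp": {"attrs": {"units": "degC"}},
        "depth": {"data": 5.0, "attrs": {"units": "m"}},
        "salinity": {"attrs": {"units": "1"}},
    },
}
result = prepare(retrieved, config)
```

In this toy run, `junk` is dropped, `depth` is added as a static variable, `salinity` is initialized with a placeholder, and `temp` picks up its `units` attribute from the config.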
run (abstractmethod)
Runs the data pipeline on the provided inputs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | List[str] | A list of input keys that the pipeline's Retriever class can use to load data into the pipeline. | required |
Returns:
Type | Description |
---|---|
Any | xr.Dataset: The processed dataset. |
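Because run is abstract, a concrete pipeline has to supply it. The sketch below shows that pattern with stand-in classes built on `abc`. `SketchPipeline`, `SketchIngest`, and the `retrieve`/`store` callables are hypothetical and do not reproduce tsdat's classes or method names. The retrieve, prepare, store ordering is an assumption based on the attributes listed above.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List


class SketchPipeline(ABC):
    """Stand-in for an abstract pipeline base class (not the tsdat class)."""

    def __init__(self, retrieve: Callable[[List[str]], Any], store: Callable[[Any], None]):
        self.retrieve = retrieve  # hypothetical: loads data for the input keys
        self.store = store        # hypothetical: persists the processed dataset

    def prepare_retrieved_dataset(self, dataset: Any) -> Any:
        # Placeholder: a real pipeline would align the dataset with its config.
        return dataset

    @abstractmethod
    def run(self, inputs: List[str]) -> Any:
        """Run the pipeline on the provided input keys."""


class SketchIngest(SketchPipeline):
    def run(self, inputs: List[str]) -> Any:
        dataset = self.retrieve(inputs)                     # load
        dataset = self.prepare_retrieved_dataset(dataset)   # standardize
        self.store(dataset)                                 # persist
        return dataset


saved: List[Any] = []
pipeline = SketchIngest(retrieve=lambda keys: {"inputs": list(keys)}, store=saved.append)
result = pipeline.run(["a.csv"])
```

Instantiating `SketchPipeline` directly raises `TypeError`, which is how `abc` enforces that subclasses implement `run`.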