transformation_pipeline
Modules:

Name | Description |
---|---|
decode_cf | |
Classes:

Name | Description |
---|---|
TransformationPipeline | Pipeline class for enhancing standardized time series data by combining multiple sources of data. |
Classes#
TransformationPipeline #
Bases: IngestPipeline
Pipeline class designed to read in standardized time series data and enhance its quality and usability, e.g., by combining multiple sources of data and applying higher-level processing techniques.
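As a brief illustration (a hedged sketch, not from the tsdat source), a subclass can be as simple as the following; the import path follows the source-code reference shown further down this page, and the class name is hypothetical:

```python
# A minimal sketch of a TransformationPipeline subclass. The import path
# follows this page's source-code reference; the class name is hypothetical.
from tsdat.pipeline.pipelines.transformation_pipeline import (
    TransformationPipeline,
)


class ExampleVapPipeline(TransformationPipeline):
    """Hypothetical value-added-product pipeline that combines several
    input datastreams into one enhanced output dataset."""

    # Inherits run() and the default hooks; override hooks (such as
    # hook_customize_input_datasets, documented below) to customize behavior.
```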
Classes:

Name | Description |
---|---|
Parameters | |
Methods:

Name | Description |
---|---|
hook_customize_input_datasets | Code hook to customize input datasets before datastreams are combined and data converters are run. |
run | Runs the data pipeline on the provided inputs. |
Attributes:

Name | Type | Description |
---|---|---|
parameters | Parameters | |
retriever | StorageRetriever | |
Attributes#
Classes#
Parameters #
Bases: BaseModel
Attributes:

Name | Type | Description |
---|---|---|
datastreams | List[str] | A list of datastreams that the pipeline should be configured to run for. |
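For illustration, here is a hedged sketch of constructing this model directly in Python. In practice the datastreams list is usually supplied through the pipeline's YAML configuration, and the datastream names below are hypothetical placeholders:

```python
# A hedged sketch: constructing the nested Parameters model directly.
# The datastream names are hypothetical placeholders.
from tsdat.pipeline.pipelines.transformation_pipeline import (
    TransformationPipeline,
)

params = TransformationPipeline.Parameters(
    datastreams=["abc.buoy_z06.a1", "abc.lidar_z07.a1"],
)
print(params.datastreams)
```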
Functions#
hook_customize_input_datasets #
hook_customize_input_datasets(
    input_datasets: Dict[str, xr.Dataset], **kwargs: Any
) -> Dict[str, xr.Dataset]
Code hook to customize any input datasets prior to datastreams being combined and data converters being run.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
input_datasets | Dict[str, Dataset] | The dictionary mapping each input key (str) to an input dataset. Note that for transformation pipelines, input keys are not input filenames; rather, each input key is a combination of the datastream and the date range used to pull the input data from the storage retriever. | required |
Returns:

Type | Description |
---|---|
Dict[str, Dataset] | Dict[str, xr.Dataset]: The customized input datasets. |
Source code in tsdat/pipeline/pipelines/transformation_pipeline.py
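A hedged sketch of overriding this hook in a subclass; the "temperature" variable and its unit conversion are hypothetical and serve only to show where per-dataset customization happens:

```python
# A hedged sketch: overriding hook_customize_input_datasets to adjust each
# retrieved dataset before the datastreams are combined. The "temperature"
# variable and its unit conversion are hypothetical.
from typing import Any, Dict

import xarray as xr

from tsdat.pipeline.pipelines.transformation_pipeline import (
    TransformationPipeline,
)


class ExampleVapPipeline(TransformationPipeline):
    def hook_customize_input_datasets(
        self, input_datasets: Dict[str, xr.Dataset], **kwargs: Any
    ) -> Dict[str, xr.Dataset]:
        for key, dataset in input_datasets.items():
            if "temperature" in dataset:
                # Convert degC to degK; xarray arithmetic drops attributes,
                # so the units attribute is reset afterwards.
                converted = dataset["temperature"] + 273.15
                converted.attrs["units"] = "degK"
                dataset["temperature"] = converted
        return input_datasets
```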
run #
Runs the data pipeline on the provided inputs.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
inputs | List[str] | A 2-element list of the start date and end date for the period the pipeline should process. | required |
Returns:

Type | Description |
---|---|
Dataset | xr.Dataset: The processed dataset. |
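Finally, a hedged usage sketch; the configuration path, the PipelineConfig instantiation steps, and the "YYYYMMDD.HHMMSS" date format are assumptions here, so consult the tsdat documentation for the exact conventions:

```python
# A hedged usage sketch. The config path, PipelineConfig API usage, and the
# begin/end date format are assumptions, not taken from this page.
from tsdat import PipelineConfig

config = PipelineConfig.from_yaml("pipelines/example_vap/config/pipeline.yaml")
pipeline = config.instantiate_pipeline()

# inputs is a 2-element [start, end] list, per the parameter table above.
dataset = pipeline.run(["20230101.000000", "20230108.000000"])
print(dataset)
```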