tsdat.io.retrievers
Classes
DefaultRetriever – Default API for retrieving data from one or more input sources.
StorageRetriever – Retriever API for pulling input data from the storage area.
- class tsdat.io.retrievers.DefaultRetriever
Bases: tsdat.io.base.Retriever
Default API for retrieving data from one or more input sources.
Reads data from one or more inputs, renames coordinates and data variables according to retrieval and dataset configurations, and applies registered DataConverters to retrieved data.
- Parameters
readers (Dict[Pattern[str], DataReader]) – A mapping of patterns to DataReaders that the retriever uses to determine which DataReader to use for reading any given input key.
coords (Dict[str, Dict[Pattern[str], VariableRetriever]]) – A dictionary mapping output coordinate variable names to rules for how they should be retrieved.
data_vars (Dict[str, Dict[Pattern[str], VariableRetriever]]) – A dictionary mapping output data variable names to rules for how they should be retrieved.
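The pattern-to-reader mapping described for readers can be illustrated with a minimal sketch. It assumes the first pattern that matches an input key wins and that matching uses re.match; neither detail is confirmed by this page. The select_reader helper and the string placeholders standing in for DataReader instances are hypothetical and are not part of tsdat.

```python
import re
from typing import Dict, Optional, Pattern

# Hypothetical stand-ins: real entries would be DataReader instances, not strings.
readers: Dict[Pattern[str], str] = {
    re.compile(r".*\.csv"): "csv_reader",
    re.compile(r".*\.nc"): "netcdf_reader",
}


def select_reader(input_key: str, readers: Dict[Pattern[str], str]) -> Optional[str]:
    """Return the reader for the first pattern matching input_key, or None.

    Assumption: first match wins, checked in dictionary insertion order.
    """
    for pattern, reader in readers.items():
        if pattern.match(input_key):
            return reader
    return None


chosen = select_reader("data/buoy.csv", readers)
```

Because re.match only anchors at the start of the string, a pattern such as `.*\.csv` would also match `data/buoy.csv.bak`; append `$` to the pattern if stricter matching is wanted.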
- class Parameters
Bases: pydantic.BaseModel
- merge_kwargs: Dict[str, Any]
Keyword arguments passed to xr.merge(). This is only relevant if multiple input keys are provided simultaneously, or if any registered DataReader objects could return a dataset mapping instead of a single dataset.
- readers: Dict[Pattern, tsdat.io.base.DataReader]
A dictionary mapping input key patterns to the DataReaders used to read data for matching input keys.
Class Methods
retrieve – Prepares the raw dataset mapping for use in downstream pipeline processes.
Method Descriptions
- retrieve(self, input_keys: List[str], dataset_config: tsdat.config.dataset.DatasetConfig, **kwargs: Any) -> xarray.Dataset
Prepares the raw dataset mapping for use in downstream pipeline processes.
This is done by consolidating the data into a single xr.Dataset object. The retrieved dataset may contain additional coords and data_vars that are not defined in the output dataset. Input data converters are applied as part of the preparation process.
- Parameters
input_keys (List[str]) – The input keys the registered DataReaders should read from.
dataset_config (DatasetConfig) – The specification of the output dataset.
- Returns
xr.Dataset – The retrieved dataset.
- class tsdat.io.retrievers.StorageRetriever
Bases: tsdat.io.base.Retriever
Retriever API for pulling input data from the storage area.
Class Methods
retrieve – Retrieves input data from the storage area.
Method Descriptions
- retrieve(self, input_keys: List[str], dataset_config: tsdat.config.dataset.DatasetConfig, storage: Optional[tsdat.io.base.Storage] = None, **kwargs: Any) -> xarray.Dataset
Retrieves input data from the storage area.
Each input_key is expected to follow this format:
"datastream::start-date::end-date",
e.g., "sgp.myingest.b1::20220913.000000::20220914.000000"
This format allows the retriever to pull datastream data from the Storage API for the desired dates for each desired input source.
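As a rough illustration of the key format above, the sketch below splits a key on "::" and parses the two dates. The timestamp layout "%Y%m%d.%H%M%S" is inferred from the example key and is an assumption, as are the StorageKey type and the parse_storage_key helper, which are hypothetical and not part of tsdat.

```python
from datetime import datetime
from typing import NamedTuple


class StorageKey(NamedTuple):
    """Hypothetical container for the three parts of a storage input key."""

    datastream: str
    start: datetime
    end: datetime


def parse_storage_key(input_key: str) -> StorageKey:
    """Split 'datastream::start-date::end-date' into its components.

    Assumption: dates use the YYYYMMDD.hhmmss layout seen in the example key.
    Raises ValueError if the key does not have exactly three parts.
    """
    datastream, start, end = input_key.split("::")
    fmt = "%Y%m%d.%H%M%S"
    return StorageKey(
        datastream,
        datetime.strptime(start, fmt),
        datetime.strptime(end, fmt),
    )


key = parse_storage_key("sgp.myingest.b1::20220913.000000::20220914.000000")
```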
- Parameters
input_keys (List[str]) – A list of specially-formatted input keys.
dataset_config (DatasetConfig) – The output dataset configuration.
storage (Optional[Storage]) – Instance of a Storage class used to fetch saved data. Defaults to None.
- Returns
xr.Dataset – The retrieved dataset.