retriever

Classes#

DataConverterConfig #

DataReaderConfig #

RetrievedVariableConfig #

Bases: BaseModel

Specifies how the variable should be retrieved from the raw dataset and the preprocessing steps (i.e. DataConverters) that should be applied.

Attributes#

data_converters class-attribute instance-attribute #
data_converters: List[DataConverterConfig] = Field(
    [],
    description="A list of DataConverters to run for this variable. Common choices include the tsdat UnitsConverter (classname: 'tsdat.io.converters.UnitsConverter') to convert the variable's data from its input units to specified output units, and the tsdat StringToDatetime converter (classname: 'tsdat.io.converters.StringToDatetime'), which takes dates/times formatted as strings and converts them into a datetime64 object that can be used throughout the rest of the pipeline. This property is optional and defaults to [].",
)
name class-attribute instance-attribute #
name: Union[str, List[str]] = Field(
    description="The exact name or list of names of the variable in the raw dataset returned by the DataReader."
)
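
As an illustration, the two fields above could be written out as a plain Python dictionary, roughly as they might look after a configuration file is parsed. This is only a sketch and not tsdat's own API: the raw variable names are invented, the `classname` key on the converter entry is assumed from the classnames quoted in the description, and `candidate_names` is a hypothetical helper that only shows that `name` may be a single string or a list of strings.

```python
from typing import Any, Dict, List, Union

# Hypothetical settings for one retrieved variable, mirroring the two fields
# documented above. The raw names "Time (UTC)" and "timestamp" are invented.
time_variable: Dict[str, Any] = {
    "name": ["Time (UTC)", "timestamp"],
    "data_converters": [
        # Assumed key; converter-specific parameters are omitted here.
        {"classname": "tsdat.io.converters.StringToDatetime"},
    ],
}


def candidate_names(name: Union[str, List[str]]) -> List[str]:
    """Normalize the `name` field to a list of names (hypothetical helper)."""
    return [name] if isinstance(name, str) else list(name)


print(candidate_names(time_variable["name"]))
print(candidate_names("temperature"))
```

Leaving `data_converters` out entirely would correspond to the documented default of an empty list.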

RetrieverConfig #

Bases: ParameterizedConfigClass, YamlModel


Contains configuration parameters for the tsdat retriever class.

This class will ultimately be converted into a tsdat.io.base.Retriever subclass for use in tsdat pipelines.

Provides methods to support YAML parsing and validation, including generation of a JSON schema for immediate validation. This class also provides a method to instantiate a tsdat.io.base.Retriever subclass from a parsed configuration file.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| classname | str | The dotted module path to the retriever class that the specified configurations should apply to. To use the built-in DefaultRetriever, for example, you would set 'tsdat.io.retrievers.DefaultRetriever' as the classname. | required |
| readers | Dict[str, DataReaderConfig] | The DataReaders to use for reading input data. | required |

Attributes#

coords class-attribute instance-attribute #
coords: Dict[
    str,
    Union[
        Dict[Pattern, RetrievedVariableConfig],
        RetrievedVariableConfig,
    ],
] = Field(
    {},
    description="A dictionary mapping output coordinate variable names to the retrieval rules and preprocessing actions (i.e. DataConverters) that should be applied to each retrieved coordinate variable.",
)
data_vars class-attribute instance-attribute #
data_vars: Dict[
    str,
    Union[
        Dict[Pattern, RetrievedVariableConfig],
        RetrievedVariableConfig,
    ],
] = Field(
    {},
    description="A dictionary mapping output data variable names to the retrieval rules and preprocessing actions (i.e. DataConverters) that should be applied to each retrieved data variable.",
)
readers class-attribute instance-attribute #
readers: Optional[Dict[Pattern, DataReaderConfig]] = Field(
    description="A dictionary mapping regex patterns to DataReaders that should be used to read the input data. For each input given to the Retriever, the mapping will be used to determine which DataReader to use. The patterns will be searched in the order they are defined, and the DataReader corresponding to the first pattern that matches the input key will be used."
)
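
The first-match rule described for `readers` can be sketched with the standard library alone. This is a minimal illustration rather than the retriever's actual lookup code: the reader names are placeholder strings, `select_reader` is a hypothetical function, and the use of `Pattern.match` (anchored at the start of the input key) is an assumption.

```python
import re
from typing import Dict, Optional, Pattern

# Placeholder reader names; in tsdat the values would be DataReader
# configurations. Patterns are checked in the order they are defined.
readers: Dict[Pattern[str], str] = {
    re.compile(r".*\.csv"): "csv_reader",
    re.compile(r".*\.nc"): "netcdf_reader",
    re.compile(r".*"): "fallback_reader",
}


def select_reader(input_key: str) -> Optional[str]:
    """Return the reader for the first pattern that matches, else None."""
    for pattern, reader in readers.items():  # dicts preserve insertion order
        if pattern.match(input_key):  # assumption: match from start of key
            return reader
    return None


print(select_reader("data/buoy.csv"))
print(select_reader("data/buoy.nc"))
print(select_reader("notes.txt"))
```

Because the first match wins, a catch-all pattern such as `.*` only makes sense as the last entry; placed first, it would shadow every pattern after it.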

Functions#

coerce_to_patterned_retriever classmethod #
coerce_to_patterned_retriever(
    var_dict: Dict[
        str,
        Union[
            Dict[Pattern, RetrievedVariableConfig],
            RetrievedVariableConfig,
        ],
    ]
) -> Dict[str, Dict[Pattern[str], RetrievedVariableConfig]]
Source code in tsdat/config/retriever.py
@validator("coords", "data_vars")
@classmethod
def coerce_to_patterned_retriever(cls, var_dict: Dict[str, Union[Dict[Pattern, RetrievedVariableConfig], RetrievedVariableConfig]]) -> Dict[str, Dict[Pattern[str], RetrievedVariableConfig]]:  # type: ignore
    to_return: Dict[str, Dict[Pattern[str], RetrievedVariableConfig]] = {}  # type: ignore
    for name, var_retriever in var_dict.items():  # type: ignore
        if isinstance(var_retriever, RetrievedVariableConfig):
            # A bare config applies to every input key, so wrap it in a catch-all pattern.
            var_retriever = {re.compile(r".*"): var_retriever}
        to_return[name] = cast(
            Dict[Pattern[str], RetrievedVariableConfig], var_retriever
        )
    return to_return
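
The effect of this validator can be reproduced outside of pydantic with a small stand-in. The sketch below is an assumption-laden illustration: `VariableConfig` is a hypothetical dataclass standing in for RetrievedVariableConfig, `coerce` is a free function rather than a validator, and the variable names are made up.

```python
import re
from dataclasses import dataclass
from typing import Dict, Pattern, Union


@dataclass
class VariableConfig:
    """Hypothetical stand-in for RetrievedVariableConfig."""

    name: str


PatternedConfig = Dict[Pattern[str], VariableConfig]


def coerce(
    var_dict: Dict[str, Union[PatternedConfig, VariableConfig]]
) -> Dict[str, PatternedConfig]:
    """Wrap bare configs in a catch-all pattern; keep patterned dicts as-is."""
    coerced: Dict[str, PatternedConfig] = {}
    for name, retriever in var_dict.items():
        if isinstance(retriever, VariableConfig):
            retriever = {re.compile(r".*"): retriever}
        coerced[name] = retriever
    return coerced


result = coerce(
    {
        "time": VariableConfig(name="timestamp"),
        "temperature": {re.compile(r".*\.csv"): VariableConfig(name="temp_c")},
    }
)
for output_name, patterned in result.items():
    print(output_name, [pattern.pattern for pattern in patterned])
```

After coercion every entry has the same shape, a dictionary keyed by compiled patterns, so downstream code does not need to distinguish between the two input forms.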